Below is a basic Python crawler template that you can adapt as needed:
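```python
import requests
from bs4 import BeautifulSoup

# Target page to crawl (placeholder; replace with the real URL)
url = "https://example.com"

# Set request headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
}

# Send the request and fail early on HTTP error status codes
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()

# Parse the HTML
soup = BeautifulSoup(response.text, "html.parser")

# Get the data you need
data = soup.find("div", {"class": "example"})

# Print the result (find() returns None when nothing matches)
if data is not None:
    print(data.text)
```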
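Here, `url` is the link of the page to crawl, `headers` holds the request headers, `response` is the response object, `soup` is the BeautifulSoup object, and `data` is the element extracted from the page. Rename these variables or change their contents as needed.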
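The template reads a single element with `find`, which returns `None` when nothing matches. As a minimal sketch of collecting every matching element instead, the example below parses an inline HTML string rather than a live response; `SAMPLE_HTML` and the helper name `extract_texts` are hypothetical and exist only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used here in place of a real response body.
SAMPLE_HTML = """
<html><body>
  <div class="example">First item</div>
  <div class="other">Ignored</div>
  <div class="example"> Second item </div>
</body></html>
"""


def extract_texts(html, tag="div", css_class="example"):
    """Return the stripped text of every `tag` element with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns an empty list when nothing matches, so no None check is needed.
    return [el.get_text(strip=True) for el in soup.find_all(tag, class_=css_class)]


print(extract_texts(SAMPLE_HTML))                       # ['First item', 'Second item']
print(extract_texts(SAMPLE_HTML, css_class="missing"))  # []
```

With a real page, `response.text` from the template would be passed in place of `SAMPLE_HTML`.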
To route requests through an authenticated proxy, pass a `proxies` mapping to `requests.get`:
```python
import requests

# Target page to request
targetUrl = "http://ip.hahado.cn/ip"

# Proxy server
proxyHost = "ip.hahado.cn"
proxyPort = "39010"

# Proxy tunnel credentials
proxyUser = "username"
proxyPass = "password"

# Build the proxy URL in the form http://user:pass@host:port
proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

# Use the same proxy for both HTTP and HTTPS traffic
proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

resp = requests.get(targetUrl, proxies=proxies)
print(resp.status_code)
print(resp.text)
```
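If the proxy username or password contains characters such as `@`, `:` or `/`, plain string formatting produces a URL that is split in the wrong place. One possible approach, sketched here with only the standard library, is to percent-encode the credentials before building the URL. The helper names `build_proxy_url` and `build_proxies` and the sample password are hypothetical.

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port, scheme="http"):
    """Return a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() also encode "/", "@" and ":" so they cannot be
    # mistaken for URL delimiters.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"{scheme}://{user_enc}:{pass_enc}@{host}:{port}"


def build_proxies(user, password, host, port):
    """Return a proxies mapping that uses the same proxy for HTTP and HTTPS."""
    proxy_url = build_proxy_url(user, password, host, port)
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical credentials containing reserved characters.
proxies = build_proxies("username", "p@ss:word", "ip.hahado.cn", "39010")
print(proxies["http"])  # http://username:p%40ss%3Aword@ip.hahado.cn:39010
```

The resulting mapping can be passed as `proxies=proxies` in the same way as in the example above.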