
Why Python is called a crawler; setting a proxy IP on a router

Date: 2023-05-06 05:32:19  Views: 135060  Author: 2602

Set up a proxy pool so that an excessive request rate does not get your client blocked from the site.
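As a rough illustration of the idea, the sketch below keeps a small pool of proxy URLs, hands them out in rotation, and drops any proxy that has failed. The class name `ProxyPool`, its methods, and the example addresses (taken from the reserved documentation range 192.0.2.0/24) are hypothetical and do not come from the original article. The returned dict uses the `{'http': ..., 'https': ...}` shape that `requests` accepts in its `proxies` argument.

```python
class ProxyPool:
    """Hypothetical helper: rotate through proxy URLs and drop bad ones."""

    def __init__(self, proxy_urls):
        self._proxies = list(proxy_urls)
        if not self._proxies:
            raise ValueError("proxy pool must not be empty")
        self._index = 0

    def next_proxies(self):
        """Return the next proxy as a requests-style proxies dict."""
        if not self._proxies:
            raise RuntimeError("all proxies have been removed")
        url = self._proxies[self._index % len(self._proxies)]
        self._index += 1
        return {"http": url, "https": url}

    def mark_bad(self, proxies):
        """Remove a proxy that timed out or was blocked."""
        url = proxies["http"]
        if url in self._proxies:
            self._proxies.remove(url)

    def __len__(self):
        return len(self._proxies)


# Placeholder addresses, for illustration only.
pool = ProxyPool(["http://192.0.2.10:8080", "http://192.0.2.11:8080"])
first = pool.next_proxies()
second = pool.next_proxies()
```

A caller would pass `pool.next_proxies()` as the `proxies=` argument of `requests.get` and call `mark_bad` when the request raises or returns a block page. A short `time.sleep` between requests would address the request-rate side of the problem.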

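1. Configure the User-Agent

When a client visits a website, the site reads the User-Agent request header to identify it. The browser-detection page https://ie.icoa.cn/ echoes that header back, which makes it a convenient test target. Without a custom header, requests identifies itself as python-requests:

    import requests
    from lxml import etree

    # Browser kernel and version detection: https://ie.icoa.cn/
    headers = {
        'x-requested-with': '...',  # value lost in the source
        'authority': 'ie.icoa.cn',
    }
    res = requests.get('https://ie.icoa.cn/', headers=headers, verify=False)
    res.encoding = 'utf8'
    root = etree.HTML(res.text)
    ua = root.xpath('...')  # XPath expression truncated in the source
    print(ua)  # ['python-requests/2.22.0']

To disguise the client as a browser, put a browser User-Agent in the headers:

    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36',
        'x-requested-with': '...',  # value lost in the source
        'authority': 'ie.icoa.cn',
    }
    res = requests.get('https://ie.icoa.cn/', headers=headers, verify=False)
    res.encoding = 'utf8'
    root = etree.HTML(res.text)
    ua = root.xpath('...')  # XPath expression truncated in the source
    print(ua)  # ['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36']

2. A tool for generating User-Agents: fake-useragent

    from fake_useragent import UserAgent

    # General usage
    # 1. A random User-Agent
    ua = UserAgent()
    u1 = ua.random
    print(u1)  # Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/5.0)

    # 2. A User-Agent for a specific browser
    print(ua.ie)  # Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; Media Center PC 6.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C)
    print(ua.chrome)  # Mozilla/5.0 (X11; Ubuntu; Linux i686 on x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2820.59 Safari/537.36
    print(ua.firefox)  # Mozilla/5.0 (... rv:64.0) Gecko/20100101 Firefox/64.0

3. Browser agent pool (User-Agent pool)

    import random

    ua_list = []
    for i in range(5):
        ua_list.append(ua.random)

    for i in range(5):
        user_agent = random.choice(ua_list)
        headers = {'user-agent': user_agent}
        # the rest of this loop is missing from the source

4. Native IP: search Baidu for "IP"

    import re
    import requests

    url = 'https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=IP'
    res = requests.get(url)
    pat = '...</span>'  # start of the pattern lost in the source
    ip = re.findall(pat, res.text)
    print(ip)  # 113.200.157.34

5. Set a proxy IP

    proxies = {
        'http': '...',  # address lost in the source
        'https': 'http://171.15.15...',  # address truncated in the source
    }
    res = requests.get(url, proxies=proxies)
    ip = re.findall(pat, res.text)
    print(ip)  # 113.200.157.34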
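The User-Agent pool and the IP check above can be combined into two small helpers. This is a sketch under assumptions: the function names `random_headers` and `extract_ipv4` and the sample HTML fragment are illustrative and not from the original article. The regular expression is a simple IPv4 matcher, not the pattern the article used, which did not survive.

```python
import random
import re

# Simple dotted-quad matcher; range checking happens in extract_ipv4.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def random_headers(ua_pool, rng=random):
    """Build a headers dict with a User-Agent picked at random from the pool."""
    return {"user-agent": rng.choice(ua_pool)}


def extract_ipv4(text):
    """Return every dotted-quad in text whose four parts are all <= 255."""
    candidates = IPV4_PATTERN.findall(text)
    return [c for c in candidates if all(int(p) <= 255 for p in c.split("."))]


ua_pool = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36",
    "Mozilla/5.0 (X11; Ubuntu; Linux i686 on x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/53.0.2820.59 Safari/537.36",
]
headers = random_headers(ua_pool)

# Illustrative fragment standing in for a real search-result page.
sample_html = "<span>IP: 113.200.157.34</span>"
print(extract_ipv4(sample_html))  # ['113.200.157.34']
```

With requests installed, a call such as `requests.get(url, headers=random_headers(ua_pool), proxies=proxies, timeout=10)` followed by `extract_ipv4(res.text)` would show which IP address the target site sees, so the result with and without the proxy can be compared.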
