
Applications of crawler programs in everyday life: Python web crawler code

Date: 2023-05-03 08:19:51 Views: 249550 Author: 2835

from bs4 import BeautifulSoup
import requests


def one(url):
    r = requests.get(url, allow_redirects=False)
    fin = r.text
    # fin is the page HTML fetched above; soup.find searches down the tag hierarchy
    soup = BeautifulSoup(fin, 'html.parser')
    comments = soup.find('div', 'hos-guide-box')
    if comments is None:
        # No matching div (for example, the response was a redirect): nothing to extract
        return ""
    comments_list = comments.find_all(['p', 'h2', 'h3', 'h4'])
    result = []
    for tag in comments_list:
        result.append(tag.text.strip())
    result1 = " ".join(result)
    return result1
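
The extraction step in one() can be tried offline, without the network or the site. Below is a minimal sketch that only approximates it: it uses the standard library's html.parser rather than BeautifulSoup, and the class GuideBoxText, the function extract_text, and the sample HTML are all hypothetical names and data made up for illustration. Unlike find_all, it collects the text of nested matching tags only once.

```python
from html.parser import HTMLParser


class GuideBoxText(HTMLParser):
    """Collect text from p/h2/h3/h4 tags inside <div class="hos-guide-box">."""

    WANTED = {"p", "h2", "h3", "h4"}

    def __init__(self):
        super().__init__()
        self.div_depth = 0   # > 0 while inside the target div (counts nested divs)
        self.in_wanted = 0   # > 0 while inside a wanted tag
        self.chunks = []     # finished, stripped text fragments
        self._buf = []       # raw text of the tag currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.div_depth or "hos-guide-box" in classes:
                self.div_depth += 1
        elif self.div_depth and tag in self.WANTED:
            self.in_wanted += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth:
            self.div_depth -= 1
        elif tag in self.WANTED and self.in_wanted:
            self.in_wanted -= 1
            if self.in_wanted == 0:
                self.chunks.append("".join(self._buf).strip())
                self._buf = []

    def handle_data(self, data):
        if self.in_wanted:
            self._buf.append(data)


def extract_text(html):
    """Return the joined text of the wanted tags, separated by single spaces."""
    parser = GuideBoxText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical sample page, not taken from the real site
SAMPLE_HTML = """
<div class="other"><p>ignored</p></div>
<div class="hos-guide-box">
  <h2>Symptoms</h2>
  <p> Frequent urination </p>
  <div><p>Lower back pain</p></div>
</div>
"""

print(extract_text(SAMPLE_HTML))
```

Text outside the hos-guide-box div is skipped, and each fragment is stripped before joining, which mirrors the strip() and " ".join() calls in one().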

In the browser's developer tools, filter the network requests by XHR to screen out irrelevant information.

def main():
    # this URL cannot be opened
    r = requests.post('http://wapjbk.39.net/DiseaseArea/SpeciesCate',
                      data={'id': 4, 'cateId': 77}).json()
    list1 = []
    file = []
    # list1 holds the URL parts, file holds the symptom names
    for i in r:
        list1.append(i.split('~')[0])
        file.append(i.split('~')[1])
    for i in range(len(list1)):
        url = 'http://wapjbk.39.net/' + list1[i] + '/zztz/'
        result1 = one(url)
        file1 = "D://dabao//爬虫练习//泌尿系统//" + file[i] + ".txt"
        # the with block closes the file even if the write fails
        with open(file1, 'a', encoding='utf-8') as f:
            f.write(result1)


if __name__ == '__main__':
    main()
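
The loop in main() assumes every item in the JSON response is a string of the form path~name. Here is a minimal sketch of that splitting step with a guard for malformed items. The helper parse_entries and the sample list are hypothetical and were not returned by the site. It uses str.partition rather than split, so only the first ~ is treated as the separator.

```python
def parse_entries(entries, base_url="http://wapjbk.39.net/"):
    """Turn 'path~name' strings into (symptom_page_url, name) pairs."""
    pairs = []
    for entry in entries:
        # partition splits on the first '~' only
        path, sep, name = entry.partition("~")
        if not sep or not path or not name:
            continue  # skip items that do not match the assumed format
        pairs.append((base_url + path + "/zztz/", name))
    return pairs


# Hypothetical sample data, not real output from the site
sample = ["abc~Example A", "malformed", "xyz~Example B"]
for url, name in parse_entries(sample):
    print(name, url)
```

Skipping malformed items avoids the IndexError that i.split('~')[1] would raise on a string with no ~ in it.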
