A Simple Baidu Image Crawler in Python

This article shows how to write a simple Baidu image crawler in Python.

1. Preparation

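Before writing the crawler, install the libraries it depends on. Make sure Python is installed, then run the following commands (urllib3 is installed automatically as a dependency of requests, so that line is optional):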

pip install requests
pip install beautifulsoup4
pip install lxml
pip install urllib3

2. Crawling the Images

First, import the libraries we need:

import requests
from bs4 import BeautifulSoup
import os

Next, define a function that creates the directory where the images will be saved:

def create_dir(dir_name):
    # Create the directory (and any missing parents); do nothing if it already exists
    os.makedirs(dir_name, exist_ok=True)

Then define a function that downloads a single image:

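def download_image(url, dir_name, file_name):
    # A timeout keeps one stalled request from hanging the whole run
    response = requests.get(url, timeout=10)
    # Raise on 4xx/5xx so an error page is not saved as an image
    response.raise_for_status()
    with open(os.path.join(dir_name, file_name), 'wb') as f:
        f.write(response.content)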
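
A successful HTTP response is not always an image: a server can return an HTML page with status 200, and the function above would still save it under a .jpg name. The following is a minimal sketch of a check you could run on the downloaded bytes before writing them. The helper name detect_extension and the SIGNATURES table are hypothetical and are not part of the original code; the check only covers JPEG, PNG, and GIF.

```python
# Leading bytes ("magic numbers") of a few common image formats.
# Hypothetical helper table; extend it if you need other formats.
SIGNATURES = {
    b'\xff\xd8\xff': '.jpg',
    b'\x89PNG\r\n\x1a\n': '.png',
    b'GIF87a': '.gif',
    b'GIF89a': '.gif',
}


def detect_extension(data):
    """Return a file extension for recognised image bytes, or None otherwise."""
    for signature, extension in SIGNATURES.items():
        if data.startswith(signature):
            return extension
    return None
```

One way to use it would be to call detect_extension(response.content) inside download_image and skip the file when the result is None, or to use the returned extension instead of always naming the file .jpg.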

Now set the crawl parameters:

keyword = '美食'  # search keyword ("food")
num_images = 10  # maximum number of images to download
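
The keyword ends up in the query string of the search URL, so it is worth seeing how it is encoded. Characters such as spaces or & in a keyword would change the meaning of the query string if they were pasted in unescaped. Here is a small sketch using the standard library's urllib.parse.urlencode; the function name build_search_url is hypothetical, while the base URL and parameter names are the ones this article uses for the search request.

```python
from urllib.parse import urlencode

# Base URL and parameter names as used in this article's search request.
BASE_URL = 'https://image.baidu.com/search/flip'


def build_search_url(keyword):
    """Return the search URL with the keyword percent-encoded as UTF-8."""
    query = urlencode({'tn': 'baiduimage', 'ie': 'utf-8', 'word': keyword})
    return BASE_URL + '?' + query


print(build_search_url('美食'))
# → https://image.baidu.com/search/flip?tn=baiduimage&ie=utf-8&word=%E7%BE%8E%E9%A3%9F
```

If you pass the parameters to requests.get through its params argument, requests performs this encoding for you, so the helper is mainly useful for logging or debugging the final URL.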

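Then request the Baidu image search page and extract the image links from it. The selector below depends on the page's markup; if Baidu changes it or fills in the results with JavaScript, img_tags will be empty and nothing is downloaded: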

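search_url = 'https://image.baidu.com/search/flip'
params = {'tn': 'baiduimage', 'ie': 'utf-8', 'word': keyword}  # requests URL-encodes the keyword
# Assumption: the site may reject requests that lack a browser-like User-Agent
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(search_url, params=params, headers=headers, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')
img_tags = soup.find_all('img', class_='main_img')
# Skip tags without the attribute instead of raising KeyError
img_urls = [img['data-imgurl'] for img in img_tags if img.has_attr('data-imgurl')]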
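
To see what this extraction step does without touching the network, here is a sketch that applies the same rule (img tags whose class includes main_img and that carry a data-imgurl attribute) to a small HTML snippet. It uses the standard library's html.parser instead of BeautifulSoup so it runs with no extra installs. The class ImageUrlParser and the sample markup are made up for illustration; the real page may look different.

```python
from html.parser import HTMLParser


class ImageUrlParser(HTMLParser):
    """Collect data-imgurl values from <img> tags whose class includes main_img."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != 'img':
            return
        attributes = dict(attrs)
        # An attribute without a value is reported as None, hence the "or ''"
        classes = (attributes.get('class') or '').split()
        url = attributes.get('data-imgurl')
        if 'main_img' in classes and url:
            self.urls.append(url)


# Made-up markup for illustration only.
SAMPLE_HTML = '''
<ul>
  <li><img class="main_img img-hover" data-imgurl="https://example.com/a.jpg"></li>
  <li><img class="main_img" src="https://example.com/no-attr.jpg"></li>
  <li><img class="thumb" data-imgurl="https://example.com/skip.jpg"></li>
  <li><img class="main_img" data-imgurl="https://example.com/b.jpg"></li>
</ul>
'''

parser = ImageUrlParser()
parser.feed(SAMPLE_HTML)
print(parser.urls)
# → ['https://example.com/a.jpg', 'https://example.com/b.jpg']
```

The second tag is skipped because it has no data-imgurl attribute, and the third because its class does not match. Running the same kind of check against a saved copy of the real response is a quick way to find out whether the selector still matches anything.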

Finally, create the output directory and download the images:

dir_name = 'images'
create_dir(dir_name)

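for i, img_url in enumerate(img_urls[:num_images]):
    file_name = 'image{}.jpg'.format(i+1)
    try:
        download_image(img_url, dir_name, file_name)
        print('Downloaded {}'.format(file_name))
    except requests.RequestException as exc:
        # Skip images that fail to download instead of aborting the run
        print('Failed {}: {}'.format(file_name, exc))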

3. Results

Running the code above downloads up to num_images images and saves them in the images directory as image1.jpg, image2.jpg, and so on.

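That completes a simple Baidu image crawler. In this article we walked through fetching a search page with requests, extracting image links with BeautifulSoup, and downloading the images in bulk. We hope you find it useful!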