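A Python Template for Scraping Image Websites

This article presents template code, written in Python, for scraping images from a website. With the template as a starting point, you can quickly set up image downloads for many image websites.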

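1. Importing Modules

First, import the modules we need: requests, BeautifulSoup (from the bs4 package), and os. The first two are third-party packages and can be installed with pip install requests beautifulsoup4; os is part of the standard library.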

import requests
from bs4 import BeautifulSoup
import os

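2. Requesting and Parsing the Page

Next, we send a request and parse the page content. The example below fetches one page of an image website and collects the links found in its <img> tags.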

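from urllib.parse import urljoin  # standard library; resolves relative image paths

def get_html(url):
    # A timeout keeps the request from hanging indefinitely
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop on 4xx/5xx instead of parsing an error page
    response.encoding = 'utf-8'  # assumes a UTF-8 page; change for other encodings
    return response.text

def parse_html(html, base_url):
    soup = BeautifulSoup(html, 'html.parser')
    # src=True skips <img> tags that have no src attribute
    image_tags = soup.find_all('img', src=True)
    # Convert relative paths into absolute URLs based on the page URL
    image_urls = [urljoin(base_url, tag['src']) for tag in image_tags]
    return image_urls

url = 'https://www.example.com'  # replace with the site you want to scrape
html = get_html(url)
image_urls = parse_html(html, url)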
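
The src attribute of an <img> tag is often a relative path, which requests.get cannot fetch on its own. The following is a minimal sketch of how such values could be turned into absolute URLs with urljoin from the standard library. The page URL, the sample src values, and the helper name to_absolute are assumptions made for illustration and are not part of the template.

```python
from urllib.parse import urljoin

# Hypothetical page URL and src values, chosen only for illustration
page_url = 'https://www.example.com/gallery/index.html'
src_values = [
    '/static/cat.jpg',                   # path relative to the site root
    'thumbs/dog.png',                    # path relative to the current page
    '//cdn.example.com/bird.gif',        # scheme-relative URL
    'https://img.example.org/fish.jpg',  # already absolute
]

def to_absolute(page_url, src):
    # urljoin resolves src against page_url; absolute URLs pass through unchanged
    return urljoin(page_url, src)

for src in src_values:
    print(src, '->', to_absolute(page_url, src))
```

Passing the URL of the page that was actually fetched as the first argument matters, because page-relative paths such as thumbs/dog.png are resolved against its directory.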

3. Downloading the Images

Once we have the image links, we use the requests module again to fetch each image and write it to disk.

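def download_image(url, save_path):
    # Fetch the image bytes; raise on HTTP errors instead of saving an error page
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    with open(save_path, 'wb') as f:
        f.write(response.content)

def download_images(image_urls):
    save_dir = 'images'  # directory where the images are saved
    os.makedirs(save_dir, exist_ok=True)
    for i, url in enumerate(image_urls):
        # The .jpg extension is an assumption; the actual image format may differ
        save_path = os.path.join(save_dir, f'image_{i}.jpg')
        try:
            download_image(url, save_path)
        except requests.RequestException as e:
            # Skip images that fail so one bad link does not stop the whole run
            print(f'Skipping {url}: {e}')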

download_images(image_urls)
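
The template saves every file with a .jpg extension, even when the link points to a PNG or GIF. As one possible refinement, the extension could be taken from the URL path. The sketch below assumes a hypothetical helper named build_save_path and a fallback of .jpg when the URL carries no extension; neither is part of the template.

```python
import os
from urllib.parse import urlparse

def build_save_path(save_dir, index, url, default_ext='.jpg'):
    # Use only the path part of the URL so a query string is ignored
    path = urlparse(url).path
    # splitext returns (root, ext); ext is '' when the path has no extension
    ext = os.path.splitext(path)[1].lower()
    if not ext:
        ext = default_ext  # assumed fallback when the URL gives no hint
    return os.path.join(save_dir, f'image_{index}{ext}')

print(build_save_path('images', 0, 'https://www.example.com/a/cat.PNG?size=large'))
print(build_save_path('images', 1, 'https://www.example.com/photo/123'))
```

This relies on the URL alone. A server may deliver a different format than the extension suggests, so checking the Content-Type response header would be a more reliable, if longer, approach.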

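4. Running the Code

Save the code above in a .py file (named scrape_images.py here) and run it with the Python interpreter to start scraping images.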

python scrape_images.py
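
Instead of editing the file for each new site, the page URL and the save directory could be passed on the command line. The following is a minimal sketch using argparse from the standard library. The positional url argument, the --save-dir option, and the function name build_parser are all hypothetical and not part of the template.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description='Download the images found on a web page.')
    # Positional argument: the page to scrape
    parser.add_argument('url', help='URL of the page to scrape')
    # Optional argument: argparse stores --save-dir as args.save_dir
    parser.add_argument('--save-dir', default='images', help='directory for the downloaded images')
    return parser

# An explicit argument list is parsed here for demonstration;
# a real script would call build_parser().parse_args() with no arguments
args = build_parser().parse_args(['https://www.example.com', '--save-dir', 'pics'])
print(args.url, args.save_dir)
```

With such a parser in place, the script could be started as python scrape_images.py https://www.example.com --save-dir pics, and the parsed values would replace the hard-coded url and save_dir.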

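5. Summary

This article walked through template code for scraping images from a website with Python. By changing the request URL and the image save path in the template, you can adapt it to different scraping needs. We hope it is of some help in your image scraping work.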