
Python multithreaded crawler

Time: 2023-11-14 18:05:21 Views: 49

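Python's standard library includes the `threading` module, which can be used for multithreaded programming. Below is a simple example of a multithreaded crawler:

```python
import threading

import requests


class SpiderThread(threading.Thread):
    """A thread that downloads one URL and prints the response body."""

    def __init__(self, url):
        super().__init__()
        self.url = url

    def run(self):
        # Runs in the new thread once start() has been called.
        # The timeout keeps an unreachable host from blocking the thread forever.
        response = requests.get(self.url, timeout=10)
        print(response.text)


urls = ['https://www.baidu.com', 'https://www.google.com', 'https://www.bing.com']
threads = []

for url in urls:
    thread = SpiderThread(url)
    thread.start()  # start fetching in the background
    threads.append(thread)

# Wait for every thread to finish before the program exits.
for thread in threads:
    thread.join()
```

In this example we define a `SpiderThread` class that inherits from `threading.Thread` and overrides its `run()` method. Inside `run()`, we use the `requests` library to send a GET request to the given URL (with a timeout, so a host that never answers cannot hang the thread) and print the response body. In the main program we build a `urls` list containing the sites we want to crawl. For each URL we create a `SpiderThread` object, call `start()` to launch the thread, and append the thread to the `threads` list. Finally, we iterate over `threads` and call `join()` on each one, which blocks until all threads have finished.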
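Printing from inside `run()` makes the downloaded pages hard to reuse, and output from different threads can interleave. A minimal sketch of one possible variation is shown here: each thread stores its result on itself, and the main thread reads it after `join()`. The names `fake_fetch`, `CollectingSpiderThread`, and `crawl`, as well as the `example.com` URLs, are hypothetical and not part of the original answer. `fake_fetch` is a stand-in so the sketch runs without network access; in a real crawler it would be replaced by something like `lambda url: requests.get(url, timeout=10).text`.

```python
import threading
import time


def fake_fetch(url):
    """Hypothetical stand-in for an HTTP request; sleeps to mimic latency."""
    time.sleep(0.1)
    return f"<html>content of {url}</html>"


class CollectingSpiderThread(threading.Thread):
    """Same start()/join() pattern, but keeps the result instead of printing it."""

    def __init__(self, url, fetch=fake_fetch):
        super().__init__()
        self.url = url
        self.fetch = fetch
        self.result = None  # filled in by run() on success
        self.error = None   # filled in by run() on failure

    def run(self):
        try:
            self.result = self.fetch(self.url)
        except Exception as exc:
            # Keep the failure so the main thread can inspect it after join().
            self.error = exc


def crawl(urls, fetch=fake_fetch):
    """Fetch all URLs concurrently; return {url: body, or None on failure}."""
    threads = [CollectingSpiderThread(url, fetch) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # after join(), thread.result is safe to read
    return {thread.url: thread.result for thread in threads}


if __name__ == "__main__":
    pages = crawl(["https://example.com/a", "https://example.com/b"])
    for url, body in pages.items():
        print(url, len(body))
```

Because each thread writes only to its own attributes and the main thread reads them only after `join()` returns, no lock is needed in this sketch. For a large number of URLs, a bounded pool such as `concurrent.futures.ThreadPoolExecutor` is usually a better fit than one thread per URL.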
