
Middleware in the Scrapy framework

  •   Ewig · Jan 16, 2019 · 1547 views
    import random

    import pymongo
    from fake_useragent import UserAgent
    from scrapy.http.headers import Headers

    from Espider.tools.get_cookies import get_cookies
    from Espider.tools.user_agents import user_agents


    class zhipincookiemiddleware():
        """Downloader middleware that sets a random User-Agent and cookie on each request."""

        def __init__(self, mongodbHost, mongodbPort, mongodbName):
            self.mongodbHost = mongodbHost
            self.mongodbPort = mongodbPort
            self.mongodbName = mongodbName
            # Create the client and the User-Agent source once and reuse them,
            # rather than rebuilding both for every request.
            self.client = pymongo.MongoClient(self.mongodbHost, self.mongodbPort)
            self.mongodb = self.client[self.mongodbName]
            self.ua = UserAgent()

        @classmethod
        def from_crawler(cls, crawler):
            # Connection parameters are read from the crawler settings.
            return cls(mongodbHost=crawler.settings.get('MONGODB_HOST'),
                       mongodbPort=crawler.settings.get('MONGODB_PORT'),
                       mongodbName=crawler.settings.get('MONGODB_DBNAME'))

        def process_request(self, request, spider):
            # Read the 'cookie' field of the first document in "<spider name>_cookie".
            self.collection = self.mongodb[spider.name + '_cookie']
            self.cookies_str = self.collection.find_one()['cookie']
            self.headers = {
                "User-Agent": self.ua.random,
                "cookie": random.choice(self.cookies_str)}

            # This replaces every header already set on the request.
            request.headers = Headers(self.headers)

    I wrote a cookie middleware for the framework.
    It picks a random cookie. Will the cookie be re-randomized on every request?
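Scrapy calls `process_request` on each enabled downloader middleware for every request that goes through the downloader, so the `random.choice` call above runs again each time. The sketch below imitates that flow without Scrapy or MongoDB so the behaviour can be checked in isolation. `FakeRequest`, `RandomCookieMiddleware`, and `simulate` are hypothetical stand-ins, not part of Scrapy or of the original project. One thing worth checking in the real code: if the `cookie` field holds a single string rather than a list of cookie strings, `random.choice` returns one character of that string.

```python
import random


class FakeRequest:
    """Hypothetical stand-in for scrapy.Request: a URL plus a headers dict."""

    def __init__(self, url):
        self.url = url
        self.headers = {}


class RandomCookieMiddleware:
    """Hypothetical reduction of the middleware: cookies come from a list, not MongoDB."""

    def __init__(self, cookies, rng=None):
        self.cookies = list(cookies)
        # A dedicated Random instance makes the simulation reproducible.
        self.rng = rng or random.Random()

    def process_request(self, request, spider=None):
        # Runs once per request, so a new cookie is drawn every time.
        request.headers["cookie"] = self.rng.choice(self.cookies)
        return None  # None tells the (real) downloader to keep processing the request


def simulate(n_requests, cookies, seed=0):
    """Push n_requests fake requests through the middleware and collect the cookies used."""
    middleware = RandomCookieMiddleware(cookies, rng=random.Random(seed))
    chosen = []
    for i in range(n_requests):
        request = FakeRequest(f"https://example.com/page/{i}")
        middleware.process_request(request)
        chosen.append(request.headers["cookie"])
    return chosen


if __name__ == "__main__":
    used = simulate(10, ["sid=aaa", "sid=bbb", "sid=ccc"])
    print(used)
```

Running the script prints the cookie attached to each of the ten simulated requests. With a pool of several cookies the values vary from request to request, although the same cookie can be drawn twice in a row because each draw is independent.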
    1 reply    2019-01-17 09:21:54 +08:00
    holajamc · Jan 17, 2019
    Is your company still hiring?