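Hello. I'm the person who emailed you. I set up the environment today and finally got it running. I still have a few questions; if you have time, please help answer them:
If you have time, please reply, either by email or here. Thank you very much!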
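Does this project crawl all of the audio information from an entire website?
I can't guarantee that it crawls everything, but around one million items per website should be achievable.
If one of the websites runs to completion normally, roughly how long does it take?
It depends on your rate limit and bandwidth. With 100Mb of bandwidth fully saturated, one website takes two to three days.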
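As a rough sanity check on that estimate, the arithmetic can be sketched as follows. This assumes "100Mb" means 100 megabits per second and that the link stays saturated the whole time; the function name `transfer_volume_tb` is made up for illustration and is not part of the project.

```python
def transfer_volume_tb(bandwidth_mbit_s: float, days: float) -> float:
    """Data moved (in decimal terabytes) by a fully saturated link."""
    # Assumption: bandwidth is given in megabits per second.
    bytes_per_second = bandwidth_mbit_s * 1_000_000 / 8
    seconds = days * 86_400
    return bytes_per_second * seconds / 1e12


for days in (2, 3):
    print(f"{days} days at 100 Mbit/s: {transfer_volume_tb(100, days):.2f} TB")
```

Under those assumptions, two to three days of crawling corresponds to roughly 2 to 3 TB of downloaded data, so disk space is worth checking before starting a full run.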
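Thank you for answering these simple questions. I have two more; please help with them when you have time.
1: I started running the script the day before yesterday, but when I got up in the morning I found it stuck on one URL and no longer making progress. MongoDB had 300,000 records at that point, and I found nothing related in debug.log. After I force-killed the script and restarted it, it ran again. Could the website be limiting my crawling? If not, what is the likely cause?
2: I want to increase the crawl speed. After reading the official Scrapy documentation, I changed the following settings: AUTOTHROTTLE_START_DELAY=2 CONCURRENT_REQUESTS=128. I then changed the Mongo storage count: ITEM_PIPELINES = { 'tutorial.pipelines.SaveToMongo': 300,}. Can these changes improve the crawl speed?
If you have time, please help answer these. Thank you very much!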
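For reference, here is a minimal sketch of how the settings mentioned above might sit in a Scrapy `settings.py`, with notes on what each one controls. The values for `CONCURRENT_REQUESTS_PER_DOMAIN`, `AUTOTHROTTLE_TARGET_CONCURRENCY`, and `DOWNLOAD_TIMEOUT` are illustrative assumptions, not the project's actual configuration, and whether the project enables AutoThrottle at all is also assumed here.

```python
# settings.py (sketch; values marked "assumed" are illustrative only)

# Global cap on requests in flight across all domains (Scrapy default: 16).
CONCURRENT_REQUESTS = 128

# Per-domain cap (Scrapy default: 8). When crawling a single website this
# limit is usually reached first, so raising only CONCURRENT_REQUESTS may
# have little effect. Value assumed.
CONCURRENT_REQUESTS_PER_DOMAIN = 32

# AutoThrottle adapts the download delay to the observed latency.
# AUTOTHROTTLE_START_DELAY is only the initial delay, and it is ignored
# unless AUTOTHROTTLE_ENABLED is True.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 2
AUTOTHROTTLE_TARGET_CONCURRENCY = 8.0  # assumed; Scrapy default is 1.0

# Abandon a request after 30 s instead of the default 180 s, so one slow
# URL is less likely to hold a download slot for minutes. Value assumed.
DOWNLOAD_TIMEOUT = 30

# The integer is the pipeline's ordering priority (0-1000, lower runs
# first). It does not control how many items are written to MongoDB.
ITEM_PIPELINES = {
    "tutorial.pipelines.SaveToMongo": 300,
}
```

Note that the `300` in `ITEM_PIPELINES` only decides the order in which pipelines run, so changing it should not affect throughput. The timeout is one possible mitigation for a crawl that appears stuck on a single URL, not a confirmed diagnosis of the stall described above.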