
call_scrapy_in_a_pyScript


Commits

List of commits on branch master.
d9a09ee26004fdda4cc0842df2ce1c7f2e3a6594

scrapy in one py script

wwilliamjqk committed 8 years ago
9d4eb111307685c214decea2174f2101c7438da5

:tada: Added .gitattributes

wwilliamjqk committed 8 years ago

README


Use the Scrapy library to crawl page URLs

I rewrote my web crawler using the Scrapy library. Most Scrapy examples found online run it from the command line, treating Scrapy as a standalone application. I wanted to call the same functionality from inside a single Python script, so I wrote this.

Note: This works differently from the earlier bs4-based program. Searching online afterwards, I found that Scrapy does not support user-managed multithreading: concurrency is an internal optimization of Scrapy itself and cannot be configured manually as threads, although you can still run several crawlers. Even so, using yield, Request, and callback together is still very efficient.