Scrapy on Debian 6
Debian 6 ships Scrapy 0.8 as an apt package. Here is a quick howto to get a spider working on Debian:
1. sudo apt-get install python-scrapy
2. cd
3. mkdir mydir
4. cd mydir
5. scrapy-ctl startproject anime
6. export SCRAPY_SETTINGS_MODULE=anime.settings
7. export PYTHONPATH=/home/YOURHOMEHERE/mydir
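The two export commands above can also be written without hard-coding the home directory. This is only a minimal sketch, assuming the project lives in ~/mydir and is named anime as in the steps above; PROJECT_PARENT is a hypothetical variable name used for illustration.

```shell
# Assumption: the project was created in ~/mydir with the name "anime".
PROJECT_PARENT="$HOME/mydir"   # hypothetical helper variable, not part of Scrapy

# Tell Scrapy which settings module to load.
export SCRAPY_SETTINGS_MODULE=anime.settings

# Put the project's parent directory first on the Python path,
# keeping any value that PYTHONPATH already had.
export PYTHONPATH="$PROJECT_PARENT${PYTHONPATH:+:$PYTHONPATH}"

# Print both values so they can be checked by eye.
echo "settings module: $SCRAPY_SETTINGS_MODULE"
echo "python path:     $PYTHONPATH"
```

These exports only last for the current shell session, so they have to be repeated (or placed in a shell startup file) before running scrapy-ctl in a new terminal.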
If you already have a bot, then thanks to points 6 and 7 you can run your spider by simply typing:
scrapy-ctl crawl example.com
Otherwise, you can now follow the tutorial section of the Scrapy 0.8 documentation or this awesome howto by Pravin Paratey to write your own bot. Remember to use the scrapy-ctl command instead of the .py version, and to make all your spiders reachable through SCRAPY_SETTINGS_MODULE and PYTHONPATH.
awesome howto by Pravin Paratey (http://pravin.insanitybegins.com)
To list your available (and correctly configured) spiders, just type:
scrapy-ctl list
If a bot doesn’t appear here, you have an issue with point 6 or 7, or you have a misconfigured spider. For example, I had forgotten the SPIDER part at the bottom of my spider and was using domain instead of domain_name in my script. See Pravin’s howto to write correct Scrapy 0.8 code.
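To rule out a problem with points 6 and 7 before looking at the spider code, a quick check of the two variables can help. The following is a rough sketch, not part of Scrapy: check_scrapy_env is a hypothetical helper, and it assumes the settings module maps to a plain .py file under one of the PYTHONPATH entries.

```shell
# Hypothetical helper: checks the two variables set in points 6 and 7.
check_scrapy_env() {
    if [ -z "${SCRAPY_SETTINGS_MODULE:-}" ]; then
        echo "SCRAPY_SETTINGS_MODULE is not set"
        return 1
    fi
    if [ -z "${PYTHONPATH:-}" ]; then
        echo "PYTHONPATH is not set"
        return 1
    fi

    # Turn a dotted module name such as anime.settings into anime/settings.py
    rel_path="$(printf '%s' "$SCRAPY_SETTINGS_MODULE" | tr '.' '/').py"

    # Look for that file under each colon-separated PYTHONPATH entry.
    old_ifs="$IFS"
    IFS=':'
    for dir in $PYTHONPATH; do
        if [ -f "$dir/$rel_path" ]; then
            IFS="$old_ifs"
            echo "found $dir/$rel_path"
            return 0
        fi
    done
    IFS="$old_ifs"

    echo "$rel_path not found under PYTHONPATH"
    return 1
}
```

Running check_scrapy_env in the same shell where scrapy-ctl list comes up empty shows whether the environment is at fault. If it reports that the settings file was found, the spider itself (for example the SPIDER line or domain_name) is the more likely culprit.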
https://web.archive.org/web/20130104000000*/http://doc.scrapy.org/en/0.8/intro/tutorial.html (https://web.archive.org)
https://web.archive.org/web/20130104000000*/http://pravin.insanitybegins.com/posts/writing-a-spider-in-10-mins-using-scrapy (https://web.archive.org)
| Original URL | gemini://chirale.org/2013-01-04_920.gmi |
|---|---|