
Scrapy on Debian 6

Debian 6 ships Scrapy 0.8 as an apt package. Here is a quick howto to get this spider framework working on Debian.

Scrapy 0.8 (http://doc.scrapy.org)


1. sudo apt-get install python-scrapy
2. cd
3. mkdir mydir
4. cd mydir
5. scrapy-ctl startproject anime
6. export SCRAPY_SETTINGS_MODULE=anime.settings
7. export PYTHONPATH=/home/YOURHOMEHERE/mydir
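
The two exports in the setup above (points 6 and 7) are easy to get wrong, so here is a minimal sketch of a sanity check for them. It is not part of Scrapy: check_scrapy_env is a hypothetical helper name, and it assumes the project is a regular Python package, i.e. a directory with an __init__.py sitting directly under one of the PYTHONPATH entries.

```python
import os


def check_scrapy_env(environ=None):
    """Return a list of problems with the Scrapy environment variables.

    An empty list means both variables are set and the top-level package
    named in SCRAPY_SETTINGS_MODULE was found under PYTHONPATH.
    This is an illustrative helper, not something Scrapy provides.
    """
    env = os.environ if environ is None else environ
    problems = []

    settings = env.get("SCRAPY_SETTINGS_MODULE", "")
    paths = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]

    if not settings:
        problems.append("SCRAPY_SETTINGS_MODULE is not set (point 6)")
    if not paths:
        problems.append("PYTHONPATH is not set (point 7)")

    if settings and paths:
        # "anime.settings" -> top-level package "anime"
        package = settings.split(".")[0]
        # Assumption: the package is a plain directory with an __init__.py.
        found = any(
            os.path.isfile(os.path.join(p, package, "__init__.py"))
            for p in paths
        )
        if not found:
            problems.append(
                "package %r not found under any PYTHONPATH entry" % package
            )

    return problems


if __name__ == "__main__":
    # Print one line per problem; no output means the check passed.
    for problem in check_scrapy_env():
        print(problem)
```

Run it in the same shell where you typed the exports. If it prints nothing, the variables at least point to an existing package.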

If you already have a bot, points 6 and 7 let you run your spider by simply typing:

scrapy-ctl crawl example.com


Otherwise, you can now follow the tutorial section of the Scrapy 0.8 documentation, or this awesome howto by Pravin Paratey, to write your own bot. Remember to use the scrapy-ctl command instead of the .py version, and make sure all your spiders are reachable through SCRAPY_SETTINGS_MODULE and PYTHONPATH.

Scrapy 0.8 (http://doc.scrapy.org)
awesome howto by Pravin Paratey (http://pravin.insanitybegins.com)


To list your available (and correctly configured) spiders, just type:

scrapy-ctl list


If a bot doesn’t appear here, either there is a problem with point 6 or 7, or the spider itself is misconfigured. For example, I had forgotten the SPIDER part at the bottom of my spider and was using domain instead of domain_name in my script. See Pravin’s howto to write correct Scrapy 0.8 code.
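
The two spider mistakes mentioned above can be spotted without running Scrapy at all. What follows is only a rough sketch, not Scrapy's own validation: check_spider_source is a hypothetical helper that parses a spider file with Python's standard ast module and looks for a domain_name class attribute and a module-level SPIDER assignment. It assumes the spider is written in the style this post describes and that the file parses with the Python version you run the check under.

```python
import ast


def check_spider_source(source):
    """Return a list of problems found in the source text of a spider module.

    Illustrative only: it checks just the two mistakes described in the
    post (a missing SPIDER instance, and 'domain' used for 'domain_name').
    """
    tree = ast.parse(source)
    has_spider = False
    has_domain_name = False
    uses_domain = False

    for node in tree.body:
        if isinstance(node, ast.Assign):
            # Module-level "SPIDER = ..." at the bottom of the file.
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "SPIDER":
                    has_spider = True
        elif isinstance(node, ast.ClassDef):
            # Class attributes assigned directly in the class body.
            for item in node.body:
                if not isinstance(item, ast.Assign):
                    continue
                for target in item.targets:
                    if not isinstance(target, ast.Name):
                        continue
                    if target.id == "domain_name":
                        has_domain_name = True
                    elif target.id == "domain":
                        uses_domain = True

    problems = []
    if not has_domain_name:
        if uses_domain:
            problems.append("class sets 'domain' instead of 'domain_name'")
        else:
            problems.append("no class sets 'domain_name'")
    if not has_spider:
        problems.append("no module-level 'SPIDER = ...' assignment")
    return problems
```

To use it, read your spider file into a string and pass it in, for example check_spider_source(open("myspider.py").read()), where myspider.py stands for your own file. An empty list means neither of the two mistakes was found. It says nothing about whether the spider is otherwise correct.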

https://web.archive.org/web/20130104000000*/http://doc.scrapy.org/en/0.8/intro/tutorial.html (https://web.archive.org)
https://web.archive.org/web/20130104000000*/http://pravin.insanitybegins.com/posts/writing-a-spider-in-10-mins-using-scrapy (https://web.archive.org)
Original URL: gemini://chirale.org/2013-01-04_920.gmi