Project

basilisk

0.01
No commit activity in last 3 years
No release in over 3 years
A command-line front-end for the anemone web spider. Generates SEO and HTTP error reports and an XML sitemap. Includes an extensible page handler.
 Dependencies

Runtime

>= 0.1.2
>= 1.5.0
>= 1.3.0
 Project Readme

basilisk

A command-line front-end for the anemone web crawler (github.com/chriskite/anemone). basilisk produces reports that are useful when QA-ing websites. It also features an extensible page processor class for writing your own page processors.
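As a rough illustration of what an extensible page processor can look like, here is a minimal sketch. It is not basilisk's actual class: `Page`, `PageProcessor`, `MissingTitleProcessor` and `run_processors` are all hypothetical names invented for this example, and the pages are plain structs rather than anemone page objects.

```ruby
# Hypothetical stand-in for a crawled page; not basilisk's or anemone's page class.
Page = Struct.new(:url, :code, :title)

# Hypothetical base class: subclasses override #process and, optionally, #finish.
class PageProcessor
  # Called once for every crawled page.
  def process(page)
    raise NotImplementedError, "#{self.class} must implement #process"
  end

  # Called once after the crawl; returns whatever report the processor built.
  def finish
    nil
  end
end

# Example custom processor: records the urls of pages with a missing or blank title.
class MissingTitleProcessor < PageProcessor
  def initialize
    @missing = []
  end

  def process(page)
    @missing << page.url if page.title.nil? || page.title.strip.empty?
  end

  def finish
    @missing
  end
end

# Feeds every page to every processor, then collects each processor's report.
def run_processors(pages, processors)
  pages.each { |page| processors.each { |processor| processor.process(page) } }
  processors.map(&:finish)
end
```

In this sketch a new report only needs a subclass that overrides `process` and `finish`; the loop in `run_processors` stays unchanged.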

Included page processors:

  • seo: generates a CSV with the following columns: url, title, description, keywords, h1s, h2s

  • sitemap: generates an XML sitemap

  • image: generates a list of broken images and images lacking an alt attribute

  • error: generates a CSV of urls returning HTTP response codes other than success and redirect
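To make the error processor's filtering rule concrete, here is a small sketch of the idea using Ruby's standard csv library. It assumes that "success and redirect" means the 2xx and 3xx status ranges, and that crawl results arrive as `[url, status_code]` pairs. Both the input shape and the names `error_status?` and `error_report_csv` are assumptions made for this example, not basilisk's real code or output format.

```ruby
require "csv"

# Assumption: 2xx (success) and 3xx (redirect) are fine; everything else is reported.
def error_status?(code)
  !(200..399).cover?(code)
end

# results: array of [url, status_code] pairs (a hypothetical input shape).
# Returns the report as a CSV string with a header row.
def error_report_csv(results)
  CSV.generate do |csv|
    csv << %w[url code]
    results.each do |url, code|
      csv << [url, code] if error_status?(code)
    end
  end
end
```

For example, passing in pages with status codes 200, 301, 404 and 500 should yield a report containing only the 404 and 500 rows.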

See the generated YAML config file for more options.

install

sudo gem install basilisk

usage

To create a new search:

basil create [search_name] [url]
  • Creates a search config file ([search_name].yml). Edit it to change the default options, choose which page processors to run, add regex and CSS terms to search for across the site, and add regexes for urls to skip.

To run the search:

basil run [search_name]
  • Runs the specified search. Note: you must create a search before running it. Files generated by the page processors will reside in a folder called [search_name].

author & license

basilisk is licensed under a modified MIT licence. See LICENCE.txt.

basilisk was written by Kyle Banker and depends heavily on the anemone web crawler by Chris Kite.

Copyright 2009 Alexander Interactive, Inc.