Software Tools/Search

= Search Index Systems, Especially for Full-Text Search =


 * Xapian
 * Whoosh
 * Solr/Lucene
 * Sphinx

See also: http://knowledgeforge.net/ckan/trac/wiki/SearchEngine

Xapian

 * C++ library with many language bindings
 * http://xapian.org/

Whoosh

 * Pure Python
 * Quite slow; no way to adapt indexing (tokenization, expansion, etc.)
 * Tends to corrupt its own index when multithreading.
 * http://whoosh.ca

Solr

 * Heavy-weight Java app
 * Nice Python package: [solrpy](http://code.google.com/p/solrpy/) (there are others but this one seems to be actively developed)
 * http://pudo.org:8983/solr/www.ckan.net/admin/ has an irregularly updated index of www.ckan.net for testing. Schema and transfer script available upon request. --pudo
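
A minimal sketch of querying a Solr index from Python using only the standard library, for cases where installing a client package such as solrpy is not wanted. It assumes the Solr core exposes the standard `select` handler and has the JSON response writer (`wt=json`) enabled. The base URL, the field name `title`, and the helper names `build_select_url` and `fetch_docs` are illustrative and not taken from the test index mentioned above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, query, rows=10):
    """Return the URL for a query against Solr's select handler."""
    # urlencode escapes the query string (e.g. ':' becomes '%3A').
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return base_url.rstrip("/") + "/select?" + params


def fetch_docs(base_url, query, rows=10, timeout=10):
    """Run the query over HTTP and return the list of matching documents."""
    with urlopen(build_select_url(base_url, query, rows), timeout=timeout) as resp:
        payload = json.load(resp)
    # Solr's JSON response nests the hits under response -> docs.
    return payload["response"]["docs"]


if __name__ == "__main__":
    # Hypothetical local core; replace with a real Solr base URL.
    print(build_select_url("http://localhost:8983/solr/example", "title:water", rows=5))
```

Only `fetch_docs` touches the network; `build_select_url` can be used on its own to check what would be sent to the server.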