[BNM] Big money for a free text search implementation

AndrewGill73@gmail.com andrewgill73 at gmail.com
Wed Apr 29 16:27:43 BST 2009


You may want to consider using Sphinx (http://www.sphinxsearch.com/).

Andy Gill

2009/4/29 David Pashley <david at davidpashley.com>

> On Apr 29, 2009 at 15:08, Wayne Douglas praised the llamas by saying:
> > The problem with using a DB plugin/architecture approach is that you need
> > to define all of the columns in all of the tables where the search
> > indexing needs to take place. You also need to do this carefully, as it
> > could be - as far as FTS goes - expensive resource-wise.
> >
> > The best way to do it really would be to actually crawl the page output
> > on all of the searchable areas, as Google does. This is obviously a lot
> > harder to accomplish than just adding an attribute to an object. Unless
> > you use Google site search - but I'm not sure how skinnable that is. I
> > think it looks shit (from the implementations that I've seen) in a site
> > that's not OSS or charity, to be fair, as it looks like a 10 minute jobby
> > - which is what it is.
> >
> >
> > In a greenfield project, adding search capability by adding attributes
> > to, say, domain objects isn't an issue. In an app that's evolved over the
> > years, that could be a right old clusterf*ck.
> >
> > Not sure what MySQL FTS is like, but I know that the SQLS version is
> > pretty shiz.
> >
> Oh, definitely wouldn't use a database for indexing something that
> wasn't already in the database and wasn't a large dataset.
>
> As people have pointed out, an architecture built on top of Lucene (or
> its many reimplementations) would be a sensible plan. Lucene, however,
> doesn't scale well after a point, and it isn't easy to ensure high
> availability. Solr has some rudimentary replication from what I
> remember.
>
> From the last time I looked at this, at a certain point, you probably
> want to look into using DLucene[0] and Hadoop, although it wasn't really
> ready for prime time then.
>
>
> [0] That's Distributed Lucene, not Lucene implemented in D or Delphi.
> http://wiki.apache.org/hadoop/DistributedLucene
> --
> David Pashley
> david at davidpashley.com
> Nihil curo de ista tua stulta superstitione.
> --
>
> BNM Subscribe/Unsubscribe:
> http://www.brightonnewmedia.org/options/bnmlist
>
> BNM powered by Wessex Networks:
> http://www.wessexnetworks.com
>

