[pylucene-dev] pylucene and recommendations for RAM

Andi Vajda vajda at osafoundation.org
Thu Apr 5 13:01:13 PDT 2007


On Thu, 5 Apr 2007, David Pratt wrote:

> Hi Andi. What needs to be done and where would one get started - is this java 
> programming? This is where we need Nuxeo and their java skills :-) Do you 
> feel this could solve the issue of a remotely distributed index? It seems as 
> though it is the answer - but I have no experience with this at the moment to 
> draw from. I have been reading posts on mailing lists about large indexes. 
> Can you see a series of 2GB servers serving up search for 50+ million docs 
> efficiently through the Remote Parallel Multisearcher? Many thanks.

The RemoteSearchable APIs need to be wrapped and serialization for PyLucene 
Java objects needs to be implemented. Then, if you want to mix and match 
(yeah, right) Lucene remote and PyLucene local, you need to also make sure the 
bits match bit for bit on both sides. I don't know how good the libgcj 
serialization support is, but my feeling is not too good (read: not too mature). 
This may change with the arrival of OpenJDK and its possible use with libgcj.

Seriously, I think that a remote solution for PyLucene has a much higher 
chance of success with a Python-specific API built and designed for Python.
A more hands-off approach, i.e. not using raw serialization or pickling but a 
higher-level search API, has a much better chance.
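
To illustrate what such a higher-level API might exchange, here is a minimal 
sketch in which each search server returns only plain values (document ids 
and scores) as JSON and the client merges them. The names encode_hits and 
merge_hits, the shard labels and the JSON layout are all hypothetical; 
nothing here is part of PyLucene or Lucene, and the actual search on each 
server is left out.

```python
import heapq
import json


def encode_hits(shard, hits):
    """Server side (hypothetical): pack (doc_id, score) pairs as plain JSON.

    Only strings and floats cross the wire, so no Java object
    serialization and no pickling is involved.
    """
    return json.dumps(
        {"shard": shard, "hits": [{"id": d, "score": s} for d, s in hits]}
    )


def merge_hits(payloads, limit):
    """Client side (hypothetical): decode replies, keep the best `limit` hits."""
    merged = []
    for payload in payloads:
        reply = json.loads(payload)
        for hit in reply["hits"]:
            merged.append((hit["score"], reply["shard"], hit["id"]))
    # nlargest sorts on the first tuple element, the score
    best = heapq.nlargest(limit, merged)
    return [(shard, doc_id, score) for score, shard, doc_id in best]


# Made-up replies from two servers, standing in for real search results
replies = [
    encode_hits("shard-a", [("doc1", 0.9), ("doc2", 0.4)]),
    encode_hits("shard-b", [("doc7", 0.7), ("doc8", 0.1)]),
]
top = merge_hits(replies, 3)
```

One caveat with this kind of merge: scores computed against independent 
indexes are not strictly comparable, so a real design would probably need to 
share term statistics between servers or accept approximate ranking.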

Andi..

