[pylucene-dev] simple threading segfault demo
Norbert Wojtowicz
wojtowicz.norbert at gmail.com
Mon Mar 12 19:31:20 PST 2007
Hello,
Someone else can probably give you more detailed advice, but the general
rule with PyLucene and threading is to use PyLucene.PythonThread wherever
you would normally use a Python thread (threading.Thread). It is a small
wrapper around Python's thread class that works around some issues with
gcj and its garbage collector. I can't explain the details myself, but
I've learned that this is the golden rule when working with PyLucene and
threads.
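
As a rough sketch of what that rule looks like for the Worker class in
your demo: the only real change is the base class. This assumes
PyLucene.PythonThread can be subclassed and joined like threading.Thread,
which is my understanding of the gcj build but not something I have
checked against your exact version. The BaseThread name, the done flag,
the run_workers helper, and the fallback to threading.Thread when
PyLucene is not importable are all my own additions for illustration, so
that the sketch can be run anywhere.

```python
import os
import tempfile
import threading

try:
    # The gcj-based PyLucene build discussed in this thread.
    import PyLucene
    BaseThread = PyLucene.PythonThread
except (ImportError, AttributeError):
    # Assumption for illustration only: fall back to a plain thread so
    # the sketch still runs on a machine without PyLucene installed.
    PyLucene = None
    BaseThread = threading.Thread


class Worker(BaseThread):
    def __init__(self, index_dir, worker_id):
        BaseThread.__init__(self)
        self.index_dir = index_dir
        self.worker_id = worker_id
        self.done = False  # hypothetical flag, set when run() finishes
        if not os.path.exists(index_dir):
            os.makedirs(index_dir)

    def run(self):
        if PyLucene is not None:
            # Same calls as the original demo, now made from a PythonThread.
            store = PyLucene.FSDirectory.getDirectory(self.index_dir, True)
            store.close()
        self.done = True


def run_workers(base_dir):
    """Start two workers on base_dir/1 and base_dir/2 and wait for both."""
    workers = [Worker(os.path.join(base_dir, str(i)), i) for i in (1, 2)]
    for worker in workers:
        worker.start()
    for worker in workers:
        # join() blocks until the thread exits, instead of spinning in a loop.
        worker.join()
    return workers


if __name__ == "__main__":
    finished = run_workers(tempfile.mkdtemp())
    print("workers finished: %d" % len(finished))
```

Waiting with join() rather than polling isAlive() in a loop is a separate
tidy-up and is not related to the segfault; it just avoids burning a CPU
while the workers run.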
Cheers,
Norbert
On Mon, 2007-03-12 at 20:15 -0700, Ofer Nave wrote:
> Hello.
>
> I wanted to try splitting my index up into two slices and indexing each in
> separate threads to see if it would run faster on a dual-proc box, but my
> script began segfaulting as soon as threading was added. This is the first
> time I've ever used threads in Python, so I might be doing something
> obviously stupid.
>
> Anyway, I pared down the script to a minimal test case that still yields a
> segfault. Here is the code:
>
> ---
> #!/usr/bin/python
> import os
> import sys
> import threading
>
> import PyLucene
>
> class Indexer(object):
>     def __init__(self, index_dir):
>         self.index_dir = index_dir
>         if not os.path.exists(index_dir):
>             os.mkdir(index_dir)
>
>     def run(self):
>         worker1 = Worker(self.index_dir + '/1', 1)
>         worker2 = Worker(self.index_dir + '/2', 2)
>         worker1.start()
>         worker2.start()
>         while (worker1.isAlive() or worker2.isAlive()):
>             pass
>
> class Worker(threading.Thread):
>     def __init__(self, index_dir, worker_id):
>         threading.Thread.__init__(self)
>         self.index_dir = index_dir
>         self.worker_id = worker_id
>         if not os.path.exists(index_dir):
>             os.mkdir(index_dir)
>
>     def run(self):
>         print 'woo hoo: ' + self.index_dir
>         self.store = PyLucene.FSDirectory.getDirectory(self.index_dir, True)
>         self.store.close()
>
> if __name__ == '__main__':
>     if len(sys.argv) < 2:
>         print "Usage: python " + __file__ + " <index_dir>"
>         sys.exit(1)
>     print 'PyLucene', PyLucene.VERSION, 'Lucene', PyLucene.LUCENE_VERSION
>     indexer = Indexer(sys.argv[1])
>     indexer.run()
> ---
>
> The output is as follows:
>
> [ofer@rnd01 ~/bin]$ lucene_segfault_demo /tmp
> PyLucene 2.1.0-1 Lucene 2.1.0-509013
> woo hoo: /tmp/1
> Segmentation fault
>
> Any ideas?
>
> -ofer