From edi at agharta.de Tue Mar 1 08:19:44 2005 From: edi at agharta.de (Edi Weitz) Date: Tue, 01 Mar 2005 09:19:44 +0100 Subject: [elephant-devel] Some problems installing Elephant Message-ID: Hi! I tried installing Elephant (0.2.1) on a Debian testing system with Berkeley DB 4.2 last night and came across a couple of issues which I'm reporting below. I don't know which of these are known so I'll just list them all. I. Allegro 7.0 1. I installed the 'libpth14' package from Debian for the threading library but got these results when compiling/loading Elephant: Error: Loading /usr/lib/libpthread.so failed with error: /usr/lib/libpthread.so: invalid ELF header. There's another package available called 'libpth2' but I get the same error with that one. I then proceeded without libpthread.so, i.e. I just removed the form which tried to load this library and all else seems to work (more or less). Is this library really needed? And what for? 2. I saw a couple of warnings during compilation: ;;; Compiling file ;;; /usr/local/lisp/source/elephant-0.2.1/src/collections.lisp ; While compiling (METHOD (SETF GET-VALUE) (T T BTREE-INDEX)): Warning: variable BT is used yet it was declared ignored ; While compiling (METHOD CURSOR-GET-BOTH (SECONDARY-CURSOR T T)): Warning: variable CURSOR is used yet it was declared ignored ; While compiling (METHOD CURSOR-GET-BOTH-RANGE (SECONDARY-CURSOR T T)): Warning: variable CURSOR is used yet it was declared ignored ; While compiling (METHOD CURSOR-PUT (SECONDARY-CURSOR T)): Warning: variable CURSOR is used yet it was declared ignored Warning: While compiling these undefined functions were referenced: #:G2387 from position 44023 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp #:G2357 from position 43753 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp #:G2296 from position 43146 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp #:G2118 from position 40902 in 
/usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1949 from position 27052 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1921 from position 20890 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1646 from position 10155 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1631 from position 8250 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1619 from position 7935 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1553 from position 7283 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1266 from position 5184 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G1236 from position 4939 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp
     #:G906 from position 2337 in /usr/local/lisp/source/elephant-0.2.1/src/berkeley-db.lisp

3. Running all tests with (EL-TEST::DO-ALL-TESTS) gave this:

     3 out of 85 total tests failed: ELEPHANT-TESTS::ARRAYS-1,
        ELEPHANT-TESTS::NO-EVAL-INITFORM, ELEPHANT-TESTS::UPDATE-CLASS.

   From the docs I expected only one test to fail. What about the
   other two?

II. Then I tried CMUCL 19a (from Debian) and had problems similar to
    1. and 3. above, namely:

1. The Debian libpthread.so doesn't work with CMUCL either:

     Error in function SYSTEM::LOAD-OBJECT-FILE:
        Can't open object "/usr/lib/libpthread.so": NIL
        [Condition of type SIMPLE-ERROR]

2. More than one test fails:

     2 out of 85 total tests failed: ELEPHANT-TESTS::NO-EVAL-INITFORM,
        ELEPHANT-TESTS::UPDATE-CLASS.

Finally, I installed Berkeley DB 4.3 and gave the Elephant CVS version
a try. I haven't captured all the details but it wasn't flawless
either. I seem to remember the same problems with libpthread.so and
there were also test failures. In addition, CMUCL complained about
locked packages.

Cheers,
Edi.
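
The "used yet it was declared ignored" warnings in item I.2 usually
come from a declaration that no longer matches the body. A minimal
sketch of the pattern, using a hypothetical function that is not taken
from the Elephant sources:

```lisp
;; Hypothetical function, not from the Elephant sources.
;; Writing (declare (ignore key)) here would provoke the warning,
;; because KEY is referenced in the body.  IGNORABLE only says the
;; variable may or may not be used, so a compiler should stay quiet
;; in both cases.
(defun describe-pair (key value)
  (declare (ignorable key))
  (list key value))
```

Whether dropping the declaration or switching to IGNORABLE is the right
fix for the methods listed above depends on the actual code, which is
not shown in this thread.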
From ben at medianstrip.net Wed Mar 2 03:41:06 2005
From: ben at medianstrip.net (Ben)
Date: Tue, 1 Mar 2005 22:41:06 -0500 (EST)
Subject: [elephant-devel] Some problems installing Elephant
In-Reply-To:
References:
Message-ID: <20050301223152.L93163@contarex.medianstrip.net>

On Tue, 1 Mar 2005, Edi Weitz wrote:

> I. Allegro 7.0

Allegro 7.0 isn't currently supported yet by us. If there is enough
demand we'll port the Allegro 6.2 stuff to it. Is there a pressing
need for it? We usually use SBCL.

> 1. I installed the 'libpth14' package from Debian for the threading
>    library but got these results when compiling/loading Elephant:
>
>      Error: Loading /usr/lib/libpthread.so failed with error:
>      /usr/lib/libpthread.so: invalid ELF header.
>
>    There's another package available called 'libpth2' but I get the
>    same error with that one. I then proceeded without libpthread.so,
>    i.e. I just removed the form which tried to load this library and
>    all else seems to work (more or less). Is this library really
>    needed? And what for?

In all honesty I don't know. I work with FreeBSD. On Fedora systems it
was necessary to include this library, so I assumed it was a linux
thing.

> 2. I saw a couple of warnings during compilation:

I guess the Allegro compiler is more picky than the SBCL one.

> 3. Running all tests with (EL-TEST::DO-ALL-TESTS) gave this:
>
>      3 out of 85 total tests failed: ELEPHANT-TESTS::ARRAYS-1,
>         ELEPHANT-TESTS::NO-EVAL-INITFORM, ELEPHANT-TESTS::UPDATE-CLASS.
>
>    From the docs I expected only one test to fail. What about the
>    other two?

This one is mind-boggling. The two tests no-eval-initform and
update-class should fail, they were fixed in the CVS release. However
I have never seen arrays-1 fail, unless just about everything else
does. The only thing I can imagine is this having something to do with
upgraded-array-element-types. Unfortunately I haven't tested on
Allegro 7.0.

> II. Then I tried CMUCL 19a (from Debian) and had problems similar to
>     1. and 3.
> above, namely:
>
> 1. The Debian libpthread.so doesn't work with CMUCL either:
>
>      Error in function SYSTEM::LOAD-OBJECT-FILE:
>         Can't open object "/usr/lib/libpthread.so": NIL
>         [Condition of type SIMPLE-ERROR]
>
> 2. More than one test fails:
>
>      2 out of 85 total tests failed: ELEPHANT-TESTS::NO-EVAL-INITFORM,
>         ELEPHANT-TESTS::UPDATE-CLASS.
>
> Finally, I installed Berkeley DB 4.3 and gave the Elephant CVS version
> a try. I haven't captured all the details but it wasn't flawless
> either. I seem to remember the same problems with libpthread.so and
> there were also test failures. In addition, CMUCL complained about
> locked packages.

This is also a little perplexing. Can you provide more details on the
test failures? We're running 19a over here and it seems to pass the
tests, though maybe there are patches you're running we're not?

Thanks for the detailed bug report.

Take care, B

From edi at agharta.de Wed Mar 2 07:29:52 2005
From: edi at agharta.de (Edi Weitz)
Date: Wed, 02 Mar 2005 08:29:52 +0100
Subject: [elephant-devel] Some problems installing Elephant
In-Reply-To: <20050301223152.L93163@contarex.medianstrip.net> (ben@medianstrip.net's
 message of "Tue, 1 Mar 2005 22:41:06 -0500 (EST)")
References: <20050301223152.L93163@contarex.medianstrip.net>
Message-ID:

On Tue, 1 Mar 2005 22:41:06 -0500 (EST), Ben wrote:

> Allegro 7.0 isn't currently supported yet by us. If there is enough
> demand we'll port the Allegro 6.2 stuff to it. Is there a pressing
> need for it?

No. I guess for Allegro 7.0 the medium-term solution will be to use
AllegroCache (which is also based on SleepyCat):

> In all honesty I don't know. I work with FreeBSD. On Fedora
> systems it was necessary to include this library, so I assumed it
> was a linux thing.

OK, I seemingly got no error messages due to this library missing so
maybe it's not needed on Linux.

> I guess the Allegro compiler is more picky than the SBCL one.

I always thought SBCL was especially picky.
However, a warning about a variable which is used but declared IGNOREd
is actually a good thing... :)

> This one is mind-boggling. The two tests no-eval-initform and
> update-class should fail, they were fixed in the CVS release.
> However I have never seen arrays-1 fail, unless just about
> everything else does. The only thing I can imagine is this having
> something to do with upgraded-array-element-types. Unfortunately I
> haven't tested on Allegro 7.0.

If you need more info let me know - I can run more tests if you want.

> This is also a little perplexing. Can you provide more details on
> the test failures? We're running 19a over here and it seems to pass
> the tests, though maybe there are patches you're running we're not?

I'll send a detailed test run later.

Thanks,
Edi.

From aml at gia.ist.utl.pt Wed Mar 2 09:47:23 2005
From: aml at gia.ist.utl.pt (Antonio Menezes Leitao)
Date: Wed, 02 Mar 2005 09:47:23 +0000
Subject: [elephant-devel] Some problems installing Elephant
In-Reply-To: <20050301223152.L93163@contarex.medianstrip.net> (ben@medianstrip.net's
 message of "Tue, 1 Mar 2005 22:41:06 -0500 (EST)")
References: <20050301223152.L93163@contarex.medianstrip.net>
Message-ID: <87k6oqidlg.fsf@gia.ist.utl.pt>

Ben writes:

> On Tue, 1 Mar 2005, Edi Weitz wrote:
>
>> I. Allegro 7.0
>
> Allegro 7.0 isn't currently supported yet by us. If there is enough
> demand we'll port the Allegro 6.2 stuff to it. Is there a pressing
> need for it? We usually use SBCL.

I adapted elephant to work with Allegro 7.0. It was easy to do but I
didn't test it carefully as I'm not using anything fancy from elephant
(just persistent classes, btrees and transactions). The regression
tests show:

  3 out of 85 total tests failed: ARRAYS-1, NO-EVAL-INITFORM, UPDATE-CLASS.

>> 1.
>>    I installed the 'libpth14' package from Debian for the threading
>>    library but got these results when compiling/loading Elephant:
>>
>>      Error: Loading /usr/lib/libpthread.so failed with error:
>>      /usr/lib/libpthread.so: invalid ELF header.
>>
>>    There's another package available called 'libpth2' but I get the
>>    same error with that one. I then proceeded without libpthread.so,
>>    i.e. I just removed the form which tried to load this library and
>>    all else seems to work (more or less). Is this library really
>>    needed? And what for?

I had to process the libpthread.so for Allegro to be able to load it
(they must be shared libraries):

  $ ld -shared -o /libpthread.so /lib/tls/libpthread.so.0

Then, in the file sleepycat.lisp, include (change) the following:

  #+linux
  (unless
      #+allegro
      (uffi:load-foreign-library "/libpthread.so" :module "pthread")
      #-allegro
      (uffi:load-foreign-library "/lib/tls/libpthread.so.0" :module "pthread")
    (error "Couldn't load libpthread!"))

It works for me but, anyway, maybe the library is not really needed.

Best regards,

António Leitão.

From walter at pelissero.de Wed Mar 2 18:26:54 2005
From: walter at pelissero.de (Walter C. Pelissero)
Date: Wed, 2 Mar 2005 19:26:54 +0100
Subject: [elephant-devel] #+asdf
Message-ID: <16934.1390.631193.166530@zaphod.home.loc>

Just to point out that at around line 161 of sleepycat.lisp there is a
form that doesn't look good to me:

  (if (find-package 'asdf)
      (merge-pathnames
       #p"libsleepycat.so"
       (asdf:component-pathname (asdf:find-system 'elephant)))
      "/usr/local/share/common-lisp/elephant-0.2/libsleepycat.so")

In fact, if the asdf package is not present, the lisp reader complains
loudly (a fatal error at least on CMUCL, but most likely on other
implementations as well) once it gets to the asdf:component-pathname.
In fact the reader has to fully read the if-form whether the asdf
package will be present at run time or not.
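
A minimal sketch of one way to keep the reader from ever seeing an
asdf: prefix: look the ASDF symbols up by name at run time. The
function name SLEEPYCAT-LIBRARY-PATH is hypothetical, and unlike the
form quoted above this version also falls back to the fixed path when
ASDF is loaded but cannot find the elephant system (an assumption about
the desired behaviour, not something taken from sleepycat.lisp).

```lisp
;; Hypothetical helper; not the code in sleepycat.lisp.
;; The ASDF functions are found with FIND-SYMBOL, so this file can be
;; read and compiled in an image where the ASDF package does not exist.
(defun sleepycat-library-path ()
  (let* ((asdf (find-package "ASDF"))
         (find-system (and asdf (find-symbol "FIND-SYSTEM" asdf)))
         (component-pathname (and asdf (find-symbol "COMPONENT-PATHNAME" asdf)))
         ;; Second argument NIL asks FIND-SYSTEM to return NIL instead
         ;; of signalling an error when the system is unknown.
         (system (and find-system component-pathname
                      (funcall find-system "elephant" nil))))
    (if system
        (merge-pathnames #p"libsleepycat.so"
                         (funcall component-pathname system))
        "/usr/local/share/common-lisp/elephant-0.2/libsleepycat.so")))
```

The cost is a lookup on every call; the value could be cached if that
matters.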
I understand that asdf doesn't show its presence in *features* (a pity), so a way around this could be something like (eval-when (:load-toplevel :compile-toplevel) (pushnew :asdf *features*)) in the .asd file. And then use the usual #+asdf throughout the code. (The eval-when may not be necessary, though.) -- walter pelissero http://www.pelissero.de From edi at agharta.de Wed Mar 2 19:57:03 2005 From: edi at agharta.de (Edi Weitz) Date: Wed, 02 Mar 2005 20:57:03 +0100 Subject: [elephant-devel] #+asdf In-Reply-To: <16934.1390.631193.166530@zaphod.home.loc> (Walter C. Pelissero's message of "Wed, 2 Mar 2005 19:26:54 +0100") References: <16934.1390.631193.166530@zaphod.home.loc> Message-ID: On Wed, 2 Mar 2005 19:26:54 +0100, "Walter C. Pelissero" wrote: > I understand that asdf doesn't show its presence in *features* (a > pity) Really? On my machines it does. Maybe you're using a very old version? From walter at pelissero.de Wed Mar 2 20:49:51 2005 From: walter at pelissero.de (Walter C. Pelissero) Date: Wed, 2 Mar 2005 21:49:51 +0100 Subject: [elephant-devel] #+asdf In-Reply-To: References: <16934.1390.631193.166530@zaphod.home.loc> Message-ID: <16934.9967.969898.357947@zaphod.home.loc> Edi Weitz writes: > On Wed, 2 Mar 2005 19:26:54 +0100, "Walter C. Pelissero" wrote: > > > I understand that asdf doesn't show its presence in *features* (a > > pity) > > Really? On my machines it does. It does on mine as well. > Maybe you're using a very old version? Or maybe I've been using a very old version of my memories. Side note. I personally dislike to write code that, at runtime, in a way or another depends on ASDF (or any other system description utility). Hence I'd probably put a #. 
in front of (asdf:component-pathname (asdf:find-system 'elephant)) -- walter pelissero http://www.pelissero.de From ben at medianstrip.net Thu Mar 3 20:10:49 2005 From: ben at medianstrip.net (Ben) Date: Thu, 3 Mar 2005 15:10:49 -0500 (EST) Subject: [elephant-devel] db gc In-Reply-To: <200502191841.44119.mega@hotpop.com> References: <04Nov18.154346cet.336116@fwall.essnet.se> <20041118135658.E91422@contarex.medianstrip.net> <200502191841.44119.mega@hotpop.com> Message-ID: <20050303144546.V83213@contarex.medianstrip.net> Sorry for not responding to this earlier, have been occupied. Your code looks good. Thanks for working on this. Some points: 1) does it work with Andrew's new MOP stuff? 2) the implementation you are working on is an offline implementation. the technical issues i had mostly had to do with online implementations. i think an online implementation is probably too hard to be worth it at this point. 3) since it is offline, you can probably open sleepycat up with some flags which will make this go fast. 4) "Can references to a persistent object have different class names (maybe due to a change-class)?" it does appear this is the case. i think Andrew is the expert here -- Andrew? the original collector i had in mind didn't know about the classes of persistent objects. it just kept track of OIDs blindly. the implementation was to be a little dirtier but easier. it was guaranteed then to not collect "unreferenced slots" which come from change-class. of course it couldn't collect discarded slots either. it appears that maybe i'm storing objects incorrectly. perhaps the right way to do this is to store objects as OIDs without classes, and then have a separate OID -> class table. that way change-class can work correctly. it depends on if you think change-class should update the DB or not, though. (mental note to self: if one implements this, one should make sure the instance cache code does the right thing e.g. 
check the class before handing back a cached instance!)

in some ways the change-class / update-class-for-x semantics are still
a little fuzzy. maybe Andrew can illuminate us here!

take care, B

On Sat, 19 Feb 2005, Gábor Melis wrote:

> On Thursday 18 November 2004 20:02, Ben wrote:
>> Writing the GC is long overdue. There are some technical issues with
>> this which have yet to be solved, actually. In the first pass, it
>> will probably require taking the store off-line and running a separate
>> gc program on it -- that shouldn't be too hard. it may be possible to
>> write an online collector but as of yet i don't know how to do it. i
>> expect the gc will come with the next release (in a month or so --
>> after i'm done teaching this quarter!)
>
> I started hacking on the gc. It works by replacing deserialize with a similar
> function that records the oids instead of calling get-cached-instance and
> reading all slots, key-value pairs in persistent classes and btrees.
>
> (defmethod walk-persistent ((btree btree))
>   (map-btree (lambda (key value) (declare (ignore key value)))
>              btree
>              :degree-2 t))
>
> (defmethod walk-persistent ((obj persistent))
>   (let ((class (class-of obj))
>         (persistent-effective-slot-definition-class
>          (find-class 'persistent-effective-slot-definition)))
>     (loop for slot-definition in (class-slots class)
>           when (eq (class-of slot-definition)
>                    persistent-effective-slot-definition-class)
>           do (slot-value-using-class class obj slot-definition))))
>
> (defun elephant-gc (&optional (sc *store-controller*))
>   (let ((old-oids (make-hash-table))
>         (new-oids (make-hash-table)))
>     (flet ((marker (controller oid class)
>              (declare (ignore controller))
>              (unless (gethash oid old-oids)
>                (setf (gethash oid new-oids) class))))
>       (with-marking-deserialize (#'marker)
>         ;; mark the root
>         (setf (gethash -1 old-oids)
>               (class-name (class-of (controller-root sc))))
>         (walk-persistent (controller-root sc))
>         ;;
>         (loop while (< 0 (hash-table-count
new-oids)) > do (maphash (lambda (oid class) > (walk-persistent (make-instance class :from-oid oid)) > (setf (gethash oid old-oids) class)) > new-oids) > (clrhash new-oids)))) > ;; now OLD-OIDS contains the oids of all reachable objects > (maphash (lambda (oid class) > (format t "~S ~S~%" oid class)) > old-oids) > )) > > It seems to detect live objects OK. The next step is to iterate through > controller-db and controller-btrees and delete records that have keys > starting with a non-alive oid, right? Controller-indices and > controller-indices-assoc can be left alone, I hope. > > What are those technical issues you mentioned above? > > Can references to a persistent object have different class names (maybe due to > a change-class)? > > G > From mega at hotpop.com Fri Mar 4 13:35:27 2005 From: mega at hotpop.com (Gabor Melis) Date: Fri, 4 Mar 2005 14:35:27 +0100 Subject: [elephant-devel] db gc In-Reply-To: <20050303144546.V83213@contarex.medianstrip.net> References: <20050303144546.V83213@contarex.medianstrip.net> Message-ID: <05Mar4.143915cet.336213@fwall.essnet.se> On Thursday 03 March 2005 21:10, Ben wrote: > Sorry for not responding to this earlier, have been occupied. > > Your code looks good. Thanks for working on this. Some points: > > 1) does it work with Andrew's new MOP stuff? If it is in CVS, yes. I test against CVS HEAD. > > 2) the implementation you are working on is an offline implementation. > the technical issues i had mostly had to do with online > implementations. i think an online implementation is probably too > hard to be worth it at this point. Well, meanwhile the implementation reached the proof of concept level. It detects garbage and collects it, but it is slow. 
I think there are two causes for this: - db-cursor-delete is slow: only 1000-2000 deletes per second :-) - fully deserializing everything and recording oids is easy but wasteful Even if these problems were solved satisfactorily, I would not like to take my website off-line daily, so I intend to look into the incremental version, as well. I'm interested in those technical issues. The issue I see here is how to efficiently record the oid refs overwritten/created. > > 3) since it is offline, you can probably open sleepycat up with some > flags which will make this go fast. > > 4) "Can references to a persistent object have different class names > (maybe due to a change-class)?" > > it does appear this is the case. i think Andrew is the expert here -- > Andrew? > > the original collector i had in mind didn't know about the classes of > persistent objects. it just kept track of OIDs blindly. the > implementation was to be a little dirtier but easier. it was > guaranteed then to not collect "unreferenced slots" which come from > change-class. of course it couldn't collect discarded slots either. > > it appears that maybe i'm storing objects incorrectly. perhaps the > right way to do this is to store objects as OIDs without classes, and > then have a separate OID -> class table. that way change-class can > work correctly. it depends on if you think change-class should update > the DB or not, though. (mental note to self: if one implements this, > one should make sure the instance cache code does the right thing > e.g. check the class before handing back a cached instance!) > > in some ways the change-class / update-class-for-x semantics are still > a little fuzzy. maybe Andrew can illuminate us here! 
>
> take care, B

From mega at hotpop.com Mon Mar 7 08:34:30 2005
From: mega at hotpop.com (Gábor Melis)
Date: Mon, 7 Mar 2005 09:34:30 +0100
Subject: [elephant-devel] full text indexing
Message-ID: <05Mar7.134108cet.334405@fwall.essnet.se>

There are 100k users in the db. Each user has a description string. I
need to search users for words in their descriptions by prefix.
Currently I lean towards simply maintaining a reverse word->oid index
in a btree. Searching this is easy, but whenever a description changes
the old mapping entries for that oid need to be deleted and the new
ones added. Has any of you implemented such a scheme or have other
ideas?

Cheers, Gábor

From mega at hotpop.com Tue Mar 8 15:10:46 2005
From: mega at hotpop.com (Gábor Melis)
Date: Tue, 8 Mar 2005 16:10:46 +0100
Subject: [elephant-devel] db gc
In-Reply-To: <05Mar4.143915cet.336213@fwall.essnet.se>
References: <05Mar4.143915cet.336213@fwall.essnet.se>
Message-ID: <05Mar8.161043cet.334142@fwall.essnet.se>

Attached is a patch against cvs HEAD with a small offline gc
implementation.

How to test it: Save your data! Invoke elephant::elephant-gc and pray.
To see what it does set elephant::*debug-gc* to t (not recommended for
anything bigger than a test db).

AFAIK the size of the db files does not necessarily shrink (not even
after a db_checkpoint), because berkeley db keeps the unused regions
around to be reused later.

It does not deserialize everything in the db, but actively looks for
oids, so it is not exceptionally slow, only very slow. Mostly due to
I/O I guess: in the mark phase random seeking really hurts. Reading
the db more linearly would probably help.

Cheers, Gábor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: elephant-gc.patch
Type: text/x-diff
Size: 14788 bytes
Desc: not available
URL:

From eemg at esw.inesc-id.pt Tue Mar 8 18:01:00 2005
From: eemg at esw.inesc-id.pt (Edgar Gonçalves)
Date: Tue, 08 Mar 2005 18:01:00 +0000
Subject: [elephant-devel] Elephant (from CVS) in Win32 with Allegro 6.2
Message-ID:

Hi!

After succeeding with building the sleepycat DLL with MSVC.NET, I can't
execute the tests with the command (elephant-tests::do-all-tests). I'm
using the latest version of sleepycat, also (db-4.3.27.NC).

The error message spooked me, and I don't know where/how I can look
for more error information or debug it. Here's the output:

Received signal number 11 (Segmentation violation)
   [Condition of type SYNCHRONOUS-OPERATING-SYSTEM-SIGNAL]

Restarts:
  0: [ABORT] Abort handling SLIME request.
  1: [ABORT] Abort entirely from this process.

Backtrace:
  0: (SWANK::DEBUG-IN-EMACS #)
  1: ((FLET SWANK:SWANK-DEBUGGER-HOOK SWANK::DEBUG-IT))
  2: (SWANK:SWANK-DEBUGGER-HOOK # #)
  3: (ERROR SYNCHRONOUS-OPERATING-SYSTEM-SIGNAL :NAME #1="Segmentation violation" :NUMBER 11 :FORMAT-CONTROL "Received signal number ~s ~@[(~a)~]" :FORMAT-ARGUMENTS (11 #1#))
  4: (#:G956 0 1409335344)
  5: (SLEEPYCAT::%DB-ENV-CREATE 0)
  6: (SLEEPYCAT:DB-ENV-CREATE)
  7: ((METHOD ELEPHANT:OPEN-CONTROLLER (ELEPHANT:STORE-CONTROLLER)) #)
  8: ((:INTERNAL (:EFFECTIVE-METHOD 1 T T NIL NIL) 0) #)
  9: (ELEPHANT-TESTS::DO-ALL-TESTS)
 10: (EVAL (ELEPHANT-TESTS::DO-ALL-TESTS))

Thanks for any help,

--
Edgar Gonçalves
INESC-ID, Software Engineering Group (Technical University of Lisbon)
Portugal

From ben at medianstrip.net Thu Mar 10 06:15:24 2005
From: ben at medianstrip.net (Ben)
Date: Thu, 10 Mar 2005 01:15:24 -0500 (EST)
Subject: [elephant-devel] full text indexing
In-Reply-To: <05Mar7.134108cet.334405@fwall.essnet.se>
References: <05Mar7.134108cet.334405@fwall.essnet.se>
Message-ID:
 <20050310011406.P84380@contarex.medianstrip.net>

i don't know how to do this sort of thing. AFAICT the "update problem"
is an issue for most indexing systems. have you looked at lucene et
al?

B

On Mon, 7 Mar 2005, Gábor Melis wrote:

> There are 100k users in the db. Each user has a description string. I need to
> search users for words in their descriptions by prefix. Currently I lean
> towards simply maintaining a reverse word->oid index in a btree. Searching
> this is easy, but whenever a description changes the old mapping entries for
> that oid need to be deleted and the new ones added. Has any of you
> implemented such a scheme or have other ideas?
>
> Cheers, Gábor
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel
>

From mega at hotpop.com Thu Mar 10 07:57:49 2005
From: mega at hotpop.com (Gábor Melis)
Date: Thu, 10 Mar 2005 08:57:49 +0100
Subject: [elephant-devel] full text indexing
In-Reply-To: <20050310011406.P84380@contarex.medianstrip.net>
References: <20050310011406.P84380@contarex.medianstrip.net>
Message-ID: <05Mar10.085750cet.334264@fwall.essnet.se>

On Thursday 10 March 2005 07:15, Ben wrote:
> i don't know how to do this sort of thing. AFAICT the "update
> problem" is an issue for most indexing systems. have you looked at
> lucene et al?

Lucene needs delete and re-add for update. Hopefully, I will do a bit
better than that.

I've begun implementing a simple, index-based solution. Elephant was
modified to allow multiple secondary-keys to be returned by key-fn
(any objections to this? It was great to have the index maintenance on
the lisp side). In the full text indexing case the secondary keys are
of course words. On update all secondary-key primary-key pairs that
belong to the user being updated are removed and then re-added (for
now :-)).
Seems to work, but this is all rather preliminary. One question is the
performance of db_del_kv in the presence of a lot of duplicates (i.e.
a lot of users with the same word). I asked in the berkeley-db
newsgroup if it's a problem and whether DB_DUP_SORT helps with the
performance. No answers so far.

Gábor

From mega at hotpop.com Sun Mar 13 20:01:26 2005
From: mega at hotpop.com (Gábor Melis)
Date: Sun, 13 Mar 2005 21:01:26 +0100
Subject: [elephant-devel] db gc
In-Reply-To: <20050303144546.V83213@contarex.medianstrip.net>
References: <04Nov18.154346cet.336116@fwall.essnet.se>
 <200502191841.44119.mega@hotpop.com>
 <20050303144546.V83213@contarex.medianstrip.net>
Message-ID: <200503132101.26701.mega@hotpop.com>

On Thursday 03 March 2005 21:10, Ben wrote:
> it appears that maybe i'm storing objects incorrectly. perhaps the
> right way to do this is to store objects as OIDs without classes, and
> then have a separate OID -> class table. that way change-class can
> work correctly. it depends on if you think change-class should update
> the DB or not, though. (mental note to self: if one implements this,
> one should make sure the instance cache code does the right thing
> e.g. check the class before handing back a cached instance!)

This patch does what you describe except for the mental note, which I
do not understand. I thought my full text indices were big because the
class name is stored in each reference to the persistent object. Turns
out I was mistaken, but here it is anyway.

The gc had to be modified a bit, too.

Gábor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: persistent-serialization.patch
Type: text/x-diff
Size: 4445 bytes
Desc: not available
URL:
-------------- next part --------------
;;; -*- Mode: Lisp; Syntax: ANSI-Common-Lisp; Base: 10 -*-

;;; TODO:
;;;
;;; * read db linearly and record the object graph in memory, or just
;;;   cache it when it's too big to hold
;;;
;;; * different classes for the same oid?
;;;
;;; * incremental gc

(in-package "ELEPHANT")

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defparameter *debug-gc* nil))

(defmacro when-debug-gc (&body body)
  (when *debug-gc*
    `(progn ,@body)))

(defmacro debug-gc (&rest args)
  (when *debug-gc*
    `(format t ,@args)))

(defun mark-oids (buf-str marker)
  "Read oids from BUF-STR as if it was deserialized, but skip irrelevant
data and be fast. Call MARKER with controller oid and class."
  (declare (optimize (speed 3) (safety 0))
           (type (or null buffer-stream) buf-str))
  (labels ((%mark-oids (bs)
             (declare (optimize (speed 3) (safety 0))
                      (type buffer-stream bs))
             (let ((tag (buffer-read-byte bs)))
               (declare (type foreign-char tag))
               (debug-gc "buffer=~S~%tag=~S~%" bs tag)
               (cond
                 ((= tag +fixnum+)
                  (buffer-skip bs 4))
                 ((= tag +nil+) nil)
                 ((or (= tag +ucs1-symbol+)
                      (= tag +ucs2-symbol+)
                      (= tag +ucs4-symbol+))
                  (buffer-skip bs (buffer-read-fixnum bs))
                  (%mark-oids bs))
                 ((or (= tag +ucs1-string+)
                      (= tag +ucs2-string+)
                      (= tag +ucs4-string+))
                  (buffer-skip bs (buffer-read-fixnum bs)))
                 ((= tag +persistent+)
                  (funcall marker *store-controller*
                           (buffer-read-fixnum bs)
                           (really-deserialize bs :recursivep t)))
                 ((= tag +single-float+)
                  (buffer-skip bs 4))
                 ((= tag +double-float+)
                  (buffer-skip bs 8))
                 ((= tag +char+)
                  (buffer-skip bs 4))
                 ((or (= tag +ucs1-pathname+)
                      (= tag +ucs2-pathname+)
                      (= tag +ucs4-pathname+))
                  (buffer-skip bs (buffer-read-fixnum bs)))
                 ((or (= tag +positive-bignum+)
                      (= tag +negative-bignum+))
                  (buffer-skip bs (buffer-read-fixnum bs)))
                 ((= tag +rational+)
                  (%mark-oids bs)
                  (%mark-oids bs))
                 ((= tag +cons+)
                  (let* ((id (buffer-read-fixnum bs))
                         (maybe-cons (gethash id
                                              *circularity-hash*)))
                    (unless maybe-cons
                      (setf (gethash id *circularity-hash*) t)
                      (%mark-oids bs)
                      (%mark-oids bs))))
                 ((= tag +hash-table+)
                  (let* ((id (buffer-read-fixnum bs))
                         (maybe-hash (gethash id *circularity-hash*)))
                    (unless maybe-hash
                      ;; test, rehash-size, rehash-threshold
                      (%mark-oids bs)
                      (%mark-oids bs)
                      (%mark-oids bs)
                      (setf (gethash id *circularity-hash*) t)
                      (loop for i fixnum from 0
                            below (really-deserialize bs :recursivep t)
                            do
                            ;; key, value
                            (%mark-oids bs)
                            (%mark-oids bs)))))
                 ((= tag +object+)
                  (let* ((id (buffer-read-fixnum bs))
                         (maybe-o (gethash id *circularity-hash*)))
                    (unless maybe-o
                      ;; class
                      (%mark-oids bs)
                      (setf (gethash id *circularity-hash*) t)
                      (loop for i fixnum from 0
                            below (really-deserialize bs :recursivep t)
                            do
                            ;; slot, value
                            (%mark-oids bs)
                            (%mark-oids bs)))))
                 ((= tag +array+)
                  (let* ((id (buffer-read-fixnum bs))
                         (maybe-array (gethash id *circularity-hash*)))
                    (unless maybe-array
                      (let ((flags (buffer-read-byte bs))
                            (total-size 1))
                        (loop for i fixnum from 0 below (buffer-read-int bs)
                              do (setf total-size
                                       (* total-size (buffer-read-int bs))))
                        ;; has fill pointer?
                        (when (/= 0 (logand +fill-pointer-p+ flags))
                          (buffer-read-int bs))
                        (setf (gethash id *circularity-hash*) t)
                        (loop for i fixnum from 0 below total-size
                              do (%mark-oids bs))))))
                 (t (error "mark-oids fubar!"))))))
    (etypecase buf-str
      (null (return-from mark-oids nil))
      (buffer-stream
       (setq *lisp-obj-id* 0)
       (clrhash *circularity-hash*)
       (%mark-oids buf-str)))))

(defmacro with-marking-deserialize ((marker) &body body)
  "Execute BODY in an environment where calls to DESERIALIZE are
hijacked and end up as MARK-OIDS calls that call MARKER for each
oid/class encountered."
  (let ((buf (gensym)))
    `(let ((*deserialize-fn* #'(lambda (,buf)
                                 (mark-oids ,buf ,marker))))
       ,@body)))

(defparameter *persistent-effective-slot-definition-class*
  (find-class 'persistent-effective-slot-definition))

(defun walk (oid class)
  "Read persistent object slots and btree key/value pairs to force
deserialization."
  (setq class (find-class class))
  (flet ((walk-slot (slot-name)
           (with-buffer-streams (key-buf value-buf)
             (buffer-write-int oid key-buf)
             (serialize slot-name key-buf)
             (let ((buf (db-get-key-buffered
                         (controller-db *store-controller*)
                         key-buf value-buf)))
               (when buf
                 (deserialize buf))))))
    (when (subtypep class 'persistent)
      (debug-gc "Walking object:~S (~S) ~%" oid class)
      (loop for slot-definition in (class-slots class)
            when (eq (class-of slot-definition)
                     *persistent-effective-slot-definition-class*)
            do (debug-gc "Walking object slot:~S ~S~%" oid
                         (slot-definition-name slot-definition))
               (walk-slot (slot-definition-name slot-definition)))))
  (when (subtypep class 'btree)
    (debug-gc "Walking btree:~S~%" oid)
    ;; FIXME: make-instance should be avoided here
    (map-btree (lambda (key value)
                 (declare (ignore key value))
                 (debug-gc "Walked btree kv:~S ~S~%" key value))
               (make-instance 'btree :from-oid oid))))

(defmacro with-db-cursor ((name value) &body body)
  `(let ((,name ,value))
     (unwind-protect
          (progn ,@body)
       (db-cursor-close ,name))))

(defun db-cursor-move (db-cursor &rest flags)
  "Small wrapper for DB-CURSOR-MOVE-BUFFERED."
  (with-buffer-streams (key-buf value-buf)
    (apply #'db-cursor-move-buffered db-cursor key-buf value-buf flags)))

(defun gc-btree (btree live-oids &key oid-in-value)
  "Remove all entries from BTREE belonging to oids not in LIVE-OIDS.
Read the oid from the key or the value of an entry according to
OID-IN-VALUE."
  (let ((n-visited 0)
        (n-deleted 0))
    ;; db-cursor-delete must be enclosed in a transaction, else we got
    ;; a rather generic berkeley db error
    (with-transaction (:degree-2 t :txn-nosync t :dirty-read t)
      (with-db-cursor (db-cursor (db-cursor btree))
        (loop for (key value) = (multiple-value-list
                                 (db-cursor-move db-cursor :next t))
              while key
              do (let ((oid (buffer-read-int (if oid-in-value value key))))
                   (when-debug-gc
                     (let ((k (deserialize key)))
                       (if (gethash oid live-oids)
                           (debug-gc "Keeping:~S ~S~%" oid k)
                           (debug-gc "GCing:~S ~S~%" oid k)))
                     (force-output))
                   (incf n-visited)
                   (unless (gethash oid live-oids)
                     (incf n-deleted)
                     (db-cursor-delete db-cursor))))))
    (debug-gc "~A/~A~%" n-visited n-deleted)))

(defun elephant-gc ()
  "Remove unreferenced
\(garbage) objects from the db. This needs to be run offline,
i.e. with no other db operations running including open transactions."
  (unwind-protect
       (let ((sc *store-controller*)
             (processed-oids (make-hash-table))
             (current-oids (make-hash-table))
             (new-oids (make-hash-table)))
         (flet ((marker (controller oid class)
                  (declare (ignore controller))
                  (unless (or (gethash oid processed-oids)
                              (gethash oid current-oids))
                    (debug-gc "marking oid:~S~%" oid)
                    (setf (gethash oid new-oids) class))))
           ;; Let's mark live objects.
           (with-marking-deserialize (#'marker)
             ;; mark the root
             (setf (gethash -1 current-oids)
                   (class-name (class-of (controller-root sc))))
             (loop while (< 0 (hash-table-count current-oids))
                   do
                   ;; walk objects for current oids
                   (maphash (lambda (oid class)
                              (debug-gc "processing oid:~S(~S)~%" oid class)
                              (walk oid class)
                              (setf (gethash oid processed-oids) class))
                            current-oids)
                   ;; move NEW-OIDS to CURRENT-OIDS
                   (setf current-oids new-oids
                         new-oids (make-hash-table))))
           ;; PROCESSED-OIDS now contains the oids of all reachable
           ;; objects, remove what's not in it
           (when-debug-gc
             (maphash (lambda (oid class)
                        (debug-gc "~S ~S~%" oid class))
                      processed-oids))
           (time (gc-btree (controller-db sc) processed-oids))
           (time (gc-btree (controller-btrees sc) processed-oids))
           ;; If an indexed btree becomes garbage its indices need to
           ;; be cleaned up.
           (time (gc-btree (controller-indices sc) processed-oids
                           :oid-in-value t))))
    ;; make-instance (used in WALK for btrees) calls modify the
    ;; instance cache but the cached data is wrong since mark-oids
    ;; skips a lot of things
    (clrhash (instance-cache *store-controller*))))

From mega at hotpop.com Sun Mar 13 20:08:30 2005 From: mega at hotpop.com (=?iso-8859-1?q?G=E1bor_Melis?=) Date: Sun, 13 Mar 2005 21:08:30 +0100 Subject: [elephant-devel] full text indexing In-Reply-To: <05Mar7.134108cet.334405@fwall.essnet.se> References: <05Mar7.134108cet.334405@fwall.essnet.se> Message-ID: <200503132108.30679.mega@hotpop.com> On Monday 07 March 2005 09:34, Gábor Melis wrote: > There are 100k users in the db. Each user has a description string. I need > to search users for words in their descriptions by prefix. Currently I lean > towards simply maintaining a reverse word->oid index in a btree. Searching > this is easy, but whenever a description changes the old mapping entries > for that oid need to be deleted and the new ones added. Has any of you > implemented such a scheme or have other ideas?
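The reverse word->oid scheme described in the quoted paragraph can be sketched in plain Common Lisp. This is only an illustration under stated assumptions, not the code from the attached patch: it keeps the mapping in in-memory hash tables instead of an Elephant btree, it splits descriptions on spaces only, and the names *WORD-INDEX*, *OID-WORDS*, SPLIT-WORDS, REINDEX and SEARCH-PREFIX are made up for the example.

```lisp
;; Illustrative sketch only. All names below are hypothetical and the
;; tables live in memory; a persistent version would keep the
;; word->oid mapping in a btree so that prefix search becomes a
;; cursor range scan instead of a walk over every key.

(defvar *word-index* (make-hash-table :test #'equal)
  "Maps a word to the list of oids whose description contains it.")

(defvar *oid-words* (make-hash-table)
  "Maps an oid to the words currently indexed for it.")

(defun split-words (string)
  "Split STRING on spaces and return the distinct lower-cased words."
  (let ((words '())
        (start nil))
    (loop for i from 0 to (length string)
          ;; Treat the end of the string as a final separator.
          for ch = (if (< i (length string)) (char string i) #\Space)
          do (cond ((char= ch #\Space)
                    (when start
                      (pushnew (string-downcase (subseq string start i))
                               words :test #'string=)
                      (setf start nil)))
                   ((null start) (setf start i))))
    (nreverse words)))

(defun reindex (oid description)
  "Delete the old word entries for OID, then add those of DESCRIPTION."
  ;; Remove OID from every word it was indexed under before.
  (dolist (word (gethash oid *oid-words*))
    (let ((oids (remove oid (gethash word *word-index*))))
      (if oids
          (setf (gethash word *word-index*) oids)
          (remhash word *word-index*))))
  ;; Add the entries for the new description.
  (let ((words (split-words description)))
    (setf (gethash oid *oid-words*) words)
    (dolist (word words)
      (pushnew oid (gethash word *word-index*)))))

(defun search-prefix (prefix)
  "Return the sorted oids having at least one word starting with PREFIX."
  (let ((prefix (string-downcase prefix))
        (result '()))
    (maphash (lambda (word oids)
               (when (and (<= (length prefix) (length word))
                          (string= prefix word :end2 (length prefix)))
                 (dolist (oid oids)
                   (pushnew oid result))))
             *word-index*)
    (sort result #'<)))

;; Usage: the second REINDEX of oid 1 models a changed description.
(reindex 1 "Lisp hacker from Budapest")
(reindex 2 "likes lisp and btrees")
(reindex 1 "Python hacker")
(search-prefix "li")   ; oid 1 is no longer indexed under "lisp"
```

Keeping the per-oid word list around is what makes the "delete the old entries" step cheap: the old description does not have to be re-read or re-tokenized when it changes.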
It seems I've got a mostly working albeit a bit ugly implementation of simple text indexing. The index code was modified to allow multiple keys to be returned by key-fn. See attachment. The rest of the code will follow once I figure out how to clean it up. Now, I have a freaking cursor join for duplicates implementation in lisp because I wasn't aware of db join. There is hope mine is faster, though :-). > > Cheers, Gábor > _______________________________________________ > elephant-devel site list > elephant-devel at common-lisp.net > http://common-lisp.net/mailman/listinfo/elephant-devel -------------- next part -------------- A non-text attachment was scrubbed... Name: multikey-index.patch Type: text/x-diff Size: 5638 bytes Desc: not available URL: From edi at agharta.de Mon Mar 14 17:28:14 2005 From: edi at agharta.de (Edi Weitz) Date: Mon, 14 Mar 2005 18:28:14 +0100 Subject: [elephant-devel] BerkeleyDB license (Was: Elephant license) In-Reply-To: <20050228113040.T25570@contarex.medianstrip.net> (ben@medianstrip.net's message of "Mon, 28 Feb 2005 15:30:08 -0500 (EST)") References: <20050228113040.T25570@contarex.medianstrip.net> Message-ID: On Mon, 28 Feb 2005 15:30:08 -0500 (EST), Ben wrote: > 3) Pay sleepycat. I don't know how much they cost. Hi! I had an email exchange with SleepyCat's sales department concerning the license situation and I thought I'd give a summary for this list so everyone who is considering using Elephant for commercial projects is aware of this. Please note that this reflects my understanding of the situation. IANAL - if in doubt ask SleepyCat yourself. As a general rule you can use SleepyCat for free if your application is fully open source - in the case of Lisp this would also include the Lisp implementation itself, i.e. if you deploy on, say, AllegroCL or LispWorks your application isn't open source anymore.
If your application is /not/ open source and you distribute it to customers (which includes alpha/beta releases, giving code to contractors or affiliates, or access to web applications) you have to pay SleepyCat - see below. They have a clause in their license that makes exceptions for Perl and Python but this clause doesn't apply to Lisp because a Lisp application maps Berkeley DB into its address space.[1] So, what does Berkeley DB cost? Looks like you have several options. If you want to distribute it with several different applications or with several copies of one application you might want to look at their "buyout" offer which is online: Another option would be to pay a price per machine, i.e. for each machine Berkeley DB will be used on you pay a fixed price. A sample contract for this option is available here: I was given quotes for this option as well but I'm reluctant to disclose them here as they're not publicly available. Let me just say that if you think AllegroCL is expensive then Berkeley DB is probably not for you... :) Of course, there's also the option to negotiate special prices with SleepyCat - that's up to you. I have to say that at this moment this means I won't consider the Elephant/Berkeley DB combo for my Lisp projects because SleepyCat's pricing scheme doesn't fit for me. Obviously, this is not a decision based on the technical merits of Elephant or Berkeley DB. Hope this info was helpful. Cheers, Edi. [1] As a side note the guy from SleepyCat originally said to me that the Perl/Python exception would apply to Lisp as well because it generally applies to all "fully interpretative" languages. I had to explain to him that Lisp is a compiled language... :) From ben at medianstrip.net Mon Mar 14 18:35:07 2005 From: ben at medianstrip.net (Ben) Date: Mon, 14 Mar 2005 13:35:07 -0500 (EST) Subject: [elephant-devel] Retiring Message-ID: <20050314133443.U32661@contarex.medianstrip.net> Hi all, Thanks for your continued interest in Elephant. 
Unfortunately I have some bad news. My day job has become all-consuming (and it unfortunately does not involve working with Elephant at all.) As such I feel I need to step down as maintainer. Andrew has volunteered to maintain this project, but he will also probably have limited cycles to contribute coding too. While it seems Elephant is in a semi-useful state, it does need more work. So, if you like Elephant and would like to make it better, there are plenty of things to work on! I will stay on these email lists and try to contribute what knowledge I have. Andrew will try to coordinate. I've outlined here various projects in priority order, as I see them. At some point in the future I hope to come back! High-priority projects: 1) we are storing persistent objects incorrectly. they should be stored only as OIDs, and we should have a separate OID->class table. this way change-class can be handled correctly. 2) GC. Gabor has started on this -- thank you! 3) New release. Update docs, webpage to reflect new stuff. (Sleepycat 4.3, SBCL+unicode, new MOP stuff.) 4) Metering and understanding locking issues. Large transactions seem to use a lot of locks. In general understanding how to use Sleepycat efficiently seems like a good thing. 5) Tests tests tests! 6) LispWorks, Allegro 7.0 and OpenMCL compatibility. There are some patches which have been sent in (check the logs.) Also someone sent in Solaris patches a while ago. Difficult, but important projects: 7) callbacks for btree sorting: right now the sorted btree sorting function is a bit of nasty C code. i have code somewhere which did callbacks instead, i can try to revive that. the issue is that the sorter needs to know the underlying lisp representation. this is a screw for unicode strings and bignums. the C code is a hack for 16 bit unicode strings for allegro / lispworks (using IBMs ICU stuff), and for the sbcl unicode stuff (using glibc wchar stuff.) 
also bignum sorting is by approximation via floats, which fails after some point. but in hindsight maintaining all this is probably a lose. 8) Other backends. Sleepycat is nice but has license issues. Calling to C is nice but not nice. Elephant is fairly modular: serializer + MOP stuff + backend. Therefore: write different backends. Possibly: Embedded Firebird, SQLLite (seems to not be that great.) Or, write an All-Lisp backend. Copy what ZODB does (maybe DirectoryStorage instead of FileStorage.) Ask me for details. With SQL backends, there are possibility of using SQL types to solve some of the btree sorting problems (and break others!) 9) init/reinit object protocol: Dan Knapp suggests (or I infer) an initialization / reinitialization protocol might be nice, since there is a difference between objects which are freshly created and those which are loaded from the DB. Med-priority projects: 10) equality joins have to be done on the lisp side, since many lisp btrees share the same sleepycat btree. 11) single-user-mode: SUM = cache all slot values. used to be high priority, but i no longer want it, so unless someone wants it... Low-priority projects: 12) class / slot to ID performance hack. create a table which contains slot definitions, probably cache this in memory on the class. store slots as OID + slot ID -- avoid the symbol lookup. 13) typed arrays. some lisps can pass things like arrays of 32 bit signed integers directly to C. right now we destructure and restructure. invalidated by 11). 14) bignum fixes. OpenMCL: check that ldb is non-consing (i think it is), look at %ldb-fixnum-from-bignum. profile %bignum-ref on CMUCL / SBCL. or, figure out a great way to serialize bignums! 15) peter b's patch to non-persistent objs. check the list. 16) dynamic-extent for transactions on SBCL/CMUCL. i've tried to get with-transaction to not cons..... Impossible projects: 17) Serialize lambdas, closures, functions. Seems to require implementation support. 
Also serialization of packages. Take care, B From edi at agharta.de Mon Mar 14 19:46:57 2005 From: edi at agharta.de (Edi Weitz) Date: Mon, 14 Mar 2005 20:46:57 +0100 Subject: [elephant-devel] Retiring In-Reply-To: <20050314133443.U32661@contarex.medianstrip.net> (ben@medianstrip.net's message of "Mon, 14 Mar 2005 13:35:07 -0500 (EST)") References: <20050314133443.U32661@contarex.medianstrip.net> Message-ID: On Mon, 14 Mar 2005 13:35:07 -0500 (EST), Ben wrote: > SQLLite (seems to not be that great.) Would you care to comment on that? I just looked at it the other day and at first sight it seems nice. What problems to you see? (I understand that it always locks the whole file but for a certain class of projects I don't think that's a big problem.) > Or, write an All-Lisp backend. Copy what ZODB does (maybe > DirectoryStorage instead of FileStorage.) Ask me for details. OK, I'll bite: If you could provide some details about what you think is good in ZODB and how it should be ported to Lisp I'd be interested. > Impossible projects: > > Also serialization of packages. Maybe look at how Plob! does it. Cheers, Edi. From ben at medianstrip.net Mon Mar 14 22:02:44 2005 From: ben at medianstrip.net (Ben) Date: Mon, 14 Mar 2005 17:02:44 -0500 (EST) Subject: [elephant-devel] db gc In-Reply-To: <200503132101.26701.mega@hotpop.com> References: <04Nov18.154346cet.336116@fwall.essnet.se> <200502191841.44119.mega@hotpop.com> <20050303144546.V83213@contarex.medianstrip.net> <200503132101.26701.mega@hotpop.com> Message-ID: <20050314170053.O49113@contarex.medianstrip.net> hey! thanks for the patch. ok, i've changed my mind, i'll do one last set of patches to elephant and cut a release, to include all of gabor and other people's fixes. gabor, let me know when you feel like a good time to do this is, e.g. when your code will settle. i'll try to take a serious look at it too at some point. 
B On Sun, 13 Mar 2005, Gábor Melis wrote: > On Thursday 03 March 2005 21:10, Ben wrote: >> it appears that maybe i'm storing objects incorrectly. perhaps the >> right way to do this is to store objects as OIDs without classes, and >> then have a separate OID -> class table. that way change-class can >> work correctly. it depends on if you think change-class should update >> the DB or not, though. (mental note to self: if one implements this, >> one should make sure the instance cache code does the right thing >> e.g. check the class before handing back a cached instance!) > > This patch does what you describe except for the mental note which I do not > understand. I thought my full text indices were big because the class name is > stored in each reference to the persistent object. Turns out I was mistaken, > but here it is anyway. The gc had to be modified a bit, too. > > Gábor > From ben at medianstrip.net Mon Mar 14 22:05:13 2005 From: ben at medianstrip.net (Ben) Date: Mon, 14 Mar 2005 17:05:13 -0500 (EST) Subject: [elephant-devel] Re: BerkeleyDB license (Was: Elephant license) In-Reply-To: References: <20050228113040.T25570@contarex.medianstrip.net> Message-ID: <20050314170344.D49113@contarex.medianstrip.net> Thanks Edi for doing this homework. On Mon, 14 Mar 2005, Edi Weitz wrote: > They have a clause in their license that makes exceptions for Perl and > Python but this clause doesn't apply to Lisp because a Lisp > application maps Berkeley DB into its address space.[1] this is an utterly ironic bit of bad luck for lisp: who knew there was an economic argument in the compile / interpret debate?
B From ben at medianstrip.net Mon Mar 14 22:55:01 2005 From: ben at medianstrip.net (Ben) Date: Mon, 14 Mar 2005 17:55:01 -0500 (EST) Subject: [elephant-devel] Retiring In-Reply-To: References: <20050314133443.U32661@contarex.medianstrip.net> Message-ID: <20050314170526.F49113@contarex.medianstrip.net> On Mon, 14 Mar 2005, Edi Weitz wrote: > On Mon, 14 Mar 2005 13:35:07 -0500 (EST), Ben wrote: > >> SQLLite (seems to not be that great.) > > Would you care to comment on that? I just looked at it the other day > and at first sight it seems nice. What problems to you see? (I > understand that it always locks the whole file but for a certain class > of projects I don't think that's a big problem.) i was mostly responding to the "locking everything" thing, and perhaps misconceived feelings that it wasn't ready for prime-time in the same way postgresql or sleepycat is. not that elephant is either..... >> Or, write an All-Lisp backend. Copy what ZODB does (maybe >> DirectoryStorage instead of FileStorage.) Ask me for details. > > OK, I'll bite: If you could provide some details about what you think > is good in ZODB and how it should be ported to Lisp I'd be interested. woo hoo! here's an overview of the different ZODB backends: http://cvs.zope.org/*checkout*/ZODB3/Doc/storages.html?rev=1.8.4.2&content-type=text/plain the first thing to do is to choose between filestorage and directorystorage. one big difference between elephant and FileStorage is that elephant stores slots in btree pairs while zodb FileStorage stores objects at offsets. hence the FileStorage way of doing things would require more elephant massaging. directorystorage on the other hand uses the underlying filesystem's btrees implicitly. i could be convinced of either way though. some thoughts: filestorage: simple and fast. requires an in-memory oid+slot->offset index. 
to mesh with elephant, either implement btrees and then store slots in a btree as we do now, or (more likely) modify the MOP implementation to use offsets rather than keys. directorystorage: also pretty simple. well-documented. uses the filesystem as a btree. no index needed. should be easier to implement for elephant. requires posix filesystem access on the lisp side (which requires getting your hands dirty, IIRC.) i'd have to think about this more, but it seems possible for multiple lisps to talk to the store without a server (which is definitely not possible for filestorage.) implementing btrees might be trivial (i'm not sure about this though.) http://dirstorage.sourceforge.net/technical.html has good documentation on the implementation of directorystorage. this probably goes without saying, but on the first pass make the simplest implementation possible (no caching!) for either of these systems, packing should be combined with GC. >> Impossible projects: >> >> Also serialization of packages. > > Maybe look at how Plob! does it. i will. take care, B From robertlread at austin.rr.com Wed Mar 16 00:32:30 2005 From: robertlread at austin.rr.com (Robert L. Read) Date: Tue, 15 Mar 2005 18:32:30 -0600 Subject: [elephant-devel] Retiring In-Reply-To: References: <20050314133443.U32661@contarex.medianstrip.net> Message-ID: <1110933150.5776.59.camel@localhost.localdomain> Just an FYI --- I'm using Elephant with SBCL for a project that I hope will be commercial some day. I could use something else, but I really like Elephant. I'm not expert enough right now to maintain it, but I might grow into it. So, I encourage the maintainers to keep working on it, and hope to help soon. ---- Robert L. Read, PhD read &T robertlread.net Consider visiting Progressive Engineering: http://robertlread.net/pe In Austin: 912-8593 "Think globally, Act locally." -- RBF -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From blumberg at math.uchicago.edu Tue Mar 15 23:56:40 2005 From: blumberg at math.uchicago.edu (Andrew Blumberg) Date: Tue, 15 Mar 2005 17:56:40 -0600 (CST) Subject: [elephant-devel] Retiring In-Reply-To: <1110933150.5776.59.camel@localhost.localdomain> References: <20050314133443.U32661@contarex.medianstrip.net> <1110933150.5776.59.camel@localhost.localdomain> Message-ID: it's really nice to hear that people are using elephant. although ben has entered "retirement", i intend to continue maintaining the project (and cadging advice out of ben as necessary). however, i'm sufficiently occupied with other tasks that i'm hoping to act more in the role of fixing bugs, integrating patches, and coordinating development more intensive hacking. this having been said, i'd really like to see elephant become bulletproof and so i'm excited about usage in commercial situations. as such, i'm particularly interested in doing necessary modifications to help this along insofar as i can. regards, andrew On Tue, 15 Mar 2005, Robert L. Read wrote: > Just an FYI --- I'm using Elephant with SBCL for a project that I hope > will be commercial some day. > I could use something else, but I really like Elephant. I'm not expert > enough right now to maintain it, > but I might grow into it. > > So, I encourage the maintainers to keep working on it, and hope to help > soon. > > ---- > Robert L. Read, PhD read &T > robertlread.net > Consider visiting Progressive Engineering: > http://robertlread.net/pe > In Austin: 912-8593 "Think > globally, Act locally." 
-- RBF > > > From aml at gia.ist.utl.pt Fri Mar 18 01:41:03 2005 From: aml at gia.ist.utl.pt (Antonio Menezes Leitao) Date: Fri, 18 Mar 2005 01:41:03 +0000 Subject: [elephant-devel] db gc In-Reply-To: <20050314170053.O49113@contarex.medianstrip.net> (ben@medianstrip.net's message of "Mon, 14 Mar 2005 17:02:44 -0500 (EST)") References: <04Nov18.154346cet.336116@fwall.essnet.se> <200502191841.44119.mega@hotpop.com> <20050303144546.V83213@contarex.medianstrip.net> <200503132101.26701.mega@hotpop.com> <20050314170053.O49113@contarex.medianstrip.net> Message-ID: <87psxxg28g.fsf@gia.ist.utl.pt> Hi, Ben writes: > ok, i've changed my mind, i'll do one last set of patches to elephant > and cut a release, to include all of gabor and other people's fixes. Just in case you are thinking about what to do first, I humbly suggest solving the SBCL+unicode/Elephant compatibility issues: that would make one Elephant user extremely happy :-) Best regards and thanks a lot for Elephant, António Leitão. From ben at medianstrip.net Fri Mar 18 02:13:22 2005 From: ben at medianstrip.net (Ben) Date: Thu, 17 Mar 2005 21:13:22 -0500 (EST) Subject: [elephant-devel] db gc In-Reply-To: <87psxxg28g.fsf@gia.ist.utl.pt> References: <04Nov18.154346cet.336116@fwall.essnet.se> <200502191841.44119.mega@hotpop.com> <20050303144546.V83213@contarex.medianstrip.net> <200503132101.26701.mega@hotpop.com> <20050314170053.O49113@contarex.medianstrip.net> <87psxxg28g.fsf@gia.ist.utl.pt> Message-ID: <20050317211255.H17107@contarex.medianstrip.net> those have already been solved, you have to use what is in CVS though. also upgrade to sleepycat 4.3. take care, B On Fri, 18 Mar 2005, Antonio Menezes Leitao wrote: > Hi, > > Ben writes: > >> ok, i've changed my mind, i'll do one last set of patches to elephant >> and cut a release, to include all of gabor and other people's fixes.
> > Just in case you are thinking about what to do first, I humbly suggest > solving the SBCL+unicode/Elephant compatibility issues: that would > make one Elephant user extremely happy :-) > > Best regards and thanks a lot for Elephant, > > António Leitão. > From aml at gia.ist.utl.pt Fri Mar 18 09:49:54 2005 From: aml at gia.ist.utl.pt (Antonio Menezes Leitao) Date: Fri, 18 Mar 2005 09:49:54 +0000 Subject: [elephant-devel] db gc In-Reply-To: <20050317211255.H17107@contarex.medianstrip.net> (ben@medianstrip.net's message of "Thu, 17 Mar 2005 21:13:22 -0500 (EST)") References: <04Nov18.154346cet.336116@fwall.essnet.se> <200502191841.44119.mega@hotpop.com> <20050303144546.V83213@contarex.medianstrip.net> <200503132101.26701.mega@hotpop.com> <20050314170053.O49113@contarex.medianstrip.net> <87psxxg28g.fsf@gia.ist.utl.pt> <20050317211255.H17107@contarex.medianstrip.net> Message-ID: <87sm2t2shp.fsf@gia.ist.utl.pt> Hi, Ben writes: > those have already been solved, you have to use what is in CVS though. > also upgrade to sleepycat 4.3. > I'm using the CVS version and I upgraded to 4.3 but I just can't get it working. Apparently, the problem is that SBCL doesn't read (correctly) a db that was created using Allegro 7.0. I'm not sure if the problem is in SBCL or in Allegro 7.0. Here's what I did: First, I modified elephant to work with Allegro 7.0.
Here are my changes: Index: Makefile =================================================================== RCS file: /project/elephant/cvsroot/elephant/Makefile,v retrieving revision 1.6 diff -u -r1.6 Makefile --- Makefile 24 Feb 2005 01:06:20 -0000 1.6 +++ Makefile 18 Mar 2005 09:36:19 -0000 @@ -7,13 +7,13 @@ SHELL=/bin/sh UNAME:=$(shell uname -s) -DB43DIR=/db/ben/lisp/db43 -DBLIBDIR=$(DB43DIR)/lib/ -DBINCDIR=$(DB43DIR)/include/ +#DB43DIR=/db/ben/lisp/db43 +#DBLIBDIR=$(DB43DIR)/lib/ +#DBINCDIR=$(DB43DIR)/include/ # *BSD users will probably want -#DBLIBDIR=/usr/local/lib/db43 -#DBINCDIR=/usr/local/include/db43 +DBLIBDIR=/usr/lib/ +DBINCDIR=/usr/include/ ifeq (Darwin,$(UNAME)) SHARED=-bundle Index: src/elephant.lisp =================================================================== RCS file: /project/elephant/cvsroot/elephant/src/elephant.lisp,v retrieving revision 1.14 diff -u -r1.14 elephant.lisp --- src/elephant.lisp 24 Feb 2005 01:07:52 -0000 1.14 +++ src/elephant.lisp 18 Mar 2005 09:40:01 -0000 @@ -187,7 +187,10 @@ slot-definition-initargs class-finalized-p finalize-inheritance - compute-slots) + compute-slots + class-direct-slots + slot-definition-readers + slot-definition-writers) #+allegro (:import-from :excl compute-effective-slot-definition-initargs) Index: src/sleepycat.lisp =================================================================== RCS file: /project/elephant/cvsroot/elephant/src/sleepycat.lisp,v retrieving revision 1.13 diff -u -r1.13 sleepycat.lisp --- src/sleepycat.lisp 24 Feb 2005 01:06:09 -0000 1.13 +++ src/sleepycat.lisp 18 Mar 2005 09:40:32 -0000 @@ -135,15 +135,17 @@ ;; This one worked for me. There are known issues with ;; Red Hat and Berkeley DB, search google. 
#+linux - (unless - (uffi:load-foreign-library "/lib/tls/libpthread.so.0" :module "pthread") + (unless + #+allegro (uffi:load-foreign-library "/home/aml/LispEssentials/elephant/allegro7.0/libpthread.so" :module "pthread") + #-allegro (uffi:load-foreign-library "/lib/tls/libpthread.so.0" :module "pthread") (error "Couldn't load libpthread!")) (unless (uffi:load-foreign-library ;; Sleepycat: this works on linux #+linux - "/db/ben/lisp/db43/lib/libdb.so" + ;;"/db/ben/lisp/db43/lib/libdb.so" + "/usr/lib/libdb-4.3.so" ;; this works on FreeBSD #+(and (or bsd freebsd) (not darwin)) "/usr/local/lib/db43/libdb.so" @@ -185,7 +187,7 @@ buffer-write-float buffer-write-double buffer-write-string buffer-read-byte buffer-read-fixnum buffer-read-int buffer-read-uint buffer-read-float buffer-read-double - #-(and allegreo ics) buffer-read-ucs1-string + #-(and allegro ics) buffer-read-ucs1-string #+(or lispworks (and allegro ics)) buffer-read-ucs2-string #+(and sbcl sb-unicode) buffer-read-ucs4-string)) Using these changes, I created a db and populated it with lots of objects. There's one root object that is the unique entry point for all the others. In Allegro 7.0, this root object has oid 0. Later on, I close Allegro and start SBCL. When I read the root object in SBCL, its oid is 7000 and the object is "empty" (that is, it doesn't reference any other objects). Do you have any idea about what's happening? Thanks in advance, António Leitão. From mega at hotpop.com Sat Mar 19 10:42:44 2005 From: mega at hotpop.com (=?utf-8?q?G=C3=A1bor_Melis?=) Date: Sat, 19 Mar 2005 11:42:44 +0100 Subject: [elephant-devel] db gc In-Reply-To: <20050314170053.O49113@contarex.medianstrip.net> References: <20050314170053.O49113@contarex.medianstrip.net> Message-ID: <05Mar19.114247cet.334144@fwall.essnet.se> On Monday 14 March 2005 23:02, Ben wrote: > hey! thanks for the patch.
> > ok, i've changed my mind, i'll do one last set of patches to elephant > and cut a release, to include all of gabor and other people's fixes. > > gabor, let me know when you feel like a good time to do this is, > e.g. when your code will settle. i'll try to take a serious look at > it too at some point. > > B I think it is in reasonable shape, gc is cleaned up, a bit faster and there are tests for join and gc. The attached big patch contains the latest versions of multiple things (let me know if I should try to split it): - revised serialization of persistent objects (type is not in each ref, this is incompatible on the db level) - gc - multiple-key indices - join Note: maybe key-fn should always return a list of secondary keys and not t/nil/:multiple-key + key (or keys). This would be incompatible on the source level. With multi indices + join it is easy to do 'full' text indexing, if this stuff goes in I'll write the doc. Until then look at the last test in testjoin.lisp. I'm not satisfied with join API, but it works and the direction in which it will develop is not clear yet. Hopefully one day it will support queries like: (and (< my-index 42) (or (= other-index "hello") (= other-index "world"))). ChangeLog from the patch: * src/join.lisp, tests/testjoin.lisp: implemented join for secondary cursors * src/utils.lisp: added with-gensyms * src/sleepycat.lisp: added buffer-compare * src/collections.lisp: multiple key support for indices * src/libsleepycat.c: the new lisp_raw_compare does what lisp_compare does but on arrays instead of DB*. This function is called from lisp. 
* src/gc.lisp, tests/testgc.lisp: implemented offline gc * src/serializer.lisp: renamed deserialize to really-deserialize and made deserialize call *deserialize-fn* to allow gc to hook into it * src/sleepycat.lisp: added buffer-skip * tests/elephant-tests.lisp: cleanup * src/classes.lisp, tests/mop-tests.lisp, tests/testcollections.lisp: store type of an object in controller-db keyed by oid instead of in every reference to it; change-class should work reliably now Gabor -------------- next part -------------- A non-text attachment was scrubbed... Name: big.patch Type: text/x-diff Size: 64477 bytes Desc: not available URL: From lam at tuxfamily.org Sun Mar 27 18:22:10 2005 From: lam at tuxfamily.org (Nicolas Lamirault) Date: Sun, 27 Mar 2005 20:22:10 +0200 Subject: [elephant-devel] DB problem Message-ID: <87mzspaqzx.fsf@no-log.org> hi, i have a problem with elephant 0.2.1 i use a debian system with : ii sbcl 0.8.19.39-2 * sbcl This is SBCL 0.8.19.39, an implementation of ANSI Common Lisp. More information about SBCL is available at . SBCL is free software, provided as is, with absolutely no warranty. It is mostly in the public domain; some portions are provided under BSD-style licenses. See the CREDITS and COPYING files in the distribution for more information. 
* (push "/home/nicolas/src/elephant-0.2.1/" asdf:*central-registry*) ("/home/nicolas/src/elephant-0.2.1/" #P"/root/.clc/systems/" #P"/usr/share/common-lisp/systems/" (MERGE-PATHNAMES ".sbcl/systems/" (USER-HOMEDIR-PATHNAME)) (MERGE-PATHNAMES "site-systems/" (TRUENAME (POSIX-GETENV "SBCL_HOME"))) (MERGE-PATHNAMES "systems/" (TRUENAME (POSIX-GETENV "SBCL_HOME"))) *DEFAULT-PATHNAME-DEFAULTS*) * (asdf:operate 'asdf:load-op :elephant) ; loading system definition from ; #P"/home/nicolas/src/elephant-0.2.1/elephant.asd" into # ; registering # as ELEPHANT ; loading system definition from #P"/usr/share/common-lisp/systems/uffi.asd" ; into # ; registering # as UFFI NIL * (use-package "ELE") T * (open-store "/tmp/") # * (add-to-root "my key" "my string") "my string" * (get-from-root "my key") debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606: encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9): (A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the SB-INT:STREAM-ENCODING-ERROR was stored in SB-DEBUG::*NESTED-DEBUG-CONDITION*.) You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL. restarts (invokable by number or by possibly-abbreviated name): 0: [OUTPUT-NOTHING] Skip output of this character. 1: [ABORT ] Reduce debugger level (leaving debugger, returning to toplevel). 2: [TOPLEVEL ] Restart at toplevel READ/EVAL/PRINT loop. (SB-INT:STREAM-ENCODING-ERROR 2 # debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606: encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9): (A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the SB-INT:STREAM-ENCODING-ERROR was stored in SB-DEBUG::*NESTED-DEBUG-CONDITION*.) You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL. 
restarts (invokable by number or by possibly-abbreviated name): 0: [OUTPUT-NOTHING] Skip output of this character. 1: Skip output of this character. 2: [ABORT ] Reduce debugger level (leaving debugger, returning to toplevel). 3: [TOPLEVEL ] Restart at toplevel READ/EVAL/PRINT loop. debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606: encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9): (A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the SB-INT:STREAM-ENCODING-ERROR was stored in SB-DEBUG::*NESTED-DEBUG-CONDITION*.) You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL. restarts (invokable by number or by possibly-abbreviated name): 0: [OUTPUT-NOTHING] Skip output of this character. 1: Skip output of this character. 2: Skip output of this character. 3: [ABORT ] Reduce debugger level (leaving debugger, returning to toplevel). 4: [TOPLEVEL ] Restart at toplevel READ/EVAL/PRINT loop. debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606: encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9): (A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the SB-INT:STREAM-ENCODING-ERROR was stored in SB-DEBUG::*NESTED-DEBUG-CONDITION*.) You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL. restarts (invokable by number or by possibly-abbreviated name): 0: [OUTPUT-NOTHING] Skip output of this character. 1: Skip output of this character. 2: Skip output of this character. 3: Skip output of this character. 4: [ABORT ] Reduce debugger level (leaving debugger, returning to toplevel). 5: [TOPLEVEL ] Restart at toplevel READ/EVAL/PRINT loop. 
debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606:
  encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9):
(A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print
 *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the
 SB-INT:STREAM-ENCODING-ERROR was stored in
 SB-DEBUG::*NESTED-DEBUG-CONDITION*.)
You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
  0: [OUTPUT-NOTHING] Skip output of this character.
  1:                  Skip output of this character.
  2:                  Skip output of this character.
  3:                  Skip output of this character.
  4:                  Skip output of this character.
  5: [ABORT         ] Reduce debugger level (leaving debugger, returning to toplevel).
  6: [TOPLEVEL      ] Restart at toplevel READ/EVAL/PRINT loop.

debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606:
  encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9):
(A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print
 *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the
 SB-INT:STREAM-ENCODING-ERROR was stored in
 SB-DEBUG::*NESTED-DEBUG-CONDITION*.)
You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
  0: [OUTPUT-NOTHING] Skip output of this character.
  1:                  Skip output of this character.
  2:                  Skip output of this character.
  3:                  Skip output of this character.
  4:                  Skip output of this character.
  5:                  Skip output of this character.
  6: [ABORT         ] Reduce debugger level (leaving debugger, returning to toplevel).
  7: [TOPLEVEL      ] Restart at toplevel READ/EVAL/PRINT loop.

debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606:
  encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9):
(A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print
 *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the
 SB-INT:STREAM-ENCODING-ERROR was stored in
 SB-DEBUG::*NESTED-DEBUG-CONDITION*.)
You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
  0: [OUTPUT-NOTHING] Skip output of this character.
  1:                  Skip output of this character.
  2:                  Skip output of this character.
  3:                  Skip output of this character.
  4:                  Skip output of this character.
  5:                  Skip output of this character.
  6:                  Skip output of this character.
  7: [ABORT         ] Reduce debugger level (leaving debugger, returning to toplevel).
  8: [TOPLEVEL      ] Restart at toplevel READ/EVAL/PRINT loop.

debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606:
  encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9):
(A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print
 *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the
 SB-INT:STREAM-ENCODING-ERROR was stored in
 SB-DEBUG::*NESTED-DEBUG-CONDITION*.)
You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
  0: [OUTPUT-NOTHING] Skip output of this character.
  1:                  Skip output of this character.
  2:                  Skip output of this character.
  3:                  Skip output of this character.
  4:                  Skip output of this character.
  5:                  Skip output of this character.
  6:                  Skip output of this character.
  7:                  Skip output of this character.
  8: [ABORT         ] Reduce debugger level (leaving debugger, returning to toplevel).
  9: [TOPLEVEL      ] Restart at toplevel READ/EVAL/PRINT loop.

debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606:
  encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9):
(A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print
 *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the
 SB-INT:STREAM-ENCODING-ERROR was stored in
 SB-DEBUG::*NESTED-DEBUG-CONDITION*.)
You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
  0: [OUTPUT-NOTHING] Skip output of this character.
  1:                  Skip output of this character.
  2:                  Skip output of this character.
  3:                  Skip output of this character.
  4:                  Skip output of this character.
  5:                  Skip output of this character.
  6:                  Skip output of this character.
  7:                  Skip output of this character.
  8:                  Skip output of this character.
  9: [ABORT         ] Reduce debugger level (leaving debugger, returning to toplevel).
 10: [TOPLEVEL      ] Restart at toplevel READ/EVAL/PRINT loop.

debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606:
  encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9):
(A SB-INT:STREAM-ENCODING-ERROR was caught when trying to print
 *DEBUG-CONDITION* when entering the debugger. Printing was aborted and the
 SB-INT:STREAM-ENCODING-ERROR was stored in
 SB-DEBUG::*NESTED-DEBUG-CONDITION*.)
You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
You can type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
  0: [OUTPUT-NOTHING] Skip output of this character.
  1:                  Skip output of this character.
  2:                  Skip output of this character.
  3:                  Skip output of this character.
  4:                  Skip output of this character.
  5:                  Skip output of this character.
  6:                  Skip output of this character.
  7:                  Skip output of this character.
  8:                  Skip output of this character.
  9:                  Skip output of this character.
 10: [ABORT         ] Reduce debugger level (leaving debugger, returning to toplevel).
 11: [TOPLEVEL      ] Restart at toplevel READ/EVAL/PRINT loop.

debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread 28606:
  encoding error on stream # (:EXTERNAL-FORMAT :LATIN-9):
Help! 11 nested errors. SB-KERNEL:*MAXIMUM-ERROR-DEPTH* exceeded.
Help! 11 nested errors. SB-KERNEL:*MAXIMUM-ERROR-DEPTH* exceeded.
0[12]

Any idea about this problem?

-- 
Nicolas Lamirault