From franks-muc at web.de  Thu Jul 13 22:24:11 2006
From: franks-muc at web.de (franks-muc at web.de)
Date: Fri, 14 Jul 2006 00:24:11 +0200
Subject: [elephant-devel] clisp
Message-ID: <132509901@web.de>

Hello developers!

I'm trying to have a database on win32:
Elephant with clsql-postgresql-socket runs on ACL-trial. But I would not want to purchase the ACL license only for that reason.
While a LispWorks license might be affordable and I can load elephant and clsql, clsql produces an error upon starting the tests.
Now I'm hoping for clisp!
However, I get the following error when loading elephant:

DEFCLASS ELEPHANT:BTREE-INDEX, slot option for slot ELEPHANT:KEY-FN: :TRANSIENT is not a valid slot option
   [Condition of type SYSTEM::SIMPLE-SOURCE-PROGRAM-ERROR]

Restarts:
  0: [RETRY] Retry performing # on #.
  1: [ACCEPT] Continue, treating # on # as having been successful.
  2: [ABORT-REQUEST] Abort handling SLIME request.
  3: [ABORT] ABORT

Backtrace:
  0: #>
  1: #>
  2: #>
  3: #
  4: #
  5: #
  6: #
  7: #
  8: #
  9: #
 10: #
 11: #
 12: #
 13: #
 14: #>
 15: #
 16: #
 17: #
 18: #
 19: #

I am completely lost here and may not be able to solve this.
Could someone point me to a solution or a direction so that I can continue?
Thanks in advance.

Frank Schorr

From franks-muc at web.de  Thu Jul 13 22:48:11 2006
From: franks-muc at web.de (franks-muc at web.de)
Date: Fri, 14 Jul 2006 00:48:11 +0200
Subject: [elephant-devel] lispworks
Message-ID: <132549956@web.de>

Hello again!

In my previous post today I thought that elephant does not run in LispWorks on win32 due to a problem in clsql. This was wrong. In fact, no tests failed for clsql. The problem appears to relate more to elephant:

ELE-TESTS 1 : 1 > :bb
#
Condition: Dynamic-Slot-Value-Using-Slotd is not defined for slot ELEPHANT::%OID in #

Call to (METHOD (SETF CLOS::DYNAMIC-SLOT-VALUE-USING-SLOTD) (T T T)) (offset 61)
  CLOS::NEW-VALUE : 1
  CLOS::INSTANCE : #
  CLOS::SLOTD : #

Binding frame:
  CLOS::*SETF-FROM-SLOT-MISSING* : NIL

Call to (METHOD (SETF CLOS::DYNAMIC-SLOT-VALUE-USING-SLOTD) (T T T)) (offset 136)
  CLOS::NEW-VALUE : 1
  CLOS::INSTANCE : #
  CLOS::SLOTD : #

Call to (METHOD INITIALIZE-INSTANCE :BEFORE (PERSISTENT)) (offset 73)
  ELEPHANT::INSTANCE : #
  ELEPHANT::INITARGS : :DONT-KNOW
  ELEPHANT::FROM-OID : 1
  ELEPHANT::SC : #
  CLOS::.ISL. : :DONT-KNOW

Binding frame:
  CLOS::*NEXT-METHODS* : NIL

Call to # (offset 51)

Call to # (offset 126)

Call to CLOS::MAKE-INSTANCE-FROM-CLASS-1 (offset 469)
  CLASS : #
  CLOS::INITARGS : (:SC # :FROM-OID 1)

Call to (METHOD ELEPHANT::OPEN-CONTROLLER (ELEPHANT-CLSQL::SQL-STORE-CONTROLLER)) (offset 304)
  ELEPHANT-CLSQL::SC : #
  DBG::G : :DONT-KNOW
  ELEPHANT-CLSQL::RECOVER : :DONT-KNOW
  ELEPHANT-CLSQL::RECOVER-FATAL : :DONT-KNOW
  ELEPHANT-CLSQL::THREAD : :DONT-KNOW
  CLOS::.ISL. : #(#> # NIL NIL> # 2 0)
  CLOS::.PV. : #
  ELEPHANT-CLSQL::DBTYPE : :POSTGRESQL-SOCKET
  ELEPHANT-CLSQL::CON : #

Call to # (offset 126)

Call to OPEN-STORE (offset 136)
  ELEPHANT::SPEC : (:CLSQL (:POSTGRESQL-SOCKET "localhost" "clsql-tests" "postgres" "$postgres%"))
  ELEPHANT::RECOVER : NIL
  ELEPHANT::RECOVER-FATAL : NIL
  ELEPHANT::THREAD : T

Binding frame:
  *STORE-CONTROLLER* : NIL

Call to DO-BACKEND-TESTS (offset 93)

Is there a solution? Can I still hope to run elephant on Windows in LispWorks (or clisp)?

Frank Schorr

From eslick at csail.mit.edu  Fri Jul 14 01:20:45 2006
From: eslick at csail.mit.edu (Ian S Eslick)
Date: Thu, 13 Jul 2006 21:20:45 -0400
Subject: [elephant-devel] clisp
References: <132509901@web.de>
Message-ID: <008e01c6a6e3$bc7a5a80$0801000a@PRIMARY>

I'm not sure we've debugged the system under clisp or Win32 yet. It's possible we're suffering from a lack of MOP support in CLISP; as I recall, there are some incompatibilities in the CLISP MOP. This is CLISP + Elephant + CL-SQL and the various required libraries? I can take a peek at it tomorrow.

Ian

From aycan.irican at core.gen.tr  Tue Jul 18 16:04:20 2006
From: aycan.irican at core.gen.tr (Aycan iRiCAN)
Date: Tue, 18 Jul 2006 19:04:20 +0300
Subject: [elephant-devel] typo fix
Message-ID: <87mzb66gsb.fsf@core.gen.tr>

Here is a typo fix for the cvs head.

$ cvs diff -u src/elephant/classindex.lisp
Index: src/elephant/classindex.lisp
===================================================================
RCS file: /project/elephant/cvsroot/elephant/src/elephant/classindex.lisp,v
retrieving revision 1.13
diff -u -r1.13 classindex.lisp
--- src/elephant/classindex.lisp 19 Jun 2006 01:03:30 -0000 1.13
+++ src/elephant/classindex.lisp 18 Jul 2006 15:50:58 -0000
@@ -257,12 +257,12 @@
              slot-name (class-name class))
       (progn
         (when update-class (register-indexed-slot class slot-name))
-;;      (with-transaction (:store-controller sc)
-          (add-index (find-class-index class :sc sc)
-                     :index-name slot-name
-                     :key-form (make-slot-key-form class slot-name)
-                     :populate populate))
-        t))
+        ;; (with-transaction (:store-controller sc)
+        (add-index (find-class-index class :sc sc)
+                   :index-name slot-name
+                   :key-form (make-slot-key-form class slot-name)
+                   :populate populate)
+        t)))
 
 (defmethod remove-class-slot-index ((class symbol) slot-name &key (sc *store-controller*))
   (remove-class-slot-index (find-class class) slot-name :sc sc))

Best Regards.

--
Aycan iRiCAN
C0R3 Computer Security Group
http://www.core.gen.tr

From aycan.irican at core.gen.tr  Tue Jul 18 16:29:45 2006
From: aycan.irican at core.gen.tr (Aycan iRiCAN)
Date: Tue, 18 Jul 2006 19:29:45 +0300
Subject: [elephant-devel] another patch for bdb-controller
Message-ID: <87k66a6fly.fsf@core.gen.tr>

SBCL complained about an invalid feature expression, so I fixed it.

$ cvs diff -u src/db-bdb/bdb-controller.lisp
Index: src/db-bdb/bdb-controller.lisp
===================================================================
RCS file: /project/elephant/cvsroot/elephant/src/db-bdb/bdb-controller.lisp,v
retrieving revision 1.9
diff -u -r1.9 bdb-controller.lisp
--- src/db-bdb/bdb-controller.lisp 19 Jun 2006 00:47:24 -0000 1.9
+++ src/db-bdb/bdb-controller.lisp 18 Jul 2006 16:14:55 -0000
@@ -193,7 +193,8 @@
 
 (defmethod shell-kill (pid)
   #+allegro (sys:reap-os-subprocess :pid pid :wait t)
-  #+(port (not allegro)) (port:run-prog "kill" :wait t :args (list "-9" (format nil "~A" pid)))
+  #+(and (not allegro) port) (port:run-prog "kill" :wait t :args (list "-9" (format nil "~A" pid)))
+  #+(and sbcl linux) (sb-ext:process-kill "/bin/kill" (list "-9" (format nil "~A" pid)))
   )
 
 ;;

Best Regards

--
Aycan iRiCAN
C0R3 Computer Security Group
http://www.core.gen.tr

From read at robertlread.net  Tue Jul 18 16:46:18 2006
From: read at robertlread.net (Robert L. Read)
Date: Tue, 18 Jul 2006 11:46:18 -0500
Subject: [elephant-devel] another patch for bdb-controller
In-Reply-To: <87k66a6fly.fsf@core.gen.tr>
References: <87k66a6fly.fsf@core.gen.tr>
Message-ID: <1153241178.4658.498.camel@localhost.localdomain>

Thanks!

I'm completely busy until after this week (I have to prepare three lectures, and not in my native language, for a convention this weekend). I'll review these and add them next week, if Ian doesn't do it first.

From eslick at csail.mit.edu  Tue Jul 18 17:10:55 2006
From: eslick at csail.mit.edu (Ian Eslick)
Date: Tue, 18 Jul 2006 13:10:55 -0400
Subject: [elephant-devel] typo fix
In-Reply-To: <87mzb66gsb.fsf@core.gen.tr>
References: <87mzb66gsb.fsf@core.gen.tr>
Message-ID: <44BD161F.9080802@csail.mit.edu>

I'll put these in tomorrow; my current short-term deadline is tonight.

Ian

From lists at infoway.net  Wed Jul 26 21:36:33 2006
From: lists at infoway.net (Daniel Salama)
Date: Wed, 26 Jul 2006 17:36:33 -0400
Subject: [elephant-devel] Design Suggestion Request
Message-ID:

Hi all,

I'm relatively new to OODBs and in particular to Elephant. I learn best by working with applications, so normally I would try to migrate something I have done in a different application into the new environment I'm learning.

For this exercise, I'd like to migrate an application I developed in a different environment using MySQL. It's a small invoicing application.

The concept I'm trying to visualize, for design, efficiency, and performance, is the following (very simplified version):

I have many customers who place many orders. The orders eventually get shipped and converted to invoices. Invoices can receive full payment or many partial payments, etc. You guys get the idea.

Coming from a relational world, I would have something like a customers table, an orders table with order_items, invoices with invoice_items, maybe something like transactions, etc. I would be able to query all customers, or get gross sales or gross payment reports, as well as look up any customer and get individual order history, invoice history, etc.

So, in my mind, in terms of architecting this, I could do something like a collection of customers where each customer object would have a few slots: one for orders, which will in turn have a collection of items; another for invoices, which will have another for its items; and so on. That setup, I would think, would give me a nice model for looking up a client and then pulling its corresponding historical records. However, it would seem to me that if I then wanted to run a gross sales or gross collections report, I would need to iterate through all customers to get the corresponding data and possibly then select specific date ranges or other filter criteria. That sounds extremely inefficient to me. Now, that could be just because that's the way OODBs work in general, or because I have a completely wrong design.

The other approach I thought of would be to model it similarly to how I would do it in a relational database. Basically, I would create separate collections of objects representing the tables I would have in the relational database. Then, within each object, e.g. a customer object, I would create a reference to a collection that holds a subset of invoices, for example. This would allow me to simply query the invoices collection of a customer and that's it. At the same time, I would be able to query the entire invoices collection.

This sounds like a better approach. However, I'm wondering if anyone could comment on the overall design, and even possibly provide some sample code or pseudocode for achieving it.

Thanks,
Daniel

From read at robertlread.net  Thu Jul 27 13:42:30 2006
From: read at robertlread.net (Robert L. Read)
Date: Thu, 27 Jul 2006 08:42:30 -0500
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To:
References:
Message-ID: <1154007750.4658.855.camel@localhost.localdomain>

On Wed, 2006-07-26 at 17:36 -0400, Daniel Salama wrote:
> The other approach I thought would be to model it similarly as to how
> I would do it in a relational database. Basically, I would create
> separate collections of objects representing the tables I would have
> in the relational database. Then, within each object, e.g. a customer
> object, I would create a reference to a collection that holds a
> subset of invoices, for example. This would allow me to simply query
> the invoices collection of a customer and that's it. At the same
> time, I would be able to query the entire invoices collection.

Dear Daniel,

I think this approach is much better than creating a very large object.

Personally, I have an opinion a lot of people disagree with --- I use the "prevalence" model, which is basically that I keep all of the objects in memory, and when I change something I write it back to the data store. This pretty much makes your reporting efficiency issues go away, because you can compute any report really, really fast.

I have checked in, in the "contrib" directory, a package called DCM, for Data Collection Management, that does the in-memory management --- the responsibility of the user is to call "register-object" whenever an object needs to be written back. DCM also supports the "reference" problem that you mention --- that is, instead of putting a whole object into a slot, you put the key there and look it up in a separate object.
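
A minimal sketch of that key-reference idea, in plain Common Lisp rather than DCM itself. The structure names, the `*customers*` table, and the helper functions are all hypothetical stand-ins chosen for illustration; DCM's real interface may look quite different.

```lisp
;; Plain Common Lisp illustration; every name here is hypothetical, not DCM's API.
(defstruct customer id name)

;; The order holds only the customer's key, not the customer object itself.
(defstruct order id customer-id total)

(defvar *customers* (make-hash-table)
  "Stand-in for a per-class collection, keyed by customer id.")

(defun register-customer (customer)
  "Store CUSTOMER in the collection under its id and return it."
  (setf (gethash (customer-id customer) *customers*) customer))

(defun order-customer (order)
  "Resolve the key stored in ORDER to the customer object, or NIL if unknown."
  (values (gethash (order-customer-id order) *customers*)))

;; Example use:
(register-customer (make-customer :id 42 :name "Acme"))
(customer-name (order-customer (make-order :id 1 :customer-id 42 :total 100)))
;; => "Acme"
```

Storing the key keeps each order small and lets the customer be updated in one place, at the cost of one extra lookup per dereference.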

In this model, you define a class for each kind of thing you would objectify (which is very similar to the "tables" in the relational model or the "entities" in the entity-relationship model). Each such class gets a "director", and you operate against the director when you do something. One of the advantages of this approach is that you can choose the "strategy" for each director --- so you can choose to cache the objects in memory, or to directly use the database store, or even to use a generational system.

I think the tests of DCM could be considered a little bit of pseudocode for what you want.

In considering whether or not things should be kept in memory, one should do the math: the maximum number of objects * their size vs. the free memory. Memories are big and getting bigger.

Let me know if this addresses your ideas or if you have any other questions; Ian and I had previously agreed that the lack of a big example application is one of the biggest weaknesses in Elephant right now.

From eslick at csail.mit.edu  Thu Jul 27 14:08:43 2006
From: eslick at csail.mit.edu (Ian Eslick)
Date: Thu, 27 Jul 2006 10:08:43 -0400
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <1154007750.4658.855.camel@localhost.localdomain>
References: <1154007750.4658.855.camel@localhost.localdomain>
Message-ID: <44C8C8EB.1090406@csail.mit.edu>

Another way to get reasonable reporting efficiency for queries with multiple constraints is to use cursors over secondary indices directly.

For your example I would create a CLOS class in Elephant instead of a table. Each instance of this class is like a record in a table. To link two records together you have slots that contain references to the contained objects. You don't keep collections; those are stored in secondary indices so they are ordered and readout can be linear in the size of the result set.

(defpclass customer ()
  ((name ... :index t)))

(defpclass invoice ()
  ((number ... :index t)
   (date ... :index t)
   (customer ... :index t)))

You can now easily do BTree lookups using secondary indices via the 0.6.0 indexing mechanism:

(get-instances-by-value 'invoice 'number <invoice-number>)      => returns a single invoice
(get-instances-by-range 'invoice 'date <start-date> <end-date>) => returns a list of invoices between two dates
(get-instances-by-value 'invoice 'customer <customer>)

These are highly efficient calls, although they do require going to disk unless you are using DCM. The performance for reports is not going to be a real problem: the secondary indices are ordered, so retrieving a set of invoices over a date range costs roughly O( # invoices in range + log # total invoices ), and for small ranges is effectively log 2N disk accesses (one for the index, one for deserializing the object if it's not cached).

The rub is when you want to do a query for a customer's invoices between two dates using the customer's name. In SQL this is a join query and the query compiler will typically optimize how this works. If either set is likely to be small (# of invoices per customer or # of invoices in a date range) you can write your own query to fetch one subset and filter it:

(defun get-customer-invoices-for-dates (name start-date end-date)
  (let ((customer (car (get-instances-by-value 'customer 'name name))))
    (select-if (lambda (invoice) (in-date-range-p invoice start-date end-date))
               (get-instances-by-value 'invoice 'customer customer))))

To handle joins over large collections, a more sophisticated function using cursors would need to be written, and in some cases you'd need to generate some new index data structures to get the time down to O( N log M ), with N the target dataset and M the largest # of instances for any class involved in the query.
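
To make the cost argument concrete, here is a rough sketch in plain Common Lisp of the "fetch one subset from an ordered index, then filter" pattern. It assumes a sorted vector as a stand-in for the ordered secondary index and integer dates of the form YYYYMMDD; `dated-invoice`, `lower-bound`, and the other names are hypothetical and are not part of Elephant.

```lisp
;; Plain Common Lisp stand-in for an ordered secondary index; not Elephant's API.
(defstruct dated-invoice number date customer)

(defun lower-bound (index date)
  "Binary search: first position in the date-sorted vector INDEX whose date is >= DATE."
  (let ((lo 0)
        (hi (length index)))
    (loop while (< lo hi)
          do (let ((mid (floor (+ lo hi) 2)))
               (if (< (dated-invoice-date (aref index mid)) date)
                   (setf lo (1+ mid))
                   (setf hi mid))))
    lo))

(defun invoices-in-range (index start end)
  "Collect invoices with START <= date <= END: O(log n) to seek, then linear in the result."
  (loop for i from (lower-bound index start) below (length index)
        for inv = (aref index i)
        while (<= (dated-invoice-date inv) end)
        collect inv))

(defun customer-invoices-in-range (index customer start end)
  "Range-scan by date first, then filter the (hopefully small) subset by customer."
  (remove-if-not (lambda (inv) (equal (dated-invoice-customer inv) customer))
                 (invoices-in-range index start end)))

;; Example use:
(defparameter *by-date*
  (vector (make-dated-invoice :number 1 :date 20060701 :customer "acme")
          (make-dated-invoice :number 2 :date 20060710 :customer "globex")
          (make-dated-invoice :number 3 :date 20060715 :customer "acme")
          (make-dated-invoice :number 4 :date 20060801 :customer "acme")))

(mapcar #'dated-invoice-number
        (customer-invoices-in-range *by-date* "acme" 20060705 20060731))
;; => (3)
```

The filter step is only cheap when the range scan returns few records, which is why the choice of which subset to fetch first matters.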

An easy way to do this is to use derived indices which, for example, can use any number of slots in an object to create an ordering, so you can pre-sort objects by date and then by customer and skip over customer-date pairs as you traverse the btree. That might require some more explanation but I don't have the time just now. :)

In a future release we hope to integrate DCM/prevalence style functionality more directly into the persistent object system, so that common linear queries are fast and only reports over non-resident (infrequently and not recently accessed) objects are expensive.

Good luck and let us know if you need more suggestions!

Ian

From fungsin.lui at gmail.com  Thu Jul 27 17:02:21 2006
From: fungsin.lui at gmail.com (Lui Fungsin)
Date: Thu, 27 Jul 2006 10:02:21 -0700
Subject: [elephant-devel] Fwd: elephant 0.60 + sbcl 0.9.4 + ubuntu 5.05 linux 2.6
In-Reply-To: <3990b5930607262015h22d0d691oe500a2efc3a3f9b2@mail.gmail.com>
References: <3990b5930607262015h22d0d691oe500a2efc3a3f9b2@mail.gmail.com>
Message-ID: <3990b5930607271002q1686babeib07cba1eceef286a@mail.gmail.com>

Hi Ian,

I tried to send the following to elephant-devel at common-lisp.net twice and both attempts bounced.

Could you please help forward this to the list?

Thank you very much!
--
fungsin

---------- Forwarded message ----------
From: Lui Fungsin
Date: Jul 26, 2006 8:15 PM
Subject: elephant 0.60 + sbcl 0.9.4 + ubuntu 5.05 linux 2.6
To: elephant-devel at common-lisp.net

Hi all,

I got the following error (backtrace included) when I tried to run the bdb backend test.

(asdf:operate 'asdf:load-op :elephant-tests)
(in-package "ELEPHANT-TESTS")
(setf *default-spec* *testbdb-spec*)
(do-backend-tests)

I installed the following packages on Ubuntu:

libdb4.3
libdb4.3-dev (for the C headers)

Note that I changed the libdb path to point to the correct location on my system:

(setf SLEEPYCAT::*sleepycat-foreign-library-path* "/usr/lib/libdb-4.3.so")

It seems that one of the UFFI calls fails. Any help would be much appreciated!

debugger invoked on a SLEEPYCAT::DB-ERROR in thread #:
  Berkeley DB error: Invalid argument

Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.
0: (SLEEPYCAT::DB-ENV-OPEN # "cl/libdb/elephant/tests/testdb/" :JOINENV NIL :INIT-CDB NIL :INIT-LOCK T :INIT-LOG T :INIT-MPOOL T :INIT-REP NIL :INIT-TXN T :RECOVER NIL :RECOVER-FATAL NIL :CREATE T :LOCKDOWN NIL :PRIVATE NIL :SYSTEM-MEM NIL :THREAD T :MODE 416) 1: ((SB-PCL::FAST-METHOD ELEPHANT::OPEN-CONTROLLER (SLEEPYCAT::BDB-STORE-CONTROLLER)) (#(NIL 3 2) . #()) # # (:RECOVER NIL :RECOVER-FATAL NIL :THREAD T)) 2: (OPEN-STORE (:BDB "cl/libdb/elephant/tests/testdb/") :RECOVER NIL :RECOVER-FATAL NIL :THREAD T) 3: (DO-BACKEND-TESTS (:BDB "cl/libdb/elephant/tests/testdb/")) 4: (SB-INT:EVAL-IN-LEXENV (DO-BACKEND-TESTS) #) 5: (SB-EXT:INTERACTIVE-EVAL (DO-BACKEND-TESTS)) 6: (SB-IMPL::REPL-FUN NIL) 7: ((LAMBDA ())) 8: ((LAMBDA ())) 9: (SB-IMPL::%WITH-REBOUND-IO-SYNTAX #) 10: (SB-IMPL::TOPLEVEL-REPL NIL) 11: (SB-IMPL::TOPLEVEL-INIT) 12: ((LABELS SB-IMPL::RESTART-LISP)) From lists at infoway.net Thu Jul 27 17:45:46 2006 From: lists at infoway.net (Daniel Salama) Date: Thu, 27 Jul 2006 13:45:46 -0400 Subject: [elephant-devel] Design Suggestion Request In-Reply-To: <44C8C8EB.1090406@csail.mit.edu> References: <1154007750.4658.855.camel@localhost.localdomain> <44C8C8EB.1090406@csail.mit.edu> Message-ID: <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net> Wow! What a great set of responses. Thanks so much for the information. Robert, I understand your approach. However, I don't know if using DCM at my beginner's stage may be more complicated. I also think that, although RAM is getting cheaper every day, there are just physical limitations that machines have. I'm looking over my application's data and it's not that bad, considering that after 2 years worth of data, I'm using approximately 1GB of hard drive space. So, independently of that, and if I understood you correctly, when I'm abstracting and accessing my model through the director, it sounds to me that either I persist all on storage or in cache (for each director). 
If I choose cache, then I have to manually write code to persist changes, instead of it being "automatic". I do like the whole DCM concept from a performance perspective but I certainly have to study further that whole framework to see how to put it into good use. Now, Ian, I think that what you explained addressed my immediate needs, considering my current knowledge of Elephant. I guess when I mentioned collections in my original email I should have been more specific to CLOS persistent objects, which would be more efficient. So, in a way, that's what I was looking for. What I didn't grasp clear enough was the secondary indexing and querying model, which certainly seem to do what I initially have in mind. I do think that I will need to implement more complex indices, such as indices spanning over multiple slots, as you mentioned, so if you don't mind expanding on what you started typing regarding the cursor iteration functions or models, I would really appreciate it. The other question I have is: I couldn't find documentation on the elephant online manual/tutorial regarding defpclass, get-instances-by- value, get-instances-by-range, and select-if. Is there any formal documentation about it or were those presented by you as pseudocode samples? Overall, I couldn't estimate at this moment how much it would benefit me by using cache vs hitting the storage penalty. As I said, I think I'll start with Ian's suggestions and as I learn more of Elephant and DCM, I'll start migrating/integrating DCM into the picture. Thanks again. I will keep you guys posted about my progress. Daniel On Jul 27, 2006, at 10:08 AM, Ian Eslick wrote: > Another way to get reasonable reporting efficiency for queries with > multiple constraints is to use cursors over secondary indices > directly. > > For your example I would create a CLOS class in Elephant instead of a > table. Each instance of this class is like a record in a table. 
To
> link two records together you have slots that contain references to
> the contained objects. You don't keep collections; those are to be
> stored in secondary indices so they are ordered and readout can be
> linear in the size of the result set.
>
>   (defpclass customer
>     ((name ... :index t)))
>
>   (defpclass invoice
>     ((number ... :index t)
>      (date ... :index t)
>      (customer ... :index t)))
>
> You can now easily do BTree lookups using secondary indices via the
> 0.6.0 indexing mechanism:
>
>   (get-instances-by-value 'invoice 'number ) => returns a single invoice
>   (get-instances-by-range 'invoice 'date ) => returns a list of invoices between two dates
>   (get-instances-by-value 'invoice 'customer
>
> These are highly efficient calls, although they do require going to
> disk unless you are using DCM. Performance for reports is not going
> to be a real problem, because the secondary indices are ordered:
> retrieving a set of invoices over a date range costs roughly
> O( # invoices in range + log # total invoices ), and for small ranges
> is effectively log 2N disk accesses (one for index, one for
> deserializing object if it's not cached).
>
> The rub is when you want to do a query for a customer's invoices
> between two dates using the customer's name. In SQL this is a join
> query and the query compiler will typically optimize how this works.
> If either set is likely to be small (# of invoices per customer or #
> of invoices in a date range) you can write your own query to fetch
> one subset and filter it:
>
>   (defun get-customer-invoices-for-dates (name start-date end-date)
>     (let ((customer (car (get-instances-by-value 'customer 'name name))))
>       (select-if (lambda (invoice)
>                    (in-date-range-p invoice start-date end-date))
>                  (get-instances-by-value 'invoice 'customer customer))))
>
> To handle joins over large collections, a more sophisticated function
> using cursors would need to be written, and in some cases you'd need
> to generate some new index data structures to get the time down to
> O ( N log M ), with N the target dataset and M the largest # of
> instances for any class involved in the query. An easy way to do
> this is to use derived indices which, for example, can use any number
> of slots in an object to create an ordering - so you can pre-sort
> objects by date and then by customer so you can skip over
> customer-date pairs as you traverse the btree. That might require
> some more explanation but I don't have the time just now. :)
>
> In a future release we hope to integrate DCM/prevalence style
> functionality more directly into the persistent object system so
> common linear queries are fast and only reports over non-resident
> (infrequently and not recently accessed) objects are expensive.
>
> Good luck and let us know if you need more suggestions!
>
> Ian
>
> Robert L. Read wrote:
>> On Wed, 2006-07-26 at 17:36 -0400, Daniel Salama wrote:
>>> The other approach I thought would be to model it similarly as to
>>> how I would do it in a relational database. Basically, I would
>>> create separate collections of objects representing the tables I
>>> would have in the relational database. Then, within each object,
>>> e.g. a customer object, I would create a reference to a collection
>>> that holds a subset of invoices, for example. This would allow me
>>> to simply query the invoices collection of a customer and that's
>>> it. At the same time, I would be able to query the entire invoices
>>> collection.
>> Dear Daniel,
>> I think this approach is much better than creating a very large
>> object.
>>
>> Personally, I have an opinion a lot of people disagree with --- I
>> use the "prevalence" model, which is basically that I keep all of
>> the objects in memory, and when I change something I write back to
>> the data store. This pretty much makes your reporting efficiency
>> issues go away, because you can compute any report really, really
>> fast.
>>
>> I have checked in, in the "contrib" directory, a package called
>> DCM, for Data Collection Management, that does the in-memory
>> management --- the responsibility of the user is to call
>> "register-object" whenever an object needs to be written back. DCM
>> also supports the "reference" problem that you mention --- that is,
>> instead of putting a whole object into a slot, you put the key
>> there and look it up in a separate object.
>>
>> In this model, you objectify each class of object (which is very
>> similar to the "tables" in the relational model or "entities" in
>> the entity-relational model). Each such class gets a "director",
>> and you operate against the director when you do something. One of
>> the advantages of this approach is that you can choose the
>> "strategy" for each director --- so you can choose to cache the
>> objects in memory, or to directly use the database store, or even
>> to use a generational system.
>>
>> I think the tests of DCM could be considered a little bit of
>> pseudocode for what you want.
>>
>> In considering whether or not things should be kept in memory, one
>> should do the math: the maximum number of objects * their size vs.
>> the free memory. Memories are big and getting bigger.
>>
>> Let me know if this addresses your ideas or if you have any other
>> questions; Ian and I had previously agreed that the lack of a big
>> example application is one of the biggest weaknesses in Elephant
>> right now.
>>
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel at common-lisp.net
>> http://common-lisp.net/mailman/listinfo/elephant-devel

From read at robertlread.net  Thu Jul 27 18:07:28 2006
From: read at robertlread.net (Robert L. Read)
Date: Thu, 27 Jul 2006 13:07:28 -0500
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
Message-ID: <1154023648.4658.885.camel@localhost.localdomain>

On Thu, 2006-07-27 at 13:45 -0400, Daniel Salama wrote:
> Robert, I understand your approach. However, I don't know if using
> DCM at my beginner's stage may be more complicated. I also think
> that, although RAM is getting cheaper every day, there are just
> physical limitations that machines have. I'm looking over my
> application's data and it's not that bad, considering that after 2
> years worth of data, I'm using approximately 1GB of hard drive
> space.
> So, independently of that, and if I understood you correctly, when
> I'm abstracting and accessing my model through the director, it
> sounds to me that either I persist all on storage or in cache (for
> each director). If I choose cache, then I have to manually write
> code to persist changes, instead of it being "automatic".

Yes, you have understood the issues exactly.

A reasonable way to work is to use completely persistent objects and
see how the performance is for you --- LISP and elephant support this
kind of rapid prototyping extremely well. I may be a bit
old-fashioned---but I often find that I end up having to take explicit
control of the write-back policy in any case, and I personally never
find having to remember when to write things a burden, since they are
almost always part of a "business rule", if you're using a 3-tiered
application.

On the other hand, you can follow your plan based on Ian's idea, and
similarly layer on secondary indexes once prototyping shows that you
need them.

From lists at infoway.net  Thu Jul 27 18:23:14 2006
From: lists at infoway.net (Daniel Salama)
Date: Thu, 27 Jul 2006 14:23:14 -0400
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <1154023648.4658.885.camel@localhost.localdomain>
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
Message-ID:

Granted. Now, your original suggestion addressed the issue of using
collections in slots and instead of collections of objects, simply to
use collections of references to objects, which makes sense and in a
way is somewhat along the lines of what indexes do (from a general
PoV).

Not having looked at DCM yet, is it possible to just use the
"persistence machinery" and DCM in a more seamless fashion? For
example, if I declare a persistent CLOS class, can I hook that up to
DCM and get the benefits of DCM and persistence at the same time? From
Ian's last statement, this doesn't seem possible yet, but I may be
wrong.

Thanks,
Daniel

On Jul 27, 2006, at 2:07 PM, Robert L. Read wrote:
> A reasonable way to work is to use completely persistent objects
> and see how the performance is for you --- LISP and elephant
> support this kind of rapid prototyping extremely well. I may be a
> bit old-fashioned---but I often find that I end up having to take
> explicit control of the write-back policy in any case, and I
> personally never find having to remember when to write things a
> burden, since they are almost always part of a "business rule", if
> your using a 3-tiered application.
>
> On the other hand, you can follow your plan based on Ian's idea,
> and similar layer on secondary indexes once prototyping shows that
> you need them.

From read at robertlread.net  Thu Jul 27 18:47:42 2006
From: read at robertlread.net (Robert L. Read)
Date: Thu, 27 Jul 2006 13:47:42 -0500
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To:
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
Message-ID: <1154026062.4658.907.camel@localhost.localdomain>

On Thu, 2006-07-27 at 14:23 -0400, Daniel Salama wrote:
> Not having looked at DCM yet, is it possible to just use the
> "persistence machinery" and DCM in a more seamless fashion? For
> example, if I declare a persistent CLOS class, can I hook that up to
> DCM and get the benefits of DCM and persistence at the same time? From
> Ian's last statement, this doesn't seem possible yet, but I may be
> wrong.

No, that isn't possible.

It is possible to design a CLOS structure first (without the concern
of persistence), and then with a very small amount of work, use direct
persistence and slot-based indexing and functional indexes if
necessary to make everything persistent. It would then be reasonable
to switch over to a DCM based system by making those classes inherit
from "managed-object" instead.

It would not take too much work... maybe a few hours... to move back
and forth between the two ideas. But they cannot be simultaneously
employed in any useful way.

Actually one of the great things about Elephant is that it allows you
to change, or delay, these kinds of implementation decisions as much
as possible---for example, which technology you use as the back-end
database.

From eslick at csail.mit.edu  Thu Jul 27 18:56:03 2006
From: eslick at csail.mit.edu (Ian Eslick)
Date: Thu, 27 Jul 2006 14:56:03 -0400
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To:
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
Message-ID: <44C90C43.6030606@csail.mit.edu>

All,

I think the long term goal is to make read/write policy something that
is integrated into a per-class policy with per-instance state. Robert,
have you forwarded that discussion we had about DCM to the list? If so
it should be in the archives from several months back. The short story
is that you want to have different policies with different amounts of
explicit control based on your problem.

For small DB's where ACID properties aren't important - just keep
everything in memory as if Elephant wasn't there. If you want to save
something - just tell the object to store itself to get the benefits of
auto serialization.

For larger DB's you might want a tiered approach:
- Critical new objects have read-from-memory but write-through-to-disk
  for fast reads but safe writes
- Non-critical new objects are read/write from memory unless explicitly
  checkpointed
- Critical objects that aren't new are read/write from disk (existing
  Elephant default) so you don't overwhelm your physical memory

Rucksack has some of the same features as DCM, but is not yet ready for
real deployment (since I last looked a few months back).

The 0.6.0 manual should have a section on indexing and the function
reference should have get-instances-by-class (an elephant function in
elephant/classindex.lisp). defpclass may not be documented but is just
shorthand for adding the :metaclass class option. Just M-. defpclass
with elephant loaded. In fact the doc strings are available for all
these if you do M-. via Slime when the elephant package is loaded.

The select-if is a function I use locally that just collects each
element of the list accepted by the predicate; it's not a part of
elephant.

I'm not sure when we'll get to adding new major features but if someone
would like to dive in and think about this I'm happy to answer questions
about the rationale behind the code.

Ian

From midfield at gmail.com  Thu Jul 27 19:14:22 2006
From: midfield at gmail.com (Ben)
Date: Thu, 27 Jul 2006 14:14:22 -0500
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <44C90C43.6030606@csail.mit.edu>
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
    <44C90C43.6030606@csail.mit.edu>
Message-ID: <9157df230607271214i7f083509r710449e82adbc5ae@mail.gmail.com>

i wonder if these sorts of design issues for elephant newbies might be
laid out in a document or tutorial. perhaps the "blog in a minute"
tutorial can be updated to reflect your more mature codebase and
mature approaches to designing applications? or a design faq?

just an idea,
Ben

From lists at infoway.net  Thu Jul 27 19:32:55 2006
From: lists at infoway.net (Daniel Salama)
Date: Thu, 27 Jul 2006 15:32:55 -0400
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <9157df230607271214i7f083509r710449e82adbc5ae@mail.gmail.com>
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
    <44C90C43.6030606@csail.mit.edu>
    <9157df230607271214i7f083509r710449e82adbc5ae@mail.gmail.com>
Message-ID: <029AC51A-F688-4ACE-8264-C929B494F0CE@infoway.net>

That's an excellent idea. I wasn't aware there's a blog for Elephant.
Is there one?

Thanks,
Daniel

From eslick at csail.mit.edu  Thu Jul 27 19:43:30 2006
From: eslick at csail.mit.edu (Ian Eslick)
Date: Thu, 27 Jul 2006 15:43:30 -0400
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <029AC51A-F688-4ACE-8264-C929B494F0CE@infoway.net>
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
    <44C90C43.6030606@csail.mit.edu>
    <9157df230607271214i7f083509r710449e82adbc5ae@mail.gmail.com>
    <029AC51A-F688-4ACE-8264-C929B494F0CE@infoway.net>
Message-ID: <44C91762.1040402@csail.mit.edu>

I think there was a tutorial that included a simple design for a
persistent logging system and queries against it. Should be in
examples/index-tutorial.lisp.

There's a nice wiki example for UCW that could easily be adapted to
show off elephant - and a few of our users, including me, use elephant
as a data backend for a UCW front end. UCW is a bit much for a general
tutorial though.

I think a blog backend to portable aserve would make for a nice example
along with a discussion of tradeoffs. Of course someone would need to
volunteer for such a thing. It will be a little while before I can do
more than small support sessions or bug fixes for elephant... :)

Ian

From read at robertlread.net  Thu Jul 27 20:07:23 2006
From: read at robertlread.net (Robert L. Read)
Date: Thu, 27 Jul 2006 15:07:23 -0500
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <44C91762.1040402@csail.mit.edu>
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
    <44C90C43.6030606@csail.mit.edu>
    <9157df230607271214i7f083509r710449e82adbc5ae@mail.gmail.com>
    <029AC51A-F688-4ACE-8264-C929B494F0CE@infoway.net>
    <44C91762.1040402@csail.mit.edu>
Message-ID: <1154030844.4658.916.camel@localhost.localdomain>

A really good tutorial would be great; the test suite is probably the
closest thing we have right now. I absolutely have to focus on getting
my business off the ground (which uses Elephant on top of Postgres)
before I volunteer for any more significant work.

On Thu, 2006-07-27 at 15:43 -0400, Ian Eslick wrote:
> I think a blog backend to portable aserve would make for a nice
> example along with a discussion of tradeoffs. Of course someone would
> need to volunteer for such a thing. It will be a little while before
> I can do more than small support sessions or bug fixes for
> elephant... :)

From eslick at csail.mit.edu  Thu Jul 27 20:14:13 2006
From: eslick at csail.mit.edu (Ian Eslick)
Date: Thu, 27 Jul 2006 16:14:13 -0400
Subject: [elephant-devel] Design Suggestion Request
In-Reply-To: <1154030844.4658.916.camel@localhost.localdomain>
References: <1154007750.4658.855.camel@localhost.localdomain>
    <44C8C8EB.1090406@csail.mit.edu>
    <83AA4351-3BD5-4D1A-BB75-FA0E821CB23B@infoway.net>
    <1154023648.4658.885.camel@localhost.localdomain>
    <44C90C43.6030606@csail.mit.edu>
    <9157df230607271214i7f083509r710449e82adbc5ae@mail.gmail.com>
    <029AC51A-F688-4ACE-8264-C929B494F0CE@infoway.net>
    <44C91762.1040402@csail.mit.edu>
    <1154030844.4658.916.camel@localhost.localdomain>
Message-ID: <44C91E95.8050201@csail.mit.edu>

I'm in a similar boat related to graduating - but I'm happy to support
a volunteer or two by reviewing, answering questions and suggesting.
Any takers to do some small tasks to make elephant a little better or
to document the larger tasks that would make it lots better?

As for Daniel's request for more details on special indexing; bug me in
a week or two with some specifics about what the existing indexing
didn't do for you. I've created some of my own special structures
(many-to-many maps, for example) in my own application that could use
some tuning but then be generally available.

Ian

Robert L. Read wrote:
> A really good tutorial would be great; the test suite is probably the
> closest thing we have right now.
> I absolutely have to focus on getting my business off the ground
> (which uses Elephant on top of
> Postgres) before I volunteer for any more significant work.
>
> On Thu, 2006-07-27 at 15:43 -0400, Ian Eslick wrote:
>> I think a blog backend to portable aserve would make for a nice example
>> along with a discussion of tradeoffs. Of course someone would need to
>> volunteer for such a thing. It will be a little while before I can do
>> more than small support sessions or bug fixes for elephant...
:) From eslick at csail.mit.edu Fri Jul 28 21:12:20 2006 From: eslick at csail.mit.edu (Ian Eslick) Date: Fri, 28 Jul 2006 17:12:20 -0400 Subject: [elephant-devel] Querying Elephant Message-ID: <44CA7DB4.4040006@csail.mit.edu> Here is a start on a clearer statement of the standard thinking about designing objects and writing query functions. I invite hacking on this but questions will probably go unanswered for the next week or so. There are two operations that all queries in Elephant stem from: 1) A BTree lookup of a value given a key 2) A linear traversal of a BTree in an order defined by the native key ordering There are two types of BTrees 1) Standard BTrees require that all keys are unique 2) Secondary BTrees allow duplicate key values, ordering of values with duplicate keys is undefined (but in practice I believe it's the inverse order of insertion) Given two classes: (defpclass blog-entry ((user :index t) (category ) (title ) (date :index t) (content ))) (defpclass user ((name :index t))) Question to think about: Why didn't I put in an index for category? Now say we want to generate the following blog entry summary pages: 1) All a user's entries 2) All a user's entries under a given category 3) Users blog entries for the past month 4) Above filtered by category Simple index query for a unique element: ============================= (defun get-user (name) (get-instance-by-value 'user 'name name)) Simple index query for a set of elements: ============================= (defun user-blog-entries (name) (get-instances-by-value 'blog-entry 'user (get-user name))) Query a set of objects from an index, but filter by some property: ============================================== The answer to the above question, why there is no index for category is that I know in advance that there are lots of instances for each category so paying the cost of grouping them into an index isn't worthwhile. 
I rarely ask for all users from a category (see #5) so can pay linear cost
in that case.  If I do that a lot, then go ahead and add the index!

There are two ways to do this one.  Both require deserializing the values,
in this case the blog entry records, into memory, and both require
accessing the slot value.  Neither requires deserializing the larger
content until it's needed for rendering.  The only real difference is in
the consing.  The first one conses more than the other, but for small query
sets this isn't a problem and it is certainly easier to read.

;; Version 1: filter the list returned by the index query.
(defun user-blog-entries-by-category (name category)
  (select-if (lambda (entry)
               (eq (blog-entry-category entry) category))
             (user-blog-entries name)))

;; Version 2: walk a cursor and collect the matches directly.
(defun user-blog-entries-by-category (name category &aux results)
  "These loops need some nice macros to clean them up.  This may not
run and was written from memory without testing but is indicative"
  (flet ((get-next (cur)
           (multiple-value-bind (exists? skey val pkey) (cursor-next cur)
             (declare (ignore skey pkey))
             (when exists?
               (when (eq (blog-entry-category val) category)
                 (push val results))
               ;; Keep looping for as long as the cursor yields entries.
               t))))
    (with-transaction ()
      (with-inverted-cursor (cur blog-entry (get-user name))
        (loop while (get-next cur))
        (nreverse results)))))

;; select-if as used above (equivalent to the standard remove-if-not):
;; (defun select-if (fn data)
;;   (labels ((rec (in out)
;;              (cond ((null in)
;;                     (nreverse out))
;;                    ((funcall fn (car in))
;;                     (rec (cdr in) (cons (car in) out)))
;;                    (t (rec (cdr in) out)))))
;;     (rec data nil)))

Query a range of objects, then filter by user
=============================================
Which is bigger?  Get the likely small set and then filter by the
criterion for the larger set.
(The general set intersection compilation optimization when set sizes are
known a priori)

(defun get-user-entries-for-dates (name startdate enddate)
  (let ((user (get-user name)))
    (select-if (lambda (entry)
                 (eq (blog-entry-user entry) user))
               (get-instances-by-range 'blog-entry 'date startdate enddate))))

Query and constrain multiple values
===================================
To query all entries for a user and in a given category between two dates,
do something like the above, but select-if over, say, username and
category.  The general principle is to select the likely smallest set and
then filter those elements by the other criteria.  This works well in most
cases.

If you really want to speed up looking up the conjunction of two or more
features, you can build an index which orders these totally.  That is to
say, make a function provided to (make-derived-index ) that, given an
object, computes a number (or string) that creates a new namespace ordered
in the appropriate way.

Let's say I do a lot of category-based user and date queries to generate
blog pages from a server database.  There are a large number of records for
each date and category, and maybe for a given user as well (I would just
look up by user and then filter, but over several years this probably
becomes a large set).  You would like an index that implements:

USER | CATEGORY | DATE

For example, I could make a bignum with a 32-bit user ID, 16-bit category
ID and a 32-bit date and map/shift the specific values appropriately.

I can also put these same fields into a fixed-field delimited string.
Slow lookup, but easy to debug.

"IAN |Hacking |2423156663"

I can now do range queries over conjunctions of NAME+CATEGORY over a given
date range in time approximately linear in the result set.

I'm sure there's a cleaner solution to this problem, as this is common in
the RDB/SQL implementation world (join tables?), but this works well for
most common applications people are likely to be using elephant for.

Enjoy, hack up or critique at your leisure.

Ian
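
As a rough illustration of the USER | CATEGORY | DATE bignum key described above, here is a minimal sketch in plain Common Lisp.  The function names (make-composite-key, composite-key-range, split-composite-key) are hypothetical and not part of Elephant; the sketch only shows the shifting and assumes the user, category and date have already been mapped to integers of the stated widths.

```lisp
;; Hypothetical helpers, not Elephant API.  Only standard Common Lisp
;; operators (ASH, LOGIOR, LDB, BYTE, CHECK-TYPE) are used.

(defun make-composite-key (user-id category-id date)
  "Pack a 32-bit USER-ID, a 16-bit CATEGORY-ID and a 32-bit DATE into one
integer.  USER-ID sits in the most significant bits, so keys sort by
user first, then category, then date."
  (check-type user-id (unsigned-byte 32))
  (check-type category-id (unsigned-byte 16))
  (check-type date (unsigned-byte 32))
  (logior (ash user-id 48)        ; bits 48..79
          (ash category-id 32)    ; bits 32..47
          date))                  ; bits 0..31

(defun composite-key-range (user-id category-id start-date end-date)
  "Return, as two values, the lowest and highest keys for one user and
category between START-DATE and END-DATE inclusive."
  (values (make-composite-key user-id category-id start-date)
          (make-composite-key user-id category-id end-date)))

(defun split-composite-key (key)
  "Unpack KEY into user ID, category ID and date (three values)."
  (values (ldb (byte 32 48) key)
          (ldb (byte 16 32) key)
          (ldb (byte 32 0) key)))
```

Because the fields are fixed width, every key for a given user and category falls between the two values returned by composite-key-range, which is what lets a single range query over such an index cover the conjunction.  How the key function is registered as a derived index is left out here, since that call is not spelled out in the message above.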