[elephant-devel] Add indexed persistent class slots to elephant?

Ian Eslick eslick at csail.mit.edu
Tue Jan 24 04:09:47 UTC 2006


While diving into the Elephant code to understand it better, I started 
to think about my normal usage model.  One common pattern is to look up 
objects by slot value or by a range of slot values.  Adding an 
':indexed' initarg to the metaclass would allow for some simple default 
functionality:

low-level interface:
- define cursors over persistent-class slots as well as btrees and
  secondary indices
- make it easy to iterate over duplicate class+slot and
  class+slot+value keys
- we get an index of every persistent-object of a given class if we
  implement the right comparison operation.

mid-level interface:
- grab sets of objects based on slot-name and slot-value or range of
  slot values

high-level interface:
- a simple constraint language with boolean combinators that selects
  instances based on various combinations of slot ranges or values
- it becomes easier to compile constraints when the class directly
  contains information that tells you what indexes exist, so you can
  optimize the query ahead of time.
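
To make the mid-level and high-level interfaces concrete, here is a 
rough, language-neutral sketch in Python rather than Elephant's Lisp.  
Everything in it is hypothetical: the SlotIndex class, the q_and/q_or 
combinators, and the use of a sorted in-memory list as a stand-in for 
a btree are assumptions for illustration, not Elephant code.

```python
import bisect


class SlotIndex:
    """Sorted (value, oid) pairs for one class slot; a stand-in for a btree."""

    def __init__(self):
        self._pairs = []  # kept sorted; duplicate values are allowed

    def add(self, value, oid):
        bisect.insort(self._pairs, (value, oid))

    def range(self, low, high):
        """Return the set of oids whose slot value v satisfies low <= v <= high."""
        # (low,) sorts before every (low, oid), so this finds the first match.
        start = bisect.bisect_left(self._pairs, (low,))
        result = set()
        for value, oid in self._pairs[start:]:
            if value > high:
                break
            result.add(oid)
        return result

    def equal(self, value):
        """Return the set of oids whose slot value equals value."""
        return self.range(value, value)


def q_and(*oid_sets):
    """Boolean AND: oids present in every constraint's result."""
    return set.intersection(*oid_sets)


def q_or(*oid_sets):
    """Boolean OR: oids present in any constraint's result."""
    return set.union(*oid_sets)


# Hypothetical data: oid -> (age, salary)
age, salary = SlotIndex(), SlotIndex()
for oid, (a, s) in {1: (25, 50), 2: (31, 80), 3: (31, 60), 4: (47, 90)}.items():
    age.add(a, oid)
    salary.add(s, oid)

both = q_and(age.range(30, 40), salary.range(70, 100))
either = q_or(age.range(30, 40), salary.range(70, 100))
```

Because each constraint resolves to a set of oids, the combinators 
never need to touch the objects themselves.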

Supporting this requires adding an additional around method to 
(setf slot-value-using-class) on persistent slots that specializes on 
indexed slots and updates the slot index, and then potentially adding 
an additional layer of cursor operators.  This is optional 
functionality that will only slow down write operations, not reads, 
and it will be backwards compatible.  It should be easy to add SQL 
support.  The benefit is some nice default behavior that makes the 
database aspect of the low-level interfaces much more directly 
accessible to new users.
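
As a rough analogy for the write hook described above, the following 
Python sketch uses a descriptor where the Lisp version would use an 
around method.  The IndexedSlot and Person names and the in-memory 
dict index are assumptions made for illustration only; this is not 
the metaclass code from my local copy.

```python
from collections import defaultdict

_MISSING = object()  # sentinel: the slot has never been written


class IndexedSlot:
    """Descriptor that keeps a value -> {oids} index current on every write."""

    def __set_name__(self, owner, name):
        self.name = name
        self.index = defaultdict(set)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class-level access exposes the index
        return obj.__dict__[self.name]  # reads never touch the index

    def __set__(self, obj, value):
        old = obj.__dict__.get(self.name, _MISSING)
        if old is not _MISSING:
            # Drop the stale entry before recording the new value.
            self.index[old].discard(obj.oid)
            if not self.index[old]:
                del self.index[old]
        obj.__dict__[self.name] = value
        self.index[value].add(obj.oid)


class Person:
    age = IndexedSlot()  # hypothetical indexed slot

    def __init__(self, oid, age):
        self.oid = oid  # must be set before any indexed slot is written
        self.age = age


p = Person(1, 30)
q = Person(2, 30)
p.age = 31  # the write updates the index; q is untouched
```

Only the write path does extra work here, which matches the claim 
that reads are not slowed down.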

On my local copy I've implemented the metaclass support, overloading, 
and a good chunk of the constraint language, and it still passes all 
of the current tests.  I think I understand the problem well enough 
now to query the user community for advice and buy-in.  I have yet to 
support all the unpleasant details related to changing classes, but 
the implications of dropping or adding an indexed slot are rather 
straightforward, so I think that finishing the implementation and 
writing the appropriate tests isn't too much work.

The first question is whether the primary developers and users are open 
to the addition of this feature.

If so, the big design question I'm facing at present is:

1) Reuse the current btree infrastructure to create a btree for each 
class that maps oids to persistent-objects and instantiate a secondary 
index for each indexed slot using the slot accessor functions.  This is 
the easiest to implement, but might provide somewhat poor performance 
on creates and writes.
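
A minimal Python sketch of the shape of option 1, with plain dicts 
standing in for the btree and its secondary indices.  The ClassIndex 
name and its methods are hypothetical, not Elephant's btree API.

```python
class ClassIndex:
    """Primary map oid -> object plus one secondary index per indexed slot."""

    def __init__(self, **accessors):
        self.primary = {}                     # oid -> object
        self.accessors = accessors            # slot name -> accessor function
        self.secondary = {name: {} for name in accessors}

    def insert(self, oid, obj):
        self.primary[oid] = obj
        # Every create touches every secondary index: the write cost.
        for name, accessor in self.accessors.items():
            self.secondary[name].setdefault(accessor(obj), set()).add(oid)

    def lookup(self, slot_name, value):
        """Return the objects whose slot equals value, ordered by oid."""
        oids = self.secondary[slot_name].get(value, ())
        return [self.primary[oid] for oid in sorted(oids)]


people = ClassIndex(age=lambda o: o["age"], city=lambda o: o["city"])
people.insert(1, {"age": 30, "city": "Boston"})
people.insert(2, {"age": 30, "city": "Austin"})
people.insert(3, {"age": 41, "city": "Boston"})
```

The loop in insert is where the extra cost on creates and writes 
comes from: it grows with the number of indexed slots.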

2) Create another underlying DB with string keys 
"class-name+slot-name+value" => "oid"? 

2a) Is it better to point to oids or directly to serialized 
persistent-objects?  The nice thing about oids is that later I can 
implement join-like operations in the query language using oids without 
having to deserialize and cache persistent objects.  Persistent-objects 
are perhaps more convenient for direct use, however.
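
A rough Python sketch of option 2 with oid values.  The key layout is 
an assumption: a NUL separator and zero-padded non-negative integer 
values, so that string order matches numeric order.  CompositeIndex 
and make_key are hypothetical names, and a sorted list stands in for 
the underlying DB.

```python
import bisect

SEP = "\x00"  # assumed separator; must not occur in class or slot names


def make_key(class_name, slot_name, value):
    """Build 'class+slot+value'; padding keeps non-negative ints in order."""
    return f"{class_name}{SEP}{slot_name}{SEP}{value:010d}"


class CompositeIndex:
    """Sorted (key, oid) entries; duplicate keys are allowed."""

    def __init__(self):
        self._entries = []

    def put(self, class_name, slot_name, value, oid):
        bisect.insort(self._entries, (make_key(class_name, slot_name, value), oid))

    def _scan(self, prefix):
        # Walk forward from the first key >= prefix while the prefix matches.
        start = bisect.bisect_left(self._entries, (prefix,))
        oids = []
        for key, oid in self._entries[start:]:
            if not key.startswith(prefix):
                break
            oids.append(oid)
        return oids

    def by_slot(self, class_name, slot_name):
        """All oids for class+slot, in slot-value order."""
        return self._scan(f"{class_name}{SEP}{slot_name}{SEP}")

    def by_value(self, class_name, slot_name, value):
        """All oids for class+slot+value (the duplicate keys)."""
        return self._scan(make_key(class_name, slot_name, value))


idx = CompositeIndex()
idx.put("person", "age", 31, 2)
idx.put("person", "age", 25, 1)
idx.put("person", "age", 31, 3)
idx.put("person", "zip", 2139, 3)
idx.put("city", "pop", 31, 9)

# Join-like step on oids alone: no object is deserialized.
joined = set(idx.by_value("person", "age", 31)) & set(idx.by_slot("person", "zip"))
```

Storing oids keeps the last step a plain set intersection; storing 
serialized objects would force a deserialization per entry first.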

Comments would be greatly appreciated.  I especially invite debate if 
others feel this is the wrong level of abstraction to work at (i.e. 
instead write a new def macro for indexed classes and a related protocol 
that accomplishes the same result by reusing primary and secondary 
btrees).  The proposal above seems in good taste to me and I've already 
invested some quality time in it, but since I'll be touching a fair bit 
of the system to put this in, I want to make sure there is support.

Ian





