[elephant-devel] Querying for objects on two slots

Ian Eslick eslick at media.mit.edu
Wed Jan 14 01:57:00 UTC 2009


I'd like to add a mechanism for sorting on small tuples, but it would  
be different for each backend.  I think I know how to do it for BDB,  
but it will take some work. CLSQL could do it pretty easily (however  
see my response to Leslie's recent mail).

Alex, what would the issues be for the postmodern backend?  Could we  
use a behind the scenes convert-to-string strategy so we had a common  
API for indexing on tuples?
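
To make the convert-to-string idea concrete, here is a minimal sketch in
plain Common Lisp. TUPLE-KEY is a hypothetical helper, not part of
Elephant or any backend. It assumes the first slot holds a non-negative
integer below 10^10 and the second holds a string, and it zero-pads the
integer so that string order matches numeric order.

```lisp
;; Hypothetical helper, not an Elephant function: encode a two-slot
;; tuple as a single string that an ordinary string index could store.
(defun tuple-key (n name)
  "Return a string key for the tuple (N NAME).
N is zero-padded to 10 digits so keys sort numerically on N first."
  (check-type n (integer 0))   ; assumption: non-negative integers only
  (check-type name string)
  (format nil "~10,'0d_~a" n name))

;; (tuple-key 42 "foo") => "0000000042_foo"
```

Without the padding, "9_x" would sort after "10_a", so range queries on
the combined key would return results in the wrong order.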

Unless we have a relatively straightforward solution for all
backends, this will have to be a post-1.0 feature.  I'll have to look into
the idea I had for BDB again before committing either way.

Ian

PS - I just tagged ELEPHANT-0-1-A2 after fixing a few bugs and adding  
some additional small features.  Thanks to Kevin Raison for helping  
track down a bug I introduced...

On Jan 13, 2009, at 8:25 PM, Yarek Kowalik wrote:

> When serializing tuples, is the string representation best? You
> suggest using (format nil "~A ~A" a b); is that efficient enough?
> What about (cons a b)? Is there a way to index and search for
> conses? Any other ideas?
>
> Yarek
>
> On Tue, Jan 6, 2009 at 5:17 AM, Alex Mizrahi  
> <killerstorm at newmail.ru> wrote:
> YK> Is this a reasonable way of finding an object of type
> YK> 'my-class that matches on values val-a and val-b for slots a and b?
>
> yep, it is reasonable if a relatively low number of objects is
> returned by the (get-instances-by-value 'my-class 'slot-b val-b) query.
> if the number of objects is significant and you get a slowdown because
> of this, you might want to optimize. a trivial thing is to try it
> symmetrically with slot-a -- whichever query returns fewer objects is
> better. less trivial optimizations would be to work at a lower level --
> via map-inverted-index (to avoid allocating the whole list and instead
> test objects one by one) or even the cursor API (this way you can
> retrieve oids rather than objects, which should be faster, and
> iterating both indices at once might be a significant benefit if
> values are not uniformly distributed).
>
> but the best way to do this, when both the slot-a and slot-b queries
> return a high number of objects, would be to build and use a
> multi-column index. unfortunately, Elephant does not help you with
> it -- either you'll have to serialize slot tuples into a string
> (e.g. (format nil "~a_~a" slot-a slot-b)), or reorganize your data to
> use a custom index structure (like a btree of btrees).
>
> there are also backend-specific considerations. for postmodern
>
>  (intersection (get-instances-by-value 'my-class 'slot-b val-b)
>                (get-instances-by-value 'my-class 'slot-a val-a))
>
> would be much faster than testing objects one by one; for BDB -- i
> doubt it.
>
> LP> Use MAP-CLASS, this will considerably speed up the query.
>
> how is that? using at least one index is much better than using no
> index at all.
>
>
>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel




