[elephant-devel] db gc

Ben ben at medianstrip.net
Thu Mar 3 20:10:49 UTC 2005


Sorry for not responding to this earlier; I have been occupied.

Your code looks good.  Thanks for working on this.  Some points:

1) does it work with Andrew's new MOP stuff?

2) the implementation you are working on is an offline implementation.
the technical issues i had mostly had to do with online
implementations.  i think an online implementation is probably too
hard to be worth it at this point.

3) since it is offline, you can probably open sleepycat up with some
flags which will make this go fast.

4) "Can references to a persistent object have different class names
(maybe due to a change-class)?"

it does appear this is the case.  i think Andrew is the expert here --
Andrew?

the original collector i had in mind didn't know about the classes of
persistent objects.  it just kept track of OIDs blindly.  the
implementation was to be a little dirtier but easier.  it was then
guaranteed not to collect the "unreferenced slots" which come from
change-class.  of course that also means it couldn't reclaim discarded
slots.
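
to make the blind approach concrete, here is a minimal sketch over a toy
in-memory store, not the real thing.  it assumes each OID maps to the list
of OIDs it references (in the real store you would find those by walking
the serialized records), and it takes the root to be OID -1 as in your
code.  mark-reachable, unreachable-oids and make-toy-store are made-up
names for illustration.

```lisp
;; Toy model (an assumption, not Elephant's storage format): the store is
;; a hash table mapping each OID to the list of OIDs it references.
(defun mark-reachable (references root-oid)
  "Return a hash table whose keys are all OIDs reachable from ROOT-OID."
  (let ((marked (make-hash-table))
        (pending (list root-oid)))
    (loop while pending
          do (let ((oid (pop pending)))
               ;; Classes are never consulted: only OIDs are tracked.
               (unless (gethash oid marked)
                 (setf (gethash oid marked) t)
                 (dolist (child (gethash oid references))
                   (push child pending)))))
    marked))

(defun unreachable-oids (references root-oid)
  "Return the sorted list of OIDs in REFERENCES not reachable from ROOT-OID."
  (let ((marked (mark-reachable references root-oid))
        (garbage '()))
    (maphash (lambda (oid children)
               (declare (ignore children))
               (unless (gethash oid marked)
                 (push oid garbage)))
             references)
    (sort garbage #'<)))

(defun make-toy-store ()
  "Build a small example store: root -1, a live cycle 1<->2, a dead cycle 3<->4."
  (let ((references (make-hash-table)))
    (setf (gethash -1 references) '(1 2)
          (gethash 1 references) '(2)
          (gethash 2 references) '(1)
          (gethash 3 references) '(4)
          (gethash 4 references) '(3))
    references))
```

calling (unreachable-oids (make-toy-store) -1) gives the two OIDs of the
dead cycle.  the sweep would then delete every record whose key starts
with one of those OIDs.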

it appears that maybe i'm storing objects incorrectly.  perhaps the
right way to do this is to store objects as OIDs without classes, and
then have a separate OID -> class table.  that way change-class can
work correctly.  it depends on whether you think change-class should
update the DB or not, though.  (mental note to self: if one implements
this, one should make sure the instance cache code does the right thing,
e.g. checks the class before handing back a cached instance!)
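
a rough sketch of what i mean, using plain in-memory hash tables as
stand-ins for the persistent OID -> class table and for the instance
cache.  everything named here (*oid-classes*, *instance-cache*,
record-class, lookup-instance, the toy classes) is hypothetical and only
for illustration; none of it is existing Elephant code.

```lisp
(defvar *oid-classes* (make-hash-table)
  "Hypothetical stand-in for a persistent OID -> class-name table.")

(defvar *instance-cache* (make-hash-table)
  "Hypothetical stand-in for the in-memory instance cache.")

(defclass toy-persistent ()
  ((oid :initarg :oid :reader oid)))

(defclass account (toy-persistent) ())
(defclass closed-account (toy-persistent) ())

(defun record-class (oid name)
  "Record that the object with OID is now of the class called NAME."
  (setf (gethash oid *oid-classes*) name))

(defun lookup-instance (oid)
  "Return the instance for OID, rebuilding it if the cached one is stale."
  (let ((name (gethash oid *oid-classes*))
        (cached (gethash oid *instance-cache*)))
    (unless name
      (error "Unknown OID ~S" oid))
    ;; Check the class before handing back a cached instance.
    (if (and cached (eq (class-name (class-of cached)) name))
        cached
        (setf (gethash oid *instance-cache*)
              (make-instance name :oid oid)))))
```

with this, (record-class 7 'account) followed by (lookup-instance 7)
hands back the same account instance on every call, until something does
(record-class 7 'closed-account), after which the stale cached instance
is replaced by a closed-account.  references then only need to carry the
OID, so they can never disagree about the class.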

in some ways the change-class / update-class-for-x semantics are still
a little fuzzy.  maybe Andrew can enlighten us here!

take care, B

On Sat, 19 Feb 2005, Gábor Melis wrote:

> On Thursday 18 November 2004 20:02, Ben wrote:
>> Writing the GC is long overdue.  There are some technical issues with
>> this which have yet to be solved, actually.  In the first pass, it
>> will probably require taking the store off-line and running a separate
>> gc program on it -- that shouldn't be too hard.  it may be possible to
>> write an online collector but as of yet i don't know how to do it.  i
>> expect the gc will come with the next release (in a month or so --
>> after i'm done teaching this quarter!)
>
> I started hacking on the gc. It works by replacing deserialize with a similar
> function that records the oids instead of calling get-cached-instance and
> reading all slots, key-value pairs in persistent classes and btrees.
>
> (defmethod walk-persistent ((btree btree))
>  (map-btree (lambda (key value) (declare (ignore key value)))
>             btree
>             :degree-2 t))
>
> (defmethod walk-persistent ((obj persistent))
>  (let ((class (class-of obj))
>        (persistent-effective-slot-definition-class
>         (find-class 'persistent-effective-slot-definition)))
>    (loop for slot-definition in (class-slots class)
>       when (eq (class-of slot-definition)
>                persistent-effective-slot-definition-class)
>       do (slot-value-using-class class obj slot-definition))))
>
> (defun elephant-gc (&optional (sc *store-controller*))
>  (let ((old-oids (make-hash-table))
>        (new-oids (make-hash-table)))
>    (flet ((marker (controller oid class)
>             (declare (ignore controller))
>             (unless (gethash oid old-oids)
>               (setf (gethash oid new-oids) class))))
>      (with-marking-deserialize (#'marker)
>        ;; mark the root
>        (setf (gethash -1 old-oids)
>              (class-name (class-of (controller-root sc))))
>        (walk-persistent (controller-root sc))
>        ;;
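>        ;; MARKER writes into whatever table NEW-OIDS currently names, so
>        ;; walk a detached batch: the table being mapped is never modified
>        ;; and oids found during the walk are kept for the next round
>        (loop while (< 0 (hash-table-count new-oids))
>           do (let ((batch new-oids))
>                (setf new-oids (make-hash-table))
>                (maphash (lambda (oid class)
>                           (setf (gethash oid old-oids) class)
>                           (walk-persistent (make-instance class :from-oid oid)))
>                         batch)))))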
>    ;; now OLD-OIDS contains the oids of all reachable objects
>    (maphash (lambda (oid class)
>               (format t "~S ~S~%" oid class))
>             old-oids)
>    ))
>
> It seems to detect live objects OK. The next step is to iterate through
> controller-db and controller-btrees and delete records that have keys
> starting with a non-alive oid, right? Controller-indices and
> controller-indices-assoc can be left alone, I hope.
>
> What are those technical issues you mentioned above?
>
> Can references to a persistent object have different class names (maybe due to
> a change-class)?
>
> G
>

