[cl-prevalence-devel] cl-prevalence speed issues

Wed Apr 4 11:15:34 UTC 2007

Gabor,

On 04 Apr 2007, at 12:40, Gabor Vitez wrote:

> Hi,
>
> I just started to toy around with cl-prevalence; however I found  
> strange speed issues:
>
> loading a database from transaction log is way faster than loading  
> it from snapshot.
>
> I modified one of the test scripts from the cl-prevalence  
> distribution:
>
> (require 'asdf)
> (require 'cl-prevalence)
> (in-package :cl-prevalence)
> (defclass numbers ()
>   ((numbers-list :accessor get-numbers-list :initform nil))
>     (:documentation "Object to hold our list of numbers"))
> (defun tx-create-numbers-root (system)
>   "Transaction function to create a numbers instance as a root object"
>   (setf (get-root-object system :numbers) (make-instance 'numbers)))
> (defun tx-add-number (system number)
>   "Transaction function to add a number to the numbers list"
>   (let ((numbers (get-root-object system :numbers)))
>     (push number (get-numbers-list numbers))))
> (defparameter *system-location* (pathname "/tmp/demo1-prevalence-system/")
>   "Filesystem location of the prevalence system")
> (defvar *system* (time (make-prevalence-system *system-location*)))
> (execute *system* (make-transaction 'tx-create-numbers-root))
> (time (dotimes (i 100000) (execute *system* (make-transaction 'tx-add-number i))))
> ;(time (snapshot *system*))
> (close-open-streams *system*)
>
>
> I use this script to create a database; later to load it; then to  
> snapshot it and load it again (uncommenting the appropriate parts  
> between the runs).
>
> Creating and snapshotting is fast; however loading from snapshot is  
> slow.
> Times:
> creating: 19.89 seconds
> loading from transaction log: 53.942 seconds   < this is good
> snapshotting: 5.297 seconds
> loading from snapshot: 182.713 seconds          < this is strange
> snapshotting again: 1.165 seconds
>
> Any ideas what this strangeness can be?
>
>
>     Gabor

I haven't run your code or experimented with it, but from looking at
it, one possible explanation might be the following:

When serializing Lisp data structures, using either the XML or the
S-EXPRESSION format, the serializer must constantly watch out for
shared and circular data structures. This is done using a hashtable
holding all Lisp objects seen during a serialization session. Reading
a small serialized transaction is much less work than reading a 100K
list. Lists are serialized and deserialized as individual cons
cells, which is costly. While doing this serialization or
deserialization, a hashtable of the same size is built and each
element is checked against it. This might in effect be slower to do
all at once than applying 100K transactions.
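
To give a feeling for the bookkeeping involved, here is a rough sketch
in plain Common Lisp. It is not the actual cl-prevalence serializer
code; the function name count-tracked-conses is made up for this
illustration. It only shows that walking a list cons by cons while
tracking every cell in an EQ hashtable ends up with one table entry
per element:

```lisp
;; Illustrative only: NOT the cl-prevalence serializer, just the general idea.
(defun count-tracked-conses (list)
  "Walk LIST cons by cons, recording each cell in an EQ hashtable the way a
sharing-aware serializer might. Return the number of cells tracked."
  (let ((seen (make-hash-table :test 'eq)))
    (loop for cell on list
          ;; Stop when we meet a cell again (i.e. the list is circular).
          until (gethash cell seen)
          do (setf (gethash cell seen) t))
    (hash-table-count seen)))

;; One hashtable entry per cons cell of the 100K list.
(count-tracked-conses (loop for i below 100000 collect i))
```

With many small transactions each session only ever tracks a handful
of objects; with one big snapshot the table grows with the list.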

A possibility to speed this up might be to use a properly sized  
sequence instead of a list. Be sure to look at the resulting  
serialization text file itself too.
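
As an untested sketch of what that could look like, assuming the
numbers class from your script: the slot name numbers-vector, the
accessor get-numbers-vector and the helper add-number are names I made
up here, and the initial size of 100000 is just taken from your loop.
I have not checked how the serializer treats adjustable vectors with a
fill pointer, so please verify the round trip before relying on it.

```lisp
(defclass numbers ()
  ((numbers-vector
    :accessor get-numbers-vector
    ;; Adjustable vector with a fill pointer, pre-sized for the expected load.
    ;; The size 100000 is an assumption taken from the test loop.
    :initform (make-array 100000 :adjustable t :fill-pointer 0)))
  (:documentation "Object to hold our numbers in a vector instead of a list"))

;; Hypothetical helper: the body of tx-add-number would call this
;; instead of doing a PUSH onto the list.
(defun add-number (numbers number)
  "Append NUMBER to the vector held by NUMBERS, growing the vector if needed."
  (vector-push-extend number (get-numbers-vector numbers))
  number)
```

Note that this appends at the end, whereas PUSH adds at the front, so
the order of the elements is reversed compared to your list version.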

HTH,

Sven