[cl-prevalence-devel] simple-array serialization patch

Fri Apr 6 17:24:23 UTC 2007

On 2007-04-06, at 03:34, Sven Van Caekenberghe wrote:

> Mike,
>
> On 05 Apr 2007, at 17:19, Michael J. Forster wrote:
>
>> I don't know if you or anyone else is interested, but I have implemented
>> xml and sexp serialization/deserialization of simple arrays -- I needed it
>> for an app that uses cl-prevalence.  I've attached the patch.
>
> The patch is OK in terms of code (I guess it is working fine in  
> your situation), but I am not sure that it is conceptually correct  
> (but maybe I am wrong).
>

No, you are correct, and, in my haste, I posted the patch without fully
describing my scenario or intentions.  My apologies.

> According to my reading of CLHS the type simple-array by itself
> does not guarantee a (what I would call) homogeneous array (an  
> array with the same type of element everywhere). The typespecs  
> '(simple-array *) and '(simple-array <element-type>) would refer to  
> this, but I don't know whether you can use them in method signatures.
>
> Even so, the array-element-type could very well be too general,
> like T or cons or array. In that case, your serialization code
> fails to take shared and circular references into account (you are
> effectively assuming more primitive, non-shared, non-circular
> element-types - which probably works in the way you are using
> CL-PREVALENCE).
>
> So, as I see and understand it now, your code would be OK, if we  
> further qualify it with a test that the array-element-type is  
> somewhat 'primitive'. But I am not sure how to express that in the  
> method signature or how to test/enforce it in code, maybe we need a  
> custom type predicate ?
>

Yes, method signatures, one of my bigger CL gripes, though I do appreciate
the reasons that the CLOS designers allowed dispatch on class rather than
type, including compound typespecs.  (It's like complaining that Feanor's
Silmarils didn't come in orange. ;-)

I think you nailed the issue in your second-to-last sentence above.  To my
thinking, non-vector arrays are concrete types, as opposed to the more
abstract vectors and lists and the even more abstract sequences.  One has
to qualify the non-vector array element type on a case-by-case basis, which
is perfectly acceptable -- and expected -- at the application level, but
not for reusable libraries.  Hence, the non-viability of my patch.
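
A minimal sketch of the kind of application-level qualification described
above, assuming a 2D array of rationals.  Every name in it
(rational-matrix-p, rational-matrix, storable-kind) is hypothetical and
not part of cl-prevalence.  It checks the elements themselves because
array-element-type may well report T for such an array, and it dispatches
on the class ARRAY because a compound typespec cannot appear in a method
signature:

```lisp
;; All names below are hypothetical; none of this is cl-prevalence code.

(defun rational-matrix-p (object)
  "True if OBJECT is a two-dimensional simple array holding only rationals."
  (and (typep object '(simple-array * (* *)))
       ;; ARRAY-ELEMENT-TYPE may report T even when every element is a
       ;; rational, so inspect the elements themselves.
       (dotimes (i (array-total-size object) t)
         (unless (rationalp (row-major-aref object i))
           (return nil)))))

(deftype rational-matrix ()
  "A type usable with TYPEP and CHECK-TYPE, but not as a method specializer."
  '(satisfies rational-matrix-p))

(defgeneric storable-kind (object)
  (:documentation "Classify OBJECT for a hypothetical serializer."))

;; ARRAY is a class, so CLOS can dispatch on it; the finer test has to
;; happen inside the method body.
(defmethod storable-kind ((object array))
  (if (typep object 'rational-matrix)
      :rational-matrix
      :unsupported-array))
```

The predicate walks every element, which is fine for an application that
knows its arrays are small, but it is one more reason such a test does not
belong in a general-purpose library method.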

Really, what I wanted to do was extend the cl-prevalence
serialization/deserialization for
my-application-specific-2D-array-of-rationals by writing methods in my
application sources.  However, while serialize-xml-internal and
serialize-sexp-internal are generic functions, the corresponding
deserialization functions are not.  So, with barely an hour to deliver a
feature, I hacked the ugly hack ;-)

Perhaps the deserialization functions could be reworked as GFs, allowing
complete application-specific extension?  I would be happy to help out if
you're interested.
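
To illustrate the idea only, here is a rough, self-contained sketch of
deserialization driven by a generic function.  It does not use or mirror
the actual cl-prevalence functions; to-sexp, from-sexp, read-back and the
tags :rational and :rational-array are all made-up names.  The point is
that, once reading dispatches on a tag through a GF, an application can
add both directions for its own types from its own sources:

```lisp
;; Hypothetical names throughout; this is not the cl-prevalence API.

(defgeneric to-sexp (object)
  (:documentation "Return a tagged list describing OBJECT."))

(defgeneric from-sexp (tag body)
  (:documentation "Rebuild an object from BODY, dispatching on TAG."))

(defun read-back (sexp)
  "Rebuild the object described by the tagged list SEXP."
  (from-sexp (first sexp) (rest sexp)))

;; What a library might provide for a primitive type.
(defmethod to-sexp ((object rational))
  (list :rational object))

(defmethod from-sexp ((tag (eql :rational)) body)
  (first body))

;; What an application could then add on its own, for its own arrays.
(defmethod to-sexp ((object array))
  (list :rational-array
        (array-dimensions object)
        (loop for i below (array-total-size object)
              collect (to-sexp (row-major-aref object i)))))

(defmethod from-sexp ((tag (eql :rational-array)) body)
  (destructuring-bind (dimensions elements) body
    (let ((result (make-array dimensions)))
      (loop for element in elements
            for i from 0
            do (setf (row-major-aref result i) (read-back element)))
      result)))
```

With these definitions, (read-back (to-sexp some-array)) should yield an
array that is EQUALP to the original, provided every element is a
rational.  Shared and circular references are ignored here, just as they
were in my patch.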

>> BTW, I would like to say that cl-prevalence is fantastic.  We've been
>> using it for five non-trivial (>25 classes, avg. 3000 instances per
>> class) webapps without a hitch for almost a year now.
>
> That is very nice to hear: could you give some more details, like:
>
> - what CL implementation you are using ?

We develop with LW 4.4 and 5.0 on Mac and Windows; we deploy to CMUCL
19b on FreeBSD and LW 5.0 on Mac.

> - what serialization you are using ?

We've tried both and would prefer to use the sexp format for its greater
readability.  However, we started with xml and haven't had an opportunity
to change it.

> - machine details ?

Dell 2U
Intel P4 3.2GHz
4GB RAM
160GB usable disk, RAID 1

Apple Xserve
G5 Dual 2.3GHz
2GB RAM
140GB usable disk, RAID 5

> - the typical sizes of your transaction and snapshot files ?
> - total number of objects under prevalence, 75000 ?
> - rate of change (transaction log growth per day or so) ?
> - size of the image ?

I will collect some stats over the next few weeks and post them.

> - do you have any GC problems ?

None that we've detected, though, without any outward signs of memory
exhaustion, dying processes, or poor overall application performance, we
haven't gone looking for trouble.  I will start recording the GC stats as
well.

> - anything else you want to share

Probably, yes, though I need to find some time to organize my thoughts.

Suffice it to say, we've built a substantial database management layer atop
cl-prevalence, and, often, when I try to explain it to customers or
business partners, most can't understand why we didn't just use SQL, some
object-relational mapping package, and so forth.

It's hard to explain, given my rather unique experience in the database
application market.  My first employer and mentor, Dave Voorhis, is the
author of one of only a handful of true relational database management
systems:

	http://dbappbuilder.sourceforge.net/Rel.html

If I can't convince someone that a SQL DBMS is not an RDBMS, then I can't
begin to explain why we don't use SQL, why we went to the trouble of
building our own DBMS, and why we can, legitimately, call it an RDBMS in
spite of the word "prevalence" and the associated flame-fest.

Anyway, sorry, the rant wasn't meant for you. :-)  Simply covering my
corporate butt in case a customer or competitor ever reads this and
attempts to misrepresent our position.  In the end, cl-prevalence is a real
boon to our work.  If you have a PayPal button for the project, I would
happily click it!

Regards,

Mike

--
Michael J. Forster <mike at sharedlogic.ca>