[cells-devel] Meta-level oof relations ... How?

Kenny Tilton ktilton at nyc.rr.com
Tue May 24 05:03:39 UTC 2005


> I think it is worth here to distinguish between two separate domains  
> where the terms CLASS, INSTANCE, RELATION, DEPENDENCY each have  
> different meanings and side effects.
>
> In the Application domain (my app being based on Cells) has Classes  
> that get defined while adapting the app to a user's needs.  A typical  
> class would be "BOM" (said bill of material). The other one to be use  
> in the use case outlined below is class "PART".
>
> So we have:
>
> + Application Class BOM
> + Application Class PART
> + Class Relation IS-USED-IN: PART -> BOM (read: a PART can be related  
> to a BOM with the relation IS-USED-IN). This is a relation defined on  
> CLASS (!) level.
>
> + Instance "As Built BOM, Baseline 001" of application class BOM
> + Instance "High Pressure Turbine, Serial Nr 0001" of class PART
>
> In order to make the use case more verbose we define some attributes  
> on the classes:
>
> + Attribute PRICE on class BOM
> + Attribute NR-PARTS on class BOM
>
> + Attribute PRICE on class PART
>
> The relation is a "general" one: if any attribute of an instance of  
> class PART changes then recalculate all attributes of the  
> corresponding instance of class BOM.
>
> Several things have to be noted here:
>
> Adding a part to a Bom is done by the following steps (not using  
> Cells but the - say - traditional approach):
>
> 0. BOM instance is already there
> 1. Create an instance of class PART
> 2. Create an instance of relation IS-USED-IN
> 3. Set the LHS (left hand side) of the relation to the object id of  
> the new part
> 4. Set the RHS of the relation to the object id of the BOM instance
>
> Now, the update mechanism in this traditional approach always crawls  
> all relation instances using the id of the changed object to find all  
> other objects being related to that instance id and sends an update  
> event to all these instances.
>
> This should be avoided by limiting the nr of events being fired. Not  
> every related object has to be updated when a part is added. The same  
> is true when the Price attribute of a part is changed: the BOM need  
> not recalculate the nr of parts in it.
>
> The intended use case is:
>
> Define a relation class that says: update attribute PRICE on  
> instances of class BOM if the attribute PRICE of instances of class  
> PART changes - respecting the fact that a particular instance of  
> class PART is always related to particular instances of class BOM (so  
> update only those affected).
>
> A part can be related to more than one BOM. A BOM can have more than  
> one part (well, a simple n:m relation).
>
> So, I want to define a class-level dependency between classes BOM and  
> PART by defining the slots PRICE in BOM and NR-PARTS, PRICE in PART  
> as cells. 

As I said, just use an :initform or :default-initarg:

(defclass BOM (priceable)
   ()
   (:default-initargs
       :price (c? (apply '+ (mapcar 'price (^parts))))))

BTW, if this gets into a long-enough list (a) at 64 you will hit a silly 
Cells limitation which i will look at eliminating shortly but (b) i 
think we are into new territory which I first noticed on the Dow-Jones 
Index use case, viz, one slot quite reasonably depending on a kazillion 
other slots. DJI got solved by ducking the problem -- it turned out each 
ticker was being depended on twice (for two different values) and a 
carefully placed without-c-dependency got the dependency count under 64 
without sacrificing true dependency (the others were semantically 
redundant) -- but obviously this bad boy of an issue has to be dealt with.

How long are the parts lists? Even after I relax the 64-link limitation 
I observed a performance hit from cells internals searching down that 
long list (gets hit a lot, relatively speaking). I am thinking a clever 
divide-conquer scheme will be necessary, in which cells internals 
generate trees of synapses with a maximum of 16 or whatever dependencies 
at each node. We will see.


>
>
> Using whatever mechanism out of the Cells package I defined the  
> relation between these Cell slots. As there will be millions of  
> instances of PART and (a few less) instances of BOMs I want to have  
> to define the dependency just once while still being able to  
> overwrite the firing rules and the dependency rules between any two  
> instances of classes BOM and PART. 

A Cell data structure has, inter alia, slots for used cells and cells 
which use it. These must be instance specific, since in the end we 
really do have a normal case of instances depending on other instances. 
The slots for the rule will all contain the same rule instance (unless 
it closes over a lexical variable). So you could save one slot in each 
Cell -- if defstruct supported class-allocated slots, but I do not think 
they do. (You would think I would know for sure, but Lisp still 
surprises me from time to time.)

Are all these parts loaded into RAM at once? With that kind of demand on 
memory i do not think Cells will stand out as a burden, though i could 
be wrong. Have you tried the normal approach and found a problem? I am 
not completely against early optimization and do it often, but this 
would be a case where I would let the normal approach fail before 
working on optimization.

Conceivably we can carve out a lighter-weight Cell, but I am not sure we 
can save more than a few slots. I think structures are implemented as 
arrays, so we are just making smaller arrays. I am concerned that that 
will not help much if a bottleneck is discovered. I would then lean 
towards your idea of a single Cell capable of managing the dependencies 
of all class instances. That would save almost all the slots at the cost 
of adding a hashtable lookup on an instance before getting to its list 
of used or user cells, and of course save all those make-cell calls 
themselves.

Hey, if you can set up some dummy classes and bog it down, I will see 
what I can do about creating class-allocated cells which can still be 
overridden if necessary.

>>
>> You lost me. Do you mean it gets changed in the sense that some  
>> other part is now being used, or in the sense that some attribute  of 
>> the part (say, the price of the turbine) changes?
>
>
> Both:
>
> If a new part is added or a part is deleted from the BOM or the price  
> attribute is changed. 

You just need slots Part-BOMs and BOM-Parts. Any rule such as:

 (defclass BOM ()
     ()
     (:default-initargs :price (c?  (apply '+ (mapcar 'price (^parts))))))

Will establish dependencies on (a) the parts slot and (b) the price of 
each part. Note by the way that this means you cannot use destructive 
operations to change a parts list, unless you get sneaky and create a 
destructive function which guarantees the first cons cell is different, 
perhaps by re-consing the first element back on (assuming it is not 
being deleted).

>
>
>>> the synapse between the two classes triggers  the /right/ instance  
>>> of class BOM, here the instance "As Built BOM ,  Baseline 001".
>>
>>
>> "Triggers"? Do you mean, the total on the BOM would have to be  
>> changed to reflect the changed turbine price?
>
>
> Yes.
>
>> Anyway, it does not sound as if the two classes require dataflow,  it 
>> sounds as if each instance of each class will require that  dataflow 
>> with some other instance.
>
>
> Agreed.
>
>> As much as I talk about instance-oriented programming, hey, if you  
>> author a cell in a rule supplied in an initform or default-initarg,  
>> guess what? You get class-oriented behavior. :) Just do not  override 
>> those rules at make-instance time. :)
>
>
> Aha! 

Well, let's see if you hit a problem with that approach and then look at 
optimizations. Might be a fun task. In a sense, the class-allocated Cell 
is precisely analogous (he said after several seconds of careful 
analysis) to the RDBMS scheme of setting up an intermediate many-many 
relation, with the extra feature of automating dataflow from parts to BOM.

kt








More information about the cells-devel mailing list