Copyright ©2008 T. H. Merrett COMP 617 Information Systems Winter 2008 Week 10 Polymorphic Relations, Display2D and Transactions 1. The notes on display2D from last term say that the current implementation cannot do certain updates, because they require new attributes to be added to the relation or because they require the relation to be restructured, say from flat to nested. For example, in Triangle(x1,y1,x2,y2,x3,y3,lc,fc,fp,label) we had two triangles, one with a blue boundary and yellow fill in a vertical pattern, the other with a yellow boundary and blue fill in a vertical pattern. Triangle ( x1 y1 x2 y2 x3 y3 lc fc fp label) 5000 4000 2000 4000 5000 3000 1 6 50 "Tri1" 3000 1000 5000 1000 5000 2500 6 1 50 "Tri2" One update display2D disallows is changing the thickness of one of the boundaries, say for Tri2. Since the default line thickness, 1, is used for both boundaries, the relation to be updated has no attribute (from the .vocabulary) meaning line thickness. To allow this update requires _polymorphic_ relations: relations that can have a varying set of attributes. Then in particular a new attribute can be added for some tuples only. We can suppose that display2D adds the attribute given in the default .vocabulary with meaning line_thickness: lt. After the change by display2D , the value of the display2D expression becomes( x1 y1 x2 y2 x3 y3 lc fc fp lt label) 5000 4000 2000 4000 5000 3000 1 6 50 "Tri1" 3000 1000 5000 1000 5000 2500 6 1 50 2 "Tri2" 2. Let's explore polymorphism in this sense. The blank value for lt in Tri1 just means that attribute lt is not defined for that tuple. This is the same as saying it is the DC null value. And that is the same as saying that the new relation is the ujoin of two components = T1 ujoin T2 where T1( x1 y1 x2 y2 x3 y3 lc fc fp label) 5000 4000 2000 4000 5000 3000 1 6 50 "Tri1" and T2( x1 y1 x2 y2 x3 y3 lc fc fp lt label) 3000 1000 5000 1000 5000 2500 6 1 50 2 "Tri2" Note that the DC null value is the same as the attribute being not there at all. The heart of polymorphism for relations is the converse: every relation has all possible attributes, but the attributes we don't normally write is are taken to have DC null values for all tuples. But we did not create by a ujoin but rather by an update. So update needs to behave consistently. We can define update R add S to mean exactly R <- R ujoin S (and R <+ S likewise). This was in fact the original definition, but we find that the present implementation of jRelix is more limited. Here's what we should have. R(X Y) S(X Z) T(U) R ujoin S R <+ S update R add S a 1 a q k (X Y Z) (X Y Z) (X Y Z) b 1 c q a 1 q a 1 q a 1 q b 1 b 1 b 1 c q c q c q Furthermore, if the attributes are completely disjoint, we'll have R ujoin T R <+ T update R add T (X Y U) (X Y U) (X Y U) a 1 a 1 a 1 b 1 b 1 b 1 k k k This means re-implementing ujoin , which currently gives the Cartesian product in an example such as R ujoin T. It also means cleaning up the implementation of update .. add and <+. 3. Since polymorphic relations can add or lose attributes at any time, we should have updates to do this. update insert update remove update insert remove update remove insert Inserted attributes must be virtual, already defined by domain algebra. In the combined alternatives, we can say that the execution order reads from left to right. So we can mimic the effect of our last update by display2D . T1 <- where label="Tri1" in Triangle; T2 <- where label="Tri2" in Triangle; let lt be 2; update T2 insert lt; <- T1 ujoin T2; (This does not add anything new for top-level relations. We could equally write T1 <- where label="Tri1" in Triangle; let lt be 2; T2 <- [x1,y1,x2,y2,x3,y3,lc,fc,fp.lt,label] where label="Tri2" in Triangle; <- T1 ujoin T2; But we'll see that it is convenient for updating nested relations.) 4. Now let's suppose display2D tries to add an extra vertex to Tri1. The new data cannot be accommodated in Triangle(x1,y1,x2,y2,x3,y3,lc,fc,fp,lt,label) at all, because one figure now has an extra pair of coordinates. We cannot just add, say x4 and y4, because we made a convention that this would connect every pair of coordinates with every other, making a flattened tetrahedron, not the quadrilateral we built wth the xfig interface to display2D . [See Note 6 for a retraction of this. The update discussed in Note 4 might now be done by just adding attributes (x4, y4) as in Note 1.] So we must convert the affected triangle to a sequenced polygon. This means converting the flat relation, Triangle, to a nested relation, say Triangle (Triangle1 ) (label Triangle11 lc fc fp lt) (sq x1 y1 ) "Tri1" 1 5000 4000 1 6 50 2 2000 4000 3 3102 3493 4 5000 3000 "Tri2" 1 3000 1000 6 1 50 2 2 5000 1000 3 5000 2500 If display2D is intended to be able to do this, we should be able to show the way in Aldat. let sq be 1; let Triangle1a be relation(sq,x1,y1); let x1 be x2; let y1 be y2; let sq be 2; let Triangle1b be Triangle1a ujoin [sq,x1,y1] in relation(sq,x2,y2); let x1 be x3; let y1 be y3; let sq be 3; let Triangle11 be Triangle1b ujoin [sq,x1,y1] in relation(sq,x3,y3); let Triangle1 be red ujoin of relation(label,Triangle11,lc,fc,fp,lt); Triangle <- [Triangle1] in Triangle; We didn't need update..insert or update..remove for this. (We could have used them to change Triangle, but the top-level assignment was easier.) We can suppose display2D does this for the whole relation as soon as it detects the need to reconfigure. The update is now easy. relation newVert(sq,x1,y1) <- {(2.5,3102,3493)}; update Triangle/Triangle1/Triangle11 add newVert using where Triangle/Triangle1/label = "Tri1"; (and we must follow up by redefining sq as fun + of 1 order sq). 5. Now suppose we have used display2D to change the colour of _one_ of the edges in Tri1 from blue to yellow. The relation must be reconfigured again, this time as a set of edges for each triangle, so that the colour applies unambiguously to each edge. Triangle (Triangle1 ) (label Triangle12 fc fp lt) ( x1 y1 x2 y2 lc) "Tri1" 5000 4000 2000 4000 6 6 50 2000 4000 3102 3493 1 3102 3493 5000 3000 1 5000 3000 5000 4000 1 "Tri2" 3000 1000 5000 1000 6 1 50 2 5000 1000 5000 2500 6 5000 2500 3000 1000 6 let x2 be fun succ of x1 order sq; let y2 be fun succ of y1 order sq; let Triangle12 be [x1,y1,x2,y2,lc] in Triangle11; update Triangle/Triangle1 insert Triangle12 remove Triangle11, lc; Note that insert must be done first, or Triangle11 won't be there to construct Triangle12 from. Now we mimic the update by display2D. relation Tri1x1y1(label,x1,y1) <- {("Tri1",5000,4000)}; update Triangle/Triangle1/Triangle12 change lc <- 6 using Tri1x1y1; 6. I have changed my mind about the convention stated at the beginning of Note 4. Display2D and display3D should both allow any number of coordinate-set attributes, with the convention that edges connect the vertices in the order given. Thus a quadrilateral could be represented as Quad(x1 y1 x2 y2 x3 y3 x4 y4) 0 0 2 0 2 2 1 2 or as Quad(sq x y) 0 0 0 1 2 0 2 2 2 3 1 2 the latter being more general, of course. If we want to connect every vertex to every other, we must supply a set (not a sequence) of coordinates in a nested relation. Simplex(x y z) 0 0 0 0 1 1 1 0 1 1 1 0 will display as the four vertices of a tetrahedron, but Simplex or Simplex (label simp ) (lc simp ) "s1" (x y z) 1 (x y z) 0 0 0 0 0 0 0 1 1 0 1 1 1 0 1 1 0 1 1 1 0 1 1 0 will display the six edges, too: in the first case the label will be associated with the tetrahedron; in the second the edges will be blue. Any polyline (sequence of vertices connected by edges), whether specified by sets of coordinate attributes or by a sequence of tuples, is displayed open by default. If data pertaining to faces (such as fc fill_colour or fp fill_pattern, or, suitably placed, a strg label) is present, the polyline is closed by the display. (Of course, if the polyline specification ends with the starting vertex, then it will appear closed even in default.) Thus Triangle(x1,y1,x2,y2,x3,y3,lc) 0 0 1 0 1 1 1 will show two blue edges connecting the three vertices, but Triangle(x1,y1,x2,y2,x3,y3,lc,fc,fp,label) 0 0 1 0 1 1 1 6 50 "Tri" will show the triangle filled with yellow vertical stripes and all three (blue) edges of its boundary. (We need only one of fc, fp or label to display all three edges.) (This last is exactly the present behaviour of display2D.) Similarly, Triangle(sq x y) 0 0 0 1 1 0 2 1 1 will show the two edges of an open triangle, while Triangle (fc Tri ) (sq x y) 1 0 0 0 1 1 0 2 1 1 will show a blue closed triangle with all three edges (in the default colour of black). Of course, Triangle(x1,y1,x2,y2,x3,y3,x4,y4) 0 0 1 0 1 1 0 0 or Triangle(sq x y) 0 0 0 1 1 0 2 1 1 4 0 0 will both show closed triangles, because the final vertex is the same as the first. 7. In summary, this requires the following changes to the relix implementation. 1) Change ujoin so, given operands with disjoint attributes, it gives a union instead of a Cartesian product. 2) Change <+ and update..add to be exactly the same as L <- L ujoin R in L <+ R, etc. This requires a polymorphic representation of relations, best as a virtual union of the separate pieces. 3) Implement update..insert and update..remove This must use the above polymorphic representation of relations. 4) Augment display2D to support all xfig updates in this framework. Related to this are the changes to update required by the undo and redo operators (week6p1). 5) Add the builtin symdiff() computation for event handlers. (But leave .trigger alone.) 6) Implement undo and redo as deferred event handlers: they are defined automatically by the update, and invoked explicitly by the programmer. (This is the opposite of regular event handlers, which are defined by the programmer and invoked automatically by the update.) Their names are constructed by the rules undo|redo : add|delete : () undo|redo : change : [ ]() just as for regular event handlers. 7) Implement the boolean computations undone and redone similarly (Note 4 of week6p1. 8) Note that undo, redo, undone and redone need state variables such as tape, undid, redid and parent. These were created by the containing computation, woops, in Note 2 of week6p1, but the system can automatically produce the result of calling woops(). Undo, redo, etc. are useful in their own right, but they arose while investigating transactions (week3p1, week6p1). We cannot implement transactions in jRelix because it is a main-memory implementation of Aldat, but we can add the syntax for transactions. Here it is. 1) Anywhere a statement is allowed, we should now accept that it can begin by the keyword transaction or synonyms such as transactiad or iad (standing for isolated-atomic-durable). Most frequently this will be used with bracketed statements, and the syntax will look like iad{ ... }; or transactiad{ ... } 2) At the end of any transaction we can accept a compensating transaction, specified as compensated by or synonyms such as sateby . iad{ ... }sateby{ ... }; Of course, it is up to the programmer to ensure that the compensating transaction completely undoes the transaction. Hey can often call undo. This supports sagas (pp.120--1 of [BernNewc:TP], Note 6 of week3p1): the implementation would invoke the compensating transactions when it finds that the saga must be aborted. 3) The system must know if a transaction is part of a saga, in order to be able to run the compensating transactions in case of abort, and it must know which saga it is in. So we allow a parameterized variant of iad and its synonyms, passing in the name of the saga as a string. transaction("sag1"){ ... }sateby{ ... }; Note that compensating transactions are obligatory for any transaction which is part of a saga. 4) It seems to me that syntax such as commit and abort is completely replaced by the undo and redo operations and their tests (undone, ..), but many more examples must be found and worked. 5) I have not decided if the syntax should explicitly describe mechanisms such as enqueuing and dequeuing multi-host transactions against communications crashes (pp.118--9 of [BernNewc:TP]), "conversations" or logging (both on p.122 of [BernNewc:TP]). I suspect these are best left to the implementation as part of the mechanism to guarantee i.a.d. (or a.d. if i. is imposible, as in sagas). [BernNewc:TP] P. A. Bernstein and E. Newcomer, "Principles of Transaction Processing", San Francisco, Morgan Kaufmann Publishers, Inc., 1997