An overview of the Aldat Project at McGill 97/10
T. H. Merrett
McGill University
Abstract
Databases deal primarily with data on secondary storage, which differs from
the data types of conventional programming languages by being persistent,
shared, and often too large for RAM. Database languages have mainly been
query languages. Some programming languages have been made persistent, usually
giving rise to object-oriented databases. The relational model has been used
as a basis for logic programming. How far can the unification of databases
and programming languages be taken, especially in view of the need for data
units which may exceed RAM capacity?
We present some results of exploring this question. It leads to further questions which we can answer.
P.L. | D.B. | |
---|---|---|
1975 | iterator | relation |
1984 | recursion | fixed point |
1992 | scope | multidatabases |
1992 | process syncr. | selector |
1992 | reflection | metadata |
1993 | inheritance | join |
1994 | aggregation | join |
1994 | function | relation |
1995 | A.D.T. | nesting |
1996 | events | active |
1997 | objects | instantiation |
199? | type | constraint |
Main ideas:
(Relix - SQL) =
no tuples, only relations
StuCour( Stu | Cour | Asst | Exam) |
---|---|---|---|
Joe | Algol | 45 | 42 |
Sue | Algol | 47 | 38 |
Joe | Pascal | 43 | 42 |
Sue | Pascal | 45 | 42 |
JoeCour( Cour | Asst | Exam) |
---|---|---|
Algol | 45 | 42 |
Pascal | 43 | 42 |
JoeCour( Cour | Asst | Exam) |
---|---|---|
Algol | 45 | 42 |
Pascal | 43 | 42 |
JoeMark( Avg) |
---|
86 |
StuCour( Stu | Cour | Asst | Exam) |
---|---|---|---|
Joe | Algol | 45 | 42 |
Sue | Algol | 47 | 38 |
Joe | Pascal | 43 | 42 |
Sue | Pascal | 45 | 42 |
TotMark( Cour | Total) |
---|---|
Algol | 172 |
Pascal | 172 |
What kinds of programming can be done in a language whose data structures support possibly very large sets of related items, and whose operations abstract over loops?
Main idea:
Unify functions and relations
Topics
C < - A + B | |
---|---|
alt | A < - C - B |
alt | B < - C - A |
T(C) | R(A | B) | S(A | B | C) |
---|---|---|---|---|---|
4 | 5 | 2 | 5 | 2 | 7 |
6 | 3 | 6 | 3 | 9 |
Computations (cont.)
New Horizons from Computations
Computations encourage us to think in terms of sets of related equations.
This introduces some amusements which people who like abstractions may already have formalized.
Try uniform acceleration:
d = d0 + v0*t + a*t^2/2 |
---|
v = v0 + a*t |
Combining Computations (cont.)
(Missing table: see PostScript version )
The matrix shows when this can be implemented as two computations joined, in some order, with actual parameters provided by some input data.
New Horizons from Computations (cont.)
(Missing equations: see PostScript version )
Here is another pair of equations, rotation in two Euclidean dimensions.
where c is cosine and s is sine. So, of course,
But computations motivate us also to find
and
This is the improper Lorentz transformation that gives time reflection. !
computation to climb a concept hierarchy. (ref. Jiawei Han)
comp generalize( Src, Rslt, Attr, Thld)
{ let distinct be equiv + of 1 by Attr;
while [] where distinct > Thld in Src
do
{ local tmp < - Src[ Attr icomp
Low] Hier;
if [] in tmp then
{ let Attr be High;
Rslt < - [ Attr, ~{ Attr}] in
[ High, ~{ Attr}] in tmp;
} else
Rslt < - Src;
}
}
( Hier( Low, High) holds the concept hierarchy)
Mining (cont.)
For example, with a data relation
Student( Name | Major | Born | GPA) |
---|---|---|---|
Anderson | hist | Vanc. | 3.5 |
Bach | biol | Calg. | 3.7 |
: | : | : | : |
by the invocations
generalize( in Student, out MajorGen, in
Major, 3);
generalize( in MajorGen, out BornGen, in
Born, 3);
generalize( in BornGen, out StudentAbil, in
GPA, 3);
Predictive Modelling for D.S.S.
Predictive Modelling (cont.)
Here is the code for the last two, using least squares for the model:
let yavg be ( red + of year) /
( red + of 1.0);
let qavg be ( red + of qty) /
( red + of 1.0);
let slope be
( red + of (( year - yavg)*
( qty - qavg)))
/ ( red + of ( (year - yavg)**2));
let base be qavg - slope* yavg;
comp LinInterp( year, qty, slope, base) is
qty slope* year + base;
FutSales is LinInterp icomp PastSales;
We could also have programmer-defined events which might automatically trigger more than one event-handler computation.
Main idea:
Relational operators all apply to attributes
Topics
Example (Jaeschke & Schek, 1982):
Books( Authors | Title | Price | Descript) |
---|---|---|---|
A1,A2 | T1 | P1 | D1, D2 |
A2 | T2 | P2 | D1, D2 |
A1 | T3 | P1 | D1,D2,D3 |
[ Title] where Authors sup { (A2)} and
Descript sup { (D1), (D2)} in Books
Example (Deshpande & Larson, 1987):
(Missing table: see PostScript version )
Find employees without children:
[ ENo] where not ([] in Children)
in Employee
(Note the projection of the attribute, Children.)
Find employees who took course 314:
[ ENo] where ([] where
CNo= 314 in Training)
in Employee
(Note the selection of the attribute, Training.)
Double nesting (cont.)
Find children of employees who took course 314:
[ Children] where ([] where
CNo= 314 in Training)
in Employee
(Note the result is still a nested relation.)
Find names of those children:
let C be [ Name] in Children;
or
let C be Children.Name
[ C] where ([] where CNo= 314 in
Training)
in Employee
(Missing table: see PostScript version )
Find DEPTs with two A grades:
let A1 be red + of 1;
let S2 be red ujoin of [ Name,
[ A1] in where Grd= "A" in Courses]
in Students;
let A2 be red + of A1;
Ans [ Dept] where Prof.S2.A2= 2 in
EFaculty;
(Missing table: see PostScript version )
Find total costs
(A11 6; A12 5; A1 14; A2 2; A 20):
let SumCost be red + of CalCost;
let CalCost
be if Assembly= then Cost
else Cost + Assembly.SumCost;
Ans [ Id, CalCost] in Assembly;
The NEST and UNNEST operators are too unruly and unimportant to be implemented except in terms of other primitives which we need anyway.
Main idea:
Instantiate with domain and relational algebras
Topics
Instantiate by domain algebra actualization.
e.g., bank accounts
comp ba( deposit, balance) is
{ state bal init 0;
deposit < - comp( dep) is
{ bal < - bal + dep
alt
dep < - bal - old bal;
} /* deposit */
balance < - comp(b) is
{ b < - bal;
} /* balance */
} /* ba */
Given
accts( ac# | client) |
---|---|
1 | joe |
2 | sue |
let ba1 be [ deposit, balance] in ba;
accounts [ ac#, client, ba1] in accts;
comp molecules( initial, hitWall, getvel, putvel, ..) { ...
hitWall < - comp( which) is
{ case which:
{ "x": xvel < - - xvel;
"y": yvel < - - yvel;
"z": zvel < - - zvel;
} } /* hitWall */
} /* molecules */
The binary operations are defined outside the class, with parameters.
For instance, bang collides two molecules (of equal mass), swapping
their velocities.
bang comp( mol1, mol2, mola, molb) is
{ local xv1, yv1, zv1, xv2, yv2, zv2;
mol1.getvel( xv1, yv1, zv1);
mol2.getvel( xv2, yv2, zv2);
mol2.putvel( xv1, yv1, zv1);
mol1.putvel( xv2, yv2, zv2);
} /* bang */
This is all invoked on a relation of molecules.
gas for mol# from 1 to 10000
par mol#;
Binary Operators on Objects (cont.)
let mol1 be [ initial, hitWall, getvel, putvel, ..]
in molecules;
let mol2 be [ initial, hitWall, getvel, putvel, ..]
in molecules;
binaryMols ([ mol#1, mol1] in gas) ijoin
[ mol#2, mol2] in gas
update binaryMols change
bang( in mol1, in mol2,
out mol1, out mol2)
using where close in ( binaryMols ijoin near);
( near is another binary computation which inputs mol1 and mol2 and outputs the boolean, close.)