Copyright ©2007 T. H. Merrett

Aldat

Wikipedia categories: programming languages, databases

Motivation

Computer memory comes in a variety of forms, two principle ones being primary memory ("RAM"), and secondary memory ("SS") which includes hard and floppy disks and CD-ROM. Data structures and languages for RAM are core computer science. The Aldat Project is concerned with language and data structures for SS.

Secondary storage is distinguished from RAM primarily by the long times required to find data, relative to the times required to transfer it for processing. Different memory organization needs different data structures, algorithms and languages.

There are many general-purpose languages for RAM and it is unusual to encounter languages specialized for individual applications. Languages for SS are usually query languages rather than full programming languages, and they are specialized both to particular data structures (hierarchies, linked lists, tables, etc.) and to particular applications (administrative data, spatio-temporal data, logic data, semistructured data, etc.) For instance, RAM has no spatio-temporal dialect of FORTRAN or Java, but SS has Arc-Info while commercial relational databases (SS) do not have comparable spatial capability except as add-ons.

Aldat is a general purpose SS programming language developed at McGill University. This development has been empirical, following and sometimes leading the main thrusts of database research (including hierarchies, data-structure-set, multiset relations, entities and relationships, object-oriented databases, logic databases, active databases, data warehousing, data mining, semistructured data, Internet data, and streams and timeseries) with the aim of including all the evolving possibilities of SS programming. The difficulties of doing this are captured in the phrase "impedance mismatch", which characterizes the gaps among applications on one hand and between querying and programming on the other.

Aldat is a relational programming language which includes all query capabilities of commercial relational database systems, as well as general programming.

The language and data abstractions with which Aldat accommodates SS also turn out to be valuable in programming for RAM, and so has benefits for all computer science.

History

Overview

The Aldat secondary storage programming language consists of two largely independent algebras, the Domain Algebra, operating on attributes, and the Aldat Relational Algebra, operating on first-normal-form relations. Both satisfy the principle of abstraction, which ensures that the attribute operations of the domain algebra can be specified independently of any relation and that the relational operations of the relational algebra are general and independent of the specific content of any relation. They also satisfy the principle of closure, permitting arbitrarily long expressions.

The domain algebra consists of two families of operators. The family of scalar operators includes arithmetic and other numerical functions, booleans, and string operators, all with the proviso that any scalar expression may be evaluated entirely on each individual tuple of a relation. The family of aggregate operators includes four meta-operators: reduction and equivalence reduction take associative-commutative operators as parameters and allow sums, counts, maxima, minima, products, averages, standard deviations, etc., of attributes over all tuples; functional and partial functional mapping remove the restriction to associative-commutative operators by using attibute values to induce orderings on the tuples.

The Aldat Relational Algebra provides a simple rationalization of the classical relational algebra by separating joins into two families of set-based operations, one extending set-valued binary operations to relations and the other extending logic-valued binary operations on sets to relations. In the former family, set intersection becomes natural join. In the latter, set containment generalizes relational division and set overlap gives natural composition.

Aldat continues the rationalization of relational algebra by merging projection and selection into a single operator and generalizing it to include quantifiers. It adds a pattern-matching unary operator, grep, and an open-ended family of "editors". Updates to single relations introduce non-functional operations into the language. Finally, the relational algebra is used to evaluate the expressions of the domain algebra by "actualizing" the resulting attributes in new relations, and, conversely, the domain algebra is extended to include the relational algebra so as to provide operations on attributes which are nested relations.

Programming language paradigms such as parametric abstraction appear in the Aldat Computations, which treats functions and procedures as special cases of relations, and such as scoping, which appears both in the nesting of relations within other relations and in the analogous nesting of databases within each other to form multidatabases and databases distributed over the Internet. Scoping permits "path expressions" which use regular expressions to specify special cases of domain-algebra recursion equivalent to query languages on XML-like semistructured data.

Further Reading

Links

Aldat Relational Algebra,
Domain Algebra,
QT-selectors,
Aldat Computations