Copyright ©2007 T. H. Merrett
Aldat
Wikipedia categories: programming languages, databases
Motivation
Computer memory comes in a variety of forms, two principle ones being primary
memory ("RAM"), and secondary memory ("SS") which includes hard and floppy disks
and CD-ROM. Data structures and languages for RAM are core computer science.
The Aldat Project is concerned with language and data structures for SS.
Secondary storage is distinguished from RAM primarily by the long times required
to find data, relative to the times required to transfer it for processing.
Different memory organization needs different data structures, algorithms and
languages.
There are many general-purpose languages for RAM and it is unusual to encounter
languages specialized for individual applications. Languages for SS are usually
query languages rather than full programming languages, and they are specialized
both to particular data structures (hierarchies, linked lists, tables, etc.) and
to particular applications (administrative data, spatio-temporal data, logic
data, semistructured data, etc.) For instance, RAM has no spatio-temporal
dialect of FORTRAN or Java, but SS has Arc-Info while commercial relational
databases (SS) do not have comparable spatial capability except as add-ons.
Aldat is a general purpose SS programming language developed at
McGill University. This development has
been empirical,
following and sometimes leading the main thrusts of database research (including
hierarchies, data-structure-set, multiset relations, entities and relationships,
object-oriented databases, logic databases, active databases, data warehousing,
data mining, semistructured data, Internet data, and streams and timeseries)
with the aim of including
all the evolving possibilities of SS programming. The difficulties of doing
this are captured in the phrase "impedance mismatch", which characterizes the
gaps among applications on one hand and between querying and programming on
the other.
Aldat is a relational programming language which includes all query capabilities
of commercial relational database systems, as well as general programming.
The language and data abstractions with which Aldat accommodates SS also turn
out to be valuable in programming for RAM, and so has benefits for all
computer science.
History
- Started in the 1970s with the classical relational database model.
- Domain Algebra
[ T. H. Merrett "Experience with
the Domain Algebra" Proc. 3rd Internat. Conf. on Data and Knowledge
Bases
1988 San Mateo CA, Morgan Kaufmann Publishers Inc. pp.335-346]
introduced ˜1974.
- QT-selectors
[ T. H. Merrett "The Extended
Relational Algebra, A Basis for Query Languages" Databases: Improving
Usability and Responsiveness 1978 Academic Press pp.99-128]
add quantified queries to the relational algebra.
- Recursive views
[ A. Clouâtre, N. Laliberté, T. H. Merrett
"A General Implementation of Relational Recursion with Speedup Techniques for
Programmers" Information Processing Letters 32 (Sept. 1989)
pp.257-61]
introduced in the 1980s allow transitive closure and
inference engines.
- Domain algebra applied to spatial and temporal data.
- Aldat Computations
[T. H. Merrett
"Computations: Constraint Programming with the Relational Algebra" Proc.
Internat. Symp. on Next Generation Database Systems and Their
Applications Fukuoka, Japan, Sept. 1993 pp.12-17 ]
introduced in
1990s to give procedural abstraction using relational syntax. Active databases
and abstract data types are special cases.
- Subsuming relational algebra within domain algebra supplies operators
needed for nested relations without new syntax.
- Metadata included, allowing association and classification data mining.
- Instantiation and inheritance (object-orientation) follow.
- Recursive nesting and recursive domain algebra introduced in 2000s.
- Semistructured data processing follows.
Overview
The Aldat secondary storage programming language consists of two largely
independent algebras, the Domain Algebra, operating on
attributes, and the Aldat Relational Algebra, operating
on first-normal-form relations. Both satisfy the principle of
abstraction, which ensures that the attribute
operations of the domain algebra can be specified independently of any relation
and that the relational operations of the relational algebra are general and
independent of the specific content of any relation. They also satisfy the
principle of closure, permitting arbitrarily long expressions.
The domain algebra consists of two families of operators. The family of scalar
operators includes arithmetic and other numerical functions, booleans, and
string operators, all with the proviso that any scalar expression may be
evaluated entirely on each individual tuple of a relation. The family of
aggregate operators includes four meta-operators: reduction and
equivalence reduction take associative-commutative operators as
parameters and allow sums, counts, maxima, minima, products, averages,
standard deviations, etc., of attributes over all tuples; functional
and partial functional mapping remove the restriction to
associative-commutative operators by using attibute values to induce orderings
on the tuples.
The Aldat Relational Algebra provides a simple
rationalization of the
classical relational algebra by separating joins into two families of
set-based operations, one extending set-valued binary operations to relations
and the other extending logic-valued binary operations on sets to relations. In
the former family, set intersection becomes natural join. In the latter, set
containment generalizes relational division and set overlap gives natural
composition.
Aldat continues the rationalization of relational algebra by merging projection
and selection into a single operator and generalizing it to include quantifiers.
It adds a pattern-matching unary operator, grep, and an open-ended
family of "editors". Updates to single relations introduce non-functional
operations into the language. Finally, the relational algebra is used to
evaluate the expressions of the domain algebra by "actualizing" the resulting
attributes in new relations, and, conversely, the domain algebra is extended to
include the relational algebra so as to provide operations on attributes which
are nested relations.
Programming language paradigms such as parametric abstraction appear in the
Aldat Computations, which treats functions and
procedures as special cases of
relations, and such as scoping, which appears both in the nesting of relations
within other relations and in the analogous nesting of databases within each
other to form multidatabases and databases distributed over the Internet.
Scoping permits "path expressions" which use regular expressions to
specify special cases of domain-algebra recursion equivalent to query languages
on XML-like semistructured data.
Further Reading
Links
Aldat Relational Algebra,
Domain Algebra,
QT-selectors,
Aldat Computations