Updates and Events in a Nested Relational

Programming Language

 

 

 

Weizhong Sun

 

 

 

School of Computer Science

McGill University, Montreal

March 2000

 

 

 

A thesis submitted to the Faculty of Graduate Studies and Research in partial

fulfillment of the requirements of the degree of Master of Science

 

 

 

 

Contents

 

Abstract                                                        i  

 

Résumé                                                          ii

 

Acknowledgements                                                iii

 

1          Introduction                                         1

1.1        Relational Model                                     1

1.1.1      Flat Relational Model                                1

1.1.1.1    Introduction                                         1

1.1.1.2    Relational Algebra                                   2

1.1.1.3    Domain Algebra                                       3

1.1.1.4    Update Operation                                     3

1.1.2      Nested Relational Model                              4

1.1.3      Discussion                                           7

1.2        Active Database Systems                              8

1.2.1      Introduction                                         8

1.2.2      ECA Model                                            9

1.2.2.1    Basic Concepts                                       10

1.2.2.2    Rule Execution Granularity                           10

1.2.2.3    Transaction and Coupling Modes                       11

1.2.2.4    ECA Model in SQL3                                    14

1.2.3      Discussion                                           19

1.3        Summary and Thesis Outline                           21

 

2          jRelix Overview                                      23

2.1        Declaration of Domain and Relation                   23

2.1.1      Domain Declaration                                   23

2.1.2      Relation Declaration and Initialization              25

2.1.3      Example                                              26

2.2        Relational Algebra                                   27

2.2.1      Assignment and Incremental Assignment                27

2.2.2      Relational Expressions                               28

2.3        Domain Algebra                                       34

2.4        Unification of Relational Algebra and Domain Algebra 36

2.5        Computation                                          37

 

3          User Manual on Updates and Event Handlers            42

3.1        User Manual on Updates                               42

3.1.1      Introduction                                         42

3.1.2      Syntax and Semantics                                 42

3.1.3      Worked Example of Flat Relation                      43

3.1.4      Worked Example of Nested Relation                    48

3.2        User Manual on Event Handlers                        53

3.2.1      Introduction                                         53

3.2.2      Computation as Event Handler                         56

3.2.3      Cascading Event Handler                              56

3.2.4      Event Handler On/Off                                 58

3.2.5      Printing Event Handler                               58

3.2.6      Deleting Event Handler                               58

3.2.7      Trigger and New and Rest                             59

3.2.8      Undo Command                                         61

3.2.9      Worked Example of Event Handler                      62

 

4          Implementation of Update and Event Handler           65

4.1        Representation of Nested Relations                   65

4.2        Implementation of Updates                            66

4.2.1      EvaluateUpdate                                       68

4.2.1.1    Algorithm of EvaluateUpdate                          68

4.2.1.2    Example of EvaluateUpdate                            70

4.2.2      ExecuteUpdate                                        72     

4.2.2.1    Algorithm of ExecuteUpdate                           72

4.2.2.2    Example of ExecuteUpdate                             72

4.2.3      SwitchTrigger                                        73

4.2.3.1    Algorithm of SwitchTrigger                           73

4.2.3.2    Example of SwitchTrigger                             74

4.3        Implementation of Event Handlers                     75

4.3.1      Computation as Event Handler                         75

4.3.2      Algorithm of ExecuteEventHandler                     76

4.3.3      Example of ExecuteEventHandler                       78

4.4        Combination of Updates and EventHandlers             80

4.5        Setting Event Handlers On/Off                        81

4.6        Deleting Event Handlers                              82

4.7        Printing Event Handlers                              82

 

5          Conclusion                                           83

5.1        Conclusions                                          83

5.2        Future Work                                          83

 

Appendix A                                                      91

 

Bibliography                                                    99

 

 

 

 

Abstract

 

This thesis documents the design and implementation of Updates and Event Handlers in a relational database programming language. Update operations are implemented in the nested relational model and Event Handlers transform the database system from passive to active.

     The update operation allows the user to change values of specified attributes in certain tuples. These attributes can be selected by a "using" clause which uses a relational algebra operation to select tuples from the relation we want to update. We can also use updates to add or delete some tuples to or from the relation. The nested update in nested relations will also be presented.

     Event handlers are introduced as procedures, invoked by events which are system generated procedure calls. We implement the event handler based on computation - the procedural abstraction facility of the database programming language.

     An update statement can invoke multiple event handlers. Event handlers may contain update statements which in turn invoke other event handlers. This introduces cascading event handlers. We will present the combination of update and event handler algorithm by which the update operation is coupled with event handler closely and neatly, and the cascading event handler is handled by the system's recursive execution.

     In this thesis, we provide the user a uniform syntax to update both flat relations and nested relations. The unification of computation and procedure leads to a simpler language. More over, the explicitness and intuitiveness of the definition and implementation of both event and event handler are under substantial considerations.

 

 

Résumé

 

Cette thèse décrit la conception et l'implémentation d'un manipulateur d'événement et de mise à jour, dans un language de base de données relationnelle. Les opérations de mise à jour sont implementées par un modèle relationnel emboîté, et les manipulateurs d'événement transforment un système de base de données du passif à l'actif.

    L'opération de mise à jour permet à l'usager de changer la valeur de certains attributs dans une série. Ces attributs peuvent être choisis en se servant d'une clause qui utilise une opération d'algèbre relationnelle afin de choisir la série dont la relation doit être mise à jour. Il est aussi possible d'utiliser l'opération de mise à jour pour ajouter ou enlever une serie d'une relation. La mise à jour emboîté d'une relation emboîté sera aussi présenté.

     Des manipulateurs d'évènements seront présentés en tant que procédures, invoqués par des évènements, qui sont des appels de procédure générés par un système. L'implémentation de ces manipulateurs d'évènement est basée sur des traitement de données - la facilité de l'abstraction procédural du language de programation de base de données.

     L'instruction de mise à jour peut invoquer de multiple manipulateurs d'évènement. Les manipulateurs d'évènement peuvent à leur tour invoquer d'autres manipulateurs d'évènement. Seront présenté, une combinaison d'algorithme de mise à jour et de manipulations reliées de près de façon efficace, et la cascade d'un manipulateur d'évènement par l'exécution récursive d'un système.

     Dans cette thèse, une syntaxe uniforme de mise à jour pour des relations emboîté ou non est présentée aux lecteurs. L'unification des traitements et procédures mênent à un langage plus simple. De plus, l'évidence et l'intuitivetée de la définition et l'implémentation des deux manipulateurs d'évènement sont des considérations substantiels.                                                                                                                      

 

 

Acknowledgements

 

I would like to thank many people who provided assistance to me. Among them, I especially wish to thank Prof. T. H. Merrett, my thesis supervisor. His invaluable advice, patient guidance, continuous encouragement, and financial support provided me a favorable environment to write my thesis. Without these, this thesis might never have been written.

     I would like to thank McGill University and the faculty of Computer Science for the financial support of Maxtern Recruitment Fellowship and Tuition Fee Waiver Fellowship.

     I would also like to thank Ian Garton, for his assistance and advice during the project from which I benefit a lot. I am also indebted to my friends, Heng Jia and Jian Fu, for their enthusiastic assistance on preparing this thesis.

     I would like to thank my family, for their continuous support and encouragement throughout my studies at McGill.

     Last but not least a special thanks goes to Lynn, whose love and encouragement are the driving force for me.

 

 

Chapter 1

 

Introduction

 

This thesis documents the design and implementation of Updates and Event Handlers for  jRelix. The basic aim of our work is to support update operations in a nested relational model and transform jRelix from a passive database system to an active one. We first have a historical overview of the relational model including both the flat and the nested relational model. Then we will discuss active database systems. Finally we will present the thesis outline.

 

1.1   Relational Model

 

1.1.1 Flat Relational Model

 

1.1.1.1 Introduction

 

The relational model introduced by Codd [Cod70] [Cod79] represents the database as a collection of time-varying relations. The relation is a simple and uniform data structure which consists of rows and columns. Each relation resembles a table, and each row in the table represents a collection of related data values. The terminology "tuple" is used to refer to a row, and "attribute" is used to refer to a column header. "Domain" refers to the data type of values that can appear in a column. Formally, a relation is a subset of the Cartesian product of its domains.

     If the domains are all simple, such a relation has a tabular representation with the following properties:

* All tuples are distinct from each other.

* A relation is an unordered set of tuples.

* Each attribute is unique and the order of columns is irrelevant.

* The attribute values are atomic. By atomic it means each value in the domain is undecomposable.

The fourth property is the so-called first-normal-form (1NF) requirement. A relation that satisfies 1NF is also called a flat relation.

     The concept of normalization was introduced in [Cod70] and dealt with more rigorously in his later papers [Cod72a] [Cod72b]. It addresses how data ought to be organized within a database in order to make the database as compact and as easy to manage as possible and ensure the result is consistent. The theory is based on a series of normal forms - which define increasingly stringent requirements. A thorough discuss about normalization technique can be found in [Dat81] and [Ull82].

     A number of early projects were implemented after the introduction of the relational model, and the most significant research may be attributed to three projects [CB99]. The first was System R, which was developed at IBM's San Jose Research Laboratory in California during the late 1970s [AB+76]. This project was designed to prove the practicality of the relational model by providing an implementation of its data structures and operations. The second project was the INGRES project at the University of California at Berkeley [HSW75]. INGRES involved the development of a prototype RDBMS which spawned the commercial product INGRES. The third project was the Peterlee Relational Test Vehicle at the IBM UK Scientific Center in Peterlee [Tod76]. This project was significant for research into such issues as query processing and optimization, and functional extension. The idea of procedural expression (computation) of an infinite relation was first proposed in this project.

 

1.1.1.2 Relational Algebra

 

The relational algebra is first suggested by [Cod70], and consists of a collection of operations applied on relations. All operations take a relation as an operand and return a relation as a result. The result relation can be further manipulated. The property that any relational algebra operation evaluates to a relation is called the "closure principle". This allows complex relational expressions by concatenating a series of simple operations.

     The relational algebra operations are usually divided into two groups based on the number of their operands.

* Unary operations, includes projection, selection.

* Binary operations, includes mu-join and sigma-join.

 

1.1.1.3 Domain Algebra

 

The need for arithmetic and related operations on the values of attributes in individual tuples is encountered. Merrett [Mer76] proposed the domain algebra which consists of a set of operations to manipulate attributes in individual tuples. The user can create new attributes from the existing ones with two kinds of operations:

* horizontal operations: new value is generated from the values within a tuple

- Constant

- Rename

- Mathematical operations

- If-then-else

- Predefined functions

* vertical operations: new value is generated from values along an attribute

- Reduction

- Equivalence

- Functional Mapping

- Partial Functional Mapping

 

1.1.1.4 Update Operation

 

As Codd described, a relational database is a time-varying collection of data, all of which can be accessed and updated as if they were organized as a collection of time-varying tabular relations.

     There are three ways to update a relation as time processes:

* Insert additional tuples into the relation.

* Delete existing tuples from the relation.

* Change the attribute values of the existing tuples in the relation.

 

1.1.2 Nested Relational Model

 

In the late 1970s a significant direction in database research was established: various extensions to the relational model were proposed to enrich the semantics of the original model, and to open the model to better meet the requirements.

      In 1977 Makinouchi [Mak77] proposed to generalize the relational model by removing the 1NF constraint. Jaeschke and Schek [JS82] introduced a generalization of the ordinary relational model by allowing relations with set-valued attributes and adding two restructuring operators, the nest and the unnest operations, to manipulate such (one-level) nested relations. Thomas and Fishcher  [TF86] generalized Jaeschke and Schek's model and allowed nested relations of arbitrary (but fixed) depth. The definition of recursively nested relations was also discussed [LS88]. Detailed examination can be found in [AB84] [FT83] [KK89] [SS86].

     Meanwhile, various query languages have been introduced for the nested relational model. Roth, Korth and Silberschatz [RKS88] defined a calculus-like query language for the nested relational database model of Thomas and Fischer. Since then, many SQL-like query languages [PA86] [KR89] [PT86], graphics-oriented query languages [HP87] and datalog-like languages [BK86] [BNR+87] have been introduced for this model or slight generalizations of it.

     By removing the 1NF constraint of the flat relational model, the attributes in the nested relational model can be relation-valued. Let us consider an example of a database for employees in different departments of a company. Each employee in a department has name and salary. The relationship can be represented diagrammatically as a tree, as shown in Figure 1.1.

 

                        Figure 1.1: Hierarchical structure example

     

As well, the contents of above information structure can be illustrated like that shown in Figure 1.2.

                  Figure 1.2: Example of a hierarchical record

 

Alternatively, we can use a nested relation to present the information as illustrated in Table 1.1.  The nested relation consists of two tuples each having two attributes:

* DEPARTMENT: The name of the department. Its data type is string (atomic).

* EMPLOYEES: A nested relation containing the information of each employee in the department.  Each tuple in relation EMPLOYEES contains a whole relation as an attribute. The first tuple contains a relation with 4 tuples. The second tuple contains a relation with 3 tuples.

 

 

 

               Table 1.1: Example of a nested relation presentation

 

If we use the 1NF conformant relation to store such company information, it would have repeated values for attribute DEPARTMENT as shown in Table 1.2.

 

 

 

 

             Table 1.2: 1NF-conformant company record table

 

The main advantages of nested relation model are the following [Lev91] [Yua98]:

* For storage efficiency, nested relations minimize redundancy data

* For programming flexibility, nested relations allow a very flexible interface at the external level, since both flat and hierarchical data can be presented to the user.

 

1.1.3 Discussion

 

The flat relational model is usually thought to be not qualified to handle applications such as CAD/CAM where the objects to be modeled are structured in hierarchy (also referred to as complex objects) while the nested relational model is more capable of handling such applications. But this is not true. As shown in the implementation of jRelix (detailed discussion of jRelix will be presented in the following chapters and can be found in [Hao98] [Yuan98] [Bak98]), the nested relational model essentially does not extend the data model with new concepts or operators. It just permits a more deliberate use of type constructors (attribute could be of relation type) and the query language (including relational algebra and domain algebra in jRelix) already available.

     To structure hierarchical data, the user can define the data with nested relation-valued attributes. The system will then map the nested relations to flat relations in the storage and the query language can be applied on both flat relations and nested relations. The only extension is to integrate relational algebra expressions into domain expressions thus any operations that are performed on the top-level relations can be performed at the lower level attributes which are also relations.

     It might be argued that mapping nested relations to flat relations requires lots of joins, which is very time consuming. But this is not a problem of mapping itself, but an issue of the implementation of joins. The join operation can be implemented using a pointer so that the cost of joins of flat relations is just the cost of pointer de-referencing of nested relations. Queries with low selectivity are suitable to be handled by pointer de-referencing.  

     A thorough examination of how to use jRelix to build up CAD/CAM applications will not be discussed further as it is beyond the scope of work presented in this thesis.

 

 

 

1.2   Active Database Systems

 

1.2.1 Introduction

 

Traditional database systems are passive in the sense that commands are executed by the database (e.g., query, update, delete) as and when requested by the user or application program. Such systems lack the capability to respond to happenings of interest without the user intervention.

     One early thread of research in active database systems focused on scaling up production rule systems so that they could deal with large numbers of rules and facts stored in a database. The primary goal of these large-scale production rule systems was to support inferencing over a database  [DE89] [Han89] [SLR88] [SKM92].

     In the mid to late 1970s, triggers were proposed as a way of enforcing integrity constraints, and a trigger subsystem for System R was described in [EC75] [Esw76]. In  1982, Stonebraker first proposed the use of situation-action rules as a unifying mechanism for implementing constraints, view processing, triggers and access control [Sto82]. This idea was implemented in the POSTGRES Rule System I. In [DH+94], Situation-action rules were used to embed the shared operational semantics of applications in databases. [KDM88] described the event-trigger mechanism for enforcing complex constraints in CAD databases.

     Research in active database really exploded after the introduction of ECA-rule first proposed in HiPAC [DB+88]. Based on the idea of ECA, HiPAC developed concepts for time-constrained active, object-oriented database systems. Besides HiPAC, there are another two important pioneering research projects in active database systems: Postgres Rule System II [SJ+90] and Starburst Rule System [Wid96], which introduced various granularity of ECA rules

     Following these early pioneering projects, more and more research projects in active database systems have been implemented, such as Chimera (an active and deductive DBMS) [FMT84], REACH (an active real-time system) [BB+93], and Ode (a persistent C++ system with triggers) [GJ91].

     For example, an inventory control system needs to monitor the inventory database, so that when the quantity in stock of some item falls below a threshold, a re-ordering activity may be initiated. This behavior could be implemented over a passive database system in the following two ways, neither of which is satisfactory. First, a new application program is written to poll the database periodically. However, polling too often can be inefficient, polling too slowly may result in delayed responses to critical situation. Alternatively, every program that updates the inventory database could check the condition and invoke the reordering operation if necessary; however, this is poor software engineering because the desired semantics are embedded in many programs and also every updater needs to know which downstream operations to call.

     An Active Database System is a conventional, passive database system extended with the capability of reactive behavior. The desired behavior is expressed in rules that are defined and stored in the database [VB98].

     If we build up the above inventory control system using an active database system, we can define a rule that when quantity in stock falls below the threshold, the system will trigger a request of order. The active database system is now responsible for detecting the situation of interest and triggering the appropriate response. This has the benefit that the rules can be shared by many application programs, and the database system can optimize their implementation.

     An active database system has some advantages over its passive counterpart [VB98]:

1. Perform functions that in passive database systems must be encoded in applications.

2. Perform tasks that require special-purpose subsystems in passive database systems.

 

1.2.2 ECA Model

 

In this section, we will discuss the event-condition-action (ECA) model that is widely used in active database systems. Before we go any further we will give formal definitions of several basic concepts in ECA model.

 

1.2.2.1 Basic Concepts

 

According to [MD89], we have the following definitions:

 

*  "An event is the occurrence of pre-defined state which triggers the rule and causes the system to evaluate the condition. Event can be defined as having typed formal parameters, which are bound at the time an instance is signaled, and are passed to the condition and action.  Event can be either a single primitive event or a composite event. Primitive event can be one of the followings: database operations (data definition, data manipulation, or transaction control), temporal events (absolute, relative or periodic), or external notification (application defined events). Composite event is built up from the primitive type events by operators such as disjunction, sequence, closure, conjunction, and negation. For example, the event on insert to Emp or update to salary of Emp is a composite event using the operator or."

 

*  "Conditions are typically predicates or queries against the database state."

 

* "An action is a sequence of operations which are executed when the condition of the triggering event is satisfied. These operations can be either database operations or requests to application programs."

 

1.2.2.2 Rule Execution Granularity

 

Rules may be executed at several granularities: at a tuple- or instance-level; at a table- or statement-level; or at a transaction or other granularity. Some projects support one granularity, others a mixture.

     We will illustrate that the semantics can be quite different for the different granularities with the following example.

 

EMP ( Name           Salary )   

      M.Gordon        25000

      P.McConnel      22000

 

Event: delete the tuple where Salary >= 20000

Condition: true

Action: decrease Salary by 5000

 

     If the rule is evaluated at the statement level, i.e., after the entire update operation finishes, the result is the empty set (because both tuples are deleted by the update operation). If the rule is evaluated once for each tuple, then the P. McConnel tuple will remain in the relation: after the M. Gordon tuple is deleted, the rule fires, causing P. McConnel's salary to decrease to 17000, so that it no longer satisfies the predicate of the delete operation.

 

1.2.2.3 Transaction and Coupling Mode

 

Before we discuss various coupling modes of rule execution, we introduce the concept of transaction which is closely relative to coupling modes.

 

* A transaction can be considered as a collection of actions with the following properties [GR93]:

-    Consistency: a transaction is a correct transformation of the state of the database.  

The actions taken as a group do not violate any integrity constraints associated       with the state.

- Atomicity: the changes a transaction performs are atomic; either all happen or none happen.

- Isolation: although transactions execute concurrently, it appears to each transaction, T, that each other transactions executes either before T or after T, but not both.

- Durability: once a transaction completes successfully (commits), its changes to the state survive failures.

 

A transaction may contain any number of nested transactions [Mos85] [GR93]. A nested transaction is fully contained within another transaction (its parent). A nested transaction can have as its parent either another nested transaction or a top level transaction. The related transactions are organized in the form of a transaction tree.

     In the basic nested transaction model proposed in [Mos85], a nested transaction is started explicitly by the parent transaction, which is suspended until the nested transaction commits or aborts. The effects of a nested transaction do not become permanent until it and all its ancestors commit through a top level transaction. A nested transaction may be aborted without causing its parent transaction to abort. The parent transaction can create another transaction to retry the aborted one. When a parent transaction aborts its effects and the effects of all its descendants are ignored. If the top transaction aborts, the whole transaction tree is aborted.

     Three more types of nested transactions are presented in [DHL90] as the extension of the basic nested transaction model.

 

* Deferred nested transactions are nested transactions whose execution is explicitly delayed until the end of the user's top transaction. If more than one deferred nested transaction is spawned within a user's transaction, they all execute in parallel at the end of the user's transaction. If any of these deferred nested transactions spawns itself another nested transaction, this is executed until all deferred transactions from the level above have finished.

 

* Nested top-transactions are top transactions started from within another transaction and are represented by their own tree. However, it may not see any non-committed objects of the spawning transaction and is not automatically aborted when the spawning transaction aborts.

 

* Causally-dependent-top-transactions (CD-top) are spawned from within another transaction and are like nested top-transactions that have their own transaction tree but are commit-dependent on the parent. Aborting the spawning transaction aborts the CD-top transaction. However, aborting CD-top transaction has no effect on the spawning transaction.

 

     Now we move on to the topic of coupling modes of rule execution. An important aspect of ECA rule execution is the exact time of event detection, condition checking, and action execution relative to the triggering operation and the end of the transaction. There can be various delays in checking the condition of a rule after its event is detected or the execution of a rule's action can be delayed after its condition is evaluated to true. These delays can either occur within one transaction or spread over different transactions. The latter case raises the issue of nested transactions.

     The various possibilities of delays of the rule execution are called coupling modes. There are three coupling modes, both for the relative delay between event detection and condition checking (EC coupling) and between condition checking and action execution (CA coupling):

 

* Immediate: The condition (action) is evaluated (executed) immediately after the event (condition) within the same transaction. Immediate coupling modes can be used, for example, to enforce security constraints or to propagate updates. For rule execution in small granularity, the condition (action) evaluation (execution) will be immediately done for each rule execution in that granularity.

 

* Deferred: The condition (action) is evaluated (executed) within the same transaction as the event (condition) of the rules, but at the end of the triggering transaction and before the triggering transaction commits. Deferred coupling modes can be used to defer constraint checking until the end of the transaction, thereby, allowing temporary inconsistencies within a single transaction. In small granularity, all the condition (action) evaluations (executions) of rule execution in that granularity will be deferred until the end of the transaction.

 

* Detached: The condition (action) is evaluated (executed) within a different transaction from the event (condition). The execution of the action can be dependent upon or independent of the committing of the transaction in which the event took place or the condition was evaluated. The benefit of this mode is that it results in shorter transactions that can be committed earlier, improve response time and concurrency. It can also be used to monitor access to sensitive information, as running a detached independent transaction will record information on the access even if the transaction in which the access took place is aborted. The detached coupling mode is supposed to be implemented in nested transaction model.

 

     The selection of proper coupling mode(s) for rule execution is highly dependent on the transaction model(s) supported by the active database system.

 

1.2.2.4 ECA Model in SQL3

 

The database language SQL is now firmly established as the predominant language standard for database access. The first version of SQL standard, SQL-86, was published in 1986. A substantial reversion of the standard appeared in 1992 as SQL-92. One significant extension is that SQL-92 introduced the notion of referential actions, which can be considered as a limited form of rule support. More recently, significant active database functionality is being incorporated into SQL3 [Day95] [Nor98] in the form of triggers.

     A trigger in SQL3 is a named event-condition-action rule that is activated by a database state transition. Each trigger is associated with a particular table and is activated whenever that table is modified and an optional condition, specified as part of the trigger definition, evaluates to true. Triggers can be processed either at tuple-level or table-level granularity.

     To define a trigger in SQL3, the name of the trigger must be unique among all trigger names. The triggering operation of a trigger can only be an INSERT, DELETE or UPDATE statement. The specification of the trigger condition is optional. If omitted, a trigger condition that specifies WHEN (TRUE) is implicit.

     Each trigger is associated with a trigger activation time, which determines whether the trigger is activated before or after the triggering operation. Therefore, there are two modes of trigger in SQL3: BEFORE mode and AFTER mode. A trigger in BEFORE mode is activated before the operation that modifies a table executes, while a trigger in AFTER mode is activated after the modifying operation executes. The specification of the trigger activation time is mandatory.

     The two kinds of triggers offer different benefits. Since AFTER triggers execute after the triggering operation, they are quite useful for performing follow-on updates to other tables. In contrast, BEFORE triggers are especially useful for enforcing transitional constraints.

     Consider the following example:

 

Emp ( empno, name, dno, age )

Dept ( dno, dname, loc, nemps )

CREATE TRIGGER deptdel

BEFORE DELETE ON Dept

WHEN Dept.nemps > 0

BEGIN

DELETE FROM Emp WHERE Emp.dno = Dept.dno

END

 

We have defined a trigger deptdel:

* subject table: Dept

* operation: DELETE

* condition: WHEN Dept.nemps > 0

* action: DELETE Emp WHERE Emp.dno = Dept.dno

* mode: BEFORE

* granularity: table-level

 

Trigger deptdel is in the BEFORE mode. It enforces the constraint that when a department tuple is deleted, if the number of employees in the department is greater than zero, then delete all employees of the department first.

     The trigger we defined above is processed in the table-level granularity, which means the trigger is executed once for all the changes applied on the subject table Dept. In the following example, we will define a trigger of tuple-level granularity.

 

CREATE TRIGGER empdel

AFTER DELETE ON Emp

FOR EACH ROW

BEGIN

UPDATE Dept SET nemps = nemps - 1 WHERE Dept.dno = Emp.dno

END

 

In the trigger empdel, we have

 

subject table: EMP

operation: DELETE

condition is omitted, which implies WHEN(TRUE)

action:

UPDATE Dept SET nemps =  nemps - 1 WHERE Emp.dno = Dept.dno.

mode: AFTER

granularity: FOR EACH ROW specifies the granularity as tuple-level.

 

When we execute the statement:

 

DELETE FROM Emp WHERE age >= 60;

 

the trigger empdel will execute once for each of the tuple that is deleted from Emp, which decreases the number of employees of the corresponding department by 1.

 

We summarize the execution rules of trigger as follows:

 

* When a triggering operation on its subject table is initiated and the trigger condition evaluates to true, a trigger is to be activated. When a trigger is activated, its trigger action is executed as part of the execution of its triggering operation.

 

* The trigger action is a list of SQL statements within a BEGIN/END block. When a trigger is activated, each statement in the block is executed in order.

 

* The execution of trigger actions of a trigger may activate other triggers (including the same trigger). Because of this, it is possible for the execution of a statement involving triggers to never terminate. It is up to the user to make sure that cascade triggers do not lead to non-terminating statement executions.

 

     In the following example, we use the trigger deptdel and empdel to illustrate the cascade trigger invocation.

 

CREATE TRIGGER deptdel

BEFORE DELETE ON Dept

WHEN Dept.nemps > 0

BEGIN

DELETE FROM Emp WHERE Emp.dno = Dept.dno

END

 

CREATE TRIGGER empdel

AFTER DELETE ON Emp

FOR EACH ROW

BEGIN

UPDATE Dept SET nemps = nemps - 1 WHERE Dept.dno = Emp.dno

END

 

We execute the following statement

 

DELETE FROM Dept WHERE Dept.dno = DeptNo

 

where DeptNo is the number of the department to be deleted.

 

Before the corresponding department is deleted from Dept, the table-level trigger deptdel will be activated and the statement

 

DELETE FROM Emp WHERE Emp.dno = Dept.dno

 

in the BEGIN/END block will be executed. Subsequently this statement will activate the tuple-level trigger empdel. Thus trigger empdel will execute once for each of the tuple that is deleted from the Emp, which means the statement

 

UPDATE Dept SET nemps = nemps - 1 WHERE Dept.dno = Emp.dno

 

will be executed once after a tuple is deleted from Emp. The operations continue until all the employees in that department are deleted from Emp; meanwhile Dept.nemps becomes zero. Then the statement

 

DELETE FROM Dept WHERE Dept.dno = DeptNo

 

is executed so that the department DeptNo is deleted from Dept.

 

1.2.3 Discussion

 

1. About Event, Condition and Action

 

In our work, we make a simpler and more intuitive definition of event, condition and action:

 

* Events are system-generated procedure calls.

 

This idea came from Visual Basic [SK+94], an event-driven programming language. In this event-driven approach, programs can be invoked by a variety of pre-defined actions such as a mouse click or a keyboard stroke, which are called events. The system has a facility to detect occurrences of events and then perform some specified actions which are referred to as event handlers.

 

* Event handlers are procedures.

 

The event handler executed in response to an event can be expressed as a procedure.

Consider the following example:

 

procedure mouse.leftclick

begin

      print("hello")

end

 

Here we define an event handler, which is a procedure that prints "hello" on the screen, in response to the event of a left click on the mouse.

     As shown above, the action print is taken within the event handler. We do not have Condition part in our model yet. Although we can evaluate condition in the event handler, it is desired to have an explicit component in the model to perform this function. This will be discussed further as future work in Chapter 5.

 

2. About Composite Event

 

In our model, in response to the composite event using the operator or such as

 

on insert to Emp or update to salary of Emp

 

we can define a common procedure which is called by the event handlers that deal with insert to Emp and update to salary of Emp, respectively. If either of the two events occurs, the corresponding event handler will be triggered and call that common procedure. It is not obvious how to implement the composite event using the operator and because it involves the detection of concurrent events which is beyond the scope of event handling.

 

3. About Rule Execution Granularity

 

In our model we do not support rule execution in tuple-level granularity because there is no concept of tuple in jRelix.  The tuple-level granularity can cause data inconsistency

problems. As shown in example in 1.2.2.2, the relation resulting from the tuple-level rule

execution depends on the order of tuples in the original relation, which is contradictory to the relational model theory that results should not depend on the order of the tuples.

     It is possible to work at sub-relation granularity (i.e., with subset of the tuples of a relation, defined, say, by a predicate) in our model, but it will still cause data inconsistency just as the tuple-level granularity.

 

4. Ideas to be followed in our work

 

We summarize the ideas to be followed in our work and make the comparison between our ideas and SQL3.

 

                Table 1.3: Ideas to be followed in our model

 

1.3   Summary and Thesis Outline

 

The purpose of this chapter was to introduce the basic conception of the Nested Relational Model and Active Database Systems. In the remainder of the thesis, we will present our start at the implementation of Updates in a nested relational database, jRelix, and the implementation of Event Handlers which enable jRelix to be an Active Database System.

     In chapter 2, we will introduce jRelix in detail and discuss some of the features relevant to our work. Chapter 3 is a user manual intended to provide a tutorial to readers about the features of Updates and Event Handlers in jRelix and how to use them.

Chapter 4 documents the implementation details of Updates and Event Handlers in jRelix. Finally, chapter 5 summarizes the thesis and suggests work to do in the future.

 

 

Chapter 2

 

jRelix Overview

 

This chapter is a tutorial about jRelix - a relational database programming language. jRelix consists of a database management system (DBMS) which is responsible for organizing and storing data, and a query language Aldat - Algebraic Data Language, based on relational algebra and domain algebra [Mer77]. The nested relation model presented in jRelix leads to a significant extension of the domain algebra. The extended domain algebra now includes the relational algebra based on the conceptually unified definition of both flat and nested relation [Hao98, Yuan98].

     Section 2.1 describes the declaration of attributes and relations and the initialization of relations. Section 2.2 discusses the assignments, incremental assignments and relational expressions. Section 2.3 discusses the domain algebra. Section 2.4 covers the unification of relational algebra and domain algebra upon nested relations. Section 2.5 introduces computation [Bak98] which will be used to implement the event handler in jRelix.

 

2.1   Declaration of Domain and Relation

 

In this section we describe the syntax of declaration of domains and relations, initialization of relations, as well as some system commands applied on domains and relations.

 

2.1.1 Domain Declaration

 

We use the key word domain to declare attributes used in a relation. The syntax ( refer to Appendix A for complete jRelix syntax ) goes as following:

 

domain IDList Type

 

Here IDList specifies the list of attributes being declared. And Type specifies the type of these attributes. We summarize the valid types supported in nested relations of jRelix in table 2.1.

 

 

                       Table 2.1: Attribute types in jRelix

 

There are four kinds of meta-type for attributes:

 

* type IDList specifies the attributes on which the nested attributes are defined

* type expression is for relational and domain expressions  (to be implemented)

* type statement is for statements  (to be implemented)

* type computation  is used to define computations as attributes ( for details refer to [Bak98] )

 

We use command sd to show the information (name, type, etc.) of a specific attribute or all the attributes currently defined in the system if the optional parameter Identifier is omitted. The syntax of this command is:

 

sd (Identifier)?

 

The command dd is used to delete attributes specified in IDList. However, if any of the attributes are being used in any existing relation, the command will fail. This requires that the user delete all the relations associated with the specified attributes before deleting the attributes. The syntax is:

 

dd IDList

 

2.1.2 Relation Declaration and Initialization

 

The syntax of relation declaration goes as the following:

 

relation IDList "(" IDList ")" ( Initialization )?

 

The first IDList specifies the relation being created, and the second IDList specifies the attributes on which the new relation(s) will be defined. Initialization is optional.

 

We use the command sr to show information (name, type,number of tuples, sort, etc.) of a specific relation or information of all the relations currently defined in the system if the optional parameter Identifier is omitted. The syntax goes as follow:

 

sr ( Identifier )?

 

The command pr is used to print a relation on the screen. The syntax is:

 

pr Expression

 

The command dr is used to remove the relations specified in IDList from the system.

The syntax is:

 

dr IDList

 

2.1.3 Example

 

domain Name strg;

domain Sal intg;

domain EMP (Name, Sal);

domain DETP strg;

relation employees ( DEPT, EMP ) <- { ("stereo", {("M.Gordon", 25000), ("P.McConnel", 22000), ("T.Anastasio", 27000), ("J.Fishman", 24000)}),

("television", {("B.Martin", 38000), ("C.Wood", 32000), ("J.Medeski", 35000)})};

 

Note that the assignment (<-) operator will be formally introduced in Section 2.2. Here we defined a nested relation employees which has two attributes, DEPT and EMP. Attribute DEPT is of primitive type string, while attribute EMP is of non-primitive type, a relation itself, defined on attributes Name and Sal. Attribute Name is of primitive type string and attribute Sal is of primitive type integer. The relation is shown as follows:

 

 

 

2.2   Relational Algebra

 

In this section we will discuss relational algebra which consists of a set of functional operators. These operators are applied on one or two relations and produce a result relation. This closure property of relational algebra allows complex expressions to be constructed by various operators. The syntax of the operations will be presented and examples are used to illustrate their functionalities.

     We start with how to use assignment and incremental assignment. Then we discuss relational expressions. 

 

2.2.1 Assignment and Incremental Assignment

 

jRelix provides two assignment operators, one is assignment ( <- ) which creates a relation using the result of a relational expression, the other is incremental assignment

( <+ ) which adds the result of a relational expression to an existing relation. The syntax goes as follow:

 

      Identifier ( "<-" | "<+" ) Expression

            |

      Identifier "[" IDList ( "<-" | "<+" ) ExpressionList "]" Expression

 

The assignment ( <- ) operator always creates a relation named by Identifier which consists of the same domain and data as the source relation. The source relation remains unaffected. If the relation which has the same name has already existed in the system, it will be removed first.

     The incremental assignment ( <+ ) operator, if there is no relation with the same specified name already existing in the system, behaviors the same as the assignment operator. If such a relation does exist in the system, the result of the right side Expression will be added to the left side relation provided that both of them are defined on the same set of attributes.

 

Consider the following example:

 

domain sno strg;

domain mark intg;

relation Record ( sno, mark ) <- {("1", 85), ( "2", 70), ("3", 90), ("4", 70)};

relation moreRecord ( sno, mark ) <- { ("5", 65) };

oldRecord <- Record;

Record <+ moreRecord;

 

The results are shown as follows:

 

oldRecord  ( sno       mark )

             1           85

             2           70

             3           90

             4           70

 

Record ( sno        mark )

         1            85

         2            70

         3            90

         4            70     

         5            65

 

The above example shows how we copy the data of relation Record to oldRecord using assignment and add the data of relation moreRecord to Record using incremental assignment.

 

2.2.2 Relational Expressions

 

Relational expressions include projection, selection and join. They can be divided into

two groups, unary operations and binary operations. Unary operations require a relation

as input and produce a relation as output. Binary operations require two relations as input

and produce a relation as output. The source relations will not be affected in both kinds of

expressions.

 

Unary Operations

 

* Projection

Projection extracts a subset of the source relation. The result relation consists of a subset of the attributes of the source relation specified by IDList. Duplicate tuples will be removed from the result relation. If the IDList is omitted, the resulting relation will contain one tuple with one boolean attribute .bool. The value is true if the operand relation has at least one tuple. Otherwise the value is false. The syntax goes as follow:

 

"[" ( IDList )? "]" in Expression

 

Example:

 

1. Retrieve the marks in relation Record.

 

Result <- [mark] in Record

Result ( mark ) 

        65

        70

        85

        90

 

2. Check whether there is tuple in relation Record

 

Result <- [ ] in Record

Result ( .bool )

          true

 

* Selection

 

Selection returns a subset of the source relation specified by Expression. The tuples in the result relation are those satisfying the condition of the SelectionClause. The syntax goes as follow:

 

where SelectClause in Expression

 

Example:

 

Retrieve the tuples in which mark is higher than 80 in relation Record

Result <- where mark > 80 in Record;

Result ( sno    mark)

         1        85

         3        90

 

* T - Selection

 

Projection and selection can be combined into one expression called T - Selection. The syntax goes as follow:

           

"[" ( IDList )? "]" where SelectClause in Expression

 

Example:

 

Retrieve the student number whose mark is higher than 80 in relation Record.

Result <- [ sno ] where mark > 80 in Record;

Result ( sno )

         1

         3 

 

* QT - Selection

 

Support for QT - Selection has not been implemented in jRelix. Refer to [Mer84] for more information.

 

Binary Operations

 

There are two categories of binary operators: (-joins and (-joins. (-joins are a generalization of set operations on relations, and (-joins are a generalization of logical operations on relations [Mer84]. The results of (-joins and (-joins are also relations.

The syntax for join goes as follows:

 

Expression JoinOperator Expression

                        |

Expression "[" ExprList ":" JoinOperator (":")? ExprList "]" Expression

 

In the first production, the common attributes of the left and right side relations are used as join attributes. In the case where the left and right side relations have no common attributes, the user may specify which attributes form the join attributes. This is handled in the second production.

 

mu-joins

mu-joins correspond to the binary set operations of union, intersection and difference. We summarize mu-joins in the following table.

 

                  Table 2.2: Summary of mu-joins

 

In general, mu-joins operators can be defined in terms of three components - center, left and right. Given two relations R(X,Y), S(Y,Z), the three components are defined as follows:

 

center(R,S) = {(x,y,z) |(x,y) in R and (y,z) in S}

left(R,S)   = {(x,y,dc)|(x,y) in R and any z ((y,z) not in S)}

right(R,S)  = {(dc,y,z)|(y,z) in S and any x ((x,y) not in R)}

 

We have:

 

R ujoin S = center(R,S) U left(R,S) U right(R,S)

R ijoin S = center(R,S)

R djoin S = X,Y in left(R,S)

R drjoin S = Y,Z in right(R,S)

R lrjoin S = left(R,S) U center(R,S)

R rjoin S = right(R,S) U center(R,S)

R sjoin S = left(R,S) U right(R,S)

 

Notice that here is a new symbol dc. This is one of the two null values in jRelix. The other is dk. Details of dc and dk can be found in [Mer84].

 

Example:    CLASS ( ITEM     TYPE )       RECLASS ( ITEM     TYPE )

                    Yarn       a                    Yarn       a 

                    String     a                    String     b

                    Ball       b                    Top        a

                    Sandal     c

 

Result1 <-  CLASS ijoin RECLASS;

Result1 ( ITEM      TYPE )

          Yarn       a

 

Result2 <- CLASS ujoin RECLASS;

Result2 ( ITEM       TYPE )

          Ball         b

          Sandal       c

          String       a

          String       b

          Top          a

          Yarn         a

 

Result3 <- CLASS djoin RECLASS;

Result3 ( ITEM     TYPE )

          Ball      b  

          Sandal    c

          String    a

       

sigma-joins

 

sigma-joins correspond to the set comparison operators such as 'subset?' or 'equals?'. We summarize sigma-joins in the following table.

 

 

 

 

                             Table 2.3: Summary of sigma-joins

 

As of this writing, mu-joins have been implemented in jRelix while sigma-joins have not. But they will soon be included in jRelix. Refer to [Mer84] for details of mu-joins and sigma-joins.

 

2.3   Domain Algebra

 

Relational algebra considers relations as the primitive data [Mer84], therefore does not provide power to manipulate attributes. Domain algebra, which is proposed in [Mer77], provides a set of operations applied on attributes. They can be classified into two groups: horizontal and vertical. Figure 2.1 provides a summary of the domain algebra.

 

 

                       Figure 2.1: Summary of domain algebra

 

     It is necessary to declare a virtual domain when we make use of the domain algebra.

Virtual domains provide a flexible way to manipulate domains independent of the context of relations. They are declared on a set of actual domains or virtual domains that are subsequently based on actual domains. And they are actualized when they appear in the relational algebra, based on the actual domains' data in the source relation.

     Horizontal operations are used to manipulate individual tuples, while vertical operations are applied over sets of tuples. In the following example, we show the user a horizontal operation and a vertical operation of domain algebra, respectively.

Example:

 

> let SUBTOT be PRICE*QTY   (horizontal - works along a tuple)

> let TOT be red + of SUBTOT   (vertical - combine values from more than one tuple)

 

A thorough description of domain algebra can be found in [Mer84]. For details of its implementation please refer to [Yua98]. Syntax of them is shown in Appendix A.

 

2.4   Unification of Relational Algebra and Domain Algebra

 

jRelix provides full support for nested relations in which an attribute of a relation could itself be a relation. It not only requires the relational algebra to handle different joins of nested relations, but also implies that the domain algebra should be capable of handling nested attributes that actually are relations. Therefore, with nested relations, domain algebra must absorb relational algebra.

     In the jRelix system, the basic strategy of implementation of the domain algebra for nested relations is to extend the domain algebra, which integrates relational expressions into domain expressions so that the relational algebra can be applied to the nested attributes. Consequently, any operations that are performed on the top level relations can be performed at the lower level attributes which are also relations.

 

Take the relation employees from section 2.1.3.

 

Example 1:

 

> namesal <- [EMP] in employees;

 

The above statement does a projection over nested attribute EMP in relation employees. The result goes as the following:

 

          EMP

        ( NAME         SAL )

          M.Gordon     25000

          P.McConnel   22000

          T.Anastasio  27000

          J.Fishman    24000

          B.Martin     38000

          C.Wood       32000

          J.Medeski    35000

 

Note that there could be duplicate tuples in the result of projection on the nested attribute in jRelix. For instance, if the tuple (M.Gordon, 25000) appears twice, in both department stereo and television, it will be duplicate after the projection over EMP.

 

Example 2:

 

> let SenEmp be [NAME] where SAL >= 30000 in EMP;

> seniors <- [SenEmp] in employess;

 

The first statement defines a virtual attribute SenEmp using domain algebra. It is observed that the relational statement  [NAME] where SAL >= 33000 in EMP

is integrated in the domain expression. Thus we have defined an attribute SenEmp which is a relation itself. The second statement actualizes the nested attribute SenEmp in relation employees with a relational expression. We get the following result:

 

Seniors ( SenEmp )

         ( NAME )

           B.Martin        

           J.Medeski

 

     In jRelix, the domain algebra is implemented to be able to realize almost all the functionalities of relational algebra, leading it to be a union of both traditional domain algebra and relational algebra. For a complete discussion of the extended domain algebra in jRelix please refer to [Yua98].

 

2.5   Computation

 

Computation implements procedural abstraction in jRelix.  The basis of computation can be found in [Mer93]. The formal syntax (details in appendix A) for the declaration of a computation goes as follow:

 

"comp" Identifier "(" ( ParameterList )? ")" "is" ComputationBody

 

We will briefly discuss the computation with examples in this section. Please refer to [Bak98] for a complete explanation of computations in jRelix. An example of the declaration of a computation is shown as follows:

 

> domain D,V,T float;

> comp velocity (D, V, T) is

  { D <- V * T;

  } alt

  { V <- D / T;

  } alt

  { T <- D / V;

  };

 

The computation name is velocity. There are three parameters in this computation, and they are all defined as domains. There are three "alt" blocks in this example. All of them satisfy the constraint V = D / T. Given values for any two of these variables, the value of the third will be calculated according to the constraint.

     The central design principle applied in implementing computation is to make them resemble relations. We show the relation corresponding to velocity as below:

 

VELOCITY ( D       V      T  )

           1       1      1

           2       1      2

           .       .      .

           .       .      .

           34.1    5.5    6.2

           .       .      .

           .       .      .

 

It is an infinite relation, every tuple satisfies the constraint V = D / T. Further more, all tuples satisfy this constraint is included in the relation. The parameters of a computation become the domain of its associated relation.

     As we have already discussed, the full support of nested relations in jRelix means that the attribute of a relation could itself be a relation. This implies that computation can also take relations as its parameters.

 

     Now we discuss the invocation of computations.

 

* Invoking a computation with a select expression

 

We can use a select expression to supply specific values for the input parameters of a computation.

 

Example:

 

> result <- where D = 10 and T = 2 in velocity;

 

result ( D     V     T )

         10    5     2

 

We set parameter D to 10 and T to 2 using a selection clause. This causes these two domains to be the input parameters, so the second block is executed in order to calculate V.

 

* Invoking a computation with array syntax

 

jRelix provides an array syntax ( for details refer to [Hao98] ) which is syntax sugar for a T-selection. This can be used to invoke a computation.

Example:

 

> result <- velocity [10, ,2]

 

The expression velocity [10, , 2] is equivalent to a T-select statement whose projection list consists of only the second parameter of velocity. And the first parameter has the input value 10 and the third parameter has the input value 2. The expression is equivalent to  

 

[V] where D = 10 and T = 2 in velocity

 

Thus the relation result should be

 

result [ V ]

         5

 

* Invoking a computation by joining it with a relation

 

It has been explained how a computation could be thought of as a relation, perhaps an infinite one. Therefore, it makes sense to take a natural join of a computation with a relation.

 

 

 

Example:

 

temprel ( D     T )

          20    10

          15    3

          16    4

 > result <- velocity ijoin temprel;

 

We get the relation result as follow:

 

 result ( D     V     T )

          20    2     10

          15    5     3

          16    4     4

 

     As we have discussed, the parameters of a computation can be thought of as domains of the computation. When two relations are joined together, their common domains are called the join domains. This is retained when a computation is joined with a relation.

Thus D and T are the join domains in this example. The join domains D and T become the input parameters of the computation while the remaining parameters become outputs, so the second "alt" block of velocity is selected for execution.

 

* Stand-alone invocation of computations

 

Computation may also be invoked by means of a top-level call in jRelix taking domains and relations as its parameters or taking no parameters. Thus computation functions just as procedure, which is a top level procedural abstraction implemented in Relix, the predecessor of jRelix. Please refer to [RSL95] for a complete discussion of procedure in Relix. Take the following example:

 

 

> comp AssignComp ( ) is

   { result <- temprel;

   };

> comp JoinComp ( A, B, C ) is

   { C <- A ujoin B;

   };

> AssignComp ( );

> JoinComp ( in oldRecord, in moreRecord, out Record);

 

Here we have defined two computations. The stand-alone invocation of computation AssignComp is a top level call which takes no parameter. The stand-alone invocation of computation JoinComp is also a top level call, taking three parameters where in and out are used to specify the input and output parameters. The statement(s) in the computation body will be executed when the computation is called.

     The unification of computation and procedure leads to a simpler language. In the next chapter, we will present to the user how to use computations to define event handlers.

 

 

Chapter 3

 

User Manual on Updates & Event Handlers

 

In this chapter, we discuss how to use updates for both flat and nested relations, and how to define event handlers in jRelix.

 

3.1   User Manual on Updates

 

3.1.1 Introduction

 

The update operation allows us to change values of specified attributes in certain tuples. These attributes could be selected by a using clause which uses a relational algebra operation to select tuples from the relation we want to update. We can also use updates to add or delete some tuples to or from the relation. The nested update in nested relations will be presented.

 

3.1.2 Syntax and Semantics

 

Update provides the mechanism for changing a relation. There are three basic update operations on relations: add, delete and change. The syntax for update is:

 

UpdateStatement := update Identifier (add | delete) Expression

                        |

             update Identifier change (StatementList)? (UsingClause)?

 

UsingClause := using JoinOperator Expression

            |

using "[" IDList ":" JoinOperator (":")? ExpressionList "]" Expression

 

Here the first two productions add or delete the result of Expression to or from the relation being updated. The semantics of add is the same as that of the incremental assignment. The semantics of delete is the same as that of the djoin. The third production updates part of the relation in the way specified by StatementList. The part of the relation to be updated is the join result (specified by JoinOperator) of the relation being updated and the result of expression in UsingClause. If there is no UsingClause, the whole relation is updated. The StatementList that follows the keyword change may contain update statements. This nested update is used to update values of nested attributes in nested relations.

 

Note that the precedence ordering of update would be more intuitive if the syntax is modified as following:

 

update Identifier (UsingClause)? change (StatementList)?

 

3.1.3 Worked Example of Flat Relation

 

First we show the user how to update tuples in non-nested relations.

 

      CLASS                         RECLASS

      ( ITEM            TYPE )      ( ITEM           TYPE )

        Yarn            a             Yarn           a

        String          a             String         b

        Ball            b             Top            a

        Sandal          c

 

In the following examples assume that CLASS contains the above values every time we enter a new command.

* Example 1:   update CLASS add RECLASS;

 

Result:    CLASS ( ITEM           TYPE )

                   Ball           b

                   Sandal         c

                   String         a

                   String         b

                   Top            a

                   Yarn           a

 

* Example 2:   update CLASS delete RECLASS;

 

Result:     CLASS ( ITEM          TYPE )

                    Ball          b

                    Sandal        c

                    String        a

 

* Example 3:   update CLASS change TYPE <- "B" using ijoin RECLASS;

 

Result:  The result of CLASS ijoin RECLASS is the tuple {Yarn, a}

         The change instruction, TYPE <- "B", changes this to {Yarn, b}

 

         So we have CLASS ( ITEM         TYPE )

                            Ball         b

                            Sandal       c

                            String       a

                            Yarn         b

 

Example 4:   update CLASS change TYPE <- "b" using ijoin ( [ITEM] in RECLASS );

 

CLASS ijoin [ITEM] in RECLASS

( ITEM       TYPE )

  String     a

  Yarn       a

 

Result:    CLASS ( ITEM        TYPE )

                   Ball        b

                   Sandal      c

                   String      B

                   Yarn        B

 

* Example 5:   let NEWTYPE be TYPE;

               update CLASS change TYPE <- NEWTYPE

                      using ijoin ( [ITEM, NEWTYPE] in RECLASS );

 

CLASS ijoin ( [ITEM, NEWTYPE] in RECLASS )

( ITEM        TYPE        NEWTYPE )

  String      a           b

  Yarn        a           a

 

Result: CLASS (  ITEM          TYPE )

                 Ball          b

                 Sandal        c

                 String        b

                 Yarn          a

 

* Example 6:  update CLASS change TYPE <- NEWTYPE

                           using ujoin ( [ITEM, NEWTYPE] in RECLASS );

 

CLASS ujoin ( [ITEM, NEWTYPE] in RECLASS )

( ITEM        TYPE       NEWTYPE )

  Ball        b          dc

  Sandal      c          dc

  String      a          b

  Top         dc         a

  Yarn        a          a

 

Result:  CLASS ( ITEM        TYPE )

                 Ball        b

                 Sandal      c

                 String      b

                 Yarn        a

                 Top         a

NOTE: dc or dk does not replace a known value.

 

* Example 7:  update CLASS change TYPE <- "B" using djoin RECLASS;

 

CLASS djoin RECLASS

( ITEM       TYPE )

  Ball       b

  Sandal     c

  String     a

 

Result:   CLASS ( ITEM        TYPE )

                  Ball        B

                  Sandal      B

                  String      B

                  Yarn        a

 

The following two examples are of global change.

 

Example 8:   update CLASS change TYPE <- "C";

 

Result:   CLASS ( ITEM        TYPE )

                  Ball        C

                  Sandal      C

                  String      C

                  Yarn        C

 

Example 9:   let NEWTYPE be if TYPE = "c" then "B" else TYPE;

             update CLASS change TYPE <- NEWTYPE;

 

Result:  CLASS ( ITEM        TYPE )

                 Ball        b

                 Sandal      B

                 String      a

                 Yarn        a

 

The following example uses a T-selector to select tuples to be updated.

 

Example 10:  update CLASS change TYPE <- "B"

                          using ijoin ( where ITEM = "Yarn" in CLASS );

 

CLASS ijoin ( where ITEM = "Yarn" in CLASS )

( ITEM        TYPE )

  Yarn        a

 

Result:   CLASS ( ITEM        TYPE )

                  Ball        b

                  Sandal      c

                  String      a

                  Yarn        B

               

3.1.4 Worked Example of Nested Relation

 

The following examples show the user how to update tuples in nested relations.

Consider the following relations where relation EMP is a domain of relation employees:

 

 

employees

( DEPT            EMP )

                ( NAME          SAL )

  stereo          M.Gordon      25000

                  P.McConnel    22000

                  T.Anastasio   27000

                  J.Fishman     24000

  television      B.Martin      38000

                  C.Wood        32000

                  J.Medeski     35000

 

NEWEMP ( NAME         SAL )         RETIREEMP ( NAME )

         P.Alger      30000                     M.Gordon

                                                B.Martin

                                                 

We assume that employees and EMP contain the above values every time we enter a new command.

 

* Example 11:   update employees change ( update EMP add NEWEMP )

                using ijoin ( where DEPT = "television" in employees );

 

employees ijoin ( where DEPT = "television" in employees )

( DEPT           EMP )

               ( NAME           SAL )

  television     B.Martin       38000

                C.Wood         32000

                 J.Medeski      35000

 

Result:

employees

( DEPT           EMP )

               ( NAME           SAL )

  stereo         M.Gordon       25000

                 P.McConnel     22000

                 T.Anastasio    27000

                 J.Fishman      24000

  television     B.Martin       38000

                 C.Wood         32000

                 J.Medeski      35000

                 P.Alger        30000

 

* Example 12:  update employees change ( update EMP delete RETIREEMP );

 

Result:

employees

( DEPT           EMP )

               ( NAME           SAL )

  stereo         P.McConnel     22000

                 T.Anastasio    27000

                 J.Fishman      24000

  television     C.Wood         32000

                 J.Medeski      35000

 

* Example 13:   update employees change ( update EMP change SAL <- SAL +10000 )

                using ijoin ( where DEPT = "stereo" in employees );

 

employees ijoin ( where DEPT = "stereo" in employees )

( DEPT           EMP )

               ( NAME           SAL )

  stereo         M.Gordon       25000

                 P.McConnel     22000

                 T.Anastasio    27000

                 J.Fishman      24000

 

 

Result:    

employees

( DEPT           EMP )

               ( NAME           SAL )

  stereo         M.Gordon       35000

                 P.McConnel     32000

                 T.Anastasio    37000

                 J.Fishman      34000

  television     B.Martin       38000

                 C.Wood         32000

                 J.Medeski      35000

 

* Example 14:   update employees change ( update EMP change SAL <- SAL +10000

                using djoin RETIREEMP );

 

EMP djoin RETIREEMP

( DEPT           EMP )

               ( NAME           SAL )

  stereo         P.McConnel     22000

                 T.Anastasio    27000

                 J.Fishman      24000

  television     C.Wood         32000

                 J.Medeski      35000

 

Result:   

employees

( DEPT           EMP )

               ( NAME           SAL )

  stereo         M.Gordon       25000

                 P.McConnel     32000

                 T.Anastasio    37000

                 J.Fishman      34000

  television     B.Martin       38000

                 C.Wood         42000

                 J.Medeski      45000

        

* Example 15:   update employees change ( update EMP change SAL <- SAL +10000

                                          using djoin RETIREEMP )

                using ijoin ( where DEPT = "stereo" in employees);

 

employees ijoin ( where DEPT = "stereo" in employees )

( DEPT           EMP )

               ( NAME           SAL )

  stereo         M.Gordon       25000

                 P.McConnel     22000

                 T.Anastasio    27000

                 J.Fishman      24000

 

Then the above part of EMP (say EMP') is selected for nested update.

 

EMP' djoin RETIREEMP

EMP

( NAME           SAL )

  P.McConnel     22000

  T.Anastasio    27000

  J.Fishman      24000

 

Result:   

employees

( DEPT           EMP )

               ( NAME           SAL )

  stereo         M.Gordon       25000

                 P.McConnel     32000

                 T.Anastasio    37000

                 J.Fishman      34000

  television     B.Martin       38000

                 C.Wood         32000

                 J.Medeski      35000

 

* Example 16:   let NEWSAL be if SAL < 35000 then SAL + 10000 else SAL;

                update employees change ( update EMP change SAL <- NEWSAL );

 

Result:

employees

( DEPT           EMP )

               ( NAME             SAL )

  stereo         M.Gordon         35000

                 P.McConnel       32000

                 T.Anastasio      37000

                 J.Fishman        34000

  television     B.Martin         38000

                 C.Wood           42000

                 J.Medeski        35000

 

 

3.2   User Manual on Event Handlers

 

3.2.1 Introduction

 

Procedural abstraction has been achieved in jRelix in the form of computation. And computation is also implemented to function as procedure which was differently implemented in previous versions of Relix. Therefore, computation is used to define event handlers in jRelix. In this chapter we will present the reader the mechanism by which he/she can define event handlers, properties of event handlers and operations which can be performed on them. Event handlers defined on nested updates will also be demonstrated. Finally, the usage of the undo command within event handlers will be explored.

 

3.2.2 Computation as Event Handlers

 

Event handlers are procedures (computations) and events are system-generated procedural calls. So far, in our implementation, events are generated by updates. But they may arise otherwise, e.g., mouse click, notice from the Internet, or a database read.

     To distinguish event handlers from user-defined computations, we give them special names, constituted according to the following syntax.

 

* Naming Event Handlers

 

The original syntax for declaration of computation goes like:

 

Computation := "comp" Identifier "(" (ParameterList)? ")" "is" ComputationBody

 

We modify the syntax as follows so that we can define a computation as an event handler:

 

Computation := "comp" CompName "(" (ParameterList)? ")" "is" ComputationBody

 

where

 

CompName := Identifier | EventName

 

EventName := (prefix":")?action":"relation("["attribute-list"]")?

 

Now we will discuss each component of the EventName.

 

* prefix

 

Prefix is an optional part of the name, could be either pre or post. If prefix is omitted, it is post by default. Pre implies that the event handler would be executed before the execution of the event which triggers it. Post implies that the event handler would be executed after the execution of the event which triggers it. The pre or post event handler is coupled with the event which triggers it without interruption.

 

* action

 

Action specifies by what kind of update operation on the relation would the according event handler be activated. There are three possible types of action: add, delete, or change.

 

* relation

 

Relation identifies the name of the relation to be updated.

 

* attribute-list

 

The optional attribute-list is a list of attributes of the preceding relation. For event handlers with action add or delete, the attribute-list is always omitted. For event handlers with action change, the attribute-list could be any subset of the original attributes, including the empty subset, which means change action on any attribute(s) will trigger the event handler. Attributes in the attribute-list are separated from one another by comma.

 

Here are some examples of valid event names for a relation R with attributes a, b, c:

 

pre:add:R

post:delete:R

add:R (equivalent to post:add:R)

delete:R (equivalent to post:delete:R)

pre:change:R[a,b]

post:change:R[a,b,c]

post:change:R

change:R[a] (equivalent to post:change:R[a] )

 

 

The following are not valid event names:

 

pre:add:R[a]

post:delete:R[b,c]

 

Event names and handlers are related one to one. There will not be any two or more event handlers with the same event name.

 

* Defining Event Handlers

 

Because computation is used to define event handler, defining an event handler is just like defining a computation but with a special event-name.

 

comp event-name ( ) is

{

statements;

}

Note that event handlers do not take parameters because they are system-generated procedural calls without the user intervention.

 

3.2.3 Cascading Event Handlers

 

As shown above, in the body of event handler, there could be several statements including updates which may have their own event handlers. This introduces cascading event handlers which will be executed in turn. They would be cascaded as deep as the user wants. The following figure illustrates the cascading situation.

 

 

                  Figure 3.1: Cascading Event Handlers

 

( Solid lines represent pre or post event handler calls. Dotted lines represent statements in an event handler body which in turn trigger cascading event handlers. Event represents an update operation upon a relation. )

 

The sequence of execution can be found by reading the above figure top down:

 

Pre-Event2-Handler

Event2

Post-Event2-Handler

Pre-Event3-Handler

Event3

Post-Event3-Handler

Event1

Pre-Event4-Handler

Event4

Post-Event4-Handler

 

Cascading event handlers are useful in such applications as spread sheets.

 

 

3.2.4 Event Handler On/Off

 

When first defined, an event handler is set to state on which means it will be executed when the according event occurs. If it is set to state off afterwards, then the according event will not trigger the event handler even it still exists in the system. The user can activate it by setting it to state on again.

 

The following are the commands used to set the event handler's state:

 

eventon event-name

 

This command sets the event-handler to state on.

 

eventoff event-name

 

This command sets the event-handler to state off.

 

3.2.5 Printing Event Handlers

 

We use the same pr command that is used for relation to print an event handler.

 

pr event-name

 

This command will print out the definition of the event handler with the specified event-name.

 

3.2.6 Deleting Event Handlers

 

If the user wants to define a new event handler with an existing name, the old event handler should be deleted first; otherwise, the system will prompt a warning and the new definition will not be accepted. We use the same dr command that is used for relation to delete an event handler.

 

dr event-name

 

3.2.7 Trigger and New and Rest

 

For the update operation, the affected relation would be separated into three pieces which are named Trigger, New and Rest. They only exist in the pre and post event handler. These three pieces are useful to achieve the undo operation which will be discussed later. Trigger is defined as being the tuples which will be affected by an update operation on the original relation, or you can call it Old. New is defined as being the new values of those old tuples. Rest is defined as being the tuples which are not affected by an update operation on the original relation. Note that Trigger, New and Rest are always created even when there is no event handler defined for this operation as they could be useful in further development of the system. For example, they could be used to achieve the rollback issued by the user in a later time afte the update operation or recover the system from crash.

 

Considering the following examples:

 

R[X, Y]    P[X, Y]    Q[X, Y]   S[X, Y]

  a  1       a  1       a  3      b  1

  a  2

  b  1

 

     

                                      Trigger        New           Rest

update R change X<-"c" using ijoin P; {(a,1)}        {(c,1)}       {(a,2),(b,1)}

update R add Q;                       {}             Q             R

update R delete S;                    S              {}            R-S              

 

Consider the nested update-add:

 

em

( DEPT           EMP )

               ( NAME )

  tele           a

                 b

  stereo         c

                 d

 

NEWEMP[NAME]

       e

 

update em change (update EMP add NEWEMP) using ijoin where DEPT = "tele"  in em;          

 

    Trigger                New                           Rest

em  {(tele,a),(tele,b)}    {(tele,a),(tele,b),(tele,e)}  {(stereo,c),(stereo,d)}

 

Now let's consider the nested update-change. Take original em.

 

update em change (update EMP change NAME <-  "z" using ijoin where NAME = "a" in

                  EMP) using ijoin where DEPT = "tele" in em ;

 

      Trigger           New               Rest

em    {(tele,a)}        {(tele,z)}        {(tele,b),(stereo,c),(stereo,d)}

 

In our system, Trigger is available to the user as a normal relation. Therefore, the user can apply relational algebra upon Trigger (for examples, refer to the 3.2.9), and use pr command to print out Trigger.

 

pr trigger

 

As the scope of Trigger is within the update statement and its event handlers (if there is any), the commands related to Trigger are only used within the corresponding event handlers. Note that New and Rest are hidden from the user. 

 

3.2.8 Undo Command

 

In jRelix, whenever there is an update event, we produce the Trigger, New and Rest. As mentioned before, their scope is within the event itself and its pre/post mode event handlers. What the update operation does is switch these three pieces so that Trigger is detached from Rest, and New is combined with Rest. This creates the new relation for the update operation. Meanwhile, the Trigger becomes New, and New becomes Trigger.

     We add a new command, undo, in jRelix, which is used in event handler. What undo does is to switch Trigger, New and Rest in which Trigger and New exchange their places again, functioning the same as update operation. Obviously the update operation is reversed by executing command undo.

     If we execute the undo command in the pre-mode event handler which runs before the update event, it just updates the relation as what the update operation does. So we could equally name undo do. And when it comes to run the update operation, it functions as the command undo by switching those three pieces again. All in all, the undo and the update operation both function as a toggle. We illustrate the above explanation in

Figure 3.2.

 

                                 

         

                    Figure 3.2: Undo using Trigger, New and Rest

 

The syntax of the undo command goes as follow:

 

undo

 

Note that the undo is also just available within the corresponding event handlers.

 

3.2.9 Worked Example of Event Handlers

 

In this section we will introduce two examples to show how event handlers are used.

 

Example 1:

 

domain X,Y intg;

relation R(X,Y);

comp pre:change:R() is

{ S <- trigger;

}

update R change X <- 1;

 

By defining the event handler pre:change:R, trigger of R will be assigned to S, a copy of tuples to be changed in R, each time before an update-change operation is applied upon relation R.

 

Example 2:

 

This example shows an extreme case of cascading event handlers, in which event handler does further update on the same relation.

 

domain n integer;

relation start(n) <- {4};

 

comp post:add:iota () is

{ let nmin1 be (red min of n) - 1;

   let n be nmin1;

   update iota add [n] in [nmin1] where nmin1>=0 in iota;

};

update iota add start;

 

By defining the event handler post:add:iota, each time there is an update-add operation upon relation iota, the event handler will be executed after the execution of update-add operation.

     Since there is an update-add statement in the event handler post:add:iota, after executing this statement, it will trigger the event handler post:add:iota again. It seems that the system will not ever stop. But it should not be the case. We have a scheme to stop the update-eventhandler-update infinite loop.

 

Definition: A void update is an update which does no change on the subject relation.

 

Stop Scheme: If an update is a void update, then the cascading event handler should stop.

 

Now we examine how it is applied in the example. From the beginning, relation start is {4}, relation iota is empty. After executing the statement update iota add start, relation iota becomes {4}. Then the event handler post:add:iota is triggered, so it comes into the event handler body to execute those statements within it. The first two are let...be... statements which cause no change in relation iota. The third one is an update-add statement upon relation iota which produces a number whose value is the min value of all tuples in relation iota minus 1 and the condition of adding it to iota is that it should be non-negative. After executing it, relation iota becomes {4,3}. And the event handler post:add:iota is triggered again. It automatically runs through the event handler again following the above procedure. The procedure will repeat for several times until the number produced by the min value of all tuples in iota minus 1 is less then 0. Then the update-add statement in the event handler will not do any change on iota, which means it satisfies the stop condition according to the stop scheme. The following figures illustrate this procedure.

 

 

                         Figure 3.3: Cascading Event Handlers

 

 

 

Chapter 4

 

Implementation of Updates and Event Handlers

 

4.1   Representation of Nested Relations

 

The implementation of nested relation is built upon flat relations. When a non-primitive type of attribute is defined, a relation of the same name, but with a prefix ".", is automatically created by the system and hidden from the user. The new relation consists of the attributes which make up the non-primitive type of domain, as well as one additional attribute .id. This is the surrogate field used to link up the upper level relation with nested non-primitive attributes.

     The relation employees from section 2.1.3 will serve as an example. The declaration of the nested attribute EMP leads to the creation of the relation .EMP which is initially empty. The creation of nested relation employees causes tuples to be added to .EMP. Surrogates are stored under the EMP field of employees.

     Note that in relation employees, surrogate 1,2 of the nested attribute EMP link the value 1, 2 of attribute .id in the relation .EMP, shown as the following:

 

employees ( DEPT       EMP )

            stereo     .1

            television .2

 

.EMP ( .id           Name            Salary )

       .1            M.Gordon        25000

       .1            P.McConnel      22000

       .1            T.Anastasio     27000

       .1            J.Fishman       24000

       .2            B.Martin        38000

       .2            C.Wood          32000

       .2            J.Medeski       35000

 

Refer to [Hao98, Yua98] for more details about implementation of nested relations.

 

4.2   Implementation of Updates

 

In this section we will discuss the algorithm for update operations. There are three ways to update a relation: add some tuples, delete some tuples, or change the values of some specific tuples. We call them update-add, update-delete, and update-change, respectively. The update-add is just syntax sugar for the incremental assignment while the update-Delete is just syntax sugar for djoin. For update-change, it could be a global change or a partial change. Considering nested update-change, there could be update-add, update-delete or update-change nested in the upper level update-change, as deep as the user wants. For syntax refer to Appendix A.

     Essentially, we create three pieces of relation for the update operation, which are Trigger(Old), New and Rest. To do the update, whether it's update-add, update-delete, or update-change, we just switch these three pieces to produce the updated relation. This also applies to undo operation. Both update and undo function like a toggle.

     The nested updates will not do any change on the upper level relation. So, for implementation, to be able to keep a copy of the old relation and make the undo operation, we just keep track of the Trigger and New and Rest down to the lowest level relation.  Take the example of nested update-add from section 3.2.7:

 

update em change (update EMP add NEWEMP) using ijoin where DEPT = "tele" in em;          

 

We don't keep the Trigger, New and Rest of em (which is the top level relation), but those of EMP (which is the nested relation to be changed).

 

      Trigger          New               Rest

EMP   {}               {(.1,e)}          {(.1,a),(.1,b),(.2,c),(.2,d)}

 

Now let's consider nested update-change.

 

update em change (update EMP change NAME <-  "z" using ijoin where NAME = "a" in

                  EMP) using ijoin where DEPT = "tele" in em ;

 

We get Trigger and New and Rest of EMP, just as we do for the nested update-add.

 

      Trigger               New                 Rest

EMP   {(.1,a)}              {(.1,z)}            {(.1,b),(.2,c),(.2,d)}

 

We only need to keep these three pieces according to the lowest level relation which is actually changed. This makes undo command easy by just switching Trigger and New with Rest, like a toggle.

     The structure of the update statement is kept in a tree according to the syntax. We refer to the tree whenever we need to analyze the statement.

 

There are three major components of the algorithm:

 

* EvaluateUpdate

* ExecuteUpdate

* SwitchTrigger

 

 

4.2.1 EvaluateUpdate

 

4.2.1.1 Algorithm of EvaluateUpdate

 

EvaluateUpdate is the function that creates Trigger, New and Rest for the update operation upon a relation.

     Here are the explanations of the variables in the algorithm:

* upper_level_trigger: the input parameter of the function, which is the affected part of the upper level relation for a nested update; set to the first level relation when the function was firstly called

* cur_level_rel: the current level relation to be updated

* ad_rel: the relation to be added to the original relation

* del_rel: the relation to be deleted to the original relation

* using_rel: the relation appearing in the using clause

* using_join: the join type appearing in the using clause

* temp_rel: the temporary relation used in the algorithm

* Trigger: the affected part of the original relation

* New: the new values of the tuples affected in the original relation

* Rest: the unaffected tuples of the original relation

 

The algorithm of EvaluateUpdate( upper_level_trigger ) goes as follows:

 

* analyze the statement tree to get the action_type of the current level update

* if action_type equals to ADD

  * if it is the first level

    * set New to ad_rel

* else

  * set temp_rel to the result of upper_level_trigger ijoin

      cur_level_rel on the surrogate

      * for each surrogate in temp_rel

        * for each tuple in ad_rel

          * add surrogate to the tuple

          * add the tuple into New

          * set Trigger to empty    

    * set Rest to cur_level_rel

* if action_type equals to DELETE

  * if it is the first level

    * set temp_rel to cur_level_rel

  * else

    * set temp_rel to the result of upper_level_trigger ijoin cur_level_rel

    * set Trigger to the result of temp_rel ijoin del_rel

    * set New to empty

    * set Rest to the result of temp_rel djoin del_rel

* if action_type equals to CHANGE

  * if there is a UsingClause

    * get the using_rel and using_join from the UsingClause

    * set temp_rel to the result of cur_level_rel using_join using_rel

  * else

    * set temp_rel to cur_level_rel

  * if it is the first level

    * set Trigger to temp_rel

  * else

    * set temp_rel to the result of upper_level_trigger ijoin temp_rel

    * set Trigger to temp_rel

  * set Rest to the result of cur_level_rel djoin Trigger

  * for each statement in current level

    * if this statement is of type ASSIGNMENT

      * for each tuple in Trigger

* replace the value of the domain with new value and add the tuple into

  New

* adjust Trigger, New and Rest so that the Trigger only includes the 

  tuples actually being changed

    * if this statement is of type UPDATE

      * call EvaluateUpdate( temp_rel )

 

The algorithm will recursively call itself until it arrives at the deepest nested update operation and then create Trigger, New and Rest accordingly.

 

4.2.1.2 Example of EvaluateUpdate

 

Take the relation employees and RETIREEMP from section 3.1.4 as the example.

 

update employees change ( update EMP change SAL <- SAL +10000

                          using djoin RETIREEMP )

                 using ijoin ( where DEPT = "stereo" in employees);

 

* The system calls EvaluateUpdate ( employees )

* The action_type of current level (first level) update is CHANGE

* There is a UsingClause in the current level

* The using_rel is the relation got from (where DEPT = "stereo" in employees)

* The using_type is ijoin

* Set temp_rel to the result of  employees ijoin (where DEPT ="stereo" in employees), so temp_rel is {(stereo, .1)}

* Set Trigger to temp_rel, so Trigger is {(stereo, .1)}

* Set Rest to employees djoin Trigger, so Rest is {(television, .2)}

* The statement of current level is:

      update EMP change SAL <- SAL +10000 using djoin RETIREEMP

* This statement is of type UPDATE, so call EvaluateUpdate (temp_rel)

  * The upper_level_relation is {(stereo, .1)}, got from the input parameter

  * The action_type of current level (second level) update is CHANGE

  * There is a UsingClause in the current level

  * The using_rel is the relation RETIREEMP

  * The using_type is djoin

  * Set temp_rel to the result of  EMP djoin RETIREEMP,

    so temp_rel is {(.1          P.McConnel     22000)

                    (.1          T.Anastasio    27000)

                    (.1          J.Fishman      24000)

                    (.2          C.Wood         32000)

                    (.2          J.Medeski      35000)}

* This is not the first level, so set temp_rel to the result of      

  upper_level_trigger ijoin temp_rel, so temp_rel is

    {(.1          P.McConnel      22000)

     (.1          T.Anastasio     27000)

     (.1          J.Fishman       24000)}

  * Set Trigger to temp_rel, so Trigger is {(.1     P.McConnel     22000)

                                            (.1     T.Anastasio    27000)

                                            (.1     J.Fishman      24000)}

  * Set Rest to the result of cur_level_rel djoin Trigger,

    so Rest is {(.2          C.Wood          32000)

                (.2          J.Medeski       35000)}

  * The statement of current level is: SAL <- SAL +10000

  * The statement is of type ASSIGNMENT

* For each tuple in Trigger, replace the old value of domain SAL with new  

  value which is 10000 larger than the old one, and add the tuple in New.

    So New is {(.1         P.McConnel       32000)

               (.1         T.Anastasio      37000)

               (.1         J.Fishman        34000)}

  * Adjust Trigger, New and Rest if necessary.

  * Finish

 

 

4.2.2 ExecuteUpdate

 

4.2.2.1 Algorithm of ExecuteUpdate

 

ExecuteUpdate is the function that really updates the relation by switching Trigger, New and Rest produced by function EvaluateUpdate.

 

The algorithm goes as follows:

 

ExecuteUpdate( )

 

* analyze the statement tree to get the action_type of current level update

* if the action_type equals to ADD or DELETE

  * call SwitchTrigger ( )

* if the action_type equals to CHANGE

  * for each statement in current level

    * if the statement is of type ASSIGNEMNT

      * call SwitchTrigger( )

    * if the statement is of type UPDATE

      * call ExecuteUpdate ( )

 

The algorithm will recursively call itself until it arrives at the deepest nested update operation and then switch Trigger, New and Rest accordingly.

 

4.2.2.2 Example of ExecuteUpdate

 

Following the above example, we have got Trigger, New and Rest by calling EvaluateUpdate( ), and now we want to actually get the result of the update operation:

 

 

update employees change ( update EMP change SAL <- SAL +10000

                          using djoin RETIREEMP )

                 using ijoin ( where DEPT = "stereo" in employees);

 

* The system calls ExecuteUpdate( )

* The action_type of current level (first level) update is CHANGE

* The statement of current level is:

      update EMP change SAL <- SAL + 10000 using djoin RETIREEMP

* The statement is of type UPDATE, so call ExecuteUpdate ( )

  * The action_type of current level (second level) update is CHANGE

  * The statement of current level is: SAL <- SAL + 10000

  * The statement is of type ASSIGNMENT

    * Call SwitchTrigger ( )

    * Finish

 

4.2.3 SwitchTrigger

 

4.2.3.1 Algorithm of SwitchTrigger

 

StwitchTrigger is the function that switchs Trigger, New and Rest which are produced by EvaluateUpdate. It is called by the function ExecuteUpdate.

 

The algorithm goes as follows:

 

SwitchTrigger ( )

 

* set the updated relation to be the result of New ujoin Rest

* store Trigger to a temporary relation

* set Trigger to New

* set New to that temporary relation

 

The algorithm that creates the updated relation by toggling Trigger and New with Rest is quite straightforward. We use the same algorithm to achieve undo operation. It is obvious that if we switch Trigger and New again with Rest after the update operation, we actually get the original relation, thus undo the update operation.

 

4.2.3.2 Example of SwitchTrigger

 

Based on the above example, we have got Trigger, New and Rest as follows:

 

 

Trigger:  {(.1          P.McConnel      22000)

           (.1          T.Anastasio     27000)

           (.1          J.Fishman       24000)}

 

Rest:  {(.2          C.Wood          32000)

        (.2          J.Medeski       35000)}

 

New: {(.1         P.McConnel       32000)

      (.1         T.Anastasio      37000)

      (.1         J.Fishman        34000)}

 

Now it is time to make the change:

 

* New ujoin Rest -> updated relation

So the updated relation is:

{(.1         P.McConnel      32000)

 (.1         T.Anastasio     37000)

 (.1         J.Fishman       34000)

 (.2         C.Wood          32000)

 (.2         J.Medeski       35000)}

 

 

* Trigger -> temporary relation

So the temporary relation is:

{(.1          P.McConnel      22000)

 (.1          T.Anastasio     27000)

 (.1          J.Fishman       24000)}

 

* New -> Trigger

So Trigger becomes:

{(.1         P.McConnel      32000)

 (.1         T.Anastasio     37000)

 (.1         J.Fishman       34000)}

 

* temporary relation -> New

So New becomes:

{(.1          P.McConnel      22000)

 (.1          T.Anastasio     27000)

 (.1          J.Fishman       24000)}

 

4.3   Implementation of Event Handlers

 

In this section we will discuss the way we implement event handlers and explain the syntax change we made to support event handlers. We will also describe the algorithm used to select proper event handlers for an update operation. Finally we will explain how event handlers are combined with the implementation of updates.

 

4.3.1 Computation as Event Handlers

 

Consider relation R (X, Y, Z),

                     a  b  1

                     c  d  2

we define the following event handler:

 

 

 

> comp post:change:R[X] ( ) is

{

statements;

};

 

Since event handlers are computations, they are stored as regular computations, both in the system table and on disk (refer to Chapter 2). We add a new attribute, Active, in the system table .rel. It is used to indicate whether an event handler is active (1) or not (0). The user can change the event handler's state by using the command eventon or eventoff. For relation and regular computation, the attribute Active has no effect, so always set to 0.

     Now the system table .rel will look like this:

 

.rel ( name             type     nattr    sort     ntuples      active)

       R                rel      3        1        2            0

       post:change:R[X] comp     0        0        0            1

 

And the event handler is stored on disk as post:change:R[X].comp.

 

Note that the prefix of EventName is optional. If it is omitted, we take it as post.

 

     Next we will discuss how the proper event handlers are selected to execute when an update operation occurs.

 

4.3.2 Algorithm of ExecuteEventHandler

 

ExecuteEventHandler is the function used to execute proper event handlers according to the update operation. The function takes two input parameters: one is prefix, the other is event_name.

 

* prefix: pre or post; when the function is called before the update operation, it is set to pre; when the function is called after the update operation, it is set to post.

* event_name: the event's name, indicating the action type (add/delete/change), relation name and affected attributes of the update operation.

 

The event handlers both satisfying the prefix and event_name will be put into selection_list. The algorithm goes as follows:

 

ExecuteEventHandler ( prefix, event_name )

* get the action_type from event_name

* get the relation_name from event_name

* get the attribute_list from event_name; if action_type is add/delete, attribute_list must be empty.

* set query1 to be prefix, set query2 to be (action_type + relation_name), set query3 to be attribute_list.

* search the system table .rel, get those computations with attribute Active set to 1 (which must be event handlers )

* for each selected computation, analyze their name.

* separate the name to three parts: mode(pre or post), event and attrlist; if 

  the name do not have mode, set mode to post

  * if mode equals to query1

    * if action_type of query2 is ADD/DELETE

      * if event equals to query2

        * add the event handler name to the selection_list

    * if action_type of query2 is CHANGE

      * if event equals to query2 and if attrlist is the subset of query3

        * add the event handler name to the selection_list

* for each event handler in the selection_list

  * execute the event handler

 

4.3.3 Example of ExecuteEventHandler

 

Now we examine how the selection of event handlers is done when we execute the following update operation:

 

> update R change X <- "e";

 

Assuming we have defined six event handlers. The system table .rel goes as follows:

 

.rel ( name              type    nattr    sort    ntuples     active)

       R                rel     3        1       2           0

       change:R[Z]       comp    0        0       0           1

       pre:add:R         comp    0        0       0           1

       pre:change:R      comp    0        0       0           1

       pre:change:R[X]  comp    0        0       0           1

       pre:change:R[X,Y] comp    0        0       0           1

       post:change:R[X] comp    0        0       0           1

       post:change:R[Y] comp    0        0       0           0

 

Before the update statement is executed, the system calls:

ExecuteEventHandler ("pre", "change:R:[X]")

 

* action_type of event_name is CHANGE

* relation_name of event_name is R

* attribute_list of event_name is X

* set query1 to be pre, set query2 to be change:R, query3 to be X get all

  computations with attribute active set to 1, which are change:R[Z], pre:add:R,                    

  pre:change:R, pre:change:R[X,Y], post:change:R[X]

* for change:R[Z], because it does not have mode, set mode to post, set event to 

  change:R, set attrlist to Z

  * mode does not equal to query1, so the event handler is not added to 

    selection_list

* for pre:add:R, set mode to pre, set event to add:R, set attrlist to empty

  * mode equals to query1

  * action_type of query2 is CHANGE

  * event does not equal to query2, so the event handler is not added to

    selection_list

* for pre:change:R, set mode to pre, set event to change:R, set attrlist to  

  empty

  * mode equals to query1

  * action_type of query2 is CHANGE

  * event equals to query2 and attrlist is the subset of query3, so the event

    handler is added to selection_list

* for pre:change:R[X], set mode to pre, set event to change:R, set attrlist to 

  X,

  * mode equals to query1

  * action_type of query2 is CHANGE

* event equals to query2 and attrlist is the subset of query3 (here attrrlist

  equals to query3), so the event handler is added to selection_list

* for pre:change:R[X, Y], set mode to pre, set event to change:R, set attrlist

  to X, Y

  * mode equals to query1

  * action_type of query2 is CHANGE

  * event equals to query2 but attrlist is not subset of query3, so the event

    handler is not added to selection_list

* for post:change:R[X], set mode to post, set event to change:R, set attrlist to

  X

  * mode does not equal to query1, so the event handler is not added to

    selection_list

* there are two event handlers, pre:change:R and pre:change:R[X], in the

  selection_list

  * execute event handler pre:change:R

  * execute event handler pre:change:R[X]

 

Finishing the event handler's execution, the system will run the update statement. After that, the system will call:

 

ExecuteEventHandler ("post", "change:R:[X]")  

 

The similar procedure goes through to select the event handler post:change:R[X] to execute. Note that in the body of event handler, there could be update statements which in turn have their own event handlers; namely, cascading event handlers.

     In the next section, we will introduce the combination of update and event handler algorithm by which the update operation is coupled with event handler closely and neatly, and the cascading event handler is handled by the system's recursive execution.

 

4.4   Combination of Updates and Event handlers

 

To combine updates and event handlers, we just need to insert the function call of ExecuteEventHandler into ExecuteUpdate as follows:

 

ExecuteUpdate ( )

 

* ExecuteEventHandler("pre", event_name)

* analyze the statement tree to get the action_type of current level update

* if the action_type equals to ADD or DELETE

  * call SwitchTrigger ( )

* if the action_type equals to CHANGE

  * for each statement in current level

    * if the statement is of type ASSIGNEMNT

      * call SwitchTrigger( )

    * if the statement is of type UPDATE

      * call ExecuteUpdate ( )

* ExecuteEventHandler("post", event_name)

 

Reading an update statement, the Interpreter first calls the function EvaluateUpdate( ), then calls ExecuteUpdate( ).

     In the function ExecuteEventHandler, proper event handlers will be found and executed. Executing the statements in an event handler body is an indirect recursive call to the interpreter. When the interpreter reads an update statement from the event handler body, it will call EvaluateUpdate ( ) and ExecuteUpdate ( ) again. Thus we achieve the cascading event handler recursively.

     We illustrate the whole procedure as the following figure (dotted line shows indirect call to the interpreter):

 

 

                   Figure 4.1: Coupling of Update and Event Handler

 

 

4.5   Setting Event Handlers On/Off

 

The algorithm of eventon and eventoff is quite straightforward. To implement these two commands, we introduce the following syntax:

 

EventOn : = eventon EventName ( "[" IDList "]" )?

 

EventOff : = eventoff EventName ("[" IDList "]" )?

 

 

The algorithm goes as follows:

 

* separate the EventName and IDList

* search the system table .rel, get a list of event handlers whose name equals to EventName and whose attribute list equals to IDList

* for each entry in the selected event handler list

  * set Active to requested state

 

4.6   Deleting Event Handlers

 

We use the same dr command used to delete relation to delete event handler. Once the Interpreter detects that the dr command is followed by an event name, the following algorithm is executed:

 

* get the event handler to be deleted from the system table .rel

* remove it from .rel

 

4.7   Printing Event Handlers

 

We use the same pr command used to print relation to print event handler. Once the Interpreter detects that the pr command is followed by an event name, the following algorithm is executed:

 

* get the event handler to be printed from the system table .rel

* print the source code of the event handler

 

 

 

Chapter 5

 

Conclusion

 

In this chapter we first summarize the work presented in this thesis. Then we discuss some future work to do on updates, event handlers and transaction control.

 

5.1   Conclusions

 

We have implemented Updates and Event Handlers in jRelix based on a nested relational model. In our implementation, update statements can handle changes applied on either atomic-valued attributes of flat relation or relation-valued attributes of nested relations. We provide for the user a uniform syntax for updating both flat relations and nested relations.

     We have used jRelix computations to perform event programming in jRelix. By modifying the syntax of computation name, we can accept an event name as a computation name; thus, define an event handler using computation. As we presented, computation can be unified with procedure (refer to [Elk96] for details of procedure and defining event handler with procedure in Relix), leading to a simpler language.

     According to our definition, events are nothing more than system-generated procedure calls, and event handlers are procedures invoked by corresponding events. It is simpler to implement the event handler execution model based on this definition and more intuitive to use it (refer to section 1.2.3).

 

5.2   Future Work

 

In this section, we will discuss some issues about further improvement on jRelix. First we discuss the integration of computation in an update statement. Then we discuss how to add condition evaluation into our event handler execution model. Finally we move on to the topic of transaction control in jRelix.

 

* Computation in Updates

 

Theoretically, computation can appear after the key word change in the update statement.

As of this writing, we have partially implemented the combination of updates and computations. Consider the following example:

 

>domain DEP, B intg;

>domain DEPOSIT comp(DEP);

>domain BALANCE comp(B);

>comp ba(BALANCE, DEPOSIT) is

  state bal intg;

  state oldbal intg;

  { comp DEPOSIT(DEP) is

     {  oldbal <- bal;

         bal <- bal + DEP;

      } alt

     {  DEP <- bal - oldbal;

     }

  };

  comp BALANCE (B) is

  { B <- bal;

  };

  bal <- 0;

};

>relation accts(ACCNO, CLIENT) <- {(1729, "pat"), (4104. "suz")};

>accounts <- accts ijoin ba;

>update accounts change DEPOSIT(100) using where ACCNO = 4104 in accounts;

 

In the above example, the computation ba contains two nested computations as its domains: DEPOSIT and BALANCE, which are two methods of ba. Note that there are two state variables that are defined within ba and used by the two methods. The relation accounts is instantiated by the join of acct and ba, and thus all the tuples in the relation accounts have their own copy of the two state variables. Actually state variables are implemented as hidden domains of the relation. Please refer to [Bak98] for details of stateful computations.

     The above update statement will make a deposit of $100 to the account 4104. We do this by the following two steps: 

 

* Find the corresponding statements in the computation body of DEPOSIT.

* Execute each statement for each selected tuple.

 

In this case, for the selected account 4104 we first assign the value of the hidden domain bal to oldbal, and then increase the value of domain bal by 100.

     Due to time limitation, our implementation can only accept parameters of integer type. More work needs to be done on the implementation and testing of such statements.  Obviously, the combination of updates and computations improves the flexibility and functionality of jRelix.

 

* Condition Evaluation 

 

Comparing to the ECA model, current implementation of event handler in jRelix does not explicitly have a condition evaluation scheme. In the current model, we only have two components: the events, which are system-generated procedure calls, and the event handlers, which are procedures performing actions in response to the events. Every time an event occurs, the corresponding event handler(s) will be invoked without evaluating a specific condition.

     There are two alternative ways to integrate condition evaluation into our model. One way is that we evaluate the condition within the event handler, in which case the current model is capable of handling it and need no modification. The other way is that we add a condition evaluation component in our model which will improve the explicitness of the model.  To do this, first we need to modify the syntax of the definition of event handler. We suggest the condition part be integrated with event name of the event handler. The syntax can go as follows:

 

ComputationName : = Identifier

                          |

                    EventName ( "[" IDList "]" )? (":" ConditionExpression)?

 

ConditionExpression : = "when" Expression

 

The following expressions can be valid condition expressions:

 

when quantity < 5;

when (salary < 10000) and (DEPT = "stereo");

when ([]  where mark > 85 in Record);

 

Thus we can define the following event handler:

 

comp post:change:inventory[quantity]:when quantity < 5 ( ) is

{

update inventory change quantity <- quantity + 20;

update stock change quantity <- quantity - 20;

}

 

If one or more items are sold, the quantity of the item in the inventory will be changed. Only when the quantity goes below 5 will the above event handler be activated and transfer 20 items from stock to inventory.

     We suggest the condition evaluation be performed within the procedure

 

ExecuteEventHandler ( prefix, event_name )

* get the action_type from event_name

* get the relation_name from event_name

* get the attribute_list from event_name; if action_type is add/delete, attribute_list must be empty.

* set query1 to be prefix, set query2 to be (action_type + relation_name), set query3 to be attribute_list.

* search the system table .rel, get those computations with attribute Active set to 1 (which must be event handlers since only event handler could set Active to be 1)

* for each selected computation, analyze their name.

  * separate the name to four parts: mode("pre" or "post"), event, attrlist and 

  condition; if the name does not have mode, set mode to "post"

  * if mode equals to query1

    * if action_type of query2 is ADD/DELETE

      * if event equals to query2

        * if condition evaluates to true

          * add the event handler name to the selection_list

    * if action_type of query2 is CHANGE

      * if event equals to query2 and if attrlist is the subset of query3

        * if condition evaluates to true

          * add the event handler name to the selection_list

  * for each event handler in the selection_list

    * execute the event handler

 

The modifications on the original algorithm are indicated by bold font.

     Taking the above example, when we execute the following statement:

 

update inventory change quantity <- quantity - 1;

 

we first decrease the quantity of the inventory by 1; then we find the corresponding event handler based on the event name and prefix, which is

 

comp post:change:inventory[quantity]:when quantity < 5 ( )

 

We retrieve the condition part of the event handler, when quantity < 5, and evaluate it. If the condition evaluates to true, we will invoke this event handler; otherwise, not. The algorithm can be improved for efficiency. If the change is to increase the quantity or it is an add operation, we do not need to check this condition. The same applies to other update operations if we pay additional attention to the specific conditions.  

The algorithm we adopt here corresponds to the immediate coupling mode (both EC coupling and CA coupling) in which case the event detection, condition evaluation and action execution all occur immediately one after another in the triggering transaction.

 

More research needs to be done on what can be set as conditions, and how to sensibly evaluate them.

 

* Transaction Control

 

In an active database system, an update could trigger multiple event handlers. Event handlers may contain update statements which in turn invoke other event handlers. Each of these event handlers can be considered as a sub-transaction and the way they are cascading is unforeseeable as they can be independently defined in different applications. This introduces cascading transaction. We will discuss more of it in the following section.

 

The current implementation of jRelix does not support transaction control. The essence of the concept of transaction is consistency, which requires that the actions be taken as a group to transform the state of the database correctly. This implies the concept of atomicity, which enforces all the actions in a transaction either all happen or none happen. Partial execution of a transaction may leave the database in an inconsistent state.

     We suggest that a transaction in jRelix be defined by using "begin_T" and "end_T" to group statements together, shown as follows:

 

begin_T

{

     statements

}end_T

 

Once the interpreter encounters "begin_T" it should impose locks on all the relations affected in the transaction and execute the grouped statements. If "end_T" is reached and all operations succeed, changes upon relations should be committed and all the locks imposed by the transaction should be released. If an operation in the transaction fails or a "abort_T" command is encountered, the changes applied on the relations within the transaction should be undone and all the corresponding locks should be released.

     We have implemented an undo command which can be used to reverse the changes applied on the relation by an update statement. In current implementation, we only support to use this command within the event handler of the corresponding update statement. But the idea of undo can be applied upon the transaction level as a recovery mechanism of transaction control. We can expand the scope of the Trigger, New and Rest from statement to transaction, thus when a transaction is aborted, we can revert all the changes happened within the transaction by switching all Trigger pieces and New pieces with Rest pieces.

     There are some other issues, such as locking scheme and concurrency control, to be addressed in the future implementation of transaction control. We can take advantage of the two versions of data that are kept, Trigger and New, to improve the concurrency of transactions.

     One major source of concurrency bottlenecks is queries. Query transactions typically run much longer than update transactions and they access lots of data. So if they run using two-phase locking - which means a transaction must get all of its locks before releasing any of them (refer to [BN87] for details on two-phase locking) - they often set many locks and hold these locks for a long time, creating long delays of update transactions.

     If no read locks are set and writes still obey two-phase locking, the queries will not slow down the execution of update transactions. Since we keep two versions of data, the old version of data will not be overwritten until the next update applies on the same data. The query transaction which begins before the update transaction commits can read the old version data and the query transaction which begins after the update transaction commits can read the new version of data (time-stamping the transactions), both of which do not need to set read locks on the data. Thus the query transaction will not block the update transaction. The update transactions are still serializable with respect to each other as long as they obey two-phase locking.

     The anomaly of unrepeatable read will occur under above assumption. Consider the following example:

 

E:       r6 w7 c7    r6 w8    r6 c6 c8

R:    R1          R2       R3

 

where E represents the execution of three transactions 6, 7 and 8. R represents a data item and R1, R2 and R3 are three versions of R. w means write, r means read and c means commit. Suppose these transactions all operate on the same data item R.

     The first r6 reads the current version of data item R, say R1. The second r6 still reads R1 ignoring w7's effect because transaction 6 begins before transaction 7 commits. The third r6 should still read R1. But when transaction 8 produces R3, R1 will be discarded in our current model because only two latest version of R will be kept. Thus the third r6 can not access R1 which causes the unrepeatable read problem. 

     There are several ways to solve this problem:

* Restrict repeating reads on the same data item in one transaction

* Set read locks: query transactions also obey two-phase locking.

* Multi-version data: each data item consists of a sequence of versions, each version corresponding to each update that was applied to it.

 

More research is needed on cascading transaction control, especially the concurrency among them, which is a nontrivial topic.

 

 

Appendix A

 

Backus-Naur Form for jRelix Commands

 

This appendix documents jRelix grammar/syntax in the Backus-Naur Form (BNF) format. Most of the grammar is taken from Biao's thesis [Hao98] and we made some modifications for grammar of update (refer to section 3.1.2) and event handler (refer to section 3.2.2.1).

     The convention of this BNF definition is explained in the following figure.

 

                              Figure A.1: BNF convention

 

     The grammar is created from the grammar specification (in file Parser.jjt), using jjdoc which is the JavaCC documentation generator. Because JavaCC is a top-down parser, left-recursion is not allowed in the grammar specification.

     There are five token definitions: <EOF> for end-of-file; <IDENTIFIER> for identifier; <INTEGER_LITERAL> for integer constants; <FLOAT_LITERAL> for floating constants; and <STRING_LITERAL> string constants.

Start := Command ";" | Statement ";" | ";" | <EOF>

 

Command := "help" (<IDENTIFIER>)?| "quit" | "input" FilePath | "debug"

| "time" | "trace" | "dd" IDList | "dr" (IDList

| EventName ("[" IDList "]")?)| "pr" (Expression

| EventName ("[" IDList "]")?)| "sd" (<IDENTIFIER>)?

| "sr" (<IDENTIFIER>)?| "sc" <IDENTIFIER> | "srd" | "ssd"

| "ssr" | "undo" | "print" <STRING_LITERAL>

| "eventon" EventName ("[" IDList "]")?

| "eventoff" EventName ("[" IDList "]")?

 

Statement := SequentialStatement

 

SequentialStatement := ParallelStatement("--" ParallelStatement)*

 

ParallelStatement := ChoiceStatement ("||" ChoiceStatement)*

 

ChoiceStatement := PrimaryStatement ("??" PrimaryStatement)*

 

PrimaryStatement := Declaration | Assignment | Update

| ComputationCall | Conditional | ForLoop | WhileLoop

| Exit | DeadLock | Exec | "(" Statement ")"

| StatementBlock

 

StatementBlock := "{" Statement (";" Statement)* (";")? "}"

 

Conditional := "if" Expression "then" Statement

("else" Statement)?

 

ForLoop := ("for" Identifier)? ("from" Expression)?

      ("to" Expression)? ("by" Expression)?

"do" Statement

 

WhileLoop := "while" Expression "do" Statement

 

Exit := "exit"

 

DeadLock := "deadlock"

 

Exec := "exec" Identifier

 

Declaration := "relation" IDList "(" IDList ")" (Initialization)?

| Identifier ("initial" Expression)? "is" Expression ("target" Expression)?

| "domain" IDList Type

| "let" Identifier ("initial" Expression)? "be" Expression

| "comp" CompName "(" (ParameterList)? ")" "is" ComputationBody

 

Initialization := "<-" ("{" ConstantTupleList "}"

| Identifier | FilePath )

| "rcopy" (Identifier | FilePath)

| "is" (Identifier | FilePath)

 

ConstantTupleList := ConstantTuple ("," ConstantTuple)*

 

ConstantTuple := "(" Constant ("," Constant)* ")"

 

Constant := Literal | "{" ConstantTupleList "}"

 

Identifier := <IDENTIFIER>

 

FilePath := <STRING_LITERAL>

 

Assignment := Identifier (AssignOperator Expression

| "[" IDList AssignOperator ExpressionList "]" Expression)

 

AssignOperator := "<-" | "<+"

 

Update := "update" Identifier(UpdateOperator Expression

| "change" (StatementList)? ("using" UsingClause)?

| "[" IDList UpdateOperator ExpressionList "]" Expression )

 

UpdateOperator := "add" | "delete"

 

StatementList :=  Statement ("," Statement)*

 

UsingClause := Identifier

| "(" Expression ")"

| JoinOperator Expression

| "[" ExpressionList ":" JoinOperator (":")? ExpressionList "]" Expression

 

IDList := Identifier ("," Identifier)*

 

ExpressionList := Expression ("," Expression)*

 

Expression := Disjunction

 

Disjunction := Conjunction ("or" Conjunction)*

 

Conjunction := Comparison ("and" Comparison)*

 

Comparison := Concatenation (ComparativeOperator Concatenation)?

 

Concatenation := MinMax ("cat" MinMax)*

 

MinMax := Summation (MinMaxOperator Summation)*

 

Summation := JoinExpression (AdditiveOperator JoinExpression)*

 

JoinExpression := Projection

((JoinOperator Projection

  | "[" ExpressionList ":" JoinOperator (":")?

    ExpressionList "]"  Projection

  )

)*

 

Projection := Projector (("in" | "from") Projection | Selection)

| Selection

 

Projector := (QuantifierOperator)? "[" (ExpressionList)? "]"

 

Selection := Selector | Qselector | Term

 

Selector := ("where" | "when") Expression ("in" | "from") Projection    | "edit" (Projection)? | "zorder" Projection

 

Qselector := "quant" QuantifierList

             (("where" | "when") Expression)?

             ("in" | "from") Projection

 

QuantifierOperator := "." | "%" | "#"

 

QuantifierList := Quantifier ("," Quantifier)*

 

Quantifier := "(" Expression ")" Expression

 

Term := Factor (MultiplicativeOperator Factor)*

 

Factor := UnaryOperator Factor | Power

 

Power := Primary ("**" Power)*

 

Primary := Literal | QuantifierOperator | ArrayElement

| PositionalRename | Identifier | Cast | "(" Expression ")"

| Pick | Eval | Function | IfThenElseExpression

| VerticalExpression

 

ArrayElement := Identifier "[" ArrayIndexList "]"

 

ArrayIndexList := (Expression)? ("," (Expression)?)*

 

PositionalRename := Identifier "(" (IDList)? ")"

 

Cast := "(" Type ")" Primary

 

Pick := "pick" Selection

 

Eval := "eval" Expression

 

Function := FunctionOperator "(" Expression ")"

 

Literal := "null" | "dc" | "dk" | "true" | "false"

| (SignOperator)? (<INTEGER_LITERAL> | <FLOAT_LITERAL>)

| <STRING_LITERAL>

 

SignOperator := "plus" | "-"

 

IfThenElseExpression := "if" Expression "then" Expression

      "else" Expression

 

VerticalExpression := "red" AssoCommuOperator "of" Expression

      | "equiv" AssoCommuOperator "of" Expression

       "by" ExpressionList

      | "fun" OrderedOperator "of" Expression

       "order" ExpressionList

      | "par" OrderedOperator "of" Expression

       ("order" ExpressionList "by" ExpressionList

        | "by" ExpressionList "order" ExpressionList

  )

 

Type := ("boolean" | "bool") | "short" | ("integer" | intg")

| "long" | ("float" | "real") | "double"

| ("string" | "strg" )| "text" | ("statement" | "stmt")

| ("expression" | "expr")

| ("computation" | "comp") "(" IDList ")"

| "(" IDList ")"

 

AssoCommuOperator := ("or" | "|")

| ("and" | "&") | "min" | "max" | "+" | "*"

      | ("ijoin" | "natjoin") | "ujoin" | "sjoin"

 

OrderedOperator := AssoCommuOperator

| "cat" | "-" | "\" | "mod" | "**" | "pred" | "succ"

 

ComparativeOperator := "substr" | "=" | "!="

| ">" | "<" | ">=" | "<="

 

MinMaxOperator := "min" | "max"

 

AdditiveOperator := "+" | "-"

 

JoinOperator := "nop" | MuJoin | ("not")? SigmaJoin

 

MuJoin := ("ijoin" | "natjoin") | "ujoin" | "djoin" | "sjoin"

| "ljoin" | "rjoin" |   "dljoin" | "drjoin"

 

SigmaJoin := ("icomp" | "natcomp") | "eqjoin"

| ("gejoin" | "sup" | "div") | "ltjoin"

| ("lejoin" | "sub") | ("iejoin" | "sep")

 

MultiplicativeOperator := "*" | "\" | "mod"

 

UnaryOperator := "+" | "-" | "not"

 

FunctionOperator := "abs" | "sqrt" | "sin" | "asin"

| "cos" | "acos" | "tan" | "atan" | "sinh" | "cosh"

| "tanh" | "log" | "log10" | "round" | "ceil"

| "floor" | "isknown" | "chr" | "ord"

 

ParameterList := Parameter ("," Parameter)*

 

Parameter := <IDENTIFIER> (":" "seq")?

 

ComputationBody := ComputationVariableDeclaration

ComputationBlock ("alt" ComputationBlock)*

 

ComputationBlock := "{" ComputationStatements "}"

 

ComputationVariableDeclaration :=

      (LocalVariableDeclaration

 | StateVariableDeclaration

 | ComputationalInitialization

)*

 

LocalVariableDeclaration := "local" IDList Type ";"

 

StateVariableDeclaration := "start" IDList Type ";"

 

ComputationalInitialization := IDList "<-" Expression ";"

 

ComputationStatements := CompStatement ((";" CompStatement |

"also" CompStatement))* (";")?

 

CompStatement := Statement | Command

 

ComputationCall := Identifier "(" (CallParameterList)? ")"

 

CallParameterList := CallParameter ("," CallParameter)*

 

CallParameter := ("in" | "out") <IDENTIFIER>

 

CompName := CompIdentifier | EventName ("[" IDList "]")?

 

CompIdentifier := <IDENTIFIER>

 

EventName := (Prefix ":")? Action ":" Identifier

 

Prefix := "pre" | "post"

 

Action := "add" | "delete" | "change"

 

 

 

 

Bibliography

 

[AB+76]     M. M. Astrahan, M. W. Blasgen, D. D. Chamberlin, K. P. Eswaran, J. N. Gray, P. P. Griffith, W. F. King, R. A. Lorie, P. R. McJones, J. W. Mehl, G. R. Putzolu, I. L. Traiger, B. W. Wade and V. Watson. System R: Relational Approach to Database Management. ACM Trans. Database Systems, 1(2), 97-137, 1976.

 

[AB84]      S. Abiteboul and N. Bidoit. Non first normal form relations to represent hierarchically organized data. In Proceedings of 2nd ACMSIGACT/SIGMOD Symposium on Principles of Database Systems, pages 191-200, 1984.

 

[Bak98]     Patrick Baker. Java Implementation of Computations in a Database Programming Language. Master's thesis, McGill University, 1998.

 

[BB+93]     M. Berndtsson and B. Lings. On developing reactive object-oriented databases. IEEE Data Engineering Bulletin, 15(4), pages 31-34, 1992.

 

[BK86]      F. Bancilhon and S. Khoshafian. A calculus for complex objects. In Proceedings 5th of ACM SIGACT-SIGMOD Symposium on Principles of Database System, pages 53-59, 1986.

 

[BN87]      P. A. Bernstein and E. Newcomer. Principles of Transaction Processing. Morgan Kaufmann Publishers, Inc, 1987.

 

[BNR+87]    C. Beeri, S. Naqvi, R. Ramakrishnan, O. Shmueli, and S. Tsur. Sets and negation in logic database language (ldl1). In Proceedings 6th PODS, San Diego, pages 21-37, 1987.

 

[CA+94]     S. Chakravarthy, E. Anwar, L. Maugis, and D. Mishra. Design of Sentinel: An object-oriented DBMS with event-based rules, Information and Software Technology, 39(9), pages 555-568, 1994.

 

[CB99]      T. M. Connolly and C. E. Begg. Database Systems - A Practical Approach to Design, Implementation and Management,

 

[CCS94]     C. Collet, T. Coupaye, and T. Svensen. Naos - Efficient and modular reactive capabilities in an object-oriented database system. Int. Conf. On Very Large Databases, Santiago, Chile, 1994, pages 132-143.

 

[Cod70]     E. F. Codd. A relational model for large shared data banks. Communications of the ACM, 13(6):377-387, 1970.

 

[Cod72a]    E. F. Codd. Database Systems: Further Normalization of the Data Base Relational Model. Prentice-Hall, Edited by R. Rustin, 1972.

 

[Cod72b]    E. F. Codd. Database Systems: Relational Completeness of Data Base Sublanguages. Prentice-Hall, Edited by R. Rustin, 1972.

 

[Cod79]     E. F. Codd. Extending the database relational model to capture more meaning. ACM Transactions on Database Systems, 4:397-434, 1979.

 

[Dat81]     C. J. Date. An Introduction to Database Systems, 3rd Edition. Addison-Wesley, Reading, MA, 1981.

 

 

[Day95]     Umeshwar Dayal. Ten Years of Activity in Active Database Systems: What Have We Accomplished? In Active and Real-Time Database Systems (ARTDB-95), pages 3-22, Springer- Verlag, Skövde, 1995.

 

[DB+88]     U. Dayal, B. Blaustein, A. Buchmann, U. Chakravarthy, M, Hsu, R. Ledin, D. McCarthy, A Rosenthal, and S. Sarin. The HiPAC Project: Combining Active Databases and Timing Constraints. SIGMOD Record, Vol. 17, No. 1, March 1988, 51-70.

 

[DE89]      L. M. L. Delcambre and J. N. Etheredge. The relational Production Language: A production language for relational database. In L. Kerschberg, editor, Expert Database Systems - Proceedings from the Second International Conference, pages 333-351. Benjiamin/Cummings, Redwood City, California, 1989.

 

[DH+94]     U. Dayal, H. Y. Hwang, F. Manola, A. Rosenthal, and J. M. Smith. Knowledge-Oriented Database Management. Phase 1 Final Technical Report, Computer Corporation of America, Cambridge, MA, Aug. 1994.

 

[DHL90]     U. Dayal, M. Hsu and R. Ladin. Organizing Long-Running Activities with Triggers and Transactions. In Proc. Of the 1990 ACM SIGMOD Intl. Conf. on Management of Data, Atlantic City, NJ, May 1990.

 

[EC75]      K. P. Eswaran and D. D. Chamberlin. Functional specifications of a subsystem for data base integrity. In proceedings of the First International Conference on Very Large Data Bases, pages 317-326, Barcelona, Spain, September 1991.

 

[Elk96]     Ahmad Ziad El-kays. Implementing Event Handlers in a Database Programming Language. Master's thesis, McGill University, 1996.

 

[Esw76]     K. P. Eswaran. Specifications, implementations and interactions of a trigger subsystem in an integrated database system. IBM Research Report RJ 1820, IBM San Jose Research Laboratory, San Jose, California, August 1976.

 

[FMT84]     P. Fraternali, D. Montesi, and L. Tanca. Active Database Semantics. In Proceedings of the Fifth Australian Database Conference, University of Canterbury, New Zealand, Jan. 1984.

 

[FT83]      P. C. Fisher and S. J. Thomas. Operations on non-first-normal-form relations. In Proceedings of IEEE COMPSAC '83, pages 464-475, 1983.

 

[GGD91]     S. Gatziu, A. Geppert, and K. R. Dittrich. Integrating active concepts into an object-oriented database systems. Workshop on Database Programming Languages, Nafplion, Greece, 1991, pages 399-415.

 

[GJ91]      N. Gehani and H. V. Jagadish. Ode as an active database: Constraints and triggers. In Proceedings of the Seventeenth International Conference on Very Large Data Bases, pages 327-336, Barcelona, Spain, September 1991.

 

[GR93]      Jim Gray and Andreas Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1993.

 

[Han89]     E. N. Hanson. An initial report on the design of Ariel: A DBMS with an integrated production rule system. SIGMOD Record, Special Issue on Rule Management and Proceeding in Expert Database Systems, 18(3):12-19, September 1989.

 

[Hao98]     Biao Hao. Implementation of The Nested Relational Algebra in Java.  Master's thesis, Mcgill University. 1998.

[HSW75]     G. D. Held, M. R. Stonebraker and E. Wong. INGRES: a relational database system. In Proceedings of AFIPS National Computer Conference, Vol 44, pages 409-416, AFIPS Press, Montvale, N. J. 1975.

 

[HP87]      G. Houben and J. Paredaens. The R-Algebra: An extension of an Algebra for Nested Relations. Technology University, Eindhoven, 1987.

 

[JS82]      G. Jaeschke and H. J. Schek. Remarks on the algebra of non-first-normal-form relations. In Proceedings of the First ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, pages 124-138, Los Angeles, 1982.

 

[KDM88]     A. M. Kotz, K. R. Dittrich, and J. A. Mulle. Supporting semantic rules by a generalized event/trigger mechanism. In Advances in Database Technology - EDBT '88, Lecture Notes in Computer Science 303, pages 76-91. Springer-Verlag, Berlin, March 1988.

 

[KK89]      H. Kitagawa and T. L. Kunii. The Unnormalized Relational Data Model for Office Form Processor Design. Springer-Verlag, Tokyo, Japan, 1989.

 

[KR89]      H. F. Korth and M. A. Roth. Query languages for nested relational databases. In Nested Relations and Complex Objects in Databases, Lecture Notes in Computer Science 361. Springer-Verlag, Berlin, 1989.

 

[Lev91]     M. Levene. The nested Universal Relation Database Model. Springer-Verlag, Berlin, 1991.

 

[LS88]      R. Lorie and H. J. Schek. On dynamically defined objects and SQL. In Proceedings of 2nd Workshop on Object-Oriented Database Systems, Bad Münster, 1988.

 

[Mak77]     A. Makinouchi. A consideration on normal form of not-necessarily-normalized relation in the relational data model. In Proceedings of 3rd International Conference on Very Large Data Bases, pages 447-453, Tokyo, 1977.

 

[MD89]      D. McCarthy and U. Dayal. The architecture of an active data base management system. Proceedings of ACM SIGMOD, Portland, Oregon 1989, 215-224.

 

[Mer76]     T. H. Merrett. MRDS: An algebraic relational database system. In Canadian Computer Conference, pages 102-124, Montreal, 1976.

 

[Mer77]     T. H. Merrett. Relations as programming language elements. Information Processing Letters, 6(1):29-33, 1977.

 

[Mer84]     T. H. Merrett. Relational Information Systems. Reston Publishing Co., Reston, VA, 1984.

 

[Mos85]     J. Eliot Moss. Nested Transactions: An Approach to Reliable Distributed Computing. The MIT Press, Massachusetts, 1985.

 

[MUE94]     Thomas A. Mueck. Active Databases: Concepts and Design Support. Advances in Computers, Vol. 39, Academic Press Inc., 1994, 107-189.

 

[Nor98]     Norman W. Paton. Active Rules in Database Systems. Springer-Verlag,

      New York, 1998.

 

[PA86]      P. Pistor and F. Anderson. Designing a generalized NF2 model with an SQL-type language interface. In Proceedings of the 12th International Conference on Very Large Data Base, pages 278-285, 1986.

 

[PT86]      P. Pistor and R. Traunmueller. A database language for sets, lists and tables. Information Systems, 11(4):323-336, 1986.

 

[RKS88]     M. Roth, H. Korth and A. Silberschatz. Extended algebra and calculus for nested relational database. ACM Transactions on Database Systems, 13(4):389-417, 1988.

 

[RSL95]     Rebecca S. Lui. Implementing Procedures in a Database Programming Language. Master's thesis, McGill University, 1996

 

[SJ+90]     M. Stonebraker, A. Jhingran, J. Goh, and S. Potamianos. On rules, procedures, caching and views in data base systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 281-290, 1990.

 

[SLR88]     T. Sellis, C.-C. Lin, and L. Raschid. Implementing large production systems in a DBMS environment: Concepts and algorithms. In proceedings of the ACM SIGMOD International Conference on Management of Data, pages 404-412, Chicago, Illinois, June 1988.

 

[SK+94]     J. Sayles, S. Karlen, P. Molchan, and G. Bilodeau. GUI-BASED DESIGN AND DEVELOPMENT FOR CLIENT/SERVER APPLICATIONS. John Wiley and Sons, Inc., New York, 1994.

 

[SKM92]     E. Simon, J. Kiernan, and C. de Maindreville. Implementing high level active rules on top of a relational DBMS. In proceedings of the Eighteenth International Conference on Very Large Data Bases, pages 315-326, Vancouver, British Columbia, August 1992.

 

[SS86]      H. Schek and M. Scholl. The relational model with relation-valued attributes. Information Systems, 11(2):137-147, 1986.

[Sto82]     M. Stonebraker et al. A Rules System for Relational Data Base Management Systems. Proceedings of the Second International Conference on Databases, Jerusalem, June 1982.

 

[TF86]      S. J. Thomas and P. C. Fischer. Nested relational structures. In P. C. Kanellakis, editor, Advances in Computing Research III, The Theory of Databases, pages 269-307. JAI Press, 1986.

 

[Tod76]     S. J. P. Todd. Peterlee Relational Test Vehicle PRTV, a relational overview. IBM Scientific Center Report UKSC0075, Peterlee, England, July 1975.

 

[Ull82]     J. D. Ullman. Principles of Database Systems, 2nd Edition. Computer Science Press, Rockville, MD, 1982.

 

[VB98]      I. Vlahavas and N. Bassiliades. Parallel, Object-Oriented, and Active Knowledge Base Systems. Kluwer Academic Publishers, Boston/Dordrecht/London, 1998.

 

[Wid96]     J. Widom. The Starburst rule system, in Active Database Systems: Trigger and Rules for Advanced Database Processing, J. Widom and S. Ceri, Eds.: Morgan Kaufmann Publishers, 1996, pages 177-206.

 

[Yua98]     Zhongxia Yuan. Java implementation of the nested domain algebra in a database programming language. Master's thesis, McGill University, 1998.