Copyright ©2008  T. H. Merrett

COMP 617	Information Systems		Winter 2008		Week 13


			  Access Control in Aldat

Access control protects data from unauthorized access: reading, updating and
deleting; or, more forcefully, snooping, hacking and sabotage.

1. The UNIX operating system has a very simple access control mechanism. Each
file is flagged for three possible operations for three categories of user:

	ls -l week13p1
	-rw-r--r--   1 tim  tim  253 Apr  8 09:36 week13p1

This means that a) the owner of the file (tim) may read and write it,
b) a designated special group, e.g., tim's lab or his students, may read it, and
c) the general public may read it.

The owner may change these permissions, say to permit anybody to do anything,
including execute the file as a user-written UNIX command:

	chmod 777 week13p1
	ls -l week13p1
	-rwxrwxrwx   1 tim  tim  443 Apr  8 09:39 week13p1

And he may restore the original permissions:

	chmod 644 week13p1

Note that each triple is three bits, so its values range from 0 to 7.

(The first bit of the ten is set by UNIX if the file is a directory:

	ls -al
	drwxr-xr-x   54 tim  tim    1836 Apr  8 09:47 .
	:
)
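
As a small illustration of the three-bit triples, here is a Python sketch that decodes the octal digits given to chmod into the nine permission flags. The function name decode_mode is my own, invented for this example; the cross-check uses Python's standard stat module, which also shows the leading file-type flag.

```python
import stat

def decode_mode(octal_digits: str) -> str:
    """Turn three octal digits such as '644' into 'rw-r--r--'."""
    out = []
    for digit in octal_digits:
        bits = int(digit, 8)              # one triple: a value from 0 to 7
        out.append("r" if bits & 4 else "-")
        out.append("w" if bits & 2 else "-")
        out.append("x" if bits & 1 else "-")
    return "".join(out)

print(decode_mode("644"))   # rw-r--r--
print(decode_mode("777"))   # rwxrwxrwx

# Cross-check against the standard library; the first character is the
# file type ('-' for a regular file, 'd' for a directory).
print(stat.filemode(stat.S_IFREG | 0o644))   # -rw-r--r--
print(stat.filemode(stat.S_IFDIR | 0o755))   # drwxr-xr-x
```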

2. We might control relations in the same way. Indeed, since a relix database
is a directory and each relix relation is a file, UNIX already provides us with
access control. [GrahamDenning:accesControl] develop this approach to its
logical conclusion.

But we are likely to need finer control than just at the relation level: at the
level of tuples (if we want to permit each student to read only heir own mark in
a courses(course,student,mark) relation); or at the level of attributes (if we
want to permit any student to calculate the class average).

Ultimately, we would probably need to provide control at the level of each datum
(value of a given attribute in a given tuple) and to specify for each individual
user what hey can do with the datum. UNIX-like flags would clearly require far
more space than the data.
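
A back-of-envelope calculation makes the point. Every size in this Python sketch is hypothetical, chosen only to make the arithmetic concrete; it assumes three flag bits per user per datum, by analogy with the UNIX triples.

```python
# All sizes are hypothetical, for illustration only.
tuples = 10_000           # tuples in one relation
attributes = 3            # e.g. course, student, mark
users = 1_000             # individual users to be distinguished
bits_per_user = 3         # one UNIX-style triple per user per datum
bytes_per_datum = 8       # assumed average size of a stored value

data_bytes = tuples * attributes * bytes_per_datum
flag_bytes = tuples * attributes * users * bits_per_user // 8

print(data_bytes)                # 240000
print(flag_bytes)                # 11250000
print(flag_bytes / data_bytes)   # 46.875
```

With these assumed numbers the flags take almost 47 times the space of the data, and the ratio grows linearly with the number of users.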

3. There are some classical problems with access control. One is delegation and
revocation [GriffWade:authorizationDB]. If B has access to a file and can pass
on this privilege, transitively, to C, then C can pass it on to D and so on. If
B later revokes C's privilege then there must be a transitive revocation all the
way down the line, unless D, say, independently got the permission from
somebody else, say A, who had given it to B in the first place or who had been
given it, along with B, by the owner. Then D and anybody to whom D had delegated
would retain the privilege.

[GriffWade:authorizationDB] attempted a solution using timestamps to avoid
processing the full delegation graph, and [Fagin:authorizationDB] had to
correct it. The solution is not straightforward.
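
To make the transitive revocation concrete, here is a deliberately simplified Python sketch. It is not the timestamp scheme of [GriffWade:authorizationDB]: it ignores when each grant was made and simply recomputes who can still trace the privilege back to the owner, so it misses the ordering subtleties those papers address. The names holders and revoke, and the owner "O", are assumptions of this example.

```python
from collections import deque

def holders(owner, grants):
    """Everyone who can trace the privilege back to the owner.

    grants is a set of (grantor, grantee) pairs.
    """
    reached = {owner}
    queue = deque([owner])
    while queue:
        grantor = queue.popleft()
        for g_from, g_to in grants:
            if g_from == grantor and g_to not in reached:
                reached.add(g_to)
                queue.append(g_to)
    return reached

def revoke(owner, grants, grantor, grantee):
    """Remove one grant, then drop grants made by anyone left unsupported."""
    remaining = grants - {(grantor, grantee)}
    still_hold = holders(owner, remaining)
    return {(f, t) for (f, t) in remaining if f in still_hold}

# The owner O gave the privilege to A and B; B passed it to C, C to D,
# and A independently to D; D passed it on to E.
grants = {("O", "A"), ("O", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("D", "E")}
after = revoke("O", grants, "B", "C")
print(sorted(holders("O", after)))   # ['A', 'B', 'D', 'E', 'O']
```

C loses the privilege, but D keeps it through A, and so does E, to whom D delegated.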

4. Another problem arises from aggregation, e.g., [Denning:statDBprivacy].
If we allow somebody who is not entitled to know the individual data to
calculate averages on an attribute, hey can easily find out the individual data
anyway.

For instance Max is allowed to compute
	let totMark be equiv + of mark by course;
	let totStud be equiv + of 1 by course;
	let avgMark be totMark/totStud;
on courses(course,student,mark) but is not allowed to see any mark but his own.
He can easily figure out Sal's mark by comparing
	[course,totMark] in courses
with
	[course,totMark] where student != "Sal" in courses

Each of these is a legitimate aggregation over, presumably, dozens or hundreds
of tuples, and so should be safe, but the combination reveals a supposedly
confidential datum.
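
The attack is easy to simulate. The Python sketch below uses a few sample courses tuples and, as an assumption of this example, maps letter marks to numbers (A=4, B=3) so that they can be summed. The function tot_mark plays the part of the totMark aggregation.

```python
from collections import defaultdict

POINTS = {"A": 4, "B": 3}     # assumed numeric values for letter marks

# Sample courses(course, student, mark) tuples.
courses = [
    ("501", "Sal", "A"),
    ("501", "Max", "B"),
    ("502", "Sal", "B"),
    ("503", "Max", "A"),
]

def tot_mark(rows):
    """Total of marks by course: the only thing Max is allowed to compute."""
    totals = defaultdict(int)
    for course, _student, mark in rows:
        totals[course] += POINTS[mark]
    return dict(totals)

everyone = tot_mark(courses)
without_sal = tot_mark([row for row in courses if row[1] != "Sal"])

# Subtracting the two legitimate aggregates isolates Sal's marks.
sal_marks = {
    course: everyone[course] - without_sal.get(course, 0)
    for course in everyone
    if everyone[course] != without_sal.get(course, 0)
}
print(sal_marks)   # {'501': 4, '502': 3}
```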

5. More recent work has looked at role-based access control (RBAC), in which
the privileges are assigned to roles, such as chair of department or instructor
of course, rather than to individual people. See, e.g., [BacMooYao:roleSecurity].

But if we know the individual people, we can maintain relations which record the
role assignments.

6. I believe that all this can be done in Aldat with the addition of only one
new construct,
	ACself[],
a function which returns the authenticated identifier of the person issuing
the query.

Authentication is an issue I won't address here. The normal approach is to
store passwords securely (e.g., encrypted) and to provide an interface to
demand and check a user's password. There are many alternatives, such as
biometrics, or asking a question only the user can answer. (Access control
through IP address, as in McGill's institutional membership of ACM, is simpler:
it is the part of the authentication mechanism that just doesn't let you in at
all.)

In addition, we use Aldat's ADT (abstract data type) mechanism, which simply
hides all data and provides controlled access through parametric computations
(public methods).

7. Let's work an academic example, with the data

	personnel		roles			students(student gpa)
	(employee salary)	(employee role)			    Tom  4.0
	    Sam     50		   Sam    dean			    Sal  3.5
	    Joe     40		   Joe    chair			    Max  3.5
	    Pat     30		   Joe    prof		profs(prof course)
	    Tom     15		   Pat    prof		       Joe  601
	    Ann     30		   Ann    prof		       Pat  501
	    Jon     20		   Tom    TA		       Pat  502
				   Jon    admin		       Ann  503
	courses(course student mark)		delegate
		 601     Tom     A		(from employee role restriction)
		 501     Sal     A		 Sam     Jon   dean
		 501     Max     B		 Joe     Jon   chair
		 502     Sal     B		 Pat     Tom   prof     501
		 503     Max     A
(I'll have to work on the generality of  delegate().)

The kinds of access control we wish to impose are fairly complicated.
- Everybody may read personal information (students their marks, employees
  their salaries, even instructors "their" students and marks in their courses).
- Nobody may read others' private information.
- The dean may hire (and fire) by adding (deleting) tuples in  personnel().
- The chair may appoint (or alter or dis-appoint) course instructors by adding
  (changing, or deleting) tuples in  profs().
- Students may enrol in (or drop) courses by adding (deleting) tuples in
  courses().
- Profs may enter (or change) marks by changing tuples in  courses() - which
  are initially assumed to be DC nulls, not the grades I've shown.
- The admin may cause GPAs to be calculated from  courses()  and inserted in
  students()  by changing tuples in  students() - gpas are also initially DC.
- Anybody may find the numbers enrolled in a course and the average mark, by
  aggregating on  courses().
- The admin may assign (de-assign) roles by adding (deleting) tuples of  roles().
- Anybody with a role may delegate it to anybody else, but a role may be
  delegated to only one person at a time (to avoid delegate/revoke
  transitivities).

8. A technique we can use for checking roles is a set of views such as

	studentHemself is [student] where student=ACself[] in students;

This may be ijoined to any query to limit the query to data that are private to
the student.

We also may use  ACself[]  independently of any such view.
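
For readers more at home in a conventional language, here is a rough Python analogue of such a self view. It is only a sketch: ac_self and login are hypothetical stand-ins for ACself[] and for whatever authentication layer sets it, and the ijoin is modelled as a filter on the shared attribute student.

```python
_current_user = None   # set by a hypothetical authentication layer

def login(user):
    """Pretend that user has just been authenticated."""
    global _current_user
    _current_user = user

def ac_self():
    """Stand-in for ACself[]: the authenticated identifier of the querier."""
    return _current_user

students = [("Tom", 4.0), ("Sal", 3.5), ("Max", 3.5)]
courses = [("601", "Tom", "A"), ("501", "Sal", "A"), ("501", "Max", "B"),
           ("502", "Sal", "B"), ("503", "Max", "A")]

def student_hemself():
    """[student] where student=ACself[] in students"""
    return {s for (s, _gpa) in students if s == ac_self()}

def my_marks():
    """courses ijoin studentHemself: keep tuples that match on student."""
    me = student_hemself()
    return [(c, s, m) for (c, s, m) in courses if s in me]

login("Sal")
print(my_marks())   # [('501', 'Sal', 'A'), ('502', 'Sal', 'B')]
login("Joe")
print(my_marks())   # []
```

Joe is not in students, so the view is empty for him and the join returns nothing.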

Incorporating all access as public methods in an ADT exploits the ADT property
of hiding everything. There are two downsides. First, the resulting philosophy
is that nothing is allowed unless explicitly permitted. This would not do well
in some situations, e.g., you would not want to be prevented from parking your car
unless there was a sign saying explicitly that you could do so. But I would
need an example of data access control where only prevention must be explicit.
(Note added after course: this is not true. See end of these notes.)

Second, the access control is hardwired into the ADT. This requires an Aldat
programmer to build the system rather than allowing end-users to put together
whatever access control they would like. That seems to be fine for the academic
example, and I would need examples of opposite requirements.

Note that if you want your data to be open to all, don't use an ADT, or use one
that provides all Aldat operations on the internal data.

9. Here is the ADT. (Some operations are omitted for brevity, but how to build
them should be obvious.)

comp academics(deanHire,chaireplace,studEnrol,profGrade,adminGPA,publicCheck,
  adminRole,anyDelegate) is	// showing no domain declarations
{ state relation personnel(employee salary);
  state relation students(student gpa);		// gpa initially DC
  state relation profs(prof course);
  state relation courses(course student mark);	// mark initially DC
  state relation roles(employee role);
  state relation delegate(from employee role restriction);

  // self views: the filters
  deanHemself is ([employee] where role="dean" & employee=ACself[] in roles)
        ujoin [employee] where role="dean" & employee=ACself[] in delegate;
  chairHemself is ([employee] where role="chair" & employee=ACself[] in roles)
        ujoin [employee] where role="chair" & employee=ACself[] in delegate;
  adminHemself is [employee] where role="admin" & employee=ACself[] in roles;
  profHemself is [employee] where role="prof" & employee=ACself[] in roles;
  studentHemself is [student] where student=ACself[] in students;

  // the public methods
  comp deanHire(newHire) is	// newHire(employee, salary)
  { update personnel add newHire ijoin [] in deanHemself };
  comp chaireplace(newInstructor) is	// newInstructor(reProf,course)
  { update profs change prof <- reProf using
					newInstructor ijoin chairHemself
  };
  comp studEnrol(newStudent) is		// newStudent(course,student)
  { update courses add [course,student] in (newStudent ijoin studentHemself) };
  comp profGrade(newMarks) is		// newMarks(course,student,newMark)
  { update courses change mark <- newMark using
		[course,student,newMark] in newMarks ijoin
		profs [prof:ijoin:employee] profHemself
  };
    // NB must enforce presence of course in newMarks, else will give
    // each student the new mark for all heir courses.
  comp adminGPA() is			// admin may compute gpas
  { let GPA be (equiv + of
	if mark="A" then 4 else
	if mark="B" then 3 else
	if mark="C" then 2 else
	if mark="D" then 1 else 0 by student)/equiv + of 1 by student;
    update students change gpa <- GPA using
		([student,GPA] in courses) ijoin adminHemself
  };
  comp publicCheck(query,answer) is	// unrestricted access
  { let enrol be equiv + of 1 by course;
    let avgmark be (equiv + of
	if mark="A" then 4 else
	if mark="B" then 3 else
	if mark="C" then 2 else
	if mark="D" then 1 else 0 by course)/enrol;
    if query = "enrol" then answer <- [enrol] in courses else
    if query = "marks" then answer <- [avgmark] in courses else
    answer <- [enrol,avgmark] in courses
  };
  comp adminRole(newRole) is		// newRole(employee,role)
  { update roles add newRole ijoin [] in adminHemself };
  comp anyDelegate(newDelegate) is	// cannot be redelegated
     // newDelegate(from,employee,role,restriction)
  { let emp be employee;
    update delegate add [from,employee,role,restriction] in
     (newDelegate [from,role :ijoin: emp,role]       // Delegator must
      ([emp,role] where employee=ACself[] in roles)  // have the role.
     );
     // block > 1 level of delegation:
     let delecount be equiv + of 1 by role;
     if [red max of delecount] in delegate > 1 then undo:add:delegate()
  }
} // academics
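
The same shape can be sketched in Python, for comparison only. This is an analogue under stated assumptions, not a translation: the class Academics, its method names, the injected ac_self callable and the employee "Eve" are all invented here, the restriction attribute of delegate is omitted, and only three of the public methods are shown. State is hidden by convention and every method first checks the caller's role.

```python
class Academics:
    """Rough analogue of the academics ADT: hidden state, access by method."""

    def __init__(self, ac_self):
        self._ac_self = ac_self        # callable standing in for ACself[]
        self._personnel = {}           # employee -> salary
        self._roles = {("Sam", "dean"), ("Pat", "prof"), ("Jon", "admin")}
        self._profs = {("Pat", "501"), ("Pat", "502")}
        # (course, student) -> mark; None plays the part of the DC null
        self._courses = {("501", "Sal"): None, ("502", "Sal"): None}
        self._delegate = set()         # (from, employee, role)

    def _has_role(self, role, allow_delegated=False):
        me = self._ac_self()
        if (me, role) in self._roles:
            return True
        return allow_delegated and any(
            e == me and r == role for (_f, e, r) in self._delegate)

    def dean_hire(self, new_hire):
        """new_hire: (employee, salary) pairs; no effect unless caller is dean."""
        if self._has_role("dean", allow_delegated=True):
            self._personnel.update(dict(new_hire))

    def prof_grade(self, new_marks):
        """new_marks: (course, student, mark) triples for the caller's courses."""
        me = self._ac_self()
        if not self._has_role("prof"):
            return
        for course, student, mark in new_marks:
            if (me, course) in self._profs and (course, student) in self._courses:
                self._courses[(course, student)] = mark

    def any_delegate(self, employee, role):
        """Delegate a role held directly; one delegate per role at a time."""
        me = self._ac_self()
        if (me, role) not in self._roles:
            return False               # delegator must have the role
        if any(r == role for (_f, _e, r) in self._delegate):
            return False               # plays the part of undo:add:delegate
        self._delegate.add((me, employee, role))
        return True

user = {"name": "Jon"}
acad = Academics(lambda: user["name"])
acad.dean_hire([("Eve", 25)])          # Jon is admin, not dean: no effect
user["name"] = "Sam"
acad.any_delegate("Jon", "dean")       # Sam delegates the dean role to Jon
user["name"] = "Jon"
acad.dean_hire([("Eve", 25)])          # now takes effect
print(acad._personnel)                 # {'Eve': 25}
```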

10. Exercise. How does this approach relate to "query modification"
[StoneWong:accessControl]? Rework the examples of that paper using ACself,
views and ADTs. Hint: ACself(), being a computation, is also a relation. The
syntactic sugar, ACself[], is not the only T-selector we can use.

11. Exercise. Using computations to give parametrized views and the fact
that computations are relations and can themselves be viewed, implement
the "synergistic" authorization mechanisms of [Minsky:authorization]
with ACself(). Minsky considers situations such as the originator of the
data and the security officer both independently having a say in who can
access it.


(Note added during class. An example countering both the philosophy of blocking
everything except what may explicitly be revealed and the reliance on an Aldat
programmer might be an individual who wants to make heir own data public with
certain restrictions, such as a prof with a temporary marks relation in which
students can read their own marks but not others'. This would need new
mechanisms.)

(Note added after course. This example is not very convincing, but here's how
to prohibit, as opposed to permit. Suppose all access by anyone is allowed,
except Tom is not allowed to see Sal's GPA. Here's the view _permitting_ Tom
to see Sal's GPA:
	tomSeeSal is where student="Sal" in students ijoin ACself("Tom");
This will be empty for every querier except Tom.
Prohibition can be done by the complementary view: take  students  apart and
put it back together without the above:
	tomNotSeeSalGPA is students djoin tomSeeSal ujoin [student] in tomSeeSal;
Put this view into the ADT, and leave all relations except  students  publicly
accessible outside the ADT.)
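
A Python sketch of the complementary view follows. It rests on my reading of the operators, which should be treated as assumptions: djoin is taken as a difference, ujoin as a union that pads the missing gpa with a null (None here, standing for DC), and the querier is passed in explicitly rather than obtained from ACself.

```python
students = [("Tom", 4.0), ("Sal", 3.5), ("Max", 3.5)]

def tom_see_sal(querier):
    """Sal's tuple, but only when the querier is Tom; empty otherwise."""
    return [t for t in students if t[0] == "Sal" and querier == "Tom"]

def tom_not_see_sal_gpa(querier):
    """Take students apart and put it back together without the hidden GPA."""
    hidden = tom_see_sal(querier)
    visible = [t for t in students if t not in hidden]      # difference
    masked = [(name, None) for (name, _gpa) in hidden]      # name only
    return visible + masked

print(tom_not_see_sal_gpa("Tom"))   # [('Tom', 4.0), ('Max', 3.5), ('Sal', None)]
print(tom_not_see_sal_gpa("Max"))   # [('Tom', 4.0), ('Sal', 3.5), ('Max', 3.5)]
```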

[BacMooYao:roleSecurity] J. Bacon, K. Moody, and W. Yao "A Model of OASIS
Role-Based Access Control and its Support for Active Security" ACM Trans.
Information and System Security, vol. 5, no. 4, 2002/11, pp. 492-540

[Denning:statDBprivacy] D.E. Denning "Secure Statistical Databases with Random
Sample Queries" ACM Trans. Database Systems, vol. 5, no. 3, 1980/9, pp. 291-315.

[Fagin:authorizationDB] R. Fagin "On an Authorization Mechanism" ACM Trans.
Database Systems, vol. 3, no. 3, 1978/9, pp. 310-319

[GrahamDenning:accesControl] G. Scott Graham and Peter J. Denning,
"Protection---Principles and Practice", Proc. {AFIPS} Spring Joint Computer
Conference, AFIPS Press, 1972, pp. 417--29

[GriffWade:authorizationDB] Patricia P. Griffiths and Bradford W. Wade (IBM Research
Lab., San Jose) "An authorization mechanism for a relational database system"
ACM Trans. Database Systems, vol 1 no 3 1976/9 pp. 242--55

[Minsky:authorization] Naftaly Minsky "Synergistic authorization in database
systems", Proc. 7th Internat. Conf. on VLDB, Sept. 1981

[StoneWong:accessControl] Michael Stonebraker and Eugene Wong, "Access Control
in a Relational Database Management System by Query Modification", Proc. 1974
ACM National Conference