Beluga Syntax

Beluga is a two-level system:

  1. the first level is the LF logical framework supporting specification of formal systems;
  2. the computation-level supports programming with LF specifications.

Hence, Beluga programs consist of two kinds of declarations: LF constant declarations and computation-level declarations. We will use EBNF metasyntax. In particular, {a} means repeat production a zero or more times, [a] means optionally apply production a and comments in the grammar are enclosed between (* and *).

sig ::= { lf_decl         (* LF constant declaraction *)
        | c_decl }        (* Computation-level declaration *)

Comments in Beluga files start with % and extend to the end of the line.

Warning:

LF declarations

There are two ways of making LF declarations. The first follows closely the syntax used in the Twelf system. There are however a few differences which we will mention below. The second uses a style akin to Standard ML datatype declarations.

In the following, id refers to identifiers starting with a lower case letter and ID identifiers starting with an upper case letter.

lf_decl ::= lf_datatype_decl    (* new syntax for LF signatures *)
          | id ":" lf_type "."	(* Twelf style LF signatures *)
          | id ":" lf_kind "."	(* Twelf style LF signatures *)
          | %name id ID "."    (* name preference pragma *)

op_arrow ::= "->" | "<-"        (* A <- B same as B -> A *)

lf_kind ::= type
          | lf_type op_arrow lf_kind
          | "{" id ":" lf_type "}" lf_kind

lf_type ::= id {term}                       (* A M1 ... M2 *)
          | lf_type op_arrow lf_type
          | "{" id ":" lf_type "}" lf_type  (* Pi x:A. K  or  Pi x:A. B *)

lf_datatype_decl ::= datatype id ":" lf_kind "=" [lf_con_decl] {"|" lf_con_decl} ";"

lf_con_decl ::= id ":" lf_type

term ::= (id | ID)		  (* variable x or constant a or c *)
       | "\" id "." term	  (* lambda x. M *)
       | term term		  (* M N *)
       | _                        (* hole, to be filled by term reconstruction *)

The constructs {x:U} V and \x. V bind the identifier x in V, which may shadow other constants or bound variables. As usual in type theory, U -> V is treated as an abbreviation for {x:U} V where x does not appear in V.

In the order of precedence, we disambiguate the syntax and impose restrictions as follows:

Next, we give a few simple examples illustrating what terms Beluga parses and which ones it does not. For example, the following are parsed identically using the old style LF syntax:

d : a <- b <- {x:term} c x -> p x.
d : ({x:term} c x -> p x) -> b -> a.
d : ((a <- b) <- ({x: term} ((c x) -> (p x)))).

The following declarations using old-style LF syntax parses:

d : p x <- a <- b <-  c x.  % against our convention but parses
d : p X <- a <- b <-  c X.  % Better code: uses convention

P:term -> type              % Parse error

The following in the old-style LF syntax will not parse:

d: p ((\x.x) a) .    % this will give an error since there is a redex
d: p a.              % this will parse.

We advocate using a new style for LF declarations based on datatypes.

datatype foo: {x:A} p x -> type =
  | d : a -> b -> foo c
  | f : d -> e -> foo k
;

More differences with Twelf syntax: We summarize a few more differences here. These restrictions may be addressed in the future:

Computation-level declarations

In addition to LF declarations, we support three additional kinds of declarations: schema declarations, which describe the structure of contexts, declarations of recursive functions (including mutually recursive functions), and non-recursive declarations.

c_decl ::= ctx_schema_decl      (* schema declaration *)
         | c_datatype_decl      (* see "Computation-level datatypes" below *)
         | c_typedef            (* type abbreviation *)
         | c_rec_prog           (* recursive programs *)
         | c_let_prog           (* non-recursive programs *)
         | exp ";"              (* computation-level expression *)

c_typedef ::= typedef id ":" c_kind "=" c_type ";"

c_rec_prog ::= rec id ":" c_type "=" exp {and id ":" c_type "=" exp} ";"

c_let_prog ::= let id [":" c_type] "=" exp ";"

To embed LF terms into computations, we extend the LF term and type language to support contextual variables and projections on bound variables and contextual variables. These extensions cannot be used when declaring an LF signature, but only when we use LF objects within computation-level programs and declarations.

term, lf_type ::= cvar sigma           (* contextual variable *)
                | id "." int           (* k-th projection of a bound variable x *)
                | block                (* Sigma x:A.B *)

(* Contextual variable *)
cvar ::= id                (* context variable *)
       | ID                (* meta-variable *)
       | "#" id            (* parameter variable *)
       | "#" id "." k      (* k-th projection *)

block ::= block id ":" lf_type {"," id ":" lf_type}

(* Substitutions  *)
sigma ::=                     (* Nothing denotes the empty substitution *)
        | ..                  (* Identity substitution *)
        | sigma term          (* substitution sigma , M *)

We impose the following restrictions:

Computation-level datatype declarations

A computational-level datatype were introduced in [Cave and Pientka, POPL12]. They support indexing a datatype with contextual objects.

c_datatype_decl ::=
         datatype id ":" c_kind "=" [ID : c_type] {"|" ID : c_type}
         {and id ":" c_kind "=" [ID : c_type] {"|" ID : c_type}} ";"

Schemas and Contexts

Schemas classify context. We allow contexts to be built up not only by single declarations, but also by Sigma-types, i.e. dependent tuples. Logically, this means we can have not only atoms in our context, but also conjunctions of atoms (where the conjunction corresponds to the Sigma-type type-theoretically).

% Schema
ctx_schema_decl ::= schema [some "[" ctx "]"] block
                  | schema lf_type

hyp         ::= id ":" lf_type | block

% Context
ctx         ::= [hyp {"," hyp}]
              | id ["," hyp {"," hyp}]  (* Context variable *)

ctx_hat     ::= [id {"," id}]           (* A context with the types erased. *)

Computation-level Types

index_obj ::= "[" ctx "]"               (* A context index *)
            | "[" ctx "." term "]"      (* A contextual object *)

quantif ::= "{" id ":" "[" ctx "." term "]" "}"      (* Dependent function type *)
	  | "{" "#" id ":" "[" ctx "." term "]" "}"  (* Dependent function type *)
	  | "{" id ":" ctx_schema "}"		     (* Explicit context *)
	  | "(" id ":" ctx_schema ")"		     (* Implicit context *)

c_kind ::= ctype
         | c_type "->" c_kind
         | quantif c_kind

c_type ::= "[" ctx "." lf_type "]"   (* Contextual type [Psi . A] *)
         | ID {index_obj}            (* Computational-level datatype *)
         | c_type "*" c_type           (* Tuples *)
         | c_type "->" c_type        (* The type of functions *)
         | quantif c_type	     (* Dependent function *)

In the order of precedence, we disambiguate the syntax:

Computation-level Expressions

exp    ::= "[" ctx_hat "." term "]"      (* Contextual object *)
         | "[" ctx "." term "]"          (* Type annotated contextual object *)
         | "(" exp "," exp ")"           (* Tuple *)
         | id                            (* Variable *)
         | ID                            (* Compution-level constructor *)
         | fn id "=>" exp                (* Computation-level functions *)
         | mlam ID "=>" exp              (* Abstraction on dependent type index *)
         | mlam id "=>" exp              (* Context abstraction *)
         | exp exp                       (* Application *)
         | exp "[" ctx "]"               (* Context application *)
         | case exp of branch {branch}  (* Case analysis *)
         | case exp of "{" "}"          (* Case analysis at uninhabited type *)
         | let patt = exp in exp        (* Case expression with a single branch *)

branch ::= {quantif} patt "=>" exp

(* Patterns *)
patt ::= "[" ctx_hat "." term "]"        (* Contextual object *)
       | "[" ctx "." term "]"            (* Type annotated contextual object *)
       | "(" patt  "," patt ")"          (* Tuple *)
       | id                              (* Pattern variable *)
       | ID {index_obj}                  (* Computation-level constructor *)
       | patt ":" c_type                 (* Type annotation in pattern *)

The constructs {X:[Psi . A]} tau the identifier X must be an upper-case letter. It may overshadow other contextual variables in tau. Note that [Psi . A] -> tau is NOT just an abbreviation for {X:[Psi . A]} tau . where X does not appear in tau. On the computation-level we retain the difference between quantification and implication. It is however worth noting that there is a simple translation between programs of these two types which involves lifting universally quantified elements to become proper stand-alone computation-level expressions.

In the order of precedence, we disambiguate the syntax and impose restrictions as follows:

  • Juxtaposition (application) is left associative and has highest precedence.
  • Bound variables and constants must be written with lower-case letters.
  • Free meta-variables must be written with upper-case letters.
  • All LF terms need to be written in beta-normal form, i.e. no redeces can occur in terms. LF terms occurring as part of a contextual object must be written in eta-expanded form. For example, one must write [g . lam \x.E .. x] and writing [g . lam E] is currently rejected.

Emacs mode

There is an emacs mode for Beluga, which is distributed with the Beluga code and example base. This will display the the source-syntax in a nicer format.