# Beluga Syntax

Beluga is a two-level system:

- the first level is the LF logical framework supporting specification of formal systems;
- the computation-level supports programming with LF specifications.

Hence, Beluga programs consist of two kinds of declarations: LF
constant declarations and computation-level declarations. We will use
EBNF
metasyntax. In particular, `{a}`

means repeat
production `a`

zero or more times, `[a]`

means
optionally apply production `a`

and comments in the grammar
are enclosed between `(*`

and `*)`

.

sig ::= { lf_decl(* LF constant declaraction *) | c_decl }(* Computation-level declaration *)

Comments in Beluga files start with `%`

and extend to the end of the line.

**Warning:**

- Beluga is a protype and work in progress. As such, some of the subtleties in the syntax may still change over time. While we try to ensure backwards compatibility, this may not always be possible.
- This description of the source-level syntax for Beluga below is not exhaustive. It tries to provide a high-level description and intuition for Belgua programs.

## LF declarations

There are two ways of making LF declarations. The first follows closely the syntax used in the Twelf system. There are however a few differences which we will mention below. The second uses a style akin to Standard ML datatype declarations.

In the following, `id`

refers to identifiers
starting with a lower case letter and `ID`

identifiers
starting with an upper case letter.

lf_decl ::= lf_datatype_decl(* new syntax for LF signatures *) | id ":" lf_type "."(* Twelf style LF signatures *) | id ":" lf_kind "."(* Twelf style LF signatures *) |%name id ID "."(* name preference pragma *) op_arrow ::= "->" | "<-"(* A <- B same as B -> A *) lf_kind ::=type | lf_type op_arrow lf_kind | "{" id ":" lf_type "}" lf_kind lf_type ::= id {term}(* A M1 ... M2 *) | lf_type op_arrow lf_type | "{" id ":" lf_type "}" lf_type(* Pi x:A. K or Pi x:A. B *) lf_datatype_decl ::=datatype id ":" lf_kind "=" [lf_con_decl] {"|" lf_con_decl} ";" lf_con_decl ::= id ":" lf_type term ::= (id | ID)(* variable x or constant a or c *) | "\" id "." term(* lambda x. M *) | term term(* M N *) | _(* hole, to be filled by term reconstruction *)

The constructs `{x:U} V`

and `\x. V`

bind the identifier `x`

in `V`

, which may shadow other constants or bound variables. As usual in type theory, `U -> V`

is treated as an abbreviation for `{x:U} V`

where `x`

does not appear in `V`

.

In the order of precedence, we disambiguate the syntax and impose restrictions as follows:

- Juxtaposition (application) is left associative and has highest precedence.
`->`

is right and`<-`

left associative with equal precedence.`:`

is left associative.`{}`

and`\`

are weak prefix operators.- Bound variables and constants must be written with lower-case letters.
- Free variables can be written with upper-case letters or lower case letters; but we use the convention to write them with upper-case letters.
- All terms need to be written in beta-normal form, i.e. no redices can occur in terms. However, variables do not need to be eta-expanded.

Next, we give a few simple examples illustrating what terms Beluga parses and which ones it does not. For example, the following are parsed identically using the old style LF syntax:

d : a <- b <- {x:term} c x -> p x. d : ({x:term} c x -> p x) -> b -> a. d : ((a <- b) <- ({x: term} ((c x) -> (p x)))).

The following declarations using old-style LF syntax parses:

d : p x <- a <- b <- c x.% against our convention but parses d : p X <- a <- b <- c X.% Better code: uses convention P:term -> type% Parse error

The following in the old-style LF syntax will not parse:

d: p ((\x.x) a) .% this will give an error since there is a redex d: p a.% this will parse.

We advocate using a new style for LF declarations based on datatypes.

datatype foo: {x:A} p x -> type = | d : a -> b -> foo c | f : d -> e -> foo k ;

**More differences with Twelf syntax:** We summarize a few more differences here. These restrictions may be addressed in the future:

- Lambda-abstractions cannot be annotated with the type of the input. Writing
`\x:nat.x`

is illegal in Beluga. - Omitting the type in a Pi-type is illegal in Beluga. One cannot simply write
`({x}foo x T) -> foo E T'.`

. One must give the type of the variable`x`

.

## Computation-level declarations

In addition to LF declarations, we support three additional kinds of declarations: schema declarations, which describe the structure of contexts, declarations of recursive functions (including mutually recursive functions), and non-recursive declarations.

c_decl ::= ctx_schema_decl(* schema declaration *) | c_datatype_decl(* see "Computation-level datatypes" below *) | c_typedef(* type abbreviation *) | c_rec_prog(* recursive programs *) | c_let_prog(* non-recursive programs *) | exp ";"(* computation-level expression *) c_typedef ::=typedef id ":" c_kind "=" c_type ";" c_rec_prog ::=rec id ":" c_type "=" exp {and id ":" c_type "=" exp} ";" c_let_prog ::=let id [":" c_type] "=" exp ";"

To embed LF terms into computations, we extend the LF term and type language to support contextual variables and projections on bound variables and contextual variables. These extensions cannot be used when declaring an LF signature, but only when we use LF objects within computation-level programs and declarations.

term, lf_type ::= cvar sigma(* contextual variable *) | id "." int(* k-th projection of a bound variable x *) | block(* Sigma x:A.B *) (* Contextual variable *) cvar ::= id(* context variable *) | ID(* meta-variable *) | "#" id(* parameter variable *) | "#" id "." k(* k-th projection *) block ::=block id ":" lf_type {"," id ":" lf_type}(* Substitutions *) sigma ::=(* Nothing denotes the empty substitution *) | ..(* Identity substitution *) | sigma term(* substitution sigma , M *)

We impose the following restrictions:

- Context variables always occur on their own (i.e. sigma is always empty).

## Computation-level datatype declarations

A computational-level datatype were introduced in [Cave and Pientka, POPL12]. They support indexing a datatype with contextual objects.

c_datatype_decl ::=datatype id ":" c_kind "=" [ID : c_type] {"|" ID : c_type} {and id ":" c_kind "=" [ID : c_type] {"|" ID : c_type}} ";"

## Schemas and Contexts

Schemas classify context. We allow contexts to be built up not only by single declarations, but also by Sigma-types, i.e. dependent tuples. Logically, this means we can have not only atoms in our context, but also conjunctions of atoms (where the conjunction corresponds to the Sigma-type type-theoretically).

% Schema ctx_schema_decl ::=schema [some "[" ctx "]"] block |schema lf_type hyp ::= id ":" lf_type | block% Context ctx ::= [hyp {"," hyp}] | id ["," hyp {"," hyp}](* Context variable *) ctx_hat ::= [id {"," id}](* A context with the types erased. *)

## Computation-level Types

index_obj ::= "[" ctx "]"(* A context index *) | "[" ctx "." term "]"(* A contextual object *) quantif ::= "{" id ":" "[" ctx "." term "]" "}"(* Dependent function type *) | "{" "#" id ":" "[" ctx "." term "]" "}"(* Dependent function type *) | "{" id ":" ctx_schema "}"(* Explicit context *) | "(" id ":" ctx_schema ")"(* Implicit context *) c_kind ::=ctype | c_type "->" c_kind | quantif c_kind c_type ::= "[" ctx "." lf_type "]"(* Contextual type [Psi . A] *) | ID {index_obj}(* Computational-level datatype *) | c_type "*" c_type(* Tuples *) | c_type "->" c_type(* The type of functions *) | quantif c_type(* Dependent function *)

In the order of precedence, we disambiguate the syntax:

`->`

is right associative.`:`

is left associative.`{}`

`and`

`\`

are weak prefix operators.- Bound LF variables and constants must be written with lower-case letters.
- Free meta-variables must be written with upper-case letters.
- Context variables are written with lower-case letter.
- All LF terms need to be written in beta-normal form, i.e. no redeces can occur in terms.
<-- LF terms occurring as part of a contextual object must be written in
eta-expanded form. For example, one must write
`[g . lam \x.E .. x]`

and writing`[g . lam E]`

is currently rejected. --> - In the contextual type
`[Psi . A]`

, the type`A`

must be a base type. This can always be achieved by translating a type`[Psi . (B -> A)]`

to`[Psi,x:B . A]`

. By applying this translation, any function type can be lowered.

```
```

```
```## Computation-level Expressions

exp ::= "[" ctx_hat "." term "]" (* Contextual object *)
| "[" ctx "." term "]" (* Type annotated contextual object *)
| "(" exp "," exp ")" (* Tuple *)
| id (* Variable *)
| ID (* Compution-level constructor *)
| fn id "=>" exp (* Computation-level functions *)
| mlam ID "=>" exp (* Abstraction on dependent type index *)
| mlam id "=>" exp (* Context abstraction *)
| exp exp (* Application *)
| exp "[" ctx "]" (* Context application *)
| case exp of branch {branch} (* Case analysis *)
| case exp of "{" "}" (* Case analysis at uninhabited type *)
| let patt = exp in exp (* Case expression with a single branch *)
branch ::= {quantif} patt "=>" exp
(* Patterns *)
patt ::= "[" ctx_hat "." term "]" (* Contextual object *)
| "[" ctx "." term "]" (* Type annotated contextual object *)
| "(" patt "," patt ")" (* Tuple *)
| id (* Pattern variable *)
| ID {index_obj} (* Computation-level constructor *)
| patt ":" c_type (* Type annotation in pattern *)

The constructs `{X:[Psi . A]} tau`

the identifier `X`

must be an upper-case
letter. It may overshadow other contextual variables in `tau`

. Note that
`[Psi . A] -> tau`

is NOT just an abbreviation for `{X:[Psi . A]} tau `

.
where `X`

does not appear in `tau`

. On the computation-level we retain the
difference between quantification and implication. It is however worth
noting that there is a simple translation between programs of these
two types which involves lifting universally quantified elements to
become proper stand-alone computation-level expressions.

In the order of precedence, we disambiguate the syntax and impose restrictions as follows:

- Juxtaposition (application) is left associative and has highest precedence.
- Bound variables and constants must be written with lower-case letters.
- Free meta-variables must be written with upper-case letters.
- All LF terms need to be written in beta-normal form, i.e. no redeces can occur in terms. LF terms occurring as part of a contextual object must be written in eta-expanded form. For example, one must write
`[g . lam \x.E .. x]`

and writing `[g . lam E]`

is currently rejected.

## Emacs mode

There is an emacs mode for Beluga, which is distributed with the Beluga code and example base. This will display the the source-syntax in a nicer format.