# Compiler Design, Lecture 16: Liveness Analysis

Christophe Dubach

Winter 2023

Some material from Prof. Michelle Strout, CS553, Colorado State University.
Timestamp: 2023/03/05 15:35:00

## Proper register allocation

Assign each virtual register to an architectural register (if possible).

Example using virtual registers:

```
.data
x: .space 4
y: .space 4
.text
    la v0, x
    lw v1, (v0)
    add v2, v0, v1
    la v3, y
    lw v4, (v3)
    sub v5, v4, v2
    add v6, v2, v4
    sw v5, (v0)
    sw v6, (v3)
```

After "proper" register allocation:

```
.data
x: .space 4
y: .space 4
.text
    la $t0, x
    lw $t1, ($t0)
    add $t2, $t0, $t1
    la $t3, y
    lw $t4, ($t3)
    sub $t5, $t4, $t2
    add $t6, $t2, $t4
    sw $t5, ($t0)
    sw $t6, ($t3)
```


## Problem:

- What if more virtual registers are used than there are architectural registers available?

Solution:

- Recycle architectural registers.

$\Rightarrow$ Need to know which values are going to be used in the future.


## Terminology

From now on in this lecture, we will use the term variable to denote a virtual register.

## Liveness

## Definition

A variable (virtual register) is live at some point in the program if it has previously been defined by an instruction and will be used by an instruction in the future. It is dead otherwise.

- Two variables can share the same architectural register if they are never in use at the same time, i.e. never simultaneously live.

$\Rightarrow$ Register allocation uses liveness information.

## Example:

```
.data
x: .space 4
y: .space 4
.text
la v0, x
lw v1, (v0)
add v2, v1, v1
la v3, y
lw v4, (v3)
sub v5, v4, v2
add v6, v2, v4
sw v5, (v0)
sw v6, (v3)
```

Question: what is the minimum number of architectural registers needed?
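
One way to approach this question is to scan the straight-line code backward and record which virtual registers are live between instructions. The sketch below is only an illustration: the `(defs, uses)` encoding of each instruction and the helper name `max_live` are assumptions, not part of the lecture. It reports the largest set of simultaneously live virtual registers, which is a lower bound on the number of architectural registers needed if instructions are neither reordered nor spilled.

```python
# Each instruction is encoded as (defs, uses) over virtual registers.
# This encoding is an assumption made for the sketch.
CODE = [
    ({"v0"}, set()),          # la  v0, x
    ({"v1"}, {"v0"}),         # lw  v1, (v0)
    ({"v2"}, {"v1"}),         # add v2, v1, v1
    ({"v3"}, set()),          # la  v3, y
    ({"v4"}, {"v3"}),         # lw  v4, (v3)
    ({"v5"}, {"v4", "v2"}),   # sub v5, v4, v2
    ({"v6"}, {"v2", "v4"}),   # add v6, v2, v4
    (set(), {"v5", "v0"}),    # sw  v5, (v0)
    (set(), {"v6", "v3"}),    # sw  v6, (v3)
]

def max_live(code):
    """Scan backward and return the largest live set seen between instructions."""
    live = set()    # nothing is live after the last instruction
    worst = set()
    for defs, uses in reversed(code):
        live = (live - defs) | uses   # registers live just before this instruction
        if len(live) > len(worst):
            worst = set(live)
    return worst

print(sorted(max_live(CODE)))
```

The size of the returned set is the register pressure at the busiest program point under these assumptions.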

Computing liveness is more complicated in the presence of control flow (e.g. loops, if-then-else).

Assembly pseudo-code:[^0]

```
    a = 0
L1: b = a + 1
    c = c + b
    a = b*2
    if (a<9) goto L1
    return c
```

Question: what is the live range of $b$?
To answer this question, we need to understand the dynamic flow of the program execution.

## Control-Flow Graph (CFG)

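Concept invented in 1970 by Frances Allen (1932-2020), IBM, the first woman to receive the Turing Award (in 2006).

[Photo of Frances Allen. Source: Rama, CC BY-SA 2.0 FR, Wikimedia]

## Directed graph:

- Nodes: the numbered statements below
- Edges: $1 \rightarrow 2$, $2 \rightarrow 3$, $3 \rightarrow 4$, $4 \rightarrow 5$, $5 \rightarrow 2$, $5 \rightarrow 6$

$$
\begin{array}{rll}
1: & & a=0 \\
2: & \text{L1:} & b=a+1 \\
3: & & c=c+b \\
4: & & a=b * 2 \\
5: & & \text{if } (a<9) \text{ goto L1} \\
6: & & \text{return } c
\end{array}
$$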



What is the live range of $b$?

- $b$ is used in statement 4, so $b$ is live on the $3 \rightarrow 4$ edge
- since statement 3 does not define $b$, $b$ is also live on the $2 \rightarrow 3$ edge
- statement 2 defines $b$, so any value of $b$ on the $1 \rightarrow 2$ and $5 \rightarrow 2$ edges is not needed, so $b$ is dead along these edges

$\Rightarrow$ The live range of $b$ is $2 \rightarrow 3 \rightarrow 4$


Live range of a:

- $1 \rightarrow 2$ and $4 \rightarrow 5 \rightarrow 2$

Live range of b :

- $2 \rightarrow 3 \rightarrow 4$

Live range of c :

- entry $\rightarrow 1 \rightarrow 2 \rightarrow 3 \rightarrow 4 \rightarrow 5 \rightarrow 2$ and $5 \rightarrow 6$
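
The live ranges listed above can be compared mechanically by treating each one as a set of CFG edges. The following is a minimal sketch under that assumption; the dictionary `LIVE_RANGE` and the helper `interfere` are hypothetical names introduced for illustration.

```python
# Live ranges from the lists above, encoded as sets of CFG edges (source, target).
LIVE_RANGE = {
    "a": {(1, 2), (4, 5), (5, 2)},
    "b": {(2, 3), (3, 4)},
    "c": {("entry", 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 2), (5, 6)},
}

def interfere(x, y):
    """True if x and y are live on at least one common edge."""
    return bool(LIVE_RANGE[x] & LIVE_RANGE[y])

for x, y in [("a", "b"), ("a", "c"), ("b", "c")]:
    verdict = "interfere" if interfere(x, y) else "can share a register"
    print(x, y, verdict)
```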

$\Rightarrow$ Since a and b are never simultaneously live, they can share a register.


## Terminology

## Flow Graph

- A Control Flow Graph (CFG) has out-edges that lead to successor nodes and in-edges that come from predecessor nodes
- $\operatorname{pred}(n)=$ set of all predecessors of node $n$
- $\operatorname{succ}(n)=$ set of all successors of node $n$


## Examples

- Out-edges of node 5: $5 \rightarrow 6$ and $5 \rightarrow 2$
- $\operatorname{succ}(5)=\{2,6\}$
- $\operatorname{pred}(5)=\{4\}$
- $\operatorname{pred}(2)=\{1,5\}$
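
As a rough illustration, a CFG can be stored as a successor map from which the predecessor sets are derived. The names `SUCC` and `predecessors` below are assumptions made for this sketch; the node numbers are those of the running example.

```python
# Successor lists for the six-node example CFG.
SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}

def predecessors(succ):
    """Invert the successor map: pred(n) = {m | n in succ(m)}."""
    pred = {n: set() for n in succ}
    for m, targets in succ.items():
        for n in targets:
            pred[n].add(m)
    return pred

PRED = predecessors(SUCC)
print(set(SUCC[5]), PRED[5], PRED[2])
```

Storing only successors and deriving predecessors on demand keeps the two views consistent by construction.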



## Uses and Defs

Def (definition)

- A write of a value to a variable
- $\operatorname{def}(v)=$ set of CFG nodes that define variable $v$
- $\operatorname{def}(n)=$ set of variables defined at node $n$

For example, node $1: a=0$ defines $a$, so $a \in \operatorname{def}(1)$ and $1 \in \operatorname{def}(a)$.

Use

- A read of a variable's value
- use(v) = set of CFG nodes that use variable v
- use(n) = set of variables used at node n
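
The per-node and per-variable views carry the same information. The sketch below, with hypothetical names `DEF_N`, `USE_N` and `by_variable`, derives $\operatorname{def}(v)$ and $\operatorname{use}(v)$ from $\operatorname{def}(n)$ and $\operatorname{use}(n)$ for the running example.

```python
# Per-node sets for the running example; the dict layout is an assumption.
DEF_N = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}
USE_N = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}

def by_variable(per_node):
    """Turn a node -> variables map into a variable -> nodes map."""
    per_var = {}
    for node, variables in per_node.items():
        for v in variables:
            per_var.setdefault(v, set()).add(node)
    return per_var

DEF_V = by_variable(DEF_N)   # def(v)
USE_V = by_variable(USE_N)   # use(v)
print(DEF_V["a"], USE_V["b"])
```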


## More precise definition of liveness

A variable $v$ is live on a CFG edge if

- $\exists$ a directed path from that edge to a use of $v$ (node $\in \operatorname{use}(v)$) and
- that path does not go through any def of v (nodes $\notin \operatorname{def}(\mathrm{v})$ ).
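
This definition can be checked directly with a graph search. The sketch below is one possible reading, not the method used in the rest of the lecture: starting at the target of an edge, it follows successors until it reaches a use of the variable, and abandons any path that hits a definition first. The function name `live_on_edge` and the dictionary encoding are assumptions.

```python
SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
DEF_N = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}
USE_N = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}

def live_on_edge(v, edge):
    """Search forward from the edge's target for a use of v not preceded by a def."""
    _, start = edge
    stack, seen = [start], set()
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        if v in USE_N[n]:
            return True      # a node reads v before (possibly) redefining it
        if v in DEF_N[n]:
            continue         # v is overwritten here, so this path is dead
        stack.extend(SUCC[n])
    return False

print(live_on_edge("b", (2, 3)), live_on_edge("b", (1, 2)))
```

Checking the use before the def at each node matters for statements such as `c = c + b`, which read `c` before overwriting it.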



## Computing Liveness

## Flow of Liveness

## Data-flow

- Liveness of variables is a property that flows through the edges of the CFG


## Direction of flow

- Liveness flows backward in the CFG: behaviour of future nodes determines liveness at a given node

Example: flow of liveness for a


Example: flow of liveness for b


## Liveness at Nodes

We have liveness on edges

- before and after each node


Two more definitions:

- A variable is live-out at a node if it is live on any of that node's out-edges
- A variable is live-in at a node if it is live on any of that node's in-edges


## Computing Liveness

## Rules for computing liveness

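1. Generate liveness: if a variable is in $\operatorname{use}(n)$, it is live-in at node $n$.
2. Push liveness across edges: if a variable is live-in at node $n$, it is live-out at every node in $\operatorname{pred}(n)$.
3. Push liveness across nodes: if a variable is live-out at node $n$ and not in $\operatorname{def}(n)$, it is also live-in at $n$.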


## Data-flow equations

$$
\begin{aligned}
\operatorname{LIVE}_{\text {in }}(n) & =\operatorname{use}(n) \cup\left(\operatorname{LIVE}_{\text {out }}(n)-\operatorname{def}(n)\right) \\
\operatorname{LIVE}_{\text {out }}(n) & =\bigcup_{\forall s \in \operatorname{succ}(n)} \operatorname{LIVE}_{\text {in }}(s)
\end{aligned}
$$
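
As a worked instance, consider node 4 ($a=b*2$) of the running example, where $\operatorname{use}(4)=\{b\}$, $\operatorname{def}(4)=\{a\}$ and $\operatorname{succ}(4)=\{5\}$. Assuming $\operatorname{LIVE}_{\text{in}}(5)=\{a, c\}$, the equations give:

$$
\begin{aligned}
\operatorname{LIVE}_{\text{out}}(4) & =\operatorname{LIVE}_{\text{in}}(5)=\{a, c\} \\
\operatorname{LIVE}_{\text{in}}(4) & =\{b\} \cup(\{a, c\}-\{a\})=\{b, c\}
\end{aligned}
$$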

## Solving the Data-flow equations

$$
\begin{array}{l}
\hline
\text{for all nodes } n \in \mathrm{CFG} \text{ do} \\
\quad \operatorname{LIVE}_{\text{in}}(n)=\varnothing \\
\quad \operatorname{LIVE}_{\text{out}}(n)=\varnothing \\
\text{end for} \\
\text{repeat} \\
\quad \text{for all nodes } n \in \mathrm{CFG} \text{ do} \\
\qquad \operatorname{LIVE}_{\text{in}}^{\prime}(n)=\operatorname{LIVE}_{\text{in}}(n) \\
\qquad \operatorname{LIVE}_{\text{out}}^{\prime}(n)=\operatorname{LIVE}_{\text{out}}(n) \\
\qquad \operatorname{LIVE}_{\text{in}}(n)=\operatorname{use}(n) \cup\left(\operatorname{LIVE}_{\text{out}}(n)-\operatorname{def}(n)\right) \\
\qquad \operatorname{LIVE}_{\text{out}}(n)=\bigcup_{\forall s \in \operatorname{succ}(n)} \operatorname{LIVE}_{\text{in}}(s) \\
\quad \text{end for} \\
\text{until } \operatorname{LIVE}_{\text{in}}^{\prime}(n)=\operatorname{LIVE}_{\text{in}}(n) \wedge \operatorname{LIVE}_{\text{out}}^{\prime}(n)=\operatorname{LIVE}_{\text{out}}(n) \ \forall n \\
\hline
\end{array}
$$

This is a fix-point algorithm for iterative liveness analysis.
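
The algorithm above might be sketched in Python as follows. The dictionary encoding of the example CFG, the function name `solve_liveness`, and the pass counter are assumptions added for illustration. The update order inside the loop (LIVE_in first, then LIVE_out) follows the pseudocode.

```python
# Example CFG from the lecture; the dict encoding is an assumption of this sketch.
SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
USE = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}
DEF = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}

def solve_liveness(order):
    """Iterate the data-flow equations over `order` until nothing changes."""
    live_in = {n: set() for n in SUCC}
    live_out = {n: set() for n in SUCC}
    passes = 0
    while True:
        passes += 1
        changed = False
        for n in order:
            old_in, old_out = live_in[n], live_out[n]
            live_in[n] = USE[n] | (live_out[n] - DEF[n])
            live_out[n] = set().union(*(live_in[s] for s in SUCC[n]))
            if live_in[n] != old_in or live_out[n] != old_out:
                changed = True
        if not changed:
            return live_in, live_out, passes

live_in, live_out, passes = solve_liveness([1, 2, 3, 4, 5, 6])
print(passes, live_in, live_out)
```

The pass count includes the final pass, in which no set changes and the loop terminates.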

## Example



| node | use | def | 1st in | 1st out | 2nd in | 2nd out | 3rd in | 3rd out | 4th in | 4th out | 5th in | 5th out | 6th in | 6th out | 7th in | 7th out |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 |  | a |  |  |  | a |  | a |  | ac | c | ac | c | ac | c | ac |
| 2 | a | b | a |  | a | bc | ac | bc | ac | bc | ac | bc | ac | bc | ac | bc |
| 3 | bc | c | bc |  | bc | b | bc | b | bc | b | bc | b | bc | bc | bc | bc |
| 4 | b | a | b |  | b | a | b | a | b | ac | bc | ac | bc | ac | bc | ac |
| 5 | a |  | a | a | a | ac | ac | ac | ac | ac | ac | ac | ac | ac | ac | ac |
| 6 | c |  | c |  | c |  | c |  | c |  | c |  | c |  | c |  |

## Data-flow equations

$$
\begin{aligned}
\operatorname{LIVE}_{\text {in }}(n) & =\operatorname{use}(n) \cup\left(\operatorname{LIVE}_{\text {out }}(n)-\operatorname{def}(n)\right) \\
\operatorname{LIVE}_{\text {out }}(n) & =\bigcup_{\forall s \in \operatorname{succ}(n)} \operatorname{LIVE}_{\text {in }}(s)
\end{aligned}
$$

There is something inefficient about this process.


For instance, consider the $3 \rightarrow 4$ edge in the graph:

- $\operatorname{LIVE}_{\text {out }}(4)$ is used to compute $\operatorname{LIVE}_{\text {in }}(4)$
- $\operatorname{LIVE}_{\text {in }}(4)$ is used to compute $\operatorname{LIVE}_{\text {out }}(3)$

$\Rightarrow$ The algorithm would converge faster if we processed the nodes backwards.


## Backward Liveness Analysis

```
for all nodes n ∈ CFG do
    LIVE_in(n) = ∅
    LIVE_out(n) = ∅
end for
repeat
    for all nodes n ∈ CFG in reverse pre-order do
        LIVE_in'(n) = LIVE_in(n)
        LIVE_out'(n) = LIVE_out(n)
        LIVE_out(n) = ⋃_{s ∈ succ(n)} LIVE_in(s)
        LIVE_in(n) = use(n) ∪ (LIVE_out(n) − def(n))
    end for
until LIVE_in'(n) = LIVE_in(n) ∧ LIVE_out'(n) = LIVE_out(n) ∀n
```


## Example with Backward Liveness Analysis



| node | use | def | 1st out | 1st in | 2nd out | 2nd in | 3rd out | 3rd in |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 6 | c |  |  | c |  | c |  | c |
| 5 | a |  | c | ac | ac | ac | ac | ac |
| 4 | b | a | ac | bc | ac | bc | ac | bc |
| 3 | bc | c | bc | bc | bc | bc | bc | bc |
| 2 | a | b | bc | ac | bc | ac | bc | ac |
| 1 |  | a | ac | c | ac | c | ac | c |

## Converges in only 3 iterations!
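
The iteration count can be reproduced with a small sketch of the backward variant. The encoding of the CFG and the name `backward_liveness` are assumptions made for illustration. At each node LIVE_out is updated before LIVE_in, and the nodes are visited in the order 6, 5, 4, 3, 2, 1.

```python
# Example CFG from the lecture; the dict encoding is an assumption of this sketch.
SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
USE = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}
DEF = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}

def backward_liveness(order):
    """Update LIVE_out before LIVE_in at each node; return (in, out, passes)."""
    live_in = {n: set() for n in SUCC}
    live_out = {n: set() for n in SUCC}
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for n in order:
            new_out = set().union(*(live_in[s] for s in SUCC[n]))
            new_in = USE[n] | (new_out - DEF[n])
            if new_out != live_out[n] or new_in != live_in[n]:
                live_out[n], live_in[n] = new_out, new_in
                changed = True
    return live_in, live_out, passes

live_in, live_out, passes = backward_liveness([6, 5, 4, 3, 2, 1])
print(passes, live_in, live_out)
```

As in the table, the last counted pass is the one that detects no further change.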

## Data-flow equations

$$
\begin{aligned}
\operatorname{LIVE}_{\text {out }}(n) & =\bigcup_{\forall s \in \operatorname{succ}(n)} \operatorname{LIVE}_{\text {in }}(s) \\
\operatorname{LIVE}_{\text {in }}(n) & =\operatorname{use}(n) \cup\left(\operatorname{LIVE}_{\text {out }}(n)-\operatorname{def}(n)\right)
\end{aligned}
$$

## More performance considerations

> Basic Block
> A straight sequence of assembly instructions which (usually) finishes with a branch/jump instruction.

> Key property: Either all the instructions in the sequence execute or none execute.

Can significantly decrease the size that a CFG occupies in memory by grouping nodes that have a single predecessor and a single successor into basic blocks.

The instructions in a basic block can be simply represented as a list (rather than a graph).

## Example

No basic blocks:


With basic blocks:


$$
\begin{aligned}
\operatorname{use}(2) & =\{a, c\} \\
\operatorname{def}(2) & =\{a, b, c\}
\end{aligned}
$$
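
The block-level sets above can be derived from the instructions inside the block. The sketch below assumes block 2 groups the original nodes 2 to 5 and encodes each instruction as a `(defs, uses)` pair; the names `BLOCK` and `block_use_def` are hypothetical. A variable counts as used by the block only if it is read before being written inside the block.

```python
# Instructions of the merged block (original nodes 2 to 5), as (defs, uses) pairs.
BLOCK = [
    ({"b"}, {"a"}),        # b = a + 1
    ({"c"}, {"c", "b"}),   # c = c + b
    ({"a"}, {"b"}),        # a = b * 2
    (set(), {"a"}),        # if (a < 9) goto L1
]

def block_use_def(block):
    """use = variables read before any write in the block; def = variables written."""
    use, defs = set(), set()
    for d, u in block:
        use |= u - defs    # reads already satisfied inside the block do not count
        defs |= d
    return use, defs

use, defs = block_use_def(BLOCK)
print(sorted(use), sorted(defs))
```

Here `b` is written before it is read, so it appears in the block's def set but not in its use set.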

## Next lecture

- Proper register allocation


[^0]: We illustrate concepts at a slightly higher level than assembly from this point on.

