## Project Name

VLIW Architecture

The prototype is optimized to do multiway branching every cycle, without branch stalls. Here is an example: a C code fragment for finding the maximum of an array:

```    for(i=0; i < N; i++)
if (a[i] > max) max=a[i];
```

At cycle n, the relevant register contents are (assuming i=0; N>0;):

```mar0, r4=&a[i]; r8=&a[N-1]; r2=max=-INF;
```

The VLIW code for the loop starts as follows

```(LOOP ((LOAD mar\$0 (r\$1))      ;r1=load a[i] (*r4)
(ADD r\$4 4 (r\$4 mar\$0)) ;increment array pointer r4 (=&a[i+1])
;also copy it to mar0
(GOTO L1)))
```

Compare operations set the condition codes cc0, cc1 in cycle n+1 (notice that the second iteration is started before the first one is finished; this is called a software pipelining technique):

```(L1 ((GT r\$1 r\$2 (cc\$0))       ;cc0= (a[i] > max)
(GT r\$4 r\$8 (cc\$1))       ;cc1= (&a[i+1] > &a[N-1])
;cannot overwrite r1 (=a[i]) yet,
;still need to update max
;"?" means a speculative operation
(ADD? r\$4 4 (r\$4 mar\$0))  ;r4,mar0= &a[i+2] (2nd iteration)
(GOTO L2)))
```

Then, multiway branching is done in cycle n+2 using these condition codes, and the correct target instruction ( L2 or exit) is fetched (further compares may set condition codes at the same time):

```(L2 ((IF (NOT cc\$0)                ;if not(a[i]>max)
((COPY? r\$3 (r\$1))        ;can overwrite r1 with a[i+1] now
(IF (NOT cc\$1)            ;if not(&a[i+1]>&a[N-1])
((GT r\$3 r\$2 (cc\$0))  ;cc0= (a[i+1] > max) (2nd it.)
(GT r\$4 r\$8 (cc\$1))  ;cc1= (&a[i+2]>&a[N-1]) (2nd it.)
(ADD? r\$4 4 (r\$4 mar\$0))  ;r4,mar0=&a[i+3] (3rd it.)
(GOTO L2))
ELSE                     ;(&a[i+1]>&a[N-1])
((GOTO exit)) ))      ;r2= maximum of array at exit
ELSE                         ;if (a[i]>max)
((COPY r\$1 (r\$2))         ;update max= a[i]
(COPY? r\$3 (r\$1))        ;can overwrite r1 with a[i+1] now
(IF (NOT cc\$1)           ;if not(&a[i+1]>&a[N-1])
((GT r\$3 r\$1 (cc\$0)) ;cc0= (a[i+1] > a[i]) (2nd it.)
;GT uses OLD r\$1
(GT r\$4 r\$8 (cc\$1)) ;cc1= (&a[i+2]>&a[N-1]) (2nd it.)