Simulation of Cleopatra:

Generating an equivalent C program

Master's Project
Instructor: Azer Bestavros
Kyung-Suk Lhee
Boston University
January 1996


Abstract

This project aims to make a preprocessor of the programming language Cleopatra, created by Professor Azer Bestavros. The preprocessor parses Cleopatra specifications and generates an equivalent simulator written in C, which is then compiled and run on UNIX operating system. Various UNIX system calls are used in the C simulator. Pipe() and fork(), among others, are mainly used in simulating the parallelism of Cleopatra.


Table of Contents

  1. Simulation
  2. Parsing Cleopatra Specification
  3. Problems and Restrictions
  4. Components of Preprocessor
  5. Installation
  6. Sample Program

1. Simulation

Each class in a source program is decomposed to a set of functions so as to keep modularity and to make channel mapping easier. A function call in the class represents either Included Class or Time-constrainted Event-driven Transaction (TET). A class function body consists of a set of fork() to spawn processes. A process may be either for reading a channel or for performing a transaction specified in TET. A (set of) pipe simulates a channel (Signature or Internal Channel),and shared memory simulates the State Space of a class and serves as variable to be used by TETs in the class.


Input Channel Duplication

The main strategy of simulation is Signature / Internal Channel mapping (channel in short) to the Included Classes and TETs of a class. Since a channel can be shared by multiple classes or TETs, a pipe representing an input channel must be duplicated to accomodate the number of its users. For example, a class might have a few Included Classes and a few TETs as follows.
TRA-class main()
{

internal:
   -> x(), y(), z()

include:
   world    y() -> x(), z();
   control  x() -> y();

act:
   x() -> y():
   y() -> z():
}
Channel duplication starts when it is declared as Internal Channel, and the channel can be duplicated subsequently by Included Classes. For example, channel x is duplicated twice because the class control and the first TET have it as their input channel. An equivalent C functions would be like this (parameters are pipe descriptors).
world   (alias_y1, x, z);
control (alias_x1, y);

TET1    (alias_x2, y);
TET2    (alias_y2, z);
For each input channel, a process is spawned and acts as an agent responsible for channel mapping. The agent of channel x, for example, is like following.
while (1) {
   read(x, buffer);
   write(alias_x1, buffer);
   write(alias_x2, buffer);
}
Thus an agent scatters its incoming value to appropriate classe or TET. Output channels are mapped directly without duplication, thereby outgoing value directly going to the original channel. The channel x and y act as input channel and are subject to the duplication. The channel z, though it is implicitly declared as both input and output channel, is used for output only, thus is not duplicated.


Variations of duplication

There are some variations in channel mapping, based on the following observation. Channel duplication of the first two cases is the same as described before. In the third case, it is implied that the TET (or any other TET in the same class) will also fire to the ouput signature. For example,
TRA-class user
   -> x()               /* output signature */
{

act:
   init(), x() -> x():

}
Once the TET is initiated, it will continue triggering and firing itself. The outgoing value through the firing channel x will arrive at where it is defined as Internal Channel (possibly with a name other than x). Because the class does not know the context outside, the TET above should fire twice; once to the channel x (to outside) and once to itself. Equivalent C function would be like;

TET(x)
{
  while (1) {
    read(alias_x);       /* initially don't block */

    (commit section)

    write(x);            /* to outside */
    write(alias_x);      /* to itself  */
  }

}

If the channel x were an Internal Channel, then the TET could fire once to the channel x only, because the destination channel is known so that an agent process can be spawned as before.

For the last case, there can be many solutions. Two of them would be as follows. I opted for the second method. I have not used select or poll function, but I suspected there could be some peculiarities, such as interruptability by signal.


Output Channel Protection

When firing, the firing channel is protected by semaphore so that the value is queued properly at the destination pipe. For this, a semaphore is created when an Internal Channel is declared, and the semaphore is passed as a parameter to Included Class or TET as well as alias pipe of the channel. TET firing to a channel x, for example, would be as follows.
while (1) {

  (commit section)

  P(sem_x);             /* for mutual exclusion */
  write(x, buffer);
  V(sem_x);

}


Time Constraint

Sleep() and alarm() function is used for time constraint. If time delay is specified then TET sleeps immediately after it is triggered. If timeout is occured then TET does not fire, its state variable remaining unchanged since TET does not update the shared memory. Triggering value is timestamped so that TET can check the arrival time of the value and sleep or set alarm appropriately.


Channel Mapping In TET Level

TET function is composed of two kinds of processes; process that triggers the TET (one such process for each input channel), and a commit / firing process that performs comit: section and fires to output channel. Triggering process is independent on each other. It is connected with firing process with a pair of pipes. These two pipes (called data and info) are paired and always written and read at the same time. If there are two triggering channel in a TET, for exsample, then
while (1) {               /* process that triggers channel x */

  read(x, x_buffer);
  P(sem_write);           /* to protect triggering value */
  write(data, x_buffer);
  write(info, infobuffer);
  V(sem_write);

}

while (1) {               /* process that triggers channel y */

  read(y, y_buffer);
  P(sem_write);
  write(data, y_buffer);
  write(info, infobuffer);
  V(sem_write);

}

while (1) {               /* firing process */

  read(info, infobuffer);
  if (x triggered)
    read(data, x_buffer);
  else if (y triggered)
    read(data, y_buffer);

  (commit section)

}
In triggering process, the pipe info sends two values. One is the pipe descriptor (x or y) so that firing process can recognize which channel has triggered. The other is the time (by calling time() function) at which the channel is triggered. The triggering value is queued in the pipe data.


2. Parsing Cleopatra Specification

During scanning a source program the preprocessor biulds a data structure, which contains all the necessary information to subsequently generate an equivalent C program. The data structure is defined in the declare.h file. The syntax specification written in Yacc consists of only a necessary set of Cleopatra syntax. Many detailes are to be compiled by C compiler, which are the following.
  1. Declaration: type and declarator
    Type and declarator are parsed partially. Composite type, for example, can be used but its members are not parsed and left unknown. Structure and array can be used, but pointer should not be used in that State Space (variable) resides in shared memory area.
  2. Expression / assignment statement
    These are not parsed and copied verbatim to the target file.
Parts of Cleopatra specification that contains the above are listed below.
  1. State variable (in which incoming value is saved) in triggering channel of TET
    It is treated as a text, and is not checked whether it is valid or not. Text inside a triggering channel is copied verbatim to the target file. It is because composite type or variable is not completely parsed.
  2. Expression that is specified in firing channel of TET
  3. Actual parameter in Included Class
  4. Condition and time constraint in TET
  5. Init section and commit section in TET


3. Problems and Restrictions

  1. Error checking:
    Preprocessing a Cleopatra specification and compiling the generated C program are independent, thus tracing an error might be difficult.
  2. Partial syntax specification:
    Source program is not completely parsed. Programmer is responsible for such unparsed parts mentioned before.
  3. Resource cleanup upon termination:
    Semaphores and shared memories should be removed when program terminates. In this preprocessor, program runs for a specified time (in seconds). If, however, program terminates abnormally (i.e. by pressing control-C or a system call failure) then these resources could remain in the system and user might have to remove them manually.
  4. Time constraint:
    Granularity of time is second, in that the only available UNIX function for timestamping is time(), which returns current time in seconds.
  5. Init: section and commit section:
    They are copied as a string of characters. Moreover, it is returned by lex as a single token. Thus it could be a problem even for a moderately large commit section. It is also true for #include or #define section.
  6. Static variable:
    TET consists of heavyweight processes, so variable to be shared by multiple processes reside in shared memory area. If TETs were implemented as threads (i.e. they share an address space) then array or pointer could be feasible and used more efficiently.
  7. Syntax specification:
    I tried to conform to the original Cleopatra syntax, but there are some restrictions. For example, State variables must be initialized in Init: section, not in State: section, which only accepts declaration. More details are in cleo.y file.


4. Components of preprocesor

cleo.y: Cleopatra syntax specification
cleo.l: lex file
subr_1.c: functions for building data structure
subr_2.c: functions for producing target file
declare.h: data structure definitions
sub_1.h: function prototypes of subr_1.c
sub_2.h: function prototypes of subr_2.c
These are currently in the directory /home/grad/kyung/project/. Output executable is ccleo, which expects up to three command-line arguments. First one must be a file name (with extention .cleo) that contains valid Cleopatra specification, and second one may be the target file name or time (in seconds) that specifies how long the program will run. If target file name is not specified then default is the source file name with .c extention. Time argument must be preceded by - sign. The default value is 10 seconds.


5. Installation

To install preprocessor, copy makefile and the seven files above in your directory, and type 'make'. The name of preprocessor is 'ccleo'. The usage of preprocessor is 'ccleo source-file-name [target-file-name] [-time]'.


6. Sample Program

Source program written in Cleopatra

factorial.cleo

TRA-class main
   ->
{
   internal:
   -> in(int), out(int)

   included:
   fact in() -> out();
   start     -> in();
	
}

TRA-class fact
   request(int) -> result(int)
{
   state:
   int x, f;

   init:
   x = -1; f = 0;

   act:
   request(x) -> :
      commit { f = 1; }

   -> :
      unless (x < 1)
      commit {f = f * x; x--; }

   -> result(f) :
      unless (x > 0)
      commit {printf("result is: %d\n", f);}
}

TRA-class start
   -> send(int)
{
   act:
   init() -> send(6):
   ;
}
Generated C program (runs for three seconds)
ccleo factorial.cleo -3

output C file is factorial.c

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/sem.h>
#include <sys/stat.h>
#include <signal.h>
#include <setjmp.h>
#include <errno.h>

#define  SHM_MODE (SHM_R | SHM_W)
#define  PERMS    (S_IRUSR | S_IWUSR)
#define  CHAR     "c"
#define  ERR_MSG(x) perror("x error")

#define  lock(x)  \
while (((retval = semop(x, P, 1)) == -1) &&  \
(errno == EINTR)) ;                          \
if (retval == -1) ERR_MSG(sem P);

#define  unlock(x)  \
while (((retval = semop(x, V, 1)) == -1) &&  \
(errno == EINTR)) ;                          \
if (retval == -1) ERR_MSG(sem V);

typedef void Sigfunc(int);
Sigfunc  *signal(int, Sigfunc *);
int       pid, bufsize, sz, retval;
jmp_buf   env_alrm;
struct {
   int       pd;
   time_t    time;
}         buf;
struct sembuf P[1];
struct sembuf V[1];
union semun {
   int       val;
   struct semid_ds *buf;
   ushort   *array;
}         sem_init_arg;
void      sigfunc(int signo);
void      sig_alrm(int signo);
int
main()
{
   int       in[2];
   int       inbuf;
   int       sem_in;
   int       alias00[2];
   int       out[2];
   int       outbuf;
   int       sem_out;
   signal(SIGUSR1, SIG_DFL);
   signal(SIGALRM, SIG_IGN);
   sz = sizeof(buf);
   P[0].sem_num = 0;
   P[0].sem_op = -1;
   P[0].sem_flg = 0;
   V[0].sem_num = 0;
   V[0].sem_op = 1;
   V[0].sem_flg = 0;
   sem_init_arg.val = 1;
   if (pipe(in) == -1)
      ERR_MSG(pipe);
   if (pipe(alias00) == -1)
      ERR_MSG(pipe);
   if (pipe(out) == -1)
      ERR_MSG(pipe);
   if ((sem_in = semget(IPC_PRIVATE, 1, PERMS)) == -1)
      ERR_MSG(sem get);
   if (semctl(sem_in, 0, SETVAL, sem_init_arg) != 0)
      ERR_MSG(sem initialize);
   if ((sem_out = semget(IPC_PRIVATE, 1, PERMS)) == -1)
      ERR_MSG(sem get);
   if (semctl(sem_out, 0, SETVAL, sem_init_arg) != 0)
      ERR_MSG(sem initialize);
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      bufsize = sizeof(int);
      while (1) {
	 if (read(in[0], &inbuf, bufsize) != bufsize)
	    ERR_MSG(read);
	 if (write(alias00[1], &inbuf, bufsize) != bufsize)
	    ERR_MSG(write);
      }
      exit(0);
   }
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      bufsize = sizeof(int);
      while (1) {
	 if (read(out[0], &outbuf, bufsize) != bufsize)
	    ERR_MSG(read);
      }
      exit(0);
   }
   fact(alias00, out, sem_out);
   start(in, sem_in);
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      signal(SIGUSR1, sigfunc);
      pause();
      if (semctl(sem_in, 0, IPC_RMID, 0) == -1)
	 ERR_MSG(sem remove);
      if (semctl(sem_out, 0, IPC_RMID, 0) == -1)
	 ERR_MSG(sem remove);
      exit(0);
   } sleep(3);
   kill(0, SIGUSR1);
   return 0;
}
int
fact(int request[], int result[], int sem_result)
{
   int       requestbuf;
   int       alias00[2];
   int       resultbuf;
   int       trigger[2];
   char      triggerbuf;
   int       alias01[2];
   int       alias02[2];
   int       x;
   int      *shm_x;
   int       xid;
   int       f;
   int      *shm_f;
   int       fid;
   int       sem_var;
   if (pipe(alias00) == -1)
      ERR_MSG(pipe);
   if (pipe(trigger) == -1)
      ERR_MSG(pipe);
   if (pipe(alias01) == -1)
      ERR_MSG(pipe);
   if (pipe(alias02) == -1)
      ERR_MSG(pipe);
   if ((xid = shmget(IPC_PRIVATE, sizeof(int), SHM_MODE)) == -1)
      ERR_MSG(shm get);
   if ((shm_x = (int *) shmat(xid, 0, 0)) == (int *) -1)
      ERR_MSG(shm attach);
   if ((fid = shmget(IPC_PRIVATE, sizeof(int), SHM_MODE)) == -1)
      ERR_MSG(shm get);
   if ((shm_f = (int *) shmat(fid, 0, 0)) == (int *) -1)
      ERR_MSG(shm attach);
   if ((sem_var = semget(IPC_PRIVATE, 1, PERMS)) == -1)
      ERR_MSG(sem get);
   if (semctl(sem_var, 0, SETVAL, sem_init_arg) != 0)
      ERR_MSG(sem initialize);

   x = -1;
   f = 0;


   *shm_x = x;
   *shm_f = f;
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      bufsize = sizeof(int);
      while (1) {
	 if (read(request[0], &requestbuf, bufsize) != bufsize)
	    ERR_MSG(read);
	 if (write(alias00[1], &requestbuf, bufsize) != bufsize)
	    ERR_MSG(write);
	 if (write(trigger[1], CHAR, 1) != 1)
	    ERR_MSG(write);
      }
      exit(0);
   }
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      while (1) {
	 if (read(trigger[0], &triggerbuf, 1) != 1)
	    ERR_MSG(read);
	 if (write(alias01[1], CHAR, 1) != 1)
	    ERR_MSG(write);
	 if (write(alias02[1], CHAR, 1) != 1)
	    ERR_MSG(write);
      }
      exit(0);
   }
   facttet00(sem_var, shm_x, shm_f, alias00, trigger);
   facttet01(sem_var, shm_x, shm_f, alias01, trigger);
   facttet02(sem_var, shm_x, shm_f, alias02, result, sem_result);
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      signal(SIGUSR1, sigfunc);
      pause();
      if (shmctl(xid, IPC_RMID, NULL) == -1)
	 ERR_MSG(shm remove);
      if (shmctl(fid, IPC_RMID, NULL) == -1)
	 ERR_MSG(shm remove);
      if (semctl(sem_var, 0, IPC_RMID, 0) == -1)
	 ERR_MSG(sem remove);
      exit(0);
   } return 0;
}
int
facttet00(int sem_var, int *shm_x, int *shm_f, int request[], int fire[])
{
   int       info[2], data[2];
   int       requestbuf;
   char      firebuf;
   int       x;
   int       f;
   pipe(info);
   pipe(data);
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      bufsize = sizeof(int);
      while (1) {
	 if (read(request[0], &requestbuf, bufsize) != bufsize)
	    ERR_MSG(read);
	 if (write(data[1], &requestbuf, bufsize) != bufsize)
	    ERR_MSG(write);
	 if (write(info[1], &buf, sz) != sz)
	    ERR_MSG(write);
      }
      exit(0);
   }
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      while (1) {
	 if (read(info[0], &buf, sz) != sz)
	    ERR_MSG(read);
	 if (read(data[0], &requestbuf, sizeof(int)) != sizeof(int))
	    ERR_MSG(read);
	 lock(sem_var);
	 x = *(shm_x);
	 f = *(shm_f);
	 x = requestbuf;
	 unlock(sem_var);
	 f = 1;
	 lock(sem_var);
	 *(shm_x) = x;
	 *(shm_f) = f;
	 unlock(sem_var);
	 if (write(fire[1], CHAR, 1) != 1)
	    ERR_MSG(write);
      }
      exit(0);
   }
   return 0;
}
int
facttet01(int sem_var, int *shm_x, int *shm_f, int trigger[], int fire[])
{
   int       info[2], data[2];
   char      triggerbuf;
   char      firebuf;
   int       x;
   int       f;
   pipe(info);
   pipe(data);
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      bufsize = sizeof(char);
      while (1) {
	 if (read(trigger[0], &triggerbuf, bufsize) != bufsize)
	    ERR_MSG(read);
	 if (write(data[1], &triggerbuf, bufsize) != bufsize)
	    ERR_MSG(write);
	 if (write(info[1], &buf, sz) != sz)
	    ERR_MSG(write);
      }
      exit(0);
   }
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      while (1) {
	 if (read(info[0], &buf, sz) != sz)
	    ERR_MSG(read);
	 if (read(data[0], &triggerbuf, sizeof(char)) != sizeof(char))
	    ERR_MSG(read);
	 lock(sem_var);
	 x = *(shm_x);
	 f = *(shm_f);
	 if (x < 0) {
	    unlock(sem_var);
	    continue;
	 }
	 f = f * x;
	 x--;
	 if (x < 0) {
	    unlock(sem_var);
	    continue;
	 }
	 *(shm_x) = x;
	 *(shm_f) = f;
	 unlock(sem_var);
	 if (write(fire[1], CHAR, 1) != 1)
	    ERR_MSG(write);
      }
      exit(0);
   }
   return 0;
}
int
facttet02(int sem_var, int *shm_x, int *shm_f, int trigger[], int result[], int sem_result)
{
   int       info[2], data[2];
   char      triggerbuf;
   int       resultbuf;
   int       x;
   int       f;
   pipe(info);
   pipe(data);
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      bufsize = sizeof(char);
      while (1) {
	 if (read(trigger[0], &triggerbuf, bufsize) != bufsize)
	    ERR_MSG(read);
	 if (write(data[1], &triggerbuf, bufsize) != bufsize)
	    ERR_MSG(write);
	 if (write(info[1], &buf, sz) != sz)
	    ERR_MSG(write);
      }
      exit(0);
   }
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      while (1) {
	 if (read(info[0], &buf, sz) != sz)
	    ERR_MSG(read);
	 if (read(data[0], &triggerbuf, sizeof(char)) != sizeof(char))
	    ERR_MSG(read);
	 lock(sem_var);
	 x = *(shm_x);
	 f = *(shm_f);
	 if (x > 0) {
	    unlock(sem_var);
	    continue;
	 }
	 printf("result is: %d\n", f);
	 if (x > 0) {
	    unlock(sem_var);
	    continue;
	 }
	 *(shm_x) = x;
	 *(shm_f) = f;
	 unlock(sem_var);
	 lock(sem_result);
	 resultbuf = f;
	 if (write(result[1], &resultbuf, sizeof(int)) != sizeof(int))
	    ERR_MSG(write);
	 unlock(sem_result);
      }
      exit(0);
   }
   return 0;
}
int
start(int send[], int sem_send)
{
   int       sendbuf;
   starttet00(send, sem_send);
   return 0;
}
int
starttet00(int send[], int sem_send)
{
   int       sendbuf;
   if ((pid = fork()) < 0)
      ERR_MSG(fork);
   else if (pid == 0) {
      lock(sem_send);
      sendbuf = 6;
      if (write(send[1], &sendbuf, sizeof(int)) != sizeof(int))
	 ERR_MSG(write);
      unlock(sem_send);
      exit(0);
   }
   return 0;
}
void
sigfunc(int signo)
{
   return;
}
void
sig_alrm(int signo)
{
   longjmp(env_alrm, 1);
}