Writing an Emulator for the MIX Architecture

I’ve been reading a lot of TAOCP lately (that’s The Art of Computer Programming by Donald Knuth for you plebes out there 😛 ). This classic book series consists mainly of designing a lot of algorithms in machine language. The machine language used is that of the fictional MIX architecture, which Knuth created specifically for this work. MIX has never been implemented in hardware, though people have written emulators for it. In fact one of the questions on the famous Hacker Test is about running a MIX simulator. So I figured I would attempt this project myself.

First a brief description of the architecture: MIX can be treated as either a binary computer or a decimal computer. It works in bytes, each of which holds at least 64 and at most 100 distinct values (so 6 bits if treated as a binary computer and 2 digits if treated as a decimal computer). It has nine registers – A, X, I1, I2, I3, I4, I5, I6, and J – and exactly 4000 words of memory. In Knuth’s description a word is five bytes plus a sign; for this project I treat the sign as a sixth byte, so a word is six bytes, including one byte for the sign. Registers A and X and every memory cell hold a full word, while the index registers I1 through I6 and the jump register J hold only two bytes plus a sign. All instructions are one word in length and consist of one byte for the op-code, three bytes (including the sign byte) for the address, one byte for an index, and one byte for a modifier (the field specification) that selects which part of the register or memory word to use (the default is all six bytes).
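To make the address part of that layout concrete, here is a minimal sketch in C that rebuilds a signed address from the sign and the two address bytes. It assumes the binary interpretation (64 values per byte), and the names `MIX_BYTE` and `mix_address` are made up for illustration; they are not part of the emulator.

```c
#include <assert.h>

/* Assumption: binary MIX, so one byte holds 64 distinct values (6 bits). */
enum { MIX_BYTE = 64 };

/* Rebuild the signed address of an instruction from its sign and its two
 * address bytes. `negative` is nonzero for a minus sign. */
int mix_address( int negative, unsigned hi, unsigned lo ){
        assert( hi < MIX_BYTE && lo < MIX_BYTE );
        int magnitude = (int)( hi * MIX_BYTE + lo );
        return negative ? -magnitude : magnitude;
}
```

For example, address 2000 is stored as the bytes 31 and 16, since 31 × 64 + 16 = 2000, so `mix_address( 0, 31, 16 )` gives 2000.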

The emulator that I’m writing consists of two parts: an assembler to convert the MIX assembly language notation to machine language, and an execution environment to read the machine language file and execute the instructions on a simulated MIX machine. I started with the execution environment. The most obvious first step is to define the data types used as well as the data that will be operated on:


/* Define data types: */
struct word {
        unsigned int f0 : 2;
        unsigned int f1 : 6;
        unsigned int f2 : 6;
        unsigned int f3 : 6;
        unsigned int f4 : 6;
        unsigned int f5 : 6;
};

/* Define memory locations: */
struct word rA  = { 000000 };
struct word rX  = { 000000 };
struct word rI1 = { 000000 };
struct word rI2 = { 000000 };
struct word rI3 = { 000000 };
struct word rI4 = { 000000 };
struct word rI5 = { 000000 };
struct word rI6 = { 000000 };
struct word rJ  = { 000000 };
struct word memory[4000];

I cheated a little on my word type here by having the first bit-field be only two bits long when it’s technically supposed to be an entire byte. I figured it wasn’t a problem since you only really need one bit to hold the sign, and this way the bit-fields add up to exactly 32 bits, so they fit nicely into a single 32-bit unsigned integer (the exact layout of bit-fields is up to the compiler, but in practice that is what you get). If I made the sign byte 6 bits long like the rest of them, the fields would take up 36 bits, which would need a 64-bit integer with 28 bits wasted. Why waste 28 bits of the struct just so you can waste 5 bits of the sign? Seems kind of silly if you ask me.
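As a rough illustration of how the sign field and the five 6-bit bytes combine into one number, here is a small sketch that converts between a `struct word` and a C `long`. The convention that `f0 == 0` means plus and anything else means minus is my assumption for this sketch, and `word_from_long` and `word_value` are hypothetical helper names.

```c
#include <assert.h>

/* Same layout as the word type above: a 2-bit sign field and five 6-bit bytes. */
struct word {
        unsigned int f0 : 2;
        unsigned int f1 : 6;
        unsigned int f2 : 6;
        unsigned int f3 : 6;
        unsigned int f4 : 6;
        unsigned int f5 : 6;
};

/* Build a word from a C integer; f1 is the most significant byte. */
struct word word_from_long( long value ){
        long magnitude = value < 0 ? -value : value;
        assert( magnitude < (1L << 30) ); /* five 6-bit bytes hold 30 bits */
        struct word w;
        w.f0 = value < 0;                 /* assumed: 0 = plus, 1 = minus */
        w.f5 = magnitude & 63; magnitude >>= 6;
        w.f4 = magnitude & 63; magnitude >>= 6;
        w.f3 = magnitude & 63; magnitude >>= 6;
        w.f2 = magnitude & 63; magnitude >>= 6;
        w.f1 = magnitude & 63;
        return w;
}

/* Convert a word back to a C integer (plus zero and minus zero both give 0). */
long word_value( struct word w ){
        unsigned int bytes[5] = { w.f1, w.f2, w.f3, w.f4, w.f5 };
        long magnitude = 0;
        for( int i = 0; i < 5; i++ ){
                magnitude = magnitude * 64 + bytes[i];
        }
        return w.f0 ? -magnitude : magnitude;
}
```

One thing this sketch glosses over is that MIX distinguishes +0 from -0, which a plain C integer cannot represent.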

Now technically a byte is supposed to encompass 64 values if you’re using binary and 100 values if you’re using decimal. I don’t know of any way to implement this other than having two separate word types (one with 6-bit bytes for binary and one with 8-bit bytes for BCD) and operating on them both in parallel. Since that would make the program needlessly complicated, I’m going to skip over that for now. I just want to get the basic functionality working first before I start adding more bells and whistles to make the program more true to the original theoretical model.
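For what it’s worth, one possible alternative to two parallel word types would be to give every MIX byte its own C byte and pass the byte size around as a parameter. The sketch below is only that, a sketch: `struct mix_word` and `mix_encode` are hypothetical names, and it is not what the emulator currently does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical alternative word type: one C byte per MIX byte, so the same
 * struct works whether a MIX byte holds 64 values (binary) or 100 (decimal). */
struct mix_word {
        uint8_t negative;  /* 0 = plus, 1 = minus */
        uint8_t byte[5];   /* byte[0] is the most significant MIX byte */
};

/* Split a value into a sign and five MIX bytes in the given radix.
 * Anything that does not fit in five bytes is silently dropped. */
struct mix_word mix_encode( long value, unsigned radix ){
        assert( radix >= 64 && radix <= 100 );
        struct mix_word w = { value < 0, { 0 } };
        unsigned long magnitude = value < 0 ? -(unsigned long) value
                                            : (unsigned long) value;
        for( int i = 4; i >= 0; i-- ){
                w.byte[i] = (uint8_t)( magnitude % radix );
                magnitude /= radix;
        }
        return w;
}
```

With this layout, 2000 comes out as the bytes 31 and 16 when the radix is 64, and as 20 and 0 when the radix is 100. The cost is that a word takes six C bytes instead of four.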

The execution environment lives in the file run.c. This file has a main function, an instruction decoding function, and a function for each of the individual instructions in the language. The main function starts by opening the executable file, then checks that the file size is a multiple of 4 (because each instruction is stored as 4 bytes on disk), and then starts reading the instructions from the file. I haven’t tested this particular file yet, but this is what I have so far:


#include "mix.h"

void decode( struct word * );
void ADD( unsigned int, unsigned int, unsigned int );

/* Dispatch on the op-code field; not called from main yet. */
void decode( struct word *instruction ){
        // Parenthesized because + binds tighter than <<
        unsigned int address = (instruction->f1 << 6) + instruction->f2;
        switch( instruction->f5 ){
                case  1 : ADD( instruction->f4, instruction->f3, address ); break;
                default : fprintf( stderr, "Invalid instruction\n" ); exit( 1 );
        }
}

/* Stub: the ADD instruction is not implemented yet. */
void ADD( unsigned int f4, unsigned int f3, unsigned int address ){
        (void) f4; (void) f3; (void) address;
}

int main( int argc, char **argv ){
        FILE *fp;
        if( argc < 2 ){
                fprintf( stderr, "Usage: %s file\n", argv[0] );
                exit( 1 );
        }
        // Open file (binary mode, since it holds raw machine words):
        if( !(fp = fopen( argv[1], "rb" )) ){
                fprintf( stderr, "%s: %s: %s\n", argv[0], argv[1], strerror( errno ) );
                exit( errno );
        }
        // Make sure file is the right size:
        fseek( fp, 0, SEEK_END );
        if( ftell( fp ) % 4 ){
                fprintf( stderr, "Size of input file needs to be a multiple of 4\n" );
                fclose( fp );
                exit( -1 );
        }
        fseek( fp, 0, SEEK_SET );
        // Read file, one 4-byte instruction at a time:
        int i = 0;
        uint32_t instruction;
        while( fread( &instruction, 4, 1, fp ) == 1 ){
                printf( "Instruction %d:\nOp-code: %x\nModifier: %x\nIndex: %x\nAddress: %x\n\n",
                        i, f5_bits( instruction ), f4_bits( instruction ), f3_bits( instruction ), address_bits( instruction ) );
                i++;
        }
        fclose( fp );
        return 0;
}
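The read loop above depends on the byte order of the machine it runs on. A hedged alternative is to read the four bytes individually and put them together by hand, assuming the file stores each instruction least significant byte first (the order that shows up in the hexdump further down). `read_instruction` is a hypothetical helper, not something that exists in run.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read one 4-byte instruction, least significant byte first.
 * Returns 1 on success and 0 on end of file or a short read. */
int read_instruction( FILE *fp, uint32_t *out ){
        unsigned char raw[4];
        assert( fp != NULL && out != NULL );
        if( fread( raw, 1, sizeof raw, fp ) != sizeof raw ) return 0;
        *out = (uint32_t) raw[0]
             | (uint32_t) raw[1] << 8
             | (uint32_t) raw[2] << 16
             | (uint32_t) raw[3] << 24;
        return 1;
}
```

A loop like `while( read_instruction( fp, &instruction ) )` would then replace the `fread` call, and the same file would decode identically on a big-endian host.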

In the last 24 hours I have also made significant progress on the other two files – the header file and the C file for the assembler. The header file is called mix.h. I have moved most of my macro and type definitions, global variable declarations, and function prototypes to this file.


 1 #ifndef _MIX_H_
 2 #define _MIX_H_
 3 
 4 #include <stdio.h>
 5 #include <stdlib.h>
 6 #include <errno.h>
 7 #include <stdbool.h>
 8 #include <string.h>
 9 #include <stdint.h>
10 
11 #define lookup "lookup.tbl"
12 
13 struct lookup_table {
14         char mnemonic[8];
15         int opcode;
16         int modifier;
17         struct lookup_table *next;
18 };
19 
20 struct lookup_table *lookup_head;
21 struct lookup_table *lookup_cur;
22 
23 /* Word type used as the base memory unit */
24 struct word {
25         unsigned int f0 : 2;
26         unsigned int f1 : 6;
27         unsigned int f2 : 6;
28         unsigned int f3 : 6;
29         unsigned int f4 : 6;
30         unsigned int f5 : 6;
31 };
32 
33 /* Ranges of bits in a 4-byte instruction (mask 63 = six bits, mask 3 = two bits) */
34 #define f0_bits( i ) (((i) >> 30) &  3u)
35 #define f1_bits( i ) (((i) >> 24) & 63u)
36 #define f2_bits( i ) (((i) >> 18) & 63u)
37 #define f3_bits( i ) (((i) >> 12) & 63u)
38 #define f4_bits( i ) (((i) >>  6) & 63u)
39 #define f5_bits( i )  ((i)        & 63u)
40 
41 #define address_bits( i ) ((f0_bits( i ) << 12) | (f1_bits( i ) << 6) | f2_bits( i ))
42 
43 #define flatten( w ) (((uint32_t) (w).f0 << 30) | ((w).f1 << 24) | ((w).f2 << 18) | ((w).f3 << 12) | ((w).f4 << 6) | (w).f5)
44 
45 /* Define memory locations: */
46 struct word rA  = { 000000 };
47 struct word rX  = { 000000 };
48 struct word rI1 = { 000000 };
49 struct word rI2 = { 000000 };
50 struct word rI3 = { 000000 };
51 struct word rI4 = { 000000 };
52 struct word rI5 = { 000000 };
53 struct word rI6 = { 000000 };
54 struct word rJ  = { 000000 };
55 struct word memory[4000];
56 
57 #define N 0 // No value
58 #define E 1 // Equal
59 #define L 2 // Less than
60 #define G 3 // Greater than
61 
62 bool overflow = false;
63 unsigned char cmpflag = N;
64 
65 __BEGIN_DECLS
66 char *upcase( char * );
67 unsigned char opc_num( char * );
68 unsigned char mod_num( char * );
69 unsigned char idx_num( char * );
70 unsigned char low_num( char * );
71 unsigned char upr_num( char * );
72 __END_DECLS
73 
74 #endif

The first section of this header file after the #include guard and the header file inclusions defines a lookup table, which is used to hold data from an external file that tells the MIX simulator what mnemonics correspond to which opcode numbers. Since there can be multiple mnemonics for a single opcode number depending on the value of the modifier, I included the modifier in the lookup table as well. The opcode lookup table is a linked list of structs that the MIX simulator will scan whenever it needs to match a mnemonic to a number (in the case of the assembler) or an opcode/modifier pair to a specific procedure (in the case of the execution environment).
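The execution-environment direction of that lookup (opcode/modifier pair to mnemonic) could look something like the sketch below. To keep it short it uses a fixed array with a handful of rows rather than the linked list read from the file, and it represents the `*` column as `ANY_MODIFIER`; both of those are assumptions of the sketch, as are the names `op_entry` and `find_mnemonic`.

```c
#include <assert.h>
#include <stddef.h>

/* ANY_MODIFIER stands in for the "*" column of the lookup file. */
enum { ANY_MODIFIER = -1 };

struct op_entry {
        const char *mnemonic;
        int opcode;
        int modifier;
};

/* A few rows only, copied by hand for illustration. */
static const struct op_entry op_table[] = {
        { "ADD",  1, ANY_MODIFIER },
        { "FADD", 1, 6 },
        { "NUM",  5, 0 },
        { "CHAR", 5, 1 },
        { "HLT",  5, 2 },
};

/* Return the mnemonic for an opcode/modifier pair, or NULL if none matches.
 * An exact modifier match wins over a wildcard row. */
const char *find_mnemonic( int opcode, int modifier ){
        const char *fallback = NULL;
        assert( opcode >= 0 && opcode < 64 );
        for( size_t i = 0; i < sizeof op_table / sizeof op_table[0]; i++ ){
                if( op_table[i].opcode != opcode ) continue;
                if( op_table[i].modifier == modifier ) return op_table[i].mnemonic;
                if( op_table[i].modifier == ANY_MODIFIER ) fallback = op_table[i].mnemonic;
        }
        return fallback;
}
```

Here `find_mnemonic( 1, 6 )` picks FADD, while `find_mnemonic( 1, 5 )` falls back to ADD because no row names modifier 5 explicitly.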

After the definition of the word type is a series of macro definitions (lines 34-43) whose purpose is to either extract sections of bits from a 4-byte instruction so they can be operated on as fields in a word structure, or flatten a word structure into a 4-byte instruction to be written to a machine code file by the assembler. The first six of these each correspond to one of the bit-fields, and their purpose is to derive the bit-fields from the corresponding sections of the instruction code. The seventh macro combines the three address fields into a whole address. The eighth macro flattens a word structure into a single int value so it can be written to a binary file.
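The same extract-and-flatten round trip can be written as two small functions, which makes the field widths easier to check: each 6-bit field needs the mask 63 and the 2-bit sign field needs the mask 3. This is a sketch using functions in place of the macros, and `field_bits` and `pack_fields` are names I made up for it.

```c
#include <assert.h>
#include <stdint.h>

/* Extract field n (0 = sign field, 5 = op-code) from a packed instruction.
 * Field 0 is 2 bits wide at bits 30-31; fields 1-5 are 6 bits wide each. */
uint32_t field_bits( uint32_t packed, int n ){
        assert( n >= 0 && n <= 5 );
        if( n == 0 ) return packed >> 30;
        return ( packed >> ( 6 * ( 5 - n ) ) ) & 63u;
}

/* Pack six field values into one 32-bit instruction (inverse of field_bits). */
uint32_t pack_fields( const uint32_t f[6] ){
        uint32_t packed = ( f[0] & 3u ) << 30;
        for( int n = 1; n <= 5; n++ ){
                packed |= ( f[n] & 63u ) << ( 6 * ( 5 - n ) );
        }
        return packed;
}
```

Packing the fields 0, 0x1f, 0x10, 0x3c, 0x0d, 0x01 gives 0x1f43c341, and combining fields 1 and 2 of that value as `(f1 << 6) | f2` gives back the address 2000.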

The file I’ve made the most progress on, and the component I’m closest to completing, is asm.c – the assembler. This program is somewhat similar in structure to the execution environment. It starts by parsing its command line parameters to determine which input file and which output file to use. If none are provided it uses standard input and standard output respectively. It then reads the opcode lookup table from an auxiliary file (lines 36-58) and builds a linked list that stores all the lookup information. After that it reads the input assembly file, parsing each line and dividing it into tokens, which it uses to build a word structure. Finally it flattens that word structure into a 4-byte integer, which it writes to the output file (the executable).


  1 #include "mix.h"
  2 #include <ctype.h>
  3 
  4 // To debug this program, compile with option -D_DEBUG
  5 
  6 int main( int argc, char **argv ){
  7         char *infile = NULL;
  8         char *outfile = NULL;
  9         /* Parse command line options: */
 10         for( int i = 1; i < argc; i++ ){
 11                 if( argv[i][0] == '-' ){
 12                         if( argv[i][1] == 'o' && i + 1 < argc ){
 13                                 outfile = argv[++i];
 14                         }
 15                 }
 16                 else infile = argv[i];
 17         }
 18         /* Open input file stream: */
 19         FILE *fp;
 20         FILE *fq;
 21         if( !infile ) fp = stdin;
 22         else if( !(fp = fopen( infile, "r" )) ){
 23                 fprintf( stderr, "%s: %s: %s\n", argv[0], infile, strerror( errno ) );
 24                 exit( errno );
 25         }
 26         /* Open output file stream (binary, since it holds raw machine words): */
 27         if( !outfile ) fq = stdout;
 28         else if( !(fq = fopen( outfile, "wb" ) ) ){
 29                 fprintf( stderr, "%s: %s: %s\n", argv[0], outfile, strerror( errno ) );
 30                 exit( errno );
 31         }
 32 
 33         int c;
 34         int bufsize = 64;
 35         char *buf = (char *) malloc( bufsize );
 36         /* Read lookup table from file: */
 37         FILE *lp;
 38         int i = 0;
 39         if( !(lp = fopen( lookup, "r" ) ) ){
 40                 fprintf( stderr, "Can't find opcode lookup table: %s\n", strerror( errno ) );
 41                 exit( errno );
 42         }
 43         lookup_head = (struct lookup_table *) calloc( 1, sizeof( struct lookup_table ) ); // calloc so ->next starts out NULL
 44         strcpy( lookup_head->mnemonic, "dummy" );
 45         lookup_head->opcode = -1;
 46         lookup_cur = lookup_head;
 47         i = 0;
 48         while( (c = fgetc( lp )) != EOF ){
 49                 ungetc( c, lp );
 50                 fgets( buf, bufsize, lp );
 51                 lookup_cur->next = (struct lookup_table *) calloc( 1, sizeof( struct lookup_table ) );
 52                 lookup_cur = lookup_cur->next;
 53                 strcpy( lookup_cur->mnemonic, strtok( buf, " \t" ) );
 54                 lookup_cur->opcode = atoi( strtok( NULL, " \t" ) );
 55                 lookup_cur->modifier = atoi( strtok( NULL, "\n" ) );
 56         }
 57         lookup_cur = lookup_head;
 58         fclose( lp );
 59         /* Start processing input: */
 60         while( (c = fgetc( fp )) != EOF ){
 61                 ungetc( c, fp );
 62                 // Get input line from file:
 63                 fgets( buf, bufsize, fp );
 64                 while( buf[strlen( buf )-1] != '\n' && !feof( fp ) ){
 65                         buf = (char *) realloc( buf, bufsize<<1 );
 66                         fgets( buf + bufsize - 1, bufsize, fp );
 67                         bufsize <<= 1;
 68                 }
 69                 bufsize = 64;
 70                 struct word instruction;
 71                 // Parse input line:
 72                 char *noncomment = strtok( buf, ";\n" );
 73                 char *opcode = noncomment ? strtok( noncomment, " \t\n" ) : NULL;
 74                 char *address = opcode ? strtok( NULL, " \t\n;" ) : NULL;
 75                 if( !opcode ) continue; // Skip blank and comment-only lines
 76                 char *base = NULL;
 77                 char *extra = NULL, *first = NULL;
 78                 char *modification = NULL;
 79                 char *index = NULL;
 80                 if( address ){ // Skip if there is no address field
 81                         base = strtok( address, "," );
 82                         extra = strtok( NULL, "\0" );
 83                         first = strtok( extra, "()" );
 84                         if( !first ); // Placeholder to avoid segfaults
 85                         else if( strlen( first ) == 3 && first[1] == ':' ) modification = first;
 86                         else if( strlen( first ) ){
 87                                 index = first;
 88                                 modification = strtok( NULL, "()" );
 89                         }
 90                 }
 91                 if( index == NULL || index[0] == '\0' ) index = "0";
 92                 if( modification == NULL || modification[0] == '\0' ) modification = "0:5";
 93                 if( base == NULL ) base = "0";
 94                 // Set field values:
 95                 instruction.f5 = opc_num( opcode );
 96                 instruction.f4 = mod_num( modification );
 97                 instruction.f3 = idx_num( index );
 98                 instruction.f2 = low_num( base );
 99                 instruction.f1 = upr_num( base );
100                 instruction.f0 = 0;
101 
102                 /* BEGIN DEBUG SECTION */
103                 #ifdef _DEBUG
104                 printf( "opcode:\t%s\nbase:\t%s\nmod:\t%s\nindex:\t%s\n",
105                          opcode, address, base, extra, modification, index ); // six arguments for four %s: the last two are never printed
106                 printf( "f5:\t%x\nf4:\t%x\nf3:\t%x\nf2:\t%x\nf1:\t%x\nf0:\t%x\n",
107                          instruction.f5, instruction.f4, instruction.f3, instruction.f2, instruction.f1, instruction.f0 );
108                 int intbase = atoi( base );
109                 // Print binary representation of base:
110                 printf( "Base:\t" );
111                 for( int i = 11; i >= 0; i-- ){
112                         putchar( intbase & (1<<i) ? '1' : '0' );
113                 }
114                 putchar( '\n' );
115                 // Print binary representation of base sections:
116                 printf( "f1|f2:\t" );
117                 for( int i = 5; i >= 0; i-- ){
118                         putchar( instruction.f1 & (1<<i) ? '1' : '0' );
119                 }
120                 putchar( '|' );
121                 for( int i = 5; i >= 0; i-- ){
122                         putchar( instruction.f2 & (1<<i) ? '1' : '0' );
123                 }
124                 printf( "\n\n" );
125                 /* END DEBUG SECTION */
126 
127                 // Write output:
128                 #else
129                 int32_t f = flatten( instruction );
130                 fwrite( &f, 4, 1, fq );
131                 #endif
132         }
133         fclose( fp );
134         fclose( fq );
135         return 0;
136 }
137 
138 // Converts a string to uppercase
139 char *upcase( char *str ){
140         int len = strlen( str ) < 9 ? (int) strlen( str ) : 9; // Clamp so the copy fits in copy[10]
141         static char copy[10];
142         for( int i = 0; i < len; i++ ){
143                 copy[i] = toupper( (unsigned char) str[i] );
144         }
145         copy[len] = '\0';
146         return copy;
147 }
148 
149 // Returns integer value for modifier string
150 unsigned char mod_num( char *str ){
151         unsigned char lower = str[0] - '0';
152         unsigned char upper = str[2] - '0';
153         return upper + (lower << 3);
154 }
155 
156 // Returns integer value for index string
157 unsigned char idx_num( char *str ){
158         return atoi( str );
159 }
160 
161 // Returns lower 6 bits of address field
162 unsigned char low_num( char *str ){
163         return f5_bits( atoi( str ) );
164 }
165 
166 // Returns upper 6 bits of address field
167 unsigned char upr_num( char *str ){
168         return f4_bits( atoi( str ) );
169 }
170 
171 // Returns opcode number for a mnemonic
172 unsigned char opc_num( char *str ){
173         char *opcode = upcase( str );
174         lookup_cur = lookup_head;
175         while( lookup_cur->next ){
176                 lookup_cur = lookup_cur->next;
177                 if( !strcmp( opcode, lookup_cur->mnemonic ) ){
178                         struct lookup_table *temp = lookup_cur;
179                         lookup_cur = lookup_head;
180                         return temp->opcode;
181                 }
182         }
183         return -1;
184 }
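The operand parsing in the middle of the listing above relies on several nested `strtok` calls. As a point of comparison, here is a hedged sketch of the same job done with `strtol`, which reports where each number ends. The `struct operand` type and `parse_operand` function are hypothetical and are not part of asm.c; the accepted syntax (including a comma directly before the parenthesis) is modelled on my test file rather than on any official grammar.

```c
#include <assert.h>
#include <stdlib.h>

struct operand {
        int base;   /* address part */
        int index;  /* index part, 0 if omitted */
        int lower;  /* L of the (L:R) field, 0 if omitted */
        int upper;  /* R of the (L:R) field, 5 if omitted */
};

/* Parse "BASE[,INDEX][(L:R)]"; a comma may also directly precede "(L:R)".
 * Returns 1 on success, 0 on a syntax error. */
int parse_operand( const char *text, struct operand *out ){
        char *end;
        assert( text != NULL && out != NULL );
        out->index = 0;
        out->lower = 0;
        out->upper = 5;
        out->base = (int) strtol( text, &end, 10 );  /* 0 if there are no digits */
        if( *end == ',' ){
                end++;
                out->index = (int) strtol( end, &end, 10 );
        }
        if( *end == '(' ){
                out->lower = (int) strtol( end + 1, &end, 10 );
                if( *end != ':' ) return 0;
                out->upper = (int) strtol( end + 1, &end, 10 );
                if( *end != ')' ) return 0;
                end++;
        }
        return *end == '\0';
}
```

With this, `"2000,60(1:5)"` parses to base 2000, index 60, field (1:5), and `"2000,(4:5)"` parses to base 2000, index 0, field (4:5), without modifying the input string.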

You might notice that there’s an optional debug section used to print the values of all the fields to make sure they’re correct. In fact, the information is only written to the output file when the debug information is not set to be printed – the two options are mutually exclusive.

There follow a number of function definitions, including the five functions that derive the fields of the word structure and a function to convert a string into uppercase (so the assembler doesn’t need to be case-sensitive). The last function is the most complicated, but is still fairly simple: it traverses the lookup table until it finds a match, then returns the opcode number it finds there. If it can’t find anything it returns an error status of -1.
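The modifier function packs a range (L:R) into the single number 8L + R, which is the same thing as `upper + (lower << 3)`. A small sketch of that encoding and its inverse, with a range check added, might look like this; `field_encode` and `field_decode` are illustrative names and not functions in asm.c.

```c
#include <assert.h>

/* Encode a field specification (L:R) as the single number 8*L + R. */
unsigned char field_encode( int lower, int upper ){
        assert( 0 <= lower && lower <= upper && upper <= 5 );
        return (unsigned char)( 8 * lower + upper );
}

/* Recover L and R from an encoded field specification. */
void field_decode( unsigned char f, int *lower, int *upper ){
        *lower = f / 8;
        *upper = f % 8;
}
```

So (1:5) encodes to 13 (hex d), (4:5) to 37 (hex 25), and the default (0:5) to 5.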

I have tested my assembler with the following MIX assembly language file:


        add     2000,60(1:5)    ; First instruction
        sub     2000,70(1:3)    ; Second instruction
        mul     2000,50         ; Third instruction
        div     2000,(4:5)      ; Fourth instruction
        char                    ; Fifth instruction
        hlt                     ; Sixth instruction

It produces a binary file whose hexdump looks like this:


$ hexdump -C test.exec
00000000  41 c3 43 1f c2 62 40 1f  43 21 43 1f 44 09 40 1f  |A.C..b@.C!C.D.@.|
00000010  45 01 00 00 45 01 00 00                           |E...E...|
00000018

To be sure that it works, here is the debug information printed when -D_DEBUG is set:


opcode: add
base:   2000
mod:    2000
index:  60
f5:     1
f4:     d
f3:     3c
f2:     10
f1:     1f
f0:     0
Base:   011111010000
f1|f2:  011111|010000

opcode: sub
base:   2000
mod:    2000
index:  70
f5:     2
f4:     b
f3:     6
f2:     10
f1:     1f
f0:     0
Base:   011111010000
f1|f2:  011111|010000

opcode: mul
base:   2000
mod:    2000
index:  50
f5:     3
f4:     5
f3:     32
f2:     10
f1:     1f
f0:     0
Base:   011111010000
f1|f2:  011111|010000

opcode: div
base:   2000
mod:    2000
index:  (4:5
f5:     4
f4:     25
f3:     0
f2:     10
f1:     1f
f0:     0
Base:   011111010000
f1|f2:  011111|010000

opcode: char
base:   (null)
mod:    0
index:  (null)
f5:     5
f4:     5
f3:     0
f2:     0
f1:     0
f0:     0
Base:   000000000000
f1|f2:  000000|000000

opcode: hlt
base:   (null)
mod:    0
index:  (null)
f5:     5
f4:     5
f3:     0
f2:     0
f1:     0
f0:     0
Base:   000000000000
f1|f2:  000000|000000

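Looks like I still need to fix the debug labels: the first debug printf has four %s conversions but is passed six arguments, so the lines labelled “base”, “mod” and “index” actually show the address token, the base, and the leftover extra string. That is where the odd-looking “(4:5” on the fourth instruction comes from; its encoded fields (f4 = 25, f3 = 0) are what they should be. The index values 60, 70 and 50 are just test values (70 doesn’t fit in six bits and comes out as 6), and CHAR and HLT still assemble to the same word because the modifier column of the lookup table isn’t used yet. Other than that, it works.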

I’ve been building my lookup table based on the information I found in this article. I don’t fully understand all the instructions yet, but I’ve built up a fairly sizable file so far…


NOP      0      *
ADD      1      *
FADD     1      6
SUB      2      *
FSUB     2      6
MUL      3      *
FMUL     3      6
DIV      4      *
FDIV     4      6
NUM      5      0
CHAR     5      1
HLT      5      2
AND      5      3
OR       5      4
XOR      5      5
FLOT     5      6
FIX      5      7
NUME     5      100
SLA      6      0
SRA      6      1
SLAX     6      2
SRAX     6      3
SLC      6      4
SRC      6      5
SLB      6      6
SRB      6      7
INT      6      9
MOVE     7      *

I’ve also created my own Vim syntax file just for this project:


" Syntax highlighting specific to this project

syntax keyword cFunction f0_bits
syntax keyword cFunction f1_bits
syntax keyword cFunction f2_bits
syntax keyword cFunction f3_bits
syntax keyword cFunction f4_bits
syntax keyword cFunction f5_bits
syntax keyword cFunction address_bits
syntax keyword cFunction flatten
syntax keyword cFunction upcase
syntax keyword cFunction opc_num
syntax keyword cFunction mod_num
syntax keyword cFunction idx_num
syntax keyword cFunction low_num
syntax keyword cFunction upr_num

syntax keyword cConstant lookup
syntax keyword cConstant N
syntax keyword cConstant E
syntax keyword cConstant L
syntax keyword cConstant G

And I made a tags file for convenience and because I felt like it:


!_TAG_FILE_FORMAT       2       /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED       1       /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR    Darren Hiebert  /dhiebert@users.sourceforge.net/
!_TAG_PROGRAM_NAME      Exuberant Ctags //
!_TAG_PROGRAM_URL       http://ctags.sourceforge.net    /official site/
!_TAG_PROGRAM_VERSION   5.8     //
ADD     run.c   /^void ADD( unsigned int f4, unsigned int f3, unsigned int address ){$/;"       f
E       mix.h   58;"    d
G       mix.h   60;"    d
L       mix.h   59;"    d
N       mix.h   57;"    d
_MIX_H_ mix.h   2;"     d
address_bits    mix.h   41;"    d
cmpflag mix.h   /^unsigned char cmpflag = N;$/;"        v
decode  run.c   /^void decode( struct word *instruction ){$/;"  f
f0      mix.h   /^      unsigned int f0 : 2;$/;"        m       struct:word
f0_bits mix.h   34;"    d
f1      mix.h   /^      unsigned int f1 : 6;$/;"        m       struct:word
f1_bits mix.h   35;"    d
f2      mix.h   /^      unsigned int f2 : 6;$/;"        m       struct:word
f2_bits mix.h   36;"    d
f3      mix.h   /^      unsigned int f3 : 6;$/;"        m       struct:word
f3_bits mix.h   37;"    d
f4      mix.h   /^      unsigned int f4 : 6;$/;"        m       struct:word
f4_bits mix.h   38;"    d
f5      mix.h   /^      unsigned int f5 : 6;$/;"        m       struct:word
f5_bits mix.h   39;"    d
flatten mix.h   43;"    d
idx_num asm.c   /^unsigned char idx_num( char *str ){$/;"       f
lookup  mix.h   11;"    d
lookup_cur      mix.h   /^struct lookup_table *lookup_cur;$/;"  v       typeref:struct:lookup_table
lookup_head     mix.h   /^struct lookup_table *lookup_head;$/;" v       typeref:struct:lookup_table
lookup_table    mix.h   /^struct lookup_table {$/;"     s
low_num asm.c   /^unsigned char low_num( char *str ){$/;"       f
main    asm.c   /^int main( int argc, char **argv ){$/;"        f
main    run.c   /^int main( int argc, char **argv ){$/;"        f
memory  mix.h   /^struct word memory[4000];$/;" v       typeref:struct:word
mnemonic        mix.h   /^      char mnemonic[8];$/;"   m       struct:lookup_table
mod_num asm.c   /^unsigned char mod_num( char *str ){$/;"       f
modifier        mix.h   /^      int modifier;$/;"       m       struct:lookup_table
next    mix.h   /^      struct lookup_table *next;$/;"  m       struct:lookup_table     typeref:struct:lookup_table::lookup_table
opc_num asm.c   /^unsigned char opc_num( char *str ){$/;"       f
opcode  mix.h   /^      int opcode;$/;" m       struct:lookup_table
overflow        mix.h   /^bool overflow = false;$/;"    v
rA      mix.h   /^struct word rA  = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rI1     mix.h   /^struct word rI1 = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rI2     mix.h   /^struct word rI2 = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rI3     mix.h   /^struct word rI3 = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rI4     mix.h   /^struct word rI4 = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rI5     mix.h   /^struct word rI5 = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rI6     mix.h   /^struct word rI6 = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rJ      mix.h   /^struct word rJ  = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
rX      mix.h   /^struct word rX  = { 0, 0, 0, 0, 0, 0 };$/;"   v       typeref:struct:word
upcase  asm.c   /^char *upcase( char *str ){$/;"        f
upr_num asm.c   /^unsigned char upr_num( char *str ){$/;"       f
word    mix.h   /^struct word {$/;"     s

This is the first time I’ve ever attempted something like this. I know it’s only an assembler, and assemblers are generally easier to implement than compilers for high-level languages since it’s more or less a 1-to-1 translation of instructions, but this is the first time I’ve actually implemented a programming language. I’ve got the assembler almost completed, with the only parts remaining being to clean up a couple minor bugs and adapt the modifier field so that it can do different things depending on the opcode (right now it just selects a range, which is only how it’s supposed to work in certain cases). This functionality should be fairly easy to implement. After that my next step would be of course to write the execution environment, which will probably be a lot longer, since I have to implement a separate function for each instruction. I’ll obviously need some form of I/O for the execution environment, and I’m thinking I’ll use a debug console that shows the values of all the registers as well as the current instruction. Apparently there are MIX instructions for device I/O as well, so it will be fairly interesting figuring out how to implement those.
