Assembler

In a nutshell: Every computer has a binary machine language, in which instructions are written as series of 0's and 1's, and a symbolic machine language, also known as assembly language, in which instructions are expressed using human-friendly mnemonics. The two languages are completely equivalent: every symbolic instruction corresponds to a binary one. But writing programs in assembly is far easier and safer than writing in binary. To enjoy this luxury, someone has to translate our symbolic programs into binary code that can execute as-is on the target computer. This translation is done by an agent called an assembler. The assembler can be either a person who carries out the translation manually, or a computer program that automates the process. In this module and the final project of the course, we learn how to build an assembler. In particular, we'll develop the capability of translating symbolic Hack programs into binary code that can be executed as-is on the Hack platform. Each of you can accomplish this feat in one of two ways: you can either implement an assembler using a high-level language, or you can simulate the assembler's operation using paper and pencil. In both cases we give detailed guidelines about how to carry out your work.

Key concepts: Binary and symbolic machine languages, parsing, symbol tables, code generation, cross assembler, assembler implementation.

Unit 6.1: Assembly Languages and Assemblers

  • Basic Assembler Logic

    • Repeat:

      • Read the next assembly language command

      • Break it into the different fields it is composed of

      • Look up the binary code for each field

      • Combine these codes into a single machine language command

      • Output this machine language command

  • The assembler translates assembly language to machine language

  • The assembler enters a symbol into the table only when that symbol has not appeared before.

Unit 6.2: The Hack Assembly Language

  • Assembly Program Elements

    • white space

      • Empty lines / indentation

      • Line comments

      • in-line comments

    • instructions

      • A

      • C

    • symbols

      • references

      • label declarations

  • Ignore white space!
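The white-space rule above can be sketched as a small cleaning step that runs before any translation. This is a minimal sketch, not the course's reference code: the names `clean_line` and `clean_program` are made up for illustration, and it assumes comments always start with `//`.

```python
def clean_line(line: str) -> str:
    """Return the instruction text of one source line, or '' if there is none."""
    # Drop everything from the first '//' onward (line or in-line comment).
    code = line.split("//", 1)[0]
    # Hack instructions carry no meaningful spaces, so remove all white space.
    return "".join(code.split())


def clean_program(lines):
    """Keep only the lines that still hold an instruction or a label."""
    cleaned = (clean_line(line) for line in lines)
    return [text for text in cleaned if text]


source = [
    "// Computes R0 = 2 + 3",
    "",
    "    @2",
    "    D=A   // D = 2",
    "    @3",
    "    D = D + A",
    "    @0",
    "    M=D",
]
print(clean_program(source))  # ['@2', 'D=A', '@3', 'D=D+A', '@0', 'M=D']
```

After this step, every remaining line is an A instruction, a C instruction, or a label declaration.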

Unit 6.3: The Assembly Process - Handling Instructions

  • Translating A Instructions

    • if value is a decimal constant, generate the equivalent 15-bit binary constant, prefixed with the op-code bit 0

    • if value is a symbol, resolve it through the symbol table (covered in Unit 6.4)

    • Example

      • What is the binary value of the instruction @9 ?

        • 0000000000001001

  • Translating C Instructions

    • Parse statement and save it into 3 individual fields

      • dest = comp ; jump

    • Example

      • What is the binary value of the instruction MD=A-1;JGE ?

        • 111 0 110010 011 011

  • For each instruction

    • parse, break into fields

    • A: translate the decimal value to binary

    • C: generate the binary code for each field, then assemble the codes into a full 16-bit instruction

    • Write the 16-bit instructions to the output file
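The two translation rules of this unit can be sketched as follows. The code tables follow the Hack machine language specification; the function names (`translate_a`, `parse_c`, `translate_c`) are illustrative, and the sketch assumes symbol-free input with white space already removed.

```python
# comp mnemonics that use the A register (a-bit = 0); the M variants are derived below.
_COMP_A = {
    "0": "101010", "1": "111111", "-1": "111010",
    "D": "001100", "A": "110000", "!D": "001101", "!A": "110001",
    "-D": "001111", "-A": "110011", "D+1": "011111", "A+1": "110111",
    "D-1": "001110", "A-1": "110010", "D+A": "000010", "D-A": "010011",
    "A-D": "000111", "D&A": "000000", "D|A": "010101",
}
COMP = {m: "0" + bits for m, bits in _COMP_A.items()}
COMP.update({m.replace("A", "M"): "1" + bits
             for m, bits in _COMP_A.items() if "A" in m})
DEST = {"": "000", "M": "001", "D": "010", "MD": "011",
        "A": "100", "AM": "101", "AD": "110", "AMD": "111"}
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def translate_a(instruction: str) -> str:
    """@value -> op-code 0 followed by the 15-bit binary value."""
    value = int(instruction[1:])
    return "0" + format(value, "015b")


def parse_c(instruction: str):
    """Split 'dest=comp;jump' into its fields; dest and jump may be absent."""
    dest, comp, jump = "", instruction, ""
    if "=" in comp:
        dest, comp = comp.split("=", 1)
    if ";" in comp:
        comp, jump = comp.split(";", 1)
    return dest, comp, jump


def translate_c(instruction: str) -> str:
    """C instruction -> 111 + a-bit and comp bits + dest bits + jump bits."""
    dest, comp, jump = parse_c(instruction)
    return "111" + COMP[comp] + DEST[dest] + JUMP[jump]


print(translate_a("@9"))          # 0000000000001001
print(translate_c("MD=A-1;JGE"))  # 1110110010011011
```

Both printed values match the worked examples in this unit. Deriving the M variants from the A table keeps the two halves of the comp table from drifting apart; writing out all 28 entries by hand would work equally well.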

Unit 6.4: The Assembly Process - Handling Symbols

  • Symbols

    • variable: represent memory locations where the programmer wants to maintain values

      • any symbol XXX that is not predefined and is not declared elsewhere with an (XXX) directive is treated as a variable

      • assigned a unique memory address, starting at 16

      • @variableSymbol

        • if first time, assign a unique memory address

        • else, replace with its value

    • label: represent destinations of goto instructions

      • declared by the pseudo-command (XXX)

      • this directive defines the symbol XXX to refer to the memory location holding the next instruction in the program

      • @labelSymbol -> replace with its value

    • pre-defined: represent special memory locations

      • 23 symbols

      • @preDefinedSymbol -> replace with its value

  • Symbol table

    • Contains symbol:value pairs

    • initialize with the predefined symbols

    • First pass: add the label symbols (look for lines that start with '(')

    • Second pass: add the variable symbols

  • The Assembly process

    • Initialization

      • Construct an empty symbol table

      • Add the predefined symbols to the symbol table

    • First pass

      • Scan the entire program

      • For each instruction of the form (xxx):

        • add the pair (xxx, address) to the symbol table, where address is the number of the instruction following (xxx)

    • Second pass

      • Set n to 16

      • Scan the entire program again; for each instruction:

        • If the instruction is @symbol, look up symbol in the table:

          • If (symbol, value) is found, use value to complete the instruction's translation;

          • if not found:

            • Add (symbol, n) to the symbol table

            • Use n to complete the instruction's translation

            • n++

        • If the instruction is a C instruction, complete the instruction's translation

        • Write the translated instruction to the output file
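The initialization and the two passes can be sketched as below. To keep the example short, this sketch stops at replacing every symbol with its numeric address rather than emitting binary, which is a simplification of the second pass described above. The function names are illustrative, and the input is assumed to be free of white space and comments.

```python
def build_predefined():
    """Return the 23 predefined Hack symbols."""
    table = {f"R{i}": i for i in range(16)}
    table.update({"SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4,
                  "SCREEN": 16384, "KBD": 24576})
    return table


def resolve_symbols(instructions):
    """Return (symbol-free instructions, symbol table)."""
    table = build_predefined()

    # First pass: record each label with the address of the next instruction.
    address = 0
    for text in instructions:
        if text.startswith("("):
            table[text[1:-1]] = address   # a label occupies no address itself
        else:
            address += 1

    # Second pass: allocate variables from address 16 and substitute values.
    n = 16
    resolved = []
    for text in instructions:
        if text.startswith("("):
            continue                      # label declarations generate no code
        if text.startswith("@") and not text[1:].isdigit():
            symbol = text[1:]
            if symbol not in table:       # first appearance: a new variable
                table[symbol] = n
                n += 1
            text = "@" + str(table[symbol])
        resolved.append(text)
    return resolved, table


program = ["@i", "M=1", "(LOOP)", "@i", "D=M", "@END", "D;JGT",
           "@LOOP", "0;JMP", "(END)", "@END", "0;JMP"]
resolved, table = resolve_symbols(program)
print(resolved)
# ['@16', 'M=1', '@16', 'D=M', '@8', 'D;JGT', '@2', '0;JMP', '@8', '0;JMP']
```

Two passes are needed because a label such as END can be referenced before it is declared; without the first pass, `@END` would be mistaken for a new variable.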

Unit 6.5: Developing a Hack Assembler

  • Main loop

    • Get the next command and parse it

    • Determine whether it is an A or a C instruction, and translate it accordingly

    • output the resulting machine language command
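One possible shape for this loop is sketched below, with the two translators passed in as parameters so the loop itself stays independent of how each instruction type is encoded. The name `assemble` and the stand-in translators are assumptions made for illustration only; the input is assumed to be symbol-free and stripped of white space.

```python
def assemble(instructions, translate_a, translate_c):
    """Translate instructions one at a time and collect the binary words."""
    binary = []
    for text in instructions:          # get the next command
        if text.startswith("@"):       # A instruction
            word = translate_a(text)
        else:                          # otherwise a C instruction
            word = translate_c(text)
        binary.append(word)            # output the machine language command
    return binary


# Stand-in translators, just enough to run the loop on a two-line program.
def demo_translate_a(text):
    return format(int(text[1:]), "016b")


def demo_translate_c(text):
    known = {"D=A": "1110110000010000"}  # hand-assembled; not a full table
    return known[text]


print(assemble(["@2", "D=A"], demo_translate_a, demo_translate_c))
# ['0000000000000010', '1110110000010000']
```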

Unit 6.6: Project 6 Overview: Programming Option

  • Contract

    • Develop a HackAssembler program

    • The source program is supplied: Xxx.asm

    • The generated code is written into a text file named Xxx.hack

    • Assumption: Xxx.asm is error-free

  • Proposed design

    • Parser: unpacks each instruction into its underlying fields

    • Code: translates each field into its corresponding binary value

    • SymbolTable: manages the symbol table

    • Main: initializes the I/O files and drives the process

  • Proposed implementation

    • staged development

      • develop a basic assembler that translates assembly programs without symbols

      • develop an ability to handle symbols

      • morph the basic assembler into one that can translate any assembly program

      • Supplied test programs

        • Add.asm: tests white space and instruction handling

        • Max.asm (with symbols) and MaxL.asm (without symbols)

  • Testing Options

    • Hardware simulator

      • load Xxx.hack into Hack Computer chip, then execute it

    • CPU Emulator

      • load Xxx.hack into supplied CPUEmulator, then execute it

    • Assembler

      • use supplied Assembler to translate Xxx.asm; compare resulting code with yours
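The comparison in the last testing option can also be done with a few lines of code. This is a hypothetical helper, not part of the supplied tools: `first_mismatch` is a made-up name, and it assumes both outputs hold one 16-bit instruction per line.

```python
from itertools import zip_longest


def first_mismatch(generated, expected):
    """Return (line number, generated line, expected line) for the first
    difference, or None if the two outputs agree. A missing line is None."""
    gen = [line.strip() for line in generated if line.strip()]
    exp = [line.strip() for line in expected if line.strip()]
    for number, (g, e) in enumerate(zip_longest(gen, exp), start=1):
        if g != e:
            return number, g, e
    return None


mine = ["0000000000000010\n", "1110110000010000\n"]
reference = ["0000000000000010\n", "1110110000010001\n"]
print(first_mismatch(mine, reference))
# (2, '1110110000010000', '1110110000010001')
```

Because the function accepts any iterable of lines, open file objects for the two Xxx.hack files can be passed in directly.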
