How variables translate to machine code

A basic introduction to assembly programming

Updated 21. January 2022

Contents:

Intro
CPU and RAM
Processing workflow
Practical example

Most programmers have a solid understanding of how variables function in their program code, but do you know how the CPU and memory actually work when you update a variable?

In this article, you will get a sneak peek into the language of the old gods of computing. I will give a low-level explanation of how variable declarations and variable updates work on the bare metal CPU and RAM.

CPU and RAM

Processing and memory

To fully understand what I'm about to explain, you need a solid understanding of what a CPU and RAM actually are, and how they operate.

The CPU (Central Processing Unit) is a computer chip that does all the heavy lifting when it comes to doing things on the computer. What the CPU does is determined by a program, which is fed into the CPU one instruction at a time. These instructions are called machine code.

The CPU can access a lot of input and output devices (IO devices) to read information from and send information to. One such device is the RAM.

RAM (Random Access Memory) is one or more computer chips that hold data the CPU can read and write. The RAM chip stores data in physical locations as bits in the form of an electric charge.

To read from or write to a specific memory location, the CPU uses something called the address bus to select which location it wants to access, and a signal to indicate whether it wants to read data from that address or write data to it.
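The read/write mechanism described above can be pictured with a small JavaScript sketch. Everything in it is an illustrative assumption rather than a model of any real chip: the `ram` array, the `busAccess` function, and the 64 KiB memory size are names and values made up for this example.

```javascript
// Hypothetical model: RAM as an array of bytes, one byte per address.
const ram = new Uint8Array(65536); // assumed 64 KiB, i.e. a 16-bit address bus

// One bus access: the address selects the location,
// the readWrite signal says which direction the data flows.
function busAccess(address, readWrite, data = 0) {
  if (readWrite === "write") {
    ram[address] = data & 0xff; // each location holds a single byte
    return data & 0xff;
  }
  return ram[address]; // "read": hand the stored byte back to the CPU
}

busAccess(0x00a1, "write", 10);
console.log(busAccess(0x00a1, "read")); // prints 10
```

The point of the sketch is only that the address picks the location, while a separate signal picks the direction.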

On the CPU there are several memory locations that are called registers. These can store data that is to be processed by the processor.

Processing workflow

From memory address and back again

To process data, the CPU is instructed to read data from a specific RAM address and store it into a register. Once stored in a register the data is ready to be processed.

To perform a simple calculation like addition, the CPU needs to store two values in two registers. Let's call them Register X and Y. The output from that calculation might be stored in a third register, Register Z.

Now the processor can write the data that is stored in Register Z to an address in the RAM chip.
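The load, calculate, and store steps above can be sketched in JavaScript. This is a simplified picture under stated assumptions: the `registers` object, the `memory` array, and the addresses `0x10` to `0x12` are hypothetical, and the X, Y and Z names follow the generic example in the text. On a real 6502, additions go through a register called the accumulator instead.

```javascript
// Hypothetical machine state: three registers and a small RAM.
const registers = { X: 0, Y: 0, Z: 0 };
const memory = new Uint8Array(256);

memory[0x10] = 7; // first operand, placed in RAM beforehand
memory[0x11] = 5; // second operand

registers.X = memory[0x10];                       // 1. load first operand into X
registers.Y = memory[0x11];                       // 2. load second operand into Y
registers.Z = (registers.X + registers.Y) & 0xff; // 3. add, keeping 8 bits
memory[0x12] = registers.Z;                       // 4. write the result back to RAM

console.log(memory[0x12]); // prints 12
```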

Practical example

Assembly code for a simple variable update

Ok, so now we have a fundamental understanding of how the CPU does some very basic stuff. What does this look like in machine code?

Machine code is very hard for humans to read because it is just binary numbers, but assembly language is machine code written in a way that humans are able to read. Please keep in mind that assembly instructions are readable, but reading them still requires training (and a pocket dictionary).

Assembly language is essentially a way of translating binary machine language to something that is possible for humans to read. Because of this, I think of assembly as the language of the old gods of computing. It is the first level of abstraction placed above the binary digits that the processor actually speaks.

In this example we will replicate the following JavaScript code:

script.js

let variableA = 10; // declare the variable and give it a value
variableA++;        // increase the value by one

How complicated can that be for a CPU? The CPU needs to do this in several steps. As an example, I will be using the 6502 CPU and its assembly language, which (in several variants) powered classic machines like the Apple II, the Atari 2600, and the Nintendo Entertainment System. I chose this architecture because its instruction set has a pretty simple syntax that does not scare everyone away.

6502_source.asm

LDX #10      ; Load X register with the number 10
STX $A1      ; Store value in X register to RAM address A1

LDX $A1      ; Load X register with the value stored in RAM address A1
INX          ; Increase value stored in X register by 1
STX $A1      ; Store value in X register to RAM address A1

As you can see from the file above, creating a simple variable that gets stored in RAM takes two lines of assembly code, and increasing that variable by one takes three more.

Each line of assembly code translates to a single machine-code instruction, which on the 6502 is one to three bytes long: an opcode, followed by an operand where one is needed. Each instruction takes several CPU cycles to execute, because the processor has to perform several internal actions, such as fetching the opcode and operand and accessing memory.
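To make the step from assembly to machine code concrete, here is the program from 6502_source.asm assembled by hand into bytes, together with a tiny JavaScript interpreter that runs them. Treat it as a sketch, not an emulator: the `run` function is a made-up helper that only knows the four opcodes used here, it ignores the processor status flags, and the opcode and cycle values are taken from the standard 6502 instruction table.

```javascript
// The program from 6502_source.asm, hand-assembled into bytes.
const program = [
  0xa2, 0x0a, // LDX #10  (opcode + immediate value)
  0x86, 0xa1, // STX $A1  (opcode + zero-page address)
  0xa6, 0xa1, // LDX $A1
  0xe8,       // INX      (opcode only, no operand)
  0x86, 0xa1, // STX $A1
];

// Minimal interpreter for only these four opcodes (status flags ignored).
function run(bytes) {
  const ram = new Uint8Array(256); // just the first 256 bytes of RAM
  let x = 0;      // the X register
  let pc = 0;     // program counter: index of the next byte to fetch
  let cycles = 0; // running total of CPU cycles
  while (pc < bytes.length) {
    const opcode = bytes[pc++];
    switch (opcode) {
      case 0xa2: x = bytes[pc++]; cycles += 2; break;      // LDX immediate
      case 0xa6: x = ram[bytes[pc++]]; cycles += 3; break; // LDX zero page
      case 0x86: ram[bytes[pc++]] = x; cycles += 3; break; // STX zero page
      case 0xe8: x = (x + 1) & 0xff; cycles += 2; break;   // INX
      default: throw new Error(`Unsupported opcode ${opcode.toString(16)}`);
    }
  }
  return { x, cycles, variableA: ram[0xa1] };
}

console.log(run(program)); // { x: 11, cycles: 13, variableA: 11 }
```

Under these assumptions, the five lines of assembly become 9 bytes of machine code and take 13 CPU cycles to run.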

For old and relatively slow processors like the 6502, it was important to keep track of how many cycles each instruction would take, to prevent the processor from getting out of sync with the television beam drawing the graphics on the screen.