Assembly language

Photo by Jonas Svidras on Unsplash

Many programmers go through a phase of picking up new programming languages and realizing that most of the methods and concepts they learned in one language carry over to the next.

That is until they stumble upon assembly language.

What is so special about assembly language, and why can it be a good idea to dip your toes into the lowest-level programming language there is to learn?

In this article I will try to answer all those questions and hopefully encourage you to try it for yourself.

Low level

How low can you go?

Most other programming languages are relatively easy to learn because they abstract away the difficult and gritty details of making the processor do what we want it to do. Assembly language is the complete opposite.

Assembly language makes you literally control exactly what the CPU does. With each line of code, you give the CPU an exact instruction. This means that you must understand the basic functionality of computers: how does RAM really work, how is input read, how is output given, and how does a CPU actually operate?

A lot of the time you can get very far in programming without understanding how a computer works. But learning low-level programming in assembly will give you reference points for learning and truly understanding computers at a completely new level. Most people know that a computer only speaks in 0s and 1s, but very few have taken the time to learn what that actually looks like.

When programming in assembly you literally look at the contents of the memory in your RAM chip, and you specifically tell the processor what to do with those contents. This is a completely different way of programming compared to high-level programming, where you just focus on variables and on functions that manipulate their data.
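To make that idea a bit more concrete, here is a minimal sketch in Python. It is a toy model and not a real 6502 emulator: the 16-byte memory, the addresses, and the `dump` helper are all assumptions made up for illustration. It only shows the mindset of loading a byte from an address, changing it, and storing it back.

```python
# Toy model (an assumption for illustration, not a real emulator):
# RAM is a row of bytes, and the "CPU" has a single register called A.
ram = bytearray(16)          # 16 memory cells, each holding one byte (0-255)
ram[0x02] = 40               # put the value 40 at address $02
ram[0x03] = 2                # put the value 2 at address $03

a = ram[0x02]                # roughly LDA $02: load the byte at address $02 into A
a = (a + ram[0x03]) & 0xFF   # roughly ADC $03 with the carry clear, wrapped to 8 bits
ram[0x04] = a                # roughly STA $04: store A back into memory


def dump(memory):
    """Return the memory contents as hex bytes, similar to a memory monitor view."""
    return " ".join(f"{byte:02X}" for byte in memory)


print(dump(ram))
```

The printed line is the whole memory as hexadecimal bytes. The result of the addition lives at address $04, not in a named variable.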

To me, it is truly fascinating and charming to see and feel the connection between the microprocessor, the electronics, and the software. When learning about assembly programming you will feel a stronger connection between the electronics on the PCB of your motherboard and the operating system.

Assemblers

What does an assembler do?

Assembly language is a layer of abstraction on top of the binary 1s and 0s that make up machine code, also known as CPU instructions. In practice, that means you write human-readable mnemonics like "JMP $F1B2" (jump to memory address F1B2 to read the next instruction), and the assembler converts them into binary instructions that the CPU can execute.
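As a rough illustration of what the mnemonic hides, the Python sketch below encodes "JMP $F1B2" the way a 6502 assembler would. The function name `encode_jmp` is a hypothetical helper, and the sketch assumes the 6502's absolute-address form of JMP (opcode $4C), where the low byte of the address is stored before the high byte.

```python
JMP_ABSOLUTE = 0x4C  # 6502 opcode for JMP with a 16-bit absolute address


def encode_jmp(address):
    """Hypothetical helper: return the three bytes for JMP $address."""
    low = address & 0xFF            # low byte of the target address comes first
    high = (address >> 8) & 0xFF    # high byte of the target address comes second
    return bytes([JMP_ABSOLUTE, low, high])


machine_code = encode_jmp(0xF1B2)
hex_view = " ".join(f"{b:02X}" for b in machine_code)
bit_view = " ".join(f"{b:08b}" for b in machine_code)

print(hex_view)  # 4C B2 F1
print(bit_view)  # 01001100 10110010 11110001
```

The second printed line is what the 0s and 1s look like for this one instruction.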

The beauty of assembly language is that there is only a limited number of CPU instructions. Because of this, moving from one assembly language dialect to another is perhaps even easier than jumping from one high-level programming language to another.

The assembler first needs to know what type of processor it is supposed to assemble a program for, which is typically stated at the beginning of the source code or given by the assembler itself. It then uses a dictionary to convert the human-readable mnemonics into machine code. For the most part it is a one-to-one, line-by-line, direct translation.

The main catch is that each memory address holds a single byte (8 bits) on practically every common architecture, while a complete instruction often consists of two or more bytes. Take the 6502 instruction "ADC #12" (add 12, plus the carry flag, to the value stored in the A register). It occupies one memory address for the "ADC" opcode, followed by a memory address holding the number 12 that is to be added to the number stored in the register called A. So the assembler uses its translation dictionary to turn the assembly language into binary values that are put in the correct order, ready to be loaded into RAM.
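The dictionary idea can be sketched in a few lines of Python. This is a deliberately tiny illustration and not how any real assembler is implemented: it only knows three 6502 opcodes, only handles implied and immediate operands, and the load address $0600 and the names `OPCODES`, `assemble_line`, and `assemble` are assumptions chosen for the example.

```python
# Tiny illustrative subset of 6502 opcodes, keyed by (mnemonic, addressing mode).
OPCODES = {
    ("CLC", "implied"): 0x18,    # clear the carry flag
    ("LDA", "immediate"): 0xA9,  # load a constant into A
    ("ADC", "immediate"): 0x69,  # add a constant (plus carry) to A
}


def assemble_line(line):
    """Translate one line such as 'ADC #12' into its machine-code bytes."""
    parts = line.split()
    mnemonic = parts[0].upper()
    if len(parts) == 1:
        return bytes([OPCODES[(mnemonic, "implied")]])
    operand = parts[1]
    if not operand.startswith("#"):
        raise ValueError(f"only immediate operands are supported: {line!r}")
    text = operand[1:]
    # '#$0C' is hexadecimal, '#12' is decimal
    value = int(text[1:], 16) if text.startswith("$") else int(text)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"operand does not fit in one byte: {line!r}")
    return bytes([OPCODES[(mnemonic, "immediate")], value])


def assemble(source_lines, origin=0x0600):
    """Return a mapping of memory address -> byte, starting at an assumed origin."""
    program = bytearray()
    for line in source_lines:
        program += assemble_line(line)
    return {origin + offset: byte for offset, byte in enumerate(program)}


memory_image = assemble(["CLC", "LDA #30", "ADC #12"])
for address, byte in memory_image.items():
    print(f"${address:04X}: {byte:02X}")
```

Three lines of source become five bytes at five consecutive addresses: "ADC #12" alone takes up two of them, $69 for the opcode and $0C for the number 12.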

After a brief study of the 6502 CPU architecture and its assembly language, variants of which powered the Atari 2600, Nintendo Entertainment System, and Commodore 64, I tried my hand at some modern Windows x86_64 assembly. The similarities were striking. The difference was mainly the number of registers and the bigger and more complex instruction set available.

Where to begin?

Walk before you try to run

When learning anything new it's important to learn to walk before you try to run. Running before being able to walk might be possible, but there is a great chance that you will stumble and get frustrated a lot more than if you have the patience to become a good walker first. The same holds true for any skill, even programming in assembly.

The 8-bit 6502 is perhaps the simplest processor to learn assembly language on. It has a very limited instruction set and just three main registers (A, X, and Y) to manage. This limitation makes it a lot more approachable and easier to understand than modern x86 (32-bit) or x86_64 (64-bit) assembly language.

Believe me when I say that 8-bit programming on a 6502 CPU is not easy. But because there are a lot fewer moving parts, it is easier to get your head around than a more modern assembly language with a full modern instruction set and all the registers you have to manage.

Please remember that babies spend a long time crawling and being frustrated that they can't walk before they finally manage it. The same is true of assembly language programming. For me, it was hard and frustrating in the beginning. I had to do a lot of trial and error and read other people's code for inspiration before I got to the walking phase. And I honestly think I will stay in the walking phase for a long time, simply because I do very little assembly programming compared to other types of programming.

I have read in a lot of places that you should have some experience with C/C++ programming before starting to learn assembly language. I don't agree with that recommendation. What I do recommend is that you get a fundamental understanding of how a processor and memory actually work. That way the concepts of CPU instructions, memory addressing, and pointers (indirect addressing) become a lot less confusing. For this insight, I can highly recommend the book below.

But How Do It Know? - The Basic Principles of Computers for Everyone

J. Clark Scott - 222 pages

A book that takes you through the process of building a computer from scratch using logic gates. That way the reader ends up understanding how a CPU and RAM are actually built, and in doing so gains insight into exactly what machine code is and how assembly language can be used to tell a CPU what to do.

Wow, what an amazing book! This is perhaps the most important and best book I have ever read about computers in my entire life. The author manages to explain in an easy-to-understand way how I'm able to build my very own computer using (extremely many) logic gates.

Concepts like machine language and how assembly language is used became crystal clear to me. After reading this gem of a book I got a completely new appreciation and deeper understanding of hardware and software. Since the book helped me connect so many dots, I'm actually a bit frustrated that I did not read it earlier.

To me, this book is a must-read for anyone interested in computers and programming.

3 February 2022