Subject:Assembly Language

From Wikibooks, the open-content textbooks collection

The books in this subject area deal specifically with Assembly Language topics.

About Assembly Language


The term assembly language refers to the lowest-level human-readable method for programming a particular computer. An assembly language is a symbolic representation of the numeric machine codes and other constants needed to program a particular CPU architecture. This representation is usually defined by the hardware manufacturer and is based on abbreviations (called mnemonics) that help the programmer remember individual instructions, registers, and so on. Assembly language generally has a one-to-one correspondence with the underlying machine language, which is not human readable. An assembly language is therefore specific to a certain physical or virtual computer architecture, unlike most high-level languages, which are portable, and a different assembly language is needed for each type of computer. Some compilers, such as GCC, convert high-level languages (such as C or Ada) into platform-dependent assembly language before assembling it into machine language.

Introduction to Assembly

Assembly language is the lowest-level human-readable way to program a computer. The term "assembly language" is a bit of a misnomer because there is no single standardized assembly language. Each processor family has its own, different families rarely share one, and even members of the same family can differ in the instructions they support. In this sense, assembly is practically a different language for every processor: a program written for an Intel processor will not work on a Motorola processor, and so on.

Assembly language typically has a one-to-one correspondence to the underlying machine code. This means that each assembly instruction maps to one machine code instruction, and each machine code instruction maps back to a single assembly instruction. The symbolic names of the instructions are called mnemonics. Instructions tell the computer to perform simple tasks, such as adding two numbers, loading a piece of information from memory, storing it to memory, or jumping to another instruction. The combination of these simple tasks creates the complicated behavior exhibited by programs. Programming in assembly is much more tedious and error-prone than coding in a higher-level language, but there are times when it is beneficial or necessary.
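
The one-to-one mapping can be illustrated with a minimal sketch in Python. The mnemonics and opcode numbers below are invented for a hypothetical processor and do not correspond to any real instruction set.

```python
# Hypothetical opcode table: each mnemonic maps to exactly one numeric code.
OPCODES = {"LOAD": 0x01, "STORE": 0x02, "ADD": 0x03, "JUMP": 0x04}

# Because the mapping is one-to-one, it can be inverted without ambiguity.
MNEMONICS = {code: name for name, code in OPCODES.items()}


def assemble(lines):
    """Translate lines such as 'ADD 11' into (opcode, operand) pairs."""
    program = []
    for line in lines:
        mnemonic, operand = line.split()
        program.append((OPCODES[mnemonic], int(operand)))
    return program


def disassemble(program):
    """Translate (opcode, operand) pairs back into assembly text."""
    return [f"{MNEMONICS[code]} {operand}" for code, operand in program]


source = ["LOAD 10", "ADD 11", "STORE 12", "JUMP 0"]
machine_code = assemble(source)
print(machine_code)
print(disassemble(machine_code))
```

Disassembling the output of `assemble` gives back the original source lines, which is what a one-to-one correspondence implies. Real assemblers also handle labels, comments, and multiple operand formats, none of which are modelled here.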

Because assembly language programming is tedious and error-prone, most programs are written in high-level languages, such as C. A compiler translates such code to assembly language, and an assembler then converts it into binary machine language that the processor can execute.

Unfortunately, assembly language cannot be presented to you as a single table of commands with a paragraph explaining how those commands are written. The available commands, their parameters, and their internal representation all change from one processor to the next. In addition, different companies occasionally write different assemblers, each with its own assembly syntax, for the same processor.

Microprocessors differ widely in how they address memory, communicate with peripherals, and handle other details. These differences mean that assembly language is generally not portable.
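
As a rough illustration of why machine-level code does not carry over between processors, the Python sketch below lists the encoding of the "no operation" instruction on a few well-known processors. The byte values are given as they are commonly documented; treat them as an illustration and check the manufacturer's manual before relying on them.

```python
# Machine-code encoding of the "do nothing" (NOP) instruction on several
# processors, as commonly documented. Illustrative only.
NOP_ENCODINGS = {
    "x86": bytes([0x90]),
    "MOS 6502": bytes([0xEA]),
    "Motorola 68000": bytes([0x4E, 0x71]),
    "MIPS32": bytes([0x00, 0x00, 0x00, 0x00]),
}

for cpu, encoding in NOP_ENCODINGS.items():
    print(f"{cpu:15} {len(encoding)} byte(s): {encoding.hex()}")

# True when no two processors in the table share an encoding.
all_distinct = len(set(NOP_ENCODINGS.values())) == len(NOP_ENCODINGS)
print(all_distinct)
```

Even for the simplest possible instruction, both the bit pattern and the instruction length differ, so bytes assembled for one processor are meaningless, or mean something else, on another.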

Learning an Assembly Language

Before learning an assembly language, you will have to choose a specific microprocessor architecture to write for. It can be useful to learn a simpler assembly language first and then transfer those skills to your chosen microprocessor.

The MIPS Assembly and SPARC Assembly languages are helpful for learning assembly language programming skills, while x86 Assembly is by far the most widely used assembly language on PC processors.

Some consider Randall Hyde's High Level Assembly (HLA) a good tool for learning assembly language if you are already skilled in a high-level programming language, such as C, Java, Python, Perl or C++.

Because processors differ, and their assembly languages differ with them, it is impossible to recommend a single assembly language that all students new to assembly should learn first.

Computer Architectures

A detailed knowledge of how computers work isn't necessary to begin learning assembly. However, it is important to understand how the processor works and how it deals with memory. The processor, together with its relationship to memory and the rest of the computer, is often known as an architecture.

Processor: The CPU, or central processing unit. This is the part of the computer that actually computes things. It takes in a series of bits that form an instruction, and performs that instruction.

Memory: This is the RAM in your computer, a temporary storage area for things like program data, the operating system, and the program stack.

Input/Output devices: These are peripherals. Your monitor, keyboard, mouse, USB devices and hard drive are all examples of input or output (I/O) devices.
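
To make the roles of processor and memory concrete, here is a minimal Python sketch of a processor loop that fetches a series of bits from memory, decodes it as an instruction, and performs it. The 16-bit instruction format, the opcodes, and the single accumulator register are all assumptions made up for this example, not features of any real architecture.

```python
# Hypothetical 16-bit instruction format: high byte = opcode, low byte = address.
LOAD, ADD, STORE, HALT = 0x1, 0x2, 0x3, 0xF


def run(memory):
    """Fetch, decode, and execute instructions until HALT is reached."""
    pc = 0    # program counter: address of the next instruction
    acc = 0   # accumulator: the single register of this toy processor
    while True:
        word = memory[pc]                         # fetch
        opcode, address = word >> 8, word & 0xFF  # decode
        pc += 1
        if opcode == LOAD:                        # execute
            acc = memory[address]
        elif opcode == ADD:
            acc += memory[address]
        elif opcode == STORE:
            memory[address] = acc
        elif opcode == HALT:
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")


# Program and data share one memory: cells 0-3 hold code, cells 4-6 hold data.
memory = [0x0104, 0x0205, 0x0306, 0x0F00, 20, 22, 0]
run(memory)
print(memory[6])  # 42
```

The program loads the value in cell 4, adds the value in cell 5, and stores the sum in cell 6. Input/output devices are left out of the sketch entirely.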

Registers

Registers are small, fast memory units that reside inside the processor itself. Every microprocessor architecture contains registers, which vary in size, number, and purpose. For instance, the MIPS and SPARC architectures each make 32 general-purpose registers available to a program, and they also contain a number of special hardware registers that the programmer cannot use directly. In general, registers are the length of the machine word: on a 32-bit machine, for example, registers are 32 bits wide.
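
One practical consequence of a fixed register length is that a result too large to fit wraps around. The Python sketch below imitates a 32-bit register by masking every result to 32 bits; the helper name `add32` is made up for this example, and many real processors also record the overflow in a status flag, which is not modelled here.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1   # largest value a 32-bit register can hold


def add32(a, b):
    """Add two values as a 32-bit register would, discarding any overflow."""
    return (a + b) & WORD_MASK


print(hex(WORD_MASK))        # 0xffffffff
print(add32(40, 2))          # 42
print(add32(WORD_MASK, 1))   # 0, because the result wraps around
```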

Programming Assembly

Computers, despite what some people may think, are not intelligent at all. Computers operate by acting on specific instructions, stored sequentially in memory, and can perform no actions other than those specifically listed. Computer instructions are also highly atomic: each one performs a single, small operation. For example, the following pseudo-code:

x = y + 2 * z

must be broken down into several small steps, in order for the computer to perform the operation:

  1. read z from memory into register r1.
  2. multiply register r1 by two.
  3. read y from memory into register r2.
  4. add r1 and r2. Store result in r3.
  5. write register r3 into memory at location x.

A common adage about assembly language programming is that programs must be written in "baby steps". The computer will make no assumptions, and will only act on the information that it is expressly provided.
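
The five steps listed above can be traced in a short Python sketch. Memory and registers are modelled as dictionaries, and the starting values of y and z are arbitrary choices for this example.

```python
# Memory holds the variables; the "addresses" are just dictionary keys here.
memory = {"x": 0, "y": 5, "z": 10}
registers = {"r1": 0, "r2": 0, "r3": 0}

registers["r1"] = memory["z"]                         # 1. read z into r1
registers["r1"] = registers["r1"] * 2                 # 2. multiply r1 by two
registers["r2"] = memory["y"]                         # 3. read y into r2
registers["r3"] = registers["r1"] + registers["r2"]   # 4. add r1 and r2 into r3
memory["x"] = registers["r3"]                         # 5. write r3 to x

print(memory["x"])  # 25
```

Each line does exactly one thing, which mirrors the "baby steps" style: nothing is computed unless an instruction says so.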

Assembly vs High-Level Languages

Higher-level languages, such as Fortran, C, Ada and others, were designed expressly to avoid programming in assembly. Assembly language programming, as shown above, is tedious, and the large amount of code required for even simple tasks increases the probability of bugs. Therefore, people generally write code in higher-level languages and let compilers do the tedious work of writing out the assembly code and converting it to machine code. However, there are a number of occasions when programming in assembly is beneficial, and others when it is required:

  1. When writing operating system kernels.
  2. When writing bootloaders.
  3. When writing device drivers.
  4. When writing code that is speed sensitive.
  5. When writing code that takes advantage of specific hardware features.

Assembly has a number of advantages over high-level languages, and a good programmer will know when and how to exploit them. First, assembly language is close to the machine. High-level languages are written to be portable across many platforms, so high-level code cannot take advantage of every special feature found on each platform. Writing in a higher-level language can also introduce code bloat: the compiler may generate more code than a given task strictly needs. As a result, carefully optimized code written in pure assembly can run faster and be smaller than the code a compiler generates from a high-level language. However, assembly is slow to write and prone to bugs and errors.
