The Art of Assembly Language Programming (1996)

pan69 · on March 6, 2022

This version of the book is good! Great for beginners and experts alike. However, in later versions of this book (one of which I have on my bookshelf), the author uses High Level Assembler (HLA), which is, according to the back cover of the book, "a revolutionary tool...". It is basically a proprietary compiler that uses a syntax that is "interesting" but not portable to the rest of the world of x86 assembler.

So, if you buy the book (like I did), please be aware of this.

userbinator · on March 6, 2022

I agree. HLA syntax can best be described as "if someone who knows only C decided to make an Asm syntax". I remember many flamewars on newsgroups about it.

GNU GAS syntax is roughly as repugnant, but became popular only because of GNU/Linux. Deviating from the official docs has caused much divisiveness and confusion.

nyanpasu64 · on March 6, 2022

As someone coming from a C++/Rust background and dabbling in retro game assembly, the biggest roadblocks I've faced trying to understand hand-written SPC700 assembly are flags, memorizing/looking up instructions, branches and jumps being "backwards" from C++ in x86-64 and a total wildcard in SPC700, unstructured control flow, and integer literals being dereferences rather than immediates by default. I haven't tried using HLA yet, but I get the impression its syntax would greatly assist with unstructured control flow (and possibly distinguishing between pointers/offsets and immediates). The manual at https://www.plantation-productions.com/Webster/HighLevelAsm/... is relatable:

> For example, one might test the carry flag after an addition to determine if an unsigned overflow has occurred using code like the following:

  add eax, 5
  jnc NoOverflow
  << code to execute if overflow occurs >>
  NoOverflow:

> Although this code is straightforward, you would be surprised how many students cannot visualize this code. On the other hand, if you feed them some pseudo code like:

  add eax, 5
  if( the carry flag is set ) then
  << code to execute if overflow occurs >>
  endif

> Those same students won't have any problems understanding this code.

I too find reading assembly a hindrance to understanding the intent and behavior of the code, and I mentally map assembly to high-level behaviors similar to structured programming. If HLA is more accessible and productive for regular programmers as a tool to read/write reasonably-efficient code on architectures without good C compilers, I don't consider that repugnant but empowering. (Admittedly it's more disconnected from the output binary instructions than regular assembly.)

(Sidenote: are assemblers designed around simplicity of parsing and speed at processing compiler-produced assembly, or user experience for humans writing assembly, or is the latter more common on architectures without good C compilers?)
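As a rough illustration of the carry-flag check quoted above (a sketch, not taken from the HLA manual), the same logic can be modeled in Python. The function name `add32_with_carry` is hypothetical, and a 32-bit register width is assumed to match `eax`:

```python
MASK32 = 0xFFFFFFFF  # assumed register width: 32 bits, like eax


def add32_with_carry(a, b):
    """Model an unsigned 32-bit add and the carry flag it would set."""
    total = a + b
    carry = total > MASK32          # carry set means unsigned overflow
    return total & MASK32, carry    # the register keeps only the low 32 bits


result, carry = add32_with_carry(0xFFFFFFFE, 5)
if carry:
    # this is the "code to execute if overflow occurs" branch
    print("overflow, wrapped result =", result)
```

Here `jnc NoOverflow` corresponds to skipping the `if carry:` body when the flag is clear.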

akira2501 · on March 7, 2022

I added brace to branch conversion in an MSP430 assembler I wrote. You could write code like:

    add eax, 5
    jnc {
        nop
    }

And it would automatically create the labels for you. Syntactically, it was equivalent to what you've written above, but with labels like ".auto_brace_X".

You could also do the typical "if-else" branching with slightly extended syntax like:

    add eax, 5
    jnc {
      <carry is set>
    }:jmp {
      <carry is not set>
    }

    <continue>

Which allowed easy extension to "do-while" loops:

    {
        <some loop>
        dec ecx
    }:jnz

The trickiest part is remembering that all branches are "unless" branches instead of "if" branches, and it obviously only works for simple constructs and can make refactoring and optimization a pain; however, there was plenty of code that benefited from using this.
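A minimal sketch of how such a brace-to-label rewrite might work, in Python. This is not the original MSP430 assembler's code: the function name `expand_braces` is hypothetical, only the simple `<branch> {` ... `}` form is handled (not `}:jmp {` or `}:jnz`), and comments containing braces are not considered. Only the `.auto_brace_X` label naming comes from the description above.

```python
def expand_braces(lines):
    """Rewrite `<branch> {` ... `}` blocks into explicit labels."""
    out = []
    stack = []    # labels of the brace blocks that are currently open
    counter = 0   # used to generate unique label names
    for line in lines:
        text = line.strip()
        if text.endswith("{") and text != "{":
            # "jnc {" becomes "jnc .auto_brace_N": skip the block when taken
            mnemonic = text[:-1].strip()
            label = f".auto_brace_{counter}"
            counter += 1
            stack.append(label)
            out.append(f"    {mnemonic} {label}")
        elif text == "}":
            if not stack:
                raise ValueError("closing brace without an opening branch")
            out.append(f"{stack.pop()}:")  # the branch target lands here
        else:
            out.append(f"    {text}" if text else "")
    if stack:
        raise ValueError("unclosed brace block")
    return out


source = ["add eax, 5", "jnc {", "nop", "}"]
print("\n".join(expand_braces(source)))
```

The stack is what makes nesting work: each `}` closes the most recently opened block.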

pwdisswordfish9 · on March 12, 2022

Amazing that MSP430 has the same mnemonics as x86.

akira2501 · on March 14, 2022

lol.. was just trying to be familiar. here's some old code I dug up that used it:

        ; find next task in circular list Next(6 cycles) Top(14 cycles)
        {
                add.w #10, r10                  ;2 increment pointer to next task area
                cmp.w #mms_task.end, r10        ;2 are we at the end of the task list?
                jnz {                           ;2 no?  start next task
    .init:              eint                             ;1 ensure interrupts are enabled
                        bis.w #MMS_EACH, &mms_task.state ;5 ensure EACH state flag is set
                        mov.w #mms_task.list, r10        ;2 restart at top of list
                }
        
                ; build state and check against task Skip(12 cycles) Take(31 cycles)
                mov.b &IFG2, r8                 ;3 r8  = ifg2 interrupt flags
                and.w #(MMS_TX+MMS_RX), r8      ;1 get only TX and RX ready flags
        
                mov.w &mms_task.state, r9       ;3 r9  = current task state
                bis.w r8, r9                    ;1 add TX and RX flags to global state
                
                ; determine if this task needs to run
                bit.w @r10, r9                  ;2 task request flags & system flags
        }:jz                                    ;2 no matches?  find next task

        ; load the task state and return into it
        mov.w r10, &mms_task.ptr        ;4 set current task area to r10
        mov.w @r10+, r12                ;2 r12 = task[0] (request flags)
        mov.w @r10+, r8                 ;2 r8  = task[1] (PC)
        mov.w @r10+, r4                 ;2 r4  = task[2]
        mov.w @r10+, r5                 ;2 r5  = task[3]
        mov.b @r10+, r6                 ;2 r6  = task[4] (low byte)
        mov.b @r10,  r7                 ;2 r7  = task[4] (high byte)
        
        mov.w #0, r10                   ;1 clear r10 pointer for safety
        mov.w r8, PC                    ;2 "return" into the yield

WantonQuantum · on March 7, 2022

Assembly language is usually an almost one-to-one correspondence with machine code instructions of the target architecture. Assemblers can build on that by providing macros, etc but it's fundamentally about the programmer being aware of the instructions the CPU supports and using them to program.

I'm having trouble understanding the difficulty in the example. If "jnc" doesn't suit then there's "jc" to do the opposite.

If the difficulty is the conceptual leap to unstructured branching then I'd say that's what the teacher's job is - teach the students how to think about it.

chillfox · on March 7, 2022

Cryptic names like "jnc" and "jc" are a huge hindrance to understanding it. It would be so much easier if each instruction were just named for what it does instead of an abbreviation.

"jump_if_condition_is_met" or "jump_short_if_not_carry" (or whichever one it maps to) is much better and clearer than "jnc".

shadowofneptune · on March 7, 2022

> are assemblers designed around simplicity of parsing and speed at processing compiler-produced assembly, or user experience for humans writing assembly, or is the latter more common on architectures without good C compilers?

That really depends on the assembler. Each one is different. Some support a degree of structured programming, while others do not even allow for relocatable addresses.

pjmlp · on March 6, 2022

TI used to have an Assembly for a DSP that was like C written in SSA form.

No idea what it was called.

Fully agree with GAS syntax being a pain.

mananaysiempre · on March 6, 2022

Are you sure it was TI? The Blackfin DSPs made by Analog Devices are known for using an infix assembly syntax in the official documentation, although it might be a stretch to tie it to C specifically.

pjmlp · on March 7, 2022

As far as I can remember, yes; it was an article in Dr. Dobb's Journal back in the day.

Basically imagine a C subset where variables are the registers, mov is = , expressions are Dest = Reg1 op Reg2 and so forth.

I can gladly be corrected if it was another vendor.

msla · on March 6, 2022

> GNU GAS syntax is roughly as repugnant, but became popular only because of GNU/Linux.

AT&T syntax predates GNU and x86, and spread just fine without Linux.

edgyquant · on March 7, 2022

Gas uses AT&T syntax but I believe that poster is referring to how they borrow certain things from C (and the preprocessor.)

pizza234 · on March 6, 2022

If you are referring to "The Art of 64-Bit Assembly", while this is correct, there are two things to consider:

1. the real meat of ASM doesn't really depend on the Assembler, so one can code (and usually ends up coding) the exercises with the Assembler of their choice

2. people have freely contributed the translation of the listings/programs to all the major Assemblers, so even if one doesn't want to code their own version, they can just use the contributed versions

Certainly, there are some chapters expressly dedicated to MASM functionalities, so those are wasted.

As a matter of fact, I worked through it on Linux/NASM, and wasn't really bothered, as the book is really good quality.

pjmlp · on March 7, 2022

> Certainly, there are some chapters expressly dedicated to MASM functionalities, so those are wasted.

Not for the legions of developers doing Windows and XBox games.

pizza234 · on March 7, 2022

Right! I was referring to those who program on non-Windows platforms.

gtirloni · on March 7, 2022

I read the first edition and loved it. Then I read the HLA edition and wish I had never done it.

vbezhenar · on March 6, 2022

I'm reading the RP2040 Assembly Language Programming book right now. It's a very interesting introduction to assembly language. And using assembly for microcontrollers seems very fitting to me, at least at the hobbyist level.

repiret · on March 7, 2022

As microcontrollers go, Cortex-M is especially pleasant to program in C.

ghaff · on March 6, 2022

Another interesting book--at least for historical context--is Michael Abrash's Zen of Assembly Language Programming. A lot of the optimizations are a good read if you're into that sort of thing but there's not a lot of practical interest for today. (Abrash in fact never wrote the Volume 2 he was planning to.)

pan69 · on March 6, 2022

Which is also hosted on Phatcode:

http://www.phatcode.net/res/224/files/html/index.html

imiric · on March 6, 2022

Would this still be relevant/approachable for a beginner coming from higher level languages, or is there a more modern resource for learning assembly that would be better suited?

I guess most of it is still relevant, but I'd rather not deal with DOS to follow along, and would prefer working on Linux.

Also, the x86 instruction set seems daunting to pick up for a beginner. Would it be better learning on a 4/8-bit or toy machine first?

pedrolins · on March 6, 2022

I’m not an assembly programmer, but I’ve learned assembly as part of introductory CS courses (computer architecture classes) and the approachable alternative you’re looking for is the assembly language of a RISC architecture such as ARM, MIPS, or RISC-V. I’d recommend learning the latter because of how approachable it is.

Gene_Parmesan · on March 6, 2022

I found the book "Programming From the Ground Up" to be a pretty good intro to programming in x86, and importantly, more recent than this text. I had to do just a tiny bit of research to find out how to get my shell to emulate 32 bit mode, since it's x86 and not x86-64, but it is in fact Linux based. I imagine it'd be an order of magnitude easier than trying to figure out a DOS setup.

The full x86 ISA is daunting, I'm sure, but for a beginner looking to get a sense of what asm is, you don't engage with the entirety of the ISA. The PFTGU text is great specifically because it assumes an audience of beginners who want to learn the basics of how asm operates to enable core programming concepts. It's not aiming to be exhaustive like the linked text.

Edit: And also importantly, it's also (legitimately) available for free online.

Shosty123 · on March 6, 2022

There's a new version of that book. Same author:

https://link.springer.com/book/10.1007/978-1-4842-7437-8

spc476 · on March 6, 2022

To get a feel for assembly, then yes, any 8-bit CPU would be a good choice. The 8080, Z80, 6502 or 6809. There are plenty of emulated machines today, tons of material are available online, and even cross assemblers that run on modern machines to make development easier. The base concepts will carry over.

The x86 instruction set is daunting and it can appear quite arbitrary unless one knows the near 50 years of historical baggage it carries along with it. But I think if you have some experience with an 8-bit CPU, plus just kind of accept the weirdness of it, it's not that bad (especially the 64-bit stuff, which in a way, is simpler, even if half the registers are named, and half are numbered).

xupybd · on March 6, 2022

The 8051 microcontroller is pretty easy to learn. I'd recommend that as a starter on assembly. But if you want to end up playing with x86, why not start there?

userbinator · on March 6, 2022

If you want to start small, Z80 might be a good point, but it seems opposed to your not wanting to start with DOS; a small VM will be very useful, and with Asm you'll soon discover that even 64k is plenty to play with.

msla · on March 6, 2022

> Also, the x86 instruction set seems daunting to pick up for a beginner. Would it be better learning on a 4/8-bit or toy machine first?

The x86 instruction set isn't daunting, the 16-bit x86 memory model is, speaking as someone who learned x86 assembly at a young age. It is easier to work on Linux, because segmentation is a non-issue and you have a flat address space.

userbinator · on March 7, 2022

Segmentation is also a non-issue if you stay within a single segment. Especially if you're writing in Asm and just starting out, even 64k is a lot. A "hello world" has less than a dozen bytes of executable code. The majority of the standard commands in DOS are <64k binaries.

spc476 · on March 7, 2022

It's also not much of an issue if you split code and data, each being 64K in size.

shadowofneptune · on March 7, 2022

Is segmentation really that bad? With an assembler you have direct control of which segment is being used. It's unlikely a hobby project will span more than 64k. Even if it does, for a lot of projects the individual data structures are small enough to split between segments.

spc476 · on March 7, 2022

It only becomes an issue if you have a data structure that can exceed 64K in size, or multiple structures whose aggregate size is greater than 64K. Then you have to deal with so-called FAR pointers.
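For reference, the 64K limit comes from how real-mode addresses are formed: a 16-bit segment is shifted left by four bits and added to a 16-bit offset. A small Python sketch of that arithmetic follows; the function name `linear_address` is made up for illustration, and the 20-bit wraparound assumes an original 8086-style address bus:

```python
def linear_address(segment, offset):
    """Compute the real-mode address for a segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # segment * 16 + offset, truncated to 20 address bits (8086 assumption)
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs can name the same byte.
print(hex(linear_address(0x1234, 0x0010)))
print(hex(linear_address(0x1235, 0x0000)))
```

Because the offset is only 16 bits wide, a single segment value reaches at most 64K bytes; anything larger needs the segment register changed as well, which is what a FAR pointer carries.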

djmips · on March 6, 2022

People interested in this might also enjoy 'Inner Loops' (1997), an excellent book that covers optimizing x86 at that time but is instructive in general and the concepts are still applicable today.

https://www.amazon.com/Inner-Loops-Sourcebook-Software-Devel...

phendrenad2 · on March 6, 2022

What's a good modern book for modern assembly languages?

justin66 · on March 7, 2022

https://asmirvine.com/

shadowofneptune · on March 6, 2022

A very useful book for MS-DOS programming. MASM is also an excellent assembler, even in early versions. The proc/endp blocks for instance make reading a subroutine much easier. Does anyone know of a similar book for AMD64 assembly language programming?

ranger207 · on March 6, 2022

The author (Randall Hyde) just released an x64 assembly book in October: https://nostarch.com/art-64-bit-assembly

shadowofneptune · on March 6, 2022

Thanks, this looks great!

bch · on March 6, 2022

Related(?) I'm looking for recommendations for a good intro/reference to the intel arch wrt operations and registers, esp wrt Unix conventions (if unix differs from (eg) Windows wrt creating a process or pushing/popping stack...)

canMarsHaveLife · on March 6, 2022

Can Assembly be considered a programming language? Isn't it just a language for... assembling?

jazzyjackson · on March 6, 2022

One might say it is the only programming language, everything else is just macro expansion, converting pseudo-English mnemonics into machine instructions.

Snark aside, yes, people program entire operating systems and programs and games in assembly. Famously, RollerCoaster Tycoon was written almost entirely in assembly.

shadowofneptune · on March 6, 2022

This is true only for the most simple assemblers. In practice, each assembler comes with its own distinct set of features.

Microsoft Macro Assembler for example offers:

• Records/structs, bitfields

• Typed labels/pointers. Intel in particular introduced this feature.

• Procedure blocks which allow locally scoped labels within them

• The automatic allocation of local variables onto the stack.

• if/then blocks, for loop blocks

• Memory model directives

• Macros, equates, etc.

Other assemblers like NASM omit some of these features, as introducing versions of these features which are completely compatible with the MASM ones would be difficult. For example, NASM allows MASM-like procedure blocks, but they are not block-scoped. They're just a notation for the programmer.

mdp2021 · on March 6, 2022

Yes, certainly Assembly is a programming language. If you could not express your program in it, all of our discipline would fall (if a script defined a process and what it translates to did not...). "Things work" because /other/ languages translate into it: that is the chief language, named ("symbolic") reading of machine code. It has conditionals and loops, it has logic and arithmetic operators, it has "exactly" what the processor has: it is the most precise definition of the program, how can it not be a programming language?

You are probably thinking of the assembling process of pieces of machine code - through the assembler. But the assembler also translates symbolics into machine code, in the goal to join the different pieces of code and data - so it's more than just concatenating. To define the memory addresses to be encoded for jumps, for example, you must have defined them - so, in the assembler (e.g. concatenating subroutines) you imply the assembly (i.e. naming pieces of machine code, symbolically treated).
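The label-resolution step described above can be sketched as a toy two-pass walk over the source. This is a simplification under loud assumptions: the function name `resolve_labels` is hypothetical, every instruction is assumed to occupy exactly one address unit (real x86 instructions vary in length), and operands are left as text rather than encoded into machine code:

```python
def resolve_labels(lines):
    """Toy two-pass resolver: collect label addresses, then substitute them."""
    symbols = {}
    instructions = []
    # Pass 1: record the address of each label (assumed: 1 unit per instruction)
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            symbols[text[:-1]] = len(instructions)
        else:
            instructions.append(text)
    # Pass 2: replace an operand that is exactly a known label with its address
    resolved = []
    for text in instructions:
        mnemonic, _, operand = text.partition(" ")
        operand = operand.strip()
        if operand in symbols:
            text = f"{mnemonic} {symbols[operand]}"
        resolved.append(text)
    return symbols, resolved


program = ["add eax, 5", "jnc NoOverflow", "nop", "NoOverflow:", "ret"]
symbols, resolved = resolve_labels(program)
print(symbols)
print(resolved)
```

Two passes are needed because a jump may refer to a label that is only defined later in the source, as `NoOverflow` is here.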

pavlov · on March 6, 2022

Most assemblers have macros and other directives that give you some level of higher-order control than just writing machine code.

convolvatron · on March 7, 2022

assuming you have enough registers, burn a small number of them to support internal use in macros. assign the rest to values of interest. you may end up needing to write an additional layer of macros with their own registers that use the first layer as primitives.

now you're essentially programming in basic

ghaff · on March 6, 2022

Sure. It's less abstracted than anything except for machine code but that doesn't make it not a language.

krallja · on March 6, 2022

Why does it call the processors the 886, 8286, 8386?

djur · on March 6, 2022

They're simplified hypothetical variants of the x86 processors, according to one chapter.

krallja · on March 6, 2022

Ah, I see it in Section 3.3: http://www.phatcode.net/res/223/files/html/Chapter_3/CH03-3....

galangalalgol · on March 6, 2022

That is a good question. Maybe they had only heard it spoken, so they thought it was 83 86 instead of 8086, 80386, etc.

shadowofneptune · on March 6, 2022

No, the book introduces these serial numbers as idealized versions of the processors. It allows the reader, who is treated as a beginner to processors, to develop a mental model of them without any complications. The edge cases, pitfalls, and issues of the true processors are revealed over the course of the book.

Not sure about the merits of calling the ideal processors by special names.

agumonkey · on March 6, 2022

In France, 80386 was often read 80-3-86 (same for 80-2-86). And eighty SPC three === eighty three.

makapuf · on March 6, 2022

Never heard of it that way in France with my friends, for me it was always 80-86, 80-88 or just 386 or 286 (or 80-286 if you wanted to sound technical).

slim · on March 7, 2022

you heard 386 spoken "trois cent quatre-vingt six" ? I can confirm I always heard 386 spoken 3-86

renox · on March 7, 2022

FWIW I've always called it 386 (3 cent quatre-vingt-six)..

agumonkey · on March 8, 2022

seems like we found a new pain au chocolat flamewar :p

slim · on March 8, 2022

chocolatine

makapuf · on March 8, 2022

Pain au chocolat (dx2-66)

agumonkey · on March 6, 2022

well well well .. I think I really heard it that way. Not that it means a lot anyway.