hirvi74
a year ago
God, the title reminds me of when I took an x86 assembly class in college about a decade ago. Only 6 of the dumbest souls in the CS program dared to take the class that semester. The professor for the class was an ex-NASA computer engineer. Our tests used to be writing assembly by hand. We were graded for accuracy too. I swear, at that point in time, I could convert between Hex, Dec, Oct, and Binary almost without thinking. I made a D in the class with 40+ hours a week of studying. However, I learned more in that one class than in my entire degree.
Anyway, thank you, OP, for sharing this. I have been looking into picking up ARM as a way to crawl out of burnout from my career. It's been years since I even touched x86 with any seriousness. I will add this book to my list of resources.
alok-g
a year ago
Ever tried writing machine code (not assembly) by hand? I used to do that a few decades back for an 8-bit microprocessor. I am still looking for good resources on how to do that for a modern processor.
ReleaseCandidat
a year ago
You look up the opcodes for each assembler instruction in the ISA specifications, like https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-.... Information about the machine code format starts at "2.2 Base Instruction Formats". The actual opcodes you can find in chapter 9, "RV32/64G Instruction Set Listings". But be aware that some assembler instructions (called pseudo instructions) generate more than one machine instruction.
The linked document is an old one; current ones are at https://riscv.org/technical/specifications/
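To make the lookup concrete, here is a minimal sketch in Python of hand-encoding one RISC-V instruction, assuming the I-type layout from the base instruction formats (imm[11:0], rs1, funct3, rd, opcode). The helper names `encode_i_type` and `encode_addi` are made up for illustration.

```python
# Hand-encode a RISC-V I-type instruction.
# Field layout, high bits to low: imm[11:0] | rs1 | funct3 | rd | opcode

def encode_i_type(imm, rs1, funct3, rd, opcode):
    """Pack the I-type fields into one 32-bit instruction word."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

def encode_addi(rd, rs1, imm):
    """ADDI uses opcode 0b0010011 (OP-IMM) and funct3 0b000."""
    return encode_i_type(imm, rs1, 0b000, rd, 0b0010011)

word = encode_addi(rd=1, rs1=0, imm=5)  # addi x1, x0, 5
print(f"{word:08x}")                     # 00500093
```

The pseudo instruction `nop` is just `addi x0, x0, 0`, which this sketch encodes as 00000013.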
alok-g
a year ago
Awesome! Thanks!
zmodem
a year ago
Here's an example for x86: https://www.hanshq.net/ones-and-zeros.html
bitwize
a year ago
Oh man you gave me flashbacks. The TRS-80 Model II's OS, TRSDOS-II, had a built-in debugger that was little more than a monitor. You could step through instructions, examine and write to memory, set breakpoints to absolute memory locations, and that was it. I remember hand-assembling tiny Z80 programs in that thing and jumping into them, just to test my understanding of how machine code programs worked and how the computer executed them, and being super thrilled when I could get an A to appear somewhere on the screen or something.
The machine had a much more complete assembly language programming toolkit which I also used to write more sophisticated programs, employing this debugger to examine them. But I felt like I'd "cracked the code" of the computer when I plugged hex numbers into RAM and then ran them straight from there.
Most CPU ISA documentation should give you the opcodes that correspond to instruction mnemonics. You may have to plug in your own operands (registers, etc.) into bit fields in the instruction encoding. If you're serious about hand-assembling to begin with this should be no problem.
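As a rough illustration of plugging a register into a bit field, here is a sketch in Python that hand-assembles a three-instruction Z80 routine. The helper names are made up, and the `SCREEN` address is a hypothetical placeholder, not the real TRS-80 Model II memory map.

```python
# Hand-assemble a tiny Z80 routine by filling register codes into opcodes.
# Z80 3-bit register codes as used in instruction bit fields.
REG = {"B": 0, "C": 1, "D": 2, "E": 3, "H": 4, "L": 5, "A": 7}

def ld_r_n(reg, value):
    """LD r, n  ->  00 rrr 110, followed by the immediate byte."""
    return bytes([0b00000110 | (REG[reg] << 3), value & 0xFF])

def ld_mem_a(addr):
    """LD (nn), A  ->  0x32, then the address with the low byte first."""
    return bytes([0x32, addr & 0xFF, (addr >> 8) & 0xFF])

RET = bytes([0xC9])

SCREEN = 0x3C00  # hypothetical screen address; check your machine's memory map
program = ld_r_n("A", ord("A")) + ld_mem_a(SCREEN) + RET
print(program.hex(" "))  # 3e 41 32 00 3c c9
```

Those six bytes are what you would type into the monitor before jumping to them.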
tomcam
a year ago
I’ve told this before, but it’s so amazing I like to give Tim credit. I worked on Visual Basic at Microsoft with Tim Paterson, who also created the operating system that became MS-DOS. He worked on code generation and debugged by looking at the opcodes in a hex dump. Assembly was too slow for him.
pjc50
a year ago
"ARM Architecture reference manual": https://documentation-service.arm.com/static/5f8daeb7f86e165... (assuming that link works)
Start at section A5 describing the encoding. The instruction set is very much designed for clean decode, so instructions are grouped by bit pattern; every instruction in the manual has its bit pattern described.
Very much a "but why?" situation, since translation from assembly to machine code is so easily automatable and doing it by hand adds so little value.
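For a feel of how regular the bit patterns are, here is a minimal sketch in Python, assuming the classic 32-bit ARM (A32) data-processing layout: cond, 00, I, opcode, S, Rn, Rd, operand2. The function name is made up, and only plain register operands and unrotated 8-bit immediates are handled.

```python
# Hand-encode classic 32-bit ARM data-processing instructions.
# Layout, high bits to low: cond | 00 | I | opcode | S | Rn | Rd | operand2

AL = 0xE  # "always" condition code
OPCODES = {"and": 0x0, "sub": 0x2, "add": 0x4, "mov": 0xD}

def data_processing(op, rd, rn, operand2, immediate=False, set_flags=False, cond=AL):
    """Pack the fields into one 32-bit word (no shifts or rotated immediates)."""
    if immediate and not 0 <= operand2 <= 0xFF:
        raise ValueError("this sketch only handles unrotated 8-bit immediates")
    if not immediate and not 0 <= operand2 <= 15:
        raise ValueError("register operand must be r0..r15")
    return ((cond << 28) | (int(immediate) << 25) | (OPCODES[op] << 21)
            | (int(set_flags) << 20) | (rn << 16) | (rd << 12) | operand2)

print(f"{data_processing('add', rd=0, rn=1, operand2=2):08x}")                  # e0810002
print(f"{data_processing('mov', rd=0, rn=0, operand2=1, immediate=True):08x}")  # e3a00001
```

The first line is `add r0, r1, r2` and the second is `mov r0, #1`; changing one field changes one group of bits, which is what makes the decode clean.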
alok-g
a year ago
>> but why?
Agreed. It's more of a curiosity for me, for learning and research purposes.
eterps
a year ago
You should definitely look at: https://github.com/akkartik/mu/blob/main/subx.md
eterps
a year ago
Also, running machine code directly can be done like this:
alok-g
a year ago
This is cool! This is how I used to do it too, by embedding hand-written machine code within a BASIC program and calling it to run natively.
eterps
a year ago
> This is how I used to do also by embedding hand-written machine code within a BASIC program and calling it to run natively
Me too; that's also the reason why I wanted that possibility back.
alok-g
a year ago
SubX seems like an assembly language itself following a subset of x86 32-bit instructions. Would looking into this help me understand how to translate from assembly to machine code manually? Thanks.
akkartik
a year ago
SubX is a weird thing (I built it) that is somewhere between machine code and Assembly language. You have to type in the opcodes directly, which people typically associate with machine code. But it smooths some aspects of programming in machine code. You'll get nice errors if you accidentally write invalid machine code, it won't just go off and run data as code or something like that.
I'd be happy to support you if you choose to try it out! Ask as many questions as you like.
Even if you choose not to, you might like the cheatsheet in the repo (from https://net.cs.uni-bonn.de/fileadmin/user_upload/plohmann/x8...)
alok-g
a year ago
Thanks! I understand better now.
And that cheat sheet PDF is cool, exactly what I was looking for. Any chance you are aware of something similar for x64? Thanks.
akkartik
a year ago
No, sorry. Honestly I spent a long time going to the source of the 3-volume Intel manual. (There's a link to them as well in my Readme.) I think that's really what you need to do for machine code, if you're not using Assembly or Assembly-ish that has done that work for you. But then any Assembly language will have its own manual you need to bone up on. That's mostly why I built SubX: the manual is like 10 pages, and I distilled down the parts of the Intel manual you need to know. But yeah, only for 32-bit. I always found x64 very hacky with the register bits split up between bytes and whatnot. 32-bit is a legitimately nice ergonomic machine.
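To show what "register bits split up between bytes" means, here is a small sketch in Python that encodes a 64-bit register-to-register `mov` using opcode 0x89 (MOV r/m64, r64). The function name is made up; the point is that the low 3 bits of each register number go in the ModRM byte while the fourth bit lands in the REX prefix.

```python
# Encode `mov dst, src` for 64-bit registers: REX prefix, opcode 0x89, ModRM.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
REG = {name: number for number, name in enumerate(REGS)}

def mov_reg_reg(dst, src):
    d, s = REG[dst], REG[src]
    # REX is 0100WRXB: W=1 for 64-bit operands, R = high bit of the ModRM reg
    # field (src), B = high bit of the ModRM rm field (dst).
    rex = 0x48 | ((s >> 3) << 2) | (d >> 3)
    # ModRM: mod=11 (register direct), reg = src low 3 bits, rm = dst low 3 bits.
    modrm = 0xC0 | ((s & 7) << 3) | (d & 7)
    return bytes([rex, 0x89, modrm])

print(mov_reg_reg("rax", "rbx").hex(" "))  # 48 89 d8
print(mov_reg_reg("r8", "rbx").hex(" "))   # 49 89 d8
print(mov_reg_reg("rax", "r9").hex(" "))   # 4c 89 c8
```

In the second and third lines only the prefix byte and one ModRM field change, even though the register number is a single 4-bit value.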
alok-g
a year ago
Thanks for your efforts creating SubX. Now I understand the motivations behind that better.
kragen
a year ago
arm is the nicest instruction encoding you can get actual hardware for (not thumb or aarch64). risc-v is pretty okay at the assembly level but the instruction encoding is almost deliberately sadistic. amd64 isn't too terrible but not nearly as nice as arm
older official arm documentation is a lot better than recent, which is very poor quality (though still pretty reliable.) oldnewthing and azeria-labs have good tutorials, though she got some of the condition flags wrong
mbonnet
a year ago
I took a class called Computer Architecture in 2019 that was half Armv7 assembly programming. The tests were handwritten. It set me up for so much success in my subsequent career of embedded flight software.
I have also never again reached the high water mark of my programming life, which happened in that class - entering a 10-second ascended state and writing a 60-line complex assembly function in one go, no backspaces, no changes, and it working perfectly.
kragen
a year ago
this could be easy or hard depending on what you had to write in assembly on the test. like it wouldn't be that hard to write a subroutine to add up an array of integers or something
addem: xor eax, eax
loop:
test ecx, ecx
jnz ok
ret
ok: add eax, [ebx + ecx * 4]
dec ecx
jmp loop
(i haven't tested this, it'd be hilarious if i got it wrong)
hirvi74
a year ago
The assembly on the test was much larger than the snippet you provided. I wish I had kept a copy of the old test so I could just copy an example problem. Honestly though, that wasn't the worst part of the tests. It was the most dangerous though, because many of the questions/instructions relied on previously completed questions/instructions being correct.
So, like if you passed a wrong value to a certain register, then downstream, every problem that used that register would be off.
The tests were often 4 pages front and back with the handwritten assembly, definition matching, word problems, essay responses, etc. We had like 75 minutes to take the test too. This was all at a public, no-name state university too. I was no MIT student or anything.
Anyway, I'll never forget the first day of class. Our professor said we were to have 4 tests and a final and some labs.
My friend in the class: "If we make a 100 on all 4 tests, do we have to take the final?"
The professor: "Hell, if you make 100 on 4 tests, then I will let you write the final."
My friend: "Why? Has no one ever done that before?"
Professor: "No, in fact, in the 20 years I have taught this class, no one has ever scored a 100 on any test."
We all knew we were in for hell after that.
kragen
a year ago
that's awesome! i wish you'd kept a copy too
gus_massa
a year ago
Assuming ebx is the pointer to the array and ecx is the length, doesn't this sum the slots from 1 to ecx (inclusive) instead of 0 to ecx-1 (inclusive)?
kragen
a year ago
hahaha, yes! i guess it wasn't as trivial as i thought. that's what i get for trying to be clever — guess i wouldn't have done that well on that exam ;)
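For anyone following along, here is a rough Python model of the off-by-one, assuming ebx points at the array and ecx holds its length as gus_massa describes. It ignores 32-bit wraparound, and the function names are made up.

```python
def addem_as_posted(mem, ecx):
    """Model of the posted loop: adds [ebx + ecx*4], then decrements ecx."""
    eax = 0
    while ecx != 0:
        eax += mem[ecx]  # reads slot ecx, so slots n down to 1
        ecx -= 1
    return eax

def addem_fixed(mem, ecx):
    """Same loop with the decrement moved ahead of the add: slots n-1 down to 0."""
    eax = 0
    while ecx != 0:
        ecx -= 1
        eax += mem[ecx]
    return eax

array = [10, 20, 30]
memory = array + [999]  # whatever happens to sit just past the array
print(addem_as_posted(memory, len(array)))  # 1049, i.e. 20 + 30 + 999
print(addem_fixed(memory, len(array)))      # 60
```

In the assembly, one possible fix is the same swap (put `dec ecx` before the `add`), or keep the order and address `[ebx + ecx * 4 - 4]` instead.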