Sunday, 07.13.2025, 3:23 PM
Welcome Guest

Hacking & Security with Gaurav

Site menu

Statistics

Total online: 1

Guests: 1

Users: 0

Login form

Main » » Assembly Language Tutorial for Reverse Engineering part 1

4:49 PM

Assembly Language Tutorial for Reverse Engineering part 1

Bottom of Form

POSTED BY GAURAV BAPLAWAT ON OCT 9, 2011

Bottom of Form

Hello friends, lets continue our tutorial on reverse engineering. Today i will teach you assembly language basic that are necessary for learning reverse engineering. As we all know assembly language is very important for reverse engineering and we must know, what are registers and which register serves for what. How the assembly language instruction work and how can we relate them with normal high language coding( C, JAVA, VB, etc.) to hack any software.
So friends, lets start our reverse engineering

What is Assembly language?

Assembly language is a low level or simply called machine language made up of machine instructions. Assembly language is specific to processor architecture example different for x86 architecture than for SPARC architecture. Assembly language consist of assembly instructions and CPU registers. I means I will explain my tutorial considering x86 architecture... Ahhha... From where i start explaining to you ... assembly language is too big topic... I think i have to tell only what you need for reverse engineering.. So i start from CPU registers.

CPU registers - Brief Introduction:

First of all what are registers? Most of Computer Engineering and Electronics Engineering guys knows about them but for others, Registers are small segments of memory inside CPU that are used for storing temporary data. Some registers have specific functions, others are just use for some general data storage. I am considering that you all are using x86 machines. There are two types of processors 32 bit and 64 bit processors. In a 32 bit processor, each register can hold 32 bits of data. On the other hand 64 bit register can hold 64 bit data. I am explaining this tutorial considering that we are using 32 bit processors. I will explain the same for 64 bits in later.

There are several registers but for Reverse engineering we grvbaplawat.do.am users are only interested in general purpose registers. We are interested in only 9 General purpose registers namely:

EAX
EBX
ECX
EDX
ESI
EDI
ESP
EBP
EIP

All these registers serves for different purposes. So I will start explaining all of them one by one for a more clear and accurate understanding of register concepts. I am putting more strain on these because these registers are called heart of reverse engineering.

EAX register is accumulator register which is used to store results of calculations. If any function returns a value its stored into EAX register. We can access EAX register using functions to retrieve the value of EAX register.

Note: EAX register can also be used for holding normal values regardless of calculations too.

The EDX is the data register. It’s basically an extension of EAX to assist it in storing extra data for complex operations. It can also be used for general purpose data storage.

The ECX, also called the count register, is used for looping operations. The repeated operations could be storing a string or counting numbers.

The ESI and EDI relied upon by loops that process data. The ESI register is the source index for data operation and holds the location of the input data stream. The EDI points to the location where the result of data operation is stored, or the destination index.

ESP is the stack pointer, and EBP is the base pointer. These registers are used for managing function calls and stack operations. When a function is called, the function’s arguments are pushed on the stack and are followed by a return address. The ESP register points to the very top of the stack, so it will point to the return address. EBP is used to point to the bottom of the call stack.

EBX is the only register that was not designed for anything specific. It can be used for extra storage.

EIP is the register that points to the current instruction being executed. As the CPU moves through the binary executing code, EIP is updated to reflect the location where the execution is occurring.

The 'E' at the beginning of each register name stands for Extended. When a register is referred to by its extended name, it indicates that all 32 bits of the register are being addressed. An interesting thing about registers is that they can be broken down into smaller subsets of themselves; the first sixteen bits of each register can be referenced by simply removing the 'E' from the name. For example, if you wanted to only manipulate the first sixteen bits of the EAX register, you would refer to it as the AX register. Additionally, registers AX through DX can be further broken down into two eight bit parts. So, if you wanted to manipulate only the first eight bits (bits 0-7) of the AX register, you would refer to the register as AL; if you wanted to manipulate the last eight bits (bits 8-15) of the AX register, you would refer to the register as AH ('L' standing for Low and 'H' standing for High).