Every time you call a procedure, the 8086 stack quietly saves a return address, shuffles a stack pointer, and creates a structured frame of memory that the callee can use for parameters and local variables. The moment things go wrong — a missed POP, a wrong RET, a mismatched call convention — the crash that follows feels completely unrelated to the actual bug. This post makes the mechanics explicit, from the exact bytes SP points to on every PUSH, to how BP creates a stable window into a procedure’s own data that survives any number of nested calls.
The Stack Lives in the SS Segment
The stack is a region of memory in the Stack Segment. SS holds the 16-bit segment base; SP holds the 16-bit offset to the current top item (the most recently pushed word). Physical address of the stack top = (SS × 16) + SP. The stack grows downward: PUSH decrements SP first, then writes. POP reads first, then increments SP. SP always points to the last item pushed, not the next free slot.
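The address formula above can be checked with a few lines of C. This is a minimal sketch, not part of the original post; the function name `phys_addr` is made up for illustration, and the mask models the 8086's 20 address lines (addresses past 0xFFFFF wrap to 0).

```c
#include <stdint.h>

/* Physical address of seg:off on the 8086: (segment * 16) + offset,
   truncated to 20 bits. phys_addr is an illustrative name. */
uint32_t phys_addr(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}
```

With SS=9000h and SP=0100h, `phys_addr(0x9000, 0x0100)` gives 90100h, the physical location of the current stack top.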

PUSH and POP: Byte-Level Mechanics
PUSH is always 16-bit on the 8086 — there is no byte PUSH. It decrements SP by 2, then writes the low byte of the source at SS:SP and the high byte at SS:SP+1 (little-endian). POP reads the word at SS:SP into the destination, then increments SP by 2. The old memory location is not erased; it simply becomes garbage below the new stack top.
; Starting: SS=9000h, SP=0100h
PUSH AX ; AX=1234h
; SP = 00FEh
; SS:00FEh = 34h (low byte), SS:00FFh = 12h (high byte)
PUSH BX ; BX=5678h
; SP = 00FCh
; SS:00FCh = 78h, SS:00FDh = 56h
POP CX ; CX = 5678h (reads from SS:00FCh)
; SP = 00FEh
POP DX ; DX = 1234h (reads from SS:00FEh)
; SP = 0100h -- fully restored
POP CS is not a documented 8086 instruction: CS changes only through control transfers such as a far JMP, far CALL, far RET, INT, or IRET. A common trap: PUSH SP on the original 8086 pushes the value SP holds after the decrement, not the pre-decrement value. The 80286 and later push the original value instead.
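The byte-level rules can be modelled in C to make the ordering concrete. This is a rough sketch under simplifying assumptions: the stack segment is a plain 64 KiB array indexed by offset, and every name (`Stack`, `push16`, `pop16`, `push_sp`) is invented for illustration.

```c
#include <stdint.h>

/* Toy model: one 64 KiB stack segment plus SP. Names are illustrative. */
typedef struct {
    uint8_t  ss_mem[0x10000]; /* bytes addressed by offset within SS */
    uint16_t sp;
} Stack;

/* PUSH: decrement SP by 2 first, then store low byte at SP, high byte at SP+1. */
void push16(Stack *s, uint16_t v)
{
    s->sp = (uint16_t)(s->sp - 2);
    s->ss_mem[s->sp] = (uint8_t)(v & 0xFF);
    s->ss_mem[(uint16_t)(s->sp + 1)] = (uint8_t)(v >> 8);
}

/* POP: read the word at SP, then increment SP by 2. Memory is not erased. */
uint16_t pop16(Stack *s)
{
    uint16_t lo = s->ss_mem[s->sp];
    uint16_t hi = s->ss_mem[(uint16_t)(s->sp + 1)];
    s->sp = (uint16_t)(s->sp + 2);
    return (uint16_t)(lo | (hi << 8));
}

/* PUSH SP: the 8086/8088 stores the already-decremented SP;
   the 80286 and later store the value SP had before the instruction. */
void push_sp(Stack *s, int is_286_or_later)
{
    uint16_t before = s->sp;
    push16(s, is_286_or_later ? before : (uint16_t)(before - 2));
}
```

Replaying the trace above (SP=0100h, push 1234h, push 5678h, pop twice) returns SP to 0100h and leaves 34h still sitting at offset 00FEh.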
CALL and RET: Near vs Far
CALL pushes the return address — the IP of the instruction immediately after the CALL — then jumps to the target. A near CALL pushes only IP (2 bytes). A far CALL pushes CS then IP (4 bytes total). RET pops IP; RETF pops IP then CS. RET n pops the return address then additionally drops n bytes of caller parameters, implementing the callee-cleanup convention.
; Near call (within same segment)
CALL my_proc ; pushes return IP, jumps to my_proc
; ... execution continues here after RET ...
my_proc PROC NEAR
; body
RET ; pops IP, returns to caller
my_proc ENDP
; Callee-cleanup: RET 4 pops return address then discards 4 bytes (2 word params)
my_proc2 PROC NEAR
RET 4
my_proc2 ENDP
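To see how much stack each form of CALL and RET consumes, the transfers can be simulated in C. This is a hedged sketch: the `Cpu` struct, the word-array view of the stack, and all function names are assumptions made for illustration, and SP is assumed to stay even.

```c
#include <stdint.h>

/* Illustrative CPU state; field and function names are assumptions. */
typedef struct {
    uint16_t cs, ip, sp;
    uint16_t stack[0x8000];   /* word view of the stack segment, index = SP / 2 */
} Cpu;

static void push(Cpu *c, uint16_t v)
{
    c->sp = (uint16_t)(c->sp - 2);
    c->stack[c->sp / 2] = v;
}

static uint16_t pop(Cpu *c)
{
    uint16_t v = c->stack[c->sp / 2];
    c->sp = (uint16_t)(c->sp + 2);
    return v;
}

/* Near CALL: push the IP of the next instruction, then jump within CS. */
void call_near(Cpu *c, uint16_t next_ip, uint16_t target)
{
    push(c, next_ip);
    c->ip = target;
}

/* Far CALL: push CS, then IP, then load the new CS:IP. */
void call_far(Cpu *c, uint16_t next_ip, uint16_t seg, uint16_t off)
{
    push(c, c->cs);
    push(c, next_ip);
    c->cs = seg;
    c->ip = off;
}

/* RET n (near): pop IP, then discard n bytes of arguments. Plain RET is n = 0. */
void ret_near(Cpu *c, uint16_t n)
{
    c->ip = pop(c);
    c->sp = (uint16_t)(c->sp + n);
}

/* RETF n: pop IP, then CS, then discard n bytes of arguments. */
void ret_far(Cpu *c, uint16_t n)
{
    c->ip = pop(c);
    c->cs = pop(c);
    c->sp = (uint16_t)(c->sp + n);
}
```

Pushing two word arguments, calling with `call_near`, and returning with `ret_near(&c, 4)` leaves SP exactly where it was before the arguments were pushed, which is the callee-cleanup behaviour of RET 4.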
Stack Frames with BP
SP moves every time you PUSH or POP, so referencing a parameter as “SP + some offset” breaks whenever SP changes inside the procedure. BP solves this: at procedure entry, save the caller’s BP, copy SP into BP, and leave BP alone until the epilogue. In a near procedure, [BP] then holds the saved BP, [BP+2] the return IP, and [BP+4] the most recently pushed argument, no matter how many times SP moves afterward. In a far procedure the return address takes 4 bytes, so the arguments start at [BP+6]. BP also defaults to SS as its segment, so [BP+4] reaches the stack without any override prefix.

.model small
.stack 200h
.data
result DW 0
.code
add_words PROC NEAR
PUSH BP
MOV BP, SP ; anchor frame
; [BP+4] = param2 (pushed last by caller)
; [BP+6] = param1 (pushed first by caller)
MOV AX, [BP+4]
ADD AX, [BP+6] ; AX = param1 + param2
POP BP
RET
add_words ENDP
main:
MOV AX, @data
MOV DS, AX
MOV AX, 10
PUSH AX ; param1
MOV AX, 32
PUSH AX ; param2
CALL add_words ; AX = 42
ADD SP, 4 ; caller-cleanup: remove 2 words
MOV result, AX
MOV AH, 4Ch
INT 21h
END main
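The frame layout in add_words can be traced step by step in C. The sketch below is an illustration only: it assumes a near call, models the stack as an array of words, and uses invented names (`Frame`, `at_bp`, `simulate_add_words`). It adds one extra scratch push inside the body to show that BP-relative offsets stay valid while SP moves.

```c
#include <stdint.h>

/* w[i] is the word at SS:(2*i). All names here are illustrative. */
typedef struct { uint16_t sp, bp; uint16_t w[0x8000]; } Frame;

static void push(Frame *f, uint16_t v)
{
    f->sp = (uint16_t)(f->sp - 2);
    f->w[f->sp / 2] = v;
}

static uint16_t pop(Frame *f)
{
    uint16_t v = f->w[f->sp / 2];
    f->sp = (uint16_t)(f->sp + 2);
    return v;
}

/* Read the word at [BP + disp]. */
static uint16_t at_bp(const Frame *f, int disp)
{
    return f->w[(uint16_t)(f->bp + disp) / 2];
}

/* Mirrors the listing: two pushes, near CALL, prologue, body, epilogue, RET, ADD SP,4. */
uint16_t simulate_add_words(Frame *f, uint16_t first, uint16_t second, uint16_t ret_ip)
{
    push(f, first);              /* ends up at [BP+6] */
    push(f, second);             /* ends up at [BP+4] */
    push(f, ret_ip);             /* near CALL: return IP at [BP+2] */
    push(f, f->bp);              /* PUSH BP: caller's BP at [BP+0] */
    f->bp = f->sp;               /* MOV BP, SP */
    push(f, 0xAAAA);             /* scratch push: SP moves, BP does not */
    uint16_t sum = (uint16_t)(at_bp(f, 4) + at_bp(f, 6));
    (void)pop(f);                /* balance the scratch push */
    f->bp = pop(f);              /* POP BP */
    (void)pop(f);                /* RET pops the return IP */
    f->sp = (uint16_t)(f->sp + 4); /* ADD SP, 4 in the caller */
    return sum;
}
```

Calling `simulate_add_words(&f, 10, 32, 0x0050)` returns 42 and restores both SP and BP to their starting values, matching the balanced prologue and epilogue in the assembly.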
INT and IRET: Interrupts on the Same Stack
Software and hardware interrupts use the same SS:SP stack as your procedures, but push three words (6 bytes) instead of one: FLAGS, then CS, then IP. The CPU also clears IF and TF before entering the ISR. IRET pops all three in reverse order, which restores the interrupted program’s flags. A near RET inside an ISR pops only IP, leaving CS and FLAGS on the stack and resuming at that offset inside the ISR’s own code segment, which usually crashes. RETF returns to the right CS:IP but leaves the FLAGS word on the stack and the flags unrestored, so SP drifts by 2 bytes on every interrupt. Every ISR should exit with IRET, and a hardware ISR must send an EOI to the 8259 before it does.
; Minimal correct ISR skeleton
my_isr PROC FAR
PUSH AX
PUSH DS
MOV AX, @data
MOV DS, AX ; must reload DS -- interrupted code had its own DS
; ... ISR work ...
MOV AL, 20h
OUT 20h, AL ; End-Of-Interrupt to 8259 (hardware ISRs only)
POP DS
POP AX
IRET ; pops IP, CS, FLAGS -- the only correct ISR exit
my_isr ENDP
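The difference between IRET and RETF at the end of an ISR shows up clearly in a small C model. This is a sketch under assumptions: the `Cpu` struct and function names are invented, the interrupt vector lookup is replaced by passing the ISR address directly, and only the IF and TF bits of FLAGS are modelled.

```c
#include <stdint.h>

#define FLAG_TF 0x0100u   /* trap flag, bit 8 */
#define FLAG_IF 0x0200u   /* interrupt enable flag, bit 9 */

/* Illustrative CPU state; w[i] is the word at SS:(2*i). */
typedef struct { uint16_t flags, cs, ip, sp; uint16_t w[0x8000]; } Cpu;

static void push(Cpu *c, uint16_t v)
{
    c->sp = (uint16_t)(c->sp - 2);
    c->w[c->sp / 2] = v;
}

static uint16_t pop(Cpu *c)
{
    uint16_t v = c->w[c->sp / 2];
    c->sp = (uint16_t)(c->sp + 2);
    return v;
}

/* INT: push FLAGS, clear IF and TF, push CS, push IP, then enter the ISR. */
void do_int(Cpu *c, uint16_t next_ip, uint16_t isr_cs, uint16_t isr_ip)
{
    push(c, c->flags);
    c->flags = (uint16_t)(c->flags & ~(FLAG_IF | FLAG_TF));
    push(c, c->cs);
    push(c, next_ip);
    c->cs = isr_cs;
    c->ip = isr_ip;
}

/* IRET: pop IP, CS, FLAGS. The interrupted state is fully restored. */
void do_iret(Cpu *c)
{
    c->ip = pop(c);
    c->cs = pop(c);
    c->flags = pop(c);
}

/* RETF used by mistake: CS:IP come back, the FLAGS word stays on the stack. */
void do_retf(Cpu *c)
{
    c->ip = pop(c);
    c->cs = pop(c);
}
```

After `do_int` followed by `do_iret`, SP and FLAGS match their pre-interrupt values; after `do_int` followed by `do_retf`, SP is 2 bytes lower than it started and IF is still clear.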
Calling Conventions
A calling convention is a contract between caller and callee defining who pushes arguments, what order they go on the stack, who removes them, and which registers must be preserved. Mixing conventions silently corrupts SP. The three conventions relevant to 8086 assembly are cdecl (the C default), Pascal (callee-cleanup), and register-based (the __fastcall variants). stdcall, the later Win32 convention, also uses callee cleanup but pushes right to left like cdecl, so it is not the same as Pascal.
| Convention | Arg push order | Stack cleanup | Return value | Callee must preserve | Supports variadic? |
|---|---|---|---|---|---|
| cdecl | Right to left | Caller (ADD SP,n) | AX (16-bit) / DX:AX (32-bit) | BP, SI, DI, DS, SS | Yes (caller knows arg count) |
| Pascal | Left to right | Callee (RET n) | AX / DX:AX | BP, SI, DI, DS, SS | No (callee must know count) |
| __fastcall | Leading args in registers (compiler-specific; AX, DX in this example), rest on the stack | Callee | AX / DX:AX | BP, SI, DI, DS, SS | No |
; --- cdecl: caller pushes right-to-left, caller cleans up ---
; C prototype: int add(int a, int b);
; Call: result = add(10, 20);
MOV AX, 20
PUSH AX ; second arg pushed first (right-to-left); the 8086 has no PUSH immediate (80186+)
MOV AX, 10
PUSH AX ; first arg pushed last
CALL _add
ADD SP, 4 ; caller removes 2 words = 4 bytes
; AX = return value
_add PROC NEAR
PUSH BP
MOV BP, SP
; [BP+4] = a (10), [BP+6] = b (20)
MOV AX, [BP+4]
ADD AX, [BP+6]
POP BP
RET ; cdecl: just RET, no stack adjustment
_add ENDP
; --- Pascal: caller pushes left-to-right, callee cleans up ---
; Call: result = add(10, 20);
MOV AX, 10
PUSH AX ; first arg first (left-to-right); no PUSH immediate on the 8086
MOV AX, 20
PUSH AX
CALL _add_pascal
; SP already corrected by RET 4 inside callee
_add_pascal PROC NEAR
PUSH BP
MOV BP, SP
MOV AX, [BP+4] ; b (20) -- top of frame because pushed last
ADD AX, [BP+6] ; a (10)
POP BP
RET 4 ; callee cleans 2 word args
_add_pascal ENDP
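The two listings differ only in where each parameter lands relative to BP and in who removes the arguments. A small C helper can compute both. This is a sketch for near procedures with word-sized parameters and the standard PUSH BP / MOV BP,SP prologue; the enum and function names are made up for illustration.

```c
/* Illustrative names; assumes near calls and 2-byte parameters. */
typedef enum { CONV_CDECL, CONV_PASCAL } Conv;

/* Displacement from BP of parameter i (0-based, declaration order) out of n.
   [BP+0] is the saved BP and [BP+2] the return IP, so arguments start at 4. */
int param_disp(Conv conv, int i, int n)
{
    /* Number of parameters pushed after parameter i. */
    int pushed_after = (conv == CONV_CDECL) ? i : (n - 1 - i);
    return 4 + 2 * pushed_after;
}

/* Operand for ADD SP,n after the call; 0 when the callee cleans up. */
int caller_cleanup_bytes(Conv conv, int n)
{
    return (conv == CONV_CDECL) ? 2 * n : 0;
}

/* Operand for RET n in the callee; 0 means a plain RET. */
int ret_operand(Conv conv, int n)
{
    return (conv == CONV_PASCAL) ? 2 * n : 0;
}
```

For add(a, b), `param_disp(CONV_CDECL, 0, 2)` is 4 and `param_disp(CONV_PASCAL, 0, 2)` is 6, which matches the [BP+4] and [BP+6] comments in the two procedures above.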
Interfacing Assembly with C/C++
Writing an assembly procedure callable from C requires four things: a leading underscore on the public name (16-bit DOS C compilers such as Microsoft C and Turbo C prepend _ to cdecl function names in the object file), a PUBLIC declaration so the linker can see it, the correct cdecl prologue/epilogue, and the C source declaring it with extern. The memory model must also match on both sides (small here, so the procedure is NEAR). The assembly file is assembled and the C file compiled separately, then the two object files are linked together.
; --- math_asm.asm ---
.MODEL SMALL
.CODE
PUBLIC _asm_multiply ; underscore prefix matches C's name mangling
; int asm_multiply(int a, int b);
; cdecl: args on stack, caller cleans up, return in AX
_asm_multiply PROC NEAR
PUSH BP
MOV BP, SP
PUSH SI ; preserve SI (callee-saved)
MOV AX, [BP+4] ; a
MOV SI, [BP+6] ; b
IMUL SI ; DX:AX = a * b (signed)
; AX = low 16-bit result (returned to C as int)
POP SI
POP BP
RET ; cdecl: no stack adjustment
_asm_multiply ENDP
END
/* --- main.c --- */
#include <stdio.h>
extern int asm_multiply(int a, int b); /* matches _asm_multiply in obj */
int main(void) {
int result = asm_multiply(6, 7);
printf("6 * 7 = %d\n", result); /* prints: 6 * 7 = 42 */
return 0;
}
Assemble and link: masm math_asm.asm; then cl main.c math_asm.obj (a 16-bit Microsoft C compiler; current 32/64-bit MSVC cannot build 8086 code) or tcc main.c math_asm.obj (Turbo C). The linker resolves _asm_multiply from the object file automatically.
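One detail of asm_multiply is easy to miss: IMUL SI produces a 32-bit product in DX:AX, but a 16-bit int return keeps only AX. The C model below illustrates that truncation. It is a sketch with invented names, and it uses fixed-width types because `int` on a modern host is wider than the 16-bit `int` of a DOS compiler.

```c
#include <stdint.h>

/* Full signed product, as IMUL SI leaves it in DX:AX. Illustrative name. */
int32_t imul16_full(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* What the caller sees: only the low word (AX) comes back as the result.
   The final conversion relies on two's-complement wraparound, which is how
   mainstream compilers behave. */
int16_t asm_multiply_model(int16_t a, int16_t b)
{
    uint16_t ax = (uint16_t)((uint32_t)imul16_full(a, b) & 0xFFFFu);
    return (int16_t)ax;
}
```

For 6 and 7 both functions agree on 42. For 300 and 300 the full product is 90000, but the low word alone is 24464, so a caller that needs the whole product would have to read DX as well.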
ENTER and LEAVE (80186+)
ENTER and LEAVE are 80186+ instructions that automate the standard stack frame prologue and epilogue. ENTER takes two operands: the number of bytes to reserve for local variables and a nesting level (0 for normal procedures). LEAVE restores SP from BP and pops BP. ENTER saves bytes but is generally slower than the explicit sequence it replaces, so use the pair only when the target is an 80186 or later and code size matters more than speed.
| Approach | Code | Bytes | Clocks (8086) |
|---|---|---|---|
| Explicit prologue | PUSH BP / MOV BP,SP / SUB SP,4 | 6 | 11+2+4 = 17 (8088: 15+2+4 = 21) |
| ENTER | ENTER 4, 0 | 4 | N/A on 8086 (80186+ only) |
| Explicit epilogue | MOV SP,BP / POP BP | 3 | 2+8 = 10 (8088: 2+12 = 14) |
| LEAVE | LEAVE | 1 | N/A on 8086; about 8 clocks on 80186, 5 on 80286 |
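The equivalence between ENTER/LEAVE and the explicit sequences can be written out as register operations in C. This sketch covers nesting level 0 only, models the stack as an array of words, and uses invented names (`Regs`, `enter0`, `leave`).

```c
#include <stdint.h>

/* Illustrative register file; w[i] is the word at SS:(2*i). */
typedef struct { uint16_t sp, bp; uint16_t w[0x8000]; } Regs;

/* ENTER locals, 0  ==  PUSH BP / MOV BP,SP / SUB SP,locals */
void enter0(Regs *r, uint16_t locals)
{
    r->sp = (uint16_t)(r->sp - 2);      /* PUSH BP */
    r->w[r->sp / 2] = r->bp;
    r->bp = r->sp;                      /* MOV BP, SP */
    r->sp = (uint16_t)(r->sp - locals); /* SUB SP, locals */
}

/* LEAVE  ==  MOV SP,BP / POP BP */
void leave(Regs *r)
{
    r->sp = r->bp;                      /* MOV SP, BP discards the locals */
    r->bp = r->w[r->sp / 2];            /* POP BP */
    r->sp = (uint16_t)(r->sp + 2);
}
```

Starting from SP=0100h, `enter0(&r, 4)` leaves BP=00FEh and SP=00FAh, and `leave(&r)` restores both registers to their entry values.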
FAQs
Q: Why does BP default to SS while BX defaults to DS?
BP was designed for stack frame access — accessing [BP+n] is always a stack operation, so SS is the natural segment. BX is a general data pointer; DS is its natural default. This lets you write [BP+4] without any segment override prefix.
Q: What is the difference between RET and RET 4?
RET pops just the return IP. RET 4 pops the return IP then adds 4 to SP, removing two word-parameters from the stack. This is the callee-cleanup (Pascal) convention; plain RET with ADD SP,n after the call is the caller-cleanup (cdecl) convention.
Q: What happens if a procedure’s PUSHes and POPs don’t balance before RET?
SP will be off by 2 for every unmatched PUSH or POP when RET executes. RET then pops whatever word sits at SS:SP, for example a saved register value, as the return address and jumps there, which almost always crashes or runs unrelated code. Always ensure every PUSH inside a procedure has a matching POP (or an equivalent SP adjustment) before RET.
Q: Can I use the stack inside an ISR?
Yes — INT already pushed FLAGS, CS, and IP onto the current SS:SP stack before entering the ISR, so the stack is active and ready. Always PUSH any registers you modify at the top of the ISR and POP them before IRET to leave the interrupted program’s state intact.