8086 Stack Operations: SS:SP, PUSH/POP, CALL/RET, and Stack Frames

Every time you call a procedure, the 8086 stack quietly saves a return address, shuffles a stack pointer, and creates a structured frame of memory that the callee can use for parameters and local variables. The moment things go wrong — a missed POP, a wrong RET, a mismatched call convention — the crash that follows feels completely unrelated to the actual bug. This post makes the mechanics explicit, from the exact bytes SP points to on every PUSH, to how BP creates a stable window into a procedure’s own data that survives any number of nested calls.

The Stack Lives in the SS Segment

The stack is a region of memory in the Stack Segment. SS holds the 16-bit segment base; SP holds the 16-bit offset to the current top item (the most recently pushed word). Physical address of the stack top = (SS × 16) + SP. The stack grows downward: PUSH decrements SP first, then writes. POP reads first, then increments SP. SP always points to the last item pushed, not the next free slot.

[Figure: 8086 stack diagram. The stack grows downward; SP always points to the last pushed item.]
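
As a worked example of the address arithmetic, take SS = 9000h and SP = 0100h (values assumed here purely for illustration). The sketch below loads both registers and notes the resulting physical address; the CLI/STI pair is a conservative habit, not something the text above requires.

```asm
; Sketch: point the stack at 9000h:0100h (values are assumptions for illustration).
; Physical top of stack = 9000h * 10h + 0100h = 90000h + 0100h = 90100h.
    CLI                 ; keep interrupts out while SS:SP is half-updated
    MOV  AX, 9000h
    MOV  SS, AX         ; a segment register cannot be loaded from an immediate
    MOV  SP, 0100h
    STI
; The first PUSH after this writes its word at 9000h:00FEh (physical 900FEh).
```

Loading SS and SP back to back matters because an interrupt arriving between the two loads would push its return state through a mismatched SS:SP pair.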

PUSH and POP: Byte-Level Mechanics

PUSH is always 16-bit on the 8086 — there is no byte PUSH. It decrements SP by 2, then writes the low byte of the source at SS:SP and the high byte at SS:SP+1 (little-endian). POP reads the word at SS:SP into the destination, then increments SP by 2. The old memory location is not erased; it simply becomes garbage below the new stack top.

; Starting: SS=9000h, SP=0100h

PUSH AX         ; AX=1234h
                ; SP = 00FEh
                ; SS:00FEh = 34h (low byte), SS:00FFh = 12h (high byte)

PUSH BX         ; BX=5678h
                ; SP = 00FCh
                ; SS:00FCh = 78h, SS:00FDh = 56h

POP  CX         ; CX = 5678h (reads from SS:00FCh)
                ; SP = 00FEh

POP  DX         ; DX = 1234h (reads from SS:00FEh)
                ; SP = 0100h  -- fully restored

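POP CS is not a documented 8086 instruction; CS changes only through control transfers: a far JMP, far CALL, RETF, INT, or IRET. (On the 8086/8088 the opcode 0Fh does happen to execute as POP CS, but it was never documented, and the 80286 and later reuse that opcode as a prefix for two-byte instructions.) A common trap: PUSH SP on the 8086/8088 stores the value of SP after the decrement, not the value it held before the instruction. The 80286 and later store the pre-decrement value instead.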
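
The PUSH SP difference is easy to observe, and it is a long-standing trick for telling the CPU generations apart. A minimal sketch, with is_8086 and done as hypothetical labels:

```asm
    PUSH SP             ; 8086/8088: stores SP after the decrement
                        ; 80286+:    stores SP before the decrement
    POP  AX             ; AX = the stored value; SP is back where it started
    CMP  AX, SP
    JNE  is_8086        ; the values differ only on the 8086/8088 family
    ; ... 80286-or-later path ...
    JMP  SHORT done
is_8086:
    ; ... 8086/8088 path ...
done:
```

If you need the pre-decrement value on every CPU, copy it first: MOV AX, SP followed by PUSH AX.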

CALL and RET: Near vs Far

CALL pushes the return address — the IP of the instruction immediately after the CALL — then jumps to the target. A near CALL pushes only IP (2 bytes). A far CALL pushes CS then IP (4 bytes total). RET pops IP; RETF pops IP then CS. RET n pops the return address then additionally drops n bytes of caller parameters, implementing the callee-cleanup convention.

; Near call (within same segment)
CALL my_proc    ; pushes return IP, jumps to my_proc
; ... execution continues here after RET ...

my_proc PROC NEAR
    ; body
    RET         ; pops IP, returns to caller
my_proc ENDP

; Callee-cleanup: RET 4 pops return address then discards 4 bytes (2 word params)
my_proc2 PROC NEAR
    RET 4
my_proc2 ENDP
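
To see how RET n keeps SP balanced, here is a trace of a call to my_proc2 with two word arguments. The starting value SP = 0100h and the argument values are assumptions for illustration.

```asm
; Assumed start: SP = 0100h
    MOV  AX, 1
    PUSH AX             ; SP = 00FEh  (first argument)
    MOV  AX, 2
    PUSH AX             ; SP = 00FCh  (second argument)
    CALL my_proc2       ; near CALL pushes the return IP: SP = 00FAh
                        ; inside my_proc2, RET 4 pops IP (SP = 00FCh),
                        ; then adds 4 (SP = 0100h)
    ; execution resumes here with SP = 0100h, so no ADD SP,4 is needed
```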

Stack Frames with BP

⚡ Key Takeaway — BP as a Stable Frame Anchor

SP moves every time you PUSH or POP, so using it to reference parameters breaks the moment SP changes inside the function. Save the caller's BP, copy SP into BP at entry, and never move BP until the epilogue; from that point [BP+4] is always the last-pushed parameter of a near procedure, no matter how many more items are pushed afterward. BP also defaults to SS as its segment, so stack frame access needs no override prefix.

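SP moves every time you PUSH or POP, so referencing a parameter as “SP + some offset” breaks whenever SP changes. BP solves this: at procedure entry you save the caller's BP with PUSH BP, copy SP into BP, and then leave BP alone until the epilogue. In a near procedure the frame is then fixed: [BP+0] holds the saved BP, [BP+2] the return IP, and [BP+4] the most recently pushed parameter, no matter how many times SP moves afterward. In a far procedure the return address is 4 bytes, so parameters start at [BP+6]. BP also defaults to SS for its segment, so [BP+4] accesses the stack without any override prefix.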

[Figure: Stack frame with BP as the stable reference. SP moves; BP does not.]

.model small
.stack 200h
.data
    result DW 0
.code

add_words PROC NEAR
    PUSH BP
    MOV  BP, SP          ; anchor frame
    ; [BP+4] = param1 (pushed 2nd by caller)
    ; [BP+6] = param2 (pushed 1st by caller)
    MOV  AX, [BP+4]
    ADD  AX, [BP+6]      ; AX = param1 + param2
    POP  BP
    RET
add_words ENDP

main:
    MOV AX, @data
    MOV DS, AX
    MOV AX, 10
    PUSH AX              ; param2 = 10 (pushed first, lands at [BP+6])
    MOV AX, 32
    PUSH AX              ; param1 = 32 (pushed last, lands at [BP+4])
    CALL add_words       ; AX = 42
    ADD  SP, 4           ; caller-cleanup: remove 2 words
    MOV  result, AX
    MOV  AH, 4Ch
    INT  21h
END main
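
The same frame can also hold local variables: subtract from SP after setting BP, then address the locals at negative offsets. A minimal sketch, where sum_squares is a hypothetical procedure called the same way as add_words (two word arguments, caller cleanup):

```asm
; Hypothetical near procedure: AX = x*x + y*y (low 16 bits only)
sum_squares PROC NEAR
    PUSH BP
    MOV  BP, SP
    SUB  SP, 2           ; reserve one word local at [BP-2]
    MOV  AX, [BP+4]      ; last-pushed argument
    IMUL AX              ; DX:AX = AX * AX (DX is clobbered)
    MOV  [BP-2], AX      ; local = low word of the first square
    MOV  AX, [BP+6]      ; first-pushed argument
    IMUL AX
    ADD  AX, [BP-2]      ; AX = sum of the two low words
    MOV  SP, BP          ; discard the local
    POP  BP              ; restore the caller's BP
    RET                  ; caller removes the arguments (ADD SP,4)
sum_squares ENDP
```

Parameters sit above BP and locals below it, so neither offset changes when the body pushes temporaries.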

INT and IRET: Interrupts on the Same Stack

⚡ Key Takeaway — INT Pushes Three Items; Only IRET Unwinds Them

INT pushes FLAGS, then CS, then IP (6 bytes) onto SS:SP, clears IF and TF, and jumps to the ISR. IRET pops IP, CS, and FLAGS in that order. A near RET inside an ISR pops only IP, so control lands at that offset in the ISR's own code segment while CS and FLAGS stay on the stack, and the program usually crashes right away. Exit an ISR with IRET, and in a hardware ISR send an EOI to the 8259 first.

Software and hardware interrupts use the same SS:SP stack as your procedures, but push three items instead of one: FLAGS, then CS, then IP. IRET pops all three in reverse. A near RET restores only IP, so execution continues at the interrupted offset but in the ISR's code segment, with CS and FLAGS still on the stack. RETF returns to the right place but leaves the FLAGS word behind, so SP is off by 2 and the interrupted program's flags, including the interrupt-enable flag, are not restored.

; Minimal correct ISR skeleton
my_isr PROC FAR
    PUSH AX
    PUSH DS
    MOV  AX, @data
    MOV  DS, AX          ; must reload DS -- interrupted code had its own DS
    ; ... ISR work ...
    MOV  AL, 20h
    OUT  20h, AL         ; End-Of-Interrupt to 8259 (hardware ISRs only)
    POP  DS
    POP  AX
    IRET                 ; pops IP, CS, FLAGS -- the only correct ISR exit
my_isr ENDP
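
Because the interrupt frame is just three words on the stack, a handler can address it through BP like any other frame. The sketch below is a hypothetical software-interrupt handler (not a hardware ISR, so it sends no EOI) that reports success to its caller by clearing the carry flag in the saved FLAGS word:

```asm
; Hypothetical handler reached through an INT instruction
my_soft_isr PROC FAR
    PUSH BP
    MOV  BP, SP
    ; [BP+0] = saved BP
    ; [BP+2] = return IP
    ; [BP+4] = return CS
    ; [BP+6] = FLAGS as pushed by INT
    AND  WORD PTR [BP+6], 0FFFEh   ; clear CF (bit 0) in the stacked FLAGS
    POP  BP
    IRET                           ; pops IP, CS, then the modified FLAGS
my_soft_isr ENDP
```

IRET loads the modified word into FLAGS, so the caller sees CF = 0 on the instruction after its INT.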

Calling Conventions

A calling convention is a contract between caller and callee defining who pushes arguments, what order they go on the stack, who removes them, and which registers must be preserved. Mixing conventions silently corrupts SP. The three conventions relevant to 8086 assembly are cdecl (C default), Pascal (callee-cleanup, arguments pushed left to right), and register-based (__fastcall variants, whose register assignments depend on the compiler). stdcall, often mentioned alongside Pascal, is also callee-cleanup but pushes right to left like cdecl.

Convention | Arg push order | Stack cleanup | Return value | Callee must preserve | Supports variadic?
cdecl | Right to left | Caller (ADD SP,n) | AX (16-bit) / DX:AX (32-bit) | BP, SI, DI, DS, SS | Yes, caller knows arg count
Pascal | Left to right | Callee (RET n) | AX / DX:AX | BP, SI, DI, DS, SS | No, callee must know count
__fastcall (compiler-specific) | First 2 args in AX, DX; rest right to left | Callee | AX / DX:AX | BP, SI, DI, DS, SS | No

; --- cdecl: caller pushes right-to-left, caller cleans up ---
; C prototype: int add(int a, int b);
; Call: result = add(10, 20);
    MOV  AX, 20
    PUSH AX         ; second arg pushed first (right-to-left); PUSH imm needs an 80186+
    MOV  AX, 10
    PUSH AX         ; first arg pushed last
    CALL _add
    ADD  SP, 4      ; caller removes 2 words = 4 bytes
    ; AX = return value

_add PROC NEAR
    PUSH BP
    MOV  BP, SP
    ; [BP+4] = a (10), [BP+6] = b (20)
    MOV  AX, [BP+4]
    ADD  AX, [BP+6]
    POP  BP
    RET             ; cdecl: just RET, no stack adjustment
_add ENDP

; --- Pascal: caller pushes left-to-right, callee cleans up ---
; Call: result = add(10, 20);
    MOV  AX, 10
    PUSH AX         ; first arg first (left-to-right); PUSH imm needs an 80186+
    MOV  AX, 20
    PUSH AX
    CALL _add_pascal
    ; SP already corrected by RET 4 inside callee

_add_pascal PROC NEAR
    PUSH BP
    MOV  BP, SP
    MOV  AX, [BP+4]    ; b (20) -- top of frame because pushed last
    ADD  AX, [BP+6]    ; a (10)
    POP  BP
    RET  4             ; callee cleans 2 word args
_add_pascal ENDP
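
All of the offsets above assume near procedures. In a far procedure the return address occupies 4 bytes, so every parameter sits 2 bytes higher. A sketch of the cdecl example as a far procedure, with _add_far as a hypothetical name:

```asm
; cdecl, far: int add_far(int a, int b);
_add_far PROC FAR
    PUSH BP
    MOV  BP, SP
    ; [BP+0] = saved BP
    ; [BP+2] = return IP, [BP+4] = return CS
    ; [BP+6] = a, [BP+8] = b
    MOV  AX, [BP+6]
    ADD  AX, [BP+8]
    POP  BP
    RET             ; inside PROC FAR, MASM assembles this as a far return (RETF)
_add_far ENDP
```

The caller must reach it with a far CALL so that both CS and IP are on the stack; a near CALL would leave the frame 2 bytes short and the far return would pop a parameter as CS.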

Interfacing Assembly with C/C++

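Writing an assembly procedure callable from C requires four things: a leading underscore on the public name (16-bit DOS C compilers such as Microsoft C and Turbo C prepend _ to cdecl function names in the object file), a PUBLIC declaration so the linker can see it, the correct cdecl prologue/epilogue, and the C source declaring it with extern. The assembly file is assembled and the C file compiled separately, then the two object files are linked together. Both sides must use the same memory model (small here) so that the compiler emits a near call to match PROC NEAR.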

; --- math_asm.asm ---
.MODEL SMALL
.CODE

PUBLIC _asm_multiply      ; underscore prefix matches C's name mangling

; int asm_multiply(int a, int b);
; cdecl: args on stack, caller cleans up, return in AX
_asm_multiply PROC NEAR
    PUSH BP
    MOV  BP, SP
    PUSH SI                ; preserve SI (callee-saved)

    MOV  AX, [BP+4]        ; a
    MOV  SI, [BP+6]        ; b
    IMUL SI                ; DX:AX = a * b (signed)
    ; AX = low 16-bit result (returned to C as int)

    POP  SI
    POP  BP
    RET                    ; cdecl: no stack adjustment
_asm_multiply ENDP

END

/* --- main.c --- */
#include <stdio.h>

extern int asm_multiply(int a, int b);   /* matches _asm_multiply in obj */

int main(void) {
    int result = asm_multiply(6, 7);
    printf("6 * 7 = %d\n", result);     /* prints: 6 * 7 = 42 */
    return 0;
}

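Assemble and link: masm math_asm.asm; (the trailing semicolon accepts MASM's default output file names), then cl main.c math_asm.obj with a 16-bit Microsoft C compiler, or tcc main.c math_asm.obj with Turbo C. Current 32/64-bit versions of MSVC cannot link 16-bit object files. Older MASM versions fold public names to uppercase unless told to preserve case (the /Mx option), which would stop _asm_multiply from matching the C symbol. With matching names, the linker resolves _asm_multiply from the object file automatically.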

ENTER and LEAVE (80186+)

ENTER and LEAVE are 80186+ instructions that automate the standard stack frame prologue and epilogue. ENTER takes two operands: the number of bytes to reserve for local variables and a nesting level (0 for normal procedures). LEAVE restores SP from BP and pops BP. Despite being designed as a shortcut, ENTER is generally slower than the explicit prologue on the 80286 and later, while LEAVE is a single byte and costs no more than the two instructions it replaces. Neither exists on the 8086/8088, so use them only when targeting the 80186 or later, and prefer ENTER only when code size matters more than speed.

Approach | Code | Bytes | Clocks (8086)
Explicit prologue | PUSH BP / MOV BP,SP / SUB SP,4 | 6 | 11+2+4 = 17 (8088: 15+2+4 = 21)
ENTER | ENTER 4, 0 | 4 | N/A on 8086 (80186+ only)
Explicit epilogue | MOV SP,BP / POP BP | 3 | 2+8 = 10 (8088: 2+12 = 14)
LEAVE | LEAVE | 1 | N/A on 8086; about 8 clocks on 80186, 5 on 80286
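
For comparison, here is a frame with two word locals written with ENTER and LEAVE. This is a sketch that assumes MASM's .186 directive to enable the 80186 instruction set; with_locals is a hypothetical name.

```asm
.186                        ; allow 80186 instructions (will not run on an 8086/8088)
with_locals PROC NEAR
    ENTER 4, 0              ; same effect as PUSH BP / MOV BP,SP / SUB SP,4
    MOV  WORD PTR [BP-2], 0 ; first local
    MOV  WORD PTR [BP-4], 0 ; second local
    LEAVE                   ; same effect as MOV SP,BP / POP BP
    RET
with_locals ENDP
```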

Read Next & Related Articles

📚 Recommended Reading Order
  • Read next: ⑧ Interrupt System — INT pushes three stack items; IRET unwinds all three — see exactly how the IVT and 8259 fit together
  • ⑥ Flag Register — PUSHF/POPF for saving and restoring the full flag state across calls
  • ⑤ Addressing Modes — [BP+offset] and the SS default segment rule explained in full
  • ④ BIU/EU Architecture — how the SS segment register and SP participate in bus cycles

FAQs

Q: Why does BP default to SS while BX defaults to DS?
BP was designed for stack frame access — accessing [BP+n] is always a stack operation, so SS is the natural segment. BX is a general data pointer; DS is its natural default. This lets you write [BP+4] without any segment override prefix.

Q: What is the difference between RET and RET 4?
RET pops just the return IP. RET 4 pops the return IP then adds 4 to SP, removing two word-parameters from the stack. This is the callee-cleanup (Pascal) convention; plain RET with ADD SP,n after the call is the caller-cleanup (cdecl) convention.

Q: What happens if I PUSH inside a procedure and return without a matching POP?
Each unmatched PUSH leaves SP 2 bytes too low when RET executes. RET then pops the leftover word as the return address and jumps there, which almost always crashes. Make sure every PUSH inside a procedure has a matching POP before RET, or restore SP from BP in the epilogue.

Q: Can I use the stack inside an ISR?
Yes — INT already pushed FLAGS, CS, and IP onto the current SS:SP stack before entering the ISR, so the stack is active and ready. Always PUSH any registers you modify at the top of the ISR and POP them before IRET to leave the interrupted program’s state intact.
