My precompiled x86 code may be running in 16-bit mode (real mode or 16-bit protected mode) or in 32-bit mode (i386 protected mode). How do I detect which one from the code at runtime?
I was able to come up with this NASM source:
bits 16
cpu 386
pushf
test ax, strict word 0 ; In 32-bit mode this is `test eax, ...', +2 bytes.
jmp short found_16
; Fall through to found_32.
found_32:
bits 32
popf
int 32 ; Or whatever code.
found_16:
bits 16
popf
int 16 ; Or whatever code.
However, I don't like it, because it uses the stack. Is there a solution which doesn't modify any general-purpose registers, segment registers or flags, doesn't use the stack, and works on an 8086 (16-bit mode only) and on a 386 (both modes)?
I've tried lea esi, [dword esi+0] in 32-bit mode, but that translates to a non-nop in 16-bit mode.
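To make the problem with the lea attempt concrete, here is a hand-decoded sketch (not tool output) of how those six bytes would be read in each mode. Both halves should assemble to the same bytes, 8D B6 00 00 00 00; the byte values in the comments are worked out by hand from the ModRM encoding rules.

```nasm
; 32-bit view: a 6-byte no-op that leaves ESI and FLAGS unchanged.
bits 32
lea esi, [dword esi+0]   ; 8D B6 00 00 00 00

; 16-bit view of the same six bytes: two instructions, neither a no-op.
bits 16
lea si, [word bp+0]      ; 8D B6 00 00: ModRM B6 now means [bp+disp16], so SI = BP
add [bx+si], al          ; 00 00: writes to memory and changes FLAGS
```

So in 16-bit mode the sequence clobbers SI, modifies a byte of memory and alters FLAGS, which is exactly what the requirements rule out.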
Please note that I'm aware that for most programs the mode is decided at compile time (as part of the architecture and platform), and they don't have to be able to detect the mode at runtime. Also, for programs started normally, the operating system will choose the correct mode based on the file header, so there is almost no danger of accidentally running a full program file in the wrong mode. However, some program snippets such as exploit shellcode can benefit from runtime detection of all kinds (including the architecture and the operating system). I also have some other obscure use cases in mind.
If you are happy with temporary changes to esi that are then undone, something like this could work:
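bits 16
lea si, [si + 0xa6] ; 16-bit: SI += 0xa6. 32-bit: start of `lea esi, [esi - 0x6ff21500]'.
jmp short found16 ; 16-bit only. In 32-bit mode these 2 bytes are part of the disp32.
nop ; Last byte of the disp32 in 32-bit mode.
found32:
bits 32
lea esi, [esi + 0x6ff21500] ; Undo the 32-bit change to ESI.
int 32
nop dword [eax + 1] ; padding for better disasm
found16:
bits 16
lea si, [si - 0xa6] ; Undo the 16-bit change to SI.
int 16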
In 16-bit mode that decodes to:
00000000 8DB4A600 lea si,[si+0xa6]
00000004 EB0D jmp short 0x13
00000006 90 nop
00000007 8DB60015 lea si,[bp+0x1500]
0000000B F26F repne outsw
0000000D CD20 int 0x20
0000000F 0F1F4001 nop word [bx+si+0x1]
00000013 8DB45AFF lea si,[si-0xa6]
00000017 CD10 int 0x10
And in 32-bit mode:
00000000 8DB4A600EB0D90 lea esi,[esi-0x6ff21500]
00000007 8DB60015F26F lea esi,[esi+0x6ff21500]
0000000D CD20 int 0x20
I realized I can improve on my previous solution.
JMP NEAR (opcode 0xE9) takes a two-byte 16-bit immediate displacement in 16-bit mode, and a four-byte 32-bit displacement in 32-bit mode. Moreover, this displacement is relative to the start of the next instruction. So if the upper 16 bits of the 32-bit displacement are zero, the jump target in 16-bit mode is two bytes before the jump target in 32-bit mode. That's just enough space for a short jump to the real 16-bit destination.
NASM example:
bits 16
jmp near found_16
dw 0x0
found_16:
bits 16
jmp short main_16 ; must be exactly 2 bytes
found_32:
bits 32
;; up to 127 total bytes of code can go here
;; jump elsewhere if you need more space
int 32
hlt
main_16:
bits 16
;; unlimited space here
int 16
hlt
Output of ndisasm -b16 foo.bin:
00000000 E90200 jmp 0x5
00000003 0000 add [bx+si],al ; not executed
00000005 EB03 jmp short 0xa
00000007 CD20 int 0x20 ; not executed
00000009 F4 hlt ; not executed
0000000A CD10 int 0x10
0000000C F4 hlt
Output of ndisasm -b32 foo.bin:
00000000 E902000000 jmp 0x7
00000005 EB03 jmp short 0xa ; not executed
00000007 CD20 int 0x20
00000009 F4 hlt
0000000A CD10 int 0x10 ; not executed
0000000C F4 hlt ; not executed
My previous solution, included for reference, was to use 0x0001 as the upper 16 bits of the displacement, so that in 32-bit mode the jump target is 64K+2 bytes further along. This requires a little more than 64K of code space.
bits 16
jmp near do_16
next_insn_16:
dw 0x1
next_insn_32:
do_16:
int 16
;; The space between next_insn_32 and do_32
;; should equal 0x10000 + (do_16 - next_insn_16)
db (0x10000 + (do_16 - next_insn_16) - ($ - next_insn_32)) dup 0x90
do_32:
bits 32
int 32
Output of ndisasm -b16 foo.bin:
00000000 E90200 jmp 0x5
00000003 0100 add [bx+si],ax
00000005 CD10 int 0x10
00000007 90 nop
; ...
Output of ndisasm -b32 foo.bin:
00000000 E902000100 jmp 0x10007
00000005 CD10 int 0x10
00000007 90 nop
; ...
00010006 90 nop
00010007 CD20 int 0x20
It's very unusual to want this and also want to preserve the original architectural state including FLAGS. Your test instruction-length difference detection method is what I'd use. Or use mov reg, imm16/32 if you want to preserve FLAGS but can clobber a register of your choice.
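The mov reg, imm16/32 variant is not spelled out above, so here is a minimal sketch of how it might look. It assumes AX/EAX is the register you can afford to clobber; the labels found_16 and found_32 and the int placeholders simply mirror the question's code.

```nasm
bits 16
mov ax, 0              ; B8 00 00. In 32-bit mode this is `mov eax, imm32', +2 bytes.
jmp short found_16     ; EB xx. In 32-bit mode these 2 bytes are the top half of the imm32.
; Fall through to found_32.
found_32:
bits 32
int 32                 ; Or whatever code. EAX holds a junk value, FLAGS are untouched.
found_16:
bits 16
int 16                 ; Or whatever code. AX is 0, FLAGS are untouched.
```

Neither mov nor jmp writes FLAGS, and nothing here touches the stack, so the only side effect is the clobbered register. Both instructions exist on the 8086, so the 16-bit path should also work there.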
If your CPU supports long NOPs (0F 1F modrm), you can use that instead of the test eax, imm32 / imm16 opcode to avoid affecting even FLAGS.
Long-NOP support is present at least since P6 (Pentium Pro / Pentium II), but maybe not in P5 Pentium or earlier; if you care about compatibility with retro hardware you should double-check or just avoid this and use Jester's answer.
bits 32
nop dword [eax + 0x02EB0000] ; 02 is the branch displacement for 16-bit mode
found32:
int 32
found16:
int 16
Decoded in 16-bit mode:
$ ndisasm -b16 polyglot
00000000 0F1F800000 nop word [bx+si+0x0]
00000005 EB02 jmp short 0x9
00000007 CD20 int 0x20
00000009 CD10 int 0x10
Decoded in 32-bit mode:
$ ndisasm -b32 polyglot
00000000 0F1F800000EB02 nop dword [eax+0x2eb0000]
00000007 CD20 int 0x20
00000009 CD10 int 0x10
Same code with the branch displacement calculated by the assembler, so you can put up to 127 bytes between the nop / jmp short and the found16: label.
bits 32
; nop dword [eax + 0x02EB0000]
db 0x0f, 0x1f, 0x80, 0, 0 ; nop word/dword opcode, ModRM, and 2 bytes of displacement
db 0xEB ; jmp short opcode
db found16 - ($+1) ; relative to the end of the insn, but $ is the start of the line
found32:
bits 32
int 32
found16:
bits 16
int 16
Semi-related: https://codegolf.stackexchange.com/questions/139243/determine-your-languages-version/139717#139717 - 11 bytes of machine code that sets AL = 16, 32, or 64 (and destroys FLAGS, CX, and R8B).