github.com/cea-sec/miasm @v0.1.5 sqlite

7,085 symbols 23,173 edges 340 files 2,208 documented · 31%
README

Join the chat at https://gitter.im/cea-sec/miasm

What is Miasm?

Miasm is a free and open source (GPLv2) reverse engineering framework. Miasm aims to analyze / modify / generate binary programs. Here is a non-exhaustive list of features:

  • Opening / modifying / generating PE / ELF 32 / 64 LE / BE
  • Assembling / Disassembling X86 / ARM / MIPS / SH4 / MSP430
  • Representing assembly semantics using an intermediate language
  • Emulating using JIT (dynamic code analysis, unpacking, ...)
  • Expression simplification for automatic de-obfuscation
  • ...

See the official blog for more examples and demos.

Basic examples

Assembling / Disassembling

Import Miasm x86 architecture:

>>> from miasm.arch.x86.arch import mn_x86
>>> from miasm.core.locationdb import LocationDB

Get a location db:

>>> loc_db = LocationDB()

Assemble a line:

>>> l = mn_x86.fromstring('XOR ECX, ECX', loc_db, 32)
>>> print(l)
XOR        ECX, ECX
>>> mn_x86.asm(l)
['1\xc9', '3\xc9', 'g1\xc9', 'g3\xc9']

Modify an operand:

>>> l.args[0] = mn_x86.regs.EAX
>>> print(l)
XOR        EAX, ECX
>>> a = mn_x86.asm(l)
>>> print(a)
['1\xc8', '3\xc1', 'g1\xc8', 'g3\xc1']

Disassemble the result:

>>> print(mn_x86.dis(a[0], 32))
XOR        EAX, ECX
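
The four encodings returned by `mn_x86.asm` come from two opcode directions (0x31 and 0x33) and an optional 0x67 address-size prefix (the `g` in the output). As a rough illustration that does not use Miasm at all, the sketch below decodes the ModRM byte by hand. The helper name `decode_xor` and the register table are assumptions made for this example, not part of Miasm's API, and only the register-to-register form is handled.

```python
# 32-bit general-purpose registers in x86 encoding order.
REGS32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_xor(code):
    """Decode a register-to-register 32-bit XOR (opcodes 0x31 / 0x33).

    Hypothetical helper for illustration only: it accepts an optional
    0x67 prefix and rejects anything that is not register-direct.
    """
    code = bytes(code)
    if code[0] == 0x67:  # address-size prefix, irrelevant for reg-reg forms
        code = code[1:]
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b11:
        raise ValueError("only register-direct operands are handled")
    if opcode == 0x31:    # XOR r/m32, r32: destination is the r/m field
        dst, src = REGS32[rm], REGS32[reg]
    elif opcode == 0x33:  # XOR r32, r/m32: destination is the reg field
        dst, src = REGS32[reg], REGS32[rm]
    else:
        raise ValueError("not a 32-bit XOR opcode")
    return "XOR %s, %s" % (dst, src)


for enc in (b"1\xc8", b"3\xc1", b"g1\xc8", b"g3\xc1"):
    print(enc.hex(), "->", decode_xor(enc))
```

All four byte strings decode to the same instruction, which is why the assembler reports them as equivalent candidates.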

Using the Machine abstraction:

>>> from miasm.analysis.machine import Machine
>>> mn = Machine('x86_32').mn
>>> print(mn.dis(b'\x33\x30', 32))
XOR        ESI, DWORD PTR [EAX]

For MIPS:

>>> mn = Machine('mips32b').mn
>>> print(mn.dis(b'\x97\xa3\x00 ', "b"))
LHU        V1, 0x20(SP)
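
To see where `LHU V1, 0x20(SP)` comes from, the four bytes can be split by hand into the fields of a MIPS32 I-type instruction. The following is a minimal sketch using only the standard library. The function `decode_lhu` and the two-entry register table are hypothetical names introduced for this example, and the immediate is treated as unsigned because the value here is small and positive.

```python
import struct

# Only the two register numbers that appear in this example (o32 ABI names).
REG_NAMES = {3: "V1", 29: "SP"}
LHU_OPCODE = 0b100101  # "load halfword unsigned" major opcode


def decode_lhu(word_bytes):
    """Split a big-endian I-type word into (rt, immediate, base)."""
    (word,) = struct.unpack(">I", word_bytes)
    opcode = word >> 26            # bits 31..26
    base = (word >> 21) & 0x1F     # bits 25..21
    rt = (word >> 16) & 0x1F       # bits 20..16
    imm = word & 0xFFFF            # bits 15..0
    if opcode != LHU_OPCODE:
        raise ValueError("not an LHU instruction")
    return rt, imm, base


rt, imm, base = decode_lhu(b"\x97\xa3\x00 ")
print("LHU %s, 0x%X(%s)" % (REG_NAMES[rt], imm, REG_NAMES[base]))
```

The trailing space in the byte string is simply the byte 0x20, which ends up as the load offset.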

Intermediate representation

Create an instruction:

>>> machine = Machine('arml')
>>> instr = machine.mn.dis(b'\x00 \x88\xe0', 'l')
>>> print(instr)
ADD        R2, R8, R0

Create an intermediate representation object:

>>> lifter = machine.lifter_model_call(loc_db)

Create an empty ircfg:

>>> ircfg = lifter.new_ircfg()

Add the instruction to the pool:

>>> lifter.add_instr_to_ircfg(instr, ircfg)

Print the current pool:

>>> for lbl, irblock in ircfg.blocks.items():
...     print(irblock)
loc_0:
R2 = R8 + R0

IRDst = loc_4

Working with IR, for instance by getting side effects:

>>> for lbl, irblock in ircfg.blocks.items():
...     for assignblk in irblock:
...         rw = assignblk.get_rw()
...         for dst, reads in rw.items():
...             print('read:   ', [str(x) for x in reads])
...             print('written:', dst)
...             print()
...
read:    ['R8', 'R0']
written: R2

read:    []
written: IRDst

More information on Miasm IR is in the corresponding Jupyter Notebook.

Emulation

Given the following shellcode:

00000000 8d4904      lea    ecx, [ecx+0x4]
00000003 8d5b01      lea    ebx, [ebx+0x1]
00000006 80f901      cmp    cl, 0x1
00000009 7405        jz     0x10
0000000b 8d5bff      lea    ebx, [ebx-1]
0000000e eb03        jmp    0x13
00000010 8d5b01      lea    ebx, [ebx+0x1]
00000013 89d8        mov    eax, ebx
00000015 c3          ret
>>> s = b'\x8dI\x04\x8d[\x01\x80\xf9\x01t\x05\x8d[\xff\xeb\x03\x8d[\x01\x89\xd8\xc3'

Import the shellcode using the Container abstraction:

>>> from miasm.analysis.binary import Container
>>> c = Container.from_string(s, loc_db)
>>> c
<miasm.analysis.binary.ContainerUnknown object at 0x7f34cefe6090>

Disassembling the shellcode at address 0:

>>> from miasm.analysis.machine import Machine
>>> machine = Machine('x86_32')
>>> mdis = machine.dis_engine(c.bin_stream, loc_db=loc_db)
>>> asmcfg = mdis.dis_multiblock(0)
>>> for block in asmcfg.blocks:
...  print(block)
...
loc_0
LEA        ECX, DWORD PTR [ECX + 0x4]
LEA        EBX, DWORD PTR [EBX + 0x1]
CMP        CL, 0x1
JZ         loc_10
->      c_next:loc_b    c_to:loc_10
loc_10
LEA        EBX, DWORD PTR [EBX + 0x1]
->      c_next:loc_13
loc_b
LEA        EBX, DWORD PTR [EBX + 0xFFFFFFFF]
JMP        loc_13
->      c_to:loc_13
loc_13
MOV        EAX, EBX
RET
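
The `c_next` (fall-through) and `c_to` (jump target) annotations in the listing describe a small graph. As an illustration that is independent of Miasm's own graph classes, the sketch below copies those edges into a plain dictionary and enumerates every path from the entry block to the exit. The names `CFG` and `all_paths` are assumptions of this example, and the recursion only terminates because this particular graph has no loops.

```python
# Successor edges transcribed from the block listing above.
CFG = {
    "loc_0": ["loc_b", "loc_10"],  # c_next:loc_b, c_to:loc_10
    "loc_10": ["loc_13"],          # c_next:loc_13
    "loc_b": ["loc_13"],           # c_to:loc_13
    "loc_13": [],                  # ends with RET, no successor
}


def all_paths(cfg, node, path=()):
    """Yield every path from `node` to a block without successors."""
    path = path + (node,)
    if not cfg[node]:
        yield path
        return
    for succ in cfg[node]:
        yield from all_paths(cfg, succ, path)


paths = list(all_paths(CFG, "loc_0"))
for p in paths:
    print(" -> ".join(p))
```

The two paths correspond to the `JZ` branch being not taken (through `loc_b`) or taken (through `loc_10`).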

Initializing the JIT engine with a stack:

>>> jitter = machine.jitter(loc_db, jit_type='python')
>>> jitter.init_stack()

Add the shellcode at an arbitrary memory location:

>>> run_addr = 0x40000000
>>> from miasm.jitter.csts import PAGE_READ, PAGE_WRITE
>>> jitter.vm.add_memory_page(run_addr, PAGE_READ | PAGE_WRITE, s)

Create a sentinel to catch the return of the shellcode:

def code_sentinelle(jitter):
    jitter.running = False
    jitter.pc = 0
    return True

>>> jitter.add_breakpoint(0x1337beef, code_sentinelle)
>>> jitter.push_uint32_t(0x1337beef)

Activate logs:

>>> jitter.set_trace_log()

Run at arbitrary address:

>>> jitter.init_run(run_addr)
>>> jitter.continue_run()
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000000 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000000
40000000 LEA        ECX, DWORD PTR [ECX+0x4]
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
....
4000000e JMP        loc_0000000040000013:0x40000013
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000013
40000013 MOV        EAX, EBX
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000013
40000015 RET
>>>
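
The register values in the trace can be cross-checked against a plain-Python model of the shellcode. This is only a sketch written for this walkthrough: `run_shellcode` is a hypothetical function, it does not call Miasm, and it models just the three registers the shellcode touches.

```python
MASK32 = 0xFFFFFFFF


def run_shellcode(ecx, ebx):
    """Model the shellcode on 32-bit registers; returns (eax, ebx, ecx)."""
    ecx = (ecx + 4) & MASK32      # lea ecx, [ecx+0x4]
    ebx = (ebx + 1) & MASK32      # lea ebx, [ebx+0x1]
    if (ecx & 0xFF) == 1:         # cmp cl, 0x1 ; jz 0x10
        ebx = (ebx + 1) & MASK32  # lea ebx, [ebx+0x1]
    else:
        ebx = (ebx - 1) & MASK32  # lea ebx, [ebx-1] ; jmp 0x13
    eax = ebx                     # mov eax, ebx
    return eax, ebx, ecx


# All registers start at zero in the trace.
print(run_shellcode(0, 0))
```

With both inputs at zero the model gives EAX = 0, EBX = 0 and ECX = 4, the same values as the last register dump of the trace.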

Interacting with the jitter:

>>> jitter.vm
ad 1230000 size 10000 RW_ hpad 0x2854b40
ad 40000000 size 16 RW_ hpad 0x25e0ed0

>>> hex(jitter.cpu.EAX)
'0x0L'
>>> jitter.cpu.ESI = 12

Symbolic execution

Initializing the IR pool:

>>> lifter = machine.lifter_model_call(loc_db)
>>> ircfg = lifter.new_ircfg_from_asmcfg(asmcfg)

Initializing the engine with default symbolic values:

>>> from miasm.ir.symbexec import SymbolicExecutionEngine
>>> sb = SymbolicExecutionEngine(lifter)

Launching the execution:

>>> symbolic_pc = sb.run_at(ircfg, 0)
>>> print(symbolic_pc)
((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
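
The printed value is a conditional expression: `cond?(a,b)` evaluates to `a` when `cond` is non-zero and to `b` otherwise, and `[0:8]` selects the low eight bits. As a small worked example, the function below evaluates the same formula for a concrete ECX using ordinary integers. `next_pc` is a hypothetical helper for this illustration and does not use Miasm's expression classes.

```python
def next_pc(ecx):
    """Evaluate ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10) for a concrete ECX."""
    low_byte = ((ecx + 0x4) & 0xFFFFFFFF) & 0xFF  # (ECX + 0x4)[0:8]
    cond = (low_byte + 0xFF) & 0xFF               # 8-bit addition wraps around
    return 0xB if cond != 0 else 0x10             # cond?(0xB,0x10)


for value in (0, -3):
    print("ECX = %d -> next block at %#x" % (value, next_pc(value)))
```

Adding 0xFF on eight bits is the same as subtracting 1, so the condition is zero exactly when CL equals 1 after the first `LEA`. That is the case for ECX = -3, which sends execution to 0x10, while ECX = 0 leads to 0xB.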

Same, with step logs (only changes are displayed):

>>> sb = SymbolicExecutionEngine(lifter, machine.mn.regs.regs_init)
>>> symbolic_pc = sb.run_at(ircfg, 0, step=True)
Instr LEA        ECX, DWORD PTR [ECX + 0x4]
Assignblk:
ECX = ECX + 0x4
________________________________________________________________________________
ECX                = ECX + 0x4
________________________________________________________________________________
Instr LEA        EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
EBX                = EBX + 0x1
ECX                = ECX + 0x4
________________________________________________________________________________
Instr CMP        CL, 0x1
Assignblk:
zf = (ECX[0:8] + -0x1)?(0x0,0x1)
nf = (ECX[0:8] + -0x1)[7:8]
pf = parity((ECX[0:8] + -0x1) & 0xFF)
of = ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1))[7:8]
cf = (((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1)) ^ ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1)))[7:8]
af = ((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1))[4:5]
________________________________________________________________________________
af                 = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
pf                 = parity((ECX + 0x4)[0:8] + 0xFF)
zf                 = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX                = ECX + 0x4
of                 = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf                 = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf                 = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX                = EBX + 0x1
________________________________________________________________________________
Instr JZ         loc_key_1
Assignblk:
IRDst = zf?(loc_key_1,loc_key_2)
EIP = zf?(loc_key_1,loc_key_2)
________________________________________________________________________________
af                 = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP                = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf                 = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst              = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf                 = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX                = ECX + 0x4
of                 = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf                 = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf                 = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX                = EBX + 0x1
________________________________________________________________________________
>>>

Retry the execution with a concrete ECX. Here, the symbolic / concolic execution reaches the shellcode's end:

```pycon
>>> from miasm.expression.expression import ExprInt
>>> sb.symbols[machine.mn.regs.ECX] = ExprInt(-3, 32)
>>> symbolic_pc = sb.run_at(ircfg, 0, step=True)
Instr LEA        ECX, DWORD PTR [ECX + 0x4]
Assignblk:
ECX = ECX + 0x4
________________________________________________________________________________
af                 = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP                = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf                 = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst              = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf                 = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX                = 0x1
of                 = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf                 = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf                 = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX                = EBX + 0x1
________________________________________________________________________________
Instr LEA        EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
af                 = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP                = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf                 = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst              = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf                 = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX                = 0x1
of                 = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf                 = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf                 = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX                = EBX + 0x2
________________________________________________________________________________
Instr CMP        CL, 0x1
Assignblk:
zf = (ECX[0:8] + -0x1)?(0x0,0x1)
nf = (ECX[0:8] + -0x1)[7:8]
pf = parity((ECX[0:8] + -0x1) & 0xFF)
of = ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1))[7:8]
cf = (((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1)) ^ ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1)))[7:8]
af = ((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1))[4:5]
________________________________________________________________________________
af                 = 0x0
EIP                = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf                 = 0x1
IRDst              = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf                 = 0x1
ECX                = 0x1
of                 = 0x0
nf                 = 0x0
cf
```

Core symbols most depended-on inside this repo

  • append (miasm/loader/pe.py): called by 1523
  • check_instruction (test/arch/mep/asm/ut_helpers_asm.py): called by 1317
  • addop (miasm/arch/x86/arch.py): called by 677
  • rmmod (miasm/arch/x86/arch.py): called by 428
  • zeroExtend (miasm/expression/expression.py): called by 251
  • compute (test/arch/arm/sem.py): called by 229
  • add (miasm/os_dep/win_api_x86_32.py): called by 218
  • is_int (miasm/expression/expression.py): called by 202

Shape

Method 3,911
Function 2,031
Class 1,143

Languages

Python 100%

Modules by API surface

  • miasm/arch/x86/sem.py: 483 symbols
  • miasm/os_dep/win_api_x86_32.py: 334 symbols
  • miasm/arch/x86/arch.py: 294 symbols
  • miasm/arch/arm/arch.py: 277 symbols
  • miasm/expression/expression.py: 273 symbols
  • miasm/core/types.py: 198 symbols
  • miasm/arch/aarch64/arch.py: 190 symbols
  • miasm/core/objc.py: 165 symbols
  • miasm/arch/arm/sem.py: 159 symbols
  • miasm/arch/aarch64/sem.py: 156 symbols
  • miasm/core/cpu.py: 155 symbols
  • miasm/loader/elf_init.py: 131 symbols

Dependencies from manifests, versioned

pyparsing 2.0 · 1×

For agents

$ claude mcp add miasm \
  -- python -m otcore.mcp_server <graph>