Writing Small Shellcode
xchg eax, reg     Swaps the contents of eax and another register.
lodsd / lodsb     Loads the dword / byte pointed to by esi into eax / al, and
                  increments esi (by 4 / 1, assuming the direction flag is clear).
stosd / stosb     Saves the dword / byte in eax / al at the address pointed to
                  by edi, and increments edi (by 4 / 1, assuming the direction
                  flag is clear).
pushad / popad    Saves / restores all general-purpose registers to / from the stack.
cdq               Sign-extends eax into a quad-word in edx:eax; this can be used
                  to set edx = null if we know that eax < 0x80000000.
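To make the savings concrete, the following Python sketch compares the one-byte form of several of these instructions with a longer sequence that has a similar effect. The byte values are the standard 32-bit x86 encodings; the choice of ecx as the second register, the pairings themselves, and the names ENCODINGS and saving are illustrative assumptions, not part of the original text.

```python
# Standard 32-bit x86 encodings: (short form, longer sequence with a similar effect).
# The pairings are illustrative only; the longer forms are not exact equivalents
# (for example, mov copies a value rather than swapping two registers).
ENCODINGS = {
    "xchg eax, ecx": (bytes([0x91]), bytes([0x89, 0xC8])),             # vs mov eax, ecx
    "lodsd": (bytes([0xAD]), bytes([0x8B, 0x06, 0x83, 0xC6, 0x04])),   # vs mov eax,[esi] / add esi,4
    "stosd": (bytes([0xAB]), bytes([0x89, 0x07, 0x83, 0xC7, 0x04])),   # vs mov [edi],eax / add edi,4
    "cdq": (bytes([0x99]), bytes([0x31, 0xD2])),                       # vs xor edx, edx
}


def saving(name):
    """Return how many bytes the short form saves over the longer sequence."""
    short, long_form = ENCODINGS[name]
    return len(long_form) - len(short)


for name in ENCODINGS:
    print(f"{name:14} saves {saving(name)} byte(s)")
```

A saving of one to four bytes per instruction looks small, but it applies every time the instruction is used.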
2. Use instructions with multiple effects.
Sometimes we can achieve two desirable things at once, for example using the
above instructions xchg, lods, or stos.
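As a rough illustration of "two things at once", the Python model below treats lodsd and stosd as operations on a small register dictionary: each call both moves a dword and advances a pointer. It is a sketch that assumes the direction flag is clear; the memory layout, the register values and the helper names are hypothetical.

```python
import struct


def lodsd(mem, regs):
    """Model lodsd with the direction flag clear: eax = [esi], then esi += 4."""
    regs["eax"] = struct.unpack_from("<I", mem, regs["esi"])[0]
    regs["esi"] += 4


def stosd(mem, regs):
    """Model stosd with the direction flag clear: [edi] = eax, then edi += 4."""
    struct.pack_into("<I", mem, regs["edi"], regs["eax"])
    regs["edi"] += 4


def cdq(regs):
    """Model cdq: fill edx with copies of the sign bit of eax."""
    regs["edx"] = 0xFFFFFFFF if regs["eax"] & 0x80000000 else 0


mem = bytearray(16)
struct.pack_into("<II", mem, 0, 0x11111111, 0x22222222)   # two source dwords
regs = {"eax": 0, "edx": 0xDEADBEEF, "esi": 0, "edi": 8}

for _ in range(2):
    lodsd(mem, regs)   # load a dword and advance esi in one step
    stosd(mem, regs)   # store it and advance edi in one step

cdq(regs)              # eax < 0x80000000 here, so edx becomes 0
```

In this model the two-dword copy needs no separate pointer arithmetic, and the final cdq clears edx as a side effect of eax holding a small value.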
3. Bend API rules.
Sometimes a Windows API specifies that a parameter should be of a particular type
or in a particular range; however, through experimentation we can determine that the
actual implementation is more tolerant. For example, many APIs which take a
structure and a value for the size of the structure will work perfectly well provided that
the size parameter is simply large enough. If we know that an arbitrary large number
already exists on the stack, we can exploit the API’s tolerance to avoid having to set
the parameter explicitly.
Many APIs accept null values in several parameters, and these are often the
parameters at the end of the list, which are pushed onto the stack first. Rather than
push a null register several times, we can first “flush” a large portion of stack to zero,
and then only push the non-null parameters, relying on our empty stack to implicitly
pass null values for the rest. When calling several such functions in succession, the
reduction in size of our code can be significant.
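The effect of flushing the stack can be sketched with a toy Python model. It assumes the stdcall convention used by Windows APIs, where arguments are pushed right to left, so the trailing parameters sit at the highest addresses. The Stack class, the six-parameter call and the values A and B are hypothetical and chosen only for illustration.

```python
class Stack:
    """Toy model of a downward-growing 32-bit stack, using dword slots."""

    def __init__(self, slots=32, fill=0xCCCCCCCC):
        self.mem = [fill] * slots      # start with non-zero junk everywhere
        self.esp = slots // 2          # leave room above and below esp
        self.pushes = 0

    def push(self, value):
        self.esp -= 1
        self.mem[self.esp] = value
        self.pushes += 1

    def flush(self, count):
        """Zero `count` slots at and above esp, modelling a one-off flush."""
        for i in range(self.esp, self.esp + count):
            self.mem[i] = 0

    def args(self, count):
        """Return the `count` dwords a callee would see as its arguments."""
        return self.mem[self.esp:self.esp + count]


A, B = 0x1000, 0x2000                  # hypothetical non-null arguments

explicit = Stack()
for value in (0, 0, 0, 0, B, A):       # right to left: last parameter first
    explicit.push(value)

flushed = Stack()
flushed.flush(8)                       # one-off cost, shared by later calls
flushed.push(B)
flushed.push(A)

print(explicit.args(6) == flushed.args(6), explicit.pushes, flushed.pushes)
```

Both stacks present the same six arguments, but the flushed one needs two pushes instead of six. Under stdcall the callee removes all six arguments on return, including the implicit ones, so each call consumes part of the zeroed area and the flush must be large enough for every call that relies on it.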
We can also use space on the stack when an API requires a large structure as a
parameter. Often, we might find that the one-byte “push esp” instruction is all we
need to pass a “valid” pointer to a structure. In some cases, APIs will tolerate more
than one structure overlapping, particularly when one is an [in] parameter and the
other an [out] parameter.
4. Don’t think like a programmer.
As programmers, we get used to the idea of the call stack working in a particular,
systematic way, where we push a function’s inputs, call the function, maybe adjust
the stack pointer, and then store / process the function’s output. As shellcoders, we
can be more imaginative. To create small code, we can make use of known values in
registers to push parameters long before they will actually be used. We can use
existing values on the stack as implicit parameters without pushing anything. If we
know a suitable value exists up or down the stack, we can just adjust esp to get it in
the right place. We can also do away with the idea of a frame pointer relative to which
we locate calling parameters or local variables. These useful compiler constructs are
often too inefficient for tight shellcode, and in any case the frame pointer register is
fantastically useful for storing information across API calls (see below).
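Pushing parameters long before they are used can be modelled in a few lines of Python. The sketch assumes stdcall callees, which remove their own arguments on return; call_stdcall, the two calls and the values X, Y and P are hypothetical.

```python
def call_stdcall(stack, nargs):
    """Model a stdcall callee: read nargs dwords from the top of the stack,
    then remove them, as 'ret N' would."""
    args = stack[-nargs:][::-1]    # the end of the list is the top of the stack
    del stack[-nargs:]
    return args


stack = []
X, Y, P = 0x10, 0x20, 0x30         # hypothetical values already known early on

# Push the second call's parameters first, while X and Y are still in registers.
stack.append(Y)
stack.append(X)
# Then push the first call's single parameter on top.
stack.append(P)

first = call_stdcall(stack, 1)     # first call sees (P)
second = call_stdcall(stack, 2)    # second call sees (X, Y), with no pushes in between
```

Because the first callee cleans up only its own argument, the stack pointer is left exactly where the second call expects its parameters, and no code is needed between the two calls to set them up.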
5. Make efficient use of registers.
The x86 registers were not all created equal. Many useful instructions are
implemented for specific registers only, or are shorter for some registers than others.
Certain registers are always, or very often, preserved across API calls (ebp, esi and
edi can be relied upon, and sometimes others in specific cases). It is far more
efficient to use these registers to store information, rather than saving it on the stack.
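The size differences between registers and stack slots can be checked directly from the encodings. The Python sketch below uses standard 32-bit x86 encodings; the particular instruction pairs, the immediate value and the names SAVE_RESTORE and cost are illustrative assumptions.

```python
# Ways to keep the value of eax across a call: (save bytes, restore bytes).
SAVE_RESTORE = {
    "stack slot via ebp": (bytes([0x89, 0x45, 0xFC]),    # mov [ebp-4], eax
                           bytes([0x8B, 0x45, 0xFC])),   # mov eax, [ebp-4]
    "push / pop": (bytes([0x50]), bytes([0x58])),        # push eax / pop eax
    "mov to esi": (bytes([0x89, 0xC6]),                  # mov esi, eax
                   bytes([0x89, 0xF0])),                 # mov eax, esi
    "xchg with esi": (bytes([0x96]), bytes([0x96])),     # xchg eax, esi (twice)
}


def cost(name):
    """Total bytes needed to save and later restore the value."""
    return sum(len(part) for part in SAVE_RESTORE[name])


# Some instructions have a shorter encoding for eax than for other registers.
ADD_EAX_IMM32 = bytes([0x05, 0x78, 0x56, 0x34, 0x12])        # add eax, 0x12345678
ADD_EBX_IMM32 = bytes([0x81, 0xC3, 0x78, 0x56, 0x34, 0x12])  # add ebx, 0x12345678

for name in SAVE_RESTORE:
    print(f"{name:20} {cost(name)} bytes")
print("add eax, imm32:", len(ADD_EAX_IMM32), "bytes; add ebx, imm32:", len(ADD_EBX_IMM32), "bytes")
```

A push / pop pair is as short as the xchg form, but it ties the value to a position on the stack, which the shellcode is usually also using for parameters. A value kept in a preserved register can be read any number of times without being reloaded.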
NGSSoftware Insight Security Research Page 3 of 15 http://www.ngssoftware.com