nepenthes & libemu

I spent some time over the last months writing a libemu module for nepenthes. It turned out to be rather difficult, as nepenthes is a single-threaded program, while shellcode emulation is slow and may require creating new processes. Installing software with a broken Makefile then ran rm -rf / (yes, as root) on my system, so this effort got lost anyway.
honeytrap is more likely to get working shellcode emulation than nepenthes, as its multi-process structure fits these needs exactly.

But even if you do not have to worry about creating (sub)processes, emulating shellcode is not easy.
Most shellcode is written as fire-and-forget: if the shellcode works, be glad; if it does not work, do not care about the attacked host.

For example, shellcode relies on return values from system calls to build arguments on the stack, without verifying them.
In this case, the shellcode relied on the return value of connect and used it to create the length parameter for the recv syscall.

HMODULE LoadLibraryA (
     LPCTSTR lpFileName = 0x0012fe90 => 
           = "ws2_32";
) = 0x71a10000;
SOCKET socket (
     int af = 2;
     int type = 1;
     int protocol = 0;
) =  3;
int connect (
     SOCKET s = 3;
     struct sockaddr_in * name = 0x0012fc80 => 
         struct   = {
             short sin_family = 2;
             unsigned short sin_port = 6460 (port=15385);
             struct in_addr sin_addr = {
                 unsigned long s_addr = removed (host=removed);
             };
             char sin_zero = "       ";
         };
     int namelen = 16;
) =  -1;
int recv (
     SOCKET s = 3;
     char * buf = 0x0012fc90 => 
         none;
     int len = -64769;
     int flags = -1;
) =  -1;

As the connect failed, recv got called with negative signed values for the buffer size (-64769) and the flags (-1).
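An emulator that forwards such a call to the host would want to reject it first. A minimal sketch of that check, assuming a hypothetical hook `check_recv_args` that sees the raw 32-bit stack values before anything is proxied. The hook name, the 64 KiB cap and the flag allowlist are assumptions for illustration, not libemu or nepenthes API.

```python
import struct

MAX_RECV_LEN = 64 * 1024   # assumed cap, pick whatever the sandbox can afford
MSG_OOB = 0x1              # Winsock recv() flag values
MSG_PEEK = 0x2
ALLOWED_FLAGS = MSG_OOB | MSG_PEEK   # assumed allowlist


def as_signed32(raw):
    """Reinterpret a raw 32-bit stack value as a signed int."""
    return struct.unpack("<i", struct.pack("<I", raw & 0xFFFFFFFF))[0]


def check_recv_args(sock_ok, raw_len, raw_flags):
    """Return (ok, reason) for a recv() call about to be proxied (hypothetical hook)."""
    length = as_signed32(raw_len)
    flags = as_signed32(raw_flags)
    if not sock_ok:
        return False, "socket is not connected"
    if length <= 0 or length > MAX_RECV_LEN:
        return False, "implausible len %d" % length
    if flags & ~ALLOWED_FLAGS:
        return False, "unknown flags 0x%x" % (flags & 0xFFFFFFFF)
    return True, "ok"
```

Fed with the values from the dump above (a failed connect, len 0xffff02ff, flags 0xffffffff), the check refuses the call on the first test, and it would also refuse on len or flags alone.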

In other cases the shellcode misbehaves in different ways; for example, it tries to connect to the attacker forever if the host is unreachable.

while ( connect(...) );
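One way an emulator could cope with such a loop is to give connect a failure budget and stop emulation once it is used up. The following is only a sketch under assumptions: `ConnectNanny` and `emulate_connect_loop` are hypothetical names, and the loop is driven by canned return values instead of a real emulator.

```python
class ConnectNanny:
    """Counts consecutive failed connect() calls against a budget (hypothetical)."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    def on_connect_result(self, retval):
        """Feed an emulated connect() return value; return True to keep emulating."""
        if retval == 0:
            self.failures = 0
            return True
        self.failures += 1
        return self.failures < self.max_failures


def emulate_connect_loop(nanny, results):
    """Stand-in for `while ( connect(...) );` driven by canned return values."""
    attempts = 0
    for retval in results:
        attempts += 1
        keep_going = nanny.on_connect_result(retval)
        if retval == 0:
            return attempts, "connected"   # loop condition is false, shellcode moves on
        if not keep_going:
            return attempts, "aborted"     # budget used up, stop emulating
    return attempts, "exhausted"
```

With a budget of three, an unreachable host ends the emulation after the third failed attempt instead of spinning forever.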

Not to mention that shellcode is very hard to write and as error-prone as any other software, but once it works, it gets deployed.

This shellcode wants to connect() to a remote host, but the namelen parameter is incorrect.

int connect (
     SOCKET s = 3;
     struct sockaddr_in * name = 0x0041728a => 
         struct   = {
             short sin_family = 2;
             unsigned short sin_port = 7470 (port=11805);
             struct in_addr sin_addr = {
                 unsigned long s_addr = 1982129222 (host=70.228.36.118);
             };
             char sin_zero = "       ";
         };
     int namelen = 4289293;
) =  0;

Looking at the assembly reveals that the problem is a typo.

lea edx,[ebx+0x17f]
mov byte [edx],0x16
push edx              ; push namelen
lea edx,[ebx+0xfc]
mov word [edx],0x2    
mov di,[ebx+0x8]
mov [edx+0x2],di
mov edi,[ebx+0x4]
mov [edx+0x4],edi
push edx              ; push sockaddr *
mov eax,[ebx+0xf8]
push eax              ; push socket
call ds:[ebx+0x49]    ; call connect(socket,  sockaddr *,  namelen)

As only the namelen is wrong, we can focus on the first lines.

mov byte [edx],0x16
push edx              ; push namelen

This code stores 0x16 at memory address [edx] and then pushes the memory address onto the stack instead of the value. The value itself was presumably meant to be the size of a sockaddr_in, which is 16 (0x10, not 0x16) on win32.
Unfortunately this code works, at least on Windows; otherwise the worm (in this case the shellcode is csend from agobot) would not have had that impact and media attention in the past. If you emulate it on Linux, you have to sanitize the namelen argument.
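The numbers in the connect() dump fit this reading of the assembly: the sockaddr pointer is built at ebx+0xfc and the pushed address is ebx+0x17f, so the two logged values must differ by exactly 0x17f - 0xfc. The sketch below checks that arithmetic and shows one possible way to sanitize the argument before handing it to the host. `sanitize_namelen` is a hypothetical helper, and clamping to 16 for AF_INET is an assumption about what a proxying emulator would want.

```python
SIZEOF_SOCKADDR_IN = 16   # sizeof(struct sockaddr_in)
AF_INET = 2

# Values taken from the connect() dump and the disassembly
sockaddr_ptr = 0x0041728A          # logged `name` pointer, built at ebx+0xfc
ebx = sockaddr_ptr - 0xFC
pushed_value = ebx + 0x17F         # the address that `push edx` put on the stack


def sanitize_namelen(sin_family, namelen):
    """Hypothetical helper: return a namelen that is safe to pass to the host's connect()."""
    if sin_family == AF_INET and namelen >= SIZEOF_SOCKADDR_IN:
        return SIZEOF_SOCKADDR_IN   # oversized value, clamp to the real struct size
    return namelen                  # leave anything else for the host to reject
```

`pushed_value` comes out as 0x0041730d, which is the 4289293 logged as namelen, and `sanitize_namelen(AF_INET, pushed_value)` turns it back into 16.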

Another point: if we emulate code, we allow others to execute code on our boxes, even if only in a sandbox environment.
Even though SQLSlammer has shown that it is possible to write viral shellcode, current shellcode mostly just helps to gain access to a machine for further actions. Still, shellcode that (tcp)scans for vulnerable hosts and exploits them is possible, and therefore we have to think about it.

One approach is to profile the shellcode: execute it in a closed sandbox without proxying the syscalls to the host operating system, create a graph of the syscalls, and measure the graph afterwards to guess what the shellcode might want to do.
This approach, using simple finite-state graphs and a path matching scheme to determine the behavior of the shellcode, is outlined in ShellShock: Capturing Multi-Stage Attacks in Virtual Honeypots by Ryan Smith, Adam Prigden, Braxton Thomason, Vitaly Shmatikov, but the paper is not publicly available.
One drawback of this approach is shellcode with more than one stage: we cannot profile the second stage, because the shellcode never receives it in its closed sandbox.
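To make the profiling idea more concrete, here is a rough sketch of how a recorded call trace could be turned into a graph and matched against known paths. This is not the scheme from the paper, which I cannot quote; the profile names, the paths in `PROFILES` and all function names are made up for illustration, and matching edge by edge is a deliberate simplification.

```python
from collections import Counter

# Hypothetical behaviour profiles: each one is an ordered path of API names.
PROFILES = {
    "connect-back": ["socket", "connect", "recv"],
    "bind shell": ["socket", "bind", "listen", "accept"],
    "download and execute": ["URLDownloadToFileA", "WinExec"],
}


def build_graph(trace):
    """Turn a recorded call trace into a Counter of (previous call, next call) edges."""
    return Counter(zip(trace, trace[1:]))


def path_matches(graph, path):
    """True if every consecutive pair of the profile path is an edge in the graph."""
    return all((a, b) in graph for a, b in zip(path, path[1:]))


def classify(trace):
    """Return the names of all profiles whose path occurs in the trace's graph."""
    graph = build_graph(trace)
    return [name for name, path in PROFILES.items() if path_matches(graph, path)]
```

A trace like LoadLibraryA, socket, connect, recv would be classified as "connect-back". The multi-stage problem shows up here as well: in a closed sandbox the trace simply ends at recv, so whatever the second stage would have called never enters the graph.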

So, what's left?
The current idea is to implement a nanny for the shellcode to make sure it behaves during interactive emulation; we'll see if it works out.

2008/07/19/nepenthes_libemu.txt · Last modified: 2010/06/15 13:15 by common