Challenge #6 is probably the toughest task in the series. We are blessed with a 64-bit, statically linked ELF file with stripped symbols. During the challenge we will mainly be using radare2, gdb and IDA, plus a hex editor and Python.
First things first
Let’s execute the file (in a VM, of course) and see what the output will be (if any):
Not much, but it’s a start. This "no" will be our anchor and starting point in a minute.
Before continuing, I’d like to take a static look at the binary. As already mentioned, the file comes with stripped symbols, meaning we have no straightforward clues left for us. To continue, one needs to find the main function, as this is the code to start from. In most cases execution starts at the entry point with bootstrapping code that prepares the environment for the programmer’s code to run. The preparation process is managed by the __libc_start_main function, which has the following interface:
The first parameter here is the pointer to the main function. Now fire up radare2 and let’s do some actual examination.
radare2 is able to identify the main function during the code analysis stage.
So, knowing that it’s a 64-bit executable with the appropriate ABI, we’d expect the address of the main function to be passed in the RDI register, and radare2 indeed supports the assumption.
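The same check can be sketched by hand. The snippet below is only an illustration, assuming the entry code loads the address with a plain `mov rdi, imm32` (common for non-PIE binaries; other encodings such as `lea rdi, [rip+...]` are not handled). The function name and the sample bytes are hypothetical and not taken from the challenge binary.

```python
import struct

# Encoding of `mov rdi, imm32`: REX.W prefix, opcode C7 /0, ModRM selecting RDI.
MOV_RDI_IMM32 = b"\x48\xc7\xc7"

def main_from_start(entry_code: bytes):
    """Return the immediate loaded into RDI by the first `mov rdi, imm32`, or None."""
    idx = entry_code.find(MOV_RDI_IMM32)
    if idx == -1 or idx + 7 > len(entry_code):
        return None
    # The immediate is read as unsigned, which is fine for addresses below 2 GiB.
    (imm,) = struct.unpack_from("<I", entry_code, idx + 3)
    return imm

# Hypothetical entry-point bytes: xor ebp, ebp; mov rdi, 0x401000; call rel32
sample = bytes.fromhex("31ed" "48c7c700104000" "e800000000")
print(hex(main_from_start(sample)))
```

A naive byte search like this can produce false positives, so the result is a candidate to confirm in the disassembler rather than a definitive answer.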
Overview of the binary
The binary is heavily obfuscated with a lot of junk instructions and spaghetti code, which makes it generally unfriendly to the analyst. The figure shows the starting function, and yes, this is a single function for which even IDA complained about the number of nodes being more than 1000. Further analysis showed that most of the code is in the same unfriendly condition.
As you can see, it’s easy to get lost, but let’s dive in for a while anyway. Let’s start by randomly examining various parts of the function and looking for anything that stands out. After some time, recurring patterns start to appear which differ from the other junk code.
Various constants are updated with the first letters of some predefined words, and this happens all over the place. Examining the constants showed an interesting thing: all of them are actually cells of a static array.
References to most of them showed the same update pattern:
Intuitively, let’s examine the head of the array to check whether it’s referenced anywhere that could be of any interest.
Before moving to the dynamic part of the challenge, some of you may have spotted the bingo point (as I call it). It looks like the constant array is actually obfuscated shellcode which will be executed at the end. For now, it’s not important to understand what type of obfuscation was used. I hope the general idea is clear, and I’d like to sum things up before moving on to actually verifying all the theories:
- The binary is hardened with spaghetti and junk code
- During the execution, a static array is filled with the first letters of the predefined words
- Eventually the array will be de-obfuscated and executed; this is an educated guess which will be checked during binary execution
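As a toy model of the second point, the scattered stores can be pictured as a loop that writes one first letter per array cell. The word list below is made up for illustration; the real binary uses its own words and spreads the stores across the junk code.

```python
# Hypothetical word list, not taken from the binary.
words = ["alpha", "Bravo", "charlie", "9lives"]

def fill_from_first_letters(words):
    """Build the buffer one cell at a time, the way the scattered stores do."""
    cells = bytearray(len(words))
    for i, word in enumerate(words):
        cells[i] = ord(word[0])  # each store contributes a single first letter
    return bytes(cells)

print(fill_from_first_letters(words))
```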
So now (hopefully) you understand a little bit of what is going on. In the next step, gdb will be used as the primary tool to solve the challenge, and IDA will accompany us on the way. The author left numerous clues to help us get to the end. The first one is the "no" message which appeared at the start.
The goal is to set a breakpoint on loc_44BAB9 (Fig. 5) and get to the shellcode execution.
Next, find the "no" string in IDA. Analyzing the chain in Fig. 6, one can immediately see that not enough arguments were given on startup. There is still no information about what should be supplied, but we will definitely get there. So, let’s restart with one argument and follow the results.
This time we reuse the previous technique. The error message printed is now "na", which again indicates that we supplied too few parameters (Fig. 7), so another one is needed.
Just to check how many parameters there actually are, try adding more than two; it will always generate:
and code confirmation:
So, there are only 2 parameters to work with.
Adding two parameters got us to the next trouble:
Let’s try once again and check why we got this output (Fig. 8) by using the Xref for
What we have here is actually a
ptrace (0x65 system call) call with
As you probably understood, the current process is already traced by its parent (gdb), so the new call to ptrace will fail. The solution for this trick is actually quite easy: either overwrite the return value with set $eax = 1 in gdb after returning from ptrace, or patch jz short loc_41f232 (Fig. 8) to jmp short loc_41f232 with your favorite hex editor. Assuming that this trouble is solved, let’s continue.
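For those who prefer a script to a hex editor, the patch comes down to replacing one opcode byte: a short `jz` is encoded as 0x74 and a short `jmp` as 0xEB, with the same one-byte displacement after it. The sketch below works on an in-memory copy of the file. The function name, the sample bytes and the offset are hypothetical, and translating the virtual address of the jump into a file offset is left out.

```python
JZ_SHORT = 0x74   # opcode of `jz rel8`
JMP_SHORT = 0xEB  # opcode of `jmp rel8`

def patch_jz_to_jmp(image: bytearray, file_offset: int) -> None:
    """Turn the short conditional jump at file_offset into an unconditional one."""
    if image[file_offset] != JZ_SHORT:
        raise ValueError("no `jz short` opcode at offset %#x" % file_offset)
    image[file_offset] = JMP_SHORT  # the displacement byte that follows is kept

# Hypothetical bytes: test eax, eax; jz +5
code = bytearray.fromhex("85c07405")
patch_jz_to_jmp(code, 2)
print(code.hex())
```

Checking the original opcode before writing guards against patching the wrong offset.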
So, now we know that the application expects 2 arguments and is protected with anti-debugging. We continue now with the following:
Repeating the previous technique, it could be seen (Fig. 9) that some buffer is compared to
Backtracking leads to the fact that the first parameter is actually
V and stored in
buffer before checking with
To reveal the first parameter, let’s
V and get
Once the application is re-executed with the new parameter, it freezes. Breaking in gdb reveals the issue.
System call 0x23 is actually nanosleep, which is called from within sub_473B70. Sleeping is easily neutralized by supplying a small sleep time.
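To make "a small sleep time" concrete: nanosleep receives a pointer to a struct timespec as its first argument (in RDI), and on x86-64 Linux that structure is two 8-byte signed integers, tv_sec followed by tv_nsec. The sketch below only builds and inspects such a structure. The helper name and the one-hour value are hypothetical, and writing the bytes into the debuggee (for example over the memory RDI points to) is left to the debugger.

```python
import struct

# struct timespec on x86-64 Linux: tv_sec and tv_nsec, both 8-byte signed integers.
TIMESPEC = struct.Struct("<qq")

def short_sleep(nanoseconds: int = 1) -> bytes:
    """Return the raw bytes of a timespec that sleeps for almost no time."""
    return TIMESPEC.pack(0, nanoseconds)

# Hypothetical long request as it might sit in memory: 3600 seconds, 0 nanoseconds.
original = TIMESPEC.pack(3600, 0)
print(TIMESPEC.unpack(original))
print(short_sleep().hex())
```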
Finally, after all the adventures, gdb stopped on 0x44bab9, just before decoding the static array with the presumed shellcode.
The contents of the array at 0x729900 are likely to be base64 encoded (I did not find the need to check the algorithm). After the decoding, the following shellcode will be executed (only part of it is shown here).
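The base64 guess is cheap to test offline on a memory dump of the array. The sketch below assumes the dump is a NUL-terminated ASCII string using the standard base64 alphabet; a custom alphabet would need a different decoder. The function name and the sample data are hypothetical.

```python
import base64
import binascii

def decode_array(dump: bytes):
    """Try to base64-decode a memory dump; return None if it is not valid base64."""
    text = dump.split(b"\x00", 1)[0]  # keep everything before the first NUL byte
    try:
        return base64.b64decode(text, validate=True)
    except binascii.Error:
        return None

# Hypothetical dump: three encoded bytes followed by NUL padding.
dump = base64.b64encode(b"\x90\x90\xcc") + b"\x00" * 4
print(decode_array(dump))
```

Passing `validate=True` makes the decoder reject characters outside the alphabet instead of silently skipping them, so a `None` result is a useful hint that the guess was wrong.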
The idea here is to take the obfuscation algorithm and execute it backwards with the help of pen and paper or Python. It’s not very complex, so I leave it for you to implement. If everything is done right, you will recover the email address.
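As a starting point, running an algorithm backwards usually means applying the inverse of each step in reverse order. The sketch below assumes the checks are built from byte-wise xor, add, sub, rol and ror. That set of operations, the step list and all names are assumptions for illustration, not the actual shellcode logic.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

def ror8(value, count):
    """Rotate an 8-bit value right by count bits."""
    return rol8(value, (8 - count % 8) % 8)

# Each operation paired with the one that undoes it.
INVERSE = {"xor": "xor", "add": "sub", "sub": "add", "rol": "ror", "ror": "rol"}

def apply_op(op, value, arg):
    if op == "xor":
        return value ^ arg
    if op == "add":
        return (value + arg) & 0xFF
    if op == "sub":
        return (value - arg) & 0xFF
    if op == "rol":
        return rol8(value, arg)
    if op == "ror":
        return ror8(value, arg)
    raise ValueError("unknown operation: " + op)

def forward(byte, steps):
    """What the obfuscated check does to one input byte."""
    for op, arg in steps:
        byte = apply_op(op, byte, arg)
    return byte

def backward(expected, steps):
    """Recover the input byte from the constant it is compared against."""
    for op, arg in reversed(steps):
        expected = apply_op(INVERSE[op], expected, arg)
    return expected

# Hypothetical chain of steps for a single character.
steps = [("xor", 0x5A), ("add", 0x10), ("rol", 3)]
expected = forward(ord("f"), steps)
print(chr(backward(expected, steps)))
```

Repeating `backward` for every compared constant, each with its own step list read from the disassembly, yields the expected input one character at a time.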
Bonus – back-connect code
As a bonus, the author left us some back-connect code, a sort of prize, as it is activated only when the right 2nd parameter (which is the email) is supplied.
Fig. 12: back-connect code
This is how it looks when executed:
So be careful and always use a controlled (to some degree) environment!!!