This RE challenge is a step up from the other three, and rightly so, as it is classified as “Medium” difficulty on HTB. It involves code obfuscation and anti-debugging techniques such as checking debug flags and timing checks.
Running the binary
When we execute the binary, it simply prints two statements and exits. Over a few runs, it sometimes prints nothing at all; I speculate about the reason later in the post.
Dynamic Analysis with x64dbg
When we run it in x64dbg, nothing is printed to the console, unlike when we run the binary normally. Upon observation, we see two TLS callbacks that run before the program enters the actual main function.
This is our first glimpse of an anti-debugging technique. Here is a resource to read more about TLS callbacks: How Malware Defends Itself Using TLS Callback Functions (sans.edu). In summary, Thread Local Storage (TLS) callbacks allow certain code to execute before the program’s entry point is reached. This lets malware run its instructions before the debugger’s usual first break at the entry point. To counter this, debuggers such as x64dbg have a setting to break on TLS callbacks, which is why we can see the two TLS callbacks above. To get another view of this, we load the program in IDA to observe the code.
Static Analysis with IDA
When we load the program in IDA and click on Exports, we see three entries: the two TLS callbacks and the main entry point.
When we look at the main function, it jumps to _mainCRTStartup_0, which is shown below.
Here we see another anti-debugging technique, starting with the line “mov eax, large fs:30h”. This loads the address of the Process Environment Block (PEB), which contains fields that indicate whether the program is being debugged. Two fields are read from the PEB: the first code block reads EAX+2h, while the third block reads EAX+68h. The code roughly reads like this:
- Get a pointer to the TEB (the FS segment points at it, and its self-pointer is at fs:18h)
- Get a pointer to the PEB (located at TEB+30h, which is why the code can read fs:30h directly)
- Check the BeingDebugged flag (located at PEB+2h)
- Check the NtGlobalFlag (located at PEB+68h)
To circumvent this, we can install a plugin that hides the debugger from these checks for us: https://github.com/x64dbg/ScyllaHide. With the relevant options enabled, it sets BeingDebugged and NtGlobalFlag to 0, hence bypassing both checks.
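To make the two PEB checks concrete, here is a minimal Python model of them, together with the effect of zeroing the fields. It is only a sketch: the PEB is faked as a byte buffer, the helper names are made up for illustration, and the 0x70 mask is an assumption based on the heap-debugging bits Windows normally sets in NtGlobalFlag for a process started under a debugger. The binary’s exact comparison is not shown in this post.

```python
import struct

# Offsets into a 32-bit PEB, as read by the disassembly (EAX+2h and EAX+68h).
BEING_DEBUGGED_OFFSET = 0x02  # 1 byte
NT_GLOBAL_FLAG_OFFSET = 0x68  # 4 bytes, little-endian

# Assumption: the usual debugger-related bits in NtGlobalFlag
# (FLG_HEAP_ENABLE_TAIL_CHECK 0x10 | FLG_HEAP_ENABLE_FREE_CHECK 0x20 |
#  FLG_HEAP_VALIDATE_PARAMETERS 0x40). The binary may test something else.
DEBUG_HEAP_FLAGS = 0x70


def make_fake_peb(being_debugged: int, nt_global_flag: int) -> bytearray:
    """Build a zeroed buffer that stands in for the PEB (hypothetical helper)."""
    peb = bytearray(0x100)
    peb[BEING_DEBUGGED_OFFSET] = being_debugged
    struct.pack_into("<I", peb, NT_GLOBAL_FLAG_OFFSET, nt_global_flag)
    return peb


def debugger_detected(peb: bytes) -> bool:
    """Return True if either PEB field suggests a debugger is attached."""
    being_debugged = peb[BEING_DEBUGGED_OFFSET]
    (nt_global_flag,) = struct.unpack_from("<I", peb, NT_GLOBAL_FLAG_OFFSET)
    return being_debugged != 0 or (nt_global_flag & DEBUG_HEAP_FLAGS) != 0


def hide_debugger(peb: bytearray) -> None:
    """Zero both fields, which is the effect described for the plugin."""
    peb[BEING_DEBUGGED_OFFSET] = 0
    struct.pack_into("<I", peb, NT_GLOBAL_FLAG_OFFSET, 0)


peb = make_fake_peb(being_debugged=1, nt_global_flag=0x70)
print(debugger_detected(peb))  # True: both fields look "debugged"
hide_debugger(peb)
print(debugger_detected(peb))  # False: both checks now pass
```

Doing the same thing by hand in the debugger means patching those two fields in the memory dump before the checks run, which is tedious enough that the plugin is the nicer option.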
After defeating the first two anti-debugging techniques, we come across one final technique, which is a timing check. “rdtsc” reads the processor’s time-stamp counter into EDX:EAX, and the code then moves the value in EAX to EBX. It performs a few operations before calling “rdtsc” again, compares the new reading with the previous one, and checks if the difference between the two is greater than 0x3E8, or 1000 in decimal. Note that the counter is measured in ticks that advance at roughly the processor’s clock rate, not in milliseconds, so the threshold is only 1000 ticks, a tiny fraction of a second. Stepping through those instructions in a debugger takes vastly longer than that, so the difference exceeds the threshold and the code jumps to the fail branch.
We can overcome this by changing the value of EAX to 0 during the comparison stage.
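The timing check can be modelled in a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, only the low 32 bits (EAX) of each reading are compared, and the subtraction wraps around like a 32-bit register would. The counter values below are made up.

```python
THRESHOLD = 0x3E8  # 1000 decimal, the constant seen in the comparison


def tsc_delta(first_eax: int, second_eax: int) -> int:
    """Difference between two counter readings, wrapped to 32 bits."""
    return (second_eax - first_eax) & 0xFFFFFFFF


def timing_check_fails(delta: int, threshold: int = THRESHOLD) -> bool:
    """True means the fail branch is taken."""
    return delta > threshold


# A normal run: only a couple of hundred ticks pass between the two reads.
print(timing_check_fails(tsc_delta(0x1000, 0x1000 + 200)))        # False

# Single-stepping: millions of ticks pass between the two reads.
print(timing_check_fails(tsc_delta(0x1000, 0x1000 + 5_000_000)))  # True

# The bypass: overwrite the register with 0 right before the comparison.
patched_delta = 0
print(timing_check_fails(patched_delta))                          # False
```

An alternative with the same effect would be to flip the flag that the conditional jump tests, but zeroing the register is easy to do from the register pane.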
The fail branch is on the right, taken if the measured difference exceeds the threshold, and it is weirdly annotated with “sp-analysis failed”. That message comes from IDA itself, which prints it when it cannot track the stack pointer through a piece of code, so I’m guessing this branch terminates the program by placing some invalid value in ESP. I theorize that this also causes the program to fail periodically when running it normally through the command line: as the VM only has limited CPU resources, it’s possible that on some runs the two readings end up more than 0x3E8 ticks apart, hence failing to print the statements.
Looking at the left side, taken when we bypass the previous anti-debugging technique, the code runs through a NOP sled into a XOR loop with the value 0x5C. We now pivot back to x64dbg to look at the same piece of code.
Static Analysis with x64dbg
The code above is exactly the same as what we saw in IDA, albeit not as fancy. It starts with checking the PEB for the BeingDebugged flag and NtGlobalFlag, followed by the timing analysis, and finally the NOP sled to the XOR loop.
If we look at the XOR loop, it starts at 0x401620 and XORs the values up to the address 0x401791. We add a breakpoint right after the XOR loop has finished to observe what has changed in that address range.
Looking at the difference between the two, it’s clear that the code was obfuscated before, and is de-obfuscated after being XOR-ed with 0x5C. Although not shown here, this also explains why we could not find any of the strings that were printed on the command line before the XOR loop ran.
Observing the de-obfuscated code, we see that it performs the timing check again, testing whether the difference between two counter readings exceeds the threshold. We can circumvent this the same way, by changing the value of EAX to 0.
After we have bypassed the final anti-debugging hurdle, this piece of code is run:
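The in-place decoding can be reproduced offline with a short Python sketch. The assumptions here are mine: the loop works one byte at a time, the end address 0x401791 is treated as exclusive, and the memory contents are a placeholder rather than bytes from the real binary.

```python
XOR_KEY = 0x5C
START_VA = 0x401620
END_VA = 0x401791  # assumption: treated as exclusive


def xor_region(memory: bytearray, base_va: int, start_va: int,
               end_va: int, key: int) -> None:
    """XOR every byte in [start_va, end_va) in place, like the loop does."""
    for va in range(start_va, end_va):
        memory[va - base_va] ^= key


# Model only the obfuscated range; base_va maps addresses to buffer offsets.
memory = bytearray(END_VA - START_VA)

# Placeholder content: pretend this is what the author hid in the region.
plain = b"placeholder code and strings"
memory[:len(plain)] = bytes(b ^ XOR_KEY for b in plain)

print(bytes(memory[:len(plain)]))  # unreadable before the loop runs
xor_region(memory, START_VA, START_VA, END_VA, XOR_KEY)
print(bytes(memory[:len(plain)]))  # readable once every byte is XOR-ed
```

Because XOR with a fixed key is its own inverse, running the same function over the bytes of the file on disk would give the de-obfuscated code without needing the debugger at all.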
It’s another XOR loop, which XORs the value in EAX with 0x4B and then uses stosb to store the resulting byte (AL) at the memory address held in EDI, advancing EDI afterwards; see STOS/STOSB/STOSW/STOSD–Store String (jaist.ac.jp). Again we put a breakpoint at the end of the program and allow the loop to complete its execution, before looking at the string residing at the address pointed to by EDI. We right-click EDI and follow it in the dump to see the flag!
Learning about anti-debugging techniques such as TLS callbacks, PEB checks, and timing analysis was interesting, but I never figured out what the TLS callbacks were actually doing, and it seems like we could let them execute with no side effects. I’m guessing the PEB check and timing analysis could be placed in the TLS callbacks, so that the program would exit before even reaching the main function.
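For readers unfamiliar with stosb, the final loop can be modelled like this in Python. It is a sketch under assumptions: how each encoded byte gets into AL is not shown in the post, so the function simply takes the encoded bytes as input; the direction flag is assumed to be clear, so EDI counts upwards; and the input is a placeholder, not the real flag.

```python
XOR_KEY = 0x4B


def decode_with_stosb(encoded: bytes, key: int = XOR_KEY) -> bytes:
    """Model of the loop: XOR each byte, then store it the way stosb would."""
    dest = bytearray(len(encoded))  # stands in for the buffer EDI points into
    edi = 0                         # offset into dest, playing the role of EDI
    for byte in encoded:
        al = byte ^ key             # xor al, 4Bh
        dest[edi] = al              # stosb: write AL to the address in EDI...
        edi += 1                    # ...then advance EDI by one byte
    return bytes(dest)


# Placeholder data: encode a dummy string, then decode it with the loop model.
dummy = b"not-the-real-flag"
encoded = bytes(b ^ XOR_KEY for b in dummy)
print(decode_with_stosb(encoded))  # b'not-the-real-flag'
```

The main difference from the earlier 0x5C loop is that this one writes to a separate destination buffer instead of patching the bytes in place.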