Now that we know about the PE (portable executable) format, let’s talk about reflective loading and what happens in memory.
First of all, welcome back! I hope you didn’t choke on all the food you ate over Christmas, as I surely did. I wish you all the best for 2021.
Let’s recap. Our goal is to emulate the Windows loader (as simply as we can) in order to load and execute a PE file directly from memory (i.e. achieve reflective loading).
In our previous article, we covered how a PE file is structured when it resides on disk and how we can parse it. In this article, we are going to cover how to load and execute it.
But first let me talk about the sponsor of this article, NORDVPN
Table of contents
- In memoria
- Achieving reflective loading
- Useful Links
In this article, we will be working with SimpleEXE.exe, the same PE file that we analyzed last time.
Right, let’s start. The first step is to find a memory range large enough for our PE file.
“How big?”, you ask, askingly
If you recall my last blog post, which you have most certainly read, we use OptionalHeader->SizeOfImage to know how much memory to allocate for a given PE.
During the executable’s allocation, the loader will attempt to reserve a chunk of memory at address OptionalHeader->ImageBase.
If the address is already occupied or not enough space is available, the loader will allocate at a random address in memory. In that case, relocation will have to be performed when loading the binary or else the execution will fail.
Realistically we’ll allocate the buffer using VirtualAllocEx and work with the pointer returned by the function, so relocations are to be expected.
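To make the two header fields above a bit more concrete, here is a minimal Python sketch that pulls ImageBase and SizeOfImage out of a raw PE file. It assumes a 64-bit (PE32+) binary like SimpleEXE.exe; the function name read_image_layout is made up for this illustration, and the field offsets are the ones documented for the PE32+ optional header.

```python
import struct
from typing import Tuple


def read_image_layout(pe: bytes) -> Tuple[int, int]:
    """Return (ImageBase, SizeOfImage) from a PE32+ file's optional header.

    Illustrative helper (hypothetical name); only PE32+ is handled.
    """
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C of the DOS header) points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", pe, opt)
    if magic != 0x20B:
        raise ValueError("only PE32+ (64-bit) is handled in this sketch")
    # In a PE32+ optional header, ImageBase is a 64-bit value at offset 24
    # and SizeOfImage is a 32-bit value at offset 56.
    (image_base,) = struct.unpack_from("<Q", pe, opt + 24)
    (size_of_image,) = struct.unpack_from("<I", pe, opt + 56)
    return image_base, size_of_image
```

SizeOfImage is the size you would pass to the allocation call, and ImageBase is the address you would ideally like to get back from it.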
Now that we have a buffer to work with, let’s examine how the file will be mapped to memory.
If we refer to our previous diagram, here is a broad overview of what happens:
Remember all that stuff with Relative Virtual Addresses and Real File Offsets? Well, forget all about the file offsets.
In memory, Relative Virtual Addresses are relative to the address the image is loaded at. So, in order to find the Virtual Address (the address in memory) of an object or section, simply add its Relative Virtual Address to OptionalHeader->ImageBase, or to the address of our allocation if the image could not be loaded at ImageBase. Voilà.
Sections ? Easy stuff.
Let’s take an example with our SimpleEXE.exe’s section table.
| Size of raw data | 0x7800 | 0x2800 | 0x1000 | 0x200 | 0x2000 | 0x400 |
As I said above, to get the memory address at which an object has to be loaded, just add the object’s VirtualAddress to the ImageBase.
I know you got it, but let’s show an example for the .text section:
VirtualAddress 0x11000
ImageBase 0x0000000140000000
Memory offset 0x0000000140011000
It is as simple as that. Repeat for as many sections as necessary.
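The mapping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Section class and the map_image function are hypothetical names, the section fields are assumed to have been parsed from the section table beforehand, and the image is built in a local bytearray rather than in a remote process.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Section:
    """Subset of a section header (illustrative, not the full structure)."""
    name: str
    virtual_address: int      # RVA of the section once loaded
    pointer_to_raw_data: int  # offset of the section's bytes in the file
    size_of_raw_data: int     # number of bytes stored in the file


def map_image(file_bytes: bytes, size_of_image: int, size_of_headers: int,
              sections: Iterable[Section]) -> bytearray:
    """Lay the file out as it would appear in memory."""
    image = bytearray(size_of_image)  # zero-filled, like fresh memory
    headers = file_bytes[:size_of_headers]
    image[:len(headers)] = headers
    for s in sections:
        start = s.pointer_to_raw_data
        raw = file_bytes[start:start + s.size_of_raw_data]
        if s.virtual_address + len(raw) > size_of_image:
            raise ValueError(f"section {s.name} does not fit in the image")
        # The section lands at its RVA, not at its file offset.
        image[s.virtual_address:s.virtual_address + len(raw)] = raw
    return image
```

Index 0 of the returned buffer corresponds to the base address, so a byte at index 0x11000 ends up at base + 0x11000 once the buffer is written to the allocation.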
Now that our sections are mapped, let’s focus on the import table. Here is our binary’s IAT:
What is the problem with the address highlighted in red ?
I am sure you guessed it because you’re awesome but let’s explain it anyways.
IAT VA = 0x01F790
Chances are there is nothing at this address in memory.
Let’s open notepad.exe’s memory and take a look at what’s in there:
Ain’t no data at this offset. If we refer to the 64-bit address space layout, these low addresses are usually reserved.
If the binary calls this function during execution, it will receive an ACCESS_VIOLATION error and crash.
If you recall, our ImageBase is 0x140000000, which already indicates what kind of values we should expect.
Let’s run a simple program that prints the address of GetCurrentProcess in KERNEL32.dll:
[+] Address of data 0x7ffe4c251db0 | GetCurrentProcess
Right. The current values in our IAT are clearly wrong.
To fix this, we need the list of all the imported DLLs and the functions’ addresses that our binary needs. We obtain it by inspecting our binary. We can then load libraries remotely with the help of LoadLibraryToProcess and once this is done, we retrieve the modules’ handles by calling EnumProcessModules.
With that, it is just a matter of calling GetProcAddress and patching the binary’s IAT.
This method is not the one you would see in a typical reflective loader. Usually, the proper method would be to copy the executable/library directly into the process and call CreateRemoteThread on a small shellcode built to retrieve the addresses of at least LoadLibrary and GetProcAddress. The shellcode would then remap and patch the binary from within the process, removing the need to do this remotely (which is convenient but extremely suspicious).
In our case we choose the remote method for the sake of simplicity.
After retrieving addresses and patching, our IAT now looks like this:
[+] Address of data 0x7ffe4e0bb7f0 | HeapAlloc
[+] Address of data 0x7ffe4c246350 | HeapFree
[+] Address of data 0x7ffe4c246a50 | GetProcessHeap
[+] Address of data 0x7ffe4c246790 | GetCurrentThread
[+] Address of data 0x7ffe4c266af0 | TerminateThread
Our binary will now be able to call imported functions without trouble.
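For the curious, here is roughly what the patching loop can look like, as a Python sketch working on a mapped 64-bit image held in a bytearray. The names patch_iat, read_cstr and the resolve callback are hypothetical: resolve(dll, symbol) stands in for whatever returns the function’s address in the target process (the GetProcAddress step described above). The layout of the import descriptors and thunks follows the documented PE32+ format.

```python
import struct


def read_cstr(image: bytearray, rva: int) -> str:
    """Read a NUL-terminated ASCII string starting at the given RVA."""
    end = image.index(b"\0", rva)
    return image[rva:end].decode("ascii")


def patch_iat(image: bytearray, import_dir_rva: int, resolve) -> None:
    """Overwrite every IAT slot with the address returned by resolve()."""
    desc = import_dir_rva
    while True:
        # Import descriptor: OriginalFirstThunk, TimeDateStamp,
        # ForwarderChain, Name, FirstThunk (five 32-bit fields, 20 bytes).
        oft, _, _, name_rva, ft = struct.unpack_from("<5I", image, desc)
        if oft == 0 and name_rva == 0 and ft == 0:
            break  # an all-zero descriptor ends the table
        dll = read_cstr(image, name_rva)
        lookup = oft or ft  # some linkers leave OriginalFirstThunk empty
        i = 0
        while True:
            (thunk,) = struct.unpack_from("<Q", image, lookup + 8 * i)
            if thunk == 0:
                break  # a zero thunk ends this DLL's list
            if thunk >> 63:
                symbol = thunk & 0xFFFF  # import by ordinal
            else:
                # RVA of a hint/name entry: skip the 2-byte hint.
                symbol = read_cstr(image, (thunk & 0x7FFFFFFF) + 2)
            struct.pack_into("<Q", image, ft + 8 * i, resolve(dll, symbol))
            i += 1
        desc += 20
```

Passing the resolver in as a callback keeps the parsing separate from the Windows-specific part, so the same loop works whether the addresses come from a local or a remote lookup.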
A picture is worth a thousand words so here is a simple diagram that explains the changes we made to the IAT:
One last thing to fix before execution is the relocation table. As we saw in the last article, the relocation table is a lookup table listing all of the PE file’s offsets requiring patching when the file is loaded at a different address from the one specified in OptionalHeader->ImageBase.
Let’s say that we’re looking for a space to allocate the binary in. We get an allocation at an address different from OptionalHeader->ImageBase:
[+] ImageBase @ 0x0000000140000000
[+] Got alloc at @ 0x000002CDD3550000
Since addresses stored in our binary were computed by assuming that it would be loaded at ImageBase, they will point to the wrong place if the binary has been loaded at a different address. That’s why there is a relocation table: for values to be relocated, duh.
If we take a look at the address referenced by the first entry of the relocation table, we have:
If you look closely, you’ll notice that the value above is based on ImageBase, which would be okay if the binary were loaded at this address. If not, we have to patch it.
To fix it, we’ll use the following formula:
[+] Value to patch 0x0000000140011A70
[+] Formula: value - Imagebase + new_allocation
[+] Formula: 0x140011A70 - 0x140000000 + 0x2CDD3550000
[+] Corrected value 0x000002CDD3561A70
Loop through all the entries in the relocation table, patch them all, and we are done!
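The loop itself can be sketched as follows in Python, again on a mapped image stored in a bytearray. The function name apply_relocations and its parameters are made up for the example. It assumes a 64-bit binary, so only DIR64 entries (type 10) and padding entries (type 0) are handled; anything else raises an error instead of being silently skipped.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, nothing to patch
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address to adjust


def apply_relocations(image: bytearray, reloc_rva: int, reloc_size: int,
                      preferred_base: int, actual_base: int) -> int:
    """Patch every DIR64 entry and return the number of patched values."""
    delta = actual_base - preferred_base
    patched = 0
    pos, end = reloc_rva, reloc_rva + reloc_size
    while pos < end:
        # Block header: RVA of the page, then total size of the block.
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break  # malformed or empty block, stop here
        entries = image[pos + 8:pos + block_size]
        for (entry,) in struct.iter_unpack("<H", entries):
            kind, offset = entry >> 12, entry & 0xFFF
            if kind == IMAGE_REL_BASED_ABSOLUTE:
                continue
            if kind != IMAGE_REL_BASED_DIR64:
                raise ValueError(f"unsupported relocation type {kind}")
            target = page_rva + offset
            (value,) = struct.unpack_from("<Q", image, target)
            # Same formula as above: value - ImageBase + new_allocation.
            new_value = (value + delta) & 0xFFFFFFFFFFFFFFFF
            struct.pack_into("<Q", image, target, new_value)
            patched += 1
        pos += block_size
    return patched
```

With preferred_base set to 0x140000000 and actual_base set to 0x2CDD3550000, a stored value of 0x140011A70 becomes 0x2CDD3561A70, which matches the corrected value computed by hand above.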
Achieving reflective loading
Let’s wrap up. Here are the steps we’ll have to code to achieve our goal:
- Allocate a buffer
- Copy headers to buffer
- Map sections to buffer
- OpenProcess on the target process
- Allocate memory in the process
- Get addresses of imported functions and patch IAT
- Patch relocations table
- Write buffer to process
- Start thread at the entry point of the binary
And after long sessions of debugging, we finally got it:
The SimpleEXE program got executed directly from notepad’s memory… Great success!
Here is the output of the reflector program:
This was a really fun subject to dive into. I have a better understanding (and hope you do too, wink wink) of PE files and how malicious programs implement simple injection techniques.
Now that you understand the basics, you can extrapolate this to understand how other injection techniques work (process hollowing, etc.) or maybe try to guess how packing works and how it could be implemented.
There is a lot of stuff I did not cover here, but I hope you liked these first two articles.
Next up, I’ll either write about packing or maybe I’ll make a detailed article about NTFS and an NTFS parser I developed earlier in 2020.
Don’t forget to like, comment and subscribe!