Now that we know about the PE (portable executable) format, let’s talk about reflective loading and what happens in memory.

Introduction

First of all, welcome back! Hope you didn’t choke on all the food you ate over Christmas, as I surely did. Wish you all the best for 2021.

Let’s recap. Our goal is to emulate the Windows loader (as simply as we can) in order to load and execute a PE file directly from memory (i.e. achieve reflective loading).

In our previous article, we covered how a PE file was structured when it resided on disk and how we could parse it. In this article we are going to cover how to load and execute it.

~~But first let me talk about the sponsor of this article, NORDVPN~~

Table of contents

In memoria
- Memory allocation
- Mapping
Achieving reflective loading
Conclusion
Useful Links

In memoria

In this article, we will be working with SimpleEXE.exe, the same PE file that we analyzed last time.

Memory allocation

Right, let’s start. The first step is to find a memory range large enough for our PE file.

“How big?”, you ask, askingly

If you recall my last blog post, which you have most certainly read, we refer to OptionalHeader->SizeOfImage to know how much memory to allocate for a given PE.

“But, where?”

When loading an executable, the loader first attempts to reserve a chunk of memory at the address OptionalHeader->ImageBase.
If that address is already occupied or not enough space is available there, the loader allocates the image at another address. In that case, relocations have to be performed when loading the binary, or else execution will fail.
Realistically, we’ll allocate the buffer using VirtualAllocEx and work with the pointer returned by the function, so relocations are to be expected.
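To make the two questions above ("how big?" and "where?") a bit more concrete, here is a minimal Python sketch. It is not the loader's actual code: the function names read_image_layout and relocation_delta are made up for this example, and it only handles 64-bit (PE32+) images. It reads ImageBase and SizeOfImage from the optional header and computes the gap between the preferred base and the address we really got.

```python
import struct

PE32_PLUS_MAGIC = 0x20B  # optional header magic for 64-bit images


def read_image_layout(pe: bytes) -> tuple:
    """Return (ImageBase, SizeOfImage) of a PE32+ file held in memory."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C of the DOS header) points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and the 20-byte COFF header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", pe, opt)
    if magic != PE32_PLUS_MAGIC:
        raise ValueError("this sketch only handles PE32+ (64-bit) images")
    (image_base,) = struct.unpack_from("<Q", pe, opt + 24)    # ImageBase
    (size_of_image,) = struct.unpack_from("<I", pe, opt + 56)  # SizeOfImage
    return image_base, size_of_image


def relocation_delta(image_base: int, allocated_base: int) -> int:
    """Zero means we got the preferred address and nothing needs relocating."""
    return allocated_base - image_base
```

With the allocation from the run shown at the end of this article (0x000002CDD33F0000), relocation_delta(0x140000000, 0x2CDD33F0000) gives 0x2CC933F0000, which is the Delta value printed in the log.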

Mapping

Now that we have a buffer to work with, let’s examine how the file will be mapped to memory.
If we refer to our previous diagram, here is a broad overview of what happens:

Remember all that stuff with Relative Virtual Addresses and Real File Offsets? Well, forget all about file offsets.

In memory, Relative Virtual Addresses are relative to the base address the image is loaded at. So, in order to find the Virtual Address (the address in memory) of an object or section, simply add its Relative Virtual Address to that base, which is OptionalHeader->ImageBase when the image sits at its preferred address. Voilà.

Sections

Sections? Easy stuff.
Let’s take an example with our SimpleEXE.exe’s section table.

Section name	.text	.rdata	.idata	.data	.pdata	.reloc
Virtual address	0x11000	0x19000	0x1f000	0x1c000	0x1d000	0x23000
Size of raw data	0x7800	0x2800	0x1000	0x200	0x2000	0x400
Characteristics	RX	R	R	RW	R	R

As I told y’all above, to get the memory address an object has to be loaded at, just add the object’s VirtualAddress to the ImageBase.

I know you got it, but let’s show an example for the .text section:

VirtualAddress               0x11000
ImageBase                    0x0000000140000000

Memory offset                0x0000000140011000

It is as simple as that. Repeat for as many sections as necessary.
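Here is a small Python sketch of that mapping step, under a few assumptions: the Section tuple and the function names are invented for this example, pointer_to_raw_data stands for the PointerToRawData field of each section header, and the image is built in a local buffer where index 0 plays the role of the base address.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    virtual_address: int      # RVA of the section once loaded
    pointer_to_raw_data: int  # offset of the section's bytes in the file
    size_of_raw_data: int     # number of bytes stored in the file


def rva_to_va(base: int, rva: int) -> int:
    """Virtual address = base the image is loaded at + RVA."""
    return base + rva


def map_sections(file_bytes: bytes, size_of_headers: int,
                 size_of_image: int, sections) -> bytearray:
    """Lay the file out the way it will look in memory."""
    image = bytearray(size_of_image)  # zero-filled, like fresh pages
    headers = file_bytes[:size_of_headers]
    image[:len(headers)] = headers
    for s in sections:
        start = s.pointer_to_raw_data
        raw = file_bytes[start:start + s.size_of_raw_data]
        if s.virtual_address + len(raw) > size_of_image:
            raise ValueError(f"section {s.name} does not fit in the image")
        # In the local buffer, the section's RVA is simply its offset.
        image[s.virtual_address:s.virtual_address + len(raw)] = raw
    return image
```

For the .text example above, rva_to_va(0x140000000, 0x11000) returns 0x140011000.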

Import table

Now that our sections are mapped, let’s focus on the import table. Here is our binary’s IAT:

What is the problem with the address highlighted in red?

I am sure you guessed it because you’re awesome but let’s explain it anyways.

IAT[0] VA = 0x01F790

Chances are there is nothing at this address in memory.
Let’s open notepad.exe’s memory and take a look at what’s in there:

Ain’t no data at this address. If we refer to the 64-bit address space layout, these low addresses are usually reserved.
If the binary calls this function during execution, it will receive an ACCESS_VIOLATION error and crash.

If you recall, our ImageBase is 0x140000000, which is already an indicator of what kind of values we are working with.
Let’s run a simple program that prints the address of GetCurrentProcess in KERNEL32.dll:

[+] Address of data                 0x7ffe4c251db0 | GetCurrentProcess

Right. The current values in our IAT are clearly wrong.

To fix this, we need the list of all the imported DLLs and, for each of them, the functions our binary needs. We obtain it by inspecting our binary’s import table. We can then load the libraries remotely with the help of LoadLibraryToProcess and, once this is done, retrieve the modules’ handles by calling EnumProcessModules.
With that, it is just a matter of calling GetProcAddress and patching the binary’s IAT.

This method is not the one you would see in a typical reflective loader. Usually, the proper method would be to copy the executable/library directly into the process and call CreateRemoteThread on a small shellcode built to retrieve the addresses of at least LoadLibrary and GetProcAddress. The shellcode would then remap and patch the binary within the process, removing the need to do this remotely (which is convenient but extremely suspicious).
In our case we chose the remote method for the sake of simplicity.
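The patching itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it works on a 64-bit image that is already mapped in a local buffer (so RVAs can be used as offsets), it only handles imports by name, and resolve is a hypothetical callback standing in for the GetProcAddress lookup. The names patch_iat and read_cstring are made up for this example.

```python
import struct

ORDINAL_FLAG64 = 1 << 63  # set when a function is imported by ordinal


def read_cstring(image, offset: int) -> str:
    """Read a null-terminated ASCII string from the image buffer."""
    end = image.index(b"\0", offset)
    return bytes(image[offset:end]).decode("ascii")


def patch_iat(image: bytearray, iat_rva: int, resolve) -> list:
    """Overwrite each IAT slot with the address returned by resolve(name)."""
    patched = []
    slot = iat_rva
    while True:
        (thunk,) = struct.unpack_from("<Q", image, slot)
        if thunk == 0:  # a null entry terminates the array
            break
        if thunk & ORDINAL_FLAG64:
            raise NotImplementedError("imports by ordinal are not handled here")
        # Before patching, the slot holds the RVA of an IMAGE_IMPORT_BY_NAME
        # structure: a 2-byte hint followed by the function name.
        name = read_cstring(image, thunk + 2)
        struct.pack_into("<Q", image, slot, resolve(name))
        patched.append(name)
        slot += 8
    return patched
```

In the real loader there is one such array per imported DLL, so this loop would run once for each import descriptor, with resolve bound to the matching remote module.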

After retrieving addresses and patching, our IAT now looks like this:

[+] Address of data                 0x7ffe4e0bb7f0 | HeapAlloc
[+] Address of data                 0x7ffe4c246350 | HeapFree
[+] Address of data                 0x7ffe4c246a50 | GetProcessHeap
[+] Address of data                 0x7ffe4c246790 | GetCurrentThread
[+] Address of data                 0x7ffe4c266af0 | TerminateThread

Our binary will now be able to call imported functions without trouble.

A picture is worth a thousand words so here is a simple diagram that explains the changes we made to the IAT:

Relocations

One last thing to fix before execution is the relocation table. As we saw in the last article, the relocation table is a lookup table listing all of the PE file’s offsets requiring patching when the file is loaded at a different address from the one specified in OptionalHeader->ImageBase.

Let’s say that we’re looking for a space to allocate the binary in. We get an allocation at an address different from OptionalHeader->ImageBase:

[+] ImageBase                           @ 0x0000000140000000
[+] Got alloc at                        @ 0x000002CDD3550000

If addresses in our binary are computed assuming it is loaded at ImageBase, accessing them is going to cause trouble once the binary has been loaded at a different address. That’s why there is a relocation table: for values to be relocated, duh.

If we take a look at the value pointed to by the first entry of the relocation table, we have:

If you look closely, you’ll notice that the value above is an absolute address based on ImageBase, which would be okay if the binary were loaded at this address. If not, we have to patch it.

To fix it, we’ll use the following formula:

[+] Value to patch                      0x0000000140011A70

[+] Formula:                            value - ImageBase + new_allocation
[+] Formula:                            0x140011A70 - 0x140000000 + 0x2CDD3550000

[+] Corrected value                     0x000002CDD3561A70

Loop through all the values in the relocation table, patch them all, and we are done!
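That loop could look like the Python sketch below. Take it as an illustration under assumptions, not as the loader's code: it works on a 64-bit image already mapped in a local buffer, it only handles the IMAGE_REL_BASED_DIR64 relocation type (plus the padding type), and the function name apply_relocations is invented for this example. Each block of the table starts with the RVA of a page and the size of the block, followed by 16-bit entries whose top 4 bits are the type and whose low 12 bits are the offset inside the page.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, nothing to patch
IMAGE_REL_BASED_DIR64 = 10    # 64-bit absolute address


def apply_relocations(image: bytearray, reloc_rva: int,
                      reloc_size: int, delta: int) -> int:
    """Add delta to every DIR64 target listed in the relocation table."""
    count = 0
    pos, end = reloc_rva, reloc_rva + reloc_size
    while pos + 8 <= end:
        # Block header: RVA of the page, then total block size in bytes.
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break  # malformed block or end of table
        for i in range((block_size - 8) // 2):
            (entry,) = struct.unpack_from("<H", image, pos + 8 + 2 * i)
            kind, offset = entry >> 12, entry & 0x0FFF
            if kind == IMAGE_REL_BASED_ABSOLUTE:
                continue
            if kind != IMAGE_REL_BASED_DIR64:
                raise NotImplementedError(f"relocation type {kind}")
            target = page_rva + offset
            (value,) = struct.unpack_from("<Q", image, target)
            # Same as: value - ImageBase + new_allocation
            patched = (value + delta) & 0xFFFFFFFFFFFFFFFF
            struct.pack_into("<Q", image, target, patched)
            count += 1
        pos += block_size
    return count
```

Here delta is new_allocation - ImageBase, so with the numbers above a stored value of 0x140011A70 becomes 0x2CDD3561A70.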

Achieving reflective loading

Let’s wrap up. Here are the steps we’ll have to code to achieve our goal:

Allocate a buffer
Copy headers to buffer
Map sections to buffer
OpenProcess on the target process
Allocate memory in the process
Get addresses of imported functions and patch IAT
Patch relocations table
Write buffer to process
Start thread at the entry point of the binary

And after long sessions of debugging, we finally got it:

The SimpleEXE program got executed directly from notepad’s memory… Great success!

Here is the output of the reflector program:

PEReflection.exe

[+] File C:\...\Debug\SimpleEXE.exe opened
[+] File size                           0x0000e400

[+] Mapping file initialized

[+] Copying sections

[+] Mapping sections
  [+] Writing                           .text
  [+] Writing                           .rdata
  [+] Writing                           .data
  [+] Writing                           .pdata
  [+] Writing                           .idata
  [+] Writing                           .msvcjmc
  [+] Writing                           .00cfg
  [+] Writing                           .rsrc
  [+] Writing                           .reloc

[+] Opened process                      0x00000954
[+] Got alloc                           @ 0x000002CDD33F0000
[+] Delta                               0x000002CC933F0000

[+] Handling relocations
[+] Relocations completed

[+] Fixing import table
  [+] IAT                               @ 000002451025C200

[+] Number of dlls                      0x00000003
  [+] Loading KERNEL32.dll
  [+] Loading VCRUNTIME140D.dll
  [+] Loading ucrtbased.dll

[+] Enumerating modules in remote process

  [+] Found in remote process           0x00007FFE4C230000 | KERNEL32.DLL
  [+] Found in remote process           0x00007FFE17F00000 | VCRUNTIME140D.dll
  [+] Found in remote process           0x00007FFE0FCC0000 | ucrtbased.dll
  [+] Patching IAT

[+] Import table fixed

[+] Writing buffer to process
[+] Wrote successfully                  0x24000 Bytes

[+] Creating remote thread
[+] Address of entrypoint               @ 0x000002CDD340105F

[+] All systems go
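
The addresses in this log can be cross-checked with a few lines of arithmetic. The Python snippet below only uses values printed above, plus the ImageBase 0x140000000 seen earlier. Note that the entry point RVA is derived from the log, not read from the file, and the variable names are just for this example.

```python
IMAGE_BASE = 0x0000000140000000  # OptionalHeader->ImageBase
ALLOC_BASE = 0x000002CDD33F0000  # "Got alloc" in the log
ENTRY_VA = 0x000002CDD340105F    # "Address of entrypoint" in the log

# Delta applied to every relocated value.
delta = ALLOC_BASE - IMAGE_BASE

# The thread starts at allocation base + entry point RVA,
# so subtracting the base gives the RVA back.
entry_rva = ENTRY_VA - ALLOC_BASE

print(f"delta     = {delta:#018x}")
print(f"entry RVA = {entry_rva:#x}")
```

The computed delta matches the Delta line of the log, and the entry point lands 0x1105f bytes into the allocation.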

Conclusion

This was a really fun subject to dive into. I have a better understanding (and hope you do too, wink wink) of PE files and of how malicious programs implement simple injection techniques.
Now that you understand the basics, you can extrapolate this to understand how other injection techniques work (process hollowing, etc.) or maybe try to guess how packing works and how it could be implemented.

There is a lot of stuff I did not cover here, but I hope you liked these first two articles.
Next up, I’ll either write about packing or maybe I’ll make a detailed article about NTFS and an NTFS parser I developed earlier in 2020.

~~Don’t forget to like, comment and subscribe!~~

See ya!

Useful Links

https://en.wikipedia.org/wiki/Portable_Executable
https://upload.wikimedia.org/wikipedia/commons/1/1b/Portable_Executable_32_bit_Structure_in_SVG_fixed.svg
https://wiki.osdev.org/PE
https://docs.microsoft.com/en-us/windows/win32/debug/pe-format
http://www.openrce.org/reference_library/files/reference/PE%20Format.pdf
https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/corkami/PE102posterV1.pdf
https://www.aldeid.com/wiki/PE-Portable-executable
https://ntcore.com/?page_id=388