Some days ago in the 0x00sec Discord channel, somebody asked how to
remove a library from memory. That person was trying to inject some code
into a program using LD_PRELOAD to get it executed
automatically. However, after doing its stuff the library stayed in
memory even after deleting the file on disk. In other words, it was
still visible in /proc/PID/maps.
The concept was very interesting, so after a brief discussion in the chat I decided to give it a try, and after some iterations I got it kind of working. It works except for a small issue that I haven’t solved yet.
In this post I will walk you through the progression I followed to achieve this goal. First I will tell you how all this stuff works, then I will show you a real example, and after that we can start iterating until we get our shared library obliterated from memory. I can already tell you that it was an interesting journey in which I learned a lot about dynamic linking.
Let’s start.
LD_PRELOAD
The LD_PRELOAD environment variable allows us to
define a set of libraries that will be loaded before any other library
when executing a program. Why does that matter? Well, dynamic
binaries, in general, resolve symbols on demand (this is known as
lazy binding), and the order in which the libraries are loaded in memory
matters. Let’s explain this step by step.
Dynamic binaries make use of dynamic libraries, usually many of them. These libraries are loaded when the program is executed, and the same library may be loaded by different programs at different times. This roughly means that we do not know at which memory address the library will be located for each binary or, in other words, we need to do some calculations in order to figure out where our function is located. Well, actually that is the task of the dynamic linker/loader. BTW, I’ll use dynamic linker and loader interchangeably in this post, as it really does both tasks.
Anyway, these calculations are not that hard, but if your program uses some hundreds of functions spread across multiple libraries (think, for example, of a web browser), doing all these resolutions when the program starts will make the start-up time longer, and many of those functions will not be required until later, or even never for a specific execution.
Therefore, dynamic binaries resolve symbols, that is, get the real pointers to the functions they need, only when needed. That requires some voodoo magic as well as the help of the dynamic linker or dynamic loader, whatever you prefer to call it.
This dynamic loader does many things, but its main task is to keep track of all the libraries and find the functions when requested. For that, it keeps a list of the libraries needed by each program, and when it has to look up a symbol it goes through all of those libraries in the order they were loaded in memory. That means that if two libraries implement the same function, the first one loaded will be the one used.
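To make the “first one loaded wins” rule concrete, here is a toy model in C. It is not how ld.so is implemented (the real loader uses hash tables and lookup scopes rather than a plain loop), and toy_library and toy_resolve are names made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a loaded library: a name plus the symbols it exports. */
struct toy_library {
    const char *name;
    const char *const *symbols;
    size_t count;
};

/* Returns the first library, in load order, that exports `symbol`,
   or NULL when none of them does. */
static const struct toy_library *
toy_resolve(const struct toy_library *libs, size_t nlibs, const char *symbol)
{
    for (size_t i = 0; i < nlibs; i++)
        for (size_t j = 0; j < libs[i].count; j++)
            if (strcmp(libs[i].symbols[j], symbol) == 0)
                return &libs[i];
    return NULL;
}
```

With a load order of test.so followed by libc.so.6, both exporting getchar, looking up getchar returns test.so, while a symbol only exported by libc.so.6 still resolves to libc.so.6.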
And here is where LD_PRELOAD pops up. It allows us to
put whatever libraries we want first on the list or, in other words,
to hijack any dynamically resolved function used by a program. That’s very
powerful, isn’t it?
There are several legit uses of this variable and also some nefarious ones, such as the development of user space rootkits. You see the point?… if you can change the implementation of any function, you can easily provide a version that hides the information you want… just saying.
Reference Implementation
Enough theory, let’s see this in action. For that, let’s write a simple program that will just write some information in the console, and then will wait for the user to press a key to continue. Something like this:
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
int main () {
printf ("PID: %d\n", (int) getpid()); // getpid() returns a pid_t, not a long
printf ("Starting Main App]\n");
printf ("--------------------------------\n");
getc(stdin);
getchar();
printf ("Press any key to finish...\n");
// getc will fire the symbol resolution... we had removed test.so
// from the process otherwise the program will crash
getc(stdin);
getchar();
printf ("Finishing Main App\n");
return 0;
}
In this example we will make our library hook
getchar. The call to getc (stdin) has a twofold
purpose. On the one hand, it waits for the user to press a key (that
was needed in the early implementation, when the getchar
function was actually completely gone); on the other, it fires a symbol
resolution. We’ll see later why that is important. Other than that, the
program is pretty basic. All those getchar and
getc calls look like nonsense, but please bear with me.
Now, let’s write a very minimal library to hijack
getchar:
#include <stdio.h>
int getchar () {
printf ("DEBUG: getchar() invoked. Do nothing...\n");
return 0;
}
We can compile the library with:
all: test.so
test.so: test.c
	${CC} -fPIC -shared -o test.so test.c
And now
$ LD_PRELOAD=./lib1.so ./hello
PID: 1782079
Starting Main App]
--------------------------------
DEBUG: getchar() invoked... Do nothing
Press any key to finish...
DEBUG: getchar() invoked... Do nothing
Finishing Main App
Now the program will not wait for the user on getchar,
it will just show our message. However, the program will stop in the
getc(stdin) line (you see now why we need that?). If at
that point, when the program is waiting for user input, in a different
terminal we run the following command:
$ cat /proc/1782079/maps
You will see something like this:
7f4ba4861000-7f4ba4862000 r--p 00000000 fd:00 16819591 /tmp/preload/lib1.so
7f4ba4862000-7f4ba4863000 r-xp 00001000 fd:00 16819591 /tmp/preload/lib1.so
7f4ba4863000-7f4ba4864000 r--p 00002000 fd:00 16819591 /tmp/preload/lib1.so
7f4ba4864000-7f4ba4865000 r--p 00002000 fd:00 16819591 /tmp/preload/lib1.so
7f4ba4865000-7f4ba4866000 rw-p 00003000 fd:00 16819591 /tmp/preload/lib1.so
That is our preloaded library in the memory of the main process. This is what we want to get rid of, so that if a program is doing something weird, nothing suspicious pops up when examining its memory map.
Unloading a Dynamic Library
Unfortunately, there is no supported way to unload a library injected using
LD_PRELOAD. In fact, even when you load your library with
dlopen, it is not guaranteed that the library will be
unmapped after calling dlclose. I tried
dlopen (NULL), which should return a handle to the main
program, followed by dlclose, but that didn’t work. So
we’ll have to be more drastic… What about unmapping all the library
memory?
We can easily unmap memory using the munmap system call.
However, we cannot unmap the page that contains the code being
executed. Well, actually we can, but the program crashes right after,
when control returns to a now unmapped memory area.
So, what we have to do is allocate a new memory block (this one anonymous, i.e. not backed by a file, so no file name shows up next to it in the memory map), copy the unmapping code there, and execute the code from that newly allocated area. In this example we will just move a minimal function to unmap the library memory, but you could allocate more memory and move over more code to do whatever you want.
The code to do this can be something like:
void *code, *orig;
long pagesize = getpagesize();
orig = (void *) 0x11223344; // Placeholder, we will calculate this later
code = mmap (NULL, pagesize,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_ANON | MAP_PRIVATE, -1, 0); // fd should be -1 for anonymous mappings
if (code == MAP_FAILED) {
perror ("mmap");
return -1;
}
printf ("DEBUG: Memory allocated at: %p\n", code);
memcpy (code, _delete, pagesize);
int (*new_delete) (long*, int) = (void *) code;
new_delete ((long *)orig, pagesize);
For testing this we need a _delete function. In order to
keep things simple we will write the _delete function in
asm, so we have full control and we don’t have to care about relocations
or fixing GOT entries. That is a bit out of scope for this simple proof
of concept. So, this is our initial _delete.s
containing our _delete function:
.text
.global _delete
_delete:
mov $1, %rax
mov $1, %rdi
lea msg(%rip),%rsi
mov $15, %rdx
syscall
leave
ret
msg:
.asciz "Hello, world!\n"
.section .note.GNU-stack,"",@progbits
Just a simple hello world for the time being. Note the use of
PC-relative addressing to access the msg label (so we can
move the code freely to any memory address) and the
.section at the end to remove some linker warnings.
However, we cannot call the function like that, because the goal of the
_delete function is to unmap memory, and if we just
return, we’ll return to some unmapped address, which will cause a
SEGFAULT. The easiest solution is to use a jmp instead of a
call. Then, when we return from _delete, the
code will behave as if returning from the getchar function,
even though we are in a completely different memory page… We can do this
with some embedded asm:
__asm ( ".intel_syntax noprefix\n"
"mov rdi, %0\n" // RDI = ptr
"mov rsi, %1\n" // RSI = val
"jmp %2\n" // jump to *target
".att_syntax" // Restore AT&T syntax for the code after this point,
// or the rest of the program won't assemble
:
: "r"(orig), "r"(pagesize), "r"(code)
: "rdi", "rsi"
);
printf ("DEBUG: YOU SHOULDN'T SEE THIS MESSAGE!!!!\n");
This is equivalent to the function call we did before, but without
using call and therefore without touching the stack. In
other words, when we start executing our copy of _delete,
the stack is exactly the same as in our getchar
hook, so returning will bring us back to the main program.
Note that, as I added that printf right after the asm block
to check that we do not return to the getchar
hook, I had to re-enable AT&T syntax, as that is the one produced
by gcc and fed to gas.
Unmapping the memory
Let’s write the function to unmap the memory as that will create
quite some problems that we’ll have to solve. In theory we should get
all the pointers from /proc/self/maps associated to the
library and unmap them, but for our simple library we can just figure
out the page that contains our code and then unmap the previous page and
the next three.
We can easily calculate the page containing the code with this code:
orig = (void *)((long) getchar & ~(pagesize - 1));
Basically, we just calculate the address of the page that contains the
getchar function in this library.
Note that this library is very small and all the code fits in a single
4 KB page. A more complicated library may span more pages, and then you
have to properly parse /proc/self/maps.
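The masking trick is easy to check in isolation. The helper below (page_base is a name introduced here) is a small sketch of the same calculation; it assumes the page size is a power of two, which holds for the page sizes Linux uses.

```c
#include <stdint.h>

/* Rounds addr down to the start of the page that contains it.
   pagesize must be a power of two for the mask to be valid. */
static uintptr_t page_base(uintptr_t addr, uintptr_t pagesize)
{
    return addr & ~(pagesize - 1);
}
```

With 4 KB pages, an address such as 0x401139 maps to page 0x401000, and an address that is already page aligned is returned unchanged.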
Now we can write our simplified _delete function like
this:
.text
.global _delete
_delete:
mov $11, %rax # munmap: delete code page
syscall
mov $11, %rax # Delete PHDR page
sub $0x1000, %rdi
syscall
mov $11, %rax # Delete read only data
add $0x2000, %rdi
syscall
mov $11, %rax # This is a second map of the same memory... Related to the GOT
add $0x1000, %rdi
syscall
mov $11, %rax # Delete the data page
add $0x1000, %rdi
syscall
leave
ret
.section .note.GNU-stack,"",@progbits
Yes, we just call munmap 5 times. Note that comments use # because
gas treats ; as a statement separator on x86. Also note that the second
block is 0x1000 before the one we pass, which is the one containing the
code, so we have to subtract 0x1000 and then add
0x2000 to skip the code block that we’ve already
deleted.
Now, if you run the program, you will see something like this:
$ LD_PRELOAD=./lib3.so ./hello
PID: 2037736
Starting Main App]
--------------------------------
And the program will stop, waiting for a key to be pressed. Then, in
another console, we can check the map for the indicated process. We
should still see the library mapped in memory. After pressing enter, the
preloaded getchar gets executed and we can see the debug
messages we added to our library. Something like:
DEBUG: getchar() invoked...
DEBUG: Allocating memory 1 page
DEBUG: Memory allocated at: 0x7fe2c726c000
At this point all our code has been executed and we are waiting for the
user to press a key in the getc(stdin) in the main program,
which means that the memory has already been unmapped. You can verify that by
checking the /proc/PID/maps file: there are no entries associated
with the library, but if you look closely you’ll notice the
memory block where we copied our _delete function.
If you press RETURN again in the main program, it will
crash. What happens is that after the call to getc (which
was already resolved because we used it before calling
getchar) we call getchar again, which was
resolved earlier and therefore points to a memory block that was
unmapped, causing the SEGFAULT.
A primer on dynamic linking
I won’t go into all the details of how dynamic linking works, but you
need to understand the very basics to be able to follow what comes
next. When you execute a dynamic binary, the kernel loads the
program and maps its memory. As part of that process it checks the
.interp section to figure out which dynamic linker to use.
Then it loads the dynamic linker (usually ld.so) the same
way as the original binary, and hands it the original
binary (the version mapped in memory) so it can finish
the preparation for the execution.
The dynamic linker performs several tasks for this preparation, but a
few are especially important for us. The first one is that it loads all
the libraries needed by the binary, the list you get with the
ldd command. But before that, it loads whatever libraries we
have listed in LD_PRELOAD. Here the order is important, as
it dictates which library has priority when the same symbol (a
function name, for example) is defined in several libraries. In our
example, our library is loaded first, so our version of
getchar has priority over the libc
version.
The other important concept is how the dynamic linker gets the actual address of the different functions used. This is not trivial as the same library may be loaded at different addresses in different processes, depending on, for example, which other libraries are used by the program. The ELF file contains information to allow the dynamic linker to determine the whereabouts of each function knowing the base address of each library.
The process involves two main structures: the PLT
(Procedure Linkage Table) and the GOT (Global
Offset Table). The PLT contains multiple entries,
each one of them a so-called trampoline function. By default, whenever
you call a function from a dynamic library, you have to go through the
PLT. The first time the function is invoked, the
PLT entry ends up in the symbol resolver function of the dynamic
loader, which is in charge of finding the symbol through all the loaded
libraries. Once that address is found, the GOT slot used by that PLT entry is patched so the next
time the function is called, instead of invoking the resolver, it invokes
the actual function. Yes, all function invocations in a dynamic binary
have an extra indirection.
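The patch-on-first-call behaviour can be modelled in a few lines of plain C. This is a toy, not the real PLT machinery: got_slot, lazy_stub and call_through_plt are invented names, and the "resolver" simply knows the answer instead of searching any libraries.

```c
typedef int (*fn_t)(void);

static int resolver_calls = 0;

/* Stands in for the real function living in some shared library. */
static int real_function(void) { return 'x'; }

static int lazy_stub(void);

/* The "GOT slot": initially it points at the resolver stub. */
static fn_t got_slot = lazy_stub;

/* The "resolver": finds the real function, patches the slot, then calls it. */
static int lazy_stub(void)
{
    resolver_calls++;
    got_slot = real_function;
    return got_slot();
}

/* The "PLT entry": every call is an indirect jump through the slot. */
static int call_through_plt(void)
{
    return got_slot();
}
```

The first call goes through lazy_stub and bumps resolver_calls to 1; later calls jump straight to real_function and the counter stays at 1. If the memory behind the patched slot disappears, every later call through the slot lands on unmapped memory.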
In our program, when we call getchar for the second
time, its entry was already patched to point to a function at
an address that no longer exists, effectively SEGFAULTing the process.
What we have to do is patch this entry so the right code gets
executed once we have done our stuff.
Getting Ready to Patch .PLT.GOT
In order to be able to patch the .plt.got entry
associated to our hook function to avoid further crashes of the main
program we have to do a few things.
- First we have to find out the base memory address of the process. Nowadays systems use ASLR, so every time the program is executed, it and its libraries are loaded at different addresses.
- Then we have to get a candidate function to use instead of our getchar. As we are patching typical libc functions, we just have to choose the actual libc function, the one the program would be using if no LD_PRELOAD library were injected.
- Finally, we have to find the DT_PLTGOT dynamic value to determine the exact position of the .plt.got section in memory so we can patch it.
Let’s go step by step. The easiest way to get the current process
base memory address is scanning the /proc/self/maps file
for the main application filename. So, we first get the application
filename:
char buffer[1024];
char *binary_name;
FILE *f;
char *aux;
if ((f = fopen ("/proc/self/cmdline", "rt")) == NULL) {
perror ("fopen:");
return -1;
}
fgets (buffer, 1024, f);
fclose (f);
aux = buffer + strlen(buffer);
while (*aux != '/' && aux > buffer) aux--;
binary_name = strdup (*aux == '/' ? aux + 1 : aux);
printf ("DEBUG: Process %d '%s'\n", (int) getpid(), binary_name);
This code gets the program name from
/proc/self/cmdline and cleans it up so we can easily find the
memory blocks we are interested in. Nothing really special here. Now
that we have the binary name, let’s get the base address by parsing
/proc/self/maps.
unsigned char *prog;
long mem_start;
// Get Process base address
if ((f = fopen ("/proc/self/maps", "rt")) == NULL) {
perror ("fopen:");
return -1;
}
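// The memory map is sorted by address, so the first match is the lowest one
int found = 0;
while (fgets (buffer, sizeof (buffer), f) != NULL) {
if (strstr (buffer, binary_name) != NULL) { found = 1; break; }
}
fclose (f);
if (!found) {
printf ("Binary not found in memory map\n");
return -1;
}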
if ((aux = strchr (buffer, '-')) == NULL) {
printf ("Malformed memory map entry\n");
return -1;
}
*aux=0;
sscanf (buffer, "%lx", &mem_start);
printf ("DEBUG: Process mapped at: 0x%lx \n", mem_start);
// Got the ELF Mapped segment
prog = (unsigned char *)mem_start;
Also pretty straightforward. The first two hex numbers, separated by
a hyphen (-), are the segment’s starting and ending addresses.
Entries are always shown in ascending order, so the first match is the
lowest address, which holds the start of the file: the ELF header and the
program headers (PHDR) that will give us what we need to get to the
information we want.
Note that I’m assuming each block is one page long, but in reality you should get the start and end addresses and calculate the size of the block.
Note that we should use a loop similar to the one above to store the
starting address and size of all the blocks matching the library name
and then pass them to the _delete function to make it work
in all cases, but I haven’t written that code.
Getting the pointer to .plt.got
Now we have to get the value of DT_PLTGOT. This is stored
in the dynamic section of the binary and contains the offset, from the
base address, of the table that holds the pointer to our hook
function which, eventually, we want to patch.
Using a few helper functions, the code is not that complicated (the helper functions are the ones a bit tedious to write).
size_t d_size = 0;
Elf64_Dyn *dyn = get_dynsection(prog, &d_size); // Get Dynamic section and size
long got1 = (long) get_pltgot (dyn, d_size); // Gets DT_PLTGOT
printf ("DEBUG: PLTGOT located at: 0x%lx + 0x%lx = 0x%lx\n",
mem_start, got1, mem_start+got1);
// Get getchar function's GOT entry
long *hook_got = get_got_ptr (prog, dyn, d_size, (Elf64_Ehdr *)prog, "getchar");
printf ("DEBUG: getchar_got entry : %p\n", (void *) hook_got);
The first function, get_dynsection, returns a pointer to
the dynamic section of the program in memory and the number of items in
that section (using the output parameter d_size). This is the function
code:
// Returns a pointer to the section and the number of entries in size
Elf64_Dyn*
get_dynsection (void *prg, size_t *size) {
Elf64_Ehdr *ehdr = (Elf64_Ehdr *)prg;
/* Program header table */
Elf64_Phdr *phdr = (Elf64_Phdr *)((uint8_t *)prg + ehdr->e_phoff);
Elf64_Dyn *dyn = NULL;
size_t dyn_cnt = 0;
/* Locate PT_DYNAMIC segment */
for (int i = 0; i < ehdr->e_phnum; i++) {
if (phdr[i].p_type == PT_DYNAMIC) {
dyn = (Elf64_Dyn *)((uint8_t *)prg + phdr[i].p_offset);
dyn_cnt = phdr[i].p_filesz / sizeof(Elf64_Dyn);
break;
}
}
if (!dyn) {
fprintf(stderr, "No PT_DYNAMIC segment found.\n");
return NULL;
}
*size = dyn_cnt;
return dyn;
}
It just goes through all the program headers looking for a segment of
type PT_DYNAMIC. Once found, it returns a pointer to that
segment’s contents as well as the number of items it contains. You can see
the contents of this segment using readelf -d.
Next we need to extract the PLTGOT entry value, using
the get_pltgot function that you can see below:
void*
get_pltgot (Elf64_Dyn *dyn, size_t dyn_cnt) {
/* Search for DT_PLTGOT entry */
for (size_t i = 0; i < dyn_cnt; i++) {
if (dyn[i].d_tag == DT_PLTGOT) {
/* DT_PLTGOT is usually an address in memory space, not an offset */
void *pltgot = (void *)dyn[i].d_un.d_ptr;
return pltgot;
}
}
fprintf(stderr, "DT_PLTGOT not found.\n");
return NULL;
}
The function just goes through the dynamic section looking for an
item with the tag DT_PLTGOT and returns its
value. Pretty simple.
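The same loop works for any tag, so a slightly more general helper is easy to sketch. find_dyn_value is a name introduced here, not part of the original code; compared with get_pltgot it also stops at the DT_NULL terminator and reports "not found" separately from the value, so a legitimate value of zero is not confused with a missing tag.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Looks for `tag` in a dynamic table of at most `count` entries, stopping
   early at the DT_NULL terminator. Returns 1 and stores the raw d_un value
   in *value when found, 0 otherwise. */
static int find_dyn_value(const Elf64_Dyn *dyn, size_t count,
                          Elf64_Sxword tag, uint64_t *value)
{
    for (size_t i = 0; i < count && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == tag) {
            *value = dyn[i].d_un.d_val;
            return 1;
        }
    }
    return 0;
}
```

Calling it with DT_PLTGOT gives the same value get_pltgot returns, and the same helper could serve DT_SYMTAB, DT_STRTAB or DT_JMPREL.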
Now that we have the .plt.got table, we just need to
find the entry associated with the function we have overridden in our
preload library (getchar in our example). This is done by
the function get_got_ptr that you can see below (I asked
ChatGPT to write this one for me :) ):
void *
get_got_ptr (void *base, Elf64_Dyn *dyn, int dyn_count, Elf64_Ehdr *ehdr, const char *name)
{
/* Extract dynamic section pointers */
Elf64_Sym *symtab = NULL;
char *strtab = NULL;
Elf64_Rela *jmprel = NULL;
size_t pltrelsz = 0;
int pltrel_type = 0;
for (size_t i = 0; i < dyn_count; i++) {
switch (dyn[i].d_tag) {
case DT_SYMTAB: symtab = (Elf64_Sym *)(base + dyn[i].d_un.d_ptr); break;
case DT_STRTAB: strtab = (char *)(base + dyn[i].d_un.d_ptr); break;
case DT_JMPREL: jmprel = (Elf64_Rela *)(base + dyn[i].d_un.d_ptr); break;
case DT_PLTRELSZ: pltrelsz = dyn[i].d_un.d_val; break;
case DT_PLTREL: pltrel_type = dyn[i].d_un.d_val; break;
}
}
if (!symtab || !strtab || !jmprel || pltrel_type != DT_RELA)
return NULL;
size_t count = pltrelsz / sizeof(Elf64_Rela);
/* Iterate PLT relocations */
for (size_t i = 0; i < count; i++) {
Elf64_Rela *r = &jmprel[i];
unsigned sym_index = ELF64_R_SYM(r->r_info);
const char *sym_name = &strtab[symtab[sym_index].st_name];
if (strcmp(sym_name, name) == 0) {
/* GOT entry location */
void *got_entry_address = (uint8_t *)base + r->r_offset;
return got_entry_address;
}
}
return NULL;
}
Basically, the code goes through the PLT relocation table, extracting
the symbol each entry is associated with and checking the name of that symbol
against the string we pass. The value returned is the base address plus what you will see in the
offset column of the readelf -r output.
$ readelf -r hello

Relocation section '.rela.dyn' at offset 0x628 contains 10 entries:
  Offset       Info         Type              Sym. Value       Sym. Name + Addend
000000003dd0 000000000008 R_X86_64_RELATIVE                  1180
000000003dd8 000000000008 R_X86_64_RELATIVE                  1140
000000004038 000000000008 R_X86_64_RELATIVE                  4038
000000003fc0 000100000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.34 + 0
000000003fc8 000200000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_deregisterTM[...] + 0
000000003fd0 000700000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000003fd8 000900000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_registerTMCl[...] + 0
000000003fe0 000c00000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize@GLIBC_2.2.5 + 0
000000004040 000b00000005 R_X86_64_COPY     0000000000004040 stdout@GLIBC_2.2.5 + 0
000000004050 000d00000005 R_X86_64_COPY     0000000000004050 stdin@GLIBC_2.2.5 + 0

Relocation section '.rela.plt' at offset 0x718 contains 6 entries:
  Offset       Info         Type              Sym. Value       Sym. Name + Addend
000000004000 000300000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
000000004008 000400000007 R_X86_64_JUMP_SLO 0000000000000000 getpid@GLIBC_2.2.5 + 0
000000004010 000500000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
000000004018 000600000007 R_X86_64_JUMP_SLO 0000000000000000 getchar@GLIBC_2.2.5 + 0 <========
000000004020 000800000007 R_X86_64_JUMP_SLO 0000000000000000 fflush@GLIBC_2.2.5 + 0
000000004028 000a00000007 R_X86_64_JUMP_SLO 0000000000000000 getc@GLIBC_2.2.5 + 0
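The Info column in that output packs two values: the symbol table index in the upper 32 bits and the relocation type in the lower 32 bits. The sketch below (decode_r_info and struct rela_info are helper names made up here) uses the macros from elf.h, the same ELF64_R_SYM that get_got_ptr relies on, to split it.

```c
#include <elf.h>
#include <stdint.h>

/* The two halves of an Elf64_Rela r_info field. */
struct rela_info {
    uint32_t sym_index;   /* index into the dynamic symbol table */
    uint32_t type;        /* relocation type, e.g. R_X86_64_JUMP_SLOT */
};

static struct rela_info decode_r_info(uint64_t r_info)
{
    struct rela_info out;

    out.sym_index = (uint32_t) ELF64_R_SYM(r_info);
    out.type = (uint32_t) ELF64_R_TYPE(r_info);
    return out;
}
```

For the getchar line, Info 000600000007 decodes to symbol index 6 and type 7, which elf.h names R_X86_64_JUMP_SLOT; readelf truncates that name to R_X86_64_JUMP_SLO in its output.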
With all these elements, the code we showed at the beginning gives us the absolute address of the GOT entry associated with the hook function we used in our preload library. I reproduce the code here again for your convenience:
size_t d_size = 0;
Elf64_Dyn *dyn = get_dynsection(prog, &d_size);
long got1 = (long) get_pltgot (dyn, d_size);
printf ("DEBUG: PLTGOT located at: 0x%lx + 0x%lx = 0x%lx\n",
mem_start, got1, mem_start+got1);
// Get HOOK function's GOT entry
long *hook_got = get_got_ptr (prog, dyn, d_size, (Elf64_Ehdr *)prog, "getchar");
printf ("DEBUG: getchar_got entry : %p\n", (void *) hook_got);
Now we just have to find the libc implementation of
getchar and patch hook_got.
int (*original_hook)();
original_hook = dlsym(RTLD_NEXT, "getchar");
printf ("DEBUG: Original %s: %p\n", "getchar", original_hook);
*(long*)(hook_got) = (long)original_hook;
Now we need to link our library against the dl library
(and define _GNU_SOURCE before including dlfcn.h) in order to use
dlsym with RTLD_NEXT to find the next implementation of the
function. If at this point you recompile everything and
run the program again, you will see that the program now executes
normally, but it crashes when finishing.
The final Crash
The program crashes at the end because, when the program ends, some
extra code from the C runtime (those crtX.o files) gets
executed after the main function returns. One of the things
that code does is run the destructors for all the modules (libraries
and the main program), so when it tries to access our library to
figure out whether there is a destructor to run, it ends up reading unmapped
memory. Those pointers (the destructor table) are populated by the
dynamic loader at the beginning of the execution, long before our code
gets executed.
Actually, if you call any other function that has not been resolved yet after getchar,
the program will also crash, because our unmapped library is still in
the search list of the dynamic linker, so the linker will try to use it to
resolve the symbol and, while trying to check the details of our
library, the program will crash.
Fixing this is quite tricky because the search list is an internal
dynamic loader structure, not intended to be used from normal programs, as
it may change at any time. That is what being internal usually means.
Actually, this structure depends on the libc version you are using. For
this experiment I was using glibc 2.36, and the relevant structure we are
talking about, struct link_map, can be found at:
https://elixir.bootlin.com/glibc/glibc-2.36/source/include/link.h#L95
This struct is accessible through the second entry of the table DT_PLTGOT points to, and
has a lot of fields. I haven’t managed to get access to the
l_scope field that is supposed to contain the list of
libraries used when resolving dynamic symbols. Roughly, we have to access
that list and remove our library from it, but even after recompiling that
specific version of glibc I didn’t manage to figure out the offset of
that field. However, we can apply a very simple fix to at least allow the
program to execute normally and just get an error when the
application finishes, instead of a crash.
The link_map structure has a field named
l_ns that represents the library namespace. You can control
this using the dlmopen function, but we will play with it
directly. If we set this field to 0 (NULL) for the entry
associated with our library in the link_map list, the library won’t be
used for symbol resolution anymore. In all honesty, I haven’t dived deep
enough into the dynamic linker code to fully understand why this happens.
The dynamic linker code is difficult and huge, but for the proof of
concept, this hack may do the trick.
The first thing we have to do is get access to the
link_map structure, which is the second entry in the table
DT_PLTGOT points to. We already got the offset of that table, so we can easily get
to the link_map with the following code.
long *got = (long*)(mem_start + got1);
struct link_map *p = (struct link_map*) got[1];
printf ("DEBUG: GOT: %p -> GOT[1] -> %lx\n", (void *) got, got[1]);
And now that we have our link_map, we just need to look
for our library and change its namespace:
while (p) {
printf ("DEBUG: => '%s' %p", p->l_name, p->l_ld);
if (strstr (p->l_name, "test")) {
// When found, patch libname so the library won't be used for
// symbol resolution
printf (" **PATCHED**");
// TODO: We should actually patch l_scope in link_map
*((long*)((unsigned char *)p + 40)) = 0; // Patch l_ns
}
printf ("\n");
p = p->l_next;
}
Yes, link_map is a linked list in which each element
represents one of the objects loaded into the process.
As I said before, this structure is internal, but the very first
fields are accessible using a simplified version of the struct defined in
link.h, and it is OK to use them. Fields like
l_name or l_next are public. However, the field
l_ns is not defined in that simplified struct (as it may
change in the next libc version), and that is why I use offset
40 to access the l_ns field for glibc 2.36.
Note how this is pretty much a hack and needs to be adjusted for
different libc versions.
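As a quick sanity check on those offsets, this sketch (helper names are invented here) asks the compiler where the public part of struct link_map ends. On a 64-bit system the five public fields take 40 bytes, so offset 40 is the first private field. Which field that is depends on the glibc version: in the 2.36 header linked above, the first private members appear to be l_real followed by l_ns, so treat the field name attached to offset 40 as an assumption worth double-checking against your own libc.

```c
#include <link.h>
#include <stddef.h>

/* Size of the public struct link_map from <link.h>, which is also the
   offset of the first private field in the loader's internal struct. */
static size_t public_link_map_size(void)
{
    return sizeof(struct link_map);
}

/* Offset of l_next, the field followed when walking the list by hand. */
static size_t l_next_offset(void)
{
    return offsetof(struct link_map, l_next);
}
```

On x86-64 these return 40 and 24 respectively; any hard-coded offset beyond the public part has to be re-derived for every glibc release.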
Now your program will run to the end without issues, but it will show the following error when finishing:
Inconsistency detected by ld.so: dl-fini.c: 87: _dl_fini: Assertion `ns != LM_ID_BASE || i == nloaded' failed!
That’s it. If somebody manages to get rid of this last message, please let me know.
Colophon
This experiment started as an informal discussion and ended up being an amazing journey into the guts of the dynamic loader and how it works. Every problem I faced opened a completely new rabbit hole to dig into, and I found all kinds of wonders hidden inside libc and the dynamic linker. I really enjoyed it!
You can find the source code in my github:
https://github.com/0x00pf/preload
■
