How to unload a LD_PRELOAD library the hard way
2025-11-25
By pico

A few days ago in the 0x00sec Discord channel, somebody asked how to remove a library from memory. That person was using LD_PRELOAD to inject a library into a program so that some code would run automatically. However, after that code had done its job, the library stayed in memory even after the file was deleted from disk. In other words, it was still visible in /proc/PID/maps.

The concept was very interesting, so after a brief discussion in the chat I decided to give it a try, and after some iterations I got it kind of working: it works except for a small issue that I haven’t solved yet.

In this post I will walk you through the progression I followed to achieve this goal. First I will tell you how all this stuff works, then I will show you a real example, and after that we will iterate until we get our shared library obliterated from memory. I can already tell you that it was an interesting journey in which I learned a lot about dynamic linking.

Let’s start.

LD_PRELOAD

The LD_PRELOAD environment variable allows us to define a set of libraries that will be loaded before any other library when executing a program. What’s the big deal with that? Well, dynamic binaries, in general, resolve symbols on demand (this is known as lazy binding), and the order in which the libraries are loaded in memory matters. Let’s explain this step by step.

Dynamic binaries make use of dynamic libraries, usually many of them. These libraries are loaded when the program is executed, and the same library may be loaded by different programs at different times. This roughly means that we do not know at which memory address the library will be located for each binary, or in other words, we need to do some calculations in order to figure out where our function is located. Well, actually that is the dynamic linker/loader’s task. BTW, I’ll use dynamic linker and loader interchangeably in this post, as it really does both tasks.

Anyway, these calculations are not that hard, but if your program uses some hundreds of functions spread through multiple libraries (think, for example, of a web browser), doing all these resolutions when the program starts will make the start-up time longer, and many of those functions will not be required until later, or even never for a specific execution.

Therefore, dynamic binaries resolve symbols, that is, get the real pointers to the functions they need, only when needed. That requires some voodoo magic as well as the help of the dynamic linker or dynamic loader, whatever you prefer to call it.

This dynamic loader does many things, but its main task is to keep track of all the libraries and find the functions when requested. For that, it keeps a list of the libraries needed by each program, and when it has to look for a symbol it goes through all of those libraries in the order they were loaded in memory. That means that if two libraries implement the same function, the first one loaded will be the one used.

And here is where LD_PRELOAD pops up. It allows us to put whatever libraries we want first on the list, or in other words, it allows us to hijack any dynamically resolved function used by a program. That’s very powerful, isn’t it?
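The "first library on the list wins" rule can be sketched with a toy model. This is not real ld.so code: the toy_lib struct, toy_resolve and the contents of toy_search_list are made up for illustration, and the real loader uses hash tables, symbol versions and scopes on top of this basic idea.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of the loader's search list: NOT real ld.so code. */
struct toy_lib {
    const char *name;        /* library name */
    const char *symbols[4];  /* exported symbol names, NULL-terminated */
};

/* Return the name of the first library, in load order, that exports `sym`. */
const char *toy_resolve(const struct toy_lib *libs, size_t nlibs, const char *sym) {
    for (size_t i = 0; i < nlibs; i++)
        for (size_t j = 0; libs[i].symbols[j] != NULL; j++)
            if (strcmp(libs[i].symbols[j], sym) == 0)
                return libs[i].name;
    return NULL; /* symbol not found in any library */
}

/* Hypothetical search list with a preloaded library placed first. */
const struct toy_lib toy_search_list[] = {
    { "test.so",   { "getchar", NULL } },
    { "libc.so.6", { "getchar", "getc", "printf", NULL } },
};
```

With this list, looking up getchar finds the preloaded library first, while getc still comes from libc because the preloaded library does not define it.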

There are several legitimate uses of this variable and also some nefarious ones, for example the development of user-space rootkits. You see the point? If you can change the implementation of any function, you can easily provide a version that hides the information you want… just saying.

Reference Implementation

Enough theory, let’s see this in action. For that, let’s write a simple program that will just write some information in the console, and then will wait for the user to press a key to continue. Something like this:

#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>

int main () {
    printf ("PID: %ld\n", (long) getpid());   // getpid() returns pid_t, so cast for %ld
    printf ("Starting Main App]\n");
    printf ("--------------------------------\n");
    getc(stdin);
    getchar();

    printf ("Press any key to finish...\n");
    // getc will fire the symbol resolution... test.so must have been removed
    // from the process properly by now, otherwise the program will crash
    getc(stdin);
    getchar();
    printf ("Finishing Main App\n");
    return 0;
}

In this example we will make our library hook getchar. The call to getc (stdin) has a twofold purpose. On the one hand, it waits for the user to press a key (that was needed in the early implementation, when the getchar function was actually completely gone), and on the other hand it fires a symbol resolution. We’ll see later why that is important. Other than that, the program is pretty basic. All those getchar and getc calls look like nonsense, but please bear with me.

Now, let’s write a very minimal library to hijack getchar:

#include <stdio.h>

int getchar () {
    printf ("DEBUG: getchar() invoked... Do nothing\n");
    return 0;
}

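We can compile the library with this Makefile:

all: test.so

test.so: test.c
	${CC} -fPIC -shared -o test.so test.c

And now we run the program with the library preloaded (in my tests the library for this first iteration is named lib1.so):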

$ LD_PRELOAD=./lib1.so ./hello
PID: 1782079
Starting Main App]
--------------------------------
DEBUG: getchar() invoked... Do nothing
Press any key to finish...

DEBUG: getchar() invoked... Do nothing
Finishing Main App

Now the program will not wait for the user on getchar; it will just show our message. However, the program will stop at the getc(stdin) line (you see now why we need that?). If at that point, while the program is waiting for user input, we run the following command in a different terminal:

$ cat /proc/1782079/maps

You will see something like this:

7f4ba4861000-7f4ba4862000 r--p 00000000 fd:00 16819591                   /tmp/preload/lib1.so
7f4ba4862000-7f4ba4863000 r-xp 00001000 fd:00 16819591                   /tmp/preload/lib1.so
7f4ba4863000-7f4ba4864000 r--p 00002000 fd:00 16819591                   /tmp/preload/lib1.so
7f4ba4864000-7f4ba4865000 r--p 00002000 fd:00 16819591                   /tmp/preload/lib1.so
7f4ba4865000-7f4ba4866000 rw-p 00003000 fd:00 16819591                   /tmp/preload/lib1.so

That is our preloaded library in the memory of the main process. This is what we want to get rid of, so that if a program is doing something weird, nothing suspicious pops up when its memory map is examined.

Unloading a Dynamic Library

Unfortunately, there is no supported way to unload a library injected using LD_PRELOAD. In fact, even when you load your library with dlopen, it is not guaranteed that the library will be unmapped after calling dlclose. I tried dlopen (NULL), which returns a handle for the main program, followed by dlclose, but that didn’t work. So we’ll have to be more drastic… What about unmapping all the library memory?

We can easily unmap memory using the munmap system call. However, we cannot unmap the page that contains the code being executed. Well, actually we can, but the program crashes right after, when control returns to a now unmapped memory area.

So, what we have to do is allocate a new memory block (an anonymous one, i.e. not backed by a file, so no file name will be shown for it in the memory map), copy the unmapping code there and execute the code in that newly allocated area. In this example we will just move a minimal function that unmaps the library memory, but you could allocate more memory and move over more code to do whatever you want.

The code to do this can be something like:

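  void    *code, *orig;
  long     pagesize = getpagesize();
  orig = (void *) 0x11223344;  // Placeholder: we will calculate this later
  // Anonymous mapping: no file behind it, so fd is -1
  code = mmap (NULL, pagesize,
           PROT_READ | PROT_WRITE | PROT_EXEC,
           MAP_ANON | MAP_PRIVATE, -1, 0);
  if (code == MAP_FAILED) {
    perror ("mmap");
    return -1;
  }
  printf ("DEBUG: Memory allocated at: %p\n", code);
  // Copy a whole page starting at _delete (the function itself is much smaller)
  memcpy (code, (void *) _delete, pagesize);
  int (*new_delete) (long*, int) = (void *) code;
  new_delete ((long *)orig, pagesize);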

To test this we need a _delete function. In order to keep things simple we will write the _delete function in asm, so we have full control and we don’t have to care about relocations or fixing GOT entries. That is a bit out of scope for this simple proof of concept. So, this is our initial _delete.s containing our _delete function:

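    .text
    .global _delete
_delete:

    mov $1, %rax            # SYS_write
    mov $1, %rdi            # fd 1 (stdout)
    lea    msg(%rip),%rsi   # PC-relative address of the message
    mov $14, %rdx           # length of "Hello, world!\n" without the NUL
    syscall

    leave                   # undo getchar's stack frame (assumes it uses %rbp)
    ret                     # return straight to getchar's caller
    
msg:
    .asciz "Hello, world!\n"
    
.section .note.GNU-stack,"",@progbits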

Just a simple hello world for the time being. Note the use of PC-relative addressing to access the msg label (so we can move the code freely to any memory address) and the .section at the end to silence some linker warnings.

However, we cannot call the function like that, because the goal of the _delete function is to unmap memory, and if we just return we’ll return to some unmapped address, which will cause a SEGFAULT. The easiest solution is to use a jmp instead of a call. Then, when we return from _delete, the code will behave as if returning from the getchar function, even though we are in a completely different memory page… We can do this with some inline asm:

  __asm ( ".intel_syntax noprefix\n"
      "mov rdi, %0\n"   // RDI = address of the page to unmap
      "mov rsi, %1\n"   // RSI = length
      "jmp %2\n"         // jump to the copied code
      ".att_syntax"      // Restore AT&T syntax for the code after this point
                         // or the rest of the program won't assemble
      :
      : "r"(orig), "r"(pagesize), "r"(code)
      : "rdi", "rsi"
      );
  printf ("DEBUG: YOU SHOULDN'T SEE THIS MESSAGE!!!!\n");

This is equivalent to the function call we did before, but without using call and therefore without touching the stack. In other words, when we start executing our copy of _delete, the stack is exactly the same as in the main getchar function, so returning will bring us back to the main program.

Note that, since I added that printf after the asm block to check that we do not return to the getchar function, I had to re-enable AT&T syntax, as that is the one produced by gcc and fed to gas.

Unmapping the memory

Let’s write the function to unmap the memory, as that will create quite a few problems that we’ll have to solve. In theory we should get all the ranges associated with the library from /proc/self/maps and unmap them, but for our simple library we can just figure out the page that contains our code and then unmap the previous page and the next three.

We can easily calculate the page containing the code with this code:

  orig = (void *)((long) getchar & ~(pagesize - 1));

Basically, we just calculate the page address that contains the getchar function in this library.
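The masking trick is easier to see in isolation. A minimal sketch, assuming the page size is a power of two (4096 on typical x86-64 Linux); page_base is a helper name made up for this example:

```c
#include <stdint.h>

/* Round an address down to the start of the page that contains it.
   Only valid when page_size is a power of two. */
uintptr_t page_base(uintptr_t addr, uintptr_t page_size) {
    return addr & ~(page_size - 1);
}
```

With 4096-byte pages the mask clears the low 12 bits, so a hypothetical address such as 0x7f4ba4862139 is rounded down to 0x7f4ba4862000.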

Note that this library is very small and all the code fits in one 4 KiB page. A more complicated library may span more pages, and then you have to properly parse /proc/self/maps.

Now we can write our simplified _delete function like this:

    .text
    .global _delete
_delete:
    mov $11, %rax      # SYS_munmap: delete code page (rdi = address, rsi = length)
    syscall

    mov $11, %rax      # Delete PHDR page
    sub $0x1000, %rdi
    syscall

    mov $11, %rax      # Delete read only data
    add $0x2000, %rdi
    syscall

    mov $11, %rax      # This is a second map of the same memory... Related to the GOT
    add $0x1000, %rdi
    syscall

    mov $11, %rax     # Delete the data page
    add $0x1000, %rdi
    syscall

    leave
    ret

.section .note.GNU-stack,"",@progbits

Yes, we just call munmap 5 times. Note that the second block is 0x1000 below the one we pass (the one containing the code), so we have to subtract 0x1000 and then add 0x2000 to skip the code block that we’ve already deleted.
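To double check the address arithmetic, here is the same sequence of sub/add steps written in C. It is only a sketch of the bookkeeping (it does not unmap anything); delete_order is a made-up helper, and it assumes 4 KiB pages and the five-mapping layout shown in the memory map earlier.

```c
#include <stdint.h>

/* Fill out[0..4] with the page addresses in the order _delete unmaps them,
   given the address of the page that contains the code. */
void delete_order(uintptr_t code_page, uintptr_t out[5]) {
    uintptr_t p = code_page;
    out[0] = p;               /* code page (r-x)                        */
    p -= 0x1000; out[1] = p;  /* headers, one page below the code       */
    p += 0x2000; out[2] = p;  /* read-only data, one page above code    */
    p += 0x1000; out[3] = p;  /* second r--p mapping (GOT related)      */
    p += 0x1000; out[4] = p;  /* writable data page                     */
}
```

Feeding it the code page from the example map (0x7f4ba4862000) yields the five start addresses listed there, just in a different order.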

Now, if you run the program, you will see something like this:

$ LD_PRELOAD=./lib3.so ./hello
PID: 2037736
Starting Main App]
--------------------------------

And the program will stop, waiting for a key to be pressed. Then, in another console, we can check the map for the indicated process; we should still see the library mapped in memory. After pressing enter, our preloaded getchar gets executed and we can see the debug messages we added to our library. Something like:

DEBUG: getchar() invoked...
DEBUG: Allocating memory 1 page
DEBUG: Memory allocated at: 0x7fe2c726c000

At this point all our code has been executed and we are waiting for the user to press a key in the getc(stdin) of the main program, which means that the memory has already been unmapped. You can verify that by checking the /proc/PID/maps file. There are no entries associated with the library, but if you look carefully you’ll notice the memory block where we copied our _delete function.

If you press RETURN again in the main program, it will crash. What happens is that after the call to getc (which was already resolved, because we used it before calling getchar) we call getchar again; it was resolved earlier and therefore points to a memory block that has been unmapped, causing the SEGFAULT.

A primer on dynamic linking

I won’t go into all the details of how dynamic linking works, but you need to understand the very basics to be able to follow what comes next. When you execute a dynamic binary, the kernel loads the program and maps its memory. As part of that process it checks the .interp section (the PT_INTERP program header) to figure out which dynamic linker to use. Then it loads the dynamic linker (usually ld.so) the same way as the original binary, and passes it the original binary (the version already mapped in memory) so it can finish the preparation for the execution.

The dynamic linker performs several tasks for this preparation, but a few are especially important for us. The first one is that it loads all the libraries needed by the binary, the list you get with the ldd command. But before that it loads whatever libraries we have listed in LD_PRELOAD. Here the order is important, as it dictates which library has priority when the same symbol (a function name, for example) is defined in several libraries. In our example, our library is loaded first, so our version of getchar has priority over the libc version.

The other important concept is how the dynamic linker gets the actual address of the different functions used. This is not trivial as the same library may be loaded at different addresses in different processes, depending on, for example, which other libraries are used by the program. The ELF file contains information to allow the dynamic linker to determine the whereabouts of each function knowing the base address of each library.

The process involves two main structures: the PLT (Procedure Linkage Table) and the GOT (Global Offset Table). The PLT contains multiple entries, each of them a so-called trampoline: a tiny stub that jumps to whatever address is stored in its associated GOT entry. By default, whenever you call a function from a dynamic library, you go through the PLT. The first time the function is invoked, the GOT entry still leads to the symbol resolver of the dynamic loader, which is in charge of finding the symbol among all the loaded libraries. Once the address is found, the GOT entry is patched, so the next time the function is called the stub invokes the actual function instead of the resolver. Yes, every function invocation in a dynamic binary has an extra indirection.

In our program, when we call getchar for the second time, its entry had already been patched to point to a function at an address that no longer exists, effectively SEGFAULTing the process. What we have to do is patch this entry so the right code gets executed once we have done our stuff.
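The patch-on-first-call idea can be mimicked in plain C with a function pointer. This is only a toy model under my own naming (got_slot, resolver_stub, call_via_plt and real_function are made up); the real mechanism lives in assembly stubs and in ld.so.

```c
typedef int (*func_t)(void);

static int resolver_calls = 0;

/* Stands in for the real implementation, e.g. libc's getchar. */
static int real_function(void) { return 42; }

static int resolver_stub(void);

/* The "GOT slot": it starts out pointing at the resolver. */
static func_t got_slot = resolver_stub;

/* First call lands here: "resolve" the symbol, patch the slot, forward the call. */
static int resolver_stub(void) {
    resolver_calls++;
    got_slot = real_function;   /* the patch */
    return got_slot();
}

/* The "PLT stub": every call is an indirect jump through the slot. */
int call_via_plt(void) { return got_slot(); }

/* How many times the slow path was taken. */
int resolver_call_count(void) { return resolver_calls; }
```

Calling call_via_plt twice goes through the resolver only once. In this model, our crash corresponds to got_slot still holding the address of a function whose memory is gone, and the fix is to overwrite the slot with a valid address.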

Getting Ready to Patch .GOT.PLT

In order to patch the .got.plt entry associated with our hook function (.got.plt is the part of the GOT that holds the function pointers used by the PLT), and so avoid further crashes of the main program, we have to do a few things.

  • First we have to find out the base memory address of the process. Nowadays systems use ASLR, so every time the program is executed, it and its libraries are loaded at different addresses.
  • Then we have to get a candidate function to use instead of our getchar. As we are patching typical libc functions, we just have to choose the actual libc function, the one the program would be using if no LD_PRELOAD library were injected.
  • Finally, we have to find the DT_PLTGOT dynamic value to determine the exact position of the .got.plt section in memory so we can patch it.

Let’s go step by step. The easiest way to get the base memory address of the current process is to scan the /proc/self/maps file for the main application filename. So, we first get the application filename:

  char          buffer[1024];
  char          *binary_name;

  FILE          *f;

  char           *aux;
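  if ((f = fopen ("/proc/self/cmdline", "rt")) == NULL) {
    perror ("fopen");
    return -1;
  }
  // cmdline holds the arguments separated by NUL bytes, so as a C string
  // the buffer ends right after argv[0]
  if (fgets (buffer, 1024, f) == NULL) {
    fclose (f);
    return -1;
  }
  fclose (f);
  aux = buffer + strlen(buffer);
  while (*aux != '/' && aux > buffer) aux--;
  binary_name = strdup (*aux == '/' ? aux + 1 : aux);
  printf ("DEBUG: Process %ld '%s'\n", (long) getpid(), binary_name);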

This code will get the program name from /proc/self/cmdline and clean it so we can easily find the memory blocks we are interested in. Nothing really special here. Now that we have the binary name let’s get the base address parsing /proc/self/maps.
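The clean-up step (keeping only what follows the last slash) can also be written with strrchr. A small sketch, where base_name is a helper name made up for this example and not part of the original code:

```c
#include <string.h>

/* Return the part of `path` after the last '/', or `path` itself
   when it contains no '/'. The result points into `path`. */
const char *base_name(const char *path) {
    const char *slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}
```

It behaves the same for absolute paths, relative paths such as ./hello, and bare names.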

  unsigned char *prog;
  long           mem_start;
  int            found = 0;
  
  // Get process base address
  if ((f = fopen ("/proc/self/maps", "rt")) == NULL) {
    perror ("fopen");
    return -1;
  }
  // Memory map is sorted by address, so the first match is the lowest one.
  // Loop on fgets (not feof) so a failed read is never mistaken for a line.
  while (fgets (buffer, 1024, f) != NULL) {
    if (strstr (buffer, binary_name) != NULL) {
      found = 1;
      break;
    }
  }
  fclose (f);
  if (!found || (aux = strchr (buffer, '-')) == NULL) {
    printf ("Malformed memory map entry\n");
    return -1;
  }
  *aux=0;
  sscanf (buffer, "%lx", (unsigned long *) &mem_start);
  printf ("DEBUG: Process mapped at: 0x%lx \n", mem_start);
  
  // Got the ELF mapped segment
  prog = (unsigned char *)mem_start;

Also pretty straightforward. The first two hex numbers, separated by a hyphen (-), are the starting and ending addresses of the segment. Entries are always shown in ascending order, so the first match is the lowest address, the one that contains the ELF header and the program headers (PHDR) that will lead us to the information we want.

Note that I’m assuming each block is one page long, but in reality you should read the start and end addresses and calculate the size of the block.

Note that we should use a loop similar to the one above to store the starting address and size of all the blocks matching the library name, and then pass them to the _delete function to make it work in all cases, but I haven’t written that code.
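As a starting point for that missing piece, here is a hedged sketch of how the ranges could be collected. The names map_range, parse_maps_line and collect_ranges are made up for this example, and the parser is simplified: it assumes the usual "start-end perms offset dev inode path" layout and cuts paths at the first space.

```c
#include <stdio.h>
#include <string.h>

struct map_range {
    unsigned long start;  /* first address of the mapping */
    unsigned long size;   /* length in bytes (end - start) */
};

/* Parse one /proc/PID/maps line. Returns 1 and fills *r on success, 0 on a
   malformed line. `path` is left empty for anonymous mappings. */
int parse_maps_line(const char *line, struct map_range *r, char path[256]) {
    unsigned long start, end;
    path[0] = '\0';
    /* perms, offset, device and inode are skipped with %*s */
    int n = sscanf(line, "%lx-%lx %*s %*s %*s %*s %255s", &start, &end, path);
    if (n < 2 || end < start)
        return 0;
    r->start = start;
    r->size  = end - start;
    return 1;
}

/* Store every mapping whose path contains `name`; returns how many were stored. */
size_t collect_ranges(FILE *maps, const char *name,
                      struct map_range *out, size_t max) {
    char line[1024], path[256];
    size_t count = 0;
    while (count < max && fgets(line, sizeof line, maps) != NULL) {
        struct map_range r;
        if (parse_maps_line(line, &r, path) && strstr(path, name) != NULL)
            out[count++] = r;
    }
    return count;
}
```

The idea would be to call collect_ranges on an open /proc/self/maps with the library name before jumping to the copied code, and then hand the resulting array to a _delete that loops over it. That loop is not shown here.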

Getting the pointer to .got.plt

Now we have to get the value of DT_PLTGOT. It is stored in the dynamic section of the binary and, for a position-independent executable like ours, the value stored in the file is the offset from the base address to the table that contains the pointer to our hook function, which is what we eventually want to patch.

Using a few helper functions, the code is not that complicated (the helper functions are the ones a bit tedious to write).

  size_t     d_size = 0;
  Elf64_Dyn  *dyn = get_dynsection(prog, &d_size); // Get dynamic section and number of entries
  long got1 = (long) get_pltgot (dyn, d_size);     // Gets DT_PLTGOT
  // mem_start and got1 are longs, so print them with %lx rather than %p
  printf ("DEBUG: PLTGOT located at: 0x%lx + 0x%lx = 0x%lx\n",
      mem_start, got1, mem_start+got1);
  
  // Get getchar function's GOT entry
  long *hook_got = get_got_ptr (prog, dyn, d_size, (Elf64_Ehdr *)prog, "getchar");
  printf ("DEBUG: getchar_got entry : %p\n", (void *) hook_got);

The first function, get_dynsection, returns a pointer to the dynamic section of the program in memory and the number of items in that section (using the output parameter d_size). This is the function code:

// Returns a pointer to the section and the number of entries in size
Elf64_Dyn*
get_dynsection (void *prg, size_t *size) {
  Elf64_Ehdr *ehdr = (Elf64_Ehdr *)prg;
  
  /* Program header table */
  Elf64_Phdr *phdr = (Elf64_Phdr *)((uint8_t *)prg + ehdr->e_phoff);
  Elf64_Dyn  *dyn = NULL;
  size_t      dyn_cnt = 0;
  
  /* Locate PT_DYNAMIC segment */
  for (int i = 0; i < ehdr->e_phnum; i++) {
    if (phdr[i].p_type == PT_DYNAMIC) {
      dyn = (Elf64_Dyn *)((uint8_t *)prg + phdr[i].p_offset);
      dyn_cnt = phdr[i].p_filesz / sizeof(Elf64_Dyn);
      break;
    }
  }
  if (!dyn) {
    fprintf(stderr, "No PT_DYNAMIC segment found.\n");
    return NULL;
  } 
  *size = dyn_cnt;
  
  return dyn;
}

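It just goes through all the program headers looking for a segment of type PT_DYNAMIC. Once found, it returns a pointer to the contents of that segment as well as the number of items it contains. You can see the contents of this segment using readelf -d. Note that the segment is located through p_offset, its position in the file, whereas its position in the loaded image is given by p_vaddr; the two usually differ for this segment, so this works only as long as that part of the file also happens to be mapped at the same distance from the base address.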

Next we need to extract the PLTGOT entry value, using the get_pltgot function that you can see below:

void*
get_pltgot (Elf64_Dyn  *dyn, size_t dyn_cnt) {
  
  /* Search for DT_PLTGOT entry */
  for (size_t i = 0; i < dyn_cnt; i++) {
    if (dyn[i].d_tag == DT_PLTGOT) {
      /* For a PIE the value stored in the file is relative to the load
         base; the caller adds the base address */
      void *pltgot = (void *)dyn[i].d_un.d_ptr;
      return pltgot;
    }
  }
  
  fprintf(stderr, "DT_PLTGOT not found.\n");
  return NULL;
}

The function just goes through the dynamic section looking for an item with the tag DT_PLTGOT and returning its value. Pretty simple.
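As a side note, the ELF specification terminates the dynamic array with a DT_NULL entry, so a lookup can also be written without knowing the number of entries in advance. A minimal sketch (find_dyn is a name made up for this example, and it is not the code used in the rest of the post):

```c
#include <elf.h>
#include <stddef.h>

/* Walk a dynamic array until the DT_NULL terminator and return the first
   entry with the requested tag, or NULL when the tag is not present. */
const Elf64_Dyn *find_dyn(const Elf64_Dyn *dyn, Elf64_Sxword tag) {
    for (; dyn->d_tag != DT_NULL; dyn++)
        if (dyn->d_tag == tag)
            return dyn;
    return NULL;
}
```

With this helper, get_pltgot boils down to find_dyn (dyn, DT_PLTGOT) plus a NULL check before reading d_un.d_ptr.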

Now that we have the .got.plt table, we just need to find the entry associated with the function we have overridden in our preload library (getchar in our example). This is done by the function get_got_ptr that you can see below (this one I asked ChatGPT to write for me :) ):

void *
get_got_ptr (void *base, Elf64_Dyn *dyn, int dyn_count, Elf64_Ehdr *ehdr, const char *name)
{
  /* Extract dynamic section pointers */
  Elf64_Sym  *symtab = NULL;
  char       *strtab = NULL;
  Elf64_Rela *jmprel = NULL;
  size_t     pltrelsz = 0;
  int        pltrel_type = 0;
  
  /* base is a void *, so cast it before doing pointer arithmetic */
  for (size_t i = 0; i < (size_t) dyn_count; i++) {
    switch (dyn[i].d_tag) {
    case DT_SYMTAB:   symtab      = (Elf64_Sym *)((uint8_t *)base + dyn[i].d_un.d_ptr); break;
    case DT_STRTAB:   strtab      = (char *)((uint8_t *)base + dyn[i].d_un.d_ptr); break;
    case DT_JMPREL:   jmprel      = (Elf64_Rela *)((uint8_t *)base + dyn[i].d_un.d_ptr); break;
    case DT_PLTRELSZ: pltrelsz    = dyn[i].d_un.d_val; break;
    case DT_PLTREL:   pltrel_type = dyn[i].d_un.d_val; break;
    }
  }

  if (!symtab || !strtab || !jmprel || pltrel_type != DT_RELA) 
    return NULL;
  
  size_t count = pltrelsz / sizeof(Elf64_Rela);
  
  /* Iterate PLT relocations */
  for (size_t i = 0; i < count; i++) {
    Elf64_Rela *r = &jmprel[i];
    
    unsigned sym_index = ELF64_R_SYM(r->r_info);
    const char *sym_name = &strtab[symtab[sym_index].st_name];
    
    if (strcmp(sym_name, name) == 0) {
      /* GOT entry location */
      void *got_entry_address = (uint8_t *)base + r->r_offset;
      return got_entry_address;
    }
  }
  
  return NULL;
}

Basically, the code goes through the PLT relocation table, looks up the symbol each relocation is associated with, and checks the name of that symbol against the string we pass. The r_offset of the matching relocation is roughly what you see in the Offset column of the readelf -r output.

$ readelf -r hello

Relocation section '.rela.dyn' at offset 0x628 contains 10 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000003dd0  000000000008 R_X86_64_RELATIVE                    1180
000000003dd8  000000000008 R_X86_64_RELATIVE                    1140
000000004038  000000000008 R_X86_64_RELATIVE                    4038
000000003fc0  000100000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.34 + 0
000000003fc8  000200000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_deregisterTM[...] + 0
000000003fd0  000700000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000003fd8  000900000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_registerTMCl[...] + 0
000000003fe0  000c00000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize@GLIBC_2.2.5 + 0
000000004040  000b00000005 R_X86_64_COPY     0000000000004040 stdout@GLIBC_2.2.5 + 0
000000004050  000d00000005 R_X86_64_COPY     0000000000004050 stdin@GLIBC_2.2.5 + 0

Relocation section '.rela.plt' at offset 0x718 contains 6 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000004000  000300000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
000000004008  000400000007 R_X86_64_JUMP_SLO 0000000000000000 getpid@GLIBC_2.2.5 + 0
000000004010  000500000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
000000004018  000600000007 R_X86_64_JUMP_SLO 0000000000000000 getchar@GLIBC_2.2.5 + 0 <========
000000004020  000800000007 R_X86_64_JUMP_SLO 0000000000000000 fflush@GLIBC_2.2.5 + 0
000000004028  000a00000007 R_X86_64_JUMP_SLO 0000000000000000 getc@GLIBC_2.2.5 + 0
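The Info column packs two values: the upper 32 bits are the index into the dynamic symbol table and the lower 32 bits are the relocation type. The macros from <elf.h> used by get_got_ptr split them; the two wrapper functions below are just made-up helpers for this example.

```c
#include <elf.h>
#include <stdint.h>

/* Upper 32 bits of r_info: index into the dynamic symbol table. */
uint32_t reloc_sym_index(uint64_t r_info) {
    return (uint32_t) ELF64_R_SYM(r_info);
}

/* Lower 32 bits of r_info: relocation type. */
uint32_t reloc_type(uint64_t r_info) {
    return (uint32_t) ELF64_R_TYPE(r_info);
}
```

For the getchar line above, Info is 0x000600000007: symbol index 6 and type 7, which is R_X86_64_JUMP_SLOT.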

With all these elements, the code we showed at the beginning gives us the absolute address of the GOT entry associated with the hook function we used in our preload library. I reproduce the code here again for your convenience:

  size_t     d_size = 0;
  Elf64_Dyn  *dyn = get_dynsection(prog, &d_size);
  long got1 = (long) get_pltgot (dyn, d_size);
  printf ("DEBUG: PLTGOT located at: 0x%lx + 0x%lx = 0x%lx\n",
      mem_start, got1, mem_start+got1);
  
  // Get HOOK function's GOT entry
  long *hook_got = get_got_ptr (prog, dyn, d_size, (Elf64_Ehdr *)prog, "getchar");
  printf ("DEBUG: getchar_got entry : %p\n", (void *) hook_got);

Now we just have to find the libc implementation of getchar and patch hook_got.

  // RTLD_NEXT needs #define _GNU_SOURCE before #include <dlfcn.h>
  int            (*original_hook)(void);
  original_hook = dlsym(RTLD_NEXT, "getchar");
  printf ("DEBUG: Original %s: %p\n", "getchar", (void *) original_hook);
  *(long*)(hook_got) = (long)original_hook;

Now we need to link our library against the dl library (-ldl) in order to use dlsym to find the NEXT function implementation. If at this point you recompile everything and run the program again, you will see that the program now executes normally, but it crashes when finishing.

The final Crash

The program crashes at the end because, when the program ends, some extra code from the C runtime (those crtX.o files) gets executed after the main function returns. One of the things that code does is run the destructors of all the modules (libraries and the main program), so when it tries to access our library to figure out whether there is a destructor to run, it ends up reading unmapped memory. Those pointers (the destructor table) are populated by the dynamic loader at the beginning of the execution, long before our code gets executed.

Actually, if you call any other not-yet-resolved function after getchar the program will also crash, because our unmapped library is still in the search list of the dynamic linker, so it will try to use it to resolve the symbol and, while trying to check the details of our library, the program will crash.

Fixing this is quite tricky because the search list is an internal dynamic loader structure, not intended to be used from normal programs, as it may change at any time. That is usually what internal means. Actually, this structure depends on the libc version you are using. For this experiment I was using libc 2.36 and the relevant structure we are talking about, struct link_map, can be found at:

https://elixir.bootlin.com/glibc/glibc-2.36/source/include/link.h#L95

This struct is reachable through the second entry of the GOT (GOT[1]) and has a lot of fields. I haven’t managed to get access to the l_scope field, which is supposed to contain the list of libraries used when resolving dynamic symbols. Roughly, we have to access that list and remove our library from it, but even recompiling that specific version of libc I didn’t manage to figure out the offset of that field. However, we can do a very simple fix that at least allows the program to execute normally, getting just an error instead of a crash when the application finishes.

The link_map structure has a field named l_ns that represents the library namespace. You can control this using the dlmopen function, but we will play with it directly. If we zero this field for the entry associated with our library in the link_map list, it won’t be used for symbol resolution anymore. In all honesty, I haven’t dived deep enough into the dynamic linker code to fully understand why this happens. The dynamic linker code is difficult and huge, but for the proof of concept this hack may do the trick.

The first thing we have to do is get access to the link_map structure, which is the second entry in the GOT. We already have the offset of that table, so we can easily get to the link_map with the following code.

  long *got = (long*)(mem_start + got1);
  struct link_map *p = (struct link_map*) got[1];
  printf ("DEBUG: GOT: %p -> GOT[1] -> %lx\n", got, got[1]);

And now that we have our link_map, we just need to look for our library and change its namespace:

  while (p) {
    printf ("DEBUG: => '%s' %p", p->l_name, p->l_ld);
    if (strstr (p->l_name, "test")) {
      // When found, patch libname so the library won't be used for
      // symbol resolution
      printf ("  **PATCHED**");
      // TODO: We should actually patch l_scope in link_map
      *((long*)((unsigned char *)p + 40)) = 0; // Patch l_ns
    }
    printf ("\n");
    p = p->l_next;
  }

Yes, link_map is a linked list in which each element represents one of the objects loaded for the process.

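As I said before, this structure is internal, but the very first fields are accessible using a simplified version of the struct defined in link.h, and it is OK to use them. Fields like l_name or l_next are public. However, the l_ns field is not defined in that simplified struct (as it may change in the next libc version), and that is why I use offset 40 to access it for glibc 2.36. Note how this is pretty much a hack and needs to be adjusted for different libc versions. One caveat: going by the field order in the internal header linked above, offset 40 on x86-64 seems to land on l_real, the field declared right before l_ns (which would then sit at offset 48), so it may well be l_real that this hack is zeroing.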
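The public part of the structure is enough to walk the list and to see where the internal fields begin. A small sketch using only what <link.h> exposes; find_map and first_internal_offset are names made up for this example:

```c
#include <link.h>
#include <stddef.h>
#include <string.h>

/* Offset of the first byte after the public fields of struct link_map,
   that is, where the loader's internal fields start. */
size_t first_internal_offset(void) {
    return offsetof(struct link_map, l_prev) + sizeof(struct link_map *);
}

/* Return the first entry whose l_name contains `needle`, or NULL.
   Only public fields (l_name, l_next) are touched. */
struct link_map *find_map(struct link_map *head, const char *needle) {
    for (struct link_map *p = head; p != NULL; p = p->l_next)
        if (p->l_name != NULL && strstr(p->l_name, needle) != NULL)
            return p;
    return NULL;
}
```

On a 64-bit system the five public fields take 8 bytes each, so first_internal_offset() returns 40: whatever glibc declares first in its internal part of the structure is what lives at offset 40.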

Now your program will run till the end without issues but it will show the following error when finishing:

Inconsistency detected by ld.so: dl-fini.c: 87: _dl_fini: Assertion `ns != LM_ID_BASE || i == nloaded' failed!

That’s it. If somebody manages to get rid of this last message, please let me know.

Colophon

This experiment started as an informal discussion and ended up being an amazing journey into the guts of the dynamic loader and how it works. Every problem I faced opened a completely new rabbit hole to dig into, and I found all kinds of wonders hidden inside libc and the dynamic linker. I really enjoyed it!

You can find the source code on my GitHub:

https://github.com/0x00pf/preload

