Since 2009, for more than 12 years, all major Linux distributions have shipped a high-severity security hole that went unnoticed until recently. The vulnerability, dubbed “PwnKit” (CVE-2021-4034), resides in the “pkexec” tool and allows a local user to gain root privileges on the affected host.
Polkit (formerly PolicyKit) is a component for controlling system-wide privileges in Unix-like operating systems. It provides an organized way for non-privileged processes to communicate with privileged processes. It is also possible to use polkit to execute commands with elevated privileges using the command pkexec followed by the command intended to be executed (with root permission).
Due to an improper implementation of the pkexec tool, an out-of-bounds memory access can be leveraged by a local attacker to escalate their privileges to system root.
Security researchers at Qualys successfully reproduced the exploit on default installations of Ubuntu, Debian, Fedora, and CentOS and gained full root privileges on the vulnerable hosts. Other Linux distributions are likely vulnerable too, and perhaps some other Unix-like operating systems as well.
Immediately following is a background section which explains some concepts crucial to understanding PwnKit. If you feel confident with these concepts, feel free to jump to the next section.
That’s all on the background side. Now let’s get to the exploit.
pkexec’s syntax is as follows:
pkexec [ --user username ] PROGRAM [ ARGUMENTS … ]
It takes a username (which defaults to root when not passed) and a program file path, and executes it on behalf of this user.
Below is a portion of pkexec’s main() function, which receives the arguments above when the command line is executed:
As we can see, the program iterates through the argv elements (line 534). By the time we get to line 610, the for loop has already stopped at the first argument that is not a pkexec option, so argv[n] points to the target program path to be executed by pkexec (in a simple call such as pkexec PROGRAM, n equals argc-1). The program now checks whether the target program’s path is absolute, that is, whether it starts with a slash (line 629). If not, it calls the g_find_program_in_path() function to find its absolute path (line 632). argv[n] is then modified to hold the resolved path of the target program (line 639).
That is all. This is an expected behaviour of the main() function. But what if we call pkexec with zero arguments? And by zero arguments, I mean without even the first argument that is the pkexec path itself?
How would the above code look now?
The main() function is called, with argc being 0 and argv being empty, that is, containing only a NULL pointer (line 435). The for loop initializes n with a value of 1, and an end condition that n should be less than argc (which is, in our case, 0). This condition of course is not met, as n==1 and can’t be less than 0, which means the loop immediately terminates, which leaves n with a value of 1.
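The behaviour of that loop can be modelled in a few lines. The following is a minimal Python sketch, not pkexec’s actual C code: the function name first_non_option_index is hypothetical, and option handling is reduced to the single --user option from the syntax shown earlier. It only illustrates which value n has when the loop ends.

```python
def first_non_option_index(argv):
    """Toy model of the option loop `for (n = 1; n < argc; n++)`.

    Hypothetical helper: only `--user` is modelled, and anything that does
    not start with `--` is treated as the program to run.
    """
    argc = len(argv)
    n = 1
    while n < argc:
        if not argv[n].startswith("--"):
            break  # first non-option argument: the target program
        if argv[n] == "--user":
            n += 1  # skip the username that follows the option
        n += 1
    return n


if __name__ == "__main__":
    # Ordinary call: n stops at the program name.
    print(first_non_option_index(["pkexec", "id"]))
    print(first_non_option_index(["pkexec", "--user", "bob", "id"]))
    # Empty argument list (argc == 0): the loop body never runs, n stays 1.
    print(first_non_option_index([]))
```

In the empty case the function still returns 1, although no element with index 1 exists in the argument list. That mismatch is what the rest of the walkthrough builds on.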
And now the interesting part begins:
Line 610 copies argv[n] to path. Since n is 1 and the array holds no arguments, argv[n] lies beyond the end of the array, which means the code reads beyond its bounds: an out-of-bounds read.
Moving on, line 632 calls the g_find_program_in_path() function, which tries to find the absolute path of the program named in path. That name is so far unknown to us, as it was fetched from a value read out-of-bounds. If a file with the same name as path’s value exists in one of the PATH directories, its absolute path is written back to argv[n] (line 639), again accessing the argv array beyond its bounds: an out-of-bounds write.
At the end of this flow, a memory location outside of the argv array, which could possibly point to a string which is a file name, is overwritten with the absolute path of the file.
Well, OK. An out-of-bounds read and write, what benefit does it have? It’s not as if we can control the out-of-bounds memory location which is read from and written to… Or can we?
For those of you who read the background section – your patience now pays off.
When we run the pkexec command, we can pass it the argument list parameters, argv and argc, and also the environment list, envp. Note that although the main() function of pkexec does not use the envp argument, it is still passed to it and stored in the function’s available memory.
As described earlier, the main() function, like every other function, can access its arguments and variables thanks to the call stack, which stores them in an orderly fashion. The argv and envp arrays are stored alongside each other, as seen below:
The elements of argv are stored in successive memory locations, all the way to the NULL pointer argv[argc]. Immediately following it are the elements of envp, starting from the first one, envp[0], all the way to the NULL pointer envp[envc].
For us, what’s interesting about this arrangement is that when pkexec incorrectly accesses the out-of-bounds argv[n] element (remember, n==1 and argc==0), it’s actually accessing and modifying the envp[0] element. Why?
This is because argv[argc] is in our case argv[0], and argv[n] is argv[1], which in memory resolves to the address following argv[0], which is envp[0].
Now the question again is, can we control the value that is accessed out-of-bounds, envp[0]? Well, yes! envp[0] holds the first environment variable that is passed to pkexec when it is executed. Furthermore, we can control the environment variables we want to pass to pkexec when we execute it. Which means envp[0] is ours to control.
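The aliasing between the two arrays can be made concrete with a small model. This is a sketch only: a Python list stands in for the contiguous stack memory, None stands in for a NULL pointer, and the helper names build_flat_layout and argv_slot are hypothetical. The environment entries are placeholder values.

```python
def build_flat_layout(argv, envp):
    """Place argv and envp back to back, each terminated by a NULL (None) slot."""
    return list(argv) + [None] + list(envp) + [None]


def argv_slot(layout, n):
    """Read argv[n] as C would: plain offset from the start, no bounds check."""
    return layout[n]


if __name__ == "__main__":
    # Ordinary call: argv[1] is the program name.
    normal = build_flat_layout(["pkexec", "id"], ["FIRST=1", "SECOND=2"])
    print(argv_slot(normal, 1))

    # Empty argument list: slot 0 is argv's NULL terminator,
    # so slot 1 is already the first environment entry.
    empty = build_flat_layout([], ["FIRST=1", "SECOND=2"])
    print(argv_slot(empty, 0))
    print(argv_slot(empty, 1))
```

With an empty argv, slot 1 returns the first environment entry, which mirrors how argv[1] and envp[0] end up naming the same memory location.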
Now, let’s call pkexec with the following conditions:
This will be the corresponding call stack within main():
This sets the PATH environment variable, explained in the background section, to hold a reference to the execdir directory. The main() function will now read envp[0], which is “somefile”, and try to find its absolute path in the directories listed in PATH. It will find it, as we’ve created it under ./execdir/somefile. Finally, it will overwrite envp[0] with the absolute path of execdir/somefile.
Remember the GCONV_PATH exploit? It uses the iconv_open function to execute the executable file listed in the GCONV_PATH environment variable. Unfortunately for us exploiters, GCONV_PATH is removed from pkexec’s environment when it is executed, due to its known security issues. But now that we control envp[0], one environment variable is all ours to manipulate. Can we insert GCONV_PATH into pkexec’s environment after all?
Let’s fine tune our exploit a bit more. Suppose we now call pkexec with the following conditions:
This will be the corresponding call stack within main():
This sets the PATH environment variable to hold a reference to a directory literally named GCONV_PATH=. (the equals sign and the dot are part of the directory name). The main() function will now read envp[0], which is “exploit”, and try to find its absolute path in the PATH directories list. It will find it, as we’ve created it under the GCONV_PATH=. directory. Finally, it will overwrite envp[0] with the resolved path, GCONV_PATH=./exploit, which now reads exactly like an environment variable assignment.
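The path search itself is easy to try out harmlessly. The sketch below uses Python’s shutil.which as a stand-in for g_find_program_in_path(); it is an assumption that the two behave alike for this case, and the helper names resolve_in_path and demo are hypothetical. The script only creates an empty shell script inside a temporary directory and looks it up; it does not involve pkexec at all.

```python
import os
import shutil
import tempfile


def resolve_in_path(name, path_value, workdir):
    """Look up `name` in the PATH-style string `path_value`, relative to `workdir`."""
    previous = os.getcwd()
    os.chdir(workdir)
    try:
        # shutil.which joins each PATH entry with the file name and
        # returns the first candidate that exists and is executable.
        return shutil.which(name, path=path_value)
    finally:
        os.chdir(previous)


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        # A directory whose name is literally "GCONV_PATH=."
        directory = os.path.join(workdir, "GCONV_PATH=.")
        os.mkdir(directory)
        target = os.path.join(directory, "exploit")
        with open(target, "w") as handle:
            handle.write("#!/bin/sh\n")  # harmless placeholder file
        os.chmod(target, 0o755)
        return resolve_in_path("exploit", "GCONV_PATH=.", workdir)


if __name__ == "__main__":
    print(demo())
```

On a POSIX system the lookup returns the string GCONV_PATH=./exploit: the directory name supplies the variable name and the file name completes the value.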
All set: we have introduced the exploitable GCONV_PATH environment variable into pkexec’s environment. The last thing to do is to somehow trigger iconv_open, and make it use GCONV_PATH to load and execute our malicious file, exploit.
Fortunately, there is a way. pkexec’s code flow has a lot of conditions for validating user input. When it encounters improper syntax or invalid values in the command line arguments passed to it, or in the environment variables it is given, it prints an indicative error message using GLib’s g_printerr() function. By default, g_printerr() prints messages in UTF-8 encoding. But if the CHARSET environment variable names a different encoding, let’s say UTF-16, it needs to call the iconv_open() function to convert the output string from UTF-8 to UTF-16. iconv_open() in turn will look for the conversion code referenced through the GCONV_PATH environment variable, then load and execute it. Nice.
We’ve found a way to force pkexec to execute our malicious file that is listed under the GCONV_PATH.
We still need to figure out how to invoke one of the g_printerr() calls “scattered” around pkexec’s code. For this, we will use the following function, validate_environment_variable:
This validate_environment_variable function is responsible for verifying that a given environment variable is valid, that is, secured and cannot be leveraged for any exploits.
After some validation checks, there remains a special case to check, in which the environment variable’s key is “SHELL”. We can see that check in line 401, after which there is a check whether the value of the SHELL environment variable is valid, that is, listed in the /etc/shells file. If it’s not, g_printerr() is called to generate an error message, which for us means victory.
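The SHELL special case can be sketched as follows. This is an illustrative Python model, not the code of validate_environment_variable: the function names, the sample list of shells and the error text are all assumptions, and the print_error callback stands in for g_printerr().

```python
def shell_is_listed(shell_value, shells_text):
    """Return True if `shell_value` is an entry in /etc/shells-style text."""
    for line in shells_text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#") and entry == shell_value:
            return True
    return False


def validate_shell_variable(shell_value, shells_text, print_error):
    """Model of the SHELL check: an unlisted value triggers an error message."""
    if shell_is_listed(shell_value, shells_text):
        return True
    # Illustrative wording only; in pkexec this is where g_printerr() is called.
    print_error("SHELL value %r is not listed in /etc/shells" % shell_value)
    return False


if __name__ == "__main__":
    sample_shells = "# valid login shells\n/bin/sh\n/bin/bash\n"
    print(validate_shell_variable("/bin/bash", sample_shells, print))
    print(validate_shell_variable("/not/a/listed/shell", sample_shells, print))
```

The point of interest is the failing branch: any SHELL value that is not on the list is enough to reach the error-printing call.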
Thus, all we have to do is supply pkexec with a couple of additional environment variables, SHELL and CHARSET.
Our final exploit is ready. We will now call pkexec with the following conditions:
pkexec is executed with the conditions above. The call stack of main() is as below:
pkexec will now access envp[0], resolve its value exploit to the path GCONV_PATH=./exploit, where our malicious file is located, and write it back to envp[0].
Next, it will proceed with validating the environment variables we supplied, one by one, until it gets to SHELL. It will validate it against the special case, and since it does not meet the conditions of a valid SHELL path value, it will print an error using g_printerr(). g_printerr() will check the CHARSET environment variable, which we populated with the value “NOT_UTF8”. Since that is not UTF-8, it will call iconv_open() to help it convert the encoding of the error message to “NOT_UTF8”.
iconv_open() will refer to the conversion file located in the GCONV_PATH environment variable, which expectedly holds our malicious exploit file. iconv_open() now loads and executes exploit, and Boom! Exploit success.
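The whole chain can be replayed as a toy simulation. The Python sketch below executes nothing and touches no files; it only tracks how the environment list changes and whether the conversion step would be reached. The function names, the order of the environment entries and the SHELL value are assumptions made for illustration, and existing_files is a stand-in for the file system.

```python
def lookup(entries, key, default=None):
    """Return the value of the first NAME=value entry whose name is `key`."""
    for entry in entries:
        name, separator, value = entry.partition("=")
        if separator and name == key:
            return value
    return default


def simulate_flow(envp, existing_files, allowed_shells):
    """Toy model of the steps described in the text, for argc == 0."""
    # argv holds only its NULL terminator, directly followed by envp.
    layout = [None] + list(envp) + [None]
    n = 1  # the option loop never ran

    path = layout[n]  # out-of-bounds read: really envp[0]
    if path is not None and not path.startswith("/"):
        for directory in lookup(layout[1:-1], "PATH", "").split(":"):
            candidate = directory + "/" + path
            if candidate in existing_files:
                layout[n] = candidate  # out-of-bounds write into envp[0]
                break

    environment = layout[1:-1]
    events = []
    shell = lookup(environment, "SHELL")
    if shell is not None and shell not in allowed_shells:
        # Error message path: conversion is needed unless the charset is UTF-8.
        charset = lookup(environment, "CHARSET", "UTF-8")
        if charset != "UTF-8":
            events.append(("iconv_open", charset, lookup(environment, "GCONV_PATH")))
    return environment, events


if __name__ == "__main__":
    env, events = simulate_flow(
        ["exploit", "PATH=GCONV_PATH=.", "SHELL=/not/a/listed/shell", "CHARSET=NOT_UTF8"],
        existing_files={"GCONV_PATH=./exploit"},
        allowed_shells={"/bin/sh", "/bin/bash"},
    )
    print(env)
    print(events)
```

In this model the first environment entry ends up as GCONV_PATH=./exploit, and the recorded event shows the conversion step being reached with that variable set. Changing CHARSET to UTF-8 in the input leaves the event list empty, which matches the role CHARSET plays in the text.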
This is a classic and neat memory corruption vulnerability. Learning to understand such vulnerabilities can help the security researchers among us to better find these holes, and help the community by pointing them out and helping to fix them. For the developers among us, it’s a great way to learn some good and bad practices in handling memory, on the way to making our open-source software more secure.
In order to make sure that your dependencies are updated and secure, we recommend you:
Keep your open source components up to date with tools like Mend Remediate to make sure direct dependencies are automatically patched to the latest version.
Integrating automated security into your repo, so that issues are addressed as soon as possible, is the best way to mitigate open source risks early, before they hit the headlines.