Hello world,
I spent the last few weekends on a new target: njs, a JavaScript interpreter that is used as part of the nginx web server’s backend. Essentially, my goal is to find a way to break out of the interpreter’s sandbox without using require("child_process"); and friends.
After doing some fuzzing, I found two bugs. Then I used CodeQL to find more variants(!)
Below is a description of the bugs:
Note: The research was conducted on commit 74854b6edaa8a76fdc96395cdc7fbdfcd01425b6, which was the latest at the time of the research (May 2024).
Bug #1 - Type Confusion
Description
The implementation of Array.prototype.toSpliced has a bug that allows you to get an array backed by uninitialized heap memory.
This can be leveraged into a type confusion primitive by:
- Allocating a Uint8Array buffer; the buffer will contain bytes that form an njs_value_t object.
- Free’ing it.
- Allocating an array using Array.prototype.toSpliced. The memory in this array will not be initialized because we trigger our bug, and the address of this array will be the same address we free’d in step 2.
- Profit: now we have an array with fake njs_value_t elements.
PoC
The PoC below triggers a segfault:
Analysis
In Array.prototype.toSpliced / njs_array_prototype_to_spliced, the code performs the following:
- allocate an array
- assign the ptr to retval by calling njs_set_array()
- access the array members (which also triggers getters/setters) to populate the new array’s content
- if something goes wrong, abort and return NJS_ERROR
The problem is in step 2: we assign the new pointer to retval too early.
We assign to retval a heap chunk with uninitialized data. Then the throw new Error() exception (triggered by our getter) makes the function bail out in the middle of the toSpliced operation, before the memory pointed to by retval is fully initialized.
Visually, this is how the retval variable is represented in memory:
As you can see, this is a JavaScript Array with elements; each element is 16 bytes in size and is represented by an njs_value_t struct.
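To make the "retval is published too early" idea concrete, here is a toy model in plain JavaScript. It is not njs code: toSplicedModel, the retval object and the UNINITIALIZED marker are all made up for illustration, and the marker simply stands in for stale heap bytes.

```javascript
// Toy model only: plain JavaScript standing in for the native C function.
// `retval` plays the role of the caller-visible result slot.
const UNINITIALIZED = Symbol("stale heap data");

function toSplicedModel(source, retval) {
  // Step 1: allocate the destination; its slots still hold stale data.
  const fresh = new Array(source.length).fill(UNINITIALIZED);

  // Step 2 (the mistake): publish the pointer before the copy is done.
  retval.value = fresh;

  // Step 3: copy element by element; reading source[i] may run a getter.
  for (let i = 0; i < source.length; i++) {
    fresh[i] = source[i];
  }
}

// A source whose element 1 is a getter that throws mid-copy.
const evil = [1, 2, 3];
Object.defineProperty(evil, 1, {
  get() { throw new Error("bail out"); },
});

const retval = { value: undefined };
let threw = false;
try {
  toSplicedModel(evil, retval);
} catch (e) {
  threw = true;
}

// Element 0 was copied; elements 1 and 2 were never initialized,
// yet the caller can still reach them through retval.
const leftover = retval.value.filter((v) => v === UNINITIALIZED).length;
console.log(threw, leftover);
```

In the model, moving the retval.value assignment to after the loop is enough to make the half-built array unreachable when the getter throws.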
Exploit
Initial Exploit
So far, I came up with this, which is pretty similar to the initial PoC but with a few more strings and changes to the allocation sizes:
I’m still working on figuring out how to shape the heap correctly, but so far I managed to create a scenario where buggy[0] lands on a glibc main_arena pointer.
gef➤ print value->data->u.array->start
$5 = (njs_value_t *) 0x555555649000
gef➤ hexdump byte 0x555555649000
0x0000555555649000 10 83 c4 f7 ff 7f 00 00 10 83 c4 f7 ff 7f 00 00 ................ <---- buggy[0]
0x0000555555649010 f0 8f 64 55 55 55 00 00 f0 8f 64 55 55 55 00 00 ..dUUU....dUUU.. buggy[1]
0x0000555555649020 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ ...
0x0000555555649030 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0000555555649040 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0000555555649050 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0000555555649060 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0000555555649070 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
gef➤ x/gx 0x0000555555649000
0x555555649000: 0x00007ffff7c48310
gef➤ info symbol 0x00007ffff7c48310
main_arena + 1680 in section .data of /lib/x86_64-linux-gnu/libc.so.6
It might be useful, but I’m not interested in this type of primitive. I’d like to make it re-usable (if that’s the term) and njs-oriented, like achieving a fake njs_value_t in order to build my way up to arbitrary r/w ➜ code exec.
If anyone wants to work on this together lmk :’)
UPDATE (31/May/2024): I brainstormed with @iamgweej about this bug and how it is possible to leverage it to RCE. The exploit is now ready! Details are below.
Full Exploit
I’ve been scratching my head for a while on how to approach this bug, but then a friend (@iamgweej) came to the rescue. We worked on this together by peeking at the allocator’s logic. Eventually he found the way to achieve what I describe in Step 0 of the exploit-dev process. That one fxcking thing allowed us to continue and weaponize this bug to achieve stronger primitives and write our own exploits.
Links to the exploits:
- @iamgweej’s exploit: https://gist.github.com/iamgweej/7dd5f1e902b1de1c6bdf9b5b653e339b
- My exploit: https://github.com/0xbigshaq/njs-vr/blob/main/exp.js
Below is a breakdown of my exp.js, how it works and the logic behind it.
Step 0 - Heap leak
To get a heap leak, we allocate a chunk with a size for which njs_is_power_of_two() is false. This will “break alignment” of the chunk. As a result, the NJS engine will decide to round up the size and push a njs_mp_block_t struct in the remaining space (at the end of the chunk).
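A quick way to pick sizes that fail the power-of-two test is the usual single-bit trick. This is a sketch under assumptions: isPowerOfTwo is my own helper, the candidate sizes are arbitrary examples, and I am assuming njs_is_power_of_two() behaves like the standard check.

```javascript
// Standard bit trick: a power of two has exactly one bit set.
function isPowerOfTwo(n) {
  return n > 0 && (n & (n - 1)) === 0;
}

// Candidate allocation sizes (hypothetical values, for illustration only).
const candidateSizes = [0x40, 0x48, 0x80, 0x1f8, 0x200];

// Keep only the sizes that are NOT powers of two.
const unalignedSizes = candidateSizes.filter((size) => !isPowerOfTwo(size));
console.log(unalignedSizes.map((size) => "0x" + size.toString(16)));
```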
This can help us if we:
- Create a Uint8Array buffer with a size that is not a power of two.
- In the last 8 bytes of the byte array, craft a fake array element/njs_value_t and set its type to be an NJS_STRING with a size below 0xf to make it a ‘short string’ (strings shorter than 15 characters are embedded within the remaining space of the njs_value_t).
- Free the Uint8Array by calling Uint8Array.prototype.sort() (ref: njs_typed_array.c#L2048).
- Trigger the bug and get the chunk we free’d.
- At this point we have confused the contents of the free’d Uint8Array buffer with a JavaScript Array whose memory is not initialized (i.e. it still contains the contents of the Uint8Array).
- Access the last element of the Array; it should be a short NJS_STRING containing the data of the njs_mp_block_t (which previously was part of the Uint8Array, occupying its last 8 bytes).
Below is a PoC for the heap leak:
Output:
[*] heap leak = 0x55851b7e3fb8
We leaked (njs_mp_block_t*)->node.left, nice :^)
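For reference, turning the leaked string bytes into a printable address can be sketched like this. It assumes the leaked string exposes one character per raw byte, lowest byte first (x86-64 is little-endian); leBytesToNumber is a hypothetical helper, not something taken from exp.js, and the sample bytes are simply the address from the output above spelled backwards.

```javascript
// Convert a byte string (one char per byte, lowest byte first) to a Number.
// User-space addresses fit in 48 bits, so a double holds them exactly.
function leBytesToNumber(byteString) {
  let value = 0;
  for (let i = byteString.length - 1; i >= 0; i--) {
    value = value * 256 + (byteString.charCodeAt(i) & 0xff);
  }
  return value;
}

// Sample bytes: the address 0x55851b7e3fb8 in little-endian order.
const leakedBytes = "\xb8\x3f\x7e\x1b\x85\x55";
const heapLeak = leBytesToNumber(leakedBytes);
console.log("[*] heap leak = 0x" + heapLeak.toString(16));
```

Multiplication is used instead of bit shifts because JavaScript's bitwise operators truncate to 32 bits.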
Basically, we set the first 8 bytes of the njs_value_t with the type info of a string, and the allocator wrote the other half for us (which is the string’s content). Visually, this is how it looks at runtime:
Step 1 - Getting an addrOf primitive
To figure out where we are on the heap, we can use the method from the previous step but with a small twist:
Now that we have a valid address in the VA space, we can forge another string, but with a greater length. We couldn’t do it previously because strings longer than 15 characters require a pointer to their contents. Now we have a valid pointer, so we can leverage it to create an even bigger OOB read primitive that will allow us to find the addresses of other objects on the heap.
To get an addrOf primitive, we:
- Allocate raw_buf = new Uint8Array(0x2000); this buffer will be used in the next steps to forge more complex objects to achieve arbitrary r/w.
- Call oob_read() and provide the heap pointer we leaked in the previous step. This helper func uses the same technique as the previous step but with a string of type ‘long string’, so we can dereference the pointer and get a large string (we confuse (njs_string_t*)->start with (njs_mp_block_t*)->node.right and (njs_string_t*)->length with (njs_mp_block_t*)->node.left). oob_read() will return a string that points somewhere on the heap, with a very large size.
- Then, we take the large string and search for the contents of raw_buf in it.
- Once we find the offset of raw_buf within the large string, we can calculate its address by adding the offset to the pointer we leaked in step 0.
(Implementation of this part can be found @ exp.js#L208-L223)
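The search-and-add logic from the last two bullets can be sketched on its own, with no memory corruption involved. Everything here is an assumption for illustration: the big string is modeled as a Uint8Array called dump, leakBase and the marker bytes are invented, and findMarker is my own helper rather than code from exp.js.

```javascript
// Model the oversized string as a byte array that begins at leakBase.
const leakBase = 0x555555600000; // hypothetical pointer leaked in step 0
const dump = new Uint8Array(0x1000);

// Hypothetical 8-byte marker that raw_buf was filled with beforehand.
const marker = [0x41, 0x42, 0x43, 0x44, 0x13, 0x37, 0xc0, 0xde];
dump.set(marker, 0x420); // pretend raw_buf happens to live at this offset

// Naive byte-wise search; returns the first matching offset or -1.
function findMarker(haystack, needle) {
  outer: for (let i = 0; i + needle.length <= haystack.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

const markerOffset = findMarker(dump, marker);
const addrOfRawBuf = leakBase + markerOffset;
console.log("[*] offset: " + markerOffset);
console.log("[?] addrof raw_buf = 0x" + addrOfRawBuf.toString(16));
```

A longer, more random marker lowers the chance of matching unrelated heap data.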
output:
[*] long string length: 0x60dbde40
[*] offset: 451168
[?] addrof raw_buf = 0x55bf60e28420
Step 2 - Build a fakeObj primitive
To get arbitrary r/w, we:
- Trigger the bug again; this time we make one of the elements, fake_arr[5], a JavaScript TypedArray.
- This TypedArray’s information will point to the contents of raw_buf, which is controlled by us (we know its address from the previous step).
- In raw_buf[0x100], we’ll spoof a njs_typed_array_t struct that has a pointer to a njs_array_buffer_t 0x100 bytes after it.
(Implementation of this part can be found @ exp.js#L112-L139)
Now all we have to do is manipulate the contents of raw_buf to change where fakeObj points to. This is done with the arb_read() and arb_write() helper functions @ exp.js#L141-L190
Step 3 - break ASLR
The allocator stores a function pointer to njs_mp_rbtree_compare() in its chunks’ metadata (njs_mp.c#L217).
To find the base address of the njs binary, we can use our arb_read() primitive to traverse the allocator’s red-black tree until we find a pointer to njs_mp_rbtree_compare. This is implemented in exp.js#L229-L245, and from here on it shouldn’t be a problem to calculate any other address in the binary.
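The "calculate any other address" part is plain offset arithmetic. A minimal sketch, assuming made-up numbers: the leaked pointer value and both symbol offsets below are hypothetical and would have to come from the actual binary (for example via nm or objdump).

```javascript
// Hypothetical value read out of the allocator metadata with arb_read().
const leakedCompare = 0x5555555a2b40;

// Hypothetical offsets of symbols inside the njs binary.
const COMPARE_OFFSET = 0x4eb40; // njs_mp_rbtree_compare
const OTHER_OFFSET = 0x12340;   // any other symbol we care about

// Image base = runtime address of a known symbol minus its file offset.
const imageBase = leakedCompare - COMPARE_OFFSET;

// Sanity check: a load base is expected to be page-aligned.
const looksAligned = imageBase % 0x1000 === 0;

const otherAddress = imageBase + OTHER_OFFSET;
console.log("base = 0x" + imageBase.toString(16), looksAligned);
console.log("other = 0x" + otherAddress.toString(16));
```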
Step 4 - Get PC Control
At njs_process_script(), there’s a call to njs_console->engine->output():
The njs_engine_t::output struct member is a function pointer that we can overwrite in order to take over the execution flow. We set this to system() as follows:
When the script ends, it triggers a call to system() with a pointer to a njs_engine_t struct (we changed its first bytes to /bin/sh\x00 to pop a shell).
output:
woop woop :^)
Note: Step 1 is still not 100% reliable, I’m trying to look for ways to make it more stable. But overall, I’m very happy with the progress hehe
Spot more variants with CodeQL
I thought it could be useful to cook a CodeQL query that will detect if this pattern repeats itself in more places.
Essentially, the pattern is:
Query(sorry for poor syntax):
This query yielded another two(!) vulnerable JS functions:
- Function: njs_array_prototype_to_spliced (← our original bug described here)
  - Source: array = njs_array_alloc(vm, 0, new_length, 0);
  - Sink: njs_set_array(retval, array);
  - Trigger accessors: njs_value_property_i64_set(…);, njs_value_property_i64_set(…);
- Function: njs_array_prototype_to_sorted
  - Source: array = njs_array_alloc(vm, 0, length, 0);
  - Sink: njs_set_array(retval, array);
  - Trigger accessors: njs_value_property_i64_set(…);, njs_value_property_i64_set(…);
- Function: njs_array_prototype_to_reversed
  - Source: array = njs_array_alloc(vm, 0, length, 0);
  - Sink: njs_set_array(retval, array);
  - Trigger accessors: njs_value_property_i64(…);
I was going to report those too, but then the project maintainer replied with this message to my initial report:
Thank you for the report and a good catch.
The root-cause here is in the step 2. We set a retval value too early. Unfortunately in NJS, the retval value will be visible outside the native JS function even if the exception was thrown. I found similar places in
ugh.
The moral of the story: ctrl+shift+f is faster than modeling a vulnerability with CodeQL.
It was a nice experience tho :^)
Bug #2 - OOB Read (could not exploit 😭)
Description
It is possible to achieve an OOB Read primitive/leak heap memory via String.prototype.replaceAll().
The way NJS stores UTF-8 Strings is by storing a string buffer followed by a “map” of offsets.
The documentation for it can be found in njs_string.h#L45-L71.
A specially-crafted JS script can trigger an off-by-one bug in the function that resolves an offset inside a UTF-8 string, which then can be leveraged to achieve a bigger Out-of-Bound read(more than just one byte).
PoC
Below is the PoC that the fuzzer generated:
After playing around with the logic to figure out what causes the crash, I came up with this PoC:
Analysis
In njs_string_prototype_replace (=String.prototype.replaceAll), during the last iteration of the do { ... } while (pos >= 0 ...); loop, the call to njs_string_utf8_offset() with a pos that points to the last byte of the string triggers unexpected behavior:
This leads to an out-of-bounds access, one element beyond the ‘utf-8 offset table’ of the string:
which, in turn, advances the p pointer by 0x140 bytes (a value that is clearly too big).
From a debugging perspective, this is what the OOB looks like:
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── source:src/njs_string.c+2090 ────
2085
2086 if (map[0] == 0) {
2087 njs_string_utf8_offset_map_init(start, end - start);
2088 }
2089
●→ 2090 start += map[index / NJS_STRING_MAP_STRIDE - 1];
2091 }
2092
2093 for (skip = index % NJS_STRING_MAP_STRIDE; skip != 0; skip--) {
2094 start = njs_utf8_next(start, end);
2095 }
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
gef➤ print index
$61 = 0x100
gef➤ print index/32-1
$62 = 0x7
gef➤ print map[7]
$63 = 0x140
The offset 0x140 is not a real offset, but rather part of a pointer whose last bytes are 0x140:
gef➤ x/wx map
0x613000002994: 0x00000020 --> map[0] first entry
0x613000002998: 0x00000040 --> map[1]
0x61300000299c: 0x00000060 --> map[2]
0x6130000029a0: 0x00000080 --> map[3]
0x6130000029a4: 0x000000a0 --> map[4]
0x6130000029a8: 0x000000c0 --> map[5]
0x6130000029ac: 0x000000e0 --> map[6] last entry
# ... continue reading, but switch to 8 byte alignment(bc this is 64bit system)
gef➤ x/gx (0x6130000029ac+4)
0x6130000029b0: 0x0000610000000140
0x6130000029b8: 0x0000610000000140
0x6130000029c0: 0x0000613000002b80
0x6130000029c8: 0x00000130bebe0200
# another representation
gef➤ print *map@10
$65 = {0x20, 0x40, 0x60, 0x80, 0xa0, 0xc0, 0xe0, 0x140, 0x6100, 0x140}
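The index arithmetic from the debugger session can be replayed in a few lines. This is a model, not njs code: the entry-count formula is my reading of the dump above (a 0x100-character string with valid entries map[0]..map[6]), and mapEntries, mapSlot and readsPastMap are hypothetical helper names.

```javascript
// Mirrors NJS_STRING_MAP_STRIDE as seen in the debugger (index/32-1).
const STRIDE = 32;

// Assumption drawn from the dump: a 0x100-character string has 7 valid
// entries, i.e. floor((length - 1) / STRIDE) entries.
function mapEntries(length) {
  return Math.floor((length - 1) / STRIDE);
}

// Slot that `start += map[index / NJS_STRING_MAP_STRIDE - 1]` reads.
function mapSlot(index) {
  return Math.floor(index / STRIDE) - 1;
}

// True when the slot lies beyond the last valid map entry.
function readsPastMap(length, index) {
  return mapSlot(index) >= mapEntries(length);
}

console.log(readsPastMap(0x100, 0xff));  // last valid index
console.log(readsPastMap(0x100, 0x100)); // index === length, length % 32 === 0
console.log(readsPastMap(0x101, 0x101)); // index === length, length % 32 !== 0
```

Under this model only the middle case steps outside the map, which is the map[7] read shown in the dump.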
At first sight, I thought that the root cause was somewhere in njs_string_utf8_offset(), but after I reported it, the maintainer helped me notice that it is actually in njs_string_index_of()!
Thank you for the report. I looked into the issue. The root cause is the fact that njs_string_index_of() which represent StringIndexOf() for zero-length search strings “finds” the search string even after the last character:
If searchValue is the empty String and fromIndex ≤ the length of string, this algorithm returns fromIndex. The empty String is effectively found at every position within a string, including after the last code unit.
NJS stores strings as UTF8 (not as UTF16 with always 2 bytes characters), to efficiently find a char byte offset by its index NJS stores a sparse offset map after a string body (if a string is UTF8 and its length is >= 32 character).
The sparse map is available only for indices [0, length - 1], so it is invalid to call njs_string_utf8_offset() with indices outside this range.
Because njs_string_index_of() returns valid pos after the last character njs_string_utf8_offset() is called with index equal to string.length.
It seems the bug is only visible when “this” string has >= 32 characters && s.length % 32 == 0.
Theoretical Exploit
Sadly, I could not exploit it due to those two lines at the end of the function:
Because we went out of bounds, p_start will be bigger than string.start + string.size, which flips chain.error from 0 to 1.
As a result, the next line (which creates the final string/return value) will not generate the string, since it checks whether chain.error is set.
Theoretically, if we could somehow manage to make chain.error stay 0, we’d be able to make the exploit work and leak heap data as follows:
output:
Maybe the chain.error == 1 check doesn’t exist in earlier versions of njs(?) idk, I’ll leave this as a task for the reader :^)
I hope this was informative, thanks for reading.