Explaining Apple's Page Protection Layer in A12 CPUs
Jonathan Levin (@Morpheus______), http://newosxbook.com/ - 03/02/2019
About
Apple's A12 kernelcaches, aside from being "1469"-style (monolithic and stripped), also have additional segments marked PPL. These pertain to a new memory protection mechanism introduced in those chips - of clear importance to system security (and, conversely, JailBreaking). Yet up till now there is scarcely any mention of what PPL is, and/or what it does.
I cover pmap and PPL in the upcoming Volume II, but seeing as it's taking me a while, and I haven't written any articles in just about a year (and fresh off binge-watching the article's title inspiration :-), I figured that some detail on how to reverse engineer PPL would be of benefit to my readers. So here goes.
You might want to grab a copy of the iPhone 11,x (i.e. XS/XR - whichever variant, doesn't matter) iOS 12 kernelcache before reading this, since this is basically a step by step tutorial. Since I'm using jtool2, you probably want to grab the nightly build so you can follow along. This also makes for an informal jtool2 tutorial, because anyone not reading the WhatsNew.txt might not be aware of the really powerful features I put into it.
Kernelcache differences
As previously mentioned, A12 kernelcaches have new PPL* segments, as visible with jtool2 -l:
These segments each contain one section, and thankfully are pretty self-explanatory. We have:
__PPLTEXT.__text: The code of the PPL layer.
__PPLTRAMP.__text: "Trampoline" code used to jump into __PPLTEXT.__text.
__PPLDATA.__data: Read/write data.
__PPLDATA_CONST.__const: Read-only data.
The separation of __PPLDATA from __PPLDATA_CONST.__const is similar to the kernelcache's use of __DATA and __DATA_CONST, so KTRR can kick in and protect the constant data from being patched. The __PPLTRAMP segment hints that there is a special code path which must be taken in order for PPL to be active. Presumably, the chip can detect those "well known" segment names and ensure PPL code isn't just invoked arbitrarily somewhere else in kernel space.
PPL trampoline
Starting with the trampoline code, we run jtool2 -d - working on the kernelcache while it's still compressed is entirely fine :-) So we try, only to find out the text section is full of DCD 0x0. jtool2 doesn't filter - I delegate that to grep(1), so we try:
We see that the code in __PPLTRAMP is pretty sparse (lots of DCD 0x0s have been weeded out), but it's not entirely clear how and where we get to this code. We'll get to that soon. Observe that the code is seemingly dependent on X15, which must be less than 68 (per the check at 0xfffffff008f5c020). A bit further down, we see an LDR X10, [X9, X15 ...], which is a classic switch()/table-style statement, using 0xfffffff0077c1f20 as a base:
Peeking at that address, we see:
Clearly, a dispatch table - function pointers aplenty. Where do these lie? Taking any one of them and subjecting to jtool2 -a will locate it:
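The LDR X10, [X9, X15 ...] pattern means the pointer for service #X15 lives at table base + index * 8 (presumably with LSL #3, for 8-byte entries). A trivial model of the lookup - the table base is the one observed above, but everything else here is illustrative, not from the kernelcache:

```python
# Model of the PPL dispatch-table lookup: X10 = *(table_base + X15 * 8).
# TABLE_BASE is the base observed in the disassembly above; the bounds
# check mirrors the one at 0xfffffff008f5c020.
TABLE_BASE = 0xfffffff0077c1f20

def slot_address(index: int) -> int:
    """Address of the dispatch-table slot holding PPL routine #index."""
    if index >= 68:                   # invalid PPL service number
        raise ValueError("invalid PPL routine index")
    return TABLE_BASE + index * 8

print(hex(slot_address(0)))   # 0xfffffff0077c1f20
print(hex(slot_address(2)))   # 0xfffffff0077c1f30
```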
This enables us to locate functions in __PPLTEXT.__text, whose addresses are not exported by LC_FUNCTION_STARTS. So that's already pretty useful. The __PPLTEXT.__text is pretty big, but we can use a very rudimentary decompilation feature, thanks to jtool2's ability to follow arguments:
These are obvious panic strings, and _func_fffffff007a28554 is indeed _panic (jtool2 could have immediately symbolicated vast swaths of the kernel had we used --analyze - I'm deliberately walking step by step here). Note that the panic strings also give us the name of the panicking function. That gets us the symbol names for thirty-something of the functions we found! They're all "pmap...internal", and jtool2 can just dump the strings (thanks for not redacting, AAPL!):
The few functions we do not have can be figured out from the context in which they are called by the non-PPL pmap_* wrappers.
But we still don't know how we get into PPL. Let's go to the __TEXT_EXEC.__text then.
__TEXT_EXEC.__text
The kernel's __TEXT_EXEC.__text is already pretty large, but adding all the kext code into it makes it darn huge. AAPL has also stripped the kernel clean in 1469 kernelcaches - but not before leaving a farewell present in iOS 12 β1: a fully symbolicated kernel (86,000+ symbols). Don't bother looking for that IPSW - in an unusual admission of a mistake, this is the only beta IPSW in history that has been eradicated.
Thankfully, researchers were on to this "move to help researchers", and grabbed a copy while they still could. I did the same, and based jtool2's kernelcache analysis on it. This is the tool formerly known as joker - which, if you're still using, forget about it. I no longer maintain it, because it's now built into jtool2, xn00p (my kernel debugger), and soon QiLin - as fully self-contained and portable library code.
Running an analysis on the kernelcache is blazing fast - on the order of 8 seconds or so on my MBP2018. Try this:
When used on a function name, jtool2 automatically disassembles to the end of the function. We see that this is merely a wrapper over _func_0xfffffff007b80610. So we inspect what's there:
So...:
And, as we can see, here is the X15 we saw back in __PPLTRAMP! Its value gets loaded and then a common jump to a function at 0xfffffff0079e44cc (already symbolicated as _ppl_enter in the above example). Disassembling a bit before and after will reveal a whole slew of these MOVZ,B,MOVZ,B,MOVZ,B... Using jtool2's new Gadget Finder (which I just imported from disarm):
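Since every wrapper stub is just a MOVZ X15, #n followed by a B to _ppl_enter, recovering the service number from a stub is a matter of decoding the MOVZ immediate. A small sketch of that decoding, using the standard A64 MOVZ encoding (illustrative Python, not jtool2's or disarm's actual code):

```python
# A64 MOVZ (64-bit, LSL #0): 0xD2800000 | (imm16 << 5) | Rd.
# Each PPL wrapper is "MOVZ X15, #n; B _ppl_enter", so the service
# number n is the imm16 field of a MOVZ whose Rd is 15.

def encode_movz_x15(n: int) -> int:
    """Encode MOVZ X15, #n (64-bit, no shift)."""
    return 0xD2800000 | ((n & 0xFFFF) << 5) | 15

def decode_ppl_index(insn: int) -> int:
    """Return the PPL service number from a MOVZ X15, #n instruction."""
    # Fixed opcode bits + hw=00 + Rd==15 must match:
    assert insn & 0xFFE0001F == 0xD280000F, "not a MOVZ X15, #imm"
    return (insn >> 5) & 0xFFFF

print(hex(encode_movz_x15(5)))   # 0xd28000af
```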
Naturally, a pattern as obvious as this cannot go unnoticed by the joker module, so if you did run jtool2 --analyze all these MOVZ,B snippets will be properly symbolicated. But now let's look at the common code, _ppl_enter:
Hmm. All zeros. Note there was a check for that (at fffffff0079e44fc), which redirected us to 0xfffffff0079e454c. There, we find:
So, again, a check for > 68, on which we'd panic (and we know this function is ppl_dispatch!). Otherwise, a switch style jump (BLRAA X10 = Branch and Link Authenticated with Key A) to 0xfffffff0077c1f20 - the table we just discussed above.
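Put together, ppl_dispatch amounts to a bounds check followed by an indirect (authenticated, via BLRAA, on real hardware) call through the table. A Python model of just the control flow - the handler names below are illustrative stand-ins, not actual slot assignments:

```python
# Control-flow model of ppl_dispatch: out-of-range service numbers
# panic; valid ones are dispatched through the table at
# 0xfffffff0077c1f20 (BLRAA authenticates the pointer with key A).
# The slot contents here are made up for illustration.
PPL_TABLE = {
    0: lambda: "handler_0",   # stand-in, not a real slot assignment
    1: lambda: "handler_1",   # stand-in
}

def ppl_dispatch(x15: int):
    if x15 >= 68:
        # panic("ppl_dispatch: invalid op ...") in the real code
        raise RuntimeError("ppl_dispatch: invalid op %d" % x15)
    return PPL_TABLE[x15]()   # BLRAA X10 on hardware
```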
So what is this 0xfffffff008f70070? An integer, which is likely a boolean, since it gets STR'ed with one. We can call this one _ppl_initialized, or possibly _ppl_locked. You'll have to ask AAPL for the symbol (or wait for iOS 13 β ;-). But I would go for _ppl_locked, since there is a clear setting of this value to '1' in _machine_lockdown() (which I have yet to symbolicate in jtool2):
Therefore, PPL will have been locked by the time _ppl_enter does anything, meaning it will jump to 0xfffffff008f5bfe0. That's in __PPLTRAMP - right where we started. To save you scrolling up, let's look at this code, piece by piece:
We start by reading the DAIF, which is the set of SPSR flags holding interrupt state. We then block all interrupts. Next, a load of a rather odd value into S3_4_C15_C2_1, which jtool2 (unlike *cough* certain Rubenesque disassemblers) can correctly identify as a special register - ARM64_REG_APRR_EL1. An Instruction Sync Barrier (ISB) follows, and then a check is made that the setting of the register "stuck". If it didn't, or if X15 is over 68 - we go to fffffff008f5c0d8. And you know where that's going, since X15 greater than 68 is an invalid operation.
The register will then be set to another odd value (...6477), followed by another check on X15 - which won't pass, since we know it was set to 2 back at fffffff008f5c0d8. X10, if you look back, holds the DAIF: it was moved from X20, which held the DAIF saved back at fffffff008f5bfe0, where X15 was set. This is corroborated by the TST/B.EQ pairs, which jump to clear the corresponding DAIF_.. flags.
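That restore sequence is easy to model: D, A, I and F are the PSTATE mask bits 9 through 6, and the TST/B.EQ pairs re-enable exactly the interrupt classes that were enabled on entry. A sketch:

```python
# PSTATE.DAIF mask bits: D (debug) = bit 9, A (SError) = bit 8,
# I (IRQ) = bit 7, F (FIQ) = bit 6. The trampoline saves DAIF and
# masks everything on entry; on the way out, each TST/B.EQ pair
# clears a mask bit only if it was clear in the saved copy.
DAIF_D, DAIF_A, DAIF_I, DAIF_F = 1 << 9, 1 << 8, 1 << 7, 1 << 6
ALL_MASKED = DAIF_D | DAIF_A | DAIF_I | DAIF_F

def restore_daif(saved: int) -> int:
    """DAIF value after the TST/B.EQ restore sequence."""
    daif = ALL_MASKED                 # everything masked inside PPL
    for bit in (DAIF_D, DAIF_A, DAIF_I, DAIF_F):
        if not (saved & bit):         # was unmasked on entry...
            daif &= ~bit              # ...so unmask it again
    return daif
```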
Then, at 0xfffffff008f6403c, there is another check on X15 - but remember, it's 2. So no go. There is a check on the current thread_t (held in TPIDR_EL1) at offset 1136, and if that's not zero, we'll end up at 0x...79e3bbc, which is:
A call to panic - and, funnily enough, though jtool v1 could show the string, jtool2 can't yet, because it's embedded as data in code. JTOOL2 ISN'T PERFECT, AND, YES, PEDRO, IT MIGHT CRASH ON MALICIOUS BINARIES. But it works superbly well on AAPL binaries, and I don't see Hopper/IDA types getting this far without resorting to scripting and/or Internet symbol databases. With that disclaimer aside, the panic is:
Which is the code of _preempt_underflow (from osfmk/arm64/locore.s), so it makes perfect sense. Else, we branch, go through ast_taken_kernel() (func_fffffff007a1e000; ASTs are irrelevant for this discussion, and covered in Volume II anyway), and then reach the very last snippet of code, which is the familiar error message we encountered earlier:
...
The APRR Register
So what are the references to S3_4_C15_C2_1, a.k.a ARM64_REG_APRR_EL1 ? The following disassembly offers a clue.
We see that the value of the register is read into X0, and compared to 0x4455445464666477. If it doesn't match, a call is made to ..fffffff0079e3e10, which checks the value of our global at 0xfffffff008f70070. If it's 0, we move elsewhere. Otherwise, we check that the register value is 0x4455445564666677 - and if not, we hang (fffffff0079e3e34 branches to itself on not equal).
In other words, the value of the 0xfffffff008f70070 global correlates with 0x4455445464666477 and 0x4455445564666677 (I know, confusing - blame AAPL, not me) in ARM64_REG_APRR_EL1, implying that the register provides the hardware-level lockdown, whereas the global tracks the state.
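Since the two magics differ in only a couple of hex digits, diffing them nibble by nibble shows exactly which 4-bit fields the lock/unlock transition actually touches. A quick sketch (the two constants are the ones observed above; the per-nibble interpretation is left for later):

```python
# The two APRR_EL1 magic values seen in the disassembly. Diffing them
# nibble by nibble shows which 4-bit fields change between the locked
# (inside-PPL) and unlocked (normal kernel) states.
APRR_LOCKED   = 0x4455445564666677   # ...6677: PPL entered
APRR_UNLOCKED = 0x4455445464666477   # ...6477: PPL exited

def nibble_diff(a: int, b: int):
    """Indices (0 = least significant nibble) where a and b differ."""
    return [i for i in range(16)
            if (a >> (4 * i)) & 0xF != (b >> (4 * i)) & 0xF]

print(nibble_diff(APRR_LOCKED, APRR_UNLOCKED))   # → [2, 8]
```

So only two fields flip between the states - consistent with a small set of attribute indices changing permissions on lockdown.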
DARTs, etc
We still haven't looked at the __PPLDATA_CONST.__const. Let's see what it has (Removing the companion file so jtool2 doesn't symbolicate and blow the suspense just yet):
(Minor note: I had to switch to another machine, since my 2016 MBP's keyboard spontaneously DIED ON ME while doing this. The $#%$#%$# device won't boot from a Bluetooth keyboard, so after I had to reboot it, I can't get past the EFI screen till I get a USB keyboard - and hope that works. I'm using a slightly different kernel, but the addresses are largely the same.)
We see what appears to be three distinct structs here, identified as "ans2_sart", "t8020dart" and "nvme_ppl". (DART = Device Address Resolution Table). There are also six function pointers in each, and (right after the structure name) what appears to be three fields - two 16-bit shorts (0x0003, 0x0001) and some magic (0xdeaddab7).
It's safe to assume, then, that if we find the symbol name for a function at slot x, the corresponding functions for the other structures at the same slot will be similarly named. Looking through panic()s again, we find error messages which show us that 0xfffffff008f541cc is an init(), 0xfffffff008f54568 is a map() operation, and 0xfffffff008f5479c is an unmap(). Some of these calls appear to be no-ops in some cases, and fffffff008f55280 has a switch, which implies it's likely ioctl()-style. Putting it all together, we have:
which we can then apply to the NVMe and T8020DART. Note these look exactly like the ppl_map_iommu_ioctl* symbols we could obtain from the __TEXT.__cstring, with two unknowns remaining, possibly for allocating and freeing memory.
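For concreteness, here is one plausible reading of the descriptor layout as a parseable struct: a name pointer, the two 16-bit fields, the 0xdeaddab7 magic, and six function pointers. The field order and widths are my guess from the dump above, not a confirmed definition, and the example bytes are fabricated:

```python
import struct

# One plausible layout for the PPL IOMMU descriptors seen in
# __PPLDATA_CONST.__const. This is a guess at the field order based
# on the dump above, not a confirmed struct definition.
DESC_FMT = "<QHHI6Q"    # name ptr, two u16s, u32 magic, 6 func ptrs

def parse_desc(blob: bytes):
    name_ptr, f1, f2, magic, *funcs = struct.unpack(DESC_FMT, blob)
    assert magic == 0xdeaddab7, "not a PPL IOMMU descriptor"
    return {"name_ptr": name_ptr, "fields": (f1, f2), "funcs": funcs}

# Fabricated example bytes, for illustration only:
blob = struct.pack(DESC_FMT, 0xfffffff007000000, 3, 1, 0xdeaddab7,
                   *range(6))
print(parse_desc(blob)["fields"])   # (3, 1)
```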
However, looking at __PPLTEXT we find no references to our structures, so we have to look through the kernel's __TEXT_EXEC.__text instead. Using jtool2's disassembly with grep(1) once more, this is easy and quick:
It's safe to assume, then, that the corresponding functions are "...art_get_struct" or something similar, so I added them to jokerlib as well - though I couldn't offhand find any references to these getters.
Other observations
Pages get locked down by PPL at the Page Table Entry level. There are a few occurrences of code similar to this:
And this:
Which suggests that two bits are used: 59 and 60 - 59 is likely "locked down", and 60 "executable". @S1guza (who meticulously reviewed this article) notes that these are the PBHA fields - Page Based Hardware Attribute bits - which can be IMPLEMENTATION DEFINED:
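In code, checking those attributes is simple masking. A sketch, assuming (per the above) that bit 59 marks a page as PPL-protected and bit 60 as executable - the names are mine, not AAPL's:

```python
# PBHA bits in a 64-bit ARM PTE, as inferred from the code above:
# bit 59 appears to mark a page as PPL-protected, bit 60 as
# executable. These constant names are guesses, not official symbols.
PTE_PPL_PROTECTED = 1 << 59
PTE_PPL_EXEC      = 1 << 60

def is_ppl_protected(pte: int) -> bool:
    return bool(pte & PTE_PPL_PROTECTED)

def is_ppl_exec(pte: int) -> bool:
    return bool(pte & PTE_PPL_EXEC)
```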
Most people will just run jtool2 --analyze on the kernelcache, then take the symbols and upload them to IDA. This writeup shows you the behind-the-scenes of that analysis, and explains the various PPL facility services.
PPL-protected pages are marked using otherwise unused PTE bits (#59 - PPL, #60 - executable).
The special APRR register locks down at the hardware level, similar to KTRR's special registers.
Access to these PPL-protected pages can only be performed when the APRR register is locked down (0x4455445564666677). This happens on entry to the trampoline; on exit from the trampoline code, the register is set to 0x4455445464666477. I.e., ...6677 locks, ...6477 unlocks.
Why AAPL chose these values, I had no idea (DUDU? Dah Dah?). But it TURNS OUT the values are just permissions! r--r--r-xr-x... etc. Thanks, @S1guza! Checking for the magic (by MRSing the register) will tell you whether the system is PPL locked down or not.
For those of you who use IDA: get them to update their special registers already (and add jtool2's/disarm's :-). APRR_EL1 is S3_4_C15_C2_1. There's also APRR_EL0 (S3_4_C15_C2_0), and some mask register in S3_4_C15_C2_6. There may be more.
A global in kernel memory is used as an indication that PPL has been locked down.
The PPL service table (entered through _ppl_enter, with its functions all in __PPLTEXT.__text) can be found, and its services are enumerable, as follows:
To get these (~150) PPL symbols yourself, on any kernelcache.release.iphone11, simply use the jtool2 binary and export PPL=1. This is a special build for this article - in the next nightly, this will be the default.
Q&A
When is Volume II coming out? In a matter of weeks, I hope.
Thanks
@S1guza - for a thorough review of the article, and for reading the ARM64 specs like few have or ever will.
Luca - for reviewing, and redacting the reason why the article's namesake applies in more than one way :-), and for pointing out that he gave a talk on this at Tensec and BlueHat.il, which is well worth reading.