Making Strides in APFS Data Recovery
In a recent series of articles on Gillware’s data recovery blog, our Mac data recovery expert Charles took some time out of his busy schedule in our data recovery lab to talk about macOS High Sierra and its brand-new file system, APFS (Apple File System). During the upgrade, High Sierra converts your SSD to the new file system without requiring a reformat, preserving all of your data and apps. As one can imagine, this is a delicate operation for your computer to perform, and in an operation like this, small errors can have significant consequences.
Unfortunately, it’s possible for the High Sierra upgrade process to fail and leave you with a Mac that won’t boot, even if nothing happens that would typically induce complications, such as a power outage. Fortunately, a recent experimental data recovery case conducted in Gillware’s lab has shown that data recovery is possible in these situations for storage devices formatted with APFS.
An APFS Data Recovery Experiment
A new file system presents new hurdles for data recovery experts. Everyone knows their way around the established file systems: those have been dissected and pored over, backward and forward. That kind of documentation and those resources do not yet exist in publicly available form for APFS. As of right now, APFS data recovery depends on incremental progress made through research and practice. Fortunately, that is exactly what happens in data recovery labs like Gillware’s.
The Scenario
Imagine this scenario (although if you’re unlucky, you won’t have to imagine it):
You upgrade your Mac to High Sierra, and your computer seems to go through the entire process without a hitch… until it reboots. At this point, your computer spits out a “no bootable device” error, and you’re unable to boot up.
When someone we knew (thankfully with plenty of backups for their data) fell victim to this same High Sierra upgrade crash, they agreed to donate the computer’s solid-state drive for Gillware’s research efforts.
The Evaluation
The SSD itself was in perfect health, our engineers found. We made a perfect clone of the device on a matching SSD, then returned the drive to the owner so they could reuse it. Then we got to work.
Because the SSD was in perfect working order, we knew the failure was entirely logical: somewhere, something had become corrupted. Our evaluation of the drive’s contents showed three partitions: the boot partition, the recovery partition, and the data partition. This scheme was consistent with a Mac using FileVault full drive encryption. As we discovered, the Mac could not boot because of corruption in the data partition, i.e., where all of the user’s stuff was.
We explored this corrupted partition using a hex editor, combing through the drive on an LBA-by-LBA (logical block address) level to see what we could see, and it wasn’t long until we discovered what the problem was with the user’s data partition.
“F” You
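Like a drive formatted with HFS+ before it, a drive formatted with APFS uses a GUID partition table, or GPT, to record where each partition begins and ends and what kind of file system it holds. Each entry in the GUID partition table contains globally unique identifiers (GUIDs), including a type GUID that tells the operating system how to interpret the partition.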
In the GPT on the solid-state drive, the data partition’s entry should have held a globally unique identifier 128 bits (16 bytes) in length that would tell the computer to expect an APFS file system in the data partition and treat the data within accordingly. Instead, our Mac expert Charles found an unbroken string of F’s filling the 128-bit space where the APFS GUID should have been. The missing GUID explained precisely why the Mac was having trouble comprehending that partition’s APFS file system.
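To make the idea of a type GUID concrete, here is a minimal Python sketch that classifies the first 16 bytes of a GPT partition entry. The EFI value is the one shown in this article’s gpt output; the APFS value is the partition type GUID commonly documented for APFS containers and does not come from this case. The function name and the lookup table are illustrative only, not a tool used in our lab.

```python
import uuid

# EFI type GUID as it appears in the gpt output in this article.
EFI_TYPE = uuid.UUID("C12A7328-F81F-11D2-BA4B-00A0C93EC93B")
# Commonly documented APFS container type GUID (assumption: not taken from this case).
APFS_TYPE = uuid.UUID("7C3457EF-0000-11AA-AA11-00306543ECAC")
# The value Charles found in place of the APFS GUID.
CORRUPT_TYPE = uuid.UUID("FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF")

KNOWN_TYPES = {EFI_TYPE: "EFI", APFS_TYPE: "Apple_APFS"}


def describe_type_guid(raw: bytes) -> str:
    """Classify the 16-byte type GUID at the start of a GPT partition entry."""
    if len(raw) != 16:
        raise ValueError("a GUID is exactly 16 bytes (128 bits)")
    # GPT stores GUIDs on disk in mixed-endian order, which uuid calls bytes_le.
    guid = uuid.UUID(bytes_le=raw)
    if guid == CORRUPT_TYPE:
        return "corrupt (all F's)"
    return KNOWN_TYPES.get(guid, "unknown: " + str(guid).upper())


print(describe_type_guid(b"\xff" * 16))        # corrupt (all F's)
print(describe_type_guid(APFS_TYPE.bytes_le))  # Apple_APFS
```

An operating system performs a lookup along these lines when it decides which file system driver should handle a partition, which is why an all-F’s value leaves it with nothing to go on.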
In general, corruption that affects the GUID Partition Table or Master Boot Record of a storage device is a common cause of boot device errors. In many cases, these errors can be repaired with no loss of the user’s data—just a little bit of their time and sanity. However, in some situations, these types of errors can only be fixed by a destructive reformat or OS reinstall. In these situations, data recovery requires software tools smart enough to take cues from the surrounding environment and infer their way around the drive’s logical architecture. As of February 2018, there aren’t any software tools that can do that for APFS yet. APFS is too new and hasn’t been explored as thoroughly as other filesystems at this point.
Finding a Solution
Now that we knew what was wrong, it was up to our Mac data recovery expert Charles to figure out what we would need to do to recover the data from the corrupted APFS partition.
Charles considered manually overwriting the string of F’s with a real APFS GUID. However, this approach runs into the checksum values within the GPT. The partition entries, including the APFS GUID, are protected by a 32-bit CRC (cyclic redundancy check, invented by W. Wesley Peterson) checksum stored elsewhere in the GPT. As a result, simply copying and pasting a GUID from another source would not pan out: the stored checksum would no longer match the edited data. Every affected checksum would have to be recalculated and rewritten by hand as well, which is tedious and easy to get wrong in a hex editor.
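The dependency between a partition entry and the GPT’s checksum can be sketched in a few lines of Python. This is a simplified model rather than a GPT editor: it assumes the usual layout of 128 entries of 128 bytes each, uses a placeholder unique GUID, omits attributes and the partition name, and relies on zlib.crc32, which implements the same CRC-32 variant that GPT uses. The APFS type GUID is the commonly documented value, not one read from this drive.

```python
import struct
import uuid
import zlib

ENTRY_SIZE = 128   # bytes per partition entry (usual GPT value, assumed here)
ENTRY_COUNT = 128  # entries in the array (usual GPT value, assumed here)

APFS_TYPE = uuid.UUID("7C3457EF-0000-11AA-AA11-00306543ECAC")
CORRUPT_TYPE = uuid.UUID("FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF")


def make_entry(type_guid: uuid.UUID, first_lba: int, last_lba: int) -> bytes:
    """Build a simplified entry: type GUID, unique GUID, first and last LBA."""
    unique_guid = uuid.UUID(int=1)  # hypothetical placeholder, not a real value
    body = type_guid.bytes_le + unique_guid.bytes_le
    body += struct.pack("<QQ", first_lba, last_lba)
    return body.ljust(ENTRY_SIZE, b"\x00")  # zero-fill the fields left out


def entry_array_crc(entries) -> int:
    """CRC32 over the whole entry array, with unused slots zero-filled."""
    array = b"".join(entries).ljust(ENTRY_SIZE * ENTRY_COUNT, b"\x00")
    return zlib.crc32(array) & 0xFFFFFFFF


# Start and size of the data partition, taken from the gpt output in this article.
first = 409640
last = first + 408891576 - 1

crc_with_apfs = entry_array_crc([make_entry(APFS_TYPE, first, last)])
crc_with_fs = entry_array_crc([make_entry(CORRUPT_TYPE, first, last)])

# Swapping only the type GUID changes the checksum the header must hold.
print(crc_with_apfs != crc_with_fs)
```

The GPT header carries its own CRC32 as well, and the disk holds a backup copy of both structures, so a hand edit to one GUID sets off a chain of values that all have to be recomputed consistently.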
What had seemed like a simple fix for an experienced data recovery engineer was, in fact, anything but simple. Charles wasn’t ready to give up, though.
A checksum is a small value, computed from a piece of data, that tracks whether that data has been transmitted or stored faithfully. Running a checksum function or algorithm over the data shows whether any portion of it has become corrupted or damaged. Applying the algorithm to the data in question will always yield the same checksum value, unless the data has been altered. If even a single bit changes, the algorithm will generate a different value that no longer matches the stored checksum. Checksums can also act as a gatekeeper: software may refuse to use data whose checksum does not match the value it expects.
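As a small illustration of the checksum idea, the Python sketch below stores a CRC32 alongside some data and then checks it again after one bit has been flipped. The sample bytes and the function names are made up for the example.

```python
import zlib


def checksum(data: bytes) -> int:
    """Return the CRC32 of the data as an unsigned 32-bit number."""
    return zlib.crc32(data) & 0xFFFFFFFF


def is_intact(data: bytes, stored: int) -> bool:
    """True if the data still produces the checksum recorded earlier."""
    return checksum(data) == stored


original = b"partition metadata"  # made-up sample data
stored = checksum(original)       # recorded when the data was written

damaged = bytearray(original)
damaged[0] ^= 0x01                # flip a single bit

print(is_intact(original, stored))        # True
print(is_intact(bytes(damaged), stored))  # False
```

A CRC is guaranteed to catch any single-bit change, which is what makes it useful for spotting accidental corruption. It is not designed to resist deliberate tampering.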
Finding a Better APFS Data Recovery Solution
To recover the data from this corrupted APFS partition, Charles hatched a bold plan. After connecting the SSD to a Mac running High Sierra, Charles used the terminal to delete the corrupted partition.
In the terminal output below you can see the solid-state drive’s “state of the union” before Charles made any changes. The string of F’s filling the GUID for APFS is visible:
diskutil list
/dev/disk2 (external, physical):
   #:                                 TYPE NAME      SIZE        IDENTIFIER
   0:                GUID_partition_scheme           *500.1 GB   disk2
   1:                                  EFI EFI        209.7 MB   disk2s1
   2: FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF            209.4 GB   disk2s2

gpt -r show /dev/disk2
      start       size  index  contents
          0          1         PMBR
          1          1         Pri GPT header
          2         32         Pri GPT table
         34          6
         40     409600      1  GPT part - C12A7328-F81F-11D2-BA4B-00A0C93EC93B
     409640  408891576      2  GPT part - FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF
  409301216  567471919
  976773135         32         Sec GPT table
  976773167          1         Sec GPT header
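The numbers in the gpt output can be cross-checked against the sizes that diskutil reports. The Python sketch below copies the start and size columns from the output above and assumes 512-byte logical sectors; the row labels, including “free” for the two rows with no contents, are our own shorthand.

```python
SECTOR_BYTES = 512  # assumed logical sector size

# (start LBA, size in sectors, label) copied from the gpt output above.
ROWS = [
    (0, 1, "PMBR"),
    (1, 1, "Pri GPT header"),
    (2, 32, "Pri GPT table"),
    (34, 6, "free"),
    (40, 409600, "EFI"),
    (409640, 408891576, "data partition"),
    (409301216, 567471919, "free"),
    (976773135, 32, "Sec GPT table"),
    (976773167, 1, "Sec GPT header"),
]


def is_contiguous(rows) -> bool:
    """Each region must end exactly where the next one starts."""
    return all(start + size == nxt
               for (start, size, _), (nxt, _, _) in zip(rows, rows[1:]))


def size_bytes(sectors: int) -> int:
    """Convert a sector count to bytes."""
    return sectors * SECTOR_BYTES


total_sectors = ROWS[-1][0] + ROWS[-1][1]

print(is_contiguous(ROWS))                       # True
print(round(size_bytes(409600) / 1e6, 1))        # 209.7 (MB, the EFI partition)
print(round(size_bytes(408891576) / 1e9, 1))     # 209.4 (GB, the data partition)
print(round(size_bytes(total_sectors) / 1e9, 1)) # 500.1 (GB, the whole drive)
```

The start and size values for index 2 are the figures that matter most here: they describe where the data partition sits on the drive and how large it is, independent of the damaged type GUID.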
Next, Charles carefully deleted the partition. The exact steps taken have been withheld to prevent any would-be DIY data recovery enthusiast from potentially ruining their computer.
Of course, this action deleted only the partition’s metadata, not any of the data within the partition. Then Charles rewrote the partition entry with near-surgical precision. This operation was not anywhere near as simple as it sounds. The new partition had to have all the same metadata as the old one, down to the partition label. It also had to be exactly the right size and start at exactly the right location. And not only that, but it had to be perfect down to the last byte.
Here’s what it looked like in the terminal now:
sudo diskutil list /dev/disk2
/dev/disk2 (external, physical):
   #:                       TYPE NAME      SIZE        IDENTIFIER
   0:      GUID_partition_scheme           *500.1 GB   disk2
   1:                        EFI EFI        209.7 MB   disk2s1
   2:                 Apple_APFS            209.4 GB   disk2s2
Everything seemed to be in order. With the “new” APFS partition made, now came the moment of truth.
Charles booted from the SSD to see what would happen… and saw none other than the familiar sight of the FileVault password entry screen. Past that, all of the user’s data was clearly visible and accessible. Our APFS data recovery solution had not only worked; it had delivered stellar results.
APFS Data Recovery Is Possible
Every new development in the world of data storage means new challenges for data recovery experts. When it comes to APFS data recovery, there is still a long road ahead as we learn more about APFS and develop new tools to recover data from APFS file systems in a broader range of data loss situations, such as severe storage media failure and accidental file deletion. We will meet these new challenges, just as we have cleared every previous hurdle to data recovery, with the help of the ingenuity of our data recovery engineers.