Using the Bitmap to Make Data Recovery More Efficient
When you suddenly can’t access your files, but nothing seems wrong with the hard drive, how can you get your data back?
Data recovery cases often depend on getting past broken links in the organizational structure the drive relies on to find your pictures, documents and other files. In a previous post, we took a general look at some of that organizational structure, or “metadata,” of a hard drive. In this post, I want to offer a more specific look at how one element of metadata on drives formatted by Windows, called the bitmap, can be used to make data recovery faster and more effective. The bitmap is useful in recoveries that involve logical puzzles and also in cases where the drive has physically failed.
The bitmap exists on NTFS-formatted drives. NTFS, or New Technology File System, is a way to organize data. It was developed by Microsoft and is found on drives used with or formatted by Windows. Macs use a different way to organize data, as do other operating systems. The bitmap, as its name suggests, gives the lay of the land. It exists as a hidden file called $Bitmap at the root of each NTFS partition. It shows the operating system where data is stored on your hard drive and where there is available space to write new data.
To understand how it does this, let’s first take a quick look at how data exists on a hard drive. The 1s and 0s you’ve probably heard about are in reality tiny patches of metallic film that are either magnetized or not. They are arranged in concentric circles on both sides of the spinning discs inside a hard drive. A group of eight of these 1s and 0s is called a byte, and a byte has 256 possible values, since eight two-way switches (the 1s and 0s) give you two to the eighth power combinations. These values are assigned meanings. For example, the byte 01100001 in binary code translates to the letter “a” in ASCII text. Contiguous bytes are organized into a sector, typically 512 (another power of two) bytes per sector. Contiguous sectors in turn are grouped into clusters. Cluster sizes vary, but 8 sectors per cluster, resulting in 4 kilobyte clusters, is common. A file (for example, a photograph of your dog) may occupy several clusters, which may or may not be next to one another.
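The arithmetic in that paragraph can be checked in a few lines of Python. This is only an illustration of the numbers quoted above; the 10,000-byte file size is a made-up example, and real drives may use other sector and cluster sizes.

```python
BITS_PER_BYTE = 8
BYTES_PER_SECTOR = 512      # typical sector size mentioned above
SECTORS_PER_CLUSTER = 8     # a common, but not universal, cluster layout

# Eight two-way switches give 2^8 possible byte values.
values_per_byte = 2 ** BITS_PER_BYTE

# The byte 01100001 is the ASCII code for the letter "a".
letter = chr(0b01100001)

# 8 sectors of 512 bytes make a 4 kilobyte cluster.
cluster_bytes = BYTES_PER_SECTOR * SECTORS_PER_CLUSTER


def clusters_needed(file_size_bytes, cluster_size=cluster_bytes):
    """Whole clusters needed to hold a file, rounding up."""
    return -(-file_size_bytes // cluster_size)


print(values_per_byte)          # 256
print(letter)                   # a
print(cluster_bytes)            # 4096
print(clusters_needed(10_000))  # 3 (hypothetical 10,000-byte photo)
```

Note that the file is rounded up to whole clusters: the last cluster is only partly filled, but it still counts as occupied.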
The bitmap is a file that simply records which clusters are in use. For each cluster, the bitmap file stores a single bit: a 1 if that cluster holds current data, or a 0 if it is available space. As you alter the data on your drive, the bitmap adjusts. If you delete a file, the bitmap will show the area it occupied as available space, with 0s for those clusters. (This is not the same thing as “zero filling” a drive, which overwrites the stored data itself with 0s.) If you write data to the drive, the bitmap flips the switches to 1s for the clusters that the new data now occupies.
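To make the idea concrete, here is a minimal sketch of a one-bit-per-cluster map in Python. The class and method names are hypothetical, not part of any NTFS tool, and the bit ordering (cluster 0 stored in the lowest bit of the first byte) is an assumption of this sketch.

```python
class ClusterBitmap:
    """Toy allocation bitmap: one bit per cluster, 1 = used, 0 = free."""

    def __init__(self, total_clusters):
        self.total_clusters = total_clusters
        # Round up so every cluster has a bit, eight clusters per byte.
        self.bits = bytearray((total_clusters + 7) // 8)

    def is_used(self, cluster):
        return bool((self.bits[cluster // 8] >> (cluster % 8)) & 1)

    def mark_used(self, cluster):
        self.bits[cluster // 8] |= 1 << (cluster % 8)

    def mark_free(self, cluster):
        self.bits[cluster // 8] &= ~(1 << (cluster % 8)) & 0xFF


bitmap = ClusterBitmap(16)

# Writing a file flips the bits for its clusters to 1.
for cluster in (2, 3, 9):
    bitmap.mark_used(cluster)

# Deleting part of it flips the bit back to 0.
bitmap.mark_free(3)

used = [c for c in range(bitmap.total_clusters) if bitmap.is_used(c)]
print(used)  # [2, 9]
```

One bit per cluster keeps the map tiny: in this layout, sixteen clusters fit in two bytes.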
It’s important to stop here to make a distinction. Data can exist on your hard drive without being recorded by the bitmap or being part of the overall structure of organized data. The bitmap only keeps track of the relevant stuff: the stuff your computer considers saved data. For example, if you delete a file, the 0s and 1s that make it up are not automatically overwritten with anything, but the file is no longer relevant. The clusters the deleted file occupied will now be considered available space. In the bitmap, those cluster addresses will now be marked with 0s.
On the actual surface of the disk, those magnetized/not-magnetized patches (the 1s and 0s) of the deleted file still exist, but they are not protected. The next time the drive records data, it is free to write over the file that was deleted. This is why it’s important to stop using your computer if you accidentally delete important data. If you continue to use the machine, you risk the hard drive writing data over the file you’ve lost.
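The difference between “marked as free” and “actually gone” can be shown with a toy model. Everything here is hypothetical: the `ToyVolume` class, the 4-byte cluster size, and the rule of always writing to the first free clusters are simplifications chosen for the demonstration, not a description of how NTFS allocates space.

```python
CLUSTER_SIZE = 4  # unrealistically small, to keep the example readable


class ToyVolume:
    def __init__(self, total_clusters):
        self.data = bytearray(total_clusters * CLUSTER_SIZE)  # the "surface"
        self.used = [False] * total_clusters                  # the "bitmap"

    def write(self, payload):
        """Store payload in the first free clusters; return their numbers."""
        needed = -(-len(payload) // CLUSTER_SIZE)
        free = [c for c, u in enumerate(self.used) if not u][:needed]
        if len(free) < needed:
            raise OSError("volume full")
        for k, c in enumerate(free):
            chunk = payload[k * CLUSTER_SIZE:(k + 1) * CLUSTER_SIZE]
            self.data[c * CLUSTER_SIZE:c * CLUSTER_SIZE + len(chunk)] = chunk
            self.used[c] = True
        return free

    def delete(self, clusters):
        """Deleting only clears the bitmap entries; the bytes stay put."""
        for c in clusters:
            self.used[c] = False

    def read_raw(self, clusters):
        return b"".join(
            bytes(self.data[c * CLUSTER_SIZE:(c + 1) * CLUSTER_SIZE])
            for c in clusters
        )


vol = ToyVolume(4)
photo = vol.write(b"DOGPHOTO")      # occupies clusters [0, 1]
vol.delete(photo)

after_delete = vol.read_raw(photo)
print(after_delete)                 # b'DOGPHOTO' (still on the surface)

vol.write(b"NEW!")                  # continued use reuses cluster 0
after_new_write = vol.read_raw(photo)
print(after_new_write)              # b'NEW!HOTO' (partly overwritten)
```

The second print is the reason to stop using the machine: one new write was enough to destroy half of the deleted file.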
Now, let’s look at how all this information about the bitmap applies to data recovery.
If you are recovering data from a drive that failed mechanically (say it stopped spinning) and you can read the bitmap, you can use it to image only the used area, copying just the clusters the bitmap marks with 1s. This can save a considerable amount of time. The alternative, which is the way most data recovery software works, is to start at Sector 0 and just grind away until every sector is read. Not only does this take unnecessary time, it can put the data at risk if the drive is severely troubled. If the drive had damage to the read/write heads, and perhaps some light rotational scoring, a complete read starting from Sector 0 may cause the replacement heads to fail, perhaps resulting in more surface scratches. The attempt to image the drive in this crude way could render the data permanently unrecoverable. So, if you are using data recovery software, and it just hangs and hangs, or seems to be making no discernible progress, shut it down.
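As a rough sketch of what “image only the used area” means, the code below turns a bitmap into a short list of sector ranges to read. The function names are hypothetical, and the sketch assumes that cluster 0 is the lowest bit of the first bitmap byte and that cluster 0 begins at the first sector of the partition. Real imaging tools also have to deal with read errors, retries and head maps, none of which is shown here.

```python
def cluster_is_used(bitmap, cluster):
    """True if the bit for this cluster is 1 (assumed lowest-bit-first order)."""
    return bool((bitmap[cluster // 8] >> (cluster % 8)) & 1)


def used_runs(bitmap, total_clusters):
    """Collapse the bitmap into (first_cluster, cluster_count) runs of used space."""
    runs = []
    start = None
    for cluster in range(total_clusters):
        if cluster_is_used(bitmap, cluster):
            if start is None:
                start = cluster
        elif start is not None:
            runs.append((start, cluster - start))
            start = None
    if start is not None:
        runs.append((start, total_clusters - start))
    return runs


def runs_to_sector_ranges(runs, sectors_per_cluster=8, partition_start_sector=0):
    """Convert cluster runs into (first_sector, sector_count) ranges to image."""
    return [
        (partition_start_sector + first * sectors_per_cluster,
         count * sectors_per_cluster)
        for first, count in runs
    ]


# Toy 16-cluster volume: clusters 0-3 and 14-15 are in use.
toy_bitmap = bytes([0b00001111, 0b11000000])

runs = used_runs(toy_bitmap, 16)
ranges = runs_to_sector_ranges(runs)
sectors_to_read = sum(count for _, count in ranges)

print(runs)             # [(0, 4), (14, 2)]
print(ranges)           # [(0, 32), (112, 16)]
print(sectors_to_read)  # 48, instead of all 128 sectors
```

Even in this toy case the imaging job shrinks from 128 sectors to 48, and the heads never have to visit the empty middle of the volume.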
The bitmap is also highly useful in cases where the drive has been reformatted mistakenly or important files have been deleted. In these cases, the bitmap shows where not to look. Deleted files, or the files that existed before the drive was reformatted or had its operating system reinstalled, are all no longer relevant to the current file system. They are off the grid, living in unallocated space. To find them, look in all the clusters whose addresses the bitmap has labeled as empty. The clusters that the bitmap considers used hold the new data that is not of interest: the new format, the new operating system, the files that were not accidentally deleted.
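Here is a minimal sketch of searching only the space the bitmap calls empty. It uses signature matching, a common file carving technique that this post does not itself describe: it checks whether a free cluster begins with the three bytes that open a JPEG file. The function names, the 16-byte cluster size and the toy disk image are all assumptions for illustration, and a real search would need to handle many file types and fragmented files.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # the bytes a JPEG file starts with


def free_clusters(bitmap, total_clusters):
    """Clusters whose bit is 0 (assumed lowest-bit-first order within a byte)."""
    return [
        c for c in range(total_clusters)
        if not ((bitmap[c // 8] >> (c % 8)) & 1)
    ]


def find_candidates(image, bitmap, total_clusters, cluster_size, magic=JPEG_MAGIC):
    """Free clusters that begin with the given file signature."""
    hits = []
    for cluster in free_clusters(bitmap, total_clusters):
        offset = cluster * cluster_size
        if image[offset:offset + len(magic)] == magic:
            hits.append(cluster)
    return hits


# Toy disk image: 8 clusters of 16 bytes each.
CLUSTER_SIZE = 16
image = bytearray(8 * CLUSTER_SIZE)

# A current photo in cluster 1 (still allocated) and a deleted one in cluster 5.
image[1 * CLUSTER_SIZE:1 * CLUSTER_SIZE + 3] = JPEG_MAGIC
image[5 * CLUSTER_SIZE:5 * CLUSTER_SIZE + 3] = JPEG_MAGIC

# Bitmap marks clusters 0 and 1 as used; everything else is unallocated.
toy_bitmap = bytes([0b00000011])

candidates = find_candidates(image, toy_bitmap, 8, CLUSTER_SIZE)
print(candidates)  # [5]
```

The photo in cluster 1 is skipped because the bitmap says it belongs to the current file system, which is exactly the “where not to look” benefit described above.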
There are many more ways that data recovery can become more elegant with greater understanding of the logical structure of a hard drive’s file system. With this understanding, better software can be built to make imaging a drive faster and more reliable.