Algorithms

From FilesRebuilder

Main work-flow

files-rebuilder follows a three-step work-flow:

  • Scanning directories and files, indexing their content
  • Analyzing the scanned data to locate corrupted segments and identify how to rebuild directories
  • Rebuilding the directories based on what has been analyzed

Scanning

Scanning walks both the directories to rebuild and the recovered directories, gathering as much information as possible about each file:

  • A list of block CRCs
  • Metadata (size, modification date...)
  • A list of segments, obtained by analyzing the file's content (this part is done using files-hunter). A list of block CRCs is also computed for each segment.
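
The first two items of the list above can be sketched as follows. This is a minimal illustration, not the files-rebuilder implementation: the block size, the use of CRC-32 from `zlib`, and the names `FileRecord`, `block_crcs`, `scan_file` and `scan_tree` are all assumptions. Segment analysis, which the page attributes to files-hunter, is left out.

```python
import os
import zlib
from dataclasses import dataclass, field

BLOCK_SIZE = 4096  # assumed block size; the page does not specify one


@dataclass
class FileRecord:
    """Hypothetical container for what the scan collects about one file."""
    path: str
    size: int
    mtime: float
    block_crcs: list = field(default_factory=list)


def block_crcs(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the CRC-32 of each consecutive block of `data`."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def scan_file(path: str) -> FileRecord:
    """Collect metadata and block CRCs for a single file."""
    st = os.stat(path)
    # Reads the whole file at once for brevity; a real scanner would stream.
    with open(path, "rb") as f:
        data = f.read()
    return FileRecord(path, st.st_size, st.st_mtime, block_crcs(data))


def scan_tree(root: str) -> list:
    """Walk `root` recursively and scan every file found."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            records.append(scan_file(os.path.join(dirpath, name)))
    return records
```

Under these assumptions, `scan_tree` would be called once on the directories to rebuild and once on the recovered directories, giving two lists of records for the analysis step.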

Analyzing

Analysis begins by building indexes on the scanned data, so that matching files can easily be found between the recovered directories and the directories to rebuild.

The indexes used are:

  • File CRC: Used to find exactly matching files (also contains segment CRCs).
  • File base name: Used to find files that lost their parent directory.
  • File size: Used to find files that might have some corrupted content replacing sane data (also contains segment sizes).
  • File date: Used to find files that might have some corrupted content.
  • File extension: Used to find files that might have some corrupted content and name, but still kept their extension.
  • Segment extension: Used to find files that might have corrupted content for a given file type. This index then contains sub-indexes, indexing some of the segments' metadata.
  • Block CRC: Used to find matching blocks in files (also contains segment block CRCs).
  • Directory hierarchy: Used to find corrupted directory entries, possibly missing some files.
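
As an illustration of the list above, several of these indexes can be built as plain lookup tables keyed by the indexed value. The sketch below is an assumption about one possible layout, not the tool's actual data structures: the record keys (`path`, `size`, `crc`, `block_crcs`) and the function name `build_indexes` are hypothetical, and the date, segment and directory-hierarchy indexes are omitted.

```python
import os
from collections import defaultdict


def build_indexes(records):
    """Build lookup tables from a list of scanned file records.

    Each record is a dict with the hypothetical keys
    'path', 'size', 'crc' and 'block_crcs'.
    """
    indexes = {
        "crc": defaultdict(list),
        "base_name": defaultdict(list),
        "size": defaultdict(list),
        "extension": defaultdict(list),
        "block_crc": defaultdict(list),
    }
    for rec in records:
        base = os.path.basename(rec["path"])
        ext = os.path.splitext(base)[1].lower()
        indexes["crc"][rec["crc"]].append(rec)
        indexes["base_name"][base].append(rec)
        indexes["size"][rec["size"]].append(rec)
        indexes["extension"][ext].append(rec)
        # Remember which file and which block position each CRC came from.
        for position, crc in enumerate(rec["block_crcs"]):
            indexes["block_crc"][crc].append((rec["path"], position))
    return indexes
```

With such tables, finding every file that shares a base name or a block with a given file is a single dictionary lookup, for example `indexes["base_name"].get("photo.jpg", [])`.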

Then those indexes are used to match recovered files with possible files to be rebuilt. The match is based on a probability computed from the indexes. A first user validation occurs at file level if needed.
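
The page does not give the formula behind this probability. Purely as an illustration, the sketch below combines a few pieces of evidence into a score between 0 and 1 using a weighted sum. The weights, the thresholds, the record keys (`name`, `size`, `block_crcs`) and the function names are all assumptions chosen for the example.

```python
import os

# Assumed weights; they sum to 1 so the score stays in [0, 1].
DEFAULT_WEIGHTS = {"blocks": 0.6, "name": 0.2, "size": 0.1, "extension": 0.1}


def match_score(recovered, target, weights=DEFAULT_WEIGHTS):
    """Estimate how likely `recovered` is a copy of `target`."""
    rec_blocks = set(recovered["block_crcs"])
    tgt_blocks = set(target["block_crcs"])
    union = rec_blocks | tgt_blocks
    evidence = {
        # Share of block CRCs the two files have in common (Jaccard index).
        "blocks": len(rec_blocks & tgt_blocks) / len(union) if union else 0.0,
        "name": float(recovered["name"] == target["name"]),
        "size": float(recovered["size"] == target["size"]),
        "extension": float(
            os.path.splitext(recovered["name"])[1].lower()
            == os.path.splitext(target["name"])[1].lower()
        ),
    }
    return sum(weights[key] * evidence[key] for key in weights)


def needs_user_validation(score, low=0.3, high=0.9):
    """Scores in the uncertain middle band are handed to the user."""
    return low <= score < high
```

In this sketch, a match scoring at or above `high` would be accepted automatically, one below `low` would be discarded, and anything in between would trigger the file-level validation mentioned above.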

Directories are then matched in the same way, with a probability computed from the indexes and from the matching files each directory contains. A second user validation occurs at directory level if needed.
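
Again as a sketch only, a directory-level score could be derived from the file-level matches: the more files of a directory to rebuild have a good match inside a given recovered directory, the higher the score. The averaging rule and the names `directory_score`, `target_files`, `file_matches` and `recovered_dir_files` are assumptions, not the documented algorithm.

```python
def directory_score(target_files, file_matches, recovered_dir_files):
    """Score a recovered directory as a candidate for a directory to rebuild.

    target_files: names of the files expected in the directory to rebuild.
    file_matches: dict mapping a target file name to a tuple
        (recovered file name, file-level probability).
    recovered_dir_files: set of file names present in the recovered directory.
    """
    if not target_files:
        return 0.0
    total = 0.0
    for name in target_files:
        match = file_matches.get(name)
        # Only count matches that actually live in this recovered directory.
        if match is not None and match[0] in recovered_dir_files:
            total += match[1]
    # Target files with no match here contribute 0 to the average.
    return total / len(target_files)
```

A target file with no match in the candidate directory lowers the average, which is consistent with the idea of a corrupted directory entry that is missing some files.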

Rebuilding

The matching files and directories found during the analysis step are then actually written to disk.
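
A minimal sketch of this last step, assuming the analysis produces a plan of `(recovered path, relative target path)` pairs and that files are copied rather than moved. Both points, as well as the function name `rebuild`, are assumptions; the page does not describe how the writing is done, and rebuilding from individual segments is not covered here.

```python
import os
import shutil


def rebuild(plan, destination_root):
    """Copy each recovered file to its place in the rebuilt tree.

    plan: list of (recovered_path, relative_target_path) pairs.
    Returns the list of paths written.
    """
    written = []
    for source, relative_target in plan:
        target = os.path.join(destination_root, relative_target)
        # Recreate the directory hierarchy as needed.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        # copy2 also preserves the modification time where possible.
        shutil.copy2(source, target)
        written.append(target)
    return written
```

Copying instead of moving leaves the recovered data untouched, so a wrong match can still be corrected afterwards.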
