How to make ddrescue rescue unreadable files during an rsync backup run, when they exist only at the source?

Question

How can I make ddrescue rescue unreadable files during an rsync backup run, when those files exist only at the source?

I think the major programs used for backing up disks ought to integrate with ddrescue when working with platter disks.

I have a lot of faulty disks. I would like to rescue the files on them because they may not exist anywhere else.

I would like to combine rsync with ddrescue. If a disk holds a million files, it is painful to check every file for input/output errors and rescue each one manually.

How can I make rsync run ddrescue whenever an input/output error appears?

score 2 · Answer 1 · answered Aug 12 '23 at 04:30

If the source media is unstable, address that problem first (image and replace the disk, or restore from backups), before doing anything else with it, including regular backups with rsync or any other tool.

ddrescue is not designed to be part of a standard backup procedure, nor is it wise to rely on it for any routine task. It is for block device recovery only.

If you want to dump an image of the block device as part of a backup procedure, use dd; if it throws any errors, that is the time to fall back on those backups.

Speaking of backups, I disagree that rsync alone can be considered a backup tool. Where is the history of previous backup snapshots (so that multiple past versions of the data, e.g. from yesterday, two days ago, a week ago and a month ago, are all available)? Where is the cleanup for them? Where is the problem reporting? rsync can be part of a backup solution that does all this and uses rsync to transfer files, but it should not be used alone. Like RAID, rsync is not a backup; it is merely a synchronization tool, in this case at the file level.

score 1 · Answer 2 · answered Jan 08 '24 at 02:49

Disclaimer: if you are trying to recover something valuable, use the right tool for the job. Using ddrescue to make a full image of the disk before attempting recovery is the right approach.

However, you may be looking for:

rsync --partial

which will keep the files in the destination even if there was a read error on the source (instead of deleting the attempted copy). This matches the behavior of scp.

--ignore-errors can have a similar side effect to --partial, but it has other effects too. Its documented purpose is to let --delete go ahead even when there are I/O errors, and it can also lead to the file being deleted from the source location if you are using --remove-sent-files or --remove-source-files. That might not be what you want, especially if you later realize that you want to try ddrescue on a specific corrupt file. I'm sure there are other side effects too.

Both rsync and scp have almost nothing in common with block-level tools like ddrescue, so they truncate the file at the first read error. rsync will then also remove the partially copied file unless you ask it to keep it with the --partial flag.

If you want to copy just a single file (not truncated at the first read error), you can use ddrescue directly:

ddrescue --sparse ./corrupt-file ./recovered-file

or dd:

dd if=corrupt-file of=recovered-file conv=noerror,sync
truncate --reference corrupt-file recovered-file

but ddrescue does a much better job retrying failed blocks!

Modern disks are actually pretty good at recovering bad blocks if you have enough patience for hundreds of retries.

However, the results above might not be as good as running ddrescue across the whole partition, because the filesystem gets in the way and prevents reading partially corrupt blocks (i.e. even if only one bit or one sector is corrupt, the whole block is returned as a read error).
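For comparison, imaging the whole partition could look roughly like the sketch below. It follows the usual two-pass pattern (a quick pass that skips scraping, then a retry pass with direct disc access). The function name `image_disk` and its arguments are placeholders of mine, and the retry count of 3 is an arbitrary choice, so tune it to how much patience you have.

```shell
#!/bin/sh
# Sketch only: image a failing block device in two ddrescue passes.
# image_disk is a hypothetical helper name, not part of ddrescue.
#   $1 = source device or partition (e.g. the failing partition)
#   $2 = output image file
#   $3 = mapfile, so an interrupted rescue can be resumed later
image_disk() {
    dev=$1
    img=$2
    map=$3
    # Pass 1: copy everything that reads easily, skip the scraping phase.
    ddrescue -n "$dev" "$img" "$map" &&
    # Pass 2: go back to the bad areas with direct access and 3 retry passes.
    ddrescue -d -r3 "$dev" "$img" "$map"
}
```

Once the image exists you can mount it read-only and point rsync at the mounted copy instead of the dying disk, so the disk itself is only read once.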

To answer your question more directly: you could pipe rsync's stderr into sed to extract the names of the files that failed, run ddrescue on each of those files, and then rsync the recovered files over the partial ones.
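A rough sketch of that idea follows. It assumes rsync reports unreadable files on stderr with lines like `rsync: read errors mapping "/path/to/file": Input/output error (5)` (newer versions add a `[sender]` tag), and that the quoted path is the full path on the source side; the exact wording differs between rsync versions, so check your own output and adjust the pattern. The function names and the `files/` and `maps/` layout are made up for this example.

```shell
#!/bin/sh
# Sketch only: turn rsync's read-error messages into ddrescue runs.

# Read rsync's stderr on stdin and print each failed source path once.
# ASSUMPTION: the message format below; verify against your rsync version.
# Paths containing unusual characters may be escaped by rsync and not match.
failed_files() {
    sed -n 's|^rsync: .*read errors mapping "\(.*\)": Input/output error.*$|\1|p' | sort -u
}

# Run ddrescue on every failed file listed in an rsync error log.
#   $1 = source root that was given to rsync
#   $2 = rescue directory (recovered files go to files/, mapfiles to maps/)
#   $3 = file holding rsync's captured stderr
rescue_failed() {
    src=${1%/}
    out=${2%/}
    log=$3
    failed_files < "$log" | while IFS= read -r path; do
        rel=${path#"$src"/}    # path relative to the source root
        mkdir -p "$out/files/$(dirname "$rel")" "$out/maps/$(dirname "$rel")"
        ddrescue --sparse "$path" "$out/files/$rel" "$out/maps/$rel.map" < /dev/null
    done
}
```

You would use it along these lines: run `rsync -a --partial /src/ /dst/ 2> rsync.err`, then `rescue_failed /src /rescue rsync.err`, and finally `rsync -a /rescue/files/ /dst/` to replace the partial copies. Keeping the mapfiles in a separate tree means they never get synced to the destination, and they let you re-run ddrescue later for more retries.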
