Robocopying a deduplicated volume onto a non-deduplicated volume resulted in a corrupt backup (the original source is long gone).
I have read Microsoft Support article 2834834, titled "FSRM and Data Deduplication may be adversely affected when you use Robocopy /MIR in Windows Server 2012", but it seems to cover only the case where \System Volume Information\Dedup is deleted. I have that folder intact, so I need some help from the community with solving this.
I have also read about the inner workings of the data deduplication in Windows Server 2012 in Lanterna, D., & Barili, A. (2017). Forensic analysis of deduplicated file systems. Digital Investigation, 20, S99-S106.
I am fairly confident that most if not all of the data is physically stored on the disk.
From what I understand, the data is stored in the "Dedup" folder in structures called chunk stores, which I have, and each file is described by a reparse point (with reparse tag 0x80000013) that holds the information needed to read the data back from the chunk stores.
Result of a fsutil query on an otherwise inaccessible file:

0000: 03 01 84 01 46 65 52 70 bc f4 b6 b5 7c 01 00 00 ....FeRp....|...
0010: 00 00 09 00 0a 00 04 00 5c 00 00 00 0a 00 04 00 ........\.......
0020: 58 00 00 00 0c 00 10 00 60 00 00 00 0b 00 08 00 X.......`.......
0030: 70 00 00 00 06 00 08 00 78 00 00 00 05 00 04 00 p.......x.......
0040: 80 00 00 00 09 00 04 00 84 00 00 00 0d 00 80 00 ................
0050: 88 00 00 00 0d 00 74 00 08 01 00 00 00 00 00 00 ......t.........
0060: 01 00 00 00 fa 7b 1b 5d 3e b7 a0 42 95 cb 91 ce .....{.]>..B....
0070: 19 21 cf b2 59 f8 51 72 66 6f d3 01 8c 4d 04 00 .!..Y.Qrfo...M..
0080: 00 00 00 00 00 00 00 00 00 00 00 00 52 62 52 70 ............RbRp
0090: 21 61 0d 2b 80 00 00 00 00 00 08 00 0a 00 04 00 !a.+............
00a0: 50 00 00 00 09 00 04 00 74 00 00 00 09 00 04 00 P.......t.......
00b0: 70 00 00 00 06 00 08 00 78 00 00 00 06 00 08 00 p.......x.......
00c0: 54 00 00 00 0c 00 10 00 5c 00 00 00 00 00 00 00 T.......\.......
00d0: 00 00 00 00 09 00 04 00 6c 00 00 00 01 00 00 00 ........l.......
00e0: 62 a2 00 00 00 00 05 00 39 53 f0 0b 9a 3b 00 4b b.......9S...;.K
00f0: a9 da c4 19 65 bd ee 40 00 00 00 00 00 00 00 00 ....e..@........
0100: 0c 00 00 00 00 00 00 00 00 00 00 00 44 64 52 70 ............DdRp
0110: 28 a1 cb be 74 00 00 00 00 00 03 00 0a 00 04 00 (...t...........
0120: 28 00 00 00 0e 00 08 00 2c 00 00 00 0d 00 40 00 (.......,.....@.
0130: 34 00 00 00 01 00 00 00 62 a2 00 00 00 00 05 00 4.......b.......
0140: 62 a2 00 00 00 00 05 00 d0 8c 11 04 00 00 02 00 b...............
0150: 01 00 00 00 48 02 00 00 48 01 00 00 00 00 00 00 ....H...H.......
0160: 9e 56 4f 1c 3f 30 10 b2 7e 2b b2 bd be 9e c0 0b .VO.?0..~+......
0170: 8c 4d 04 00 00 00 00 00 00 00 00 00 00 00 00 00 .M..............
0180: 2e 76 2c f6 .v,.
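To make the dump easier to work with, here is a minimal Python sketch that turns fsutil-style hex rows back into bytes and reports where the four-character markers visible in the ASCII column (FeRp, RbRp, DdRp) sit. The function names `parse_fsutil_dump` and `find_signatures` are made up for illustration, and treating those markers as section signatures is only an assumption based on how they appear in the dump, not on any published specification. The sample contains just four rows excerpted from the dump above; bytes in between are zero-filled.

```python
import re
import struct

ROW = re.compile(r"^\s*([0-9a-fA-F]{4,8}):\s+(.*)$")
HEX_BYTE = re.compile(r"^[0-9a-fA-F]{2}$")


def parse_fsutil_dump(text):
    """Convert 'OFFSET: xx xx ... ascii' rows to bytes, honoring the offset column."""
    buf = bytearray()
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        offset = int(m.group(1), 16)
        row = bytearray()
        # At most 16 hex bytes per row; stop at the ASCII column.
        for tok in m.group(2).split()[:16]:
            if not HEX_BYTE.match(tok):
                break
            row.append(int(tok, 16))
        end = offset + len(row)
        if len(buf) < end:
            buf.extend(b"\x00" * (end - len(buf)))  # zero-fill missing rows
        buf[offset:end] = row
    return bytes(buf)


def find_signatures(data, signatures=(b"FeRp", b"RbRp", b"DdRp")):
    """Return the first offset of each marker (-1 if absent)."""
    return {sig.decode("ascii"): data.find(sig) for sig in signatures}


# Four rows excerpted from the dump above (not the full reparse data).
SAMPLE = """\
0000: 03 01 84 01 46 65 52 70 bc f4 b6 b5 7c 01 00 00 ....FeRp....|...
0080: 00 00 00 00 00 00 00 00 00 00 00 00 52 62 52 70 ............RbRp
0100: 0c 00 00 00 00 00 00 00 00 00 00 00 44 64 52 70 ............DdRp
0180: 2e 76 2c f6 .v,.
"""

if __name__ == "__main__":
    data = parse_fsutil_dump(SAMPLE)
    for name, off in find_signatures(data).items():
        print(f"{name} at 0x{off:04x}")
    # Observation only: the 16-bit little-endian value at offset 2
    # can be compared with the total length of the dump.
    (maybe_len,) = struct.unpack_from("<H", data, 2)
    print(f"u16 at offset 2: 0x{maybe_len:04x}, dump length: 0x{len(data):04x}")
```

On this dump, the 16-bit value at offset 2 and the total length both come out as 0x184, which suggests (but does not prove) that bytes 2 and 3 are a length field rather than a flag.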
Now I have a first problem: my reparse point is slightly different. Sure, fsutil stripped the header of the reparse point, but that's not the only difference from the example in the forensic article...
- I have 03 01 84 where the example here has the middle byte set to zero.
- and many others...
Example reparse point from the quoted article

Address  Hexadecimal content
+0x00  C0 00 00 00 A0 00 00 00
0x08  00 00 00 00 00 00 03 00
0x10  84 00 00 00 18 00 00 00
0x18  13 00 00 80 7C 00 00 00
0x20  01 02 7C 00 00 00 00 00
0x28  16 8F 09 00 00 00 00 00
0x30  00 00 00 00 00 00 00 00
0x38  E5 90 E4 2E F0 44 9A 4F
0x40  8D 59 D6 D8 A2 B5 65 2C
0x48  40 00 40 00 40 00 00 00
0x50  F5 F4 B2 C1 6E B0 D1 01
0x58  01 00 00 00 00 00 01 00
0x60  00 50 00 00 01 00 00 00
0x68  01 00 00 00 08 05 00 00
0x70  C8 01 00 00 00 00 00 00
0x78  9C FC 06 75 EB 4E D1 0C
0x80  FD 13 F3 14 AA 1D B1 D3
0x88  8C BA 9C 19 E2 EF D5 12
0x90  50 58 CE B1 FB 58 05 00
0x98  C1 AD 45 7A 00 00 00 00
0xA0
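To see which part of the article's example corresponds to what fsutil prints, the first 0x20 bytes can be decoded with a short Python sketch. It assumes the example is a resident $REPARSE_POINT attribute as stored in the MFT (standard resident attribute header) followed by the usual 8-byte reparse header (tag, data length, reserved). That reading is an assumption on my part, although the values fit it. The function name `parse_example_header` is hypothetical.

```python
import struct

# First 0x20 bytes of the example from the quoted article.
EXAMPLE = bytes.fromhex(
    "C0 00 00 00 A0 00 00 00"
    "00 00 00 00 00 00 03 00"
    "84 00 00 00 18 00 00 00"
    "13 00 00 80 7C 00 00 00"
)


def parse_example_header(data):
    """Decode an assumed resident attribute header plus the reparse header."""
    attr_type, attr_len = struct.unpack_from("<II", data, 0)
    non_resident, name_len, name_off, flags, attr_id = struct.unpack_from(
        "<BBHHH", data, 8
    )
    content_size, content_off = struct.unpack_from("<IH", data, 16)
    # Reparse header: tag (4 bytes), data length (2), reserved (2).
    tag, data_len, _reserved = struct.unpack_from("<IHH", data, content_off)
    return {
        "attr_type": attr_type,
        "attr_len": attr_len,
        "non_resident": non_resident,
        "attr_id": attr_id,
        "content_size": content_size,
        "content_off": content_off,
        "reparse_tag": tag,
        "reparse_data_len": data_len,
        "payload_off": content_off + 8,  # first byte after the reparse header
    }


if __name__ == "__main__":
    for key, value in parse_example_header(EXAMPLE).items():
        print(f"{key:>17}: 0x{value:x}")
```

Under these assumptions the tag decodes to 0x80000013 and the payload starts at 0x20 with 01 02 7C 00, so that offset would be the counterpart of the 03 01 84 01 at the start of the fsutil output. In both cases the 16-bit value at payload offset 2 (0x7C and 0x184) matches the payload length, which may explain part of the difference between the two dumps.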
So, I think I should start by asking these things:
- Are there existing tools for this kind of situation? I tried various data recovery programs that boasted NTFS deduplication compatibility, but none of them recovered the actual data, only the reparse point hex.
- Is there some more official data deduplication specification for programmers wishing to write software to interact with it?
- Does Windows mark the partition as 'data-deduplicated' somewhere outside the actual file system? I want to make Windows Server 2016 recognize the deduplicated partition as such...
- Am I the only one who has run into this kind of problem since 2012?