* Is it possible to set the amount of CRC32C hashes per a data sector in integritysetup?
From: Горбешко Богдан @ 2023-09-10 15:12 UTC
To: cryptsetup
I'd like to sacrifice a significant amount of disk space (let's say,
20%) for recovery checksums, the way RAR or Parchive do it, or the way
it is implemented in some physical-level media (DVD, HDD, digital
TV, etc.). Some of them store the checksums separately, some among the
data; my goal is to store them among the data on the same medium.
Currently, I have come up with splitting the medium into 6 equal partitions
and building a RAID 5 array out of them. It seems to work, though it was
suggested to me that dm-integrity might be a more straightforward solution
for the task. So, before I start using the RAID 5 solution in practice,
I'd like to check whether the same is possible with dm-integrity.
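For reference, a minimal sketch of what the 6-partition RAID 5 layout above buys in principle: each stripe holds 5 data chunks plus 1 XOR parity chunk, so one sixth (about 16.7%) of the raw space goes to redundancy, and any single chunk can be rebuilt as long as it is known which one is missing. The chunk contents, chunk size and variable names are made up for illustration; this is not how md lays out or rotates parity on disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe: 5 data chunks (one per partition) plus 1 parity chunk.
data = [bytes([i]) * 8 for i in range(1, 6)]
parity = xor_blocks(data)

# Simulate losing chunk 2 completely. The position of the loss is known,
# which is what makes this an erasure rather than silent corruption.
lost = 2
survivors = [chunk for i, chunk in enumerate(data) if i != lost]
rebuilt = xor_blocks(survivors + [parity])

# Fraction of raw space spent on parity in a 5+1 layout.
parity_fraction = 1 / (len(data) + 1)
print(rebuilt == data[lost], round(parity_fraction, 3))
```

Note that plain XOR parity cannot tell which chunk is wrong if a chunk is silently corrupted instead of reported as unreadable; something else has to flag the bad chunk first.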
AFAIU, the primary goal of dm-integrity is to check that the data are not
corrupted and to return read errors otherwise. It does support
checksums besides hashes, though. Can they be used for recovering the data
on-the-fly?
I've read the integritysetup manual and am still not sure whether it's
possible to achieve a significantly larger checksum/data ratio. Reducing
the sector size below 512 bytes does not work. If I increase the tag
size, would it be used to store a larger checksum, or would it just be
padded to store one tiny checksum per tag? Maybe there is some other way?
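To put numbers on the checksum/data ratio being asked about, here is a rough space-accounting sketch. It assumes one tag per data sector and ignores the journal and any other metadata, so it is only an approximation; `tag_overhead` is a hypothetical helper, not part of integritysetup. It says nothing about whether a larger tag would actually carry more checksum data.

```python
def tag_overhead(tag_bytes: int, sector_bytes: int) -> float:
    """Fraction of (data + tag) space taken by the tag, one tag per sector."""
    return tag_bytes / (sector_bytes + tag_bytes)

# A 4-byte CRC32C value per 512-byte sector: well under 1% of the space.
crc32c_512 = tag_overhead(4, 512)

# The same 4-byte tag per 4096-byte sector is smaller still.
crc32c_4096 = tag_overhead(4, 4096)

# Reaching the 20% mentioned above would take 128 tag bytes per 512-byte sector.
target_20 = tag_overhead(128, 512)

print(f"{crc32c_512:.4%} {crc32c_4096:.4%} {target_20:.0%}")
```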
Thanks.
* Re: Is it possible to set the amount of CRC32C hashes per a data sector in integritysetup?
From: Michael Kjörling @ 2023-09-10 16:53 UTC
To: cryptsetup
On 10 Sep 2023 18:12 +0300, from bodqhrohro@gmail.com (Горбешко Богдан):
> AFAIU, the primary goal of dm-integrity is checking if the data are not
> corrupted and throwing read errors otherwise. Though it still supports
> checksums besides of hashes. Can they be used for recovering the data
> on-the-fly?
_My understanding_ (which may be wrong) is that this is not what
dm-integrity is intended for; nor does it look like it supports it.
> Maybe there is some other way?
I would suggest looking at Btrfs or ZFS. Both support self-healing of
data based on redundant storage within an array; and at least ZFS (and
I don't see why not Btrfs) can work with a set of partitions on the
same underlying storage device. It's not a typically recommended
setup, but it is technically possible.
In both cases you'd sacrifice a fair bit of performance, but it sounds
like you are more interested in ensuring data integrity, so maybe that
will be an acceptable trade-off in your situation.
--
Michael Kjörling 🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”
* Re: Is it possible to set the amount of CRC32C hashes per a data sector in integritysetup?
From: Arno Wagner @ 2023-09-10 22:05 UTC
To: cryptsetup
On Sun, Sep 10, 2023 at 18:53:04 CEST, Michael Kjörling wrote:
> On 10 Sep 2023 18:12 +0300, from bodqhrohro@gmail.com (Горбешко Богдан):
> > AFAIU, the primary goal of dm-integrity is checking if the data are not
> > corrupted and throwing read errors otherwise. Though it still supports
> > checksums besides of hashes. Can they be used for recovering the data
> > on-the-fly?
>
> _My understanding_ (which may be wrong) is that this is not what
> dm-integrity is intended for; nor does it look like it supports it.
It is not. What this needs is things like RAID 1/5/6 (which can
be done in files) and some specialized tools.
Incidentally, "checksums" never support recovery from
errors. What you need for that is error-correcting codes.
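A small sketch of that distinction, under stated assumptions: `zlib.crc32` (plain CRC-32, standing in for CRC32C, which the Python standard library does not provide) only reveals that something changed, while a textbook Hamming(7,4) code locates and flips back a single wrong bit. The payload and the helper names are made up for illustration, and Hamming(7,4) is just the simplest error-correcting code to show; it is not what any of the tools in this thread use.

```python
import zlib

def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit Hamming codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(c):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit, 0 if none
    if pos:
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

# Checksum: a single flipped bit is detected, but the CRC alone does not
# say which bit to flip back.
payload = b"example sector payload"
stored_crc = zlib.crc32(payload)
damaged = bytes([payload[0] ^ 0x01]) + payload[1:]
crc_detects = zlib.crc32(damaged) != stored_crc

# Error-correcting code: the syndrome points at the damaged bit.
nibble = [1, 0, 1, 1]
codeword = hamming74_encode(nibble)
codeword[5] ^= 1  # corrupt one bit
recovered = hamming74_decode(codeword)
print(crc_detects, recovered == nibble)
```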
For a transparent solution, RAID is probably the easiest option in Linux
and the one that works at the sector level. Given that modern drives
(HDDs and SSDs) already do extensive error correction, you probably
need to deal with whole-sector loss (or worse) to get any
additional effect.
You can do RAID via the md layer or integrated in some
filesystems.
If this is about some embedded device whose storage has to
survive long-term, I suggest going for industrial flash
instead (often PCMCIA, sometimes SD), as that is available
with a data lifetime of 10 years or more.
Regards,
Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: arno@wagner.name
GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718
----
A good decision is based on knowledge and not on numbers. -- Plato
If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier
* Re: Is it possible to set the amount of CRC32C hashes per a data sector in integritysetup?
From: Milan Broz @ 2023-09-11 10:12 UTC
To: Arno Wagner, cryptsetup
On 9/11/23 00:05, Arno Wagner wrote:
> On Sun, Sep 10, 2023 at 18:53:04 CEST, Michael Kjörling wrote:
>> On 10 Sep 2023 18:12 +0300, from bodqhrohro@gmail.com (Горбешко Богдан):
>>> AFAIU, the primary goal of dm-integrity is checking if the data are not
>>> corrupted and throwing read errors otherwise. Though it still supports
>>> checksums besides of hashes. Can they be used for recovering the data
>>> on-the-fly?
>>
>> _My understanding_ (which may be wrong) is that this is not what
>> dm-integrity is intended for; nor does it look like it supports it.
>
> It is not. What this needs is things like RAID 1/5/6 (which can
> be done in files) and some specialized tools.
>
> Incidentally, "checksums" do never support recovery from
> errors. What you need for that is error correcting codes.
Yes, this fully applies to dm-integrity (read-write integrity protected device).
That said, dm-verity (which is a read-only target and uses Merkle tree
protection for the whole device) can be combined with forward
error correction codes (FEC, here Reed-Solomon) and can fix
a lot of corrupted data (and veritysetup fully supports the FEC options).
(If used on smartphones, it apparently extends their lifetime by
continuously repairing failing flash storage that the internal correction
no longer handles, but I have never seen real data showing that it actually
works for real phones in the wild...
But I think that was the motivation for Google to implement FEC for dm-verity.)
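As a rough illustration of the trade-off behind Reed-Solomon FEC: a code with r parity symbols per n-symbol codeword can repair up to r erasures (damage at known positions) or r // 2 errors at unknown positions, at a space cost of r / n. The sketch below assumes 255-byte codewords, which is the usual choice for byte-oriented Reed-Solomon; `rs_budget` and its parameter names are hypothetical and do not describe veritysetup's actual on-disk format or option limits.

```python
def rs_budget(parity_symbols: int, codeword_len: int = 255) -> dict:
    """Space cost and correction capacity of an RS(n, n - r) code."""
    data_symbols = codeword_len - parity_symbols
    return {
        "data_symbols": data_symbols,
        "overhead": parity_symbols / codeword_len,
        "max_erasures": parity_symbols,     # positions of the damage are known
        "max_errors": parity_symbols // 2,  # positions of the damage are unknown
    }

small = rs_budget(2)    # 2 parity bytes per 255-byte codeword
large = rs_budget(51)   # 51 of 255 bytes, i.e. the 20% discussed in this thread
print(small, large)
```

The erasure figure is the relevant one when something else (a drive read error, or an integrity check) already identifies which blocks are bad.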
I had an idea of how to combine FEC with dm-integrity, but it never
happened and probably never will, as nobody will pay for the development
time. (And obviously it would be very slow.)
> Transparently, RAID is probably the easiest option in Linux
> and the one the works on sector-level. Given that modern drives
> (HDD and SSDs) already do extensive error-correction, you probably
> need to deal with whole-sector loss (or worse) to get any
> additional effect.
Reed-Solomon (and some newer algorithms) are very good at repairing such
situations. (It is all about erasure coding.)
But I think many people just use RAID with dm-integrity now (even though it is slow).
Milan