public inbox for cryptsetup@lists.linux.dev
From: Marc SCHAEFER <schaefer@alphanet.ch>
To: cryptsetup@lists.linux.dev
Subject: Re: known issue with specific kernel releases and snapshots on LVM on md RAID integrity devices
Date: Sun, 5 Oct 2025 16:59:14 +0200
Message-ID: <aOKHwjBZKdqGXj5m@alphanet.ch>
In-Reply-To: <f71c3ee6-4c31-418e-a141-f941b22b8f5c@gmail.com>

Hello and thank you for your ideas,

On Sat, Oct 04, 2025 at 08:34:26PM +0200, Milan Broz wrote:
> > Do you see anything peculiar with this?
> I do not see anything special, just checksum errors that can be caused by many things.

Indeed.

> I would start with checking hardware, specifically reseat RAM modules and cables,
> check and run some extensive memory and other system tests etc.

I ran extensive memory checks before and ran them again; they found nothing
(memtest86, long kernel compilation loops). What I find peculiar is that, in
most cases, both RAID1 mirrors had an integrity error at the same blocks,
even though each RAID1 mirror sits, in this case, on its own
dm-integrity device.

But maybe I am mistaken, and there is some optimisation by which the two
dm-integrity devices recognize that the block to be written is shared
between them, so that its checksum is calculated only once?  If so, then
simple RAM corruption would explain this issue.

> Check system logs, there could be something that precedes this issue.

No, there was nothing. I get logcheck summaries for this system, and there
was no I/O error or anything else before this.  I also monitor temperatures,
and they were quite OK just before and during the issues.

Narrowing the issue down, I determined which LV contained the errors (the
errors, except the corrected one, were always at the same addresses), then
checked all the files on the LV (*) (as the errors were on both R1 copies, this
should have shown something). The error is only seen when reading the raw
LV device, not in the files themselves.  This also explains why the
initial error was on a snapshot (which is temporary and quickly deleted).
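
For reference, the raw-device read described above could be sketched
roughly as follows. This is only a minimal sketch, not the exact procedure
used: the function name and the chunk size are made up for illustration,
and which errno the kernel reports for a failed integrity check is an
assumption left open here (the code just records whatever it gets).

```python
import errno
import os


def scan_for_read_errors(path, chunk_size=1 << 20):
    """Read `path` sequentially and collect the offsets that fail to read.

    Returns (bytes_read, errors), where errors is a list of
    (offset, errno_name) tuples. Works on a block device or a regular file.
    """
    errors = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for both files and block devices.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError as exc:
                # Record the failing chunk and skip past it.
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                errors.append((offset, name))
                offset += want
                continue
            if not data:
                break
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, errors
```

It would be called with the LV device node as `path` (as root); a smaller
`chunk_size` narrows down the failing addresses at the cost of speed.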

I then wrote a big new file filling the filesystem on the LV (no error) and
re-read the raw LV, again without errors. I am now running an md raidcheck
again, which should tell me whether the dm-integrity CRCs are now written
correctly on both devices.
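
The state of such an md check can be read back from sysfs. The following
is a minimal sketch, assuming the usual `sync_action` and `mismatch_cnt`
attributes under `/sys/block/<md>/md/`; the function name and the
`sysfs_root` parameter are made up for illustration (the latter only makes
the helper easy to try against a scratch directory).

```python
from pathlib import Path


def md_check_status(md_name, sysfs_root="/sys/block"):
    """Return the current sync action and mismatch count of an md array."""
    md_dir = Path(sysfs_root) / md_name / "md"
    # "idle" means no check or resync is running at the moment.
    sync_action = (md_dir / "sync_action").read_text().strip()
    # Mismatches counted by the last check or repair pass.
    mismatch_cnt = int((md_dir / "mismatch_cnt").read_text().strip())
    return {"sync_action": sync_action, "mismatch_cnt": mismatch_cnt}
```

Note that this only reports what md itself counted; integrity errors
reported by the layer below should still be looked up in the kernel log.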

Another question: does --enable-discard really work?  Looking at the
documentation, I am not sure it really works with crc32 and bitmap
integrity mode; do you have a definitive answer?  If you have no idea,
I will do some tests on a spare system to see which combination
of options really does discard (I am not sure how to test this;
presumably NVMe drives report a namespace-full percentage, but I am
not sure it is really reliable). Do you have any suggestions?
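
One possible first step for such a test is to check whether each layer of
the stack advertises discard at all. The sketch below assumes the standard
block-queue attributes `discard_max_bytes` and `discard_granularity` in
sysfs; the function name and the `sysfs_root` parameter are made up for
illustration. A non-zero value only means the layer accepts discards, not
that the NVMe underneath actually releases the blocks.

```python
from pathlib import Path


def discard_limits(dev_name, sysfs_root="/sys/block"):
    """Report the discard limits a block device advertises in sysfs."""
    queue = Path(sysfs_root) / dev_name / "queue"
    max_bytes = int((queue / "discard_max_bytes").read_text().strip())
    granularity = int((queue / "discard_granularity").read_text().strip())
    return {
        "discard_max_bytes": max_bytes,
        "discard_granularity": granularity,
        # A limit of 0 means the device does not accept discard requests.
        "advertises_discard": max_bytes > 0,
    }
```

Running it for the dm-integrity device, the md array and the LV (by
their kernel names, for example a `dm-N` entry) would show at which layer
discard stops being passed down.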

Thank you and have a nice week.

(*) simple read, then backup comparison for the files that don't
    change.
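
The backup comparison in (*) can be sketched like this. It is a minimal
illustration, not the tool actually used: the function names are made up,
and it assumes the backup is available as a plain directory tree with the
same relative paths as the live filesystem.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(live_root, backup_root):
    """Return (differing, missing) relative paths of live files vs. backup."""
    live_root, backup_root = Path(live_root), Path(backup_root)
    differing, missing = [], []
    for live in sorted(p for p in live_root.rglob("*") if p.is_file()):
        rel = live.relative_to(live_root)
        backup = backup_root / rel
        if not backup.is_file():
            missing.append(str(rel))
        elif sha256_of(live) != sha256_of(backup):
            differing.append(str(rel))
    return differing, missing
```

Files that legitimately changed since the backup will of course show up
as differing, so the result is only meaningful for the static ones.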

Thread overview: 3+ messages
2025-10-03 15:01 known issue with specific kernel releases and snapshots on LVM on md RAID integrity devices Marc SCHAEFER
2025-10-04 18:34 ` Milan Broz
2025-10-05 14:59   ` Marc SCHAEFER [this message]