public inbox for linux-raid@vger.kernel.org
From: Roland <devzero@web.de>
To: Hannes Reinecke <hare@suse.de>,
	Reindl Harald <h.reindl@thelounge.net>,
	linux-raid@vger.kernel.org
Subject: Re: status of bugzilla #99171 - mdraid broken for O_DIRECT
Date: Thu, 16 Oct 2025 01:09:57 +0200
Message-ID: <fd968202-04aa-48e3-bbd7-8520570d1ae2@web.de>
In-Reply-To: <9ef4398c-3488-492e-82ed-903fc46fed70@suse.de>

> Well ... I am sure you are aware of the somewhat dubious state of zfs
> and linux, right?

yes, i know about this "dubious" state due to licensing issues, but
it has been in this state for years now, and zfs is nonetheless a pretty
solid, easily installable and usable filesystem that is used in many
enterprise setups.
i have run dozens of zfs installations for years and have not had a single
major issue, data loss or data corruption with them. but that's a
different story that does not belong here...

> And anyway: 'break userspace' is a matter of debate here; the use of
> O_DIRECT effectively moves the burden of checking I/O from the kernel
> to userspace; with O_DIRECT you can submit _any_ I/O without the kernel
> interfering, but at the same time you _must_ ensure that the I/O
> submitted conforms to the expectations the block layer has.
> And one of the expectations is that data is not modified between
> assembling the request and submitting the request to the drive.
>
> But that is precisely what the test program does.
>
>> please look at this issue from a security perspective.
>>
>> if you can break or corrupt your raid mirror from userspace even from 
>> an insulated layer/environment, i would better consider this 
>> "testcase" to be "malicious code" , which is able to subvert the 
>> virtualization/block/ fs layer stack.
>>
>> how could we prevent non-trusted users in a vm or container
>> environment from executing this "invalid" code ?
>>
> Well, yes, but then this is O_DIRECT.
>
>>
>> how can we prevent them from harming the underlying mirror in a
>> hosting environment for example ?
>>
>
> Well, this has been an ongoing debate for years, and we from the linux
> side have had long discussions about that, too.
> But eventually we settled on the notion of 'stable pages', ie that the
> data buffer for a command _must not_ be modified between assembling the
> command and submitting the command to the drivers.
> Precisely such that we _can_ do things like data checksumming.
>
>> not using it in a hosting environment is a somewhat weird strategy
>> for a basic linux technology which has existed for years.
>>
> Oh, agreed. We do want to make linux better.
> But there is a perfectly viable workaround (namely: do not disable
> caching on the VM ...). So the question really is: where's the
> advantage?
> Security and O_DIRECT is always a very tricky subject, as O_DIRECT
> is precisely there to circumvent checks in the kernel. And yes,
> some of these checks are there to prevent security issues.
> So of course there will be security implications, but that was
> kinda the idea.
>
> Cheers,
>
> Hannes 

thank you for your feedback.

i see, things are complicated and O_DIRECT is a very special beast....
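
(a toy model of the rule quoted above, for illustration only. it is not
the test program from the bugzilla entry and it never touches a real block
device. the names `submit_to_mirror`, `scribble` and `demo`, and the
16-byte buffer, are made up for this sketch. the assumption is that each
mirror leg reads the caller's buffer on its own, so a modification that
lands in between leaves the legs with different data and also invalidates
any checksum computed when the request was assembled.)

```python
import zlib


def submit_to_mirror(buf, legs, racing_writer=None):
    """Toy model: every leg copies the caller's buffer at its own time."""
    for index, leg in enumerate(legs):
        leg[:] = buf  # this leg "reads" straight from the caller's buffer
        if racing_writer is not None and index == 0:
            racing_writer(buf)  # userspace modifies the buffer mid-flight


def legs_match(legs):
    """True if all legs hold identical data."""
    return all(bytes(leg) == bytes(legs[0]) for leg in legs)


def scribble(buf):
    """Hypothetical racing writer: overwrite the first four bytes."""
    buf[0:4] = b"XXXX"


def demo():
    # Case 1: the buffer stays stable while the request is in flight.
    buf = bytearray(b"A" * 16)
    legs = [bytearray(16), bytearray(16)]
    submit_to_mirror(buf, legs)
    stable_ok = legs_match(legs)

    # Case 2: the buffer is modified after the first leg was written.
    buf = bytearray(b"A" * 16)
    legs = [bytearray(16), bytearray(16)]
    checksum_at_assembly = zlib.crc32(buf)
    submit_to_mirror(buf, legs, racing_writer=scribble)
    racy_ok = legs_match(legs)
    checksum_still_valid = zlib.crc32(legs[1]) == checksum_at_assembly

    return {
        "stable_ok": stable_ok,
        "racy_ok": racy_ok,
        "checksum_still_valid": checksum_still_valid,
    }
```

calling `demo()` shows both effects in one go: the legs agree only in the
stable case, and in the racy case the second leg no longer matches the
checksum taken at assembly time.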

meanwhile, i gave bcachefs a try today, because it looks interesting.

like zfs, it does not seem to be affected by this problem, at least from 
my first tests reported at 
https://bugzilla.kernel.org/show_bug.cgi?id=99171#c26 (i hope this is a 
valid test for consistency)
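
(for reference, one simple way to check two copies for consistency is a
block-by-block comparison. the sketch below is only an assumption about
what such a check can look like, it is not the procedure used in comment
c26. `find_mismatches`, the 4096-byte block size and the image paths are
hypothetical.)

```python
def find_mismatches(path_a, path_b, block_size=4096):
    """Return the byte offsets of all blocks that differ between two files.

    The paths may point at image files or at block devices; reading real
    member devices usually needs root and an array that is not being written.
    """
    mismatches = []
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both inputs are exhausted
            if block_a != block_b:
                mismatches.append(offset)  # also catches a length difference
            offset += block_size
    return mismatches
```

`find_mismatches("/path/to/leg0.img", "/path/to/leg1.img")` returns an
empty list when both copies are identical, otherwise the offsets of the
blocks that differ.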

so we have at least a second "software raid" technology besides zfs, 
which does NOT suffer from the "by design" O_DIRECT breakage.

that surprises me, as bcachefs is far from production ready, and i
wonder why it seems to just work at this early stage of
development.

roland






