From: Roland <devzero@web.de>
To: Hannes Reinecke <hare@suse.de>,
Reindl Harald <h.reindl@thelounge.net>,
linux-raid@vger.kernel.org
Subject: Re: status of bugzilla #99171 - mdraid broken for O_DIRECT
Date: Thu, 16 Oct 2025 01:09:57 +0200
Message-ID: <fd968202-04aa-48e3-bbd7-8520570d1ae2@web.de>
In-Reply-To: <9ef4398c-3488-492e-82ed-903fc46fed70@suse.de>
> Well ... I am sure you are aware of the somewhat dubious state of zfs
> and linux, right?
yes, i know about this "dubious" state due to licensing issues, but
it has been in this state for years now, and zfs is nonetheless a pretty
solid, easily installable and usable filesystem that is used in many
enterprise setups.
i have been running dozens of zfs installations for years and have not
had a single major issue, data loss or data corruption with them. but
that's a different story that does not belong here...
> And anyway: 'break userspace' is a matter of debate here; the use of
> O_DIRECT effectively moves the burden of checking I/O from the kernel
> to userspace; with O_DIRECT you can submit _any_ I/O without the kernel
> interfering, but at the same time you _must_ ensure that the I/O
> submitted conforms to the expectations the block layer has.
> And one of the expectations is that data is not modified between
> assembling the request and submitting the request to the drive.
>
> But that is precisely what the test program does.
>
>> please look at this issue from a security perspective.
>>
>> if you can break or corrupt your raid mirror from userspace, even from
>> an isolated layer/environment, i would rather consider this
>> "testcase" to be "malicious code", which is able to subvert the
>> virtualization/block/fs layer stack.
>>
>> how could we prevent non-trusted users in a vm or container
>> environment from executing this "invalid" code?
>>
> Well, yes, but then this is O_DIRECT.
>
>>
>> how can we prevent them from doing harm to the underlying mirror in a
>> hosting environment, for example?
>>
>
> Well, this has been an ongoing debate for years, and we from the linux
> side have had long discussions about that, too.
> But eventually we settled on the notion of 'stable pages', i.e. that the
> data buffer for a command _must not_ be modified between assembling the
> command and submitting the command to the drivers.
> Precisely such that we _can_ do things like data checksumming.
>
>> not using it in a hosting environment is a somewhat weird strategy
>> for a basic linux technology which has existed for years.
>>
> Oh, agreed. We do want to make linux better.
> But there is a perfectly viable workaround (namely: do not disable
> caching on the VM ...). So the question really is: where's the
> advantage?
> Security and O_DIRECT is always a very tricky subject, as O_DIRECT
> is precisely there to circumvent checks in the kernel. And yes,
> some of these checks are there to prevent security issues.
> So of course there will be security implications, but that was
> kinda the idea.
>
> Cheers,
>
> Hannes
thank you for your feedback.
i see, things are complicated and O_DIRECT is a very special beast...
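
as a minimal illustration of the mechanism described above, here is a toy
model in plain python. it is not the test program from the bug report and
it does not use O_DIRECT or a real md device; the function name
mirrored_write and the idea that each mirror leg copies the caller's
buffer at a different moment are assumptions made only for this sketch.

```python
def mirrored_write(buf, mutate_between_legs):
    """Toy model of a mirrored write without a stable copy of the data.

    Each "leg" takes its own snapshot of the caller's buffer. If the
    caller changes the buffer between the two snapshots, the legs end
    up with different content.
    """
    leg_a = bytes(buf)      # first leg reads the caller's buffer
    if mutate_between_legs:
        buf[0] ^= 0xFF      # caller modifies the buffer while the write is in flight
    leg_b = bytes(buf)      # second leg reads the same buffer a moment later
    return leg_a, leg_b


if __name__ == "__main__":
    for racy in (False, True):
        a, b = mirrored_write(bytearray(b"A" * 4096), mutate_between_legs=racy)
        print(f"buffer modified in flight: {racy}, legs identical: {a == b}")
```

with buffered i/o the kernel works from its own copy in the page cache,
so a later change to the user buffer cannot reach only one leg; in this
toy model that would correspond to taking a single snapshot and handing
the same bytes to both legs.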
meanwhile, i gave bcachefs a try today, because it looks interesting.
like zfs, it does not seem to be affected by this problem, at least
according to my first tests, reported at
https://bugzilla.kernel.org/show_bug.cgi?id=99171#c26 (i hope this is a
valid test for consistency).
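
for reference, one generic way to check two mirror legs for consistency
is to read both block by block and record where they differ. the sketch
below is hypothetical and is not the procedure from the bugzilla comment;
the function name mismatched_blocks and the 4096 byte block size are
assumptions, and the demo uses in-memory buffers instead of real devices.

```python
import io


def mismatched_blocks(leg_a, leg_b, block_size=4096):
    """Return the byte offsets of all blocks that differ between two legs.

    leg_a and leg_b are binary file objects (for example opened with
    open(path, "rb")); they are read sequentially from their current
    position until both are exhausted.
    """
    offsets = []
    offset = 0
    while True:
        a = leg_a.read(block_size)
        b = leg_b.read(block_size)
        if not a and not b:
            break               # both legs fully read
        if a != b:
            offsets.append(offset)
        offset += block_size
    return offsets


if __name__ == "__main__":
    good = io.BytesIO(b"\0" * 8192)
    bad = io.BytesIO(b"\0" * 4096 + b"\1" + b"\0" * 4095)
    print(mismatched_blocks(good, bad))
```

such a comparison is only meaningful while nothing is writing to the
legs, so the array or filesystem should be idle or stopped first.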
so we have at least a second "software raid" technology besides zfs
which does NOT suffer from the "by design" O_DIRECT breakage.
that surprises me, as bcachefs is far from production ready, and i
wonder why it just seems to work at this early stage of development.
roland