From: Jens Axboe <axboe@kernel.dk>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
hannes@cmpxchg.org, clm@meta.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCHSET v2 0/15] Uncached buffered IO
Date: Mon, 11 Nov 2024 10:39:55 -0700
Message-ID: <3c904d74-c685-4f8a-bc1d-edc24da59fa5@kernel.dk>
In-Reply-To: <ZzI97bky3Rwzw18C@casper.infradead.org>

On 11/11/24 10:25 AM, Matthew Wilcox wrote:
> On Sun, Nov 10, 2024 at 08:27:52AM -0700, Jens Axboe wrote:
>> 5 years ago I posted patches adding support for RWF_UNCACHED, as a way
>> to do buffered IO that isn't page cache persistent. The approach back
>> then was to have private pages for IO, and then get rid of them once IO
>> was done. But that then runs into all the issues that O_DIRECT has, in
>> terms of synchronizing with the page cache.
>
> Today's a holiday, and I suspect you're going to do a v3 before I have
> a chance to do a proper review of this version of the series.
Probably, since I've done some fixes since v2 :-). So you can wait for
v3, I'll post it later today anyway.
> I think "uncached" isn't quite the right word. Perhaps 'RWF_STREAMING'
> so that userspace is indicating that this is a streaming I/O and the
> kernel gets to choose what to do with that information.
Yeah not sure, it's the one I used back in the day, and I still haven't
found a more descriptive word for it. That doesn't mean one doesn't
exist, certainly taking suggestions. I don't think STREAMING is the
right one however, you could most certainly be doing random uncached IO.
> Also, do we want to fail I/Os to filesystems which don't support
> it? I suppose really sophisticated userspace might fall back to
> madvise(DONTNEED), but isn't most userspace going to just clear the flag
> and retry the I/O?
Also something that's a bit undecided, you can make arguments for both
ways. For just ignoring the flag if not supported, the argument would be
that the application just wants to do IO, uncached if available. For the
other argument, maybe you have an application that wants to fall back to
O_DIRECT if uncached isn't available. That application certainly wants
to know if it works or not.
Which is why I defaulted to return -EOPNOTSUPP if it's not available.
An application may probe this upfront if it so desires, and just not set
the flag for IO. That'd keep it out of the hot path.
Seems to me that returning whether it's supported or not is the path of
least surprises for applications, which is why I went that way.
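
A rough user-space sketch of that upfront probe, under stated assumptions:
RWF_UNCACHED is not in released uapi headers, so the 0x80 below is only a
placeholder, and struct uncached_file, uncached_open() and uncached_pread()
are hypothetical names made up for illustration. The only real call used is
preadv2(2), which rejects flags the kernel or filesystem does not support.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * ASSUMPTION: placeholder value for illustration only. Take the real
 * definition from the uapi headers of a kernel carrying these patches.
 */
#ifndef RWF_UNCACHED
#define RWF_UNCACHED 0x00000080
#endif

/* Hypothetical per-file state, decided once at open time. */
struct uncached_file {
	int fd;
	bool use_uncached;
};

/* Hypothetical helper: open the file and probe the flag with a 1-byte read. */
static int uncached_open(struct uncached_file *f, const char *path)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

	f->fd = open(path, O_RDONLY);
	if (f->fd < 0)
		return -1;

	/* Any failure here (e.g. EOPNOTSUPP) means: never set the flag later. */
	f->use_uncached = preadv2(f->fd, &iov, 1, 0, RWF_UNCACHED) >= 0;
	return 0;
}

/* Hot path: no retry logic, the decision was already made by the probe. */
static ssize_t uncached_pread(const struct uncached_file *f, void *buf,
			      size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	int flags = f->use_uncached ? RWF_UNCACHED : 0;

	assert(buf != NULL);
	return preadv2(f->fd, &iov, 1, off, flags);
}
```

An application that would rather fall back to O_DIRECT could check
use_uncached after the probe and reopen the file accordingly, instead of
silently doing regular buffered IO.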
> Um. Now I've looked, we also have posix_fadvise(POSIX_FADV_NOREUSE),
> which is currently a noop. But would we be better off honouring
> POSIX_FADV_NOREUSE than introducing RWF_UNCACHED? I'll think about this
> some more while I'm offline.
That would certainly work too, for synchronous IO. But per-file hints
are a bad idea for async IO, for obvious reasons. We really want per-IO
hints for that, we have a long history of messing that up. That doesn't
mean that FMODE_NOREUSE couldn't just set RWF_UNCACHED, if it's set.
That'd be trivial.
Then the next question is if setting POSIX_FADV_NOREUSE should fail if
file->f_op->fop_flags & FOP_UNCACHED isn't true. Probably not, since
it'd potentially break applications. So probably best to just set
f_iocb_flags IFF FOP_UNCACHED is true for that file.
And the bigger question is why on earth do we have this thing in the
kernel that doesn't do anything... But yeah, now we could make it do
something.
--
Jens Axboe