linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Luis Chamberlain <mcgrof@kernel.org>
To: John Garry <john.g.garry@oracle.com>, David Bueso <dave@stgolabs.net>
Cc: Theodore Ts'o <tytso@mit.edu>,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm <linux-mm@kvack.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] untorn buffered writes
Date: Wed, 22 May 2024 14:56:57 -0700	[thread overview]
Message-ID: <Zk5qKUJUOjGXEWus@bombadil.infradead.org> (raw)
In-Reply-To: <9e230104-4fb8-44f1-ae5a-a940f69b8d45@oracle.com>

On Wed, May 15, 2024 at 01:54:39PM -0600, John Garry wrote:
> On 27/02/2024 23:12, Theodore Ts'o wrote:
> > Last year, I talked about an interest to provide database such as
> > MySQL with the ability to issue writes that would not be torn as they
> > write 16k database pages[1].
> > 
> > [1] https://urldefense.com/v3/__https://lwn.net/Articles/932900/__;!!ACWV5N9M2RV99hQ!Ij_ZeSZrJ4uPL94Im73udLMjqpkcZwHmuNnznogL68ehu6TDTXqbMsC4xLUqh18hq2Ib77p1D8_4mV5Q$
> > 
> 
> After discussing this topic earlier this week, I would like to know if there
> are still objections or concerns with the untorn-writes userspace API
> proposed in https://lore.kernel.org/linux-block/20240326133813.3224593-1-john.g.garry@oracle.com/
> 
> I feel that the series for supporting direct-IO only, above, is stuck
> because of this topic of buffered IO.

I think it was good we had the discussions at LSFMM over it, however
I personally don't percieve it as stuck, however without any consensus
being obviated or written down anywhere it would not be clear to anyone
that we did reach any consensus at all. Hope is that lwn captures any
consensus if any was indeed reached as you're not making it clear any
was.

In case it helps, as we did with the LBS effort it may also be useful to
put together bi-monthly cabals to follow up progress, and divide and
conquer any pending work items.

> So I sent an RFC for buffered untorn-writes last month in https://lore.kernel.org/linux-fsdevel/20240422143923.3927601-1-john.g.garry@oracle.com/,
> which did leverage the bs > ps effort. Maybe it did not get noticed due to
> being an RFC. It works on the following principles:
> 
> - A buffered atomic write requires RWF_ATOMIC flag be set, same as
>   direct IO. The same other atomic writes rules apply.
> - For an inode, only a single size of buffered write is allowed. So for
>   statx, atomic_write_unit_min = atomic_write_unit_max always for
>   buffered atomic writes.
> - A single folio maps to an atomic write in the pagecache. So inode
>   address_space folio min order = max order = atomic_write_unit_min/max
> - A folio is tagged as "atomic" when atomically written and written back
>   to storage "atomically", same as direct-IO method would do for an
>   atomic write.
> - If userspace wants to guarantee a buffered atomic write is written to
>   storage atomically after the write syscall returns, it must use
>   RWF_SYNC or similar (along with RWF_ATOMIC).

From my perspective the above just needs the IOCB atomic support, and
the pending long term work item there is the near-write-through buffered
IO support. We could just wait for buffered-IO support until we have
support for that. I can't think of anying blocking DIO support though,
now that we at least have a mental model of how buffered IO *should*
work.

What about testing? Are you extending fstests, blktests?

  Luis


  reply	other threads:[~2024-05-22 21:57 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-28  6:12 [LSF/MM/BPF TOPIC] untorn buffered writes Theodore Ts'o
2024-02-28 11:38 ` [Lsf-pc] " Amir Goldstein
2024-02-28 20:21   ` Theodore Ts'o
2024-02-28 14:11 ` Matthew Wilcox
2024-02-28 23:33   ` Theodore Ts'o
2024-02-29  1:07     ` Dave Chinner
2024-02-28 16:06 ` John Garry
2024-02-28 23:24   ` Theodore Ts'o
2024-02-29 16:28     ` John Garry
2024-02-29 21:21       ` Ritesh Harjani
2024-02-29  0:52 ` Dave Chinner
2024-03-11  8:42 ` John Garry
2024-05-15 19:54 ` John Garry
2024-05-22 21:56   ` Luis Chamberlain [this message]
2024-05-23 11:59     ` John Garry
2024-06-01  9:33       ` Theodore Ts'o
2024-06-11 15:23         ` John Garry
2024-05-23 12:59   ` Christoph Hellwig
2024-05-28  9:21     ` John Garry
2024-05-28 10:57       ` Christoph Hellwig
2024-05-28 11:09         ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zk5qKUJUOjGXEWus@bombadil.infradead.org \
    --to=mcgrof@kernel.org \
    --cc=dave@stgolabs.net \
    --cc=david@fromorbit.com \
    --cc=john.g.garry@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=martin.petersen@oracle.com \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).