All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org, Conrad Meyer <conradmeyer@meta.com>,
	linux-block@vger.kernel.org, linux-mm@kvack.org,
	Jan Kara <jack@suse.cz>,
	Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: Re: [RFC 5/5] block: implement io_uring discard cmd
Date: Wed, 21 Aug 2024 10:55:06 +0800	[thread overview]
Message-ID: <ZsVXClra11+yLjss@fedora> (raw)
In-Reply-To: <c69d1769-ae86-4659-bbda-6f7760a8e83f@gmail.com>

On Tue, Aug 20, 2024 at 06:19:00PM +0100, Pavel Begunkov wrote:
> On 8/20/24 17:30, Jens Axboe wrote:
> > On 8/19/24 8:36 PM, Ming Lei wrote:
> > > On Mon, Aug 19, 2024 at 02:01:21PM -0600, Jens Axboe wrote:
> > > > On 8/15/24 7:45 PM, Ming Lei wrote:
> ...
> > > > > Meantime the handling has to move to io-wq for avoiding to block current
> > > > > context, the interface becomes same with IORING_OP_FALLOCATE?
> > > > 
> > > > I think the current truncate is overkill, we should be able to get by
> > > > without. And no, I will not entertain an option that's "oh just punt it
> > > > to io-wq".
> > > 
> > > BTW, the truncate is added by 351499a172c0 ("block: Invalidate cache on discard v2"),
> > > and block/009 serves as regression test for covering page cache
> > > coherency and discard.
> > > 
> > > Here the issue is actually related with the exclusive lock of
> > > filemap_invalidate_lock(). IMO, it is reasonable to prevent page read during
> > > discard for not polluting page cache. block/009 may fail too without the lock.
> > > 
> > > It is just that concurrent discards can't be allowed any more by
> > > down_write() of rw_semaphore, and block device is really capable of doing
> > > that. It can be thought as one regression of 7607c44c157d ("block: Hold invalidate_lock in
> > > BLKDISCARD ioctl").
> > > 
> > > Cc Jan Kara and Shin'ichiro Kawasaki.
> > 
> > Honestly I just think that's nonsense. It's like mixing direct and
> > buffered writes. Can you get corruption? Yes you most certainly can.
> > There should be no reason why we can't run discards without providing
> > page cache coherency. The sync interface attempts to do that, but that
> > doesn't mean that an async (or a different sync one, if that made sense)
> > should.
> 
> I don't see it as a problem either, it's a new interface, just need
> to be upfront on what guarantees it provides (one more reason why
> not fallocate), I'll elaborate on it in the commit message and so.

Fair enough.

> 
> I think a reasonable thing to do is to have one rule for all write-like
> operations starting from plain writes, which is currently allowing races
> to happen and shift it to the user. Purely in theory we can get inventive
> with likes of range lock trees, but that's unwarranted for all sorts of
> reasons.
> 
> > If you do discards to the same range as you're doing buffered IO, you
> > get to keep both potentially pieces. Fact is that most folks are doing
> > dio for performant IO exactly because buffered writes tend to be
> > horrible, and you could certainly use that with async discards and have
> > the application manage it just fine.
> > 
> > So I really think any attempts to provide page cache synchronization for
> > this is futile. And the existing sync one looks pretty abysmal, but it
> > doesn't really matter as it's a sync interfce. If one were to do
> 
> It should be a pain for sync as well, you can't even spin another process
> and parallelise this way.

Yes, this way has degraded some sync discard workloads perf a lot.

Thanks,
Ming


  reply	other threads:[~2024-08-21  2:55 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-14 10:45 [RFC 0/5] implement asynchronous BLKDISCARD via io_uring Pavel Begunkov
2024-08-14 10:45 ` [RFC 1/5] io_uring/cmd: expose iowq to cmds Pavel Begunkov
2024-08-14 10:45 ` [RFC 2/5] io_uring/cmd: give inline space in request " Pavel Begunkov
2024-08-14 10:45 ` [RFC 3/5] filemap: introduce filemap_invalidate_pages Pavel Begunkov
2024-08-14 10:45 ` [RFC 4/5] block: introduce blk_validate_discard() Pavel Begunkov
2024-08-14 10:45 ` [RFC 5/5] block: implement io_uring discard cmd Pavel Begunkov
2024-08-15  1:42   ` Ming Lei
2024-08-15 14:33     ` Jens Axboe
2024-08-15 17:11       ` Pavel Begunkov
2024-08-15 23:44         ` Ming Lei
2024-08-16  1:24           ` Jens Axboe
2024-08-16  1:45             ` Ming Lei
2024-08-16  1:59               ` Pavel Begunkov
2024-08-16  2:08                 ` Ming Lei
2024-08-16  2:16                   ` Pavel Begunkov
2024-08-19 20:02                     ` Jens Axboe
2024-08-19 20:01               ` Jens Axboe
2024-08-20  2:36                 ` Ming Lei
2024-08-20 16:30                   ` Jens Axboe
2024-08-20 17:19                     ` Pavel Begunkov
2024-08-21  2:55                       ` Ming Lei [this message]
2024-08-15  4:08   ` kernel test robot
2024-08-15  4:08   ` kernel test robot
2024-08-15 14:42   ` Jens Axboe
2024-08-15 15:50 ` [RFC 0/5] implement asynchronous BLKDISCARD via io_uring Jens Axboe
2024-08-15 17:26   ` Pavel Begunkov
2024-08-15 16:15 ` Martin K. Petersen
2024-08-15 17:12   ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZsVXClra11+yLjss@fedora \
    --to=ming.lei@redhat.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=conradmeyer@meta.com \
    --cc=io-uring@vger.kernel.org \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shinichiro.kawasaki@wdc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.