Re: [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Keith Busch <kbusch@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
	david@fromorbit.com, leon@kernel.org, hch@lst.de,
	sagi@grimberg.me, axboe@kernel.dk, joro@8bytes.org,
	brauner@kernel.org, hare@suse.de, willy@infradead.org,
	john.g.garry@oracle.com, p.raghav@samsung.com,
	gost.dev@samsung.com, da.gomez@samsung.com
Subject: Re: [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64
Date: Fri, 21 Mar 2025 22:51:42 +0530	[thread overview]
Message-ID: <87frj6s3ix.fsf@gmail.com> (raw)
In-Reply-To: <Z92WBePJ620r5-13@kbusch-mbp>

Keith Busch <kbusch@kernel.org> writes:

> On Fri, Mar 21, 2025 at 07:43:09AM +0530, Ritesh Harjani wrote:
>> i.e. w/o large folios in block devices one could do direct-io &
>> buffered-io in parallel even just next to each other (assuming 4k pagesize). 
>> 
>>            |4k-direct-io | 4k-buffered-io | 
>> 
>> 
>> However with large folios now supported in buffered-io path for block
>> devices, the application cannot submit such direct-io + buffered-io
>> pattern in parallel. Since direct-io can end up invalidating the folio
>> spanning over it's 4k range, on which buffered-io is in progress.
>
> Why would buffered io span more than the 4k range here? You're talking
> to the raw block device in both cases, so they have the exact same
> logical block size alignment. Why is buffered io allocating beyond
> the logical size granularity?

This can happen in following 2 cases - 
1. System's page size is 64k. Then even though the logical block size
granularity for buffered-io is set to 4k (blockdev --setbsz 4k
/dev/sdc), it still will instantiate a 64k page in the page cache.

2. Second is the recent case where (correct me if I am wrong) we now
have large folio support for block devices. So here again we can
instantiate a large folio in the page cache where buffered-io is in
progress correct? (say a previous read causes a readahead and installs a
large folio in that region). Or even iomap_write_iter() these days tries
to first allocate a chunk of size mapping_max_folio_size().

However with large folio support now in block devices, I am not sure
whether an application can retain much benefit of doing buffered-io (if
they happen to mix buffered-io and direct-io carefully over a logical
boundary). Because the direct-io can end up invalidating the entire
large folio, if there is one, in the region where the direct-io
operation is taking place. However this may still be useful if only
buffered-io is being performed on the block device.

-ritesh

next prev parent reply	other threads:[~2025-03-21 18:38 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-20 11:41 [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64 Luis Chamberlain
2025-03-20 12:11 ` Matthew Wilcox
2025-03-20 13:29   ` Daniel Gomez
2025-03-20 14:31     ` Matthew Wilcox
2025-03-20 13:47 ` Daniel Gomez
2025-03-20 14:54   ` Christoph Hellwig
2025-03-21  9:14     ` Daniel Gomez
2025-03-20 14:18 ` Christoph Hellwig
2025-03-20 15:37   ` Bart Van Assche
2025-03-20 15:58     ` Keith Busch
2025-03-20 16:13       ` Kanchan Joshi
2025-03-20 16:38       ` Christoph Hellwig
2025-03-20 21:50         ` Luis Chamberlain
2025-03-20 21:46       ` Luis Chamberlain
2025-03-20 21:40   ` Luis Chamberlain
2025-03-20 18:46 ` Ritesh Harjani
2025-03-20 21:30   ` Darrick J. Wong
2025-03-21  2:13     ` Ritesh Harjani
2025-03-21  3:05       ` Darrick J. Wong
2025-03-21  4:56         ` Theodore Ts'o
2025-03-21  5:00           ` Christoph Hellwig
2025-03-21 18:39             ` Ritesh Harjani
2025-03-21 16:38       ` Keith Busch
2025-03-21 17:21         ` Ritesh Harjani [this message]
2025-03-21 18:55           ` Keith Busch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87frj6s3ix.fsf@gmail.com \
    --to=ritesh.list@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=da.gomez@samsung.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=gost.dev@samsung.com \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=john.g.garry@oracle.com \
    --cc=joro@8bytes.org \
    --cc=kbusch@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=mcgrof@kernel.org \
    --cc=p.raghav@samsung.com \
    --cc=sagi@grimberg.me \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.