From: John Garry <john.g.garry@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
axboe@kernel.dk, kbusch@kernel.org, sagi@grimberg.me,
jejb@linux.ibm.com, martin.petersen@oracle.com,
viro@zeniv.linux.org.uk, brauner@kernel.org, dchinner@redhat.com,
jack@suse.cz, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
tytso@mit.edu, jbongio@google.com, linux-scsi@vger.kernel.org,
ming.lei@redhat.com, jaswin@linux.ibm.com, bvanassche@acm.org
Subject: Re: [PATCH v2 00/16] block atomic writes
Date: Tue, 16 Jan 2024 11:35:47 +0000
Message-ID: <6135eab3-50ce-4669-a692-b4221773bb20@oracle.com>
In-Reply-To: <20231221132236.GB26817@lst.de>
On 21/12/2023 13:22, Christoph Hellwig wrote:
> On Thu, Dec 21, 2023 at 01:18:33PM +0000, John Garry wrote:
>>> For SGL-capable devices that would be
>>> BIO_MAX_VECS, otherwise 1.
>> ok, but we would need to advertise that or whatever segment limit. A statx
>> field just for that seems a bit inefficient in terms of space.
> I'd rather not hard code BIO_MAX_VECS in the ABI, which suggests we
> want to export it as a field. Network file systems also might have
> their own limits for one reason or another.
Hi Christoph,
I have been looking at this issue again and I am not sure that telling the
user the max number of segments allowed is the best option. I'm worried
that the resulting atomic write unit max will be too small.
The background again is that we want to tell the user what the maximum
atomic write unit size is, such that we can always guarantee to fit the
write in a single bio. And there would be no iovec length or alignment
rules.
The max segments value advertised would be min(queue max segments,
BIO_MAX_VECS), so it would be 256 when the request queue is not limiting.
The worst-case (most inefficient) iovec layout which the user could
provide would be something like .iov_base = 0x...0E00 and .iov_len =
0x400, which straddles a page boundary and so would require 2x pages and
2x DMA sg elems for each 1024B-length iovec. I am assuming that we will
still use the direct IO rule of LBS length and alignment.
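To illustrate the page-straddling case above, here is a minimal sketch in C. It assumes 4K pages, and the helper and macro names are hypothetical (they are not kernel APIs or part of the patch series).

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4K pages */

/*
 * Hypothetical helper: number of pages touched by a buffer of @len bytes
 * starting at address @addr. Requires len > 0.
 */
static unsigned int sketch_pages_spanned(uint64_t addr, uint64_t len)
{
	uint64_t first = addr / SKETCH_PAGE_SIZE;
	uint64_t last = (addr + len - 1) / SKETCH_PAGE_SIZE;

	return (unsigned int)(last - first + 1);
}
```

With the low address bits from the example, sketch_pages_spanned(0xE00, 0x400) gives 2, since the buffer covers bytes 0xE00 to 0x11FF and crosses the boundary at 0x1000.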
As such, we then need to set atomic write unit max = min(queue max
segments, BIO_MAX_VECS) * LBS. That would mean atomic write unit max =
256 * 512 = 128K (for 512B LBS). For a DMA controller with max segments
of 64, for example, we would have 32K. These seem too low.
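The worst-case arithmetic above can be written out as a short C sketch. The function and macro names are hypothetical, and the value 256 for BIO_MAX_VECS is the one assumed in this mail; this is only an illustration of the formula, not code from the series.

```c
#include <stdint.h>

#define SKETCH_BIO_MAX_VECS 256u	/* BIO_MAX_VECS value assumed in this mail */

/*
 * Hypothetical helper: worst-case atomic write unit max in bytes when
 * every segment may carry only one logical block (LBS) of data.
 */
static uint64_t sketch_unit_max_multi_iovec(unsigned int queue_max_segments,
					    unsigned int lbs)
{
	unsigned int segs = queue_max_segments < SKETCH_BIO_MAX_VECS ?
			    queue_max_segments : SKETCH_BIO_MAX_VECS;

	return (uint64_t)segs * lbs;
}
```

For a 512B LBS, a queue that is not limiting (any max segments value of 256 or more) gives 131072 bytes (128K), and a queue with max segments of 64 gives 32768 bytes (32K).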
Alternatively, I'm thinking that we should just limit to 1x iovec always,
and then atomic write unit max = (min(queue max segments, BIO_MAX_VECS)
- 1) * PAGE_SIZE [ignoring first/last iovec contents]. It also makes
support for non-enterprise NVMe drives more straightforward. If someone
wants, they can introduce support for multi-iovec later, but it would
probably require some more iovec length/alignment rules.
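For comparison, the single-iovec formula can be sketched the same way in C. Again the names are hypothetical, 256 is the BIO_MAX_VECS value assumed in this mail, and the page size is passed in as a parameter rather than taken from the kernel.

```c
#include <stdint.h>

#define SKETCH_MAX_VECS 256u	/* BIO_MAX_VECS value assumed in this mail */

/*
 * Hypothetical helper: atomic write unit max in bytes for a single iovec.
 * One segment is held back because an unaligned buffer may only partially
 * fill its first and last pages.
 */
static uint64_t sketch_unit_max_single_iovec(unsigned int queue_max_segments,
					     unsigned int page_size)
{
	unsigned int segs = queue_max_segments < SKETCH_MAX_VECS ?
			    queue_max_segments : SKETCH_MAX_VECS;

	if (segs == 0)
		return 0;

	return (uint64_t)(segs - 1) * page_size;
}
```

Assuming 4K pages, a non-limiting queue gives 255 * 4096 = 1044480 bytes (just under 1M), and max segments of 64 gives 63 * 4096 = 258048 bytes, both well above the 128K and 32K multi-iovec worst cases.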
Please let me know your thoughts.
Thanks,
John