linux-fsdevel.vger.kernel.org archive mirror
From: Nilay Shroff <nilay@linux.ibm.com>
To: John Garry <john.g.garry@oracle.com>
Cc: axboe@kernel.dk, brauner@kernel.org, bvanassche@acm.org,
	dchinner@redhat.com, djwong@kernel.org, hch@lst.de, jack@suse.cz,
	jbongio@google.com, jejb@linux.ibm.com, kbusch@kernel.org,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-scsi@vger.kernel.org, linux-xfs@vger.kernel.org,
	martin.petersen@oracle.com, ming.lei@redhat.com,
	ojaswin@linux.ibm.com, sagi@grimberg.me, tytso@mit.edu,
	viro@zeniv.linux.org.uk
Subject: Re: [PATCH v3 10/15] block: Add fops atomic write support
Date: Wed, 14 Feb 2024 17:17:47 +0530
Message-ID: <d34ef016-88d2-4ae9-9a7b-f7431429acc7@linux.ibm.com>
In-Reply-To: <445a05e7-f912-4fb8-b66e-204a05a1524f@oracle.com>



On 2/14/24 16:59, John Garry wrote:
> On 14/02/2024 09:38, Nilay Shroff wrote:
>>
>>
>> On 2/13/24 17:22, John Garry wrote:
>>> On 13/02/2024 11:08, Nilay Shroff wrote:
>>>>> We rely on atomic_write_unit_max being <= atomic_write_boundary, and on both being a power-of-2. Please see the NVMe patch, where this is checked. Indeed, it would not make sense if atomic_write_unit_max > atomic_write_boundary (when non-zero).
>>>>>
>>>>> So if the write is naturally aligned and its size is <= atomic_write_unit_max, then it cannot be straddling a boundary.
>>>> OK, fine, but if the device doesn't support a namespace atomic boundary size (i.e. NABSPF is zero), do we still need
>>>> to restrict IO which crosses the atomic boundary?
>>>
>>> Is there a boundary if NABSPF is zero?
>> If NABSPF is zero then there's no boundary, and so we may not need to worry about IO crossing a boundary.
>>
>> Even though the atomic boundary is not defined, this function doesn't allow an atomic write to cross atomic_write_unit_max_bytes.
>> For instance, if AWUPF is 63 and an IO starts an atomic write from logical block #32 and the number of logical blocks to be written
> 
> When you say "IO", you need to be clearer. Do you mean a write from userspace or a merged atomic write?
Yes, I meant a write from userspace. Sorry for the confusion here.
> 
> If userspace issues an atomic write which is 64 blocks at offset 32, then it will be rejected.
> 
> It will be rejected as it is not naturally aligned, e.g. a 64-block write can only be at offset 0, 64, 128, and so on.
So even though the h/w may support an atomic write that crosses a natural alignment boundary, the kernel would still reject it.
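
The rule described above can be sketched in a few lines of C. This is only an illustration, not the code from the patch: the helper name atomic_write_valid() and its parameters are hypothetical, offsets and lengths are counted in logical blocks, and unit_min and unit_max are assumed to be powers of two with unit_min <= unit_max.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a non-zero power of two. */
static bool is_power_of_2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical helper (not the function from the patch): decide whether a
 * userspace atomic write of `len` blocks at block offset `pos` follows the
 * rules discussed in this thread. unit_min and unit_max are assumed to be
 * powers of two with unit_min <= unit_max.
 */
static bool atomic_write_valid(uint64_t pos, uint64_t len,
			       uint64_t unit_min, uint64_t unit_max)
{
	/* The length must lie within the advertised atomic unit range. */
	if (len < unit_min || len > unit_max)
		return false;
	/* The length must be a power of two. */
	if (!is_power_of_2(len))
		return false;
	/* Natural alignment: the offset must be a multiple of the length. */
	return (pos & (len - 1)) == 0;
}
```

With unit_min = 1 and unit_max = 64, this sketch rejects a 64-block write at offset 32 and a 16-block write at offset 8, and accepts a 64-block write at offset 0 or a 16-block write at offset 16, matching the examples in this thread.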
> 
>> in this IO equals #64, then it's not allowed.
>> However, if this same IO starts from logical block #0, then it's allowed.
>> So my question is: can this restriction be avoided when the atomic boundary is zero (or not defined)?
> 
> We want a consistent set of rules for userspace to follow, whether the atomic boundary is zero or non-zero.
> 
> Currently the atomic boundary only comes into play for merging writes, i.e. we cannot merge a write in which the resultant IO straddles a boundary.
> 
>>
>> Also, it seems that the restrictions implemented for an atomic write to succeed are very strict. For example, an atomic write can't
>> succeed if an IO starts from logical block #8 and the number of logical blocks to be written in this IO equals #16.
>> In this particular case, the IO is well within the atomic boundary (if it's defined) and the atomic size limit, so why do we NOT want to
>> allow it? Is it intentional? I think the spec doesn't mention such a limitation.
> 
> According to the NVMe spec, this is ok. However, we don't want the user to have to deal with things like NVMe boundaries. Indeed, for FSes, we do not have a direct linear map from FS blocks to physical blocks, so it would be impossible for the user to know about a boundary condition in this context.
> 
> We are trying to formulate rules which work for the somewhat orthogonal HW features of both SCSI and NVMe for both block devices and FSes, while also dealing with alignment concerns of extent-based FSes, like XFS.
Hmm OK, thanks for that explanation. 
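
For reference, the earlier claim in this thread (a naturally aligned write no larger than atomic_write_unit_max cannot straddle a boundary when atomic_write_unit_max <= atomic_write_boundary and both are powers of two) can be checked with a small brute-force sketch. Both helper names below are hypothetical and not taken from the patches; a boundary of 0 is treated as "no boundary", and all values are in logical blocks.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the block range [start, start + len)
 * crosses a multiple of `boundary`. A boundary of 0 means that no
 * boundary is defined. Assumes len > 0.
 */
static bool straddles_boundary(uint64_t start, uint64_t len, uint64_t boundary)
{
	if (boundary == 0)
		return false;
	/* First and last block must fall in the same boundary-sized window. */
	return start / boundary != (start + len - 1) / boundary;
}

/*
 * Brute-force check of the claim: for every power-of-two length up to
 * unit_max and every naturally aligned start below `limit`, return true
 * only if no such write straddles a boundary. Assumes unit_max and
 * boundary are powers of two.
 */
static bool aligned_writes_never_straddle(uint64_t unit_max, uint64_t boundary,
					  uint64_t limit)
{
	for (uint64_t len = 1; len <= unit_max; len <<= 1)
		for (uint64_t start = 0; start < limit; start += len)
			if (straddles_boundary(start, len, boundary))
				return false;
	return true;
}
```

The same straddles_boundary() test also illustrates the merging case: with a boundary of 128 blocks, merging two 32-block writes at offsets 96 and 128 would produce a 64-block IO starting at 96, which crosses the boundary and so could not be merged.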

Thanks,
--Nilay

