From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F0937C07E97 for ; Mon, 4 Dec 2023 02:30:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=tJPC9DmNEur0ykkO9N98kEn1eLUMAn1MYmT0jUvM8c4=; b=aohgU5ZMdD+ml0p+UDlcjnB+KE 0mlEMXgM6chkp+mnAPkZ6bu3xO1xE/s/zYFYwARPqtbSyzB0oMacAgAnGqXi907tzjulFFRrxHrJj vpY76U3knVXcPmpUjuyGk73VE8nC1Fz8voKbgQUWhPi1SxX60YOpRhFE5mvYIII0ikUrnD70yYSQD Gq05PEBEMReEwXW42KFzbN9BpzIXhKKwoxzQ4HrFEtcR9A0Ol5s7I8K08mhJb10sm5cN8Yhy2HY2V Ysdb1eovMkCZR3wKJjF44TzRVLoV+turFM8Ise9wHq4vOePRO6aqicEA3LpjuXd5uZ/H9E5QJz9YN uGbOyJOA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r9yjP-002qF0-33; Mon, 04 Dec 2023 02:30:47 +0000 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r9yjJ-002qD0-2n for linux-nvme@lists.infradead.org; Mon, 04 Dec 2023 02:30:43 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1701657038; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=tJPC9DmNEur0ykkO9N98kEn1eLUMAn1MYmT0jUvM8c4=; b=Y0gAh+1iSQVJgmqWqFNBGTv86lEbrPEsw15zKRetbISocRtcHLhBtrxWLf76IQKwuMgwMl qYHQCW0wTgTiE7RT6oLjoIBUztW/lVpL3yW322H8hLDYMRCrKlPM20EZ05e7VG01YwCwh4 qYRXqcCxZY5t85G8Fr8uxxBrIF+ZazY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-517-kyPI3YZUPXWYIekSdoIZ4Q-1; Sun, 03 Dec 2023 21:30:32 -0500 X-MC-Unique: kyPI3YZUPXWYIekSdoIZ4Q-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 384798007B3; Mon, 4 Dec 2023 02:30:31 +0000 (UTC) Received: from fedora (unknown [10.72.120.8]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 1AF4A492BFE; Mon, 4 Dec 2023 02:30:18 +0000 (UTC) Date: Mon, 4 Dec 2023 10:30:14 +0800 From: Ming Lei To: John Garry Cc: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me, jejb@linux.ibm.com, martin.petersen@oracle.com, djwong@kernel.org, viro@zeniv.linux.org.uk, brauner@kernel.org, chandan.babu@oracle.com, dchinner@redhat.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, tytso@mit.edu, jbongio@google.com, linux-api@vger.kernel.org, ming.lei@redhat.com Subject: Re: [PATCH 10/21] block: Add fops atomic write support Message-ID: References: <20230929102726.2985188-1-john.g.garry@oracle.com> <20230929102726.2985188-11-john.g.garry@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230929102726.2985188-11-john.g.garry@oracle.com> X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.10 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231203_183041_996541_FF711782 X-CRM114-Status: GOOD ( 19.82 ) X-BeenThere: linux-nvme@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Linux-nvme" Errors-To: linux-nvme-bounces+linux-nvme=archiver.kernel.org@lists.infradead.org On Fri, Sep 29, 2023 at 10:27:15AM +0000, John Garry wrote: > Add support for atomic writes, as follows: > - Ensure that the IO follows all the atomic writes rules, like must be > naturally aligned > - Set REQ_ATOMIC > > Signed-off-by: John Garry > --- > block/fops.c | 42 +++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 41 insertions(+), 1 deletion(-) > > diff --git a/block/fops.c b/block/fops.c > index acff3d5d22d4..516669ad69e5 100644 > --- a/block/fops.c > +++ b/block/fops.c > @@ -41,6 +41,29 @@ static bool blkdev_dio_unaligned(struct block_device *bdev, loff_t pos, > !bdev_iter_is_aligned(bdev, iter); > } > > +static bool blkdev_atomic_write_valid(struct block_device *bdev, loff_t pos, > + struct iov_iter *iter) > +{ > + unsigned int atomic_write_unit_min_bytes = > + queue_atomic_write_unit_min_bytes(bdev_get_queue(bdev)); > + unsigned int atomic_write_unit_max_bytes = > + queue_atomic_write_unit_max_bytes(bdev_get_queue(bdev)); > + > + if (!atomic_write_unit_min_bytes) > + return false; The above check should have be moved to limit setting code path. > + if (pos % atomic_write_unit_min_bytes) > + return false; > + if (iov_iter_count(iter) % atomic_write_unit_min_bytes) > + return false; > + if (!is_power_of_2(iov_iter_count(iter))) > + return false; > + if (iov_iter_count(iter) > atomic_write_unit_max_bytes) > + return false; > + if (pos % iov_iter_count(iter)) > + return false; I am a bit confused about relation between atomic_write_unit_max_bytes and atomic_write_max_bytes. Here the max IO length is limited to be <= atomic_write_unit_max_bytes, so looks userspace can only submit IO with write-atomic-unit naturally aligned IO(such as, 4k, 8k, 16k, 32k, ...), but these user IOs are allowed to be merged to big one if naturally alignment is respected and the merged IO size is <= atomic_write_max_bytes. Is my understanding right? If yes, I'd suggest to document the point, and the last two checks could be change to: /* naturally aligned */ if (pos % iov_iter_count(iter)) return false; if (iov_iter_count(iter) > atomic_write_max_bytes) return false; Thanks, Ming