Date: Fri, 16 Aug 2024 07:44:34 +0800
From: Ming Lei
To: Pavel Begunkov
Cc: Jens Axboe, io-uring@vger.kernel.org, Conrad Meyer, linux-block@vger.kernel.org, linux-mm@kvack.org, ming.lei@redhat.com
Subject: Re: [RFC 5/5] block: implement io_uring discard cmd
References: <6ecd7ab3386f63f1656dc766c1b5b038ff5353c2.1723601134.git.asml.silence@gmail.com>

On Thu, Aug 15, 2024 at 06:11:13PM +0100, Pavel Begunkov wrote:
> On 8/15/24 15:33, Jens Axboe wrote:
> > On 8/14/24 7:42 PM, Ming Lei wrote:
> > > On Wed, Aug 14, 2024 at 6:46 PM Pavel Begunkov wrote:
> > > >
> > > > Add ->uring_cmd callback for block device files and use it to implement
> > > > asynchronous discard. Normally, it first tries to execute the command
> > > > from non-blocking context, which we limit to a single bio because
> > > > otherwise one of sub-bios may need to wait for other bios, and we don't
> > > > want to deal with partial IO. If non-blocking attempt fails, we'll retry
> > > > it in a blocking context.
> > > >
> > > > Suggested-by: Conrad Meyer
> > > > Signed-off-by: Pavel Begunkov
> > > > ---
> > > >  block/blk.h             |  1 +
> > > >  block/fops.c            |  2 +
> > > >  block/ioctl.c           | 94 +++++++++++++++++++++++++++++++++++++++++
> > > >  include/uapi/linux/fs.h |  2 +
> > > >  4 files changed, 99 insertions(+)
> > > >
> > > > diff --git a/block/blk.h b/block/blk.h
> > > > index e180863f918b..5178c5ba6852 100644
> > > > --- a/block/blk.h
> > > > +++ b/block/blk.h
> > > > @@ -571,6 +571,7 @@ blk_mode_t file_to_blk_mode(struct file *file);
> > > >  int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
> > > >                          loff_t lstart, loff_t lend);
> > > >  long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
> > > > +int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags);
> > > >  long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
> > > >
> > > >  extern const struct address_space_operations def_blk_aops;
> > > > diff --git a/block/fops.c b/block/fops.c
> > > > index 9825c1713a49..8154b10b5abf 100644
> > > > --- a/block/fops.c
> > > > +++ b/block/fops.c
> > > > @@ -17,6 +17,7 @@
> > > >  #include
> > > >  #include
> > > >  #include
> > > > +#include
> > > >  #include "blk.h"
> > > >
> > > >  static inline struct inode *bdev_file_inode(struct file *file)
> > > > @@ -873,6 +874,7 @@ const struct file_operations def_blk_fops = {
> > > >         .splice_read    = filemap_splice_read,
> > > >         .splice_write   = iter_file_splice_write,
> > > >         .fallocate      = blkdev_fallocate,
> > > > +       .uring_cmd      = blkdev_uring_cmd,
> > >
> > > Just be curious, we have IORING_OP_FALLOCATE already for sending
> > > discard to block device, why is .uring_cmd added for this purpose?
>
> Which is a good question, I haven't thought about it, but I tend to
> agree with Jens. Because vfs_fallocate is created synchronous
> IORING_OP_FALLOCATE is slow for anything but pretty large requests.
> Probably can be patched up, which would involve changing the
> fops->fallocate prototype, but I'm not sure async there makes sense
> outside of bdev (?), and cmd approach is simpler, can be made
> somewhat more efficient (1 less layer in the way), and it's not
> really something completely new since we have it in ioctl.

Yeah, we have ioctl(DISCARD), which acquires filemap_invalidate_lock,
the same as blkdev_fallocate().

But this patch drops that exclusive lock, which makes it async friendly
but may leave stale page cache behind. However, if the lock is required,
it can't be efficient anymore and io-wq may be inevitable, :-)

Thanks,
Ming