From: Pavel Begunkov
Date: Thu, 15 Aug 2024 18:11:13 +0100
Subject: Re: [RFC 5/5] block: implement io_uring discard cmd
To: Jens Axboe, Ming Lei
Cc: io-uring@vger.kernel.org, Conrad Meyer, linux-block@vger.kernel.org, linux-mm@kvack.org
References: <6ecd7ab3386f63f1656dc766c1b5b038ff5353c2.1723601134.git.asml.silence@gmail.com>

On 8/15/24 15:33, Jens Axboe wrote:
> On 8/14/24 7:42 PM, Ming Lei wrote:
>> On Wed, Aug 14, 2024 at 6:46 PM Pavel Begunkov wrote:
>>>
>>> Add ->uring_cmd callback for block device files and use it to implement
>>> asynchronous discard.
>>> Normally, it first tries to execute the command
>>> from non-blocking context, which we limit to a single bio because
>>> otherwise one of sub-bios may need to wait for other bios, and we don't
>>> want to deal with partial IO. If non-blocking attempt fails, we'll retry
>>> it in a blocking context.
>>>
>>> Suggested-by: Conrad Meyer
>>> Signed-off-by: Pavel Begunkov
>>> ---
>>>  block/blk.h             |  1 +
>>>  block/fops.c            |  2 +
>>>  block/ioctl.c           | 94 +++++++++++++++++++++++++++++++++++++++++
>>>  include/uapi/linux/fs.h |  2 +
>>>  4 files changed, 99 insertions(+)
>>>
>>> diff --git a/block/blk.h b/block/blk.h
>>> index e180863f918b..5178c5ba6852 100644
>>> --- a/block/blk.h
>>> +++ b/block/blk.h
>>> @@ -571,6 +571,7 @@ blk_mode_t file_to_blk_mode(struct file *file);
>>>  int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
>>>  		loff_t lstart, loff_t lend);
>>>  long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
>>> +int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags);
>>>  long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
>>>
>>>  extern const struct address_space_operations def_blk_aops;
>>> diff --git a/block/fops.c b/block/fops.c
>>> index 9825c1713a49..8154b10b5abf 100644
>>> --- a/block/fops.c
>>> +++ b/block/fops.c
>>> @@ -17,6 +17,7 @@
>>>  #include
>>>  #include
>>>  #include
>>> +#include
>>>  #include "blk.h"
>>>
>>>  static inline struct inode *bdev_file_inode(struct file *file)
>>> @@ -873,6 +874,7 @@ const struct file_operations def_blk_fops = {
>>>  	.splice_read	= filemap_splice_read,
>>>  	.splice_write	= iter_file_splice_write,
>>>  	.fallocate	= blkdev_fallocate,
>>> +	.uring_cmd	= blkdev_uring_cmd,
>>
>> Just be curious, we have IORING_OP_FALLOCATE already for sending
>> discard to block device, why is .uring_cmd added for this purpose?

Which is a good question, I haven't thought about it, but I tend to agree
with Jens.
Because vfs_fallocate is synchronous, IORING_OP_FALLOCATE is slow for
anything but pretty large requests. It can probably be patched up, which
would involve changing the fops->fallocate prototype, but I'm not sure
async there makes sense outside of bdev (?). The cmd approach is simpler,
can be made somewhat more efficient (one less layer in the way), and it's
not really something completely new since we already have it in ioctl.

> I think wiring up a bdev uring_cmd makes sense, because:
>
> 1) The existing FALLOCATE op is using vfs_fallocate, which is inherently
>    sync and hence always punted to io-wq.
>
> 2) There will most certainly be other async ops that would be
>    interesting to add, at which point we'd need it anyway.
>
> 3) It arguably makes more sense to have a direct discard op than use
>    fallocate for this, if working on a raw bdev.
>
> And probably others...
>

-- 
Pavel Begunkov
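
For illustration only, the issue strategy described in the quoted commit message (try a non-blocking attempt limited to a single bio, and retry in a blocking context if that fails) can be sketched in plain userspace C. This is not the code from the patch: every name below (`sketch_discard`, `sketch_try_nowait`, `sketch_issue_blocking`, `SKETCH_MAX_BIO_BYTES`) is hypothetical, the 1 MiB cap is an arbitrary stand-in for a device-specific limit, and no real bios are submitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cap on what one bio can cover; real limits are device-specific. */
#define SKETCH_MAX_BIO_BYTES (1ull << 20)

struct sketch_discard {
	uint64_t start;	/* byte offset of the range */
	uint64_t len;	/* byte length of the range */
};

/* Non-blocking attempt: succeed only if a single bio covers the whole range. */
static int sketch_try_nowait(const struct sketch_discard *d)
{
	if (d->len > SKETCH_MAX_BIO_BYTES)
		return -EAGAIN;	/* would need several bios, so do not start */
	return 0;
}

/* Blocking path: split the range into as many bios as needed, return the count. */
static unsigned int sketch_issue_blocking(const struct sketch_discard *d)
{
	uint64_t left = d->len;
	unsigned int nr_bios = 0;

	while (left) {
		uint64_t chunk = left < SKETCH_MAX_BIO_BYTES ? left : SKETCH_MAX_BIO_BYTES;

		left -= chunk;
		nr_bios++;
	}
	return nr_bios;
}

/* Try the cheap path first; fall back to the blocking path only on -EAGAIN. */
int sketch_discard(const struct sketch_discard *d, bool *fell_back)
{
	int ret;

	assert(d && fell_back);
	*fell_back = false;
	if (!d->len)
		return -EINVAL;

	ret = sketch_try_nowait(d);
	if (ret != -EAGAIN)
		return ret;

	*fell_back = true;
	return sketch_issue_blocking(d) ? 0 : -EIO;
}
```

In this sketch a 4 KiB range completes on the non-blocking path, while a 3 MiB range gets -EAGAIN from the first attempt and is completed by the blocking path. Refusing to start a multi-bio discard in the non-blocking attempt is what avoids having to deal with partial IO.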