From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 3 Sep 2022 00:16:08 +0530
From: Kanchan Joshi
To: Jens Axboe
Cc: hch@lst.de, kbusch@kernel.org, asml.silence@gmail.com,
	io-uring@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, gost.dev@samsung.com
Subject: Re: [PATCH for-next v3 0/4] fixed-buffer for uring-cmd/passthrough
Message-ID: <20220902184608.GA6902@test-zns>
MIME-Version: 1.0
In-Reply-To: <2b4a935c-a6b1-6e42-ceca-35a8f09d8f46@kernel.dk>
References: <20220902151657.10766-1-joshi.k@samsung.com>
	<2b4a935c-a6b1-6e42-ceca-35a8f09d8f46@kernel.dk>
User-Agent: Mutt/1.9.4 (2018-02-28)
Content-Type: multipart/mixed;
	boundary="----pcQIgMQ_HVXS4qWPbWHvSbnTt3jWAgM.sCJInyI3qCVu2Rgf=_445cb_"

------pcQIgMQ_HVXS4qWPbWHvSbnTt3jWAgM.sCJInyI3qCVu2Rgf=_445cb_
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Disposition: inline

On Fri, Sep 02, 2022 at 10:32:16AM -0600, Jens Axboe wrote:
>On 9/2/22 10:06 AM, Jens Axboe wrote:
>> On 9/2/22 9:16 AM, Kanchan Joshi
wrote:
>>> Hi,
>>>
>>> Currently uring-cmd lacks the ability to leverage the pre-registered
>>> buffers. This series adds the support in uring-cmd, and plumbs
>>> nvme passthrough to work with it.
>>>
>>> Using registered-buffers showed peak-perf hike from 1.85M to 2.17M IOPS
>>> in my setup.
>>>
>>> Without fixedbufs
>>> *****************
>>> # taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B0 -O0 -n1 -u1 /dev/ng0n1
>>> submitter=0, tid=5256, file=/dev/ng0n1, node=-1
>>> polled=0, fixedbufs=0/0, register_files=1, buffered=1, QD=128
>>> Engine=io_uring, sq_ring=128, cq_ring=128
>>> IOPS=1.85M, BW=904MiB/s, IOS/call=32/31
>>> IOPS=1.85M, BW=903MiB/s, IOS/call=32/32
>>> IOPS=1.85M, BW=902MiB/s, IOS/call=32/32
>>> ^CExiting on signal
>>> Maximum IOPS=1.85M
>>
>> With the poll support queued up, I ran this one as well. tldr is:
>>
>> bdev (non pt)   122M IOPS
>> irq driven      51-52M IOPS
>> polled          71M IOPS
>> polled+fixed    78M IOPS

Except for the first one, are the remaining three entries for passthru?
Somehow I didn't see that big of a gap. I will try to align my setup in
the coming days.

>> Looking at profiles, it looks like the bio is still being allocated
>> and freed and not dipping into the alloc cache, which is using a
>> substantial amount of CPU. I'll poke a bit and see what's going on...
>
>It's using the fs_bio_set, and that doesn't have the PERCPU alloc cache
>enabled. With the below, we then do:

Thanks for the find.

>polled+fixed    82M
>
>I suspect the remainder is due to the lack of batching on the request
>freeing side, at least some of it. Haven't really looked deeper yet.
>
>One issue I saw - try and use passthrough polling without having any
>poll queues defined and it'll stall just spinning on completions. You
>need to ensure that these are processed as well - look at how the
>non-passthrough io_uring poll path handles it.

I had tested this earlier and it used to run fine, but it does not now.
I see that I/Os are getting completed: the irq completion arrives in nvme
and triggers task-work based completion (by calling
io_uring_cmd_complete_in_task). But the task-work never got called, and
therefore no completion happened.

io_uring_cmd_complete_in_task -> io_req_task_work_add -> __io_req_task_work_add

It seems the task work did not get added. Something about the newly added
IORING_SETUP_DEFER_TASKRUN changes the scenario.

static inline void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
{
        struct io_uring_task *tctx = req->task->io_uring;
        struct io_ring_ctx *ctx = req->ctx;
        struct llist_node *node;

        if (allow_local && ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
                io_req_local_work_add(req);
                return;
        }
        ....

To confirm, I commented that flag out in t/io_uring and it runs fine.
Please see if that changes anything for you. I will try to find the
actual fix tomorrow.

diff --git a/t/io_uring.c b/t/io_uring.c
index d893b7b2..ac5f60e0 100644
--- a/t/io_uring.c
+++ b/t/io_uring.c
@@ -460,7 +460,6 @@ static int io_uring_setup(unsigned entries, struct io_uring_params *p)
 
        p->flags |= IORING_SETUP_COOP_TASKRUN;
        p->flags |= IORING_SETUP_SINGLE_ISSUER;
-       p->flags |= IORING_SETUP_DEFER_TASKRUN;
 retry:
        ret = syscall(__NR_io_uring_setup, entries, p);
        if (!ret)

------pcQIgMQ_HVXS4qWPbWHvSbnTt3jWAgM.sCJInyI3qCVu2Rgf=_445cb_
Content-Type: text/plain; charset="utf-8"


------pcQIgMQ_HVXS4qWPbWHvSbnTt3jWAgM.sCJInyI3qCVu2Rgf=_445cb_--
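
The stall described in this mail can be illustrated with a toy model in
plain user-space C. This is only a sketch under assumptions, not kernel
code: every toy_* name and TOY_DEFER_TASKRUN are hypothetical stand-ins,
and the idea that the reaping path fails to drain the ring-local list is
the working guess from this mail, not a confirmed root cause. The model
mirrors the branch in __io_req_task_work_add(): with the defer flag set,
completion work is parked on a ring-local list and becomes visible only
if the reaper drains that list.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IORING_SETUP_DEFER_TASKRUN (not the real value). */
#define TOY_DEFER_TASKRUN (1u << 0)

/* Toy ring context: counters replace the kernel's work lists. */
struct toy_ctx {
	unsigned int flags;
	int local_pending;	/* work parked on the ring-local list */
	int task_pending;	/* work queued as ordinary task work */
	int completed;		/* completions visible to the submitter */
};

/* Mirrors the quoted branch: a deferred ring parks the work locally. */
void toy_task_work_add(struct toy_ctx *ctx, bool allow_local)
{
	if (allow_local && (ctx->flags & TOY_DEFER_TASKRUN)) {
		ctx->local_pending++;
		return;
	}
	ctx->task_pending++;
}

/* Ordinary task work: assumed to run whenever the reaper checks for it. */
void toy_run_task_work(struct toy_ctx *ctx)
{
	ctx->completed += ctx->task_pending;
	ctx->task_pending = 0;
}

/* Local work: runs only when the reaper explicitly drains the local list. */
void toy_run_local_work(struct toy_ctx *ctx)
{
	ctx->completed += ctx->local_pending;
	ctx->local_pending = 0;
}

/*
 * Reaper: always runs ordinary task work, and drains the local list only
 * when drain_local is true. Returns the completions seen so far.
 */
int toy_reap(struct toy_ctx *ctx, bool drain_local)
{
	toy_run_task_work(ctx);
	if (drain_local)
		toy_run_local_work(ctx);
	return ctx->completed;
}
```

With TOY_DEFER_TASKRUN set, one toy_task_work_add(&ctx, true) followed by
toy_reap(&ctx, false) still reports 0 completions, which matches the
"spinning on completions" symptom; toy_reap(&ctx, true) then reports 1.
Without the flag, the same work lands in task_pending and the first reap
already sees it, which is consistent with the t/io_uring experiment above.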