From: Hannes Reinecke
Date: Thu, 28 Mar 2024 12:32:05 +0100
Subject: Re: [PATCH RFC 0/2] block,nvme: latency-based I/O scheduler
To: Sagi Grimberg, Hannes Reinecke, Jens Axboe
Cc: Keith Busch, Christoph Hellwig, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org
References: <20240326153529.75989-1-hare@kernel.org> <5cade4b4-f19f-422d-ab93-bc853b1563d1@grimberg.me>
In-Reply-To: <5cade4b4-f19f-422d-ab93-bc853b1563d1@grimberg.me>

On 3/28/24 11:38, Sagi Grimberg wrote:
>
>
> On 26/03/2024 17:35, Hannes Reinecke wrote:
>> Hi all,
>>
>> there had been several attempts to implement a latency-based I/O
>> scheduler for native nvme multipath, all of which had its issues.
>>
>> So time to start afresh, this time using the QoS framework
>> already present in the block layer.
>> It consists of two parts:
>> - a new 'blk-nodelat' QoS module, which is just a simple per-node
>>    latency tracker
>> - a 'latency' nvme I/O policy
>>
>> Using the 'tiobench' fio script I'm getting:
>>    WRITE: bw=531MiB/s (556MB/s), 33.2MiB/s-52.4MiB/s
>>    (34.8MB/s-54.9MB/s), io=4096MiB (4295MB), run=4888-7718msec
>>    WRITE: bw=539MiB/s (566MB/s), 33.7MiB/s-50.9MiB/s
>>    (35.3MB/s-53.3MB/s), io=4096MiB (4295MB), run=5033-7594msec
>>     READ: bw=898MiB/s (942MB/s), 56.1MiB/s-75.4MiB/s
>>    (58.9MB/s-79.0MB/s), io=4096MiB (4295MB), run=3397-4560msec
>>     READ: bw=1023MiB/s (1072MB/s), 63.9MiB/s-75.1MiB/s
>>    (67.0MB/s-78.8MB/s), io=4096MiB (4295MB), run=3408-4005msec
>>
>> for 'round-robin' and
>>
>>    WRITE: bw=574MiB/s (601MB/s), 35.8MiB/s-45.5MiB/s
>>    (37.6MB/s-47.7MB/s), io=4096MiB (4295MB), run=5629-7142msec
>>    WRITE: bw=639MiB/s (670MB/s), 39.9MiB/s-47.5MiB/s
>>    (41.9MB/s-49.8MB/s), io=4096MiB (4295MB), run=5388-6408msec
>>     READ: bw=1024MiB/s (1074MB/s), 64.0MiB/s-73.7MiB/s
>>    (67.1MB/s-77.2MB/s), io=4096MiB (4295MB), run=3475-4000msec
>>     READ: bw=1013MiB/s (1063MB/s), 63.3MiB/s-72.6MiB/s
>>    (66.4MB/s-76.2MB/s), io=4096MiB (4295MB), run=3524-4042msec
>> for 'latency' with 'decay' set to 10.
>> That's on a 32G FC testbed running against a brd target,
>> fio running with 16 thread.
>
> Can you quantify the improvement? Also, the name latency suggest
> that latency should be improved no?
>

'latency' refers to the 'latency-based' I/O scheduler, i.e. it selects
the path with the least latency. It does not necessarily _improve_
the latency; e.g. for truly symmetric fabrics it doesn't.
It _does_ improve matters when running on asymmetric fabrics
(e.g. on a two-socket system with two PCI HBAs, each connected to one
socket, or like the example above with one path via 'loop', and the
other via 'tcp' and address '127.0.0.1').
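
To make "selects the path with the least latency" concrete, here is a rough userspace sketch in Python. It is not the blk-nodelat or nvme code from the patches: the names `PathStats`, `record` and `select_path` are made up for illustration, and treating 'decay' as the divisor of an exponentially weighted moving average is an assumption, not something taken from the patch set.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PathStats:
    """Hypothetical per-path latency tracker (illustration only)."""
    name: str
    avg_latency_us: Optional[float] = None  # None until a sample arrives

    def record(self, latency_us: float, decay: int = 10) -> None:
        # Moving average: each new sample shifts the average by 1/decay
        # of the difference. This reading of 'decay' is an assumption.
        if self.avg_latency_us is None:
            self.avg_latency_us = latency_us
        else:
            self.avg_latency_us += (latency_us - self.avg_latency_us) / decay


def select_path(paths: List[PathStats]) -> PathStats:
    """Pick the path with the lowest tracked latency.

    Paths without any sample are returned first so that every path
    gets measured at least once.
    """
    for path in paths:
        if path.avg_latency_us is None:
            return path
    return min(paths, key=lambda p: p.avg_latency_us)


if __name__ == "__main__":
    loop = PathStats("loop")
    tcp = PathStats("tcp")
    loop.record(100.0)
    tcp.record(400.0)
    print(select_path([loop, tcp]).name)  # the 'loop' path is faster here
```

On a symmetric setup both averages end up about equal, so the selection gains nothing over round-robin; only when one path is persistently slower does the choice start to matter.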
And, of course, the same goes for congested fabrics, where it should be
able to direct I/O to the least congested path.

But I'll see about extracting the latency numbers, too.

What I really wanted to show is that we _can_ track latency without
harming performance.

Cheers,

Hannes
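
For reference, the aggregate bandwidth figures quoted above can be turned into relative changes with a few lines of Python. This is only arithmetic on the numbers already in the thread. It assumes the four result lines appear in the same job order for both policies, which the quoted output does not state explicitly, and the helper name `percent_change` is made up.

```python
# Aggregate fio bandwidth in MiB/s, copied from the quoted results,
# in the order the lines appear (WRITE, WRITE, READ, READ).
round_robin = [("WRITE", 531), ("WRITE", 539), ("READ", 898), ("READ", 1023)]
latency = [("WRITE", 574), ("WRITE", 639), ("READ", 1024), ("READ", 1013)]


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Pair the lines by position; that pairing is an assumption.
for (op, rr_bw), (_, lat_bw) in zip(round_robin, latency):
    delta = percent_change(rr_bw, lat_bw)
    print(f"{op}: {rr_bw} -> {lat_bw} MiB/s ({delta:+.1f}%)")
```

With that pairing, the changes come out as +8.1% and +18.6% for the two WRITE lines, and +14.0% and -1.0% for the two READ lines. These are single-run throughput numbers, not latency measurements.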