From: Jens Axboe <axboe@fb.com>
To: Keith Busch <keith.busch@intel.com>
Cc: "Matias Bjørling" <m@bjorling.me>,
"Matthew Wilcox" <willy@linux.intel.com>,
"sbradshaw@micron.com" <sbradshaw@micron.com>,
"tom.leiming@gmail.com" <tom.leiming@gmail.com>,
"hch@infradead.org" <hch@infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: [PATCH v7] NVMe: conversion to blk-mq
Date: Fri, 13 Jun 2014 09:11:37 -0600
Message-ID: <539B14A9.8010204@fb.com>
In-Reply-To: <alpine.LRH.2.03.1406130838210.4699@AMR>

On 06/13/2014 09:05 AM, Keith Busch wrote:
> On Fri, 13 Jun 2014, Jens Axboe wrote:
>> On 06/12/2014 06:06 PM, Keith Busch wrote:
>>> When cancelling IOs, we have to check if the hwctx has valid tags
>>> for some reason. I have 32 cores in my system and as many queues, but
>>
>> It's because unused queues are torn down, to save memory.
>>
>>> blk-mq is only using half of those queues and freed the "tags" for the
>>> rest after they'd been initialized without telling the driver. Why is
>>> blk-mq not utilizing all my queues?
>>
>> You have 31 + 1 queues, so only 31 mappable queues. blk-mq symmetrically
>> distributes these, so you should have a core + thread sibling on 16
>> queues. And yes, that leaves 15 idle hardware queues for this specific
>> case. I like the symmetry, it makes it more predictable if things are
>> spread out evenly.
>
> You'll see performance differences on some workloads that depend on which
> cores your process runs and which one services an interrupt. We can play
> games with cores and see what happens on my 32 cpu system. I usually
> run 'irqbalance --hint=exact' for best performance, but that doesn't do
> anything with blk-mq since the affinity hint is gone.

Huh wtf, that hint is not supposed to be gone. I'm guessing it went away
with the removal of the manual queue assignments.

> I ran the following script several times on each version of the
> driver. This will pin a sequential read test to cores 0, 8, and 16. The
> device is local to the NUMA node with cores 0-7 and 16-23; the second test
> runs on the remote node and the third on the thread sibling of 0. Results
> were averaged, but very consistent anyway. The system was otherwise idle.
>
> # for i in $(seq 0 8 16); do
> > let "cpu=1<<$i"
> > cpu=`echo $cpu | awk '{printf "%#x\n", $1}'`
> > taskset ${cpu} dd if=/dev/nvme0n1 of=/dev/null bs=4k count=1000000 iflag=direct
> > done
>
> Here are the performance drops observed with blk-mq with the existing
> driver as baseline:
>
> CPU : Drop
> ....:.....
> 0 : -6%
> 8 : -36%
> 16 : -12%

We need the hints back for sure. I'll run some of the same tests to
verify. Out of curiosity, what is the topology like on your
box? Are 0/1 siblings, and 0..7 one node?
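
For reference, the queue arithmetic quoted earlier in this message (31
mappable queues on a 32 cpu box, of which 16 end up in use and 15 sit idle)
can be modeled with a short shell sketch. This is a simplified illustration
of a symmetric mapping, not the blk-mq code itself. The helper name
queue_for_cpu, the assumption of two threads per core, and the assumption
that CPU i and CPU i+16 are thread siblings (Keith describes 16 as the
sibling of 0) are all illustrative.

```shell
#!/bin/sh
# Simplified model of a symmetric CPU-to-queue mapping. Not kernel code.
nr_cpus=32          # logical CPUs, from the thread
nr_queues=31        # mappable hardware queues, from the thread
threads_per_core=2  # assumption: two hardware threads per core
nr_cores=$((nr_cpus / threads_per_core))

# Number of cores sharing one queue: ceil(nr_cores / nr_queues).
per_queue=$(( (nr_cores + nr_queues - 1) / nr_queues ))

# Hypothetical helper: print the queue index for a given CPU.
# Assumption: CPU i and CPU i + nr_cores are thread siblings, so both
# reduce to the same core number and therefore to the same queue.
queue_for_cpu() {
    if [ "$nr_queues" -ge "$nr_cpus" ]; then
        echo "$1"                                # enough queues: one per CPU
    else
        echo $(( ($1 % nr_cores) / per_queue ))  # siblings share a queue
    fi
}

# Count how many distinct queues the mapping actually uses.
used=$(
    cpu=0
    while [ "$cpu" -lt "$nr_cpus" ]; do
        queue_for_cpu "$cpu"
        cpu=$((cpu + 1))
    done | sort -un | wc -l | tr -d ' '
)

echo "queues used: $used, idle: $((nr_queues - used))"
```

Under these assumptions each core and its thread sibling land on the same
queue, which reproduces the 16 used / 15 idle split described above. With
nr_queues set to 32 or more, the model falls back to one queue per CPU.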