public inbox for linux-kernel@vger.kernel.org
From: Jens Axboe <axboe@fb.com>
To: Keith Busch <keith.busch@intel.com>
Cc: "Matias Bjørling" <m@bjorling.me>,
	"Matthew Wilcox" <willy@linux.intel.com>,
	"sbradshaw@micron.com" <sbradshaw@micron.com>,
	"tom.leiming@gmail.com" <tom.leiming@gmail.com>,
	"hch@infradead.org" <hch@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: [PATCH v7] NVMe: conversion to blk-mq
Date: Fri, 13 Jun 2014 09:11:37 -0600	[thread overview]
Message-ID: <539B14A9.8010204@fb.com> (raw)
In-Reply-To: <alpine.LRH.2.03.1406130838210.4699@AMR>

On 06/13/2014 09:05 AM, Keith Busch wrote:
> On Fri, 13 Jun 2014, Jens Axboe wrote:
>> On 06/12/2014 06:06 PM, Keith Busch wrote:
>>> When cancelling IOs, we have to check if the hwctx has valid tags
>>> for some reason. I have 32 cores in my system and as many queues, but
>>
>> It's because unused queues are torn down, to save memory.
>>
>>> blk-mq is only using half of those queues and freed the "tags" for the
>>> rest after they'd been initialized, without telling the driver. Why is
>>> blk-mq not utilizing all my queues?
>>
>> You have 31 + 1 queues, so only 31 mappable queues. blk-mq symmetrically
>> distributes these, so you should have a core + thread sibling on 16
>> queues. And yes, that leaves 15 idle hardware queues for this specific
>> case. I like the symmetry, it makes it more predictable if things are
>> spread out evenly.
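
[Aside: the symmetric spread described above can be sketched numerically.
The sibling numbering below (CPU i paired with CPU i+16) is an assumption
about this particular box, not something stated in the thread:]

```shell
# Sketch of the symmetric queue spread: 32 CPUs, 16 mappable hardware
# queues, so CPU i and its assumed thread sibling i+16 share queue i.
ncpus=32
nqueues=16
for cpu in $(seq 0 $((ncpus - 1))); do
    echo "cpu ${cpu} -> hw queue $((cpu % nqueues))"
done
# cpu 0 and cpu 16 both land on queue 0, cpu 7 and cpu 23 on queue 7,
# leaving no hardware queue with more than one core + sibling pair.
```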
> 
> You'll see performance differences on some workloads that depend on which
> cores your process runs on and which one services an interrupt. We can
> play games with cores and see what happens on my 32-CPU system. I usually
> run 'irqbalance --hint=exact' for best performance, but that doesn't do
> anything with blk-mq since the affinity hint is gone.

Huh wtf, that hint is not supposed to be gone. I'm guessing it went away
with the removal of the manual queue assignments.
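
[For anyone wanting to check this on their own box, the hint can be read
back from procfs. The vector name nvme0q1 below is an assumption about
how the driver labels its per-queue MSI-X interrupts:]

```shell
# Find the IRQ number for one NVMe queue vector in /proc/interrupts,
# then read its affinity hint; an all-zero or missing hint means the
# driver never called irq_set_affinity_hint() for it.
irq=$(awk -F: '/nvme0q1/ { gsub(/ /, "", $1); print $1; exit }' /proc/interrupts)
[ -n "$irq" ] && cat "/proc/irq/${irq}/affinity_hint"
```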

> I ran the following script several times on each version of the
> driver. This will pin a sequential read test to cores 0, 8, and 16. The
> device is local to NUMA node on cores 0-7 and 16-23; the second test
> runs on the remote node and the third on the thread sibling of 0. Results
> were averaged, but very consistent anyway. The system was otherwise idle.
> 
>  # for i in $(seq 0 8 16); do
>   > let "cpu=1<<$i"
>   > cpu=`echo $cpu | awk '{printf "%#x\n", $1}'`
>   > taskset ${cpu} dd if=/dev/nvme0n1 of=/dev/null bs=4k count=1000000 iflag=direct
>   > done
> 
> Here are the performance drops observed with blk-mq with the existing
> driver as baseline:
> 
>  CPU : Drop
>  ....:.....
>    0 : -6%
>    8 : -36%
>   16 : -12%
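
[Side note on the quoted script: the hex affinity mask it builds with
let/awk can be produced by printf's arithmetic expansion alone, which is
a minor simplification, not a change to the test:]

```shell
# Affinity mask for pinning to a single CPU: bit N set for CPU N,
# formatted as hex for taskset, exactly as the quoted script does.
for i in 0 8 16; do
    printf 'cpu %d -> mask %#x\n' "$i" $((1 << i))
done
# -> cpu 0 -> mask 0x1, cpu 8 -> mask 0x100, cpu 16 -> mask 0x10000
```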

We need the hints back for sure; I'll run some of the same tests and
verify to be sure. Out of curiosity, what is the topology like on your
box? Are 0/1 siblings, and 0..7 one node?
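
[For reference, that topology can be read straight from sysfs; the paths
below are the standard Linux ones, and the example values in the comments
are what Keith's description would suggest, not confirmed output:]

```shell
# Thread siblings of CPU 0 ("0,16" would mean 0/16, not 0/1, are pairs).
cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
# CPUs on NUMA node 0 ("0-7,16-23" would match the node described above).
cat /sys/devices/system/node/node0/cpulist
```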


Thread overview: 26+ messages
2014-06-10  9:20 [PATCH v7] conversion to blk-mq Matias Bjørling
2014-06-10  9:20 ` [PATCH v7] NVMe: " Matias Bjørling
2014-06-10 15:51   ` Keith Busch
2014-06-10 16:19     ` Jens Axboe
2014-06-10 19:29       ` Keith Busch
2014-06-10 19:58         ` Jens Axboe
2014-06-10 21:10           ` Keith Busch
2014-06-10 21:14             ` Jens Axboe
2014-06-10 21:21               ` Keith Busch
2014-06-10 21:33                 ` Matthew Wilcox
2014-06-11 16:54                   ` Jens Axboe
2014-06-11 17:09                     ` Matthew Wilcox
2014-06-11 22:22                       ` Matias Bjørling
2014-06-11 22:51                         ` Keith Busch
2014-06-12 14:32                           ` Matias Bjørling
2014-06-12 16:24                             ` Keith Busch
2014-06-13  0:06                               ` Keith Busch
2014-06-13 14:07                                 ` Jens Axboe
2014-06-13 15:05                                   ` Keith Busch
2014-06-13 15:11                                     ` Jens Axboe [this message]
2014-06-13 15:16                                       ` Keith Busch
2014-06-13 18:14                                         ` Jens Axboe
2014-06-13 19:22                                           ` Keith Busch
2014-06-13 19:29                                             ` Jens Axboe
2014-06-13 20:56                                               ` Jens Axboe
2014-06-13 21:28                                             ` Jens Axboe
