From: Jeff Moyer <jmoyer@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Robert Elliott <Elliott@hp.com>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [patch,v2 00/10] make I/O path allocations more numa-friendly
Date: Mon, 12 Nov 2012 16:26:38 -0500
Message-ID: <x497gpqjywh.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: <509E16B8.4070506@acm.org> (Bart Van Assche's message of "Sat, 10 Nov 2012 09:56:24 +0100")
Bart Van Assche <bvanassche@acm.org> writes:
> On 11/09/12 21:46, Jeff Moyer wrote:
>>> On 11/06/12 16:41, Elliott, Robert (Server Storage) wrote:
>>>> It's certainly better to tie them all to one node than let them be
>>>> randomly scattered across nodes; your 6% observation may simply be
>>>> from that.
>>>>
>>>> How do you think these compare, though (for structures that are per-IO)?
>>>> - tying the structures to the node hosting the storage device
>>>> - tying the structures to the node running the application
>>
>> This is a great question, thanks for asking it! I went ahead and
>> modified the megaraid_sas driver to take a module parameter that
>> specifies on which node to allocate the scsi_host data structure (and
>> all other structures on top that are tied to that). I then booted the
>> system 4 times, specifying a different node each time. Here are the
>> results as compared to a vanilla kernel:
>>
[snip]
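For concreteness, the hack boils down to something like the fragment
below. This is a from-memory sketch, not the literal diff: the
parameter name and the probe-path wiring are just my test-tree hack,
and only scsi_host_alloc_node() (patch 01/10 in this series) is real.
Note that numa_node_parm=-1 is NUMA_NO_NODE, i.e. vanilla behaviour:

#include <linux/errno.h>
#include <linux/module.h>
#include <linux/numa.h>
#include <linux/pci.h>
#include <scsi/scsi_host.h>

/* -1 (NUMA_NO_NODE) means "no preference", i.e. vanilla behaviour. */
static int numa_node_parm = NUMA_NO_NODE;
module_param(numa_node_parm, int, 0444);
MODULE_PARM_DESC(numa_node_parm,
		 "NUMA node for scsi_host allocation (-1 = no preference)");

static int example_probe(struct pci_dev *pdev)
{
	struct Scsi_Host *host;

	/*
	 * Allocate the host on the requested node; the rest of the
	 * series makes the queues, sdevs, and commands hanging off
	 * it inherit that node.
	 */
	host = scsi_host_alloc_node(&megasas_template,
				    sizeof(struct megasas_instance),
				    numa_node_parm);
	if (!host)
		return -ENOMEM;
	/* ... remainder of the normal probe sequence ... */
	return 0;
}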
> Which NUMA node was processing the megaraid_sas interrupts in these
> tests? Was irqbalance running during these tests or were interrupts
> manually pinned to a specific CPU core?
irqbalance was indeed running, so I can't say for sure which node the
irq was pinned to during my tests (I didn't record that information).
I re-ran the tests, this time turning off irqbalance (well, I set it to
one-shot) and pinning the irq to the node running the benchmark.
In this configuration, I saw no regressions in performance.
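(For reference, "pinning" here just means writing a cpu mask to the
irq's smp_affinity file once irqbalance --oneshot has done its single
pass and exited. A minimal sketch of that step -- the default irq
number and cpu mask below are made-up examples, not anything
canonical for this box:)

#include <stdio.h>

int main(int argc, char **argv)
{
	/* argv[1]: irq number, argv[2]: hex cpu mask */
	const char *irq  = argc > 1 ? argv[1] : "24";       /* example irq */
	const char *mask = argc > 2 ? argv[2] : "000000ff"; /* cpus 0-7 */
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%s/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return 1;
	}
	fprintf(f, "%s\n", mask);
	return fclose(f) != 0;
}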
As a reminder:
>> The first number is the percent gain (or loss) w.r.t. the vanilla
>> kernel. The second number is the standard deviation as a percent of the
>> bandwidth. So, when data structures are tied to node 0, we see an
>> increase in performance for nodes 0-2. However, on node 3, which is the
>> node the megaraid_sas controller is attached to, we see no gain in
>> performance, and we see an increase in the run to run variation. The
>> standard deviation for the vanilla kernel was 1% across all nodes.
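(Restating that in symbols, with BW the measured bandwidth of one run
and an overbar denoting the mean across runs:

\[
\text{gain} = 100 \cdot \frac{\overline{BW}_{\text{patched}} - \overline{BW}_{\text{vanilla}}}{\overline{BW}_{\text{vanilla}}},
\qquad
\sigma_{\%} = 100 \cdot \frac{\sigma(BW)}{\overline{BW}}
\]
)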
Here are the updated numbers:
data structures tied to node 0
application tied to:
  node 0:  0 +/-4%
  node 1:  9 +/-1%
  node 2: 10 +/-2%
  node 3:  0 +/-2%

data structures tied to node 1
application tied to:
  node 0:  5 +/-2%
  node 1:  6 +/-8%
  node 2: 10 +/-1%
  node 3:  0 +/-3%

data structures tied to node 2
application tied to:
  node 0:  6 +/-2%
  node 1:  9 +/-2%
  node 2:  7 +/-6%
  node 3:  0 +/-3%

data structures tied to node 3
application tied to:
  node 0:  0 +/-4%
  node 1: 10 +/-2%
  node 2: 11 +/-1%
  node 3:  0 +/-5%
Now, the above is apples to oranges, since the vanilla kernel was run
w/o any tuning of irqs. So, I went ahead and booted with
numa_node_parm=-1, which is the same as vanilla, and re-ran the tests.
When we compare a vanilla kernel with and without irq binding, we get
this:
  node 0: 0 +/-3%
  node 1: 9 +/-1%
  node 2: 8 +/-3%
  node 3: 0 +/-1%
As you can see, binding irqs helps nodes 1 and 2 quite substantially.
What this boils down to: when you compare the patched kernel against
the vanilla kernel, with both tying irqs to the node hosting the
application, the net gain is zero, with an increase in standard
deviation.
Let me try to make that more readable. The patch set does not appear
to help at all with my benchmark configuration. ;-) One other
conclusion I can draw from this data is that irqbalance could do a
better job.
An interesting (to me) tidbit about this hardware is that, while it has
4 NUMA nodes, it only has 2 sockets. Based on the numbers above, I'd
guess nodes 0 and 3 are in the same socket, likewise for 1 and 2.
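A quick way to sanity-check that guess (illustrative only; uses
libnuma, build with -lnuma) is to dump the node distance matrix --
nodes sharing a socket usually report a smaller mutual distance than
remote node pairs:

#include <numa.h>
#include <stdio.h>

int main(void)
{
	int i, j, max;

	if (numa_available() < 0) {
		fprintf(stderr, "no NUMA support\n");
		return 1;
	}
	max = numa_max_node();
	/* Print the SLIT-style distance matrix, one row per node. */
	for (i = 0; i <= max; i++) {
		for (j = 0; j <= max; j++)
			printf("%4d", numa_distance(i, j));
		printf("\n");
	}
	return 0;
}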
Cheers,
Jeff