linux-scsi.vger.kernel.org archive mirror
From: Bart Van Assche <bvanassche@acm.org>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Robert Elliott <Elliott@hp.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [patch,v2 00/10] make I/O path allocations more numa-friendly
Date: Sat, 10 Nov 2012 09:56:24 +0100	[thread overview]
Message-ID: <509E16B8.4070506@acm.org> (raw)
In-Reply-To: <x49pq3my061.fsf@segfault.boston.devel.redhat.com>

On 11/09/12 21:46, Jeff Moyer wrote:
>> On 11/06/12 16:41, Elliott, Robert (Server Storage) wrote:
>>> It's certainly better to tie them all to one node than let them be
>>> randomly scattered across nodes; your 6% observation may simply be
>>> from that.
>>>
>>> How do you think these compare, though (for structures that are per-IO)?
>>> - tying the structures to the node hosting the storage device
>>> - tying the structures to the node running the application
>
> This is a great question, thanks for asking it!  I went ahead and
> modified the megaraid_sas driver to take a module parameter that
> specifies on which node to allocate the scsi_host data structure (and
> all other structures on top that are tied to that).  I then booted the
> system 4 times, specifying a different node each time.  Here are the
> results as compared to a vanilla kernel:
>
> data structures tied to node 0
>
> application tied to:
> node 0:  +6% +/-1%
> node 1:  +9% +/-2%
> node 2:  +10% +/-3%
> node 3:  +0% +/-4%
>
> The first number is the percent gain (or loss) w.r.t. the vanilla
> kernel.  The second number is the standard deviation as a percent of the
> bandwidth.  So, when data structures are tied to node 0, we see an
> increase in performance for nodes 0-2.  However, on node 3, which is the
> node the megaraid_sas controller is attached to, we see no gain in
> performance, and we see an increase in the run to run variation.  The
> standard deviation for the vanilla kernel was 1% across all nodes.
>
> Given that the results are mixed, depending on which node the workload
> is running, I can't really draw any conclusions from this.  The node 3
> number is really throwing me for a loop.  If it were positive, I'd do
> some handwaving about all data structures getting allocated on node 0
> at boot, and the addition of getting the scsi_cmnd structure on the same
> node is what resulted in the net gain.
>
> data structures tied to node 1
>
> application tied to:
> node 0:  +6% +/-1%
> node 1:  +0% +/-2%
> node 2:  +0% +/-6%
> node 3:  -7% +/-13%
>
> Now this is interesting!  Tying data structures to node 1 results in a
> performance boost for node 0?  That would seem to validate your question
> of whether it just helps out to have everything come from the same node,
> as opposed to allocated close to the storage controller.  However, node
> 3 sees a decrease in performance, and a huge standard deviation.  Node 2
> also sees an increased standard deviation.  That leaves me wondering why
> node 0 didn't also experience an increase....
>
> data structures tied to node 2
>
> application tied to:
> node 0:  +5% +/-3%
> node 1:  +0% +/-5%
> node 2:  +0% +/-4%
> node 3:  +0% +/-5%
>
> Here, we *mostly* just see an increase in standard deviation, with no
> appreciable change in application performance.
>
> data structures tied to node 3
>
> application tied to:
> node 0:  +0% +/-6%
> node 1:  +6% +/-4%
> node 2:  +7% +/-4%
> node 3:  +0% +/-4%
>
> Now, this is the case where I'd expect to see the best performance,
> since the HBA is on node 3.  However, that's not what we get!  Instead,
> we get a 6-7% improvement on nodes 1 and 2, and an
> increased run-to-run variation for nodes 0 and 3.
>
> Overall, I'd say that my testing is inconclusive, and I may just pull
> the patch set until I can get some reasonable results.
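
For reference, a minimal sketch of the kind of per-node allocation the
test above describes. The module parameter name and the probe code are
hypothetical illustrations; scsi_host_alloc_node() is the helper added in
patch 01/10 of this series, and its signature is assumed here to mirror
scsi_host_alloc() with an extra NUMA node argument. dev_to_node() returns
the node the HBA is attached to.

#include <linux/module.h>
#include <linux/numa.h>
#include <linux/pci.h>
#include <scsi/scsi_host.h>

static struct scsi_host_template example_template;	/* placeholder template */
struct example_priv { int dummy; };			/* placeholder private data */

/* -1 (NUMA_NO_NODE) means "use the node the controller is attached to". */
static int alloc_node = NUMA_NO_NODE;
module_param(alloc_node, int, 0444);
MODULE_PARM_DESC(alloc_node, "NUMA node for host data structures (-1 = device-local)");

static int example_probe(struct pci_dev *pdev)
{
	int node = alloc_node;
	struct Scsi_Host *shost;

	if (node == NUMA_NO_NODE)
		node = dev_to_node(&pdev->dev);	/* node hosting the HBA */

	/*
	 * The queue, scsi_device and scsi_cmnd allocations made NUMA-aware
	 * by the later patches in the series then inherit this node.
	 */
	shost = scsi_host_alloc_node(&example_template,
				     sizeof(struct example_priv), node);
	if (!shost)
		return -ENOMEM;
	/* ... remainder of the normal probe path ... */
	return 0;
}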

Which NUMA node was processing the megaraid_sas interrupts in these
tests? Was irqbalance running during these tests or were interrupts
manually pinned to a specific CPU core?

Thanks,

Bart.
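
For anyone trying to reproduce the numbers: which CPUs service the
megaraid_sas interrupts can be read from /proc/interrupts, and the
controller's home node from its numa_node attribute in sysfs. Pinning the
interrupts manually, as opposed to leaving them to irqbalance, amounts to
writing a CPU mask to /proc/irq/<irq>/smp_affinity. A minimal user-space
sketch, with a made-up IRQ number and CPU mask:

#include <stdio.h>
#include <stdlib.h>

/* Write a hex CPU mask to /proc/irq/<irq>/smp_affinity (requires root). */
static int pin_irq(int irq, unsigned long cpumask)
{
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%lx\n", cpumask);
	return fclose(f);
}

int main(void)
{
	/* Example values only: pin IRQ 78 to CPUs 12-15 (mask 0xf000),
	 * e.g. the cores on the node hosting the controller. */
	return pin_irq(78, 0xf000UL) ? EXIT_FAILURE : EXIT_SUCCESS;
}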


Thread overview: 26+ messages
2012-11-02 21:45 [patch,v2 00/10] make I/O path allocations more numa-friendly Jeff Moyer
2012-11-02 21:45 ` [patch,v2 01/10] scsi: add scsi_host_alloc_node Jeff Moyer
2012-11-03 16:35   ` Bart Van Assche
2012-11-05 14:06     ` Jeff Moyer
2012-11-02 21:45 ` [patch,v2 02/10] scsi: make __scsi_alloc_queue numa-aware Jeff Moyer
2012-11-02 21:45 ` [patch,v2 03/10] scsi: make scsi_alloc_sdev numa-aware Jeff Moyer
2012-11-02 21:45 ` [patch,v2 04/10] scsi: allocate scsi_cmnd-s from the device's local numa node Jeff Moyer
2012-11-03 16:36   ` Bart Van Assche
2012-11-05 14:09     ` Jeff Moyer
2012-11-02 21:45 ` [patch,v2 05/10] sd: use alloc_disk_node Jeff Moyer
2012-11-03 16:37   ` Bart Van Assche
2012-11-05 14:12     ` Jeff Moyer
2012-11-05 14:57       ` Bart Van Assche
2012-11-05 15:32         ` taco
2012-11-02 21:45 ` [patch,v2 06/10] ata: use scsi_host_alloc_node Jeff Moyer
2012-11-02 21:46 ` [patch,v2 07/10] megaraid_sas: " Jeff Moyer
2012-11-02 21:46 ` [patch,v2 08/10] mpt2sas: " Jeff Moyer
2012-11-02 21:46 ` [patch,v2 09/10] lpfc: " Jeff Moyer
2012-11-02 21:46 ` [patch,v2 10/10] cciss: use blk_init_queue_node Jeff Moyer
2012-11-06 15:41 ` [patch,v2 00/10] make I/O path allocations more numa-friendly Elliott, Robert (Server Storage)
2012-11-06 19:12   ` Bart Van Assche
2012-11-09 20:46     ` Jeff Moyer
2012-11-10  8:56       ` Bart Van Assche [this message]
2012-11-12 21:26         ` Jeff Moyer
2012-11-13  1:26           ` Elliott, Robert (Server Storage)
2012-11-13 15:44             ` Jeff Moyer
