Re: [patch,v2 00/10] make I/O path allocations more numa-friendly

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Bart Van Assche <bvanassche@acm.org>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Robert Elliott <Elliott@hp.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [patch,v2 00/10] make I/O path allocations more numa-friendly
Date: Sat, 10 Nov 2012 09:56:24 +0100	[thread overview]
Message-ID: <509E16B8.4070506@acm.org> (raw)
In-Reply-To: <x49pq3my061.fsf@segfault.boston.devel.redhat.com>

On 11/09/12 21:46, Jeff Moyer wrote:
>> On 11/06/12 16:41, Elliott, Robert (Server Storage) wrote:
>>> It's certainly better to tie them all to one node then let them be
>>> randomly scattered across nodes; your 6% observation may simply be
>>> from that.
>>>
>>> How do you think these compare, though (for structures that are per-IO)?
>>> - tying the structures to the node hosting the storage device
>>> - tying the structures to the node running the application
>
> This is a great question, thanks for asking it!  I went ahead and
> modified the megaraid_sas driver to take a module parameter that
> specifies on which node to allocate the scsi_host data structure (and
> all other structures on top that are tied to that).  I then booted the
> system 4 times, specifying a different node each time.  Here are the
> results as compared to a vanilla kernel:
>
> data structures tied to node 0
>
> application tied to:
> node 0:  +6% +/-1%
> node 1:  +9% +/-2%
> node 2:  +10% +/-3%
> node 3:  +0% +/-4%
>
> The first number is the percent gain (or loss) w.r.t. the vanilla
> kernel.  The second number is the standard deviation as a percent of the
> bandwidth.  So, when data structures are tied to node 0, we see an
> increase in performance for nodes 0-3.  However, on node 3, which is the
> node the megaraid_sas controller is attached to, we see no gain in
> performance, and we see an increase in the run to run variation.  The
> standard deviation for the vanilla kernel was 1% across all nodes.
>
> Given that the results are mixed, depending on which node the workload
> is running, I can't really draw any conclusions from this.  The node 3
> number is really throwing me for a loop.  If it were positive, I'd do
> some handwaving about all data structures getting allocated one node 0
> at boot, and the addition of getting the scsi_cmnd structure on the same
> node is what resulted in the net gain.
>
> data structures tied to node 1
>
> application tied to:
> node 0:  +6% +/-1%
> node 1:  +0% +/-2%
> node 2:  +0% +/-6%
> node 3:  -7% +/-13%
>
> Now this is interesting!  Tying data structures to node 1 results in a
> performance boost for node 0?  That would seem to validate your question
> of whether it just helps out to have everything come from the same node,
> as opposed to allocated close to the storage controller.  However, node
> 3 sees a decrease in performance, and a huge standard devation.  Node 2
> also sees an increased standard deviation.  That leaves me wondering why
> node 0 didn't also experience an increase....
>
> data structures tied to node 2
>
> application tied to:
> node 0:  +5% +/-3%
> node 1:  +0% +/-5%
> node 2:  +0% +/-4%
> node 3:  +0% +/-5%
>
> Here, we *mostly* just see an increase in standard deviation, with no
> appreciable change in application performance.
>
> data structures tied to node 3
>
> application tied to:
> node 0:  +0% +/-6%
> node 1:  +6% +/-4%
> node 2:  +7% +/-4%
> node 3:  +0% +/-4%
>
> Now, this is the case where I'd expect to see the best performance,
> since the HBA is on node 3.  However, that's not what we get!  Instead,
> we get maybe a couple percent improvement on nodes 1 and 2, and an
> increased run-to-run variation for nodes 0 and 3.
>
> Overall, I'd say that my testing is inconclusive, and I may just pull
> the patch set until I can get some reasonable results.

Which NUMA node was processing the megaraid_sas interrupts in these 
tests ? Was irqbalance running during these tests or were interrupts 
manually pinned to a specific CPU core ?

Thanks,

Bart.

next prev parent reply	other threads:[~2012-11-10  8:56 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-02 21:45 [patch,v2 00/10] make I/O path allocations more numa-friendly Jeff Moyer
2012-11-02 21:45 ` [patch,v2 01/10] scsi: add scsi_host_alloc_node Jeff Moyer
2012-11-03 16:35   ` Bart Van Assche
2012-11-05 14:06     ` Jeff Moyer
2012-11-02 21:45 ` [patch,v2 02/10] scsi: make __scsi_alloc_queue numa-aware Jeff Moyer
2012-11-02 21:45 ` [patch,v2 03/10] scsi: make scsi_alloc_sdev numa-aware Jeff Moyer
2012-11-02 21:45 ` [patch,v2 04/10] scsi: allocate scsi_cmnd-s from the device's local numa node Jeff Moyer
2012-11-03 16:36   ` Bart Van Assche
2012-11-05 14:09     ` Jeff Moyer
2012-11-02 21:45 ` [patch,v2 05/10] sd: use alloc_disk_node Jeff Moyer
2012-11-03 16:37   ` Bart Van Assche
2012-11-05 14:12     ` Jeff Moyer
2012-11-05 14:57       ` Bart Van Assche
2012-11-05 15:32         ` taco
2012-11-02 21:45 ` [patch,v2 06/10] ata: use scsi_host_alloc_node Jeff Moyer
2012-11-02 21:46 ` [patch,v2 07/10] megaraid_sas: " Jeff Moyer
2012-11-02 21:46 ` [patch,v2 08/10] mpt2sas: " Jeff Moyer
2012-11-02 21:46 ` [patch,v2 09/10] lpfc: " Jeff Moyer
2012-11-02 21:46 ` [patch,v2 10/10] cciss: use blk_init_queue_node Jeff Moyer
2012-11-06 15:41 ` [patch,v2 00/10] make I/O path allocations more numa-friendly Elliott, Robert (Server Storage)
2012-11-06 19:12   ` Bart Van Assche
2012-11-09 20:46     ` Jeff Moyer
2012-11-10  8:56       ` Bart Van Assche [this message]
2012-11-12 21:26         ` Jeff Moyer
2012-11-13  1:26           ` Elliott, Robert (Server Storage)
2012-11-13 15:44             ` Jeff Moyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=509E16B8.4070506@acm.org \
    --to=bvanassche@acm.org \
    --cc=Elliott@hp.com \
    --cc=jmoyer@redhat.com \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.