linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ewan D. Milne" <emilne@redhat.com>
To: Hannes Reinecke <hare@suse.de>
Cc: Laurence Oberman <loberman@redhat.com>,
	Linux SCSI Mailinglist <linux-scsi@vger.kernel.org>,
	fcoe-devel@open-fcoe.org,
	"Curtis Taylor (cjt@us.ibm.com)" <cjt@us.ibm.com>,
	Bud Brown <bubrown@redhat.com>
Subject: Re: [Open-FCoE] Issue with fc_exch_alloc failing initiated by fc_queuecommand on NUMA or large configurations with Intel ixgbe running FCOE
Date: Tue, 11 Oct 2016 10:51:29 -0400	[thread overview]
Message-ID: <1476197489.4019.68.camel@localhost.localdomain> (raw)
In-Reply-To: <2d30ad34-7a5e-b2ba-05f4-b0d831944f4c@suse.de>

On Sat, 2016-10-08 at 19:35 +0200, Hannes Reinecke wrote:
> You might actually be hitting a limitation in the exchange manager code.
> The libfc exchange manager tries to be really clever and will assign a 
> per-cpu exchange manager (probably to increase locality). However, we 
> only have a limited number of exchanges, so on large systems we might 
> actually run into a exchange starvation problem, where we have in theory 
> enough free exchanges, but none for the submitting cpu.
> 
> (Personally, the exchange manager code is in urgent need of reworking.
> It should be replaced by the sbitmap code from Omar).
> 
> Do check how many free exchanges are actually present for the stalling 
> CPU; it might be that you run into a starvation issue.

We are still looking into this but one thing that looks bad is that
the exchange manager code rounds up the number of CPUs to the next
power of 2 before dividing up the exchange id space (and uses the lsbs
of the xid to extract the CPU when looking up an xid). We have a machine
with 288 CPUs, this code is just begging for a rewrite as it looks to
be wasting most of the limited xid space on ixgbe FCoE.

Looks like we get 512 offloaded xids on this adapter and 4096-512
non-offloaded xids.  This would give 1 + 7 xids per CPU.  However, I'm
not sure that even 4096 / 288 = 14 would be enough to prevent stalling.

And, of course, potentially most of the CPUs aren't submitting I/O, so
the whole idea of per-CPU xid space is questionable.

-Ewan


  parent reply	other threads:[~2016-10-11 14:51 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1812349047.803888.1475929839972.JavaMail.zimbra@redhat.com>
     [not found] ` <1812349047.803888.1475929839972.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-08 12:57   ` Issue with fc_exch_alloc failing initiated by fc_queuecommand on NUMA or large configurations with Intel ixgbe running FCOE Laurence Oberman
     [not found]     ` <209207528.804499.1475931430678.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-08 17:35       ` Hannes Reinecke
2016-10-08 17:53         ` [Open-FCoE] " Laurence Oberman
     [not found]           ` <1360350390.815966.1475949181371.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-08 19:44             ` Laurence Oberman
     [not found]               ` <1271455655.818631.1475955856691.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-09 15:52                 ` Laurence Oberman
     [not found]                   ` <141863610.848432.1476028364025.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-11 13:39                     ` Laurence Oberman
2016-10-11 14:51         ` Ewan D. Milne [this message]
2016-10-12 15:26           ` [Open-FCoE] " Ewan D. Milne
2016-10-12 15:46             ` Hannes Reinecke
2016-10-13  1:20               ` Laurence Oberman
     [not found]                 ` <1564904000.1465519.1476321625269.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-13 12:43                   ` Patch: Revert commit 3e22760d4db6fd89e0be46c3d132390a251da9c6 due to performance issues Laurence Oberman
     [not found]                     ` <2049046384.1533310.1476362610308.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-13 12:51                       ` Hannes Reinecke
2016-10-13 12:55                       ` Johannes Thumshirn
2016-10-14 20:39                     ` Patch: [Open-FCoE] " Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1476197489.4019.68.camel@localhost.localdomain \
    --to=emilne@redhat.com \
    --cc=bubrown@redhat.com \
    --cc=cjt@us.ibm.com \
    --cc=fcoe-devel@open-fcoe.org \
    --cc=hare@suse.de \
    --cc=linux-scsi@vger.kernel.org \
    --cc=loberman@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).