Re: Issue with fc_exch_alloc failing initiated by fc_queuecommand on NUMA or large configurations with Intel ixgbe running FCOE

From: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Linux SCSI Mailinglist
	<linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	fcoe-devel-s9riP+hp16TNLxjTenLetw@public.gmane.org
Cc: "Curtis Taylor (cjt-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org)"
	<cjt-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
	Bud Brown <bubrown-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: Issue with fc_exch_alloc failing initiated by fc_queuecommand on NUMA or large configurations with Intel ixgbe running FCOE
Date: Sat, 8 Oct 2016 08:57:10 -0400 (EDT)
Message-ID: <209207528.804499.1475931430678.JavaMail.zimbra@redhat.com>
In-Reply-To: <1812349047.803888.1475929839972.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Hello

This has been a tough problem to chase down, but it was finally reproduced.
The issue is present on both RHEL kernels and upstream, which justifies reporting it here.

It's out there, and some may not even be aware it is happening, other than seeing very slow performance with ixgbe and software FCoE on large configurations.

The upstream kernel used to reproduce it is 4.8.0.

I/O performance was noted to be badly impacted on a large NUMA test system (64 CPUs, 4 NUMA nodes) running the software FCoE stack with Intel ixgbe interfaces.
After capturing blktraces we saw that for every I/O there was at least one blk_requeue_request, and sometimes hundreds or more.
This resulted in IOPS rates being marginal at best, with queuing and high wait times.
After narrowing this down with systemtap and trace-cmd we added further debug, and it became apparent that this was due to SCSI_MLQUEUE_HOST_BUSY being returned.
So I/O completes, but very slowly, as it is constantly having to be requeued.

The identical configuration in our lab with a single NUMA node and 4 CPUs does not see this issue at all.
The same large system that reproduces this still sees the issue when booted with numa=off.

The flow is as follows:

From within fc_queuecommand:
          fc_fcp_pkt_send() calls fc_fcp_cmd_send(), which calls tt.exch_seq_send(), which calls fc_exch_seq_send()

This fails because fc_exch_alloc() returns NULL, as the list traversal never finds a match.

static struct fc_seq *fc_exch_seq_send(struct fc_lport *lport,
				       struct fc_frame *fp,
				       void (*resp)(struct fc_seq *,
						    struct fc_frame *fp,
						    void *arg),
				       void (*destructor)(struct fc_seq *,
							  void *),
				       void *arg, u32 timer_msec)
{
	struct fc_exch *ep;
	struct fc_seq *sp = NULL;
	struct fc_frame_header *fh;
	struct fc_fcp_pkt *fsp = NULL;
	int rc = 1;

	ep = fc_exch_alloc(lport, fp);	/* ***** Called here and fails */
	if (!ep) {
		fc_frame_free(fp);
		printk("RHDEBUG: In fc_exch_seq_send returned NULL because !ep with ep = %p\n",ep);
		return NULL;
	}
..
..
]

/**
 * fc_exch_alloc() - Allocate an exchange from an EM on a
 *		     local port's list of EMs.
 * @lport: The local port that will own the exchange
 * @fp:	   The FC frame that the exchange will be for
 *
 * This function walks the list of exchange manager(EM)
 * anchors to select an EM for a new exchange allocation. The
 * EM is selected when a NULL match function pointer is encountered
 * or when a call to a match function returns true.
 */
static inline struct fc_exch *fc_exch_alloc(struct fc_lport *lport,
					    struct fc_frame *fp)
{
	struct fc_exch_mgr_anchor *ema;

	list_for_each_entry(ema, &lport->ema_list, ema_list)
		if (!ema->match || ema->match(fp))
			return fc_exch_em_alloc(lport, ema->mp);
	return NULL;	/* ***** Never matches so returns NULL */
}
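
A NULL from fc_exch_alloc() can come from two places in the code above: the walk finishing without a match, or the call to fc_exch_em_alloc() itself returning NULL after a match. The following is a minimal user-space sketch of that distinction, not the libfc code. All types and names here (em_anchor, free_slots, alloc_result, em_alloc, exch_alloc) are hypothetical stand-ins, and modelling the EM as a simple slot counter is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the libfc structures. */
struct frame { int type; };
struct exch { int id; };

struct em_anchor {
	bool (*match)(const struct frame *fp);	/* NULL means "match anything" */
	int free_slots;				/* assumed model of EM capacity */
	struct em_anchor *next;
};

enum alloc_result { ALLOC_OK, ALLOC_NO_MATCH, ALLOC_POOL_EMPTY };

static struct exch model_exch = { 1 };

/* Stand-in for fc_exch_em_alloc(): fails when no slot is free. */
struct exch *em_alloc(struct em_anchor *ema)
{
	if (ema->free_slots <= 0)
		return NULL;
	ema->free_slots--;
	return &model_exch;
}

/* Same walk as fc_exch_alloc(), but also records why NULL was returned. */
struct exch *exch_alloc(struct em_anchor *head, const struct frame *fp,
			enum alloc_result *why)
{
	struct em_anchor *ema;

	assert(why != NULL);
	for (ema = head; ema; ema = ema->next) {
		if (!ema->match || ema->match(fp)) {
			struct exch *ep = em_alloc(ema);

			*why = ep ? ALLOC_OK : ALLOC_POOL_EMPTY;
			return ep;
		}
	}
	*why = ALLOC_NO_MATCH;
	return NULL;
}

/* A match function that rejects every frame, to force the no-match path. */
bool match_never(const struct frame *fp)
{
	(void)fp;
	return false;
}
```

Calling exch_alloc() on an anchor whose match is match_never yields ALLOC_NO_MATCH, while an anchor with a NULL match and free_slots of 0 yields ALLOC_POOL_EMPTY. Equivalent debug in the kernel would need a print on each of the two paths to tell them apart.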

RHDEBUG: In fc_exch_seq_send returned NULL because !ep with ep = (null)
RHDEBUG: rc -1 with !seq = (null) after calling tt.exch_seq_send  within fc_fcp_cmd_send
RHDEBUG: rc non zero in :unlock within fc_fcp_cmd_send = -1
RHDEBUG: In fc_fcp_pkt_send, we returned from  rc = lport->tt.fcp_cmd_send with rc = -1

RHDEBUG: We hit SCSI_MLQUEUE_HOST_BUSY in fc_queuecommand with rval in fc_fcp_pkt_send=-1
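
The debug output above shows the NULL propagating up as rc = -1 and ending as a host-busy return. A rough user-space sketch of that propagation follows. Every name in it is hypothetical (the model_ prefix marks stand-ins for the real functions), MODEL_HOST_BUSY is a placeholder and not the kernel's constant, and allocs_left is an assumed knob standing in for whatever makes the exchange allocation fail.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder return codes for the model; not the kernel's values. */
enum { MODEL_QUEUE_OK = 0, MODEL_HOST_BUSY = 1 };

/* Assumed knob: how many more exchange allocations will succeed. */
struct model_lport { int allocs_left; };

struct model_seq { int unused; };
static struct model_seq model_seq_obj;

/* Stand-in for fc_exch_seq_send(): NULL when no exchange is obtained. */
struct model_seq *model_exch_seq_send(struct model_lport *lp)
{
	assert(lp != NULL);
	if (lp->allocs_left <= 0)
		return NULL;
	lp->allocs_left--;
	return &model_seq_obj;
}

/* Stand-in for fc_fcp_cmd_send(): a NULL sequence becomes rc = -1. */
int model_fcp_cmd_send(struct model_lport *lp)
{
	return model_exch_seq_send(lp) ? 0 : -1;
}

/* Stand-in for fc_fcp_pkt_send(): passes rc through unchanged. */
int model_fcp_pkt_send(struct model_lport *lp)
{
	return model_fcp_cmd_send(lp);
}

/* Stand-in for fc_queuecommand(): non-zero rc is reported as host busy. */
int model_queuecommand(struct model_lport *lp)
{
	return model_fcp_pkt_send(lp) ? MODEL_HOST_BUSY : MODEL_QUEUE_OK;
}

/* Submit ncmds commands and count how many come back as host busy. */
int model_count_busy(struct model_lport *lp, int ncmds)
{
	int i, busy = 0;

	for (i = 0; i < ncmds; i++)
		if (model_queuecommand(lp) == MODEL_HOST_BUSY)
			busy++;
	return busy;
}
```

In this model every host-busy return corresponds to one requeue by the caller, which is the behaviour the blktraces show as repeated blk_requeue_request events.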

I am trying to get my head around why a large multi-node system sees this issue even with NUMA disabled.
Has anybody seen this, or is anyone aware of it on configurations that use fc_queuecommand?

I am continuing to add debug to narrow this down.

Thanks
Laurence

Thread overview: 14+ messages
     [not found] <1812349047.803888.1475929839972.JavaMail.zimbra@redhat.com>
     [not found] ` <1812349047.803888.1475929839972.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-08 12:57   ` Laurence Oberman [this message]
     [not found]     ` <209207528.804499.1475931430678.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-08 17:35       ` Issue with fc_exch_alloc failing initiated by fc_queuecommand on NUMA or large configurations with Intel ixgbe running FCOE Hannes Reinecke
2016-10-08 17:53         ` [Open-FCoE] " Laurence Oberman
     [not found]           ` <1360350390.815966.1475949181371.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-08 19:44             ` Laurence Oberman
     [not found]               ` <1271455655.818631.1475955856691.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-09 15:52                 ` Laurence Oberman
     [not found]                   ` <141863610.848432.1476028364025.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-11 13:39                     ` Laurence Oberman
2016-10-11 14:51         ` [Open-FCoE] " Ewan D. Milne
2016-10-12 15:26           ` Ewan D. Milne
2016-10-12 15:46             ` Hannes Reinecke
2016-10-13  1:20               ` Laurence Oberman
     [not found]                 ` <1564904000.1465519.1476321625269.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-13 12:43                   ` Patch: Revert commit 3e22760d4db6fd89e0be46c3d132390a251da9c6 due to performance issues Laurence Oberman
     [not found]                     ` <2049046384.1533310.1476362610308.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-13 12:51                       ` Hannes Reinecke
2016-10-13 12:55                       ` Johannes Thumshirn
2016-10-14 20:39                     ` Patch: [Open-FCoE] " Martin K. Petersen
