From: Artem Bityutskiy <dedekind1@gmail.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Stefan Haberland <sth@linux.vnet.ibm.com>,
	Jens Axboe <axboe@kernel.dk>,
	"Herring, Jan-kristian Augustin" <jan-kristian.augustin.herring@intel.com>,
	Thorsten Leemhuis <regressions@leemhuis.info>
Subject: Re: regression: SCSI/SATA failure
Date: Mon, 05 Mar 2018 16:41:43 +0200
Message-ID: <1520260903.2637.34.camel@gmail.com>
In-Reply-To: <1519311270.2535.53.camel@intel.com>

Linux-Regression-ID: lr#15a115

On Thu, 2018-02-22 at 16:54 +0200, Artem Bityutskiy wrote:
> Hi Christoph,
> 
> one of our test box Skylake servers does not boot with v4.16-rcX.
> Bisection lead us to this commit:
> 
> 84676c1f21e8 genirq/affinity: assign vectors to all possible CPUs
> 
> Reverting this single commit fixes the problem.
> 
> The server is a Dell R640 machine with the latest Dell BIOS. It has a
> single SATA SSD and we do not use raid, even though the system does
> have a megaraid controller.

Correction: we have Raid0 with this single disk.

> Are you aware of this issue? Below is the failure message and the
> full
> dmesg with some debugging boot parameters is here:
> 
> https://pastebin.com/raw/tTYrTAEQ

FYI, the regression still exists and reverting this single patch fixes
it. But today Dell server

I did not have time to really debug this, but I think people who are
working with this should quickly see what is going on.

I think the platform reports a far too large possible CPU count. Indeed,
in dmesg I see this:

[    0.000000] smpboot: Allowing 328 CPUs, 224 hotplug CPUs

224 is way too large for this system. It only has 2 sockets, but the
number looks as if the system had 4 sockets.

The commit changes IRQ affinity logic from being per-present CPU to
being per-possible CPU:

-       for_each_present_cpu(cpu)
+       for_each_possible_cpu(cpu)

And it looks like this has an unexpected side-effect on this Dell
platform.

Artem.
