From: James.Bottomley@HansenPartnership.com (James Bottomley)
Subject: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional
Date: Sun, 07 Feb 2016 15:07:21 -0800	[thread overview]
Message-ID: <1454886441.2329.27.camel@HansenPartnership.com> (raw)
In-Reply-To: <56B7C527.6050300@kernel.dk>

On Sun, 2016-02-07 at 15:28 -0700, Jens Axboe wrote:
> On 02/07/2016 09:04 AM, James Bottomley wrote:
> > On Sun, 2016-02-07 at 10:22 +0100, Christoph Hellwig wrote:
> > > Keith said it should be on by default, and I promised him to change
> > > it once we run into problems, which I guess this counts as.
> > > 
> > > But just curious:  what distro are you using?  Upstream systemd
> > > explicitly rejected using scsi_id for NVMe here:
> > > 
> > > 	https://github.com/systemd/systemd/issues/1453
> > > 
> > > and all my test systems don't do this either.
> > 
> > This was SUSE (in my case, openSUSE Leap).  I just checked the source
> > package; they patch the by-id rules back in for NVME:
> > 
> > # PATCH-FIX-SUSE 1101-rules-persistent-device-names-for-NVMe-devices.patch (bsc#944132)
> > Patch1101:      1101-rules-persistent-device-names-for-NVMe-devices.patch
> > 
> > The bugzilla is giving access denied for bug id 944132, so it's likely
> > some proprietary vendor problem.  The patch has no preamble, so it's
> > hard to tell what they were thinking.
> 
> I run root-on-nvme on my laptop, and I haven't observed any problems.

Me too apparently.  It looks like this problem may be SUSE specific
unless another distro has enabled this.  I can see why they would: you
do need persistent names for devices, even NVMe ones.

> Generally I hate for options to default y unless absolutely 
> necessary, it's a sure fire way to feature creep your kernel without 
> noticing. I don't think getting all hot about this issue is fair, if 
> the only known case is suse.

Well, OK, I'm annoyed because it was a systemd system, which means
debugging boot failures is excruciatingly difficult, so it took me a
week and a half to find out what the problem was.  Perhaps I was a bit
rash to label this as an easily foreseen problem.

I opened a bug against SUSE to tell them to turn it on:

https://bugzilla.opensuse.org/show_bug.cgi?id=965497

The second problem is that there's currently no way to transition to
using the serial attribute the way the udev 60-persistent-storage.rules
are written, so if distros have some by-id hack, it will have to be
maintained for a while.  I annotated the already closed bug on this in
systemd with the rules that work for me.
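
For illustration only, here is a minimal Python sketch of what a
serial-based persistent name could look like. It is not the udev rule
set mentioned above. It assumes each NVMe namespace exposes `serial`
and `model` attributes under /sys/block/<dev>/device/, and the
function name and the "nvme-<model>_<serial>" naming scheme are
hypothetical, chosen only to resemble existing by-id links.

```python
from pathlib import Path


def nvme_by_id_names(sysfs_block="/sys/block"):
    """Map each NVMe namespace to a by-id style name built from sysfs.

    Assumption: <sysfs_block>/<dev>/device/serial (and optionally
    .../model) exist; devices without a serial are skipped.
    """
    names = {}
    root = Path(sysfs_block)
    if not root.is_dir():
        return names
    for dev in sorted(root.glob("nvme*n*")):
        serial_file = dev / "device" / "serial"
        model_file = dev / "device" / "model"
        if not serial_file.is_file():
            continue
        # sysfs attributes are typically space padded; strip them.
        serial = serial_file.read_text().strip()
        if not serial:
            continue
        model = model_file.read_text().strip() if model_file.is_file() else ""
        parts = [p for p in (model, serial) if p]
        # Hypothetical naming scheme, not what any distro ships.
        names[dev.name] = "nvme-" + "_".join(parts).replace(" ", "_")
    return names


if __name__ == "__main__":
    for dev, name in nvme_by_id_names().items():
        print(f"/dev/{dev} -> /dev/disk/by-id/{name}")
```

Because this reads sysfs directly, it does not depend on scsi_id or on
SG_IO support being compiled in, which is the property a transition to
the serial attribute would need.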

> If anything, let's make the description better. It's trying to be
> funny, it'd be better if it was descriptive and covered this case as
> well.

The problem with this is that when moving to new kernels, distro
maintainers don't read the new option help texts; they just take the
defaults.  However, I checked the only other distribution I use
(Debian) and they don't have an NVMe persistent ID hack, so if someone
checks Ubuntu and Red Hat, I think all the majors will be covered and
perhaps there's no need to do anything more.

James
