public inbox for linux-kernel@vger.kernel.org
From: Christoph Hellwig <hch@lst.de>
To: Gloria Tsai <Gloria.Tsai@ssstc.com>
Cc: Christoph Hellwig <hch@lst.de>, Jongpil Jung <jongpuls@gmail.com>,
	Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jongpil19.jung@samsung.com" <jongpil19.jung@samsung.com>,
	"jongheony.kim@samsung.com" <jongheony.kim@samsung.com>,
	"dj54.sohn@samsung.com" <dj54.sohn@samsung.com>
Subject: Re: [PATCH V3 1/1] nvme: Add quirk for LiteON CL1 devices running FW 220TQ,22001
Date: Tue, 3 Nov 2020 10:23:18 +0100
Message-ID: <20201103092318.GA16071@lst.de>
In-Reply-To: <HK2PR02MB4004EE20977D0B14516B030AEE110@HK2PR02MB4004.apcprd02.prod.outlook.com>

On Tue, Nov 03, 2020 at 02:21:16AM +0000, Gloria Tsai wrote:
> When host issue shutdown + D3hot in suspend, NVMe drive might have
> chance choosing wrong pointer which has already been used by GC then
> cause over program.
> Do GC before shutdown -> delete IO Q -> shutdown from host -> breakup GC -> D3hot -> enter PS4 -> have a chance swap block -> use wrong pointer on device SRAM -> over program

I.e. there is data corruption?

> The issue only happens in simple suspend (shutdown+D3hot) with specific FW on Kahoku board.

Is Kahoku a specific LiteOn controller, or is it the host system?

Maybe the main issue with this patch is that it mixes up two axes:

 - use power states for suspend despite HMB on specific host systems
   identified by the DMI ids.  This kinda makes sense to me, as
   power state based suspend has lots of advantages, so having
   a whitelist for when to use it seems ok, despite the clutter that
   this causes.
 - then tie this to specific NVMe devices that don't work without this
   quirk, which leaves open the issue of what we do when we encounter
   such a device in a different host system.  If shutdown + D3hot causes
   problems there, it seems like for the case where the above quirk
   doesn't apply we should just skip the shutdown and let the D3hot do a
   surprise power removal?  That would mean recovery when coming back
   from the suspend, but would cause less corruption?

In other words I think this needs to be two patches:

 1) quirk based on the DMI table and allow power state based
    suspend on given host systems even when an HMB is enabled
 2) quirk based on the NVMe device (and if possible use the PCI IDs)
    to disable shutdown before suspend (possibly with a warning printk
    when this happens)
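
As a rough illustration of how the two quirks could stay independent, here
is a userspace C sketch, not driver code.  Every name in it (the enum, the
structs, the helper functions) is hypothetical, and the table entries are
placeholders rather than real DMI strings or PCI IDs.  It also assumes
that an enabled HMB is the only thing ruling out power state based
suspend, which is a simplification of what a real driver would check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical suspend strategies; names are illustrative only. */
enum suspend_method {
	SUSPEND_POWER_STATE,   /* keep the controller up, enter a low power state */
	SUSPEND_SHUTDOWN,      /* simple suspend: shutdown + D3hot */
	SUSPEND_SKIP_SHUTDOWN, /* no shutdown, D3hot acts as surprise power removal */
};

/* Axis 1: host systems, matched on DMI-style strings. */
struct host_id {
	const char *vendor;
	const char *board;
};

/* Axis 2: NVMe devices, matched on PCI vendor/device IDs. */
struct device_id {
	uint16_t vendor;
	uint16_t device;
};

/* Placeholder entries, not real DMI strings or PCI IDs. */
static const struct host_id power_state_hosts[] = {
	{ "ExampleVendor", "ExampleBoard" },
};

static const struct device_id no_shutdown_devices[] = {
	{ 0xffff, 0x0001 },
};

/* Quirk 1: is this host on the list that may use power states despite HMB? */
static bool host_allows_power_state(const struct host_id *host)
{
	assert(host && host->vendor && host->board);
	for (size_t i = 0; i < ARRAY_LEN(power_state_hosts); i++) {
		if (!strcmp(host->vendor, power_state_hosts[i].vendor) &&
		    !strcmp(host->board, power_state_hosts[i].board))
			return true;
	}
	return false;
}

/* Quirk 2: is this device known to misbehave on shutdown + D3hot? */
static bool device_breaks_on_shutdown(const struct device_id *dev)
{
	assert(dev);
	for (size_t i = 0; i < ARRAY_LEN(no_shutdown_devices); i++) {
		if (dev->vendor == no_shutdown_devices[i].vendor &&
		    dev->device == no_shutdown_devices[i].device)
			return true;
	}
	return false;
}

enum suspend_method pick_suspend_method(const struct host_id *host,
					const struct device_id *dev,
					bool hmb_enabled)
{
	/* Assumption: without an HMB, power state based suspend is usable. */
	if (!hmb_enabled || host_allows_power_state(host))
		return SUSPEND_POWER_STATE;

	/* The device quirk applies on any host that is not on the list. */
	if (device_breaks_on_shutdown(dev)) {
		fprintf(stderr, "warning: skipping shutdown before suspend\n");
		return SUSPEND_SKIP_SHUTDOWN;
	}
	return SUSPEND_SHUTDOWN;
}
```

With the two lookups kept apart, an affected device in a host that is not
on the DMI list still gets a defined behaviour (skip the shutdown and
warn) instead of falling back to the path that triggers the problem.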

