From: hch@lst.de (Christoph Hellwig)
Subject: [PATCH 0/4] Rework NVMe abort handling
Date: Thu, 19 Jul 2018 16:50:05 +0200
Message-ID: <20180719145005.GA21000@lst.de>
In-Reply-To: <20180719143534.i36vo45lhz24xbrg@linux-x5ow.site>

On Thu, Jul 19, 2018 at 04:35:34PM +0200, Johannes Thumshirn wrote:
> > No with the code following what we have in PCIe that just means
> > we'll eventually controller reset after the I/O command times out
> > the second time as we still won't have seen a completion for it.
> 
> Exactly that was my intention.

Which means the only thing you do for your use case is to delay
recovery even further.

> OK, let me see where I'm stuck here. We're issuing a command, it gets
> lost due to $REASON and I'm aborting it. The upper layers then
> eventually retry the command and it arrives at the target side. But so
> does the old command as well and we have a duplicate. Correct?

The upper layer is only going to retry after tearing down the transport
connection.  And a tear down of the connection MUST clear all pending
commands on the way.  If it doesn't we are in deep, deep trouble.

An NVMe abort has no chance of clearing things at the transport layer.
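
Editorial illustration, not part of the original message: the ordering
argument above (tear down first, clear everything pending, only then retry)
can be modelled with a small toy. This is a sketch under stated assumptions
and not kernel code; `ToyTransport`, `recover_and_retry`, and all method
names are hypothetical and do not correspond to any NVMe driver interface.

```python
class ToyTransport:
    """Hypothetical model of a transport connection (not kernel code)."""

    def __init__(self):
        self.connected = True
        self.inflight = {}   # command id -> payload, sent but not yet delivered
        self.received = []   # command ids the (toy) target has actually seen

    def submit(self, cid, payload):
        if not self.connected:
            raise RuntimeError("transport is down")
        self.inflight[cid] = payload

    def deliver(self):
        """Everything still in flight finally arrives at the target."""
        self.received.extend(sorted(self.inflight))
        self.inflight.clear()

    def teardown(self):
        """Drop the connection and clear ALL pending commands on the way."""
        self.connected = False
        cancelled = dict(self.inflight)
        self.inflight.clear()
        return cancelled

    def reconnect(self):
        self.connected = True


def recover_and_retry(transport):
    """Retry only after teardown, so a stale copy can never arrive later."""
    cancelled = transport.teardown()
    if transport.inflight:
        raise RuntimeError("teardown left commands pending")
    transport.reconnect()
    for cid, payload in cancelled.items():
        transport.submit(cid, payload)
    return sorted(cancelled)
```

In this toy, calling `recover_and_retry()` on a transport with two stuck
commands and then `deliver()` leaves each command id in `received` exactly
once. Resubmitting without the teardown step would simply overwrite the
dictionary entry here, so the toy only illustrates the ordering, not the
duplicate-delivery failure itself.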


