From: "Maurizio Lombardi" <mlombard@arkamax.eu>
To: "Hannes Reinecke" <hare@suse.de>,
	"Maurizio Lombardi" <mlombard@redhat.com>, <kbusch@kernel.org>
Cc: <mheyne@amazon.de>, <emilne@redhat.com>, <jmeneghi@redhat.com>,
	<linux-nvme@lists.infradead.org>, <dwagner@suse.de>,
	<mlombard@arkamax.eu>, <mkhalfella@purestorage.com>,
	<chaitanyak@nvidia.com>, <hare@kernel.org>, <hch@lst.de>
Subject: Re: [PATCH V3 0/8] nvme: Refactor and expose per-controller timeout configuration
Date: Mon, 13 Apr 2026 11:21:45 +0200
Message-ID: <DHRX0LGNILAX.LTFZVPDRLGOH@arkamax.eu>
In-Reply-To: <e7a5ab7f-722e-4767-a9a6-2a1ae73bd8b7@suse.de>

On Mon Apr 13, 2026 at 10:12 AM CEST, Hannes Reinecke wrote:
> On 4/10/26 09:39, Maurizio Lombardi wrote:
>> This patchset tries to address some limitations in how the NVMe driver handles
>> command timeouts.
>> Currently, the driver relies heavily on global module parameters
>> (NVME_IO_TIMEOUT and NVME_ADMIN_TIMEOUT), making it difficult for users to
>> tune timeouts for specific controllers that may have very different
>> characteristics. Also, in some cases, manual changes to sysfs timeout values
>> are ignored by the driver logic.
>> 
>> For example this patchset removes the unconditional timeout assignment in
>> nvme_init_request. This allows the block layer to correctly apply the request
>> queue's timeout settings, ensuring that user-initiated changes via sysfs
>> are actually respected for all requests.
>> 
>> It introduces new sysfs attributes (admin_timeout and io_timeout) to the NVMe
>> controller. This allows users to configure distinct timeout requirements for
>> different controllers rather than relying on global module parameters.
>> 
> What about KATO?
> With this patchset the user can set arbitrary values to the I/O timeout,
> which easily can be lower than KATO.

It's worth noting that, unless I am missing something, the user can already
trigger this exact scenario today by setting an I/O timeout lower than KATO
via the global nvme_io_timeout module parameter.

> And as per spec a KATO timeout implies a transport disruption, requiring
> a controller reset.
> But due to the internal design of the nvme error handling we do conflate
> transport disruption and command timeout, so an _I/O_ timeout triggers
> a controller reset.

This is true only for the TCP and RDMA host drivers,
because PCI and FC already support I/O aborts.

The Apple driver doesn't support abort either, but the comments don't make
it clear whether the driver lacks support for it or the controller
doesn't handle the abort command.

> Which means that a command timeout lower than KATO will result in false
> positives, with the controller being reset even though the connection
> is perfectly happy.

Right.

We could try to send abort commands from the timeout handlers of the RDMA
and TCP host drivers. Maybe cancel, if supported, falling back to abort if
cancel commands are not available. I already had patches for this kind of
stuff.
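
To make the fallback order concrete, here is a minimal userspace sketch.
It is not taken from those patches or from the driver: all the names
(timeout_action, ctrl_caps, pick_timeout_action) are hypothetical, and
escalating to a reset when the same command times out a second time is
an assumption on my side.

```c
#include <stdbool.h>

/* Hypothetical names, illustration only; not actual nvme host driver code. */
enum timeout_action {
	TIMEOUT_SEND_CANCEL,	/* controller supports the Cancel command */
	TIMEOUT_SEND_ABORT,	/* fall back to the Abort command */
	TIMEOUT_RESET_CTRL,	/* last resort: reset the controller */
};

struct ctrl_caps {
	bool cancel_supported;
	bool abort_supported;
};

static enum timeout_action pick_timeout_action(const struct ctrl_caps *caps,
					       bool already_tried)
{
	/* Assumption: a second timeout on the same command escalates to reset */
	if (already_tried)
		return TIMEOUT_RESET_CTRL;
	if (caps->cancel_supported)
		return TIMEOUT_SEND_CANCEL;
	if (caps->abort_supported)
		return TIMEOUT_SEND_ABORT;
	/* Neither command is available: nothing left but a reset */
	return TIMEOUT_RESET_CTRL;
}
```

With something along these lines, an I/O timeout lower than KATO would
first cost a cancel or an abort rather than a full controller reset.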

Maurizio

