Linux SCSI subsystem development
From: Bart Van Assche <bvanassche@acm.org>
To: David Jeffery <djeffery@redhat.com>,
	linux-scsi@vger.kernel.org,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: Re: [PATCH] scsi: Wake up the error handler when final completions race against each other
Date: Wed, 14 Jan 2026 09:44:47 -0800
Message-ID: <7bfc9707-0057-405f-8f11-17bd932713bc@acm.org>
In-Reply-To: <20260113161036.6730-1-djeffery@redhat.com>

On 1/13/26 9:08 AM, David Jeffery wrote:
> The ordering between marking commands completed or failed is fragile. It
> is meant to wake the error handler only when the last running command
> completes or times out, but it has race conditions. These can cause the
> SCSI layer to fail to wake the error handler, leaving I/O through the SCSI
> host stuck because the error state cannot advance.
> 
> First, there is a memory ordering issue within scsi_dec_host_busy. The
> write which clears SCMD_STATE_INFLIGHT may be reordered with reads counting
> in scsi_host_busy. While the local CPU will see its own write, reordering
> can allow other CPUs in scsi_dec_host_busy or scsi_eh_inc_host_failed to
> see a raised busy count, causing no CPU to see a host busy equal to the
> host_failed count.
> 
> This race condition can be prevented with a memory barrier on the error
> path to force the write to be visible before counting host busy commands.
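
For readers less familiar with this pattern: the first race looks like the
classic store-buffering case. Below is a minimal user-space sketch in C11,
not the kernel code. The names toy_inflight, toy_completer and
toy_count_missed are hypothetical, and atomic_thread_fence() only stands in
for a full memory barrier; the primitive the patch actually uses is not shown
in this message. Each thread clears its own in-flight flag and then reads the
other one. With the fence in place, at least one of the two threads should
observe that nothing is in flight any more.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the in-flight bits of two commands. */
static atomic_bool toy_inflight[2];
/* Number of completers that counted zero busy commands in one trial. */
static atomic_int toy_saw_idle;

/* Clear our own in-flight flag, then look at the other command's flag. */
static void *toy_completer(void *arg)
{
	int me = *(int *)arg;

	assert(me == 0 || me == 1);
	atomic_store_explicit(&toy_inflight[me], false, memory_order_relaxed);
	/* Full barrier: the store above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_load_explicit(&toy_inflight[1 - me], memory_order_relaxed))
		atomic_fetch_add(&toy_saw_idle, 1);
	return NULL;
}

/*
 * Run the two completers against each other 'trials' times. Returns the
 * number of trials in which neither thread saw an idle host, or -1 if a
 * thread could not be created.
 */
int toy_count_missed(int trials)
{
	int ids[2] = { 0, 1 };
	int missed = 0;

	for (int t = 0; t < trials; t++) {
		pthread_t th[2];

		atomic_store(&toy_inflight[0], true);
		atomic_store(&toy_inflight[1], true);
		atomic_store(&toy_saw_idle, 0);
		if (pthread_create(&th[0], NULL, toy_completer, &ids[0]))
			return -1;
		if (pthread_create(&th[1], NULL, toy_completer, &ids[1])) {
			pthread_join(th[0], NULL);
			return -1;
		}
		pthread_join(th[0], NULL);
		pthread_join(th[1], NULL);
		if (atomic_load(&toy_saw_idle) == 0)
			missed++;
	}
	return missed;
}
```

Calling toy_count_missed(2000) is expected to return 0 with the fence
present. Without the fence, the C11 memory model would allow both threads to
read a stale "true", which is the user-space analogue of no CPU seeing a busy
count equal to host_failed.
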
> 
> Second, there is a general ordering issue with scsi_eh_inc_host_failed. By
> counting busy commands before incrementing host_failed, it can race with a
> final command in scsi_dec_host_busy, such that scsi_dec_host_busy does not
> see host_failed incremented but scsi_eh_inc_host_failed counts busy
> commands before SCMD_STATE_INFLIGHT is cleared by scsi_dec_host_busy,
> resulting in neither waking the error handler task.
> 
> This needs the call to scsi_host_busy to be moved after host_failed is
> incremented to close the race condition.
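
The second race does not depend on memory ordering at all, so it can be
replayed in a single thread. The following is a simplified toy model under
stated assumptions, not the code from the patch: struct toy_host and all
toy_* helpers are hypothetical, each step is treated as atomic, and one
failing command plus one normally completing command are assumed to be in
flight. It replays the interleaving described above for the old order and
both orderings for the new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_host {
	bool inflight[2];	/* [0] = failing command, [1] = completing one */
	int host_failed;
	bool eh_woken;
};

/* Count the commands that are still marked in flight. */
static int toy_busy(const struct toy_host *h)
{
	int n = 0;

	for (size_t i = 0; i < 2; i++)
		n += h->inflight[i];
	return n;
}

/* Wake the error handler once every busy command has failed. */
static void toy_wake_if_done(struct toy_host *h, int busy)
{
	/* In this model a failed command stays in flight. */
	assert(h->host_failed <= busy);
	if (busy == h->host_failed)
		h->eh_woken = true;
}

/* Completion path of the last normally completing command. */
static void toy_complete(struct toy_host *h)
{
	h->inflight[1] = false;
	if (h->host_failed)
		toy_wake_if_done(h, toy_busy(h));
}

/* Old order: count busy, let the completion run, then increment. */
bool toy_old_order_wakes(void)
{
	struct toy_host h = { .inflight = { true, true } };
	int busy = toy_busy(&h);	/* stale snapshot: 2 */

	toy_complete(&h);		/* host_failed is still 0: no wakeup */
	h.host_failed++;
	toy_wake_if_done(&h, busy);	/* 2 != 1: no wakeup either */
	return h.eh_woken;
}

/* New order: increment host_failed first, count busy afterwards. */
bool toy_new_order_wakes(bool completion_first)
{
	struct toy_host h = { .inflight = { true, true } };

	if (completion_first)
		toy_complete(&h);
	h.host_failed++;
	toy_wake_if_done(&h, toy_busy(&h));
	if (!completion_first)
		toy_complete(&h);
	return h.eh_woken;
}
```

In this model toy_old_order_wakes() returns false, which is the lost wakeup,
while toy_new_order_wakes() returns true for either ordering of the two
paths.
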
Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Thread overview: 3+ messages
2026-01-13 16:08 [PATCH] scsi: Wake up the error handler when final completions race against each other David Jeffery
2026-01-14 17:44 ` Bart Van Assche [this message]
2026-01-17  4:38 ` Martin K. Petersen
