Re: Interrupt strangeness in scsi_request_fn()

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-scsi@vger.kernel.org
Subject: Re: Interrupt strangeness in scsi_request_fn()
Date: Fri, 01 Aug 2014 18:00:03 -0400	[thread overview]
Message-ID: <1406930403.2654.40.camel@jarvis> (raw)
In-Reply-To: <20140801204848.GE7525@quack.suse.cz>

On Fri, 2014-08-01 at 22:48 +0200, Jan Kara wrote:
>   Hello,
> 
>   when debugging one bug, I've noticed one strangeness in
> scsi_request_fn(). We enter it with interrupts disabled and queue_lock
> held. In the function we do stuff like:
>                 spin_unlock_irq(shost->host_lock);
>                 /*
>                  * Finally, initialize any error handling parameters, and
>                  * set up
>                  * the timers for timeouts.
>                  */
>                 scsi_init_cmd_errh(cmd);
> 
>                 /*
>                  * Dispatch the command to the low-level driver.
>                  */
>                 rtn = scsi_dispatch_cmd(cmd);
>                 spin_lock_irq(q->queue_lock);
> 
> Why do we enable interrupts there? The thing is that scsi_request_fn() can
> be called from IO completion (thus softirq context) and if we enable
> interrupts there, another HW interrupt can interrupt softirq processing
> which seems wrong to me.

According to my reading of kernel/softirq.c, it looks like interrupts
*are* already enabled when softirq functions are called by design.  That
doesn't mean your patch is necessarily wrong, it just means the premise
above isn't quite right.

> Now to admit my motivation I have reports from one customer (SLES kernel so
> too old to be really interesting upstream but still of some value I
> believe) where he's seeing softlockup reports when fully loading his SAN
> and from the crash dumps it seems the machine simply spends too long in IO
> completion because HW interrupts keep interrupting it plus there's
> contention on shost->host_lock. If I change scsi_request_fn() to never
> enable interrupts problems go away.
> 
> So what would people say to something like the attached patch?

I think the theory for enabling interrupts here is that the path is
rather long ... it goes through the driver to the hardware and that the
problem you describe above *should* be self correcting.  If we keep
interrupting dispatch, then we're throttling the outbound I/O which
should reduce the returning interrupts.

I'm not immovably opposed, I'd just like to see better justification.

James

     prev parent reply	other threads:[~2014-08-01 22:00 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-01 20:48 Interrupt strangeness in scsi_request_fn() Jan Kara
2014-08-01 22:00 ` James Bottomley [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1406930403.2654.40.camel@jarvis \
    --to=james.bottomley@hansenpartnership.com \
    --cc=jack@suse.cz \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).