From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chris Clayton <chris2553@googlemail.com>,
	Jaswinder Singh Rajput <jaswinder@kernel.org>,
	NeilBrown <neilb@suse.de>,
	linux-kernel@vger.kernel.org, scsi <linux-scsi@vger.kernel.org>,
	Tejun Heo <tj@kernel.org>,
	Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: 2.6.30-rc8 Oops whilst booting
Date: Mon, 08 Jun 2009 17:38:16 +0000
Message-ID: <1244482696.4079.345.camel@mulgrave.site>
In-Reply-To: <alpine.LFD.2.01.0906081003370.6847@localhost.localdomain>

On Mon, 2009-06-08 at 10:21 -0700, Linus Torvalds wrote:
> 
> On Mon, 8 Jun 2009, James Bottomley wrote:
> > 
> > The root cause is a reordering of the devices caused by the async code.
> 
> That's NULL information.
> 
> OF COURSE the root cause is the async code. We know that. We're looking 
> for the specifics.
> 
> In particular, before that commit, at most you will wait for too _much_. 
> In other words, it's a "good" wait. 
> 
> Your commit caused it to wait for less, and that then showed a bug. Not 
> all that surprising - it's now not waiting enough.

right ... my question was whether this exposed an existing bug that was
hidden by the waiting too much.  Actually, I audited all the async code
and that's impossible: we don't actually have any async domains at all
(except for the spurious superblock s_async_list, which never gets
anything added to its runqueue), so it must be a bug in the code.

> You tried to avoid a deadlock situation of waiting for too much, but you 
> avoided the deadlock by now waiting for too little. 
> 
> I also think that your code is simply buggy. As far as I can tell, in the 
> case of having both running and pending events, you'll always pick the 
> pending cookie. But it's the _running_ cookie that has the lower event 
> number, isn't it?

Yes, see later fix.  Assuming we get confirmation from the reporter, we
should be good to go.
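
For anyone following along, the running-versus-pending point can be
sketched in plain C. This is only an illustration under assumptions, not
the kernel code: the names cookie_t, NO_COOKIE, lowest(),
lowest_in_flight_buggy() and lowest_in_flight() are hypothetical, and
arrays stand in for the real lists. The first variant prefers the pending
cookie whenever one exists; the second takes the minimum across both sets.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef unsigned long long cookie_t;

/* Hypothetical sentinel meaning "nothing is in flight". */
#define NO_COOKIE ULLONG_MAX

/* Lowest cookie in an array, or NO_COOKIE when the array is empty. */
static cookie_t lowest(const cookie_t *c, size_t n)
{
    cookie_t min = NO_COOKIE;

    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (c[i] < min)
            min = c[i];
    return min;
}

/*
 * Buggy shape: when both running and pending work exist, the pending
 * cookie wins, even though running work was started earlier and so
 * carries the lower cookie.
 */
static cookie_t lowest_in_flight_buggy(const cookie_t *running, size_t nr,
                                       const cookie_t *pending, size_t np)
{
    if (np > 0)
        return lowest(pending, np);
    return lowest(running, nr);
}

/* Corrected shape: the lowest cookie across running and pending work. */
static cookie_t lowest_in_flight(const cookie_t *running, size_t nr,
                                 const cookie_t *pending, size_t np)
{
    cookie_t r = lowest(running, nr);
    cookie_t p = lowest(pending, np);

    return r < p ? r : p;
}
```

With cookie 3 still running and cookies 5 and 7 pending, the first
variant reports 5 and the second reports 3. If a waiter treats everything
below the reported cookie as finished, the first variant would let it
proceed while cookie 3 is still executing, which is the "waiting for too
little" case described above.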

> I dunno. It all looks very fishy to me.

Well, the other option is to revert the fix ... since there is no other
separated domain, there's nothing really to fix ... the original code
that showed the problem was a SCSI feature tree conversion of our
current async scanning code to the async infrastructure which used a
separate domain.

James


