All of lore.kernel.org
 help / color / mirror / Atom feed
From: Suparna Bhattacharya <suparna@in.ibm.com>
To: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] dio: remove bogus refcounting BUG_ON
Date: Thu, 5 Jul 2007 09:23:13 +0530	[thread overview]
Message-ID: <20070705035313.GA4165@in.ibm.com> (raw)
In-Reply-To: <1183602311.29435.1.camel@dyn9047017100.beaverton.ibm.com>

On Wed, Jul 04, 2007 at 07:25:10PM -0700, Badari Pulavarty wrote:
> On Tue, 2007-07-03 at 15:28 -0700, Zach Brown wrote:
> > Linus, Andrew, please apply the bug fix patch at the end of this reply
> > for .22.
> > 
> > > >>One of our perf. team ran into this while doing some runs.
> > > >>I didn't see anything obvious - it looks like we converted
> > > >>async IO to synchronous one. I didn't spend much time digging
> > > >>around.
> > 
> > OK, I think this BUG_ON() is just broken.  I wasn't able to find any
> > obvious bugs from reading the code which would cause the BUG_ON() to
> > fire.  If it's reproducible I'd love to hear what the recipe is.
> > 
> > I did notice that this BUG_ON() is evaluating dio after having dropped
> > it's ref :/.  So it's not completely absurd to fear that it's a race
> > with the dio's memory being reused, but that'd be a pretty tight race.
> > 
> > Let's remove this stupid BUG_ON and see if that test box still has
> > trouble.  It might just hit the valid BUG_ON a few lines down, but this
> > unsafe BUG_ON needs to go.
> 
> I went through the code multiple times, I can't find how we can trigger
> the BUG_ON(). But unfortunately, our perf. team is able reproduce the
> problem. Debug indicated that, the ret2 == 1 :(
> 
> Not sure how that can happen. Ideas ?

Does it trigger even if you avoid referencing dio in the BUG_ON(), i.e.
with something like ...


--- direct-io.c	2007-07-02 01:24:24.000000000 +0530
+++ direct-io-debug.c	2007-07-05 09:18:56.000000000 +0530
@@ -1104,9 +1104,10 @@ direct_io_worker(int rw, struct kiocb *i
 	 * decide to wake the submission path atomically.
 	 */
 	spin_lock_irqsave(&dio->bio_lock, flags);
+	is_async = dio->is_async;
 	ret2 = --dio->refcount;
 	spin_unlock_irqrestore(&dio->bio_lock, flags);
-	BUG_ON(!dio->is_async && ret2 != 0);
+	BUG_ON(!is_async && ret2 != 0);
 	if (ret2 == 0) {
 		ret = dio_complete(dio, offset, ret);
 		kfree(dio);

> 
> Thanks,
> Badari
> 
> > 
> > -------
> > 
> > dio: remove bogus refcounting BUG_ON
> > 
> > Badari Pulavarty reported a case of this BUG_ON is triggering during
> > testing.  It's completely bogus and should be removed.
> > 
> > It's trying to notice if we left references to the dio hanging around in
> > the sync case.  They should have been dropped as IO completed while this
> > path was in dio_await_completion().  This condition will also be
> > checked, via some twisty logic, by the BUG_ON(ret != -EIOCBQUEUED) a few
> > lines lower.  So to start this BUG_ON() is redundant.
> > 
> > More fatally, it's dereferencing dio-> after having dropped its
> > reference.  It's only safe to dereference the dio after releasing the
> > lock if the final reference was just dropped.  Another CPU might free
> > the dio in bio completion and reuse the memory after this path drops the
> > dio lock but before the BUG_ON() is evaluated.
> > 
> > This patch passed aio+dio regression unit tests and aio-stress on ext3.
> > 
> > Signed-off-by: Zach Brown <zach.brown@oracle.com>
> > Cc: Badari Pulavarty <pbadari@us.ibm.com>
> > 
> > diff -r 509ce354ae1b fs/direct-io.c
> > --- a/fs/direct-io.c	Sun Jul 01 22:00:49 2007 +0000
> > +++ b/fs/direct-io.c	Tue Jul 03 14:56:41 2007 -0700
> > @@ -1106,7 +1106,7 @@ direct_io_worker(int rw, struct kiocb *i
> >  	spin_lock_irqsave(&dio->bio_lock, flags);
> >  	ret2 = --dio->refcount;
> >  	spin_unlock_irqrestore(&dio->bio_lock, flags);
> > -	BUG_ON(!dio->is_async && ret2 != 0);
> > +
> >  	if (ret2 == 0) {
> >  		ret = dio_complete(dio, offset, ret);
> >  		kfree(dio);
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India


  reply	other threads:[~2007-07-05  3:50 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-28  3:01 DIO panic on 2.6.21.5 Badari Pulavarty
2007-06-29 22:10 ` Zach Brown
2007-06-30  0:03   ` Badari Pulavarty
2007-07-03 22:28     ` [PATCH] dio: remove bogus refcounting BUG_ON Zach Brown
2007-07-05  2:25       ` Badari Pulavarty
2007-07-05  3:53         ` Suparna Bhattacharya [this message]
2007-07-05 17:11         ` Zach Brown
2007-07-05 17:16           ` Badari Pulavarty
2007-07-05 17:21             ` Zach Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070705035313.GA4165@in.ibm.com \
    --to=suparna@in.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbadari@us.ibm.com \
    --cc=torvalds@linux-foundation.org \
    --cc=zach.brown@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.