public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* xfs vs. lockdep
@ 2006-10-09 17:58 Eric Sandeen
  2006-10-09 23:36 ` Vlad Apostolov
  2006-10-10  0:47 ` David Chinner
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Sandeen @ 2006-10-09 17:58 UTC (permalink / raw)
  To: xfs

FC6 kernels are oopsing when lockdep & memory debugging are turned on,
looks like due to this code:

xfs_ireclaim(xfs_inode_t *ip)
{
   ...
        /*
         * Here we do a spurious inode lock in order to coordinate with
         * xfs_sync().  This is because xfs_sync() references the inodes
         * in the mount list without taking references on the corresponding
         * vnodes.  We make that OK here by ensuring that we wait until
         * the inode is unlocked in xfs_sync() before we go ahead and
         * free it.  We get both the regular lock and the io lock because
         * the xfs_sync() code may need to drop the regular one but will
         * still hold the io lock.
         */
        xfs_ilock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL);
...
        /*
         * Free all memory associated with the inode.
         */
        xfs_idestroy(ip);
}

So, lock & free.  This frees memory that lockdep is still pointing to,
and tries to use later.

Calling xfs_iunlock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL); just before
xfs_idestroy seems to solve it, but is this safe...?

-Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-09 17:58 xfs vs. lockdep Eric Sandeen
@ 2006-10-09 23:36 ` Vlad Apostolov
  2006-10-10  3:06   ` Eric Sandeen
  2006-10-10  0:47 ` David Chinner
  1 sibling, 1 reply; 9+ messages in thread
From: Vlad Apostolov @ 2006-10-09 23:36 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

Eric Sandeen wrote:
> FC6 kernels are oopsing when lockdep & memory debugging are turned on,
> looks like due to this code:
>
> xfs_ireclaim(xfs_inode_t *ip)
> {
>    ...
>         /*
>          * Here we do a spurious inode lock in order to coordinate with
>          * xfs_sync().  This is because xfs_sync() references the inodes
>          * in the mount list without taking references on the corresponding
>          * vnodes.  We make that OK here by ensuring that we wait until
>          * the inode is unlocked in xfs_sync() before we go ahead and
>          * free it.  We get both the regular lock and the io lock because
>          * the xfs_sync() code may need to drop the regular one but will
>          * still hold the io lock.
>          */
>         xfs_ilock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL);
> ...
>         /*
>          * Free all memory associated with the inode.
>          */
>         xfs_idestroy(ip);
> }
>
> So, lock & free.  This frees memory that lockdep is still pointing to,
> and tries to use later.
>
> Calling xfs_iunlock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL); just before
> xfs_idestroy seems to solve it, but is this safe...?
>
> -Eric
>
>   
Hi Eric,

Could you please provide some more information. What kernel, test case 
and call stack at the time of the crash.

Thanks,
Vlad

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-09 17:58 xfs vs. lockdep Eric Sandeen
  2006-10-09 23:36 ` Vlad Apostolov
@ 2006-10-10  0:47 ` David Chinner
  2006-10-10  1:45   ` Timothy Shimmin
  2006-10-10  2:25   ` Eric Sandeen
  1 sibling, 2 replies; 9+ messages in thread
From: David Chinner @ 2006-10-10  0:47 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

On Mon, Oct 09, 2006 at 12:58:58PM -0500, Eric Sandeen wrote:
> FC6 kernels are oopsing when lockdep & memory debugging are turned on,
> looks like due to this code:
> 
> xfs_ireclaim(xfs_inode_t *ip)
> {
>    ...
>         /*
>          * Here we do a spurious inode lock in order to coordinate with
>          * xfs_sync().  This is because xfs_sync() references the inodes
>          * in the mount list without taking references on the corresponding
>          * vnodes.  We make that OK here by ensuring that we wait until
>          * the inode is unlocked in xfs_sync() before we go ahead and
>          * free it.  We get both the regular lock and the io lock because
>          * the xfs_sync() code may need to drop the regular one but will
>          * still hold the io lock.
>          */
>         xfs_ilock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL);
> ...
>         /*
>          * Free all memory associated with the inode.
>          */
>         xfs_idestroy(ip);
> }
> 
> So, lock & free.  This frees memory that lockdep is still pointing to,
> and tries to use later.
> 
> Calling xfs_iunlock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL); just before
> xfs_idestroy seems to solve it, but is this safe...?

It should be - we call xfs_iextract() before the xfs_ilock() call
shown above. That means the inode has been removed from the mount
list when we take the locks. Once the inode has been removed
from the mount list, the only possible current user is xfs_sync_inodes(),
and it will only be referencing the inode if it is currently working
on the inode. If it is working on the inode, then it will be holding
at least one of the inode locks.

Hence by the time we have the lock here in xfs_ireclaim we have guaranteed
that there are no other outstanding references and no new references
can occur. Therefore it should be safe to drop the lock before destroying
it.

There have been other bits of code in XFS where locks have been taken
just before item destroy. IIRC, one even had a comment explaining it
was safe to do this that was longer than just putting the unlock call
in the code. :/

FWIW, we call mrfree() on both the ilock and the iolock, but these are
#defined to null statements. If there is a destructor for the underlying
lock type, we probably should call that in mrfree() so the debugging code
can catch these probelms that only trigger in debug code.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-10  0:47 ` David Chinner
@ 2006-10-10  1:45   ` Timothy Shimmin
  2006-10-10  2:21     ` Eric Sandeen
  2006-10-10  2:25   ` Eric Sandeen
  1 sibling, 1 reply; 9+ messages in thread
From: Timothy Shimmin @ 2006-10-10  1:45 UTC (permalink / raw)
  To: David Chinner, Eric Sandeen; +Cc: xfs

--On 10 October 2006 10:47:26 AM +1000 David Chinner <dgc@sgi.com> wrote:

> On Mon, Oct 09, 2006 at 12:58:58PM -0500, Eric Sandeen wrote:
>> FC6 kernels are oopsing when lockdep & memory debugging are turned on,
>> looks like due to this code:
>>
>> xfs_ireclaim(xfs_inode_t *ip)
>> {
>>    ...
>>         /*
>>          * Here we do a spurious inode lock in order to coordinate with
>>          * xfs_sync().  This is because xfs_sync() references the inodes
>>          * in the mount list without taking references on the
>>          corresponding * vnodes.  We make that OK here by ensuring that
>>          we wait until * the inode is unlocked in xfs_sync() before we
>>          go ahead and * free it.  We get both the regular lock and the
>>          io lock because * the xfs_sync() code may need to drop the
>>          regular one but will * still hold the io lock.
>>          */
>>         xfs_ilock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL);
>> ...
>>         /*
>>          * Free all memory associated with the inode.
>>          */
>>         xfs_idestroy(ip);
>> }
>>
>> So, lock & free.  This frees memory that lockdep is still pointing to,
>> and tries to use later.
>>
>> Calling xfs_iunlock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL); just before
>> xfs_idestroy seems to solve it, but is this safe...?
>
> It should be - we call xfs_iextract() before the xfs_ilock() call
> shown above. That means the inode has been removed from the mount
> list when we take the locks. Once the inode has been removed
> from the mount list, the only possible current user is xfs_sync_inodes(),
> and it will only be referencing the inode if it is currently working
> on the inode. If it is working on the inode, then it will be holding
> at least one of the inode locks.
>
> Hence by the time we have the lock here in xfs_ireclaim we have guaranteed
> that there are no other outstanding references and no new references
> can occur. Therefore it should be safe to drop the lock before destroying
> it.
>
Yeah, there really seems like something would be wrong if you can't
unlock it before destroying it.
I would have thought you'd need to guarantee that you are the only
one with access to it before destroying it otherwise there'd be problems :)
(Which as you say we do)

This one rings a bell. I seem to recall multiple places where we destroy
without releasing the lock first.
And I vaguely remember Nathan mentioning that this was causing grief
for lockdep:)

--Tim

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-10  1:45   ` Timothy Shimmin
@ 2006-10-10  2:21     ` Eric Sandeen
  2006-10-10  4:55       ` Timothy Shimmin
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2006-10-10  2:21 UTC (permalink / raw)
  To: Timothy Shimmin; +Cc: David Chinner, xfs

Timothy Shimmin wrote:
>> Hence by the time we have the lock here in xfs_ireclaim we have 
>> guaranteed
>> that there are no other outstanding references and no new references
>> can occur. Therefore it should be safe to drop the lock before destroying
>> it.
>>
> Yeah, there really seems like something would be wrong if you can't
> unlock it before destroying it.

I thought so too; if anybody else might catch it post-unlock pre-free, they're 
in for a big surprise anyway :)

> I would have thought you'd need to guarantee that you are the only
> one with access to it before destroying it otherwise there'd be problems :)
> (Which as you say we do)

right.

> This one rings a bell. I seem to recall multiple places where we destroy
> without releasing the lock first.
> And I vaguely remember Nathan mentioning that this was causing grief
> for lockdep:)


Ok, cool.  Want a formal patch or you guys want to just free it up...

         /*
          * Free all memory associated with the inode.
          */
+       xfs_iunlock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL);
         xfs_idestroy(ip);

Thanks,

-Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-10  0:47 ` David Chinner
  2006-10-10  1:45   ` Timothy Shimmin
@ 2006-10-10  2:25   ` Eric Sandeen
  1 sibling, 0 replies; 9+ messages in thread
From: Eric Sandeen @ 2006-10-10  2:25 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs

David Chinner wrote:

> FWIW, we call mrfree() on both the ilock and the iolock, but these are
> #defined to null statements. If there is a destructor for the underlying
> lock type, we probably should call that in mrfree() so the debugging code
> can catch these probelms that only trigger in debug code.

 From a quick look I don't see those destructors, might be good to write some 
though.

It'd also be good to turn our spinlock_destroy(lock) into a 
WARN_ON(spin_is_locked(lock)) or something...

it'd be extra nice if lockdep could grok that the lock it's looking at is full 
of free poison, print a warning, take it off the list and move on... this was a 
bear to figure out which one was causing the problem.

I have a few changes to get more desructors called for some of the newer locks 
that are missing them (think agirotor_lock...) that I'll send when I get a moment.

-Eric

> Cheers,
> 
> Dave.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-09 23:36 ` Vlad Apostolov
@ 2006-10-10  3:06   ` Eric Sandeen
  2006-10-10  3:23     ` Vlad Apostolov
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2006-10-10  3:06 UTC (permalink / raw)
  To: Vlad Apostolov; +Cc: xfs

Vlad Apostolov wrote:

> Hi Eric,
> 
> Could you please provide some more information. What kernel, test case 
> and call stack at the time of the crash.
> 
> Thanks,
> Vlad

Vlad, for reference, this was a very recent FC6 test kernel, which has lockdep & 
slab debugging on.  Oopsing is as simple as mounting & umounting.  It dies in 
lockdep code.

See also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=209062

-Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-10  3:06   ` Eric Sandeen
@ 2006-10-10  3:23     ` Vlad Apostolov
  0 siblings, 0 replies; 9+ messages in thread
From: Vlad Apostolov @ 2006-10-10  3:23 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

Eric Sandeen wrote:
> Vlad Apostolov wrote:
>
>> Hi Eric,
>>
>> Could you please provide some more information. What kernel, test 
>> case and call stack at the time of the crash.
>>
>> Thanks,
>> Vlad
>
> Vlad, for reference, this was a very recent FC6 test kernel, which has 
> lockdep & slab debugging on.  Oopsing is as simple as mounting & 
> umounting.  It dies in lockdep code.
>
> See also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=209062
>
> -Eric
>
Thanks Eric,

I opened a pv and I see David C. replied to your email too.

Vlad

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: xfs vs. lockdep
  2006-10-10  2:21     ` Eric Sandeen
@ 2006-10-10  4:55       ` Timothy Shimmin
  0 siblings, 0 replies; 9+ messages in thread
From: Timothy Shimmin @ 2006-10-10  4:55 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: David Chinner, xfs

--On 9 October 2006 9:21:10 PM -0500 Eric Sandeen <sandeen@sandeen.net> 
wrote:
>> This one rings a bell. I seem to recall multiple places where we destroy
>> without releasing the lock first.
>> And I vaguely remember Nathan mentioning that this was causing grief
>> for lockdep:)
>
>
> Ok, cool.  Want a formal patch
Nope :)

> or you guys want to just free it up...
>
>          /*
>           * Free all memory associated with the inode.
>           */
> +       xfs_iunlock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL);
>          xfs_idestroy(ip);
>
> Thanks,
>
Thanks, I'll just check it in under Vlad's bug.

Cheers,
Tim.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2006-10-10  4:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-09 17:58 xfs vs. lockdep Eric Sandeen
2006-10-09 23:36 ` Vlad Apostolov
2006-10-10  3:06   ` Eric Sandeen
2006-10-10  3:23     ` Vlad Apostolov
2006-10-10  0:47 ` David Chinner
2006-10-10  1:45   ` Timothy Shimmin
2006-10-10  2:21     ` Eric Sandeen
2006-10-10  4:55       ` Timothy Shimmin
2006-10-10  2:25   ` Eric Sandeen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox