[PATCH] stable: restart busy extent search after node removal

public inbox for linux-xfs@vger.kernel.org

* [PATCH] stable: restart busy extent search after node removal
@ 2011-07-12 22:03 Eric Sandeen
  2011-07-13  0:12 ` Dave Chinner
  2011-07-13 13:50 ` Alex Elder
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Sandeen @ 2011-07-12 22:03 UTC
  To: xfs-oss

Sending this for review prior to stable submission...

A user on #xfs reported that a log replay was oopsing in
__rb_rotate_left() with a null pointer deref.

I traced this down to the fact that in xfs_alloc_busy_insert(),
we erased a node with rb_erase() when the new node overlapped,
but left it specified as the parent node for the new insertion.

So when we try to insert a new node with an erased node as
its parent, obviously things go very wrong.

Upstream,
97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
actually fixed this, but as part of a much larger change.  Here's
the relevant bit:

                * We also need to restart the busy extent search from the
                * tree root, because erasing the node can rearrange the
                * tree topology.
                */
               rb_erase(&busyp->rb_node, &pag->pagb_tree);
               busyp->length = 0;
               return false;

We can do essentially the same thing to older codebases by restarting
the search after the erase.

This should apply to .35 through .39, and was tested on .39
with the oopsing replay reproducer.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

Index: linux-2.6/fs/xfs/xfs_alloc.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_alloc.c
+++ linux-2.6/fs/xfs/xfs_alloc.c
@@ -2664,6 +2664,12 @@ restart:
 					new->bno + new->length) -
 				min(busyp->bno, new->bno);
 		new->bno = min(busyp->bno, new->bno);
+		/*
+		 * Start the search over from the tree root, because
+		 * erasing the node can rearrange the tree topology.
+		 */
+		spin_unlock(&pag->pagb_lock);
+		goto restart;
 	} else
 		busyp = NULL;
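
For readers without the kernel tree at hand, the restart-after-erase idea
can be sketched in userspace. This is only an illustration under stated
assumptions: it uses a plain unbalanced binary search tree instead of the
kernel rbtree, and struct ext, bst_unlink(), busy_insert() and
count_extents() are hypothetical names, not the functions in xfs_alloc.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a busy extent; not the kernel structure. */
struct ext {
	unsigned long bno, len;
	struct ext *left, *right;
};

/* Detach the node at *link (plain BST delete, no rebalancing). */
static struct ext *bst_unlink(struct ext **link)
{
	struct ext *n = *link;

	if (!n->left) {
		*link = n->right;
	} else if (!n->right) {
		*link = n->left;
	} else {
		/* Two children: splice in the in-order successor. */
		struct ext **s = &n->right;
		struct ext *succ;

		while ((*s)->left)
			s = &(*s)->left;
		succ = *s;
		*s = succ->right;
		succ->left = n->left;
		succ->right = n->right;
		*link = succ;
	}
	n->left = n->right = NULL;
	return n;
}

/* Insert [bno, bno + len), absorbing every extent it overlaps. */
static int busy_insert(struct ext **root, unsigned long bno, unsigned long len)
{
	struct ext **link, *n;

	assert(len > 0);
restart:
	link = root;
	while (*link) {
		struct ext *cur = *link;

		if (bno + len <= cur->bno) {
			link = &cur->left;
		} else if (cur->bno + cur->len <= bno) {
			link = &cur->right;
		} else {
			/* Overlap: grow the new range over cur, then drop cur. */
			unsigned long end = bno + len;

			if (cur->bno + cur->len > end)
				end = cur->bno + cur->len;
			if (cur->bno < bno)
				bno = cur->bno;
			len = end - bno;
			free(bst_unlink(link));
			/* The tree changed shape, so search again from the root. */
			goto restart;
		}
	}
	n = calloc(1, sizeof(*n));
	if (!n)
		return -1;
	n->bno = bno;
	n->len = len;
	*link = n;	/* link was found after the last removal */
	return 0;
}

/* Count the extents currently in the tree. */
static int count_extents(const struct ext *n)
{
	return n ? 1 + count_extents(n->left) + count_extents(n->right) : 0;
}
```

The point of the sketch is only the control flow: no insertion position
found before a removal is reused afterwards.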

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: [PATCH] stable: restart busy extent search after node removal
  2011-07-12 22:03 [PATCH] stable: restart busy extent search after node removal Eric Sandeen
@ 2011-07-13  0:12 ` Dave Chinner
  2011-07-13  0:14   ` Eric Sandeen
  2011-07-13 13:50 ` Alex Elder
  1 sibling, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2011-07-13  0:12 UTC
  To: Eric Sandeen; +Cc: xfs-oss

On Tue, Jul 12, 2011 at 05:03:38PM -0500, Eric Sandeen wrote:
> Sending this for review prior to stable submission...
> 
> A user on #xfs reported that a log replay was oopsing in
> __rb_rotate_left() with a null pointer deref.
> 
> I traced this down to the fact that in xfs_alloc_busy_insert(),
> we erased a node with rb_erase() when the new node overlapped,
> but left it specified as the parent node for the new insertion.
> 
> So when we try to insert a new node with an erased node as
> its parent, obviously things go very wrong.
> 
> Upstream,
> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
> actually fixed this, but as part of a much larger change.  Here's
> the relevant bit:
> 
>                 * We also need to restart the busy extent search from the
>                 * tree root, because erasing the node can rearrange the
>                 * tree topology.
>                 */
>                rb_erase(&busyp->rb_node, &pag->pagb_tree);
>                busyp->length = 0;
>                return false;
> 
> We can do essentially the same thing to older codebases by restarting
> the search after the erase.
> 
> This should apply to .35 through .39, and was tested on .39
> with the oopsing replay reproducer.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> Index: linux-2.6/fs/xfs/xfs_alloc.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_alloc.c
> +++ linux-2.6/fs/xfs/xfs_alloc.c
> @@ -2664,6 +2664,12 @@ restart:
>  					new->bno + new->length) -
>  				min(busyp->bno, new->bno);
>  		new->bno = min(busyp->bno, new->bno);
> +		/*
> +		 * Start the search over from the tree root, because
> +		 * erasing the node can rearrange the tree topology.
> +		 */
> +		spin_unlock(&pag->pagb_lock);
> +		goto restart;
>  	} else
>  		busyp = NULL;

Looks good.

I'm guessing that the only case I was able to hit during testing of
this code originally was the "overlap with exact start block match",
otherwise I would have seen this. I'm not sure that there really is
much we can do to improve the test coverage of this code, though.
Hell, just measuring our test coverage so we know what we aren't
testing would probably be a good start. :/

Reviewed-by: Dave Chinner <dchinner@redhat.com>

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH] stable: restart busy extent search after node removal
  2011-07-13  0:12 ` Dave Chinner
@ 2011-07-13  0:14   ` Eric Sandeen
  2011-07-13  0:20     ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Sandeen @ 2011-07-13  0:14 UTC
  To: Dave Chinner; +Cc: xfs-oss

On 7/12/11 7:12 PM, Dave Chinner wrote:
> On Tue, Jul 12, 2011 at 05:03:38PM -0500, Eric Sandeen wrote:
>> Sending this for review prior to stable submission...
>>
>> A user on #xfs reported that a log replay was oopsing in
>> __rb_rotate_left() with a null pointer deref.
>>
>> I traced this down to the fact that in xfs_alloc_busy_insert(),
>> we erased a node with rb_erase() when the new node overlapped,
>> but left it specified as the parent node for the new insertion.
>>
>> So when we try to insert a new node with an erased node as
>> its parent, obviously things go very wrong.
>>
>> Upstream,
>> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
>> actually fixed this, but as part of a much larger change.  Here's
>> the relevant bit:
>>
>>                 * We also need to restart the busy extent search from the
>>                 * tree root, because erasing the node can rearrange the
>>                 * tree topology.
>>                 */
>>                rb_erase(&busyp->rb_node, &pag->pagb_tree);
>>                busyp->length = 0;
>>                return false;
>>
>> We can do essentially the same thing to older codebases by restarting
>> the search after the erase.
>>
>> This should apply to .35 through .39, and was tested on .39
>> with the oopsing replay reproducer.
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>> Index: linux-2.6/fs/xfs/xfs_alloc.c
>> ===================================================================
>> --- linux-2.6.orig/fs/xfs/xfs_alloc.c
>> +++ linux-2.6/fs/xfs/xfs_alloc.c
>> @@ -2664,6 +2664,12 @@ restart:
>>  					new->bno + new->length) -
>>  				min(busyp->bno, new->bno);
>>  		new->bno = min(busyp->bno, new->bno);
>> +		/*
>> +		 * Start the search over from the tree root, because
>> +		 * erasing the node can rearrange the tree topology.
>> +		 */
>> +		spin_unlock(&pag->pagb_lock);
>> +		goto restart;
>>  	} else
>>  		busyp = NULL;
> 
> Looks good.
> 
> I'm guessing that the only case I was able to hit during testing of
> this code originally was the "overlap with exact start block match",
> otherwise I would have seen this. I'm not sure that there really is
> much we can do to improve the test coverage of this code, though.
> Hell, just measuring our test coverage so we know what we aren't
> testing would probably be a good start. :/

Apparently the original oops, and the subsequent replay oopses,
were on a filesystem VERY busy with torrents.

Might be a testcase ;)

> Reviewed-by: Dave Chinner <dchinner@redhat.com>

Thanks,
-Eric


> 
> Cheers,
> 
> Dave.


* Re: [PATCH] stable: restart busy extent search after node removal
  2011-07-13  0:14   ` Eric Sandeen
@ 2011-07-13  0:20     ` Dave Chinner
  2011-07-13  1:27       ` Eric Sandeen
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2011-07-13  0:20 UTC
  To: Eric Sandeen; +Cc: xfs-oss

On Tue, Jul 12, 2011 at 07:14:19PM -0500, Eric Sandeen wrote:
> On 7/12/11 7:12 PM, Dave Chinner wrote:
> > On Tue, Jul 12, 2011 at 05:03:38PM -0500, Eric Sandeen wrote:
> >> Sending this for review prior to stable submission...
> >>
> >> A user on #xfs reported that a log replay was oopsing in
> >> __rb_rotate_left() with a null pointer deref.
> >>
> >> I traced this down to the fact that in xfs_alloc_busy_insert(),
> >> we erased a node with rb_erase() when the new node overlapped,
> >> but left it specified as the parent node for the new insertion.
> >>
> >> So when we try to insert a new node with an erased node as
> >> its parent, obviously things go very wrong.
> >>
> >> Upstream,
> >> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
> >> actually fixed this, but as part of a much larger change.  Here's
> >> the relevant bit:
> >>
> >>                 * We also need to restart the busy extent search from the
> >>                 * tree root, because erasing the node can rearrange the
> >>                 * tree topology.
> >>                 */
> >>                rb_erase(&busyp->rb_node, &pag->pagb_tree);
> >>                busyp->length = 0;
> >>                return false;
> >>
> >> We can do essentially the same thing to older codebases by restarting
> >> the search after the erase.
> >>
> >> This should apply to .35 through .39, and was tested on .39
> >> with the oopsing replay reproducer.
> >>
> >> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >> ---
> >>
> >> Index: linux-2.6/fs/xfs/xfs_alloc.c
> >> ===================================================================
> >> --- linux-2.6.orig/fs/xfs/xfs_alloc.c
> >> +++ linux-2.6/fs/xfs/xfs_alloc.c
> >> @@ -2664,6 +2664,12 @@ restart:
> >>  					new->bno + new->length) -
> >>  				min(busyp->bno, new->bno);
> >>  		new->bno = min(busyp->bno, new->bno);
> >> +		/*
> >> +		 * Start the search over from the tree root, because
> >> +		 * erasing the node can rearrange the tree topology.
> >> +		 */
> >> +		spin_unlock(&pag->pagb_lock);
> >> +		goto restart;
> >>  	} else
> >>  		busyp = NULL;
> > 
> > Looks good.
> > 
> > I'm guessing that the only case I was able to hit during testing of
> > this code originally was the "overlap with exact start block match",
> > otherwise I would have seen this. I'm not sure that there really is
> > much we can do to improve the test coverage of this code, though.
> > Hell, just measuring our test coverage so we know what we aren't
> > testing would probably be a good start. :/
> 
> Apparently the original oops, and the subsequent replay oopses,
> were on a filesystem VERY busy with torrents.
> 
> Might be a testcase ;)

That just means large files. And fragmentation levels are
effectively dependent on whether the torrent client uses
preallocation or not. Just creating a set of large fragmented files
using preallocation, shutting the filesystem down in the middle
of it and then doing log replay might do the trick...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH] stable: restart busy extent search after node removal
  2011-07-13  0:20     ` Dave Chinner
@ 2011-07-13  1:27       ` Eric Sandeen
  2011-07-15 14:19         ` Alex Elder
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Sandeen @ 2011-07-13  1:27 UTC
  To: Dave Chinner; +Cc: Eric Sandeen, xfs-oss

On 7/12/11 7:20 PM, Dave Chinner wrote:
> On Tue, Jul 12, 2011 at 07:14:19PM -0500, Eric Sandeen wrote:
>> On 7/12/11 7:12 PM, Dave Chinner wrote:
>>> On Tue, Jul 12, 2011 at 05:03:38PM -0500, Eric Sandeen wrote:
>>>> Sending this for review prior to stable submission...
>>>>
>>>> A user on #xfs reported that a log replay was oopsing in
>>>> __rb_rotate_left() with a null pointer deref.
>>>>
>>>> I traced this down to the fact that in xfs_alloc_busy_insert(),
>>>> we erased a node with rb_erase() when the new node overlapped,
>>>> but left it specified as the parent node for the new insertion.
>>>>
>>>> So when we try to insert a new node with an erased node as
>>>> its parent, obviously things go very wrong.
>>>>
>>>> Upstream,
>>>> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
>>>> actually fixed this, but as part of a much larger change.  Here's
>>>> the relevant bit:
>>>>
>>>>                 * We also need to restart the busy extent search from the
>>>>                 * tree root, because erasing the node can rearrange the
>>>>                 * tree topology.
>>>>                 */
>>>>                rb_erase(&busyp->rb_node, &pag->pagb_tree);
>>>>                busyp->length = 0;
>>>>                return false;
>>>>
>>>> We can do essentially the same thing to older codebases by restarting
>>>> the search after the erase.
>>>>
>>>> This should apply to .35 through .39, and was tested on .39
>>>> with the oopsing replay reproducer.
>>>>
>>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>>> ---
>>>>
>>>> Index: linux-2.6/fs/xfs/xfs_alloc.c
>>>> ===================================================================
>>>> --- linux-2.6.orig/fs/xfs/xfs_alloc.c
>>>> +++ linux-2.6/fs/xfs/xfs_alloc.c
>>>> @@ -2664,6 +2664,12 @@ restart:
>>>>  					new->bno + new->length) -
>>>>  				min(busyp->bno, new->bno);
>>>>  		new->bno = min(busyp->bno, new->bno);
>>>> +		/*
>>>> +		 * Start the search over from the tree root, because
>>>> +		 * erasing the node can rearrange the tree topology.
>>>> +		 */
>>>> +		spin_unlock(&pag->pagb_lock);
>>>> +		goto restart;
>>>>  	} else
>>>>  		busyp = NULL;
>>>
>>> Looks good.
>>>
>>> I'm guessing that the only case I was able to hit during testing of
>>> this code originally was the "overlap with exact start block match",
>>> otherwise I would have seen this. I'm not sure that there really is
>>> much we can do to improve the test coverage of this code, though.
>>> Hell, just measuring our test coverage so we know what we aren't
>>> testing would probably be a good start. :/
>>
>> Apparently the original oops, and the subsequent replay oopses,
>> were on a filesystem VERY busy with torrents.
>>
>> Might be a testcase ;)
> 
> That just means large files. And fragmentation levels are
> effectively dependent on whether the torrent client uses
> preallocation or not. Just creating a set of large fragmented files
> using preallocation, shutting the filesystem down in the middle
> of it and then doing log replay might do the trick...

well yeah, my point was, it was in fact badly fragmented.

To quote my favorite meaningless xfs_db statistic,

actual 29700140, ideal 185230, fragmentation factor 99.38%

I guess that's "only" 160 extents per file.

But one of the 2.2G files had 44,000 extents, as an example.
I am guessing the client did not preallocate.  :)
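
The two figures above can be reproduced with a little arithmetic. This is
a sketch that assumes the factor is computed as (actual - ideal) / actual,
and that "ideal" is roughly one extent per file, which is the assumption
behind the "160 extents per file" estimate. The helper names are
hypothetical.

```c
#include <assert.h>

/* Fragmentation factor in percent, assuming (actual - ideal) / actual. */
static double frag_factor_pct(unsigned long long actual,
			      unsigned long long ideal)
{
	assert(actual > 0 && ideal <= actual);
	return (double)(actual - ideal) * 100.0 / (double)actual;
}

/* Average extents per file, treating "ideal" as roughly the file count. */
static double extents_per_file(unsigned long long actual,
			       unsigned long long ideal)
{
	assert(ideal > 0);
	return (double)actual / (double)ideal;
}
```

With actual = 29700140 and ideal = 185230 this gives about 99.38% and
about 160, matching the numbers quoted above.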

-Eric

> Cheers,
> 
> Dave.


* Re: [PATCH] stable: restart busy extent search after node removal
  2011-07-12 22:03 [PATCH] stable: restart busy extent search after node removal Eric Sandeen
  2011-07-13  0:12 ` Dave Chinner
@ 2011-07-13 13:50 ` Alex Elder
  1 sibling, 0 replies; 8+ messages in thread
From: Alex Elder @ 2011-07-13 13:50 UTC
  To: Eric Sandeen; +Cc: xfs-oss

On Tue, 2011-07-12 at 17:03 -0500, Eric Sandeen wrote:
> Sending this for review prior to stable submission...
> 
> A user on #xfs reported that a log replay was oopsing in
> __rb_rotate_left() with a null pointer deref.
> 
> I traced this down to the fact that in xfs_alloc_busy_insert(),
> we erased a node with rb_erase() when the new node overlapped,
> but left it specified as the parent node for the new insertion.
> 
> So when we try to insert a new node with an erased node as
> its parent, obviously things go very wrong.
> 
> Upstream,
> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
> actually fixed this, but as part of a much larger change.  Here's
> the relevant bit:
> 
>                 * We also need to restart the busy extent search from the
>                 * tree root, because erasing the node can rearrange the
>                 * tree topology.
>                 */
>                rb_erase(&busyp->rb_node, &pag->pagb_tree);
>                busyp->length = 0;
>                return false;
> 
> We can do essentially the same thing to older codebases by restarting
> the search after the erase.
> 
> This should apply to .35 through .39, and was tested on .39
> with the oopsing replay reproducer.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Looks good.

Reviewed-by: Alex Elder <aelder@sgi.com>



* Re: [PATCH] stable: restart busy extent search after node removal
  2011-07-13  1:27       ` Eric Sandeen
@ 2011-07-15 14:19         ` Alex Elder
  2011-07-16  1:20           ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Elder @ 2011-07-15 14:19 UTC
  To: Eric Sandeen; +Cc: Eric Sandeen, xfs-oss

On Tue, 2011-07-12 at 20:27 -0500, Eric Sandeen wrote:
> On 7/12/11 7:20 PM, Dave Chinner wrote:
> > On Tue, Jul 12, 2011 at 07:14:19PM -0500, Eric Sandeen wrote:
> >> On 7/12/11 7:12 PM, Dave Chinner wrote:
> >>> On Tue, Jul 12, 2011 at 05:03:38PM -0500, Eric Sandeen wrote:
> >>>> Sending this for review prior to stable submission...
> >>>>
> >>>> A user on #xfs reported that a log replay was oopsing in
> >>>> __rb_rotate_left() with a null pointer deref.
> >>>>
> >>>> I traced this down to the fact that in xfs_alloc_busy_insert(),
> >>>> we erased a node with rb_erase() when the new node overlapped,
> >>>> but left it specified as the parent node for the new insertion.
> >>>>
> >>>> So when we try to insert a new node with an erased node as
> >>>> its parent, obviously things go very wrong.
> >>>>
> >>>> Upstream,
> >>>> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
> >>>> actually fixed this, but as part of a much larger change.  Here's
> >>>> the relevant bit:
> >>>>
> >>>>                 * We also need to restart the busy extent search from the
> >>>>                 * tree root, because erasing the node can rearrange the
> >>>>                 * tree topology.
> >>>>                 */
> >>>>                rb_erase(&busyp->rb_node, &pag->pagb_tree);
> >>>>                busyp->length = 0;
> >>>>                return false;
> >>>>
> >>>> We can do essentially the same thing to older codebases by restarting
> >>>> the search after the erase.
> >>>>
> >>>> This should apply to .35 through .39, and was tested on .39
> >>>> with the oopsing replay reproducer.
> >>>>
> >>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >>>> ---
> >>>>
> >>>> Index: linux-2.6/fs/xfs/xfs_alloc.c
> >>>> ===================================================================
> >>>> --- linux-2.6.orig/fs/xfs/xfs_alloc.c
> >>>> +++ linux-2.6/fs/xfs/xfs_alloc.c
> >>>> @@ -2664,6 +2664,12 @@ restart:
> >>>>  					new->bno + new->length) -
> >>>>  				min(busyp->bno, new->bno);
> >>>>  		new->bno = min(busyp->bno, new->bno);
> >>>> +		/*
> >>>> +		 * Start the search over from the tree root, because
> >>>> +		 * erasing the node can rearrange the tree topology.
> >>>> +		 */
> >>>> +		spin_unlock(&pag->pagb_lock);
> >>>> +		goto restart;
> >>>>  	} else
> >>>>  		busyp = NULL;
> >>>
> >>> Looks good.
> >>>
> >>> I'm guessing that the only case I was able to hit during testing of
> >>> this code originally was the "overlap with exact start block match",
> >>> otherwise I would have seen this. I'm not sure that there really is
> >>> much we can do to improve the test coverage of this code, though.
> >>> Hell, just measuring our test coverage so we know what we aren't
> >>> testing would probably be a good start. :/
> >>
> >> Apparently the original oops, and the subsequent replay oopses,
> >> were on a filesystem VERY busy with torrents.
> >>
> >> Might be a testcase ;)

So, would you mind trying to create this as a test?
Can you come up with a reliable way to create a
small but *very* fragmented filesystem to do stuff
with?

Maybe a function to do that would be useful (sort
of like the one Allison Henderson did for creating
a full filesystem) for doing various tests, including
log replay, xfs_repair, and various operations while
"loaded" in that way.

					-Alex


> > 
> > That just means large files. And fragmentation levels are
> > effectively dependent on whether the torrent client uses
> > preallocation or not. Just creating a set of large fragmented files
> > using preallocation, shutting the filesystem down in the middle
> > of it and then doing log replay might do the trick...
> 
> well yeah, my point was, it was in fact badly fragmented.
> 
> To quote my favorite meaningless xfs_db statistic,
> 
> actual 29700140, ideal 185230, fragmentation factor 99.38%
> 
> I guess that's "only" 160 extents per file.
> 
> But one of the 2.2G files had 44,000 extents, as an example.
> I am guessing the client did not preallocate.  :)
> 
> -Eric
> 
> > Cheers,
> > 
> > Dave.
> 




* Re: [PATCH] stable: restart busy extent search after node removal
  2011-07-15 14:19         ` Alex Elder
@ 2011-07-16  1:20           ` Dave Chinner
  0 siblings, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2011-07-16  1:20 UTC
  To: Alex Elder; +Cc: Eric Sandeen, Eric Sandeen, xfs-oss

On Fri, Jul 15, 2011 at 09:19:02AM -0500, Alex Elder wrote:
> On Tue, 2011-07-12 at 20:27 -0500, Eric Sandeen wrote:
> > On 7/12/11 7:20 PM, Dave Chinner wrote:
> > > On Tue, Jul 12, 2011 at 07:14:19PM -0500, Eric Sandeen wrote:
> > >>> I'm guessing that the only case I was able to hit during testing of
> > >>> this code originally was the "overlap with exact start block match",
> > >>> otherwise I would have seen this. I'm not sure that there really is
> > >>> much we can do to improve the test coverage of this code, though.
> > >>> Hell, just measuring our test coverage so we know what we aren't
> > >>> testing would probably be a good start. :/
> > >>
> > >> Apparently the original oops, and the subsequent replay oopses,
> > >> were on a filesystem VERY busy with torrents.
> > >>
> > >> Might be a testcase ;)
> 
> So, would you mind trying to create this as a test?
> Can you come up with a reliable way to create a
> small but *very* fragmented filesystem to do stuff
> with?

See test 042 - it's not hard to do.

But 042 only uses a 48MB filesystem. To generate hundreds of
thousands of extents, it needs to be done on a filesystem that can
hold hundreds of thousands of blocks - gigabytes in size, IOWs.

What I'd like to do is basically fill the fs full of single block
files, delete every alternate one (fragments free space to stress
those btrees), then fill the fs again with a single preallocation on
a new file to convert the freespace fragments to a fragmented bmbt
index, then free the remaining single block files and fill the fs
again with a single preallocation on the same file that already
fills half the fs. Finally, unmount the filesystem, mount it again
and remove the extents back to the free space by iteratively
punching out sparse ranges of the large file until it is empty. e.g.
0-1MB, 10-11MB, .... 1000MB-1001MB, 1-2MB, 11-12MB, .....
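
The punch order in that last step can be written down as a small helper.
This is a sketch only: punch_offset_mb() is a hypothetical name, and it
assumes 1MB ranges, a fixed stride, and a file size that is a multiple of
the stride.

```c
#include <assert.h>

/*
 * Offset in MB of the n-th 1MB range to punch, for a file of total_mb
 * megabytes walked in passes with the given stride (hypothetical helper).
 * Pass 0 visits 0, stride, 2*stride, ...; pass 1 visits 1, stride+1, ...
 */
static unsigned long punch_offset_mb(unsigned long n, unsigned long total_mb,
				     unsigned long stride_mb)
{
	unsigned long per_pass, pass, idx;

	assert(stride_mb > 0 && total_mb % stride_mb == 0);
	assert(n < total_mb);

	per_pass = total_mb / stride_mb;	/* ranges punched per pass */
	pass = n / per_pass;			/* which pass we are in */
	idx = n % per_pass;			/* position within the pass */
	return idx * stride_mb + pass;
}
```

For a 1010MB file and a 10MB stride, steps 0, 1 and 100 give 0, 10 and
1000, and steps 101 and 102 give 1 and 11, which is the sequence listed
above. Each offset would then be handed to whatever hole punch interface
the test uses.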

That should be a deterministic test that does the same btree
operations from run to run and provide decent coverage of most of
the btree and extent tree operations - including loading a massive
bmap tree from disk into memory.

I'd also like to repeat the test, but this time doing a random
delete of half the files so the fragmented file is not made up
entirely of single block extents. That will perturb the way the
btrees grow and shrink and so will execute btree operations in
combinations that the above deterministic test won't. e.g. it will
trip bmbt split/merges causing freespace btree split/merges in the
one allocation/free operation that a deterministic test will
never hit...

We don't really have coverage of bmap extent trees with that number
of extents in them right now, and test 250 shows that we do really
need that coverage (it exercised a bug in a 2->3 level split, IIRC).
I'd also be inclined to use a 512 byte filesystem block size with
only 2 AGs to cause the height of both the freespace and bmap
btrees to increase much more quickly, too.

If we can, I'd like the test to range up to at least a million extents
in a bmap btree - that covers single unfragmented files into the
multi-PB range for 4k block size filesystems.
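
As a rough check of the multi-PB figure, here is a sketch that assumes a
single extent can cover at most 2^21 - 1 filesystem blocks. That limit is
an assumption of this example and is not stated in this thread; the names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-extent length limit in blocks; not taken from this thread. */
#define ASSUMED_MAX_EXTENT_BLOCKS ((1ULL << 21) - 1)

/* Largest number of bytes nr_extents maximally sized extents can map. */
static uint64_t max_mapped_bytes(uint64_t nr_extents, uint64_t block_size)
{
	assert(block_size > 0);
	/* Guard against overflowing the 64-bit result. */
	assert(nr_extents <=
	       UINT64_MAX / (ASSUMED_MAX_EXTENT_BLOCKS * block_size));
	return nr_extents * ASSUMED_MAX_EXTENT_BLOCKS * block_size;
}
```

Under that assumption, a million extents with 4096 byte blocks map about
8.59e15 bytes, i.e. roughly 8 PB.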

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

