From: Liu Bo <liubo2009@cn.fujitsu.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Chris Samuel <chris@csamuel.org>, linux-btrfs@vger.kernel.org
Subject: Re: [3.2-rc7] slowdown, warning + oops creating lots of files
Date: Thu, 05 Jan 2012 14:11:31 -0500
Message-ID: <4F05F5E3.70600@cn.fujitsu.com>
In-Reply-To: <20120105022630.GD24466@dastard>
On 01/04/2012 09:26 PM, Dave Chinner wrote:
> On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote:
>> On 01/04/2012 06:01 PM, Dave Chinner wrote:
>>> On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
>>>> On 05/01/12 09:11, Dave Chinner wrote:
>>>>
>>>>> Looks to be reproducable.
>>>> Does this happen with rc6?
>>> I haven't tried. All I'm doing is running some benchmarks to get
>>> numbers for a talk I'm giving about improvements in XFS metadata
>>> scalability, so I wanted to update my last set of numbers from
>>> 2.6.39.
>>>
>>> As it was, these benchmarks also failed on btrfs with oopsen and
>>> corruptions back in the 2.6.39 time frame, e.g. same VM, same
>>> test, different crashes, similar slowdowns as reported here:
>>> http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
>>>
>>> Given that there is now a history of this simple test uncovering
>>> problems, perhaps this is a test that should be run more regularly
>>> by btrfs developers?
>>>
>>>> If not then it might be easy to track down as there are only
>>>> 2 modifications between rc6 and rc7..
>>> They don't look like they'd be responsible for an extent tree
>>> corruption, and I don't really have the time to do an open-ended
>>> bisect to find where the problem arose.
>>>
>>> As it is, 3rd attempt failed at 22m inodes, without the warning this
>>> time:
>
> .....
>
>>> It's hard to tell exactly what path gets to that BUG_ON(); so much
>>> code is inlined by the compiler into run_clustered_refs() that I
>>> can't tell how it reached the BUG_ON() triggered in
>>> alloc_reserved_tree_block().
>>>
>> This seems to be an oops caused by ENOSPC.
>
> At the time of the oops, this is the space used on the filesystem:
>
> $ df -h /mnt/scratch
> Filesystem Size Used Avail Use% Mounted on
> /dev/vdc 17T 31G 17T 1% /mnt/scratch
>
> It's less than 0.2% full, so I think ENOSPC can be ruled out here.
>
This bug is in our block reservation allocator, not the real disk space accounting.
Can you try the patch below and see what happens?
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index b1c8732..5a7f918 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3978,8 +3978,8 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
 		    csum_size * 2;
 	num_bytes += div64_u64(data_used + meta_used, 50);
 
-	if (num_bytes * 3 > meta_used)
-		num_bytes = div64_u64(meta_used, 3);
+	if (num_bytes * 2 > meta_used)
+		num_bytes = div64_u64(meta_used, 2);
 
 	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
 }
> I have noticed one thing, however, in that there are significant
> numbers of reads coming from disk when the slowdowns and oops occur.
> When everything runs fast, there are virtually no reads occurring at
> all. It looks to me like the working set of metadata is being
> kicked out of memory, only to be read back in again a short while
> later. Maybe that is a contributing factor.
>
> BTW, there is a lot of CPU time being spent on the tree locks. perf
> shows this as the top 2 CPU consumers:
>
> - 9.49% [kernel] [k] __write_lock_failed
> - __write_lock_failed
> - 99.80% _raw_write_lock
> - 79.35% btrfs_try_tree_write_lock
> 99.99% btrfs_search_slot
> - 20.63% btrfs_tree_lock
> 89.19% btrfs_search_slot
> 10.54% btrfs_lock_root_node
> btrfs_search_slot
> - 9.25% [kernel] [k] _raw_spin_unlock_irqrestore
> - _raw_spin_unlock_irqrestore
> - 55.87% __wake_up
> + 93.89% btrfs_clear_lock_blocking_rw
> + 3.46% btrfs_tree_read_unlock_blocking
> + 2.35% btrfs_tree_unlock
>
Hmm, the new extent_buffer locking scheme written by Chris is aimed at
avoiding exactly these cases; maybe he can provide some advice.
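For anyone reading the profile cold: btrfs_search_slot() first attempts a
non-blocking write lock on a tree block and only falls back to a blocking
lock when that fails, so heavy contention shows up as time spent in
__write_lock_failed. Below is a rough user-space analog of that pattern
(pthreads; a sketch of the shape only, not the kernel implementation,
which uses a custom scheme mixing a spinning rwlock with blocking states):

/*
 * User-space analog of the try-then-block locking pattern behind the
 * perf profile above (btrfs_try_tree_write_lock falling back to
 * btrfs_tree_lock).  Hypothetical names; not kernel code.
 */
#include <pthread.h>

struct tree_node {
	pthread_rwlock_t lock;
	/* ... btree payload ... */
};

/* Fast path: try to take the write lock without sleeping. */
static int try_tree_write_lock(struct tree_node *node)
{
	return pthread_rwlock_trywrlock(&node->lock) == 0;
}

/* Slow path: block until the write lock becomes available. */
static void tree_write_lock(struct tree_node *node)
{
	pthread_rwlock_wrlock(&node->lock);
}

void lock_node_for_update(struct tree_node *node)
{
	if (try_tree_write_lock(node))
		return;	/* uncontended: cheap fast path */
	/*
	 * Contended: with many concurrent writers hammering the same
	 * upper-level tree blocks, most callers land here, which is
	 * what the __write_lock_failed samples above correspond to.
	 */
	tree_write_lock(node);
}

The unlock side (pthread_rwlock_unlock) is omitted for brevity.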
thanks,
liubo
> Cheers,
>
> Dave.