All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hans Reiser <reiser@namesys.com>
To: Jeffrey Mahoney <jeffm@suse.com>
Cc: "Mike Benoit" <ipso@snappymail.ca>, "Jan Kara" <jack@suse.cz>,
	"Jure Pečar" <pegasus@nerv.eu.org>,
	reiserfs-list@namesys.com
Subject: Re: ReiserFS v3 choking when free space falls below 10%?
Date: Tue, 11 Jul 2006 22:42:24 -0700	[thread overview]
Message-ID: <44B48BC0.2000608@namesys.com> (raw)
In-Reply-To: <44B44856.5070409@suse.com>

You make this way too complicated because you are trying to be way too
perfect.  If you scan 3 bitmap blocks and find nothing, stop trying to
size match.

Hans

Jeffrey Mahoney wrote:

> Jeff Mahoney wrote:
>
> >Hans Reiser wrote:
>
> >>>Guys, if you run the kernel under a debugger, and get it to where you
> >>>see the excessive CPU usage, and then start stepping through the bitmap
> >>>code, I am sure it will be very obvious what the error is.  Can anyone
> >>>do that for us?  Jeff?
>
> >Apologies to everyone CC'd who've already seen this message. It was
> >bounced from the namesys servers and I wanted to preserve the CC list.
>
> >***
>
> >Mike sent me a copy of the metadata and I am now able to reproduce
> >locally. My profiling looks like this:
>
> >samples  %      image name      app name        symbol name
> >148596  17.8573 reiserfs.ko     reiserfs        reiserfs_in_journal
> >58194   6.9934  reiserfs.ko     reiserfs        search_by_key
> >38937   4.6792  vmlinux         vmlinux         memmove
> >38783   4.6607  reiserfs.ko     reiserfs        scan_bitmap_block
> >38466   4.6226  jbd             jbd             (no symbols)
> >23249   2.7939  vmlinux         vmlinux         __find_get_block
> >18196   2.1867  vmlinux         vmlinux         tty_write
> >17734   2.1312  vmlinux         vmlinux         do_ioctl
> >17293   2.0782  loop            loop            (no symbols)
> >15400   1.8507  vmlinux         vmlinux         cond_resched_lock
> >14836   1.7829  vmlinux         vmlinux         copy_user_generic_c
> >14143   1.6996  reiserfs.ko     reiserfs        do_journal_end
> >13638   1.6389  vmlinux         vmlinux         find_next_zero_bit
> >13236   1.5906  vmlinux         vmlinux         default_llseek
> >12925   1.5532  vmlinux         vmlinux         bit_waitqueue
> >8921    1.0721  vmlinux         vmlinux         __delay
>
>
> >Hans -
>
> >My speculation about the bitmaps being fragmented was right on. I wrote
> >a quick little script to parse the output of debugreiserfs -m and report
> >on the frequency of different window sizes. Windows of 1-31 blocks are
> >extremely common, accounting for 99.8% of all free windows. The problem
> >is that in my testing, where I made the allocator report the size of
> >allocation requests, the most common request was for a window of 32
> blocks.
>
> >What's happening is that we keep finding windows that are too small,
> >which results in a lot of wasted effort. The cycle goes like this:
>
> >if (unfm && is_block_in_journal(s, bmap_n, *beg, beg))
> >        continue;
> >/* first zero bit found; we check next bits */
> >for (end = *beg + 1;; end++) {
> >        if (end >= *beg + max || end >= boundary
> >            || reiserfs_test_le_bit(end, bi->bh->b_data)) {
> >                next = end;
> >                break;
> >        }
> >        /* finding the other end of zero bit window requires
> >         * looking into journal structures (in
> >         * case of searching for free blocks for unformatted nodes) */
> >        if (unfm && is_block_in_journal(s, bmap_n, end, &next))
> >                break;
> >}
>
> >If the window is too small, we end up looping up to the top and try to
> >find another one. Since the overwhelming majority of the windows are too
> >small, we go through just about all the bitmaps without backing off the
> >window size.
>
>
> To be clear, eventually the allocations are honored, but only after
> *all* of the bitmaps are searched. On the third pass, we drop the window
> to a single block and restart the scan, eventually building a 32-block
> set that is probably quite fragmented. This occurs on every write, hence
> the huge performance hit.
>
> It appears as though ext3 doesn't have this problem because they don't
> batch writes the way reiserfs does. They'll start a search at a decent
> hint the same way we do, but the window is always one block.
>
> So, we're stuck between a rock and a hard place. We can have the better
> allocation performance at lower usage and sacrifice performance later or
> we can have stable allocation performance at an overall reduction in
> performance.
>
> I have an idea that may get around both problems, but I'm not sure how
> well it would be received. We currently do some very basic caching of
> bitmap metadata such as the first zero bit and how many free blocks
> there are. What if we constructed an extent map of the free windows in
> each bitmaps when we cache the metadata and adjust the map when we
>
> There's a third option, but I'm not sure how well it would be received
>
> Right now, the allocator keeps track of things like how full a bitmap is
> and where the first zero bit is. It would also be possible to cache a
> list of windows in each bitmap to accelerate performance. This would
> have to be a shrinkable cache, since the pathlogical case could mean
> occupying
>
> -Jeff
>
> --
> Jeff Mahoney
> SUSE Labs


  reply	other threads:[~2006-07-12  5:42 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-29 17:41 ReiserFS v3 choking when free space falls below 10%? Mike Benoit
2006-06-29 19:12 ` Vladimir V. Saveliev
2006-06-29 20:15   ` Mike Benoit
2006-06-29 20:22     ` Vladimir V. Saveliev
2006-06-29 21:01       ` Mike Benoit
2006-06-29 20:36     ` Nate Diller
2006-06-30 16:33     ` Hans Reiser
2006-06-30 16:47       ` Jeff Mahoney
2006-06-30 17:04         ` Hans Reiser
2006-06-30 17:46           ` Mike Benoit
2006-06-30 18:18             ` Hans Reiser
2006-07-05  0:37         ` Mike Benoit
2006-07-05  2:37           ` Hans Reiser
2006-07-05 14:42             ` Tom Vier
2006-07-05 19:12             ` Jeff Mahoney
     [not found]             ` <20060706125856.fdac1d16.pegasus@nerv.eu.org>
2006-07-06 15:43               ` Mike Benoit
2006-07-06 16:01                 ` Jonathan Briggs
2006-07-06 17:26                 ` Toby Thain
2006-07-06 17:26                   ` Toby Thain
2006-07-06 18:02                 ` Jeff Mahoney
2006-07-06 18:12                   ` Hans Reiser
2006-07-06 18:19                     ` Jeff Mahoney
2006-07-06 18:47                       ` Mike Benoit
2006-07-06 19:17                         ` Hans Reiser
2006-07-06 18:27                   ` Mike Benoit
2006-07-06 18:39                     ` Jeff Mahoney
2006-07-07  7:29                       ` Mike Benoit
2006-07-07 17:49                         ` Jan Kara
2006-07-07 17:50                           ` Jeff Mahoney
2006-07-07 18:07                             ` Jan Kara
2006-07-07 18:08                               ` Jeff Mahoney
2006-07-07 19:05                               ` Hans Reiser
2006-07-07 19:18                                 ` Jan Kara
2006-07-07 19:38                                   ` Hans Reiser
2006-07-07 20:18                           ` Mike Benoit
2006-07-07 21:04                           ` Mike Benoit
2006-07-07 21:20                             ` Hans Reiser
2006-07-08 18:45                               ` Jeff Mahoney
2006-07-09  0:01                                 ` Hans Reiser
2006-07-09  0:02                                 ` Hans Reiser
2006-07-12  0:54                                 ` Jeffrey Mahoney
2006-07-12  5:42                                   ` Hans Reiser [this message]
2006-07-12  5:52                                     ` Jeffrey Mahoney
2006-07-12  8:18                                       ` Hans Reiser
2006-07-12 16:06                                         ` Jeff Mahoney
2006-07-24 22:26 ` ReiserFS v3 choking when free space falls below 10% - FIXED Mike Benoit
2006-07-24 22:32   ` Jeff Mahoney
2006-07-26  0:10   ` David Masover
2006-07-26  2:25     ` Mike Benoit
2006-07-26 14:29     ` Hans Reiser

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=44B48BC0.2000608@namesys.com \
    --to=reiser@namesys.com \
    --cc=ipso@snappymail.ca \
    --cc=jack@suse.cz \
    --cc=jeffm@suse.com \
    --cc=pegasus@nerv.eu.org \
    --cc=reiserfs-list@namesys.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.