Linux EXT4 FS development
From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Alex Zhuravlev <azhuravlev@whamcloud.com>
Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: [RFC] improve malloc for large filesystems
Date: Wed, 20 Nov 2019 13:13:53 -0500
Message-ID: <20191120181353.GG4262@mit.edu>
In-Reply-To: <8738E8FF-820F-48A5-9150-7FF64219ED42@whamcloud.com>

Hi Alex,

A couple of comments.  First, please split this patch so that these
two pieces of functionality can be reviewed and tested
separately:

> 1) mballoc tries too hard to find the best chunk which is
>  counterproductive - it makes sense to limit this process

> 2) during scanning the bitmaps are loaded one by one, synchronously
>  - it makes sense to prefetch few groups at once

As far as the prefetch is concerned, please note that the bitmap is first
read into the buffer cache via ext4_read_block_bitmap_nowait(), but then it
needs to be copied into the buddy bitmap pages, where it is cached
alongside the buddy bitmap.  (The copy in the buddy bitmap pages is a
combination of the on-disk block allocation bitmap plus any outstanding
preallocations.)  From that copy of the block bitmap, we then generate the
buddy bitmap and, as a side effect, initialize the statistics
(grp->bb_first_free, grp->bb_largest_free_order, grp->bb_counters[]).
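
To make the "side effect" concrete, here is a minimal user-space sketch
of how those three statistics can be derived from a block bitmap.  It is
a simplified model, not the kernel code: the struct, the function names
and the 64-block group size are all made up for illustration, a set bit
means "in use", and preallocations are ignored.  Each free extent is
split into naturally aligned power-of-two chunks and counted per order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_GROUP_BLOCKS 64   /* hypothetical tiny group, for illustration */
#define MODEL_MAX_ORDER    6    /* 2^6 == MODEL_GROUP_BLOCKS */

/* Simplified stand-in for the statistics kept per block group. */
struct model_group_stats {
	int first_free;                 /* first free block, or -1 if none */
	int largest_free_order;         /* order of largest free chunk, or -1 */
	unsigned free_blocks;           /* total number of free blocks */
	unsigned counters[MODEL_MAX_ORDER + 1]; /* free chunks per order */
};

/* A set bit in the bitmap means the block is in use. */
static int model_block_in_use(const uint8_t *bitmap, unsigned blk)
{
	return (bitmap[blk / 8] >> (blk % 8)) & 1;
}

/* Split the free extent [first, first + len) into aligned 2^order chunks. */
static void model_account_extent(struct model_group_stats *st,
				 unsigned first, unsigned len)
{
	while (len > 0) {
		int order = MODEL_MAX_ORDER;

		/* Largest order that is aligned at 'first' and fits in 'len'. */
		while (order > 0 &&
		       ((first & ((1u << order) - 1)) != 0 ||
			(1u << order) > len))
			order--;

		st->counters[order]++;
		if (order > st->largest_free_order)
			st->largest_free_order = order;
		first += 1u << order;
		len -= 1u << order;
	}
}

/* Scan the bitmap once and fill in all statistics. */
static void model_generate_stats(const uint8_t *bitmap,
				 struct model_group_stats *st)
{
	unsigned blk = 0;

	assert(bitmap != NULL && st != NULL);
	memset(st, 0, sizeof(*st));
	st->first_free = -1;
	st->largest_free_order = -1;

	while (blk < MODEL_GROUP_BLOCKS) {
		unsigned start;

		if (model_block_in_use(bitmap, blk)) {
			blk++;
			continue;
		}
		start = blk;
		while (blk < MODEL_GROUP_BLOCKS &&
		       !model_block_in_use(bitmap, blk))
			blk++;

		if (st->first_free < 0)
			st->first_free = (int)start;
		st->free_blocks += blk - start;
		model_account_extent(st, start, blk - start);
	}
}
```

For example, with only blocks 0-3 in use, the free extent 4..63 is
counted as one chunk each of order 2, 3, 4 and 5, so the largest free
order is 5 and the first free block is 4.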

It is these statistics that we need in order to make allocation
decisions for a particular block group.  So perhaps we should drive
the readahead of the bitmaps from ext4_mb_init_group() /
ext4_mb_init_cache(), and make sure that we actually initialize the
ext4_group_info structure, and not just read the bitmap into the buffer
cache and hope it gets used before memory pressure pushes it out of
the buffer cache.
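
As a rough illustration of the difference, the following user-space
sketch prefetches a window of groups and derives the per-group summary
immediately, so that the summary stays usable even if the raw bitmap is
later evicted.  Everything here is hypothetical (the sketch_* names, the
callback, the fake bitmap reader, the 64-block group), the I/O is
synchronous, and the only statistic kept is a free-block count; it is
meant to show the shape of the idea, not how the kernel would do it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BITMAP_BYTES 8   /* hypothetical 64-block group */

/* Stand-in for the in-memory per-group summary. */
struct sketch_group {
	bool     initialized;   /* true once free_blocks is valid */
	unsigned free_blocks;
};

/* Hypothetical bitmap source: fills 'buf' with the bitmap of 'group'. */
typedef void (*sketch_read_bitmap_fn)(unsigned group, uint8_t *buf);

/* Demo reader: group g has its first g * 8 blocks in use. */
static void sketch_fake_read_bitmap(unsigned group, uint8_t *buf)
{
	for (unsigned i = 0; i < SKETCH_BITMAP_BYTES; i++)
		buf[i] = (i < group) ? 0xFF : 0x00;
}

/* Count clear bits; a set bit means the block is in use. */
static unsigned sketch_count_free(const uint8_t *bitmap)
{
	unsigned nfree = 0;

	for (unsigned blk = 0; blk < SKETCH_BITMAP_BYTES * 8; blk++)
		if (!((bitmap[blk / 8] >> (blk % 8)) & 1))
			nfree++;
	return nfree;
}

/*
 * Prefetch up to 'nr' groups starting at 'start', wrapping around at
 * 'ngroups'.  Each group that is not yet initialized gets its bitmap
 * read and its summary derived right away.  Returns the number of
 * groups newly initialized by this call.
 */
static unsigned sketch_prefetch_groups(struct sketch_group *groups,
				       unsigned ngroups, unsigned start,
				       unsigned nr,
				       sketch_read_bitmap_fn read_bitmap)
{
	unsigned done = 0;

	assert(ngroups > 0);
	if (nr > ngroups)
		nr = ngroups;

	for (unsigned i = 0; i < nr; i++) {
		unsigned g = (start + i) % ngroups;
		uint8_t bitmap[SKETCH_BITMAP_BYTES];

		if (groups[g].initialized)
			continue;       /* summary already available */

		read_bitmap(g, bitmap);
		groups[g].free_blocks = sketch_count_free(bitmap);
		groups[g].initialized = true;
		done++;
	}
	return done;
}
```

Once a group is marked initialized, the allocator can decide whether to
skip it from the summary alone, without touching the bitmap again.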

Andreas has suggested going even further, and perhaps storing this
derived information from the allocation bitmaps someplace convenient
on disk.  This is an on-disk format change, so we would want to think
very carefully before going down that path, especially since, if we're
going to go this far, perhaps we should consider using an on-disk
b-tree to store the allocation information, which could be more
efficient than using allocation bitmaps plus buddy bitmaps.

Cheers,

							- Ted
