From: Jan Kara <jack@suse.cz>
To: "Andreas Grünbacher" <andreas.gruenbacher@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Ted Tso <tytso@mit.edu>,
linux-ext4@vger.kernel.org, Laurent GUERBY <laurent@guerby.net>,
Andreas Dilger <adilger@dilger.ca>
Subject: Re: [PATCH 0/6] ext[24]: MBCache rewrite
Date: Mon, 14 Dec 2015 22:14:10 +0100 [thread overview]
Message-ID: <20151214211410.GP8474@quack.suse.cz> (raw)
In-Reply-To: <CAHpGcMKvyBqouqaq7zhCrxDCjW3vq_GE_wfoDyK00=ppPeYWzQ@mail.gmail.com>
Hi,
On Sat 12-12-15 00:57:01, Andreas Grünbacher wrote:
> 2015-12-09 18:57 GMT+01:00 Jan Kara <jack@suse.cz>:
> > Hello,
> >
> > inspired by recent reports [1] of problems with mbcache I had a look into what
> > we could to improve it. I found the current code rather overengineered
> > (counting with single entry being in several indices, having homegrown
> > implementation of rw semaphore, ...).
> >
> > After some thinking I've decided to just reimplement mbcache instead of
> > improving the original code in small steps since the fundamental changes in
> > locking and layout would be actually harder to review in small steps than in
> > one big chunk and overall the new mbcache is actually pretty simple piece of
> > code (~450 lines).
> >
> > The result of rewrite is smaller code (almost half the original size), smaller
> > cache entries (7 longs instead of 13), and better performance (see below
> > for details).
>
> I agree that mbcache has scalability problems worth fixing; you may
> also be right about replacing instead of fixing it.
>
> I would prefer an actual replacement over adding mbcache2 though: the
> two existing users will be converted immediately; there is no point in
> keeping the old version around. For that, can the current mbcache be
> converted to the API of the new one in a separate patch first (alloc +
> insert vs. create, get + release/free vs. delete_block)?
Well, conversion into the new API isn't that trivial because of the
lifetime / locking differences. If people prefer it, I can add a patch that
just renames everything from mb2 to mb after the old code is removed.
Opinions?
> The corner cases that mbcache has problems with are:
>
> (1) Many files with the same xattrs: Right now, an xattr block can be
> shared among at most EXT[24]_XATTR_REFCOUNT_MAX = 2^10 inodes. If 2^20
Do you know why there's this limit BTW? The on-disk format can support upto
2^32 references...
> inodes are cached, they will have at least 2^10 xattr blocks, all of
> which will end up in the same hash chain. An xattr block should be
> removed from the mbcache once it has reached its maximum refcount, but
> if I haven't overlooked something, this doesn't happen right now.
> Fixing that should be relatively easy.
Yeah, that sounds like a good optimization. I'll try that.
> (2) Very many files with unique xattrs. We might be able to come up
> with a reasonable heuristic or tweaking knob for detecting this case;
> if not, we could at least use a resizable hash table to keep the hash
> chains reasonably short.
So far we limit number of entries in the cache which keeps hash chains
short as well. Using resizable hash table and letting the system balance
number of cached entries just by shrinker is certainly possible however I'm
not sure whether the complexity is really worth it.
Regarding detection of unique xattrs: We could certainly detect trashing
of mbcache relatively easily. The difficult part if how to detect when to
enable it again because the workload can change. I'm thinking about some
backoff mechanism like caching only each k-th entry asked to be inserted
(starting with k = 1) and doubling k if we don't reach some low-watermark
cache hit ratio in some number of cache lookups, reducing k to half if
we reach high-watermark cache hit ratio. However again I'm not yet
convinced complex schemes like this are worth it for mbcache (although it
might be interesting research topic as such) and so I didn't try anything
like this for the initial posting.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2015-12-14 21:14 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-09 17:57 [PATCH 0/6] ext[24]: MBCache rewrite Jan Kara
2015-12-09 17:57 ` [PATCH 1/6] mbcache2: Reimplement mbcache Jan Kara
2015-12-11 23:58 ` Andreas Grünbacher
2015-12-15 11:08 ` Jan Kara
2015-12-16 15:52 ` Jan Kara
2015-12-22 12:20 ` Andreas Grünbacher
2015-12-22 13:07 ` Jan Kara
2015-12-22 13:16 ` Andreas Grünbacher
2015-12-22 13:29 ` Jan Kara
2015-12-09 17:57 ` [PATCH 2/6] ext4: Convert to mbcache2 Jan Kara
2015-12-09 17:57 ` [PATCH 3/6] ext2: " Jan Kara
2015-12-09 17:57 ` [PATCH 4/6] mbcache: Remove Jan Kara
2015-12-09 17:57 ` [PATCH 5/6] mbcache2: Limit cache size Jan Kara
2015-12-09 17:57 ` [PATCH 6/6] mbcache2: Use referenced bit instead of LRU Jan Kara
2015-12-11 23:58 ` Andreas Grünbacher
2015-12-14 10:34 ` Jan Kara
2015-12-11 23:57 ` [PATCH 0/6] ext[24]: MBCache rewrite Andreas Grünbacher
2015-12-14 21:14 ` Jan Kara [this message]
2015-12-14 22:47 ` Andreas Grünbacher
2016-02-19 21:33 ` Theodore Ts'o
2016-02-22 7:56 ` Jan Kara
2016-02-22 17:17 ` Theodore Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151214211410.GP8474@quack.suse.cz \
--to=jack@suse.cz \
--cc=adilger@dilger.ca \
--cc=andreas.gruenbacher@gmail.com \
--cc=laurent@guerby.net \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.