All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: "Andreas Grünbacher" <andreas.gruenbacher@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Ted Tso <tytso@mit.edu>,
	linux-ext4@vger.kernel.org, Laurent GUERBY <laurent@guerby.net>,
	Andreas Dilger <adilger@dilger.ca>
Subject: Re: [PATCH 1/6] mbcache2: Reimplement mbcache
Date: Tue, 22 Dec 2015 14:29:50 +0100	[thread overview]
Message-ID: <20151222132950.GC7266@quack.suse.cz> (raw)
In-Reply-To: <CAHpGcMLibLH5g7nJ4624Dd-bgLx_dp3FOVzmufOqPGihYDsGRQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1805 bytes --]

On Tue 22-12-15 14:16:28, Andreas Grünbacher wrote:
> 2015-12-22 14:07 GMT+01:00 Jan Kara <jack@suse.cz>:
> > On Tue 22-12-15 13:20:58, Andreas Grünbacher wrote:
> >> That test scenario probably isn't very realistic: xattrs are mostly
> >> initialized at or immediately after file create time; they rarely
> >> removed. Hash chains should shrink significantly for that scenario.
> >
> > Well, the refcount gets dropped when the file itself is deleted. And that
> > isn't that rare. So I agree my benchmark isn't completely realistic in
> > changing the xattr but dropping xattr block refcount due to deleted file
> > will be relatively frequent so I don't thing refcount distribution obtained
> > by my benchmark is completely out.
> 
> Good point.
> 
> >> In addition, if the hash table is sized reasonably, long hash chains
> >> won't hurt that much because we can stop searching them as soon as we
> >> find the first reusable block. This won't help when there are hash
> >> conflicts, but those should be unlikely.
> >
> > Well, this is what happens currently as well...
> 
> Yes, except that many "unshareable" xattr blocks can remain in the
> hash table where they can only be skipped over. Since you have already
> written the code for fixing that and that fix won't make things worse,
> I would like to see that go in.

So for reference I'm attaching the patch which stops caching xattr block
when it reaches max refcount. Frankly, I'm not in favor of merging this
patch unless we can show it really improves noticeably some realistic
scenario. Removal and re-adding of xattr block from / to cache costs some
CPU cycles as well so overall I'm not convinced it is a clear win and
although the patch is simple it still adds some complexity.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-mbcache2-Do-not-cache-blocks-with-maximum-refcount.patch --]
[-- Type: text/x-patch, Size: 3802 bytes --]

>From cf25c42595b5ff237ff4bc6c9968bb4e92181619 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 16 Dec 2015 13:29:46 +0100
Subject: [PATCH] mbcache2: Do not cache blocks with maximum refcount

Maximum number of inodes sharing xattr block is limited to 1024 to limit
amount of damage one corrupted / bad block can cause. Thus xattr blocks
that have reached this limit cannot be used for further deduplicating
and are just occupying space in cache needlessly. The worst thing is
that all these blocks with the same content have the same hash and thus
they all end up in the same hash chain making chain unnecessarily long.
In the extreme case of a single xattr block value we degrade the hash
table to a linked list.

Improve the situation by removing entries for xattr blocks with maximum
refcount from the cache and adding them back once refcount drops. This
can cause some unnecessary cache removal & additions in cases when xattr
block frequently transations between max refcount and max refcount - 1,
but overall this is a win.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/xattr.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index a80e5e2acadd..a8401a650221 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -567,6 +567,10 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
 				 EXT4_FREE_BLOCKS_FORGET);
 	} else {
 		le32_add_cpu(&BHDR(bh)->h_refcount, -1);
+		/* Entry with max refcount was not cached. Add it now. */
+		if (le32_to_cpu(BHDR(bh)->h_refcount) ==
+		    EXT4_XATTR_REFCOUNT_MAX - 1)
+			ext4_xattr_cache_insert(EXT4_GET_MB_CACHE(inode), bh);
 		/*
 		 * Beware of this ugliness: Releasing of xattr block references
 		 * from different inodes can race and so we have to protect
@@ -865,6 +869,8 @@ inserted:
 	if (!IS_LAST_ENTRY(s->first)) {
 		new_bh = ext4_xattr_cache_find(inode, header(s->base), &ce);
 		if (new_bh) {
+			bool delete = false;
+
 			/* We found an identical block in the cache. */
 			if (new_bh == bs->bh)
 				ea_bdebug(new_bh, "keeping");
@@ -907,6 +913,9 @@ inserted:
 					goto inserted;
 				}
 				le32_add_cpu(&BHDR(new_bh)->h_refcount, 1);
+				if (le32_to_cpu(BHDR(new_bh)->h_refcount) ==
+				    EXT4_XATTR_REFCOUNT_MAX)
+					delete = true;
 				ea_bdebug(new_bh, "reusing; refcount now=%d",
 					le32_to_cpu(BHDR(new_bh)->h_refcount));
 				unlock_buffer(new_bh);
@@ -916,8 +925,16 @@ inserted:
 				if (error)
 					goto cleanup_dquot;
 			}
-			mb2_cache_entry_touch(ext4_mb_cache, ce);
-			mb2_cache_entry_put(ext4_mb_cache, ce);
+			/*
+			 * Delete entry from cache in case it reached max
+			 * refcount and thus cannot be used anymore
+			 */
+			if (unlikely(delete)) {
+				mb2_cache_entry_delete(ext4_mb_cache, ce);
+			} else {
+				mb2_cache_entry_touch(ext4_mb_cache, ce);
+				mb2_cache_entry_put(ext4_mb_cache, ce);
+			}
 			ce = NULL;
 		} else if (bs->bh && s->base == bs->bh->b_data) {
 			/* We were modifying this block in-place. */
@@ -1538,7 +1555,8 @@ cleanup:
  * ext4_xattr_cache_insert()
  *
  * Create a new entry in the extended attribute cache, and insert
- * it unless such an entry is already in the cache.
+ * it unless such an entry is already in the cache or it has too many
+ * references.
  *
  * Returns 0, or a negative error number on failure.
  */
@@ -1549,6 +1567,10 @@ ext4_xattr_cache_insert(struct mb2_cache *ext4_mb_cache, struct buffer_head *bh)
 	struct mb2_cache_entry *ce;
 	int error;
 
+	/* Don't cache entry with too many references as it cannot be used */
+	if (le32_to_cpu(BHDR(bh)->h_refcount) >= EXT4_XATTR_REFCOUNT_MAX)
+		return;
+
 	ce = mb2_cache_entry_create(ext4_mb_cache, GFP_NOFS, hash,
 				    bh->b_blocknr);
 	if (IS_ERR(ce)) {
-- 
1.7.12.4


  reply	other threads:[~2015-12-22 13:29 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-09 17:57 [PATCH 0/6] ext[24]: MBCache rewrite Jan Kara
2015-12-09 17:57 ` [PATCH 1/6] mbcache2: Reimplement mbcache Jan Kara
2015-12-11 23:58   ` Andreas Grünbacher
2015-12-15 11:08     ` Jan Kara
2015-12-16 15:52       ` Jan Kara
2015-12-22 12:20         ` Andreas Grünbacher
2015-12-22 13:07           ` Jan Kara
2015-12-22 13:16             ` Andreas Grünbacher
2015-12-22 13:29               ` Jan Kara [this message]
2015-12-09 17:57 ` [PATCH 2/6] ext4: Convert to mbcache2 Jan Kara
2015-12-09 17:57 ` [PATCH 3/6] ext2: " Jan Kara
2015-12-09 17:57 ` [PATCH 4/6] mbcache: Remove Jan Kara
2015-12-09 17:57 ` [PATCH 5/6] mbcache2: Limit cache size Jan Kara
2015-12-09 17:57 ` [PATCH 6/6] mbcache2: Use referenced bit instead of LRU Jan Kara
2015-12-11 23:58   ` Andreas Grünbacher
2015-12-14 10:34     ` Jan Kara
2015-12-11 23:57 ` [PATCH 0/6] ext[24]: MBCache rewrite Andreas Grünbacher
2015-12-14 21:14   ` Jan Kara
2015-12-14 22:47     ` Andreas Grünbacher
2016-02-19 21:33 ` Theodore Ts'o
2016-02-22  7:56   ` Jan Kara
2016-02-22 17:17     ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151222132950.GC7266@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger@dilger.ca \
    --cc=andreas.gruenbacher@gmail.com \
    --cc=laurent@guerby.net \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.