From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: linux-btrfs@vger.kernel.org
Cc: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Subject: [PATCH v6 19/19] btrfs: try more times to alloc metadata reserve space
Date: Fri, 5 Feb 2016 09:22:39 +0800
Message-ID: <1454635359-10013-20-git-send-email-quwenruo@cn.fujitsu.com>
In-Reply-To: <1454635359-10013-1-git-send-email-quwenruo@cn.fujitsu.com>
From: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
In btrfs_delalloc_reserve_metadata(), the number of metadata bytes we try
to reserve is calculated from the difference between outstanding_extents
and reserved_extents.
When reserve_metadata_bytes() fails to reserve the desired metadata space,
it has already done some reclaim work, such as writing ordered extents.
In that case, outstanding_extents and reserved_extents may have already
changed, and a second attempt may be able to reserve enough metadata space.
So this patch retries the reservation up to 3 times after an ENOSPC
failure, to make sure we have really run out of space.
Such false ENOSPC is mainly caused by small file extents combined with
time-consuming delalloc functions, so it mainly affects in-band
de-duplication. (Compression should also be affected, but LZO/zlib are
faster than SHA256, so it is harder to trigger there than with dedup.)
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
---
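Note for readers (not part of the commit): the retry described above can be
sketched in user-space C. This is a toy model under stated assumptions, not
the kernel code: struct toy_space, toy_reserve_once(), toy_reserve() and the
TOY_* constants are hypothetical names made up for illustration. The only
things it tries to mirror are that the needed byte count is recomputed on
every pass and that a failed attempt has a reclaim side effect. It also shows
that a "loops++ < 3" guard allows three retries, i.e. up to four attempts in
total.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_BYTES_PER_EXTENT 4096ULL	/* assumed cost per extent (made up) */
#define TOY_MAX_RETRIES 3		/* same bound as the patch */

/* Hypothetical stand-in for the space accounting; not a btrfs structure. */
struct toy_space {
	uint64_t free_bytes;	/* metadata space currently available */
	uint64_t reclaimable;	/* space one failed attempt gives back */
	uint64_t outstanding;	/* extents that need a reservation */
	uint64_t reserved;	/* extents already covered */
	unsigned attempts;	/* number of reservation attempts made */
};

/*
 * One reservation attempt. On failure it still "reclaims": some space is
 * returned and one outstanding extent completes, so the next attempt may
 * need fewer bytes.
 */
int toy_reserve_once(struct toy_space *s)
{
	uint64_t need;

	assert(s->outstanding >= s->reserved);
	need = (s->outstanding - s->reserved) * TOY_BYTES_PER_EXTENT;
	s->attempts++;

	if (need <= s->free_bytes) {
		s->free_bytes -= need;
		s->reserved = s->outstanding;
		return 0;
	}

	/* Side effect of a failed attempt. */
	s->free_bytes += s->reclaimable;
	if (s->outstanding > s->reserved)
		s->outstanding--;
	return -ENOSPC;
}

/* Bounded retry with the same shape as the check added by the patch. */
int toy_reserve(struct toy_space *s)
{
	int loops = 0;
	int ret;

again:
	ret = toy_reserve_once(s);	/* the need is recomputed on every pass */
	if (ret == -ENOSPC && loops++ < TOY_MAX_RETRIES)
		goto again;
	return ret;
}
```

In this model, a caller that starts with 4 outstanding extents, 4096 free
bytes and 4096 reclaimable bytes per failed attempt succeeds on the third
attempt, while a caller with nothing free and nothing reclaimable gets
-ENOSPC after the fourth.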
fs/btrfs/extent-tree.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2a17c88..c60e24a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5669,6 +5669,7 @@ int btrfs_delalloc_reserve_metadata(struct inode *inode, u64 num_bytes)
bool delalloc_lock = true;
u64 to_free = 0;
unsigned dropped;
+ int loops = 0;
/* If we are a free space inode we need to not flush since we will be in
* the middle of a transaction commit. We also don't need the delalloc
@@ -5684,11 +5685,12 @@ int btrfs_delalloc_reserve_metadata(struct inode *inode, u64 num_bytes)
btrfs_transaction_in_commit(root->fs_info))
schedule_timeout(1);
+ num_bytes = ALIGN(num_bytes, root->sectorsize);
+
+again:
if (delalloc_lock)
mutex_lock(&BTRFS_I(inode)->delalloc_mutex);
- num_bytes = ALIGN(num_bytes, root->sectorsize);
-
spin_lock(&BTRFS_I(inode)->lock);
nr_extents = (unsigned)div64_u64(num_bytes +
BTRFS_MAX_EXTENT_SIZE - 1,
@@ -5809,6 +5811,23 @@ out_fail:
}
if (delalloc_lock)
mutex_unlock(&BTRFS_I(inode)->delalloc_mutex);
+ /*
+ * The number of metadata bytes to reserve is calculated from the
+ * difference between outstanding_extents and reserved_extents. Even
+ * when reserve_metadata_bytes() fails to reserve the wanted bytes, it
+ * has already done some work to reclaim metadata space, so both
+ * outstanding_extents and reserved_extents may have changed, and the
+ * number of bytes we need to reserve may have changed too (it may be
+ * smaller). So try to reserve again here. This is especially useful
+ * for online dedup, which easily eats almost all metadata space.
+ *
+ * XXX: The limit of 3 is chosen arbitrarily. It is a good workaround
+ * for online dedup; later we should find a better method to avoid the
+ * dedup ENOSPC issue.
+ */
+ if (unlikely(ret == -ENOSPC && loops++ < 3))
+ goto again;
+
return ret;
}
--
2.7.0
Thread overview:
2016-02-05 1:22 [PATCH v6 00/19][For 4.6] Btrfs: Add inband (write time) de-duplication framework Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 01/19] btrfs: dedup: Introduce dedup framework and its header Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 02/19] btrfs: dedup: Introduce function to initialize dedup info Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 03/19] btrfs: dedup: Introduce function to add hash into in-memory tree Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 04/19] btrfs: dedup: Introduce function to remove hash from " Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 05/19] btrfs: delayed-ref: Add support for increasing data ref under spinlock Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 06/19] btrfs: dedup: Introduce function to search for an existing hash Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 07/19] btrfs: dedup: Implement btrfs_dedup_calc_hash interface Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 08/19] btrfs: ordered-extent: Add support for dedup Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 09/19] btrfs: dedup: Inband in-memory only de-duplication implement Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 10/19] btrfs: dedup: Add basic tree structure for on-disk dedup method Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 11/19] btrfs: dedup: Introduce interfaces to resume and cleanup dedup info Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 12/19] btrfs: dedup: Add support for on-disk hash search Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 13/19] btrfs: dedup: Add support to delete hash for on-disk backend Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 14/19] btrfs: dedup: Add support for adding " Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 15/19] btrfs: dedup: Add ioctl for inband deduplication Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 16/19] btrfs: dedup: add an inode nodedup flag Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 17/19] btrfs: dedup: add a property handler for online dedup Qu Wenruo
2016-02-05 1:22 ` [PATCH v6 18/19] btrfs: dedup: add per-file online dedup control Qu Wenruo
2016-02-05 1:22 ` Qu Wenruo [this message]