From: Bob Peterson <rpeterso@redhat.com>
To: Alexander Viro <aviro@redhat.com>, linux-fsdevel@vger.kernel.org
Subject: [PATCH][try5] fs: if block_map clears buffer_holesize bit skip hole size from b_size
Date: Thu, 11 Sep 2014 11:29:32 -0400 (EDT)
Message-ID: <1501522376.20961158.1410449372237.JavaMail.zimbra@redhat.com>
In-Reply-To: <877076335.17542169.1409875465535.JavaMail.zimbra@redhat.com>
Hi,
This version uses a new buffer flag, holesize, as Dave Chinner
suggested. It also incorporates a suggestion from Steve Whitehouse.

The problem:
If you do a fiemap operation on a very large sparse file, it can take
an extremely long time (we're talking days here) because
__generic_block_fiemap does a block-by-block search when it
encounters a hole.

The solution:
Allow the underlying file system to return the hole size so that
__generic_block_fiemap can quickly skip the hole.

Patch description:
This patch changes __generic_block_fiemap so that it sets a new
buffer_holesize bit before calling the file system's block_map
(get_block) function. The bit asks the file system to return a hole
size, if possible, when the requested block falls in a hole. If the
block_map function finds a hole and clears buffer_holesize, fiemap
takes the returned b_size to be the size of the hole, in bytes, skips
the hole, and continues at the first block after it. For a large hole
this may be repeated several times in a row, because the fs-specific
block_map function may be limited in how much of the hole it can
report per call. That is still much faster than trying each block
individually. If the block_map function does not clear
buffer_holesize, the request for a hole size has been ignored, and
fiemap falls back to today's method of searching block by block for
the next valid block.
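
To make the handshake above concrete, here is a small userspace sketch of it.
It is not kernel code and not part of the patch: struct toy_bh, struct
toy_file, toy_get_block() and toy_calls_to_first_extent() are hypothetical
names, and the 4 KiB block size, the single-extent file layout and the
per-call hole cap are assumptions made only for illustration. The guard that
forces at least one block of progress is also an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLKBITS 12	/* assumption: 4 KiB blocks */

/* Hypothetical stand-in for the buffer_head fields the patch relies on. */
struct toy_bh {
	bool mapped;		/* models buffer_mapped() */
	bool holesize;		/* models the proposed BH_Holesize bit */
	uint64_t b_size;	/* bytes: request size in, hole size out */
};

/* Toy sparse file: a single mapped extent preceded by a hole. */
struct toy_file {
	uint64_t data_blk;	/* first mapped block */
	uint64_t data_len;	/* number of mapped blocks */
	bool fs_reports_holes;	/* does this "file system" honor the flag? */
	uint64_t max_hole_blks;	/* most hole blocks reported per call */
};

/* Toy block_map: reports a hole size only for the hole before the extent. */
static void toy_get_block(const struct toy_file *f, uint64_t blk,
			  struct toy_bh *bh)
{
	uint64_t hole_blks;

	if (blk >= f->data_blk && blk < f->data_blk + f->data_len) {
		bh->mapped = true;
		return;
	}
	if (!f->fs_reports_holes || !bh->holesize || blk >= f->data_blk)
		return;		/* request ignored: the flag stays set */

	hole_blks = f->data_blk - blk;
	if (hole_blks > f->max_hole_blks)
		hole_blks = f->max_hole_blks;
	bh->b_size = hole_blks << TOY_BLKBITS;
	bh->holesize = false;	/* signals that b_size is a hole size */
}

/* Walk from start_blk and count the calls needed to find mapped data. */
static uint64_t toy_calls_to_first_extent(const struct toy_file *f,
					  uint64_t start_blk,
					  uint64_t end_blk,
					  uint64_t *found_blk)
{
	uint64_t calls = 0;

	while (start_blk < end_blk) {
		struct toy_bh bh = {
			.mapped = false,
			.holesize = true,	/* ask for a hole size */
			.b_size = (end_blk - start_blk) << TOY_BLKBITS,
		};
		uint64_t skip;

		toy_get_block(f, start_blk, &bh);
		calls++;
		if (bh.mapped)
			break;
		if (bh.holesize) {
			start_blk++;	/* ignored: block-by-block fallback */
			continue;
		}
		skip = bh.b_size >> TOY_BLKBITS;
		start_blk += skip ? skip : 1;	/* always make progress */
	}
	*found_blk = start_blk;
	return calls;
}
```

With one extent at block 1000 and a cap of 256 hole blocks per call, the
walker needs 5 calls when the toy file system reports hole sizes and 1001
calls when it ignores the request.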
Regards,
Bob Peterson
Red Hat File Systems
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/ioctl.c | 7 ++++++-
include/linux/buffer_head.h | 2 ++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 8ac3fad..ae63b1f 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -291,13 +291,18 @@ int __generic_block_fiemap(struct inode *inode,
 		memset(&map_bh, 0, sizeof(struct buffer_head));
 		map_bh.b_size = len;
+		set_buffer_holesize(&map_bh); /* return hole size if able */
 		ret = get_block(inode, start_blk, &map_bh, 0);
 		if (ret)
 			break;
 		/* HOLE */
 		if (!buffer_mapped(&map_bh)) {
-			start_blk++;
+			if (buffer_holesize(&map_bh)) /* holesize ignored */
+				start_blk++;
+			else
+				start_blk += logical_to_blk(inode,
+							    map_bh.b_size);
 			/*
 			 * We want to handle the case where there is an
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 324329c..b8ce396 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -37,6 +37,7 @@ enum bh_state_bits {
 	BH_Meta,	/* Buffer contains metadata */
 	BH_Prio,	/* Buffer should be submitted with REQ_PRIO */
 	BH_Defer_Completion, /* Defer AIO completion to workqueue */
+	BH_Holesize,	/* Return hole size (and clear) if possible */
 	BH_PrivateStart,/* not a state bit, but the first bit available
 			 * for private allocation by other entities
@@ -128,6 +129,7 @@ BUFFER_FNS(Boundary, boundary)
BUFFER_FNS(Write_EIO, write_io_error)
BUFFER_FNS(Unwritten, unwritten)
BUFFER_FNS(Meta, meta)
+BUFFER_FNS(Holesize, holesize)
BUFFER_FNS(Prio, prio)
BUFFER_FNS(Defer_Completion, defer_completion)