From: Ted Ts'o <tytso@mit.edu>
To: Allison Henderson <achender@linux.vnet.ibm.com>
Cc: Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 1/1 v3] ext4: fix xfstests 75, 112, 127 punch hole failure
Date: Wed, 3 Aug 2011 20:50:01 -0400 [thread overview]
Message-ID: <20110804005001.GI3150@thunk.org> (raw)
In-Reply-To: <4E396744.2030003@linux.vnet.ibm.com>
On Wed, Aug 03, 2011 at 08:20:36AM -0700, Allison Henderson wrote:
> @@ -4227,6 +4227,28 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
> }
> }
>
> + /*
> + * Now we need to unmap the un page aligned buffers.
> + * If the file is smaller than a page, just
> + * unmap the middle
> + */
This should be "non-page-aligned buffers". And it's not "if the file
is smaller than a page", but rather "if the file space being truncated
is smaller than a page". (The comment above this patch hunk is
similarly wrong).
> + if (first_page > last_page)
> + ext4_unmap_page_range(handle, mapping, offset, length);
> + else {
> + /* unmap page buffers before the first aligned page */
> + page_len = first_page_offset - offset;
> + if (page_len > 0)
> + ext4_unmap_page_range(handle, mapping,
> + offset, page_len);
> +
> + /* unmap the page buffers after the last aligned page */
> + page_len = offset + length - last_page_offset;
> + if (page_len > 0) {
> + ext4_unmap_page_range(handle, mapping,
> + last_page_offset, page_len);
> + }
> + }
> +
We unconditionally call ext4_unmap_page_range() at least once, and
possibly twice. The ext4_unmap_page_range() function is always going
to be calling find_or_create_page(), which requires locking pages,
taking RCU locks, etc.. None of this code should be needed if:
inode->i_sb->s_blocksize == PAGE_CACHE_SIZE
or
(offset % PAGE_CACHE_SIZE == 0) && (length % PAGE_CACHE_SIZE == 0)
> +/*
> + * ext4_unmap_page_range() unmaps a page range of length 'length'
> + * starting from file offset 'from'. The range to be unmaped must
> + * be contained with in one page. If the specified range exceeds
> + * the end of the page it will be shortened to end of the page
> + * that cooresponds to 'from'. Only block aligned buffers will
> + * be unmapped and unblock aligned buffers are skipped
> + */
> +int ext4_unmap_page_range(handle_t *handle,
> + struct address_space *mapping, loff_t from, loff_t length)
This function is only used by extents.c; maybe it should be placed in
extents.c and declared static, instead of being placed in inode.c?
> +{
> + ext4_fsblk_t index = from >> PAGE_CACHE_SHIFT;
> + unsigned int offset = from & (PAGE_CACHE_SIZE-1);
> + unsigned int blocksize, max, pos;
> + unsigned int end_of_block, range_to_unmap;
> + ext4_lblk_t iblock;
> + struct inode *inode = mapping->host;
> + struct buffer_head *bh;
> + struct page *page;
> + int err = 0;
> +
> + page = find_or_create_page(mapping, from >> PAGE_CACHE_SHIFT,
> + mapping_gfp_mask(mapping) & ~__GFP_FS);
> + if (!page)
> + return -EINVAL;
> +
> + blocksize = inode->i_sb->s_blocksize;
> + max = PAGE_CACHE_SIZE - offset;
> +
> + /*
> + * correct length if it does not fall between
> + * 'from' and the end of the page
> + */
> + if (length > max || length < 0)
> + length = max;
> +
> + iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);
> +
> + if (!page_has_buffers(page))
> + create_empty_buffers(page, blocksize, 0);
If the page doesn't have any buffers, we could just return 0; there's
no point calling create_empty_buffers() and then looping over them
all.
> +
> + /* Find the buffer that contains "offset" */
> + bh = page_buffers(page);
> + pos = blocksize;
> + while (offset >= pos) {
> + bh = bh->b_this_page;
> + iblock++;
> + pos += blocksize;
> + }
> +
> + pos = offset;
> + while (pos < offset + length) {
> + err = 0;
> +
> + /* The length of space left to zero */
> + range_to_unmap = offset + length - pos;
> +
> + /* The length of space until the end of the block */
> + end_of_block = blocksize - (pos & (blocksize-1));
> +
> + /* Do not unmap past end of block */
> + if (range_to_unmap > end_of_block)
> + range_to_unmap = end_of_block;
> +
> + if (buffer_freed(bh)) {
> + BUFFER_TRACE(bh, "freed: skip");
> + goto next;
> + }
> +
> + if (!buffer_mapped(bh)) {
> + BUFFER_TRACE(bh, "unmapped");
> + ext4_get_block(inode, iblock, bh, 0);
> + /* unmapped? It's a hole - nothing to do */
> + if (!buffer_mapped(bh)) {
> + BUFFER_TRACE(bh, "still unmapped");
> + goto next;
> + }
> + }
If the buffer is unmapped, why not just goto next right away? Why
bother calling ext4_get_block() and mapping it, when all we're going
to do is declare the buffer as unmapped anyway.
> +
> + /* If the range is not block aligned, skip */
> + if (range_to_unmap != blocksize)
> + goto next;
> +
> + if (ext4_should_journal_data(inode)) {
> + BUFFER_TRACE(bh, "get write access");
> + err = ext4_journal_get_write_access(handle, bh);
> + if (err)
> + goto unlock;
> + }
> +
> + clear_buffer_dirty(bh);
> + bh->b_bdev = NULL;
> + clear_buffer_mapped(bh);
> + clear_buffer_req(bh);
> + clear_buffer_new(bh);
> + clear_buffer_delay(bh);
> + clear_buffer_unwritten(bh);
> + clear_buffer_uptodate(bh);
> + ClearPageUptodate(page);
> +
> + BUFFER_TRACE(bh, "buffer unmapped");
> +
> + if (ext4_should_journal_data(inode)) {
> + err = ext4_handle_dirty_metadata(handle, inode, bh);
> + } else {
> + if (ext4_should_order_data(inode) &&
> + EXT4_I(inode)->jinode)
> + err = ext4_jbd2_file_inode(handle, inode);
> + }
Why are we calling ext4_handle_dirty_metadata() or
ext4_jbd2_file_inode() here? It's not necessary, since we're not
actually dirtying any buffers here. We're just marking buffer heads
as unmarked.
> +
> +next:
> + bh = bh->b_this_page;
> + iblock++;
> + pos += range_to_unmap;
> + }
> +unlock:
> + unlock_page(page);
> + page_cache_release(page);
> + return err;
> +}
> +
> +
> int ext4_can_truncate(struct inode *inode)
> {
> if (S_ISREG(inode->i_mode))
> --
> 1.7.1
next prev parent reply other threads:[~2011-08-04 0:50 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-03 15:20 [PATCH 1/1 v3] ext4: fix xfstests 75, 112, 127 punch hole failure Allison Henderson
2011-08-04 0:50 ` Ted Ts'o [this message]
2011-08-04 6:21 ` Allison Henderson
2011-08-04 7:22 ` Allison Henderson
2011-08-04 15:25 ` Ted Ts'o
2011-08-04 16:10 ` Allison Henderson
2011-08-04 17:44 ` Mingming Cao
2011-08-04 18:03 ` Allison Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110804005001.GI3150@thunk.org \
--to=tytso@mit.edu \
--cc=achender@linux.vnet.ibm.com \
--cc=linux-ext4@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.