From: Allison Henderson <achender@linux.vnet.ibm.com>
To: "Ted Ts'o" <tytso@mit.edu>
Cc: Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 1/1 v3] ext4: fix xfstests 75, 112, 127 punch hole failure
Date: Thu, 04 Aug 2011 00:22:58 -0700 [thread overview]
Message-ID: <4E3A48D2.3070905@linux.vnet.ibm.com> (raw)
In-Reply-To: <20110804005001.GI3150@thunk.org>
On 08/03/2011 05:50 PM, Ted Ts'o wrote:
> On Wed, Aug 03, 2011 at 08:20:36AM -0700, Allison Henderson wrote:
>> @@ -4227,6 +4227,28 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
>> }
>> }
>>
>> + /*
>> + * Now we need to unmap the un page aligned buffers.
>> + * If the file is smaller than a page, just
>> + * unmap the middle
>> + */
>
> This should be "non-page-aligned buffers". And it's not "if the file
> is smaller than a page", but rather "if the file space being truncated
> is smaller than a page". (The comment above this patch hunk is
> similarly wrong).
>
>> + if (first_page> last_page)
>> + ext4_unmap_page_range(handle, mapping, offset, length);
>> + else {
>> + /* unmap page buffers before the first aligned page */
>> + page_len = first_page_offset - offset;
>> + if (page_len> 0)
>> + ext4_unmap_page_range(handle, mapping,
>> + offset, page_len);
>> +
>> + /* unmap the page buffers after the last aligned page */
>> + page_len = offset + length - last_page_offset;
>> + if (page_len> 0) {
>> + ext4_unmap_page_range(handle, mapping,
>> + last_page_offset, page_len);
>> + }
>> + }
>> +
>
> We unconditionally call ext4_unmap_page_range() at least once, and
> possibly twice. The ext4_unmap_page_range() function is always going
> to be calling find_or_create_page(), which requires locking pages,
> taking RCU locks, etc.. None of this code should be needed if:
>
> inode->i_sb->s_blocksize == PAGE_CACHE_SIZE
> or
> (offset % PAGE_CACHE_SIZE == 0)&& (length % PAGE_CACHE_SIZE == 0)
Oh, I think we do avoid calling the unmap for this last condition though. The first and last page offsets are calculated earlier for calling truncate_inode_pages_range to release all the pages in the hole. The idea is that everything from first_page_offset to last_page_offset covers all the page aligned pages in the hole. So then if offset and length are aligned, we basically end up with first_page_offset = offset and last_page_offset = offset + length, and the page_len will turn out to be zero. Right math? Maybe we can add some comments or something to help clarify.
Allison Henderson
>
>> +/*
>> + * ext4_unmap_page_range() unmaps a page range of length 'length'
>> + * starting from file offset 'from'. The range to be unmaped must
>> + * be contained with in one page. If the specified range exceeds
>> + * the end of the page it will be shortened to end of the page
>> + * that cooresponds to 'from'. Only block aligned buffers will
>> + * be unmapped and unblock aligned buffers are skipped
>> + */
>> +int ext4_unmap_page_range(handle_t *handle,
>> + struct address_space *mapping, loff_t from, loff_t length)
>
> This function is only used by extents.c; maybe it should be placed in
> extents.c and declared static, instead of being placed in inode.c?
>
>> +{
>> + ext4_fsblk_t index = from>> PAGE_CACHE_SHIFT;
>> + unsigned int offset = from& (PAGE_CACHE_SIZE-1);
>> + unsigned int blocksize, max, pos;
>> + unsigned int end_of_block, range_to_unmap;
>> + ext4_lblk_t iblock;
>> + struct inode *inode = mapping->host;
>> + struct buffer_head *bh;
>> + struct page *page;
>> + int err = 0;
>> +
>> + page = find_or_create_page(mapping, from>> PAGE_CACHE_SHIFT,
>> + mapping_gfp_mask(mapping)& ~__GFP_FS);
>> + if (!page)
>> + return -EINVAL;
>> +
>> + blocksize = inode->i_sb->s_blocksize;
>> + max = PAGE_CACHE_SIZE - offset;
>> +
>> + /*
>> + * correct length if it does not fall between
>> + * 'from' and the end of the page
>> + */
>> + if (length> max || length< 0)
>> + length = max;
>> +
>> + iblock = index<< (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);
>> +
>> + if (!page_has_buffers(page))
>> + create_empty_buffers(page, blocksize, 0);
>
> If the page doesn't have any buffers, we could just return 0; there's
> no point calling create_empty_buffers() and then looping over them
> all.
>
>> +
>> + /* Find the buffer that contains "offset" */
>> + bh = page_buffers(page);
>> + pos = blocksize;
>> + while (offset>= pos) {
>> + bh = bh->b_this_page;
>> + iblock++;
>> + pos += blocksize;
>> + }
>> +
>> + pos = offset;
>> + while (pos< offset + length) {
>> + err = 0;
>> +
>> + /* The length of space left to zero */
>> + range_to_unmap = offset + length - pos;
>> +
>> + /* The length of space until the end of the block */
>> + end_of_block = blocksize - (pos& (blocksize-1));
>> +
>> + /* Do not unmap past end of block */
>> + if (range_to_unmap> end_of_block)
>> + range_to_unmap = end_of_block;
>> +
>> + if (buffer_freed(bh)) {
>> + BUFFER_TRACE(bh, "freed: skip");
>> + goto next;
>> + }
>> +
>> + if (!buffer_mapped(bh)) {
>> + BUFFER_TRACE(bh, "unmapped");
>> + ext4_get_block(inode, iblock, bh, 0);
>> + /* unmapped? It's a hole - nothing to do */
>> + if (!buffer_mapped(bh)) {
>> + BUFFER_TRACE(bh, "still unmapped");
>> + goto next;
>> + }
>> + }
>
> If the buffer is unmapped, why not just goto next right away? Why
> bother calling ext4_get_block() and mapping it, when all we're going
> to do is declare the buffer as unmapped anyway.
>
>> +
>> + /* If the range is not block aligned, skip */
>> + if (range_to_unmap != blocksize)
>> + goto next;
>> +
>> + if (ext4_should_journal_data(inode)) {
>> + BUFFER_TRACE(bh, "get write access");
>> + err = ext4_journal_get_write_access(handle, bh);
>> + if (err)
>> + goto unlock;
>> + }
>> +
>> + clear_buffer_dirty(bh);
>> + bh->b_bdev = NULL;
>> + clear_buffer_mapped(bh);
>> + clear_buffer_req(bh);
>> + clear_buffer_new(bh);
>> + clear_buffer_delay(bh);
>> + clear_buffer_unwritten(bh);
>> + clear_buffer_uptodate(bh);
>> + ClearPageUptodate(page);
>> +
>> + BUFFER_TRACE(bh, "buffer unmapped");
>> +
>> + if (ext4_should_journal_data(inode)) {
>> + err = ext4_handle_dirty_metadata(handle, inode, bh);
>> + } else {
>> + if (ext4_should_order_data(inode)&&
>> + EXT4_I(inode)->jinode)
>> + err = ext4_jbd2_file_inode(handle, inode);
>> + }
>
> Why are we calling ext4_handle_dirty_metadata() or
> ext4_jbd2_file_inode() here? It's not necessary, since we're not
> actually dirtying any buffers here. We're just marking buffer heads
> as unmarked.
>
>> +
>> +next:
>> + bh = bh->b_this_page;
>> + iblock++;
>> + pos += range_to_unmap;
>> + }
>> +unlock:
>> + unlock_page(page);
>> + page_cache_release(page);
>> + return err;
>> +}
>> +
>> +
>> int ext4_can_truncate(struct inode *inode)
>> {
>> if (S_ISREG(inode->i_mode))
>> --
>> 1.7.1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2011-08-04 7:23 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-03 15:20 [PATCH 1/1 v3] ext4: fix xfstests 75, 112, 127 punch hole failure Allison Henderson
2011-08-04 0:50 ` Ted Ts'o
2011-08-04 6:21 ` Allison Henderson
2011-08-04 7:22 ` Allison Henderson [this message]
2011-08-04 15:25 ` Ted Ts'o
2011-08-04 16:10 ` Allison Henderson
2011-08-04 17:44 ` Mingming Cao
2011-08-04 18:03 ` Allison Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E3A48D2.3070905@linux.vnet.ibm.com \
--to=achender@linux.vnet.ibm.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.