From: Ji Wu <wu_ji2012@163.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org,
Andreas Dilger <adilger.kernel@dilger.ca>,
linux-fsdevel@vger.kernel.org, Zheng Liu <gnehzuil.liu@gmail.com>
Subject: Re: Two questions regarding ext4_fallocate()
Date: Sun, 05 May 2013 09:14:08 +0800
Message-ID: <5185B260.80203@163.com>
In-Reply-To: <20130504173326.GA5948@thunk.org>

Hi Theodore,

Thanks for your explanation.

These questions were originally raised by a friend of mine; after
discussing them, we could not work out a definite answer. Now
I think I can ask him to prepare a patch for it. We also found
that this unnecessary call appears in some other file systems.

Cheers,
Ji Wu

On 05/05/2013 01:33 AM, Theodore Ts'o wrote:
> On Sat, May 04, 2013 at 10:58:50PM +0800, Ji Wu wrote:
>> Hi,
>> I have two questions regarding ext4_fallocate().
>>
>> (1) The first is about FALLOC_FL_PUNCH_HOLE support: what is it
>> used for? The only use case that comes to my mind is ext4 being
>> used to store virtual machine image files. When the VMM is aware
>> of a file deletion in the guest OS, it can invoke the host file
>> system's fallocate() on the virtual machine image file to punch a
>> hole and free host storage. But how can the VMM become aware of a
>> file deletion in the guest? By presenting a virtual SSD-like block
>> device to the guest OS and capturing the TRIM commands issued by
>> the guest file system? That seems too tricky. So basically, where
>> and how does one benefit from hole punching?
> It's not too tricky; all of the hypervisors, whether it's KVM, or Xen,
> or VMWare, are already simulating a SATA device to the guest OS.
> Implementing support for the TRIM request is not that hard, and most
> of the hypervisors are doing this already. Implementing the punch
> hole functionality was indeed primarily motivated by this use case.
>
> The other historical use of this was for digital video recorders, but
> that's a much more specialized use case.
>
>> (2) At the beginning of the function ext4_ext_punch_hole(), the
>> code is as follows:
>>
>> /* write out all dirty pages to avoid race condition */
>> filemap_write_and_wait_range(mapping, offset, offset+length-1);
>> mutex_lock(&inode->i_mutex);
>> truncate_page_cache_range();
>>
>> Why does it need to synchronously write back the dirty pages that
>> fall within the hole? The on-disk data corresponding to those pages
>> is about to be deleted, so why not release those pages directly,
>> whether they are dirty or not? Furthermore, this is done before
>> the inode lock is taken, so it seems the pages could be dirtied
>> again after they are written back and before the lock is held.
>> So basically, why does it need to call
>> filemap_write_and_wait_range() before releasing those pages?
> That's a good question. Looking at it, I'm not sure we do. I
> suspect this was put in originally to avoid races with setting the
> EOFBLOCKS_FL flag, but as you point out, there's no way we can prevent
> writes from sneaking in before we grab the i_mutex. As a result, we
> ended up dropping the need for EOFBLOCKS_FL entirely.
>
> Maybe one of the ext4 developers will see something that I'm missing,
> but I think we can drop this, which will indeed be a significant
> performance improvement for systems that use the punch hole
> functionality.
>
> Cheers,
>
> - Ted
>
Thread overview: 4+ messages
[not found] <5185222A.20801@163.com>
2013-05-04 17:33 ` Two questions regarding ext4_fallocate() Theodore Ts'o
2013-05-05 1:14 ` Ji Wu [this message]
2013-05-05 7:18 ` Dmitry Monakhov
2013-05-04 15:31 Ji Wu