From: Hao Xu <hao.xu@linux.dev>
To: Jingbo Xu <jefflexu@linux.alibaba.com>, Vivek Goyal <vgoyal@redhat.com>
Cc: fuse-devel@lists.sourceforge.net, miklos@szeredi.hu,
bernd.schubert@fastmail.fm, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] fuse: add a new flag to allow shared mmap in FOPEN_DIRECT_IO mode
Date: Tue, 13 Jun 2023 11:20:44 +0800
Message-ID: <28c92418-7f07-978a-0eaa-b0d6329f4133@linux.dev>
In-Reply-To: <cdb8f5d4-5a47-4c17-9f9c-8de24aede4c5@linux.alibaba.com>

Hi Jingbo,

On 6/13/23 10:56, Jingbo Xu wrote:
>
>
> On 5/6/23 1:01 PM, Hao Xu wrote:
>> Hi Vivek,
>>
>> On 5/6/23 04:37, Vivek Goyal wrote:
>>> On Fri, May 05, 2023 at 04:16:52PM +0800, Hao Xu wrote:
>>>> From: Hao Xu <howeyxu@tencent.com>
>>>>
>>>> FOPEN_DIRECT_IO is usually set by the fuse daemon to indicate a need
>>>> for strong coherency, e.g. for network filesystems. Shared mmap is
>>>> therefore disabled, since it relies on the page cache and may write to
>>>> it, which may cause inconsistency. But FOPEN_DIRECT_IO can also be used
>>>> not for coherency but to reduce the memory footprint, e.g. to reduce
>>>> guest memory usage with virtiofs. Therefore, add a new flag
>>>> FOPEN_DIRECT_IO_SHARED_MMAP to allow shared mmap in these cases.
>>>>
>>>> Signed-off-by: Hao Xu <howeyxu@tencent.com>
>>>> ---
>>>>  fs/fuse/file.c            | 11 ++++++++---
>>>>  include/uapi/linux/fuse.h |  2 ++
>>>>  2 files changed, 10 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>>>> index 89d97f6188e0..655896bdb0d5 100644
>>>> --- a/fs/fuse/file.c
>>>> +++ b/fs/fuse/file.c
>>>> @@ -161,7 +161,8 @@ struct fuse_file *fuse_file_open(struct fuse_mount *fm, u64 nodeid,
>>>>  	}
>>>>  	if (isdir)
>>>> -		ff->open_flags &= ~FOPEN_DIRECT_IO;
>>>> +		ff->open_flags &=
>>>> +			~(FOPEN_DIRECT_IO | FOPEN_DIRECT_IO_SHARED_MMAP);
>>>>  	ff->nodeid = nodeid;
>>>> @@ -2509,8 +2510,12 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
>>>>  		return fuse_dax_mmap(file, vma);
>>>>  	if (ff->open_flags & FOPEN_DIRECT_IO) {
>>>> -		/* Can't provide the coherency needed for MAP_SHARED */
>>>> -		if (vma->vm_flags & VM_MAYSHARE)
>>>> +		/* Can't provide the coherency needed for MAP_SHARED.
>>>> +		 * So disable it if FOPEN_DIRECT_IO_SHARED_MMAP is not
>>>> +		 * set, which means we do need strong coherency.
>>>> +		 */
>>>> +		if (!(ff->open_flags & FOPEN_DIRECT_IO_SHARED_MMAP) &&
>>>> +		    vma->vm_flags & VM_MAYSHARE)
>>>>  			return -ENODEV;
>>>
>>> Can you give an example how this is useful and how do you plan to
>>> use it?
>>>
>>> If the goal is not to use the guest cache (either to save memory or for
>>> cache coherency with other clients), and hence you used FOPEN_DIRECT_IO,
>>> then by allowing page cache for mmap() we are contradicting that goal.
>>> We are neither saving memory nor cache coherent.
>>
>> We use it to reduce guest memory as much as we can, which means we must
>> first keep things functional, so shared mmap should work when users call
>> it, and second reduce memory when users do read/write (from/to other
>> files).
>>
>> In cases where users do read/write most of the time and call shared mmap
>> only occasionally, disabling shared mmap breaks the application, but
>> with this flag we still reduce memory and the application works.
>>
>>>
>>> IIUC, for virtiofs, you want to use cache=none but at the same time
>>> allow guest applications to do shared mmap() hence you are introducing
>>> this change. There have been folks who have complained about it
>>> and I think 9pfs offers a mode which does this. So solving this
>>> problem will be nice.
>>>
>>> BTW, if "-o dax" is used, it solves this problem. But unfortunately qemu
>>> does not have DAX support yet and we also have issues with page
>>> truncation
>>> on host and MCE not travelling into the guest. So DAX is not a perfect
>>> option yet.
>>
>> Yeah, just like I replied in another mail, users' IO pattern may be a
>> bunch of small IOs to a bunch of small files; dax may help, but not so
>> much in that case.
>>
>>>
>>> I agree that solving this problem will be nice. Just trying to
>>> understand the behavior better. How these cache pages will
>>> interact with read/write?
>>
>> I went through the code. It looks like there are issues when users mmap
>> a file and then write to it: this may cause a coherency problem between
>> the backend file and the frontend page cache.
>> I think this problem existed before this patchset: when we private mmap
>> a file and then write to it in FOPEN_DIRECT_IO mode, the change isn't
>> propagated to the page cache, because we falsely assume there is no page
>> cache in FOPEN_DIRECT_IO mode. I need to go over the code and do some
>> tests to see whether the problem is really there, and then fix it.
>
> IIUC, the current read/write routine will still initiate DIRECT IO to
> the server in FOPEN_DIRECT_IO mode, even if there is page cache
> populated by shared mmap?

Yes. Currently, whether a file is mmapped private or shared in
FOPEN_DIRECT_IO mode, a write(2) syscall on that file goes directly to
the backend. What's worse, it doesn't invalidate the page cache. I've
sent a patch for it:

https://lore.kernel.org/linux-fsdevel/0625d0cb-2a65-ffae-b072-e14a3f6c7571@linux.dev/

In that patch I flush pages regardless of whether the mmap is private or
shared; that will be tweaked in v2. I'd be glad if you have time to
review it.

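The stale-page-cache behaviour described above can be probed from user space. The following is only a minimal sketch, not part of either patch: the helper name mapping_sees_write() and the 4096-byte length are assumptions made for illustration. It maps a file, faults the page in, overwrites the file through pwrite(2), and reports whether the mapping observes the new data.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical probe (name and sizes are illustrative assumptions).
 * Returns 1 if a mapping of `path` observes data written via pwrite(2),
 * 0 if the mapping still shows the old data, and -1 on any error
 * (including mmap() itself being refused, e.g. with ENODEV).
 * `map_flags` should be MAP_PRIVATE or MAP_SHARED.
 * Note: the file at `path` is created or truncated.
 */
int mapping_sees_write(const char *path, int map_flags)
{
	enum { LEN = 4096 };
	char buf[LEN];
	char *map;
	int fd, ret = -1;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Fill the file with 'A' through the write syscall path. */
	memset(buf, 'A', LEN);
	if (pwrite(fd, buf, LEN, 0) != (ssize_t)LEN)
		goto out_close;

	map = mmap(NULL, LEN, PROT_READ, map_flags, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Touch the mapping so the page is faulted into the page cache. */
	if (*(volatile char *)map != 'A')
		goto out_unmap;

	/* Overwrite with 'B' through the syscall path again. */
	memset(buf, 'B', LEN);
	if (pwrite(fd, buf, LEN, 0) != (ssize_t)LEN)
		goto out_unmap;

	/* Does the mapping see the new data, or a stale cached page? */
	ret = (*(volatile char *)map == 'B');

out_unmap:
	munmap(map, LEN);
out_close:
	close(fd);
	return ret;
}
```

On a filesystem with a coherent page cache, calling this with MAP_SHARED returns 1. On a fuse mount where the file was opened with FOPEN_DIRECT_IO, a return value of 0 would indicate the missing invalidation discussed here.
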
>
> Private mmap doesn't need to care about the coherency issue, as private
> mmap is private and doesn't need to be flushed to server. Thus IMHO the

Yeah, but as I said above, we should still invalidate the page cache
pages of a privately mmapped file.

> weakened DIRECT_IO, or dio_shared_mmap mode only applies to scenarios
> where strong coherency is not needed.
>

Hmm, not exactly. We should flush the page cache pages in that case to
enforce strong coherency.

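For reference, the mmap check from the quoted patch can be modelled in user space as below. This is a sketch under stated assumptions, not the kernel code: the MODEL_* constants are local stand-ins, the bit chosen for MODEL_FOPEN_DIRECT_IO_SHARED_MMAP is a placeholder because the quoted hunks do not show the value from include/uapi/linux/fuse.h, and both function names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for illustration only; not taken from kernel headers. */
#define MODEL_FOPEN_DIRECT_IO             (1u << 0)
#define MODEL_FOPEN_DIRECT_IO_SHARED_MMAP (1u << 6)   /* placeholder bit */
#define MODEL_VM_MAYSHARE                 0x00000080ul

/*
 * Models the fuse_file_open() hunk: directories never keep the
 * direct-io flags.
 */
unsigned int model_sanitize_open_flags(unsigned int open_flags, bool isdir)
{
	if (isdir)
		open_flags &= ~(MODEL_FOPEN_DIRECT_IO |
				MODEL_FOPEN_DIRECT_IO_SHARED_MMAP);
	return open_flags;
}

/*
 * Models the fuse_file_mmap() hunk: in direct-io mode a mapping that
 * may be shared is refused with -ENODEV unless the new flag is set.
 * Returns 0 when the mapping would be allowed to proceed.
 */
int model_direct_io_mmap_check(unsigned int open_flags,
			       unsigned long vm_flags)
{
	if (!(open_flags & MODEL_FOPEN_DIRECT_IO))
		return 0;	/* not in direct-io mode: nothing to refuse */

	if (!(open_flags & MODEL_FOPEN_DIRECT_IO_SHARED_MMAP) &&
	    (vm_flags & MODEL_VM_MAYSHARE))
		return -ENODEV;	/* strong coherency wanted: no shared mmap */

	return 0;
}
```

The model only captures whether mmap is permitted. It says nothing about flushing or invalidating page cache pages on write, which is the coherency question discussed in this thread.
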
Thanks,
Hao