linux-fsdevel.vger.kernel.org archive mirror
From: Hao Xu <hao.xu@linux.dev>
To: Jingbo Xu <jefflexu@linux.alibaba.com>, Vivek Goyal <vgoyal@redhat.com>
Cc: fuse-devel@lists.sourceforge.net, miklos@szeredi.hu,
	bernd.schubert@fastmail.fm, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] fuse: add a new flag to allow shared mmap in FOPEN_DIRECT_IO mode
Date: Tue, 13 Jun 2023 11:20:44 +0800
Message-ID: <28c92418-7f07-978a-0eaa-b0d6329f4133@linux.dev>
In-Reply-To: <cdb8f5d4-5a47-4c17-9f9c-8de24aede4c5@linux.alibaba.com>

Hi Jingbo,

On 6/13/23 10:56, Jingbo Xu wrote:
> 
> 
> On 5/6/23 1:01 PM, Hao Xu wrote:
>> Hi Vivek,
>>
>> On 5/6/23 04:37, Vivek Goyal wrote:
>>> On Fri, May 05, 2023 at 04:16:52PM +0800, Hao Xu wrote:
>>>> From: Hao Xu <howeyxu@tencent.com>
>>>>
>>>> FOPEN_DIRECT_IO is usually set by fuse daemon to indicate need of strong
>>>> coherency, e.g. network filesystems. Thus shared mmap is disabled since
>>>> it leverages page cache and may write to it, which may cause
>>>> inconsistency. But FOPEN_DIRECT_IO can be used not for coherency but to
>>>> reduce memory footprint as well, e.g. reduce guest memory usage with
>>>> virtiofs. Therefore, add a new flag FOPEN_DIRECT_IO_SHARED_MMAP to allow
>>>> shared mmap for these cases.
>>>>
>>>> Signed-off-by: Hao Xu <howeyxu@tencent.com>
>>>> ---
>>>>    fs/fuse/file.c            | 11 ++++++++---
>>>>    include/uapi/linux/fuse.h |  2 ++
>>>>    2 files changed, 10 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>>>> index 89d97f6188e0..655896bdb0d5 100644
>>>> --- a/fs/fuse/file.c
>>>> +++ b/fs/fuse/file.c
>>>> @@ -161,7 +161,8 @@ struct fuse_file *fuse_file_open(struct fuse_mount *fm, u64 nodeid,
>>>>        }
>>>>          if (isdir)
>>>> -        ff->open_flags &= ~FOPEN_DIRECT_IO;
>>>> +        ff->open_flags &=
>>>> +            ~(FOPEN_DIRECT_IO | FOPEN_DIRECT_IO_SHARED_MMAP);
>>>>          ff->nodeid = nodeid;
>>>> @@ -2509,8 +2510,12 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
>>>>            return fuse_dax_mmap(file, vma);
>>>>          if (ff->open_flags & FOPEN_DIRECT_IO) {
>>>> -        /* Can't provide the coherency needed for MAP_SHARED */
>>>> -        if (vma->vm_flags & VM_MAYSHARE)
>>>> +        /* Can't provide the coherency needed for MAP_SHARED.
>>>> +         * So disable it if FOPEN_DIRECT_IO_SHARED_MMAP is not
>>>> +         * set, which means we do need strong coherency.
>>>> +         */
>>>> +        if (!(ff->open_flags & FOPEN_DIRECT_IO_SHARED_MMAP) &&
>>>> +            vma->vm_flags & VM_MAYSHARE)
>>>>                return -ENODEV;
>>>
>>> Can you give an example how this is useful and how do you plan to
>>> use it?
>>>
>>> If goal is not using guest cache (either for saving memory or for cache
>>> coherency with other clients) and hence you used FOPEN_DIRECT_IO,
>>> then by allowing page cache for mmap(), we are contradicting that goal.
>>> We are neither saving memory nor staying cache coherent.
>>
>> We use it to reduce guest memory as much as we can. That means we
>> first have to keep things working, so shared mmap should work when
>> users call it, and second reduce memory when users use read/write
>> (from/to other files).
>>
>> In cases where users mostly do read/write and only sometimes call
>> shared mmap, disabling shared mmap breaks the application, but with
>> this flag we still reduce memory and the application works.
>>
>>>
>>> IIUC, for virtiofs, you want to use cache=none but at the same time
>>> allow guest applications to do shared mmap() hence you are introducing
>>> this change. There have been folks who have complained about it
>>> and I think 9pfs offers a mode which does this. So solving this
>>> problem will be nice.
>>>
>>> BTW, if "-o dax" is used, it solves this problem. But unfortunately qemu
>>> does not have DAX support yet and we also have issues with page
>>> truncation on host and MCE not travelling into the guest. So DAX is
>>> not a perfect option yet.
>>
>> Yes, as I replied in another mail, users' IO pattern may be a bunch
>> of small IOs to a bunch of small files; DAX may help, but not so much
>> in that case.
>>
>>>
>>> I agree that solving this problem will be nice. Just trying to
>>> understand the behavior better. How these cache pages will
>>> interact with read/write?
>>
>> I went through the code. It looks like there are issues when users mmap
>> a file and then write to it: this may cause a coherency problem between
>> the backend file and the frontend page cache.
>> I think this problem existed before this patch: when we private mmap
>> a file and then write to it in FOPEN_DIRECT_IO mode, the change isn't
>> propagated to the page cache because we falsely assume there is no page
>> cache in FOPEN_DIRECT_IO mode. I need to go over the code and do some
>> tests to see whether the problem is really there, and to solve it.
> 
> IIUC, the current read/write routine will still initiate direct IO to
> the server in FOPEN_DIRECT_IO mode, even if there is page cache
> populated by a shared mmap?

Yes. Currently, whether a file is mmapped private or shared in
FOPEN_DIRECT_IO mode, a write(2) to that file goes directly to the
backend. What's worse, it doesn't invalidate the page cache. I've sent
a patch for that:
https://lore.kernel.org/linux-fsdevel/0625d0cb-2a65-ffae-b072-e14a3f6c7571@linux.dev/
In that patch I flush pages regardless of whether the mmap is private or
shared; that will be tweaked in v2. I'd be glad if you have time to
review it.
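
For anyone who wants to observe the stale page cache from userspace, here
is a minimal sketch of a probe. It is an illustration only, not part of
the patch: the helper name mapping_sees_write() and the probe file path
are made up, and the file is assumed to be created and truncated by the
probe itself. It maps one byte, faults it in, overwrites it with
pwrite(2), and reports whether the mapping shows the new byte.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical probe, not kernel code.
 * Returns 1 if the mapping reflects a later pwrite(2),
 *         0 if the mapping still shows the old byte (stale page cache),
 *        -1 if any syscall fails (e.g. mmap() returning ENODEV).
 * map_flags is MAP_PRIVATE or MAP_SHARED.
 */
static int mapping_sees_write(const char *path, int map_flags)
{
	const char before = 'A', after = 'B';
	char *map;
	int fd, ret = -1;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Seed the file with a known byte. */
	if (pwrite(fd, &before, 1, 0) != 1)
		goto out_close;

	map = mmap(NULL, 1, PROT_READ, map_flags, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Fault the page in so the old byte sits in the page cache. */
	if (*(volatile char *)map != before)
		goto out_unmap;

	/* Overwrite the same byte through the write path. */
	if (pwrite(fd, &after, 1, 0) != 1)
		goto out_unmap;

	/* Re-read through the mapping and compare. */
	ret = (*(volatile char *)map == after) ? 1 : 0;

out_unmap:
	munmap(map, 1);
out_close:
	close(fd);
	return ret;
}
```

On a filesystem where write(2) goes through the page cache, calling
mapping_sees_write(path, MAP_SHARED) should return 1. On a FUSE file
opened with FOPEN_DIRECT_IO, the expectation from the discussion above is
that a MAP_PRIVATE probe can return 0 until the invalidation is in place;
I have not run it against every configuration.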

> 
> Private mmap doesn't need to care about the coherency issue, as private
> mmap is private and doesn't need to be flushed to server.  Thus IMHO the

Yes, but as I said, we should still invalidate the page cache pages of
a privately mmapped file.

> weakened DIRECT_IO, or dio_shared_mmap mode only applies to scenarios
> where strong coherency is not needed.
> 

Hmm, not exactly. We should flush the page cache pages in that case to
enforce strong coherency.
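
To make the mode under discussion concrete, the mmap check from the
quoted patch can be modeled in userspace as below. This is a sketch under
assumptions, not the kernel code: the MODEL_ names and the function
model_fuse_mmap_check() are made up, and the bit chosen for the new flag
is a placeholder because the fuse.h hunk is not quoted in this mail. Only
FOPEN_DIRECT_IO (bit 0) and VM_MAYSHARE (0x80) mirror the kernel values.

```c
#include <errno.h>

/* Mirrors FOPEN_DIRECT_IO from include/uapi/linux/fuse.h. */
#define MODEL_FOPEN_DIRECT_IO             (1u << 0)
/* Placeholder bit: the real value is not shown in the quoted diff. */
#define MODEL_FOPEN_DIRECT_IO_SHARED_MMAP (1u << 31)
/* Mirrors VM_MAYSHARE from include/linux/mm.h. */
#define MODEL_VM_MAYSHARE                 0x00000080ul

/*
 * Model of the decision made in fuse_file_mmap() after the patch:
 * a shared mapping on a direct-IO file is refused with -ENODEV
 * unless the daemon also set the shared-mmap flag.
 * Returns 0 when the mmap would be allowed to proceed.
 */
static int model_fuse_mmap_check(unsigned int open_flags,
				 unsigned long vm_flags)
{
	if (open_flags & MODEL_FOPEN_DIRECT_IO) {
		if (!(open_flags & MODEL_FOPEN_DIRECT_IO_SHARED_MMAP) &&
		    (vm_flags & MODEL_VM_MAYSHARE))
			return -ENODEV;
	}
	return 0;
}
```

Private mappings and files opened without FOPEN_DIRECT_IO are not
affected by the check; only the combination of direct IO, a shareable
mapping, and the absence of the new flag is rejected.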

Thanks,
Hao

