From: Jingbo Xu <jefflexu@linux.alibaba.com>
To: Miklos Szeredi <miklos@szeredi.hu>,
	Bernd Schubert <bernd.schubert@fastmail.fm>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	lege.wang@jaguarmicro.com
Subject: Re: [HELP] FUSE writeback performance bottleneck
Date: Tue, 4 Jun 2024 09:57:01 +0800
Message-ID: <bd49fcba-3eb6-4e84-a0f0-e73bce31ddb2@linux.alibaba.com>
In-Reply-To: <CAJfpeguts=V9KkBsMJN_WfdkLHPzB6RswGvumVHUMJ87zOAbDQ@mail.gmail.com>

Hi Bernd and Miklos,

On 6/3/24 11:19 PM, Miklos Szeredi wrote:
> On Mon, 3 Jun 2024 at 16:43, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>>
>>
>>
>> On 6/3/24 08:17, Jingbo Xu wrote:
>>> Hi, Miklos,
>>>
>>> We spotted a performance bottleneck for FUSE writeback in which the
>>> writeback kworker has consumed nearly 100% CPU, among which 40% CPU is
>>> used for copy_page().
>>>
>>> fuse_writepages_fill
>>>   alloc tmp_page
>>>   copy_highpage
>>>
>>> This is because of the FUSE writeback design (see commit 3be5a52b30aa
>>> ("fuse: support writable mmap")), which allocates a new temp page for
>>> each dirty page to be written back, copies the content of the dirty page
>>> to the temp page, and then writes back the temp page instead.  This
>>> special design is intentional, to avoid potential deadlocks due to a
>>> buggy or even malicious fuse user daemon.
>>
>> I also noticed that and I admit that I don't understand it yet. The commit says
>>
>> <quote>
>>     The basic problem is that there can be no guarantee about the time in which
>>     the userspace filesystem will complete a write.  It may be buggy or even
>>     malicious, and fail to complete WRITE requests.  We don't want unrelated parts
>>     of the system to grind to a halt in such cases.
>> </quote>
>>
>>
>> Timing - NFS/cifs/etc have the same issue? Even a local file system has no guarantees
>> how fast storage is?
> 
> I don't have the details but it boils down to the fact that the
> allocation context provided by GFP_NOFS (PF_MEMALLOC_NOFS) cannot be
> used by the unprivileged userspace server (and even if it could,
> there's no guarantee, that it would).
> 
> When this mechanism was introduced, the deadlock was a real
> possibility.  I'm not sure that it can still happen, but proving that
> it cannot might be difficult.

IIUC, there are two scenarios that may cause a deadlock:
1) the fuse server needs to allocate memory when processing FUSE_WRITE
requests, which in turn triggers direct memory reclaim and then FUSE
writeback - deadlock here
2) a process that triggers direct memory reclaim or calls sync(2) may
hang there forever, if the fuse server is buggy or malicious and thus
hangs when processing FUSE_WRITE requests

Thus the temp page design was introduced to avoid the above potential
issues.
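
To illustrate the idea, here is a minimal userspace sketch of the temp
page scheme in C.  It is only a simplified model, not the actual fuse
code: struct page_sim, struct write_req_sim, writeback_with_tmp_page()
and complete_write() are hypothetical names made up for this example,
and malloc()/memcpy() merely stand in for the tmp_page allocation and
copy_highpage().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096

/* Hypothetical stand-in for a page cache page. */
struct page_sim {
	unsigned char data[SIM_PAGE_SIZE];
	bool dirty;
	bool under_writeback;
};

/* Hypothetical in-flight WRITE request handed to the fuse server. */
struct write_req_sim {
	unsigned char *tmp_page;	/* private copy of the dirty page */
};

/*
 * Model of the temp page scheme: copy the dirty page into a freshly
 * allocated temp page and end writeback on the original page right
 * away.  Returns 0 on success, -1 if the allocation fails.
 */
int writeback_with_tmp_page(struct page_sim *pg, struct write_req_sim *req)
{
	unsigned char *tmp = malloc(SIM_PAGE_SIZE);

	if (!tmp)
		return -1;

	pg->under_writeback = true;
	memcpy(tmp, pg->data, SIM_PAGE_SIZE);	/* the per-page copy cost */
	req->tmp_page = tmp;

	/*
	 * The original page is clean and no longer under writeback before
	 * the server has replied, so reclaim or sync(2) waiting on this
	 * page is not blocked by the server.
	 */
	pg->dirty = false;
	pg->under_writeback = false;
	return 0;
}

/* Called when the server eventually completes the WRITE request. */
void complete_write(struct write_req_sim *req)
{
	free(req->tmp_page);
	req->tmp_page = NULL;
}
```

In this model only the temp copy stays pinned while the server is slow
or stuck, at the price of one extra allocation and one page copy per
dirty page, which is the overhead reported at the top of this thread.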

I think case 1 may be fixed (if it exists at all), but I don't know how
case 2 can be avoided, as anyone can run a fuse server in unprivileged
mode.  Or does case 2 really matter?  Please correct me if I'm missing
something.

-- 
Thanks,
Jingbo
