From: Miklos Szeredi <miklos@szeredi.hu>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Jan Kara <jack@suse.cz>,
linux-mm@kvack.org, Maxim Patlasov <MPatlasov@parallels.com>,
hmh@hmh.eng.br, mel@csn.ul.ie, t.artem@lycos.com,
Theodore Ts'o <tytso@mit.edu>, Jens Axboe <axboe@kernel.dk>,
linux-fsdevel@vger.kernel.org
Subject: Re: [patch 15/15] mm: add strictlimit knob
Date: Thu, 7 Dec 2017 09:50:23 +0100
Message-ID: <CAJfpegsE-jUOWjpMVQv76cDxp3aLpAfxrMa-vutMFa0KhVKrHw@mail.gmail.com>
In-Reply-To: <20171207041459.64myz37qwmjkoxu5@wfg-t540p.sh.intel.com>

On Thu, Dec 7, 2017 at 5:14 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> CC fuse maintainer, too.
>
> On Wed, Dec 06, 2017 at 05:09:27PM -0800, Andrew Morton wrote:
>>
>> On Fri, 1 Dec 2017 13:29:28 +0100 Jan Kara <jack@suse.cz> wrote:
>>
>>> On Thu 30-11-17 14:15:58, Andrew Morton wrote:
>>> > From: Maxim Patlasov <MPatlasov@parallels.com>
>>> > Subject: mm: add strictlimit knob
>>> >
>>> > The "strictlimit" feature was introduced to enforce per-bdi dirty
>>> > limits
>>> > for FUSE which sets bdi max_ratio to 1% by default:
>>> >
>>> > http://article.gmane.org/gmane.linux.kernel.mm/105809
>>> >
>>> > However the feature can be useful for other relatively slow or
>>> > untrusted
>>> > BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
>>> > the feature:
>>> >
>>> > echo 1 > /sys/class/bdi/X:Y/strictlimit
>>> >
>>> > Being enabled, the feature enforces bdi max_ratio limit even if global
>>> > (10%) dirty limit is not reached. Of course, the effect is not visible
>>> > until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable
>>> > value.
>>>
>>> In principle I have nothing against this and the usecase sounds
>>> reasonable
>>> (in fact I believe the lack of a feature like this is one of reasons why
>>> desktop automounters usually mount USB devices with 'sync' mount option).
>>> So feel free to add:
>>>
>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>>
>>
>> Cc Jens, who may be vaguely interested in plans to finally merge this
>> three-year-old patch?
>>
>>
>>
>> From: Maxim Patlasov <MPatlasov@parallels.com>
>> Subject: mm: add strictlimit knob
>>
>> The "strictlimit" feature was introduced to enforce per-bdi dirty limits
>> for FUSE which sets bdi max_ratio to 1% by default:
>>
>> http://article.gmane.org/gmane.linux.kernel.mm/105809
>
>
> That link is invalid for now, possibly due to the gmane site rebuild.
> I find an email thread here which looks relevant:
>
> https://sourceforge.net/p/fuse/mailman/message/35254883/
>
> Where Maxim has an interesting point:
>
> > Did any one try increasing the limit and did see any better/worse
> > performance ?
>
> We've used 20% as default value in OpenVZ kernel for a long while (1%
> was not enough to saturate our distributed parallel storage).
>
> So the knob will also enable people to _disable_ the 1% fuse limit to
> increase performance.
>
> So people can use the exposed knob in 2 ways to fit their needs, which
> is in general a good thing.
>
> However the comment in wb_position_ratio() says
>
>          * Without strictlimit feature, fuse writeback may
>          * consume arbitrary amount of RAM because it is accounted in
>          * NR_WRITEBACK_TEMP which is not involved in calculating
>          * "nr_dirty".
>
> How dangerous would that be if some user disabled the 1% fuse limit
> through the exposed knob? Will the NR_WRITEBACK_TEMP effect go far
> beyond the user's expectation (20% max dirty limit)?
>
> Looking at the fuse code, NR_WRITEBACK_TEMP will grow proportional to
> WB_WRITEBACK, which should be throttled when bdi_write_congested().
> The congested flag will be set on
>
>   fuse_conn.num_background >= fuse_conn.congestion_threshold
>
> So it looks NR_WRITEBACK_TEMP will somehow be throttled. Just that
> it's not included in the 20% dirty limit.

Only balance_dirty_pages_ratelimited() is going to limit the
generation of dirty pages; I don't think congestion flags will do
that. And (AFAICS) for fuse only BDI_CAP_STRICTLIMIT will allow
accounting temp writeback pages when throttling dirty page generation.
So without BDI_CAP_STRICTLIMIT kernel memory use of fuse may explode.

So we probably need a way to force BDI_CAP_STRICTLIMIT (i.e. do not
permit disabling it for fuse).
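
To illustrate the concern, here is a toy userspace model of the
accounting. It is only a sketch: the struct, field and function names
are made up for illustration, and the real logic in
balance_dirty_pages() is considerably more involved. The one
assumption it encodes is the point above, namely that fuse temp
writeback pages are visible to the per-bdi check but not to the global
one.

```c
#include <stdbool.h>

/* Toy page counters; the names are illustrative, not kernel symbols. */
struct toy_counters {
	unsigned long global_dirty; /* pages seen by the global dirty limit */
	unsigned long bdi_dirty;    /* dirty pages on this bdi */
	unsigned long bdi_temp;     /* fuse temp writeback pages, per-bdi only */
};

/*
 * Returns true if the writer should be throttled.
 *
 * The global check never sees bdi_temp. Only when strictlimit is set
 * is the per-bdi sum, which includes the temp pages, compared against
 * the bdi limit.
 */
static bool toy_should_throttle(const struct toy_counters *c,
				unsigned long global_limit,
				unsigned long bdi_limit,
				bool strictlimit)
{
	if (c->global_dirty > global_limit)
		return true;
	return strictlimit && (c->bdi_dirty + c->bdi_temp > bdi_limit);
}
```

In this model, a bdi holding a very large number of temp pages is
never throttled while strictlimit is off, as long as global_dirty
stays under its limit. With strictlimit on, the same state is
throttled as soon as the per-bdi sum exceeds bdi_limit.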

Please correct me if I'm wrong in any of the above statements; it's
been a long time since I've taken a detailed look at the page
writeback mechanisms.

Thanks,
Miklos