From: Linus Torvalds <torvalds@linux-foundation.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>,
lsf-pc@lists.linux-foundation.org,
linux-fsdevel@vger.kernel.org, linux-mm <linux-mm@kvack.org>,
Daniel Gomez <da.gomez@samsung.com>,
Pankaj Raghav <p.raghav@samsung.com>,
Jens Axboe <axboe@kernel.dk>, Dave Chinner <david@fromorbit.com>,
Christoph Hellwig <hch@lst.de>, Chris Mason <clm@fb.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO
Date: Sat, 24 Feb 2024 10:20:14 -0800 [thread overview]
Message-ID: <CAHk-=wj0_eGczsoTJska24Lf9Sk6VXUGrfHymcDZF_Q5ExQdxQ@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wjUkYLv23KtF=EyCrQcmf9NGwE8Yo1cuxdaLF8gqx5zWw@mail.gmail.com>
On Sat, 24 Feb 2024 at 09:31, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> And (one) important part here is "nobody sane does that". So
> benchmarking this is a bit crazy. The code is literally meant for bad
> actors, and what you are benchmarking is the kernel telling you "don't
> do that then".
Side note: one reason why the big hammer approach of "don't do that"
has worked so well is that the few loads that *do* want to do this and
have a valid reason to write large amounts of data in one go are
generally trivially translated to O_DIRECT.
For example, if you actually do things like write disk images etc,
O_DIRECT is lovely and easy - even trivial - to use. You don't even
have to write code for it, you can (and people do) just use 'dd' with
'oflag=direct'. So even trivial shell scripting has access to the
"don't do that then" flag.
In other words, I really think that Luis' benchmark triggers that
kernel "you are doing something actively wrong and stupid" logic. It's
not the kernel trying to optimize writeback. It's the kernel trying to
protect others from stupid loads.
Now, I'm also not saying that you should benchmark this with our
"vm_dirty_bytes" logic disabled. That may indeed help performance on
that benchmark, but you'll just hit other problem spots instead. Once
you fill up lots of memory, other problems become really big and
nasty, so you would then need *other* fixes for those issues.
If somebody really cares about this kind of load, and cannot use
O_DIRECT for some reason ("I actually do want caches 99% of the
time"), I suspect the solution is to have some slightly gentler way to
say "instead of the throttling logic, I want you to start my writeouts
much more synchronously".
IOW, we could have a writer flag that still uses the page cache, but
that instead of that
balance_dirty_pages_ratelimited(mapping);
in generic_perform_write(), it would actually synchronously *start*
the write, that might work a whole lot better for any load that still
wants to do big streaming writes, but wants to also keep the page
cache component.
Linus
next prev parent reply other threads:[~2024-02-24 18:20 UTC|newest]
Thread overview: 90+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-23 23:59 [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO Luis Chamberlain
2024-02-24 4:12 ` Matthew Wilcox
2024-02-24 17:31 ` Linus Torvalds
2024-02-24 18:13 ` Matthew Wilcox
2024-02-24 18:24 ` Linus Torvalds
2024-02-24 18:20 ` Linus Torvalds [this message]
2024-02-24 19:11 ` Linus Torvalds
2024-02-24 21:42 ` Theodore Ts'o
2024-02-24 22:57 ` Chris Mason
2024-02-24 23:40 ` Linus Torvalds
2024-05-10 23:57 ` Luis Chamberlain
2024-02-25 5:18 ` Kent Overstreet
2024-02-25 6:04 ` Kent Overstreet
2024-02-25 13:10 ` Matthew Wilcox
2024-02-25 17:03 ` Linus Torvalds
2024-02-25 21:14 ` Matthew Wilcox
2024-02-25 23:45 ` Linus Torvalds
2024-02-26 1:02 ` Kent Overstreet
2024-02-26 1:32 ` Linus Torvalds
2024-02-26 1:58 ` Kent Overstreet
2024-02-26 2:06 ` Kent Overstreet
2024-02-26 2:34 ` Linus Torvalds
2024-02-26 2:50 ` Al Viro
2024-02-26 17:17 ` Linus Torvalds
2024-02-26 21:07 ` Matthew Wilcox
2024-02-26 21:17 ` Kent Overstreet
2024-02-26 21:19 ` Kent Overstreet
2024-02-26 21:55 ` Paul E. McKenney
2024-02-26 23:29 ` Kent Overstreet
2024-02-27 0:05 ` Paul E. McKenney
2024-02-27 0:29 ` Kent Overstreet
2024-02-27 0:55 ` Paul E. McKenney
2024-02-27 1:08 ` Kent Overstreet
2024-02-27 5:17 ` Paul E. McKenney
2024-02-27 6:21 ` Kent Overstreet
2024-02-27 15:32 ` Paul E. McKenney
2024-02-27 15:52 ` Kent Overstreet
2024-02-27 16:06 ` Paul E. McKenney
2024-02-27 15:54 ` Matthew Wilcox
2024-02-27 16:21 ` Paul E. McKenney
2024-02-27 16:34 ` Kent Overstreet
2024-02-27 17:58 ` Paul E. McKenney
2024-02-28 23:55 ` Kent Overstreet
2024-02-29 19:42 ` Paul E. McKenney
2024-02-29 20:51 ` Kent Overstreet
2024-03-05 2:19 ` Paul E. McKenney
2024-02-27 0:43 ` Dave Chinner
2024-02-26 22:46 ` Linus Torvalds
2024-02-26 23:48 ` Linus Torvalds
2024-02-27 7:21 ` Kent Overstreet
2024-02-27 15:39 ` Matthew Wilcox
2024-02-27 15:54 ` Kent Overstreet
2024-02-27 16:34 ` Linus Torvalds
2024-02-27 16:47 ` Kent Overstreet
2024-02-27 17:07 ` Linus Torvalds
2024-02-27 17:20 ` Kent Overstreet
2024-02-27 18:02 ` Linus Torvalds
2024-05-14 11:52 ` Luis Chamberlain
2024-05-14 16:04 ` Linus Torvalds
2024-11-15 19:43 ` Linus Torvalds
2024-11-15 20:42 ` Matthew Wilcox
2024-11-15 21:52 ` Linus Torvalds
2024-02-25 21:29 ` Kent Overstreet
2024-02-25 17:32 ` Kent Overstreet
2024-02-24 17:55 ` Luis Chamberlain
2024-02-25 5:24 ` Kent Overstreet
2024-02-26 12:22 ` Dave Chinner
2024-02-27 10:07 ` Kent Overstreet
2024-02-27 14:08 ` Luis Chamberlain
2024-02-27 14:57 ` Kent Overstreet
2024-02-27 22:13 ` Dave Chinner
2024-02-27 22:21 ` Kent Overstreet
2024-02-27 22:42 ` Dave Chinner
2024-02-28 7:48 ` [Lsf-pc] " Amir Goldstein
2024-02-28 14:01 ` Chris Mason
2024-02-29 0:25 ` Dave Chinner
2024-02-29 0:57 ` Kent Overstreet
2024-03-04 0:46 ` Dave Chinner
2024-02-27 22:46 ` Linus Torvalds
2024-02-27 23:00 ` Linus Torvalds
2024-02-28 2:22 ` Kent Overstreet
2024-02-28 3:00 ` Matthew Wilcox
2024-02-28 4:22 ` Matthew Wilcox
2024-02-28 17:34 ` Kent Overstreet
2024-02-28 18:04 ` Matthew Wilcox
2024-02-28 18:18 ` Kent Overstreet
2024-02-28 19:09 ` Linus Torvalds
2024-02-28 19:29 ` Kent Overstreet
2024-02-28 20:17 ` Linus Torvalds
2024-02-28 23:21 ` Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHk-=wj0_eGczsoTJska24Lf9Sk6VXUGrfHymcDZF_Q5ExQdxQ@mail.gmail.com' \
--to=torvalds@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=clm@fb.com \
--cc=da.gomez@samsung.com \
--cc=david@fromorbit.com \
--cc=hannes@cmpxchg.org \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lsf-pc@lists.linux-foundation.org \
--cc=mcgrof@kernel.org \
--cc=p.raghav@samsung.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).