From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Amir Goldstein <amir73il@gmail.com>,
	Hugh Dickins <hughd@google.com>,
	Michael Larabel <Michael@michaellarabel.com>,
	Ted Ts'o <tytso@google.com>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Kernel Benchmarking
Date: Mon, 14 Sep 2020 04:31:31 +0100
Message-ID: <20200914033131.GK6583@casper.infradead.org>
In-Reply-To: <20200913234503.GS12096@dread.disaster.area>

On Mon, Sep 14, 2020 at 09:45:03AM +1000, Dave Chinner wrote:
> I have my doubts that complex page cache manipulation operations
> like ->migrate_page that rely exclusively on page and internal mm
> serialisation are really safe against ->fallocate based invalidation
> races.  I think they probably also need to be wrapped in the
> MMAPLOCK, but I don't understand all the locking and constraints
> that ->migrate_page has and there's been no evidence yet that it's a
> problem so I've kinda left that alone. I suspect that "no evidence"
> thing comes from "filesystem people are largely unable to induce
> page migrations in regression testing" so it has pretty much zero
> test coverage....

Maybe we can get someone who knows the page migration code to give
us a hack to induce pretty much constant migration?

> Stuff like THP splitting hasn't been an issue for us because the
> file-backed page cache does not support THP (yet!). That's
> something I'll be looking closely at in Willy's upcoming patchset.

One of the things I did was fail every tenth I/O to a THP.  That causes
us to split the THP when we come to try to make use of it.  Far more
effective than using dm-flakey because I know that failing a readahead
I/O should not cause any test to fail, so any newly-failing test is
caused by the THP code.
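
To make the "fail every tenth I/O" idea concrete, here is a minimal
userspace sketch of that kind of counter-based fault injection.  It is
not the patch used for the testing described above; the struct and
function names are hypothetical and only illustrate the technique.

```c
#include <stdbool.h>

/* Hypothetical fault-injection state; not taken from any kernel patch. */
struct fail_every_nth {
	unsigned int interval;	/* fail one request in every `interval`; 0 disables */
	unsigned int seen;	/* requests seen since the last injected failure */
};

/* Return true if the current request should be failed. */
static bool should_fail(struct fail_every_nth *f)
{
	if (f->interval == 0)
		return false;
	if (++f->seen < f->interval)
		return false;
	f->seen = 0;
	return true;
}

/* Count how many of nr_requests consecutive requests would be failed. */
static unsigned int count_injected_failures(unsigned int interval,
					    unsigned int nr_requests)
{
	struct fail_every_nth f = { .interval = interval, .seen = 0 };
	unsigned int failures = 0;

	for (unsigned int i = 0; i < nr_requests; i++)
		if (should_fail(&f))
			failures++;
	return failures;
}
```

With interval set to 10, the tenth, twentieth, ... request is failed and
everything else goes through, so the failure rate is deterministic rather
than random.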

I've probably spent more time looking at the page splitting and
truncate/hole-punch/invalidate/invalidate2 paths than anything else.
It's definitely an area where more eyes are welcome, and just having
more people understand it would be good.  split_huge_page_to_list and
its various helper functions are about 400 lines of code and, IMO,
a little too complex.

> The other issue here is that serialisation via individual cache
> object locking just doesn't scale in any way to the sizes of
> operations that fallocate() can run. fallocate() has 64 bit
> operands, so a user could ask us to lock down a full 8EB range of
> file. Locking that page by page, even using 1GB huge page Xarray
> slot entries, is just not practical... :/

FWIW, there's not currently a "lock down this range" mechanism in
the page cache.  If there were, it wouldn't be restricted to 4k/2M/1G
sizes -- with the XArray today, it's fairly straightforward to
lock ranges which are m * 64^n entries in size (for 1 <= m <= 63, n >= 0).
In the next year or two, I hope to be able to offer a "lock arbitrary
page range" feature which is as cheap to lock 8EiB as it is 128KiB.
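
As a rough illustration of the m * 64^n constraint, the sketch below
tests whether a given number of entries has that form.  It assumes 64
slots per XArray node, as the formula implies, and it only looks at the
size of the range, not its alignment.  The helper name is hypothetical
and is not an XArray API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the XArray API: return true if
 * nr_entries can be written as m * 64^n with 1 <= m <= 63 and n >= 0.
 */
static bool is_m_times_64_pow_n(uint64_t nr_entries)
{
	if (nr_entries == 0)
		return false;

	/* Strip one factor of 64 per tree level (assumes 64 slots per node). */
	while ((nr_entries & 63) == 0)
		nr_entries >>= 6;

	/* What is left is m, which has to fit inside a single node. */
	return nr_entries <= 63;
}
```

With 4KiB pages, 128KiB is 32 entries (m = 32, n = 0), 2MiB is 512
entries (m = 8, n = 1) and 1GiB is 262144 entries (m = 1, n = 3), so all
three qualify, while a 65-entry range does not.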

It would still be page-ranges, not byte-ranges, so I don't know how well
that fits your needs.  It doesn't solve the DIO vs page cache problems
at all, since we want DIO to ranges which happen to be within the same
pages as each other to not conflict.
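
A small sketch of why page granularity is the problem for DIO: two byte
ranges can be disjoint and still land in the same page, so a page-range
lock would serialise them.  The 4KiB page size, the struct and the
function names are assumptions made for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4KiB pages */

/* Hypothetical byte range; len must be non-zero. */
struct byte_range {
	uint64_t start;
	uint64_t len;
};

/* True if the two ranges share at least one byte. */
static bool bytes_overlap(struct byte_range a, struct byte_range b)
{
	return a.start < b.start + b.len && b.start < a.start + a.len;
}

/* True if the two ranges touch at least one common page. */
static bool pages_overlap(struct byte_range a, struct byte_range b)
{
	uint64_t a_first = a.start >> SKETCH_PAGE_SHIFT;
	uint64_t a_last = (a.start + a.len - 1) >> SKETCH_PAGE_SHIFT;
	uint64_t b_first = b.start >> SKETCH_PAGE_SHIFT;
	uint64_t b_last = (b.start + b.len - 1) >> SKETCH_PAGE_SHIFT;

	return a_first <= b_last && b_first <= a_last;
}
```

Two 512-byte writes at offsets 0 and 512 do not overlap as byte ranges,
but both fall in page 0, so they would conflict under a page-range lock.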
