From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: mceier@gmail.com, Davidlohr Bueso <dave@stgolabs.net>,
kernel test robot <rong.a.chen@intel.com>,
Davidlohr Bueso <dbueso@suse.de>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Borislav Petkov <bp@alien8.de>,
LKML <linux-kernel@vger.kernel.org>,
lkp@lists.01.org, "Kenneth R. Crudup" <kenny@panix.com>
Subject: Re: [x86/mm/pat] 8d04a5f97a: phoronix-test-suite.glmark2.0.score -23.7% regression
Date: Sun, 1 Dec 2019 11:46:24 +0100
Message-ID: <20191201104624.GA51279@gmail.com>
In-Reply-To: <CAHk-=wh--xwpatv_Rcp3WtCPQtg-RVoXYQj8O+1TSw8os7Jtvw@mail.gmail.com>
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Sat, Nov 30, 2019 at 2:09 PM Mariusz Ceier <mceier@gmail.com> wrote:
> >
> > Contents of /sys/kernel/debug/x86/pat_memtype_list on master
> > (32ef9553635ab1236c33951a8bd9b5af1c3b1646) where performance is
> > degraded:
>
> Diff between good and bad case:
>
> @@ -1,8 +1,8 @@
> PAT memtype list:
> write-back @ 0x55ba4000-0x55ba5000
> write-back @ 0x5e88c000-0x5e8b5000
> -write-back @ 0x5e8b4000-0x5e8b8000
> write-back @ 0x5e8b4000-0x5e8b5000
> +write-back @ 0x5e8b4000-0x5e8b8000
> write-back @ 0x5e8b7000-0x5e8bb000
> write-back @ 0x5e8ba000-0x5e8bc000
> write-back @ 0x5e8bb000-0x5e8be000
> @@ -21,15 +21,15 @@
> uncached-minus @ 0xec260000-0xec264000
> uncached-minus @ 0xec300000-0xec320000
> uncached-minus @ 0xec326000-0xec327000
> -uncached-minus @ 0xf0000000-0xf0001000
> uncached-minus @ 0xf0000000-0xf8000000
> +uncached-minus @ 0xf0000000-0xf0001000
> uncached-minus @ 0xfdc43000-0xfdc44000
> uncached-minus @ 0xfe000000-0xfe001000
> uncached-minus @ 0xfed00000-0xfed01000
> uncached-minus @ 0xfed10000-0xfed16000
> uncached-minus @ 0xfed90000-0xfed91000
> -write-combining @ 0x2000000000-0x2100000000
> -write-combining @ 0x2000000000-0x2100000000
> +uncached-minus @ 0x2000000000-0x2100000000
> +uncached-minus @ 0x2000000000-0x2100000000
> uncached-minus @ 0x2100000000-0x2100001000
> uncached-minus @ 0x2100001000-0x2100002000
> uncached-minus @ 0x2ffff10000-0x2ffff20000
>
> The first two differences are just trivial ordering differences for
> overlapping ranges (starting at 0x5e8b4000 and 0xf0000000,
> respectively).
>
> But the final difference is a real difference where it used to be WC,
> and is now UC-:
>
> -write-combining @ 0x2000000000-0x2100000000
> -write-combining @ 0x2000000000-0x2100000000
> +uncached-minus @ 0x2000000000-0x2100000000
> +uncached-minus @ 0x2000000000-0x2100000000
>
> which certainly could easily explain the huge performance degradation.
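
The classification above (ordering-only differences versus a real change of memory type) can be checked mechanically. The following is a minimal sketch, assuming each dump line has the form "type @ 0xSTART-0xEND" as in the listings quoted in this thread. The struct and the helper names (memtype_line, parse_memtype_line, has_entry, count_changed) are hypothetical and exist only for this illustration; none of them is kernel code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one line of the dump; not a kernel structure. */
struct memtype_line {
    char type[32];              /* e.g. "write-combining" */
    unsigned long long start;   /* first address shown on the line */
    unsigned long long end;     /* second address shown on the line */
};

/* Parse "write-combining @ 0x2000000000-0x2100000000".
 * Returns 1 on success, 0 if the line does not match that shape. */
static int parse_memtype_line(const char *s, struct memtype_line *out)
{
    assert(s != NULL && out != NULL);
    /* %llx accepts an optional 0x prefix, like strtoull() with base 16. */
    return sscanf(s, "%31s @ %llx-%llx",
                  out->type, &out->start, &out->end) == 3;
}

/* Returns 1 if list[0..n) holds an entry with the same type and range. */
static int has_entry(const struct memtype_line *list, size_t n,
                     const struct memtype_line *key)
{
    for (size_t i = 0; i < n; i++) {
        if (list[i].start == key->start && list[i].end == key->end &&
            strcmp(list[i].type, key->type) == 0)
            return 1;
    }
    return 0;
}

/* Count entries of 'before' that have no identical entry in 'after'.
 * A pure reordering of the same entries yields 0. */
static size_t count_changed(const struct memtype_line *before, size_t nb,
                            const struct memtype_line *after, size_t na)
{
    size_t changed = 0;

    for (size_t i = 0; i < nb; i++) {
        if (!has_entry(after, na, &before[i]))
            changed++;
    }
    return changed;
}
```

Fed the two listings, a check like this would ignore the swapped lines at 0x5e8b4000 and 0xf0000000 and flag only the entries at 0x2000000000-0x2100000000, whose type differs between the good and the bad kernel.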

Indeed. Two days ago I speculated along the same lines to Kenneth R.
Crudup, who reported a similar slowdown on i915:

> * Ingo Molnar <mingo@kernel.org> wrote:
> > > * Kenneth R. Crudup <kenny@panix.com> wrote:
> > >
> > > > As soon as the i915 driver module is loaded, it takes over the
> > > > EFI framebuffer on my machine (HP Spectre X360 with Intel UHD620
> > > > Graphics) and the subsequent text (as well as any VTs) is
> > > > rendered much more slowly. I don't know if the i915/DRM guys need
> > > > to do anything to their code to take advantage of this change to
> > > > the PATs, but reverting this change (after the associated
> > > > subsequent commits) has fixed that issue for me.
> > > >
> > > > Let me know if you need any further info.
> > >
> > > This is almost certainly the PAT bits being wrong in the
> > > pagetables, i.e. an x86 bug, not a GPU driver bug.
> > >
> > >
> > > Davidlohr, any idea what's going on? The interval tree conversion went
> > > bad. The slowdown symptoms are consistent with perhaps the framebuffer
> > > not getting WC mapped, but uncacheable mapped:
> > >
> > >     ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> > >                             vma->node.start,
> > >                             vma->node.size);
> > >
> > > Which is a wrapper around ioremap_wc().
> > >
> > > To debug this it would be useful to do a before/after comparison of the
> > > kernel pagetables:
> > >
> > > - before: git checkout 8d04a5f97a^1
> > > - after: git checkout 8d04a5f97a

And yesterday:

> [...]
>
> There's another similar bugreport of a -20% GL performance drop, from
> the ktest automated benchmark suite:
>
> https://lkml.kernel.org/r/20191127005312.GD20422@shao2-debian
>
> My shot-in-the-dark hypothesis is that perhaps we somehow fail to find
> a newly mapped memtype and leave a key ioremap_wc() area uncached,
> instead of write-combining?
>
> The order of magnitude of the slowdown would be roughly consistent with
> that, in GPU limited workloads - it would be more marked in 3D scenes
> with a lot of vertices or perhaps a lot of texture changes.
>
> But this is really just a random guess.

It's not an unconditional regression: both Boris and I tried to
reproduce it on different systems that also do ioremap_wc() and didn't
measure a slowdown. Something about the memory layout probably triggers
the tree management bug.

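
To make "something about the memory layout" concrete: a follow-up patch in this thread is titled "x86/pat: Fix off-by-one bugs in interval tree search". The sketch below is not that patch. It only illustrates one way an off-by-one can depend on layout, under the assumption that the memtype ranges are half-open, [start, end), while the overlap test expects closed intervals, [start, last]. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open range [start, end); not the kernel's struct memtype. */
struct range {
    uint64_t start;
    uint64_t end;   /* exclusive: first address past the range */
};

/* Overlap test for closed intervals [a_start, a_last] and [b_start, b_last]. */
static bool closed_overlap(uint64_t a_start, uint64_t a_last,
                           uint64_t b_start, uint64_t b_last)
{
    return a_start <= b_last && b_start <= a_last;
}

/* Off by one: passes the exclusive end where the last address is expected,
 * so two ranges that merely touch are reported as overlapping. */
static bool overlaps_buggy(const struct range *a, const struct range *b)
{
    return closed_overlap(a->start, a->end, b->start, b->end);
}

/* Converts each half-open range to a closed one (last = end - 1) first. */
static bool overlaps_fixed(const struct range *a, const struct range *b)
{
    assert(a->start < a->end && b->start < b->end);
    return closed_overlap(a->start, a->end - 1, b->start, b->end - 1);
}
```

With the ranges from the listing, 0x2000000000-0x2100000000 and 0x2100000000-0x2100001000 share only a boundary address. overlaps_buggy() reports a conflict between them while overlaps_fixed() does not. A system with no neighbouring entry at exactly that boundary would never hit the difference, which would be consistent with the regression not reproducing everywhere.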
Thanks,
Ingo
Thread overview: 17+ messages
2019-11-27 0:53 [x86/mm/pat] 8d04a5f97a: phoronix-test-suite.glmark2.0.score -23.7% regression kernel test robot
2019-11-30 20:23 ` Mariusz Ceier
2019-11-30 21:27 ` Davidlohr Bueso
2019-11-30 22:08 ` Mariusz Ceier
2019-11-30 22:35 ` Linus Torvalds
2019-12-01 10:46 ` Ingo Molnar [this message]
2019-12-01 14:49 ` [PATCH] x86/pat: Fix off-by-one bugs in interval tree search Ingo Molnar
2019-12-01 16:09 ` Mariusz Ceier
2019-12-01 19:53 ` Ingo Molnar
2019-12-01 16:42 ` Kenneth R. Crudup
2019-12-01 17:01 ` Davidlohr Bueso
2019-12-01 17:08 ` Kenneth R. Crudup
2019-12-01 19:55 ` Ingo Molnar
2019-12-01 20:09 ` Kenneth R. Crudup
2019-12-01 20:30 ` Ingo Molnar
2019-12-01 20:04 ` [tip: x86/urgent] x86/mm/pat: " tip-bot2 for Ingo Molnar
2019-12-02 8:31 ` [PATCH] x86/pat: " Rong Chen