linux-kernel.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: mceier@gmail.com, Davidlohr Bueso <dave@stgolabs.net>,
	kernel test robot <rong.a.chen@intel.com>,
	Davidlohr Bueso <dbueso@suse.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@alien8.de>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, "Kenneth R. Crudup" <kenny@panix.com>
Subject: Re: [x86/mm/pat] 8d04a5f97a: phoronix-test-suite.glmark2.0.score -23.7% regression
Date: Sun, 1 Dec 2019 11:46:24 +0100
Message-ID: <20191201104624.GA51279@gmail.com>
In-Reply-To: <CAHk-=wh--xwpatv_Rcp3WtCPQtg-RVoXYQj8O+1TSw8os7Jtvw@mail.gmail.com>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sat, Nov 30, 2019 at 2:09 PM Mariusz Ceier <mceier@gmail.com> wrote:
> >
> > Contents of /sys/kernel/debug/x86/pat_memtype_list on master
> > (32ef9553635ab1236c33951a8bd9b5af1c3b1646) where performance is
> > degraded:
> 
> Diff between good and bad case:
> 
>     @@ -1,8 +1,8 @@
>      PAT memtype list:
>      write-back @ 0x55ba4000-0x55ba5000
>      write-back @ 0x5e88c000-0x5e8b5000
>     -write-back @ 0x5e8b4000-0x5e8b8000
>      write-back @ 0x5e8b4000-0x5e8b5000
>     +write-back @ 0x5e8b4000-0x5e8b8000
>      write-back @ 0x5e8b7000-0x5e8bb000
>      write-back @ 0x5e8ba000-0x5e8bc000
>      write-back @ 0x5e8bb000-0x5e8be000
>     @@ -21,15 +21,15 @@
>      uncached-minus @ 0xec260000-0xec264000
>      uncached-minus @ 0xec300000-0xec320000
>      uncached-minus @ 0xec326000-0xec327000
>     -uncached-minus @ 0xf0000000-0xf0001000
>      uncached-minus @ 0xf0000000-0xf8000000
>     +uncached-minus @ 0xf0000000-0xf0001000
>      uncached-minus @ 0xfdc43000-0xfdc44000
>      uncached-minus @ 0xfe000000-0xfe001000
>      uncached-minus @ 0xfed00000-0xfed01000
>      uncached-minus @ 0xfed10000-0xfed16000
>      uncached-minus @ 0xfed90000-0xfed91000
>     -write-combining @ 0x2000000000-0x2100000000
>     -write-combining @ 0x2000000000-0x2100000000
>     +uncached-minus @ 0x2000000000-0x2100000000
>     +uncached-minus @ 0x2000000000-0x2100000000
>      uncached-minus @ 0x2100000000-0x2100001000
>      uncached-minus @ 0x2100001000-0x2100002000
>      uncached-minus @ 0x2ffff10000-0x2ffff20000
> 
> The first two differences are just trivial ordering differences for
> overlapping ranges (starting at 0x5e8b4000 and 0xf0000000,
> respectively).
> 
> But the final difference is a real difference where it used to be WC,
> and is now UC-:
> 
>     -write-combining @ 0x2000000000-0x2100000000
>     -write-combining @ 0x2000000000-0x2100000000
>     +uncached-minus @ 0x2000000000-0x2100000000
>     +uncached-minus @ 0x2000000000-0x2100000000
> 
> which certainly could easily explain the huge performance degradation.
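
The good/bad comparison quoted above can be done mechanically so that pure ordering differences drop out. The following is a minimal sketch, assuming both dumps use the "memtype @ 0xstart-0xend" line format shown in the diff; the function names are hypothetical and not part of any kernel tooling:

```python
import re
from collections import Counter

# Matches lines such as "write-combining @ 0x2000000000-0x2100000000".
# Assumption: the dump format is exactly the one quoted in this thread.
ENTRY = re.compile(r"^\s*([a-z-]+) @ (0x[0-9a-f]+)-(0x[0-9a-f]+)\s*$")


def parse_memtype_list(text):
    """Count (start, end, memtype) entries; non-matching lines are skipped."""
    entries = Counter()
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            start = int(m.group(2), 16)
            end = int(m.group(3), 16)
            entries[(start, end, m.group(1))] += 1
    return entries


def memtype_changes(good_text, bad_text):
    """Return (removed, added) entries, ignoring the order of the lines."""
    good = parse_memtype_list(good_text)
    bad = parse_memtype_list(bad_text)
    removed = sorted((good - bad).elements())
    added = sorted((bad - good).elements())
    return removed, added
```

Fed the two lists from the diff above, this would report only the 0x2000000000-0x2100000000 entries, since the reordered write-back and uncached-minus lines count as identical.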

Indeed. Two days ago I speculated along the same lines to Kenneth R. Crudup, 
who reported a similar slowdown on i915:

> * Ingo Molnar <mingo@kernel.org> wrote:
> > > * Kenneth R. Crudup <kenny@panix.com> wrote:
> > >
> > > > As soon as the i915 driver module is loaded, it takes over the 
> > > > EFI framebuffer on my machine (HP Spectre X360 with Intel UHD620 
> > > > Graphics) and the subsequent text (as well as any VTs) is 
> > > > rendered much more slowly. I don't know if the i915/DRM guys need 
> > > > to do anything to their code to take advantage of this change to 
> > > > the PATs, but reverting this change (after the associated 
> > > > > subsequent commits) has fixed that issue for me.
> > > >
> > > > Let me know if you need any further info.
> > >
> > > This is almost certainly the PAT bits being wrong in the 
> > > pagetables, i.e. an x86 bug, not a GPU driver bug.
> > >
> > >
> > > Davidlohr, any idea what's going on? The interval tree conversion went
> > > bad. The slowdown symptoms are consistent with perhaps the framebuffer
> > > not getting WC mapped, but uncacheable mapped:
> > >
> > >                ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> > >                                         vma->node.start,
> > >                                         vma->node.size);
> > > 
> > > Which is a wrapper around ioremap_wc().
> > > 
> > > To debug this it would be useful to do a before/after comparison of the
> > > kernel pagetables:
> > > 
> > >  - before: git checkout 8d04a5f97a^1
> > >  - after:  git checkout 8d04a5f97a

And yesterday:

> [...]
>
> There's another similar bugreport of a -20% GL performance drop, from 
> the ktest automated benchmark suite:
>
>     https://lkml.kernel.org/r/20191127005312.GD20422@shao2-debian
>
> My shot-in-the-dark hypothesis is that perhaps we somehow fail to find 
> a newly mapped memtype and leave a key ioremap_wc() area uncached, 
> instead of write-combining?
>
> The order of magnitude of the slowdown would be roughly consistent with 
> that, in GPU limited workloads - it would be more marked in 3D scenes 
> with a lot of vertices or perhaps a lot of texture changes.
>
> But this is really just a random guess.

It's not an unconditional regression: both Boris and I tried to 
reproduce it on different systems that do ioremap_wc() as well and didn't 
measure a slowdown. Something about the memory layout probably 
triggers the tree management bug.

Thanks,

	Ingo
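
The follow-up patch listed in the thread overview is titled "Fix off-by-one bugs in interval tree search". As a hypothetical illustration of that class of bug (this is not the kernel code, and this message does not establish that it is the exact failure here), the sketch below stores ranges half-open, as in the memtype list, but searches with a closed-interval overlap test. Passing `end` instead of `end - 1` as the last address makes a merely adjacent range look like an overlap:

```python
def overlaps_closed(a_start, a_last, b_start, b_last):
    """Overlap test for closed intervals [start, last]."""
    return a_start <= b_last and b_start <= a_last


def find_overlaps(entries, start, end, off_by_one=False):
    """Search half-open (start, end, memtype) entries for [start, end).

    With off_by_one=True the query's last address is wrongly taken to be
    `end` rather than `end - 1`, which is the mistake being illustrated.
    """
    last = end if off_by_one else end - 1
    return [e for e in entries
            if overlaps_closed(e[0], e[1] - 1, start, last)]
```

With the adjacent pair from the list above, a query for 0x2000000000-0x2100000000 finds nothing when done correctly, but with `off_by_one=True` it also returns the uncached-minus entry that starts at 0x2100000000.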


Thread overview: 17+ messages
2019-11-27  0:53 [x86/mm/pat] 8d04a5f97a: phoronix-test-suite.glmark2.0.score -23.7% regression kernel test robot
2019-11-30 20:23 ` Mariusz Ceier
2019-11-30 21:27   ` Davidlohr Bueso
2019-11-30 22:08     ` Mariusz Ceier
2019-11-30 22:35       ` Linus Torvalds
2019-12-01 10:46         ` Ingo Molnar [this message]
2019-12-01 14:49           ` [PATCH] x86/pat: Fix off-by-one bugs in interval tree search Ingo Molnar
2019-12-01 16:09             ` Mariusz Ceier
2019-12-01 19:53               ` Ingo Molnar
2019-12-01 16:42             ` Kenneth R. Crudup
2019-12-01 17:01             ` Davidlohr Bueso
2019-12-01 17:08             ` Kenneth R. Crudup
2019-12-01 19:55               ` Ingo Molnar
2019-12-01 20:09                 ` Kenneth R. Crudup
2019-12-01 20:30                   ` Ingo Molnar
2019-12-01 20:04             ` [tip: x86/urgent] x86/mm/pat: " tip-bot2 for Ingo Molnar
2019-12-02  8:31             ` [PATCH] x86/pat: " Rong Chen
