Re: [PATCH] efifb: allow user to disable write combined mapping.

From: Linus Torvalds <torvalds@linux-foundation.org>
To: Peter Jones <pjones@redhat.com>,
	the arch/x86 maintainers <x86@kernel.org>
Cc: Dave Airlie <airlied@redhat.com>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	"linux-fbdev@vger.kernel.org" <linux-fbdev@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Lutomirski <luto@kernel.org>, Peter Anvin <hpa@zytor.com>
Subject: Re: [PATCH] efifb: allow user to disable write combined mapping.
Date: Tue, 18 Jul 2017 19:57:29 +0000
Message-ID: <CA+55aFwKzwDPYFsPpuQNfBaS-dL2aD0=z1hGEnkaTT1MMfWB6Q@mail.gmail.com>
In-Reply-To: <20170718143404.omgxrujngj2rhiya@redhat.com>

On Tue, Jul 18, 2017 at 7:34 AM, Peter Jones <pjones@redhat.com> wrote:
>
> Well, that's kind of amazing, given 3c004b4f7eab239e switched us /to/
> using ioremap_wc() for the exact same reason.  I'm not against letting
> the user force one way or the other if it helps, though it sure would be
> nice to know why.

It's kind of amazing for another reason too: how is ioremap_wc()
_possibly_ slower than ioremap_nocache() (which is what plain
ioremap() is)?

The difference is literally _PAGE_CACHE_MODE_WC vs _PAGE_CACHE_MODE_UC_MINUS.

Both of them should be uncached, but WC should allow much better write
behavior. It should also allow much better system behavior.

This really sounds like a band-aid patch that just hides some other
issue entirely. Maybe we screw up the cache modes for some PAT mode
setup?

Or maybe it really is something where there is one global write queue
per die (not per CPU), and having that write queue "active" doing
combining will slow down every core due to some crazy synchronization
issue?

x86 people, look at what Dave Airlie did, I'll just repeat it because
it sounds so crazy:

> A customer noticed major slowdowns while logging to the console
> with write combining enabled, on other tasks running on the same
> CPU. (10x or greater slow down on all other cores on the same CPU
> as is doing the logging).
>
> I reproduced this on a machine with dual CPUs.
> Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz (6 core)
>
> I wrote a test that just mmaps the pci bar and writes to it in
> a loop, while this was running in the background on a single
> core with (taskset -c 1), building a kernel up to init/version.o
> (taskset -c 8) went from 13s to 133s or so. I've yet to explain
> why this occurs or what is going wrong; I haven't managed to find
> a perf command that in any way gives insight into this.

So basically the UC vs WC thing seems to slow down somebody *else* (in
this case a kernel compile) on another core entirely, by a factor of
10x. Maybe the WC writer itself is much faster, but _others_ are
slowed down enormously.

Whaa? That just seems incredible.

Dave - while your test sounds very simple, can you package it up some
way so that somebody inside of Intel can just run it on one of their
machines?
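
For reference, the kind of test described in the quoted text might look
roughly like the sketch below. This is not Dave's actual program. The
function name hammer_mapping() and its parameters are made up for
illustration, and it assumes the BAR is reachable as an mmap-able file
(for example a sysfs resourceN or resourceN_wc file under the PCI
device directory, whose exact path depends on the machine). It only
issues the stores; it does not measure anything by itself.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of the file at `path` and store
 * to every 32-bit word of the mapping, `passes` times over.
 *
 * Returns the number of 32-bit stores issued, or -1 if the file could
 * not be opened or mapped.
 */
long hammer_mapping(const char *path, size_t len, long passes)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    /* volatile so the compiler cannot drop or merge the stores */
    volatile uint32_t *p = map;
    size_t words = len / sizeof(uint32_t);
    long stores = 0;

    for (long pass = 0; pass < passes; pass++) {
        for (size_t i = 0; i < words; i++) {
            p[i] = (uint32_t)pass;
            stores++;
        }
    }

    munmap(map, len);
    return stores;
}
```

To mirror the setup in the report, one would call this from a small
main() pinned with "taskset -c 1" against the WC mapping of the
framebuffer BAR, time a kernel build pinned to a different core, and
then repeat the run against the uncached mapping for comparison.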

The patch itself (to allow people to *not* do WC that is supposed to
be so much better but clearly doesn't seem to be) looks fine to me,
but it would be really good to get Intel to look at this.

                    Linus