Re: runtime regression with "x86/mm/pat: Emulate PAT when it is disabled"

From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: Toshi Kani <toshi.kani@hpe.com>
Cc: Toshi Kani <toshi.kani@hp.com>,
	linux-kernel@vger.kernel.org,
	openembedded-core <openembedded-core@lists.openembedded.org>,
	Bruce Ashfield <bruce.ashfield@windriver.com>,
	"Hart, Darren" <darren.hart@intel.com>,
	Borislav Petkov <bp@suse.de>, "saul.wold" <saul.wold@intel.com>
Subject: Re: runtime regression with "x86/mm/pat: Emulate PAT when it is disabled"
Date: Fri, 4 Mar 2016 13:37:14 -0500
Message-ID: <20160304183713.GA26051@windriver.com>
In-Reply-To: <1457067768.15454.181.camel@hpe.com>

[Re: runtime regression with "x86/mm/pat: Emulate PAT when it is disabled"] On 03/03/2016 (Thu 22:02) Toshi Kani wrote:

> On Thu, 2016-03-03 at 15:59 -0500, Paul Gortmaker wrote:
> > So, the yocto folks moved from 4.1 to 4.4 and one of their automated
> > qemu x86-32 boot tests started failing.  None of the yocto details seem
> > to matter since I offered to help and I've reproduced it using 100%
> > mainline kernels and a generic distro toolchain as well.
> > 
> > The test case is slightly complicated, in that it relies on uvesafb
> > being modular, and so one has to juggle modules within an ext4 image
> > that qemu boots from.  We tried making uvesafb builtin, but that made
> > the issue magically vanish.  Given PAT, this isn't too surprising.
> > 
> > Richard did the preliminary investigation and analysis, and from that I
> > did a bisect, and found the commit in $SUBJECT to be the root cause, as
> > per the discussion here:
> > 
> > http://lists.openembedded.org/pipermail/openembedded-core/2016-March/118397.html
> > 
> > I'd mentioned the above to bpetkov on IRC and after confirming it was
> > still an issue on 4.5-rc6, he'd asked if I had a portable reproducer.  
> > 
> > Not sure how complicated that would be, I set out to make one from my
> > build.   With a little LD_PRELOAD type magic and ensuring all the qemu
> > components are in ./  I have one that runs on an otherwise qemu-free
> > x86-64 box. 
> > 
> > The standalone reproducer is here; launch it with 00-runme:
> > 
> > http://openlinux.wrs.com/pat-splat/reproducer.tar.bz2  
> > 
> > It is nothing fancy, just a generic yocto build of "sato" (gfx enabled
> > rootfs).  When it "works" it boots to a UI touchscreen interface.  When
> > it fails, you get a black screen with a blinking cursor (as seen in
> > "vncviewer localhost:0").
> 
> Thanks for tracking this down and packaging the reproducer.  I simply
> untar'd it and ran 00-runme, but was not able to connect to localhost:0.
> I am not familiar with qemu, so I have not looked into why, though...

Maybe it was localhost:1 in your case?  qemu should have indicated which
VNC sessions it started.  Can you paste in the output from 00-runme?
I tested the reproducer on a machine that was physically distinct from
the build machine: a generic ubuntu install with no qemu support
installed at all, and it worked there.  Plus I got Bruce to confirm it
worked on his machine, so I'm rather surprised it did not work for you.
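
On the localhost:0 versus localhost:1 question: a VNC display number N
conventionally maps to TCP port 5900 + N, so a quick probe can show which
display is actually listening.  A minimal sketch follows; the helper names,
the host, the display range and the timeout are all assumptions made for
illustration and are not part of the reproducer.

```python
import socket

VNC_BASE_PORT = 5900  # VNC display :N conventionally listens on TCP port 5900 + N


def vnc_port(display):
    """Translate a VNC display number into its conventional TCP port."""
    return VNC_BASE_PORT + display


def listening_displays(host="localhost", max_display=3, timeout=0.5):
    """Return the display numbers in 0..max_display that accept a TCP connection."""
    found = []
    for display in range(max_display + 1):
        try:
            with socket.create_connection((host, vnc_port(display)), timeout=timeout):
                found.append(display)
        except OSError:
            # Nothing listening (or unreachable) on this port; try the next display.
            pass
    return found


for display in (0, 1):
    print(f"localhost:{display} -> TCP port {vnc_port(display)}")
```

Calling listening_displays() while 00-runme is running would list the
displays worth trying with vncviewer.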

> 
> Anyway, regarding the error message:
>   "x86/PAT: Xorg:705 map pfn expected mapping type uncached-minus for [mem 0xfd000000-0xfdffffff], got write-combining"
> 
> Did it come from the following path during fork()?
>  copy_process
>   copy_mm
>    dup_mm
>     dup_mmap
>      copy_page_range
>       track_pfn_copy
>        reserve_pfn_range

The trace is consistent, and was already captured by Richard, as per:

http://lists.openembedded.org/pipermail/openembedded-core/2016-March/118397.html

which is the link given earlier.  When I say consistent, I mean that I
get essentially the same thing when booting 4.5-rc6:

[   30.098100] x86/PAT: Xorg:509 map pfn expected mapping type uncached-minus for [mem 0xfd000000-0xfdffffff], got write-combining
[   30.106782] ------------[ cut here ]------------
[   30.107093] WARNING: CPU: 0 PID: 509 at /home/paul/poky/build/tmp-glibc/work-shared/qemux86/kernel-source/arch/x86/mm/pat.c:986 untrack_pfn+0x9f/0xb0()
[   30.112553] Modules linked in: 8021q parport_pc parport floppy uvesafb
[   30.113766] CPU: 0 PID: 509 Comm: Xorg Not tainted 4.5.0-rc6-yocto-standard #1
[   30.113806] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[   30.114078]  00000000 00003286 c0149d78 c13a6c7f 00000000 00000000 c0149dac c1052bcb
[   30.114214]  c1ac7544 00000000 000001fd c1ac1ea4 000003da c104cbdf 000003da c104cbdf
[   30.114214]  00000000 cdcf0528 00000000 c0149dbc c1052ca2 00000009 00000000 c0149de0
[   30.114214] Call Trace:
[   30.114214]  [<c13a6c7f>] dump_stack+0x58/0x79
[   30.114214]  [<c1052bcb>] warn_slowpath_common+0x8b/0xc0
[   30.114214]  [<c104cbdf>] ? untrack_pfn+0x9f/0xb0
[   30.114214]  [<c104cbdf>] ? untrack_pfn+0x9f/0xb0
[   30.114214]  [<c1052ca2>] warn_slowpath_null+0x22/0x30
[   30.114214]  [<c104cbdf>] untrack_pfn+0x9f/0xb0
[   30.114214]  [<c104ecf4>] ? __kunmap_atomic+0x54/0x110
[   30.114214]  [<c114f1cf>] unmap_single_vma+0x56f/0x580
[   30.114214]  [<c11321d0>] ? pagevec_move_tail_fn+0xa0/0xa0
[   30.114214]  [<c1150123>] unmap_vmas+0x43/0x60
[   30.114214]  [<c1154d5f>] exit_mmap+0x5f/0xf0
[   30.114214]  [<c10507bd>] mmput+0x2d/0xa0
[   30.114214]  [<c1051c19>] copy_process.part.47+0x1229/0x1430
[   30.114214]  [<c1051fb4>] _do_fork+0xb4/0x3b0
[   30.114214]  [<c105239c>] SyS_clone+0x2c/0x30
[   30.114214]  [<c1001a04>] do_syscall_32_irqs_on+0x54/0xb0
[   30.114214]  [<c18b06ca>] entry_INT80_32+0x2a/0x2a
[   30.124383] ---[ end trace f7c8a5d94542f94e ]---
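
To compare a trace like the one above against a suspected call path, it
helps to reduce it to the reliable frames only (entries marked "?" are
unreliable stack leftovers).  A small sketch, assuming the 32-bit
"[<addr>] symbol+0xoff/0xlen" line format shown here; the function name is
hypothetical and the sample is a subset of the lines above.

```python
import re

# Matches e.g. "[   30.114214]  [<c104cbdf>] untrack_pfn+0x9f/0xb0".
# Group 1 captures the optional "? " marker, group 2 the symbol name.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(trace_text):
    """Return the symbol names of the reliable frames, innermost first."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


sample = """\
[   30.114214]  [<c104cbdf>] ? untrack_pfn+0x9f/0xb0
[   30.114214]  [<c104cbdf>] untrack_pfn+0x9f/0xb0
[   30.114214]  [<c114f1cf>] unmap_single_vma+0x56f/0x580
[   30.114214]  [<c1150123>] unmap_vmas+0x43/0x60
[   30.114214]  [<c1154d5f>] exit_mmap+0x5f/0xf0
[   30.114214]  [<c10507bd>] mmput+0x2d/0xa0
[   30.114214]  [<c1051c19>] copy_process.part.47+0x1229/0x1430
"""

frames = reliable_frames(sample)
print(" <- ".join(frames))
```

Reduced this way, the WARNING itself is raised from untrack_pfn, reached
through mmput and exit_mmap inside copy_process.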

> 
> If so, track_pfn_copy() obtained pgprot from a PTE, and called
> reserve_pfn_range() with it.  So, the error message indicates that the
> previous ioremap_wc() (i.e. pcm WC) resulted in a UC- mapping (i.e.
> pgprot UC-).  pcm is a logical cache type and pgprot is a HW cache type.
> They can differ when the CPU does not support a given logical type.  This
> WC to UC- conversion happens when the CPU does not support PAT.
> 
> Richard's change, which compares with pgprot values in reserve_pfn_range()
> is a good one, but I do not understand how we get into this mess.  We do
> not have this check when PAT is disabled, and WC is supported when PAT is
> enabled.
> 
> Commit 9cd25aac1 changed the initial values of the pcm<->pgprot conversion
> tables.  The tables should be initialized with the same values after
> pat_init() is called.  Is there any possibility that ioremap_wc() was
> called before pat_init()..?

I don't think it is an initcall ordering thing; recall that I said the
problem seems to go away when uvesafb is built in rather than modular.
So given that, I think it is more related to where the code lands.
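
For anyone following along, the pcm versus pgprot distinction Toshi
describes can be modelled roughly as below.  This is only one reading of
his explanation, not the kernel's code: the table contents, the tracking
dictionary and both function names are hypothetical, and the 2-bit values
stand for the (PCD, PWT) page-table bits that select a type when PAT is
unavailable.

```python
# Hypothetical, simplified model; nothing here is copied from arch/x86/mm.
WB = "write-back"
WC = "write-combining"
UC_MINUS = "uncached-minus"
UC = "uncached"

# Assumption: without PAT there is no hardware encoding for WC,
# so a WC request falls back to the UC- encoding.
PCM_TO_HW_NO_PAT = {WB: 0b00, WC: 0b10, UC_MINUS: 0b10, UC: 0b11}
HW_TO_PCM_NO_PAT = {0b00: WB, 0b10: UC_MINUS, 0b11: UC}


def ioremap_model(requested_pcm, tracked):
    """Record the requested logical type; return the HW bits written to the PTE."""
    tracked["range"] = requested_pcm
    return PCM_TO_HW_NO_PAT[requested_pcm]


def check_on_copy(pte_bits, tracked):
    """Derive the logical type back from the PTE and compare it with the tracked one."""
    want = HW_TO_PCM_NO_PAT[pte_bits]
    got = tracked["range"]
    return want, got, want == got


tracked = {}
pte = ioremap_model(WC, tracked)
want, got, ok = check_on_copy(pte, tracked)
print("types agree" if ok else f"expected mapping type {want}, got {got}")
```

In this model the round trip through the PTE loses the WC request, which
gives the same "expected uncached-minus, got write-combining" shape as the
message in the log.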

> 
> Also, can you send me a whole dmesg output?  I'd like to check how PAT is
> initialized.

I'll send the full file off-list rather than spamming everyone with it.
I'm open to booting the pre-fail commit with PAT-specific bootargs, and
the post-fail commit with the same, and diffing the two dmesg outputs if
there are bootargs you'd like me to test.  I'd also like to ensure you
have a working reproducer locally, so maybe we should look at how that
failed first.
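
The dmesg comparison could be done along these lines: strip the
timestamps so they do not show up as noise, keep only the lines that
mention PAT, and diff the two captures.  A sketch under those assumptions;
the function names and the "pre-fail"/"post-fail" labels are made up, and
the sample lines are taken from the trace earlier in this mail.

```python
import difflib
import re

# Leading "[   30.098100] " style timestamp on each dmesg line.
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def pat_lines(dmesg_text):
    """Keep only lines mentioning PAT, with the leading timestamp stripped."""
    return [
        TIMESTAMP_RE.sub("", line)
        for line in dmesg_text.splitlines()
        if "PAT" in line
    ]


def diff_pat(good_text, bad_text):
    """Unified diff of the PAT-related lines from two dmesg captures."""
    return list(
        difflib.unified_diff(
            pat_lines(good_text),
            pat_lines(bad_text),
            fromfile="pre-fail",
            tofile="post-fail",
            lineterm="",
        )
    )


good = "[   30.112553] Modules linked in: 8021q parport_pc parport floppy uvesafb\n"
bad = (
    "[   30.098100] x86/PAT: Xorg:509 map pfn expected mapping type "
    "uncached-minus for [mem 0xfd000000-0xfdffffff], got write-combining\n"
    "[   30.112553] Modules linked in: 8021q parport_pc parport floppy uvesafb\n"
)

diff = diff_pat(good, bad)
print("\n".join(diff))
```

In real use the two strings would be read from the saved dmesg files of
the two boots.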

Thanks,
Paul.
--

> 
> Thanks!
> -Toshi
