From: Markus Trippelsdorf <markus@trippelsdorf.de>
To: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@alien8.de>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf
Date: Sat, 9 Sep 2017 10:13:38 +0200
Message-ID: <20170909081338.GB277@x4>
In-Reply-To: <CALCETrX6mFHWa1oZxS7DQUtnbqyrRRg53bvgGFv19ka8Sy_KcA@mail.gmail.com>
On 2017.09.08 at 14:47 -0700, Andy Lutomirski wrote:
> On Fri, Sep 8, 2017 at 10:16 AM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> > On 2017.09.08 at 09:12 -0700, Andy Lutomirski wrote:
> >> On Fri, Sep 8, 2017 at 4:30 AM, Markus Trippelsdorf
> >> <markus@trippelsdorf.de> wrote:
> >> > On 2017.09.08 at 12:39 +0200, Markus Trippelsdorf wrote:
> >> >> On 2017.09.08 at 12:35 +0200, Ingo Molnar wrote:
> >> >> >
> >> >> > * Markus Trippelsdorf <markus@trippelsdorf.de> wrote:
> >> >> >
> >> >> > > On 2017.09.08 at 11:16 +0200, Borislav Petkov wrote:
> >> >> > > > On Fri, Sep 08, 2017 at 10:05:36AM +0200, Borislav Petkov wrote:
> >> >> > > > > On Fri, Sep 08, 2017 at 08:26:44AM +0200, Thomas Gleixner wrote:
> >> >> > > > > > On Fri, 8 Sep 2017, Markus Trippelsdorf wrote:
> >> >> > > > > >
> >> >> > > > > > CC+ Borislav. He might have access to such a beast
> >> >> > > > >
> >> >> > > > > Can I have /proc/cpuinfo and dmesg pls, in order to see whether I have
> >> >> > > > > something similar?
> >> >> > > > >
> >> >> > > > > Private mail's fine too.
> >> >> > > >
> >> >> > > > So I don't have exactly your model - mine is model 2, stepping 3 but I see
> >> >> > > > something strange too, in dmesg:
> >> >> > >
> >> >> > > I'm pretty sure the bug is in the merged 'x86-mm-for-linus' branch:
> >> >> > > Either Andy's "PCID optimized TLB flushing" (would be my guess) or
> >> >> > > 'encrypted memory' support by Tom Lendacky.
> >> >> > >
> >> >> > > (Bisecting is hard, because sometimes I can compile stuff for over 15
> >> >> > > minutes without hitting the bug. At other times the machine locks up
> >> >> > > hard when starting X11 already.)
> >> >> >
> >> >> > Do you have the 72c0098d92ce fix?
> >> >>
> >> >> Yes. The bug still happens on the current git tree (which has the fix
> >> >> already):
> >> >
> >> > The bug is definitely caused by Andy Lutomirski's "PCID optimized TLB
> >> > flushing" patches. Tom is off the hook.
> >>
> >> I'm pretty sure it can't be PCID per se, since these CPUs are way too
> >> old and are very unlikely to have PCID.
> >
> > Yes, the CPU doesn't support PCID (but it does support PGE).
> >
> >> It could plausibly be the lazy TLB flushing changes.
> >
> > Yes, I've narrowed it down to:
> >
> > commit 94b1b03b519b81c494900cb112aa00ed205cc2d9
> > Author: Andy Lutomirski <luto@kernel.org>
> > Date: Thu Jun 29 08:53:17 2017 -0700
> >
> > x86/mm: Rework lazy TLB mode and TLB freshness tracking
> >
> >
> > Theoretically you guys should be able to reproduce the issue by using
> > the "nopcid" boot option.
> >
>
> Any chance you could test with CONFIG_DEBUG_VM=y? There are lots of
> potentially useful assertions in that code.

CONFIG_DEBUG_VM=y doesn't change anything. I still get the hard hang
without anything in the logs.

> Can you also post your /proc/cpuinfo? And can you re-confirm that a
> problematic guest kernel is causing problems in the *host*?
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 4
model name : AMD Phenom(tm) II X4 955 Processor
stepping : 2
microcode : 0x10000db
cpu MHz : 3210.960
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs null_seg amd_e400
bogomips : 6424.50
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

Unfortunately I cannot reproduce the qemu (kvm) problem anymore. (Perhaps
I have not tried long enough.)
Anyway, kvm has code that should handle erratum_383.
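
For anyone who wants to check the PCID/PGE point against their own machine,
here is a minimal sketch that parses a "flags" line in the /proc/cpuinfo
format shown above. The helper name cpu_flags and the shortened sample string
are illustrative assumptions, not anything from the kernel tree.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line as a set.

    Hypothetical helper for illustration; expects text in the
    /proc/cpuinfo "key : value" layout.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Shortened excerpt of the flags line posted above (assumption: truncated
# here for readability; the full line contains no "pcid" entry either).
sample = "flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat\n"

flags = cpu_flags(sample)
has_pcid = "pcid" in flags
has_pge = "pge" in flags
print("pcid:", has_pcid, "pge:", has_pge)
```

On a live system the same function can be fed the contents of /proc/cpuinfo,
e.g. cpu_flags(open("/proc/cpuinfo").read()).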
--
Markus