From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops
Date: Fri, 18 Sep 2015 09:14:04 +0200
Message-ID: <20150918071403.GA1172@gmail.com>
References: <203bb8a52efae1781281fb70ccd45c3e164fbce2.1442523997.git.luto@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: x86@kernel.org, Paolo Bonzini, Peter Zijlstra, KVM list, Arjan van de Ven, xen-devel, linux-kernel@vger.kernel.org, Linus Torvalds, Andrew Morton, Thomas Gleixner
To: Andy Lutomirski
Return-path:
Received: from mail-wi0-f181.google.com ([209.85.212.181]:33537 "EHLO mail-wi0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751915AbbIRHOI (ORCPT ); Fri, 18 Sep 2015 03:14:08 -0400
Content-Disposition: inline
In-Reply-To: <203bb8a52efae1781281fb70ccd45c3e164fbce2.1442523997.git.luto@kernel.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

* Andy Lutomirski wrote:

> This demotes an OOPS and likely panic due to a failed non-"safe" MSR
> access to a WARN_ON_ONCE and a return of poisoned values (in the
> RDMSR case). We still write a pr_info entry unconditionally for
> debugging.
>
> To be clear, this type of failure should *not* happen. This patch
> exists to minimize the chance of nasty undebuggable failures due on
> systems that used to work due to a now-fixed CONFIG_PARAVIRT=y bug.

> +	if (opcode == 0x320f) {
> +		/* RDMSR */
> +		pr_info("bad kernel RDMSR from non-existent MSR 0x%x",
> +			(unsigned int)regs->cx);
> +		if (!panic_on_oops) {
> +			WARN_ON_ONCE(true);
> +
> +			/* Patch it up with deterministic poison. */
> +			regs->ax = 0x5aadc0de;
> +			regs->dx = 0x8badf00d;
> +			regs->ip += 2;
> +			return true;

IMHO this should really not poison the result, but use zero as the result.
The poison might randomly indicate 'present' feature in various registers
that might be accessed in a buggy way.
Don't send the code further down into la-la-land by giving it a 'success'.

And yes, zero can mean success too, but we have to pick a side here ...

The warning will be enough to fix these ups, people (and in particular distro
testing people) will be watching out for them.

Thanks,

	Ingo
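
For illustration only, here is a minimal user-space sketch of how the quoted RDMSR fixup might look with a zero result instead of poison. It is not the actual kernel patch: struct fake_regs, fixup_bad_rdmsr() and the two macros are hypothetical names standing in for struct pt_regs and the real fault handler, the panic_on_oops setting is passed as a parameter rather than read from the kernel global, and fprintf() stands in for pr_info() and WARN_ON_ONCE(). Only the RDMSR case shown in the quoted hunk is modelled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct pt_regs; only the
 * fields the quoted hunk touches are modelled. */
struct fake_regs {
	uint64_t ax, cx, dx, ip;
};

#define OPCODE_RDMSR 0x320f	/* opcode value tested in the quoted hunk */
#define RDMSR_LEN    2		/* RDMSR is a two-byte instruction (0f 32) */

/*
 * Returns true if the faulting RDMSR was patched up and execution can
 * resume after it, false if the caller should take the normal oops path.
 */
static bool fixup_bad_rdmsr(struct fake_regs *regs, uint16_t opcode,
			    bool panic_on_oops)
{
	static bool warned;	/* models the "once" in WARN_ON_ONCE */

	if (opcode != OPCODE_RDMSR)
		return false;

	/* Logged unconditionally, as pr_info is in the quoted hunk. */
	fprintf(stderr, "bad kernel RDMSR from non-existent MSR 0x%x\n",
		(unsigned int)regs->cx);

	if (panic_on_oops)
		return false;

	if (!warned) {
		warned = true;
		fprintf(stderr, "WARNING: non-\"safe\" RDMSR failed\n");
	}

	/* Zero result, so no feature bit can appear 'present' by accident. */
	regs->ax = 0;
	regs->dx = 0;
	regs->ip += RDMSR_LEN;	/* skip the faulting instruction */
	return true;
}
```

With panic_on_oops clear, a caller would see ax and dx zeroed and ip advanced by two bytes; with it set, the registers are left untouched and the function returns false.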