Date: Fri, 18 Sep 2015 09:14:04 +0200
From: Ingo Molnar
To: Andy Lutomirski
Cc: x86@kernel.org, Paolo Bonzini, Peter Zijlstra, KVM list,
    Arjan van de Ven, xen-devel, linux-kernel@vger.kernel.org,
    Linus Torvalds, Andrew Morton, Thomas Gleixner
Subject: Re: [PATCH 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops
Message-ID: <20150918071403.GA1172@gmail.com>
References: <203bb8a52efae1781281fb70ccd45c3e164fbce2.1442523997.git.luto@kernel.org>
In-Reply-To: <203bb8a52efae1781281fb70ccd45c3e164fbce2.1442523997.git.luto@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

* Andy Lutomirski wrote:

> This demotes an OOPS and likely panic due to a failed non-"safe" MSR
> access to a WARN_ON_ONCE and a return of poisoned values (in the
> RDMSR case). We still write a pr_info entry unconditionally for
> debugging.
>
> To be clear, this type of failure should *not* happen. This patch
> exists to minimize the chance of nasty undebuggable failures due on
> systems that used to work due to a now-fixed CONFIG_PARAVIRT=y bug.

> +       if (opcode == 0x320f) {
> +               /* RDMSR */
> +               pr_info("bad kernel RDMSR from non-existent MSR 0x%x",
> +                       (unsigned int)regs->cx);
> +               if (!panic_on_oops) {
> +                       WARN_ON_ONCE(true);
> +
> +                       /* Patch it up with deterministic poison. */
> +                       regs->ax = 0x5aadc0de;
> +                       regs->dx = 0x8badf00d;
> +                       regs->ip += 2;
> +                       return true;

IMHO this should really not poison the result, but use zero as the result.

The poison might randomly indicate 'present' feature in various registers
that might be accessed in a buggy way. Don't send the code further down
into la-la-land by giving it a 'success'.

And yes, zero can mean success too, but we have to pick a side here ...

The warning will be enough to fix these ups, people (and in particular
distro testing people) will be watching out for them.

Thanks,

        Ingo
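
To make the poison-versus-zero point concrete, here is a rough user-space sketch, not kernel code. It only mimics the register fixup from the quoted hunk. `struct fake_regs`, `fixup_failed_rdmsr()`, `caller_sees_feature()` and `FAKE_FEATURE_BIT` are all hypothetical names made up for illustration, and bit 2 is an arbitrary choice that does not correspond to any real MSR layout. The poison constants and the 2-byte instruction skip are the ones from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few pt_regs fields the fixup touches. */
struct fake_regs {
	uint64_t ax, cx, dx, ip;
};

/* The two candidate fixup policies discussed in this thread. */
enum fixup_policy { FIXUP_POISON, FIXUP_ZERO };

/* Hypothetical "feature present" bit; not taken from any real MSR. */
#define FAKE_FEATURE_BIT (1ULL << 2)

/* Mimic the fixup: fill EDX:EAX, then skip the 2-byte RDMSR opcode. */
static void fixup_failed_rdmsr(struct fake_regs *regs, enum fixup_policy policy)
{
	assert(regs != NULL);

	if (policy == FIXUP_POISON) {
		/* Constants from the quoted patch. */
		regs->ax = 0x5aadc0de;
		regs->dx = 0x8badf00d;
	} else {
		/* The alternative suggested in this reply. */
		regs->ax = 0;
		regs->dx = 0;
	}
	regs->ip += 2;
}

/* Combine EDX:EAX into the 64-bit value an rdmsr wrapper would return. */
static uint64_t msr_value(const struct fake_regs *regs)
{
	return (regs->dx << 32) | (regs->ax & 0xffffffffULL);
}

/* A buggy caller that trusts the value and tests a feature bit. */
static bool caller_sees_feature(enum fixup_policy policy)
{
	struct fake_regs regs = { .cx = 0x12345678, .ip = 0x1000 };

	fixup_failed_rdmsr(&regs, policy);
	return (msr_value(&regs) & FAKE_FEATURE_BIT) != 0;
}
```

With these made-up definitions, `caller_sees_feature(FIXUP_POISON)` reports the feature as present, because bit 2 happens to be set in 0x5aadc0de, while `caller_sees_feature(FIXUP_ZERO)` reports it as absent. Which bits a real caller tests depends on the MSR in question, so this only illustrates the failure mode, not any specific bug.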