From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from www.tglx.de (www.tglx.de [62.245.132.106])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client did not present a certificate)
	by ozlabs.org (Postfix) with ESMTPS id A91801007D2
	for ; Fri, 23 Jul 2010 04:56:19 +1000 (EST)
Date: Thu, 22 Jul 2010 20:38:35 +0200 (CEST)
From: Thomas Gleixner
To: Darren Hart
Subject: Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
In-Reply-To: <4C488CCD.60004@us.ibm.com>
References: <4C488CCD.60004@us.ibm.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: Stephen Rothwell , Gautham R Shenoy , Steven Rostedt , linuxppc-dev@ozlabs.org, Will Schmidt , Paul Mackerras
List-Id: Linux on PowerPC Developers Mail List

On Thu, 22 Jul 2010, Darren Hart wrote:

> Also of interest is that this path
> cpu_idle()->cpu_die()->pseries_mach_cpu_die() to start_secondary()
> enters with a preempt_count=1 if it wasn't corrupted across the hcall.

That triggers the problem as well. preempt_count needs to be 0 when
entering start_secondary(). So I really wonder how that ever worked.

> The early boot path from _start however appears to call
> start_secondary() with a preempt_count of 0.

Which is correct.

> The following patch is most certainly not correct, but it does eliminate

It is correct, but I think it is incomplete, as other portions of the
thread_info on the stack might be in some weird state as well.

> the situation on mainline 100% of the time (there is still a 25%
> reproduction rate on PREEMPT_RT).

But those are different issues, for which we have reasonable
explanations and patches/workarounds.

> 2) Should we call preempt_enable() in cpu_idle() prior to cpu_die() ?

No

Thanks,

	tglx
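
For readers following the thread, the entry-state argument above can be sketched as a small userspace toy model. This is not kernel code: every toy_* name is hypothetical, the struct only stands in for the preempt_count field of thread_info, and the "reset before re-entry" branch is an assumption about what the quoted patch does, since the patch itself is not shown in this message.

```c
/* Toy model only: stands in for the preempt_count field of thread_info. */
struct toy_thread_info {
	int preempt_count;
};

/* Hypothetical stand-in for preempt_disable(): bump the count. */
static void toy_preempt_disable(struct toy_thread_info *ti)
{
	ti->preempt_count++;
}

/*
 * Hypothetical stand-in for start_secondary(): it expects to be
 * entered with a count of 0. Returns 0 if the entry state is sane,
 * -1 otherwise.
 */
static int toy_start_secondary(const struct toy_thread_info *ti)
{
	return ti->preempt_count == 0 ? 0 : -1;
}

/* Early boot path: fresh state, count is 0 on entry. */
static int toy_boot_path(void)
{
	struct toy_thread_info ti = { .preempt_count = 0 };

	return toy_start_secondary(&ti);
}

/*
 * Offline/online path: the CPU leaves the idle loop with a count of 1
 * (as reported in the quoted text). If reset_first is nonzero, the
 * count is forced back to 0 before re-entry (assumed behaviour of the
 * patch under discussion).
 */
static int toy_hotplug_path(int reset_first)
{
	struct toy_thread_info ti = { .preempt_count = 0 };

	toy_preempt_disable(&ti);	/* count is now 1 */
	if (reset_first)
		ti.preempt_count = 0;
	return toy_start_secondary(&ti);
}
```

In this model toy_boot_path() and toy_hotplug_path(1) pass the entry check, while toy_hotplug_path(0) fails it. The model covers only the counter; it says nothing about the other thread_info fields that may also be stale.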