From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753985AbcARH7z (ORCPT ); Mon, 18 Jan 2016 02:59:55 -0500 Received: from mail-pa0-f65.google.com ([209.85.220.65]:36753 "EHLO mail-pa0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753902AbcARH7v (ORCPT ); Mon, 18 Jan 2016 02:59:51 -0500 Subject: Re: [PATCH 1/1] Revert "genirq: Remove the second parameter from handle_irq_event_percpu()" To: Huang Shijie , Thomas Gleixner References: <1452681116-20924-1-git-send-email-zyjzyj2000@gmail.com> <20160114012918.GA13689@sha-win-210.asiapac.arm.com> Cc: LKML , Jiang Liu , Peter Zijlstra , nd@arm.com From: zhuyj Message-ID: <569C9B8C.9050406@gmail.com> Date: Mon, 18 Jan 2016 16:00:12 +0800 User-Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 In-Reply-To: <20160114012918.GA13689@sha-win-210.asiapac.arm.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, all I made tests for this patch. To now, I can not find any similar problem. Best Regards! Zhu Yanjun On 01/14/2016 09:29 AM, Huang Shijie wrote: > On Wed, Jan 13, 2016 at 02:07:25PM +0100, Thomas Gleixner wrote: >> On Wed, 13 Jan 2016, zyjzyj2000@gmail.com wrote: >> >>> After this commit 71f64340fc0e ("genirq: Remove the second parameter >>> from handle_irq_event_percpu()") is applied, the variable action is >>> not protected by raw_spin_lock. The following calltrace will pop up. >> Thanks, for the report. I missed that detail when merging the patch! >> >> Just for correctness sake: You miss to explain why this can happen. >> >> It's not about the variable action, it's about desc->action not being >> protected anymore. So the reason why this oopses is that the action is being >> removed concurrently. >> >> CPU 0 CPU 1 >> >> free_irq() lock(desc) >> lock(desc) handle_edge_irq() >> handle_irq_event(desc) >> unlock(desc) >> desc->action = NULL handle_irq_event_percpu(desc) >> action = desc->action >> >> While the original code did: >> >> free_irq() lock(desc) >> lock(desc) handle_edge_irq() >> handle_irq_event() >> action = desc->action >> unlock(desc) >> desc->action = NULL handle_irq_event_percpu(desc, action) >> >> So now the question is whether we revert that patch or simply change >> handle_irq_event_percpu() to deal with that. Patch below. >> >> That preserves us the code size reduction of commit 71f64340fc0e. This is safe >> because we either see a valid desc->action or NULL. If the action is about to >> be removed it is still valid as free_irq() is blocked on synchronize_irq(). >> >> free_irq() lock(desc) >> lock(desc) handle_edge_irq() >> handle_irq_event(desc) >> set(INPROGRESS) >> unlock(desc) >> handle_irq_event_percpu(desc) >> action = desc->action >> desc->action = NULL >> sychronize_irq() >> while(INPROGRESS); lock(desc) >> clr(INPROGRESS) >> free(action) >> >> That's basically the same mechanism as we have for shared >> interrupts. action->next can become NULL while handle_irq_event_percpu() >> runs. Either it sees the action or NULL. It does not matter, because action >> itself cannot go away. >> >> Thanks, >> >> tglx >> >> 8<------------- >> >> --- a/kernel/irq/handle.c >> +++ b/kernel/irq/handle.c >> @@ -136,9 +136,15 @@ irqreturn_t handle_irq_event_percpu(stru >> { >> irqreturn_t retval = IRQ_NONE; >> unsigned int flags = 0, irq = desc->irq_data.irq; >> - struct irqaction *action = desc->action; >> + struct irqaction *action; >> >> - do { >> + /* >> + * READ_ONCE is not required here. The compiler cannot reload action >> + * because it'll be action->next for the second iteration of the loop. >> + */ >> + action = desc->action; >> + >> + while (action) { >> irqreturn_t res; >> >> trace_irq_handler_entry(irq, action); >> @@ -173,7 +179,7 @@ irqreturn_t handle_irq_event_percpu(stru >> >> retval |= res; >> action = action->next; >> - } while (action); >> + } >> >> add_interrupt_randomness(irq, flags); > I prefer to this patch, revert the old the patch is not a good solution. > > thanks > Huang Shijie >