From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751254AbdJWGXf (ORCPT );
	Mon, 23 Oct 2017 02:23:35 -0400
Received: from szxga04-in.huawei.com ([45.249.212.190]:8973 "EHLO
	szxga04-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751083AbdJWGXe (ORCPT );
	Mon, 23 Oct 2017 02:23:34 -0400
Subject: Re: [Question] null pointer risk of kernel workqueue
To: Li Bin , Tejun Heo
References: <59C62398.6040101@huawei.com>
 <20170925152536.GL828415@devbig577.frc2.facebook.com>
 <59CB6C9C.7000205@huawei.com> <59E99E4E.5090305@huawei.com>
 <20171021153522.GH1302522@devbig577.frc2.facebook.com>
 <2f56ab49-4a65-8e35-07ba-6577af8843b6@huawei.com>
CC: , , John Garry
From: tanxiaofei
Message-ID: <59ED8AD5.7070801@huawei.com>
Date: Mon, 23 Oct 2017 14:23:17 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: <2f56ab49-4a65-8e35-07ba-6577af8843b6@huawei.com>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.74.185.74]
X-CFilter-Loop: Reflected
Sender: linux-kernel-owner@vger.kernel.org
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

Hi Bin,

Yes, that's it. Thanks.

-- 
tanxiaofei

On 2017/10/23 9:34, Li Bin wrote:
>
>
> on 2017/10/21 23:35, Tejun Heo wrote:
>> On Fri, Oct 20, 2017 at 02:57:18PM +0800, tanxiaofei wrote:
>>> Hi Tejun,
>>>
>>> Any comments about this?
>>
>> I think you're confused, or at least I can't understand what you're
>> trying to say. Can you create a repro?
>>
>
> Hi Tejun,
> The case is as follows:
>
> worker_thread()
> |-spin_lock_irq()
> |-process_one_work()
>   |-worker->current_pwq = pwq
>   |-spin_unlock_irq()
>   |-worker->current_func(work)
>   |-spin_lock_irq()
>   |-worker->current_pwq = NULL
> |-spin_unlock_irq()
> //interrupt here
> |-irq_handler
>   |-__queue_work()
>     //assuming that the wq is draining
>     |-if (unlikely(wq->flags & __WQ_DRAINING) &&
>            WARN_ON_ONCE(!is_chained_work(wq)))
>       |-is_chained_work(wq)
>         |-current_wq_worker() // Here, 'current' is the interrupted worker!
>           |-current->current_pwq is NULL here!
> |-schedule()
>
> And I think the following patch can solve the bug, right?
>
> diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
> index 8635417..650680c 100644
> --- a/kernel/workqueue_internal.h
> +++ b/kernel/workqueue_internal.h
> @@ -59,7 +59,7 @@ struct worker {
>   */
>  static inline struct worker *current_wq_worker(void)
>  {
> -	if (current->flags & PF_WQ_WORKER)
> +	if (!in_irq() && (current->flags & PF_WQ_WORKER))
>  		return kthread_data(current);
>  	return NULL;
>  }
>
> Thanks,
> Li Bin
>
>> Thanks.
>>
>
>
> .
>