From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932306AbdJWOD1 (ORCPT ); Mon, 23 Oct 2017 10:03:27 -0400
Received: from mail-qt0-f193.google.com ([209.85.216.193]:50070 "EHLO
	mail-qt0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932191AbdJWODZ (ORCPT ); Mon, 23 Oct 2017 10:03:25 -0400
X-Google-Smtp-Source: ABhQp+QJnEIBm6A5tHMmaTj418cWVae5k7asqG7TEBafrycLwEv5vIv5CU6wXo2P0xTmGiw7FsrCkw==
Date: Mon, 23 Oct 2017 07:03:21 -0700
From: Tejun Heo
To: Li Bin
Cc: tanxiaofei, jiangshanlai@gmail.com, linux-kernel@vger.kernel.org, John Garry
Subject: Re: [Question] null pointer risk of kernel workqueue
Message-ID: <20171023140321.GY1302522@devbig577.frc2.facebook.com>
References: <59C62398.6040101@huawei.com>
	<20170925152536.GL828415@devbig577.frc2.facebook.com>
	<59CB6C9C.7000205@huawei.com>
	<59E99E4E.5090305@huawei.com>
	<20171021153522.GH1302522@devbig577.frc2.facebook.com>
	<2f56ab49-4a65-8e35-07ba-6577af8843b6@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <2f56ab49-4a65-8e35-07ba-6577af8843b6@huawei.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On Mon, Oct 23, 2017 at 09:34:11AM +0800, Li Bin wrote:
>
>
> on 2017/10/21 23:35, Tejun Heo wrote:
> > On Fri, Oct 20, 2017 at 02:57:18PM +0800, tanxiaofei wrote:
> >> Hi Tejun,
> >>
> >> Any comments about this?
> >
> > I think you're confused, or at least I can't understand what you're
> > trying to say. Can you create a repro?
> >
>
> Hi Tejun,
> The case is as follows:
>
> worker_thread()
> |-spin_lock_irq()
> |-process_one_work()
>     |-worker->current_pwq = pwq
>     |-spin_unlock_irq()
>     |-worker->current_func(work)
>     |-spin_lock_irq()
>     |-worker->current_pwq = NULL
> |-spin_unlock_irq()
>     //interrupt here
>     |-irq_handler
>         |-__queue_work()
>             //assuming that the wq is draining
>             |-if (unlikely(wq->flags & __WQ_DRAINING) && WARN_ON_ONCE(!is_chained_work(wq)))
>                 |-is_chained_work(wq)
>                     |-current_wq_worker() // Here, 'current' is the interrupted worker!
>                         |-current->current_pwq is NULL here!
> |-schedule()
>
> And I think the following patch can solve the bug, right?
>
> diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
> index 8635417..650680c 100644
> --- a/kernel/workqueue_internal.h
> +++ b/kernel/workqueue_internal.h
> @@ -59,7 +59,7 @@ struct worker {
>   */
>  static inline struct worker *current_wq_worker(void)
>  {
> -	if (current->flags & PF_WQ_WORKER)
> +	if (!in_irq() && (current->flags & PF_WQ_WORKER))
>  		return kthread_data(current);
>  	return NULL;
>  }

Yeah, that makes sense to me.  Can you please resend the patch with
patch description and SOB?

Thanks.

-- 
tejun
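
For readers following the trace above, here is a minimal user-space sketch of the scenario, not kernel code. Everything in it is a hypothetical stand-in: `struct task`, `current_task`, and `irq_context` model `current`, `kthread_data(current)`, and `in_irq()`, and the `PF_WQ_WORKER` value is only a placeholder. It replays the state in which an interrupt lands on a worker thread after `current_pwq` has been cleared, and compares the lookup with and without the `!in_irq()` check from the quoted diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; none of these are the real kernel types. */
#define PF_WQ_WORKER 0x20u              /* placeholder flag value, assumed for illustration */

struct pool_workqueue { int unused; };

struct worker {
    struct pool_workqueue *current_pwq; /* NULL once the work item has finished */
};

struct task {
    unsigned int flags;
    struct worker *worker_data;         /* models kthread_data(current) */
};

static struct task *current_task;       /* models 'current' */
static bool irq_context;                /* models in_irq() */

/* Lookup before the patch: trusts PF_WQ_WORKER even inside an interrupt handler. */
static struct worker *current_wq_worker_unpatched(void)
{
    if (current_task->flags & PF_WQ_WORKER)
        return current_task->worker_data;
    return NULL;
}

/* Lookup with the check from the quoted diff: never report a worker from IRQ context. */
static struct worker *current_wq_worker_patched(void)
{
    if (!irq_context && (current_task->flags & PF_WQ_WORKER))
        return current_task->worker_data;
    return NULL;
}

/*
 * Replays the state from the trace: the running task is a workqueue worker
 * whose current_pwq is already NULL. Returns true when the lookup hands back
 * that worker, i.e. when a caller dereferencing worker->current_pwq without
 * a NULL check would fault.
 */
static bool lookup_returns_idle_worker(bool patched, bool in_interrupt)
{
    struct worker w = { .current_pwq = NULL };
    struct task t = { .flags = PF_WQ_WORKER, .worker_data = &w };
    struct worker *found;

    current_task = &t;
    irq_context = in_interrupt;
    found = patched ? current_wq_worker_patched()
                    : current_wq_worker_unpatched();
    current_task = NULL;                /* do not leave a pointer to a dead stack frame */

    return found != NULL && found->current_pwq == NULL;
}
```

In this model, `lookup_returns_idle_worker(false, true)` reports the hazard, while `lookup_returns_idle_worker(true, true)` does not, because the patched lookup returns NULL in interrupt context and the caller never reaches `current_pwq`.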