From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751254AbdJWGXf (ORCPT );
	Mon, 23 Oct 2017 02:23:35 -0400
Received: from szxga04-in.huawei.com ([45.249.212.190]:8973 "EHLO
	szxga04-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751083AbdJWGXe (ORCPT );
	Mon, 23 Oct 2017 02:23:34 -0400
Subject: Re: [Question] null pointer risk of kernel workqueue
To: Li Bin , Tejun Heo
References: <59C62398.6040101@huawei.com>
 <20170925152536.GL828415@devbig577.frc2.facebook.com>
 <59CB6C9C.7000205@huawei.com> <59E99E4E.5090305@huawei.com>
 <20171021153522.GH1302522@devbig577.frc2.facebook.com>
 <2f56ab49-4a65-8e35-07ba-6577af8843b6@huawei.com>
CC: , , John Garry
From: tanxiaofei
Message-ID: <59ED8AD5.7070801@huawei.com>
Date: Mon, 23 Oct 2017 14:23:17 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: <2f56ab49-4a65-8e35-07ba-6577af8843b6@huawei.com>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.74.185.74]
X-CFilter-Loop: Reflected
Sender: linux-kernel-owner@vger.kernel.org
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

Hi Bin,

Yes, that's it. Thanks.

-- 
tanxiaofei

On 2017/10/23 9:34, Li Bin wrote:
>
>
> on 2017/10/21 23:35, Tejun Heo wrote:
>> On Fri, Oct 20, 2017 at 02:57:18PM +0800, tanxiaofei wrote:
>>> Hi Tejun,
>>>
>>> Any comments about this?
>>
>> I think you're confused, or at least I can't understand what you're
>> trying to say. Can you create a repro?
>>
>
> Hi Tejun,
> The case is as follows:
>
> worker_thread()
> |-spin_lock_irq()
> |-process_one_work()
>   |-worker->current_pwq = pwq
>   |-spin_unlock_irq()
>   |-worker->current_func(work)
>   |-spin_lock_irq()
>   |-worker->current_pwq = NULL
> |-spin_unlock_irq()
> //interrupt here
> |-irq_handler
>   |-__queue_work()
>     //assuming that the wq is draining
>     |-if (unlikely(wq->flags & __WQ_DRAINING) &&
>            WARN_ON_ONCE(!is_chained_work(wq)))
>       |-is_chained_work(wq)
>         |-current_wq_worker() // Here, 'current' is the interrupted worker!
>           |-current->current_pwq is NULL here!
> |-schedule()
>
> And I think the following patch can solve the bug, right?
>
> diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
> index 8635417..650680c 100644
> --- a/kernel/workqueue_internal.h
> +++ b/kernel/workqueue_internal.h
> @@ -59,7 +59,7 @@ struct worker {
>   */
>  static inline struct worker *current_wq_worker(void)
>  {
> -	if (current->flags & PF_WQ_WORKER)
> +	if (!in_irq() && (current->flags & PF_WQ_WORKER))
>  		return kthread_data(current);
>  	return NULL;
>  }
>
> Thanks,
> Li Bin
>
>> Thanks.
>>
>
>
> .
>