Linux block layer
 help / color / mirror / Atom feed
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Yu Kuai <yukuai1@huaweicloud.com>,
	Yang Erkun <yangerkun@huaweicloud.com>,
	Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-block@vger.kernel.org, yangerkun@huawei.com,
	axboe@kernel.dk, ulf.hansson@linaro.org, hch@lst.de,
	houtao1@huawei.com, linux-modules@vger.kernel.org,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH] brd: fix null pointer when modprobe brd
Date: Sat, 26 Oct 2024 14:28:59 +0800	[thread overview]
Message-ID: <a3e499ec-32a3-7e44-c8fd-3d01cdbee25a@huaweicloud.com> (raw)
In-Reply-To: <62e97223-a508-4174-9ba0-6f897149a825@I-love.SAKURA.ne.jp>

Hi,

在 2024/10/26 13:55, Tetsuo Handa 写道:
> On 2024/10/26 10:21, Yu Kuai wrote:
>> Hi,
>>
>> 在 2024/10/25 18:40, Tetsuo Handa 写道:
>>> On 2024/10/25 16:05, Yang Erkun wrote:
>>>> From: Yang Erkun <yangerkun@huawei.com>
>>>>
>>>> My colleague Wupeng found the following problems during fault injection:
>>>>
>>>> BUG: unable to handle page fault for address: fffffbfff809d073
>>>
>>> Excuse me, but subject says "null pointer" whereas dmesg says
>>> "not a null pointer dereference". Is this a use-after-free bug?
>>> Also, what verb comes after "when modprobe brd" ?
>>>
>>> Is this problem happening with parallel execution? If yes, parallelly
>>> running what and what?
>>
>> The problem is straightforward, to be short,
>>
>> T1: morprobe brd
>> brd_init
>>   brd_alloc
>>    add_disk
>>          T2: open brd
>>          bdev_open
>>           try_module_get
>>    // err path
>>    brd_cleanup
>>            // dereference brd_fops() while module is freed.
> 
> Then, fault injection is irrelevant, isn't it?

Fault injection must involved in the test, brd_init() is unlikely to
fail.
> 
> If bdev_open() can grab a reference before module's initialization phase
> completes is a problem, I think that we can fix the problem with just

Yes, and root cause is that stuff inside module can be freed if module
initialization failed, it's not safe to deference disk->fops in this
case.
> 
>   int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
>   	      const struct blk_holder_ops *hops, struct file *bdev_file)
>   {
>   (...snipped...)
>   	ret = -ENXIO;
>   	if (!disk_live(disk))
>   		goto abort_claiming;
> -	if (!try_module_get(disk->fops->owner))
> +	if ((disk->fops->owner && module_is_coming(disk->fops->owner)) || !try_module_get(disk->fops->owner))

This is not a common issue, do you see similiar behaviour for other
drivers? If not, I'll suggest to fix this in brd.

Thanks,
Kuai
>   		goto abort_claiming;
>   	ret = -EBUSY;
>   	if (!bdev_may_open(bdev, mode))
>   (...snipped...)
>   }
> 
> change. It would be cleaner if we can do
> 
>   bool try_module_get(struct module *module)
>   {
>   	bool ret = true;
>   
>   	if (module) {
>   		/* Note: here, we can fail to get a reference */
> -		if (likely(module_is_live(module) &&
> +		if (likely(module_is_live(module) && !module_is_coming(module) &&
>   			   atomic_inc_not_zero(&module->refcnt) != 0))
>   			trace_module_get(module, _RET_IP_);
>   		else
>   			ret = false;
>   	}
>   	return ret;
>   }
> 
> but I don't know if this change breaks something.
> Maybe we can introduce a variant like
> 
> bool try_module_get_safe(struct module *module)
> {
> 	bool ret = true;
> 
> 	if (module) {
> 		/* Note: here, we can fail to get a reference */
> 		if (likely(module_is_live(module) && !module_is_coming(module) &&
> 			   atomic_inc_not_zero(&module->refcnt) != 0))
> 			trace_module_get(module, _RET_IP_);
> 		else
> 			ret = false;
> 	}
> 	return ret;
> }
> 
> and use it from bdev_open().
> 
> 
> .
> 


  reply	other threads:[~2024-10-26  6:29 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-25  7:05 [PATCH] brd: fix null pointer when modprobe brd Yang Erkun
2024-10-25  7:59 ` Yu Kuai
2024-10-28  2:10   ` yangerkun
2024-10-25 10:40 ` Tetsuo Handa
2024-10-26  1:21   ` Yu Kuai
2024-10-26  5:55     ` Tetsuo Handa
2024-10-26  6:28       ` Yu Kuai [this message]
2024-10-26  8:06         ` Tetsuo Handa
2024-10-28  2:15           ` yangerkun
2024-11-01  0:31       ` Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a3e499ec-32a3-7e44-c8fd-3d01cdbee25a@huaweicloud.com \
    --to=yukuai1@huaweicloud.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=houtao1@huawei.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=ulf.hansson@linaro.org \
    --cc=yangerkun@huawei.com \
    --cc=yangerkun@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox