From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 7 Dec 2021 20:51:24 +0800
From: Ming Lei
To: Petr Mladek
Cc: Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Luis Chamberlain
Subject: Re: [PATCH V2 2/2] kobject: wait until kobject is cleaned up before freeing module
References: <20211129034509.2646872-1-ming.lei@redhat.com>
 <20211129034509.2646872-3-ming.lei@redhat.com>

On Tue, Dec 07, 2021 at 11:32:27AM +0100, Petr Mladek wrote:
> On Mon 2021-12-06 09:04:40, Greg Kroah-Hartman wrote:
> > On Mon, Dec 06, 2021 at 10:13:53AM +0800, Ming Lei wrote:
> > > On Fri, Dec 03, 2021 at 04:07:39PM +0100, Greg Kroah-Hartman wrote:
> > > > On Mon, Nov 29, 2021 at 11:45:09AM +0800, Ming Lei wrote:
> > > > > kobject_put() may become asynchronous because of
> > > > > CONFIG_DEBUG_KOBJECT_RELEASE. Once kobject_put() returns, the
> > > > > caller may expect the kobject to have been released after the
> > > > > last refcnt is dropped; however, CONFIG_DEBUG_KOBJECT_RELEASE
> > > > > just schedules one delayed work function for cleaning up the
> > > > > kobject.
> > > >
> > > > The caller should NOT expect the kobject to be released. That's
> > > > the whole point of dynamic reference counted objects: you never
> > > > "know" when the last reference is released. This option just makes
> > > > it obvious so that you know when to fix up code that has this
> > > > assumption.
> > > >
> > > > > Inside the cleanup handler, kobj->ktype and kobj->ktype->release
> > > > > are required.
> > > >
> > > > Yes. Is that a problem?
> > >
> > > Of course it is for CONFIG_DEBUG_KOBJECT_RELEASE, which delays the
> > > call to ->release by a random time, possibly until after the module
> > > holding ->ktype and ->ktype->release has been unloaded.
> > >
> > > As I mentioned, the issue can be triggered 100% by 'modprobe -r
> > > kset-example' when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, provided
> > > the 1st patch is applied.
> >
> > Is there any "real" kernel code that this causes problems on?
> >
> > Again, this is for debugging. Yes, this tiny example will crash that
> > way, but that is fine, as we can obviously see that the kernel code
> > here is correct.
> >
> > And if you really want to ensure that it works properly, let's wait
> > on release before allowing that module to be unloaded.
>
> This is exactly what this patch is trying to achieve. IMHO, we should
> do it another way, see below.
>
> > But again, module unload is NOT a normal operation and is not what
> > this debugging option was created to help out with.
>
> But people do unload modules, especially when testing the kernel.
> IMHO, we want both CONFIG_DEBUG_KOBJECT_RELEASE and module unload
> enabled when testing the kernel.
>
> > Again, the confusion between kobjects (which protect data) and module
> > references (which protect code) is getting mixed up here.
>
> This is a perfect description of the problem. And the problem is real.
>
> kobjects protect data, but they need to call code (callbacks) when
> they are released. Module unload is special because it removes the
> code. CONFIG_DEBUG_KOBJECT_RELEASE always delays those calls into the
> code. The result is a crash even when everything works as expected.
>
> Now, back to the proposed patch. I agree that it looks weird. It
> makes CONFIG_DEBUG_KOBJECT_RELEASE useless in this scenario.

Can you explain how this patch makes CONFIG_DEBUG_KOBJECT_RELEASE
useless? The kobject is still cleaned up after a random delay.

> I have another idea. What about adding a pointer to
> struct module *mod into struct kobj_type, plus a reference
> counter and a wait_queue into struct module? They could be
> used to block module_exit() until the reference counter
> reaches zero.
>
> I mean something like this.
>
> Let's take samples/kobject/kset-example.c as an example.
> We could define:
>
> static struct kobj_type foo_ktype = {
>         .sysfs_ops = &foo_sysfs_ops,
>         .release = foo_release,
>         .default_groups = foo_default_groups,
>         .mod = THIS_MODULE,
> };
>
> then we might do:
>
> static int kobject_add_internal(struct kobject *kobj)
> {
>         [...]
>         if (kobj->ktype->mod)
>                 module_get_kobject_reference(kobj->ktype->mod);
>         [...]
> }
>
> and
>
> static void kobject_cleanup(struct kobject *kobj)
> {
>         [...]
>         if (kobj->ktype->mod)
>                 module_put_kobject_reference(kobj->ktype->mod);
>         [...]
> }
>
> where
>
> void module_get_kobject_reference(struct module *mod)
> {
>         mutex_lock(&module_mutex);
>         mod->kobj_ref++;
>         mutex_unlock(&module_mutex);
> }
>
> void module_put_kobject_reference(struct module *mod)
> {
>         mutex_lock(&module_mutex);
>         mod->kobj_ref--;
>         if (!mod->kobj_ref)
>                 wake_up(&mod->kobj_release_wq);
>         mutex_unlock(&module_mutex);

The question is why kobjects are so special that they should take an
extra module reference here.

> }
>
> and
>
> SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
>                 unsigned int, flags)
> {
>         [...]
>         wait_event_interruptible(mod->kobj_release_wq, !mod->kobj_ref);
>         [...]
> }
>
> There might be many details to be solved.
>
> But it looks like a win-win solution. It should make module unload
> much more secure. Broken modules will just get blocked in
> module_cleanup forever. CONFIG_DEBUG_KOBJECT_RELEASE will still
> work as designed.

The above approach might work, but it needs every driver with kobjects
to be changed, so it is more complicated.

Thanks,
Ming