All of lore.kernel.org
 help / color / mirror / Atom feed
From: Schspa Shi <schspa@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com, swboyd@chromium.org, linux@roeck-us.net,
	wuchi.zero@gmail.com, linux-kernel@vger.kernel.org,
	syzbot+5093ba19745994288b53@syzkaller.appspotmail.com
Subject: Re: [PATCH 1/2] debugobject: fix concurrency issues with is_static_object
Date: Wed, 22 Mar 2023 23:40:29 +0800	[thread overview]
Message-ID: <m2ttycd8kx.fsf@gmail.com> (raw)
In-Reply-To: <87bkl9jt3a.ffs@tglx>


Thomas Gleixner <tglx@linutronix.de> writes:

> Hi!
>
> On Sat, Mar 04 2023 at 00:19, Schspa Shi wrote:
>
>> The is_static_object implementation relay on the initial state of the
>> object. If multiple places are accessed concurrently, there is a
>> probability that the debug object has been registered in the system, which
>> will invalidate the judgment of is_static_object.
>>
>> The following is the scenario where the problem occurs:
>>
>> T0                                                   T1
>> =========================================================================
>> mod_timer();
>>   debug_object_assert_init
>> 	db = get_bucket((unsigned long) addr);
>> 	raw_spin_lock_irqsave(&db->lock, flags);
>> 	obj = lookup_object(addr, db);
>>     if (!obj) {
>> 		raw_spin_unlock_irqrestore(&db->lock, flags);
>>         << Context switch >>
>>                                              mod_timer();
>>                                                debug_object_assert_init
>>                                                ...
>>                                                enqueue_timer();
>>         /*
>>          * The initial state changed a static timer object, and
>>          * is_static_object will return false
>>          */
>>
>>         if (descr->is_static_object &&
>> 			descr->is_static_object(addr)) {
>>                 debug_object_init();
>>             } else {
>>                << Hit here for a static object >>
>> 			   debug_print_object(&o, "assert_init");
>> 			   debug_object_fixup(descr->fixup_assert_init, addr,
>> 					   ODEBUG_STATE_NOTAVAILABLE);
>>             }
>>     }
>
> That analysis is correct.
>
>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>> index df86e649d8be0..d1be18158a1f7 100644
>> --- a/lib/debugobjects.c
>> +++ b/lib/debugobjects.c
>> @@ -275,6 +275,8 @@ alloc_object(void *addr, struct debug_bucket *b, const struct debug_obj_descr *d
>>  		obj->descr  = descr;
>>  		obj->state  = ODEBUG_STATE_NONE;
>>  		obj->astate = 0;
>> +		obj->is_static = descr->is_static_object &&
>> +			descr->is_static_object(addr);
>>  		hlist_add_head(&obj->node, &b->list);
>>  	}
>>  	return obj;
>> @@ -581,7 +583,16 @@ __debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack
>>  			debug_objects_oom();
>>  			return;
>>  		}
>> +
>>  		check_stack = true;
>> +	} else {
>> +		/*
>> +		 * The debug object is inited, and we should check this again
>> +		 */
>> +		if (obj->is_static) {
>> +			raw_spin_unlock_irqrestore(&db->lock, flags);
>> +			return;
>
> This is broken. If the object is static and already hashed and in active
> state then this returns and fails to detect the re-initialization of an
> active object.
>

Yes, it's right, this can be fixed by pass a skip_ifstatic parameters
from debug_object_activate. then re-initialization of an active object
can be detected.

>> +/*
>> + * Check static object.
>> + */
>> +static bool debug_object_check_static(struct debug_bucket *db, void *addr,
>> +			const struct debug_obj_descr *descr)
>> +{
>> +	struct debug_obj *obj;
>> +
>> +	/*
>> +	 * The is_static_object implementation relay on the initial state of the
>> +	 * object. If multiple places are accessed concurrently, there is a
>> +	 * probability that the debug object has been registered in the system,
>> +	 * which will invalidate the judgment of is_static_object.
>> +	 */
>> +	lockdep_assert_held(&db->lock);
>> +
>> +	obj = lookup_object(addr, db);
>
> What does this lookup solve? See below...
>

This lookup can be droped by add a extra obj parameter.

>> +	if (likely(obj))
>> +		return obj->is_static;
>> +
>> +	return descr->is_static_object && descr->is_static_object(addr);
>> +}
>
>> @@ -696,6 +731,7 @@ int debug_object_activate(void *addr, const struct debug_obj_descr *descr)
>>  		return ret;
>>  	}
>>  
>> +	is_static = debug_object_check_static(db, addr, descr);
>
> It's already clear that the object is not hashed because activate does:
>
>         lock()
>         lookup()
>         if (ob) {
>            ....
>            unlock();
>         }
>
> At this point nothing has changed after the lookup because
> the hash bucket is still locked.
>

Yes, the obj variable should be passed to debug_object_check_static.

>>  	raw_spin_unlock_irqrestore(&db->lock, flags);
>  
>>  	/*
>> @@ -705,7 +741,7 @@ int debug_object_activate(void *addr, const struct debug_obj_descr *descr)
>>  	 * static object is tracked in the object tracker. If
>>  	 * not, this must be a bug, so we try to fix it up.
>>  	 */
>> -	if (descr->is_static_object && descr->is_static_object(addr)) {
>> +	if (is_static) {
>>  		/* track this static object */
>>  		debug_object_init(addr, descr);
>
> This cannot work as the change in debug_object_init() is not correct and
> there is no way to make it correct.
>

Function debug_object_init() can be fixed as explained above.

>> -static __initdata struct self_test obj = { .static_init = 0 };
>> +static struct self_test obj __initdata = { .static_init = 0 };
>> +static struct self_test sobj __initdata = { .static_init = 1 };
>
> ...
>
>> -	obj.static_init = 1;
>
> Plus the s/obj/sobj/ which should be equivalent, unless I'm missing
> something here.
>

We have saved the is_static state when it is used at the first time, so
the is_static_object function won't be called in this environment.

> But that also changes the actual test:
>
> -	obj.static_init = 1;
> -	debug_object_activate(&obj, &descr_type_test);
> -	if (check_results(&obj, ODEBUG_STATE_ACTIVE, fixups, warnings))
> +	debug_object_init(&sobj, &descr_type_test);
> +	debug_object_activate(&sobj, &descr_type_test);
> +	if (check_results(&sobj, ODEBUG_STATE_ACTIVE, fixups, warnings))
> 		goto out;
>
> That's _not_ equivalent.
>
> The original code checks exactly for the case where the object
> initialization does not happen before debug_object_activate() is invoked
> which is perfectly correct for statically initialized
> objects. Initializing the statically allocated object breaks that
> test. The changelog is utterly silent about that.
>

This debug_object_init(&sobj, &descr_type_test) can be droped. I'm
sorry to add this extra change.

> The proper fix for this is obvious from the analysis. The problem
> happens because the failed lookup drops the hash bucket lock which opens
> the window for the concurrent caller. So why dropping the hash bucket
> lock at all?
>
> Uncompiled and untested fix below.
>
> Thanks,
>
>         tglx
> ---
>  lib/debugobjects.c |  127 +++++++++++++++++++++++++++--------------------------
>  1 file changed, 67 insertions(+), 60 deletions(-)
>
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -216,10 +216,6 @@ static struct debug_obj *__alloc_object(
>  	return obj;
>  }
>  
> -/*
> - * Allocate a new object. If the pool is empty, switch off the debugger.
> - * Must be called with interrupts disabled.
> - */
>  static struct debug_obj *
>  alloc_object(void *addr, struct debug_bucket *b, const struct debug_obj_descr *descr)
>  {
> @@ -273,7 +269,7 @@ alloc_object(void *addr, struct debug_bu
>  	if (obj) {
>  		obj->object = addr;
>  		obj->descr  = descr;
> -		obj->state  = ODEBUG_STATE_NONE;
> +		obj->state  = ODEBUG_STATE_INIT;

This actually droped the ODEBUG_STATE_NONE state. If we active a
uninitialized object, there will be no error report.

This should be

if (descr->is_static_object && descr->is_static_object(addr))
	obj->state  = ODEBUG_STATE_INIT;
else
	obj->state  = ODEBUG_STATE_NONE;

But this can't resolve the initial state requirement from the
is_static_object() call.

I think we can report an error when calling debug_object_free() from a
static object. If don't do so, there is no way to determine it's a
static object. When its initialization state changes, the
is_static_object() call will return the wrong value.

Please see the fellowing test case:

	obj.static_init = 1;
	debug_object_activate(&obj, &descr_type_test);
	if (check_results(&obj, ODEBUG_STATE_ACTIVE, fixups, warnings))
		goto out;
	debug_object_init(&obj, &descr_type_test);
	if (check_results(&obj, ODEBUG_STATE_INIT, ++fixups, ++warnings))
		goto out;
	debug_object_free(&obj, &descr_type_test);
	if (check_results(&obj, ODEBUG_STATE_NONE, fixups, warnings))
		goto out;

    /*
     * for the static debugobject, it's initial value will be changed
     * once used.
     */
	obj.static_init = 2;
	debug_object_activate(&obj, &descr_type_test);
	if (check_results(&obj, ODEBUG_STATE_ACTIVE, fixups, warnings))
		goto out;

    /* This test will fail */

>  		obj->astate = 0;
>  		hlist_add_head(&obj->node, &b->list);
>  	}
> @@ -552,11 +548,49 @@ static void debug_object_is_on_stack(voi
>  	WARN_ON(1);
>  }
>  
> +static struct debug_obj *lookup_object_or_alloc(void *addr, struct debug_bucket *b,
> +						const struct debug_obj_descr *descr,
> +						bool onstack, bool alloc_ifstatic)
> +{
> +	struct debug_obj *obj = lookup_object(addr, b);
> +
> +	if (likely(obj))
> +		return obj;
> +
> +	/*
> +	 * debug_object_init() unconditionally allocates untracked
> +	 * objects. It does not matter whether it is a static object or
> +	 * not.
> +	 *
> +	 * debug_object_assert_init() and debug_object_activate() allow
> +	 * allocation only if the descriptor callback confirms that the
> +	 * object is static and considered initialized. For non-static
> +	 * objects the allocation needs to be done from the fixup callback.
> +	 */
> +	if (unlikely(alloc_ifstatic)) {
> +		if (!descr->is_static_object || !descr->is_static_object(addr))
> +			return ERR_PTR(-ENOENT);
> +	}
> +
> +	/*
> +	 * On success the returned object is in INIT state as that's the
> +	 * correct state for all callers.
> +	 */
> +	obj = alloc_object(addr, b, descr);
> +	if (likely(obj)) {
> +		debug_object_is_on_stack(addr, onstack);
> +		return obj;
> +	}
> +
> +	/* Out of memory. Do the cleanup outside of the locked region */
> +	debug_objects_enabled = 0;
> +	return NULL;
> +}
> +
>  static void
>  __debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack)
>  {
>  	enum debug_obj_state state;
> -	bool check_stack = false;
>  	struct debug_bucket *db;
>  	struct debug_obj *obj;
>  	unsigned long flags;
> @@ -572,16 +606,11 @@ static void
>  
>  	raw_spin_lock_irqsave(&db->lock, flags);
>  
> -	obj = lookup_object(addr, db);
> -	if (!obj) {
> -		obj = alloc_object(addr, db, descr);
> -		if (!obj) {
> -			debug_objects_enabled = 0;
> -			raw_spin_unlock_irqrestore(&db->lock, flags);
> -			debug_objects_oom();
> -			return;
> -		}
> -		check_stack = true;
> +	obj = lookup_object_or_alloc(addr, db, descr, onstack, false);
> +	if (unlikely(!obj)) {
> +		raw_spin_unlock_irqrestore(&db->lock, flags);
> +		debug_objects_oom();
> +		return;
>  	}
>  
>  	switch (obj->state) {
> @@ -607,8 +636,6 @@ static void
>  	}
>  
>  	raw_spin_unlock_irqrestore(&db->lock, flags);
> -	if (check_stack)
> -		debug_object_is_on_stack(addr, onstack);
>  }
>  
>  /**
> @@ -648,14 +675,12 @@ EXPORT_SYMBOL_GPL(debug_object_init_on_s
>   */
>  int debug_object_activate(void *addr, const struct debug_obj_descr *descr)
>  {
> +	struct debug_obj o = { .object = addr, .state = ODEBUG_STATE_NOTAVAILABLE, .descr = descr };
>  	enum debug_obj_state state;
>  	struct debug_bucket *db;
>  	struct debug_obj *obj;
>  	unsigned long flags;
>  	int ret;
> -	struct debug_obj o = { .object = addr,
> -			       .state = ODEBUG_STATE_NOTAVAILABLE,
> -			       .descr = descr };
>  
>  	if (!debug_objects_enabled)
>  		return 0;
> @@ -664,8 +689,8 @@ int debug_object_activate(void *addr, co
>  
>  	raw_spin_lock_irqsave(&db->lock, flags);
>  
> -	obj = lookup_object(addr, db);
> -	if (obj) {
> +	obj = lookup_object_or_alloc(addr, db, descr, false, true);
> +	if (likely(!IS_ERR_OR_NULL(obj))) {
>  		bool print_object = false;
>  
>  		switch (obj->state) {
> @@ -698,24 +723,16 @@ int debug_object_activate(void *addr, co
>  
>  	raw_spin_unlock_irqrestore(&db->lock, flags);
>  
> -	/*
> -	 * We are here when a static object is activated. We
> -	 * let the type specific code confirm whether this is
> -	 * true or not. if true, we just make sure that the
> -	 * static object is tracked in the object tracker. If
> -	 * not, this must be a bug, so we try to fix it up.
> -	 */
> -	if (descr->is_static_object && descr->is_static_object(addr)) {
> -		/* track this static object */
> -		debug_object_init(addr, descr);
> -		debug_object_activate(addr, descr);
> -	} else {
> -		debug_print_object(&o, "activate");
> -		ret = debug_object_fixup(descr->fixup_activate, addr,
> -					ODEBUG_STATE_NOTAVAILABLE);
> -		return ret ? 0 : -EINVAL;
> +	/* If NULL the allocaction has hit OOM */
> +	if (!obj) {
> +		debug_objects_oom();
> +		return 0;
>  	}
> -	return 0;
> +
> +	/* Object is neither static nor tracked. It's not initialized */
> +	debug_print_object(&o, "activate");
> +	ret = debug_object_fixup(descr->fixup_activate, addr, ODEBUG_STATE_NOTAVAILABLE);
> +	return ret ? 0 : -EINVAL;
>  }
>  EXPORT_SYMBOL_GPL(debug_object_activate);
>  
> @@ -869,6 +886,7 @@ EXPORT_SYMBOL_GPL(debug_object_free);
>   */
>  void debug_object_assert_init(void *addr, const struct debug_obj_descr *descr)
>  {
> +	struct debug_obj o = { .object = addr, .state = ODEBUG_STATE_NOTAVAILABLE, .descr = descr };
>  	struct debug_bucket *db;
>  	struct debug_obj *obj;
>  	unsigned long flags;
> @@ -879,31 +897,20 @@ void debug_object_assert_init(void *addr
>  	db = get_bucket((unsigned long) addr);
>  
>  	raw_spin_lock_irqsave(&db->lock, flags);
> +	obj = lookup_object_or_alloc(addr, db, descr, false, true);
> +	raw_spin_unlock_irqrestore(&db->lock, flags);
> +	if (likely(!IS_ERR_OR_NULL(obj)))
> +		return;
>  
> -	obj = lookup_object(addr, db);
> +	/* If NULL the allocaction has hit OOM */
>  	if (!obj) {
> -		struct debug_obj o = { .object = addr,
> -				       .state = ODEBUG_STATE_NOTAVAILABLE,
> -				       .descr = descr };
> -
> -		raw_spin_unlock_irqrestore(&db->lock, flags);
> -		/*
> -		 * Maybe the object is static, and we let the type specific
> -		 * code confirm. Track this static object if true, else invoke
> -		 * fixup.
> -		 */
> -		if (descr->is_static_object && descr->is_static_object(addr)) {
> -			/* Track this static object */
> -			debug_object_init(addr, descr);
> -		} else {
> -			debug_print_object(&o, "assert_init");
> -			debug_object_fixup(descr->fixup_assert_init, addr,
> -					   ODEBUG_STATE_NOTAVAILABLE);
> -		}
> +		debug_objects_oom();
>  		return;
>  	}
>  
> -	raw_spin_unlock_irqrestore(&db->lock, flags);
> +	/* Object is neither tracked nor static. It's not initialized. */
> +	debug_print_object(&o, "assert_init");
> +	debug_object_fixup(descr->fixup_assert_init, addr, ODEBUG_STATE_NOTAVAILABLE);
>  }
>  EXPORT_SYMBOL_GPL(debug_object_assert_init);
>  

I test this patch, with my above change, and it seems to work well, but
we still need to add extra flags to store its static state. And
debug_object_free() should report an error for the static object.

I think we should introduce lookup_object_or_alloc and is_static at the
same time.


-- 
BRs
Schspa Shi

  reply	other threads:[~2023-03-22 17:02 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-03 16:19 [PATCH 1/2] debugobject: fix concurrency issues with is_static_object Schspa Shi
2023-03-03 16:19 ` [PATCH 2/2] debugobject: add unit test for static debug object Schspa Shi
2023-03-23  3:16   ` Schspa Shi
2023-03-23  7:53     ` Thomas Gleixner
2023-03-23  8:44       ` Schspa Shi
2023-04-13 22:39         ` Thomas Gleixner
2023-03-03 17:00 ` [PATCH 1/2] debugobject: fix concurrency issues with is_static_object Waiman Long
2023-03-03 17:53   ` Schspa Shi
2023-03-04  0:14     ` Thomas Gleixner
2023-03-04 11:48       ` Hillf Danton
2023-03-03 23:47 ` Thomas Gleixner
2023-03-22 15:40   ` Schspa Shi [this message]
2023-03-22 17:46     ` Thomas Gleixner
2023-03-22 17:55       ` Schspa Shi
2023-03-22 21:17         ` Thomas Gleixner
2023-03-23  3:10           ` Schspa Shi
2023-03-22 18:05       ` Thomas Gleixner
2023-03-22 18:18         ` Schspa Shi
2023-04-12  7:54         ` [PATCH] debugobject: Prevent init race with static objects Thomas Gleixner
2023-04-13  0:17           ` Stephen Boyd
2023-04-13 12:14             ` Thomas Gleixner
2023-05-01 13:40           ` Ido Schimmel
2023-05-01 15:42             ` Thomas Gleixner
2023-05-02  5:53               ` Ido Schimmel
2023-05-02  8:12               ` [tip: core/debugobjects] debugobject: Ensure pool refill (again) tip-bot2 for Thomas Gleixner
2023-04-15 21:20 ` [tip: core/debugobjects] debugobject: Prevent init race with static objects tip-bot2 for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m2ttycd8kx.fsf@gmail.com \
    --to=schspa@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=longman@redhat.com \
    --cc=swboyd@chromium.org \
    --cc=syzbot+5093ba19745994288b53@syzkaller.appspotmail.com \
    --cc=tglx@linutronix.de \
    --cc=wuchi.zero@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.