* [Xenomai] Race condition between threadobj_unlock and threadobj_free in t_resume
From: Matthias Schneider @ 2014-04-06 11:24 UTC
To: xenomai@xenomai.org
The following minimal program
http://pastebin.com/JdnnXwsF
seems to lead to a race condition between threadobj_unlock
and threadobj_free as indicated by valgrind (xenomai-forge
and mercury):
==9573== Invalid read of size 4
==9573==    at 0x403F214: threadobj_unlock (threadobj.h:407)
==9573==    by 0x403FAD2: put_psos_task (task.c:151)
==9573==    by 0x4040E9F: t_resume (task.c:419)
==9573==    by 0x80486F9: main (task-2.c:20)
==9573==  Address 0x42a20c4 is 172 bytes inside a block of size 664 free'd
==9573==    at 0x4029C88: free (vg_replace_malloc.c:446)
==9573==    by 0x40641B6: pvfree (heapobj.h:156)
==9573==    by 0x406425A: xnfree (heapobj.h:421)
==9573==    by 0x406460C: threadobj_free (threadobj.h:285)
==9573==    by 0x4068C6B: finalize_thread (threadobj.c:1239)
==9573==    by 0x40B5AE5: __nptl_deallocate_tsd (pthread_create.c:157)
==9573==    by 0x40B5CFE: start_thread (pthread_create.c:318)
==9573==    by 0x41C2C3D: clone (clone.S:131)
==9573==
I am not sure how to fix this. Any ideas? I admit that the
use case is neither typical nor very useful on its own,
but it is used in my test environment.
It seems that resuming a thread that is already in its
finalizer may result in that thread's heap block being freed
just before the resume function gets to unlock it.
Thanks in advance for your help,
regards,
Matthias
* Re: [Xenomai] Race condition between threadobj_unlock and threadobj_free in t_resume
From: Philippe Gerum @ 2014-04-07 9:49 UTC
To: Matthias Schneider, xenomai@xenomai.org
On 04/06/2014 01:24 PM, Matthias Schneider wrote:
> It seems that resuming a thread that is already in its
> finalizer may result in that thread's heap block being freed
> just before the resume function gets to unlock it.
>
The finalizer should serialize on this lock but does not, which is a
bug. I need to check whether we can make this a general rule for
all emulators/APIs. OK, I'll have a look at the pending issues ASAP.
Thanks for the heads-up.
--
Philippe.
* Re: [Xenomai] Race condition between threadobj_unlock and threadobj_free in t_resume
From: Philippe Gerum @ 2014-04-11 14:01 UTC
To: Matthias Schneider, xenomai@xenomai.org
On 04/06/2014 01:24 PM, Matthias Schneider wrote:
> It seems that resuming a thread that is already in its
> finalizer may result in that thread's heap block being freed
> just before the resume function gets to unlock it.
I could not reproduce this issue with your test code (it is likely too
timing-dependent), but after looking at the backtrace above more
closely, this patch should help:
diff --git a/include/boilerplate/lock.h b/include/boilerplate/lock.h
index dce1ff0..4819b34 100644
--- a/include/boilerplate/lock.h
+++ b/include/boilerplate/lock.h
@@ -177,9 +177,9 @@ int __check_cancel_type(const char *locktype);
 
 #define __do_unlock_safe(__lock, __state)                       \
         ({                                                      \
-                int __ret;                                      \
+                int __ret, __restored_state = __state;          \
                 __ret = -__RT(pthread_mutex_unlock(__lock));    \
-                pthread_setcancelstate(__state, NULL);          \
+                pthread_setcancelstate(__restored_state, NULL); \
                 __ret;                                          \
         })
 
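
For illustration only, and not part of the patch: a minimal sketch of the
pattern the change seems to address, assuming the __state argument expands
to a field stored inside the object that the lock protects (the thread does
not state this explicitly). All names below (demo_obj, demo_unlock,
UNLOCK_LATE_READ, UNLOCK_SNAPSHOT) are hypothetical, and the poisoning done
in demo_unlock merely stands in for another thread freeing the object once
the lock has been dropped.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a thread object; not the Xenomai layout. */
struct demo_obj {
    pthread_mutex_t lock;
    int cancel_state;   /* lives inside the object the lock protects */
};

static int restored_state;  /* records what the "restore" step read */

/*
 * Unlock, then poison the field. The poisoning models the window in
 * which another thread may free or reuse the object after the unlock.
 */
static int demo_unlock(struct demo_obj *o)
{
    int ret = pthread_mutex_unlock(&o->lock);
    o->cancel_state = -1;
    return ret;
}

/* Pre-patch shape: `state` is expanded, and read, after the unlock. */
#define UNLOCK_LATE_READ(o, state)      \
    do {                                \
        demo_unlock(o);                 \
        restored_state = (state);       \
    } while (0)

/* Post-patch shape: `state` is copied to a local before the unlock. */
#define UNLOCK_SNAPSHOT(o, state)       \
    do {                                \
        int saved_state = (state);      \
        demo_unlock(o);                 \
        restored_state = saved_state;   \
    } while (0)

static int run_late_read(void)
{
    struct demo_obj obj;
    int ret;

    pthread_mutex_init(&obj.lock, NULL);
    obj.cancel_state = PTHREAD_CANCEL_ENABLE;
    ret = pthread_mutex_lock(&obj.lock);
    assert(ret == 0);
    (void)ret;
    UNLOCK_LATE_READ(&obj, obj.cancel_state);
    pthread_mutex_destroy(&obj.lock);
    return restored_state;
}

static int run_snapshot(void)
{
    struct demo_obj obj;
    int ret;

    pthread_mutex_init(&obj.lock, NULL);
    obj.cancel_state = PTHREAD_CANCEL_ENABLE;
    ret = pthread_mutex_lock(&obj.lock);
    assert(ret == 0);
    (void)ret;
    UNLOCK_SNAPSHOT(&obj, obj.cancel_state);
    pthread_mutex_destroy(&obj.lock);
    return restored_state;
}
```

Under these assumptions, run_late_read() picks up the poisoned value (-1),
whereas run_snapshot() still sees PTHREAD_CANCEL_ENABLE. Copying the state
into a local before unlocking avoids touching memory that may no longer be
valid afterwards.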
--
Philippe.
* Re: [Xenomai] Race condition between threadobj_unlock and threadobj_free in t_resume
From: Matthias Schneider @ 2014-04-12 13:52 UTC
To: Philippe Gerum, xenomai@xenomai.org
> I could not reproduce this issue with your test code (it's likely too
> timing-dependent), but looking at the backtrace above a bit closer, this
> patch should help:
With the patch I was no longer able to detect the issue. Without it,
the issue occurred pretty reliably after 5-10 minutes; with the patch
I had a successful run overnight.
Thanks a lot for the quick fix,
Matthias