* Re: + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree
[not found] <200606261906.k5QJ6vCp025201@shell0.pdx.osdl.net>
@ 2006-06-27 15:15 ` Arjan van de Ven
2006-06-27 15:26 ` Arjan van de Ven
2006-06-27 15:53 ` Shailabh Nagar
0 siblings, 2 replies; 6+ messages in thread
From: Arjan van de Ven @ 2006-06-27 15:15 UTC (permalink / raw)
To: michal.k.k.piotrowski, linux-kernel; +Cc: mingo, akpm, nagar, balbir, jlan
also in response to
Subject:
Re: 2.6.17-mm3
Date:
Tue, 27 Jun 2006 16:12:42 +0200
with Message-ID:
<6bffcb0e0606270712w166f04a6u237d695e2bfa1913@mail.gmail.com>
On Mon, 2006-06-26 at 12:06 -0700, akpm@osdl.org wrote:
> +static inline void taskstats_tgid_alloc(struct signal_struct *sig)
> +{
> + struct taskstats *stats;
> +
> + stats = kmem_cache_zalloc(taskstats_cache, SLAB_KERNEL);
> + if (!stats)
> + return;
> +
> + spin_lock(&sig->stats_lock);
> + if (!sig->stats) {
+static inline void taskstats_tgid_free(struct signal_struct *sig)
+{
+ struct taskstats *stats = NULL;
+ spin_lock(&sig->stats_lock);
+ if (sig->stats) {
+ stats = sig->stats;
+ sig->stats = NULL;
+ }
+ spin_unlock(&sig->stats_lock);
+ if (stats)
+ kmem_cache_free(taskstats_cache, stats);
+}
this is buggy and deadlock prone!
(found by lockdep)
stats_lock nests within tasklist_lock, which is taken in irq context.
(and thus needs irq_save treatment). But because of this nesting, it is
not allowed to take stats_lock without disabling irqs, or you can have
the following scenario
CPU 0 CPU 1
(in irq) (in the code above)
stats_lock is taken
tasklist_lock is taken
stats_lock_is taken <spin>
<interrupt happens>
tasklist_lock is taken
which now forms an AB-BA deadlock!
this happens at least in copy_process which can call taskstats_tgid_free
without first disabling interrupts (via cleanup_signal). There may be
many other cases, I've not checked deeper yet.
Solution should be to make these functions use irqsave variant... any
comments from the authors of this patch ?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree
2006-06-27 15:15 ` + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree Arjan van de Ven
@ 2006-06-27 15:26 ` Arjan van de Ven
2006-06-27 15:53 ` Shailabh Nagar
1 sibling, 0 replies; 6+ messages in thread
From: Arjan van de Ven @ 2006-06-27 15:26 UTC (permalink / raw)
To: michal.k.k.piotrowski; +Cc: linux-kernel, mingo, akpm, nagar, balbir, jlan
> CPU 0 CPU 1
> (in irq) (in the code above)
actually CPU 0 doesn't even have to be in irq context; any context will
do
> stats_lock is taken
> tasklist_lock is taken
> stats_lock_is taken <spin>
> <interrupt happens>
> tasklist_lock is taken
>
> which now forms an AB-BA deadlock!
>
>
> this happens at least in copy_process which can call taskstats_tgid_free
> without first disabling interrupts (via cleanup_signal). There may be
> many other cases, I've not checked deeper yet.
>
> Solution should be to make these functions use irqsave variant... any
> comments from the authors of this patch ?
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree
2006-06-27 15:15 ` + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree Arjan van de Ven
2006-06-27 15:26 ` Arjan van de Ven
@ 2006-06-27 15:53 ` Shailabh Nagar
2006-06-27 16:06 ` Shailabh Nagar
1 sibling, 1 reply; 6+ messages in thread
From: Shailabh Nagar @ 2006-06-27 15:53 UTC (permalink / raw)
To: Arjan van de Ven
Cc: michal.k.k.piotrowski, linux-kernel, mingo, akpm, balbir, jlan
Arjan van de Ven wrote:
>also in response to
> Subject:
>Re: 2.6.17-mm3
> Date:
>Tue, 27 Jun 2006 16:12:42 +0200
>with Message-ID:
><6bffcb0e0606270712w166f04a6u237d695e2bfa1913@mail.gmail.com>
>
>On Mon, 2006-06-26 at 12:06 -0700, akpm@osdl.org wrote:
>
>
>
>>+static inline void taskstats_tgid_alloc(struct signal_struct *sig)
>>+{
>>+ struct taskstats *stats;
>>+
>>+ stats = kmem_cache_zalloc(taskstats_cache, SLAB_KERNEL);
>>+ if (!stats)
>>+ return;
>>+
>>+ spin_lock(&sig->stats_lock);
>>+ if (!sig->stats) {
>>
>>
>
>+static inline void taskstats_tgid_free(struct signal_struct *sig)
>+{
>+ struct taskstats *stats = NULL;
>+ spin_lock(&sig->stats_lock);
>+ if (sig->stats) {
>+ stats = sig->stats;
>+ sig->stats = NULL;
>+ }
>+ spin_unlock(&sig->stats_lock);
>+ if (stats)
>+ kmem_cache_free(taskstats_cache, stats);
>+}
>
>this is buggy and deadlock prone!
>(found by lockdep)
>
>stats_lock nests within tasklist_lock, which is taken in irq context.
>(and thus needs irq_save treatment). But because of this nesting, it is
>not allowed to take stats_lock without disabling irqs, or you can have
>the following scenario
>
>CPU 0 CPU 1
>(in irq) (in the code above)
> stats_lock is taken
>tasklist_lock is taken
>stats_lock_is taken <spin>
> <interrupt happens>
> tasklist_lock is taken
>
>which now forms an AB-BA deadlock!
>
>
>this happens at least in copy_process which can call taskstats_tgid_free
>without first disabling interrupts (via cleanup_signal).
>
Arjan,
Didn't get how stats_lock is nesting within tasklist_lock in copy_process ?
The call to cleanup_signal (or any call to taskstats_tgid_alloc/free)
seems to be happening
outside of holding the tasklist lock ? Or maybe I'm missing something.
Changing to an irqsave variant isn't a problem of course...
--Shailabh
>There may be
>many other cases, I've not checked deeper yet.
>
>Solution should be to make these functions use irqsave variant... any
>comments from the authors of this patch ?
>
>
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree
2006-06-27 15:53 ` Shailabh Nagar
@ 2006-06-27 16:06 ` Shailabh Nagar
2006-06-27 16:55 ` [PATCH] delay accounting taskstats interface: send tgid once locking fix Shailabh Nagar
0 siblings, 1 reply; 6+ messages in thread
From: Shailabh Nagar @ 2006-06-27 16:06 UTC (permalink / raw)
To: Shailabh Nagar
Cc: Arjan van de Ven, michal.k.k.piotrowski, linux-kernel, mingo,
akpm, balbir, jlan
Shailabh Nagar wrote:
> Arjan van de Ven wrote:
>
>> also in response to
>> Subject: Re: 2.6.17-mm3
>> Date: Tue, 27 Jun 2006 16:12:42 +0200
>> with Message-ID:
>> <6bffcb0e0606270712w166f04a6u237d695e2bfa1913@mail.gmail.com>
>>
>> On Mon, 2006-06-26 at 12:06 -0700, akpm@osdl.org wrote:
>>
>>
>>
>>> +static inline void taskstats_tgid_alloc(struct signal_struct *sig)
>>> +{
>>> + struct taskstats *stats;
>>> +
>>> + stats = kmem_cache_zalloc(taskstats_cache, SLAB_KERNEL);
>>> + if (!stats)
>>> + return;
>>> +
>>> + spin_lock(&sig->stats_lock);
>>> + if (!sig->stats) {
>>>
>>
>>
>> +static inline void taskstats_tgid_free(struct signal_struct *sig)
>> +{
>> + struct taskstats *stats = NULL;
>> + spin_lock(&sig->stats_lock);
>> + if (sig->stats) {
>> + stats = sig->stats;
>> + sig->stats = NULL;
>> + }
>> + spin_unlock(&sig->stats_lock);
>> + if (stats)
>> + kmem_cache_free(taskstats_cache, stats);
>> +}
>>
>> this is buggy and deadlock prone!
>> (found by lockdep)
>>
>> stats_lock nests within tasklist_lock, which is taken in irq context.
>> (and thus needs irq_save treatment). But because of this nesting, it is
>> not allowed to take stats_lock without disabling irqs, or you can have
>> the following scenario
>>
>> CPU 0 CPU 1
>> (in irq) (in the code above)
>> stats_lock is taken
>> tasklist_lock is taken
>> stats_lock_is taken <spin>
>> <interrupt happens>
>> tasklist_lock is taken
>> which now forms an AB-BA deadlock!
>>
>>
>> this happens at least in copy_process which can call taskstats_tgid_free
>> without first disabling interrupts (via cleanup_signal).
>
>
> Arjan,
>
> Didn't get how stats_lock is nesting within tasklist_lock in
> copy_process ?
> The call to cleanup_signal (or any call to taskstats_tgid_alloc/free)
> seems to be happening
> outside of holding the tasklist lock ? Or maybe I'm missing something.
Yup...indeed I was.
In release_task(), __cleanup_signal() is being called within tasklist_lock
protection and that leads to stats_lock being held nested within
tasklist_lock.
>
> Changing to an irqsave variant isn't a problem of course...
>
> --Shailabh
>
>> There may be
>> many other cases, I've not checked deeper yet.
>>
>> Solution should be to make these functions use irqsave variant... any
>> comments from the authors of this patch ?
>
Will submit a patch to switch to irqsave variants.
Thanks,
Shailabh
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] delay accounting taskstats interface: send tgid once locking fix
2006-06-27 16:06 ` Shailabh Nagar
@ 2006-06-27 16:55 ` Shailabh Nagar
2006-06-27 18:41 ` Michal Piotrowski
0 siblings, 1 reply; 6+ messages in thread
From: Shailabh Nagar @ 2006-06-27 16:55 UTC (permalink / raw)
To: akpm
Cc: Arjan van de Ven, michal.k.k.piotrowski, linux-kernel, mingo,
balbir, jlan
Use irqsave variants of spin_lock for newly introduced tsk->signal->stats_lock
The lock is nested within tasklist_lock as part of release_task() and hence
there is possibility of a AB-BA deadlock if a timer interrupt occurs while
stats_lock is held.
Thanks to Arjan van de Ven for pointing out the lock bug.
The patch conservatively converts all use of stats_lock to irqsave variant
rather than only the call within taskstats_tgid_free which is the function
called within tasklist_lock protection.
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
---
include/linux/taskstats_kern.h | 11 +++++++----
kernel/taskstats.c | 16 ++++++++++------
2 files changed, 17 insertions(+), 10 deletions(-)
Index: linux-2.6.17/kernel/taskstats.c
===================================================================
--- linux-2.6.17.orig/kernel/taskstats.c 2006-06-27 12:22:45.000000000 -0400
+++ linux-2.6.17/kernel/taskstats.c 2006-06-27 12:34:16.000000000 -0400
@@ -133,6 +133,7 @@ static int fill_tgid(pid_t tgid, struct
struct taskstats *stats)
{
struct task_struct *tsk, *first;
+ unsigned long flags;
/*
* Add additional stats from live tasks except zombie thread group
@@ -152,10 +153,10 @@ static int fill_tgid(pid_t tgid, struct
get_task_struct(first);
/* Start with stats from dead tasks */
- spin_lock(&first->signal->stats_lock);
+ spin_lock_irqsave(&first->signal->stats_lock, flags);
if (first->signal->stats)
memcpy(stats, first->signal->stats, sizeof(*stats));
- spin_unlock(&first->signal->stats_lock);
+ spin_unlock_irqrestore(&first->signal->stats_lock, flags);
tsk = first;
read_lock(&tasklist_lock);
@@ -185,7 +186,9 @@ static int fill_tgid(pid_t tgid, struct
static void fill_tgid_exit(struct task_struct *tsk)
{
- spin_lock(&tsk->signal->stats_lock);
+ unsigned long flags;
+
+ spin_lock_irqsave(&tsk->signal->stats_lock, flags);
if (!tsk->signal->stats)
goto ret;
@@ -197,7 +200,7 @@ static void fill_tgid_exit(struct task_s
*/
delayacct_add_tsk(tsk->signal->stats, tsk);
ret:
- spin_unlock(&tsk->signal->stats_lock);
+ spin_unlock_irqrestore(&tsk->signal->stats_lock, flags);
return;
}
@@ -268,13 +271,14 @@ void taskstats_exit_send(struct task_str
size_t size;
int is_thread_group;
struct nlattr *na;
+ unsigned long flags;
if (!family_registered || !tidstats)
return;
- spin_lock(&tsk->signal->stats_lock);
+ spin_lock_irqsave(&tsk->signal->stats_lock, flags);
is_thread_group = tsk->signal->stats ? 1 : 0;
- spin_unlock(&tsk->signal->stats_lock);
+ spin_unlock_irqrestore(&tsk->signal->stats_lock, flags);
rc = 0;
/*
Index: linux-2.6.17/include/linux/taskstats_kern.h
===================================================================
--- linux-2.6.17.orig/include/linux/taskstats_kern.h 2006-06-27 12:22:34.000000000 -0400
+++ linux-2.6.17/include/linux/taskstats_kern.h 2006-06-27 12:36:12.000000000 -0400
@@ -50,17 +50,18 @@ static inline void taskstats_tgid_init(s
static inline void taskstats_tgid_alloc(struct signal_struct *sig)
{
struct taskstats *stats;
+ unsigned long flags;
stats = kmem_cache_zalloc(taskstats_cache, SLAB_KERNEL);
if (!stats)
return;
- spin_lock(&sig->stats_lock);
+ spin_lock_irqsave(&sig->stats_lock, flags);
if (!sig->stats) {
sig->stats = stats;
stats = NULL;
}
- spin_unlock(&sig->stats_lock);
+ spin_unlock_irqrestore(&sig->stats_lock, flags);
if (stats)
kmem_cache_free(taskstats_cache, stats);
@@ -69,12 +70,14 @@ static inline void taskstats_tgid_alloc(
static inline void taskstats_tgid_free(struct signal_struct *sig)
{
struct taskstats *stats = NULL;
- spin_lock(&sig->stats_lock);
+ unsigned long flags;
+
+ spin_lock_irqsave(&sig->stats_lock, flags);
if (sig->stats) {
stats = sig->stats;
sig->stats = NULL;
}
- spin_unlock(&sig->stats_lock);
+ spin_unlock_irqrestore(&sig->stats_lock, flags);
if (stats)
kmem_cache_free(taskstats_cache, stats);
}
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] delay accounting taskstats interface: send tgid once locking fix
2006-06-27 16:55 ` [PATCH] delay accounting taskstats interface: send tgid once locking fix Shailabh Nagar
@ 2006-06-27 18:41 ` Michal Piotrowski
0 siblings, 0 replies; 6+ messages in thread
From: Michal Piotrowski @ 2006-06-27 18:41 UTC (permalink / raw)
To: Shailabh Nagar; +Cc: akpm, Arjan van de Ven, linux-kernel, mingo, balbir, jlan
Hi,
On 27/06/06, Shailabh Nagar <nagar@watson.ibm.com> wrote:
> Use irqsave variants of spin_lock for newly introduced tsk->signal->stats_lock
> The lock is nested within tasklist_lock as part of release_task() and hence
> there is possibility of a AB-BA deadlock if a timer interrupt occurs while
> stats_lock is held.
>
> Thanks to Arjan van de Ven for pointing out the lock bug.
> The patch conservatively converts all use of stats_lock to irqsave variant
> rather than only the call within taskstats_tgid_free which is the function
> called within tasklist_lock protection.
>
> Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
>
Problem solved, thanks.
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2006-06-27 18:42 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200606261906.k5QJ6vCp025201@shell0.pdx.osdl.net>
2006-06-27 15:15 ` + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree Arjan van de Ven
2006-06-27 15:26 ` Arjan van de Ven
2006-06-27 15:53 ` Shailabh Nagar
2006-06-27 16:06 ` Shailabh Nagar
2006-06-27 16:55 ` [PATCH] delay accounting taskstats interface: send tgid once locking fix Shailabh Nagar
2006-06-27 18:41 ` Michal Piotrowski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox