* [PATCH] sigqueue_free: fix the race with collect_signal()
@ 2007-08-23 13:45 Oleg Nesterov
2007-08-23 21:36 ` Sukadev Bhattiprolu
2007-08-24 14:26 ` taoyue
0 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-23 13:45 UTC (permalink / raw)
To: Andrew Morton
Cc: Alexey Dobriyan, Ingo Molnar, Jeremy Katz, taoyue,
Thomas Gleixner, Roland McGrath, linux-kernel, stable
Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.
collect_signal: sigqueue_free:
list_del_init(&first->list);
if (!list_empty(&q->list)) {
// not taken
}
q->flags &= ~SIGQUEUE_PREALLOC;
__sigqueue_free(first); __sigqueue_free(q);
Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
obviously bad implications.
In particular, this double free breaks the array_cache->avail logic, so the
same sigqueue could be "allocated" twice, and the bug can manifest itself via
the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.
Hopefully this can explain these mysterious bug-reports, see
http://marc.info/?t=118766926500003
http://marc.info/?t=118466273000005
Alexey Dobriyan reports this patch makes the difference for the testcase, but
nobody has an access to the application which opened the problems originally.
Also, this patch removes tasklist lock/unlock, ->siglock is enough.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
--- t/kernel/signal.c~SQFREE 2007-08-22 20:06:31.000000000 +0400
+++ t/kernel/signal.c 2007-08-23 16:02:57.000000000 +0400
@@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void)
void sigqueue_free(struct sigqueue *q)
{
unsigned long flags;
+ spinlock_t *lock = ¤t->sighand->siglock;
+
BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
/*
* If the signal is still pending remove it from the
- * pending queue.
+ * pending queue. We must hold ->siglock while testing
+ * q->list to serialize with collect_signal().
*/
- if (unlikely(!list_empty(&q->list))) {
- spinlock_t *lock = ¤t->sighand->siglock;
- read_lock(&tasklist_lock);
- spin_lock_irqsave(lock, flags);
- if (!list_empty(&q->list))
- list_del_init(&q->list);
- spin_unlock_irqrestore(lock, flags);
- read_unlock(&tasklist_lock);
- }
+ spin_lock_irqsave(lock, flags);
+ if (!list_empty(&q->list))
+ list_del_init(&q->list);
+ spin_unlock_irqrestore(lock, flags);
+
q->flags &= ~SIGQUEUE_PREALLOC;
__sigqueue_free(q);
}
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-23 13:45 [PATCH] sigqueue_free: fix the race with collect_signal() Oleg Nesterov @ 2007-08-23 21:36 ` Sukadev Bhattiprolu 2007-08-23 22:05 ` Oleg Nesterov 2007-08-24 14:26 ` taoyue 1 sibling, 1 reply; 13+ messages in thread From: Sukadev Bhattiprolu @ 2007-08-23 21:36 UTC (permalink / raw) To: Oleg Nesterov Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Jeremy Katz, taoyue, Thomas Gleixner, Roland McGrath, linux-kernel, stable Oleg Nesterov wrote: > Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>. > > collect_signal: sigqueue_free: > > list_del_init(&first->list); > if (!list_empty(&q->list)) { > // not taken > } > q->flags &= ~SIGQUEUE_PREALLOC; > > __sigqueue_free(first); __sigqueue_free(q); > > Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the > obviously bad implications. > > In particular, this double free breaks the array_cache->avail logic, so the > same sigqueue could be "allocated" twice, and the bug can manifest itself via > the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue. > > Hopefully this can explain these mysterious bug-reports, see > > http://marc.info/?t=118766926500003 > http://marc.info/?t=118466273000005 > > Alexey Dobriyan reports this patch makes the difference for the testcase, but > nobody has an access to the application which opened the problems originally. > > Also, this patch removes tasklist lock/unlock, ->siglock is enough. > > Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> > > --- t/kernel/signal.c~SQFREE 2007-08-22 20:06:31.000000000 +0400 > +++ t/kernel/signal.c 2007-08-23 16:02:57.000000000 +0400 > @@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void) > void sigqueue_free(struct sigqueue *q) > { > unsigned long flags; > + spinlock_t *lock = ¤t->sighand->siglock; > + > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > /* > * If the signal is still pending remove it from the > - * pending queue. > + * pending queue. We must hold ->siglock while testing > + * q->list to serialize with collect_signal(). > */ > - if (unlikely(!list_empty(&q->list))) { > - spinlock_t *lock = ¤t->sighand->siglock; > - read_lock(&tasklist_lock); > - spin_lock_irqsave(lock, flags); > Hmm, but the existing code _does_ take the siglock here. Is that not sufficient ? Isn't the first list_empty() check without lock only an optimization for the common case ? > - if (!list_empty(&q->list)) > - list_del_init(&q->list); > - spin_unlock_irqrestore(lock, flags); > - read_unlock(&tasklist_lock); > - } > + spin_lock_irqsave(lock, flags); > + if (!list_empty(&q->list)) > + list_del_init(&q->list); > + spin_unlock_irqrestore(lock, flags); > + > q->flags &= ~SIGQUEUE_PREALLOC; > __sigqueue_free(q); > } > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-23 21:36 ` Sukadev Bhattiprolu @ 2007-08-23 22:05 ` Oleg Nesterov 0 siblings, 0 replies; 13+ messages in thread From: Oleg Nesterov @ 2007-08-23 22:05 UTC (permalink / raw) To: Sukadev Bhattiprolu Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Jeremy Katz, taoyue, Thomas Gleixner, Roland McGrath, linux-kernel, stable On 08/23, Sukadev Bhattiprolu wrote: > > Oleg Nesterov wrote: > >Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz > ><jeremy.katz@windriver.com>. > > > >collect_signal: sigqueue_free: > > > > list_del_init(&first->list); > > if (!list_empty(&q->list)) { > > // not taken > > } > > q->flags &= > > ~SIGQUEUE_PREALLOC; > > > > __sigqueue_free(first); __sigqueue_free(q); > > > >Now, __sigqueue_free() is called twice on the same "struct sigqueue" with > >the > >obviously bad implications. > > > >--- t/kernel/signal.c~SQFREE 2007-08-22 20:06:31.000000000 +0400 > >+++ t/kernel/signal.c 2007-08-23 16:02:57.000000000 +0400 > >@@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void) > > void sigqueue_free(struct sigqueue *q) > > { > > unsigned long flags; > >+ spinlock_t *lock = ¤t->sighand->siglock; > >+ > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > /* > > * If the signal is still pending remove it from the > >- * pending queue. > >+ * pending queue. We must hold ->siglock while testing > >+ * q->list to serialize with collect_signal(). > > */ > >- if (unlikely(!list_empty(&q->list))) { > >- spinlock_t *lock = ¤t->sighand->siglock; > >- read_lock(&tasklist_lock); > >- spin_lock_irqsave(lock, flags); > > > Hmm, but the existing code _does_ take the siglock here. Is that not > sufficient ? Yes, it does, and this is sufficient, so the patch removes tasklist_lock. > Isn't the first list_empty() check without lock only an optimization for > the common > case ? Yes, this is optimization, but I strongly believe it is wrong. Please look at the race description above. !list_empty(&q->list) does _not_ necessary mean that q is not used and we can free it. It is possible that collect_signal() just removed this sigqueue from list (list_empty(&q->list) becomes true) and going to free it. The whole point is: we can't check list_empty() without ->siglock, this is racy, and leads to double-free. Oleg. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-23 13:45 [PATCH] sigqueue_free: fix the race with collect_signal() Oleg Nesterov 2007-08-23 21:36 ` Sukadev Bhattiprolu @ 2007-08-24 14:26 ` taoyue 2007-08-24 7:45 ` Oleg Nesterov 1 sibling, 1 reply; 13+ messages in thread From: taoyue @ 2007-08-24 14:26 UTC (permalink / raw) To: Oleg Nesterov Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Jeremy Katz, Thomas Gleixner, Roland McGrath, linux-kernel, stable Oleg Nesterov wrote: > Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>. > > collect_signal: sigqueue_free: > > list_del_init(&first->list); > if (!list_empty(&q->list)) { > // not taken > } > q->flags &= ~SIGQUEUE_PREALLOC; > > __sigqueue_free(first); __sigqueue_free(q); > > Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the > obviously bad implications. > > In particular, this double free breaks the array_cache->avail logic, so the > same sigqueue could be "allocated" twice, and the bug can manifest itself via > the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue. > > Hopefully this can explain these mysterious bug-reports, see > > http://marc.info/?t=118766926500003 > http://marc.info/?t=118466273000005 > > Alexey Dobriyan reports this patch makes the difference for the testcase, but > nobody has an access to the application which opened the problems originally. > > Also, this patch removes tasklist lock/unlock, ->siglock is enough. > > Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> > > --- t/kernel/signal.c~SQFREE 2007-08-22 20:06:31.000000000 +0400 > +++ t/kernel/signal.c 2007-08-23 16:02:57.000000000 +0400 > @@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void) > void sigqueue_free(struct sigqueue *q) > { > unsigned long flags; > + spinlock_t *lock = ¤t->sighand->siglock; > + > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > /* > * If the signal is still pending remove it from the > - * pending queue. > + * pending queue. We must hold ->siglock while testing > + * q->list to serialize with collect_signal(). > */ > - if (unlikely(!list_empty(&q->list))) { > - spinlock_t *lock = ¤t->sighand->siglock; > - read_lock(&tasklist_lock); > - spin_lock_irqsave(lock, flags); > - if (!list_empty(&q->list)) > - list_del_init(&q->list); > - spin_unlock_irqrestore(lock, flags); > - read_unlock(&tasklist_lock); > - } > + spin_lock_irqsave(lock, flags); > + if (!list_empty(&q->list)) > + list_del_init(&q->list); > + spin_unlock_irqrestore(lock, flags); > + > q->flags &= ~SIGQUEUE_PREALLOC; > __sigqueue_free(q); > } > > > Applying previous patch,it seems likely that the __sigqueue_free() is also called twice. collect_signal: sigqueue_free: list_del_init(&first->list); spin_lock_irqsave(lock, flags); if (!list_empty(&q->list)) list_del_init(&q->list); spin_unlock_irqrestore(lock, flags); q->flags &= ~SIGQUEUE_PREALLOC; __sigqueue_free(first); __sigqueue_free(q); yue.tao ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-24 14:26 ` taoyue @ 2007-08-24 7:45 ` Oleg Nesterov 2007-08-24 21:29 ` taoyue 0 siblings, 1 reply; 13+ messages in thread From: Oleg Nesterov @ 2007-08-24 7:45 UTC (permalink / raw) To: taoyue Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable On 08/24, taoyue wrote: > > Oleg Nesterov wrote: > > > >--- t/kernel/signal.c~SQFREE 2007-08-22 20:06:31.000000000 +0400 > >+++ t/kernel/signal.c 2007-08-23 16:02:57.000000000 +0400 > >@@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void) > > void sigqueue_free(struct sigqueue *q) > > { > > unsigned long flags; > >+ spinlock_t *lock = ¤t->sighand->siglock; > >+ > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > /* > > * If the signal is still pending remove it from the > >- * pending queue. > >+ * pending queue. We must hold ->siglock while testing > >+ * q->list to serialize with collect_signal(). > > */ > >- if (unlikely(!list_empty(&q->list))) { > >- spinlock_t *lock = ¤t->sighand->siglock; > >- read_lock(&tasklist_lock); > >- spin_lock_irqsave(lock, flags); > >- if (!list_empty(&q->list)) > >- list_del_init(&q->list); > >- spin_unlock_irqrestore(lock, flags); > >- read_unlock(&tasklist_lock); > >- } > >+ spin_lock_irqsave(lock, flags); > >+ if (!list_empty(&q->list)) > >+ list_del_init(&q->list); > >+ spin_unlock_irqrestore(lock, flags); > >+ > > q->flags &= ~SIGQUEUE_PREALLOC; > > __sigqueue_free(q); > > } > > > > > > > Applying previous patch???it seems likely that the __sigqueue_free() is > also called twice. > > collect_signal: sigqueue_free: > > list_del_init(&first->list); > spin_lock_irqsave(lock, flags); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > if (!list_empty(&q->list)) > list_del_init(&q->list); > spin_unlock_irqrestore(lock, flags); > q->flags &= ~SIGQUEUE_PREALLOC; > > __sigqueue_free(first); __sigqueue_free(q); collect_signal() is always called under ->siglock which is also taken by sigqueue_free(), so this is not possible. Basically, this patch is the same one-liner I sent you before http://marc.info/?l=linux-kernel&m=118772206603453&w=2 (Thanks for the additional testing and report, btw). P.S. It would be nice to know if this patch solves the problems reported by Jeremy, but his email is disabled. Oleg. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-24 7:45 ` Oleg Nesterov @ 2007-08-24 21:29 ` taoyue 2007-08-24 11:08 ` Oleg Nesterov 0 siblings, 1 reply; 13+ messages in thread From: taoyue @ 2007-08-24 21:29 UTC (permalink / raw) To: Oleg Nesterov Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable Oleg Nesterov wrote: > On 08/24, taoyue wrote: > >> Oleg Nesterov wrote: >> >>> --- t/kernel/signal.c~SQFREE 2007-08-22 20:06:31.000000000 +0400 >>> +++ t/kernel/signal.c 2007-08-23 16:02:57.000000000 +0400 >>> @@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void) >>> void sigqueue_free(struct sigqueue *q) >>> { >>> unsigned long flags; >>> + spinlock_t *lock = ¤t->sighand->siglock; >>> + >>> BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); >>> /* >>> * If the signal is still pending remove it from the >>> - * pending queue. >>> + * pending queue. We must hold ->siglock while testing >>> + * q->list to serialize with collect_signal(). >>> */ >>> - if (unlikely(!list_empty(&q->list))) { >>> - spinlock_t *lock = ¤t->sighand->siglock; >>> - read_lock(&tasklist_lock); >>> - spin_lock_irqsave(lock, flags); >>> - if (!list_empty(&q->list)) >>> - list_del_init(&q->list); >>> - spin_unlock_irqrestore(lock, flags); >>> - read_unlock(&tasklist_lock); >>> - } >>> + spin_lock_irqsave(lock, flags); >>> + if (!list_empty(&q->list)) >>> + list_del_init(&q->list); >>> + spin_unlock_irqrestore(lock, flags); >>> + >>> q->flags &= ~SIGQUEUE_PREALLOC; >>> __sigqueue_free(q); >>> } >>> >>> >>> >>> >> Applying previous patch???it seems likely that the __sigqueue_free() is >> also called twice. >> >> collect_signal: sigqueue_free: >> >> list_del_init(&first->list); >> spin_lock_irqsave(lock, flags); >> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > >> if (!list_empty(&q->list)) >> list_del_init(&q->list); >> spin_unlock_irqrestore(lock, flags); >> q->flags &= ~SIGQUEUE_PREALLOC; >> >> __sigqueue_free(first); __sigqueue_free(q); >> > > collect_signal() is always called under ->siglock which is also taken by > sigqueue_free(), so this is not possible. > > Basically, this patch is the same one-liner I sent you before > > http://marc.info/?l=linux-kernel&m=118772206603453&w=2 > > (Thanks for the additional testing and report, btw). > > P.S. It would be nice to know if this patch solves the problems reported > by Jeremy, but his email is disabled. > > Oleg. > > I know, using current->sighand->siglock to prevent one sigqueue is free twice. I want to know whether it is possible that the two function is called in different thread. If that, the spin_lock is useless. yue.tao ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-24 21:29 ` taoyue @ 2007-08-24 11:08 ` Oleg Nesterov 2007-08-24 20:03 ` Sukadev Bhattiprolu 0 siblings, 1 reply; 13+ messages in thread From: Oleg Nesterov @ 2007-08-24 11:08 UTC (permalink / raw) To: taoyue Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable On 08/24, taoyue wrote: > > Oleg Nesterov wrote: > >> > >>collect_signal: sigqueue_free: > >> > >> list_del_init(&first->list); > >> spin_lock_irqsave(lock, flags); > >> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > >> if (!list_empty(&q->list)) > >> list_del_init(&q->list); > >> spin_unlock_irqrestore(lock, > >> flags); > >> q->flags &= ~SIGQUEUE_PREALLOC; > >> > >> __sigqueue_free(first); __sigqueue_free(q); > >> > > > >collect_signal() is always called under ->siglock which is also taken by > >sigqueue_free(), so this is not possible. > > > >Basically, this patch is the same one-liner I sent you before > > > > http://marc.info/?l=linux-kernel&m=118772206603453&w=2 > > > >(Thanks for the additional testing and report, btw). > > > >P.S. It would be nice to know if this patch solves the problems reported > >by Jeremy, but his email is disabled. > > > >Oleg. > > > > > I know, using current->sighand->siglock to prevent one sigqueue > is free twice. I want to know whether it is possible that the two > function is called in different thread. If that, the spin_lock is useless. Not sure I understand. Yes, it is possible they are called by 2 different threads, that is why we had a race. But all threads in the same thread group have the same ->sighand, and thus the same ->sighand->siglock. Oleg. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-24 11:08 ` Oleg Nesterov @ 2007-08-24 20:03 ` Sukadev Bhattiprolu 2007-08-24 20:23 ` Oleg Nesterov 0 siblings, 1 reply; 13+ messages in thread From: Sukadev Bhattiprolu @ 2007-08-24 20:03 UTC (permalink / raw) To: Oleg Nesterov Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable Oleg Nesterov wrote: > On 08/24, taoyue wrote: > >> Oleg Nesterov wrote: >> >>>> collect_signal: sigqueue_free: >>>> >>>> list_del_init(&first->list); >>>> spin_lock_irqsave(lock, flags); >>>> >>>> >>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>> >>> >>>> if (!list_empty(&q->list)) >>>> list_del_init(&q->list); >>>> spin_unlock_irqrestore(lock, >>>> flags); >>>> q->flags &= ~SIGQUEUE_PREALLOC; >>>> >>>> __sigqueue_free(first); __sigqueue_free(q); >>>> >>>> >>> collect_signal() is always called under ->siglock which is also taken by >>> sigqueue_free(), so this is not possible. >>> >>> Basically, this patch is the same one-liner I sent you before >>> >>> http://marc.info/?l=linux-kernel&m=118772206603453&w=2 >>> >>> (Thanks for the additional testing and report, btw). >>> >>> P.S. It would be nice to know if this patch solves the problems reported >>> by Jeremy, but his email is disabled. >>> >>> Oleg. >>> >>> >>> >> I know, using current->sighand->siglock to prevent one sigqueue >> is free twice. I want to know whether it is possible that the two >> function is called in different thread. If that, the spin_lock is useless. >> > > Not sure I understand. Yes, it is possible they are called by 2 different > threads, that is why we had a race. But all threads in the same thread > group have the same ->sighand, and thus the same ->sighand->siglock. > Oleg, if one thread can be in collect_signal() and another in sigqueue_free() and both operate on the exact same sigqueue object, its not clear how we prevent two calls to __sigqueue_free() to the same object. In that case the lock (or some lock) should be around __sigqueue_free() - no ? i.e if we enter sigqueue_free(), we will call __sigqueue_free() regardless of the state. > Oleg. > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-24 20:03 ` Sukadev Bhattiprolu @ 2007-08-24 20:23 ` Oleg Nesterov 2007-08-25 17:24 ` Sukadev Bhattiprolu 2007-08-27 13:45 ` taoyue 0 siblings, 2 replies; 13+ messages in thread From: Oleg Nesterov @ 2007-08-24 20:23 UTC (permalink / raw) To: Sukadev Bhattiprolu Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable On 08/24, Sukadev Bhattiprolu wrote: > > Oleg Nesterov wrote: > >On 08/24, taoyue wrote: > > > >>Oleg Nesterov wrote: > >> > >>>>collect_signal: sigqueue_free: > >>>> > >>>> list_del_init(&first->list); > >>>> spin_lock_irqsave(lock, flags); > >>>> > >>>> > >>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > >>> > >>> > >>>> if (!list_empty(&q->list)) > >>>> list_del_init(&q->list); > >>>> spin_unlock_irqrestore(lock, > >>>> flags); > >>>> q->flags &= ~SIGQUEUE_PREALLOC; > >>>> > >>>> __sigqueue_free(first); __sigqueue_free(q); > >>>> > >>>> > >>>collect_signal() is always called under ->siglock which is also taken by > >>>sigqueue_free(), so this is not possible. > >>> > >>> > >>> > >>I know, using current->sighand->siglock to prevent one sigqueue > >>is free twice. I want to know whether it is possible that the two > >>function is called in different thread. If that, the spin_lock is useless. > >> > > > >Not sure I understand. Yes, it is possible they are called by 2 different > >threads, that is why we had a race. But all threads in the same thread > >group have the same ->sighand, and thus the same ->sighand->siglock. > > > > Oleg, if one thread can be in collect_signal() and another in > sigqueue_free() and both operate on the exact same sigqueue object, its > not clear how we prevent two calls to __sigqueue_free() to > the same object. In that case the lock (or some lock) should be around > __sigqueue_free() - no ? > > i.e if we enter sigqueue_free(), we will call __sigqueue_free() > regardless of the state. Yes. They both will call __sigqueue_free(). But please note that __sigqueue_free() checks SIGQUEUE_PREALLOC, which is cleared by sigqueue_free(). IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC and free sigqueue. We don't need this lock around sigqueue_free() to prevent the race. collect_signal() can "see" only those sigqueues which are on list. IOW, when sigqueue_free() takes ->siglock, colect_signal() can't run, because it needs the same lock. Now we delete this sigqueue from list, nobody can see it, it can't have other references. So we can unlock ->siglock, mark sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it. Do you agree? Oleg. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-24 20:23 ` Oleg Nesterov @ 2007-08-25 17:24 ` Sukadev Bhattiprolu 2007-08-25 17:34 ` Oleg Nesterov 2007-08-27 13:45 ` taoyue 1 sibling, 1 reply; 13+ messages in thread From: Sukadev Bhattiprolu @ 2007-08-25 17:24 UTC (permalink / raw) To: Oleg Nesterov Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable Oleg Nesterov wrote: > On 08/24, Sukadev Bhattiprolu wrote: > >> Oleg Nesterov wrote: >> >>> On 08/24, taoyue wrote: >>> >>> >>>> Oleg Nesterov wrote: >>>> >>>> >>>>>> collect_signal: sigqueue_free: >>>>>> >>>>>> list_del_init(&first->list); >>>>>> spin_lock_irqsave(lock, flags); >>>>>> >>>>>> >>>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> >>>>> >>>>>> if (!list_empty(&q->list)) >>>>>> list_del_init(&q->list); >>>>>> spin_unlock_irqrestore(lock, >>>>>> flags); >>>>>> q->flags &= ~SIGQUEUE_PREALLOC; >>>>>> >>>>>> __sigqueue_free(first); __sigqueue_free(q); >>>>>> >>>>>> >>>>>> >>>>> collect_signal() is always called under ->siglock which is also taken by >>>>> sigqueue_free(), so this is not possible. >>>>> >>>>> >>>>> >>>>> >>>> I know, using current->sighand->siglock to prevent one sigqueue >>>> is free twice. I want to know whether it is possible that the two >>>> function is called in different thread. If that, the spin_lock is useless. >>>> >>>> >>> Not sure I understand. Yes, it is possible they are called by 2 different >>> threads, that is why we had a race. But all threads in the same thread >>> group have the same ->sighand, and thus the same ->sighand->siglock. >>> >>> >> Oleg, if one thread can be in collect_signal() and another in >> sigqueue_free() and both operate on the exact same sigqueue object, its >> not clear how we prevent two calls to __sigqueue_free() to >> the same object. In that case the lock (or some lock) should be around >> __sigqueue_free() - no ? >> >> i.e if we enter sigqueue_free(), we will call __sigqueue_free() >> regardless of the state. >> > > Yes. They both will call __sigqueue_free(). But please note that __sigqueue_free() > checks SIGQUEUE_PREALLOC, which is cleared by sigqueue_free(). > > IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used > by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC > and free sigqueue. We don't need this lock around sigqueue_free() to prevent > the race. collect_signal() can "see" only those sigqueues which are on list. > > IOW, when sigqueue_free() takes ->siglock, colect_signal() can't run, because > it needs the same lock. Now we delete this sigqueue from list, nobody can > see it, it can't have other references. So we can unlock ->siglock, mark > sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it. > > Do you agree? > Yes. I see it now. I had missed the SIGQUEUE_PREALLOC in __sigqueue_free(). Thanks for clarifying Suka ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-25 17:24 ` Sukadev Bhattiprolu @ 2007-08-25 17:34 ` Oleg Nesterov 0 siblings, 0 replies; 13+ messages in thread From: Oleg Nesterov @ 2007-08-25 17:34 UTC (permalink / raw) To: Sukadev Bhattiprolu Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable On 08/25, Sukadev Bhattiprolu wrote: > > Yes. I see it now. I had missed the SIGQUEUE_PREALLOC in __sigqueue_free(). Thanks for looking at this. This code looks simple, but it is not obvious, at least for me. Oleg. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-24 20:23 ` Oleg Nesterov 2007-08-25 17:24 ` Sukadev Bhattiprolu @ 2007-08-27 13:45 ` taoyue 2007-08-27 5:57 ` Oleg Nesterov 1 sibling, 1 reply; 13+ messages in thread From: taoyue @ 2007-08-27 13:45 UTC (permalink / raw) To: Oleg Nesterov Cc: Sukadev Bhattiprolu, Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable, Mark Zhan, Ashfield, Bruce Oleg Nesterov wrote: > On 08/24, Sukadev Bhattiprolu wrote: > >> Oleg Nesterov wrote: >> >>> On 08/24, taoyue wrote: >>> >>> >>>> Oleg Nesterov wrote: >>>> >>>> >>>>>> collect_signal: sigqueue_free: >>>>>> >>>>>> list_del_init(&first->list); >>>>>> spin_lock_irqsave(lock, flags); >>>>>> >>>>>> >>>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> >>>>> >>>>>> if (!list_empty(&q->list)) >>>>>> list_del_init(&q->list); >>>>>> spin_unlock_irqrestore(lock, >>>>>> flags); >>>>>> q->flags &= ~SIGQUEUE_PREALLOC; >>>>>> >>>>>> __sigqueue_free(first); __sigqueue_free(q); >>>>>> >>>>>> >>>>>> >>>>> collect_signal() is always called under ->siglock which is also taken by >>>>> sigqueue_free(), so this is not possible. >>>>> >>>>> >>>>> >>>>> >>>> I know, using current->sighand->siglock to prevent one sigqueue >>>> is free twice. I want to know whether it is possible that the two >>>> function is called in different thread. If that, the spin_lock is useless. >>>> >>>> >>> Not sure I understand. Yes, it is possible they are called by 2 different >>> threads, that is why we had a race. But all threads in the same thread >>> group have the same ->sighand, and thus the same ->sighand->siglock. >>> I has applied the new patch, and start test program last Friday. So far, the test program has run for three days. In my test program, all threads is created in one process, so they are in the same thread group. If two thread is not in the same thread group, they have the different ->sighand->siglock. In this case, the lock can't prevent the sigqueue object from race condition. >>> >>> >> Oleg, if one thread can be in collect_signal() and another in >> sigqueue_free() and both operate on the exact same sigqueue object, its >> not clear how we prevent two calls to __sigqueue_free() to >> the same object. In that case the lock (or some lock) should be around >> __sigqueue_free() - no ? >> >> i.e if we enter sigqueue_free(), we will call __sigqueue_free() >> regardless of the state. >> > > Yes. They both will call __sigqueue_free(). But please note that __sigqueue_free() > checks SIGQUEUE_PREALLOC, which is cleared by sigqueue_free(). > > IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used > by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC > and free sigqueue. We don't need this lock around sigqueue_free() to prevent > the race. collect_signal() can "see" only those sigqueues which are on list. > > IOW, when sigqueue_free() takes ->siglock, colect_signal() can't run, because > it needs the same lock. Now we delete this sigqueue from list, nobody can > see it, it can't have other references. So we can unlock ->siglock, mark > sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it. > > Do you agree? > > Oleg. > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] sigqueue_free: fix the race with collect_signal() 2007-08-27 13:45 ` taoyue @ 2007-08-27 5:57 ` Oleg Nesterov 0 siblings, 0 replies; 13+ messages in thread From: Oleg Nesterov @ 2007-08-27 5:57 UTC (permalink / raw) To: taoyue Cc: Sukadev Bhattiprolu, Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath, linux-kernel, stable, Mark Zhan, Ashfield, Bruce On 08/27, taoyue wrote: > > Oleg Nesterov wrote: > >On 08/24, Sukadev Bhattiprolu wrote: > >>>>> > >>>>I know, using current->sighand->siglock to prevent one sigqueue > >>>>is free twice. I want to know whether it is possible that the two > >>>>function is called in different thread. If that, the spin_lock is > >>>>useless. > >>>> > >>>> > >>>Not sure I understand. Yes, it is possible they are called by 2 different > >>>threads, that is why we had a race. But all threads in the same thread > >>>group have the same ->sighand, and thus the same ->sighand->siglock. > >>> > I has applied the new patch, and start test program last Friday. > So far, the test program has run for three days. Great, thanks. > In my test program, all threads is created in one process, so they > are in the same thread group. If two thread is not in the same thread > group, they have the different ->sighand->siglock. In this case, the > lock can't prevent the sigqueue object from race condition. If two thread are not in the same thread, they can't access the same SIGQUEUE_PREALLOC sigqueue. That is why sigqueue_free() uses current->sighand. Otherwise, this lock doesn't make any sense, and can't protect list_del(q->list). Oleg. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2007-08-27 5:59 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-08-23 13:45 [PATCH] sigqueue_free: fix the race with collect_signal() Oleg Nesterov 2007-08-23 21:36 ` Sukadev Bhattiprolu 2007-08-23 22:05 ` Oleg Nesterov 2007-08-24 14:26 ` taoyue 2007-08-24 7:45 ` Oleg Nesterov 2007-08-24 21:29 ` taoyue 2007-08-24 11:08 ` Oleg Nesterov 2007-08-24 20:03 ` Sukadev Bhattiprolu 2007-08-24 20:23 ` Oleg Nesterov 2007-08-25 17:24 ` Sukadev Bhattiprolu 2007-08-25 17:34 ` Oleg Nesterov 2007-08-27 13:45 ` taoyue 2007-08-27 5:57 ` Oleg Nesterov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox