* Re: + do_wait-wakeup-optimization.patch added to -mm tree [not found] <200811212015.mALKFMs4019558@imap1.linux-foundation.org> @ 2008-11-23 21:39 ` Oleg Nesterov 2008-11-23 21:55 ` do_wait() vs do_notify_parent_cldstop() theoretical race? Oleg Nesterov ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Oleg Nesterov @ 2008-11-23 21:39 UTC (permalink / raw) To: akpm; +Cc: linux-kernel, roland, mingo, rnalumasu > From: Roland McGrath <roland@redhat.com> > > +static int needs_wakeup(struct task_struct *task, struct do_wait_queue_entry *w) > +{ > + if ((w->options & __WNOTHREAD) && task->parent != w->wq.private) > + return 0; > + > + if (eligible_child(w->type, w->pid, w->options, > + task, task->exit_signal)) > + return 1; > + > + if (thread_group_leader(task)) { > + /* > + * In a group leader, do_notify_parent() may have > + * just reset task->exit_signal because SIGCHLD was > + * ignored, but that doesn't prevent the wakeup. > + */ > + if (!task_detached(task) || > + !eligible_child(w->type, w->pid, w->options, > + task, SIGCHLD)) > + return 0; > + } else { > + /* > + * In a non-leader, this might be the release_task() > + * case, where it's the leader rather than task > + * whose parent is being woken. > + */ > + if (!eligible_child(w->type, w->pid, w->options, > + task->group_leader, > + task_detached(task->group_leader) ? > + SIGCHLD : task->group_leader->exit_signal)) > + return 0; > + } > + > + return 1; > +} Unless I missed something, this is not right. This "task" is current, iow it is the caller of do_notify_parent(). Sometime it is OK (release_task, exit_notify), but in general not, afaics. Let's suppose the ptracer finds the EXIT_ZOMBIE tracee and notifies its ->real_parent which sleeps in do_wait(). In that case the usage of eligible_child(task == ptracer) above is bogus, and checking for group_leader is not rifgt too. > +static int do_wait_wake_function(wait_queue_t *curr, unsigned mode, int sync, > + void *key) > +{ > + struct task_struct *task = current; I think we can fix (and simplify) this code if we change __wake_up_parent(), it should call __wake_up(key => p), so we can do struct task_struct *task = key; > + if (!needs_wakeup(task, w)) > + return 0; > + > + return default_wake_function(curr, mode, sync, key); perhaps autoremove_wake_function() makes more sense. Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* do_wait() vs do_notify_parent_cldstop() theoretical race? 2008-11-23 21:39 ` + do_wait-wakeup-optimization.patch added to -mm tree Oleg Nesterov @ 2008-11-23 21:55 ` Oleg Nesterov 2008-11-24 7:31 ` Roland McGrath 2008-12-04 1:05 ` Roland McGrath 2008-11-24 7:26 ` + do_wait-wakeup-optimization.patch added to -mm tree Roland McGrath 2008-12-04 1:06 ` Roland McGrath 2 siblings, 2 replies; 8+ messages in thread From: Oleg Nesterov @ 2008-11-23 21:55 UTC (permalink / raw) To: akpm; +Cc: linux-kernel, roland, mingo, rnalumasu Looking at do_wait(), suddenly I am starting to suspect we have the highly theoretical race with do_notify_parent_cldstop(). do_wait: add_wait_queue(...); current->state = TASK_INTERRUPTIBLE; read_lock(tasklist_lock); ... try to find the "interesting" task ... read_unlock(tasklist_lock); if (!retval) // not found schedule(); We don't race with do_notify_parent() because it takes tasklist for writing. But do_notify_parent_cldstop() can run in parallel under read_lock(tasklist). Now suppose that "->state = TASK_INTERRUPTIBLE" leaks deeply into the critical section. In theory, it is possible that wait_consider_task() checks task_is_stopped_or_traced() or SIGNAL_STOP_CONTINUED first, then CPU sets state = TASK_INTERRUPTIBLE. And we can miss the event if do_notify_parent_cldstop() happens in between. No? Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: do_wait() vs do_notify_parent_cldstop() theoretical race? 2008-11-23 21:55 ` do_wait() vs do_notify_parent_cldstop() theoretical race? Oleg Nesterov @ 2008-11-24 7:31 ` Roland McGrath 2008-12-04 1:05 ` Roland McGrath 1 sibling, 0 replies; 8+ messages in thread From: Roland McGrath @ 2008-11-24 7:31 UTC (permalink / raw) To: Oleg Nesterov; +Cc: akpm, linux-kernel, mingo, rnalumasu I guess. There might well be something in there implicitly that's actually a memory barrier already. But sounds like that should use set_current_state(). Thanks, Roland ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: do_wait() vs do_notify_parent_cldstop() theoretical race? 2008-11-23 21:55 ` do_wait() vs do_notify_parent_cldstop() theoretical race? Oleg Nesterov 2008-11-24 7:31 ` Roland McGrath @ 2008-12-04 1:05 ` Roland McGrath 1 sibling, 0 replies; 8+ messages in thread From: Roland McGrath @ 2008-12-04 1:05 UTC (permalink / raw) To: Oleg Nesterov; +Cc: akpm, linux-kernel, mingo, rnalumasu I guess. There might well be something in there implicitly that's actually a memory barrier already. But sounds like that should use set_current_state(). Thanks, Roland ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + do_wait-wakeup-optimization.patch added to -mm tree 2008-11-23 21:39 ` + do_wait-wakeup-optimization.patch added to -mm tree Oleg Nesterov 2008-11-23 21:55 ` do_wait() vs do_notify_parent_cldstop() theoretical race? Oleg Nesterov @ 2008-11-24 7:26 ` Roland McGrath 2008-12-04 15:26 ` Oleg Nesterov 2008-12-04 1:06 ` Roland McGrath 2 siblings, 1 reply; 8+ messages in thread From: Roland McGrath @ 2008-11-24 7:26 UTC (permalink / raw) To: Oleg Nesterov; +Cc: akpm, linux-kernel, mingo, rnalumasu > Let's suppose the ptracer finds the EXIT_ZOMBIE tracee and notifies its > ->real_parent which sleeps in do_wait(). In that case the usage of > eligible_child(task == ptracer) above is bogus, and checking for > group_leader is not rifgt too. I had overlooked that do_notify_parent() call. > > +static int do_wait_wake_function(wait_queue_t *curr, unsigned mode, int sync, > > + void *key) > > +{ > > + struct task_struct *task = current; > > I think we can fix (and simplify) this code if we change __wake_up_parent(), > it should call __wake_up(key => p), so we can do > > struct task_struct *task = key; I had not looked into the bowels of various __wake_up variants, just assumed it would stay as it is and use wake_up_interruptible_sync. That would certainly be cleaner. Then do_wait_wake_function would not need the second of its special cases, only the one double-check for the thread_group_leader && task_detached case. I don't see an exposed __wake_up* variant that both passes a "key" pointer through and does "sync". For __wake_up_parent, "sync" is quite desireable. > > + if (!needs_wakeup(task, w)) > > + return 0; > > + > > + return default_wake_function(curr, mode, sync, key); > > perhaps autoremove_wake_function() makes more sense. Why? The do_wait loop will have to go through again and still might just sleep again. The explicit remove at the end of do_wait seems fine to me. Thanks, Roland ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + do_wait-wakeup-optimization.patch added to -mm tree 2008-11-24 7:26 ` + do_wait-wakeup-optimization.patch added to -mm tree Roland McGrath @ 2008-12-04 15:26 ` Oleg Nesterov 2008-12-04 20:59 ` Roland McGrath 0 siblings, 1 reply; 8+ messages in thread From: Oleg Nesterov @ 2008-12-04 15:26 UTC (permalink / raw) To: Roland McGrath; +Cc: akpm, linux-kernel, mingo, rnalumasu On 11/23, Roland McGrath wrote: > > > > +static int do_wait_wake_function(wait_queue_t *curr, unsigned mode, int sync, > > > + void *key) > > > +{ > > > + struct task_struct *task = current; > > > > I think we can fix (and simplify) this code if we change __wake_up_parent(), > > it should call __wake_up(key => p), so we can do > > > > struct task_struct *task = key; > > I don't see an exposed __wake_up* variant that both passes a "key" pointer > through and does "sync". For __wake_up_parent, "sync" is quite desireable. Well, yes... and __wake_up_common() is static. Perhaps we can make a new helper. I must admit, I don't understand what "sync" actually means nowadays. > > > + if (!needs_wakeup(task, w)) > > > + return 0; > > > + > > > + return default_wake_function(curr, mode, sync, key); > > > > perhaps autoremove_wake_function() makes more sense. > > Why? The do_wait loop will have to go through again and still might just > sleep again. The explicit remove at the end of do_wait seems fine to me. Yes, yes, I was wrong. I forgot about "repeat:" in do_wait(). Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + do_wait-wakeup-optimization.patch added to -mm tree 2008-12-04 15:26 ` Oleg Nesterov @ 2008-12-04 20:59 ` Roland McGrath 0 siblings, 0 replies; 8+ messages in thread From: Roland McGrath @ 2008-12-04 20:59 UTC (permalink / raw) To: Oleg Nesterov; +Cc: akpm, linux-kernel, mingo, rnalumasu > > I don't see an exposed __wake_up* variant that both passes a "key" pointer > > through and does "sync". For __wake_up_parent, "sync" is quite desireable. > > Well, yes... and __wake_up_common() is static. Perhaps we can make a new > helper. Right, that's what I was suggesting (and not volunteering to do ;-). > I must admit, I don't understand what "sync" actually means nowadays. I don't claim to know any actual scheduler innards. But the meaning as I understand it is to "make it runnable, but don't try to reschedule right now because current will block quite soon anyway. If this does indeed reduce work done to immediately reschedule, then it seems quite desireable to avoid that flutter since the dying/stopping thread is very few cycles away from yielding, and in the death case it will be for the last time and rescheduling earlier just means a later unnecessary switch back and delayed put_task_struct processing after the reap. Thanks, Roland ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + do_wait-wakeup-optimization.patch added to -mm tree 2008-11-23 21:39 ` + do_wait-wakeup-optimization.patch added to -mm tree Oleg Nesterov 2008-11-23 21:55 ` do_wait() vs do_notify_parent_cldstop() theoretical race? Oleg Nesterov 2008-11-24 7:26 ` + do_wait-wakeup-optimization.patch added to -mm tree Roland McGrath @ 2008-12-04 1:06 ` Roland McGrath 2 siblings, 0 replies; 8+ messages in thread From: Roland McGrath @ 2008-12-04 1:06 UTC (permalink / raw) To: Oleg Nesterov; +Cc: akpm, linux-kernel, mingo, rnalumasu > Let's suppose the ptracer finds the EXIT_ZOMBIE tracee and notifies its > ->real_parent which sleeps in do_wait(). In that case the usage of > eligible_child(task == ptracer) above is bogus, and checking for > group_leader is not rifgt too. I had overlooked that do_notify_parent() call. > > +static int do_wait_wake_function(wait_queue_t *curr, unsigned mode, int sync, > > + void *key) > > +{ > > + struct task_struct *task = current; > > I think we can fix (and simplify) this code if we change __wake_up_parent(), > it should call __wake_up(key => p), so we can do > > struct task_struct *task = key; I had not looked into the bowels of various __wake_up variants, just assumed it would stay as it is and use wake_up_interruptible_sync. That would certainly be cleaner. Then do_wait_wake_function would not need the second of its special cases, only the one double-check for the thread_group_leader && task_detached case. I don't see an exposed __wake_up* variant that both passes a "key" pointer through and does "sync". For __wake_up_parent, "sync" is quite desireable. > > + if (!needs_wakeup(task, w)) > > + return 0; > > + > > + return default_wake_function(curr, mode, sync, key); > > perhaps autoremove_wake_function() makes more sense. Why? The do_wait loop will have to go through again and still might just sleep again. The explicit remove at the end of do_wait seems fine to me. Thanks, Roland ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-12-04 21:00 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200811212015.mALKFMs4019558@imap1.linux-foundation.org>
2008-11-23 21:39 ` + do_wait-wakeup-optimization.patch added to -mm tree Oleg Nesterov
2008-11-23 21:55 ` do_wait() vs do_notify_parent_cldstop() theoretical race? Oleg Nesterov
2008-11-24 7:31 ` Roland McGrath
2008-12-04 1:05 ` Roland McGrath
2008-11-24 7:26 ` + do_wait-wakeup-optimization.patch added to -mm tree Roland McGrath
2008-12-04 15:26 ` Oleg Nesterov
2008-12-04 20:59 ` Roland McGrath
2008-12-04 1:06 ` Roland McGrath
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox