* nommu build failure in linux-next
@ 2012-03-06 14:49 Mark Salter
2012-03-06 15:19 ` Siddhesh Poyarekar
2012-03-06 20:16 ` [PATCH 1/2] mm: Fix task_nommu build regression " Siddhesh Poyarekar
0 siblings, 2 replies; 12+ messages in thread
From: Mark Salter @ 2012-03-06 14:49 UTC (permalink / raw)
To: siddhesh.poyarekar; +Cc: linux-next, linux-kernel
commit b7b2a0afacada237005068294cb0ccc49d32889e
Author: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Date: Tue Mar 6 11:17:12 2012 +1100
procfs: mark thread stack correctly in proc/<pid>/maps
This patch still breaks nommu builds. The previous syntax issue has been
resolved, but now I see a link error:
fs/built-in.o: In function `nommu_vma_show':
task_nommu.c:(.text+0x564a0): undefined reference to `vm_is_stack'
make[1]: *** [.tmp_vmlinux1] Error 1
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: nommu build failure in linux-next 2012-03-06 14:49 nommu build failure in linux-next Mark Salter @ 2012-03-06 15:19 ` Siddhesh Poyarekar 2012-03-06 20:16 ` [PATCH 1/2] mm: Fix task_nommu build regression " Siddhesh Poyarekar 1 sibling, 0 replies; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-06 15:19 UTC (permalink / raw) To: Mark Salter; +Cc: linux-next, linux-kernel, Andrew Morton On Tue, Mar 6, 2012 at 8:19 PM, Mark Salter <msalter@redhat.com> wrote: > commit b7b2a0afacada237005068294cb0ccc49d32889e > Author: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> > Date: Tue Mar 6 11:17:12 2012 +1100 > > procfs: mark thread stack correctly in proc/<pid>/maps > > > This patch still breaks nommu builds. The previous syntax issue has been > resolved, but now I see a link error: > > fs/built-in.o: In function `nommu_vma_show': > task_nommu.c:(.text+0x564a0): undefined reference to `vm_is_stack' > make[1]: *** [.tmp_vmlinux1] Error 1 > The function is defined in mm/memory.c and I realized now that it is not built when !CONFIG_MMU. There needs to be a copy of these functions in nommu.c as well. I'll test the build and send a patch. -- Siddhesh Poyarekar http://siddhesh.in ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] mm: Fix task_nommu build regression in linux-next 2012-03-06 14:49 nommu build failure in linux-next Mark Salter 2012-03-06 15:19 ` Siddhesh Poyarekar @ 2012-03-06 20:16 ` Siddhesh Poyarekar 2012-03-06 20:16 ` [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack Siddhesh Poyarekar ` (2 more replies) 1 sibling, 3 replies; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-06 20:16 UTC (permalink / raw) To: Mark Salter Cc: linux-next, linux-kernel, Andrew Morton, Paul Gortmaker, Oleg Nesterov, Siddhesh Poyarekar Commit b7b2a0afacada237005068294cb0ccc49d32889e resulted in a build failure for nommu builds: fs/built-in.o: In function `nommu_vma_show': task_nommu.c:(.text+0x564a0): undefined reference to `vm_is_stack' make[1]: *** [.tmp_vmlinux1] Error 1 Fix is to provide definitions of vm_is_stack() and vm_is_stack_for_task() for nommu as well. This patch also adds an explicit include of sched.h based on observations and patch submission by Paul Gortmaker <paul.gortmaker@windriver.com>: https://lkml.org/lkml/2012/3/5/326 Signed-off-by: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> --- mm/memory.c | 1 + mm/nommu.c | 40 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 41 insertions(+), 0 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 2533d9f..0ca7fe6 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -57,6 +57,7 @@ #include <linux/swapops.h> #include <linux/elf.h> #include <linux/gfp.h> +#include <linux/sched.h> #include <asm/io.h> #include <asm/pgalloc.h> diff --git a/mm/nommu.c b/mm/nommu.c index b982290..5a5c3fc 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -29,6 +29,7 @@ #include <linux/security.h> #include <linux/syscalls.h> #include <linux/audit.h> +#include <linux/sched.h> #include <asm/uaccess.h> #include <asm/tlb.h> @@ -2089,3 +2090,42 @@ int nommu_shrink_inode_mappings(struct inode *inode, size_t size, up_write(&nommu_region_sem); return 0; } + +/* Check if the vma is being used as a stack by this task */ +static int vm_is_stack_for_task(struct task_struct *t, + struct vm_area_struct *vma) +{ + return (vma->vm_start <= KSTK_ESP(t) && vma->vm_end >= KSTK_ESP(t)); +} + +/* + * Check if the vma is being used as a stack. + * If is_group is non-zero, check in the entire thread group or else + * just check in the current task. Returns the pid of the task that + * the vma is stack for. + */ +pid_t vm_is_stack(struct task_struct *task, + struct vm_area_struct *vma, int in_group) +{ + pid_t ret = 0; + + if (vm_is_stack_for_task(task, vma)) + return task->pid; + + if (in_group) { + struct task_struct *t; + rcu_read_lock(); + t = list_first_entry_rcu(&task->thread_group, + struct task_struct, thread_group); + do { + if (vm_is_stack_for_task(t, vma)) { + ret = t->pid; + goto done; + } + } while_each_thread(task, t); +done: + rcu_read_unlock(); + } + + return ret; +} -- 1.7.7.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack 2012-03-06 20:16 ` [PATCH 1/2] mm: Fix task_nommu build regression " Siddhesh Poyarekar @ 2012-03-06 20:16 ` Siddhesh Poyarekar 2012-03-07 15:38 ` Oleg Nesterov 2012-03-06 20:39 ` [PATCH 1/2] mm: Fix task_nommu build regression in linux-next Andrew Morton 2012-03-07 16:43 ` Siddhesh Poyarekar 2 siblings, 1 reply; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-06 20:16 UTC (permalink / raw) To: Mark Salter Cc: linux-next, linux-kernel, Andrew Morton, Paul Gortmaker, Oleg Nesterov, Siddhesh Poyarekar Take rcu read lock before we do anything at all with the threadgroup list. Also use list_first_entry_rcu to safely get the reference to the first task in the list. Signed-off-by: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> --- mm/memory.c | 12 +++++++----- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 0ca7fe6..1d5830c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3932,18 +3932,20 @@ pid_t vm_is_stack(struct task_struct *task, return task->pid; if (in_group) { - struct task_struct *t = task; + struct task_struct *t; rcu_read_lock(); - while_each_thread(task, t) { + t = list_first_entry_rcu(&task->thread_group, + struct task_struct, thread_group); + do { if (vm_is_stack_for_task(t, vma)) { ret = t->pid; goto done; } - } + } while_each_thread(task, t); +done: + rcu_read_unlock(); } -done: - rcu_read_unlock(); return ret; } -- 1.7.7.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack 2012-03-06 20:16 ` [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack Siddhesh Poyarekar @ 2012-03-07 15:38 ` Oleg Nesterov 2012-03-07 16:05 ` Siddhesh Poyarekar 2012-03-07 22:11 ` Siddhesh Poyarekar 0 siblings, 2 replies; 12+ messages in thread From: Oleg Nesterov @ 2012-03-07 15:38 UTC (permalink / raw) To: Siddhesh Poyarekar Cc: Mark Salter, linux-next, linux-kernel, Andrew Morton, Paul Gortmaker On 03/07, Siddhesh Poyarekar wrote: > > Take rcu read lock before we do anything at all with the threadgroup > list. Also use list_first_entry_rcu to safely get the reference to the > first task in the list. Sorry, but both patches are absolutely wrong, afaics. This change fixes nothing, just obfuscates the code. Probably my explanation was not clear, let me try again. rcu_read_lock() can not help without the additional checks. By the time you take it, task->thread_group->next can point to nowhere. So, > @@ -3932,18 +3932,20 @@ pid_t vm_is_stack(struct task_struct *task, > return task->pid; > > if (in_group) { > - struct task_struct *t = task; > + struct task_struct *t; > rcu_read_lock(); > - while_each_thread(task, t) { > + t = list_first_entry_rcu(&task->thread_group, > + struct task_struct, thread_group); It is not safe to dereference this pointer. Once again. You have the task_struct *task. It exits, but task->thread_group->next still points to another thread T. Now suppose that T exits too. But task->thread_group->next was not changed, it still points to T. RCU grace period passes, T is freed. After that you take rcu_read_lock(), but it is too late! >next points to the already freed/reused memory. How can list_first_entry_rcu() help? What you need is something like rcu_read_lock(); if (!pid_alive(task)) goto done; t = task; do { ... } while_each(task, t); done: rcu_read_unlock(); And. Imho it is not good to have the (afaics exactly?) same code in mm/nommu.c, even with the same names. Why it is not possible to make a single definition? Oleg. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack 2012-03-07 15:38 ` Oleg Nesterov @ 2012-03-07 16:05 ` Siddhesh Poyarekar 2012-03-07 22:11 ` Siddhesh Poyarekar 1 sibling, 0 replies; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-07 16:05 UTC (permalink / raw) To: Oleg Nesterov Cc: Mark Salter, linux-next, linux-kernel, Andrew Morton, Paul Gortmaker On Wed, Mar 7, 2012 at 9:08 PM, Oleg Nesterov <oleg@redhat.com> wrote: > Once again. You have the task_struct *task. It exits, > but task->thread_group->next still points to another thread T. Now suppose > that T exits too. But task->thread_group->next was not changed, it still > points to T. RCU grace period passes, T is freed. > > After that you take rcu_read_lock(), but it is too late! >next points to > the already freed/reused memory. How can list_first_entry_rcu() help? Ahh, I completely misunderstood your point. Thanks for the detailed explanation. > And. Imho it is not good to have the (afaics exactly?) same code in > mm/nommu.c, even with the same names. Why it is not possible to make > a single definition? Yes it is the same code. I put the code in both memory.c and nommu.c because I thought they fit in there logically. I can find a common place for it. -- Siddhesh Poyarekar http://siddhesh.in ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack 2012-03-07 15:38 ` Oleg Nesterov 2012-03-07 16:05 ` Siddhesh Poyarekar @ 2012-03-07 22:11 ` Siddhesh Poyarekar 2012-03-08 17:40 ` Oleg Nesterov 1 sibling, 1 reply; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-07 22:11 UTC (permalink / raw) To: Oleg Nesterov Cc: Mark Salter, linux-next, linux-kernel, Andrew Morton, Paul Gortmaker On Wed, Mar 7, 2012 at 9:08 PM, Oleg Nesterov <oleg@redhat.com> wrote: > rcu_read_lock() can not help without the additional checks. By the > time you take it, task->thread_group->next can point to nowhere. I thought I understood this the second time, but I think I haven't. > Once again. You have the task_struct *task. It exits, > but task->thread_group->next still points to another thread T. Now suppose > that T exits too. But task->thread_group->next was not changed, it still > points to T. RCU grace period passes, T is freed. This is the point I haven't understood. From what I understand about rcu, the rcu update will first update task->thread_group->next and then reclaim the struct it pointed to and not the other way around. So with: >> rcu_read_lock(); >> - while_each_thread(task, t) { >> + t = list_first_entry_rcu(&task->thread_group, >> + struct task_struct, thread_group); since I have the rcu_read_lock when I'm touching the rcu protected list, if I get the old T from task->thread_group->next then I should have a valid struct backing it too. If not, the T should have changed _before_ the struct it points to is removed and hence should again result in a valid reference. I guess there is a corner case where the current task is released and thread_group is rcu_list_del()'d. In that case too, before this happens, the proc entry is removed and the task namespace is unmounted from /proc. Also, the thread_group being deleted from list is merely an update of references and we should get the next element of the list when we get access to the lock and poke it (under an rcu_lock of course). Is there something I am missing? -- Siddhesh Poyarekar http://siddhesh.in ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack 2012-03-07 22:11 ` Siddhesh Poyarekar @ 2012-03-08 17:40 ` Oleg Nesterov 2012-03-08 18:35 ` Siddhesh Poyarekar 0 siblings, 1 reply; 12+ messages in thread From: Oleg Nesterov @ 2012-03-08 17:40 UTC (permalink / raw) To: Siddhesh Poyarekar Cc: Mark Salter, linux-next, linux-kernel, Andrew Morton, Paul Gortmaker On 03/08, Siddhesh Poyarekar wrote: > > On Wed, Mar 7, 2012 at 9:08 PM, Oleg Nesterov <oleg@redhat.com> wrote: > > rcu_read_lock() can not help without the additional checks. By the > > time you take it, task->thread_group->next can point to nowhere. > > I thought I understood this the second time, but I think I haven't. > > > Once again. You have the task_struct *task. It exits, > > but task->thread_group->next still points to another thread T. Now suppose > > that T exits too. But task->thread_group->next was not changed, it still > > points to T. RCU grace period passes, T is freed. > > This is the point I haven't understood. From what I understand about > rcu, the rcu update will first update task->thread_group->next Not in this case. see __unhash_process(p)->list_del_rcu(p->thread_group). You missed the fact that ->thread_group differs from the "usual" rcu protected list. The _head_ of the list can be list_del_rcu'd. Not the first/last/any entry, even the head. Or IOW, we do not really have the head. Every task is the list entry, but it also can be be used as a head by while_each_thread(). > and > then reclaim the struct it pointed to and not the other way around. So > with: > > >> rcu_read_lock(); > >> - while_each_thread(task, t) { > >> + t = list_first_entry_rcu(&task->thread_group, > >> + struct task_struct, thread_group); > > since I have the rcu_read_lock when I'm touching the rcu protected > list, It is not rcu-protected if this task has already exited, that is why you need (say) pid_alive() check. > I guess there is a corner case where the current task is released and > thread_group is rcu_list_del()'d. Yes, assuming that "current" means this "task", > In that case too, before this > happens, the proc entry is removed I guess you meant proc_flush_task()... Not sure I really understand, it can't "remove" the opened entry. This is just optimization which tries to shrink the cache. But this doesn't matter, it can exit right after get_pid_task() succeeds. (OK, and after mm_for_maps() in this particular case, otherwise m_start() fails). > and the task namespace is unmounted > from /proc. Again, this doesn't matter, but note the nr == 1 check. This is only called when init exits and this simply does kern_unmount(). > Also, the thread_group being deleted from list is merely > an update of references and we should get the next element Yes, yes, yes, but this "next element" can exit too before you take rcu_read_lock, and in this case the deleted entry won't be updated. That is the problem. Oleg. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack 2012-03-08 17:40 ` Oleg Nesterov @ 2012-03-08 18:35 ` Siddhesh Poyarekar 0 siblings, 0 replies; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-08 18:35 UTC (permalink / raw) To: Oleg Nesterov Cc: Mark Salter, linux-next, linux-kernel, Andrew Morton, Paul Gortmaker On Thu, Mar 8, 2012 at 11:10 PM, Oleg Nesterov <oleg@redhat.com> wrote: > Not in this case. see __unhash_process(p)->list_del_rcu(p->thread_group). > > You missed the fact that ->thread_group differs from the "usual" rcu > protected list. The _head_ of the list can be list_del_rcu'd. Not the > first/last/any entry, even the head. > > Or IOW, we do not really have the head. Every task is the list entry, > but it also can be be used as a head by while_each_thread(). Ahh ok, I did not notice this. That's an interesting quirk. The more I think I understand rcu the more I realize there are gaps in my understanding. >> In that case too, before this >> happens, the proc entry is removed > > I guess you meant proc_flush_task()... Not sure I really understand, > it can't "remove" the opened entry. This is just optimization which > tries to shrink the cache. Yes, and I was obviously wrong, now that I read the whole thing again, including the unmounting of the namespace. I misread the unmounting of proc as being an unmount of the thread/thread group namespace (the nr == 1 check). > Yes, yes, yes, but this "next element" can exit too before you take > rcu_read_lock, and in this case the deleted entry won't be updated. > That is the problem. I will post the entire, consolidated patch once again next week with changes for this as well as some other things (*not* marking the process stack with the TID to maintain backward compatibility and some code cleanup). Thanks for not giving up trying to explain the same thing over and over again ;) -- Siddhesh Poyarekar http://siddhesh.in ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] mm: Fix task_nommu build regression in linux-next 2012-03-06 20:16 ` [PATCH 1/2] mm: Fix task_nommu build regression " Siddhesh Poyarekar 2012-03-06 20:16 ` [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack Siddhesh Poyarekar @ 2012-03-06 20:39 ` Andrew Morton 2012-03-07 3:31 ` Siddhesh Poyarekar 2012-03-07 16:43 ` Siddhesh Poyarekar 2 siblings, 1 reply; 12+ messages in thread From: Andrew Morton @ 2012-03-06 20:39 UTC (permalink / raw) To: Siddhesh Poyarekar Cc: Mark Salter, linux-next, linux-kernel, Paul Gortmaker, Oleg Nesterov On Wed, 7 Mar 2012 01:46:33 +0530 Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> wrote: > Commit b7b2a0afacada237005068294cb0ccc49d32889e resulted in a build > failure for nommu builds: Bare git commit IDs aren't very useful, because the ID of a patch can be different in different trees. And in linux-next, the ID of a patch can change over time. This is why we like to refer to commits using the formulation b7b2a0afacad ("procfs: mark thread stack correctly in proc/<pid>/maps"). ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] mm: Fix task_nommu build regression in linux-next 2012-03-06 20:39 ` [PATCH 1/2] mm: Fix task_nommu build regression in linux-next Andrew Morton @ 2012-03-07 3:31 ` Siddhesh Poyarekar 0 siblings, 0 replies; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-07 3:31 UTC (permalink / raw) To: Andrew Morton Cc: Mark Salter, linux-next, linux-kernel, Paul Gortmaker, Oleg Nesterov On Wed, Mar 7, 2012 at 2:09 AM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Wed, 7 Mar 2012 01:46:33 +0530 > Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> wrote: > >> Commit b7b2a0afacada237005068294cb0ccc49d32889e resulted in a build >> failure for nommu builds: > > Bare git commit IDs aren't very useful, because the ID of a patch can > be different in different trees. And in linux-next, the ID of a patch > can change over time. This is why we like to refer to commits using the > formulation b7b2a0afacad ("procfs: mark thread stack correctly in > proc/<pid>/maps"). Thanks, I'll keep this in mind. -- Siddhesh Poyarekar http://siddhesh.in ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] mm: Fix task_nommu build regression in linux-next 2012-03-06 20:16 ` [PATCH 1/2] mm: Fix task_nommu build regression " Siddhesh Poyarekar 2012-03-06 20:16 ` [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack Siddhesh Poyarekar 2012-03-06 20:39 ` [PATCH 1/2] mm: Fix task_nommu build regression in linux-next Andrew Morton @ 2012-03-07 16:43 ` Siddhesh Poyarekar 2 siblings, 0 replies; 12+ messages in thread From: Siddhesh Poyarekar @ 2012-03-07 16:43 UTC (permalink / raw) To: Mark Salter Cc: linux-next, linux-kernel, Andrew Morton, Paul Gortmaker, Oleg Nesterov, Siddhesh Poyarekar On Wed, Mar 7, 2012 at 1:46 AM, Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> wrote: > Commit b7b2a0afacada237005068294cb0ccc49d32889e resulted in a build > failure for nommu builds: > > fs/built-in.o: In function `nommu_vma_show': > task_nommu.c:(.text+0x564a0): undefined reference to `vm_is_stack' > make[1]: *** [.tmp_vmlinux1] Error 1 > > Fix is to provide definitions of vm_is_stack() and > vm_is_stack_for_task() for nommu as well. This patch also adds an > explicit include of sched.h based on observations and patch submission > by Paul Gortmaker <paul.gortmaker@windriver.com>: > > https://lkml.org/lkml/2012/3/5/326 > > Signed-off-by: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> Tested-by: Mark Salter <msalter@redhat.com> for c6x (nommu). -- Siddhesh Poyarekar http://siddhesh.in ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-03-08 18:35 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-03-06 14:49 nommu build failure in linux-next Mark Salter 2012-03-06 15:19 ` Siddhesh Poyarekar 2012-03-06 20:16 ` [PATCH 1/2] mm: Fix task_nommu build regression " Siddhesh Poyarekar 2012-03-06 20:16 ` [PATCH 2/2] mm/linux-next: Fix rcu locking in vm_is_stack Siddhesh Poyarekar 2012-03-07 15:38 ` Oleg Nesterov 2012-03-07 16:05 ` Siddhesh Poyarekar 2012-03-07 22:11 ` Siddhesh Poyarekar 2012-03-08 17:40 ` Oleg Nesterov 2012-03-08 18:35 ` Siddhesh Poyarekar 2012-03-06 20:39 ` [PATCH 1/2] mm: Fix task_nommu build regression in linux-next Andrew Morton 2012-03-07 3:31 ` Siddhesh Poyarekar 2012-03-07 16:43 ` Siddhesh Poyarekar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox