public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] proc: revert to show stack information in /proc/{pid}/status
@ 2009-12-31 14:12 KOSAKI Motohiro
  2009-12-31 15:48 ` Stefani Seibold
  2010-01-07 21:52 ` Andrew Morton
  0 siblings, 2 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2009-12-31 14:12 UTC (permalink / raw)
  To: LKML
  Cc: kosaki.motohiro, Stefani Seibold, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

Commit d899bf7b (procfs: provide stack information for threads) introduced
to show stack information in /proc/{pid}/status. But it cause large performance
regression. Unfortunately /proc/{pid}/status is used ps command too and ps is
one of most important component. Because both to take mmap_sem and page table walk
are heavily operation.

if many process run, the ps performance is,

[before d899bf7b]

% perf stat ps >/dev/null

 Performance counter stats for 'ps':

     4090.435806  task-clock-msecs         #      0.032 CPUs
             229  context-switches         #      0.000 M/sec
               0  CPU-migrations           #      0.000 M/sec
             234  page-faults              #      0.000 M/sec
      8587565207  cycles                   #   2099.425 M/sec
      9866662403  instructions             #      1.149 IPC
      3789415411  cache-references         #    926.409 M/sec
        30419509  cache-misses             #      7.437 M/sec

   128.859521955  seconds time elapsed

[after d899bf7b]

% perf stat  ps  > /dev/null

 Performance counter stats for 'ps':

     4305.081146  task-clock-msecs         #      0.028 CPUs
             480  context-switches         #      0.000 M/sec
               2  CPU-migrations           #      0.000 M/sec
             237  page-faults              #      0.000 M/sec
      9021211334  cycles                   #   2095.480 M/sec
     10605887536  instructions             #      1.176 IPC
      3612650999  cache-references         #    839.160 M/sec
        23917502  cache-misses             #      5.556 M/sec

   152.277819582  seconds time elapsed

Thus, this patch revert it. Fortunately /proc/{pid}/task/{tid}/smaps
provide almost same information. we can use it.

Cc: Stefani Seibold <stefani@seibold.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 Documentation/filesystems/proc.txt |    2 -
 fs/proc/array.c                    |   89 ------------------------------------
 2 files changed, 0 insertions(+), 91 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 220cc63..0d07513 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -177,7 +177,6 @@ read the file /proc/PID/status:
   CapBnd: ffffffffffffffff
   voluntary_ctxt_switches:        0
   nonvoluntary_ctxt_switches:     1
-  Stack usage:    12 kB
 
 This shows you nearly the same information you would get if you viewed it with
 the ps  command.  In  fact,  ps  uses  the  proc  file  system  to  obtain its
@@ -231,7 +230,6 @@ Table 1-2: Contents of the statm files (as of 2.6.30-rc7)
  Mems_allowed_list           Same as previous, but in "list format"
  voluntary_ctxt_switches     number of voluntary context switches
  nonvoluntary_ctxt_switches  number of non voluntary context switches
- Stack usage:                stack usage high water mark (round up to page size)
 ..............................................................................
 
 Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 2e5c2a3..b8df659 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -322,94 +322,6 @@ static inline void task_context_switch_counts(struct seq_file *m,
 			p->nivcsw);
 }
 
-#ifdef CONFIG_MMU
-
-struct stack_stats {
-	struct vm_area_struct *vma;
-	unsigned long	startpage;
-	unsigned long	usage;
-};
-
-static int stack_usage_pte_range(pmd_t *pmd, unsigned long addr,
-				unsigned long end, struct mm_walk *walk)
-{
-	struct stack_stats *ss = walk->private;
-	struct vm_area_struct *vma = ss->vma;
-	pte_t *pte, ptent;
-	spinlock_t *ptl;
-	int ret = 0;
-
-	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
-	for (; addr != end; pte++, addr += PAGE_SIZE) {
-		ptent = *pte;
-
-#ifdef CONFIG_STACK_GROWSUP
-		if (pte_present(ptent) || is_swap_pte(ptent))
-			ss->usage = addr - ss->startpage + PAGE_SIZE;
-#else
-		if (pte_present(ptent) || is_swap_pte(ptent)) {
-			ss->usage = ss->startpage - addr + PAGE_SIZE;
-			pte++;
-			ret = 1;
-			break;
-		}
-#endif
-	}
-	pte_unmap_unlock(pte - 1, ptl);
-	cond_resched();
-	return ret;
-}
-
-static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
-				struct task_struct *task)
-{
-	struct stack_stats ss;
-	struct mm_walk stack_walk = {
-		.pmd_entry = stack_usage_pte_range,
-		.mm = vma->vm_mm,
-		.private = &ss,
-	};
-
-	if (!vma->vm_mm || is_vm_hugetlb_page(vma))
-		return 0;
-
-	ss.vma = vma;
-	ss.startpage = task->stack_start & PAGE_MASK;
-	ss.usage = 0;
-
-#ifdef CONFIG_STACK_GROWSUP
-	walk_page_range(KSTK_ESP(task) & PAGE_MASK, vma->vm_end,
-		&stack_walk);
-#else
-	walk_page_range(vma->vm_start, (KSTK_ESP(task) & PAGE_MASK) + PAGE_SIZE,
-		&stack_walk);
-#endif
-	return ss.usage;
-}
-
-static inline void task_show_stack_usage(struct seq_file *m,
-						struct task_struct *task)
-{
-	struct vm_area_struct	*vma;
-	struct mm_struct	*mm = get_task_mm(task);
-
-	if (mm) {
-		down_read(&mm->mmap_sem);
-		vma = find_vma(mm, task->stack_start);
-		if (vma)
-			seq_printf(m, "Stack usage:\t%lu kB\n",
-				get_stack_usage_in_bytes(vma, task) >> 10);
-
-		up_read(&mm->mmap_sem);
-		mmput(mm);
-	}
-}
-#else
-static void task_show_stack_usage(struct seq_file *m, struct task_struct *task)
-{
-}
-#endif		/* CONFIG_MMU */
-
 static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
 {
 	seq_printf(m, "Cpus_allowed:\t");
@@ -440,7 +352,6 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 	task_show_regs(m, task);
 #endif
 	task_context_switch_counts(m, task);
-	task_show_stack_usage(m, task);
 	return 0;
 }
 
-- 
1.6.6




^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2009-12-31 14:12 [PATCH] proc: revert to show stack information in /proc/{pid}/status KOSAKI Motohiro
@ 2009-12-31 15:48 ` Stefani Seibold
  2010-01-01 14:14   ` KOSAKI Motohiro
  2010-01-07 21:52 ` Andrew Morton
  1 sibling, 1 reply; 16+ messages in thread
From: Stefani Seibold @ 2009-12-31 15:48 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Alexey Dobriyan,
	Eric W. Biederman, Randy Dunlap, Andrew Morton

Am Donnerstag, den 31.12.2009, 23:12 +0900 schrieb KOSAKI Motohiro:
> Commit d899bf7b (procfs: provide stack information for threads) introduced
> to show stack information in /proc/{pid}/status. But it cause large performance
> regression. Unfortunately /proc/{pid}/status is used ps command too and ps is
> one of most important component. Because both to take mmap_sem and page table walk
> are heavily operation.
> 

/proc/<pid>/status is IMHO not a performance relevant thing. The main
reason is to provide exact information about a running process.
 
> Thus, this patch revert it. Fortunately /proc/{pid}/task/{tid}/smaps
> provide almost same information. we can use it.
> 

Completely wrong. Where does smaps provides this kind of information?
Where is there the high water mark of the stack usage?

> Cc: Stefani Seibold <stefani@seibold.net>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: Randy Dunlap <randy.dunlap@oracle.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---

Nak: Stefani Seibold <stefani@seibold.net>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2009-12-31 15:48 ` Stefani Seibold
@ 2010-01-01 14:14   ` KOSAKI Motohiro
  2010-01-01 15:10     ` Stefani Seibold
  2010-01-01 15:49     ` Andi Kleen
  0 siblings, 2 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-01-01 14:14 UTC (permalink / raw)
  To: Stefani Seibold
  Cc: kosaki.motohiro, LKML, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

> Am Donnerstag, den 31.12.2009, 23:12 +0900 schrieb KOSAKI Motohiro:
> > Commit d899bf7b (procfs: provide stack information for threads) introduced
> > to show stack information in /proc/{pid}/status. But it cause large performance
> > regression. Unfortunately /proc/{pid}/status is used ps command too and ps is
> > one of most important component. Because both to take mmap_sem and page table walk
> > are heavily operation.
> > 
> 
> /proc/<pid>/status is IMHO not a performance relevant thing. The main
> reason is to provide exact information about a running process.

No. You have to learn real world use case. if you think so, you should 
change ps before submit this change. This patch obviously make harm than worth. 
Nobody (except you) use it but everybody get regression.


> > Thus, this patch revert it. Fortunately /proc/{pid}/task/{tid}/smaps
> > provide almost same information. we can use it.
> 
> Completely wrong. Where does smaps provides this kind of information?
> Where is there the high water mark of the stack usage?

You have to see you patch itself. show_map_vma() isn't only used by /proc/pid/maps,
but also be used by /proc/pid/smaps.

Now, /proc/{pid}/task/{tid}/smaps show following column.

7fb97c181000-7fb97d1d1000 rw-p 00000000 00:00 0                          [threadstack:0084eff0]
Size:              16704 kB
Rss:                   8 kB
Pss:                   8 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         8 kB
Referenced:            8 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB

It descibe
  - This vma is thread stack
  - vma size is 16704kB
  - stack vsz size in vma is 0x0084eff0 (~ 8508kB)
  - and, physical memory is used 8kB.


Anyway, I revert the regresstion patch as other regression patches. if you really want
this feature, you have three options.

  1. create new /proc file instead to use /proc/pid/status.
  2. improve performance until typical use-case don't notice regression.
  3. change ps and other /proc related userland implementation and resubmit this patch.

But even if you choose anything, You have to test both its functional and performance
_before_ submitting kernel patch.





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-01 14:14   ` KOSAKI Motohiro
@ 2010-01-01 15:10     ` Stefani Seibold
  2010-01-01 22:05       ` Samuel Thibault
  2010-01-01 15:49     ` Andi Kleen
  1 sibling, 1 reply; 16+ messages in thread
From: Stefani Seibold @ 2010-01-01 15:10 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Alexey Dobriyan,
	Eric W. Biederman, Randy Dunlap, Andrew Morton

Am Freitag, den 01.01.2010, 23:14 +0900 schrieb KOSAKI Motohiro:
> > Am Donnerstag, den 31.12.2009, 23:12 +0900 schrieb KOSAKI Motohiro:
> > > Commit d899bf7b (procfs: provide stack information for threads) introduced
> > > to show stack information in /proc/{pid}/status. But it cause large performance
> > > regression. Unfortunately /proc/{pid}/status is used ps command too and ps is
> > > one of most important component. Because both to take mmap_sem and page table walk
> > > are heavily operation.
> > > 
> > 
> > /proc/<pid>/status is IMHO not a performance relevant thing. The main
> > reason is to provide exact information about a running process.
> 
> No. You have to learn real world use case. if you think so, you should 
> change ps before submit this change. This patch obviously make harm than worth. 
> Nobody (except you) use it but everybody get regression.
> 

It is fascinating that every developer means that only his personal view
and requirements are the wisdom of the world.
   
> 
> > > Thus, this patch revert it. Fortunately /proc/{pid}/task/{tid}/smaps
> > > provide almost same information. we can use it.
> > 
> > Completely wrong. Where does smaps provides this kind of information?
> > Where is there the high water mark of the stack usage?
> 
> You have to see you patch itself. show_map_vma() isn't only used by /proc/pid/maps,
> but also be used by /proc/pid/smaps.
> 
> Now, /proc/{pid}/task/{tid}/smaps show following column.
> 
> 7fb97c181000-7fb97d1d1000 rw-p 00000000 00:00 0                          [threadstack:0084eff0]
> Size:              16704 kB
> Rss:                   8 kB
> Pss:                   8 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:         8 kB
> Referenced:            8 kB
> Swap:                  0 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> 
> It descibe
>   - This vma is thread stack
>   - vma size is 16704kB
>   - stack vsz size in vma is 0x0084eff0 (~ 8508kB)
>   - and, physical memory is used 8kB.
> 

But it don't describe the usage high water mark. With the information
provided by proc/*/smaps this is not possible. It is very funny to get
complains without checking. Your assertion is completely WRONG.

> 
> Anyway, I revert the regresstion patch as other regression patches. if you really want
> this feature, you have three options.
> 
>   1. create new /proc file instead to use /proc/pid/status.

This was discarded by Andrew. He prefered the inclusion
in /proc/pid/status. 

>   2. improve performance until typical use-case don't notice regression.

Not possible.

>   3. change ps and other /proc related userland implementation and resubmit this patch.

ps works quiet well.

> 
> But even if you choose anything, You have to test both its functional and performance
> _before_ submitting kernel patch.

Teach me master! Do you think i don't know that is coast something?.
Walking through the pages coast some runtime. But ps is not a
performance critical application nor a cat /proc/*/status!

Stefani



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-01 14:14   ` KOSAKI Motohiro
  2010-01-01 15:10     ` Stefani Seibold
@ 2010-01-01 15:49     ` Andi Kleen
  2010-01-01 16:09       ` Stefani Seibold
  1 sibling, 1 reply; 16+ messages in thread
From: Andi Kleen @ 2010-01-01 15:49 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Stefani Seibold, LKML, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> writes:
>
> Anyway, I revert the regresstion patch as other regression patches. if you really want
> this feature, you have three options.
>
>   1. create new /proc file instead to use /proc/pid/status.
>   2. improve performance until typical use-case don't notice regression.
>   3. change ps and other /proc related userland implementation and resubmit this patch.
>
> But even if you choose anything, You have to test both its functional and performance
> _before_ submitting kernel patch.


4. Leave everything alone (revert all the commits) and use a ptrace
based tool to get this information when you need it.

That is what I suggested during the original code review and I still think
it's the best solution for such a obscure problem.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-01 15:49     ` Andi Kleen
@ 2010-01-01 16:09       ` Stefani Seibold
  0 siblings, 0 replies; 16+ messages in thread
From: Stefani Seibold @ 2010-01-01 16:09 UTC (permalink / raw)
  To: Andi Kleen
  Cc: KOSAKI Motohiro, LKML, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

Am Freitag, den 01.01.2010, 16:49 +0100 schrieb Andi Kleen:
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> writes:
> >
> > Anyway, I revert the regresstion patch as other regression patches. if you really want
> > this feature, you have three options.
> >
> >   1. create new /proc file instead to use /proc/pid/status.
> >   2. improve performance until typical use-case don't notice regression.
> >   3. change ps and other /proc related userland implementation and resubmit this patch.
> >
> > But even if you choose anything, You have to test both its functional and performance
> > _before_ submitting kernel patch.
> 
> 
> 4. Leave everything alone (revert all the commits) and use a ptrace
> based tool to get this information when you need it.
> 

Without the stack_start and the fix for the bogus x86_64 KSTP_ESP it
wouldn't be possible.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-01 15:10     ` Stefani Seibold
@ 2010-01-01 22:05       ` Samuel Thibault
  2010-01-01 22:21         ` Stefani Seibold
  2010-01-02  1:42         ` [PATCH] proc: revert to show stack information in /proc/{pid}/status Andi Kleen
  0 siblings, 2 replies; 16+ messages in thread
From: Samuel Thibault @ 2010-01-01 22:05 UTC (permalink / raw)
  To: Stefani Seibold
  Cc: KOSAKI Motohiro, LKML, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

Stefani Seibold, le Fri 01 Jan 2010 16:10:16 +0100, a écrit :
> But ps is not a performance critical application nor a cat
> /proc/*/status!

Errr, maybe not so much as some other operations, but a lot of tools use
them, so it's really worth considering it.

Samuel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-01 22:05       ` Samuel Thibault
@ 2010-01-01 22:21         ` Stefani Seibold
  2010-01-02  5:53           ` KOSAKI Motohiro
  2010-01-02  1:42         ` [PATCH] proc: revert to show stack information in /proc/{pid}/status Andi Kleen
  1 sibling, 1 reply; 16+ messages in thread
From: Stefani Seibold @ 2010-01-01 22:21 UTC (permalink / raw)
  To: Samuel Thibault
  Cc: KOSAKI Motohiro, LKML, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

Am Freitag, den 01.01.2010, 23:05 +0100 schrieb Samuel Thibault:
> Stefani Seibold, le Fri 01 Jan 2010 16:10:16 +0100, a écrit :
> > But ps is not a performance critical application nor a cat
> > /proc/*/status!
> 
> Errr, maybe not so much as some other operations, but a lot of tools use
> them, so it's really worth considering it.
> 

Right. And i am still trying to find some optimization. Stay tuned.

Stefani



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-01 22:05       ` Samuel Thibault
  2010-01-01 22:21         ` Stefani Seibold
@ 2010-01-02  1:42         ` Andi Kleen
  1 sibling, 0 replies; 16+ messages in thread
From: Andi Kleen @ 2010-01-02  1:42 UTC (permalink / raw)
  To: Samuel Thibault
  Cc: Stefani Seibold, KOSAKI Motohiro, LKML, Ingo Molnar,
	Peter Zijlstra, Alexey Dobriyan, Eric W. Biederman, Randy Dunlap,
	Andrew Morton, torvalds

Samuel Thibault <samuel.thibault@ens-lyon.org> writes:

> Stefani Seibold, le Fri 01 Jan 2010 16:10:16 +0100, a écrit :
>> But ps is not a performance critical application nor a cat
>> /proc/*/status!
>
> Errr, maybe not so much as some other operations, but a lot of tools use
> them, so it's really worth considering it.

As an historical anecdote one of my first patches to Linux ever made
exactly the same mistake as Stefani's (adding mmap_sem to a proc file
read by proc). It was quickly reverted again by Linus when the
reports of ps not working under swapping poured in.

Same should be done with this patch (commit 9ebd4eba761b624a6a6c9189335adeddcb1fa0e0)

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in  /proc/{pid}/status
  2010-01-01 22:21         ` Stefani Seibold
@ 2010-01-02  5:53           ` KOSAKI Motohiro
  2010-01-02  8:26             ` Stefani Seibold
  2010-01-02 14:05             ` [PATCH] partial revert to show stack information in /proc/<pid>/status Stefani Seibold
  0 siblings, 2 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-01-02  5:53 UTC (permalink / raw)
  To: Stefani Seibold
  Cc: Samuel Thibault, LKML, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

2010/1/2 Stefani Seibold <stefani@seibold.net>:
> Am Freitag, den 01.01.2010, 23:05 +0100 schrieb Samuel Thibault:
>> Stefani Seibold, le Fri 01 Jan 2010 16:10:16 +0100, a écrit :
>> > But ps is not a performance critical application nor a cat
>> > /proc/*/status!
>>
>> Errr, maybe not so much as some other operations, but a lot of tools use
>> them, so it's really worth considering it.
>
> Right. And i am still trying to find some optimization. Stay tuned.

I can understand you feel sad. but don't worry. reverting is very usual event
on kernel development. You can retry anytime if you make optimized code.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in  /proc/{pid}/status
  2010-01-02  5:53           ` KOSAKI Motohiro
@ 2010-01-02  8:26             ` Stefani Seibold
  2010-01-02 14:05             ` [PATCH] partial revert to show stack information in /proc/<pid>/status Stefani Seibold
  1 sibling, 0 replies; 16+ messages in thread
From: Stefani Seibold @ 2010-01-02  8:26 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Samuel Thibault, LKML, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap, Andrew Morton

Am Samstag, den 02.01.2010, 14:53 +0900 schrieb KOSAKI Motohiro:
> 2010/1/2 Stefani Seibold <stefani@seibold.net>:
> > Am Freitag, den 01.01.2010, 23:05 +0100 schrieb Samuel Thibault:
> >> Stefani Seibold, le Fri 01 Jan 2010 16:10:16 +0100, a écrit :
> >> > But ps is not a performance critical application nor a cat
> >> > /proc/*/status!
> >>
> >> Errr, maybe not so much as some other operations, but a lot of tools use
> >> them, so it's really worth considering it.
> >
> > Right. And i am still trying to find some optimization. Stay tuned.
> 
> I can understand you feel sad. but don't worry. reverting is very usual event
> on kernel development. You can retry anytime if you make optimized code.

I had analyzed the problem and there is only one solution. Create a new
proc/<pid>/stack_usage entry, which was my first idea. This has absolut
now side effects.

I will write a patch and post it in the next days.

Stefani



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] partial revert to show stack information in  /proc/<pid>/status
  2010-01-02  5:53           ` KOSAKI Motohiro
  2010-01-02  8:26             ` Stefani Seibold
@ 2010-01-02 14:05             ` Stefani Seibold
  2010-01-05  5:24               ` KOSAKI Motohiro
  1 sibling, 1 reply; 16+ messages in thread
From: Stefani Seibold @ 2010-01-02 14:05 UTC (permalink / raw)
  To: LKML
  Cc: Samuel Thibault, Ingo Molnar, Peter Zijlstra, Alexey Dobriyan,
	Eric W. Biederman, Randy Dunlap, Andrew Morton, KOSAKI Motohiro

As announce here is the patch to partial revert the show stack
information patch due a not accepted performance regression. It will be
now show only the current stack usage, not the high water mark.

The path is only partial reverted because i need the other parts to do
it in an other way.

There are now two possibilities solutions:

- create a new /proc/<pid>/stackinfo entry, which provides the reverted
information and maybe others like the sigaltstack.
- create a user space tool which use /proc/<pid>/pagemap

In both cases the information of task->stack_start and the KSTK_ESP is
needed.

It will be also needed for an enhancement of the oom handler, where i
free unused stack pages (the pages before the stack pointer) under high
memory pressure. This is currently under work.

Andrew please apply this patch to 2.6.34-rc* tree.

Greetings,
Stefani

Signed-off-by: Stefani Seibold <stefani@seibold.net>
---
 Documentation/filesystems/proc.txt |    2 
 fs/proc/array.c                    |   87 ++-----------------------------------
 2 files changed, 8 insertions(+), 81 deletions(-)

diff -u -N -r -p linux-2.6.33-rc2.orig/fs/proc/array.c linux-2.6.33-rc2.new/fs/proc/array.c
--- linux-2.6.33-rc2.orig/fs/proc/array.c	2009-12-27 23:37:04.817427024 +0100
+++ linux-2.6.33-rc2.new/fs/proc/array.c	2010-01-02 14:36:53.794188418 +0100
@@ -327,93 +327,20 @@ static inline void task_context_switch_c
 			p->nivcsw);
 }
 
-#ifdef CONFIG_MMU
-
-struct stack_stats {
-	struct vm_area_struct *vma;
-	unsigned long	startpage;
-	unsigned long	usage;
-};
-
-static int stack_usage_pte_range(pmd_t *pmd, unsigned long addr,
-				unsigned long end, struct mm_walk *walk)
+static inline void task_show_stack_usage(struct seq_file *m,
+						struct task_struct *task)
 {
-	struct stack_stats *ss = walk->private;
-	struct vm_area_struct *vma = ss->vma;
-	pte_t *pte, ptent;
-	spinlock_t *ptl;
-	int ret = 0;
-
-	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
-	for (; addr != end; pte++, addr += PAGE_SIZE) {
-		ptent = *pte;
+	unsigned long		usage;
 
+	if (task->mm) {
 #ifdef CONFIG_STACK_GROWSUP
-		if (pte_present(ptent) || is_swap_pte(ptent))
-			ss->usage = addr - ss->startpage + PAGE_SIZE;
+		usage = KSTK_ESP(task) - task->stack_start;
 #else
-		if (pte_present(ptent) || is_swap_pte(ptent)) {
-			ss->usage = ss->startpage - addr + PAGE_SIZE;
-			pte++;
-			ret = 1;
-			break;
-		}
+		usage = task->stack_start - KSTK_ESP(task);
 #endif
+		seq_printf(m, "Stack usage:\t%lu kB\n", (usage + 1023) >> 10);
 	}
-	pte_unmap_unlock(pte - 1, ptl);
-	cond_resched();
-	return ret;
-}
-
-static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
-				struct task_struct *task)
-{
-	struct stack_stats ss;
-	struct mm_walk stack_walk = {
-		.pmd_entry = stack_usage_pte_range,
-		.mm = vma->vm_mm,
-		.private = &ss,
-	};
-
-	if (!vma->vm_mm || is_vm_hugetlb_page(vma))
-		return 0;
-
-	ss.vma = vma;
-	ss.startpage = task->stack_start & PAGE_MASK;
-	ss.usage = 0;
-
-#ifdef CONFIG_STACK_GROWSUP
-	walk_page_range(KSTK_ESP(task) & PAGE_MASK, vma->vm_end,
-		&stack_walk);
-#else
-	walk_page_range(vma->vm_start, (KSTK_ESP(task) & PAGE_MASK) + PAGE_SIZE,
-		&stack_walk);
-#endif
-	return ss.usage;
-}
-
-static inline void task_show_stack_usage(struct seq_file *m,
-						struct task_struct *task)
-{
-	struct vm_area_struct	*vma;
-	struct mm_struct	*mm = get_task_mm(task);
-
-	if (mm) {
-		down_read(&mm->mmap_sem);
-		vma = find_vma(mm, task->stack_start);
-		if (vma)
-			seq_printf(m, "Stack usage:\t%lu kB\n",
-				get_stack_usage_in_bytes(vma, task) >> 10);
-
-		up_read(&mm->mmap_sem);
-		mmput(mm);
-	}
-}
-#else
-static void task_show_stack_usage(struct seq_file *m, struct task_struct *task)
-{
 }
-#endif		/* CONFIG_MMU */
 
 static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
 {
diff -u -N -r -p linux-2.6.33-rc2.orig/Documentation/filesystems/proc.txt linux-2.6.33-rc2.new/Documentation/filesystems/proc.txt
--- linux-2.6.33-rc2.orig/Documentation/filesystems/proc.txt	2009-12-27 23:37:01.098310709 +0100
+++ linux-2.6.33-rc2.new/Documentation/filesystems/proc.txt	2010-01-02 14:30:39.059150340 +0100
@@ -231,7 +231,7 @@ Table 1-2: Contents of the statm files (
  Mems_allowed_list           Same as previous, but in "list format"
  voluntary_ctxt_switches     number of voluntary context switches
  nonvoluntary_ctxt_switches  number of non voluntary context switches
- Stack usage:                stack usage high water mark (round up to page size)
+ Stack usage:                stack usage in kB
 ..............................................................................
 
 Table 1-3: Contents of the statm files (as of 2.6.8-rc3)



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] partial revert to show stack information in  /proc/<pid>/status
  2010-01-02 14:05             ` [PATCH] partial revert to show stack information in /proc/<pid>/status Stefani Seibold
@ 2010-01-05  5:24               ` KOSAKI Motohiro
  0 siblings, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-01-05  5:24 UTC (permalink / raw)
  To: Stefani Seibold
  Cc: kosaki.motohiro, LKML, Samuel Thibault, Ingo Molnar,
	Peter Zijlstra, Alexey Dobriyan, Eric W. Biederman, Randy Dunlap,
	Andrew Morton

> As announce here is the patch to partial revert the show stack
> information patch due a not accepted performance regression. It will be
> now show only the current stack usage, not the high water mark.
> 
> The path is only partial reverted because i need the other parts to do
> it in an other way.
> 
> There are now two possibilities solutions:
> 
> - create a new /proc/<pid>/stackinfo entry, which provides the reverted
> information and maybe others like the sigaltstack.
> - create a user space tool which use /proc/<pid>/pagemap
> 
> In both cases the information of task->stack_start and the KSTK_ESP is
> needed.
> 
> It will be also needed for an enhancement of the oom handler, where i
> free unused stack pages (the pages before the stack pointer) under high
> memory pressure. This is currently under work.

This explanation seems still strange. both task->stack_start and the KSTK_ESP are already exported
by /proc/{pid}/task/{tid}/stat. and This patch assume KSTK_ESP(task) - task->stack_start mean
StackUsage but it isn't the same as many people's assumption.

However, useless feature better than regression. probably I can ack this.


> 
> Andrew please apply this patch to 2.6.34-rc* tree.
> 
> Greetings,
> Stefani
> 
> Signed-off-by: Stefani Seibold <stefani@seibold.net>
> ---
>  Documentation/filesystems/proc.txt |    2 
>  fs/proc/array.c                    |   87 ++-----------------------------------
>  2 files changed, 8 insertions(+), 81 deletions(-)
> 
> diff -u -N -r -p linux-2.6.33-rc2.orig/fs/proc/array.c linux-2.6.33-rc2.new/fs/proc/array.c
> --- linux-2.6.33-rc2.orig/fs/proc/array.c	2009-12-27 23:37:04.817427024 +0100
> +++ linux-2.6.33-rc2.new/fs/proc/array.c	2010-01-02 14:36:53.794188418 +0100
> @@ -327,93 +327,20 @@ static inline void task_context_switch_c
>  			p->nivcsw);
>  }
>  
> -#ifdef CONFIG_MMU
> -
> -struct stack_stats {
> -	struct vm_area_struct *vma;
> -	unsigned long	startpage;
> -	unsigned long	usage;
> -};
> -
> -static int stack_usage_pte_range(pmd_t *pmd, unsigned long addr,
> -				unsigned long end, struct mm_walk *walk)
> +static inline void task_show_stack_usage(struct seq_file *m,
> +						struct task_struct *task)
>  {
> -	struct stack_stats *ss = walk->private;
> -	struct vm_area_struct *vma = ss->vma;
> -	pte_t *pte, ptent;
> -	spinlock_t *ptl;
> -	int ret = 0;
> -
> -	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> -	for (; addr != end; pte++, addr += PAGE_SIZE) {
> -		ptent = *pte;
> +	unsigned long		usage;
>  
> +	if (task->mm) {
>  #ifdef CONFIG_STACK_GROWSUP
> -		if (pte_present(ptent) || is_swap_pte(ptent))
> -			ss->usage = addr - ss->startpage + PAGE_SIZE;
> +		usage = KSTK_ESP(task) - task->stack_start;
>  #else
> -		if (pte_present(ptent) || is_swap_pte(ptent)) {
> -			ss->usage = ss->startpage - addr + PAGE_SIZE;
> -			pte++;
> -			ret = 1;
> -			break;
> -		}
> +		usage = task->stack_start - KSTK_ESP(task);
>  #endif
> +		seq_printf(m, "Stack usage:\t%lu kB\n", (usage + 1023) >> 10);
>  	}
> -	pte_unmap_unlock(pte - 1, ptl);
> -	cond_resched();
> -	return ret;
> -}
> -
> -static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
> -				struct task_struct *task)
> -{
> -	struct stack_stats ss;
> -	struct mm_walk stack_walk = {
> -		.pmd_entry = stack_usage_pte_range,
> -		.mm = vma->vm_mm,
> -		.private = &ss,
> -	};
> -
> -	if (!vma->vm_mm || is_vm_hugetlb_page(vma))
> -		return 0;
> -
> -	ss.vma = vma;
> -	ss.startpage = task->stack_start & PAGE_MASK;
> -	ss.usage = 0;
> -
> -#ifdef CONFIG_STACK_GROWSUP
> -	walk_page_range(KSTK_ESP(task) & PAGE_MASK, vma->vm_end,
> -		&stack_walk);
> -#else
> -	walk_page_range(vma->vm_start, (KSTK_ESP(task) & PAGE_MASK) + PAGE_SIZE,
> -		&stack_walk);
> -#endif
> -	return ss.usage;
> -}
> -
> -static inline void task_show_stack_usage(struct seq_file *m,
> -						struct task_struct *task)
> -{
> -	struct vm_area_struct	*vma;
> -	struct mm_struct	*mm = get_task_mm(task);
> -
> -	if (mm) {
> -		down_read(&mm->mmap_sem);
> -		vma = find_vma(mm, task->stack_start);
> -		if (vma)
> -			seq_printf(m, "Stack usage:\t%lu kB\n",
> -				get_stack_usage_in_bytes(vma, task) >> 10);
> -
> -		up_read(&mm->mmap_sem);
> -		mmput(mm);
> -	}
> -}
> -#else
> -static void task_show_stack_usage(struct seq_file *m, struct task_struct *task)
> -{
>  }
> -#endif		/* CONFIG_MMU */
>  
>  static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
>  {
> diff -u -N -r -p linux-2.6.33-rc2.orig/Documentation/filesystems/proc.txt linux-2.6.33-rc2.new/Documentation/filesystems/proc.txt
> --- linux-2.6.33-rc2.orig/Documentation/filesystems/proc.txt	2009-12-27 23:37:01.098310709 +0100
> +++ linux-2.6.33-rc2.new/Documentation/filesystems/proc.txt	2010-01-02 14:30:39.059150340 +0100
> @@ -231,7 +231,7 @@ Table 1-2: Contents of the statm files (
>   Mems_allowed_list           Same as previous, but in "list format"
>   voluntary_ctxt_switches     number of voluntary context switches
>   nonvoluntary_ctxt_switches  number of non voluntary context switches
> - Stack usage:                stack usage high water mark (round up to page size)
> + Stack usage:                stack usage in kB
>  ..............................................................................
>  
>  Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
> 
> 




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2009-12-31 14:12 [PATCH] proc: revert to show stack information in /proc/{pid}/status KOSAKI Motohiro
  2009-12-31 15:48 ` Stefani Seibold
@ 2010-01-07 21:52 ` Andrew Morton
  2010-01-08  0:21   ` KOSAKI Motohiro
  1 sibling, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2010-01-07 21:52 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, Stefani Seibold, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap

On Thu, 31 Dec 2009 23:12:22 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> Thus, this patch revert it.

OK, thanks, I applied the full revert.

We can look at a new and less costly implementation of this feature for
2.6.34 if desired.  In which case, all we have done is to delay the feature
by one release, which is not a big deal.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-07 21:52 ` Andrew Morton
@ 2010-01-08  0:21   ` KOSAKI Motohiro
  2010-01-08  0:34     ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2010-01-08  0:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kosaki.motohiro, LKML, Stefani Seibold, Ingo Molnar,
	Peter Zijlstra, Alexey Dobriyan, Eric W. Biederman, Randy Dunlap

> On Thu, 31 Dec 2009 23:12:22 +0900 (JST)
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> 
> > Thus, this patch revert it.
> 
> OK, thanks, I applied the full revert.
> 
> We can look at a new and less costly implementation of this feature for
> 2.6.34 if desired.  In which case, all we have done is to delay the feature
> by one release, which is not a big deal.

Thanks.

note: my patch isn't full revert too. Commit d899bf7b introduced two
feature (btw, I don't like one patch have two feature).

 1) Add the annotattion of [thread stack: xxxx] mark to
    /proc/{pid}/task/{tid}/maps.
 2) Add StackUsage field to /proc/{pid}/status.

I only revert (2), because I haven't seen (1) cause regression.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] proc: revert to show stack information in /proc/{pid}/status
  2010-01-08  0:21   ` KOSAKI Motohiro
@ 2010-01-08  0:34     ` Andrew Morton
  0 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2010-01-08  0:34 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, Stefani Seibold, Ingo Molnar, Peter Zijlstra,
	Alexey Dobriyan, Eric W. Biederman, Randy Dunlap

On Fri,  8 Jan 2010 09:21:30 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> > On Thu, 31 Dec 2009 23:12:22 +0900 (JST)
> > KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > 
> > > Thus, this patch revert it.
> > 
> > OK, thanks, I applied the full revert.
> > 
> > We can look at a new and less costly implementation of this feature for
> > 2.6.34 if desired.  In which case, all we have done is to delay the feature
> > by one release, which is not a big deal.
> 
> Thanks.
> 
> note: my patch isn't full revert too. Commit d899bf7b introduced two
> feature (btw, I don't like one patch have two feature).
> 
>  1) Add the annotattion of [thread stack: xxxx] mark to
>     /proc/{pid}/task/{tid}/maps.
>  2) Add StackUsage field to /proc/{pid}/status.
> 
> I only revert (2), because I haven't seen (1) cause regression.
> 
> 

Ah, thanks.  I've updated the patch title and changelog to reflect that.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2010-01-08  0:35 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-31 14:12 [PATCH] proc: revert to show stack information in /proc/{pid}/status KOSAKI Motohiro
2009-12-31 15:48 ` Stefani Seibold
2010-01-01 14:14   ` KOSAKI Motohiro
2010-01-01 15:10     ` Stefani Seibold
2010-01-01 22:05       ` Samuel Thibault
2010-01-01 22:21         ` Stefani Seibold
2010-01-02  5:53           ` KOSAKI Motohiro
2010-01-02  8:26             ` Stefani Seibold
2010-01-02 14:05             ` [PATCH] partial revert to show stack information in /proc/<pid>/status Stefani Seibold
2010-01-05  5:24               ` KOSAKI Motohiro
2010-01-02  1:42         ` [PATCH] proc: revert to show stack information in /proc/{pid}/status Andi Kleen
2010-01-01 15:49     ` Andi Kleen
2010-01-01 16:09       ` Stefani Seibold
2010-01-07 21:52 ` Andrew Morton
2010-01-08  0:21   ` KOSAKI Motohiro
2010-01-08  0:34     ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox