* [merged] proctxt-update-kernel-filesystem-proctxt-documentation.patch removed from -mm tree
@ 2009-06-18 22:43 akpm
From: akpm @ 2009-06-18 22:43 UTC (permalink / raw)
To: stefani, randy.dunlap, mm-commits
The patch titled
proc.txt: update kernel filesystem/proc.txt documentation
has been removed from the -mm tree. Its filename was
proctxt-update-kernel-filesystem-proctxt-documentation.patch
This patch was dropped because it was merged into mainline or a subsystem tree
The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/
------------------------------------------------------
Subject: proc.txt: update kernel filesystem/proc.txt documentation
From: Stefani Seibold <stefani@seibold.net>
An update for the "Process-Specific Subdirectories" section to reflect the
changes till kernel 2.6.30.
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/filesystems/proc.txt | 242 +++++++++++++++++++++------
1 file changed, 190 insertions(+), 52 deletions(-)
diff -puN Documentation/filesystems/proc.txt~proctxt-update-kernel-filesystem-proctxt-documentation Documentation/filesystems/proc.txt
--- a/Documentation/filesystems/proc.txt~proctxt-update-kernel-filesystem-proctxt-documentation
+++ a/Documentation/filesystems/proc.txt
@@ -5,11 +5,12 @@
Bodo Bauer <bb@ricochet.net>
2.4.x update Jorge Nerin <comandante@zaralinux.com> November 14 2000
-move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
+move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
------------------------------------------------------------------------------
Version 1.3 Kernel version 2.2.12
Kernel version 2.4.0-test11-pre4
------------------------------------------------------------------------------
+fixes/update part 1.1 Stefani Seibold <stefani@seibold.net> June 9 2009
Table of Contents
-----------------
@@ -116,7 +117,7 @@ The link self points to the process
subdirectory has the entries listed in Table 1-1.
-Table 1-1: Process specific entries in /proc
+Table 1-1: Process specific entries in /proc
..............................................................................
File Content
clear_refs Clears page referenced bits shown in smaps output
@@ -134,46 +135,103 @@ Table 1-1: Process specific entries in /
status Process status in human readable form
wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan
stack Report full stack trace, enable via CONFIG_STACKTRACE
- smaps Extension based on maps, the rss size for each mapped file
+ smaps a extension based on maps, showing the memory consumption of
+ each mapping
..............................................................................
For example, to get the status information of a process, all you have to do is
read the file /proc/PID/status:
- >cat /proc/self/status
- Name: cat
- State: R (running)
- Pid: 5452
- PPid: 743
+ >cat /proc/self/status
+ Name: cat
+ State: R (running)
+ Tgid: 5452
+ Pid: 5452
+ PPid: 743
TracerPid: 0 (2.4)
- Uid: 501 501 501 501
- Gid: 100 100 100 100
- Groups: 100 14 16
- VmSize: 1112 kB
- VmLck: 0 kB
- VmRSS: 348 kB
- VmData: 24 kB
- VmStk: 12 kB
- VmExe: 8 kB
- VmLib: 1044 kB
- SigPnd: 0000000000000000
- SigBlk: 0000000000000000
- SigIgn: 0000000000000000
- SigCgt: 0000000000000000
- CapInh: 00000000fffffeff
- CapPrm: 0000000000000000
- CapEff: 0000000000000000
-
+ Uid: 501 501 501 501
+ Gid: 100 100 100 100
+ FDSize: 256
+ Groups: 100 14 16
+ VmPeak: 5004 kB
+ VmSize: 5004 kB
+ VmLck: 0 kB
+ VmHWM: 476 kB
+ VmRSS: 476 kB
+ VmData: 156 kB
+ VmStk: 88 kB
+ VmExe: 68 kB
+ VmLib: 1412 kB
+ VmPTE: 20 kb
+ Threads: 1
+ SigQ: 0/28578
+ SigPnd: 0000000000000000
+ ShdPnd: 0000000000000000
+ SigBlk: 0000000000000000
+ SigIgn: 0000000000000000
+ SigCgt: 0000000000000000
+ CapInh: 00000000fffffeff
+ CapPrm: 0000000000000000
+ CapEff: 0000000000000000
+ CapBnd: ffffffffffffffff
+ voluntary_ctxt_switches: 0
+ nonvoluntary_ctxt_switches: 1
This shows you nearly the same information you would get if you viewed it with
the ps command. In fact, ps uses the proc file system to obtain its
-information. The statm file contains more detailed information about the
-process memory usage. Its seven fields are explained in Table 1-2. The stat
-file contains details information about the process itself. Its fields are
-explained in Table 1-3.
+information. But you get a more detailed view of the process by reading the
+file /proc/PID/status. It fields are described in table 1-2.
+The statm file contains more detailed information about the process
+memory usage. Its seven fields are explained in Table 1-3. The stat file
+contains details information about the process itself. Its fields are
+explained in Table 1-4.
+
+Table 1-2: Contents of the status files (as of 2.6.30-rc7)
+..............................................................................
+ Field Content
+ Name filename of the executable
+ State state (R is running, S is sleeping, D is sleeping
+ in an uninterruptible wait, Z is zombie,
+ T is traced or stopped)
+ Tgid thread group ID
+ Pid process id
+ PPid process id of the parent process
+ TracerPid PID of process tracing this process (0 if not)
+ Uid Real, effective, saved set, and file system UIDs
+ Gid Real, effective, saved set, and file system GIDs
+ FDSize number of file descriptor slots currently allocated
+ Groups supplementary group list
+ VmPeak peak virtual memory size
+ VmSize total program size
+ VmLck locked memory size
+ VmHWM peak resident set size ("high water mark")
+ VmRSS size of memory portions
+ VmData size of data, stack, and text segments
+ VmStk size of data, stack, and text segments
+ VmExe size of text segment
+ VmLib size of shared library code
+ VmPTE size of page table entries
+ Threads number of threads
+ SigQ number of signals queued/max. number for queue
+ SigPnd bitmap of pending signals for the thread
+ ShdPnd bitmap of shared pending signals for the process
+ SigBlk bitmap of blocked signals
+ SigIgn bitmap of ignored signals
+ SigCgt bitmap of catched signals
+ CapInh bitmap of inheritable capabilities
+ CapPrm bitmap of permitted capabilities
+ CapEff bitmap of effective capabilities
+ CapBnd bitmap of capabilities bounding set
+ Cpus_allowed mask of CPUs on which this process may run
+ Cpus_allowed_list Same as previous, but in "list format"
+ Mems_allowed mask of memory nodes allowed to this process
+ Mems_allowed_list Same as previous, but in "list format"
+ voluntary_ctxt_switches number of voluntary context switches
+ nonvoluntary_ctxt_switches number of non voluntary context switches
+..............................................................................
-Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
+Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
..............................................................................
Field Content
size total program size (pages) (same as VmSize in status)
@@ -188,7 +246,7 @@ Table 1-2: Contents of the statm files (
..............................................................................
-Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
+Table 1-4: Contents of the stat files (as of 2.6.30-rc7)
..............................................................................
Field Content
pid process id
@@ -222,10 +280,10 @@ Table 1-3: Contents of the stat files (a
start_stack address of the start of the stack
esp current value of ESP
eip current value of EIP
- pending bitmap of pending signals (obsolete)
- blocked bitmap of blocked signals (obsolete)
- sigign bitmap of ignored signals (obsolete)
- sigcatch bitmap of catched signals (obsolete)
+ pending bitmap of pending signals
+ blocked bitmap of blocked signals
+ sigign bitmap of ignored signals
+ sigcatch bitmap of catched signals
wchan address where process went to sleep
0 (place holder)
0 (place holder)
@@ -234,19 +292,99 @@ Table 1-3: Contents of the stat files (a
rt_priority realtime priority
policy scheduling policy (man sched_setscheduler)
blkio_ticks time spent waiting for block IO
+ gtime guest time of the task in jiffies
+ cgtime guest time of the task children in jiffies
..............................................................................
+The /proc/PID/map file containing the currently mapped memory regions and
+their access permissions.
+
+The format is:
+
+address perms offset dev inode pathname
+
+08048000-08049000 r-xp 00000000 03:00 8312 /opt/test
+08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test
+0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
+a7cb1000-a7cb2000 ---p 00000000 00:00 0
+a7cb2000-a7eb2000 rw-p 00000000 00:00 0
+a7eb2000-a7eb3000 ---p 00000000 00:00 0
+a7eb3000-a7ed5000 rw-p 00000000 00:00 0
+a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6
+a8008000-a800a000 r--p 00133000 03:00 4222 /lib/libc.so.6
+a800a000-a800b000 rw-p 00135000 03:00 4222 /lib/libc.so.6
+a800b000-a800e000 rw-p 00000000 00:00 0
+a800e000-a8022000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0
+a8022000-a8023000 r--p 00013000 03:00 14462 /lib/libpthread.so.0
+a8023000-a8024000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0
+a8024000-a8027000 rw-p 00000000 00:00 0
+a8027000-a8043000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2
+a8043000-a8044000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2
+a8044000-a8045000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2
+aff35000-aff4a000 rw-p 00000000 00:00 0 [stack]
+ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
+
+where "address" is the address space in the process that it occupies, "perms"
+is a set of permissions:
+
+ r = read
+ w = write
+ x = execute
+ s = shared
+ p = private (copy on write)
+
+"offset" is the offset into the mapping, "dev" is the device (major:minor), and
+"inode" is the inode on that device. 0 indicates that no inode is associated
+with the memory region, as the case would be with BSS (uninitialized data).
+The "pathname" shows the name associated file for this mapping. If the mapping
+is not associated with a file:
+
+ [heap] = the heap of the program
+ [stack] = the stack of the main process
+ [vdso] = the "virtual dynamic shared object",
+ the kernel system call handler
+
+ or if empty, the mapping is anonymous.
+
+
+The /proc/PID/smaps is an extension based on maps, showing the memory
+consumption for each of the process's mappings. For each of mappings there
+is a series of lines such as the following:
+
+08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash
+Size: 1084 kB
+Rss: 892 kB
+Pss: 374 kB
+Shared_Clean: 892 kB
+Shared_Dirty: 0 kB
+Private_Clean: 0 kB
+Private_Dirty: 0 kB
+Referenced: 892 kB
+Swap: 0 kB
+KernelPageSize: 4 kB
+MMUPageSize: 4 kB
+
+The first of these lines shows the same information as is displayed for the
+mapping in /proc/PID/maps. The remaining lines show the size of the mapping,
+the amount of the mapping that is currently resident in RAM, the "proportional
+set size" (divide each shared page by the number of processes sharing it), the
+number of clean and dirty shared pages in the mapping, and the number of clean
+and dirty private pages in the mapping. The "Referenced" indicates the amount
+of memory currently marked as referenced or accessed.
+
+This file is only present if the CONFIG_MMU kernel configuration option is
+enabled.
1.2 Kernel data
---------------
Similar to the process entries, the kernel data files give information about
the running kernel. The files used to obtain this information are contained in
-/proc and are listed in Table 1-4. Not all of these will be present in your
+/proc and are listed in Table 1-5. Not all of these will be present in your
system. It depends on the kernel configuration and the loaded modules, which
files are there, and which are missing.
-Table 1-4: Kernel info in /proc
+Table 1-5: Kernel info in /proc
..............................................................................
File Content
apm Advanced power management info
@@ -634,10 +772,10 @@ IDE devices:
More detailed information can be found in the controller specific
subdirectories. These are named ide0, ide1 and so on. Each of these
-directories contains the files shown in table 1-5.
+directories contains the files shown in table 1-6.
-Table 1-5: IDE controller info in /proc/ide/ide?
+Table 1-6: IDE controller info in /proc/ide/ide?
..............................................................................
File Content
channel IDE channel (0 or 1)
@@ -647,11 +785,11 @@ Table 1-5: IDE controller info in /proc
..............................................................................
Each device connected to a controller has a separate subdirectory in the
-controllers directory. The files listed in table 1-6 are contained in these
+controllers directory. The files listed in table 1-7 are contained in these
directories.
-Table 1-6: IDE device information
+Table 1-7: IDE device information
..............................................................................
File Content
cache The cache
@@ -693,12 +831,12 @@ the drive parameters:
1.4 Networking info in /proc/net
--------------------------------
-The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the
+The subdirectory /proc/net follows the usual pattern. Table 1-8 shows the
additional values you get for IP version 6 if you configure the kernel to
-support this. Table 1-7 lists the files and their meaning.
+support this. Table 1-9 lists the files and their meaning.
-Table 1-6: IPv6 info in /proc/net
+Table 1-8: IPv6 info in /proc/net
..............................................................................
File Content
udp6 UDP sockets (IPv6)
@@ -713,7 +851,7 @@ Table 1-6: IPv6 info in /proc/net
..............................................................................
-Table 1-7: Network info in /proc/net
+Table 1-9: Network info in /proc/net
..............................................................................
File Content
arp Kernel ARP table
@@ -837,10 +975,10 @@ The directory /proc/parport contains i
your system. It has one subdirectory for each port, named after the port
number (0,1,2,...).
-These directories contain the four files shown in Table 1-8.
+These directories contain the four files shown in Table 1-10.
-Table 1-8: Files in /proc/parport
+Table 1-10: Files in /proc/parport
..............................................................................
File Content
autoprobe Any IEEE-1284 device ID information that has been acquired.
@@ -858,10 +996,10 @@ Table 1-8: Files in /proc/parport
Information about the available and actually used tty's can be found in the
directory /proc/tty.You'll find entries for drivers and line disciplines in
-this directory, as shown in Table 1-9.
+this directory, as shown in Table 1-11.
-Table 1-9: Files in /proc/tty
+Table 1-11: Files in /proc/tty
..............................................................................
File Content
drivers list of drivers and their usage
@@ -952,9 +1090,9 @@ Information about mounted ext4 file syst
/proc/fs/ext4. Each mounted filesystem will have a directory in
/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or
/proc/fs/ext4/dm-0). The files in each per-device directory are shown
-in Table 1-10, below.
+in Table 1-12, below.
-Table 1-10: Files in /proc/fs/ext4/<devname>
+Table 1-12: Files in /proc/fs/ext4/<devname>
..............................................................................
File Content
mb_groups details of multiblock allocator buddy cache of free blocks
_
Patches currently in -mm which might be from stefani@seibold.net are
origin.patch
procfs-provide-stack-information-for-threads-v08.patch
* Re: [merged] proctxt-update-kernel-filesystem-proctxt-documentation.patch removed from -mm tree
From: Stefani Seibold @ 2009-06-24 6:45 UTC (permalink / raw)
To: Andrew Morton, Alexey Dobriyan
Cc: Eric W. Biederman, linux-kernel, Peter Zijlstra, Ingo Molnar
Am Dienstag, den 23.06.2009, 23:32 -0700 schrieb Andrew Morton:
> On Wed, 24 Jun 2009 08:20:44 +0200 Stefani Seibold <stefani@seibold.net> wrote:
>
> > what is with the associated
> > procfs-provide-stack-information-for-threads-v08.patch
> > patch?
> >
> > There was no real objections against this patch, so why not merge it for
> > 2.6.31?
>
> Alexey pointed out that it doesn't actually work.
That is not true... it works. With my patch the kernel knows exactly
where the thread stack is, and therefore it is easy to determine the
associated map.
* Re: [merged] proctxt-update-kernel-filesystem-proctxt-documentation.patch removed from -mm tree
From: Andrew Morton @ 2009-06-24 7:13 UTC (permalink / raw)
To: Stefani Seibold
Cc: Alexey Dobriyan, Eric W. Biederman, linux-kernel, Peter Zijlstra,
Ingo Molnar
On Wed, 24 Jun 2009 08:45:03 +0200 Stefani Seibold <stefani@seibold.net> wrote:
> Am Dienstag, den 23.06.2009, 23:32 -0700 schrieb Andrew Morton:
> > On Wed, 24 Jun 2009 08:20:44 +0200 Stefani Seibold <stefani@seibold.net> wrote:
> >
> > > what is with the associated
> > > procfs-provide-stack-information-for-threads-v08.patch
> > > patch?
> > >
> > > There was no real objections against this patch, so why not merge it for
> > > 2.6.31?
> >
> > Alexey pointed out that it doesn't actually work.
>
> That is not true... it works. With my patch the kernel knows exactly
> where the thread stack is, and therefore it is easy to determine the
> associated map.
>
On Tue, 16 Jun 2009 02:33:33 +0400 Alexey Dobriyan <adobriyan@gmail.com> wrote:
> On Mon, Jun 15, 2009 at 03:02:05PM -0700, akpm@linux-foundation.org wrote:
> > procfs-provide-stack-information-for-threads-v08.patch
> > --- a/fs/proc/array.c~procfs-provide-stack-information-for-threads-v08
>
> > +++ a/fs/proc/array.c
> > @@ -321,6 +321,54 @@ static inline void task_context_switch_c
> > p->nivcsw);
> > }
> >
> > +static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
> > + struct task_struct *p)
> > +{
> > + unsigned long i;
> > + struct page *page;
> > + unsigned long stkpage;
> > +
> > + stkpage = KSTK_ESP(p) & PAGE_MASK;
> > +
> > +#ifdef CONFIG_STACK_GROWSUP
> > + for (i = vma->vm_end; i-PAGE_SIZE > stkpage; i -= PAGE_SIZE) {
> > +
> > + page = follow_page(vma, i-PAGE_SIZE, 0);
>
> How can this work?
>
> If stack page got swapped out, you'll get smaller than actual result.
Alexey's point is that follow_page() will return NULL if it hits a
swapped-out stack page and the loop will exit, leading to an incorrect
(ie: short) return value from get_stack_usage_in_bytes().
Is this claim wrong?
* Re: [merged] proctxt-update-kernel-filesystem-proctxt-documentation.patch removed from -mm tree
From: Eric W. Biederman @ 2009-06-24 7:35 UTC (permalink / raw)
To: Andrew Morton
Cc: Stefani Seibold, Alexey Dobriyan, linux-kernel, Peter Zijlstra,
Ingo Molnar
Andrew Morton <akpm@linux-foundation.org> writes:
> On Wed, 24 Jun 2009 08:45:03 +0200 Stefani Seibold <stefani@seibold.net> wrote:
>
>> Am Dienstag, den 23.06.2009, 23:32 -0700 schrieb Andrew Morton:
>> > On Wed, 24 Jun 2009 08:20:44 +0200 Stefani Seibold <stefani@seibold.net> wrote:
>> >
>> > > what is with the associated
>> > > procfs-provide-stack-information-for-threads-v08.patch
>> > > patch?
>> > >
>> > > There was no real objections against this patch, so why not merge it for
>> > > 2.6.31?
>> >
>> > Alexey pointed out that it doesn't actually work.
>>
>> That is not true... it works. With my patch the kernel knows exactly
>> where the thread stack is, and therefore it is easy to determine the
>> associated map.
Usually yes, but not in all cases.
> On Tue, 16 Jun 2009 02:33:33 +0400 Alexey Dobriyan <adobriyan@gmail.com> wrote:
>
>> On Mon, Jun 15, 2009 at 03:02:05PM -0700, akpm@linux-foundation.org wrote:
>> > procfs-provide-stack-information-for-threads-v08.patch
>> > --- a/fs/proc/array.c~procfs-provide-stack-information-for-threads-v08
>>
>> > +++ a/fs/proc/array.c
>> > @@ -321,6 +321,54 @@ static inline void task_context_switch_c
>> > p->nivcsw);
>> > }
>> >
>> > +static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
>> > + struct task_struct *p)
>> > +{
>> > + unsigned long i;
>> > + struct page *page;
>> > + unsigned long stkpage;
>> > +
>> > + stkpage = KSTK_ESP(p) & PAGE_MASK;
>> > +
>> > +#ifdef CONFIG_STACK_GROWSUP
>> > + for (i = vma->vm_end; i-PAGE_SIZE > stkpage; i -= PAGE_SIZE) {
>> > +
>> > + page = follow_page(vma, i-PAGE_SIZE, 0);
>>
>> How can this work?
>>
>> If stack page got swapped out, you'll get smaller than actual result.
>
> Alexey's point is that follow_page() will return NULL if it hits a
> swapped-out stack page and the loop will exit, leading to an incorrect
> (ie: short) return value from get_stack_usage_in_bytes().
>
> Is this claim wrong?
Add to that the code is unnecessarily complicated.
The patch mixes several different changes together. It deserves being
broken up into at least two patches.
I am concerned about the performance. Glibc opens /proc/self/maps in
practically every application so doing something like following page
tables requires testing and verifying the performance.
Eric
* Re: [merged] proctxt-update-kernel-filesystem-proctxt-documentation.patch removed from -mm tree
From: Stefani Seibold @ 2009-06-24 9:33 UTC (permalink / raw)
To: Eric W. Biederman, Andrew Morton
Cc: Alexey Dobriyan, linux-kernel, Peter Zijlstra, Ingo Molnar
Am Mittwoch, den 24.06.2009, 00:35 -0700 schrieb Eric W. Biederman:
> Andrew Morton <akpm@linux-foundation.org> writes:
>
> > On Wed, 24 Jun 2009 08:45:03 +0200 Stefani Seibold <stefani@seibold.net> wrote:
> >
> >> Am Dienstag, den 23.06.2009, 23:32 -0700 schrieb Andrew Morton:
> >> > On Wed, 24 Jun 2009 08:20:44 +0200 Stefani Seibold <stefani@seibold.net> wrote:
> >> >
> >> > > what is with the associated
> >> > > procfs-provide-stack-information-for-threads-v08.patch
> >> > > patch?
> >> > >
> >> > > There was no real objections against this patch, so why not merge it for
> >> > > 2.6.31?
> >> >
> >> > Alexey pointed out that it doesn't actually work.
> >>
> >> That is not true... it works. With my patch the kernel knows exactly
> >> where the thread stack is, and therefore it is easy to determine the
> >> associated map.
>
> Usually yes, but not in all cases.
Which cases? The only way I know of is to set the stack pointer to an
arbitrary place in user space, and this is not a common use case.
>
>
> > On Tue, 16 Jun 2009 02:33:33 +0400 Alexey Dobriyan <adobriyan@gmail.com> wrote:
> >
> >> On Mon, Jun 15, 2009 at 03:02:05PM -0700, akpm@linux-foundation.org wrote:
> >> > procfs-provide-stack-information-for-threads-v08.patch
> >> > --- a/fs/proc/array.c~procfs-provide-stack-information-for-threads-v08
> >>
> >> > +++ a/fs/proc/array.c
> >> > @@ -321,6 +321,54 @@ static inline void task_context_switch_c
> >> > p->nivcsw);
> >> > }
> >> >
> >> > +static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
> >> > + struct task_struct *p)
> >> > +{
> >> > + unsigned long i;
> >> > + struct page *page;
> >> > + unsigned long stkpage;
> >> > +
> >> > + stkpage = KSTK_ESP(p) & PAGE_MASK;
> >> > +
> >> > +#ifdef CONFIG_STACK_GROWSUP
> >> > + for (i = vma->vm_end; i-PAGE_SIZE > stkpage; i -= PAGE_SIZE) {
> >> > +
> >> > + page = follow_page(vma, i-PAGE_SIZE, 0);
> >>
> >> How can this work?
> >>
I replied with a message proposing a solution to this problem, but I got no answer.
>
> >> If stack page got swapped out, you'll get smaller than actual result.
> >
> > Alexey's point is that follow_page() will return NULL if it hits a
> > swapped-out stack page and the loop will exit, leading to an incorrect
> > (ie: short) return value from get_stack_usage_in_bytes().
> >
> > Is this claim wrong?
>
No.
I dug through the kernel source, and the only solution I found is to use
walk_page_range(), as show_smap() in fs/proc/task_mmu.c does.
Maybe there is an easier way, but I don't know of one.
So I would implement a function similar to smaps_pte_range() in
fs/proc/task_mmu.c to detect the high water usage.
>
> Add to that the code is unnecessarily complicated.
>
I don't like statements like that without an explanation.
> The patch mixes several different changes together. It deserves being
> broken up into at least two patches.
>
Everybody tells me a different way to do a patch. Which one is the right
way: Ingo's, Andrew's or yours?
And it is a question of time if you are a hacker girl who is not a
full-time Linux kernel developer.
> I am concerned about the performance. Glibc opens /proc/self/maps in
> practically every application so doing something like following page
> tables requires testing and verifying the performance.
>
I understand your concern; that is why I display the stack usage
high water mark only in /proc/<pid>/status, which is normally a
human interface.
/proc/<pid>/maps and smaps only show where the thread stack resides
and the maximum stack size, which is only a simple subtraction.
The reason for displaying the maximum size is that the stack start isn't
equal to the map start address.
> Eric
Stefani
Write a patch: 16 hours
To get a patch into the kernel: 16 days
Overhead: 800 percent
* [patch 2/2] procfs: provide stack information for threads V0.9
From: Stefani Seibold @ 2009-06-24 12:03 UTC (permalink / raw)
To: Andrew Morton, linux-kernel, Eric W. Biederman
Cc: Alexey Dobriyan, Peter Zijlstra, Ingo Molnar
Hi,
this is the newest version of the formerly named "detailed stack info"
patch, which gives you a better overview of userland application stack
usage, especially for embedded Linux.
Currently you are only able to dump the main process/thread stack usage,
which is shown in /proc/pid/status by the "VmStk" value. But you get no
information about the stack memory consumed by the threads.
There is an enhancement to /proc/<pid>/{task/*,}/*maps which marks the
vm mapping where the thread stack pointer resides with "[thread
stack: xxxxxxxx]". xxxxxxxx is the maximum size of the stack. This is
valuable information, because libpthread doesn't set the start of the stack
to the top of the mapped area, depending on the pthread usage.
A sample output of /proc/<pid>/task/<tid>/maps looks like:
08048000-08049000 r-xp 00000000 03:00 8312 /opt/z
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/z
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
a7d12000-a7d13000 ---p 00000000 00:00 0
a7d13000-a7f13000 rw-p 00000000 00:00 0 [thread stack: 001ff4b4]
a7f13000-a7f14000 ---p 00000000 00:00 0
a7f14000-a7f36000 rw-p 00000000 00:00 0
a7f36000-a8069000 r-xp 00000000 03:00 4222 /lib/libc.so.6
a8069000-a806b000 r--p 00133000 03:00 4222 /lib/libc.so.6
a806b000-a806c000 rw-p 00135000 03:00 4222 /lib/libc.so.6
a806c000-a806f000 rw-p 00000000 00:00 0
a806f000-a8083000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0
a8083000-a8084000 r--p 00013000 03:00 14462 /lib/libpthread.so.0
a8084000-a8085000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0
a8085000-a8088000 rw-p 00000000 00:00 0
a8088000-a80a4000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2
a80a4000-a80a5000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2
a80a5000-a80a6000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2
afaf5000-afb0a000 rw-p 00000000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
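As a worked check of the annotated line in the sample above, here is a minimal userspace sketch in C. It is not part of the patch; the struct and function names are hypothetical, and a downward-growing stack is assumed. It parses one maps line carrying the proposed annotation and computes how far the reported maximum stack size falls short of the full mapping size.

```c
#include <stdio.h>

/* Parsed pieces of one maps line carrying the proposed
 * "[thread stack: xxxxxxxx]" annotation (hypothetical helper type). */
struct thread_stack_map {
	unsigned long start;     /* mapping start address */
	unsigned long end;       /* mapping end address (exclusive) */
	unsigned long max_stack; /* hex value printed after "thread stack:" */
};

/* Returns 1 if the line is an annotated thread stack mapping, else 0.
 * The four skipped fields are perms, offset, dev and inode. */
static int parse_thread_stack_line(const char *line,
				   struct thread_stack_map *m)
{
	int n = sscanf(line, "%lx-%lx %*s %*s %*s %*s [thread stack: %lx]",
		       &m->start, &m->end, &m->max_stack);
	return n == 3;
}

/* Size of the mapping minus the reported maximum stack size, i.e. the
 * bytes of the mapping that lie above the point the stack starts from
 * (assuming the stack grows down). */
static unsigned long bytes_above_stack_start(const struct thread_stack_map *m)
{
	return (m->end - m->start) - m->max_stack;
}
```

For the sample line a7d13000-a7f13000 the mapping is 0x200000 bytes and the annotation reads 0x1ff4b4, so 0xb4c (2892) bytes of the mapping sit above the stack start.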
There is also a new entry "Stack usage" in /proc/<pid>/{task/*,}/status
which gives you the current stack usage in kB.
A sample output of /proc/self/status looks like:
Name: cat
State: R (running)
Tgid: 507
Pid: 507
.
.
.
CapBnd: fffffffffffffeff
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 0
Stack usage: 12 kB
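A tool consuming this output could pick out such a field as sketched below. This is a userspace illustration only: status_kb() is a hypothetical helper, and the "Stack usage" line exists only on a kernel with this patch applied, so callers should expect the lookup to fail elsewhere.

```c
#include <stdio.h>
#include <string.h>

/* Look up "<key>: <number> kB" in status-formatted text.
 * Returns the value in kB, or -1 if the key is absent or the
 * value is not a number. */
static long status_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Skip leading blanks so indented samples also match. */
		while (*line == ' ' || *line == '\t')
			line++;
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			long kb;

			if (sscanf(line + klen + 1, "%ld kB", &kb) == 1)
				return kb;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Given the contents of /proc/self/status in a buffer, status_kb(buf, "Stack usage") would yield 12 for the sample above, and -1 on a kernel without the patch.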
I also fixed the stack base address in /proc/<pid>/{task/*,}/stat to be the
base address of the associated thread stack and not that of the main
process. This makes more sense.
Changes since last posting:
- use walk_page_range() to determine the stack usage high water mark
- include swapped pages in the stack usage high water mark
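Why counting swapped pages matters can be illustrated with a small userspace simulation in C. This is a simplified model, not the kernel code: the page_state enum, SIM_PAGE_SIZE and stack_usage() are hypothetical stand-ins for page table entries and the walk over them, and a downward-growing stack is assumed.

```c
#include <stddef.h>

#define SIM_PAGE_SIZE 4096UL /* illustrative page size */

/* Simplified stand-in for what a page table entry says about a page. */
enum page_state { PAGE_NONE, PAGE_PRESENT, PAGE_SWAPPED };

/*
 * pages[0] is the lowest page of the mapping and pages[npages - 1] the
 * page holding the stack start; the stack grows down toward index 0.
 * Usage runs from the lowest page found in use up to and including the
 * stack start page. With count_swapped == 0 only resident pages count
 * as in use; with count_swapped != 0 swapped-out pages count as well.
 */
static unsigned long stack_usage(const enum page_state *pages, size_t npages,
				 int count_swapped)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		if (pages[i] == PAGE_PRESENT ||
		    (count_swapped && pages[i] == PAGE_SWAPPED))
			return (npages - i) * SIM_PAGE_SIZE;
	}
	return 0; /* no page of the mapping was ever touched */
}
```

For a six-page mapping whose two deepest used pages have been swapped out (none, none, swapped, swapped, present, present), the resident-only scan reports 8192 bytes, while the scan that also counts swapped pages reports 16384 bytes, which is the high water mark the process actually reached.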
The patch is against 2.6.30-rc7 and tested on Intel and PPC
architectures.
ChangeLog:
20. Jan 2009 V0.1
- First Version for Kernel 2.6.28.1
31. Mar 2009 V0.2
- Ported to Kernel 2.6.29
03. Jun 2009 V0.3
- Ported to Kernel 2.6.30
- Redesigned what was suggested by Ingo Molnar
- the thread watch monitor is gone
- the /proc/stackmon entry is also gone
- slim down
04. Jun 2009 V0.4
- Redesigned everything that was suggested by Andrew Morton
- slim down
04. Jun 2009 V0.5
- Code cleanup
06. Jun 2009 V0.6
- Fix missing mm->mmap_sem locking in function task_show_stack_usage()
- Code cleanup
10. Jun 2009 V0.7
- update Documentation/filesystems/proc.txt
10. Jun 2009 V0.8
- change maps/smaps output, displays now the max. stack size
Documentation/filesystems/proc.txt | 5 +-
fs/exec.c | 2
fs/proc/array.c | 85 ++++++++++++++++++++++++++++++++++++-
fs/proc/task_mmu.c | 19 ++++++++
include/linux/sched.h | 1
kernel/fork.c | 2
6 files changed, 112 insertions(+), 2 deletions(-)
Signed-off-by: Stefani Seibold <stefani@seibold.net>
diff -u -N -r linux-2.6.30.orig/Documentation/filesystems/proc.txt linux-2.6.30/Documentation/filesystems/proc.txt
--- linux-2.6.30.orig/Documentation/filesystems/proc.txt 2009-06-10 09:09:27.000000000 +0200
+++ linux-2.6.30/Documentation/filesystems/proc.txt 2009-06-10 09:07:46.000000000 +0200
@@ -176,6 +176,7 @@
CapBnd: ffffffffffffffff
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 1
+ Stack usage: 12 kB
This shows you nearly the same information you would get if you viewed it with
the ps command. In fact, ps uses the proc file system to obtain its
@@ -229,6 +230,7 @@
Mems_allowed_list Same as previous, but in "list format"
voluntary_ctxt_switches number of voluntary context switches
nonvoluntary_ctxt_switches number of non voluntary context switches
+ Stack usage: stack usage high water mark (round up to page size)
..............................................................................
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
@@ -307,7 +309,7 @@
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
a7cb1000-a7cb2000 ---p 00000000 00:00 0
-a7cb2000-a7eb2000 rw-p 00000000 00:00 0
+a7cb2000-a7eb2000 rw-p 00000000 00:00 0 [thread stack: 001ff4b4]
a7eb2000-a7eb3000 ---p 00000000 00:00 0
a7eb3000-a7ed5000 rw-p 00000000 00:00 0
a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6
@@ -343,6 +345,7 @@
[stack] = the stack of the main process
[vdso] = the "virtual dynamic shared object",
the kernel system call handler
+ [thread stack: xxxxxxxx] = the stack of the thread; xxxxxxxx is the stack size
or if empty, the mapping is anonymous.
diff -u -N -r linux-2.6.30.orig/fs/exec.c linux-2.6.30/fs/exec.c
--- linux-2.6.30.orig/fs/exec.c 2009-06-04 09:29:47.000000000 +0200
+++ linux-2.6.30/fs/exec.c 2009-06-04 09:32:35.000000000 +0200
@@ -1328,6 +1328,8 @@
if (retval < 0)
goto out;
+ current->stack_start = current->mm->start_stack;
+
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
diff -u -N -r linux-2.6.30.orig/fs/proc/array.c linux-2.6.30/fs/proc/array.c
--- linux-2.6.30.orig/fs/proc/array.c 2009-06-04 09:29:47.000000000 +0200
+++ linux-2.6.30/fs/proc/array.c 2009-06-24 13:53:27.000000000 +0200
@@ -83,6 +83,8 @@
#include <linux/ptrace.h>
#include <linux/tracehook.h>
+#include <linux/swapops.h>
+
#include <asm/pgtable.h>
#include <asm/processor.h>
#include "internal.h"
@@ -321,6 +323,86 @@
p->nivcsw);
}
+struct stack_stats {
+ struct vm_area_struct *vma;
+ unsigned long startpage;
+ unsigned long usage;
+};
+
+static int stack_usage_pte_range(pmd_t *pmd, unsigned long addr,
+ unsigned long end, struct mm_walk *walk)
+{
+ struct stack_stats *ss = walk->private;
+ struct vm_area_struct *vma = ss->vma;
+ pte_t *pte, ptent;
+ spinlock_t *ptl;
+ int ret = 0;
+
+ pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+ for (; addr != end; pte++, addr += PAGE_SIZE) {
+ ptent = *pte;
+
+#ifdef CONFIG_STACK_GROWSUP
+ if (pte_present(ptent) || is_swap_pte(ptent))
+ ss->usage = addr - ss->startpage + PAGE_SIZE;
+#else
+ if (pte_present(ptent) || is_swap_pte(ptent)) {
+ ss->usage = ss->startpage - addr + PAGE_SIZE;
+ ret = 1;
+ break;
+ }
+#endif
+ }
+ pte_unmap_unlock(pte - 1, ptl);
+ cond_resched();
+ return ret;
+}
+
+static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
+ struct task_struct *task)
+{
+ struct stack_stats ss;
+ struct mm_walk stack_walk = {
+ .pmd_entry = stack_usage_pte_range,
+ .mm = vma->vm_mm,
+ .private = &ss,
+ };
+
+ if (!vma->vm_mm || is_vm_hugetlb_page(vma))
+ return 0;
+
+ ss.vma = vma;
+ ss.startpage = task->stack_start & PAGE_MASK;
+ ss.usage = 0;
+
+#ifdef CONFIG_STACK_GROWSUP
+ walk_page_range(KSTK_ESP(task) & PAGE_MASK, vma->vm_end,
+ &stack_walk);
+#else
+ walk_page_range(vma->vm_start, (KSTK_ESP(task) & PAGE_MASK) + PAGE_SIZE,
+ &stack_walk);
+#endif
+ return ss.usage;
+}
+
+static inline void task_show_stack_usage(struct seq_file *m,
+ struct task_struct *task)
+{
+ struct vm_area_struct *vma;
+ struct mm_struct *mm = get_task_mm(task);
+
+ if (mm) {
+ down_read(&mm->mmap_sem);
+ vma = find_vma(mm, task->stack_start);
+ if (vma)
+ seq_printf(m, "Stack usage:\t%lu kB\n",
+ get_stack_usage_in_bytes(vma, task) >> 10);
+
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+ }
+}
+
int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
@@ -340,6 +422,7 @@
task_show_regs(m, task);
#endif
task_context_switch_counts(m, task);
+ task_show_stack_usage(m, task);
return 0;
}
@@ -481,7 +564,7 @@
rsslim,
mm ? mm->start_code : 0,
mm ? mm->end_code : 0,
- (permitted && mm) ? mm->start_stack : 0,
+ (permitted) ? task->stack_start : 0,
esp,
eip,
/* The signal information here is obsolete.
diff -u -N -r linux-2.6.30.orig/fs/proc/task_mmu.c linux-2.6.30/fs/proc/task_mmu.c
--- linux-2.6.30.orig/fs/proc/task_mmu.c 2009-06-04 09:29:47.000000000 +0200
+++ linux-2.6.30/fs/proc/task_mmu.c 2009-06-10 09:02:40.000000000 +0200
@@ -242,6 +242,25 @@
} else if (vma->vm_start <= mm->start_stack &&
vma->vm_end >= mm->start_stack) {
name = "[stack]";
+ } else {
+ unsigned long stack_start;
+ struct proc_maps_private *pmp;
+
+ pmp = m->private;
+ stack_start = pmp->task->stack_start;
+
+ if (vma->vm_start <= stack_start &&
+ vma->vm_end >= stack_start) {
+ pad_len_spaces(m, len);
+ seq_printf(m,
+ "[thread stack: %08lx]",
+#ifdef CONFIG_STACK_GROWSUP
+ vma->vm_end - stack_start
+#else
+ stack_start - vma->vm_start
+#endif
+ );
+ }
}
} else {
name = "[vdso]";
diff -u -N -r linux-2.6.30.orig/include/linux/sched.h linux-2.6.30/include/linux/sched.h
--- linux-2.6.30.orig/include/linux/sched.h 2009-06-04 09:29:47.000000000 +0200
+++ linux-2.6.30/include/linux/sched.h 2009-06-04 09:32:35.000000000 +0200
@@ -1429,6 +1429,7 @@
/* state flags for use by tracers */
unsigned long trace;
#endif
+ unsigned long stack_start;
};
/* Future-safe accessor for struct task_struct's cpus_allowed. */
diff -u -N -r linux-2.6.30.orig/kernel/fork.c linux-2.6.30/kernel/fork.c
--- linux-2.6.30.orig/kernel/fork.c 2009-06-04 09:29:47.000000000 +0200
+++ linux-2.6.30/kernel/fork.c 2009-06-04 13:15:35.000000000 +0200
@@ -1092,6 +1092,8 @@
if (unlikely(current->ptrace))
ptrace_fork(p, clone_flags);
+ p->stack_start = stack_start;
+
/* Perform scheduler related setup. Assign this task to a CPU. */
sched_fork(p, clone_flags);
* [patch 2/2] procfs: provide stack information for threads V0.10
2009-06-24 7:35 ` Eric W. Biederman
2009-06-24 9:33 ` Stefani Seibold
2009-06-24 12:03 ` [patch 2/2] procfs: provide stack information for threads V0.9 Stefani Seibold
@ 2009-06-24 14:33 ` Stefani Seibold
2009-06-24 15:20 ` Ingo Molnar
2009-06-24 16:28 ` [patch 2/2] procfs: provide stack information for threads V0.11 Stefani Seibold
3 siblings, 1 reply; 17+ messages in thread
From: Stefani Seibold @ 2009-06-24 14:33 UTC (permalink / raw)
To: Andrew Morton, linux-kernel, Eric W. Biederman
Cc: Alexey Dobriyan, Peter Zijlstra, Ingo Molnar
Hi,
this is the newest version of the patch formerly named "detailed stack info",
which gives you a better overview of userland application stack usage,
especially for embedded Linux.
Currently you are only able to dump the main process/thread stack usage,
which is shown in /proc/pid/status by the "VmStk" value. But you get no
information about the stack memory consumed by the threads.
This patch enhances /proc/<pid>/{task/*,}/*maps: it marks the vm mapping
where the thread stack pointer resides with "[thread stack: xxxxxxxx]",
where xxxxxxxx is the maximum size of the stack. This is valuable
information, because libpthread doesn't set the start of the stack
to the top of the mapped area, depending on the pthread usage.
A sample output of /proc/<pid>/task/<tid>/maps looks like:
08048000-08049000 r-xp 00000000 03:00 8312 /opt/z
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/z
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
a7d12000-a7d13000 ---p 00000000 00:00 0
a7d13000-a7f13000 rw-p 00000000 00:00 0 [thread stack: 001ff4b4]
a7f13000-a7f14000 ---p 00000000 00:00 0
a7f14000-a7f36000 rw-p 00000000 00:00 0
a7f36000-a8069000 r-xp 00000000 03:00 4222 /lib/libc.so.6
a8069000-a806b000 r--p 00133000 03:00 4222 /lib/libc.so.6
a806b000-a806c000 rw-p 00135000 03:00 4222 /lib/libc.so.6
a806c000-a806f000 rw-p 00000000 00:00 0
a806f000-a8083000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0
a8083000-a8084000 r--p 00013000 03:00 14462 /lib/libpthread.so.0
a8084000-a8085000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0
a8085000-a8088000 rw-p 00000000 00:00 0
a8088000-a80a4000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2
a80a4000-a80a5000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2
a80a5000-a80a6000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2
afaf5000-afb0a000 rw-p 00000000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
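A user-space tool could pick the stack size out of such a line roughly as follows. This is only a sketch: parse_thread_stack() is a hypothetical helper name, and it assumes the "[thread stack: xxxxxxxx]" spelling printed by this version of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the stack size from one line of a maps file that carries the
 * "[thread stack: xxxxxxxx]" marker (format assumed from this patch).
 * Returns 1 and stores the size in bytes on success, 0 if the line has
 * no such marker.  parse_thread_stack() is a hypothetical helper.
 */
static int parse_thread_stack(const char *line, unsigned long *size)
{
	const char *tag;

	assert(line != NULL && size != NULL);

	/* Locate the marker; lines such as "[stack]" or "[heap]" do not match. */
	tag = strstr(line, "[thread stack: ");
	if (!tag)
		return 0;

	/* The size is printed as zero-padded hexadecimal (%08lx). */
	return sscanf(tag, "[thread stack: %lx]", size) == 1;
}
```

For the sample above the mapping a7d13000-a7f13000 spans 0x200000 bytes while the marker reports 0x1ff4b4, so the stack starts 0xb4c (2892) bytes below the top of the mapping.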
Also there is a new entry "Stack usage" in /proc/<pid>/{task/*,}/status
which gives you the stack usage high water mark in kB.
A sample output of /proc/self/status looks like:
Name: cat
State: R (running)
Tgid: 507
Pid: 507
.
.
.
CapBnd: fffffffffffffeff
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 0
Stack usage: 12 kB
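Reading the new entry back from user space could look like the sketch below. parse_stack_usage_kb() is a hypothetical helper name; the line format is assumed from the seq_printf() call in this patch ("Stack usage:", a tab, the value, then " kB").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of a status file.  Returns 1 and stores the value in kB
 * if the line is the "Stack usage:" entry, 0 for any other line.
 * parse_stack_usage_kb() is a hypothetical helper.
 */
static int parse_stack_usage_kb(const char *line, unsigned long *kb)
{
	static const char key[] = "Stack usage:";

	assert(line != NULL && kb != NULL);

	/* Only the line that starts with the key is of interest. */
	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return 0;

	/* %lu skips the tab that separates the key from the value. */
	return sscanf(line + sizeof(key) - 1, "%lu kB", kb) == 1;
}
```

A caller would feed it each line obtained with fgets() from /proc/<pid>/task/<tid>/status and stop at the first match.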
I also changed the stack base address reported in /proc/<pid>/{task/*,}/stat
to the base address of the associated thread's stack instead of that of the
main process. This makes more sense.
Changes since last posting:
- fix off by one bug
- cleanup
The patch is against 2.6.30 and has been tested on Intel and PPC architectures.
ChangeLog:
20. Jan 2009 V0.1
- First Version for Kernel 2.6.28.1
31. Mar 2009 V0.2
- Ported to Kernel 2.6.29
03. Jun 2009 V0.3
- Ported to Kernel 2.6.30
- Redesigned what was suggested by Ingo Molnar
- the thread watch monitor is gone
- the /proc/stackmon entry is also gone
- slim down
04. Jun 2009 V0.4
- Redesigned everything that was suggested by Andrew Morton
- slim down
04. Jun 2009 V0.5
- Code cleanup
06. Jun 2009 V0.6
- Fix missing mm->mmap_sem locking in function task_show_stack_usage()
- Code cleanup
10. Jun 2009 V0.7
- update Documentation/filesystems/proc.txt
10. Jun 2009 V0.8
- change maps/smaps output; it now displays the max. stack size
24. Jun 2009 V0.9
- use walk_page_range() to determine the stack usage high water mark
- include swapped pages in the stack usage high water mark count
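The high water mark rule from the V0.9 entries can be modelled in user space as below, for a stack that grows down. This is a simplified sketch and not the kernel code: the page states are passed in as a plain array instead of being read from page tables, MODEL_PAGE_SIZE assumes 4 KiB pages, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical per-page state, standing in for a page table entry. */
enum page_state { PAGE_NONE, PAGE_PRESENT, PAGE_SWAPPED };

/*
 * pages[i] describes the page at vm_start + i * MODEL_PAGE_SIZE and
 * start_index is the index of the page that holds the stack start.
 * The lowest page that was ever touched (present or swapped out) marks
 * the high water mark; the usage reaches from that page up to and
 * including the stack start page.
 */
static unsigned long stack_usage_bytes(const enum page_state *pages,
				       size_t start_index)
{
	size_t i;

	assert(pages != NULL);

	for (i = 0; i <= start_index; i++) {
		if (pages[i] == PAGE_PRESENT || pages[i] == PAGE_SWAPPED)
			return (start_index - i + 1) * MODEL_PAGE_SIZE;
	}
	return 0;
}
```

With the page states { none, none, swapped, none, present } and the stack start in the last page, the result is 3 pages, i.e. 12 kB. Counting only present pages would stop at the last page and report 4 kB, which is the short value the swapped-page handling avoids.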
Documentation/filesystems/proc.txt | 5 +-
fs/exec.c | 2
fs/proc/array.c | 85 ++++++++++++++++++++++++++++++++++++-
fs/proc/task_mmu.c | 19 ++++++++
include/linux/sched.h | 1
kernel/fork.c | 2
6 files changed, 112 insertions(+), 2 deletions(-)
Signed-off-by: Stefani Seibold <stefani@seibold.net>
diff -u -N -r linux-2.6.30.orig/Documentation/filesystems/proc.txt linux-2.6.30/Documentation/filesystems/proc.txt
--- linux-2.6.30.orig/Documentation/filesystems/proc.txt 2009-06-24 16:21:46.000000000 +0200
+++ linux-2.6.30/Documentation/filesystems/proc.txt 2009-06-24 16:22:11.000000000 +0200
@@ -176,6 +176,7 @@
CapBnd: ffffffffffffffff
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 1
+ Stack usage: 12 kB
This shows you nearly the same information you would get if you viewed it with
the ps command. In fact, ps uses the proc file system to obtain its
@@ -229,6 +230,7 @@
Mems_allowed_list Same as previous, but in "list format"
voluntary_ctxt_switches number of voluntary context switches
nonvoluntary_ctxt_switches number of non voluntary context switches
+ Stack usage: stack usage high water mark (rounded up to page size)
..............................................................................
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
@@ -307,7 +309,7 @@
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
a7cb1000-a7cb2000 ---p 00000000 00:00 0
-a7cb2000-a7eb2000 rw-p 00000000 00:00 0
+a7cb2000-a7eb2000 rw-p 00000000 00:00 0 [thread stack: 001ff4b4]
a7eb2000-a7eb3000 ---p 00000000 00:00 0
a7eb3000-a7ed5000 rw-p 00000000 00:00 0
a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6
@@ -343,6 +345,7 @@
[stack] = the stack of the main process
[vdso] = the "virtual dynamic shared object",
the kernel system call handler
+ [thread stack: xxxxxxxx] = the stack of the thread; xxxxxxxx is the stack size
or if empty, the mapping is anonymous.
diff -u -N -r linux-2.6.30.orig/fs/exec.c linux-2.6.30/fs/exec.c
--- linux-2.6.30.orig/fs/exec.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/fs/exec.c 2009-06-24 16:22:11.000000000 +0200
@@ -1328,6 +1328,8 @@
if (retval < 0)
goto out;
+ current->stack_start = current->mm->start_stack;
+
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
diff -u -N -r linux-2.6.30.orig/fs/proc/array.c linux-2.6.30/fs/proc/array.c
--- linux-2.6.30.orig/fs/proc/array.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/fs/proc/array.c 2009-06-24 16:24:59.000000000 +0200
@@ -82,6 +82,7 @@
#include <linux/pid_namespace.h>
#include <linux/ptrace.h>
#include <linux/tracehook.h>
+#include <linux/swapops.h>
#include <asm/pgtable.h>
#include <asm/processor.h>
@@ -321,6 +322,87 @@
p->nivcsw);
}
+struct stack_stats {
+ struct vm_area_struct *vma;
+ unsigned long startpage;
+ unsigned long usage;
+};
+
+static int stack_usage_pte_range(pmd_t *pmd, unsigned long addr,
+ unsigned long end, struct mm_walk *walk)
+{
+ struct stack_stats *ss = walk->private;
+ struct vm_area_struct *vma = ss->vma;
+ pte_t *pte, ptent;
+ spinlock_t *ptl;
+ int ret = 0;
+
+ pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+ for (; addr != end; pte++, addr += PAGE_SIZE) {
+ ptent = *pte;
+
+#ifdef CONFIG_STACK_GROWSUP
+ if (pte_present(ptent) || is_swap_pte(ptent))
+ ss->usage = addr - ss->startpage + PAGE_SIZE;
+#else
+ if (pte_present(ptent) || is_swap_pte(ptent)) {
+ ss->usage = ss->startpage - addr + PAGE_SIZE;
+ pte++;
+ ret = 1;
+ break;
+ }
+#endif
+ }
+ pte_unmap_unlock(pte - 1, ptl);
+ cond_resched();
+ return ret;
+}
+
+static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
+ struct task_struct *task)
+{
+ struct stack_stats ss;
+ struct mm_walk stack_walk = {
+ .pmd_entry = stack_usage_pte_range,
+ .mm = vma->vm_mm,
+ .private = &ss,
+ };
+
+ if (!vma->vm_mm || is_vm_hugetlb_page(vma))
+ return 0;
+
+ ss.vma = vma;
+ ss.startpage = task->stack_start & PAGE_MASK;
+ ss.usage = 0;
+
+#ifdef CONFIG_STACK_GROWSUP
+ walk_page_range(KSTK_ESP(task) & PAGE_MASK, vma->vm_end,
+ &stack_walk);
+#else
+ walk_page_range(vma->vm_start, (KSTK_ESP(task) & PAGE_MASK) + PAGE_SIZE,
+ &stack_walk);
+#endif
+ return ss.usage;
+}
+
+static inline void task_show_stack_usage(struct seq_file *m,
+ struct task_struct *task)
+{
+ struct vm_area_struct *vma;
+ struct mm_struct *mm = get_task_mm(task);
+
+ if (mm) {
+ down_read(&mm->mmap_sem);
+ vma = find_vma(mm, task->stack_start);
+ if (vma)
+ seq_printf(m, "Stack usage:\t%lu kB\n",
+ get_stack_usage_in_bytes(vma, task) >> 10);
+
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+ }
+}
+
int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
@@ -340,6 +422,7 @@
task_show_regs(m, task);
#endif
task_context_switch_counts(m, task);
+ task_show_stack_usage(m, task);
return 0;
}
@@ -481,7 +564,7 @@
rsslim,
mm ? mm->start_code : 0,
mm ? mm->end_code : 0,
- (permitted && mm) ? mm->start_stack : 0,
+ (permitted) ? task->stack_start : 0,
esp,
eip,
/* The signal information here is obsolete.
diff -u -N -r linux-2.6.30.orig/fs/proc/task_mmu.c linux-2.6.30/fs/proc/task_mmu.c
--- linux-2.6.30.orig/fs/proc/task_mmu.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/fs/proc/task_mmu.c 2009-06-24 16:22:11.000000000 +0200
@@ -242,6 +242,25 @@
} else if (vma->vm_start <= mm->start_stack &&
vma->vm_end >= mm->start_stack) {
name = "[stack]";
+ } else {
+ unsigned long stack_start;
+ struct proc_maps_private *pmp;
+
+ pmp = m->private;
+ stack_start = pmp->task->stack_start;
+
+ if (vma->vm_start <= stack_start &&
+ vma->vm_end >= stack_start) {
+ pad_len_spaces(m, len);
+ seq_printf(m,
+ "[thread stack: %08lx]",
+#ifdef CONFIG_STACK_GROWSUP
+ vma->vm_end - stack_start
+#else
+ stack_start - vma->vm_start
+#endif
+ );
+ }
}
} else {
name = "[vdso]";
diff -u -N -r linux-2.6.30.orig/include/linux/sched.h linux-2.6.30/include/linux/sched.h
--- linux-2.6.30.orig/include/linux/sched.h 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/include/linux/sched.h 2009-06-24 16:22:11.000000000 +0200
@@ -1429,6 +1429,7 @@
/* state flags for use by tracers */
unsigned long trace;
#endif
+ unsigned long stack_start;
};
/* Future-safe accessor for struct task_struct's cpus_allowed. */
diff -u -N -r linux-2.6.30.orig/kernel/fork.c linux-2.6.30/kernel/fork.c
--- linux-2.6.30.orig/kernel/fork.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/kernel/fork.c 2009-06-24 16:22:11.000000000 +0200
@@ -1092,6 +1092,8 @@
if (unlikely(current->ptrace))
ptrace_fork(p, clone_flags);
+ p->stack_start = stack_start;
+
/* Perform scheduler related setup. Assign this task to a CPU. */
sched_fork(p, clone_flags);
* Re: [patch 2/2] procfs: provide stack information for threads V0.10
2009-06-24 14:33 ` [patch 2/2] procfs: provide stack information for threads V0.10 Stefani Seibold
@ 2009-06-24 15:20 ` Ingo Molnar
2009-06-24 15:49 ` Stefani Seibold
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2009-06-24 15:20 UTC (permalink / raw)
To: Stefani Seibold
Cc: Andrew Morton, linux-kernel, Eric W. Biederman, Alexey Dobriyan,
Peter Zijlstra
* Stefani Seibold <stefani@seibold.net> wrote:
> Hi,
>
> this is the newest version of the patch formerly named "detailed stack info",
> which gives you a better overview of userland application stack usage,
> especially for embedded Linux.
>
> Currently you are only able to dump the main process/thread stack usage,
> which is shown in /proc/pid/status by the "VmStk" value. But you get no
> information about the stack memory consumed by the threads.
>
> This patch enhances /proc/<pid>/{task/*,}/*maps: it marks the vm mapping
> where the thread stack pointer resides with "[thread stack: xxxxxxxx]",
> where xxxxxxxx is the maximum size of the stack. This is valuable
> information, because libpthread doesn't set the start of the stack
> to the top of the mapped area, depending on the pthread usage.
>
> A sample output of /proc/<pid>/task/<tid>/maps looks like:
>
> 08048000-08049000 r-xp 00000000 03:00 8312 /opt/z
> 08049000-0804a000 rw-p 00001000 03:00 8312 /opt/z
> 0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
> a7d12000-a7d13000 ---p 00000000 00:00 0
> a7d13000-a7f13000 rw-p 00000000 00:00 0 [thread stack: 001ff4b4]
I have the same question as before: have you checked the use of that
field in tools/perf/builtin-record.c, and how your change will
impact that?
Ingo
* Re: [merged] proctxt-update-kernel-filesystem-proctxt-documentation.patch removed from -mm tree
2009-06-24 9:33 ` Stefani Seibold
@ 2009-06-24 15:30 ` Andrew Morton
2009-06-24 15:57 ` Stefani Seibold
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2009-06-24 15:30 UTC (permalink / raw)
To: Stefani Seibold
Cc: Eric W. Biederman, Alexey Dobriyan, linux-kernel, Peter Zijlstra,
Ingo Molnar
On Wed, 24 Jun 2009 11:33:25 +0200 Stefani Seibold <stefani@seibold.net> wrote:
> > > Alexey's point is that follow_page() will return NULL if it hits a
> > > swapped-out stack page and the loop will exit, leading to an incorrect
> > > (ie: short) return value from get_stack_usage_in_bytes().
> > >
> > > Is this claim wrong?
> >
>
> No.
>
> I dug into the kernel source, and the only solution I found is to use
> walk_page_range(), as show_smap() in fs/proc/task_mmu.c does.
>
> Maybe there is an easier way, but I don't know of one.
>
> So I would implement a function similar to smaps_pte_range() in
> fs/proc/task_mmu.c to detect the high water mark.
Perhaps we could enhance follow_page() so that it can tell the caller
when the target page is "virtually there", but swapped out. Add a new
FOLL_SWAP, I guess.
How to communicate this back to the caller? Perhaps add another
argument to follow_page(), perhaps return some magic value such as
#define FOLLOW_PAGE_SWAPPED_PAGE ((struct page *)1)
Adding the additional argument would be nicer.
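The two ways of reporting a swapped-out page can be compared in a small user-space model. This is only an illustration: struct page here is a stand-in, enum model_pte and both model_follow_page_*() functions are hypothetical, and neither has the signature of the real follow_page().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; only its address matters here. */
struct page { int dummy; };

/* Magic value as quoted in the mail above. */
#define FOLLOW_PAGE_SWAPPED_PAGE ((struct page *)1)

/* Hypothetical model of what a page table entry says about a page. */
enum model_pte { MODEL_PTE_NONE, MODEL_PTE_PRESENT, MODEL_PTE_SWAPPED };

/* Variant 1: report "swapped out" through a magic return value. */
static struct page *model_follow_page_magic(enum model_pte pte,
					    struct page *page)
{
	if (pte == MODEL_PTE_PRESENT)
		return page;
	if (pte == MODEL_PTE_SWAPPED)
		return FOLLOW_PAGE_SWAPPED_PAGE;
	return NULL;
}

/* Variant 2: report "swapped out" through an additional argument. */
static struct page *model_follow_page_arg(enum model_pte pte,
					  struct page *page, int *swapped)
{
	assert(swapped != NULL);

	*swapped = (pte == MODEL_PTE_SWAPPED);
	return pte == MODEL_PTE_PRESENT ? page : NULL;
}
```

With the magic value, every caller has to remember to test for the sentinel before dereferencing the result. With the extra argument the return value stays either a usable pointer or NULL.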
* Re: [patch 2/2] procfs: provide stack information for threads V0.10
2009-06-24 15:20 ` Ingo Molnar
@ 2009-06-24 15:49 ` Stefani Seibold
2009-06-24 17:40 ` Johannes Weiner
0 siblings, 1 reply; 17+ messages in thread
From: Stefani Seibold @ 2009-06-24 15:49 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andrew Morton, linux-kernel, Eric W. Biederman, Alexey Dobriyan,
Peter Zijlstra
On Wednesday, 24.06.2009 at 17:20 +0200, Ingo Molnar wrote:
> * Stefani Seibold <stefani@seibold.net> wrote:
>
> > Hi,
> >
> > this is the newest version of the patch formerly named "detailed stack info",
> > which gives you a better overview of userland application stack usage,
> > especially for embedded Linux.
> >
> > Currently you are only able to dump the main process/thread stack usage,
> > which is shown in /proc/pid/status by the "VmStk" value. But you get no
> > information about the stack memory consumed by the threads.
> >
> > This patch enhances /proc/<pid>/{task/*,}/*maps: it marks the vm mapping
> > where the thread stack pointer resides with "[thread stack: xxxxxxxx]",
> > where xxxxxxxx is the maximum size of the stack. This is valuable
> > information, because libpthread doesn't set the start of the stack
> > to the top of the mapped area, depending on the pthread usage.
> >
> > A sample output of /proc/<pid>/task/<tid>/maps looks like:
> >
> > 08048000-08049000 r-xp 00000000 03:00 8312 /opt/z
> > 08049000-0804a000 rw-p 00001000 03:00 8312 /opt/z
> > 0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
> > a7d12000-a7d13000 ---p 00000000 00:00 0
> > a7d13000-a7f13000 rw-p 00000000 00:00 0 [thread stack: 001ff4b4]
>
> I have the same question as before: have you checked the use of that
> field in tools/perf/builtin-record.c, and how your change will
> impact that?
>
Good question... I have another one: what is tools/perf/builtin-record.c
and where can I find it? Then I could check it.
> Ingo
* Re: [merged] proctxt-update-kernel-filesystem-proctxt-documentation.patch removed from -mm tree
2009-06-24 15:30 ` Andrew Morton
@ 2009-06-24 15:57 ` Stefani Seibold
0 siblings, 0 replies; 17+ messages in thread
From: Stefani Seibold @ 2009-06-24 15:57 UTC (permalink / raw)
To: Andrew Morton
Cc: Eric W. Biederman, Alexey Dobriyan, linux-kernel, Peter Zijlstra,
Ingo Molnar
On Wednesday, 24.06.2009 at 08:30 -0700, Andrew Morton wrote:
> On Wed, 24 Jun 2009 11:33:25 +0200 Stefani Seibold <stefani@seibold.net> wrote:
>
> > > > Alexey's point is that follow_page() will return NULL if it hits a
> > > > swapped-out stack page and the loop will exit, leading to an incorrect
> > > > (ie: short) return value from get_stack_usage_in_bytes().
> > > >
> > > > Is this claim wrong?
> > >
> >
> > No.
> >
> > I dug into the kernel source, and the only solution I found is to use
> > walk_page_range(), as show_smap() in fs/proc/task_mmu.c does.
> >
> > Maybe there is an easier way, but I don't know of one.
> >
> > So I would implement a function similar to smaps_pte_range() in
> > fs/proc/task_mmu.c to detect the high water mark.
>
> Perhaps we could enhance follow_page() so that it can tell the caller
> when the target page is "virtually there", but swapped out. Add a new
> FOLL_SWAP, I guess.
>
I have currently fixed it by using walk_page_range(). I think this is quite
a good solution. But if you like, I can do it in a future version.
> How to communicate this back to the caller? Perhaps add another
> argument to follow_page(), perhaps return some magic value such as
>
> #define FOLLOW_PAGE_SWAPPED_PAGE ((struct page *)1)
>
> Adding the additional argument would be nicer.
IMHO it would be best to add a new FOLL_NOTIFY_SWAP flag and, when this
flag is passed and the page is swapped out, return the
FOLLOW_PAGE_SWAPPED_PAGE magic value.
* [patch 2/2] procfs: provide stack information for threads V0.11
2009-06-24 7:35 ` Eric W. Biederman
` (2 preceding siblings ...)
2009-06-24 14:33 ` [patch 2/2] procfs: provide stack information for threads V0.10 Stefani Seibold
@ 2009-06-24 16:28 ` Stefani Seibold
3 siblings, 0 replies; 17+ messages in thread
From: Stefani Seibold @ 2009-06-24 16:28 UTC (permalink / raw)
To: Andrew Morton, linux-kernel, Eric W. Biederman
Cc: Alexey Dobriyan, Peter Zijlstra, Ingo Molnar
Hi,
this is the newest version of the patch formerly named "detailed stack info",
which gives you a better overview of userland application stack usage,
especially for embedded Linux.
Currently you are only able to dump the main process/thread stack usage,
which is shown in /proc/pid/status by the "VmStk" value. But you get no
information about the stack memory consumed by the threads.
This patch enhances /proc/<pid>/{task/*,}/*maps: it marks the vm mapping
where the thread stack pointer resides with "[threadstack:xxxxxxxx]",
where xxxxxxxx is the maximum size of the stack. This is valuable
information, because libpthread doesn't set the start of the stack
to the top of the mapped area, depending on the pthread usage.
A sample output of /proc/<pid>/task/<tid>/maps looks like:
08048000-08049000 r-xp 00000000 03:00 8312 /opt/z
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/z
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
a7d12000-a7d13000 ---p 00000000 00:00 0
a7d13000-a7f13000 rw-p 00000000 00:00 0 [threadstack:001ff4b4]
a7f13000-a7f14000 ---p 00000000 00:00 0
a7f14000-a7f36000 rw-p 00000000 00:00 0
a7f36000-a8069000 r-xp 00000000 03:00 4222 /lib/libc.so.6
a8069000-a806b000 r--p 00133000 03:00 4222 /lib/libc.so.6
a806b000-a806c000 rw-p 00135000 03:00 4222 /lib/libc.so.6
a806c000-a806f000 rw-p 00000000 00:00 0
a806f000-a8083000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0
a8083000-a8084000 r--p 00013000 03:00 14462 /lib/libpthread.so.0
a8084000-a8085000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0
a8085000-a8088000 rw-p 00000000 00:00 0
a8088000-a80a4000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2
a80a4000-a80a5000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2
a80a5000-a80a6000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2
afaf5000-afb0a000 rw-p 00000000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
Also there is a new entry "Stack usage" in /proc/<pid>/{task/*,}/status
which gives you the stack usage high water mark in kB.
A sample output of /proc/self/status looks like:
Name: cat
State: R (running)
Tgid: 507
Pid: 507
.
.
.
CapBnd: fffffffffffffeff
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 0
Stack usage: 12 kB
I also changed the stack base address reported in /proc/<pid>/{task/*,}/stat
to the base address of the associated thread's stack instead of that of the
main process. This makes more sense.
Changes since last posting:
- fix compatibility with tools/perf/builtin-record.c in upstream kernel
The patch is against 2.6.30 and has been tested on Intel and PPC architectures.
ChangeLog:
20. Jan 2009 V0.1
- First Version for Kernel 2.6.28.1
31. Mar 2009 V0.2
- Ported to Kernel 2.6.29
03. Jun 2009 V0.3
- Ported to Kernel 2.6.30
- Redesigned what was suggested by Ingo Molnar
- the thread watch monitor is gone
- the /proc/stackmon entry is also gone
- slim down
04. Jun 2009 V0.4
- Redesigned everything that was suggested by Andrew Morton
- slim down
04. Jun 2009 V0.5
- Code cleanup
06. Jun 2009 V0.6
- Fix missing mm->mmap_sem locking in function task_show_stack_usage()
- Code cleanup
10. Jun 2009 V0.7
- update Documentation/filesystems/proc.txt
10. Jun 2009 V0.8
- change maps/smaps output; it now displays the max. stack size
24. Jun 2009 V0.9
- use walk_page_range() to determine the stack usage high water mark
- include swapped pages in the stack usage high water mark count
24. Jun 2009 V0.10
- fix off by one bug
- cleanup
Documentation/filesystems/proc.txt | 5 +-
fs/exec.c | 2
fs/proc/array.c | 85 ++++++++++++++++++++++++++++++++++++-
fs/proc/task_mmu.c | 19 ++++++++
include/linux/sched.h | 1
kernel/fork.c | 2
6 files changed, 112 insertions(+), 2 deletions(-)
Signed-off-by: Stefani Seibold <stefani@seibold.net>
diff -u -N -r linux-2.6.30.orig/Documentation/filesystems/proc.txt linux-2.6.30/Documentation/filesystems/proc.txt
--- linux-2.6.30.orig/Documentation/filesystems/proc.txt 2009-06-24 16:21:46.000000000 +0200
+++ linux-2.6.30/Documentation/filesystems/proc.txt 2009-06-24 16:22:11.000000000 +0200
@@ -176,6 +176,7 @@
CapBnd: ffffffffffffffff
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 1
+ Stack usage: 12 kB
This shows you nearly the same information you would get if you viewed it with
the ps command. In fact, ps uses the proc file system to obtain its
@@ -229,6 +230,7 @@
Mems_allowed_list Same as previous, but in "list format"
voluntary_ctxt_switches number of voluntary context switches
nonvoluntary_ctxt_switches number of non voluntary context switches
+ Stack usage: stack usage high water mark (rounded up to page size)
..............................................................................
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
@@ -307,7 +309,7 @@
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
a7cb1000-a7cb2000 ---p 00000000 00:00 0
-a7cb2000-a7eb2000 rw-p 00000000 00:00 0
+a7cb2000-a7eb2000 rw-p 00000000 00:00 0 [threadstack:001ff4b4]
a7eb2000-a7eb3000 ---p 00000000 00:00 0
a7eb3000-a7ed5000 rw-p 00000000 00:00 0
a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6
@@ -343,6 +345,7 @@
[stack] = the stack of the main process
[vdso] = the "virtual dynamic shared object",
the kernel system call handler
+ [threadstack:xxxxxxxx] = the stack of the thread, xxxxxxxx is the stack size
or if empty, the mapping is anonymous.
diff -u -N -r linux-2.6.30.orig/fs/exec.c linux-2.6.30/fs/exec.c
--- linux-2.6.30.orig/fs/exec.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/fs/exec.c 2009-06-24 16:22:11.000000000 +0200
@@ -1328,6 +1328,8 @@
if (retval < 0)
goto out;
+ current->stack_start = current->mm->start_stack;
+
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
diff -u -N -r linux-2.6.30.orig/fs/proc/array.c linux-2.6.30/fs/proc/array.c
--- linux-2.6.30.orig/fs/proc/array.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/fs/proc/array.c 2009-06-24 16:24:59.000000000 +0200
@@ -82,6 +82,7 @@
#include <linux/pid_namespace.h>
#include <linux/ptrace.h>
#include <linux/tracehook.h>
+#include <linux/swapops.h>
#include <asm/pgtable.h>
#include <asm/processor.h>
@@ -321,6 +322,87 @@
p->nivcsw);
}
+struct stack_stats {
+ struct vm_area_struct *vma;
+ unsigned long startpage;
+ unsigned long usage;
+};
+
+static int stack_usage_pte_range(pmd_t *pmd, unsigned long addr,
+ unsigned long end, struct mm_walk *walk)
+{
+ struct stack_stats *ss = walk->private;
+ struct vm_area_struct *vma = ss->vma;
+ pte_t *pte, ptent;
+ spinlock_t *ptl;
+ int ret = 0;
+
+ pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+ for (; addr != end; pte++, addr += PAGE_SIZE) {
+ ptent = *pte;
+
+#ifdef CONFIG_STACK_GROWSUP
+ if (pte_present(ptent) || is_swap_pte(ptent))
+ ss->usage = addr - ss->startpage + PAGE_SIZE;
+#else
+ if (pte_present(ptent) || is_swap_pte(ptent)) {
+ ss->usage = ss->startpage - addr + PAGE_SIZE;
+ pte++;
+ ret = 1;
+ break;
+ }
+#endif
+ }
+ pte_unmap_unlock(pte - 1, ptl);
+ cond_resched();
+ return ret;
+}
+
+static inline unsigned long get_stack_usage_in_bytes(struct vm_area_struct *vma,
+ struct task_struct *task)
+{
+ struct stack_stats ss;
+ struct mm_walk stack_walk = {
+ .pmd_entry = stack_usage_pte_range,
+ .mm = vma->vm_mm,
+ .private = &ss,
+ };
+
+ if (!vma->vm_mm || is_vm_hugetlb_page(vma))
+ return 0;
+
+ ss.vma = vma;
+ ss.startpage = task->stack_start & PAGE_MASK;
+ ss.usage = 0;
+
+#ifdef CONFIG_STACK_GROWSUP
+ walk_page_range(KSTK_ESP(task) & PAGE_MASK, vma->vm_end,
+ &stack_walk);
+#else
+ walk_page_range(vma->vm_start, (KSTK_ESP(task) & PAGE_MASK) + PAGE_SIZE,
+ &stack_walk);
+#endif
+ return ss.usage;
+}
+
+static inline void task_show_stack_usage(struct seq_file *m,
+ struct task_struct *task)
+{
+ struct vm_area_struct *vma;
+ struct mm_struct *mm = get_task_mm(task);
+
+ if (mm) {
+ down_read(&mm->mmap_sem);
+ vma = find_vma(mm, task->stack_start);
+ if (vma)
+ seq_printf(m, "Stack usage:\t%lu kB\n",
+ get_stack_usage_in_bytes(vma, task) >> 10);
+
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+ }
+}
+
int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
@@ -340,6 +422,7 @@
task_show_regs(m, task);
#endif
task_context_switch_counts(m, task);
+ task_show_stack_usage(m, task);
return 0;
}
@@ -481,7 +564,7 @@
rsslim,
mm ? mm->start_code : 0,
mm ? mm->end_code : 0,
- (permitted && mm) ? mm->start_stack : 0,
+ (permitted) ? task->stack_start : 0,
esp,
eip,
/* The signal information here is obsolete.
diff -u -N -r linux-2.6.30.orig/fs/proc/task_mmu.c linux-2.6.30/fs/proc/task_mmu.c
--- linux-2.6.30.orig/fs/proc/task_mmu.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/fs/proc/task_mmu.c 2009-06-24 16:22:11.000000000 +0200
@@ -242,6 +242,25 @@
} else if (vma->vm_start <= mm->start_stack &&
vma->vm_end >= mm->start_stack) {
name = "[stack]";
+ } else {
+ unsigned long stack_start;
+ struct proc_maps_private *pmp;
+
+ pmp = m->private;
+ stack_start = pmp->task->stack_start;
+
+ if (vma->vm_start <= stack_start &&
+ vma->vm_end >= stack_start) {
+ pad_len_spaces(m, len);
+ seq_printf(m,
+ "[threadstack:%08lx]",
+#ifdef CONFIG_STACK_GROWSUP
+ vma->vm_end - stack_start
+#else
+ stack_start - vma->vm_start
+#endif
+ );
+ }
}
} else {
name = "[vdso]";
diff -u -N -r linux-2.6.30.orig/include/linux/sched.h linux-2.6.30/include/linux/sched.h
--- linux-2.6.30.orig/include/linux/sched.h 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/include/linux/sched.h 2009-06-24 16:22:11.000000000 +0200
@@ -1429,6 +1429,7 @@
/* state flags for use by tracers */
unsigned long trace;
#endif
+ unsigned long stack_start;
};
/* Future-safe accessor for struct task_struct's cpus_allowed. */
diff -u -N -r linux-2.6.30.orig/kernel/fork.c linux-2.6.30/kernel/fork.c
--- linux-2.6.30.orig/kernel/fork.c 2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30/kernel/fork.c 2009-06-24 16:22:11.000000000 +0200
@@ -1092,6 +1092,8 @@
if (unlikely(current->ptrace))
ptrace_fork(p, clone_flags);
+ p->stack_start = stack_start;
+
/* Perform scheduler related setup. Assign this task to a CPU. */
sched_fork(p, clone_flags);
* Re: [patch 2/2] procfs: provide stack information for threads V0.10
2009-06-24 15:49 ` Stefani Seibold
@ 2009-06-24 17:40 ` Johannes Weiner
2009-06-24 17:46 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Johannes Weiner @ 2009-06-24 17:40 UTC (permalink / raw)
To: Stefani Seibold
Cc: Ingo Molnar, Andrew Morton, linux-kernel, Eric W. Biederman,
Alexey Dobriyan, Peter Zijlstra
On Wed, Jun 24, 2009 at 05:49:50PM +0200, Stefani Seibold wrote:
> On Wednesday, 24.06.2009 at 17:20 +0200, Ingo Molnar wrote:
> > * Stefani Seibold <stefani@seibold.net> wrote:
> >
> > > Hi,
> > >
> > > this is the newest version of the formerly named "detailed stack info"
> > > patch, which gives you a better overview of the userland application stack
> > > usage, especially for embedded Linux.
> > >
> > > Currently you are only able to dump the main process/thread stack usage,
> > > which is shown in /proc/pid/status by the "VmStk" value. But you get no
> > > information about the consumed stack memory of the threads.
> > >
> > > There is an enhancement in /proc/<pid>/{task/*,}/*maps which marks the
> > > vm mapping where the thread stack pointer resides with "[thread stack
> > > xxxxxxxx]". xxxxxxxx is the maximum size of the stack. This is valuable
> > > information, because libpthread doesn't set the start of the stack
> > > to the top of the mapped area, depending on the pthread usage.
> > >
> > > A sample output of /proc/<pid>/task/<tid>/maps looks like:
> > >
> > > 08048000-08049000 r-xp 00000000 03:00 8312 /opt/z
> > > 08049000-0804a000 rw-p 00001000 03:00 8312 /opt/z
> > > 0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
> > > a7d12000-a7d13000 ---p 00000000 00:00 0
> > > a7d13000-a7f13000 rw-p 00000000 00:00 0 [thread stack: 001ff4b4]
> >
> > I have the same question as before: have you checked the use of that
> > field in tools/perf/builtin-record.c, and how your change will
> > impact that?
> >
>
> Good question... I have another one: What is tools/perf/builtin-record.c
> and where can I find it? Then I could check it.
You can find it in a recent git tree from Linus.
On the original question: builtin-record.c is unaffected by this patch
as this exact field will only be parsed if the mapping is executable.
* Re: [patch 2/2] procfs: provide stack information for threads V0.10
2009-06-24 17:40 ` Johannes Weiner
@ 2009-06-24 17:46 ` Ingo Molnar
2009-06-24 19:08 ` Johannes Weiner
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2009-06-24 17:46 UTC (permalink / raw)
To: Johannes Weiner
Cc: Stefani Seibold, Andrew Morton, linux-kernel, Eric W. Biederman,
Alexey Dobriyan, Peter Zijlstra
* Johannes Weiner <hannes@cmpxchg.org> wrote:
> You can find it in a recent git tree from Linus.
>
> On the original question: builtin-record.c is unaffected by this
> patch as this exact field will only be parsed if the mapping is
> executable.
A stack can be executable too. It is not common, but possible.
Ingo
* Re: [patch 2/2] procfs: provide stack information for threads V0.10
2009-06-24 17:46 ` Ingo Molnar
@ 2009-06-24 19:08 ` Johannes Weiner
2009-06-25 9:36 ` Ingo Molnar
2009-06-25 10:09 ` [tip:perfcounters/urgent] perf record: Fix filemap pathname parsing in /proc/pid/maps tip-bot for Johannes Weiner
0 siblings, 2 replies; 17+ messages in thread
From: Johannes Weiner @ 2009-06-24 19:08 UTC (permalink / raw)
To: Ingo Molnar
Cc: Stefani Seibold, Andrew Morton, linux-kernel, Eric W. Biederman,
Alexey Dobriyan, Peter Zijlstra
On Wed, Jun 24, 2009 at 07:46:37PM +0200, Ingo Molnar wrote:
>
> * Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> > You can find it in a recent git tree from Linus.
> >
> > On the original question: builtin-record.c is unaffected by this
> > patch as this exact field will only be parsed if the mapping is
> > executable.
>
> A stack can be executable too. It is not common, but possible.
It also ignores the field if it doesn't start with a slash, so it's
even safe for executable stacks.
On a different note, I think that parser is not working for file
mappings with paths containing spaces. Not common, but possible :)
The below, sorry: untested, should fix this up. I think we don't
expect a slash in those lines except in a pathname, so looking for the
first slash should be okay. What do you think?
---
From: Johannes Weiner <hannes@cmpxchg.org>
Subject: tools/perf: fix filemap pathname parsing in /proc/pid/maps
Looking backward for the first space from the end of a line in
/proc/pid/maps does not find the start of the pathname of the mapped
file if it contains a space.
Since the only slashes we have in this file occur in the (absolute!)
pathname column of file mappings, looking for the first slash in a
line is a safe method to find the name.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d7ebbd7..9b899ba 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -306,12 +306,11 @@ static void pid_synthesize_mmap_samples(pid_t pid)
continue;
pbf += n + 3;
if (*pbf == 'x') { /* vm_exec */
- char *execname = strrchr(bf, ' ');
+ char *execname = strchr(bf, '/');
- if (execname == NULL || execname[1] != '/')
+ if (execname == NULL)
continue;
- execname += 1;
size = strlen(execname);
execname[size - 1] = '\0'; /* Remove \n */
memcpy(mmap_ev.filename, execname, size);
* Re: [patch 2/2] procfs: provide stack information for threads V0.10
2009-06-24 19:08 ` Johannes Weiner
@ 2009-06-25 9:36 ` Ingo Molnar
2009-06-25 10:09 ` [tip:perfcounters/urgent] perf record: Fix filemap pathname parsing in /proc/pid/maps tip-bot for Johannes Weiner
1 sibling, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2009-06-25 9:36 UTC (permalink / raw)
To: Johannes Weiner
Cc: Stefani Seibold, Andrew Morton, linux-kernel, Eric W. Biederman,
Alexey Dobriyan, Peter Zijlstra
* Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Wed, Jun 24, 2009 at 07:46:37PM +0200, Ingo Molnar wrote:
> >
> > * Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > > You can find it in a recent git tree from Linus.
> > >
> > > On the original question: builtin-record.c is unaffected by this
> > > patch as this exact field will only be parsed if the mapping is
> > > executable.
> >
> > A stack can be executable too. It is not common, but possible.
>
> It also ignores the field if it doesn't start with a slash, so
> it's even safe for executable stacks.
>
> On a different note, I think that parser is not working for file
> mappings with paths containing spaces. Not common, but possible
> :)
>
> The below, sorry: untested, should fix this up. I think we don't
> expect a slash in those lines except in a pathname, so looking for
> the first slash should be okay. What do you think?
heh - good one - applied, thanks Johannes!
Ingo
* [tip:perfcounters/urgent] perf record: Fix filemap pathname parsing in /proc/pid/maps
2009-06-24 19:08 ` Johannes Weiner
2009-06-25 9:36 ` Ingo Molnar
@ 2009-06-25 10:09 ` tip-bot for Johannes Weiner
1 sibling, 0 replies; 17+ messages in thread
From: tip-bot for Johannes Weiner @ 2009-06-25 10:09 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, stefani, a.p.zijlstra, hannes, ebiederm,
akpm, tglx, mingo, adobriyan
Commit-ID: 76c64c5e4c47b6d28deb3cae8dfa07a93c2229dc
Gitweb: http://git.kernel.org/tip/76c64c5e4c47b6d28deb3cae8dfa07a93c2229dc
Author: Johannes Weiner <hannes@cmpxchg.org>
AuthorDate: Wed, 24 Jun 2009 21:08:36 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 25 Jun 2009 11:35:58 +0200
perf record: Fix filemap pathname parsing in /proc/pid/maps
Looking backward for the first space from the end of a line in
/proc/pid/maps does not find the start of the pathname of the mapped
file if it contains a space.
Since the only slashes we have in this file occur in the (absolute!)
pathname column of file mappings, looking for the first slash in a
line is a safe method to find the name.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090624190835.GA25548@cmpxchg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/builtin-record.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d7ebbd7..9b899ba 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -306,12 +306,11 @@ static void pid_synthesize_mmap_samples(pid_t pid)
continue;
pbf += n + 3;
if (*pbf == 'x') { /* vm_exec */
- char *execname = strrchr(bf, ' ');
+ char *execname = strchr(bf, '/');
- if (execname == NULL || execname[1] != '/')
+ if (execname == NULL)
continue;
- execname += 1;
size = strlen(execname);
execname[size - 1] = '\0'; /* Remove \n */
memcpy(mmap_ev.filename, execname, size);