From: Wu Fengguang <fengguang.wu@intel.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Steven Rostedt" <rostedt@goodmis.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Larry Woodman" <lwoodman@redhat.com>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Pekka Enberg" <penberg@cs.helsinki.fi>,
	"Eduard - Gabriel Munteanu" <eduard.munteanu@linux360.ro>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	"Matt Mackall" <mpm@selenic.com>,
	"Alexey Dobriyan" <adobriyan@gmail.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 5/5] proc: export more page flags in /proc/kpageflags
Date: Tue, 28 Apr 2009 16:33:20 +0800
Message-ID: <20090428083320.GB17038@localhost>
In-Reply-To: <20090428065507.GA2024@elte.hu>

On Tue, Apr 28, 2009 at 08:55:07AM +0200, Ingo Molnar wrote:
> 
> * Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > Export 9 page flags in /proc/kpageflags, and 8 more for kernel developers.
> > 
> > 1) for kernel hackers (on CONFIG_DEBUG_KERNEL)
> >    - all available page flags are exported, and
> >    - exported as is
> > 2) for admins and end users
> >    - only the more `well known' flags are exported:
> > 	11. KPF_MMAP		(pseudo flag) memory mapped page
> > 	12. KPF_ANON		(pseudo flag) memory mapped page (anonymous)
> > 	13. KPF_SWAPCACHE	page is in swap cache
> > 	14. KPF_SWAPBACKED	page is swap/RAM backed
> > 	15. KPF_COMPOUND_HEAD	(*)
> > 	16. KPF_COMPOUND_TAIL	(*)
> > 	17. KPF_UNEVICTABLE	page is in the unevictable LRU list
> > 	18. KPF_HWPOISON	hardware detected corruption
> > 	19. KPF_NOPAGE		(pseudo flag) no page frame at the address
> > 
> > 	(*) For compound pages, exporting _both_ head/tail info enables
> > 	    users to tell where a compound page starts/ends, and its order.
> > 
> >    - limit flags to their typical usage scenario, as indicated by KOSAKI:
> > 	- LRU pages: only export relevant flags
> > 		- PG_lru
> > 		- PG_unevictable
> > 		- PG_active
> > 		- PG_referenced
> > 		- page_mapped()
> > 		- PageAnon()
> > 		- PG_swapcache
> > 		- PG_swapbacked
> > 		- PG_reclaim
> > 	- no-IO pages: mask out irrelevant flags
> > 		- PG_dirty
> > 		- PG_uptodate
> > 		- PG_writeback
> > 	- SLAB pages: mask out overloaded flags:
> > 		- PG_error
> > 		- PG_active
> > 		- PG_private
> > 	- PG_reclaim: mask out the overloaded PG_readahead
> > 	- compound flags: only export huge/gigantic pages
> > 
> > Here are the admin/linus views of all page flags on a newly booted nfs-root system:
> > 
> > # ./page-types # for admin
> >          flags  page-count       MB  symbolic-flags                     long-symbolic-flags
> > 0x000000000000      491174     1918  ____________________________                
> > 0x000000000020           1        0  _____l______________________       lru      
> > 0x000000000028        2543        9  ___U_l______________________       uptodate,lru
> > 0x00000000002c        5288       20  __RU_l______________________       referenced,uptodate,lru
> > 0x000000004060           1        0  _____lA_______b_____________       lru,active,swapbacked
> 
> I think i have to NAK this kind of ad-hoc instrumentation of kernel 
> internals and statistics until we clear up why such instrumentation 
> measures are being accepted into the MM while other, more dynamic 
> and more flexible MM instrumentation are being resisted by Andrew.

An unexpected NAK - to throw away an orange because we are to have an apple? ;-)

Anyway, here are the missing rationales.

1) FAST

It takes merely 0.2s to scan 4GB worth of pages:

        ./page-types  0.02s user 0.20s system 99% cpu 0.216 total
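
As a back-of-the-envelope check on that timing, the sketch below works out
how much data such a scan reads. It assumes 4 KiB pages (the page size is
not stated above) and relies on /proc/kpageflags holding one 64-bit word per
page frame; the function name is made up for illustration.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages, not stated in the message
ENTRY_SIZE = 8     # /proc/kpageflags exposes one 64-bit word per page frame


def kpageflags_scan_cost(ram_bytes, elapsed_s):
    """Return (page frames, bytes read, page frames per second)."""
    pages = ram_bytes // PAGE_SIZE
    bytes_read = pages * ENTRY_SIZE
    return pages, bytes_read, pages / elapsed_s


# 4GB of memory scanned in the 0.216s wall-clock time quoted above.
pages, bytes_read, rate = kpageflags_scan_cost(4 * 1024**3, 0.216)
print(pages, "page frames,", bytes_read // 2**20, "MiB read,",
      round(rate / 1e6, 2), "M pages/s")
```

Under those assumptions the scan reads only about 8 MiB in total, which is
consistent with the run being dominated by a fraction of a second of system
time.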

2) SIMPLE

/proc/kpageflags will be a *long standing* hack we have to live with -
it was originally introduced by Matt for shared memory accounting and as
a facility to analyze applications' memory consumption, in the hope that
it would also help kernel developers someday.

So why not extend and embrace it, in a straightforward way?
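
To illustrate how little a consumer of this interface needs, here is a
minimal sketch of a page-types-like summary. It is not the page-types tool
itself. Bits 11-19 follow the numbering listed in this patch revision, bits
2, 3, 5, 6 and 14 agree with the sample output quoted above, and the
remaining low bits follow the existing kpageflags layout; merged kernels may
number the later bits differently. The short names and all function names
are illustrative assumptions.

```python
import struct
from collections import Counter

# Bit number -> short name. Numbering per this patch revision (assumption:
# other kernel versions may assign bits 17 and up differently).
KPF_NAMES = {
    0: "locked", 1: "error", 2: "referenced", 3: "uptodate", 4: "dirty",
    5: "lru", 6: "active", 7: "slab", 8: "writeback", 9: "reclaim",
    10: "buddy", 11: "mmap", 12: "anonymous", 13: "swapcache",
    14: "swapbacked", 15: "compound_head", 16: "compound_tail",
    17: "unevictable", 18: "hwpoison", 19: "nopage",
}


def describe(flags):
    """Comma-separated names of the set bits, lowest bit first."""
    return ",".join(name for bit, name in sorted(KPF_NAMES.items())
                    if (flags >> bit) & 1)


def histogram(buf):
    """Count identical 64-bit flag words in a raw kpageflags buffer."""
    usable = len(buf) - len(buf) % 8   # ignore a trailing partial word
    return Counter(word for (word,) in struct.iter_unpack("=Q", buf[:usable]))


def scan(path="/proc/kpageflags", chunk=1 << 20):
    """Histogram of the whole file; normally needs root to read."""
    counts = Counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            counts += histogram(buf)
    return counts


def report(counts, page_size=4096):
    """Format rows as: flags, page count, MB, symbolic names."""
    return ["0x%012x %10d %8d  %s"
            % (flags, n, (n * page_size) >> 20, describe(flags))
            for flags, n in sorted(counts.items())]
```

Running `print("\n".join(report(scan())))` as root would print one row per
distinct flag combination, in the spirit of the table quoted above; the
4 KiB page size used for the MB column is again an assumption.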

3) USE CASES

I have taken / will take advantage of the above page-types command in a
number of ways:
- to help track down memory leaks (the recent trace/ring_buffer.c case)
- to estimate the system-wide readahead miss ratio
- Andi wants to examine the major page types in different workloads
  (for the hwpoison work)
- Me too, for the fun of learning: read/write/lock/whatever a lot of pages
  and examine their flags, to get an idea of some random kernel behaviors.
  (the dynamic tracing tools can be more helpful, as a different view)

4) COMPLEMENTARITY

In some cases the dynamic tracing tool is not enough (or too complex)
to rebuild the current status view.

I myself have a dynamic readahead tracing tool (very useful!).
At the same time I also use the readahead accounting numbers, the
/proc/filecache tool (frequently!), and the above page-types tool.
I simply need them all - they are handy for different cases.

Thanks,
Fengguang

> The above type of condensed information can be built out of dynamic 
> trace data too - and much more. Being able to track page state 
> transitions is very valuable when debugging VM problems. One such 
> 'view' of trace data would be a summary histogram like above.
> 
> ( done after a "echo 3 > /proc/sys/vm/drop_caches" to make sure all 
>   interesting pages have been re-established and their state is 
>   present in the trace. )
> 
> The SLAB code already has such a facility, kmemtrace: it's very 
> useful and successful in visualizing complex SLAB details, both 
> dynamically and statically.
> 
> I think the same general approach should be used for the page 
> allocator too (and for the page cache and some other struct page 
> based caches): the life-time of an object should be followed. If we 
> capture the important details we capture the big picture too. Pekka 
> already sent an RFC patch to extend kmemtrace in such a fashion. Why 
> is that more useful method not being pursued?
> 
> By extending upon the (existing) /proc/kpageflags hack a usecase is 
> taken away from the tracing based solution and a needless overlap is 
> created - and that's not particularly helpful IMHO. We now have all 
> the facilities upstream that allow us to do intelligent 
> instrumentation - we should make use of them.
> 
> 	Ingo
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
