From: Wu Fengguang <fengguang.wu@intel.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Frédéric Weisbecker" <fweisbec@gmail.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
"Li Zefan" <lizf@cn.fujitsu.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
"KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>,
"Andi Kleen" <andi@firstfloor.org>,
"Matt Mackall" <mpm@selenic.com>,
"Alexey Dobriyan" <adobriyan@gmail.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [patch] tracing/mm: add page frame snapshot trace
Date: Sun, 10 May 2009 16:35:17 +0800
Message-ID: <20090510083517.GB5794@localhost>
In-Reply-To: <20090509140512.GA22000@elte.hu>
On Sat, May 09, 2009 at 10:05:12PM +0800, Ingo Molnar wrote:
>
> * Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > > ( And even for tasks, which are perhaps the hardest to iterate, we
> > > can still do the /proc method of iterating up to the offset by
> > > counting. It wastes some time for each separate thread as it has
> > > to count up to its offset, but it still allows the dumping itself
> > > to be parallelised. Or we could dump blocks of the PID hash array.
> > > That distributes tasks well, and can be iterated very easily with
> > > low/zero contention. The result will come out unordered in any
> > > case. )
> >
> > For task/file based page walking, the best parallelism unit can be
> > the task/file, instead of page segments inside them.
> >
> > And there is the sparse file problem. There will be large holes in
> > the address space of file and process(and even physical memory!).
>
> If we want to iterate in the file offset space then we should use
> the find_get_pages() trick: use the page radix tree and do gang
> lookups in ascending order. Holes will be skipped over in a natural
> way in the tree.
Right. I actually have code doing this, very neat trick.
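A minimal user-space sketch of the gang-lookup idea, for illustration only. It assumes a sorted Python list standing in for the page radix tree; `gang_lookup` and `walk_range` are hypothetical names, not kernel functions, and the batch size is arbitrary. It only shows why holes cost nothing: each lookup jumps straight to the next populated index.

```python
from bisect import bisect_left


def gang_lookup(populated, start, max_items):
    """Return up to max_items populated indices >= start, in ascending order.

    `populated` is a sorted list that stands in for the radix tree.
    """
    pos = bisect_left(populated, start)
    return populated[pos:pos + max_items]


def walk_range(populated, first, last, batch=4):
    """Yield every populated index in [first, last], one batch at a time."""
    next_index = first
    while next_index <= last:
        found = gang_lookup(populated, next_index, batch)
        if not found:
            break  # nothing populated at or beyond next_index
        for index in found:
            if index > last:
                return  # walked past the requested range
            yield index
        # Resume just after the last index seen; holes are never visited.
        next_index = found[-1] + 1


# A sparse "file": three pages at the start, two far away, one beyond the range.
cached = [0, 1, 2, 500000, 500001, 9000000]
pages_in_range = list(walk_range(cached, 0, 1000000))
```

The number of lookups depends on how many pages are populated, not on the size of the range being walked.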
> Regarding iterators, i think the best way would be to expose a
> number of 'natural iterators' in the object collection directory.
> The current dump_range could be changed to "pfn_index" (it's really
> a 'physical page number' index and iterator), and we could introduce
> a couple of other indices as well:
>
> /debug/tracing/objects/mm/pages/pfn_index
> /debug/tracing/objects/mm/pages/filename_index
> /debug/tracing/objects/mm/pages/task_index
> /debug/tracing/objects/mm/pages/sb_index
How about
/debug/tracing/objects/mm/pages/walk-pfn
/debug/tracing/objects/mm/pages/walk-file
/debug/tracing/objects/mm/pages/walk-task
/debug/tracing/objects/mm/pages/walk-fs
(fs may be a better-known name than sb?)
They begin with a verb, because they are verbs when we echo some
parameters into them ;-)
> "filename_index" would take a file name (a string), and would dump
> all pages of that inode - perhaps with an additional index/range
> parameter as well. For example:
>
> echo "/home/foo/bar.txt 0 1000" > filename_index
Better to use
"0 1000 /home/foo/bar.txt"
because there will be files named "/some/file 001".
But then echo will append a trailing '\n' to the filename, and we are
faced with the question of whether to ignore that trailing '\n'.
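A small sketch of how a "range first, filename last" request could be parsed, for illustration only. `parse_walk_request` is a hypothetical helper, and dropping exactly one trailing '\n' is an assumed policy; the thread leaves that question open.

```python
def parse_walk_request(line):
    """Parse '<start> <end> <path>' into (start, end, path).

    The path comes last, so it may contain spaces. At most one trailing
    newline is dropped (assumed policy: treat it as the one echo added).
    """
    if line.endswith("\n"):
        line = line[:-1]
    # Split on the first two spaces only; the remainder is the path.
    parts = line.split(" ", 2)
    if len(parts) != 3:
        raise ValueError("expected '<start> <end> <path>'")
    start, end, path = parts
    return int(start), int(end), path


request = parse_walk_request("0 1000 /some/file 001\n")
```

The cost of this policy is that a file whose name really ends in '\n' can only be requested by writing the name followed by one more '\n'.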
> Would look up that file and dump any pages in the page cache related
> to that file, in the 0..1000 pages offset range.
>
> ( We could support the 'batching' of such requests too, so
> multi-line strings can be used to request multiple files, via a
> single system call.
Yes, I'd expect it to make some difference in efficiency, when there
are many small files.
> We could perhaps even support directories and do
> directory-and-all-child-dentries/inodes recursive lookups. )
Maybe; we could do this when such a need arises.
> Other indices/iterators would work like this:
>
> echo "/var" > sb_index
>
> Would try to find the superblock associated to /var, and output all
> pages that relate to that superblock. (it would iterate over all
> inodes and look them all up in the pagecache and dump any matches)
Can we buffer that much output in the kernel? Even if ftrace has no such
limitation, it may not be a good idea to pin too many pages in the
ring buffer.
I do need this feature. But it sounds like a mixture of a
"files-inside-sb" walker and a "pages-inside-file" walker.
It's unclear how much it would duplicate the functionality of the
"files object collection" to be added in:
/debug/tracing/objects/mm/files/*
For example,
/debug/tracing/objects/mm/files/walk-fs
/debug/tracing/objects/mm/files/walk-dirty
/debug/tracing/objects/mm/files/walk-global
and some filtering options, like size, cached_size, etc.
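The split between a "files-inside-sb" walker and a "pages-inside-file" walker can be sketched in user space as two composed generators, for illustration only. All names here (`walk_file`, `walk_fs`, `read_chunk`) are hypothetical, a dict of sets stands in for the inodes and their page cache, and nothing below reflects how ftrace actually buffers its output. It only shows that a consumer pulling bounded chunks never needs the whole result held at once.

```python
from itertools import islice


def walk_file(cached_pages):
    """Pages-inside-file walker: yield cached page offsets in ascending order."""
    yield from sorted(cached_pages)


def walk_fs(files):
    """Files-inside-sb walker, composed with the per-file walker."""
    for name in sorted(files):
        for offset in walk_file(files[name]):
            yield name, offset


def read_chunk(walker, max_records):
    """Pull at most max_records records, so each read stays bounded."""
    return list(islice(walker, max_records))


# Hypothetical file system: file name -> set of cached page offsets.
fs = {"/var/log/a": {0, 1, 7}, "/var/log/b": set(), "/var/run/c": {3}}
walker = walk_fs(fs)
first_chunk = read_chunk(walker, 2)
```

Because the per-file walker is a separate piece, the same file iteration could feed either a page dump or a file-level listing with filters.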
> Alternatively, we could do a reverse look up for the inode from the
> pfn, and output that name. That would bloat the records a bit, and
> would be more costly as well.
That sounds like "describe-pfn" and can serve as a good debugging tool.
> The 'task_index' would output based on a PID, it would find the mm
> of that task and dump all pages associated to that mm. Offset/range
> info would be virtual address page index based.
Right.
Thanks,
Fengguang