From: Ingo Molnar <mingo@elte.hu>
To: Peter Zijlstra <peterz@infradead.org>, Greg KH <greg@kroah.com>
Cc: Lin Ming <ming.m.lin@intel.com>,
Corey Ashford <cjashfor@linux.vnet.ibm.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Paul Mundt <lethal@linux-sh.org>,
"eranian@gmail.com" <eranian@gmail.com>,
"Gary.Mohr@Bull.com" <Gary.Mohr@bull.com>,
"arjan@linux.intel.com" <arjan@linux.intel.com>,
"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
Paul Mackerras <paulus@samba.org>,
"David S. Miller" <davem@davemloft.net>,
Russell King <rmk+kernel@arm.linux.org.uk>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Will Deacon <will.deacon@arm.com>,
Maynard Johnson <mpjohn@us.ibm.com>, Carl Love <carll@us.ibm.com>,
Kay Sievers <kay.sievers@vrfy.org>,
lkml <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: [rfc] Describe events in a structured way via sysfs
Date: Fri, 21 May 2010 11:40:53 +0200
Message-ID: <20100521094053.GA4658@elte.hu>
In-Reply-To: <1274429038.1674.1684.camel@laptop>
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2010-05-20 at 16:12 -0700, Greg KH wrote:
> > How deep in the device tree are you really going to be
> > caring about? It sounds like the large majority of
> > events are only going to be coming from the "system"
> > type objects (cpu, nodes, memory, etc.) and very few
> > would be from things that we consider a 'struct
> > device' today (like a pci, usb, scsi, or input, etc.)
>
> The general noise I hear from the hardware people is
> that we'll see more and more device-level stuff - bus
> bridges/controller and actual devices (GPUs, NICs etc.)
> will be wanting to export performance metrics.
There's (much) more:
- laptops want to provide power level/usage metrics,
- we could express a lot of special, lower level
(transport specific) disk IO stats via events as well -
without having to push those stats to a higher level
(where it might not make sense). Currently such kinds
of stats/metrics are provided in a very device/subsystem
specific way, if they are provided at all.
Also, we already have quite a few per device tracepoints
upstream. Here are a few examples:
- GPU tracepoints (trace_i915_gem_request_submit(), etc.)
- WIFI tracepoints (trace_iwlwifi_dev_ioread32(), etc.)
- block tracepoints (trace_block_bio_complete())
So these would be attached to:
# GEM events of drm/card0:
/sys/devices/pci0000:00/0000:00:02.0/drm/card0/events/i915_gem_request_submit/
# Wifi-ioread events of wlan0:
/sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/net/wlan0/events/iwlwifi_dev_ioread32/
# whole sdb disk events:
/sys/block/sdb/events/block_bio_complete/
# sdb1 partition events:
/sys/block/sdb/sdb1/events/block_bio_complete/
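To make the layout above concrete, here is a minimal sketch of how a tool might discover which nodes carry events. It assumes the proposed (not yet existing) convention that a node exposes its events as subdirectories of an events/ directory; find_event_nodes() is a hypothetical helper name, not an existing API.

```python
import os


def find_event_nodes(root):
    """Map each node under root that has an events/ directory
    to the sorted list of event names found there.

    Assumes the proposed layout <node>/events/<event_name>/,
    which is an RFC idea and not an existing kernel interface.
    """
    found = {}
    for dirpath, dirnames, _filenames in os.walk(root):
        if "events" in dirnames:
            events_dir = os.path.join(dirpath, "events")
            found[dirpath] = sorted(
                name
                for name in os.listdir(events_dir)
                if os.path.isdir(os.path.join(events_dir, name))
            )
            # The events/ directory holds event descriptions, not
            # child nodes, so do not descend into it.
            dirnames.remove("events")
    return found
```

Under that assumption, calling find_event_nodes("/sys/block") would report both sdb and sdb/sdb1 as separate nodes, each with its own block_bio_complete entry.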
And we also have 'software nodes' in /sys that have events
upstream here and today. For example for SLAB we already
have kmalloc/kfree tracepoints (trace_kmalloc() and
trace_kfree()):
# all kmalloc events:
/sys/kernel/slab/events/
# kmalloc events for sighand_cache:
/sys/kernel/slab/sighand_cache/events/kmalloc/
# kfree events for sighand_cache:
/sys/kernel/slab/sighand_cache/events/kfree/
In general the set of events we have upstream is growing
along an exponential curve (there's over a hundred now,
via tracepoints).
They are either logically attached to the hardware
topology of the system (as in the first set of examples
above), or are attached to the software/subsystem object
topology of the kernel (some examples of which are
described in the second set of examples above).
Sometimes there are aliasing/filtering relationships
between events, which are expressed very well via the
hierarchy and granularity of sysfs.
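One way to read the aliasing relationship: a bio completing on sdb1 is also a bio completing on sdb, and the path nesting already encodes that. A rough sketch follows, assuming the proposed <node>/events/ convention; enclosing_event_nodes() is a hypothetical helper, not an existing interface.

```python
import os


def enclosing_event_nodes(node, root):
    """Yield node and each of its ancestors, up to and including
    root, that has an events/ directory.

    Assumes the proposed <node>/events/ layout from this RFC.
    """
    node = os.path.abspath(node)
    root = os.path.abspath(root)
    while True:
        if os.path.isdir(os.path.join(node, "events")):
            yield node
        if node == root:
            break
        parent = os.path.dirname(node)
        if parent == node:
            # Reached the filesystem root without meeting root.
            break
        node = parent
```

For /sys/block/sdb/sdb1 this would yield the partition first and then the whole disk, i.e. the finer-grained node followed by the coarser one that aliases it.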
New events would go into that topology there in a natural
way.
For example general hugepage tracepoints (should we
introduce any) would go into the existing hugepage node:
/sys/kernel/mm/hugepages/events/...
All in all, these existing and future events, both of
hardware and software type, are literally begging to be
attached to nodes in /sys :-)
If we created a separate eventfs for it we'd have to start
with duplicating all the topology/hierarchy/structure that
is present in sysfs already. (and diluting /sys's
utility)
That would be a bad thing, so it would be nice if we found
a workable solution here. We could split up the record
format some more:
/sys/kernel/sched/events/sched_wakeup/format/
/sys/kernel/sched/events/sched_wakeup/format/common_type/
/sys/kernel/sched/events/sched_wakeup/format/common_flags/
/sys/kernel/sched/events/sched_wakeup/format/common_preempt_count/
/sys/kernel/sched/events/sched_wakeup/format/common_pid/
/sys/kernel/sched/events/sched_wakeup/format/common_lock_depth/
/sys/kernel/sched/events/sched_wakeup/format/comm/
/sys/kernel/sched/events/sched_wakeup/format/pid/
/sys/kernel/sched/events/sched_wakeup/format/prio/
/sys/kernel/sched/events/sched_wakeup/format/success/
/sys/kernel/sched/events/sched_wakeup/format/target_cpu/
Into single-value files. But this would add significant
parsing overhead (plus significant allocation overhead),
for no tangible benefit.
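For comparison, a single format file can be consumed with one read and a small parser. The sketch below assumes a line layout modeled on the existing ftrace format files ("field:<type> <name>; offset:N; size:N;"); the sample text and its offsets are illustrative only, and parse_format() is a hypothetical helper.

```python
# Illustrative sample only: the offsets and sizes are made up for
# the example and are not taken from a real kernel.
SAMPLE = """\
name: sched_wakeup
format:
    field:unsigned short common_type; offset:0; size:2;
    field:char comm[16]; offset:12; size:16;
    field:pid_t pid; offset:28; size:4;
"""


def parse_format(text):
    """Parse 'field:<type> <name>; offset:N; size:N;' lines into
    a dict keyed by field name. Other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("field:"):
            continue
        attrs = {}
        for part in line.split(";"):
            part = part.strip()
            if part:
                key, _, value = part.partition(":")
                attrs[key] = value.strip()
        # Split the C declaration into its type and its name.
        ctype, _, name = attrs["field"].rpartition(" ")
        name = name.split("[", 1)[0]  # drop an array suffix like [16]
        fields[name] = {
            "type": ctype,
            "offset": int(attrs["offset"]),
            "size": int(attrs["size"]),
        }
    return fields
```

With the split layout, a reader would instead need at least one open/read/close per field, ten for the sched_wakeup fields listed above, before it could assemble the same table.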
The problem with /proc was always the lack of standard
structure and the lack of performance - while the format
file is about _more_ structure.
Increasing structure parsing overhead does not look like
the right answer to that problem.
Hm?
Ingo