From: Shailabh Nagar <nagar@watson.ibm.com>
To: Paul Jackson <pj@sgi.com>
Cc: akpm@osdl.org, Valdis.Kletnieks@vt.edu, jlan@engr.sgi.com,
balbir@in.ibm.com, csturtiv@sgi.com,
linux-kernel@vger.kernel.org, hadi@cyberus.ca,
netdev@vger.kernel.org
Subject: Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats
Date: Mon, 03 Jul 2006 11:02:17 -0400 [thread overview]
Message-ID: <44A93179.2080303@watson.ibm.com> (raw)
In-Reply-To: <20060702215350.2c1de596.pj@sgi.com>
Paul Jackson wrote:
>Shailabh wrote:
>
>
>>Sends a separate "registration" message with cpumask to listen to.
>>Kernel stores (real) pid and cpumask.
>>
>>
>
>Question:
>=========
>
>Ah - good.
>
>So this means that I could configure a system with a fork/exit
>intensive, performance critical job on some dedicated CPUs, and be able
>to collect taskstat data from tasks exiting on the -other- CPUS, while
>avoiding collecting data from this special job, thus avoiding any
>taskstat collection performance impact on said job.
>
>If I'm understanding this correctly, excellent.
>
>
Yes. If no one registers to listen on a particular CPU, data from tasks
exiting on that cpu is
not sent out at all.
>Caveat:
>=======
>
>Passing cpumasks across the kernel-user boundary can be tricky.
>
>Historically, Unix has a long tradition of boloxing up the passing
>of variable length data types across the kernel-user boundary.
>
>We've got perhaps a half dozen ways of getting these masks out of the
>kernel, and three ways of getting them (or the similar nodemasks) back
>into the kernel. The three ways being used in the sched_setaffinity
>system call, the mbind and set_mempolicy system calls, and the cpuset
>file system.
>
>All three of these ways have their controversial details:
> * The kernel cpumask mask size needed for sched_setaffinity calls is
> not trivially available to userland.
> * The nodemask bit size is off by one in the mbind and set_mempolicy
> calls.
> * The CPU and Node masks are ascii, not binary, in the cpuset calls.
>
>One option that might make sense for these task stat registrations
>would be to:
> 1) make the kernel/sched.c get_user_cpu_mask() routine generic,
> moving it to non-static lib/*.c code, and
> 2) provide a sensible way for user space to query the size of
> the kernel cpumask (and perhaps nodemask while you're at it.)
>
>Currently, the best way I know for user space to query the kernels
>cpumask and nodemask size is to examine the length of the ascii
>string values labeled "Cpus_allowed:" and "Mems_allowed:" in the file
>/proc/self/status. These ascii strings always require exactly nine
>ascii chars to express each 32 bits of kernel mask code, if you include
>in the count the trailing ',' comma or '\n' newline after each eight
>ascii character word.
>
>Probing /proc/self/status fields for these mask sizes is rather
>unobvious and indirect, and requires caching the result if you care at
>all about performance. Userland code in support of your taskstat
>facility might be better served by a more obvious way to size cpumasks.
>
>... unless of course you're inclined to pass cpumasks formatted as
> ascii strings, in which case speak up, as I'd be delighted to
> throw in my 2 cents on how to do that ;).
>
>
Thanks for the size info. I did hit it while coding this up.
So I chose to use the "cpulist" ascii format that has been helpfully
provided in include/linux/cpumask.h (by whom I wonder :-)
User specified the cpumask as an ascii string containing comma separated
cpu ranges.
Kernel parses the same and stores it as a cpumask_t after which we can
iterate over the
mask using standard helpers.
Since registration/deregistration is not a common operation, the
overhead of parsing
ascii strings should be acceptable and avoids the hassles of trying to
determine kernel cpumask size. I don't know if there are buffer overflow
issues in passing a string (though I'm using the
standard netlink way of passing it up using NLA_STRING).
Will post the patch shortly.
--Shailabh
next prev parent reply other threads:[~2006-07-03 15:02 UTC|newest]
Thread overview: 151+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-06-09 7:41 [Patch][RFC] Disabling per-tgid stats on task exit in taskstats Shailabh Nagar
2006-06-09 8:00 ` Andrew Morton
2006-06-09 10:51 ` Balbir Singh
2006-06-09 11:21 ` Andrew Morton
2006-06-09 13:20 ` Shailabh Nagar
2006-06-09 18:25 ` Jay Lan
2006-06-09 19:12 ` Shailabh Nagar
2006-06-09 15:36 ` Balbir Singh
2006-06-09 18:35 ` Jay Lan
2006-06-09 19:31 ` Shailabh Nagar
2006-06-09 21:56 ` Shailabh Nagar
2006-06-09 22:42 ` Jay Lan
2006-06-09 23:22 ` Andrew Morton
2006-06-09 23:47 ` Jay Lan
2006-06-09 23:56 ` Andrew Morton
2006-06-10 12:21 ` Shailabh Nagar
2006-06-12 18:31 ` Jay Lan
2006-06-12 21:57 ` Shailabh Nagar
2006-06-10 13:05 ` Shailabh Nagar
2006-06-12 18:54 ` Jay Lan
2006-06-21 19:11 ` Jay Lan
2006-06-21 19:14 ` Jay Lan
2006-06-21 19:34 ` Shailabh Nagar
2006-06-21 23:35 ` Jay Lan
2006-06-21 23:45 ` Shailabh Nagar
2006-06-23 17:14 ` Shailabh Nagar
2006-06-23 18:19 ` Jay Lan
2006-06-23 18:53 ` Shailabh Nagar
2006-06-23 20:00 ` Jay Lan
2006-06-23 20:16 ` Shailabh Nagar
2006-06-23 20:36 ` Jay Lan
2006-06-23 21:19 ` Andrew Morton
2006-06-23 22:07 ` Jay Lan
2006-06-23 23:47 ` Andrew Morton
2006-06-24 2:59 ` Shailabh Nagar
2006-06-24 4:39 ` Andrew Morton
2006-06-24 5:59 ` Shailabh Nagar
2006-06-26 17:33 ` Jay Lan
2006-06-26 17:52 ` Shailabh Nagar
2006-06-26 17:55 ` Andrew Morton
2006-06-26 18:00 ` Shailabh Nagar
2006-06-26 18:12 ` Andrew Morton
2006-06-26 18:26 ` Jay Lan
2006-06-26 18:39 ` Andrew Morton
2006-06-26 18:49 ` Shailabh Nagar
2006-06-26 19:00 ` Jay Lan
2006-06-28 21:30 ` Jay Lan
2006-06-28 21:53 ` Andrew Morton
2006-06-28 22:02 ` Jay Lan
2006-06-29 8:40 ` Paul Jackson
2006-06-29 12:30 ` Valdis.Kletnieks
2006-06-29 16:44 ` Paul Jackson
2006-06-29 18:01 ` Andrew Morton
2006-06-29 18:07 ` Paul Jackson
2006-06-29 18:26 ` Paul Jackson
2006-06-29 19:15 ` Shailabh Nagar
2006-06-29 19:41 ` Paul Jackson
2006-06-29 21:42 ` Shailabh Nagar
2006-06-29 21:54 ` Jay Lan
2006-06-29 22:09 ` Shailabh Nagar
2006-06-29 22:23 ` Paul Jackson
2006-06-30 0:15 ` Shailabh Nagar
2006-06-30 0:40 ` Paul Jackson
2006-06-30 1:00 ` Shailabh Nagar
2006-06-30 1:05 ` Paul Jackson
[not found] ` <44A46C6C.1090405@watson.ibm.com>
2006-06-30 0:38 ` Paul Jackson
2006-06-30 2:21 ` Paul Jackson
2006-06-30 2:46 ` Shailabh Nagar
2006-06-30 2:54 ` Paul Jackson
2006-06-30 3:02 ` Paul Jackson
2006-06-29 19:22 ` Shailabh Nagar
2006-06-29 19:10 ` Shailabh Nagar
2006-06-29 19:10 ` Shailabh Nagar
2006-06-29 19:23 ` Paul Jackson
2006-06-29 19:33 ` Andrew Morton
2006-06-29 19:43 ` Shailabh Nagar
2006-06-29 19:43 ` Shailabh Nagar
2006-06-29 20:00 ` Andrew Morton
2006-06-29 22:13 ` Shailabh Nagar
2006-06-29 22:13 ` Shailabh Nagar
2006-06-29 23:00 ` jamal
2006-06-29 20:01 ` Shailabh Nagar
2006-06-29 20:01 ` Shailabh Nagar
2006-06-29 21:22 ` Paul Jackson
2006-06-29 22:54 ` jamal
2006-06-30 0:38 ` Shailabh Nagar
2006-06-30 0:38 ` Shailabh Nagar
2006-06-30 1:05 ` Andrew Morton
2006-06-30 1:11 ` Shailabh Nagar
2006-06-30 1:11 ` Shailabh Nagar
2006-06-30 1:30 ` jamal
2006-06-30 3:01 ` Shailabh Nagar
2006-06-30 3:01 ` Shailabh Nagar
2006-06-30 12:45 ` jamal
2006-06-30 2:25 ` Paul Jackson
2006-06-30 2:35 ` Andrew Morton
2006-06-30 2:43 ` Paul Jackson
2006-06-29 19:33 ` Jay Lan
2006-06-30 18:53 ` Shailabh Nagar
2006-06-30 18:53 ` Shailabh Nagar
2006-06-30 19:10 ` Shailabh Nagar
2006-06-30 19:10 ` Shailabh Nagar
2006-06-30 19:19 ` Shailabh Nagar
2006-06-30 19:19 ` Shailabh Nagar
2006-06-30 20:19 ` jamal
2006-06-30 22:50 ` Andrew Morton
2006-07-01 2:20 ` Shailabh Nagar
2006-07-01 2:20 ` Shailabh Nagar
2006-07-01 2:43 ` Andrew Morton
2006-07-01 3:37 ` Shailabh Nagar
2006-07-01 3:37 ` Shailabh Nagar
2006-07-01 3:51 ` Andrew Morton
2006-07-03 21:11 ` Shailabh Nagar
2006-07-03 21:41 ` Andrew Morton
2006-07-04 0:13 ` Shailabh Nagar
2006-07-04 0:38 ` Andrew Morton
2006-07-04 20:19 ` Paul Jackson
2006-07-04 20:22 ` Paul Jackson
2006-07-04 0:54 ` Shailabh Nagar
2006-07-04 1:01 ` Andrew Morton
2006-07-04 13:05 ` jamal
2006-07-04 15:18 ` Shailabh Nagar
2006-07-04 16:37 ` Shailabh Nagar
2006-07-04 19:24 ` jamal
2006-07-05 14:09 ` Shailabh Nagar
2006-07-05 14:09 ` Shailabh Nagar
2006-07-05 20:25 ` Chris Sturtivant
2006-07-05 20:25 ` Chris Sturtivant
2006-07-05 20:32 ` Shailabh Nagar
2006-07-05 20:32 ` Shailabh Nagar
2006-07-03 4:53 ` Paul Jackson
2006-07-03 15:02 ` Shailabh Nagar [this message]
2006-07-03 15:55 ` Paul Jackson
2006-07-03 16:31 ` Paul Jackson
2006-07-04 0:09 ` Shailabh Nagar
2006-07-04 19:59 ` Paul Jackson
2006-07-05 17:20 ` Jay Lan
2006-07-05 17:20 ` Jay Lan
2006-07-05 18:18 ` Shailabh Nagar
2006-07-05 18:18 ` Shailabh Nagar
2006-06-30 22:56 ` Andrew Morton
2006-06-29 18:05 ` Nick Piggin
2006-06-29 12:42 ` Shailabh Nagar
2006-06-24 3:08 ` Shailabh Nagar
2006-06-21 20:38 ` Andrew Morton
2006-06-21 21:31 ` Shailabh Nagar
2006-06-21 21:45 ` Jay Lan
2006-06-21 21:54 ` Andrew Morton
2006-06-21 22:19 ` Jay Lan
2006-06-21 21:59 ` Shailabh Nagar
2006-06-09 15:55 ` Chris Sturtivant
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=44A93179.2080303@watson.ibm.com \
--to=nagar@watson.ibm.com \
--cc=Valdis.Kletnieks@vt.edu \
--cc=akpm@osdl.org \
--cc=balbir@in.ibm.com \
--cc=csturtiv@sgi.com \
--cc=hadi@cyberus.ca \
--cc=jlan@engr.sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pj@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.