From: Andi Kleen <ak@suse.de>
To: discuss@x86-64.org, rohitseth@google.com
Cc: Chuck Ebbert <76306.1226@compuserve.com>,
Linus Torvalds <torvalds@osdl.org>, Ingo Molnar <mingo@elte.hu>,
linux-kernel@vger.kernel.org
Subject: Re: [discuss] Re: [RFC, patch] i386: vgetcpu(), take 2
Date: Fri, 23 Jun 2006 14:42:23 +0200
Message-ID: <200606231442.23073.ak@suse.de>
In-Reply-To: <1151017804.14536.98.camel@galaxy.corp.google.com>
On Friday 23 June 2006 01:10, Rohit Seth wrote:
> > > I agree that we should not overload a single call (though cpu, package
> > > and node numbers do belong in one category IMO). We can have multiple
> > > calls if that is required as long as there is an efficient mechanism to
> > > provide that information.
> >
> > The current mechanism doesn't scale to many more calls, but I guess
> > I'll have to do a vDSO sooner or later.
> >
> > > Why maintain that extra logic in user space when kernel can easily give
> > > that information.
> >
> > It already does.
> >
>
> I'm missing your point here. How and where?
In /proc/cpuinfo.
Suresh and others even put a lot of thought into how to present the information
there.
Or did you just refer to the overhead of writing a /proc parser?
> > > > I've been pondering to put some more information about that
> > > > in the ELF aux vector, but exporting might work too. I suppose
> > > > exporting would require the vDSO first to give a sane interface.
> > > >
> > > Can you please tell me what more information you are thinking of putting
> > > in aux vector?
> >
> > One proposal (not fully fleshed out) was number of siblings / sockets / nodes.
> > I don't think bitmaps would work well there (and if someone really needs
> > those they can read cpuinfo again).
> >
>
> This is exactly the point, why do that expensive /proc operation when
> you can do a quick vsyscall and get all of that information. I'm not
> sure if Aux is the right direction.
It's already used for this at least (hwcap etc.).
vDSO might be better too, but I haven't thought too much about it yet.
>
> > This is mostly for OpenMP and tuning of a few functions (e.g. on AMD
> > the memory latencies vary with the number of nodes, so some functions
> > can be tuned in different ways based on that)
> >
> > > You are absolutely right that the mechanism I'm proposing makes sense
> > > only if we have more fields AND if any of those fields are dynamically
> > > changing. But this is a generic mechanism that could be extended to
> > > share any user visible information in efficient way. Once we have this
> > > in place then information like whole cpuinfo, percpu interrupts etc. can
> > > be retrieved easily.
> >
> > The problem with exposing too much is that it might be a nightmare
> > to guarantee a stable ABI for this. At least it would
> > constrain the kernel internally. Probably less is better here.
> >
>
> There will be (in all probability) requests to include as much as
> possible,
Yes, but that doesn't mean all these requests make sense or should
actually be followed :)
> but I think that should be manageable with sensible API.
Not sure. Leaner interfaces are really better here.
It's one of the lessons I learned from libnuma - I provide a lot of tools,
but nearly all people are perfectly satisfied with the total basics. So
it's better to start small and only add stuff when there is really a clear
use case.
> Okay. I just cooked that example for some monitoring process to find out
> the interrupts /sec on that CPU. But as you mentioned above sibling,
> sockets, nodes, flags, and even other characteristics like current
> p-state are all important information that will help applications
> sitting in user land (even if some of them will be used only couple of
> times in the life of a process).
Ok, you want faster monitoring applications? Some faster way than
/proc for some stuff probably makes sense - but I don't think shared
mappings are the right way to do it.
There are still a lot of other possibilities for this, like relayfs
or binary /proc files.
> Side note: I don't want to delay the vgetcpu call into mainline because
> of this discussion
I'll probably delay it until after 2.6.18.
> (as long as there is no cpuid and tcache in that
> call).
What do you not like about tcache?
-Andi
Thread overview: 24+ messages
2006-06-21 7:27 [RFC, patch] i386: vgetcpu(), take 2 Chuck Ebbert
2006-06-21 8:15 ` Ingo Molnar
2006-06-21 17:38 ` Artur Skawina
2006-06-28 5:44 ` Paul Jackson
2006-06-28 8:53 ` Andi Kleen
2006-06-28 9:00 ` Ingo Molnar
2006-06-29 8:47 ` Paul Jackson
2006-06-21 9:26 ` Andi Kleen
2006-06-21 9:35 ` Ingo Molnar
2006-06-21 21:54 ` Rohit Seth
2006-06-21 22:21 ` Andi Kleen
2006-06-21 22:59 ` Rohit Seth
2006-06-21 23:05 ` Andi Kleen
2006-06-21 23:18 ` Rohit Seth
2006-06-21 23:29 ` Andi Kleen
2006-06-22 0:55 ` Rohit Seth
2006-06-22 8:08 ` Andi Kleen
2006-06-22 21:06 ` Rohit Seth
2006-06-22 22:14 ` Andi Kleen
2006-06-22 23:10 ` Rohit Seth
2006-06-23 12:42 ` Andi Kleen [this message]
2006-06-24 2:06 ` [discuss] " Rohit Seth
2006-06-24 8:42 ` Andi Kleen
2006-06-27 1:13 ` Rohit Seth