From: Andi Kleen <andi@firstfloor.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>,
	linux-kernel@vger.kernel.org, libc-alpha@sourceware.org,
	Andi Kleen <ak@linux.intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 4/5] Add a sysconf syscall
Date: Sat, 14 May 2011 18:34:24 +0200
Message-ID: <20110514163424.GU6008@one.firstfloor.org>
In-Reply-To: <20110514065752.GA8827@elte.hu>

> What glibc does (opening /proc/stat) is rather stupid, but i think your syscall 

I don't think it has any other choice today. So if anything is "stupid"
it is the kernel for not providing efficient interfaces for this.
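
To make the cost concrete, here is a minimal sketch of the kind of text
parsing a libc has to do when the only source for the online CPU count is
a /proc/stat-style file. The helper name count_cpus_in_stat() is
hypothetical and this is not glibc's actual code; it only assumes the
usual layout of one aggregate "cpu" line followed by one "cpuN" line per
CPU.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Count per-CPU lines ("cpu0 ...", "cpu1 ...") in text laid out like
 * /proc/stat. The aggregate "cpu  ..." line has no digit right after
 * "cpu" and is skipped. Hypothetical helper, not glibc's actual code.
 */
static int count_cpus_in_stat(const char *text)
{
    int count = 0;
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, "cpu", 3) == 0 && isdigit((unsigned char)line[3]))
            count++;
        line = strchr(line, '\n');
        if (line != NULL)
            line++;  /* step past the newline to the next line */
    }
    return count;
}
```

A real caller would additionally have to open, read and close the file
each time before it could run a loop like this, which is the overhead a
direct kernel interface avoids.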

> Note that these are mostly constant or semi-constant values that are updated 
> very rarely:

That's not true. Most of them are dynamic. Take a look at the patch.
Also several of those have changed recently.

> If glibc is stupid and reads /proc/stat to receive something it could cache or 
> mmap() itself then hey, did you consider fixing glibc or creating a sane libc? 

Caching doesn't help when you have a workload that exec()s a lot.
Also some of these values can change at runtime.

> If we *really* want to offer kernel help for these values even then your 
> solution is still wrong: then the proper solution would be to define a standard 
> *data* structure and map it as a vsyscall *data* page - essentially a 
> kernel-guaranteed data mmap(), with no extra syscall needed!

That's quite complicated because several of those values are dynamically
computed from other values. Some of them are also not tied to the mm_struct,
the way the vsyscall is. For example, some of the rlimits are per task, not
per VM. Basically, your proposal doesn't interact well with clone().
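
A small illustration of limits that belong to the process rather than to
a shared page: in the sketch below a child lowers its own RLIMIT_NOFILE
soft limit and the parent's value stays unchanged. It uses fork() only to
stay portable, so the two processes do not share an address space here.
The clone() case discussed above, where the VM is shared but the limits
are not, is the one that would make a single per-mm data page awkward.
The function name is hypothetical.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the parent's RLIMIT_NOFILE soft limit is unchanged after
 * a child lowered its own copy, 0 if it changed, -1 on any error.
 * Hypothetical helper for illustration only.
 */
static int rlimit_is_private_to_child(void)
{
    struct rlimit before, after;
    int status;
    pid_t pid;

    if (getrlimit(RLIMIT_NOFILE, &before) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: halve the soft limit; lowering it needs no privilege. */
        struct rlimit rl = before;

        rl.rlim_cur = before.rlim_cur / 2;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            _exit(1);
        if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
            _exit(1);
        _exit(rl.rlim_cur == before.rlim_cur / 2 ? 0 : 1);
    }

    /* Parent: wait for the child, then re-read our own limit. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    if (getrlimit(RLIMIT_NOFILE, &after) != 0)
        return -1;

    return after.rlim_cur == before.rlim_cur;
}
```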

Even if we ignored that semantic problem it would need another writable page 
per task because the values cannot be shared.

Also, I never liked the idea of having more writable pages per task.
It increases the memory footprint of every single process. Granted, a 4K
page is not a lot, but lots of 4K pages add up. Some workloads like
to have lots of small processes, and I think that's a valuable use
case Linux should stay lean and efficient at.

[OK in theory one could do COW for the page and share it but that would
get really complicated]

I also don't think it's performance critical enough to justify a vsyscall.
The simple syscall is already orders of magnitude faster than /proc, and
seems to completely solve the performance problems we've seen.

It's also simple, straightforward, and easy to understand and maintain.
I doubt any of that would apply to a vsyscall solution.

I don't think the additional effort for a vsyscall would be worth
it at this point, unless there's some concrete example that would
justify it. Even then it wouldn't work for some of the values.

Also, a vsyscall doesn't help on non-x86 anyway.

As for efficiency: I thought about doing a batched interface where
the user could pass in an array of values to fill in. But at least in the
workloads I looked at, the application usually calls sysconf(), so
the array size would always be 1. And that's the standard interface.
This might still be worth doing, but I would like to see a concrete
use case first.
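
For illustration, a batched interface could look roughly like the sketch
below. The struct sysconf_req layout and the sysconf_batch() name are
assumptions made up for this example and are not part of the patch; the
loop simply emulates the batch in user space on top of the standard
sysconf() call, which also shows why a caller that wants one value gains
nothing from it.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical request slot: the caller fills in name, the call fills in value. */
struct sysconf_req {
    int  name;   /* one of the _SC_* constants */
    long value;  /* result, or -1 if the lookup failed or has no limit */
};

/*
 * Fill in every slot of reqs[0..n-1] and return the number of lookups
 * that failed. Emulated here with one sysconf() call per slot.
 */
static size_t sysconf_batch(struct sysconf_req *reqs, size_t n)
{
    size_t failed = 0;

    for (size_t i = 0; i < n; i++) {
        errno = 0;
        reqs[i].value = sysconf(reqs[i].name);
        /* -1 with errno set is an error; -1 alone means "no limit". */
        if (reqs[i].value == -1 && errno != 0)
            failed++;
    }
    return failed;
}
```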

> That could have other uses as well in the future.

Hmm for what?

Note that we already have a fast mechanism to pass some things to glibc:
the aux vector.
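
As an example of the aux vector path, the page size is handed to every
process at exec time as the AT_PAGESZ entry. The sketch below reads it
with getauxval(), a glibc helper from <sys/auxv.h> that is available in
newer glibc releases; the two wrapper function names are hypothetical.

```c
#include <sys/auxv.h>
#include <unistd.h>

/* Page size as the kernel passed it in the aux vector at exec time. */
static unsigned long page_size_from_auxv(void)
{
    return getauxval(AT_PAGESZ);
}

/* Returns 1 if the aux vector value agrees with sysconf(_SC_PAGESIZE). */
static int auxv_matches_sysconf(void)
{
    long from_sysconf = sysconf(_SC_PAGESIZE);

    return from_sysconf > 0 &&
           page_size_from_auxv() == (unsigned long)from_sysconf;
}
```

This only works for values that are fixed for the lifetime of the
process image, since the vector is filled in once at exec().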

> 
> That way it would much faster than your current code btw.
> 
> So unless there are some compelling arguments in favor of sys_sysconf() that i 
> missed, i have to NAK this:

Well, see above for lots of reasons you missed. They are understandable
mistakes for someone looking at the problem for the first time, though.
I'll attempt to improve the documentation next time.

-Andi

Thread overview: 25+ messages
2011-05-13 23:24 Add a sysconf syscall Andi Kleen
2011-05-13 23:24 ` [PATCH 1/5] VFS: Make symlink nesting limit a define Andi Kleen
2011-05-13 23:24 ` [PATCH 2/5] Move max_threads variable declaration into include file Andi Kleen
2011-05-13 23:24 ` [PATCH 3/5] EXEC: Use define for stack to argument size limit Andi Kleen
2011-05-13 23:24 ` [PATCH 4/5] Add a sysconf syscall Andi Kleen
2011-05-14  6:57   ` Ingo Molnar
2011-05-14 16:34     ` Andi Kleen [this message]
2011-05-16 13:36       ` Ingo Molnar
2011-05-17 11:25         ` Ingo Molnar
2011-05-16 15:51       ` Andy Lutomirski
2011-05-16 16:08         ` Andi Kleen
2011-05-16 17:06           ` Andrew Lutomirski
     [not found]           ` <OFCC4C610A.F152D00D-ON86257892.005E11F4-86257892.005E22BA@us.ibm.com>
     [not found]             ` <4DD15E9B.2090809@linux.intel.com>
2011-05-17 10:59               ` Ingo Molnar
2011-05-16 15:42   ` Denys Vlasenko
2011-05-16 16:01     ` Andi Kleen
     [not found]       ` <OF30360F87.5C6D6DCF-ON86257892.005D7E68-86257892.005E0059@us.ibm.com>
2011-05-16 17:39         ` Andi Kleen
     [not found]           ` <OFD2EE69FB.301A458A-ON86257892.00631BE8-86257892.006A93AF@us.ibm.com>
2011-05-16 20:51             ` Andi Kleen
2011-05-17 12:33       ` Denys Vlasenko
2011-05-13 23:24 ` [PATCH 5/5] Hook up sysconf syscall for all architectures Andi Kleen
2011-05-14  1:21   ` David Miller
2011-05-14  2:51     ` Andi Kleen
2011-05-14  2:23   ` Mike Frysinger
2011-05-24  1:46     ` Mike Frysinger
2011-05-26 18:04       ` Mike Frysinger
2011-05-26 18:45         ` Andi Kleen
