From: Andrew Morton <akpm@linux-foundation.org>
To: Ulrich Drepper <drepper@redhat.com>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	Gautham R Shenoy <ego@in.ibm.com>,
	Dipankar Sarma <dipankar@in.ibm.com>, Paul Jackson <pj@sgi.com>,
	Ingo Molnar <mingo@elte.hu>, Oleg Nesterov <oleg@tv-sign.ru>
Subject: Re: getting processor numbers
Date: Tue, 3 Apr 2007 16:23:49 -0700
Message-ID: <20070403162349.583adf84.akpm@linux-foundation.org>
In-Reply-To: <4612DCA2.2090400@redhat.com>

On Tue, 03 Apr 2007 16:00:50 -0700
Ulrich Drepper <drepper@redhat.com> wrote:

> Andrew Morton wrote:
> > Now it could be argued that the current behaviour is the sane thing: we
> > allow the process to "pin" itself to not-present CPUs and just handle it in
> > the CPU scheduler.
> 
> As a stop-gap solution Jakub will likely implement the sched_getaffinity
> hack.  So, it would really be best to get the masks updated.
> 
> 
> But all this of course does not solve the issue sysconf() has.  In
> sysconf we cannot use sched_getaffinity since all the system's CPUs must
> be reported.

OK.

This is exceptionally gruesome, but one could run sched_getaffinity()
against pid 1 (init).  Which will break nicely in the OS-virtualised future
when the system has multiple pid-1-inits running in containers...

> 
> > Is it kernel overhead, or userspace?  The overhead of counting the bits?
> 
> The overhead I meant is userland.
> 

OK.  Your cost of counting those bits is proportional to CONFIG_NR_CPUS.

It's a bit sad that sys_sched_getaffinity() returns sizeof(cpumask_t),
because that means that userspace must handle 256 or whatever CPUs on a
machine which only has two CPUs.

Does anyone see a reason why sys_sched_getaffinity() cannot be altered to
return maximum-possible-cpu-id-on-this-machine?  That way, your hweight
operation will be much faster on sane-sized machines.

> 
> > Because sched_getaffinity() could be easily sped up in the case where
> > it is operating on the current process.
> 
> If there is a possibility to treat this case specially and make it faster,
> please do so.  It would be best to allow pid==0 as a special case so
> that callers don't have to find out the TID (which they shouldn't have
> to know).
> 

OK.

Does anyone see a reason why we cannot do this?

--- a/kernel/sched.c~sched_getaffinity-speedup
+++ a/kernel/sched.c
@@ -4381,8 +4381,12 @@ long sched_getaffinity(pid_t pid, cpumas
 	struct task_struct *p;
 	int retval;
 
-	lock_cpu_hotplug();
-	read_lock(&tasklist_lock);
+	if (pid) {
+		lock_cpu_hotplug();
+		read_lock(&tasklist_lock);
+	} else {
+		preempt_disable();	/* Prevent CPU hotplugging */
+	}
 
 	retval = -ESRCH;
 	p = find_process_by_pid(pid);
@@ -4396,12 +4400,13 @@ long sched_getaffinity(pid_t pid, cpumas
 	cpus_and(*mask, p->cpus_allowed, cpu_online_map);
 
 out_unlock:
-	read_unlock(&tasklist_lock);
-	unlock_cpu_hotplug();
-	if (retval)
-		return retval;
-
-	return 0;
+	if (pid) {
+		read_unlock(&tasklist_lock);
+		unlock_cpu_hotplug();
+	} else {
+		preempt_enable();
+	}
+	return retval;
 }
 
 /**
_

> 
> > Anyway, where do we stand?  Assuming we can address the CPU hotplug issues,
> > does sched_getaffinity() look like it will be suitable?
> 
> It's only usable for the special case on the OpenMP code where the
> number of threads is used to determine the number of worker threads.
> For sysconf() we still need better support.  Maybe now somebody will
> step up and say they need faster sysconf as well.

I guess we could add a simple sys_get_nr_cpus().  If we want more than that
(ie: topology, SMT/MC/NUMA/numa-distance etc) then it gets much more complex
and sysfs is more appropriate for that.

