How to notify app of changed cpu/mem/io node configuration?

linux-hotplug.vger.kernel.org archive mirror

* How to notify app of changed cpu/mem/io node configuration?
@ 2004-06-29  0:38 Paul Jackson
  2004-06-29  6:24 ` Greg KH
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-29  0:38 UTC (permalink / raw)
  To: linux-hotplug

How is it that applications will be notified of changes in a system's
configuration, such as cpu, memory or i/o nodes coming online and going
offline?

I have not followed the hotplug email lists (neither this one, nor the
memory or cpu or node hotplug lists) closely enough to know if this is
well understood, or not.  My apologies if I am covering old ground.

The cpuset work that Simon Derr (Bull) and I are doing, to support
constrained placement of sets of tasks on possibly exclusive sets of
CPUs and Memory Nodes, will eventually require some means to notify
tasks if the CPU/Memory nodes allowed to them by their cpuset have
changed, so that they can rebind to the appropriate new CPU or Memory
nodes.  I intend to post a patch of this cpuset work to lkml, within the
next day, for review and feedback. For now, no migration is supported,
and notification of CPU or Memory placement changes is not useful.  But,
later on, I believe it will need to be added.  In other words, if there
was an agreed mechanism for notifying tasks of nodes going on and off
line, I would hope to use the same mechanism for notifying them that the
nodes _allowed_ to them in their current cpuset had changed.  In either
case, the task had best move.

I can imagine using for notification a new signal, that could be sent by
an administrator or system service (batch manager, perhaps) to tasks if
their allowed CPUs or Memory Nodes or other such had changed.

Then, for example, shared library code could catch the signal, and
optionally reissue sched_setaffinity(), mbind() or set_mempolicy() calls
with changed values, reflecting the new resources available to that
task.  If a task refused to take the hint and move, it could be
administratively shot.
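
A minimal sketch of what such library code might look like, assuming
SIGUSR1 as a stand-in (no signal has been chosen for this) and assuming
the caller already knows the new CPU mask.  The names
install_placement_handler() and rebind_if_notified() are hypothetical.
The handler only sets a flag; the actual rebinding happens later in
normal context, since sched_setaffinity() is not on the list of
async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <string.h>

/* Set by the handler and consumed by rebind_if_notified(). */
static volatile sig_atomic_t placement_changed;

static void on_placement_signal(int signo)
{
    (void)signo;
    placement_changed = 1;   /* defer all real work to normal context */
}

/* Install the handler.  SIGUSR1 is only a placeholder signal here. */
int install_placement_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_placement_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* do not break the job's slow syscalls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Call at a convenient point in the job.
 * Returns 0 if no notice is pending, 1 if the task was rebound to
 * new_mask, and -1 if sched_setaffinity() failed.
 */
int rebind_if_notified(const cpu_set_t *new_mask)
{
    if (!placement_changed)
        return 0;
    placement_changed = 0;
    return sched_setaffinity(0, sizeof(*new_mask), new_mask) == 0 ? 1 : -1;
}
```

The same hook would be the place to reissue mbind() or set_mempolicy();
those are left out so the sketch builds without libnuma headers.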

However this is just my brainstorming, and may well not be a good way to
handle this.

By the way - on which of the Hotplug, Hotplug Memory, Hotplug CPU and/or
Hotplug Node email lists on SourceForge is it appropriate to ask this
question?  I'm guessing the plain Hotplug list, figuring it covers
overall issues not specific to CPU, Memory or Nodes.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373

-------------------------------------------------------
This SF.Net email sponsored by Black Hat Briefings & Training.
Attend Black Hat Briefings & Training, Las Vegas July 24-29 - 
digital self defense, top technical experts, no vendor pitches, 
unmatched networking opportunities. Visit www.blackhat.com
_______________________________________________
Linux-hotplug-devel mailing list  http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel


* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
@ 2004-06-29  6:24 ` Greg KH
  2004-06-29  8:37 ` Paul Jackson
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Greg KH @ 2004-06-29  6:24 UTC (permalink / raw)
  To: linux-hotplug

On Mon, Jun 28, 2004 at 05:38:08PM -0700, Paul Jackson wrote:
> How is it that applications will be notified of changes in a system's
> configuration, such as cpu, memory or i/o nodes coming online and going
> offline?

Through /sbin/hotplug.  Is that not acceptable?
I'm sure there will be a DBUS event generated from it :)

I've heard that the memory hotplug notifiers are working just fine
through that interface.

> I can imagine using for notification a new signal, that could be sent by
> an administrator or system service (batch manager, perhaps) to tasks if
> their allowed CPUs or Memory Nodes or other such had changed.

I recall some initial proposals for when CPUs were offlined, to send the
applications that were bound to those CPUs a specific signal, but I do
not know if that got implemented or not.  Anyone know?

thanks,

greg k-h



* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
  2004-06-29  6:24 ` Greg KH
@ 2004-06-29  8:37 ` Paul Jackson
  2004-06-29 22:41 ` Dave Hansen
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-29  8:37 UTC (permalink / raw)
  To: linux-hotplug

Greg KH wrote:
> Through /sbin/hotplug.  Is that not acceptable?

Hotplug is a fine tool.  Not what I had in mind however.

I'm afraid my initial question was quite incomplete.

And now that I see your initial response, perhaps off-topic for this
list as well - though not that far off.

My question is not how to translate kernel hardware config events into
executing selected user code (which is what I understand hotplug is good
at), but rather how to cause a long running big scientific or technical
job (think a 4 day Fortran program) to execute a bit of code that calls
mbind(2) and/or set_mempolicy(2), at the request of a user level system
service (think batch manager).

Consider the following scenario:

  1) A very large system - hundreds of CPUs.
  2) Running a handful of long running jobs, perhaps days.
  3) Each job is carefully placed, to use certain CPUs and Memory.
  4) Shaving a few percent off the total runtime of a job can be critical.
  5) A batch manager controls the start, stop and placement of jobs.
  6) The batch manager learns, perhaps via hotplug, that a node is going down.
  7) The batch manager now wants to move the job that was using that node.

We have enough in current 2.6 kernels to move the CPU placement, as
sched_setaffinity(2) can be applied to other tasks' pids.  However the
mbind(2) and set_mempolicy(2) calls only apply to the current task, for
good reason - too hard otherwise.  So the batch manager needs to poke a
seldom used bit of co-operating code buried in the job to be moved, to
tell it to issue the requisite mbind and set_mempolicy calls to move on
over to the new CPU and Memory resources.
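
The batch manager's half of this can be sketched roughly as below.
move_job() is a hypothetical helper, and the choice of notification
signal is left to the caller because none has been agreed on.  It shows
the asymmetry described above: the CPU mask can be changed from outside,
while the memory policy can only be requested.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical batch-manager helper: move a job's CPU placement from
 * the outside, then poke the job so it can fix its own memory policy.
 * Returns 0 on success, -1 if either step failed (errno is set).
 */
int move_job(pid_t pid, const cpu_set_t *new_cpus, int notify_signo)
{
    /* CPU affinity can be set on another task, given permission. */
    if (sched_setaffinity(pid, sizeof(*new_cpus), new_cpus) != 0)
        return -1;

    /*
     * Memory policy cannot: mbind(2) and set_mempolicy(2) act only on
     * the caller, so all we can do is ask the job to call them itself.
     */
    if (kill(pid, notify_signo) != 0)
        return -1;

    return 0;
}
```

A real job has many tasks, so the manager would presumably loop over
every pid in the job rather than make one call.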

=> The critical step that I am pondering here is:

  How does this event first enter the process space of the long running job?

Certainly, the job isn't hanging out on some D-BUS just waiting for it. 
It is too busy computing some large problem.

So, I need a low bandwidth, uni-directional, asynchronous channel that
has essentially zero cost on the receiver, until such time as it is
used.

I suspect that I am describing signals and only signals.

The only other option I'm aware of would be adding kernel code that,
similar in implementation to signals and debugging (ptrace) events,
provides hooks to enable one task to cause another to invoke from within
the kernel the mbind and set_mempolicy system calls on itself.

I suspect that this alternative (doing it in the kernel) would go over
on lkml like a lead balloon.  As well it should.

> I'm sure there will be a DBUS event generated from it :)

These messages make sense between co-operating threads that are waiting
for messages.  But I would guess (correct me if I'm wrong) they don't
provide async notification to unsuspecting tasks that were 100% busy
doing something entirely unrelated.  D-Bus sounds to me like another RPC
or Corba IPC mechanism, tailored for sensible use between desktop apps
and services.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
  2004-06-29  6:24 ` Greg KH
  2004-06-29  8:37 ` Paul Jackson
@ 2004-06-29 22:41 ` Dave Hansen
  2004-06-29 23:15 ` Paul Jackson
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Dave Hansen @ 2004-06-29 22:41 UTC (permalink / raw)
  To: linux-hotplug

On Mon, 2004-06-28 at 23:24, Greg KH wrote: 
> On Mon, Jun 28, 2004 at 05:38:08PM -0700, Paul Jackson wrote:
> > I can imagine using for notification a new signal, that could be sent by
> > an administrator or system service (batch manager, perhaps) to tasks if
> > their allowed CPUs or Memory Nodes or other such had changed.
> 
> I recall some initial proposals for when CPUs were offlined, to send the
> applications that were bound to those CPUs a specific signal, but I do
> not know if that got implemented or not.  Anyone know?

I don't see anything in the various signal.h files that looks likely.  

But, we'll probably need some kind of synchronous process notification
at some point.  If an app gets itself in a bad state such that it has no
possibility of being scheduled, or gets itself into some other hopeless
state due to a hotplug action, the kernel will likely have the option to
kill it.  

I think the problem for the kernel comes when we consider how to notify
the app.  Doing a hotplug event and having the scripts send an
appropriate signal isn't really a viable option because the kernel
wouldn't really know when each app had a chance to handle the signal. 
Sleeping for 5 seconds and hoping for the best probably isn't the best
option, either. :)

The other option would be to have the kernel send signals to the
processes, and recheck the "impossible state" after the signal has been
handled.

I don't really remember how the discussions about CPU hotplug ended, but
I wonder if a much more generic signal could be of more use than a
single task CPU hotplug signal.

What about a SIGHOTPLUG that can be used whenever the kernel notices
that a task is using a resource that's being hotplugged?

-- Dave


* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (2 preceding siblings ...)
  2004-06-29 22:41 ` Dave Hansen
@ 2004-06-29 23:15 ` Paul Jackson
  2004-06-29 23:39 ` Dave Hansen
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-29 23:15 UTC (permalink / raw)
  To: linux-hotplug

Dave wrote:
> But, we'll probably need some kind of synchronous process notification
> at some point. 

Did you mean "some kind of asynchronous ..."?

> If ... app ... hopeless ... the kernel will likely have the option to
> kill it.  

More commonly, it seems that the kernel just forces the app onto some
CPU/Memory that will work, such as the lowest currently online CPU.
Essentially, CPU 0 (and I guess Memory Node 0, in time) become the
orphanage for homeless tasks.

> Sleeping for 5 seconds and hoping for the best probably isn't the best
> option, either. :)

Yeah - the kernel doesn't really want to be playing such games.
Hence actions that might require such should be left to userland.

For example, an orderly move of apps from one Node to another
could be accomplished by user code that took its time and did what
it had to do to move (or kill) apps off the old Node, before it told
the kernel "all clear - remove old Node from service".

> I wonder if a much more generic signal could be of more use ...

That's my inclination as well.  These sorts of events are rare. Send one
generic signal, and let the recipient poke around and figure out what
has changed that it cares about and deal with it.  I could even imagine
overloading SIGPWR for this purpose, if new signal numbers/names are too
difficult to come by.
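
As a rough illustration of "poke around and figure out what has
changed", a recipient could keep a cached copy of its CPU mask and
compare it with what the kernel reports after the signal arrives.  The
struct and function names below are made up for this sketch, and only
CPUs are covered, not memory nodes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Result of comparing an old and a new CPU mask. */
struct cpu_diff {
    int gained;   /* CPUs usable now but not before */
    int lost;     /* CPUs usable before but not now */
};

struct cpu_diff diff_cpu_masks(const cpu_set_t *before, const cpu_set_t *now)
{
    struct cpu_diff d = { 0, 0 };

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        int was = CPU_ISSET(cpu, before) != 0;
        int is = CPU_ISSET(cpu, now) != 0;

        if (is && !was)
            d.gained++;
        if (was && !is)
            d.lost++;
    }
    return d;
}

/*
 * Refresh *cached from the kernel and report what changed since the
 * previous call.  Returns 0 on success, -1 if the query failed.
 */
int poll_cpu_changes(cpu_set_t *cached, struct cpu_diff *out)
{
    cpu_set_t now;

    if (sched_getaffinity(0, sizeof(now), &now) != 0)
        return -1;
    *out = diff_cpu_masks(cached, &now);
    *cached = now;
    return 0;
}
```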

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373



* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (3 preceding siblings ...)
  2004-06-29 23:15 ` Paul Jackson
@ 2004-06-29 23:39 ` Dave Hansen
  2004-06-29 23:53 ` Rusty Russell
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Dave Hansen @ 2004-06-29 23:39 UTC (permalink / raw)
  To: linux-hotplug

On Tue, 2004-06-29 at 16:15, Paul Jackson wrote:
> Dave wrote:
> > But, we'll probably need some kind of synchronous process notification
> > at some point. 
> 
> Did you mean "some kind of asynchronous ..."?

Some kind of notification that the kernel sends where it can be sure
that the app received it.  A signal sent by /sbin/hotplug is
asynchronous to the kernel, because it has no idea whether the app
handled it.  A signal sent directly by the kernel is another matter.

> > If ... app ... hopeless ... the kernel will likely have the option to
> > kill it.  
> 
> More commonly, it seems that the kernel just forces the app onto some
> CPU/Memory that will work, such as the lowest currently online CPU.
> Essentially, CPU 0 (and I guess Memory Node 0, in time) become the
> orphanage for homeless tasks.

This might, in fact, be an invalid state.  We have some silly tools on
the NUMA-Q that *have* to run on a certain node because the hardware
maps different node-specific structures at the same physical address on
different nodes.  Rudely being migrated to CPU 0 would break the app,
and could (in theory) cause data corruption.  

Now, this is a silly NUMA-Q thing, but I'm sure there are others like
that.  As Mr. Dobson has told me several times: the mask is
cpus_allowed, not cpus_preferred :)

> > Sleeping for 5 seconds and hoping for the best probably isn't the best
> > option, either. :)
> 
> Yeah - the kernel doesn't really want to be playing such games.
> Hence actions that might require such should be left to userland.
> 
> For example, an orderly moving of apps from one Node to another
> could be accomplished by user code that took its time and did what
> it had to do to move (or kill) apps off old Node, before it told
> the kernel "all clear - remove old Node from service"

I just worry about the complexity of feeding all of this information
back to the kernel.  Maybe there's a simple way to do it, but one isn't
horribly apparent to me right now.  

> > I wonder if a much more generic signal could be of more use ...
> 
> That's my inclination as well.  These sorts of events are rare. Send one
> generic signal, and let the recipient poke around and figure out what
> has changed that it cares about and deal with it.  I could even imagine
> overloading SIGPWR for this purpose, if new signal numbers/names are too
> difficult to come by.

I think we need Rusty to remind us all of what came from the last
discussion on this topic.

-- Dave




* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (4 preceding siblings ...)
  2004-06-29 23:39 ` Dave Hansen
@ 2004-06-29 23:53 ` Rusty Russell
  2004-06-30  1:22 ` Paul Jackson
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Rusty Russell @ 2004-06-29 23:53 UTC (permalink / raw)
  To: linux-hotplug

On Wed, 2004-06-30 at 08:41, Dave Hansen wrote:
> On Mon, 2004-06-28 at 23:24, Greg KH wrote: 
> > On Mon, Jun 28, 2004 at 05:38:08PM -0700, Paul Jackson wrote:
> > > I can imagine using for notification a new signal, that could be sent by
> > > an administrator or system service (batch manager, perhaps) to tasks if
> > > their allowed CPUs or Memory Nodes or other such had changed.
> > 
> > I recall some initial proposals for when CPUs were offlined, to send the
> > applications that were bound to those CPUs a specific signal, but I do
> > not know if that got implemented or not.  Anyone know?
> 
> I don't see anything in the various signal.h files that looks likely.  
> 
> But, we'll probably need some kind of synchronous process notification
> at some point.  If an app gets itself in a bad state such that it has no
> possibility of being scheduled, or gets itself into some other hopeless
> state due to a hotplug action, the kernel will likely have the option to
> kill it.  
> 
> I think the problem for the kernel comes when we consider how to notify
> the app.  Doing a hotplug event and having the scripts send an
> appropriate signal isn't really a viable option because the kernel
> wouldn't really know when each app had a chance to handle the signal. 
> Sleeping for 5 seconds and hoping for the best probably isn't the best
> option, either. :)
> 
> The other option would be to have the kernel send signals to the
> processes, and recheck the "impossible state" after the signal has been
> handled.
> 
> I don't really remember how the discussions about CPU hotplug ended, but
> I wonder if a much more generic signal could be of more use than a
> single task CPU hotplug signal.
> 
> What about a SIGHOTPLUG that can be used whenever the kernel notices
> that a task is using a resource that's being hotplugged?

SIGRECONFIG was suggested.  Default is ignored, and you get unbound from
CPU anyway.  Later on would cover memory changes as well.  But speaking
with another OS which uses this approach, they found that vendors
preferred the "run a script" approach anyway, so the solution is
probably:

	(1) If you want to do this, ask on DBUS for any naks.
	(2) If no NAKs, tell kernel.
	(3) /sbin/hotplug tells DBUS when it's done.

The real killer was that we can't add signals without breaking glibc,
since it never asks the kernel how many signals there are.
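
The three steps above can be modelled in a few lines, purely as an
illustration of the control flow.  Nothing here talks to a real D-Bus
or to the kernel: listeners, the "kernel" request and the completion
announcement are all plain callbacks, and every name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A listener returns nonzero to NAK (veto) the proposed removal. */
typedef int (*veto_fn)(int resource_id);

enum removal_result { REMOVAL_VETOED, REMOVAL_DONE, REMOVAL_FAILED };

int done_count;   /* counts completion announcements in this model */

/* Example listeners and stand-ins, only for exercising the sketch. */
int never_objects(int id)   { (void)id; return 0; }
int objects_to_cpu3(int id) { return id == 3; }
int fake_offline(int id)    { (void)id; return 0; }   /* "tell kernel" */
void note_done(int id)      { (void)id; done_count++; }

/*
 * (1) ask every listener for a NAK, (2) if there is none, ask the
 * kernel stand-in to do the removal, (3) announce completion.
 */
enum removal_result request_removal(int resource_id,
                                    const veto_fn *listeners, size_t n,
                                    int (*do_remove)(int),
                                    void (*on_done)(int))
{
    for (size_t i = 0; i < n; i++)
        if (listeners[i](resource_id))
            return REMOVAL_VETOED;

    if (do_remove(resource_id) != 0)
        return REMOVAL_FAILED;

    if (on_done)
        on_done(resource_id);
    return REMOVAL_DONE;
}
```

Note that a single NAK stops the removal before the kernel is ever
asked, which is the property the "run a script" approach relies on.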

Hope that clarifies!
Rusty.

Anyone who quotes me in their signature is an idiot -- Rusty Russell




* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (5 preceding siblings ...)
  2004-06-29 23:53 ` Rusty Russell
@ 2004-06-30  1:22 ` Paul Jackson
  2004-06-30  1:38 ` Dave Hansen
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-30  1:22 UTC (permalink / raw)
  To: linux-hotplug

Dave wrote:
> (in response to my query of whether you meant "sync" or "async"):
> Some kind of notification that the kernel sends where it can be sure
> that the app received it.

Huh - I'm confused as to what you think is essential here.  I have this
sense that you are saying we need reliably acknowledged notification.
But this seems impossible to me - one cannot have the kernel depending
on some random bit of user code sending a timely acknowledgement, much
less on such code doing something competent with the notice.

Whatever ... in any case ... one element that I consider essential is an
agreed to convention whereby a system service such as a batch manager
can send to the several tasks in a job "fair and adequate" notification
that they (the tasks) had better pack up and move - or be shot if they
don't.  In particular, the mbind() and set_mempolicy() calls can only be
done by a task on itself, not by some third party.   I doubt that the
kernel is party to or even aware of this convention.

> Rudely being migrated to CPU 0 would break the app,
> and could (in theory) cause data corruption.  

Then I am surprised that you accept the following bit of code, now
in Linus' 2.6.7:

  kernel/sched.c:migrate_all_tasks()

                if (dest_cpu == NR_CPUS) {
                        cpus_setall(tsk->cpus_allowed);
                        dest_cpu = any_online_cpu(tsk->cpus_allowed);

The usual effect of this is to force cpus_allowed to all possible CPUs,
and to set "dest_cpu = 0".  It happens in the case of a migration, when
none of the remaining CPUs allowed to a task are online.

Hmmm ... thinking out loud ... why not set cpus_allowed to the single
cpu which is the lowest numbered online cpu, rather than all cpus?
I see no need to give orphans the run of the entire city!
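
To make the two policies concrete, here is a small user-space model
using 64-bit masks.  It is not kernel code: lowest_common_cpu() is a
stand-in for any_online_cpu() under the assumption that it returns the
lowest online CPU in the mask, and MODEL_NR_CPUS is a limit of this
model only.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64   /* limit of this model, not the kernel's NR_CPUS */

/* Lowest CPU set in both masks, or MODEL_NR_CPUS if there is none. */
static int lowest_common_cpu(uint64_t allowed, uint64_t online)
{
    uint64_t both = allowed & online;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (both & (UINT64_C(1) << cpu))
            return cpu;
    return MODEL_NR_CPUS;
}

/* Current behaviour: an orphan is widened to every CPU. */
int pick_dest_widen(uint64_t *allowed, uint64_t online)
{
    int dest = lowest_common_cpu(*allowed, online);

    if (dest == MODEL_NR_CPUS) {
        *allowed = UINT64_MAX;   /* models cpus_setall() */
        dest = lowest_common_cpu(*allowed, online);
    }
    return dest;
}

/* Alternative: confine an orphan to the single lowest online CPU. */
int pick_dest_confine(uint64_t *allowed, uint64_t online)
{
    int dest = lowest_common_cpu(*allowed, online);

    if (dest == MODEL_NR_CPUS) {
        dest = lowest_common_cpu(UINT64_MAX, online);
        if (dest != MODEL_NR_CPUS)
            *allowed = UINT64_C(1) << dest;
    }
    return dest;
}
```

Both variants pick the same destination CPU; they differ only in the
mask the task is left with afterwards.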

> > ...  before it told the kernel "all clear - remove old Node from service"
> 
> I just worry about the complexity of feeding all of this information
> back to the kernel. 

Huh - what's to tell the kernel at this step?  "Remove CPU or Memory
node X from service, with ruthless abandon to any residual user tasks or
memory left there" -- that's all I see.

> I think we need Rusty to remind us all ...

Good idea !!

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (6 preceding siblings ...)
  2004-06-30  1:22 ` Paul Jackson
@ 2004-06-30  1:38 ` Dave Hansen
  2004-06-30  1:40 ` Paul Jackson
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Dave Hansen @ 2004-06-30  1:38 UTC (permalink / raw)
  To: linux-hotplug

On Tue, 2004-06-29 at 18:22, Paul Jackson wrote:
> Dave wrote:
> > (in response to my query of whether you meant "sync" or "async"):
> > Some kind of notification that the kernel sends where it can be sure
> > that the app received it.
> 
> Huh - I'm confused as to what you think is essential here.  I have this
> sense that you are saying we need reliably acknowledged notification.
> But this seems impossible to me - one cannot have the kernel depending
> on some random bit of user code sending a timely acknowledgement, much
> less on such code doing something competent with the notice.

I was thinking we could just make sure that it handled the signal.  But,
that doesn't seem horribly feasible at this point.

> > Rudely being migrated to CPU 0 would break the app,
> > and could (in theory) cause data corruption.  
> 
> Then I am surprised that you accept the following bit of code, now
> in Linus' 2.6.7:
> 
>   kernel/sched.c:migrate_all_tasks()
> 
>                 if (dest_cpu == NR_CPUS) {
>                         cpus_setall(tsk->cpus_allowed);
>                         dest_cpu = any_online_cpu(tsk->cpus_allowed);

I don't accept a lot of the cpu hotplug code as it is :).

But, it at least avoids having to really worry about processes handling
the notification, and won't kill them.  I guess slow is better than
dead.

> The usual effect of this is to force cpus_allowed to all possible CPUs,
> and to set "dest_cpu = 0".  It happens in the case of a migration, when
> none of the remaining CPUs allowed to a task are online.
> 
> Hmmm ... thinking out loud ... why not set cpus_allowed to the single
> cpu which is the lowest numbered online cpu, rather than all cpus?
> I see no need to give orphans the run of the entire city!

The only problem that might cause would be scheduler imbalances.  It
would sure beat the living daylights out of CPU0 if a lot of tasks were
sent over there.  This way, the scheduler at least has a fighting chance
to balance them out.  

> > > ...  before it told the kernel "all clear - remove old Node from service"
> > 
> > I just worry about the complexity of feeding all of this information
> > back to the kernel. 
> 
> Huh - what's to tell the kernel at this step?  "Remove CPU or Memory
> node X from service, with ruthless abandon to any residual user tasks or
> memory left there" -- that's all I see.

I was thinking about how to have a userspace /sbin/hotplug handler tell
the kernel which tasks had been migrated or handled correctly.  But,
that doesn't make _any_ sense as I think back.  I think I was in a
meeting when I sent that and my brain was turned off.

-- Dave




* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (7 preceding siblings ...)
  2004-06-30  1:38 ` Dave Hansen
@ 2004-06-30  1:40 ` Paul Jackson
  2004-06-30  2:16 ` Paul Jackson
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-30  1:40 UTC (permalink / raw)
  To: linux-hotplug

> But speaking with another OS which uses this approach, they found that
> vendors preferred the "run a script" approach anyway

Why are the "script" and the "signal" exclusive?  We seem to need both.

I'd expect that the 'batch manager' or such would want a "script" of its
choosing executed, as you describe.

But I also expect that a signal mechanism is required to deliver notice:

  from: the system service or its script (a batch manager, for example)

    to: the target task (a long running Fortran job, for example)

that some special code linked in the target task needs to make sense of
the situation and invoke the appropriate mbind/set_mempolicy calls.

Perhaps that "other OS" didn't have the difficulty we have of having to
get the target task to issue a call or two on its own behalf that can
not be issued 'by proxy'.

> The real killer was that we can't add signals without breaking glibc,

Yeah - I agree.  Don't count on glibc for anything.  Hence no new
signals, and expect to be overloading an existing signal for these
purposes, if it comes to that.
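
One way overloading might be made workable, offered only as a sketch:
send the notice with sigqueue() and a tag value, so the receiver can
tell a placement notice apart from the signal's ordinary uses.  This
technique is my assumption, not something proposed in this thread, and
PLACEMENT_TAG and the function names are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PLACEMENT_TAG 0x706c6163   /* hypothetical magic value */

static volatile sig_atomic_t placement_notices;   /* tagged arrivals */
static volatile sig_atomic_t other_uses;          /* everything else */

static void on_signal(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    /* Only queued signals carrying our tag count as placement notices. */
    if (info->si_code == SI_QUEUE &&
        info->si_value.sival_int == PLACEMENT_TAG)
        placement_notices++;
    else
        other_uses++;
}

/* Receiver side: install a handler that inspects the sender's payload. */
int install_overloaded_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Sender side: queue the signal with the tag attached. */
int send_placement_notice(pid_t pid, int signo)
{
    union sigval v;

    v.sival_int = PLACEMENT_TAG;
    return sigqueue(pid, signo, v);
}
```

A plain kill() or raise() of the same signal arrives without the tag
and is counted separately, so existing uses of the signal keep working.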

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (8 preceding siblings ...)
  2004-06-30  1:40 ` Paul Jackson
@ 2004-06-30  2:16 ` Paul Jackson
  2004-06-30 11:57 ` jlm_devel
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-30  2:16 UTC (permalink / raw)
  To: linux-hotplug

> I don't accept a lot of the cpu hotplug code as it is :).

Ah so ...

> sure beat the living daylights out of CPU0 

Eh - not the kernel's problem.  If user level code, such as a batch
manager, screws up the configuration and leaves a bunch of orphans
lying around, I'd just as soon cram them all into one corner.

But this is likely reflecting a difference in our target markets.

For my key customers, if a big job can't get exactly the resources it
signed up for, that job might as well be taken out behind the barn and
shot.  One of the first features these customers demand is some way to
guarantee that nothing else on the system will interfere with their key
job(s).  For such systems, misconfigured orphans should be quarantined
so as to have the least impact on any remaining healthy jobs.

I can well imagine that my customers' needs are not universal.

Whatever ... probably not worth worrying about.  If the user level code
doesn't like the kernel default here, then the user level code can just
arrange it so that this line of code is almost never executed.

>  I think I was in a meeting when I sent that ...

That'll do it ;).

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (9 preceding siblings ...)
  2004-06-30  2:16 ` Paul Jackson
@ 2004-06-30 11:57 ` jlm_devel
  2004-06-30 12:26 ` Paul Jackson
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jlm_devel @ 2004-06-30 11:57 UTC (permalink / raw)
  To: linux-hotplug

Paul Jackson wrote:

>> But speaking with another OS which uses this approach, they found that
>> vendors preferred the "run a script" approach anyway
>
> Why are the "script" and the "signal" exclusive?  We seem to need both.
>
> I'd expect that the 'batch manager' or such would want a "script" of its
> choosing executed, as you describe.
>
> But I also expect that a signal mechanism is required to deliver notice:
>
>   from: the system service or its script (a batch manager, for example)
>
>     to: the target task (a long running Fortran job, for example)
>
> that some special code linked in the target task needs to make sense of
> the situation and invoke the appropriate mbind/set_mempolicy calls.
>
> Perhaps that "other OS" didn't have the difficulty we have of having to
> get the target task to issue a call or two on its own behalf that can
> not be issued 'by proxy'.
>
>> The real killer was that we can't add signals without breaking glibc,
>
> Yeah - I agree.  Don't count on glibc for anything.  Hence no new
> signals, and expect to be overloading an existing signal for these
> purposes, if it comes to that.

I'm not up to date, but if I remember correctly, hotplug executes scripts
when it receives an event from the kernel.  Nothing then prevents you
from creating a script that does:

    for PID in $(ls /var/run/hotplug/signals); do
        # Name the signal: SIGUSR1 is not number 10 on every architecture.
        kill -USR1 "$PID"
    done

which sends SIGUSR1 to every process listed in /var/run/hotplug/signals.

Beyond that, you can write a small application that uses POSIX extended
signals to pass a value to the processes.

(By the way, I prefer this method to D-BUS, because it is more
"standard" and doesn't reinvent the wheel.)
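
A minimal sketch of what the receiving side of such a script might look
like, assuming the target task reserves SIGUSR1 for this notice.  The
function and variable names are hypothetical and appear nowhere in the
thread.  The handler only records that a notice arrived; the task picks
it up later from its normal control flow.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the task's main loop (hypothetical name). */
static volatile sig_atomic_t placement_changed = 0;

/* Handler: do only async-signal-safe work, so just record the notice. */
static void on_placement_signal(int signo)
{
    (void)signo;
    placement_changed = 1;
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on error. */
int install_placement_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_placement_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 if a notice arrived since the last call (and clears it), else 0. */
int consume_placement_notice(void)
{
    if (!placement_changed)
        return 0;
    placement_changed = 0;
    return 1;
}
```

A long running job would call install_placement_handler() once at
startup and check consume_placement_notice() at a convenient point in
its main loop.  Several signals arriving close together collapse into a
single notice, which is fine as long as the task re-reads its current
placement rather than counting events.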



* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (10 preceding siblings ...)
  2004-06-30 11:57 ` jlm_devel
@ 2004-06-30 12:26 ` Paul Jackson
  2004-06-30 13:08 ` Paul Jackson
  2004-06-30 17:37 ` jlm_devel
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-30 12:26 UTC (permalink / raw)
  To: linux-hotplug

jlm_devel writes:
> nothing prevent then to create a script that do

Yes - I quite agree:

 1) However it be that notice is given, a hotplug script can send it,
    in the case that the kernel first saw the event.  I was less
    focused on how the notice was sent, than I was with what form
    (signal, message, ...) the notice took when it arrived into the
    process context of the long running job needing to be moved.

 2) I too prefer signals over DBUS for the particular case I was trying
    to raise here, of delivering low bandwidth asynchronous notice to
    a long running application that it needed to run a little bit of
    specialized "memory placement" code within its process context.
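
As an illustration of the "little bit of specialized code" in point 2, a
sketch of a rebind step that has to run inside the target task.  It uses
sched_setaffinity(), mentioned at the start of this thread, as a
stand-in for the mbind()/set_mempolicy() calls, which are declared in
<numaif.h> from libnuma and follow the same pattern of being issued by
the task on its own behalf.  The helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Rebind the calling task to the single CPU `cpu`.
 * Returns 0 on success, -1 on error. */
int rebind_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof set, &set);   /* pid 0 = caller */
}

/* Lowest-numbered CPU the calling task may currently run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

In a real job the new CPU would come from wherever the batch manager or
cpuset publishes the allowed set; that lookup is left out here because
the thread does not define it.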

Note that it's not always the kernel that first notices the event.  It
might also be a batch manager that decides it needs to move some long
running jobs around, to make room for additional work.  In such a case,
the batch manager, not a hotplug script, would be sending the notice.

I would not go so far as to describe D-BUS as reinventing the signal
wheel.  These are two different mechanisms, useful for different needs. 
If anything, D-BUS reinvents the CORBA wheel, which some would say was
pregnant with opportunities for reinvention.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373



* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (11 preceding siblings ...)
  2004-06-30 12:26 ` Paul Jackson
@ 2004-06-30 13:08 ` Paul Jackson
  2004-06-30 17:37 ` jlm_devel
  13 siblings, 0 replies; 15+ messages in thread
From: Paul Jackson @ 2004-06-30 13:08 UTC (permalink / raw)
  To: linux-hotplug

jlm_devel wrote:
> using POSIX extended signal to pass some value

sigqueue(2) - thanks for pointing that out - I hadn't realized
it existed.
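
For reference, a minimal sketch of sigqueue() carrying one integer to a
handler installed with SA_SIGINFO.  What the integer means (here simply
"some value") is an assumption, and the helper names are hypothetical;
the thread does not settle on a signal number or a payload format.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Last payload received via sigqueue(), or -1 if none yet. */
static volatile sig_atomic_t last_value = -1;

/* SA_SIGINFO handler: the queued payload arrives in info->si_value. */
static void on_notice(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    if (info->si_code == SI_QUEUE)      /* ignore plain kill() senders */
        last_value = info->si_value.sival_int;
}

/* Install the handler for `signo`; returns 0 on success, -1 on error. */
int install_notice_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_notice;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Sender side: queue `signo` to `pid` with an integer payload. */
int send_notice(pid_t pid, int signo, int value)
{
    union sigval sv;

    sv.sival_int = value;
    return sigqueue(pid, signo, sv);
}

/* Accessor for the most recent payload. */
int last_notice_value(void)
{
    return last_value;
}
```

A batch manager would call send_notice(pid, SIGUSR1, value) for each
target task, subject to the same permission rules as kill(2).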

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373



* Re: How to notify app of changed cpu/mem/io node configuration?
  2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
                   ` (12 preceding siblings ...)
  2004-06-30 13:08 ` Paul Jackson
@ 2004-06-30 17:37 ` jlm_devel
  13 siblings, 0 replies; 15+ messages in thread
From: jlm_devel @ 2004-06-30 17:37 UTC (permalink / raw)
  To: linux-hotplug

Paul Jackson wrote:

> jlm_devel wrote:
>> using POSIX extended signal to pass some value
>
> sigqueue(2) - thanks for pointing that out - I hadn't realized
> it existed.

That's why I say that D-BUS reinvents the wheel: we can already use
signals to send small pieces of data.  Moreover, the Linux kernel now
supports POSIX message queues for sending more data, and if you really
want to send a huge piece of data you can use signals or message queues
to pass pointers to shared memory objects, and that is blazingly fast.

If I'm correct, D-BUS is used to send information about hotplug events
such as a USB key being plugged in.  Signals seem more appropriate and
have less overhead, since they have been optimized for a long time.  We
always reinvent the wheel because we don't know the system we're running
on well enough.  Qt and other "abstracting" toolkits don't help, because
abstraction prevents the user from using mechanisms that aren't
portable.  Ever wondered why the cp command is so blazingly fast and
uses so little CPU?  You need to dig deeper into the system basics.
Even glibc is an abstraction, and I had the proof yesterday with the
unlink system call versus glibc's remove().
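
A rough sketch of the "signal plus shared memory" idea, under some
assumptions: the region is an anonymous shared mapping inherited across
fork(), and the signal carries an offset into that region rather than a
raw pointer, since an address is generally only meaningful inside the
process that produced it.  The function name, region size and the
"reply in the first bytes" convention are all hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE 4096    /* hypothetical size of the shared region */

/* The parent writes `msg` at `offset` in a shared mapping and queues a
 * signal whose payload is that offset.  The child waits for the signal,
 * reads the text at the offset and stores its length at the start of the
 * region.  Returns the length the child saw, or -1 on error. */
long shared_payload_roundtrip(const char *msg, int offset)
{
    assert(msg != NULL);
    size_t len = strlen(msg);
    if (offset < (int)sizeof(long) || (size_t)offset + len + 1 > REGION_SIZE)
        return -1;

    char *region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    /* Block the signal before fork() so it stays queued for the child. */
    sigset_t set, old;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, &old);

    long result = -1;
    pid_t child = fork();
    if (child == 0) {
        siginfo_t info;
        if (sigwaitinfo(&set, &info) != SIGUSR1)
            _exit(1);
        long seen = (long)strlen(region + info.si_value.sival_int);
        memcpy(region, &seen, sizeof seen);     /* reply via shared memory */
        _exit(0);
    }
    if (child > 0) {
        int status;
        union sigval sv;

        memcpy(region + offset, msg, len + 1);  /* the bulk data */
        sv.sival_int = offset;                  /* the small notice */
        if (sigqueue(child, SIGUSR1, sv) != 0)
            kill(child, SIGKILL);
        if (waitpid(child, &status, 0) == child &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            memcpy(&result, region, sizeof result);
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    munmap(region, REGION_SIZE);
    return result;
}
```

Unrelated processes would need a named object (shm_open() or a file)
instead of an inherited mapping, and a POSIX message queue could replace
the signal where more than one integer of notice is needed.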



end of thread, other threads:[~2004-06-30 17:37 UTC | newest]

Thread overview: 15+ messages
2004-06-29  0:38 How to notify app of changed cpu/mem/io node configuration? Paul Jackson
2004-06-29  6:24 ` Greg KH
2004-06-29  8:37 ` Paul Jackson
2004-06-29 22:41 ` Dave Hansen
2004-06-29 23:15 ` Paul Jackson
2004-06-29 23:39 ` Dave Hansen
2004-06-29 23:53 ` Rusty Russell
2004-06-30  1:22 ` Paul Jackson
2004-06-30  1:38 ` Dave Hansen
2004-06-30  1:40 ` Paul Jackson
2004-06-30  2:16 ` Paul Jackson
2004-06-30 11:57 ` jlm_devel
2004-06-30 12:26 ` Paul Jackson
2004-06-30 13:08 ` Paul Jackson
2004-06-30 17:37 ` jlm_devel
