From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Hansen
Date: Tue, 29 Jun 2004 23:39:55 +0000
Subject: Re: How to notify app of changed cpu/mem/io node configuration?
Message-Id: <1088552395.26704.25.camel@nighthawk>
List-Id:
References: <20040628173808.04718b83.pj@sgi.com>
In-Reply-To: <20040628173808.04718b83.pj@sgi.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-hotplug@vger.kernel.org

On Tue, 2004-06-29 at 16:15, Paul Jackson wrote:
> Dave wrote:
> > But, we'll probably need some kind of synchronous process notification
> > at some point.
>
> Did you mean "some kind of asynchronous ..."?

Some kind of notification that the kernel sends where it can be sure
that the app received it. A signal sent by /sbin/hotplug is
asynchronous to the kernel, because it has no idea whether the app
handled it. A signal sent directly by the kernel is another matter.

> > If ... app ... hopeless ... the kernel will likely have the option to
> > kill it.
>
> More commonly, it seems that the kernel just forces the app onto some
> CPU/Memory that will work, such as the lowest currently online CPU.
> Essentially, CPU 0 (and I guess Memory Node 0, in time) become the
> orphanage for homeless tasks.

This might, in fact, be an invalid state. We have some silly tools on
the NUMA-Q that *have* to run on a certain node because the hardware
maps different node-specific structures at the same physical address on
different nodes. Rudely being migrated to CPU 0 would break the app,
and could (in theory) cause data corruption. Now, this is a silly
NUMA-Q thing, but I'm sure there are others like that.

As Mr. Dobson has told me several times: the mask is cpus_allowed, not
cpus_preferred :)

> > Sleeping for 5 seconds and hoping for the best probably isn't the best
> > option, either. :)
>
> Yeah - the kernel doesn't really want to be playing such games.
> Hence actions that might require such should be left to userland.
>
> For example, on orderly moving of apps from one Node to another
> could be accomplished by user code that took its time and did what
> it had to do to move (or kill) apps off old Node, before it told
> the kernel "all clear - remove old Node from service"

I just worry about the complexity of feeding all of this information
back to the kernel. Maybe there's a simple way to do it, but one isn't
horribly apparent to me right now.

> > I wonder if a much more generic signal could be of more use ...
>
> That's my inclination as well. These sorts of events are rare. Send one
> generic signal, and let the recipient poke around and figure out what
> has changed that it cares about and deal with it. I could even imagine
> overloading SIGPWR for this purpose, if new signal numbers/names are too
> difficult to come by.

I think we need Rusty to remind us all of what came from the last
discussion on this topic.

-- Dave

_______________________________________________
Linux-hotplug-devel mailing list http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel
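
Editorial sketch: the "send one generic signal, and let the recipient
poke around" idea quoted above could look roughly like this from the
application side. This is only an illustration under stated assumptions,
not something proposed in the thread. SIGPWR is used because the quoted
text floats it as a candidate (with SIGUSR1 as a fallback where SIGPWR is
not defined), only the allowed CPU set is inspected, and the names
CONFIG_CHANGED, allowed_cpus and ConfigWatcher are hypothetical. Note
that it does nothing about Dave's concern: the sender still cannot tell
whether the application actually handled the event.

```python
import os
import signal

# Assumption: SIGPWR stands in for the "generic signal" floated in the thread.
# Fall back to SIGUSR1 on platforms where SIGPWR is not defined.
CONFIG_CHANGED = getattr(signal, "SIGPWR", signal.SIGUSR1)


def allowed_cpus():
    """Return the set of CPUs this process may run on.

    Uses os.sched_getaffinity on Linux; elsewhere falls back to
    assuming every CPU reported by os.cpu_count() is allowed.
    """
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))
    return set(range(os.cpu_count() or 1))


class ConfigWatcher:
    """Remember the last seen CPU set and diff it when notified."""

    def __init__(self):
        self.cpus = allowed_cpus()
        self.pending = False

    def _on_signal(self, signum, frame):
        # Keep the handler trivial: record that something changed and
        # let the main loop do the actual poking around.
        self.pending = True

    def install(self):
        # Must be called from the main thread (a Python restriction).
        signal.signal(CONFIG_CHANGED, self._on_signal)

    def poll(self):
        """Return (added, removed) CPU sets if notified, else None."""
        if not self.pending:
            return None
        self.pending = False
        current = allowed_cpus()
        added = current - self.cpus
        removed = self.cpus - current
        self.cpus = current
        return added, removed
```

An application would call install() once at startup and poll() from its
main loop. A non-empty "removed" set is the point at which a tool that
*has* to run on a particular node would need to stop or bail out rather
than carry on wherever it was moved.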