From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Oleg Nesterov <oleg@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Viresh Kumar <viresh.kumar@linaro.org>
Subject: Re: [PATCH] hotplug: Optimize {get,put}_online_cpus()
Date: Wed, 02 Oct 2013 00:26:54 +0530
Message-ID: <524B1AF6.8020406@linux.vnet.ibm.com>
In-Reply-To: <524B111F.9060003@linux.vnet.ibm.com>

On 10/01/2013 11:44 PM, Srivatsa S. Bhat wrote:
> On 10/01/2013 11:06 PM, Peter Zijlstra wrote:
>> On Tue, Oct 01, 2013 at 10:41:15PM +0530, Srivatsa S. Bhat wrote:
>>> However, as Oleg said, its definitely worth considering whether this proposed
>>> change in semantics is going to hurt us in the future. CPU_POST_DEAD has certainly
>>> proved to be very useful in certain challenging situations (commit 1aee40ac9c
>>> explains one such example), so IMHO we should be very careful not to undermine
>>> its utility.
>>
>> Urgh.. crazy things. I've always understood POST_DEAD to mean 'will be
>> called at some time after the unplug' with no further guarantees. And my
>> patch preserves that.
>>
>> Its not at all clear to me why cpufreq needs more; 1aee40ac9c certainly
>> doesn't explain it.
>>
> 
> Sorry if I was unclear - I didn't mean to say that cpufreq needs more guarantees
> than that. I was just saying that the cpufreq code would need certain additional
> changes/restructuring to accommodate the change in the semantics brought about
> by this patch. IOW, it won't work as it is, but it can certainly be fixed.
> 

An important reason why this change can be accommodated without much trouble
is that you are changing it only in the suspend/resume path. Userspace has
already been frozen there, so all hotplug operations are initiated by the
suspend path and that path *alone*. That gives us certain "simplifiers" that we
know beforehand (e.g., all of them are CPU offline operations, happening one at
a time, in sequence), and we don't expect any "interference" with this routine ;-).
As a result, the number and variety of races that we need to take care of tend
to be far smaller. (For example, we don't have to worry about the deadlock caused
by sysfs writes that 1aee40ac9c was talking about.)

On the other hand, if the proposal were to change the regular hotplug path along
the same lines, then I guess it would be a little more difficult to adjust to
it. For example, in cpufreq, _dev_prepare() sends a STOP to the governor,
whereas a part of _dev_finish() sends a START to it. So we might have races
there: depending on the exact timing of the events, we might proceed with CPU
offline while the governor is still running. Of course, this problem doesn't
occur in the suspend/resume case, and hence I didn't bring it up in my previous
mail.
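
To make the STOP/START ordering concern concrete, here is a minimal C sketch.
It is a toy model, not the actual cpufreq code: struct policy, dev_prepare(),
dev_finish() and the two scenario functions are hypothetical names, and it
assumes that two CPUs share one policy (and hence one governor). It replays
two fixed orderings sequentially instead of using real concurrency, so it only
illustrates which interleaving would be a problem.

```c
#include <stdbool.h>

/* Hypothetical model of governor state; not the real cpufreq structures. */
enum gov_state { GOV_STOPPED, GOV_RUNNING };

struct policy {
	enum gov_state gov;	/* one governor shared by CPUs A and B (assumption) */
};

/* Stands in for the part of _dev_prepare() that sends STOP to the governor. */
static void dev_prepare(struct policy *p)
{
	p->gov = GOV_STOPPED;
}

/* Stands in for the part of _dev_finish() that sends START to the governor. */
static void dev_finish(struct policy *p)
{
	p->gov = GOV_RUNNING;
}

/* True if a CPU would be taken offline while the governor is running. */
static bool offline_sees_running(const struct policy *p)
{
	return p->gov == GOV_RUNNING;
}

/* Suspend-like ordering: each offline completes before the next one starts. */
static bool serialized_offline(void)
{
	struct policy p = { GOV_RUNNING };
	bool bad;

	dev_prepare(&p);			/* CPU A: STOP */
	bad = offline_sees_running(&p);		/* CPU A goes offline */
	dev_finish(&p);				/* CPU A: START */

	dev_prepare(&p);			/* CPU B: STOP */
	bad = bad || offline_sees_running(&p);	/* CPU B goes offline */
	dev_finish(&p);				/* CPU B: START */

	return bad;
}

/* Regular-hotplug-like ordering: A's finish lands after B's prepare. */
static bool interleaved_offline(void)
{
	struct policy p = { GOV_RUNNING };
	bool bad;

	dev_prepare(&p);			/* CPU A: STOP */
	bad = offline_sees_running(&p);		/* CPU A goes offline */

	dev_prepare(&p);			/* CPU B: STOP */
	dev_finish(&p);				/* CPU A's late START restarts it */
	bad = bad || offline_sees_running(&p);	/* CPU B goes offline */

	return bad;
}
```

In this model serialized_offline() never takes a CPU down with a running
governor, whereas interleaved_offline() does for CPU B. Whether the second
ordering can really happen depends on where the START ends up running relative
to the hotplug lock, which is exactly the timing question raised above.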

So this is another reason why I'm a little concerned about POST_DEAD: since this
is a change in semantics, it might be worth asking ourselves whether we'd still
want to go with that change, if we happened to be changing regular hotplug as
well, rather than just the more controlled environment of suspend/resume.
Yes, I know that's not what you proposed, but I feel it might be worth considering
its implications while deciding how to solve the POST_DEAD issue.

Regards,
Srivatsa S. Bhat

