* [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs @ 2014-10-27 2:59 Neil Zhang 2014-10-27 21:38 ` Rafael J. Wysocki 0 siblings, 1 reply; 12+ messages in thread From: Neil Zhang @ 2014-10-27 2:59 UTC (permalink / raw) To: linux-kernel; +Cc: gregkh, Neil Zhang, Rafael J. Wysocki The current per-cpu offline info won't be updated when we use any other method besides sysfs to call cpu_up/down. Thus the cpu/online can't reflect the real online status. This patch is going to fix the issue introduced by commit 0902a9044fa5b7a0456ea4daacec2c2b3189ba8c (Driver core: Use generic offline/online for CPU offline/online) CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Neil Zhang <zhangwm@marvell.com> --- drivers/base/cpu.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 006b1bc..9d61824 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -418,10 +418,35 @@ static void __init cpu_dev_register_generic(void) #endif } +static int device_hotplug_notifier(struct notifier_block *nfb, + unsigned long action, void *hcpu) +{ + unsigned int cpu = (unsigned long)hcpu; + struct device *dev = get_cpu_device(cpu); + int ret; + + switch (action & ~CPU_TASKS_FROZEN) { + case CPU_ONLINE: + dev->offline = false; + ret = NOTIFY_OK; + break; + case CPU_DYING: + dev->offline = true; + ret = NOTIFY_OK; + break; + default: + ret = NOTIFY_DONE; + break; + } + + return ret; +} + void __init cpu_dev_init(void) { if (subsys_system_register(&cpu_subsys, cpu_root_attr_groups)) panic("Failed to register CPU subsystem"); cpu_dev_register_generic(); + cpu_notifier(device_hotplug_notifier, 0); } -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs 2014-10-27 2:59 [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs Neil Zhang @ 2014-10-27 21:38 ` Rafael J. Wysocki 2014-10-28 0:46 ` Dan Streetman 0 siblings, 1 reply; 12+ messages in thread From: Rafael J. Wysocki @ 2014-10-27 21:38 UTC (permalink / raw) To: Neil Zhang, linux-kernel; +Cc: gregkh, Rafael J. Wysocki On 10/27/2014 3:59 AM, Neil Zhang wrote: > The current per-cpu offline info won't be updated when we use > any other method besides sysfs to call cpu_up/down. > Thus the cpu/online can't reflect the real online status. > > This patch is going to fix the issue introduced by commit > 0902a9044fa5b7a0456ea4daacec2c2b3189ba8c (Driver core: > Use generic offline/online for CPU offline/online) > > CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > Tested-by: Dan Streetman <ddstreet@ieee.org> > Signed-off-by: Neil Zhang <zhangwm@marvell.com> Oh dear, no. Please first tell me what exactly the problem you're seeing is. > --- > drivers/base/cpu.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c > index 006b1bc..9d61824 100644 > --- a/drivers/base/cpu.c > +++ b/drivers/base/cpu.c > @@ -418,10 +418,35 @@ static void __init cpu_dev_register_generic(void) > #endif > } > > +static int device_hotplug_notifier(struct notifier_block *nfb, > + unsigned long action, void *hcpu) > +{ > + unsigned int cpu = (unsigned long)hcpu; > + struct device *dev = get_cpu_device(cpu); > + int ret; > + > + switch (action & ~CPU_TASKS_FROZEN) { > + case CPU_ONLINE: > + dev->offline = false; > + ret = NOTIFY_OK; > + break; > + case CPU_DYING: > + dev->offline = true; > + ret = NOTIFY_OK; > + break; > + default: > + ret = NOTIFY_DONE; > + break; > + } > + > + return ret; > +} > + > void __init cpu_dev_init(void) > { > if (subsys_system_register(&cpu_subsys, cpu_root_attr_groups)) > panic("Failed to register CPU subsystem"); > > cpu_dev_register_generic(); > + cpu_notifier(device_hotplug_notifier, 0); > } ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs 2014-10-27 21:38 ` Rafael J. Wysocki @ 2014-10-28 0:46 ` Dan Streetman 2014-10-29 21:46 ` Rafael J. Wysocki 0 siblings, 1 reply; 12+ messages in thread From: Dan Streetman @ 2014-10-28 0:46 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Neil Zhang, linux-kernel, Greg Kroah-Hartman, Rafael J. Wysocki, nfont On Mon, Oct 27, 2014 at 5:38 PM, Rafael J. Wysocki <rafael.j.wysocki@intel.com> wrote: > On 10/27/2014 3:59 AM, Neil Zhang wrote: >> >> The current per-cpu offline info won't be updated when we use >> any other method besides sysfs to call cpu_up/down. >> Thus the cpu/online can't reflect the real online status. >> >> This patch is going to fix the issue introduced by commit >> 0902a9044fa5b7a0456ea4daacec2c2b3189ba8c (Driver core: >> Use generic offline/online for CPU offline/online) >> >> CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> Tested-by: Dan Streetman <ddstreet@ieee.org> >> Signed-off-by: Neil Zhang <zhangwm@marvell.com> > > > Oh dear, no. > > Please first tell me what exactly the problem you're seeing is. For some background, here is my last comment on the first email thread on this: https://lkml.org/lkml/2014/10/27/595 I didn't create this patch, but the problem essentially is that before your commit the individual cpu online nodes (/sys/devices/system/cpu/cpuN/online) stayed in sync during cpu_down/up, because they used the cpu_online_mask; while after the commit, they are tracked by the cpu's generic dev->offline flag, which isn't updated during cpu_down/up. So now, any place in the kernel that brings a cpu up or down must also update the cpu->dev->offline flag. My interest in the patch was coincidental because I was seeing the same problem when using dlpar operations to hotplug cpus, which uses the arch/powerpc/platform/pseries/dlpar.c code; that code brings a cpu offline when it's hot-removed (and the cpu online when it's hot-added), but it hasn't been changed to also update the cpu's dev->offline flag. As I said in the previous email to the first thread, the ppc dlpar operation might be changed in the future to fully unregister a cpu when it's hot-removed, which would remove the entire sysfs cpuN directory. Alternately and/or until then, it could be updated to simply update the cpu'd dev->offline flag (that's what I originally did for my own testing). However, without a central place to update the cpu's dev->offline field, like this, or possibly in set_cpu_online(), or elsewhere during cpu_down/up, each place in the kernel that calls cpu_down() or cpu_up() also needs to update the dev->offline flag. It's possible that the ppc dlpar code is the only place in the kernel that has this problem; I haven't searched. > > >> --- >> drivers/base/cpu.c | 25 +++++++++++++++++++++++++ >> 1 file changed, 25 insertions(+) >> >> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c >> index 006b1bc..9d61824 100644 >> --- a/drivers/base/cpu.c >> +++ b/drivers/base/cpu.c >> @@ -418,10 +418,35 @@ static void __init cpu_dev_register_generic(void) >> #endif >> } >> +static int device_hotplug_notifier(struct notifier_block *nfb, >> + unsigned long action, void *hcpu) >> +{ >> + unsigned int cpu = (unsigned long)hcpu; >> + struct device *dev = get_cpu_device(cpu); >> + int ret; >> + >> + switch (action & ~CPU_TASKS_FROZEN) { >> + case CPU_ONLINE: >> + dev->offline = false; >> + ret = NOTIFY_OK; >> + break; >> + case CPU_DYING: >> + dev->offline = true; >> + ret = NOTIFY_OK; >> + break; >> + default: >> + ret = NOTIFY_DONE; >> + break; >> + } >> + >> + return ret; >> +} >> + >> void __init cpu_dev_init(void) >> { >> if (subsys_system_register(&cpu_subsys, cpu_root_attr_groups)) >> panic("Failed to register CPU subsystem"); >> cpu_dev_register_generic(); >> + cpu_notifier(device_hotplug_notifier, 0); >> } > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs 2014-10-28 0:46 ` Dan Streetman @ 2014-10-29 21:46 ` Rafael J. Wysocki 2014-10-29 21:52 ` Dan Streetman 2014-10-30 2:00 ` [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs Neil Zhang 0 siblings, 2 replies; 12+ messages in thread From: Rafael J. Wysocki @ 2014-10-29 21:46 UTC (permalink / raw) To: Dan Streetman Cc: Rafael J. Wysocki, Neil Zhang, linux-kernel, Greg Kroah-Hartman, nfont On Monday, October 27, 2014 08:46:08 PM Dan Streetman wrote: > On Mon, Oct 27, 2014 at 5:38 PM, Rafael J. Wysocki > <rafael.j.wysocki@intel.com> wrote: > > On 10/27/2014 3:59 AM, Neil Zhang wrote: > >> > >> The current per-cpu offline info won't be updated when we use > >> any other method besides sysfs to call cpu_up/down. > >> Thus the cpu/online can't reflect the real online status. > >> > >> This patch is going to fix the issue introduced by commit > >> 0902a9044fa5b7a0456ea4daacec2c2b3189ba8c (Driver core: > >> Use generic offline/online for CPU offline/online) > >> > >> CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >> Tested-by: Dan Streetman <ddstreet@ieee.org> > >> Signed-off-by: Neil Zhang <zhangwm@marvell.com> > > > > > > Oh dear, no. > > > > Please first tell me what exactly the problem you're seeing is. > > For some background, here is my last comment on the first email thread on this: > https://lkml.org/lkml/2014/10/27/595 > > I didn't create this patch, but the problem essentially is that before > your commit the individual cpu online nodes > (/sys/devices/system/cpu/cpuN/online) stayed in sync during > cpu_down/up, because they used the cpu_online_mask; while after the > commit, they are tracked by the cpu's generic dev->offline flag, which > isn't updated during cpu_down/up. Which is not triggered from sysfs. > So now, any place in the kernel > that brings a cpu up or down must also update the cpu->dev->offline > flag. Not any place. In particular, system suspend-resume doesn't need to do that, because it takes CPUs offline and then brings them back online. If there's a place in the kernel where CPUs are taken offline and left in that state, then it needs to be updated. > My interest in the patch was coincidental because I was seeing > the same problem when using dlpar operations to hotplug cpus, which > uses the arch/powerpc/platform/pseries/dlpar.c code; that code brings > a cpu offline when it's hot-removed (and the cpu online when it's > hot-added), but it hasn't been changed to also update the cpu's > dev->offline flag. It should be modified to do that. > As I said in the previous email to the first thread, the ppc dlpar > operation might be changed in the future to fully unregister a cpu > when it's hot-removed, which would remove the entire sysfs cpuN > directory. Alternately and/or until then, it could be updated to > simply update the cpu'd dev->offline flag (that's what I originally > did for my own testing). However, without a central place to update > the cpu's dev->offline field, like this, or possibly in > set_cpu_online(), or elsewhere during cpu_down/up, each place in the > kernel that calls cpu_down() or cpu_up() also needs to update the > dev->offline flag. It's possible that the ppc dlpar code is the only > place in the kernel that has this problem; I haven't searched. It is quite likely to be the only place like that. While I'm not familiar with the code in question, the most straightforward way to fix the problem would be to replace cpu_down() in there with device_offline(get_cpu_device(cpu)), but that needs to be called under device_hotplug_lock. -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs 2014-10-29 21:46 ` Rafael J. Wysocki @ 2014-10-29 21:52 ` Dan Streetman 2014-10-30 2:03 ` Neil Zhang 2014-10-31 19:41 ` Dan Streetman 2014-10-30 2:00 ` [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs Neil Zhang 1 sibling, 2 replies; 12+ messages in thread From: Dan Streetman @ 2014-10-29 21:52 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Rafael J. Wysocki, Neil Zhang, linux-kernel, Greg Kroah-Hartman, nfont On Wed, Oct 29, 2014 at 5:46 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Monday, October 27, 2014 08:46:08 PM Dan Streetman wrote: >> On Mon, Oct 27, 2014 at 5:38 PM, Rafael J. Wysocki >> <rafael.j.wysocki@intel.com> wrote: >> > On 10/27/2014 3:59 AM, Neil Zhang wrote: >> >> >> >> The current per-cpu offline info won't be updated when we use >> >> any other method besides sysfs to call cpu_up/down. >> >> Thus the cpu/online can't reflect the real online status. >> >> >> >> This patch is going to fix the issue introduced by commit >> >> 0902a9044fa5b7a0456ea4daacec2c2b3189ba8c (Driver core: >> >> Use generic offline/online for CPU offline/online) >> >> >> >> CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> Tested-by: Dan Streetman <ddstreet@ieee.org> >> >> Signed-off-by: Neil Zhang <zhangwm@marvell.com> >> > >> > >> > Oh dear, no. >> > >> > Please first tell me what exactly the problem you're seeing is. >> >> For some background, here is my last comment on the first email thread on this: >> https://lkml.org/lkml/2014/10/27/595 >> >> I didn't create this patch, but the problem essentially is that before >> your commit the individual cpu online nodes >> (/sys/devices/system/cpu/cpuN/online) stayed in sync during >> cpu_down/up, because they used the cpu_online_mask; while after the >> commit, they are tracked by the cpu's generic dev->offline flag, which >> isn't updated during cpu_down/up. > > Which is not triggered from sysfs. > >> So now, any place in the kernel >> that brings a cpu up or down must also update the cpu->dev->offline >> flag. > > Not any place. In particular, system suspend-resume doesn't need to > do that, because it takes CPUs offline and then brings them back > online. > > If there's a place in the kernel where CPUs are taken offline and > left in that state, then it needs to be updated. The only place I know of is ppc's dlpar code, as mentioned below. Neil, as you crafted the original patch, I assume you know of some other place in the kernel doing cpu_up/down directly, where you're seeing this problem? > >> My interest in the patch was coincidental because I was seeing >> the same problem when using dlpar operations to hotplug cpus, which >> uses the arch/powerpc/platform/pseries/dlpar.c code; that code brings >> a cpu offline when it's hot-removed (and the cpu online when it's >> hot-added), but it hasn't been changed to also update the cpu's >> dev->offline flag. > > It should be modified to do that. > >> As I said in the previous email to the first thread, the ppc dlpar >> operation might be changed in the future to fully unregister a cpu >> when it's hot-removed, which would remove the entire sysfs cpuN >> directory. Alternately and/or until then, it could be updated to >> simply update the cpu'd dev->offline flag (that's what I originally >> did for my own testing). However, without a central place to update >> the cpu's dev->offline field, like this, or possibly in >> set_cpu_online(), or elsewhere during cpu_down/up, each place in the >> kernel that calls cpu_down() or cpu_up() also needs to update the >> dev->offline flag. It's possible that the ppc dlpar code is the only >> place in the kernel that has this problem; I haven't searched. > > It is quite likely to be the only place like that. > > While I'm not familiar with the code in question, the most straightforward > way to fix the problem would be to replace cpu_down() in there with > device_offline(get_cpu_device(cpu)), but that needs to be called under > device_hotplug_lock. Ok, will do. > > -- > I speak only for myself. > Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs 2014-10-29 21:52 ` Dan Streetman @ 2014-10-30 2:03 ` Neil Zhang 2014-10-31 19:41 ` Dan Streetman 1 sibling, 0 replies; 12+ messages in thread From: Neil Zhang @ 2014-10-30 2:03 UTC (permalink / raw) To: Dan Streetman, Rafael J. Wysocki Cc: Rafael J. Wysocki, linux-kernel, Greg Kroah-Hartman, nfont@linux.vnet.ibm.com [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 4624 bytes --] Dan, > -----Original Message----- > From: ddstreet@gmail.com [mailto:ddstreet@gmail.com] On Behalf Of Dan > Streetman > Sent: 2014å¹´10æ30æ¥ 5:52 > To: Rafael J. Wysocki > Cc: Rafael J. Wysocki; Neil Zhang; linux-kernel; Greg Kroah-Hartman; > nfont@linux.vnet.ibm.com > Subject: Re: [PATCH V2] Driver cpu: update online when cpu_up/down besides > sysfs > > On Wed, Oct 29, 2014 at 5:46 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > On Monday, October 27, 2014 08:46:08 PM Dan Streetman wrote: > >> On Mon, Oct 27, 2014 at 5:38 PM, Rafael J. Wysocki > >> <rafael.j.wysocki@intel.com> wrote: > >> > On 10/27/2014 3:59 AM, Neil Zhang wrote: > >> >> > >> >> The current per-cpu offline info won't be updated when we use any > >> >> other method besides sysfs to call cpu_up/down. > >> >> Thus the cpu/online can't reflect the real online status. > >> >> > >> >> This patch is going to fix the issue introduced by commit > >> >> 0902a9044fa5b7a0456ea4daacec2c2b3189ba8c (Driver core: > >> >> Use generic offline/online for CPU offline/online) > >> >> > >> >> CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >> >> Tested-by: Dan Streetman <ddstreet@ieee.org> > >> >> Signed-off-by: Neil Zhang <zhangwm@marvell.com> > >> > > >> > > >> > Oh dear, no. > >> > > >> > Please first tell me what exactly the problem you're seeing is. > >> > >> For some background, here is my last comment on the first email thread on > this: > >> https://lkml.org/lkml/2014/10/27/595 > >> > >> I didn't create this patch, but the problem essentially is that > >> before your commit the individual cpu online nodes > >> (/sys/devices/system/cpu/cpuN/online) stayed in sync during > >> cpu_down/up, because they used the cpu_online_mask; while after the > >> commit, they are tracked by the cpu's generic dev->offline flag, > >> which isn't updated during cpu_down/up. > > > > Which is not triggered from sysfs. > > > >> So now, any place in the kernel > >> that brings a cpu up or down must also update the cpu->dev->offline > >> flag. > > > > Not any place. In particular, system suspend-resume doesn't need to > > do that, because it takes CPUs offline and then brings them back > > online. > > > > If there's a place in the kernel where CPUs are taken offline and left > > in that state, then it needs to be updated. > > The only place I know of is ppc's dlpar code, as mentioned below. > > Neil, as you crafted the original patch, I assume you know of some other place > in the kernel doing cpu_up/down directly, where you're seeing this problem? > As I replied to Rafael many ARM SoCs will use an in kernel profiler to romove/add a core via cpu_up /down for power and performance consideration. But actually we can fix these issues as Rafael suggested. Thanks for the discussion in these days. > > > >> My interest in the patch was coincidental because I was seeing the > >> same problem when using dlpar operations to hotplug cpus, which uses > >> the arch/powerpc/platform/pseries/dlpar.c code; that code brings a > >> cpu offline when it's hot-removed (and the cpu online when it's > >> hot-added), but it hasn't been changed to also update the cpu's > >> dev->offline flag. > > > > It should be modified to do that. > > > >> As I said in the previous email to the first thread, the ppc dlpar > >> operation might be changed in the future to fully unregister a cpu > >> when it's hot-removed, which would remove the entire sysfs cpuN > >> directory. Alternately and/or until then, it could be updated to > >> simply update the cpu'd dev->offline flag (that's what I originally > >> did for my own testing). However, without a central place to update > >> the cpu's dev->offline field, like this, or possibly in > >> set_cpu_online(), or elsewhere during cpu_down/up, each place in the > >> kernel that calls cpu_down() or cpu_up() also needs to update the > >> dev->offline flag. It's possible that the ppc dlpar code is the only > >> place in the kernel that has this problem; I haven't searched. > > > > It is quite likely to be the only place like that. > > > > While I'm not familiar with the code in question, the most > > straightforward way to fix the problem would be to replace cpu_down() > > in there with device_offline(get_cpu_device(cpu)), but that needs to > > be called under device_hotplug_lock. > > Ok, will do. > > > > > -- > > I speak only for myself. > > Rafael J. Wysocki, Intel Open Source Technology Center. Best Regards, Neil Zhang ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] powerpc: use device_online/offline() instead of cpu_up/down() 2014-10-29 21:52 ` Dan Streetman @ 2014-10-31 19:41 ` Dan Streetman 2014-10-31 19:41 ` Dan Streetman 1 sibling, 0 replies; 12+ messages in thread From: Dan Streetman @ 2014-10-31 19:41 UTC (permalink / raw) To: Nathan Fontenot, Benjamin Herrenschmidt, Michael Ellerman, Tyrel Datwyler Cc: Thomas Falcon, Greg Kroah-Hartman, Rafael J. Wysocki, Rafael J. Wysocki, linux-kernel, Paul Mackerras, Daniel Walter, Bharata B Rao, Grant Likely, Andrew Morton, Neil Zhang, linuxppc-dev, Dan Streetman In powerpc pseries platform dlpar operations, Use device_online() and device_offline() instead of cpu_up() and cpu_down(). Calling cpu_up/down directly does not update the cpu device offline field, which is used to online/offline a cpu from sysfs. Calling device_online/offline instead keeps the sysfs cpu online value correct. The hotplug lock, which is required to be held when calling device_online/offline, is already held when dlpar_online/offline_cpu are called, since they are called only from cpu_probe|release_store. This patch fixes errors on PowerVM systems that have cpu(s) added/removed using dlpar operations; without this patch, the /sys/devices/system/cpu/cpuN/online nodes do not correctly show the online state of added/removed cpus. Signed-off-by: Dan Streetman <ddstreet@ieee.org> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> --- Previous discussion for this: https://lkml.org/lkml/2014/10/29/839 arch/powerpc/platforms/pseries/dlpar.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index 6ad83bd..c22bb1b 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -382,7 +382,7 @@ static int dlpar_online_cpu(struct device_node *dn) BUG_ON(get_cpu_current_state(cpu) != CPU_STATE_OFFLINE); cpu_maps_update_done(); - rc = cpu_up(cpu); + rc = device_online(get_cpu_device(cpu)); if (rc) goto out; cpu_maps_update_begin(); @@ -467,7 +467,7 @@ static int dlpar_offline_cpu(struct device_node *dn) if (get_cpu_current_state(cpu) == CPU_STATE_ONLINE) { set_preferred_offline_state(cpu, CPU_STATE_OFFLINE); cpu_maps_update_done(); - rc = cpu_down(cpu); + rc = device_offline(get_cpu_device(cpu)); if (rc) goto out; cpu_maps_update_begin(); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH] powerpc: use device_online/offline() instead of cpu_up/down() @ 2014-10-31 19:41 ` Dan Streetman 0 siblings, 0 replies; 12+ messages in thread From: Dan Streetman @ 2014-10-31 19:41 UTC (permalink / raw) To: Nathan Fontenot, Benjamin Herrenschmidt, Michael Ellerman, Tyrel Datwyler Cc: Dan Streetman, Paul Mackerras, Thomas Falcon, Grant Likely, Andrew Morton, Daniel Walter, Bharata B Rao, Rafael J. Wysocki, Rafael J. Wysocki, Neil Zhang, Greg Kroah-Hartman, linuxppc-dev, linux-kernel In powerpc pseries platform dlpar operations, Use device_online() and device_offline() instead of cpu_up() and cpu_down(). Calling cpu_up/down directly does not update the cpu device offline field, which is used to online/offline a cpu from sysfs. Calling device_online/offline instead keeps the sysfs cpu online value correct. The hotplug lock, which is required to be held when calling device_online/offline, is already held when dlpar_online/offline_cpu are called, since they are called only from cpu_probe|release_store. This patch fixes errors on PowerVM systems that have cpu(s) added/removed using dlpar operations; without this patch, the /sys/devices/system/cpu/cpuN/online nodes do not correctly show the online state of added/removed cpus. Signed-off-by: Dan Streetman <ddstreet@ieee.org> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> --- Previous discussion for this: https://lkml.org/lkml/2014/10/29/839 arch/powerpc/platforms/pseries/dlpar.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index 6ad83bd..c22bb1b 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -382,7 +382,7 @@ static int dlpar_online_cpu(struct device_node *dn) BUG_ON(get_cpu_current_state(cpu) != CPU_STATE_OFFLINE); cpu_maps_update_done(); - rc = cpu_up(cpu); + rc = device_online(get_cpu_device(cpu)); if (rc) goto out; cpu_maps_update_begin(); @@ -467,7 +467,7 @@ static int dlpar_offline_cpu(struct device_node *dn) if (get_cpu_current_state(cpu) == CPU_STATE_ONLINE) { set_preferred_offline_state(cpu, CPU_STATE_OFFLINE); cpu_maps_update_done(); - rc = cpu_down(cpu); + rc = device_offline(get_cpu_device(cpu)); if (rc) goto out; cpu_maps_update_begin(); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] powerpc: use device_online/offline() instead of cpu_up/down() 2014-10-31 19:41 ` Dan Streetman @ 2014-11-02 4:58 ` Bharata B Rao -1 siblings, 0 replies; 12+ messages in thread From: Bharata B Rao @ 2014-11-02 4:58 UTC (permalink / raw) To: Dan Streetman Cc: Thomas Falcon, Greg Kroah-Hartman, Rafael J. Wysocki, Rafael J. Wysocki, linux-kernel, Paul Mackerras, Tyrel Datwyler, Grant Likely, Nathan Fontenot, Andrew Morton, Neil Zhang, linuxppc-dev, Daniel Walter On Fri, Oct 31, 2014 at 03:41:34PM -0400, Dan Streetman wrote: > In powerpc pseries platform dlpar operations, Use device_online() and > device_offline() instead of cpu_up() and cpu_down(). > > Calling cpu_up/down directly does not update the cpu device offline > field, which is used to online/offline a cpu from sysfs. Calling > device_online/offline instead keeps the sysfs cpu online value correct. > The hotplug lock, which is required to be held when calling > device_online/offline, is already held when dlpar_online/offline_cpu > are called, since they are called only from cpu_probe|release_store. > > This patch fixes errors on PowerVM systems that have cpu(s) added/removed > using dlpar operations; without this patch, the > /sys/devices/system/cpu/cpuN/online nodes do not correctly show the > online state of added/removed cpus. Verified the patch to be working as expected when I online and offline CPUs of a PowerKVM guest using QEMU (plus my RFC hotplug patchset for QEMU) Regards, Bharata. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] powerpc: use device_online/offline() instead of cpu_up/down() @ 2014-11-02 4:58 ` Bharata B Rao 0 siblings, 0 replies; 12+ messages in thread From: Bharata B Rao @ 2014-11-02 4:58 UTC (permalink / raw) To: Dan Streetman Cc: Nathan Fontenot, Benjamin Herrenschmidt, Michael Ellerman, Tyrel Datwyler, Paul Mackerras, Thomas Falcon, Grant Likely, Andrew Morton, Daniel Walter, Rafael J. Wysocki, Rafael J. Wysocki, Neil Zhang, Greg Kroah-Hartman, linuxppc-dev, linux-kernel On Fri, Oct 31, 2014 at 03:41:34PM -0400, Dan Streetman wrote: > In powerpc pseries platform dlpar operations, Use device_online() and > device_offline() instead of cpu_up() and cpu_down(). > > Calling cpu_up/down directly does not update the cpu device offline > field, which is used to online/offline a cpu from sysfs. Calling > device_online/offline instead keeps the sysfs cpu online value correct. > The hotplug lock, which is required to be held when calling > device_online/offline, is already held when dlpar_online/offline_cpu > are called, since they are called only from cpu_probe|release_store. > > This patch fixes errors on PowerVM systems that have cpu(s) added/removed > using dlpar operations; without this patch, the > /sys/devices/system/cpu/cpuN/online nodes do not correctly show the > online state of added/removed cpus. Verified the patch to be working as expected when I online and offline CPUs of a PowerKVM guest using QEMU (plus my RFC hotplug patchset for QEMU) Regards, Bharata. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] powerpc: use device_online/offline() instead of cpu_up/down() 2014-10-31 19:41 ` Dan Streetman (?) (?) @ 2014-11-03 15:45 ` Nathan Fontenot -1 siblings, 0 replies; 12+ messages in thread From: Nathan Fontenot @ 2014-11-03 15:45 UTC (permalink / raw) To: Dan Streetman, Benjamin Herrenschmidt, Michael Ellerman, Tyrel Datwyler Cc: Thomas Falcon, Greg Kroah-Hartman, Rafael J. Wysocki, Rafael J. Wysocki, linux-kernel, Paul Mackerras, Daniel Walter, Bharata B Rao, Grant Likely, Andrew Morton, Neil Zhang, linuxppc-dev On 10/31/2014 02:41 PM, Dan Streetman wrote: > In powerpc pseries platform dlpar operations, Use device_online() and > device_offline() instead of cpu_up() and cpu_down(). > > Calling cpu_up/down directly does not update the cpu device offline > field, which is used to online/offline a cpu from sysfs. Calling > device_online/offline instead keeps the sysfs cpu online value correct. > The hotplug lock, which is required to be held when calling > device_online/offline, is already held when dlpar_online/offline_cpu > are called, since they are called only from cpu_probe|release_store. > > This patch fixes errors on PowerVM systems that have cpu(s) added/removed > using dlpar operations; without this patch, the > /sys/devices/system/cpu/cpuN/online nodes do not correctly show the > online state of added/removed cpus. > > Signed-off-by: Dan Streetman <ddstreet@ieee.org> > Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> > --- > > Previous discussion for this: > https://lkml.org/lkml/2014/10/29/839 > > arch/powerpc/platforms/pseries/dlpar.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c > index 6ad83bd..c22bb1b 100644 > --- a/arch/powerpc/platforms/pseries/dlpar.c > +++ b/arch/powerpc/platforms/pseries/dlpar.c > @@ -382,7 +382,7 @@ static int dlpar_online_cpu(struct device_node *dn) > BUG_ON(get_cpu_current_state(cpu) > != CPU_STATE_OFFLINE); > cpu_maps_update_done(); > - rc = cpu_up(cpu); > + rc = device_online(get_cpu_device(cpu)); > if (rc) > goto out; > cpu_maps_update_begin(); > @@ -467,7 +467,7 @@ static int dlpar_offline_cpu(struct device_node *dn) > if (get_cpu_current_state(cpu) == CPU_STATE_ONLINE) { > set_preferred_offline_state(cpu, CPU_STATE_OFFLINE); > cpu_maps_update_done(); > - rc = cpu_down(cpu); > + rc = device_offline(get_cpu_device(cpu)); > if (rc) > goto out; > cpu_maps_update_begin(); > ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs 2014-10-29 21:46 ` Rafael J. Wysocki 2014-10-29 21:52 ` Dan Streetman @ 2014-10-30 2:00 ` Neil Zhang 1 sibling, 0 replies; 12+ messages in thread From: Neil Zhang @ 2014-10-30 2:00 UTC (permalink / raw) To: Rafael J. Wysocki, Dan Streetman Cc: Rafael J. Wysocki, linux-kernel, Greg Kroah-Hartman, nfont@linux.vnet.ibm.com [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 4190 bytes --] Rafael, > -----Original Message----- > From: Rafael J. Wysocki [mailto:rjw@rjwysocki.net] > Sent: 2014å¹´10æ30æ¥ 5:47 > To: Dan Streetman > Cc: Rafael J. Wysocki; Neil Zhang; linux-kernel; Greg Kroah-Hartman; > nfont@linux.vnet.ibm.com > Subject: Re: [PATCH V2] Driver cpu: update online when cpu_up/down besides > sysfs > > On Monday, October 27, 2014 08:46:08 PM Dan Streetman wrote: > > On Mon, Oct 27, 2014 at 5:38 PM, Rafael J. Wysocki > > <rafael.j.wysocki@intel.com> wrote: > > > On 10/27/2014 3:59 AM, Neil Zhang wrote: > > >> > > >> The current per-cpu offline info won't be updated when we use any > > >> other method besides sysfs to call cpu_up/down. > > >> Thus the cpu/online can't reflect the real online status. > > >> > > >> This patch is going to fix the issue introduced by commit > > >> 0902a9044fa5b7a0456ea4daacec2c2b3189ba8c (Driver core: > > >> Use generic offline/online for CPU offline/online) > > >> > > >> CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > >> Tested-by: Dan Streetman <ddstreet@ieee.org> > > >> Signed-off-by: Neil Zhang <zhangwm@marvell.com> > > > > > > > > > Oh dear, no. > > > > > > Please first tell me what exactly the problem you're seeing is. > > > > For some background, here is my last comment on the first email thread on > this: > > https://lkml.org/lkml/2014/10/27/595 > > > > I didn't create this patch, but the problem essentially is that before > > your commit the individual cpu online nodes > > (/sys/devices/system/cpu/cpuN/online) stayed in sync during > > cpu_down/up, because they used the cpu_online_mask; while after the > > commit, they are tracked by the cpu's generic dev->offline flag, which > > isn't updated during cpu_down/up. > > Which is not triggered from sysfs. > > > So now, any place in the kernel > > that brings a cpu up or down must also update the cpu->dev->offline > > flag. > > Not any place. In particular, system suspend-resume doesn't need to do that, > because it takes CPUs offline and then brings them back online. > > If there's a place in the kernel where CPUs are taken offline and left in that > state, then it needs to be updated. Many ARM SoCs will have an in kernel profiler to determine whether we need to remove or add a cpu into system for power and performance consideration. Thus we will call cpu_up / down in kernel directly. > > > My interest in the patch was coincidental because I was seeing the > > same problem when using dlpar operations to hotplug cpus, which uses > > the arch/powerpc/platform/pseries/dlpar.c code; that code brings a cpu > > offline when it's hot-removed (and the cpu online when it's > > hot-added), but it hasn't been changed to also update the cpu's > > dev->offline flag. > > It should be modified to do that. > > > As I said in the previous email to the first thread, the ppc dlpar > > operation might be changed in the future to fully unregister a cpu > > when it's hot-removed, which would remove the entire sysfs cpuN > > directory. Alternately and/or until then, it could be updated to > > simply update the cpu'd dev->offline flag (that's what I originally > > did for my own testing). However, without a central place to update > > the cpu's dev->offline field, like this, or possibly in > > set_cpu_online(), or elsewhere during cpu_down/up, each place in the > > kernel that calls cpu_down() or cpu_up() also needs to update the > > dev->offline flag. It's possible that the ppc dlpar code is the only > > place in the kernel that has this problem; I haven't searched. > > It is quite likely to be the only place like that. > > While I'm not familiar with the code in question, the most straightforward way > to fix the problem would be to replace cpu_down() in there with > device_offline(get_cpu_device(cpu)), but that needs to be called under > device_hotplug_lock. Great, we can use this way to fix our problems. Thanks for the remind! > > -- > I speak only for myself. > Rafael J. Wysocki, Intel Open Source Technology Center. Best Regards, Neil Zhang ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-11-03 15:45 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-10-27 2:59 [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs Neil Zhang 2014-10-27 21:38 ` Rafael J. Wysocki 2014-10-28 0:46 ` Dan Streetman 2014-10-29 21:46 ` Rafael J. Wysocki 2014-10-29 21:52 ` Dan Streetman 2014-10-30 2:03 ` Neil Zhang 2014-10-31 19:41 ` [PATCH] powerpc: use device_online/offline() instead of cpu_up/down() Dan Streetman 2014-10-31 19:41 ` Dan Streetman 2014-11-02 4:58 ` Bharata B Rao 2014-11-02 4:58 ` Bharata B Rao 2014-11-03 15:45 ` Nathan Fontenot 2014-10-30 2:00 ` [PATCH V2] Driver cpu: update online when cpu_up/down besides sysfs Neil Zhang
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.