* [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code
From: Tyrel Datwyler @ 2015-02-28 2:24 UTC
To: linuxppc-dev; +Cc: Tyrel Datwyler, cyrilbur, nfont

This patchset simplifies the usage of rtas_ibm_suspend_me() by removing an
extraneous function parameter, fixes device tree updating on little endian
platforms, and adds a mechanism for informing drmgr that the kernel is capable
of performing the whole migration, including the device tree update, itself.

Tyrel Datwyler (3):
  powerpc/pseries: Simplify check for suspendability during suspend/migration
  powerpc/pseries: Little endian fixes for post mobility device tree update
  powerpc/pseries: Expose post-migration in kernel device tree update to drmgr

 arch/powerpc/include/asm/rtas.h           |  2 +-
 arch/powerpc/kernel/rtas.c                | 15 ++++-----
 arch/powerpc/platforms/pseries/mobility.c | 55 ++++++++++++++++++-------------
 3 files changed, 40 insertions(+), 32 deletions(-)

-- 
1.7.12.2
* [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration
From: Tyrel Datwyler @ 2015-02-28 2:24 UTC
To: linuxppc-dev; +Cc: Tyrel Datwyler, cyrilbur, nfont

During a suspend/migration operation we must wait for the VASI state reported
by the hypervisor to become Suspending prior to making the ibm,suspend-me
RTAS call. Callers of rtas_ibm_suspend_me() pass a vasi_state variable
that exposes the VASI state to the caller. This is unnecessary, as the caller
only really cares about the following three conditions: there is an error and
we should bail out; success, indicating we have suspended and woken back up, so
proceed to the device tree update; or we are not suspendable yet, so try
calling rtas_ibm_suspend_me() again shortly.

This patch removes the extraneous vasi_state variable and simply uses the
return code to communicate how to proceed. We either succeed, fail, or get
-EAGAIN, in which case we sleep for a second before trying to call
rtas_ibm_suspend_me() again.

Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/rtas.h           |  2 +-
 arch/powerpc/kernel/rtas.c                | 15 +++++++--------
 arch/powerpc/platforms/pseries/mobility.c |  8 +++-----
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 2e23e92..fc85eb0 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -327,7 +327,7 @@ extern int rtas_suspend_cpu(struct rtas_suspend_me_data *data);
 extern int rtas_suspend_last_cpu(struct rtas_suspend_me_data *data);
 extern int rtas_online_cpus_mask(cpumask_var_t cpus);
 extern int rtas_offline_cpus_mask(cpumask_var_t cpus);
-extern int rtas_ibm_suspend_me(u64 handle, int *vasi_return);
+extern int rtas_ibm_suspend_me(u64 handle);
 
 struct rtc_time;
 extern unsigned long rtas_get_boot_time(void);
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 21c45a2..603b928 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -897,7 +897,7 @@ int rtas_offline_cpus_mask(cpumask_var_t cpus)
 }
 EXPORT_SYMBOL(rtas_offline_cpus_mask);
 
-int rtas_ibm_suspend_me(u64 handle, int *vasi_return)
+int rtas_ibm_suspend_me(u64 handle)
 {
 	long state;
 	long rc;
@@ -919,13 +919,11 @@ int rtas_ibm_suspend_me(u64 handle, int *vasi_return)
 		printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned %ld\n",rc);
 		return rc;
 	} else if (state == H_VASI_ENABLED) {
-		*vasi_return = RTAS_NOT_SUSPENDABLE;
-		return 0;
+		return -EAGAIN;
 	} else if (state != H_VASI_SUSPENDING) {
 		printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned state %ld\n",
 		       state);
-		*vasi_return = -1;
-		return 0;
+		return -EIO;
 	}
 
 	if (!alloc_cpumask_var(&offline_mask, GFP_TEMPORARY))
@@ -1060,9 +1058,10 @@ asmlinkage int ppc_rtas(struct rtas_args __user *uargs)
 		int vasi_rc = 0;
 		u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32)
 		              | be32_to_cpu(args.args[1]);
-		rc = rtas_ibm_suspend_me(handle, &vasi_rc);
-		args.rets[0] = cpu_to_be32(vasi_rc);
-		if (rc)
+		rc = rtas_ibm_suspend_me(handle);
+		if (rc == -EAGAIN)
+			args.rets[0] = cpu_to_be32(RTAS_NOT_SUSPENDABLE);
+		else if (rc)
 			return rc;
 		goto copy_return;
 	}
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 90cf3dc..29e4f04 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -325,15 +325,13 @@ static ssize_t migrate_store(struct class *class, struct class_attribute *attr,
 		return rc;
 
 	do {
-		rc = rtas_ibm_suspend_me(streamid, &vasi_rc);
-		if (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE)
+		rc = rtas_ibm_suspend_me(streamid);
+		if (rc == -EAGAIN)
 			ssleep(1);
-	} while (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE);
+	} while (rc == -EAGAIN);
 
 	if (rc)
 		return rc;
-	if (vasi_rc)
-		return vasi_rc;
 
 	post_mobility_fixup();
 	return count;
-- 
1.7.12.2
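
The retry pattern that the patch introduces in migrate_store() can be sketched in plain user-space C. This is only an illustration under stated assumptions: suspend_fn, suspend_with_retry(), the delay callback, the max_tries cap and demo_suspend() are hypothetical names that do not appear in the patch. The kernel loop calls ssleep(1) between attempts and has no attempt limit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for rtas_ibm_suspend_me(): returns 0 on success,
 * -EAGAIN while the partition is not yet suspendable, or another negative
 * errno value on failure.
 */
typedef int (*suspend_fn)(uint64_t handle);

/*
 * Keep calling try_suspend while it reports -EAGAIN. The delay callback
 * (may be NULL) takes the place of ssleep(1); max_tries is an addition of
 * this sketch so that it always terminates in user space.
 */
static int suspend_with_retry(suspend_fn try_suspend, uint64_t handle,
                              void (*delay)(void), unsigned int max_tries)
{
    int rc = -EAGAIN;
    unsigned int i;

    assert(try_suspend != NULL);

    for (i = 0; i < max_tries; i++) {
        rc = try_suspend(handle);
        if (rc != -EAGAIN)
            break;          /* success or a hard failure: stop retrying */
        if (delay)
            delay();
    }
    return rc;
}

/* Demo callee: not suspendable for the first two calls, then succeeds. */
static int demo_calls;

static int demo_suspend(uint64_t handle)
{
    (void)handle;
    return (++demo_calls <= 2) ? -EAGAIN : 0;
}
```

With the single return code, the caller needs only one comparison per iteration, where the old interface required checking both rc and vasi_rc.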
* Re: [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration 2015-02-28 2:24 ` [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration Tyrel Datwyler @ 2015-03-02 4:19 ` Cyril Bur 2015-03-02 21:30 ` Tyrel Datwyler 0 siblings, 1 reply; 18+ messages in thread From: Cyril Bur @ 2015-03-02 4:19 UTC (permalink / raw) To: Tyrel Datwyler; +Cc: linuxppc-dev, nfont On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: > During suspend/migration operation we must wait for the VASI state reported > by the hypervisor to become Suspending prior to making the ibm,suspend-me > RTAS call. Calling routines to rtas_ibm_supend_me() pass a vasi_state variable > that exposes the VASI state to the caller. This is unnecessary as the caller > only really cares about the following three conditions; if there is an error > we should bailout, success indicating we have suspended and woken back up so > proceed to device tree updated, or we are not suspendable yet so try calling > rtas_ibm_suspend_me again shortly. > > This patch removes the extraneous vasi_state variable and simply uses the > return code to communicate how to proceed. We either succeed, fail, or get > -EAGAIN in which case we sleep for a second before trying to call > rtas_ibm_suspend_me again. 
> > Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> > --- > arch/powerpc/include/asm/rtas.h | 2 +- > arch/powerpc/kernel/rtas.c | 15 +++++++-------- > arch/powerpc/platforms/pseries/mobility.c | 8 +++----- > 3 files changed, 11 insertions(+), 14 deletions(-) > > diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h > index 2e23e92..fc85eb0 100644 > --- a/arch/powerpc/include/asm/rtas.h > +++ b/arch/powerpc/include/asm/rtas.h > @@ -327,7 +327,7 @@ extern int rtas_suspend_cpu(struct rtas_suspend_me_data *data); > extern int rtas_suspend_last_cpu(struct rtas_suspend_me_data *data); > extern int rtas_online_cpus_mask(cpumask_var_t cpus); > extern int rtas_offline_cpus_mask(cpumask_var_t cpus); > -extern int rtas_ibm_suspend_me(u64 handle, int *vasi_return); > +extern int rtas_ibm_suspend_me(u64 handle); > I like ditching vasi_return, I was never happy with myself for doing that! > struct rtc_time; > extern unsigned long rtas_get_boot_time(void); > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index 21c45a2..603b928 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -897,7 +897,7 @@ int rtas_offline_cpus_mask(cpumask_var_t cpus) > } > EXPORT_SYMBOL(rtas_offline_cpus_mask); > > -int rtas_ibm_suspend_me(u64 handle, int *vasi_return) > +int rtas_ibm_suspend_me(u64 handle) That definition is actually in an #ifdef CONFIG_PPC_PSERIES, you'll need to change the definition for !CONFIG_PPC_PSERIES > { > long state; > long rc; > @@ -919,13 +919,11 @@ int rtas_ibm_suspend_me(u64 handle, int *vasi_return) > printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned %ld\n",rc); > return rc; > } else if (state == H_VASI_ENABLED) { > - *vasi_return = RTAS_NOT_SUSPENDABLE; > - return 0; > + return -EAGAIN; > } else if (state != H_VASI_SUSPENDING) { > printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned state %ld\n", > state); > - *vasi_return = -1; > - return 0; > + return -EIO; I've had a 
look as to how these return values get passed back up the stack, and
admittedly we're dealing with a confusing mess. I've compared back to before
my patch (which wasn't perfect either, it seems).
Both the state == H_VASI_ENABLED and state == H_VASI_SUSPENDING cause
ppc_rtas to go to copy_return and return 0 (albeit with an error
code in args.rets[0]). Because ppc_rtas goes back out to userland, I
hesitate to change any of that.

>  	}
>  
>  	if (!alloc_cpumask_var(&offline_mask, GFP_TEMPORARY))
> @@ -1060,9 +1058,10 @@ asmlinkage int ppc_rtas(struct rtas_args __user *uargs)
>  		int vasi_rc = 0;

This generates an unused variable warning.

>  		u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32)
>  		              | be32_to_cpu(args.args[1]);
> -		rc = rtas_ibm_suspend_me(handle, &vasi_rc);
> -		args.rets[0] = cpu_to_be32(vasi_rc);
> -		if (rc)
> +		rc = rtas_ibm_suspend_me(handle);
> +		if (rc == -EAGAIN)
> +			args.rets[0] = cpu_to_be32(RTAS_NOT_SUSPENDABLE);

(continuing on...) so perhaps here have
	rc = 0;
else if (rc == -EIO)
	args.rets[0] = cpu_to_be32(-1);
	rc = 0;
which should keep the original behaviour; the last thing we want to do
is break BE.

Might be worth checking that rc from rtas_ibm_suspend_me will only be
-EAGAIN and -EIO when they are explicitly set in rtas_ibm_suspend_me and
can't come back out from the hcall.
From reading PAPR we're OK there, but just as a thought it might be worth
returning errno as positive because hcall errors are going to be
negative, to make life easier at some point... but then we'll have to
remember to make them negative when going back to userland (and there
are two places...) so there's no perfect win here.

> + else if (rc) > return rc; > goto copy_return; > } > diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c > index 90cf3dc..29e4f04 100644 > --- a/arch/powerpc/platforms/pseries/mobility.c > +++ b/arch/powerpc/platforms/pseries/mobility.c > @@ -325,15 +325,13 @@ static ssize_t migrate_store(struct class *class, struct class_attribute *attr, > return rc; > > do { > - rc = rtas_ibm_suspend_me(streamid, &vasi_rc); > - if (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE) > + rc = rtas_ibm_suspend_me(streamid); > + if (rc == -EAGAIN) > ssleep(1); > - } while (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE); > + } while (rc == -EAGAIN); This is going to change the value of the error code. > > if (rc) > return rc; > - if (vasi_rc) > - return vasi_rc; > > post_mobility_fixup(); > return count; Thanks for taking it, it looks nicer now. Cyril ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration 2015-03-02 4:19 ` Cyril Bur @ 2015-03-02 21:30 ` Tyrel Datwyler 2015-03-03 6:15 ` Michael Ellerman 0 siblings, 1 reply; 18+ messages in thread From: Tyrel Datwyler @ 2015-03-02 21:30 UTC (permalink / raw) To: Cyril Bur; +Cc: linuxppc-dev, nfont On 03/01/2015 08:19 PM, Cyril Bur wrote: > On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >> During suspend/migration operation we must wait for the VASI state reported >> by the hypervisor to become Suspending prior to making the ibm,suspend-me >> RTAS call. Calling routines to rtas_ibm_supend_me() pass a vasi_state variable >> that exposes the VASI state to the caller. This is unnecessary as the caller >> only really cares about the following three conditions; if there is an error >> we should bailout, success indicating we have suspended and woken back up so >> proceed to device tree updated, or we are not suspendable yet so try calling >> rtas_ibm_suspend_me again shortly. >> >> This patch removes the extraneous vasi_state variable and simply uses the >> return code to communicate how to proceed. We either succeed, fail, or get >> -EAGAIN in which case we sleep for a second before trying to call >> rtas_ibm_suspend_me again. 
>> >> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> >> --- >> arch/powerpc/include/asm/rtas.h | 2 +- >> arch/powerpc/kernel/rtas.c | 15 +++++++-------- >> arch/powerpc/platforms/pseries/mobility.c | 8 +++----- >> 3 files changed, 11 insertions(+), 14 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h >> index 2e23e92..fc85eb0 100644 >> --- a/arch/powerpc/include/asm/rtas.h >> +++ b/arch/powerpc/include/asm/rtas.h >> @@ -327,7 +327,7 @@ extern int rtas_suspend_cpu(struct rtas_suspend_me_data *data); >> extern int rtas_suspend_last_cpu(struct rtas_suspend_me_data *data); >> extern int rtas_online_cpus_mask(cpumask_var_t cpus); >> extern int rtas_offline_cpus_mask(cpumask_var_t cpus); >> -extern int rtas_ibm_suspend_me(u64 handle, int *vasi_return); >> +extern int rtas_ibm_suspend_me(u64 handle); >> > I like ditching vasi_return, I was never happy with myself for doing > that! > >> struct rtc_time; >> extern unsigned long rtas_get_boot_time(void); >> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c >> index 21c45a2..603b928 100644 >> --- a/arch/powerpc/kernel/rtas.c >> +++ b/arch/powerpc/kernel/rtas.c >> @@ -897,7 +897,7 @@ int rtas_offline_cpus_mask(cpumask_var_t cpus) >> } >> EXPORT_SYMBOL(rtas_offline_cpus_mask); >> >> -int rtas_ibm_suspend_me(u64 handle, int *vasi_return) >> +int rtas_ibm_suspend_me(u64 handle) > > That definition is actually in an #ifdef CONFIG_PPC_PSERIES, you'll need > to change the definition for !CONFIG_PPC_PSERIES Good catch. I'll fix it there too. 
>> { >> long state; >> long rc; >> @@ -919,13 +919,11 @@ int rtas_ibm_suspend_me(u64 handle, int *vasi_return) >> printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned %ld\n",rc); >> return rc; >> } else if (state == H_VASI_ENABLED) { >> - *vasi_return = RTAS_NOT_SUSPENDABLE; >> - return 0; >> + return -EAGAIN; >> } else if (state != H_VASI_SUSPENDING) { >> printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned state %ld\n", >> state); >> - *vasi_return = -1; >> - return 0; >> + return -EIO; > > I've had a look as to how these return values get passed back up the > stack and admittedly were dealing with a confusing mess, I've compared > back to before my patch (which wasn't perfect either it seems). > Both the state == H_VASI_ENABLED and state == H_VASI_SUSPENDING cause > ppc_rtas to go to the copy_return and return 0 (albeit with an error > code in args.rets[0]), because rtas_ppc goes back to out userland, I > hesitate to change any of that. Agreed, that this is a bit of a mess. The problem is we have two call paths into rtas_ibm_suspend_me(). The one from migrate_store() and one from ppc_rtas(). I'll address each with your other comments below. >> } >> >> if (!alloc_cpumask_var(&offline_mask, GFP_TEMPORARY)) >> @@ -1060,9 +1058,10 @@ asmlinkage int ppc_rtas(struct rtas_args __user *uargs) >> int vasi_rc = 0; > > This generates unused variable warning. Sloppy on my part. Will remove. > >> u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32) >> | be32_to_cpu(args.args[1]); >> - rc = rtas_ibm_suspend_me(handle, &vasi_rc); >> - args.rets[0] = cpu_to_be32(vasi_rc); >> - if (rc) >> + rc = rtas_ibm_suspend_me(handle); >> + if (rc == -EAGAIN) >> + args.rets[0] = cpu_to_be32(RTAS_NOT_SUSPENDABLE); > > (continuing on...) so perhaps here have > rc = 0; > else if (rc == -EIO) > args.rets[0] = cpu_to_be32(-1); > rc = 0; > Which should keep the original behaviour, the last thing we want to do > is break BE. 
The biggest problem here is we are making what basically equates to a fake rtas call from drmgr which we intercept in ppc_rtas(). From there we make this special call to rtas_ibm_suspend_me() to check VASI state and do a bunch of other specialized work that needs to be setup prior to making the actual ibm,suspend-me rtas call. Since, we are cheating PAPR here I guess we can really handle it however we want. I chose to simply fail the rtas call in the case where rtas_ibm_suspend_me() fails with something other than -EAGAIN. In user space librtas will log errno for the failure and return RTAS_IO_ASSERT to drmgr which in turn will log that error and fail. Going forward we want to move drmgr to initiating migration through sysfs and not this clunky highway robbery of the rtas interface. So, for legacy purpose does it matter how we fail the call here? I'm open to either solution. If we choose to pass the error back through args.ret[0] what value do we choose? The following are all pretty standardized, but I don't think make sense here: -1: Hardware error -2: Busy -3: Parameter error 9000: Suspension Aborted The 9000 code maybe makes sense, but doesn't really convey that something bad a happened. In the end whatever value is passed in args.ret[0] drmgr will simply log. While I agree about not breaking BE I'm not sure how it would. All i've done is added the -EIO case to explicit failure. > > Might be worth checking that rc from rtas_ibm_suspend_me will only be > -EAGAIN and -EIO when they are explicitly set in rtas_ibm_suspend_me and > can't come back out from the hcall. > From reading PAPR we're ok there but just as a thought it might be worth > returning errno as positive because hcall errors are going to be > negative, to make life easier at some point... but then we'll have to > remember to make them negative when going back to userland (and there > are two places...) so there's no perfect win here. 
> There are a variety of things that could go wrong that aren't directly related to rtas. This is why I chose to explicitly fail the rtas call if we get anything other than 0 or -EAGAIN. >> + else if (rc) >> return rc; >> goto copy_return; >> } >> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c >> index 90cf3dc..29e4f04 100644 >> --- a/arch/powerpc/platforms/pseries/mobility.c >> +++ b/arch/powerpc/platforms/pseries/mobility.c >> @@ -325,15 +325,13 @@ static ssize_t migrate_store(struct class *class, struct class_attribute *attr, >> return rc; >> >> do { >> - rc = rtas_ibm_suspend_me(streamid, &vasi_rc); >> - if (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE) >> + rc = rtas_ibm_suspend_me(streamid); >> + if (rc == -EAGAIN) >> ssleep(1); >> - } while (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE); >> + } while (rc == -EAGAIN); > > This is going to change the value of the error code. Here drmgr assumes a zero or greater value to mean success, and anything negative failure. It logs errno in failure case. -Tyrel >> >> if (rc) >> return rc; >> - if (vasi_rc) >> - return vasi_rc; >> >> post_mobility_fixup(); >> return count; > > Thanks for taking it, it looks nicer now. > > Cyril > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration
From: Michael Ellerman @ 2015-03-03 6:15 UTC
To: Tyrel Datwyler; +Cc: linuxppc-dev, Cyril Bur, nfont

On Mon, 2015-03-02 at 13:30 -0800, Tyrel Datwyler wrote:
> On 03/01/2015 08:19 PM, Cyril Bur wrote:
> > On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote:
> >> During suspend/migration operation we must wait for the VASI state reported
> >> by the hypervisor to become Suspending prior to making the ibm,suspend-me
> >> RTAS call. Calling routines to rtas_ibm_supend_me() pass a vasi_state variable
> >> that exposes the VASI state to the caller. This is unnecessary as the caller
> >> only really cares about the following three conditions; if there is an error
> >> we should bailout, success indicating we have suspended and woken back up so
> >> proceed to device tree updated, or we are not suspendable yet so try calling
> >> rtas_ibm_suspend_me again shortly.
> >>
> >> This patch removes the extraneous vasi_state variable and simply uses the
> >> return code to communicate how to proceed. We either succeed, fail, or get
> >> -EAGAIN in which case we sleep for a second before trying to call
> >> rtas_ibm_suspend_me again.
> >>
> >>  		u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32)
> >>  		              | be32_to_cpu(args.args[1]);
> >> -		rc = rtas_ibm_suspend_me(handle, &vasi_rc);
> >> -		args.rets[0] = cpu_to_be32(vasi_rc);
> >> -		if (rc)
> >> +		rc = rtas_ibm_suspend_me(handle);
> >> +		if (rc == -EAGAIN)
> >> +			args.rets[0] = cpu_to_be32(RTAS_NOT_SUSPENDABLE);
> >
> > (continuing on...) so perhaps here have
> > 	rc = 0;
> > else if (rc == -EIO)
> > 	args.rets[0] = cpu_to_be32(-1);
> > 	rc = 0;
> > Which should keep the original behaviour, the last thing we want to do
> > is break BE.
>
> The biggest problem here is we are making what basically equates to a
> fake rtas call from drmgr which we intercept in ppc_rtas(). From there
> we make this special call to rtas_ibm_suspend_me() to check VASI state
> and do a bunch of other specialized work that needs to be setup prior to
> making the actual ibm,suspend-me rtas call. Since, we are cheating PAPR
> here I guess we can really handle it however we want. I chose to simply
> fail the rtas call in the case where rtas_ibm_suspend_me() fails with
> something other than -EAGAIN. In user space librtas will log errno for
> the failure and return RTAS_IO_ASSERT to drmgr which in turn will log
> that error and fail.

We don't want to change the return values of the syscall unless we absolutely
have to. And I don't think that's the case here.

Sure we think drmgr is the only thing that uses this crap, but we don't know
for sure.

cheers
* Re: [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration 2015-03-03 6:15 ` Michael Ellerman @ 2015-03-03 20:16 ` Tyrel Datwyler 2015-03-04 15:58 ` Nathan Fontenot 0 siblings, 1 reply; 18+ messages in thread From: Tyrel Datwyler @ 2015-03-03 20:16 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, Cyril Bur, nfont On 03/02/2015 10:15 PM, Michael Ellerman wrote: > On Mon, 2015-03-02 at 13:30 -0800, Tyrel Datwyler wrote: >> On 03/01/2015 08:19 PM, Cyril Bur wrote: >>> On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >>>> During suspend/migration operation we must wait for the VASI state reported >>>> by the hypervisor to become Suspending prior to making the ibm,suspend-me >>>> RTAS call. Calling routines to rtas_ibm_supend_me() pass a vasi_state variable >>>> that exposes the VASI state to the caller. This is unnecessary as the caller >>>> only really cares about the following three conditions; if there is an error >>>> we should bailout, success indicating we have suspended and woken back up so >>>> proceed to device tree updated, or we are not suspendable yet so try calling >>>> rtas_ibm_suspend_me again shortly. >>>> >>>> This patch removes the extraneous vasi_state variable and simply uses the >>>> return code to communicate how to proceed. We either succeed, fail, or get >>>> -EAGAIN in which case we sleep for a second before trying to call >>>> rtas_ibm_suspend_me again. >>>> >>>> u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32) >>>> | be32_to_cpu(args.args[1]); >>>> - rc = rtas_ibm_suspend_me(handle, &vasi_rc); >>>> - args.rets[0] = cpu_to_be32(vasi_rc); >>>> - if (rc) >>>> + rc = rtas_ibm_suspend_me(handle); >>>> + if (rc == -EAGAIN) >>>> + args.rets[0] = cpu_to_be32(RTAS_NOT_SUSPENDABLE); >>> >>> (continuing on...) 
so perhaps here have >>> rc = 0; >>> else if (rc == -EIO) >>> args.rets[0] = cpu_to_be32(-1); >>> rc = 0; >>> Which should keep the original behaviour, the last thing we want to do >>> is break BE. >> >> The biggest problem here is we are making what basically equates to a >> fake rtas call from drmgr which we intercept in ppc_rtas(). From there >> we make this special call to rtas_ibm_suspend_me() to check VASI state >> and do a bunch of other specialized work that needs to be setup prior to >> making the actual ibm,suspend-me rtas call. Since, we are cheating PAPR >> here I guess we can really handle it however we want. I chose to simply >> fail the rtas call in the case where rtas_ibm_suspend_me() fails with >> something other than -EAGAIN. In user space librtas will log errno for >> the failure and return RTAS_IO_ASSERT to drmgr which in turn will log >> that error and fail. > > We don't want to change the return values of the syscall unless we absolutely > have to. And I don't think that's the case here. I'd like to argue that the one case I changed makes sense, but its just as easy to keep the original behavior. > > Sure we think drmgr is the only thing that uses this crap, but we don't know > for sure. I can't imagine how anybody else could possibly use this hack without a streamid from the hmc/hypervisor, but I've been wrong in the past more times than I can count. :) -Tyrel > > cheers > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration 2015-03-03 20:16 ` Tyrel Datwyler @ 2015-03-04 15:58 ` Nathan Fontenot 0 siblings, 0 replies; 18+ messages in thread From: Nathan Fontenot @ 2015-03-04 15:58 UTC (permalink / raw) To: Tyrel Datwyler, Michael Ellerman; +Cc: linuxppc-dev, Cyril Bur On 03/03/2015 02:16 PM, Tyrel Datwyler wrote: > On 03/02/2015 10:15 PM, Michael Ellerman wrote: >> On Mon, 2015-03-02 at 13:30 -0800, Tyrel Datwyler wrote: >>> On 03/01/2015 08:19 PM, Cyril Bur wrote: >>>> On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >>>>> During suspend/migration operation we must wait for the VASI state reported >>>>> by the hypervisor to become Suspending prior to making the ibm,suspend-me >>>>> RTAS call. Calling routines to rtas_ibm_supend_me() pass a vasi_state variable >>>>> that exposes the VASI state to the caller. This is unnecessary as the caller >>>>> only really cares about the following three conditions; if there is an error >>>>> we should bailout, success indicating we have suspended and woken back up so >>>>> proceed to device tree updated, or we are not suspendable yet so try calling >>>>> rtas_ibm_suspend_me again shortly. >>>>> >>>>> This patch removes the extraneous vasi_state variable and simply uses the >>>>> return code to communicate how to proceed. We either succeed, fail, or get >>>>> -EAGAIN in which case we sleep for a second before trying to call >>>>> rtas_ibm_suspend_me again. >>>>> >>>>> u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32) >>>>> | be32_to_cpu(args.args[1]); >>>>> - rc = rtas_ibm_suspend_me(handle, &vasi_rc); >>>>> - args.rets[0] = cpu_to_be32(vasi_rc); >>>>> - if (rc) >>>>> + rc = rtas_ibm_suspend_me(handle); >>>>> + if (rc == -EAGAIN) >>>>> + args.rets[0] = cpu_to_be32(RTAS_NOT_SUSPENDABLE); >>>> >>>> (continuing on...) 
so perhaps here have >>>> rc = 0; >>>> else if (rc == -EIO) >>>> args.rets[0] = cpu_to_be32(-1); >>>> rc = 0; >>>> Which should keep the original behaviour, the last thing we want to do >>>> is break BE. >>> >>> The biggest problem here is we are making what basically equates to a >>> fake rtas call from drmgr which we intercept in ppc_rtas(). From there >>> we make this special call to rtas_ibm_suspend_me() to check VASI state >>> and do a bunch of other specialized work that needs to be setup prior to >>> making the actual ibm,suspend-me rtas call. Since, we are cheating PAPR >>> here I guess we can really handle it however we want. I chose to simply >>> fail the rtas call in the case where rtas_ibm_suspend_me() fails with >>> something other than -EAGAIN. In user space librtas will log errno for >>> the failure and return RTAS_IO_ASSERT to drmgr which in turn will log >>> that error and fail. >> >> We don't want to change the return values of the syscall unless we absolutely >> have to. And I don't think that's the case here. > > I'd like to argue that the one case I changed makes sense, but its just > as easy to keep the original behavior. > >> >> Sure we think drmgr is the only thing that uses this crap, but we don't know >> for sure. > > I can't imagine how anybody else could possibly use this hack without a > streamid from the hmc/hypervisor, but I've been wrong in the past more > times than I can count. :) Correct, this will fail if called with a random streamid. The streamid has to match what is handed to us from the HMC when a migration request is initiated. -Nathan > > -Tyrel > >> >> cheers >> >> > ^ permalink raw reply [flat|nested] 18+ messages in thread
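
The behaviour-preserving option discussed in this subthread (the syscall keeps returning 0 and the status travels in args.rets[0]) can be sketched as a small mapping function. This is a user-space illustration, not the code that was eventually merged: map_suspend_rc() is a hypothetical helper, and the value used for RTAS_NOT_SUSPENDABLE is an assumption that should be checked against asm/rtas.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only; check asm/rtas.h. */
#define RTAS_NOT_SUSPENDABLE (-9004)

/*
 * Translate the return code of rtas_ibm_suspend_me() into the pair that
 * user space saw before the patch: the syscall return value (the function
 * result) and the RTAS status word (*status, which ppc_rtas() would store
 * in args.rets[0] after cpu_to_be32()).
 */
static int map_suspend_rc(int rc, int32_t *status)
{
    assert(status != NULL);

    switch (rc) {
    case 0:                 /* suspended and resumed */
        *status = 0;
        return 0;
    case -EAGAIN:           /* VASI state still Enabled */
        *status = RTAS_NOT_SUSPENDABLE;
        return 0;
    case -EIO:              /* unexpected VASI state */
        *status = -1;
        return 0;
    default:                /* anything else fails the syscall itself */
        return rc;
    }
}
```

Only return codes that rtas_ibm_suspend_me() sets explicitly are folded into the status word; any other error still fails the syscall, which matches the concern raised above about codes coming back from the hcall.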
* [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update
From: Tyrel Datwyler @ 2015-02-28 2:24 UTC
To: linuxppc-dev; +Cc: Tyrel Datwyler, cyrilbur, nfont

We currently use the device tree update code in the kernel after resuming
from a suspend operation to re-sync the kernel's view of the device tree with
that of the hypervisor. The code as it stands is not endian safe, as it relies
on parsing buffers returned by RTAS calls, which contain data in big endian
format.

This patch annotates variables and structure members with __be types and
performs the necessary byte swaps to cpu endian for data that needs to be
parsed.

Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/mobility.c | 36 ++++++++++++++++---------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 29e4f04..0b1f70e 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -25,10 +25,10 @@
 static struct kobject *mobility_kobj;

 struct update_props_workarea {
-	u32 phandle;
-	u32 state;
-	u64 reserved;
-	u32 nprops;
+	__be32 phandle;
+	__be32 state;
+	__be64 reserved;
+	__be32 nprops;
 } __packed;

 #define NODE_ACTION_MASK	0xff000000
@@ -127,7 +127,7 @@ static int update_dt_property(struct device_node *dn, struct property **prop,
 	return 0;
 }

-static int update_dt_node(u32 phandle, s32 scope)
+static int update_dt_node(__be32 phandle, s32 scope)
 {
 	struct update_props_workarea *upwa;
 	struct device_node *dn;
@@ -136,6 +136,7 @@ static int update_dt_node(u32 phandle, s32 scope)
 	char *prop_data;
 	char *rtas_buf;
 	int update_properties_token;
+	u32 nprops;
 	u32 vd;

 	update_properties_token = rtas_token("ibm,update-properties");
@@ -162,6 +163,7 @@ static int update_dt_node(u32 phandle, s32 scope)
 			break;

 		prop_data = rtas_buf + sizeof(*upwa);
+		nprops = be32_to_cpu(upwa->nprops);

 		/* On the first call to ibm,update-properties for a node the
 		 * the first property value descriptor contains an empty
@@ -170,17 +172,17 @@ static int update_dt_node(u32 phandle, s32 scope)
 		 */
 		if (*prop_data == 0) {
 			prop_data++;
-			vd = *(u32 *)prop_data;
+			vd = be32_to_cpu(*(__be32 *)prop_data);
 			prop_data += vd + sizeof(vd);
-			upwa->nprops--;
+			nprops--;
 		}

-		for (i = 0; i < upwa->nprops; i++) {
+		for (i = 0; i < nprops; i++) {
 			char *prop_name;

 			prop_name = prop_data;
 			prop_data += strlen(prop_name) + 1;
-			vd = *(u32 *)prop_data;
+			vd = be32_to_cpu(*(__be32 *)prop_data);
 			prop_data += sizeof(vd);

 			switch (vd) {
@@ -212,7 +214,7 @@ static int update_dt_node(u32 phandle, s32 scope)
 	return 0;
 }

-static int add_dt_node(u32 parent_phandle, u32 drc_index)
+static int add_dt_node(__be32 parent_phandle, __be32 drc_index)
 {
 	struct device_node *dn;
 	struct device_node *parent_dn;
@@ -237,7 +239,7 @@ static int add_dt_node(u32 parent_phandle, u32 drc_index)
 int pseries_devicetree_update(s32 scope)
 {
 	char *rtas_buf;
-	u32 *data;
+	__be32 *data;
 	int update_nodes_token;
 	int rc;

@@ -254,17 +256,17 @@ int pseries_devicetree_update(s32 scope)
 		if (rc && rc != 1)
 			break;

-		data = (u32 *)rtas_buf + 4;
-		while (*data & NODE_ACTION_MASK) {
+		data = (__be32 *)rtas_buf + 4;
+		while (be32_to_cpu(*data) & NODE_ACTION_MASK) {
 			int i;
-			u32 action = *data & NODE_ACTION_MASK;
-			int node_count = *data & NODE_COUNT_MASK;
+			u32 action = be32_to_cpu(*data) & NODE_ACTION_MASK;
+			u32 node_count = be32_to_cpu(*data) & NODE_COUNT_MASK;

 			data++;

 			for (i = 0; i < node_count; i++) {
-				u32 phandle = *data++;
-				u32 drc_index;
+				__be32 phandle = *data++;
+				__be32 drc_index;

 				switch (action) {
 				case DELETE_DT_NODE:
--
1.7.12.2

^ permalink raw reply related [flat|nested] 18+ messages in thread
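The conversions in the patch above all follow one pattern: load a 32-bit word from the RTAS buffer, convert it from big endian, and only then apply a mask. Below is a minimal userspace sketch of that pattern, not the kernel code itself. `read_be32`, `struct node_word`, and `parse_node_word` are hypothetical names, and the value of `NODE_COUNT_MASK` does not appear in the hunk, so the low 24 bits used here are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* NODE_ACTION_MASK is taken from the patch; NODE_COUNT_MASK is assumed. */
#define NODE_ACTION_MASK 0xff000000u
#define NODE_COUNT_MASK  0x00ffffffu

/* Decode a 32-bit big-endian value from raw bytes, on any host. */
uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical container for the two fields packed into one word. */
struct node_word {
	uint32_t action;
	uint32_t count;
};

/* Convert first, mask second: masking the raw bytes on a little-endian
 * host would pick the wrong byte for each field. */
struct node_word parse_node_word(const unsigned char *p)
{
	uint32_t w = read_be32(p);
	struct node_word out = { w & NODE_ACTION_MASK, w & NODE_COUNT_MASK };

	return out;
}

/* Self-check on a hand-built buffer: action byte 0x02, count 3. */
void check_parse_node_word(void)
{
	const unsigned char buf[4] = { 0x02, 0x00, 0x00, 0x03 };
	struct node_word w = parse_node_word(buf);

	assert(w.action == 0x02000000u);
	assert(w.count == 3);
}
```

In the kernel, `be32_to_cpu()` plays the role of `read_be32()` here; decoding byte by byte simply keeps the sketch independent of the host's byte order.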
* Re: [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update 2015-02-28 2:24 ` [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update Tyrel Datwyler @ 2015-03-02 5:20 ` Cyril Bur 2015-03-02 21:49 ` Tyrel Datwyler 0 siblings, 1 reply; 18+ messages in thread From: Cyril Bur @ 2015-03-02 5:20 UTC (permalink / raw) To: Tyrel Datwyler; +Cc: linuxppc-dev, nfont On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: > We currently use the device tree update code in the kernel after resuming > from a suspend operation to re-sync the kernels view of the device tree with > that of the hypervisor. The code as it stands is not endian safe as it relies > on parsing buffers returned by RTAS calls that thusly contains data in big > endian format. > > This patch annotates variables and structure members with __be types as well > as performing necessary byte swaps to cpu endian for data that needs to be > parsed. > > Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> > --- > arch/powerpc/platforms/pseries/mobility.c | 36 ++++++++++++++++--------------- > 1 file changed, 19 insertions(+), 17 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c > index 29e4f04..0b1f70e 100644 > --- a/arch/powerpc/platforms/pseries/mobility.c > +++ b/arch/powerpc/platforms/pseries/mobility.c > @@ -25,10 +25,10 @@ > static struct kobject *mobility_kobj; > > struct update_props_workarea { > - u32 phandle; > - u32 state; > - u64 reserved; > - u32 nprops; > + __be32 phandle; > + __be32 state; > + __be64 reserved; > + __be32 nprops; > } __packed; > > #define NODE_ACTION_MASK 0xff000000 > @@ -127,7 +127,7 @@ static int update_dt_property(struct device_node *dn, struct property **prop, > return 0; > } > > -static int update_dt_node(u32 phandle, s32 scope) > +static int update_dt_node(__be32 phandle, s32 scope) > { On line 153 of this function: dn = 
of_find_node_by_phandle(phandle);

You're passing a __be32 to device tree code. If we can treat the phandle
as an opaque value returned to us from the rtas call and pass it around
like that, then all good.

It's also hard to be sure if these need to be BE and have always been
that way because we've always run BE, so they've never actually wanted
CPU endian; it's just that CPU endian has always been BE (I think I
started rambling...)

Just want to check that *not* converting them is done on purpose.

And having read on, I'm assuming the answer is yes since this
observation is true for your changes which affect:
	delete_dt_node()
	update_dt_node()
	add_dt_node()
Worth noting that you didn't change the definition of delete_dt_node().

I'll have a look once you address the non-compiling code in patch 1/3
(I'm getting blocked by the unused var because somehow Werror is on, odd
it didn't trip you up), but I also suspect this will have sparse go a bit
nuts.
I wonder if there is a nice way of shutting sparse up.

>  	struct update_props_workarea *upwa;
>  	struct device_node *dn;
> @@ -136,6 +136,7 @@ static int update_dt_node(u32 phandle, s32 scope)
>  	char *prop_data;
>  	char *rtas_buf;
>  	int update_properties_token;
> +	u32 nprops;
>  	u32 vd;
>
>  	update_properties_token = rtas_token("ibm,update-properties");
> @@ -162,6 +163,7 @@ static int update_dt_node(u32 phandle, s32 scope)
>  			break;
>
>  		prop_data = rtas_buf + sizeof(*upwa);
> +		nprops = be32_to_cpu(upwa->nprops);
>
>  		/* On the first call to ibm,update-properties for a node the
>  		 * the first property value descriptor contains an empty
> @@ -170,17 +172,17 @@ static int update_dt_node(u32 phandle, s32 scope)
>  		 */
>  		if (*prop_data == 0) {
>  			prop_data++;
> -			vd = *(u32 *)prop_data;
> +			vd = be32_to_cpu(*(__be32 *)prop_data);
>  			prop_data += vd + sizeof(vd);
> -			upwa->nprops--;
> +			nprops--;
>  		}
>
> -		for (i = 0; i < upwa->nprops; i++) {
> +		for (i = 0; i < nprops; i++) {
>  			char *prop_name;
>
>  			prop_name = prop_data;
>  			prop_data += strlen(prop_name) +
1; > - vd = *(u32 *)prop_data; > + vd = be32_to_cpu(*(__be32 *)prop_data); > prop_data += sizeof(vd); > > switch (vd) { > @@ -212,7 +214,7 @@ static int update_dt_node(u32 phandle, s32 scope) > return 0; > } > > -static int add_dt_node(u32 parent_phandle, u32 drc_index) > +static int add_dt_node(__be32 parent_phandle, __be32 drc_index) > { > struct device_node *dn; > struct device_node *parent_dn; > @@ -237,7 +239,7 @@ static int add_dt_node(u32 parent_phandle, u32 drc_index) > int pseries_devicetree_update(s32 scope) > { > char *rtas_buf; > - u32 *data; > + __be32 *data; > int update_nodes_token; > int rc; > > @@ -254,17 +256,17 @@ int pseries_devicetree_update(s32 scope) > if (rc && rc != 1) > break; > > - data = (u32 *)rtas_buf + 4; > - while (*data & NODE_ACTION_MASK) { > + data = (__be32 *)rtas_buf + 4; > + while (be32_to_cpu(*data) & NODE_ACTION_MASK) { > int i; > - u32 action = *data & NODE_ACTION_MASK; > - int node_count = *data & NODE_COUNT_MASK; > + u32 action = be32_to_cpu(*data) & NODE_ACTION_MASK; > + u32 node_count = be32_to_cpu(*data) & NODE_COUNT_MASK; > > data++; > > for (i = 0; i < node_count; i++) { > - u32 phandle = *data++; > - u32 drc_index; > + __be32 phandle = *data++; > + __be32 drc_index; > > switch (action) { > case DELETE_DT_NODE: The patch looks good, no nonsense endian fixing. Worth noting that it leaves existing bugs in place, which is fine, I'll rebase my patches which address endian and bugs on top of these so as to address the bugs. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update 2015-03-02 5:20 ` Cyril Bur @ 2015-03-02 21:49 ` Tyrel Datwyler 2015-03-03 23:15 ` Tyrel Datwyler 0 siblings, 1 reply; 18+ messages in thread From: Tyrel Datwyler @ 2015-03-02 21:49 UTC (permalink / raw) To: Cyril Bur; +Cc: linuxppc-dev, nfont On 03/01/2015 09:20 PM, Cyril Bur wrote: > On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >> We currently use the device tree update code in the kernel after resuming >> from a suspend operation to re-sync the kernels view of the device tree with >> that of the hypervisor. The code as it stands is not endian safe as it relies >> on parsing buffers returned by RTAS calls that thusly contains data in big >> endian format. >> >> This patch annotates variables and structure members with __be types as well >> as performing necessary byte swaps to cpu endian for data that needs to be >> parsed. >> >> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> >> --- >> arch/powerpc/platforms/pseries/mobility.c | 36 ++++++++++++++++--------------- >> 1 file changed, 19 insertions(+), 17 deletions(-) >> >> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c >> index 29e4f04..0b1f70e 100644 >> --- a/arch/powerpc/platforms/pseries/mobility.c >> +++ b/arch/powerpc/platforms/pseries/mobility.c >> @@ -25,10 +25,10 @@ >> static struct kobject *mobility_kobj; >> >> struct update_props_workarea { >> - u32 phandle; >> - u32 state; >> - u64 reserved; >> - u32 nprops; >> + __be32 phandle; >> + __be32 state; >> + __be64 reserved; >> + __be32 nprops; >> } __packed; >> >> #define NODE_ACTION_MASK 0xff000000 >> @@ -127,7 +127,7 @@ static int update_dt_property(struct device_node *dn, struct property **prop, >> return 0; >> } >> >> -static int update_dt_node(u32 phandle, s32 scope) >> +static int update_dt_node(__be32 phandle, s32 scope) >> { > > On line 153 of this function: > dn = 
of_find_node_by_phandle(phandle);
>
> You're passing a __be32 to device tree code, if we can treat the phandle
> as a opaque value returned to us from the rtas call and pass it around
> like that then all good.

Yes, of_find_node_by_phandle directly compares the phandle passed in
against the handle stored in each device_node when searching for a
matching node. Since the device tree is big endian, it follows that the
big endian phandle received in the rtas buffer needs no conversion.

Further, we need to pass the phandle to ibm,update-properties in the
work area, which is also required to be big endian. So, again, it seemed
that converting to cpu endian was a waste of effort just to convert it
back to big endian.

> Its also hard to be sure if these need to be BE and have always been
> that way because we've always run BE so they've never actually wanted
> CPU endian its just that CPU endian has always been BE (I think I
> started rambling...)
>
> Just want to check that *not* converting them is done on purpose.

Yes, I explicitly did not convert them on purpose. As mentioned above, we
need phandle in BE for the ibm,update-properties rtas work area.
Similarly, drc_index needs to be in BE for the ibm,configure-connector
rtas work area. Outside of that, we do no other manipulation of those
values.

>
> And having read on, I'm assuming the answer is yes since this
> observation is true for your changes which affect:
> delete_dt_node()
> update_dt_node()
> add_dt_node()
> Worth noting that you didn't change the definition of delete_dt_node()

You are correct. Oversight. I will fix that, as it should generate a
sparse complaint.

-Tyrel

>
> I'll have a look once you address the non compiling in patch 1/3 (I'm
> getting blocked the unused var because somehow Werror is on, odd it
> didn't trip you up) but I also suspect this will have sparse go a bit
> nuts.
> I wonder if there is a nice way of shutting sparse up.
> >> struct update_props_workarea *upwa; >> struct device_node *dn; >> @@ -136,6 +136,7 @@ static int update_dt_node(u32 phandle, s32 scope) >> char *prop_data; >> char *rtas_buf; >> int update_properties_token; >> + u32 nprops; >> u32 vd; >> >> update_properties_token = rtas_token("ibm,update-properties"); >> @@ -162,6 +163,7 @@ static int update_dt_node(u32 phandle, s32 scope) >> break; >> >> prop_data = rtas_buf + sizeof(*upwa); >> + nprops = be32_to_cpu(upwa->nprops); >> >> /* On the first call to ibm,update-properties for a node the >> * the first property value descriptor contains an empty >> @@ -170,17 +172,17 @@ static int update_dt_node(u32 phandle, s32 scope) >> */ >> if (*prop_data == 0) { >> prop_data++; >> - vd = *(u32 *)prop_data; >> + vd = be32_to_cpu(*(__be32 *)prop_data); >> prop_data += vd + sizeof(vd); >> - upwa->nprops--; >> + nprops--; >> } >> >> - for (i = 0; i < upwa->nprops; i++) { >> + for (i = 0; i < nprops; i++) { >> char *prop_name; >> >> prop_name = prop_data; >> prop_data += strlen(prop_name) + 1; >> - vd = *(u32 *)prop_data; >> + vd = be32_to_cpu(*(__be32 *)prop_data); >> prop_data += sizeof(vd); >> >> switch (vd) { >> @@ -212,7 +214,7 @@ static int update_dt_node(u32 phandle, s32 scope) >> return 0; >> } >> >> -static int add_dt_node(u32 parent_phandle, u32 drc_index) >> +static int add_dt_node(__be32 parent_phandle, __be32 drc_index) >> { >> struct device_node *dn; >> struct device_node *parent_dn; >> @@ -237,7 +239,7 @@ static int add_dt_node(u32 parent_phandle, u32 drc_index) >> int pseries_devicetree_update(s32 scope) >> { >> char *rtas_buf; >> - u32 *data; >> + __be32 *data; >> int update_nodes_token; >> int rc; >> >> @@ -254,17 +256,17 @@ int pseries_devicetree_update(s32 scope) >> if (rc && rc != 1) >> break; >> >> - data = (u32 *)rtas_buf + 4; >> - while (*data & NODE_ACTION_MASK) { >> + data = (__be32 *)rtas_buf + 4; >> + while (be32_to_cpu(*data) & NODE_ACTION_MASK) { >> int i; >> - u32 action = *data & NODE_ACTION_MASK; >> 
- int node_count = *data & NODE_COUNT_MASK; >> + u32 action = be32_to_cpu(*data) & NODE_ACTION_MASK; >> + u32 node_count = be32_to_cpu(*data) & NODE_COUNT_MASK; >> >> data++; >> >> for (i = 0; i < node_count; i++) { >> - u32 phandle = *data++; >> - u32 drc_index; >> + __be32 phandle = *data++; >> + __be32 drc_index; >> >> switch (action) { >> case DELETE_DT_NODE: > > The patch looks good, no nonsense endian fixing. > Worth noting that it leaves existing bugs in place, which is fine, I'll > rebase my patches which address endian and bugs on top of these so as to > address the bugs. > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update 2015-03-02 21:49 ` Tyrel Datwyler @ 2015-03-03 23:15 ` Tyrel Datwyler 2015-03-04 1:20 ` Cyril Bur 0 siblings, 1 reply; 18+ messages in thread From: Tyrel Datwyler @ 2015-03-03 23:15 UTC (permalink / raw) To: Cyril Bur; +Cc: nfont, linuxppc-dev On 03/02/2015 01:49 PM, Tyrel Datwyler wrote: > On 03/01/2015 09:20 PM, Cyril Bur wrote: >> On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >>> We currently use the device tree update code in the kernel after resuming >>> from a suspend operation to re-sync the kernels view of the device tree with >>> that of the hypervisor. The code as it stands is not endian safe as it relies >>> on parsing buffers returned by RTAS calls that thusly contains data in big >>> endian format. >>> >>> This patch annotates variables and structure members with __be types as well >>> as performing necessary byte swaps to cpu endian for data that needs to be >>> parsed. >>> >>> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> >>> --- >>> arch/powerpc/platforms/pseries/mobility.c | 36 ++++++++++++++++--------------- >>> 1 file changed, 19 insertions(+), 17 deletions(-) >>> >>> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c >>> index 29e4f04..0b1f70e 100644 >>> --- a/arch/powerpc/platforms/pseries/mobility.c >>> +++ b/arch/powerpc/platforms/pseries/mobility.c >>> @@ -25,10 +25,10 @@ >>> static struct kobject *mobility_kobj; >>> >>> struct update_props_workarea { >>> - u32 phandle; >>> - u32 state; >>> - u64 reserved; >>> - u32 nprops; >>> + __be32 phandle; >>> + __be32 state; >>> + __be64 reserved; >>> + __be32 nprops; >>> } __packed; >>> >>> #define NODE_ACTION_MASK 0xff000000 >>> @@ -127,7 +127,7 @@ static int update_dt_property(struct device_node *dn, struct property **prop, >>> return 0; >>> } >>> >>> -static int update_dt_node(u32 phandle, s32 scope) >>> +static int update_dt_node(__be32 
phandle, s32 scope)
>>>  {
>>
>> On line 153 of this function:
>> dn = of_find_node_by_phandle(phandle);
>>
>> You're passing a __be32 to device tree code, if we can treat the phandle
>> as a opaque value returned to us from the rtas call and pass it around
>> like that then all good.

After digging deeper, it turns out that device_node->phandle is stored in
cpu endian under the covers. So, for of_find_node_by_phandle() we do need
to convert the phandle to cpu endian first. It appears I got lucky with
the update fixing the observed RMC issue because the phandle for the root
node seems to always be 0xffffffff, which is the same value in either
byte order.

-Tyrel

>
> Yes, of_find_node_by_phandle directly compares phandle passed in against
> the handle stored in each device_node when searching for a matching
> node. Since, the device tree is big endian it follows that the big
> endian phandle received in the rtas buffer needs no conversion.
>
> Further, we need to pass the phandle to ibm,update-properties in the
> work area which is also required to be big endian. So, again it seemed
> that converting to cpu endian was a waste of effort just to convert it
> back to big endian.
>
>> Its also hard to be sure if these need to be BE and have always been
>> that way because we've always run BE so they've never actually wanted
>> CPU endian its just that CPU endian has always been BE (I think I
>> started rambling...)
>>
>> Just want to check that *not* converting them is done on purpose.
>
> Yes, I explicitly did not convert them on purpose. As mentioned above we
> need phandle in BE for the ibm,update-properties rtas work area.
> Similarly, drc_index needs to be in BE for the ibm,configure-connector
> rtas work area. Outside, of that we do no other manipulation of those
> values.
> >> >> And having read on, I'm assuming the answer is yes since this >> observation is true for your changes which affect: >> delete_dt_node() >> update_dt_node() >> add_dt_node() >> Worth noting that you didn't change the definition of delete_dt_node() > > You are correct. Oversight. I will fix that as it should generate a > sparse complaint. > > -Tyrel > >> >> I'll have a look once you address the non compiling in patch 1/3 (I'm >> getting blocked the unused var because somehow Werror is on, odd it >> didn't trip you up) but I also suspect this will have sparse go a bit >> nuts. >> I wonder if there is a nice way of shutting sparse up. >> >>> struct update_props_workarea *upwa; >>> struct device_node *dn; >>> @@ -136,6 +136,7 @@ static int update_dt_node(u32 phandle, s32 scope) >>> char *prop_data; >>> char *rtas_buf; >>> int update_properties_token; >>> + u32 nprops; >>> u32 vd; >>> >>> update_properties_token = rtas_token("ibm,update-properties"); >>> @@ -162,6 +163,7 @@ static int update_dt_node(u32 phandle, s32 scope) >>> break; >>> >>> prop_data = rtas_buf + sizeof(*upwa); >>> + nprops = be32_to_cpu(upwa->nprops); >>> >>> /* On the first call to ibm,update-properties for a node the >>> * the first property value descriptor contains an empty >>> @@ -170,17 +172,17 @@ static int update_dt_node(u32 phandle, s32 scope) >>> */ >>> if (*prop_data == 0) { >>> prop_data++; >>> - vd = *(u32 *)prop_data; >>> + vd = be32_to_cpu(*(__be32 *)prop_data); >>> prop_data += vd + sizeof(vd); >>> - upwa->nprops--; >>> + nprops--; >>> } >>> >>> - for (i = 0; i < upwa->nprops; i++) { >>> + for (i = 0; i < nprops; i++) { >>> char *prop_name; >>> >>> prop_name = prop_data; >>> prop_data += strlen(prop_name) + 1; >>> - vd = *(u32 *)prop_data; >>> + vd = be32_to_cpu(*(__be32 *)prop_data); >>> prop_data += sizeof(vd); >>> >>> switch (vd) { >>> @@ -212,7 +214,7 @@ static int update_dt_node(u32 phandle, s32 scope) >>> return 0; >>> } >>> >>> -static int add_dt_node(u32 parent_phandle, 
u32 drc_index) >>> +static int add_dt_node(__be32 parent_phandle, __be32 drc_index) >>> { >>> struct device_node *dn; >>> struct device_node *parent_dn; >>> @@ -237,7 +239,7 @@ static int add_dt_node(u32 parent_phandle, u32 drc_index) >>> int pseries_devicetree_update(s32 scope) >>> { >>> char *rtas_buf; >>> - u32 *data; >>> + __be32 *data; >>> int update_nodes_token; >>> int rc; >>> >>> @@ -254,17 +256,17 @@ int pseries_devicetree_update(s32 scope) >>> if (rc && rc != 1) >>> break; >>> >>> - data = (u32 *)rtas_buf + 4; >>> - while (*data & NODE_ACTION_MASK) { >>> + data = (__be32 *)rtas_buf + 4; >>> + while (be32_to_cpu(*data) & NODE_ACTION_MASK) { >>> int i; >>> - u32 action = *data & NODE_ACTION_MASK; >>> - int node_count = *data & NODE_COUNT_MASK; >>> + u32 action = be32_to_cpu(*data) & NODE_ACTION_MASK; >>> + u32 node_count = be32_to_cpu(*data) & NODE_COUNT_MASK; >>> >>> data++; >>> >>> for (i = 0; i < node_count; i++) { >>> - u32 phandle = *data++; >>> - u32 drc_index; >>> + __be32 phandle = *data++; >>> + __be32 drc_index; >>> >>> switch (action) { >>> case DELETE_DT_NODE: >> >> The patch looks good, no nonsense endian fixing. >> Worth noting that it leaves existing bugs in place, which is fine, I'll >> rebase my patches which address endian and bugs on top of these so as to >> address the bugs. >> >> > > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev > ^ permalink raw reply [flat|nested] 18+ messages in thread
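The correction in this message has two halves: the same phandle is needed in big endian for the RTAS work area and in CPU endian for the device tree lookup, and 0xffffffff masks a missing conversion because swapping its bytes changes nothing. The following is a minimal userspace sketch of both points, under the assumption that a raw word is simply the four buffer bytes copied into a `uint32_t`. `be32_to_host`, `swap32`, `is_byte_order_neutral`, and `struct phandle_pair` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interpret the four bytes of a raw big-endian word as a host-order value. */
uint32_t be32_to_host(uint32_t raw_be)
{
	unsigned char b[4];

	memcpy(b, &raw_be, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* A value reads the same in both byte orders iff swapping leaves it unchanged. */
int is_byte_order_neutral(uint32_t v)
{
	return v == swap32(v);
}

/* Hypothetical pairing of the two forms one phandle is needed in. */
struct phandle_pair {
	uint32_t raw_be; /* as received; goes back into the RTAS work area */
	uint32_t cpu;    /* converted; used for the device tree lookup */
};

struct phandle_pair make_phandle_pair(uint32_t raw_be)
{
	struct phandle_pair p = { raw_be, be32_to_host(raw_be) };

	return p;
}

/* 0xffffffff hides a missing conversion; an ordinary value does not. */
void check_neutral_values(void)
{
	assert(is_byte_order_neutral(0xffffffffu));
	assert(!is_byte_order_neutral(0x00000102u));
}
```

Keeping both forms side by side avoids converting to CPU endian for the lookup and then back to big endian for the work area.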
* Re: [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update 2015-03-03 23:15 ` Tyrel Datwyler @ 2015-03-04 1:20 ` Cyril Bur 2015-03-04 1:41 ` Tyrel Datwyler 0 siblings, 1 reply; 18+ messages in thread From: Cyril Bur @ 2015-03-04 1:20 UTC (permalink / raw) To: Tyrel Datwyler; +Cc: nfont, linuxppc-dev On Tue, 2015-03-03 at 15:15 -0800, Tyrel Datwyler wrote: > On 03/02/2015 01:49 PM, Tyrel Datwyler wrote: > > On 03/01/2015 09:20 PM, Cyril Bur wrote: > >> On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: > >>> We currently use the device tree update code in the kernel after resuming > >>> from a suspend operation to re-sync the kernels view of the device tree with > >>> that of the hypervisor. The code as it stands is not endian safe as it relies > >>> on parsing buffers returned by RTAS calls that thusly contains data in big > >>> endian format. > >>> > >>> This patch annotates variables and structure members with __be types as well > >>> as performing necessary byte swaps to cpu endian for data that needs to be > >>> parsed. 
> >>> > >>> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> > >>> --- > >>> arch/powerpc/platforms/pseries/mobility.c | 36 ++++++++++++++++--------------- > >>> 1 file changed, 19 insertions(+), 17 deletions(-) > >>> > >>> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c > >>> index 29e4f04..0b1f70e 100644 > >>> --- a/arch/powerpc/platforms/pseries/mobility.c > >>> +++ b/arch/powerpc/platforms/pseries/mobility.c > >>> @@ -25,10 +25,10 @@ > >>> static struct kobject *mobility_kobj; > >>> > >>> struct update_props_workarea { > >>> - u32 phandle; > >>> - u32 state; > >>> - u64 reserved; > >>> - u32 nprops; > >>> + __be32 phandle; > >>> + __be32 state; > >>> + __be64 reserved; > >>> + __be32 nprops; > >>> } __packed; > >>> > >>> #define NODE_ACTION_MASK 0xff000000 > >>> @@ -127,7 +127,7 @@ static int update_dt_property(struct device_node *dn, struct property **prop, > >>> return 0; > >>> } > >>> > >>> -static int update_dt_node(u32 phandle, s32 scope) > >>> +static int update_dt_node(__be32 phandle, s32 scope) > >>> { > >> > >> On line 153 of this function: > >> dn = of_find_node_by_phandle(phandle); > >> > >> You're passing a __be32 to device tree code, if we can treat the phandle > >> as a opaque value returned to us from the rtas call and pass it around > >> like that then all good. > > After digging deeper the device_node->phandle is stored in cpu endian > under the covers. So, for the of_find_node_by_phandle() we do need to > convert the phandle to cpu endian first. It appears I got lucky with the > update fixing the observed RMC issue because the phandle for the root > node seems to always be 0xffffffff. 
>
I think we've both switched opinions here. Initially I thought an endian
conversion was necessary, but it turns out that all
of_find_node_by_phandle really does is:
	for_each_of_allnodes(np)
		if (np->phandle == handle)
			break;
	of_node_get(np);

The == is safe either way, and I think the of code might be trying to
imply that it doesn't matter by having a typedefed type 'phandle'.

I'm still digging around, we want to get this right!

Cyril
> -Tyrel
> >
> > Yes, of_find_node_by_phandle directly compares phandle passed in against
> > the handle stored in each device_node when searching for a matching
> > node. Since, the device tree is big endian it follows that the big
> > endian phandle received in the rtas buffer needs no conversion.
> >
> > Further, we need to pass the phandle to ibm,update-properties in the
> > work area which is also required to be big endian. So, again it seemed
> > that converting to cpu endian was a waste of effort just to convert it
> > back to big endian.
> >
> >> Its also hard to be sure if these need to be BE and have always been
> >> that way because we've always run BE so they've never actually wanted
> >> CPU endian its just that CPU endian has always been BE (I think I
> >> started rambling...)
> >>
> >> Just want to check that *not* converting them is done on purpose.
> >
> > Yes, I explicitly did not convert them on purpose. As mentioned above we
> > need phandle in BE for the ibm,update-properties rtas work area.
> > Similarly, drc_index needs to be in BE for the ibm,configure-connector
> > rtas work area. Outside, of that we do no other manipulation of those
> > values.
> >
> >>
> >> And having read on, I'm assuming the answer is yes since this
> >> observation is true for your changes which affect:
> >> delete_dt_node()
> >> update_dt_node()
> >> add_dt_node()
> >> Worth noting that you didn't change the definition of delete_dt_node()
> >
> > You are correct. Oversight. I will fix that as it should generate a
> > sparse complaint.
> > > > -Tyrel > > > >> > >> I'll have a look once you address the non compiling in patch 1/3 (I'm > >> getting blocked the unused var because somehow Werror is on, odd it > >> didn't trip you up) but I also suspect this will have sparse go a bit > >> nuts. > >> I wonder if there is a nice way of shutting sparse up. > >> > >>> struct update_props_workarea *upwa; > >>> struct device_node *dn; > >>> @@ -136,6 +136,7 @@ static int update_dt_node(u32 phandle, s32 scope) > >>> char *prop_data; > >>> char *rtas_buf; > >>> int update_properties_token; > >>> + u32 nprops; > >>> u32 vd; > >>> > >>> update_properties_token = rtas_token("ibm,update-properties"); > >>> @@ -162,6 +163,7 @@ static int update_dt_node(u32 phandle, s32 scope) > >>> break; > >>> > >>> prop_data = rtas_buf + sizeof(*upwa); > >>> + nprops = be32_to_cpu(upwa->nprops); > >>> > >>> /* On the first call to ibm,update-properties for a node the > >>> * the first property value descriptor contains an empty > >>> @@ -170,17 +172,17 @@ static int update_dt_node(u32 phandle, s32 scope) > >>> */ > >>> if (*prop_data == 0) { > >>> prop_data++; > >>> - vd = *(u32 *)prop_data; > >>> + vd = be32_to_cpu(*(__be32 *)prop_data); > >>> prop_data += vd + sizeof(vd); > >>> - upwa->nprops--; > >>> + nprops--; > >>> } > >>> > >>> - for (i = 0; i < upwa->nprops; i++) { > >>> + for (i = 0; i < nprops; i++) { > >>> char *prop_name; > >>> > >>> prop_name = prop_data; > >>> prop_data += strlen(prop_name) + 1; > >>> - vd = *(u32 *)prop_data; > >>> + vd = be32_to_cpu(*(__be32 *)prop_data); > >>> prop_data += sizeof(vd); > >>> > >>> switch (vd) { > >>> @@ -212,7 +214,7 @@ static int update_dt_node(u32 phandle, s32 scope) > >>> return 0; > >>> } > >>> > >>> -static int add_dt_node(u32 parent_phandle, u32 drc_index) > >>> +static int add_dt_node(__be32 parent_phandle, __be32 drc_index) > >>> { > >>> struct device_node *dn; > >>> struct device_node *parent_dn; > >>> @@ -237,7 +239,7 @@ static int add_dt_node(u32 parent_phandle, u32 
drc_index) > >>> int pseries_devicetree_update(s32 scope) > >>> { > >>> char *rtas_buf; > >>> - u32 *data; > >>> + __be32 *data; > >>> int update_nodes_token; > >>> int rc; > >>> > >>> @@ -254,17 +256,17 @@ int pseries_devicetree_update(s32 scope) > >>> if (rc && rc != 1) > >>> break; > >>> > >>> - data = (u32 *)rtas_buf + 4; > >>> - while (*data & NODE_ACTION_MASK) { > >>> + data = (__be32 *)rtas_buf + 4; > >>> + while (be32_to_cpu(*data) & NODE_ACTION_MASK) { > >>> int i; > >>> - u32 action = *data & NODE_ACTION_MASK; > >>> - int node_count = *data & NODE_COUNT_MASK; > >>> + u32 action = be32_to_cpu(*data) & NODE_ACTION_MASK; > >>> + u32 node_count = be32_to_cpu(*data) & NODE_COUNT_MASK; > >>> > >>> data++; > >>> > >>> for (i = 0; i < node_count; i++) { > >>> - u32 phandle = *data++; > >>> - u32 drc_index; > >>> + __be32 phandle = *data++; > >>> + __be32 drc_index; > >>> > >>> switch (action) { > >>> case DELETE_DT_NODE: > >> > >> The patch looks good, no nonsense endian fixing. > >> Worth noting that it leaves existing bugs in place, which is fine, I'll > >> rebase my patches which address endian and bugs on top of these so as to > >> address the bugs. > >> > >> > > > > _______________________________________________ > > Linuxppc-dev mailing list > > Linuxppc-dev@lists.ozlabs.org > > https://lists.ozlabs.org/listinfo/linuxppc-dev > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
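Whether the `==` in the lookup loop quoted in this message is safe depends on both operands being in the same byte order. Here is a toy userspace sketch of that lookup, assuming (as stated earlier in the thread) that the stored phandles are in CPU order. `struct toy_node`, `toy_find_by_phandle`, and `raw_be_as_seen_on_le` are hypothetical stand-ins, and the little-endian CPU is simulated with an explicit byte swap so the result does not depend on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a device tree node; phandle is held in CPU byte order. */
struct toy_node {
	uint32_t phandle;
};

/* Linear search using plain ==, in the style of the quoted loop. */
const struct toy_node *toy_find_by_phandle(const struct toy_node *nodes,
					   size_t n, uint32_t handle)
{
	for (size_t i = 0; i < n; i++)
		if (nodes[i].phandle == handle)
			return &nodes[i];
	return NULL;
}

/* What a little-endian CPU sees when it loads a big-endian word unconverted. */
uint32_t raw_be_as_seen_on_le(uint32_t value)
{
	return (value >> 24) | ((value >> 8) & 0x0000ff00u) |
	       ((value << 8) & 0x00ff0000u) | (value << 24);
}

/* The converted handle is found; the unconverted one is not. */
void check_lookup(void)
{
	const struct toy_node nodes[] = { { 0x00000102u }, { 0x30000001u } };

	assert(toy_find_by_phandle(nodes, 2, 0x00000102u) == &nodes[0]);
	assert(toy_find_by_phandle(nodes, 2,
				   raw_be_as_seen_on_le(0x00000102u)) == NULL);
}
```

The comparison itself is byte-order agnostic; the mismatch only shows up when one operand has been converted and the other has not.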
* Re: [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update 2015-03-04 1:20 ` Cyril Bur @ 2015-03-04 1:41 ` Tyrel Datwyler 0 siblings, 0 replies; 18+ messages in thread From: Tyrel Datwyler @ 2015-03-04 1:41 UTC (permalink / raw) To: Cyril Bur; +Cc: nfont, linuxppc-dev On 03/03/2015 05:20 PM, Cyril Bur wrote: > On Tue, 2015-03-03 at 15:15 -0800, Tyrel Datwyler wrote: >> On 03/02/2015 01:49 PM, Tyrel Datwyler wrote: >>> On 03/01/2015 09:20 PM, Cyril Bur wrote: >>>> On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >>>>> We currently use the device tree update code in the kernel after resuming >>>>> from a suspend operation to re-sync the kernels view of the device tree with >>>>> that of the hypervisor. The code as it stands is not endian safe as it relies >>>>> on parsing buffers returned by RTAS calls that thusly contains data in big >>>>> endian format. >>>>> >>>>> This patch annotates variables and structure members with __be types as well >>>>> as performing necessary byte swaps to cpu endian for data that needs to be >>>>> parsed. 
>>>>> >>>>> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> >>>>> --- >>>>> arch/powerpc/platforms/pseries/mobility.c | 36 ++++++++++++++++--------------- >>>>> 1 file changed, 19 insertions(+), 17 deletions(-) >>>>> >>>>> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c >>>>> index 29e4f04..0b1f70e 100644 >>>>> --- a/arch/powerpc/platforms/pseries/mobility.c >>>>> +++ b/arch/powerpc/platforms/pseries/mobility.c >>>>> @@ -25,10 +25,10 @@ >>>>> static struct kobject *mobility_kobj; >>>>> >>>>> struct update_props_workarea { >>>>> - u32 phandle; >>>>> - u32 state; >>>>> - u64 reserved; >>>>> - u32 nprops; >>>>> + __be32 phandle; >>>>> + __be32 state; >>>>> + __be64 reserved; >>>>> + __be32 nprops; >>>>> } __packed; >>>>> >>>>> #define NODE_ACTION_MASK 0xff000000 >>>>> @@ -127,7 +127,7 @@ static int update_dt_property(struct device_node *dn, struct property **prop, >>>>> return 0; >>>>> } >>>>> >>>>> -static int update_dt_node(u32 phandle, s32 scope) >>>>> +static int update_dt_node(__be32 phandle, s32 scope) >>>>> { >>>> >>>> On line 153 of this function: >>>> dn = of_find_node_by_phandle(phandle); >>>> >>>> You're passing a __be32 to device tree code, if we can treat the phandle >>>> as a opaque value returned to us from the rtas call and pass it around >>>> like that then all good. >> >> After digging deeper the device_node->phandle is stored in cpu endian >> under the covers. So, for the of_find_node_by_phandle() we do need to >> convert the phandle to cpu endian first. It appears I got lucky with the >> update fixing the observed RMC issue because the phandle for the root >> node seems to always be 0xffffffff. 
>>
> I think we've both switched opinions here, initially I thought an endian
> conversion was necessary but turns out that all of_find_node_by_phandle
> really does is:
> 	for_each_of_allnodes(np)
> 		if (np->phandle == handle)
> 			break;
> 	of_node_get(np);
>
> The == is safe either way and I think the of code might be trying to
> imply that it doesn't matter by having a typedefed type 'phandle'.
>
> I'm still digging around, we want to get this right!

When the device tree is unflattened, the phandle is byte swapped to cpu
endian. The following code is from unflatten_dt_node():

	if (strcmp(pname, "ibm,phandle") == 0)
		np->phandle = be32_to_cpup(p);

I added some debug output to of_find_node_by_phandle() and verified that
if the phandle isn't swapped to cpu endian, we fail to find a matching
node, except in the case where the phandle is equivalent in both big and
little endian.

-Tyrel

>
>
> Cyril
>> -Tyrel
>>
>>>
>>> Yes, of_find_node_by_phandle directly compares phandle passed in against
>>> the handle stored in each device_node when searching for a matching
>>> node. Since, the device tree is big endian it follows that the big
>>> endian phandle received in the rtas buffer needs no conversion.
>>>
>>> Further, we need to pass the phandle to ibm,update-properties in the
>>> work area which is also required to be big endian. So, again it seemed
>>> that converting to cpu endian was a waste of effort just to convert it
>>> back to big endian.
>>>
>>>> Its also hard to be sure if these need to be BE and have always been
>>>> that way because we've always run BE so they've never actually wanted
>>>> CPU endian its just that CPU endian has always been BE (I think I
>>>> started rambling...)
>>>>
>>>> Just want to check that *not* converting them is done on purpose.
>>>
>>> Yes, I explicitly did not convert them on purpose. As mentioned above we
>>> need phandle in BE for the ibm,update-properties rtas work area.
>>> Similarly, drc_index needs to be in BE for the ibm,configure-connector >>> rtas work area. Outside of that we do no other manipulation of those >>> values. >>> >>>> >>>> And having read on, I'm assuming the answer is yes since this >>>> observation is true for your changes which affect: >>>> delete_dt_node() >>>> update_dt_node() >>>> add_dt_node() >>>> Worth noting that you didn't change the definition of delete_dt_node() >>> >>> You are correct. Oversight. I will fix that as it should generate a >>> sparse complaint. >>> >>> -Tyrel >>> >>>> >>>> I'll have a look once you address the non-compiling in patch 1/3 (I'm >>>> getting blocked by the unused var because somehow Werror is on, odd it >>>> didn't trip you up) but I also suspect this will have sparse go a bit >>>> nuts. >>>> I wonder if there is a nice way of shutting sparse up. >>>> >>>>> struct update_props_workarea *upwa; >>>>> struct device_node *dn; >>>>> @@ -136,6 +136,7 @@ static int update_dt_node(u32 phandle, s32 scope) >>>>> char *prop_data; >>>>> char *rtas_buf; >>>>> int update_properties_token; >>>>> + u32 nprops; >>>>> u32 vd; >>>>> >>>>> update_properties_token = rtas_token("ibm,update-properties"); >>>>> @@ -162,6 +163,7 @@ static int update_dt_node(u32 phandle, s32 scope) >>>>> break; >>>>> >>>>> prop_data = rtas_buf + sizeof(*upwa); >>>>> + nprops = be32_to_cpu(upwa->nprops); >>>>> >>>>> /* On the first call to ibm,update-properties for a node the >>>>> * the first property value descriptor contains an empty >>>>> @@ -170,17 +172,17 @@ static int update_dt_node(u32 phandle, s32 scope) >>>>> */ >>>>> if (*prop_data == 0) { >>>>> prop_data++; >>>>> - vd = *(u32 *)prop_data; >>>>> + vd = be32_to_cpu(*(__be32 *)prop_data); >>>>> prop_data += vd + sizeof(vd); >>>>> - upwa->nprops--; >>>>> + nprops--; >>>>> } >>>>> >>>>> - for (i = 0; i < upwa->nprops; i++) { >>>>> + for (i = 0; i < nprops; i++) { >>>>> char *prop_name; >>>>> >>>>> prop_name = prop_data; >>>>> prop_data += strlen(prop_name) + 1;
>>>>> - vd = *(u32 *)prop_data; >>>>> + vd = be32_to_cpu(*(__be32 *)prop_data); >>>>> prop_data += sizeof(vd); >>>>> >>>>> switch (vd) { >>>>> @@ -212,7 +214,7 @@ static int update_dt_node(u32 phandle, s32 scope) >>>>> return 0; >>>>> } >>>>> >>>>> -static int add_dt_node(u32 parent_phandle, u32 drc_index) >>>>> +static int add_dt_node(__be32 parent_phandle, __be32 drc_index) >>>>> { >>>>> struct device_node *dn; >>>>> struct device_node *parent_dn; >>>>> @@ -237,7 +239,7 @@ static int add_dt_node(u32 parent_phandle, u32 drc_index) >>>>> int pseries_devicetree_update(s32 scope) >>>>> { >>>>> char *rtas_buf; >>>>> - u32 *data; >>>>> + __be32 *data; >>>>> int update_nodes_token; >>>>> int rc; >>>>> >>>>> @@ -254,17 +256,17 @@ int pseries_devicetree_update(s32 scope) >>>>> if (rc && rc != 1) >>>>> break; >>>>> >>>>> - data = (u32 *)rtas_buf + 4; >>>>> - while (*data & NODE_ACTION_MASK) { >>>>> + data = (__be32 *)rtas_buf + 4; >>>>> + while (be32_to_cpu(*data) & NODE_ACTION_MASK) { >>>>> int i; >>>>> - u32 action = *data & NODE_ACTION_MASK; >>>>> - int node_count = *data & NODE_COUNT_MASK; >>>>> + u32 action = be32_to_cpu(*data) & NODE_ACTION_MASK; >>>>> + u32 node_count = be32_to_cpu(*data) & NODE_COUNT_MASK; >>>>> >>>>> data++; >>>>> >>>>> for (i = 0; i < node_count; i++) { >>>>> - u32 phandle = *data++; >>>>> - u32 drc_index; >>>>> + __be32 phandle = *data++; >>>>> + __be32 drc_index; >>>>> >>>>> switch (action) { >>>>> case DELETE_DT_NODE: >>>> >>>> The patch looks good, no nonsense endian fixing. >>>> Worth noting that it leaves existing bugs in place, which is fine, I'll >>>> rebase my patches which address endian and bugs on top of these so as to >>>> address the bugs. >>>> >>>> >>> >>> _______________________________________________ >>> Linuxppc-dev mailing list >>> Linuxppc-dev@lists.ozlabs.org >>> https://lists.ozlabs.org/listinfo/linuxppc-dev >>> >> > > ^ permalink raw reply [flat|nested] 18+ messages in thread
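The phandle comparison discussed in this message can be reproduced outside the kernel. What follows is a minimal userspace sketch, not kernel code: load_be32(), load_le32() and phandle_matches() are hypothetical helpers that stand in for be32_to_cpup(), for an unconverted load on a little-endian CPU, and for the np->phandle == handle test. It illustrates why an unconverted big-endian phandle only matches when its bytes read the same in both orders, as with 0xffffffff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit big-endian value from raw bytes. Portable userspace
 * stand-in for the kernel's be32_to_cpup(); the name is hypothetical. */
static uint32_t load_be32(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* What a little-endian CPU would see if it read the same four bytes
 * without any byte swap (hypothetical helper for illustration). */
static uint32_t load_le32(const unsigned char *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Stand-in for the "np->phandle == handle" test: the stored value is
 * cpu endian, so the handle passed in must be cpu endian as well. */
static bool phandle_matches(uint32_t stored_cpu_endian, uint32_t handle)
{
	return stored_cpu_endian == handle;
}
```

Feeding the bytes 00 00 12 34 through both loaders gives two different values, so the comparison fails without the swap; the bytes ff ff ff ff give the same value either way, which is consistent with the root node lookup appearing to work.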
* [PATCH 3/3] powerpc/pseries: Expose post-migration in kernel device tree update to drmgr 2015-02-28 2:24 [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code Tyrel Datwyler 2015-02-28 2:24 ` [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration Tyrel Datwyler 2015-02-28 2:24 ` [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update Tyrel Datwyler @ 2015-02-28 2:24 ` Tyrel Datwyler 2015-03-03 6:24 ` Michael Ellerman 2015-03-03 6:10 ` [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code Michael Ellerman 3 siblings, 1 reply; 18+ messages in thread From: Tyrel Datwyler @ 2015-02-28 2:24 UTC (permalink / raw) To: linuxppc-dev; +Cc: Tyrel Datwyler, cyrilbur, nfont Traditionally after a migration operation drmgr has coordinated the device tree update with the kernel in userspace via the ugly /proc/ppc64/ofdt interface. This can be better done fully in the kernel where support already exists. Currently, drmgr makes a faux ibm,suspend-me RTAS call which we intercept in the kernel so that we can check VASI state for suspendability. After the LPAR resumes and returns to drmgr that is followed by the necessary update-nodes and update-properties RTAS calls which are parsed and communicated back to the kernel through /proc/ppc64/ofdt for the device tree update. The drmgr tool should instead initiate the migration using the already existing /sysfs/kernel/mobility/migration entry that performs all this work in the kernel. This patch adds a show function to the sysfs "migration" attribute that returns 1 to indicate the kernel will perform the device tree update after a migration operation and that drmgr should initiate the migration through the sysfs "migration" attribute.
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> --- arch/powerpc/platforms/pseries/mobility.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c index 0b1f70e..a689f74 100644 --- a/arch/powerpc/platforms/pseries/mobility.c +++ b/arch/powerpc/platforms/pseries/mobility.c @@ -40,6 +40,9 @@ struct update_props_workarea { #define MIGRATION_SCOPE (1) +#define USER_DT_UPDATE 0 +#define KERN_DT_UPDATE 1 + static int mobility_rtas_call(int token, char *buf, s32 scope) { int rc; @@ -339,7 +342,13 @@ static ssize_t migrate_store(struct class *class, struct class_attribute *attr, return count; } -static CLASS_ATTR(migration, S_IWUSR, NULL, migrate_store); +static ssize_t migrate_show(struct class *class, struct class_attribute *attr, + char *buf) +{ + return sprintf(buf, "%d\n", KERN_DT_UPDATE); +} + +static CLASS_ATTR(migration, S_IWUSR | S_IRUGO, migrate_show, migrate_store); static int __init mobility_sysfs_init(void) { -- 1.7.12.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 3/3] powerpc/pseries: Expose post-migration in kernel device tree update to drmgr 2015-02-28 2:24 ` [PATCH 3/3] powerpc/pseries: Expose post-migration in kernel device tree update to drmgr Tyrel Datwyler @ 2015-03-03 6:24 ` Michael Ellerman 2015-03-03 21:18 ` Tyrel Datwyler 0 siblings, 1 reply; 18+ messages in thread From: Michael Ellerman @ 2015-03-03 6:24 UTC (permalink / raw) To: Tyrel Datwyler; +Cc: linuxppc-dev, cyrilbur, nfont On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: > Traditionally after a migration operation drmgr has coordinated the device tree > update with the kernel in userspace via the ugly /proc/ppc64/ofdt interface. This > can be better done fully in the kernel where support already exists. Currently, > drmgr makes a faux ibm,suspend-me RTAS call which we intercept in the kernel so > that we can check VASI state for suspendability. After the LPAR resumes and > returns to drmgr that is followed by the necessary update-nodes and > update-properties RTAS calls which are parsed and communicated back to the kernel > through /proc/ppc64/ofdt for the device tree update. The drmgr tool should > instead initiate the migration using the already existing > /sysfs/kernel/mobility/migration entry that performs all this work in the kernel. > > This patch adds a show function to the sysfs "migration" attribute that returns > 1 to indicate the kernel will perform the device tree update after a migration > operation and that drmgr should initiate the migration through the sysfs > "migration" attribute. I don't understand why we need this? Can't drmgr just check if /sysfs/kernel/mobility/migration exists, and if so it knows it should use it and that the kernel will handle the whole procedure? cheers ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/3] powerpc/pseries: Expose post-migration in kernel device tree update to drmgr 2015-03-03 6:24 ` Michael Ellerman @ 2015-03-03 21:18 ` Tyrel Datwyler 0 siblings, 0 replies; 18+ messages in thread From: Tyrel Datwyler @ 2015-03-03 21:18 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, cyrilbur, nfont On 03/02/2015 10:24 PM, Michael Ellerman wrote: > On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >> Traditionally after a migration operation drmgr has coordinated the device tree >> update with the kernel in userspace via the ugly /proc/ppc64/ofdt interface. This >> can be better done fully in the kernel where support already exists. Currently, >> drmgr makes a faux ibm,suspend-me RTAS call which we intercept in the kernel so >> that we can check VASI state for suspendability. After the LPAR resumes and >> returns to drmgr that is followed by the necessary update-nodes and >> update-properties RTAS calls which are parsed and communicated back to the kernel >> through /proc/ppc64/ofdt for the device tree update. The drmgr tool should >> instead initiate the migration using the already existing >> /sysfs/kernel/mobility/migration entry that performs all this work in the kernel. >> >> This patch adds a show function to the sysfs "migration" attribute that returns >> 1 to indicate the kernel will perform the device tree update after a migration >> operation and that drmgr should initiate the migration through the sysfs >> "migration" attribute. > > I don't understand why we need this? > > Can't drmgr just check if /sysfs/kernel/mobility/migration exists, and if so it > knows it should use it and that the kernel will handle the whole procedure? The problem is that this sysfs entry was originally added with the remainder of the in-kernel device tree update code in 2.6.37, but drmgr was never modified to use it. By the time I started looking at the in-kernel device tree code I found it very broken.
I had a bunch of fixes to get it working that went into 3.12. So, if somebody were to use a newer version of drmgr that simply checks for the existence of the migration sysfs entry on a pre-3.12 kernel their device-tree update experience is going to be sub-par. The approach taken here is identical to what was done in 9da3489 when we hooked the device tree update code into the suspend code. However, in that case we were already using the sysfs entry to trigger the suspend and legitimately needed a way to tell drmgr the kernel was now taking care of updating the device tree. Here we are really just trying to inform drmgr that it is running on a new enough kernel that the kernel device tree code actually works properly. Now, I don't really care for this approach, but the only other thought I had was to change the sysfs entry from "migration" to "migrate". -Tyrel > > cheers > > ^ permalink raw reply [flat|nested] 18+ messages in thread
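A hedged sketch of how a userspace tool might apply the check described above: rather than testing only that the attribute exists, it reads the value and falls back to the userspace device tree update unless it gets 1. dt_update_mode() and its path argument are hypothetical and this is not drmgr's actual code; the USER_DT_UPDATE and KERN_DT_UPDATE values are the ones defined in the patch.

```c
#include <assert.h>
#include <stdio.h>

#define USER_DT_UPDATE 0	/* values as defined in the patch */
#define KERN_DT_UPDATE 1

/* Hypothetical probe: report KERN_DT_UPDATE only if the attribute at
 * attr_path can be opened, read, and parses as the integer 1. A missing
 * file, a failed read (for example an attribute with no show function),
 * or any other value is treated as "update in userspace". */
static int dt_update_mode(const char *attr_path)
{
	FILE *f;
	int value = USER_DT_UPDATE;

	assert(attr_path != NULL);

	f = fopen(attr_path, "r");
	if (!f)
		return USER_DT_UPDATE;

	if (fscanf(f, "%d", &value) != 1)
		value = USER_DT_UPDATE;
	fclose(f);

	return value == KERN_DT_UPDATE ? KERN_DT_UPDATE : USER_DT_UPDATE;
}
```

A caller would pass the path of the "migration" attribute and write to that attribute to start the migration only when KERN_DT_UPDATE comes back. Defaulting to USER_DT_UPDATE on any failure keeps the existing behaviour on kernels where the attribute is absent or cannot be read.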
* Re: [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code 2015-02-28 2:24 [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code Tyrel Datwyler ` (2 preceding siblings ...) 2015-02-28 2:24 ` [PATCH 3/3] powerpc/pseries: Expose post-migration in kernel device tree update to drmgr Tyrel Datwyler @ 2015-03-03 6:10 ` Michael Ellerman 2015-03-03 20:37 ` Tyrel Datwyler 3 siblings, 1 reply; 18+ messages in thread From: Michael Ellerman @ 2015-03-03 6:10 UTC (permalink / raw) To: Tyrel Datwyler; +Cc: linuxppc-dev, cyrilbur, nfont On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: > This patchset simplifies the usage of rtas_ibm_suspend_me() by removing an > extraneous function parameter, fixes device tree updating on little endian > platforms, and adds a mechanism for informing drmgr that the kernel is capable of > performing the whole migration including device tree update itself. > > Tyrel Datwyler (3): > powerpc/pseries: Simplify check for suspendability during > suspend/migration > powerpc/pseries: Little endian fixes for post mobility device tree > update > powerpc/pseries: Expose post-migration in kernel device tree update > to drmgr Hi Tyrel, Firstly let me say how much I hate this code, so thanks for working on it :) But I need you to split this series, into 1) fixes for 4.0 (and stable?), and 2) the rest. I *think* that would be patch 2, and then patches 1 & 3, but I don't want to guess. So please resend. cheers ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code 2015-03-03 6:10 ` [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code Michael Ellerman @ 2015-03-03 20:37 ` Tyrel Datwyler 0 siblings, 0 replies; 18+ messages in thread From: Tyrel Datwyler @ 2015-03-03 20:37 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, cyrilbur, nfont On 03/02/2015 10:10 PM, Michael Ellerman wrote: > On Fri, 2015-02-27 at 18:24 -0800, Tyrel Datwyler wrote: >> This patchset simplifies the usage of rtas_ibm_suspend_me() by removing an >> extraneous function parameter, fixes device tree updating on little endian >> platforms, and adds a mechanism for informing drmgr that the kernel is capable of >> performing the whole migration including device tree update itself. >> >> Tyrel Datwyler (3): >> powerpc/pseries: Simplify check for suspendability during >> suspend/migration >> powerpc/pseries: Little endian fixes for post mobility device tree >> update >> powerpc/pseries: Expose post-migration in kernel device tree update >> to drmgr > > Hi Tyrel, > > Firstly let me say how much I hate this code, so thanks for working on it :) I did it once. Might as well sacrifice my sanity a second time. :) > > But I need you to split this series, into 1) fixes for 4.0 (and stable?), and > 2) the rest. > > I *think* that would be patch 2, and then patches 1 & 3, but I don't want to > guess. So please resend. Sure. Your split seems correct as patch 2 is fixes while 1 and 3 are cosmetic/new feature. Seeing as patch 2 is endian fixes I'll Cc -stable as well. -Tyrel > > cheers > > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
Thread overview: 18+ messages (newest: 2015-03-04 15:58 UTC)
2015-02-28 2:24 [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code Tyrel Datwyler
2015-02-28 2:24 ` [PATCH 1/3] powerpc/pseries: Simplify check for suspendability during suspend/migration Tyrel Datwyler
2015-03-02 4:19 ` Cyril Bur
2015-03-02 21:30 ` Tyrel Datwyler
2015-03-03 6:15 ` Michael Ellerman
2015-03-03 20:16 ` Tyrel Datwyler
2015-03-04 15:58 ` Nathan Fontenot
2015-02-28 2:24 ` [PATCH 2/3] powerpc/pseries: Little endian fixes for post mobility device tree update Tyrel Datwyler
2015-03-02 5:20 ` Cyril Bur
2015-03-02 21:49 ` Tyrel Datwyler
2015-03-03 23:15 ` Tyrel Datwyler
2015-03-04 1:20 ` Cyril Bur
2015-03-04 1:41 ` Tyrel Datwyler
2015-02-28 2:24 ` [PATCH 3/3] powerpc/pseries: Expose post-migration in kernel device tree update to drmgr Tyrel Datwyler
2015-03-03 6:24 ` Michael Ellerman
2015-03-03 21:18 ` Tyrel Datwyler
2015-03-03 6:10 ` [PATCH 0/3] powerpc/pseries: Fixes and cleanup of suspend/migration code Michael Ellerman
2015-03-03 20:37 ` Tyrel Datwyler