All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vaibhav Jain <vaibhav@linux.ibm.com>
To: Michael Ellerman <michaele@au1.ibm.com>, Ira Weiny <ira.weiny@intel.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-nvdimm@lists.01.org
Subject: Re: [RESEND PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Date: Thu, 21 May 2020 22:29:38 +0530	[thread overview]
Message-ID: <87imgpw7et.fsf@linux.ibm.com> (raw)
In-Reply-To: <87k115gy0i.fsf@mpe.ellerman.id.au>

Michael Ellerman <michaele@au1.ibm.com> writes:

> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> Thanks for reviewing this this patch Ira. My responses below:
>> Ira Weiny <ira.weiny@intel.com> writes:
>>> On Wed, May 20, 2020 at 12:30:56AM +0530, Vaibhav Jain wrote:
>>>> Implement support for fetching nvdimm health information via
>>>> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
>>>> of 64-bit big-endian integers, bitwise-and of which is then stored in
>>>> 'struct papr_scm_priv' and subsequently partially exposed to
>>>> user-space via newly introduced dimm specific attribute
>>>> 'papr/flags'. Since the hcall is costly, the health information is
>>>> cached and only re-queried, 60s after the previous successful hcall.
> ...
>>>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>>>> index f35592423380..142636e1a59f 100644
>>>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>>>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>>>> @@ -39,6 +78,15 @@ struct papr_scm_priv {
>>>>  	struct resource res;
>>>>  	struct nd_region *region;
>>>>  	struct nd_interleave_set nd_set;
>>>> +
>>>> +	/* Protect dimm health data from concurrent read/writes */
>>>> +	struct mutex health_mutex;
>>>> +
>>>> +	/* Last time the health information of the dimm was updated */
>>>> +	unsigned long lasthealth_jiffies;
>>>> +
>>>> +	/* Health information for the dimm */
>>>> +	u64 health_bitmap;
>>>
>>> I wonder if this should be typed big endian as you mention that it is in the
>>> commit message?
>> This was discussed in an earlier review of the patch series at
>> https://lore.kernel.org/linux-nvdimm/878sjetcis.fsf@mpe.ellerman.id.au
>>
>> Even though health bitmap is returned in big endian format (For ex
>> value 0xC00000000000000 indicates bits 0,1 set), its value is never
>> used. Instead only test for specific bits being set in the register is
>> done.
>
> This has already caused a lot of confusion, so let me try and clear it
> up. I will probably fail :)
>
> The value is not big endian.
>
> It's returned in a GPR (a register), from the hypervisor. The ordering
> of bytes in a register is not dependent on what endian we're executing
> in.
>
> It's true that the hypervisor will have been running big endian, and
> when it returns to us we will now be running little endian. But the
> value is unchanged, it was 0xC00000000000000 in the GPR while the HV was
> running and it's still 0xC00000000000000 when we return to Linux. You
> can see this in mambo, see below for an example.
>
>
> _However_, the specification of the bits in the bitmap value uses MSB 0
> ordering, as is traditional for IBM documentation. That means the most
> significant bit, aka. the left most bit, is called "bit 0".
>
> See: https://en.wikipedia.org/wiki/Bit_numbering#MSB_0_bit_numbering
>
> That is the opposite numbering from what most people use, and in
> particular what most code in Linux uses, which is that bit 0 is the
> least significant bit.
>
> Which is where the confusion comes in. It's not that the bytes are
> returned in a different order, it's that the bits are numbered
> differently in the IBM documentation.
>
> The way to fix this kind of thing is to read the docs, and convert all
> the bits into correct numbering (LSB=0), and then throw away the docs ;)
>
> cheers
Thanks a lot for clarifying this Mpe and for this detailed explaination.

I have removed the term Big-Endian from v8 patch description to avoid
any further confusion.

>
>
>
> In mambo we can set a breakpoint just before the kernel enters skiboot,
> towards the end of __opal_call. The kernel is running LE and skiboot
> runs BE.
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] b 0xc0000000000c1744
>   breakpoint set at [0:0:0]: 0xc0000000000c1744 (0x00000000000C1744) Enc:0x2402004C : hrfid
>
> Then run:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] c
>   [0:0:0]: 0xC0000000000C1744 (0x00000000000C1744) Enc:0x2402004C : hrfid
>   INFO: 121671618: (121671618): ** Execution stopped: user (tcl),  **
>   121671618: ** finished running 121671618 instructions **
>
> And we stop there, on an hrfid that we haven't executed yet.
> We can print r0, to see the OPAL token:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p r0
>   0x0000000000000019
>
> ie. we're calling OPAL_CONSOLE_WRITE_BUFFER_SPACE (25).
>
> And we can print the MSR:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
>   0x9000000002001033
>   
>                      64-bit mode (SF): 0x1 [64-bit mode]
>                 Hypervisor State (HV): 0x1
>                Vector Available (VEC): 0x1
>   Machine Check Interrupt Enable (ME): 0x1
>             Instruction Relocate (IR): 0x1
>                    Data Relocate (DR): 0x1
>            Recoverable Interrupt (RI): 0x1
>               Little-Endian Mode (LE): 0x1 [little-endian]
>
> ie. we're little endian.
>
> We then step one instruction:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] s
>   [0:0:0]: 0x0000000030002BF0 (0x0000000030002BF0) Enc:0x7D9FFAA6 : mfspr   r12,PIR
>
> Now we're in skiboot. Print the MSR again:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
>   0x9000000002001002
>   
>                      64-bit mode (SF): 0x1 [64-bit mode]
>                 Hypervisor State (HV): 0x1
>                Vector Available (VEC): 0x1
>   Machine Check Interrupt Enable (ME): 0x1
>            Recoverable Interrupt (RI): 0x1
>
> We're big endian.
> Print r0:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p r0
>   0x0000000000000019
>
> r0 is unchanged!
Got it. Thanks again.

-- 
Cheers
~ Vaibhav
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

WARNING: multiple messages have this Message-ID (diff)
From: Vaibhav Jain <vaibhav@linux.ibm.com>
To: Michael Ellerman <michaele@au1.ibm.com>, Ira Weiny <ira.weiny@intel.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-nvdimm@lists.01.org
Subject: Re: [RESEND PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Date: Thu, 21 May 2020 22:29:38 +0530	[thread overview]
Message-ID: <87imgpw7et.fsf@linux.ibm.com> (raw)
In-Reply-To: <87k115gy0i.fsf@mpe.ellerman.id.au>

Michael Ellerman <michaele@au1.ibm.com> writes:

> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> Thanks for reviewing this this patch Ira. My responses below:
>> Ira Weiny <ira.weiny@intel.com> writes:
>>> On Wed, May 20, 2020 at 12:30:56AM +0530, Vaibhav Jain wrote:
>>>> Implement support for fetching nvdimm health information via
>>>> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
>>>> of 64-bit big-endian integers, bitwise-and of which is then stored in
>>>> 'struct papr_scm_priv' and subsequently partially exposed to
>>>> user-space via newly introduced dimm specific attribute
>>>> 'papr/flags'. Since the hcall is costly, the health information is
>>>> cached and only re-queried, 60s after the previous successful hcall.
> ...
>>>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>>>> index f35592423380..142636e1a59f 100644
>>>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>>>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>>>> @@ -39,6 +78,15 @@ struct papr_scm_priv {
>>>>  	struct resource res;
>>>>  	struct nd_region *region;
>>>>  	struct nd_interleave_set nd_set;
>>>> +
>>>> +	/* Protect dimm health data from concurrent read/writes */
>>>> +	struct mutex health_mutex;
>>>> +
>>>> +	/* Last time the health information of the dimm was updated */
>>>> +	unsigned long lasthealth_jiffies;
>>>> +
>>>> +	/* Health information for the dimm */
>>>> +	u64 health_bitmap;
>>>
>>> I wonder if this should be typed big endian as you mention that it is in the
>>> commit message?
>> This was discussed in an earlier review of the patch series at
>> https://lore.kernel.org/linux-nvdimm/878sjetcis.fsf@mpe.ellerman.id.au
>>
>> Even though health bitmap is returned in big endian format (For ex
>> value 0xC00000000000000 indicates bits 0,1 set), its value is never
>> used. Instead only test for specific bits being set in the register is
>> done.
>
> This has already caused a lot of confusion, so let me try and clear it
> up. I will probably fail :)
>
> The value is not big endian.
>
> It's returned in a GPR (a register), from the hypervisor. The ordering
> of bytes in a register is not dependent on what endian we're executing
> in.
>
> It's true that the hypervisor will have been running big endian, and
> when it returns to us we will now be running little endian. But the
> value is unchanged, it was 0xC00000000000000 in the GPR while the HV was
> running and it's still 0xC00000000000000 when we return to Linux. You
> can see this in mambo, see below for an example.
>
>
> _However_, the specification of the bits in the bitmap value uses MSB 0
> ordering, as is traditional for IBM documentation. That means the most
> significant bit, aka. the left most bit, is called "bit 0".
>
> See: https://en.wikipedia.org/wiki/Bit_numbering#MSB_0_bit_numbering
>
> That is the opposite numbering from what most people use, and in
> particular what most code in Linux uses, which is that bit 0 is the
> least significant bit.
>
> Which is where the confusion comes in. It's not that the bytes are
> returned in a different order, it's that the bits are numbered
> differently in the IBM documentation.
>
> The way to fix this kind of thing is to read the docs, and convert all
> the bits into correct numbering (LSB=0), and then throw away the docs ;)
>
> cheers
Thanks a lot for clarifying this Mpe and for this detailed explaination.

I have removed the term Big-Endian from v8 patch description to avoid
any further confusion.

>
>
>
> In mambo we can set a breakpoint just before the kernel enters skiboot,
> towards the end of __opal_call. The kernel is running LE and skiboot
> runs BE.
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] b 0xc0000000000c1744
>   breakpoint set at [0:0:0]: 0xc0000000000c1744 (0x00000000000C1744) Enc:0x2402004C : hrfid
>
> Then run:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] c
>   [0:0:0]: 0xC0000000000C1744 (0x00000000000C1744) Enc:0x2402004C : hrfid
>   INFO: 121671618: (121671618): ** Execution stopped: user (tcl),  **
>   121671618: ** finished running 121671618 instructions **
>
> And we stop there, on an hrfid that we haven't executed yet.
> We can print r0, to see the OPAL token:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p r0
>   0x0000000000000019
>
> ie. we're calling OPAL_CONSOLE_WRITE_BUFFER_SPACE (25).
>
> And we can print the MSR:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
>   0x9000000002001033
>   
>                      64-bit mode (SF): 0x1 [64-bit mode]
>                 Hypervisor State (HV): 0x1
>                Vector Available (VEC): 0x1
>   Machine Check Interrupt Enable (ME): 0x1
>             Instruction Relocate (IR): 0x1
>                    Data Relocate (DR): 0x1
>            Recoverable Interrupt (RI): 0x1
>               Little-Endian Mode (LE): 0x1 [little-endian]
>
> ie. we're little endian.
>
> We then step one instruction:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] s
>   [0:0:0]: 0x0000000030002BF0 (0x0000000030002BF0) Enc:0x7D9FFAA6 : mfspr   r12,PIR
>
> Now we're in skiboot. Print the MSR again:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
>   0x9000000002001002
>   
>                      64-bit mode (SF): 0x1 [64-bit mode]
>                 Hypervisor State (HV): 0x1
>                Vector Available (VEC): 0x1
>   Machine Check Interrupt Enable (ME): 0x1
>            Recoverable Interrupt (RI): 0x1
>
> We're big endian.
> Print r0:
>
>   systemsim-p9 [~/skiboot/skiboot/external/mambo] p r0
>   0x0000000000000019
>
> r0 is unchanged!
Got it. Thanks again.

-- 
Cheers
~ Vaibhav

  reply	other threads:[~2020-05-21 17:00 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-19 19:00 [RESEND PATCH v7 0/5] powerpc/papr_scm: Add support for reporting nvdimm health Vaibhav Jain
2020-05-19 19:00 ` Vaibhav Jain
2020-05-19 19:00 ` Vaibhav Jain
2020-05-19 19:00 ` [RESEND PATCH v7 1/5] powerpc: Document details on H_SCM_HEALTH hcall Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-19 19:00 ` [RESEND PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-20 17:01   ` Christoph Hellwig
2020-05-20 17:01     ` Christoph Hellwig
2020-05-20 17:01     ` Christoph Hellwig
2020-05-19 19:00 ` [RESEND PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-20 14:54   ` Ira Weiny
2020-05-20 14:54     ` Ira Weiny
2020-05-20 14:54     ` Ira Weiny
2020-05-20 17:15     ` Vaibhav Jain
2020-05-20 17:15       ` Vaibhav Jain
2020-05-20 17:15       ` Vaibhav Jain
2020-05-21 14:31       ` Michael Ellerman
2020-05-21 14:31         ` Michael Ellerman
2020-05-21 14:31         ` Michael Ellerman
2020-05-21 16:59         ` Vaibhav Jain [this message]
2020-05-21 16:59           ` Vaibhav Jain
2020-05-21 23:32       ` Ira Weiny
2020-05-21 23:32         ` Ira Weiny
2020-05-21 23:32         ` Ira Weiny
2020-05-19 19:00 ` [RESEND PATCH v7 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-19 19:00   ` [RESEND PATCH v7 4/5] ndctl/papr_scm, uapi: " Vaibhav Jain
2020-05-20  7:09   ` [RESEND PATCH v7 4/5] ndctl/papr_scm,uapi: " Aneesh Kumar K.V
2020-05-20  7:09     ` Aneesh Kumar K.V
2020-05-20  7:09     ` Aneesh Kumar K.V
2020-05-20 10:19     ` Vaibhav Jain
2020-05-20 10:19       ` Vaibhav Jain
2020-05-20 10:19       ` [RESEND PATCH v7 4/5] ndctl/papr_scm, uapi: " Vaibhav Jain
2020-05-20 15:32   ` [RESEND PATCH v7 4/5] ndctl/papr_scm,uapi: " Ira Weiny
2020-05-20 15:32     ` Ira Weiny
2020-05-20 15:32     ` Ira Weiny
2020-05-20 18:37     ` Vaibhav Jain
2020-05-20 18:37       ` Vaibhav Jain
2020-05-20 18:37       ` [RESEND PATCH v7 4/5] ndctl/papr_scm, uapi: " Vaibhav Jain
2020-05-21  6:48     ` [RESEND PATCH v7 4/5] ndctl/papr_scm,uapi: " Michael Ellerman
2020-05-21  6:48       ` Michael Ellerman
2020-05-21  6:48       ` [RESEND PATCH v7 4/5] ndctl/papr_scm, uapi: " Michael Ellerman
2020-05-22  7:38       ` Vaibhav Jain
2020-05-22  7:38         ` Vaibhav Jain
2020-05-25 12:00         ` Vaibhav Jain
2020-05-25 12:00           ` Vaibhav Jain
2020-05-26 12:14           ` Michael Ellerman
2020-05-26 12:14             ` Michael Ellerman
2020-05-19 19:00 ` [RESEND PATCH v7 5/5] powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain
2020-05-19 19:00   ` Vaibhav Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87imgpw7et.fsf@linux.ibm.com \
    --to=vaibhav@linux.ibm.com \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=michaele@au1.ibm.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.