* ibmvtpm byteswapping inconsistency
@ 2017-01-26 20:22 Michal Suchánek
2017-01-26 22:05 ` Jason Gunthorpe
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Michal Suchánek @ 2017-01-26 20:22 UTC (permalink / raw)
To: Ashley Lai, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen,
Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel
Hello,
building ibmvtpm I noticed gcc warning complaining that second word of
struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized.
The structure is defined as
struct ibmvtpm_crq {
u8 valid;
u8 msg;
__be16 len;
__be32 data;
__be64 reserved;
} __attribute__((packed, aligned(8)));
initialized as
struct ibmvtpm_crq crq;
u64 *buf = (u64 *) &crq;
...
crq.valid = (u8)IBMVTPM_VALID_CMD;
crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND;
and submitted with
rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]),
cpu_to_be64(buf[1]));
which means that the second word indeed contains purely garbage.
This is repeated a few times in the driver so I added memset to quiet
gcc and make behavior deterministic in case the unused fields get some
meaning in the future.
However, in tpm_ibmvtpm_send the structure is initialized as
struct ibmvtpm_crq crq;
__be64 *word = (__be64 *)&crq;
...
crq.valid = (u8)IBMVTPM_VALID_CMD;
crq.msg = (u8)VTPM_TPM_COMMAND;
crq.len = cpu_to_be16(count);
crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle);
and submitted with
rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]),
be64_to_cpu(word[1]));
meaning it is swapped twice.
Where is the interface defined? Are the command arguments passed as BE
subfields (the second case was correct before adding the extra whole
word swap) or BE words (the first case doing whole word swap is
correct)?
Thanks
Michal
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: ibmvtpm byteswapping inconsistency 2017-01-26 20:22 ibmvtpm byteswapping inconsistency Michal Suchánek @ 2017-01-26 22:05 ` Jason Gunthorpe 2017-01-26 22:43 ` Michal Suchanek 2017-01-26 22:58 ` Ashley Lai 2017-01-27 1:42 ` Tyrel Datwyler 2017-01-27 11:18 ` David Laight 2 siblings, 2 replies; 22+ messages in thread From: Jason Gunthorpe @ 2017-01-26 22:05 UTC (permalink / raw) To: Michal Such??nek, Nayna Jain Cc: Ashley Lai, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, tpmdd-devel, linuxppc-dev, linux-kernel On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Such??nek wrote: > This is repeated a few times in the driver so I added memset to quiet > gcc and make behavior deterministic in case the unused fields get some > meaning in the future. Yep, reserved certainly needs to be zeroed.. Can you send a patch? memset is overkill... > However, in tpm_ibmvtpm_send the structure is initialized as > > struct ibmvtpm_crq crq; > __be64 *word = (__be64 *)&crq; > ... > crq.valid = (u8)IBMVTPM_VALID_CMD; > crq.msg = (u8)VTPM_TPM_COMMAND; > crq.len = cpu_to_be16(count); > crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); > > and submitted with > > rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), > be64_to_cpu(word[1])); > meaning it is swapped twice. No idea, Nayna may know. My guess is that '__be64 *word' should be 'u64 *word'... Jason ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-26 22:05 ` Jason Gunthorpe @ 2017-01-26 22:43 ` Michal Suchanek 2017-01-26 22:58 ` Ashley Lai 1 sibling, 0 replies; 22+ messages in thread From: Michal Suchanek @ 2017-01-26 22:43 UTC (permalink / raw) To: Jason Gunthorpe Cc: Michal Such??nek, Nayna Jain, Ashley Lai, Marcel Selhorst, Linux Kernel Mailing List, Jarkko Sakkinen, tpmdd-devel, Paul Mackerras, Peter Huewe, linuxppc-dev On 26 January 2017 at 23:05, Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote: > On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Such??nek wrote: > >> This is repeated a few times in the driver so I added memset to quiet >> gcc and make behavior deterministic in case the unused fields get some >> meaning in the future. > > Yep, reserved certainly needs to be zeroed.. Can you send a patch? And len and data as well.. > memset is overkill... Does not look so. Michal ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-26 22:05 ` Jason Gunthorpe 2017-01-26 22:43 ` Michal Suchanek @ 2017-01-26 22:58 ` Ashley Lai 2017-02-02 4:40 ` Vicky 1 sibling, 1 reply; 22+ messages in thread From: Ashley Lai @ 2017-01-26 22:58 UTC (permalink / raw) To: Jason Gunthorpe, Michal Such??nek, Nayna Jain, honclo Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, tpmdd-devel, linuxppc-dev, linux-kernel Adding Vicky from IBM. On 01/26/2017 04:05 PM, Jason Gunthorpe wrote: > On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Such??nek wrote: > >> This is repeated a few times in the driver so I added memset to quiet >> gcc and make behavior deterministic in case the unused fields get some >> meaning in the future. > Yep, reserved certainly needs to be zeroed.. Can you send a patch? > memset is overkill... > >> However, in tpm_ibmvtpm_send the structure is initialized as >> >> struct ibmvtpm_crq crq; >> __be64 *word = (__be64 *)&crq; >> ... >> crq.valid = (u8)IBMVTPM_VALID_CMD; >> crq.msg = (u8)VTPM_TPM_COMMAND; >> crq.len = cpu_to_be16(count); >> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); >> >> and submitted with >> >> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), >> be64_to_cpu(word[1])); >> meaning it is swapped twice. > No idea, Nayna may know. > > My guess is that '__be64 *word' should be 'u64 *word'... > > Jason ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-26 22:58 ` Ashley Lai @ 2017-02-02 4:40 ` Vicky 2017-02-02 10:55 ` Michael Ellerman 2017-02-02 11:29 ` Michal Suchánek 0 siblings, 2 replies; 22+ messages in thread From: Vicky @ 2017-02-02 4:40 UTC (permalink / raw) To: Ashley Lai Cc: Jason Gunthorpe, Michal Such??nek, Nayna Jain, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, tpmdd-devel, linuxppc-dev, linux-kernel > On Jan 26, 2017, at 5:58 PM, Ashley Lai <ashleydlai@gmail.com> wrote: > > Adding Vicky from IBM. > > > On 01/26/2017 04:05 PM, Jason Gunthorpe wrote: >> On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Such??nek wrote: >> >>> This is repeated a few times in the driver so I added memset to quiet >>> gcc and make behavior deterministic in case the unused fields get some >>> meaning in the future. >> Yep, reserved certainly needs to be zeroed.. Can you send a patch? >> memset is overkill... >> >>> However, in tpm_ibmvtpm_send the structure is initialized as >>> >>> struct ibmvtpm_crq crq; >>> __be64 *word = (__be64 *)&crq; >>> ... >>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>> crq.msg = (u8)VTPM_TPM_COMMAND; >>> crq.len = cpu_to_be16(count); >>> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); >>> >>> and submitted with >>> >>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), >>> be64_to_cpu(word[1])); >>> meaning it is swapped twice. >> No idea, Nayna may know. >> >> My guess is that '__be64 *word' should be 'u64 *word'... >> >> Jason > I don’t think we want ‘word' to be changed back to be of type ‘u64’. Please see commit 62dfd912ab3b5405b6fe72d0135c37e9648071f1 Vicky ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-02-02 4:40 ` Vicky @ 2017-02-02 10:55 ` Michael Ellerman 2017-02-02 11:29 ` Michal Suchánek 1 sibling, 0 replies; 22+ messages in thread From: Michael Ellerman @ 2017-02-02 10:55 UTC (permalink / raw) To: Vicky, Ashley Lai Cc: Jason Gunthorpe, Michal Such??nek, Nayna Jain, Benjamin Herrenschmidt, Paul Mackerras, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, tpmdd-devel, linuxppc-dev, linux-kernel Vicky <honclo@linux.vnet.ibm.com> writes: >> On Jan 26, 2017, at 5:58 PM, Ashley Lai <ashleydlai@gmail.com> wrote: >> >> Adding Vicky from IBM. >> >> >> On 01/26/2017 04:05 PM, Jason Gunthorpe wrote: >>> On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Such??nek wrote: >>> >>>> This is repeated a few times in the driver so I added memset to quiet >>>> gcc and make behavior deterministic in case the unused fields get some >>>> meaning in the future. >>> Yep, reserved certainly needs to be zeroed.. Can you send a patch? >>> memset is overkill... >>> >>>> However, in tpm_ibmvtpm_send the structure is initialized as >>>> >>>> struct ibmvtpm_crq crq; >>>> __be64 *word = (__be64 *)&crq; >>>> ... >>>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>>> crq.msg = (u8)VTPM_TPM_COMMAND; >>>> crq.len = cpu_to_be16(count); >>>> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); >>>> >>>> and submitted with >>>> >>>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), >>>> be64_to_cpu(word[1])); >>>> meaning it is swapped twice. >>> No idea, Nayna may know. >>> >>> My guess is that '__be64 *word' should be 'u64 *word'... >>> >>> Jason >> > > I don’t think we want ‘word' to be changed back to be of type ‘u64’. Please see commit 62dfd912ab3b5405b6fe72d0135c37e9648071f1 The type of word is basically irrelevant. Unless you're running sparse and actually checking the errors, which it seems you're not doing: drivers/char/tpm/tpm_ibmvtpm.c:90:30: warning: cast removes address space of expression drivers/char/tpm/tpm_ibmvtpm.c:91:23: warning: incorrect type in argument 1 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:91:23: expected void *<noident> drivers/char/tpm/tpm_ibmvtpm.c:91:23: got void [noderef] <asn:2>*rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:136:17: warning: cast removes address space of expression drivers/char/tpm/tpm_ibmvtpm.c:188:46: warning: incorrect type in argument 2 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:188:46: expected unsigned long long [unsigned] [usertype] w1 drivers/char/tpm/tpm_ibmvtpm.c:188:46: got restricted __be64 [usertype] <noident> drivers/char/tpm/tpm_ibmvtpm.c:189:31: warning: incorrect type in argument 3 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:189:31: expected unsigned long long [unsigned] [usertype] w2 drivers/char/tpm/tpm_ibmvtpm.c:189:31: got restricted __be64 [usertype] <noident> drivers/char/tpm/tpm_ibmvtpm.c:215:46: warning: incorrect type in argument 2 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:215:46: expected unsigned long long [unsigned] [usertype] w1 drivers/char/tpm/tpm_ibmvtpm.c:215:46: got restricted __be64 [usertype] <noident> drivers/char/tpm/tpm_ibmvtpm.c:216:31: warning: incorrect type in argument 3 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:216:31: expected unsigned long long [unsigned] [usertype] w2 drivers/char/tpm/tpm_ibmvtpm.c:216:31: got restricted __be64 [usertype] <noident> drivers/char/tpm/tpm_ibmvtpm.c:294:30: warning: incorrect type in argument 1 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:294:30: expected void const *<noident> drivers/char/tpm/tpm_ibmvtpm.c:294:30: got void [noderef] <asn:2>*rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:342:46: warning: incorrect type in argument 2 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:342:46: expected unsigned long long [unsigned] [usertype] w1 drivers/char/tpm/tpm_ibmvtpm.c:342:46: got restricted __be64 [usertype] <noident> drivers/char/tpm/tpm_ibmvtpm.c:343:31: warning: incorrect type in argument 3 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:343:31: expected unsigned long long [unsigned] [usertype] w2 drivers/char/tpm/tpm_ibmvtpm.c:343:31: got restricted __be64 [usertype] <noident> drivers/char/tpm/tpm_ibmvtpm.c:494:43: warning: incorrect type in assignment (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:494:43: expected void [noderef] <asn:2>*rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:494:43: got void * drivers/char/tpm/tpm_ibmvtpm.c:501:52: warning: incorrect type in argument 2 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:501:52: expected void *ptr drivers/char/tpm/tpm_ibmvtpm.c:501:52: got void [noderef] <asn:2>*rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:507:46: warning: incorrect type in argument 1 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:507:46: expected void const *<noident> drivers/char/tpm/tpm_ibmvtpm.c:507:46: got void [noderef] <asn:2>*rtce_buf What matters is how you actually do the byte swaps. cheers ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-02-02 4:40 ` Vicky 2017-02-02 10:55 ` Michael Ellerman @ 2017-02-02 11:29 ` Michal Suchánek 2017-02-02 15:17 ` David Laight 1 sibling, 1 reply; 22+ messages in thread From: Michal Suchánek @ 2017-02-02 11:29 UTC (permalink / raw) To: Vicky Cc: Ashley Lai, Nayna Jain, Marcel Selhorst, linux-kernel, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, Paul Mackerras, Peter Huewe, linuxppc-dev On Wed, 1 Feb 2017 23:40:33 -0500 Vicky <honclo@linux.vnet.ibm.com> wrote: > > On Jan 26, 2017, at 5:58 PM, Ashley Lai <ashleydlai@gmail.com> > > wrote: > > > > Adding Vicky from IBM. > > > > > > On 01/26/2017 04:05 PM, Jason Gunthorpe wrote: > >> On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Such??nek wrote: > >> > >>> This is repeated a few times in the driver so I added memset to > >>> quiet gcc and make behavior deterministic in case the unused > >>> fields get some meaning in the future. > >> Yep, reserved certainly needs to be zeroed.. Can you send a patch? > >> memset is overkill... > >> > >>> However, in tpm_ibmvtpm_send the structure is initialized as > >>> > >>> struct ibmvtpm_crq crq; > >>> __be64 *word = (__be64 *)&crq; > >>> ... > >>> crq.valid = (u8)IBMVTPM_VALID_CMD; > >>> crq.msg = (u8)VTPM_TPM_COMMAND; > >>> crq.len = cpu_to_be16(count); > >>> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); > >>> > >>> and submitted with > >>> > >>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), > >>> be64_to_cpu(word[1])); > >>> meaning it is swapped twice. > >> No idea, Nayna may know. > >> > >> My guess is that '__be64 *word' should be 'u64 *word'... > >> > >> Jason > > > > I don’t think we want ‘word' to be changed back to be of type > ‘u64’. Please see commit 62dfd912ab3b5405b6fe72d0135c37e9648071f1 The word is marked correctly as __be64 in that patch because count and handle are swapped to BE when saved to it and the whole word is then swapped again when loaded. If you just load ((u64)IBMVTPM_VALID_CMD << 56 | ((u64)VTPM_TPM_COMMAND << 48) | ((u64)count << 32) | ibmvtpm->rtce_dma_handle in a register it works equally well without any __be and swaps involved. Note however that __be64 and u64 are all the same to the compiler. It's just a note for the reader and analysis tools. Thanks Michal ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: ibmvtpm byteswapping inconsistency 2017-02-02 11:29 ` Michal Suchánek @ 2017-02-02 15:17 ` David Laight 0 siblings, 0 replies; 22+ messages in thread From: David Laight @ 2017-02-02 15:17 UTC (permalink / raw) To: 'Michal Suchánek', Vicky Cc: Nayna Jain, Marcel Selhorst, linux-kernel@vger.kernel.org, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel@lists.sourceforge.net, Paul Mackerras, Ashley Lai, Peter Huewe, linuxppc-dev@lists.ozlabs.org From: Michal Suchánek > Sent: 02 February 2017 11:30 ... > The word is marked correctly as __be64 in that patch because count and > handle are swapped to BE when saved to it and the whole word is then > swapped again when loaded. If you just load ((u64)IBMVTPM_VALID_CMD << > 56 | ((u64)VTPM_TPM_COMMAND << 48) | ((u64)count << 32) | > ibmvtpm->rtce_dma_handle in a register it works equally well > without any __be and swaps involved. And that version will almost certainly generate much better code. David ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-26 20:22 ibmvtpm byteswapping inconsistency Michal Suchánek 2017-01-26 22:05 ` Jason Gunthorpe @ 2017-01-27 1:42 ` Tyrel Datwyler 2017-01-27 1:50 ` Benjamin Herrenschmidt 2017-01-27 11:18 ` David Laight 2 siblings, 1 reply; 22+ messages in thread From: Tyrel Datwyler @ 2017-01-27 1:42 UTC (permalink / raw) To: Michal Suchánek, Ashley Lai, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel On 01/26/2017 12:22 PM, Michal Suchánek wrote: > Hello, > > building ibmvtpm I noticed gcc warning complaining that second word of > struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. > > The structure is defined as > > struct ibmvtpm_crq { > u8 valid; > u8 msg; > __be16 len; > __be32 data; > __be64 reserved; > } __attribute__((packed, aligned(8))); > > initialized as > > struct ibmvtpm_crq crq; > u64 *buf = (u64 *) &crq; > ... > crq.valid = (u8)IBMVTPM_VALID_CMD; > crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; > > and submitted with > > rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), > cpu_to_be64(buf[1])); These should be be64_to_cpu() here. The underlying hcall made by ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the RTAS interface which requires data in BE. > > which means that the second word indeed contains purely garbage. > > This is repeated a few times in the driver so I added memset to quiet > gcc and make behavior deterministic in case the unused fields get some > meaning in the future. > > However, in tpm_ibmvtpm_send the structure is initialized as > > struct ibmvtpm_crq crq; > __be64 *word = (__be64 *)&crq; > ... > crq.valid = (u8)IBMVTPM_VALID_CMD; > crq.msg = (u8)VTPM_TPM_COMMAND; > crq.len = cpu_to_be16(count); > crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); > > and submitted with > > rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), > be64_to_cpu(word[1])); > meaning it is swapped twice. > > > Where is the interface defined? Are the command arguments passed as BE > subfields (the second case was correct before adding the extra whole > word swap) or BE words (the first case doing whole word swap is > correct)? The interface is defined in PAPR. The crq format is defined in BE terms. However, when we break the crq apart into high and low words they need to be in cpu endian as mentioned above. -Tyrel > > Thanks > > Michal > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 1:42 ` Tyrel Datwyler @ 2017-01-27 1:50 ` Benjamin Herrenschmidt 2017-01-27 9:03 ` Michal Suchanek 2017-01-27 18:02 ` Tyrel Datwyler 0 siblings, 2 replies; 22+ messages in thread From: Benjamin Herrenschmidt @ 2017-01-27 1:50 UTC (permalink / raw) To: Tyrel Datwyler, Michal Suchánek, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote: > On 01/26/2017 12:22 PM, Michal Suchánek wrote: > > Hello, > > > > building ibmvtpm I noticed gcc warning complaining that second word > > of > > struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. > > > > The structure is defined as > > > > struct ibmvtpm_crq { > > u8 valid; > > u8 msg; > > __be16 len; > > __be32 data; > > __be64 reserved; > > } __attribute__((packed, aligned(8))); > > > > initialized as > > > > struct ibmvtpm_crq crq; > > u64 *buf = (u64 *) &crq; > > ... > > crq.valid = (u8)IBMVTPM_VALID_CMD; > > crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; > > > > and submitted with > > > > rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), > > cpu_to_be64(buf[1])); > > These should be be64_to_cpu() here. The underlying hcall made by > ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the > RTAS interface which requires data in BE. Hrm... an hcall takes register arguments. Register arguments don't have an endianness. The problem is that we are packing an in-memory structure into 2 registers and it's expected that this structure is laid out in the registers as if it had been loaded by a BE CPU. So we have two things at play here: - The >8-bit fields should be laid out BE in the memory image - That whole 128-bit structure should be loaded into 2 64-bit registers MSB first. So the "double" swap is somewhat needed. The uglyness comes from the passing-by-register of the h-call but it should work. That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you better result (though recent gcc's might not make a difference). > > > > which means that the second word indeed contains purely garbage. > > > > This is repeated a few times in the driver so I added memset to > > quiet > > gcc and make behavior deterministic in case the unused fields get > > some > > meaning in the future. > > > > However, in tpm_ibmvtpm_send the structure is initialized as > > > > struct ibmvtpm_crq crq; > > __be64 *word = (__be64 *)&crq; > > ... > > crq.valid = (u8)IBMVTPM_VALID_CMD; > > crq.msg = (u8)VTPM_TPM_COMMAND; > > crq.len = cpu_to_be16(count); > > crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); > > > > and submitted with > > > > rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), > > be64_to_cpu(word[1])); > > meaning it is swapped twice. > > > > > > Where is the interface defined? Are the command arguments passed as > > BE > > subfields (the second case was correct before adding the extra > > whole > > word swap) or BE words (the first case doing whole word swap is > > correct)? > > The interface is defined in PAPR. The crq format is defined in BE > terms. > However, when we break the crq apart into high and low words they > need > to be in cpu endian as mentioned above. > > -Tyrel > > > > > Thanks > > > > Michal > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 1:50 ` Benjamin Herrenschmidt @ 2017-01-27 9:03 ` Michal Suchanek 2017-01-27 21:19 ` Tyrel Datwyler 2017-01-27 18:02 ` Tyrel Datwyler 1 sibling, 1 reply; 22+ messages in thread From: Michal Suchanek @ 2017-01-27 9:03 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Tyrel Datwyler, Michal Suchánek, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, Linux Kernel Mailing List On 27 January 2017 at 02:50, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote: >> On 01/26/2017 12:22 PM, Michal Suchánek wrote: >> > Hello, >> > >> > building ibmvtpm I noticed gcc warning complaining that second word >> > of >> > struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. >> > >> > The structure is defined as >> > >> > struct ibmvtpm_crq { >> > u8 valid; >> > u8 msg; >> > __be16 len; >> > __be32 data; >> > __be64 reserved; >> > } __attribute__((packed, aligned(8))); >> > >> > initialized as >> > >> > struct ibmvtpm_crq crq; >> > u64 *buf = (u64 *) &crq; >> > ... >> > crq.valid = (u8)IBMVTPM_VALID_CMD; >> > crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; >> > >> > and submitted with >> > >> > rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), >> > cpu_to_be64(buf[1])); >> >> These should be be64_to_cpu() here. The underlying hcall made by >> ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the >> RTAS interface which requires data in BE. > > Hrm... an hcall takes register arguments. Register arguments don't have > an endianness. > > The problem is that we are packing an in-memory structure into 2 > registers and it's expected that this structure is laid out in the > registers as if it had been loaded by a BE CPU. > > So we have two things at play here: > > - The >8-bit fields should be laid out BE in the memory image > - That whole 128-bit structure should be loaded into 2 64-bit > registers MSB first. > > So the "double" swap is somewhat needed. The uglyness comes from the > passing-by-register of the h-call but it should work. > > That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you > better result (though recent gcc's might not make a difference). If this should work then the below case that swaps the fields separately is broken. Anyway, structures have no endianess so when they start with a byte they start with that byte no matter the host endian. crq.valid is the first byte always. And then each field is to be swapped separately. On the other hand, bitfields are part of an integer and the field should be swapped as part of the integer. That is, #define CRQ_VALID ((buf[0] >> 56) & 0xff) CRQ_VALID is part of an integer in buf and would be laid out differently on start or end depending on the host being BE or LE. And the question is what the PAPR actually defines because both ways are used in the code. You can describe an in-memory layout either way. >> > >> > which means that the second word indeed contains purely garbage. >> > >> > This is repeated a few times in the driver so I added memset to >> > quiet >> > gcc and make behavior deterministic in case the unused fields get >> > some >> > meaning in the future. >> > >> > However, in tpm_ibmvtpm_send the structure is initialized as >> > >> > struct ibmvtpm_crq crq; >> > __be64 *word = (__be64 *)&crq; >> > ... >> > crq.valid = (u8)IBMVTPM_VALID_CMD; >> > crq.msg = (u8)VTPM_TPM_COMMAND; >> > crq.len = cpu_to_be16(count); >> > crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); >> > >> > and submitted with >> > >> > rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), >> > be64_to_cpu(word[1])); >> > meaning it is swapped twice. >> > >> > >> > Where is the interface defined? Are the command arguments passed as >> > BE >> > subfields (the second case was correct before adding the extra >> > whole >> > word swap) or BE words (the first case doing whole word swap is >> > correct)? >> >> The interface is defined in PAPR. The crq format is defined in BE >> terms. Which exact PAPR? Where can I get it? The PAPR document I found does not say anything about vtpm. Thanks Michal ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 9:03 ` Michal Suchanek @ 2017-01-27 21:19 ` Tyrel Datwyler 2017-01-30 4:32 ` Michael Ellerman 0 siblings, 1 reply; 22+ messages in thread From: Tyrel Datwyler @ 2017-01-27 21:19 UTC (permalink / raw) To: Michal Suchanek, Benjamin Herrenschmidt Cc: Marcel Selhorst, Linux Kernel Mailing List, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, Paul Mackerras, Ashley Lai, Peter Huewe, Michal Suchánek, linuxppc-dev On 01/27/2017 01:03 AM, Michal Suchanek wrote: > On 27 January 2017 at 02:50, Benjamin Herrenschmidt > <benh@kernel.crashing.org> wrote: >> On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote: >>> On 01/26/2017 12:22 PM, Michal Suchánek wrote: >>>> Hello, >>>> >>>> building ibmvtpm I noticed gcc warning complaining that second word >>>> of >>>> struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. >>>> >>>> The structure is defined as >>>> >>>> struct ibmvtpm_crq { >>>> u8 valid; >>>> u8 msg; >>>> __be16 len; >>>> __be32 data; >>>> __be64 reserved; >>>> } __attribute__((packed, aligned(8))); >>>> >>>> initialized as >>>> >>>> struct ibmvtpm_crq crq; >>>> u64 *buf = (u64 *) &crq; >>>> ... >>>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>>> crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; >>>> >>>> and submitted with >>>> >>>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), >>>> cpu_to_be64(buf[1])); >>> >>> These should be be64_to_cpu() here. The underlying hcall made by >>> ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the >>> RTAS interface which requires data in BE. >> >> Hrm... an hcall takes register arguments. Register arguments don't have >> an endianness. >> >> The problem is that we are packing an in-memory structure into 2 >> registers and it's expected that this structure is laid out in the >> registers as if it had been loaded by a BE CPU. >> >> So we have two things at play here: >> >> - The >8-bit fields should be laid out BE in the memory image >> - That whole 128-bit structure should be loaded into 2 64-bit >> registers MSB first. >> >> So the "double" swap is somewhat needed. The uglyness comes from the >> passing-by-register of the h-call but it should work. >> >> That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you >> better result (though recent gcc's might not make a difference). > > If this should work then the below case that swaps the fields separately is > broken. > > Anyway, structures have no endianess so when they start with a byte they > start with that byte no matter the host endian. > crq.valid is the first byte always. And then each field is to be swapped > separately. > > On the other hand, bitfields are part of an integer and the field should be > swapped as part of the integer. > > That is, > #define CRQ_VALID ((buf[0] >> 56) & 0xff) > CRQ_VALID is part of an integer in buf and would be laid out differently > on start or end depending on the host being BE or LE. > > And the question is what the PAPR actually defines because both ways are > used in the code. You can describe an in-memory layout either way. Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 ----------------------------------------------------------------------- Word0 | Valid | Type | Length | Data ----------------------------------------------------------------------- Word1 | Reserved ----------------------------------------------------------------------- The following definition looks to match: struct ibmvtpm_crq { u8 valid; u8 msg; __be16 len; __be32 data; __be64 reserved; } __attribute__((packed, aligned(8))); > >>>> >>>> which means that the second word indeed contains purely garbage. >>>> >>>> This is repeated a few times in the driver so I added memset to >>>> quiet >>>> gcc and make behavior deterministic in case the unused fields get >>>> some >>>> meaning in the future. >>>> >>>> However, in tpm_ibmvtpm_send the structure is initialized as >>>> >>>> struct ibmvtpm_crq crq; >>>> __be64 *word = (__be64 *)&crq; >>>> ... >>>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>>> crq.msg = (u8)VTPM_TPM_COMMAND; >>>> crq.len = cpu_to_be16(count); >>>> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); >>>> >>>> and submitted with >>>> >>>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), >>>> be64_to_cpu(word[1])); >>>> meaning it is swapped twice. >>>> >>>> >>>> Where is the interface defined? Are the command arguments passed as >>>> BE >>>> subfields (the second case was correct before adding the extra >>>> whole >>>> word swap) or BE words (the first case doing whole word swap is >>>> correct)? >>> >>> The interface is defined in PAPR. The crq format is defined in BE >>> terms. > > Which exact PAPR? Where can I get it? > The PAPR document I found does not say anything about vtpm. Unfortunately, vtpm doesn't appear to be covered in the publicly available LoPAPR. -Tyrel > > Thanks > > Michal > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 21:19 ` Tyrel Datwyler @ 2017-01-30 4:32 ` Michael Ellerman 2017-01-30 20:34 ` Tyrel Datwyler 0 siblings, 1 reply; 22+ messages in thread From: Michael Ellerman @ 2017-01-30 4:32 UTC (permalink / raw) To: Tyrel Datwyler, Michal Suchanek, Benjamin Herrenschmidt Cc: Marcel Selhorst, Linux Kernel Mailing List, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, Paul Mackerras, Ashley Lai, Peter Huewe, Michal Suchánek, linuxppc-dev Tyrel Datwyler <tyreld@linux.vnet.ibm.com> writes: > On 01/27/2017 01:03 AM, Michal Suchanek wrote: >> On 27 January 2017 at 02:50, Benjamin Herrenschmidt >> <benh@kernel.crashing.org> wrote: >>> On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote: >>>> On 01/26/2017 12:22 PM, Michal Suchánek wrote: >>>>> Hello, >>>>> >>>>> building ibmvtpm I noticed gcc warning complaining that second word >>>>> of >>>>> struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. >>>>> >>>>> The structure is defined as >>>>> >>>>> struct ibmvtpm_crq { >>>>> u8 valid; >>>>> u8 msg; >>>>> __be16 len; >>>>> __be32 data; >>>>> __be64 reserved; >>>>> } __attribute__((packed, aligned(8))); >>>>> >>>>> initialized as >>>>> >>>>> struct ibmvtpm_crq crq; >>>>> u64 *buf = (u64 *) &crq; >>>>> ... >>>>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>>>> crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; >>>>> >>>>> and submitted with >>>>> >>>>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), >>>>> cpu_to_be64(buf[1])); >>>> >>>> These should be be64_to_cpu() here. The underlying hcall made by >>>> ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the >>>> RTAS interface which requires data in BE. >>> >>> Hrm... an hcall takes register arguments. Register arguments don't have >>> an endianness. >>> >>> The problem is that we are packing an in-memory structure into 2 >>> registers and it's expected that this structure is laid out in the >>> registers as if it had been loaded by a BE CPU. >>> >>> So we have two things at play here: >>> >>> - The >8-bit fields should be laid out BE in the memory image >>> - That whole 128-bit structure should be loaded into 2 64-bit >>> registers MSB first. >>> >>> So the "double" swap is somewhat needed. The uglyness comes from the >>> passing-by-register of the h-call but it should work. >>> >>> That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you >>> better result (though recent gcc's might not make a difference). >> >> If this should work then the below case that swaps the fields separately is >> broken. >> >> Anyway, structures have no endianess so when they start with a byte they >> start with that byte no matter the host endian. >> crq.valid is the first byte always. And then each field is to be swapped >> separately. >> >> On the other hand, bitfields are part of an integer and the field should be >> swapped as part of the integer. >> >> That is, >> #define CRQ_VALID ((buf[0] >> 56) & 0xff) >> CRQ_VALID is part of an integer in buf and would be laid out differently >> on start or end depending on the host being BE or LE. >> >> And the question is what the PAPR actually defines because both ways are >> used in the code. You can describe an in-memory layout either way. > > Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 > ----------------------------------------------------------------------- > Word0 | Valid | Type | Length | Data > ----------------------------------------------------------------------- > Word1 | Reserved > ----------------------------------------------------------------------- > > The following definition looks to match: > > struct ibmvtpm_crq { > u8 valid; > u8 msg; > __be16 len; > __be32 data; > __be64 reserved; > } __attribute__((packed, aligned(8))); Well it's a partial match. Your layout above doesn't define which byte of Length or Data is the MSB or LSB. So going by that we still don't know the endianness of either field. cheers ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-30 4:32 ` Michael Ellerman @ 2017-01-30 20:34 ` Tyrel Datwyler 2017-01-31 8:38 ` Michael Ellerman 0 siblings, 1 reply; 22+ messages in thread From: Tyrel Datwyler @ 2017-01-30 20:34 UTC (permalink / raw) To: Michael Ellerman, Tyrel Datwyler, Michal Suchanek, Benjamin Herrenschmidt Cc: Marcel Selhorst, Linux Kernel Mailing List, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, Paul Mackerras, Ashley Lai, Peter Huewe, Michal Suchánek, linuxppc-dev On 01/29/2017 08:32 PM, Michael Ellerman wrote: > Tyrel Datwyler <tyreld@linux.vnet.ibm.com> writes: > >> On 01/27/2017 01:03 AM, Michal Suchanek wrote: >>> On 27 January 2017 at 02:50, Benjamin Herrenschmidt >>> <benh@kernel.crashing.org> wrote: >>>> On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote: >>>>> On 01/26/2017 12:22 PM, Michal Suchánek wrote: >>>>>> Hello, >>>>>> >>>>>> building ibmvtpm I noticed gcc warning complaining that second word >>>>>> of >>>>>> struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. >>>>>> >>>>>> The structure is defined as >>>>>> >>>>>> struct ibmvtpm_crq { >>>>>> u8 valid; >>>>>> u8 msg; >>>>>> __be16 len; >>>>>> __be32 data; >>>>>> __be64 reserved; >>>>>> } __attribute__((packed, aligned(8))); >>>>>> >>>>>> initialized as >>>>>> >>>>>> struct ibmvtpm_crq crq; >>>>>> u64 *buf = (u64 *) &crq; >>>>>> ... >>>>>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>>>>> crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; >>>>>> >>>>>> and submitted with >>>>>> >>>>>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), >>>>>> cpu_to_be64(buf[1])); >>>>> >>>>> These should be be64_to_cpu() here. The underlying hcall made by >>>>> ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the >>>>> RTAS interface which requires data in BE. >>>> >>>> Hrm... an hcall takes register arguments. Register arguments don't have >>>> an endianness. >>>> >>>> The problem is that we are packing an in-memory structure into 2 >>>> registers and it's expected that this structure is laid out in the >>>> registers as if it had been loaded by a BE CPU. >>>> >>>> So we have two things at play here: >>>> >>>> - The >8-bit fields should be laid out BE in the memory image >>>> - That whole 128-bit structure should be loaded into 2 64-bit >>>> registers MSB first. >>>> >>>> So the "double" swap is somewhat needed. The uglyness comes from the >>>> passing-by-register of the h-call but it should work. >>>> >>>> That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you >>>> better result (though recent gcc's might not make a difference). >>> >>> If this should work then the below case that swaps the fields separately is >>> broken. >>> >>> Anyway, structures have no endianess so when they start with a byte they >>> start with that byte no matter the host endian. >>> crq.valid is the first byte always. And then each field is to be swapped >>> separately. >>> >>> On the other hand, bitfields are part of an integer and the field should be >>> swapped as part of the integer. >>> >>> That is, >>> #define CRQ_VALID ((buf[0] >> 56) & 0xff) >>> CRQ_VALID is part of an integer in buf and would be laid out differently >>> on start or end depending on the host being BE or LE. >>> >>> And the question is what the PAPR actually defines because both ways are >>> used in the code. You can describe an in-memory layout either way. >> >> Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 >> ----------------------------------------------------------------------- >> Word0 | Valid | Type | Length | Data >> ----------------------------------------------------------------------- >> Word1 | Reserved >> ----------------------------------------------------------------------- >> >> The following definition looks to match: >> >> struct ibmvtpm_crq { >> u8 valid; >> u8 msg; >> __be16 len; >> __be32 data; >> __be64 reserved; >> } __attribute__((packed, aligned(8))); > > Well it's a partial match. > > Your layout above doesn't define which byte of Length or Data is the MSB > or LSB. So going by that we still don't know the endianness of either I should have been explicit that PAPR uses Big Endian bit and byte numbering throughout the spec unless otherwise noted. -Tyrel > field. > > cheers > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-30 20:34 ` Tyrel Datwyler @ 2017-01-31 8:38 ` Michael Ellerman 0 siblings, 0 replies; 22+ messages in thread From: Michael Ellerman @ 2017-01-31 8:38 UTC (permalink / raw) To: Tyrel Datwyler, Tyrel Datwyler, Michal Suchanek, Benjamin Herrenschmidt Cc: Marcel Selhorst, Linux Kernel Mailing List, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, Paul Mackerras, Ashley Lai, Peter Huewe, Michal Suchánek, linuxppc-dev Tyrel Datwyler <tyreld@linux.vnet.ibm.com> writes: > On 01/29/2017 08:32 PM, Michael Ellerman wrote: >> Tyrel Datwyler <tyreld@linux.vnet.ibm.com> writes: >>> >>> Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 >>> ----------------------------------------------------------------------- >>> Word0 | Valid | Type | Length | Data >>> ----------------------------------------------------------------------- >>> Word1 | Reserved >>> ----------------------------------------------------------------------- >>> >>> The following definition looks to match: >>> >>> struct ibmvtpm_crq { >>> u8 valid; >>> u8 msg; >>> __be16 len; >>> __be32 data; >>> __be64 reserved; >>> } __attribute__((packed, aligned(8))); >> >> Well it's a partial match. >> >> Your layout above doesn't define which byte of Length or Data is the MSB >> or LSB. So going by that we still don't know the endianness of either > > I should have been explicit that PAPR uses Big Endian bit and byte > numbering throughout the spec unless otherwise noted. OK. I didn't actually remember that so yeah good to be explicit. cheers ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 1:50 ` Benjamin Herrenschmidt 2017-01-27 9:03 ` Michal Suchanek @ 2017-01-27 18:02 ` Tyrel Datwyler 2017-01-27 19:58 ` Benjamin Herrenschmidt 2017-01-30 14:42 ` David Laight 1 sibling, 2 replies; 22+ messages in thread From: Tyrel Datwyler @ 2017-01-27 18:02 UTC (permalink / raw) To: Benjamin Herrenschmidt, Michal Suchánek, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel On 01/26/2017 05:50 PM, Benjamin Herrenschmidt wrote: > On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote: >> On 01/26/2017 12:22 PM, Michal Suchánek wrote: >>> Hello, >>> >>> building ibmvtpm I noticed gcc warning complaining that second word >>> of >>> struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. >>> >>> The structure is defined as >>> >>> struct ibmvtpm_crq { >>> u8 valid; >>> u8 msg; >>> __be16 len; >>> __be32 data; >>> __be64 reserved; >>> } __attribute__((packed, aligned(8))); >>> >>> initialized as >>> >>> struct ibmvtpm_crq crq; >>> u64 *buf = (u64 *) &crq; >>> ... >>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>> crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; >>> >>> and submitted with >>> >>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), >>> cpu_to_be64(buf[1])); >> >> These should be be64_to_cpu() here. The underlying hcall made by >> ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the >> RTAS interface which requires data in BE. > > Hrm... an hcall takes register arguments. Register arguments don't have > an endianness. I wasn't suggesting that they do. However, I still believe my point is valid that the arguments need to be loaded into the registers according to the endianness of the cpu. We had several bugs during LE porting where assumptions were made that parameters should be loaded BE regardless of cpu endian. For example: commit 3df76a9dcc74d5f012b94ea01ed6e7aaf8362c5a Author: Cyril Bur <cyrilbur@gmail.com> Date: Wed Jan 21 13:32:00 2015 +1100 powerpc/pseries: Fix endian problems with LE migration RTAS events require arguments be passed in big endian while hypercalls have their arguments passed in registers and the values should therefore be in CPU endian. > > The problem is that we are packing an in-memory structure into 2 > registers and it's expected that this structure is laid out in the > registers as if it had been loaded by a BE CPU. This is only the case if the cpu is BE. If the cpu is LE, regardless of the fact that our in memory structure is laid out BE, when we break it into 2 words each of those words needs to be loaded LE. -Tyrel > > So we have two things at play here: > > - The >8-bit fields should be laid out BE in the memory image > - That whole 128-bit structure should be loaded into 2 64-bit > registers MSB first. > > So the "double" swap is somewhat needed. The uglyness comes from the > passing-by-register of the h-call but it should work. > > That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you > better result (though recent gcc's might not make a difference). >>> >>> which means that the second word indeed contains purely garbage. >>> >>> This is repeated a few times in the driver so I added memset to >>> quiet >>> gcc and make behavior deterministic in case the unused fields get >>> some >>> meaning in the future. >>> >>> However, in tpm_ibmvtpm_send the structure is initialized as >>> >>> struct ibmvtpm_crq crq; >>> __be64 *word = (__be64 *)&crq; >>> ... >>> crq.valid = (u8)IBMVTPM_VALID_CMD; >>> crq.msg = (u8)VTPM_TPM_COMMAND; >>> crq.len = cpu_to_be16(count); >>> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); >>> >>> and submitted with >>> >>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), >>> be64_to_cpu(word[1])); >>> meaning it is swapped twice. >>> >>> >>> Where is the interface defined? Are the command arguments passed as >>> BE >>> subfields (the second case was correct before adding the extra >>> whole >>> word swap) or BE words (the first case doing whole word swap is >>> correct)? >> >> The interface is defined in PAPR. The crq format is defined in BE >> terms. >> However, when we break the crq apart into high and low words they >> need >> to be in cpu endian as mentioned above. >> >> -Tyrel >> >>> >>> Thanks >>> >>> Michal >>> > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 18:02 ` Tyrel Datwyler @ 2017-01-27 19:58 ` Benjamin Herrenschmidt 2017-01-27 20:32 ` Tyrel Datwyler 2017-01-30 14:42 ` David Laight 1 sibling, 1 reply; 22+ messages in thread From: Benjamin Herrenschmidt @ 2017-01-27 19:58 UTC (permalink / raw) To: Tyrel Datwyler, Michal Suchánek, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel On Fri, 2017-01-27 at 10:02 -0800, Tyrel Datwyler wrote: > > The problem is that we are packing an in-memory structure into 2 > > registers and it's expected that this structure is laid out in the > > registers as if it had been loaded by a BE CPU. > > This is only the case if the cpu is BE. If the cpu is LE, regardless of > the fact that our in memory structure is laid out BE, when we break it > into 2 words each of those words needs to be loaded LE. That doesn't make sense and doesn't match the code... The structure needs to always have the same in-register layout regardless of the endianness of the CPU, especially since the underlying hypervisor will most likely be BE :-) Thta's why the code does a be64_to_cpu() when loading it, this in effect performs a "BE" load, which on a BE CPU is just a normal load and on LE is a swap to compensate for the CPU loading it the "wrong way around". Cheers, Ben. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 19:58 ` Benjamin Herrenschmidt @ 2017-01-27 20:32 ` Tyrel Datwyler 2017-01-28 0:35 ` msuchanek 2017-01-28 4:28 ` Benjamin Herrenschmidt 0 siblings, 2 replies; 22+ messages in thread From: Tyrel Datwyler @ 2017-01-27 20:32 UTC (permalink / raw) To: Benjamin Herrenschmidt, Michal Suchánek, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel On 01/27/2017 11:58 AM, Benjamin Herrenschmidt wrote: > On Fri, 2017-01-27 at 10:02 -0800, Tyrel Datwyler wrote: >>> The problem is that we are packing an in-memory structure into 2 >>> registers and it's expected that this structure is laid out in the >>> registers as if it had been loaded by a BE CPU. >> >> This is only the case if the cpu is BE. If the cpu is LE, regardless of >> the fact that our in memory structure is laid out BE, when we break it >> into 2 words each of those words needs to be loaded LE. > > That doesn't make sense and doesn't match the code... The structure > needs to always have the same in-register layout regardless of the > endianness of the CPU, especially since the underlying hypervisor > will most likely be BE :-) > > Thta's why the code does a be64_to_cpu() when loading it, this in > effect performs a "BE" load, which on a BE CPU is just a normal load > and on LE is a swap to compensate for the CPU loading it the "wrong way > around". Its possible being the end of the week I'm just a little dense, but wouldn't be64_to_cpu() imply that we are byte-swapping something that is already, or supposedly already, in BE format to cpu endianness? Which on a BE cpu I would expect a no-op, and on a LE cpu the 64bit word to have been swapped from BE --> LE? In my eyes the code does seem to support what I've argued. The same thing is done in the scsi VIO drivers. The CRQ structure is laid out and annotated BE. We use cpu_to_be() calls to load any non 8bit field. Finally, each word is swapped to cpu endian when we hand it off for the hcall. from ibmvfc_send_event(): __be64 *crq_as_u64 = (__be64 *) &evt->crq; <..snip..> if ((rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]), be64_to_cpu(crq_as_u64[1])))) { Again, maybe I'm missing something. -Tyrel > > Cheers, > Ben. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 20:32 ` Tyrel Datwyler @ 2017-01-28 0:35 ` msuchanek 2017-01-28 4:28 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 22+ messages in thread From: msuchanek @ 2017-01-28 0:35 UTC (permalink / raw) To: Tyrel Datwyler Cc: Benjamin Herrenschmidt, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel Hello, On 2017-01-27 21:32, Tyrel Datwyler wrote: > On 01/27/2017 11:58 AM, Benjamin Herrenschmidt wrote: >> On Fri, 2017-01-27 at 10:02 -0800, Tyrel Datwyler wrote: >>>> The problem is that we are packing an in-memory structure into 2 >>>> registers and it's expected that this structure is laid out in the >>>> registers as if it had been loaded by a BE CPU. >>> >>> This is only the case if the cpu is BE. If the cpu is LE, regardless >>> of >>> the fact that our in memory structure is laid out BE, when we break >>> it >>> into 2 words each of those words needs to be loaded LE. >> >> That doesn't make sense and doesn't match the code... The structure >> needs to always have the same in-register layout regardless of the >> endianness of the CPU, especially since the underlying hypervisor >> will most likely be BE :-) >> >> Thta's why the code does a be64_to_cpu() when loading it, this in >> effect performs a "BE" load, which on a BE CPU is just a normal load >> and on LE is a swap to compensate for the CPU loading it the "wrong >> way >> around". > > Its possible being the end of the week I'm just a little dense, but > wouldn't be64_to_cpu() imply that we are byte-swapping something that > is > already, or supposedly already, in BE format to cpu endianness? Which > on > a BE cpu I would expect a no-op, and on a LE cpu the 64bit word to have > been swapped from BE --> LE? > > In my eyes the code does seem to support what I've argued. The same > thing is done in the scsi VIO drivers. The CRQ structure is laid out > and > annotated BE. We use cpu_to_be() calls to load any non 8bit field. > Finally, each word is swapped to cpu endian when we hand it off for the > hcall. > > from ibmvfc_send_event(): > > __be64 *crq_as_u64 = (__be64 *) &evt->crq; > > <..snip..> > > if ((rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]), > be64_to_cpu(crq_as_u64[1])))) { > > Again, maybe I'm missing something. > Ok, so you perform really difficult operation for no good reason. You say that the ppc dual-endian works like this: there is an internal in-cpu representation of numbers which is always the same. What is affected by switching endian is how memory loads and stores work. If you pass these two words in registers you never need to swap anything. > Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 > ----------------------------------------------------------------------- > Word0 | Valid | Type | Length | Data > ----------------------------------------------------------------------- > Word1 | Reserved > ----------------------------------------------------------------------- > > The following definition looks to match: > > struct ibmvtpm_crq { > u8 valid; > u8 msg; > __be16 len; > __be32 data; > __be64 reserved; > } __attribute__((packed, aligned(8))); If under BE valid is first byte then it is MSB and you would get value to pass in word 0 as (valid << 56) | (type << 48) | (length << 32 ) | data. No swaps involved. To achieve same with structure and swaps you would indeed first swap the members and then the whole word. Much harder to read code that way, though. Thanks Michal ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: ibmvtpm byteswapping inconsistency 2017-01-27 20:32 ` Tyrel Datwyler 2017-01-28 0:35 ` msuchanek @ 2017-01-28 4:28 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 22+ messages in thread From: Benjamin Herrenschmidt @ 2017-01-28 4:28 UTC (permalink / raw) To: Tyrel Datwyler, Michal Suchánek, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel, linuxppc-dev, linux-kernel On Fri, 2017-01-27 at 12:32 -0800, Tyrel Datwyler wrote: > Its possible being the end of the week I'm just a little dense, but > wouldn't be64_to_cpu() imply that we are byte-swapping something that is > already, or supposedly already, in BE format to cpu endianness? Which on > a BE cpu I would expect a no-op, and on a LE cpu the 64bit word to have > been swapped from BE --> LE? It's in BE format in memory. In LE mode, loading it into a register will get it the wrong way around, thus we have to swap it again. Once in a register it has no "endianness" per-se, what matters is that the act of loading from memory to a register would have loaded it the wrong way around in LE. > In my eyes the code does seem to support what I've argued. The same > thing is done in the scsi VIO drivers. The CRQ structure is laid out and > annotated BE. We use cpu_to_be() calls to load any non 8bit field. > Finally, each word is swapped to cpu endian when we hand it off for the > hcall. > > from ibmvfc_send_event(): > > __be64 *crq_as_u64 = (__be64 *) &evt->crq; > > <..snip..> > > if ((rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]), > be64_to_cpu(crq_as_u64[1])))) { > > Again, maybe I'm missing something. ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: ibmvtpm byteswapping inconsistency 2017-01-27 18:02 ` Tyrel Datwyler 2017-01-27 19:58 ` Benjamin Herrenschmidt @ 2017-01-30 14:42 ` David Laight 1 sibling, 0 replies; 22+ messages in thread From: David Laight @ 2017-01-30 14:42 UTC (permalink / raw) To: 'Tyrel Datwyler', Benjamin Herrenschmidt, Michal Suchánek, Ashley Lai, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel@lists.sourceforge.net, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org From: Tyrel Datwyler > Sent: 27 January 2017 18:03 > On 01/26/2017 05:50 PM, Benjamin Herrenschmidt wrote: > > On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote: > >> On 01/26/2017 12:22 PM, Michal Suchnek wrote: > >>> Hello, > >>> > >>> building ibmvtpm I noticed gcc warning complaining that second word > >>> of > >>> struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. > >>> > >>> The structure is defined as > >>> > >>> struct ibmvtpm_crq { > >>> u8 valid; > >>> u8 msg; > >>> __be16 len; > >>> __be32 data; > >>> __be64 reserved; > >>> } __attribute__((packed, aligned(8))); > >>> > >>> initialized as > >>> > >>> struct ibmvtpm_crq crq; > >>> u64 *buf = (u64 *) &crq; > >>> ... > >>> crq.valid = (u8)IBMVTPM_VALID_CMD; > >>> crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND; > >>> > >>> and submitted with > >>> > >>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]), > >>> cpu_to_be64(buf[1])); Isn't the real fubar here the use of that memory layout structure at all? It would probably all be better if the call looked like: rc = ibmvtpm_send_crq(ibmvtpm->vdev, MAKE_REQ(IBMVTPM_VALID_CMD, VTPM_PREPARE_TO_SUSPEND, xxx_len, xxx_data), 0); and MAKE_REQ() did all the required endian independant shifts to generate the correct 32bit value. David ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: ibmvtpm byteswapping inconsistency 2017-01-26 20:22 ibmvtpm byteswapping inconsistency Michal Suchánek 2017-01-26 22:05 ` Jason Gunthorpe 2017-01-27 1:42 ` Tyrel Datwyler @ 2017-01-27 11:18 ` David Laight 2 siblings, 0 replies; 22+ messages in thread From: David Laight @ 2017-01-27 11:18 UTC (permalink / raw) To: 'Michal Suchánek', Ashley Lai, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Peter Huewe, Marcel Selhorst, Jarkko Sakkinen, Jason Gunthorpe, tpmdd-devel@lists.sourceforge.net, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org From: Michal Suchánek > building ibmvtpm I noticed gcc warning complaining that second word of > struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized. > > The structure is defined as > > struct ibmvtpm_crq { > u8 valid; > u8 msg; > __be16 len; > __be32 data; > __be64 reserved; > } __attribute__((packed, aligned(8))); > > initialized as > > struct ibmvtpm_crq crq; > u64 *buf = (u64 *) &crq; ... Hrummfff.... What is that attribute for, seems pretty confusing and pointless to me. I also suspect that if you want to access it as two 64bit words it ought to be a union. David ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2017-02-02 15:17 UTC | newest] Thread overview: 22+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-01-26 20:22 ibmvtpm byteswapping inconsistency Michal Suchánek 2017-01-26 22:05 ` Jason Gunthorpe 2017-01-26 22:43 ` Michal Suchanek 2017-01-26 22:58 ` Ashley Lai 2017-02-02 4:40 ` Vicky 2017-02-02 10:55 ` Michael Ellerman 2017-02-02 11:29 ` Michal Suchánek 2017-02-02 15:17 ` David Laight 2017-01-27 1:42 ` Tyrel Datwyler 2017-01-27 1:50 ` Benjamin Herrenschmidt 2017-01-27 9:03 ` Michal Suchanek 2017-01-27 21:19 ` Tyrel Datwyler 2017-01-30 4:32 ` Michael Ellerman 2017-01-30 20:34 ` Tyrel Datwyler 2017-01-31 8:38 ` Michael Ellerman 2017-01-27 18:02 ` Tyrel Datwyler 2017-01-27 19:58 ` Benjamin Herrenschmidt 2017-01-27 20:32 ` Tyrel Datwyler 2017-01-28 0:35 ` msuchanek 2017-01-28 4:28 ` Benjamin Herrenschmidt 2017-01-30 14:42 ` David Laight 2017-01-27 11:18 ` David Laight
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox