From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43041) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eEWTg-0004Se-TX for qemu-devel@nongnu.org; Tue, 14 Nov 2017 03:21:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eEWTd-0001VO-Ll for qemu-devel@nongnu.org; Tue, 14 Nov 2017 03:21:52 -0500 References: <1510592277-20635-1-git-send-email-thuth@redhat.com> <4bb42004-19d2-26e5-8ae5-3b873f2b8ec3@amsat.org> From: Thomas Huth Message-ID: Date: Tue, 14 Nov 2017 09:21:41 +0100 MIME-Version: 1.0 In-Reply-To: <4bb42004-19d2-26e5-8ae5-3b873f2b8ec3@amsat.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH for-2.11] hw/net/vmxnet3: Fix code to work on big endian hosts, too List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= , qemu-devel@nongnu.org, Jason Wang , Dmitry Fleytman Cc: Dmitry Fleytman , David Gibson , qemu-s390x@nongnu.org, qemu-ppc@nongnu.org On 13.11.2017 23:38, Philippe Mathieu-Daud=C3=A9 wrote: > Hi Thomas, >=20 > On 11/13/2017 01:57 PM, Thomas Huth wrote: >> Since commit ab06ec43577177a442e8 we test the vmxnet3 device in the >> pxe-tester, too (when running "make check SPEED=3Dslow"). This now >> revealed that the vmxnet3 code is not working right if the host is a b= ig >> endian machine (for example ppc64 or s390x) - "make check SPEED=3Dslow= " >> is now failing on such hosts. >> >> The vmxnet3 code lacks endianess conversions in a couple of places. >> Interestingly, the bitfields in the structs in vmxnet3.h already tried= to >> take care of the *bit* endianess of the C compilers - but the code mis= sed >> to change the *byte* endianess when reading or writing the correspondi= ng >> structs. So the bitfields are now wrapped into unions which allow to c= hange >> the byte endianess during runtime with the non-bitfield member of the = union. >> With these changes, "make check SPEED=3Dslow" now properly works on bi= g endian >> hosts, too. >> >> Reported-by: David Gibson >> Signed-off-by: Thomas Huth >> --- [...] >> =20 >> @@ -535,7 +535,8 @@ static void vmxnet3_complete_packet(VMXNET3State *= s, int qidx, uint32_t tx_ridx) >> memset(&txcq_descr, 0, sizeof(txcq_descr)); >> txcq_descr.txdIdx =3D tx_ridx; >> txcq_descr.gen =3D vmxnet3_ring_curr_gen(&s->txq_descr[qidx].comp= _ring); >> - >> + txcq_descr.val1 =3D cpu_to_le32(txcq_descr.val1); >> + txcq_descr.val2 =3D cpu_to_le32(txcq_descr.val2); >> vmxnet3_ring_write_curr_cell(d, &s->txq_descr[qidx].comp_ring, &t= xcq_descr); >=20 > What about using inline functions: >=20 > - vmxnet3_rxdesc_pci_to_cpu(struct Vmxnet3_RxDesc *rxd) > - vmxnet3_txdesc_pci_to_cpu(...) > - vmxnet3_rxcompdesc_pci_to_cpu(struct Vmxnet3_RxCompDesc *rxcd) > - vmxnet3_txcompdesc_pci_to_cpu(...) Most of the structure byte-swapping is only done in one place for each type of structure, so for adding these two or three new lines to the code, it's not really justified to put them into separate functions that add at least six or seven new lines of code. This is only useful if the same struct is byte-swapped at two or more places in the code. >> /* Flush changes in TX descriptor before changing the counter val= ue */ >> @@ -695,11 +696,17 @@ vmxnet3_pop_next_tx_descr(VMXNET3State *s, >> PCIDevice *d =3D PCI_DEVICE(s); >> =20 >> vmxnet3_ring_read_curr_cell(d, ring, txd); >> + txd->addr =3D le64_to_cpu(txd->addr); >> + txd->val1 =3D le32_to_cpu(txd->val1); >> + txd->val2 =3D le32_to_cpu(txd->val2); >=20 > ^ probably not useful At least the txd->gen (via val1) is required one line below! ... and I wanted to be sure that the caller of vmxnet3_pop_next_tx_descr() always gets a consistent descriptor, even if the function returns false, so let's better be safe than sorry and always swap all fields of the struct. >> if (txd->gen =3D=3D vmxnet3_ring_curr_gen(ring)) { >> /* Only read after generation field verification */ >> smp_rmb(); >> /* Re-read to be sure we got the latest version */ >> vmxnet3_ring_read_curr_cell(d, ring, txd); >> + txd->addr =3D le64_to_cpu(txd->addr); >> + txd->val1 =3D le32_to_cpu(txd->val1); >> + txd->val2 =3D le32_to_cpu(txd->val2); >=20 > Using inlined func: >=20 > vmxnet3_txdesc_pci_to_cpu(txd); Ok, so that's one of the few spots where a struct is byteswapped not only in one place, but in two. I guess I could add a inline function for this... >> VMXNET3_RING_DUMP(VMW_RIPRN, "TX", qidx, ring); >> *descr_idx =3D vmxnet3_ring_curr_cell_idx(ring); >> vmxnet3_inc_tx_consumption_counter(s, qidx); >> @@ -749,7 +756,7 @@ static void vmxnet3_process_tx_queue(VMXNET3State = *s, int qidx) >> =20 >> if (!s->skip_current_tx_pkt) { >> data_len =3D (txd.len > 0) ? txd.len : VMXNET3_MAX_TX_BUF= _SIZE; >> - data_pa =3D le64_to_cpu(txd.addr); >> + data_pa =3D txd.addr; >> =20 >> if (!net_tx_pkt_add_raw_fragment(s->tx_pkt, >> data_pa, >> @@ -792,6 +799,9 @@ vmxnet3_read_next_rx_descr(VMXNET3State *s, int qi= dx, int ridx, >> Vmxnet3Ring *ring =3D &s->rxq_descr[qidx].rx_ring[ridx]; >> *didx =3D vmxnet3_ring_curr_cell_idx(ring); >> vmxnet3_ring_read_curr_cell(d, ring, dbuf); >> + dbuf->addr =3D le64_to_cpu(dbuf->addr); >> + dbuf->val1 =3D le32_to_cpu(dbuf->val1); >> + dbuf->ext1 =3D le32_to_cpu(dbuf->ext1); >=20 > again: >=20 > vmxnet3_txdesc_pci_to_cpu(dbuf); That function reads an RX descriptor, not a TX descriptor! And this descriptor type is only byte-swapped in this single place. And the whole function body is only 6 lines. So an additional inline function is really not justified here! >> } >> =20 >> static inline uint8_t >> @@ -811,6 +821,9 @@ vmxnet3_pop_rxc_descr(VMXNET3State *s, int qidx, u= int32_t *descr_gen) >> =20 >> pci_dma_read(PCI_DEVICE(s), >> daddr, &rxcd, sizeof(struct Vmxnet3_RxCompDesc)); >> + rxcd.val1 =3D le32_to_cpu(rxcd.val1); >> + rxcd.val2 =3D le32_to_cpu(rxcd.val2); >> + rxcd.val3 =3D le32_to_cpu(rxcd.val3); >=20 > vmxnet3_txcompdesc_pci_to_cpu(&rxcd); >=20 >> ring_gen =3D vmxnet3_ring_curr_gen(&s->rxq_descr[qidx].comp_ring)= ; >> =20 >> if (rxcd.gen !=3D ring_gen) { >> @@ -1098,14 +1111,16 @@ vmxnet3_indicate_packet(VMXNET3State *s) >> } >> =20 >> chunk_size =3D MIN(bytes_left, rxd.len); >> - vmxnet3_pci_dma_writev(d, data, bytes_copied, >> - le64_to_cpu(rxd.addr), chunk_size); >> + vmxnet3_pci_dma_writev(d, data, bytes_copied, rxd.addr, chunk= _size); >> bytes_copied +=3D chunk_size; >> bytes_left -=3D chunk_size; >> =20 >> vmxnet3_dump_rx_descr(&rxd); >=20 > ^ this dump is incorrect (not host swapped yet). I think you're wrong, the endianess should be fine here. rxd is ready via vmxnet3_get_next_rx_descr() which eventually calls vmxnet3_read_next_rx_descr() and that function is taking care of the byte-swapping. > vmxnet3_rxcompdesc_pci_to_cpu(&rxcd); >=20 >> =20 >> if (ready_rxcd_pa !=3D 0) { >> + rxcd.val1 =3D le32_to_cpu(rxcd.val1); >> + rxcd.val2 =3D le32_to_cpu(rxcd.val2); >> + rxcd.val3 =3D le32_to_cpu(rxcd.val3); >=20 > (out of if) Why? >> pci_dma_write(d, ready_rxcd_pa, &rxcd, sizeof(rxcd)); >> } >> =20 >> @@ -1138,6 +1153,9 @@ vmxnet3_indicate_packet(VMXNET3State *s) >> rxcd.eop =3D 1; >> rxcd.err =3D (bytes_left !=3D 0); >> =20 >> + rxcd.val1 =3D le32_to_cpu(rxcd.val1); >> + rxcd.val2 =3D le32_to_cpu(rxcd.val2); >> + rxcd.val3 =3D le32_to_cpu(rxcd.val3); >=20 > vmxnet3_rxcompdesc_pci_to_cpu(&rxcd); >=20 >> pci_dma_write(d, ready_rxcd_pa, &rxcd, sizeof(rxcd)); One of the few spots where a struct is written twice. I guess I could indeed add an inline wrapper for this ... not sure whether it's worth the effort... Thomas