From mboxrd@z Thu Jan 1 00:00:00 1970 From: Al Viro Subject: Re: [endianness bug] cxgb4: mk_act_open_req() buggers ->{local,peer}_ip on big-endian hosts Date: Fri, 17 Aug 2018 20:44:10 +0100 Message-ID: <20180817194410.GH6515@ZenIV.linux.org.uk> References: <20180805172238.GH15082@ZenIV.linux.org.uk> <20180806121459.GA5591@chelsio.com> <20180814212524.GT6515@ZenIV.linux.org.uk> <20180817130536.GA31768@chelsio.com> <20180817154320.GD6515@ZenIV.linux.org.uk> <20180817180949.GE6515@ZenIV.linux.org.uk> <20180817185841.GF6515@ZenIV.linux.org.uk> <20180817185944.GG6515@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Rahul Lakkireddy , David Miller , "netdev@vger.kernel.org" To: Ganesh Goudar Return-path: Received: from zeniv.linux.org.uk ([195.92.253.2]:55474 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725827AbeHQWsz (ORCPT ); Fri, 17 Aug 2018 18:48:55 -0400 Content-Disposition: inline In-Reply-To: <20180817185944.GG6515@ZenIV.linux.org.uk> Sender: netdev-owner@vger.kernel.org List-ID: On Fri, Aug 17, 2018 at 07:59:44PM +0100, Al Viro wrote: > On Fri, Aug 17, 2018 at 07:58:41PM +0100, Al Viro wrote: > > On Fri, Aug 17, 2018 at 07:09:49PM +0100, Al Viro wrote: > > > > > Re that code - are you sure it doesn't need le64_to_cpu(*src)? Because from what > > > I understand about PCI (which matches just fine to the comments in the same driver), > > > you probably do need that. Again, the only real way to find out is to test on > > > big-endian host... > > > > BTW, would that, by any chance, be an open-coded > > _iowrite64_copy(dst, src, EQ_UNIT/sizeof(u64)) > > __iowrite64_copy, even... FWIW, it looks like the confusion had been between the endianness of the data structures (b-e both on host and NIC side) and the fact that PCI is l-e. *IF* that code wants to copy data from host data structures to iomem as-is, it needs to use __raw_writeq() and its ilk or writeq(le64_to_cpu(...)) to compensate. The latter would, indeed, confuse sparse - we are accessing b-e data as if it was l-e. If we want copying that wouldn't affect the endianness, we need memcpy_toio() or similar beasts. And AFAICS that code is very close to /* If we're only writing a single Egress Unit and the BAR2 * Queue ID is 0, we can use the Write Combining Doorbell * Gather Buffer; otherwise we use the simple doorbell. */ if (n == 1 && tq->bar2_qid == 0) { unsigned int index = (tq->pidx ?: tq->size) - 1; /* Copy the TX Descriptor in a tight loop in order to * try to get it to the adapter in a single Write * Combined transfer on the PCI-E Bus. If the Write * Combine fails (say because of an interrupt, etc.) * the hardware will simply take the last write as a * simple doorbell write with a PIDX Increment of 1 * and will fetch the TX Descriptor from memory via * DMA. */ __iowrite64_copy(tq->bar2_addr + SGE_UDB_WCDOORBELL, &tq->desc[index], EQ_UNIT/sizeof(u64)) } else { writel(val | QID_V(tq->bar2_qid), tq->bar2_addr + SGE_UDB_KDOORBELL); } /* This Write Memory Barrier will force the write to the User * Doorbell area to be flushed. This is needed to prevent * writes on different CPUs for the same queue from hitting * the adapter out of order. This is required when some Work * Requests take the Write Combine Gather Buffer path (user * doorbell area offset [SGE_UDB_WCDOORBELL..+63]) and some * take the traditional path where we simply increment the * PIDX (User Doorbell area SGE_UDB_KDOORBELL) and have the * hardware DMA read the actual Work Request. */ wmb(); which wouldn't have looked unusual... Again, that really needs review from the folks familiar with the hardware in question, as well as testing - I'm not trying to push patches like that. If the current mainline variant really works on b-e, I'd like to understand how does it manage that, though...