Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v4 1/2] kernel.h: Introduce const_max() for VLA removal
From: Kees Cook @ 2018-03-15 23:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Josh Poimboeuf, Rasmus Villemoes, Randy Dunlap,
	Miguel Ojeda, Ingo Molnar, David Laight, Ian Abbott, linux-input,
	linux-btrfs, Network Development, Linux Kernel Mailing List,
	Kernel Hardening
In-Reply-To: <CA+55aFyRV9KXzeQZpVYsZYVUJm-ASgu_4_1+8Y8-0KH-YT2M8Q@mail.gmail.com>

On Thu, Mar 15, 2018 at 4:46 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> What I'm *not* so much ok with is "const_max(5,sizeof(x))" erroring
> out, or silently causing insane behavior due to hidden subtle type
> casts..

Yup! I like it as an explicit argument. Thanks!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply

* Re: [PATCH v2 12/15] ice: Add stats and ethtool support
From: Stephen Hemminger @ 2018-03-15 23:52 UTC (permalink / raw)
  To: Anirudh Venkataramanan
  Cc: intel-wired-lan, netdev, Andrew Lunn, Jakub Kicinski
In-Reply-To: <20180315234802.31336-13-anirudh.venkataramanan@intel.com>

On Thu, 15 Mar 2018 16:47:59 -0700
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> wrote:

> +
> +static const struct ice_stats ice_gstrings_vsi_stats[] = {
> +	ICE_VSI_STAT("tx_unicast", eth_stats.tx_unicast),
> +	ICE_VSI_STAT("rx_unicast", eth_stats.rx_unicast),
> +	ICE_VSI_STAT("tx_multicast", eth_stats.tx_multicast),
> +	ICE_VSI_STAT("rx_multicast", eth_stats.rx_multicast),
> +	ICE_VSI_STAT("tx_broadcast", eth_stats.tx_broadcast),
> +	ICE_VSI_STAT("rx_broadcast", eth_stats.rx_broadcast),
> +	ICE_VSI_STAT("tx_bytes", eth_stats.tx_bytes),
> +	ICE_VSI_STAT("rx_bytes", eth_stats.rx_bytes),
> +	ICE_VSI_STAT("rx_discards", eth_stats.rx_discards),
> +	ICE_VSI_STAT("tx_errors", eth_stats.tx_errors),
> +	ICE_VSI_STAT("tx_linearize", tx_linearize),
> +	ICE_VSI_STAT("rx_unknown_protocol", eth_stats.rx_unknown_protocol),
> +	ICE_VSI_STAT("rx_alloc_fail", rx_buf_failed),
> +	ICE_VSI_STAT("rx_pg_alloc_fail", rx_page_failed),
> +};
> +

Ignoring feedback from maintainers is unlikely to help get your driver adopted.

^ permalink raw reply

* Re: [PATCH] mlx5: Remove call to ida_pre_get
From: Saeed Mahameed @ 2018-03-15 23:58 UTC (permalink / raw)
  To: Matan Barak, Maor Gottlieb, leon@kernel.org, willy@infradead.org
  Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org
In-Reply-To: <20180315025724.GB9973@bombadil.infradead.org>

On Wed, 2018-03-14 at 19:57 -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> The mlx5 driver calls ida_pre_get() in a loop for no readily apparent
> reason.  The driver uses ida_simple_get() which will call
> ida_pre_get()
> by itself and there's no need to use ida_pre_get() unless using
> ida_get_new().
> 

Hi Matthew,

Is this is causing any issues ? or just a simple cleanup ?

Adding Maor, the author of this change,

I believe the idea is to speed up insert_fte (which calls
ida_simple_get) since insert_fte runs under the FTE write semaphore,
in this case if ida_pre_get was successful before taking the semaphore
for all the FTE nodes in the loop, this will be a huge win for
ida_simple_get which will immediately return success without even
trying to allocate.

so it is a best effort to speed up critical path.

Maor, if this is really the case and this is not causing any issues,
then we need to consider adding a comment.


> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
> b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
> index 10e16381f20a..3ba07c7096ef 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
> @@ -1647,7 +1647,6 @@ try_add_to_existing_fg(struct mlx5_flow_table
> *ft,
>  
>  	list_for_each_entry(iter, match_head, list) {
>  		nested_down_read_ref_node(&iter->g->node,
> FS_LOCK_PARENT);
> -		ida_pre_get(&iter->g->fte_allocator, GFP_KERNEL);
>  	}
>  
>  search_again_locked:
> 

^ permalink raw reply

* Re: [PATCH v6 0/6] staging: Introduce DPAA2 Ethernet Switch driver
From: Andrew Lunn @ 2018-03-16  0:15 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: devel, stuyoder, arnd, gregkh, alexandru.marginean, agraf,
	linux-kernel, Razvan Stefanescu, ioana.ciornei, netdev,
	laurentiu.tudor
In-Reply-To: <20180315105642.szn2zhrsgwsv35yf@mwanda>

On Thu, Mar 15, 2018 at 01:56:42PM +0300, Dan Carpenter wrote:
> On Thu, Mar 15, 2018 at 12:44:37AM +0100, Andrew Lunn wrote:
> > On Wed, Mar 14, 2018 at 10:55:52AM -0500, Razvan Stefanescu wrote:
> > > This patchset introduces the Ethernet Switch Driver for Freescale/NXP SoCs
> > > with DPAA2 (DataPath Acceleration Architecture v2). The driver manages
> > > switch objects discovered on the fsl-mc bus. A description of the driver
> > > can be found in the associated README file.
> > 
> > Hi Greg
> > 
> > This code has much better quality than the usual stuff in staging. I
> > see no reason not to merge it. 
> 
> Yeah.  It seems pretty decent.  Stuart, Laurentiu, care to comment?
> 
> Meanwhile, netdev and DaveM aren't even on the CC list and they're the
> ones to ultimately decide.

The patches are for staging, so it is GregKH who decides at this
point, not really DaveM.

       Andrew

^ permalink raw reply

* Re: [PATCH 6/7] e1000: eliminate duplicate barriers on weakly-ordered archs
From: Alexander Duyck @ 2018-03-16  0:25 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Netdev, Timur Tabi, sulrich, linux-arm-msm, linux-arm-kernel,
	Jeff Kirsher, intel-wired-lan, LKML
In-Reply-To: <39dc5bb4-02b1-bf7e-fbfc-17fc484e4fb7@codeaurora.org>

On Thu, Mar 15, 2018 at 4:30 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
> On 3/14/2018 9:41 PM, Alexander Duyck wrote:
>>>  }
>>>
>> So you missed the writel in e1000_xmit_frame. You should probably get
>> that one too while you are doing these updates. The wmb() is in
>> e1000_tx_queue().
>>
>
> I brought wmb() outside along with the next descriptor assignment to be
> similar to the rest of the other code.
>
> if wmb() and writel() are not visible in the same function, let's not touch
> the code.

Maybe for e1000 we should just skip the driver entirely. Odds are you
aren't going to have any e1000 parts running on ARM anyway since most
of them are legacy PCI or PCI-X parts that were made over 10 years
ago. Most of your efforts would probably be best spent on igb, igbvf,
ixgbe, ixgbevf, i40e, i40evf, and fm10k.

^ permalink raw reply

* Re: [PATCH net-next] net: ethernet: ti: cpsw: enable vlan rx vlan offload
From: Andrew Lunn @ 2018-03-16  0:29 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev, Sekhar Nori, linux-kernel, linux-omap
In-Reply-To: <20180315201550.21487-1-grygorii.strashko@ti.com>

On Thu, Mar 15, 2018 at 03:15:50PM -0500, Grygorii Strashko wrote:
> In VLAN_AWARE mode CPSW can insert VLAN header encapsulation word on Host
> port 0 egress (RX) before the packet data if RX_VLAN_ENCAP bit is set in
> CPSW_CONTROL register. VLAN header encapsulation word has following format:
> 
>  HDR_PKT_Priority bits 29-31 - Header Packet VLAN prio (Highest prio: 7)
>  HDR_PKT_CFI 	  bits 28 - Header Packet VLAN CFI bit.
>  HDR_PKT_Vid 	  bits 27-16 - Header Packet VLAN ID
>  PKT_Type         bits 8-9 - Packet Type. Indicates whether the packet is
>                  	VLAN-tagged, priority-tagged, or non-tagged.
> 	00: VLAN-tagged packet
> 	01: Reserved
> 	10: Priority-tagged packet
> 	11: Non-tagged packet
> 
> This feature can be used to implement TX VLAN offload in case of
> VLAN-tagged packets and to insert VLAN tag in case Non-tagged packet was
> received on port with PVID set. As per documentation, CPSW never modifies
> packet data on Host egress (RX) and as result, without this feature
> enabled, Host port will not be able to receive properly packets which
> entered switch non-tagged through external Port with PVID set (when
> non-tagged packet forwarded from external Port with PVID set to another
> external Port - packet will be VLAN tagged properly).

So, i think it is time to discuss the future of this driver. It should
really be replaced by a switchdev/DSA driver. There are plenty of
carrots for a new driver: Better statistics, working ethtool support
for all the PHYs, better user experience, etc. But maybe now it is
time for the stick. Should we Maintainers decide that no new features
should be added to the existing drivers, just bug fixes?

       Andrew

^ permalink raw reply

* Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
From: Daniel Borkmann @ 2018-03-16  0:37 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: John Fastabend, davem, ast, davejwatson, netdev
In-Reply-To: <20180315230605.vndmzwxso57puskx@ast-mbp>

On 03/16/2018 12:06 AM, Alexei Starovoitov wrote:
> On Thu, Mar 15, 2018 at 11:55:39PM +0100, Daniel Borkmann wrote:
>> On 03/15/2018 11:20 PM, Alexei Starovoitov wrote:
>>> On Thu, Mar 15, 2018 at 11:17:12PM +0100, Daniel Borkmann wrote:
>>>> On 03/15/2018 10:59 PM, Alexei Starovoitov wrote:
>>>>> On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
>>>>>>  
>>>>>> +/* User return codes for SK_MSG prog type. */
>>>>>> +enum sk_msg_action {
>>>>>> +	SK_MSG_DROP = 0,
>>>>>> +	SK_MSG_PASS,
>>>>>> +};
>>>>>
>>>>> do we really need new enum here?
>>>>> It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
>>>>> and there will be only drop/pass in both enums.
>>>>> Also I don't see where these two new SK_MSG_* are used...
>>>>>
>>>>>> +
>>>>>> +/* user accessible metadata for SK_MSG packet hook, new fields must
>>>>>> + * be added to the end of this structure
>>>>>> + */
>>>>>> +struct sk_msg_md {
>>>>>> +	__u32 data;
>>>>>> +	__u32 data_end;
>>>>>> +};
>>>>>
>>>>> I think it's time for me to ask for forgiveness :)
>>>>
>>>> :-)
>>>>
>>>>> I used __u32 for data and data_end only because all other fields
>>>>> in __sk_buff were __u32 at the time and I couldn't easily figure out
>>>>> how to teach verifier to recognize 8-byte rewrites.
>>>>> Unfortunately my mistake stuck and was copied over into xdp.
>>>>> Since this is new struct let's do it right and add
>>>>> 'void *data, *data_end' here,
>>>>> since bpf prog will use them as 'void *' pointers.
>>>>> There are no compat issues here, since bpf is always 64-bit.
>>>>
>>>> But at least offset-wise when you do the ctx rewrite this would then
>>>> be a bit more tricky when you have 64 bit kernel with 32 bit user
>>>> space since void * members are in each cases at different offset. So
>>>> unless I'm missing something, this still should either be __u32 or
>>>> __u64 instead of void *, no?
>>>
>>> there is no 32-bit user space. these structs are seen by bpf progs only
>>> and bpf is 64-bit only too.
>>> unless I'm missing your point.
>>
>> Ok, so lets say you have 32 bit LLVM binary and compile the prog where
>> you access md->data_end. Given the void * in the struct will that access
>> end up being BPF_W at ctx offset 4 or BPF_DW at ctx offset 8 from clang
>> perspective (iow, is the back end treating this special and always use
>> fixed BPF_DW in such case)? If not and it would be the first case with
>> offset 4, then we could have the case that underlying 64 bit kernel is
>> expecting ctx offset 8 for doing the md ctx conversion.
> 
> i'm still not quite following.
> Whether llvm itself is 32-bit binary or it's arm32 or sprac32 binary
> doesn't matter. It will produce the same 64-bit bpf code.
> It will see 'void *' deref from this struct and will emit DW.
> May be confusion is from newly added -mattr=+alu32 flag?
> That option doesn't change that sizeof(void*)==8.
> It only allows backend to emit 32-bit alu insns.

Ok, so conclusion we had is that while BPF target is unconditionally 64 bit,
it depends which clang front end you use for compilation wrt structs. E.g.
on 32 bit native (e.g. arm) clang front end it would compile the ctx void *
pointers as 4 byte while using clang -target bpf it would compile it as 8
byte. The native clang front end is needed in case of tracing when accessing
pt_regs for walking data structures, but not for networking use case, so
always using -target bpf there is proper way. Meaning there would be no
confusion on the void * since size will always be 8 regardless of underlying
arch being 32 or 64 bit or clang/llvm binary being 32 bit on 64 bit kernel.
Thus, sticking to void * would be fine, but definitely samples/sockmap/Makefile
must be fixed as well, such that people don't copy it wrongly.

Cheers,
Daniel

^ permalink raw reply

* Re: [PATCH net] net/sched: act_simple: don't leak 'index' in the error path
From: Cong Wang @ 2018-03-16  0:43 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Jamal Hadi Salim, Jiri Pirko, David S. Miller,
	Linux Kernel Network Developers
In-Reply-To: <1521067427.2750.40.camel@redhat.com>

On Wed, Mar 14, 2018 at 3:43 PM, Davide Caratti <dcaratti@redhat.com> wrote:
> hello Cong, thank you for reviewing this.
>
> On Wed, 2018-03-14 at 11:41 -0700, Cong Wang wrote:
>> On Tue, Mar 13, 2018 at 7:13 PM, Davide Caratti <dcaratti@redhat.com> wrote:
>>
>> Looks like we just need to replace the tcf_idr_cleanup() with
>> tcf_idr_release()? Which is also simpler.
>
> I just tried it on act_simple, and I can confirm: 'index' does not leak
> anymore if alloc_defdata() fails to kzalloc(), and then tcf_idr_release()
> is called in place of of tcf_idr_cleanup().

Good.

>
>> Looks like all other callers of tcf_idr_cleanup() need to be replaced too,
>> but I don't audit all of them...
>
> no problem, I can try to do that, it's not going to be a big series
> anyway.


Please audit all of them.


>
> while at it, I will also fix other spots where the same bug can be
> reproduced, even if tcf_idr_cleanup() is not there: for example, when
> tcf_vlan_init() fails allocating struct tcf_vlan_params *p,
>
> ASSERT_RTNL();
> p = kzalloc(sizeof(*p), GFP_KERNEL);
> if (!p) {
>         if (ovr)
>                 tcf_idr_release(*a, bind);
>         return -ENOMEM;
> }
>
> the followinng behavior can be observed:
>
> # tc actions flush action vlan
> # tc actions add action vlan pop index 5
> RTNETLINK answers: Cannot allocate memory
> We have an error talking to the kernel
> # tc actions add action vlan pop index 5
> RTNETLINK answers: No space left on device
> We have an error talking to the kernel
> # tc actions add action vlan pop index 5
> RTNETLINK answers: No space left on device
> We have an error talking to the kernel
>
> Probably testing the value of 'ovr' here is wrong, or maybe it's
> not enough: I will also verify what happens using 'replace'
> keyword instead of 'add'.

Please fix it separately if really needed, and it would be nicer
if you can add your test cases to tools/testing/selftests/tc-testing/.

Thanks!

^ permalink raw reply

* Re: WARNING: CPU: 3 PID: 0 at net/sched/sch_hfsc.c:1388 hfsc_dequeue+0x319/0x350 [sch_hfsc]
From: Cong Wang @ 2018-03-16  0:48 UTC (permalink / raw)
  To: Marco Berizzi; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <495487313.1094743.1521015026754@mail.libero.it>

On Wed, Mar 14, 2018 at 1:10 AM, Marco Berizzi <pupilla@libero.it> wrote:
>> Il 9 marzo 2018 alle 0.14 Cong Wang <xiyou.wangcong@gmail.com> ha scritto:
>>
>>
>> On Thu, Mar 8, 2018 at 8:02 AM, Marco Berizzi <pupilla@libero.it> wrote:
>> >> Marco Berizzi wrote:
>> >>
>> >>
>> >> Hello everyone,
>> >>
>> >> Yesterday I got this error on a slackware linux 4.16-rc4 system
>> >> running as a traffic shaping gateway and netfilter nat.
>> >> The error has been arisen after a partial ISP network outage,
>> >> so unfortunately it will not trivial for me to reproduce it again.
>> >
>> > Hello everyone,
>> >
>> > I'm getting this error twice/day, so fortunately I'm able to
>> > reproduce it.
>>
>> IIRC, there was a patch for this, but it got lost...
>>
>> I will take a look anyway.
>
> ok, thanks for the response. Let me know when there will be a patch
> available to test.

It has been reported here:
https://bugzilla.kernel.org/show_bug.cgi?id=109581

And there is a workaround from Konstantin:
https://patchwork.ozlabs.org/patch/803885/

Unfortunately I don't think that is a real fix, we probably need to
fix HFSC itself rather than just workaround the qlen==0. It is not
trivial since HFSC implementation is not easy to understand.
Maybe Jamal knows better than me.


Thanks

^ permalink raw reply

* Re: [PATCH net-next v3 0/7] ibmvnic: Update TX pool and TX routines
From: Thomas Falcon @ 2018-03-16  0:48 UTC (permalink / raw)
  To: netdev; +Cc: jallen, nfont, davem
In-Reply-To: <1521129763-21030-1-git-send-email-tlfalcon@linux.vnet.ibm.com>

On 03/15/2018 11:02 AM, Thomas Falcon wrote:
> This patch restructures the TX pool data structure and provides a
> separate TX pool array for TSO transmissions. This is already used
> in some way due to our unique DMA situation, namely that we cannot
> use single DMA mappings for packet data. Previously, both buffer
> arrays used the same pool entry. This restructuring allows for
> some additional cleanup in the driver code, especially in some
> places in the device transmit routine.
>
> In addition, it allows us to more easily track the consumer
> and producer indexes of a particular pool. This has been
> further improved by better tracking of in-use buffers to
> prevent possible data corruption in case an invalid buffer
> entry is used.
>
> v3: Forgot to update TX pool cleaning function to handle new data
> structures. Included 7th patch for that.
>
> v2: Fix typo in 3/6 commit subject line
>
> Thomas Falcon (7):
>   ibmvnic: Generalize TX pool structure
>   ibmvnic: Update and clean up reset TX pool routine
>   ibmvnic: Update release TX pool routine
>   ibmvnic: Update TX pool initialization routine
>   ibmvnic: Update TX and TX completion routines
>   ibmvnic: Improve TX buffer accounting
>   ibmvnic: Update TX pool cleaning routine
>
>  drivers/net/ethernet/ibm/ibmvnic.c | 275 +++++++++++++++++++++----------------
>  drivers/net/ethernet/ibm/ibmvnic.h |   8 +-
>  2 files changed, 160 insertions(+), 123 deletions(-)
>
Sorry again, I need to send another version because of a bug in the 7th patch.

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH v2 12/15] ice: Add stats and ethtool support
From: Alexander Duyck @ 2018-03-16  0:50 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Anirudh Venkataramanan, Jakub Kicinski, Netdev, intel-wired-lan,
	Andrew Lunn
In-Reply-To: <20180315165247.165ac10e@xeon-e3>

On Thu, Mar 15, 2018 at 4:52 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Thu, 15 Mar 2018 16:47:59 -0700
> Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> wrote:
>
>> +
>> +static const struct ice_stats ice_gstrings_vsi_stats[] = {
>> +     ICE_VSI_STAT("tx_unicast", eth_stats.tx_unicast),
>> +     ICE_VSI_STAT("rx_unicast", eth_stats.rx_unicast),
>> +     ICE_VSI_STAT("tx_multicast", eth_stats.tx_multicast),
>> +     ICE_VSI_STAT("rx_multicast", eth_stats.rx_multicast),
>> +     ICE_VSI_STAT("tx_broadcast", eth_stats.tx_broadcast),
>> +     ICE_VSI_STAT("rx_broadcast", eth_stats.rx_broadcast),
>> +     ICE_VSI_STAT("tx_bytes", eth_stats.tx_bytes),
>> +     ICE_VSI_STAT("rx_bytes", eth_stats.rx_bytes),
>> +     ICE_VSI_STAT("rx_discards", eth_stats.rx_discards),
>> +     ICE_VSI_STAT("tx_errors", eth_stats.tx_errors),
>> +     ICE_VSI_STAT("tx_linearize", tx_linearize),
>> +     ICE_VSI_STAT("rx_unknown_protocol", eth_stats.rx_unknown_protocol),
>> +     ICE_VSI_STAT("rx_alloc_fail", rx_buf_failed),
>> +     ICE_VSI_STAT("rx_pg_alloc_fail", rx_page_failed),
>> +};
>> +
>
> Ignoring feedback from maintainers is unlikely to help get your driver adopted.

Your feedback wasn't ignored, the netdev stats are gone. I double
checked and there was this in addition to the netdev stats before so I
think the suggestion to remove the netdev stats was just taken
literally.

The VSI is a slightly different entity from the netdev itself. A
netdev can be backed by a VSI in the case of the PF, but the VSI can
be used in other ways such as what we did in i40e where we were using
it to spawn queue groups to work with mqprio as a filter target and in
that case the queue groups wouldn't have a netdev directly associated
with them so in that case it might make sense to leave these as
separate stats.

- Alex

^ permalink raw reply

* Re: [PATCH 6/7] e1000: eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-16  0:50 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Netdev, Timur Tabi, sulrich, linux-arm-msm, linux-arm-kernel,
	Jeff Kirsher, intel-wired-lan, LKML
In-Reply-To: <CAKgT0UcfEwL9Jas149a=h3R4zeu4oA=uywJ9dpYD9kDdnpWQFA@mail.gmail.com>

On 3/15/2018 8:25 PM, Alexander Duyck wrote:
> On Thu, Mar 15, 2018 at 4:30 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
>> On 3/14/2018 9:41 PM, Alexander Duyck wrote:
>>>>  }
>>>>
>>> So you missed the writel in e1000_xmit_frame. You should probably get
>>> that one too while you are doing these updates. The wmb() is in
>>> e1000_tx_queue().
>>>
>>
>> I brought wmb() outside along with the next descriptor assignment to be
>> similar to the rest of the other code.
>>
>> if wmb() and writel() are not visible in the same function, let's not touch
>> the code.
> 
> Maybe for e1000 we should just skip the driver entirely. Odds are you
> aren't going to have any e1000 parts running on ARM anyway since most
> of them are legacy PCI or PCI-X parts that were made over 10 years
> ago. Most of your efforts would probably be best spent on igb, igbvf,
> ixgbe, ixgbevf, i40e, i40evf, and fm10k.
> 

Sure. I'll drop it.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply

* linux-next: manual merge of the net-next tree with the rdma-fixes tree
From: Stephen Rothwell @ 2018-03-16  0:56 UTC (permalink / raw)
  To: David Miller, Networking, Doug Ledford, Jason Gunthorpe
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, Mark Bloch,
	Leon Romanovsky

[-- Attachment #1: Type: text/plain, Size: 5754 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/infiniband/hw/mlx5/main.c

between commit:

  42cea83f9524 ("IB/mlx5: Fix cleanup order on unload")

from the rdma-fixes tree and commit:

  b5ca15ad7e61 ("IB/mlx5: Add proper representors support")

from the net-next tree.

I fixed it up (see below and the merge fix patch as well) and can
carry the fix as necessary. This is now fixed as far as linux-next is
concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Fri, 16 Mar 2018 11:54:01 +1100
Subject: [PATCH] IB/mlx5: merge fix for "Fix cleanup order on unload"

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 drivers/infiniband/hw/mlx5/ib_rep.c  | 6 +++---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 +--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c
index 61cc3d7db257..7fb997dadd80 100644
--- a/drivers/infiniband/hw/mlx5/ib_rep.c
+++ b/drivers/infiniband/hw/mlx5/ib_rep.c
@@ -33,9 +33,9 @@ static const struct mlx5_ib_profile rep_profile = {
 	STAGE_CREATE(MLX5_IB_STAGE_IB_REG,
 		     mlx5_ib_stage_ib_reg_init,
 		     mlx5_ib_stage_ib_reg_cleanup),
-	STAGE_CREATE(MLX5_IB_STAGE_UMR_RESOURCES,
-		     mlx5_ib_stage_umr_res_init,
-		     mlx5_ib_stage_umr_res_cleanup),
+	STAGE_CREATE(MLX5_IB_STAGE_POST_IB_REG_UMR,
+		     mlx5_ib_stage_post_ib_reg_umr_init,
+		     NULL),
 	STAGE_CREATE(MLX5_IB_STAGE_CLASS_ATTR,
 		     mlx5_ib_stage_class_attr_init,
 		     NULL),
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 7ec753ec7962..c45a7abdbe3e 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1071,8 +1071,7 @@ int mlx5_ib_stage_bfrag_init(struct mlx5_ib_dev *dev);
 void mlx5_ib_stage_bfrag_cleanup(struct mlx5_ib_dev *dev);
 int mlx5_ib_stage_ib_reg_init(struct mlx5_ib_dev *dev);
 void mlx5_ib_stage_ib_reg_cleanup(struct mlx5_ib_dev *dev);
-int mlx5_ib_stage_umr_res_init(struct mlx5_ib_dev *dev);
-void mlx5_ib_stage_umr_res_cleanup(struct mlx5_ib_dev *dev);
+int mlx5_ib_stage_post_ib_reg_umr_init(struct mlx5_ib_dev *dev);
 int mlx5_ib_stage_class_attr_init(struct mlx5_ib_dev *dev);
 void __mlx5_ib_remove(struct mlx5_ib_dev *dev,
 		      const struct mlx5_ib_profile *profile,
-- 
2.16.1

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/infiniband/hw/mlx5/main.c
index da091de4e69d,d9474b95d8e5..000000000000
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@@ -4860,19 -4999,19 +4996,19 @@@ int mlx5_ib_stage_ib_reg_init(struct ml
  	return ib_register_device(&dev->ib_dev, NULL);
  }
  
 -void mlx5_ib_stage_ib_reg_cleanup(struct mlx5_ib_dev *dev)
 +static void mlx5_ib_stage_pre_ib_reg_umr_cleanup(struct mlx5_ib_dev *dev)
  {
 -	ib_unregister_device(&dev->ib_dev);
 +	destroy_umrc_res(dev);
  }
  
- static void mlx5_ib_stage_ib_reg_cleanup(struct mlx5_ib_dev *dev)
 -int mlx5_ib_stage_umr_res_init(struct mlx5_ib_dev *dev)
++void mlx5_ib_stage_ib_reg_cleanup(struct mlx5_ib_dev *dev)
  {
 -	return create_umr_res(dev);
 +	ib_unregister_device(&dev->ib_dev);
  }
  
- static int mlx5_ib_stage_post_ib_reg_umr_init(struct mlx5_ib_dev *dev)
 -void mlx5_ib_stage_umr_res_cleanup(struct mlx5_ib_dev *dev)
++int mlx5_ib_stage_post_ib_reg_umr_init(struct mlx5_ib_dev *dev)
  {
 -	destroy_umrc_res(dev);
 +	return create_umr_res(dev);
  }
  
  static int mlx5_ib_stage_delay_drop_init(struct mlx5_ib_dev *dev)
@@@ -4999,6 -5144,48 +5144,48 @@@ static const struct mlx5_ib_profile pf_
  		     NULL),
  };
  
+ static const struct mlx5_ib_profile nic_rep_profile = {
+ 	STAGE_CREATE(MLX5_IB_STAGE_INIT,
+ 		     mlx5_ib_stage_init_init,
+ 		     mlx5_ib_stage_init_cleanup),
+ 	STAGE_CREATE(MLX5_IB_STAGE_FLOW_DB,
+ 		     mlx5_ib_stage_flow_db_init,
+ 		     mlx5_ib_stage_flow_db_cleanup),
+ 	STAGE_CREATE(MLX5_IB_STAGE_CAPS,
+ 		     mlx5_ib_stage_caps_init,
+ 		     NULL),
+ 	STAGE_CREATE(MLX5_IB_STAGE_NON_DEFAULT_CB,
+ 		     mlx5_ib_stage_rep_non_default_cb,
+ 		     NULL),
+ 	STAGE_CREATE(MLX5_IB_STAGE_ROCE,
+ 		     mlx5_ib_stage_rep_roce_init,
+ 		     mlx5_ib_stage_rep_roce_cleanup),
+ 	STAGE_CREATE(MLX5_IB_STAGE_DEVICE_RESOURCES,
+ 		     mlx5_ib_stage_dev_res_init,
+ 		     mlx5_ib_stage_dev_res_cleanup),
+ 	STAGE_CREATE(MLX5_IB_STAGE_COUNTERS,
+ 		     mlx5_ib_stage_counters_init,
+ 		     mlx5_ib_stage_counters_cleanup),
+ 	STAGE_CREATE(MLX5_IB_STAGE_UAR,
+ 		     mlx5_ib_stage_uar_init,
+ 		     mlx5_ib_stage_uar_cleanup),
+ 	STAGE_CREATE(MLX5_IB_STAGE_BFREG,
+ 		     mlx5_ib_stage_bfrag_init,
+ 		     mlx5_ib_stage_bfrag_cleanup),
+ 	STAGE_CREATE(MLX5_IB_STAGE_IB_REG,
+ 		     mlx5_ib_stage_ib_reg_init,
+ 		     mlx5_ib_stage_ib_reg_cleanup),
 -	STAGE_CREATE(MLX5_IB_STAGE_UMR_RESOURCES,
 -		     mlx5_ib_stage_umr_res_init,
 -		     mlx5_ib_stage_umr_res_cleanup),
++	STAGE_CREATE(MLX5_IB_STAGE_POST_IB_REG_UMR,
++		     mlx5_ib_stage_post_ib_reg_umr_init,
++		     NULL),
+ 	STAGE_CREATE(MLX5_IB_STAGE_CLASS_ATTR,
+ 		     mlx5_ib_stage_class_attr_init,
+ 		     NULL),
+ 	STAGE_CREATE(MLX5_IB_STAGE_REP_REG,
+ 		     mlx5_ib_stage_rep_reg_init,
+ 		     mlx5_ib_stage_rep_reg_cleanup),
+ };
+ 
  static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)
  {
  	struct mlx5_ib_multiport_info *mpi;

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related

* [PATCH v2 0/6] Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-16  1:04 UTC (permalink / raw)
  To: netdev, timur, sulrich; +Cc: Sinan Kaya, linux-arm-msm, linux-arm-kernel

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

I did a regex search for wmb() followed by writel() in each drivers
directory.
I scrubbed the ones I care about and posted this series. Note also that
I have one Infiniband patch in the series.

I considered "ease of change", "popular usage" and "performance critical
path" as the determining criteria for my filtering.

We used relaxed API heavily on ARM for a long time but
it did not exist on other architectures. For this reason, relaxed
architectures have been paying double penalty in order to use the common
drivers.

Now that relaxed API is present on all architectures, we can go and scrub
all drivers to see what needs to change and what can remain.

We start with mostly used ones and hope to increase the coverage over time.
It will take a while to cover all drivers.

Changes since v1:

i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
missed writel calls in:
i40e:
  i40e_program_fdir_filter
  i40e_clean_rx_irq
  i40e_tx_map
i40evf:
  i40e_clean_rx_irq
  i40e_tx_map

ixgbe: eliminate duplicate barriers on weakly-ordered archs
missed the writel at the end of ixgbe_tx_map

RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs
dropped since applied

igbvf: eliminate duplicate barriers on weakly-ordered archs
missed the writel at the end of igbvf_tx_queue_adv()

igb: eliminate duplicate barriers on weakly-ordered archs
missed the writel at the end of igb_tx_map()

e1000: eliminate duplicate barriers on weakly-ordered archs
dropped

ixgbevf: eliminate duplicate barriers on weakly-ordered archs
split into two and remove extra barrier.

Sinan Kaya (6):
  i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
  ixgbe: eliminate duplicate barriers on weakly-ordered archs
  igbvf: eliminate duplicate barriers on weakly-ordered archs
  igb: eliminate duplicate barriers on weakly-ordered archs
  ixgbevf: keep writel() closer to wmb()
  ixgbevf: eliminate duplicate barriers on weakly-ordered archs

 drivers/net/ethernet/intel/i40e/i40e_txrx.c       | 8 ++++----
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c     | 4 ++--
 drivers/net/ethernet/intel/igb/igb_main.c         | 4 ++--
 drivers/net/ethernet/intel/igbvf/netdev.c         | 4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     | 8 ++++----
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      | 5 -----
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
 7 files changed, 16 insertions(+), 21 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH v2 1/6] i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-16  1:04 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Jeff Kirsher,
	intel-wired-lan, linux-kernel
In-Reply-To: <1521162296-19729-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 8 ++++----
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index e554aa6cf..9455869 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -185,7 +185,7 @@ static int i40e_program_fdir_filter(struct i40e_fdir_filter *fdir_data,
 	/* Mark the data descriptor to be watched */
 	first->next_to_watch = tx_desc;
 
-	writel(tx_ring->next_to_use, tx_ring->tail);
+	writel_relaxed(tx_ring->next_to_use, tx_ring->tail);
 	return 0;
 
 dma_fail:
@@ -1375,7 +1375,7 @@ static inline void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
 	 * such as IA-64).
 	 */
 	wmb();
-	writel(val, rx_ring->tail);
+	writel_relaxed(val, rx_ring->tail);
 }
 
 /**
@@ -2258,7 +2258,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 		 */
 		wmb();
 
-		writel(xdp_ring->next_to_use, xdp_ring->tail);
+		writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
 	}
 
 	rx_ring->skb = skb;
@@ -3286,7 +3286,7 @@ static inline int i40e_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
 
 	/* notify HW of packet */
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 357d605..56eea20 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -667,7 +667,7 @@ static inline void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
 	 * such as IA-64).
 	 */
 	wmb();
-	writel(val, rx_ring->tail);
+	writel_relaxed(val, rx_ring->tail);
 }
 
 /**
@@ -2243,7 +2243,7 @@ static inline void i40evf_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
 
 	/* notify HW of packet */
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 2/6] ixgbe: eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-16  1:04 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Jeff Kirsher,
	intel-wired-lan, linux-kernel
In-Reply-To: <1521162296-19729-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0da5aa2..58ed70f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1692,7 +1692,7 @@ void ixgbe_alloc_rx_buffers(struct ixgbe_ring *rx_ring, u16 cleaned_count)
 		 * such as IA-64).
 		 */
 		wmb();
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
 	}
 }
 
@@ -2453,7 +2453,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 		 * know there are new descriptors to fetch.
 		 */
 		wmb();
-		writel(ring->next_to_use, ring->tail);
+		writel_relaxed(ring->next_to_use, ring->tail);
 
 		xdp_do_flush_map();
 	}
@@ -8078,7 +8078,7 @@ static int ixgbe_tx_map(struct ixgbe_ring *tx_ring,
 	ixgbe_maybe_stop_tx(tx_ring, DESC_NEEDED);
 
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
@@ -10014,7 +10014,7 @@ static void ixgbe_xdp_flush(struct net_device *dev)
 	 * are new descriptors to fetch.
 	 */
 	wmb();
-	writel(ring->next_to_use, ring->tail);
+	writel_relaxed(ring->next_to_use, ring->tail);
 
 	return;
 }
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 3/6] igbvf: eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-16  1:04 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Jeff Kirsher,
	intel-wired-lan, linux-kernel
In-Reply-To: <1521162296-19729-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/igbvf/netdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index 4214c15..edb1c34 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -251,7 +251,7 @@ static void igbvf_alloc_rx_buffers(struct igbvf_ring *rx_ring,
 		 * such as IA-64).
 		*/
 		wmb();
-		writel(i, adapter->hw.hw_addr + rx_ring->tail);
+		writel_relaxed(i, adapter->hw.hw_addr + rx_ring->tail);
 	}
 }
 
@@ -2297,7 +2297,7 @@ static inline void igbvf_tx_queue_adv(struct igbvf_adapter *adapter,
 
 	tx_ring->buffer_info[first].next_to_watch = tx_desc;
 	tx_ring->next_to_use = i;
-	writel(i, adapter->hw.hw_addr + tx_ring->tail);
+	writel_relaxed(i, adapter->hw.hw_addr + tx_ring->tail);
 	/* we need this if more than one processor can write to our tail
 	 * at a time, it synchronizes IO on IA64/Altix systems
 	 */
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 4/6] igb: eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-16  1:04 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Jeff Kirsher,
	intel-wired-lan, linux-kernel
In-Reply-To: <1521162296-19729-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index b88fae7..82aea92 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -5671,7 +5671,7 @@ static int igb_tx_map(struct igb_ring *tx_ring,
 	igb_maybe_stop_tx(tx_ring, DESC_NEEDED);
 
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
@@ -8072,7 +8072,7 @@ void igb_alloc_rx_buffers(struct igb_ring *rx_ring, u16 cleaned_count)
 		 * such as IA-64).
 		 */
 		wmb();
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
 	}
 }
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 5/6] ixgbevf: keep writel() closer to wmb()
From: Sinan Kaya @ 2018-03-16  1:04 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Jeff Kirsher,
	intel-wired-lan, linux-kernel
In-Reply-To: <1521162296-19729-1-git-send-email-okaya@codeaurora.org>

Remove ixgbevf_write_tail() in favor of moving writel() close to
wmb().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      | 5 -----
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index f695242..11e893e 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -244,11 +244,6 @@ static inline u16 ixgbevf_desc_unused(struct ixgbevf_ring *ring)
 	return ((ntc > ntu) ? 0 : ring->count) + ntc - ntu - 1;
 }
 
-static inline void ixgbevf_write_tail(struct ixgbevf_ring *ring, u32 value)
-{
-	writel(value, ring->tail);
-}
-
 #define IXGBEVF_RX_DESC(R, i)	\
 	(&(((union ixgbe_adv_rx_desc *)((R)->desc))[i]))
 #define IXGBEVF_TX_DESC(R, i)	\
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 9b3d43d..b65f691 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -659,7 +659,7 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 		 * such as IA-64).
 		 */
 		wmb();
-		ixgbevf_write_tail(rx_ring, i);
+		writel(i, rx_ring->tail);
 	}
 }
 
@@ -3644,7 +3644,7 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 	tx_ring->next_to_use = i;
 
 	/* notify HW of packet */
-	ixgbevf_write_tail(tx_ring, i);
+	writel(value, tx_ring->tail);
 
 	return;
 dma_error:
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 6/6] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-16  1:04 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Jeff Kirsher,
	intel-wired-lan, linux-kernel
In-Reply-To: <1521162296-19729-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index b65f691..9e2e0fd 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -3644,7 +3644,7 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 	tx_ring->next_to_use = i;
 
 	/* notify HW of packet */
-	writel(value, tx_ring->tail);
+	writel_relaxed(value, tx_ring->tail);
 
 	return;
 dma_error:
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH v2 5/6] ixgbevf: keep writel() closer to wmb()
From: Sinan Kaya @ 2018-03-16  1:13 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Jeff Kirsher, intel-wired-lan,
	linux-kernel
In-Reply-To: <1521162296-19729-6-git-send-email-okaya@codeaurora.org>

On 3/15/2018 9:04 PM, Sinan Kaya wrote:
>  	/* notify HW of packet */
> -	ixgbevf_write_tail(tx_ring, i);
> +	writel(value, tx_ring->tail);
>  

oops. copy paste mistake. 

I'll hold onto posting v3 until i hear more feedback.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply

* Re: linux-next: manual merge of the net-next tree with the rdma-fixes tree
From: Doug Ledford @ 2018-03-16  1:18 UTC (permalink / raw)
  To: Stephen Rothwell, David Miller, Networking, Jason Gunthorpe
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, Mark Bloch,
	Leon Romanovsky
In-Reply-To: <20180316115610.3d7f232a@canb.auug.org.au>

[-- Attachment #1: Type: text/plain, Size: 4229 bytes --]

On Fri, 2018-03-16 at 11:56 +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the net-next tree got a conflict in:
> 
>   drivers/infiniband/hw/mlx5/main.c
> 
> between commit:
> 
>   42cea83f9524 ("IB/mlx5: Fix cleanup order on unload")
> 
> from the rdma-fixes tree and commit:
> 
>   b5ca15ad7e61 ("IB/mlx5: Add proper representors support")
> 
> from the net-next tree.

We are aware of the merge conflict.  This is a result of the fact that
code had been submitted to the for-next area (the representors support)
and after that an issue was found by the syzkaller bot that deserved rc
fix status and which conflicted.  The fixup you list below is
insufficient to fix the merge conflict.  The full fixup can be found in
the rdma tree from where I merged the for-rc branch into the for-next
branch and created a complete fixup of the merge conflict.  The problem
is that one patch change the device init stage flow, while the other
patch duplicates the normal device init stage flow to the representor
device stage flow.  To resolve the fix, you not only have to resolve the
contextual diffs, but you have to duplicate the changes to the normal
device stage flow into the representor device stage flow.  It is very
far from a trivial merge.  We were planning on talking to Dave about
this issue tomorrow, but you beat us to raising the issue ;-).

Here's the commit (from the rdma git repo) with the proper merge fix
(although it also has other minor merge stuff that needs to be ignored):

2d873449a202 (Merge branch 'k.o/wip/dl-for-rc' into k.o/wip/dl-for-next)

> I fixed it up (see below and the merge fix patch as well) and can
> carry the fix as necessary. This is now fixed as far as linux-next is
> concerned, but any non trivial conflicts should be mentioned to your
> upstream maintainer when your tree is submitted for merging.  You may
> also want to consider cooperating with the maintainer of the conflicting
> tree to minimise any particularly complex conflicts.
> 
> From: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Fri, 16 Mar 2018 11:54:01 +1100
> Subject: [PATCH] IB/mlx5: merge fix for "Fix cleanup order on unload"
> 
> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
> ---
>  drivers/infiniband/hw/mlx5/ib_rep.c  | 6 +++---
>  drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 +--
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c
> index 61cc3d7db257..7fb997dadd80 100644
> --- a/drivers/infiniband/hw/mlx5/ib_rep.c
> +++ b/drivers/infiniband/hw/mlx5/ib_rep.c
> @@ -33,9 +33,9 @@ static const struct mlx5_ib_profile rep_profile = {
>  	STAGE_CREATE(MLX5_IB_STAGE_IB_REG,
>  		     mlx5_ib_stage_ib_reg_init,
>  		     mlx5_ib_stage_ib_reg_cleanup),
> -	STAGE_CREATE(MLX5_IB_STAGE_UMR_RESOURCES,
> -		     mlx5_ib_stage_umr_res_init,
> -		     mlx5_ib_stage_umr_res_cleanup),
> +	STAGE_CREATE(MLX5_IB_STAGE_POST_IB_REG_UMR,
> +		     mlx5_ib_stage_post_ib_reg_umr_init,
> +		     NULL),
>  	STAGE_CREATE(MLX5_IB_STAGE_CLASS_ATTR,
>  		     mlx5_ib_stage_class_attr_init,
>  		     NULL),
> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> index 7ec753ec7962..c45a7abdbe3e 100644
> --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> @@ -1071,8 +1071,7 @@ int mlx5_ib_stage_bfrag_init(struct mlx5_ib_dev *dev);
>  void mlx5_ib_stage_bfrag_cleanup(struct mlx5_ib_dev *dev);
>  int mlx5_ib_stage_ib_reg_init(struct mlx5_ib_dev *dev);
>  void mlx5_ib_stage_ib_reg_cleanup(struct mlx5_ib_dev *dev);
> -int mlx5_ib_stage_umr_res_init(struct mlx5_ib_dev *dev);
> -void mlx5_ib_stage_umr_res_cleanup(struct mlx5_ib_dev *dev);
> +int mlx5_ib_stage_post_ib_reg_umr_init(struct mlx5_ib_dev *dev);
>  int mlx5_ib_stage_class_attr_init(struct mlx5_ib_dev *dev);
>  void __mlx5_ib_remove(struct mlx5_ib_dev *dev,
>  		      const struct mlx5_ib_profile *profile,
> -- 
> 2.16.1
> 

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH net-next v4 0/7] ibmvnic: Update TX pool and TX routines
From: Thomas Falcon @ 2018-03-16  1:20 UTC (permalink / raw)
  To: netdev; +Cc: jallen, nfont, Thomas Falcon

This patch restructures the TX pool data structure and provides a
separate TX pool array for TSO transmissions. This is already used
in some way due to our unique DMA situation, namely that we cannot
use single DMA mappings for packet data. Previously, both buffer
arrays used the same pool entry. This restructuring allows for
some additional cleanup in the driver code, especially in some
places in the device transmit routine.

In addition, it allows us to more easily track the consumer
and producer indexes of a particular pool. This has been
further improved by better tracking of in-use buffers to
prevent possible data corruption in case an invalid buffer
entry is used.

v4: Fix error in 7th patch that causes an oops by using
    the older fixed value for number of buffers instead
    of the respective field in the tx pool data structure

v3: Forgot to update TX pool cleaning function to handle new data
    structures. Included 7th patch for that.

v2: Fix typo in 3/6 commit subject line

Thomas Falcon (7):
  ibmvnic: Generalize TX pool structure
  ibmvnic: Update and clean up reset TX pool routine
  ibmvnic: Update release TX pool routine
  ibmvnic: Update TX pool initialization routine
  ibmvnic: Update TX and TX completion routines
  ibmvnic: Improve TX buffer accounting
  ibmvnic: Update TX pool cleaning routine

 drivers/net/ethernet/ibm/ibmvnic.c | 275 +++++++++++++++++++++----------------
 drivers/net/ethernet/ibm/ibmvnic.h |   8 +-
 2 files changed, 160 insertions(+), 123 deletions(-)

-- 
2.15.0

^ permalink raw reply

* [PATCH net-next v4 1/7] ibmvnic: Generalize TX pool structure
From: Thomas Falcon @ 2018-03-16  1:20 UTC (permalink / raw)
  To: netdev; +Cc: jallen, nfont, Thomas Falcon
In-Reply-To: <1521163223-11478-1-git-send-email-tlfalcon@linux.vnet.ibm.com>

Remove some unused fields in the structure and include values
describing the individual buffer size and number of buffers in
a TX pool. This allows us to use these fields for TX pool buffer
accounting as opposed to using hard coded values. Finally, split
TSO buffers out and provide an additional TX pool array for TSO.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 099c89d49945..a2e21b39074f 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -917,11 +917,9 @@ struct ibmvnic_tx_pool {
 	int *free_map;
 	int consumer_index;
 	int producer_index;
-	wait_queue_head_t ibmvnic_tx_comp_q;
-	struct task_struct *work_thread;
 	struct ibmvnic_long_term_buff long_term_buff;
-	struct ibmvnic_long_term_buff tso_ltb;
-	int tso_index;
+	int num_buffers;
+	int buf_size;
 };
 
 struct ibmvnic_rx_buff {
@@ -1044,6 +1042,7 @@ struct ibmvnic_adapter {
 	u64 promisc;
 
 	struct ibmvnic_tx_pool *tx_pool;
+	struct ibmvnic_tx_pool *tso_pool;
 	struct completion init_done;
 	int init_done_rc;
 
-- 
2.15.0

^ permalink raw reply related

* [PATCH net-next v4 2/7] ibmvnic: Update and clean up reset TX pool routine
From: Thomas Falcon @ 2018-03-16  1:20 UTC (permalink / raw)
  To: netdev; +Cc: jallen, nfont, Thomas Falcon
In-Reply-To: <1521163223-11478-1-git-send-email-tlfalcon@linux.vnet.ibm.com>

Update TX pool reset routine to accommodate new TSO pool array. Introduce
a function that resets one TX pool, and use that function to initialize
each pool in both pool arrays.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 45 +++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 9c7d19c926f9..4dc304422ece 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -557,36 +557,41 @@ static int init_rx_pools(struct net_device *netdev)
 	return 0;
 }
 
+static int reset_one_tx_pool(struct ibmvnic_adapter *adapter,
+			     struct ibmvnic_tx_pool *tx_pool)
+{
+	int rc, i;
+
+	rc = reset_long_term_buff(adapter, &tx_pool->long_term_buff);
+	if (rc)
+		return rc;
+
+	memset(tx_pool->tx_buff, 0,
+	       tx_pool->num_buffers *
+	       sizeof(struct ibmvnic_tx_buff));
+
+	for (i = 0; i < tx_pool->num_buffers; i++)
+		tx_pool->free_map[i] = i;
+
+	tx_pool->consumer_index = 0;
+	tx_pool->producer_index = 0;
+
+	return 0;
+}
+
 static int reset_tx_pools(struct ibmvnic_adapter *adapter)
 {
-	struct ibmvnic_tx_pool *tx_pool;
 	int tx_scrqs;
-	int i, j, rc;
+	int i, rc;
 
 	tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs);
 	for (i = 0; i < tx_scrqs; i++) {
-		netdev_dbg(adapter->netdev, "Re-setting tx_pool[%d]\n", i);
-
-		tx_pool = &adapter->tx_pool[i];
-
-		rc = reset_long_term_buff(adapter, &tx_pool->long_term_buff);
+		rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]);
 		if (rc)
 			return rc;
-
-		rc = reset_long_term_buff(adapter, &tx_pool->tso_ltb);
+		rc = reset_one_tx_pool(adapter, &adapter->tx_pool[i]);
 		if (rc)
 			return rc;
-
-		memset(tx_pool->tx_buff, 0,
-		       adapter->req_tx_entries_per_subcrq *
-		       sizeof(struct ibmvnic_tx_buff));
-
-		for (j = 0; j < adapter->req_tx_entries_per_subcrq; j++)
-			tx_pool->free_map[j] = j;
-
-		tx_pool->consumer_index = 0;
-		tx_pool->producer_index = 0;
-		tx_pool->tso_index = 0;
 	}
 
 	return 0;
-- 
2.15.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox