Netdev List


* Re: [RFC PATCH v2 06/10] udp: cope with UDP GRO packet misdirection
From: Steffen Klassert @ 2018-10-23 10:29 UTC
  To: Paolo Abeni; +Cc: netdev, Willem de Bruijn
In-Reply-To: <38816b5568eab473cc21e14ad58e51e6c04170f2.camel@redhat.com>

On Mon, Oct 22, 2018 at 02:51:56PM +0200, Paolo Abeni wrote:
> > > +
> > > +static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> > > +{
> > > +	struct sk_buff *next, *segs;
> > > +	int ret;
> > > +
> > > +	if (likely(!udp_unexpected_gso(sk, skb)))
> > > +		return udp_queue_rcv_one_skb(sk, skb);
> > > +
> > > +	BUILD_BUG_ON(sizeof(struct udp_skb_cb) > SKB_SGO_CB_OFFSET);
> > > +	__skb_push(skb, -skb_mac_offset(skb));
> > > +	segs = udp_rcv_segment(sk, skb);
> > > +	for (skb = segs; skb; skb = next) {
> > > +		next = skb->next;
> > > +		__skb_pull(skb, skb_transport_offset(skb));
> > > +		ret = udp_queue_rcv_one_skb(sk, skb);
> > 
> > udp_queue_rcv_one_skb() starts with doing a xfrm4_policy_check().
> > Maybe we can do this on the GSO packet instead of the segments.
> > So far this code is just for handling a corner case, but this might
> > change.
> 
> I thought about keeping the policy check here, but then I preferred
> what looked the safest option. Perhaps we can improve with a follow-up?

Fair enough. Let's keep it in mind and do it later.
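
For readers following the thread, the follow-up idea (one policy check on the aggregate, then per-segment queueing) can be sketched roughly as below. This is a user-space illustration only, not kernel code: `struct pkt`, `policy_ok()` and `queue_one()` are hypothetical stand-ins for `struct sk_buff`, the xfrm policy check and `udp_queue_rcv_one_skb()`, and unlinking each segment before handing it over is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: only list linkage and length. */
struct pkt {
	struct pkt *next;
	int len;
	int allowed;	/* what the (hypothetical) policy check would decide */
};

static int policy_checks;	/* counts how often the policy check ran */

/* Hypothetical policy check, standing in for xfrm4_policy_check(). */
static int policy_ok(const struct pkt *p)
{
	policy_checks++;
	return p->allowed;
}

/* Hypothetical per-segment consumer; takes ownership of one segment. */
static int queue_one(struct pkt *p, int *queued_bytes)
{
	assert(p->next == NULL);	/* segments are handed over unlinked */
	*queued_bytes += p->len;
	return 0;
}

/* Check policy once on the aggregate, then queue every segment.
 * Returns the number of segments queued.
 */
static int queue_segments(const struct pkt *gso, struct pkt *segs,
			  int *queued_bytes)
{
	struct pkt *p, *next;
	int queued = 0;

	if (!policy_ok(gso))
		return 0;

	for (p = segs; p; p = next) {
		next = p->next;	/* save before the consumer owns p */
		p->next = NULL;
		if (queue_one(p, queued_bytes) == 0)
			queued++;
	}
	return queued;
}
```

The point of the sketch is only that the check count stays at one regardless of how many segments the aggregate splits into; whether that is safe for the real xfrm path is exactly the open question in the thread.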


* Re: [PATCH net-next 3/4] net: phy-c45: Implement reset/suspend/resume callbacks
From: Jose Abreu @ 2018-10-23 10:28 UTC
  To: Russell King - ARM Linux, Jose Abreu
  Cc: Florian Fainelli, Andrew Lunn, netdev, David S. Miller,
	Joao Pinto
In-Reply-To: <20181023102023.GM30658@n2100.armlinux.org.uk>

On 23-10-2018 11:20, Russell King - ARM Linux wrote:
> On Tue, Oct 23, 2018 at 11:17:50AM +0100, Jose Abreu wrote:
>> On 22-10-2018 18:13, Florian Fainelli wrote:
>>> On 10/22/18 8:48 AM, Russell King - ARM Linux wrote:
>>>> On Mon, Oct 22, 2018 at 01:47:48PM +0100, Jose Abreu wrote:
>>>>> Hello,
>>>>>
>>>>> On 22-10-2018 13:28, Andrew Lunn wrote:
>>>>>>>  EXPORT_SYMBOL_GPL(gen10g_resume);
>>>>>>> @@ -327,7 +381,7 @@ struct phy_driver genphy_10g_driver = {
>>>>>>>  	.phy_id         = 0xffffffff,
>>>>>>>  	.phy_id_mask    = 0xffffffff,
>>>>>>>  	.name           = "Generic 10G PHY",
>>>>>>> -	.soft_reset	= gen10g_no_soft_reset,
>>>>>>> +	.soft_reset	= gen10g_soft_reset,
>>>>>>>  	.config_init    = gen10g_config_init,
>>>>>>>  	.features       = 0,
>>>>>>>  	.aneg_done	= genphy_c45_aneg_done,
>>>>>> Hi Jose
>>>>>>
>>>>>> You need to be careful here. There is a reason this is called
>>>>>> gen10g_no_soft_reset, rather than having an empty
>>>>>> gen10g_soft_reset. Some PHYs break when you do a reset.  So adding a
>>>>>> gen10g_soft_reset is fine, but don't change this here, without first
>>>>>> understanding the history, and talking to Russell King.
>>>>> Hmm, the reset function only interacts with standard PCS
>>>>> registers, which should always be available ...
>>>>>
>>>>> From my tests I need to do at least 1 reset during power-up so in
>>>>> ultimate case I can add a feature quirk or similar.
>>>>>
>>>>> Russell, can you please comment ?
>>>> Setting the reset bit on 88x3310 causes the entire device to become
>>>> completely inaccessible until hardware reset.  Therefore, this bit
>>>> must _never_ be set for these devices.  That said, we have a separate
>>>> driver for these PHYs, but that will only be used for them if it's
>>>> present in the kernel.  If we accidentally fall back to the generic
>>>> driver, then we'll screw the 88x3310 until a full hardware reset.
>>>>
>>>> We also have a bunch of net devices that make use of this crippled
>>>> "generic" 10G support - we don't know whether resetting the PHY
>>>> for those systems will cause a regression - maybe board firmware
>>>> already configured the PHY?  I can't say either way on that, except
>>>> that we've had crippled 10G support in PHYLIB for a number of years
>>>> now _with_ users, and adding reset support drastically changes the
>>>> subsystem's behaviour for these users.
>>>>
>>>> I would recommend not touching the generic 10G driver, but instead
>>>> implement your own driver for your PHY to avoid causing regressions.
>>>>
>>> Agreed.
>> What about .suspend / .resume ?
> I have no idea what you're proposing there - your patches weren't copied
> to me.
>

They just set / unset the MDIO_CTRL1_LPOWER bit in the PCS. I find that
without this the remote end doesn't detect that the link is down ...
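
A minimal user-space sketch of that set / unset operation, for illustration only. The bit values are the ones linux/mdio.h aliases from BMCR_PDOWN and BMCR_RESET; `struct fake_pcs` and the `pcs_*()` helpers are hypothetical and stand in for real MDIO accesses, so this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as BMCR_PDOWN / BMCR_RESET, which linux/mdio.h reuses. */
#define MDIO_CTRL1_LPOWER	0x0800u
#define MDIO_CTRL1_RESET	0x8000u

/* Hypothetical in-memory stand-in for the PCS control 1 register. */
struct fake_pcs {
	uint16_t ctrl1;
};

/* Read-modify-write: clear the bits in mask, then set the bits in set. */
static void pcs_modify(struct fake_pcs *pcs, uint16_t mask, uint16_t set)
{
	assert((set & ~mask) == 0);	/* only bits inside the mask may be set */
	pcs->ctrl1 = (uint16_t)((pcs->ctrl1 & ~mask) | set);
}

/* Suspend: enter low-power mode, leaving every other bit untouched. */
static void pcs_suspend(struct fake_pcs *pcs)
{
	pcs_modify(pcs, MDIO_CTRL1_LPOWER, MDIO_CTRL1_LPOWER);
}

/* Resume: leave low-power mode; the reset bit is never written. */
static void pcs_resume(struct fake_pcs *pcs)
{
	pcs_modify(pcs, MDIO_CTRL1_LPOWER, 0);
}
```

Note that neither helper touches MDIO_CTRL1_RESET, which is the bit the rest of this thread is concerned about.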

If that's okay for the Generic 10G driver, I can submit only this and
manually reset the PHY in the stmmac driver, so that I don't need to
implement a custom PHY driver ...

BTW, I just found out that the Generic 10G driver is currently broken
without patch 4/4 of this series [1]

[1] https://patchwork.ozlabs.org/patch/987570/

Thanks and Best Regards,
Jose Miguel Abreu


* Re: [PATCH net-next 3/4] net: phy-c45: Implement reset/suspend/resume callbacks
From: Russell King - ARM Linux @ 2018-10-23 10:20 UTC
  To: Jose Abreu
  Cc: Florian Fainelli, Andrew Lunn, netdev, David S. Miller,
	Joao Pinto
In-Reply-To: <943a6194-001a-d6a0-84ba-b93b728ce64b@synopsys.com>

On Tue, Oct 23, 2018 at 11:17:50AM +0100, Jose Abreu wrote:
> On 22-10-2018 18:13, Florian Fainelli wrote:
> > On 10/22/18 8:48 AM, Russell King - ARM Linux wrote:
> >> On Mon, Oct 22, 2018 at 01:47:48PM +0100, Jose Abreu wrote:
> >>> Hello,
> >>>
> >>> On 22-10-2018 13:28, Andrew Lunn wrote:
> >>>>>  EXPORT_SYMBOL_GPL(gen10g_resume);
> >>>>> @@ -327,7 +381,7 @@ struct phy_driver genphy_10g_driver = {
> >>>>>  	.phy_id         = 0xffffffff,
> >>>>>  	.phy_id_mask    = 0xffffffff,
> >>>>>  	.name           = "Generic 10G PHY",
> >>>>> -	.soft_reset	= gen10g_no_soft_reset,
> >>>>> +	.soft_reset	= gen10g_soft_reset,
> >>>>>  	.config_init    = gen10g_config_init,
> >>>>>  	.features       = 0,
> >>>>>  	.aneg_done	= genphy_c45_aneg_done,
> >>>> Hi Jose
> >>>>
> >>>> You need to be careful here. There is a reason this is called
> >>>> gen10g_no_soft_reset, rather than having an empty
> >>>> gen10g_soft_reset. Some PHYs break when you do a reset.  So adding a
> >>>> gen10g_soft_reset is fine, but don't change this here, without first
> >>>> understanding the history, and talking to Russell King.
> >>> Hmm, the reset function only interacts with standard PCS
> >>> registers, which should always be available ...
> >>>
> >>> From my tests I need to do at least 1 reset during power-up so in
> >>> ultimate case I can add a feature quirk or similar.
> >>>
> >>> Russell, can you please comment ?
> >> Setting the reset bit on 88x3310 causes the entire device to become
> >> completely inaccessible until hardware reset.  Therefore, this bit
> >> must _never_ be set for these devices.  That said, we have a separate
> >> driver for these PHYs, but that will only be used for them if it's
> >> present in the kernel.  If we accidentally fall back to the generic
> >> driver, then we'll screw the 88x3310 until a full hardware reset.
> >>
> >> We also have a bunch of net devices that make use of this crippled
> >> "generic" 10G support - we don't know whether resetting the PHY
> >> for those systems will cause a regression - maybe board firmware
> >> already configured the PHY?  I can't say either way on that, except
> >> that we've had crippled 10G support in PHYLIB for a number of years
> >> now _with_ users, and adding reset support drastically changes the
> >> subsystem's behaviour for these users.
> >>
> >> I would recommend not touching the generic 10G driver, but instead
> >> implement your own driver for your PHY to avoid causing regressions.
> >>
> > Agreed.
> 
> What about .suspend / .resume ?

I have no idea what you're proposing there - your patches weren't copied
to me.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


* Re: [PATCH net-next 1/4] net: phy: Use C45 Helpers when forcing PHY
From: Jose Abreu @ 2018-10-23 10:20 UTC
  To: Florian Fainelli, Jose Abreu, netdev
  Cc: Andrew Lunn, David S. Miller, Joao Pinto
In-Reply-To: <8e1a35a5-de85-e7b3-c9f3-524b3313feaa@gmail.com>

On 22-10-2018 18:11, Florian Fainelli wrote:
> On 10/22/18 3:32 AM, Jose Abreu wrote:
>> If PHY is in force state and we have a C45 phy we need to use the
>> standard C45 helpers and not the C22 ones.
>>
>> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
>> Cc: Andrew Lunn <andrew@lunn.ch>
>> Cc: Florian Fainelli <f.fainelli@gmail.com>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: Joao Pinto <joao.pinto@synopsys.com>
>> ---
>>  drivers/net/phy/phy.c | 2 +-
>>  include/linux/phy.h   | 8 ++++++++
>>  2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
>> index 1d73ac3309ce..0ff4946e208e 100644
>> --- a/drivers/net/phy/phy.c
>> +++ b/drivers/net/phy/phy.c
>> @@ -995,7 +995,7 @@ void phy_state_machine(struct work_struct *work)
>>  		}
>>  		break;
>>  	case PHY_FORCING:
>> -		err = genphy_update_link(phydev);
>> +		err = phy_update_link(phydev);
>>  		if (err)
>>  			break;
>>  
>> diff --git a/include/linux/phy.h b/include/linux/phy.h
>> index 3ea87f774a76..02c2ee8bc05b 100644
>> --- a/include/linux/phy.h
>> +++ b/include/linux/phy.h
>> @@ -1044,6 +1044,14 @@ static inline int phy_read_status(struct phy_device *phydev)
>>  		return genphy_read_status(phydev);
>>  }
>>  
>> +static inline int phy_update_link(struct phy_device *phydev)
>> +{
>> +	if (phydev->is_c45)
>> +		return gen10g_read_status(phydev);
> Should not this be genphy_c45_read_link() for symmetry with
> genphy_update_link() which only updates phydev->link?

Hmmm, genphy_c45_read_link() does not update phydev->link ... I
can create a new gen10g_update_link() that wraps around
genphy_c45_read_link() and updates link ...
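
A rough user-space sketch of what such a wrapper could look like. Everything here is hypothetical: `struct fake_phy` is a trimmed-down stand-in for `struct phy_device`, and `c45_read_link()` only mimics a read-only link query (the real genphy_c45_read_link() signature may differ).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct phy_device. */
struct fake_phy {
	bool is_c45;
	int link;		/* cached link state, as in phydev->link */
	int c22_link_hw;	/* what a C22 status read would report */
	int c45_link_hw;	/* what a C45 status read would report */
};

/* Read-only C45 link query: negative errno on failure, else 0 or 1.
 * It deliberately does not touch phy->link.
 */
static int c45_read_link(const struct fake_phy *phy)
{
	return phy->c45_link_hw;
}

/* Wrapper that reads the C45 link state and caches it in phy->link. */
static int c45_update_link(struct fake_phy *phy)
{
	int ret = c45_read_link(phy);

	if (ret < 0)
		return ret;	/* leave the cached state alone on error */

	phy->link = ret ? 1 : 0;
	return 0;
}

/* C22 counterpart, playing the role of genphy_update_link(). */
static int c22_update_link(struct fake_phy *phy)
{
	phy->link = phy->c22_link_hw ? 1 : 0;
	return 0;
}

/* Dispatch on the PHY type, as phy_update_link() does in the patch. */
static int update_link(struct fake_phy *phy)
{
	assert(phy != NULL);
	return phy->is_c45 ? c45_update_link(phy) : c22_update_link(phy);
}
```

With this shape both branches only update the link field, which is the symmetry asked for in the review comment.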

Thanks and Best Regards,
Jose Miguel Abreu


* Re: [PATCH net-next 3/4] net: phy-c45: Implement reset/suspend/resume callbacks
From: Jose Abreu @ 2018-10-23 10:17 UTC
  To: Florian Fainelli, Russell King - ARM Linux, Jose Abreu
  Cc: Andrew Lunn, netdev, David S. Miller, Joao Pinto
In-Reply-To: <f8cd48a0-2cf5-5718-a6a6-1e9824834720@gmail.com>

On 22-10-2018 18:13, Florian Fainelli wrote:
> On 10/22/18 8:48 AM, Russell King - ARM Linux wrote:
>> On Mon, Oct 22, 2018 at 01:47:48PM +0100, Jose Abreu wrote:
>>> Hello,
>>>
>>> On 22-10-2018 13:28, Andrew Lunn wrote:
>>>>>  EXPORT_SYMBOL_GPL(gen10g_resume);
>>>>> @@ -327,7 +381,7 @@ struct phy_driver genphy_10g_driver = {
>>>>>  	.phy_id         = 0xffffffff,
>>>>>  	.phy_id_mask    = 0xffffffff,
>>>>>  	.name           = "Generic 10G PHY",
>>>>> -	.soft_reset	= gen10g_no_soft_reset,
>>>>> +	.soft_reset	= gen10g_soft_reset,
>>>>>  	.config_init    = gen10g_config_init,
>>>>>  	.features       = 0,
>>>>>  	.aneg_done	= genphy_c45_aneg_done,
>>>> Hi Jose
>>>>
>>>> You need to be careful here. There is a reason this is called
>>>> gen10g_no_soft_reset, rather than having an empty
>>>> gen10g_soft_reset. Some PHYs break when you do a reset.  So adding a
>>>> gen10g_soft_reset is fine, but don't change this here, without first
>>>> understanding the history, and talking to Russell King.
>>> Hmm, the reset function only interacts with standard PCS
>>> registers, which should always be available ...
>>>
>>> From my tests I need to do at least 1 reset during power-up so in
>>> ultimate case I can add a feature quirk or similar.
>>>
>>> Russell, can you please comment ?
>> Setting the reset bit on 88x3310 causes the entire device to become
>> completely inaccessible until hardware reset.  Therefore, this bit
>> must _never_ be set for these devices.  That said, we have a separate
>> driver for these PHYs, but that will only be used for them if it's
>> present in the kernel.  If we accidentally fall back to the generic
>> driver, then we'll screw the 88x3310 until a full hardware reset.
>>
>> We also have a bunch of net devices that make use of this crippled
>> "generic" 10G support - we don't know whether resetting the PHY
>> for those systems will cause a regression - maybe board firmware
>> already configured the PHY?  I can't say either way on that, except
>> that we've had crippled 10G support in PHYLIB for a number of years
>> now _with_ users, and adding reset support drastically changes the
>> subsystem's behaviour for these users.
>>
>> I would recommend not touching the generic 10G driver, but instead
>> implement your own driver for your PHY to avoid causing regressions.
>>
> Agreed.

What about .suspend / .resume ?

Thanks and Best Regards,
Jose Miguel Abreu


* BUG: please report to dccp@vger.kernel.org => prev = 2, last = 2 at net/dccp/ccids/lib/packet_history.c:LINE/tfrc_rx_his
From: syzbot @ 2018-10-23 10:13 UTC
  To: davem, dccp, garsilva, gerrit, linux-kernel, netdev,
	syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    ca9eb48fe01f Merge tag 'regulator-v5.0' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1482a939400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=963b24abf3f7c2d8
dashboard link: https://syzkaller.appspot.com/bug?extid=e786ba000564d103a6fe
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+e786ba000564d103a6fe@syzkaller.appspotmail.com

input: syz0 as /devices/virtual/input/input6
BUG: please report to dccp@vger.kernel.org => prev = 2, last = 2 at net/dccp/ccids/lib/packet_history.c:425/tfrc_rx_hist_sample_rtt()
CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.19.0+ #298
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1c4/0x2b6 lib/dump_stack.c:113
  tfrc_rx_hist_sample_rtt.cold.3+0x54/0x5c net/dccp/ccids/lib/packet_history.c:422
  ccid3_hc_rx_packet_recv+0x5c4/0xeb0 net/dccp/ccids/ccid3.c:767
  ccid_hc_rx_packet_recv net/dccp/ccid.h:185 [inline]
  dccp_deliver_input_to_ccids+0xf0/0x280 net/dccp/input.c:180
  dccp_rcv_established+0x87/0xb0 net/dccp/input.c:378
  dccp_v4_do_rcv+0x153/0x180 net/dccp/ipv4.c:656
  sk_backlog_rcv include/net/sock.h:931 [inline]
  __sk_receive_skb+0x3e5/0xec0 net/core/sock.c:473
  dccp_v4_rcv+0x10f9/0x1f58 net/dccp/ipv4.c:877
  ip_local_deliver_finish+0x2e9/0xda0 net/ipv4/ip_input.c:215
  NF_HOOK include/linux/netfilter.h:289 [inline]
  ip_local_deliver+0x1e9/0x750 net/ipv4/ip_input.c:256
  dst_input include/net/dst.h:450 [inline]
  ip_rcv_finish+0x1f9/0x300 net/ipv4/ip_input.c:415
  NF_HOOK include/linux/netfilter.h:289 [inline]
  ip_rcv+0xed/0x600 net/ipv4/ip_input.c:524
  __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4913
  __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5023
  process_backlog+0x218/0x6f0 net/core/dev.c:5829
  napi_poll net/core/dev.c:6249 [inline]
  net_rx_action+0x7c5/0x1950 net/core/dev.c:6315
  __do_softirq+0x30c/0xb03 kernel/softirq.c:292
  run_ksoftirqd+0x94/0x100 kernel/softirq.c:653
  smpboot_thread_fn+0x68b/0xa00 kernel/smpboot.c:164
  kthread+0x35a/0x420 kernel/kthread.c:246
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
net_ratelimit: 18 callbacks suppressed
dccp_close: ABORT with 105978 bytes unread
input: syz0 as /devices/virtual/input/input7
input: syz0 as /devices/virtual/input/input8
dccp_close: ABORT with 52730 bytes unread
input: syz0 as /devices/virtual/input/input9
dccp_close: ABORT with 105978 bytes unread
dccp_close: ABORT with 105978 bytes unread
dccp_close: ABORT with 77306 bytes unread
dccp_close: ABORT with 89594 bytes unread
input: syz0 as /devices/virtual/input/input10
input: syz0 as /devices/virtual/input/input11
input: syz0 as /devices/virtual/input/input12
input: syz0 as /devices/virtual/input/input13
input: syz0 as /devices/virtual/input/input14
input: syz0 as /devices/virtual/input/input15
input: syz0 as /devices/virtual/input/input16
input: syz0 as /devices/virtual/input/input17
input: syz0 as /devices/virtual/input/input18


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.


* Re: Kernel oops with mlx5 and dual XDP redirect programs
From: Toke Høiland-Jørgensen @ 2018-10-23 10:10 UTC
  To: Saeed Mahameed, netdev@vger.kernel.org
  Cc: Eran Ben Elisha, Tariq Toukan, brouer@redhat.com
In-Reply-To: <15797ad1ccee84dfd47c6f45af155806b4ccc228.camel@mellanox.com>

Saeed Mahameed <saeedm@mellanox.com> writes:

> On Thu, 2018-10-18 at 23:53 +0200, Toke Høiland-Jørgensen wrote:
>> Saeed Mahameed <saeedm@mellanox.com> writes:
>> 
>> > I think that the mlx5 driver doesn't know how to tell the other device
>> > to stop transmitting to it while it is resetting. Maybe Tariq or
>> > Jesper know more about this?
>> > I will look at this tomorrow afternoon and will try to repro...
>> 
>> Hi Saeed
>> 
>> Did you have a chance to poke at this? :)
>
> Hi Toke, yes, I have been planning to respond, but I also wanted to dig
> more,
>
> so the root cause is very clear.
>
> 1. core 1 is doing tx_dev->ndo_xdp_xmit()
> 2. core 2 is doing tx_dev->xdp_set() //remove xdp program.

Right, it was also my guess that it was related to this interaction.
Thanks for looking into it!

> and the problem is beyond mlx5, since we don't have a way to tell a
> different core/different netdev to stop xmitting, or at least
> synchronize with it.

Hmm, ideally there should be some way for the higher level XDP API to
notice this and abort the call before it even reaches the driver on the
TX side, shouldn't there? At LPC, Jesper and I will be talking about a
proposal for decoupling the ndo_xdp_xmit() resource allocation from
loading and unloading XDP programs, which I guess could be a way to deal
with this as well.

In the meantime...

> I will be waiting for your confirmation that the fix did work.

I tested your patch, and it does indeed fix the crash. However, it also
seems to have the effect that the XDP redirect continues to function
even after removing the XDP program on the target device.

I.e., after the call to ./xdp_fwd -d $TX_IF, I still see packets being
redirected out $TX_IF. Is this intentional?

-Toke


* [PATCH v3 3/4] net: emac: remove IBM_EMAC_RX_SKB_HEADROOM
From: Christian Lamparter @ 2018-10-23 10:04 UTC
  To: netdev; +Cc: David S . Miller
In-Reply-To: <e514e6560fd190f4dcfce574ca5ac6a26640c9ff.1540289031.git.chunkeey@gmail.com>

The EMAC driver had a custom IBM_EMAC_RX_SKB_HEADROOM
Kconfig option that reserved additional skb headroom for RX.
This patch removes the option and migrates the code
to use napi_alloc_skb() and netdev_alloc_skb_ip_align()
in its place.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 drivers/net/ethernet/ibm/emac/Kconfig | 12 ------
 drivers/net/ethernet/ibm/emac/core.c  | 57 +++++++++++++++++++--------
 drivers/net/ethernet/ibm/emac/core.h  | 10 ++---
 3 files changed, 43 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/ibm/emac/Kconfig b/drivers/net/ethernet/ibm/emac/Kconfig
index 90d49191beb3..eacf7e141fdc 100644
--- a/drivers/net/ethernet/ibm/emac/Kconfig
+++ b/drivers/net/ethernet/ibm/emac/Kconfig
@@ -28,18 +28,6 @@ config IBM_EMAC_RX_COPY_THRESHOLD
 	depends on IBM_EMAC
 	default "256"
 
-config IBM_EMAC_RX_SKB_HEADROOM
-	int "Additional RX skb headroom (bytes)"
-	depends on IBM_EMAC
-	default "0"
-	help
-	  Additional receive skb headroom. Note, that driver
-	  will always reserve at least 2 bytes to make IP header
-	  aligned, so usually there is no need to add any additional
-	  headroom.
-
-	  If unsure, set to 0.
-
 config IBM_EMAC_DEBUG
 	bool "Debugging"
 	depends on IBM_EMAC
diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index 80aafd7552aa..266b6614125b 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -1075,7 +1075,9 @@ static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
 
 	/* Second pass, allocate new skbs */
 	for (i = 0; i < NUM_RX_BUFF; ++i) {
-		struct sk_buff *skb = alloc_skb(rx_skb_size, GFP_ATOMIC);
+		struct sk_buff *skb;
+
+		skb = netdev_alloc_skb_ip_align(dev->ndev, rx_skb_size);
 		if (!skb) {
 			ret = -ENOMEM;
 			goto oom;
@@ -1084,7 +1086,6 @@ static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
 		BUG_ON(!dev->rx_skb[i]);
 		dev_kfree_skb(dev->rx_skb[i]);
 
-		skb_reserve(skb, EMAC_RX_SKB_HEADROOM + 2);
 		dev->rx_desc[i].data_ptr =
 		    dma_map_single(&dev->ofdev->dev, skb->data - 2, rx_sync_size,
 				   DMA_FROM_DEVICE) + 2;
@@ -1205,20 +1206,18 @@ static void emac_clean_rx_ring(struct emac_instance *dev)
 	}
 }
 
-static inline int emac_alloc_rx_skb(struct emac_instance *dev, int slot,
-				    gfp_t flags)
+static inline int
+__emac_prepare_rx_skb(struct sk_buff *skb, struct emac_instance *dev, int slot)
 {
-	struct sk_buff *skb = alloc_skb(dev->rx_skb_size, flags);
 	if (unlikely(!skb))
 		return -ENOMEM;
 
 	dev->rx_skb[slot] = skb;
 	dev->rx_desc[slot].data_len = 0;
 
-	skb_reserve(skb, EMAC_RX_SKB_HEADROOM + 2);
 	dev->rx_desc[slot].data_ptr =
-	    dma_map_single(&dev->ofdev->dev, skb->data - 2, dev->rx_sync_size,
-			   DMA_FROM_DEVICE) + 2;
+	    dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
+			   dev->rx_sync_size, DMA_FROM_DEVICE) + NET_IP_ALIGN;
 	wmb();
 	dev->rx_desc[slot].ctrl = MAL_RX_CTRL_EMPTY |
 	    (slot == (NUM_RX_BUFF - 1) ? MAL_RX_CTRL_WRAP : 0);
@@ -1226,6 +1225,27 @@ static inline int emac_alloc_rx_skb(struct emac_instance *dev, int slot,
 	return 0;
 }
 
+static inline int
+emac_alloc_rx_skb(struct emac_instance *dev, int slot)
+{
+	struct sk_buff *skb;
+
+	skb = __netdev_alloc_skb_ip_align(dev->ndev, dev->rx_skb_size,
+					  GFP_KERNEL);
+
+	return __emac_prepare_rx_skb(skb, dev, slot);
+}
+
+static inline int
+emac_alloc_rx_skb_napi(struct emac_instance *dev, int slot)
+{
+	struct sk_buff *skb;
+
+	skb = napi_alloc_skb(&dev->mal->napi, dev->rx_skb_size);
+
+	return __emac_prepare_rx_skb(skb, dev, slot);
+}
+
 static void emac_print_link_status(struct emac_instance *dev)
 {
 	if (netif_carrier_ok(dev->ndev))
@@ -1256,7 +1276,7 @@ static int emac_open(struct net_device *ndev)
 
 	/* Allocate RX ring */
 	for (i = 0; i < NUM_RX_BUFF; ++i)
-		if (emac_alloc_rx_skb(dev, i, GFP_KERNEL)) {
+		if (emac_alloc_rx_skb(dev, i)) {
 			printk(KERN_ERR "%s: failed to allocate RX ring\n",
 			       ndev->name);
 			goto oom;
@@ -1779,8 +1799,9 @@ static inline void emac_recycle_rx_skb(struct emac_instance *dev, int slot,
 	DBG2(dev, "recycle %d %d" NL, slot, len);
 
 	if (len)
-		dma_map_single(&dev->ofdev->dev, skb->data - 2,
-			       EMAC_DMA_ALIGN(len + 2), DMA_FROM_DEVICE);
+		dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
+			       SKB_DATA_ALIGN(len + NET_IP_ALIGN),
+			       DMA_FROM_DEVICE);
 
 	dev->rx_desc[slot].data_len = 0;
 	wmb();
@@ -1888,16 +1909,18 @@ static int emac_poll_rx(void *param, int budget)
 		}
 
 		if (len && len < EMAC_RX_COPY_THRESH) {
-			struct sk_buff *copy_skb =
-			    alloc_skb(len + EMAC_RX_SKB_HEADROOM + 2, GFP_ATOMIC);
+			struct sk_buff *copy_skb;
+
+			copy_skb = napi_alloc_skb(&dev->mal->napi, len);
 			if (unlikely(!copy_skb))
 				goto oom;
 
-			skb_reserve(copy_skb, EMAC_RX_SKB_HEADROOM + 2);
-			memcpy(copy_skb->data - 2, skb->data - 2, len + 2);
+			memcpy(copy_skb->data - NET_IP_ALIGN,
+			       skb->data - NET_IP_ALIGN,
+			       len + NET_IP_ALIGN);
 			emac_recycle_rx_skb(dev, slot, len);
 			skb = copy_skb;
-		} else if (unlikely(emac_alloc_rx_skb(dev, slot, GFP_ATOMIC)))
+		} else if (unlikely(emac_alloc_rx_skb_napi(dev, slot)))
 			goto oom;
 
 		skb_put(skb, len);
@@ -1918,7 +1941,7 @@ static int emac_poll_rx(void *param, int budget)
 	sg:
 		if (ctrl & MAL_RX_CTRL_FIRST) {
 			BUG_ON(dev->rx_sg_skb);
-			if (unlikely(emac_alloc_rx_skb(dev, slot, GFP_ATOMIC))) {
+			if (unlikely(emac_alloc_rx_skb_napi(dev, slot))) {
 				DBG(dev, "rx OOM %d" NL, slot);
 				++dev->estats.rx_dropped_oom;
 				emac_recycle_rx_skb(dev, slot, 0);
diff --git a/drivers/net/ethernet/ibm/emac/core.h b/drivers/net/ethernet/ibm/emac/core.h
index 0bcfe952a3cf..0faeb7c7e958 100644
--- a/drivers/net/ethernet/ibm/emac/core.h
+++ b/drivers/net/ethernet/ibm/emac/core.h
@@ -68,22 +68,18 @@ static inline int emac_rx_size(int mtu)
 		return mal_rx_size(ETH_DATA_LEN + EMAC_MTU_OVERHEAD);
 }
 
-#define EMAC_DMA_ALIGN(x)		ALIGN((x), dma_get_cache_alignment())
-
-#define EMAC_RX_SKB_HEADROOM		\
-	EMAC_DMA_ALIGN(CONFIG_IBM_EMAC_RX_SKB_HEADROOM)
-
 /* Size of RX skb for the given MTU */
 static inline int emac_rx_skb_size(int mtu)
 {
 	int size = max(mtu + EMAC_MTU_OVERHEAD, emac_rx_size(mtu));
-	return EMAC_DMA_ALIGN(size + 2) + EMAC_RX_SKB_HEADROOM;
+
+	return SKB_DATA_ALIGN(size + NET_SKB_PAD + NET_IP_ALIGN);
 }
 
 /* RX DMA sync size */
 static inline int emac_rx_sync_size(int mtu)
 {
-	return EMAC_DMA_ALIGN(emac_rx_size(mtu) + 2);
+	return SKB_DATA_ALIGN(emac_rx_size(mtu) + NET_IP_ALIGN);
 }
 
 /* Driver statistcs is split into two parts to make it more cache friendly:
-- 
2.19.1
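
The alignment arithmetic behind the `NET_IP_ALIGN` and `SKB_DATA_ALIGN()` uses in the patch above can be illustrated with a small user-space sketch. The constants are assumptions chosen for illustration: `IP_ALIGN_PAD` mirrors the common NET_IP_ALIGN value of 2 (some architectures use 0), and `CACHE_BYTES` is an arbitrary power of two, not the real cache line size of any EMAC platform. `align_up()` and `ip_header_offset()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN	14	/* Ethernet header length in bytes */
#define IP_ALIGN_PAD	2	/* assumed NET_IP_ALIGN; 0 on some architectures */
#define CACHE_BYTES	32	/* assumed alignment unit, illustration only */

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Offset of the IP header from an aligned buffer start for a given pad. */
static size_t ip_header_offset(size_t pad)
{
	return pad + ETH_HLEN;
}
```

Without the pad the IP header would start at offset 14, which is only 2-byte aligned; with a 2-byte pad it starts at offset 16. That is why the driver maps the buffer starting at `skb->data - NET_IP_ALIGN` and adds the pad back to the descriptor pointer.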


* [PATCH v3 2/4] net: emac: implement TCP segmentation offload (TSO)
From: Christian Lamparter @ 2018-10-23 10:04 UTC
  To: netdev; +Cc: David S . Miller
In-Reply-To: <f4acff06dde1a69a3c3f4fdd27014de3a3bd51cb.1540289031.git.chunkeey@gmail.com>

This patch enables the TSO(v4) hardware feature for the emac driver,
as at least the APM82181's TCP/IP acceleration hardware
controller (TAH) provides TCP segmentation support in
the transmit path.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 drivers/net/ethernet/ibm/emac/core.c | 112 ++++++++++++++++++++++++++-
 drivers/net/ethernet/ibm/emac/core.h |   7 ++
 drivers/net/ethernet/ibm/emac/emac.h |   7 ++
 drivers/net/ethernet/ibm/emac/tah.c  |  22 +++++-
 drivers/net/ethernet/ibm/emac/tah.h  |   2 +
 5 files changed, 148 insertions(+), 2 deletions(-)
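
For orientation, the segment-size lookup that the diff below adds can be modelled in user space roughly as follows. The table values and the SSR bit encoding are copied from the patch; the function names `ss_start_for_mtu()` and `ss_index_for_segment()` are hypothetical, and the sketch leaves out everything else the driver does.

```c
#include <assert.h>
#include <stddef.h>

/* Segment sizes copied from the patch's tah_ss[] table, largest first. */
static const unsigned int tah_ss[] = { 1500, 1344, 1152, 960, 768, 416 };
#define TAH_NO_SSR	(sizeof(tah_ss) / sizeof(tah_ss[0]))

/* Same encoding as EMAC_TX_CTRL_TAH_SSR(idx) in the patch. */
#define TX_CTRL_TAH_SSR(idx)	((((unsigned int)(idx)) + 1u) << 1)

/* First table index whose size fits the MTU; TAH_NO_SSR if none fits. */
static size_t ss_start_for_mtu(unsigned int mtu)
{
	size_t i;

	for (i = 0; i < TAH_NO_SSR; i++) {
		if (tah_ss[i] <= mtu)
			break;
	}
	return i;
}

/* Pick the SSR index for one segment, searching from the MTU start index.
 * Returns -1 when nothing fits, meaning "fall back to software".
 */
static int ss_index_for_segment(size_t start, unsigned int seg_size)
{
	size_t i;

	assert(start <= TAH_NO_SSR);
	for (i = start; i < TAH_NO_SSR; i++) {
		if (tah_ss[i] <= seg_size)
			return (int)i;
	}
	return -1;
}
```

An MTU below 416 yields a start index equal to the table length, so the per-packet search loop never runs and the caller goes straight to the software path, which matches the comment in `emac_find_tso_ss_for_mtu()`.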

diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index be560f9031f4..80aafd7552aa 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -38,6 +38,9 @@
 #include <linux/mii.h>
 #include <linux/bitops.h>
 #include <linux/if_vlan.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/tcp.h>
 #include <linux/workqueue.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
@@ -1118,6 +1121,32 @@ static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
 	return ret;
 }
 
+/* Restriction applied to the segment size when using the
+ * HW segmentation offload feature: the size of the
+ * segment must not be less than 168 bytes for
+ * DIX formatted segments, or 176 bytes for
+ * IEEE formatted segments. However, based on actual
+ * tests, any MTU less than 416 causes excessive retries
+ * due to TX FIFO underruns.
+ */
+const u32 tah_ss[TAH_NO_SSR] = { 1500, 1344, 1152, 960, 768, 416 };
+
+/* look-up matching segment size for the given mtu */
+static void emac_find_tso_ss_for_mtu(struct emac_instance *dev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(tah_ss); i++) {
+		if (tah_ss[i] <= dev->ndev->mtu)
+			break;
+	}
+	/* if no matching segment size is found, set the tso_ss_mtu_start
+	 * variable anyway. This will cause the emac_tx_tso to skip straight
+	 * to the software fallback.
+	 */
+	dev->tso_ss_mtu_start = i;
+}
+
 /* Process ctx, rtnl_lock semaphore */
 static int emac_change_mtu(struct net_device *ndev, int new_mtu)
 {
@@ -1134,6 +1163,7 @@ static int emac_change_mtu(struct net_device *ndev, int new_mtu)
 
 	if (!ret) {
 		ndev->mtu = new_mtu;
+		emac_find_tso_ss_for_mtu(dev);
 		dev->rx_skb_size = emac_rx_skb_size(new_mtu);
 		dev->rx_sync_size = emac_rx_sync_size(new_mtu);
 	}
@@ -1410,6 +1440,33 @@ static inline u16 emac_tx_csum(struct emac_instance *dev,
 	return 0;
 }
 
+static int emac_tx_tso(struct emac_instance *dev, struct sk_buff *skb,
+		       u16 *ctrl)
+{
+	if (emac_has_feature(dev, EMAC_FTR_TAH_HAS_TSO) && skb_is_gso(skb) &&
+	    !!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) {
+		u32 seg_size = 0, i;
+
+		/* Segment size: payload plus TCP and network headers */
+		seg_size = skb_shinfo(skb)->gso_size + tcp_hdrlen(skb) +
+			   skb_network_header_len(skb);
+
+		for (i = dev->tso_ss_mtu_start; i < ARRAY_SIZE(tah_ss); i++) {
+			if (tah_ss[i] > seg_size)
+				continue;
+
+			*ctrl |= EMAC_TX_CTRL_TAH_SSR(i);
+			return 0;
+		}
+
+		/* none found fall back to software */
+		return -EINVAL;
+	}
+
+	*ctrl |= emac_tx_csum(dev, skb);
+	return 0;
+}
+
 static inline netdev_tx_t emac_xmit_finish(struct emac_instance *dev, int len)
 {
 	struct emac_regs __iomem *p = dev->emacp;
@@ -1452,6 +1509,46 @@ static inline u16 emac_tx_vlan(struct emac_instance *dev, struct sk_buff *skb)
 	return 0;
 }
 
+static netdev_tx_t
+emac_start_xmit_sg(struct sk_buff *skb, struct net_device *ndev);
+
+static int
+emac_sw_tso(struct sk_buff *skb, struct net_device *ndev)
+{
+	struct emac_instance *dev = netdev_priv(ndev);
+	struct sk_buff *segs, *curr;
+	unsigned int i, frag_slots;
+
+	/* make sure to not overflow the tx ring */
+	frag_slots = dev->tx_cnt;
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i];
+
+		frag_slots += mal_tx_chunks(skb_frag_size(frag));
+
+		if (frag_slots >= NUM_TX_BUFF)
+			return -ENOSPC;
+	}
+
+	segs = skb_gso_segment(skb, ndev->features &
+					~(NETIF_F_TSO | NETIF_F_TSO6));
+	if (IS_ERR_OR_NULL(segs)) {
+		++dev->estats.tx_dropped;
+		dev_kfree_skb_any(skb);
+	} else {
+		while (segs) {
+			curr = segs;
+			segs = curr->next;
+			curr->next = NULL;
+
+			emac_start_xmit_sg(curr, ndev);
+		}
+		dev_consume_skb_any(skb);
+	}
+
+	return 0;
+}
+
 /* Tx lock BH */
 static netdev_tx_t emac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 {
@@ -1535,7 +1632,12 @@ emac_start_xmit_sg(struct sk_buff *skb, struct net_device *ndev)
 		goto stop_queue;
 
 	ctrl = EMAC_TX_CTRL_GFCS | EMAC_TX_CTRL_GP | MAL_TX_CTRL_READY |
-	    emac_tx_csum(dev, skb) | emac_tx_vlan(dev, skb);
+	    emac_tx_vlan(dev, skb);
+	if (emac_tx_tso(dev, skb, &ctrl)) {
+		if (emac_sw_tso(skb, ndev))
+			goto stop_queue;
+	}
+
 	slot = dev->tx_slot;
 
 	/* skb data */
@@ -2946,6 +3048,9 @@ static int emac_init_config(struct emac_instance *dev)
 	if (dev->tah_ph != 0) {
 #ifdef CONFIG_IBM_EMAC_TAH
 		dev->features |= EMAC_FTR_HAS_TAH;
+
+		if (of_device_is_compatible(np, "ibm,emac-apm821xx"))
+			dev->features |= EMAC_FTR_TAH_HAS_TSO;
 #else
 		printk(KERN_ERR "%pOF: TAH support not enabled !\n", np);
 		return -ENXIO;
@@ -3113,6 +3218,8 @@ static int emac_probe(struct platform_device *ofdev)
 	}
 	dev->rx_skb_size = emac_rx_skb_size(ndev->mtu);
 	dev->rx_sync_size = emac_rx_sync_size(ndev->mtu);
+	ndev->gso_max_segs = NUM_TX_BUFF / 2;
+	emac_find_tso_ss_for_mtu(dev);
 
 	/* Get pointers to BD rings */
 	dev->tx_desc =
@@ -3167,6 +3274,9 @@ static int emac_probe(struct platform_device *ofdev)
 	if (dev->tah_dev) {
 		ndev->hw_features = NETIF_F_IP_CSUM | NETIF_F_SG;
 
+		if (emac_has_feature(dev, EMAC_FTR_TAH_HAS_TSO))
+			ndev->hw_features |= NETIF_F_TSO;
+
 		if (emac_has_feature(dev, EMAC_FTR_HAS_VLAN_CTAG_TX)) {
 			ndev->vlan_features |= ndev->hw_features;
 			ndev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX;
diff --git a/drivers/net/ethernet/ibm/emac/core.h b/drivers/net/ethernet/ibm/emac/core.h
index 8d84d439168c..0bcfe952a3cf 100644
--- a/drivers/net/ethernet/ibm/emac/core.h
+++ b/drivers/net/ethernet/ibm/emac/core.h
@@ -245,6 +245,9 @@ struct emac_instance {
 	u32				xaht_slots_shift;
 	u32				xaht_width_shift;
 
+	/* TAH TSO start index */
+	int				tso_ss_mtu_start;
+
 	/* Descriptor management
 	 */
 	struct mal_descriptor		*tx_desc;
@@ -336,6 +339,8 @@ struct emac_instance {
 #define EMAC_FTR_APM821XX_NO_HALF_DUPLEX	0x00001000
 /* EMAC can insert 802.1Q tag */
 #define EMAC_FTR_HAS_VLAN_CTAG_TX		0x00002000
+/* TAH can do TCP segmentation offload */
+#define EMAC_FTR_TAH_HAS_TSO			0x00004000
 
 /* Right now, we don't quite handle the always/possible masks on the
  * most optimal way as we don't have a way to say something like
@@ -352,6 +357,8 @@ enum {
 #endif
 #ifdef CONFIG_IBM_EMAC_TAH
 	    EMAC_FTR_HAS_TAH	|
+	    EMAC_FTR_TAH_HAS_TSO	|
+
 #endif
 #ifdef CONFIG_IBM_EMAC_ZMII
 	    EMAC_FTR_HAS_ZMII	|
diff --git a/drivers/net/ethernet/ibm/emac/emac.h b/drivers/net/ethernet/ibm/emac/emac.h
index e2f80cca9bed..833967aceb2f 100644
--- a/drivers/net/ethernet/ibm/emac/emac.h
+++ b/drivers/net/ethernet/ibm/emac/emac.h
@@ -266,6 +266,13 @@ struct emac_regs {
 #define EMAC_TX_CTRL_IVT		0x0020
 #define EMAC_TX_CTRL_RVT		0x0010
 #define EMAC_TX_CTRL_TAH_CSUM		0x000e
+#define EMAC_TX_CTRL_TAH_SSR(idx)	(((idx) + 1) << 1)
+#define EMAC_TX_CTRL_TAH_SSR5		0x000c
+#define EMAC_TX_CTRL_TAH_SSR4		0x000a
+#define EMAC_TX_CTRL_TAH_SSR3		0x0008
+#define EMAC_TX_CTRL_TAH_SSR2		0x0006
+#define EMAC_TX_CTRL_TAH_SSR1		0x0004
+#define EMAC_TX_CTRL_TAH_SSR0		0x0002
 
 /* EMAC specific TX descriptor status fields (read access) */
 #define EMAC_TX_ST_BFCS			0x0200
diff --git a/drivers/net/ethernet/ibm/emac/tah.c b/drivers/net/ethernet/ibm/emac/tah.c
index 9912456dca48..619c08ee22f7 100644
--- a/drivers/net/ethernet/ibm/emac/tah.c
+++ b/drivers/net/ethernet/ibm/emac/tah.c
@@ -45,6 +45,24 @@ void tah_detach(struct platform_device *ofdev, int channel)
 	mutex_unlock(&dev->lock);
 }
 
+static void tah_set_ssr(struct platform_device *ofdev)
+{
+	struct tah_instance *dev = dev_get_drvdata(&ofdev->dev);
+	struct tah_regs __iomem *p = dev->base;
+	int i;
+
+	mutex_lock(&dev->lock);
+
+	for (i = 0; i < ARRAY_SIZE(tah_ss); i++) {
+		/* Segment size can be up to 16K, but needs
+		 * to be a multiple of 2 bytes
+		 */
+		out_be32(&p->ssr0 + i, (tah_ss[i] & 0x3ffc) << 16);
+	}
+
+	mutex_unlock(&dev->lock);
+}
+
 void tah_reset(struct platform_device *ofdev)
 {
 	struct tah_instance *dev = platform_get_drvdata(ofdev);
@@ -64,6 +82,8 @@ void tah_reset(struct platform_device *ofdev)
 	out_be32(&p->mr,
 		 TAH_MR_CVR | TAH_MR_ST_768 | TAH_MR_TFS_10KB | TAH_MR_DTFP |
 		 TAH_MR_DIG);
+
+	tah_set_ssr(ofdev);
 }
 
 int tah_get_regs_len(struct platform_device *ofdev)
@@ -118,7 +138,7 @@ static int tah_probe(struct platform_device *ofdev)
 
 	platform_set_drvdata(ofdev, dev);
 
-	/* Initialize TAH and enable IPv4 checksum verification, no TSO yet */
+	/* Initialize TAH and enable IPv4 checksum verification */
 	tah_reset(ofdev);
 
 	printk(KERN_INFO "TAH %pOF initialized\n", ofdev->dev.of_node);
diff --git a/drivers/net/ethernet/ibm/emac/tah.h b/drivers/net/ethernet/ibm/emac/tah.h
index 4d5f336f07b3..2cb0629f30e2 100644
--- a/drivers/net/ethernet/ibm/emac/tah.h
+++ b/drivers/net/ethernet/ibm/emac/tah.h
@@ -36,6 +36,8 @@ struct tah_regs {
 	u32 tsr;
 };
 
+#define TAH_NO_SSR	6
+extern const u32 tah_ss[TAH_NO_SSR];
 
 /* TAH device */
 struct tah_instance {
-- 
2.19.1

^ permalink raw reply related

* [PATCH v3 1/4] net: emac: implement 802.1Q VLAN TX tagging support
From: Christian Lamparter @ 2018-10-23 10:04 UTC (permalink / raw)
  To: netdev; +Cc: David S . Miller

As per APM82181 Embedded Processor User Manual 26.1 EMAC Features:
VLAN:
 - Support for VLAN tag ID in compliance with IEEE 802.3ac.
 - VLAN tag insertion or replacement for transmit packets

This patch completes the missing code for the VLAN tx tagging
support, as the EMAC_MR1_VLE was already enabled.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 drivers/net/ethernet/ibm/emac/core.c | 32 ++++++++++++++++++++++++----
 drivers/net/ethernet/ibm/emac/core.h |  6 +++++-
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index 760b2ad8e295..be560f9031f4 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -37,6 +37,7 @@
 #include <linux/ethtool.h>
 #include <linux/mii.h>
 #include <linux/bitops.h>
+#include <linux/if_vlan.h>
 #include <linux/workqueue.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
@@ -674,7 +675,7 @@ static int emac_configure(struct emac_instance *dev)
 		 ndev->dev_addr[5]);
 
 	/* VLAN Tag Protocol ID */
-	out_be32(&p->vtpid, 0x8100);
+	out_be32(&p->vtpid, ETH_P_8021Q);
 
 	/* Receive mode register */
 	r = emac_iff2rmr(ndev);
@@ -1435,6 +1436,22 @@ static inline netdev_tx_t emac_xmit_finish(struct emac_instance *dev, int len)
 	return NETDEV_TX_OK;
 }
 
+static inline u16 emac_tx_vlan(struct emac_instance *dev, struct sk_buff *skb)
+{
+	/* Handle VLAN TPID and TCI insert if this is a VLAN skb */
+	if (emac_has_feature(dev, EMAC_FTR_HAS_VLAN_CTAG_TX) &&
+	    skb_vlan_tag_present(skb)) {
+		struct emac_regs __iomem *p = dev->emacp;
+
+		/* update the VLAN TCI */
+		out_be32(&p->vtci, (u32)skb_vlan_tag_get(skb));
+
+		/* Insert VLAN tag */
+		return EMAC_TX_CTRL_IVT;
+	}
+	return 0;
+}
+
 /* Tx lock BH */
 static netdev_tx_t emac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 {
@@ -1443,7 +1460,7 @@ static netdev_tx_t emac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	int slot;
 
 	u16 ctrl = EMAC_TX_CTRL_GFCS | EMAC_TX_CTRL_GP | MAL_TX_CTRL_READY |
-	    MAL_TX_CTRL_LAST | emac_tx_csum(dev, skb);
+	    MAL_TX_CTRL_LAST | emac_tx_csum(dev, skb) | emac_tx_vlan(dev, skb);
 
 	slot = dev->tx_slot++;
 	if (dev->tx_slot == NUM_TX_BUFF) {
@@ -1518,7 +1535,7 @@ emac_start_xmit_sg(struct sk_buff *skb, struct net_device *ndev)
 		goto stop_queue;
 
 	ctrl = EMAC_TX_CTRL_GFCS | EMAC_TX_CTRL_GP | MAL_TX_CTRL_READY |
-	    emac_tx_csum(dev, skb);
+	    emac_tx_csum(dev, skb) | emac_tx_vlan(dev, skb);
 	slot = dev->tx_slot;
 
 	/* skb data */
@@ -2891,7 +2908,8 @@ static int emac_init_config(struct emac_instance *dev)
 		if (of_device_is_compatible(np, "ibm,emac-apm821xx")) {
 			dev->features |= (EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE |
 					  EMAC_FTR_APM821XX_NO_HALF_DUPLEX |
-					  EMAC_FTR_460EX_PHY_CLK_FIX);
+					  EMAC_FTR_460EX_PHY_CLK_FIX |
+					  EMAC_FTR_HAS_VLAN_CTAG_TX);
 		}
 	} else if (of_device_is_compatible(np, "ibm,emac4")) {
 		dev->features |= EMAC_FTR_EMAC4;
@@ -3148,6 +3166,12 @@ static int emac_probe(struct platform_device *ofdev)
 
 	if (dev->tah_dev) {
 		ndev->hw_features = NETIF_F_IP_CSUM | NETIF_F_SG;
+
+		if (emac_has_feature(dev, EMAC_FTR_HAS_VLAN_CTAG_TX)) {
+			ndev->vlan_features |= ndev->hw_features;
+			ndev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX;
+		}
+
 		ndev->features |= ndev->hw_features | NETIF_F_RXCSUM;
 	}
 	ndev->watchdog_timeo = 5 * HZ;
diff --git a/drivers/net/ethernet/ibm/emac/core.h b/drivers/net/ethernet/ibm/emac/core.h
index 84caa4a3fc52..8d84d439168c 100644
--- a/drivers/net/ethernet/ibm/emac/core.h
+++ b/drivers/net/ethernet/ibm/emac/core.h
@@ -334,6 +334,8 @@ struct emac_instance {
  * APM821xx does not support Half Duplex mode
  */
 #define EMAC_FTR_APM821XX_NO_HALF_DUPLEX	0x00001000
+/* EMAC can insert 802.1Q tag */
+#define EMAC_FTR_HAS_VLAN_CTAG_TX		0x00002000
 
 /* Right now, we don't quite handle the always/possible masks on the
  * most optimal way as we don't have a way to say something like
@@ -363,7 +365,9 @@ enum {
 	EMAC_FTR_460EX_PHY_CLK_FIX |
 	EMAC_FTR_440EP_PHY_CLK_FIX |
 	EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE |
-	EMAC_FTR_APM821XX_NO_HALF_DUPLEX,
+	EMAC_FTR_APM821XX_NO_HALF_DUPLEX |
+	EMAC_FTR_HAS_VLAN_CTAG_TX |
+	0,
 };
 
 static inline int emac_has_feature(struct emac_instance *dev,
-- 
2.19.1

^ permalink raw reply related

* [PATCH v3 4/4] net: emac: add deprecation notice to emac custom phy users
From: Christian Lamparter @ 2018-10-23 10:04 UTC (permalink / raw)
  To: netdev; +Cc: David S . Miller
In-Reply-To: <870fa73077774530ad5c60faff620b025f4869cf.1540289031.git.chunkeey@gmail.com>

This patch starts the deprecation process of emac's small library of
supported phys by adding a message to inform all remaining users to
start looking into converting their platform's device-tree to PHYLIB.

EMAC's phy.c support is limited to mostly single ethernet transceivers:
CIS8201, BCM5248, ET1011C, Marvell 88E1111 and 88E1112, AR8035.

And Linux has dedicated PHYLIB drivers for all but the BCM5248 which
can be supported by the generic phy driver.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 drivers/net/ethernet/ibm/emac/phy.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/ibm/emac/phy.c b/drivers/net/ethernet/ibm/emac/phy.c
index aa070c063e48..143b4c688ee9 100644
--- a/drivers/net/ethernet/ibm/emac/phy.c
+++ b/drivers/net/ethernet/ibm/emac/phy.c
@@ -496,6 +496,7 @@ static struct mii_phy_def ar8035_phy_def = {
 };
 
 static struct mii_phy_def *mii_phy_table[] = {
+	/* DEPRECATED: Do not add any new PHY drivers to this list. */
 	&et1011c_phy_def,
 	&cis8201_phy_def,
 	&bcm5248_phy_def,
@@ -512,6 +513,9 @@ int emac_mii_phy_probe(struct mii_phy *phy, int address)
 	int i;
 	u32 id;
 
+	pr_info("EMAC's custom phy code has been deprecated.\n"
+		"Please convert your EMAC device to PHYLIB.\n");
+
 	phy->autoneg = AUTONEG_DISABLE;
 	phy->advertising = 0;
 	phy->address = address;
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH] bonding:avoid repeated display of same link status change
From: David Miller @ 2018-10-23 18:08 UTC (permalink / raw)
  To: mk.singh; +Cc: netdev, j.vosburgh, vfalico, andy, linux-kernel
In-Reply-To: <20181023152924.24033-1-mk.singh@oracle.com>

From: mk.singh@oracle.com
Date: Tue, 23 Oct 2018 20:59:24 +0530

> @@ -229,6 +229,7 @@ struct bonding {
>  	struct	 dentry *debug_dir;
>  #endif /* CONFIG_DEBUG_FS */
>  	struct rtnl_link_stats64 bond_stats;
> +	atomic_t rtnl_needed;

As mentioned by others, if the only operations you perform on a value
are set and read, using atomic_t is utterly and totally pointless.

I really have no idea what is achieved by using atomic_t in this set
of circumstances.

It is not guaranteeing that the value stays stable after you read it,
and it is not guaranteeing that another thread won't overwrite the
value you just set it to.

All of those things, if important, need proper synchronization.  An
atomic_t by itself will not do that for you.

^ permalink raw reply

* Re: [PATCH v2] wireless: mark expected switch fall-throughs
From: Gustavo A. R. Silva @ 2018-10-23  8:59 UTC (permalink / raw)
  To: Johannes Berg, David S. Miller
  Cc: linux-wireless, netdev, linux-kernel, Kees Cook
In-Reply-To: <0b3197a734f71bdfffaf717e63b17e2fe31720a2.camel@sipsolutions.net>

On 10/23/18 9:01 AM, Johannes Berg wrote:
> On Tue, 2018-10-23 at 02:13 +0200, Gustavo A. R. Silva wrote:
>> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
>> where we are expecting to fall through.
>>
>> Warning level 3 was used: -Wimplicit-fallthrough=3
>>
>> This code was not tested and GCC 7.2.0 was used to compile it.
> 
> Look, I'm not going to make this any clearer: I'm not applying patches
> like that where you've invested no effort whatsoever on verifying that
> they're correct.
> 

How do you suggest I verify that every part is correct in this type
of patch?

Thanks

^ permalink raw reply

* [PATCH v2] rtlwifi: remove set but not used variable 'radiob_array_table' and 'radiob_arraylen'
From: zhong jiang @ 2018-10-23  8:28 UTC (permalink / raw)
  To: kvalo; +Cc: davem, pkshih, linux-wireless, netdev, linux-kernel

'radiob_array_table' and 'radiob_arraylen' are not used after being set,
so it is safe to remove them. The radio B table (RTL8723E_RADIOB_1TARRAY)
should be removed as well, because it is no longer referenced.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/wireless/realtek/rtlwifi/rtl8723ae/phy.c   | 5 +----
 drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.c | 4 ----
 drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.h | 2 --
 3 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/phy.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/phy.c
index 5cf29f5..3f33278 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/phy.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/phy.c
@@ -509,13 +509,10 @@ bool rtl8723e_phy_config_rf_with_headerfile(struct ieee80211_hw *hw,
 	int i;
 	bool rtstatus = true;
 	u32 *radioa_array_table;
-	u32 *radiob_array_table;
-	u16 radioa_arraylen, radiob_arraylen;
+	u16 radioa_arraylen;
 
 	radioa_arraylen = RTL8723ERADIOA_1TARRAYLENGTH;
 	radioa_array_table = RTL8723E_RADIOA_1TARRAY;
-	radiob_arraylen = RTL8723E_RADIOB_1TARRAYLENGTH;
-	radiob_array_table = RTL8723E_RADIOB_1TARRAY;
 
 	rtstatus = true;
 
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.c
index 61e8604..1bbee0b 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.c
@@ -475,10 +475,6 @@
 	0x000, 0x00030159,
 };
 
-u32 RTL8723E_RADIOB_1TARRAY[RTL8723E_RADIOB_1TARRAYLENGTH] = {
-	0x0,
-};
-
 u32 RTL8723EMAC_ARRAY[RTL8723E_MACARRAYLENGTH] = {
 	0x420, 0x00000080,
 	0x423, 0x00000000,
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.h b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.h
index 57a548c..a044f3c 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.h
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/table.h
@@ -36,8 +36,6 @@
 extern u32 RTL8723EPHY_REG_ARRAY_PG[RTL8723E_PHY_REG_ARRAY_PGLENGTH];
 #define RTL8723ERADIOA_1TARRAYLENGTH		282
 extern u32 RTL8723E_RADIOA_1TARRAY[RTL8723ERADIOA_1TARRAYLENGTH];
-#define RTL8723E_RADIOB_1TARRAYLENGTH		1
-extern u32 RTL8723E_RADIOB_1TARRAY[RTL8723E_RADIOB_1TARRAYLENGTH];
 #define RTL8723E_MACARRAYLENGTH			172
 extern u32 RTL8723EMAC_ARRAY[RTL8723E_MACARRAYLENGTH];
 #define RTL8723E_AGCTAB_1TARRAYLENGTH		320
-- 
1.7.12.4

^ permalink raw reply related

* Re: [PATCH] bonding:avoid repeated display of same link status change
From: Michal Kubecek @ 2018-10-23 16:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mahesh Bandewar (महेश बंडेवार),
	mk.singh, linux-netdev, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, David S. Miller, linux-kernel
In-Reply-To: <20181023162613.GA22291@unicorn.suse.cz>

On Tue, Oct 23, 2018 at 06:26:14PM +0200, Michal Kubecek wrote:
> On Tue, Oct 23, 2018 at 09:10:44AM -0700, Eric Dumazet wrote:
> > 
> > 
> > On 10/23/2018 08:54 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> > 
> > > Atomic operations are expensive (on certain architectures) and miimon
> > > runs quite frequently. Is the added cost of these atomic operations
> > > even worth just to avoid *duplicate info* messages? This seems like a
> > > overkill!
> > 
> > atomic_read() is a simple read, no atomic operation involved.
> > 
> > Same remark for atomic_set()
> 
> Which makes me wonder if the patch really needs atomic_t.

IMHO it does not. AFAICS multiple instances of bond_mii_monitor() cannot
run simultaneously for the same bond so that there doesn't seem to be
anything to collide with. (And if they could, we would need to test and
set the flag atomically in bond_miimon_inspect().)
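
A minimal user-space sketch of that distinction, using C11 <stdatomic.h>
rather than the kernel's atomic_t API (all function names here are made up
for illustration). A separate read followed by a set leaves a window between
the two steps, whereas a single exchange tests and sets the flag in one
indivisible operation:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Two independent operations: another thread could set the flag
 * between the load and the store, so two callers may both "win". */
bool claim_check_then_set(atomic_int *flag)
{
	if (atomic_load(flag))
		return false;
	atomic_store(flag, 1);
	return true;
}

/* One read-modify-write: exactly one caller sees the old value 0. */
bool claim_test_and_set(atomic_int *flag)
{
	return atomic_exchange(flag, 1) == 0;
}

/* Single-threaded sanity check: returns 1 if the first claim wins
 * and the second one loses. */
int demo_test_and_set(void)
{
	atomic_int flag;

	atomic_init(&flag, 0);
	return claim_test_and_set(&flag) && !claim_test_and_set(&flag);
}
```

With a single writer, as argued above for bond_mii_monitor(), neither
variant is needed and a plain flag would behave the same.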

Michal Kubecek

^ permalink raw reply

* [QUESTION] AF_UNIX connect behavior when listener backlog=0
From: Vito Caputo @ 2018-10-23 16:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev

The current implementation of AF_UNIX sockets immediately establishes a
new connection even when the backlog on the listener is zero.

Wouldn't it make more sense for connects to become synchronous with
accept when the listener is configured with a backlog of zero?

That way connects can be reliably refused from the listener side via
e.g. closing the listener instead of accepting the connect.
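
A minimal sketch for observing the current behavior, assuming a Linux
system. The function name and the abstract socket name are made up for
illustration. It binds a listener with a backlog of zero, never calls
accept(), and reports whether a non-blocking connect() still completes:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Returns 0 if connect() completed although accept() was never called,
 * a positive errno value if connect() failed, or -1 on a setup error. */
int connect_with_zero_backlog(void)
{
	struct sockaddr_un addr;
	socklen_t len;
	int lfd, cfd, ret = -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* Abstract (Linux-only) address: sun_path[0] stays '\0'.
	 * The name is arbitrary and only made unique via the PID. */
	snprintf(addr.sun_path + 1, sizeof(addr.sun_path) - 1,
		 "backlog0-demo-%ld", (long)getpid());
	len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 +
			  strlen(addr.sun_path + 1));

	lfd = socket(AF_UNIX, SOCK_STREAM, 0);
	cfd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (lfd < 0 || cfd < 0)
		goto out;
	if (bind(lfd, (struct sockaddr *)&addr, len) < 0 || listen(lfd, 0) < 0)
		goto out;
	/* Non-blocking, so the sketch can never hang in connect(). */
	if (fcntl(cfd, F_SETFL, O_NONBLOCK) < 0)
		goto out;

	ret = connect(cfd, (struct sockaddr *)&addr, len) == 0 ? 0 : errno;
out:
	if (cfd >= 0)
		close(cfd);
	if (lfd >= 0)
		close(lfd);
	return ret;
}
```

A return value of 0 corresponds to the behavior described above: the
connection is established without the listener ever accepting it.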

I'm not subscribed to netdev, so please take care to include me in any
replies on that list.

Regards,
Vito Caputo

^ permalink raw reply

* Re: [v2,0/2] net: if_arp: use define instead of hard-coded value
From: Håkon Bugge @ 2018-10-23 16:31 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, stephen, kstewart, tglx, gregkh, pombredanne,
	linux-kernel
In-Reply-To: <20181023.091115.1241170918401068592.davem@davemloft.net>



> On 23 Oct 2018, at 18:11, David Miller <davem@davemloft.net> wrote:
> 
> From: Håkon Bugge <haakon.bugge@oracle.com>
> Date: Tue, 23 Oct 2018 14:30:57 +0200
> 
>> Just a friendly reminder.
> 
> Reminder for what?
> 
> Your patch was applied to the net-next tree right after it was posted,
> what more do you want?

Oh, my bad then. Was expecting review comments or an "Applied, thanks". Will check the respective tree next time. No need for the v2 then I assume.


Thxs, Håkon

^ permalink raw reply

* Re: [PATCH] bonding:avoid repeated display of same link status change
From: Michal Kubecek @ 2018-10-23 16:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mahesh Bandewar (महेश बंडेवार),
	mk.singh, linux-netdev, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, David S. Miller, linux-kernel
In-Reply-To: <65f98009-1ce0-d6fd-06dc-233aa115abc9@gmail.com>

On Tue, Oct 23, 2018 at 09:10:44AM -0700, Eric Dumazet wrote:
> 
> 
> On 10/23/2018 08:54 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> 
> > Atomic operations are expensive (on certain architectures) and miimon
> > runs quite frequently. Is the added cost of these atomic operations
> > even worth just to avoid *duplicate info* messages? This seems like a
> > overkill!
> 
> atomic_read() is a simple read, no atomic operation involved.
> 
> Same remark for atomic_set()

Which makes me wonder if the patch really needs atomic_t.

Michal Kubecek

^ permalink raw reply

* Re: [RFC PATCH v2 06/10] udp: cope with UDP GRO packet misdirection
From: Paolo Abeni @ 2018-10-23  7:59 UTC (permalink / raw)
  To: Subash Abhinov Kasiviswanathan; +Cc: netdev, Willem de Bruijn, Steffen Klassert
In-Reply-To: <f87ef4ed8f5c0ab5989d3e067b218005@codeaurora.org>

Hi,

On Mon, 2018-10-22 at 13:04 -0600, Subash Abhinov Kasiviswanathan
wrote:
> On 2018-10-19 08:25, Paolo Abeni wrote:
> > In some scenarios, the GRO engine can assemble an UDP GRO packet
> > that ultimately lands on a non GRO-enabled socket.
> > This patch tries to address the issue explicitly checking for the UDP
> > socket features before enqueuing the packet, and eventually segmenting
> > the unexpected GRO packet, as needed.
> > 
> > We must also cope with re-insertion requests: after segmentation the
> > UDP code calls the helper introduced by the previous patches, as 
> > needed.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> > +static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff
> > *skb)
> > +{
> > +	return !udp_sk(sk)->gro_enabled && skb_is_gso(skb) &&
> > +	       skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4;
> > +}
> > +
> > +static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
> > +					      struct sk_buff *skb)
> > +{
> > +	struct sk_buff *segs;
> > +
> > +	/* the GSO CB lays after the UDP one, no need to save and restore
> > any
> > +	 * CB fragment, just initialize it
> > +	 */
> > +	segs = __skb_gso_segment(skb, NETIF_F_SG, false);
> > +	if (unlikely(IS_ERR(segs)))
> > +		kfree_skb(skb);
> > +	else if (segs)
> > +		consume_skb(skb);
> > +	return segs;
> > +}
> > +
> > +
> 
> Hi Paolo
> 
> Do we need to check for IS_ERR_OR_NULL(segs)

Yes, thanks.

(also Willem already noted the above)

> > 
> > +void ip_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int
> > proto);
> > +
> > +static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> > +{
> > +	struct sk_buff *next, *segs;
> > +	int ret;
> > +
> > +	if (likely(!udp_unexpected_gso(sk, skb)))
> > +		return udp_queue_rcv_one_skb(sk, skb);
> > +static int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> > +{
> > +	struct sk_buff *next, *segs;
> > +	int ret;
> > +
> > +	if (likely(!udp_unexpected_gso(sk, skb)))
> > +		return udpv6_queue_rcv_one_skb(sk, skb);
> > +
> 
> Is the "likely" required here?

Not required, but currently helpful IMHO, as we should hit the above
only on unlikely and really unwonted configuration.

Note that only SKB_GSO_UDP_L4 GSO packets will not match the above
likely condition.

> HW can coalesce all incoming streams of UDP and may not know the socket 
> state.
> In that case, a socket not having UDP GRO option might see a penalty 
> here.

Really? Is there any HW creating SKB_GSO_UDP_L4 packets on RX? If the
HW is doing that, without this patch, I think it's breaking existing
applications (which may expect that the read UDP frame length
implicitly describes the application level message length).

Cheers,

Paolo

^ permalink raw reply

* Re: [v2,0/2] net: if_arp: use define instead of hard-coded value
From: David Miller @ 2018-10-23 16:11 UTC (permalink / raw)
  To: haakon.bugge
  Cc: netdev, stephen, kstewart, tglx, gregkh, pombredanne,
	linux-kernel
In-Reply-To: <A1A93434-9496-4029-9A53-2D64D56CB846@oracle.com>

From: Håkon Bugge <haakon.bugge@oracle.com>
Date: Tue, 23 Oct 2018 14:30:57 +0200

> Just a friendly reminder.

Reminder for what?

Your patch was applied to the net-next tree right after it was posted,
what more do you want?

I gather that you have no idea what tree was appropriate for your change
and therefore where you should check to see if it was applied or not.

^ permalink raw reply

* Re: [PATCH] bonding:avoid repeated display of same link status change
From: Eric Dumazet @ 2018-10-23 16:10 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश बंडेवार),
	mk.singh
  Cc: linux-netdev, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, linux-kernel
In-Reply-To: <CAF2d9jjM+5twGtwnB-JvOaaFbU9-n1oNyMXF6wx=s-0fVn9-6w@mail.gmail.com>



On 10/23/2018 08:54 AM, Mahesh Bandewar (महेश बंडेवार) wrote:

> Atomic operations are expensive (on certain architectures) and miimon
> runs quite frequently. Is the added cost of these atomic operations
> even worth just to avoid *duplicate info* messages? This seems like a
> overkill!

atomic_read() is a simple read, no atomic operation involved.

Same remark for atomic_set()

^ permalink raw reply

* Re: [PATCH net-next 1/3] net/sock: factor out dequeue/peek with offset code
From: Paolo Abeni @ 2018-10-23  7:28 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev, David S. Miller, Eric Dumazet, kafai
In-Reply-To: <20181023044929.guyx7uwf5ndt6hiz@ast-mbp>

Hi,

On Mon, 2018-10-22 at 21:49 -0700, Alexei Starovoitov wrote:
> On Mon, May 15, 2017 at 11:01:42AM +0200, Paolo Abeni wrote:
> > And update __sk_queue_drop_skb() to work on the specified queue.
> > This will help the udp protocol to use an additional private
> > rx queue in a later patch.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  include/linux/skbuff.h |  7 ++++
> >  include/net/sock.h     |  4 +--
> >  net/core/datagram.c    | 90 ++++++++++++++++++++++++++++----------------------
> >  3 files changed, 60 insertions(+), 41 deletions(-)
> > 
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index a098d95..bfc7892 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -3056,6 +3056,13 @@ static inline void skb_frag_list_init(struct sk_buff *skb)
> >  
> >  int __skb_wait_for_more_packets(struct sock *sk, int *err, long *timeo_p,
> >  				const struct sk_buff *skb);
> > +struct sk_buff *__skb_try_recv_from_queue(struct sock *sk,
> > +					  struct sk_buff_head *queue,
> > +					  unsigned int flags,
> > +					  void (*destructor)(struct sock *sk,
> > +							   struct sk_buff *skb),
> > +					  int *peeked, int *off, int *err,
> > +					  struct sk_buff **last);
> >  struct sk_buff *__skb_try_recv_datagram(struct sock *sk, unsigned flags,
> >  					void (*destructor)(struct sock *sk,
> >  							   struct sk_buff *skb),
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 66349e4..49d226f 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -2035,8 +2035,8 @@ void sk_reset_timer(struct sock *sk, struct timer_list *timer,
> >  
> >  void sk_stop_timer(struct sock *sk, struct timer_list *timer);
> >  
> > -int __sk_queue_drop_skb(struct sock *sk, struct sk_buff *skb,
> > -			unsigned int flags,
> > +int __sk_queue_drop_skb(struct sock *sk, struct sk_buff_head *sk_queue,
> > +			struct sk_buff *skb, unsigned int flags,
> >  			void (*destructor)(struct sock *sk,
> >  					   struct sk_buff *skb));
> >  int __sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
> > diff --git a/net/core/datagram.c b/net/core/datagram.c
> > index db1866f2..a4592b4 100644
> > --- a/net/core/datagram.c
> > +++ b/net/core/datagram.c
> > @@ -161,6 +161,43 @@ static struct sk_buff *skb_set_peeked(struct sk_buff *skb)
> >  	return skb;
> >  }
> >  
> > +struct sk_buff *__skb_try_recv_from_queue(struct sock *sk,
> > +					  struct sk_buff_head *queue,
> > +					  unsigned int flags,
> > +					  void (*destructor)(struct sock *sk,
> > +							   struct sk_buff *skb),
> > +					  int *peeked, int *off, int *err,
> > +					  struct sk_buff **last)
> > +{
> > +	struct sk_buff *skb;
> > +
> > +	*last = queue->prev;
> 
> this refactoring changed the behavior.
> Now queue->prev is returned as last.
> Whereas it was *last = queue before.
> 
> > +	skb_queue_walk(queue, skb) {
> 
> and *last = skb assignment is gone too.
> 
> Was this intentional ? 

Yes.

> Is this the right behavior?

I think so. queue->prev is the last skb in the queue. With the old
code, __skb_try_recv_datagram(), when returning NULL, used the
instructions you quoted to overall set 'last' to the last skb in the
queue. We did not use 'last' elsewhere. So overall this just reduces the
number of instructions inside the loop (unless I'm missing something).
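
A minimal user-space sketch of that equivalence, for the case where the
walk reaches the end of the queue. It uses a sentinel-headed circular list
laid out like sk_buff_head, not the kernel types themselves. The struct and
function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel-headed circular list, laid out like sk_buff_head/sk_buff. */
struct node {
	struct node *next, *prev;
};

void queue_init(struct node *head)
{
	head->next = head->prev = head;
}

void queue_add_tail(struct node *head, struct node *n)
{
	n->next = head;
	n->prev = head->prev;
	head->prev->next = n;
	head->prev = n;
}

/* Old style: start at the head and update 'last' on every iteration. */
struct node *last_by_walk(struct node *head)
{
	struct node *last = head, *n;

	for (n = head->next; n != head; n = n->next)
		last = n;
	return last;
}

/* New style: read the tail pointer once. */
struct node *last_by_prev(struct node *head)
{
	return head->prev;
}

/* Returns 1 if both approaches agree for a queue of 'count' nodes (max 8). */
int last_matches(int count)
{
	struct node head, nodes[8];
	int i;

	if (count < 0 || count > 8)
		return 0;
	queue_init(&head);
	for (i = 0; i < count; i++)
		queue_add_tail(&head, &nodes[i]);
	return last_by_walk(&head) == last_by_prev(&head);
}
```

For an empty queue both variants yield the head itself, which matches the
old "*last = queue" starting value.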

Are you experiencing any specific issues due to the mentioned commit?

Thanks,

Paolo

^ permalink raw reply

* [PATCH] bonding:avoid repeated display of same link status change
From: mk.singh @ 2018-10-23 15:29 UTC (permalink / raw)
  To: netdev
  Cc: Manish Kumar Singh, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, David S. Miller, linux-kernel

From: Manish Kumar Singh <mk.singh@oracle.com>

When a link status change needs to be committed but the rtnl lock could
not be taken, avoid displaying the same link status change message again.

Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
---
 drivers/net/bonding/bond_main.c | 6 ++++--
 include/net/bonding.h           | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2b01180be834..af9ef889a429 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2096,7 +2096,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			bond_propose_link_state(slave, BOND_LINK_FAIL);
 			commit++;
 			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
 					    (BOND_MODE(bond) ==
 					     BOND_MODE_ACTIVEBACKUP) ?
@@ -2136,7 +2136,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			commit++;
 			slave->delay = bond->params.updelay;
 
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
 					    slave->dev->name,
 					    ignore_updelay ? 0 :
@@ -2310,9 +2310,11 @@ static void bond_mii_monitor(struct work_struct *work)
 		if (!rtnl_trylock()) {
 			delay = 1;
 			should_notify_peers = false;
+			atomic_set(&bond->rtnl_needed, 1);
 			goto re_arm;
 		}
 
+		atomic_set(&bond->rtnl_needed, 0);
 		bond_for_each_slave(bond, slave, iter) {
 			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
 		}
diff --git a/include/net/bonding.h b/include/net/bonding.h
index a4f116f06c50..a4353506bb4f 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -229,6 +229,7 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+	atomic_t rtnl_needed;
 };
 
 #define bond_slave_get_rcu(dev) \
-- 
2.14.1

^ permalink raw reply related

* [PATCH] Revert "be2net: remove desc field from be_eq_obj"
From: Ivan Vecera @ 2018-10-23 14:40 UTC (permalink / raw)
  To: netdev
  Cc: Sathya Perla, Ajit Khaparde, Sriharsha Basavapatna, Somnath Kotur,
	David S. Miller, open list

The mentioned commit needs to be reverted because we cannot pass a
string allocated on the stack to request_irq(). This function stores
the pointer for later use (e.g. /proc/interrupts), so the string must
stay valid for as long as the IRQ is registered.
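
A minimal user-space sketch of the lifetime issue. The registry below is a
toy stand-in for the IRQ core, and every name in it is hypothetical; it is
not the kernel API. Because the registry keeps the caller's pointer rather
than copying the string, the name is stored inside the long-lived object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_IRQS 4

/* Toy registry: keeps the caller's pointer, does NOT copy the string. */
static const char *toy_irq_names[TOY_MAX_IRQS];

int toy_request_irq(int vec, const char *name)
{
	if (vec < 0 || vec >= TOY_MAX_IRQS)
		return -1;
	toy_irq_names[vec] = name;
	return 0;
}

const char *toy_irq_name(int vec)
{
	if (vec < 0 || vec >= TOY_MAX_IRQS)
		return NULL;
	return toy_irq_names[vec];
}

/* The name buffer lives as long as the queue object itself. */
struct toy_eq_obj {
	char desc[32];
};

static struct toy_eq_obj toy_eqs[TOY_MAX_IRQS];

int toy_register_all(const char *ifname)
{
	int i;

	for (i = 0; i < TOY_MAX_IRQS; i++) {
		/* A buffer local to this loop would dangle after return. */
		snprintf(toy_eqs[i].desc, sizeof(toy_eqs[i].desc),
			 "%s-q%d", ifname, i);
		if (toy_request_irq(i, toy_eqs[i].desc))
			return -1;
	}
	return 0;
}
```

Reading a name back through toy_irq_name() after toy_register_all() has
returned is only valid because the buffers outlive the call.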

Fixes: d6d9704af8f4 ("be2net: remove desc field from be_eq_obj")

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
---
 drivers/net/ethernet/emulex/benet/be.h      | 1 +
 drivers/net/ethernet/emulex/benet/be_main.c | 6 ++----
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 58bcee8f0a58..ce041c90adb0 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -185,6 +185,7 @@ static inline void queue_tail_inc(struct be_queue_info *q)
 
 struct be_eq_obj {
 	struct be_queue_info q;
+	char desc[32];
 
 	struct be_adapter *adapter;
 	struct napi_struct napi;
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 534787291b44..bff74752cef1 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3488,11 +3488,9 @@ static int be_msix_register(struct be_adapter *adapter)
 	int status, i, vec;
 
 	for_all_evt_queues(adapter, eqo, i) {
-		char irq_name[IFNAMSIZ+4];
-
-		snprintf(irq_name, sizeof(irq_name), "%s-q%d", netdev->name, i);
+		sprintf(eqo->desc, "%s-q%d", netdev->name, i);
 		vec = be_msix_vec_get(adapter, eqo);
-		status = request_irq(vec, be_msix, 0, irq_name, eqo);
+		status = request_irq(vec, be_msix, 0, eqo->desc, eqo);
 		if (status)
 			goto err_msix;
 
-- 
2.18.1

^ permalink raw reply related

* KASAN: use-after-free Write in hci_sock_release
From: syzbot @ 2018-10-23 14:38 UTC (permalink / raw)
  To: davem, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
	netdev, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    8c60c36d0b8c Add linux-next specific files for 20181019
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=163a6499400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8b6d7c4c81535e89
dashboard link: https://syzkaller.appspot.com/bug?extid=b364ed862aa07c74bc62
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b364ed862aa07c74bc62@syzkaller.appspotmail.com

F2FS-fs (loop5): Can't find valid F2FS filesystem in 1th superblock
F2FS-fs (loop5): Magic Mismatch, valid(0xf2f52010) - read(0x0)
F2FS-fs (loop5): Can't find valid F2FS filesystem in 2th superblock
==================================================================
BUG: KASAN: use-after-free in atomic_dec include/asm-generic/atomic-instrumented.h:127 [inline]
BUG: KASAN: use-after-free in hci_sock_release+0x14f/0x3b0 net/bluetooth/hci_sock.c:873
Write of size 4 at addr ffff880182f492a0 by task syz-executor0/7470

CPU: 1 PID: 7470 Comm: syz-executor0 Not tainted 4.19.0-rc8-next-20181019+ #98
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
  check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
  kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
  atomic_dec include/asm-generic/atomic-instrumented.h:127 [inline]
  hci_sock_release+0x14f/0x3b0 net/bluetooth/hci_sock.c:873
  __sock_release+0xd7/0x250 net/socket.c:580
  sock_close+0x19/0x20 net/socket.c:1142
  __fput+0x3bc/0xa70 fs/file_table.c:279
  ____fput+0x15/0x20 fs/file_table.c:312
  task_work_run+0x1e8/0x2a0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166
  prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
  do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fb9b189cc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000457569
RDX: 000000000000000c RSI: 0000000020000240 RDI: 0000000000000006
RBP: 000000000072bfa0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb9b189d6d4
R13: 00000000004bd6a4 R14: 00000000004cbf20 R15: 00000000ffffffff

Allocated by task 7468:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
  kmem_cache_alloc_trace+0x152/0x750 mm/slab.c:3620
  kmalloc include/linux/slab.h:546 [inline]
  kzalloc include/linux/slab.h:741 [inline]
  hci_alloc_dev+0x228/0x21a0 net/bluetooth/hci_core.c:3116
  __vhci_create_device+0x102/0x580 drivers/bluetooth/hci_vhci.c:114
  vhci_create_device drivers/bluetooth/hci_vhci.c:163 [inline]
  vhci_get_user drivers/bluetooth/hci_vhci.c:219 [inline]
  vhci_write+0x2de/0x470 drivers/bluetooth/hci_vhci.c:299
  call_write_iter include/linux/fs.h:1844 [inline]
  new_sync_write fs/read_write.c:474 [inline]
  __vfs_write+0x6b8/0x9f0 fs/read_write.c:487
  vfs_write+0x1fc/0x560 fs/read_write.c:549
  ksys_write+0x101/0x260 fs/read_write.c:598
  __do_sys_write fs/read_write.c:610 [inline]
  __se_sys_write fs/read_write.c:607 [inline]
  __x64_sys_write+0x73/0xb0 fs/read_write.c:607
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 7467:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
  __cache_free mm/slab.c:3498 [inline]
  kfree+0xcf/0x230 mm/slab.c:3817
  bt_host_release+0x19/0x30 net/bluetooth/hci_sysfs.c:86
  device_release+0x7e/0x210 drivers/base/core.c:891
  kobject_cleanup lib/kobject.c:662 [inline]
  kobject_release lib/kobject.c:691 [inline]
  kref_put include/linux/kref.h:70 [inline]
  kobject_put.cold.9+0x287/0x2e4 lib/kobject.c:708
  put_device+0x20/0x30 drivers/base/core.c:2024
  hci_free_dev+0x19/0x20 net/bluetooth/hci_core.c:3208
  vhci_release+0x7e/0xf0 drivers/bluetooth/hci_vhci.c:355
  __fput+0x3bc/0xa70 fs/file_table.c:279
  ____fput+0x15/0x20 fs/file_table.c:312
  task_work_run+0x1e8/0x2a0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166
  prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
  do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff880182f48240
  which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 4192 bytes inside of
  8192-byte region [ffff880182f48240, ffff880182f4a240)
The buggy address belongs to the page:
page:ffffea00060bd200 count:1 mapcount:0 mapping:ffff8801da802080 index:0x0 compound_mapcount: 0
flags: 0x2fffc0000010200(slab|head)
raw: 02fffc0000010200 ffffea000610f708 ffff8801da801b48 ffff8801da802080
raw: 0000000000000000 ffff880182f48240 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff880182f49180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff880182f49200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff880182f49280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                ^
  ffff880182f49300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff880182f49380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
audit: type=1800 audit(1540276897.489:32): pid=7430 uid=0 auid=4294967295 ses=4294967295 subj==unconfined op=collect_data cause=failed(directio) comm="syz-executor5" name="file0" dev="sda1" ino=16559 res=0
syz-executor5 (7430) used greatest stack depth: 14640 bytes left
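
The address arithmetic in the report above can be cross-checked by hand. The
sketch below is a userspace illustration, not kernel code: the helper names
are hypothetical, the constants are copied from the report, and the 8-byte
shadow granule is an assumption that holds for generic KASAN (one shadow byte
per 8 bytes of memory). It recomputes the "4192 bytes inside" offset, the end
of the 8192-byte region, and which shadow byte of the row marked with '>' the
caret should point at.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied verbatim from the KASAN report above. */
#define ACCESS_ADDR  UINT64_C(0xffff880182f492a0) /* "Write of size 4 at addr" */
#define OBJECT_START UINT64_C(0xffff880182f48240) /* start of the kmalloc-8k object */
#define OBJECT_SIZE  UINT64_C(8192)
#define SHADOW_ROW   UINT64_C(0xffff880182f49280) /* row marked with '>' */

/* Assumption: generic KASAN, where one shadow byte covers 8 bytes. */
#define KASAN_GRANULE UINT64_C(8)

/* Hypothetical helpers, named for this illustration only. */

/* Byte offset of the faulting access from the start of the object. */
static uint64_t offset_in_object(uint64_t addr, uint64_t start)
{
	return addr - start;
}

/* Exclusive end address of the object's region. */
static uint64_t object_end(uint64_t start, uint64_t size)
{
	return start + size;
}

/* Zero-based index of the shadow byte within the printed row. */
static uint64_t shadow_column(uint64_t addr, uint64_t row_start)
{
	return (addr - row_start) / KASAN_GRANULE;
}

/* Print the three derived values so they can be compared with the report. */
static void print_report_arithmetic(void)
{
	printf("offset in object: %" PRIu64 "\n",
	       offset_in_object(ACCESS_ADDR, OBJECT_START));
	printf("object end:       0x%" PRIx64 "\n",
	       object_end(OBJECT_START, OBJECT_SIZE));
	printf("shadow column:    %" PRIu64 "\n",
	       shadow_column(ACCESS_ADDR, SHADOW_ROW));
}
```

Calling print_report_arithmetic() from a main() should reproduce the offset
and region end stated in the report. The shadow column comes out as index 4,
which matches the caret under the fifth "fb" in the memory state dump.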


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.

^ permalink raw reply
