Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Honggang LI @ 2017-04-26 13:33 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Paolo Abeni, Or Gerlitz, Erez Shitrit, Erez Shitrit,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Linux Netdev List, David Miller
In-Reply-To: <1493213258.3041.98.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Wed, Apr 26, 2017 at 09:27:38AM -0400, Doug Ledford wrote:
> On Wed, 2017-04-26 at 20:52 +0800, Honggang LI wrote:
> > On Wed, Apr 26, 2017 at 09:46:34AM +0200, Paolo Abeni wrote:
> > > 
> > > On Wed, 2017-04-26 at 00:50 +0300, Or Gerlitz wrote:
> > > > 
> > > > so maybe @ least for the time being, we should be picking Hong's
> > > > patch
> > > > with proper change log and without the giant stack dump till we
> > > > have
> > > > something better. If you agree, can you do the re-write of the
> > > > change
> > > > log?
> > > 
> > > I think that Hong's patch is following the correct way to fix the
> > > issue: ipoib_hard_header() can't assume that the skb headroom is at
> > > least IPOIB_HARD_LEN bytes, as wrongly implied by commit
> > > fc791b633515
> > > (my fault, I'm sorry).
> > > 
> > > Perhaps we can make the code a little more robust with something
> > > alongside the following (only compile tested):
> > > ---
> > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > > b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > > index d1d3fb7..d53d2e1 100644
> > > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > > @@ -1160,6 +1160,11 @@ static int ipoib_hard_header(struct sk_buff
> > > *skb,
> > >                              const void *daddr, const void *saddr,
> > > unsigned len)
> > >  {
> > >         struct ipoib_header *header;
> > > +       int ret;
> > > +
> > > +       ret = skb_cow_head(skb, IPOIB_HARD_LEN);
> > 
> > I don't think so. When this skb_under_panic arise, all slaves had
> > been
> > removed from a busy bonding interface,
> 
> I'm not sure this entirely makes sense.  If all slaves had been
> removed, then we should never end up in ipoib_hard_header.  That should
> only be called as long as there is at least one slave still attached to
> the bond.  It might be during the process of removing the final slave,

Yes, it is during the process of removing the final slave. The
reproducer looks like this:

ping remote_ip_over_bonding_interface &
while 1; do
	ifdown bond0
	ifup   bond0
done

> but I don't think it can be after the final slave has been fully
> removed from the bond.  Paolo, what should the bond driver be doing
> once the slaves are gone?  Wouldn't it just be dropping every skb on
> the floor without calling anyone's hard header routine?
> 
> >  so it is better to immediately
> > give up and return error.
> > 
> > > 
> > > +       if (ret)
> > > +               return ret;
> > >  
> > >         header = (struct ipoib_header *) skb_push(skb, sizeof
> > > *header);
> > > ---
> > > 
> > > Paolo
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-
> > > rdma" in
> > > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> -- 
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>     GPG KeyID: B826A3330E572FDD
>    
> Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next v9 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
From: Jiri Pirko @ 2017-04-26 13:47 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, simon.horman, netdev
In-Reply-To: <5723f2e0-7738-09f5-23c0-bfe7432d7528@mojatatu.com>

Wed, Apr 26, 2017 at 03:17:59PM CEST, jhs@mojatatu.com wrote:
>On 17-04-26 09:08 AM, Jiri Pirko wrote:
>> Wed, Apr 26, 2017 at 02:42:17PM CEST, jhs@mojatatu.com wrote:
>
>> > +#define VALID_TCA_ROOT_FLAGS TCA_FLAG_LARGE_DUMP_ON
>> > +static inline bool tca_flags_valid(u32 act_flags)
>> > +{
>> > +	u32 invalid_flags_mask  = ~VALID_TCA_ROOT_FLAGS;
>> > +
>> > +	if (act_flags & invalid_flags_mask)
>> > +		return false;
>> > +
>> > +	return true;
>> 
>> This dance should either not be here (flag-per-attr) or should be in
>> netlink generic place. This is not TC specific at all.
>> 
>
>So where do you think it should be?
>
>> I would still like to see the numbers prooving we need this.
>> Thanks
>> 
>
>We are going to have to agree to disagree.

No, I don't agree with this. All I'm asking is if the flag dance you do
is really necessary. This is important, UAPI, set in stone in future.
So please avoid the rush this. Don't send another version please.

Your argument is "performance", so I just asked for the proof that this  
argument is valid. I believe it is not. That's legit, right?              
Please proove me wrong and I'll be happy.
                                                                         
Thanks!

^ permalink raw reply

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Doug Ledford @ 2017-04-26 13:48 UTC (permalink / raw)
  To: Honggang LI
  Cc: Paolo Abeni, Or Gerlitz, Erez Shitrit, Erez Shitrit,
	linux-rdma@vger.kernel.org, Linux Netdev List, David Miller
In-Reply-To: <20170426133353.GA7898@dhcp-13-42.nay.redhat.com>

On Wed, 2017-04-26 at 21:33 +0800, Honggang LI wrote:
> Yes, it is during the process of removing the final slave. The
> reproducer looks like this:
> 
> ping remote_ip_over_bonding_interface &
> while 1; do
> 	ifdown bond0
> 	ifup   bond0
> done

Honestly, I would suspect the problem here is not when removing the
busy interface, but when bringing the interface back up.  IIRC, the
bonding driver defaults to assuming it will be used on an Ethernet
interface.  Once you attach an IB slave, it reconfigures itself for
running over IB instead.  But once it's configured it should stay
configured until after the last IB slave is removed (and once that
slave it removed, the bond should no longer even possess the pointer to
the ipoib_hard_header routine, so we should never call it).

The process, in the bonding driver, for going down and coming back up
needs to look something like this:

ifdown bond0:
  stop all queues
  remove all slaves
  possibly reconfigure to default state which is Ethernet suitable

ifup bond0:
  add first slave
  configure for IB instead of Ethernet
  start queues
  add additional slaves

I'm wondering if, when we have a current backlog, we aren't
accidentally doing this instead:

ifup bond0:
  add first slave
  release backlog queue
  configure for IB instead of Ethernet
  add additional slaves

Or, it might even be more subtle, such as:

ifup bond0:
  add first slave
  configure for IB instead of Ethernet
  start queues
   -> however, there was a backlog item on the queue from prior to
      the first slave being added and the port configured for IB mode,
      so the backlog skb is still configured for the default Ethernet
      mode, hence the failure
  add additional slaves

Maybe the real issue is that inside the bonding driver, when we
reconfigure the queue type from IB to Ethernet or Ethernet to IB, we
need to force either a drop or a reconfiguration of any packets already
currently in our waiting backlog queue of skbs.  Paolo?

> > 
> > but I don't think it can be after the final slave has been fully
> > removed from the bond.  Paolo, what should the bond driver be doing
> > once the slaves are gone?  Wouldn't it just be dropping every skb
> > on
> > the floor without calling anyone's hard header routine?
> > 
> > > 
> > >  so it is better to immediately
> > > give up and return error.
> > > 
> > > > 
> > > > 
> > > > +       if (ret)
> > > > +               return ret;
> > > >  
> > > >         header = (struct ipoib_header *) skb_push(skb, sizeof
> > > > *header);
> > > > ---
> > > > 
> > > > Paolo
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe
> > > > linux-
> > > > rdma" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.h
> > > > tml
> > -- 
> > Doug Ledford <dledford@redhat.com>
> >     GPG KeyID: B826A3330E572FDD
> >    
> > Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57
> > 2FDD
> > 
-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

^ permalink raw reply

* [PATCH v2 net-next] bridge: add per-port broadcast flood flag
From: Mike Manning @ 2017-04-26 13:48 UTC (permalink / raw)
  To: netdev; +Cc: Nikolay Aleksandrov

Support for l2 multicast flood control was added in commit b6cb5ac8331b
("net: bridge: add per-port multicast flood flag"). It allows broadcast
as it was introduced specifically for unknown multicast flood control.
But as broadcast is a special case of multicast, this may also need to
be disabled. For this purpose, introduce a flag to disable the flooding
of received l2 broadcasts. This approach is backwards compatible and
provides flexibility in filtering for the desired packet types.

Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Mike Manning <mmanning@brocade.com>
---
 include/linux/if_bridge.h    |  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c      | 24 +++++++++++++++++-------
 net/bridge/br_if.c           |  2 +-
 net/bridge/br_netlink.c      |  3 +++
 net/bridge/br_sysfs_if.c     |  2 ++
 6 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c5847dc..0c16866 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -48,6 +48,7 @@ struct br_ip_list {
 #define BR_MCAST_FLOOD		BIT(11)
 #define BR_MULTICAST_TO_UNICAST	BIT(12)
 #define BR_VLAN_TUNNEL		BIT(13)
+#define BR_BCAST_FLOOD		BIT(14)
 
 #define BR_DEFAULT_AGEING_TIME	(300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 633aa02..8e56ac7 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -323,6 +323,7 @@ enum {
 	IFLA_BRPORT_MCAST_FLOOD,
 	IFLA_BRPORT_MCAST_TO_UCAST,
 	IFLA_BRPORT_VLAN_TUNNEL,
+	IFLA_BRPORT_BCAST_FLOOD,
 	__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 902af6b..48fb174 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -183,13 +183,23 @@ void br_flood(struct net_bridge *br, struct sk_buff *skb,
 	struct net_bridge_port *p;
 
 	list_for_each_entry_rcu(p, &br->port_list, list) {
-		/* Do not flood unicast traffic to ports that turn it off */
-		if (pkt_type == BR_PKT_UNICAST && !(p->flags & BR_FLOOD))
-			continue;
-		/* Do not flood if mc off, except for traffic we originate */
-		if (pkt_type == BR_PKT_MULTICAST &&
-		    !(p->flags & BR_MCAST_FLOOD) && skb->dev != br->dev)
-			continue;
+		/* Do not flood unicast traffic to ports that turn it off, nor
+		 * other traffic if flood off, except for traffic we originate
+		 */
+		switch (pkt_type) {
+		case BR_PKT_UNICAST:
+			if (!(p->flags & BR_FLOOD))
+				continue;
+			break;
+		case BR_PKT_MULTICAST:
+			if (!(p->flags & BR_MCAST_FLOOD) && skb->dev != br->dev)
+				continue;
+			break;
+		case BR_PKT_BROADCAST:
+			if (!(p->flags & BR_BCAST_FLOOD) && skb->dev != br->dev)
+				continue;
+			break;
+		}
 
 		/* Do not flood to ports that enable proxy ARP */
 		if (p->flags & BR_PROXYARP)
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 6d273ca..b436ea0 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -362,7 +362,7 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
 	p->path_cost = port_cost(dev);
 	p->priority = 0x8000 >> BR_PORT_BITS;
 	p->port_no = index;
-	p->flags = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD;
+	p->flags = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
 	br_init_port(p);
 	br_set_state(p, BR_STATE_DISABLED);
 	br_stp_port_timer_init(p);
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 6509864..a572db71 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -189,6 +189,8 @@ static int br_port_fill_attrs(struct sk_buff *skb,
 		       !!(p->flags & BR_FLOOD)) ||
 	    nla_put_u8(skb, IFLA_BRPORT_MCAST_FLOOD,
 		       !!(p->flags & BR_MCAST_FLOOD)) ||
+	    nla_put_u8(skb, IFLA_BRPORT_BCAST_FLOOD,
+		       !!(p->flags & BR_BCAST_FLOOD)) ||
 	    nla_put_u8(skb, IFLA_BRPORT_PROXYARP, !!(p->flags & BR_PROXYARP)) ||
 	    nla_put_u8(skb, IFLA_BRPORT_PROXYARP_WIFI,
 		       !!(p->flags & BR_PROXYARP_WIFI)) ||
@@ -683,6 +685,7 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
 	br_set_port_flag(p, tb, IFLA_BRPORT_UNICAST_FLOOD, BR_FLOOD);
 	br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_FLOOD, BR_MCAST_FLOOD);
 	br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_TO_UCAST, BR_MULTICAST_TO_UNICAST);
+	br_set_port_flag(p, tb, IFLA_BRPORT_BCAST_FLOOD, BR_BCAST_FLOOD);
 	br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP, BR_PROXYARP);
 	br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP_WIFI, BR_PROXYARP_WIFI);
 
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 79aee75..5d5d413a 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -173,6 +173,7 @@ BRPORT_ATTR_FLAG(unicast_flood, BR_FLOOD);
 BRPORT_ATTR_FLAG(proxyarp, BR_PROXYARP);
 BRPORT_ATTR_FLAG(proxyarp_wifi, BR_PROXYARP_WIFI);
 BRPORT_ATTR_FLAG(multicast_flood, BR_MCAST_FLOOD);
+BRPORT_ATTR_FLAG(broadcast_flood, BR_BCAST_FLOOD);
 
 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 static ssize_t show_multicast_router(struct net_bridge_port *p, char *buf)
@@ -221,6 +222,7 @@ static const struct brport_attribute *brport_attrs[] = {
 	&brport_attr_proxyarp,
 	&brport_attr_proxyarp_wifi,
 	&brport_attr_multicast_flood,
+	&brport_attr_broadcast_flood,
 	NULL
 };
 
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH net-next v9 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
From: Jiri Pirko @ 2017-04-26 13:48 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, simon.horman, netdev
In-Reply-To: <fc75c67b-2413-a9a9-2411-7789eef43203@mojatatu.com>

Wed, Apr 26, 2017 at 03:19:54PM CEST, jhs@mojatatu.com wrote:
>On 17-04-26 09:17 AM, Jamal Hadi Salim wrote:
>> On 17-04-26 09:08 AM, Jiri Pirko wrote:
>> > Wed, Apr 26, 2017 at 02:42:17PM CEST, jhs@mojatatu.com wrote:
>> 
>> > > +#define VALID_TCA_ROOT_FLAGS TCA_FLAG_LARGE_DUMP_ON
>> > > +static inline bool tca_flags_valid(u32 act_flags)
>> > > +{
>> > > +    u32 invalid_flags_mask  = ~VALID_TCA_ROOT_FLAGS;
>> > > +
>> > > +    if (act_flags & invalid_flags_mask)
>> > > +        return false;
>> > > +
>> > > +    return true;
>> > 
>> > This dance should either not be here (flag-per-attr) or should be in
>> > netlink generic place. This is not TC specific at all.
>> > 
>> 
>> So where do you think it should be?
>
>It would also be helpful for you to make comments when these things
>show up. This change was in version 6. I have had to do this back and
>forth a few times.

I still think that this whole thing is wrong, so that is why. Version 6
it is just because you are pushing versions too fast without actually
reaching consensus (I know why you are doing this, you really want to
push this through. I think it is not good).

^ permalink raw reply

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Doug Ledford @ 2017-04-26 13:50 UTC (permalink / raw)
  To: Honggang LI
  Cc: Paolo Abeni, Or Gerlitz, Erez Shitrit, Erez Shitrit,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Linux Netdev List, David Miller
In-Reply-To: <1493214483.3041.108.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Wed, 2017-04-26 at 09:48 -0400, Doug Ledford wrote:
> On Wed, 2017-04-26 at 21:33 +0800, Honggang LI wrote:
> > 
> > Yes, it is during the process of removing the final slave. The
> > reproducer looks like this:
> > 
> > ping remote_ip_over_bonding_interface &
> > while 1; do
> > 	ifdown bond0
> > 	ifup   bond0
> > done

BTW, rerunning your test as:

ping remote_ip_over_bonding_interface &
while 1; do
    echo -n "Downing interface..."
    ifdown bond0
    echo "done."
    sleep 1
    echo -n "Bringing interface up..."
    ifup bond0
    echo "done."
done

will allow you to more precisely identify whether the failure is during
the down or the up of the interface.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH v3 net-next]smsc911x: Adding support for Micochip LAN9250 Ethernet controller
From: David.Cai @ 2017-04-26 13:55 UTC (permalink / raw)
  To: netdev, davem; +Cc: UNGLinuxDriver, steve.glendinning

From: David Cai <david.cai@microchip.com>

Adding support for Microchip LAN9250 Ethernet controller.

Signed-off-by: David Cai <david.cai@microchip.com>
---
Changes
V2
 -  email format changed
 - remove unnecessary text in commit log Changes
V3
 - defined all supported Ethernet controller chip ID.
 - removed pdata->sub_generation = 0;
 - changed 'if (pdata->sub_generation)' to 
 	 if (pdata>generation == 4 && pdata->sub_generation)
 
 drivers/net/ethernet/smsc/smsc911x.c | 55 ++++++++++++++++++++++++------------
 drivers/net/ethernet/smsc/smsc911x.h | 19 +++++++++++++
 2 files changed, 56 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index fa5ca09..e35fe96 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -25,7 +25,7 @@
  *   LAN9215, LAN9216, LAN9217, LAN9218
  *   LAN9210, LAN9211
  *   LAN9220, LAN9221
- *   LAN89218
+ *   LAN89218,LAN9250
  *
  */
 
@@ -104,6 +104,9 @@ struct smsc911x_data {
 	/* used to decide which workarounds apply */
 	unsigned int generation;
 
+	/* used to decide which sub generation product work arounds to apply */
+	unsigned int sub_generation;
+
 	/* device configuration (copied from platform_data during probe) */
 	struct smsc911x_platform_config config;
 
@@ -1450,6 +1453,8 @@ static int smsc911x_soft_reset(struct smsc911x_data *pdata)
 	unsigned int timeout;
 	unsigned int temp;
 	int ret;
+	unsigned int reset_offset = HW_CFG;
+	unsigned int reset_mask = HW_CFG_SRST_;
 
 	/*
 	 * Make sure to power-up the PHY chip before doing a reset, otherwise @@ -1476,15 +1481,23 @@ static int smsc911x_soft_reset(struct smsc911x_data *pdata)
 		}
 	}
 
+	if (pdata->generation == 4 && pdata->sub_generation) {
+		/* special reset for  LAN9250 */
+		reset_offset = RESET_CTL;
+		reset_mask = RESET_CTL_DIGITAL_RST_;
+	}
+
 	/* Reset the LAN911x */
-	smsc911x_reg_write(pdata, HW_CFG, HW_CFG_SRST_);
+	smsc911x_reg_write(pdata, reset_offset, reset_mask);
+
+	/* verify reset bit is cleared */
 	timeout = 10;
 	do {
 		udelay(10);
-		temp = smsc911x_reg_read(pdata, HW_CFG);
-	} while ((--timeout) && (temp & HW_CFG_SRST_));
+		temp = smsc911x_reg_read(pdata, reset_offset);
+	} while ((--timeout) && (temp & reset_mask));
 
-	if (unlikely(temp & HW_CFG_SRST_)) {
+	if (unlikely(temp & reset_mask)) {
 		SMSC_WARN(pdata, drv, "Failed to complete reset");
 		return -EIO;
 	}
@@ -2253,31 +2266,37 @@ static int smsc911x_init(struct net_device *dev)
 
 	pdata->idrev = smsc911x_reg_read(pdata, ID_REV);
 	switch (pdata->idrev & 0xFFFF0000) {
-	case 0x01180000:
-	case 0x01170000:
-	case 0x01160000:
-	case 0x01150000:
-	case 0x218A0000:
+	case LAN9118:
+	case LAN9117:
+	case LAN9116:
+	case LAN9115:
+	case LAN89218:
 		/* LAN911[5678] family */
 		pdata->generation = pdata->idrev & 0x0000FFFF;
 		break;
 
-	case 0x118A0000:
-	case 0x117A0000:
-	case 0x116A0000:
-	case 0x115A0000:
+	case LAN9218:
+	case LAN9217:
+	case LAN9216:
+	case LAN9215:
 		/* LAN921[5678] family */
 		pdata->generation = 3;
 		break;
 
-	case 0x92100000:
-	case 0x92110000:
-	case 0x92200000:
-	case 0x92210000:
+	case LAN9210:
+	case LAN9211:
+	case LAN9220:
+	case LAN9221:
 		/* LAN9210/LAN9211/LAN9220/LAN9221 */
 		pdata->generation = 4;
 		break;
 
+	case LAN9250:
+		/* LAN9250 */
+		pdata->generation = 4;
+		pdata->sub_generation = 1;
+		break;
+
 	default:
 		SMSC_WARN(pdata, probe, "LAN911x not identified, idrev: 0x%08X",
 			  pdata->idrev);
diff --git a/drivers/net/ethernet/smsc/smsc911x.h b/drivers/net/ethernet/smsc/smsc911x.h
index 54d6489..8d75508 100644
--- a/drivers/net/ethernet/smsc/smsc911x.h
+++ b/drivers/net/ethernet/smsc/smsc911x.h
@@ -20,6 +20,22 @@
 #ifndef __SMSC911X_H__
 #define __SMSC911X_H__
 
+/*Chip ID*/
+#define LAN9115	0x01150000
+#define LAN9116	0x01160000
+#define LAN9117	0x01170000
+#define LAN9118	0x01180000
+#define LAN9215	0x115A0000
+#define LAN9216	0x116A0000
+#define LAN9217	0x117A0000
+#define LAN9218	0x118A0000
+#define LAN9210	0x92100000
+#define LAN9211	0x92110000
+#define LAN9220	0x92200000
+#define LAN9221	0x92210000
+#define LAN9250	0x92500000
+#define LAN89218	0x218A0000
+
 #define TX_FIFO_LOW_THRESHOLD	((u32)1600)
 #define SMSC911X_EEPROM_SIZE	((u32)128)
 #define USE_DEBUG		0
@@ -303,6 +319,9 @@
 #define E2P_DATA_EEPROM_DATA_		0x000000FF
 #define LAN_REGISTER_EXTENT		0x00000100
 
+#define RESET_CTL			0x1F8
+#define RESET_CTL_DIGITAL_RST_		0x00000001
+
 /*
  * MAC Control and Status Register (Indirect Address)
  * Offset (through the MAC_CSR CMD and DATA port)

^ permalink raw reply related

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
From: Jiri Pirko @ 2017-04-26 13:56 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise
In-Reply-To: <10fe2c22-8e76-543e-dd24-ddce5813ab69@mojatatu.com>

Wed, Apr 26, 2017 at 03:14:38PM CEST, jhs@mojatatu.com wrote:
>On 17-04-26 08:08 AM, Jiri Pirko wrote:
>> Wed, Apr 26, 2017 at 01:48:29PM CEST, jhs@mojatatu.com wrote:
>> > On 17-04-26 02:19 AM, Jiri Pirko wrote:
>> > > Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:
>> > > > On 17-04-25 12:04 PM, Jiri Pirko wrote:
>
>> > I have experience with dealing with a massive amount of various dumps
>> > and (batch) sets and it always boils down to one thing:
>> > _how much data is exchanged between user and kernel_
>> > 3 flags encoded as bits in a u32 attribute cost 64 bits.
>> > Encoded separately cost 3x that.
>> > 
>> > Believe me, it _does make a difference_ in performance.
>> > 
>> > My least favorite subsystem is bridge. The bridge code has
>> > tons of flags in those entries that are sent to/from kernel as u8
>> > attributes. It is painful.
>> > 
>> > For something more recent, lets look at this commit from Ben on Flower:
>> > +       TCA_FLOWER_KEY_MPLS_TTL,        /* u8 - 8 bits */
>> > +       TCA_FLOWER_KEY_MPLS_BOS,        /* u8 - 1 bit */
>> > +       TCA_FLOWER_KEY_MPLS_TC,         /* u8 - 3 bits */
>> > +       TCA_FLOWER_KEY_MPLS_LABEL,      /* be32 - 20 bits */
>> > 
>> > Yes, that looks pretty, but:
>> > That would have fit in one attribute with a u32. Mask attributes would
>> > be eliminated with a second 32 bit - all in the same singular
>> > attribute.
>> > 
>> > Imagine if i have 1M flower entries. If you add up the mask the cost
>> > of these things is about 3*2*64 bits more per entry compared to putting
>> > the mpls info/mask in one attribute.
>> > At 1M entries that is a few MBs of data being exchanged.
>> 
>> I can do the math :) Yet still, I would like to see the numbers :)
>> Because I believe that is the only way to end this lenghty converstation
>> once and forever...
>> 
>
>Jiri, what are you arguing about if you have done the math? ;->

I can do 3*2*64. What I cannot do is to figure out the real performance
impact.


>You want me to show you that getting or setting less data is good for
>performance?
>Look at the third patch: Why do i think it is necessary to send only
>actions that have changed? Precisely to reduce the amount of data
>being transported. The second patch - to reduce the amount of crossing
>user space to kernel space (which is going to happen more with increased
>data I have to transport between the user and the kernel).
>
>Again: You are looking at this from a manageability point of view which
>is useful but not the only input into a design. If i can squeeze more
>data without killing usability - I am all for it. It just doesnt
>compute that it is ok to use a flag per attribute because it looks
>beautiful.

Hmm. Now that I'm thinking about it, why don't we have NLA_FLAGS with
couple of helpers around it? It will be obvious what the attr is, all
kernel code would use the same helpers. Would be nice.

^ permalink raw reply

* Re: net/ipv6: use-after-free in __call_rcu/in6_dev_finish_destroy_rcu
From: Paul E. McKenney @ 2017-04-26 13:59 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Eric Dumazet,
	Cong Wang, Dmitry Vyukov, Kostya Serebryany, syzkaller
In-Reply-To: <CAAeHK+y=YqG0Kb2chQ=_2E8aPttMetQjZPCxFibHKCfktQS2PQ@mail.gmail.com>

On Wed, Apr 26, 2017 at 02:34:15PM +0200, Andrey Konovalov wrote:
> Hi,
> 
> I've got the following error report while fuzzing the kernel with syzkaller.
> 
> On commit 5a7ad1146caa895ad718a534399e38bd2ba721b7 (4.11-rc8).
> 
> Unfortunately it's not reproducible.
> 
> I'm not sure whether is is an issue with rcu or ipv6.
> 
> ==================================================================
> BUG: KASAN: use-after-free in __call_rcu.constprop.77+0x13be/0x1640

Does building with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y show any splats?
(Yeah, kind of a stupid question if it is not reproducible, but had
to ask!)

							Thanx, Paul

> kernel/rcu/tree.c:3269 at addr ffff88003b842280
> Write of size 8 by task kworker/u10:1/180
> CPU: 2 PID: 180 Comm: kworker/u10:1 Not tainted 4.11.0-rc8+ #270
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: events_unbound call_usermodehelper_exec_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x192/0x22d lib/dump_stack.c:52
>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:164
>  print_address_description mm/kasan/report.c:202 [inline]
>  kasan_report_error mm/kasan/report.c:291 [inline]
>  kasan_report+0x252/0x510 mm/kasan/report.c:347
>  __asan_report_store8_noabort+0x17/0x20 mm/kasan/report.c:373
>  __call_rcu.constprop.77+0x13be/0x1640 kernel/rcu/tree.c:3269
>  call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288
>  free_pid+0x446/0x5d0 kernel/pid.c:293
>  __change_pid+0x2a1/0x3d0 kernel/pid.c:411
>  detach_pid+0x1f/0x30 kernel/pid.c:416
>  __unhash_process kernel/exit.c:74 [inline]
>  __exit_signal kernel/exit.c:155 [inline]
>  release_task+0xbb0/0x1d90 kernel/exit.c:199
>  wait_task_zombie kernel/exit.c:1230 [inline]
>  wait_consider_task+0x11fe/0x3410 kernel/exit.c:1458
>  do_wait_thread kernel/exit.c:1521 [inline]
>  do_wait+0x3ea/0x8e0 kernel/exit.c:1592
>  SYSC_wait4 kernel/exit.c:1720 [inline]
>  SyS_wait4+0x208/0x340 kernel/exit.c:1689
>  call_usermodehelper_exec_sync kernel/kmod.c:292 [inline]
>  call_usermodehelper_exec_work+0x1a7/0x2b0 kernel/kmod.c:329
>  process_one_work+0x9f7/0x1580 kernel/workqueue.c:2097
>  worker_thread+0x1df/0x14b0 kernel/workqueue.c:2231
>  kthread+0x31f/0x3f0 kernel/kthread.c:231
>  ret_from_fork+0x2c/0x40 arch/x86/entry/entry_64.S:430
> Object at ffff88003b842018, in cache kmalloc-1024 size: 1024
> Allocated:
> PID = 1
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616
>  kmem_cache_alloc_trace+0x61/0x170 mm/slub.c:2745
>  kmalloc include/linux/slab.h:490 [inline]
>  kzalloc include/linux/slab.h:663 [inline]
>  ipv6_add_dev+0x199/0x1380 net/ipv6/addrconf.c:380
>  addrconf_init+0xd0/0x29a net/ipv6/addrconf.c:6405
>  inet6_init+0x2f6/0x584 net/ipv6/af_inet6.c:962
>  do_one_initcall+0xf3/0x380 init/main.c:792
>  do_initcall_level init/main.c:858 [inline]
>  do_initcalls init/main.c:866 [inline]
>  do_basic_setup init/main.c:884 [inline]
>  kernel_init_freeable+0x54d/0x622 init/main.c:1035
>  kernel_init+0x13/0x180 init/main.c:959
>  ret_from_fork+0x2c/0x40 arch/x86/entry/entry_64.S:430
> Freed:
> PID = 6479
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589
>  slab_free_hook mm/slub.c:1357 [inline]
>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>  slab_free mm/slub.c:2961 [inline]
>  kfree+0x91/0x190 mm/slub.c:3882
>  in6_dev_finish_destroy_rcu+0x97/0xc0 net/ipv6/addrconf_core.c:150
>  __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
>  rcu_do_batch.isra.65+0x6de/0xbd0 kernel/rcu/tree.c:2879
>  invoke_rcu_callbacks kernel/rcu/tree.c:3142 [inline]
>  __rcu_process_callbacks kernel/rcu/tree.c:3109 [inline]
>  rcu_process_callbacks+0x23f/0x810 kernel/rcu/tree.c:3126
>  __do_softirq+0x253/0x78b kernel/softirq.c:284
> Memory state around the buggy address:
>  ffff88003b842180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff88003b842200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >ffff88003b842280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                    ^
>  ffff88003b842300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff88003b842380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
> 

^ permalink raw reply

* Re: [PATCH net v3 1/3] net: hns: support deferred probe when can not obtain irq
From: Matthias Brugger @ 2017-04-26 14:03 UTC (permalink / raw)
  To: Yankejian, davem, salil.mehta, yisen.zhuang, lipeng321, zhouhuiru,
	huangdaode
  Cc: netdev, linuxarm
In-Reply-To: <1493190022-91343-2-git-send-email-yankejian@huawei.com>



On 26/04/17 09:00, Yankejian wrote:
> From: lipeng <lipeng321@huawei.com>
>
> In the hip06 and hip07 SoCs, the interrupt lines from the
> DSAF controllers are connected to mbigen hw module.
> The mbigen module is probed with module_init, and, as such,
> is not guaranteed to probe before the HNS driver. So we need
> to support deferred probe.
>
> We check for probe deferral in the hw layer probe, so we not
> probe into the main layer and memories, etc., to later learn
> that we need to defer the probe.

This paragraph does not hold any more and should be deleted.

Other then this:
Reviewed-by: Matthias Brugger <mbrugger@suse.com>

>
> Signed-off-by: lipeng <lipeng321@huawei.com>
> Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
> ---
> V2 -> V3:
> 1. Check return value when  platform_get_irq in hns_rcb_get_cfg;
> ---
>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c | 4 +++-
>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c | 8 +++++++-
>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h | 2 +-
>  3 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
> index 6ea8722..a41cf95 100644
> --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
> +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
> @@ -510,7 +510,9 @@ int hns_ppe_init(struct dsaf_device *dsaf_dev)
>
>  		hns_ppe_get_cfg(dsaf_dev->ppe_common[i]);
>
> -		hns_rcb_get_cfg(dsaf_dev->rcb_common[i]);
> +		ret = hns_rcb_get_cfg(dsaf_dev->rcb_common[i]);
> +		if (ret)
> +			goto get_rcb_cfg_fail;
>  	}
>
>  	for (i = 0; i < HNS_PPE_COM_NUM; i++)
> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
> index f0ed80d6..673a5d3 100644
> --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
> +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
> @@ -452,7 +452,7 @@ static int hns_rcb_get_base_irq_idx(struct rcb_common_cb *rcb_common)
>   *hns_rcb_get_cfg - get rcb config
>   *@rcb_common: rcb common device
>   */
> -void hns_rcb_get_cfg(struct rcb_common_cb *rcb_common)
> +int hns_rcb_get_cfg(struct rcb_common_cb *rcb_common)
>  {
>  	struct ring_pair_cb *ring_pair_cb;
>  	u32 i;
> @@ -477,10 +477,16 @@ void hns_rcb_get_cfg(struct rcb_common_cb *rcb_common)
>  		ring_pair_cb->virq[HNS_RCB_IRQ_IDX_RX] =
>  		is_ver1 ? platform_get_irq(pdev, base_irq_idx + i * 2 + 1) :
>  			  platform_get_irq(pdev, base_irq_idx + i * 3);
> +		if ((ring_pair_cb->virq[HNS_RCB_IRQ_IDX_TX] == -EPROBE_DEFER) ||
> +		    (ring_pair_cb->virq[HNS_RCB_IRQ_IDX_RX] == -EPROBE_DEFER))
> +			return -EPROBE_DEFER;
> +
>  		ring_pair_cb->q.phy_base =
>  			RCB_COMM_BASE_TO_RING_BASE(rcb_common->phy_base, i);
>  		hns_rcb_ring_pair_get_cfg(ring_pair_cb);
>  	}
> +
> +	return 0;
>  }
>
>  /**
> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
> index 99b4e1b..3d7b484 100644
> --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
> +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
> @@ -110,7 +110,7 @@ struct rcb_common_cb {
>  void hns_rcb_common_free_cfg(struct dsaf_device *dsaf_dev, u32 comm_index);
>  int hns_rcb_common_init_hw(struct rcb_common_cb *rcb_common);
>  void hns_rcb_start(struct hnae_queue *q, u32 val);
> -void hns_rcb_get_cfg(struct rcb_common_cb *rcb_common);
> +int hns_rcb_get_cfg(struct rcb_common_cb *rcb_common);
>  void hns_rcb_get_queue_mode(enum dsaf_mode dsaf_mode,
>  			    u16 *max_vfn, u16 *max_q_per_vf);
>
>

^ permalink raw reply

* [PATCH net-next] fib_rules: fix error return code
From: Wei Yongjun @ 2017-04-26 14:03 UTC (permalink / raw)
  To: Lorenzo Colitti, Ido Schimmel, David Ahern, Mateusz Bajorski,
	Johannes Berg
  Cc: Wei Yongjun, netdev

From: Wei Yongjun <weiyongjun1@huawei.com>

Fix to return error code -EINVAL from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 622ec2c9d524 ("net: core: add UID to flows, rules, and routes")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
---
 net/core/fib_rules.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index c58c1df..f21c4d3 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -440,6 +440,7 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (tb[FRA_TUN_ID])
 		rule->tun_id = nla_get_be64(tb[FRA_TUN_ID]);
 
+	err = -EINVAL;
 	if (tb[FRA_L3MDEV]) {
 #ifdef CONFIG_NET_L3_MASTER_DEV
 		rule->l3mdev = nla_get_u8(tb[FRA_L3MDEV]);
@@ -461,7 +462,6 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr *nlh,
 	else
 		rule->suppress_ifgroup = -1;
 
-	err = -EINVAL;
 	if (tb[FRA_GOTO]) {
 		if (rule->action != FR_ACT_GOTO)
 			goto errout_free;
@@ -592,8 +592,10 @@ int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	if (tb[FRA_UID_RANGE]) {
 		range = nla_get_kuid_range(tb);
-		if (!uid_range_set(&range))
+		if (!uid_range_set(&range)) {
+			err = -EINVAL;
 			goto errout;
+		}
 	} else {
 		range = fib_kuid_range_unset;
 	}

^ permalink raw reply related

* Re: [PATCH net v3 2/3] net: hns: support deferred probe when no mdio
From: Matthias Brugger @ 2017-04-26 14:07 UTC (permalink / raw)
  To: Yankejian, davem, salil.mehta, yisen.zhuang, lipeng321, zhouhuiru,
	huangdaode
  Cc: netdev, linuxarm
In-Reply-To: <1493190022-91343-3-git-send-email-yankejian@huawei.com>



On 26/04/17 09:00, Yankejian wrote:
> From: lipeng <lipeng321@huawei.com>
>
> In the hip06 and hip07 SoCs, phy connect to mdio bus.The mdio
> module is probed with module_init, and, as such,
> is not guaranteed to probe before the HNS driver. So we need
> to support deferred probe.
>
> We check for probe deferral in the mac init, so we not init DSAF
> when there is no mdio, and free all resource, to later learn that
> we need to defer the probe.
>
> Signed-off-by: lipeng <lipeng321@huawei.com>
> Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>

Reviewed-by: Matthias Brugger <mbrugger@suse.com>

> ---
>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c | 39 +++++++++++++++++++----
>  1 file changed, 33 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
> index bdd8cdd..8c00e09 100644
> --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
> +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
> @@ -735,6 +735,8 @@ static int hns_mac_init_ex(struct hns_mac_cb *mac_cb)
>  	rc = phy_device_register(phy);
>  	if (rc) {
>  		phy_device_free(phy);
> +		dev_err(&mdio->dev, "registered phy fail at address %i\n",
> +			addr);
>  		return -ENODEV;
>  	}
>
> @@ -745,7 +747,7 @@ static int hns_mac_init_ex(struct hns_mac_cb *mac_cb)
>  	return 0;
>  }
>
> -static void hns_mac_register_phy(struct hns_mac_cb *mac_cb)
> +static int hns_mac_register_phy(struct hns_mac_cb *mac_cb)
>  {
>  	struct acpi_reference_args args;
>  	struct platform_device *pdev;
> @@ -755,24 +757,39 @@ static void hns_mac_register_phy(struct hns_mac_cb *mac_cb)
>
>  	/* Loop over the child nodes and register a phy_device for each one */
>  	if (!to_acpi_device_node(mac_cb->fw_port))
> -		return;
> +		return -ENODEV;
>
>  	rc = acpi_node_get_property_reference(
>  			mac_cb->fw_port, "mdio-node", 0, &args);
>  	if (rc)
> -		return;
> +		return rc;
>
>  	addr = hns_mac_phy_parse_addr(mac_cb->dev, mac_cb->fw_port);
>  	if (addr < 0)
> -		return;
> +		return addr;
>
>  	/* dev address in adev */
>  	pdev = hns_dsaf_find_platform_device(acpi_fwnode_handle(args.adev));
> +	if (!pdev) {
> +		dev_err(mac_cb->dev, "mac%d mdio pdev is NULL\n",
> +			mac_cb->mac_id);
> +		return  -EINVAL;
> +	}
> +
>  	mii_bus = platform_get_drvdata(pdev);
> +	if (!mii_bus) {
> +		dev_err(mac_cb->dev,
> +			"mac%d mdio is NULL, dsaf will probe again later\n",
> +			mac_cb->mac_id);
> +		return -EPROBE_DEFER;
> +	}
> +
>  	rc = hns_mac_register_phydev(mii_bus, mac_cb, addr);
>  	if (!rc)
>  		dev_dbg(mac_cb->dev, "mac%d register phy addr:%d\n",
>  			mac_cb->mac_id, addr);
> +
> +	return rc;
>  }
>
>  #define MAC_MEDIA_TYPE_MAX_LEN		16
> @@ -793,7 +810,7 @@ static void hns_mac_register_phy(struct hns_mac_cb *mac_cb)
>   *@np:device node
>   * return: 0 --success, negative --fail
>   */
> -static int  hns_mac_get_info(struct hns_mac_cb *mac_cb)
> +static int hns_mac_get_info(struct hns_mac_cb *mac_cb)
>  {
>  	struct device_node *np;
>  	struct regmap *syscon;
> @@ -903,7 +920,15 @@ static int  hns_mac_get_info(struct hns_mac_cb *mac_cb)
>  			}
>  		}
>  	} else if (is_acpi_node(mac_cb->fw_port)) {
> -		hns_mac_register_phy(mac_cb);
> +		ret = hns_mac_register_phy(mac_cb);
> +		/*
> +		 * Mac can work well if there is phy or not.If the port don't
> +		 * connect with phy, the return value will be ignored. Only
> +		 * when there is phy but can't find mdio bus, the return value
> +		 * will be handled.
> +		 */
> +		if (ret == -EPROBE_DEFER)
> +			return ret;
>  	} else {
>  		dev_err(mac_cb->dev, "mac%d cannot find phy node\n",
>  			mac_cb->mac_id);
> @@ -1065,6 +1090,7 @@ int hns_mac_init(struct dsaf_device *dsaf_dev)
>  			dsaf_dev->mac_cb[port_id] = mac_cb;
>  		}
>  	}
> +
>  	/* init mac_cb for all port */
>  	for (port_id = 0; port_id < max_port_num; port_id++) {
>  		mac_cb = dsaf_dev->mac_cb[port_id];
> @@ -1074,6 +1100,7 @@ int hns_mac_init(struct dsaf_device *dsaf_dev)
>  		ret = hns_mac_get_cfg(dsaf_dev, mac_cb);
>  		if (ret)
>  			return ret;
> +
>  		ret = hns_mac_init_ex(mac_cb);
>  		if (ret)
>  			return ret;
>

^ permalink raw reply

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Honggang LI @ 2017-04-26 14:11 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Paolo Abeni, Or Gerlitz, Erez Shitrit, Erez Shitrit,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Linux Netdev List, David Miller
In-Reply-To: <1493214638.3041.110.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Wed, Apr 26, 2017 at 09:50:38AM -0400, Doug Ledford wrote:
> On Wed, 2017-04-26 at 09:48 -0400, Doug Ledford wrote:
> > On Wed, 2017-04-26 at 21:33 +0800, Honggang LI wrote:
> > > 
> > > Yes, it is during the process of removing the final slave. The
> > > reproducer looks like this:
> > > 
> > > ping remote_ip_over_bonding_interface &
> > > while 1; do
> > > 	ifdown bond0
> > > 	ifup   bond0
> > > done
> 
> BTW, rerunning your test as:
> 
> ping remote_ip_over_bonding_interface &
> while 1; do
>     echo -n "Downing interface..."
>     ifdown bond0

Confirmed panic in here.

>     echo "done."
>     sleep 1
>     echo -n "Bringing interface up..."
>     ifup bond0
>     echo "done."
> done
> 
> will allow you to more precisely identify whether the failure is during
> the down or the up of the interface.
> 
> -- 
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>     GPG KeyID: B826A3330E572FDD
>    
> Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: net/tcp: warning in tcp_try_coalesce/skb_try_coalesce
From: Eric Dumazet @ 2017-04-26 14:16 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Patrick McHardy, netdev, LKML, Dmitry Vyukov, Kostya Serebryany,
	syzkaller
In-Reply-To: <CAAeHK+yQJUuuv2XM6tQuaE12Zo1htujz6A7Qd_h+E9qCVEwDSg@mail.gmail.com>

On Wed, Apr 26, 2017 at 5:08 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
> On Tue, Jan 31, 2017 at 2:17 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> Hi,
>>
>> I've got the following error report while running the syzkaller fuzzer.
>>
>> On commit 566cf877a1fcb6d6dc0126b076aad062054c2637 (4.10-rc6).
>>
>> The fuzzer hits this issue quite often, but I don't have a working reproducer.
>
> I still see this on 5a7ad1146caa895ad718a534399e38bd2ba721b7 (4.11-rc8).
>
> I also have a reproducer now, attached.
>
> It takes around 10 seconds for the warning to trigger.
>

It does not trigger for me, but seeing that you use SO_ATTACH_FILTER,
I have a pretty good idea how to remove this splat.

A call to skb_condense() from ___pskb_trim() will likely help, I will
post a patch after my commute.

Thanks.

^ permalink raw reply

* Re: [PATCH v2 net-next] bridge: add per-port broadcast flood flag
From: Nikolay Aleksandrov @ 2017-04-26 14:19 UTC (permalink / raw)
  To: Mike Manning, netdev
In-Reply-To: <1493214489-9921-1-git-send-email-mmanning@brocade.com>

On 26/04/17 16:48, Mike Manning wrote:
> Support for l2 multicast flood control was added in commit b6cb5ac8331b
> ("net: bridge: add per-port multicast flood flag"). It allows broadcast
> as it was introduced specifically for unknown multicast flood control.
> But as broadcast is a special case of multicast, this may also need to
> be disabled. For this purpose, introduce a flag to disable the flooding
> of received l2 broadcasts. This approach is backwards compatible and
> provides flexibility in filtering for the desired packet types.
> 
> Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> Signed-off-by: Mike Manning <mmanning@brocade.com>
> ---
>  include/linux/if_bridge.h    |  1 +
>  include/uapi/linux/if_link.h |  1 +
>  net/bridge/br_forward.c      | 24 +++++++++++++++++-------
>  net/bridge/br_if.c           |  2 +-
>  net/bridge/br_netlink.c      |  3 +++
>  net/bridge/br_sysfs_if.c     |  2 ++
>  6 files changed, 25 insertions(+), 8 deletions(-)
> 

Thank you, now we can control all types.

Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

^ permalink raw reply

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Doug Ledford @ 2017-04-26 14:25 UTC (permalink / raw)
  To: Honggang LI
  Cc: Paolo Abeni, Or Gerlitz, Erez Shitrit, Erez Shitrit,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Linux Netdev List, David Miller
In-Reply-To: <20170426141139.GA14635-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>

On Wed, 2017-04-26 at 22:11 +0800, Honggang LI wrote:
> On Wed, Apr 26, 2017 at 09:50:38AM -0400, Doug Ledford wrote:
> > 
> > On Wed, 2017-04-26 at 09:48 -0400, Doug Ledford wrote:
> > > 
> > > On Wed, 2017-04-26 at 21:33 +0800, Honggang LI wrote:
> > > > 
> > > > 
> > > > Yes, it is during the process of removing the final slave. The
> > > > reproducer looks like this:
> > > > 
> > > > ping remote_ip_over_bonding_interface &
> > > > while 1; do
> > > > 	ifdown bond0
> > > > 	ifup   bond0
> > > > done
> > 
> > BTW, rerunning your test as:
> > 
> > ping remote_ip_over_bonding_interface &
> > while 1; do
> >     echo -n "Downing interface..."
> >     ifdown bond0
> 
> Confirmed panic in here.

OK, this leads me to suspect that the bonding driver is possibly
reconfiguring the skb setup to Ethernet mode before the last slave is
fully dropped from the bond, and so the slave is seeing a packet to its
hard header routine with the wrong headroom.  I think Paolo's patch is
probably the best way to go.  The reason I say that is because we can't
definitively say that this is the only pathway we will ever see that
causes us to get a packet with Ethernet headroom on our IPoIB
interface, and Paolo's patch gives us the possibility of recovering and
sending the packet out.  Even in this case, we are downing the
interface, but it isn't down entirely yet, so sending the packet out
isn't necessarily wrong.  So a method of fix that allows us to
recover/continue seems preferable to me.  Can you test Paolo's patch
and see if it resolves the problem?

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD

Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3 net-next]smsc911x: Adding support for Micochip LAN9250 Ethernet controller
From: Andrew Lunn @ 2017-04-26 14:34 UTC (permalink / raw)
  To: David.Cai; +Cc: netdev, davem, UNGLinuxDriver, steve.glendinning
In-Reply-To: <C3C28FB10418274EB7FD7C2B85C796A441247573@CHN-SV-EXMX02.mchp-main.com>

On Wed, Apr 26, 2017 at 01:55:52PM +0000, David.Cai@microchip.com wrote:
> From: David Cai <david.cai@microchip.com>
> 
> Adding support for Microchip LAN9250 Ethernet controller.
> 
> Signed-off-by: David Cai <david.cai@microchip.com>
> ---
> Changes
> V2
>  -  email format changed
>  - remove unnecessary text in commit log Changes
> V3
>  - defined all supported Ethernet controller chip ID.
>  - removed pdata->sub_generation = 0;
>  - changed 'if (pdata->sub_generation)' to 
>  	 if (pdata>generation == 4 && pdata->sub_generation)
>  
>  drivers/net/ethernet/smsc/smsc911x.c | 55 ++++++++++++++++++++++++------------
>  drivers/net/ethernet/smsc/smsc911x.h | 19 +++++++++++++
>  2 files changed, 56 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
> index fa5ca09..e35fe96 100644
> --- a/drivers/net/ethernet/smsc/smsc911x.c
> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> @@ -25,7 +25,7 @@
>   *   LAN9215, LAN9216, LAN9217, LAN9218
>   *   LAN9210, LAN9211
>   *   LAN9220, LAN9221
> - *   LAN89218
> + *   LAN89218,LAN9250
>   *
>   */
>  
> @@ -104,6 +104,9 @@ struct smsc911x_data {
>  	/* used to decide which workarounds apply */
>  	unsigned int generation;
>  
> +	/* used to decide which sub generation product work arounds to apply */
> +	unsigned int sub_generation;
> +
>  	/* device configuration (copied from platform_data during probe) */
>  	struct smsc911x_platform_config config;
>  
> @@ -1450,6 +1453,8 @@ static int smsc911x_soft_reset(struct smsc911x_data *pdata)
>  	unsigned int timeout;
>  	unsigned int temp;
>  	int ret;
> +	unsigned int reset_offset = HW_CFG;
> +	unsigned int reset_mask = HW_CFG_SRST_;
>  
>  	/*
>  	 * Make sure to power-up the PHY chip before doing a reset, otherwise @@ -1476,15 +1481,23 @@ static int smsc911x_soft_reset(struct smsc911x_data *pdata)
>  		}
>  	}
>  
> +	if (pdata->generation == 4 && pdata->sub_generation) {
> +		/* special reset for  LAN9250 */
> +		reset_offset = RESET_CTL;
> +		reset_mask = RESET_CTL_DIGITAL_RST_;
> +	}

Hi David

Thanks for adding the chip ID #defines. But the point of doing that is
that you can then change this above to:

  	if (pdata->idrev & 0xFFFF0000 == LAN9250) {
		reset_offset = RESET_CTL;
		reset_mask = RESET_CTL_DIGITAL_RST_;
}

which is much more readable.

      Andrew

^ permalink raw reply

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Honggang LI @ 2017-04-26 14:44 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Paolo Abeni, Or Gerlitz, Erez Shitrit, Erez Shitrit,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Linux Netdev List, David Miller
In-Reply-To: <1493216704.3041.115.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Wed, Apr 26, 2017 at 10:25:04AM -0400, Doug Ledford wrote:
> On Wed, 2017-04-26 at 22:11 +0800, Honggang LI wrote:
> > On Wed, Apr 26, 2017 at 09:50:38AM -0400, Doug Ledford wrote:
> > > 
> > > On Wed, 2017-04-26 at 09:48 -0400, Doug Ledford wrote:
> > > > 
> > > > On Wed, 2017-04-26 at 21:33 +0800, Honggang LI wrote:
> > > > > 
> > > > > 
> > > > > Yes, it is during the process of removing the final slave. The
> > > > > reproducer looks like this:
> > > > > 
> > > > > ping remote_ip_over_bonding_interface &
> > > > > while 1; do
> > > > > 	ifdown bond0
> > > > > 	ifup   bond0
> > > > > done
> > > 
> > > BTW, rerunning your test as:
> > > 
> > > ping remote_ip_over_bonding_interface &
> > > while 1; do
> > >     echo -n "Downing interface..."
> > >     ifdown bond0
> > 
> > Confirmed panic in here.
> 
> OK, this leads me to suspect that the bonding driver is possibly
> reconfiguring the skb setup to Ethernet mode before the last slave is
> fully dropped from the bond, and so the slave is seeing a packet to its
> hard header routine with the wrong headroom.  I think Paolo's patch is
> probably the best way to go.  The reason I say that is because we can't
> definitively say that this is the only pathway we will ever see that
> causes us to get a packet with Ethernet headroom on our IPoIB
> interface, and Paolo's patch gives us the possibility of recovering and
> sending the packet out.  Even in this case, we are downing the
> interface, but it isn't down entirely yet, so sending the packet out

Sending packet out over a removing interface looks bad. I'd like to drop
backlog packet.

> isn't necessarily wrong.  So a method of fix that allows us to
> recover/continue seems preferable to me.  Can you test Paolo's patch
> and see if it resolves the problem?

Yes, Paolo's patch fixes the panic issue too. But it seems has side
effort. For example:

520 struct sk_buff *arp_create(int type, int ptype, __be32 dest_ip,
.....
538                 return NULL;
539 
540         skb_reserve(skb, hlen);
541         skb_reset_network_header(skb); <-- it update skb->network_header


But skb_cow_head seems does not touch skb->network_header. I have to say
I did not test the network_header.

> 
> -- 
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>     GPG KeyID: B826A3330E572FDD
>    
> Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [oss-drivers] Re: [RFC 3/4] nfp: make use of extended ack message reporting
From: David Miller @ 2017-04-26 14:44 UTC (permalink / raw)
  To: simon.horman
  Cc: jhs, jakub.kicinski, netdev, johannes, dsa, daniel,
	alexei.starovoitov, bblanco, john.fastabend, kubakici,
	oss-drivers
In-Reply-To: <20170426111315.GC28251@vergenet.net>

From: Simon Horman <simon.horman@netronome.com>
Date: Wed, 26 Apr 2017 13:13:16 +0200

> On Tue, Apr 25, 2017 at 10:20:22AM -0400, David Miller wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>> Date: Tue, 25 Apr 2017 08:42:32 -0400
>> 
>> > So are we going to standardize these strings?
>> 
>> No.
>> 
>> > i.e what if some user has written a bash script that depends on this
>> > string and it gets changed later.
>> 
>> They can't do that.
>> 
>> It's free form extra information an application may or not provide
>> to the user when the kernel emits it.
> 
> I don't feel strongly about this and perhaps it can be revisited at some
> point but perhaps it would be worth documenting that he strings do not
> form part of the UAPI as my expectation would have been that they do f.e. to
> facilitate internationalisation.

These two things are entirely separate.

We can maintain uptodate translations of the strings, yet document that
they can change at any time and are thus not UAPI.

^ permalink raw reply

* Re: net/ipv6: use-after-free in __call_rcu/in6_dev_finish_destroy_rcu
From: Andrey Konovalov @ 2017-04-26 14:45 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML, Josh Triplett,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Eric Dumazet,
	Cong Wang, Dmitry Vyukov, Kostya Serebryany, syzkaller
In-Reply-To: <20170426135950.GO3956@linux.vnet.ibm.com>

On Wed, Apr 26, 2017 at 3:59 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Wed, Apr 26, 2017 at 02:34:15PM +0200, Andrey Konovalov wrote:
>> Hi,
>>
>> I've got the following error report while fuzzing the kernel with syzkaller.
>>
>> On commit 5a7ad1146caa895ad718a534399e38bd2ba721b7 (4.11-rc8).
>>
>> Unfortunately it's not reproducible.
>>
>> I'm not sure whether is is an issue with rcu or ipv6.
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in __call_rcu.constprop.77+0x13be/0x1640
>
> Does building with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y show any splats?
> (Yeah, kind of a stupid question if it is not reproducible, but had
> to ask!)

Hi Paul,

I'll try enabling this config.

In the meantime, while I was trying to reproduce this issue, I got this warning:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 4590 at kernel/rcu/tree.c:2919
rcu_do_batch.isra.65+0x845/0xbd0
Modules linked in:
CPU: 0 PID: 4590 Comm: syz-executor Not tainted 4.11.0-rc8+ #270
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:16
 dump_stack+0x192/0x22d lib/dump_stack.c:52
 __warn+0x19f/0x1e0 kernel/panic.c:549
 warn_slowpath_null+0x2c/0x40 kernel/panic.c:584
 rcu_do_batch.isra.65+0x845/0xbd0 kernel/rcu/tree.c:2919
 invoke_rcu_callbacks kernel/rcu/tree.c:3142
 __rcu_process_callbacks kernel/rcu/tree.c:3109
 rcu_process_callbacks+0x23f/0x810 kernel/rcu/tree.c:3126
 __do_softirq+0x253/0x78b kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364
 irq_exit+0x149/0x180 kernel/softirq.c:405
 exiting_irq ./arch/x86/include/asm/apic.h:657
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
 apic_timer_interrupt+0x89/0x90
RIP: 0010:do_anonymous_page mm/memory.c:2962
RIP: 0010:handle_pte_fault mm/memory.c:3721
RIP: 0010:__handle_mm_fault+0xdab/0x1c00 mm/memory.c:3841
RSP: 0018:ffff88006a02f7d0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
RAX: 000000000005dab0 RBX: 1ffff1000d405f00 RCX: 1ffff1000d28cd3f
RDX: dffffc0000000000 RSI: 8000000000000025 RDI: ffff8800694669f8
RBP: ffff88006a02f9a8 R08: 00000000014280ca R09: 1ffff1000d405eb0
R10: ffff88006a02f660 R11: dffffc0000000000 R12: ffff88006a02f8e0
R13: ffff88006a02f980 R14: ffff8800694669b0 R15: ffff880069466a00
 </IRQ>
 handle_mm_fault+0x1aa/0x450 mm/memory.c:3878
 faultin_page mm/gup.c:408
 __get_user_pages+0x606/0x14a0 mm/gup.c:607
 populate_vma_page_range+0xd9/0x100 mm/gup.c:1062
 __mm_populate+0x278/0x540 mm/gup.c:1112
 mm_populate ./include/linux/mm.h:2132
 vm_mmap_pgoff+0x258/0x280 mm/util.c:314
 SYSC_mmap_pgoff mm/mmap.c:1503
 SyS_mmap_pgoff+0x22c/0x5e0 mm/mmap.c:1461
 SYSC_mmap arch/x86/kernel/sys_x86_64.c:96
 SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:87
 entry_SYSCALL_64_fastpath+0x1a/0xa9 arch/x86/entry/entry_64.S:204
RIP: 0033:0x4458e9
RSP: 002b:00007f3a15ce4b58 EFLAGS: 00000282 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 00007f3a15ce5700 RCX: 00000000004458e9
RDX: 0000000000000002 RSI: 00000000009b8000 RDI: 0000000020000000
RBP: 0000000000000000 R08: ffffffffffffffff R09: 8000000000000000
R10: 0000000000008032 R11: 0000000000000282 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f3a15ce59c0 R15: 00007f3a15ce5700
---[ end trace e36457085170396e ]---

>
>                                                         Thanx, Paul
>
>> kernel/rcu/tree.c:3269 at addr ffff88003b842280
>> Write of size 8 by task kworker/u10:1/180
>> CPU: 2 PID: 180 Comm: kworker/u10:1 Not tainted 4.11.0-rc8+ #270
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Workqueue: events_unbound call_usermodehelper_exec_work
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:16 [inline]
>>  dump_stack+0x192/0x22d lib/dump_stack.c:52
>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:164
>>  print_address_description mm/kasan/report.c:202 [inline]
>>  kasan_report_error mm/kasan/report.c:291 [inline]
>>  kasan_report+0x252/0x510 mm/kasan/report.c:347
>>  __asan_report_store8_noabort+0x17/0x20 mm/kasan/report.c:373
>>  __call_rcu.constprop.77+0x13be/0x1640 kernel/rcu/tree.c:3269
>>  call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288
>>  free_pid+0x446/0x5d0 kernel/pid.c:293
>>  __change_pid+0x2a1/0x3d0 kernel/pid.c:411
>>  detach_pid+0x1f/0x30 kernel/pid.c:416
>>  __unhash_process kernel/exit.c:74 [inline]
>>  __exit_signal kernel/exit.c:155 [inline]
>>  release_task+0xbb0/0x1d90 kernel/exit.c:199
>>  wait_task_zombie kernel/exit.c:1230 [inline]
>>  wait_consider_task+0x11fe/0x3410 kernel/exit.c:1458
>>  do_wait_thread kernel/exit.c:1521 [inline]
>>  do_wait+0x3ea/0x8e0 kernel/exit.c:1592
>>  SYSC_wait4 kernel/exit.c:1720 [inline]
>>  SyS_wait4+0x208/0x340 kernel/exit.c:1689
>>  call_usermodehelper_exec_sync kernel/kmod.c:292 [inline]
>>  call_usermodehelper_exec_work+0x1a7/0x2b0 kernel/kmod.c:329
>>  process_one_work+0x9f7/0x1580 kernel/workqueue.c:2097
>>  worker_thread+0x1df/0x14b0 kernel/workqueue.c:2231
>>  kthread+0x31f/0x3f0 kernel/kthread.c:231
>>  ret_from_fork+0x2c/0x40 arch/x86/entry/entry_64.S:430
>> Object at ffff88003b842018, in cache kmalloc-1024 size: 1024
>> Allocated:
>> PID = 1
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>>  set_track mm/kasan/kasan.c:525 [inline]
>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616
>>  kmem_cache_alloc_trace+0x61/0x170 mm/slub.c:2745
>>  kmalloc include/linux/slab.h:490 [inline]
>>  kzalloc include/linux/slab.h:663 [inline]
>>  ipv6_add_dev+0x199/0x1380 net/ipv6/addrconf.c:380
>>  addrconf_init+0xd0/0x29a net/ipv6/addrconf.c:6405
>>  inet6_init+0x2f6/0x584 net/ipv6/af_inet6.c:962
>>  do_one_initcall+0xf3/0x380 init/main.c:792
>>  do_initcall_level init/main.c:858 [inline]
>>  do_initcalls init/main.c:866 [inline]
>>  do_basic_setup init/main.c:884 [inline]
>>  kernel_init_freeable+0x54d/0x622 init/main.c:1035
>>  kernel_init+0x13/0x180 init/main.c:959
>>  ret_from_fork+0x2c/0x40 arch/x86/entry/entry_64.S:430
>> Freed:
>> PID = 6479
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>>  set_track mm/kasan/kasan.c:525 [inline]
>>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589
>>  slab_free_hook mm/slub.c:1357 [inline]
>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>  slab_free mm/slub.c:2961 [inline]
>>  kfree+0x91/0x190 mm/slub.c:3882
>>  in6_dev_finish_destroy_rcu+0x97/0xc0 net/ipv6/addrconf_core.c:150
>>  __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
>>  rcu_do_batch.isra.65+0x6de/0xbd0 kernel/rcu/tree.c:2879
>>  invoke_rcu_callbacks kernel/rcu/tree.c:3142 [inline]
>>  __rcu_process_callbacks kernel/rcu/tree.c:3109 [inline]
>>  rcu_process_callbacks+0x23f/0x810 kernel/rcu/tree.c:3126
>>  __do_softirq+0x253/0x78b kernel/softirq.c:284
>> Memory state around the buggy address:
>>  ffff88003b842180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff88003b842200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> >ffff88003b842280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>                    ^
>>  ffff88003b842300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff88003b842380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ==================================================================
>>
>

^ permalink raw reply

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
From: David Miller @ 2017-04-26 14:46 UTC (permalink / raw)
  To: jiri; +Cc: jhs, simon.horman, xiyou.wangcong, eric.dumazet, netdev
In-Reply-To: <20170426130506.GF1867@nanopsycho.orion>

From: Jiri Pirko <jiri@resnulli.us>
Date: Wed, 26 Apr 2017 15:05:06 +0200

> Wed, Apr 26, 2017 at 02:46:22PM CEST, jhs@mojatatu.com wrote:
>>On 17-04-26 07:02 AM, Simon Horman wrote:
>>> On Tue, Apr 25, 2017 at 06:04:45PM +0200, Jiri Pirko wrote:
>>[..]
>>
>>> > So fix iproute2. It is always first kernel, then iproute2.
>>> 
>>> Perhaps I am missing the point or somehow misguided but I would expect that
>>> if the UAPI uses BIT() it also provides BIT().
>>
>>There is a user of BIT() already in iproute2 (devlink). We can move
>>the code to be more generally available for other iproute2 users.
>>Then this UAPI change makes use of it.
> 
> Should be part of UAPI as well
> I see that include/uapi/rdma/vmw_pvrdma-abi.h is using BIT macro.
> I don't see BIT macro defined in UAPI (I thought it is). So either
> define it there (not sure where) or just use "<<"

"BIT" is a pretty crazy small simple name to pollute into the global
namespace, IMHO.

^ permalink raw reply

* [PATCH v1 net-next 0/6] Extend socket timestamping API
From: Miroslav Lichvar @ 2017-04-26 14:50 UTC (permalink / raw)
  To: netdev
  Cc: Richard Cochran, Willem de Bruijn, Soheil Hassas Yeganeh,
	Keller, Jacob E, Denny Page, Jiri Benc

Changes RFC->v1:
- reworked SOF_TIMESTAMPING_OPT_PKTINFO patch to not add new fields to
  skb shared info (net device is now looked up by napi_id), not require
  any changes in drivers, and restrict the cmsg to incoming packets
- renamed SOF_TIMESTAMPING_OPT_MULTIMSG to SOF_TIMESTAMPING_OPT_TX_SWHW
  and fixed its description
- moved struct scm_ts_pktinfo from errqueue.h to net_tstamp.h as it
  can't be received from the error queue anymore
- improved commit descriptions and removed incorrect comment

This patchset adds new options to the timestamping API that will be
useful for NTP implementations and possibly other applications.

The first patch specifies a timestamp filter for NTP packets. The second
patch updates drivers that can timestamp all packets, or need to list
the filter as unsupported. There is no attempt to add the support to the
phyter driver.

The third patch adds a new option to get a new control message with the
L2 length and interface index for incoming packets with hardware
timestamps.

The fourth patch fixes the code to not make a false software TX
timestamp when hardware timestamping is enabled. The fifth patch depends
on this fix.

The fifth patch adds a new option to request both software and hardware
timestamps for outgoing packets. The sixth patch updates drivers that
assumed software timestamping cannot be used together with hardware
timestamping.

The patches have been tested on x86_64 machines with igb and e1000e
drivers.

Miroslav Lichvar (6):
  net: define receive timestamp filter for NTP
  net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL
  net: add new control message for incoming HW-timestamped packets
  net: don't make false software transmit timestamps
  net: allow simultaneous SW and HW transmit timestamping
  net: ethernet: update drivers to make both SW and HW TX timestamps

 Documentation/networking/timestamping.txt          | 22 ++++++++++++++--
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c           |  4 +--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c   |  1 +
 drivers/net/ethernet/cavium/liquidio/lio_main.c    |  1 +
 drivers/net/ethernet/cavium/liquidio/lio_vf_main.c |  1 +
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c   |  1 +
 drivers/net/ethernet/intel/e1000e/netdev.c         |  5 ++--
 drivers/net/ethernet/intel/i40e/i40e_ptp.c         |  1 +
 drivers/net/ethernet/intel/igb/igb_ptp.c           |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c       |  1 +
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c     |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_clock.c |  1 +
 drivers/net/ethernet/neterion/vxge/vxge-main.c     |  1 +
 drivers/net/ethernet/qlogic/qede/qede_ptp.c        |  1 +
 drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c    |  3 +--
 drivers/net/ethernet/sfc/ef10.c                    |  1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  7 +++--
 drivers/net/ethernet/ti/cpsw.c                     |  1 +
 drivers/net/ethernet/tile/tilegx.c                 |  1 +
 include/linux/netdevice.h                          |  1 +
 include/linux/skbuff.h                             |  3 +--
 include/uapi/asm-generic/socket.h                  |  2 ++
 include/uapi/linux/net_tstamp.h                    | 13 +++++++++-
 net/core/dev.c                                     | 18 +++++++++++++
 net/core/dev_ioctl.c                               |  1 +
 net/core/skbuff.c                                  |  4 +++
 net/socket.c                                       | 30 ++++++++++++++++++++--
 27 files changed, 110 insertions(+), 17 deletions(-)

-- 
2.9.3

^ permalink raw reply

* [PATCH v1 net-next 3/6] net: add new control message for incoming HW-timestamped packets
From: Miroslav Lichvar @ 2017-04-26 14:50 UTC (permalink / raw)
  To: netdev
  Cc: Richard Cochran, Willem de Bruijn, Soheil Hassas Yeganeh,
	Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <20170426145035.25846-1-mlichvar@redhat.com>

Add SOF_TIMESTAMPING_OPT_PKTINFO option to request a new control message
for incoming packets with hardware timestamps. It contains the index of
the real interface which received the packet and the length of the
packet at layer 2.

The index is useful with bonding, bridges and other interfaces, where
IP_PKTINFO doesn't allow applications to determine which PHC made the
timestamp. With the L2 length (and link speed) it is possible to
transpose preamble timestamps to trailer timestamps, which are used in
the NTP protocol.

While this information could be provided by two new socket options
independently from timestamping, it doesn't look like it would be very
useful. With this option any performance impact is limited to hardware
timestamping.

As skb currently has no field for the original interface index, use
napi_id to look up the device and its index. This limits the option to
kernels with enabled CONFIG_NET_RX_BUSY_POLL and drivers using napi,
which should cover all current MAC drivers that support hardware
timestamping.

CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 Documentation/networking/timestamping.txt |  8 ++++++++
 include/linux/netdevice.h                 |  1 +
 include/uapi/asm-generic/socket.h         |  2 ++
 include/uapi/linux/net_tstamp.h           |  9 ++++++++-
 net/core/dev.c                            | 18 ++++++++++++++++++
 net/socket.c                              | 26 +++++++++++++++++++++++++-
 6 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 96f5069..6c07e7c 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -193,6 +193,14 @@ SOF_TIMESTAMPING_OPT_STATS:
   the transmit timestamps, such as how long a certain block of
   data was limited by peer's receiver window.
 
+SOF_TIMESTAMPING_OPT_PKTINFO:
+
+  Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming
+  packets with hardware timestamps. The message contains struct
+  scm_ts_pktinfo, which supplies the index of the real interface
+  which received the packet and its length at layer 2. This option
+  works only if CONFIG_NET_RX_BUSY_POLL is enabled.
+
 New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
 disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
 regardless of the setting of sysctl net.core.tstamp_allow_data.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8c5c8cd..c7ce189 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2451,6 +2451,7 @@ static inline int dev_recursion_level(void)
 struct net_device *dev_get_by_index(struct net *net, int ifindex);
 struct net_device *__dev_get_by_index(struct net *net, int ifindex);
 struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex);
+struct net_device *dev_get_by_napi_id(unsigned int napi_id);
 int netdev_get_name(struct net *net, char *name, int ifindex);
 int dev_restart(struct net_device *dev);
 int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb);
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 2b48856..a5f6e81 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -100,4 +100,6 @@
 
 #define SO_COOKIE		57
 
+#define SCM_TIMESTAMPING_PKTINFO	58
+
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index 0749fb1..8fcae35 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -26,8 +26,9 @@ enum {
 	SOF_TIMESTAMPING_OPT_CMSG = (1<<10),
 	SOF_TIMESTAMPING_OPT_TSONLY = (1<<11),
 	SOF_TIMESTAMPING_OPT_STATS = (1<<12),
+	SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
 
-	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_STATS,
+	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_PKTINFO,
 	SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
 				 SOF_TIMESTAMPING_LAST
 };
@@ -130,4 +131,10 @@ enum hwtstamp_rx_filters {
 	HWTSTAMP_FILTER_NTP_ALL,
 };
 
+/* SCM_TIMESTAMPING_PKTINFO control message */
+struct scm_ts_pktinfo {
+	int if_index;
+	int pkt_length;
+};
+
 #endif /* _NET_TIMESTAMPING_H */
diff --git a/net/core/dev.c b/net/core/dev.c
index 1b3317c..0a78f7f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -160,6 +160,7 @@ static int netif_rx_internal(struct sk_buff *skb);
 static int call_netdevice_notifiers_info(unsigned long val,
 					 struct net_device *dev,
 					 struct netdev_notifier_info *info);
+static struct napi_struct *napi_by_id(unsigned int napi_id);
 
 /*
  * The @dev_base_head list is protected by @dev_base_lock and the rtnl
@@ -863,6 +864,23 @@ struct net_device *dev_get_by_index(struct net *net, int ifindex)
 }
 EXPORT_SYMBOL(dev_get_by_index);
 
+struct net_device *dev_get_by_napi_id(unsigned int napi_id)
+{
+	struct net_device *dev = NULL;
+	struct napi_struct *napi;
+
+	rcu_read_lock();
+
+	napi = napi_by_id(napi_id);
+	if (napi)
+		dev = napi->dev;
+
+	rcu_read_unlock();
+
+	return dev;
+}
+EXPORT_SYMBOL(dev_get_by_napi_id);
+
 /**
  *	netdev_get_name - get a netdevice name, knowing its ifindex.
  *	@net: network namespace
diff --git a/net/socket.c b/net/socket.c
index c2564eb..5ea5f29 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -662,6 +662,26 @@ static bool skb_is_err_queue(const struct sk_buff *skb)
 	return skb->pkt_type == PACKET_OUTGOING;
 }
 
+static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb)
+{
+#ifdef CONFIG_NET_RX_BUSY_POLL
+	struct scm_ts_pktinfo ts_pktinfo;
+	struct net_device *orig_dev;
+
+	if (skb->napi_id < MIN_NAPI_ID || !skb_mac_header_was_set(skb))
+		return;
+
+	orig_dev = dev_get_by_napi_id(skb->napi_id);
+	if (!orig_dev)
+		return;
+
+	ts_pktinfo.if_index = orig_dev->ifindex;
+	ts_pktinfo.pkt_length = skb->len - skb_mac_offset(skb);
+	put_cmsg(msg, SOL_SOCKET, SCM_TIMESTAMPING_PKTINFO,
+		 sizeof(ts_pktinfo), &ts_pktinfo);
+#endif
+}
+
 /*
  * called from sock_recv_timestamp() if sock_flag(sk, SOCK_RCVTSTAMP)
  */
@@ -699,8 +719,12 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
 		empty = 0;
 	if (shhwtstamps &&
 	    (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
-	    ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2))
+	    ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
 		empty = 0;
+		if ((sk->sk_tsflags & SOF_TIMESTAMPING_OPT_PKTINFO) &&
+		    !skb_is_err_queue(skb))
+			put_ts_pktinfo(msg, skb);
+	}
 	if (!empty) {
 		put_cmsg(msg, SOL_SOCKET,
 			 SCM_TIMESTAMPING, sizeof(tss), &tss);
-- 
2.9.3

^ permalink raw reply related

* [PATCH v1 net-next 1/6] net: define receive timestamp filter for NTP
From: Miroslav Lichvar @ 2017-04-26 14:50 UTC (permalink / raw)
  To: netdev
  Cc: Richard Cochran, Willem de Bruijn, Soheil Hassas Yeganeh,
	Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <20170426145035.25846-1-mlichvar@redhat.com>

Add HWTSTAMP_FILTER_NTP_ALL to the hwtstamp_rx_filters enum for
timestamping of NTP packets. There is currently only one driver
(phyter) that could support it directly.

CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 include/uapi/linux/net_tstamp.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index 464dcca..0749fb1 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -125,6 +125,9 @@ enum hwtstamp_rx_filters {
 	HWTSTAMP_FILTER_PTP_V2_SYNC,
 	/* PTP v2/802.AS1, any layer, Delay_req packet */
 	HWTSTAMP_FILTER_PTP_V2_DELAY_REQ,
+
+	/* NTP, UDP, all versions and packet modes */
+	HWTSTAMP_FILTER_NTP_ALL,
 };
 
 #endif /* _NET_TIMESTAMPING_H */
-- 
2.9.3

^ permalink raw reply related

* [PATCH v1 net-next 4/6] net: don't make false software transmit timestamps
From: Miroslav Lichvar @ 2017-04-26 14:50 UTC (permalink / raw)
  To: netdev
  Cc: Richard Cochran, Willem de Bruijn, Soheil Hassas Yeganeh,
	Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <20170426145035.25846-1-mlichvar@redhat.com>

If software timestamping is enabled by the SO_TIMESTAMP(NS) option
when a message without timestamp is already waiting in the queue, the
__sock_recv_timestamp() function will read the current time to make a
timestamp in order to always have something for the application.

However, this applies also to outgoing packets looped back to the error
queue when hardware timestamping is enabled by the SO_TIMESTAMPING
option. A software transmit timestamp made after the actual transmission
is added to messages with hardware timestamps.

Modify the function to save the current time as a software timestamp
only if it's for a received packet (i.e. it's not in the error queue).

CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 net/socket.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/socket.c b/net/socket.c
index 5ea5f29..68b9304 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -688,7 +688,8 @@ static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb)
 void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
 	struct sk_buff *skb)
 {
-	int need_software_tstamp = sock_flag(sk, SOCK_RCVTSTAMP);
+	int need_software_tstamp = sock_flag(sk, SOCK_RCVTSTAMP) &&
+				   !skb_is_err_queue(skb);
 	struct scm_timestamping tss;
 	int empty = 1;
 	struct skb_shared_hwtstamps *shhwtstamps =
-- 
2.9.3

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox