Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] drivers:net:wireless: Add proper locking for the function, b43_op_beacon_set_tim in main.c
From: Rafał Miłecki @ 2015-01-02 11:19 UTC (permalink / raw)
  To: Michael Büsch
  Cc: Nicholas Krause, Stefano Brivio, Network Development,
	linux-wireless@vger.kernel.org, Linux Kernel Mailing List,
	b43-dev, Kalle Valo
In-Reply-To: <20150102102727.3918a684@wiggum>

On 2 January 2015 at 10:27, Michael Büsch <m@bues.ch> wrote:
> On Fri,  2 Jan 2015 02:34:01 -0500
> Nicholas Krause <xerofoify@gmail.com> wrote:
>
>> This adds proper locking for the function, b43_op_beacon_set_tim in main.c by using the mutex lock
>> in the structure pointer wl, as embedded into this pointer as a mutex in order to protect against
>> multiple access to the pointer wl when updating the templates for this pointer in the function,
>> b43_update_templates internally in the function, b43_op_beacon_set_tim.
>>
>> Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
>> ---
>>  drivers/net/wireless/b43/main.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
>> index 47731cb..d568fc8 100644
>> --- a/drivers/net/wireless/b43/main.c
>> +++ b/drivers/net/wireless/b43/main.c
>> @@ -5094,8 +5094,9 @@ static int b43_op_beacon_set_tim(struct ieee80211_hw *hw,
>>  {
>>       struct b43_wl *wl = hw_to_b43_wl(hw);
>>
>> -     /* FIXME: add locking */
>> +     mutex_lock(&wl->mutex);
>>       b43_update_templates(wl);
>> +     mutex_unlock(&wl->mutex);
>>
>>       return 0;
>>  }
>
> Thanks for the patch.
>
> However, this does not work. We are in atomic context here.
> Please see the b43-dev mailing list archives for a recent thread about that.

Michael: guess who it was who sent the patch doing the same back in November.

Yes, the same troll.

-- 
Rafał

^ permalink raw reply

* Re: [PATCH] TCP: Add support for TCP Stealth
From: Hagen Paul Pfeifer @ 2015-01-02 10:36 UTC (permalink / raw)
  To: Julian Kirsch, Eric Dumazet; +Cc: netdev, Christian Grothoff, Jacob Appelbaum
In-Reply-To: <54A470B3.3010501@sec.in.tum.de>

On 31 December 2014 at 22:54, Julian Kirsch <kirschju@sec.in.tum.de> wrote:

Hey Jullian

> one year ago [0] we tried to convince you to add support for a new
> socket option to the linux kernel. Equipped with an improved version of
> our patch we're back to accomplish this task today. :-)

Nice, what is the difference from the previous version? There is no
changelog in the patch. Especially issues raised by Eric (cc'ed) one
year ago and TCPM.

Hagen

^ permalink raw reply

* Re: [PATCH] drivers:net:wireless: Add proper locking for the function, b43_op_beacon_set_tim in main.c
From: Michael Büsch @ 2015-01-02  9:27 UTC (permalink / raw)
  To: Nicholas Krause
  Cc: stefano.brivio, netdev, linux-wireless, b43-dev, kvalo,
	linux-kernel
In-Reply-To: <1420184041-6788-1-git-send-email-xerofoify@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1387 bytes --]

On Fri,  2 Jan 2015 02:34:01 -0500
Nicholas Krause <xerofoify@gmail.com> wrote:

> This adds proper locking for the function, b43_op_beacon_set_tim in main.c by using the mutex lock
> in the structure pointer wl, as embedded into this pointer as a mutex in order to protect against
> multiple access to the pointer wl when updating the templates for this pointer in the function,
> b43_update_templates internally in the function, b43_op_beacon_set_tim.
> 
> Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
> ---
>  drivers/net/wireless/b43/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
> index 47731cb..d568fc8 100644
> --- a/drivers/net/wireless/b43/main.c
> +++ b/drivers/net/wireless/b43/main.c
> @@ -5094,8 +5094,9 @@ static int b43_op_beacon_set_tim(struct ieee80211_hw *hw,
>  {
>  	struct b43_wl *wl = hw_to_b43_wl(hw);
>  
> -	/* FIXME: add locking */
> +	mutex_lock(&wl->mutex);
>  	b43_update_templates(wl);
> +	mutex_unlock(&wl->mutex);
>  
>  	return 0;
>  }

Thanks for the patch.

However, this does not work. We are in atomic context here.
Please see the b43-dev mailing list archives for a recent thread about that.
I'm also pretty sure that this is safe without lock, due to the higher level locks in mac80211.

-- 
Michael

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: [PATCH net-next 0/3] swdev: add IPv4 routing offload
From: Rami Rosen @ 2015-01-02  9:04 UTC (permalink / raw)
  To: sfeldma; +Cc: Netdev, jiri, john.fastabend, tgraf, jhs, andy, roopa
In-Reply-To: <1420169361-31767-1-git-send-email-sfeldma@gmail.com>

Hi, Scott,

Good work!

You say that currently the rocker driver support only unicast singlepath IPv4.
If I understand correctly, IPv4 packets with tos !=0 are skipped in
the current rocker implementation of the ndo_sw_parent_fib_ipv4_add()
callback. I am referring to the rocker_port_fib_ipv4_skip() method:

if (tos != 0)
     return -EOPNOTSUPP;

see:
https://github.com/jpirko/net-next-rocker/blob/master/drivers/net/ethernet/rocker/rocker.c#L3701

Is there a reason for this? (The NDO that you suggest,
ndo_sw_parent_fib_ipv4_add(), has the tos as a parameter, so from this
aspect there is no problem).

Regards,
Rami Rosen


On Fri, Jan 2, 2015 at 5:29 AM,  <sfeldma@gmail.com> wrote:
> From: Scott Feldman <sfeldma@gmail.com>
>
> This patch set adds L3 routing offload support for IPv4 routes.  The idea is to
> mirror routes installed in the kernel's FIB down to a hardware switch device to
> offload the data forwarding path for L3.  Only the data forwarding path is
> intercepted.  Control and management of the kernel's FIB remains with the
> kernel.
>
> A couple of new ndo ops (ndo_switch_fib_ipv4_add/del) are added to the swdev
> model to add/remove FIB entries to/from the offload device.  The ops are called
> from the core IPv4 FIB code directly.  Just before the FIB entry is installed
> in the kernel's FIB, the swdev device driver gets a chance at the FIB entry
> (assuming the swdev driver implements the new ndo ops).  This is a synchronous
> call in the RTM_NEWROUTE path, and the swdev has the option to fail the
> install, which means the FIB entry is not installed in swdev or the kernel, and
> the user is notified of the failure.  The swdev driver also has the option to
> return -EOPNOTSUPP to pass on the FIB entry, so it'll only be installed in the
> kernel FIB.
>
> The FIB flush path is modified also to call into the swdev driver to flush the
> FIB entries from hardware.
>
> The rocker swdev driver is updated to support these new ndo ops.  Right now
> rocker only supports IPv4 singlepath routes, but follow-on patches will add
> IPv6 and ECMP support.  Also, only unicast IPv4 routes are supported, but
> follow-on patches will add multicast route support.
>
> Testing was done in my simulated network envionment using VMs and the rocker
> device.  I'm using Quagga OSPFv2 for the routing protocol for automatic control
> plane processing.  No modifications to Quagga or netlink/iproute2 is required;
> it just works.
>
> One important metric is the time spent installing/removing FIB entries from the
> kernel and the device.  With these patches applied, I measured the wall time
> required to install and remove 10K IPv4 routes.  I used ip route add cmd in
> batch mode to install static routes.  I used the ip route flush cmd to delete
> the routes.  This is 10000 routes installed to the kernel's FIB and to the
> swdev device's L3 tables.  And then removed from each.  The performance is less
> than a second for each operation.  This is on my simulated rocker device running
> on a VM, so a real embedded CPU would probably do much better.
>
> My batch has 10K lines of:
>
> simp@simp:~$ head east
> route add 16.0.0.0/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.1/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.2/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.3/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.4/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.5/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.6/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.7/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.8/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.9/32 nexthop via 11.0.0.2 dev swp1
> [...]
>
> Install/removing routes:
>
> simp@simp:~$ wc -l east
> 10000 east
> simp@simp:~$ ip route show root 16/8 | wc -l
> 0
> simp@simp:~$ time sudo ip --batch east
>
> real    0m0.715s
> user    0m0.092s
> sys     0m0.388s
> simp@simp:~$ ip route show root 16/8 | wc -l
> 10000
>
> [At this point, 10K routes are installed in kernel and the device]
>
> simp@simp:~$ time sudo ip route flush root 16/8
>
> real    0m0.458s
> user    0m0.000s
> sys     0m0.284s
> simp@simp:~$ ip route show root 16/8 | wc -l
> 0
>
> [All gone]
>
> Scott Feldman (3):
>   net: add IPv4 routing FIB support for swdev
>   net: call swdev fib del for flushed routes
>   rocker: implement IPv4 fib offloading
>
>  drivers/net/ethernet/rocker/rocker.c |  441 +++++++++++++++++++++++++++++++++-
>  include/linux/netdevice.h            |   22 ++
>  include/net/switchdev.h              |   18 ++
>  net/ipv4/fib_trie.c                  |   31 ++-
>  net/switchdev/switchdev.c            |   89 +++++++
>  5 files changed, 592 insertions(+), 9 deletions(-)
>
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next 1/3] net: add IPv4 routing FIB support for swdev
From: Scott Feldman @ 2015-01-02  8:00 UTC (permalink / raw)
  To: roopa
  Cc: Netdev, Jiří Pírko, john fastabend, Thomas Graf,
	Jamal Hadi Salim, Andy Gospodarek
In-Reply-To: <54A63186.1090807@cumulusnetworks.com>

On Thu, Jan 1, 2015 at 9:49 PM, roopa <roopa@cumulusnetworks.com> wrote:
> On 1/1/15, 7:29 PM, sfeldma@gmail.com wrote:
>>
>> From: Scott Feldman <sfeldma@gmail.com>
>>
>> To offload IPv4 L3 routing functions to swdev device, the swdev device
>> driver
>> implements two new ndo ops (ndo_switch_fib_ipv4_add/del).  The ops are
>> called
>> by the core IPv4 FIB code when installing/removing FIB entries to/from the
>> kernel FIB.  On install, the driver should return 0 if FIB entry (route)
>> can be
>> installed to device for offloading, -EOPNOTSUPP if route cannot be
>> installed
>> due to device limitations, and other negative error code on failure to
>> install
>> route to device.  On failure error code, the route is not installed to
>> device,
>> and not installed in kernel FIB, and the return code is propagated back to
>> the
>> user-space caller (via netlink).  An -EOPNOTSUPP error code is skipped for
>> the
>> device but installed in the kernel FIB.
>>
>> The FIB entry (route) nexthop list is used to find the swdev device port
>> to
>> anchor the ndo op call.  The route's fib_dev (the first nexthop's dev) is
>> used
>> find the swdev port by recursively traversing the fib_dev's lower_dev list
>> until a swdev port is found.  The ndo op is called on this swdev port.
>
>
> scott, I posted a similar api for bridge attribute sets. But, nobody
> supported it.
> http://marc.info/?l=linux-netdev&m=141820234410602&w=2
>
> If this is acceptable, I will be resubmitting my api as well.
>

This may get shot down as well, who knows?

For routes, the nexthop dev may be a bridge or a bond for an IP on the
router, so we have no choice but to walk down from the bridge or the
bond to find a swport dev to call the ndo op to install the route.

For bridge settings, I remember someone raised the issue that settings
should be propagated down the dev hierarchy, with parent calling
child's op and so on.  I'll go back and look at your post.

>
>
>>
>> Since the FIB entry is "naked" when push from the kernel, the
>> driver/device
>> is responsible for resolving the route's nexthops to neighbor MAC
>> addresses.
>> This can be done by the driver by monitoring NETEVENT_NEIGH_UPDATE
>> netevent notifier to watch for ARP activity.  Once a nexthop is resolved
>> to
>> neighbor MAC address, it can be installed to the device and the device
>> will
>> do the L3 routing offload in HW, for that nexthop.
>>
>> Signed-off-by: Scott Feldman <sfeldma@gmail.com>
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>> ---
>>   include/linux/netdevice.h |   22 +++++++++++
>>   include/net/switchdev.h   |   18 +++++++++
>>   net/ipv4/fib_trie.c       |   17 ++++++++-
>>   net/switchdev/switchdev.c |   89
>> +++++++++++++++++++++++++++++++++++++++++++++
>>   4 files changed, 145 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 679e6e9..b66d22b 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -767,6 +767,8 @@ struct netdev_phys_item_id {
>>   typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
>>                                        struct sk_buff *skb);
>>   +struct fib_info;
>> +
>>   /*
>>    * This structure defines the management hooks for network devices.
>>    * The following hooks can be defined; unless noted otherwise, they are
>> @@ -1030,6 +1032,14 @@ typedef u16 (*select_queue_fallback_t)(struct
>> net_device *dev,
>>    * int (*ndo_switch_port_stp_update)(struct net_device *dev, u8 state);
>>    *    Called to notify switch device port of bridge port STP
>>    *    state change.
>> + * int (*ndo_sw_parent_fib_ipv4_add)(struct net_device *dev, __be32 dst,
>> + *                                  int dst_len, struct fib_info *fi,
>> + *                                  u8 tos, u8 type, u32 tb_id);
>> + *     Called to add IPv4 route to switch device.
>> + * int (*ndo_sw_parent_fib_ipv4_del)(struct net_device *dev, __be32 dst,
>> + *                                  int dst_len, struct fib_info *fi,
>> + *                                  u8 tos, u8 type, u32 tb_id);
>> + *     Called to delete IPv4 route from switch device.
>>    */
>>   struct net_device_ops {
>>         int                     (*ndo_init)(struct net_device *dev);
>> @@ -1189,6 +1199,18 @@ struct net_device_ops {
>>                                                             struct
>> netdev_phys_item_id *psid);
>>         int                     (*ndo_switch_port_stp_update)(struct
>> net_device *dev,
>>                                                               u8 state);
>> +       int                     (*ndo_switch_fib_ipv4_add)(struct
>> net_device *dev,
>> +                                                          __be32 dst,
>> +                                                          int dst_len,
>> +                                                          struct fib_info
>> *fi,
>> +                                                          u8 tos, u8
>> type,
>> +                                                          u32 tb_id);
>> +       int                     (*ndo_switch_fib_ipv4_del)(struct
>> net_device *dev,
>> +                                                          __be32 dst,
>> +                                                          int dst_len,
>> +                                                          struct fib_info
>> *fi,
>> +                                                          u8 tos, u8
>> type,
>> +                                                          u32 tb_id);
>>   #endif
>>   };
>>   diff --git a/include/net/switchdev.h b/include/net/switchdev.h
>> index 8a6d164..caebc2a 100644
>> --- a/include/net/switchdev.h
>> +++ b/include/net/switchdev.h
>> @@ -17,6 +17,10 @@
>>   int netdev_switch_parent_id_get(struct net_device *dev,
>>                                 struct netdev_phys_item_id *psid);
>>   int netdev_switch_port_stp_update(struct net_device *dev, u8 state);
>> +int netdev_switch_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
>> +                              u8 tos, u8 type, u32 tb_id);
>> +int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
>> +                              u8 tos, u8 type, u32 tb_id);
>>     #else
>>   @@ -32,6 +36,20 @@ static inline int
>> netdev_switch_port_stp_update(struct net_device *dev,
>>         return -EOPNOTSUPP;
>>   }
>>   +static inline int netdev_switch_fib_ipv4_add(u32 dst, int dst_len,
>> +                                            struct fib_info *fi,
>> +                                            u8 tos, u8 type, u32 tb_id)
>> +{
>> +       return -EOPNOTSUPP;
>> +}
>> +
>> +static inline int netdev_switch_fib_ipv4_del(u32 dst, int dst_len,
>> +                                            struct fib_info *fi,
>> +                                            u8 tos, u8 type, u32 tb_id)
>> +{
>> +       return -EOPNOTSUPP;
>> +}
>> +
>>   #endif
>>     #endif /* _LINUX_SWITCHDEV_H_ */
>> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
>> index 281e5e0..ea2dc17 100644
>> --- a/net/ipv4/fib_trie.c
>> +++ b/net/ipv4/fib_trie.c
>> @@ -79,6 +79,7 @@
>>   #include <net/tcp.h>
>>   #include <net/sock.h>
>>   #include <net/ip_fib.h>
>> +#include <net/switchdev.h>
>>   #include "fib_lookup.h"
>>     #define MAX_STAT_DEPTH 32
>> @@ -1201,6 +1202,8 @@ int fib_table_insert(struct fib_table *tb, struct
>> fib_config *cfg)
>>                         fib_release_info(fi_drop);
>>                         if (state & FA_S_ACCESSED)
>>                                 rt_cache_flush(cfg->fc_nlinfo.nl_net);
>> +                       netdev_switch_fib_ipv4_add(key, plen, fi,
>> fa->fa_tos,
>> +                                                  cfg->fc_type,
>> tb->tb_id);
>>                         rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen,
>>                                 tb->tb_id, &cfg->fc_nlinfo,
>> NLM_F_REPLACE);
>>   @@ -1229,6 +1232,13 @@ int fib_table_insert(struct fib_table *tb, struct
>> fib_config *cfg)
>>         new_fa->fa_tos = tos;
>>         new_fa->fa_type = cfg->fc_type;
>>         new_fa->fa_state = 0;
>> +
>> +       /* (Optionally) offload fib info to switch hardware. */
>> +       err = netdev_switch_fib_ipv4_add(key, plen, fi, tos,
>> +                                        cfg->fc_type, tb->tb_id);
>> +       if (err && err != -EOPNOTSUPP)
>> +               goto out_free_new_fa;
>> +
>>         /*
>>          * Insert new entry to the list.
>>          */
>> @@ -1237,7 +1247,7 @@ int fib_table_insert(struct fib_table *tb, struct
>> fib_config *cfg)
>>                 fa_head = fib_insert_node(t, key, plen);
>>                 if (unlikely(!fa_head)) {
>>                         err = -ENOMEM;
>> -                       goto out_free_new_fa;
>> +                       goto out_sw_fib_del;
>>                 }
>>         }
>>   @@ -1253,6 +1263,8 @@ int fib_table_insert(struct fib_table *tb, struct
>> fib_config *cfg)
>>   succeeded:
>>         return 0;
>>   +out_sw_fib_del:
>> +       netdev_switch_fib_ipv4_del(key, plen, fi, tos, cfg->fc_type,
>> tb->tb_id);
>>   out_free_new_fa:
>>         kmem_cache_free(fn_alias_kmem, new_fa);
>>   out:
>> @@ -1529,6 +1541,9 @@ int fib_table_delete(struct fib_table *tb, struct
>> fib_config *cfg)
>>         rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb->tb_id,
>>                   &cfg->fc_nlinfo, 0);
>>   +     netdev_switch_fib_ipv4_del(key, plen, fa->fa_info, tos,
>> +                                  cfg->fc_type, tb->tb_id);
>> +
>>         list_del_rcu(&fa->fa_list);
>>         if (!plen)
>> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
>> index d162b21..211a8a0 100644
>> --- a/net/switchdev/switchdev.c
>> +++ b/net/switchdev/switchdev.c
>> @@ -12,6 +12,7 @@
>>   #include <linux/types.h>
>>   #include <linux/init.h>
>>   #include <linux/netdevice.h>
>> +#include <net/ip_fib.h>
>>   #include <net/switchdev.h>
>>     /**
>> @@ -50,3 +51,91 @@ int netdev_switch_port_stp_update(struct net_device
>> *dev, u8 state)
>>         return ops->ndo_switch_port_stp_update(dev, state);
>>   }
>>   EXPORT_SYMBOL(netdev_switch_port_stp_update);
>> +
>> +static struct net_device *netdev_switch_get_by_fib_dev(struct net_device
>> *dev)
>> +{
>> +       const struct net_device_ops *ops = dev->netdev_ops;
>> +       struct net_device *lower_dev;
>> +       struct net_device *port_dev;
>> +       struct list_head *iter;
>> +
>> +       /* Recusively search from fib_dev down until we find
>> +        * a sw port dev.  (A sw port dev supports
>> +        * ndo_switch_parent_id_get).
>> +        */
>> +
>> +       if (ops->ndo_switch_parent_id_get)
>> +               return dev;
>> +
>> +       netdev_for_each_lower_dev(dev, lower_dev, iter) {
>> +               port_dev = netdev_switch_get_by_fib_dev(lower_dev);
>> +               if (port_dev)
>> +                       return port_dev;
>> +       }
>> +
>> +       return NULL;
>> +}
>> +
>> +/**
>> + *     netdev_switch_fib_ipv4_add - Add IPv4 route entry to switch
>> + *
>> + *     @dst: route's IPv4 destination address
>> + *     @dst_len: destination address length (prefix length)
>> + *     @fi: route FIB info structure
>> + *     @tos: route TOS
>> + *     @type: route type
>> + *     @tb_id: route table ID
>> + *
>> + *     Add IPv4 route entry to switch device.
>> + */
>> +int netdev_switch_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
>> +                              u8 tos, u8 type, u32 tb_id)
>> +{
>> +       struct net_device *dev;
>> +       const struct net_device_ops *ops;
>> +       int err = -EOPNOTSUPP;
>> +
>> +       dev = netdev_switch_get_by_fib_dev(fi->fib_dev);
>> +       if (!dev)
>> +               return -EOPNOTSUPP;
>> +       ops = dev->netdev_ops;
>> +
>> +       if (ops->ndo_switch_fib_ipv4_add)
>> +               err = ops->ndo_switch_fib_ipv4_add(dev, htonl(dst),
>> dst_len,
>> +                                                  fi, tos, type, tb_id);
>> +
>> +       return err;
>> +}
>> +EXPORT_SYMBOL(netdev_switch_fib_ipv4_add);
>> +
>> +/**
>> + *     netdev_switch_fib_ipv4_del - Delete IPv4 route entry from switch
>> + *
>> + *     @dst: route's IPv4 destination address
>> + *     @dst_len: destination address length (prefix length)
>> + *     @fi: route FIB info structure
>> + *     @tos: route TOS
>> + *     @type: route type
>> + *     @tb_id: route table ID
>> + *
>> + *     Delete IPv4 route entry from switch device.
>> + */
>> +int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
>> +                              u8 tos, u8 type, u32 tb_id)
>> +{
>> +       struct net_device *dev;
>> +       const struct net_device_ops *ops;
>> +       int err = -EOPNOTSUPP;
>> +
>> +       dev = netdev_switch_get_by_fib_dev(fi->fib_dev);
>> +       if (!dev)
>> +               return -EOPNOTSUPP;
>> +       ops = dev->netdev_ops;
>> +
>> +       if (ops->ndo_switch_fib_ipv4_del)
>> +               err = ops->ndo_switch_fib_ipv4_del(dev, htonl(dst),
>> dst_len,
>> +                                                  fi, tos, type, tb_id);
>> +
>> +       return err;
>> +}
>> +EXPORT_SYMBOL(netdev_switch_fib_ipv4_del);
>
>

^ permalink raw reply

* [PATCH] drivers:net:wireless: Add proper locking for the function, b43_op_beacon_set_tim in main.c
From: Nicholas Krause @ 2015-01-02  7:34 UTC (permalink / raw)
  To: stefano.brivio-hl5o88x/ua9eoWH0uzbU5w
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	b43-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ, linux-kernel-u79uwXL29TY76Z2rM5mHXA

This adds proper locking for the function, b43_op_beacon_set_tim in main.c by using the mutex lock
in the structure pointer wl, as embedded into this pointer as a mutex in order to protect against
multiple access to the pointer wl when updating the templates for this pointer in the function,
b43_update_templates internally in the function, b43_op_beacon_set_tim.

Signed-off-by: Nicholas Krause <xerofoify-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/net/wireless/b43/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
index 47731cb..d568fc8 100644
--- a/drivers/net/wireless/b43/main.c
+++ b/drivers/net/wireless/b43/main.c
@@ -5094,8 +5094,9 @@ static int b43_op_beacon_set_tim(struct ieee80211_hw *hw,
 {
 	struct b43_wl *wl = hw_to_b43_wl(hw);
 
-	/* FIXME: add locking */
+	mutex_lock(&wl->mutex);
 	b43_update_templates(wl);
+	mutex_unlock(&wl->mutex);
 
 	return 0;
 }
-- 
2.1.0

^ permalink raw reply related

* Re: [PATCH net-next 0/7] Fixing the "Time Counter fixes and improvements"
From: Richard Cochran @ 2015-01-02  6:14 UTC (permalink / raw)
  To: Sedat Dilek; +Cc: David Miller, netdev@vger.kernel.org
In-Reply-To: <CA+icZUXF6-_W6=YsajP4W84F5yvKCNtA3EYxKBgNS=aY4qAYPg@mail.gmail.com>

On Thu, Jan 01, 2015 at 05:30:58PM +0100, Sedat Dilek wrote:
> With this conversion those 2 commits in net-next.git#master are obsolete now...
> 
> e1000e: Include clocksource.h to get CLOCKSOURCE_MASK.
> igb_ptp: Include clocksource.h to get CLOCKSOURCE_MASK.

Right. Okay, I'll rework this series to undo the obsolete includes.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH net-next 1/3] net: add IPv4 routing FIB support for swdev
From: roopa @ 2015-01-02  5:49 UTC (permalink / raw)
  To: sfeldma; +Cc: netdev, jiri, john.fastabend, tgraf, jhs, andy
In-Reply-To: <1420169361-31767-2-git-send-email-sfeldma@gmail.com>

On 1/1/15, 7:29 PM, sfeldma@gmail.com wrote:
> From: Scott Feldman <sfeldma@gmail.com>
>
> To offload IPv4 L3 routing functions to swdev device, the swdev device driver
> implements two new ndo ops (ndo_switch_fib_ipv4_add/del).  The ops are called
> by the core IPv4 FIB code when installing/removing FIB entries to/from the
> kernel FIB.  On install, the driver should return 0 if FIB entry (route) can be
> installed to device for offloading, -EOPNOTSUPP if route cannot be installed
> due to device limitations, and other negative error code on failure to install
> route to device.  On failure error code, the route is not installed to device,
> and not installed in kernel FIB, and the return code is propagated back to the
> user-space caller (via netlink).  An -EOPNOTSUPP error code is skipped for the
> device but installed in the kernel FIB.
>
> The FIB entry (route) nexthop list is used to find the swdev device port to
> anchor the ndo op call.  The route's fib_dev (the first nexthop's dev) is used
> find the swdev port by recursively traversing the fib_dev's lower_dev list
> until a swdev port is found.  The ndo op is called on this swdev port.

scott, I posted a similar api for bridge attribute sets. But, nobody 
supported it.
http://marc.info/?l=linux-netdev&m=141820234410602&w=2

If this is acceptable, I will be resubmitting my api as well.



>
> Since the FIB entry is "naked" when push from the kernel, the driver/device
> is responsible for resolving the route's nexthops to neighbor MAC addresses.
> This can be done by the driver by monitoring NETEVENT_NEIGH_UPDATE
> netevent notifier to watch for ARP activity.  Once a nexthop is resolved to
> neighbor MAC address, it can be installed to the device and the device will
> do the L3 routing offload in HW, for that nexthop.
>
> Signed-off-by: Scott Feldman <sfeldma@gmail.com>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
>   include/linux/netdevice.h |   22 +++++++++++
>   include/net/switchdev.h   |   18 +++++++++
>   net/ipv4/fib_trie.c       |   17 ++++++++-
>   net/switchdev/switchdev.c |   89 +++++++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 145 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 679e6e9..b66d22b 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -767,6 +767,8 @@ struct netdev_phys_item_id {
>   typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
>   				       struct sk_buff *skb);
>   
> +struct fib_info;
> +
>   /*
>    * This structure defines the management hooks for network devices.
>    * The following hooks can be defined; unless noted otherwise, they are
> @@ -1030,6 +1032,14 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
>    * int (*ndo_switch_port_stp_update)(struct net_device *dev, u8 state);
>    *	Called to notify switch device port of bridge port STP
>    *	state change.
> + * int (*ndo_sw_parent_fib_ipv4_add)(struct net_device *dev, __be32 dst,
> + *				     int dst_len, struct fib_info *fi,
> + *				     u8 tos, u8 type, u32 tb_id);
> + *	Called to add IPv4 route to switch device.
> + * int (*ndo_sw_parent_fib_ipv4_del)(struct net_device *dev, __be32 dst,
> + *				     int dst_len, struct fib_info *fi,
> + *				     u8 tos, u8 type, u32 tb_id);
> + *	Called to delete IPv4 route from switch device.
>    */
>   struct net_device_ops {
>   	int			(*ndo_init)(struct net_device *dev);
> @@ -1189,6 +1199,18 @@ struct net_device_ops {
>   							    struct netdev_phys_item_id *psid);
>   	int			(*ndo_switch_port_stp_update)(struct net_device *dev,
>   							      u8 state);
> +	int			(*ndo_switch_fib_ipv4_add)(struct net_device *dev,
> +							   __be32 dst,
> +							   int dst_len,
> +							   struct fib_info *fi,
> +							   u8 tos, u8 type,
> +							   u32 tb_id);
> +	int			(*ndo_switch_fib_ipv4_del)(struct net_device *dev,
> +							   __be32 dst,
> +							   int dst_len,
> +							   struct fib_info *fi,
> +							   u8 tos, u8 type,
> +							   u32 tb_id);
>   #endif
>   };
>   
> diff --git a/include/net/switchdev.h b/include/net/switchdev.h
> index 8a6d164..caebc2a 100644
> --- a/include/net/switchdev.h
> +++ b/include/net/switchdev.h
> @@ -17,6 +17,10 @@
>   int netdev_switch_parent_id_get(struct net_device *dev,
>   				struct netdev_phys_item_id *psid);
>   int netdev_switch_port_stp_update(struct net_device *dev, u8 state);
> +int netdev_switch_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
> +			       u8 tos, u8 type, u32 tb_id);
> +int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
> +			       u8 tos, u8 type, u32 tb_id);
>   
>   #else
>   
> @@ -32,6 +36,20 @@ static inline int netdev_switch_port_stp_update(struct net_device *dev,
>   	return -EOPNOTSUPP;
>   }
>   
> +static inline int netdev_switch_fib_ipv4_add(u32 dst, int dst_len,
> +					     struct fib_info *fi,
> +					     u8 tos, u8 type, u32 tb_id)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
> +static inline int netdev_switch_fib_ipv4_del(u32 dst, int dst_len,
> +					     struct fib_info *fi,
> +					     u8 tos, u8 type, u32 tb_id)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
>   #endif
>   
>   #endif /* _LINUX_SWITCHDEV_H_ */
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index 281e5e0..ea2dc17 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -79,6 +79,7 @@
>   #include <net/tcp.h>
>   #include <net/sock.h>
>   #include <net/ip_fib.h>
> +#include <net/switchdev.h>
>   #include "fib_lookup.h"
>   
>   #define MAX_STAT_DEPTH 32
> @@ -1201,6 +1202,8 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
>   			fib_release_info(fi_drop);
>   			if (state & FA_S_ACCESSED)
>   				rt_cache_flush(cfg->fc_nlinfo.nl_net);
> +			netdev_switch_fib_ipv4_add(key, plen, fi, fa->fa_tos,
> +						   cfg->fc_type, tb->tb_id);
>   			rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen,
>   				tb->tb_id, &cfg->fc_nlinfo, NLM_F_REPLACE);
>   
> @@ -1229,6 +1232,13 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
>   	new_fa->fa_tos = tos;
>   	new_fa->fa_type = cfg->fc_type;
>   	new_fa->fa_state = 0;
> +
> +	/* (Optionally) offload fib info to switch hardware. */
> +	err = netdev_switch_fib_ipv4_add(key, plen, fi, tos,
> +					 cfg->fc_type, tb->tb_id);
> +	if (err && err != -EOPNOTSUPP)
> +		goto out_free_new_fa;
> +
>   	/*
>   	 * Insert new entry to the list.
>   	 */
> @@ -1237,7 +1247,7 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
>   		fa_head = fib_insert_node(t, key, plen);
>   		if (unlikely(!fa_head)) {
>   			err = -ENOMEM;
> -			goto out_free_new_fa;
> +			goto out_sw_fib_del;
>   		}
>   	}
>   
> @@ -1253,6 +1263,8 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
>   succeeded:
>   	return 0;
>   
> +out_sw_fib_del:
> +	netdev_switch_fib_ipv4_del(key, plen, fi, tos, cfg->fc_type, tb->tb_id);
>   out_free_new_fa:
>   	kmem_cache_free(fn_alias_kmem, new_fa);
>   out:
> @@ -1529,6 +1541,9 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
>   	rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb->tb_id,
>   		  &cfg->fc_nlinfo, 0);
>   
> +	netdev_switch_fib_ipv4_del(key, plen, fa->fa_info, tos,
> +				   cfg->fc_type, tb->tb_id);
> +
>   	list_del_rcu(&fa->fa_list);
>   
>   	if (!plen)
> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
> index d162b21..211a8a0 100644
> --- a/net/switchdev/switchdev.c
> +++ b/net/switchdev/switchdev.c
> @@ -12,6 +12,7 @@
>   #include <linux/types.h>
>   #include <linux/init.h>
>   #include <linux/netdevice.h>
> +#include <net/ip_fib.h>
>   #include <net/switchdev.h>
>   
>   /**
> @@ -50,3 +51,91 @@ int netdev_switch_port_stp_update(struct net_device *dev, u8 state)
>   	return ops->ndo_switch_port_stp_update(dev, state);
>   }
>   EXPORT_SYMBOL(netdev_switch_port_stp_update);
> +
> +static struct net_device *netdev_switch_get_by_fib_dev(struct net_device *dev)
> +{
> +	const struct net_device_ops *ops = dev->netdev_ops;
> +	struct net_device *lower_dev;
> +	struct net_device *port_dev;
> +	struct list_head *iter;
> +
> +	/* Recusively search from fib_dev down until we find
> +	 * a sw port dev.  (A sw port dev supports
> +	 * ndo_switch_parent_id_get).
> +	 */
> +
> +	if (ops->ndo_switch_parent_id_get)
> +		return dev;
> +
> +	netdev_for_each_lower_dev(dev, lower_dev, iter) {
> +		port_dev = netdev_switch_get_by_fib_dev(lower_dev);
> +		if (port_dev)
> +			return port_dev;
> +	}
> +
> +	return NULL;
> +}
> +
> +/**
> + *	netdev_switch_fib_ipv4_add - Add IPv4 route entry to switch
> + *
> + *	@dst: route's IPv4 destination address
> + *	@dst_len: destination address length (prefix length)
> + *	@fi: route FIB info structure
> + *	@tos: route TOS
> + *	@type: route type
> + *	@tb_id: route table ID
> + *
> + *	Add IPv4 route entry to switch device.
> + */
> +int netdev_switch_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
> +			       u8 tos, u8 type, u32 tb_id)
> +{
> +	struct net_device *dev;
> +	const struct net_device_ops *ops;
> +	int err = -EOPNOTSUPP;
> +
> +	dev = netdev_switch_get_by_fib_dev(fi->fib_dev);
> +	if (!dev)
> +		return -EOPNOTSUPP;
> +	ops = dev->netdev_ops;
> +
> +	if (ops->ndo_switch_fib_ipv4_add)
> +		err = ops->ndo_switch_fib_ipv4_add(dev, htonl(dst), dst_len,
> +						   fi, tos, type, tb_id);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(netdev_switch_fib_ipv4_add);
> +
> +/**
> + *	netdev_switch_fib_ipv4_del - Delete IPv4 route entry from switch
> + *
> + *	@dst: route's IPv4 destination address
> + *	@dst_len: destination address length (prefix length)
> + *	@fi: route FIB info structure
> + *	@tos: route TOS
> + *	@type: route type
> + *	@tb_id: route table ID
> + *
> + *	Delete IPv4 route entry from switch device.
> + */
> +int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
> +			       u8 tos, u8 type, u32 tb_id)
> +{
> +	struct net_device *dev;
> +	const struct net_device_ops *ops;
> +	int err = -EOPNOTSUPP;
> +
> +	dev = netdev_switch_get_by_fib_dev(fi->fib_dev);
> +	if (!dev)
> +		return -EOPNOTSUPP;
> +	ops = dev->netdev_ops;
> +
> +	if (ops->ndo_switch_fib_ipv4_del)
> +		err = ops->ndo_switch_fib_ipv4_del(dev, htonl(dst), dst_len,
> +						   fi, tos, type, tb_id);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(netdev_switch_fib_ipv4_del);

^ permalink raw reply

* Re: [PATCH net-next 0/3] swdev: add IPv4 routing offload
From: Dave Taht @ 2015-01-02  5:11 UTC (permalink / raw)
  To: Scott Feldman
  Cc: netdev@vger.kernel.org, jiri, john fastabend, Thomas Graf,
	Jamal Hadi Salim, andy, roopa, David Lamparter
In-Reply-To: <1420169361-31767-1-git-send-email-sfeldma@gmail.com>

On Thu, Jan 1, 2015 at 7:29 PM,  <sfeldma@gmail.com> wrote:
> From: Scott Feldman <sfeldma@gmail.com>
>
> This patch set adds L3 routing offload support for IPv4 routes.  The idea is to
> mirror routes installed in the kernel's FIB down to a hardware switch device to
> offload the data forwarding path for L3.  Only the data forwarding path is
> intercepted.  Control and management of the kernel's FIB remains with the
> kernel.
>
> A couple of new ndo ops (ndo_switch_fib_ipv4_add/del) are added to the swdev
> model to add/remove FIB entries to/from the offload device.  The ops are called
> from the core IPv4 FIB code directly.  Just before the FIB entry is installed
> in the kernel's FIB, the swdev device driver gets a chance at the FIB entry
> (assuming the swdev driver implements the new ndo ops).  This is a synchronous
> call in the RTM_NEWROUTE path, and the swdev has the option to fail the
> install, which means the FIB entry is not installed in swdev or the kernel, and
> the user is notified of the failure.  The swdev driver also has the option to
> return -EOPNOTSUPP to pass on the FIB entry, so it'll only be installed in the
> kernel FIB.

A couple notes:

1) As currently implemented in quagga, (to my knowledge), an actual
route change is actually a route delete/route add rather than an
atomic route modify or route add/route delete. While it would be nice
to fix quagga to do it atomically (and for all I know some fork does
it right?), I am curious as to the extent of serialization during the
process like this in the virtual switch. (and it does not appear you
have tested the ip route change commands above, or beat up quagga's
routing decisions)

2) It is generally helpful to be concurrently running the max traffic
you can sustain through the switch, while doing fib changes... and
observing what happens to that traffic.

3) As you attempt ipv6, life gets more complex. (you need to switch to
a later routing protocol in particular...)

4) There's a new idea on the block: Source specific routing (sometimes
called SADR) is mandated by the ietf homenet working group, in
particular, which relies on IPV6_subtrees, and link local ipv6
multicast. the code furthest enough along is babels
(http://www.pps.univ-paris-diderot.fr/~jch/software/babel/
https://github.com/boutier/babeld also with patches for quagga) which,
being easy to setup, might be a good exercise of both link local
multicast and of ipv6 in the virtual switch itself, as well as
exercising the fib. (ospfv3 and ISIS also have support for source
specific routing in various branches.)

>
> The FIB flush path is modified also to call into the swdev driver to flush the
> FIB entries from hardware.
>
> The rocker swdev driver is updated to support these new ndo ops.  Right now
> rocker only supports IPv4 singlepath routes, but follow-on patches will add
> IPv6 and ECMP support.  Also, only unicast IPv4 routes are supported, but
> follow-on patches will add multicast route support.
>
> Testing was done in my simulated network envionment using VMs and the rocker
> device.  I'm using Quagga OSPFv2 for the routing protocol for automatic control
> plane processing.  No modifications to Quagga or netlink/iproute2 is required;
> it just works.
>
> One important metric is the time spent installing/removing FIB entries from the
> kernel and the device.  With these patches applied, I measured the wall time
> required to install and remove 10K IPv4 routes.  I used ip route add cmd in
> batch mode to install static routes.  I used the ip route flush cmd to delete
> the routes.  This is 10000 routes installed to the kernel's FIB and to the
> swdev device's L3 tables.  And then removed from each.  The performance is less
> than a second for each operation.  This is on my simulated rocker device running
> on a VM, so a real embedded CPU would probably do much better.
>
> My batch has 10K lines of:
>
> simp@simp:~$ head east
> route add 16.0.0.0/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.1/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.2/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.3/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.4/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.5/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.6/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.7/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.8/32 nexthop via 11.0.0.2 dev swp1
> route add 16.0.0.9/32 nexthop via 11.0.0.2 dev swp1
> [...]
>
> Install/removing routes:
>
> simp@simp:~$ wc -l east
> 10000 east
> simp@simp:~$ ip route show root 16/8 | wc -l
> 0
> simp@simp:~$ time sudo ip --batch east
>
> real    0m0.715s
> user    0m0.092s
> sys     0m0.388s
> simp@simp:~$ ip route show root 16/8 | wc -l
> 10000
>
> [At this point, 10K routes are installed in kernel and the device]
>
> simp@simp:~$ time sudo ip route flush root 16/8
>
> real    0m0.458s
> user    0m0.000s
> sys     0m0.284s
> simp@simp:~$ ip route show root 16/8 | wc -l
> 0
>
> [All gone]
>
> Scott Feldman (3):
>   net: add IPv4 routing FIB support for swdev
>   net: call swdev fib del for flushed routes
>   rocker: implement IPv4 fib offloading
>
>  drivers/net/ethernet/rocker/rocker.c |  441 +++++++++++++++++++++++++++++++++-
>  include/linux/netdevice.h            |   22 ++
>  include/net/switchdev.h              |   18 ++
>  net/ipv4/fib_trie.c                  |   31 ++-
>  net/switchdev/switchdev.c            |   89 +++++++
>  5 files changed, 592 insertions(+), 9 deletions(-)
>
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Dave Täht

thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

^ permalink raw reply

* [PATCH net-next 3/3] rocker: implement IPv4 fib offloading
From: sfeldma @ 2015-01-02  3:29 UTC (permalink / raw)
  To: netdev, jiri, john.fastabend, tgraf, jhs, andy, roopa

From: Scott Feldman <sfeldma@gmail.com>

The driver implements ndo_switch_fib_ipv4_add/del ops to add/del IPv4 routes
to/from swdev device.  Once a route is added to the device, and the route's
nexthops are resolved to neighbor MAC address, the device will forward matching
pkts rather than the kernel.  This offloads the L3 forwarding path from the
kernel to the device.  Note that control and management planes are still
mananged by Linux; only the data plane is offloaded.  Standard routing control
protocols such as OSPF and BGP run on Linux and manage the kernel's FIB via
standard rtm netlink msgs.

A new hash table is added to rocker to track neighbors.  The driver listens for
neighbor updates events using netevent notifier NETEVENT_NEIGH_UPDATE.  Any ARP
table updates for ports on this device are recorded in this table.  Routes
installed to the device with nexthops that reference neighbors in this table
are "qualified".  In the case of a route with nexthops not resolved in the
table, a kernel thread is started to ARP-ping the neighbor proactively to
resolve the MAC address for the neighbor.  The driver uses arp_send() to send
the ARP request to resolve the MAC address, every 2 seconds until resolved.
Once resolved, the kernel thread is stopped.

The device can only forward to pkts matching route dst to resolved nexthops.
Currently, the device only supports single-path routes (i.e. routes with one
nexthop).  Multipath (ECMP) route support will be added in followup patches.

This patch is driver support for unicast IPv4 routing only.  Followup patches
will add driver and infrastructure for IPv6 routing and multicast routing.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 drivers/net/ethernet/rocker/rocker.c |  441 +++++++++++++++++++++++++++++++++-
 1 file changed, 438 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index 2f398fa..c554816 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -32,6 +32,9 @@
 #include <linux/bitops.h>
 #include <net/switchdev.h>
 #include <net/rtnetlink.h>
+#include <net/ip_fib.h>
+#include <net/netevent.h>
+#include <net/arp.h>
 #include <asm-generic/io-64-nonatomic-lo-hi.h>
 #include <generated/utsrelease.h>
 
@@ -161,6 +164,19 @@ struct rocker_internal_vlan_tbl_entry {
 	__be16 vlan_id;
 };
 
+struct rocker_neigh_tbl_entry {
+	struct hlist_node entry;
+	__be32 ip_addr; /* key */
+	struct net_device *dev;
+	u32 ref_count;
+	u32 index;
+	u8 eth_dst[ETH_ALEN];
+	bool ttl_check;
+	struct delayed_work arp_work;
+	unsigned long arp_delay;
+	bool arp_running;
+};
+
 struct rocker_desc_info {
 	char *data; /* mapped */
 	size_t data_size;
@@ -234,6 +250,9 @@ struct rocker {
 	unsigned long internal_vlan_bitmap[ROCKER_INTERNAL_VLAN_BITMAP_LEN];
 	DECLARE_HASHTABLE(internal_vlan_tbl, 8);
 	spinlock_t internal_vlan_tbl_lock;
+	DECLARE_HASHTABLE(neigh_tbl, 16);
+	spinlock_t neigh_tbl_lock;
+	u32 neigh_tbl_next_index;
 };
 
 static const u8 zero_mac[ETH_ALEN]   = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
@@ -2145,9 +2164,9 @@ static int rocker_cmd_group_tbl_del(struct rocker *rocker,
 	return 0;
 }
 
-/*****************************************
- * Flow, group, FDB, internal VLAN tables
- *****************************************/
+/***************************************************
+ * Flow, group, FDB, internal VLAN and neigh tables
+ ***************************************************/
 
 static int rocker_init_tbls(struct rocker *rocker)
 {
@@ -2163,6 +2182,9 @@ static int rocker_init_tbls(struct rocker *rocker)
 	hash_init(rocker->internal_vlan_tbl);
 	spin_lock_init(&rocker->internal_vlan_tbl_lock);
 
+	hash_init(rocker->neigh_tbl);
+	spin_lock_init(&rocker->neigh_tbl_lock);
+
 	return 0;
 }
 
@@ -2173,6 +2195,7 @@ static void rocker_free_tbls(struct rocker *rocker)
 	struct rocker_group_tbl_entry *group_entry;
 	struct rocker_fdb_tbl_entry *fdb_entry;
 	struct rocker_internal_vlan_tbl_entry *internal_vlan_entry;
+	struct rocker_neigh_tbl_entry *neigh_entry;
 	struct hlist_node *tmp;
 	int bkt;
 
@@ -2196,6 +2219,11 @@ static void rocker_free_tbls(struct rocker *rocker)
 			   tmp, internal_vlan_entry, entry)
 		hash_del(&internal_vlan_entry->entry);
 	spin_unlock_irqrestore(&rocker->internal_vlan_tbl_lock, flags);
+
+	spin_lock_irqsave(&rocker->neigh_tbl_lock, flags);
+	hash_for_each_safe(rocker->neigh_tbl, bkt, tmp, neigh_entry, entry)
+		hash_del(&neigh_entry->entry);
+	spin_unlock_irqrestore(&rocker->neigh_tbl_lock, flags);
 }
 
 static struct rocker_flow_tbl_entry *
@@ -2444,6 +2472,29 @@ static int rocker_flow_tbl_bridge(struct rocker_port *rocker_port,
 	return rocker_flow_tbl_do(rocker_port, flags, entry);
 }
 
+static int rocker_flow_tbl_ucast4_routing(struct rocker_port *rocker_port,
+					  __be16 eth_type, __be32 dst,
+					  __be32 dst_mask,
+					  enum rocker_of_dpa_table_id goto_tbl,
+					  u32 group_id, int flags)
+{
+	struct rocker_flow_tbl_entry *entry;
+
+	entry = kzalloc(sizeof(*entry), rocker_op_flags_gfp(flags));
+	if (!entry)
+		return -ENOMEM;
+
+	entry->key.tbl_id = ROCKER_OF_DPA_TABLE_ID_UNICAST_ROUTING;
+	entry->key.priority = ROCKER_PRIORITY_UNICAST_ROUTING;
+	entry->key.ucast_routing.eth_type = eth_type;
+	entry->key.ucast_routing.dst4 = dst;
+	entry->key.ucast_routing.dst4_mask = dst_mask;
+	entry->key.ucast_routing.goto_tbl = goto_tbl;
+	entry->key.ucast_routing.group_id = group_id;
+
+	return rocker_flow_tbl_do(rocker_port, flags, entry);
+}
+
 static int rocker_flow_tbl_acl(struct rocker_port *rocker_port,
 			       int flags, u32 in_lport,
 			       u32 in_lport_mask,
@@ -2652,6 +2703,232 @@ static int rocker_group_l2_flood(struct rocker_port *rocker_port,
 				       group_id);
 }
 
+static int rocker_group_l3_unicast(struct rocker_port *rocker_port,
+				   int flags, u32 index, u8 *src_mac,
+				   u8 *dst_mac, __be16 vlan_id,
+				   bool ttl_check, u32 lport)
+{
+	struct rocker_group_tbl_entry *entry;
+
+	entry = kzalloc(sizeof(*entry), rocker_op_flags_gfp(flags));
+	if (!entry)
+		return -ENOMEM;
+
+	entry->group_id = ROCKER_GROUP_L3_UNICAST(index);
+	if (src_mac)
+		ether_addr_copy(entry->l3_unicast.eth_src, src_mac);
+	if (dst_mac)
+		ether_addr_copy(entry->l3_unicast.eth_dst, dst_mac);
+	entry->l3_unicast.vlan_id = vlan_id;
+	entry->l3_unicast.ttl_check = ttl_check;
+	entry->l3_unicast.group_id = ROCKER_GROUP_L2_INTERFACE(vlan_id, lport);
+
+	return rocker_group_tbl_do(rocker_port, flags, entry);
+}
+
+static struct rocker_neigh_tbl_entry *
+	rocker_neigh_tbl_find(struct rocker *rocker, __be32 ip_addr)
+{
+	struct rocker_neigh_tbl_entry *found;
+
+	hash_for_each_possible(rocker->neigh_tbl, found, entry, ip_addr)
+		if (found->ip_addr == ip_addr)
+			return found;
+
+	return NULL;
+}
+
+static void _rocker_neigh_add(struct rocker *rocker,
+			      struct rocker_neigh_tbl_entry *entry)
+{
+	entry->index = rocker->neigh_tbl_next_index++;
+	entry->ref_count++;
+	hash_add(rocker->neigh_tbl, &entry->entry, entry->ip_addr);
+}
+
+static void _rocker_neigh_del(struct rocker *rocker,
+			      struct rocker_neigh_tbl_entry *entry)
+{
+	if (--entry->ref_count == 0)
+		hash_del(&entry->entry);
+}
+
+static void _rocker_neigh_update(struct rocker *rocker,
+				 struct rocker_neigh_tbl_entry *entry,
+				 u8 *eth_dst)
+{
+	if (eth_dst)
+		ether_addr_copy(entry->eth_dst, eth_dst);
+	else
+		entry->ref_count++;
+}
+
+static void rocker_port_neigh_resolve_work(struct work_struct *work)
+{
+	struct rocker_neigh_tbl_entry *entry =
+		container_of(to_delayed_work(work),
+			     struct rocker_neigh_tbl_entry, arp_work);
+
+	arp_send(ARPOP_REQUEST, ETH_P_ARP, entry->ip_addr, entry->dev,
+		 entry->ip_addr, NULL, entry->dev->dev_addr, NULL);
+
+	entry->arp_delay = entry->arp_delay ? entry->arp_delay * 2 : HZ;
+	schedule_delayed_work(&entry->arp_work, entry->arp_delay);
+}
+
+static int rocker_port_neigh(struct rocker_port *rocker_port, int flags,
+			     __be32 ip_addr, u8 *eth_dst)
+{
+	struct rocker *rocker = rocker_port->rocker;
+	struct rocker_neigh_tbl_entry *entry;
+	struct rocker_neigh_tbl_entry *found;
+	unsigned long lock_flags;
+	__be16 eth_type = htons(ETH_P_IP);
+	enum rocker_of_dpa_table_id goto_tbl =
+		ROCKER_OF_DPA_TABLE_ID_ACL_POLICY;
+	u32 group_id;
+	bool adding = !(flags & ROCKER_OP_FLAG_REMOVE);
+	bool updating;
+	bool removing;
+	int err = 0;
+
+	entry = kzalloc(sizeof(*entry), rocker_op_flags_gfp(flags));
+	if (!entry)
+		return -ENOMEM;
+
+	entry->ip_addr = ip_addr;
+	entry->dev = rocker_port->dev;
+	ether_addr_copy(entry->eth_dst, eth_dst);
+	entry->ttl_check = true;
+
+	spin_lock_irqsave(&rocker->neigh_tbl_lock, lock_flags);
+
+	found = rocker_neigh_tbl_find(rocker, ip_addr);
+
+	updating = found && adding;
+	removing = found && !adding;
+	adding = !found && adding;
+
+	if (adding)
+		_rocker_neigh_add(rocker, entry);
+	else if (removing)
+		_rocker_neigh_del(rocker, found);
+	else if (updating)
+		_rocker_neigh_update(rocker, found, eth_dst);
+
+	if (found)
+		kfree(entry);
+	else
+		found = entry;
+
+	if (found->arp_running) {
+		found->arp_running = false;
+		cancel_delayed_work_sync(&found->arp_work);
+	}
+
+	spin_unlock_irqrestore(&rocker->neigh_tbl_lock, lock_flags);
+
+	if (!adding && !removing && !updating)
+		return -ENOENT;
+
+	/* For each active neighbor, we have an L3 unicast group and
+	 * a /32 route to the neighbor, which uses the L3 unicast
+	 * group.  The L3 unicast group can also be referred to by
+	 * other routes' nexthops.
+	 */
+
+	err = rocker_group_l3_unicast(rocker_port, flags,
+				      found->index,
+				      rocker_port->dev->dev_addr,
+				      found->eth_dst,
+				      rocker_port->internal_vlan_id,
+				      found->ttl_check,
+				      rocker_port->lport);
+	if (err) {
+		netdev_err(rocker_port->dev,
+			   "Error (%d) L3 unicast group index %d\n",
+			   err, found->index);
+		return err;
+	}
+
+	if (adding || removing) {
+		group_id = ROCKER_GROUP_L3_UNICAST(found->index);
+		err = rocker_flow_tbl_ucast4_routing(rocker_port,
+						     eth_type, found->ip_addr,
+						     inet_make_mask(32),
+						     goto_tbl, group_id,
+						     flags);
+
+		if (err)
+			netdev_err(rocker_port->dev,
+				   "Error (%d) /32 unicast route %pI4 group 0x%08x\n",
+				   err, &found->ip_addr, group_id);
+	}
+
+	return err;
+}
+
+static int rocker_port_nh(struct rocker_port *rocker_port, int flags,
+			  __be32 ip_addr, u32 *index)
+{
+	struct rocker *rocker = rocker_port->rocker;
+	struct rocker_neigh_tbl_entry *entry;
+	struct rocker_neigh_tbl_entry *found;
+	unsigned long lock_flags;
+	bool adding = !(flags & ROCKER_OP_FLAG_REMOVE);
+	bool updating;
+	bool removing;
+	bool completed;
+
+	entry = kzalloc(sizeof(*entry), rocker_op_flags_gfp(flags));
+	if (!entry)
+		return -ENOMEM;
+
+	entry->ip_addr = ip_addr;
+	entry->dev = rocker_port->dev;
+
+	spin_lock_irqsave(&rocker->neigh_tbl_lock, lock_flags);
+
+	found = rocker_neigh_tbl_find(rocker, ip_addr);
+
+	updating = found && adding;
+	removing = found && !adding;
+	adding = !found && adding;
+
+	if (adding)
+		_rocker_neigh_add(rocker, entry);
+	else if (removing)
+		_rocker_neigh_del(rocker, found);
+	else if (updating)
+		_rocker_neigh_update(rocker, found, NULL);
+
+	if (found)
+		kfree(entry);
+	else
+		found = entry;
+
+	/* Completed means neigh ip_addr is resolved to neigh mac.
+	 * If an entry is incomplete, we need to ARP to try to
+	 * resolve the neigh mac.
+	 */
+
+	completed = !is_zero_ether_addr(found->eth_dst);
+
+	if (!completed && !found->arp_running) {
+		INIT_DELAYED_WORK(&found->arp_work,
+				  rocker_port_neigh_resolve_work);
+		found->arp_delay = 0;
+		found->arp_running = true;
+		schedule_delayed_work(&found->arp_work, found->arp_delay);
+	}
+
+	spin_unlock_irqrestore(&rocker->neigh_tbl_lock, lock_flags);
+
+	*index = found->index;
+
+	return 0;
+}
+
 static int rocker_port_vlan_flood_group(struct rocker_port *rocker_port,
 					int flags, __be16 vlan_id)
 {
@@ -3381,6 +3658,79 @@ not_found:
 	spin_unlock_irqrestore(&rocker->internal_vlan_tbl_lock, lock_flags);
 }
 
+static int rocker_port_fib_ipv4(struct rocker_port *rocker_port, __be32 dst,
+				int dst_len, struct fib_info *fi, u32 tb_id,
+				int flags)
+{
+	struct fib_nh *nh = fi->fib_nh;
+	__be16 eth_type = htons(ETH_P_IP);
+	__be32 dst_mask = inet_make_mask(dst_len);
+	__be16 internal_vlan_id = rocker_port->internal_vlan_id;
+	enum rocker_of_dpa_table_id goto_tbl =
+		ROCKER_OF_DPA_TABLE_ID_ACL_POLICY;
+	u32 group_id;
+	bool nh_on_port = (fi->fib_dev == rocker_port->dev);
+	bool has_gw = !!nh->nh_gw;
+	u32 index;
+	int err;
+
+	/* XXX support ECMP */
+
+	if (has_gw && nh_on_port) {
+		err = rocker_port_nh(rocker_port, flags, nh->nh_gw, &index);
+		if (err)
+			return err;
+
+		group_id = ROCKER_GROUP_L3_UNICAST(index);
+	} else {
+		/* Send to CPU for processing */
+		group_id = ROCKER_GROUP_L2_INTERFACE(internal_vlan_id, 0);
+	}
+
+	err = rocker_flow_tbl_ucast4_routing(rocker_port,
+					     eth_type, dst,
+					     dst_mask, goto_tbl,
+					     group_id, flags);
+	if (err)
+		netdev_err(rocker_port->dev, "Error (%d) IPv4 route %pI4\n",
+			   err, &dst);
+
+	return err;
+}
+
+static int rocker_port_fib_ipv4_skip(struct net_device *dev,
+				     __be32 dst, int dst_len,
+				     struct fib_info *fi,
+				     u8 tos, u8 type, u32 tb_id)
+{
+	if (fi->fib_flags & RTM_F_CLONED)
+		return -EOPNOTSUPP;
+
+	if (tb_id != RT_TABLE_MAIN && tb_id != RT_TABLE_LOCAL)
+		return -EOPNOTSUPP;
+
+	if (type != RTN_UNICAST && type != RTN_BLACKHOLE &&
+	    type != RTN_UNREACHABLE && type != RTN_LOCAL &&
+	    type != RTN_BROADCAST)
+		return -EOPNOTSUPP;
+
+	if (tb_id == RT_TABLE_MAIN && type != RTN_UNICAST &&
+	    type != RTN_BLACKHOLE && type != RTN_UNREACHABLE)
+		return -EOPNOTSUPP;
+
+	if (tos != 0)
+		return -EOPNOTSUPP;
+
+	if (ipv4_is_loopback(dst))
+		return -EOPNOTSUPP;
+
+	/* XXX not handling ECMP right now */
+	if (fi->fib_nhs != 1)
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
 /*****************
  * Net device ops
  *****************/
@@ -3781,6 +4131,42 @@ static int rocker_port_switch_port_stp_update(struct net_device *dev, u8 state)
 	return rocker_port_stp_update(rocker_port, state);
 }
 
+static int rocker_port_switch_fib_ipv4_add(struct net_device *dev,
+					   __be32 dst, int dst_len,
+					   struct fib_info *fi,
+					   u8 tos, u8 type, u32 tb_id)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	int flags = 0;
+	int err;
+
+	err = rocker_port_fib_ipv4_skip(dev, dst, dst_len, fi,
+					tos, type, tb_id);
+	if (err)
+		return err;
+
+	return rocker_port_fib_ipv4(rocker_port, dst, dst_len,
+				    fi, tb_id, flags);
+}
+
+static int rocker_port_switch_fib_ipv4_del(struct net_device *dev,
+					   __be32 dst, int dst_len,
+					   struct fib_info *fi,
+					   u8 tos, u8 type, u32 tb_id)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	int flags = ROCKER_OP_FLAG_REMOVE;
+	int err;
+
+	err = rocker_port_fib_ipv4_skip(dev, dst, dst_len, fi,
+					tos, type, tb_id);
+	if (err)
+		return err;
+
+	return rocker_port_fib_ipv4(rocker_port, dst, dst_len,
+				    fi, tb_id, flags);
+}
+
 static const struct net_device_ops rocker_port_netdev_ops = {
 	.ndo_open			= rocker_port_open,
 	.ndo_stop			= rocker_port_stop,
@@ -3795,6 +4181,8 @@ static const struct net_device_ops rocker_port_netdev_ops = {
 	.ndo_bridge_getlink		= rocker_port_bridge_getlink,
 	.ndo_switch_parent_id_get	= rocker_port_switch_parent_id_get,
 	.ndo_switch_port_stp_update	= rocker_port_switch_port_stp_update,
+	.ndo_switch_fib_ipv4_add	= rocker_port_switch_fib_ipv4_add,
+	.ndo_switch_fib_ipv4_del	= rocker_port_switch_fib_ipv4_del,
 };
 
 /********************
@@ -4340,6 +4728,50 @@ static struct notifier_block rocker_netdevice_nb __read_mostly = {
 	.notifier_call = rocker_netdevice_event,
 };
 
+/************************************
+ * Net event notifier event handler
+ ************************************/
+
+static int rocker_neigh_update(struct net_device *dev, struct neighbour *neigh)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	int flags = neigh->nud_state & NUD_VALID ? 0 : ROCKER_OP_FLAG_REMOVE;
+	__be32 ip_addr = *(__be32 *)neigh->primary_key;
+	unsigned char *mac = neigh->ha;
+
+	return rocker_port_neigh(rocker_port, flags, ip_addr, mac);
+}
+
+static int rocker_netevent_event(struct notifier_block *unused,
+				 unsigned long event, void *ptr)
+{
+	struct net_device *dev;
+	struct neighbour *neigh;
+	int err;
+
+	switch (event) {
+	case NETEVENT_NEIGH_UPDATE:
+		neigh = ptr;
+		if (neigh->tbl != &arp_tbl)
+			return NOTIFY_DONE;
+		dev = neigh->dev;
+		if (!rocker_port_dev_check(dev))
+			return NOTIFY_DONE;
+		err = rocker_neigh_update(dev, neigh);
+		if (err)
+			netdev_warn(dev,
+				    "failed to handle neigh update (err %d)\n",
+				    err);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block rocker_netevent_nb __read_mostly = {
+	.notifier_call = rocker_netevent_event,
+};
+
 /***********************
  * Module init and exit
  ***********************/
@@ -4349,18 +4781,21 @@ static int __init rocker_module_init(void)
 	int err;
 
 	register_netdevice_notifier(&rocker_netdevice_nb);
+	register_netevent_notifier(&rocker_netevent_nb);
 	err = pci_register_driver(&rocker_pci_driver);
 	if (err)
 		goto err_pci_register_driver;
 	return 0;
 
 err_pci_register_driver:
+	unregister_netdevice_notifier(&rocker_netevent_nb);
 	unregister_netdevice_notifier(&rocker_netdevice_nb);
 	return err;
 }
 
 static void __exit rocker_module_exit(void)
 {
+	unregister_netevent_notifier(&rocker_netevent_nb);
 	unregister_netdevice_notifier(&rocker_netdevice_nb);
 	pci_unregister_driver(&rocker_pci_driver);
 }
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 2/3] net: call swdev fib del for flushed routes
From: sfeldma @ 2015-01-02  3:29 UTC (permalink / raw)
  To: netdev, jiri, john.fastabend, tgraf, jhs, andy, roopa

From: Scott Feldman <sfeldma@gmail.com>

This is essentially the same patch Nicolas Dichtel posted, but was rejected.
The difference here is rather than calling netlink to notify, this patch
calls swdev to del the fib entry.  Without this patch, when routes are flushed,
for example when an interface goes down, the normal route delete paths aren't
called, so we need to hook the flush path so swdev get's called.

Nicolas' patch is here:

http://marc.info/?l=linux-netdev&m=135161909714545&w=2

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 net/ipv4/fib_trie.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index ea2dc17..c7a5947 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1565,15 +1565,19 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
 	return 0;
 }
 
-static int trie_flush_list(struct list_head *head)
+static int trie_flush_list(struct fib_table *tb, struct tnode *l,
+			   struct leaf_info *li)
 {
 	struct fib_alias *fa, *fa_node;
 	int found = 0;
 
-	list_for_each_entry_safe(fa, fa_node, head, fa_list) {
+	list_for_each_entry_safe(fa, fa_node, &li->falh, fa_list) {
 		struct fib_info *fi = fa->fa_info;
 
 		if (fi && (fi->fib_flags & RTNH_F_DEAD)) {
+			netdev_switch_fib_ipv4_del(l->key, li->plen, fi,
+						   fa->fa_tos, fa->fa_type,
+						   tb->tb_id);
 			list_del_rcu(&fa->fa_list);
 			fib_release_info(fa->fa_info);
 			alias_free_mem_rcu(fa);
@@ -1583,7 +1587,7 @@ static int trie_flush_list(struct list_head *head)
 	return found;
 }
 
-static int trie_flush_leaf(struct tnode *l)
+static int trie_flush_leaf(struct fib_table *tb, struct tnode *l)
 {
 	int found = 0;
 	struct hlist_head *lih = &l->list;
@@ -1591,7 +1595,7 @@ static int trie_flush_leaf(struct tnode *l)
 	struct leaf_info *li = NULL;
 
 	hlist_for_each_entry_safe(li, tmp, lih, hlist) {
-		found += trie_flush_list(&li->falh);
+		found += trie_flush_list(tb, l, li);
 
 		if (list_empty(&li->falh)) {
 			hlist_del_rcu(&li->hlist);
@@ -1674,7 +1678,7 @@ int fib_table_flush(struct fib_table *tb)
 	int found = 0;
 
 	for (l = trie_firstleaf(t); l; l = trie_nextleaf(l)) {
-		found += trie_flush_leaf(l);
+		found += trie_flush_leaf(tb, l);
 
 		if (ll && hlist_empty(&ll->list))
 			trie_leaf_remove(t, ll);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 1/3] net: add IPv4 routing FIB support for swdev
From: sfeldma @ 2015-01-02  3:29 UTC (permalink / raw)
  To: netdev, jiri, john.fastabend, tgraf, jhs, andy, roopa

From: Scott Feldman <sfeldma@gmail.com>

To offload IPv4 L3 routing functions to swdev device, the swdev device driver
implements two new ndo ops (ndo_switch_fib_ipv4_add/del).  The ops are called
by the core IPv4 FIB code when installing/removing FIB entries to/from the
kernel FIB.  On install, the driver should return 0 if FIB entry (route) can be
installed to device for offloading, -EOPNOTSUPP if route cannot be installed
due to device limitations, and other negative error code on failure to install
route to device.  On failure error code, the route is not installed to device,
and not installed in kernel FIB, and the return code is propagated back to the
user-space caller (via netlink).  An -EOPNOTSUPP error code is skipped for the
device but installed in the kernel FIB.

The FIB entry (route) nexthop list is used to find the swdev device port to
anchor the ndo op call.  The route's fib_dev (the first nexthop's dev) is used
find the swdev port by recursively traversing the fib_dev's lower_dev list
until a swdev port is found.  The ndo op is called on this swdev port.

Since the FIB entry is "naked" when push from the kernel, the driver/device
is responsible for resolving the route's nexthops to neighbor MAC addresses.
This can be done by the driver by monitoring NETEVENT_NEIGH_UPDATE
netevent notifier to watch for ARP activity.  Once a nexthop is resolved to
neighbor MAC address, it can be installed to the device and the device will
do the L3 routing offload in HW, for that nexthop.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 include/linux/netdevice.h |   22 +++++++++++
 include/net/switchdev.h   |   18 +++++++++
 net/ipv4/fib_trie.c       |   17 ++++++++-
 net/switchdev/switchdev.c |   89 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 145 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 679e6e9..b66d22b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -767,6 +767,8 @@ struct netdev_phys_item_id {
 typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
 				       struct sk_buff *skb);
 
+struct fib_info;
+
 /*
  * This structure defines the management hooks for network devices.
  * The following hooks can be defined; unless noted otherwise, they are
@@ -1030,6 +1032,14 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  * int (*ndo_switch_port_stp_update)(struct net_device *dev, u8 state);
  *	Called to notify switch device port of bridge port STP
  *	state change.
+ * int (*ndo_sw_parent_fib_ipv4_add)(struct net_device *dev, __be32 dst,
+ *				     int dst_len, struct fib_info *fi,
+ *				     u8 tos, u8 type, u32 tb_id);
+ *	Called to add IPv4 route to switch device.
+ * int (*ndo_sw_parent_fib_ipv4_del)(struct net_device *dev, __be32 dst,
+ *				     int dst_len, struct fib_info *fi,
+ *				     u8 tos, u8 type, u32 tb_id);
+ *	Called to delete IPv4 route from switch device.
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1189,6 +1199,18 @@ struct net_device_ops {
 							    struct netdev_phys_item_id *psid);
 	int			(*ndo_switch_port_stp_update)(struct net_device *dev,
 							      u8 state);
+	int			(*ndo_switch_fib_ipv4_add)(struct net_device *dev,
+							   __be32 dst,
+							   int dst_len,
+							   struct fib_info *fi,
+							   u8 tos, u8 type,
+							   u32 tb_id);
+	int			(*ndo_switch_fib_ipv4_del)(struct net_device *dev,
+							   __be32 dst,
+							   int dst_len,
+							   struct fib_info *fi,
+							   u8 tos, u8 type,
+							   u32 tb_id);
 #endif
 };
 
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 8a6d164..caebc2a 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -17,6 +17,10 @@
 int netdev_switch_parent_id_get(struct net_device *dev,
 				struct netdev_phys_item_id *psid);
 int netdev_switch_port_stp_update(struct net_device *dev, u8 state);
+int netdev_switch_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
+			       u8 tos, u8 type, u32 tb_id);
+int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
+			       u8 tos, u8 type, u32 tb_id);
 
 #else
 
@@ -32,6 +36,20 @@ static inline int netdev_switch_port_stp_update(struct net_device *dev,
 	return -EOPNOTSUPP;
 }
 
+static inline int netdev_switch_fib_ipv4_add(u32 dst, int dst_len,
+					     struct fib_info *fi,
+					     u8 tos, u8 type, u32 tb_id)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int netdev_switch_fib_ipv4_del(u32 dst, int dst_len,
+					     struct fib_info *fi,
+					     u8 tos, u8 type, u32 tb_id)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif
 
 #endif /* _LINUX_SWITCHDEV_H_ */
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 281e5e0..ea2dc17 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -79,6 +79,7 @@
 #include <net/tcp.h>
 #include <net/sock.h>
 #include <net/ip_fib.h>
+#include <net/switchdev.h>
 #include "fib_lookup.h"
 
 #define MAX_STAT_DEPTH 32
@@ -1201,6 +1202,8 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
 			fib_release_info(fi_drop);
 			if (state & FA_S_ACCESSED)
 				rt_cache_flush(cfg->fc_nlinfo.nl_net);
+			netdev_switch_fib_ipv4_add(key, plen, fi, fa->fa_tos,
+						   cfg->fc_type, tb->tb_id);
 			rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen,
 				tb->tb_id, &cfg->fc_nlinfo, NLM_F_REPLACE);
 
@@ -1229,6 +1232,13 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
 	new_fa->fa_tos = tos;
 	new_fa->fa_type = cfg->fc_type;
 	new_fa->fa_state = 0;
+
+	/* (Optionally) offload fib info to switch hardware. */
+	err = netdev_switch_fib_ipv4_add(key, plen, fi, tos,
+					 cfg->fc_type, tb->tb_id);
+	if (err && err != -EOPNOTSUPP)
+		goto out_free_new_fa;
+
 	/*
 	 * Insert new entry to the list.
 	 */
@@ -1237,7 +1247,7 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
 		fa_head = fib_insert_node(t, key, plen);
 		if (unlikely(!fa_head)) {
 			err = -ENOMEM;
-			goto out_free_new_fa;
+			goto out_sw_fib_del;
 		}
 	}
 
@@ -1253,6 +1263,8 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
 succeeded:
 	return 0;
 
+out_sw_fib_del:
+	netdev_switch_fib_ipv4_del(key, plen, fi, tos, cfg->fc_type, tb->tb_id);
 out_free_new_fa:
 	kmem_cache_free(fn_alias_kmem, new_fa);
 out:
@@ -1529,6 +1541,9 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
 	rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb->tb_id,
 		  &cfg->fc_nlinfo, 0);
 
+	netdev_switch_fib_ipv4_del(key, plen, fa->fa_info, tos,
+				   cfg->fc_type, tb->tb_id);
+
 	list_del_rcu(&fa->fa_list);
 
 	if (!plen)
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index d162b21..211a8a0 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -12,6 +12,7 @@
 #include <linux/types.h>
 #include <linux/init.h>
 #include <linux/netdevice.h>
+#include <net/ip_fib.h>
 #include <net/switchdev.h>
 
 /**
@@ -50,3 +51,91 @@ int netdev_switch_port_stp_update(struct net_device *dev, u8 state)
 	return ops->ndo_switch_port_stp_update(dev, state);
 }
 EXPORT_SYMBOL(netdev_switch_port_stp_update);
+
+static struct net_device *netdev_switch_get_by_fib_dev(struct net_device *dev)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct net_device *lower_dev;
+	struct net_device *port_dev;
+	struct list_head *iter;
+
+	/* Recusively search from fib_dev down until we find
+	 * a sw port dev.  (A sw port dev supports
+	 * ndo_switch_parent_id_get).
+	 */
+
+	if (ops->ndo_switch_parent_id_get)
+		return dev;
+
+	netdev_for_each_lower_dev(dev, lower_dev, iter) {
+		port_dev = netdev_switch_get_by_fib_dev(lower_dev);
+		if (port_dev)
+			return port_dev;
+	}
+
+	return NULL;
+}
+
+/**
+ *	netdev_switch_fib_ipv4_add - Add IPv4 route entry to switch
+ *
+ *	@dst: route's IPv4 destination address
+ *	@dst_len: destination address length (prefix length)
+ *	@fi: route FIB info structure
+ *	@tos: route TOS
+ *	@type: route type
+ *	@tb_id: route table ID
+ *
+ *	Add IPv4 route entry to switch device.
+ */
+int netdev_switch_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
+			       u8 tos, u8 type, u32 tb_id)
+{
+	struct net_device *dev;
+	const struct net_device_ops *ops;
+	int err = -EOPNOTSUPP;
+
+	dev = netdev_switch_get_by_fib_dev(fi->fib_dev);
+	if (!dev)
+		return -EOPNOTSUPP;
+	ops = dev->netdev_ops;
+
+	if (ops->ndo_switch_fib_ipv4_add)
+		err = ops->ndo_switch_fib_ipv4_add(dev, htonl(dst), dst_len,
+						   fi, tos, type, tb_id);
+
+	return err;
+}
+EXPORT_SYMBOL(netdev_switch_fib_ipv4_add);
+
+/**
+ *	netdev_switch_fib_ipv4_del - Delete IPv4 route entry from switch
+ *
+ *	@dst: route's IPv4 destination address
+ *	@dst_len: destination address length (prefix length)
+ *	@fi: route FIB info structure
+ *	@tos: route TOS
+ *	@type: route type
+ *	@tb_id: route table ID
+ *
+ *	Delete IPv4 route entry from switch device.
+ */
+int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
+			       u8 tos, u8 type, u32 tb_id)
+{
+	struct net_device *dev;
+	const struct net_device_ops *ops;
+	int err = -EOPNOTSUPP;
+
+	dev = netdev_switch_get_by_fib_dev(fi->fib_dev);
+	if (!dev)
+		return -EOPNOTSUPP;
+	ops = dev->netdev_ops;
+
+	if (ops->ndo_switch_fib_ipv4_del)
+		err = ops->ndo_switch_fib_ipv4_del(dev, htonl(dst), dst_len,
+						   fi, tos, type, tb_id);
+
+	return err;
+}
+EXPORT_SYMBOL(netdev_switch_fib_ipv4_del);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 0/3] swdev: add IPv4 routing offload
From: sfeldma @ 2015-01-02  3:29 UTC (permalink / raw)
  To: netdev, jiri, john.fastabend, tgraf, jhs, andy, roopa

From: Scott Feldman <sfeldma@gmail.com>

This patch set adds L3 routing offload support for IPv4 routes.  The idea is to
mirror routes installed in the kernel's FIB down to a hardware switch device to
offload the data forwarding path for L3.  Only the data forwarding path is
intercepted.  Control and management of the kernel's FIB remains with the
kernel.

A couple of new ndo ops (ndo_switch_fib_ipv4_add/del) are added to the swdev
model to add/remove FIB entries to/from the offload device.  The ops are called
from the core IPv4 FIB code directly.  Just before the FIB entry is installed
in the kernel's FIB, the swdev device driver gets a chance at the FIB entry
(assuming the swdev driver implements the new ndo ops).  This is a synchronous
call in the RTM_NEWROUTE path, and the swdev has the option to fail the
install, which means the FIB entry is not installed in swdev or the kernel, and
the user is notified of the failure.  The swdev driver also has the option to
return -EOPNOTSUPP to pass on the FIB entry, so it'll only be installed in the
kernel FIB.

The FIB flush path is modified also to call into the swdev driver to flush the
FIB entries from hardware.

The rocker swdev driver is updated to support these new ndo ops.  Right now
rocker only supports IPv4 singlepath routes, but follow-on patches will add
IPv6 and ECMP support.  Also, only unicast IPv4 routes are supported, but
follow-on patches will add multicast route support.

Testing was done in my simulated network envionment using VMs and the rocker
device.  I'm using Quagga OSPFv2 for the routing protocol for automatic control
plane processing.  No modifications to Quagga or netlink/iproute2 is required;
it just works.

One important metric is the time spent installing/removing FIB entries from the
kernel and the device.  With these patches applied, I measured the wall time
required to install and remove 10K IPv4 routes.  I used ip route add cmd in
batch mode to install static routes.  I used the ip route flush cmd to delete
the routes.  This is 10000 routes installed to the kernel's FIB and to the
swdev device's L3 tables.  And then removed from each.  The performance is less
than a second for each operation.  This is on my simulated rocker device running
on a VM, so a real embedded CPU would probably do much better.

My batch has 10K lines of:

simp@simp:~$ head east
route add 16.0.0.0/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.1/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.2/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.3/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.4/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.5/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.6/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.7/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.8/32 nexthop via 11.0.0.2 dev swp1
route add 16.0.0.9/32 nexthop via 11.0.0.2 dev swp1
[...]

Install/removing routes:

simp@simp:~$ wc -l east
10000 east
simp@simp:~$ ip route show root 16/8 | wc -l
0
simp@simp:~$ time sudo ip --batch east

real    0m0.715s
user    0m0.092s
sys     0m0.388s
simp@simp:~$ ip route show root 16/8 | wc -l
10000

[At this point, 10K routes are installed in kernel and the device]

simp@simp:~$ time sudo ip route flush root 16/8

real    0m0.458s
user    0m0.000s
sys     0m0.284s
simp@simp:~$ ip route show root 16/8 | wc -l
0

[All gone]

Scott Feldman (3):
  net: add IPv4 routing FIB support for swdev
  net: call swdev fib del for flushed routes
  rocker: implement IPv4 fib offloading

 drivers/net/ethernet/rocker/rocker.c |  441 +++++++++++++++++++++++++++++++++-
 include/linux/netdevice.h            |   22 ++
 include/net/switchdev.h              |   18 ++
 net/ipv4/fib_trie.c                  |   31 ++-
 net/switchdev/switchdev.c            |   89 +++++++
 5 files changed, 592 insertions(+), 9 deletions(-)

-- 
1.7.10.4

^ permalink raw reply

* Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%
From: David Miller @ 2015-01-02  2:08 UTC (permalink / raw)
  To: alexander.duyck; +Cc: alexander.h.duyck, netdev
In-Reply-To: <54A4B1D4.1030506@gmail.com>

From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Wed, 31 Dec 2014 18:32:52 -0800

> On 12/31/2014 03:46 PM, David Miller wrote:
>> This knocks about 35 cpu cycles off of a lookup that ends up using the
>> default route on sparc64.  From about ~438 cycles to ~403.
> 
> Did that 438 value include both fib_table_lookup and check_leaf?  Just
> curious as the overall gain seems smaller than what I have been seeing
> on the x86 system I was testing with, but then again it could just be a
> sparc64 thing.

This is just a default run of my kbench_mod.ko from the net_test_tools
repo.  You can try it as well on x86-86 or similar.

> I've started work on a second round of patches.  With any luck they
> should be ready by the time the next net-next opens.  My hope is to cut
> the look-up time by another 30 to 50%, though it will take some time as
> I have to go though and drop the leaf_info structure, and look at
> splitting the tnode in half to break the key/pos/bits and child pointer
> dependency chain which will hopefully allow for a significant reduction
> in memory read stalls.

I'm very much looking forward to this.

> I am also planning to take a look at addressing the memory waste that
> occurs on nodes larger than 256 bytes due to the way kmalloc allocates
> memory as powers of 2.  I'm thinking I might try encouraging the growth
> of smaller nodes, and discouraging anything over 256 by implementing a
> "truesize" type logic that can be used in the inflate/halve functions so
> that the memory usage is more accurately reflected.

Wouldn't this result in a deeper tree?  The whole point is to keep the
tree as shallow as possible to minimize the memory refs on a lookup
right?

^ permalink raw reply

* Re: [PATCH v3 0/6] support GMAC driver for RK3288
From: David Miller @ 2015-01-02  2:01 UTC (permalink / raw)
  To: heiko
  Cc: roger.chen, peppe.cavallaro, netdev, linux-kernel, linux-rockchip,
	kever.yang, eddie.cai
In-Reply-To: <1493383.y3pPThNXY3@phil>

From: Heiko Stübner <heiko@sntech.de>
Date: Thu, 01 Jan 2015 03:06:31 +0100

> Hi David,
> 
> Am Mittwoch, 31. Dezember 2014, 19:15:38 schrieb David Miller:
>> From: Roger Chen <roger.chen@rock-chips.com>
>> Date: Mon, 29 Dec 2014 17:42:32 +0800
>> 
>> > Roger Chen (6):
>> >   patch1: add driver for Rockchip RK3288 SoCs integrated GMAC
>> >   patch2: define clock ID used for GMAC
>> >   patch3: modify CRU config for Rockchip RK3288 SoCs integrated GMAC
>> >   patch4: dts: rockchip: add gmac info for rk3288
>> >   patch5: dts: rockchip: enable gmac on RK3288 evb board
>> >   patch6: add document for Rockchip RK3288 GMAC
>> > 
>> > Tested on rk3288 evb board:
>> > Execute the following command to enable ethernet,
>> > set local IP and ping a remote host.
>> > 
>> > busybox ifconfig eth0 up
>> > busybox ifconfig eth0 192.168.1.111
>> > ping 192.168.1.1
>> 
>> Series applied to net-next, thanks.
> 
> could we split this up a bit instead?

Too late, what's in my tree is in the permanent commit history
and cannot be deleted.

^ permalink raw reply

* Re: [PATCH] TCP: Add support for TCP Stealth
From: Julian Kirsch @ 2015-01-01 23:31 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Christian Grothoff, Jacob Appelbaum
In-Reply-To: <20150101111030.1e2b3a18@urahara>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Stephen,

thanks a lot for your input. I've reflected your suggestions in the
code: The mode member of the stealth struct is now a u8, integrity_len
is now unsigned (size_t), I've switched the integrity_len and
integrity_hash members in order to close the gaps in the struct (sorry
for wasting bits) and I removed the cast from the second parameter of
the memcopy you mentioned. While I totally agree with you that the
part of the iv-setup where we xor the hash, tsval and dport in is not
very readable, I'd argue that declaring and using a struct would make
the patch longer while the semantics of such a struct really are
needed only in context of a single function and therefore cannot be
reused. Do you think that accessing the elements with a macro (see
below) could be an alternative?

#define tcp_stealth_iv_integrity_hash(iv)	(((__be16 *)&iv)[2])

Best,
Julian

On 2015-01-01 20:10, Stephen Hemminger wrote:
> On Wed, 31 Dec 2014 22:54:59 +0100 Julian Kirsch
> <kirschju@sec.in.tum.de> wrote:
> 
>> +	memcpy(iv, (const __u8 *)daddr, +	       (daddr_size >
>> sizeof(iv)) ? sizeof(iv) : daddr_size); + +#ifdef
>> CONFIG_TCP_MD5SIG +	md5 = tp->af_specific->md5_lookup(sk, sk); 
>> +#else +	md5 = NULL; +#endif +	if (likely(sysctl_tcp_timestamps
>> && !md5) || tp->stealth.saw_tsval) +		tsval =
>> tp->stealth.mstamp.stamp_jiffies; + +	((__be16 *)iv)[2] ^=
>> cpu_to_be16(tp->stealth.integrity_hash);
> 
> Cast unnecessary on memcpy arg since it takes void *
> 
> Would be clearer to use a real structure or union not assignment to
> cast to setup iv. -- To unsubscribe from this list: send the line
> "unsubscribe netdev" in the body of a message to
> majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJUpdjhAAoJENwkOWttRRA4ZFYIALs7gskvZqlCzFCuNTsJ4js9
7x5OTsuyY5caOIEcveVqYnW2dcOO2Jtwe1QUIOsRo7X6YOEA/8IT6+sJ0fxViLTD
TJAzU670Kcecn7+0cHPAj31yW+t9SHb5BBzMLCJlhSAboMs0YKmkwetqffg013uP
x81OI6kJy6pUCAeBeyyy20QafrIhs5vjEILGf9qSzeoIXRBdpnuH99FzoxEjOkUA
ka4QtrAUh3Uk0s6H8ezcpqvY2bKcz7te8+af5XF+Kz/+DLatoN2x58psxw3irBCw
x18FRVsbgmXY/m3leKilK4ieCyO1LzafRNa674fSW6QtHHMDZwSoYi0kDjSNMRk=
=ke3l
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: [PATCH] can: kvaser_usb: Don't free packets when tight on URBs
From: Stephen Hemminger @ 2015-01-01 21:59 UTC (permalink / raw)
  To: Ahmed S. Darwish
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, David S. Miller, Paul Gortmaker, Linux-CAN,
	netdev, LKML
In-Reply-To: <20141223154654.GB6460@vivalin-002>

On Tue, 23 Dec 2014 17:46:54 +0200
"Ahmed S. Darwish" <darwish.07@gmail.com> wrote:

>  	int ret = NETDEV_TX_OK;
> +	bool kfree_skb_on_error = true;
>  
>  	if (can_dropped_invalid_skb(netdev, skb))
>  		return NETDEV_TX_OK;
> @@ -1336,6 +1337,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
>  
>  	if (!context) {
>  		netdev_warn(netdev, "cannot find free context\n");
> +		kfree_skb_on_error = false;
>  		ret =  NETDEV_TX_BUSY;

You already have a flag value (ret == NETDEV_TX_BUSY), why
not use that instead of introducing another variable?

^ permalink raw reply

* Re: [PATCH] drivers:isdn: Remove uneeded fix me comment in capi.c for the function,decode_ie
From: Tilman Schmidt @ 2015-01-01 21:18 UTC (permalink / raw)
  To: Nicholas Krause, hjlipp; +Cc: isdn, gigaset307x-common, netdev, linux-kernel
In-Reply-To: <1420053714-24661-1-git-send-email-xerofoify@gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nicholas,

Am 31.12.2014 um 20:21 schrieb Nicholas Krause:
> Removes a no longer needed fix me comment for the
> function,decode_ie in the file, capi.c. This comment is no longer
> needed as the commit this was written in was 2009 when the new
> driver,three was introduced. Due to no breakage related to the 
> newer version of this driver we can safely remove this comment.

I won't oppose removal of that comment. However your commit message is
not quite to the point.

The comment poses the question of whether the conversion to upper case
done by the toupper() calls in the two lines following is actually
necessary. The fact that there is no breakage with the current code
does not answer that question. It could only be answered by *removing*
the toupper() calls and *then* checking for breakage.

However the cost of two toupper() calls seems so small that it's
hardly worth the effort and risk and we should just leave them in.

Regards,
Tilman

> diff --git a/drivers/isdn/gigaset/capi.c
> b/drivers/isdn/gigaset/capi.c index ccec777..d55ba3f 100644 ---
> a/drivers/isdn/gigaset/capi.c +++ b/drivers/isdn/gigaset/capi.c @@
> -182,7 +182,6 @@ static void decode_ie(u8 *in, char *out) { int i =
> *in; while (i-- > 0) { -		/* ToDo: conversion to upper case
> necessary? */ *out++ = toupper(hex_asc_hi(*++in)); *out++ =
> toupper(hex_asc_lo(*in)); }
> 

- -- 
Tilman Schmidt                              E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBAgAGBQJUpbm7AAoJEFPuqx0v+F+qxFcH/1F6l/1ycTikNdwfsHesZW6Q
TXDO7YfguLDe+wyePEvHdBFW+AwEgGFWTz2EEqfpdu/dHwmWjZh8yTLkJEMfn4fz
jBDm2Kyw+uFzUSXPVCyMUNXgC7yATs6l2AEkX5FSr0BycUVyMIeemChB4Pf5DfQD
B1Lz86xKhM3HpdI/KIDXqqHK88mT/B+7+rUL6lyJ6GRU/168EetD5PzjXoQ67DUw
JaSpCTjf1nUQAL1Nw623QVT4le/vPXLhwpIIAhLmvwO3nn8XWx4WdiCGVm6bGgrV
coWwo8LRB/IMCwoZGh2RhmDy2nOYwj1k9NYLjrx+izhbQUSpwqoLUCZe6vhBTks=
=Ppmw
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] bridge: modify bridge af spec parser to accomodate vlan list and ranges
From: Scott Feldman @ 2015-01-01 19:34 UTC (permalink / raw)
  To: Arad, Ronen
  Cc: roopa@cumulusnetworks.com, netdev@vger.kernel.org,
	hemminger@vyatta.com, vyasevic@redhat.com,
	wkok@cumulusnetworks.com
In-Reply-To: <E4CD12F19ABA0C4D8729E087A761DC3505DD2DE4@ORSMSX101.amr.corp.intel.com>

On Thu, Jan 1, 2015 at 12:54 AM, Arad, Ronen <ronen.arad@intel.com> wrote:
>
>
>>-----Original Message-----
>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>>Behalf Of roopa@cumulusnetworks.com
>>Sent: Wednesday, December 31, 2014 6:49 PM
>>To: netdev@vger.kernel.org; hemminger@vyatta.com; vyasevic@redhat.com
>>Cc: sfeldma@gmail.com; wkok@cumulusnetworks.com; Roopa Prabhu
>>Subject: [PATCH net-next v2 2/2] bridge: modify bridge af spec parser to
>>accomodate vlan list and ranges
>>
>>From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>>This patch modifies br_afspec to parse incoming IFLA_BRIDGE_VLAN_INFO_LIST
>>
>>Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>---
>> net/bridge/br_netlink.c |  115 ++++++++++++++++++++++++++++++++++------------
>>-
>> 1 file changed, 85 insertions(+), 30 deletions(-)
>>
>>diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>index 492ef6a..bcba9d2 100644
>>--- a/net/bridge/br_netlink.c
>>+++ b/net/bridge/br_netlink.c
>>@@ -226,53 +226,108 @@ static const struct nla_policy
>>ifla_br_policy[IFLA_MAX+1] = {
>>       [IFLA_BRIDGE_VLAN_INFO_LIST] = { .type = NLA_NESTED, },
>> };
>>
>>+static int br_vlan_info(struct net_bridge *br, struct net_bridge_port *p,
>>+                      int cmd, struct bridge_vlan_info *vinfo)
>>+{
>>+      int err = 0;
>>+
>>+      switch (cmd) {
>>+      case RTM_SETLINK:
>>+              if (p) {
>>+                      err = nbp_vlan_add(p, vinfo->vid, vinfo->flags);
>>+                      if (err)
>>+                              break;
>>+
>>+                      if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>>+                              err = br_vlan_add(p->br, vinfo->vid,
>>+                                                vinfo->flags);
>>+              } else {
>>+                      err = br_vlan_add(br, vinfo->vid, vinfo->flags);
>>+              }
>>+              break;
>>+
>>+      case RTM_DELLINK:
>>+              if (p) {
>>+                      nbp_vlan_delete(p, vinfo->vid);
>>+                      if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>>+                              br_vlan_delete(p->br, vinfo->vid);
>>+              } else {
>>+                      br_vlan_delete(br, vinfo->vid);
>>+              }
>>+              break;
>>+      }
>>+
>>+      return err;
>>+}
>>+
>> static int br_afspec(struct net_bridge *br,
>>                    struct net_bridge_port *p,
>>                    struct nlattr *af_spec,
>>                    int cmd)
>> {
>>       struct nlattr *tb[IFLA_BRIDGE_MAX+1];
>>+      struct nlattr *attr;
>>       int err = 0;
>>+      int rem;
>>
>>       err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec, ifla_br_policy);
>>       if (err)
>>               return err;
>>
>>       if (tb[IFLA_BRIDGE_VLAN_INFO]) {
>>-              struct bridge_vlan_info *vinfo;
>>-
>>-              vinfo = nla_data(tb[IFLA_BRIDGE_VLAN_INFO]);
>>-
>>-              if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK)
>>-                      return -EINVAL;
>>-
>>-              switch (cmd) {
>>-              case RTM_SETLINK:
>>-                      if (p) {
>>-                              err = nbp_vlan_add(p, vinfo->vid, vinfo->flags);
>>-                              if (err)
>>-                                      break;
>>-
>>-                              if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>>-                                      err = br_vlan_add(p->br, vinfo->vid,
>>-                                                        vinfo->flags);
>>-                      } else
>>-                              err = br_vlan_add(br, vinfo->vid, vinfo->flags);
>>-
>>-                      break;
>>-
>>-              case RTM_DELLINK:
>>-                      if (p) {
>>-                              nbp_vlan_delete(p, vinfo->vid);
>>-                              if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>>-                                      br_vlan_delete(p->br, vinfo->vid);
>>-                      } else
>>-                              br_vlan_delete(br, vinfo->vid);
>>-                      break;
>>+              attr = tb[IFLA_BRIDGE_VLAN_INFO];
>>+              if (nla_len(attr) != sizeof(struct bridge_vlan_info))
>>+                      goto err_inval;
>>+
>>+              err = br_vlan_info(br, p, cmd,
>>+                                 (struct bridge_vlan_info *)nla_data(attr));
>>+
>>+      } else if (tb[IFLA_BRIDGE_VLAN_INFO_LIST]) {
>>+              struct bridge_vlan_info *vinfo_start = NULL;
>>+              struct bridge_vlan_info *vinfo = NULL;
>>+
>>+              nla_for_each_nested(attr, tb[IFLA_BRIDGE_VLAN_INFO_LIST], rem) {
>>+                      if (nla_len(attr) != sizeof(struct bridge_vlan_info) ||
>>+                          nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
>>+                              goto err_inval;
>>+                      vinfo = nla_data(attr);
>>+                      if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_START) {
>>+                              if (vinfo_start)
>>+                                      goto err_inval;
>>+                              vinfo_start = vinfo;
>>+                              continue;
>>+                      }
>>+
>>+                      if (vinfo_start) {
>>+                              int v;
>>+
>>+                              if (!(vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END))
>>+                                      goto err_inval;
>>+
>>+                              if (vinfo->vid < vinfo_start->vid)
>
> This check rejects inverted range. However it allows the RANGE_START and
> RANGE_END vinfos to have the same vid. Isn't it inconsistent with the rejection
> of a single vinfo with both RANGE_START and RANGE_END set?

Allowing both START and END to be set on single vinfo might simplify
the encoding of LIST, so maybe it should be allowed.

Roopa, I know you dropped the subsequent notification patches from the
set, but I suspect now with these new START/END markers, the
notification algorithm can be very close to the original loop, without
having to make copies of the vlan_bitmap and untagged_bitmap.  Using
both START/END on a single vinfo will keep the loop simple for adding
single vids that are not in a range.

(hmmm...START/STOP or BEGIN/END?  Seems START/END is mixing the two
concepts...BEGIN/END seems best)

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] bridge: modify bridge af spec parser to accomodate vlan list and ranges
From: Scott Feldman @ 2015-01-01 19:20 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: Netdev, hemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <1420044533-16963-3-git-send-email-roopa@cumulusnetworks.com>

On Wed, Dec 31, 2014 at 8:48 AM,  <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This patch modifies br_afspec to parse incoming IFLA_BRIDGE_VLAN_INFO_LIST
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
>  net/bridge/br_netlink.c |  115 ++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 85 insertions(+), 30 deletions(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 492ef6a..bcba9d2 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -226,53 +226,108 @@ static const struct nla_policy ifla_br_policy[IFLA_MAX+1] = {
>         [IFLA_BRIDGE_VLAN_INFO_LIST] = { .type = NLA_NESTED, },
>  };
>
> +static int br_vlan_info(struct net_bridge *br, struct net_bridge_port *p,
> +                       int cmd, struct bridge_vlan_info *vinfo)
> +{
> +       int err = 0;
> +
> +       switch (cmd) {
> +       case RTM_SETLINK:
> +               if (p) {
> +                       err = nbp_vlan_add(p, vinfo->vid, vinfo->flags);
> +                       if (err)
> +                               break;
> +
> +                       if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
> +                               err = br_vlan_add(p->br, vinfo->vid,
> +                                                 vinfo->flags);
> +               } else {
> +                       err = br_vlan_add(br, vinfo->vid, vinfo->flags);
> +               }
> +               break;
> +
> +       case RTM_DELLINK:
> +               if (p) {
> +                       nbp_vlan_delete(p, vinfo->vid);
> +                       if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
> +                               br_vlan_delete(p->br, vinfo->vid);
> +               } else {
> +                       br_vlan_delete(br, vinfo->vid);
> +               }
> +               break;
> +       }
> +
> +       return err;
> +}
> +
>  static int br_afspec(struct net_bridge *br,
>                      struct net_bridge_port *p,
>                      struct nlattr *af_spec,
>                      int cmd)
>  {
>         struct nlattr *tb[IFLA_BRIDGE_MAX+1];
> +       struct nlattr *attr;
>         int err = 0;
> +       int rem;
>
>         err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec, ifla_br_policy);
>         if (err)
>                 return err;
>
>         if (tb[IFLA_BRIDGE_VLAN_INFO]) {
> -               struct bridge_vlan_info *vinfo;
> -
> -               vinfo = nla_data(tb[IFLA_BRIDGE_VLAN_INFO]);
> -
> -               if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK)
> -                       return -EINVAL;
> -
> -               switch (cmd) {
> -               case RTM_SETLINK:
> -                       if (p) {
> -                               err = nbp_vlan_add(p, vinfo->vid, vinfo->flags);
> -                               if (err)
> -                                       break;
> -
> -                               if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
> -                                       err = br_vlan_add(p->br, vinfo->vid,
> -                                                         vinfo->flags);
> -                       } else
> -                               err = br_vlan_add(br, vinfo->vid, vinfo->flags);
> -
> -                       break;
> -
> -               case RTM_DELLINK:
> -                       if (p) {
> -                               nbp_vlan_delete(p, vinfo->vid);
> -                               if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
> -                                       br_vlan_delete(p->br, vinfo->vid);
> -                       } else
> -                               br_vlan_delete(br, vinfo->vid);
> -                       break;
> +               attr = tb[IFLA_BRIDGE_VLAN_INFO];
> +               if (nla_len(attr) != sizeof(struct bridge_vlan_info))
> +                       goto err_inval;

Size check isn't necessary here as it was already done with nla_parse_nested().

> +
> +               err = br_vlan_info(br, p, cmd,
> +                                  (struct bridge_vlan_info *)nla_data(attr));
> +
> +       } else if (tb[IFLA_BRIDGE_VLAN_INFO_LIST]) {

Could you have both IFLA_BRIDGE_VLAN_INFO and
IFLA_BRIDGE_VLAN_INFO_LIST?  If so, drop the else.

> +               struct bridge_vlan_info *vinfo_start = NULL;
> +               struct bridge_vlan_info *vinfo = NULL;

Initializer not needed on this ^^^ one.

> +
> +               nla_for_each_nested(attr, tb[IFLA_BRIDGE_VLAN_INFO_LIST], rem) {
> +                       if (nla_len(attr) != sizeof(struct bridge_vlan_info) ||
> +                           nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
> +                               goto err_inval;

goto isn't necessary...just return -EINVAL...more like this below

> +                       vinfo = nla_data(attr);
> +                       if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_START) {
> +                               if (vinfo_start)
> +                                       goto err_inval;
> +                               vinfo_start = vinfo;
> +                               continue;
> +                       }
> +
> +                       if (vinfo_start) {
> +                               int v;
> +
> +                               if (!(vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END))
> +                                       goto err_inval;
> +
> +                               if (vinfo->vid < vinfo_start->vid)
> +                                       goto err_inval;
> +
> +                               for (v = vinfo_start->vid; v <= vinfo->vid;
> +                                       v++) {

Did the above exceed 80 chars?  If not, one liner.

> +                                       vinfo_start->vid = v;
> +                                       err = br_vlan_info(br, p, cmd,
> +                                                          vinfo_start);
> +                                       if (err)
> +                                               break;
> +                               }
> +                               vinfo_start = NULL;
> +                       } else {
> +                               err = br_vlan_info(br, p, cmd, vinfo);
> +                       }
> +                       if (err)
> +                               break;
>                 }
>         }
>
>         return err;
> +
> +err_inval:
> +       return -EINVAL;
>  }
>
>  static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = {
> --
> 1.7.10.4
>

^ permalink raw reply

* [PATCH] isdn: hisax: hfc4s8s_l1:  Remove some unused functions
From: Rickard Strandqvist @ 2015-01-01 19:17 UTC (permalink / raw)
  To: Karsten Keil, David S. Miller
  Cc: Rickard Strandqvist, Dan Carpenter, netdev, linux-kernel

Removes some functions that are not used anywhere:
Read_hfc32() Write_hfc32() Write_hfc16()

This was partially found by using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
---
 drivers/isdn/hisax/hfc4s8s_l1.c |   21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/drivers/isdn/hisax/hfc4s8s_l1.c b/drivers/isdn/hisax/hfc4s8s_l1.c
index fc9f9d0..0e5d673 100644
--- a/drivers/isdn/hisax/hfc4s8s_l1.c
+++ b/drivers/isdn/hisax/hfc4s8s_l1.c
@@ -225,20 +225,6 @@ fWrite_hfc8(hfc4s8s_hw *a, u_char c)
 }
 
 static inline void
-Write_hfc16(hfc4s8s_hw *a, u_char b, u_short c)
-{
-	SetRegAddr(a, b);
-	outw(c, a->iobase);
-}
-
-static inline void
-Write_hfc32(hfc4s8s_hw *a, u_char b, u_long c)
-{
-	SetRegAddr(a, b);
-	outl(c, a->iobase);
-}
-
-static inline void
 fWrite_hfc32(hfc4s8s_hw *a, u_long c)
 {
 	outl(c, a->iobase);
@@ -266,13 +252,6 @@ Read_hfc16(hfc4s8s_hw *a, u_char b)
 }
 
 static inline u_long
-Read_hfc32(hfc4s8s_hw *a, u_char b)
-{
-	SetRegAddr(a, b);
-	return (inl((volatile u_int) a->iobase));
-}
-
-static inline u_long
 fRead_hfc32(hfc4s8s_hw *a)
 {
 	return (inl((volatile u_int) a->iobase));
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH] TCP: Add support for TCP Stealth
From: Stephen Hemminger @ 2015-01-01 19:10 UTC (permalink / raw)
  To: Julian Kirsch; +Cc: netdev, Christian Grothoff, Jacob Appelbaum
In-Reply-To: <54A470B3.3010501@sec.in.tum.de>

On Wed, 31 Dec 2014 22:54:59 +0100
Julian Kirsch <kirschju@sec.in.tum.de> wrote:

> +	memcpy(iv, (const __u8 *)daddr,
> +	       (daddr_size > sizeof(iv)) ? sizeof(iv) : daddr_size);
> +
> +#ifdef CONFIG_TCP_MD5SIG
> +	md5 = tp->af_specific->md5_lookup(sk, sk);
> +#else
> +	md5 = NULL;
> +#endif
> +	if (likely(sysctl_tcp_timestamps && !md5) || tp->stealth.saw_tsval)
> +		tsval = tp->stealth.mstamp.stamp_jiffies;
> +
> +	((__be16 *)iv)[2] ^= cpu_to_be16(tp->stealth.integrity_hash);

Cast unnecessary on memcpy arg since it takes void *

Would be clearer to use a real structure or union not assignment to cast to setup iv.

^ permalink raw reply

* Re: [PATCH] TCP: Add support for TCP Stealth
From: Stephen Hemminger @ 2015-01-01 19:06 UTC (permalink / raw)
  To: Julian Kirsch; +Cc: netdev, Christian Grothoff, Jacob Appelbaum
In-Reply-To: <54A470B3.3010501@sec.in.tum.de>

On Wed, 31 Dec 2014 22:54:59 +0100
Julian Kirsch <kirschju@sec.in.tum.de> wrote:

> +#ifdef CONFIG_TCP_STEALTH
> +/* Stealth TCP socket configuration */
> +	struct {
> +		#define TCP_STEALTH_MODE_AUTH		BIT(0)
> +		#define TCP_STEALTH_MODE_INTEGRITY	BIT(1)
> +		#define TCP_STEALTH_MODE_INTEGRITY_LEN	BIT(2)
> +		int mode;
> +		u8 secret[MD5_MESSAGE_BYTES];
> +		int integrity_len;
> +		u16 integrity_hash;
> +		struct skb_mstamp mstamp;
> +		bool saw_tsval;
> +	} stealth;
> +#endif

If you want a bitfield, why not use a bitfield for mode?
If you have to use masks, better to use u8 for mode.
Integrity length should be unsigned since obviously negative
values are not possible.

Rearrange structure to save space. Lots of holes here.

^ permalink raw reply

* Re: [PATCH] Drivers: isdn: gigaset: checkpatch cleanup
From: Tilman Schmidt @ 2015-01-01 18:46 UTC (permalink / raw)
  To: Bas Peters, hjlipp; +Cc: isdn, gigaset307x-common, netdev, linux-kernel
In-Reply-To: <1420047298-7798-1-git-send-email-baspeters93@gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Bas,

I have several objections to your patch.

Am 31.12.2014 um 18:34 schrieb Bas Peters:
> I have not been able to test the code as I do not have access to
> the hardware but since no new features were really added I don't
> think that should pose a problem.

It's always problematic to change code you cannot test.
At the very least, if you do coding style cleanups you should test
whether the result still compiles and generates the same code as before.

> --- a/drivers/isdn/gigaset/bas-gigaset.c +++
> b/drivers/isdn/gigaset/bas-gigaset.c @@ -261,11 +261,12 @@ static
> inline void dump_urb(enum debuglevel level, const char *tag, { 
> #ifdef CONFIG_GIGASET_DEBUG int i; + gig_dbg(level, "%s
> urb(0x%08lx)->{", tag, (unsigned long) urb); if (urb) { 
> gig_dbg(level, -			"  dev=0x%08lx, pipe=%s:EP%d/DV%d:%s, " -
> "hcpriv=0x%08lx, transfer_flags=0x%x,", +			"  dev=0x%08lx,
> pipe=%s:EP%d/DV%d:%s, +			hcpriv=0x%08lx, transfer_flags=0x%x,",

This is syntactically wrong and won't compile. You cannot have an
unescaped newline inside a string literal.

> @@ -2312,13 +2312,13 @@ static int gigaset_probe(struct
> usb_interface *interface, /* Reject application specific
> interfaces */ if (hostif->desc.bInterfaceClass != 255) { -
> dev_warn(&udev->dev, "%s: bInterfaceClass == %d\n", +
> dev_warn(&udev->dev, "%s: bInterfaceClass == %d\n",\ __func__,
> hostif->desc.bInterfaceClass); return -ENODEV; }
> 
> dev_info(&udev->dev, -		 "%s: Device matched (Vendor: 0x%x,
> Product: 0x%x)\n", +		 "%s: Device matched (Vendor: 0x%x, Product:
> 0x%x)\n",\ __func__, le16_to_cpu(udev->descriptor.idVendor), 
> le16_to_cpu(udev->descriptor.idProduct));

This looks strange, and not like correct coding style. Why would you
want to escape the end of line after a function argument?

> --- a/drivers/isdn/gigaset/common.c +++
> b/drivers/isdn/gigaset/common.c @@ -53,7 +53,7 @@ void
> gigaset_dbg_buffer(enum debuglevel level, const unsigned char
> *msg, { unsigned char outbuf[80]; unsigned char c; -	size_t space =
> sizeof outbuf - 1; +	size_t space = sizeof(outbuf - 1);

This is wrong. The sizeof operator must be applied to the array
variable outbuf, not to the expression (outbuf - 1).

> --- a/drivers/isdn/gigaset/ev-layer.c +++
> b/drivers/isdn/gigaset/ev-layer.c

> @@ -1355,8 +1351,20 @@ static void do_action(int action, struct
> cardstate *cs, }
> 
> for (i = 0; i < 4; ++i) { -			val = simple_strtoul(s, (char **) &e,
> 10); -			if (val > INT_MAX || e == s) +			unsigned long *e; + +
> val = kstrtoul(s, 10, e); +			if (val == -EINVAL) { +
> dev_err(cs->dev, "Parsing error on converting string to\ +
> unsigned long\n"); +				break; +			} +			if (val == -ERANGE) { +
> dev_err(cs->dev, "Overflow error converting string to\ +
> unsigned long\n"); +				break; +			} +			if (val > INT_MAX || *e ==
> s) break; if (i == 3) { if (*e)

This cannot work. The pointer variable e gets dereferenced without
ever being initialized. The type mismatches when declaring e as
pointing to an unsigned long but comparing *e to s in one place and to
a character literal in another point make me wonder which semantics
you had in mind for e in the first place.
Also your error messages are not helpful for someone reading the log
and trying to find out what went wrong, and not very readable because
of the big stretch of whitespace you insert between the words "to" and
"unsigned". In fact I'm not even convinced it's a good idea to emit a
log message at all here.

> --- a/drivers/isdn/gigaset/gigaset.h +++
> b/drivers/isdn/gigaset/gigaset.h @@ -94,8 +94,7 @@ enum debuglevel
> { #define gig_dbg(level, format, arg...)					\ do {								\ if
> (unlikely(((enum debuglevel)gigaset_debuglevel) & (level))) \ -
> printk(KERN_DEBUG KBUILD_MODNAME ": " format "\n", \ -			       ##
> arg);					\ +			dev_dbg(cs->dev, KBUILD_MODNAME ": " format "\n")\ 
> } while (0) #define DEBUG_DEFAULT (DEBUG_TRANSCMD | DEBUG_CMD |
> DEBUG_USBREQ)
> 

This will not work when
- - there is no cs variable in the context where the macro is used or
- - cs->dev doesn't contain a valid device pointer or
- - the format string references additional arguments,
all of which actually occur in the driver.

> --- a/drivers/isdn/gigaset/i4l.c +++ b/drivers/isdn/gigaset/i4l.c 
> @@ -624,14 +624,14 @@ int gigaset_isdn_regdev(struct cardstate *cs,
> const char *isdnid) { isdn_if *iif;
> 
> -	iif = kmalloc(sizeof *iif, GFP_KERNEL); +	iif =
> kmalloc(sizeof(*iif, GFP_KERNEL)); if (!iif) { pr_err("out of
> memory\n"); return -ENOMEM;

You're calling kmalloc with too few arguments here.

> --- a/drivers/isdn/gigaset/proc.c +++
> b/drivers/isdn/gigaset/proc.c @@ -27,13 +27,18 @@ static ssize_t
> set_cidmode(struct device *dev, struct device_attribute *attr, 
> const char *buf, size_t count) { struct cardstate *cs =
> dev_get_drvdata(dev); -	long int value; -	char *end; +	long int
> *value; +	int result;
> 
> -	value = simple_strtol(buf, &end, 0); -	while (*end) -		if
> (!isspace(*end++)) -			return -EINVAL; +	result = kstrtol(buf, 0,
> &value); +	if (result == -ERANGE) +		/* Overflow error */ +
> dev_err(cs->dev, "Overflow error on conversion from string to\ +
> long\n"); +	if (result == -EINVAL) +		/* Parsing error  */ +
> dev_err(cs->dev, "Parsing error on conversion from string to\ +
> long\n"); if (value < 0 || value > 1) return -EINVAL;

This changes semantics. Your code will not accept the same input as
the original code, and it will emit messages of its own instead of
just returning an error code to the caller as it should.

> --- a/drivers/isdn/gigaset/usb-gigaset.c +++
> b/drivers/isdn/gigaset/usb-gigaset.c

> default: rate =  9600; -		dev_err(cs->dev, "unsupported baudrate
> request 0x%x," -			" using default of B9600\n", cflag); +
> dev_err(cs->dev, "unsupported baudrate request 0x%x,\ +				 using
> default of B9600\n", cflag);

This makes the message much less readable by inserting a long stretch
of whitespace after the comma.

In sum: NACK.

Regards,
Tilman

- -- 
Tilman Schmidt                              E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBAgAGBQJUpZYFAAoJEFPuqx0v+F+qEcsH/1yyu8A8tHPhiW60DzQWhCj7
kxKw7gUS24ATLEEl5jUmrua2xPM0Exg7FknBSYRmNmOEj8j3sl7mQ0dDzDcJgOgI
BaDXV5YqnnqppmYkdT7OMykAuhdt2rk1w4khc2EjyyKrAdGJyB+j3ROgRDtG0wsY
zI/Uz7yKe540cwVWc6VCNQvS7cVEasQZnJzzTGBcPW35RjTYpvWEieJ/yY3tphIe
TQRl+SrKgiwGuzi0p886Vk8Mu4cfHHO5/EXyzpVdMVg6wxwxNs+YeW5xRf/mjzQe
YyygRKUA4VtZTno/rabdA2QvLkdDMHoiUNYa0InmBpAlLVol8OLh5mTit9yozwY=
=uU+S
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: [PATCH net-next v1 2/5] ixgbevf: Add a RETA query code
From: Alexander Duyck @ 2015-01-01 18:09 UTC (permalink / raw)
  To: Vlad Zolotarov, netdev
  Cc: gleb, avi, jeffrey.t.kirsher, Don Skidmore, tantilov, Emil S
In-Reply-To: <54A44188.204@cloudius-systems.com>

On 12/31/2014 10:33 AM, Vlad Zolotarov wrote:
>
> On 12/31/14 20:00, Alexander Duyck wrote:
>> I suspect this code is badly broken as it doesn't take several things
>> into account.
>>
>> First the PF redirection table can have values outside of the range
>> supported by the VF.  This is allowed as the VF can set how many bits of
>> the redirection table it actually wants to use.  This is controlled via
>> the PSRTYPE register.  So for example the PF can be running with 4
>> queues, and the VF can run either in single queue or as just a pair of
>> queues.
>>
>> Second you could compress this data much more tightly by taking
>> advantage of the bit widths allowed.  So for everything x540 and older
>> they only use a 4 bit value per entry.  That means you could
>> theoretically stuff 8 entries per u32 instead of just 4.
>
> Compression is nice but I think ethtool expects it in a certain
> format: one entry per byte. And since this patch is targeting the
> ethtool the output format should be as ethtool expects it to be and
> this is what this patch does. However I agree that masking the
> appropriate bits according to PSRTYPE is required. Good catch!

The idea of compression comes into play when you consider there is
significant latency trying to get messages across the mailbox.  By
reducing the number of messages needed to get the redirection table you
should be able to significantly reduce the amount of time needed to
fetch it.  The job of compressing/expanding the values is actually
pretty straight forward when you consider all that should be needed is a
simple loop to perform some shift, and, and or operations.

- Alex

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox