Netdev List
* Re: [PATCH net-next 0/4]: Allow head adjustment in XDP prog
From: Jesper Dangaard Brouer @ 2016-12-03 15:17 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: brouer, netdev, Alexei Starovoitov, Brenden Blanco,
	Daniel Borkmann, David Miller, Saeed Mahameed, Tariq Toukan,
	Kernel Team
In-Reply-To: <1480721013-1047541-1-git-send-email-kafai@fb.com>

On Fri, 2 Dec 2016 15:23:29 -0800
Martin KaFai Lau <kafai@fb.com> wrote:

> This series adds a helper to allow head adjustment in XDP prog.  mlx4
> driver has been modified to support this feature.  An example is written
> to encapsulate a packet with an IPv4/v6 header and then XDP_TX it
> out.

Hi Martin,

It is great to see work in this area.

Push/pop of headers is listed as one of the missing features here:
 https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/implementation/missing_features.html
We can hopefully soon cross that off the list :-)

Header push and pop description:
 https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/design/requirements.html#header-push-and-pop

Thanks for working on this! :-)))
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
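
For readers new to the idea, head adjustment can be modelled in user
space as moving a packet's start pointer inside a buffer that reserves
some headroom. The sketch below is not the helper from this series:
struct pkt_buf, pkt_init(), pkt_adjust_head() and the 64-byte headroom
are all made up for illustration, and the bounds rules are an assumption.

```c
#include <stddef.h>
#include <string.h>

#define PKT_HEADROOM 64		/* hypothetical headroom, not from the series */
#define PKT_BUF_SIZE 256

/* Toy packet buffer: the packet occupies [data, data_end) inside buf. */
struct pkt_buf {
	unsigned char buf[PKT_BUF_SIZE];
	unsigned char *data;
	unsigned char *data_end;
};

/* Place the payload after the headroom. Returns 0, or -1 if it does not fit. */
static int pkt_init(struct pkt_buf *p, const void *payload, size_t len)
{
	if (len > PKT_BUF_SIZE - PKT_HEADROOM)
		return -1;
	p->data = p->buf + PKT_HEADROOM;
	p->data_end = p->data + len;
	memcpy(p->data, payload, len);
	return 0;
}

/*
 * Move the packet start by delta bytes. A negative delta grows the packet
 * into the headroom (push); a positive delta shrinks it (pop).
 * Returns 0 on success, -1 if the new start would fall before the buffer
 * or past the end of the packet.
 */
static int pkt_adjust_head(struct pkt_buf *p, int delta)
{
	ptrdiff_t off = (p->data - p->buf) + delta;
	ptrdiff_t end = p->data_end - p->buf;

	if (off < 0 || off > end)
		return -1;
	p->data = p->buf + off;
	return 0;
}
```

Pushing an encapsulation header would then be a call with a negative
delta followed by writing the new header at p->data; popping is the
reverse with a positive delta.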


* Re: [PATCH v2 net] ixgbevf: fix invalid uses of napi_hash_del()
From: Eric Dumazet @ 2016-12-03 15:00 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Jeff Kirsher
In-Reply-To: <1479310010.8455.197.camel@edumazet-glaptop3.roam.corp.google.com>

On Wed, 2016-11-16 at 07:26 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Calling napi_hash_del() before netif_napi_del() is dangerous
> if a synchronize_rcu() is not enforced before NAPI struct freeing.
> 
> Let's leave this detail to the core networking stack to get right.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |    6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 7eaac3234049..bf4d7efc7dbd 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -2511,9 +2511,6 @@ static int ixgbevf_alloc_q_vectors(struct ixgbevf_adapter *adapter)
>  	while (q_idx) {
>  		q_idx--;
>  		q_vector = adapter->q_vector[q_idx];
> -#ifdef CONFIG_NET_RX_BUSY_POLL
> -		napi_hash_del(&q_vector->napi);
> -#endif
>  		netif_napi_del(&q_vector->napi);
>  		kfree(q_vector);
>  		adapter->q_vector[q_idx] = NULL;
> @@ -2537,9 +2534,6 @@ static void ixgbevf_free_q_vectors(struct ixgbevf_adapter *adapter)
>  		struct ixgbevf_q_vector *q_vector = adapter->q_vector[q_idx];
>  
>  		adapter->q_vector[q_idx] = NULL;
> -#ifdef CONFIG_NET_RX_BUSY_POLL
> -		napi_hash_del(&q_vector->napi);
> -#endif
>  		netif_napi_del(&q_vector->napi);
>  		kfree(q_vector);
>  	}
> 
> 

It looks like this patch was not picked up?

Thanks !


* Re: [PATCH net] team: team_port_add should check link_up before enable port
From: Marcelo Ricardo Leitner @ 2016-12-03 14:57 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, davem, Jiri Pirko
In-Reply-To: <14defaf74cf554158b8e289dd394815da1d8760c.1480772531.git.lucien.xin@gmail.com>

On Sat, Dec 03, 2016 at 09:42:11PM +0800, Xin Long wrote:
> Now when users add a nic to team dev, the option 'enable' of the port
> is true by default, as team_port_enable enables it after dev_open in
> team_port_add.
> 
> But even if the port_dev has no carrier, e.g. its cable was unplugged,
> the port is still enabled. As a result, the team dev cannot work well
> if this port is chosen to connect, and it has no chance to switch to
> other ports if link_watch is ethtool.
> 
> This patch is to enable the port only when the port_dev has carrier in
> team_port_add.
> 
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  drivers/net/team/team.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
> index a380649..42004ac 100644
> --- a/drivers/net/team/team.c
> +++ b/drivers/net/team/team.c
> @@ -1140,6 +1140,7 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
>  	struct net_device *dev = team->dev;
>  	struct team_port *port;
>  	char *portname = port_dev->name;
> +	bool linkup;
>  	int err;
>  
>  	if (port_dev->flags & IFF_LOOPBACK) {
> @@ -1249,9 +1250,12 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
>  
>  	port->index = -1;
>  	list_add_tail_rcu(&port->list, &team->port_list);
> -	team_port_enable(team, port);
> +	linkup = !!netif_carrier_ok(port_dev);

The !! here is not needed anymore; netif_carrier_ok() already returns a
bool:
static inline bool netif_carrier_ok(const struct net_device *dev)


> +	if (linkup)
> +		team_port_enable(team, port);
> +
>  	__team_compute_features(team);
> -	__team_port_change_port_added(port, !!netif_carrier_ok(port_dev));
> +	__team_port_change_port_added(port, linkup);
>  	__team_options_change_check(team);
>  
>  	netdev_info(dev, "Port device %s added\n", portname);
> -- 
> 2.1.0
> 
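
A small stand-alone check of the point about `!!`: once a function's
return type is bool, the value is already normalized to 0 or 1, so a
further `!!` changes nothing. struct fake_dev, FAKE_LINK_UP, link_bits()
and link_up() are hypothetical stand-ins, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device with a state word. */
struct fake_dev {
	unsigned long state;
};

#define FAKE_LINK_UP 0x4UL	/* made-up bit, for illustration only */

/* int-returning variant: hands back the raw masked bit (0 or 4). */
static inline int link_bits(const struct fake_dev *dev)
{
	return (int)(dev->state & FAKE_LINK_UP);
}

/* bool-returning variant: the conversion to bool already yields 0 or 1. */
static inline bool link_up(const struct fake_dev *dev)
{
	return dev->state & FAKE_LINK_UP;
}
```

With the int variant `!!link_bits(dev)` is what turns 4 into 1; with
the bool variant `link_up(dev)` and `!!link_up(dev)` are the same value.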


* Re: arp_filter and IPv6 ND
From: Saku Ytti @ 2016-12-03 14:21 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev
In-Reply-To: <96111ae8-9b68-943f-c9be-8fccdd614c8b@stressinduktion.org>

On 2 December 2016 at 20:39, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:

Hey,

> E.g. you can use IP addresses bound to other interfaces to send replys
> on another interface. This can be useful if you have a limited amount of
> IP addresses on the system but much more interfaces. Especially if they
> are limited in scope, like in IPv6.
>
> Basically Cisco's feature of "unnumbered interface" is always provided
> in Linux. And there are certainly cases where you would want to use it,
> e.g. emulate private-vlan feature for network separation.

Got it, thanks, the explanation makes sense. And indeed it's a valid
case, but it is also the exception, not the rule. I think it would be
entirely reasonable to change the default, and people who want
'unnumbered' style behaviour (like some BRAS scenarios) will know how
and why to configure it.

> Also in the BGP setup, you might have it easier to establish loopback
> neighbor contact by just using static on-link routes, without caring
> about more complex numbering there (otherwise you pretty soon introduce
> OSPF or some other routing protocol to do the recursive forward resolution).

BGP is running on-link; it's just that BGP is advertising the loopback
of the Linux host. Why the loopback ends up in the ND cache, I don't know.

>> Grand, not that I feel comfortable writing it. I'd rather see the
>> whole suppression functionality moved to neighbour.c from being AFI
>> specific.
>
> Yes sure, please provide a patch. A separate sysctl is necessary anyway
> because the current one is within the ipv4 procfs directory hierarchy.

Sorry, I'm not a comfortable C programmer. I'm pretty confident I could
get it working, but I'm more confident that the patch would be entirely
rejected and rewritten by someone who knows what they are doing.
I see no reason not to have an AFI-specific toggle; just the logic and
code should be AFI-agnostic, like the GC (ARP/ND cache time) code in
neighbour.c, which is nicely done. Frankly, the whole ARP/ND code could
do with refactoring so that arp.c and ndisc.c hold mostly wire-format
code and the behavioural code lives in neighbour.c.


-- 
  ++ytti


* [PATCH net] team: team_port_add should check link_up before enable port
From: Xin Long @ 2016-12-03 13:42 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Pirko

Now when users add a nic to team dev, the option 'enable' of the port
is true by default, as team_port_enable enables it after dev_open in
team_port_add.

But even if the port_dev has no carrier, e.g. its cable was unplugged,
the port is still enabled. As a result, the team dev cannot work well
if this port is chosen to connect, and it has no chance to switch to
other ports if link_watch is ethtool.

This patch is to enable the port only when the port_dev has carrier in
team_port_add.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 drivers/net/team/team.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index a380649..42004ac 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1140,6 +1140,7 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
 	struct net_device *dev = team->dev;
 	struct team_port *port;
 	char *portname = port_dev->name;
+	bool linkup;
 	int err;
 
 	if (port_dev->flags & IFF_LOOPBACK) {
@@ -1249,9 +1250,12 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
 
 	port->index = -1;
 	list_add_tail_rcu(&port->list, &team->port_list);
-	team_port_enable(team, port);
+	linkup = !!netif_carrier_ok(port_dev);
+	if (linkup)
+		team_port_enable(team, port);
+
 	__team_compute_features(team);
-	__team_port_change_port_added(port, !!netif_carrier_ok(port_dev));
+	__team_port_change_port_added(port, linkup);
 	__team_options_change_check(team);
 
 	netdev_info(dev, "Port device %s added\n", portname);
-- 
2.1.0


* (unknown), 
From: cbordinaro @ 2016-12-03 13:59 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: MESSAGE_07189225617444_netdev.zip --]
[-- Type: application/zip, Size: 1430 bytes --]


* Re: [PATCH 1/1] net: ethernet: 3com: set error code on failures
From: Lino Sanfilippo @ 2016-12-03 13:53 UTC (permalink / raw)
  To: Pan Bian, David Dillow, netdev; +Cc: linux-kernel, Pan Bian
In-Reply-To: <1480771470-6404-1-git-send-email-bianpan201602@163.com>

Hi,

On 03.12.2016 14:24, Pan Bian wrote:
> From: Pan Bian <bianpan2016@163.com>
> 
> typhoon_init_one() returns the value of variable err on errors.
> However, on some error paths, err is not set to a negative errno.
> This patch assigns "-EIO" to err on those paths.
> 
> Signed-off-by: Pan Bian <bianpan2016@163.com>

>  
> @@ -2409,6 +2410,7 @@ enum state_values {
>  	INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS);
>  	if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) {
>  		err_msg = "Could not get Sleep Image version";
> +		err = -EIO;
>  		goto error_out_reset;
>  	}
>  
> @@ -2451,6 +2453,7 @@ enum state_values {
>  
>  	if(register_netdev(dev) < 0) {
>  		err_msg = "unable to register netdev";
> +		err = -EIO;
>  		goto error_out_reset;
>  	}
>  
> 

Why not return the error value provided by the called functions? Is there a reason
to map different errors to -EIO?

Regards,
Lino
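
The difference behind the question can be sketched in user space. In
the code below, fake_register() is a hypothetical callee that fails
with whatever negative errno it is told to; init_map_to_eio() mirrors
the shape of the patch, and init_propagate() keeps the callee's code.
None of these names come from the driver.

```c
#include <errno.h>

/* Hypothetical callee: returns 0 on success or a negative errno. */
static int fake_register(int result)
{
	return result;
}

/* Shape used in the patch: every failure is reported as -EIO. */
static int init_map_to_eio(int result)
{
	int err = 0;

	if (fake_register(result) < 0) {
		err = -EIO;
		goto error_out;
	}
	return 0;

error_out:
	return err;
}

/* Alternative: store the callee's return value and pass it on. */
static int init_propagate(int result)
{
	int err;

	err = fake_register(result);
	if (err < 0)
		goto error_out;
	return 0;

error_out:
	return err;
}
```

With a callee that fails with -ENOMEM, the first variant reports -EIO
and the second reports -ENOMEM, so the caller can still tell the
failures apart.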


* [PATCH 1/1] net: dcb: set error code on failures
From: Pan Bian @ 2016-12-03 13:49 UTC (permalink / raw)
  To: David S. Miller, netdev; +Cc: linux-kernel, Pan Bian

From: Pan Bian <bianpan2016@163.com>

dcbnl_cee_fill() returns the value of variable err on errors. However,
on some error paths (e.g. when an nla put fails), its value may be 0.
It is safer to explicitly assign a negative errno to err before
returning.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188881

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 net/dcb/dcbnl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
index 4f6c186..3202d75 100644
--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -1353,6 +1353,7 @@ static int dcbnl_cee_fill(struct sk_buff *skb, struct net_device *netdev)
 dcb_unlock:
 	spin_unlock_bh(&dcb_lock);
 nla_put_failure:
+	err = -EMSGSIZE;
 	return err;
 }
 
-- 
1.9.1
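
The failure mode described in the commit message can be reproduced in
a few lines of user-space C. This is only a sketch of the pattern:
fake_put(), fill_buggy() and fill_fixed() are hypothetical and far
simpler than dcbnl_cee_fill().

```c
#include <errno.h>

/* Hypothetical "put attribute" step: non-zero when there is no room. */
static int fake_put(int room)
{
	return room > 0 ? 0 : -1;
}

/* Buggy shape: err is still 0 when the put fails, so callers see success. */
static int fill_buggy(int room)
{
	int err = 0;

	if (fake_put(room))
		goto put_failure;
	return 0;

put_failure:
	return err;
}

/* Fixed shape: the failure label assigns a negative errno before returning. */
static int fill_fixed(int room)
{
	int err = 0;

	if (fake_put(room))
		goto put_failure;
	return 0;

put_failure:
	err = -EMSGSIZE;
	return err;
}
```

Calling both with no room left shows the difference: fill_buggy(0)
returns 0 even though nothing was written, while fill_fixed(0) returns
-EMSGSIZE.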


* Re: net: use-after-free in worker_thread
From: Eric Dumazet @ 2016-12-03 13:49 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML, Kostya Serebryany, Dmitry Vyukov,
	syzkaller
In-Reply-To: <CAAeHK+wtAhv2vvLva5a9J52A-bZj1kY8tF8RT7bC=5QVnxOr7A@mail.gmail.com>

On Sat, 2016-12-03 at 14:05 +0100, Andrey Konovalov wrote:
> On Sat, Dec 3, 2016 at 1:58 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> > +syzkaller@googlegroups.com
> >
> > On Sat, Dec 3, 2016 at 1:56 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> >> Hi!
> >>
> >> I'm seeing lots of the following error reports while running the
> >> syzkaller fuzzer.
> >>
> >> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
> >>
> >> ==================================================================
> >> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
> >> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
> >>
> >> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
> >> index:0xffff880067f39c10 compound_mapcount: 0
> >> flags: 0x500000000004080(slab|head)
> >> page dumped because: kasan: bad access detected
> >>
> >> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> >>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
> >>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
> >>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
> >> Call Trace:
> >>  [<     inline     >] __dump_stack lib/dump_stack.c:15
> >>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
> >>  [<     inline     >] describe_address mm/kasan/report.c:262
> >>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
> >>  [<     inline     >] kasan_report mm/kasan/report.c:390
> >>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> >> mm/kasan/report.c:411
> >>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
> >>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
> >>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> >>
> >> The buggy address belongs to the object at ffff880067f3e6d0
> >>  which belongs to the cache kmalloc-2048 of size 2048
> >> The buggy address ffff880067f3ecd8 is located 1544 bytes inside
> >>  of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)
> >>
> >> Freed by task 0:
> >>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
> >>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
> >>  [<     inline     >] set_track mm/kasan/kasan.c:507
> >>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
> >>  [<     inline     >] slab_free_hook mm/slub.c:1352
> >>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
> >>  [<     inline     >] slab_free mm/slub.c:2951
> >>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
> >>  [<     inline     >] sk_prot_free net/core/sock.c:1372
> >>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
> >>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
> >>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
> >>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
> >>  [<     inline     >] sock_put include/net/sock.h:1591
> >>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
> >>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
> >>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
> >>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
> >>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
> >>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
> >>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
> >>
> >> Allocated by task 10748:
> >>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
> >>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
> >>  [<     inline     >] set_track mm/kasan/kasan.c:507
> >>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
> >>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
> >>  [<     inline     >] kmalloc include/linux/slab.h:495
> >>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
> >>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
> >>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
> >>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
> >>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
> >>  [<     inline     >] sock_create net/socket.c:1208
> >>  [<     inline     >] SYSC_socket net/socket.c:1238
> >>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
> >>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> >>
> >> Memory state around the buggy address:
> >>  ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>  ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>                                                     ^
> >>  ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>  ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> ==================================================================
> 
> Here is another report that looks related:
> 
> ==================================================================
> BUG: KASAN: use-after-free in __list_add+0x236/0x2c0
> Read of size 8 at addr ffff880068854780 by task ksoftirqd/2/20
> 
> page:ffffea0001a21400 count:1 mapcount:0 mapping:          (null)
> index:0x0 compound_mapcount: 0
> flags: 0x500000000004080(slab|head)
> page dumped because: kasan: bad access detected
> 
> CPU: 2 PID: 20 Comm: ksoftirqd/2 Not tainted 4.9.0-rc7+ #66
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  ffff88006daf6578 ffffffff81f882da ffffffff6daf62a0 1ffff1000db5ec42
>  ffffed000db5ec3a dffffc0000000000 0000000041b58ab3 ffffffff8541e198
>  ffffffff81f88048 ffff88006dac3610 ffff88006daf6300 0000000000000802
> Call Trace:
>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>  [<     inline     >] describe_address mm/kasan/report.c:262
>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>  [<     inline     >] kasan_report mm/kasan/report.c:390
>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> mm/kasan/report.c:411
>  [<ffffffff8200c166>] __list_add+0x236/0x2c0 lib/list_debug.c:30
>  [<     inline     >] list_add_tail include/linux/list.h:77
>  [<ffffffff8131e295>] insert_work+0x175/0x4b0 kernel/workqueue.c:1298
>  [<ffffffff8131eb52>] __queue_work+0x582/0x11e0 kernel/workqueue.c:1459
>  [<ffffffff81320c21>] queue_work_on+0x231/0x240 kernel/workqueue.c:1484
>  [<     inline     >] queue_work include/linux/workqueue.h:474
>  [<     inline     >] schedule_work include/linux/workqueue.h:532
>  [<ffffffff8348c8cc>] netlink_sock_destruct+0x23c/0x2d0
> net/netlink/af_netlink.c:361
>  [<ffffffff831e9ce1>] __sk_destruct+0xe1/0x6e0 net/core/sock.c:1423
>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>  [<     inline     >] sock_put include/net/sock.h:1591
>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>  [<ffffffff812d38c0>] run_ksoftirqd+0x20/0x60 kernel/softirq.c:676
>  [<ffffffff81350132>] smpboot_thread_fn+0x562/0x860 kernel/smpboot.c:163
>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> 
> The buggy address belongs to the object at ffff880068854170
>  which belongs to the cache kmalloc-2048 of size 2048
> The buggy address ffff880068854780 is located 1552 bytes inside
>  of 2048-byte region [ffff880068854170, ffff880068854970)
> 
> Freed by task 20:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>  [<     inline     >] slab_free_hook mm/slub.c:1352
>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>  [<     inline     >] slab_free mm/slub.c:2951
>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>  [<     inline     >] sock_put include/net/sock.h:1591
>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
> 
> Allocated by task 9480:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>  [<     inline     >] kmalloc include/linux/slab.h:495
>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>  [<     inline     >] sock_create net/socket.c:1208
>  [<     inline     >] SYSC_socket net/socket.c:1238
>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> 
> Memory state around the buggy address:
>  ffff880068854680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880068854700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >ffff880068854780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                    ^
>  ffff880068854800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880068854880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================


Hi Andrey. Please give us some rest during the weekend ;)

This looks like the bug I mentioned earlier, for which I have a pending
patch. Can you try it?

The RCU conversion done by Thomas was quite buggy.

Thanks.


diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 602e5ebe9db39ec6c72708628bc48efad9f0e680..c348c4a5ea4ecc05dcc9e2afbc069ab65a1a57fe 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -475,8 +475,8 @@ static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
 
 	rcu_read_lock();
 	sk = __netlink_lookup(table, portid, net);
-	if (sk)
-		sock_hold(sk);
+	if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+		sk = NULL;
 	rcu_read_unlock();
 
 	return sk;
@@ -600,6 +600,7 @@ static int __netlink_create(struct net *net, struct socket *sock,
 	}
 	init_waitqueue_head(&nlk->wait);
 
+	sock_set_flag(sk, SOCK_RCU_FREE);
 	sk->sk_destruct = netlink_sock_destruct;
 	sk->sk_protocol = protocol;
 	return 0;
@@ -664,13 +665,6 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 	goto out;
 }
 
-static void deferred_put_nlk_sk(struct rcu_head *head)
-{
-	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
-
-	sock_put(&nlk->sk);
-}
-
 static int netlink_release(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
@@ -743,7 +737,7 @@ static int netlink_release(struct socket *sock)
 	local_bh_disable();
 	sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
 	local_bh_enable();
-	call_rcu(&nlk->rcu, deferred_put_nlk_sk);
+	sock_put(sk);
 	return 0;
 }
 
diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
index 4fdb3831897775547f77c069a8018c0d2a253c8c..988d1a02487e37b7efd4872dd0ab6d230e5a2021 100644
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -33,7 +33,6 @@ struct netlink_sock {
 	struct module		*module;
 
 	struct rhash_head	node;
-	struct rcu_head		rcu;
 	struct work_struct	work;
 };
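
The first hunk replaces an unconditional sock_hold() with
atomic_inc_not_zero(). The reason such a change matters for RCU
lookups: a reader can still find an object whose reference count has
already dropped to zero and which is only waiting to be freed, and an
unconditional increment would hand out a reference to it. Below is a
user-space model of "increment unless zero" using C11 atomics;
ref_get_unless_zero() is a hypothetical name and this is not the
kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count has not already reached zero.
 * Returns true if the count was incremented, false if it was zero.
 */
static bool ref_get_unless_zero(atomic_int *refcnt)
{
	int old = atomic_load(refcnt);

	while (old != 0) {
		/* On failure, old is updated to the current value and we retry. */
		if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```

A lookup built on this treats a false return as "not found", which is
what the `sk = NULL` assignment in the patch does.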
 


* [PATCH 1/1] net: ethernet: 3com: set error code on failures
From: Pan Bian @ 2016-12-03 13:24 UTC (permalink / raw)
  To: David Dillow, netdev; +Cc: linux-kernel, Pan Bian

From: Pan Bian <bianpan2016@163.com>

typhoon_init_one() returns the value of variable err on errors.
However, on some error paths, err is not set to a negative errno.
This patch assigns "-EIO" to err on those paths.

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/net/ethernet/3com/typhoon.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index 8f8418d..9a477fc 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -2400,6 +2400,7 @@ enum state_values {
 
 	if(!is_valid_ether_addr(dev->dev_addr)) {
 		err_msg = "Could not obtain valid ethernet address, aborting";
+		err = -EIO;
 		goto error_out_reset;
 	}
 
@@ -2409,6 +2410,7 @@ enum state_values {
 	INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS);
 	if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) {
 		err_msg = "Could not get Sleep Image version";
+		err = -EIO;
 		goto error_out_reset;
 	}
 
@@ -2451,6 +2453,7 @@ enum state_values {
 
 	if(register_netdev(dev) < 0) {
 		err_msg = "unable to register netdev";
+		err = -EIO;
 		goto error_out_reset;
 	}
 
-- 
1.9.1


* Possible regression due to "net/sched: cls_flower: Add offload support using egress Hardware device"
From: Simon Horman @ 2016-12-03 13:17 UTC (permalink / raw)
  To: Hadar Hen Zion
  Cc: netdev, Saeed Mahameed, Jiri Pirko, Amir Vadai, Or Gerlitz,
	Roi Dayan

Hi Hadar,

In net-next I am observing what appears to be a regression due to:
7091d8c7055d ("net/sched: cls_flower: Add offload support using egress Hardware device")

The problem occurs when adding a flower filter (without offload) to a
virtio device.

# ethtool -i eth0
driver: virtio_net
...

# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: flower indev eth0
[  104.302779] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d5
[  104.303388] IP: [<ffffffff812c966d>] fl_dump+0x18d/0x7b0
[  104.304140] PGD 1f825067 [  104.304535] PUD 1f81a067 
PMD 0 [  104.305080] 
[  104.305351] Oops: 0000 [#1] SMP
[  104.305850] Modules linked in:
[  104.306358] CPU: 0 PID: 164 Comm: tc Not tainted 4.9.0-rc6-01485-g7091d8c7055d #1217
[  104.307603] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  104.309781] task: ffff8800167dac40 task.stack: ffffc9000017c000
[  104.310950] RIP: 0010:[<ffffffff812c966d>]  [<ffffffff812c966d>] fl_dump+0x18d/0x7b0
[  104.311924] RSP: 0018:ffffc9000017fa40  EFLAGS: 00010246
[  104.311924] RAX: ffff88001f830a00 RBX: ffff88001b320900 RCX: 0000000000000000
[  104.311924] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000ffff0000
[  104.311924] RBP: ffff8800167dec00 R08: 0000000000000000 R09: ffff88001f800034
[  104.311924] R10: 00000000c6024032 R11: 0000000000000000 R12: ffff88001f800030
[  104.311924] R13: ffff880016412e00 R14: ffff8800166fb200 R15: ffffc9000017fa50
[  104.311924] FS:  00007fe24e0e6700(0000) GS:ffff88001b600000(0000) knlGS:0000000000000000
[  104.311924] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  104.311924] CR2: 00000000000000d5 CR3: 000000001645a000 CR4: 00000000000006b0
[  104.311924] Stack:
[  104.311924]  0000000000000286 0000000000000000 0000000000000000 0000000000000000
[  104.311924]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  104.311924]  ffff88001b320900 ffff88001f800000 ffff88001f800000 ffff880016412e00
[  104.311924] Call Trace:
[  104.311924]  [<ffffffff812c452e>] ? tcf_fill_node.constprop.12+0x12e/0x180
[  104.311924]  [<ffffffff812c45f6>] ? tfilter_notify.constprop.11+0x76/0x100
[  104.311924]  [<ffffffff812c4ac9>] ? tc_ctl_tfilter+0x449/0x6c0
[  104.311924]  [<ffffffff812b17d3>] ? rtnetlink_rcv_msg+0x83/0x200
[  104.311924]  [<ffffffff812b1750>] ? rtnl_newlink+0x810/0x810
[  104.311924]  [<ffffffff812ce834>] ? netlink_rcv_skb+0x94/0xb0
[  104.311924]  [<ffffffff812ae4e4>] ? rtnetlink_rcv+0x24/0x30
[  104.311924]  [<ffffffff812ce1b5>] ? netlink_unicast+0x145/0x1d0
[  104.311924]  [<ffffffff812ce659>] ? netlink_sendmsg+0x369/0x390
[  104.311924]  [<ffffffff811119a3>] ? rw_copy_check_uvector+0x53/0x110
[  104.311924]  [<ffffffff81282830>] ? sock_sendmsg+0x10/0x20
[  104.311924]  [<ffffffff81282e97>] ? ___sys_sendmsg+0x1f7/0x200
[  104.311924]  [<ffffffff81282fb9>] ? ___sys_recvmsg+0x119/0x140
[  104.311924]  [<ffffffff810e3c70>] ? trace_raw_output_mm_lru_activate+0x60/0x60
[  104.311924]  [<ffffffff81105606>] ? page_add_new_anon_rmap+0x46/0x80
[  104.311924]  [<ffffffff810fd902>] ? handle_mm_fault+0xae2/0xb00
[  104.311924]  [<ffffffff81283da1>] ? __sys_sendmsg+0x41/0x70
[  104.311924]  [<ffffffff813b7560>] ? entry_SYSCALL_64_fastpath+0x13/0x94
[  104.311924] Code: 85 5b ff ff ff e9 1a ff ff ff 4c 8d 7c 24 10 31 c0 b9 06 00 00 00 4c 8b 85 60 01 00 00 4c 89 ff f3 48 ab 49 8b 45 28 41 8b 7d 20 <41> f6 80 d5 00 00 00 80 48 8b 40 18 48 8b 40 08 0f 84 f0 fe ff 
[  104.311924] RIP  [<ffffffff812c966d>] fl_dump+0x18d/0x7b0
[  104.311924]  RSP <ffffc9000017fa40>
[  104.311924] CR2: 00000000000000d5
[  104.347974] ---[ end trace 9d9dacd54834303d ]---
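
A note on reading the oops: CR2 is 0xd5, R08 is 0, and the marked
instruction bytes (41 f6 80 d5 00 00 00 80) appear to decode as a byte
test at offset 0xd5 from %r8. That is consistent with reading a field
at offset 0xd5 through a NULL pointer. Which structure and field are
involved cannot be told from the trace alone; struct fake_obj below is
a made-up layout that only demonstrates the arithmetic.

```c
#include <stddef.h>

/* Hypothetical layout: only the offset of 'flags' matters here. */
struct fake_obj {
	unsigned char pad[0xd5];
	unsigned char flags;
};

/*
 * Address that a load of obj->flags would touch for a given base
 * address, computed without dereferencing anything.
 */
static size_t fault_addr_for_flags(size_t base)
{
	return base + offsetof(struct fake_obj, flags);
}
```

For a NULL base the computed address is the member offset itself,
which is why a small non-zero fault address usually points at a NULL
structure pointer rather than a NULL scalar.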


* Re: [PATCH 1/1] net: caif: fix ineffective error check
From: Sergei Shtylyov @ 2016-12-03 13:17 UTC (permalink / raw)
  To: Pan Bian, Dmitry Tarnyagin, David S. Miller, netdev; +Cc: linux-kernel
In-Reply-To: <1480763901-5323-1-git-send-email-bianpan2016@163.com>

Hello.

On 12/3/2016 2:18 PM, Pan Bian wrote:

> In function caif_sktinit_module(), the check of the return value of
> sock_register() seems ineffective. This patch fixes it.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188751
>
> Signed-off-by: Pan Bian <bianpan2016@163.com>
> ---
>  net/caif/caif_socket.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
> index aa209b1..2a689a3 100644
> --- a/net/caif/caif_socket.c
> +++ b/net/caif/caif_socket.c
> @@ -1108,7 +1108,7 @@ static int caif_create(struct net *net, struct socket *sock, int protocol,
>  static int __init caif_sktinit_module(void)
>  {
>  	int err = sock_register(&caif_family_ops);
> -	if (!err)
> +	if (err)
>  		return err;

    Why not just:

	return sock_register(&caif_family_ops);

>  	return 0;
>  }
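
A minimal userspace sketch of why the patched form and the direct return
behave the same. fake_sock_register() is a hypothetical stand-in, not the
kernel's sock_register(); it is only assumed to return 0 on success and a
negative errno on failure:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for sock_register(): hands back the result the
 * caller asks for, so the init variants can be compared. */
static int fake_sock_register(int simulated_result)
{
	return simulated_result;
}

/* Shape of the patched function: test the error, then return 0. */
static int init_patched(int simulated_result)
{
	int err = fake_sock_register(simulated_result);

	if (err)
		return err;
	return 0;
}

/* Shape of the suggested simplification: return the result directly. */
static int init_direct(int simulated_result)
{
	return fake_sock_register(simulated_result);
}

/* Shape of the code before the patch: the inverted test drops errors. */
static int init_unpatched(int simulated_result)
{
	int err = fake_sock_register(simulated_result);

	if (!err)
		return err;
	return 0;
}

/* The two fixed variants agree; the unpatched one reports success. */
void check_variants(void)
{
	assert(init_patched(0) == init_direct(0));
	assert(init_patched(-EEXIST) == init_direct(-EEXIST));
	assert(init_unpatched(-EEXIST) == 0);
}
```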

MBR, Sergei

^ permalink raw reply

* Re: net: use-after-free in worker_thread
From: Andrey Konovalov @ 2016-12-03 12:58 UTC (permalink / raw)
  To: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML
  Cc: Kostya Serebryany, Dmitry Vyukov, syzkaller
In-Reply-To: <CAAeHK+x95-n__YSbzebp51ez78yjqjK4CJL=tgOgPuBuGh+q1A@mail.gmail.com>

+syzkaller@googlegroups.com

On Sat, Dec 3, 2016 at 1:56 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> Hi!
>
> I'm seeing lots of the following error reports while running the
> syzkaller fuzzer.
>
> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
>
> ==================================================================
> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
>
> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
> index:0xffff880067f39c10 compound_mapcount: 0
> flags: 0x500000000004080(slab|head)
> page dumped because: kasan: bad access detected
>
> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
> Call Trace:
>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>  [<     inline     >] describe_address mm/kasan/report.c:262
>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>  [<     inline     >] kasan_report mm/kasan/report.c:390
>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> mm/kasan/report.c:411
>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>
> The buggy address belongs to the object at ffff880067f3e6d0
>  which belongs to the cache kmalloc-2048 of size 2048
> The buggy address ffff880067f3ecd8 is located 1544 bytes inside
>  of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)
>
> Freed by task 0:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>  [<     inline     >] slab_free_hook mm/slub.c:1352
>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>  [<     inline     >] slab_free mm/slub.c:2951
>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>  [<     inline     >] sock_put include/net/sock.h:1591
>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>
> Allocated by task 10748:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>  [<     inline     >] kmalloc include/linux/slab.h:495
>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>  [<     inline     >] sock_create net/socket.c:1208
>  [<     inline     >] SYSC_socket net/socket.c:1238
>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
>
> Memory state around the buggy address:
>  ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                                                     ^
>  ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================

^ permalink raw reply

* Re: net: use-after-free in worker_thread
From: Andrey Konovalov @ 2016-12-03 13:05 UTC (permalink / raw)
  To: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML
  Cc: Kostya Serebryany, Dmitry Vyukov, syzkaller
In-Reply-To: <CAAeHK+zDnDugPcdEGBnC6rt5iJMffz+tmmDkF=vv6u0YF=EMwg@mail.gmail.com>

On Sat, Dec 3, 2016 at 1:58 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> +syzkaller@googlegroups.com
>
> On Sat, Dec 3, 2016 at 1:56 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> Hi!
>>
>> I'm seeing lots of the following error reports while running the
>> syzkaller fuzzer.
>>
>> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
>> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
>>
>> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
>> index:0xffff880067f39c10 compound_mapcount: 0
>> flags: 0x500000000004080(slab|head)
>> page dumped because: kasan: bad access detected
>>
>> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
>>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
>>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
>> Call Trace:
>>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>>  [<     inline     >] describe_address mm/kasan/report.c:262
>>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>>  [<     inline     >] kasan_report mm/kasan/report.c:390
>>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
>> mm/kasan/report.c:411
>>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
>>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>>
>> The buggy address belongs to the object at ffff880067f3e6d0
>>  which belongs to the cache kmalloc-2048 of size 2048
>> The buggy address ffff880067f3ecd8 is located 1544 bytes inside
>>  of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)
>>
>> Freed by task 0:
>>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>>  [<     inline     >] set_track mm/kasan/kasan.c:507
>>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>>  [<     inline     >] slab_free_hook mm/slub.c:1352
>>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>>  [<     inline     >] slab_free mm/slub.c:2951
>>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>>  [<     inline     >] sock_put include/net/sock.h:1591
>>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>>
>> Allocated by task 10748:
>>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>>  [<     inline     >] set_track mm/kasan/kasan.c:507
>>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>>  [<     inline     >] kmalloc include/linux/slab.h:495
>>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>>  [<     inline     >] sock_create net/socket.c:1208
>>  [<     inline     >] SYSC_socket net/socket.c:1238
>>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
>>
>> Memory state around the buggy address:
>>  ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>                                                     ^
>>  ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ==================================================================

Here is another report that looks related:

==================================================================
BUG: KASAN: use-after-free in __list_add+0x236/0x2c0
Read of size 8 at addr ffff880068854780 by task ksoftirqd/2/20

page:ffffea0001a21400 count:1 mapcount:0 mapping:          (null)
index:0x0 compound_mapcount: 0
flags: 0x500000000004080(slab|head)
page dumped because: kasan: bad access detected

CPU: 2 PID: 20 Comm: ksoftirqd/2 Not tainted 4.9.0-rc7+ #66
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88006daf6578 ffffffff81f882da ffffffff6daf62a0 1ffff1000db5ec42
 ffffed000db5ec3a dffffc0000000000 0000000041b58ab3 ffffffff8541e198
 ffffffff81f88048 ffff88006dac3610 ffff88006daf6300 0000000000000802
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<     inline     >] describe_address mm/kasan/report.c:262
 [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
 [<     inline     >] kasan_report mm/kasan/report.c:390
 [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:411
 [<ffffffff8200c166>] __list_add+0x236/0x2c0 lib/list_debug.c:30
 [<     inline     >] list_add_tail include/linux/list.h:77
 [<ffffffff8131e295>] insert_work+0x175/0x4b0 kernel/workqueue.c:1298
 [<ffffffff8131eb52>] __queue_work+0x582/0x11e0 kernel/workqueue.c:1459
 [<ffffffff81320c21>] queue_work_on+0x231/0x240 kernel/workqueue.c:1484
 [<     inline     >] queue_work include/linux/workqueue.h:474
 [<     inline     >] schedule_work include/linux/workqueue.h:532
 [<ffffffff8348c8cc>] netlink_sock_destruct+0x23c/0x2d0
net/netlink/af_netlink.c:361
 [<ffffffff831e9ce1>] __sk_destruct+0xe1/0x6e0 net/core/sock.c:1423
 [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
 [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
 [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
 [<     inline     >] sock_put include/net/sock.h:1591
 [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
 [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
 [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
 [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
 [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
 [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
 [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
 [<ffffffff812d38c0>] run_ksoftirqd+0x20/0x60 kernel/softirq.c:676
 [<ffffffff81350132>] smpboot_thread_fn+0x562/0x860 kernel/smpboot.c:163
 [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

The buggy address belongs to the object at ffff880068854170
 which belongs to the cache kmalloc-2048 of size 2048
The buggy address ffff880068854780 is located 1552 bytes inside
 of 2048-byte region [ffff880068854170, ffff880068854970)

Freed by task 20:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
 [<     inline     >] slab_free_hook mm/slub.c:1352
 [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
 [<     inline     >] slab_free mm/slub.c:2951
 [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
 [<     inline     >] sk_prot_free net/core/sock.c:1372
 [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
 [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
 [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
 [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
 [<     inline     >] sock_put include/net/sock.h:1591
 [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
 [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
 [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
 [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
 [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
 [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
 [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284

Allocated by task 9480:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
 [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
 [<     inline     >] kmalloc include/linux/slab.h:495
 [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
 [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
 [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
 [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
 [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
 [<     inline     >] sock_create net/socket.c:1208
 [<     inline     >] SYSC_socket net/socket.c:1238
 [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
 [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

Memory state around the buggy address:
 ffff880068854680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880068854700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff880068854780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff880068854800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880068854880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
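
The offsets KASAN prints can be re-derived from the addresses in the two
reports. The helpers below are illustrative names of my own, not kernel
functions; they only do the subtraction and the range check:

```c
#include <assert.h>
#include <stdint.h>

/* Offset of a reported address inside its slab object. */
uint64_t object_offset(uint64_t addr, uint64_t object_base)
{
	assert(addr >= object_base);
	return addr - object_base;
}

/* Nonzero if addr falls in [object_base, object_base + size). */
int inside_object(uint64_t addr, uint64_t object_base, uint64_t size)
{
	return addr >= object_base && addr - object_base < size;
}
```

With the values above, object_offset(0xffff880067f3ecd8, 0xffff880067f3e6d0)
is 0x608 = 1544 and object_offset(0xffff880068854780, 0xffff880068854170)
is 0x610 = 1552, matching both reports. The two offsets are 8 bytes apart
in objects allocated through the same netlink path, which is consistent
with both reads hitting neighbouring fields of the freed socket; that is
an inference from the numbers, not something the reports state.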

^ permalink raw reply

* net: use-after-free in worker_thread
From: Andrey Konovalov @ 2016-12-03 12:56 UTC (permalink / raw)
  To: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML

Hi!

I'm seeing lots of the following error reports while running the
syzkaller fuzzer.

Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).

==================================================================
BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774

page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
index:0xffff880067f39c10 compound_mapcount: 0
flags: 0x500000000004080(slab|head)
page dumped because: kasan: bad access detected

CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
 ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
 ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<     inline     >] describe_address mm/kasan/report.c:262
 [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
 [<     inline     >] kasan_report mm/kasan/report.c:390
 [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:411
 [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
 [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

The buggy address belongs to the object at ffff880067f3e6d0
 which belongs to the cache kmalloc-2048 of size 2048
The buggy address ffff880067f3ecd8 is located 1544 bytes inside
 of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)

Freed by task 0:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
 [<     inline     >] slab_free_hook mm/slub.c:1352
 [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
 [<     inline     >] slab_free mm/slub.c:2951
 [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
 [<     inline     >] sk_prot_free net/core/sock.c:1372
 [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
 [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
 [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
 [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
 [<     inline     >] sock_put include/net/sock.h:1591
 [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
 [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
 [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
 [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
 [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
 [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
 [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284

Allocated by task 10748:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
 [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
 [<     inline     >] kmalloc include/linux/slab.h:495
 [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
 [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
 [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
 [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
 [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
 [<     inline     >] sock_create net/socket.c:1208
 [<     inline     >] SYSC_socket net/socket.c:1238
 [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
 [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

Memory state around the buggy address:
 ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                    ^
 ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

^ permalink raw reply

* [PATCH 1/1] atm: lanai: set error code when ioremap fails
From: Pan Bian @ 2016-12-03 12:25 UTC (permalink / raw)
  To: Chas Williams, linux-atm-general, netdev; +Cc: linux-kernel, Pan Bian

In function lanai_dev_open(), when the call to ioremap() fails, the
value of the return variable result is 0, which means no error in this
context. This patch fixes the bug by assigning "-ENOMEM" to result when
ioremap() returns a NULL pointer.
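
A minimal userspace sketch of the pattern, under stated assumptions:
fake_map() and open_sketch() are hypothetical names, and the
set_error_code flag exists only so the behaviour with and without the
added assignment can be compared:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mapping step: returns NULL when asked to fail. */
static void *fake_map(int should_fail)
{
	static char region[16];

	return should_fail ? NULL : region;
}

/* Open routine in the style of the patched code. result is still 0 from
 * the earlier, successful steps, so the failure branch has to set it
 * before jumping to the shared cleanup label. */
int open_sketch(int map_should_fail, int set_error_code)
{
	int result = 0;
	void *base = fake_map(map_should_fail);

	if (base == NULL) {
		if (set_error_code)
			result = -ENOMEM;	/* the assignment the patch adds */
		goto error;
	}
	return 0;

error:
	/* cleanup of earlier steps would go here */
	return result;
}

/* Without the assignment, a failed mapping is reported as success. */
void check_open_sketch(void)
{
	assert(open_sketch(0, 1) == 0);
	assert(open_sketch(1, 1) == -ENOMEM);
	assert(open_sketch(1, 0) == 0);
}
```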

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188791

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/atm/lanai.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/atm/lanai.c b/drivers/atm/lanai.c
index ce43ae3..445505d 100644
--- a/drivers/atm/lanai.c
+++ b/drivers/atm/lanai.c
@@ -2143,6 +2143,7 @@ static int lanai_dev_open(struct atm_dev *atmdev)
 	lanai->base = (bus_addr_t) ioremap(raw_base, LANAI_MAPPING_SIZE);
 	if (lanai->base == NULL) {
 		printk(KERN_ERR DEV_LABEL ": couldn't remap I/O space\n");
+		result = -ENOMEM;
 		goto error_pci;
 	}
 	/* 3.3: Reset lanai and PHY */
-- 
1.9.1

^ permalink raw reply related

* [PATCH 1/1] net: bridge: set error code on failure
From: Pan Bian @ 2016-12-03 11:33 UTC (permalink / raw)
  To: Stephen Hemminger, David S. Miller, bridge, netdev; +Cc: linux-kernel, Pan Bian

Function br_sysfs_addbr() does not set an error code when the call to
kobject_create_and_add() returns a NULL pointer. It may be better to
return "-ENOMEM" when kobject_create_and_add() fails.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188781

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 net/bridge/br_sysfs_br.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index e120307..f88c4df 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -898,6 +898,7 @@ int br_sysfs_addbr(struct net_device *dev)
 	if (!br->ifobj) {
 		pr_info("%s: can't add kobject (directory) %s/%s\n",
 			__func__, dev->name, SYSFS_BRIDGE_PORT_SUBDIR);
+		err = -ENOMEM;
 		goto out3;
 	}
 	return 0;
-- 
1.9.1

^ permalink raw reply related

* [PATCH 1/1] net: usb: set error code when usb_alloc_urb fails
From: Pan Bian @ 2016-12-03 11:24 UTC (permalink / raw)
  To: Woojung Huh, Microchip Linux Driver Support, netdev, linux-usb
  Cc: linux-kernel, Pan Bian

In function lan78xx_probe(), variable ret takes the errno code on
failures. However, when the call to usb_alloc_urb() fails, its value
remains 0. 0 indicates success in this context, which is inconsistent
with the execution result. This patch fixes the bug by assigning
"-ENOMEM" to ret when usb_alloc_urb() returns a NULL pointer.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188771

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/net/usb/lan78xx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index db558b8..f33460c 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -3395,6 +3395,7 @@ static int lan78xx_probe(struct usb_interface *intf,
 	if (buf) {
 		dev->urb_intr = usb_alloc_urb(0, GFP_KERNEL);
 		if (!dev->urb_intr) {
+			ret = -ENOMEM;
 			kfree(buf);
 			goto out3;
 		} else {
-- 
1.9.1

^ permalink raw reply related

* [PATCH 1/1] net: caif: fix ineffective error check
From: Pan Bian @ 2016-12-03 11:18 UTC (permalink / raw)
  To: Dmitry Tarnyagin, David S. Miller, netdev; +Cc: linux-kernel, Pan Bian

In function caif_sktinit_module(), the check of the return value of
sock_register() is inverted, so the function returns 0 even when
sock_register() fails. This patch fixes it.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188751

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 net/caif/caif_socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index aa209b1..2a689a3 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -1108,7 +1108,7 @@ static int caif_create(struct net *net, struct socket *sock, int protocol,
 static int __init caif_sktinit_module(void)
 {
 	int err = sock_register(&caif_family_ops);
-	if (!err)
+	if (err)
 		return err;
 	return 0;
 }
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH v2 net-next 1/2] flow dissector: ICMP support
From: Jiri Pirko @ 2016-12-03 10:49 UTC (permalink / raw)
  To: Simon Horman
  Cc: David Miller, netdev, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, Jamal Hadi Salim, Jiri Pirko
In-Reply-To: <1480710702-16850-2-git-send-email-simon.horman@netronome.com>

Fri, Dec 02, 2016 at 09:31:41PM CET, simon.horman@netronome.com wrote:
>Allow dissection of ICMP(V6) type and code. This re-uses transport layer
>port dissection code as although ICMP is not a transport protocol and their
>type and code are not ports this allows sharing of both code and storage.
>
>Signed-off-by: Simon Horman <simon.horman@netronome.com>
>---
> drivers/net/bonding/bond_main.c |  6 +++--
> include/linux/skbuff.h          |  5 +++++
> include/net/flow_dissector.h    | 50 ++++++++++++++++++++++++++++++++++++++---
> net/core/flow_dissector.c       | 34 +++++++++++++++++++++++++---
> net/sched/cls_flow.c            |  4 ++--
> 5 files changed, 89 insertions(+), 10 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 8029dd4912b6..a6f75cfb2bf7 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3181,7 +3181,8 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
> 	} else {
> 		return false;
> 	}
>-	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 && proto >= 0)
>+	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 &&
>+	    proto >= 0 && !skb_flow_is_icmp_any(skb, proto))
> 		fk->ports.ports = skb_flow_get_ports(skb, noff, proto);
> 
> 	return true;
>@@ -3209,7 +3210,8 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
> 		return bond_eth_hash(skb);
> 
> 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER23 ||
>-	    bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23)
>+	    bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23 ||
>+	    flow_keys_are_icmp_any(&flow))
> 		hash = bond_eth_hash(skb);
> 	else
> 		hash = (__force u32)flow.ports.ports;
>diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>index 9c535fbccf2c..44a8f69a9198 100644
>--- a/include/linux/skbuff.h
>+++ b/include/linux/skbuff.h
>@@ -1094,6 +1094,11 @@ u32 __skb_get_poff(const struct sk_buff *skb, void *data,
> __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
> 			    void *data, int hlen_proto);
> 
>+static inline bool skb_flow_is_icmp_any(const struct sk_buff *skb, u8 ip_proto)
>+{
>+	return flow_protos_are_icmp_any(skb->protocol, ip_proto);
>+}
>+
> static inline __be32 skb_flow_get_ports(const struct sk_buff *skb,
> 					int thoff, u8 ip_proto)
> {
>diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
>index c4f31666afd2..5540dfa18872 100644
>--- a/include/net/flow_dissector.h
>+++ b/include/net/flow_dissector.h
>@@ -2,6 +2,7 @@
> #define _NET_FLOW_DISSECTOR_H
> 
> #include <linux/types.h>
>+#include <linux/in.h>
> #include <linux/in6.h>
> #include <uapi/linux/if_ether.h>
> 
>@@ -89,10 +90,15 @@ struct flow_dissector_key_addrs {
> };
> 
> /**
>- * flow_dissector_key_tp_ports:
>- *	@ports: port numbers of Transport header
>+ * flow_dissector_key_ports:
>+ *	@ports: port numbers of Transport header or
>+ *		type and code of ICMP header
>+ *		ports: source (high) and destination (low) port numbers
>  *		src: source port number
>  *		dst: destination port number
>+ *		icmp: ICMP type (high) and code (low)
>+ *		type: ICMP type
>+ *		code: ICMP code
>  */
> struct flow_dissector_key_ports {
> 	union {
>@@ -101,6 +107,11 @@ struct flow_dissector_key_ports {
> 			__be16 src;
> 			__be16 dst;
> 		};
>+		__be16 icmp;
>+		struct {
>+			u8 type;
>+			u8 code;
>+		};

Digging into this a bit more, I think it would be much nicer not to mix
up L4 ports and ICMP stuff.

How about having FLOW_DISSECTOR_KEY_ICMP
and
struct flow_dissector_key_icmp {
	u8 type;
	u8 code;
};

Then you can make this structure and struct flow_dissector_key_ports into
a union in struct flow_keys.

Looks much cleaner to me.
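
A rough userspace sketch of that layout, to make the idea concrete. All
*_sketch names are hypothetical stand-ins: the real struct flow_keys has
more members, and the kernel types are __be16/__be32/u8 rather than the
stdint types assumed here:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct flow_dissector_key_ports, left free of ICMP. */
struct key_ports_sketch {
	union {
		uint32_t ports;		/* __be32 in the kernel */
		struct {
			uint16_t src;	/* __be16 */
			uint16_t dst;	/* __be16 */
		};
	};
};

/* Stand-in for the proposed struct flow_dissector_key_icmp. */
struct key_icmp_sketch {
	uint8_t type;
	uint8_t code;
};

/* Hypothetical fragment of struct flow_keys; other members omitted.
 * ip_proto tells the reader which view of the union is valid. */
struct flow_keys_sketch {
	uint8_t ip_proto;
	union {
		struct key_ports_sketch ports;
		struct key_icmp_sketch icmp;
	};
};

static_assert(sizeof(struct key_icmp_sketch) == 2, "type + code");
static_assert(sizeof(struct key_ports_sketch) == 4, "src + dst");

/* Fill the ICMP view from the first two bytes of an ICMP header, which
 * carry the type and then the code. */
void set_icmp_key(struct flow_keys_sketch *keys, uint8_t ip_proto,
		  const uint8_t *icmp_hdr)
{
	keys->ip_proto = ip_proto;
	keys->icmp.type = icmp_hdr[0];
	keys->icmp.code = icmp_hdr[1];
}
```

The union keeps the storage sharing the patch is after, while users of
ports.src and ports.dst no longer have to check for ICMP before trusting
those fields.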



> 	};
> };
> 
>@@ -188,9 +199,42 @@ struct flow_keys_digest {
> void make_flow_keys_digest(struct flow_keys_digest *digest,
> 			   const struct flow_keys *flow);
> 
>+static inline bool flow_protos_are_icmpv4(__be16 n_proto, u8 ip_proto)
>+{
>+	return n_proto == htons(ETH_P_IP) && ip_proto == IPPROTO_ICMP;
>+}
>+
>+static inline bool flow_protos_are_icmpv6(__be16 n_proto, u8 ip_proto)
>+{
>+	return n_proto == htons(ETH_P_IPV6) && ip_proto == IPPROTO_ICMPV6;
>+}
>+
>+static inline bool flow_protos_are_icmp_any(__be16 n_proto, u8 ip_proto)
>+{
>+	return flow_protos_are_icmpv4(n_proto, ip_proto) ||
>+		flow_protos_are_icmpv6(n_proto, ip_proto);
>+}
>+
>+static inline bool flow_basic_key_is_icmpv4(const struct flow_dissector_key_basic *basic)
>+{
>+	return flow_protos_are_icmpv4(basic->n_proto, basic->ip_proto);
>+}
>+
>+static inline bool flow_basic_key_is_icmpv6(const struct flow_dissector_key_basic *basic)
>+{
>+	return flow_protos_are_icmpv6(basic->n_proto, basic->ip_proto);
>+}
>+
>+static inline bool flow_keys_are_icmp_any(const struct flow_keys *keys)
>+{
>+	return flow_protos_are_icmp_any(keys->basic.n_proto,
>+					keys->basic.ip_proto);
>+}
>+
> static inline bool flow_keys_have_l4(const struct flow_keys *keys)
> {
>-	return (keys->ports.ports || keys->tags.flow_label);
>+	return (!flow_keys_are_icmp_any(keys) && keys->ports.ports) ||
>+		keys->tags.flow_label;
> }
> 
> u32 flow_hash_from_keys(struct flow_keys *keys);
>diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>index 1eb6f949e5b2..0584b4bb4390 100644
>--- a/net/core/flow_dissector.c
>+++ b/net/core/flow_dissector.c
>@@ -58,6 +58,28 @@ void skb_flow_dissector_init(struct flow_dissector *flow_dissector,
> EXPORT_SYMBOL(skb_flow_dissector_init);
> 
> /**
>+ * skb_flow_get_be16 - extract be16 entity
>+ * @skb: sk_buff to extract from
>+ * @poff: offset to extract at
>+ * @data: raw buffer pointer to the packet
>+ * @hlen: packet header length
>+ *
>+ * The function will try to retrieve a be16 entity at
>+ * offset poff
>+ */
>+__be16 skb_flow_get_be16(const struct sk_buff *skb, int poff, void *data,
>+			 int hlen)
>+{
>+	__be16 *u, _u;
>+
>+	u = __skb_header_pointer(skb, poff, sizeof(_u), data, hlen, &_u);
>+	if (u)
>+		return *u;
>+
>+	return 0;
>+}
>+
>+/**
>  * __skb_flow_get_ports - extract the upper layer ports and return them
>  * @skb: sk_buff to extract the ports from
>  * @thoff: transport header offset
>@@ -542,8 +564,13 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
> 		key_ports = skb_flow_dissector_target(flow_dissector,
> 						      FLOW_DISSECTOR_KEY_PORTS,
> 						      target_container);
>-		key_ports->ports = __skb_flow_get_ports(skb, nhoff, ip_proto,
>-							data, hlen);
>+		if (flow_protos_are_icmp_any(proto, ip_proto))
>+			key_ports->icmp = skb_flow_get_be16(skb, nhoff, data,
>+							    hlen);
>+		else
>+			key_ports->ports = __skb_flow_get_ports(skb, nhoff,
>+								ip_proto, data,
>+								hlen);
> 	}
> 
> out_good:
>@@ -718,7 +745,8 @@ void make_flow_keys_digest(struct flow_keys_digest *digest,
> 
> 	data->n_proto = flow->basic.n_proto;
> 	data->ip_proto = flow->basic.ip_proto;
>-	data->ports = flow->ports.ports;
>+	if (flow_keys_have_l4(flow))
>+		data->ports = flow->ports.ports;
> 	data->src = flow->addrs.v4addrs.src;
> 	data->dst = flow->addrs.v4addrs.dst;
> }
>diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
>index e39672394c7b..a1a7ae71aa62 100644
>--- a/net/sched/cls_flow.c
>+++ b/net/sched/cls_flow.c
>@@ -96,7 +96,7 @@ static u32 flow_get_proto(const struct sk_buff *skb,
> static u32 flow_get_proto_src(const struct sk_buff *skb,
> 			      const struct flow_keys *flow)
> {
>-	if (flow->ports.ports)
>+	if (!flow_keys_are_icmp_any(flow) && flow->ports.ports)
> 		return ntohs(flow->ports.src);
> 
> 	return addr_fold(skb->sk);
>@@ -105,7 +105,7 @@ static u32 flow_get_proto_src(const struct sk_buff *skb,
> static u32 flow_get_proto_dst(const struct sk_buff *skb,
> 			      const struct flow_keys *flow)
> {
>-	if (flow->ports.ports)
>+	if (!flow_keys_are_icmp_any(flow) && flow->ports.ports)
> 		return ntohs(flow->ports.dst);
> 
> 	return addr_fold(skb_dst(skb)) ^ (__force u16) tc_skb_protocol(skb);
>-- 
>2.7.0.rc3.207.g0ac5344
>
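
The quoted patch stores the ICMP type/code in the same storage as the
L4 ports, which is why flow_keys_have_l4() has to exclude ICMP before
looking at ports. A minimal userspace sketch of that check follows. The
demo_* names and the simplified struct are assumptions made for
illustration, not the kernel's definitions, and protocol values are kept
in host byte order for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Well-known protocol numbers, kept in host byte order for this sketch. */
#define DEMO_ETH_P_IP        0x0800
#define DEMO_ETH_P_IPV6      0x86DD
#define DEMO_IPPROTO_ICMP    1
#define DEMO_IPPROTO_TCP     6
#define DEMO_IPPROTO_ICMPV6  58

/* Hypothetical, heavily simplified stand-in for struct flow_keys. */
struct demo_flow_keys {
	uint16_t n_proto;
	uint8_t ip_proto;
	union {
		uint32_t ports;	/* source/destination port pair */
		uint16_t icmp;	/* ICMP type and code, same storage */
	} l4;
	uint32_t flow_label;
};

static bool demo_keys_are_icmp_any(const struct demo_flow_keys *k)
{
	return (k->n_proto == DEMO_ETH_P_IP &&
		k->ip_proto == DEMO_IPPROTO_ICMP) ||
	       (k->n_proto == DEMO_ETH_P_IPV6 &&
		k->ip_proto == DEMO_IPPROTO_ICMPV6);
}

/* Mirrors the patched logic: ICMP type/code must not count as ports. */
static bool demo_keys_have_l4(const struct demo_flow_keys *k)
{
	return (!demo_keys_are_icmp_any(k) && k->l4.ports) || k->flow_label;
}

static void demo_flow_selfcheck(void)
{
	struct demo_flow_keys icmp = {
		.n_proto = DEMO_ETH_P_IP, .ip_proto = DEMO_IPPROTO_ICMP,
	};
	struct demo_flow_keys tcp = {
		.n_proto = DEMO_ETH_P_IP, .ip_proto = DEMO_IPPROTO_TCP,
	};

	icmp.l4.icmp = 0x0800;		/* echo request: type 8, code 0 */
	tcp.l4.ports = 0x1f900050;	/* arbitrary non-zero port pair */

	assert(icmp.l4.ports != 0);	/* shared storage is non-zero... */
	assert(!demo_keys_have_l4(&icmp)); /* ...yet it is not L4 ports */
	assert(demo_keys_have_l4(&tcp));
}
```

Without the ICMP test, the non-zero type/code bytes would be mistaken
for a port pair, which is the situation the cls_flow hunks also guard
against.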

^ permalink raw reply

* [PATCH 1/1] net: wireless: marvell: fix improper return value
From: Pan Bian @ 2016-12-03 10:27 UTC (permalink / raw)
  To: Kalle Valo, Andreas Kemnade, Johannes Berg, libertas-dev,
	linux-wireless, netdev
  Cc: linux-kernel, Pan Bian

Function lbs_cmd_802_11_sleep_params() always returns 0, even if the call
to lbs_cmd_with_response() fails. In that case, the parameter @sp
remains uninitialized. Because the return value is 0, its caller (say
lbs_sleepparams_read()) will not detect the error, and will copy the
uninitialized stack memory to user space, resulting in a stack information
leak. To avoid the bug, this patch returns the variable ret (which holds
the return value of lbs_cmd_with_response()) instead of 0.
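
The failure mode described above can be reproduced in a small userspace
sketch. Everything named demo_* is hypothetical and only models the
shape of the bug; it is not the libertas code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's sleep-parameter structure. */
struct demo_sleep_params {
	uint16_t error;
	uint16_t offset;
};

/* Models a firmware command: on failure the output is left untouched. */
static int demo_cmd_with_response(int should_fail,
				  struct demo_sleep_params *out)
{
	if (should_fail)
		return -EIO;
	out->error = 1;
	out->offset = 2;
	return 0;
}

/* Buggy shape: the command result is computed, then discarded. */
static int demo_get_params_buggy(int should_fail,
				 struct demo_sleep_params *sp)
{
	int ret = demo_cmd_with_response(should_fail, sp);

	(void)ret;
	return 0;
}

/* Fixed shape: the command result reaches the caller. */
static int demo_get_params_fixed(int should_fail,
				 struct demo_sleep_params *sp)
{
	return demo_cmd_with_response(should_fail, sp);
}

static void demo_leak_selfcheck(void)
{
	struct demo_sleep_params sp;

	/* 0xAA plays the role of stale stack contents. */
	memset(&sp, 0xAA, sizeof(sp));
	assert(demo_get_params_buggy(1, &sp) == 0);	/* failure hidden */
	assert(sp.error == 0xAAAA);	/* caller would copy stale bytes */

	assert(demo_get_params_fixed(1, &sp) == -EIO);	/* failure visible */
	assert(demo_get_params_fixed(0, &sp) == 0 && sp.error == 1);
}
```

With the buggy shape the caller sees success and goes on to use the
stale bytes; with the fixed shape it can bail out before touching @sp.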

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188451

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/net/wireless/marvell/libertas/cmd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/marvell/libertas/cmd.c b/drivers/net/wireless/marvell/libertas/cmd.c
index 301170c..033ff88 100644
--- a/drivers/net/wireless/marvell/libertas/cmd.c
+++ b/drivers/net/wireless/marvell/libertas/cmd.c
@@ -305,7 +305,7 @@ int lbs_cmd_802_11_sleep_params(struct lbs_private *priv, uint16_t cmd_action,
 	}
 
 	lbs_deb_leave_args(LBS_DEB_CMD, "ret %d", ret);
-	return 0;
+	return ret;
 }
 
 static int lbs_wait_for_ds_awake(struct lbs_private *priv)
-- 
1.9.1

^ permalink raw reply related

* [PATCH 1/1] net: wireless: intersil: fix improper return value
From: Pan Bian @ 2016-12-03 10:22 UTC (permalink / raw)
  To: Kalle Valo, linux-wireless, netdev; +Cc: linux-kernel, Pan Bian

Function orinoco_ioctl_commit() returns 0 (indicating success) when the
call to orinoco_lock() fails. Thus, the return value is inconsistent with
the execution status. It may be better to return "-EBUSY" when the call
to orinoco_lock() fails.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188671

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/net/wireless/intersil/orinoco/wext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intersil/orinoco/wext.c b/drivers/net/wireless/intersil/orinoco/wext.c
index 1d4dae4..fee57ea 100644
--- a/drivers/net/wireless/intersil/orinoco/wext.c
+++ b/drivers/net/wireless/intersil/orinoco/wext.c
@@ -1314,7 +1314,7 @@ static int orinoco_ioctl_commit(struct net_device *dev,
 		return 0;
 
 	if (orinoco_lock(priv, &flags) != 0)
-		return err;
+		return -EBUSY;
 
 	err = orinoco_commit(priv);
 
-- 
1.9.1

^ permalink raw reply related

* [PATCH 1/1] net: wireless: intersil: fix improper return value
From: Pan Bian @ 2016-12-03 10:18 UTC (permalink / raw)
  To: Kalle Valo, linux-wireless, netdev; +Cc: linux-kernel, Pan Bian

Function orinoco_ioctl_commit() returns 0 (indicating success) when the
call to orinoco_lock() fails. Thus, the return value is inconsistent with
the execution status. It may be better to return "-EBUSY" when the call
to orinoco_lock() fails.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188671

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/net/wireless/intersil/orinoco/wext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intersil/orinoco/wext.c b/drivers/net/wireless/intersil/orinoco/wext.c
index 1d4dae4..fee57ea 100644
--- a/drivers/net/wireless/intersil/orinoco/wext.c
+++ b/drivers/net/wireless/intersil/orinoco/wext.c
@@ -1314,7 +1314,7 @@ static int orinoco_ioctl_commit(struct net_device *dev,
 		return 0;
 
 	if (orinoco_lock(priv, &flags) != 0)
-		return err;
+		return -EBUSY;
 
 	err = orinoco_commit(priv);
 
-- 
1.9.1

^ permalink raw reply related

* [PATCH 1/1] netdev: broadcom: propagate error code
From: Pan Bian @ 2016-12-03  9:56 UTC (permalink / raw)
  To: David S. Miller, Michael Chan, Prashant Sreedharan,
	Satish Baddipadige, netdev
  Cc: linux-kernel, Pan Bian

Function bnxt_hwrm_stat_ctx_alloc() always returns 0, even if the call
to _hwrm_send_message() fails. It may be better to propagate the errors
to the caller of bnxt_hwrm_stat_ctx_alloc().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188661

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ee1a803..f08a20b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4120,7 +4120,7 @@ static int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp)
 		bp->grp_info[i].fw_stats_ctx = cpr->hw_stats_ctx_id;
 	}
 	mutex_unlock(&bp->hwrm_cmd_lock);
-	return 0;
+	return rc;
 }
 
 static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next] net: remove abuse of VLAN DEI/CFI bit
From: Michał Mirosław @ 2016-12-03  9:22 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: open list:OPENVSWITCH, moderated list:ETHERNET BRIDGE

This all-in-one patch removes the abuse of the VLAN CFI bit, so that it can
be passed intact through the Linux networking stack.
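
To illustrate what "abuse" means here: a minimal userspace sketch,
assuming a 16-bit TCI with DEI/CFI at bit 12 as in 802.1Q. The demo_*
types and helpers are hypothetical models of the old and new schemes,
not the kernel's struct sk_buff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VLAN_CFI_MASK 0x1000	/* bit 12 of the TCI: DEI/CFI */

/* Old scheme: bit 12 of the stored TCI doubles as "tag present". */
struct demo_skb_old {
	uint16_t vlan_tci;
};

static void demo_old_put_tag(struct demo_skb_old *skb, uint16_t tci)
{
	skb->vlan_tci = DEMO_VLAN_CFI_MASK | tci;
}

static bool demo_old_tag_present(const struct demo_skb_old *skb)
{
	return (skb->vlan_tci & DEMO_VLAN_CFI_MASK) != 0;
}

static uint16_t demo_old_tag_get(const struct demo_skb_old *skb)
{
	return (uint16_t)(skb->vlan_tci & ~DEMO_VLAN_CFI_MASK);
}

/* New scheme: presence lives in its own flag, the TCI is stored whole. */
struct demo_skb_new {
	uint16_t vlan_tci;
	bool vlan_present;
};

static void demo_new_put_tag(struct demo_skb_new *skb, uint16_t tci)
{
	skb->vlan_tci = tci;
	skb->vlan_present = true;
}

static uint16_t demo_new_tag_get(const struct demo_skb_new *skb)
{
	return skb->vlan_tci;
}

static void demo_cfi_selfcheck(void)
{
	const uint16_t tci = 0x3064;	/* PCP 1, DEI/CFI set, VID 0x064 */
	struct demo_skb_old o;
	struct demo_skb_new n;

	demo_old_put_tag(&o, tci);
	demo_new_put_tag(&n, tci);

	assert(demo_old_tag_present(&o));
	assert(demo_old_tag_get(&o) == 0x2064);	/* DEI/CFI bit is lost */
	assert(n.vlan_present);
	assert(demo_new_tag_get(&n) == 0x3064);	/* TCI passes intact */
}
```

In the old model a frame received with DEI/CFI set is indistinguishable
from one received with it clear once the tag has been stored.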

Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
---

Dear NetDevs

I guess this needs to be split into a prep..convert[]..finish sequence,
but if you like it as is, then it's ready.

The biggest question is whether the modified interface and vlan_present
are the way to go. This can be changed to use vlan_proto != 0 instead
of an extra flag bit.

As I can't test most of the driver changes, please look at them carefully.
OVS and bridge eyes are especially welcome.
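
For anyone reviewing the JIT hunks: they all load the byte holding the
vlan_present bitfield, shift by PKT_VLAN_PRESENT_BIT and mask with 1.
The userspace sketch below models that sequence. The demo_* struct and
helpers are assumptions for illustration, and the bit position is
probed at run time here, whereas the patch hard-codes 0 or 7 depending
on __BIG_ENDIAN_BITFIELD.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical one-byte flag group; vlan_present is the first bitfield. */
struct demo_flags {
	unsigned char vlan_present:1;
	unsigned char other:2;
};

/* Find which bit of the byte the compiler assigned to vlan_present. */
static unsigned int demo_present_bit(void)
{
	struct demo_flags f;
	unsigned char byte;
	unsigned int bit = 0;

	memset(&f, 0, sizeof(f));
	f.vlan_present = 1;
	memcpy(&byte, &f, 1);
	while (bit < 8 && !(byte & (1u << bit)))
		bit++;
	return bit;
}

/* Same steps as the JIT code: byte load, shift right, mask with 1. */
static unsigned int demo_load_present(const struct demo_flags *f)
{
	unsigned char byte;

	memcpy(&byte, f, 1);
	return (byte >> demo_present_bit()) & 1u;
}

static void demo_bit_selfcheck(void)
{
	struct demo_flags f;

	memset(&f, 0, sizeof(f));
	f.other = 3;			/* neighbouring bits must not leak */
	assert(demo_load_present(&f) == 0);

	f.vlan_present = 1;
	assert(demo_load_present(&f) == 1);
}
```

The final mask matters because the neighbouring bitfields share the
loaded byte.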

Best Regards,
Michał Mirosław
---
 Documentation/networking/openvswitch.txt         | 14 ------
 arch/arm/net/bpf_jit_32.c                        | 14 +++---
 arch/mips/net/bpf_jit.c                          | 17 +++----
 arch/powerpc/net/bpf_jit_comp.c                  | 14 +++---
 arch/sparc/net/bpf_jit_comp.c                    | 14 +++---
 drivers/infiniband/hw/cxgb4/cm.c                 |  2 +-
 drivers/infiniband/hw/i40iw/i40iw_cm.c           |  8 ++--
 drivers/net/ethernet/broadcom/cnic.c             |  2 +-
 drivers/net/ethernet/emulex/benet/be_main.c      |  4 +-
 drivers/net/ethernet/freescale/gianfar_ethtool.c |  7 +--
 drivers/net/ethernet/ibm/ibmvnic.c               |  5 +-
 drivers/net/ethernet/marvell/sky2.c              |  4 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c   |  9 ++--
 drivers/net/hyperv/hyperv_net.h                  |  2 +-
 drivers/net/hyperv/netvsc_drv.c                  | 14 +++---
 drivers/net/hyperv/rndis_filter.c                |  5 +-
 include/linux/if_vlan.h                          | 23 ++++++---
 include/linux/skbuff.h                           | 10 +++-
 lib/test_bpf.c                                   | 14 +++---
 net/8021q/vlan_core.c                            |  2 +-
 net/bridge/br_netfilter_hooks.c                  | 14 +++---
 net/bridge/br_private.h                          |  2 +-
 net/bridge/br_vlan.c                             |  6 +--
 net/core/dev.c                                   |  8 ++--
 net/core/filter.c                                | 17 +++----
 net/core/skbuff.c                                |  2 +-
 net/ipv4/ip_tunnel_core.c                        |  2 +-
 net/netfilter/nfnetlink_queue.c                  |  5 +-
 net/openvswitch/actions.c                        | 13 ++---
 net/openvswitch/flow.c                           |  2 +-
 net/openvswitch/flow.h                           |  4 +-
 net/openvswitch/flow_netlink.c                   | 61 ++++++++----------------
 net/sched/act_vlan.c                             |  2 +-
 33 files changed, 152 insertions(+), 170 deletions(-)

diff --git a/Documentation/networking/openvswitch.txt b/Documentation/networking/openvswitch.txt
index b3b9ac6..e7ca27d 100644
--- a/Documentation/networking/openvswitch.txt
+++ b/Documentation/networking/openvswitch.txt
@@ -219,20 +219,6 @@ this:
 
     eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
 
-As another example, consider a packet with an Ethernet type of 0x8100,
-indicating that a VLAN TCI should follow, but which is truncated just
-after the Ethernet type.  The flow key for this packet would include
-an all-zero-bits vlan and an empty encap attribute, like this:
-
-    eth(...), eth_type(0x8100), vlan(0), encap()
-
-Unlike a TCP packet with source and destination ports 0, an
-all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
-VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
-attribute expressly to allow this situation to be distinguished.
-Thus, the flow key in this second example unambiguously indicates a
-missing or malformed VLAN TCI.
-
 Other rules
 -----------
 
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 93d0b6d..aff9dfa 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -915,17 +915,17 @@ static int build_body(struct jit_ctx *ctx)
 			emit(ARM_LDR_I(r_A, r_skb, off), ctx);
 			break;
 		case BPF_ANC | SKF_AD_VLAN_TAG:
-		case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
 			ctx->seen |= SEEN_SKB;
 			BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
 			off = offsetof(struct sk_buff, vlan_tci);
 			emit(ARM_LDRH_I(r_A, r_skb, off), ctx);
-			if (code == (BPF_ANC | SKF_AD_VLAN_TAG))
-				OP_IMM3(ARM_AND, r_A, r_A, ~VLAN_TAG_PRESENT, ctx);
-			else {
-				OP_IMM3(ARM_LSR, r_A, r_A, 12, ctx);
-				OP_IMM3(ARM_AND, r_A, r_A, 0x1, ctx);
-			}
+			break;
+		case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+			off = PKT_VLAN_PRESENT_OFFSET();
+			emit(ARM_LDRB_I(r_A, r_skb, off), ctx);
+			if (PKT_VLAN_PRESENT_BIT)
+				OP_IMM3(ARM_LSR, r_A, r_A, PKT_VLAN_PRESENT_BIT, ctx);
+			OP_IMM3(ARM_AND, r_A, r_A, 0x1, ctx);
 			break;
 		case BPF_ANC | SKF_AD_PKTTYPE:
 			ctx->seen |= SEEN_SKB;
diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c
index 49a2e22..fb6d234 100644
--- a/arch/mips/net/bpf_jit.c
+++ b/arch/mips/net/bpf_jit.c
@@ -1138,19 +1138,20 @@ static int build_body(struct jit_ctx *ctx)
 			emit_load(r_A, r_skb, off, ctx);
 			break;
 		case BPF_ANC | SKF_AD_VLAN_TAG:
-		case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
 			ctx->flags |= SEEN_SKB | SEEN_A;
 			BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
 						  vlan_tci) != 2);
 			off = offsetof(struct sk_buff, vlan_tci);
 			emit_half_load(r_s0, r_skb, off, ctx);
-			if (code == (BPF_ANC | SKF_AD_VLAN_TAG)) {
-				emit_andi(r_A, r_s0, (u16)~VLAN_TAG_PRESENT, ctx);
-			} else {
-				emit_andi(r_A, r_s0, VLAN_TAG_PRESENT, ctx);
-				/* return 1 if present */
-				emit_sltu(r_A, r_zero, r_A, ctx);
-			}
+			break;
+		case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+			ctx->flags |= SEEN_SKB | SEEN_A;
+			emit_load_byte(r_A, r_skb, PKT_VLAN_PRESENT_OFFSET(), ctx);
+			if (PKT_VLAN_PRESENT_BIT)
+				emit_srl(r_A, r_A, PKT_VLAN_PRESENT_BIT, ctx);
+			emit_andi(r_A, r_A, 1, ctx);
+			/* return 1 if present */
+			emit_sltu(r_A, r_zero, r_A, ctx);
 			break;
 		case BPF_ANC | SKF_AD_PKTTYPE:
 			ctx->flags |= SEEN_SKB;
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 7e706f3..fb38927 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -377,18 +377,16 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 							  hash));
 			break;
 		case BPF_ANC | SKF_AD_VLAN_TAG:
-		case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
 			BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
-			BUILD_BUG_ON(VLAN_TAG_PRESENT != 0x1000);
 
 			PPC_LHZ_OFFS(r_A, r_skb, offsetof(struct sk_buff,
 							  vlan_tci));
-			if (code == (BPF_ANC | SKF_AD_VLAN_TAG)) {
-				PPC_ANDI(r_A, r_A, ~VLAN_TAG_PRESENT);
-			} else {
-				PPC_ANDI(r_A, r_A, VLAN_TAG_PRESENT);
-				PPC_SRWI(r_A, r_A, 12);
-			}
+			break;
+		case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+			PPC_LBZ_OFFS(r_A, r_skb, PKT_VLAN_PRESENT_OFFSET());
+			if (PKT_VLAN_PRESENT_BIT)
+				PPC_SRWI(r_A, r_A, PKT_VLAN_PRESENT_BIT);
+			PPC_ANDI(r_A, r_A, 1);
 			break;
 		case BPF_ANC | SKF_AD_QUEUE:
 			BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
index a6d9204..d499b39 100644
--- a/arch/sparc/net/bpf_jit_comp.c
+++ b/arch/sparc/net/bpf_jit_comp.c
@@ -601,15 +601,13 @@ void bpf_jit_compile(struct bpf_prog *fp)
 				emit_skb_load32(hash, r_A);
 				break;
 			case BPF_ANC | SKF_AD_VLAN_TAG:
-			case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
 				emit_skb_load16(vlan_tci, r_A);
-				if (code != (BPF_ANC | SKF_AD_VLAN_TAG)) {
-					emit_alu_K(SRL, 12);
-					emit_andi(r_A, 1, r_A);
-				} else {
-					emit_loadimm(~VLAN_TAG_PRESENT, r_TMP);
-					emit_and(r_A, r_TMP, r_A);
-				}
+				break;
+			case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+				__emit_skb_load8(__pkt_vlan_present_offset, r_A);
+				if (PKT_VLAN_PRESENT_BIT)
+					emit_alu_K(SRL, PKT_VLAN_PRESENT_BIT);
+				emit_andi(r_A, 1, r_A);
 				break;
 			case BPF_LD | BPF_W | BPF_LEN:
 				emit_skb_load32(len, r_A);
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index f1510cc..66a3d39 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -3899,7 +3899,7 @@ static int rx_pkt(struct c4iw_dev *dev, struct sk_buff *skb)
 	} else {
 		vlan_eh = (struct vlan_ethhdr *)(req + 1);
 		iph = (struct iphdr *)(vlan_eh + 1);
-		skb->vlan_tci = ntohs(cpl->vlan);
+		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), ntohs(cpl->vlan));
 	}
 
 	if (iph->version != 0x4)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c b/drivers/infiniband/hw/i40iw/i40iw_cm.c
index 8563769..b9e360c 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_cm.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_cm.c
@@ -414,7 +414,7 @@ static struct i40iw_puda_buf *i40iw_form_cm_frame(struct i40iw_cm_node *cm_node,
 			pd_len += MPA_ZERO_PAD_LEN;
 	}
 
-	if (cm_node->vlan_id < VLAN_TAG_PRESENT)
+	if (cm_node->vlan_id <= VLAN_VID_MASK)
 		eth_hlen += 4;
 
 	if (cm_node->ipv4)
@@ -443,7 +443,7 @@ static struct i40iw_puda_buf *i40iw_form_cm_frame(struct i40iw_cm_node *cm_node,
 
 		ether_addr_copy(ethh->h_dest, cm_node->rem_mac);
 		ether_addr_copy(ethh->h_source, cm_node->loc_mac);
-		if (cm_node->vlan_id < VLAN_TAG_PRESENT) {
+		if (cm_node->vlan_id <= VLAN_VID_MASK) {
 			((struct vlan_ethhdr *)ethh)->h_vlan_proto = htons(ETH_P_8021Q);
 			((struct vlan_ethhdr *)ethh)->h_vlan_TCI = htons(cm_node->vlan_id);
 
@@ -472,7 +472,7 @@ static struct i40iw_puda_buf *i40iw_form_cm_frame(struct i40iw_cm_node *cm_node,
 
 		ether_addr_copy(ethh->h_dest, cm_node->rem_mac);
 		ether_addr_copy(ethh->h_source, cm_node->loc_mac);
-		if (cm_node->vlan_id < VLAN_TAG_PRESENT) {
+		if (cm_node->vlan_id <= VLAN_VID_MASK) {
 			((struct vlan_ethhdr *)ethh)->h_vlan_proto = htons(ETH_P_8021Q);
 			((struct vlan_ethhdr *)ethh)->h_vlan_TCI = htons(cm_node->vlan_id);
 			((struct vlan_ethhdr *)ethh)->h_vlan_encapsulated_proto = htons(ETH_P_IPV6);
@@ -3235,7 +3235,7 @@ static void i40iw_init_tcp_ctx(struct i40iw_cm_node *cm_node,
 
 	tcp_info->flow_label = 0;
 	tcp_info->snd_mss = cpu_to_le32(((u32)cm_node->tcp_cntxt.mss));
-	if (cm_node->vlan_id < VLAN_TAG_PRESENT) {
+	if (cm_node->vlan_id <= VLAN_VID_MASK) {
 		tcp_info->insert_vlan_tag = true;
 		tcp_info->vlan_tag = cpu_to_le16(cm_node->vlan_id);
 	}
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index b1d2ac8..6e3c610 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -5734,7 +5734,7 @@ static int cnic_netdev_event(struct notifier_block *this, unsigned long event,
 		if (realdev) {
 			dev = cnic_from_netdev(realdev);
 			if (dev) {
-				vid |= VLAN_TAG_PRESENT;
+				vid |= VLAN_CFI_MASK;	/* make non-zero */
 				cnic_rcv_netevent(dev->cnic_priv, event, vid);
 				cnic_put(dev);
 			}
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 7e1633b..b365a01 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1053,12 +1053,12 @@ static struct sk_buff *be_insert_vlan_in_pkt(struct be_adapter *adapter,
 		BE_WRB_F_SET(wrb_params->features, VLAN_SKIP_HW, 1);
 	}
 
-	if (vlan_tag) {
+	if (skb_vlan_tag_present(skb)) {
 		skb = vlan_insert_tag_set_proto(skb, htons(ETH_P_8021Q),
 						vlan_tag);
 		if (unlikely(!skb))
 			return skb;
-		skb->vlan_tci = 0;
+		__vlan_hwaccel_clear_tag(skb);
 	}
 
 	/* Insert the outer VLAN, if any */
diff --git a/drivers/net/ethernet/freescale/gianfar_ethtool.c b/drivers/net/ethernet/freescale/gianfar_ethtool.c
index 56588f2..b479ded 100644
--- a/drivers/net/ethernet/freescale/gianfar_ethtool.c
+++ b/drivers/net/ethernet/freescale/gianfar_ethtool.c
@@ -1155,13 +1155,10 @@ static int gfar_convert_to_filer(struct ethtool_rx_flow_spec *rule,
 		prio = vlan_tci_prio(rule);
 		prio_mask = vlan_tci_priom(rule);
 
-		if (cfi == VLAN_TAG_PRESENT && cfi_mask == VLAN_TAG_PRESENT) {
+		if (cfi)
 			vlan |= RQFPR_CFI;
+		if (cfi_mask)
 			vlan_mask |= RQFPR_CFI;
-		} else if (cfi != VLAN_TAG_PRESENT &&
-			   cfi_mask == VLAN_TAG_PRESENT) {
-			vlan_mask |= RQFPR_CFI;
-		}
 	}
 
 	switch (rule->flow_type & ~FLOW_EXT) {
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index c125966..c7664db 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -765,7 +765,7 @@ static int ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	tx_crq.v1.sge_len = cpu_to_be32(skb->len);
 	tx_crq.v1.ioba = cpu_to_be64(data_dma_addr);
 
-	if (adapter->vlan_header_insertion) {
+	if (adapter->vlan_header_insertion && skb_vlan_tag_present(skb)) {
 		tx_crq.v1.flags2 |= IBMVNIC_TX_VLAN_INSERT;
 		tx_crq.v1.vlan_id = cpu_to_be16(skb->vlan_tci);
 	}
@@ -964,7 +964,8 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 		skb = rx_buff->skb;
 		skb_copy_to_linear_data(skb, rx_buff->data + offset,
 					length);
-		skb->vlan_tci = be16_to_cpu(next->rx_comp.vlan_tci);
+		if (flags & IBMVNIC_VLAN_STRIPPED)
+			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), be16_to_cpu(next->rx_comp.vlan_tci));
 		/* free the entry */
 		next->rx_comp.first = 0;
 		remove_buff_from_pool(adapter, rx_buff);
diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index b60ad0e..566bdfc1 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -2487,11 +2487,11 @@ static struct sk_buff *receive_copy(struct sky2_port *sky2,
 		skb_copy_hash(skb, re->skb);
 		skb->vlan_proto = re->skb->vlan_proto;
 		skb->vlan_tci = re->skb->vlan_tci;
+		skb->vlan_present = re->skb->vlan_present;
 
 		pci_dma_sync_single_for_device(sky2->hw->pdev, re->data_addr,
 					       length, PCI_DMA_FROMDEVICE);
-		re->skb->vlan_proto = 0;
-		re->skb->vlan_tci = 0;
+		__vlan_hwaccel_clear_tag(re->skb);
 		skb_clear_hash(re->skb);
 		re->skb->ip_summed = CHECKSUM_NONE;
 		skb_put(skb, length);
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c
index fedd736..806e4d1 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c
@@ -459,7 +459,7 @@ static int qlcnic_tx_pkt(struct qlcnic_adapter *adapter,
 			 struct cmd_desc_type0 *first_desc, struct sk_buff *skb,
 			 struct qlcnic_host_tx_ring *tx_ring)
 {
-	u8 l4proto, opcode = 0, hdr_len = 0;
+	u8 l4proto, opcode = 0, hdr_len = 0, tag_vlan = 0;
 	u16 flags = 0, vlan_tci = 0;
 	int copied, offset, copy_len, size;
 	struct cmd_desc_type0 *hwdesc;
@@ -472,18 +472,21 @@ static int qlcnic_tx_pkt(struct qlcnic_adapter *adapter,
 		flags = QLCNIC_FLAGS_VLAN_TAGGED;
 		vlan_tci = ntohs(vh->h_vlan_TCI);
 		protocol = ntohs(vh->h_vlan_encapsulated_proto);
+		tag_vlan = 1;
 	} else if (skb_vlan_tag_present(skb)) {
 		flags = QLCNIC_FLAGS_VLAN_OOB;
 		vlan_tci = skb_vlan_tag_get(skb);
+		tag_vlan = 1;
 	}
 	if (unlikely(adapter->tx_pvid)) {
-		if (vlan_tci && !(adapter->flags & QLCNIC_TAGGING_ENABLED))
+		if (tag_vlan && !(adapter->flags & QLCNIC_TAGGING_ENABLED))
 			return -EIO;
-		if (vlan_tci && (adapter->flags & QLCNIC_TAGGING_ENABLED))
+		if (tag_vlan && (adapter->flags & QLCNIC_TAGGING_ENABLED))
 			goto set_flags;
 
 		flags = QLCNIC_FLAGS_VLAN_OOB;
 		vlan_tci = adapter->tx_pvid;
+		tag_vlan = 1;
 	}
 set_flags:
 	qlcnic_set_tx_vlan_tci(first_desc, vlan_tci);
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 3958ada..b53729e 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -186,7 +186,7 @@ int netvsc_recv_callback(struct hv_device *device_obj,
 			void **data,
 			struct ndis_tcp_ip_checksum_info *csum_info,
 			struct vmbus_channel *channel,
-			u16 vlan_tci);
+			u16 vlan_tci, bool vlan_present);
 void netvsc_channel_cb(void *context);
 int rndis_filter_open(struct netvsc_device *nvdev);
 int rndis_filter_close(struct netvsc_device *nvdev);
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 9522763..1ef3d70 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -437,6 +437,7 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 		vlan = (struct ndis_pkt_8021q_info *)((void *)ppi +
 						ppi->ppi_offset);
 		vlan->vlanid = skb->vlan_tci & VLAN_VID_MASK;
+		vlan->cfi = !!(skb->vlan_tci & VLAN_CFI_MASK);
 		vlan->pri = (skb->vlan_tci & VLAN_PRIO_MASK) >>
 				VLAN_PRIO_SHIFT;
 	}
@@ -591,7 +592,7 @@ void netvsc_linkstatus_callback(struct hv_device *device_obj,
 static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net,
 				struct hv_netvsc_packet *packet,
 				struct ndis_tcp_ip_checksum_info *csum_info,
-				void *data, u16 vlan_tci)
+				void *data)
 {
 	struct sk_buff *skb;
 
@@ -621,10 +622,6 @@ static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net,
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 	}
 
-	if (vlan_tci & VLAN_TAG_PRESENT)
-		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q),
-				       vlan_tci);
-
 	return skb;
 }
 
@@ -637,7 +634,7 @@ int netvsc_recv_callback(struct hv_device *device_obj,
 				void **data,
 				struct ndis_tcp_ip_checksum_info *csum_info,
 				struct vmbus_channel *channel,
-				u16 vlan_tci)
+				u16 vlan_tci, bool vlan_present)
 {
 	struct net_device *net = hv_get_drvdata(device_obj);
 	struct net_device_context *net_device_ctx = netdev_priv(net);
@@ -660,12 +657,15 @@ int netvsc_recv_callback(struct hv_device *device_obj,
 		net = vf_netdev;
 
 	/* Allocate a skb - TODO direct I/O to pages? */
-	skb = netvsc_alloc_recv_skb(net, packet, csum_info, *data, vlan_tci);
+	skb = netvsc_alloc_recv_skb(net, packet, csum_info, *data);
 	if (unlikely(!skb)) {
 		++net->stats.rx_dropped;
 		return NVSP_STAT_FAIL;
 	}
 
+	if (vlan_present)
+		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tci);
+
 	if (net != vf_netdev)
 		skb_record_rx_queue(skb,
 				    channel->offermsg.offer.sub_channel_index);
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 8d90904..9759d73 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -381,13 +381,14 @@ static int rndis_filter_receive_data(struct rndis_device *dev,
 
 	vlan = rndis_get_ppi(rndis_pkt, IEEE_8021Q_INFO);
 	if (vlan) {
-		vlan_tci = VLAN_TAG_PRESENT | vlan->vlanid |
+		vlan_tci = vlan->vlanid |
+			(vlan->cfi ? VLAN_CFI_MASK : 0) |
 			(vlan->pri << VLAN_PRIO_SHIFT);
 	}
 
 	csum_info = rndis_get_ppi(rndis_pkt, TCPIP_CHKSUM_PKTINFO);
 	return netvsc_recv_callback(net_device_ctx->device_ctx, pkt, data,
-				    csum_info, channel, vlan_tci);
+				    csum_info, channel, vlan_tci, vlan);
 }
 
 int rndis_filter_receive(struct hv_device *dev,
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 8d5fcd6..a0ba7ba 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -66,7 +66,6 @@ static inline struct vlan_ethhdr *vlan_eth_hdr(const struct sk_buff *skb)
 #define VLAN_PRIO_MASK		0xe000 /* Priority Code Point */
 #define VLAN_PRIO_SHIFT		13
 #define VLAN_CFI_MASK		0x1000 /* Canonical Format Indicator */
-#define VLAN_TAG_PRESENT	VLAN_CFI_MASK
 #define VLAN_VID_MASK		0x0fff /* VLAN Identifier */
 #define VLAN_N_VID		4096
 
@@ -78,8 +77,8 @@ static inline bool is_vlan_dev(const struct net_device *dev)
         return dev->priv_flags & IFF_802_1Q_VLAN;
 }
 
-#define skb_vlan_tag_present(__skb)	((__skb)->vlan_tci & VLAN_TAG_PRESENT)
-#define skb_vlan_tag_get(__skb)		((__skb)->vlan_tci & ~VLAN_TAG_PRESENT)
+#define skb_vlan_tag_present(__skb)	((__skb)->vlan_present)
+#define skb_vlan_tag_get(__skb)		((__skb)->vlan_tci)
 #define skb_vlan_tag_get_id(__skb)	((__skb)->vlan_tci & VLAN_VID_MASK)
 #define skb_vlan_tag_get_prio(__skb)	((__skb)->vlan_tci & VLAN_PRIO_MASK)
 
@@ -382,6 +381,17 @@ static inline struct sk_buff *vlan_insert_tag_set_proto(struct sk_buff *skb,
 	return skb;
 }
 
+/**
+ * __vlan_hwaccel_clear_tag - clear hardware accelerated VLAN info
+ * @skb: skbuff to clear
+ *
+ * Clears the VLAN information from @skb
+ */
+static inline void __vlan_hwaccel_clear_tag(struct sk_buff *skb)
+{
+	skb->vlan_present = 0;
+}
+
 /*
  * __vlan_hwaccel_push_inside - pushes vlan tag to the payload
  * @skb: skbuff to tag
@@ -396,7 +406,7 @@ static inline struct sk_buff *__vlan_hwaccel_push_inside(struct sk_buff *skb)
 	skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
 					skb_vlan_tag_get(skb));
 	if (likely(skb))
-		skb->vlan_tci = 0;
+		__vlan_hwaccel_clear_tag(skb);
 	return skb;
 }
 
@@ -412,7 +422,8 @@ static inline void __vlan_hwaccel_put_tag(struct sk_buff *skb,
 					  __be16 vlan_proto, u16 vlan_tci)
 {
 	skb->vlan_proto = vlan_proto;
-	skb->vlan_tci = VLAN_TAG_PRESENT | vlan_tci;
+	skb->vlan_tci = vlan_tci;
+	skb->vlan_present = 1;
 }
 
 /**
@@ -452,8 +463,6 @@ static inline int __vlan_hwaccel_get_tag(const struct sk_buff *skb,
 	}
 }
 
-#define HAVE_VLAN_GET_TAG
-
 /**
  * vlan_get_tag - get the VLAN ID from the skb
  * @skb: skbuff to query
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 9c535fb..4a28beed 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -733,6 +733,14 @@ struct sk_buff {
 	__u8			csum_level:2;
 	__u8			csum_bad:1;
 
+#ifdef __BIG_ENDIAN_BITFIELD
+#define PKT_VLAN_PRESENT_BIT	7
+#else
+#define PKT_VLAN_PRESENT_BIT	0
+#endif
+#define PKT_VLAN_PRESENT_OFFSET()	offsetof(struct sk_buff, __pkt_vlan_present_offset)
+	__u8			__pkt_vlan_present_offset[0];
+	__u8			vlan_present:1;
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
 	__u8			ndisc_nodetype:2;
 #endif
@@ -742,7 +750,7 @@ struct sk_buff {
 #ifdef CONFIG_NET_SWITCHDEV
 	__u8			offload_fwd_mark:1;
 #endif
-	/* 2, 4 or 5 bit hole */
+	/* 1-4 bit hole */
 
 #ifdef CONFIG_NET_SCHED
 	__u16			tc_index;	/* traffic control index */
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 0362da0..9cb21a2 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -38,6 +38,7 @@
 #define SKB_HASH	0x1234aaab
 #define SKB_QUEUE_MAP	123
 #define SKB_VLAN_TCI	0xffff
+#define SKB_VLAN_PRESENT	1
 #define SKB_DEV_IFINDEX	577
 #define SKB_DEV_TYPE	588
 
@@ -691,8 +692,8 @@ static struct bpf_test tests[] = {
 		CLASSIC,
 		{ },
 		{
-			{ 1, SKB_VLAN_TCI & ~VLAN_TAG_PRESENT },
-			{ 10, SKB_VLAN_TCI & ~VLAN_TAG_PRESENT }
+			{ 1, SKB_VLAN_TCI },
+			{ 10, SKB_VLAN_TCI }
 		},
 	},
 	{
@@ -705,8 +706,8 @@ static struct bpf_test tests[] = {
 		CLASSIC,
 		{ },
 		{
-			{ 1, !!(SKB_VLAN_TCI & VLAN_TAG_PRESENT) },
-			{ 10, !!(SKB_VLAN_TCI & VLAN_TAG_PRESENT) }
+			{ 1, SKB_VLAN_PRESENT },
+			{ 10, SKB_VLAN_PRESENT }
 		},
 	},
 	{
@@ -4773,8 +4774,8 @@ static struct bpf_test tests[] = {
 		CLASSIC,
 		{ },
 		{
-			{  1, !!(SKB_VLAN_TCI & VLAN_TAG_PRESENT) },
-			{ 10, !!(SKB_VLAN_TCI & VLAN_TAG_PRESENT) }
+			{  1, SKB_VLAN_PRESENT },
+			{ 10, SKB_VLAN_PRESENT }
 		},
 		.fill_helper = bpf_fill_maxinsns6,
 	},
@@ -5486,6 +5487,7 @@ static struct sk_buff *populate_skb(char *buf, int size)
 	skb->queue_mapping = SKB_QUEUE_MAP;
 	skb->vlan_tci = SKB_VLAN_TCI;
 	skb->vlan_proto = htons(ETH_P_IP);
+	skb->vlan_present = SKB_VLAN_PRESENT;
 	skb->dev = &dev;
 	skb->dev->ifindex = SKB_DEV_IFINDEX;
 	skb->dev->type = SKB_DEV_TYPE;
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e2ed698..604a67a 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -50,7 +50,7 @@ bool vlan_do_receive(struct sk_buff **skbp)
 	}
 
 	skb->priority = vlan_get_ingress_priority(vlan_dev, skb->vlan_tci);
-	skb->vlan_tci = 0;
+	__vlan_hwaccel_clear_tag(skb);
 
 	rx_stats = this_cpu_ptr(vlan_dev_priv(vlan_dev)->vlan_pcpu_stats);
 
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 83d937f..1610a51 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -682,10 +682,8 @@ static int br_nf_push_frag_xmit(struct net *net, struct sock *sk, struct sk_buff
 		return 0;
 	}
 
-	if (data->vlan_tci) {
-		skb->vlan_tci = data->vlan_tci;
-		skb->vlan_proto = data->vlan_proto;
-	}
+	if (data->vlan_proto)
+		__vlan_hwaccel_put_tag(skb, data->vlan_proto, data->vlan_tci);
 
 	skb_copy_to_linear_data_offset(skb, -data->size, data->mac, data->size);
 	__skb_push(skb, data->encap_size);
@@ -749,8 +747,12 @@ static int br_nf_dev_queue_xmit(struct net *net, struct sock *sk, struct sk_buff
 
 		data = this_cpu_ptr(&brnf_frag_data_storage);
 
-		data->vlan_tci = skb->vlan_tci;
-		data->vlan_proto = skb->vlan_proto;
+		if (skb_vlan_tag_present(skb)) {
+			data->vlan_tci = skb->vlan_tci;
+			data->vlan_proto = skb->vlan_proto;
+		} else
+			data->vlan_proto = 0;
+
 		data->encap_size = nf_bridge_encap_header_len(skb);
 		data->size = ETH_HLEN + data->encap_size;
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 26aec23..33a0ba0 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -818,7 +818,7 @@ static inline int br_vlan_get_tag(const struct sk_buff *skb, u16 *vid)
 	int err = 0;
 
 	if (skb_vlan_tag_present(skb)) {
-		*vid = skb_vlan_tag_get(skb) & VLAN_VID_MASK;
+		*vid = skb_vlan_tag_get_id(skb);
 	} else {
 		*vid = 0;
 		err = -EINVAL;
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index b6de4f4..ef94664 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -377,7 +377,7 @@ struct sk_buff *br_handle_vlan(struct net_bridge *br,
 	}
 
 	if (v->flags & BRIDGE_VLAN_INFO_UNTAGGED)
-		skb->vlan_tci = 0;
+		__vlan_hwaccel_clear_tag(skb);
 out:
 	return skb;
 }
@@ -444,8 +444,8 @@ static bool __allowed_ingress(const struct net_bridge *br,
 			__vlan_hwaccel_put_tag(skb, br->vlan_proto, pvid);
 		else
 			/* Priority-tagged Frame.
-			 * At this point, We know that skb->vlan_tci had
-			 * VLAN_TAG_PRESENT bit and its VID field was 0x000.
+			 * At this point, We know that skb->vlan_tci VID
+			 * field was 0x000.
 			 * We update only VID field and preserve PCP field.
 			 */
 			skb->vlan_tci |= pvid;
diff --git a/net/core/dev.c b/net/core/dev.c
index bffb525..1773204 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4166,7 +4166,7 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
 		 * and set skb->priority like in vlan_do_receive()
 		 * For the time being, just ignore Priority Code Point
 		 */
-		skb->vlan_tci = 0;
+		__vlan_hwaccel_clear_tag(skb);
 	}
 
 	type = skb->protocol;
@@ -4413,7 +4413,9 @@ static void gro_list_prepare(struct napi_struct *napi, struct sk_buff *skb)
 		}
 
 		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
-		diffs |= p->vlan_tci ^ skb->vlan_tci;
+		diffs |= skb_vlan_tag_present(p) ^ skb_vlan_tag_present(skb);
+		if (skb_vlan_tag_present(p))
+			diffs |= p->vlan_tci ^ skb->vlan_tci;
 		diffs |= skb_metadata_dst_cmp(p, skb);
 		if (maclen == ETH_HLEN)
 			diffs |= compare_ether_header(skb_mac_header(p),
@@ -4651,7 +4653,7 @@ static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb)
 	__skb_pull(skb, skb_headlen(skb));
 	/* restore the reserve we had after netdev_alloc_skb_ip_align() */
 	skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN - skb_headroom(skb));
-	skb->vlan_tci = 0;
+	__vlan_hwaccel_clear_tag(skb);
 	skb->dev = napi->dev;
 	skb->skb_iif = 0;
 	skb->encapsulation = 0;
diff --git a/net/core/filter.c b/net/core/filter.c
index 56b4358..b4537c6 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -188,22 +188,17 @@ static u32 convert_skb_access(int skb_field, int dst_reg, int src_reg,
 		break;
 
 	case SKF_AD_VLAN_TAG:
-	case SKF_AD_VLAN_TAG_PRESENT:
 		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
-		BUILD_BUG_ON(VLAN_TAG_PRESENT != 0x1000);
 
 		/* dst_reg = *(u16 *) (src_reg + offsetof(vlan_tci)) */
 		*insn++ = BPF_LDX_MEM(BPF_H, dst_reg, src_reg,
 				      offsetof(struct sk_buff, vlan_tci));
-		if (skb_field == SKF_AD_VLAN_TAG) {
-			*insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg,
-						~VLAN_TAG_PRESENT);
-		} else {
-			/* dst_reg >>= 12 */
-			*insn++ = BPF_ALU32_IMM(BPF_RSH, dst_reg, 12);
-			/* dst_reg &= 1 */
-			*insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg, 1);
-		}
+		break;
+	case SKF_AD_VLAN_TAG_PRESENT:
+		*insn++ = BPF_LDX_MEM(BPF_B, dst_reg, src_reg, PKT_VLAN_PRESENT_OFFSET());
+		if (PKT_VLAN_PRESENT_BIT)
+			*insn++ = BPF_ALU32_IMM(BPF_RSH, dst_reg, PKT_VLAN_PRESENT_BIT);
+		*insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg, 1);
 		break;
 	}
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b45cd14..66fb686 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4565,7 +4565,7 @@ int skb_vlan_pop(struct sk_buff *skb)
 	int err;
 
 	if (likely(skb_vlan_tag_present(skb))) {
-		skb->vlan_tci = 0;
+		__vlan_hwaccel_clear_tag(skb);
 	} else {
 		if (unlikely(!eth_type_vlan(skb->protocol)))
 			return 0;
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index fed3d29..0004a54 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -120,7 +120,7 @@ int __iptunnel_pull_header(struct sk_buff *skb, int hdr_len,
 	}
 
 	skb_clear_hash_if_not_l4(skb);
-	skb->vlan_tci = 0;
+	__vlan_hwaccel_clear_tag(skb);
 	skb_set_queue_mapping(skb, 0);
 	skb_scrub_packet(skb, xnet);
 
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index be7627b..f268bb9 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -1111,8 +1111,9 @@ static int nfqa_parse_bridge(struct nf_queue_entry *entry,
 		if (!tb[NFQA_VLAN_TCI] || !tb[NFQA_VLAN_PROTO])
 			return -EINVAL;
 
-		entry->skb->vlan_tci = ntohs(nla_get_be16(tb[NFQA_VLAN_TCI]));
-		entry->skb->vlan_proto = nla_get_be16(tb[NFQA_VLAN_PROTO]);
+		__vlan_hwaccel_put_tag(entry->skb,
+			nla_get_be16(tb[NFQA_VLAN_PROTO]),
+			ntohs(nla_get_be16(tb[NFQA_VLAN_TCI])));
 	}
 
 	if (nfqa[NFQA_L2HDR]) {
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 514f7bc..a3f0982 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -277,8 +277,7 @@ static int push_vlan(struct sk_buff *skb, struct sw_flow_key *key,
 		key->eth.vlan.tci = vlan->vlan_tci;
 		key->eth.vlan.tpid = vlan->vlan_tpid;
 	}
-	return skb_vlan_push(skb, vlan->vlan_tpid,
-			     ntohs(vlan->vlan_tci) & ~VLAN_TAG_PRESENT);
+	return skb_vlan_push(skb, vlan->vlan_tpid, ntohs(vlan->vlan_tci));
 }
 
 /* 'src' is already properly masked. */
@@ -704,8 +703,10 @@ static int ovs_vport_output(struct net *net, struct sock *sk, struct sk_buff *sk
 	__skb_dst_copy(skb, data->dst);
 	*OVS_CB(skb) = data->cb;
 	skb->inner_protocol = data->inner_protocol;
-	skb->vlan_tci = data->vlan_tci;
-	skb->vlan_proto = data->vlan_proto;
+	if (data->vlan_proto)
+		__vlan_hwaccel_put_tag(skb, data->vlan_proto, data->vlan_tci);
+	else
+		__vlan_hwaccel_clear_tag(skb);
 
 	/* Reconstruct the MAC header.  */
 	skb_push(skb, data->l2_len);
@@ -749,8 +750,8 @@ static void prepare_frag(struct vport *vport, struct sk_buff *skb,
 	data->cb = *OVS_CB(skb);
 	data->inner_protocol = skb->inner_protocol;
 	data->network_offset = orig_network_offset;
-	data->vlan_tci = skb->vlan_tci;
-	data->vlan_proto = skb->vlan_proto;
+	data->vlan_tci = skb_vlan_tag_present(skb) ? skb->vlan_tci : 0;
+	data->vlan_proto = skb_vlan_tag_present(skb) ? skb->vlan_proto : 0;
 	data->mac_proto = mac_proto;
 	data->l2_len = hlen;
 	memcpy(&data->l2_data, skb->data, hlen);
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 08aa926..5e4579a 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -327,7 +327,7 @@ static int parse_vlan_tag(struct sk_buff *skb, struct vlan_head *key_vh)
 		return -ENOMEM;
 
 	vh = (struct vlan_head *)skb->data;
-	key_vh->tci = vh->tci | htons(VLAN_TAG_PRESENT);
+	key_vh->tci = vh->tci;
 	key_vh->tpid = vh->tpid;
 
 	__skb_pull(skb, sizeof(struct vlan_head));
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index f61cae7..f5115ed 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -57,8 +57,8 @@ struct ovs_tunnel_info {
 };
 
 struct vlan_head {
-	__be16 tpid; /* Vlan type. Generally 802.1q or 802.1ad.*/
-	__be16 tci;  /* 0 if no VLAN, VLAN_TAG_PRESENT set otherwise. */
+	__be16 tpid; /* Vlan type. Generally 802.1q or 802.1ad. 0 if no VLAN*/
+	__be16 tci;
 };
 
 #define OVS_SW_FLOW_KEY_METADATA_SIZE			\
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index d19044f..6ae5218 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -835,8 +835,6 @@ static int validate_vlan_from_nlattrs(const struct sw_flow_match *match,
 				      u64 key_attrs, bool inner,
 				      const struct nlattr **a, bool log)
 {
-	__be16 tci = 0;
-
 	if (!((key_attrs & (1 << OVS_KEY_ATTR_ETHERNET)) &&
 	      (key_attrs & (1 << OVS_KEY_ATTR_ETHERTYPE)) &&
 	       eth_type_vlan(nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE])))) {
@@ -850,20 +848,11 @@ static int validate_vlan_from_nlattrs(const struct sw_flow_match *match,
 		return -EINVAL;
 	}
 
-	if (a[OVS_KEY_ATTR_VLAN])
-		tci = nla_get_be16(a[OVS_KEY_ATTR_VLAN]);
-
-	if (!(tci & htons(VLAN_TAG_PRESENT))) {
-		if (tci) {
-			OVS_NLERR(log, "%s TCI does not have VLAN_TAG_PRESENT bit set.",
-				  (inner) ? "C-VLAN" : "VLAN");
-			return -EINVAL;
-		} else if (nla_len(a[OVS_KEY_ATTR_ENCAP])) {
-			/* Corner case for truncated VLAN header. */
-			OVS_NLERR(log, "Truncated %s header has non-zero encap attribute.",
-				  (inner) ? "C-VLAN" : "VLAN");
-			return -EINVAL;
-		}
+	if (!a[OVS_KEY_ATTR_VLAN] && nla_len(a[OVS_KEY_ATTR_ENCAP])) {
+		/* Corner case for truncated VLAN header. */
+		OVS_NLERR(log, "Truncated %s header has non-zero encap attribute.",
+			(inner) ? "C-VLAN" : "VLAN");
+		return -EINVAL;
 	}
 
 	return 1;
@@ -873,12 +862,9 @@ static int validate_vlan_mask_from_nlattrs(const struct sw_flow_match *match,
 					   u64 key_attrs, bool inner,
 					   const struct nlattr **a, bool log)
 {
-	__be16 tci = 0;
 	__be16 tpid = 0;
-	bool encap_valid = !!(match->key->eth.vlan.tci &
-			      htons(VLAN_TAG_PRESENT));
-	bool i_encap_valid = !!(match->key->eth.cvlan.tci &
-				htons(VLAN_TAG_PRESENT));
+	bool encap_valid = !!match->key->eth.vlan.tpid;
+	bool i_encap_valid = !!match->key->eth.cvlan.tpid;
 
 	if (!(key_attrs & (1 << OVS_KEY_ATTR_ENCAP))) {
 		/* Not a VLAN. */
@@ -891,9 +877,6 @@ static int validate_vlan_mask_from_nlattrs(const struct sw_flow_match *match,
 		return -EINVAL;
 	}
 
-	if (a[OVS_KEY_ATTR_VLAN])
-		tci = nla_get_be16(a[OVS_KEY_ATTR_VLAN]);
-
 	if (a[OVS_KEY_ATTR_ETHERTYPE])
 		tpid = nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]);
 
@@ -902,11 +885,6 @@ static int validate_vlan_mask_from_nlattrs(const struct sw_flow_match *match,
 			  (inner) ? "C-VLAN" : "VLAN", ntohs(tpid));
 		return -EINVAL;
 	}
-	if (!(tci & htons(VLAN_TAG_PRESENT))) {
-		OVS_NLERR(log, "%s TCI mask does not have exact match for VLAN_TAG_PRESENT bit.",
-			  (inner) ? "C-VLAN" : "VLAN");
-		return -EINVAL;
-	}
 
 	return 1;
 }
@@ -958,7 +936,7 @@ static int parse_vlan_from_nlattrs(struct sw_flow_match *match,
 	if (err)
 		return err;
 
-	encap_valid = !!(match->key->eth.vlan.tci & htons(VLAN_TAG_PRESENT));
+	encap_valid = !!match->key->eth.vlan.tpid;
 	if (encap_valid) {
 		err = __parse_vlan_from_nlattrs(match, key_attrs, true, a,
 						is_mask, log);
@@ -1974,12 +1952,12 @@ static inline void add_nested_action_end(struct sw_flow_actions *sfa,
 static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 				  const struct sw_flow_key *key,
 				  int depth, struct sw_flow_actions **sfa,
-				  __be16 eth_type, __be16 vlan_tci, bool log);
+				  __be16 eth_type, __be16 vlan_tci, bool has_vlan, bool log);
 
 static int validate_and_copy_sample(struct net *net, const struct nlattr *attr,
 				    const struct sw_flow_key *key, int depth,
 				    struct sw_flow_actions **sfa,
-				    __be16 eth_type, __be16 vlan_tci, bool log)
+				    __be16 eth_type, __be16 vlan_tci, bool has_vlan, bool log)
 {
 	const struct nlattr *attrs[OVS_SAMPLE_ATTR_MAX + 1];
 	const struct nlattr *probability, *actions;
@@ -2017,7 +1995,7 @@ static int validate_and_copy_sample(struct net *net, const struct nlattr *attr,
 		return st_acts;
 
 	err = __ovs_nla_copy_actions(net, actions, key, depth + 1, sfa,
-				     eth_type, vlan_tci, log);
+				     eth_type, vlan_tci, has_vlan, log);
 	if (err)
 		return err;
 
@@ -2358,7 +2336,7 @@ static int copy_action(const struct nlattr *from,
 static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 				  const struct sw_flow_key *key,
 				  int depth, struct sw_flow_actions **sfa,
-				  __be16 eth_type, __be16 vlan_tci, bool log)
+				  __be16 eth_type, __be16 vlan_tci, bool has_vlan, bool log)
 {
 	u8 mac_proto = ovs_key_mac_proto(key);
 	const struct nlattr *a;
@@ -2436,6 +2414,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			if (mac_proto != MAC_PROTO_ETHERNET)
 				return -EINVAL;
 			vlan_tci = htons(0);
+			has_vlan = 0;
 			break;
 
 		case OVS_ACTION_ATTR_PUSH_VLAN:
@@ -2444,9 +2423,8 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			vlan = nla_data(a);
 			if (!eth_type_vlan(vlan->vlan_tpid))
 				return -EINVAL;
-			if (!(vlan->vlan_tci & htons(VLAN_TAG_PRESENT)))
-				return -EINVAL;
 			vlan_tci = vlan->vlan_tci;
+			has_vlan = 1;
 			break;
 
 		case OVS_ACTION_ATTR_RECIRC:
@@ -2460,7 +2438,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			/* Prohibit push MPLS other than to a white list
 			 * for packets that have a known tag order.
 			 */
-			if (vlan_tci & htons(VLAN_TAG_PRESENT) ||
+			if (has_vlan ||
 			    (eth_type != htons(ETH_P_IP) &&
 			     eth_type != htons(ETH_P_IPV6) &&
 			     eth_type != htons(ETH_P_ARP) &&
@@ -2472,8 +2450,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 		}
 
 		case OVS_ACTION_ATTR_POP_MPLS:
-			if (vlan_tci & htons(VLAN_TAG_PRESENT) ||
-			    !eth_p_mpls(eth_type))
+			if (has_vlan || !eth_p_mpls(eth_type))
 				return -EINVAL;
 
 			/* Disallow subsequent L2.5+ set and mpls_pop actions
@@ -2506,7 +2483,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 
 		case OVS_ACTION_ATTR_SAMPLE:
 			err = validate_and_copy_sample(net, a, key, depth, sfa,
-						       eth_type, vlan_tci, log);
+						       eth_type, vlan_tci, has_vlan, log);
 			if (err)
 				return err;
 			skip_copy = true;
@@ -2530,7 +2507,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 		case OVS_ACTION_ATTR_POP_ETH:
 			if (mac_proto != MAC_PROTO_ETHERNET)
 				return -EINVAL;
-			if (vlan_tci & htons(VLAN_TAG_PRESENT))
+			if (has_vlan)
 				return -EINVAL;
 			mac_proto = MAC_PROTO_ETHERNET;
 			break;
@@ -2565,7 +2542,7 @@ int ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 
 	(*sfa)->orig_len = nla_len(attr);
 	err = __ovs_nla_copy_actions(net, attr, key, 0, sfa, key->eth.type,
-				     key->eth.vlan.tci, log);
+				     key->eth.vlan.tci, !!key->eth.vlan.tpid, log);
 	if (err)
 		ovs_nla_free_flow_actions(*sfa);
 
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 19e0dba..8d56380 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -62,7 +62,7 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
 		/* extract existing tag (and guarantee no hw-accel tag) */
 		if (skb_vlan_tag_present(skb)) {
 			tci = skb_vlan_tag_get(skb);
-			skb->vlan_tci = 0;
+			__vlan_hwaccel_clear_tag(skb);
 		} else {
 			/* in-payload vlan tag, pop it */
 			err = __skb_vlan_pop(skb, &tci);
-- 
2.10.2
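
For readers following the conversions above, here is a minimal userspace sketch of the tag-state model the patch moves to: "tag present" becomes its own flag instead of bit 0x1000 inside vlan_tci, so all 16 bits of the TCI can carry data. Everything prefixed model_ or MODEL_ is hypothetical and is not the kernel API. The header changes that define the real helpers are not part of this section, so it is an assumption here that clearing a tag only drops the flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_VID_MASK 0x0fff	/* low 12 bits of the TCI carry the VLAN ID */

/* Hypothetical stand-in for the three sk_buff fields the patch touches. */
struct model_skb {
	uint16_t vlan_proto;	/* TPID, kept in host order for simplicity */
	uint16_t vlan_tci;	/* full 16-bit TCI: PCP(3) | DEI(1) | VID(12) */
	bool vlan_present;	/* separate flag, no longer a bit in vlan_tci */
};

/* Models skb_vlan_tag_present(): consult the flag, never the TCI value. */
static inline bool model_tag_present(const struct model_skb *skb)
{
	return skb->vlan_present;
}

/* Models __vlan_hwaccel_put_tag(): store proto and TCI, mark tag present. */
static inline void model_put_tag(struct model_skb *skb, uint16_t proto,
				 uint16_t tci)
{
	skb->vlan_proto = proto;
	skb->vlan_tci = tci;
	skb->vlan_present = true;
}

/* Models __vlan_hwaccel_clear_tag(). Assumption: only the flag is dropped,
 * so a stale vlan_tci may remain and must not be read without a check.
 */
static inline void model_clear_tag(struct model_skb *skb)
{
	skb->vlan_present = false;
}

/* Models skb_vlan_tag_get_id(): the VID is the low 12 bits of the TCI. */
static inline uint16_t model_tag_get_id(const struct model_skb *skb)
{
	return skb->vlan_tci & MODEL_VID_MASK;
}
```

With this layout a TCI of 0x1000 (the bit that used to double as VLAN_TAG_PRESENT) round-trips unchanged through model_put_tag(), and a tagged frame with TCI 0 stays distinguishable from an untagged one.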

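Two patterns recur in the hunks above: the fragmentation paths (br_netfilter and openvswitch) store vlan_proto == 0 to mean "no tag" in their saved per-CPU data, and gro_list_prepare() compares the present flags first and the TCI only when a tag is there. The sketch below models both in userspace C. The model_ names are hypothetical, TPIDs are kept in host order for simplicity, and the sentinel relies on the assumption that a real TPID (for example 0x8100 or 0x88a8) is never 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the sk_buff VLAN fields. */
struct model_skb {
	uint16_t vlan_proto;
	uint16_t vlan_tci;
	bool vlan_present;
};

/* Hypothetical stand-in for the saved per-fragment state. */
struct model_frag_data {
	uint16_t vlan_proto;	/* 0 means "no tag was present" */
	uint16_t vlan_tci;
};

/* Save side, as in prepare_frag(): copy the tag only if one is present. */
static inline void model_save_tag(struct model_frag_data *data,
				  const struct model_skb *skb)
{
	data->vlan_tci = skb->vlan_present ? skb->vlan_tci : 0;
	data->vlan_proto = skb->vlan_present ? skb->vlan_proto : 0;
}

/* Restore side, as in ovs_vport_output(): proto 0 means clear the tag. */
static inline void model_restore_tag(struct model_skb *skb,
				     const struct model_frag_data *data)
{
	if (data->vlan_proto) {
		skb->vlan_proto = data->vlan_proto;
		skb->vlan_tci = data->vlan_tci;
		skb->vlan_present = true;
	} else {
		skb->vlan_present = false;
	}
}

/* GRO-style comparison: nonzero if the two packets differ in VLAN state.
 * The TCI is compared only when p carries a tag, so stale TCI values in
 * untagged packets cannot cause a false mismatch.
 */
static inline unsigned long model_vlan_diffs(const struct model_skb *p,
					     const struct model_skb *skb)
{
	unsigned long diffs;

	diffs = (unsigned long)(p->vlan_present ^ skb->vlan_present);
	if (p->vlan_present)
		diffs |= (unsigned long)(p->vlan_tci ^ skb->vlan_tci);
	return diffs;
}
```

As in the gro_list_prepare() hunk, the comparison deliberately leaves vlan_proto out. Whether that is sufficient for mixed 802.1Q/802.1ad traffic is outside what this sketch tries to show.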