* [PATCH net] net: stmmac: Fix lack of link transition for fixed PHYs
From: Florian Fainelli @ 2016-11-14 1:50 UTC (permalink / raw)
To: netdev
Cc: davem, Florian Fainelli, Giuseppe Cavallaro, Alexandre Torgue,
open list
Commit 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch
is attached") added some logic to avoid polling the fixed PHY and
therefore invoking the adjust_link callback more than once, since this
is a fixed PHY and link events won't be generated.
This works fine the first time, because we start with phydev->irq =
PHY_POLL, so we call adjust_link, then we set phydev->irq =
PHY_IGNORE_INTERRUPT and we stop polling the PHY.
Now, if we call ndo_close(), which both calls phy_stop() and does an
explicit netif_carrier_off(), we end up with the link down. Upon calling
ndo_open() again, despite starting the PHY state machine, we have
PHY_IGNORE_INTERRUPT set, and we generate no link event at all, so the
link is permanently down.
Fixes: 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch is attached")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
Alexandre, Peppe,
The original patch is already a hack, but since this is a bugfix, I took the
same approach that you did here to backport this to -stable kernels.
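For readers following along, the UP/DOWN/UP sequence described in the commit
message can be modelled in plain user-space C. This is only a sketch under
stated assumptions: every toy_* name is hypothetical, the enum values are
stand-ins rather than the kernel's definitions, and the "state machine" is
reduced to "adjust_link is only invoked while polling".

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for PHY_POLL / PHY_IGNORE_INTERRUPT; values are illustrative. */
enum toy_irq { TOY_PHY_POLL = -1, TOY_PHY_IGNORE_INTERRUPT = -2 };

struct toy_phy {
	enum toy_irq irq;	/* models phydev->irq */
	bool carrier;		/* models the netdev carrier state */
};

/* Models stmmac_adjust_link() on a fixed link: link up, then stop polling. */
static void toy_adjust_link(struct toy_phy *phy)
{
	phy->carrier = true;
	phy->irq = TOY_PHY_IGNORE_INTERRUPT;
}

/* Models ndo_open(); apply_fix mimics resetting irq in stmmac_init_phy(). */
static void toy_open(struct toy_phy *phy, bool apply_fix)
{
	assert(phy);
	if (apply_fix)
		phy->irq = TOY_PHY_POLL;
	/* Simplification: a link event is only generated while polling. */
	if (phy->irq == TOY_PHY_POLL)
		toy_adjust_link(phy);
}

/* Models ndo_close(): phy_stop() plus an explicit netif_carrier_off(). */
static void toy_close(struct toy_phy *phy)
{
	assert(phy);
	phy->carrier = false;
}

/* Runs open, close, open and reports whether the carrier came back. */
static bool toy_up_down_up(bool apply_fix)
{
	struct toy_phy phy = { .irq = TOY_PHY_POLL, .carrier = false };

	toy_open(&phy, apply_fix);
	toy_close(&phy);
	toy_open(&phy, apply_fix);
	return phy.carrier;
}
```

In this model toy_up_down_up(false) leaves the carrier down after the second
open, while toy_up_down_up(true) brings it back, which is the behaviour the
patch is after.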
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 10909c9c0033..03dbf8e89c4c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -882,6 +882,13 @@ static int stmmac_init_phy(struct net_device *dev)
return -ENODEV;
}
+ /* stmmac_adjust_link will change this to PHY_IGNORE_INTERRUPT to avoid
+ * subsequent PHY polling, make sure we force a link transition if
+ * we have a UP/DOWN/UP transition
+ */
+ if (phydev->is_pseudo_fixed_link)
+ phydev->irq = PHY_POLL;
+
pr_debug("stmmac_init_phy: %s: attached to PHY (UID 0x%x)"
" Link = %d\n", dev->name, phydev->phy_id, phydev->link);
--
2.9.3
* Re: [PATCH net-next 03/11] net: dsa: mv88e6xxx: Add the mv88e6390 family
From: Vivien Didelot @ 2016-11-14 2:05 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: netdev, Andrew Lunn
In-Reply-To: <1478832823-31471-4-git-send-email-andrew@lunn.ch>
Hi Andrew,
Andrew Lunn <andrew@lunn.ch> writes:
> -- compatible : Should be one of "marvell,mv88e6085",
> +- compatible : Should be one of "marvell,mv88e6085" or
> + "marvell,mv88e6390"
Just curious here, mv88e6085 was chosen because it was the smallest
product ID supported. Following that logic, shouldn't mv88e6190 be
chosen here instead of mv88e6390?
> +static const struct mv88e6xxx_ops mv88e6390_ops = {
> + .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
> + .phy_read = mv88e6xxx_g2_smi_phy_read,
> + .phy_write = mv88e6xxx_g2_smi_phy_write,
> + .port_set_link = mv88e6xxx_port_set_link,
> + .port_set_duplex = mv88e6xxx_port_set_duplex,
> + .port_set_rgmii_delay = mv88e6390_port_set_rgmii_delay,
> + .port_set_speed = mv88e6390_port_set_speed,
> +};
> +
> +static const struct mv88e6xxx_ops mv88e6390x_ops = {
> + .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
> + .phy_read = mv88e6xxx_g2_smi_phy_read,
> + .phy_write = mv88e6xxx_g2_smi_phy_write,
> + .port_set_link = mv88e6xxx_port_set_link,
> + .port_set_duplex = mv88e6xxx_port_set_duplex,
> + .port_set_rgmii_delay = mv88e6390_port_set_rgmii_delay,
> + .port_set_speed = mv88e6390x_port_set_speed,
> +};
Even if it is a bit more verbose, I'd intentionally keep one
mv88e6xxx_ops structure per chip. Using per-family structures is
error-prone and simpler is better here.
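A rough sketch of the per-chip layout, written as stand-alone C rather than
driver code. The toy_* names, the chip names and the speed limits are all
invented for illustration; only the idea of one ops structure per chip comes
from the comment above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down ops table; the real structure has more hooks. */
struct toy_ops {
	int (*port_set_speed)(int port, int speed);
};

/* Invented limits, for illustration only. */
static int toy_set_speed_1g(int port, int speed)
{
	(void)port;
	return speed <= 1000 ? 0 : -1;
}

static int toy_set_speed_10g(int port, int speed)
{
	(void)port;
	return speed <= 10000 ? 0 : -1;
}

/* One structure per chip, even when two chips share every hook today. */
static const struct toy_ops toy_chip_a_ops = { .port_set_speed = toy_set_speed_1g };
static const struct toy_ops toy_chip_b_ops = { .port_set_speed = toy_set_speed_1g };
static const struct toy_ops toy_chip_c_ops = { .port_set_speed = toy_set_speed_10g };

struct toy_chip_info {
	const char *name;
	const struct toy_ops *ops;
};

static const struct toy_chip_info toy_chips[] = {
	{ "chip-a", &toy_chip_a_ops },
	{ "chip-b", &toy_chip_b_ops },
	{ "chip-c", &toy_chip_c_ops },
};

/* Returns the ops for a chip name, or NULL when the chip is unknown. */
static const struct toy_ops *toy_lookup_ops(const char *name)
{
	size_t i;

	assert(name);
	for (i = 0; i < sizeof(toy_chips) / sizeof(toy_chips[0]); i++)
		if (strcmp(toy_chips[i].name, name) == 0)
			return toy_chips[i].ops;
	return NULL;
}
```

The duplication between chip-a and chip-b is deliberate: changing one chip
later touches exactly one structure.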
Thanks,
Vivien
* Re: [PATCH net-next 05/11] net: dsa: mv88e6xxx: Add comment about family a device belongs to
From: Vivien Didelot @ 2016-11-14 2:08 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: netdev, Andrew Lunn
In-Reply-To: <1478832823-31471-6-git-send-email-andrew@lunn.ch>
Hi Andrew,
Andrew Lunn <andrew@lunn.ch> writes:
> Knowing the family of device belongs to helps with picking the ops
> implementation which is appropriate to the device. So add a comment to
> each structure of ops.
This commit is not necessary. mv88e6xxx_ops structure must be per-chip,
and the family information is already described in patch 03/11.
Thanks,
Vivien
* Re: [net] 2ab9fb18c4: kernel BUG at include/linux/skbuff.h:1935!
From: Ye Xiaolong @ 2016-11-14 2:14 UTC (permalink / raw)
To: Eric Dumazet
Cc: lkp, netdev, Willem de Bruijn, Alexei Starovoitov,
Alexander Duyck, Jojy Varghese, Tom Herbert, Yibin Yang,
David Miller
In-Reply-To: <1479088020.8455.41.camel@edumazet-glaptop3.roam.corp.google.com>
On 11/13, Eric Dumazet wrote:
>On Mon, 2016-11-14 at 07:49 +0800, kernel test robot wrote:
>> FYI, we noticed the following commit:
>
>
>> in testcase: kbuild
>> with following parameters:
>>
>> runtime: 300s
>> nr_task: 50%
>> cpufreq_governor: performance
>>
>>
>>
>>
>> on test machine: 8 threads Intel(R) Atom(TM) CPU C2750 @ 2.40GHz with 16G memory
>>
>> caused below changes:
>>
>>
>> +-------------------------------------------------------+------------+------------+
>> | | cdb26d3387 | 2ab9fb18c4 |
>> +-------------------------------------------------------+------------+------------+
>> | boot_successes | 10 | 3 |
>> | boot_failures | 0 | 9 |
>> | kernel_BUG_at_include/linux/skbuff.h | 0 | 8 |
>> | invalid_opcode:#[##]SMP | 0 | 8 |
>> | RIP:eth_type_trans | 0 | 8 |
>> | Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 0 | 5 |
>> | WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup | 0 | 1 |
>> | calltrace:parport_pc_init | 0 | 1 |
>> | calltrace:SyS_finit_module | 0 | 1 |
>> | WARNING:at_lib/kobject.c:#kobject_add_internal | 0 | 1 |
>> +-------------------------------------------------------+------------+------------+
>>
>>
>>
>> [ 20.491020] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>> [ 20.502988] Sending DHCP requests .
>> [ 20.506729] ------------[ cut here ]------------
>> [ 20.511369] kernel BUG at include/linux/skbuff.h:1935!
>> [ 20.517893] invalid opcode: 0000 [#1] SMP
>> [ 20.521902] Modules linked in:
>> [ 20.524979] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.9.0-rc3-00286-g2ab9fb1 #1
>> [ 20.532463] Hardware name: Supermicro SYS-5018A-TN4/A1SAi, BIOS 1.1a 08/27/2015
>> [ 20.539768] task: ffff8804456c2480 task.stack: ffffc90001920000
>> [ 20.545684] RIP: 0010:[<ffffffff81837b48>] [<ffffffff81837b48>] eth_type_trans+0xe8/0x140
>> [ 20.553972] RSP: 0018:ffff88047fd03db8 EFLAGS: 00010297
>> [ 20.559283] RAX: 0000000000000158 RBX: ffff88047d8ae600 RCX: 0000000000001073
>> [ 20.566415] RDX: ffff88047bf07dc0 RSI: ffff88047d8a4000 RDI: ffff88047dac0f00
>> [ 20.573546] RBP: ffff88047fd03e20 R08: ffff88047d8a4000 R09: 0000000000000800
>> [ 20.580678] R10: ffff88047bf07ec0 R11: ffffea0011f6e400 R12: ffff88047dac0f00
>> [ 20.587810] R13: ffff880457413000 R14: ffffc90002129000 R15: 000000000000015e
>> [ 20.594946] FS: 0000000000000000(0000) GS:ffff88047fd00000(0000) knlGS:0000000000000000
>> [ 20.603032] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 20.608775] CR2: 00007fffadfb4ef0 CR3: 000000047ee07000 CR4: 00000000001006e0
>> [ 20.615906] Stack:
>> [ 20.617927] ffffffff816905a7 ffffea0011f6e400 ffffea0000000008 ffff88047d8ae450
>> [ 20.625403] ffff88047d8ae400 0000004000000166 ffffea0011f6e400 0000ffff00000000
>> [ 20.632873] 0000000000000040 0000000000000000 ffff88047d8ae450 ffff88047d8b1140
>> [ 20.640352] Call Trace:
>> [ 20.642805] <IRQ>
>> [ 20.644740] [<ffffffff816905a7>] ? igb_clean_rx_irq+0x6a7/0x7d0
>> [ 20.650760] [<ffffffff81690a52>] igb_poll+0x382/0x700
>> [ 20.655904] [<ffffffff8146edd9>] ? timerqueue_add+0x59/0xb0
>> [ 20.661564] [<ffffffff8180f2d7>] net_rx_action+0x217/0x360
>> [ 20.667137] [<ffffffff81957ef4>] __do_softirq+0x104/0x2ab
>> [ 20.672624] [<ffffffff81086961>] irq_exit+0xf1/0x100
>> [ 20.677673] [<ffffffff81957c34>] do_IRQ+0x54/0xd0
>> [ 20.682466] [<ffffffff81955acc>] common_interrupt+0x8c/0x8c
>> [ 20.688123] <EOI>
>> [ 20.690054] [<ffffffff817c1d12>] ? cpuidle_enter_state+0x122/0x2e0
>> [ 20.696333] [<ffffffff817c1f07>] cpuidle_enter+0x17/0x20
>> [ 20.701733] [<ffffffff810c64c3>] call_cpuidle+0x23/0x40
>> [ 20.707045] [<ffffffff810c66f4>] cpu_startup_entry+0x114/0x200
>> [ 20.712964] [<ffffffff81051c87>] start_secondary+0x107/0x130
>> [ 20.718708] Code: 00 04 00 00 c9 c3 48 33 86 70 03 00 00 48 c1 e0 10 48 85 c0 0f b6 87 90 00 00 00 75 28 83 e0 f8 83 c8 01 88 87 90 00 00 00 eb 82 <0f> 0b 0f b6 87 90 00 00 00 83 e0 f8 83 c8 03 88 87 90 00 00 00
>> [ 20.738722] RIP [<ffffffff81837b48>] eth_type_trans+0xe8/0x140
>> [ 20.744662] RSP <ffff88047fd03db8>
>> [ 20.748160] ---[ end trace 153440bf1ca2e6fc ]---
>> [ 20.748165] ------------[ cut here ]------------
>>
>>
>> To reproduce:
>>
>> git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>> cd lkp-tests
>> bin/lkp install job.yaml # job file is attached in this email
>> bin/lkp run job.yaml
>>
>>
>>
>> Thanks,
>> Kernel Test Robot
>
>
>Hi guys.
>
>I took a look at the commit again and I do not see how this can happen.
>
>Are you sure patch was properly applied ?
>
>In particular, the following extract is obscure for me :
>
>
>> https://github.com/0day-ci/linux Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839
>> commit 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb ("net: __skb_flow_dissect() must cap its return value")
>>
Hi,
The above two lines mean that the 0day repo set up a new branch
"Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839"
based on net/master, then applied your patch on top of it;
the commit id is 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb.
Thanks,
Xiaolong
>
>Thanks.
>
>
* Re: [PATCH net-next 07/11] net: dsa: mv88e6xxx: Add mv88e6390 statistics unit init
From: Vivien Didelot @ 2016-11-14 2:29 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: netdev, Andrew Lunn
In-Reply-To: <1478832823-31471-8-git-send-email-andrew@lunn.ch>
Hi Andrew,
Andrew Lunn <andrew@lunn.ch> writes:
> The statistics unit on the mv88e6390 needs to the configured in a
> different register to the others as to what histogram statistics is
> should return.
Can you re-phrase the above please?
> +static int mv88e6390_stats_init(struct mv88e6xxx_chip *chip)
> +{
> + u16 val;
> + int err;
> +
> + err = mv88e6xxx_g1_read(chip, GLOBAL_CONTROL_2, &val);
> + if (err)
> + return err;
> +
> + val |= GLOBAL_CONTROL_2_HIST_RX_TX;
> +
> + err = mv88e6xxx_g1_write(chip, GLOBAL_CONTROL_2, val);
> +
> + return err;
> +}
Can you please move this Global 1 specific helper into global1.c under an
ordered snippet such as:
/* Offset 0x1C: Global Control 2 */
int mv88e6xxx_g1_set_foo(struct mv88e6xxx_chip *chip)
{
...
}
I'd like internal SMI devices to be self-documented in their specific
files and easy to hack on for new developers. Ordered helpers will help.
Also, the helper should reflect what it really does. It is used to set
the Histogram Counters Mode. So please name it accordingly, something
like mv88e6xxx_g1_set_hist_count_mode().
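As a stand-alone illustration of the shape such a helper could take, here is
a sketch against a mocked register file. Everything prefixed toy_/TOY_ is
hypothetical, including the bit mask; only the 0x1C offset and the
read-modify-write flow are taken from the mail and the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_G1_NUM_REGS			32
#define TOY_G1_CONTROL_2		0x1c	/* offset as named in the mail */
#define TOY_G1_CONTROL_2_HIST_RX_TX	0x00c0	/* invented mask for the demo */

/* Mocked Global 1 register file standing in for the SMI device. */
struct toy_chip {
	uint16_t g1_regs[TOY_G1_NUM_REGS];
};

static int toy_g1_read(struct toy_chip *chip, int reg, uint16_t *val)
{
	if (reg < 0 || reg >= TOY_G1_NUM_REGS)
		return -1;
	*val = chip->g1_regs[reg];
	return 0;
}

static int toy_g1_write(struct toy_chip *chip, int reg, uint16_t val)
{
	if (reg < 0 || reg >= TOY_G1_NUM_REGS)
		return -1;
	chip->g1_regs[reg] = val;
	return 0;
}

/* Offset 0x1C: Global Control 2 */

/* Sets the histogram counters mode bits, preserving the other bits. */
static int toy_g1_set_hist_count_mode(struct toy_chip *chip)
{
	uint16_t val;
	int err;

	assert(chip);
	err = toy_g1_read(chip, TOY_G1_CONTROL_2, &val);
	if (err)
		return err;

	val |= TOY_G1_CONTROL_2_HIST_RX_TX;

	return toy_g1_write(chip, TOY_G1_CONTROL_2, val);
}
```

Keeping the offset comment directly above the helper is what makes the file
read like a register map.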
Thanks,
Vivien
* Re: [PATCH net-next v1] bpf: Use u64_to_user_ptr()
From: Alexei Starovoitov @ 2016-11-14 2:38 UTC (permalink / raw)
To: Mickaël Salaün
Cc: netdev, Alexei Starovoitov, Arnd Bergmann, Daniel Borkmann
In-Reply-To: <20161113184403.15222-1-mic@digikod.net>
On Sun, Nov 13, 2016 at 07:44:03PM +0100, Mickaël Salaün wrote:
> Replace the custom u64_to_ptr() function with the u64_to_user_ptr()
> macro.
>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
Thanks for following up on this one.
Acked-by: Alexei Starovoitov <ast@kernel.org>
* Re: [PATCH net-next 08/11] net: dsa: mv88e6xxx: Add stats_get_sset_count to ops structure
From: Vivien Didelot @ 2016-11-14 2:47 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: netdev, Andrew Lunn
In-Reply-To: <1478832823-31471-9-git-send-email-andrew@lunn.ch>
Hi Andrew,
Andrew Lunn <andrew@lunn.ch> writes:
> Different families have different sets of statistics. Abstract this
> using a stats_get_sset_count op. Each stat has a bitmap, and the ops
> implementer uses a bit map mask to count the statistics which apply
> for the family.
> -static int mv88e6xxx_get_sset_count(struct dsa_switch *ds)
> +static int _mv88e6xxx_get_sset_count(struct mv88e6xxx_chip *chip, int types)
Looks good overall. But please don't re-introduce underscore-prefixed
helpers. If I'm not mistaken, stats are a Global 1 feature, so ordered
explicit helpers in global1.c will be perfect.
If the stats code is huge, don't hesitate to move it into a
global1_stats.c file, as you wish. But we have to keep it
self-documented and easy to follow for new developers.
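The bitmap-mask counting described in the quoted commit message could look
roughly like this in isolation. It is a sketch only: the toy_* names, the
type bits, the stat names and the two "families" are made up for the
example and do not reflect which counters any real chip has.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stat classes, one bit each. */
#define TOY_STATS_TYPE_BANK0	(1u << 0)
#define TOY_STATS_TYPE_BANK1	(1u << 1)
#define TOY_STATS_TYPE_PORT	(1u << 2)
#define TOY_STATS_TYPE_ALL	(TOY_STATS_TYPE_BANK0 | TOY_STATS_TYPE_BANK1 | \
				 TOY_STATS_TYPE_PORT)

struct toy_hw_stat {
	const char *name;
	unsigned int type;	/* which class this counter belongs to */
};

static const struct toy_hw_stat toy_hw_stats[] = {
	{ "in_good_octets",	TOY_STATS_TYPE_BANK0 },
	{ "out_unicast",	TOY_STATS_TYPE_BANK0 },
	{ "in_discards",	TOY_STATS_TYPE_BANK1 },
	{ "in_filtered",	TOY_STATS_TYPE_BANK1 },
	{ "sw_in_discards",	TOY_STATS_TYPE_PORT },
};

/* Counts the stats whose class is selected by the types mask. */
static int toy_get_sset_count(unsigned int types)
{
	size_t i;
	int count = 0;

	assert((types & ~TOY_STATS_TYPE_ALL) == 0);
	for (i = 0; i < sizeof(toy_hw_stats) / sizeof(toy_hw_stats[0]); i++)
		if (toy_hw_stats[i].type & types)
			count++;
	return count;
}

/* Per-family ops implementations only differ in the mask they pass. */
static int toy_family_a_sset_count(void)
{
	return toy_get_sset_count(TOY_STATS_TYPE_BANK0 | TOY_STATS_TYPE_PORT);
}

static int toy_family_b_sset_count(void)
{
	return toy_get_sset_count(TOY_STATS_TYPE_BANK0 | TOY_STATS_TYPE_BANK1);
}
```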
Thanks,
Vivien
* Re: [PATCH net-next 05/11] net: dsa: mv88e6xxx: Add comment about family a device belongs to
From: Andrew Lunn @ 2016-11-14 2:48 UTC (permalink / raw)
To: Vivien Didelot; +Cc: David Miller, netdev
In-Reply-To: <8737iubype.fsf@ketchup.i-did-not-set--mail-host-address--so-tickle-me>
On Mon, Nov 14, 2016 at 01:08:13PM +1100, Vivien Didelot wrote:
> Hi Andrew,
>
> Andrew Lunn <andrew@lunn.ch> writes:
>
> > Knowing the family of device belongs to helps with picking the ops
> > implementation which is appropriate to the device. So add a comment to
> > each structure of ops.
>
> This commit is not necessary. mv88e6xxx_ops structure must be per-chip,
> and the family information is already described in patch 03/11.
I disagree. I made a lot of errors adding the right per family handler
to these structures, simply because it is not obvious what family a
device belongs to when looking at the structure.
Andrew
* Re: [PATCH net 2/3] bpf, mlx5: fix various refcount/prog issues in mlx5e_xdp_set
From: Alexei Starovoitov @ 2016-11-14 2:49 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, bblanco, tariqt, zhiyisun, ranas, netdev
In-Reply-To: <03741f7075af64e83d23add379bdab41204396b0.1479080215.git.daniel@iogearbox.net>
On Mon, Nov 14, 2016 at 01:43:41AM +0100, Daniel Borkmann wrote:
> There are multiple issues in mlx5e_xdp_set():
>
> 1) prog can be NULL, so calling unconditionally into bpf_prog_add(prog,
> priv->params.num_channels) can end badly.
>
> 2) The batched bpf_prog_add() should be done at an earlier point in
> time. This makes sure that we cannot fail anymore at the time we
> want to set the program for each channel. This only means that we
> have to undo the bpf_prog_add() in case we return early due to
> reset or device not in MLX5E_STATE_OPENED yet. Note, err is 0 here.
>
> 3) When swapping the priv->xdp_prog, then no extra reference count must
> be taken since we got that from call path via dev_change_xdp_fd()
> already. Otherwise, we'd never be able to free the program. Also,
> bpf_prog_add() without checking the return code could fail.
>
> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
...
> +static inline void bpf_prog_sub(struct bpf_prog *prog, int i)
> +{
> +}
> +
> static inline void bpf_prog_put(struct bpf_prog *prog)
> {
> }
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 751e806..a0fca9f 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -682,6 +682,17 @@ struct bpf_prog *bpf_prog_add(struct bpf_prog *prog, int i)
> }
> EXPORT_SYMBOL_GPL(bpf_prog_add);
>
> +void bpf_prog_sub(struct bpf_prog *prog, int i)
> +{
> + /* Only to be used for undoing previous bpf_prog_add() in some
> + * error path. We still know that another entity in our call
> + * path holds a reference to the program, thus atomic_sub() can
> + * be safely used in such cases!
> + */
> + WARN_ON(atomic_sub_return(i, &prog->aux->refcnt) == 0);
> +}
> +EXPORT_SYMBOL_GPL(bpf_prog_sub);
The patches look good. I'm only worried about a net/net-next merge
conflict here. (I would have to deal with it as well.)
So instead of copying the above helper, can we apply net-next's
'bpf, mlx4: fix prog refcount in mlx4_en_try_alloc_resources error path'
patch to net without the mlx4_xdp_set hunk and then apply
the rest of this patch?
Even better, could this patch 2/3 be sent to net-next?
Yes, it's an issue, but a very small one. There is no security
concern here, so I would prefer to avoid a merge conflict.
Did you do a test merge of net/net-next by any chance?
Maybe I'm overreacting.
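For context, the "take the batched references up front, undo them on an
early return" pattern from the quoted commit message can be sketched with
C11 atomics. This is a user-space model with hypothetical toy_* names; it
does not reproduce the actual mlx5e_xdp_set() flow or the kernel's
bpf_prog_add() error handling.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_prog {
	atomic_int refcnt;
};

/* Takes i references in one go (models a batched add). */
static void toy_prog_add(struct toy_prog *prog, int i)
{
	atomic_fetch_add(&prog->refcnt, i);
}

/*
 * Undoes a previous toy_prog_add() in an error path. The caller must still
 * hold its own reference, so the count can never legitimately reach zero.
 */
static void toy_prog_sub(struct toy_prog *prog, int i)
{
	int remaining = atomic_fetch_sub(&prog->refcnt, i) - i;

	assert(remaining > 0);
}

/* Attaches prog to num_channels channels; prog may be NULL (detach). */
static int toy_xdp_set(struct toy_prog *prog, int num_channels, bool opened)
{
	/* Issues 1 and 2: guard against NULL, take all references first. */
	if (prog)
		toy_prog_add(prog, num_channels);

	if (!opened) {
		/* Early return: give the batched references back. */
		if (prog)
			toy_prog_sub(prog, num_channels);
		return 0;
	}

	/* From here on nothing can fail; each channel owns one reference. */
	return 0;
}
```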
* [PATCH net-next] mdio: Demote print from info to debug in mdio_driver_register
From: Florian Fainelli @ 2016-11-14 3:01 UTC (permalink / raw)
To: netdev; +Cc: davem, andrew, Florian Fainelli
While it is useful to know which MDIO driver is being registered, demote
the pr_info() to a pr_debug().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/phy/mdio_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/phy/mdio_device.c b/drivers/net/phy/mdio_device.c
index 9c88e6749b9a..43c8fd46504b 100644
--- a/drivers/net/phy/mdio_device.c
+++ b/drivers/net/phy/mdio_device.c
@@ -144,7 +144,7 @@ int mdio_driver_register(struct mdio_driver *drv)
struct mdio_driver_common *mdiodrv = &drv->mdiodrv;
int retval;
- pr_info("mdio_driver_register: %s\n", mdiodrv->driver.name);
+ pr_debug("mdio_driver_register: %s\n", mdiodrv->driver.name);
mdiodrv->driver.bus = &mdio_bus_type;
mdiodrv->driver.probe = mdio_probe;
--
2.9.3
* Re: [LKP] [net] 2ab9fb18c4: kernel BUG at include/linux/skbuff.h:1935!
From: Fengguang Wu @ 2016-11-14 3:11 UTC (permalink / raw)
To: Ye Xiaolong
Cc: Eric Dumazet, Alexander Duyck, Willem de Bruijn, netdev,
Alexei Starovoitov, Jojy Varghese, Tom Herbert, Yibin Yang, lkp,
David Miller
In-Reply-To: <20161114021420.GC31218@yexl-desktop>
>>Hi guys.
>>
>>I took a look at the commit again and I do not see how this can happen.
>>
>>Are you sure patch was properly applied ?
>>
>>In particular, the following extract is obscure for me :
>>
>>
>>> https://github.com/0day-ci/linux Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839
>>> commit 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb ("net: __skb_flow_dissect() must cap its return value")
>>>
>
>Hi,
>
>The above two lines mean that the 0day repo set up a new branch
>"Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839"
>based on net/master, then applied your patch on top of it;
>the commit id is 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb.
Xiaolong, it may be more helpful to show the base tree that we apply
the patch to, and the final URL:
https://github.com/0day-ci/linux/tree/Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839
Thanks,
Fengguang
* Re: [PATCH net-next 00/11] Start adding support for mv88e6390 family
From: David Miller @ 2016-11-14 3:39 UTC (permalink / raw)
To: andrew; +Cc: netdev, vivien.didelot
In-Reply-To: <20161113202403.GB18258@lunn.ch>
From: Andrew Lunn <andrew@lunn.ch>
Date: Sun, 13 Nov 2016 21:24:03 +0100
> What seems to be the issue is you said you have accepted:
>
> [PATCH net-next 0/2] Fixes for port refactoring
> https://marc.info/?l=linux-netdev&m=147880114928996&w=1
>
> Yet i don't see these in net-next. And i based this patchset on a tree
> which included the fixes. Hence they are not applying.
>
> Have the fixes really been accepted?
Accepted but not pushed out properly, sorry.
This should be sorted out now.
* Re: [PATCH net-next] mdio: Demote print from info to debug in mdio_driver_register
From: Andrew Lunn @ 2016-11-14 3:44 UTC (permalink / raw)
To: Florian Fainelli; +Cc: netdev, davem
In-Reply-To: <20161114030117.25169-1-f.fainelli@gmail.com>
On Sun, Nov 13, 2016 at 07:01:17PM -0800, Florian Fainelli wrote:
> While it is useful to know which MDIO driver is being registered, demote
> the pr_info() to a pr_debug().
>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
* Re: [PATCH v2 net-next 1/5] bpf: Refactor cgroups code in prep for new type
From: David Ahern @ 2016-11-14 3:51 UTC (permalink / raw)
To: Thomas Graf, Daniel Mack; +Cc: David Miller, netdev, ast, daniel, maheshb
In-Reply-To: <20161031174942.GF32374@pox.localdomain>
On 10/31/16 11:49 AM, Thomas Graf wrote:
> On 10/31/16 at 06:16pm, Daniel Mack wrote:
>> On 10/31/2016 06:05 PM, David Ahern wrote:
>>> On 10/31/16 11:00 AM, Daniel Mack wrote:
>>>> Yeah, I'm confused too. I changed that name in my v7 from
>>>> BPF_PROG_TYPE_CGROUP_SOCK to BPF_PROG_TYPE_CGROUP_SKB on David's
>>>> (Ahern) request. Why is it now renamed again?
>>>
>>> Thomas pushed back on adding another program type in favor of using
>>> subtypes. So this makes the program type generic to CGROUP and patch
>>> 2 in this v2 set added Mickaël's subtype patch with the socket
>>> mangling done that way in patch 3.
>>>
>>
>> Fine for me. I can change it around again.
>
> I would like to hear from Daniel B and Alexei as well. We need to
> decide whether to use subtypes consistently and treat prog types as
> something more high level or whether to bluntly introduce a new prog
> type for every distinct set of verifier limits. I will change lwt_bpf
> as well accordingly.
>
Alexei / Daniel - any comments/preferences on subtypes vs program types?
* Re: [PATCH 00/39] Netfilter updates for net-next
From: David Miller @ 2016-11-14 4:25 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1479075933-4491-1-git-send-email-pablo@netfilter.org>
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Sun, 13 Nov 2016 23:24:54 +0100
> The following patchset contains a second batch of Netfilter updates
> for your net-next tree. This includes a rework of the core hook
> infrastructure that improves Netfilter performance by ~15% according
> to synthetic benchmarks. Then, a large batch with ipset updates,
> including a new hash:ipmac set type, via Jozsef Kadlecsik. This also
> includes a couple of assorted updates.
Looks great, pulled, thanks!
* [PATCH net-next v3 0/7] vxlan: xmit improvements.
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
The following patch series improves the vxlan fast path, removes
duplicate code and simplifies the vxlan xmit code path.
v2-v3:
Removed unrelated warning fix from patch 2.
rearranged error handling from patch 3
Fixed stats updates in vxlan route lookup in patch 4
v1-v2:
Fix compilation error when IPv6 support is not enabled.
Pravin B Shelar (7):
vxlan: avoid vlan processing in vxlan device.
vxlan: avoid checking socket multiple times.
vxlan: simplify exception handling
vxlan: improve vxlan route lookup checks.
vxlan: simplify RTF_LOCAL handling.
vxlan: simplify vxlan xmit
vxlan: remove unsed vxlan_dev_dst_port()
drivers/net/vxlan.c | 285 +++++++++++++++++++++++-------------------------
include/linux/if_vlan.h | 16 ---
include/net/vxlan.h | 10 --
3 files changed, 137 insertions(+), 174 deletions(-)
--
1.9.1
* [PATCH net-next v3 1/7] vxlan: avoid vlan processing in vxlan device.
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479098638-4921-1-git-send-email-pshelar@ovn.org>
The VXLAN device does not have special handling for vlan tagging on egress.
Therefore it does not make sense to expose the vlan offloading feature.
This patch does not change vxlan functionality.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
---
drivers/net/vxlan.c | 9 +--------
include/linux/if_vlan.h | 16 ----------------
2 files changed, 1 insertion(+), 24 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index cb5cc7c..756d826 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1748,18 +1748,13 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
}
min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len
- + VXLAN_HLEN + iphdr_len
- + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
+ + VXLAN_HLEN + iphdr_len;
/* Need space for new headers (invalidates iph ptr) */
err = skb_cow_head(skb, min_headroom);
if (unlikely(err))
goto out_free;
- skb = vlan_hwaccel_push_inside(skb);
- if (WARN_ON(!skb))
- return -ENOMEM;
-
err = iptunnel_handle_offloads(skb, type);
if (err)
goto out_free;
@@ -2527,10 +2522,8 @@ static void vxlan_setup(struct net_device *dev)
dev->features |= NETIF_F_GSO_SOFTWARE;
dev->vlan_features = dev->features;
- dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
dev->hw_features |= NETIF_F_GSO_SOFTWARE;
- dev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
netif_keep_dst(dev);
dev->priv_flags |= IFF_NO_QUEUE;
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 3319d97..8d5fcd6 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -399,22 +399,6 @@ static inline struct sk_buff *__vlan_hwaccel_push_inside(struct sk_buff *skb)
skb->vlan_tci = 0;
return skb;
}
-/*
- * vlan_hwaccel_push_inside - pushes vlan tag to the payload
- * @skb: skbuff to tag
- *
- * Checks is tag is present in @skb->vlan_tci and if it is, it pushes the
- * VLAN tag from @skb->vlan_tci inside to the payload.
- *
- * Following the skb_unshare() example, in case of error, the calling function
- * doesn't have to worry about freeing the original skb.
- */
-static inline struct sk_buff *vlan_hwaccel_push_inside(struct sk_buff *skb)
-{
- if (skb_vlan_tag_present(skb))
- skb = __vlan_hwaccel_push_inside(skb);
- return skb;
-}
/**
* __vlan_hwaccel_put_tag - hardware accelerated VLAN inserting
--
1.9.1
* [PATCH net-next v3 2/7] vxlan: avoid checking socket multiple times.
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479098638-4921-1-git-send-email-pshelar@ovn.org>
Check the vxlan socket in vxlan6_get_route().
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
drivers/net/vxlan.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 756d826..9adeff9 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1830,6 +1830,7 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
#if IS_ENABLED(CONFIG_IPV6)
static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
+ struct vxlan_sock *sock6,
struct sk_buff *skb, int oif, u8 tos,
__be32 label,
const struct in6_addr *daddr,
@@ -1837,7 +1838,6 @@ static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
struct dst_cache *dst_cache,
const struct ip_tunnel_info *info)
{
- struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
struct dst_entry *ndst;
struct flowi6 fl6;
@@ -2069,11 +2069,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
struct dst_entry *ndst;
u32 rt6i_flags;
- if (!sock6)
- goto drop;
- sk = sock6->sock->sk;
-
- ndst = vxlan6_get_route(vxlan, skb,
+ ndst = vxlan6_get_route(vxlan, sock6, skb,
rdst ? rdst->remote_ifindex : 0, tos,
label, &dst->sin6.sin6_addr,
&src->sin6.sin6_addr,
@@ -2093,6 +2089,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
goto tx_error;
}
+ sk = sock6->sock->sk;
/* Bypass encapsulation if the destination is local */
rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
if (!info && rt6i_flags & RTF_LOCAL &&
@@ -2432,9 +2429,10 @@ static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
ip_rt_put(rt);
} else {
#if IS_ENABLED(CONFIG_IPV6)
+ struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
struct dst_entry *ndst;
- ndst = vxlan6_get_route(vxlan, skb, 0, info->key.tos,
+ ndst = vxlan6_get_route(vxlan, sock6, skb, 0, info->key.tos,
info->key.label, &info->key.u.ipv6.dst,
&info->key.u.ipv6.src, NULL, info);
if (IS_ERR(ndst))
--
1.9.1
* [PATCH net-next v3 4/7] vxlan: improve vxlan route lookup checks.
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479098638-4921-1-git-send-email-pshelar@ovn.org>
Move the route sanity checks to the respective vxlan[4/6]_get_route functions.
This allows us to perform all sanity checks before caching the dst so
that we can avoid these checks on subsequent packets.
This gives more accurate metadata information for packets from
fill_metadata_dst().
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
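As a rough illustration of the approach described above (the lookup
function reports why it failed, the caller maps the error to a counter),
here is a stand-alone C sketch. The toy_* helpers imitate the kernel's
ERR_PTR()/IS_ERR()/PTR_ERR() idiom but are re-implemented here and are not
the kernel definitions; the structures are hypothetical and heavily reduced.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encodes a negative errno in a pointer value. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when ptr carries an errno rather than a real address. */
static int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_dev {
	unsigned long collisions, tx_carrier_errors, tx_errors;
};

struct toy_route {
	const struct toy_dev *out_dev;	/* device the route points at */
};

/* Route lookup with the sanity checks folded in; NULL means "no route". */
static struct toy_route *toy_get_route(const struct toy_dev *dev,
				       struct toy_route *candidate)
{
	if (!candidate)
		return toy_err_ptr(-ENETUNREACH);
	if (candidate->out_dev == dev)
		return toy_err_ptr(-ELOOP);	/* circular route */
	return candidate;
}

/* Transmit path: translates the lookup error into the matching counter. */
static int toy_xmit(struct toy_dev *dev, struct toy_route *candidate)
{
	struct toy_route *rt;
	int err;

	assert(dev);
	rt = toy_get_route(dev, candidate);
	if (!toy_is_err(rt))
		return 0;

	err = (int)toy_ptr_err(rt);
	if (err == -ELOOP)
		dev->collisions++;
	else if (err == -ENETUNREACH)
		dev->tx_carrier_errors++;
	dev->tx_errors++;
	return err;
}
```

Note that the sketch reads the error code out of the returned pointer
before branching on it; the counters depend on that value being set.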
drivers/net/vxlan.c | 77 ++++++++++++++++++++++++++---------------------------
1 file changed, 38 insertions(+), 39 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 8bb58f6..aabb918 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1789,7 +1789,8 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
return 0;
}
-static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
+static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan, struct net_device *dev,
+ struct vxlan_sock *sock4,
struct sk_buff *skb, int oif, u8 tos,
__be32 daddr, __be32 *saddr,
struct dst_cache *dst_cache,
@@ -1799,6 +1800,9 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
struct rtable *rt = NULL;
struct flowi4 fl4;
+ if (!sock4)
+ return ERR_PTR(-EIO);
+
if (tos && !info)
use_cache = false;
if (use_cache) {
@@ -1816,16 +1820,26 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
fl4.saddr = *saddr;
rt = ip_route_output_key(vxlan->net, &fl4);
- if (!IS_ERR(rt)) {
+ if (likely(!IS_ERR(rt))) {
+ if (rt->dst.dev == dev) {
+ netdev_dbg(dev, "circular route to %pI4\n", &daddr);
+ ip_rt_put(rt);
+ return ERR_PTR(-ELOOP);
+ }
+
*saddr = fl4.saddr;
if (use_cache)
dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
+ } else {
+ netdev_dbg(dev, "no route to %pI4\n", &daddr);
+ return ERR_PTR(-ENETUNREACH);
}
return rt;
}
#if IS_ENABLED(CONFIG_IPV6)
static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
+ struct net_device *dev,
struct vxlan_sock *sock6,
struct sk_buff *skb, int oif, u8 tos,
__be32 label,
@@ -1861,8 +1875,16 @@ static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
err = ipv6_stub->ipv6_dst_lookup(vxlan->net,
sock6->sock->sk,
&ndst, &fl6);
- if (err < 0)
- return ERR_PTR(err);
+ if (unlikely(err < 0)) {
+ netdev_dbg(dev, "no route to %pI6\n", daddr);
+ return ERR_PTR(-ENETUNREACH);
+ }
+
+ if (unlikely(ndst->dev == dev)) {
+ netdev_dbg(dev, "circular route to %pI6\n", daddr);
+ dst_release(ndst);
+ return ERR_PTR(-ELOOP);
+ }
*saddr = fl6.saddr;
if (use_cache)
@@ -1929,8 +1951,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
union vxlan_addr *src;
struct vxlan_metadata _md;
struct vxlan_metadata *md = &_md;
- struct dst_entry *ndst = NULL;
__be16 src_port = 0, dst_port;
+ struct dst_entry *ndst = NULL;
__be32 vni, label;
__be16 df = 0;
__u8 tos, ttl;
@@ -2007,29 +2029,14 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
struct rtable *rt;
- if (!sock4)
- goto drop;
- sk = sock4->sock->sk;
-
- rt = vxlan_get_route(vxlan, skb,
+ rt = vxlan_get_route(vxlan, dev, sock4, skb,
rdst ? rdst->remote_ifindex : 0, tos,
dst->sin.sin_addr.s_addr,
&src->sin.sin_addr.s_addr,
dst_cache, info);
- if (IS_ERR(rt)) {
- netdev_dbg(dev, "no route to %pI4\n",
- &dst->sin.sin_addr.s_addr);
- dev->stats.tx_carrier_errors++;
- goto tx_error;
- }
-
- if (rt->dst.dev == dev) {
- netdev_dbg(dev, "circular route to %pI4\n",
- &dst->sin.sin_addr.s_addr);
- dev->stats.collisions++;
- ip_rt_put(rt);
+ if (IS_ERR(rt))
goto tx_error;
- }
+ sk = sock4->sock->sk;
/* Bypass encapsulation if the destination is local */
if (!info && rt->rt_flags & RTCF_LOCAL &&
@@ -2067,27 +2074,17 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
u32 rt6i_flags;
- ndst = vxlan6_get_route(vxlan, sock6, skb,
+ ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
rdst ? rdst->remote_ifindex : 0, tos,
label, &dst->sin6.sin6_addr,
&src->sin6.sin6_addr,
dst_cache, info);
if (IS_ERR(ndst)) {
- netdev_dbg(dev, "no route to %pI6\n",
- &dst->sin6.sin6_addr);
- dev->stats.tx_carrier_errors++;
ndst = NULL;
goto tx_error;
}
-
- if (ndst->dev == dev) {
- netdev_dbg(dev, "circular route to %pI6\n",
- &dst->sin6.sin6_addr);
- dev->stats.collisions++;
- goto tx_error;
- }
-
sk = sock6->sock->sk;
+
/* Bypass encapsulation if the destination is local */
rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
if (!info && rt6i_flags & RTF_LOCAL &&
@@ -2130,6 +2127,10 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
return;
tx_error:
+ if (err == -ELOOP)
+ dev->stats.collisions++;
+ else if (err == -ENETUNREACH)
+ dev->stats.tx_carrier_errors++;
dst_release(ndst);
dev->stats.tx_errors++;
kfree_skb(skb);
@@ -2411,9 +2412,7 @@ static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
struct rtable *rt;
- if (!sock4)
- return -EINVAL;
- rt = vxlan_get_route(vxlan, skb, 0, info->key.tos,
+ rt = vxlan_get_route(vxlan, dev, sock4, skb, 0, info->key.tos,
info->key.u.ipv4.dst,
&info->key.u.ipv4.src, NULL, info);
if (IS_ERR(rt))
@@ -2424,7 +2423,7 @@ static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
struct dst_entry *ndst;
- ndst = vxlan6_get_route(vxlan, sock6, skb, 0, info->key.tos,
+ ndst = vxlan6_get_route(vxlan, dev, sock6, skb, 0, info->key.tos,
info->key.label, &info->key.u.ipv6.dst,
&info->key.u.ipv6.src, NULL, info);
if (IS_ERR(ndst))
--
1.9.1
^ permalink raw reply related
* [PATCH net-next v3 3/7] vxlan: simplify exception handling
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479098638-4921-1-git-send-email-pshelar@ovn.org>
vxlan egress path error handling has become complicated, since it
needs to handle both IPv4 and IPv6 tunnel cases.
An earlier patch removes vlan handling from vxlan_build_skb(), so
vxlan_build_skb() no longer needs to free the skb, and we can simplify
the xmit path by having a single error-handling path for both types of
tunnels.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
drivers/net/vxlan.c | 46 +++++++++++++++++++---------------------------
1 file changed, 19 insertions(+), 27 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 9adeff9..8bb58f6 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1753,11 +1753,11 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
/* Need space for new headers (invalidates iph ptr) */
err = skb_cow_head(skb, min_headroom);
if (unlikely(err))
- goto out_free;
+ return err;
err = iptunnel_handle_offloads(skb, type);
if (err)
- goto out_free;
+ return err;
vxh = (struct vxlanhdr *) __skb_push(skb, sizeof(*vxh));
vxh->vx_flags = VXLAN_HF_VNI;
@@ -1781,16 +1781,12 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
if (vxflags & VXLAN_F_GPE) {
err = vxlan_build_gpe_hdr(vxh, vxflags, skb->protocol);
if (err < 0)
- goto out_free;
+ return err;
inner_protocol = skb->protocol;
}
skb_set_inner_protocol(skb, inner_protocol);
return 0;
-
-out_free:
- kfree_skb(skb);
- return err;
}
static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
@@ -1927,13 +1923,13 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
struct ip_tunnel_info *info;
struct vxlan_dev *vxlan = netdev_priv(dev);
struct sock *sk;
- struct rtable *rt = NULL;
const struct iphdr *old_iph;
union vxlan_addr *dst;
union vxlan_addr remote_ip, local_ip;
union vxlan_addr *src;
struct vxlan_metadata _md;
struct vxlan_metadata *md = &_md;
+ struct dst_entry *ndst = NULL;
__be16 src_port = 0, dst_port;
__be32 vni, label;
__be16 df = 0;
@@ -2009,6 +2005,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
if (dst->sa.sa_family == AF_INET) {
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
+ struct rtable *rt;
if (!sock4)
goto drop;
@@ -2030,7 +2027,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
netdev_dbg(dev, "circular route to %pI4\n",
&dst->sin.sin_addr.s_addr);
dev->stats.collisions++;
- goto rt_tx_error;
+ ip_rt_put(rt);
+ goto tx_error;
}
/* Bypass encapsulation if the destination is local */
@@ -2053,12 +2051,13 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT)
df = htons(IP_DF);
+ ndst = &rt->dst;
tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
- err = vxlan_build_skb(skb, &rt->dst, sizeof(struct iphdr),
+ err = vxlan_build_skb(skb, ndst, sizeof(struct iphdr),
vni, md, flags, udp_sum);
if (err < 0)
- goto xmit_tx_error;
+ goto tx_error;
udp_tunnel_xmit_skb(rt, sk, skb, src->sin.sin_addr.s_addr,
dst->sin.sin_addr.s_addr, tos, ttl, df,
@@ -2066,7 +2065,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
#if IS_ENABLED(CONFIG_IPV6)
} else {
struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
- struct dst_entry *ndst;
u32 rt6i_flags;
ndst = vxlan6_get_route(vxlan, sock6, skb,
@@ -2078,13 +2076,13 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
netdev_dbg(dev, "no route to %pI6\n",
&dst->sin6.sin6_addr);
dev->stats.tx_carrier_errors++;
+ ndst = NULL;
goto tx_error;
}
if (ndst->dev == dev) {
netdev_dbg(dev, "circular route to %pI6\n",
&dst->sin6.sin6_addr);
- dst_release(ndst);
dev->stats.collisions++;
goto tx_error;
}
@@ -2096,12 +2094,12 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
!(rt6i_flags & (RTCF_BROADCAST | RTCF_MULTICAST))) {
struct vxlan_dev *dst_vxlan;
- dst_release(ndst);
dst_vxlan = vxlan_find_vni(vxlan->net, vni,
dst->sa.sa_family, dst_port,
vxlan->flags);
if (!dst_vxlan)
goto tx_error;
+ dst_release(ndst);
vxlan_encap_bypass(skb, vxlan, dst_vxlan);
return;
}
@@ -2114,11 +2112,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
skb_scrub_packet(skb, xnet);
err = vxlan_build_skb(skb, ndst, sizeof(struct ipv6hdr),
vni, md, flags, udp_sum);
- if (err < 0) {
- dst_release(ndst);
- dev->stats.tx_errors++;
- return;
- }
+ if (err < 0)
+ goto tx_error;
+
udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
&src->sin6.sin6_addr,
&dst->sin6.sin6_addr, tos, ttl,
@@ -2130,17 +2126,13 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
drop:
dev->stats.tx_dropped++;
- goto tx_free;
+ dev_kfree_skb(skb);
+ return;
-xmit_tx_error:
- /* skb is already freed. */
- skb = NULL;
-rt_tx_error:
- ip_rt_put(rt);
tx_error:
+ dst_release(ndst);
dev->stats.tx_errors++;
-tx_free:
- dev_kfree_skb(skb);
+ kfree_skb(skb);
}
/* Transmit local packets over Vxlan
--
1.9.1
^ permalink raw reply related
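A minimal userspace sketch of the single error label that the "simplify exception handling" patch above converges on. It is not kernel code: every fake_* name is a hypothetical stand-in, and the only behaviour borrowed from the kernel is that releasing a NULL route is a no-op, which is what lets one label serve both the "lookup failed" and the "build failed" paths.

```c
#include <stddef.h>

/* Hypothetical stand-ins for dst_entry, sk_buff and the device stats. */
struct fake_dst { int refcnt; };
struct fake_skb { int freed; };
struct fake_stats { unsigned long tx_errors; };

/* Mirrors the dst_release() convention: a NULL argument is a no-op. */
static void fake_dst_release(struct fake_dst *dst)
{
	if (dst)
		dst->refcnt--;
}

static void fake_kfree_skb(struct fake_skb *skb)
{
	skb->freed = 1;
}

/* Build step that reports failure but leaves freeing to the caller. */
static int fake_build_skb(struct fake_skb *skb, int fail)
{
	(void)skb;
	return fail ? -1 : 0;
}

/* One exit label releases the route (if any), counts the error and
 * frees the packet, whichever step failed.
 */
static int fake_xmit_one(struct fake_skb *skb, struct fake_dst *route,
			 int fail_build, struct fake_stats *stats)
{
	struct fake_dst *ndst = NULL;
	int err;

	if (!route) {
		err = -2;		/* lookup failed, ndst stays NULL */
		goto tx_error;
	}
	ndst = route;
	ndst->refcnt++;			/* the lookup took a reference */

	err = fake_build_skb(skb, fail_build);
	if (err < 0)
		goto tx_error;

	/* Success: the transmit step consumes the skb and the reference. */
	fake_dst_release(ndst);
	fake_kfree_skb(skb);
	return 0;

tx_error:
	fake_dst_release(ndst);
	stats->tx_errors++;
	fake_kfree_skb(skb);
	return err;
}
```

Calling fake_xmit_one() with a NULL route or with fail_build set ends at the same label, and in both cases the packet is freed exactly once and the route reference count returns to its starting value.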
* [PATCH net-next v3 5/7] vxlan: simplify RTF_LOCAL handling.
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479098638-4921-1-git-send-email-pshelar@ovn.org>
Avoid duplicate code for handling RTF_LOCAL routes.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
drivers/net/vxlan.c | 85 ++++++++++++++++++++++++++++++++---------------------
1 file changed, 51 insertions(+), 34 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index aabb918..0b188d6 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1938,6 +1938,40 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
}
}
+static int encap_bypass_if_local(struct sk_buff *skb, struct net_device *dev,
+ struct vxlan_dev *vxlan, union vxlan_addr *daddr,
+ __be32 dst_port, __be32 vni, struct dst_entry *dst,
+ u32 rt_flags)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+ /* IPv6 rt-flags are checked against RTF_LOCAL, but the value of
+ * RTF_LOCAL is equal to RTCF_LOCAL. So to keep code simple
+ * we can use RTCF_LOCAL which works for ipv4 and ipv6 route entry.
+ */
+ BUILD_BUG_ON(RTCF_LOCAL != RTF_LOCAL);
+#endif
+ /* Bypass encapsulation if the destination is local */
+ if (rt_flags & RTCF_LOCAL &&
+ !(rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))) {
+ struct vxlan_dev *dst_vxlan;
+
+ dst_release(dst);
+ dst_vxlan = vxlan_find_vni(vxlan->net, vni,
+ daddr->sa.sa_family, dst_port,
+ vxlan->flags);
+ if (!dst_vxlan) {
+ dev->stats.tx_errors++;
+ kfree_skb(skb);
+
+ return -ENOENT;
+ }
+ vxlan_encap_bypass(skb, vxlan, dst_vxlan);
+ return 1;
+ }
+
+ return 0;
+}
+
static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
struct vxlan_rdst *rdst, bool did_rsc)
{
@@ -2036,27 +2070,19 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
dst_cache, info);
if (IS_ERR(rt))
goto tx_error;
- sk = sock4->sock->sk;
+ sk = sock4->sock->sk;
/* Bypass encapsulation if the destination is local */
- if (!info && rt->rt_flags & RTCF_LOCAL &&
- !(rt->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))) {
- struct vxlan_dev *dst_vxlan;
-
- ip_rt_put(rt);
- dst_vxlan = vxlan_find_vni(vxlan->net, vni,
- dst->sa.sa_family, dst_port,
- vxlan->flags);
- if (!dst_vxlan)
- goto tx_error;
- vxlan_encap_bypass(skb, vxlan, dst_vxlan);
- return;
- }
-
- if (!info)
+ if (!info) {
+ err = encap_bypass_if_local(skb, dev, vxlan, dst,
+ dst_port, vni, &rt->dst,
+ rt->rt_flags);
+ if (err)
+ return;
udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
- else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT)
+ } else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT) {
df = htons(IP_DF);
+ }
ndst = &rt->dst;
tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
@@ -2072,7 +2098,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
#if IS_ENABLED(CONFIG_IPV6)
} else {
struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
- u32 rt6i_flags;
ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
rdst ? rdst->remote_ifindex : 0, tos,
@@ -2085,24 +2110,16 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
}
sk = sock6->sock->sk;
- /* Bypass encapsulation if the destination is local */
- rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
- if (!info && rt6i_flags & RTF_LOCAL &&
- !(rt6i_flags & (RTCF_BROADCAST | RTCF_MULTICAST))) {
- struct vxlan_dev *dst_vxlan;
-
- dst_vxlan = vxlan_find_vni(vxlan->net, vni,
- dst->sa.sa_family, dst_port,
- vxlan->flags);
- if (!dst_vxlan)
- goto tx_error;
- dst_release(ndst);
- vxlan_encap_bypass(skb, vxlan, dst_vxlan);
- return;
- }
+ if (!info) {
+ u32 rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
- if (!info)
+ err = encap_bypass_if_local(skb, dev, vxlan, dst,
+ dst_port, vni, ndst,
+ rt6i_flags);
+ if (err)
+ return;
udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
+ }
tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
ttl = ttl ? : ip6_dst_hoplimit(ndst);
--
1.9.1
^ permalink raw reply related
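A minimal userspace sketch of the two ideas in the RTF_LOCAL patch above: a compile-time check that the IPv4 and IPv6 "local" flags share one value, and a helper with a three-way return (negative on error, 1 when the packet was bypassed, 0 when the caller should carry on encapsulating). The DEMO_* constants and demo_bypass_if_local() are hypothetical names chosen for illustration, and the flag values are assumptions rather than copies from the kernel headers.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative flag values only; treat them as assumptions. */
#define DEMO_RTCF_LOCAL     0x80000000u
#define DEMO_RTF_LOCAL      0x80000000u
#define DEMO_RTCF_BROADCAST 0x10000000u
#define DEMO_RTCF_MULTICAST 0x20000000u

/* Userspace counterpart of BUILD_BUG_ON(): the build fails if the two
 * flags ever diverge, so a single test can serve both address families.
 */
_Static_assert(DEMO_RTCF_LOCAL == DEMO_RTF_LOCAL,
	       "IPv4 and IPv6 local-route flags must match");

/* Returns:
 *   -ENOENT  destination is local but no matching device exists
 *            (the caller must not touch the packet again)
 *   1        packet was delivered locally, nothing left to do
 *   0        not a local unicast route, continue with encapsulation
 */
static int demo_bypass_if_local(uint32_t rt_flags, int have_local_peer)
{
	if ((rt_flags & DEMO_RTCF_LOCAL) &&
	    !(rt_flags & (DEMO_RTCF_BROADCAST | DEMO_RTCF_MULTICAST))) {
		if (!have_local_peer)
			return -ENOENT;
		return 1;
	}

	return 0;
}
```

Because both non-zero results mean the packet has already been consumed, a caller only needs "if (err) return;", which matches how the patch uses the helper in the IPv4 and IPv6 branches.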
* [PATCH net-next v3 6/7] vxlan: simplify vxlan xmit
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479098638-4921-1-git-send-email-pshelar@ovn.org>
Existing vxlan xmit function handles two distinct cases.
1. vxlan net device
2. vxlan lwt device.
By separating the initialization of these two cases, the egress path
looks better.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
---
drivers/net/vxlan.c | 78 +++++++++++++++++++++++------------------------------
1 file changed, 34 insertions(+), 44 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 0b188d6..411534c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1978,8 +1978,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
struct dst_cache *dst_cache;
struct ip_tunnel_info *info;
struct vxlan_dev *vxlan = netdev_priv(dev);
- struct sock *sk;
- const struct iphdr *old_iph;
+ const struct iphdr *old_iph = ip_hdr(skb);
union vxlan_addr *dst;
union vxlan_addr remote_ip, local_ip;
union vxlan_addr *src;
@@ -1988,7 +1987,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
__be16 src_port = 0, dst_port;
struct dst_entry *ndst = NULL;
__be32 vni, label;
- __be16 df = 0;
__u8 tos, ttl;
int err;
u32 flags = vxlan->flags;
@@ -1998,19 +1996,40 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
info = skb_tunnel_info(skb);
if (rdst) {
+ dst = &rdst->remote_ip;
+ if (vxlan_addr_any(dst)) {
+ if (did_rsc) {
+ /* short-circuited back to local bridge */
+ vxlan_encap_bypass(skb, vxlan, vxlan);
+ return;
+ }
+ goto drop;
+ }
+
dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
vni = rdst->remote_vni;
- dst = &rdst->remote_ip;
src = &vxlan->cfg.saddr;
dst_cache = &rdst->dst_cache;
+ md->gbp = skb->mark;
+ ttl = vxlan->cfg.ttl;
+ if (!ttl && vxlan_addr_multicast(dst))
+ ttl = 1;
+
+ tos = vxlan->cfg.tos;
+ if (tos == 1)
+ tos = ip_tunnel_get_dsfield(old_iph, skb);
+
+ if (dst->sa.sa_family == AF_INET)
+ udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
+ else
+ udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
+ label = vxlan->cfg.label;
} else {
if (!info) {
WARN_ONCE(1, "%s: Missing encapsulation instructions\n",
dev->name);
goto drop;
}
- dst_port = info->key.tp_dst ? : vxlan->cfg.dst_port;
- vni = tunnel_id_to_key32(info->key.tun_id);
remote_ip.sa.sa_family = ip_tunnel_info_af(info);
if (remote_ip.sa.sa_family == AF_INET) {
remote_ip.sin.sin_addr.s_addr = info->key.u.ipv4.dst;
@@ -2020,48 +2039,24 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
local_ip.sin6.sin6_addr = info->key.u.ipv6.src;
}
dst = &remote_ip;
+ dst_port = info->key.tp_dst ? : vxlan->cfg.dst_port;
+ vni = tunnel_id_to_key32(info->key.tun_id);
src = &local_ip;
dst_cache = &info->dst_cache;
- }
-
- if (vxlan_addr_any(dst)) {
- if (did_rsc) {
- /* short-circuited back to local bridge */
- vxlan_encap_bypass(skb, vxlan, vxlan);
- return;
- }
- goto drop;
- }
-
- old_iph = ip_hdr(skb);
-
- ttl = vxlan->cfg.ttl;
- if (!ttl && vxlan_addr_multicast(dst))
- ttl = 1;
-
- tos = vxlan->cfg.tos;
- if (tos == 1)
- tos = ip_tunnel_get_dsfield(old_iph, skb);
-
- label = vxlan->cfg.label;
- src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
- vxlan->cfg.port_max, true);
-
- if (info) {
+ if (info->options_len)
+ md = ip_tunnel_info_opts(info);
ttl = info->key.ttl;
tos = info->key.tos;
label = info->key.label;
udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
-
- if (info->options_len)
- md = ip_tunnel_info_opts(info);
- } else {
- md->gbp = skb->mark;
}
+ src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
+ vxlan->cfg.port_max, true);
if (dst->sa.sa_family == AF_INET) {
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
struct rtable *rt;
+ __be16 df = 0;
rt = vxlan_get_route(vxlan, dev, sock4, skb,
rdst ? rdst->remote_ifindex : 0, tos,
@@ -2071,7 +2066,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
if (IS_ERR(rt))
goto tx_error;
- sk = sock4->sock->sk;
/* Bypass encapsulation if the destination is local */
if (!info) {
err = encap_bypass_if_local(skb, dev, vxlan, dst,
@@ -2079,7 +2073,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
rt->rt_flags);
if (err)
return;
- udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
} else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT) {
df = htons(IP_DF);
}
@@ -2092,7 +2085,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
if (err < 0)
goto tx_error;
- udp_tunnel_xmit_skb(rt, sk, skb, src->sin.sin_addr.s_addr,
+ udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, src->sin.sin_addr.s_addr,
dst->sin.sin_addr.s_addr, tos, ttl, df,
src_port, dst_port, xnet, !udp_sum);
#if IS_ENABLED(CONFIG_IPV6)
@@ -2108,7 +2101,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
ndst = NULL;
goto tx_error;
}
- sk = sock6->sock->sk;
if (!info) {
u32 rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
@@ -2118,7 +2110,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
rt6i_flags);
if (err)
return;
- udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
}
tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
@@ -2129,13 +2120,12 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
if (err < 0)
goto tx_error;
- udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
+ udp_tunnel6_xmit_skb(ndst, sock6->sock->sk, skb, dev,
&src->sin6.sin6_addr,
&dst->sin6.sin6_addr, tos, ttl,
label, src_port, dst_port, !udp_sum);
#endif
}
-
return;
drop:
--
1.9.1
^ permalink raw reply related
* [PATCH net-next v3 7/7] vxlan: remove unused vxlan_dev_dst_port()
From: Pravin B Shelar @ 2016-11-14 4:43 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479098638-4921-1-git-send-email-pshelar@ovn.org>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
include/net/vxlan.h | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 308adc4..49a5920 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -281,16 +281,6 @@ struct vxlan_dev {
struct net_device *vxlan_dev_create(struct net *net, const char *name,
u8 name_assign_type, struct vxlan_config *conf);
-static inline __be16 vxlan_dev_dst_port(struct vxlan_dev *vxlan,
- unsigned short family)
-{
-#if IS_ENABLED(CONFIG_IPV6)
- if (family == AF_INET6)
- return inet_sk(vxlan->vn6_sock->sock->sk)->inet_sport;
-#endif
- return inet_sk(vxlan->vn4_sock->sock->sk)->inet_sport;
-}
-
static inline netdev_features_t vxlan_features_check(struct sk_buff *skb,
netdev_features_t features)
{
--
1.9.1
^ permalink raw reply related
* Re: [LKP] [net] 2ab9fb18c4: kernel BUG at include/linux/skbuff.h:1935!
From: Ye Xiaolong @ 2016-11-14 5:54 UTC (permalink / raw)
To: Fengguang Wu
Cc: Eric Dumazet, Alexander Duyck, Willem de Bruijn, netdev,
Alexei Starovoitov, Jojy Varghese, Tom Herbert, Yibin Yang, lkp,
David Miller
In-Reply-To: <20161114031130.yjb4anou24ede4ue@wfg-t540p.sh.intel.com>
On 11/14, Fengguang Wu wrote:
>>>Hi guys.
>>>
>>>I took a look at the commit again and I do not see how this can happen.
>>>
>>>Are you sure the patch was properly applied?
>>>
>>>In particular, the following extract is obscure to me:
>>>
>>>
>>>>https://github.com/0day-ci/linux Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839
>>>>commit 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb ("net: __skb_flow_dissect() must cap its return value")
>>>>
>>
>>Hi,
>>
>>The above two lines mean that the 0day repo set up a new branch
>>"Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839"
>>based on net/master and then applied your patch on top of it; the
>>commit id is 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb.
>
>Xiaolong, it may be more helpful to show the base tree to which we
>apply the patch, and the final URL:
>
>https://github.com/0day-ci/linux/tree/Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839
>
OK, I'll improve the appearance to make it clearer.
Thanks,
Xiaolong
>Thanks,
>Fengguang
^ permalink raw reply
* Re: [PATCH] genetlink: fix unsigned int comparison with less than zero
From: Cong Wang @ 2016-11-14 6:29 UTC (permalink / raw)
To: David Miller
Cc: Colin King, Johannes Berg, pravin shelar, Wei Yongjun,
Florian Westphal, Tycho Andersen, Tom Herbert,
Linux Kernel Network Developers, LKML
In-Reply-To: <20161113.121519.399311594808700910.davem@davemloft.net>
On Sun, Nov 13, 2016 at 9:15 AM, David Miller <davem@davemloft.net> wrote:
> I've committed the following to net-next:
>
> ====================
> [PATCH] genetlink: Make family a signed integer.
>
> The idr_alloc(), idr_remove(), et al. routines all expect IDs to be
> signed integers. Therefore make the genl_family member 'id' signed
> too.
This is exactly what I replied to Johannes.
Thanks for the fix!
^ permalink raw reply