* [PATCH] net_sched: blackhole: tell upper qdisc about dropped packets
From: Konstantin Khlebnikov @ 2018-06-15 10:27 UTC (permalink / raw)
To: netdev, David S. Miller; +Cc: Cong Wang, Jiri Pirko, Jamal Hadi Salim
When blackhole is used on top of a classful qdisc like hfsc, it breaks the
qlen and backlog counters because packets disappear without notice.
In HFSC, a non-zero qlen while all classes are inactive triggers a warning:
WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc]
and schedules watchdog work endlessly.
This patch returns __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS;
this flag tells the upper layer that the packet is gone and isn't queued.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
net/sched/sch_blackhole.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_blackhole.c b/net/sched/sch_blackhole.c
index c98a61e980ba..9c4c2bb547d7 100644
--- a/net/sched/sch_blackhole.c
+++ b/net/sched/sch_blackhole.c
@@ -21,7 +21,7 @@ static int blackhole_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 				     struct sk_buff **to_free)
 {
 	qdisc_drop(skb, sch, to_free);
-	return NET_XMIT_SUCCESS;
+	return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
 }
 static struct sk_buff *blackhole_dequeue(struct Qdisc *sch)
^ permalink raw reply related
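The accounting problem described in the blackhole patch above can be sketched in user space. This is only a toy model, not kernel code: the flag values are the ones used by the kernel headers, but `struct toy_parent`, `toy_blackhole_enqueue()` and `toy_parent_enqueue()` are hypothetical names, and the parent logic is a simplified assumption about how a classful qdisc counts a packet only when its child returns plain NET_XMIT_SUCCESS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return codes and flags, with the values used by the kernel headers. */
#define NET_XMIT_SUCCESS   0x00
#define NET_XMIT_DROP      0x01
#define __NET_XMIT_STOLEN  0x00010000
#define __NET_XMIT_BYPASS  0x00020000

/* Hypothetical stand-in for the counters kept by a classful parent. */
struct toy_parent {
	unsigned int qlen;   /* packets the parent believes are queued below it */
	unsigned int drops;  /* packets the parent knows were dropped */
};

/* Hypothetical child: the packet is always dropped, only the return code differs. */
static int toy_blackhole_enqueue(bool patched)
{
	return patched ? (NET_XMIT_SUCCESS | __NET_XMIT_BYPASS)
		       : NET_XMIT_SUCCESS;
}

/* Assumed parent behaviour: bump qlen only on a plain NET_XMIT_SUCCESS. */
static int toy_parent_enqueue(struct toy_parent *p, bool patched)
{
	int err;

	assert(p != NULL);
	err = toy_blackhole_enqueue(patched);
	if (err != NET_XMIT_SUCCESS) {
		/* Anything not marked as stolen is counted as a drop. */
		if (!(err & __NET_XMIT_STOLEN))
			p->drops++;
		return err;
	}
	p->qlen++;  /* the parent now expects a later dequeue to return this packet */
	return NET_XMIT_SUCCESS;
}
```

With the unpatched return value the toy parent ends up with a qlen of 1 for a packet that no longer exists, which is the stale-counter situation the commit message describes. With the patched return value qlen stays at 0.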
* Re: [PATCH] net: Fix device name resolving crash in default_device_exit()
From: Kirill Tkhai @ 2018-06-15 9:44 UTC (permalink / raw)
To: David Ahern, netdev
Cc: davem, daniel, jakub.kicinski, ast, linux, john.fastabend, brouer
In-Reply-To: <3c024347-a60e-69aa-42c9-fcb6244642cb@gmail.com>
On 14.06.2018 20:11, David Ahern wrote:
> On 6/14/18 6:38 AM, Kirill Tkhai wrote:
>> The following script makes kernel to crash since it can't obtain
>> a name for a device, when the name is occupied by another device:
>>
>> #!/bin/bash
>> ifconfig eth0 down
>> ifconfig eth1 down
>> index=`cat /sys/class/net/eth1/ifindex`
>> ip link set eth1 name dev$index
>> unshare -n sleep 1h &
>> pid=$!
>> while [[ "`readlink /proc/self/ns/net`" == "`readlink /proc/$pid/ns/net`" ]]; do continue; done
>> ip link set dev$index netns $pid
>> ip link set eth0 name dev$index
>> kill -9 $pid
>>
>> Kernel messages:
>>
>> virtio_net virtio1 dev3: renamed from eth1
>> virtio_net virtio0 dev3: renamed from eth0
>> default_device_exit: failed to move dev3 to init_net: -17
>> ------------[ cut here ]------------
>> kernel BUG at net/core/dev.c:8978!
>> invalid opcode: 0000 [#1] PREEMPT SMP
>> CPU: 1 PID: 276 Comm: kworker/u8:3 Not tainted 4.17.0+ #292
>> Workqueue: netns cleanup_net
>> RIP: 0010:default_device_exit+0x9c/0xb0
>> [stack trace snipped]
>>
>> This patch gives more variability during choosing new name
>> of device and fixes the problem.
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>> net/core/dev.c | 4 +---
>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 6e18242a1cae..6c9b9303ded6 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -8959,7 +8959,6 @@ static void __net_exit default_device_exit(struct net *net)
>> rtnl_lock();
>> for_each_netdev_safe(net, dev, aux) {
>> int err;
>> - char fb_name[IFNAMSIZ];
>>
>> /* Ignore unmoveable devices (i.e. loopback) */
>> if (dev->features & NETIF_F_NETNS_LOCAL)
>> @@ -8970,8 +8969,7 @@ static void __net_exit default_device_exit(struct net *net)
>> continue;
>>
>> /* Push remaining network devices to init_net */
>> - snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
>> - err = dev_change_net_namespace(dev, &init_net, fb_name);
>> + err = dev_change_net_namespace(dev, &init_net, "dev%d");
>> if (err) {
>> pr_emerg("%s: failed to move %s to init_net: %d\n",
>> __func__, dev->name, err);
>>
>
> This could cause repeated looping over __dev_alloc_name. If init_net has
> a large number of devices, it is going to be a performance bottleneck.
Hm, but is this a likely case: a real device is moved to a net ns, so it
has to be moved back to init_net? It seems that most devices moved to !init_net
are virtual, and they are just destroyed in default_device_exit_batch(). Or do we have
more devices to care about here?
I don't much want to insert something like the following here:
	if (__dev_get_by_name(&init_net, dev->name))
		snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
	err = dev_change_net_namespace(dev, &init_net, "dev%d");
because dev_change_net_namespace() is a generic interface that is used not only here,
and this would clutter the code for the sake of corner cases.
Maybe you have better ideas about this?
Kirill
^ permalink raw reply
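The trade-off discussed in the thread above (a fixed fallback name that can collide, versus a "dev%d" template that has to search for a free index) can be illustrated with a small user-space sketch. It is a simplified model and not the kernel's __dev_alloc_name(): `toy_name_in_use()` and `toy_alloc_name()` are hypothetical helpers, and the linear scan is an assumption made only to show why the search cost grows with the number of devices.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_IFNAMSIZ 16  /* same size as the kernel's IFNAMSIZ */

/* Hypothetical helper: is `name` already taken by one of `count` devices? */
static int toy_name_in_use(const char *const *names, int count, const char *name)
{
	for (int i = 0; i < count; i++)
		if (strcmp(names[i], name) == 0)
			return 1;
	return 0;
}

/*
 * Resolve a "dev%d"-style template to the lowest free index below `limit`.
 * Writes the chosen name to `out` and returns the index, or -1 if none is free.
 * Every candidate costs one walk over the device list.
 */
static int toy_alloc_name(const char *const *names, int count, int limit, char *out)
{
	assert(out != NULL);
	for (int idx = 0; idx < limit; idx++) {
		snprintf(out, TOY_IFNAMSIZ, "dev%d", idx);
		if (!toy_name_in_use(names, count, out))
			return idx;
	}
	return -1;
}
```

In this model a fixed name such as "dev3" simply fails when it is already taken, which corresponds to the -17 (-EEXIST) in the quoted kernel log, while the template always finds a name at the price of the repeated scans that David Ahern is concerned about.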
* Re: Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net
From: Cornelia Huck @ 2018-06-15 9:32 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Siwei Liu, Samudrala, Sridhar, Alexander Duyck, virtio-dev,
aaron.f.brown, Jiri Pirko, Jakub Kicinski, Netdev, qemu-devel,
virtualization
In-Reply-To: <20180615052743-mutt-send-email-mst@kernel.org>
On Fri, 15 Jun 2018 05:34:24 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jun 14, 2018 at 12:02:31PM +0200, Cornelia Huck wrote:
> > > > I am not all that familiar with how Qemu manages network devices. If we can
> > > > do all the
> > > > required management of the primary/standby devices within Qemu, that is
> > > > definitely a better
> > > > approach without upper layer involvement.
> > >
> > > Right. I would imagine in the extreme case the upper layer doesn't
> > > have to be involved at all if QEMU manages all hot plug/unplug logic.
> > > The management tool can supply passthrough device and virtio with the
> > > same group UUID, QEMU auto-manages the presence of the primary, and
> > > hot plug the device as needed before or after the migration.
> >
> > I do not really see how you can manage that kind of stuff in QEMU only.
>
> So right now failover is limited to pci passthrough devices only.
> The idea is to realize the vfio device but not expose it
> to guest. Have a separate command to expose it to guest.
> Hotunplug would also hide it from guest but not unrealize it.
So, this would not be real hot(un)plug, but 'hide it from the guest',
right? The concept of "we have it realized in QEMU, but the guest can't
discover and use it" should be translatable to non-pci as well (at
least for ccw).
>
> This will help ensure that e.g. on migration failure we can
> re-expose the device without risk of running out of resources.
Makes sense.
Should that 'hidden' state be visible/settable from outside as well
(e.g. via a property)? I guess yes, so that management software has a
chance to see whether a device is visible. Settable may be useful if we
find another use case for hiding realized devices.
^ permalink raw reply
* Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled
From: Michal Kubecek @ 2018-06-15 9:27 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Yuchung Cheng, netdev, Eric Dumazet, LKML
In-Reply-To: <alpine.DEB.2.20.1806151048000.29120@whs-18.cs.helsinki.fi>
On Fri, Jun 15, 2018 at 11:05:03AM +0300, Ilpo Järvinen wrote:
> On Thu, 14 Jun 2018, Michal Kubecek wrote:
> > The trace wouldn't look so nice but it can be reproduced even with more
> > data to send. I've copied an example below. I couldn't find a really
> > nice one quickly so that first few retransmits (17:22:13.865105 through
> > 17:23:05.841105) are without new data but starting at 17:23:58.189150,
> > you can see that sending new (previously unsent) data may not suffice to
> > break the loop.
>
> My point was that the new data segment bursts that occur if the sender
> isn't application limited indicate that there's something going wrong
> with FRTO. And that wrong is also what is causing that RTO loop because
> the sender doesn't see the previous FRTO recovery on second RTO. With
> my FRTO undo fix, (new_recovery || icsk->icsk_retransmits) will be false
> and that will prevent the RTO loop.
Yes, it would prevent the loop in this case (except it would be a bit
later, after second RTO rather than after first). But I'm not convinced
the logic of the patch is correct. If I understand it correctly, it
essentially changes "presumption of innocence" (if we get an ack past
what we retransmitted, we assume it was received earlier - i.e. would
have been sacked before if SACK was in use) to "presumption of guilt"
(whenever a retransmitted segment is acked, we assume nothing else acked
with it was received earlier). Or that you trade false negatives for
false positives.
Maybe I understand it wrong but it seems that you de facto prevent
Step (3b) from ever happening in non-SACK case.
> > > No! The window should not update window on ACKs the receiver intends to
> > > designate as "duplicate ACKs". That is not without some potential cost
> > > though as it requires delaying window updates up to the next cumulative
> > > ACK. In the non-SACK series one of the changes is fixing this for
> > > non-SACK Linux TCP flows.
> >
> > That sounds like a reasonable change (at least at the first glance,
> > I didn't think about it too deeply) but even if we fix Linux stack to
> > behave like this, we cannot force everyone else to do the same.
>
> Unfortunately I don't know what the other stacks besides Linux do. But
> for Linux, the cause for the changing receiver window is the receiver
> window auto-tuning and I'm not sure if other stacks have a similar
> feature (or if that affects (almost) all ACKs like in Linux).
The capture from my previous e-mail and some others I have seen indicate
that at least some implementations do not take care to never change
window size when responding to an out-of-order segment. That means that
even if we change Linux TCP this way (we might still need to send
a separate window update in some cases), we still cannot rely on others
doing the same.
I checked the capture attached to my previous e-mail again and there is
one thing where our F-RTO implementation (in 4.4, at least) is wrong,
IMHO. While the first ACK after "new data" (sent in (2b)) was a window
update (and therefore not dupack by definition) so that we could take
neither (3a) nor (3b), in some iterations there were further acks which
did not change window size. The text just before Step 1 says
The F-RTO algorithm does not specify actions for receiving
a segment that neither acknowledges new data nor is a duplicate
acknowledgment. The TCP sender SHOULD ignore such segments and
wait for a segment that either acknowledges new data or is
a duplicate acknowledgment.
My understanding is that this means that while the first ack after new
data is correctly ignored, the following ack which preserves window size
should be recognized as a dupack and (3a) should be taken.
Michal Kubecek
^ permalink raw reply
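The distinction Michal draws above between an ACK of new data, a duplicate ACK, and a segment that is neither (such as a pure window update) can be written down as a small classifier. This is only a sketch under stated assumptions: `struct toy_ack` and `toy_classify_ack()` are hypothetical names, the duplicate-ACK test keeps only the "same ACK number, same window, no payload" conditions, and it leaves out the other conditions of the standard definition (outstanding data, SYN and FIN off). It is not the Linux implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_ack_kind {
	TOY_ACK_NEW_DATA,   /* advances the cumulative ACK point */
	TOY_ACK_DUPLICATE,  /* same ACK number, same window, no payload */
	TOY_ACK_OTHER       /* e.g. a window update: F-RTO says to ignore it */
};

/* Hypothetical summary of the fields of an incoming ACK that matter here. */
struct toy_ack {
	uint32_t ack_seq;      /* cumulative acknowledgment number */
	uint32_t window;       /* advertised receive window */
	uint32_t payload_len;  /* bytes of data carried by the segment */
};

/* Classify `cur` relative to the previously received ACK `prev`. */
static enum toy_ack_kind toy_classify_ack(const struct toy_ack *prev,
					  const struct toy_ack *cur)
{
	assert(prev != NULL && cur != NULL);

	/* Signed difference keeps the comparison valid across sequence wraparound. */
	if ((int32_t)(cur->ack_seq - prev->ack_seq) > 0)
		return TOY_ACK_NEW_DATA;

	if (cur->ack_seq == prev->ack_seq &&
	    cur->window == prev->window &&
	    cur->payload_len == 0)
		return TOY_ACK_DUPLICATE;

	return TOY_ACK_OTHER;
}
```

Applied to the sequence from the capture, the first ACK after the new data changes the window and is classified as "other", while the following ACK that keeps the new window is classified as a duplicate, which is the point at which step (3a) would apply under Michal's reading.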
* Re: [PATCH net] hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload
From: Dan Carpenter @ 2018-06-15 9:14 UTC (permalink / raw)
To: haiyangz; +Cc: davem, netdev, olaf, sthemmin, linux-kernel, devel, vkuznets
In-Reply-To: <20180615012909.13440-1-haiyangz@linuxonhyperv.com>
On Thu, Jun 14, 2018 at 06:29:09PM -0700, Haiyang Zhang wrote:
> These structs are not in use right now, but will be used soon.
Btw, thank you for adding this information. I was wondering about that
in the first version of this patch. It's always useful to know what
the effect of a bugfix is even if the effect is nothing at this time.
regards,
dan carpenter
^ permalink raw reply
* Re: [PATCH v3 16/27] docs: Fix more broken references
From: Matthias Brugger @ 2018-06-15 8:46 UTC (permalink / raw)
To: Mauro Carvalho Chehab, Linux Doc Mailing List
Cc: linux-hwmon, devicetree, alsa-devel, linux-samsung-soc,
Jonathan Corbet, netdev, linux-pm, linux-mmc, linux-kernel,
dri-devel, Mauro Carvalho Chehab, linux-rockchip, linux-usb,
intel-wired-lan, linux-fsdevel, linux-mediatek, linux-clk,
linux-arm-kernel
In-Reply-To: <e1bf52a721005b2017434acc54ec5ddc152d6fe4.1528990947.git.mchehab+samsung@kernel.org>
On 14/06/18 18:09, Mauro Carvalho Chehab wrote:
> As we move stuff around, some doc references are broken. Fix some of
> them via this script:
> ./scripts/documentation-file-ref-check --fix
>
> Manually checked that produced results are valid.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
for mt6397.txt:
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
> ---
> .../devicetree/bindings/clock/st/st,clkgen.txt | 8 ++++----
> .../devicetree/bindings/clock/ti/gate.txt | 2 +-
> .../devicetree/bindings/clock/ti/interface.txt | 2 +-
> .../bindings/cpufreq/cpufreq-mediatek.txt | 2 +-
> .../devicetree/bindings/devfreq/rk3399_dmc.txt | 2 +-
> .../bindings/gpu/arm,mali-midgard.txt | 2 +-
> .../bindings/gpu/arm,mali-utgard.txt | 2 +-
> .../devicetree/bindings/mfd/mt6397.txt | 2 +-
> .../devicetree/bindings/mfd/sun6i-prcm.txt | 2 +-
> .../devicetree/bindings/mmc/exynos-dw-mshc.txt | 2 +-
> .../devicetree/bindings/net/dsa/ksz.txt | 2 +-
> .../devicetree/bindings/net/dsa/mt7530.txt | 2 +-
> .../devicetree/bindings/power/fsl,imx-gpc.txt | 2 +-
> .../bindings/power/wakeup-source.txt | 2 +-
> .../devicetree/bindings/usb/rockchip,dwc3.txt | 2 +-
> Documentation/hwmon/ina2xx | 2 +-
> Documentation/maintainer/pull-requests.rst | 2 +-
> Documentation/translations/ko_KR/howto.rst | 2 +-
> MAINTAINERS | 18 +++++++++---------
> drivers/net/ethernet/intel/Kconfig | 8 ++++----
> drivers/soundwire/stream.c | 8 ++++----
> fs/Kconfig.binfmt | 2 +-
> fs/binfmt_misc.c | 2 +-
> 23 files changed, 40 insertions(+), 40 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/clock/st/st,clkgen.txt b/Documentation/devicetree/bindings/clock/st/st,clkgen.txt
> index 7364953d0d0b..45ac19bfa0a9 100644
> --- a/Documentation/devicetree/bindings/clock/st/st,clkgen.txt
> +++ b/Documentation/devicetree/bindings/clock/st/st,clkgen.txt
> @@ -31,10 +31,10 @@ This binding uses the common clock binding[1].
> Each subnode should use the binding described in [2]..[7]
>
> [1] Documentation/devicetree/bindings/clock/clock-bindings.txt
> -[3] Documentation/devicetree/bindings/clock/st,clkgen-mux.txt
> -[4] Documentation/devicetree/bindings/clock/st,clkgen-pll.txt
> -[7] Documentation/devicetree/bindings/clock/st,quadfs.txt
> -[8] Documentation/devicetree/bindings/clock/st,flexgen.txt
> +[3] Documentation/devicetree/bindings/clock/st/st,clkgen-mux.txt
> +[4] Documentation/devicetree/bindings/clock/st/st,clkgen-pll.txt
> +[7] Documentation/devicetree/bindings/clock/st/st,quadfs.txt
> +[8] Documentation/devicetree/bindings/clock/st/st,flexgen.txt
>
>
> Required properties:
> diff --git a/Documentation/devicetree/bindings/clock/ti/gate.txt b/Documentation/devicetree/bindings/clock/ti/gate.txt
> index 03f8fdee62a7..56d603c1f716 100644
> --- a/Documentation/devicetree/bindings/clock/ti/gate.txt
> +++ b/Documentation/devicetree/bindings/clock/ti/gate.txt
> @@ -10,7 +10,7 @@ will be controlled instead and the corresponding hw-ops for
> that is used.
>
> [1] Documentation/devicetree/bindings/clock/clock-bindings.txt
> -[2] Documentation/devicetree/bindings/clock/gate-clock.txt
> +[2] Documentation/devicetree/bindings/clock/gpio-gate-clock.txt
> [3] Documentation/devicetree/bindings/clock/ti/clockdomain.txt
>
> Required properties:
> diff --git a/Documentation/devicetree/bindings/clock/ti/interface.txt b/Documentation/devicetree/bindings/clock/ti/interface.txt
> index 3111a409fea6..3f4704040140 100644
> --- a/Documentation/devicetree/bindings/clock/ti/interface.txt
> +++ b/Documentation/devicetree/bindings/clock/ti/interface.txt
> @@ -9,7 +9,7 @@ companion clock finding (match corresponding functional gate
> clock) and hardware autoidle enable / disable.
>
> [1] Documentation/devicetree/bindings/clock/clock-bindings.txt
> -[2] Documentation/devicetree/bindings/clock/gate-clock.txt
> +[2] Documentation/devicetree/bindings/clock/gpio-gate-clock.txt
>
> Required properties:
> - compatible : shall be one of:
> diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> index d36f07e0a2bb..0551c78619de 100644
> --- a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt
> @@ -8,7 +8,7 @@ Required properties:
> "intermediate" - A parent of "cpu" clock which is used as "intermediate" clock
> source (usually MAINPLL) when the original CPU PLL is under
> transition and not stable yet.
> - Please refer to Documentation/devicetree/bindings/clk/clock-bindings.txt for
> + Please refer to Documentation/devicetree/bindings/clock/clock-bindings.txt for
> generic clock consumer properties.
> - operating-points-v2: Please refer to Documentation/devicetree/bindings/opp/opp.txt
> for detail.
> diff --git a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> index d6d2833482c9..fc2bcbe26b1e 100644
> --- a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> +++ b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> @@ -12,7 +12,7 @@ Required properties:
> - clocks: Phandles for clock specified in "clock-names" property
> - clock-names : The name of clock used by the DFI, must be
> "pclk_ddr_mon";
> -- operating-points-v2: Refer to Documentation/devicetree/bindings/power/opp.txt
> +- operating-points-v2: Refer to Documentation/devicetree/bindings/opp/opp.txt
> for details.
> - center-supply: DMC supply node.
> - status: Marks the node enabled/disabled.
> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt
> index 039219df05c5..18a2cde2e5f3 100644
> --- a/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt
> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt
> @@ -34,7 +34,7 @@ Optional properties:
> - mali-supply : Phandle to regulator for the Mali device. Refer to
> Documentation/devicetree/bindings/regulator/regulator.txt for details.
>
> -- operating-points-v2 : Refer to Documentation/devicetree/bindings/power/opp.txt
> +- operating-points-v2 : Refer to Documentation/devicetree/bindings/opp/opp.txt
> for details.
>
>
> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-utgard.txt b/Documentation/devicetree/bindings/gpu/arm,mali-utgard.txt
> index c1f65d1dac1d..63cd91176a68 100644
> --- a/Documentation/devicetree/bindings/gpu/arm,mali-utgard.txt
> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-utgard.txt
> @@ -44,7 +44,7 @@ Optional properties:
>
> - memory-region:
> Memory region to allocate from, as defined in
> - Documentation/devicetree/bindi/reserved-memory/reserved-memory.txt
> + Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
>
> - mali-supply:
> Phandle to regulator for the Mali device, as defined in
> diff --git a/Documentation/devicetree/bindings/mfd/mt6397.txt b/Documentation/devicetree/bindings/mfd/mt6397.txt
> index d1df77f4d655..0ebd08af777d 100644
> --- a/Documentation/devicetree/bindings/mfd/mt6397.txt
> +++ b/Documentation/devicetree/bindings/mfd/mt6397.txt
> @@ -12,7 +12,7 @@ MT6397/MT6323 is a multifunction device with the following sub modules:
> It is interfaced to host controller using SPI interface by a proprietary hardware
> called PMIC wrapper or pwrap. MT6397/MT6323 MFD is a child device of pwrap.
> See the following for pwarp node definitions:
> -Documentation/devicetree/bindings/soc/pwrap.txt
> +Documentation/devicetree/bindings/soc/mediatek/pwrap.txt
>
> This document describes the binding for MFD device and its sub module.
>
> diff --git a/Documentation/devicetree/bindings/mfd/sun6i-prcm.txt b/Documentation/devicetree/bindings/mfd/sun6i-prcm.txt
> index dd2c06540485..4d21ffdb0fc1 100644
> --- a/Documentation/devicetree/bindings/mfd/sun6i-prcm.txt
> +++ b/Documentation/devicetree/bindings/mfd/sun6i-prcm.txt
> @@ -9,7 +9,7 @@ Required properties:
>
> The prcm node may contain several subdevices definitions:
> - see Documentation/devicetree/clk/sunxi.txt for clock devices
> - - see Documentation/devicetree/reset/allwinner,sunxi-clock-reset.txt for reset
> + - see Documentation/devicetree/bindings/reset/allwinner,sunxi-clock-reset.txt for reset
> controller devices
>
>
> diff --git a/Documentation/devicetree/bindings/mmc/exynos-dw-mshc.txt b/Documentation/devicetree/bindings/mmc/exynos-dw-mshc.txt
> index a58c173b7ab9..0419a63f73a0 100644
> --- a/Documentation/devicetree/bindings/mmc/exynos-dw-mshc.txt
> +++ b/Documentation/devicetree/bindings/mmc/exynos-dw-mshc.txt
> @@ -62,7 +62,7 @@ Required properties for a slot (Deprecated - Recommend to use one slot per host)
> rest of the gpios (depending on the bus-width property) are the data lines in
> no particular order. The format of the gpio specifier depends on the gpio
> controller.
> -(Deprecated - Refer to Documentation/devicetree/binding/pinctrl/samsung-pinctrl.txt)
> +(Deprecated - Refer to Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt)
>
> Example:
>
> diff --git a/Documentation/devicetree/bindings/net/dsa/ksz.txt b/Documentation/devicetree/bindings/net/dsa/ksz.txt
> index fd23904ac68e..a700943218ca 100644
> --- a/Documentation/devicetree/bindings/net/dsa/ksz.txt
> +++ b/Documentation/devicetree/bindings/net/dsa/ksz.txt
> @@ -6,7 +6,7 @@ Required properties:
> - compatible: For external switch chips, compatible string must be exactly one
> of: "microchip,ksz9477"
>
> -See Documentation/devicetree/bindings/dsa/dsa.txt for a list of additional
> +See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
> required and optional properties.
>
> Examples:
> diff --git a/Documentation/devicetree/bindings/net/dsa/mt7530.txt b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> index a9bc27b93ee3..aa3527f71fdc 100644
> --- a/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> +++ b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> @@ -31,7 +31,7 @@ Required properties for the child nodes within ports container:
> - phy-mode: String, must be either "trgmii" or "rgmii" for port labeled
> "cpu".
>
> -See Documentation/devicetree/bindings/dsa/dsa.txt for a list of additional
> +See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
> required, optional properties and how the integrated switch subnodes must
> be specified.
>
> diff --git a/Documentation/devicetree/bindings/power/fsl,imx-gpc.txt b/Documentation/devicetree/bindings/power/fsl,imx-gpc.txt
> index b31d6bbeee16..726ec2875223 100644
> --- a/Documentation/devicetree/bindings/power/fsl,imx-gpc.txt
> +++ b/Documentation/devicetree/bindings/power/fsl,imx-gpc.txt
> @@ -14,7 +14,7 @@ Required properties:
> datasheet
> - interrupts: Should contain one interrupt specifier for the GPC interrupt
> - clocks: Must contain an entry for each entry in clock-names.
> - See Documentation/devicetree/bindings/clocks/clock-bindings.txt for details.
> + See Documentation/devicetree/bindings/clock/clock-bindings.txt for details.
> - clock-names: Must include the following entries:
> - ipg
>
> diff --git a/Documentation/devicetree/bindings/power/wakeup-source.txt b/Documentation/devicetree/bindings/power/wakeup-source.txt
> index 5d254ab13ebf..cfd74659fbed 100644
> --- a/Documentation/devicetree/bindings/power/wakeup-source.txt
> +++ b/Documentation/devicetree/bindings/power/wakeup-source.txt
> @@ -22,7 +22,7 @@ List of legacy properties and respective binding document
> 3. "has-tpo" Documentation/devicetree/bindings/rtc/rtc-opal.txt
> 4. "linux,wakeup" Documentation/devicetree/bindings/input/gpio-matrix-keypad.txt
> Documentation/devicetree/bindings/mfd/tc3589x.txt
> - Documentation/devicetree/bindings/input/ads7846.txt
> + Documentation/devicetree/bindings/input/touchscreen/ads7846.txt
> 5. "linux,keypad-wakeup" Documentation/devicetree/bindings/input/qcom,pm8xxx-keypad.txt
> 6. "linux,input-wakeup" Documentation/devicetree/bindings/input/samsung-keypad.txt
> 7. "nvidia,wakeup-source" Documentation/devicetree/bindings/input/nvidia,tegra20-kbc.txt
> diff --git a/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
> index 50a31536e975..252a05c5d976 100644
> --- a/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
> +++ b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
> @@ -16,7 +16,7 @@ A child node must exist to represent the core DWC3 IP block. The name of
> the node is not important. The content of the node is defined in dwc3.txt.
>
> Phy documentation is provided in the following places:
> -Documentation/devicetree/bindings/phy/rockchip,dwc3-usb-phy.txt
> +Documentation/devicetree/bindings/phy/qcom-dwc3-usb-phy.txt
>
> Example device nodes:
>
> diff --git a/Documentation/hwmon/ina2xx b/Documentation/hwmon/ina2xx
> index cfd31d94c872..72d16f08e431 100644
> --- a/Documentation/hwmon/ina2xx
> +++ b/Documentation/hwmon/ina2xx
> @@ -53,7 +53,7 @@ bus supply voltage.
>
> The shunt value in micro-ohms can be set via platform data or device tree at
> compile-time or via the shunt_resistor attribute in sysfs at run-time. Please
> -refer to the Documentation/devicetree/bindings/i2c/ina2xx.txt for bindings
> +refer to the Documentation/devicetree/bindings/hwmon/ina2xx.txt for bindings
> if the device tree is used.
>
> Additionally ina226 supports update_interval attribute as described in
> diff --git a/Documentation/maintainer/pull-requests.rst b/Documentation/maintainer/pull-requests.rst
> index a19db3458b56..22b271de0304 100644
> --- a/Documentation/maintainer/pull-requests.rst
> +++ b/Documentation/maintainer/pull-requests.rst
> @@ -41,7 +41,7 @@ named ``char-misc-next``, you would be using the following command::
>
> that will create a signed tag called ``char-misc-4.15-rc1`` based on the
> last commit in the ``char-misc-next`` branch, and sign it with your gpg key
> -(see :ref:`Documentation/maintainer/configure_git.rst <configuregit>`).
> +(see :ref:`Documentation/maintainer/configure-git.rst <configuregit>`).
>
> Linus will only accept pull requests based on a signed tag. Other
> maintainers may differ.
> diff --git a/Documentation/translations/ko_KR/howto.rst b/Documentation/translations/ko_KR/howto.rst
> index 624654bdcd8a..a8197e072599 100644
> --- a/Documentation/translations/ko_KR/howto.rst
> +++ b/Documentation/translations/ko_KR/howto.rst
> @@ -160,7 +160,7 @@ mtk.manpages@gmail.com의 메인테이너에게 보낼 것을 권장한다.
> 독특한 행동에 관하여 흔히 있는 오해들과 혼란들을 해소하고 있기
> 때문이다.
>
> - :ref:`Documentation/process/stable_kernel_rules.rst <stable_kernel_rules>`
> + :ref:`Documentation/process/stable-kernel-rules.rst <stable_kernel_rules>`
> 이 문서는 안정적인 커널 배포가 이루어지는 규칙을 설명하고 있으며
> 여러분들이 이러한 배포들 중 하나에 변경을 하길 원한다면
> 무엇을 해야 하는지를 설명한다.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ec65e33e2cf1..5871dd5060f6 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4513,7 +4513,7 @@ DRM DRIVER FOR ILITEK ILI9225 PANELS
> M: David Lechner <david@lechnology.com>
> S: Maintained
> F: drivers/gpu/drm/tinydrm/ili9225.c
> -F: Documentation/devicetree/bindings/display/ili9225.txt
> +F: Documentation/devicetree/bindings/display/ilitek,ili9225.txt
>
> DRM DRIVER FOR INTEL I810 VIDEO CARDS
> S: Orphan / Obsolete
> @@ -4599,13 +4599,13 @@ DRM DRIVER FOR SITRONIX ST7586 PANELS
> M: David Lechner <david@lechnology.com>
> S: Maintained
> F: drivers/gpu/drm/tinydrm/st7586.c
> -F: Documentation/devicetree/bindings/display/st7586.txt
> +F: Documentation/devicetree/bindings/display/sitronix,st7586.txt
>
> DRM DRIVER FOR SITRONIX ST7735R PANELS
> M: David Lechner <david@lechnology.com>
> S: Maintained
> F: drivers/gpu/drm/tinydrm/st7735r.c
> -F: Documentation/devicetree/bindings/display/st7735r.txt
> +F: Documentation/devicetree/bindings/display/sitronix,st7735r.txt
>
> DRM DRIVER FOR TDFX VIDEO CARDS
> S: Orphan / Obsolete
> @@ -4824,7 +4824,7 @@ M: Eric Anholt <eric@anholt.net>
> S: Supported
> F: drivers/gpu/drm/v3d/
> F: include/uapi/drm/v3d_drm.h
> -F: Documentation/devicetree/bindings/display/brcm,bcm-v3d.txt
> +F: Documentation/devicetree/bindings/gpu/brcm,bcm-v3d.txt
> T: git git://anongit.freedesktop.org/drm/drm-misc
>
> DRM DRIVERS FOR VC4
> @@ -5735,7 +5735,7 @@ M: Madalin Bucur <madalin.bucur@nxp.com>
> L: netdev@vger.kernel.org
> S: Maintained
> F: drivers/net/ethernet/freescale/fman
> -F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt
> +F: Documentation/devicetree/bindings/net/fsl-fman.txt
>
> FREESCALE QORIQ PTP CLOCK DRIVER
> M: Yangbo Lu <yangbo.lu@nxp.com>
> @@ -8700,7 +8700,7 @@ M: Guenter Roeck <linux@roeck-us.net>
> L: linux-hwmon@vger.kernel.org
> S: Maintained
> F: Documentation/hwmon/max6697
> -F: Documentation/devicetree/bindings/i2c/max6697.txt
> +F: Documentation/devicetree/bindings/hwmon/max6697.txt
> F: drivers/hwmon/max6697.c
> F: include/linux/platform_data/max6697.h
>
> @@ -9080,7 +9080,7 @@ M: Martin Donnelly <martin.donnelly@ge.com>
> M: Martyn Welch <martyn.welch@collabora.co.uk>
> S: Maintained
> F: drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c
> -F: Documentation/devicetree/bindings/video/bridge/megachips-stdpxxxx-ge-b850v3-fw.txt
> +F: Documentation/devicetree/bindings/display/bridge/megachips-stdpxxxx-ge-b850v3-fw.txt
>
> MEGARAID SCSI/SAS DRIVERS
> M: Kashyap Desai <kashyap.desai@broadcom.com>
> @@ -10728,7 +10728,7 @@ PARALLEL LCD/KEYPAD PANEL DRIVER
> M: Willy Tarreau <willy@haproxy.com>
> M: Ksenija Stanojevic <ksenija.stanojevic@gmail.com>
> S: Odd Fixes
> -F: Documentation/misc-devices/lcd-panel-cgram.txt
> +F: Documentation/auxdisplay/lcd-panel-cgram.txt
> F: drivers/misc/panel.c
>
> PARALLEL PORT SUBSYSTEM
> @@ -13291,7 +13291,7 @@ M: Vinod Koul <vkoul@kernel.org>
> L: alsa-devel@alsa-project.org (moderated for non-subscribers)
> T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
> S: Supported
> -F: Documentation/sound/alsa/compress_offload.txt
> +F: Documentation/sound/designs/compress-offload.rst
> F: include/sound/compress_driver.h
> F: include/uapi/sound/compress_*
> F: sound/core/compress_offload.c
> diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
> index 14d287bed33c..1ab613eb5796 100644
> --- a/drivers/net/ethernet/intel/Kconfig
> +++ b/drivers/net/ethernet/intel/Kconfig
> @@ -33,7 +33,7 @@ config E100
> to identify the adapter.
>
> More specific information on configuring the driver is in
> - <file:Documentation/networking/e100.txt>.
> + <file:Documentation/networking/e100.rst>.
>
> To compile this driver as a module, choose M here. The module
> will be called e100.
> @@ -49,7 +49,7 @@ config E1000
> <http://support.intel.com>
>
> More specific information on configuring the driver is in
> - <file:Documentation/networking/e1000.txt>.
> + <file:Documentation/networking/e1000.rst>.
>
> To compile this driver as a module, choose M here. The module
> will be called e1000.
> @@ -94,7 +94,7 @@ config IGB
> <http://support.intel.com>
>
> More specific information on configuring the driver is in
> - <file:Documentation/networking/e1000.txt>.
> + <file:Documentation/networking/e1000.rst>.
>
> To compile this driver as a module, choose M here. The module
> will be called igb.
> @@ -130,7 +130,7 @@ config IGBVF
> <http://support.intel.com>
>
> More specific information on configuring the driver is in
> - <file:Documentation/networking/e1000.txt>.
> + <file:Documentation/networking/e1000.rst>.
>
> To compile this driver as a module, choose M here. The module
> will be called igbvf.
> diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
> index 8974a0fcda1b..4b5e250e8615 100644
> --- a/drivers/soundwire/stream.c
> +++ b/drivers/soundwire/stream.c
> @@ -1291,7 +1291,7 @@ static int _sdw_prepare_stream(struct sdw_stream_runtime *stream)
> *
> * @stream: Soundwire stream
> *
> - * Documentation/soundwire/stream.txt explains this API in detail
> + * Documentation/driver-api/soundwire/stream.rst explains this API in detail
> */
> int sdw_prepare_stream(struct sdw_stream_runtime *stream)
> {
> @@ -1348,7 +1348,7 @@ static int _sdw_enable_stream(struct sdw_stream_runtime *stream)
> *
> * @stream: Soundwire stream
> *
> - * Documentation/soundwire/stream.txt explains this API in detail
> + * Documentation/driver-api/soundwire/stream.rst explains this API in detail
> */
> int sdw_enable_stream(struct sdw_stream_runtime *stream)
> {
> @@ -1400,7 +1400,7 @@ static int _sdw_disable_stream(struct sdw_stream_runtime *stream)
> *
> * @stream: Soundwire stream
> *
> - * Documentation/soundwire/stream.txt explains this API in detail
> + * Documentation/driver-api/soundwire/stream.rst explains this API in detail
> */
> int sdw_disable_stream(struct sdw_stream_runtime *stream)
> {
> @@ -1456,7 +1456,7 @@ static int _sdw_deprepare_stream(struct sdw_stream_runtime *stream)
> *
> * @stream: Soundwire stream
> *
> - * Documentation/soundwire/stream.txt explains this API in detail
> + * Documentation/driver-api/soundwire/stream.rst explains this API in detail
> */
> int sdw_deprepare_stream(struct sdw_stream_runtime *stream)
> {
> diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.binfmt
> index 57a27c42b5ac..56df483de619 100644
> --- a/fs/Kconfig.binfmt
> +++ b/fs/Kconfig.binfmt
> @@ -168,7 +168,7 @@ config BINFMT_MISC
> will automatically feed it to the correct interpreter.
>
> You can do other nice things, too. Read the file
> - <file:Documentation/binfmt_misc.txt> to learn how to use this
> + <file:Documentation/admin-guide/binfmt-misc.rst> to learn how to use this
> feature, <file:Documentation/admin-guide/java.rst> for information about how
> to include Java support. and <file:Documentation/admin-guide/mono.rst> for
> information about how to include Mono-based .NET support.
> diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
> index 4de191563261..4b5fff31ef27 100644
> --- a/fs/binfmt_misc.c
> +++ b/fs/binfmt_misc.c
> @@ -4,7 +4,7 @@
> * Copyright (C) 1997 Richard Günther
> *
> * binfmt_misc detects binaries via a magic or filename extension and invokes
> - * a specified wrapper. See Documentation/binfmt_misc.txt for more details.
> + * a specified wrapper. See Documentation/admin-guide/binfmt-misc.rst for more details.
> */
>
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
^ permalink raw reply
* [PATCH v2] ip: add rmnet initial support
From: Daniele Palmas @ 2018-06-15 8:23 UTC (permalink / raw)
To: netdev, Stephen Hemminger; +Cc: Subash Abhinov Kasiviswanathan, Daniele Palmas
This patch adds basic support for Qualcomm rmnet devices.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
---
v2:
rebased on iproute2-next
removed GPL boilerplate
added print_opt function
man page updated
fixed MUXID values
---
ip/Makefile | 2 +-
ip/iplink.c | 2 +-
ip/iplink_rmnet.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++
man/man8/ip-link.8.in | 21 ++++++++++++-
4 files changed, 103 insertions(+), 3 deletions(-)
create mode 100644 ip/iplink_rmnet.c
diff --git a/ip/Makefile b/ip/Makefile
index 77fadee..a88f936 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -10,7 +10,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
iplink_geneve.o iplink_vrf.o iproute_lwtunnel.o ipmacsec.o ipila.o \
- ipvrf.o iplink_xstats.o ipseg6.o iplink_netdevsim.o
+ ipvrf.o iplink_xstats.o ipseg6.o iplink_netdevsim.o iplink_rmnet.o
RTMONOBJ=rtmon.o
diff --git a/ip/iplink.c b/ip/iplink.c
index e4d4da9..0ba5f1a 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -121,7 +121,7 @@ void iplink_usage(void)
" bridge | bond | team | ipoib | ip6tnl | ipip | sit | vxlan |\n"
" gre | gretap | erspan | ip6gre | ip6gretap | ip6erspan |\n"
" vti | nlmon | team_slave | bond_slave | ipvlan | geneve |\n"
- " bridge_slave | vrf | macsec | netdevsim }\n");
+ " bridge_slave | vrf | macsec | netdevsim | rmnet }\n");
}
exit(-1);
}
diff --git a/ip/iplink_rmnet.c b/ip/iplink_rmnet.c
new file mode 100644
index 0000000..1d16440
--- /dev/null
+++ b/ip/iplink_rmnet.c
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * iplink_rmnet.c RMNET device support
+ *
+ * Authors: Daniele Palmas <dnlplm@gmail.com>
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "utils.h"
+#include "ip_common.h"
+
+static void print_explain(FILE *f)
+{
+ fprintf(f,
+ "Usage: ... rmnet mux_id MUXID\n"
+ "\n"
+ "MUXID := 1-254\n"
+ );
+}
+
+static void explain(void)
+{
+ print_explain(stderr);
+}
+
+static int rmnet_parse_opt(struct link_util *lu, int argc, char **argv,
+ struct nlmsghdr *n)
+{
+ __u16 mux_id;
+
+ while (argc > 0) {
+ if (matches(*argv, "mux_id") == 0) {
+ NEXT_ARG();
+ if (get_u16(&mux_id, *argv, 0))
+ invarg("mux_id is invalid", *argv);
+ addattr16(n, 1024, IFLA_RMNET_MUX_ID, mux_id);
+ } else if (matches(*argv, "help") == 0) {
+ explain();
+ return -1;
+ } else {
+ fprintf(stderr, "rmnet: unknown command \"%s\"?\n", *argv);
+ explain();
+ return -1;
+ }
+ argc--, argv++;
+ }
+
+ return 0;
+}
+
+static void rmnet_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
+{
+ if (!tb)
+ return;
+
+ if (!tb[IFLA_RMNET_MUX_ID] ||
+ RTA_PAYLOAD(tb[IFLA_RMNET_MUX_ID]) < sizeof(__u16))
+ return;
+
+ print_uint(PRINT_ANY,
+ "mux_id",
+ "mux_id %u ",
+ rta_getattr_u16(tb[IFLA_RMNET_MUX_ID]));
+}
+
+static void rmnet_print_help(struct link_util *lu, int argc, char **argv,
+ FILE *f)
+{
+ print_explain(f);
+}
+
+struct link_util rmnet_link_util = {
+ .id = "rmnet",
+ .maxattr = IFLA_RMNET_MAX,
+ .parse_opt = rmnet_parse_opt,
+ .print_opt = rmnet_print_opt,
+ .print_help = rmnet_print_help,
+};
diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index 83ef3ca..fd2c107 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -219,7 +219,8 @@ ip-link \- network device configuration
.BR geneve " |"
.BR vrf " |"
.BR macsec " |"
-.BR netdevsim " ]"
+.BR netdevsim " |"
+.BR rmnet " ]"
.ti -8
.IR ETYPE " := [ " TYPE " |"
@@ -342,6 +343,9 @@ Link types:
.sp
.BR netdevsim
- Interface for netdev API tests
+.sp
+.BR rmnet
+- Qualcomm rmnet device
.in -8
.TP
@@ -1651,6 +1655,21 @@ the following additional arguments are supported:
.in -8
+.TP
+RMNET Type Support
+For a link of type
+.I RMNET
+the following additional arguments are supported:
+
+.BI "ip link add link " DEVICE " name " NAME " type rmnet mux_id " MUXID
+
+.in +8
+.sp
+.BI mux_id " MUXID "
+- specifies the mux identifier for the rmnet device, possible values 1-254.
+
+.in -8
+
.SS ip link delete - delete virtual link
.TP
--
2.7.4
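
The man page text in this patch documents MUXID as 1-254, while
rmnet_parse_opt() as posted accepts any value that get_u16() can parse.
A minimal standalone sketch of a range check follows. parse_mux_id() is
a hypothetical helper, not an iproute2 function, and it uses strtoul()
rather than iproute2's get_u16() so that it compiles on its own.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of iproute2): parse a decimal, octal or
 * 0x-prefixed string and accept it only if it lies in the 1-254 range
 * documented for MUXID. Returns 0 on success, -1 on any error.
 */
static int parse_mux_id(const char *arg, uint16_t *mux_id)
{
	char *end;
	unsigned long val;

	/* Require a leading digit: strtoul() would otherwise accept
	 * whitespace and a sign, and silently negate "-N" inputs.
	 */
	if (!arg || !isdigit((unsigned char)*arg))
		return -1;

	errno = 0;
	val = strtoul(arg, &end, 0);
	if (errno || *end != '\0')
		return -1;	/* overflow or trailing garbage */

	if (val < 1 || val > 254)
		return -1;	/* outside the documented MUXID range */

	*mux_id = (uint16_t)val;
	return 0;
}
```

In the posted parser, a check like this would sit where get_u16() is
called, with invarg() reporting the failure; that placement is only a
suggestion, not something the patch does.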
^ permalink raw reply related
* Re: [PATCH] SUNRPC: Move inline xprt_alloc_xid() up to fix compiler warning
From: Geert Uytterhoeven @ 2018-06-15 8:23 UTC (permalink / raw)
To: Chuck Lever
Cc: Bruce Fields, jlayton, trond.myklebust, Anna Schumaker,
David S. Miller, open list:NFS, SUNRPC, AND..., netdev,
Linux Kernel Mailing List
In-Reply-To: <675726D3-9273-4051-B38F-6377B8445A27@oracle.com>
Hi Chuck,
On Thu, Jun 14, 2018 at 7:20 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> > On Jun 13, 2018, at 8:01 AM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > With gcc 4.1.2:
> >
> > net/sunrpc/xprt.c:69: warning: ‘xprt_alloc_xid’ declared inline after being called
> > net/sunrpc/xprt.c:69: warning: previous declaration of ‘xprt_alloc_xid’ was here
> >
> > To fix this, move the function up, before its caller, and remove the no
> > longer needed forward declaration.
> >
> > Fixes: 37ac86c3a76c1136 ("SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock")
> > Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> > ---
> > net/sunrpc/xprt.c | 11 +++++------
> > 1 file changed, 5 insertions(+), 6 deletions(-)
> >
> > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > index 3c85af058227d14b..60a8b9f91cf94b54 100644
> > --- a/net/sunrpc/xprt.c
> > +++ b/net/sunrpc/xprt.c
> > @@ -66,7 +66,6 @@
> > * Local functions
> > */
> > static void xprt_init(struct rpc_xprt *xprt, struct net *net);
> > -static __be32 xprt_alloc_xid(struct rpc_xprt *xprt);
> > static void xprt_connect_status(struct rpc_task *task);
> > static int __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
> > static void __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
> > @@ -956,6 +955,11 @@ static void xprt_timer(struct rpc_task *task)
> > task->tk_status = 0;
> > }
> >
> > +static inline __be32 xprt_alloc_xid(struct rpc_xprt *xprt)
> > +{
> > + return (__force __be32)xprt->xid++;
> > +}
> > +
> > /**
> > * xprt_prepare_transmit - reserve the transport before sending a request
> > * @task: RPC task about to send a request
> > @@ -1296,11 +1300,6 @@ void xprt_retry_reserve(struct rpc_task *task)
> > xprt->ops->alloc_slot(xprt, task);
> > }
> >
> > -static inline __be32 xprt_alloc_xid(struct rpc_xprt *xprt)
> > -{
> > - return (__force __be32)xprt->xid++;
> > -}
> > -
>
> For code organization, we might want to keep xprt_alloc_xid
> together with xprt_init_xid. Would it be better to simply
> remove the "inline" directive from these two and let the
> compiler choose the best optimization?
That's an option, too.
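
For reference, the ordering the patch establishes can be reduced to a
small user-space sketch: the static helper is defined ahead of its first
caller, so no forward declaration is needed and the compiler never sees
a call before the inline definition. struct fake_xprt, fake_alloc_xid()
and fake_reserve() are hypothetical stand-ins, not SUNRPC code, and the
kernel's __be32/__force annotations are dropped.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rpc_xprt; only the XID counter is
 * modelled here.
 */
struct fake_xprt {
	uint32_t xid;
};

/* Defined before any caller, so no separate prototype is required and
 * no call precedes the inline definition.
 */
static inline uint32_t fake_alloc_xid(struct fake_xprt *xprt)
{
	return xprt->xid++;	/* hand out the current XID, then advance */
}

/* Hypothetical caller standing in for the request setup path. */
static uint32_t fake_reserve(struct fake_xprt *xprt)
{
	return fake_alloc_xid(xprt);
}
```

Dropping "inline" instead, as suggested above, should also silence the
warning, since the complaint is specifically about a function declared
inline after it has been called.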
> > static inline void xprt_init_xid(struct rpc_xprt *xprt)
> > {
> > xprt->xid = prandom_u32();
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply
* Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled
From: Ilpo Järvinen @ 2018-06-15 8:05 UTC (permalink / raw)
To: Michal Kubecek; +Cc: Yuchung Cheng, netdev, Eric Dumazet, LKML
In-Reply-To: <20180614131801.hd474jgrhmtqzhag@unicorn.suse.cz>
[-- Attachment #1: Type: text/plain, Size: 2529 bytes --]
On Thu, 14 Jun 2018, Michal Kubecek wrote:
> On Thu, Jun 14, 2018 at 02:51:18PM +0300, Ilpo Järvinen wrote:
> > On Thu, 14 Jun 2018, Michal Kubecek wrote:
> > > On Thu, Jun 14, 2018 at 11:42:43AM +0300, Ilpo Järvinen wrote:
> > > > On Wed, 13 Jun 2018, Yuchung Cheng wrote:
> > > > > On Wed, Jun 13, 2018 at 9:55 AM, Michal Kubecek <mkubecek@suse.cz> wrote:
> > >
> > > AFAICS RFC 5682 is not explicit about this and offers multiple options.
> > > Anyway, this is not essential and in most of the customer provided
> > > captures, it wasn't the case.
> >
> > Lacking the new segments is essential for hiding the actual bug as the
> > trace would look weird otherwise with a burst of new data segments (due
> > to the other bug).
>
> The trace wouldn't look so nice but it can be reproduced even with more
> data to send. I've copied an example below. I couldn't find a really
> nice one quickly so that first few retransmits (17:22:13.865105 through
> 17:23:05.841105) are without new data but starting at 17:23:58.189150,
> you can see that sending new (previously unsent) data may not suffice to
> break the loop.
My point was that the new data segment bursts that occur if the sender
isn't application limited indicate that something is going wrong
with FRTO. That same fault is also what causes the RTO loop, because
the sender doesn't see the previous FRTO recovery on the second RTO. With
my FRTO undo fix, (new_recovery || icsk->icsk_retransmits) will be false
and that will prevent the RTO loop.
> > > Normally, we would have timestamps (and even SACK). Without them, you
> > > cannot reliably recognize a dupack with changed window size from
> > > a spontaneous window update.
> >
> > No! The receiver should not update the window on ACKs it intends to
> > designate as "duplicate ACKs". That is not without some potential cost
> > though as it requires delaying window updates up to the next cumulative
> > ACK. In the non-SACK series one of the changes is fixing this for
> > non-SACK Linux TCP flows.
>
> That sounds like a reasonable change (at least at the first glance,
> I didn't think about it too deeply) but even if we fix Linux stack to
> behave like this, we cannot force everyone else to do the same.
Unfortunately I don't know what the other stacks besides Linux do. But
for Linux, the cause for the changing receiver window is the receiver
window auto-tuning and I'm not sure if other stacks have a similar
feature (or if that affects (almost) all ACKs like in Linux).
--
i.
^ permalink raw reply
* Re: [PATCH] optoe: driver to read/write SFP/QSFP EEPROMs
From: Andrew Lunn @ 2018-06-15 7:54 UTC (permalink / raw)
To: Don Bollinger
Cc: Tom Lendacky, Arnd Bergmann, Greg Kroah-Hartman, linux-kernel,
brandon_chuang, wally_wang, roy_lee, rick_burchett, quentin.chang,
steven.noble, jeffrey.townsend, scotte, roopa, David Ahern,
luke.williams, Guohan Lu, Russell King, netdev@vger.kernel.org
In-Reply-To: <20180615022652.t6oqpnwwvdmbooab@thebollingers.org>
> Actually this is better described by a third use case. The target
> switches are PHY-less (see various designs at
> www.compute.org/wiki/Networking/SpecsAndDesigns). The AS5712 for example
> says "The AS5712-54X is a PHY-Less design with the SFP+ and QSFP+
> connections directly attaching to the Serdes interfaces of the Broadcom
> BCM56854 720G Trident 2 switching silicon..."
We consider the SFP+ and QSFP+ as being the PHY. You need something to
control that PHY. Either it is firmware running in the switch, or it
is the Linux kernel, via PHYLINK.
> The i2c bus is muxed from the CPU to all of the {Q}SFP devices, which
> are set up as standard linux i2c devices
> (/sys/bus/i2c/devices/i2c-xxxx).
Having a standard i2c bus driver is correct. This is what PHYLINK
assumes. It knows about the different addresses the SFP uses on the
i2c bus.
> There is no MDIO bus between the CPU and the {Q}SFP devices.
There is no physical MDIO bus for SFP devices. If the SFP module
implements copper 1G, there is often MDIO tunnelled over i2c. PHYLINK
knows how to do this, and will instantiate a normal Linux MDIO bus
driver, and then you can use the Linux kernel copper PHY state
machines as normal.
> And, there isn't actually 'a wish to expose' the EEPROM data to linux
> (the kernel). It turns out that none of the NOS partners I'm working
> with use that data *in the kernel*. It is all managed from user space.
Ah. O.K. We can stop here then.
If you are using Linux as a boot loader, I doubt you will find any
network kernel developers who are willing to consider this driver. The
kernel community has decided switchdev is how the Linux kernel supports
switches. We are unlikely to add drivers for supporting user space
drivers of switches.
NACK.
Andrew
^ permalink raw reply
* Re: [PATCH 0/3] Use sbitmap instead of percpu_ida
From: Christoph Hellwig @ 2018-06-15 7:37 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Juergen Gross, Jens Axboe, kvm, linux-scsi, netdev, linux-usb,
linux-kernel, virtualization, target-devel, qla2xxx-upstream,
linux1394-devel, Kent Overstreet
In-Reply-To: <20180612190545.10781-1-willy@infradead.org>
Btw, if you are on a spree to remove almost unused data structures
from target code, the lib/btree.c code is only used by the qla2xxx
target code, and doesn't really look like the best fit for it either.
^ permalink raw reply
* Re: BUG: KASAN: stack-out-of-bounds in ipv6_addr_equal include/net/ipv6.h
From: Dmitry Vyukov @ 2018-06-15 7:32 UTC (permalink / raw)
To: air icy
Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev,
syzkaller
In-Reply-To: <CAAzSK-zy+p3An11LwSRB9Pr9O5oO09u8-XX78QiuLF7j_RFWVg@mail.gmail.com>
On Fri, Jun 15, 2018 at 8:33 AM, air icy <icytxw@gmail.com> wrote:
>
> Hi,
> I found a kernel bug with enhanced syzkaller in the newest Linux kernel v4.17.
> The output is as follows:
>
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in ipv6_addr_equal include/net/ipv6.h:508 [inline]
> BUG: KASAN: stack-out-of-bounds in __xfrm6_state_addr_check include/net/xfrm.h:1358 [inline]
> BUG: KASAN: stack-out-of-bounds in xfrm_state_addr_check include/net/xfrm.h:1375 [inline]
> BUG: KASAN: stack-out-of-bounds in xfrm_state_find+0x2693/0x2740 net/xfrm/xfrm_state.c:959
> Read of size 4 at addr ffff880065d77b70 by task syz-executor1/10036
This may be related to "KMSAN: uninit-value in xfrm_state_find":
https://groups.google.com/d/msg/syzkaller-bugs/myqLUHNGRRc/Zb3SlyJZBwAJ
> CPU: 0 PID: 10036 Comm: syz-executor1 Not tainted 4.17.0 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> Call Trace:
>
> The buggy address belongs to the page:
> page:ffffea0001975dc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
> flags: 0x100000000000000()
> raw: 0100000000000000 0000000000000000 ffffea0001975dc8 0000000000000000
> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff880065d77a00: 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4 f2
> ffff880065d77a80: f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00 00 00 00
> >ffff880065d77b00: f4 f4 f4 f2 f2 f2 f2 00 00 00 00 00 00 00 f4 f2
> ^
> ffff880065d77b80: f2 f2 f2 00 00 00 00 00 00 00 00 00 f4 f4 f4 f3
> ffff880065d77c00: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 0 PID: 10036 Comm: syz-executor1 Tainted: G B 4.17.0 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> Call Trace:
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
> bugzilla url: https://bugzilla.kernel.org/show_bug.cgi?id=200065
> config file is attached in this email
>
> thanks
> Xuwen Tu
>
> log2
>
> report2
>
> .config
>
^ permalink raw reply
* Re: KMSAN: uninit-value in xfrm_state_find
From: Dmitry Vyukov @ 2018-06-15 7:31 UTC (permalink / raw)
To: syzbot
Cc: David Miller, Herbert Xu, LKML, netdev, Steffen Klassert,
syzkaller-bugs, icytxw
In-Reply-To: <0000000000001f31eb056ea92fcb@google.com>
On Fri, Jun 15, 2018 at 9:30 AM, syzbot
<syzbot+131cd4c6d21724b99a26@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: 1df165c8d2d6 kmsan: introduce kmsan_clear_user_page()
> git tree: https://github.com/google/kmsan.git/master
> console output: https://syzkaller.appspot.com/x/log.txt?x=15336e97800000
> kernel config: https://syzkaller.appspot.com/x/.config?x=4ca1e57bafa8ab1f
> dashboard link: https://syzkaller.appspot.com/bug?extid=131cd4c6d21724b99a26
> compiler: clang version 7.0.0 (trunk 329391)
> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=12c7a417800000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13710197800000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+131cd4c6d21724b99a26@syzkaller.appspotmail.com
This may be related to "BUG: KASAN: stack-out-of-bounds in
ipv6_addr_equal include/net/ipv6.h":
https://groups.google.com/d/msg/syzkaller/va_9cjZsHQE/Htc7sYY2BwAJ
> IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready
> IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
> IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
> 8021q: adding VLAN 0 to HW filter on device team0
> ==================================================================
> BUG: KMSAN: uninit-value in __arch_swab32
> arch/x86/include/uapi/asm/swab.h:10 [inline]
> BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
> BUG: KMSAN: uninit-value in __xfrm6_daddr_saddr_hash net/xfrm/xfrm_hash.h:29
> [inline]
> BUG: KMSAN: uninit-value in __xfrm_dst_hash net/xfrm/xfrm_hash.h:96 [inline]
> BUG: KMSAN: uninit-value in xfrm_dst_hash net/xfrm/xfrm_state.c:60 [inline]
> BUG: KMSAN: uninit-value in xfrm_state_find+0x2b15/0x4f40
> net/xfrm/xfrm_state.c:952
> CPU: 0 PID: 4464 Comm: syz-executor988 Not tainted 4.17.0-rc3+ #93
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x185/0x1d0 lib/dump_stack.c:113
> kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1084
> __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
> __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
> __fswab32 include/uapi/linux/swab.h:59 [inline]
> __xfrm6_daddr_saddr_hash net/xfrm/xfrm_hash.h:29 [inline]
> __xfrm_dst_hash net/xfrm/xfrm_hash.h:96 [inline]
> xfrm_dst_hash net/xfrm/xfrm_state.c:60 [inline]
> xfrm_state_find+0x2b15/0x4f40 net/xfrm/xfrm_state.c:952
> xfrm_tmpl_resolve_one net/xfrm/xfrm_policy.c:1393 [inline]
> xfrm_tmpl_resolve net/xfrm/xfrm_policy.c:1437 [inline]
> xfrm_resolve_and_create_bundle+0xc31/0x5270 net/xfrm/xfrm_policy.c:1833
> xfrm_lookup+0x606/0x39d0 net/xfrm/xfrm_policy.c:2163
> xfrm_lookup_route+0xfa/0x360 net/xfrm/xfrm_policy.c:2283
> ip_route_output_flow+0x35b/0x3b0 net/ipv4/route.c:2574
> udp_sendmsg+0x2289/0x33f0 net/ipv4/udp.c:1006
> udpv6_sendmsg+0x1291/0x3f40 net/ipv6/udp.c:1175
> inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:798
> sock_sendmsg_nosec net/socket.c:629 [inline]
> sock_sendmsg net/socket.c:639 [inline]
> ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
> __sys_sendmmsg+0x490/0x850 net/socket.c:2212
> __do_sys_sendmmsg net/socket.c:2241 [inline]
> __se_sys_sendmmsg net/socket.c:2238 [inline]
> __x64_sys_sendmmsg+0x11c/0x170 net/socket.c:2238
> do_syscall_64+0x154/0x220 arch/x86/entry/common.c:287
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x4419c9
> RSP: 002b:00007ffdb3fa4608 EFLAGS: 00000217 ORIG_RAX: 0000000000000133
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004419c9
> RDX: 0000000000000001 RSI: 0000000020002000 RDI: 0000000000000003
> RBP: 00000000006cd018 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004026c0
> R13: 0000000000402750 R14: 0000000000000000 R15: 0000000000000000
>
> Local variable description: ----fl4_stack@udp_sendmsg
> Variable was created at:
> udp_sendmsg+0xe5/0x33f0 net/ipv4/udp.c:841
> udpv6_sendmsg+0x1291/0x3f40 net/ipv6/udp.c:1175
> ==================================================================
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
^ permalink raw reply
* KMSAN: uninit-value in xfrm_state_find
From: syzbot @ 2018-06-15 7:30 UTC (permalink / raw)
To: davem, herbert, linux-kernel, netdev, steffen.klassert,
syzkaller-bugs
Hello,
syzbot found the following crash on:
HEAD commit: 1df165c8d2d6 kmsan: introduce kmsan_clear_user_page()
git tree: https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=15336e97800000
kernel config: https://syzkaller.appspot.com/x/.config?x=4ca1e57bafa8ab1f
dashboard link: https://syzkaller.appspot.com/bug?extid=131cd4c6d21724b99a26
compiler: clang version 7.0.0 (trunk 329391)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=12c7a417800000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13710197800000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+131cd4c6d21724b99a26@syzkaller.appspotmail.com
IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
8021q: adding VLAN 0 to HW filter on device team0
==================================================================
BUG: KMSAN: uninit-value in __arch_swab32
arch/x86/include/uapi/asm/swab.h:10 [inline]
BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
BUG: KMSAN: uninit-value in __xfrm6_daddr_saddr_hash
net/xfrm/xfrm_hash.h:29 [inline]
BUG: KMSAN: uninit-value in __xfrm_dst_hash net/xfrm/xfrm_hash.h:96 [inline]
BUG: KMSAN: uninit-value in xfrm_dst_hash net/xfrm/xfrm_state.c:60 [inline]
BUG: KMSAN: uninit-value in xfrm_state_find+0x2b15/0x4f40
net/xfrm/xfrm_state.c:952
CPU: 0 PID: 4464 Comm: syz-executor988 Not tainted 4.17.0-rc3+ #93
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1084
__msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
__arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
__fswab32 include/uapi/linux/swab.h:59 [inline]
__xfrm6_daddr_saddr_hash net/xfrm/xfrm_hash.h:29 [inline]
__xfrm_dst_hash net/xfrm/xfrm_hash.h:96 [inline]
xfrm_dst_hash net/xfrm/xfrm_state.c:60 [inline]
xfrm_state_find+0x2b15/0x4f40 net/xfrm/xfrm_state.c:952
xfrm_tmpl_resolve_one net/xfrm/xfrm_policy.c:1393 [inline]
xfrm_tmpl_resolve net/xfrm/xfrm_policy.c:1437 [inline]
xfrm_resolve_and_create_bundle+0xc31/0x5270 net/xfrm/xfrm_policy.c:1833
xfrm_lookup+0x606/0x39d0 net/xfrm/xfrm_policy.c:2163
xfrm_lookup_route+0xfa/0x360 net/xfrm/xfrm_policy.c:2283
ip_route_output_flow+0x35b/0x3b0 net/ipv4/route.c:2574
udp_sendmsg+0x2289/0x33f0 net/ipv4/udp.c:1006
udpv6_sendmsg+0x1291/0x3f40 net/ipv6/udp.c:1175
inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:629 [inline]
sock_sendmsg net/socket.c:639 [inline]
___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
__sys_sendmmsg+0x490/0x850 net/socket.c:2212
__do_sys_sendmmsg net/socket.c:2241 [inline]
__se_sys_sendmmsg net/socket.c:2238 [inline]
__x64_sys_sendmmsg+0x11c/0x170 net/socket.c:2238
do_syscall_64+0x154/0x220 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4419c9
RSP: 002b:00007ffdb3fa4608 EFLAGS: 00000217 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004419c9
RDX: 0000000000000001 RSI: 0000000020002000 RDI: 0000000000000003
RBP: 00000000006cd018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004026c0
R13: 0000000000402750 R14: 0000000000000000 R15: 0000000000000000
Local variable description: ----fl4_stack@udp_sendmsg
Variable was created at:
udp_sendmsg+0xe5/0x33f0 net/ipv4/udp.c:841
udpv6_sendmsg+0x1291/0x3f40 net/ipv6/udp.c:1175
==================================================================
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
^ permalink raw reply
* Re: [PATCH v7 net] stmmac: added support for 802.1ad vlan stripping
From: Toshiaki Makita @ 2018-06-15 7:19 UTC (permalink / raw)
To: Elad Nachman, David Miller
Cc: Jose.Abreu, f.fainelli, netdev, peppe.cavallaro, alexandre.torgue
In-Reply-To: <82c9c00a-b1b7-bd16-0e9c-8b31291cb618@gmail.com>
On 2018/06/15 15:57, Elad Nachman wrote:
> stmmac reception handler calls stmmac_rx_vlan() to strip the vlan before
> calling napi_gro_receive().
>
> The function assumes VLAN tagged frames are always tagged with
> 802.1Q protocol, and assigns ETH_P_8021Q to the skb by hard-coding
> the parameter on call to __vlan_hwaccel_put_tag() .
>
> This causes packets not to be passed to the VLAN slave if it was created
> with 802.1AD protocol
> (ip link add link eth0 eth0.100 type vlan proto 802.1ad id 100).
>
> This fix passes the protocol from the VLAN header into
> __vlan_hwaccel_put_tag() instead of using the hard-coded value of
> ETH_P_8021Q.
>
> NETIF_F_HW_VLAN_STAG_RX check was added and the strip action is now
> dependent on the correct combination of features and the detected vlan tag.
>
> NETIF_F_HW_VLAN_STAG_RX feature was added to be in line with the driver's
> actual abilities.
>
> Signed-off-by: Elad Nachman <eladn@gilat.com>
Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 11fb7c7..c4ffbfb 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -3182,17 +3182,22 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
>
> static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb)
> {
> - struct ethhdr *ehdr;
> + struct vlan_ethhdr *veth;
> + __be16 vlan_proto;
> u16 vlanid;
>
> - if ((dev->features & NETIF_F_HW_VLAN_CTAG_RX) ==
> - NETIF_F_HW_VLAN_CTAG_RX &&
> - !__vlan_get_tag(skb, &vlanid)) {
> + veth = (struct vlan_ethhdr *)skb->data;
> + vlan_proto = veth->h_vlan_proto;
> +
> + if ((vlan_proto == htons(ETH_P_8021Q) &&
> + dev->features & NETIF_F_HW_VLAN_CTAG_RX) ||
> + (vlan_proto == htons(ETH_P_8021AD) &&
> + dev->features & NETIF_F_HW_VLAN_STAG_RX)) {
> /* pop the vlan tag */
> - ehdr = (struct ethhdr *)skb->data;
> - memmove(skb->data + VLAN_HLEN, ehdr, ETH_ALEN * 2);
> + vlanid = ntohs(veth->h_vlan_TCI);
> + memmove(skb->data + VLAN_HLEN, veth, ETH_ALEN * 2);
> skb_pull(skb, VLAN_HLEN);
> - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlanid);
> + __vlan_hwaccel_put_tag(skb, vlan_proto, vlanid);
> }
> }
>
> @@ -4235,7 +4240,7 @@ int stmmac_dvr_probe(struct device *device,
> ndev->watchdog_timeo = msecs_to_jiffies(watchdog);
> #ifdef STMMAC_VLAN_TAG_USED
> /* Both mac100 and gmac support receive VLAN tag detection */
> - ndev->features |= NETIF_F_HW_VLAN_CTAG_RX;
> + ndev->features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX;
> #endif
> priv->msg_enable = netif_msg_init(debug, default_msg_level);
>
>
--
Toshiaki Makita
^ permalink raw reply
* [PATCH v7 net] stmmac: added support for 802.1ad vlan stripping
From: Elad Nachman @ 2018-06-15 6:57 UTC (permalink / raw)
To: Toshiaki Makita, David Miller
Cc: Jose.Abreu, f.fainelli, netdev, peppe.cavallaro, alexandre.torgue,
eladv6
In-Reply-To: <e68da366-9c38-c2d5-80e7-9d5ec8799cdd@lab.ntt.co.jp>
The stmmac receive handler calls stmmac_rx_vlan() to strip the VLAN tag
before calling napi_gro_receive().
The function assumes that VLAN-tagged frames are always tagged with the
802.1Q protocol, and assigns ETH_P_8021Q to the skb by hard-coding
the parameter in the call to __vlan_hwaccel_put_tag().
As a result, packets are not passed to the VLAN slave if it was created
with the 802.1AD protocol
(ip link add link eth0 eth0.100 type vlan proto 802.1ad id 100).
This fix passes the protocol from the VLAN header into
__vlan_hwaccel_put_tag() instead of using the hard-coded value of
ETH_P_8021Q.
A NETIF_F_HW_VLAN_STAG_RX check was added, and the strip action now
depends on the correct combination of features and the detected VLAN tag.
The NETIF_F_HW_VLAN_STAG_RX feature was added to match the driver's
actual abilities.
Signed-off-by: Elad Nachman <eladn@gilat.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 11fb7c7..c4ffbfb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3182,17 +3182,22 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb)
{
- struct ethhdr *ehdr;
+ struct vlan_ethhdr *veth;
+ __be16 vlan_proto;
u16 vlanid;
- if ((dev->features & NETIF_F_HW_VLAN_CTAG_RX) ==
- NETIF_F_HW_VLAN_CTAG_RX &&
- !__vlan_get_tag(skb, &vlanid)) {
+ veth = (struct vlan_ethhdr *)skb->data;
+ vlan_proto = veth->h_vlan_proto;
+
+ if ((vlan_proto == htons(ETH_P_8021Q) &&
+ dev->features & NETIF_F_HW_VLAN_CTAG_RX) ||
+ (vlan_proto == htons(ETH_P_8021AD) &&
+ dev->features & NETIF_F_HW_VLAN_STAG_RX)) {
/* pop the vlan tag */
- ehdr = (struct ethhdr *)skb->data;
- memmove(skb->data + VLAN_HLEN, ehdr, ETH_ALEN * 2);
+ vlanid = ntohs(veth->h_vlan_TCI);
+ memmove(skb->data + VLAN_HLEN, veth, ETH_ALEN * 2);
skb_pull(skb, VLAN_HLEN);
- __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlanid);
+ __vlan_hwaccel_put_tag(skb, vlan_proto, vlanid);
}
}
@@ -4235,7 +4240,7 @@ int stmmac_dvr_probe(struct device *device,
ndev->watchdog_timeo = msecs_to_jiffies(watchdog);
#ifdef STMMAC_VLAN_TAG_USED
/* Both mac100 and gmac support receive VLAN tag detection */
- ndev->features |= NETIF_F_HW_VLAN_CTAG_RX;
+ ndev->features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX;
#endif
priv->msg_enable = netif_msg_init(debug, default_msg_level);
--
2.7.4
^ permalink raw reply related
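The strip decision in the patch above can be illustrated outside the kernel. The following is only a user-space sketch, assuming a raw Ethernet frame in a byte buffer; every sketch_* name and SKETCH_* constant is hypothetical and merely stands in for the kernel's vlan_ethhdr fields and NETIF_F_HW_VLAN_{CTAG,STAG}_RX flags. It is not the driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN    6
#define SKETCH_VLAN_HLEN   4
#define SKETCH_TPID_8021Q  0x8100u  /* value of ETH_P_8021Q */
#define SKETCH_TPID_8021AD 0x88A8u  /* value of ETH_P_8021AD */

/* Hypothetical stand-ins for the two netdev feature bits. */
#define SKETCH_F_CTAG_RX   (1u << 0)
#define SKETCH_F_STAG_RX   (1u << 1)

struct sketch_vlan_tag {
	uint16_t proto; /* tag protocol, host byte order */
	uint16_t tci;   /* tag control information, host byte order */
};

/*
 * Pop the outer VLAN tag only if its protocol matches an enabled
 * feature. On success the two MAC addresses are moved forward by
 * SKETCH_VLAN_HLEN bytes and *new_off is the new start of the frame.
 */
bool sketch_rx_vlan(uint8_t *frame, size_t len, unsigned int features,
		    struct sketch_vlan_tag *tag, size_t *new_off)
{
	uint16_t proto;

	if (len < 2 * SKETCH_ETH_ALEN + SKETCH_VLAN_HLEN + 2)
		return false;

	/* The tag protocol follows the destination and source MACs. */
	proto = (uint16_t)((frame[12] << 8) | frame[13]);

	if (!((proto == SKETCH_TPID_8021Q && (features & SKETCH_F_CTAG_RX)) ||
	      (proto == SKETCH_TPID_8021AD && (features & SKETCH_F_STAG_RX))))
		return false;

	tag->proto = proto;
	tag->tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Overwrite the tag by shifting both MAC addresses forward. */
	memmove(frame + SKETCH_VLAN_HLEN, frame, 2 * SKETCH_ETH_ALEN);
	*new_off = SKETCH_VLAN_HLEN;
	return true;
}
```

The point of the sketch is that the protocol reported alongside the stripped tag is the one read from the frame, not a hard-coded 802.1Q value, and that an 802.1ad frame is left untouched when only the C-tag feature is enabled.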
* [PATCH RFC ipsec-next] xfrm: Extend the output_mark to support input direction and masking.
From: Steffen Klassert @ 2018-06-15 6:55 UTC (permalink / raw)
To: netdev; +Cc: Tobias Brunner, Eyal Birger, Lorenzo Colitti
We already support setting an output mark on the xfrm_state.
Unfortunately, this supports neither the input direction nor
masking of the marks that will be applied to the skb. This change
adds support for applying a masked value in both directions.
The existing XFRMA_OUTPUT_MARK number is reused for this purpose
and, as it is now bi-directional, it is renamed to XFRMA_SET_MARK.
An additional XFRMA_SET_MARK_MASK attribute is added for setting the
mask. If the mask attribute is not provided, it is set to 0xffffffff,
keeping the existing 'full mask' semantics of XFRMA_OUTPUT_MARK.
Co-developed-by: Tobias Brunner <tobias@strongswan.org>
Co-developed-by: Eyal Birger <eyal.birger@gmail.com>
Co-developed-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
include/net/xfrm.h | 9 ++++++++-
include/uapi/linux/xfrm.h | 4 +++-
net/xfrm/xfrm_input.c | 2 ++
net/xfrm/xfrm_output.c | 3 +--
net/xfrm/xfrm_policy.c | 5 +++--
net/xfrm/xfrm_user.c | 48 +++++++++++++++++++++++++++++++++++++----------
6 files changed, 55 insertions(+), 16 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 45e75c36b738..8727b2484855 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -166,7 +166,7 @@ struct xfrm_state {
int header_len;
int trailer_len;
u32 extra_flags;
- u32 output_mark;
+ struct xfrm_mark smark;
} props;
struct xfrm_lifetime_cfg lft;
@@ -2012,6 +2012,13 @@ static inline int xfrm_mark_put(struct sk_buff *skb, const struct xfrm_mark *m)
return ret;
}
+static inline __u32 xfrm_smark_get(__u32 mark, struct xfrm_state *x)
+{
+ struct xfrm_mark *m = &x->props.smark;
+
+ return (m->v & m->m) | (mark & ~m->m);
+}
+
static inline int xfrm_tunnel_check(struct sk_buff *skb, struct xfrm_state *x,
unsigned int family)
{
diff --git a/include/uapi/linux/xfrm.h b/include/uapi/linux/xfrm.h
index e3af2859188b..5a6ed7ce5a29 100644
--- a/include/uapi/linux/xfrm.h
+++ b/include/uapi/linux/xfrm.h
@@ -305,9 +305,11 @@ enum xfrm_attr_type_t {
XFRMA_ADDRESS_FILTER, /* struct xfrm_address_filter */
XFRMA_PAD,
XFRMA_OFFLOAD_DEV, /* struct xfrm_state_offload */
- XFRMA_OUTPUT_MARK, /* __u32 */
+ XFRMA_SET_MARK, /* __u32 */
+ XFRMA_SET_MARK_MASK, /* __u32 */
__XFRMA_MAX
+#define XFRMA_OUTPUT_MARK XFRMA_SET_MARK /* Compatibility */
#define XFRMA_MAX (__XFRMA_MAX - 1)
};
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 352abca2605f..074810436242 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -339,6 +339,8 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
goto drop;
}
+ skb->mark = xfrm_smark_get(skb->mark, x);
+
skb->sp->xvec[skb->sp->len++] = x;
lock:
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 89b178a78dc7..45ba07ab3e4f 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -66,8 +66,7 @@ static int xfrm_output_one(struct sk_buff *skb, int err)
goto error_nolock;
}
- if (x->props.output_mark)
- skb->mark = x->props.output_mark;
+ skb->mark = xfrm_smark_get(skb->mark, x);
err = x->outer_mode->output(x, skb);
if (err) {
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 40b54cc64243..f95f5f75748c 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1607,10 +1607,11 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
dst_copy_metrics(dst1, dst);
if (xfrm[i]->props.mode != XFRM_MODE_TRANSPORT) {
+ __u32 mark = xfrm_smark_get(fl->flowi_mark, xfrm[i]);
+
family = xfrm[i]->props.family;
dst = xfrm_dst_lookup(xfrm[i], tos, fl->flowi_oif,
- &saddr, &daddr, family,
- xfrm[i]->props.output_mark);
+ &saddr, &daddr, family, mark);
err = PTR_ERR(dst);
if (IS_ERR(dst))
goto put_states;
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 080035f056d9..9602cc9e05ab 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -527,6 +527,19 @@ static void xfrm_update_ae_params(struct xfrm_state *x, struct nlattr **attrs,
x->replay_maxdiff = nla_get_u32(rt);
}
+static void xfrm_smark_init(struct nlattr **attrs, struct xfrm_mark *m)
+{
+ if (attrs[XFRMA_SET_MARK]) {
+ m->v = nla_get_u32(attrs[XFRMA_SET_MARK]);
+ if (attrs[XFRMA_SET_MARK_MASK])
+ m->m = nla_get_u32(attrs[XFRMA_SET_MARK_MASK]);
+ else
+ m->m = 0xffffffff;
+ } else {
+ m->v = m->m = 0;
+ }
+}
+
static struct xfrm_state *xfrm_state_construct(struct net *net,
struct xfrm_usersa_info *p,
struct nlattr **attrs,
@@ -579,8 +592,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
xfrm_mark_get(attrs, &x->mark);
- if (attrs[XFRMA_OUTPUT_MARK])
- x->props.output_mark = nla_get_u32(attrs[XFRMA_OUTPUT_MARK]);
+ xfrm_smark_init(attrs, &x->props.smark);
err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV]);
if (err)
@@ -824,6 +836,18 @@ static int copy_to_user_auth(struct xfrm_algo_auth *auth, struct sk_buff *skb)
return 0;
}
+static int xfrm_smark_put(struct sk_buff *skb, struct xfrm_mark *m)
+{
+ int ret = 0;
+
+ if (m->v | m->m) {
+ ret = nla_put_u32(skb, XFRMA_SET_MARK, m->v);
+ if (!ret)
+ ret = nla_put_u32(skb, XFRMA_SET_MARK_MASK, m->m);
+ }
+ return ret;
+}
+
/* Don't change this without updating xfrm_sa_len! */
static int copy_to_user_state_extra(struct xfrm_state *x,
struct xfrm_usersa_info *p,
@@ -887,6 +911,11 @@ static int copy_to_user_state_extra(struct xfrm_state *x,
ret = xfrm_mark_put(skb, &x->mark);
if (ret)
goto out;
+
+ ret = xfrm_smark_put(skb, &x->props.smark);
+ if (ret)
+ goto out;
+
if (x->replay_esn)
ret = nla_put(skb, XFRMA_REPLAY_ESN_VAL,
xfrm_replay_state_esn_len(x->replay_esn),
@@ -900,11 +929,7 @@ static int copy_to_user_state_extra(struct xfrm_state *x,
ret = copy_user_offload(&x->xso, skb);
if (ret)
goto out;
- if (x->props.output_mark) {
- ret = nla_put_u32(skb, XFRMA_OUTPUT_MARK, x->props.output_mark);
- if (ret)
- goto out;
- }
+
if (x->security)
ret = copy_sec_ctx(x->security, skb);
out:
@@ -2493,7 +2518,8 @@ static const struct nla_policy xfrma_policy[XFRMA_MAX+1] = {
[XFRMA_PROTO] = { .type = NLA_U8 },
[XFRMA_ADDRESS_FILTER] = { .len = sizeof(struct xfrm_address_filter) },
[XFRMA_OFFLOAD_DEV] = { .len = sizeof(struct xfrm_user_offload) },
- [XFRMA_OUTPUT_MARK] = { .type = NLA_U32 },
+ [XFRMA_SET_MARK] = { .type = NLA_U32 },
+ [XFRMA_SET_MARK_MASK] = { .type = NLA_U32 },
};
static const struct nla_policy xfrma_spd_policy[XFRMA_SPD_MAX+1] = {
@@ -2719,8 +2745,10 @@ static inline unsigned int xfrm_sa_len(struct xfrm_state *x)
l += nla_total_size(sizeof(x->props.extra_flags));
if (x->xso.dev)
l += nla_total_size(sizeof(x->xso));
- if (x->props.output_mark)
- l += nla_total_size(sizeof(x->props.output_mark));
+ if (x->props.smark.v | x->props.smark.m) {
+ l += nla_total_size(sizeof(x->props.smark.v));
+ l += nla_total_size(sizeof(x->props.smark.m));
+ }
/* Must count x->lastused as it may become non-zero behind our back. */
l += nla_total_size_64bit(sizeof(u64));
--
2.14.1
^ permalink raw reply related
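The masking rule used by xfrm_smark_get() and the defaulting done in xfrm_smark_init() in the patch above can be tried in user space. This is a minimal sketch under that assumption; the sketch_* names and the have_value/have_mask parameters are hypothetical and only model whether the XFRMA_SET_MARK and XFRMA_SET_MARK_MASK attributes were supplied.

```c
#include <stdint.h>

/* Mirrors the value/mask pair of struct xfrm_mark. */
struct sketch_mark {
	uint32_t v; /* value */
	uint32_t m; /* mask */
};

/* Same expression as xfrm_smark_get(): masked bits come from the
 * state, all other bits are kept from the packet's current mark. */
uint32_t sketch_smark_get(uint32_t mark, const struct sketch_mark *sm)
{
	return (sm->v & sm->m) | (mark & ~sm->m);
}

/* Models the defaulting: no value means "leave the mark alone",
 * a value without a mask means "full mask". */
struct sketch_mark sketch_smark_init(int have_value, uint32_t value,
				     int have_mask, uint32_t mask)
{
	struct sketch_mark sm = { 0, 0 };

	if (have_value) {
		sm.v = value;
		sm.m = have_mask ? mask : 0xffffffffu;
	}
	return sm;
}
```

With a value of 0xab00 and a mask of 0xff00, a packet mark of 0x1234 becomes 0xab34: only the masked byte is replaced. With no attribute at all, both fields are zero and the mark passes through unchanged.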
* Re: [PATCH] selftests: bpf: config: add config fragments
From: Anders Roxell @ 2018-06-15 6:41 UTC (permalink / raw)
To: William Tu
Cc: Daniel Borkmann, Alexei Starovoitov, Shuah Khan, Networking,
Linux Kernel Mailing List, open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CALDO+SZXvnUU7VKd_UwnBCU0NiTHjqei4xEUCdcktkKCD3MnEg@mail.gmail.com>
On Thu, 14 Jun 2018 at 14:09, William Tu <u9012063@gmail.com> wrote:
>
> On Thu, Jun 14, 2018 at 4:42 AM, Anders Roxell <anders.roxell@linaro.org> wrote:
> > On 14 June 2018 at 13:06, William Tu <u9012063@gmail.com> wrote:
> >> On Tue, Jun 12, 2018 at 5:08 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>> On 06/12/2018 01:05 PM, Anders Roxell wrote:
> >>>> Test test_tunnel.sh fails because the config fragments aren't enabled.
> >>>>
> >>>> Fixes: 933a741e3b82 ("selftests/bpf: bpf tunnel test.")
> >>>> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
> >>>> ---
> >>>>
> >>>> All tests pass except ip6gretap, which still fails. I'm unsure why.
> >>>> Ideas?
> >>
> >> Hi Anders,
> >>
> >> ip6erspan is based on ip6gretap, does ip6erspan pass?
> >
> > it did pass when I was sending the email.
> > However, I retested this on next-20180613 and now it fails.
> >
> Does 'ip -s link show' show any errors/dropped on ip6gretap device?
I reran only the test_ip6gretap test, with "set -x" added to
test_tunnel.sh; here's the output.
I added "ip -s link show ip6gretap11" before the cleanup function in the script.
# ./test_tunnel.sh
+ PING_ARG='-c 3 -w 10 -q'
+ ret=0
+ GREEN='\033[0;92m'
+ RED='\033[0;31m'
+ NC='\033[0m'
+ trap cleanup 0 3 6
+ trap cleanup_exit 2 9
+ cleanup
+ ip netns delete at_ns0
+ ip link del veth1
+ ip link del ipip11
+ ip link del ipip6tnl11
+ ip link del gretap11
+ ip link del ip6gre11
+ ip link del ip6gretap11
+ ip link del vxlan11
+ ip link del ip6vxlan11
+ ip link del geneve11
+ ip link del ip6geneve11
+ ip link del erspan11
+ ip link del ip6erspan11
+ bpf_tunnel_test
+ echo 'Testing IP6GRETAP tunnel...'
Testing IP6GRETAP tunnel...
+ test_ip6gretap
+ TYPE=ip6gretap
+ DEV_NS=ip6gretap00
+ DEV=ip6gretap11
+ ret=0
+ check ip6gretap
+ ip link help ip6gretap
+ grep -q '^Usage:'
+ '[' 0 -ne 0 ']'
+ config_device
+ ip netns add at_ns0
+ ip link add veth0 type veth peer name veth1
+ ip link set veth0 netns at_ns0
+ ip netns exec at_ns0 ip addr add 172.16.1.100/24 dev veth0
+ ip netns exec at_ns0 ip link set dev veth0 up
+ ip link set dev veth1 up mtu 1500
+ ip addr add dev veth1 172.16.1.200/24
+ add_ip6gretap_tunnel
+ ip netns exec at_ns0 ip addr add ::11/96 dev veth0
+ ip netns exec at_ns0 ip link set dev veth0 up
+ ip addr add dev veth1 ::22/96
+ ip link set dev veth1 up
+ ip netns exec at_ns0 ip link add dev ip6gretap00 type ip6gretap seq flowlabel 0xbcdef key 2 local ::11 remote ::22
+ ip netns exec at_ns0 ip addr add dev ip6gretap00 10.1.1.100/24
+ ip netns exec at_ns0 ip addr add dev ip6gretap00 fc80::100/96
+ ip netns exec at_ns0 ip link set dev ip6gretap00 up
+ ip link add dev ip6gretap11 type ip6gretap external
+ ip addr add dev ip6gretap11 10.1.1.200/24
+ ip addr add dev ip6gretap11 fc80::200/24
+ ip link set dev ip6gretap11 up
+ attach_bpf ip6gretap11 ip6gretap_set_tunnel ip6gretap_get_tunnel
+ DEV=ip6gretap11
+ SET=ip6gretap_set_tunnel
+ GET=ip6gretap_get_tunnel
+ tc qdisc add dev ip6gretap11 clsact
+ tc filter add dev ip6gretap11 egress bpf da obj test_tunnel_kern.o sec ip6gretap_set_tunnel
+ tc filter add dev ip6gretap11 ingress bpf da obj test_tunnel_kern.o sec ip6gretap_get_tunnel
+ ping6 -c 3 -w 10 -q ::11
PING ::11 (::11): 56 data bytes
--- ::11 ping statistics ---
5 packets transmitted, 3 packets received, 40% packet loss
round-trip min/avg/max = 0.139/1.857/5.293 ms
+ ip netns exec at_ns0 ping -c 3 -w 10 -q 10.1.1.200
PING 10.1.1.200 (10.1.1.200): 56 data bytes
--- 10.1.1.200 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.214/0.256/0.305 ms
+ ping -c 3 -w 10 -q 10.1.1.100
PING 10.1.1.100 (10.1.1.100): 56 data bytes
--- 10.1.1.100 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.210/0.211/0.213 ms
+ check_err 0
+ '[' 0 -eq 0 ']'
+ ret=0
+ ip netns exec at_ns0 ping6 -c 3 -w 10 -q fc80::200
PING fc80::200 (fc80::200): 56 data bytes
--- fc80::200 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
+ check_err 1
+ '[' 0 -eq 0 ']'
+ ret=1
+ ip -s link show ip6gretap11
19: ip6gretap11@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1434 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
link/ether de:d2:0c:53:80:8c brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
2096 25 0 0 0 0
TX: bytes packets errors dropped carrier collsns
5324 36 5 5 0 0
+ cleanup
+ ip netns delete at_ns0
+ ip link del veth1
+ ip link del ipip11
+ ip link del ipip6tnl11
+ ip link del gretap11
+ ip link del ip6gre11
+ ip link del ip6gretap11
+ ip link del vxlan11
+ ip link del ip6vxlan11
+ ip link del geneve11
+ ip link del ip6geneve11
+ ip link del erspan11
+ ip link del ip6erspan11
+ '[' 1 -ne 0 ']'
+ echo -e '\033[0;31mFAIL: ip6gretap\033[0m'
FAIL: ip6gretap
+ return 1
+ exit 0
+ cleanup
+ ip netns delete at_ns0
+ ip link del veth1
+ ip link del ipip11
+ ip link del ipip6tnl11
+ ip link del gretap11
+ ip link del ip6gre11
+ ip link del ip6gretap11
+ ip link del vxlan11
+ ip link del ip6vxlan11
+ ip link del geneve11
+ ip link del ip6geneve11
+ ip link del erspan11
+ ip link del ip6erspan11
^ permalink raw reply
* Re: [PATCH net-next,RFC 00/13] New fast forwarding path
From: Steffen Klassert @ 2018-06-15 6:34 UTC (permalink / raw)
To: David Miller; +Cc: tom, pablo, netfilter-devel, netdev
In-Reply-To: <20180614.165834.338565136334574983.davem@davemloft.net>
On Thu, Jun 14, 2018 at 04:58:34PM -0700, David Miller wrote:
> From: Tom Herbert <tom@herbertland.com>
> Date: Thu, 14 Jun 2018 13:52:03 -0700
>
> > IIRC, there was a similar proposal a while back that wanted to bundle
> > packets of the same flow together (without doing GRO) so that they
> > could be processed by various functions by looking at just one
> > representative packet in the group. The concept had some promise, but
> > in the end it created quite a bit of complexity since at some point
> > the packet bundle needed to be undone to go back to processing the
> > individual packets.
>
> You're probably talking about Edward Cree's SKB list stuff, and as
> per his presentation at netconf 2 weeks ago he plans to revitalize
> it given how Spectre et al. gives cause to reevaluate all bulking
> techniques.
Are there patches for the proposal Edward made a while ago,
or was it just a concept?
Maybe we can somehow put things together; I just need some
batching method that works for IPsec and UDP. It does not
need to be exactly the one we are proposing here.
^ permalink raw reply
* Re: [PATCH net-next,RFC 00/13] New fast forwarding path
From: Steffen Klassert @ 2018-06-15 6:27 UTC (permalink / raw)
To: Tom Herbert
Cc: Pablo Neira Ayuso, netfilter-devel,
Linux Kernel Network Developers
In-Reply-To: <CALx6S345B2-sejH52beewdiHuGo5nAY0v0uFmNuiJ0hQ6Wu3xw@mail.gmail.com>
On Thu, Jun 14, 2018 at 01:52:03PM -0700, Tom Herbert wrote:
> On Thu, Jun 14, 2018 at 7:19 AM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Hi,
> >
> > This patchset proposes a new fast forwarding path infrastructure that
> > combines the GRO/GSO and the flowtable infrastructures. The idea is to
> > add a hook at the GRO layer that is invoked before the standard GRO
> > protocol offloads. This allows us to build custom packet chains that we
> > can quickly pass in one go to the neighbour layer to define fast
> > forwarding path for flows.
> >
> > For each packet that gets into the GRO layer, we first check if there is
> > an entry in the flowtable, if so, the packet is placed in a list until
> > the GRO infrastructure decides to send the batch from gro_complete to
> > the neighbour layer. The first packet in the list takes the route from
> > the flowtable entry, so we avoid reiterative routing lookups.
> >
> > In case no entry is found in the flowtable, the packet is passed up to
> > the classic GRO offload handlers. Thus, this packet follows the standard
> > forwarding path. Note that the initial packets of the flow always go
> > through the standard IPv4/IPv6 netfilter forward hook, that is used to
> > configure what flows are placed in the flowtable. Therefore, only a few
> > (initial) packets follow the standard forwarding path while most of the
> > follow up packets take this new fast forwarding path.
> >
>
> IIRC, there was a similar proposal a while back that wanted to bundle
> packets of the same flow together (without doing GRO) so that they
> could be processed by various functions by looking at just one
> representative packet in the group. The concept had some promise, but
> in the end it created quite a bit of complexity since at some point
> the packet bundle needed to be undone to go back to processing the
> individual packets.
With the way we chain the packets it is not too complicated to
undo this chaining (nft_skb_segment in patch 5 implements this).
After that, this looks like a chain of usual segments, so we
trigger xmit_more with every packet chain.
^ permalink raw reply
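The per-flow chaining discussed in this thread (look up the flow, chain the packet if an entry exists, otherwise fall back to the standard path, and later hand over or undo the chain) can be sketched with a toy table. This is only an illustration under heavy assumptions: all sketch_* names are hypothetical, the lookup is a linear scan instead of a real flowtable, and none of it reflects the actual patchset code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_FLOWS 4

struct sketch_pkt {
	uint32_t flow_key;
	struct sketch_pkt *next;
};

struct sketch_flow {
	uint32_t key;
	int in_use;
	struct sketch_pkt *head; /* first packet of the pending chain */
	struct sketch_pkt *tail; /* last packet of the pending chain */
};

struct sketch_table {
	struct sketch_flow flows[SKETCH_MAX_FLOWS];
	unsigned int slow_path; /* packets handed to the standard path */
};

struct sketch_flow *sketch_lookup(struct sketch_table *t, uint32_t key)
{
	for (size_t i = 0; i < SKETCH_MAX_FLOWS; i++)
		if (t->flows[i].in_use && t->flows[i].key == key)
			return &t->flows[i];
	return NULL;
}

/* Returns 1 if the packet was chained for the fast path, 0 if it
 * has to take the standard forwarding path. */
int sketch_receive(struct sketch_table *t, struct sketch_pkt *p)
{
	struct sketch_flow *f = sketch_lookup(t, p->flow_key);

	p->next = NULL;
	if (!f) {
		t->slow_path++;
		return 0;
	}
	if (f->tail)
		f->tail->next = p;
	else
		f->head = p;
	f->tail = p;
	return 1;
}

/* Detach the whole chain so it can be passed on in one go, or
 * walked packet by packet to undo the chaining. */
struct sketch_pkt *sketch_flush(struct sketch_flow *f)
{
	struct sketch_pkt *chain = f->head;

	f->head = NULL;
	f->tail = NULL;
	return chain;
}
```

Because the chain is a plain singly linked list in arrival order, undoing it amounts to walking the next pointers, which matches the remark that the chaining is not too complicated to undo.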
* Re: [PATCH net-next,RFC 00/13] New fast forwarding path
From: Steffen Klassert @ 2018-06-15 6:17 UTC (permalink / raw)
To: David Miller; +Cc: pablo, netfilter-devel, netdev
In-Reply-To: <20180614.101831.465275975690050595.davem@davemloft.net>
On Thu, Jun 14, 2018 at 10:18:31AM -0700, David Miller wrote:
> From: Pablo Neira Ayuso <pablo@netfilter.org>
> Date: Thu, 14 Jun 2018 16:19:34 +0200
>
> > This patchset proposes a new fast forwarding path infrastructure
> > that combines the GRO/GSO and the flowtable infrastructures. The
> > idea is to add a hook at the GRO layer that is invoked before the
> > standard GRO protocol offloads. This allows us to build custom
> > packet chains that we can quickly pass in one go to the neighbour
> > layer to define fast forwarding path for flows.
>
> We have full, complete, customizability of the packet path via XDP
> and eBPF.
>
> XDP and eBPF supports everything necessary to accomplish that,
> there are implementations of forwarding implementations in
> the tree and elsewhere.
>
> And most importantly, XDP and eBPF are optimized in drivers and
> offloaded to hardware.
>
> There really is no need for something like what you are proposing.
I started with this last year because I wanted to improve
the IPsec (and UDP) forwarding path. Batching packets
at layer 2 and sending them directly to the output path
seemed to be a good method to improve this.
In particular, we need to do only one IPsec lookup
for the whole packet chain. So it eases the pain
of removing the IPsec flowcache a bit. It can be
only a first step, but we need some improvements here
as people are starting to complain about that.
^ permalink raw reply
* Re: [bpf PATCH v2 6/6] bpf: selftest remove attempts to add LISTEN sockets to sockmap
From: Martin KaFai Lau @ 2018-06-15 6:07 UTC (permalink / raw)
To: John Fastabend; +Cc: ast, daniel, netdev
In-Reply-To: <20180614164512.24994.56879.stgit@john-Precision-Tower-5810>
On Thu, Jun 14, 2018 at 09:45:12AM -0700, John Fastabend wrote:
> In selftest test_maps the sockmap test case attempts to add a socket
> in listening state to the sockmap. This is no longer a valid operation
> so it fails as expected. However, the test wrongly reports this as an
> error now. Fix the test to avoid adding sockets in listening state.
>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
> ---
> 0 files changed
>
> diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
> index 6c25334..9fed5f0 100644
> --- a/tools/testing/selftests/bpf/test_maps.c
> +++ b/tools/testing/selftests/bpf/test_maps.c
> @@ -564,7 +564,7 @@ static void test_sockmap(int tasks, void *data)
> }
>
> /* Test update without programs */
> - for (i = 0; i < 6; i++) {
> + for (i = 2; i < 6; i++) {
> err = bpf_map_update_elem(fd, &i, &sfd[i], BPF_ANY);
> if (err) {
> printf("Failed noprog update sockmap '%i:%i'\n",
> @@ -727,7 +727,7 @@ static void test_sockmap(int tasks, void *data)
> }
>
> /* Test map update elem afterwards fd lives in fd and map_fd */
> - for (i = 0; i < 6; i++) {
> + for (i = 2; i < 6; i++) {
> err = bpf_map_update_elem(map_fd_rx, &i, &sfd[i], BPF_ANY);
> if (err) {
> printf("Failed map_fd_rx update sockmap %i '%i:%i'\n",
>
^ permalink raw reply
* Re: [bpf PATCH v2 5/6] bpf: sockhash, add release routine
From: Martin KaFai Lau @ 2018-06-15 6:05 UTC (permalink / raw)
To: John Fastabend; +Cc: ast, daniel, netdev
In-Reply-To: <20180614164507.24994.83616.stgit@john-Precision-Tower-5810>
On Thu, Jun 14, 2018 at 09:45:07AM -0700, John Fastabend wrote:
> Add map_release_uref pointer to hashmap ops. This was dropped when
> original sockhash code was ported into bpf-next before initial
> commit.
>
> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
> ---
> 0 files changed
>
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index ffc5152..77fe204 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -2518,6 +2518,7 @@ struct sock *__sock_hash_lookup_elem(struct bpf_map *map, void *key)
> .map_get_next_key = sock_hash_get_next_key,
> .map_update_elem = sock_hash_update_elem,
> .map_delete_elem = sock_hash_delete_elem,
> + .map_release_uref = sock_map_release,
> };
>
> static bool bpf_is_valid_sock(struct bpf_sock_ops_kern *ops)
>
^ permalink raw reply
* Re: [bpf PATCH v2 4/6] bpf: sockmap, tcp_disconnect to listen transition
From: Martin KaFai Lau @ 2018-06-15 6:04 UTC (permalink / raw)
To: John Fastabend; +Cc: ast, daniel, netdev
In-Reply-To: <20180614164502.24994.38682.stgit@john-Precision-Tower-5810>
On Thu, Jun 14, 2018 at 09:45:02AM -0700, John Fastabend wrote:
> After adding checks to ensure TCP is in ESTABLISHED state when a
> sock is added we need to also ensure that user does not transition
> through tcp_disconnect() and back into ESTABLISHED state without
> sockmap removing the sock.
>
> To do this add unhash hook and remove sock from map there.
>
> Reported-by: Eric Dumazet <edumazet@google.com>
> Fixes: 81110384441a ("bpf: sockmap, add hash map support")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
LGTM. One nit.
Acked-by: Martin KaFai Lau <kafai@fb.com>
> ---
> 0 files changed
>
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index 04764f5..ffc5152 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -130,6 +130,7 @@ struct smap_psock {
>
> struct proto *sk_proto;
> void (*save_close)(struct sock *sk, long timeout);
> + void (*save_unhash)(struct sock *sk);
> void (*save_data_ready)(struct sock *sk);
> void (*save_write_space)(struct sock *sk);
> };
> @@ -141,6 +142,7 @@ static int bpf_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
> int offset, size_t size, int flags);
> static void bpf_tcp_close(struct sock *sk, long timeout);
> +static void bpf_tcp_unhash(struct sock *sk);
>
> static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
> {
> @@ -182,6 +184,7 @@ static void build_protos(struct proto prot[SOCKMAP_NUM_CONFIGS],
> {
> prot[SOCKMAP_BASE] = *base;
> prot[SOCKMAP_BASE].close = bpf_tcp_close;
> + prot[SOCKMAP_BASE].unhash = bpf_tcp_unhash;
> prot[SOCKMAP_BASE].recvmsg = bpf_tcp_recvmsg;
> prot[SOCKMAP_BASE].stream_memory_read = bpf_tcp_stream_read;
>
> @@ -215,6 +218,7 @@ static int bpf_tcp_init(struct sock *sk)
> }
>
> psock->save_close = sk->sk_prot->close;
> + psock->save_unhash = sk->sk_prot->unhash;
> psock->sk_proto = sk->sk_prot;
>
> /* Build IPv6 sockmap whenever the address of tcpv6_prot changes */
> @@ -302,28 +306,12 @@ struct smap_psock_map_entry *psock_map_pop(struct sock *sk,
> return e;
> }
>
> -static void bpf_tcp_close(struct sock *sk, long timeout)
> +static void bpf_tcp_remove(struct sock *sk, struct smap_psock *psock)
> {
> - void (*close_fun)(struct sock *sk, long timeout);
> struct smap_psock_map_entry *e;
> struct sk_msg_buff *md, *mtmp;
> - struct smap_psock *psock;
> struct sock *osk;
>
> - rcu_read_lock();
> - psock = smap_psock_sk(sk);
> - if (unlikely(!psock)) {
> - rcu_read_unlock();
> - return sk->sk_prot->close(sk, timeout);
> - }
> -
> - /* The psock may be destroyed anytime after exiting the RCU critial
> - * section so by the time we use close_fun the psock may no longer
> - * be valid. However, bpf_tcp_close is called with the sock lock
> - * held so the close hook and sk are still valid.
> - */
> - close_fun = psock->save_close;
> -
> if (psock->cork) {
> free_start_sg(psock->sock, psock->cork);
> kfree(psock->cork);
> @@ -378,6 +366,51 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
> }
> e = psock_map_pop(sk, psock);
> }
> +}
> +
> +static void bpf_tcp_unhash(struct sock *sk)
> +{
> + void (*unhash_fun)(struct sock *sk);
> + struct smap_psock *psock;
> +
> + rcu_read_lock();
> + psock = smap_psock_sk(sk);
> + if (unlikely(!psock)) {
> + rcu_read_unlock();
> + return sk->sk_prot->unhash(sk);
> + }
> +
> + /* The psock may be destroyed anytime after exiting the RCU critial
> + * section so by the time we use close_fun the psock may no longer
> + * be valid. However, bpf_tcp_close is called with the sock lock
> + * held so the close hook and sk are still valid.
> + */
Nit. s/close/unhash/
> + unhash_fun = psock->save_unhash;
> + bpf_tcp_remove(sk, psock);
> + rcu_read_unlock();
> + unhash_fun(sk);
> +
> +}
> +
> +static void bpf_tcp_close(struct sock *sk, long timeout)
> +{
> + void (*close_fun)(struct sock *sk, long timeout);
> + struct smap_psock *psock;
> +
> + rcu_read_lock();
> + psock = smap_psock_sk(sk);
> + if (unlikely(!psock)) {
> + rcu_read_unlock();
> + return sk->sk_prot->close(sk, timeout);
> + }
> +
> + /* The psock may be destroyed anytime after exiting the RCU critial
> + * section so by the time we use close_fun the psock may no longer
> + * be valid. However, bpf_tcp_close is called with the sock lock
> + * held so the close hook and sk are still valid.
> + */
> + close_fun = psock->save_close;
> + bpf_tcp_remove(sk, psock);
> rcu_read_unlock();
> close_fun(sk, timeout);
> }
>
^ permalink raw reply
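The shape of bpf_tcp_unhash() in the quoted patch (copy the saved callback to a local, drop the map state, then call the original handler) can be shown in a user-space sketch. It deliberately omits RCU and the sock lock, so it says nothing about the lifetime rules the kernel comment describes; all sketch_* names are hypothetical, and the fallback for a socket without a psock is an assumption made to keep the example self-contained.

```c
#include <stddef.h>

struct sketch_sock;

struct sketch_psock {
	void (*save_unhash)(struct sketch_sock *sk); /* original handler */
	int map_entries;                             /* toy map state */
};

struct sketch_sock {
	void (*unhash)(struct sketch_sock *sk);
	struct sketch_psock *psock;
	int unhash_calls;
};

/* Stand-in for the protocol's original unhash handler. */
void sketch_orig_unhash(struct sketch_sock *sk)
{
	sk->unhash_calls++;
}

/* Stand-in for bpf_tcp_remove(): drop the sock from all maps. */
void sketch_remove(struct sketch_psock *psock)
{
	psock->map_entries = 0;
}

void sketch_unhash(struct sketch_sock *sk)
{
	void (*unhash_fun)(struct sketch_sock *sk);
	struct sketch_psock *psock = sk->psock;

	if (!psock) {
		/* Assumed fallback: nothing to clean up. */
		sketch_orig_unhash(sk);
		return;
	}

	/* Take the saved pointer first so the original handler can
	 * still be called once the psock state has been torn down. */
	unhash_fun = psock->save_unhash;
	sketch_remove(psock);
	unhash_fun(sk);
}

/* Mirrors the save step in bpf_tcp_init() and the hook installation. */
void sketch_attach(struct sketch_sock *sk, struct sketch_psock *psock)
{
	psock->save_unhash = sk->unhash;
	sk->psock = psock;
	sk->unhash = sketch_unhash;
}
```

After sketch_attach(), calling sk->unhash(sk) clears the toy map state and still runs the original handler exactly once.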