* ss filter problem
From: Phil Sutter @ 2016-03-29 19:32 UTC (permalink / raw)
To: Vadim Kochan; +Cc: Stephen Hemminger, netdev
Hi,
I am trying to fix a bug in ss filter code, but feel quite lost right
now. The issue is this:
| ss -nl -f inet '( sport = :22 )'
prints not only listening sockets (as requested by the -l flag), but
established ones as well (to reproduce, open an ssh connection to
127.0.0.1 before running the command above).
In contrast, neither of the following shows the established sockets:
| ss -nl '( sport = :22 )'
| ss -nl -f inet
My investigation led me to see that current_filter.states is altered
after ssfilter_parse() returns, and using gdb with a watchpoint I was
able to identify parse_hostcond() to be the bad guy: In line 1560, it
calls filter_af_set() after checking for fam != AF_UNSPEC (which is the
case, since fam = preferred_family and the latter is changed to AF_INET
when parsing '-f inet' parameter).
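The interaction described above can be modelled in a few lines of user-space C. This is only a sketch, not the iproute2 code: struct filter_model, both setter functions and the state constants are hypothetical stand-ins, and it assumes that the family setter merges a per-family default into the state mask, which is what the watchpoint result suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state bits standing in for the real socket states. */
enum {
	ST_ESTABLISHED = 1 << 0,
	ST_LISTEN      = 1 << 1,
};

/* Assumed per-family default: every connection state. */
#define FAMILY_DEFAULT_STATES (ST_ESTABLISHED | ST_LISTEN)

/* Hypothetical, simplified stand-in for ss's current_filter. */
struct filter_model {
	int families;	/* bitmask of selected address families */
	int states;	/* bitmask of selected socket states */
};

/* Problematic variant: selecting a family also widens the state mask. */
static void family_set_merging(struct filter_model *f, int fam_bit)
{
	assert(f != NULL);
	f->families |= fam_bit;
	f->states |= FAMILY_DEFAULT_STATES;	/* undoes the -l restriction */
}

/* Possible alternative: touch only the family mask. */
static void family_set_preserving(struct filter_model *f, int fam_bit)
{
	assert(f != NULL);
	f->families |= fam_bit;
}
```

Starting from a filter whose states field holds only ST_LISTEN (the effect of -l), the merging variant ends up with ST_ESTABLISHED set as well, while the preserving variant leaves the mask as -l configured it.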
This whole jumping back and forth confuses me quite effectively. Since
you did some fixes in the past already, are you possibly able to point
out where/how this tiny mess has to be fixed?
I guess in an ideal world we would translate '-l' to 'state listen', '-f
inet' to 'src inet:*' and pass everything ANDed together to
ssfilter_parse(). Or maybe that would make things even worse. ;)
Cheers, Phil
* Re: [PATCH] bus: mvebu-mbus: use %pad to print phys_addr_t
From: Arnd Bergmann @ 2016-03-29 19:51 UTC (permalink / raw)
To: Gregory CLEMENT
Cc: David S . Miller, Marcin Wojtas, Evan Wang, netdev,
Thomas Petazzoni, Nicolas Schichan, linux-kernel
In-Reply-To: <87h9fpw9r4.fsf@free-electrons.com>
On Tuesday 29 March 2016 18:04:47 Gregory CLEMENT wrote:
>
> What is the status of this patch?
>
> Do you plan to send a second version with the title fixed as suggested
> by Joe Perches?
>
> Also do you expect that I collect this patch in the mvebu subsystem?
Right now, it's on my long-term todo list along with some 70 other patches
I need to revisit. If you want to fix up the title and apply it now,
that would be great, otherwise I'll get to it in a few weeks after coming
back from ELC/vacation.
Arnd
* Re: ss filter problem
From: Vadim Kochan @ 2016-03-29 20:05 UTC (permalink / raw)
To: Phil Sutter, Vadim Kochan, Stephen Hemminger, netdev
In-Reply-To: <20160329193242.GA28502@orbyte.nwl.cc>
Hi Phil,
On Tue, Mar 29, 2016 at 09:32:42PM +0200, Phil Sutter wrote:
> Hi,
>
> I am trying to fix a bug in ss filter code, but feel quite lost right
> now. The issue is this:
>
> | ss -nl -f inet '( sport = :22 )'
>
> prints not only listening sockets (as requested by -l flag), but
> established ones as well (reproduce by opening ssh connection to
> 127.0.0.1 before calling above).
>
> In contrast, the following both don't show the established sockets:
>
> | ss -nl '( sport = :22 )'
> | ss -nl -f inet
>
> My investigation led me to see that current_filter.states is altered
> after ssfilter_parse() returns, and using gdb with a watchpoint I was
> able to identify parse_hostcond() to be the bad guy: In line 1560, it
> calls filter_af_set() after checking for fam != AF_UNSPEC (which is the
> case, since fam = preferred_family and the latter is changed to AF_INET
> when parsing '-f inet' parameter).
Yes, after removing the body of the fam != AF_UNSPEC check it works,
because the states given on the command line (-l) are no longer
overwritten. I can't say yet what else that change may affect, so I
will investigate further.
>
> This whole jumping back and forth confuses me quite effectively. Since
> you did some fixes in the past already, are you possibly able to point
> out where/how this tiny mess has to be fixed?
>
> I guess in an ideal world we would translate '-l' to 'state listen', '-f
> inet' to 'src inet:*' and pass everything ANDed together to
> ssfilter_parse(). Or maybe that would make things even worse. ;)
>
> Cheers, Phil
I thought I had fixed and tested the ss filter well, but it seems it
would be good to have proper automated testing.
Regards,
Vadim Kochan
* Re: [PATCH 1/1] ipv4: fix NULL pointer dereference in __inet_put_port()
From: David Miller @ 2016-03-29 20:53 UTC (permalink / raw)
To: fanhui00; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <1459233953-4745-1-git-send-email-fanhui00@gmail.com>
From: fanhui <fanhui00@gmail.com>
Date: Tue, 29 Mar 2016 14:45:53 +0800
> [<ffffffc000930718>] tcp_nuke_addr+0x22c/0x2a0
Do not report or fix problems in non-mainline kernels.
Thank you.
* Re: [PATCH net-next v3.16]r9169: Correct Set Vlan tag
From: David Miller @ 2016-03-29 20:59 UTC (permalink / raw)
To: asd; +Cc: netdev, romieu
In-Reply-To: <1459240400-1602-1-git-send-email-asd@marian1000.go.ro>
From: Corcodel Marian <asd@marian1000.go.ro>
Date: Tue, 29 Mar 2016 11:33:20 +0300
> This patch add set Vlan tag and flush CPlusCmd register because when unset
> RxVlan and RxChkSum bit, whithout some explication , unwanted bits
> is set, PCIDAC, PCIMulRW and others.Whithout this patch when run
> ethtool -d eth0 on "C+ Command" field missing "VLAN de-tagging"
>
> Signed-off-by: Corcodel Marian <asd@marian1000.go.ro>
I am hereby blocking you from making any and all postings to the
mailing lists at vger.kernel.org.
There is no negotiating about this, all complaints sent to me will be
ignored. You should have thought about the consequences of your
actions (or lack thereof) over the past several months.
You cannot continually post patches, as you please, and completely
refuse to interact with the developers who review your changes and
give you feedback.
You have not coherently replied to any feedback you have been given.
You have almost always ignored the feedback you were given and
continued to make the same undesirable changes over and over again.
And you refuse to even acknowledge or address the issue of your
interactions with the community in any way whatsoever.
Therefore you are hurting development, and wasting precious developer
resources with your postings.
All of this is unacceptable.
But that all ends right now.
Thank you.
* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: David Miller @ 2016-03-29 21:03 UTC (permalink / raw)
To: tom; +Cc: aduyck, alexander.duyck, jesse, netdev
In-Reply-To: <CALx6S37GPMUuhgiWRH+40314x8eKnCWA-zZkvUmsTqf3_AFOhw@mail.gmail.com>
From: Tom Herbert <tom@herbertland.com>
Date: Mon, 28 Mar 2016 21:51:17 -0700
> No, but I do expect that you support code that is already there. There
> was apparently zero testing done on the original patch and it caused
> one very obvious regression. So how can we have any confidence
> whatsoever that this patch doesn't break other things? Furthermore,
> with all these claims of bugs I still don't see that _anyone_ has
> taken the time to reproduce any issue and show that this patch
> materially fixes any thing. I seriously don't understand how basic
> testing could be such a challenge.
>
> Anyway, what I expect is moot. It's up to davem to decide what to do
> with this...
You being upset with a lack of testing is one issue, and is
legitimate.
But the fact that we can't support, and never could support, more than
one network header at a time except in a very special case for GRO
is very real. And you must acknowledge that this was a very shaky
foundation upon which to erect the kinds of things you expect to work.
* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Tom Herbert @ 2016-03-29 21:13 UTC (permalink / raw)
To: Alexander Duyck
Cc: Jesse Gross, Linux Kernel Network Developers, David S. Miller,
Alexander Duyck
In-Reply-To: <20160328235613.26269.26291.stgit@localhost.localdomain>
On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
> This patch should fix the issues seen with a recent fix to prevent
> tunnel-in-tunnel frames from being generated with GRO. The fix itself is
> correct for now as long as we do not add any devices that support
> NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the
> potential to mess things up due to the fact that the outer transport header
> points to the outer UDP header and not the GRE header as would be expected.
>
> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
> ---
>
> This should allow us to keep the fix that Jesse added without breaking the
> 3 cases that Tom called out in terms of FOU/GUE.
>
> Additional work will be needed in net-next as we probably need to make it
> so that offloads work correctly when we get around to supporting
> NETIF_F_GSO_GRE_CSUM.
>
> net/ipv4/fou.c | 32 ++++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
> index 4136da9275b2..2c30256ee959 100644
> --- a/net/ipv4/fou.c
> +++ b/net/ipv4/fou.c
> @@ -195,6 +195,22 @@ static struct sk_buff **fou_gro_receive(struct sk_buff **head,
> u8 proto = NAPI_GRO_CB(skb)->proto;
> const struct net_offload **offloads;
>
> + switch (proto) {
> + case IPPROTO_IPIP:
> + case IPPROTO_IPV6:
> + case IPPROTO_GRE:
> + /* We can clear the encap_mark for these 3 protocols as
> + * we are either adding an L4 tunnel header to the outer
> + * L3 tunnel header, or we are are simply treating the
> + * GRE tunnel header as though it is a UDP protocol
> + * specific header such as VXLAN or GENEVE.
> + */
> + NAPI_GRO_CB(skb)->encap_mark = 0;
> + /* fall-through */
> + default:
> + break;
> + }
Switch statement is not needed, just do NAPI_GRO_CB(skb)->encap_mark = 0;
> +
> rcu_read_lock();
> offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
> ops = rcu_dereference(offloads[proto]);
> @@ -359,6 +375,22 @@ static struct sk_buff **gue_gro_receive(struct sk_buff **head,
> NAPI_GRO_CB(p)->flush |= NAPI_GRO_CB(p)->flush_id;
> }
>
> + switch (guehdr->proto_ctype) {
> + case IPPROTO_IPIP:
> + case IPPROTO_IPV6:
> + case IPPROTO_GRE:
> + /* We can clear the encap_mark for these 3 protocols as
> + * we are either adding an L4 tunnel header to the outer
> + * L3 tunnel header, or we are are simply treating the
> + * GRE tunnel header as though it is a UDP protocol
> + * specific header such as VXLAN or GENEVE.
> + */
> + NAPI_GRO_CB(skb)->encap_mark = 0;
> + /* fall-through */
> + default:
> + break;
> + }
> +
Here also.
> rcu_read_lock();
> offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
> ops = rcu_dereference(offloads[guehdr->proto_ctype]);
>
* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Alexander Duyck @ 2016-03-29 21:26 UTC (permalink / raw)
To: Tom Herbert
Cc: Alexander Duyck, Jesse Gross, Linux Kernel Network Developers,
David S. Miller
In-Reply-To: <CALx6S36PohjN6ReOE_cZKcyQdcRDOmSh+sij9boTVLnDHNc7JA@mail.gmail.com>
On Tue, Mar 29, 2016 at 2:13 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>> This patch should fix the issues seen with a recent fix to prevent
>> tunnel-in-tunnel frames from being generated with GRO. The fix itself is
>> correct for now as long as we do not add any devices that support
>> NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the
>> potential to mess things up due to the fact that the outer transport header
>> points to the outer UDP header and not the GRE header as would be expected.
>>
>> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>> ---
>>
>> This should allow us to keep the fix that Jesse added without breaking the
>> 3 cases that Tom called out in terms of FOU/GUE.
>>
>> Additional work will be needed in net-next as we probably need to make it
>> so that offloads work correctly when we get around to supporting
>> NETIF_F_GSO_GRE_CSUM.
>>
>> net/ipv4/fou.c | 32 ++++++++++++++++++++++++++++++++
>> 1 file changed, 32 insertions(+)
>>
>> diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
>> index 4136da9275b2..2c30256ee959 100644
>> --- a/net/ipv4/fou.c
>> +++ b/net/ipv4/fou.c
>> @@ -195,6 +195,22 @@ static struct sk_buff **fou_gro_receive(struct sk_buff **head,
>> u8 proto = NAPI_GRO_CB(skb)->proto;
>> const struct net_offload **offloads;
>>
>> + switch (proto) {
>> + case IPPROTO_IPIP:
>> + case IPPROTO_IPV6:
>> + case IPPROTO_GRE:
>> + /* We can clear the encap_mark for these 3 protocols as
>> + * we are either adding an L4 tunnel header to the outer
>> + * L3 tunnel header, or we are are simply treating the
>> + * GRE tunnel header as though it is a UDP protocol
>> + * specific header such as VXLAN or GENEVE.
>> + */
>> + NAPI_GRO_CB(skb)->encap_mark = 0;
>> + /* fall-through */
>> + default:
>> + break;
>> + }
>
> Switch statement is not needed, just do NAPI_GRO_CB(skb)->encap_mark = 0;
>
>> +
>> rcu_read_lock();
>> offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
>> ops = rcu_dereference(offloads[proto]);
>> @@ -359,6 +375,22 @@ static struct sk_buff **gue_gro_receive(struct sk_buff **head,
>> NAPI_GRO_CB(p)->flush |= NAPI_GRO_CB(p)->flush_id;
>> }
>>
>> + switch (guehdr->proto_ctype) {
>> + case IPPROTO_IPIP:
>> + case IPPROTO_IPV6:
>> + case IPPROTO_GRE:
>> + /* We can clear the encap_mark for these 3 protocols as
>> + * we are either adding an L4 tunnel header to the outer
>> + * L3 tunnel header, or we are are simply treating the
>> + * GRE tunnel header as though it is a UDP protocol
>> + * specific header such as VXLAN or GENEVE.
>> + */
>> + NAPI_GRO_CB(skb)->encap_mark = 0;
>> + /* fall-through */
>> + default:
>> + break;
>> + }
>> +
> Here also.
>
>> rcu_read_lock();
>> offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
>> ops = rcu_dereference(offloads[guehdr->proto_ctype]);
>>
Okay, I can update that and submit a v2. The only real reason why I
had the switch statements was out of an abundance of caution since
those were the only 3 cases where I knew we would run into issues.
- Alex
* Re: [PATCH] bridge: Allow set bridge ageing time when switchdev disabled
From: Ido Schimmel @ 2016-03-29 21:34 UTC (permalink / raw)
To: Haishuang Yan; +Cc: netdev, bridge, linux-kernel, jiri, David S. Miller
In-Reply-To: <1459248488-25621-1-git-send-email-yanhaishuang@cmss.chinamobile.com>
Tue, Mar 29, 2016 at 01:48:08PM IDT, yanhaishuang@cmss.chinamobile.com wrote:
>When NET_SWITCHDEV=n, switchdev_port_attr_set will return -EOPNOTSUPP,
>we should ignore this error code and continue to set the ageing time.
>
>Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Fixes: c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev")
Acked-by: Ido Schimmel <idosch@mellanox.com>
Thank you.
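The error handling that the patch description calls for can be sketched as follows. This is a user-space model, not the bridge code: offload_set_ageing() and set_ageing_time() are hypothetical names, and the only behaviour taken from the description is that the offload hook returns -EOPNOTSUPP when switchdev support is not built in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the offload hook: reports -EOPNOTSUPP
 * when no switchdev support is available.
 */
static int offload_set_ageing(int have_switchdev, unsigned long t)
{
	(void)t;
	return have_switchdev ? 0 : -EOPNOTSUPP;
}

/* Hypothetical setter: treat "not supported" as "nothing to offload"
 * and still update the software value; propagate any other error.
 */
static int set_ageing_time(unsigned long *ageing_time, int have_switchdev,
			   unsigned long t)
{
	int err;

	assert(ageing_time != NULL);

	err = offload_set_ageing(have_switchdev, t);
	if (err && err != -EOPNOTSUPP)
		return err;

	*ageing_time = t;
	return 0;
}
```

With have_switchdev set to 0 the call still succeeds and stores the new value, which is the behaviour the patch asks for.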
* Re: [PATCH 06/16] wcn36xx: Fetch private sta data from sta entry instead of from vif
From: Bjorn Andersson @ 2016-03-29 21:37 UTC (permalink / raw)
To: kbuild test robot
Cc: netdev, linux-wireless, linux-kernel, Pontus Fuchs, kbuild-all,
wcn36xx, Eugene Krasnikov, Kalle Valo
In-Reply-To: <201603300022.MmhimaU1%fengguang.wu@intel.com>
On Tue 29 Mar 10:01 PDT 2016, kbuild test robot wrote:
> Hi Pontus,
>
> [auto build test ERROR on wireless-drivers/master]
> [also build test ERROR on v4.6-rc1 next-20160329]
> [if your patch is applied to the wrong git tree, please drop us a note to help improving the system]
>
> url: https://github.com/0day-ci/linux/commits/Bjorn-Andersson/Misc-wcn36xx-fixes/20160329-141847
> base: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git master
> config: sparc64-allyesconfig (attached as .config)
> reproduce:
> wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=sparc64
>
> Note: the linux-review/Bjorn-Andersson/Misc-wcn36xx-fixes/20160329-141847 HEAD 8303daac889854237207e7caefaea94fee0b87f2 builds fine.
> It only hurts bisectibility.
>
> All error/warnings (new ones prefixed by >>):
>
> drivers/net/wireless/ath/wcn36xx/main.c: In function 'wcn36xx_set_key':
> >> drivers/net/wireless/ath/wcn36xx/main.c:389:9: error: implicit declaration of function 'wcn36xx_sta_to_priv' [-Werror=implicit-function-declaration]
> struct wcn36xx_sta *sta_priv = wcn36xx_sta_to_priv(sta);
> ^
> >> drivers/net/wireless/ath/wcn36xx/main.c:389:33: warning: initialization makes pointer from integer without a cast
> struct wcn36xx_sta *sta_priv = wcn36xx_sta_to_priv(sta);
> ^
> cc1: some warnings being treated as errors
This should have been reordered with patch 7, which introduces this
helper function. Do you want me to resend, or can you apply the patches
out of order?
Regards,
Bjorn
>
> vim +/wcn36xx_sta_to_priv +389 drivers/net/wireless/ath/wcn36xx/main.c
>
> 383 struct ieee80211_vif *vif,
> 384 struct ieee80211_sta *sta,
> 385 struct ieee80211_key_conf *key_conf)
> 386 {
> 387 struct wcn36xx *wcn = hw->priv;
> 388 struct wcn36xx_vif *vif_priv = wcn36xx_vif_to_priv(vif);
> > 389 struct wcn36xx_sta *sta_priv = wcn36xx_sta_to_priv(sta);
> 390 int ret = 0;
> 391 u8 key[WLAN_MAX_KEY_LEN];
> 392
>
> ---
> 0-DAY kernel test infrastructure Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all Intel Corporation
* [RESEND PATCH 15/16] wcn36xx: don't pad beacons for mesh
From: Bjorn Andersson @ 2016-03-29 21:41 UTC (permalink / raw)
To: Eugene Krasnikov, Kalle Valo
Cc: Sergei Shtylyov, wcn36xx, linux-wireless, netdev, linux-kernel,
Jason Mobarak, Chun-Yeow Yeoh
In-Reply-To: <1459231593-360-16-git-send-email-bjorn.andersson@linaro.org>
From: Jason Mobarak <jam@cozybit.com>
Patch "wcn36xx: Pad TIM PVM if needed" has caused a regression in mesh
beaconing. The field tim_off is always 0 for mesh mode, and thus
pvm_len (referring to the TIM length field) and pad are both incorrectly
calculated. Thus, msg_body.beacon_length is incorrectly calculated for
mesh mode. Fix this.
Fixes: 8ad99a4e3ee5 ("wcn36xx: Pad TIM PVM if needed")
Signed-off-by: Jason Mobarak <jam@cozybit.com>
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@cozybit.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
---
Resend this single patch with included Fixes tag.
drivers/net/wireless/ath/wcn36xx/smd.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c
index a57d158298a1..b1bdc229e560 100644
--- a/drivers/net/wireless/ath/wcn36xx/smd.c
+++ b/drivers/net/wireless/ath/wcn36xx/smd.c
@@ -1410,6 +1410,11 @@ int wcn36xx_smd_send_beacon(struct wcn36xx *wcn, struct ieee80211_vif *vif,
pvm_len = skb_beacon->data[tim_off + 1] - 3;
pad = TIM_MIN_PVM_SIZE - pvm_len;
+
+ /* Padding is irrelevant to mesh mode since tim_off is always 0. */
+ if (vif->type == NL80211_IFTYPE_MESH_POINT)
+ pad = 0;
+
msg_body.beacon_length = skb_beacon->len + pad;
/* TODO need to find out why + 6 is needed */
msg_body.beacon_length6 = msg_body.beacon_length + 6;
--
2.5.0
* Re: [RESEND PATCH 15/16] wcn36xx: don't pad beacons for mesh
From: Bjorn Andersson @ 2016-03-29 21:44 UTC (permalink / raw)
To: Eugene Krasnikov, Kalle Valo
Cc: Sergei Shtylyov, wcn36xx, linux-wireless, netdev, linux-kernel,
Jason Mobarak, Chun-Yeow Yeoh
In-Reply-To: <1459287672-3324-1-git-send-email-bjorn.andersson@linaro.org>
On Tue 29 Mar 14:41 PDT 2016, Bjorn Andersson wrote:
> From: Jason Mobarak <jam@cozybit.com>
>
> Patch "wcn36xx: Pad TIM PVM if needed" has caused a regression in mesh
> beaconing. The field tim_off is always 0 for mesh mode, and thus
> pvm_len (referring to the TIM length field) and pad are both incorrectly
> calculated. Thus, msg_body.beacon_length is incorrectly calculated for
> mesh mode. Fix this.
>
> Fixes: 8ad99a4e3ee5 ("wcn36xx: Pad TIM PVM if needed")
> Signed-off-by: Jason Mobarak <jam@cozybit.com>
> Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@cozybit.com>
> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
> ---
>
> Resend this single patch with included Fixes tag.
>
Sorry for the spam, I read the git log incorrectly. The patch referred
to is part of this series, so the sha1 is bogus.
Regards,
Bjorn
> drivers/net/wireless/ath/wcn36xx/smd.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c
> index a57d158298a1..b1bdc229e560 100644
> --- a/drivers/net/wireless/ath/wcn36xx/smd.c
> +++ b/drivers/net/wireless/ath/wcn36xx/smd.c
> @@ -1410,6 +1410,11 @@ int wcn36xx_smd_send_beacon(struct wcn36xx *wcn, struct ieee80211_vif *vif,
>
> pvm_len = skb_beacon->data[tim_off + 1] - 3;
> pad = TIM_MIN_PVM_SIZE - pvm_len;
> +
> + /* Padding is irrelevant to mesh mode since tim_off is always 0. */
> + if (vif->type == NL80211_IFTYPE_MESH_POINT)
> + pad = 0;
> +
> msg_body.beacon_length = skb_beacon->len + pad;
> /* TODO need to find out why + 6 is needed */
> msg_body.beacon_length6 = msg_body.beacon_length + 6;
> --
> 2.5.0
>
* [net PATCH v2] gro: Allow tunnel stacking in the case of FOU/GUE
From: Alexander Duyck @ 2016-03-29 21:55 UTC (permalink / raw)
To: jesse, netdev, davem, alexander.duyck, tom
This patch should fix the issues seen with a recent fix to prevent
tunnel-in-tunnel frames from being generated with GRO. The fix itself is
correct for now as long as we do not add any devices that support
NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the
potential to mess things up due to the fact that the outer transport header
points to the outer UDP header and not the GRE header as would be expected.
Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
v2: Dropped switch statements per suggestion of Tom Herbert.
net/ipv4/fou.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index a0586b4a197d..5a94aea280d3 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -195,6 +195,14 @@ static struct sk_buff **fou_gro_receive(struct sk_buff **head,
u8 proto = NAPI_GRO_CB(skb)->proto;
const struct net_offload **offloads;
+ /* We can clear the encap_mark for FOU as we are essentially doing
+ * one of two possible things. We are either adding an L4 tunnel
+ * header to the outer L3 tunnel header, or we are simply
+ * treating the GRE tunnel header as though it is a UDP protocol
+ * specific header such as VXLAN or GENEVE.
+ */
+ NAPI_GRO_CB(skb)->encap_mark = 0;
+
rcu_read_lock();
offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
ops = rcu_dereference(offloads[proto]);
@@ -352,6 +360,14 @@ static struct sk_buff **gue_gro_receive(struct sk_buff **head,
}
}
+ /* We can clear the encap_mark for GUE as we are essentially doing
+ * one of two possible things. We are either adding an L4 tunnel
+ * header to the outer L3 tunnel header, or we are simply
+ * treating the GRE tunnel header as though it is a UDP protocol
+ * specific header such as VXLAN or GENEVE.
+ */
+ NAPI_GRO_CB(skb)->encap_mark = 0;
+
rcu_read_lock();
offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
ops = rcu_dereference(offloads[guehdr->proto_ctype]);
* [PATCH net] bpf: make padding in bpf_tunnel_key explicit
From: Daniel Borkmann @ 2016-03-29 22:02 UTC (permalink / raw)
To: davem; +Cc: alexei.starovoitov, netdev, Daniel Borkmann
Make the 2 byte padding in struct bpf_tunnel_key between tunnel_ttl
and tunnel_label members explicit. No issue has been observed, and
gcc/llvm does padding for the old struct already, where tunnel_label
was not yet present, so the current code works, but since it's part
of uapi, make sure we don't introduce holes in structs.
Therefore, add tunnel_ext that we can use generically in future
(e.g. to flag OAM messages for backends, etc). Also add its offset
to the compat tests, to be safe in case some compiler does not pad the
tail of the old version of bpf_tunnel_key.
Fixes: 4018ab1875e0 ("bpf: support flow label for bpf_skb_{set, get}_tunnel_key")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
include/uapi/linux/bpf.h | 1 +
net/core/filter.c | 5 ++++-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 924f537..23917bb 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -375,6 +375,7 @@ struct bpf_tunnel_key {
};
__u8 tunnel_tos;
__u8 tunnel_ttl;
+ __u16 tunnel_ext;
__u32 tunnel_label;
};
diff --git a/net/core/filter.c b/net/core/filter.c
index b7177d0..4b81b71 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1764,6 +1764,7 @@ static u64 bpf_skb_get_tunnel_key(u64 r1, u64 r2, u64 size, u64 flags, u64 r5)
if (unlikely(size != sizeof(struct bpf_tunnel_key))) {
switch (size) {
case offsetof(struct bpf_tunnel_key, tunnel_label):
+ case offsetof(struct bpf_tunnel_key, tunnel_ext):
goto set_compat;
case offsetof(struct bpf_tunnel_key, remote_ipv6[1]):
/* Fixup deprecated structure layouts here, so we have
@@ -1849,6 +1850,7 @@ static u64 bpf_skb_set_tunnel_key(u64 r1, u64 r2, u64 size, u64 flags, u64 r5)
if (unlikely(size != sizeof(struct bpf_tunnel_key))) {
switch (size) {
case offsetof(struct bpf_tunnel_key, tunnel_label):
+ case offsetof(struct bpf_tunnel_key, tunnel_ext):
case offsetof(struct bpf_tunnel_key, remote_ipv6[1]):
/* Fixup deprecated structure layouts here, so we have
* a common path later on.
@@ -1861,7 +1863,8 @@ static u64 bpf_skb_set_tunnel_key(u64 r1, u64 r2, u64 size, u64 flags, u64 r5)
return -EINVAL;
}
}
- if (unlikely(!(flags & BPF_F_TUNINFO_IPV6) && from->tunnel_label))
+ if (unlikely((!(flags & BPF_F_TUNINFO_IPV6) && from->tunnel_label) ||
+ from->tunnel_ext))
return -EINVAL;
skb_dst_drop(skb);
--
1.9.3
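The hole that the commit message above refers to can be observed with offsetof(). The two structs below are cut-down, hypothetical copies that keep only the tail members named in the patch; they are not the uapi definition, and the sizes assume a typical ABI in which a 32-bit integer is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of the struct with implicit padding (hypothetical cut-down copy). */
struct key_implicit {
	uint8_t  tunnel_tos;
	uint8_t  tunnel_ttl;
	/* the compiler inserts a 2-byte hole here */
	uint32_t tunnel_label;
};

/* Same tail with the hole named, as the patch does with tunnel_ext. */
struct key_explicit {
	uint8_t  tunnel_tos;
	uint8_t  tunnel_ttl;
	uint16_t tunnel_ext;
	uint32_t tunnel_label;
};

/* Naming the hole must neither move tunnel_label nor change the size. */
_Static_assert(offsetof(struct key_implicit, tunnel_label) ==
	       offsetof(struct key_explicit, tunnel_label),
	       "tunnel_label must stay at the same offset");
_Static_assert(sizeof(struct key_implicit) == sizeof(struct key_explicit),
	       "struct size must not change");

/* Size of the hole that precedes tunnel_label in the implicit layout. */
static size_t implicit_hole_size(void)
{
	return offsetof(struct key_implicit, tunnel_label) -
	       (offsetof(struct key_implicit, tunnel_ttl) + sizeof(uint8_t));
}
```

In this model the offset of tunnel_ext is where an unpadded old struct would end and the offset of tunnel_label is where a padded one would end, which is presumably why the patch accepts both sizes in the compat switch.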
* Re: [PATCH net] bpf: make padding in bpf_tunnel_key explicit
From: Alexei Starovoitov @ 2016-03-29 22:06 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, netdev
In-Reply-To: <6589c70157238797e63986eeea67cfe2abfb3260.1459288316.git.daniel@iogearbox.net>
On Wed, Mar 30, 2016 at 12:02:00AM +0200, Daniel Borkmann wrote:
> Make the 2 byte padding in struct bpf_tunnel_key between tunnel_ttl
> and tunnel_label members explicit. No issue has been observed, and
> gcc/llvm does padding for the old struct already, where tunnel_label
> was not yet present, so the current code works, but since it's part
> of uapi, make sure we don't introduce holes in structs.
>
> Therefore, add tunnel_ext that we can use generically in future
> (f.e. to flag OAM messages for backends, etc). Also add the offset
> to the compat tests to be sure should some compilers not padd the
> tail of the old version of bpf_tunnel_key.
>
> Fixes: 4018ab1875e0 ("bpf: support flow label for bpf_skb_{set, get}_tunnel_key")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
* [PATCH RFC net-next] net: core: Pass XPS select queue decision to skb_tx_hash
From: Saeed Mahameed @ 2016-03-29 22:24 UTC (permalink / raw)
To: netdev
Cc: Eric Dumazet, Tom Herbert, Jiri Pirko, David S. Miller,
John Fastabend, Saeed Mahameed
Currently the XPS queue selection is final and overrides all other queue
selection parameters, such as the QoS TC mapping and the recorded RX queue.
This patch turns the get_xps_queue() result into a hint for skb_tx_hash(), which
decides whether to use the hint as is or to adjust it so that the correct TXQ
is returned.
This fixes the case where both TC QoS offload (tc_to_txq mapping) and XPS mapping
are configured: today queue selection honours only the XPS configuration, skips
the TC QoS queue selection and therefore does not satisfy the QoS configuration.
RFC because I want to discuss what the final behavior of __netdev_pick_tx()
should be. With this patch it goes as follows:
netdev_pick_tx(skb):
	hint = get_xps_queue
	txq = skb_tx_hash(skb, hint)

skb_tx_hash(skb, hint):
	if (skb_rx_queue_recorded(skb))
		return skb_get_rx_queue(skb);

	queue_offset = 0;
	if (dev->num_tc)
		queue_offset = tc_queue_offset[tc];

	hash = hint < 0 ? skb_get_hash(skb) : hint;
	return hash + queue_offset;
I.e., instead of blindly returning the XPS decision, we pass it to skb_tx_hash(),
which makes the final decision; queue selection now respects and combines the
XPS and TC QoS tc_to_txq mappings.
There is one additional behavioral change: recorded RX queues now override
the XPS configuration.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 2 +-
include/linux/netdevice.h | 6 +++---
net/core/dev.c | 10 ++++++----
3 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index c0d7b72..873cf49 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -693,7 +693,7 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
u8 up = 0;
if (dev->num_tc)
- return skb_tx_hash(dev, skb);
+ return skb_tx_hash(dev, skb, -1);
if (skb_vlan_tag_present(skb))
up = skb_vlan_tag_get(skb) >> VLAN_PRIO_SHIFT;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cb0d5d0..ad81ffe 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3130,16 +3130,16 @@ static inline int netif_set_xps_queue(struct net_device *dev,
#endif
u16 __skb_tx_hash(const struct net_device *dev, struct sk_buff *skb,
- unsigned int num_tx_queues);
+ unsigned int num_tx_queues, int txq_hint);
/*
* Returns a Tx hash for the given packet when dev->real_num_tx_queues is used
* as a distribution range limit for the returned value.
*/
static inline u16 skb_tx_hash(const struct net_device *dev,
- struct sk_buff *skb)
+ struct sk_buff *skb, int txq_hint)
{
- return __skb_tx_hash(dev, skb, dev->real_num_tx_queues);
+ return __skb_tx_hash(dev, skb, dev->real_num_tx_queues, txq_hint);
}
/**
diff --git a/net/core/dev.c b/net/core/dev.c
index b9bcbe7..ff640b7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2394,7 +2394,7 @@ EXPORT_SYMBOL(netif_device_attach);
* to be used as a distribution range.
*/
u16 __skb_tx_hash(const struct net_device *dev, struct sk_buff *skb,
- unsigned int num_tx_queues)
+ unsigned int num_tx_queues, int txq_hint)
{
u32 hash;
u16 qoffset = 0;
@@ -2411,9 +2411,12 @@ u16 __skb_tx_hash(const struct net_device *dev, struct sk_buff *skb,
u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
qoffset = dev->tc_to_txq[tc].offset;
qcount = dev->tc_to_txq[tc].count;
+ if (txq_hint >= qcount) /* This can happen when xps is configured on TC TXQs */
+ txq_hint = -1; /* recalculate TXQ hash */
}
- return (u16) reciprocal_scale(skb_get_hash(skb), qcount) + qoffset;
+ hash = txq_hint < 0 ? reciprocal_scale(skb_get_hash(skb), qcount) : txq_hint;
+ return (u16)(hash) + qoffset;
}
EXPORT_SYMBOL(__skb_tx_hash);
@@ -3201,8 +3204,7 @@ static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb)
if (queue_index < 0 || skb->ooo_okay ||
queue_index >= dev->real_num_tx_queues) {
int new_index = get_xps_queue(dev, skb);
- if (new_index < 0)
- new_index = skb_tx_hash(dev, skb);
+ new_index = skb_tx_hash(dev, skb, new_index);
if (queue_index != new_index && sk &&
sk_fullsock(sk) &&
--
1.7.1
^ permalink raw reply related
* Best way to reduce system call overhead for tun device I/O?
From: Guus Sliepen @ 2016-03-29 22:40 UTC (permalink / raw)
To: netdev
I'm trying to reduce system call overhead when reading/writing to/from a
tun device in userspace. For sockets, one can use sendmmsg()/recvmmsg(),
but a tun fd is not a socket fd, so this doesn't work. I see several
options to allow userspace to read/write multiple packets with one
syscall:
- Implement a TX/RX ring buffer that is mmap()ed, like with AF_PACKET
sockets.
- Implement an ioctl() to emulate sendmmsg()/recvmmsg().
- Add a flag that can be set using TUNSETIFF that makes regular
read()/write() calls handle multiple packets in one go.
- Expose a socket fd to userspace, so regular sendmmsg()/recvmmsg() can
be used. There is tun_get_socket() which is used internally in the
kernel, but this is not exposed to userspace, and doesn't look trivial
to do either.
What would be the right way to do this?
--
Met vriendelijke groet / with kind regards,
Guus Sliepen <guus@tinc-vpn.org>
^ permalink raw reply
* [PATCH net-next] tcp: remove cwnd moderation after recovery
From: Yuchung Cheng @ 2016-03-30 0:15 UTC (permalink / raw)
To: davem
Cc: netdev, Yuchung Cheng, Matt Mathis, Neal Cardwell,
Soheil Hassas Yeganeh
For non-SACK connections, cwnd is lowered to inflight plus 3 packets
when the recovery ends. This is an optional feature in the NewReno
RFC 2582 to reduce the potential burst when cwnd is "re-opened"
after recovery and inflight is low.
This feature is questionably effective because of PRR: when
the recovery ends (i.e., snd_una == high_seq) NewReno holds the
CA_Recovery state for another round trip to prevent false fast
retransmits. But if the inflight is low, PRR will overwrite the
moderated cwnd in tcp_cwnd_reduction() later.
On the other hand, if the recovery ends because the sender
detects the losses were spurious (e.g., reordering), this feature
unconditionally lowers a reverted cwnd even though nothing
was lost.
In principle, the loss recovery module should not update cwnd. Furthermore,
pacing is much more effective at reducing bursts. Hence this patch
removes the cwnd moderation feature.
Signed-off-by: Matt Mathis <mattmathis@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
include/net/tcp.h | 11 -----------
net/ipv4/tcp_input.c | 11 -----------
2 files changed, 22 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index b91370f..f8bb4a4 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1039,17 +1039,6 @@ static inline __u32 tcp_max_tso_deferred_mss(const struct tcp_sock *tp)
return 3;
}
-/* Slow start with delack produces 3 packets of burst, so that
- * it is safe "de facto". This will be the default - same as
- * the default reordering threshold - but if reordering increases,
- * we must be able to allow cwnd to burst at least this much in order
- * to not pull it back when holes are filled.
- */
-static __inline__ __u32 tcp_max_burst(const struct tcp_sock *tp)
-{
- return tp->reordering;
-}
-
/* Returns end sequence number of the receiver's advertised window */
static inline u32 tcp_wnd_end(const struct tcp_sock *tp)
{
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e6e65f7..f87b84a 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2252,16 +2252,6 @@ static void tcp_update_scoreboard(struct sock *sk, int fast_rexmit)
}
}
-/* CWND moderation, preventing bursts due to too big ACKs
- * in dubious situations.
- */
-static inline void tcp_moderate_cwnd(struct tcp_sock *tp)
-{
- tp->snd_cwnd = min(tp->snd_cwnd,
- tcp_packets_in_flight(tp) + tcp_max_burst(tp));
- tp->snd_cwnd_stamp = tcp_time_stamp;
-}
-
static bool tcp_tsopt_ecr_before(const struct tcp_sock *tp, u32 when)
{
return tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
@@ -2410,7 +2400,6 @@ static bool tcp_try_undo_recovery(struct sock *sk)
/* Hold old state until something *above* high_seq
* is ACKed. For Reno it is MUST to prevent false
* fast retransmits (RFC2582). SACK TCP is safe. */
- tcp_moderate_cwnd(tp);
if (!tcp_any_retrans_done(sk))
tp->retrans_stamp = 0;
return true;
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related
* Re: [PATCH RFC net-next] net: core: Pass XPS select queue decision to skb_tx_hash
From: John Fastabend @ 2016-03-30 0:18 UTC (permalink / raw)
To: Saeed Mahameed, netdev
Cc: Eric Dumazet, Tom Herbert, Jiri Pirko, David S. Miller,
John Fastabend
In-Reply-To: <1459290252-4121-1-git-send-email-saeedm@mellanox.com>
On 16-03-29 03:24 PM, Saeed Mahameed wrote:
> Currently XPS select queue decision is final and overrides/ignores all other
> select queue parameters such as QoS TC, RX recording.
>
> This patch makes get_xps_queue value as a hint for skb_tx_hash, which will decide
> whether to use this hint as is or to tweak it a little to provide the correct TXQ.
>
> This will fix bugs in cases such that TC QoS offload (tc_to_txq mapping) and XPS mapping
> are configured but select queue will only respect the XPS configuration which will skip
> the TC QoS queue selection and thus will not satisfy the QoS configuration.
>
> RFC because I want to discuss how we would like the final behavior of the
> __netdev_pick_tx, with this patch it goes as follows:
>
> netdev_pick_tx(skb):
> hint = get_xps_queue
> txq = skb_tx_hash(skb, hint)
>
> skb_tx_hash(skb, hint):
> if (skb_rx_queue_recorded(skb))
> return skb_get_rx_queue(skb);
>
> queue_offset = 0;
> if (dev->num_tc)
> queue_offset = tc_queue_offset[tc];
>
> hash = hint < 0 ? skb_get_hash(skb) : hint;
> return hash + queue_offset;
>
> i.e: instead of blindly return the XPS decision, we pass it to skb_tx_hash which can make the final
> decision for us, select queue will now respect and combine XPS and TC QoS tc_to_txq mappings.
>
> Also there is one additional behavioral change that recorded rx queues will
> now override the XPS configuration.
>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> ---
> drivers/net/ethernet/mellanox/mlx4/en_tx.c | 2 +-
> include/linux/netdevice.h | 6 +++---
> net/core/dev.c | 10 ++++++----
> 3 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index c0d7b72..873cf49 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> @@ -693,7 +693,7 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
> u8 up = 0;
>
> if (dev->num_tc)
> - return skb_tx_hash(dev, skb);
> + return skb_tx_hash(dev, skb, -1);
>
I would prefer to not have another strange quirk users have to remember
in order to do tx classification. So with this change depending on the
driver the queue selection precedence changes. In short I agree with
the problem statement but think we can find a better solution.
One idea that comes to mind is we can have a tc action to force the
queue selection? Now that we have the egress tc hook it would probably
be fairly cheap to implement and if users want this behavior they can
ask for it explicitly.
If you're thinking about tc stuff we could fix the tooling to set this
action whenever dcb is turned on or hardware rate limiting is enabled,
etc. And even if we wanted we could have the driver add the rule in the
cases where firmware protocols are configuring the QOS/etc.
> if (skb_vlan_tag_present(skb))
> up = skb_vlan_tag_get(skb) >> VLAN_PRIO_SHIFT;
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index cb0d5d0..ad81ffe 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3130,16 +3130,16 @@ static inline int netif_set_xps_queue(struct net_device *dev,
> #endif
>
> u16 __skb_tx_hash(const struct net_device *dev, struct sk_buff *skb,
> - unsigned int num_tx_queues);
> + unsigned int num_tx_queues, int txq_hint);
>
[...]
And all this seems like it would only ever be called by drivers' select
queue routines, which I really wish we could kill off one of these days
instead of adding to. Now if the signal is something higher in the stack
and not the driver I think it is OK.
.John
^ permalink raw reply
* Re: [net-next 02/16] i40e/i40evf: Rewrite logic for 8 descriptor per packet check
From: Jesse Brandeburg @ 2016-03-30 0:34 UTC (permalink / raw)
To: Jeff Kirsher, Dave Miller
Cc: Alexander Duyck, NetDEV list, nhorman, sassmann, jogreene,
Jesse Brandeburg
In-Reply-To: <CAEuXFEwPdnh8LNMXRP6fnasXO3P-iW-k1qPciXqayY+7c6=b0w@mail.gmail.com>
stupid gmail... sent again to the list, sorry for duplicate (but
formatted better this time)
Hey Alex, this patch appears to have caused a regression (probably both
i40e/i40evf). Easily reproducible running rds-stress (see the thread titled
"i40e card Tx resets", the middle post by sowmini has the repro steps,
ignore the pktgen discussion in the thread.)
I've spent some time trying to figure out the right fix, but keep getting
stuck in the complicated logic of the function, so I'm sending this in the
hope you'll take a look.
This is (one example of) the skb that doesn't get linearized:
skb_print_bits: > headlen=114 datalen=33824 data=238 mac=238 nh=252
h=272 gso=1448 frag=17
skb_print_bits: frag[0]: len: 256
skb_print_bits: frag[1]: len: 48
skb_print_bits: frag[2]: len: 256
skb_print_bits: frag[3]: len: 48
skb_print_bits: frag[4]: len: 256
skb_print_bits: frag[5]: len: 48
skb_print_bits: frag[6]: len: 4096
This descriptor ^^^ is the 8th, I believe the hardware mechanism faults on
this input.
I added a print of the sum at each point it is subtracted from or added to
after initially being set; sum7/8 are in the for loop.
skb_print_bits: frag[7]: len: 4096
skb_print_bits: frag[8]: len: 48
skb_print_bits: frag[9]: len: 4096
skb_print_bits: frag[10]: len: 4096
skb_print_bits: frag[11]: len: 48
skb_print_bits: frag[12]: len: 4096
skb_print_bits: frag[13]: len: 4096
skb_print_bits: frag[14]: len: 48
skb_print_bits: frag[15]: len: 4096
skb_print_bits: frag[16]: len: 4096
__i40e_chk_linearize: sum1: -1399
__i40e_chk_linearize: sum2: -1143
__i40e_chk_linearize: sum3: -1095
__i40e_chk_linearize: sum4: -839
__i40e_chk_linearize: sum5: -791
__i40e_chk_linearize: sum7: 3305
__i40e_chk_linearize: sum8: 3257
__i40e_chk_linearize: sum7: 7353
__i40e_chk_linearize: sum8: 7097
__i40e_chk_linearize: sum7: 7145
__i40e_chk_linearize: sum8: 7097
__i40e_chk_linearize: sum7: 11193
__i40e_chk_linearize: sum8: 10937
__i40e_chk_linearize: sum7: 15033
__i40e_chk_linearize: sum8: 14985
__i40e_chk_linearize: sum7: 15033
These were the descriptors generated from the above:
d[054] = 0x0000000000000000 0x16a0211400000011
d[055] = 0x0000000bfbaa40ee 0x000001ca02871640
d[056] = 0x0000000584bcd000 0x0000040202871640
d[057] = 0x0000000c0bfea9d0 0x000000c202871640
d[058] = 0x0000000584bcd100 0x0000040202871640
d[059] = 0x0000000c0bfeaa00 0x000000c202871640
d[05a] = 0x0000000584bcd200 0x0000040202871640
d[05b] = 0x0000000c0bfeaa30 0x000000c202871640
d[05c] = 0x000000056d5f0000 0x0000400202871640
d[05d] = 0x000000056d5f1000 0x0000400202871640
d[05e] = 0x0000000c0bfeaa60 0x000000c202871640
d[05f] = 0x00000005f2762000 0x0000400202871640
d[060] = 0x00000005f765e000 0x0000400202871640
d[061] = 0x0000000c0bfeaa90 0x000000c202871640
d[062] = 0x0000000574928000 0x0000400202871640
d[063] = 0x0000000568ba5000 0x0000400202871640
d[064] = 0x0000000c0bfeaac0 0x000000c202871640
d[065] = 0x00000005f68cd000 0x0000400202871640
d[066] = 0x0000000585a2a000 0x0000400202871670
On Fri, Feb 19, 2016 at 3:54 AM Jeff Kirsher <jeffrey.t.kirsher@intel.com>
wrote:
>
> From: Alexander Duyck <aduyck@mirantis.com>
>
> This patch is meant to rewrite the logic for how we determine if we can
> transmit the frame or if it needs to be linearized.
>
> + /* Initialize size to the negative value of gso_size minus 1. We
> + * use this as the worst case scenerio in which the frag ahead
> + * of us only provides one byte which is why we are limited to 6
> + * descriptors for a single transmit as the header and previous
> + * fragment are already consuming 2 descriptors.
> + */
> + sum = 1 - gso_size;
> +
> + /* Add size of frags 1 through 5 to create our initial sum */
> + sum += skb_frag_size(++frag);
I'm pretty sure this code skips frag[0] due to the pre-increment. The bug
seems to occur in the algorithm because skb->data contains the L2/L3/L4 headers
plus some data, and should count as 1 descriptor for every subsequent sent
packet; if the incoming skb has 7 chunks that add up to < MSS, then we
get in trouble.
> + sum += skb_frag_size(++frag);
> + sum += skb_frag_size(++frag);
> + sum += skb_frag_size(++frag);
> + sum += skb_frag_size(++frag);
> +
> + /* Walk through fragments adding latest fragment, testing it, and
> + * then removing stale fragments from the sum.
> + */
> + stale = &skb_shinfo(skb)->frags[0];
> + for (;;) {
> + sum += skb_frag_size(++frag);
> +
> + /* if sum is negative we failed to make sufficient progress */
> + if (sum < 0)
> + return true;
> +
> + /* use pre-decrement to avoid processing last fragment */
> + if (!--nr_frags)
> + break;
> +
> + sum -= skb_frag_size(++stale);
I think this line also skips stale[0]
^ permalink raw reply
* Re: [PATCH net-next] tcp: remove cwnd moderation after recovery
From: Stephen Hemminger @ 2016-03-30 0:35 UTC (permalink / raw)
To: Yuchung Cheng
Cc: davem, netdev, Matt Mathis, Neal Cardwell, Soheil Hassas Yeganeh
In-Reply-To: <1459296952-12214-1-git-send-email-ycheng@google.com>
On Tue, 29 Mar 2016 17:15:52 -0700
Yuchung Cheng <ycheng@google.com> wrote:
> For non-SACK connections, cwnd is lowered to inflight plus 3 packets
> when the recovery ends. This is an optional feature in the NewReno
> RFC 2582 to reduce the potential burst when cwnd is "re-opened"
> after recovery and inflight is low.
>
> This feature is questionably effective because of PRR: when
> the recovery ends (i.e., snd_una == high_seq) NewReno holds the
> CA_Recovery state for another round trip to prevent false fast
> retransmits. But if the inflight is low, PRR will overwrite the
> moderated cwnd in tcp_cwnd_reduction() later.
>
> On the other hand, if the recovery ends because the sender
> detects the losses were spurious (e.g., reordering), this feature
> unconditionally lowers a reverted cwnd even though nothing
> was lost.
>
> In principle, the loss recovery module should not update cwnd. Furthermore,
> pacing is much more effective at reducing bursts. Hence this patch
> removes the cwnd moderation feature.
>
> Signed-off-by: Matt Mathis <mattmathis@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
I have a concern that this might break Linux's built-in protection
against a hostile receiver sending bogus ACKs. Remember, Linux is
different from NewReno. You are changing something that has existed for
a long, long time.
^ permalink raw reply
* Re: [PATCH net-next v2 2/2] net: dsa: mv88e6xxx: Clear the PDOWN bit on setup
From: Patrick Uiterwijk @ 2016-03-30 0:58 UTC (permalink / raw)
To: Vivien Didelot
Cc: Andrew Lunn, Guenter Roeck, davem, netdev, Dennis Gilmore,
Peter Robinson
In-Reply-To: <87wpolgluy.fsf@ketchup.mtl.sfl>
Hi Vivien,
On Tue, Mar 29, 2016 at 6:49 PM, Vivien Didelot
<vivien.didelot@savoirfairelinux.com> wrote:
> Hi Andrew, Patrick,
>
> Andrew Lunn <andrew@lunn.ch> writes:
>
>> On Tue, Mar 29, 2016 at 12:23:06PM -0400, Vivien Didelot wrote:
>>> Hi Patrick,
>>>
>>> Two comments below.
>>>
>>> Patrick Uiterwijk <patrick@puiterwijk.org> writes:
>>>
>>> > +static int mv88e6xxx_power_on_serdes(struct dsa_switch *ds)
>>>
>>> Since this function assumes the SMI lock is already held, its name
>>> should be prefixed with _ by convention (_mv88e6xxx_power_on_serdes).
>>
>> We decided to drop that, since nearly everything would end up with a _
>> prefix. The assert_smi_lock() should find any missing locks, and
>> lockdep/deadlocks will make it clear when the lock is taken twice.
>
> OK, I didn't know that. This makes sense. There is no need to respin a
> v3 only for my previous &= comment then.
Does that mean the merger will fix this up?
Or that I'll roll a v3 when I get a reviewed-by for the second patch?
Thanks,
Patrick
^ permalink raw reply
* Re: [PATCH net-next v2 2/2] net: dsa: mv88e6xxx: Clear the PDOWN bit on setup
From: Andrew Lunn @ 2016-03-30 1:02 UTC (permalink / raw)
To: Patrick Uiterwijk
Cc: Vivien Didelot, Guenter Roeck, davem, netdev, Dennis Gilmore,
Peter Robinson
In-Reply-To: <CAJweMdYPfk7v0VVZoKW3+ZFsKD=6ja-yCo0jjwygTwYE3D30fQ@mail.gmail.com>
On Wed, Mar 30, 2016 at 12:58:04AM +0000, Patrick Uiterwijk wrote:
> Hi Vivien,
>
> On Tue, Mar 29, 2016 at 6:49 PM, Vivien Didelot
> <vivien.didelot@savoirfairelinux.com> wrote:
> > Hi Andrew, Patrick,
> >
> > Andrew Lunn <andrew@lunn.ch> writes:
> >
> >> On Tue, Mar 29, 2016 at 12:23:06PM -0400, Vivien Didelot wrote:
> >>> Hi Patrick,
> >>>
> >>> Two comments below.
> >>>
> >>> Patrick Uiterwijk <patrick@puiterwijk.org> writes:
> >>>
> >>> > +static int mv88e6xxx_power_on_serdes(struct dsa_switch *ds)
> >>>
> >>> Since this function assumes the SMI lock is already held, its name
> >>> should be prefixed with _ by convention (_mv88e6xxx_power_on_serdes).
> >>
> >> We decided to drop that, since nearly everything would end up with a _
> >> prefix. The assert_smi_lock() should find any missing locks, and
> >> lockdep/deadlocks will make it clear when the lock is taken twice.
> >
> > OK, I didn't know that. This makes sense. There is no need to respin a
> > v3 only for my previous &= comment then.
>
> Does that mean the merger will fix this up?
> Or that I'll roll a v3 when I get a reviewed-by for the second patch?
Hi Patrick
Roll a v3, and you can add
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
as well as Vivien's for patch #1.
Andrew
^ permalink raw reply
* [PATCH net-next v3 1/2] net: dsa: mv88e6xxx: Introduce _mv88e6xxx_phy_page_{read,write}
From: Patrick Uiterwijk @ 2016-03-30 1:39 UTC (permalink / raw)
To: linux, davem, vivien.didelot, andrew
Cc: netdev, dennis, pbrobinson, Patrick Uiterwijk
Add versions of the phy_page_read and _write functions to
be used in a context where the SMI mutex is held.
Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Patrick Uiterwijk <patrick@puiterwijk.org>
---
drivers/net/dsa/mv88e6xxx.c | 49 +++++++++++++++++++++++++++++++++------------
1 file changed, 36 insertions(+), 13 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index fa086e0..86a2029 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2264,6 +2264,38 @@ static void mv88e6xxx_bridge_work(struct work_struct *work)
mutex_unlock(&ps->smi_mutex);
}
+static int _mv88e6xxx_phy_page_write(struct dsa_switch *ds, int port, int page,
+ int reg, int val)
+{
+ int ret;
+
+ ret = _mv88e6xxx_phy_write_indirect(ds, port, 0x16, page);
+ if (ret < 0)
+ goto restore_page_0;
+
+ ret = _mv88e6xxx_phy_write_indirect(ds, port, reg, val);
+restore_page_0:
+ _mv88e6xxx_phy_write_indirect(ds, port, 0x16, 0x0);
+
+ return ret;
+}
+
+static int _mv88e6xxx_phy_page_read(struct dsa_switch *ds, int port, int page,
+ int reg)
+{
+ int ret;
+
+ ret = _mv88e6xxx_phy_write_indirect(ds, port, 0x16, page);
+ if (ret < 0)
+ goto restore_page_0;
+
+ ret = _mv88e6xxx_phy_read_indirect(ds, port, reg);
+restore_page_0:
+ _mv88e6xxx_phy_write_indirect(ds, port, 0x16, 0x0);
+
+ return ret;
+}
+
static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
{
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
@@ -2714,13 +2746,9 @@ int mv88e6xxx_phy_page_read(struct dsa_switch *ds, int port, int page, int reg)
int ret;
mutex_lock(&ps->smi_mutex);
- ret = _mv88e6xxx_phy_write_indirect(ds, port, 0x16, page);
- if (ret < 0)
- goto error;
- ret = _mv88e6xxx_phy_read_indirect(ds, port, reg);
-error:
- _mv88e6xxx_phy_write_indirect(ds, port, 0x16, 0x0);
+ ret = _mv88e6xxx_phy_page_read(ds, port, page, reg);
mutex_unlock(&ps->smi_mutex);
+
return ret;
}
@@ -2731,14 +2759,9 @@ int mv88e6xxx_phy_page_write(struct dsa_switch *ds, int port, int page,
int ret;
mutex_lock(&ps->smi_mutex);
- ret = _mv88e6xxx_phy_write_indirect(ds, port, 0x16, page);
- if (ret < 0)
- goto error;
-
- ret = _mv88e6xxx_phy_write_indirect(ds, port, reg, val);
-error:
- _mv88e6xxx_phy_write_indirect(ds, port, 0x16, 0x0);
+ ret = _mv88e6xxx_phy_page_write(ds, port, page, reg, val);
mutex_unlock(&ps->smi_mutex);
+
return ret;
}
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 2/2] net: dsa: mv88e6xxx: Clear the PDOWN bit on setup
From: Patrick Uiterwijk @ 2016-03-30 1:39 UTC (permalink / raw)
To: linux, davem, vivien.didelot, andrew
Cc: netdev, dennis, pbrobinson, Patrick Uiterwijk
In-Reply-To: <1459301981-26535-1-git-send-email-patrick@puiterwijk.org>
Some of the vendor-specific bootloaders set up this part
of the initialization for us, so this was never added.
However, since upstream bootloaders don't initialize the
chip specifically, they leave the fiber MII's PDOWN flag
set, which means that the CPU port doesn't connect.
This patch checks whether this flag has already been cleared
by something else and, if not, clears it.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Patrick Uiterwijk <patrick@puiterwijk.org>
---
drivers/net/dsa/mv88e6xxx.c | 36 ++++++++++++++++++++++++++++++++++++
drivers/net/dsa/mv88e6xxx.h | 8 ++++++++
2 files changed, 44 insertions(+)
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 86a2029..50454be 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2296,6 +2296,25 @@ restore_page_0:
return ret;
}
+static int mv88e6xxx_power_on_serdes(struct dsa_switch *ds)
+{
+ int ret;
+
+ ret = _mv88e6xxx_phy_page_read(ds, REG_FIBER_SERDES, PAGE_FIBER_SERDES,
+ MII_BMCR);
+ if (ret < 0)
+ return ret;
+
+ if (ret & BMCR_PDOWN) {
+ ret &= ~BMCR_PDOWN;
+ ret = _mv88e6xxx_phy_page_write(ds, REG_FIBER_SERDES,
+ PAGE_FIBER_SERDES, MII_BMCR,
+ ret);
+ }
+
+ return ret;
+}
+
static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
{
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
@@ -2399,6 +2418,23 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
goto abort;
}
+ /* If this port is connected to a SerDes, make sure the SerDes is not
+ * powered down.
+ */
+ if (mv88e6xxx_6352_family(ds)) {
+ ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_STATUS);
+ if (ret < 0)
+ goto abort;
+ ret &= PORT_STATUS_CMODE_MASK;
+ if ((ret == PORT_STATUS_CMODE_100BASE_X) ||
+ (ret == PORT_STATUS_CMODE_1000BASE_X) ||
+ (ret == PORT_STATUS_CMODE_SGMII)) {
+ ret = mv88e6xxx_power_on_serdes(ds);
+ if (ret < 0)
+ goto abort;
+ }
+ }
+
/* Port Control 2: don't force a good FCS, set the maximum frame size to
* 10240 bytes, disable 802.1q tags checking, don't discard tagged or
* untagged frames on this port, do a destination address lookup on all
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 9a038ab..26a424a 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -28,6 +28,10 @@
#define SMI_CMD_OP_45_READ_DATA_INC ((3 << 10) | SMI_CMD_BUSY)
#define SMI_DATA 0x01
+/* Fiber/SERDES Registers are located at SMI address F, page 1 */
+#define REG_FIBER_SERDES 0x0f
+#define PAGE_FIBER_SERDES 0x01
+
#define REG_PORT(p) (0x10 + (p))
#define PORT_STATUS 0x00
#define PORT_STATUS_PAUSE_EN BIT(15)
@@ -45,6 +49,10 @@
#define PORT_STATUS_MGMII BIT(6) /* 6185 */
#define PORT_STATUS_TX_PAUSED BIT(5)
#define PORT_STATUS_FLOW_CTRL BIT(4)
+#define PORT_STATUS_CMODE_MASK 0x0f
+#define PORT_STATUS_CMODE_100BASE_X 0x8
+#define PORT_STATUS_CMODE_1000BASE_X 0x9
+#define PORT_STATUS_CMODE_SGMII 0xa
#define PORT_PCS_CTRL 0x01
#define PORT_PCS_CTRL_RGMII_DELAY_RXCLK BIT(15)
#define PORT_PCS_CTRL_RGMII_DELAY_TXCLK BIT(14)
--
2.7.4
^ permalink raw reply related