Netdev List


* Re: [PATCH net-next 0/3] tc: act_ife: handle IEEE IFE ethertype as default
From: Stephen Hemminger @ 2017-08-30 15:27 UTC (permalink / raw)
  To: Alexander Aring
  Cc: jhs, yotamg, xiyou.wangcong, jiri, lucasb, netdev,
	linux-kselftest
In-Reply-To: <20170828190315.26646-1-aring@mojatatu.com>

On Mon, 28 Aug 2017 15:03:12 -0400
Alexander Aring <aring@mojatatu.com> wrote:

> Hi,
> 
> this patch series will introduce the IFE ethertype which is registered by
> IEEE. If the netlink act_ife type netlink attribute is not given it will
> use this value by default now.
> At least it will introduce some UAPI testcases to check if the default type
> is used if not specified and vice versa.
> 
> - Alex
> 
> Alexander Aring (3):
>   if_ether: add forces ife lfb type
>   act_ife: use registered ife_type as fallback
>   tc-testing: add test for testing ife type
> 
>  include/uapi/linux/if_ether.h                      |  1 +
>  net/sched/act_ife.c                                | 17 ++------
>  .../tc-testing/tc-tests/actions/tests.json         | 50 ++++++++++++++++++++++
>  3 files changed, 54 insertions(+), 14 deletions(-)
> 

Applied to net-next

^ permalink raw reply

* Re: [PATCH iproute2 net-next v2 0/2] Add support for seg6 l2encap mode
From: Stephen Hemminger @ 2017-08-30 15:30 UTC (permalink / raw)
  To: David Lebrun; +Cc: netdev
In-Reply-To: <20170828192640.19240-1-david.lebrun@uclouvain.be>

On Mon, 28 Aug 2017 20:26:38 +0100
David Lebrun <david.lebrun@uclouvain.be> wrote:

> This patch series adds support for the new L2ENCAP mode for SRv6
> encapsulations.
> 
> v2: use a name/value table for encap modes
> 
> David Lebrun (2):
>   iproute: add support for seg6 l2encap mode
>   man: add documentation for seg6 l2encap mode
> 
>  ip/iproute_lwtunnel.c  | 41 ++++++++++++++++++++++++++++++-----------
>  man/man8/ip-route.8.in |  6 +++++-
>  2 files changed, 35 insertions(+), 12 deletions(-)
> 

Applied to net-next

^ permalink raw reply

* Re: nflog performance ...
From: Stephen Hemminger @ 2017-08-30 15:37 UTC (permalink / raw)
  To: Akshat Kakkar; +Cc: netdev
In-Reply-To: <CAA5aLPjzMd71S-L47G=2pdF+v+SruZ4kEavV03r4datDwdAJ-Q@mail.gmail.com>

On Wed, 30 Aug 2017 13:27:57 +0530
Akshat Kakkar <akshat.1984@gmail.com> wrote:

> Anybody?
> 
> On Tue, Aug 29, 2017 at 4:11 PM, Akshat Kakkar <akshat.1984@gmail.com> wrote:
> > I am using ulogd2 to log iptables activity.
> > However, when using pgsql as output plugin ... performance is very
> > very sluggish. (~130-150 entries per second)
> >
> > To enhance performance I am trying
> >
> > modprobe ipt_ULOG nlbufsiz=65535 flushtimeout=1000
> >
> > but this gives error : ipt_ULOG module not found.
> >
> >
> > On the same lines, I tried
> >
> > modprobe ipt_NFLOG nlbufsiz=65535 flushtimeout=1000
> > It didnt give any error !!!
> >
> > But still there is no increase in performance.
> > Are these values effective?
> >
> > I am also using  --nflog-threshold 50 in iptables rule.
> >
> > I am setting buffer_size as
> > netlink_socket_buffer_size=104857600
> > netlink_socket_buffer_maxsize=1048576000
> >
> > When running ulog, it gives message of setting buffer size as
> > 21708600. Though it didnt give any message like
> > "ulogd_inppkt_NFLOG.c:443 We are losing events, increasing buffer size
> > to xxxxxx"
> > but again why it is only setting buffer size as 21708600 though I have
> > set in config as 104857600?  

Try the netfilter users mailing list, netfilter@vger.kernel.org

^ permalink raw reply

* RE: [PATCH net-next v2] net: bcmgenet: Use correct I/O accessors
From: Florian Fainelli @ 2017-08-30 16:00 UTC (permalink / raw)
  To: David Laight, netdev@vger.kernel.org
  Cc: davem@davemloft.net, opendmb@gmail.com, jaedon.shin@gmail.com
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DD0069792@AcuExch.aculab.com>

On August 30, 2017 4:39:52 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: Florian Fainelli
>> Sent: 29 August 2017 20:26
>> The GENET driver currently uses __raw_{read,write}l which means
>> native I/O endian. This works correctly for an ARM LE kernel (default)
>> but fails miserably on an ARM BE (BE8) kernel where registers are kept
>> little endian, so replace uses with {read,write}l_relaxed here which is
>> what we want because this is all performance sensitive code.
>...
>> +	if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
>> +		__raw_writel(value, offset);
>> +	else
>> +		writel_relaxed(value, offset);
>
>How do you know that all BE MIPS that might have this driver have
>the BE registers of your card?
>(Or that all ARM BE systems have LE registers.)
>

This is the embedded network controller found on Broadcom STB SoCs, which were MIPS-based before and are now ARM/ARM64-based. Any MIPS-based SoC that has this controller uses one of Broadcom's BMIPS processors (4350/4380/5000/5200), and all of them obey the same rule: their endian strap propagates to the bus register endian setting, so the result is always native endian for them. All ARM/ARM64-based SoCs are paired with a newer version of the register bus that deliberately dropped support for changing its endianness, so it is always LE on these newer SoCs.

You won't find this controller in any other Broadcom product, and there was never a version designed for, e.g., running on a PCI(e)-attached FPGA or anything similar.
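
To make the native-endian versus fixed little-endian distinction concrete, here is a minimal userspace sketch. It is only a model under stated assumptions: a "register" is four bytes of ordinary memory, and the model_* helpers are hypothetical names that loosely mirror what __raw_{read,write}l and {read,write}l_relaxed do with respect to byte order; they are not the kernel accessors.

```c
#include <stdint.h>
#include <string.h>

/* Userspace model only: a "register" is 4 bytes of plain memory. */

/* Native-endian access, loosely modeling __raw_readl()/__raw_writel():
 * bytes are copied as-is, so the layout follows the CPU's endianness. */
static uint32_t model_raw_read(const uint8_t reg[4])
{
	uint32_t v;

	memcpy(&v, reg, sizeof(v));
	return v;
}

static void model_raw_write(uint8_t reg[4], uint32_t v)
{
	memcpy(reg, &v, sizeof(v));
}

/* Fixed little-endian access, loosely modeling
 * readl_relaxed()/writel_relaxed(): the byte layout is always LE,
 * whatever the CPU's endianness is. */
static uint32_t model_le_read(const uint8_t reg[4])
{
	return (uint32_t)reg[0] |
	       ((uint32_t)reg[1] << 8) |
	       ((uint32_t)reg[2] << 16) |
	       ((uint32_t)reg[3] << 24);
}

static void model_le_write(uint8_t reg[4], uint32_t v)
{
	reg[0] = (uint8_t)(v & 0xff);
	reg[1] = (uint8_t)((v >> 8) & 0xff);
	reg[2] = (uint8_t)((v >> 16) & 0xff);
	reg[3] = (uint8_t)((v >> 24) & 0xff);
}
```

On a little-endian CPU the two pairs behave identically; they differ only on a big-endian CPU, which is the ARM BE8 case discussed above.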

>If nothing else the driver code should be predicated on a
>condition set by the kernel config that depends on the cpu build
>rather than embedding that condition in a lot of drivers

The driver is made to build for as many configurations as possible but it won't get probed unless the appropriate DT nodes are populated.

-- 
Florian

^ permalink raw reply

* Re: [PATCH] staging: r8822be: Fix typo for CONFIG_RTLWIFI_DEBUG
From: Larry Finger @ 2017-08-30 16:28 UTC (permalink / raw)
  To: Andreas Ziegler, Greg KH
  Cc: devel, netdev, Yan-Hsuan Chuang, Steven Ting, Birming Chiu
In-Reply-To: <bd895fab-dfb1-aa80-1ce5-4cad55d0e234@fau.de>

On 08/30/2017 02:58 AM, Andreas Ziegler wrote:
> Indeed, sorry I missed that as well.
> 
> So what should we make of that #ifdef? The code inside it doesn't compile
> (anymore? I didn't find any development history for that patch except the
> original mail), as there is no definition of struct submit_ctx in the headers
> (for other rtl drivers - 8188eu, 8723bs - that struct lives in
> include/rtw_xmit.h). Is a comparable header simply missing?
> 
> Regards,
> 
> Andreas

Andreas,

I'm sorry that I did not have time yesterday to properly analyze the situation. 
All I knew is that your patch was not the correct one. It turns out that the 
extra code was left over from the original writing/testing of the driver and 
should have been deleted. I have prepared a patch that does that and will submit 
it soon.

When the extraneous code was deleted, additional simplifications of the code were 
apparent. I am currently testing that change, and will submit the two patches at 
the same time.

Larry

^ permalink raw reply

* [PATCH net] kcm: do not attach PF_KCM sockets to avoid deadlock
From: Eric Dumazet @ 2017-08-30 16:29 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Tom Herbert

From: Eric Dumazet <edumazet@google.com>

syzkaller had no problem triggering a deadlock by attaching a KCM socket
to another one (or itself). (The original syzkaller report was a very
confusing lockdep splat during a sendmsg().)

It seems KCM claims to only support TCP, but no enforcement is done,
so we might need to add additional checks.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
---
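For illustration only, the shape of the check can be modeled in userspace as below. This is a sketch under assumptions, not the kernel code: struct model_sock, model_kcm_attach() and the MODEL_PF_* constants are hypothetical stand-ins for struct sock, kcm_attach() and the real protocol family values.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real protocol family constants. */
#define MODEL_PF_INET 2
#define MODEL_PF_KCM  41

/* Hypothetical, heavily reduced stand-in for struct sock. */
struct model_sock {
	int family;
};

/* Model of the early checks in the attach path: reject a missing
 * socket, then reject a KCM socket so that a KCM socket can never be
 * attached to another KCM socket (or to itself). */
static int model_kcm_attach(const struct model_sock *csk)
{
	if (!csk)
		return -EINVAL;

	/* Prevent loops, which would risk a deadlock. */
	if (csk->family == MODEL_PF_KCM)
		return -EOPNOTSUPP;

	return 0; /* the real function goes on to allocate the psock */
}
```
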
 net/kcm/kcmsock.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 48e993b2dbcf1afae04968ed840e2e98c2cf6772..af4e76ac88ff0817398d1d7460a41f0cd5fe6f30 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1387,6 +1387,10 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
 	if (!csk)
 		return -EINVAL;
 
+	/* We must prevent loops or risk deadlock ! */
+	if (csk->sk_family == PF_KCM)
+		return -EOPNOTSUPP;
+
 	psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL);
 	if (!psock)
 		return -ENOMEM;

^ permalink raw reply related

* Re: [PATCH] rtlwifi: btcoex: 23b 1ant: fix duplicated code for different branches
From: Larry Finger @ 2017-08-30 16:37 UTC (permalink / raw)
  To: Gustavo A. R. Silva, Pkshih, Kalle Valo
  Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20170830134223.GA13596@embeddedgus>

On 08/30/2017 08:42 AM, Gustavo A. R. Silva wrote:
> Refactor code in order to avoid identical code for different branches.
> 
> This issue was detected with the help of Coccinelle.
> 
> Addresses-Coverity-ID: 1226788
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
> ---
> This issue was reported by Coverity and it was tested by compilation only.
> I'm suspicious this may be a copy/paste error. Please, verify.

I have referred this change to the engineers at Realtek. For the moment, please 
hold this patch.

Thanks for reporting the condition.

Larry

^ permalink raw reply

* Re: [PATCH net] kcm: do not attach PF_KCM sockets to avoid deadlock
From: Tom Herbert @ 2017-08-30 16:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1504110571.11498.120.camel@edumazet-glaptop3.roam.corp.google.com>

On Wed, Aug 30, 2017 at 9:29 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> syzkaller had no problem to trigger a deadlock, attaching a KCM socket
> to another one (or itself). (original syzkaller report was a very
> confusing lockdep splat during a sendmsg())
>
> It seems KCM claims to only support TCP, but no enforcement is done,
> so we might need to add additional checks.
>
> Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Dmitry Vyukov <dvyukov@google.com>

Acked-by: Tom Herbert <tom@quantonium.net>

> ---
>  net/kcm/kcmsock.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> index 48e993b2dbcf1afae04968ed840e2e98c2cf6772..af4e76ac88ff0817398d1d7460a41f0cd5fe6f30 100644
> --- a/net/kcm/kcmsock.c
> +++ b/net/kcm/kcmsock.c
> @@ -1387,6 +1387,10 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
>         if (!csk)
>                 return -EINVAL;
>
> +       /* We must prevent loops or risk deadlock ! */
> +       if (csk->sk_family == PF_KCM)
> +               return -EOPNOTSUPP;
> +
>         psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL);
>         if (!psock)
>                 return -ENOMEM;
>
>

^ permalink raw reply

* [PATCH] rtlwifi: refactor code in halbtcoutsrc module
From: Gustavo A. R. Silva @ 2017-08-30 16:46 UTC (permalink / raw)
  To: Larry Finger, Chaoming Li, Kalle Valo
  Cc: linux-wireless, netdev, linux-kernel, Gustavo A. R. Silva

Function halbtc_get_wifi_rssi always returns rtlpriv->dm.undec_sm_pwdb.
So this function can be removed and the value of
rtlpriv->dm.undec_sm_pwdb assigned to *s32_tmp directly.

This issue was first reported by Coverity as "identical code for different
branches" in function halbtc_get_wifi_rssi.

Addresses-Coverity-ID: 1226793
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
This code was reported by Coverity and it was tested by compilation only.
Chances are this may be a copy/paste error in function
halbtc_get_wifi_rssi. Please, verify.
Also, notice this code has been there since 2014.

 .../net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c   | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c
index c1eacd8..2a47b97 100644
--- a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c
+++ b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c
@@ -373,17 +373,6 @@ u32 halbtc_get_wifi_link_status(struct btc_coexist *btcoexist)
 	return ret_val;
 }
 
-static s32 halbtc_get_wifi_rssi(struct rtl_priv *rtlpriv)
-{
-	int undec_sm_pwdb = 0;
-
-	if (rtlpriv->mac80211.link_state >= MAC80211_LINKED)
-		undec_sm_pwdb = rtlpriv->dm.undec_sm_pwdb;
-	else /* associated entry pwdb */
-		undec_sm_pwdb = rtlpriv->dm.undec_sm_pwdb;
-	return undec_sm_pwdb;
-}
-
 static bool halbtc_get(void *void_btcoexist, u8 get_type, void *out_buf)
 {
 	struct btc_coexist *btcoexist = (struct btc_coexist *)void_btcoexist;
@@ -479,7 +468,7 @@ static bool halbtc_get(void *void_btcoexist, u8 get_type, void *out_buf)
 		*bool_tmp = false;
 		break;
 	case BTC_GET_S4_WIFI_RSSI:
-		*s32_tmp = halbtc_get_wifi_rssi(rtlpriv);
+		*s32_tmp = rtlpriv->dm.undec_sm_pwdb;
 		break;
 	case BTC_GET_S4_HS_RSSI:
 		*s32_tmp = 0;
-- 
2.5.0

^ permalink raw reply related

* Re: [PATCH] rtlwifi: btcoex: 23b 1ant: fix duplicated code for different branches
From: Gustavo A. R. Silva @ 2017-08-30 16:51 UTC (permalink / raw)
  To: Larry Finger, Pkshih, Kalle Valo; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <eef42b52-ae7b-e574-237f-209ac8ec033c@lwfinger.net>

Hi Larry,

On 08/30/2017 11:37 AM, Larry Finger wrote:
> On 08/30/2017 08:42 AM, Gustavo A. R. Silva wrote:
>> Refactor code in order to avoid identical code for different branches.
>>
>> This issue was detected with the help of Coccinelle.
>>
>> Addresses-Coverity-ID: 1226788
>> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
>> ---
>> This issue was reported by Coverity and it was tested by compilation
>> only.
>> I'm suspicious this may be a copy/paste error. Please, verify.
>
> I have referred this change to the engineers at Realtek. For the moment,
> please hold this patch.
>
> Thanks for reporting the condition.
>

Glad to help. :)

-- 
Gustavo A. R. Silva

^ permalink raw reply

* linux-next: Signed-off-by missing for commit in the net-next tree
From: Stephen Rothwell @ 2017-08-30 16:51 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, David Ahern

Hi all,

Commit

  1b70d792cf67 ("ipv6: Use rt6i_idev index for echo replies to a local address")

is missing a Signed-off-by from its author.

-- 
Cheers,
Stephen Rothwell

^ permalink raw reply

* Re: rsi: remove memset before memcpy
From: Kalle Valo @ 2017-08-30 16:51 UTC (permalink / raw)
  To: Himanshu Jha
  Cc: amit.karwar, linux-wireless, netdev, linux-kernel, Himanshu Jha
In-Reply-To: <1504032890-25229-1-git-send-email-himanshujha199640@gmail.com>

Himanshu Jha <himanshujha199640@gmail.com> wrote:

> calling memcpy immediately after memset with the same region of memory
> makes memset redundant.
> 
> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>

Patch applied to wireless-drivers-next.git, thanks.

66a3479e1217 rsi: remove memset before memcpy

-- 
https://patchwork.kernel.org/patch/9927943/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: linux-next: Signed-off-by missing for commit in the net-next tree
From: David Ahern @ 2017-08-30 16:53 UTC (permalink / raw)
  To: Stephen Rothwell, David Miller, Networking
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20170831025131.56d2cf92@canb.auug.org.au>

On 8/30/17 10:51 AM, Stephen Rothwell wrote:
> Hi all,
> 
> Commit
> 
>   1b70d792cf67 ("ipv6: Use rt6i_idev index for echo replies to a local address")
> 
> is missing a Signed-off-by from its author.
> 

Eric pointed this out last night. The commit message copied output from
a command that contained '---' so git ignored the remainder of the message.
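
As a rough illustration of why the trailer went missing, the sketch below models the cut in a simplified way: everything from the first line that is exactly "---" onward is dropped. This is an assumption-laden model, not git's actual code; model_kept_length() is a hypothetical helper and the real rules for recognizing the separator may be broader.

```c
#include <stddef.h>
#include <string.h>

/* Return how many leading bytes of msg survive when the message is cut
 * at the first line consisting of exactly "---". If no such line
 * exists, the whole message is kept. */
static size_t model_kept_length(const char *msg)
{
	const char *line = msg;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len == 3 && strncmp(line, "---", 3) == 0)
			return (size_t)(line - msg);
		if (!eol)
			break;
		line = eol + 1;
	}
	return strlen(msg);
}
```

With a message whose body pastes command output containing such a line, a Signed-off-by placed after it falls outside the kept part.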

^ permalink raw reply

* [PATCH] rtlwifi: rtl8723be: fix duplicated code for different branches
From: Gustavo A. R. Silva @ 2017-08-30 17:04 UTC (permalink / raw)
  To: Larry Finger, Chaoming Li, Kalle Valo
  Cc: linux-wireless, netdev, linux-kernel, Gustavo A. R. Silva

Refactor code in order to avoid identical code for different branches.

Addresses-Coverity-ID: 1248728
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
This issue was reported by Coverity and it was tested by compilation only.
Please, verify if this is not a copy/paste error.
Also, notice this code has been there since 2014.

 drivers/net/wireless/realtek/rtlwifi/rtl8723be/dm.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/dm.c
index 131c0d1..15c117e 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/dm.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/dm.c
@@ -883,12 +883,8 @@ static void rtl8723be_dm_txpower_tracking_callback_thermalmeter(
 	if ((rtldm->power_index_offset[RF90_PATH_A] != 0) &&
 	    (rtldm->txpower_track_control)) {
 		rtldm->done_txpower = true;
-		if (thermalvalue > rtlefuse->eeprom_thermalmeter)
-			rtl8723be_dm_tx_power_track_set_power(hw, BBSWING, 0,
-							     index_for_channel);
-		else
-			rtl8723be_dm_tx_power_track_set_power(hw, BBSWING, 0,
-							     index_for_channel);
+		rtl8723be_dm_tx_power_track_set_power(hw, BBSWING, 0,
+						      index_for_channel);
 
 		rtldm->swing_idx_cck_base = rtldm->swing_idx_cck;
 		rtldm->swing_idx_ofdm_base[RF90_PATH_A] =
-- 
2.5.0

^ permalink raw reply related

* Re: [PATCH] net: ti: cpsw-common: dont print error if ti_cm_get_macid() fails
From: David Miller @ 2017-08-30 17:09 UTC (permalink / raw)
  To: nsekhar; +Cc: grygorii.strashko, linux-omap, netdev, tony, aford173
In-Reply-To: <423afc6874d8911615c4df941957067aebfc09dd.1504080198.git.nsekhar@ti.com>

From: Sekhar Nori <nsekhar@ti.com>
Date: Wed, 30 Aug 2017 13:37:13 +0530

> It is quite common for ti_cm_get_macid() to fail on some of the
> platforms it is invoked on. They include any platform where
> mac address is not part of SoC register space.
> 
> On these platforms, mac address is read and populated in
> device-tree by bootloader. An example is TI DA850.
> 
> Downgrade the severity of message to "information", so it does
> not spam logs when 'quiet' boot is desired.
> 
> Signed-off-by: Sekhar Nori <nsekhar@ti.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH][next][V2] bpf: test_maps: fix typo "conenct" -> "connect"
From: Colin Ian King @ 2017-08-30 17:09 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov, Shuah Khan, netdev,
	linux-kselftest
  Cc: linux-kernel
In-Reply-To: <59A6C1B9.3050005@iogearbox.net>

On 30/08/17 14:46, Daniel Borkmann wrote:
> On 08/30/2017 01:47 PM, Colin King wrote:
>> From: Colin Ian King <colin.king@canonical.com>
>>
>> Trivial fix to typo in printf error message
>>
>> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> 
> For net-next; looks like there is also one in "failed to listeen\n".
> Want to fix this one as well ? ;)

Ah, missed that one. Thanks.
> 
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply

* Re: [GIT] Networking
From: David Miller @ 2017-08-30 17:11 UTC (permalink / raw)
  To: kvalo; +Cc: pavel, xiyou.wangcong, torvalds, akpm, netdev, linux-kernel
In-Reply-To: <87pobdt8qc.fsf@kamboji.qca.qualcomm.com>

From: Kalle Valo <kvalo@codeaurora.org>
Date: Wed, 30 Aug 2017 17:45:31 +0300

> Pavel Machek <pavel@ucw.cz> writes:
> 
>> Could we get this one in?
>>
>> wl1251 misses a spin_lock_init().
>>
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg177031.html
>>
>> It seems pretty trivial, yet getting the backtraces is not nice.
> 
> It's in wireless-drivers-next and will be in 4.14-rc1:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git/commit/?id=6e9aae179f290f1a44fce7ef8e9a8e2dd68ed1e4

Is the bug only present in net-next?

^ permalink raw reply

* [PATCH][net-next][V3] bpf: test_maps: fix typos, "conenct" and "listeen"
From: Colin King @ 2017-08-30 17:15 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Shuah Khan, netdev,
	linux-kselftest, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Trivial fix to typos in printf error messages:
"conenct" -> "connect"
"listeen" -> "listen"

Thanks to Daniel Borkmann for spotting one of these mistakes.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 tools/testing/selftests/bpf/test_maps.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 7059bb315a10..4acc772a28c0 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -504,7 +504,7 @@ static void test_sockmap(int tasks, void *data)
 		}
 		err = listen(sfd[i], 32);
 		if (err < 0) {
-			printf("failed to listeen\n");
+			printf("failed to listen\n");
 			goto out;
 		}
 	}
@@ -525,7 +525,7 @@ static void test_sockmap(int tasks, void *data)
 		addr.sin_port = htons(ports[i - 2]);
 		err = connect(sfd[i], (struct sockaddr *)&addr, sizeof(addr));
 		if (err) {
-			printf("failed to conenct\n");
+			printf("failed to connect\n");
 			goto out;
 		}
 	}
-- 
2.14.1

^ permalink raw reply related

* Re: [PATCH v3 net-next] staging: irda: fix init level for irda core
From: David Miller @ 2017-08-30 17:15 UTC (permalink / raw)
  To: gregkh; +Cc: devel, netdev, samuel, linux-kernel, fengguang.wu, geert
In-Reply-To: <20170830111649.GA13000@kroah.com>

From: Greg KH <gregkh@linuxfoundation.org>
Date: Wed, 30 Aug 2017 13:16:49 +0200

> When moving the IRDA code out of net/ into drivers/staging/irda/net, the
> link order changes when IRDA is built into the kernel.  That causes a
> kernel crash at boot time as netfilter isn't initialized yet.
> 
> To fix this, move the init call level of the irda core to be
> device_initcall() as the link order keeps this being initialized at the
> correct time.
> 
> Reported-by: kernel test robot <fengguang.wu@intel.com>
> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> 
> v3 - just change the initcall level, works so much simpler, thanks to
>      DaveM for the idea.
> v2 - don't force irda to be a module, make the Makefiles put irda back
>      where it was before in the link order.

Applied, thanks for following up on this Greg.

^ permalink raw reply

* Re: [net-next PATCHv6 2/2] net: socionext: Add NetSec driver
From: Andrew Lunn @ 2017-08-30 17:17 UTC (permalink / raw)
  To: Jassi Brar
  Cc: netdev, devicetree, linux-arm-kernel, davem, mark.rutland, arnd,
	patches, Jassi Brar, robh+dt, andy
In-Reply-To: <1504088771-6255-1-git-send-email-jaswinder.singh@linaro.org>

> +static int netsec_mac_update_to_phy_state(struct netsec_priv *priv)
> +{
> +	struct phy_device *phydev = priv->ndev->phydev;
> +	u32 value = 0;
> +
> +	value = phydev->duplex ? NETSEC_GMAC_MCR_REG_FULL_DUPLEX_COMMON :
> +				       NETSEC_GMAC_MCR_REG_HALF_DUPLEX_COMMON;
> +
> +	if (phydev->speed != SPEED_1000)
> +		value |= NETSEC_MCR_PS;
> +
> +	if ((priv->phy_interface != PHY_INTERFACE_MODE_GMII) &&
> +	    (phydev->speed == SPEED_100))
> +		value |= NETSEC_GMAC_MCR_REG_FES;
> +
> +	value |= NETSEC_GMAC_MCR_REG_CST | NETSEC_GMAC_MCR_REG_JE;
> +
> +	if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII)
> +		value |= NETSEC_GMAC_MCR_REG_IBN;
> +
> +	if (netsec_mac_write(priv, GMAC_REG_MCR, value))
> +		return -ETIMEDOUT;
> +
> +	priv->actual_link_speed = phydev->speed;
> +	priv->actual_duplex = phydev->duplex;
> +	netif_info(priv, drv, priv->ndev, "%s: %uMbps, duplex:%d\n",
> +		   __func__, phydev->speed, phydev->duplex);

phy_print_status()

> +	mac = of_get_mac_address(pdev->dev.of_node);
> +	if (mac)
> +		ether_addr_copy(ndev->dev_addr, mac);
> +
> +	if (!is_valid_ether_addr(ndev->dev_addr)) {
> +		eth_hw_addr_random(ndev);
> +		dev_warn(&pdev->dev, "No MAC address found, using random\n");
> +	}

So the MAC address is optional, unlike what the binding document says.

> +	priv->phy_np = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0);
> +	if (!priv->phy_np) {
> +		netif_err(priv, probe, ndev, "missing phy in DT\n");

It is the phy-handle which is missing, not the phy.

> +
> +	/* MTU range */
> +	ndev->min_mtu = ETH_MIN_MTU;

No need to set this, it is the default.

Otherwise, this looks good, in terms of phy and mdio.

	   Andrew

^ permalink raw reply

* [PATCH net-next 0/2] tcp: re-add header prediction
From: Florian Westphal @ 2017-08-30 17:24 UTC (permalink / raw)
  To: netdev; +Cc: edumazet

Eric reported a performance regression caused by header prediction
removal.

We now call tcp_ack() much more frequently, for some workloads
this brings in enough cache line misses to become noticeable.

We could possibly still kill HP provided we find a different
way to suppress unneeded tcp_ack, but given we're late in
the cycle it seems preferable to revert.

^ permalink raw reply

* [PATCH net-next 1/2] tcp: Revert "tcp: remove CA_ACK_SLOWPATH"
From: Florian Westphal @ 2017-08-30 17:24 UTC (permalink / raw)
  To: netdev; +Cc: edumazet, Florian Westphal
In-Reply-To: <20170830172458.18544-1-fw@strlen.de>

This change was a followup to the header prediction removal,
so first revert this as a prerequisite to back out hp removal.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
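A small userspace sketch of the flag layout this revert restores and of the dispatch it enables in a congestion control module. The MODEL_* names and both helpers are hypothetical; only the bit positions and the "slow path versus fast path" split are taken from the diff below.

```c
#include <stdint.h>

/* Bit positions as restored by this revert. */
enum model_ca_ack_flags {
	MODEL_CA_ACK_SLOWPATH   = (1 << 0), /* in slow path processing */
	MODEL_CA_ACK_WIN_UPDATE = (1 << 1), /* ACK updated window */
	MODEL_CA_ACK_ECE        = (1 << 2), /* ECE bit is set on ack */
};

/* Model of how the slow path builds the event flags: it always starts
 * from SLOWPATH and ORs in the other bits as they apply. */
static uint32_t model_slowpath_ack_flags(int saw_ece, int win_updated)
{
	uint32_t ev = MODEL_CA_ACK_SLOWPATH;

	if (saw_ece)
		ev |= MODEL_CA_ACK_ECE;
	if (win_updated)
		ev |= MODEL_CA_ACK_WIN_UPDATE;
	return ev;
}

/* Model of the dispatch in the ACK hook: nonzero means "do the full
 * slow path update", zero means "take the cheaper fast path update". */
static int model_takes_slow_update(uint32_t ack_flags)
{
	return (ack_flags & MODEL_CA_ACK_SLOWPATH) != 0;
}
```
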
 include/net/tcp.h       |  5 +++--
 net/ipv4/tcp_input.c    | 35 +++++++++++++++++++----------------
 net/ipv4/tcp_westwood.c | 31 +++++++++++++++++++++++++++----
 3 files changed, 49 insertions(+), 22 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index c614ff135b66..c546d13ffbca 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -910,8 +910,9 @@ enum tcp_ca_event {
 
 /* Information about inbound ACK, passed to cong_ops->in_ack_event() */
 enum tcp_ca_ack_event_flags {
-	CA_ACK_WIN_UPDATE	= (1 << 0),	/* ACK updated window */
-	CA_ACK_ECE		= (1 << 1),	/* ECE bit is set on ack */
+	CA_ACK_SLOWPATH		= (1 << 0),	/* In slow path processing */
+	CA_ACK_WIN_UPDATE	= (1 << 1),	/* ACK updated window */
+	CA_ACK_ECE		= (1 << 2),	/* ECE bit is set on ack */
 };
 
 /*
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 7616cd76f6f6..a0e436366d31 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3552,7 +3552,6 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	u32 lost = tp->lost;
 	int acked = 0; /* Number of packets newly acked */
 	int rexmit = REXMIT_NONE; /* Flag to (re)transmit to recover losses */
-	u32 ack_ev_flags = 0;
 
 	sack_state.first_sackt = 0;
 	sack_state.rate = &rs;
@@ -3593,26 +3592,30 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	if (flag & FLAG_UPDATE_TS_RECENT)
 		tcp_replace_ts_recent(tp, TCP_SKB_CB(skb)->seq);
 
-	if (ack_seq != TCP_SKB_CB(skb)->end_seq)
-		flag |= FLAG_DATA;
-	else
-		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPPUREACKS);
+	{
+		u32 ack_ev_flags = CA_ACK_SLOWPATH;
 
-	flag |= tcp_ack_update_window(sk, skb, ack, ack_seq);
+		if (ack_seq != TCP_SKB_CB(skb)->end_seq)
+			flag |= FLAG_DATA;
+		else
+			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPPUREACKS);
 
-	if (TCP_SKB_CB(skb)->sacked)
-		flag |= tcp_sacktag_write_queue(sk, skb, prior_snd_una,
-						&sack_state);
+		flag |= tcp_ack_update_window(sk, skb, ack, ack_seq);
 
-	if (tcp_ecn_rcv_ecn_echo(tp, tcp_hdr(skb))) {
-		flag |= FLAG_ECE;
-		ack_ev_flags = CA_ACK_ECE;
-	}
+		if (TCP_SKB_CB(skb)->sacked)
+			flag |= tcp_sacktag_write_queue(sk, skb, prior_snd_una,
+							&sack_state);
+
+		if (tcp_ecn_rcv_ecn_echo(tp, tcp_hdr(skb))) {
+			flag |= FLAG_ECE;
+			ack_ev_flags |= CA_ACK_ECE;
+		}
 
-	if (flag & FLAG_WIN_UPDATE)
-		ack_ev_flags |= CA_ACK_WIN_UPDATE;
+		if (flag & FLAG_WIN_UPDATE)
+			ack_ev_flags |= CA_ACK_WIN_UPDATE;
 
-	tcp_in_ack_event(sk, ack_ev_flags);
+		tcp_in_ack_event(sk, ack_ev_flags);
+	}
 
 	/* We passed data and got it acked, remove any soft error
 	 * log. Something worked...
diff --git a/net/ipv4/tcp_westwood.c b/net/ipv4/tcp_westwood.c
index e5de84310949..bec9cafbe3f9 100644
--- a/net/ipv4/tcp_westwood.c
+++ b/net/ipv4/tcp_westwood.c
@@ -154,6 +154,24 @@ static inline void update_rtt_min(struct westwood *w)
 }
 
 /*
+ * @westwood_fast_bw
+ * It is called when we are in fast path. In particular it is called when
+ * header prediction is successful. In such case in fact update is
+ * straight forward and doesn't need any particular care.
+ */
+static inline void westwood_fast_bw(struct sock *sk)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+	struct westwood *w = inet_csk_ca(sk);
+
+	westwood_update_window(sk);
+
+	w->bk += tp->snd_una - w->snd_una;
+	w->snd_una = tp->snd_una;
+	update_rtt_min(w);
+}
+
+/*
  * @westwood_acked_count
  * This function evaluates cumul_ack for evaluating bk in case of
  * delayed or partial acks.
@@ -205,12 +223,17 @@ static u32 tcp_westwood_bw_rttmin(const struct sock *sk)
 
 static void tcp_westwood_ack(struct sock *sk, u32 ack_flags)
 {
-	struct westwood *w = inet_csk_ca(sk);
+	if (ack_flags & CA_ACK_SLOWPATH) {
+		struct westwood *w = inet_csk_ca(sk);
 
-	westwood_update_window(sk);
-	w->bk += westwood_acked_count(sk);
+		westwood_update_window(sk);
+		w->bk += westwood_acked_count(sk);
 
-	update_rtt_min(w);
+		update_rtt_min(w);
+		return;
+	}
+
+	westwood_fast_bw(sk);
 }
 
 static void tcp_westwood_event(struct sock *sk, enum tcp_ca_event event)
-- 
2.13.0

^ permalink raw reply related

* [PATCH net-next 2/2] tcp: Revert "tcp: remove header prediction"
From: Florian Westphal @ 2017-08-30 17:24 UTC (permalink / raw)
  To: netdev; +Cc: edumazet, Florian Westphal
In-Reply-To: <20170830172458.18544-1-fw@strlen.de>

This reverts commit 45f119bf936b1f9f546a0b139c5b56f9bb2bdc78.

Eric Dumazet says:
  We found at Google a significant regression caused by
  45f119bf936b1f9f546a0b139c5b56f9bb2bdc78 tcp: remove header prediction

  In typical RPC  (TCP_RR), when a TCP socket receives data, we now call
  tcp_ack() while we used to not call it.

  This touches enough cache lines to cause a slowdown.

So the problem does not seem to be HP removal itself but the tcp_ack()
call.  Therefore, it might be possible to remove HP after all, provided
one finds a way to elide tcp_ack() for most cases.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
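As a worked example of the "0x5?10 << 16 + snd_wnd" comment on pred_flags, the sketch below recomputes the value in userspace. It is a model under assumptions: model_pred_flags() is a hypothetical helper, and MODEL_TCP_FLAG_ACK_HOST is assumed to be the host-order value of the ACK flag word (0x00100000); the shift by 26 is taken from __tcp_fast_path_on() in the diff below.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Assumed host-order value of the ACK bit in the fourth 32-bit word of
 * the TCP header (data offset, flags, window). */
#define MODEL_TCP_FLAG_ACK_HOST 0x00100000u

/* tcp_header_len is in bytes. Shifting by 26 instead of 28 divides by
 * four on the way, so the top nibble ends up holding the data offset
 * in 32-bit words. snd_wnd is the already scaled 16-bit window. */
static uint32_t model_pred_flags(uint16_t tcp_header_len, uint32_t snd_wnd)
{
	return htonl(((uint32_t)tcp_header_len << 26) |
		     MODEL_TCP_FLAG_ACK_HOST |
		     snd_wnd);
}
```

A 20-byte header gives a host-order value of 0x5010 in the upper 16 bits; a 32-byte header (for instance with timestamps) gives 0x8010, which is the "?" in the comment.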
 include/linux/tcp.h       |   6 ++
 include/net/tcp.h         |  23 ++++++
 include/uapi/linux/snmp.h |   2 +
 net/ipv4/proc.c           |   2 +
 net/ipv4/tcp.c            |   4 +-
 net/ipv4/tcp_input.c      | 188 ++++++++++++++++++++++++++++++++++++++++++++--
 net/ipv4/tcp_minisocks.c  |   2 +
 net/ipv4/tcp_output.c     |   2 +
 8 files changed, 223 insertions(+), 6 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 267164a1d559..4aa40ef02d32 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -148,6 +148,12 @@ struct tcp_sock {
 	u16	gso_segs;	/* Max number of segs per GSO packet	*/
 
 /*
+ *	Header prediction flags
+ *	0x5?10 << 16 + snd_wnd in net byte order
+ */
+	__be32	pred_flags;
+
+/*
  *	RFC793 variables by their proper names. This means you can
  *	read the code and the spec side by side (and laugh ...)
  *	See RFC793 and RFC1122. The RFC writes these in capitals.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index c546d13ffbca..9c3db054e47f 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -634,6 +634,29 @@ static inline u32 __tcp_set_rto(const struct tcp_sock *tp)
 	return usecs_to_jiffies((tp->srtt_us >> 3) + tp->rttvar_us);
 }
 
+static inline void __tcp_fast_path_on(struct tcp_sock *tp, u32 snd_wnd)
+{
+	tp->pred_flags = htonl((tp->tcp_header_len << 26) |
+			       ntohl(TCP_FLAG_ACK) |
+			       snd_wnd);
+}
+
+static inline void tcp_fast_path_on(struct tcp_sock *tp)
+{
+	__tcp_fast_path_on(tp, tp->snd_wnd >> tp->rx_opt.snd_wscale);
+}
+
+static inline void tcp_fast_path_check(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (RB_EMPTY_ROOT(&tp->out_of_order_queue) &&
+	    tp->rcv_wnd &&
+	    atomic_read(&sk->sk_rmem_alloc) < sk->sk_rcvbuf &&
+	    !tp->urg_data)
+		tcp_fast_path_on(tp);
+}
+
 /* Compute the actual rto_min value */
 static inline u32 tcp_rto_min(struct sock *sk)
 {
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index b3f346fb9fe3..758f12b58541 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -184,7 +184,9 @@ enum
 	LINUX_MIB_DELAYEDACKLOST,		/* DelayedACKLost */
 	LINUX_MIB_LISTENOVERFLOWS,		/* ListenOverflows */
 	LINUX_MIB_LISTENDROPS,			/* ListenDrops */
+	LINUX_MIB_TCPHPHITS,			/* TCPHPHits */
 	LINUX_MIB_TCPPUREACKS,			/* TCPPureAcks */
+	LINUX_MIB_TCPHPACKS,			/* TCPHPAcks */
 	LINUX_MIB_TCPRENORECOVERY,		/* TCPRenoRecovery */
 	LINUX_MIB_TCPSACKRECOVERY,		/* TCPSackRecovery */
 	LINUX_MIB_TCPSACKRENEGING,		/* TCPSACKReneging */
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index b6d3fe03feb3..127153f1ed8a 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -206,7 +206,9 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("DelayedACKLost", LINUX_MIB_DELAYEDACKLOST),
 	SNMP_MIB_ITEM("ListenOverflows", LINUX_MIB_LISTENOVERFLOWS),
 	SNMP_MIB_ITEM("ListenDrops", LINUX_MIB_LISTENDROPS),
+	SNMP_MIB_ITEM("TCPHPHits", LINUX_MIB_TCPHPHITS),
 	SNMP_MIB_ITEM("TCPPureAcks", LINUX_MIB_TCPPUREACKS),
+	SNMP_MIB_ITEM("TCPHPAcks", LINUX_MIB_TCPHPACKS),
 	SNMP_MIB_ITEM("TCPRenoRecovery", LINUX_MIB_TCPRENORECOVERY),
 	SNMP_MIB_ITEM("TCPSackRecovery", LINUX_MIB_TCPSACKRECOVERY),
 	SNMP_MIB_ITEM("TCPSACKReneging", LINUX_MIB_TCPSACKRENEGING),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 566083ee2654..21ca2df274c5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1963,8 +1963,10 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
 		tcp_rcv_space_adjust(sk);
 
 skip_copy:
-		if (tp->urg_data && after(tp->copied_seq, tp->urg_seq))
+		if (tp->urg_data && after(tp->copied_seq, tp->urg_seq)) {
 			tp->urg_data = 0;
+			tcp_fast_path_check(sk);
+		}
 		if (used + offset < skb->len)
 			continue;
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a0e436366d31..c5d7656beeee 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -103,6 +103,7 @@ int sysctl_tcp_invalid_ratelimit __read_mostly = HZ/2;
 #define FLAG_DATA_SACKED	0x20 /* New SACK.				*/
 #define FLAG_ECE		0x40 /* ECE in this ACK				*/
 #define FLAG_LOST_RETRANS	0x80 /* This ACK marks some retransmission lost */
+#define FLAG_SLOWPATH		0x100 /* Do not skip RFC checks for window update.*/
 #define FLAG_ORIG_SACK_ACKED	0x200 /* Never retransmitted data are (s)acked	*/
 #define FLAG_SND_UNA_ADVANCED	0x400 /* Snd_una was changed (!= FLAG_DATA_ACKED) */
 #define FLAG_DSACKING_ACK	0x800 /* SACK blocks contained D-SACK info */
@@ -3371,6 +3372,12 @@ static int tcp_ack_update_window(struct sock *sk, const struct sk_buff *skb, u32
 		if (tp->snd_wnd != nwin) {
 			tp->snd_wnd = nwin;
 
+			/* Note, it is the only place, where
+			 * fast path is recovered for sending TCP.
+			 */
+			tp->pred_flags = 0;
+			tcp_fast_path_check(sk);
+
 			if (tcp_send_head(sk))
 				tcp_slow_start_after_idle_check(sk);
 
@@ -3592,7 +3599,19 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	if (flag & FLAG_UPDATE_TS_RECENT)
 		tcp_replace_ts_recent(tp, TCP_SKB_CB(skb)->seq);
 
-	{
+	if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
+		/* Window is constant, pure forward advance.
+		 * No more checks are required.
+		 * Note, we use the fact that SND.UNA>=SND.WL2.
+		 */
+		tcp_update_wl(tp, ack_seq);
+		tcp_snd_una_update(tp, ack);
+		flag |= FLAG_WIN_UPDATE;
+
+		tcp_in_ack_event(sk, CA_ACK_WIN_UPDATE);
+
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPACKS);
+	} else {
 		u32 ack_ev_flags = CA_ACK_SLOWPATH;
 
 		if (ack_seq != TCP_SKB_CB(skb)->end_seq)
@@ -4407,6 +4426,8 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
 	if (TCP_SKB_CB(skb)->has_rxtstamp)
 		TCP_SKB_CB(skb)->swtstamp = skb->tstamp;
 
+	/* Disable header prediction. */
+	tp->pred_flags = 0;
 	inet_csk_schedule_ack(sk);
 
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOQUEUE);
@@ -4647,6 +4668,8 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		if (tp->rx_opt.num_sacks)
 			tcp_sack_remove(tp);
 
+		tcp_fast_path_check(sk);
+
 		if (eaten > 0)
 			kfree_skb_partial(skb, fragstolen);
 		if (!sock_flag(sk, SOCK_DEAD))
@@ -4972,6 +4995,7 @@ static int tcp_prune_queue(struct sock *sk)
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_RCVPRUNED);
 
 	/* Massive buffer overcommit. */
+	tp->pred_flags = 0;
 	return -1;
 }
 
@@ -5143,6 +5167,9 @@ static void tcp_check_urg(struct sock *sk, const struct tcphdr *th)
 
 	tp->urg_data = TCP_URG_NOTYET;
 	tp->urg_seq = ptr;
+
+	/* Disable header prediction. */
+	tp->pred_flags = 0;
 }
 
 /* This is the 'fast' part of urgent handling. */
@@ -5301,6 +5328,26 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 
 /*
  *	TCP receive function for the ESTABLISHED state.
+ *
+ *	It is split into a fast path and a slow path. The fast path is
+ * 	disabled when:
+ *	- A zero window was announced from us - zero window probing
+ *        is only handled properly in the slow path.
+ *	- Out of order segments arrived.
+ *	- Urgent data is expected.
+ *	- There is no buffer space left
+ *	- Unexpected TCP flags/window values/header lengths are received
+ *	  (detected by checking the TCP header against pred_flags)
+ *	- Data is sent in both directions. Fast path only supports pure senders
+ *	  or pure receivers (this means either the sequence number or the ack
+ *	  value must stay constant)
+ *	- Unexpected TCP option.
+ *
+ *	When these conditions are not satisfied it drops into a standard
+ *	receive procedure patterned after RFC793 to handle all cases.
+ *	The first three cases are guaranteed by proper pred_flags setting,
+ *	the rest is checked inline. Fast processing is turned on in
+ *	tcp_data_queue when everything is OK.
  */
 void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
 			 const struct tcphdr *th)
@@ -5311,19 +5358,144 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
 	tcp_mstamp_refresh(tp);
 	if (unlikely(!sk->sk_rx_dst))
 		inet_csk(sk)->icsk_af_ops->sk_rx_dst_set(sk, skb);
+	/*
+	 *	Header prediction.
+	 *	The code loosely follows the one in the famous
+	 *	"30 instruction TCP receive" Van Jacobson mail.
+	 *
+	 *	Van's trick is to deposit buffers into socket queue
+	 *	on a device interrupt, to call tcp_recv function
+	 *	on the receive process context and checksum and copy
+	 *	the buffer to user space. smart...
+	 *
+	 *	Our current scheme is not silly either but we take the
+	 *	extra cost of the net_bh soft interrupt processing...
+	 *	We do checksum and copy also but from device to kernel.
+	 */
 
 	tp->rx_opt.saw_tstamp = 0;
 
+	/*	pred_flags is 0xS?10 << 16 + snd_wnd
+	 *	if header_prediction is to be made
+	 *	'S' will always be tp->tcp_header_len >> 2
+	 *	'?' will be 0 for the fast path, otherwise pred_flags is 0 to
+	 *  turn it off	(when there are holes in the receive
+	 *	 space for instance)
+	 *	PSH flag is ignored.
+	 */
+
+	if ((tcp_flag_word(th) & TCP_HP_BITS) == tp->pred_flags &&
+	    TCP_SKB_CB(skb)->seq == tp->rcv_nxt &&
+	    !after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt)) {
+		int tcp_header_len = tp->tcp_header_len;
+
+		/* Timestamp header prediction: tcp_header_len
+		 * is automatically equal to th->doff*4 due to pred_flags
+		 * match.
+		 */
+
+		/* Check timestamp */
+		if (tcp_header_len == sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED) {
+			/* No? Slow path! */
+			if (!tcp_parse_aligned_timestamp(tp, th))
+				goto slow_path;
+
+			/* If PAWS failed, check it more carefully in slow path */
+			if ((s32)(tp->rx_opt.rcv_tsval - tp->rx_opt.ts_recent) < 0)
+				goto slow_path;
+
+			/* DO NOT update ts_recent here, if checksum fails
+			 * and timestamp was corrupted part, it will result
+			 * in a hung connection since we will drop all
+			 * future packets due to the PAWS test.
+			 */
+		}
+
+		if (len <= tcp_header_len) {
+			/* Bulk data transfer: sender */
+			if (len == tcp_header_len) {
+				/* Predicted packet is in window by definition.
+				 * seq == rcv_nxt and rcv_wup <= rcv_nxt.
+				 * Hence, check seq<=rcv_wup reduces to:
+				 */
+				if (tcp_header_len ==
+				    (sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED) &&
+				    tp->rcv_nxt == tp->rcv_wup)
+					tcp_store_ts_recent(tp);
+
+				/* We know that such packets are checksummed
+				 * on entry.
+				 */
+				tcp_ack(sk, skb, 0);
+				__kfree_skb(skb);
+				tcp_data_snd_check(sk);
+				return;
+			} else { /* Header too small */
+				TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
+				goto discard;
+			}
+		} else {
+			int eaten = 0;
+			bool fragstolen = false;
+
+			if (tcp_checksum_complete(skb))
+				goto csum_error;
+
+			if ((int)skb->truesize > sk->sk_forward_alloc)
+				goto step5;
+
+			/* Predicted packet is in window by definition.
+			 * seq == rcv_nxt and rcv_wup <= rcv_nxt.
+			 * Hence, check seq<=rcv_wup reduces to:
+			 */
+			if (tcp_header_len ==
+			    (sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED) &&
+			    tp->rcv_nxt == tp->rcv_wup)
+				tcp_store_ts_recent(tp);
+
+			tcp_rcv_rtt_measure_ts(sk, skb);
+
+			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPHITS);
+
+			/* Bulk data transfer: receiver */
+			eaten = tcp_queue_rcv(sk, skb, tcp_header_len,
+					      &fragstolen);
+
+			tcp_event_data_recv(sk, skb);
+
+			if (TCP_SKB_CB(skb)->ack_seq != tp->snd_una) {
+				/* Well, only one small jumplet in fast path... */
+				tcp_ack(sk, skb, FLAG_DATA);
+				tcp_data_snd_check(sk);
+				if (!inet_csk_ack_scheduled(sk))
+					goto no_ack;
+			}
+
+			__tcp_ack_snd_check(sk, 0);
+no_ack:
+			if (eaten)
+				kfree_skb_partial(skb, fragstolen);
+			sk->sk_data_ready(sk);
+			return;
+		}
+	}
+
+slow_path:
 	if (len < (th->doff << 2) || tcp_checksum_complete(skb))
 		goto csum_error;
 
 	if (!th->ack && !th->rst && !th->syn)
 		goto discard;
 
+	/*
+	 *	Standard slow path.
+	 */
+
 	if (!tcp_validate_incoming(sk, skb, th, 1))
 		return;
 
-	if (tcp_ack(sk, skb, FLAG_UPDATE_TS_RECENT) < 0)
+step5:
+	if (tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT) < 0)
 		goto discard;
 
 	tcp_rcv_rtt_measure_ts(sk, skb);
@@ -5376,6 +5548,11 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
 
 	if (sock_flag(sk, SOCK_KEEPOPEN))
 		inet_csk_reset_keepalive_timer(sk, keepalive_time_when(tp));
+
+	if (!tp->rx_opt.snd_wscale)
+		__tcp_fast_path_on(tp, tp->snd_wnd);
+	else
+		tp->pred_flags = 0;
 }
 
 static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack,
@@ -5504,7 +5681,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		tcp_ecn_rcv_synack(tp, th);
 
 		tcp_init_wl(tp, TCP_SKB_CB(skb)->seq);
-		tcp_ack(sk, skb, 0);
+		tcp_ack(sk, skb, FLAG_SLOWPATH);
 
 		/* Ok.. it's good. Set up sequence numbers and
 		 * move to established.
@@ -5740,8 +5917,8 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		return 0;
 
 	/* step 5: check the ACK field */
-
-	acceptable = tcp_ack(sk, skb, FLAG_UPDATE_TS_RECENT |
+	acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH |
+				      FLAG_UPDATE_TS_RECENT |
 				      FLAG_NO_CHALLENGE_ACK) > 0;
 
 	if (!acceptable) {
@@ -5809,6 +5986,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		tp->lsndtime = tcp_jiffies32;
 
 		tcp_initialize_rcv_mss(sk);
+		tcp_fast_path_on(tp);
 		break;
 
 	case TCP_FIN_WAIT1: {
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 1537b87c657f..188a6f31356d 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -436,6 +436,8 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 		struct tcp_sock *newtp = tcp_sk(newsk);
 
 		/* Now setup tcp_sock */
+		newtp->pred_flags = 0;
+
 		newtp->rcv_wup = newtp->copied_seq =
 		newtp->rcv_nxt = treq->rcv_isn + 1;
 		newtp->segs_in = 1;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 3e0d19631534..5b6690d05abb 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -295,7 +295,9 @@ static u16 tcp_select_window(struct sock *sk)
 	/* RFC1323 scaling applied */
 	new_win >>= tp->rx_opt.rcv_wscale;
 
+	/* If we advertise zero window, disable fast path. */
 	if (new_win == 0) {
+		tp->pred_flags = 0;
 		if (old_win)
 			NET_INC_STATS(sock_net(sk),
 				      LINUX_MIB_TCPTOZEROWINDOWADV);
-- 
2.13.0
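
For readers following the pred_flags comparison in tcp_rcv_established()
above, here is a minimal userspace sketch of how the predicted header word
is built and matched. It is not kernel code. The names hp_pred_flags(),
hp_header_word(), hp_match() and the HP_* masks are hypothetical, and the
sketch works in host byte order, whereas the patch stores pred_flags as
__be32. The mask is assumed to ignore the reserved bits and PSH, in line
with the "PSH flag is ignored" comment in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-order view of the 32-bit word at byte offset 12 of a TCP header:
 * data offset (4 bits) | reserved (4 bits) | flags (8 bits) | window (16 bits).
 * These masks are illustrative stand-ins, not the kernel's definitions.
 */
#define HP_FLAG_ACK      0x00100000u
#define HP_FLAG_PSH      0x00080000u
#define HP_RESERVED_BITS 0x0F000000u
#define HP_BITS          (~(HP_RESERVED_BITS | HP_FLAG_PSH))

/* Build the predicted word, as __tcp_fast_path_on() does before htonl().
 * header_len is in bytes and must be a multiple of 4 (20..60), so that
 * header_len << 26 places header_len / 4 in the data-offset nibble.
 */
static inline uint32_t hp_pred_flags(uint32_t header_len, uint16_t scaled_wnd)
{
	return (header_len << 26) | HP_FLAG_ACK | scaled_wnd;
}

/* Assemble the same header word from an incoming segment's fields. */
static inline uint32_t hp_header_word(uint32_t doff_words, uint8_t flags,
				      uint16_t window)
{
	return (doff_words << 28) | ((uint32_t)flags << 16) | window;
}

/* Fast-path candidate test: every bit except reserved and PSH must match.
 * A pred_flags value of 0 never matches, because a real header always has
 * a data offset of at least 5 words.
 */
static inline bool hp_match(uint32_t header_word, uint32_t pred_flags)
{
	return (header_word & HP_BITS) == pred_flags;
}
```

With a 20-byte header and a scaled window of 0x1000, hp_pred_flags()
yields 0x50101000. A segment carrying ACK or ACK|PSH with the same window
matches, while a FIN, a different window or a different header length
does not and would take the slow path.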

^ permalink raw reply related

* Re: [patch net-next 0/8] mlxsw: Add IPv6 host dpipe table
From: Andrew Lunn @ 2017-08-30 17:26 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, arkadis, idosch, mlxsw
In-Reply-To: <20170830120306.6128-1-jiri@resnulli.us>

On Wed, Aug 30, 2017 at 02:02:58PM +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Arkadi says:
> 
> This patchset adds IPv6 host dpipe table support. This will provide the
> ability to observe the hardware offloaded IPv6 neighbors.

Hi Jiri, Arkadi

Could you give us an example of the output seen in user space?

Thanks
	Andrew

^ permalink raw reply

* Re: [GIT] Networking
From: Kalle Valo @ 2017-08-30 17:31 UTC (permalink / raw)
  To: David Miller; +Cc: pavel, xiyou.wangcong, torvalds, akpm, netdev, linux-kernel
In-Reply-To: <20170830.101143.2305098064824357647.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: Kalle Valo <kvalo@codeaurora.org>
> Date: Wed, 30 Aug 2017 17:45:31 +0300
>
>> Pavel Machek <pavel@ucw.cz> writes:
>> 
>>> Could we get this one in?
>>>
>>> wl1251 misses a spin_lock_init().
>>>
>>> https://www.mail-archive.com/netdev@vger.kernel.org/msg177031.html
>>>
>>> It seems pretty trivial, yet getting the backtraces is not nice.
>> 
>> It's in wireless-drivers-next and will be in 4.14-rc1:
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git/commit/?id=6e9aae179f290f1a44fce7ef8e9a8e2dd68ed1e4
>
> Is the bug only present in net-next?

AFAICS the bug was introduced by 9df86e2e702c6 back in 2010. If the bug
has been there for 7 years, waiting a few more weeks should not hurt.

And Pavel can also submit it to the stable release; it should apply
without problems, as wl1251 hasn't had that many patches during the
last few years (if any).

-- 
Kalle Valo

^ permalink raw reply
