* [net-next PATCH 0/2] NAPI ID fixups related to busy polling
@ 2017-03-20 21:48 Alexander Duyck
2017-03-20 21:48 ` [net-next PATCH 1/2] net: Busy polling should ignore sender CPUs Alexander Duyck
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Alexander Duyck @ 2017-03-20 21:48 UTC (permalink / raw)
To: netdev, linux-kernel; +Cc: sridhar.samudrala, edumazet, davem
These two patches are a couple of minor clean-ups related to busy polling.
The first one addresses the fact that we were trying to busy poll on
sender_cpu values instead of true NAPI IDs. The second addresses the fact
that there were a few paths where TCP sockets were being instanciated based
on a received patcket, but not recording the hash or NAPI ID of the packet
that was used to instanciate them.
---
Alexander Duyck (2):
net: Busy polling should ignore sender CPUs
tcp: Record Rx hash and NAPI ID in tcp_child_process
include/net/busy_poll.h | 11 +++++++++--
net/core/dev.c | 6 +++---
net/ipv4/tcp_ipv4.c | 2 --
net/ipv4/tcp_minisocks.c | 5 +++++
net/ipv6/tcp_ipv6.c | 2 --
5 files changed, 17 insertions(+), 9 deletions(-)
--
^ permalink raw reply [flat|nested] 6+ messages in thread
* [net-next PATCH 1/2] net: Busy polling should ignore sender CPUs
2017-03-20 21:48 [net-next PATCH 0/2] NAPI ID fixups related to busy polling Alexander Duyck
@ 2017-03-20 21:48 ` Alexander Duyck
2017-03-20 22:16 ` Eric Dumazet
2017-03-20 21:48 ` [net-next PATCH 2/2] tcp: Record Rx hash and NAPI ID in tcp_child_process Alexander Duyck
2017-03-22 18:26 ` [net-next PATCH 0/2] NAPI ID fixups related to busy polling David Miller
2 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2017-03-20 21:48 UTC (permalink / raw)
To: netdev, linux-kernel; +Cc: sridhar.samudrala, edumazet, davem
From: Alexander Duyck <alexander.h.duyck@intel.com>
This patch is a cleanup/fix for NAPI IDs following the changes that made it
so that sender_cpu and napi_id were doing a better job of sharing the same
location in the sk_buff.
One issue I found is that we weren't validating the napi_id as being valid
before we started trying to setup the busy polling. This change corrects
that by using the MIN_NAPI_ID value that is now used in both allocating the
NAPI IDs, as well as validating them.
Fixes: 52bd2d62ce675 ("net: better skb->sender_cpu and skb->napi_id cohabitation")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
include/net/busy_poll.h | 11 +++++++++--
net/core/dev.c | 6 +++---
2 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
index c0452de83086..edf1310212a1 100644
--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -35,6 +35,12 @@
extern unsigned int sysctl_net_busy_read __read_mostly;
extern unsigned int sysctl_net_busy_poll __read_mostly;
+/* 0 - Reserved to indicate value not set
+ * 1..NR_CPUS - Reserved for sender_cpu
+ * NR_CPUS+1..~0 - Region available for NAPI IDs
+ */
+#define MIN_NAPI_ID ((unsigned int)(NR_CPUS + 1))
+
static inline bool net_busy_loop_on(void)
{
return sysctl_net_busy_poll;
@@ -58,10 +64,11 @@ static inline unsigned long busy_loop_end_time(void)
static inline bool sk_can_busy_loop(const struct sock *sk)
{
- return sk->sk_ll_usec && sk->sk_napi_id && !signal_pending(current);
+ return sk->sk_ll_usec &&
+ (sk->sk_napi_id >= MIN_NAPI_ID) &&
+ !signal_pending(current);
}
-
static inline bool busy_loop_timeout(unsigned long end_time)
{
unsigned long now = busy_loop_us_clock();
diff --git a/net/core/dev.c b/net/core/dev.c
index 7869ae3837ca..5bbe30c08a5b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5143,10 +5143,10 @@ static void napi_hash_add(struct napi_struct *napi)
spin_lock(&napi_hash_lock);
- /* 0..NR_CPUS+1 range is reserved for sender_cpu use */
+ /* 0..NR_CPUS range is reserved for sender_cpu use */
do {
- if (unlikely(++napi_gen_id < NR_CPUS + 1))
- napi_gen_id = NR_CPUS + 1;
+ if (unlikely(++napi_gen_id < MIN_NAPI_ID))
+ napi_gen_id = MIN_NAPI_ID;
} while (napi_by_id(napi_gen_id));
napi->napi_id = napi_gen_id;
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [net-next PATCH 2/2] tcp: Record Rx hash and NAPI ID in tcp_child_process
2017-03-20 21:48 [net-next PATCH 0/2] NAPI ID fixups related to busy polling Alexander Duyck
2017-03-20 21:48 ` [net-next PATCH 1/2] net: Busy polling should ignore sender CPUs Alexander Duyck
@ 2017-03-20 21:48 ` Alexander Duyck
2017-03-20 22:06 ` Eric Dumazet
2017-03-22 18:26 ` [net-next PATCH 0/2] NAPI ID fixups related to busy polling David Miller
2 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2017-03-20 21:48 UTC (permalink / raw)
To: netdev, linux-kernel; +Cc: sridhar.samudrala, edumazet, davem
From: Alexander Duyck <alexander.h.duyck@intel.com>
While working on some recent busy poll changes we found that child sockets
were being instantiated without NAPI ID being set. In our first attempt to
fix it, it was suggested that we should just pull programming the NAPI ID
into the function itself since all callers will need to have it set.
In addition to NAPI ID I have decided to also pull in populating the Rx
hash since it likely has the same problem as NAPI ID but just doesn't have
the visibility.
Reported-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
net/ipv4/tcp_ipv4.c | 2 --
net/ipv4/tcp_minisocks.c | 5 +++++
net/ipv6/tcp_ipv6.c | 2 --
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7482b5d11861..20cbd2f07f28 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1409,8 +1409,6 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
if (!nsk)
goto discard;
if (nsk != sk) {
- sock_rps_save_rxhash(nsk, skb);
- sk_mark_napi_id(nsk, skb);
if (tcp_child_process(sk, nsk, skb)) {
rsk = nsk;
goto reset;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 692f974e5abe..245b63856c04 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -26,6 +26,7 @@
#include <net/tcp.h>
#include <net/inet_common.h>
#include <net/xfrm.h>
+#include <net/busy_poll.h>
int sysctl_tcp_abort_on_overflow __read_mostly;
@@ -798,6 +799,10 @@ int tcp_child_process(struct sock *parent, struct sock *child,
int ret = 0;
int state = child->sk_state;
+ /* record Rx hash and NAPI ID of child */
+ sock_rps_save_rxhash(child, skb);
+ sk_mark_napi_id(child, skb);
+
tcp_segs_in(tcp_sk(child), skb);
if (!sock_owned_by_user(child)) {
ret = tcp_rcv_state_process(child, skb);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0f08d718a002..ee13e380c0dd 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1293,8 +1293,6 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
goto discard;
if (nsk != sk) {
- sock_rps_save_rxhash(nsk, skb);
- sk_mark_napi_id(nsk, skb);
if (tcp_child_process(sk, nsk, skb))
goto reset;
if (opt_skb)
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [net-next PATCH 2/2] tcp: Record Rx hash and NAPI ID in tcp_child_process
2017-03-20 21:48 ` [net-next PATCH 2/2] tcp: Record Rx hash and NAPI ID in tcp_child_process Alexander Duyck
@ 2017-03-20 22:06 ` Eric Dumazet
0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2017-03-20 22:06 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, LKML, sridhar.samudrala, David Miller
On Mon, Mar 20, 2017 at 2:48 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
>
> While working on some recent busy poll changes we found that child sockets
> were being instantiated without NAPI ID being set. In our first attempt to
> fix it, it was suggested that we should just pull programming the NAPI ID
> into the function itself since all callers will need to have it set.
>
> In addition to NAPI ID I have decided to also pull in populating the Rx
> hash since it likely has the same problem as NAPI ID but just doesn't have
> the visibility.
It looks like Rx hash was initialized elsewhere (
tcp_get_cookie_sock() & tcp_check_req())
So this probably could be cleaned up, if done at the proper place ;)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [net-next PATCH 1/2] net: Busy polling should ignore sender CPUs
2017-03-20 21:48 ` [net-next PATCH 1/2] net: Busy polling should ignore sender CPUs Alexander Duyck
@ 2017-03-20 22:16 ` Eric Dumazet
0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2017-03-20 22:16 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, linux-kernel, sridhar.samudrala, edumazet, davem
On Mon, 2017-03-20 at 14:48 -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
>
> This patch is a cleanup/fix for NAPI IDs following the changes that made it
> so that sender_cpu and napi_id were doing a better job of sharing the same
> location in the sk_buff.
>
> One issue I found is that we weren't validating the napi_id as being valid
> before we started trying to setup the busy polling. This change corrects
> that by using the MIN_NAPI_ID value that is now used in both allocating the
> NAPI IDs, as well as validating them.
>
> Fixes: 52bd2d62ce675 ("net: better skb->sender_cpu and skb->napi_id cohabitation")
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
This Fixes: tag seems not really needed here.
If really busy polling is attempted to a socket with a <wrong> napi id,
nothing bad happens. This fits the advisory model of busy polling...
Otherwise, your patch would be a candidate for net tree.
Also note that as soon as sk_can_busy_loop(sk) returns some status,
another cpu might already have changed sk->sk_napi_id to something else,
possibly with a <wrong> napi id again.
If your upcoming code depends on sk->sk_napi_id being verified, then
you need to read it once.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [net-next PATCH 0/2] NAPI ID fixups related to busy polling
2017-03-20 21:48 [net-next PATCH 0/2] NAPI ID fixups related to busy polling Alexander Duyck
2017-03-20 21:48 ` [net-next PATCH 1/2] net: Busy polling should ignore sender CPUs Alexander Duyck
2017-03-20 21:48 ` [net-next PATCH 2/2] tcp: Record Rx hash and NAPI ID in tcp_child_process Alexander Duyck
@ 2017-03-22 18:26 ` David Miller
2 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2017-03-22 18:26 UTC (permalink / raw)
To: alexander.duyck; +Cc: netdev, linux-kernel, sridhar.samudrala, edumazet
From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Mon, 20 Mar 2017 14:48:41 -0700
> These two patches are a couple of minor clean-ups related to busy polling.
> The first one addresses the fact that we were trying to busy poll on
> sender_cpu values instead of true NAPI IDs. The second addresses the fact
> that there were a few paths where TCP sockets were being instanciated based
> on a received patcket, but not recording the hash or NAPI ID of the packet
> that was used to instanciate them.
I'm expecting a respin of this. Patch #1 appears to be 'net' material, and
Eric asked for some other minor tweaks as well.
Thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-03-22 18:26 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-20 21:48 [net-next PATCH 0/2] NAPI ID fixups related to busy polling Alexander Duyck
2017-03-20 21:48 ` [net-next PATCH 1/2] net: Busy polling should ignore sender CPUs Alexander Duyck
2017-03-20 22:16 ` Eric Dumazet
2017-03-20 21:48 ` [net-next PATCH 2/2] tcp: Record Rx hash and NAPI ID in tcp_child_process Alexander Duyck
2017-03-20 22:06 ` Eric Dumazet
2017-03-22 18:26 ` [net-next PATCH 0/2] NAPI ID fixups related to busy polling David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).