Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v3

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v3
       [not found]       ` <20161130163520.hg7icdflagmvarbr@techsingularity.net>
@ 2016-12-01 17:34         ` Jesper Dangaard Brouer
  2016-12-01 22:17           ` Paolo Abeni
  0 siblings, 1 reply; 4+ messages in thread
From: Jesper Dangaard Brouer @ 2016-12-01 17:34 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Christoph Lameter, Michal Hocko, Vlastimil Babka,
	Johannes Weiner, Linux-MM, Linux-Kernel, Rick Jones, Paolo Abeni,
	brouer, netdev@vger.kernel.org, Hannes Frederic Sowa

(Cc. netdev, we might have an issue with Paolo's UDP accounting and
small socket queues)

On Wed, 30 Nov 2016 16:35:20 +0000
Mel Gorman <mgorman@techsingularity.net> wrote:

> > I don't quite get why you are setting the socket recv size
> > (with -- -s and -S) to such a small number, size + 256.
> >   
> 
> Maybe I missed something at the time I wrote that but why would it
> need to be larger?

Well, to me it is quite obvious that we need some queue to avoid packet
drops.  We have two processes netperf and netserver, that are sending
packets between each-other (UDP_STREAM mostly netperf -> netserver).
These PIDs are getting scheduled and migrated between CPUs, and thus
does not get executed equally fast, thus a queue is need absorb the
fluctuations.

The network stack is even partly catching your config "mistake" and
increase the socket queue size, so we minimum can handle one max frame
(due skb "truesize" concept approx PAGE_SIZE + overhead).

Hopefully for localhost testing a small queue should hopefully not
result in packet drops.  Testing... ups, this does result in packet
drops.

Test command extracted from mmtests, UDP_STREAM size 1024:

 netperf-2.4.5-installed/bin/netperf -t UDP_STREAM  -l 60  -H 127.0.0.1 \
   -- -s 1280 -S 1280 -m 1024 -M 1024 -P 15895

 UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0)
  port 15895 AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
 Socket  Message  Elapsed      Messages                
 Size    Size     Time         Okay Errors   Throughput
 bytes   bytes    secs            #      #   10^6bits/sec

   4608    1024   60.00     50024301      0    6829.98
   2560           60.00     46133211           6298.72

 Dropped packets: 50024301-46133211=3891090

To get a better drop indication, during this I run a command, to get
system-wide network counters from the last second, so below numbers are
per second.

 $ nstat > /dev/null && sleep 1  && nstat
 #kernel
 IpInReceives                    885162             0.0
 IpInDelivers                    885161             0.0
 IpOutRequests                   885162             0.0
 UdpInDatagrams                  776105             0.0
 UdpInErrors                     109056             0.0
 UdpOutDatagrams                 885160             0.0
 UdpRcvbufErrors                 109056             0.0
 IpExtInOctets                   931190476          0.0
 IpExtOutOctets                  931189564          0.0
 IpExtInNoECTPkts                885162             0.0

So, 885Kpps but only 776Kpps delivered and 109Kpps drops. See
UdpInErrors and UdpRcvbufErrors is equal (109056/sec). This drop
happens kernel side in __udp_queue_rcv_skb[1], because receiving
process didn't empty it's queue fast enough see [2].

Although upstream changes are coming in this area, [2] is replaced with
__udp_enqueue_schedule_skb, which I actually tested with... hmm

Retesting with kernel 4.7.0-baseline+ ... show something else. 
To Paolo, you might want to look into this.  And it could also explain why
I've not see the mentioned speedup by mm-change, as I've been testing
this patch on top of net-next (at 93ba2222550) with Paolo's UDP changes.

 netperf-2.4.5-installed/bin/netperf -t UDP_STREAM  -l 60  -H 127.0.0.1 \
   -- -s 1280 -S 1280 -m 1024 -M 1024 -P 15895

 UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 15895
  AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
 Socket  Message  Elapsed      Messages                
 Size    Size     Time         Okay Errors   Throughput
 bytes   bytes    secs            #      #   10^6bits/sec

   4608    1024   60.00     47248301      0    6450.97
   2560           60.00     47245030           6450.52

Only dropped 47248301-47245030=3271

$ nstat > /dev/null && sleep 1  && nstat
#kernel
IpInReceives                    810566             0.0
IpInDelivers                    810566             0.0
IpOutRequests                   810566             0.0
UdpInDatagrams                  810468             0.0
UdpInErrors                     99                 0.0
UdpOutDatagrams                 810566             0.0
UdpRcvbufErrors                 99                 0.0
IpExtInOctets                   852713328          0.0
IpExtOutOctets                  852713328          0.0
IpExtInNoECTPkts                810563             0.0

And nstat is also much better with only 99 drop/sec.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[1] http://lxr.free-electrons.com/source/net/ipv4/udp.c?v=4.8#L1454
[2] http://lxr.free-electrons.com/source/net/core/sock.c?v=4.8#L413


Extra: with net-next at 93ba2222550

If I use netperf default socket queue, then there is not a single
packet drop:

netperf-2.4.5-installed/bin/netperf -t UDP_STREAM  -l 60  -H 127.0.0.1  
   -- -m 1024 -M 1024 -P 15895

UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) 
 port 15895 AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1024   60.00     48485642      0    6619.91
212992           60.00     48485642           6619.91


$ nstat > /dev/null && sleep 1  && nstat
#kernel
IpInReceives                    821723             0.0
IpInDelivers                    821722             0.0
IpOutRequests                   821723             0.0
UdpInDatagrams                  821722             0.0
UdpOutDatagrams                 821722             0.0
IpExtInOctets                   864457856          0.0
IpExtOutOctets                  864458908          0.0
IpExtInNoECTPkts                821729             0.0




--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v3
  2016-12-01 17:34         ` [PATCH] mm: page_alloc: High-order per-cpu page allocator v3 Jesper Dangaard Brouer
@ 2016-12-01 22:17           ` Paolo Abeni
  2016-12-02 15:37             ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 4+ messages in thread
From: Paolo Abeni @ 2016-12-01 22:17 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Mel Gorman, Andrew Morton, Christoph Lameter, Michal Hocko,
	Vlastimil Babka, Johannes Weiner, Linux-MM, Linux-Kernel,
	Rick Jones, netdev@vger.kernel.org, Hannes Frederic Sowa

On Thu, 2016-12-01 at 18:34 +0100, Jesper Dangaard Brouer wrote:
> (Cc. netdev, we might have an issue with Paolo's UDP accounting and
> small socket queues)
> 
> On Wed, 30 Nov 2016 16:35:20 +0000
> Mel Gorman <mgorman@techsingularity.net> wrote:
> 
> > > I don't quite get why you are setting the socket recv size
> > > (with -- -s and -S) to such a small number, size + 256.
> > >   
> > 
> > Maybe I missed something at the time I wrote that but why would it
> > need to be larger?
> 
> Well, to me it is quite obvious that we need some queue to avoid packet
> drops.  We have two processes netperf and netserver, that are sending
> packets between each-other (UDP_STREAM mostly netperf -> netserver).
> These PIDs are getting scheduled and migrated between CPUs, and thus
> does not get executed equally fast, thus a queue is need absorb the
> fluctuations.
> 
> The network stack is even partly catching your config "mistake" and
> increase the socket queue size, so we minimum can handle one max frame
> (due skb "truesize" concept approx PAGE_SIZE + overhead).
> 
> Hopefully for localhost testing a small queue should hopefully not
> result in packet drops.  Testing... ups, this does result in packet
> drops.
> 
> Test command extracted from mmtests, UDP_STREAM size 1024:
> 
>  netperf-2.4.5-installed/bin/netperf -t UDP_STREAM  -l 60  -H 127.0.0.1 \
>    -- -s 1280 -S 1280 -m 1024 -M 1024 -P 15895
> 
>  UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0)
>   port 15895 AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
>  Socket  Message  Elapsed      Messages                
>  Size    Size     Time         Okay Errors   Throughput
>  bytes   bytes    secs            #      #   10^6bits/sec
> 
>    4608    1024   60.00     50024301      0    6829.98
>    2560           60.00     46133211           6298.72
> 
>  Dropped packets: 50024301-46133211=3891090
> 
> To get a better drop indication, during this I run a command, to get
> system-wide network counters from the last second, so below numbers are
> per second.
> 
>  $ nstat > /dev/null && sleep 1  && nstat
>  #kernel
>  IpInReceives                    885162             0.0
>  IpInDelivers                    885161             0.0
>  IpOutRequests                   885162             0.0
>  UdpInDatagrams                  776105             0.0
>  UdpInErrors                     109056             0.0
>  UdpOutDatagrams                 885160             0.0
>  UdpRcvbufErrors                 109056             0.0
>  IpExtInOctets                   931190476          0.0
>  IpExtOutOctets                  931189564          0.0
>  IpExtInNoECTPkts                885162             0.0
> 
> So, 885Kpps but only 776Kpps delivered and 109Kpps drops. See
> UdpInErrors and UdpRcvbufErrors is equal (109056/sec). This drop
> happens kernel side in __udp_queue_rcv_skb[1], because receiving
> process didn't empty it's queue fast enough see [2].
> 
> Although upstream changes are coming in this area, [2] is replaced with
> __udp_enqueue_schedule_skb, which I actually tested with... hmm
> 
> Retesting with kernel 4.7.0-baseline+ ... show something else. 
> To Paolo, you might want to look into this.  And it could also explain why
> I've not see the mentioned speedup by mm-change, as I've been testing
> this patch on top of net-next (at 93ba2222550) with Paolo's UDP changes.

Thank you for reporting this.

It seems that the commit 123b4a633580 ("udp: use it's own memory
accounting schema") is too strict while checking the rcvbuf. 

For very small value of rcvbuf, it allows a single skb to be enqueued,
while previously we allowed 2 of them to enter the queue, even if the
first one truesize exceeded rcvbuf, as in your test-case.

Can you please try the following patch ?

Thank you,

Paolo
---
 net/ipv4/udp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e1d0bf8..2f5dc92 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1200,19 +1200,21 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 	struct sk_buff_head *list = &sk->sk_receive_queue;
 	int rmem, delta, amt, err = -ENOMEM;
 	int size = skb->truesize;
+	int limit;
 
 	/* try to avoid the costly atomic add/sub pair when the receive
 	 * queue is full; always allow at least a packet
 	 */
 	rmem = atomic_read(&sk->sk_rmem_alloc);
-	if (rmem && (rmem + size > sk->sk_rcvbuf))
+	limit = size + sk->sk_rcvbuf;
+	if (rmem > limit)
 		goto drop;
 
 	/* we drop only if the receive buf is full and the receive
 	 * queue contains some other skb
 	 */
 	rmem = atomic_add_return(size, &sk->sk_rmem_alloc);
-	if ((rmem > sk->sk_rcvbuf) && (rmem > size))
+	if (rmem > limit)
 		goto uncharge_drop;
 
 	spin_lock(&list->lock);





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v3
  2016-12-01 22:17           ` Paolo Abeni
@ 2016-12-02 15:37             ` Jesper Dangaard Brouer
  2016-12-02 15:44               ` Paolo Abeni
  0 siblings, 1 reply; 4+ messages in thread
From: Jesper Dangaard Brouer @ 2016-12-02 15:37 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Mel Gorman, Andrew Morton, Christoph Lameter, Michal Hocko,
	Vlastimil Babka, Johannes Weiner, Linux-MM, Linux-Kernel,
	Rick Jones, netdev@vger.kernel.org, Hannes Frederic Sowa, brouer

On Thu, 01 Dec 2016 23:17:48 +0100
Paolo Abeni <pabeni@redhat.com> wrote:

> On Thu, 2016-12-01 at 18:34 +0100, Jesper Dangaard Brouer wrote:
> > (Cc. netdev, we might have an issue with Paolo's UDP accounting and
> > small socket queues)
> > 
> > On Wed, 30 Nov 2016 16:35:20 +0000
> > Mel Gorman <mgorman@techsingularity.net> wrote:
> >   
> > > > I don't quite get why you are setting the socket recv size
> > > > (with -- -s and -S) to such a small number, size + 256.
> > > >     
> > > 
> > > Maybe I missed something at the time I wrote that but why would it
> > > need to be larger?  
> > 
> > Well, to me it is quite obvious that we need some queue to avoid packet
> > drops.  We have two processes netperf and netserver, that are sending
> > packets between each-other (UDP_STREAM mostly netperf -> netserver).
> > These PIDs are getting scheduled and migrated between CPUs, and thus
> > does not get executed equally fast, thus a queue is need absorb the
> > fluctuations.
> > 
> > The network stack is even partly catching your config "mistake" and
> > increase the socket queue size, so we minimum can handle one max frame
> > (due skb "truesize" concept approx PAGE_SIZE + overhead).
> > 
> > Hopefully for localhost testing a small queue should hopefully not
> > result in packet drops.  Testing... ups, this does result in packet
> > drops.
> > 
> > Test command extracted from mmtests, UDP_STREAM size 1024:
> > 
> >  netperf-2.4.5-installed/bin/netperf -t UDP_STREAM  -l 60  -H 127.0.0.1 \
> >    -- -s 1280 -S 1280 -m 1024 -M 1024 -P 15895
> > 
> >  UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0)
> >   port 15895 AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
> >  Socket  Message  Elapsed      Messages                
> >  Size    Size     Time         Okay Errors   Throughput
> >  bytes   bytes    secs            #      #   10^6bits/sec
> > 
> >    4608    1024   60.00     50024301      0    6829.98
> >    2560           60.00     46133211           6298.72
> > 
> >  Dropped packets: 50024301-46133211=3891090
> > 
> > To get a better drop indication, during this I run a command, to get
> > system-wide network counters from the last second, so below numbers are
> > per second.
> > 
> >  $ nstat > /dev/null && sleep 1  && nstat
> >  #kernel
> >  IpInReceives                    885162             0.0
> >  IpInDelivers                    885161             0.0
> >  IpOutRequests                   885162             0.0
> >  UdpInDatagrams                  776105             0.0
> >  UdpInErrors                     109056             0.0
> >  UdpOutDatagrams                 885160             0.0
> >  UdpRcvbufErrors                 109056             0.0
> >  IpExtInOctets                   931190476          0.0
> >  IpExtOutOctets                  931189564          0.0
> >  IpExtInNoECTPkts                885162             0.0
> > 
> > So, 885Kpps but only 776Kpps delivered and 109Kpps drops. See
> > UdpInErrors and UdpRcvbufErrors is equal (109056/sec). This drop
> > happens kernel side in __udp_queue_rcv_skb[1], because receiving
> > process didn't empty it's queue fast enough see [2].
> > 
> > Although upstream changes are coming in this area, [2] is replaced with
> > __udp_enqueue_schedule_skb, which I actually tested with... hmm
> > 
> > Retesting with kernel 4.7.0-baseline+ ... show something else. 
> > To Paolo, you might want to look into this.  And it could also explain why
> > I've not see the mentioned speedup by mm-change, as I've been testing
> > this patch on top of net-next (at 93ba2222550) with Paolo's UDP changes.  
> 
> Thank you for reporting this.
> 
> It seems that the commit 123b4a633580 ("udp: use it's own memory
> accounting schema") is too strict while checking the rcvbuf. 
> 
> For very small value of rcvbuf, it allows a single skb to be enqueued,
> while previously we allowed 2 of them to enter the queue, even if the
> first one truesize exceeded rcvbuf, as in your test-case.
> 
> Can you please try the following patch ?

Sure, it looks much better with this patch.


$ /home/jbrouer/git/mmtests/work/testdisk/sources/netperf-2.4.5-installed/bin/netperf -t UDP_STREAM  -l 60  -H 127.0.0.1    -- -s 1280 -S 1280 -m 1024 -M 1024 -P 15895
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 15895 AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

  4608    1024   60.00     50191555      0    6852.82
  2560           60.00     50189872           6852.59

Only 50191555-50189872=1683 drops, approx 1683/60 = 28/sec

$ nstat > /dev/null && sleep 1  && nstat
#kernel
IpInReceives                    885417             0.0
IpInDelivers                    885416             0.0
IpOutRequests                   885417             0.0
UdpInDatagrams                  885382             0.0
UdpInErrors                     29                 0.0
UdpOutDatagrams                 885410             0.0
UdpRcvbufErrors                 29                 0.0
IpExtInOctets                   931534428          0.0
IpExtOutOctets                  931533376          0.0
IpExtInNoECTPkts                885488             0.0

 
> Thank you,
> 
> Paolo
> ---
>  net/ipv4/udp.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index e1d0bf8..2f5dc92 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1200,19 +1200,21 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
>  	struct sk_buff_head *list = &sk->sk_receive_queue;
>  	int rmem, delta, amt, err = -ENOMEM;
>  	int size = skb->truesize;
> +	int limit;
>  
>  	/* try to avoid the costly atomic add/sub pair when the receive
>  	 * queue is full; always allow at least a packet
>  	 */
>  	rmem = atomic_read(&sk->sk_rmem_alloc);
> -	if (rmem && (rmem + size > sk->sk_rcvbuf))
> +	limit = size + sk->sk_rcvbuf;
> +	if (rmem > limit)
>  		goto drop;
>  
>  	/* we drop only if the receive buf is full and the receive
>  	 * queue contains some other skb
>  	 */
>  	rmem = atomic_add_return(size, &sk->sk_rmem_alloc);
> -	if ((rmem > sk->sk_rcvbuf) && (rmem > size))
> +	if (rmem > limit)
>  		goto uncharge_drop;
>  
>  	spin_lock(&list->lock);
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v3
  2016-12-02 15:37             ` Jesper Dangaard Brouer
@ 2016-12-02 15:44               ` Paolo Abeni
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2016-12-02 15:44 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Mel Gorman, Andrew Morton, Christoph Lameter, Michal Hocko,
	Vlastimil Babka, Johannes Weiner, Linux-MM, Linux-Kernel,
	Rick Jones, netdev@vger.kernel.org, Hannes Frederic Sowa

On Fri, 2016-12-02 at 16:37 +0100, Jesper Dangaard Brouer wrote:
> On Thu, 01 Dec 2016 23:17:48 +0100
> Paolo Abeni <pabeni@redhat.com> wrote:
> 
> > On Thu, 2016-12-01 at 18:34 +0100, Jesper Dangaard Brouer wrote:
> > > (Cc. netdev, we might have an issue with Paolo's UDP accounting and
> > > small socket queues)
> > > 
> > > On Wed, 30 Nov 2016 16:35:20 +0000
> > > Mel Gorman <mgorman@techsingularity.net> wrote:
> > >   
> > > > > I don't quite get why you are setting the socket recv size
> > > > > (with -- -s and -S) to such a small number, size + 256.
> > > > >     
> > > > 
> > > > Maybe I missed something at the time I wrote that but why would it
> > > > need to be larger?  
> > > 
> > > Well, to me it is quite obvious that we need some queue to avoid packet
> > > drops.  We have two processes netperf and netserver, that are sending
> > > packets between each-other (UDP_STREAM mostly netperf -> netserver).
> > > These PIDs are getting scheduled and migrated between CPUs, and thus
> > > does not get executed equally fast, thus a queue is need absorb the
> > > fluctuations.
> > > 
> > > The network stack is even partly catching your config "mistake" and
> > > increase the socket queue size, so we minimum can handle one max frame
> > > (due skb "truesize" concept approx PAGE_SIZE + overhead).
> > > 
> > > Hopefully for localhost testing a small queue should hopefully not
> > > result in packet drops.  Testing... ups, this does result in packet
> > > drops.
> > > 
> > > Test command extracted from mmtests, UDP_STREAM size 1024:
> > > 
> > >  netperf-2.4.5-installed/bin/netperf -t UDP_STREAM  -l 60  -H 127.0.0.1 \
> > >    -- -s 1280 -S 1280 -m 1024 -M 1024 -P 15895
> > > 
> > >  UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0)
> > >   port 15895 AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
> > >  Socket  Message  Elapsed      Messages                
> > >  Size    Size     Time         Okay Errors   Throughput
> > >  bytes   bytes    secs            #      #   10^6bits/sec
> > > 
> > >    4608    1024   60.00     50024301      0    6829.98
> > >    2560           60.00     46133211           6298.72
> > > 
> > >  Dropped packets: 50024301-46133211=3891090
> > > 
> > > To get a better drop indication, during this I run a command, to get
> > > system-wide network counters from the last second, so below numbers are
> > > per second.
> > > 
> > >  $ nstat > /dev/null && sleep 1  && nstat
> > >  #kernel
> > >  IpInReceives                    885162             0.0
> > >  IpInDelivers                    885161             0.0
> > >  IpOutRequests                   885162             0.0
> > >  UdpInDatagrams                  776105             0.0
> > >  UdpInErrors                     109056             0.0
> > >  UdpOutDatagrams                 885160             0.0
> > >  UdpRcvbufErrors                 109056             0.0
> > >  IpExtInOctets                   931190476          0.0
> > >  IpExtOutOctets                  931189564          0.0
> > >  IpExtInNoECTPkts                885162             0.0
> > > 
> > > So, 885Kpps but only 776Kpps delivered and 109Kpps drops. See
> > > UdpInErrors and UdpRcvbufErrors is equal (109056/sec). This drop
> > > happens kernel side in __udp_queue_rcv_skb[1], because receiving
> > > process didn't empty it's queue fast enough see [2].
> > > 
> > > Although upstream changes are coming in this area, [2] is replaced with
> > > __udp_enqueue_schedule_skb, which I actually tested with... hmm
> > > 
> > > Retesting with kernel 4.7.0-baseline+ ... show something else. 
> > > To Paolo, you might want to look into this.  And it could also explain why
> > > I've not see the mentioned speedup by mm-change, as I've been testing
> > > this patch on top of net-next (at 93ba2222550) with Paolo's UDP changes.  
> > 
> > Thank you for reporting this.
> > 
> > It seems that the commit 123b4a633580 ("udp: use it's own memory
> > accounting schema") is too strict while checking the rcvbuf. 
> > 
> > For very small value of rcvbuf, it allows a single skb to be enqueued,
> > while previously we allowed 2 of them to enter the queue, even if the
> > first one truesize exceeded rcvbuf, as in your test-case.
> > 
> > Can you please try the following patch ?
> 
> Sure, it looks much better with this patch.

Thank you for testing. I'll send a formal patch to David soon.

BTW I see I nice performance improvement compared to 4.7...

Cheers,

Paolo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-12-02 15:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20161127131954.10026-1-mgorman@techsingularity.net>
     [not found] ` <20161130134034.3b60c7f0@redhat.com>
     [not found]   ` <20161130140615.3bbn7576iwbyc3op@techsingularity.net>
     [not found]     ` <20161130160612.474ca93c@redhat.com>
     [not found]       ` <20161130163520.hg7icdflagmvarbr@techsingularity.net>
2016-12-01 17:34         ` [PATCH] mm: page_alloc: High-order per-cpu page allocator v3 Jesper Dangaard Brouer
2016-12-01 22:17           ` Paolo Abeni
2016-12-02 15:37             ` Jesper Dangaard Brouer
2016-12-02 15:44               ` Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).