From: Paolo Abeni <pabeni@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>,
Andrew Morton <akpm@linux-foundtion.org>,
Christoph Lameter <cl@linux.com>, Michal Hocko <mhocko@suse.com>,
Vlastimil Babka <vbabka@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
Linux-MM <linux-mm@kvack.org>,
Linux-Kernel <linux-kernel@vger.kernel.org>,
Rick Jones <rick.jones2@hpe.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Hannes Frederic Sowa <hannes@stressinduktion.org>
Subject: Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v3
Date: Fri, 02 Dec 2016 16:44:28 +0100
Message-ID: <1480693468.26226.2.camel@redhat.com>
In-Reply-To: <20161202163758.0d8cc9bf@redhat.com>

On Fri, 2016-12-02 at 16:37 +0100, Jesper Dangaard Brouer wrote:
> On Thu, 01 Dec 2016 23:17:48 +0100
> Paolo Abeni <pabeni@redhat.com> wrote:
>
> > On Thu, 2016-12-01 at 18:34 +0100, Jesper Dangaard Brouer wrote:
> > > (Cc. netdev, we might have an issue with Paolo's UDP accounting and
> > > small socket queues)
> > >
> > > On Wed, 30 Nov 2016 16:35:20 +0000
> > > Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > > > I don't quite get why you are setting the socket recv size
> > > > > (with -- -s and -S) to such a small number, size + 256.
> > > > >
> > > >
> > > > Maybe I missed something at the time I wrote that but why would it
> > > > need to be larger?
> > >
> > > Well, to me it is quite obvious that we need some queue to avoid packet
> > > drops. We have two processes, netperf and netserver, that are sending
> > > packets between each other (UDP_STREAM mostly netperf -> netserver).
> > > These PIDs are getting scheduled and migrated between CPUs, and thus
> > > do not get executed equally fast, so a queue is needed to absorb the
> > > fluctuations.
> > >
> > > The network stack is even partly catching your config "mistake" and
> > > increasing the socket queue size, so we can at minimum handle one max
> > > frame (due to the skb "truesize" concept, approx PAGE_SIZE + overhead).
> > >
> > > Hopefully, for localhost testing, a small queue should not
> > > result in packet drops. Testing... oops, this does result in packet
> > > drops.
> > >
> > > Test command extracted from mmtests, UDP_STREAM size 1024:
> > >
> > > netperf-2.4.5-installed/bin/netperf -t UDP_STREAM -l 60 -H 127.0.0.1 \
> > > -- -s 1280 -S 1280 -m 1024 -M 1024 -P 15895
> > >
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0)
> > > port 15895 AF_INET to 127.0.0.1 (127.0.0.1) port 15895 AF_INET
> > > Socket  Message  Elapsed      Messages
> > > Size    Size     Time         Okay Errors   Throughput
> > > bytes   bytes    secs            #      #   10^6bits/sec
> > >
> > >   4608    1024   60.00    50024301      0    6829.98
> > >   2560           60.00    46133211           6298.72
> > >
> > > Dropped packets: 50024301-46133211=3891090
> > >
> > > To get a better drop indication, during this I ran a command to get
> > > system-wide network counters from the last second, so the numbers below
> > > are per second.
> > >
> > > $ nstat > /dev/null && sleep 1 && nstat
> > > #kernel
> > > IpInReceives 885162 0.0
> > > IpInDelivers 885161 0.0
> > > IpOutRequests 885162 0.0
> > > UdpInDatagrams 776105 0.0
> > > UdpInErrors 109056 0.0
> > > UdpOutDatagrams 885160 0.0
> > > UdpRcvbufErrors 109056 0.0
> > > IpExtInOctets 931190476 0.0
> > > IpExtOutOctets 931189564 0.0
> > > IpExtInNoECTPkts 885162 0.0
> > >
> > > So, 885Kpps but only 776Kpps delivered and 109Kpps drops. See that
> > > UdpInErrors and UdpRcvbufErrors are equal (109056/sec). This drop
> > > happens kernel side in __udp_queue_rcv_skb[1], because the receiving
> > > process didn't empty its queue fast enough, see [2].
> > >
> > > Although upstream changes are coming in this area, [2] is replaced with
> > > __udp_enqueue_schedule_skb, which I actually tested with... hmm
> > >
> > > Retesting with kernel 4.7.0-baseline+ ... shows something else.
> > > To Paolo, you might want to look into this. And it could also explain why
> > > I've not seen the mentioned speedup by mm-change, as I've been testing
> > > this patch on top of net-next (at 93ba2222550) with Paolo's UDP changes.
> >
> > Thank you for reporting this.
> >
> > It seems that the commit 123b4a633580 ("udp: use it's own memory
> > accounting schema") is too strict while checking the rcvbuf.
> >
> > For very small values of rcvbuf, it allows a single skb to be enqueued,
> > while previously we allowed 2 of them to enter the queue, even if the
> > first one's truesize exceeded rcvbuf, as in your test-case.
> >
> > Can you please try the following patch?
>
> Sure, it looks much better with this patch.
Thank you for testing. I'll send a formal patch to David soon.
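
For readers following the thread, the "single skb versus 2 of them" effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel code and not the patch under discussion: the two policy names, the admitted() helper, and the 2304-byte truesize are hypothetical and chosen for illustration. Only the 2560-byte receive buffer is taken from the netperf output quoted above.

```python
def admitted(policy, truesize, rcvbuf, arrivals):
    """Count how many back-to-back skbs are queued before the first drop.

    The receiver is assumed never to drain the queue during the burst.
    """
    rmem = 0   # bytes currently charged to the receive queue
    count = 0
    for _ in range(arrivals):
        if policy == "strict":
            # Assumed shape of a strict check: an empty queue always takes
            # one skb, otherwise the new skb must fit entirely in rcvbuf.
            ok = rmem == 0 or rmem + truesize <= rcvbuf
        elif policy == "relaxed":
            # Assumed shape of a relaxed check: only the memory already
            # queued is compared with rcvbuf, so the last skb may overshoot.
            ok = rmem < rcvbuf
        else:
            raise ValueError(policy)
        if not ok:
            break
        rmem += truesize
        count += 1
    return count


RCVBUF = 2560     # receive socket size reported by netperf in the quoted run
TRUESIZE = 2304   # hypothetical truesize of one 1024-byte datagram

strict = admitted("strict", TRUESIZE, RCVBUF, 10)
relaxed = admitted("relaxed", TRUESIZE, RCVBUF, 10)
print("strict:", strict, "relaxed:", relaxed)
```

Under these assumed numbers the strict policy queues one skb and the relaxed one queues two, which is enough slack for a receiver that is briefly descheduled to avoid an immediate drop.
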
BTW I see a nice performance improvement compared to 4.7...
Cheers,
Paolo
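
As a cross-check of the figures quoted above, the drop counts can be recomputed from the numbers already in the thread. This is a minimal sketch: the values are copied from the quoted netperf and nstat output, while the variable names are only illustrative.

```python
# Per-second counters copied from the quoted nstat output.
counters = {
    "IpInDelivers": 885161,
    "UdpInDatagrams": 776105,
    "UdpInErrors": 109056,
    "UdpRcvbufErrors": 109056,
}

# Datagrams handed to UDP that never reached the receiving socket.
missing = counters["IpInDelivers"] - counters["UdpInDatagrams"]

# Share of delivered IP packets dropped because the receive buffer was full.
rcvbuf_drop_pct = 100.0 * counters["UdpRcvbufErrors"] / counters["IpInDelivers"]

# Totals over the 60 second netperf run: messages sent vs. messages received.
netperf_sent = 50024301
netperf_received = 46133211
netperf_dropped = netperf_sent - netperf_received
netperf_drop_pct = 100.0 * netperf_dropped / netperf_sent

print(f"nstat:   {missing} datagrams/s missing, {rcvbuf_drop_pct:.1f}% rcvbuf drops")
print(f"netperf: {netperf_dropped} messages dropped, {netperf_drop_pct:.1f}% of sent")
```

The missing datagrams in the nstat sample equal UdpRcvbufErrors exactly, which matches the reading that all of the loss is due to a full receive buffer. The two percentages differ because nstat covers a single second while netperf reports the whole run.
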