netdev.vger.kernel.org archive mirror
From: Ivan Babrou <ivan@cloudflare.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Linux MM <linux-mm@kvack.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeelb@google.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	David Ahern <dsahern@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	cgroups@vger.kernel.org, kernel-team <kernel-team@cloudflare.com>
Subject: Re: Low TCP throughput due to vmpressure with swap enabled
Date: Tue, 22 Nov 2022 10:11:41 -0800
Message-ID: <CABWYdi2TWJej806yif9hi7cxD9P9-EpMB9EU_72wWw9fFqtt4g@mail.gmail.com>
In-Reply-To: <CANn89iLzARPp6jW1xS0rf+-wS_RnwK-Kfgs9uQFYan2AHPRQFA@mail.gmail.com>

On Tue, Nov 22, 2022 at 10:01 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Nov 21, 2022 at 4:53 PM Ivan Babrou <ivan@cloudflare.com> wrote:
> >
> > Hello,
> >
> > We have observed a negative TCP throughput behavior from the following commit:
> >
> > * 8e8ae645249b mm: memcontrol: hook up vmpressure to socket pressure
> >
> > It landed back in 2016 in v4.5, so it's not exactly a new issue.
> >
> > The crux of the issue is that in some cases with swap present the
> > workload can be unfairly throttled in terms of TCP throughput.
>
> I guess defining 'fairness' in such a scenario is nearly impossible.
>
> Have you tried changing /proc/sys/net/ipv4/tcp_rmem  (and/or tcp_wmem) ?
> Defaults are quite conservative.

Yes, our max sizes are much higher than the defaults. I don't see how
it matters though. The issue is that the kernel clamps rcv_ssthresh
at 4 x advmss. No matter how much TCP memory you end up using, the
kernel will clamp based on responsiveness to memory reclaim, which in
turn depends on swap presence. We're seeing it in production with tens
of thousands of sockets and a high max tcp_rmem, and I'm able to
replicate the same issue in my VM with the default sysctl values.
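
To put a rough number on that clamp, here is a minimal sketch of the
arithmetic. It is not kernel code: the function names are hypothetical,
and the advmss (1460 bytes), the 100 ms RTT, and the 6 MiB starting
value are assumptions picked for illustration. It also treats the
clamped rcv_ssthresh as the whole receive window, which is a
simplification.

```python
# Illustrative arithmetic only; these names are hypothetical, not kernel APIs.

def clamped_rcv_ssthresh(rcv_ssthresh: int, advmss: int) -> int:
    """Apply the clamp described above: min(rcv_ssthresh, 4 * advmss)."""
    return min(rcv_ssthresh, 4 * advmss)


def max_throughput_bps(window_bytes: int, rtt_ms: int) -> float:
    """Rough per-flow ceiling: at most one window of data per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms


ADVMSS = 1460                  # assumed: 1500-byte MTU, no TCP options
RTT_MS = 100                   # assumed round-trip time in milliseconds
UNCLAMPED = 6 * 1024 * 1024    # assumed value before the clamp (6 MiB)

window = clamped_rcv_ssthresh(UNCLAMPED, ADVMSS)
print(f"clamped rcv_ssthresh: {window} bytes")
print(f"rough throughput ceiling: {max_throughput_bps(window, RTT_MS):.0f} bit/s")
```

Under these assumptions the ceiling comes out well under 1 Mbit/s per
flow, and raising tcp_rmem does not move it, since the clamp depends
only on advmss.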

> If for your workload you want to ensure a minimum amount of memory per
> TCP socket, that might be good enough.

That's not my goal at all. We don't have a problem with TCP memory
consumption. Our issue is low throughput because vmpressure() thinks
that the cgroup is memory constrained when it most definitely is not.
