From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>, David Miller <davem@davemloft.net>,
netdev@vger.kernel.org
Subject: Re: [PATCH net-next] net: netdev_alloc_skb() use build_skb()
Date: Tue, 5 Jun 2012 00:54:36 +0300
Message-ID: <20120604215434.GB3193@redhat.com>
In-Reply-To: <1338839579.2760.1932.camel@edumazet-glaptop>

On Mon, Jun 04, 2012 at 09:52:59PM +0200, Eric Dumazet wrote:
> On Mon, 2012-06-04 at 22:43 +0300, Michael S. Tsirkin wrote:
> > On Mon, Jun 04, 2012 at 09:29:45PM +0200, Eric Dumazet wrote:
> > > On Mon, 2012-06-04 at 21:16 +0300, Michael S. Tsirkin wrote:
> > >
> > > > Yes but if a tcp socket then hangs on, on one of the fragments,
> > > > while the other has been freed, the whole page is still
> > > > never reused, right?
> > > >
> > > > Doesn't this mean truesize should be 4K?
> > > >
> > >
> > > Yes, or more exactly PAGE_SIZE, but then performance would really go
> > > down on machines with 64KB pages.
> > > Maybe we should make the whole frag
> > > head idea enabled only for PAGE_SIZE=4096.
> > >
> > > Not sure we want to track precise truesize, as the minimum truesize is
> > > SKB_DATA_ALIGN(length + NET_SKB_PAD) + SKB_DATA_ALIGN(sizeof(struct
> > > skb_shared_info)) (64 + 64 + 320) = 448
> > >
> > > It's not like buggy drivers that used truesize = length
> > >
> > >
> >
> > Interesting. But where's the threshold?
> >
>
> It all depends on the global limit you have on your machine.
>
> If you allow tcp memory to use 10% of ram, then a systematic x4 error
> would allow it to use 40% of ram. Maybe not enough to crash.
>
> Now you have to find a real workload able to hit this limit for real...
>
> But, if you "allow" a driver to claim a truesize of 1 (instead of 4096),
> you can reach the limit and OOM faster.
>
> You know, even the current page stored for each socket (sk_sndmsg_page)
> can be a problem if you set up 1,000,000 tcp sockets. That can consume
> 4GB of ram (added to the inodes/sockets themselves).
> This is not really taken into account right now...
>
>
Yes, but what bugs me is that if the box is not under memory pressure,
this overestimation limits buffers for no real gain.
How about we teach tcp to use data_len for buffer
limits normally, and switch to truesize when low on memory?
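
To make the idea concrete, here is a rough sketch in plain userspace C,
not kernel code. It first redoes the minimum-truesize arithmetic quoted
above (64 + 64 + 320 = 448) and then shows the charging rule I have in
mind. The constants are assumptions for illustration only (64-byte cache
line, NET_SKB_PAD of 64, the 320-byte skb_shared_info size quoted above),
and data_align(), min_truesize() and rcvbuf_charge() are hypothetical
names, not existing kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration, not taken from kernel headers. */
#define CACHE_LINE_EX   64u   /* assumed SMP_CACHE_BYTES */
#define NET_SKB_PAD_EX  64u   /* assumed NET_SKB_PAD */
#define SHINFO_SIZE_EX  320u  /* sizeof(struct skb_shared_info) as quoted */

/* Round x up to a multiple of the cache line, in the spirit of
 * SKB_DATA_ALIGN(). */
static size_t data_align(size_t x)
{
    return (x + CACHE_LINE_EX - 1) & ~(size_t)(CACHE_LINE_EX - 1);
}

/* Minimum truesize of a frag head carrying `length` bytes of payload. */
static size_t min_truesize(size_t length)
{
    return data_align(length + NET_SKB_PAD_EX) + data_align(SHINFO_SIZE_EX);
}

/* Hypothetical policy: charge only the payload against the socket
 * buffer limit while memory is plentiful, and fall back to the
 * conservative truesize once the box is under memory pressure. */
static size_t rcvbuf_charge(size_t data_len, size_t truesize,
                            bool under_pressure)
{
    assert(truesize >= data_len); /* truesize never undercounts payload */
    return under_pressure ? truesize : data_len;
}
```

With these assumed constants, min_truesize() gives 448 for any payload of
1 to 64 bytes, and a 64-byte payload sitting in a 4096-byte page would be
charged as 64 normally and as 4096 under pressure.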
--
MST