From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>, David Miller <davem@davemloft.net>,
netdev@vger.kernel.org
Subject: Re: [PATCH net-next] net: netdev_alloc_skb() use build_skb()
Date: Tue, 5 Jun 2012 00:54:36 +0300
Message-ID: <20120604215434.GB3193@redhat.com>
In-Reply-To: <1338839579.2760.1932.camel@edumazet-glaptop>
On Mon, Jun 04, 2012 at 09:52:59PM +0200, Eric Dumazet wrote:
> On Mon, 2012-06-04 at 22:43 +0300, Michael S. Tsirkin wrote:
> > On Mon, Jun 04, 2012 at 09:29:45PM +0200, Eric Dumazet wrote:
> > > On Mon, 2012-06-04 at 21:16 +0300, Michael S. Tsirkin wrote:
> > >
> > > > Yes but if a tcp socket then hangs on, on one of the fragments,
> > > > while the other has been freed, the whole page is still
> > > > never reused, right?
> > > >
> > > > Doesn't this mean truesize should be 4K?
> > > >
> > >
> > > Yes, or more exactly PAGE_SIZE, but then performance would really go
> > > down on machines with 64KB pages.
> > > Maybe we should make the whole frag
> > > head idea enabled only for PAGE_SIZE=4096.
> > >
> > > Not sure we want to track precise truesize, as the minimum truesize is
> > > SKB_DATA_ALIGN(length + NET_SKB_PAD) + SKB_DATA_ALIGN(sizeof(struct
> > > skb_shared_info)) (64 + 64 + 320) = 448
> > >
> > > It's not like the buggy drivers that used truesize = length
> > >
> > >
> >
> > Interesting. But where's the threshold?
> >
>
> It all depends on the global limit you have on your machine.
>
> If you allow tcp memory to use 10% of ram, then a systematic x4 error
> would allow it to use 40% of ram. Maybe not enough to crash.
>
> Now you have to find a real workload able to hit this limit for real...
>
> But, if you "allow" a driver to claim a truesize of 1 (instead of 4096),
> you can reach the limit and OOM faster
>
> You know, even the current page stored for each socket (sk_sndmsg_page)
> can be a problem if you set up 1,000,000 tcp sockets. That can consume
> 4GB of ram (added to inode/sockets themselves)
> This is not really taken into account right now...
>
>
Yes, but what bugs me is that if the box is not under memory pressure,
this overestimation limits buffers for no real gain.
How about we teach tcp to use data_len for buffer
limits normally and switch to truesize when low on memory?
--
MST