From: sdf@google.com
To: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <song@kernel.org>, Networking <netdev@vger.kernel.org>,
bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [PATCH bpf-next 1/2] bpf: try to avoid kzalloc in cgroup/{s,g}etsockopt
Date: Thu, 31 Dec 2020 12:14:13 -0800 [thread overview]
Message-ID: <X+4xFUuYHUIufeJ1@google.com> (raw)
In-Reply-To: <20201231064728.x7vywfzxxn3sqq7e@kafai-mbp.dhcp.thefacebook.com>
On 12/30, Martin KaFai Lau wrote:
> On Mon, Dec 21, 2020 at 02:22:41PM -0800, Song Liu wrote:
> > On Thu, Dec 17, 2020 at 9:24 AM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > When we attach a bpf program to cgroup/getsockopt, any other
> > > getsockopt() syscall starts incurring kzalloc/kfree cost. While,
> > > in general, that's not an issue, sometimes it is, like in the case
> > > of TCP_ZEROCOPY_RECEIVE. TCP_ZEROCOPY_RECEIVE (ab)uses the
> > > getsockopt system call to implement a fastpath for incoming TCP,
> > > and we don't want extra allocations there.
> > >
> > > Let's add a small buffer on the stack and use it for small (the
> > > majority of) {s,g}etsockopt values. I've started with 128 bytes
> > > to cover the options we care about (TCP_ZEROCOPY_RECEIVE, which
> > > is currently 32 bytes, with some planned extension to 64, plus
> > > some headroom for the future).
> >
> > I don't really know the rule of thumb, but 128 bytes on the stack
> > feels too big to me. I would like to hear others' opinions on this.
> > Can we solve the problem with some other mechanism, e.g. a mempool?
> It seems do_tcp_getsockopt() also has a "struct tcp_zerocopy_receive"
> on the stack. I think the buf here is mimicking
> "struct tcp_zerocopy_receive", so it should not cause any new problem.
Good point!
> However, "struct tcp_zerocopy_receive" is only 40 bytes now. I think
> it is better to have a smaller buf for now and increase it later, when
> the future extensions to "struct tcp_zerocopy_receive" are also
> upstreamed.
I can lower it to 64. Or even 40?
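To make the buffer-on-the-stack idea more concrete, the pattern is roughly
the following. This is only a userspace-style sketch: apart from the
BPF_SOCKOPT_KERN_BUF_SIZE name discussed in this thread, the struct and
function names, the 64-byte value, and calloc() standing in for kzalloc()
are all placeholders, not the actual patch.

/*
 * Sketch: keep a small fixed-size buffer on the caller's stack and only
 * fall back to a heap allocation when the option value doesn't fit.
 */
#include <errno.h>
#include <stdlib.h>

#define BPF_SOCKOPT_KERN_BUF_SIZE 64	/* size under discussion (128/64/40) */

struct sockopt_buf {
	char data[BPF_SOCKOPT_KERN_BUF_SIZE];	/* lives on the caller's stack */
};

struct sockopt_ctx {
	void *optval;
	int optlen;
};

/* Use the on-stack buffer for small options, fall back to the heap otherwise. */
static int sockopt_alloc_buf(struct sockopt_ctx *ctx, int max_optlen,
			     struct sockopt_buf *buf)
{
	if (max_optlen < 0)
		return -EINVAL;

	if (max_optlen <= (int)sizeof(buf->data)) {
		/* Common case (e.g. TCP_ZEROCOPY_RECEIVE): no allocation at all. */
		ctx->optval = buf->data;
	} else {
		ctx->optval = calloc(1, max_optlen);	/* kzalloc() stand-in */
		if (!ctx->optval)
			return -ENOMEM;
	}
	ctx->optlen = max_optlen;
	return 0;
}

static void sockopt_free_buf(struct sockopt_ctx *ctx, struct sockopt_buf *buf)
{
	/* Only free what was actually heap-allocated. */
	if (ctx->optval != buf->data)
		free(ctx->optval);
}

int main(void)
{
	struct sockopt_buf buf;
	struct sockopt_ctx ctx;

	/* 40 bytes (current sizeof(struct tcp_zerocopy_receive)) fits inline. */
	if (!sockopt_alloc_buf(&ctx, 40, &buf))
		sockopt_free_buf(&ctx, &buf);
	return 0;
}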
I can also try to add something like BUILD_BUG_ON(sizeof(struct
tcp_zerocopy_receive) > BPF_SOCKOPT_KERN_BUF_SIZE) so that the build
breaks, and the buffer gets adjusted, whenever an extension of
tcp_zerocopy_receive outgrows it.
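Something along these lines, again as a standalone sketch: a C11
static_assert() plays the role of BUILD_BUG_ON() (note the inverted
condition), and both the 64-byte value and the stub struct are
placeholders for the real definitions.

/*
 * Sketch of the compile-time check. The stub struct below is just a
 * 40-byte stand-in for the real "struct tcp_zerocopy_receive".
 */
#include <assert.h>

#define BPF_SOCKOPT_KERN_BUF_SIZE 64	/* placeholder value */

struct tcp_zerocopy_receive_stub {
	char pad[40];	/* the real struct is 40 bytes today */
};

/*
 * Break the build as soon as the struct outgrows the inline buffer, so
 * BPF_SOCKOPT_KERN_BUF_SIZE has to be revisited together with any
 * extension of tcp_zerocopy_receive. (BUILD_BUG_ON() triggers when its
 * condition is true, static_assert() when its condition is false, hence
 * the <= here vs the > above.)
 */
static_assert(sizeof(struct tcp_zerocopy_receive_stub) <=
	      BPF_SOCKOPT_KERN_BUF_SIZE,
	      "bump BPF_SOCKOPT_KERN_BUF_SIZE");

int main(void)
{
	return 0;
}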
Thread overview: 16+ messages
2020-12-17 17:23 [PATCH bpf-next 0/2] bpf: misc performance improvements for cgroup hooks Stanislav Fomichev
2020-12-17 17:23 ` [PATCH bpf-next 1/2] bpf: try to avoid kzalloc in cgroup/{s,g}etsockopt Stanislav Fomichev
2020-12-21 22:22 ` Song Liu
2020-12-22 2:09 ` sdf
2020-12-31 6:47 ` Martin KaFai Lau
2020-12-31 20:14 ` sdf [this message]
2021-01-04 21:01 ` Martin KaFai Lau
2020-12-21 22:25 ` Song Liu
2020-12-22 2:11 ` sdf
2020-12-22 19:11 ` Martin KaFai Lau
2020-12-23 3:09 ` sdf
2020-12-31 6:50 ` Martin KaFai Lau
2020-12-31 20:18 ` sdf
2020-12-17 17:23 ` [PATCH bpf-next 2/2] bpf: split cgroup_bpf_enabled per attach type Stanislav Fomichev
2020-12-21 22:40 ` Song Liu
2020-12-22 1:57 ` sdf