linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Stanislav Fomichev <sdf@google.com>
Cc: Eric Dumazet <edumazet@google.com>,
	davem@davemloft.net, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, daniel@iogearbox.net,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Leon Romanovsky <leon@kernel.org>,
	David Ahern <dsahern@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Kees Cook <keescook@chromium.org>,
	Christian Brauner <brauner@kernel.org>,
	Kuniyuki Iwashima <kuniyu@amazon.com>,
	Lennart Poettering <mzxreary@0pointer.de>,
	linux-arch@vger.kernel.org,
	Aleksandr Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>,
	bpf <bpf@vger.kernel.org>
Subject: handling unsupported optlen in cgroup bpf getsockopt: (was [PATCH net-next v4 2/4] net: socket: add sockopts blacklist for BPF cgroup hook)
Date: Mon, 17 Apr 2023 18:03:35 -0700	[thread overview]
Message-ID: <a4591e85-d58b-0efd-c8a4-2652dc69ff68@linux.dev> (raw)
In-Reply-To: <ZDoEG0VF6fb9y0EC@google.com>

On 4/14/23 6:55 PM, Stanislav Fomichev wrote:
> On 04/13, Stanislav Fomichev wrote:
>> On Thu, Apr 13, 2023 at 7:38 AM Aleksandr Mikhalitsyn
>> <aleksandr.mikhalitsyn@canonical.com> wrote:
>>>
>>> On Thu, Apr 13, 2023 at 4:22 PM Eric Dumazet <edumazet@google.com> wrote:
>>>>
>>>> On Thu, Apr 13, 2023 at 3:35 PM Alexander Mikhalitsyn
>>>> <aleksandr.mikhalitsyn@canonical.com> wrote:
>>>>>
>>>>> During work on SO_PEERPIDFD, it was discovered (thanks to Christian),
>>>>> that bpf cgroup hook can cause FD leaks when used with sockopts which
>>>>> install FDs into the process fdtable.
>>>>>
>>>>> After some offlist discussion it was proposed to add a blacklist of
>>>>
>>>> We try to replace this word by either denylist or blocklist, even in changelogs.
>>>
>>> Hi Eric,
>>>
>>> Oh, I'm sorry about that. :( Sure.
>>>
>>>>
>>>>> socket options those can cause troubles when BPF cgroup hook is enabled.
>>>>>
>>>>
>>>> Can we find the appropriate Fixes: tag to help stable teams ?
>>>
>>> Sure, I will add next time.
>>>
>>> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
>>>
>>> I think it's better to add Stanislav Fomichev to CC.
>>
>> Can we use 'struct proto' bpf_bypass_getsockopt instead? We already
>> use it for tcp zerocopy, I'm assuming it should work in this case as
>> well?
> 
> Jakub reminded me of the other things I wanted to ask here bug forgot:
> 
> - setsockopt is probably not needed, right? setsockopt hook triggers
>    before the kernel and shouldn't leak anything
> - for getsockopt, instead of bypassing bpf completely, should we instead
>    ignore the error from the bpf program? that would still preserve
>    the observability aspect

stealing this thread to discuss the optlen issue which may make sense to bypass 
also.

There has been issue with optlen. Other than this older post related to optlen > 
PAGE_SIZE: 
https://lore.kernel.org/bpf/5c8b7d59-1f28-2284-f7b9-49d946f2e982@linux.dev/, the 
recent one related to optlen that we have seen is NETLINK_LIST_MEMBERSHIPS. The 
userspace passed in optlen == 0 and the kernel put the expected optlen (> 0) and 
'return 0;' to userspace. The userspace intention is to learn the expected 
optlen. This makes 'ctx.optlen > max_optlen' and 
__cgroup_bpf_run_filter_getsockopt() ends up returning -EFAULT to the userspace 
even the bpf prog has not changed anything.

Does it make sense to also bypass the bpf prog when 'ctx.optlen > max_optlen' 
for now (and this can use a separate patch which as usual requires a bpf selftests)?

In the future, does it make sense to have a specific cgroup-bpf-prog (a specific 
attach type?) that only uses bpf_dynptr kfunc to access the optval such that it 
can enforce read-only for some optname and potentially also track if bpf-prog 
has written a new optval? The bpf-prog can only return 1 (OK) and only allows 
using bpf_set_retval() instead. Likely there is still holes but could be a seed 
of thought to continue polishing the idea.


> - or maybe we can even have a per-proto bpf_getsockopt_cleanup call that
>    gets called whenever bpf returns an error to make sure protocols have
>    a chance to handle that condition (and free the fd)
> 



  parent reply	other threads:[~2023-04-18  1:09 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20230413133355.350571-1-aleksandr.mikhalitsyn@canonical.com>
2023-04-13 13:33 ` [PATCH net-next v4 1/4] scm: add SO_PASSPIDFD and SCM_PIDFD Alexander Mikhalitsyn
2023-04-17 15:18   ` Christian Brauner
2023-04-17 16:01     ` Aleksandr Mikhalitsyn
2023-04-17 17:16       ` Christian Brauner
2023-04-17 17:43         ` Eric Dumazet
2023-04-18  8:16           ` Christian Brauner
2023-04-18 13:07   ` Christian Brauner
2023-04-13 13:33 ` [PATCH net-next v4 2/4] net: socket: add sockopts blacklist for BPF cgroup hook Alexander Mikhalitsyn
2023-04-13 14:21   ` Eric Dumazet
2023-04-13 14:38     ` Aleksandr Mikhalitsyn
2023-04-13 16:37       ` Stanislav Fomichev
2023-04-15  1:55         ` Stanislav Fomichev
2023-04-17 14:42           ` Christian Brauner
2023-04-17 18:10             ` Stanislav Fomichev
2023-04-18  1:03           ` Martin KaFai Lau [this message]
2023-04-18 16:47             ` handling unsupported optlen in cgroup bpf getsockopt: (was [PATCH net-next v4 2/4] net: socket: add sockopts blacklist for BPF cgroup hook) Stanislav Fomichev
2023-04-25 17:59               ` Kui-Feng Lee
2023-04-25 18:42                 ` Stanislav Fomichev
2023-04-25 21:11                   ` Kui-Feng Lee
2023-04-26 15:50                     ` Stanislav Fomichev
2023-04-25 21:28                   ` Kui-Feng Lee
2023-04-26 15:50                     ` Stanislav Fomichev
2023-04-15  0:51       ` [PATCH net-next v4 2/4] net: socket: add sockopts blacklist for BPF cgroup hook Jakub Kicinski
2023-04-13 13:33 ` [PATCH net-next v4 3/4] net: core: add getsockopt SO_PEERPIDFD Alexander Mikhalitsyn
2023-04-18 13:13   ` Christian Brauner
2023-04-13 13:33 ` [PATCH net-next v4 4/4] selftests: net: add SCM_PIDFD / SO_PEERPIDFD test Alexander Mikhalitsyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a4591e85-d58b-0efd-c8a4-2652dc69ff68@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=aleksandr.mikhalitsyn@canonical.com \
    --cc=arnd@arndb.de \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=keescook@chromium.org \
    --cc=kuba@kernel.org \
    --cc=kuniyu@amazon.com \
    --cc=leon@kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mzxreary@0pointer.de \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).