Re: [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF

From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net
Cc: andrii@kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
Date: Sat, 11 Dec 2021 20:36:29 +0100
Message-ID: <87ilvvue6a.fsf@toke.dk>
In-Reply-To: <20211210201333.896276-2-andrii@kernel.org>

Andrii Nakryiko <andrii@kernel.org> writes:

> The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
> one of the first extremely frustrating gotchas that all new BPF users go
> through and in some cases have to learn it a very hard way.
>
> Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
> dropped the dependency on memlock and uses memcg-based memory accounting
> instead. Unfortunately, detecting memcg-based BPF memory accounting is
> far from trivial (as can be evidenced by this patch), so in practice
> most BPF applications still do unconditional RLIMIT_MEMLOCK increase.
>
> As we move towards libbpf 1.0, it would be good to allow users to forget
> about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
> automatically. This patch paves the way forward in this matter. Libbpf
> will do feature detection of memcg-based accounting, and if detected,
> will do nothing. But if the kernel is too old, just like BCC, libbpf
> will automatically increase RLIMIT_MEMLOCK on behalf of user
> application ([0]).
>
> As this is technically a breaking change, during the transition period
> applications have to opt into libbpf 1.0 mode by setting
> LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
> libbpf_set_strict_mode().
>
> Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
> with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
> nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
> called before the first bpf_prog_load(), bpf_btf_load(), or
> bpf_object__load() call, otherwise it has no effect and will return
> -EBUSY.
>
>   [0] Closes: https://github.com/libbpf/libbpf/issues/369
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

The probing approach breaks with out-of-order backports, I suppose.
Hopefully no one will do those for that particular patch, though (it's
not really a bugfix), and at least for RHEL we did backport them
together.

Can't think of any better ways of doing the detection either, but maybe
something to be aware of in the future (i.e., "don't change things in a
way that can't be detected from userspace")?

Anyway, with the nits below:

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>

> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 6b2407e12060..7c82136979bf 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c

[...]

> +	/* attempt loading freplace trying to use custom BTF */
> +	memset(&attr, 0, bpf_load_attr_sz);
> +	attr.prog_type = BPF_PROG_TYPE_TRACING;
> +	attr.expected_attach_type = BPF_TRACE_FENTRY;

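This comment also seems to disagree with the code it's commenting on?
It talks about freplace and custom BTF, but the code sets up a
BPF_PROG_TYPE_TRACING program with BPF_TRACE_FENTRY as the expected
attach type.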

[...]

> +static bool memlock_bumped;
> +static rlim_t memlock_rlim_max = RLIM_INFINITY;
> +
> +int libbpf_set_memlock_rlim_max(size_t memlock_max)
> +{
> +	if (memlock_bumped)
> +		return libbpf_err(-EBUSY);
> +
> +	memlock_rlim_max = memlock_max;
> +	return 0;
> +}
> +
> +int bump_rlimit_memlock(void)
> +{
> +	struct rlimit rlim;
> +
> +	/* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
> +	if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
> +		return 0;
> +
> +	/* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
> +	if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
> +		return 0;
> +
> +	memlock_bumped = true;
> +
> +	/* zero memlock_rlim_max disables auto-bumping RLIMIT_MEMLOCK */
> +	if (memlock_rlim_max == 0)
> +		return 0;
> +
> +	rlim.rlim_cur = rlim.rlim_max = memlock_rlim_max;
> +	if (setrlimit(RLIMIT_MEMLOCK, &rlim))
> +		return -errno;
> +
> +	return 0;
> +}
> +

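"rlim_max" seems to imply this will only ever increase the limit, but if
I'm reading the code correctly, setrlimit() is called unconditionally with
rlim_cur = rlim_max = memlock_rlim_max, so it could actually end up
lowering the effective limit when the current one is already higher?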
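
To illustrate the concern, a raise-only variant would read the current
limit first and skip setrlimit() whenever that limit is already high
enough. The following is only a rough sketch under that assumption; the
helper names rlim_ge() and raise_memlock_to_at_least() are hypothetical
and are not part of the patch or of libbpf:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/resource.h>

/* Hypothetical helper: true if limit 'a' is at least as large as 'b',
 * treating RLIM_INFINITY as larger than any finite value.
 */
static bool rlim_ge(rlim_t a, rlim_t b)
{
	if (a == RLIM_INFINITY)
		return true;
	if (b == RLIM_INFINITY)
		return false;
	return a >= b;
}

/* Hypothetical helper: make sure RLIMIT_MEMLOCK is at least 'wanted',
 * but never lower a soft or hard limit that is already higher.
 * Returns 0 on success, negative errno on failure.
 */
static int raise_memlock_to_at_least(rlim_t wanted)
{
	struct rlimit rlim;

	if (getrlimit(RLIMIT_MEMLOCK, &rlim))
		return -errno;

	/* soft limit already sufficient: leave both limits untouched */
	if (rlim_ge(rlim.rlim_cur, wanted))
		return 0;

	rlim.rlim_cur = wanted;
	/* only touch the hard limit if it is below what we need */
	if (!rlim_ge(rlim.rlim_max, wanted))
		rlim.rlim_max = wanted;

	if (setrlimit(RLIMIT_MEMLOCK, &rlim))
		return -errno;

	return 0;
}
```

Note that raising the hard limit still requires CAP_SYS_RESOURCE, so an
unprivileged caller would get -EPERM whenever 'wanted' exceeds the
current hard limit.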

-Toke
