BPF List
 help / color / mirror / Atom feed
From: Martin Kelly <martin.kelly@crowdstrike.com>
To: <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>
Subject: Memory corruption in out_batch parameter of batch lookup APIs
Date: Fri, 16 Feb 2024 16:24:49 -0800	[thread overview]
Message-ID: <1f98a10d-9fd7-4a0c-baa8-be31c1e78fa8@crowdstrike.com> (raw)

Hi, I noticed there's a subtlety to to the batch APIs 
(bpf_map_lookup_batch and bpf_map_lookup_and_delete_batch) that can lead 
to bugs if callers are not careful, and I'm wondering about the best way 
to document or address it.

Specifically, the size of the data pointed to by in_batch/out_batch is 
not clear, and if it's too small, the caller can see memory/stack 
corruption. The function documentation isn't super clear about this, 
calling in_batch "address of the first element in batch to read", so a 
caller might reasonably assume that a pointer size is fine. However, the 
right size actually depends on the map type.

For hash and array maps, out_batch will be u32 (as the parameter is used 
as an index). But for LPM trie, it will be the size of the key (in the 
case of LPM trie, I think that's 260 bytes). If a caller passes a 
pointer to memory smaller the key size, the kernel will overwrite past 
that memory and corrupt the stack (or wherever out_batch points). This 
is because of the copy_to_user(uobatch, prev_key, map->key_size) at the 
end of generic_map_lookup_batch.

It seems to me that we could add documentation to these functions 
indicating that out_batch should be able to hold at least one key to be 
safe. This is simple but overly strict (at the moment) for all map types 
other than LPM trie. However, if we specifically call out LPM trie as 
needing key-sized width while other map types need 4 bytes, then this 
documentation could easily become out-of-date as new map types are added.

We could alternatively add a statement like "out_batch should generally 
point to memory large enough to hold a single key, but for some map 
implementations a smaller type is possible". This gives more information 
but might be too vague for many API users, and it means future kernels 
could be tied to this implementation to avoid breaking users.

Any thoughts/preferences on how best to handle this? I'm happy to send a 
patch clarifying the documentation, but I'd like to get a general 
consensus on the best way to proceed first.

Thanks,

Martin


             reply	other threads:[~2024-02-17  0:51 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-17  0:24 Martin Kelly [this message]
2024-02-20  1:22 ` Memory corruption in out_batch parameter of batch lookup APIs Yonghong Song
2024-02-21  1:02   ` Martin Kelly

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1f98a10d-9fd7-4a0c-baa8-be31c1e78fa8@crowdstrike.com \
    --to=martin.kelly@crowdstrike.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox