public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: buermarc <buermarc@googlemail.com>
To: joel.granados@kernel.org
Cc: buermarc@googlemail.com, davem@davemloft.net,
	elias.rw2@gmail.com, kees@kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	opurdila@ixiacom.com, ps.report@gmx.net
Subject: Re: [PATCH v3] sysctl: fix check against uninitialized variable in proc_do_large_bitmap
Date: Tue, 24 Mar 2026 00:46:23 +0100	[thread overview]
Message-ID: <20260323234623.689612-1-buermarc@googlemail.com> (raw)
In-Reply-To: <wcqe7kk2jdca7otlp6iwbp27ud6d3ynzopr7gs66gqn4xeskbr@cdyrjkouqmem>

Hi Joel,

Thanks for the question.

On Mon, 23 Mar 2026 14:53:51 +0100, Joel Granados wrote:

> Here I have a question:
> 
> How is it possible that the buffer in proc_do_large_bitmap is not
> terminated with a '\0' when writing?
> 
> The proc_do_large_bitmap gets called from proc_sys_call_handler and
> before running the table->proc_handler call, the end of the buffer gets
> set with '\0'.

This is correct, the buffer will be terminated with a '\0'.

> trailing char is '\0'. Which, in turn, means that 'c' will always get a
> value.

I think this does not hold true. If the buffers is parsed till the end
len can be of the same value as size in proc_get_long(), which means we
do not set tr.

I will go through my current understanding for a
write(3</proc/sys/net/ipv4/ip_local_reserved_ports>, "123", 3) = 3
syscall. Hopefully that will make things a bit clearer, or show where I
am mistaken.

In proc_sys_call_handler() we call the function (in our case
proc_do_large_bitmap()) with the count based on the result of
iov_iter_count(), while ensuring that the buffer contains a trailing
'\0' char. So the buffer will contain '123\0', that is correct. The
passed count will does not change it will still be 3. I used bpftrace to
check my assumption as I don't know the iov_iter struct all that good
and it holds up.

    The trace code:
    sudo bpftrace -e '
    kprobe:proc_sys_call_handler
    {
        printf("Count: %lu\n", ((struct iov_iter *)arg1)->count);
    }

    kprobe:proc_do_large_bitmap
    {
        printf("Count: %lu\n", *(uint8 *)arg3);
    }'
    Attached 2 probes
    Count: 3
    Count: 3

Where the output is triggered by:
write(3</proc/sys/net/ipv4/ip_local_reserved_ports>, "123", 3) = 3

proc_do_large_bitmap() takes the given count as lenp and sets left to
the value lenp points to. left is then passed to proc_get_long(). This
means size in proc_get_long() is 3.

proc_get_long() calls strtoul_lenient(), which parses until it hits '\0'
and lets p point to that. See in proc_get_long():

	if (strtoul_lenient(p, &p, 0, val))
		return -EINVAL;

	len = p - tmp;

Setting len to 'p - tmp' means length is 3 in our example. In our case
size is also 3. As such:

    (len < *size) => (3 < 3) => false.

This means following branches will not be take. See in proc_get_long():

	if (len < *size && perm_tr_len && !memchr(perm_tr, *p, perm_tr_len))
		return -EINVAL;

	if (tr && (len < *size))
		*tr = *p;

This means we do not set tr to what p points to. Also we do not do the
!memchr check. This branch would return -EINVAL if len < size holds true
and we do not contain an expected trailing character.

We can force the !memchr branch to return -EINVAL, by using
write(fd,'123\0',4). Here len is again 3, as we only read to '\0', but
the size is 4. This will, as expected, now return an -EINVAL, because
for the first proc_get_long() call we do not have '\0' in the set of
expected trailing characters.

See tr_a, and the first proc_get_long() call in proc_do_large_bitmap():

	char tr_a[] = { '-', ',', '\n' }, tr_b[] = { ',', '\n', 0 }, c;

        ...

        /* In case we stop parsing mid-number, we can reset */
        saved_left = left;
        err = proc_get_long(&p, &left, &val_a, &neg, tr_a,
                             sizeof(tr_a), &c);

For the second proc_get_long() call we use tr_b, and here using '\0' is
valid. The resulting behavior feels a bit unintuitive to me. I can
understand why based on the code and it likely has a some reason that I
just haven't figured out yet.

For example the following works:
write(3</proc/sys/net/ipv4/ip_local_reserved_ports>, "123-234\0", 8) = 8

But if we have same '\0' char at the end for the first part it fails:
write(3</proc/sys/net/ipv4/ip_local_reserved_ports>, "123\0", 4) = -1

> I'll repeat the comments from the other reviewers. Careful with these
> separators. When I used b4 to bring all this into my testing env, All
> your trailers were ignored.

I'll make sure to apply test output of b4 next time and check if
everything works locally as expected. Sorry for the mistake.

> *Unless*, the buffer is changed in BPF_CGROUP_RUN_PROG_SYSCTL by an
> external (to the kernel) program.
> 
> Are you running a BPF program on this hook? Do you see the same behavior
> if you turn off BPF?

I tried to see if I can access what c is on the stack to check if I
could see the '-' ascii value. I think I used the following snippet:

    # Print the specific byte where 'c' could be (e.g., RBP-1)
    # On x86_64, local chars could be at the end of the stack frame?
    sudo bpftrace -e 'kprobe:proc_do_large_bitmap {
        $c_val = *(u8 *)(reg("bp") - 1);
        printf("PID %d: c starts as 0x%02x (%c)\n", pid, $c_val, $c_val);
    }'

But I must admit this was just a shot into the blue. Yet, on the
affected host the -EINVAL results disappeared as long as the probe was
attached. My assumption was that the eBPF trace changed which value c
pointed to on the stack.

Best Regards,
Marc

  reply	other threads:[~2026-03-23 23:46 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-19 22:50 [PATCH v3] sysctl: fix check against uninitialized variable in proc_do_large_bitmap Marc Buerg
2026-03-20 18:30 ` Kees Cook
2026-03-21 12:05   ` Peter Seiderer
2026-03-22 10:13     ` Marc Buerg
2026-03-23 13:53 ` Joel Granados
2026-03-23 23:46   ` buermarc [this message]
2026-03-24  7:44     ` Joel Granados
2026-03-24 22:36       ` buermarc
2026-03-25 10:07         ` Joel Granados
2026-03-25 20:48           ` Marc Buerg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260323234623.689612-1-buermarc@googlemail.com \
    --to=buermarc@googlemail.com \
    --cc=davem@davemloft.net \
    --cc=elias.rw2@gmail.com \
    --cc=joel.granados@kernel.org \
    --cc=kees@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=opurdila@ixiacom.com \
    --cc=ps.report@gmx.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox