From: Joel Fernandes <joel@joelfernandes.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Paasch <cpaasch@apple.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, MPTCP Upstream <mptcp@lists.linux.dev>,
rcu@vger.kernel.org
Subject: Re: kmemleak handling of kfree_rcu
Date: Tue, 5 Sep 2023 11:17:25 +0000 [thread overview]
Message-ID: <20230905111725.GA3737639@google.com> (raw)
In-Reply-To: <ZPZKsJUPVeHCsLQB@arm.com>
On Mon, Sep 04, 2023 at 10:22:56PM +0100, Catalin Marinas wrote:
> Hi Christoph,
>
> Please try not to send html, many servers block such emails.
>
> Also adding the RCU list.
>
> On Wed, Aug 30, 2023 at 09:37:23AM -0700, Christoph Paasch wrote:
> > for the MPTCP upstream project, we are running syzkaller [1] continuously to
> > qualify our kernel changes.
> >
> > We found one issue with kmemleak and its handling of kfree_rcu.
> >
> > Specifically, it looks like kmemleak falsely reports memory-leaks when the
> > object is being freed by kfree_rcu after a certain grace-period.
> >
> > For example, https://github.com/multipath-tcp/mptcp_net-next/issues/398#
> > issuecomment-1584819482 shows how the syzkaller program reliably produces a
> > kmemleak report, although the object is not leaked (we confirmed that by simply
> > increasing MSECS_MIN_AGE to something larger than the grace-period).
>
> Not sure which RCU variant you are using but most likely the false
> positives are caused by the original reference to the object being lost
> and the pointer added to a new location that kmemleak does not track
> (e.g. bnode->records[] in the tree-based variant).
>
> A quick attempt (untested, not even compiled):
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
I am not sure if that works. Correct me if I'm wrong but the issue is that
the allocated pointer being RCU-freed is no longer reachable by kmemleak (it
is scanned but not reachable), however it might still be reachable via an
RCU reader. In such a situation, it is a false-positive.
The correct fix then should probably be to mark the object as
kmemleak_not_leak() until a grace period elapses. This will cause the object
to not be reported but still be scanned until eventually the lower layers
will remove the object from kmemleak-tracking after the grace period. Per the
docs also, that API is used to prevent false-positives.
Instead what the below diff appears to do is to mark the bnode cache as a
kmemleak object itself, which smells a bit wrong. The bnode is not an
allocated object in the traditional sense, it is simple an internal data
structure. That may not solve the issue as the false-positive unreachable
object is still unreachable.
Or did I misunderstand how kmemleak works? (Quite possible).
thanks,
- Joel
> index 1449cb69a0e0..681a1eb7560a 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -53,6 +53,7 @@
> #include <linux/sysrq.h>
> #include <linux/kprobes.h>
> #include <linux/gfp.h>
> +#include <linux/kmemleak.h>
> #include <linux/oom.h>
> #include <linux/smpboot.h>
> #include <linux/jiffies.h>
> @@ -2934,6 +2935,7 @@ drain_page_cache(struct kfree_rcu_cpu *krcp)
>
> llist_for_each_safe(pos, n, page_list) {
> free_page((unsigned long)pos);
> + kmemleak_free(pos);
> freed++;
> }
>
> @@ -2962,8 +2964,16 @@ kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp,
> rcu_state.name, bnode->records[i], 0);
>
> vfree(bnode->records[i]);
> + /* avoid false negatives */
> + kmemleak_erase(&bnode->records[i]);
> }
> }
> + /*
> + * Avoid kmemleak false negatives when such pointers are later
> + * re-allocated.
> + */
> + for (i = 0; i < bnode->nr_records; i++)
> + kmemleak_erase(&bnode->records[i]);
> rcu_lock_release(&rcu_callback_map);
> }
>
> @@ -2972,8 +2982,10 @@ kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp,
> bnode = NULL;
> raw_spin_unlock_irqrestore(&krcp->lock, flags);
>
> - if (bnode)
> + if (bnode) {
> free_page((unsigned long) bnode);
> + kmemleak_free(bnode);
> + }
>
> cond_resched_tasks_rcu_qs();
> }
> @@ -3241,6 +3253,12 @@ static void fill_page_cache_func(struct work_struct *work)
> free_page((unsigned long) bnode);
> break;
> }
> +
> + /*
> + * Scan the bnode->records[] array until the objects are
> + * actually freed.
> + */
> + kmemleak_alloc(bnode, PAGE_SIZE, 0, GFP_KERNEL);
> }
>
> atomic_set(&krcp->work_in_progress, 0);
> @@ -3308,6 +3326,11 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> // scenarios.
> bnode = (struct kvfree_rcu_bulk_data *)
> __get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
> + /*
> + * Scan the bnode->records[] array until the objects are
> + * actually freed.
> + */
> + kmemleak_alloc(bnode, PAGE_SIZE, 0, GFP_KERNEL);
> raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
> }
>
>
> --
> Catalin
next prev parent reply other threads:[~2023-09-05 11:17 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-30 16:37 kmemleak handling of kfree_rcu Christoph Paasch
2023-09-04 21:22 ` Catalin Marinas
2023-09-05 11:17 ` Joel Fernandes [this message]
2023-09-05 14:41 ` Catalin Marinas
2023-09-06 14:35 ` Joel Fernandes
2023-09-06 17:15 ` Catalin Marinas
2023-09-06 19:11 ` Paul E. McKenney
2023-09-06 21:37 ` Catalin Marinas
2023-09-06 22:02 ` Paul E. McKenney
2023-09-06 23:10 ` Joel Fernandes
2023-09-12 13:15 ` Matthieu Baerts
2023-09-12 13:32 ` Joel Fernandes
2023-09-05 21:22 ` Christoph Paasch
2023-09-06 17:21 ` Catalin Marinas
2023-09-10 23:10 ` Joel Fernandes
2023-09-11 17:41 ` Christoph Paasch
2023-09-12 9:52 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230905111725.GA3737639@google.com \
--to=joel@joelfernandes.org \
--cc=akpm@linux-foundation.org \
--cc=catalin.marinas@arm.com \
--cc=cpaasch@apple.com \
--cc=linux-mm@kvack.org \
--cc=mptcp@lists.linux.dev \
--cc=rcu@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.