Netdev List
 help / color / mirror / Atom feed
From: Hyunwoo Kim <imv4bel@gmail.com>
To: Eric Dumazet <edumazet@google.com>
Cc: dsahern@kernel.org, idosch@nvidia.com, davem@davemloft.net,
	kuba@kernel.org, pabeni@redhat.com, horms@kernel.org,
	netdev@vger.kernel.org, imv4bel@gmail.com
Subject: Re: [PATCH net] inet: frags: fix use-after-free caused by the fqdir_pre_exit() flush
Date: Mon, 1 Jun 2026 19:49:21 +0900	[thread overview]
Message-ID: <ah1jsYSmm2PzZ3V1@v4bel> (raw)
In-Reply-To: <CANn89i+q8uKxsjCCe6Gz-YcHTwGZQsDWRKNO2FP=+0Z4S7uQCg@mail.gmail.com>

On Mon, Jun 01, 2026 at 02:56:37AM -0700, Eric Dumazet wrote:
> On Mon, Jun 1, 2026 at 2:37 AM Hyunwoo Kim <imv4bel@gmail.com> wrote:
> >
> > On netns teardown, fqdir_pre_exit() walks the fqdir rhashtable and
> > flushes every fragment queue that is not yet complete using
> > inet_frag_queue_flush(). That helper frees all the skbs queued on the
> > fragment queue but does not set INET_FRAG_COMPLETE, and leaves
> > q->fragments_tail and q->last_run_head pointing at the freed skbs.
> > The queue itself stays in the rhashtable.
> >
> > fqdir_pre_exit() first lowers high_thresh to 0 to stop new queue lookups,
> > but it cannot stop a fragment that already obtained the queue through
> > inet_frag_find() earlier and stalled just before taking the queue lock.
> > Once that fragment resumes after the flush and takes the queue lock,
> > it passes the INET_FRAG_COMPLETE check and then dereferences the freed
> > fragments_tail. inet_frag_queue_insert() reads FRAG_CB() and ->len of
> > that pointer and, on the append path, writes ->next_frag, causing a
> > slab use-after-free. IPv6, nf_conntrack_reasm6 and 6lowpan reassembly
> > share the same flush path and are affected as well.
> >
> > Mark the queue complete and reset its remaining pointers under the same
> > lock right after the flush. With INET_FRAG_COMPLETE set, the insert in
> > each reassembly path bails out at its check as soon as it takes the
> > queue lock and no longer accesses the freed fragments_tail.
> >
> > Fixes: 006a5035b495 ("inet: frags: flush pending skbs in fqdir_pre_exit()")
> > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > ---
> >  net/ipv4/inet_fragment.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
> > index 393770920abd..d532f6182c8a 100644
> > --- a/net/ipv4/inet_fragment.c
> > +++ b/net/ipv4/inet_fragment.c
> > @@ -243,8 +243,13 @@ void fqdir_pre_exit(struct fqdir *fqdir)
> >                         continue;
> >                 }
> >                 spin_lock_bh(&fq->lock);
> > -               if (!(fq->flags & INET_FRAG_COMPLETE))
> > +               if (!(fq->flags & INET_FRAG_COMPLETE)) {
> >                         inet_frag_queue_flush(fq, 0);
> > +                       fq->flags |= INET_FRAG_COMPLETE;
> > +                       fq->rb_fragments = RB_ROOT;
> > +                       fq->fragments_tail = NULL;
> > +                       fq->last_run_head = NULL;
> > +               }
> 
> 
> Any reason this is not done from inet_frag_queue_flush() so that we can
> remove the related code from ip_frag_reinit()?

I looked at the callers and agree that doing this in 
inet_frag_queue_flush() is the right direction.

> 
> diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
> index 86b100694659ee51292625216113f9411b98a351..6c5f373e55d3a39a581a6364599d911782469f77
> 100644
> --- a/net/ipv4/inet_fragment.c
> +++ b/net/ipv4/inet_fragment.c
> @@ -326,6 +326,10 @@ void inet_frag_queue_flush(struct inet_frag_queue *q,
>         reason = reason ?: SKB_DROP_REASON_FRAG_REASM_TIMEOUT;
>         sum = inet_frag_rbtree_purge(&q->rb_fragments, reason);
>         sub_frag_mem_limit(q->fqdir, sum);
> +       q->flags |= INET_FRAG_COMPLETE;

While testing, though, I found that setting INET_FRAG_COMPLETE there 
leaks the inet_frag_queue. A queue flushed by fqdir_pre_exit() then 
reaches inet_frags_free_cb() with INET_FRAG_COMPLETE set but 
INET_FRAG_HASH_DEAD clear, so neither branch there drops its hash 
reference. So the INET_FRAG_COMPLETE assignment should be dropped.

I'll send v2 after 24 hours.


Best regards,
Hyunwoo Kim

> +       q->rb_fragments = RB_ROOT;
> +       q->fragments_tail = NULL;
> +       q->last_run_head = NULL;
>  }
>  EXPORT_SYMBOL(inet_frag_queue_flush);
> 
> diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
> index 56b0f738d2f27b6b4c4b55f5ca9368305ce1eb4f..c790d2f494870e1debd7e73b2d67df017a29f8a8
> 100644
> --- a/net/ipv4/ip_fragment.c
> +++ b/net/ipv4/ip_fragment.c
> @@ -250,9 +250,6 @@ static int ip_frag_reinit(struct ipq *qp)
>         qp->q.flags = 0;
>         qp->q.len = 0;
>         qp->q.meat = 0;
> -       qp->q.rb_fragments = RB_ROOT;
> -       qp->q.fragments_tail = NULL;
> -       qp->q.last_run_head = NULL;
>         qp->iif = 0;
>         qp->ecn = 0;

      reply	other threads:[~2026-06-01 10:49 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-01  9:37 [PATCH net] inet: frags: fix use-after-free caused by the fqdir_pre_exit() flush Hyunwoo Kim
2026-06-01  9:56 ` Eric Dumazet
2026-06-01 10:49   ` Hyunwoo Kim [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ah1jsYSmm2PzZ3V1@v4bel \
    --to=imv4bel@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=idosch@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox