From: Tzung-Bi Shih <tzungbi@kernel.org>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: kadlec@netfilter.org, fw@strlen.de,
netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
jiejiang@chromium.org, jasongustaman@chromium.org,
garrick@chromium.org
Subject: Re: [PATCH] netfilter: conntrack: fix wrong ct->timeout value
Date: Mon, 17 Apr 2023 11:41:41 +0800 [thread overview]
Message-ID: <ZDy/9WEIQRyIPSNo@google.com> (raw)
In-Reply-To: <ZDkK3ktVcBaykTVT@calendula>
On Fri, Apr 14, 2023 at 10:12:14AM +0200, Pablo Neira Ayuso wrote:
> On Fri, Apr 14, 2023 at 11:52:02AM +0800, Tzung-Bi Shih wrote:
> > On Thu, Apr 13, 2023 at 11:04:24AM +0200, Pablo Neira Ayuso wrote:
> > > On Thu, Apr 13, 2023 at 11:23:11AM +0800, Tzung-Bi Shih wrote:
> > > > nf_ct_expires() got called by userspace program. And the return
> > > > value (which means the remaining timeout) will be the parameter for
> > > > the next ctnetlink_change_timeout().
> > >
> > > Unconfirmed conntrack is owned by the packet that refers to it, it is
> > > not yet in the hashes. I don't see how concurrent access to the
> > > timeout might occur.
> > >
> > > Or are you referring to a different scenario that triggers the partial
> > > state?
> >
> > I think it isn't a concurrent access. We observed that userspace program
> > reads the remaining timeout and sets it back for unconfirmed conntrack.
> >
> > Conceptually, it looks like:
> > timeout = nf_ct_expires(...); (1)
> > ctnetlink_change_timeout(...timeout);
>
> How could this possibly happen?
>
> nf_ct_expires() is called from:
>
> - ctnetlink_dump_timeout(), this is netlink dump path, since:
>
> commit 1397af5bfd7d32b0cf2adb70a78c9a9e8f11d912
> Author: Florian Westphal <fw@strlen.de>
> Date: Mon Apr 11 13:01:18 2022 +0200
>
> netfilter: conntrack: remove the percpu dying list
>
> it is not possible to do any listing of unconfirmed, and that is fine.
>
> As I said, nfnetlink_queue operates with an unconfirmed conntrack with
> owns exclusively, which is not in the hashes.
I applied the following patches on top of v6.3-rc5[5].
As you suggested:
diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index 71d1269fe4d4..3384859a8921 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -89,7 +89,11 @@ static inline void __nf_ct_set_timeout(struct nf_conn *ct, u64 timeout)
{
if (timeout > INT_MAX)
timeout = INT_MAX;
- WRITE_ONCE(ct->timeout, nfct_time_stamp + (u32)timeout);
+
+ if (nf_ct_is_confirmed(ct))
+ WRITE_ONCE(ct->timeout, nfct_time_stamp + (u32)timeout);
+ else
+ ct->timeout = (u32)timeout;
}
And this patch for dumping the debug information:
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index bfc3aaa2c872..b4c016ff07e9 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -178,6 +178,11 @@ static int ctnetlink_dump_timeout(struct sk_buff *skb, const struct nf_conn *ct,
{
long timeout = nf_ct_expires(ct) / HZ;
+ if (!nf_ct_is_confirmed(ct)) {
+ dump_stack();
+ printk(KERN_ERR "===ct->timeout=%u===timeout=%ld\n", ct->timeout, timeout);
+ }
+
if (skip_zero && timeout == 0)
return 0;
And here is the trimmed dmesg:
===ct->timeout=30000===timeout=0
[...] 6.3.0-rc5-00675-ged0c7cbf5b2c [...]
Call Trace:
<TASK>
dump_stack_lvl
ctnetlink_dump_timeout
__ctnetlink_glue_build
ctnetlink_glue_build
__nfqnl_enqueue_packet
nf_queue
nf_hook_slow
ip_mc_output
? __pfx_ip_finish_output
ip_send_skb
? __pfx_dst_output
udp_send_skb
udp_sendmsg
? __pfx_ip_generic_getfrag
sock_sendmsg
As the dmesg showed, the unconfirmed conntrack did call into
ctnetlink_dump_timeout(). As a result, the value returned from
nf_ct_expires() is always 0 in the case (because the `ct->timeout` got
subtracted by current timestamp `nfct_time_stamp`[4]).
[4]: https://elixir.bootlin.com/linux/v6.3-rc6/source/include/net/netfilter/nf_conntrack.h#L296
[5]: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/merge/continuous/chromeos-kernelupstream-6.3-rc5
next prev parent reply other threads:[~2023-04-17 3:41 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-10 6:09 [PATCH] netfilter: conntrack: fix wrong ct->timeout value Tzung-Bi Shih
2023-04-10 8:33 ` Pablo Neira Ayuso
2023-04-10 9:31 ` Pablo Neira Ayuso
2023-04-10 9:59 ` Tzung-Bi Shih
2023-04-12 22:56 ` Pablo Neira Ayuso
2023-04-13 3:23 ` Tzung-Bi Shih
2023-04-13 9:04 ` Pablo Neira Ayuso
2023-04-14 3:52 ` Tzung-Bi Shih
2023-04-14 8:12 ` Pablo Neira Ayuso
2023-04-17 3:41 ` Tzung-Bi Shih [this message]
2023-04-18 8:17 ` Pablo Neira Ayuso
2023-04-19 5:20 ` Tzung-Bi Shih
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZDy/9WEIQRyIPSNo@google.com \
--to=tzungbi@kernel.org \
--cc=coreteam@netfilter.org \
--cc=fw@strlen.de \
--cc=garrick@chromium.org \
--cc=jasongustaman@chromium.org \
--cc=jiejiang@chromium.org \
--cc=kadlec@netfilter.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.