* [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
@ 2017-05-20 23:22 Liping Zhang
2017-05-21 0:00 ` Florian Westphal
2017-05-24 10:24 ` Pablo Neira Ayuso
0 siblings, 2 replies; 8+ messages in thread
From: Liping Zhang @ 2017-05-20 23:22 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, Liping Zhang
From: Liping Zhang <zlpnobody@gmail.com>
If the user adjusts nf_conntrack_htable_size during a ct dump
operation, we may invoke nf_ct_put twice for the same ct, i.e. the
"last" ct. As a result, the ct is freed while it is still linked in
the hash buckets.
The problem is easy to reproduce with the following commands:
  # while : ; do
      echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets
    done
  # while : ; do
      conntrack -L
    done
  # iperf -s 127.0.0.1 &
  # iperf -c 127.0.0.1 -P 60 -t 36000
After a while, the system will hang like this:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184]
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382]
...
So if, at the end of the dump, cb->args[1] is still equal to "last",
a hash resize has happened, and we can set cb->args[1] to 0 to fix
the above issue.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
---
net/netfilter/nf_conntrack_netlink.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index dcf561b..3b449e0 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -888,8 +888,13 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 	}
 out:
 	local_bh_enable();
-	if (last)
+	if (last) {
+		/* nf ct hash resize happened, now clear the leftover. */
+		if ((struct nf_conn *)cb->args[1] == last)
+			cb->args[1] = 0;
+
 		nf_ct_put(last);
+	}
 
 	while (i) {
 		i--;
--
2.5.5
* Re: [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-20 23:22 [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize Liping Zhang
@ 2017-05-21 0:00 ` Florian Westphal
2017-05-21 0:59 ` Liping Zhang
2017-05-24 10:24 ` Pablo Neira Ayuso
1 sibling, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2017-05-21 0:00 UTC (permalink / raw)
To: Liping Zhang; +Cc: pablo, netfilter-devel, Liping Zhang
Liping Zhang <zlpnobody@163.com> wrote:
> From: Liping Zhang <zlpnobody@gmail.com>
>
> If nf_conntrack_htable_size was adjusted by the user during the ct
> dump operation, we may invoke nf_ct_put twice for the same ct, i.e.
> the "last" ct. This will cause the ct will be freed but still linked
> in hash buckets.
>
> It's very easy to reproduce the problem by the following commands:
> # while : ; do
> echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets
> done
> # while : ; do
> conntrack -L
> done
> # iperf -s 127.0.0.1 &
> # iperf -c 127.0.0.1 -P 60 -t 36000
>
> After a while, the system will hang like this:
> NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184]
> NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382]
> ...
>
> So at last if we find cb->args[1] is equal to "last", this means hash
> resize happened, then we can set cb->args[1] to 0 to fix the above
> issue.
Yes, you're right. It seems this was added in
93bb0ceb75be2fdfa9fc0dd1fb522d9ada515d9c (it adds the 'goto out').
Your patch looks correct.
However, why do we bump the refcnt of 'last' in the first place?
It's only the continuation marker, i.e. it's expected to reside
in the hash slot at cb->args[0], but after a rehash this might not
be true either.
I think we should simplify this: just take the verbatim address
and clear it right at the start of ctnetlink_dump_table, i.e.
	unsigned long last = cb->args[1];

	cb->args[1] = 0;
	for (; cb->args[0] < nf_conntrack_htable_size; cb->args[0]++) {
		...
		hlist_nulls_for_each_entry ... {
			...
			if (last) {
				/* skip entries until the continuation marker is found */
				if (last != (unsigned long)ct)
					continue;
				last = 0;
			}
			...
			dump();
		}
		last = 0; /* reset it, as it wasn't in args[0] slot */
	}
Do you see any problem with that?
[ It might be better to take your patch for nf- though and do
this no-refcnt thing in nf-next ... ]
* Re: [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-21 0:00 ` Florian Westphal
@ 2017-05-21 0:59 ` Liping Zhang
2017-05-23 21:34 ` Pablo Neira Ayuso
0 siblings, 1 reply; 8+ messages in thread
From: Liping Zhang @ 2017-05-21 0:59 UTC (permalink / raw)
To: Florian Westphal
Cc: Liping Zhang, Pablo Neira Ayuso, Netfilter Developer Mailing List
Hi Florian,
2017-05-21 8:00 GMT+08:00 Florian Westphal <fw@strlen.de>:
[...]
> Yes, you're right, seems this was added in
> 93bb0ceb75be2fdfa9fc0dd1fb522d9ada515d9c (it adds the 'goto out').
I added some trace logs: the issue happens when the hash size is
reduced, for example from 60000 to 500.
Actually, hitting 'goto out' is not easy, so I think the issue has
existed for a very long time. Commit 89f2e21883b5 ("[NETFILTER]:
ctnetlink: change table dumping not to require an unique ID") may be
to blame for it.
> Your patch looks correct.
>
> However, why do we bump refcnt of 'last' in the first place?
>
> Its only the continuation marker, i.e. its expected to reside
> in the hash slot at cb->args[0], but after rehash this might not
> be true either.
>
> I think we should simplify this, just take the verbatim address,
> and clear it right at start of ctnetlink_dump_table, i.e.
>
> unsigned long last = cb->args[1];
> cb->args[1] = 0;
>
> for (; cb->args[0] < nf_conntrack_htable_size; cb->args[0]++) {
> ...
> hlist_nulls_for_each_entry ... {
> ...
> if (last) {
> if (last != (unsigned long)ct))
> cont;
> last = 0;
> }
> ...
> dump();
> }
> last = 0; /* reset it, as it wasn't in args[0] slot */
> }
>
> Do you see any problem with that?
I think this is better; it makes the code cleaner.
We can clean up ctnetlink_exp_ct_dump_table as well.
>
> [ It might be better to take your patch for nf- though and do
> this no-refcnt thing in nf-next ... ]
* Re: [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-21 0:59 ` Liping Zhang
@ 2017-05-23 21:34 ` Pablo Neira Ayuso
2017-05-23 22:28 ` Florian Westphal
0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2017-05-23 21:34 UTC (permalink / raw)
To: Liping Zhang
Cc: Florian Westphal, Liping Zhang, Netfilter Developer Mailing List
On Sun, May 21, 2017 at 08:59:45AM +0800, Liping Zhang wrote:
> Hi Florian,
>
> 2017-05-21 8:00 GMT+08:00 Florian Westphal <fw@strlen.de>:
> [...]
> > Yes, you're right, seems this was added in
> > 93bb0ceb75be2fdfa9fc0dd1fb522d9ada515d9c (it adds the 'goto out').
>
> I added some trace logs, and when the hash size reduced, for example,
> from 60000 to 500, then the issue would happen.
>
> Actually, hitting 'goto out' is not easy, so the issue exists for a very long
> time. Maybe commit 89f2e21883b5("[NETFILTER]: ctnetlink: change
> table dumping not to require an unique ID") is to blame for it.
>
> > Your patch looks correct.
> >
> > However, why do we bump refcnt of 'last' in the first place?
> >
> > Its only the continuation marker, i.e. its expected to reside
> > in the hash slot at cb->args[0], but after rehash this might not
> > be true either.
> >
> > I think we should simplify this, just take the verbatim address,
> > and clear it right at start of ctnetlink_dump_table, i.e.
> >
> > unsigned long last = cb->args[1];
> > cb->args[1] = 0;
> >
> > for (; cb->args[0] < nf_conntrack_htable_size; cb->args[0]++) {
> > ...
> > hlist_nulls_for_each_entry ... {
> > ...
> > if (last) {
> > if (last != (unsigned long)ct))
> > cont;
> > last = 0;
> > }
> > ...
> > dump();
> > }
> > last = 0; /* reset it, as it wasn't in args[0] slot */
> > }
> >
> > Do you see any problem with that?
>
> I think this will be better, this will make code more clean.
> Also we can clean up the ctnetlink_exp_ct_dump_table too.
@Florian, no objection then if I place this into nf.git?
I will append the Fixes: tag:
Fixes: 89f2e21883b5 ("[NETFILTER]: ctnetlink: change table dumping not to require an unique ID")
* Re: [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-23 21:34 ` Pablo Neira Ayuso
@ 2017-05-23 22:28 ` Florian Westphal
2017-05-24 0:52 ` Liping Zhang
0 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2017-05-23 22:28 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Liping Zhang, Florian Westphal, Liping Zhang,
Netfilter Developer Mailing List
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Sun, May 21, 2017 at 08:59:45AM +0800, Liping Zhang wrote:
> > Hi Florian,
> >
> > 2017-05-21 8:00 GMT+08:00 Florian Westphal <fw@strlen.de>:
> > [...]
> > > Yes, you're right, seems this was added in
> > > 93bb0ceb75be2fdfa9fc0dd1fb522d9ada515d9c (it adds the 'goto out').
> >
> > I added some trace logs, and when the hash size reduced, for example,
> > from 60000 to 500, then the issue would happen.
> >
> > Actually, hitting 'goto out' is not easy, so the issue exists for a very long
> > time. Maybe commit 89f2e21883b5("[NETFILTER]: ctnetlink: change
> > table dumping not to require an unique ID") is to blame for it.
> >
> > > Your patch looks correct.
> > >
> > > However, why do we bump refcnt of 'last' in the first place?
> > >
> > > Its only the continuation marker, i.e. its expected to reside
> > > in the hash slot at cb->args[0], but after rehash this might not
> > > be true either.
> > >
> > > I think we should simplify this, just take the verbatim address,
> > > and clear it right at start of ctnetlink_dump_table, i.e.
> > >
> > > unsigned long last = cb->args[1];
> > > cb->args[1] = 0;
> > >
> > > for (; cb->args[0] < nf_conntrack_htable_size; cb->args[0]++) {
> > > ...
> > > hlist_nulls_for_each_entry ... {
> > > ...
> > > if (last) {
> > > if (last != (unsigned long)ct))
> > > cont;
> > > last = 0;
> > > }
> > > ...
> > > dump();
> > > }
> > > last = 0; /* reset it, as it wasn't in args[0] slot */
> > > }
> > >
> > > Do you see any problem with that?
> >
> > I think this will be better, this will make code more clean.
> > Also we can clean up the ctnetlink_exp_ct_dump_table too.
>
> @Florian, no objection then if I place this into nf.git?
No objection, thanks!
> I will append the Fixes: tag:
>
> Fixes: 89f2e21883b5 ("[NETFILTER]: ctnetlink: change table dumping not to require an unique ID")
That commit looks fine to me; it seems to make sure to put
"last" only once in all cases.
93bb0ceb75be2fdfa9fc0dd1, however, adds a check on cb->args[0]. If
that is hit, it will do a put() on last, and then the "done" netlink
callback will do another put operation on cb->args[1] (i.e., last).
* Re: [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-23 22:28 ` Florian Westphal
@ 2017-05-24 0:52 ` Liping Zhang
2017-05-24 6:22 ` Florian Westphal
0 siblings, 1 reply; 8+ messages in thread
From: Liping Zhang @ 2017-05-24 0:52 UTC (permalink / raw)
To: Florian Westphal
Cc: Pablo Neira Ayuso, Liping Zhang, Netfilter Developer Mailing List
2017-05-24 6:28 GMT+08:00 Florian Westphal <fw@strlen.de>:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
[...]
>> I will append the Fixes: tag:
>>
>> Fixes: 89f2e21883b5 ("[NETFILTER]: ctnetlink: change table dumping not to require an unique ID")
>
> That commit looks fine to me, it seems to make sure to put
> "last" only once in all cases.
>
> 93bb0ceb75be2fdfa9fc0dd1 however adds a check on cb->args[0], and if
> that is hit it will do a put() on last, and then, the "done" netlink
> callback will do another put operation on cb->args[1] (i.e., last).
After a closer look, I think this patch should carry:
Fixes: d205dc40798d ("[NETFILTER]: ctnetlink: fix deadlock in table dumping")
Since that commit, when the hash size is reduced, for example
from 60000 to 600, we may put the "last" ct twice, because we may
fail to enter the iteration that clears cb->args[1]. So:
1. nf_ct_put(last) is called by ctnetlink_dump_table, but cb->args[1]
still points to "last"
2. nf_ct_put((struct nf_conn *)cb->args[1]) is called by ctnetlink_done
* Re: [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-24 0:52 ` Liping Zhang
@ 2017-05-24 6:22 ` Florian Westphal
0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2017-05-24 6:22 UTC (permalink / raw)
To: Liping Zhang
Cc: Florian Westphal, Pablo Neira Ayuso, Liping Zhang,
Netfilter Developer Mailing List
Liping Zhang <zlpnobody@gmail.com> wrote:
> 2017-05-24 6:28 GMT+08:00 Florian Westphal <fw@strlen.de>:
> > Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> [...]
> >> I will append the Fixes: tag:
> >>
> >> Fixes: 89f2e21883b5 ("[NETFILTER]: ctnetlink: change table dumping not to require an unique ID")
> >
> > That commit looks fine to me, it seems to make sure to put
> > "last" only once in all cases.
> >
> > 93bb0ceb75be2fdfa9fc0dd1 however adds a check on cb->args[0], and if
> > that is hit it will do a put() on last, and then, the "done" netlink
> > callback will do another put operation on cb->args[1] (i.e., last).
>
> After I have a closer look, I think this patch should add:
>
> Fixes: d205dc40798d ("[NETFILTER]: ctnetlink: fix deadlock in table dumping")
>
> After this commit, when the hash size was reduced, for example,
> from 60000 to 600, then we may put the "last" ct twice, as we may
> fail to go into the iteration and clear the cb->args[1], so:
>
> 1. nf_ct_put(last) by ctnetlink_dump_table, but cb->args[1] still
> point to the "last"
> 2. nf_ct_put((struct nf_conn *)cb->args[1]) by ctnetlink_done
You are right.
* Re: [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-20 23:22 [PATCH nf] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize Liping Zhang
2017-05-21 0:00 ` Florian Westphal
@ 2017-05-24 10:24 ` Pablo Neira Ayuso
1 sibling, 0 replies; 8+ messages in thread
From: Pablo Neira Ayuso @ 2017-05-24 10:24 UTC (permalink / raw)
To: Liping Zhang; +Cc: netfilter-devel, Liping Zhang
On Sun, May 21, 2017 at 07:22:49AM +0800, Liping Zhang wrote:
> From: Liping Zhang <zlpnobody@gmail.com>
>
> If nf_conntrack_htable_size was adjusted by the user during the ct
> dump operation, we may invoke nf_ct_put twice for the same ct, i.e.
> the "last" ct. This will cause the ct will be freed but still linked
> in hash buckets.
>
> It's very easy to reproduce the problem by the following commands:
> # while : ; do
> echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets
> done
> # while : ; do
> conntrack -L
> done
> # iperf -s 127.0.0.1 &
> # iperf -c 127.0.0.1 -P 60 -t 36000
>
> After a while, the system will hang like this:
> NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184]
> NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382]
> ...
>
> So at last if we find cb->args[1] is equal to "last", this means hash
> resize happened, then we can set cb->args[1] to 0 to fix the above
> issue.
Applied, thanks.
I have added:
Fixes: d205dc40798d ("[NETFILTER]: ctnetlink: fix deadlock in table dumping")