From mboxrd@z Thu Jan 1 00:00:00 1970 From: Roman Kapl Subject: [PATCH v3] net: sched: crash on blocks with goto chain action Date: Wed, 29 Nov 2017 20:20:14 +0100 Message-ID: <20171129192014.20026-1-code@rkapl.cz> Cc: xiyou.wangcong@gmail.com, jiri@resnulli.us, Roman Kapl To: netdev@vger.kernel.org Return-path: Received: from droplet.rkapl.cz ([46.101.253.207]:43776 "EHLO droplet.rkapl.cz" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751897AbdK2TUe (ORCPT ); Wed, 29 Nov 2017 14:20:34 -0500 Sender: netdev-owner@vger.kernel.org List-ID: tcf_block_put_ext has assumed that all filters (and thus their goto actions) are destroyed in RCU callback and so can not race with our list iteration. However, that is not true during netns cleanup (see tcf_exts_get_net comment). The assumption was broken by the patch series c7e460ce5572..623859ae06b8 ("Merge branch 'net-sched-race-fix'"). Prevent the user after free by holding all chains (except 0, that one is already held) as it was done before 822e86d997e4 ("net_sched: remove tcf_block_put_deferred()"). To reproduce, run the following in a netns and then delete the ns: ip link add dtest type dummy tc qdisc add dev dtest ingress tc filter add dev dtest chain 1 parent ffff: handle 1 prio 1 flower action goto chain 2 Fixes: 623859ae06b8 ("Merge branch 'net-sched-race-fix'") Signed-off-by: Roman Kapl --- v1 -> v2: Hold all chains instead of just the currently iterated one, the code should be more clear this way. v2 -> v3: Point out where the chains will be released. Blame the correct patch series (the one that broke the assumption). -- net/sched/cls_api.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 7d97f612c9b9..24a3593ed88a 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -336,7 +336,8 @@ static void tcf_block_put_final(struct work_struct *work) struct tcf_chain *chain, *tmp; rtnl_lock(); - /* Only chain 0 should be still here. */ + + /* At this point, all the chains should have refcnt == 1. */ list_for_each_entry_safe(chain, tmp, &block->chain_list, list) tcf_chain_put(chain); rtnl_unlock(); @@ -344,15 +345,22 @@ static void tcf_block_put_final(struct work_struct *work) } /* XXX: Standalone actions are not allowed to jump to any chain, and bound - * actions should be all removed after flushing. However, filters are now - * destroyed in tc filter workqueue with RTNL lock, they can not race here. + * actions should be all removed after flushing. */ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q, struct tcf_block_ext_info *ei) { - struct tcf_chain *chain, *tmp; + struct tcf_chain *chain; - list_for_each_entry_safe(chain, tmp, &block->chain_list, list) + /* Hold a refcnt for all chains, except 0 (which is held anyway), so + * that they don't disappear while we are iterating. We will release + * them in tcf_block_put_final, including finally releasing chain 0. + */ + list_for_each_entry(chain, &block->chain_list, list) + if (chain->index) + tcf_chain_hold(chain); + + list_for_each_entry(chain, &block->chain_list, list) tcf_chain_flush(chain); tcf_block_offload_unbind(block, q, ei); -- 2.15.0