* [RFC net 0/1] net: sched: act: fix rcu race
From: Alexander Aring @ 2017-10-10 12:32 UTC
  To: jhs; +Cc: xiyou.wangcong, jiri, netdev, kurup.manish, bjb, Alexander Aring

Hi,

while reading the tc action code to debug an "it does not work" report,
I believe I found issues with the current RCU handling of tc actions.
skbmod is not the only action that gets this wrong. If somebody agrees
with me here, I will send more patches that fix this behaviour in the
other tc actions where the code was just copied and pasted.

I think nobody hits this issue because dump does a lot of work
beforehand, which takes more time than an rcu_synchronize. Anyway, this
change should avoid any use-after-free issues etc.

- Alex

Alexander Aring (1):
  net: sched: act: fix rcu race in dump

 net/sched/act_skbmod.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

-- 
2.11.0

* [RFC net 1/1] net: sched: act: fix rcu race in dump
From: Alexander Aring @ 2017-10-10 12:32 UTC
  To: jhs; +Cc: xiyou.wangcong, jiri, netdev, kurup.manish, bjb, Alexander Aring

This patch fixes an issue with kfree_rcu, which is not protected by the
RTNL lock. The currently assigned rcu pointer could be freed by
kfree_rcu while the dump callback is running.

To prevent this, we call rcu_synchronize first. Then we are sure that
all of the latest rcu functions, e.g. rcu_assign_pointer and kfree_rcu
in init, are done. After rcu_synchronize we dereference under the RTNL
lock, which is also held in the init function, which means no other
rcu_assign_pointer or kfree_rcu will occur.

Calling rcu_synchronize will also prevent weird behaviour when doing
the following over netlink:

 - set params A
 - set params B
 - dump params
   \--> will dump params A

This could happen in the unlikely case that the last
rcu_assign_pointer had not happened before the dump callback.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
---
 net/sched/act_skbmod.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index b642ad3d39dd..231e07bca384 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -198,7 +198,7 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
 {
 	struct tcf_skbmod *d = to_skbmod(a);
 	unsigned char *b = skb_tail_pointer(skb);
-	struct tcf_skbmod_params *p = rtnl_dereference(d->skbmod_p);
+	struct tcf_skbmod_params *p;
 	struct tc_skbmod opt = {
 		.index = d->tcf_index,
 		.refcnt = d->tcf_refcnt - ref,
@@ -207,6 +207,11 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
 	};
 	struct tcf_t t;
 
+	/* wait until last rcu_assign_pointer/kfree_rcu is done */
+	rcu_synchronize();
+	/* RTNL lock prevents another rcu_assign_pointer/kfree_rcu call */
+	p = rtnl_dereference(d->skbmod_p);
+
 	opt.flags = p->flags;
 	if (nla_put(skb, TCA_SKBMOD_PARMS, sizeof(opt), &opt))
 		goto nla_put_failure;
-- 
2.11.0

* Re: [RFC net 1/1] net: sched: act: fix rcu race in dump
From: Alexander Aring @ 2017-10-10 12:39 UTC
  To: Jamal Hadi Salim
  Cc: Cong Wang, Jiří Pírko, netdev, Manish Kurup, Brenda Butler, Alexander Aring

Hi,

On Tue, Oct 10, 2017 at 8:32 AM, Alexander Aring <aring@mojatatu.com> wrote:
> This patch fixes an issue with kfree_rcu, which is not protected by the
> RTNL lock. The currently assigned rcu pointer could be freed by
> kfree_rcu while the dump callback is running.
>
> To prevent this, we call rcu_synchronize first. Then we are sure that
> all of the latest rcu functions, e.g. rcu_assign_pointer and kfree_rcu
> in init, are done. After rcu_synchronize we dereference under the RTNL
> lock, which is also held in the init function, which means no other
> rcu_assign_pointer or kfree_rcu will occur.
>
> Calling rcu_synchronize will also prevent weird behaviour when doing
> the following over netlink:
>
>  - set params A
>  - set params B
>  - dump params
>    \--> will dump params A
>
> This could happen in the unlikely case that the last
> rcu_assign_pointer had not happened before the dump callback.
>
> Signed-off-by: Alexander Aring <aring@mojatatu.com>
> ---
>  net/sched/act_skbmod.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
> index b642ad3d39dd..231e07bca384 100644
> --- a/net/sched/act_skbmod.c
> +++ b/net/sched/act_skbmod.c
> @@ -198,7 +198,7 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
>  {
>  	struct tcf_skbmod *d = to_skbmod(a);
>  	unsigned char *b = skb_tail_pointer(skb);
> -	struct tcf_skbmod_params *p = rtnl_dereference(d->skbmod_p);
> +	struct tcf_skbmod_params *p;
>  	struct tc_skbmod opt = {
>  		.index = d->tcf_index,
>  		.refcnt = d->tcf_refcnt - ref,
> @@ -207,6 +207,11 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
>  	};
>  	struct tcf_t t;
>
> +	/* wait until last rcu_assign_pointer/kfree_rcu is done */
> +	rcu_synchronize();

... and next time I should use the right function:

s/rcu_synchronize/synchronize_rcu/

Anyway, there is a reason why I sent it as an RFC. :-)

Thanks.

- Alex

* Re: [RFC net 1/1] net: sched: act: fix rcu race in dump
From: Eric Dumazet @ 2017-10-10 14:12 UTC
  To: Alexander Aring; +Cc: jhs, xiyou.wangcong, jiri, netdev, kurup.manish, bjb

On Tue, 2017-10-10 at 08:32 -0400, Alexander Aring wrote:
> This patch fixes an issue with kfree_rcu, which is not protected by the
> RTNL lock. The currently assigned rcu pointer could be freed by
> kfree_rcu while the dump callback is running.
>
> To prevent this, we call rcu_synchronize first. Then we are sure that
> all of the latest rcu functions, e.g. rcu_assign_pointer and kfree_rcu
> in init, are done. After rcu_synchronize we dereference under the RTNL
> lock, which is also held in the init function, which means no other
> rcu_assign_pointer or kfree_rcu will occur.
>
> Calling rcu_synchronize will also prevent weird behaviour when doing
> the following over netlink:
>
>  - set params A
>  - set params B
>  - dump params
>    \--> will dump params A
>
> This could happen in the unlikely case that the last
> rcu_assign_pointer had not happened before the dump callback.
>
> Signed-off-by: Alexander Aring <aring@mojatatu.com>
> ---
>  net/sched/act_skbmod.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
> index b642ad3d39dd..231e07bca384 100644
> --- a/net/sched/act_skbmod.c
> +++ b/net/sched/act_skbmod.c
> @@ -198,7 +198,7 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
>  {
>  	struct tcf_skbmod *d = to_skbmod(a);
>  	unsigned char *b = skb_tail_pointer(skb);
> -	struct tcf_skbmod_params *p = rtnl_dereference(d->skbmod_p);
> +	struct tcf_skbmod_params *p;
>  	struct tc_skbmod opt = {
>  		.index = d->tcf_index,
>  		.refcnt = d->tcf_refcnt - ref,
> @@ -207,6 +207,11 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
>  	};
>  	struct tcf_t t;
>
> +	/* wait until last rcu_assign_pointer/kfree_rcu is done */
> +	rcu_synchronize();
> +	/* RTNL lock prevents another rcu_assign_pointer/kfree_rcu call */
> +	p = rtnl_dereference(d->skbmod_p);
> +
>  	opt.flags = p->flags;
>  	if (nla_put(skb, TCA_SKBMOD_PARMS, sizeof(opt), &opt))
>  		goto nla_put_failure;

Sorry but no. This is plainly wrong.

We need to fix this without adding a _very_ expensive rcu_synchronize()
on a path which does not need such a thing.

I am confused by this patch, please tell us more what the problem is.

I suspect rcu_read_lock() is what you need, but isn't a writer supposed
to hold RTNL in net/sched/* ???

* Re: [RFC net 1/1] net: sched: act: fix rcu race in dump
From: Alexander Aring @ 2017-10-10 18:09 UTC
  To: Eric Dumazet
  Cc: Jamal Hadi Salim, Cong Wang, Jiří Pírko, netdev, Manish Kurup, Brenda Butler

Hi,

On Tue, Oct 10, 2017 at 10:12 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2017-10-10 at 08:32 -0400, Alexander Aring wrote:
>> This patch fixes an issue with kfree_rcu, which is not protected by the
>> RTNL lock. The currently assigned rcu pointer could be freed by
>> kfree_rcu while the dump callback is running.
>>
>> To prevent this, we call rcu_synchronize first. Then we are sure that
>> all of the latest rcu functions, e.g. rcu_assign_pointer and kfree_rcu
>> in init, are done. After rcu_synchronize we dereference under the RTNL
>> lock, which is also held in the init function, which means no other
>> rcu_assign_pointer or kfree_rcu will occur.
>>
>> Calling rcu_synchronize will also prevent weird behaviour when doing
>> the following over netlink:
>>
>>  - set params A
>>  - set params B
>>  - dump params
>>    \--> will dump params A
>>
>> This could happen in the unlikely case that the last
>> rcu_assign_pointer had not happened before the dump callback.
>>
>> Signed-off-by: Alexander Aring <aring@mojatatu.com>
>> ---
>>  net/sched/act_skbmod.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
>> index b642ad3d39dd..231e07bca384 100644
>> --- a/net/sched/act_skbmod.c
>> +++ b/net/sched/act_skbmod.c
>> @@ -198,7 +198,7 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
>>  {
>>  	struct tcf_skbmod *d = to_skbmod(a);
>>  	unsigned char *b = skb_tail_pointer(skb);
>> -	struct tcf_skbmod_params *p = rtnl_dereference(d->skbmod_p);
>> +	struct tcf_skbmod_params *p;
>>  	struct tc_skbmod opt = {
>>  		.index = d->tcf_index,
>>  		.refcnt = d->tcf_refcnt - ref,
>> @@ -207,6 +207,11 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
>>  	};
>>  	struct tcf_t t;
>>
>> +	/* wait until last rcu_assign_pointer/kfree_rcu is done */
>> +	rcu_synchronize();
>> +	/* RTNL lock prevents another rcu_assign_pointer/kfree_rcu call */
>> +	p = rtnl_dereference(d->skbmod_p);
>> +
>>  	opt.flags = p->flags;
>>  	if (nla_put(skb, TCA_SKBMOD_PARMS, sizeof(opt), &opt))
>>  		goto nla_put_failure;
>
> Sorry but no. This is plainly wrong.
>
> We need to fix this without adding a _very_ expensive rcu_synchronize()
> on a path which does not need such thing.
>

I agree that an rcu synchronize is very expensive while holding RTNL.
It should be handled with rcu_read_lock, as you suggest below, but that
will not prevent user space from seeing behaviour like this:

 - set_params(A)
 - set_params(B)
   \---> dump - will dump values A

That is because rcu_read_lock will keep rcu_assign_pointer from
updating the pointer, but it will not wait until the rcu_assign_pointer
of set_params(B) is done before dump is called. Okay, maybe this is
something we should not care about, as long as it is not a
use-after-free issue.

> I am confused by this patch, please tell us more what the problem is.
>

The "init" callback is also called when the parameters of an action are
updated. It uses rcu_assign_pointer [0] as well as kfree_rcu [1] to
swap the pointers to the parameter structures and free the old
resource. This is well protected by rcu_read_lock inside the "run"
callback of the tc action, which runs in softirq context. But dump is
only protected by RTNL, as far as I can see.

Sorry if I have understood RCU wrongly, but as far as I understand RCU
handling, it _could_ be that the pointers are not yet updated when
"init" returns. After a "grace" period, which rcu synchronize waits
for, we can be sure that the pointer is assigned and kfree_rcu has
completed.

The problem is: if the dereference of the parameters inside the dump
callback still uses the old structure (as I understand it, this can
happen because the callback does nothing to protect against it),
kfree_rcu can free the resource while this structure is being accessed.
An RCU read lock will of course prevent RCU from updating the pointers
during this time (but RTNL will not, as far as I understand).

> I suspect rcu_read_lock() is what you need, but isn't a writer supposed
> to hold RTNL in net/sched/* ???
>

Yes, a writer holds RTNL, but these writers use RCU to write (as shown
in [0] and [1]). As far as I know kfree_rcu, it can happen that "init"
returns and dump is called afterwards; during the dump, RCU can run and
free/assign pointers (while dump still holds references). As far as I
understand, an RTNL lock will not prevent RCU from doing that.

I also wrote this mail to find out whether there is a problem or not.
If you tell me that the resource cannot be freed by kfree_rcu while the
RTNL lock is held, then I will know more about how RCU works.

- Alex

[0] http://elixir.free-electrons.com/linux/latest/source/net/sched/act_skbmod.c#L177
[1] http://elixir.free-electrons.com/linux/latest/source/net/sched/act_skbmod.c#L182

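The question raised in this message, whether the pointer may still be
unpublished after "init" returns, can be explored with a small userspace
model. This is only a sketch under stated assumptions, not kernel code:
a pthread mutex stands in for RTNL, a C11 release store stands in for
rcu_assign_pointer(), and a plain list stands in for what kfree_rcu()
queues. All names (set_params, dump_flags, deferred_count, struct
params) are hypothetical. In this model the store is visible as soon as
it returns, and only the free of the old object is deferred.

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct params {                      /* stand-in for the action parameters */
	int flags;
	struct params *next_deferred;    /* link in the deferred-free list */
};

static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER; /* models RTNL */
static _Atomic(struct params *) cur_params;   /* models the published pointer */
static struct params *deferred;     /* old objects whose free is postponed */

/* Models the update path: publish new params, defer freeing the old ones. */
int set_params(int flags)
{
	struct params *p_new = malloc(sizeof(*p_new));
	struct params *p_old;

	if (!p_new)
		return -1;
	p_new->flags = flags;
	p_new->next_deferred = NULL;

	pthread_mutex_lock(&cfg_lock);
	p_old = atomic_load_explicit(&cur_params, memory_order_relaxed);
	/* Release store: the new pointer is published when this returns. */
	atomic_store_explicit(&cur_params, p_new, memory_order_release);
	if (p_old) {                     /* only the free is deferred */
		p_old->next_deferred = deferred;
		deferred = p_old;
	}
	pthread_mutex_unlock(&cfg_lock);
	return 0;
}

/* Models the dump path: read the current params under the same lock. */
int dump_flags(void)
{
	struct params *p;
	int flags;

	pthread_mutex_lock(&cfg_lock);
	p = atomic_load_explicit(&cur_params, memory_order_acquire);
	flags = p ? p->flags : -1;
	pthread_mutex_unlock(&cfg_lock);
	return flags;
}

/* Number of old objects still waiting to be freed. */
int deferred_count(void)
{
	int n = 0;

	pthread_mutex_lock(&cfg_lock);
	for (struct params *p = deferred; p; p = p->next_deferred)
		n++;
	pthread_mutex_unlock(&cfg_lock);
	return n;
}
```

In this model, calling set_params(0xA), then set_params(0xB), then
dump_flags() yields 0xB, while the object holding 0xA sits on the
deferred list untouched by the dump path.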
* Re: [RFC net 1/1] net: sched: act: fix rcu race in dump
From: Cong Wang @ 2017-10-10 16:40 UTC
  To: Alexander Aring
  Cc: Jamal Hadi Salim, Jiri Pirko, Linux Kernel Network Developers, kurup.manish, Brenda Butler

On Tue, Oct 10, 2017 at 5:32 AM, Alexander Aring <aring@mojatatu.com> wrote:
> This patch fixes an issue with kfree_rcu, which is not protected by the
> RTNL lock. The currently assigned rcu pointer could be freed by
> kfree_rcu while the dump callback is running.

Why? kfree_rcu() respects existing readers, so why could this happen?

>
> To prevent this, we call rcu_synchronize first. Then we are sure that
> all of the latest rcu functions, e.g. rcu_assign_pointer and kfree_rcu
> in init, are done. After rcu_synchronize we dereference under the RTNL
> lock, which is also held in the init function, which means no other
> rcu_assign_pointer or kfree_rcu will occur.

If you really want to wait for kfree_rcu(), rcu_barrier() is the one to
use instead of rcu_synchronize(). Just FYI.

>
> Calling rcu_synchronize will also prevent weird behaviour when doing
> the following over netlink:
>
>  - set params A
>  - set params B
>  - dump params
>    \--> will dump params A

What's wrong with this? Existing readers could still read old data,
which is _perfectly_ fine as long as we don't free the old data before
they are gone.

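The point made in this reply, that old data stays readable until the
readers that saw it are gone, can be illustrated with a toy,
single-threaded model of deferred reclamation. It is a sketch under loud
assumptions: a plain counter stands in for read-side critical sections,
whereas real RCU tracks grace periods very differently and waits only
for readers that existed before the update. All names (reader_enter,
reader_exit, replace_params, reclaim) are hypothetical.

```c
/* Toy single-threaded model; not how the kernel implements RCU. */
#include <assert.h>
#include <stdlib.h>

struct params {
	int flags;
	struct params *next;             /* link in the retired list */
};

static struct params *cur_params;   /* the published pointer */
static struct params *retired;      /* replaced objects awaiting reclaim */
static int active_readers;          /* toy stand-in for read-side sections */

/* A reader enters its critical section and picks up the current pointer. */
struct params *reader_enter(void)
{
	active_readers++;
	return cur_params;
}

/* The reader promises not to touch its pointer after this call. */
void reader_exit(void)
{
	active_readers--;
}

/* Writer: publish new params and retire the old ones instead of freeing. */
int replace_params(int flags)
{
	struct params *p_new = malloc(sizeof(*p_new));

	if (!p_new)
		return -1;
	p_new->flags = flags;
	p_new->next = NULL;
	if (cur_params) {
		cur_params->next = retired;
		retired = cur_params;
	}
	cur_params = p_new;
	return 0;
}

/* Free retired objects, but only once no reader can still hold them.
 * Returns the number of objects actually freed. */
int reclaim(void)
{
	int freed = 0;

	if (active_readers > 0)
		return 0;
	while (retired) {
		struct params *p = retired;

		retired = p->next;
		free(p);
		freed++;
	}
	return freed;
}
```

A reader that entered before replace_params() keeps seeing the old
flags, and reclaim() frees nothing until that reader has called
reader_exit().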