* [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers
@ 2020-08-28 3:19 Muchun Song
2020-09-07 2:54 ` Muchun Song
0 siblings, 1 reply; 10+ messages in thread
From: Muchun Song @ 2020-08-28 3:19 UTC (permalink / raw)
To: keescook, mhiramat, rostedt, alex.popov, miguel.ojeda.sandonis
Cc: linux-kernel, Muchun Song
There is a race between the assignment of `table->data` and write value
to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
the other thread.
CPU0: CPU1:
proc_sys_write
stack_erasing_sysctl proc_sys_call_handler
table->data = &state; stack_erasing_sysctl
table->data = &state;
proc_doulongvec_minmax
do_proc_doulongvec_minmax sysctl_head_finish
__do_proc_doulongvec_minmax unuse_table
i = table->data;
*i = val; // corrupt CPU1's stack
Fix this by duplicating the `table`, and only update the duplicate of
it.
Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
changelogs in v2:
1. Add more details about how the race happened to the commit message.
kernel/stackleak.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index a8fc9ae1d03d..fd95b87478ff 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write,
int ret = 0;
int state = !static_branch_unlikely(&stack_erasing_bypass);
int prev_state = state;
+ struct ctl_table dup_table = *table;
- table->data = &state;
- table->maxlen = sizeof(int);
- ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ /*
+ * In order to avoid races with __do_proc_doulongvec_minmax(), we
+ * can duplicate the @table and alter the duplicate of it.
+ */
+ dup_table.data = &state;
+ dup_table.maxlen = sizeof(int);
+ ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos);
state = !!state;
if (ret || !write || state == prev_state)
return ret;
--
2.11.0
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-08-28 3:19 [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers Muchun Song @ 2020-09-07 2:54 ` Muchun Song 2020-09-07 11:24 ` Alexander Popov 0 siblings, 1 reply; 10+ messages in thread From: Muchun Song @ 2020-09-07 2:54 UTC (permalink / raw) To: Kees Cook, Masami Hiramatsu, Steven Rostedt, alex.popov, miguel.ojeda.sandonis Cc: LKML Hi all, Any comments or suggestions? Thanks. On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: > > There is a race between the assignment of `table->data` and write value > to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on > the other thread. > > CPU0: CPU1: > proc_sys_write > stack_erasing_sysctl proc_sys_call_handler > table->data = &state; stack_erasing_sysctl > table->data = &state; > proc_doulongvec_minmax > do_proc_doulongvec_minmax sysctl_head_finish > __do_proc_doulongvec_minmax unuse_table > i = table->data; > *i = val; // corrupt CPU1's stack > > Fix this by duplicating the `table`, and only update the duplicate of > it. > > Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > --- > changelogs in v2: > 1. Add more details about how the race happened to the commit message. > > kernel/stackleak.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/kernel/stackleak.c b/kernel/stackleak.c > index a8fc9ae1d03d..fd95b87478ff 100644 > --- a/kernel/stackleak.c > +++ b/kernel/stackleak.c > @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, > int ret = 0; > int state = !static_branch_unlikely(&stack_erasing_bypass); > int prev_state = state; > + struct ctl_table dup_table = *table; > > - table->data = &state; > - table->maxlen = sizeof(int); > - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); > + /* > + * In order to avoid races with __do_proc_doulongvec_minmax(), we > + * can duplicate the @table and alter the duplicate of it. > + */ > + dup_table.data = &state; > + dup_table.maxlen = sizeof(int); > + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); > state = !!state; > if (ret || !write || state == prev_state) > return ret; > -- > 2.11.0 > -- Yours, Muchun ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-07 2:54 ` Muchun Song @ 2020-09-07 11:24 ` Alexander Popov 2020-09-07 13:53 ` [External] " Muchun Song 0 siblings, 1 reply; 10+ messages in thread From: Alexander Popov @ 2020-09-07 11:24 UTC (permalink / raw) To: Muchun Song, Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis Cc: LKML On 07.09.2020 05:54, Muchun Song wrote: > Hi all, > > Any comments or suggestions? Thanks. > > On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: >> >> There is a race between the assignment of `table->data` and write value >> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on >> the other thread. >> >> CPU0: CPU1: >> proc_sys_write >> stack_erasing_sysctl proc_sys_call_handler >> table->data = &state; stack_erasing_sysctl >> table->data = &state; >> proc_doulongvec_minmax >> do_proc_doulongvec_minmax sysctl_head_finish >> __do_proc_doulongvec_minmax unuse_table >> i = table->data; >> *i = val; // corrupt CPU1's stack Hello everyone! As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl handlers. Is that issue relevant for other handlers as well? Muchun, could you elaborate how CPU1's stack is corrupted and how you detected that? Thanks! Best regards, Alexander >> Fix this by duplicating the `table`, and only update the duplicate of >> it. >> >> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >> --- >> changelogs in v2: >> 1. Add more details about how the race happened to the commit message. >> >> kernel/stackleak.c | 11 ++++++++--- >> 1 file changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/kernel/stackleak.c b/kernel/stackleak.c >> index a8fc9ae1d03d..fd95b87478ff 100644 >> --- a/kernel/stackleak.c >> +++ b/kernel/stackleak.c >> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, >> int ret = 0; >> int state = !static_branch_unlikely(&stack_erasing_bypass); >> int prev_state = state; >> + struct ctl_table dup_table = *table; >> >> - table->data = &state; >> - table->maxlen = sizeof(int); >> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); >> + /* >> + * In order to avoid races with __do_proc_doulongvec_minmax(), we >> + * can duplicate the @table and alter the duplicate of it. >> + */ >> + dup_table.data = &state; >> + dup_table.maxlen = sizeof(int); >> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); >> state = !!state; >> if (ret || !write || state == prev_state) >> return ret; >> -- >> 2.11.0 >> > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-07 11:24 ` Alexander Popov @ 2020-09-07 13:53 ` Muchun Song 2020-09-11 3:53 ` Muchun Song 2020-09-14 13:56 ` Alexander Popov 0 siblings, 2 replies; 10+ messages in thread From: Muchun Song @ 2020-09-07 13:53 UTC (permalink / raw) To: alex.popov Cc: Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis, LKML On Mon, Sep 7, 2020 at 7:24 PM Alexander Popov <alex.popov@linux.com> wrote: > > On 07.09.2020 05:54, Muchun Song wrote: > > Hi all, > > > > Any comments or suggestions? Thanks. > > > > On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: > >> > >> There is a race between the assignment of `table->data` and write value > >> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on > >> the other thread. > >> > >> CPU0: CPU1: > >> proc_sys_write > >> stack_erasing_sysctl proc_sys_call_handler > >> table->data = &state; stack_erasing_sysctl > >> table->data = &state; > >> proc_doulongvec_minmax > >> do_proc_doulongvec_minmax sysctl_head_finish > >> __do_proc_doulongvec_minmax unuse_table > >> i = table->data; > >> *i = val; // corrupt CPU1's stack > > Hello everyone! > > As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl > handlers. Is that issue relevant for other handlers as well? Yeah, it's very similar. But the difference is that others use a global variable as the `table->data`, but here we use a local variable as the `table->data`. The local variable is allocated from the stack. So other thread could corrupt the stack like the diagram above. > > Muchun, could you elaborate how CPU1's stack is corrupted and how you detected > that? Thanks! Why did I find this problem? Because I solve another problem which is very similar to this issue. You can reference the following fix patch. Thanks. https://lkml.org/lkml/2020/8/22/105 > > Best regards, > Alexander > > >> Fix this by duplicating the `table`, and only update the duplicate of > >> it. > >> > >> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >> --- > >> changelogs in v2: > >> 1. Add more details about how the race happened to the commit message. > >> > >> kernel/stackleak.c | 11 ++++++++--- > >> 1 file changed, 8 insertions(+), 3 deletions(-) > >> > >> diff --git a/kernel/stackleak.c b/kernel/stackleak.c > >> index a8fc9ae1d03d..fd95b87478ff 100644 > >> --- a/kernel/stackleak.c > >> +++ b/kernel/stackleak.c > >> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, > >> int ret = 0; > >> int state = !static_branch_unlikely(&stack_erasing_bypass); > >> int prev_state = state; > >> + struct ctl_table dup_table = *table; > >> > >> - table->data = &state; > >> - table->maxlen = sizeof(int); > >> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); > >> + /* > >> + * In order to avoid races with __do_proc_doulongvec_minmax(), we > >> + * can duplicate the @table and alter the duplicate of it. > >> + */ > >> + dup_table.data = &state; > >> + dup_table.maxlen = sizeof(int); > >> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); > >> state = !!state; > >> if (ret || !write || state == prev_state) > >> return ret; > >> -- > >> 2.11.0 > >> > > > > > -- Yours, Muchun ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-07 13:53 ` [External] " Muchun Song @ 2020-09-11 3:53 ` Muchun Song 2020-09-14 13:56 ` Alexander Popov 1 sibling, 0 replies; 10+ messages in thread From: Muchun Song @ 2020-09-11 3:53 UTC (permalink / raw) To: alex.popov, Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis Cc: LKML Ping guys. Thanks. On Mon, Sep 7, 2020 at 9:53 PM Muchun Song <songmuchun@bytedance.com> wrote: > > On Mon, Sep 7, 2020 at 7:24 PM Alexander Popov <alex.popov@linux.com> wrote: > > > > On 07.09.2020 05:54, Muchun Song wrote: > > > Hi all, > > > > > > Any comments or suggestions? Thanks. > > > > > > On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: > > >> > > >> There is a race between the assignment of `table->data` and write value > > >> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on > > >> the other thread. > > >> > > >> CPU0: CPU1: > > >> proc_sys_write > > >> stack_erasing_sysctl proc_sys_call_handler > > >> table->data = &state; stack_erasing_sysctl > > >> table->data = &state; > > >> proc_doulongvec_minmax > > >> do_proc_doulongvec_minmax sysctl_head_finish > > >> __do_proc_doulongvec_minmax unuse_table > > >> i = table->data; > > >> *i = val; // corrupt CPU1's stack > > > > Hello everyone! > > > > As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl > > handlers. Is that issue relevant for other handlers as well? > > Yeah, it's very similar. But the difference is that others use a > global variable as the > `table->data`, but here we use a local variable as the `table->data`. > The local variable > is allocated from the stack. So other thread could corrupt the stack > like the diagram > above. > > > > > Muchun, could you elaborate how CPU1's stack is corrupted and how you detected > > that? Thanks! > > Why did I find this problem? Because I solve another problem which is > very similar to > this issue. You can reference the following fix patch. Thanks. > > https://lkml.org/lkml/2020/8/22/105 > > > > > > > > Best regards, > > Alexander > > > > >> Fix this by duplicating the `table`, and only update the duplicate of > > >> it. > > >> > > >> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > > >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > > >> --- > > >> changelogs in v2: > > >> 1. Add more details about how the race happened to the commit message. > > >> > > >> kernel/stackleak.c | 11 ++++++++--- > > >> 1 file changed, 8 insertions(+), 3 deletions(-) > > >> > > >> diff --git a/kernel/stackleak.c b/kernel/stackleak.c > > >> index a8fc9ae1d03d..fd95b87478ff 100644 > > >> --- a/kernel/stackleak.c > > >> +++ b/kernel/stackleak.c > > >> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, > > >> int ret = 0; > > >> int state = !static_branch_unlikely(&stack_erasing_bypass); > > >> int prev_state = state; > > >> + struct ctl_table dup_table = *table; > > >> > > >> - table->data = &state; > > >> - table->maxlen = sizeof(int); > > >> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); > > >> + /* > > >> + * In order to avoid races with __do_proc_doulongvec_minmax(), we > > >> + * can duplicate the @table and alter the duplicate of it. > > >> + */ > > >> + dup_table.data = &state; > > >> + dup_table.maxlen = sizeof(int); > > >> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); > > >> state = !!state; > > >> if (ret || !write || state == prev_state) > > >> return ret; > > >> -- > > >> 2.11.0 > > >> > > > > > > > > > > > -- > Yours, > Muchun -- Yours, Muchun ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-07 13:53 ` [External] " Muchun Song 2020-09-11 3:53 ` Muchun Song @ 2020-09-14 13:56 ` Alexander Popov 2020-09-14 14:09 ` Muchun Song 2020-09-22 5:59 ` Muchun Song 1 sibling, 2 replies; 10+ messages in thread From: Alexander Popov @ 2020-09-14 13:56 UTC (permalink / raw) To: Muchun Song Cc: Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis, LKML, Luis Chamberlain, Iurii Zaikin, linux-fsdevel, mike.kravetz On 07.09.2020 16:53, Muchun Song wrote: > On Mon, Sep 7, 2020 at 7:24 PM Alexander Popov <alex.popov@linux.com> wrote: >> >> On 07.09.2020 05:54, Muchun Song wrote: >>> Hi all, >>> >>> Any comments or suggestions? Thanks. >>> >>> On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: >>>> >>>> There is a race between the assignment of `table->data` and write value >>>> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on >>>> the other thread. >>>> >>>> CPU0: CPU1: >>>> proc_sys_write >>>> stack_erasing_sysctl proc_sys_call_handler >>>> table->data = &state; stack_erasing_sysctl >>>> table->data = &state; >>>> proc_doulongvec_minmax >>>> do_proc_doulongvec_minmax sysctl_head_finish >>>> __do_proc_doulongvec_minmax unuse_table >>>> i = table->data; >>>> *i = val; // corrupt CPU1's stack >> >> Hello everyone! >> >> As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl >> handlers. Is that issue relevant for other handlers as well? > > Yeah, it's very similar. But the difference is that others use a > global variable as the > `table->data`, but here we use a local variable as the `table->data`. > The local variable > is allocated from the stack. So other thread could corrupt the stack > like the diagram > above. Hi Muchun, I don't think that the proposed copying of struct ctl_table to local variable is a good fix of that issue. There might be other bugs caused by concurrent execution of stack_erasing_sysctl(). I would recommend using some locking instead. But you say there are other similar issues. Should it be fixed on higher level in kernel/sysctl.c? [Adding more knowing people to CC] Thanks! >> Muchun, could you elaborate how CPU1's stack is corrupted and how you detected >> that? Thanks! > > Why did I find this problem? Because I solve another problem which is > very similar to > this issue. You can reference the following fix patch. Thanks. > > https://lkml.org/lkml/2020/8/22/105 >> >>>> Fix this by duplicating the `table`, and only update the duplicate of >>>> it. >>>> >>>> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >>>> --- >>>> changelogs in v2: >>>> 1. Add more details about how the race happened to the commit message. >>>> >>>> kernel/stackleak.c | 11 ++++++++--- >>>> 1 file changed, 8 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c >>>> index a8fc9ae1d03d..fd95b87478ff 100644 >>>> --- a/kernel/stackleak.c >>>> +++ b/kernel/stackleak.c >>>> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, >>>> int ret = 0; >>>> int state = !static_branch_unlikely(&stack_erasing_bypass); >>>> int prev_state = state; >>>> + struct ctl_table dup_table = *table; >>>> >>>> - table->data = &state; >>>> - table->maxlen = sizeof(int); >>>> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); >>>> + /* >>>> + * In order to avoid races with __do_proc_doulongvec_minmax(), we >>>> + * can duplicate the @table and alter the duplicate of it. >>>> + */ >>>> + dup_table.data = &state; >>>> + dup_table.maxlen = sizeof(int); >>>> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); >>>> state = !!state; >>>> if (ret || !write || state == prev_state) >>>> return ret; >>>> -- >>>> 2.11.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-14 13:56 ` Alexander Popov @ 2020-09-14 14:09 ` Muchun Song 2020-09-22 5:59 ` Muchun Song 1 sibling, 0 replies; 10+ messages in thread From: Muchun Song @ 2020-09-14 14:09 UTC (permalink / raw) To: alex.popov Cc: Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis, LKML, Luis Chamberlain, Iurii Zaikin, linux-fsdevel, Mike Kravetz On Mon, Sep 14, 2020 at 9:56 PM Alexander Popov <alex.popov@linux.com> wrote: > > On 07.09.2020 16:53, Muchun Song wrote: > > On Mon, Sep 7, 2020 at 7:24 PM Alexander Popov <alex.popov@linux.com> wrote: > >> > >> On 07.09.2020 05:54, Muchun Song wrote: > >>> Hi all, > >>> > >>> Any comments or suggestions? Thanks. > >>> > >>> On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: > >>>> > >>>> There is a race between the assignment of `table->data` and write value > >>>> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on > >>>> the other thread. > >>>> > >>>> CPU0: CPU1: > >>>> proc_sys_write > >>>> stack_erasing_sysctl proc_sys_call_handler > >>>> table->data = &state; stack_erasing_sysctl > >>>> table->data = &state; > >>>> proc_doulongvec_minmax > >>>> do_proc_doulongvec_minmax sysctl_head_finish > >>>> __do_proc_doulongvec_minmax unuse_table > >>>> i = table->data; > >>>> *i = val; // corrupt CPU1's stack > >> > >> Hello everyone! > >> > >> As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl > >> handlers. Is that issue relevant for other handlers as well? > > > > Yeah, it's very similar. But the difference is that others use a > > global variable as the > > `table->data`, but here we use a local variable as the `table->data`. > > The local variable > > is allocated from the stack. So other thread could corrupt the stack > > like the diagram > > above. > > Hi Muchun, > > I don't think that the proposed copying of struct ctl_table to local variable is > a good fix of that issue. There might be other bugs caused by concurrent > execution of stack_erasing_sysctl(). I can not figure out how the bug happened when there is concurrent execution of stack_erasing_sysctl(). > > I would recommend using some locking instead. > > But you say there are other similar issues. Should it be fixed on higher level > in kernel/sysctl.c? Yeah, we can see the same issue here. https://lkml.org/lkml/2020/8/22/105. I agree with you. Maybe a fix on the higher level is a good choice in kernel/sysctl.c. If someone also agrees with this solution, I can do this work. > > [Adding more knowing people to CC] > > Thanks! > > >> Muchun, could you elaborate how CPU1's stack is corrupted and how you detected > >> that? Thanks! > > > > Why did I find this problem? Because I solve another problem which is > > very similar to > > this issue. You can reference the following fix patch. Thanks. > > > > https://lkml.org/lkml/2020/8/22/105 > >> > >>>> Fix this by duplicating the `table`, and only update the duplicate of > >>>> it. > >>>> > >>>> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >>>> --- > >>>> changelogs in v2: > >>>> 1. Add more details about how the race happened to the commit message. > >>>> > >>>> kernel/stackleak.c | 11 ++++++++--- > >>>> 1 file changed, 8 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c > >>>> index a8fc9ae1d03d..fd95b87478ff 100644 > >>>> --- a/kernel/stackleak.c > >>>> +++ b/kernel/stackleak.c > >>>> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, > >>>> int ret = 0; > >>>> int state = !static_branch_unlikely(&stack_erasing_bypass); > >>>> int prev_state = state; > >>>> + struct ctl_table dup_table = *table; > >>>> > >>>> - table->data = &state; > >>>> - table->maxlen = sizeof(int); > >>>> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); > >>>> + /* > >>>> + * In order to avoid races with __do_proc_doulongvec_minmax(), we > >>>> + * can duplicate the @table and alter the duplicate of it. > >>>> + */ > >>>> + dup_table.data = &state; > >>>> + dup_table.maxlen = sizeof(int); > >>>> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); > >>>> state = !!state; > >>>> if (ret || !write || state == prev_state) > >>>> return ret; > >>>> -- > >>>> 2.11.0 -- Yours, Muchun ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-14 13:56 ` Alexander Popov 2020-09-14 14:09 ` Muchun Song @ 2020-09-22 5:59 ` Muchun Song 2020-09-28 6:32 ` Alexander Popov 1 sibling, 1 reply; 10+ messages in thread From: Muchun Song @ 2020-09-22 5:59 UTC (permalink / raw) To: alex.popov Cc: Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis, LKML, Luis Chamberlain, Iurii Zaikin, linux-fsdevel, Mike Kravetz On Mon, Sep 14, 2020 at 9:56 PM Alexander Popov <alex.popov@linux.com> wrote: > > On 07.09.2020 16:53, Muchun Song wrote: > > On Mon, Sep 7, 2020 at 7:24 PM Alexander Popov <alex.popov@linux.com> wrote: > >> > >> On 07.09.2020 05:54, Muchun Song wrote: > >>> Hi all, > >>> > >>> Any comments or suggestions? Thanks. > >>> > >>> On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: > >>>> > >>>> There is a race between the assignment of `table->data` and write value > >>>> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on > >>>> the other thread. > >>>> > >>>> CPU0: CPU1: > >>>> proc_sys_write > >>>> stack_erasing_sysctl proc_sys_call_handler > >>>> table->data = &state; stack_erasing_sysctl > >>>> table->data = &state; > >>>> proc_doulongvec_minmax > >>>> do_proc_doulongvec_minmax sysctl_head_finish > >>>> __do_proc_doulongvec_minmax unuse_table > >>>> i = table->data; > >>>> *i = val; // corrupt CPU1's stack > >> > >> Hello everyone! > >> > >> As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl > >> handlers. Is that issue relevant for other handlers as well? > > > > Yeah, it's very similar. But the difference is that others use a > > global variable as the > > `table->data`, but here we use a local variable as the `table->data`. > > The local variable > > is allocated from the stack. So other thread could corrupt the stack > > like the diagram > > above. > > Hi Muchun, > > I don't think that the proposed copying of struct ctl_table to local variable is > a good fix of that issue. There might be other bugs caused by concurrent > execution of stack_erasing_sysctl(). Hi Alexander, Yeah, we can fix this issue on a higher level in kernel/sysctl.c. But we will rework some kernel/sysctl.c base code. Because the commit: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") is introduced from linux-4.20. So we should backport this fix patch to the other stable tree. Be the safe side, we can apply this patch to only fix the stack_erasing_sysctl. In this case, the impact of backport is minimal. In the feature, we can fix the issue(another patch) like this on a higher level in kernel/sysctl.c and only apply it in the later kernel version. Is this OK? > > I would recommend using some locking instead. > > But you say there are other similar issues. Should it be fixed on higher level > in kernel/sysctl.c? > > [Adding more knowing people to CC] > > Thanks! > > >> Muchun, could you elaborate how CPU1's stack is corrupted and how you detected > >> that? Thanks! > > > > Why did I find this problem? Because I solve another problem which is > > very similar to > > this issue. You can reference the following fix patch. Thanks. > > > > https://lkml.org/lkml/2020/8/22/105 > >> > >>>> Fix this by duplicating the `table`, and only update the duplicate of > >>>> it. > >>>> > >>>> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >>>> --- > >>>> changelogs in v2: > >>>> 1. Add more details about how the race happened to the commit message. > >>>> > >>>> kernel/stackleak.c | 11 ++++++++--- > >>>> 1 file changed, 8 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c > >>>> index a8fc9ae1d03d..fd95b87478ff 100644 > >>>> --- a/kernel/stackleak.c > >>>> +++ b/kernel/stackleak.c > >>>> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, > >>>> int ret = 0; > >>>> int state = !static_branch_unlikely(&stack_erasing_bypass); > >>>> int prev_state = state; > >>>> + struct ctl_table dup_table = *table; > >>>> > >>>> - table->data = &state; > >>>> - table->maxlen = sizeof(int); > >>>> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); > >>>> + /* > >>>> + * In order to avoid races with __do_proc_doulongvec_minmax(), we > >>>> + * can duplicate the @table and alter the duplicate of it. > >>>> + */ > >>>> + dup_table.data = &state; > >>>> + dup_table.maxlen = sizeof(int); > >>>> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); > >>>> state = !!state; > >>>> if (ret || !write || state == prev_state) > >>>> return ret; > >>>> -- > >>>> 2.11.0 -- Yours, Muchun ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-22 5:59 ` Muchun Song @ 2020-09-28 6:32 ` Alexander Popov 2020-09-28 7:30 ` Muchun Song 0 siblings, 1 reply; 10+ messages in thread From: Alexander Popov @ 2020-09-28 6:32 UTC (permalink / raw) To: Muchun Song Cc: Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis, LKML, Luis Chamberlain, Iurii Zaikin, linux-fsdevel, Mike Kravetz On 22.09.2020 08:59, Muchun Song wrote: > On Mon, Sep 14, 2020 at 9:56 PM Alexander Popov <alex.popov@linux.com> wrote: >> >> On 07.09.2020 16:53, Muchun Song wrote: >>> On Mon, Sep 7, 2020 at 7:24 PM Alexander Popov <alex.popov@linux.com> wrote: >>>> >>>> On 07.09.2020 05:54, Muchun Song wrote: >>>>> Hi all, >>>>> >>>>> Any comments or suggestions? Thanks. >>>>> >>>>> On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: >>>>>> >>>>>> There is a race between the assignment of `table->data` and write value >>>>>> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on >>>>>> the other thread. >>>>>> >>>>>> CPU0: CPU1: >>>>>> proc_sys_write >>>>>> stack_erasing_sysctl proc_sys_call_handler >>>>>> table->data = &state; stack_erasing_sysctl >>>>>> table->data = &state; >>>>>> proc_doulongvec_minmax >>>>>> do_proc_doulongvec_minmax sysctl_head_finish >>>>>> __do_proc_doulongvec_minmax unuse_table >>>>>> i = table->data; >>>>>> *i = val; // corrupt CPU1's stack >>>> >>>> Hello everyone! >>>> >>>> As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl >>>> handlers. Is that issue relevant for other handlers as well? >>> >>> Yeah, it's very similar. But the difference is that others use a >>> global variable as the >>> `table->data`, but here we use a local variable as the `table->data`. >>> The local variable >>> is allocated from the stack. So other thread could corrupt the stack >>> like the diagram >>> above. >> >> Hi Muchun, >> >> I don't think that the proposed copying of struct ctl_table to local variable is >> a good fix of that issue. There might be other bugs caused by concurrent >> execution of stack_erasing_sysctl(). > > Hi Alexander, > > Yeah, we can fix this issue on a higher level in kernel/sysctl.c. But > we will rework some kernel/sysctl.c base code. Because the commit: > > 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > > is introduced from linux-4.20. So we should backport this fix patch to the other > stable tree. Be the safe side, we can apply this patch to only fix the > stack_erasing_sysctl. In this case, the impact of backport is minimal. > > In the feature, we can fix the issue(another patch) like this on a higher > level in kernel/sysctl.c and only apply it in the later kernel version. Is > this OK? Muchun, I would recommend: 1) fixing the reason of the issue in kernel/sysctl.c or 2) use some locking in stack_erasing_sysctl() to fix the issue locally. Honestly, I don't like this "dup_table" approach in the patch below. It doesn't remove the data race. Thank you! Alexander >> I would recommend using some locking instead. >> >> But you say there are other similar issues. Should it be fixed on higher level >> in kernel/sysctl.c? >> >> [Adding more knowing people to CC] >> >> Thanks! >> >>>> Muchun, could you elaborate how CPU1's stack is corrupted and how you detected >>>> that? Thanks! >>> >>> Why did I find this problem? Because I solve another problem which is >>> very similar to >>> this issue. You can reference the following fix patch. Thanks. >>> >>> https://lkml.org/lkml/2020/8/22/105 >>>> >>>>>> Fix this by duplicating the `table`, and only update the duplicate of >>>>>> it. >>>>>> >>>>>> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") >>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >>>>>> --- >>>>>> changelogs in v2: >>>>>> 1. Add more details about how the race happened to the commit message. >>>>>> >>>>>> kernel/stackleak.c | 11 ++++++++--- >>>>>> 1 file changed, 8 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c >>>>>> index a8fc9ae1d03d..fd95b87478ff 100644 >>>>>> --- a/kernel/stackleak.c >>>>>> +++ b/kernel/stackleak.c >>>>>> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, >>>>>> int ret = 0; >>>>>> int state = !static_branch_unlikely(&stack_erasing_bypass); >>>>>> int prev_state = state; >>>>>> + struct ctl_table dup_table = *table; >>>>>> >>>>>> - table->data = &state; >>>>>> - table->maxlen = sizeof(int); >>>>>> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); >>>>>> + /* >>>>>> + * In order to avoid races with __do_proc_doulongvec_minmax(), we >>>>>> + * can duplicate the @table and alter the duplicate of it. >>>>>> + */ >>>>>> + dup_table.data = &state; >>>>>> + dup_table.maxlen = sizeof(int); >>>>>> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); >>>>>> state = !!state; >>>>>> if (ret || !write || state == prev_state) >>>>>> return ret; >>>>>> -- >>>>>> 2.11.0 > > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Re: [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers 2020-09-28 6:32 ` Alexander Popov @ 2020-09-28 7:30 ` Muchun Song 0 siblings, 0 replies; 10+ messages in thread From: Muchun Song @ 2020-09-28 7:30 UTC (permalink / raw) To: alex.popov Cc: Kees Cook, Masami Hiramatsu, Steven Rostedt, miguel.ojeda.sandonis, LKML, Luis Chamberlain, Iurii Zaikin, linux-fsdevel, Mike Kravetz On Mon, Sep 28, 2020 at 2:32 PM Alexander Popov <alex.popov@linux.com> wrote: > > On 22.09.2020 08:59, Muchun Song wrote: > > On Mon, Sep 14, 2020 at 9:56 PM Alexander Popov <alex.popov@linux.com> wrote: > >> > >> On 07.09.2020 16:53, Muchun Song wrote: > >>> On Mon, Sep 7, 2020 at 7:24 PM Alexander Popov <alex.popov@linux.com> wrote: > >>>> > >>>> On 07.09.2020 05:54, Muchun Song wrote: > >>>>> Hi all, > >>>>> > >>>>> Any comments or suggestions? Thanks. > >>>>> > >>>>> On Fri, Aug 28, 2020 at 11:19 AM Muchun Song <songmuchun@bytedance.com> wrote: > >>>>>> > >>>>>> There is a race between the assignment of `table->data` and write value > >>>>>> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on > >>>>>> the other thread. > >>>>>> > >>>>>> CPU0: CPU1: > >>>>>> proc_sys_write > >>>>>> stack_erasing_sysctl proc_sys_call_handler > >>>>>> table->data = &state; stack_erasing_sysctl > >>>>>> table->data = &state; > >>>>>> proc_doulongvec_minmax > >>>>>> do_proc_doulongvec_minmax sysctl_head_finish > >>>>>> __do_proc_doulongvec_minmax unuse_table > >>>>>> i = table->data; > >>>>>> *i = val; // corrupt CPU1's stack > >>>> > >>>> Hello everyone! > >>>> > >>>> As I remember, I implemented stack_erasing_sysctl() very similar to other sysctl > >>>> handlers. Is that issue relevant for other handlers as well? > >>> > >>> Yeah, it's very similar. But the difference is that others use a > >>> global variable as the > >>> `table->data`, but here we use a local variable as the `table->data`. > >>> The local variable > >>> is allocated from the stack. So other thread could corrupt the stack > >>> like the diagram > >>> above. > >> > >> Hi Muchun, > >> > >> I don't think that the proposed copying of struct ctl_table to local variable is > >> a good fix of that issue. There might be other bugs caused by concurrent > >> execution of stack_erasing_sysctl(). > > > > Hi Alexander, > > > > Yeah, we can fix this issue on a higher level in kernel/sysctl.c. But > > we will rework some kernel/sysctl.c base code. Because the commit: > > > > 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > > > > is introduced from linux-4.20. So we should backport this fix patch to the other > > stable tree. Be the safe side, we can apply this patch to only fix the > > stack_erasing_sysctl. In this case, the impact of backport is minimal. > > > > In the feature, we can fix the issue(another patch) like this on a higher > > level in kernel/sysctl.c and only apply it in the later kernel version. Is > > this OK? > > Muchun, I would recommend: > 1) fixing the reason of the issue in kernel/sysctl.c > or > 2) use some locking in stack_erasing_sysctl() to fix the issue locally. Yeah, this is work. > > Honestly, I don't like this "dup_table" approach in the patch below. It doesn't > remove the data race. Alexander, I don't understand where the race is? I think that the duplicate is enough. But If you prefer using the lock to protect the data. I also can do that. > > Thank you! > Alexander > > >> I would recommend using some locking instead. > >> > >> But you say there are other similar issues. Should it be fixed on higher level > >> in kernel/sysctl.c? > >> > >> [Adding more knowing people to CC] > >> > >> Thanks! > >> > >>>> Muchun, could you elaborate how CPU1's stack is corrupted and how you detected > >>>> that? Thanks! > >>> > >>> Why did I find this problem? Because I solve another problem which is > >>> very similar to > >>> this issue. You can reference the following fix patch. Thanks. > >>> > >>> https://lkml.org/lkml/2020/8/22/105 > >>>> > >>>>>> Fix this by duplicating the `table`, and only update the duplicate of > >>>>>> it. > >>>>>> > >>>>>> Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing") > >>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >>>>>> --- > >>>>>> changelogs in v2: > >>>>>> 1. Add more details about how the race happened to the commit message. > >>>>>> > >>>>>> kernel/stackleak.c | 11 ++++++++--- > >>>>>> 1 file changed, 8 insertions(+), 3 deletions(-) > >>>>>> > >>>>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c > >>>>>> index a8fc9ae1d03d..fd95b87478ff 100644 > >>>>>> --- a/kernel/stackleak.c > >>>>>> +++ b/kernel/stackleak.c > >>>>>> @@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, > >>>>>> int ret = 0; > >>>>>> int state = !static_branch_unlikely(&stack_erasing_bypass); > >>>>>> int prev_state = state; > >>>>>> + struct ctl_table dup_table = *table; > >>>>>> > >>>>>> - table->data = &state; > >>>>>> - table->maxlen = sizeof(int); > >>>>>> - ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); > >>>>>> + /* > >>>>>> + * In order to avoid races with __do_proc_doulongvec_minmax(), we > >>>>>> + * can duplicate the @table and alter the duplicate of it. > >>>>>> + */ > >>>>>> + dup_table.data = &state; > >>>>>> + dup_table.maxlen = sizeof(int); > >>>>>> + ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos); > >>>>>> state = !!state; > >>>>>> if (ret || !write || state == prev_state) > >>>>>> return ret; > >>>>>> -- > >>>>>> 2.11.0 > > > > > > > -- Yours, Muchun ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2020-09-28 7:31 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-08-28 3:19 [PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers Muchun Song 2020-09-07 2:54 ` Muchun Song 2020-09-07 11:24 ` Alexander Popov 2020-09-07 13:53 ` [External] " Muchun Song 2020-09-11 3:53 ` Muchun Song 2020-09-14 13:56 ` Alexander Popov 2020-09-14 14:09 ` Muchun Song 2020-09-22 5:59 ` Muchun Song 2020-09-28 6:32 ` Alexander Popov 2020-09-28 7:30 ` Muchun Song
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox