* [PATCH] mm: change memcg->oom_group access with atomic operations @ 2023-02-20 15:16 Yue Zhao 2023-02-20 21:09 ` Roman Gushchin 0 siblings, 1 reply; 26+ messages in thread From: Yue Zhao @ 2023-02-20 15:16 UTC (permalink / raw) To: linux-mm Cc: akpm, roman.gushchin, hannes, mhocko, shakeelb, muchun.song, cgroups, linux-kernel, Yue Zhao The knob for cgroup v2 memory controller: memory.oom.group will be read and written simultaneously by user space programs, thus we'd better change memcg->oom_group access with atomic operations to avoid concurrency problems. Signed-off-by: Yue Zhao <findns94@gmail.com> --- mm/memcontrol.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 73afff8062f9..e4695fb80bda 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2057,7 +2057,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim, * highest-level memory cgroup with oom.group set. */ for (; memcg; memcg = parent_mem_cgroup(memcg)) { - if (memcg->oom_group) + if (READ_ONCE(memcg->oom_group)) oom_group = memcg; if (memcg == oom_domain) @@ -6569,7 +6569,7 @@ static int memory_oom_group_show(struct seq_file *m, void *v) { struct mem_cgroup *memcg = mem_cgroup_from_seq(m); - seq_printf(m, "%d\n", memcg->oom_group); + seq_printf(m, "%d\n", READ_ONCE(memcg->oom_group)); return 0; } @@ -6591,7 +6591,7 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of, if (oom_group != 0 && oom_group != 1) return -EINVAL; - memcg->oom_group = oom_group; + WRITE_ONCE(memcg->oom_group, oom_group); return nbytes; } -- 2.17.1 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-20 15:16 [PATCH] mm: change memcg->oom_group access with atomic operations Yue Zhao @ 2023-02-20 21:09 ` Roman Gushchin 2023-02-20 23:06 ` Shakeel Butt 0 siblings, 1 reply; 26+ messages in thread From: Roman Gushchin @ 2023-02-20 21:09 UTC (permalink / raw) To: Yue Zhao Cc: linux-mm, akpm, hannes, mhocko, shakeelb, muchun.song, cgroups, linux-kernel On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > The knob for cgroup v2 memory controller: memory.oom.group > will be read and written simultaneously by user space > programs, thus we'd better change memcg->oom_group access > with atomic operations to avoid concurrency problems. > > Signed-off-by: Yue Zhao <findns94@gmail.com> Hi Yue! I'm curious, have any seen any real issues which your patch is solving? Can you, please, provide a bit more details. Also, WRITE/READ_ONCE() don't generally make operations atomic, they only prevent the compiler from merging and re-fetching reads and writes. Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-20 21:09 ` Roman Gushchin @ 2023-02-20 23:06 ` Shakeel Butt 2023-02-21 5:17 ` Roman Gushchin 2023-02-21 8:26 ` Michal Hocko 0 siblings, 2 replies; 26+ messages in thread From: Shakeel Butt @ 2023-02-20 23:06 UTC (permalink / raw) To: Roman Gushchin Cc: Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > The knob for cgroup v2 memory controller: memory.oom.group > > will be read and written simultaneously by user space > > programs, thus we'd better change memcg->oom_group access > > with atomic operations to avoid concurrency problems. > > > > Signed-off-by: Yue Zhao <findns94@gmail.com> > > Hi Yue! > > I'm curious, have any seen any real issues which your patch is solving? > Can you, please, provide a bit more details. > IMHO such details are not needed. oom_group is being accessed concurrently and one of them can be a write access. At least READ_ONCE/WRITE_ONCE is needed here. Most probably syzbot didn't catch this race because it does not know about the memory.oom.group interface. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-20 23:06 ` Shakeel Butt @ 2023-02-21 5:17 ` Roman Gushchin 2023-02-21 6:52 ` Shakeel Butt ` (2 more replies) 2023-02-21 8:26 ` Michal Hocko 1 sibling, 3 replies; 26+ messages in thread From: Roman Gushchin @ 2023-02-21 5:17 UTC (permalink / raw) To: Shakeel Butt Cc: Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: >>> The knob for cgroup v2 memory controller: memory.oom.group >>> will be read and written simultaneously by user space >>> programs, thus we'd better change memcg->oom_group access >>> with atomic operations to avoid concurrency problems. >>> >>> Signed-off-by: Yue Zhao <findns94@gmail.com> >> >> Hi Yue! >> >> I'm curious, have any seen any real issues which your patch is solving? >> Can you, please, provide a bit more details. >> > > IMHO such details are not needed. oom_group is being accessed > concurrently and one of them can be a write access. At least > READ_ONCE/WRITE_ONCE is needed here. Needed for what? I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill. My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these. Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE(). Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 5:17 ` Roman Gushchin @ 2023-02-21 6:52 ` Shakeel Butt 2023-02-21 13:51 ` Matthew Wilcox 2023-02-21 17:00 ` Martin Zhao 2023-02-21 7:22 ` Muchun Song 2023-02-21 17:00 ` Martin Zhao 2 siblings, 2 replies; 26+ messages in thread From: Shakeel Butt @ 2023-02-21 6:52 UTC (permalink / raw) To: Roman Gushchin Cc: Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > >>> The knob for cgroup v2 memory controller: memory.oom.group > >>> will be read and written simultaneously by user space > >>> programs, thus we'd better change memcg->oom_group access > >>> with atomic operations to avoid concurrency problems. > >>> > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > >> > >> Hi Yue! > >> > >> I'm curious, have any seen any real issues which your patch is solving? > >> Can you, please, provide a bit more details. > >> > > > > IMHO such details are not needed. oom_group is being accessed > > concurrently and one of them can be a write access. At least > > READ_ONCE/WRITE_ONCE is needed here. > > Needed for what? For this particular case, documenting such an access. Though I don't think there are any architectures which may tear a one byte read/write and merging/refetching is not an issue for this. > > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill. > > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these. > > Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE() Yeah and those are v1 interfaces e.g. oom_kill_disable, swappiness, soft_limit. These definitely need [READ|WRITE]_ONCE primitive. Yue, can you update your patch and convert all accesses to these fields through [READ|WRITE]_ONCE ? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 6:52 ` Shakeel Butt @ 2023-02-21 13:51 ` Matthew Wilcox 2023-02-21 16:56 ` Shakeel Butt ` (2 more replies) 2023-02-21 17:00 ` Martin Zhao 1 sibling, 3 replies; 26+ messages in thread From: Matthew Wilcox @ 2023-02-21 13:51 UTC (permalink / raw) To: Shakeel Butt Cc: Roman Gushchin, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > >>> will be read and written simultaneously by user space > > >>> programs, thus we'd better change memcg->oom_group access > > >>> with atomic operations to avoid concurrency problems. > > >>> > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > >> > > >> Hi Yue! > > >> > > >> I'm curious, have any seen any real issues which your patch is solving? > > >> Can you, please, provide a bit more details. > > >> > > > > > > IMHO such details are not needed. oom_group is being accessed > > > concurrently and one of them can be a write access. At least > > > READ_ONCE/WRITE_ONCE is needed here. > > > > Needed for what? > > For this particular case, documenting such an access. Though I don't > think there are any architectures which may tear a one byte read/write > and merging/refetching is not an issue for this. Wouldn't a compiler be within its rights to implement a one byte store as: load-word modify-byte-in-word store-word and if this is a lockless store to a word which has an adjacent byte also being modified by another CPU, one of those CPUs can lose its store? And WRITE_ONCE would prevent the compiler from implementing the store in that way. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 13:51 ` Matthew Wilcox @ 2023-02-21 16:56 ` Shakeel Butt 2023-02-21 18:23 ` Paul E. McKenney 2023-02-21 17:47 ` Roman Gushchin 2023-02-22 9:01 ` David Laight 2 siblings, 1 reply; 26+ messages in thread From: Shakeel Butt @ 2023-02-21 16:56 UTC (permalink / raw) To: Matthew Wilcox, Paul E. McKenney, Marco Elver Cc: Roman Gushchin, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel +Paul & Marco On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > >>> will be read and written simultaneously by user space > > > >>> programs, thus we'd better change memcg->oom_group access > > > >>> with atomic operations to avoid concurrency problems. > > > >>> > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > >> > > > >> Hi Yue! > > > >> > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > >> Can you, please, provide a bit more details. > > > >> > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > concurrently and one of them can be a write access. At least > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > Needed for what? > > > > For this particular case, documenting such an access. Though I don't > > think there are any architectures which may tear a one byte read/write > > and merging/refetching is not an issue for this. > > Wouldn't a compiler be within its rights to implement a one byte store as: > > load-word > modify-byte-in-word > store-word > > and if this is a lockless store to a word which has an adjacent byte also > being modified by another CPU, one of those CPUs can lose its store? > And WRITE_ONCE would prevent the compiler from implementing the store > in that way. > Thanks Willy for pointing this out. If the compiler can really do this then [READ|WRITE]_ONCE are required here. I always have big bad compiler lwn article open in a tab. I couldn't map this transformation to ones mentioned in that article. Do we have name of this one? thanks, Shakeel ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 16:56 ` Shakeel Butt @ 2023-02-21 18:23 ` Paul E. McKenney 2023-02-21 22:23 ` Roman Gushchin 0 siblings, 1 reply; 26+ messages in thread From: Paul E. McKenney @ 2023-02-21 18:23 UTC (permalink / raw) To: Shakeel Butt Cc: Matthew Wilcox, Marco Elver, Roman Gushchin, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: > +Paul & Marco > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > >>> will be read and written simultaneously by user space > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > >>> with atomic operations to avoid concurrency problems. > > > > >>> > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > >> > > > > >> Hi Yue! > > > > >> > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > >> Can you, please, provide a bit more details. > > > > >> > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > concurrently and one of them can be a write access. At least > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > Needed for what? > > > > > > For this particular case, documenting such an access. Though I don't > > > think there are any architectures which may tear a one byte read/write > > > and merging/refetching is not an issue for this. > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > load-word > > modify-byte-in-word > > store-word > > > > and if this is a lockless store to a word which has an adjacent byte also > > being modified by another CPU, one of those CPUs can lose its store? > > And WRITE_ONCE would prevent the compiler from implementing the store > > in that way. > > Thanks Willy for pointing this out. If the compiler can really do this > then [READ|WRITE]_ONCE are required here. I always have big bad > compiler lwn article open in a tab. I couldn't map this transformation > to ones mentioned in that article. Do we have name of this one? No, recent compilers are absolutely forbidden from doing this sort of thing except under very special circumstances. Before C11, compilers could and in fact did do things like this. This is after all a great way to keep the CPU's vector unit from getting bored. Unfortunately for those who prize optimization above all else, doing this can introduce data races, for example: char a; char b; spin_lock la; spin_lock lb; void change_a(char new_a) { spin_lock(&la); a = new_a; spin_unlock(&la); } void change_b(char new_b) { spin_lock(&lb); b = new_b; spin_unlock(&lb); } If the compiler "optimized" that "a = new_a" so as to produce a non-atomic read-modify-write sequence, it would be introducing a data race. And since C11, the compiler is absolutely forbidden from introducing data races. So, again, no, the compiler cannot invent writes to variables. What are those very special circumstances? 1. The other variables were going to be written to anyway, and none of the writes was non-volatile and there was no ordering directive between any of those writes. 2. The other variables are dead, as in there are no subsequent reads from them anywhere in the program. Of course in that case, there is no need to read the prior values of those variables. 3. All accesses to all of the variables are visible to the compiler, and the compiler can prove that there are no concurrent accesses to any of them. For example, all of the variables are on-stack variables whose addresses are never taken. Does that help, or am I misunderstanding the question? Thanx, Paul ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 18:23 ` Paul E. McKenney @ 2023-02-21 22:23 ` Roman Gushchin 2023-02-21 22:38 ` Paul E. McKenney 0 siblings, 1 reply; 26+ messages in thread From: Roman Gushchin @ 2023-02-21 22:23 UTC (permalink / raw) To: Paul E. McKenney Cc: Shakeel Butt, Matthew Wilcox, Marco Elver, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote: > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: > > +Paul & Marco > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > > >>> will be read and written simultaneously by user space > > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > > >>> with atomic operations to avoid concurrency problems. > > > > > >>> > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > > >> > > > > > >> Hi Yue! > > > > > >> > > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > > >> Can you, please, provide a bit more details. > > > > > >> > > > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > > concurrently and one of them can be a write access. At least > > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > > > Needed for what? > > > > > > > > For this particular case, documenting such an access. Though I don't > > > > think there are any architectures which may tear a one byte read/write > > > > and merging/refetching is not an issue for this. > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > > > load-word > > > modify-byte-in-word > > > store-word > > > > > > and if this is a lockless store to a word which has an adjacent byte also > > > being modified by another CPU, one of those CPUs can lose its store? > > > And WRITE_ONCE would prevent the compiler from implementing the store > > > in that way. > > > > Thanks Willy for pointing this out. If the compiler can really do this > > then [READ|WRITE]_ONCE are required here. I always have big bad > > compiler lwn article open in a tab. I couldn't map this transformation > > to ones mentioned in that article. Do we have name of this one? > > No, recent compilers are absolutely forbidden from doing this sort of > thing except under very special circumstances. > > Before C11, compilers could and in fact did do things like this. This is > after all a great way to keep the CPU's vector unit from getting bored. > Unfortunately for those who prize optimization above all else, doing > this can introduce data races, for example: > > char a; > char b; > spin_lock la; > spin_lock lb; > > void change_a(char new_a) > { > spin_lock(&la); > a = new_a; > spin_unlock(&la); > } > > void change_b(char new_b) > { > spin_lock(&lb); > b = new_b; > spin_unlock(&lb); > } > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic > read-modify-write sequence, it would be introducing a data race. > And since C11, the compiler is absolutely forbidden from introducing > data races. So, again, no, the compiler cannot invent writes to > variables. > > What are those very special circumstances? > > 1. The other variables were going to be written to anyway, and > none of the writes was non-volatile and there was no ordering > directive between any of those writes. > > 2. The other variables are dead, as in there are no subsequent > reads from them anywhere in the program. Of course in that case, > there is no need to read the prior values of those variables. > > 3. All accesses to all of the variables are visible to the compiler, > and the compiler can prove that there are no concurrent accesses > to any of them. For example, all of the variables are on-stack > variables whose addresses are never taken. > > Does that help, or am I misunderstanding the question? Thank you, Paul! So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here. Or I still miss something? Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 22:23 ` Roman Gushchin @ 2023-02-21 22:38 ` Paul E. McKenney 2023-02-21 23:13 ` Shakeel Butt 0 siblings, 1 reply; 26+ messages in thread From: Paul E. McKenney @ 2023-02-21 22:38 UTC (permalink / raw) To: Roman Gushchin Cc: Shakeel Butt, Matthew Wilcox, Marco Elver, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote: > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote: > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: > > > +Paul & Marco > > > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > > > >>> will be read and written simultaneously by user space > > > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > > > >>> with atomic operations to avoid concurrency problems. > > > > > > >>> > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > > > >> > > > > > > >> Hi Yue! > > > > > > >> > > > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > > > >> Can you, please, provide a bit more details. > > > > > > >> > > > > > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > > > concurrently and one of them can be a write access. At least > > > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > > > > > Needed for what? > > > > > > > > > > For this particular case, documenting such an access. Though I don't > > > > > think there are any architectures which may tear a one byte read/write > > > > > and merging/refetching is not an issue for this. > > > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > > > > > load-word > > > > modify-byte-in-word > > > > store-word > > > > > > > > and if this is a lockless store to a word which has an adjacent byte also > > > > being modified by another CPU, one of those CPUs can lose its store? > > > > And WRITE_ONCE would prevent the compiler from implementing the store > > > > in that way. > > > > > > Thanks Willy for pointing this out. If the compiler can really do this > > > then [READ|WRITE]_ONCE are required here. I always have big bad > > > compiler lwn article open in a tab. I couldn't map this transformation > > > to ones mentioned in that article. Do we have name of this one? > > > > No, recent compilers are absolutely forbidden from doing this sort of > > thing except under very special circumstances. > > > > Before C11, compilers could and in fact did do things like this. This is > > after all a great way to keep the CPU's vector unit from getting bored. > > Unfortunately for those who prize optimization above all else, doing > > this can introduce data races, for example: > > > > char a; > > char b; > > spin_lock la; > > spin_lock lb; > > > > void change_a(char new_a) > > { > > spin_lock(&la); > > a = new_a; > > spin_unlock(&la); > > } > > > > void change_b(char new_b) > > { > > spin_lock(&lb); > > b = new_b; > > spin_unlock(&lb); > > } > > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic > > read-modify-write sequence, it would be introducing a data race. > > And since C11, the compiler is absolutely forbidden from introducing > > data races. So, again, no, the compiler cannot invent writes to > > variables. > > > > What are those very special circumstances? > > > > 1. The other variables were going to be written to anyway, and > > none of the writes was non-volatile and there was no ordering > > directive between any of those writes. > > > > 2. The other variables are dead, as in there are no subsequent > > reads from them anywhere in the program. Of course in that case, > > there is no need to read the prior values of those variables. > > > > 3. All accesses to all of the variables are visible to the compiler, > > and the compiler can prove that there are no concurrent accesses > > to any of them. For example, all of the variables are on-stack > > variables whose addresses are never taken. > > > > Does that help, or am I misunderstanding the question? > > Thank you, Paul! > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here. > Or I still miss something? Yes, given that the compiler will already avoid inventing data-race-prone C-language accesses to shared variables, so if that was the only reason that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and WRITE_ONCE() won't be helping you. Or perhaps better to put it a different way... The fact that the compiler is not permitted to invent data-racy reads and writes is exactly why you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in lock-based critical sections. Instead, you only need READ_ONCE() and WRITE_ONCE() when you have lockless accesses to the same shared variables. Thanx, Paul ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 22:38 ` Paul E. McKenney @ 2023-02-21 23:13 ` Shakeel Butt 2023-02-21 23:38 ` Paul E. McKenney 0 siblings, 1 reply; 26+ messages in thread From: Shakeel Butt @ 2023-02-21 23:13 UTC (permalink / raw) To: paulmck Cc: Roman Gushchin, Matthew Wilcox, Marco Elver, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote: > > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote: > > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: > > > > +Paul & Marco > > > > > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > > > > >>> will be read and written simultaneously by user space > > > > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > > > > >>> with atomic operations to avoid concurrency problems. > > > > > > > >>> > > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > > > > >> > > > > > > > >> Hi Yue! > > > > > > > >> > > > > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > > > > >> Can you, please, provide a bit more details. > > > > > > > >> > > > > > > > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > > > > concurrently and one of them can be a write access. At least > > > > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > > > > > > > Needed for what? > > > > > > > > > > > > For this particular case, documenting such an access. Though I don't > > > > > > think there are any architectures which may tear a one byte read/write > > > > > > and merging/refetching is not an issue for this. > > > > > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > > > > > > > load-word > > > > > modify-byte-in-word > > > > > store-word > > > > > > > > > > and if this is a lockless store to a word which has an adjacent byte also > > > > > being modified by another CPU, one of those CPUs can lose its store? > > > > > And WRITE_ONCE would prevent the compiler from implementing the store > > > > > in that way. > > > > > > > > Thanks Willy for pointing this out. If the compiler can really do this > > > > then [READ|WRITE]_ONCE are required here. I always have big bad > > > > compiler lwn article open in a tab. I couldn't map this transformation > > > > to ones mentioned in that article. Do we have name of this one? > > > > > > No, recent compilers are absolutely forbidden from doing this sort of > > > thing except under very special circumstances. > > > > > > Before C11, compilers could and in fact did do things like this. This is > > > after all a great way to keep the CPU's vector unit from getting bored. > > > Unfortunately for those who prize optimization above all else, doing > > > this can introduce data races, for example: > > > > > > char a; > > > char b; > > > spin_lock la; > > > spin_lock lb; > > > > > > void change_a(char new_a) > > > { > > > spin_lock(&la); > > > a = new_a; > > > spin_unlock(&la); > > > } > > > > > > void change_b(char new_b) > > > { > > > spin_lock(&lb); > > > b = new_b; > > > spin_unlock(&lb); > > > } > > > > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic > > > read-modify-write sequence, it would be introducing a data race. > > > And since C11, the compiler is absolutely forbidden from introducing > > > data races. So, again, no, the compiler cannot invent writes to > > > variables. > > > > > > What are those very special circumstances? > > > > > > 1. The other variables were going to be written to anyway, and > > > none of the writes was non-volatile and there was no ordering > > > directive between any of those writes. > > > > > > 2. The other variables are dead, as in there are no subsequent > > > reads from them anywhere in the program. Of course in that case, > > > there is no need to read the prior values of those variables. > > > > > > 3. All accesses to all of the variables are visible to the compiler, > > > and the compiler can prove that there are no concurrent accesses > > > to any of them. For example, all of the variables are on-stack > > > variables whose addresses are never taken. > > > > > > Does that help, or am I misunderstanding the question? > > > > Thank you, Paul! > > > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here. > > Or I still miss something? > > Yes, given that the compiler will already avoid inventing data-race-prone > C-language accesses to shared variables, so if that was the only reason > that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and > WRITE_ONCE() won't be helping you. > > Or perhaps better to put it a different way... The fact that the compiler > is not permitted to invent data-racy reads and writes is exactly why > you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in > lock-based critical sections. Instead, you only need READ_ONCE() and > WRITE_ONCE() when you have lockless accesses to the same shared variables. > This is lockless access to memcg->oom_group potentially from multiple CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 23:13 ` Shakeel Butt @ 2023-02-21 23:38 ` Paul E. McKenney 2023-02-21 23:57 ` Roman Gushchin 0 siblings, 1 reply; 26+ messages in thread From: Paul E. McKenney @ 2023-02-21 23:38 UTC (permalink / raw) To: Shakeel Butt Cc: Roman Gushchin, Matthew Wilcox, Marco Elver, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote: > On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote: > > > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote: > > > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: > > > > > +Paul & Marco > > > > > > > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > > > > > >>> will be read and written simultaneously by user space > > > > > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > > > > > >>> with atomic operations to avoid concurrency problems. > > > > > > > > >>> > > > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > > > > > >> > > > > > > > > >> Hi Yue! > > > > > > > > >> > > > > > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > > > > > >> Can you, please, provide a bit more details. > > > > > > > > >> > > > > > > > > > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > > > > > concurrently and one of them can be a write access. At least > > > > > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > > > > > > > > > Needed for what? > > > > > > > > > > > > > > For this particular case, documenting such an access. Though I don't > > > > > > > think there are any architectures which may tear a one byte read/write > > > > > > > and merging/refetching is not an issue for this. > > > > > > > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > > > > > > > > > load-word > > > > > > modify-byte-in-word > > > > > > store-word > > > > > > > > > > > > and if this is a lockless store to a word which has an adjacent byte also > > > > > > being modified by another CPU, one of those CPUs can lose its store? > > > > > > And WRITE_ONCE would prevent the compiler from implementing the store > > > > > > in that way. > > > > > > > > > > Thanks Willy for pointing this out. If the compiler can really do this > > > > > then [READ|WRITE]_ONCE are required here. I always have big bad > > > > > compiler lwn article open in a tab. I couldn't map this transformation > > > > > to ones mentioned in that article. Do we have name of this one? > > > > > > > > No, recent compilers are absolutely forbidden from doing this sort of > > > > thing except under very special circumstances. > > > > > > > > Before C11, compilers could and in fact did do things like this. This is > > > > after all a great way to keep the CPU's vector unit from getting bored. > > > > Unfortunately for those who prize optimization above all else, doing > > > > this can introduce data races, for example: > > > > > > > > char a; > > > > char b; > > > > spin_lock la; > > > > spin_lock lb; > > > > > > > > void change_a(char new_a) > > > > { > > > > spin_lock(&la); > > > > a = new_a; > > > > spin_unlock(&la); > > > > } > > > > > > > > void change_b(char new_b) > > > > { > > > > spin_lock(&lb); > > > > b = new_b; > > > > spin_unlock(&lb); > > > > } > > > > > > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic > > > > read-modify-write sequence, it would be introducing a data race. > > > > And since C11, the compiler is absolutely forbidden from introducing > > > > data races. So, again, no, the compiler cannot invent writes to > > > > variables. > > > > > > > > What are those very special circumstances? > > > > > > > > 1. The other variables were going to be written to anyway, and > > > > none of the writes was non-volatile and there was no ordering > > > > directive between any of those writes. > > > > > > > > 2. The other variables are dead, as in there are no subsequent > > > > reads from them anywhere in the program. Of course in that case, > > > > there is no need to read the prior values of those variables. > > > > > > > > 3. All accesses to all of the variables are visible to the compiler, > > > > and the compiler can prove that there are no concurrent accesses > > > > to any of them. For example, all of the variables are on-stack > > > > variables whose addresses are never taken. > > > > > > > > Does that help, or am I misunderstanding the question? > > > > > > Thank you, Paul! > > > > > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here. > > > Or I still miss something? > > > > Yes, given that the compiler will already avoid inventing data-race-prone > > C-language accesses to shared variables, so if that was the only reason > > that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and > > WRITE_ONCE() won't be helping you. > > > > Or perhaps better to put it a different way... The fact that the compiler > > is not permitted to invent data-racy reads and writes is exactly why > > you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in > > lock-based critical sections. Instead, you only need READ_ONCE() and > > WRITE_ONCE() when you have lockless accesses to the same shared variables. > > This is lockless access to memcg->oom_group potentially from multiple > CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right? Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE(). And if either conflicting access is lockless, it is lockless. ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 23:38 ` Paul E. McKenney @ 2023-02-21 23:57 ` Roman Gushchin 2023-02-22 0:37 ` Paul E. McKenney 0 siblings, 1 reply; 26+ messages in thread From: Roman Gushchin @ 2023-02-21 23:57 UTC (permalink / raw) To: Paul E. McKenney Cc: Shakeel Butt, Matthew Wilcox, Marco Elver, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 03:38:24PM -0800, Paul E. McKenney wrote: > On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote: > > On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > > > On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote: > > > > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote: > > > > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: > > > > > > +Paul & Marco > > > > > > > > > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > > > > > > >>> will be read and written simultaneously by user space > > > > > > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > > > > > > >>> with atomic operations to avoid concurrency problems. > > > > > > > > > >>> > > > > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > > > > > > >> > > > > > > > > > >> Hi Yue! > > > > > > > > > >> > > > > > > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > > > > > > >> Can you, please, provide a bit more details. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > > > > > > concurrently and one of them can be a write access. At least > > > > > > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > > > > > > > > > > > Needed for what? > > > > > > > > > > > > > > > > For this particular case, documenting such an access. Though I don't > > > > > > > > think there are any architectures which may tear a one byte read/write > > > > > > > > and merging/refetching is not an issue for this. > > > > > > > > > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > > > > > > > > > > > load-word > > > > > > > modify-byte-in-word > > > > > > > store-word > > > > > > > > > > > > > > and if this is a lockless store to a word which has an adjacent byte also > > > > > > > being modified by another CPU, one of those CPUs can lose its store? > > > > > > > And WRITE_ONCE would prevent the compiler from implementing the store > > > > > > > in that way. > > > > > > > > > > > > Thanks Willy for pointing this out. If the compiler can really do this > > > > > > then [READ|WRITE]_ONCE are required here. I always have big bad > > > > > > compiler lwn article open in a tab. I couldn't map this transformation > > > > > > to ones mentioned in that article. Do we have name of this one? > > > > > > > > > > No, recent compilers are absolutely forbidden from doing this sort of > > > > > thing except under very special circumstances. > > > > > > > > > > Before C11, compilers could and in fact did do things like this. This is > > > > > after all a great way to keep the CPU's vector unit from getting bored. > > > > > Unfortunately for those who prize optimization above all else, doing > > > > > this can introduce data races, for example: > > > > > > > > > > char a; > > > > > char b; > > > > > spin_lock la; > > > > > spin_lock lb; > > > > > > > > > > void change_a(char new_a) > > > > > { > > > > > spin_lock(&la); > > > > > a = new_a; > > > > > spin_unlock(&la); > > > > > } > > > > > > > > > > void change_b(char new_b) > > > > > { > > > > > spin_lock(&lb); > > > > > b = new_b; > > > > > spin_unlock(&lb); > > > > > } > > > > > > > > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic > > > > > read-modify-write sequence, it would be introducing a data race. > > > > > And since C11, the compiler is absolutely forbidden from introducing > > > > > data races. So, again, no, the compiler cannot invent writes to > > > > > variables. > > > > > > > > > > What are those very special circumstances? > > > > > > > > > > 1. The other variables were going to be written to anyway, and > > > > > none of the writes was non-volatile and there was no ordering > > > > > directive between any of those writes. > > > > > > > > > > 2. The other variables are dead, as in there are no subsequent > > > > > reads from them anywhere in the program. Of course in that case, > > > > > there is no need to read the prior values of those variables. > > > > > > > > > > 3. All accesses to all of the variables are visible to the compiler, > > > > > and the compiler can prove that there are no concurrent accesses > > > > > to any of them. For example, all of the variables are on-stack > > > > > variables whose addresses are never taken. > > > > > > > > > > Does that help, or am I misunderstanding the question? > > > > > > > > Thank you, Paul! > > > > > > > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here. > > > > Or I still miss something? > > > > > > Yes, given that the compiler will already avoid inventing data-race-prone > > > C-language accesses to shared variables, so if that was the only reason > > > that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and > > > WRITE_ONCE() won't be helping you. > > > > > > Or perhaps better to put it a different way... The fact that the compiler > > > is not permitted to invent data-racy reads and writes is exactly why > > > you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in > > > lock-based critical sections. Instead, you only need READ_ONCE() and > > > WRITE_ONCE() when you have lockless accesses to the same shared variables. > > > > This is lockless access to memcg->oom_group potentially from multiple > > CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right? > > Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE(). > And if either conflicting access is lockless, it is lockless. ;-) Now I'm confused, why we should use it here? Writing is happening from a separate syscall (a single write from a syscall), reading is happening from a oom context. The variable is boolean, it's either 0 or 1. What difference READ_ONCE()/WRITE_ONCE() will make here? Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 23:57 ` Roman Gushchin @ 2023-02-22 0:37 ` Paul E. McKenney 2023-02-22 4:28 ` Roman Gushchin 0 siblings, 1 reply; 26+ messages in thread From: Paul E. McKenney @ 2023-02-22 0:37 UTC (permalink / raw) To: Roman Gushchin Cc: Shakeel Butt, Matthew Wilcox, Marco Elver, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 03:57:58PM -0800, Roman Gushchin wrote: > On Tue, Feb 21, 2023 at 03:38:24PM -0800, Paul E. McKenney wrote: > > On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote: > > > On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > > > > > On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote: > > > > > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote: > > > > > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: > > > > > > > +Paul & Marco > > > > > > > > > > > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > > > > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > > > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > > > > > > > >>> will be read and written simultaneously by user space > > > > > > > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > > > > > > > >>> with atomic operations to avoid concurrency problems. > > > > > > > > > > >>> > > > > > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > > > > > > > >> > > > > > > > > > > >> Hi Yue! > > > > > > > > > > >> > > > > > > > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > > > > > > > >> Can you, please, provide a bit more details. > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > > > > > > > concurrently and one of them can be a write access. At least > > > > > > > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > > > > > > > > > > > > > Needed for what? > > > > > > > > > > > > > > > > > > For this particular case, documenting such an access. Though I don't > > > > > > > > > think there are any architectures which may tear a one byte read/write > > > > > > > > > and merging/refetching is not an issue for this. > > > > > > > > > > > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > > > > > > > > > > > > > load-word > > > > > > > > modify-byte-in-word > > > > > > > > store-word > > > > > > > > > > > > > > > > and if this is a lockless store to a word which has an adjacent byte also > > > > > > > > being modified by another CPU, one of those CPUs can lose its store? > > > > > > > > And WRITE_ONCE would prevent the compiler from implementing the store > > > > > > > > in that way. > > > > > > > > > > > > > > Thanks Willy for pointing this out. If the compiler can really do this > > > > > > > then [READ|WRITE]_ONCE are required here. I always have big bad > > > > > > > compiler lwn article open in a tab. I couldn't map this transformation > > > > > > > to ones mentioned in that article. Do we have name of this one? > > > > > > > > > > > > No, recent compilers are absolutely forbidden from doing this sort of > > > > > > thing except under very special circumstances. > > > > > > > > > > > > Before C11, compilers could and in fact did do things like this. This is > > > > > > after all a great way to keep the CPU's vector unit from getting bored. > > > > > > Unfortunately for those who prize optimization above all else, doing > > > > > > this can introduce data races, for example: > > > > > > > > > > > > char a; > > > > > > char b; > > > > > > spin_lock la; > > > > > > spin_lock lb; > > > > > > > > > > > > void change_a(char new_a) > > > > > > { > > > > > > spin_lock(&la); > > > > > > a = new_a; > > > > > > spin_unlock(&la); > > > > > > } > > > > > > > > > > > > void change_b(char new_b) > > > > > > { > > > > > > spin_lock(&lb); > > > > > > b = new_b; > > > > > > spin_unlock(&lb); > > > > > > } > > > > > > > > > > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic > > > > > > read-modify-write sequence, it would be introducing a data race. > > > > > > And since C11, the compiler is absolutely forbidden from introducing > > > > > > data races. So, again, no, the compiler cannot invent writes to > > > > > > variables. > > > > > > > > > > > > What are those very special circumstances? > > > > > > > > > > > > 1. The other variables were going to be written to anyway, and > > > > > > none of the writes was non-volatile and there was no ordering > > > > > > directive between any of those writes. > > > > > > > > > > > > 2. The other variables are dead, as in there are no subsequent > > > > > > reads from them anywhere in the program. Of course in that case, > > > > > > there is no need to read the prior values of those variables. > > > > > > > > > > > > 3. All accesses to all of the variables are visible to the compiler, > > > > > > and the compiler can prove that there are no concurrent accesses > > > > > > to any of them. For example, all of the variables are on-stack > > > > > > variables whose addresses are never taken. > > > > > > > > > > > > Does that help, or am I misunderstanding the question? > > > > > > > > > > Thank you, Paul! > > > > > > > > > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here. > > > > > Or I still miss something? > > > > > > > > Yes, given that the compiler will already avoid inventing data-race-prone > > > > C-language accesses to shared variables, so if that was the only reason > > > > that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and > > > > WRITE_ONCE() won't be helping you. > > > > > > > > Or perhaps better to put it a different way... The fact that the compiler > > > > is not permitted to invent data-racy reads and writes is exactly why > > > > you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in > > > > lock-based critical sections. Instead, you only need READ_ONCE() and > > > > WRITE_ONCE() when you have lockless accesses to the same shared variables. > > > > > > This is lockless access to memcg->oom_group potentially from multiple > > > CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right? > > > > Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE(). > > And if either conflicting access is lockless, it is lockless. ;-) > > Now I'm confused, why we should use it here? > Writing is happening from a separate syscall (a single write from a syscall), > reading is happening from a oom context. The variable is boolean, it's either > 0 or 1. What difference READ_ONCE()/WRITE_ONCE() will make here? > Thanks! In practice, not much difference other than documenting shared accesses. Which can be valuable. In theory, when you do a normal C-language store, the compiler is within its rights to use the variable for temporary storage between the time of the last read from that variable and the next write to that variable. Back to practice, I have not heard of this happening for shared variables. On the other hand, compilers really do this for on-stack variables whose addresses are not taken, which is one of the reasons that gdb might say that the variable is optimized out when you try to look at its value. So the potential is there, and if it was my code, I would therefore use READ_ONCE() and WRITE_ONCE(). Thanx, Paul ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-22 0:37 ` Paul E. McKenney @ 2023-02-22 4:28 ` Roman Gushchin 0 siblings, 0 replies; 26+ messages in thread From: Roman Gushchin @ 2023-02-22 4:28 UTC (permalink / raw) To: paulmck Cc: Shakeel Butt, Matthew Wilcox, Marco Elver, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel > On Feb 21, 2023, at 4:38 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Tue, Feb 21, 2023 at 03:57:58PM -0800, Roman Gushchin wrote: >>> On Tue, Feb 21, 2023 at 03:38:24PM -0800, Paul E. McKenney wrote: >>> On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote: >>>> On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote: >>>>> >>>>> On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote: >>>>>> On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote: >>>>>>> On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote: >>>>>>>> +Paul & Marco >>>>>>>> >>>>>>>> On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote: >>>>>>>>> >>>>>>>>> On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: >>>>>>>>>> On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: >>>>>>>>>>>> On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: >>>>>>>>>>>>>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: >>>>>>>>>>>>>> The knob for cgroup v2 memory controller: memory.oom.group >>>>>>>>>>>>>> will be read and written simultaneously by user space >>>>>>>>>>>>>> programs, thus we'd better change memcg->oom_group access >>>>>>>>>>>>>> with atomic operations to avoid concurrency problems. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Signed-off-by: Yue Zhao <findns94@gmail.com> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Yue! >>>>>>>>>>>>> >>>>>>>>>>>>> I'm curious, have any seen any real issues which your patch is solving? >>>>>>>>>>>>> Can you, please, provide a bit more details. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> IMHO such details are not needed. oom_group is being accessed >>>>>>>>>>>> concurrently and one of them can be a write access. At least >>>>>>>>>>>> READ_ONCE/WRITE_ONCE is needed here. >>>>>>>>>>> >>>>>>>>>>> Needed for what? >>>>>>>>>> >>>>>>>>>> For this particular case, documenting such an access. Though I don't >>>>>>>>>> think there are any architectures which may tear a one byte read/write >>>>>>>>>> and merging/refetching is not an issue for this. >>>>>>>>> >>>>>>>>> Wouldn't a compiler be within its rights to implement a one byte store as: >>>>>>>>> >>>>>>>>> load-word >>>>>>>>> modify-byte-in-word >>>>>>>>> store-word >>>>>>>>> >>>>>>>>> and if this is a lockless store to a word which has an adjacent byte also >>>>>>>>> being modified by another CPU, one of those CPUs can lose its store? >>>>>>>>> And WRITE_ONCE would prevent the compiler from implementing the store >>>>>>>>> in that way. >>>>>>>> >>>>>>>> Thanks Willy for pointing this out. If the compiler can really do this >>>>>>>> then [READ|WRITE]_ONCE are required here. I always have big bad >>>>>>>> compiler lwn article open in a tab. I couldn't map this transformation >>>>>>>> to ones mentioned in that article. Do we have name of this one? >>>>>>> >>>>>>> No, recent compilers are absolutely forbidden from doing this sort of >>>>>>> thing except under very special circumstances. >>>>>>> >>>>>>> Before C11, compilers could and in fact did do things like this. This is >>>>>>> after all a great way to keep the CPU's vector unit from getting bored. >>>>>>> Unfortunately for those who prize optimization above all else, doing >>>>>>> this can introduce data races, for example: >>>>>>> >>>>>>> char a; >>>>>>> char b; >>>>>>> spin_lock la; >>>>>>> spin_lock lb; >>>>>>> >>>>>>> void change_a(char new_a) >>>>>>> { >>>>>>> spin_lock(&la); >>>>>>> a = new_a; >>>>>>> spin_unlock(&la); >>>>>>> } >>>>>>> >>>>>>> void change_b(char new_b) >>>>>>> { >>>>>>> spin_lock(&lb); >>>>>>> b = new_b; >>>>>>> spin_unlock(&lb); >>>>>>> } >>>>>>> >>>>>>> If the compiler "optimized" that "a = new_a" so as to produce a non-atomic >>>>>>> read-modify-write sequence, it would be introducing a data race. >>>>>>> And since C11, the compiler is absolutely forbidden from introducing >>>>>>> data races. So, again, no, the compiler cannot invent writes to >>>>>>> variables. >>>>>>> >>>>>>> What are those very special circumstances? >>>>>>> >>>>>>> 1. The other variables were going to be written to anyway, and >>>>>>> none of the writes was non-volatile and there was no ordering >>>>>>> directive between any of those writes. >>>>>>> >>>>>>> 2. The other variables are dead, as in there are no subsequent >>>>>>> reads from them anywhere in the program. Of course in that case, >>>>>>> there is no need to read the prior values of those variables. >>>>>>> >>>>>>> 3. All accesses to all of the variables are visible to the compiler, >>>>>>> and the compiler can prove that there are no concurrent accesses >>>>>>> to any of them. For example, all of the variables are on-stack >>>>>>> variables whose addresses are never taken. >>>>>>> >>>>>>> Does that help, or am I misunderstanding the question? >>>>>> >>>>>> Thank you, Paul! >>>>>> >>>>>> So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here. >>>>>> Or I still miss something? >>>>> >>>>> Yes, given that the compiler will already avoid inventing data-race-prone >>>>> C-language accesses to shared variables, so if that was the only reason >>>>> that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and >>>>> WRITE_ONCE() won't be helping you. >>>>> >>>>> Or perhaps better to put it a different way... The fact that the compiler >>>>> is not permitted to invent data-racy reads and writes is exactly why >>>>> you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in >>>>> lock-based critical sections. Instead, you only need READ_ONCE() and >>>>> WRITE_ONCE() when you have lockless accesses to the same shared variables. >>>> >>>> This is lockless access to memcg->oom_group potentially from multiple >>>> CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right? >>> >>> Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE(). >>> And if either conflicting access is lockless, it is lockless. ;-) >> >> Now I'm confused, why we should use it here? >> Writing is happening from a separate syscall (a single write from a syscall), >> reading is happening from a oom context. The variable is boolean, it's either >> 0 or 1. What difference READ_ONCE()/WRITE_ONCE() will make here? >> Thanks! > > In practice, not much difference other than documenting shared accesses. > Which can be valuable. > > In theory, when you do a normal C-language store, the compiler is within > its rights to use the variable for temporary storage between the time > of the last read from that variable and the next write to that variable. > Back to practice, I have not heard of this happening for shared variables. > On the other hand, compilers really do this for on-stack variables whose > addresses are not taken, which is one of the reasons that gdb might say > that the variable is optimized out when you try to look at its value. > > So the potential is there, and if it was my code, I would therefore use > READ_ONCE() and WRITE_ONCE(). Got it, Paul, thank you for the explanation! It seems like the resolution is that putting READ_ONCE()/WRITE_ONCE() across knobs in mm/memcontrol.c is generally a good idea, but mostly for cosmetic reasons. Yue, can you, please, update the patch? Btw, what a thread! Apparently writing & reading a single boolean is not that simple… :) Thanks for all participants! Roman ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 13:51 ` Matthew Wilcox 2023-02-21 16:56 ` Shakeel Butt @ 2023-02-21 17:47 ` Roman Gushchin 2023-02-21 18:15 ` Shakeel Butt 2023-02-21 18:18 ` Matthew Wilcox 2023-02-22 9:01 ` David Laight 2 siblings, 2 replies; 26+ messages in thread From: Roman Gushchin @ 2023-02-21 17:47 UTC (permalink / raw) To: Matthew Wilcox Cc: Shakeel Butt, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 01:51:29PM +0000, Matthew Wilcox wrote: > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > >>> will be read and written simultaneously by user space > > > >>> programs, thus we'd better change memcg->oom_group access > > > >>> with atomic operations to avoid concurrency problems. > > > >>> > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > >> > > > >> Hi Yue! > > > >> > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > >> Can you, please, provide a bit more details. > > > >> > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > concurrently and one of them can be a write access. At least > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > Needed for what? > > > > For this particular case, documenting such an access. Though I don't > > think there are any architectures which may tear a one byte read/write > > and merging/refetching is not an issue for this. > > Wouldn't a compiler be within its rights to implement a one byte store as: > > load-word > modify-byte-in-word > store-word > > and if this is a lockless store to a word which has an adjacent byte also > being modified by another CPU, one of those CPUs can lose its store? > And WRITE_ONCE would prevent the compiler from implementing the store > in that way. Even then it's not an issue in this case, as we end up with either 0 or 1, I don't see how we can screw things up here. Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 17:47 ` Roman Gushchin @ 2023-02-21 18:15 ` Shakeel Butt 2023-02-21 18:18 ` Matthew Wilcox 1 sibling, 0 replies; 26+ messages in thread From: Shakeel Butt @ 2023-02-21 18:15 UTC (permalink / raw) To: Roman Gushchin Cc: Matthew Wilcox, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 9:47 AM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > On Tue, Feb 21, 2023 at 01:51:29PM +0000, Matthew Wilcox wrote: > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote: > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > > > >>> will be read and written simultaneously by user space > > > > >>> programs, thus we'd better change memcg->oom_group access > > > > >>> with atomic operations to avoid concurrency problems. > > > > >>> > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > >> > > > > >> Hi Yue! > > > > >> > > > > >> I'm curious, have any seen any real issues which your patch is solving? > > > > >> Can you, please, provide a bit more details. > > > > >> > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed > > > > > concurrently and one of them can be a write access. At least > > > > > READ_ONCE/WRITE_ONCE is needed here. > > > > > > > > Needed for what? > > > > > > For this particular case, documenting such an access. Though I don't > > > think there are any architectures which may tear a one byte read/write > > > and merging/refetching is not an issue for this. > > > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > load-word > > modify-byte-in-word > > store-word > > > > and if this is a lockless store to a word which has an adjacent byte also > > being modified by another CPU, one of those CPUs can lose its store? > > And WRITE_ONCE would prevent the compiler from implementing the store > > in that way. > > Even then it's not an issue in this case, as we end up with either 0 or 1, > I don't see how we can screw things up here. > What do you mean by this is not an issue in this case? Yes, the oom_group usage will be ok but we can not say anything about the adjacent byte/fields. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 17:47 ` Roman Gushchin 2023-02-21 18:15 ` Shakeel Butt @ 2023-02-21 18:18 ` Matthew Wilcox 1 sibling, 0 replies; 26+ messages in thread From: Matthew Wilcox @ 2023-02-21 18:18 UTC (permalink / raw) To: Roman Gushchin Cc: Shakeel Butt, Yue Zhao, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel On Tue, Feb 21, 2023 at 09:47:05AM -0800, Roman Gushchin wrote: > > Wouldn't a compiler be within its rights to implement a one byte store as: > > > > load-word > > modify-byte-in-word > > store-word > > > > and if this is a lockless store to a word which has an adjacent byte also > > being modified by another CPU, one of those CPUs can lose its store? > > And WRITE_ONCE would prevent the compiler from implementing the store > > in that way. > > Even then it's not an issue in this case, as we end up with either 0 or 1, > I don't see how we can screw things up here. Thread 1: load word containing oom_group and oom_lock Thread 2: store to oom_lock Thread 1: store word containing oom_group and oom_lock Thread 2's store has been lost. ^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 13:51 ` Matthew Wilcox 2023-02-21 16:56 ` Shakeel Butt 2023-02-21 17:47 ` Roman Gushchin @ 2023-02-22 9:01 ` David Laight 2 siblings, 0 replies; 26+ messages in thread From: David Laight @ 2023-02-22 9:01 UTC (permalink / raw) To: 'Matthew Wilcox', Shakeel Butt Cc: Roman Gushchin, Yue Zhao, linux-mm@kvack.org, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, muchun.song@linux.dev, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org From: Matthew Wilcox > Sent: 21 February 2023 13:51 ... > > For this particular case, documenting such an access. Though I don't > > think there are any architectures which may tear a one byte read/write > > and merging/refetching is not an issue for this. > > Wouldn't a compiler be within its rights to implement a one byte store as: > > load-word > modify-byte-in-word > store-word > > and if this is a lockless store to a word which has an adjacent byte also > being modified by another CPU, one of those CPUs can lose its store? > And WRITE_ONCE would prevent the compiler from implementing the store > in that way. Some alpha cpu couldn't do byte memory accesses - so always did 32bit read-modify-write. But Linux doesn't support those ones any more. On arm 16bit structure members can be accessed with 32bit instructions because the 16bit ones have a smaller offset. On x86 the bit operations might access the (possibly misaligned) 32bit word containing the required bit - but they are locked. ISTR a problem where gcc was using wider instructions and doing a RMW on an adjacent volatile field. I really can't remember the justification for not marking fields that have unlocked accesses 'volatile' instead of requiring all the accesses be done as explicit volatile ones. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 6:52 ` Shakeel Butt 2023-02-21 13:51 ` Matthew Wilcox @ 2023-02-21 17:00 ` Martin Zhao 1 sibling, 0 replies; 26+ messages in thread From: Martin Zhao @ 2023-02-21 17:00 UTC (permalink / raw) To: Shakeel Butt Cc: Roman Gushchin, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel, tangyeechou On Tue, Feb 21, 2023 at 2:52 PM Shakeel Butt <shakeelb@google.com> wrote: > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > >>> will be read and written simultaneously by user space > > >>> programs, thus we'd better change memcg->oom_group access > > >>> with atomic operations to avoid concurrency problems. > > >>> > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > >> > > >> Hi Yue! > > >> > > >> I'm curious, have any seen any real issues which your patch is solving? > > >> Can you, please, provide a bit more details. > > >> > > > > > > IMHO such details are not needed. oom_group is being accessed > > > concurrently and one of them can be a write access. At least > > > READ_ONCE/WRITE_ONCE is needed here. > > > > Needed for what? > > For this particular case, documenting such an access. Though I don't > think there are any architectures which may tear a one byte read/write > and merging/refetching is not an issue for this. > > > > > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill. > > > > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these. > > > > Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE() > > Yeah and those are v1 interfaces e.g. oom_kill_disable, swappiness, > soft_limit. These definitely need [READ|WRITE]_ONCE primitive. > > Yue, can you update your patch and convert all accesses to these > fields through [READ|WRITE]_ONCE ? Sure, it will take some time to update my patch later. I think most of the accesses use [READ|WRITE]_ONCE already, only few interfaces need to update. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 5:17 ` Roman Gushchin 2023-02-21 6:52 ` Shakeel Butt @ 2023-02-21 7:22 ` Muchun Song 2023-02-21 17:48 ` Roman Gushchin 2023-02-21 17:00 ` Martin Zhao 2 siblings, 1 reply; 26+ messages in thread From: Muchun Song @ 2023-02-21 7:22 UTC (permalink / raw) To: Roman Gushchin Cc: Shakeel Butt, Yue Zhao, Linux Memory Management List, Andrew Morton, Johannes Weiner, Michal Hocko, cgroups, linux-kernel > On Feb 21, 2023, at 13:17, Roman Gushchin <roman.gushchin@linux.dev> wrote: > >> On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: >> >> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: >>>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: >>>> The knob for cgroup v2 memory controller: memory.oom.group >>>> will be read and written simultaneously by user space >>>> programs, thus we'd better change memcg->oom_group access >>>> with atomic operations to avoid concurrency problems. >>>> >>>> Signed-off-by: Yue Zhao <findns94@gmail.com> >>> >>> Hi Yue! >>> >>> I'm curious, have any seen any real issues which your patch is solving? >>> Can you, please, provide a bit more details. >>> >> >> IMHO such details are not needed. oom_group is being accessed >> concurrently and one of them can be a write access. At least >> READ_ONCE/WRITE_ONCE is needed here. > > Needed for what? > > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill. > > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these. I think there is no difference in the assembly code between them in most cases. The only intention that I can think of is to avoid the potential complaint (data race) emitted by KCSAN. > > Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE(). If we decide to fix, then we should fix all. Thanks. > > Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 7:22 ` Muchun Song @ 2023-02-21 17:48 ` Roman Gushchin 0 siblings, 0 replies; 26+ messages in thread From: Roman Gushchin @ 2023-02-21 17:48 UTC (permalink / raw) To: Muchun Song Cc: Shakeel Butt, Yue Zhao, Linux Memory Management List, Andrew Morton, Johannes Weiner, Michal Hocko, cgroups, linux-kernel On Tue, Feb 21, 2023 at 03:22:32PM +0800, Muchun Song wrote: > > > > On Feb 21, 2023, at 13:17, Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > >> On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > >> > >> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > >>>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > >>>> The knob for cgroup v2 memory controller: memory.oom.group > >>>> will be read and written simultaneously by user space > >>>> programs, thus we'd better change memcg->oom_group access > >>>> with atomic operations to avoid concurrency problems. > >>>> > >>>> Signed-off-by: Yue Zhao <findns94@gmail.com> > >>> > >>> Hi Yue! > >>> > >>> I'm curious, have any seen any real issues which your patch is solving? > >>> Can you, please, provide a bit more details. > >>> > >> > >> IMHO such details are not needed. oom_group is being accessed > >> concurrently and one of them can be a write access. At least > >> READ_ONCE/WRITE_ONCE is needed here. > > > > Needed for what? > > > > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill. > > > > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these. > > I think there is no difference in the assembly code between them in most > cases. The only intention that I can think of is to avoid the potential > complaint (data race) emitted by KCSAN. +1 And it might be a totally good reason for this change, let's just make it clear, instead of pretending to fix non-existing concurrency problems. Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 5:17 ` Roman Gushchin 2023-02-21 6:52 ` Shakeel Butt 2023-02-21 7:22 ` Muchun Song @ 2023-02-21 17:00 ` Martin Zhao 2023-02-21 18:02 ` Roman Gushchin 2 siblings, 1 reply; 26+ messages in thread From: Martin Zhao @ 2023-02-21 17:00 UTC (permalink / raw) To: Roman Gushchin Cc: Shakeel Butt, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel, tangyeechou On Tue, Feb 21, 2023 at 1:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > >>> The knob for cgroup v2 memory controller: memory.oom.group > >>> will be read and written simultaneously by user space > >>> programs, thus we'd better change memcg->oom_group access > >>> with atomic operations to avoid concurrency problems. > >>> > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > >> > >> Hi Yue! > >> > >> I'm curious, have any seen any real issues which your patch is solving? > >> Can you, please, provide a bit more details. > >> > > > > IMHO such details are not needed. oom_group is being accessed > > concurrently and one of them can be a write access. At least > > READ_ONCE/WRITE_ONCE is needed here. > > Needed for what? > > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill. > > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these. Thanks for your watching! This topic is found in code review by coincidence, so no real issues recorded for now. I checked other read/write callbacks about other knobs, most of them use READ_ONCE/WRITE_ONCE on the user setting variable. Actually I am curious as well why this interface doesn't use READ_ONCE/WRITE_ONCE, is there any other synchronization mechanism I didn't notice yet? > > Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE(). > > Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 17:00 ` Martin Zhao @ 2023-02-21 18:02 ` Roman Gushchin 0 siblings, 0 replies; 26+ messages in thread From: Roman Gushchin @ 2023-02-21 18:02 UTC (permalink / raw) To: Martin Zhao Cc: Shakeel Butt, linux-mm, akpm, hannes, mhocko, muchun.song, cgroups, linux-kernel, tangyeechou On Wed, Feb 22, 2023 at 01:00:00AM +0800, Martin Zhao wrote: > On Tue, Feb 21, 2023 at 1:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote: > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > >>> The knob for cgroup v2 memory controller: memory.oom.group > > >>> will be read and written simultaneously by user space > > >>> programs, thus we'd better change memcg->oom_group access > > >>> with atomic operations to avoid concurrency problems. > > >>> > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com> > > >> > > >> Hi Yue! > > >> > > >> I'm curious, have any seen any real issues which your patch is solving? > > >> Can you, please, provide a bit more details. > > >> > > > > > > IMHO such details are not needed. oom_group is being accessed > > > concurrently and one of them can be a write access. At least > > > READ_ONCE/WRITE_ONCE is needed here. > > > > Needed for what? > > > > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill. > > > > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these. > > Thanks for your watching! > This topic is found in code review by coincidence, so no real issues > recorded for now. I checked other read/write callbacks about other knobs, > most of them use READ_ONCE/WRITE_ONCE on the user setting variable. Sorry, which knobs are you talking about? I actually don't see any user knobs in mm/memcontrol.c which are using WRITE_ONCE(). I see some of them using xchg(), but it's a different thing. > Actually I am curious as well why this interface doesn't use > READ_ONCE/WRITE_ONCE, is there any other synchronization mechanism I > didn't notice yet? Because nobody saw any issues with the current code? And again if it's something that makes any automated verifiers/tooling unhappy, I'm totally fine for fixing it, just let make it clear (and also fix the commit title, which is not true). Thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-20 23:06 ` Shakeel Butt 2023-02-21 5:17 ` Roman Gushchin @ 2023-02-21 8:26 ` Michal Hocko 2023-02-21 17:00 ` Martin Zhao 1 sibling, 1 reply; 26+ messages in thread From: Michal Hocko @ 2023-02-21 8:26 UTC (permalink / raw) To: Shakeel Butt Cc: Roman Gushchin, Yue Zhao, linux-mm, akpm, hannes, muchun.song, cgroups, linux-kernel On Mon 20-02-23 23:06:24, Shakeel Butt wrote: > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > The knob for cgroup v2 memory controller: memory.oom.group > > > will be read and written simultaneously by user space > > > programs, thus we'd better change memcg->oom_group access > > > with atomic operations to avoid concurrency problems. > > > > > > Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > Hi Yue! > > > > I'm curious, have any seen any real issues which your patch is solving? > > Can you, please, provide a bit more details. > > > > IMHO such details are not needed. oom_group is being accessed > concurrently and one of them can be a write access. At least > READ_ONCE/WRITE_ONCE is needed here. Most probably syzbot didn't > catch this race because it does not know about the memory.oom.group > interface. I do agree with Roman here. It is _always_ good to mention whether this is a tool/review or actual bug triggered fix. Also {READ,WRITE}_ONCE doesn't guarantee atomicity so it would be good to rephrase the changelog. Something like: The knob for cgroup v2 memory controller: memory.oom.group is not protected by any locking so it can be modified while it is used. This is not an actual problem because races are unlikely (the knob is usually configured long before any workloads hits actual memcg oom) but it is better to use READ_ONCE/WRITE_ONCE to prevent compiler from doing anything funky. This patch is not fixing any actual user visible bug but it is in line of a standard practice. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] mm: change memcg->oom_group access with atomic operations 2023-02-21 8:26 ` Michal Hocko @ 2023-02-21 17:00 ` Martin Zhao 0 siblings, 0 replies; 26+ messages in thread From: Martin Zhao @ 2023-02-21 17:00 UTC (permalink / raw) To: Michal Hocko Cc: Shakeel Butt, Roman Gushchin, linux-mm, akpm, hannes, muchun.song, cgroups, linux-kernel, tangyeechou On Tue, Feb 21, 2023 at 4:26 PM Michal Hocko <mhocko@suse.com> wrote: > > On Mon 20-02-23 23:06:24, Shakeel Butt wrote: > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote: > > > On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote: > > > > The knob for cgroup v2 memory controller: memory.oom.group > > > > will be read and written simultaneously by user space > > > > programs, thus we'd better change memcg->oom_group access > > > > with atomic operations to avoid concurrency problems. > > > > > > > > Signed-off-by: Yue Zhao <findns94@gmail.com> > > > > > > Hi Yue! > > > > > > I'm curious, have any seen any real issues which your patch is solving? > > > Can you, please, provide a bit more details. > > > > > > > IMHO such details are not needed. oom_group is being accessed > > concurrently and one of them can be a write access. At least > > READ_ONCE/WRITE_ONCE is needed here. Most probably syzbot didn't > > catch this race because it does not know about the memory.oom.group > > interface. > > I do agree with Roman here. It is _always_ good to mention whether this > is a tool/review or actual bug triggered fix. Also {READ,WRITE}_ONCE doesn't > guarantee atomicity so it would be good to rephrase the changelog. > Something like: > The knob for cgroup v2 memory controller: memory.oom.group > is not protected by any locking so it can be modified while it is used. > This is not an actual problem because races are unlikely (the knob is > usually configured long before any workloads hits actual memcg oom) > but it is better to use READ_ONCE/WRITE_ONCE to prevent compiler from > doing anything funky. Thanks a lot, I will rephrase and update my patch later. > > This patch is not fixing any actual user visible bug but it is in line > of a standard practice. > -- > Michal Hocko > SUSE Labs ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2023-02-22 9:01 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-02-20 15:16 [PATCH] mm: change memcg->oom_group access with atomic operations Yue Zhao 2023-02-20 21:09 ` Roman Gushchin 2023-02-20 23:06 ` Shakeel Butt 2023-02-21 5:17 ` Roman Gushchin 2023-02-21 6:52 ` Shakeel Butt 2023-02-21 13:51 ` Matthew Wilcox 2023-02-21 16:56 ` Shakeel Butt 2023-02-21 18:23 ` Paul E. McKenney 2023-02-21 22:23 ` Roman Gushchin 2023-02-21 22:38 ` Paul E. McKenney 2023-02-21 23:13 ` Shakeel Butt 2023-02-21 23:38 ` Paul E. McKenney 2023-02-21 23:57 ` Roman Gushchin 2023-02-22 0:37 ` Paul E. McKenney 2023-02-22 4:28 ` Roman Gushchin 2023-02-21 17:47 ` Roman Gushchin 2023-02-21 18:15 ` Shakeel Butt 2023-02-21 18:18 ` Matthew Wilcox 2023-02-22 9:01 ` David Laight 2023-02-21 17:00 ` Martin Zhao 2023-02-21 7:22 ` Muchun Song 2023-02-21 17:48 ` Roman Gushchin 2023-02-21 17:00 ` Martin Zhao 2023-02-21 18:02 ` Roman Gushchin 2023-02-21 8:26 ` Michal Hocko 2023-02-21 17:00 ` Martin Zhao
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).