* [PATCH] mm: initialize variable for mem_cgroup_end_page_stat @ 2014-10-30 1:44 Sasha Levin 2014-10-30 8:27 ` Michal Hocko 0 siblings, 1 reply; 12+ messages in thread From: Sasha Levin @ 2014-10-30 1:44 UTC (permalink / raw) To: akpm; +Cc: linux-kernel, riel, mhocko, hannes, peterz, linux-mm, Sasha Levin Commit "mm: memcontrol: fix missed end-writeback page accounting" has changed the behaviour of mem_cgroup_begin_page_stat() to not always set the "locked" parameter. We should initialize it at the callers to prevent garbage being used in a later call to mem_cgroup_end_page_stat(). Signed-off-by: Sasha Levin <sasha.levin@oracle.com> --- mm/page-writeback.c | 4 ++-- mm/rmap.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 19ceae8..7a02c97 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2329,7 +2329,7 @@ int test_clear_page_writeback(struct page *page) struct address_space *mapping = page_mapping(page); unsigned long memcg_flags; struct mem_cgroup *memcg; - bool locked; + bool locked = false; int ret; memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags); @@ -2366,7 +2366,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write) struct address_space *mapping = page_mapping(page); unsigned long memcg_flags; struct mem_cgroup *memcg; - bool locked; + bool locked = false; int ret; memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags); diff --git a/mm/rmap.c b/mm/rmap.c index 19886fb..4a4dc84 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1044,7 +1044,7 @@ void page_add_file_rmap(struct page *page) { struct mem_cgroup *memcg; unsigned long flags; - bool locked; + bool locked = false; memcg = mem_cgroup_begin_page_stat(page, &locked, &flags); if (atomic_inc_and_test(&page->_mapcount)) { @@ -1058,7 +1058,7 @@ static void page_remove_file_rmap(struct page *page) { struct mem_cgroup *memcg; unsigned long flags; - bool locked; + bool locked = false; memcg = mem_cgroup_begin_page_stat(page, &locked, &flags); -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 1:44 [PATCH] mm: initialize variable for mem_cgroup_end_page_stat Sasha Levin @ 2014-10-30 8:27 ` Michal Hocko 2014-10-30 13:32 ` Sasha Levin 0 siblings, 1 reply; 12+ messages in thread From: Michal Hocko @ 2014-10-30 8:27 UTC (permalink / raw) To: Sasha Levin; +Cc: akpm, linux-kernel, riel, hannes, peterz, linux-mm On Wed 29-10-14 21:44:24, Sasha Levin wrote: > Commit "mm: memcontrol: fix missed end-writeback page accounting" has changed > the behaviour of mem_cgroup_begin_page_stat() to not always set the "locked" > parameter. > > We should initialize it at the callers to prevent garbage being used in a > later call to mem_cgroup_end_page_stat(). The contract is that if the returned memcg is non-NULL then the locked is always initialized. Nobody but mem_cgroup_end_page_stat should touch this variable and this function makes sure it uses it properly. Similar applies to flags which is initialized only if we really take the slow path (has a meaning only if locked == true). So this is not really needed. Was this triggered by a compiler warning? > Signed-off-by: Sasha Levin <sasha.levin@oracle.com> > --- > mm/page-writeback.c | 4 ++-- > mm/rmap.c | 4 ++-- > 2 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > index 19ceae8..7a02c97 100644 > --- a/mm/page-writeback.c > +++ b/mm/page-writeback.c > @@ -2329,7 +2329,7 @@ int test_clear_page_writeback(struct page *page) > struct address_space *mapping = page_mapping(page); > unsigned long memcg_flags; > struct mem_cgroup *memcg; > - bool locked; > + bool locked = false; > int ret; > > memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags); > @@ -2366,7 +2366,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write) > struct address_space *mapping = page_mapping(page); > unsigned long memcg_flags; > struct mem_cgroup *memcg; > - bool locked; > + bool locked = false; > int ret; > > memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags); > diff --git a/mm/rmap.c b/mm/rmap.c > index 19886fb..4a4dc84 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -1044,7 +1044,7 @@ void page_add_file_rmap(struct page *page) > { > struct mem_cgroup *memcg; > unsigned long flags; > - bool locked; > + bool locked = false; > > memcg = mem_cgroup_begin_page_stat(page, &locked, &flags); > if (atomic_inc_and_test(&page->_mapcount)) { > @@ -1058,7 +1058,7 @@ static void page_remove_file_rmap(struct page *page) > { > struct mem_cgroup *memcg; > unsigned long flags; > - bool locked; > + bool locked = false; > > memcg = mem_cgroup_begin_page_stat(page, &locked, &flags); > > -- > 1.7.10.4 > -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 8:27 ` Michal Hocko @ 2014-10-30 13:32 ` Sasha Levin 2014-10-30 14:14 ` Johannes Weiner 0 siblings, 1 reply; 12+ messages in thread From: Sasha Levin @ 2014-10-30 13:32 UTC (permalink / raw) To: Michal Hocko; +Cc: akpm, linux-kernel, riel, hannes, peterz, linux-mm On 10/30/2014 04:27 AM, Michal Hocko wrote: > On Wed 29-10-14 21:44:24, Sasha Levin wrote: >> > Commit "mm: memcontrol: fix missed end-writeback page accounting" has changed >> > the behaviour of mem_cgroup_begin_page_stat() to not always set the "locked" >> > parameter. >> > >> > We should initialize it at the callers to prevent garbage being used in a >> > later call to mem_cgroup_end_page_stat(). > The contract is that if the returned memcg is non-NULL then the locked > is always initialized. Nobody but mem_cgroup_end_page_stat should touch > this variable and this function makes sure it uses it properly. Similar > applies to flags which is initialized only if we really take the slow > path (has a meaning only if locked == true). > > So this is not really needed. Was this triggered by a compiler warning? The problem is that you are attempting to read 'locked' when you call mem_cgroup_end_page_stat(), so it gets used even before you enter the function - and using uninitialized variables is undefined. Yes, it's a compiler warning. Thanks, Sasha ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 13:32 ` Sasha Levin @ 2014-10-30 14:14 ` Johannes Weiner 2014-10-30 14:24 ` Sasha Levin 0 siblings, 1 reply; 12+ messages in thread From: Johannes Weiner @ 2014-10-30 14:14 UTC (permalink / raw) To: Sasha Levin; +Cc: Michal Hocko, akpm, linux-kernel, riel, peterz, linux-mm On Thu, Oct 30, 2014 at 09:32:14AM -0400, Sasha Levin wrote: > On 10/30/2014 04:27 AM, Michal Hocko wrote: > > On Wed 29-10-14 21:44:24, Sasha Levin wrote: > >> > Commit "mm: memcontrol: fix missed end-writeback page accounting" has changed > >> > the behaviour of mem_cgroup_begin_page_stat() to not always set the "locked" > >> > parameter. > >> > > >> > We should initialize it at the callers to prevent garbage being used in a > >> > later call to mem_cgroup_end_page_stat(). > > The contract is that if the returned memcg is non-NULL then the locked > > is always initialized. Nobody but mem_cgroup_end_page_stat should touch > > this variable and this function makes sure it uses it properly. Similar > > applies to flags which is initialized only if we really take the slow > > path (has a meaning only if locked == true). > > > > So this is not really needed. Was this triggered by a compiler warning? > > The problem is that you are attempting to read 'locked' when you call > mem_cgroup_end_page_stat(), so it gets used even before you enter the > function - and using uninitialized variables is undefined. We are not using that value anywhere if !memcg. What path are you referring to? > Yes, it's a compiler warning. Could you provide that please, including arch, and gcc version? Thanks ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 14:14 ` Johannes Weiner @ 2014-10-30 14:24 ` Sasha Levin 2014-10-30 15:06 ` Johannes Weiner 2014-10-30 15:31 ` Michal Hocko 0 siblings, 2 replies; 12+ messages in thread From: Sasha Levin @ 2014-10-30 14:24 UTC (permalink / raw) To: Johannes Weiner; +Cc: Michal Hocko, akpm, linux-kernel, riel, peterz, linux-mm On 10/30/2014 10:14 AM, Johannes Weiner wrote: >> The problem is that you are attempting to read 'locked' when you call >> > mem_cgroup_end_page_stat(), so it gets used even before you enter the >> > function - and using uninitialized variables is undefined. > We are not using that value anywhere if !memcg. What path are you > referring to? You're using that value as soon as you are passing it to a function, it doesn't matter what happens inside that function. >> > Yes, it's a compiler warning. > Could you provide that please, including arch, and gcc version? On x86, $ gcc --version gcc (GCC) 5.0.0 20141029 (experimental) [ 26.868116] ================================================================================ [ 26.870376] UBSan: Undefined behaviour in mm/rmap.c:1084:2 [ 26.871792] load of value 255 is not a valid value for type '_Bool' [ 26.873256] CPU: 4 PID: 8304 Comm: rngd Not tainted 3.18.0-rc2-next-20141029-sasha-00039-g77ed13d-dirty #1427 [ 26.875636] ffff8800cac17ff0 0000000000000000 0000000000000000 ffff880069ffbb28 [ 26.877611] ffffffffaf010c16 0000000000000037 ffffffffb1c0d050 ffff880069ffbb38 [ 26.879140] ffffffffa6e97899 ffff880069ffbbb8 ffffffffa6e97cc7 ffff880069ffbbb8 [ 26.880765] Call Trace: [ 26.881185] dump_stack (lib/dump_stack.c:52) [ 26.882755] ubsan_epilogue (lib/ubsan.c:159) [ 26.883555] __ubsan_handle_load_invalid_value (lib/ubsan.c:482) [ 26.884492] ? mem_cgroup_begin_page_stat (mm/memcontrol.c:1962) [ 26.885441] ? unmap_page_range (./arch/x86/include/asm/paravirt.h:694 mm/memory.c:1091 mm/memory.c:1258 mm/memory.c:1279 mm/memory.c:1303) [ 26.886242] page_remove_rmap (mm/rmap.c:1084 mm/rmap.c:1096) [ 26.886922] unmap_page_range (./arch/x86/include/asm/atomic.h:27 include/linux/mm.h:463 mm/memory.c:1146 mm/memory.c:1258 mm/memory.c:1279 mm/memory.c:1303) [ 26.887824] unmap_single_vma (mm/memory.c:1348) [ 26.888582] unmap_vmas (mm/memory.c:1377 (discriminator 3)) [ 26.889430] exit_mmap (mm/mmap.c:2837) [ 26.890060] mmput (kernel/fork.c:659) [ 26.890656] do_exit (./arch/x86/include/asm/thread_info.h:168 kernel/exit.c:462 kernel/exit.c:747) [ 26.891359] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 26.892287] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2559 kernel/locking/lockdep.c:2601) [ 26.893107] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1598 (discriminator 2)) [ 26.893974] do_group_exit (include/linux/sched.h:775 kernel/exit.c:873) [ 26.894695] SyS_exit_group (kernel/exit.c:901) [ 26.895433] tracesys_phase2 (arch/x86/kernel/entry_64.S:529) [ 26.896134] ================================================================================ Thanks, Sasha ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 14:24 ` Sasha Levin @ 2014-10-30 15:06 ` Johannes Weiner 2014-10-30 16:02 ` Sasha Levin 2014-10-30 15:31 ` Michal Hocko 1 sibling, 1 reply; 12+ messages in thread From: Johannes Weiner @ 2014-10-30 15:06 UTC (permalink / raw) To: Sasha Levin; +Cc: Michal Hocko, akpm, linux-kernel, riel, peterz, linux-mm On Thu, Oct 30, 2014 at 10:24:47AM -0400, Sasha Levin wrote: > On 10/30/2014 10:14 AM, Johannes Weiner wrote: > >> The problem is that you are attempting to read 'locked' when you call > >> > mem_cgroup_end_page_stat(), so it gets used even before you enter the > >> > function - and using uninitialized variables is undefined. > > We are not using that value anywhere if !memcg. What path are you > > referring to? > > You're using that value as soon as you are passing it to a function, it > doesn't matter what happens inside that function. It's copied as part of the pass-by-value protocol, but we really don't do anything with it. So why does it matter? > >> > Yes, it's a compiler warning. > > Could you provide that please, including arch, and gcc version? > > On x86, > > $ gcc --version > gcc (GCC) 5.0.0 20141029 (experimental) > > [ 26.868116] ================================================================================ > [ 26.870376] UBSan: Undefined behaviour in mm/rmap.c:1084:2 Well, "compiler warning" is misleading at best, this is some out-of-tree runtime debugging tool. As per above, there isn't a practical problem here, but your patch worsens the code by making callsites ignorant of how the interface works. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 15:06 ` Johannes Weiner @ 2014-10-30 16:02 ` Sasha Levin 0 siblings, 0 replies; 12+ messages in thread From: Sasha Levin @ 2014-10-30 16:02 UTC (permalink / raw) To: Johannes Weiner; +Cc: Michal Hocko, akpm, linux-kernel, riel, peterz, linux-mm On 10/30/2014 11:06 AM, Johannes Weiner wrote: >> You're using that value as soon as you are passing it to a function, it >> > doesn't matter what happens inside that function. > It's copied as part of the pass-by-value protocol, but we really don't > do anything with it. So why does it matter? Because it's undefined behaviour, which gives your compiler a license to do whatever it wants? Thanks, Sasha ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 14:24 ` Sasha Levin 2014-10-30 15:06 ` Johannes Weiner @ 2014-10-30 15:31 ` Michal Hocko 2014-10-30 17:26 ` Johannes Weiner 1 sibling, 1 reply; 12+ messages in thread From: Michal Hocko @ 2014-10-30 15:31 UTC (permalink / raw) To: Sasha Levin; +Cc: Johannes Weiner, akpm, linux-kernel, riel, peterz, linux-mm On Thu 30-10-14 10:24:47, Sasha Levin wrote: > On 10/30/2014 10:14 AM, Johannes Weiner wrote: > >> The problem is that you are attempting to read 'locked' when you call > >> > mem_cgroup_end_page_stat(), so it gets used even before you enter the > >> > function - and using uninitialized variables is undefined. > > We are not using that value anywhere if !memcg. What path are you > > referring to? > > You're using that value as soon as you are passing it to a function, it > doesn't matter what happens inside that function. I have discussed that with our gcc guys and you are right. Strictly speaking the compiler is free to do if (!memcg) abort(); mem_cgroup_end_page_stat(...); but it is highly unlikely that this will ever happen. Anyway better be safe than sorry. I guess the following should be sufficient and even more symmetric: --- >From 6c3e748af7ee24984477e850bb93d65f83914903 Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Thu, 30 Oct 2014 16:18:23 +0100 Subject: [PATCH] mm, memcg: fix potential undefined when for page stat accounting since d7365e783edb (mm: memcontrol: fix missed end-writeback page accounting) mem_cgroup_end_page_stat consumes locked and flags variables directly rather than via pointers which might trigger C undefined behavior as those variables are initialized only in the slow path of mem_cgroup_begin_page_stat. Although mem_cgroup_end_page_stat handles parameters correctly and touches them only when they hold a sensible value it is caller which loads a potentially uninitialized value which then might allow compiler to do crazy things. Fix this by using pointer parameters for both locked and flags. This is even better from the API point of view because it is symmetrical to mem_cgroup_begin_page_stat. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> --- include/linux/memcontrol.h | 6 +++--- mm/memcontrol.c | 8 ++++---- mm/page-writeback.c | 4 ++-- mm/rmap.c | 4 ++-- 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index d4575a1d6e99..de018766be45 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -141,8 +141,8 @@ static inline bool mem_cgroup_disabled(void) struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page, bool *locked, unsigned long *flags); -void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked, - unsigned long flags); +void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool *locked, + unsigned long *flags); void mem_cgroup_update_page_stat(struct mem_cgroup *memcg, enum mem_cgroup_stat_index idx, int val); @@ -297,7 +297,7 @@ static inline struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page, } static inline void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, - bool locked, unsigned long flags) + bool *locked, unsigned long *flags) { } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b841bf430179..031ca345677b 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2053,11 +2053,11 @@ again: * @locked: value received from mem_cgroup_begin_page_stat() * @flags: value received from mem_cgroup_begin_page_stat() */ -void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked, - unsigned long flags) +void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool *locked, + unsigned long *flags) { - if (memcg && locked) - spin_unlock_irqrestore(&memcg->move_lock, flags); + if (memcg && *locked) + spin_unlock_irqrestore(&memcg->move_lock, *flags); rcu_read_unlock(); } diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 19ceae87522d..d5d81f5384d1 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2357,7 +2357,7 @@ int test_clear_page_writeback(struct page *page) dec_zone_page_state(page, NR_WRITEBACK); inc_zone_page_state(page, NR_WRITTEN); } - mem_cgroup_end_page_stat(memcg, locked, memcg_flags); + mem_cgroup_end_page_stat(memcg, &locked, &memcg_flags); return ret; } @@ -2399,7 +2399,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write) mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK); inc_zone_page_state(page, NR_WRITEBACK); } - mem_cgroup_end_page_stat(memcg, locked, memcg_flags); + mem_cgroup_end_page_stat(memcg, &locked, &memcg_flags); return ret; } diff --git a/mm/rmap.c b/mm/rmap.c index 740dd7d15806..f782b6c2ae48 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1051,7 +1051,7 @@ void page_add_file_rmap(struct page *page) __inc_zone_page_state(page, NR_FILE_MAPPED); mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED); } - mem_cgroup_end_page_stat(memcg, locked, flags); + mem_cgroup_end_page_stat(memcg, &locked, &flags); } static void page_remove_file_rmap(struct page *page) @@ -1081,7 +1081,7 @@ static void page_remove_file_rmap(struct page *page) if (unlikely(PageMlocked(page))) clear_page_mlock(page); out: - mem_cgroup_end_page_stat(memcg, locked, flags); + mem_cgroup_end_page_stat(memcg, &locked, &flags); } /** -- 2.1.1 -- Michal Hocko SUSE Labs ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 15:31 ` Michal Hocko @ 2014-10-30 17:26 ` Johannes Weiner 2014-10-30 17:42 ` Michal Hocko 0 siblings, 1 reply; 12+ messages in thread From: Johannes Weiner @ 2014-10-30 17:26 UTC (permalink / raw) To: Michal Hocko; +Cc: Sasha Levin, akpm, linux-kernel, riel, peterz, linux-mm On Thu, Oct 30, 2014 at 04:31:59PM +0100, Michal Hocko wrote: > On Thu 30-10-14 10:24:47, Sasha Levin wrote: > > On 10/30/2014 10:14 AM, Johannes Weiner wrote: > > >> The problem is that you are attempting to read 'locked' when you call > > >> > mem_cgroup_end_page_stat(), so it gets used even before you enter the > > >> > function - and using uninitialized variables is undefined. > > > We are not using that value anywhere if !memcg. What path are you > > > referring to? > > > > You're using that value as soon as you are passing it to a function, it > > doesn't matter what happens inside that function. > > I have discussed that with our gcc guys and you are right. Strictly > speaking the compiler is free to do > if (!memcg) abort(); > mem_cgroup_end_page_stat(...); > > but it is highly unlikely that this will ever happen. Anyway better be > safe than sorry. I guess the following should be sufficient and even > more symmetric: The functional aspect of this is a terrible motivation for this change. Sure the compiler could, but it doesn't, and it won't. But there is some merit in keeping the checker's output meaningful as long as it doesn't obfuscate the interface too much. > From 6c3e748af7ee24984477e850bb93d65f83914903 Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Thu, 30 Oct 2014 16:18:23 +0100 > Subject: [PATCH] mm, memcg: fix potential undefined when for page stat > accounting > > since d7365e783edb (mm: memcontrol: fix missed end-writeback page > accounting) mem_cgroup_end_page_stat consumes locked and flags variables > directly rather than via pointers which might trigger C undefined > behavior as those variables are initialized only in the slow path of > mem_cgroup_begin_page_stat. > Although mem_cgroup_end_page_stat handles parameters correctly and > touches them only when they hold a sensible value it is caller which > loads a potentially uninitialized value which then might allow compiler > to do crazy things. I'm not opposed to passing pointers into end_page_stat(), but please mention the checker in the changelog. > Fix this by using pointer parameters for both locked and flags. This is > even better from the API point of view because it is symmetrical to > mem_cgroup_begin_page_stat. Uhm, locked and flags are return values in begin_page_stat() but input arguments in end_page_stat(). Symmetry obfuscates that, so that's not an upside at all. It's a cost that we can pay to keep the checker benefits, but the underlying nastiness remains. It comes from the fact that we use conditional locking to avoid the read-side spinlock, rather than using a reader-friendly lock to begin with. So let's change it to pointers, but at the same time be clear that this doesn't make the code better. It just fixes the checker. Thanks ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 17:26 ` Johannes Weiner @ 2014-10-30 17:42 ` Michal Hocko 2014-10-30 19:30 ` Peter Zijlstra 2014-10-31 18:17 ` Johannes Weiner 0 siblings, 2 replies; 12+ messages in thread From: Michal Hocko @ 2014-10-30 17:42 UTC (permalink / raw) To: Johannes Weiner; +Cc: Sasha Levin, akpm, linux-kernel, riel, peterz, linux-mm On Thu 30-10-14 13:26:32, Johannes Weiner wrote: > On Thu, Oct 30, 2014 at 04:31:59PM +0100, Michal Hocko wrote: > > On Thu 30-10-14 10:24:47, Sasha Levin wrote: > > > On 10/30/2014 10:14 AM, Johannes Weiner wrote: > > > >> The problem is that you are attempting to read 'locked' when you call > > > >> > mem_cgroup_end_page_stat(), so it gets used even before you enter the > > > >> > function - and using uninitialized variables is undefined. > > > > We are not using that value anywhere if !memcg. What path are you > > > > referring to? > > > > > > You're using that value as soon as you are passing it to a function, it > > > doesn't matter what happens inside that function. > > > > I have discussed that with our gcc guys and you are right. Strictly > > speaking the compiler is free to do > > if (!memcg) abort(); > > mem_cgroup_end_page_stat(...); > > > > but it is highly unlikely that this will ever happen. Anyway better be > > safe than sorry. I guess the following should be sufficient and even > > more symmetric: > > The functional aspect of this is a terrible motivation for this > change. Sure the compiler could, but it doesn't, and it won't. > > But there is some merit in keeping the checker's output meaningful as > long as it doesn't obfuscate the interface too much. > > > From 6c3e748af7ee24984477e850bb93d65f83914903 Mon Sep 17 00:00:00 2001 > > From: Michal Hocko <mhocko@suse.cz> > > Date: Thu, 30 Oct 2014 16:18:23 +0100 > > Subject: [PATCH] mm, memcg: fix potential undefined when for page stat > > accounting > > > > since d7365e783edb (mm: memcontrol: fix missed end-writeback page > > accounting) mem_cgroup_end_page_stat consumes locked and flags variables > > directly rather than via pointers which might trigger C undefined > > behavior as those variables are initialized only in the slow path of > > mem_cgroup_begin_page_stat. > > Although mem_cgroup_end_page_stat handles parameters correctly and > > touches them only when they hold a sensible value it is caller which > > loads a potentially uninitialized value which then might allow compiler > > to do crazy things. > > I'm not opposed to passing pointers into end_page_stat(), but please > mention the checker in the changelog. Done. > > Fix this by using pointer parameters for both locked and flags. This is > > even better from the API point of view because it is symmetrical to > > mem_cgroup_begin_page_stat. > > Uhm, locked and flags are return values in begin_page_stat() but input > arguments in end_page_stat(). Symmetry obfuscates that, so that's not > an upside at all. It's a cost that we can pay to keep the checker Well, I would use a typedef to obfuscate those values because nobody except for mem_cgroup_{begin,end}_page_stat should touch them. But we are not doing typedefs in kernel... > benefits, but the underlying nastiness remains. It comes from the > fact that we use conditional locking to avoid the read-side spinlock, > rather than using a reader-friendly lock to begin with. > So let's change it to pointers, but at the same time be clear that > this doesn't make the code better. It just fixes the checker. No it is not about the checker which is correct here actually. A simple load to setup parameter from an uninitialized variable is an undefined behavior (that load happens unconditionally). This has nothing to do with the way how we use locked and flags inside the function. New version with an updated changelog --- >From b2762f30d3896172c5666066e72938b3f5f9158a Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Thu, 30 Oct 2014 18:35:19 +0100 Subject: [PATCH] mm, memcg: fix potential undefined when for page stat accounting since d7365e783edb (mm: memcontrol: fix missed end-writeback page accounting) mem_cgroup_end_page_stat consumes locked and flags variables directly rather than via pointers which might trigger C undefined behavior as those variables are initialized only in the slow path of mem_cgroup_begin_page_stat. Although mem_cgroup_end_page_stat handles parameters correctly and touches them only when they hold a sensible value it is caller which loads a potentially uninitialized value which then might allow compiler to do crazy things. I haven't seen any warning from gcc and it seems that the current version (4.9) doesn't exploit this type undefined behavior but Sasha has reported the following: [ 26.868116] ================================================================================ [ 26.870376] UBSan: Undefined behaviour in mm/rmap.c:1084:2 [ 26.871792] load of value 255 is not a valid value for type '_Bool' [ 26.873256] CPU: 4 PID: 8304 Comm: rngd Not tainted 3.18.0-rc2-next-20141029-sasha-00039-g77ed13d-dirty #1427 [ 26.875636] ffff8800cac17ff0 0000000000000000 0000000000000000 ffff880069ffbb28 [ 26.877611] ffffffffaf010c16 0000000000000037 ffffffffb1c0d050 ffff880069ffbb38 [ 26.879140] ffffffffa6e97899 ffff880069ffbbb8 ffffffffa6e97cc7 ffff880069ffbbb8 [ 26.880765] Call Trace: [ 26.881185] dump_stack (lib/dump_stack.c:52) [ 26.882755] ubsan_epilogue (lib/ubsan.c:159) [ 26.883555] __ubsan_handle_load_invalid_value (lib/ubsan.c:482) [ 26.884492] ? mem_cgroup_begin_page_stat (mm/memcontrol.c:1962) [ 26.885441] ? unmap_page_range (./arch/x86/include/asm/paravirt.h:694 mm/memory.c:1091 mm/memory.c:1258 mm/memory.c:1279 mm/memory.c:1303) [ 26.886242] page_remove_rmap (mm/rmap.c:1084 mm/rmap.c:1096) [ 26.886922] unmap_page_range (./arch/x86/include/asm/atomic.h:27 include/linux/mm.h:463 mm/memory.c:1146 mm/memory.c:1258 mm/memory.c:1279 mm/memory.c:1303) [ 26.887824] unmap_single_vma (mm/memory.c:1348) [ 26.888582] unmap_vmas (mm/memory.c:1377 (discriminator 3)) [ 26.889430] exit_mmap (mm/mmap.c:2837) [ 26.890060] mmput (kernel/fork.c:659) [ 26.890656] do_exit (./arch/x86/include/asm/thread_info.h:168 kernel/exit.c:462 kernel/exit.c:747) [ 26.891359] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 26.892287] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2559 kernel/locking/lockdep.c:2601) [ 26.893107] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1598 (discriminator 2)) [ 26.893974] do_group_exit (include/linux/sched.h:775 kernel/exit.c:873) [ 26.894695] SyS_exit_group (kernel/exit.c:901) [ 26.895433] tracesys_phase2 (arch/x86/kernel/entry_64.S:529) [ 26.896134] ================================================================================ Fix this by using pointer parameters for both locked and flags and be more robust for future compiler changes even though the current code is implemented correctly. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> --- include/linux/memcontrol.h | 6 +++--- mm/memcontrol.c | 8 ++++---- mm/page-writeback.c | 4 ++-- mm/rmap.c | 4 ++-- 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index d4575a1d6e99..de018766be45 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -141,8 +141,8 @@ static inline bool mem_cgroup_disabled(void) struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page, bool *locked, unsigned long *flags); -void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked, - unsigned long flags); +void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool *locked, + unsigned long *flags); void mem_cgroup_update_page_stat(struct mem_cgroup *memcg, enum mem_cgroup_stat_index idx, int val); @@ -297,7 +297,7 @@ static inline struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page, } static inline void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, - bool locked, unsigned long flags) + bool *locked, unsigned long *flags) { } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b841bf430179..031ca345677b 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2053,11 +2053,11 @@ again: * @locked: value received from mem_cgroup_begin_page_stat() * @flags: value received from mem_cgroup_begin_page_stat() */ -void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked, - unsigned long flags) +void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool *locked, + unsigned long *flags) { - if (memcg && locked) - spin_unlock_irqrestore(&memcg->move_lock, flags); + if (memcg && *locked) + spin_unlock_irqrestore(&memcg->move_lock, *flags); rcu_read_unlock(); } diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 19ceae87522d..d5d81f5384d1 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2357,7 +2357,7 @@ int test_clear_page_writeback(struct page *page) dec_zone_page_state(page, NR_WRITEBACK); inc_zone_page_state(page, NR_WRITTEN); } - mem_cgroup_end_page_stat(memcg, locked, memcg_flags); + mem_cgroup_end_page_stat(memcg, &locked, &memcg_flags); return ret; } @@ -2399,7 +2399,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write) mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK); inc_zone_page_state(page, NR_WRITEBACK); } - mem_cgroup_end_page_stat(memcg, locked, memcg_flags); + mem_cgroup_end_page_stat(memcg, &locked, &memcg_flags); return ret; } diff --git a/mm/rmap.c b/mm/rmap.c index 740dd7d15806..f782b6c2ae48 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1051,7 +1051,7 @@ void page_add_file_rmap(struct page *page) __inc_zone_page_state(page, NR_FILE_MAPPED); mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED); } - mem_cgroup_end_page_stat(memcg, locked, flags); + mem_cgroup_end_page_stat(memcg, &locked, &flags); } static void page_remove_file_rmap(struct page *page) @@ -1081,7 +1081,7 @@ static void page_remove_file_rmap(struct page *page) if (unlikely(PageMlocked(page))) clear_page_mlock(page); out: - mem_cgroup_end_page_stat(memcg, locked, flags); + mem_cgroup_end_page_stat(memcg, &locked, &flags); } /** -- 2.1.1 -- Michal Hocko SUSE Labs ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 17:42 ` Michal Hocko @ 2014-10-30 19:30 ` Peter Zijlstra 2014-10-31 18:17 ` Johannes Weiner 1 sibling, 0 replies; 12+ messages in thread From: Peter Zijlstra @ 2014-10-30 19:30 UTC (permalink / raw) To: Michal Hocko Cc: Johannes Weiner, Sasha Levin, akpm, linux-kernel, riel, linux-mm On Thu, Oct 30, 2014 at 06:42:41PM +0100, Michal Hocko wrote: > Well, I would use a typedef to obfuscate those values because nobody > except for mem_cgroup_{begin,end}_page_stat should touch them. But we > are not doing typedefs in kernel... $ git grep typedef | wc -l 11379 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: initialize variable for mem_cgroup_end_page_stat 2014-10-30 17:42 ` Michal Hocko 2014-10-30 19:30 ` Peter Zijlstra @ 2014-10-31 18:17 ` Johannes Weiner 1 sibling, 0 replies; 12+ messages in thread From: Johannes Weiner @ 2014-10-31 18:17 UTC (permalink / raw) To: Michal Hocko; +Cc: Sasha Levin, akpm, linux-kernel, riel, peterz, linux-mm On Thu, Oct 30, 2014 at 06:42:41PM +0100, Michal Hocko wrote: > On Thu 30-10-14 13:26:32, Johannes Weiner wrote: > > On Thu, Oct 30, 2014 at 04:31:59PM +0100, Michal Hocko wrote: > > > I have discussed that with our gcc guys and you are right. Strictly > > > speaking the compiler is free to do > > > if (!memcg) abort(); > > > mem_cgroup_end_page_stat(...); > > > > > > but it is highly unlikely that this will ever happen. Anyway better be > > > safe than sorry. I guess the following should be sufficient and even > > > more symmetric: > > > > The functional aspect of this is a terrible motivation for this > > change. Sure the compiler could, but it doesn't, and it won't. > > > > But there is some merit in keeping the checker's output meaningful as > > long as it doesn't obfuscate the interface too much. [...] > > So let's change it to pointers, but at the same time be clear that > > this doesn't make the code better. It just fixes the checker. > > No it is not about the checker which is correct here actually. A simple > load to setup parameter from an uninitialized variable is an undefined > behavior (that load happens unconditionally). This has nothing to do > with the way how we use locked and flags inside the function. Never mind... :) The diff looks fine. > From b2762f30d3896172c5666066e72938b3f5f9158a Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Thu, 30 Oct 2014 18:35:19 +0100 > Subject: [PATCH] mm, memcg: fix potential undefined when for page stat > accounting > > since d7365e783edb (mm: memcontrol: fix missed end-writeback page > accounting) mem_cgroup_end_page_stat consumes locked and flags variables > directly rather than via pointers which might trigger C undefined > behavior as those variables are initialized only in the slow path of > mem_cgroup_begin_page_stat. > Although mem_cgroup_end_page_stat handles parameters correctly and > touches them only when they hold a sensible value it is caller which > loads a potentially uninitialized value which then might allow compiler > to do crazy things. > > I haven't seen any warning from gcc and it seems that the current > version (4.9) doesn't exploit this type undefined behavior but Sasha has > reported the following: > [ 26.868116] ================================================================================ > [ 26.870376] UBSan: Undefined behaviour in mm/rmap.c:1084:2 > [ 26.871792] load of value 255 is not a valid value for type '_Bool' > [ 26.873256] CPU: 4 PID: 8304 Comm: rngd Not tainted 3.18.0-rc2-next-20141029-sasha-00039-g77ed13d-dirty #1427 > [ 26.875636] ffff8800cac17ff0 0000000000000000 0000000000000000 ffff880069ffbb28 > [ 26.877611] ffffffffaf010c16 0000000000000037 ffffffffb1c0d050 ffff880069ffbb38 > [ 26.879140] ffffffffa6e97899 ffff880069ffbbb8 ffffffffa6e97cc7 ffff880069ffbbb8 > [ 26.880765] Call Trace: > [ 26.881185] dump_stack (lib/dump_stack.c:52) > [ 26.882755] ubsan_epilogue (lib/ubsan.c:159) > [ 26.883555] __ubsan_handle_load_invalid_value (lib/ubsan.c:482) > [ 26.884492] ? mem_cgroup_begin_page_stat (mm/memcontrol.c:1962) > [ 26.885441] ? unmap_page_range (./arch/x86/include/asm/paravirt.h:694 mm/memory.c:1091 mm/memory.c:1258 mm/memory.c:1279 mm/memory.c:1303) > [ 26.886242] page_remove_rmap (mm/rmap.c:1084 mm/rmap.c:1096) > [ 26.886922] unmap_page_range (./arch/x86/include/asm/atomic.h:27 include/linux/mm.h:463 mm/memory.c:1146 mm/memory.c:1258 mm/memory.c:1279 mm/memory.c:1303) > [ 26.887824] unmap_single_vma (mm/memory.c:1348) > [ 26.888582] unmap_vmas (mm/memory.c:1377 (discriminator 3)) > [ 26.889430] exit_mmap (mm/mmap.c:2837) > [ 26.890060] mmput (kernel/fork.c:659) > [ 26.890656] do_exit (./arch/x86/include/asm/thread_info.h:168 kernel/exit.c:462 kernel/exit.c:747) > [ 26.891359] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > [ 26.892287] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2559 kernel/locking/lockdep.c:2601) > [ 26.893107] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1598 (discriminator 2)) > [ 26.893974] do_group_exit (include/linux/sched.h:775 kernel/exit.c:873) > [ 26.894695] SyS_exit_group (kernel/exit.c:901) > [ 26.895433] tracesys_phase2 (arch/x86/kernel/entry_64.S:529) > [ 26.896134] ================================================================================ > > Fix this by using pointer parameters for both locked and flags and be > more robust for future compiler changes even though the current code is > implemented correctly. > > Reported-by: Sasha Levin <sasha.levin@oracle.com> > Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-10-31 18:17 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-10-30 1:44 [PATCH] mm: initialize variable for mem_cgroup_end_page_stat Sasha Levin 2014-10-30 8:27 ` Michal Hocko 2014-10-30 13:32 ` Sasha Levin 2014-10-30 14:14 ` Johannes Weiner 2014-10-30 14:24 ` Sasha Levin 2014-10-30 15:06 ` Johannes Weiner 2014-10-30 16:02 ` Sasha Levin 2014-10-30 15:31 ` Michal Hocko 2014-10-30 17:26 ` Johannes Weiner 2014-10-30 17:42 ` Michal Hocko 2014-10-30 19:30 ` Peter Zijlstra 2014-10-31 18:17 ` Johannes Weiner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox