From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vladimir Davydov Subject: Re: [PATCH v3 3/3] proc: add kpageidle file Date: Fri, 8 May 2015 12:56:04 +0300 Message-ID: <20150508095604.GO31732@esperanza> References: <4c24a6bf2c9711dd4dbb72a43a16eba6867527b7.1430217477.git.vdavydov@parallels.com> <20150429043536.GB11486@blaptop> <20150429091248.GD1694@esperanza> <20150430082531.GD21771@blaptop> <20150430145055.GB17640@esperanza> <20150504031722.GA2768@blaptop> <20150504094938.GB4197@esperanza> <20150504105459.GA19384@blaptop> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Return-path: Content-Disposition: inline In-Reply-To: <20150504105459.GA19384@blaptop> Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Minchan Kim Cc: Andrew Morton , Johannes Weiner , Michal Hocko , Greg Thelen , Michel Lespinasse , David Rientjes , Pavel Emelyanov , Cyrill Gorcunov , Jonathan Corbet , linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Rik van Riel , Hugh Dickins , Christoph Lameter , "Paul E. McKenney" , Peter Zijlstra List-Id: linux-api@vger.kernel.org On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote: > So, I guess once below compiler optimization happens in __page_set_anon_rmap, > it could be corrupt in page_refernced. > > __page_set_anon_rmap: > page->mapping = (struct address_space *) anon_vma; > page->mapping = (struct address_space *)((void *)page_mapping + PAGE_MAPPING_ANON); > > Because page_referenced checks it with PageAnon which has no memory barrier. > So if above compiler optimization happens, page_referenced can pass the anon > page in rmap_walk_file, not ramp_walk_anon. It's my theory. :) FWIW If such splits were possible, we would have bugs all over the kernel IMO. An example is do_wp_page() vs shrink_active_list(). In do_wp_page() we can call page_move_anon_rmap(), which sets page->mapping in exactly the same fashion as above-mentioned __page_set_anon_rmap(): anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON; page->mapping = (struct address_space *) anon_vma; The page in question may be on an LRU list, because nowhere in do_wp_page() we remove it from the list, neither do we take any LRU related locks. The page is locked, that's true, but shrink_active_list() calls page_referenced() on an unlocked page, so according to your logic they can race with the latter receiving a page with page->mapping equal to anon_vma w/o PAGE_MAPPING_ANON bit set: CPU0 CPU1 ---- ---- do_wp_page shrink_active_list lock_page page_referenced PageAnon->yes, so skip trylock_page page_move_anon_rmap page->mapping = anon_vma rmap_walk PageAnon->no rmap_walk_file BUG page->mapping = page->mapping+PAGE_MAPPING_ANON However, this does not happen. Thanks, Vladimir