From: "Huang\, Ying" <ying.huang@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Minchan Kim <minchan@kernel.org>, Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Mel Gorman <mgorman@techsingularity.net>,
Dave Hansen <dave.hansen@intel.com>,
Arnd Bergmann <arnd@arndb.de>, Chen Liqin <liqin.linux@gmail.com>,
Russell King <linux@armlinux.org.uk>,
Yoshinori Sato <ysato@users.sourceforge.jp>,
"James E.J. Bottomley" <jejb@parisc-linux.org>,
Guan Xuetao <gxt@mprc.pku.edu.cn>,
"David S. Miller" <davem@davemloft.net>,
Chris Zankel <chris@zankel.net>,
Vineet Gupta <vgupta@synopsys.com>,
Ley Foon Tan <lftan@altera.com>,
Ralf Baechle <ralf@linux-mips.org>,
Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH -mm] mm: Fix races between swapoff and flush dcache
Date: Mon, 05 Mar 2018 09:07:17 +0800 [thread overview]
Message-ID: <877eqr49fe.fsf@yhuang-dev.intel.com> (raw)
In-Reply-To: <20180302131829.7009e1e19f478d55159928de@linux-foundation.org> (Andrew Morton's message of "Fri, 2 Mar 2018 13:18:29 -0800")
Andrew Morton <akpm@linux-foundation.org> writes:
> On Fri, 2 Mar 2018 16:04:26 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:
>
>> From: Huang Ying <ying.huang@intel.com>
>>
>> >From commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB
>> trunks") on, after swapoff, the address_space associated with the swap
>> device will be freed. So page_mapping() users which may touch the
>> address_space need some kind of mechanism to prevent the address_space
>> from being freed during accessing.
>>
>> The dcache flushing functions (flush_dcache_page(), etc) in
>> architecture specific code may access the address_space of swap device
>> for anonymous pages in swap cache via page_mapping() function. But in
>> some cases there are no mechanisms to prevent the swap device from
>> being swapoff, for example,
>>
>> CPU1 CPU2
>> __get_user_pages() swapoff()
>> flush_dcache_page()
>> mapping = page_mapping()
>> ... exit_swap_address_space()
>> ... kvfree(spaces)
>> mapping_mapped(mapping)
>>
>> The address space may be accessed after being freed.
>>
>> But from cachetlb.txt and Russell King, flush_dcache_page() only care
>> about file cache pages, for anonymous pages, flush_anon_page() should
>> be used. The implementation of flush_dcache_page() in all
>> architectures follows this too. They will check whether
>> page_mapping() is NULL and whether mapping_mapped() is true to
>> determine whether to flush the dcache immediately. And they will use
>> interval tree (mapping->i_mmap) to find all user space mappings.
>> While mapping_mapped() and mapping->i_mmap isn't used by anonymous
>> pages in swap cache at all.
>>
>> So, to fix the race between swapoff and flush dcache, __page_mapping()
>> is add to return the address_space for file cache pages and NULL
>> otherwise. All page_mapping() invoking in flush dcache functions are
>> replaced with __page_mapping().
>>
>> The patch is only build tested, because I have no machine with
>> architecture other than x86.
>>
>> ...
>>
>> +/*
>> + * For file cache pages, return the address_space, otherwise return NULL
>> + */
>> +struct address_space *__page_mapping(struct page *page)
>> +{
>> + struct address_space *mapping;
>> +
>> + page = compound_head(page);
>> +
>> + /* This happens if someone calls flush_dcache_page on slab page */
>> + if (unlikely(PageSlab(page)))
>> + return NULL;
>> +
>> + mapping = page->mapping;
>> + if ((unsigned long)mapping & PAGE_MAPPING_ANON)
>> + return NULL;
>> +
>> + return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
>> +}
>> +
>
> I think page_mapping_file() would be a better name.
Thanks! I will use that name.
> And do we really need to duplicate page_mapping()? Could it be
>
> struct address_space *page_mapping_file(struct page *page)
> {
> if (PageSwapCache(page))
> return NULL;
> return page_mapping(page);
> }
Yes. This looks better.
> (We don't need to run compound_head() here, do we?)
Yes. I think so.
Best Regards,
Huang, Ying
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: "Huang\, Ying" <ying.huang@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
Minchan Kim <minchan@kernel.org>, Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Mel Gorman <mgorman@techsingularity.net>,
Dave Hansen <dave.hansen@intel.com>,
Arnd Bergmann <arnd@arndb.de>, Chen Liqin <liqin.linux@gmail.com>,
Russell King <linux@armlinux.org.uk>,
"Yoshinori Sato" <ysato@users.sourceforge.jp>,
"James E.J. Bottomley" <jejb@parisc-linux.org>,
Guan Xuetao <gxt@mprc.pku.edu.cn>,
"David S. Miller" <davem@davemloft.net>,
Chris Zankel <chris@zankel.net>,
Vineet Gupta <vgupta@synopsys.com>,
Ley Foon Tan <lftan@altera.com>,
Ralf Baechle <ralf@linux-mips.org>,
Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH -mm] mm: Fix races between swapoff and flush dcache
Date: Mon, 05 Mar 2018 09:07:17 +0800 [thread overview]
Message-ID: <877eqr49fe.fsf@yhuang-dev.intel.com> (raw)
In-Reply-To: <20180302131829.7009e1e19f478d55159928de@linux-foundation.org> (Andrew Morton's message of "Fri, 2 Mar 2018 13:18:29 -0800")
Andrew Morton <akpm@linux-foundation.org> writes:
> On Fri, 2 Mar 2018 16:04:26 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:
>
>> From: Huang Ying <ying.huang@intel.com>
>>
>> >From commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB
>> trunks") on, after swapoff, the address_space associated with the swap
>> device will be freed. So page_mapping() users which may touch the
>> address_space need some kind of mechanism to prevent the address_space
>> from being freed during accessing.
>>
>> The dcache flushing functions (flush_dcache_page(), etc) in
>> architecture specific code may access the address_space of swap device
>> for anonymous pages in swap cache via page_mapping() function. But in
>> some cases there are no mechanisms to prevent the swap device from
>> being swapoff, for example,
>>
>> CPU1 CPU2
>> __get_user_pages() swapoff()
>> flush_dcache_page()
>> mapping = page_mapping()
>> ... exit_swap_address_space()
>> ... kvfree(spaces)
>> mapping_mapped(mapping)
>>
>> The address space may be accessed after being freed.
>>
>> But from cachetlb.txt and Russell King, flush_dcache_page() only care
>> about file cache pages, for anonymous pages, flush_anon_page() should
>> be used. The implementation of flush_dcache_page() in all
>> architectures follows this too. They will check whether
>> page_mapping() is NULL and whether mapping_mapped() is true to
>> determine whether to flush the dcache immediately. And they will use
>> interval tree (mapping->i_mmap) to find all user space mappings.
>> While mapping_mapped() and mapping->i_mmap isn't used by anonymous
>> pages in swap cache at all.
>>
>> So, to fix the race between swapoff and flush dcache, __page_mapping()
>> is add to return the address_space for file cache pages and NULL
>> otherwise. All page_mapping() invoking in flush dcache functions are
>> replaced with __page_mapping().
>>
>> The patch is only build tested, because I have no machine with
>> architecture other than x86.
>>
>> ...
>>
>> +/*
>> + * For file cache pages, return the address_space, otherwise return NULL
>> + */
>> +struct address_space *__page_mapping(struct page *page)
>> +{
>> + struct address_space *mapping;
>> +
>> + page = compound_head(page);
>> +
>> + /* This happens if someone calls flush_dcache_page on slab page */
>> + if (unlikely(PageSlab(page)))
>> + return NULL;
>> +
>> + mapping = page->mapping;
>> + if ((unsigned long)mapping & PAGE_MAPPING_ANON)
>> + return NULL;
>> +
>> + return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
>> +}
>> +
>
> I think page_mapping_file() would be a better name.
Thanks! I will use that name.
> And do we really need to duplicate page_mapping()? Could it be
>
> struct address_space *page_mapping_file(struct page *page)
> {
> if (PageSwapCache(page))
> return NULL;
> return page_mapping(page);
> }
Yes. This looks better.
> (We don't need to run compound_head() here, do we?)
Yes. I think so.
Best Regards,
Huang, Ying
next prev parent reply other threads:[~2018-03-05 1:07 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-03-02 8:04 [PATCH -mm] mm: Fix races between swapoff and flush dcache Huang, Ying
2018-03-02 8:04 ` Huang, Ying
2018-03-02 21:18 ` Andrew Morton
2018-03-02 21:18 ` Andrew Morton
2018-03-05 1:07 ` Huang, Ying [this message]
2018-03-05 1:07 ` Huang, Ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877eqr49fe.fsf@yhuang-dev.intel.com \
--to=ying.huang@intel.com \
--cc=ak@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=chris@zankel.net \
--cc=dave.hansen@intel.com \
--cc=davem@davemloft.net \
--cc=gxt@mprc.pku.edu.cn \
--cc=hannes@cmpxchg.org \
--cc=jejb@parisc-linux.org \
--cc=lftan@altera.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux@armlinux.org.uk \
--cc=liqin.linux@gmail.com \
--cc=mgorman@techsingularity.net \
--cc=mhocko@suse.com \
--cc=minchan@kernel.org \
--cc=ralf@linux-mips.org \
--cc=vgupta@synopsys.com \
--cc=ysato@users.sourceforge.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.