From: Minchan Kim <minchan@kernel.org>
To: Vinayak Menon <vinmenon@codeaurora.org>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix the race between swapin_readahead and SWP_SYNCHRONOUS_IO path
Date: Mon, 9 Sep 2019 16:26:13 -0700
Message-ID: <20190909232613.GA39783@google.com>
In-Reply-To: <1567169011-4748-1-git-send-email-vinmenon@codeaurora.org>

Hi Vinayak,

On Fri, Aug 30, 2019 at 06:13:31PM +0530, Vinayak Menon wrote:
> The following race is observed, due to which a process faulting
> on a swap entry finds the page neither in swapcache nor swap. This
> causes zram to give a zero filled page that gets mapped to the
> process, resulting in a user space crash later.
>
> Consider parent and child processes Pa and Pb sharing the same swap
> slot with swap_count 2. Swap is on zram with SWP_SYNCHRONOUS_IO set.
> Virtual address 'VA' of Pa and Pb points to the shared swap entry.
>
> Pa                                       Pb
>
> fault on VA                              fault on VA
> do_swap_page                             do_swap_page
> lookup_swap_cache fails                  lookup_swap_cache fails
>                                          Pb scheduled out
> swapin_readahead (deletes zram entry)
> swap_free (makes swap_count 1)
>                                          Pb scheduled in
>                                          swap_readpage (swap_count == 1)
>                                          Takes SWP_SYNCHRONOUS_IO path
>                                          zram entry absent
>                                          zram gives a zero filled page
>
> Fix this by reading the swap_count before lookup_swap_cache, which conforms
> with the order in which page is added to swap cache and swap count is
> decremented in do_swap_page. In the race case above, this will let Pb take
> the readahead path and thus pick the proper page from swapcache.

Thanks for the report, Vinayak.

This is a zram-specific issue: zram deallocates the block
unconditionally once the read IO is done. The expectation was that the
dirty page would then be in the swap cache, but with SWP_SYNCHRONOUS_IO
that no longer holds. So I would like to resolve the issue in
zram-specific code rather than in the general path.
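
To make the failure mode concrete, here is a minimal userspace sketch of
the interleaving from the report. It is only a toy model, not kernel
code: struct toy_slot and the toy_* helpers are hypothetical names made
up for illustration, and the "free after read" step merely stands in for
the unconditional deallocation described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical stand-in for one zram-backed swap slot (not a kernel type). */
struct toy_slot {
	bool allocated;            /* backing block still holds data */
	int swap_count;            /* number of mappings referencing the slot */
	char data[TOY_PAGE_SIZE];
};

/* Read the slot; an absent block reads back as zeroes, as in the report. */
static void toy_read(const struct toy_slot *slot, char *page)
{
	if (slot->allocated)
		memcpy(page, slot->data, TOY_PAGE_SIZE);
	else
		memset(page, 0, TOY_PAGE_SIZE);
}

/* Models the reported behaviour: the block is dropped once a read is done. */
static void toy_free_after_read(struct toy_slot *slot)
{
	slot->allocated = false;
}

/*
 * Replays the interleaving: Pa reads and frees, then Pb reads the same
 * slot. Returns true if Pb ends up with the same data as Pa.
 */
static bool toy_replay_race(void)
{
	struct toy_slot slot = { .allocated = true, .swap_count = 2 };
	char pa_page[TOY_PAGE_SIZE], pb_page[TOY_PAGE_SIZE];

	memset(slot.data, 0x5a, TOY_PAGE_SIZE);

	toy_read(&slot, pa_page);      /* Pa: readahead path reads the slot */
	toy_free_after_read(&slot);    /* block dropped although count is 2 */
	slot.swap_count--;             /* Pa: swap_free, count 2 -> 1 */
	assert(slot.swap_count == 1);

	toy_read(&slot, pb_page);      /* Pb: sees count == 1, reads directly */
	return memcmp(pa_page, pb_page, TOY_PAGE_SIZE) == 0;
}
```

In this model toy_replay_race() returns false: Pb gets a zero-filled
page because the block was dropped while a second reference still
existed.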

One idea I have is that swap_slot_free_notify should check the slot's
reference count and, if it is higher than 1, not free the slot yet.
What do you think?
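
A rough sketch of that idea, again as a userspace toy model rather than a
patch. struct slot_model, model_free_notify and the driver function are
hypothetical names; the real hook and how it would obtain the count are
not shown here and would need to be worked out against the actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot state; not the kernel's data structures. */
struct slot_model {
	bool allocated;    /* backing block still holds data */
	int swap_count;    /* number of mappings referencing the slot */
};

/*
 * Sketch of the proposed check: after a read completes, drop the backing
 * block only when no other reference to the slot remains.
 * Returns true if the block was freed.
 */
static bool model_free_notify(struct slot_model *slot)
{
	if (slot->swap_count > 1)
		return false;  /* another mapping may still need the data */
	slot->allocated = false;
	return true;
}

/*
 * Replays the same interleaving with the check in place. Returns true if
 * the second reader still finds the block and the block is freed only
 * after the last reference has read it.
 */
static bool model_second_reader_sees_data(void)
{
	struct slot_model slot = { .allocated = true, .swap_count = 2 };
	bool pb_ok;

	model_free_notify(&slot);  /* Pa read done: count is 2, block kept */
	slot.swap_count--;         /* Pa: swap_free, count 2 -> 1 */
	pb_ok = slot.allocated;    /* Pb reads while the block is present */
	model_free_notify(&slot);  /* Pb read done: count is 1, block freed */

	return pb_ok && !slot.allocated;
}
```

With the check, the first completed read leaves the block alone and only
the last reference releases it, so the model's second reader no longer
sees an empty slot.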
>
> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
> ---
> mm/memory.c | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index e0c232f..22643aa 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2744,6 +2744,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	struct page *page = NULL, *swapcache;
>  	struct mem_cgroup *memcg;
>  	swp_entry_t entry;
> +	struct swap_info_struct *si;
> +	bool skip_swapcache = false;
>  	pte_t pte;
>  	int locked;
>  	int exclusive = 0;
> @@ -2771,15 +2773,24 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  
>  
>  	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> +
> +	/*
> +	 * lookup_swap_cache below can fail and before the SWP_SYNCHRONOUS_IO
> +	 * check is made, another process can populate the swapcache, delete
> +	 * the swap entry and decrement the swap count. So decide on taking
> +	 * the SWP_SYNCHRONOUS_IO path before the lookup. In the event of the
> +	 * race described, the victim process will find a swap_count > 1
> +	 * and can then take the readahead path instead of SWP_SYNCHRONOUS_IO.
> +	 */
> +	si = swp_swap_info(entry);
> +	if (si->flags & SWP_SYNCHRONOUS_IO && __swap_count(entry) == 1)
> +		skip_swapcache = true;
> +
>  	page = lookup_swap_cache(entry, vma, vmf->address);
>  	swapcache = page;
>  
>  	if (!page) {
> -		struct swap_info_struct *si = swp_swap_info(entry);
> -
> -		if (si->flags & SWP_SYNCHRONOUS_IO &&
> -				__swap_count(entry) == 1) {
> -			/* skip swapcache */
> +		if (skip_swapcache) {
>  			page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
>  							vmf->address);
>  			if (page) {
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
> member of the Code Aurora Forum, hosted by The Linux Foundation
>