linux-mm.kvack.org archive mirror
From: Chris Li <chrisl@kernel.org>
To: Kairui Song <ryncsn@gmail.com>
Cc: "Huang, Ying" <ying.huang@intel.com>,
	Minchan Kim <minchan@kernel.org>,
	linux-mm@kvack.org,  Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	 Yosry Ahmed <yosryahmed@google.com>,
	David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org,  Yu Zhao <yuzhao@google.com>
Subject: Re: Whether is the race for SWP_SYNCHRONOUS_IO possible? (was Re: [PATCH v3 6/7] mm/swap, shmem: use unified swapin helper for shmem)
Date: Wed, 31 Jan 2024 15:45:18 -0800
Message-ID: <CAF8kJuN6G578RWQ5R6eW=FQWg3mphywgnusQYCRMJA4TzhR4jg@mail.gmail.com>
In-Reply-To: <CAMgjq7DhjeeCehzj5hiO=v+X0Jg5mEpKim3k8abJA20TN63SHA@mail.gmail.com>

On Tue, Jan 30, 2024 at 7:58 PM Kairui Song <ryncsn@gmail.com> wrote:
>
> Hi Ying,
>
> On Wed, Jan 31, 2024 at 10:53 AM Huang, Ying <ying.huang@intel.com> wrote:
> >
> > Hi, Minchan,
> >
> > When I reviewed the patchset from Kairui, I checked the code that skips
> > the swap cache in do_swap_page() for swap devices with
> > SWP_SYNCHRONOUS_IO.  Is the following race possible?  Suppose a page is
> > swapped out to a swap device with SWP_SYNCHRONOUS_IO and its swap count
> > is 1.  Then 2 threads of the process run on CPU0 and CPU1 as below.
> > CPU0 is running do_swap_page().
>
> Chris raised a similar issue about the shmem path, and I was worrying
> about the same issue in previous discussions about do_swap_page:
> https://lore.kernel.org/linux-mm/CAMgjq7AwFiDb7cAMkWMWb3vkccie1-tocmZfT7m4WRb_UKPghg@mail.gmail.com/

Ha, thanks for remembering that.

>
> """
> In do_swap_page path, multiple process could swapin the page at the
> same time (a mapped once page can still be shared by sub threads),
> they could get different folios. The later pte lock and pte_same check
> is not enough, because while one process is not holding the pte lock,
> another process could read-in, swap_free the entry, then swap-out the
> page again, using same entry, an ABA problem. The race is not likely
> to happen in reality but in theory possible.
> """
>
> >
> > CPU0                            CPU1
> > ----                            ----
> > swap_cache_get_folio()
> > check sync io and swap count
> > alloc folio
> > swap_readpage()
> > folio_lock_or_retry()
> >                                 swap in the swap entry
> >                                 write page
> >                                 swap out to same swap entry
> > pte_offset_map_lock()
> > check pte_same()
> > swap_free()   <-- new content lost!
> > set_pte_at()  <-- stale page!
> > folio_unlock()
> > pte_unmap_unlock()
>
> Thank you very much for highlighting this!
>
> My concern previously was the same as yours (swapping out using the
> same entry is like an ABA issue, where pte_same fails to detect the
> page table change). Later, when working on V3, I mistakenly thought
> that was impossible because the entry should be pinned until swap_free
> on CPU0, and I was wrong. CPU1 can also just call swap_free, then the
> swap count drops to 0 and it can swap out using the same entry. Now I
> think my patch 6/7 is also affected by this potential race. It seems
> nothing can stop this from happening.
>
> Actually I was trying to make a reproducer locally. Due to the swap slot
> cache, the swap allocation algorithm, and the short race window, this is
> very unlikely to happen though.

You can add a sleep on the CPU0 side, at the point where you expect the
other side of the race to run, to help trigger it manually. Yes, it
sounds hard to trigger in real life, because it depends on reclaim
swapping the page out again.

>
> How about we just increase the swap count temporarily in the direct
> swap in path (after alloc folio), then drop the count after pte_same
> (or shmem_add_to_page_cache in shmem path)? That seems enough to
> prevent the entry reuse issue.

Sounds like a good solution.
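
To make the interleaving concrete, here is a rough userspace sketch of
the diagram above, with and without the temporary count bump. It is only
a toy model under my own assumptions: struct slot, struct pte,
alloc_slot() and simulate() are hypothetical names, not kernel code, and
the two CPUs are replayed in one fixed order instead of running
concurrently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; every name here is hypothetical, not kernel API. */
#define NR_SLOTS 2

struct slot { int count; int data; };            /* count == 0 means free */
struct pte  { bool is_swap; int slot; int page_data; };

struct outcome {
	bool pte_same_passed;   /* CPU0's recheck under the "pte lock" */
	bool stale_mapped;      /* CPU0 mapped old data over newer data */
	bool new_data_lost;     /* slot holding the newest data was freed */
};

/* Return the first free slot, or -1 if every slot is still referenced. */
static int alloc_slot(const struct slot *s)
{
	for (int i = 0; i < NR_SLOTS; i++)
		if (s[i].count == 0)
			return i;
	return -1;
}

/* Replay the CPU0/CPU1 interleaving; pin_entry models the count bump. */
static struct outcome simulate(bool pin_entry)
{
	struct slot slots[NR_SLOTS] = { { .count = 1, .data = 1 }, { 0, 0 } };
	struct pte pte = { .is_swap = true, .slot = 0 };
	struct outcome out = { false, false, false };

	/* CPU0: remember the PTE, optionally pin, read bypassing swap cache. */
	struct pte orig = pte;
	if (pin_entry)
		slots[orig.slot].count++;
	int folio_data = slots[orig.slot].data;

	/* CPU1: swap in, swap_free, write the page, then swap out again. */
	int page = slots[pte.slot].data;
	slots[pte.slot].count--;
	page += 1;                              /* content is now 2 */
	int reused = alloc_slot(slots);
	assert(reused >= 0);
	slots[reused].count = 1;
	slots[reused].data = page;
	pte = (struct pte){ .is_swap = true, .slot = reused };

	/* CPU0: take the "pte lock" and recheck the entry. */
	out.pte_same_passed = pte.is_swap && pte.slot == orig.slot;
	if (pin_entry)
		slots[orig.slot].count--;       /* drop the temporary pin */
	if (!out.pte_same_passed)
		return out;                     /* back off, retry the fault */

	slots[orig.slot].count--;               /* swap_free() */
	pte = (struct pte){ .is_swap = false, .page_data = folio_data };
	out.stale_mapped = pte.page_data != page;
	out.new_data_lost = out.stale_mapped && slots[orig.slot].count == 0;
	return out;
}
```

Calling simulate(false) from a small main() replays the race: slot 0 is
freed and handed out again, so the entry comparison passes and CPU0 maps
the old content. With simulate(true) the pin keeps slot 0 busy, the swap
out lands in slot 1, and CPU0's recheck fails so it backs off.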

Chris


