From: "Huang\, Ying" <ying.huang@intel.com>
To: Gao Xiang <hsiangkao@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>,
Rafael Aquini <aquini@redhat.com>,
Carlos Maiolino <cmaiolino@redhat.com>,
Eric Sandeen <esandeen@redhat.com>,
stable <stable@vger.kernel.org>
Subject: Re: [PATCH] mm, THP, swap: fix allocating cluster for swapfile by mistake
Date: Thu, 20 Aug 2020 12:36:08 +0800
Message-ID: <871rk2x7bb.fsf@yhuang-dev.intel.com>
In-Reply-To: <20200819195613.24269-1-hsiangkao@redhat.com> (Gao Xiang's message of "Thu, 20 Aug 2020 03:56:13 +0800")

Gao Xiang <hsiangkao@redhat.com> writes:

> SWP_FS doesn't mean the device is file-backed swap device,
> which just means each writeback request should go through fs
> by DIO. Or it'll just use extents added by .swap_activate(),
> but it also works as file-backed swap device.
>
> So in order to achieve the goal of the original patch,
> SWP_BLKDEV should be used instead.
>
> FS corruption can be observed with SSD device + XFS +
> fragmented swapfile due to CONFIG_THP_SWAP=y.
>
> Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
> Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
> Cc: "Huang, Ying" <ying.huang@intel.com>
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

Good catch!  The fix itself looks good to me, although the description is
a little confusing.

After some digging, it seems that SWP_FS is set only on swap devices
whose swap entry reads and writes go through the file-system-specific
callbacks (currently used by swap over NFS only).  So a device without
SWP_FS is not necessarily a block device; it may still be a regular
swapfile.

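
To make the distinction concrete, here is a minimal user-space sketch of
the two checks.  The flag names follow the ones in the patch, but the
DEMO_ prefix, the bit values and the helper functions are illustrative
assumptions, not kernel code.

```c
#include <stdbool.h>

/*
 * Illustrative bit values only (assumption); the real flags are
 * defined in include/linux/swap.h.
 */
enum {
	DEMO_SWP_BLKDEV = 1 << 0,	/* swap area is a block device */
	DEMO_SWP_FS     = 1 << 1,	/* I/O goes through fs callbacks */
};

/* Check before the patch: anything without SWP_FS gets a huge cluster. */
static bool demo_old_check(unsigned int flags)
{
	return !(flags & DEMO_SWP_FS);
}

/* Check after the patch: only block devices get a huge cluster. */
static bool demo_new_check(unsigned int flags)
{
	return (flags & DEMO_SWP_BLKDEV) != 0;
}
```

In this model a regular swapfile (for example on XFS) has neither flag
set, so demo_old_check(0) is true while demo_new_check(0) is false.  For
a block device and for swap over NFS the two checks agree.
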
Best Regards,
Huang, Ying
> ---
>
> I reproduced the issue with the following details:
>
> Environment:
> QEMU + upstream kernel + buildroot + NVMe (2 GB)
>
> Kernel config:
> CONFIG_BLK_DEV_NVME=y
> CONFIG_THP_SWAP=y
>
> Some reproducible steps:
> mkfs.xfs -f /dev/nvme0n1
> mkdir /tmp/mnt
> mount /dev/nvme0n1 /tmp/mnt
> bs="32k"
> sz="1024m" # doesn't matter too much, I also tried 16m
> xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
>
> mkswap /tmp/mnt/sw
> swapon /tmp/mnt/sw
>
> stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
>
> Symptoms:
> - FS corruption (e.g. checksum failure)
> - memory corruption at: 0xd2808010
> - segfault
> ...
>
> mm/swapfile.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 6c26916e95fd..2937daf3ca02 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1074,7 +1074,7 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
>  			goto nextsi;
>  		}
>  		if (size == SWAPFILE_CLUSTER) {
> -			if (!(si->flags & SWP_FS))
> +			if (si->flags & SWP_BLKDEV)
>  				n_ret = swap_alloc_cluster(si, swp_entries);
>  		} else
>  			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
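
One way to picture why a fragmented swapfile is the problematic case: a
huge cluster covers SWAPFILE_CLUSTER consecutive swap slots, but on a
fragmented file those slots need not be consecutive on disk.  The sketch
below is a simplified user-space model, not kernel code.  struct
demo_extent and demo_range_contiguous() are hypothetical names, and the
model assumes the trouble starts when one cluster spans extents that are
not adjacent on disk.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one swap extent. */
struct demo_extent {
	unsigned long start_page;	/* first swap slot covered */
	unsigned long nr_pages;		/* number of slots covered */
	unsigned long start_block;	/* first on-disk block, in page units */
};

/*
 * Return true if swap slots [first, first + nr) are all covered by the
 * extents and map to consecutive on-disk blocks.  The extents must be
 * sorted by start_page and must not overlap.
 */
static bool demo_range_contiguous(const struct demo_extent *ext, size_t n,
				  unsigned long first, unsigned long nr)
{
	unsigned long pos = first, end = first + nr;
	unsigned long expect_block = 0;
	bool have_block = false;

	for (size_t i = 0; i < n && pos < end; i++) {
		const struct demo_extent *e = &ext[i];
		unsigned long e_end = e->start_page + e->nr_pages;

		if (pos < e->start_page)
			return false;	/* hole before this extent */
		if (pos >= e_end)
			continue;	/* extent lies before the range */

		unsigned long block = e->start_block + (pos - e->start_page);
		if (have_block && block != expect_block)
			return false;	/* on-disk location jumps */

		unsigned long take = (end < e_end ? end : e_end) - pos;
		expect_block = block + take;
		have_block = true;
		pos += take;
	}
	return pos == end;		/* false if the range ran off the end */
}
```

With a single large extent any cluster-sized range passes this check.
With many small extents scattered over the disk, as the 32k random
writes in the reproducer are meant to produce, most cluster-sized ranges
fail it.
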