From: Mel Gorman <mgorman@techsingularity.net>
To: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: mhocko@suse.com, kvm@vger.kernel.org, marc.zyngier@arm.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, cai@lca.pw,
akpm@linux-foundation.org, kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH] mm, compaction: Make sure we isolate a valid PFN
Date: Fri, 24 May 2019 16:51:55 +0100
Message-ID: <20190524155155.GQ18914@techsingularity.net>
In-Reply-To: <1558711908-15688-1-git-send-email-suzuki.poulose@arm.com>
On Fri, May 24, 2019 at 04:31:48PM +0100, Suzuki K Poulose wrote:
> When we have holes in a normal memory zone, we could end up having
> cached_migrate_pfns which may not necessarily be valid, under heavy memory
> pressure with swapping enabled (via __reset_isolation_suitable(), triggered
> by kswapd).
>
> Later if we fail to find a page via fast_isolate_freepages(), we may
> end up using the migrate_pfn we started the search with, as a valid
> page. This could lead to NULL pointer dereferences like the one below,
> due to an invalid mem_section pointer.
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> Mem abort info:
> ESR = 0x96000004
> Exception class = DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> Data abort info:
> ISV = 0, ISS = 0x00000004
> CM = 0, WnR = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000082f94ae9
> [0000000000000008] pgd=0000000000000000
> Internal error: Oops: 96000004 [#1] SMP
> ...
> CPU: 10 PID: 6080 Comm: qemu-system-aar Not tainted 5.1.0-rc1+ #6
> Hardware name: AmpereComputing(R) OSPREY EV-883832-X3-0001/OSPREY, BIOS 4819 09/25/2018
> pstate: 60000005 (nZCv daif -PAN -UAO)
> pc : set_pfnblock_flags_mask+0x58/0xe8
> lr : compaction_alloc+0x300/0x950
> [...]
> Process qemu-system-aar (pid: 6080, stack limit = 0x0000000095070da5)
> Call trace:
> set_pfnblock_flags_mask+0x58/0xe8
> compaction_alloc+0x300/0x950
> migrate_pages+0x1a4/0xbb0
> compact_zone+0x750/0xde8
> compact_zone_order+0xd8/0x118
> try_to_compact_pages+0xb4/0x290
> __alloc_pages_direct_compact+0x84/0x1e0
> __alloc_pages_nodemask+0x5e0/0xe18
> alloc_pages_vma+0x1cc/0x210
> do_huge_pmd_anonymous_page+0x108/0x7c8
> __handle_mm_fault+0xdd4/0x1190
> handle_mm_fault+0x114/0x1c0
> __get_user_pages+0x198/0x3c0
> get_user_pages_unlocked+0xb4/0x1d8
> __gfn_to_pfn_memslot+0x12c/0x3b8
> gfn_to_pfn_prot+0x4c/0x60
> kvm_handle_guest_abort+0x4b0/0xcd8
> handle_exit+0x140/0x1b8
> kvm_arch_vcpu_ioctl_run+0x260/0x768
> kvm_vcpu_ioctl+0x490/0x898
> do_vfs_ioctl+0xc4/0x898
> ksys_ioctl+0x8c/0xa0
> __arm64_sys_ioctl+0x28/0x38
> el0_svc_common+0x74/0x118
> el0_svc_handler+0x38/0x78
> el0_svc+0x8/0xc
> Code: f8607840 f100001f 8b011401 9a801020 (f9400400)
> ---[ end trace af6a35219325a9b6 ]---
>
> The issue was reported on an arm64 server with 128GB of memory and holes in the zone
> (e.g., [32GB@4GB, 96GB@544GB]), with a swap device enabled, while running 100 KVM
> guest instances.
>
> This patch fixes the issue by ensuring that the page belongs to a valid PFN
> when we fall back to using the lower limit of the scan range upon failure in
> fast_isolate_freepages().
>
> Fixes: 5a811889de10f1eb ("mm, compaction: use free lists to quickly locate a migration target")
> Reported-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
--
Mel Gorman
SUSE Labs
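
The fallback check described in the quoted changelog can be modelled in
user space. The sketch below is not the kernel patch and does not use kernel
APIs: model_pfn_valid() and model_fallback_pfn() are hypothetical names
invented for illustration. The only inputs taken from the report are the
memory layout (32GB@4GB, 96GB@544GB) and the 4k page size. It shows the idea
only: a cached PFN that falls in a hole must be rejected before it is
treated as a page.

```c
#include <stdbool.h>
#include <stddef.h>

/* 4k pages, as reported in the oops ("user pgtable: 4k pages"). */
#define PAGE_SHIFT 12
#define GB_TO_PFN(gb) ((unsigned long long)(gb) << (30 - PAGE_SHIFT))

/* Hypothetical model of one populated physical range, end exclusive. */
struct mem_range {
	unsigned long long start_pfn;
	unsigned long long end_pfn;
};

/* Layout from the report: 32GB at 4GB and 96GB at 544GB. */
static const struct mem_range ranges[] = {
	{ GB_TO_PFN(4),   GB_TO_PFN(4 + 32) },
	{ GB_TO_PFN(544), GB_TO_PFN(544 + 96) },
};

/* Hypothetical stand-in for a validity check: true if the PFN is backed. */
static bool model_pfn_valid(unsigned long long pfn)
{
	for (size_t i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
		if (pfn >= ranges[i].start_pfn && pfn < ranges[i].end_pfn)
			return true;
	}
	return false;
}

/*
 * Hypothetical fallback: when the fast search finds nothing, only use the
 * lower limit of the scan range if it is backed by memory. On success the
 * PFN is stored in *out; on failure *out is left untouched.
 */
static bool model_fallback_pfn(unsigned long long min_pfn,
			       unsigned long long *out)
{
	if (!model_pfn_valid(min_pfn))
		return false;
	*out = min_pfn;
	return true;
}
```

With this layout, model_fallback_pfn(GB_TO_PFN(4), &pfn) succeeds and stores
0x100000, while model_fallback_pfn(GB_TO_PFN(100), &pfn) fails because 100GB
lies in the hole between the two ranges.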