All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@techsingularity.net>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: mhocko@suse.com, kvm@vger.kernel.org, marc.zyngier@arm.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, cai@lca.pw,
	akpm@linux-foundation.org, kvmarm@lists.cs.columbia.edu
Subject: Re: mm/compaction: BUG: NULL pointer dereference
Date: Fri, 24 May 2019 13:30:47 +0100	[thread overview]
Message-ID: <20190524123047.GO18914@techsingularity.net> (raw)
In-Reply-To: <cfddd75a-b302-5557-05b8-2b328bba27c8@arm.com>

On Fri, May 24, 2019 at 04:26:16PM +0530, Anshuman Khandual wrote:
> 
> 
> On 05/24/2019 02:50 PM, Suzuki K Poulose wrote:
> > Hi,
> > 
> > We are hitting NULL pointer dereferences while running stress tests with KVM.
> > See splat [0]. The test is to spawn 100 VMs all doing standard debian
> > installation (Thanks to Marc's automated scripts, available here [1] ).
> > The problem has been reproduced with a better rate of success from 5.1-rc6
> > onwards.
> > 
> > The issue is only reproducible with swapping enabled and the entire
> > memory is used up, when swapping heavily. Also this issue is only reproducible
> > on only one server with 128GB, which has the following memory layout:
> > 
> > [32GB@4GB, hole , 96GB@544GB]
> > 
> > Here is my non-expert analysis of the issue so far.
> > 
> > Under extreme memory pressure, the kswapd could trigger reset_isolation_suitable()
> > to figure out the cached values for migrate/free pfn for a zone, by scanning through
> > the entire zone. On our server it does so in the range of [ 0x10_0000, 0xa00_0000 ],
> > with the following area of holes : [ 0x20_0000, 0x880_0000 ].
> > In the failing case, we end up setting the cached migrate pfn as : 0x508_0000, which
> > is right in the center of the zone pfn range. i.e ( 0x10_0000 + 0xa00_0000 ) / 2,
> > with reset_migrate = 0x88_4e00, reset_free = 0x10_0000.
> > 
> > Now these cached values are used by the fast_isolate_freepages() to find a pfn. However,
> > since we cant find anything during the search we fall back to using the page belonging
> > to the min_pfn (which is the migrate_pfn), without proper checks to see if that is valid
> > PFN or not. This is then passed on to fast_isolate_around() which tries to do :
> > set_pageblock_skip(page) on the page which blows up due to an NULL mem_section pointer.
> > 
> > The following patch seems to fix the issue for me, but I am not quite convinced that
> > it is the right fix. Thoughts ?
> > 
> > 
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index 9febc8c..9e1b9ac 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -1399,7 +1399,7 @@ fast_isolate_freepages(struct compact_control *cc)
> >  				page = pfn_to_page(highest);
> >  				cc->free_pfn = highest;
> >  			} else {
> > -				if (cc->direct_compaction) {
> > +				if (cc->direct_compaction && pfn_valid(min_pfn)) {
> >  					page = pfn_to_page(min_pfn);
> 
> pfn_to_online_page() here would be better as it does not add pfn_valid() cost on
> architectures which does not subscribe to CONFIG_HOLES_IN_ZONE. But regardless if
> the compaction is trying to scan pfns in zone holes, then it should be avoided.

CONFIG_HOLES_IN_ZONE typically applies in special cases where an arch
punches holes within a section. As both do a section lookup, the cost is
similar but pfn_valid in general is less subtle in this case. Normally
pfn_valid_within is only ok when a pfn_valid check has been made on the
max_order aligned range as well as a zone boundary check. In this case,
it's much more straight-forward to leave it as pfn_valid.

-- 
Mel Gorman
SUSE Labs
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@techsingularity.net>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@suse.com,
	cai@lca.pw, linux-kernel@vger.kernel.org, marc.zyngier@arm.com,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org
Subject: Re: mm/compaction: BUG: NULL pointer dereference
Date: Fri, 24 May 2019 13:30:47 +0100	[thread overview]
Message-ID: <20190524123047.GO18914@techsingularity.net> (raw)
In-Reply-To: <cfddd75a-b302-5557-05b8-2b328bba27c8@arm.com>

On Fri, May 24, 2019 at 04:26:16PM +0530, Anshuman Khandual wrote:
> 
> 
> On 05/24/2019 02:50 PM, Suzuki K Poulose wrote:
> > Hi,
> > 
> > We are hitting NULL pointer dereferences while running stress tests with KVM.
> > See splat [0]. The test is to spawn 100 VMs all doing standard debian
> > installation (Thanks to Marc's automated scripts, available here [1] ).
> > The problem has been reproduced with a better rate of success from 5.1-rc6
> > onwards.
> > 
> > The issue is only reproducible with swapping enabled and the entire
> > memory is used up, when swapping heavily. Also this issue is only reproducible
> > on only one server with 128GB, which has the following memory layout:
> > 
> > [32GB@4GB, hole , 96GB@544GB]
> > 
> > Here is my non-expert analysis of the issue so far.
> > 
> > Under extreme memory pressure, the kswapd could trigger reset_isolation_suitable()
> > to figure out the cached values for migrate/free pfn for a zone, by scanning through
> > the entire zone. On our server it does so in the range of [ 0x10_0000, 0xa00_0000 ],
> > with the following area of holes : [ 0x20_0000, 0x880_0000 ].
> > In the failing case, we end up setting the cached migrate pfn as : 0x508_0000, which
> > is right in the center of the zone pfn range. i.e ( 0x10_0000 + 0xa00_0000 ) / 2,
> > with reset_migrate = 0x88_4e00, reset_free = 0x10_0000.
> > 
> > Now these cached values are used by the fast_isolate_freepages() to find a pfn. However,
> > since we cant find anything during the search we fall back to using the page belonging
> > to the min_pfn (which is the migrate_pfn), without proper checks to see if that is valid
> > PFN or not. This is then passed on to fast_isolate_around() which tries to do :
> > set_pageblock_skip(page) on the page which blows up due to an NULL mem_section pointer.
> > 
> > The following patch seems to fix the issue for me, but I am not quite convinced that
> > it is the right fix. Thoughts ?
> > 
> > 
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index 9febc8c..9e1b9ac 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -1399,7 +1399,7 @@ fast_isolate_freepages(struct compact_control *cc)
> >  				page = pfn_to_page(highest);
> >  				cc->free_pfn = highest;
> >  			} else {
> > -				if (cc->direct_compaction) {
> > +				if (cc->direct_compaction && pfn_valid(min_pfn)) {
> >  					page = pfn_to_page(min_pfn);
> 
> pfn_to_online_page() here would be better as it does not add pfn_valid() cost on
> architectures which does not subscribe to CONFIG_HOLES_IN_ZONE. But regardless if
> the compaction is trying to scan pfns in zone holes, then it should be avoided.

CONFIG_HOLES_IN_ZONE typically applies in special cases where an arch
punches holes within a section. As both do a section lookup, the cost is
similar but pfn_valid in general is less subtle in this case. Normally
pfn_valid_within is only ok when a pfn_valid check has been made on the
max_order aligned range as well as a zone boundary check. In this case,
it's much more straight-forward to leave it as pfn_valid.

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2019-05-25  9:11 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-24  9:20 mm/compaction: BUG: NULL pointer dereference Suzuki K Poulose
2019-05-24  9:20 ` Suzuki K Poulose
2019-05-24 10:39 ` Mel Gorman
2019-05-24 10:39   ` Mel Gorman
2019-05-24 10:42   ` Suzuki K Poulose
2019-05-24 10:42     ` Suzuki K Poulose
2019-05-24 15:31   ` [PATCH] mm, compaction: Make sure we isolate a valid PFN Suzuki K Poulose
2019-05-24 15:31     ` Suzuki K Poulose
2019-05-24 15:51     ` Mel Gorman
2019-05-24 15:51       ` Mel Gorman
2019-05-27  5:38     ` Anshuman Khandual
2019-05-27  5:38       ` Anshuman Khandual
2019-05-24 10:56 ` mm/compaction: BUG: NULL pointer dereference Anshuman Khandual
2019-05-24 10:56   ` Anshuman Khandual
2019-05-24 12:30   ` Mel Gorman [this message]
2019-05-24 12:30     ` Mel Gorman
2019-05-24 13:13     ` Anshuman Khandual
2019-05-24 13:13       ` Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190524123047.GO18914@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=cai@lca.pw \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=marc.zyngier@arm.com \
    --cc=mhocko@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.