From: Alex Thorlton <athorlton@sgi.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Subject: Re: BUG: mm, numa: test segfaults, only when NUMA balancing is on
Date: Thu, 7 Nov 2013 15:52:28 -0600
Message-ID: <20131107215228.GA4236@sgi.com>
In-Reply-To: <20131016155429.GP25735@sgi.com>

On Wed, Oct 16, 2013 at 10:54:29AM -0500, Alex Thorlton wrote:
> Hi guys,
> 
> I ran into a bug a week or so ago, that I believe has something to do
> with NUMA balancing, but I'm having a tough time tracking down exactly
> what is causing it.  When running with the following configuration
> options set:
> 
> CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
> CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
> CONFIG_NUMA_BALANCING=y
> # CONFIG_HUGETLBFS is not set
> # CONFIG_HUGETLB_PAGE is not set
> 
> I get intermittent segfaults when running the memscale test that we've
> been using to test some of the THP changes.  Here's a link to the test:
> 
> ftp://shell.sgi.com/collect/memscale/

For anyone who's interested, this test has been moved to:

http://oss.sgi.com/projects/memtests/thp_memscale.tar.gz

It should remain there permanently.

> 
> I typically run the test with a line similar to this:
> 
> ./thp_memscale -C 0 -m 0 -c <cores> -b <memory>
> 
> Where <cores> is the number of cores to spawn threads on, and <memory>
> is the amount of memory to reserve from each core.  The <memory> field
> can accept values like 512m or 1g, etc.  I typically run 256 cores and
> 512m, though I think the problem should be reproducible on anything with
> 128+ cores.
> 
> The test never seems to have any problems when running with hugetlbfs
> on and NUMA balancing off, but it segfaults every once in a while with
> the config options above.  It seems to occur more frequently, the more
> cores you run on.  It segfaults on about 50% of the runs at 256 cores,
> and on almost every run at 512 cores.  The fewest number of cores I've
> seen a segfault on has been 128, though it seems to be rare on this many
> cores.
> 
> At this point, I'm not familiar enough with NUMA balancing code to know
> what could be causing this, and we don't typically run with NUMA
> balancing on, so I don't see this in my everyday testing, but I felt
> that it was definitely worth bringing up.
> 
> If anybody has any ideas of where I could poke around to find a
> solution, please let me know.
> 
> - Alex
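
Since the segfault is intermittent (about 50% of runs at 256 cores, per the
report above), it can help to put a number on the failure rate before and
after any change. The following is a minimal sketch of such a harness, not
something from the original thread. The function name count_segfaults is
hypothetical, and it assumes the test binary is executed directly rather than
through a shell, so that death by SIGSEGV shows up as a negative return code.

```python
import signal
import subprocess


def count_segfaults(cmd, runs):
    """Run cmd `runs` times and return how many runs were killed by SIGSEGV.

    Assumption: cmd is executed directly (no shell), so on POSIX a child
    killed by signal N is reported with returncode == -N.
    """
    segfaults = 0
    for _ in range(runs):
        # Output is discarded; only the way the process ended matters here.
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == -signal.SIGSEGV:
            segfaults += 1
    return segfaults
```

With the invocation quoted above, a call might look like
count_segfaults(["./thp_memscale", "-C", "0", "-m", "0", "-c", "256", "-b", "512m"], 20),
where the run count of 20 is an arbitrary choice.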



