From: Joshua Kinard <kumba@gentoo.org>
To: LKML <linux-kernel@vger.kernel.org>,
Linux MIPS List <linux-mips@linux-mips.org>
Subject: Re: MIPS: BUG() in isolate_lru_pages in mm/vmscan.c?
Date: Sat, 25 Apr 2015 14:53:29 -0400
Message-ID: <553BE2A9.2090500@gentoo.org>
In-Reply-To: <553BB91C.3010308@gentoo.org>
On 04/25/2015 11:56, Joshua Kinard wrote:
> I keep tripping up a BUG() in isolate_lru_pages in mm/vmscan.c:1345:
>
> 		switch (__isolate_lru_page(page, mode)) {
> 		case 0:
> 			nr_pages = hpage_nr_pages(page);
> 			mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
> 			list_move(&page->lru, dst);
> 			nr_taken += nr_pages;
> 			break;
>
> 		case -EBUSY:
> 			/* else it is being freed elsewhere */
> 			list_move(&page->lru, src);
> 			continue;
>
> 		default:
> 			BUG();
> 		}
>
> This is on an SGI Onyx2 platform (MIPS, IP27), two node boards (4x R14000
> CPUs), and 8G of RAM. The problem appears tied to heavy disk I/O, typically
> writes. I can reproduce sometimes with a long bonnie++ run, but I haven't
> gotten a recent panic() message under 4.0 yet. Most of the time, it silently
> hardlocks. I only have serial console access at 9600bps, so it may lock too
> fast before the serial driver can dump the panic.
>
> Is there any information behind the purpose or triggers of this BUG()? I went
> back in git all the way to the initial 2006 commit that added this function,
> but could not find any comments or explanation of just what it's protecting
> against. That makes it hard to know where to start debugging.
>
> I've already tried switching filesystems, first ext4, now XFS. Enabling
> CONFIG_NUMA seems to make it harder to trigger, but that's not an objective
> observation. An md RAID resync doesn't appear to trigger it either.
This patch, dated 2007-03-16, seems to explain things a little bit:
http://marc.info/?l=linux-mm-commits&m=117401513810763&w=2
> Subject: lumpy: back out removal of active check in isolate_lru_pages
> From: Andy Whitcroft <apw@shadowen.org>
>
> As pointed out by Christoph Lameter it should not be possible for a page to
> change its active/inactive state without taking the lru_lock. Reinstate this
> safety net.
>
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> Acked-by: Mel Gorman <mel@csn.ul.ie>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> mm/vmscan.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff -puN mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages mm/vmscan.c
> --- a/mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages
> +++ a/mm/vmscan.c
> @@ -686,10 +686,13 @@ static unsigned long isolate_lru_pages(u
>  			nr_taken++;
>  			break;
>
> -		default:
> -			/* page is being freed, or is a missmatch */
> +		case -EBUSY:
> +			/* else it is being freed elsewhere */
>  			list_move(&page->lru, src);
>  			continue;
> +
> +		default:
> +			BUG();
>  		}
>
>  		if (!order)
So if my reading is correct, the BUG() fires because a page may be changing its
active/inactive state without taking the lru_lock. The SGI IP27 platform is an
early NUMA machine, and its nodes can be physically far enough apart to add some
latency. Could this be a sign of an SMP race condition specific to this
platform?
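
To make that reading concrete, here is a minimal userspace sketch of the
three-way dispatch. It is not kernel code: struct model_page,
model_isolate_page(), and model_dispatch() are hypothetical names made up for
illustration. The return convention is an assumption based on my
understanding of __isolate_lru_page() in 4.0: 0 on success, -EBUSY when the
page's refcount has already dropped to zero, and -EINVAL when the page sits on
the list without its LRU flag set.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the state the check looks at. */
struct model_page {
	bool on_lru;	/* models PageLRU(page) */
	int refcount;	/* models the page reference count */
};

enum isolate_outcome { ISOLATED, PUT_BACK, WOULD_BUG };

/* Assumed return convention: 0, -EBUSY, or -EINVAL. */
int model_isolate_page(struct model_page *page)
{
	if (!page->on_lru)
		return -EINVAL;	/* on the list, but not flagged as an LRU page */
	if (page->refcount == 0)
		return -EBUSY;	/* being freed elsewhere */
	page->refcount++;	/* take a reference ... */
	page->on_lru = false;	/* ... and claim the page */
	return 0;
}

/* Mirrors the switch in isolate_lru_pages(): anything unexpected is fatal. */
enum isolate_outcome model_dispatch(struct model_page *page)
{
	switch (model_isolate_page(page)) {
	case 0:
		return ISOLATED;
	case -EBUSY:
		return PUT_BACK;
	default:
		return WOULD_BUG;	/* the kernel calls BUG() here */
	}
}
```

Under those assumptions, the default branch is only reachable when a page is
found on an LRU list in a state the list walker does not expect, which is what
a missed lru_lock or a stale view of the page flags on another CPU would
look like.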
--J