From: Joshua Kinard <kumba@gentoo.org>
To: LKML <linux-kernel@vger.kernel.org>,
Linux MIPS List <linux-mips@linux-mips.org>
Subject: Re: MIPS: BUG() in isolate_lru_pages in mm/vmscan.c?
Date: Sat, 25 Apr 2015 14:53:29 -0400
Message-ID: <553BE2A9.2090500@gentoo.org>
In-Reply-To: <553BB91C.3010308@gentoo.org>
On 04/25/2015 11:56, Joshua Kinard wrote:
> I keep tripping a BUG() in isolate_lru_pages() at mm/vmscan.c:1345:
>
> 		switch (__isolate_lru_page(page, mode)) {
> 		case 0:
> 			nr_pages = hpage_nr_pages(page);
> 			mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
> 			list_move(&page->lru, dst);
> 			nr_taken += nr_pages;
> 			break;
>
> 		case -EBUSY:
> 			/* else it is being freed elsewhere */
> 			list_move(&page->lru, src);
> 			continue;
>
> 		default:
> 			BUG();
> 		}
>
> This is on an SGI Onyx2 platform (MIPS, IP27), two node boards (4x R14000
> CPUs), and 8G of RAM. The problem appears tied to heavy disk I/O, typically
> writes. I can reproduce sometimes with a long bonnie++ run, but I haven't
> gotten a recent panic() message under 4.0 yet. Most of the time, it silently
> hardlocks. I only have serial console access at 9600bps, so it may lock up
> before the serial driver can finish dumping the panic.
>
> Is there any information behind the purpose or triggers of this BUG()? I went
> back in git all the way to the initial 2006 commit that added this function,
> but could not find any comments or explanation of just what it's protecting
> against. That makes it hard to know where to start debugging.
>
> I've already tried switching filesystems, first ext4, now XFS. Enabling
> CONFIG_NUMA seems to make it harder to trigger, but that's not an objective
> observation. An md RAID resync doesn't appear to trigger it either.
This patch from 2007-03-16 seems to explain things a little:
http://marc.info/?l=linux-mm-commits&m=117401513810763&w=2
> Subject: lumpy: back out removal of active check in isolate_lru_pages
> From: Andy Whitcroft <apw@shadowen.org>
>
> As pointed out by Christoph Lameter it should not be possible for a page to
> change its active/inactive state without taking the lru_lock. Reinstate this
> safety net.
>
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> Acked-by: Mel Gorman <mel@csn.ul.ie>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> mm/vmscan.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff -puN mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages mm/vmscan.c
> --- a/mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages
> +++ a/mm/vmscan.c
> @@ -686,10 +686,13 @@ static unsigned long isolate_lru_pages(u
>  			nr_taken++;
>  			break;
>
> -		default:
> -			/* page is being freed, or is a missmatch */
> +		case -EBUSY:
> +			/* else it is being freed elsewhere */
>  			list_move(&page->lru, src);
>  			continue;
> +
> +		default:
> +			BUG();
>  		}
>
>  		if (!order)
So if my reading is correct, the BUG() is being triggered because a page might
be changing its active/inactive state without taking the lru_lock. Given that
the SGI IP27 platform is an early NUMA machine whose nodes can sit some
physical distance apart (and so see some latency between them), could this be
a sign of an SMP race condition specific to this platform?
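
To make that reading concrete, here is a rough user-space model of the switch
quoted above. It is only a sketch, not kernel code: struct model_page, the
model_* names, and the two boolean flags are all made up for illustration.
It also assumes, from my reading of the 4.0 source, that
__isolate_lru_page() returns -EINVAL for a page that is on an LRU list but
does not have the LRU flag set, and -EBUSY when the page's refcount has
already dropped to zero. If that assumption holds, only the first of those
two cases can reach the default: label.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the state this model needs. */
struct model_page {
	bool on_lru;		/* models PageLRU(page) */
	bool being_freed;	/* models get_page_unless_zero(page) failing */
};

/* What the switch in isolate_lru_pages() does with each return value. */
enum model_action { MODEL_TAKE, MODEL_PUT_BACK, MODEL_BUG };

/* Simplified model (an assumption) of __isolate_lru_page()'s return values. */
static int model_isolate_lru_page(struct model_page *page)
{
	if (!page->on_lru)
		return -EINVAL;	/* on the list, but the LRU flag is clear */
	if (page->being_freed)
		return -EBUSY;	/* refcount already zero, freed elsewhere */
	page->on_lru = false;	/* the caller now owns the page */
	return 0;
}

/* Mirrors the three arms of the switch for a single page. */
static enum model_action model_scan_one(struct model_page *page)
{
	assert(page);

	switch (model_isolate_lru_page(page)) {
	case 0:
		return MODEL_TAKE;	/* moved to dst, counted in nr_taken */
	case -EBUSY:
		return MODEL_PUT_BACK;	/* moved back to src */
	default:
		return MODEL_BUG;	/* the kernel calls BUG() here */
	}
}
```

In this model a page that is still linked on the list but has lost its LRU
flag is the only input that yields MODEL_BUG, which would fit the "state
changed without lru_lock" explanation in the patch description.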
--J