From: Andrea Arcangeli <aarcange@redhat.com>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	torvalds@linux-foundation.org, kirill.shutemov@linux.intel.com,
	akpm@linux-foundation.org, hannes@cmpxchg.org,
	iamjoonsoo.kim@lge.com, mgorman@techsingularity.net,
	tony.luck@intel.com, vbabka@suse.cz, mhocko@kernel.org,
	hillf.zj@alibaba-inc.com, hughd@google.com, oleg@redhat.com,
	peterz@infradead.org, riel@redhat.com, srikar@linux.vnet.ibm.com,
	vdavydov.dev@gmail.com, mingo@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
Date: Thu, 18 Jan 2018 15:58:30 +0100
Message-ID: <20180118145830.GA6406@redhat.com>
In-Reply-To: <d8347087-18a6-1709-8aa8-3c6f2d16aa94@linux.intel.com>

On Thu, Jan 18, 2018 at 06:45:00AM -0800, Dave Hansen wrote:
> On 01/18/2018 04:25 AM, Kirill A. Shutemov wrote:
> > [   10.084024] diff: -858690919
> > [   10.084258] hpage_nr_pages: 1
> > [   10.084386] check1: 0
> > [   10.084478] check2: 0
> ...
> > diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> > index d22b84310f6d..57b4397f1ea5 100644
> > --- a/mm/page_vma_mapped.c
> > +++ b/mm/page_vma_mapped.c
> > @@ -70,6 +70,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
> >  		}
> >  		if (pte_page(*pvmw->pte) < pvmw->page)
> >  			return false;
> > +
> > +		if (pte_page(*pvmw->pte) - pvmw->page) {
> > +			printk("diff: %d\n", pte_page(*pvmw->pte) - pvmw->page);
> > +			printk("hpage_nr_pages: %d\n", hpage_nr_pages(pvmw->page));
> > +			printk("check1: %d\n", pte_page(*pvmw->pte) - pvmw->page < 0);
> > +			printk("check2: %d\n", pte_page(*pvmw->pte) - pvmw->page >= hpage_nr_pages(pvmw->page));
> > +			BUG();
> > +		}
> 
> This says that pte_page(*pvmw->pte) and pvmw->page are roughly 4GB away
> from each other (858690919*4=0xccba559c0).  That's not the compiler
> being wonky, it just means that the virtual addresses of the memory
> sections are that far apart.
> 
> This won't happen when you have vmemmap or flatmem because the mem_map[]
> is virtually contiguous and pointer arithmetic just works against all
> 'struct page' pointers.  But with classic sparsemem, it doesn't.
> 
> You need to make sure that the PFNs are in the same section before you
> can do the math that you want to do here.

Isn't it simply that pvmw->page isn't a page or pte_page(*pvmw->pte)
isn't a page?

The distance cannot matter: the MMU isn't involved, this is pure 64-bit
arithmetic. Whether the two pointers are 1 gigabyte or 1 terabyte apart,
and whether the machine uses 48-bit or 5-level page tables, is
meaningless in this comparison.

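#include <stdio.h>

int main(void)
{
	volatile long i;
	/* 4GB element, so consecutive "array slots" are 4GB apart */
	struct x { char a[4000000000]; };

	/* addresses that are exact multiples of sizeof(struct x) */
	for (i = 0; i < 4000000000*3; i += 4000000000) {
		/* %td is the printf conversion for a pointer difference */
		printf("%td\n", ((struct x *)0)-((((struct x *)i))));
	}
	printf("xxxx\n");
	/* addresses 1..3 are not multiples of sizeof(struct x) */
	for (i = 0; i < 4000000000; i += 1) {
		if (i==4)
			i = 4000000000;
		printf("%td\n", ((struct x *)0)-((((struct x *)i))));
	}
	return 0;
}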

You need to add two debug checks, one on "pte_page(*pvmw->pte) % 64"
and the same on pvmw->page (with the pointers cast to unsigned long),
to find out which of the two isn't a page.

If both are real pages, there's a bug that allocates page structs that
are not naturally aligned.
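
As a rough user-space illustration of the checks suggested above (this is
not kernel code): the sketch assumes a 64-byte struct page, as the "% 64"
implies, and the macro and function names are made up for this example.
The real sizeof(struct page) depends on the kernel configuration. The
helpers work on plain integer addresses, so nothing is dereferenced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: struct size taken from the "% 64" in the mail above. */
#define ASSUMED_PAGE_STRUCT_SIZE 64u

/* True if addr is a multiple of the assumed struct size. */
static inline bool addr_is_struct_aligned(uintptr_t addr)
{
	return addr % ASSUMED_PAGE_STRUCT_SIZE == 0;
}

/*
 * True if the byte distance between two addresses is a whole number of
 * structs, i.e. subtracting the two pointers would give a meaningful
 * element count rather than a truncated or garbage value.
 */
static inline bool distance_is_whole_structs(uintptr_t a, uintptr_t b)
{
	uintptr_t bytes = a > b ? a - b : b - a;

	return bytes % ASSUMED_PAGE_STRUCT_SIZE == 0;
}
```

In the kernel the same idea would presumably be applied to
(unsigned long)pte_page(*pvmw->pte) and (unsigned long)pvmw->page inside
the debug block, printing whichever value fails the check.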

