From: Qian Cai <cai@lca.pw>
To: Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: David Airlie <airlied@linux.ie>,
	bugzilla-daemon@bugzilla.kernel.org,
	dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
	Huang Rui <ray.huang@amd.com>,
	petr@vandrovec.name, Christian Koenig <christian.koenig@amd.com>
Subject: Re: [Bug 204407] New: Bad page state in process Xorg
Date: Fri, 02 Aug 2019 17:17:30 -0400
Message-ID: <1564780650.11067.50.camel@lca.pw>
In-Reply-To: <20190802203344.GD5597@bombadil.infradead.org>

On Fri, 2019-08-02 at 13:33 -0700, Matthew Wilcox wrote:
> On Fri, Aug 02, 2019 at 01:23:06PM -0700, Andrew Morton wrote:
> > > [259701.387365] BUG: Bad page state in process Xorg  pfn:2a300
> > > [259701.393593] page:ffffea0000a8c000 refcount:0 mapcount:-128
> > > mapping:0000000000000000 index:0x0
> 
> mapcount -128 is PAGE_MAPCOUNT_RESERVE, aka PageBuddy.  I think somebody
> called put_page() once more than they should have.  The one before this
> caused it to be freed to the page allocator, which set PageBuddy.  Then
> this one happened and we got a complaint.
> 
> > > [259701.402832] flags: 0x2000000000000000()
> > > [259701.407426] raw: 2000000000000000 ffffffff822ab778 ffffea0000a8f208
> > > 0000000000000000
> > > [259701.415900] raw: 0000000000000000 0000000000000003 00000000ffffff7f
> > > 0000000000000000
> > > [259701.424373] page dumped because: nonzero mapcount
> 
> It occurs to me that when a page is freed, we could record some useful bits
> of information in the page from the stack trace to help debug double-free 
> situations.  Even just stashing __builtin_return_address in page->mapping
> would be helpful, I think.

Sounds like you need to enable "page_owner", so that the bad page report will
also call __dump_page_owner().

> 
> > > [259701.549382] Call Trace:
> > > [259701.549382]  dump_stack+0x46/0x60
> > > [259701.549382]  bad_page.cold.28+0x81/0xb4
> > > [259701.549382]  __free_pages_ok+0x236/0x240
> > > [259701.549382]  __ttm_dma_free_page+0x2f/0x40
> > > [259701.549382]  ttm_dma_unpopulate+0x29b/0x370
> > > [259701.549382]  ttm_tt_destroy.part.6+0x44/0x50
> > > [259701.549382]  ttm_bo_cleanup_memtype_use+0x29/0x70
> > > [259701.549382]  ttm_bo_put+0x225/0x280
> > > [259701.549382]  ttm_bo_vm_close+0x10/0x20
> > > [259701.549382]  remove_vma+0x20/0x40
> > > [259701.549382]  __do_munmap+0x2da/0x420
> > > [259701.549382]  __vm_munmap+0x66/0xc0
> > > [259701.549382]  __x64_sys_munmap+0x22/0x30
> > > [259701.549382]  do_syscall_64+0x5e/0x1a0
> > > [259701.549382]  ? prepare_exit_to_usermode+0x75/0xa0
> > > [259701.549382]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > [259701.549382] RIP: 0033:0x7f504d0ec1d7
> > > [259701.549382] Code: 10 e9 67 ff ff ff 0f 1f 44 00 00 48 8b 15 b1 6c 0c
> > > 00 f7
> > > d8 64 89 02 48 c7 c0 ff ff ff ff e9 6b ff ff ff b8 0b 00 00 00 0f 05 <48>
> > > 3d 01
> > > f0 ff ff 73 01 c3 48 8b 0d 89 6c 0c 00 f7 d8 64 89 01 48
> > > [259701.549382] RSP: 002b:00007ffe529db138 EFLAGS: 00000206 ORIG_RAX:
> > > 000000000000000b
> > > [259701.549382] RAX: ffffffffffffffda RBX: 0000564a5eabce70 RCX:
> > > 00007f504d0ec1d7
> > > [259701.549382] RDX: 00007ffe529db140 RSI: 0000000000400000 RDI:
> > > 00007f5044b65000
> > > [259701.549382] RBP: 0000564a5eafe460 R08: 000000000000000b R09:
> > > 000000010283e000
> > > [259701.549382] R10: 0000000000000001 R11: 0000000000000206 R12:
> > > 0000564a5e475b08
> > > [259701.549382] R13: 0000564a5e475c80 R14: 00007ffe529db190 R15:
> > > 0000000000000c80
> > > [259701.707238] Disabling lock debugging due to kernel taint
> > 
> > I assume the above is misbehaviour in the DRM code?
> 
> 