From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Matthew Wilcox <willy@infradead.org>
Cc: Hillf Danton <hdanton@sina.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
syzbot <syzbot+c48f34012b06c4ac67dd@syzkaller.appspotmail.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
syzkaller-bugs@googlegroups.com,
Mike Kravetz <mike.kravetz@oracle.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Jens Axboe <axboe@kernel.dk>,
Markus Elfring <Markus.Elfring@web.de>
Subject: Re: kernel BUG at include/linux/swapops.h:LINE!
Date: Mon, 27 Jul 2020 13:31:40 +0300
Message-ID: <20200727103140.xycdx6ctecomqsoe@box>
In-Reply-To: <20200726164904.GG23808@casper.infradead.org>

On Sun, Jul 26, 2020 at 05:49:04PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 24, 2020 at 02:13:11PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Jul 23, 2020 at 03:37:44PM +0800, Hillf Danton wrote:
> > >
> > > On Tue, 21 Jul 2020 14:11:31 +0300 Kirill A. Shutemov wrote:
> > > > On Mon, Jul 20, 2020 at 04:51:44PM -0700, Andrew Morton wrote:
> > > > > On Sun, 19 Jul 2020 14:10:19 -0700 syzbot wrote:
> > > > >
> > > > > > syzbot has found a reproducer for the following issue on:
> > > > > >
> > > > > > HEAD commit: 4c43049f Add linux-next specific files for 20200716
> > > > > > git tree: linux-next
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=12c56087100000
> > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=2c76d72659687242
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=c48f34012b06c4ac67dd
> > > > > > compiler: gcc (GCC) 10.1.0-syz 20200507
> > > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1344abeb100000
> > > > > >
> > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > Reported-by: syzbot+c48f34012b06c4ac67dd@syzkaller.appspotmail.com
> > > > >
> > > > > Thanks.
> > > > >
> > > > > __handle_mm_fault
> > > > > ->pmd_migration_entry_wait
> > > > > ->migration_entry_to_page
> > > > >
> > > > > stumbled onto an unlocked page.
> > > > >
> > > > > I don't immediately see a cause. Perhaps Matthew's "THP prep patches",
> > > > > perhaps something else.
> > > > >
> > > > > Is it possible to perform a bisection?
> > > >
> > > > Maybe it's related to the new lock_page_async()?
> > >
> > > Or is there likely the window that after copy_huge_pmd() the src pmd migrate
> > > entry is removed and the page unlocked but the dst is not?
> >
> > No.
> >
> > copy_huge_pmd() runs with exclusive mmap_lock on the source side and
> > destination side is not running yet.
>
> The one I'm hitting is huge related though.
>
> I added this debug:
>
> +++ b/include/linux/swapops.h
> @@ -165,8 +165,9 @@ static inline struct page *device_private_entry_to_page(swp_entry_t entry)
> #ifdef CONFIG_MIGRATION
> static inline swp_entry_t make_migration_entry(struct page *page, int write)
> {
> - BUG_ON(!PageLocked(compound_head(page)));
> + VM_BUG_ON_PAGE(!PageLocked(page), page);
>
> +if (PageCompound(page)) printk("pfn %lx order %d\n", page_to_pfn(page), thp_order(thp_head(page)));
> return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ,
> page_to_pfn(page));
> }
> @@ -194,7 +195,11 @@ static inline struct page *migration_entry_to_page(swp_entry_t entry)
> * Any use of migration entries may only occur while the
> * corresponding page is locked
> */
> - BUG_ON(!PageLocked(compound_head(p)));
> + if (!PageLocked(p)) {
> + dump_page(p, "not locked");
> + printk("swap entry %d.%lx\n", swp_type(entry), swp_offset(entry));
> + BUG();
> + }
> return p;
> }
>
>
> and got useful output (while running generic/086):
>
> 1457 086 (20181): drop_caches: 3
> 1457 page:00000000a216ae9a refcount:2 mapcount:0 mapping:000000009ba7bfed index:0x2227 pfn:0x229e7
> 1457 aops:def_blk_aops ino:0
> 1457 flags: 0x4000000000002030(lru|active|private)
> 1457 raw: 4000000000002030 fffff5b4416b5a48 fffff5b4408a7988 ffff9e9c34848578
> 1457 raw: 0000000000002227 ffff9e9bd18f0d00 00000002ffffffff 0000000000000000
> 1457 page dumped because: not locked
> 1457 swap entry 30.229e7
> 1457 ------------[ cut here ]------------
> 1457 kernel BUG at include/linux/swapops.h:201!
> 1457 invalid opcode: 0000 [#1] SMP PTI
> 1457 CPU: 3 PID: 646 Comm: check Kdump: loaded Tainted: G W 5.8.0-rc6-00067-gd8b18bdf9870-dirty #355
> 1457 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> 1457 RIP: 0010:__migration_entry_wait+0x109/0x110
> [...]
>
> Looking back in the trace, I see:
>
> ...
> 1457 pfn 229e5 order 9
> 1457 pfn 229e6 order 9
> 1457 pfn 229e7 order 9
> 1457 pfn 229e8 order 9
> 1457 pfn 229e9 order 9
> ...
>
> so I would say we have a refcount problem. I've probably made it worse by
> creating more THPs, but I don't think I'm the originator of the problem.
>
> I know very little about the migration code today. I suspect I'm going
> to have to learn about it next week.

It would be interesting to know if the migration entries ever got removed
for this pfn. I mean, whether remove_migration_pte() got called for it.

It could be an rmap issue too. Maybe it misses the PMD in
remove_migration_ptes() or something.
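
One way to check this from a captured log is sketched below as a small
user-space helper. It is only a sketch under assumptions: the
"pfn %lx order %d" format comes from the debug patch quoted above, but
the "remove pfn %lx" format is hypothetical and assumes a matching
printk has been added to remove_migration_pte(), which the quoted patch
does not do. The names parse_line() and migration_balance() are made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* What a single log line tells us about a pfn. */
enum pfn_event { EV_NONE, EV_MAKE, EV_REMOVE };

/*
 * Classify one log line and extract the pfn it mentions.
 *
 * "pfn %lx order %d" is the format printed by the quoted debug patch in
 * make_migration_entry().
 * "remove pfn %lx" is an ASSUMED format: it only appears if a printk
 * like that is added to remove_migration_pte() by hand.
 */
static enum pfn_event parse_line(const char *line, unsigned long *pfn)
{
    const char *p;
    int order;

    /* Check the removal marker first, since it also contains "pfn ". */
    p = strstr(line, "remove pfn ");
    if (p && sscanf(p, "remove pfn %lx", pfn) == 1)
        return EV_REMOVE;

    p = strstr(line, "pfn ");
    if (p && sscanf(p, "pfn %lx order %d", pfn, &order) == 2)
        return EV_MAKE;

    return EV_NONE;
}

/*
 * Return (migration entries made) - (migration entries removed) for one
 * pfn across the log. A positive result means at least one entry for
 * that pfn was created and never reported as removed.
 */
static int migration_balance(const char *const *lines, size_t n,
                             unsigned long target)
{
    int balance = 0;

    assert(lines != NULL);

    for (size_t i = 0; i < n; i++) {
        unsigned long pfn = 0;

        switch (parse_line(lines[i], &pfn)) {
        case EV_MAKE:
            if (pfn == target)
                balance++;
            break;
        case EV_REMOVE:
            if (pfn == target)
                balance--;
            break;
        default:
            break;
        }
    }
    return balance;
}
```

Calling migration_balance() with the log lines and 0x229e7 as the target
would show whether that pfn still has an outstanding migration entry. A
non-zero balance would point towards the removal path, but it would not
by itself say whether rmap missed the PMD.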
--
Kirill A. Shutemov