From: <kabe@vega.pgw.jp>
To: bhe@redhat.com
Cc: bugzilla-daemon@bugzilla.kernel.org, akpm@linux-foundation.org,
richardw.yang@linux.intel.com, david@redhat.com,
mhocko@kernel.org, n-horiguchi@ah.jp.nec.com, linux-mm@kvack.org,
kkabe@vega.pgw.jp
Subject: Re: [Bug 206401] kernel panic on Hyper-V after 5 minutes due to memory hot-add
Date: Thu, 13 Feb 2020 13:22:06 +0900
Message-ID: <200213132206.M0106897@vega.pgw.jp>
In-Reply-To: Your message of "Wed, 12 Feb 2020 15:31:23 +0800". <20200212073123.GG8965@MiWiFi-R3L-srv>
bhe@redhat.com said in <20200212073123.GG8965@MiWiFi-R3L-srv>
>> On 02/11/20 at 04:41pm, Andrew Morton wrote:
>> > On Tue, 11 Feb 2020 07:07:41 +0800 Wei Yang <richardw.yang@linux.intel.com> wrote:
>> >
>> > > On Mon, Feb 10, 2020 at 02:15:51PM +0800, Baoquan He wrote:
>> > > >On 02/10/20 at 02:09pm, Baoquan He wrote:
>> > > >> On 02/09/20 at 09:56pm, Andrew Morton wrote:
>> > > >> > On Mon, 10 Feb 2020 13:40:27 +0800 Baoquan He <bhe@redhat.com> wrote:
>> > > >> >
>> > > >> > > Hi Andrew,
>> > > >> > >
>> > > >> > > On 02/09/20 at 09:32pm, Andrew Morton wrote:
>> > > >> > > > On Tue, 04 Feb 2020 11:25:48 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
>> > > >> > > >
>> > > >> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=206401
>> > > >> > > > >
>> > > >> > > >
>> > > >> > > > An oops during mem hotadd. Could someone please take a look when
>> > > >> > > > convenient?
>> > > >> > >
>> > > >> > > This has been addressed by Wei Yang's patch, please check it here:
>> > > >> > >
>> > > >> > > http://lkml.kernel.org/r/20200209104826.3385-7-bhe@redhat.com
>> > > >> > >
>> > > >> >
>> > > >> > hm, OK, thanks. It's unfortunate that a 5.5 fix is buried in a
>> > > >> > six-patch series which is still in progress! Can we please merge that
>> > > >> > as a standalone fix with a cc:stable, Fixes:, etc?
>> > > >
>> > > >Maybe can add Fixes tag as follow when merge:
>> > > >
>> > > >Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
>> > > >
>> >
>> > The reporter (cc'ed here) is still seeing issues:
>> > https://bugzilla.kernel.org/show_bug.cgi?id=206401
>> >
>> > Could we please continue this investigation via emailed reply-to-all,
>> > rather than via the bugzilla interface?
>>
>> Yes, people prefer mailing list to discuss issues.
>>
>> Hi T.Kabe,
>>
>> Could you provide the call trace again after below patch is applied?
>> The comment #9 in bugzilla is not very clear to me.
>>
>> mm/sparsemem: pfn_to_page is not valid yet on SPARSEMEM
>> http://lkml.kernel.org/r/20200209104826.3385-7-bhe@redhat.com
>>
>> And, as you said, applying above patch, and do not call
>> __free_pages_core() in generic_online_page() will work. I doubt it,
>> because without __free_pages_core(), your added pages are not added
>> into buddy for managing. I think we should make clear this problem
>> firstly, in order not to introduce new problem by improper work around,
>> then check next.
>>
>> Thanks
>> Baoquan
Got it. I started fresh from kernel 5.6-rc1,
applied the patch
>> http://lkml.kernel.org/r/20200209104826.3385-7-bhe@redhat.com
and got the following panic.
The diagnostic printk()s for add_memory() et al. are not in this build,
but I guess the memory hot-add request from the hypervisor returns
"success", corrupts something else, and the kernel bombs out later.
[ 24.289967] Not activating Mandatory Access Control as /sbin/tomoyo-init does not exist.
[ 302.263730] hv_balloon: Max. dynamic memory size: 1048576 MB
[ 635.216014] BUG: unable to handle page fault for address: d13ff000
[ 635.216058] #PF: supervisor write access in kernel mode
[ 635.216076] #PF: error_code(0x0002) - not-present page
[ 635.216106] *pde = 00000000
[ 635.216139] Oops: 0002 [#1] SMP
[ 635.216171] CPU: 0 PID: 470 Comm: systemd-udevd Not tainted 5.6.0-rc1.el8.i586 #1
[ 635.216199] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 635.216233] EIP: wp_page_copy+0x8e/0x750
[ 635.216253] Code: 03 00 00 8b 45 d0 85 c0 0f 84 46 05 00 00 e8 d9 85 e5 ff 89 45 bc 89 f8 e8 cf 85 e5 ff 8b 55 bc 8d 78 04 8b 0a 83 e7 fc 89 d6 <89> 08 8b 8a fc 0f 00 00 89 88 fc 0f 00 00 89 c1 29 f9 89 55 bc 29
[ 635.216293] EAX: d13ff000 EBX: c3743f28 ECX: 00000000 EDX: c10c9000
[ 635.216314] ESI: c10c9000 EDI: d13ff004 EBP: c3743eec ESP: c3743ea8
[ 635.216336] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
[ 635.216368] CR0: 80050033 CR2: d13ff000 CR3: 03add000 CR4: 003406d0
[ 635.216389] Call Trace:
[ 635.216407] ? reuse_swap_page+0x83/0x390
[ 635.216425] do_wp_page+0x87/0x6e0
[ 635.216438] ? __do_sys_fstat64+0x4a/0x60
[ 635.216453] handle_mm_fault+0x808/0xe30
[ 635.216468] do_page_fault+0x19f/0x4d0
[ 635.216484] ? do_kern_addr_fault+0x80/0x80
[ 635.216500] common_exception_read_cr2+0x15a/0x15f
[ 635.216521] EIP: 0xb7b28104
[ 635.216538] Code: 29 f9 89 4c 24 10 83 f9 0f 0f 86 92 00 00 00 8b 45 40 8d 14 3e 8b 4c 24 0c 39 48 0c 75 74 8b 4c 24 0c 81 7c 24 10 ef 03 00 00 <89> 42 08 89 4a 0c 89 55 40 89 50 0c 76 0e c7 42 10 00 00 00 00 c7
[ 635.216591] EAX: b7c4e7d8 EBX: 000011a0 ECX: b7c4e7d8 EDX: 01994178
[ 635.216606] ESI: 01993168 EDI: 00001010 EBP: b7c4e7a0 ESP: bfcc9f00
[ 635.216628] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00210293
[ 635.216661] Modules linked in: rfkill intel_rapl_msr intel_rapl_common snd_pcm snd_timer snd soundcore crc32_pclmul intel_rapl_perf sg pcspkr hv_netvsc joydev i2c_piix4 hyperv_fb hv_utils hv_balloon ip_tables ext4 mbcache jbd2 sd_mod t10_pi sr_mod cdrom ata_generic hyperv_keyboard hid_hyperv hv_storvsc scsi_transport_fc ata_piix crc32c_intel serio_raw hv_vmbus libata
[ 635.216758] CR2: 00000000d13ff000
[ 635.216769] ---[ end trace dee4a93859538102 ]---
[ 635.216785] EIP: wp_page_copy+0x8e/0x750
[ 635.216811] Code: 03 00 00 8b 45 d0 85 c0 0f 84 46 05 00 00 e8 d9 85 e5 ff 89 45 bc 89 f8 e8 cf 85 e5 ff 8b 55 bc 8d 78 04 8b 0a 83 e7 fc 89 d6 <89> 08 8b 8a fc 0f 00 00 89 88 fc 0f 00 00 89 c1 29 f9 89 55 bc 29
[ 635.216847] EAX: d13ff000 EBX: c3743f28 ECX: 00000000 EDX: c10c9000
[ 635.216864] ESI: c10c9000 EDI: d13ff004 EBP: c3743eec ESP: c3743ea8
[ 635.216883] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
[ 635.216899] CR0: 80050033 CR2: d13ff000 CR3: 03add000 CR4: 003406d0
[ 635.216914] Kernel panic - not syncing: Fatal exception
[ 635.216926] Kernel Offset: 0x1400000 from 0xc1000000 (relocation range: 0xc0000000-0xcafeffff)
[ 635.216946] ---[ end Kernel panic - not syncing: Fatal exception ]---
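For reading the "#PF" lines in the trace above, here is a minimal sketch of how the error code 0x0002 breaks down. The bit layout is the standard x86 page-fault error code; the helper name decode_pf_error_code is made up for this illustration and is not a kernel function.

```python
# Illustrative decoder for the x86 page-fault error code shown in the oops.
# decode_pf_error_code is a hypothetical helper written for this sketch.
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access, 1: write access
PF_USER = 1 << 2   # 0: supervisor (kernel) mode, 1: user mode
PF_RSVD = 1 << 3   # 1: reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # 1: fault caused by an instruction fetch


def decode_pf_error_code(code):
    """Return a dict describing each flag in a page-fault error code."""
    return {
        "cause": "protection violation" if code & PF_PROT else "not-present page",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "supervisor",
        "reserved_bit": bool(code & PF_RSVD),
        "instruction_fetch": bool(code & PF_INSTR),
    }


if __name__ == "__main__":
    # Error code taken from the "#PF: error_code(0x0002)" line above.
    print(decode_pf_error_code(0x0002))
```

For 0x0002 this gives a supervisor-mode write to a not-present page, which agrees with what the kernel itself printed, and the faulting address in CR2 (d13ff000) is the same value as EAX at the time of the fault.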
--
kabe