Re: [bug, 5.2.16] kswapd/compaction null pointer crash [was Re: xfs_inode not reclaimed/memory leak on 5.2.16]

From: Florian Weimer <fw@deneb.enyo.de>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>,
	 linux-mm@kvack.org,  Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [bug, 5.2.16] kswapd/compaction null pointer crash [was Re: xfs_inode not reclaimed/memory leak on 5.2.16]
Date: Wed, 16 Oct 2019 21:38:49 +0200
Message-ID: <87blugh452.fsf@mid.deneb.enyo.de>
In-Reply-To: <96023250-6168-3806-320a-a3468f1cd8c9@suse.cz> (Vlastimil Babka's message of "Tue, 1 Oct 2019 11:10:22 +0200")

* Vlastimil Babka:

> On 9/30/19 11:17 PM, Dave Chinner wrote:
>> On Mon, Sep 30, 2019 at 09:07:53PM +0200, Florian Weimer wrote:
>>> * Dave Chinner:
>>>
>>>> On Mon, Sep 30, 2019 at 09:28:27AM +0200, Florian Weimer wrote:
>>>>> Simply running “du -hc” on a large directory tree causes du to be
>>>>> killed because of kernel paging request failure in the XFS code.
>>>>
>>>> dmesg output? if the system was still running, then you might be
>>>> able to pull the trace from syslog. But we can't do much without
>>>> knowing what the actual failure was....
>>>
>>> Huh.  I actually have something in syslog:
>>>
>>> [ 4001.238411] BUG: kernel NULL pointer dereference, address:
>>> 0000000000000000
>>> [ 4001.238415] #PF: supervisor read access in kernel mode
>>> [ 4001.238417] #PF: error_code(0x0000) - not-present page
>>> [ 4001.238418] PGD 0 P4D 0 
>>> [ 4001.238420] Oops: 0000 [#1] SMP PTI
>>> [ 4001.238423] CPU: 3 PID: 143 Comm: kswapd0 Tainted: G I 5.2.16fw+
>>> #1
>>> [ 4001.238424] Hardware name: System manufacturer System Product
>>> Name/P6X58D-E, BIOS 0701 05/10/2011
>>> [ 4001.238430] RIP: 0010:__reset_isolation_pfn+0x27f/0x3c0
>> 
>> That's memory compaction code it's crashed in.
>> 
>>> [ 4001.238432] Code: 44 c6 48 8b 00 a8 10 74 bc 49 8b 16 48 89 d0
>>> 48 c1 ea 35 48 8b 14 d7 48 c1 e8 2d 48 85 d2 74 0a 0f b6 c0 48 c1
>>> e0 04 48 01 c2 <48> 8b 02 4c 89 f2 41 b8 01 00 00 00 31 f6 b9 03 00
>>> 00 00 4c 89 f7
>
> Tried to decode it, but couldn't match it to source code, my version of
> compiled code is too different. Would it be possible to either send
> mm/compaction.o from the matching build, or output of 'objdump -d -l'
> for the __reset_isolation_pfn function?

(dropping the fs lists)

I got another crash, this time triggered by rsync (large tree with
many small files, few files changed).

Oops:

[41969.140117] BUG: kernel NULL pointer dereference, address: 0000000000000000
[41969.140121] #PF: supervisor read access in kernel mode
[41969.140122] #PF: error_code(0x0000) - not-present page
[41969.140123] PGD 0 P4D 0
[41969.140125] Oops: 0000 [#1] SMP PTI
[41969.140127] CPU: 5 PID: 144 Comm: kswapd0 Tainted: G          I       5.2.18fw+ #10
[41969.140128] Hardware name: System manufacturer System Product Name/P6X58D-E, BIOS 0701    05/10/2011
[41969.140133] RIP: 0010:__reset_isolation_pfn+0x27f/0x3c0
[41969.140134] Code: 44 c6 48 8b 00 a8 10 74 bc 49 8b 16 48 89 d0 48 c1 ea 35 48 8b 14 d7 48 c1 e8 2d 48 85 d2 74 0a 0f b6 c0 48 c1 e0 04 48 01 c2 <48> 8b 02 4c 89 f2 41 b8 01 00 00 00 31 f6 b9 03 00 00 00 4c 89 f7
[41969.140135] RSP: 0018:ffffc900003ffde0 EFLAGS: 00010246
[41969.140137] RAX: 000000000004fdac RBX: 0000000000118000 RCX: 0000000000000000
[41969.140138] RDX: 0000000000000000 RSI: 0000000000000230 RDI: ffff88833fffa000
[41969.140138] RBP: ffffc900003ffe18 R08: 000000000000003c R09: ffff888335080000
[41969.140139] R10: ffff88833fff9000 R11: 0000000000000000 R12: 0000000000000001
[41969.140140] R13: 0000000000000001 R14: ffff888338dc01c0 R15: 0000000000000001
[41969.140141] FS:  0000000000000000(0000) GS:ffff888333d40000(0000) knlGS:0000000000000000
[41969.140142] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[41969.140143] CR2: 0000000000000000 CR3: 000000000200a001 CR4: 00000000000206e0
[41969.140144] Call Trace:
[41969.140147]  __reset_isolation_suitable+0x9b/0x120
[41969.140149]  reset_isolation_suitable+0x3b/0x40
[41969.140152]  kswapd+0x98/0x300
[41969.140154]  ? wait_woken+0x80/0x80
[41969.140157]  kthread+0x114/0x130
[41969.140158]  ? balance_pgdat+0x450/0x450
[41969.140159]  ? kthread_park+0x80/0x80
[41969.140162]  ret_from_fork+0x1f/0x30
[41969.140163] Modules linked in: usb_storage nfnetlink 8021q garp stp llc fuse ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_filter xt_state xt_conntrack iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter tun ip6_tables binfmt_misc mxm_wmi evdev snd_hda_codec_hdmi coretemp serio_raw snd_hda_intel kvm_intel snd_hda_codec kvm snd_hwdep irqbypass snd_hda_core pcspkr snd_pcm snd_timer snd soundcore sg i7core_edac asus_atk0110 wmi button loop ip_tables x_tables raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 multipath linear md_mod hid_generic usbhid hid crc32c_intel psmouse sr_mod cdrom radeon e1000e xhci_pci ptp ehci_pci uhci_hcd xhci_hcd pps_core ehci_hcd sky2 usbcore ttm usb_common sd_mod
[41969.140187] CR2: 0000000000000000
[41969.140189] ---[ end trace e27ddb472a95c047 ]---

This time, I've got a kernel with debugging information (still
5.2.18).  The crash is at offset 0x39f in the objdump output, which
is __reset_isolation_pfn+0x27f, the RIP reported in the oops:

        if (!mem_section[SECTION_NR_TO_ROOT(nr)])
     384:       48 c1 ea 35             shr    $0x35,%rdx
     388:       48 8b 14 d7             mov    (%rdi,%rdx,8),%rdx
     38c:       48 c1 e8 2d             shr    $0x2d,%rax
     390:       48 85 d2                test   %rdx,%rdx
     393:       74 0a                   je     39f <__reset_isolation_pfn+0x27f>
        return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
     395:       0f b6 c0                movzbl %al,%eax
     398:       48 c1 e0 04             shl    $0x4,%rax
     39c:       48 01 c2                add    %rax,%rdx
        unsigned long map = section->section_mem_map;
     39f:       48 8b 02                mov    (%rdx),%rax
                                clear_pageblock_skip(page);
     3a2:       4c 89 f2                mov    %r14,%rdx
     3a5:       41 b8 01 00 00 00       mov    $0x1,%r8d
     3ab:       31 f6                   xor    %esi,%esi
     3ad:       b9 03 00 00 00          mov    $0x3,%ecx
     3b2:       4c 89 f7                mov    %r14,%rdi

Hmm, the 'objdump -d -l' output is likely more helpful here:

/home/fw/src/linux/linux/mm/compaction.c:293
     37a:       a8 10                   test   $0x10,%al
     37c:       74 bc                   je     33a <__reset_isolation_pfn+0x21a>
page_to_section():
/home/fw/src/linux/linux/./include/linux/mm.h:1265
     37e:       49 8b 16                mov    (%r14),%rdx
     381:       48 89 d0                mov    %rdx,%rax
__nr_to_section():
/home/fw/src/linux/linux/./include/linux/mmzone.h:1218
     384:       48 c1 ea 35             shr    $0x35,%rdx
     388:       48 8b 14 d7             mov    (%rdi,%rdx,8),%rdx
page_to_section():
/home/fw/src/linux/linux/./include/linux/mm.h:1265
     38c:       48 c1 e8 2d             shr    $0x2d,%rax
__nr_to_section():
/home/fw/src/linux/linux/./include/linux/mmzone.h:1218
     390:       48 85 d2                test   %rdx,%rdx
     393:       74 0a                   je     39f <__reset_isolation_pfn+0x27f>
/home/fw/src/linux/linux/./include/linux/mmzone.h:1220
     395:       0f b6 c0                movzbl %al,%eax
     398:       48 c1 e0 04             shl    $0x4,%rax
     39c:       48 01 c2                add    %rax,%rdx
__section_mem_map_addr():
/home/fw/src/linux/linux/./include/linux/mmzone.h:1247
     39f:       48 8b 02                mov    (%rdx),%rax
__reset_isolation_pfn():
/home/fw/src/linux/linux/mm/compaction.c:294
     3a2:       4c 89 f2                mov    %r14,%rdx
     3a5:       41 b8 01 00 00 00       mov    $0x1,%r8d
     3ab:       31 f6                   xor    %esi,%esi

It's this loop in mm/compaction.c:

  286         /*
  287          * Only clear the hint if a sample indicates there is either a
  288          * free page or an LRU page in the block. One or other condition
  289          * is necessary for the block to be a migration source/target.
  290          */
  291         do {
  292                 if (pfn_valid_within(pfn)) {
  293                         if (check_source && PageLRU(page)) {
  294                                 clear_pageblock_skip(page);
  295                                 return true;
  296                         }
  297 
  298                         if (check_target && PageBuddy(page)) {
  299                                 clear_pageblock_skip(page);
  300                                 return true;
  301                         }
  302                 }
  303 
  304                 page += (1 << PAGE_ALLOC_COSTLY_ORDER);
  305                 pfn += (1 << PAGE_ALLOC_COSTLY_ORDER);
  306         } while (page < end_page);
