From: Matthew Wilcox <willy@infradead.org>
To: Daniel Dao <dqminh@cloudflare.com>
Cc: linux-fsdevel@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
kernel-team <kernel-team@cloudflare.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
djwong@kernel.org
Subject: Re: Kernel NULL pointer deref and data corruptions with xfs on 6.1
Date: Thu, 27 Jul 2023 04:27:40 +0100
Message-ID: <ZMHkLA+r2K6hKsr5@casper.infradead.org>
In-Reply-To: <CA+wXwBRGab3UqbLqsr8xG=ZL2u9bgyDNNea4RGfTDjqB=J3geQ@mail.gmail.com>

On Fri, Jul 21, 2023 at 11:49:04AM +0100, Daniel Dao wrote:
> We do not have a reproducer yet, but we now have more debugging data
> which hopefully should help narrow this down. Details as follows:
>
> 1. Kernel NULL pointer dereferences in __filemap_get_folio
>
> This happened on a few different hosts, with a few different repeated
> addresses. The addresses are 0000000000000036, 0000000000000076,
> 00000000000000f6. This looks like the xarray is corrupted and we were
> trying to do some work on a sibling entry.

I think I have a fix for this one. Please try the attached.

> 2. Kernel NULL pointer dereferences in xfs_read_iomap_begin
>
> BUG: unable to handle page fault for address: 0000000000034668
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 11cfd37067 P4D 11cfd37067 PUD b88086067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 124 PID: 3831226 Comm: rocksdb:low Kdump: loaded Tainted: G W O L 6.1.27-cloudflare-2023.5.0 #1
> Hardware name: HYVE EDGE-METAL-GEN11/HS1811D_Lite, BIOS V0.11-sig 12/23/2022
> RIP: 0010:xfs_read_iomap_begin (fs/xfs/xfs_iomap.c:1200)
> Code: 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 83 ec 50 48
> 89 14 24 4c 89 44 24 08 65 48 8b 04 25 28 00 00 00 48 89 44 24 48 <48>
> 8b 87 >
> All code
> ========
> 0: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 5: 41 57 push %r15
> 7: 41 56 push %r14
> 9: 41 55 push %r13
> b: 41 54 push %r12
> d: 55 push %rbp
> e: 53 push %rbx
> f: 48 83 ec 50 sub $0x50,%rsp
> 13: 48 89 14 24 mov %rdx,(%rsp)
> 17: 4c 89 44 24 08 mov %r8,0x8(%rsp)
> 1c: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
> 23: 00 00
> 25: 48 89 44 24 48 mov %rax,0x48(%rsp)
> 2a:* 48 rex.W <-- trapping instruction
> 2b: 8b .byte 0x8b
> 2c: 87 00 xchg %eax,(%rax)
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 rex.W
> 1: 8b .byte 0x8b
> 2: 87 00 xchg %eax,(%rax)

This one is hard to understand because the decoding of the instruction
got cut off. But ...

> RSP: 0018:ffffa63810733a70 EFLAGS: 00010282
> RAX: 78ac714f0997e100 RBX: ffffa63810733b40 RCX: 0000000000000000
> RDX: 0000000000004000 RSI: 0000000000000000 RDI: 00000000000347a0

RDI (0x347a0) is kind of close to the fault address (0x34668) ... RDI
holds the first argument in the x86-64 System V ABI, and the first
parameter to xfs_read_iomap_begin() is supposed to be a struct inode
pointer. I don't think this is related.

> We also have a deadlock reading a very specific file on this host. We managed to
> do a kdump on this host and extracted out the state of the mapping.

This is almost certainly a different bug, but also XArray related, so
I'll keep looking at this one.