From: <pjy@amazon.com>
To: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
"Christoph Hellwig" <hch@lst.de>,
Sagi Grimberg <sagi@grimberg.me>
Cc: <linux-nvme@lists.infradead.org>
Subject: BUG Report: kernel NULL pointer dereference in bio_integrity_advance()
Date: Mon, 26 Aug 2024 14:32:31 +0000
Message-ID: <mb61pcylvnym8.fsf@amazon.com>
Hi,
I saw that running the following command on the 5.4, 5.10, and 5.15
stable kernels crashes the system with a NULL pointer dereference:
root@pjy:~# touch test.txt
root@pjy:~# nvme io-passthru /dev/nvme0 --opcode=0x1 --input-file=test.txt --data-len=1 --write --namespace=1 --metadata-len=1
nvme nvme0: using deprecated NVME_IOCTL_IO_CMD ioctl on the char device!
Unable to handle kernel NULL pointer dereference at virtual address 000000000000000a
Mem abort info:
ESR = 0x96000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000106500000
[000000000000000a] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in: crct10dif_ce nvme nvme_core fuse drm dm_mod ip_tables x_tables ipv6
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.1-gb6abb62daa55 #1
Hardware name: linux,dummy-virt (DT)
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : bio_integrity_advance+0x4c/0x100
lr : bio_advance+0x34/0x120
sp : ffff800010003d90
x29: ffff800010003d90 x28: ffff800011d234c0 x27: ffff80001151ff20
x26: ffff0000be180310 x25: 00000000000000b1 x24: 0000000000000001
x23: 0000000000000000 x22: 0000000000000000 x21: ffff0000c6733640
x20: ffff0000c47afa00 x19: ffff0000c5f81108 x18: 0000000000000001
x17: ffff8000edf70000 x16: ffff800010004000 x15: 0000000000004000
x14: 0000000000000001 x13: 0000000000000002 x12: 0000000000000400
x11: 0000000000000040 x10: ffff0000c0034168 x9 : ffff0000c0034160
x8 : ffff0000c0424550 x7 : 0000000000000000 x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
x2 : ffff0000c5f81100 x1 : 0000000000000001 x0 : 0000000000000000
Call trace:
bio_integrity_advance+0x4c/0x100
bio_advance+0x34/0x120
blk_update_request+0x174/0x400
blk_mq_end_request+0x2c/0x150
nvme_complete_rq+0x4c/0x10c [nvme_core]
nvme_pci_complete_rq+0x4c/0xa4 [nvme]
nvme_process_cq+0x144/0x250 [nvme]
nvme_irq+0x18/0x30 [nvme]
__handle_irq_event_percpu+0x40/0x15c
handle_irq_event+0x64/0x140
handle_fasteoi_irq+0xa8/0x1a0
handle_domain_irq+0x64/0x94
gic_handle_irq+0xbc/0x140
call_on_irq_stack+0x2c/0x60
do_interrupt_handler+0x54/0x60
el1_interrupt+0x30/0x80
el1h_64_irq_handler+0x1c/0x2c
el1h_64_irq+0x78/0x7c
finish_task_switch.isra.0+0x98/0x260
__schedule+0x2a4/0x714
schedule_idle+0x2c/0x50
do_idle+0x190/0x2cc
cpu_startup_entry+0x28/0x80
rest_init+0xe8/0x100
arch_call_rest_init+0x14/0x20
start_kernel+0x634/0x674
__primary_switched+0xc0/0xc8
Code: f9402800 f84c8c04 f100009f 9a9f1000 (39402804)
---[ end trace 515229a85ac6ccf1 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x11000471,20000846
Memory Limit: none
---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
This is because in the function:
void bio_integrity_advance(struct bio *bio, unsigned int bytes_done)
{
	struct bio_integrity_payload *bip = bio_integrity(bio);
	struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk);
	unsigned bytes = bio_integrity_bytes(bi, bytes_done >> 9);

	bip->bip_iter.bi_sector += bio_integrity_intervals(bi, bytes_done >> 9);
	bvec_iter_advance(bip->bip_vec, &bip->bip_iter, bytes);
}
Here blk_get_integrity() returns NULL, and bio_integrity_bytes()
dereferences it without checking for NULL.
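As a rough cross-check of the fault address in the oops (virtual address
000000000000000a, with x0 = 0), the userspace sketch below mirrors the
layout of struct blk_integrity as I understand it for v5.15. The field
order and the claim that bio_integrity_intervals() reads interval_exp
first are assumptions taken from my reading of include/linux/blkdev.h,
and blk_integrity_model and null_deref_address() are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace mirror of struct blk_integrity (assumed v5.15 layout).
 * This is NOT the kernel definition, only a model of its field order.
 */
struct blk_integrity_model {
	uint64_t profile;            /* stands in for the 8-byte arm64 pointer */
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;  /* assumed to be read by bio_integrity_intervals() */
	unsigned char tag_size;
};

/*
 * Address that would be touched when interval_exp is loaded through a
 * NULL struct pointer: simply the field's offset within the struct.
 */
static size_t null_deref_address(void)
{
	return offsetof(struct blk_integrity_model, interval_exp);
}
```

Under these assumptions null_deref_address() evaluates to 0xa, which
matches the faulting address reported above.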
This issue is also present in mainline but doesn't trigger because after
d4aa57a1cac3 ("block: don't bother iter advancing a fully done bio")
bio_advance() is not called for this reproducer, but this bug might be
triggerable through another path.
I want to send a patch to fix this but need some help to understand
where the change has to be made.
In 5.15, for example:
void bio_advance(struct bio *bio, unsigned bytes)
{
	if (bio_integrity(bio))
		bio_integrity_advance(bio, bytes);

	bio_crypt_advance(bio, bytes);
	bio_advance_iter(bio, &bio->bi_iter, bytes);
}
Here bio_integrity(bio) returns non-NULL and therefore
bio_integrity_advance() is called, but in that function
blk_get_integrity(bio->bi_bdev->bd_disk) returns NULL because for this
disk bi->profile is NULL.
So, the problem is that bi->profile is NULL for this disk but
bio->bi_integrity is non-NULL for this bio.
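To make the mismatch concrete, here is a minimal userspace sketch of the
byte computation with a NULL check added. It is purely illustrative and
is not a proposed fix; the right place for a real change may well be
earlier in the passthrough path. integrity_model and
integrity_bytes_checked() are hypothetical names, and the arithmetic is
my assumption of what bio_integrity_bytes() does in v5.15.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced model of the fields used by the computation. */
struct integrity_model {
	unsigned char tuple_size;    /* metadata bytes per protection interval */
	unsigned char interval_exp;  /* log2 of the interval size, assumed >= 9 */
};

/*
 * Model of bio_integrity_bytes() with a guard: returns -1 when no
 * integrity profile is available instead of dereferencing NULL,
 * otherwise stores the metadata byte count in *bytes and returns 0.
 */
static int integrity_bytes_checked(const struct integrity_model *bi,
				   unsigned int sectors, unsigned int *bytes)
{
	if (!bi)
		return -1;

	/* intervals = sectors >> (interval_exp - 9); bytes = intervals * tuple_size */
	*bytes = (sectors >> (bi->interval_exp - 9)) * bi->tuple_size;
	return 0;
}
```

With bi == NULL, as in the reproducer, the sketch returns -1 rather than
faulting; with a 512-byte interval (interval_exp = 9) and an 8-byte
tuple, 4 sectors yield 32 bytes of metadata.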
Please help me debug this further.
P.S. - Reproducing using qemu.
Here are the commands I used:
qemu-system-aarch64 -machine 'virt,gic-version=3' -cpu 'cortex-a57' -smp \
2 -m 4G -drive format=raw,file=rootfs -device \
virtio-net-device,netdev=net -netdev user,id=net,hostfwd=tcp::2222-:22 \
-kernel linux/arch/arm64/boot/Image -nographic -append "root=/dev/vda rw \
console=ttyAMA0 debug earlyprintk=serial slub_debug=UZ nokaslr" -gdb \
tcp::1234 -d guest_errors,unimp -D log.txt -drive \
file=nvm.img,if=none,id=nvm -device nvme,serial=deadbeef,drive=nvm
Kernel is v5.15.1 compiled with arm64 defconfig
The following commands will then crash the kernel:
# touch test.txt
# nvme io-passthru /dev/nvme0 --opcode=0x1 --input-file=test.txt --data-len=1 --write --namespace=1 --metadata-len=1
Thanks,
Puranjay