* [PATCH] block:added printing when bio->bi_status fails
@ 2024-08-07 9:33 824731276
2024-08-07 19:55 ` kernel test robot
2024-08-07 20:05 ` kernel test robot
0 siblings, 2 replies; 6+ messages in thread
From: 824731276 @ 2024-08-07 9:33 UTC
To: axboe; +Cc: linux-kernel, linux-block, baiguo
From: baiguo <baiguo@kylinos.cn>
When ftrace is not enabled and bio is not OK,
the system cannot actively record which disk is abnormal.
Add a message record to bio_endio.
Signed-off-by: baiguo <baiguo@kylinos.cn>
---
block/bio.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/block/bio.c b/block/bio.c
index c4053d496..29ae86c21 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1617,6 +1617,11 @@ void bio_endio(struct bio *bio)
bio_clear_flag(bio, BIO_TRACE_COMPLETION);
}
+ if (bio->bi_status && bio->bi_disk)
+ printk(KERN_ERR "bio: %s status is %d, disk[%d:%d]\n",\
+ __func__, bio->bi_status, bio->bi_disk->major,\
+ bio->bi_disk->first_minor);
+
/*
* Need to have a real endio function for chained bios, otherwise
* various corner cases will break (like stacking block devices that
--
2.33.0
* Re: [PATCH] block:added printing when bio->bi_status fails
2024-08-07 9:33 824731276
@ 2024-08-07 19:55 ` kernel test robot
2024-08-07 20:05 ` kernel test robot
1 sibling, 0 replies; 6+ messages in thread
From: kernel test robot @ 2024-08-07 19:55 UTC
To: 824731276, axboe; +Cc: oe-kbuild-all, linux-kernel, linux-block, baiguo
Hi,
kernel test robot noticed the following build errors:
[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master v6.11-rc2 next-20240807]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/824731276-qq-com/block-added-printing-when-bio-bi_status-fails/20240807-174005
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/tencent_F71A15579D1E52ED0B58EF2F3607AA883308%40qq.com
patch subject: [PATCH] block:added printing when bio->bi_status fails
config: openrisc-allnoconfig (https://download.01.org/0day-ci/archive/20240808/202408080303.bwOWkFK1-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240808/202408080303.bwOWkFK1-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202408080303.bwOWkFK1-lkp@intel.com/
All errors (new ones prefixed by >>):
block/bio.c: In function 'bio_endio':
>> block/bio.c:1620:34: error: 'struct bio' has no member named 'bi_disk'
1620 | if (bio->bi_status && bio->bi_disk)
| ^~
In file included from include/asm-generic/bug.h:22,
from arch/openrisc/include/asm/bug.h:5,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/mm.h:6,
from block/bio.c:5:
block/bio.c:1622:62: error: 'struct bio' has no member named 'bi_disk'
1622 | __func__, bio->bi_status, bio->bi_disk->major,\
| ^~
include/linux/printk.h:437:33: note: in definition of macro 'printk_index_wrap'
437 | _p_func(_fmt, ##__VA_ARGS__); \
| ^~~~~~~~~~~
block/bio.c:1621:17: note: in expansion of macro 'printk'
1621 | printk(KERN_ERR "bio: %s status is %d, disk[%d:%d]\n",\
| ^~~~~~
block/bio.c:1623:36: error: 'struct bio' has no member named 'bi_disk'
1623 | bio->bi_disk->first_minor);
| ^~
include/linux/printk.h:437:33: note: in definition of macro 'printk_index_wrap'
437 | _p_func(_fmt, ##__VA_ARGS__); \
| ^~~~~~~~~~~
block/bio.c:1621:17: note: in expansion of macro 'printk'
1621 | printk(KERN_ERR "bio: %s status is %d, disk[%d:%d]\n",\
| ^~~~~~
vim +1620 block/bio.c
1589
1590 /**
1591 * bio_endio - end I/O on a bio
1592 * @bio: bio
1593 *
1594 * Description:
1595 * bio_endio() will end I/O on the whole bio. bio_endio() is the preferred
1596 * way to end I/O on a bio. No one should call bi_end_io() directly on a
1597 * bio unless they own it and thus know that it has an end_io function.
1598 *
1599 * bio_endio() can be called several times on a bio that has been chained
1600 * using bio_chain(). The ->bi_end_io() function will only be called the
1601 * last time.
1602 **/
1603 void bio_endio(struct bio *bio)
1604 {
1605 again:
1606 if (!bio_remaining_done(bio))
1607 return;
1608 if (!bio_integrity_endio(bio))
1609 return;
1610
1611 blk_zone_bio_endio(bio);
1612
1613 rq_qos_done_bio(bio);
1614
1615 if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) {
1616 trace_block_bio_complete(bdev_get_queue(bio->bi_bdev), bio);
1617 bio_clear_flag(bio, BIO_TRACE_COMPLETION);
1618 }
1619
> 1620 if (bio->bi_status && bio->bi_disk)
1621 printk(KERN_ERR "bio: %s status is %d, disk[%d:%d]\n",\
1622 __func__, bio->bi_status, bio->bi_disk->major,\
1623 bio->bi_disk->first_minor);
1624
1625 /*
1626 * Need to have a real endio function for chained bios, otherwise
1627 * various corner cases will break (like stacking block devices that
1628 * save/restore bi_end_io) - however, we want to avoid unbounded
1629 * recursion and blowing the stack. Tail call optimization would
1630 * handle this, but compiling with frame pointers also disables
1631 * gcc's sibling call optimization.
1632 */
1633 if (bio->bi_end_io == bio_chain_endio) {
1634 bio = __bio_chain_endio(bio);
1635 goto again;
1636 }
1637
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
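
Note on the build error above: the compiler rejects bio->bi_disk because struct bio in this tree has no such member, and the resend later in this thread reaches the disk through bio->bi_bdev->bd_disk instead. The following is a minimal userspace sketch of that access path, not kernel code. The three structs are hypothetical stand-ins that copy only the field names used in the thread, and format_bio_error() is an illustrative helper that does not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; only the field names mirror the thread. */
struct gendisk {
	int major;
	int first_minor;
};

struct block_device {
	struct gendisk *bd_disk;
};

struct bio {
	struct block_device *bi_bdev;	/* note: no bi_disk member */
	unsigned char bi_status;	/* 0 means success */
};

/*
 * Format the message from the patch into buf.
 * Returns the number of characters snprintf() produced, or 0 when the
 * bio succeeded or carries no device to report.
 */
int format_bio_error(const struct bio *bio, char *buf, size_t len)
{
	assert(bio != NULL && buf != NULL);

	if (!bio->bi_status || !bio->bi_bdev || !bio->bi_bdev->bd_disk)
		return 0;

	return snprintf(buf, len, "bio: %s status is %d, disk[%d:%d]",
			"bio_endio", bio->bi_status,
			bio->bi_bdev->bd_disk->major,
			bio->bi_bdev->bd_disk->first_minor);
}
```

The extra bd_disk check exists only to keep the mock safe when a test leaves the pointer unset; whether the kernel needs it is outside the scope of this sketch.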
* Re: [PATCH] block:added printing when bio->bi_status fails
2024-08-07 9:33 824731276
2024-08-07 19:55 ` kernel test robot
@ 2024-08-07 20:05 ` kernel test robot
1 sibling, 0 replies; 6+ messages in thread
From: kernel test robot @ 2024-08-07 20:05 UTC
To: 824731276, axboe; +Cc: llvm, oe-kbuild-all, linux-kernel, linux-block, baiguo
Hi,
kernel test robot noticed the following build errors:
[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master v6.11-rc2 next-20240807]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/824731276-qq-com/block-added-printing-when-bio-bi_status-fails/20240807-174005
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/tencent_F71A15579D1E52ED0B58EF2F3607AA883308%40qq.com
patch subject: [PATCH] block:added printing when bio->bi_status fails
config: x86_64-allnoconfig (https://download.01.org/0day-ci/archive/20240808/202408080348.jL0uiVq7-lkp@intel.com/config)
compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 617a15a9eac96088ae5e9134248d8236e34b91b1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240808/202408080348.jL0uiVq7-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202408080348.jL0uiVq7-lkp@intel.com/
All errors (new ones prefixed by >>):
>> block/bio.c:1620:29: error: no member named 'bi_disk' in 'struct bio'
1620 | if (bio->bi_status && bio->bi_disk)
| ~~~ ^
block/bio.c:1622:36: error: no member named 'bi_disk' in 'struct bio'
1622 | __func__, bio->bi_status, bio->bi_disk->major,\
| ~~~ ^
include/linux/printk.h:465:60: note: expanded from macro 'printk'
465 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
| ^~~~~~~~~~~
include/linux/printk.h:437:19: note: expanded from macro 'printk_index_wrap'
437 | _p_func(_fmt, ##__VA_ARGS__); \
| ^~~~~~~~~~~
block/bio.c:1623:10: error: no member named 'bi_disk' in 'struct bio'
1623 | bio->bi_disk->first_minor);
| ~~~ ^
include/linux/printk.h:465:60: note: expanded from macro 'printk'
465 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
| ^~~~~~~~~~~
include/linux/printk.h:437:19: note: expanded from macro 'printk_index_wrap'
437 | _p_func(_fmt, ##__VA_ARGS__); \
| ^~~~~~~~~~~
3 errors generated.
vim +1620 block/bio.c
1589
1590 /**
1591 * bio_endio - end I/O on a bio
1592 * @bio: bio
1593 *
1594 * Description:
1595 * bio_endio() will end I/O on the whole bio. bio_endio() is the preferred
1596 * way to end I/O on a bio. No one should call bi_end_io() directly on a
1597 * bio unless they own it and thus know that it has an end_io function.
1598 *
1599 * bio_endio() can be called several times on a bio that has been chained
1600 * using bio_chain(). The ->bi_end_io() function will only be called the
1601 * last time.
1602 **/
1603 void bio_endio(struct bio *bio)
1604 {
1605 again:
1606 if (!bio_remaining_done(bio))
1607 return;
1608 if (!bio_integrity_endio(bio))
1609 return;
1610
1611 blk_zone_bio_endio(bio);
1612
1613 rq_qos_done_bio(bio);
1614
1615 if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) {
1616 trace_block_bio_complete(bdev_get_queue(bio->bi_bdev), bio);
1617 bio_clear_flag(bio, BIO_TRACE_COMPLETION);
1618 }
1619
> 1620 if (bio->bi_status && bio->bi_disk)
1621 printk(KERN_ERR "bio: %s status is %d, disk[%d:%d]\n",\
1622 __func__, bio->bi_status, bio->bi_disk->major,\
1623 bio->bi_disk->first_minor);
1624
1625 /*
1626 * Need to have a real endio function for chained bios, otherwise
1627 * various corner cases will break (like stacking block devices that
1628 * save/restore bi_end_io) - however, we want to avoid unbounded
1629 * recursion and blowing the stack. Tail call optimization would
1630 * handle this, but compiling with frame pointers also disables
1631 * gcc's sibling call optimization.
1632 */
1633 if (bio->bi_end_io == bio_chain_endio) {
1634 bio = __bio_chain_endio(bio);
1635 goto again;
1636 }
1637
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* [PATCH] block:added printing when bio->bi_status fails
@ 2024-08-08 9:54 824731276
2024-08-16 4:04 ` kernel test robot
2024-08-16 7:45 ` Yu Kuai
0 siblings, 2 replies; 6+ messages in thread
From: 824731276 @ 2024-08-08 9:54 UTC
To: axboe; +Cc: linux-kernel, linux-block, baiguo
From: baiguo <baiguo@kylinos.cn>
When ftrace is not enabled and bio is not OK,
the system cannot actively record which disk is abnormal.
Add a message record to bio_endio.
Signed-off-by: baiguo <baiguo@kylinos.cn>
---
block/bio.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/block/bio.c b/block/bio.c
index c4053d496..fb07589c8 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1617,6 +1617,11 @@ void bio_endio(struct bio *bio)
bio_clear_flag(bio, BIO_TRACE_COMPLETION);
}
+ if (bio->bi_status && bio->bi_bdev)
+ printk(KERN_ERR "bio: %s status is %d, disk[%d:%d]\n",\
+ __func__, bio->bi_status, bio->bi_bdev->bd_disk->major,\
+ bio->bi_bdev->bd_disk->first_minor);
+
/*
* Need to have a real endio function for chained bios, otherwise
* various corner cases will break (like stacking block devices that
--
2.33.0
* Re: [PATCH] block:added printing when bio->bi_status fails
2024-08-08 9:54 [PATCH] block:added printing when bio->bi_status fails 824731276
@ 2024-08-16 4:04 ` kernel test robot
2024-08-16 7:45 ` Yu Kuai
1 sibling, 0 replies; 6+ messages in thread
From: kernel test robot @ 2024-08-16 4:04 UTC
To: 824731276
Cc: oe-lkp, lkp, linux-block, axboe, linux-kernel, baiguo,
oliver.sang
Hello,
kernel test robot noticed "WARNING:at_fs/buffer.c:#mark_buffer_dirty" on:
commit: 0824beb1d430c30731166484b8c26e37147d4dbb ("[PATCH] block:added printing when bio->bi_status fails")
url: https://github.com/intel-lab-lkp/linux/commits/824731276-qq-com/block-added-printing-when-bio-bi_status-fails/20240808-181758
base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/all/tencent_9A3345EA79C1EE9DC4464BB576C6A602A105@qq.com/
patch subject: [PATCH] block:added printing when bio->bi_status fails
in testcase: xfstests
version: xfstests-x86_64-f5ada754-1_20240812
with following parameters:
disk: 4HDD
fs: udf
test: generic-081
compiler: gcc-12
test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 28G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202408161114.dfe9cc49-oliver.sang@intel.com
We see a large number of messages like the following:
[ 68.449409][ T89] bio: bio_endio status is 10, disk[253:2]
[ 68.454169][ T2095] bio: bio_endio status is 10, disk[253:3]
[ 68.455863][ T89] bio: bio_endio status is 10, disk[253:2]
[ 68.466598][ T89] bio: bio_endio status is 10, disk[253:2]
[ 68.472314][ T89] bio: bio_endio status is 10, disk[253:2]
...
[ 74.216172][ T89] bio: bio_endio status is 10, disk[253:2]
[ 74.221903][ T89] bio: bio_endio status is 10, disk[253:2]
[ 74.2:2]
[ 74.348274][ T89] bio: bio_endio status is 10, disk[253:3]
[ 74.356178][ T2096] ------------[ cut here ]------------
[ 74.361531][ T2096] WARNING: CPU: 0 PID: 2096 at fs/buffer.c:1181 mark_buffer_dirty+0x1e6/0x240
followed by the WARNING below:
[ 72.605562][ T2097] ------------[ cut here ]------------
[ 72.610907][ T2097] WARNING: CPU: 7 PID: 2097 at fs/buffer.c:1181 mark_buffer_dirty (fs/buffer.c:1181 (discriminator 1))
[ 72.619661][ T2097] Modules linked in: dm_snapshot dm_bufio udf crc_itu_t cdrom btrfs blake2b_generic xor zstd_compress raid6_pq libcrc32c intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp sd_mod ipmi_devintf sg ipmi_msghandler kvm_intel i915 kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 drm_buddy ahci intel_gtt rapl mei_wdt drm_display_helper libahci intel_cstate wmi_bmof ttm mei_me i2c_i801 intel_uncore libata drm_kms_helper i2c_smbus mei intel_pch_thermal video wmi acpi_pad binfmt_misc loop fuse drm dm_mod ip_tables
[ 72.671579][ T2097] CPU: 7 UID: 0 PID: 2097 Comm: xfs_io Not tainted 6.11.0-rc1-00021-g0824beb1d430 #1
[ 72.680936][ T2097] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[ 72.689071][ T2097] RIP: 0010:mark_buffer_dirty (fs/buffer.c:1181 (discriminator 1))
[ 72.694676][ T2097] Code: 58 c6 ff 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 75 60 48 8b 7d 00 5b be 04 00 00 00 5d e9 3a fc fc ff <0f> 0b e9 34 fe ff ff 48 89 df e8 5b 23 e5 ff e9 54 fe ff ff 48 89
All code
========
0: 58 pop %rax
1: c6 (bad)
2: ff 48 89 decl -0x77(%rax)
5: ea (bad)
6: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
d: fc ff df
10: 48 c1 ea 03 shr $0x3,%rdx
14: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
18: 75 60 jne 0x7a
1a: 48 8b 7d 00 mov 0x0(%rbp),%rdi
1e: 5b pop %rbx
1f: be 04 00 00 00 mov $0x4,%esi
24: 5d pop %rbp
25: e9 3a fc fc ff jmpq 0xfffffffffffcfc64
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 34 fe ff ff jmpq 0xfffffffffffffe65
31: 48 89 df mov %rbx,%rdi
34: e8 5b 23 e5 ff callq 0xffffffffffe52394
39: e9 54 fe ff ff jmpq 0xfffffffffffffe92
3e: 48 rex.W
3f: 89 .byte 0x89
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 34 fe ff ff jmpq 0xfffffffffffffe3b
7: 48 89 df mov %rbx,%rdi
a: e8 5b 23 e5 ff callq 0xffffffffffe5236a
f: e9 54 fe ff ff jmpq 0xfffffffffffffe68
14: 48 rex.W
15: 89 .byte 0x89
[ 72.714238][ T2097] RSP: 0018:ffffc900033df8a0 EFLAGS: 00010246
[ 72.720206][ T2097] RAX: 0000000000000001 RBX: ffff888120d219d8 RCX: ffffffff81c2b878
[ 72.728093][ T2097] RDX: ffffed10241a433c RSI: 0000000000000008 RDI: ffff888120d219d8
[ 72.735965][ T2097] RBP: ffff888120d219d8 R08: 0000000000000000 R09: ffffed10241a433b
[ 72.743867][ T2097] R10: ffff888120d219df R11: 0000000000000008 R12: ffffed10241a4340
[ 72.751752][ T2097] R13: 0000000000000004 R14: ffff8887472e8000 R15: 0000000000000948
[ 72.759638][ T2097] FS: 0000000000000000(0000) GS:ffff888634180000(0000) knlGS:0000000000000000
[ 72.768470][ T2097] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 72.774945][ T2097] CR2: 000055d09397fc48 CR3: 000000075685c004 CR4: 00000000003706f0
[ 72.782843][ T2097] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 72.790713][ T2097] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 72.798583][ T2097] Call Trace:
[ 72.801752][ T2097] <TASK>
[ 72.804567][ T2097] ? __warn (kernel/panic.c:735)
[ 72.808515][ T2097] ? mark_buffer_dirty (fs/buffer.c:1181 (discriminator 1))
[ 72.813514][ T2097] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 72.817908][ T2097] ? handle_bug (arch/x86/kernel/traps.c:239)
[ 72.822125][ T2097] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
[ 72.826692][ T2097] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[ 72.831612][ T2097] ? mark_buffer_dirty (arch/x86/include/asm/bitops.h:213 arch/x86/include/asm/bitops.h:245 include/asm-generic/bitops/instrumented-non-atomic.h:154 include/linux/buffer_head.h:171 fs/buffer.c:1181)
[ 72.836529][ T2097] ? mark_buffer_dirty (fs/buffer.c:1181 (discriminator 1))
[ 72.841536][ T2097] udf_bitmap_free_blocks (fs/udf/balloc.c:164) udf
[ 72.847326][ T2097] udf_free_blocks (fs/udf/balloc.c:662) udf
[ 72.852530][ T2097] udf_discard_prealloc (fs/udf/truncate.c:147) udf
[ 72.858161][ T2097] ? __pfx_udf_discard_prealloc (fs/udf/truncate.c:118) udf
[ 72.864303][ T2097] ? __pfx_down_write (kernel/locking/rwsem.c:1577)
[ 72.869051][ T2097] ? __pfx_locks_remove_file (fs/locks.c:2687)
[ 72.874412][ T2097] udf_release_file (fs/udf/file.c:185 fs/udf/file.c:174) udf
[ 72.879584][ T2097] ? security_file_release (security/security.c:2754 (discriminator 11))
[ 72.884757][ T2097] __fput (fs/file_table.c:422)
[ 72.888638][ T2097] task_work_run (kernel/task_work.c:222 (discriminator 1))
[ 72.893108][ T2097] ? __pfx_task_work_run (kernel/task_work.c:190)
[ 72.898101][ T2097] do_exit (kernel/exit.c:883)
[ 72.902048][ T2097] ? __pfx_do_exit (kernel/exit.c:821)
[ 72.906518][ T2097] ? _raw_spin_lock_irq (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
[ 72.911423][ T2097] do_group_exit (kernel/exit.c:1012)
[ 72.915818][ T2097] get_signal (include/linux/signal.h:78 kernel/signal.c:2751)
[ 72.920215][ T2097] ? finish_task_switch+0x495/0x750
[ 72.925907][ T2097] ? __switch_to (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 include/linux/thread_info.h:89 include/linux/sched.h:1945 arch/x86/include/asm/fpu/sched.h:68 arch/x86/kernel/process_64.c:674)
[ 72.930377][ T2097] ? __pfx_get_signal (kernel/signal.c:2682)
[ 72.935108][ T2097] ? __schedule (kernel/sched/core.c:6399)
[ 72.939576][ T2097] arch_do_signal_or_restart (arch/x86/kernel/signal.c:310)
[ 72.945005][ T2097] ? __pfx_arch_do_signal_or_restart (arch/x86/kernel/signal.c:307)
[ 72.951046][ T2097] syscall_exit_to_user_mode (kernel/entry/common.c:111 include/linux/entry-common.h:328 kernel/entry/common.c:207 kernel/entry/common.c:218)
[ 72.956561][ T2097] do_syscall_64 (arch/x86/entry/common.c:102)
[ 72.960942][ T2097] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 72.966720][ T2097] RIP: 0033:0x7f896a5efd32
[ 72.971012][ T2097] Code: Unable to access opcode bytes at 0x7f896a5efd08.
Code starting with the faulting instruction
===========================================
[ 72.977922][ T2097] RSP: 002b:00007f896a1ffdb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000022
[ 72.986232][ T2097] RAX: fffffffffffffdfe RBX: 00007f896a2006c0 RCX: 00007f896a5efd32
[ 72.994101][ T2097] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000
[ 73.001969][ T2097] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffdb5bc88c7
[ 73.009844][ T2097] R10: 00007f896a535f80 R11: 0000000000000293 R12: ffffffffffffff80
[ 73.017712][ T2097] R13: 0000000000000002 R14: 00007ffdb5bc87d0 R15: 00007f8969a00000
[ 73.025582][ T2097] </TASK>
[ 73.028478][ T2097] ---[ end trace 0000000000000000 ]---
[ 73.034147][ T66] bio: bio_endio status is 10, disk[253:3]
[ 73.034167][ T2097] bio: bio_endio status is 10, disk[253:3]
[ 73.039846][ T66] Buffer I/O error on dev dm-3, logical block 259, lost async page write
[ 73.045565][ T2097] Buffer I/O error on dev dm-3, logical block 128, lost async page write
[ 73.053894][ T66] bio: bio_endio status is 10, disk[253:3]
[ 73.067887][ T66] Buffer I/O error on dev dm-3, logical block 387, lost async page write
[ 73.076268][ T66] bio: bio_endio status is 10, disk[253:3]
[ 73.081976][ T66] Buffer I/O error on dev dm-3, logical block 388, lost async page write
[ 73.180045][ T2097] bio: bio_endio status is 10, disk[253:3]
[ 73.185743][ T2097] Buffer I/O error on dev dm-3, logical block 128, lost sync page write
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240816/202408161114.dfe9cc49-oliver.sang@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] block:added printing when bio->bi_status fails
2024-08-08 9:54 [PATCH] block:added printing when bio->bi_status fails 824731276
2024-08-16 4:04 ` kernel test robot
@ 2024-08-16 7:45 ` Yu Kuai
1 sibling, 0 replies; 6+ messages in thread
From: Yu Kuai @ 2024-08-16 7:45 UTC
To: 824731276, axboe; +Cc: linux-kernel, linux-block, baiguo, yukuai (C)
Hi,
On 2024/08/08 17:54, 824731276@qq.com wrote:
> From: baiguo <baiguo@kylinos.cn>
>
> When ftrace is not enabled and bio is not OK,
> the system cannot actively record which disk is abnormal.
> Add a message record to bio_endio.
>
> Signed-off-by: baiguo <baiguo@kylinos.cn>
> ---
> block/bio.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/block/bio.c b/block/bio.c
> index c4053d496..fb07589c8 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1617,6 +1617,11 @@ void bio_endio(struct bio *bio)
> bio_clear_flag(bio, BIO_TRACE_COMPLETION);
> }
>
> + if (bio->bi_status && bio->bi_bdev)
> + printk(KERN_ERR "bio: %s status is %d, disk[%d:%d]\n",\
> + __func__, bio->bi_status, bio->bi_bdev->bd_disk->major,\
> + bio->bi_bdev->bd_disk->first_minor);
I don't understand why you need this. bio_endio() will still be
called for an unsupported bio from submit_bio_noacct() even when the
disk is fine.
For real disks, blk_print_req_error() already prints a message for
failed I/O that was submitted to the disk.
Thanks,
Kuai
> +
> /*
> * Need to have a real endio function for chained bios, otherwise
> * various corner cases will break (like stacking block devices that
>