* linux 4.19.19: md0_raid:1317 blocked for more than 120 seconds.
@ 2019-02-11 15:12 Wolfgang Walter
[not found] ` <0ee180ac-bb43-6c2f-4084-5cc452a18c9d@suse.com>
0 siblings, 1 reply; 5+ messages in thread
From: Wolfgang Walter @ 2019-02-11 15:12 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-raid, linux-kernel, Guoqing Jiang
With 4.19.19 we sometimes see the following issue (practically only with blk_mq, though):
Feb 4 20:04:46 tettnang kernel: [252300.060165] INFO: task md0_raid1:317 blocked for more than 120 seconds.
Feb 4 20:04:46 tettnang kernel: [252300.060188] Not tainted 4.19.19-debian64.all+1.1 #1
Feb 4 20:04:46 tettnang kernel: [252300.060197] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 4 20:04:46 tettnang kernel: [252300.060207] md0_raid1 D 0 317 2 0x80000000
Feb 4 20:04:46 tettnang kernel: [252300.060211] Call Trace:
Feb 4 20:04:46 tettnang kernel: [252300.060222] ? __schedule+0x2a2/0x8c0
Feb 4 20:04:46 tettnang kernel: [252300.060226] ? _raw_spin_unlock_irqrestore+0x20/0x40
Feb 4 20:04:46 tettnang kernel: [252300.060229] schedule+0x32/0x90
Feb 4 20:04:46 tettnang kernel: [252300.060241] md_super_wait+0x69/0xa0 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060247] ? finish_wait+0x80/0x80
Feb 4 20:04:46 tettnang kernel: [252300.060255] md_bitmap_wait_writes+0x8e/0xa0 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060263] ? md_bitmap_get_counter+0x42/0xd0 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060271] md_bitmap_daemon_work+0x1e8/0x380 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060278] ? md_rdev_init+0xb0/0xb0 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060285] md_check_recovery+0x26/0x540 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060290] raid1d+0x5c/0xf00 [raid1]
Feb 4 20:04:46 tettnang kernel: [252300.060294] ? preempt_count_add+0x79/0xb0
Feb 4 20:04:46 tettnang kernel: [252300.060298] ? lock_timer_base+0x67/0x80
Feb 4 20:04:46 tettnang kernel: [252300.060302] ? _raw_spin_unlock_irqrestore+0x20/0x40
Feb 4 20:04:46 tettnang kernel: [252300.060304] ? try_to_del_timer_sync+0x4d/0x80
Feb 4 20:04:46 tettnang kernel: [252300.060306] ? del_timer_sync+0x35/0x40
Feb 4 20:04:46 tettnang kernel: [252300.060309] ? schedule_timeout+0x17a/0x3b0
Feb 4 20:04:46 tettnang kernel: [252300.060312] ? preempt_count_add+0x79/0xb0
Feb 4 20:04:46 tettnang kernel: [252300.060315] ? _raw_spin_lock_irqsave+0x25/0x50
Feb 4 20:04:46 tettnang kernel: [252300.060321] ? md_rdev_init+0xb0/0xb0 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060327] ? md_thread+0xf9/0x160 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060330] ? r1bio_pool_alloc+0x20/0x20 [raid1]
Feb 4 20:04:46 tettnang kernel: [252300.060336] md_thread+0xf9/0x160 [md_mod]
Feb 4 20:04:46 tettnang kernel: [252300.060340] ? finish_wait+0x80/0x80
Feb 4 20:04:46 tettnang kernel: [252300.060344] kthread+0x112/0x130
Feb 4 20:04:46 tettnang kernel: [252300.060346] ? kthread_create_worker_on_cpu+0x70/0x70
Feb 4 20:04:46 tettnang kernel: [252300.060350] ret_from_fork+0x35/0x40
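
For reference, the hung-task detector that produced the message above can
be tuned at runtime; a minimal sketch of the usual knobs (the values are
illustrative, not a recommendation):

    # current threshold (120 s in the log above)
    sysctl kernel.hung_task_timeout_secs
    # raise it, or set it to 0 to silence the warning entirely
    echo 300 > /proc/sys/kernel/hung_task_timeout_secs
    # dump the state of all tasks to dmesg to capture more context
    echo t > /proc/sysrq-trigger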
I saw that there was a similar problem with raid10 and an upstream patch
e820d55cb99dd93ac2dc949cf486bb187e5cd70d
md: fix raid10 hang issue caused by barrier
by Guoqing Jiang
I wonder if there is a similar fix needed for raid1?
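
A quick way to check whether that raid10 fix is already part of a given
stable tree, assuming a local clone of linux-stable (the path and tag are
illustrative):

    cd ~/src/linux-stable
    git merge-base --is-ancestor \
        e820d55cb99dd93ac2dc949cf486bb187e5cd70d v4.19.19 \
        && echo "raid10 barrier fix present" \
        || echo "raid10 barrier fix missing"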
Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts

* Re: linux 4.19.19: md0_raid:1317 blocked for more than 120 seconds.
       [not found] ` <0ee180ac-bb43-6c2f-4084-5cc452a18c9d@suse.com>
@ 2019-02-12 11:20   ` Wolfgang Walter
  2019-02-14  2:09     ` Guoqing Jiang
  0 siblings, 1 reply; 5+ messages in thread
From: Wolfgang Walter @ 2019-02-12 11:20 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Jens Axboe, NeilBrown, linux-raid, linux-kernel

On Tuesday, 12 February 2019, 16:20:11, Guoqing Jiang wrote:
> On 2/11/19 11:12 PM, Wolfgang Walter wrote:
> > With 4.19.19 we sometimes see the following issue (practically only
> > with blk_mq, though):
> >
> > [hung-task report and call trace for md0_raid1 snipped; see the first
> > message in this thread]
> >
> > I saw that there was a similar problem with raid10 and an upstream patch
> >
> > e820d55cb99dd93ac2dc949cf486bb187e5cd70d
> > md: fix raid10 hang issue caused by barrier
> > by Guoqing Jiang
> >
> > I wonder if there is a similar fix needed for raid1?
>
> Seems not; the call trace tells that the previous write-superblock IO did
> not finish as expected. There is a report for raid5 which has a similar
> problem with md_super_wait in the link [1]. Maybe you can disable blk-mq
> to narrow down the issue as well.

I already did for 4 weeks. I didn't see this with blk-mq disabled (for scsi
and md), though this may be by luck.

> And I don't know why md_bitmap_wait_writes is needed in
> bitmap_daemon_work, because it doesn't call write_page before
> md_bitmap_wait_writes, and the change was introduced in commit
> 85c9ccd4f026a ("md/bitmap: Don't write bitmap while earlier writes
> might be in-flight").
> Neil, is it safe to remove it? Thanks.
>
> [1] https://bbs.archlinux.org/viewtopic.php?id=243520

I found this bug report in debian:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904822

Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
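
For context, "blk-mq disabled" on a 4.19 kernel usually means booting with
the multiqueue paths for SCSI and device-mapper turned off; a hedged sketch
of the common knobs (assuming these module parameters exist in the kernel
build in question):

    # kernel command line, e.g. appended in the bootloader configuration:
    #   scsi_mod.use_blk_mq=0 dm_mod.use_blk_mq=0
    # verify the current settings at runtime:
    cat /sys/module/scsi_mod/parameters/use_blk_mq
    cat /sys/module/dm_mod/parameters/use_blk_mq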

* Re: linux 4.19.19: md0_raid:1317 blocked for more than 120 seconds.
  2019-02-12 11:20   ` Wolfgang Walter
@ 2019-02-14  2:09     ` Guoqing Jiang
  2019-02-14 15:27       ` Wolfgang Walter
  0 siblings, 1 reply; 5+ messages in thread
From: Guoqing Jiang @ 2019-02-14 2:09 UTC (permalink / raw)
  To: Wolfgang Walter; +Cc: Jens Axboe, NeilBrown, linux-raid, linux-kernel

On 2/12/19 7:20 PM, Wolfgang Walter wrote:
> On Tuesday, 12 February 2019, 16:20:11, Guoqing Jiang wrote:
>> On 2/11/19 11:12 PM, Wolfgang Walter wrote:
>>> With 4.19.19 we sometimes see the following issue (practically only
>>> with blk_mq, though):
>>>
>>> [hung-task report and raid10 question snipped; see the first message]
>> Seems not; the call trace tells that the previous write-superblock IO
>> did not finish as expected. There is a report for raid5 which has a
>> similar problem with md_super_wait in the link [1]. Maybe you can
>> disable blk-mq to narrow down the issue as well.
> I already did for 4 weeks. I didn't see this with blk-mq disabled (for
> scsi and md), though this may be by luck.

Then I guess it may be related to blk-mq. Which scheduler are you using
with blk-mq? And maybe you can switch it to see whether the issue is
caused by a specific scheduler or not.

>> [1] https://bbs.archlinux.org/viewtopic.php?id=243520
> I found this bug report in debian:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904822

Thanks, the bug report also says it didn't happen after disabling blk-mq.

Regards,
Guoqing

* Re: linux 4.19.19: md0_raid:1317 blocked for more than 120 seconds.
  2019-02-14  2:09     ` Guoqing Jiang
@ 2019-02-14 15:27       ` Wolfgang Walter
  2019-02-15  7:57         ` Guoqing Jiang
  0 siblings, 1 reply; 5+ messages in thread
From: Wolfgang Walter @ 2019-02-14 15:27 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Jens Axboe, NeilBrown, linux-raid, linux-kernel

On Thursday, 14 February 2019, 10:09:56, Guoqing Jiang wrote:
> On 2/12/19 7:20 PM, Wolfgang Walter wrote:
>> On Tuesday, 12 February 2019, 16:20:11, Guoqing Jiang wrote:
>>> On 2/11/19 11:12 PM, Wolfgang Walter wrote:
>>>> With 4.19.19 we sometimes see the following issue (practically only
>>>> with blk_mq, though):
>>>>
>>>> [hung-task report and raid10 question snipped; see the first message]
>>> Seems not; the call trace tells that the previous write-superblock IO
>>> did not finish as expected. [...] Maybe you can disable blk-mq to
>>> narrow down the issue as well.
>> I already did for 4 weeks. I didn't see this with blk-mq disabled (for
>> scsi and md), though this may be by luck.
> Then I guess it may be related to blk-mq. Which scheduler are you using
> with blk-mq? And maybe you can switch it to see whether the issue is
> caused by a specific scheduler or not.

mq-deadline for SCSI and none for md and dm.

>>> [1] https://bbs.archlinux.org/viewtopic.php?id=243520
>> I found this bug report in debian:
>>
>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904822
> Thanks, the bug report also says it didn't happen after disabling blk-mq.

Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
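
The per-device scheduler mentioned above can be inspected and switched at
runtime through sysfs; a small sketch (device names are illustrative):

    # show the active scheduler, e.g. "[mq-deadline] kyber none"
    cat /sys/block/sda/queue/scheduler
    # switch the SCSI disk to "none" to rule the scheduler out
    echo none > /sys/block/sda/queue/scheduler
    # bio-based md/dm devices typically report "none" here
    cat /sys/block/md0/queue/scheduler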

* Re: linux 4.19.19: md0_raid:1317 blocked for more than 120 seconds.
  2019-02-14 15:27       ` Wolfgang Walter
@ 2019-02-15  7:57         ` Guoqing Jiang
  0 siblings, 0 replies; 5+ messages in thread
From: Guoqing Jiang @ 2019-02-15 7:57 UTC (permalink / raw)
  To: Wolfgang Walter; +Cc: Jens Axboe, NeilBrown, linux-raid, linux-kernel

On 2/14/19 11:27 PM, Wolfgang Walter wrote:
> On Thursday, 14 February 2019, 10:09:56, Guoqing Jiang wrote:
>> On 2/12/19 7:20 PM, Wolfgang Walter wrote:
>>> [earlier quoting of the hung-task report and the blk-mq discussion
>>> snipped]
>>> I already did for 4 weeks. I didn't see this with blk-mq disabled (for
>>> scsi and md), though this may be by luck.
>> Then I guess it may be related to blk-mq. Which scheduler are you using
>> with blk-mq? And maybe you can switch it to see whether the issue is
>> caused by a specific scheduler or not.
> mq-deadline for SCSI and none for md and dm.

Can you try with the patch [1], in case the block was caused by flush?

[1]: https://patchwork.kernel.org/patch/10787903/

Thanks,
Guoqing
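
For anyone who wants to test the suggested patch, patchwork normally
exposes each patch as an mbox that git can apply directly; a hedged sketch
assuming the usual mbox endpoint and a local 4.19 stable tree:

    cd ~/src/linux-stable
    git checkout -b md-flush-test v4.19.19
    curl -L https://patchwork.kernel.org/patch/10787903/mbox/ | git am
    # rebuild, boot the patched kernel, and watch for the hung-task message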
end of thread (newest message: 2019-02-15 7:57 UTC)

Thread overview: 5+ messages
2019-02-11 15:12 linux 4.19.19: md0_raid:1317 blocked for more than 120 seconds Wolfgang Walter
[not found] ` <0ee180ac-bb43-6c2f-4084-5cc452a18c9d@suse.com>
2019-02-12 11:20 ` Wolfgang Walter
2019-02-14 2:09 ` Guoqing Jiang
2019-02-14 15:27 ` Wolfgang Walter
2019-02-15 7:57 ` Guoqing Jiang