* [dm-devel] A hang bug of dm on s390x
@ 2023-02-15 11:23 Pingfan Liu
2023-02-16 0:08 ` Ming Lei
0 siblings, 1 reply; 5+ messages in thread
From: Pingfan Liu @ 2023-02-15 11:23 UTC (permalink / raw)
To: dm-devel; +Cc: Ming Lei, Tao Liu
Hi guys,
I encountered a hang on an s390x system. The kernel under test is
not preemptible and was booted with "nr_cpus=1".
The test steps:
  umount /home
  lvremove /dev/rhel_s390x-kvm-011/home
  ## uncomment "snapshot_autoextend_threshold = 70" and "snapshot_autoextend_percent = 20" in /etc/lvm/lvm.conf

  systemctl enable lvm2-monitor.service
  systemctl start lvm2-monitor.service

  lvremove -y rhel_s390x-kvm-011/thinp
  lvcreate -L 10M -T rhel_s390x-kvm-011/thinp
  lvcreate -V 400M -T rhel_s390x-kvm-011/thinp -n src
  mkfs.ext4 /dev/rhel_s390x-kvm-011/src
  mount /dev/rhel_s390x-kvm-011/src /mnt
  for((i=0;i<4;i++)); do dd if=/dev/zero of=/mnt/test$i.img bs=100M count=1; done
And the system hangs with the console log [1]
The related kernel config
CONFIG_PREEMPT_NONE_BUILD=y
CONFIG_PREEMPT_NONE=y
CONFIG_PREEMPT_COUNT=y
CONFIG_SCHED_CORE=y
It turns out that, when the system hangs, the kernel is stuck in an
endless loop in the function dm_wq_work():

	while (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
		spin_lock_irq(&md->deferred_lock);
		bio = bio_list_pop(&md->deferred);
		spin_unlock_irq(&md->deferred_lock);

		if (!bio)
			break;
		thread_cpu = smp_processor_id();
		submit_bio_noacct(bio);
	}

Here dm_wq_work()->__submit_bio_noacct()->...->dm_handle_requeue()
keeps generating new bios, so the condition "if (!bio)" is never
met.
After applying the following patch, the issue is gone.
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e1ea3a7bd9d9..95c9cb07a42f 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2567,6 +2567,7 @@ static void dm_wq_work(struct work_struct *work)
 			break;
 
 		submit_bio_noacct(bio);
+		cond_resched();
 	}
 }
But I do not think it is a proper solution. Without this patch, if
"nr_cpus=1" is removed (the system has two CPUs), the issue cannot be
triggered. That suggests that, with more than one CPU, the above loop
can exit through the "if (!bio)" condition.
Any ideas?
Thanks,
Pingfan
[1]: the console log when hanging
[ 2062.321473] device-mapper: thin: 253:4: reached low water mark for
data device: sending event.
[ 2062.353217] dm-3: detected capacity change from 122880 to 147456
[-- MARK -- Wed Dec 14 08:45:00 2022]
[ 2062.376690] device-mapper: thin: 253:4: switching pool to
out-of-data-space (queue IO) mode
[ 2122.393998] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2122.394011] (detected by 0, t=6002 jiffies, g=36205, q=624 ncpus=1)
[ 2122.394014] rcu: All QSes seen, last rcu_sched kthread activity
6002 (4295149593-4295143591), jiffies_till_next_fqs=1, root ->qsmask
0x0
[ 2122.394017] rcu: rcu_sched kthread starved for 6002 jiffies!
g36205 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 2122.394019] rcu: Unless rcu_sched kthread gets sufficient CPU
time, OOM is now expected behavior.
[ 2122.394020] rcu: RCU grace-period kthread stack dump:
[ 2122.394022] task:rcu_sched state:R running task stack:
0 pid: 15 ppid: 2 flags:0x00000000
[ 2122.394027] Call Trace:
[ 2122.394030] [<00000001001fd5b0>] __schedule+0x300/0x6c0
[ 2122.394040] [<00000001001fd9d2>] schedule+0x62/0xf0
[ 2122.394042] [<000000010020360c>] schedule_timeout+0x8c/0x170
[ 2122.394045] [<00000000ff9e219e>] rcu_gp_fqs_loop+0x30e/0x3d0
[ 2122.394053] [<00000000ff9e3ee2>] rcu_gp_kthread+0x132/0x1b0
[ 2122.394054] [<00000000ff980578>] kthread+0x108/0x110
[ 2122.394058] [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2122.394061] [<000000010020455a>] ret_from_fork+0xa/0x30
[ 2122.394064] rcu: Stack dump where RCU GP kthread last ran:
[ 2122.394064] Task dump for CPU 0:
[ 2122.394066] task:kworker/0:2 state:R running task stack:
0 pid:16943 ppid: 2 flags:0x00000004
[ 2122.394070] Workqueue: kdmflush/253:6 dm_wq_work [dm_mod]
[ 2122.394100] Call Trace:
[ 2122.394100] [<00000000ff98bb98>] sched_show_task.part.0+0xf8/0x130
[ 2122.394106] [<00000001001efc9a>]
rcu_check_gp_kthread_starvation+0x172/0x190
[ 2122.394111] [<00000000ff9e5afe>] print_other_cpu_stall+0x2de/0x370
[ 2122.394113] [<00000000ff9e5d60>] check_cpu_stall+0x1d0/0x270
[ 2122.394114] [<00000000ff9e6152>] rcu_sched_clock_irq+0x82/0x230
[ 2122.394117] [<00000000ff9f808a>] update_process_times+0xba/0xf0
[ 2122.394121] [<00000000ffa0adfa>] tick_sched_handle+0x4a/0x70
[ 2122.394124] [<00000000ffa0b2fe>] tick_sched_timer+0x5e/0xc0
[ 2122.394126] [<00000000ff9f8cb6>] __hrtimer_run_queues+0x136/0x290
[ 2122.394128] [<00000000ff9f9f80>] hrtimer_interrupt+0x150/0x2d0
[ 2122.394130] [<00000000ff90ce36>] do_IRQ+0x56/0x70
[ 2122.394133] [<00000000ff90d216>] do_irq_async+0x56/0xb0
[ 2122.394135] [<00000001001f6786>] do_ext_irq+0x96/0x160
[ 2122.394138] [<00000001002047bc>] ext_int_handler+0xdc/0x110
[ 2122.394140] [<000003ff8005cf48>]
dm_split_and_process_bio+0x28/0x4d0 [dm_mod]
[ 2122.394152] ([<000003ff8005d2ec>]
dm_split_and_process_bio+0x3cc/0x4d0 [dm_mod])
[ 2122.394162] [<000003ff8005dca8>] dm_submit_bio+0x68/0x110 [dm_mod]
[ 2122.394173] [<00000000ffda0698>] __submit_bio+0x78/0x190
[ 2122.394178] [<00000000ffda081c>] __submit_bio_noacct+0x6c/0x1e0
[ 2122.394180] [<000003ff8005c2ac>] dm_wq_work+0x5c/0xc0 [dm_mod]
[ 2122.394190] [<00000000ff9771e6>] process_one_work+0x216/0x4a0
[ 2122.394196] [<00000000ff9779a4>] worker_thread+0x64/0x4a0
[ 2122.394198] [<00000000ff980578>] kthread+0x108/0x110
[ 2122.394199] [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2122.394201] [<000000010020455a>] ret_from_fork+0xa/0x30
[ 2302.444001] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2302.444015] (detected by 0, t=24007 jiffies, g=36205, q=624 ncpus=1)
[ 2302.444018] rcu: All QSes seen, last rcu_sched kthread activity
24007 (4295167598-4295143591), jiffies_till_next_fqs=1, root ->qsmask
0x0
[ 2302.444021] rcu: rcu_sched kthread starved for 24007 jiffies!
g36205 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 2302.444024] rcu: Unless rcu_sched kthread gets sufficient CPU
time, OOM is now expected behavior.
[ 2302.444025] rcu: RCU grace-period kthread stack dump:
[ 2302.444027] task:rcu_sched state:R running task stack:
0 pid: 15 ppid: 2 flags:0x00000000
[-- MARK -- Wed Dec 14 08:50:00 2022]
[ 2302.444204] Call Trace:
[ 2302.444207] [<00000001001fd5b0>] __schedule+0x300/0x6c0
[ 2302.444216] [<00000001001fd9d2>] schedule+0x62/0xf0
[ 2302.444218] [<000000010020360c>] schedule_timeout+0x8c/0x170
[ 2302.444221] [<00000000ff9e219e>] rcu_gp_fqs_loop+0x30e/0x3d0
[ 2302.444227] [<00000000ff9e3ee2>] rcu_gp_kthread+0x132/0x1b0
[ 2302.444229] [<00000000ff980578>] kthread+0x108/0x110
[ 2302.444232] [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2302.444235] [<000000010020455a>] ret_from_fork+0xa/0x30
[ 2302.444237] rcu: Stack dump where RCU GP kthread last ran:
[ 2302.444238] Task dump for CPU 0:
[ 2302.444239] task:kworker/0:2 state:R running task stack:
0 pid:16943 ppid: 2 flags:0x00000004
[ 2302.444244] Workqueue: kdmflush/253:6 dm_wq_work [dm_mod]
[ 2302.444270] Call Trace:
[ 2302.444270] [<00000000ff98bb98>] sched_show_task.part.0+0xf8/0x130
[ 2302.444275] [<00000001001efc9a>]
rcu_check_gp_kthread_starvation+0x172/0x190
[ 2302.444280] [<00000000ff9e5afe>] print_other_cpu_stall+0x2de/0x370
[ 2302.444282] [<00000000ff9e5d60>] check_cpu_stall+0x1d0/0x270
[ 2302.444284] [<00000000ff9e6152>] rcu_sched_clock_irq+0x82/0x230
[ 2302.444286] [<00000000ff9f808a>] update_process_times+0xba/0xf0
[ 2302.444290] [<00000000ffa0adfa>] tick_sched_handle+0x4a/0x70
[ 2302.444292] [<00000000ffa0b2fe>] tick_sched_timer+0x5e/0xc0
[ 2302.444294] [<00000000ff9f8cb6>] __hrtimer_run_queues+0x136/0x290
[ 2302.444296] [<00000000ff9f9f80>] hrtimer_interrupt+0x150/0x2d0
[ 2302.444298] [<00000000ff90ce36>] do_IRQ+0x56/0x70
[ 2302.444301] [<00000000ff90d216>] do_irq_async+0x56/0xb0
[ 2302.444303] [<00000001001f6786>] do_ext_irq+0x96/0x160
[ 2302.444306] [<00000001002047bc>] ext_int_handler+0xdc/0x110
[ 2302.444308] [<00000000ffbe2a1e>] kmem_cache_alloc+0x15e/0x530
[ 2302.444316] [<00000000ffb3e780>] mempool_alloc+0x60/0x210
[ 2302.444319] [<00000000ffd9ac4e>] bio_alloc_bioset+0x1ae/0x410
[ 2302.444324] [<00000000ffd9af8c>] bio_alloc_clone+0x3c/0x90
[ 2302.444326] [<000003ff8005cf9a>]
dm_split_and_process_bio+0x7a/0x4d0 [dm_mod]
[ 2302.444337] [<000003ff8005dca8>] dm_submit_bio+0x68/0x110 [dm_mod]
[ 2302.444347] [<00000000ffda0698>] __submit_bio+0x78/0x190
[ 2302.444350] [<00000000ffda081c>] __submit_bio_noacct+0x6c/0x1e0
[ 2302.444353] [<000003ff8005c2ac>] dm_wq_work+0x5c/0xc0 [dm_mod]
[ 2302.444363] [<00000000ff9771e6>] process_one_work+0x216/0x4a0
[ 2302.444367] [<00000000ff9779a4>] worker_thread+0x64/0x4a0
[ 2302.444369] [<00000000ff980578>] kthread+0x108/0x110
[ 2302.444371] [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2302.444373] [<000000010020455a>] ret_from_fork+0xa/0x30
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
* Re: [dm-devel] A hang bug of dm on s390x
2023-02-15 11:23 [dm-devel] A hang bug of dm on s390x Pingfan Liu
@ 2023-02-16 0:08 ` Ming Lei
2023-02-16 8:30 ` Pingfan Liu
0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2023-02-16 0:08 UTC (permalink / raw)
To: Pingfan Liu; +Cc: dm-devel, Ming Lei, Tao Liu
On Wed, Feb 15, 2023 at 07:23:40PM +0800, Pingfan Liu wrote:
> Any ideas?
I think the patch is correct.
For a kernel built without CONFIG_PREEMPT running on a single CPU core,
if the dm target (such as dm-thin) needs another wq or kthread for
handling IO, then the dm target side is blocked because dm_wq_work()
holds the only CPU. Sooner or later the dm target may have no
resources left to handle new IO from dm core, and it returns REQUEUE.
dm_wq_work() then becomes an endless loop.
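
The scenario above can be sketched as a small user-space model. This is
only an illustration under assumptions, not kernel code: struct model,
target_worker_run(), submit() and simulate() are hypothetical names, the
"free slot" counter stands in for whatever resource the target needs, and
the explicit call to target_worker_run() stands in for cond_resched()
letting the target's worker use the only CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the single-CPU case; none of this is kernel API. */
struct model {
	int deferred;   /* bios waiting on the deferred list */
	int free_slots; /* resources the target needs to accept a bio */
};

/* Stand-in for the target's own worker: it only helps if it gets the CPU. */
static void target_worker_run(struct model *m)
{
	m->free_slots = 4;
}

/* Stand-in for submitting one bio: without a free slot it is requeued. */
static void submit(struct model *m)
{
	if (m->free_slots > 0)
		m->free_slots--;	/* accepted by the target */
	else
		m->deferred++;		/* requeued onto the deferred list */
}

/*
 * Returns the number of loop iterations needed to drain the deferred
 * list, or -1 if the list is still not empty after max_iterations.
 */
int simulate(bool yield_each_iteration, int max_iterations)
{
	struct model m = { .deferred = 8, .free_slots = 0 };
	int iterations = 0;

	assert(max_iterations > 0);
	while (m.deferred > 0) {
		if (iterations == max_iterations)
			return -1;	/* still spinning: the modelled hang */
		m.deferred--;		/* models bio_list_pop() */
		submit(&m);
		iterations++;
		if (yield_each_iteration)
			target_worker_run(&m);	/* models cond_resched() */
	}
	return iterations;
}
```

In this model simulate(false, 1000) gives up and returns -1, because every
popped bio is requeued and the target's worker never runs, while
simulate(true, 1000) drains the list after a handful of iterations.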
Thanks,
Ming
* Re: [dm-devel] A hang bug of dm on s390x
2023-02-16 0:08 ` Ming Lei
@ 2023-02-16 8:30 ` Pingfan Liu
2023-02-16 12:39 ` Zdenek Kabelac
2023-02-16 17:29 ` Mike Snitzer
0 siblings, 2 replies; 5+ messages in thread
From: Pingfan Liu @ 2023-02-16 8:30 UTC (permalink / raw)
To: Ming Lei
Cc: Mike Snitzer, dm-devel, Zdenek Kabelac, Ming Lei, Tao Liu,
Alasdair Kergon
Hi Ming,
Thank you for looking into this.
Let me loop in Alasdair, Mike and Zdenek for further comments on the LVM side.
Thanks,
Pingfan
* Re: [dm-devel] A hang bug of dm on s390x
2023-02-16 8:30 ` Pingfan Liu
@ 2023-02-16 12:39 ` Zdenek Kabelac
2023-02-16 17:29 ` Mike Snitzer
1 sibling, 0 replies; 5+ messages in thread
From: Zdenek Kabelac @ 2023-02-16 12:39 UTC (permalink / raw)
To: Pingfan Liu, Ming Lei
Cc: Mike Snitzer, dm-devel, Ming Lei, Tao Liu, Alasdair Kergon
On 16. 02. 23 at 9:30, Pingfan Liu wrote:
> Hi Ming,
>
> Thank you for looking into this.
>
> let me loop in Alasdair, Mike and Zdenek for further comment on LVM stuff
>
>
> Thanks,
>
> Pingfan
Hi,
From the lvm2 point of view, a couple of clarifications: to let a
thin-pool auto-extend, the user has to configure:
thin_pool_autoextend_threshold = 70
thin_pool_autoextend_percent = 20
If thin_pool_autoextend_threshold is left at its default value of 100,
the thin-pool is never extended.
The default behavior of the thin-pool kernel target when it runs out of
space is to put all in-flight IO operations on hold for 60s (configurable
by a kernel parameter); after that, all such operations are errored and
the thin-pool goes to the out-of-space error state.
To get this state immediately, use '--errorwhenfull=y' with the thin-pool
(lvcreate, lvconvert, lvchange). This avoids any delay if the user doesn't
want the thin-pool expanded and wants to get an error ASAP.
But all of this might be unrelated to the issue you are seeing on your hw.
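
For illustration only, the two settings named above would typically sit in
the activation section of /etc/lvm/lvm.conf. This is a minimal fragment
assuming a stock lvm.conf layout; the comments are explanatory and not
taken from the original report:

```
# /etc/lvm/lvm.conf (fragment)
activation {
	# Start extending a thin-pool once it is 70% full.
	# The default of 100 disables automatic extension.
	thin_pool_autoextend_threshold = 70

	# Grow the thin-pool by 20% of its current size each time.
	thin_pool_autoextend_percent = 20
}
```

The snapshot_autoextend_* settings uncommented in the test steps apply to
old-style snapshots, so on their own they would not make the thin-pool grow.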
Regards
Zdenek
* Re: [dm-devel] A hang bug of dm on s390x
2023-02-16 8:30 ` Pingfan Liu
2023-02-16 12:39 ` Zdenek Kabelac
@ 2023-02-16 17:29 ` Mike Snitzer
1 sibling, 0 replies; 5+ messages in thread
From: Mike Snitzer @ 2023-02-16 17:29 UTC (permalink / raw)
To: Pingfan Liu
Cc: Ming Lei, dm-devel, Zdenek Kabelac, Ming Lei, Tao Liu,
Alasdair Kergon
[Top-posting but please don't...]
I've staged this fix for 6.3 inclusion and marked it for stable@:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-6.3&id=0ca44fcef241768fd25ee763b3d203b9852f269b
Ming, I also staged this similar fix (I have not reasoned through a
scenario where dm_wq_requeue_work would actually loop endlessly, but
it's good practice to include cond_resched() in such a workqueue
while loop):
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-6.3&id=f77692d65d54665d81815349cc727baa85e8b71d
Thanks,
Mike