Re: [Update PATCH V3] md: don't unregister sync_thread with reconfig_mutex held

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: Donald Buczek <buczek@molgen.mpg.de>,
	Logan Gunthorpe <logang@deltatee.com>, Song Liu <song@kernel.org>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: [Update PATCH V3] md: don't unregister sync_thread with reconfig_mutex held
Date: Mon, 30 May 2022 17:55:58 +0800	[thread overview]
Message-ID: <31a9aed2-16cf-663a-5da3-0f9543ceb8c9@linux.dev> (raw)
In-Reply-To: <954f9c33-7801-b6d2-65e3-9e5237905886@linux.dev>



On 5/23/22 5:51 PM, Guoqing Jiang wrote:
>
>
>>>>> I noticed a clear regression with mdadm tests with this patch in 
>>>>> md-next
>>>>> (7e6ba434cc6080).
>>>>>
>>>>> Before the patch, tests 07reshape5intr and 07revert-grow would fail
>>>>> fairly infrequently (about 1 in 4 runs for the former and 1 in 30 
>>>>> runs
>>>>> for the latter).
>>>>>
>>>>> After this patch, both tests always fail.
>>>>>
>>>>> I don't have time to dig into why this is, but it would be nice if
>>>>> someone can at least fix the regression. It is hard to make any 
>>>>> progress
>>>>> on these tests if we are continuing to further break them.
>
> I have tried with both ubuntu 22.04 kernel which is 5.15 and vanilla 
> 5.12, none of them
> can pass your mentioned tests.
>
> [root@localhost mdadm]# lsblk|grep vd
> vda          252:0    0    1G  0 disk
> vdb          252:16   0    1G  0 disk
> vdc          252:32   0    1G  0 disk
> vdd          252:48   0    1G  0 disk
> [root@localhost mdadm]# ./test --dev=disk --disks=/dev/vd{a..d} 
> --tests=05r1-add-internalbitmap
> Testing on linux-5.12.0-default kernel
> /root/mdadm/tests/05r1-add-internalbitmap... succeeded
> [root@localhost mdadm]# ./test --dev=disk --disks=/dev/vd{a..d} 
> --tests=07reshape5intr
> Testing on linux-5.12.0-default kernel
> /root/mdadm/tests/07reshape5intr... FAILED - see 
> /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for 
> details
> [root@localhost mdadm]# ./test --dev=disk --disks=/dev/vd{a..d} 
> --tests=07revert-grow
> Testing on linux-5.12.0-default kernel
> /root/mdadm/tests/07revert-grow... FAILED - see 
> /var/tmp/07revert-grow.log and /var/tmp/fail07revert-grow.log for details
> [root@localhost mdadm]# head -10  /var/tmp/07revert-grow.log | grep mdadm
> + . /root/mdadm/tests/07revert-grow
> *++ mdadm -CR --assume-clean /dev/md0 -l5 -n4 -x1 /dev/vda /dev/vdb 
> /dev/vdc /dev/vdd /dev/vda /dev/vdb /dev/vdc /dev/vdd --metadata=0.9**
> *
> The above line is clearly wrong from my understanding.
>
> And let's check ubuntu 22.04.
>
> root@vm:/home/gjiang/mdadm# lsblk|grep vd
> vda    252:0    0     1G  0 disk
> vdb    252:16   0     1G  0 disk
> vdc    252:32   0     1G  0 disk
> root@vm:/home/gjiang/mdadm# ./test --dev=disk --disks=/dev/vd{a..d} 
> --tests=05r1-failfast
> Testing on linux-5.15.0-30-generic kernel
> /home/gjiang/mdadm/tests/05r1-failfast... succeeded
> root@vm:/home/gjiang/mdadm# ./test --dev=disk --disks=/dev/vd{a..c}   
> --tests=07reshape5intr
> Testing on linux-5.15.0-30-generic kernel
> /home/gjiang/mdadm/tests/07reshape5intr... FAILED - see 
> /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for 
> details
> root@vm:/home/gjiang/mdadm# ./test --dev=disk --disks=/dev/vd{a..c} 
> --tests=07revert-grow
> Testing on linux-5.15.0-30-generic kernel
> /home/gjiang/mdadm/tests/07revert-grow... FAILED - see 
> /var/tmp/07revert-grow.log and /var/tmp/fail07revert-grow.log for details
>
> So I would not consider it is regression.

I tried with 5.18.0-rc3, no problem for 07reshape5intr (will investigate 
why it failed with this patch), but 07revert-grow still failed without
apply this one.

 From fail07revert-grow.log, it shows below issues.

[ 7856.233515] mdadm[25246]: segfault at 0 ip 000000000040fe56 sp 
00007ffdcf252800 error 4 in mdadm[400000+81000]
[ 7856.233544] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 
f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 
de <f3> a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48

[ 7866.871747] mdadm[25463]: segfault at 0 ip 000000000040fe56 sp 
00007ffe91e39800 error 4 in mdadm[400000+81000]
[ 7866.871760] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 
f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 
de <f3> a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48

[ 7876.779855] ======================================================
[ 7876.779858] WARNING: possible circular locking dependency detected
[ 7876.779861] 5.18.0-rc3-57-default #28 Tainted: G            E
[ 7876.779864] ------------------------------------------------------
[ 7876.779867] mdadm/25444 is trying to acquire lock:
[ 7876.779870] ffff991817749938 ((wq_completion)md_misc){+.+.}-{0:0}, 
at: flush_workqueue+0x87/0x470
[ 7876.779879]
                but task is already holding lock:
[ 7876.779882] ffff9917c5c1c2c0 (&mddev->reconfig_mutex){+.+.}-{3:3}, 
at: action_store+0x11a/0x2c0 [md_mod]
[ 7876.779892]
                which lock already depends on the new lock.

[ 7876.779896]
                the existing dependency chain (in reverse order) is:
[ 7876.779899]
                -> #3 (&mddev->reconfig_mutex){+.+.}-{3:3}:
[ 7876.779904]        __mutex_lock+0x8f/0x920
[ 7876.779909]        layout_store+0x47/0x120 [md_mod]
[ 7876.779914]        md_attr_store+0x7a/0xc0 [md_mod]
[ 7876.779919]        kernfs_fop_write_iter+0x135/0x1b0
[ 7876.779924]        new_sync_write+0x10c/0x190
[ 7876.779927]        vfs_write+0x30e/0x370
[ 7876.779930]        ksys_write+0xa4/0xe0
[ 7876.779933]        do_syscall_64+0x3a/0x80
[ 7876.779936]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7876.779940]
                -> #2 (kn->active#359){++++}-{0:0}:
[ 7876.779945]        __kernfs_remove+0x28c/0x2e0
[ 7876.779948]        kernfs_remove_by_name_ns+0x52/0x90
[ 7876.779952]        remove_files.isra.1+0x30/0x70
[ 7876.779955]        sysfs_remove_group+0x3d/0x80
[ 7876.779958]        sysfs_remove_groups+0x29/0x40
[ 7876.779962]        __kobject_del+0x1b/0x80
[ 7876.779965]        kobject_del+0xf/0x20
[ 7876.779968]        mddev_delayed_delete+0x15/0x20 [md_mod]
[ 7876.779973]        process_one_work+0x2d8/0x650
[ 7876.779976]        worker_thread+0x2a/0x3b0
[ 7876.779979]        kthread+0xe8/0x110
[ 7876.779981]        ret_from_fork+0x22/0x30
[ 7876.779985]
                -> #1 ((work_completion)(&mddev->del_work)#2){+.+.}-{0:0}:
[ 7876.779990]        process_one_work+0x2af/0x650
[ 7876.779993]        worker_thread+0x2a/0x3b0
[ 7876.779996]        kthread+0xe8/0x110
[ 7876.779998]        ret_from_fork+0x22/0x30
[ 7876.780001]
                -> #0 ((wq_completion)md_misc){+.+.}-{0:0}:
[ 7876.780005]        __lock_acquire+0x12a0/0x1770
[ 7876.780009]        lock_acquire+0x277/0x310
[ 7876.780011]        flush_workqueue+0xae/0x470
[ 7876.780014]        action_store+0x188/0x2c0 [md_mod]
[ 7876.780019]        md_attr_store+0x7a/0xc0 [md_mod]
[ 7876.780024]        kernfs_fop_write_iter+0x135/0x1b0
[ 7876.780027]        new_sync_write+0x10c/0x190
[ 7876.780030]        vfs_write+0x30e/0x370
[ 7876.780033]        ksys_write+0xa4/0xe0
[ 7876.780036]        do_syscall_64+0x3a/0x80
[ 7876.780038]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7876.780042]
                other info that might help us debug this:

[ 7876.780045] Chain exists of:
                  (wq_completion)md_misc --> kn->active#359 --> 
&mddev->reconfig_mutex

[ 7876.780052]  Possible unsafe locking scenario:

[ 7876.780054]        CPU0                    CPU1
[ 7876.780056]        ----                    ----
[ 7876.780059]   lock(&mddev->reconfig_mutex);
[ 7876.780061] lock(kn->active#359);
[ 7876.780064] lock(&mddev->reconfig_mutex);
[ 7876.780068]   lock((wq_completion)md_misc);
[ 7876.780070]
                 *** DEADLOCK ***

[ 7876.780073] 5 locks held by mdadm/25444:
[ 7876.780075]  #0: ffff9917c769a458 (sb_writers#3){.+.+}-{0:0}, at: 
ksys_write+0xa4/0xe0
[ 7876.780082]  #1: ffff9917de1a6518 (&of->prealloc_mutex){+.+.}-{3:3}, 
at: kernfs_fop_write_iter+0x5c/0x1b0
[ 7876.780088]  #2: ffff9917de1a6488 (&of->mutex){+.+.}-{3:3}, at: 
kernfs_fop_write_iter+0x103/0x1b0
[ 7876.780094]  #3: ffff99181b0d2158 (kn->active#287){++++}-{0:0}, at: 
kernfs_fop_write_iter+0x10c/0x1b0
[ 7876.780100]  #4: ffff9917c5c1c2c0 
(&mddev->reconfig_mutex){+.+.}-{3:3}, at: action_store+0x11a/0x2c0 [md_mod]
[ 7876.780108]
                stack backtrace:
[ 7876.780111] CPU: 1 PID: 25444 Comm: mdadm Tainted: G E     
5.18.0-rc3-57-default #28
[ 7876.780115] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[ 7876.780120] Call Trace:
[ 7876.780122]  <TASK>
[ 7876.780124]  dump_stack_lvl+0x55/0x6d
[ 7876.780128]  check_noncircular+0x105/0x120
[ 7876.780133]  ? __lock_acquire+0x12a0/0x1770
[ 7876.780136]  ? lockdep_lock+0x21/0x90
[ 7876.780139]  __lock_acquire+0x12a0/0x1770
[ 7876.780143]  lock_acquire+0x277/0x310
[ 7876.780146]  ? flush_workqueue+0x87/0x470
[ 7876.780149]  ? __raw_spin_lock_init+0x3b/0x60
[ 7876.780152]  ? lockdep_init_map_type+0x58/0x250
[ 7876.780156]  flush_workqueue+0xae/0x470
[ 7876.780158]  ? flush_workqueue+0x87/0x470
[ 7876.780161]  ? _raw_spin_unlock+0x29/0x40
[ 7876.780166]  ? action_store+0x188/0x2c0 [md_mod]
[ 7876.780171]  action_store+0x188/0x2c0 [md_mod]
[ 7876.780176]  md_attr_store+0x7a/0xc0 [md_mod]
[ 7876.780182]  kernfs_fop_write_iter+0x135/0x1b0
[ 7876.780186]  new_sync_write+0x10c/0x190
[ 7876.780190]  vfs_write+0x30e/0x370
[ 7876.780193]  ksys_write+0xa4/0xe0
[ 7876.780197]  do_syscall_64+0x3a/0x80
[ 7876.780200]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7876.780203] RIP: 0033:0x7f7cd43f9c03
[ 7876.780206] Code: 2d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 
1f 80 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 
05 <48> 3d 00 f0 ff ff 77 55 f3 c3 0f 1f 00 41 54 55 49 89 d4 53 48 89
[ 7876.780213] RSP: 002b:00007fff35b600b8 EFLAGS: 00000246 ORIG_RAX: 
0000000000000001
[ 7876.780218] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 
00007f7cd43f9c03
[ 7876.780221] RDX: 0000000000000004 RSI: 00000000004669a7 RDI: 
0000000000000003
[ 7876.780224] RBP: 00000000004669a7 R08: 000000000000000b R09: 
00000000ffffffff
[ 7876.780227] R10: 0000000000000000 R11: 0000000000000246 R12: 
000000000183d730
[ 7876.780231] R13: 00000000ffffffff R14: 0000000000001800 R15: 
0000000000000000
[ 7876.780236]  </TASK>
[ 7876.781007] md: reshape of RAID array md0
[ 7876.785949] md: md0: reshape interrupted.
[ 7877.193686] md0: detected capacity change from 107520 to 0
[ 7877.193694] md: md0 stopped.
[ 7877.314438] debugfs: Directory 'md0' with parent 'block' already present!
[ 7877.649549] md: array md0 already has disks!
[ 7877.665188] md/raid:md0: device loop0 operational as raid disk 0
[ 7877.665193] md/raid:md0: device loop3 operational as raid disk 3
[ 7877.665197] md/raid:md0: device loop2 operational as raid disk 2
[ 7877.665199] md/raid:md0: device loop1 operational as raid disk 1
[ 7877.665202] md/raid:md0: device loop4 operational as raid disk 4
[ 7877.665205] md/raid:md0: force stripe size 512 for reshape
[ 7877.667172] md/raid:md0: raid level 5 active with 4 out of 4 devices, 
algorithm 2
[ 7877.667360] md0: detected capacity change from 0 to 107520
[ 7878.107865] mdadm[25693]: segfault at 0 ip 000000000040fe56 sp 
00007fff7a845d50 error 4 in mdadm[400000+81000]
[ 7878.107878] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 
f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 
de <f3> a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48
[ 7878.428298] md: reshape of RAID array md0
[ 7880.552543] md: md0: reshape done.
[ 7882.053117] md0: detected capacity change from 107520 to 0
[ 7882.053124] md: md0 stopped.
[ ... ]
[ 7904.978417] md0: detected capacity change from 0 to 107520
[ 7905.346749] mdadm[26328]: segfault at 0 ip 000000000040fe56 sp 
00007ffc28540c20 error 4 in mdadm[400000+81000]
[ 7905.346764] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 
f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 
de <f3> a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48
[ 7905.476488] md: reshape of RAID array md0
[ 7907.530152] md: md0: reshape done.
[ 7909.149524] md0: detected capacity change from 107520 to 0
[ 7909.149533] md: md0 stopped.

Thanks,
Guoqing

next prev parent reply	other threads:[~2022-05-30  9:56 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-05  8:16 [PATCH 0/2] two fixes for md Guoqing Jiang
2022-05-05  8:16 ` [PATCH V3 1/2] md: don't unregister sync_thread with reconfig_mutex held Guoqing Jiang
2022-05-05 14:02   ` kernel test robot
2022-05-05 18:04   ` kernel test robot
2022-05-06  2:34     ` Guoqing Jiang
2022-05-06  2:34       ` Guoqing Jiang
2022-05-05  8:16 ` [PATCH 2/2] md: protect md_unregister_thread from reentrancy Guoqing Jiang
2022-05-09  6:39   ` Song Liu
2022-05-09  8:12     ` Guoqing Jiang
2022-05-06 11:36 ` [Update PATCH V3] md: don't unregister sync_thread with reconfig_mutex held Guoqing Jiang
2022-05-09  6:37   ` Song Liu
2022-05-09  8:09     ` Guoqing Jiang
2022-05-09  9:32       ` Wols Lists
2022-05-09 10:37         ` Guoqing Jiang
2022-05-09 11:19           ` Wols Lists
2022-05-09 11:26             ` Guoqing Jiang
2022-05-10  6:44       ` Song Liu
2022-05-10 12:01         ` Donald Buczek
2022-05-10 12:09           ` Guoqing Jiang
2022-05-10 12:35             ` Donald Buczek
2022-05-10 18:02               ` Song Liu
2022-05-11  8:10                 ` Guoqing Jiang
2022-05-11 21:45                   ` Song Liu
2022-05-20 18:27         ` Logan Gunthorpe
2022-05-21 18:23           ` Donald Buczek
2022-05-23  1:08             ` Guoqing Jiang
2022-05-23  5:41               ` Donald Buczek
2022-05-23  9:51                 ` Guoqing Jiang
2022-05-24 16:13                   ` Logan Gunthorpe
2022-05-25  9:04                     ` Guoqing Jiang
2022-05-25 18:22                       ` Logan Gunthorpe
2022-05-26  9:46                         ` Jan Kara
2022-05-26 11:53                         ` Jan Kara
2022-05-31  6:11                           ` Christoph Hellwig
2022-05-31  7:43                             ` Jan Kara
2022-05-30  9:55                   ` Guoqing Jiang [this message]
2022-05-30 16:35                     ` Logan Gunthorpe
2022-05-31  8:13                       ` Guoqing Jiang
2022-05-24 15:58                 ` Logan Gunthorpe
2022-05-24 18:16                   ` Song Liu
2022-05-25  9:17                 ` Guoqing Jiang
2022-05-24 15:51             ` Logan Gunthorpe
2022-06-02  8:12           ` Xiao Ni
2022-05-09  8:18   ` Donald Buczek
2022-05-09  8:48     ` Guoqing Jiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=31a9aed2-16cf-663a-5da3-0f9543ceb8c9@linux.dev \
    --to=guoqing.jiang@linux.dev \
    --cc=buczek@molgen.mpg.de \
    --cc=linux-raid@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.