From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 906ABC433F5 for ; Mon, 30 May 2022 09:56:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234433AbiE3J4O (ORCPT ); Mon, 30 May 2022 05:56:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44622 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232713AbiE3J4N (ORCPT ); Mon, 30 May 2022 05:56:13 -0400 Received: from out0.migadu.com (out0.migadu.com [IPv6:2001:41d0:2:267::]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E49594C7A2 for ; Mon, 30 May 2022 02:56:11 -0700 (PDT) Subject: Re: [Update PATCH V3] md: don't unregister sync_thread with reconfig_mutex held DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1653904570; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZoMpYzh34zzB5ARo9PPQnq+/eblJ/Z2YLXYPvv2jr5I=; b=eS0Z43PNZVD10WHTgEWOFFZmi2r8aBZ+yvVXNLJNkV3oZx9gRaGyHszmYvWxZa/Kqd5ss3 MdFnhVQVVuFa9fIPZhfYdRYjtXun+mWeAF0OPpi1dTqHcYs2g90ibUyTSSK+hFgl/w+KI+ u0ipenVUsFRGZxM5LHYfKhbwpzQKwyI= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Guoqing Jiang To: Donald Buczek , Logan Gunthorpe , Song Liu Cc: linux-raid References: <20220505081641.21500-1-guoqing.jiang@linux.dev> <20220506113656.25010-1-guoqing.jiang@linux.dev> <141b4110-767e-7670-21d5-6a5f636d1207@linux.dev> <836b2a93-65be-8d6c-8610-18373b88f86d@molgen.mpg.de> <5b0584a3-c128-cb53-7c8a-63744c60c667@linux.dev> <4edc9468-d195-6937-f550-211bccbd6756@molgen.mpg.de> <954f9c33-7801-b6d2-65e3-9e5237905886@linux.dev> Message-ID: <31a9aed2-16cf-663a-5da3-0f9543ceb8c9@linux.dev> Date: Mon, 30 May 2022 17:55:58 +0800 MIME-Version: 1.0 In-Reply-To: <954f9c33-7801-b6d2-65e3-9e5237905886@linux.dev> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Migadu-Flow: FLOW_OUT X-Migadu-Auth-User: linux.dev Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org On 5/23/22 5:51 PM, Guoqing Jiang wrote: > > >>>>> I noticed a clear regression with mdadm tests with this patch in >>>>> md-next >>>>> (7e6ba434cc6080). >>>>> >>>>> Before the patch, tests 07reshape5intr and 07revert-grow would fail >>>>> fairly infrequently (about 1 in 4 runs for the former and 1 in 30 >>>>> runs >>>>> for the latter). >>>>> >>>>> After this patch, both tests always fail. >>>>> >>>>> I don't have time to dig into why this is, but it would be nice if >>>>> someone can at least fix the regression. It is hard to make any >>>>> progress >>>>> on these tests if we are continuing to further break them. > > I have tried with both ubuntu 22.04 kernel which is 5.15 and vanilla > 5.12, none of them > can pass your mentioned tests. > > [root@localhost mdadm]# lsblk|grep vd > vda          252:0    0    1G  0 disk > vdb          252:16   0    1G  0 disk > vdc          252:32   0    1G  0 disk > vdd          252:48   0    1G  0 disk > [root@localhost mdadm]# ./test --dev=disk --disks=/dev/vd{a..d} > --tests=05r1-add-internalbitmap > Testing on linux-5.12.0-default kernel > /root/mdadm/tests/05r1-add-internalbitmap... succeeded > [root@localhost mdadm]# ./test --dev=disk --disks=/dev/vd{a..d} > --tests=07reshape5intr > Testing on linux-5.12.0-default kernel > /root/mdadm/tests/07reshape5intr... FAILED - see > /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for > details > [root@localhost mdadm]# ./test --dev=disk --disks=/dev/vd{a..d} > --tests=07revert-grow > Testing on linux-5.12.0-default kernel > /root/mdadm/tests/07revert-grow... FAILED - see > /var/tmp/07revert-grow.log and /var/tmp/fail07revert-grow.log for details > [root@localhost mdadm]# head -10  /var/tmp/07revert-grow.log | grep mdadm > + . /root/mdadm/tests/07revert-grow > *++ mdadm -CR --assume-clean /dev/md0 -l5 -n4 -x1 /dev/vda /dev/vdb > /dev/vdc /dev/vdd /dev/vda /dev/vdb /dev/vdc /dev/vdd --metadata=0.9** > * > The above line is clearly wrong from my understanding. > > And let's check ubuntu 22.04. > > root@vm:/home/gjiang/mdadm# lsblk|grep vd > vda    252:0    0     1G  0 disk > vdb    252:16   0     1G  0 disk > vdc    252:32   0     1G  0 disk > root@vm:/home/gjiang/mdadm# ./test --dev=disk --disks=/dev/vd{a..d} > --tests=05r1-failfast > Testing on linux-5.15.0-30-generic kernel > /home/gjiang/mdadm/tests/05r1-failfast... succeeded > root@vm:/home/gjiang/mdadm# ./test --dev=disk --disks=/dev/vd{a..c}   > --tests=07reshape5intr > Testing on linux-5.15.0-30-generic kernel > /home/gjiang/mdadm/tests/07reshape5intr... FAILED - see > /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for > details > root@vm:/home/gjiang/mdadm# ./test --dev=disk --disks=/dev/vd{a..c} > --tests=07revert-grow > Testing on linux-5.15.0-30-generic kernel > /home/gjiang/mdadm/tests/07revert-grow... FAILED - see > /var/tmp/07revert-grow.log and /var/tmp/fail07revert-grow.log for details > > So I would not consider it is regression. I tried with 5.18.0-rc3, no problem for 07reshape5intr (will investigate why it failed with this patch), but 07revert-grow still failed without apply this one. From fail07revert-grow.log, it shows below issues. [ 7856.233515] mdadm[25246]: segfault at 0 ip 000000000040fe56 sp 00007ffdcf252800 error 4 in mdadm[400000+81000] [ 7856.233544] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 de a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48 [ 7866.871747] mdadm[25463]: segfault at 0 ip 000000000040fe56 sp 00007ffe91e39800 error 4 in mdadm[400000+81000] [ 7866.871760] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 de a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48 [ 7876.779855] ====================================================== [ 7876.779858] WARNING: possible circular locking dependency detected [ 7876.779861] 5.18.0-rc3-57-default #28 Tainted: G            E [ 7876.779864] ------------------------------------------------------ [ 7876.779867] mdadm/25444 is trying to acquire lock: [ 7876.779870] ffff991817749938 ((wq_completion)md_misc){+.+.}-{0:0}, at: flush_workqueue+0x87/0x470 [ 7876.779879]                but task is already holding lock: [ 7876.779882] ffff9917c5c1c2c0 (&mddev->reconfig_mutex){+.+.}-{3:3}, at: action_store+0x11a/0x2c0 [md_mod] [ 7876.779892]                which lock already depends on the new lock. [ 7876.779896]                the existing dependency chain (in reverse order) is: [ 7876.779899]                -> #3 (&mddev->reconfig_mutex){+.+.}-{3:3}: [ 7876.779904]        __mutex_lock+0x8f/0x920 [ 7876.779909]        layout_store+0x47/0x120 [md_mod] [ 7876.779914]        md_attr_store+0x7a/0xc0 [md_mod] [ 7876.779919]        kernfs_fop_write_iter+0x135/0x1b0 [ 7876.779924]        new_sync_write+0x10c/0x190 [ 7876.779927]        vfs_write+0x30e/0x370 [ 7876.779930]        ksys_write+0xa4/0xe0 [ 7876.779933]        do_syscall_64+0x3a/0x80 [ 7876.779936]        entry_SYSCALL_64_after_hwframe+0x44/0xae [ 7876.779940]                -> #2 (kn->active#359){++++}-{0:0}: [ 7876.779945]        __kernfs_remove+0x28c/0x2e0 [ 7876.779948]        kernfs_remove_by_name_ns+0x52/0x90 [ 7876.779952]        remove_files.isra.1+0x30/0x70 [ 7876.779955]        sysfs_remove_group+0x3d/0x80 [ 7876.779958]        sysfs_remove_groups+0x29/0x40 [ 7876.779962]        __kobject_del+0x1b/0x80 [ 7876.779965]        kobject_del+0xf/0x20 [ 7876.779968]        mddev_delayed_delete+0x15/0x20 [md_mod] [ 7876.779973]        process_one_work+0x2d8/0x650 [ 7876.779976]        worker_thread+0x2a/0x3b0 [ 7876.779979]        kthread+0xe8/0x110 [ 7876.779981]        ret_from_fork+0x22/0x30 [ 7876.779985]                -> #1 ((work_completion)(&mddev->del_work)#2){+.+.}-{0:0}: [ 7876.779990]        process_one_work+0x2af/0x650 [ 7876.779993]        worker_thread+0x2a/0x3b0 [ 7876.779996]        kthread+0xe8/0x110 [ 7876.779998]        ret_from_fork+0x22/0x30 [ 7876.780001]                -> #0 ((wq_completion)md_misc){+.+.}-{0:0}: [ 7876.780005]        __lock_acquire+0x12a0/0x1770 [ 7876.780009]        lock_acquire+0x277/0x310 [ 7876.780011]        flush_workqueue+0xae/0x470 [ 7876.780014]        action_store+0x188/0x2c0 [md_mod] [ 7876.780019]        md_attr_store+0x7a/0xc0 [md_mod] [ 7876.780024]        kernfs_fop_write_iter+0x135/0x1b0 [ 7876.780027]        new_sync_write+0x10c/0x190 [ 7876.780030]        vfs_write+0x30e/0x370 [ 7876.780033]        ksys_write+0xa4/0xe0 [ 7876.780036]        do_syscall_64+0x3a/0x80 [ 7876.780038]        entry_SYSCALL_64_after_hwframe+0x44/0xae [ 7876.780042]                other info that might help us debug this: [ 7876.780045] Chain exists of:                  (wq_completion)md_misc --> kn->active#359 --> &mddev->reconfig_mutex [ 7876.780052]  Possible unsafe locking scenario: [ 7876.780054]        CPU0                    CPU1 [ 7876.780056]        ----                    ---- [ 7876.780059]   lock(&mddev->reconfig_mutex); [ 7876.780061] lock(kn->active#359); [ 7876.780064] lock(&mddev->reconfig_mutex); [ 7876.780068]   lock((wq_completion)md_misc); [ 7876.780070]                 *** DEADLOCK *** [ 7876.780073] 5 locks held by mdadm/25444: [ 7876.780075]  #0: ffff9917c769a458 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0xa4/0xe0 [ 7876.780082]  #1: ffff9917de1a6518 (&of->prealloc_mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x5c/0x1b0 [ 7876.780088]  #2: ffff9917de1a6488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x103/0x1b0 [ 7876.780094]  #3: ffff99181b0d2158 (kn->active#287){++++}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1b0 [ 7876.780100]  #4: ffff9917c5c1c2c0 (&mddev->reconfig_mutex){+.+.}-{3:3}, at: action_store+0x11a/0x2c0 [md_mod] [ 7876.780108]                stack backtrace: [ 7876.780111] CPU: 1 PID: 25444 Comm: mdadm Tainted: G E     5.18.0-rc3-57-default #28 [ 7876.780115] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014 [ 7876.780120] Call Trace: [ 7876.780122]  [ 7876.780124]  dump_stack_lvl+0x55/0x6d [ 7876.780128]  check_noncircular+0x105/0x120 [ 7876.780133]  ? __lock_acquire+0x12a0/0x1770 [ 7876.780136]  ? lockdep_lock+0x21/0x90 [ 7876.780139]  __lock_acquire+0x12a0/0x1770 [ 7876.780143]  lock_acquire+0x277/0x310 [ 7876.780146]  ? flush_workqueue+0x87/0x470 [ 7876.780149]  ? __raw_spin_lock_init+0x3b/0x60 [ 7876.780152]  ? lockdep_init_map_type+0x58/0x250 [ 7876.780156]  flush_workqueue+0xae/0x470 [ 7876.780158]  ? flush_workqueue+0x87/0x470 [ 7876.780161]  ? _raw_spin_unlock+0x29/0x40 [ 7876.780166]  ? action_store+0x188/0x2c0 [md_mod] [ 7876.780171]  action_store+0x188/0x2c0 [md_mod] [ 7876.780176]  md_attr_store+0x7a/0xc0 [md_mod] [ 7876.780182]  kernfs_fop_write_iter+0x135/0x1b0 [ 7876.780186]  new_sync_write+0x10c/0x190 [ 7876.780190]  vfs_write+0x30e/0x370 [ 7876.780193]  ksys_write+0xa4/0xe0 [ 7876.780197]  do_syscall_64+0x3a/0x80 [ 7876.780200]  entry_SYSCALL_64_after_hwframe+0x44/0xae [ 7876.780203] RIP: 0033:0x7f7cd43f9c03 [ 7876.780206] Code: 2d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 f3 c3 0f 1f 00 41 54 55 49 89 d4 53 48 89 [ 7876.780213] RSP: 002b:00007fff35b600b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 7876.780218] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7cd43f9c03 [ 7876.780221] RDX: 0000000000000004 RSI: 00000000004669a7 RDI: 0000000000000003 [ 7876.780224] RBP: 00000000004669a7 R08: 000000000000000b R09: 00000000ffffffff [ 7876.780227] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000183d730 [ 7876.780231] R13: 00000000ffffffff R14: 0000000000001800 R15: 0000000000000000 [ 7876.780236]  [ 7876.781007] md: reshape of RAID array md0 [ 7876.785949] md: md0: reshape interrupted. [ 7877.193686] md0: detected capacity change from 107520 to 0 [ 7877.193694] md: md0 stopped. [ 7877.314438] debugfs: Directory 'md0' with parent 'block' already present! [ 7877.649549] md: array md0 already has disks! [ 7877.665188] md/raid:md0: device loop0 operational as raid disk 0 [ 7877.665193] md/raid:md0: device loop3 operational as raid disk 3 [ 7877.665197] md/raid:md0: device loop2 operational as raid disk 2 [ 7877.665199] md/raid:md0: device loop1 operational as raid disk 1 [ 7877.665202] md/raid:md0: device loop4 operational as raid disk 4 [ 7877.665205] md/raid:md0: force stripe size 512 for reshape [ 7877.667172] md/raid:md0: raid level 5 active with 4 out of 4 devices, algorithm 2 [ 7877.667360] md0: detected capacity change from 0 to 107520 [ 7878.107865] mdadm[25693]: segfault at 0 ip 000000000040fe56 sp 00007fff7a845d50 error 4 in mdadm[400000+81000] [ 7878.107878] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 de a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48 [ 7878.428298] md: reshape of RAID array md0 [ 7880.552543] md: md0: reshape done. [ 7882.053117] md0: detected capacity change from 107520 to 0 [ 7882.053124] md: md0 stopped. [ ... ] [ 7904.978417] md0: detected capacity change from 0 to 107520 [ 7905.346749] mdadm[26328]: segfault at 0 ip 000000000040fe56 sp 00007ffc28540c20 error 4 in mdadm[400000+81000] [ 7905.346764] Code: 00 48 8d 7c 24 30 e8 79 30 ff ff 48 8d 7c 24 30 31 f6 31 c0 e8 db 34 ff ff 85 c0 79 77 bf 26 50 46 00 b9 04 00 00 00 48 89 de a6 0f 97 c0 1c 00 84 c0 75 18 e8 fa 36 ff ff 48 0f be 53 04 48 [ 7905.476488] md: reshape of RAID array md0 [ 7907.530152] md: md0: reshape done. [ 7909.149524] md0: detected capacity change from 107520 to 0 [ 7909.149533] md: md0 stopped. Thanks, Guoqing