From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: Mikulas Patocka <mpatocka@redhat.com>, Song Liu <song@kernel.org>
Cc: linux-raid@vger.kernel.org, dm-devel@redhat.com,
	Zdenek Kabelac <zkabelac@redhat.com>
Subject: Re: [dm-devel] A crash caused by the commit 0dd84b319352bb8ba64752d4e45396d8b13e6018
Date: Thu, 3 Nov 2022 15:28:55 +0800
Message-ID: <ba8e67d8-b18b-13ba-2883-6ca6c6520ef2@linux.dev>
In-Reply-To: <78646e88-2457-81e1-e3e7-cf66b67ba923@linux.dev>



On 11/3/22 11:47 AM, Guoqing Jiang wrote:
>> [   78.491429] <TASK>
>> [   78.491640]  clone_endio+0xf4/0x1c0 [dm_mod]
>> [   78.492072]  clone_endio+0xf4/0x1c0 [dm_mod]
>
> The clone_endio belongs to "clone" target_type.

Hmm, it could be the "clone_endio" in dm.c rather than the one in
dm-clone-target.c, since the trace attributes the symbol to [dm_mod].

>
>> [   78.492505] __submit_bio+0x76/0x120
>> [   78.492859]  submit_bio_noacct_nocheck+0xb6/0x2a0
>> [   78.493325]  flush_expired_bios+0x28/0x2f [dm_delay]
>
> This is "delay" target_type. Could you shed light on how the two targets
> connect with dm-raid? And I have shallow knowledge about dm ...
>
>> [   78.493808] process_one_work+0x1b4/0x300
>> [   78.494211]  worker_thread+0x45/0x3e0
>> [   78.494570]  ? rescuer_thread+0x380/0x380
>> [   78.494957]  kthread+0xc2/0x100
>> [   78.495279]  ? kthread_complete_and_exit+0x20/0x20
>> [   78.495743]  ret_from_fork+0x1f/0x30
>> [   78.496096]  </TASK>
>> [   78.496326] Modules linked in: brd dm_delay dm_raid dm_mod 
>> af_packet uvesafb cfbfillrect cfbimgblt cn cfbcopyarea fb font fbdev 
>> tun autofs4 binfmt_misc configfs ipv6 virtio_rng virtio_balloon 
>> rng_core virtio_net pcspkr net_failover failover qemu_fw_cfg button 
>> mousedev raid10 raid456 libcrc32c async_raid6_recov async_memcpy 
>> async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod sd_mod 
>> t10_pi crc64_rocksoft crc64 virtio_scsi scsi_mod evdev psmouse bsg 
>> scsi_common [last unloaded: brd]
>> [   78.500425] CR2: 0000000000000000
>> [   78.500752] ---[ end trace 0000000000000000 ]---
>> [   78.501214] RIP: 0010:mempool_free+0x47/0x80
>
> BTW, is the mempool_free from endio -> dec_count -> complete_io?

I guess it is "mempool_free(io, &io->client->pool)", where the pool is freed by
dm_io_client_destroy. It seems dm-raid is responsible for neither creating nor
destroying that pool.
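
To make the lifetime concern concrete, here is a minimal userspace sketch. It
is not kernel code: toy_pool and its helpers are hypothetical names, not the
dm-io or mempool API. It only assumes that the hazard is an element being
returned to a pool after the pool's owner has torn it down, and it shows a
counter that makes teardown refuse while elements are still outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a pool owned by an I/O client. */
struct toy_pool {
    int outstanding;   /* elements handed out and not yet returned */
    bool destroyed;    /* set once the owner has torn the pool down */
};

/* Hand out one element and account for it. */
static void *toy_pool_alloc(struct toy_pool *p)
{
    void *elem;

    if (p->destroyed)
        return NULL;
    elem = malloc(64);
    if (elem)
        p->outstanding++;
    return elem;
}

/* Return an element; fails instead of touching a destroyed pool. */
static bool toy_pool_free(struct toy_pool *p, void *elem)
{
    if (p->destroyed)
        return false;   /* caller still owns elem */
    free(elem);
    p->outstanding--;
    return true;
}

/* Teardown succeeds only when nothing is outstanding. */
static bool toy_pool_destroy(struct toy_pool *p)
{
    if (p->outstanding > 0)
        return false;
    p->destroyed = true;
    return true;
}
```

In this toy model, destroying the pool while an io is still in flight is
rejected, which is the ordering a destructor would have to guarantee.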

> And io which caused the crash is from dm_io -> async_io / sync_io
>  -> dispatch_io, seems dm-raid1 can call it instead of dm-raid, so I
> suppose the io is for mirror image. 

The io should come from another path: dm_submit_bio -> dm_split_and_process_bio
-> __split_and_process_bio -> __map_bio, which sets "bi_end_io = clone_endio".
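
As a rough illustration of that path, the sketch below models in userspace how
a completion callback is installed on a cloned bio and later invoked by
whichever layer completes it. All names (toy_bio, toy_map_bio, toy_clone_endio,
toy_complete) are hypothetical simplifications and not the real dm.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct bio. */
struct toy_bio {
    void (*bi_end_io)(struct toy_bio *bio); /* completion callback */
    int completed;                          /* set by the callback */
};

/* Plays the role of clone_endio(): runs when the lower layer finishes. */
static void toy_clone_endio(struct toy_bio *bio)
{
    bio->completed = 1;
}

/* Plays the role of __map_bio(): installs the callback before submission. */
static void toy_map_bio(struct toy_bio *clone)
{
    clone->bi_end_io = toy_clone_endio;
}

/* Plays the role of a lower layer completing the bio. */
static void toy_complete(struct toy_bio *bio)
{
    if (bio->bi_end_io)
        bio->bi_end_io(bio);
}
```

The point of the model is that the callback runs in the context of whoever
completes the bio, so any state it touches has to still be alive at that time.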

My guess is that there is a race between "lvchange --rebuild" and raid_dtr,
since the crash was reproduced by running the command in a loop.

Anyway, we can revert the mentioned commit and go back to Neil's solution [1],
but I'd like to reproduce it and learn DM a bit.

[1] https://lore.kernel.org/linux-raid/a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev/T/#m95ac225cab7409f66c295772483d091084a6d470

Thanks,
Guoqing

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
