All of lore.kernel.org
 help / color / mirror / Atom feed
From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: Donald Buczek <buczek@molgen.mpg.de>, song@kernel.org
Cc: linux-raid@vger.kernel.org, Paul Menzel <pmenzel@molgen.mpg.de>,
	dm-devel@redhat.com, snitzer@redhat.com, agk@redhat.com
Subject: Re: [dm-devel] [PATCH V2] md: don't unregister sync_thread with reconfig_mutex held
Date: Tue, 14 Dec 2021 18:03:43 +0800	[thread overview]
Message-ID: <2ed2d0ba-14e9-f708-265e-8fc64902bd93@linux.dev> (raw)
In-Reply-To: <75958549-03d4-e396-e761-9956c4b2f991@molgen.mpg.de>



On 12/14/21 5:31 PM, Donald Buczek wrote:

>>>>   -void md_reap_sync_thread(struct mddev *mddev)
>>>> +void md_reap_sync_thread(struct mddev *mddev, bool 
>>>> reconfig_mutex_held)
>>>>   {
>>>>       struct md_rdev *rdev;
>>>>       sector_t old_dev_sectors = mddev->dev_sectors;
>>>>       bool is_reshaped = false;
>>>>         /* resync has finished, collect result */
>>>> +    if (reconfig_mutex_held)
>>>> +        mddev_unlock(mddev);
>>>
>>>
>>> If one thread got here, e.g. via action_store( /* "idle" */ ), now 
>>> that the mutex is unlocked, is there anything which would prevent 
>>> another thread getting  here as well, e.g. via the same path?
>>>
>>> If not, they both might call
>>>
>>>> md_unregister_thread(&mddev->sync_thread);
>>>
>>> Which is not reentrant:
>>>
>>> void md_unregister_thread(struct md_thread **threadp)
>>> {
>>>     struct md_thread *thread = *threadp;
>>>     if (!thread)
>>>         return;
>>>     pr_debug("interrupting MD-thread pid %d\n", 
>>> task_pid_nr(thread->tsk));
>>>     /* Locking ensures that mddev_unlock does not wake_up a
>>>      * non-existent thread
>>>      */
>>>     spin_lock(&pers_lock);
>>>     *threadp = NULL;
>>>     spin_unlock(&pers_lock);
>>>
>>>     kthread_stop(thread->tsk);
>>>     kfree(thread);
>>> }
>>>
>>> This might be a preexisting problem, because the call site in 
>>> dm-raid.c, which you updated to `md_reap_sync_thread(mddev, 
>>> false);`, didn't hold the mutex anyway.
>>>
>>> Am I missing something? Probably, I do.
>>>
>>> Otherwise: Move the deref of threadp in md_unregister_thread() into 
>>> the spinlock scope?
>>
>> Good point, I think you are right.
>>
>> And actually pers_lock does extra service to protect accesses to 
>> mddev->thread (I think it
>> also suitable for mddev->sync_thread ) when the mutex can't be held. 
>> Care to send a patch
>> for it?
>
> I'm really sorry, but it's one thing to point to a possible problem 
> and another thing to come up with a correct solution.

Yes, it is often the reality of life, and we can always correct 
ourselves if there is problem 😎.

> While I think it would be easy to avoid the double free with the 
> spinlock (or maybe atomic RMW) , we surely don't want to hold the 
> spinlock while we are sleeping in kthread_stop(). If we don't hold 
> some kind of lock, what are the side effects of another sync thread 
> being started or any other reconfiguration? Are the existing flags 
> enough to protect us from this? If we do want to hold the lock while 
> waiting for the thread to terminate, should it be made into a mutex? 
> If so, it probably shouldn't be static but moved into the mddev 
> structure. I'd need weeks if not month to figure that out and to feel 
> bold enough to post it.

Thanks for deep thinking about it, I think we can avoid to call 
kthread_stop with spinlock
held. Maybe something like this, but just my raw idea, please have a 
thorough review.

  void md_unregister_thread(struct md_thread **threadp)
  {
-       struct md_thread *thread = *threadp;
-       if (!thread)
+       struct md_thread *thread = READ_ONCE(*threadp);
+
+       spin_lock(&pers_lock);
+       if (!thread) {
+               spin_unlock(&pers_lock);
                 return;
+       }
         pr_debug("interrupting MD-thread pid %d\n", 
task_pid_nr(thread->tsk));
         /* Locking ensures that mddev_unlock does not wake_up a
          * non-existent thread
          */
-       spin_lock(&pers_lock);
         *threadp = NULL;
         spin_unlock(&pers_lock);

-       kthread_stop(thread->tsk);
+       if (IS_ERR_OR_NULL(thread->tsk)) {
+               kthread_stop(thread->tsk);
+               thread->tsk = NULL;
+       }
         kfree(thread);
  }
  EXPORT_SYMBOL(md_unregister_thread);

> I don't want to push work to others, but my own my understanding of md 
> is to narrow.

Me either 😉

Thanks,
Guoqing


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

WARNING: multiple messages have this Message-ID (diff)
From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: Donald Buczek <buczek@molgen.mpg.de>, song@kernel.org
Cc: agk@redhat.com, snitzer@redhat.com, linux-raid@vger.kernel.org,
	dm-devel@redhat.com, Paul Menzel <pmenzel@molgen.mpg.de>
Subject: Re: [PATCH V2] md: don't unregister sync_thread with reconfig_mutex held
Date: Tue, 14 Dec 2021 18:03:43 +0800	[thread overview]
Message-ID: <2ed2d0ba-14e9-f708-265e-8fc64902bd93@linux.dev> (raw)
In-Reply-To: <75958549-03d4-e396-e761-9956c4b2f991@molgen.mpg.de>



On 12/14/21 5:31 PM, Donald Buczek wrote:

>>>>   -void md_reap_sync_thread(struct mddev *mddev)
>>>> +void md_reap_sync_thread(struct mddev *mddev, bool 
>>>> reconfig_mutex_held)
>>>>   {
>>>>       struct md_rdev *rdev;
>>>>       sector_t old_dev_sectors = mddev->dev_sectors;
>>>>       bool is_reshaped = false;
>>>>         /* resync has finished, collect result */
>>>> +    if (reconfig_mutex_held)
>>>> +        mddev_unlock(mddev);
>>>
>>>
>>> If one thread got here, e.g. via action_store( /* "idle" */ ), now 
>>> that the mutex is unlocked, is there anything which would prevent 
>>> another thread getting  here as well, e.g. via the same path?
>>>
>>> If not, they both might call
>>>
>>>> md_unregister_thread(&mddev->sync_thread);
>>>
>>> Which is not reentrant:
>>>
>>> void md_unregister_thread(struct md_thread **threadp)
>>> {
>>>     struct md_thread *thread = *threadp;
>>>     if (!thread)
>>>         return;
>>>     pr_debug("interrupting MD-thread pid %d\n", 
>>> task_pid_nr(thread->tsk));
>>>     /* Locking ensures that mddev_unlock does not wake_up a
>>>      * non-existent thread
>>>      */
>>>     spin_lock(&pers_lock);
>>>     *threadp = NULL;
>>>     spin_unlock(&pers_lock);
>>>
>>>     kthread_stop(thread->tsk);
>>>     kfree(thread);
>>> }
>>>
>>> This might be a preexisting problem, because the call site in 
>>> dm-raid.c, which you updated to `md_reap_sync_thread(mddev, 
>>> false);`, didn't hold the mutex anyway.
>>>
>>> Am I missing something? Probably, I do.
>>>
>>> Otherwise: Move the deref of threadp in md_unregister_thread() into 
>>> the spinlock scope?
>>
>> Good point, I think you are right.
>>
>> And actually pers_lock does extra service to protect accesses to 
>> mddev->thread (I think it
>> also suitable for mddev->sync_thread ) when the mutex can't be held. 
>> Care to send a patch
>> for it?
>
> I'm really sorry, but it's one thing to point to a possible problem 
> and another thing to come up with a correct solution.

Yes, it is often the reality of life, and we can always correct 
ourselves if there is problem 😎.

> While I think it would be easy to avoid the double free with the 
> spinlock (or maybe atomic RMW) , we surely don't want to hold the 
> spinlock while we are sleeping in kthread_stop(). If we don't hold 
> some kind of lock, what are the side effects of another sync thread 
> being started or any other reconfiguration? Are the existing flags 
> enough to protect us from this? If we do want to hold the lock while 
> waiting for the thread to terminate, should it be made into a mutex? 
> If so, it probably shouldn't be static but moved into the mddev 
> structure. I'd need weeks if not month to figure that out and to feel 
> bold enough to post it.

Thanks for deep thinking about it, I think we can avoid to call 
kthread_stop with spinlock
held. Maybe something like this, but just my raw idea, please have a 
thorough review.

  void md_unregister_thread(struct md_thread **threadp)
  {
-       struct md_thread *thread = *threadp;
-       if (!thread)
+       struct md_thread *thread = READ_ONCE(*threadp);
+
+       spin_lock(&pers_lock);
+       if (!thread) {
+               spin_unlock(&pers_lock);
                 return;
+       }
         pr_debug("interrupting MD-thread pid %d\n", 
task_pid_nr(thread->tsk));
         /* Locking ensures that mddev_unlock does not wake_up a
          * non-existent thread
          */
-       spin_lock(&pers_lock);
         *threadp = NULL;
         spin_unlock(&pers_lock);

-       kthread_stop(thread->tsk);
+       if (IS_ERR_OR_NULL(thread->tsk)) {
+               kthread_stop(thread->tsk);
+               thread->tsk = NULL;
+       }
         kfree(thread);
  }
  EXPORT_SYMBOL(md_unregister_thread);

> I don't want to push work to others, but my own my understanding of md 
> is to narrow.

Me either 😉

Thanks,
Guoqing

  reply	other threads:[~2021-12-14 10:11 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-13  0:49 [dm-devel] [PATCH V2] md: don't unregister sync_thread with reconfig_mutex held Guoqing Jiang
2021-02-13  0:49 ` Guoqing Jiang
2021-02-15 11:07 ` [dm-devel] " Paul Menzel
2021-02-15 11:07   ` Paul Menzel
2021-02-24  9:09   ` [dm-devel] " Song Liu
2021-02-24  9:09     ` Song Liu
2021-02-24  9:25     ` [dm-devel] " Guoqing Jiang
2021-02-24  9:25       ` Guoqing Jiang
2021-03-19 23:00       ` [dm-devel] " Song Liu
2021-03-19 23:00         ` Song Liu
2021-11-30 17:25         ` [dm-devel] " Paul Menzel
2021-11-30 17:25           ` Paul Menzel
2021-11-30 17:27           ` [dm-devel] " Paul Menzel
2021-11-30 17:27             ` Paul Menzel
2021-12-08 14:16             ` [dm-devel] " Guoqing Jiang
2021-12-08 14:16               ` Guoqing Jiang
2021-12-08 16:35               ` [dm-devel] " Heinz Mauelshagen
2021-12-09  0:47                 ` Guoqing Jiang
2021-12-09  0:47                   ` Guoqing Jiang
2021-12-09 12:54   ` [dm-devel] " Donald Buczek
2021-12-09 12:54     ` Donald Buczek
2021-12-09 12:57   ` [dm-devel] " Donald Buczek
2021-12-09 12:57     ` Donald Buczek
2021-12-10  1:06     ` [dm-devel] " Guoqing Jiang
2021-12-10  1:06       ` Guoqing Jiang
2021-12-10 14:16 ` [dm-devel] " Donald Buczek
2021-12-10 14:16   ` Donald Buczek
2021-12-14  2:34   ` [dm-devel] " Guoqing Jiang
2021-12-14  2:34     ` Guoqing Jiang
2021-12-14  9:31     ` [dm-devel] " Donald Buczek
2021-12-14  9:31       ` Donald Buczek
2021-12-14 10:03       ` Guoqing Jiang [this message]
2021-12-14 10:03         ` Guoqing Jiang
2021-12-14 12:21         ` [dm-devel] " Donald Buczek
2021-12-14 12:21           ` Donald Buczek
2022-01-11 12:25 ` [dm-devel] " Mikko Rantalainen
2022-01-11 12:25   ` Mikko Rantalainen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2ed2d0ba-14e9-f708-265e-8fc64902bd93@linux.dev \
    --to=guoqing.jiang@linux.dev \
    --cc=agk@redhat.com \
    --cc=buczek@molgen.mpg.de \
    --cc=dm-devel@redhat.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=pmenzel@molgen.mpg.de \
    --cc=snitzer@redhat.com \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.