From: Takahiro Yasui <tyasui@redhat.com>
To: k-ueda@ct.jp.nec.com
Cc: device-mapper development <dm-devel@redhat.com>
Subject: Re: [RFC][PATCH 0/3] dm-raid1: fix deadlock at suspend after suspend was interrupted
Date: Wed, 20 Jan 2010 17:58:04 -0500
Message-ID: <4B578A7C.6030400@redhat.com>
In-Reply-To: <4B566EDC.5070006@ct.jp.nec.com>

Hi Ueda-san,

Kiyoshi Ueda wrote:
> On 01/20/2010 05:40 AM +0900, Takahiro Yasui wrote:
>> Hi,
>>
>> This is a patch set to fix a deadlock when suspending a mirror device.
>>
>>
>> ISSUE
>> =====
>>
>> The suspend procedure on a dm-mirror device can deadlock on the
>> recovery_count semaphore.
>>
>> When mirror_presuspend is called, the recovery_count semaphore is acquired
>> in dm_rh_stop_recovery() to stop the recovery routine. But when a signal is
>> caught in dm_wait_for_completion() or an error occurs in dm_suspend(),
>> the suspend process is interrupted without releasing the recovery_count
>> semaphore of the mirror device. As a result, when another suspend is
>> executed, the suspend process gets stuck at dm_rh_stop_recovery().
>>
>> When the suspend procedure is interrupted, the device should keep working
>> properly, since its status is not "suspended."
>>
>>
>> SOLUTION
>> ========
>>
>> Introduce a target handler, cancel_presuspend, to cancel the status changes
>> made by a target-specific presuspend handler.
> 
> How about using ->resume as a cancelling method?
> Though you would have to audit the existing targets' ->resume handlers,
> I think it's a better idea than adding another target handler
> just for this purpose.

A resume method contains the whole resume procedure, but when suspend is
interrupted, the postsuspend handler has not been run. So the requirement
is only to restore the state changes made by the presuspend handler. If
the whole resume procedure is executed, at least dm-log will have a problem.
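
To make the requirement concrete, here is a minimal userspace sketch of
the idea. It is only a model under stated assumptions, not kernel code:
the struct and every model_* name are hypothetical, and the semaphore is
reduced to a plain counter so that "would block forever" can be reported
as a return value instead of actually hanging.

```c
/* Userspace model only: none of these names are real kernel API. */
#include <assert.h>
#include <stdbool.h>

struct model_mirror {
	int recovery_count;	/* models the semaphore count; 1 = available */
	bool suspended;
};

/* Models a non-blocking down(): returns true if the count was taken. */
static bool model_try_down(struct model_mirror *m)
{
	if (m->recovery_count == 0)
		return false;	/* a blocking down() would sleep here */
	m->recovery_count--;
	return true;
}

static void model_up(struct model_mirror *m)
{
	m->recovery_count++;
}

/* presuspend: stop recovery by taking recovery_count. */
static bool model_presuspend(struct model_mirror *m)
{
	return model_try_down(m);
}

/* cancel_presuspend: undo only what presuspend did, nothing more. */
static void model_cancel_presuspend(struct model_mirror *m)
{
	assert(m->recovery_count == 0);	/* presuspend must have run */
	model_up(m);
}

/*
 * One suspend attempt. 'interrupted' models a signal or error after
 * presuspend. Returns 0 on success, -1 if the suspend was interrupted,
 * and -2 if presuspend could not take recovery_count (the deadlock).
 */
static int model_suspend(struct model_mirror *m, bool interrupted,
			 bool use_cancel)
{
	if (!model_presuspend(m))
		return -2;
	if (interrupted) {
		if (use_cancel)
			model_cancel_presuspend(m);
		return -1;
	}
	m->suspended = true;
	return 0;
}
```

In this model, an interrupted suspend followed by a second suspend
returns -2 when use_cancel is false, and 0 when use_cancel is true,
because the count taken by presuspend has been given back.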

The mirror log is flushed in the postsuspend handler, so the log disk might
contain stale data at the moment when suspend is interrupted. If the resume
handler is used instead of a cancel_presuspend handler, the log data in
memory will be overwritten by the stale data on disk.
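
The stale-data problem can be sketched in the same way. Again this is a
userspace model with hypothetical names (struct model_log and the
model_log_* functions), assuming a log that keeps one clean/dirty flag
per region in memory, writes the flags out on flush, and reloads them
from disk on resume.

```c
/* Userspace model only: none of these names are real kernel API. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_REGIONS 4

struct model_log {
	bool mem_clean[MODEL_REGIONS];	/* in-memory region state */
	bool disk_clean[MODEL_REGIONS];	/* state last written to log disk */
};

/* Update the in-memory state of one region; the disk copy is untouched. */
static void model_log_mark(struct model_log *l, int region, bool clean)
{
	assert(region >= 0 && region < MODEL_REGIONS);
	l->mem_clean[region] = clean;
}

/* Models the flush done in postsuspend: memory -> disk. */
static void model_log_flush(struct model_log *l)
{
	for (int i = 0; i < MODEL_REGIONS; i++)
		l->disk_clean[i] = l->mem_clean[i];
}

/* Models a full resume: the in-memory state is reloaded from disk. */
static void model_log_resume(struct model_log *l)
{
	for (int i = 0; i < MODEL_REGIONS; i++)
		l->mem_clean[i] = l->disk_clean[i];
}
```

If a region is marked dirty in memory and model_log_resume() is then
called without a preceding model_log_flush(), the region reads as clean
again: the update made since the last flush is lost. That is the case an
interrupted suspend would hit if the resume handler were reused as is.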

I'm afraid that we would need to modify each target's resume handler so
that it works properly even when called after the presuspend handler has
run but before the postsuspend handler.

Please let me know if I am overlooking something.

> And in your dm-raid1 patch, cancelling the log's presuspend, which is used
> by dm-log-userspace, is missing.

Thank you for pointing this out. Yes, the userspace target should also be
handled. I will fix it.

Thanks,
Taka
