From: Goldwyn Rodrigues <rgoldwyn@suse.de>
To: Abhijit Bhopatkar <abhopatk@cisco.com>,
linux-raid@vger.kernel.org, Lidong Zhong <lzhong@suse.com>
Cc: "Reese Faucette (rfaucett)" <rfaucett@cisco.com>
Subject: Re: [PATCH] md-cluster: avoid deadlock on MESSAGE lock resource
Date: Tue, 26 May 2015 09:44:49 -0500
Message-ID: <556486E1.4030404@suse.de>
In-Reply-To: <556330FA.1030200@cisco.com>
On 05/25/2015 09:26 AM, Abhijit Bhopatkar wrote:
> On 17/05/15 2:28 am, Goldwyn Rodrigues wrote:
>>
>>
>> On 05/08/2015 08:14 AM, Abhijit Bhopatkar wrote:
>>> On 08/05/15 6:40 pm, Abhijit Bhopatkar wrote:
>>>>
>>>> Every receiver has CR lock on MESSAGE while processing the message. When
>>>> every receiver releases ACK lock and for some reason fails to grab EX on
>>>> MESSAGE resource in time, a waiting sender could queue an EX on MESSAGE
>>>> instead. Now when receiver queues its up convert request on MESSAGE it
>>>> will end up in a deadlock situation.
>>>>
>>>> Setting NOQUEUE flag on MESSAGE lock resource while grabbing the EX on
>>>> MESSAGE on sender will avoid this deadlock. If sender can not grab
>>>> MESSAGE lock immediately it should retry until the lock is granted.
>>>>
>>>> Signed-off-by: Abhijit Bhopatkar <abhopatk@cisco.com>
>>>> ---
>>>> This has been minimally tested on a three node cluster.
>>>>
>>>
>>> I have tested standard mdadm operations (create, assemble etc).
>>> What more testing would you want me to do on this before it's considered
>>> ready?
>>
>> I am not sure how using DLM_LKF_NOQUEUE will help in this situation. DLM_LKF_NOQUEUE primarily means: do not queue the request if it cannot be granted right away. Besides, I don't like the idea of a goto loop.
>>
>> The sender can still creep in between the ack and the message locks. A situation would be where the "disrupting" sender is the lock owner of all the locks and hence will not have to pay communication costs and will manage to attain the locks faster.
>>
>> Perhaps DLM_LKF_HEADQUEUE or DLM_LKF_NOORDER is what you are looking for, but that again is not the complete solution.
>>
>> Another idea I could think of is for the sender to downconvert TOKEN to a shared lock such as CR halfway in the communication (say after message CR), and all receivers take the TOKEN in CR mode and release it once the communication is finally over.
>>
>> Regards,
>>
>>
> I agree about the goto pollution, and yes, converting the receivers to use DLM_LKF_HEADQUEUE will solve the problem gracefully. I will send the new patch shortly.
>
> However, I do not understand why this is an incomplete solution. The "disruptive sender", as you have called it, is already the TOKEN owner; otherwise it competes for the TOKEN lock as usual with the other senders, at equal priority, and gains no advantage over them. The changes simply make the sender stall until all _receivers_ complete their serialization, that is, until all receivers have converted the MESSAGE lock from CR to EX to NL. Nothing else changes.
>
Yes, you are right. I overlooked an operation on the message lock resource.
I will perform some tests before I sign off.
Thanks,
--
Goldwyn
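
The NOQUEUE behaviour discussed in this thread (a request that cannot be granted right away fails instead of being queued, and the sender retries) can be illustrated with a small toy model. This is only a sketch in Python, not the kernel DLM API and not the md-cluster code: the `Resource` class, `try_lock`, `sender_acquire_message`, and the node names are all hypothetical, only the NL, CR, and EX modes are modelled, and each receiver's CR -> EX -> NL sequence is collapsed into a single down-convert to NL.

```python
# Toy model only: hypothetical names, not the kernel DLM API.
NL, CR, EX = "NL", "CR", "EX"


def compatible(a, b):
    """DLM compatibility for the three modes used in this thread."""
    if NL in (a, b):
        return True   # NL is compatible with every mode
    if EX in (a, b):
        return False  # EX is compatible only with NL
    return True       # CR is compatible with CR


class Resource:
    """A lock resource that tracks granted modes per holder."""

    def __init__(self, name):
        self.name = name
        self.granted = {}  # holder -> mode

    def try_lock(self, holder, mode):
        """Model of a NOQUEUE request or convert: grant now or fail.

        The request is never queued, so it cannot sit in front of
        another node's later request.
        """
        for other, held in self.granted.items():
            if other != holder and not compatible(mode, held):
                return False
        self.granted[holder] = mode
        return True


def sender_acquire_message(message, sender, receiver_steps):
    """Retry EX on MESSAGE until it is granted.

    Between attempts one pending receiver step runs, standing in for
    time passing while the receivers finish with MESSAGE.
    """
    pending = list(receiver_steps)
    attempts = 0
    while True:
        attempts += 1
        if message.try_lock(sender, EX):
            return attempts
        if not pending:
            raise RuntimeError("no receiver progress, sender would spin")
        pending.pop(0)()


# Three-node scenario: node1 sends, node2 and node3 still hold CR.
message = Resource("MESSAGE")
message.try_lock("node2", CR)
message.try_lock("node3", CR)

# Simplification: each receiver just drops MESSAGE to NL.
steps = [
    lambda: message.try_lock("node2", NL),
    lambda: message.try_lock("node3", NL),
]
attempts = sender_acquire_message(message, "node1", steps)
print(attempts, message.granted)
```

In this model the sender's EX attempt fails while any receiver still holds CR and succeeds only after both have dropped to NL. It does not reproduce the deadlock itself or the HEADQUEUE alternative; it only shows why a NOQUEUE request never ends up queued ahead of a receiver.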
Thread overview: 6+ messages
2015-05-08 13:10 [PATCH] md-cluster: avoid deadlock on MESSAGE lock resource Abhijit Bhopatkar
2015-05-08 13:14 ` Abhijit Bhopatkar
2015-05-13 2:05 ` Lidong Zhong
2015-05-16 20:58 ` Goldwyn Rodrigues
2015-05-25 14:26 ` Abhijit Bhopatkar
2015-05-26 14:44 ` Goldwyn Rodrigues [this message]