From: Vladislav Bogdanov <bubble@hoster-ok.com>
To: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Andreas Pflug <pgadmin@pse-consulting.de>,
LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] LVM snapshot with Clustered VG [SOLVED]
Date: Fri, 15 Mar 2013 18:36:28 +0300
Message-ID: <51433FFC.7040609@hoster-ok.com>
In-Reply-To: <51433805.7030503@redhat.com>
15.03.2013 18:02, Zdenek Kabelac wrote:
> On 15.3.2013 15:51, Vladislav Bogdanov wrote:
>> 15.03.2013 16:32, Zdenek Kabelac wrote:
>>> On 15.3.2013 13:53, Vladislav Bogdanov wrote:
>>>> 15.03.2013 12:37, Zdenek Kabelac wrote:
>>>>> On 15.3.2013 10:29, Vladislav Bogdanov wrote:
>>>>>> 15.03.2013 12:00, Zdenek Kabelac wrote:
>>>>>>> On 14.3.2013 22:57, Andreas Pflug wrote:
>>>>>>>> On 03/13/13 19:30, Vladislav Bogdanov wrote:
>>>>>>>>>
>>>>>> You could activate LVs with the above syntax [ael]
>>>>> (there is tag support - so you could exclusively activate an LV on a
>>>>> remote node via some configuration tags)
>>>>
>>>> Could you please explain this - I do not see anything relevant in man
>>>> pages.
>>>
>>> Let's say you have 3 nodes A, B, C, tagged TAG_A, TAG_B, TAG_C
>>> respectively. Then on node A you may exclusively activate an LV which
>>> has TAG_B - this will try to exclusively activate the LV on the node
>>> which has that tag configured in lvm.conf (see volume_list = [])
>>
>> Aha, if I understand correctly, this is absolutely not what I need.
>> I want all this to be fully dynamic without any "config-editing voodoo".
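(Just so we are talking about the same thing - the static tag scheme you
describe would, as far as I understand, look roughly like this in lvm.conf
on node B; the tag and host names below are only an illustration:

    tags {
        TAG_B {
            host_list = [ "nodeB" ]
        }
    }
    activation {
        # only LVs carrying TAG_B (or listed explicitly) may be
        # activated on this node
        volume_list = [ "@TAG_B" ]
    }

so an exclusive activation request for an LV tagged TAG_B, issued from any
node, should end up being satisfied only on node B. That is exactly the
kind of per-node config editing I want to avoid.)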
>>
>>>
>>>>
>>>>>
>>>>> And you want to 'upgrade' remote locks to something else?
>>>>
>>>> Yes, shared-to-exclusive and vice versa.
>>>
>>> So how do you convert the lock from shared to exclusive without
>>> unlocking (if I get it right - you keep the ConcurrentRead lock and you
>>> want to take Exclusive - to change the state from 'active' to 'active
>>> exclusive')?
>>> https://en.wikipedia.org/wiki/Distributed_lock_manager
>>
>> I just pass LCKF_CONVERT to dlm_controld if requested and needed, and
>> it is dlm's task to either satisfy the conversion or refuse it.
>>
>
> So, to understand this thing better myself -
>
> dlm sends 'unlock' requests to all other nodes except the one whose lock
> should be converted to exclusive mode, and sends an exclusive lock
> request to the preferred node?
No.
clvmd sends a request to the remote clvmd to upgrade, acquire or release
the lock.
That remote instance asks its local dlm to do the job; dlm either says OK
or says ERROR. It does nothing beyond that.
If the LV is locked on a remote node, be it a shared or an exclusive lock,
dlm says ERROR when an exclusive lock (or a conversion to it) is requested.
My patches also allow "-an --force" to release shared locks on other
nodes. An exclusive lock may be released or downgraded only on the node
which holds it (or remotely, with --node <node>).
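To make the intended usage concrete, a rough sketch (VG/LV and <node> are
placeholders):

    # upgrade the local shared lock on this node to exclusive
    lvchange -aey --force VG/LV

    # drive the same conversion on another node from here
    lvchange -aey --node <node> --force VG/LV

    # release shared locks held by other nodes
    lvchange -an --force VG/LV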
>
>>>
>>> Clvmd 'communicates' via these locks.
>>
>> Not exactly true.
>>
>> clvmd does cluster communications with corosync, which implements
>> virtual synchrony, so all cluster nodes receive messages in the same
>> order.
>> At the bottom, clvmd uses libdlm to talk to dlm_controld and ask it to
>> lock/unlock.
>> dlm_controld instances use corosync for membership, and each locally
>> manages its in-kernel dlm counterpart; the kernel instances communicate
>> over a tcp/sctp mesh of connections.
>> So a request from one clvmd instance goes to another, enters the kernel
>> there, and is then distributed to the other nodes. Actually, it would
>> not matter where the request hits kernel space if dlm supported
>> delegating locks to remote nodes, but I suspect it doesn't. And if it
>> doesn't support such a thing, then the only option to manage a lock on a
>> remote node is to ask that node's dlm instance to do the locking job.
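Schematically, as I understand it, a request issued on node A for a lock
held on node B travels like this:

    lvchange (node A)
      -> clvmd (node A)  --corosync-->  clvmd (node B)
      -> libdlm (node B) -> dlm_controld / in-kernel dlm (node B)
      <-> in-kernel dlm on the other nodes (tcp/sctp mesh)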
>>
>>> So a proper algorithm needs to be in place to end up with a consistent
>>> state after lock changes (and sorry, I'm not a dlm expert here)
>>
>> That is what actually happens.
>> There is just no difference (to upgrade a local lock to exclusive on
>> node <node>) between running
>>
>> ssh <node> lvchange -aey --force VG/LV
>>
>> or
>>
>> lvchange -aey --node <node> --force VG/LV
>
>
> --node is exactly what the tag is for - each node may have its tag.
> lvm doesn't work with cluster nodes.
But corosync and dlm operate on node IDs, and pacemaker operates on node
names and IDs. None of them use tags.
>
> The question is - could the code be transformed to use this logic?
> I guess you need to know the dlm node name here, right?
Node IDs are obtained from the corosync membership list and may be used
for that. If corosync is configured with a nodelist the way pacemaker
wants it
(http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-node-name.html),
then node names may be used too.
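Something like this in corosync.conf is what I mean (a sketch only - the
addresses, names and node IDs are made up):

    nodelist {
        node {
            ring0_addr: 192.168.122.11
            name: nodeA
            nodeid: 1
        }
        node {
            ring0_addr: 192.168.122.12
            name: nodeB
            nodeid: 2
        }
    }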
>
>
>> It is the same command, just sent via a different channel.
>>
>> Again, I just send a locking request to a remote clvmd instance through
>> corosync.
>> It asks its local dlm to convert (acquire, release) the lock and returns
>> the answer. After dlm answers, the operation is either performed, and OK
>> is sent back to the initiator, or refused, and the error is sent back.
>
>
>>>> There are no other events on the destination node in the ver3
>>>> migration protocol, so I'm unable to convert the lock to exclusive
>>>> there after migration is finished. So I do that from the source node,
>>>> after it has released the lock.
>>>>
>>>>>
>>>>> Is that supported by dlm (since lvm locks are mapped to dlm)?
>>>> The command is just sent to a specific clvmd instance and performed
>>>> there.
>>>
>>> As said - the 'lock' is the thing which controls the activation state.
>>> So faking it at the software level may possibly lead to an
>>> inconsistency between the dlm and clvmd views of the lock state.
>>
>> No faking. Just remote management of the same lock.
>
> Could you repost the patches against git?
I plan to do that next week.
> With some usage examples?
Yes, if you give me an "example of an example" ;)
Vladislav
Thread overview: 31+ messages
2013-03-01 11:28 [linux-lvm] LVM snapshot with Clustered VG Andreas Pflug
2013-03-01 15:41 ` Vladislav Bogdanov
2013-03-06 7:40 ` Andreas Pflug
2013-03-06 7:58 ` Vladislav Bogdanov
2013-03-06 9:15 ` Andreas Pflug
2013-03-06 9:35 ` Vladislav Bogdanov
2013-03-06 9:59 ` Andreas Pflug
2013-03-06 11:20 ` Vladislav Bogdanov
2013-03-06 12:17 ` Andreas Pflug
2013-03-06 13:28 ` Vladislav Bogdanov
2013-03-12 6:52 ` Andreas Pflug
2013-03-13 15:14 ` [linux-lvm] LVM snapshot with Clustered VG [SOLVED] Andreas Pflug
2013-03-13 16:53 ` Vladislav Bogdanov
2013-03-13 17:37 ` Andreas Pflug
2013-03-13 18:30 ` Vladislav Bogdanov
2013-03-14 21:57 ` Andreas Pflug
2013-03-15 9:00 ` Zdenek Kabelac
2013-03-15 9:29 ` Vladislav Bogdanov
2013-03-15 9:37 ` Zdenek Kabelac
2013-03-15 12:53 ` Vladislav Bogdanov
2013-03-15 13:11 ` Vladislav Bogdanov
2013-03-15 13:32 ` Zdenek Kabelac
2013-03-15 14:51 ` Vladislav Bogdanov
2013-03-15 15:02 ` Zdenek Kabelac
2013-03-15 15:36 ` Vladislav Bogdanov [this message]
2013-03-15 15:55 ` Zdenek Kabelac
2013-03-15 17:16 ` Vladislav Bogdanov
-- strict thread matches above, loose matches on Subject: below --
2013-03-15 16:31 David Teigland
2013-03-15 17:46 ` Vladislav Bogdanov
2013-03-15 18:38 ` David Teigland
2013-03-16 11:00 ` Vladislav Bogdanov