* [linux-lvm] mirrored LV + cmirror problem
@ 2008-02-15 11:36 Lajkó Attila
2008-02-15 12:02 ` Michael Eisenkölbl
2008-02-15 15:41 ` Jonathan Brassow
0 siblings, 2 replies; 13+ messages in thread
From: Lajkó Attila @ 2008-02-15 11:36 UTC (permalink / raw)
To: linux-lvm
Hello,
I have a problem with clvmd and cmirror:
We have a two-node cluster (RHEL4.6). I created a mirrored LV on a
clustered volume group on 2 iSCSI LUNs (VTrak M200i).
When I disconnect one of the LUNs - simulating a storage problem -
the mirrored LV doesn't go to linear mode, the LVM commands (lvs,
lvconvert, etc.) get stuck and the GFS file system is not accessible
(on both nodes).
What I see in /var/log/messages:
Feb 15 12:29:26 el42 kernel: dm-cmirror: server_complete_resync_work - Setting recovery_halted = 1
Feb 15 12:29:26 el42 kernel: dm-cmirror: Log flush failure: -5 -EIO
Feb 15 12:29:26 el42 last message repeated 4 times
Feb 15 12:29:26 el42 kernel: dm-cmirror: Log flush failure: -5 -EIO
Feb 15 12:29:26 el42 kernel: dm-cmirror: Recovery halted due to error on ItlWCmkP
Feb 15 12:29:26 el42 lvm[4929]: WARNING: dev_open(/dev/mapper/mirrp3) called while suspended
Feb 15 12:29:26 el42 kernel: dm-cmirror: LOG INFO:
Feb 15 12:29:26 el42 kernel: dm-cmirror: uuid: LVM-zEHPYfjtLCL7yqQhsG2kcPzthyLbyBPd7xlok1gd7NHgXR3l2XaVQWEVItlWCmkP
Feb 15 12:29:26 el42 kernel: dm-cmirror: uuid_ref : 1
Feb 15 12:29:26 el42 kernel: dm-cmirror: log type : disk
Feb 15 12:29:26 el42 kernel: dm-cmirror: ?region_count: 320
Feb 15 12:29:26 el42 kernel: dm-cmirror: ?sync_count : 320
Feb 15 12:29:26 el42 kernel: dm-cmirror: ?sync_search : 320
Feb 15 12:29:26 el42 kernel: dm-cmirror: in_sync : YES
Feb 15 12:29:26 el42 kernel: dm-cmirror: suspended : NO
Feb 15 12:29:26 el42 kernel: dm-cmirror: recovery_halted : YES
Feb 15 12:29:26 el42 kernel: dm-cmirror: server_id : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: server_valid: YES
Feb 15 12:29:26 el42 kernel: dm-cmirror: cluster_presuspend: recovery halted on ItlWCmkP(1)
Feb 15 12:29:26 el42 kernel: dm-cmirror: cluster_postsuspend
Feb 15 12:29:26 el42 kernel: dm-cmirror: Telling everyone I'm suspending (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_MASTER_LEAVING(13): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror: starter : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: co-ordinator: 0
Feb 15 12:29:26 el42 kernel: dm-cmirror: node_count : 0
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_MASTER_LEAVING(13): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror: starter : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: co-ordinator: 0
Feb 15 12:29:26 el42 kernel: dm-cmirror: node_count : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_ELECTION(10): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror: starter : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: co-ordinator: 57005
Feb 15 12:29:26 el42 kernel: dm-cmirror: node_count : 0
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_ELECTION(10): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror: starter : 2
Feb 15 12:29:26 el42 lvm[4929]: WARNING: dev_open(/etc/lvm/lvm.conf) called while suspended
Feb 15 12:29:26 el42 kernel: dm-cmirror: co-ordinator: 1
Feb 15 12:29:26 el42 kernel: dm-cmirror: node_count : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_SELECTION(11): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror: starter : 2
Feb 15 12:29:27 el42 kernel: dm-cmirror: co-ordinator: 1
Feb 15 12:29:27 el42 kernel: dm-cmirror: node_count : 2
Feb 15 12:29:27 el42 kernel: dm-cmirror: LRT_MASTER_ASSIGN(12): (ItlWCmkP)
Feb 15 12:29:27 el42 kernel: dm-cmirror: starter : 2
Feb 15 12:29:27 el42 kernel: dm-cmirror: co-ordinator: 1
Feb 15 12:29:27 el42 lvm[4929]: Failed to remove faulty devices in vgtest-lvtest
Feb 15 12:29:27 el42 kernel: dm-cmirror: node_count : 1
Feb 15 12:29:27 el42 kernel: dm-cmirror: Suspending now (ItlWCmkP)
Feb 15 12:29:28 el42 lvm[4929]: No longer monitoring mirror device vgtest-lvtest for events
Regards,
Attila Lajkó
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-15 11:36 [linux-lvm] mirrored LV + cmirror problem Lajkó Attila
@ 2008-02-15 12:02 ` Michael Eisenkölbl
2008-02-15 13:26 ` Lajkó Attila
2008-02-15 15:41 ` Jonathan Brassow
1 sibling, 1 reply; 13+ messages in thread
From: Michael Eisenkölbl @ 2008-02-15 12:02 UTC (permalink / raw)
To: LVM general discussion and development
how big are the LUNs?
michael
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-15 12:02 ` Michael Eisenkölbl
@ 2008-02-15 13:26 ` Lajkó Attila
2008-02-15 13:36 ` Michael Eisenkölbl
0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-15 13:26 UTC (permalink / raw)
To: LVM general discussion and development
The VTrak LUNs are 1 GB and 5 GB. The PVs are on a 240 MB partition on
each LUN; the LV size is 160 MB. It's just a test.
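
As a rough cross-check of these numbers against the posted log, the
following sketch divides the LV size by the "region_count: 320" value
reported by dm-cmirror. It assumes "160 MB" means 160 MiB; the function
name is made up for illustration. The result, 512 KiB per region, matches
what I believe is the default LVM mirror region size, so the log appears
to describe the whole 160 MB test LV.

```python
# Sketch: relate the LV size to the region_count reported by dm-cmirror.
MIB = 1024 * 1024


def region_size_bytes(lv_size_bytes: int, region_count: int) -> int:
    """Return the region size implied by an LV size and a region count."""
    if region_count <= 0:
        raise ValueError("region_count must be positive")
    if lv_size_bytes % region_count:
        raise ValueError("LV size is not a whole number of regions")
    return lv_size_bytes // region_count


lv_size = 160 * MIB   # LV size from this message (assumption: MB means MiB)
region_count = 320    # "region_count: 320" in the posted log

size = region_size_bytes(lv_size, region_count)
print(f"{size // 1024} KiB per region")  # prints "512 KiB per region"
```

Since sync_count also reads 320, every region was marked in sync at the
time of the failure.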
Attila.
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-15 13:26 ` Lajkó Attila
@ 2008-02-15 13:36 ` Michael Eisenkölbl
0 siblings, 0 replies; 13+ messages in thread
From: Michael Eisenkölbl @ 2008-02-15 13:36 UTC (permalink / raw)
To: LVM general discussion and development
We also have several problems with 6 TB volumes.
kind regards
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-15 11:36 [linux-lvm] mirrored LV + cmirror problem Lajkó Attila
2008-02-15 12:02 ` Michael Eisenkölbl
@ 2008-02-15 15:41 ` Jonathan Brassow
2008-02-15 16:17 ` Lajkó Attila
1 sibling, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-15 15:41 UTC (permalink / raw)
To: LVM general discussion and development
Are all the packages rhel4.6 as well, or have you compiled pkgs
yourself?
What was the load you had on the system?
The messages I see from dm-cmirror suggest that it is properly
shutting down in the face of the failure... However, before it has
finished, we can see "Failed to remove faulty devices in
vgtest-lvtest". This suggests to me that clvmd is not waiting long
enough for the shutdown to complete, but I only see 3 seconds of the
log.
When was the device failure initiated?
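
The ordering argument above can be checked mechanically. This is only a
sketch: the regular expression and helper names are illustrative, the
sample lines are copied from the posted log, and syslog's one-second
timestamps mean only the order of lines in the file, not the clock, tells
us anything. Kernel and userspace messages may also be interleaved
slightly out of order.

```python
import re

# Matches classic syslog lines such as:
#   Feb 15 12:29:27 el42 kernel: dm-cmirror: Suspending now (ItlWCmkP)
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<tag>[^:\s]+):\s+(?P<msg>.*)$"
)


def parse(lines):
    """Yield (index, timestamp, tag, message) for each parsable line."""
    for i, line in enumerate(lines):
        m = LINE_RE.match(line.strip())
        if m:
            yield i, m.group("ts"), m.group("tag"), m.group("msg")


def first_index(events, needle):
    """Return the line index of the first message containing needle."""
    for i, _ts, _tag, msg in events:
        if needle in msg:
            return i
    return None


SAMPLE = [
    "Feb 15 12:29:27 el42 kernel: dm-cmirror: LRT_MASTER_ASSIGN(12): (ItlWCmkP)",
    "Feb 15 12:29:27 el42 lvm[4929]: Failed to remove faulty devices in vgtest-lvtest",
    "Feb 15 12:29:27 el42 kernel: dm-cmirror: node_count : 1",
    "Feb 15 12:29:27 el42 kernel: dm-cmirror: Suspending now (ItlWCmkP)",
]

events = list(parse(SAMPLE))
failed = first_index(events, "Failed to remove faulty devices")
suspended = first_index(events, "Suspending now")

# True when the lvm failure was logged before cmirror finished suspending.
lvm_failed_first = (
    failed is not None and suspended is not None and failed < suspended
)
print(lvm_failed_first)  # prints "True"
```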
brassow
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-15 15:41 ` Jonathan Brassow
@ 2008-02-15 16:17 ` Lajkó Attila
2008-02-15 21:06 ` Jonathan Brassow
0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-15 16:17 UTC (permalink / raw)
To: LVM general discussion and development
On Feb 15, 2008, at 4:41 PM, Jonathan Brassow wrote:
> Are all the packages rhel4.6 as well, or have you compiled pkgs
> yourself?
All the packages are binaries from rhel4.6:
lvm2-cluster-2.02.27-2.el4_6.1
cmirror-1.0.1-1
cmirror-kernel-xenU-2.6.9-38.5
>
>
> What was the load you had on the system?
Very low, approx. 0.
> The messages I see from dm-cmirror suggest that it is properly
> shutting down in the face of the failure... However, before it has
> finished, we can see "Failed to remove faulty devices in
> vgtest-lvtest". This suggests to me that clvmd is not waiting long enough
> for the shutdown to complete, but I only see 3 seconds of the log.
> When was the device failure initiated?
>
> brassow
>
>
The failure was initiated at the beginning of the log. I uploaded the
complete messages files from both nodes (el42 and el4) via FTP:
ftp://ftp.ulx.hu/upload/clvmd/messages.el4
ftp://ftp.ulx.hu/upload/clvmd/messages.el42
Attila
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-15 16:17 ` Lajkó Attila
@ 2008-02-15 21:06 ` Jonathan Brassow
2008-02-17 12:49 ` Lajkó Attila
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-15 21:06 UTC (permalink / raw)
To: LVM general discussion and development
If the problem is reproducible, we should be able to track it down.
When a failure happens, the kernel sends an event to userspace that
signals 'dmeventd' to take action. If we take dmeventd out of the
picture, we can run the commands ourselves with higher verbose settings.
When you activate the volume, you can
'lvchange -ay --monitor n <vg>/<lv>' - this will prevent dmeventd from
monitoring the mirror. Then kill the log device. Finally, run
'vgreduce --removemissing <VG> -vvvv' to perform the recovery.
(Redirecting all the output to a file will give us something to look
at if the failure is reproduced.)
We may need to grab debugging output from clvmd too, but that can get
messy, so we'll start with this.
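
To make the sequence above harder to get wrong, here is a small sketch
that only prints the command lines; it does not run anything. The VG and
LV names (vgtest, lvtest) are guessed from the "vgtest-lvtest" device name
in the log, and the output file name is hypothetical.

```python
import shlex


def diagnostic_steps(vg: str, lv: str, logfile: str = "vgreduce-debug.log"):
    """Return the shell command lines for the manual recovery test."""
    # Activate without dmeventd monitoring, so recovery is not automatic.
    activate = ["lvchange", "-ay", "--monitor", "n", f"{vg}/{lv}"]
    # Manual recovery with maximum verbosity, as suggested in the mail.
    recover = ["vgreduce", "--removemissing", vg, "-vvvv"]
    return [
        shlex.join(activate),
        "# now fail the log device, then run:",
        f"{shlex.join(recover)} > {shlex.quote(logfile)} 2>&1",
    ]


steps = diagnostic_steps("vgtest", "lvtest")
for step in steps:
    print(step)
```

Redirecting both stdout and stderr matters here, because the -vvvv debug
output would otherwise be lost from the capture.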
brassow
P.S. It looks like you must have *.debug; in your /etc/syslog.conf,
yes?
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-15 21:06 ` Jonathan Brassow
@ 2008-02-17 12:49 ` Lajkó Attila
2008-02-18 16:08 ` Jonathan Brassow
0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-17 12:49 UTC (permalink / raw)
To: LVM general discussion and development
Here is the output of vgreduce:
ftp://ftp.ulx.hu/upload/clvmd/vgreduce.tar.gz
Attila
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-17 12:49 ` Lajkó Attila
@ 2008-02-18 16:08 ` Jonathan Brassow
2008-02-18 16:48 ` Lajkó Attila
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-18 16:08 UTC (permalink / raw)
To: LVM general discussion and development
How are you performing the failures? It looks like just one machine
is losing its connection to the device, while the other machine's
links remain in place.
brassow
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-18 16:08 ` Jonathan Brassow
@ 2008-02-18 16:48 ` Lajkó Attila
2008-02-18 18:30 ` Jonathan Brassow
0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-18 16:48 UTC (permalink / raw)
To: LVM general discussion and development
Yes, I disconnected the LUN from one of the nodes (el4) via the VTrak
GUI.
Attila
On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:
> How are you performing the failures? It looks like just one machine
> is loosing its connection to the device, while the other machines
> links remain in place.
>
> brassow
>
> On Feb 17, 2008, at 6:49 AM, Lajk� Attila wrote:
>
>> Here is the output of vgreduce:
>> ftp://ftp.ulx.hu/upload/clvmd/vgreduce.tar.gz
>>
>> Attila
>>
>>
>> 2008. febr. 15, 22:06 DU d�tummal Jonathan Brassow <jbrassow@redhat.com
>> >
>> ezt �rta:
>>
>>> If the problem is reproducible, we should be able to track it down.
>>>
>>> When a failure happens, the kernel sends an event to userspace that
>>> signals 'dmeventd' to take action. If we take dmeventd out of the
>>> picture, we can run the commands ourselves with higher verbose
>>> settings.
>>>
>>> When you activate the volume, you can 'lvchange -ay --monitor n
>>> <vg>/<lv>' - this will prevent dmeventd from monitoring the mirror.
>>> Then kill the log device. Finally, run
>>> 'vgreduce --removemissing <VG> -vvvv' to perform the recovery.
>>> (Redirecting all the output to a file will give us something to look
>>> at if the failure is reproduced.)
>>>
>>> We may need to grab debugging output from clvmd too, but that can
>>> get messy, so we'll start with this.
>>>
>>> brassow
>>>
>>> P.S. It looks like you must have *.debug; in your /etc/syslog.conf,
>>> yes?
>>>
>>> On Feb 15, 2008, at 10:17 AM, Lajkó Attila wrote:
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-18 16:48 ` Lajkó Attila
@ 2008-02-18 18:30 ` Jonathan Brassow
2008-02-19 13:51 ` Lajkó Attila
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-18 18:30 UTC (permalink / raw)
To: LVM general discussion and development
That's the problem. (C)LVM does not currently have a way to handle
the disappearance of a device from just one machine... it expects a
similar view of devices from all machines in a cluster. "Locking" on
the second node fails because it doesn't know what to do with the disk
that it sees (that has been removed from the node suffering the
failure).
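To illustrate the mismatch described above, here is a minimal Python
sketch (not LVM code). It assumes each node can report the set of
physical volumes it currently sees; the function name, the node names,
and the PV labels are all hypothetical. It reports, per node, which
devices are visible elsewhere in the cluster but missing locally, which
is the situation a one-sided LUN disconnect creates.

```python
from typing import Dict, Set


def inconsistent_devices(views: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per node, the devices seen elsewhere but missing locally.

    `views` maps a (hypothetical) node name to the set of device labels
    that node can currently see. An empty result means every node has
    the same view of the devices.
    """
    # Union of everything any node in the cluster can see.
    all_devices: Set[str] = set().union(*views.values())
    missing = {node: all_devices - seen for node, seen in views.items()}
    # Keep only the nodes that are actually missing something.
    return {node: devs for node, devs in missing.items() if devs}


if __name__ == "__main__":
    # Hypothetical example: the second LUN was disconnected from node1 only.
    views = {
        "node1": {"pv-a"},
        "node2": {"pv-a", "pv-b"},
    }
    print(inconsistent_devices(views))
```

When the device is removed from every node, the views agree again and
the result is empty.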
There is someone working on handling orphaned/removed devices that
reappear (which would be similar to this case), but I'm not sure if
they've taken this scenario into account. I'll let him know about this.
brassow
On Feb 18, 2008, at 10:48 AM, Lajkó Attila wrote:
> Yes, I disconnected the lun via the Vtrak GUI from one of the nodes
> (el4).
>
> Attila
>
> On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:
>
>> How are you performing the failures? It looks like just one
>> machine is losing its connection to the device, while the other
>> machines' links remain in place.
>>
>> brassow
>>
>> On Feb 17, 2008, at 6:49 AM, Lajkó Attila wrote:
>>
>>> Here is the output of vgreduce:
>>> ftp://ftp.ulx.hu/upload/clvmd/vgreduce.tar.gz
>>>
>>> Attila
>>>
>>>
>>> On Feb 15, 2008, at 10:06 PM, Jonathan Brassow <jbrassow@redhat.com>
>>> wrote:
>>>
>>>> If the problem is reproducible, we should be able to track it down.
>>>>
>>>> When a failure happens, the kernel sends an event to userspace that
>>>> signals 'dmeventd' to take action. If we take dmeventd out of the
>>>> picture, we can run the commands ourselves with higher verbose
>>>> settings.
>>>>
>>>> When you activate the volume, you can 'lvchange -ay --monitor n
>>>> <vg>/<lv>' - this will prevent dmeventd from monitoring the mirror.
>>>> Then kill the log device. Finally, run
>>>> 'vgreduce --removemissing <VG> -vvvv' to perform the recovery.
>>>> (Redirecting all the output to a file will give us something to
>>>> look at if the failure is reproduced.)
>>>>
>>>> We may need to grab debugging output from clvmd too, but that can
>>>> get messy, so we'll start with this.
>>>>
>>>> brassow
>>>>
>>>> P.S. It looks like you must have *.debug; in your /etc/syslog.conf,
>>>> yes?
>>>>
>>>> On Feb 15, 2008, at 10:17 AM, Lajkó Attila wrote:
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-18 18:30 ` Jonathan Brassow
@ 2008-02-19 13:51 ` Lajkó Attila
2008-02-19 15:43 ` Jonathan Brassow
0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-19 13:51 UTC (permalink / raw)
To: LVM general discussion and development
I disconnected the LUN from both nodes and it worked: the LV went to
linear. Thanks for the help!
One more question: what do you think about this testing scenario? A
two-node cluster, where each node has a local disk that is exported via
GNBD to itself and to the other node. I build a cluster-mirrored LV on
the GNBD devices, then I kill one of the nodes. Should cmirror work
well in this case? Will the LV go to linear mode?
Attila
On Feb 18, 2008, at 7:30 PM, Jonathan Brassow wrote:
> That's the problem. (C)LVM does not currently have a way to handle
> the disappearance of a device from just one machine... it expects a
> similar view of devices from all machines in a cluster. "Locking"
> on the second node fails because it doesn't know what to do with the
> disk that it sees (that has been removed from the node suffering the
> failure).
>
> There is someone working on handling orphaned/removed devices that
> reappear (which would be similar to this case), but I'm not sure if
> they've taken this scenario into account. I'll let him know about
> this.
>
> brassow
>
> On Feb 18, 2008, at 10:48 AM, Lajkó Attila wrote:
>
>> Yes, I disconnected the lun via the Vtrak GUI from one of the nodes
>> (el4).
>>
>> Attila
>>
>> On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:
>>
>>> How are you performing the failures? It looks like just one
>>> machine is losing its connection to the device, while the other
>>> machines' links remain in place.
>>>
>>> brassow
>>>
>>>
* Re: [linux-lvm] mirrored LV + cmirror problem
2008-02-19 13:51 ` Lajkó Attila
@ 2008-02-19 15:43 ` Jonathan Brassow
0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-19 15:43 UTC (permalink / raw)
To: LVM general discussion and development
I've heard of people trying to create GNBD SANs in the past... Some
have gone so far as to have many nodes and pool the mirrors together...
It can be tricky. IIRC, one of the trickiest parts was bringing the
node back, reinserting the GNBD disk into the volume group, and
converting back to mirror.
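As a rough sketch of that last step, the following Python helper only
prints (it does not run) the two LVM commands one would typically
expect to use: vgextend to put the device back into the volume group,
and lvconvert -m 1 to convert the linear LV back to a two-way mirror.
The volume group, LV, and device names are placeholders, and the exact
procedure on a clustered GNBD setup is an assumption here; the
returning disk may also carry stale metadata that has to be dealt with
first.

```python
import shlex
from typing import List


def remirror_commands(vg: str, lv: str, device: str) -> List[List[str]]:
    """Build (but do not execute) commands to restore a mirror.

    All three arguments are placeholders supplied by the caller; this
    is an illustrative sequence, not a verified recovery procedure.
    """
    return [
        # Put the returned device back into the volume group.
        ["vgextend", vg, device],
        # Convert the now-linear LV back to a mirror with one extra copy.
        ["lvconvert", "-m", "1", f"{vg}/{lv}"],
    ]


if __name__ == "__main__":
    # Hypothetical names; print the commands for review before running.
    for cmd in remirror_commands("vg_example", "lv_example", "/dev/gnbd/example"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first leaves the decision to run them, and in
what order, with the administrator.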
Also, I think the remote node uses the GNBD interface, while the local
node simply uses the local interface. You may be able to use iSCSI
too, but I'm not sure.
brassow
On Feb 19, 2008, at 7:51 AM, Lajkó Attila wrote:
> I disconnected the lun from both nodes and it worked, the LV went to
> linear. Thanks for the help!
>
> One more question, what do you think about this testing scenario:
> Two-nodes cluster, each node has a local disk which is exported via
> GNBD to itself and to the other one. I build a cluster-mirrored LV
> on the GNBD devices, then I kill one of the nodes. In this case, the
> cmirror should work well? The LV will go to linear mode?
>
> Attila
>
> On Feb 18, 2008, at 7:30 PM, Jonathan Brassow wrote:
>
>> That's the problem. (C)LVM does not currently have a way to handle
>> the disappearance of a device from just one machine... it expects a
>> similar view of devices from all machines in a cluster. "Locking"
>> on the second node fails because it doesn't know what to do with
>> the disk that it sees (that has been removed from the node
>> suffering the failure).
>>
>> There is someone working on handling orphaned/removed devices that
>> reappear (which would be similar to this case), but I'm not sure if
>> they've taken this scenario into account. I'll let him know about
>> this.
>>
>> brassow
>>
>> On Feb 18, 2008, at 10:48 AM, Lajkó Attila wrote:
>>
>>> Yes, I disconnected the lun via the Vtrak GUI from one of the
>>> nodes (el4).
>>>
>>> Attila
>>>
>>> On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:
>>>
>>>> How are you performing the failures? It looks like just one
>>>> machine is losing its connection to the device, while the other
>>>> machines' links remain in place.
>>>>
>>>> brassow
>>>>
>>>>
Thread overview: 13+ messages
2008-02-15 11:36 [linux-lvm] mirrored LV + cmirror problem Lajkó Attila
2008-02-15 12:02 ` Michael Eisenkölbl
2008-02-15 13:26 ` Lajkó Attila
2008-02-15 13:36 ` Michael Eisenkölbl
2008-02-15 15:41 ` Jonathan Brassow
2008-02-15 16:17 ` Lajkó Attila
2008-02-15 21:06 ` Jonathan Brassow
2008-02-17 12:49 ` Lajkó Attila
2008-02-18 16:08 ` Jonathan Brassow
2008-02-18 16:48 ` Lajkó Attila
2008-02-18 18:30 ` Jonathan Brassow
2008-02-19 13:51 ` Lajkó Attila
2008-02-19 15:43 ` Jonathan Brassow