* [linux-lvm] mirrored LV + cmirror problem
@ 2008-02-15 11:36 Lajkó Attila
  2008-02-15 12:02 ` Michael Eisenkölbl
  2008-02-15 15:41 ` Jonathan Brassow
  0 siblings, 2 replies; 13+ messages in thread
From: Lajkó Attila @ 2008-02-15 11:36 UTC (permalink / raw)
  To: linux-lvm

Hello,


I have a problem with clvmd and cmirror:

We have a two-node cluster (RHEL4.6). I created a mirrored LV on a
clustered volume group on 2 iSCSI LUNs (VTrak M200i).
When I disconnect one of the LUNs - simulating a storage problem -
the mirrored LV doesn't go to linear mode, the LVM commands (lvs,
lvconvert, etc.) get stuck and the GFS file system is not accessible
(on both nodes).

What I see in /var/log/messages:

Feb 15 12:29:26 el42 kernel: dm-cmirror: server_complete_resync_work - Setting recovery_halted = 1
Feb 15 12:29:26 el42 kernel: dm-cmirror: Log flush failure: -5 -EIO
Feb 15 12:29:26 el42 last message repeated 4 times
Feb 15 12:29:26 el42 kernel: dm-cmirror: Log flush failure: -5 -EIO
Feb 15 12:29:26 el42 kernel: dm-cmirror: Recovery halted due to error on ItlWCmkP
Feb 15 12:29:26 el42 lvm[4929]: WARNING: dev_open(/dev/mapper/mirrp3) called while suspended
Feb 15 12:29:26 el42 kernel: dm-cmirror: LOG INFO:
Feb 15 12:29:26 el42 kernel: dm-cmirror:   uuid: LVM-zEHPYfjtLCL7yqQhsG2kcPzthyLbyBPd7xlok1gd7NHgXR3l2XaVQWEVItlWCmkP
Feb 15 12:29:26 el42 kernel: dm-cmirror:   uuid_ref    : 1
Feb 15 12:29:26 el42 kernel: dm-cmirror:   log type    : disk
Feb 15 12:29:26 el42 kernel: dm-cmirror:  ?region_count: 320
Feb 15 12:29:26 el42 kernel: dm-cmirror:  ?sync_count  : 320
Feb 15 12:29:26 el42 kernel: dm-cmirror:  ?sync_search : 320
Feb 15 12:29:26 el42 kernel: dm-cmirror:   in_sync     : YES
Feb 15 12:29:26 el42 kernel: dm-cmirror:   suspended   : NO
Feb 15 12:29:26 el42 kernel: dm-cmirror:   recovery_halted : YES
Feb 15 12:29:26 el42 kernel: dm-cmirror:   server_id   : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror:   server_valid: YES
Feb 15 12:29:26 el42 kernel: dm-cmirror: cluster_presuspend: recovery halted on ItlWCmkP(1)
Feb 15 12:29:26 el42 kernel: dm-cmirror: cluster_postsuspend
Feb 15 12:29:26 el42 kernel: dm-cmirror: Telling everyone I'm suspending (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_MASTER_LEAVING(13): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror:   starter     : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror:   co-ordinator: 0
Feb 15 12:29:26 el42 kernel: dm-cmirror:   node_count  : 0
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_MASTER_LEAVING(13): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror:   starter     : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror:   co-ordinator: 0
Feb 15 12:29:26 el42 kernel: dm-cmirror:   node_count  : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_ELECTION(10): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror:   starter     : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror:   co-ordinator: 57005
Feb 15 12:29:26 el42 kernel: dm-cmirror:   node_count  : 0
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_ELECTION(10): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror:   starter     : 2
Feb 15 12:29:26 el42 lvm[4929]: WARNING: dev_open(/etc/lvm/lvm.conf) called while suspended
Feb 15 12:29:26 el42 kernel: dm-cmirror:   co-ordinator: 1
Feb 15 12:29:26 el42 kernel: dm-cmirror:   node_count  : 2
Feb 15 12:29:26 el42 kernel: dm-cmirror: LRT_SELECTION(11): (ItlWCmkP)
Feb 15 12:29:26 el42 kernel: dm-cmirror:   starter     : 2
Feb 15 12:29:27 el42 kernel: dm-cmirror:   co-ordinator: 1
Feb 15 12:29:27 el42 kernel: dm-cmirror:   node_count  : 2
Feb 15 12:29:27 el42 kernel: dm-cmirror: LRT_MASTER_ASSIGN(12): (ItlWCmkP)
Feb 15 12:29:27 el42 kernel: dm-cmirror:   starter     : 2
Feb 15 12:29:27 el42 kernel: dm-cmirror:   co-ordinator: 1
Feb 15 12:29:27 el42 lvm[4929]: Failed to remove faulty devices in vgtest-lvtest
Feb 15 12:29:27 el42 kernel: dm-cmirror:   node_count  : 1
Feb 15 12:29:27 el42 kernel: dm-cmirror: Suspending now (ItlWCmkP)
Feb 15 12:29:28 el42 lvm[4929]: No longer monitoring mirror device vgtest-lvtest for events

Regards,
Attila Lajkó


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-15 11:36 [linux-lvm] mirrored LV + cmirror problem Lajkó Attila
@ 2008-02-15 12:02 ` Michael Eisenkölbl
  2008-02-15 13:26   ` Lajkó Attila
  2008-02-15 15:41 ` Jonathan Brassow
  1 sibling, 1 reply; 13+ messages in thread
From: Michael Eisenkölbl @ 2008-02-15 12:02 UTC (permalink / raw)
  To: LVM general discussion and development

how big are the LUNs?

michael

> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-15 12:02 ` Michael Eisenkölbl
@ 2008-02-15 13:26   ` Lajkó Attila
  2008-02-15 13:36     ` Michael Eisenkölbl
  0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-15 13:26 UTC (permalink / raw)
  To: LVM general discussion and development


The VTrak LUNs are 1 GB and 5 GB. The PVs are on a 240 MB partition on
each LUN; the LV size is 160 MB. It's just a test.

Attila.


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-15 13:26   ` Lajkó Attila
@ 2008-02-15 13:36     ` Michael Eisenkölbl
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Eisenkölbl @ 2008-02-15 13:36 UTC (permalink / raw)
  To: LVM general discussion and development

We also have several problems with 6 TB volumes.

kind regards


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-15 11:36 [linux-lvm] mirrored LV + cmirror problem Lajkó Attila
  2008-02-15 12:02 ` Michael Eisenkölbl
@ 2008-02-15 15:41 ` Jonathan Brassow
  2008-02-15 16:17   ` Lajkó Attila
  1 sibling, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-15 15:41 UTC (permalink / raw)
  To: LVM general discussion and development

Are all the packages rhel4.6 as well, or have you compiled pkgs  
yourself?

What was the load you had on the system?

The messages I see from dm-cmirror suggest that it is properly
shutting down in the face of the failure... However, before it has
finished, we can see "Failed to remove faulty devices in
vgtest-lvtest".  This suggests to me that clvmd is not waiting long
enough for the shutdown to complete, but I only see 3 seconds of the
log.  When was the device failure initiated?
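
The ordering argument above can be checked mechanically. The following is only a rough sketch: it embeds three consecutive lines copied from the posted log and compares where the lvm failure message and the final dm-cmirror "Suspending now" message appear. The variable and function names are made up for illustration; on a real node the input would come from /var/log/messages instead.

```shell
#!/bin/sh
# Three consecutive lines copied from the log in the original report.
log='Feb 15 12:29:27 el42 lvm[4929]: Failed to remove faulty devices in vgtest-lvtest
Feb 15 12:29:27 el42 kernel: dm-cmirror:   node_count  : 1
Feb 15 12:29:27 el42 kernel: dm-cmirror: Suspending now (ItlWCmkP)'

# Print the line number of the first line in $log matching the pattern.
first_line() {
    printf '%s\n' "$log" | grep -n -e "$1" | head -n 1 | cut -d: -f1
}

failed=$(first_line 'Failed to remove faulty devices')
suspended=$(first_line 'Suspending now')

# Compare positions rather than timestamps: both events share one second.
if [ "$failed" -lt "$suspended" ]; then
    echo "repair attempt was logged before cmirror finished suspending"
else
    echo "cmirror finished suspending before the repair attempt"
fi
```

Line position is used because syslog timestamps only have one-second resolution, so the two events cannot be ordered by time alone.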

  brassow



* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-15 15:41 ` Jonathan Brassow
@ 2008-02-15 16:17   ` Lajkó Attila
  2008-02-15 21:06     ` Jonathan Brassow
  0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-15 16:17 UTC (permalink / raw)
  To: LVM general discussion and development



On Feb 15, 2008, at 4:41 PM, Jonathan Brassow wrote:

> Are all the packages rhel4.6 as well, or have you compiled pkgs  
> yourself?

All the packages are binaries from rhel4.6:
lvm2-cluster-2.02.27-2.el4_6.1
cmirror-1.0.1-1
cmirror-kernel-xenU-2.6.9-38.5

>
>
> What was the load you had on the system?

Very low, approx. 0.

> The messages I see from dm-cmirror suggest that it is properly
> shutting down in the face of the failure... However, before it has
> finished, we can see "Failed to remove faulty devices in
> vgtest-lvtest".  This suggests to me that clvmd is not waiting long
> enough for the shutdown to complete, but I only see 3 seconds of the
> log.  When was the device failure initiated?
>
> brassow
>
>

The failure was initiated at the beginning of the log. I uploaded the
complete messages files from both nodes (el42 and el4) to FTP:

ftp://ftp.ulx.hu/upload/clvmd/messages.el4
ftp://ftp.ulx.hu/upload/clvmd/messages.el42

Attila


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-15 16:17   ` Lajkó Attila
@ 2008-02-15 21:06     ` Jonathan Brassow
  2008-02-17 12:49       ` Lajkó Attila
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-15 21:06 UTC (permalink / raw)
  To: LVM general discussion and development

If the problem is reproducible, we should be able to track it down.

When a failure happens, the kernel sends an event to userspace that  
signals 'dmeventd' to take action.  If we take dmeventd out of the  
picture, we can run the commands ourselves with higher verbose settings.

When you activate the volume, you can run
'lvchange -ay --monitor n <vg>/<lv>' - this will prevent dmeventd from
monitoring the mirror.  Then kill the log device.  Finally, run
'vgreduce --removemissing <VG> -vvvv' to perform the recovery.
(Redirecting all the output to a file will give us something to look at
if the failure is reproduced.)
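
The steps above could be strung together roughly as follows. This is a sketch under assumptions, not a tested procedure: the VG and LV names default to vgtest and lvtest as implied by the device name vgtest-lvtest in the log, the output file name is made up, and the script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# VG/LV default to the names implied by "vgtest-lvtest" in the log.
# OUT is a hypothetical file name for the captured debug output.
VG="${VG:-vgtest}"
LV="${LV:-lvtest}"
OUT="${OUT:-vgreduce-debug.log}"

# Step 1: activate the mirror without dmeventd monitoring it.
activate_cmd() {
    echo "lvchange -ay --monitor n $VG/$LV"
}

# Step 3: run the repair by hand with maximum verbosity.
recover_cmd() {
    echo "vgreduce --removemissing $VG -vvvv"
}

if [ "${RUN:-0}" = "1" ]; then
    # Really execute the commands; only do this on a test cluster.
    $(activate_cmd)
    printf 'Fail the log device now, then press Enter: '
    read -r reply
    $(recover_cmd) > "$OUT" 2>&1
else
    # Default: a dry run that just prints the plan.
    echo "1. $(activate_cmd)"
    echo "2. (fail the log device by hand)"
    echo "3. $(recover_cmd) > $OUT 2>&1"
fi
```

Defaulting to a dry run is a deliberate choice here, since vgreduce --removemissing changes volume group metadata.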

We may need to grab debugging output from clvmd too, but that can get  
messy, so we'll start with this.

  brassow

P.S.  It looks like you must have *.debug; in your /etc/syslog.conf,  
yes?
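
One way to answer that question is to look for a debug-priority selector in the syslog configuration. The sketch below runs against a hypothetical sample line rather than a real file; the sample selector and the function name are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical selector line; a real /etc/syslog.conf will differ.
sample='*.debug;mail.none;authpriv.none;cron.none    /var/log/messages'

# Reads syslog.conf text on stdin; succeeds if any non-comment line
# contains a selector at debug priority.
has_debug_selector() {
    grep -q -e '^[^#]*\.debug'
}

if printf '%s\n' "$sample" | has_debug_selector; then
    echo "debug-priority messages are being logged"
else
    echo "no debug-priority selector found"
fi
```

On an actual node the same check would be run as `has_debug_selector < /etc/syslog.conf`.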


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-15 21:06     ` Jonathan Brassow
@ 2008-02-17 12:49       ` Lajkó Attila
  2008-02-18 16:08         ` Jonathan Brassow
  0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-17 12:49 UTC (permalink / raw)
  To: LVM general discussion and development

Here is the output of vgreduce:
ftp://ftp.ulx.hu/upload/clvmd/vgreduce.tar.gz

Attila



* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-17 12:49       ` Lajkó Attila
@ 2008-02-18 16:08         ` Jonathan Brassow
  2008-02-18 16:48           ` Lajkó Attila
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-18 16:08 UTC (permalink / raw)
  To: LVM general discussion and development

How are you performing the failures?  It looks like just one machine
is losing its connection to the device, while the other machine's
links remain in place.

  brassow


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-18 16:08         ` Jonathan Brassow
@ 2008-02-18 16:48           ` Lajkó Attila
  2008-02-18 18:30             ` Jonathan Brassow
  0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-18 16:48 UTC (permalink / raw)
  To: LVM general discussion and development

Yes, I disconnected the LUN from one of the nodes (el4) via the VTrak GUI.

Attila

On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:

> How are you performing the failures?  It looks like just one machine
> is losing its connection to the device, while the other machines'
> links remain in place.
>
> brassow
>
> On Feb 17, 2008, at 6:49 AM, Lajkó Attila wrote:
>
>> Here is the output of vgreduce:
>> ftp://ftp.ulx.hu/upload/clvmd/vgreduce.tar.gz
>>
>> Attila
>>
>>
>> On Feb 15, 2008, at 22:06, Jonathan Brassow <jbrassow@redhat.com> wrote:
>>
>>> If the problem is reproducible, we should be able to track it down.
>>>
>>> When a failure happens, the kernel sends an event to userspace that
>>> signals 'dmeventd' to take action. If we take dmeventd out of the
>>> picture, we can run the commands ourselves with higher verbose
>>> settings.
>>>
>>> When you activate the volume, you can run
>>> 'lvchange -ay --monitor n <vg>/<lv>' - this will prevent dmeventd from
>>> monitoring the mirror. Then kill the log device. Finally, run
>>> 'vgreduce --removemissing <VG> -vvvv' to perform the recovery.
>>> (Redirecting all the output to a file will give us something to look
>>> at if the failure is reproduced.)
>>>
>>> We may need to grab debugging output from clvmd too, but that can  
>>> get
>>> messy, so we'll start with this.
>>>
>>> brassow
>>>
>>> P.S. It looks like you must have *.debug; in your /etc/syslog.conf,
>>> yes?
>>>
>>> On Feb 15, 2008, at 10:17 AM, Lajkó Attila wrote:


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-18 16:48           ` Lajkó Attila
@ 2008-02-18 18:30             ` Jonathan Brassow
  2008-02-19 13:51               ` Lajkó Attila
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-18 18:30 UTC (permalink / raw)
  To: LVM general discussion and development

That's the problem.  (C)LVM does not currently have a way to handle  
the disappearance of a device from just one machine... it expects a  
similar view of devices from all machines in a cluster.  "Locking" on  
the second node fails because it doesn't know what to do with the disk  
that it sees (that has been removed from the node suffering the  
failure).
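
To illustrate the consistency requirement described above, here is a minimal Python sketch that flags devices that are not visible from every node. It is not part of LVM or clvmd; the function name `inconsistent_devices`, the node names, and the PV labels are hypothetical. It assumes you have already collected, per node, the set of PV UUIDs that node can see (for example, from `pvs` output gathered on each machine).

```python
def inconsistent_devices(views):
    """Map each PV to the sorted list of nodes that cannot see it.

    `views` is a dict of node name -> set of PV UUIDs visible on that node.
    PVs that every node can see are left out of the result.
    """
    all_nodes = set(views)
    every_pv = set().union(*views.values()) if views else set()
    report = {}
    for pv in sorted(every_pv):
        missing = sorted(node for node in all_nodes if pv not in views[node])
        if missing:
            report[pv] = missing
    return report


# Hypothetical example: one node has lost a device that the other still sees.
views = {"el4": {"pv-A"}, "el42": {"pv-A", "pv-B"}}
print(inconsistent_devices(views))  # {'pv-B': ['el4']}
```

An empty result would mean all nodes share the same view of the devices, which is the situation (C)LVM expects.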

There is someone working on handling orphaned/removed devices that
reappear (which would be similar to this case), but I'm not sure if
they've taken this scenario into account.  I'll let him know about this.

  brassow

On Feb 18, 2008, at 10:48 AM, Lajkó Attila wrote:

> Yes, I disconnected the LUN from one of the nodes (el4) via the VTrak GUI.
>
> Attila
>
> On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:
>
>> How are you performing the failures?  It looks like just one
>> machine is losing its connection to the device, while the other
>> machines' links remain in place.
>>
>> brassow
>>
>> On Feb 17, 2008, at 6:49 AM, Lajkó Attila wrote:
>>
>>> Here is the output of vgreduce:
>>> ftp://ftp.ulx.hu/upload/clvmd/vgreduce.tar.gz
>>>
>>> Attila
>>>
>>>
>>> On Feb 15, 2008, at 22:06, Jonathan Brassow <jbrassow@redhat.com> wrote:
>>>
>>>> If the problem is reproducible, we should be able to track it down.
>>>>
>>>> When a failure happens, the kernel sends an event to userspace that
>>>> signals 'dmeventd' to take action. If we take dmeventd out of the
>>>> picture, we can run the commands ourselves with higher verbose
>>>> settings.
>>>>
>>>> When you activate the volume, you can run
>>>> 'lvchange -ay --monitor n <vg>/<lv>' - this will prevent dmeventd from
>>>> monitoring the mirror. Then kill the log device. Finally, run
>>>> 'vgreduce --removemissing <VG> -vvvv' to perform the recovery.
>>>> (Redirecting all the output to a file will give us something to look
>>>> at if the failure is reproduced.)
>>>>
>>>> We may need to grab debugging output from clvmd too, but that can  
>>>> get
>>>> messy, so we'll start with this.
>>>>
>>>> brassow
>>>>
>>>> P.S. It looks like you must have *.debug; in your /etc/syslog.conf,
>>>> yes?
>>>>
>>>> On Feb 15, 2008, at 10:17 AM, Lajkó Attila wrote:


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-18 18:30             ` Jonathan Brassow
@ 2008-02-19 13:51               ` Lajkó Attila
  2008-02-19 15:43                 ` Jonathan Brassow
  0 siblings, 1 reply; 13+ messages in thread
From: Lajkó Attila @ 2008-02-19 13:51 UTC (permalink / raw)
  To: LVM general discussion and development

I disconnected the LUN from both nodes and it worked: the LV went to
linear. Thanks for the help!
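
One way to confirm the conversion is to look at the first character of the `lv_attr` field reported by `lvs`, which is `m` (or `M`) for a mirrored volume and `-` for a plain one. The helper below is only a rough sketch; the name `lv_layout` and the returned labels are made up for illustration.

```python
def lv_layout(lv_attr):
    """Classify an LV from the first character of its lvs attribute string."""
    first = lv_attr.strip()[:1]
    if first in ("m", "M"):
        return "mirrored"
    if first == "-":
        # No special volume type: a plain linear (or striped) LV.
        return "linear-or-striped"
    return "other"


print(lv_layout("mwi-ao"))  # mirrored
print(lv_layout("-wi-ao"))  # linear-or-striped
```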

One more question: what do you think about this testing scenario? A
two-node cluster, where each node has a local disk that is exported via
GNBD to itself and to the other node. I build a cluster-mirrored LV on
the GNBD devices, then I kill one of the nodes. Should cmirror work well
in this case? Will the LV go to linear mode?

Attila

On Feb 18, 2008, at 7:30 PM, Jonathan Brassow wrote:

> That's the problem.  (C)LVM does not currently have a way to handle  
> the disappearance of a device from just one machine... it expects a  
> similar view of devices from all machines in a cluster.  "Locking"  
> on the second node fails because it doesn't know what to do with the  
> disk that it sees (that has been removed from the node suffering the  
> failure).
>
> There is someone working on handling orphaned/removed devices that
> reappear (which would be similar to this case), but I'm not sure if  
> they've taken this scenario into account.  I'll let him know about  
> this.
>
> brassow
>
> On Feb 18, 2008, at 10:48 AM, Lajkó Attila wrote:
>
>> Yes, I disconnected the LUN from one of the nodes (el4) via the VTrak
>> GUI.
>>
>> Attila
>>
>> On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:
>>
>>> How are you performing the failures?  It looks like just one
>>> machine is losing its connection to the device, while the other
>>> machines' links remain in place.
>>>
>>> brassow
>>>
>>>


* Re: [linux-lvm] mirrored LV + cmirror problem
  2008-02-19 13:51               ` Lajkó Attila
@ 2008-02-19 15:43                 ` Jonathan Brassow
  0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Brassow @ 2008-02-19 15:43 UTC (permalink / raw)
  To: LVM general discussion and development

I've heard of people trying to create GNBD SANs in the past...  Some  
have gone so far as to have many nodes and pool the mirrors together...

It can be tricky.  IIRC, one of the trickiest parts was bringing the  
node back, reinserting the GNBD disk into the volume group, and  
converting back to mirror.
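
As a rough sketch of the reintegration steps mentioned above, the helper below only builds the command lines for putting a returned disk back into the volume group and converting the LV back to a mirror; it does not run anything. The function name `remirror_commands` and the VG, LV, and device names are hypothetical, and whether extra steps are needed (for example, re-initializing the PV first) depends on the setup and is not something this thread confirms.

```python
def remirror_commands(vg, lv, pv, mirrors=1):
    """Return the LVM command lines (as argument lists) for re-mirroring.

    Nothing is executed here; review the commands before running them.
    """
    return [
        ["vgextend", vg, pv],                             # add the disk back to the VG
        ["lvconvert", "-m", str(mirrors), f"{vg}/{lv}"],  # convert the LV back to a mirror
    ]


# Hypothetical names, for illustration only.
for cmd in remirror_commands("vg_example", "lv_example", "/dev/gnbd/example"):
    print(" ".join(cmd))
```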

Also, I think the remote node uses the GNBD interface, while the local  
node simply uses the local interface.  You may be able to use iSCSI  
too, but I'm not sure.

  brassow

On Feb 19, 2008, at 7:51 AM, Lajkó Attila wrote:

> I disconnected the LUN from both nodes and it worked: the LV went to
> linear. Thanks for the help!
>
> One more question: what do you think about this testing scenario? A
> two-node cluster, where each node has a local disk that is exported via
> GNBD to itself and to the other node. I build a cluster-mirrored LV on
> the GNBD devices, then I kill one of the nodes. Should cmirror work well
> in this case? Will the LV go to linear mode?
>
> Attila
>
> On Feb 18, 2008, at 7:30 PM, Jonathan Brassow wrote:
>
>> That's the problem.  (C)LVM does not currently have a way to handle  
>> the disappearance of a device from just one machine... it expects a  
>> similar view of devices from all machines in a cluster.  "Locking"  
>> on the second node fails because it doesn't know what to do with  
>> the disk that it sees (that has been removed from the node  
>> suffering the failure).
>>
>> There is someone working on handling orphaned/removed devices that
>> reappear (which would be similar to this case), but I'm not sure if  
>> they've taken this scenario into account.  I'll let him know about  
>> this.
>>
>> brassow
>>
>> On Feb 18, 2008, at 10:48 AM, Lajkó Attila wrote:
>>
>>> Yes, I disconnected the LUN from one of the nodes (el4) via the VTrak
>>> GUI.
>>>
>>> Attila
>>>
>>> On Feb 18, 2008, at 5:08 PM, Jonathan Brassow wrote:
>>>
>>>> How are you performing the failures?  It looks like just one
>>>> machine is losing its connection to the device, while the other
>>>> machines' links remain in place.
>>>>
>>>> brassow
>>>>
>>>>


end of thread, other threads:[~2008-02-19 15:43 UTC | newest]

Thread overview: 13+ messages
2008-02-15 11:36 [linux-lvm] mirrored LV + cmirror problem Lajkó Attila
2008-02-15 12:02 ` Michael Eisenkölbl
2008-02-15 13:26   ` Lajkó Attila
2008-02-15 13:36     ` Michael Eisenkölbl
2008-02-15 15:41 ` Jonathan Brassow
2008-02-15 16:17   ` Lajkó Attila
2008-02-15 21:06     ` Jonathan Brassow
2008-02-17 12:49       ` Lajkó Attila
2008-02-18 16:08         ` Jonathan Brassow
2008-02-18 16:48           ` Lajkó Attila
2008-02-18 18:30             ` Jonathan Brassow
2008-02-19 13:51               ` Lajkó Attila
2008-02-19 15:43                 ` Jonathan Brassow
