* BUG in dm/dm-mirror module?
From: malahal @ 2007-08-11  1:08 UTC
  To: dm-devel

Hi, I am trying to create a mirror with a mirrored disk log. I have four
block devices: two for the log (which is itself a mirror!) and two for the
actual mirror device.  But I can't use the mirror device at all; it hangs
on any read or write. The dmsetup calls are below. I am using RHEL5
(2.6.18-8.el5). This looks like a mirror module bug, and I would
appreciate any help.

dev1="/dev/sda1"
dev2="/dev/sdb1"
dev3="/dev/sdc1"
dev4="/dev/sdd1"
echo "0 8192 mirror core 1 512 2 $dev1 0 $dev2 0" | dmsetup create log
echo "0 24576 mirror disk 2 /dev/mapper/log 512 2 $dev3 0 $dev4 0" | dmsetup create mirror
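
For readers unfamiliar with the table layout used in the two commands above, here is a minimal sketch that rebuilds the same table lines without touching any device. It assumes the mirror target layout `<start> <length> mirror <log_type> <#log_args> <log_args...> <#devs> <dev> <offset>...`, with every leg at offset 0. The helper name `mirror_table` is hypothetical and not part of dmsetup.

```shell
# Hypothetical helper: print a dm "mirror" table line, nothing is created.
# Usage: mirror_table LENGTH LOG_TYPE "LOG_ARGS" DEV [DEV...]
mirror_table() {
    length=$1
    log_type=$2
    log_args=$3
    shift 3
    # Count the words in the log argument string (e.g. "512" -> 1).
    n_log_args=$(set -- $log_args; echo $#)
    # Every mirror leg is given as "<device> <offset>"; offset 0 is assumed.
    legs=""
    for dev in "$@"; do
        legs="$legs $dev 0"
    done
    echo "0 $length mirror $log_type $n_log_args $log_args $#$legs"
}

# The log device: core (in-memory) log, region size 512 sectors, two legs.
mirror_table 8192 core "512" /dev/sda1 /dev/sdb1
# The data mirror: disk log stored on /dev/mapper/log, same region size.
mirror_table 24576 disk "/dev/mapper/log 512" /dev/sdc1 /dev/sdd1
```

Printing the line first makes it easier to check the argument counts before piping it to `dmsetup create`.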

The following are the stack traces for kmirrord and the "dd
if=/dev/mapper/mirror of=/dev/null" command on the mirror: 

crash> bt 312
PID: 312    TASK: c00000001dd311d0  CPU: 1   COMMAND: "kmirrord"
 #0 [c00000001dd577d0] .schedule at c0000000003483c8
 #1 [c00000001dd578e0] .io_schedule at c000000000349094
 #2 [c00000001dd57970] .sync_io at d0000000002777f4
 #3 [c00000001dd57a20] .dm_io_sync_vm at d0000000002778a0
 #4 [c00000001dd57af0] .disk_flush at d0000000000808cc
 #5 [c00000001dd57b90] .do_work at d00000000008342c
 #6 [c00000001dd57d50] .run_workqueue at c00000000007b76c
 #7 [c00000001dd57df0] .worker_thread at c00000000007c4d8
 #8 [c00000001dd57ee0] .kthread at c000000000080fb8
 #9 [c00000001dd57f90] .kernel_thread at c0000000000264bc

crash> bt 2489
PID: 2489   TASK: c00000000747cbc0  CPU: 1   COMMAND: "dd"
 #0 [c00000000717b5c0] .schedule at c0000000003483c8
 #1 [c00000000717b6d0] .io_schedule at c000000000349094
 #2 [c00000000717b760] .sync_page at c0000000000aa180
 #3 [c00000000717b7e0] .__wait_on_bit_lock at c0000000003492c0
 #4 [c00000000717b880] .__lock_page at c0000000000a9fa4
 #5 [c00000000717b950] .do_generic_mapping_read at c0000000000aad00
 #6 [c00000000717baa0] .__generic_file_aio_read at c0000000000aba74
 #7 [c00000000717bb70] .generic_file_read at c0000000000ad0c8
 #8 [c00000000717bcf0] .vfs_read at c0000000000e0f10
 #9 [c00000000717bd90] .sys_read at c0000000000e13f4
#10 [c00000000717be30] syscall_exit at c00000000000869c
 syscall  [c00] exception frame:
 R0:  0000000000000003    R1:  00000000ffedf650    R2:  000000000fff9450   
 R3:  0000000000000000    R4:  0000000010030000    R5:  0000000000000200   
 R6:  000000001001d1a8    R7:  0000000000000000    R8:  00000000000001ff   
 R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000   
 R12: 0000000000000000    R13: 0000000010025144    R14: 0000000000000000   
 R15: 0000000010040a60    R16: 000000001001d190    R17: 000000001001d198   
 R18: 0000000000000000    R19: 000000001001d1f4    R20: 000000001001d128   
 R21: 000000001001d178    R22: 000000001001d160    R23: 000000001001d1bc   
 R24: 0000000010030000    R25: 000000001001d170    R26: 000000001001d170   
 R27: 0000000000000000    R28: 0000000010030000    R29: 0000000000000200   
 R30: 000000001001d008    R31: 0000000010030200   
 NIP: 000000000ff1a6d4    MSR: 000000000000d032    OR3: 0000000000000000
 CTR: 000000000ff1a6c0    LR:  0000000010001df4    XER: 0000000000000000
 CCR: 0000000044000428    MQ:  c000000000421ae8    DAR: 0000000010050c64
 DSISR: 0000000042000000     Syscall Result: 0000000000000200


* Re: BUG in dm/dm-mirror module?
From: Milan Broz @ 2007-08-11  8:52 UTC
  To: device-mapper development

malahal@us.ibm.com wrote:
> Hi, I am trying to create a mirrored disk log. I have four block
> devices, two for the log (which is a mirror!) and two for the actual
> mirror device.  But I can't use the mirror device at all. It just hangs
> for any read/write. Here are the details of dmsetup calls. I am using
> RHEL5 (2.6.18-8.el5). Looks like a mirror module bug and I appreciate
> any help.
> 
> dev1="/dev/sda1"
> dev2="/dev/sdb1"
> dev3="/dev/sdc1"
> dev4="/dev/sdd1"
> echo "0 8192 mirror core 1 512 2 $dev1 0 $dev2 0" | dmsetup create log
> echo "0 24576 mirror disk 2 /dev/mapper/log 512 2 $dev3 0 $dev4 0" | dmsetup create mirror

Hi,

Yes, there is a known problem with having only one kmirrord thread when
using a mirrored log (i.e. a mirror over a mirror).

For a description of the problem, see this patch for the upstream kernel:
http://www2.kernel.org/pub/linux/kernel/people/agk/patches/2.6/2.6.21/dm-raid1-one-kmirrord-per-mirror.patch

All RHEL5 test kernels from 2.6.18-18 onward include this fix,
so for testing purposes you can try the RHEL5.1 beta kernel.
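
A rough way to check a kernel release string against the 2.6.18-18 threshold mentioned here is sketched below. It is only an illustration: the function name `has_kmirrord_fix` is made up, and the check only makes sense for RHEL5-style `2.6.18-<build>` release strings. It says nothing about upstream or other vendors' kernels.

```shell
# Hypothetical check, valid only for RHEL5-style "2.6.18-<build>..." strings.
# Returns 0 if build >= 18 (fix expected), 1 if older, 2 if not applicable.
has_kmirrord_fix() {
    release=$1
    case $release in
        2.6.18-*) ;;
        *) return 2 ;;              # not a RHEL5 2.6.18 kernel
    esac
    build=${release#2.6.18-}        # e.g. "8.el5"
    build=${build%%.*}              # e.g. "8"
    case $build in
        ''|*[!0-9]*) return 2 ;;    # build number is not a plain integer
    esac
    [ "$build" -ge 18 ]
}
```

For example, `has_kmirrord_fix "$(uname -r)"` could be used in an `if` before setting up a mirrored log; the kernel from the original report, 2.6.18-8.el5, falls below the threshold.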

Milan
--
mbroz@redhat.com


* Re: BUG in dm/dm-mirror module?
From: Jonathan Brassow @ 2007-08-13 15:18 UTC
  To: device-mapper development


On Aug 11, 2007, at 3:52 AM, Milan Broz wrote:

> malahal@us.ibm.com wrote:
>> Hi, I am trying to create a mirrored disk log. I have four block
>> devices, two for the log (which is a mirror!) and two for the actual
>> mirror device.  But I can't use the mirror device at all. It just hangs
>> for any read/write. Here are the details of dmsetup calls. I am using
>> RHEL5 (2.6.18-8.el5). Looks like a mirror module bug and I appreciate
>> any help.
>>
>> dev1="/dev/sda1"
>> dev2="/dev/sdb1"
>> dev3="/dev/sdc1"
>> dev4="/dev/sdd1"
>> echo "0 8192 mirror core 1 512 2 $dev1 0 $dev2 0" | dmsetup create log
>> echo "0 24576 mirror disk 2 /dev/mapper/log 512 2 $dev3 0 $dev4 0" | dmsetup create mirror
>
> Hi,
>
> yes, there is known problem with one kmirrord thread and using mirrored log.
> (i.e. mirror over mirror)
>
> For problem description see this patch for upstream kernel
> http://www2.kernel.org/pub/linux/kernel/people/agk/patches/2.6/2.6.21/dm-raid1-one-kmirrord-per-mirror.patch
>
> All testing RHEL5 kernels from 2.6.18-18 has this fix included,
> so for testing purposes you can try RHEL5.1 beta kernel.

On a different topic, why are you mirroring the log?  Isn't this  
somewhat dangerous?

Let's say that the primary copy of the log dies or goes offline.  You  
continue on because the log device is still "good".  If your machine  
crashes and the primary log device is "rediscovered" on bootup, what  
happens?  The contents of the stale side will be copied - resulting  
in your log not properly reflecting the state of your mirror device  
and maybe even leaving inconsistencies.
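
The scenario in the paragraph above can be walked through with a toy model. This is only a sketch under loud assumptions: each log leg is reduced to a list of dirty region numbers, and "rediscovery" is modelled as the primary leg being copied over the secondary. The variable and function names are invented for illustration and do not correspond to any kernel code.

```shell
# Toy model (not kernel code): each log leg holds a list of dirty regions.
primary_log=""       # dirty regions recorded on the primary log leg
secondary_log=""     # dirty regions recorded on the secondary log leg
primary_online=yes

# A write to the data mirror dirties region $1; only online legs record it.
record_dirty() {
    secondary_log="${secondary_log:+$secondary_log }$1"
    if [ "$primary_online" = yes ]; then
        primary_log="${primary_log:+$primary_log }$1"
    fi
}

# Assumed recovery behaviour: the primary leg overwrites the secondary.
resync_from_primary() {
    secondary_log=$primary_log
}

record_dirty 0           # both legs record region 0
primary_online=no        # primary log leg dies or goes offline
record_dirty 2           # only the surviving leg learns about region 2
primary_online=yes       # crash, reboot, primary leg is "rediscovered"
resync_from_primary      # stale contents are copied over the good leg

echo "dirty regions known after reboot: $secondary_log"
```

In this model the log ends up knowing only about region 0, although region 2 of the data mirror was also written while the primary log leg was offline.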

You might argue that we should update the metadata to exclude the  
failed primary at the point of failure.  Two things come to mind:
1) log I/O will continue until you take action - leaving you open to  
the scenario above
2) it would be simpler to just allocate a new log (since you are  
changing metadata anyway) and initialize the log as "in-sync" if the  
mirror is already "in-sync".

If you ignore the possibility of transient device failures, mirroring  
the log might make some sense.  You gain an advantage only at the  
times when a log device fails and:
1) the machine fails before the initial resync has completed
2) the machine fails while assigning a new log device

Ultimately, I think that in order to have a fast solution that allows  
you to do the above (as well as a whole host of other advanced  
features, like real-time mirroring) you need kernel accessible device  
labels on each mirror device and log.  The labels would track things  
like: who's the primary, who's a slave, who's in the group, who's  
failed, etc.  I've seen some people advocate putting this in the log,  
but the log can fail.  (I hope I've already conveyed why I don't  
think it's a good idea to mirror the log.)  I don't have any good  
ideas for making this happen right now.

  brassow


* Re: BUG in dm/dm-mirror module?
From: Phillip Susi @ 2007-08-13 16:48 UTC
  To: device-mapper development

Jonathan Brassow wrote:
> On a different topic, why are you mirroring the log?  Isn't this 
> somewhat dangerous?
> 
> Let's say that the primary copy of the log dies or goes offline.  You 
> continue on because the log device is still "good".  If your machine 
> crashes and the primary log device is "rediscovered" on bootup, what 
> happens?  The contents of the stale side will be copied - resulting in 
> your log not properly reflecting the state of your mirror device and 
> maybe even leaving inconsistencies.

This is a problem with any mirror, not just one holding a mirror log.

> You might argue that we should update the metadata to exclude the failed 
> primary at the point of failure.  Two things come to mind:
> 1) log I/O will continue until you take action - leaving you open to the 
> scenario above
> 2) it would be simpler to just allocate a new log (since you are 
> changing metadata anyway) and initialize the log as "in-sync" if the 
> mirror is already "in-sync".

Yes, once one drive fails, the metadata on the other drive should 
indicate that the mirror is broken and this is now the most up to date 
copy.


* Re: BUG in dm/dm-mirror module?
From: malahal @ 2007-08-13 18:24 UTC
  To: Jonathan Brassow; +Cc: device-mapper development

Jonathan Brassow [jbrassow@redhat.com] wrote:
> 
> Let's say that the primary copy of the log dies or goes offline.  You  
> continue on because the log device is still "good".  If your machine  
> crashes and the primary log device is "rediscovered" on bootup, what  
> happens?  The contents of the stale side will be copied - resulting  
> in your log not properly reflecting the state of your mirror device  
> and maybe even leaving inconsistencies.

How does this work today with a normal mirror? Does the disk log keep
enough information about which device should be the master on reboot?

> Ultimately, I think that in order to have a fast solution that allows  
> you to do the above (as well as a whole host of other advanced  
> features, like real-time mirroring) you need kernel accessible device  
> labels on each mirror device and log.  The labels would track things  
> like: who's the primary, who's a slave, who's in the group, who's  
> failed, etc.  I've seen some people advocate putting this in the log,  
> but the log can fail.  (I hope I've already conveyed why I don't  
> think it's a good idea to mirror the log.)  I don't have any good  
> ideas for making this happen right now.

Yes, having a kernel-accessible label on the mirror device would be the
best way to handle these kinds of scenarios. Another possible option is to
enhance the log module to handle a 'mirrored log', which could record log
device failures in the log itself.

Thanks, Malahal.


* Re: BUG in dm/dm-mirror module?
From: Jonathan Brassow @ 2007-08-13 20:18 UTC
  To: device-mapper development


On Aug 13, 2007, at 11:48 AM, Phillip Susi wrote:

> Jonathan Brassow wrote:
>> On a different topic, why are you mirroring the log?  Isn't this  
>> somewhat dangerous?
>> Let's say that the primary copy of the log dies or goes offline.   
>> You continue on because the log device is still "good".  If your  
>> machine crashes and the primary log device is "rediscovered" on  
>> bootup, what happens?  The contents of the stale side will be  
>> copied - resulting in your log not properly reflecting the state  
>> of your mirror device and maybe even leaving inconsistencies.
>
> This is a problem with any mirror, not just one holding a mirror log.

It is a special problem with the mirror log.

Mirrors will recover themselves and become consistent upon a reboot.   
In the case of a mirror that holds a file system, if you lost some of  
your most recent writes, journaling/fsck will take care of it.  In  
the case of a mirror that holds another mirror's log, you wind up  
with a log that does not contain recent data - and could spell  
coherency issues for the top level mirror.

>
>> You might argue that we should update the metadata to exclude the  
>> failed primary at the point of failure.  Two things come to mind:
>> 1) log I/O will continue until you take action - leaving you open  
>> to the scenario above
>> 2) it would be simpler to just allocate a new log (since you are  
>> changing metadata anyway) and initialize the log as "in-sync" if  
>> the mirror is already "in-sync".
>
> Yes, once one drive fails, the metadata on the other drive should  
> indicate that the mirror is broken and this is now the most up to  
> date copy.

There is no metadata on the other drive, that's part of the problem.   
We must discern between metadata that is made by LVM (or other  
userspace app) and meta-data areas that are known to the device  
mapper target.  Currently, the mirroring target only has the log  
device - which I contend is insufficient.

  brassow


* Re: BUG in dm/dm-mirror module?
From: Phillip Susi @ 2007-08-13 21:21 UTC
  To: device-mapper development

Jonathan Brassow wrote:
> It is a special problem with the mirror log.
> 
> Mirrors will recover themselves and become consistent upon a reboot.  In 
> the case of a mirror that holds a file system, if you lost some of your 
> most recent writes, journaling/fsck will take care of it.  In the case 
> of a mirror that holds another mirror's log, you wind up with a log that 
> does not contain recent data - and could spell coherency issues for the 
> top level mirror.

Having a filesystem that is consistent is still not correct if it is 
older data, at least not when the newer data is available.

> There is no metadata on the other drive, that's part of the problem.  We 
> must discern between metadata that is made by LVM (or other userspace 
> app) and meta-data areas that are known to the device mapper target.  
> Currently, the mirroring target only has the log device - which I 
> contend is insufficient.

LVM needs to update its metadata to indicate that the other drive failed 
and this one now contains more up to date information going forward.


* Re: BUG in dm/dm-mirror module?
From: malahal @ 2007-08-14 15:55 UTC
  To: Jonathan Brassow; +Cc: device-mapper development

Jonathan Brassow [jbrassow@redhat.com] wrote:
> 
> On Aug 13, 2007, at 11:48 AM, Phillip Susi wrote:
> 
> >
> >This is a problem with any mirror, not just one holding a mirror log.
> 
> It is a special problem with the mirror log.
> 
> Mirrors will recover themselves and become consistent upon a reboot.   
> In the case of a mirror that holds a file system, if you lost some of  
> your most recent writes, journaling/fsck will take care of it.  In  

I believe the mirror code handles errors at the region level. So if the
disk failures are transient, one region could be out of sync while the
other regions are updated with the latest data. I don't think a disk with
a few 'out-of-sync' regions can be assumed to have consistent data.
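
To make "region level" concrete for the devices in this thread, here is a small arithmetic sketch. It assumes the region size of 512 sectors and the 24576-sector mirror length from the original dmsetup tables, and it assumes a region index is simply the sector number divided by the region size (offsets within the target are ignored). The function names are hypothetical.

```shell
region_size=512         # sectors per region, from the table lines in this thread
mirror_sectors=24576    # length of the "mirror" device in 512-byte sectors

# Number of regions needed to cover a device (rounded up).
region_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Region that a given sector falls into (assumed: plain integer division).
region_of() {
    echo $(( $1 / $2 ))
}

echo "regions in mirror: $(region_count "$mirror_sectors" "$region_size")"
echo "sector 1000 is in region: $(region_of 1000 "$region_size")"
echo "mirror size in MiB: $(( mirror_sectors * 512 / 1024 / 1024 ))"
```

With these numbers the mirror is tracked as 48 regions of 256 KiB each, so a transient failure can leave some of those regions stale while the rest are current.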

In any case, we need a better method to select the master mirror device.
Does LVM have an extra sector or so that it could give to the kernel module?

Thanks, Malahal.

