All of lore.kernel.org
 help / color / mirror / Atom feed
* device-mapper: fix TB stripe data corruption
@ 2005-01-21 18:12 Alasdair G Kergon
  2005-01-21 20:33 ` Benjamin LaHaise
  0 siblings, 1 reply; 7+ messages in thread
From: Alasdair G Kergon @ 2005-01-21 18:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Missing cast thought to cause data corruption on devices with stripes > ~1TB.

Signed-Off-By: Alasdair G Kergon <agk@redhat.com>
--- diff/drivers/md/dm-stripe.c	2005-01-20 17:32:37.000000000 +0000
+++ source/drivers/md/dm-stripe.c	2005-01-20 17:32:26.000000000 +0000
@@ -179,7 +179,7 @@
 
 	bio->bi_bdev = sc->stripe[stripe].dev->bdev;
 	bio->bi_sector = sc->stripe[stripe].physical_start +
-	    (chunk << sc->chunk_shift) + (offset & sc->chunk_mask);
+	    ((sector_t) chunk << sc->chunk_shift) + (offset & sc->chunk_mask);
 	return 1;
 }
 
@@ -212,7 +212,7 @@
 
 static struct target_type stripe_target = {
 	.name   = "striped",
-	.version= {1, 0, 1},
+	.version= {1, 0, 2},
 	.module = THIS_MODULE,
 	.ctr    = stripe_ctr,
 	.dtr    = stripe_dtr,

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: device-mapper: fix TB stripe data corruption
  2005-01-21 18:12 device-mapper: fix TB stripe data corruption Alasdair G Kergon
@ 2005-01-21 20:33 ` Benjamin LaHaise
  2005-01-21 21:12   ` Kevin Corry
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin LaHaise @ 2005-01-21 20:33 UTC (permalink / raw)
  To: Alasdair G Kergon, Andrew Morton, linux-kernel

On Fri, Jan 21, 2005 at 06:12:03PM +0000, Alasdair G Kergon wrote:
> Missing cast thought to cause data corruption on devices with stripes > ~1TB.

Why not make chunk a sector_t?

		-ben

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: device-mapper: fix TB stripe data corruption
  2005-01-21 20:33 ` Benjamin LaHaise
@ 2005-01-21 21:12   ` Kevin Corry
  2005-01-21 21:20     ` Roland Dreier
  0 siblings, 1 reply; 7+ messages in thread
From: Kevin Corry @ 2005-01-21 21:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: Benjamin LaHaise, Alasdair G Kergon, Andrew Morton

On Friday 21 January 2005 2:33 pm, Benjamin LaHaise wrote:
> On Fri, Jan 21, 2005 at 06:12:03PM +0000, Alasdair G Kergon wrote:
> > Missing cast thought to cause data corruption on devices with stripes >
> > ~1TB.
>
> Why not make chunk a sector_t?

We have to take a mod of the chunk value and the number of stripes (which can 
be a non-power-of-2, so a shift won't work). It's been my understanding that 
you couldn't mod a 64-bit value with a 32-bit value in the kernel. Just to be 
sure, I've changed "chunk" to a sector_t and recompiled, and get the 
following error:

  MODPOST
*** Warning: "__udivdi3" [drivers/md/dm-mod.ko] undefined!
*** Warning: "__umoddi3" [drivers/md/dm-mod.ko] undefined!

-- 
Kevin Corry
kevcorry@us.ibm.com
http://evms.sourceforge.net/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: device-mapper: fix TB stripe data corruption
  2005-01-21 21:12   ` Kevin Corry
@ 2005-01-21 21:20     ` Roland Dreier
  2005-01-21 21:57       ` Kevin Corry
  0 siblings, 1 reply; 7+ messages in thread
From: Roland Dreier @ 2005-01-21 21:20 UTC (permalink / raw)
  To: Kevin Corry
  Cc: linux-kernel, Benjamin LaHaise, Alasdair G Kergon, Andrew Morton

    Kevin> We have to take a mod of the chunk value and the number of
    Kevin> stripes (which can be a non-power-of-2, so a shift won't
    Kevin> work). It's been my understanding that you couldn't mod a
    Kevin> 64-bit value with a 32-bit value in the kernel.

If I understand you correctly, do_div() (defined in <asm/div64.h>)
does what you need.  Look at asm-generic/div64.h for a good
description of the precise semantics of do_div().

 - Roland

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: device-mapper: fix TB stripe data corruption
  2005-01-21 21:20     ` Roland Dreier
@ 2005-01-21 21:57       ` Kevin Corry
  2005-01-22 22:35         ` Alasdair G Kergon
  0 siblings, 1 reply; 7+ messages in thread
From: Kevin Corry @ 2005-01-21 21:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Roland Dreier, Benjamin LaHaise, Alasdair G Kergon, Andrew Morton

On Friday 21 January 2005 3:20 pm, Roland Dreier wrote:
>     Kevin> We have to take a mod of the chunk value and the number of
>     Kevin> stripes (which can be a non-power-of-2, so a shift won't
>     Kevin> work). It's been my understanding that you couldn't mod a
>     Kevin> 64-bit value with a 32-bit value in the kernel.
>
> If I understand you correctly, do_div() (defined in <asm/div64.h>)
> does what you need.  Look at asm-generic/div64.h for a good
> description of the precise semantics of do_div().

Thanks for the tip, Roland. That seems to be exactly what we needed.  Here's a 
different version of Alasdair's patch that changes chunk to 64-bit and uses 
do_div().

-- 
Kevin Corry
kevcorry@us.ibm.com
http://evms.sourceforge.net/


In stripe_map(), change chunk to 64-bit and use do_div to divide and mod by
the number of stripes.

--- diff/drivers/md/dm-stripe.c 2005-01-21 15:55:02.093379864 -0600
+++ source/drivers/md/dm-stripe.c 2005-01-21 15:54:25.400957960 -0600
@@ -173,9 +173,8 @@
  struct stripe_c *sc = (struct stripe_c *) ti->private;
 
  sector_t offset = bio->bi_sector - ti->begin;
- uint32_t chunk = (uint32_t) (offset >> sc->chunk_shift);
- uint32_t stripe = chunk % sc->stripes; /* 32bit modulus */
- chunk = chunk / sc->stripes;
+ sector_t chunk = offset >> sc->chunk_shift;
+ uint32_t stripe = do_div(chunk, sc->stripes);
 
  bio->bi_bdev = sc->stripe[stripe].dev->bdev;
  bio->bi_sector = sc->stripe[stripe].physical_start +
@@ -210,7 +209,7 @@
 
 static struct target_type stripe_target = {
  .name   = "striped",
- .version= {1, 0, 1},
+ .version= {1, 0, 2},
  .module = THIS_MODULE,
  .ctr    = stripe_ctr,
  .dtr    = stripe_dtr,

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: device-mapper: fix TB stripe data corruption
  2005-01-21 21:57       ` Kevin Corry
@ 2005-01-22 22:35         ` Alasdair G Kergon
  0 siblings, 0 replies; 7+ messages in thread
From: Alasdair G Kergon @ 2005-01-22 22:35 UTC (permalink / raw)
  To: Kevin Corry
  Cc: linux-kernel, Roland Dreier, Benjamin LaHaise, Alasdair G Kergon,
	Andrew Morton

On Fri, Jan 21, 2005 at 03:57:38PM -0600, Kevin Corry wrote:
> On Friday 21 January 2005 3:20 pm, Roland Dreier wrote:
> > If I understand you correctly, do_div() (defined in <asm/div64.h>)

I went for the simplest and safest fix first as this is a data
corruption problem and I want assurance that this fixes device-mapper
striping.

I didn't want to change it to do_div() without first checking
it would not slow down the code on the main architectures: on the contrary
I would hope that use of an optimised library inline speeds it up, but 
want to be sure.  You don't need the 64-bit mod until you have hundreds
of TB in a single logical volume block device, filesystem...

So far, I've only seen two test reports, both of which say they
are still seeing data corruption in a filesystem on top of dm-stripe
after applying this patch.  But none of this information so far
is specific enough to say whether the remaining problem(s) is/are
in device-mapper.

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: device-mapper: fix TB stripe data corruption
       [not found] ` <200501251448.31769.kevcorry@us.ibm.com>
@ 2005-01-28  0:06   ` Kevin Corry
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Corry @ 2005-01-28  0:06 UTC (permalink / raw)
  To: agk, Dmitry Yurtaev; +Cc: DevMapper

On Tuesday 25 January 2005 2:48 pm, Kevin Corry wrote:
> On Tuesday 25 January 2005 1:40 pm, Dmitry Yurtaev wrote:
> > i'm sorry for mailing you directly. i was trying to get lvm2 working
> > with 2 striped 1.8T volumes and came across your patch in the lklm. but
> > it didn't help. after some digging i've found few suspicious functions
> > in dm.h, especially this one:
> >
> > --- file: drivers/md/dm.h
> > static inline unsigned long dm_round_up(unsigned long n, unsigned long
> > size) {
> >          unsigned long r = n % size;
> >          return n + (r ? (size - r) : 0);
> > }
> >
> > it is called from drivers/md/dm.c:max_io_len() with two arguments of
> > sector_t type... i've made an ugly fix:
> >
> > static inline sector_t dm_round_up(sector_t n, unsigned long size)
> > {
> >          sector_t q = n;
> >          sector_t r = sector_div(q, size);
> >          return n + (r ? (size - r) : 0);
> > }
> >
> > and that plus your patch eliminated my problem with data corruption.
> >
> > it would be nice if someone knowlegeable will audit dm.[hc]... also it
> > looks to me like there're few places in dm-stripe.c which assume that a
> > stripe size is <2T...
>
> Here's a patch to implement the above change that you suggested. Thanks for
> the feedback!

Here's a corrected version of the dm.h patch from Tuesday. The previous 
version was missing another sector_div() in dm_div_up(). When I compiled it 
the first time, I had DM built as modules, and it all seemed to build 
correctly. Today I switched DM back to statically built and then noticed a 
compile error.

-- 
Kevin Corry
kevcorry@us.ibm.com
http://evms.sourceforge.net/



Make dm_round_up() and dm_div_up() 64-bit safe.

--- diff/drivers/md/dm.h 2005-01-27 18:00:17.918379224 -0600
+++ source/drivers/md/dm.h 2005-01-27 18:00:37.034473136 -0600
@@ -145,18 +145,21 @@
 /*
  * ceiling(n / size) * size
  */
-static inline unsigned long dm_round_up(unsigned long n, unsigned long size)
+static inline sector_t dm_round_up(sector_t n, unsigned long size)
 {
- unsigned long r = n % size;
+ sector_t q = n;
+ sector_t r = sector_div(q, size);
  return n + (r ? (size - r) : 0);
 }
 
 /*
  * Ceiling(n / size)
  */
-static inline unsigned long dm_div_up(unsigned long n, unsigned long size)
+static inline sector_t dm_div_up(sector_t n, unsigned long size)
 {
- return dm_round_up(n, size) / size;
+ sector_t r = dm_round_up(n, size);
+ sector_div(r, size);
+ return r;
 }
 
 static inline sector_t to_sector(unsigned long n)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-01-28  0:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-01-21 18:12 device-mapper: fix TB stripe data corruption Alasdair G Kergon
2005-01-21 20:33 ` Benjamin LaHaise
2005-01-21 21:12   ` Kevin Corry
2005-01-21 21:20     ` Roland Dreier
2005-01-21 21:57       ` Kevin Corry
2005-01-22 22:35         ` Alasdair G Kergon
     [not found] <41F6A093.1020606@meditprofi.ru>
     [not found] ` <200501251448.31769.kevcorry@us.ibm.com>
2005-01-28  0:06   ` Kevin Corry

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.