All of lore.kernel.org
 help / color / mirror / Atom feed
* PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines
@ 2005-09-14  4:32 Neil Brown
  2005-09-16 17:32 ` Alasdair G Kergon
  2005-09-28 17:48 ` Alasdair G Kergon
  0 siblings, 2 replies; 5+ messages in thread
From: Neil Brown @ 2005-09-14  4:32 UTC (permalink / raw)
  To: dm-devel


Hi,
 I recently had to track down a problem with pvmove not terminating on
 an s390x.
 The patch below fixes the problem.

 The problem is specific to 64-bit bigendian machines such as s390x or
 ppc-64.

 The linux bitset operators (test_bit, set_bit etc) work on arrays of 
 "unsigned long".
 dm-log uses such bitsets but treats them often as arrays of uint32_t,
 only allocateing a multiple of 4 bytes (as 'clean_bits' is a uint32_t).
 Further, it only zeros a multiple of 4 bytes.

 In the simplest case, if "region_count" were (say) 30, then
 bitset_size (below) would be 4 and bitset_uint32_count would be 1.
 Thus the memory for this butset, after allocation and zeroing would
 be
    0 0 0 0 X X X X 
 On a bigendian 64bit machine, bit 0 for this bitset is in the 8th
 byte! (and every bit that dm-log would use would be in the X area).

    0 0 0 0 X X X X 
                  ^
                  here

 which hasn't been cleared properly.

 As the dm-raid1 code only syncs and counts regions which have a 0 in
 the 'sync_bits' bitset, and only finishes when it has counted high
 enough, a large number of 1's among those 'X's will cause the sync to
 not complete.

 It is worth noting that the code uses the same bitsets for in-memory
 and on-disk logs.  As these bitsets are host-endian and host-sized,
 this means that they cannot safely be moved between computers with
 different architectures.  I don't know if the dm doco makes this
 clear...

NeilBrown


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/dm-log.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff ./drivers/md/dm-log.c~current~ ./drivers/md/dm-log.c
--- ./drivers/md/dm-log.c~current~	2005-09-14 14:19:51.000000000 +1000
+++ ./drivers/md/dm-log.c	2005-09-14 14:21:56.000000000 +1000
@@ -333,10 +333,10 @@ static int core_ctr(struct dirty_log *lo
 	lc->sync = sync;
 
 	/*
-	 * Work out how many words we need to hold the bitset.
+	 * Work out how many "unsigned long"s we need to hold the bitset.
 	 */
 	bitset_size = dm_round_up(region_count,
-				  sizeof(*lc->clean_bits) << BYTE_SHIFT);
+				  sizeof(unsigned long) << BYTE_SHIFT);
 	bitset_size >>= BYTE_SHIFT;
 
 	lc->bitset_uint32_count = bitset_size / 4;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines
  2005-09-14  4:32 PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines Neil Brown
@ 2005-09-16 17:32 ` Alasdair G Kergon
  2005-09-28 17:48 ` Alasdair G Kergon
  1 sibling, 0 replies; 5+ messages in thread
From: Alasdair G Kergon @ 2005-09-16 17:32 UTC (permalink / raw)
  To: device-mapper development

On Wed, Sep 14, 2005 at 02:32:59PM +1000, Neil Brown wrote:
>  As these bitsets are host-endian and host-sized,
>  this means that they cannot safely be moved between computers with
>  different architectures.  

That needs to be fixed.

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines
  2005-09-14  4:32 PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines Neil Brown
  2005-09-16 17:32 ` Alasdair G Kergon
@ 2005-09-28 17:48 ` Alasdair G Kergon
  2005-10-10  3:22   ` Neil Brown
  1 sibling, 1 reply; 5+ messages in thread
From: Alasdair G Kergon @ 2005-09-28 17:48 UTC (permalink / raw)
  To: Neil Brown; +Cc: dm-devel, Patrick Caulfield

On Wed, Sep 14, 2005 at 02:32:59PM +1000, Neil Brown wrote:
>  It is worth noting that the code uses the same bitsets for in-memory
>  and on-disk logs.  As these bitsets are host-endian and host-sized,
>  this means that they cannot safely be moved between computers with
>  different architectures.  I don't know if the dm doco makes this
>  clear...
 
Are you able to try out the (untested) patch from Patrick below?
Since the current implementation is broken for 64-bit BE we might
as well fix both problems at the same time.

Alasdair


From: Patrick Caulfield <pcaulfie@redhat.com>

The solution seems to be to use to set_le_bit test_le_bit functions that ext2
uses. That makes the BE machines keep the bitmap internally in LE order - it
does mean you can't use any other type of operations on the bitmap words but
that looks to be OK in this instance. The efficiency tradeoff is very minimal as
you would expect for something that ext2 uses.

The nice thing about this way is that there is no on-disk format change for
32bit LE boxes at least, and even nicer is that it removes code and data
structures because log->disk_bits disappears.


Index: linux-2.6.14-rc2/drivers/md/dm-log.c
===================================================================
--- linux-2.6.14-rc2.orig/drivers/md/dm-log.c	2005-09-28 18:32:53.000000000 +0100
+++ linux-2.6.14-rc2/drivers/md/dm-log.c	2005-09-28 18:33:30.000000000 +0100
@@ -157,7 +157,6 @@
 	struct log_header *disk_header;
 
 	struct io_region bits_location;
-	uint32_t *disk_bits;
 };
 
 /*
@@ -166,20 +165,20 @@
  */
 static  inline int log_test_bit(uint32_t *bs, unsigned bit)
 {
-	return test_bit(bit, (unsigned long *) bs) ? 1 : 0;
+	return ext2_test_bit(bit, (unsigned long *) bs) ? 1 : 0;
 }
 
 static inline void log_set_bit(struct log_c *l,
 			       uint32_t *bs, unsigned bit)
 {
-	set_bit(bit, (unsigned long *) bs);
+	ext2_set_bit(bit, (unsigned long *) bs);
 	l->touched = 1;
 }
 
 static inline void log_clear_bit(struct log_c *l,
 				 uint32_t *bs, unsigned bit)
 {
-	clear_bit(bit, (unsigned long *) bs);
+	ext2_clear_bit(bit, (unsigned long *) bs);
 	l->touched = 1;
 }
 
@@ -239,45 +238,24 @@
 /*----------------------------------------------------------------
  * Bits IO
  *--------------------------------------------------------------*/
-static inline void bits_to_core(uint32_t *core, uint32_t *disk, unsigned count)
-{
-	unsigned i;
-
-	for (i = 0; i < count; i++)
-		core[i] = le32_to_cpu(disk[i]);
-}
-
-static inline void bits_to_disk(uint32_t *core, uint32_t *disk, unsigned count)
-{
-	unsigned i;
-
-	/* copy across the clean/dirty bitset */
-	for (i = 0; i < count; i++)
-		disk[i] = cpu_to_le32(core[i]);
-}
-
 static int read_bits(struct log_c *log)
 {
 	int r;
 	unsigned long ebits;
 
 	r = dm_io_sync_vm(1, &log->bits_location, READ,
-			  log->disk_bits, &ebits);
+			  log->clean_bits, &ebits);
 	if (r)
 		return r;
 
-	bits_to_core(log->clean_bits, log->disk_bits,
-		     log->bitset_uint32_count);
 	return 0;
 }
 
 static int write_bits(struct log_c *log)
 {
 	unsigned long ebits;
-	bits_to_disk(log->clean_bits, log->disk_bits,
-		     log->bitset_uint32_count);
 	return dm_io_sync_vm(1, &log->bits_location, WRITE,
-			     log->disk_bits, &ebits);
+			     log->clean_bits, &ebits);
 }
 
 /*----------------------------------------------------------------
@@ -433,11 +411,6 @@
 	size = dm_round_up(lc->bitset_uint32_count * sizeof(uint32_t),
 			   1 << SECTOR_SHIFT);
 	lc->bits_location.count = size >> SECTOR_SHIFT;
-	lc->disk_bits = vmalloc(size);
-	if (!lc->disk_bits) {
-		vfree(lc->disk_header);
-		goto bad;
-	}
 	return 0;
 
  bad:
@@ -451,7 +424,6 @@
 	struct log_c *lc = (struct log_c *) log->context;
 	dm_put_device(lc->ti, lc->log_dev);
 	vfree(lc->disk_header);
-	vfree(lc->disk_bits);
 	core_dtr(log);
 }
 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines
  2005-09-28 17:48 ` Alasdair G Kergon
@ 2005-10-10  3:22   ` Neil Brown
  2005-10-10  7:27     ` Patrick Caulfield
  0 siblings, 1 reply; 5+ messages in thread
From: Neil Brown @ 2005-10-10  3:22 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: dm-devel

On Wednesday September 28, agk@redhat.com wrote:
> On Wed, Sep 14, 2005 at 02:32:59PM +1000, Neil Brown wrote:
> >  It is worth noting that the code uses the same bitsets for in-memory
> >  and on-disk logs.  As these bitsets are host-endian and host-sized,
> >  this means that they cannot safely be moved between computers with
> >  different architectures.  I don't know if the dm doco makes this
> >  clear...
>  
> Are you able to try out the (untested) patch from Patrick below?
> Since the current implementation is broken for 64-bit BE we might
> as well fix both problems at the same time.
> 

I've ask the person who reported the problem to test this patch.  I'll
let you know what eventuates.

... it's a pity about the 'ext2' appearing in there instead of a more
sensible 'set_le_bit', isn't it :-(

Thanks,

NeilBrown


> Alasdair
> 
> 
> From: Patrick Caulfield <pcaulfie@redhat.com>
> 
> The solution seems to be to use to set_le_bit test_le_bit functions that ext2
> uses. That makes the BE machines keep the bitmap internally in LE order - it
> does mean you can't use any other type of operations on the bitmap words but
> that looks to be OK in this instance. The efficiency tradeoff is very minimal as
> you would expect for something that ext2 uses.
> 
> The nice thing about this way is that there is no on-disk format change for
> 32bit LE boxes at least, and even nicer is that it removes code and data
> structures because log->disk_bits disappears.
> 
> 
> Index: linux-2.6.14-rc2/drivers/md/dm-log.c
> ===================================================================
> --- linux-2.6.14-rc2.orig/drivers/md/dm-log.c	2005-09-28 18:32:53.000000000 +0100
> +++ linux-2.6.14-rc2/drivers/md/dm-log.c	2005-09-28 18:33:30.000000000 +0100
> @@ -157,7 +157,6 @@
>  	struct log_header *disk_header;
>  
>  	struct io_region bits_location;
> -	uint32_t *disk_bits;
>  };
>  
>  /*
> @@ -166,20 +165,20 @@
>   */
>  static  inline int log_test_bit(uint32_t *bs, unsigned bit)
>  {
> -	return test_bit(bit, (unsigned long *) bs) ? 1 : 0;
> +	return ext2_test_bit(bit, (unsigned long *) bs) ? 1 : 0;
>  }

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines
  2005-10-10  3:22   ` Neil Brown
@ 2005-10-10  7:27     ` Patrick Caulfield
  0 siblings, 0 replies; 5+ messages in thread
From: Patrick Caulfield @ 2005-10-10  7:27 UTC (permalink / raw)
  Cc: dm-devel

Neil Brown wrote:
> On Wednesday September 28, agk@redhat.com wrote:
> 
>>On Wed, Sep 14, 2005 at 02:32:59PM +1000, Neil Brown wrote:
>>
>>> It is worth noting that the code uses the same bitsets for in-memory
>>> and on-disk logs.  As these bitsets are host-endian and host-sized,
>>> this means that they cannot safely be moved between computers with
>>> different architectures.  I don't know if the dm doco makes this
>>> clear...
>>
>> 
>>Are you able to try out the (untested) patch from Patrick below?
>>Since the current implementation is broken for 64-bit BE we might
>>as well fix both problems at the same time.
>>
> 
> 
> I've ask the person who reported the problem to test this patch.  I'll
> let you know what eventuates.

Thanks

> ... it's a pity about the 'ext2' appearing in there instead of a more
> sensible 'set_le_bit', isn't it :-(
> 

yes. In some (well, the big-endian) architectures it's called set_le_bit and
#defined to ext2_set_bit, bit the LE architectures just define ext2_set_bit to
be set_bit :(

-- 

patrick

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2005-10-10  7:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-14  4:32 PATCH - pvmove doesn't terminate sometimes on 64-bit bigendian machines Neil Brown
2005-09-16 17:32 ` Alasdair G Kergon
2005-09-28 17:48 ` Alasdair G Kergon
2005-10-10  3:22   ` Neil Brown
2005-10-10  7:27     ` Patrick Caulfield

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.