Distributed Replicated Block Device (DRBD) development
 help / color / mirror / Atom feed
* Re: [Drbd-dev] [DRBD][PATCH] drbd_bitfild_endian.patch
  2007-11-26 16:32 [Drbd-dev] [DRBD][PATCH] drbd_bitfild_endian.patch Maxim Uvarov
@ 2007-11-26 15:06 ` Lars Ellenberg
  2007-11-26 19:28   ` Maxim Uvarov
       [not found]   ` <474C12E0.6020306@ru.mvista.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Lars Ellenberg @ 2007-11-26 15:06 UTC (permalink / raw)
  To: Maxim Uvarov; +Cc: drbd-dev

On Mon, Nov 26, 2007 at 04:32:44PM +0000, Maxim Uvarov wrote:
> Hello all,
> 
> We have found that drbd  state is passed incorrectly between machines  
> with different endianness.
> Attached patch fixes this problem by converting bitfild to int.
> 
> Replacing all bitfileds (both in kernel and userland) isn't safe way 
> (there are
> about hundred lines to replace). Maybe adding converter from bitfield to 
> int32
> is appropriate solution ? Then sending state we do 
> "drbd_state_to_int(state)"
> and then receive we restore state by "int_to_drbd_state(i)". only 9 lines of
> kernel code was changed (No need to change userspace). Additional 
> overhead is
> neglible.

right you are.
but how about this instead:

diff --git a/drbd/linux/drbd.h b/drbd/linux/drbd.h
index b6ea313..ecf3a27 100644
--- a/drbd/linux/drbd.h
+++ b/drbd/linux/drbd.h
@@ -28,6 +28,7 @@
 #include <linux/drbd_config.h>
 
 #include <asm/types.h>
+#include <asm/byteorder.h>
 
 #ifdef __KERNEL__
 #include <linux/types.h>
@@ -176,18 +177,39 @@ typedef enum {
 	disk_mask=15
 } drbd_disks_t;
 
+/* bitfields are not endian safe.
+ * pointed out by Maxim Uvarov <muvarov@ru.mvista.com>
+ * even though we transmit as "cpu_to_be32(state)",
+ * the offsets of the bitfields still need to be swapped
+ * on different endianess.
+ */
 typedef union {
 	struct {
-		unsigned role : 2 ;   // 3/4      primary/secondary/unknown
-		unsigned peer : 2 ;   // 3/4      primary/secondary/unknown
-		unsigned conn : 5 ;   // 17/32    cstates
-		unsigned disk : 4 ;   // 8/16     from Diskless to UpToDate
-		unsigned pdsk : 4 ;   // 8/16     from Diskless to UpToDate
-		unsigned susp : 1 ;   // 2/2      IO suspended  no/yes
-		unsigned aftr_isp : 1 ; // isp .. imposed sync pause
+#if defined(__LITTLE_ENDIAN)
+		unsigned role : 2 ;   /* 3/4      primary/secondary/unknown */
+		unsigned peer : 2 ;   /* 3/4      primary/secondary/unknown */
+		unsigned conn : 5 ;   /* 17/32    cstates */
+		unsigned disk : 4 ;   /* 8/16     from Diskless to UpToDate */
+		unsigned pdsk : 4 ;   /* 8/16     from Diskless to UpToDate */
+		unsigned susp : 1 ;   /* 2/2      IO suspended  no/yes */
+		unsigned aftr_isp : 1 ; /* isp .. imposed sync pause */
 		unsigned peer_isp : 1 ;
 		unsigned user_isp : 1 ;
-		unsigned _pad : 11;   // 0        unused
+		unsigned _pad : 11;   /* 0        unused */
+#elif defined(__BIG_ENDIAN)
+		unsigned _pad : 11;   /* 0        unused */
+		unsigned user_isp : 1 ;
+		unsigned peer_isp : 1 ;
+		unsigned aftr_isp : 1 ; /* isp .. imposed sync pause */
+		unsigned susp : 1 ;   /* 2/2      IO suspended  no/yes */
+		unsigned pdsk : 4 ;   /* 8/16     from Diskless to UpToDate */
+		unsigned disk : 4 ;   /* 8/16     from Diskless to UpToDate */
+		unsigned conn : 5 ;   /* 17/32    cstates */
+		unsigned peer : 2 ;   /* 3/4      primary/secondary/unknown */
+		unsigned role : 2 ;   /* 3/4      primary/secondary/unknown */
+#else
+# error "this endianess is not supported"
+#endif
 	};
 	unsigned int i;
 } drbd_state_t;


-- 
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Drbd-dev] [DRBD][PATCH] drbd_bitfild_endian.patch
@ 2007-11-26 16:32 Maxim Uvarov
  2007-11-26 15:06 ` Lars Ellenberg
  0 siblings, 1 reply; 5+ messages in thread
From: Maxim Uvarov @ 2007-11-26 16:32 UTC (permalink / raw)
  To: drbd-dev

[-- Attachment #1: Type: text/plain, Size: 605 bytes --]

Hello all,

We have found that drbd  state is passed incorrectly between machines  
with different endianness.
Attached patch fixes this problem by converting bitfild to int.

Replacing all bitfileds (both in kernel and userland) isn't safe way 
(there are
about hundred lines to replace). Maybe adding converter from bitfield to 
int32
is appropriate solution ? Then sending state we do 
"drbd_state_to_int(state)"
and then receive we restore state by "int_to_drbd_state(i)". only 9 lines of
kernel code was changed (No need to change userspace). Additional 
overhead is
neglible.


Best regards,
Maxim.

[-- Attachment #2: drbd_bitfild_endian.patch --]
[-- Type: text/plain, Size: 5829 bytes --]

Source: MontaVista Software, Inc.
MR: 26056
Type: Defect Fix
Disposition: needs submitting to DRBD maintainer 
Signed-off-by: Maxim Uvarov <muvarov@ru.mvista.com>
Signed-off-by: Maxim Syrchin <msyrchin@ru.mvista.com>
Description:
  DRBD uses bitfield to store its state. Since bitfields are not 
  cross-architecture safe - it causes error then archs of peers are
  different. The simplest way to solve this is to convert biitfild
  to cross-arch safe int32. This change makes new protocol incompatile
  with old one - so PRO_VERSION is increased from 86 up to 88.

Index: linux-2.6.21_mvlcge500/drivers/block/drbd/drbd_int.h
===================================================================
--- linux-2.6.21_mvlcge500.orig/drivers/block/drbd/drbd_int.h
+++ linux-2.6.21_mvlcge500/drivers/block/drbd/drbd_int.h
@@ -1446,6 +1446,64 @@ void drbd_bcast_state(drbd_dev *mdev);
                 D,({drbd_state_t ns; ns.i = D->state.i; ns.T1 = (S1); \
                 ns.T2 = (S2); ns.T3 = (S3); ns;})
 
+/*
+ * Using bitfields is no cross-architecture safe.
+ * Use shifts to convert bitfield to cross-safe int
+ */
+
+#define	ROLE_OFF	0
+#define	PEER_OFF	2
+#define	CONN_OFF	4
+#define	DISK_OFF	9
+#define	PDSK_OFF	13
+#define	SUSP_OFF	17
+#define	AISP_OFF	18
+#define	PISP_OFF	19
+#define	UISP_OFF	20
+
+#define	ROLE_MASK	(unsigned int)0x03
+#define	PEER_MASK	(unsigned int)0x03
+#define	CONN_MASK	(unsigned int)0x1f
+#define	DISK_MASK	(unsigned int)0x0f
+#define	PDSK_MASK	(unsigned int)0x0f
+#define	SUSP_MASK	(unsigned int)0x01
+#define	AISP_MASK	(unsigned int)0x01
+#define	PISP_MASK	(unsigned int)0x01
+#define	UISP_MASK	(unsigned int)0x01
+
+static inline unsigned int drbd_state_to_int (unsigned int state_i)
+{
+	unsigned int i = 0;
+	drbd_state_t state;
+	state.i = state_i;
+	i = ((unsigned int)state.role)<<ROLE_OFF;
+	i |= ((unsigned int)state.peer)<<PEER_OFF;
+	i |= ((unsigned int)state.conn)<<CONN_OFF;
+	i |= ((unsigned int)state.disk)<<DISK_OFF;
+	i |= ((unsigned int)state.pdsk)<<PDSK_OFF;
+	i |= ((unsigned int)state.susp)<<SUSP_OFF;
+	i |= ((unsigned int)state.aftr_isp)<<AISP_OFF;
+	i |= ((unsigned int)state.peer_isp)<<PISP_OFF;
+	i |= ((unsigned int)state.user_isp)<<UISP_OFF;
+	return i;
+}
+
+static inline unsigned int int_to_drbd_state (unsigned int i)
+{
+	drbd_state_t state;
+	state.i = 0;
+	state.role = (i>>ROLE_OFF)&ROLE_MASK;
+	state.peer = (i>>PEER_OFF)&PEER_MASK;
+	state.conn = (i>>CONN_OFF)&CONN_MASK;
+	state.disk = (i>>DISK_OFF)&DISK_MASK;
+	state.pdsk = (i>>PDSK_OFF)&PDSK_MASK;
+	state.susp = (i>>SUSP_OFF)&SUSP_MASK;
+	state.aftr_isp = (i>>AISP_OFF)&AISP_MASK;
+	state.peer_isp = (i>>PISP_OFF)&PISP_MASK;
+	state.user_isp = (i>>UISP_OFF)&UISP_MASK;
+	return state.i;
+}
+
 static inline void drbd_state_lock(drbd_dev *mdev)
 {
 	wait_event(mdev->misc_wait,
Index: linux-2.6.21_mvlcge500/drivers/block/drbd/drbd_main.c
===================================================================
--- linux-2.6.21_mvlcge500.orig/drivers/block/drbd/drbd_main.c
+++ linux-2.6.21_mvlcge500/drivers/block/drbd/drbd_main.c
@@ -1366,7 +1366,7 @@ int drbd_send_state(drbd_dev *mdev)
 {
 	Drbd_State_Packet p;
 
-	p.state    = cpu_to_be32(mdev->state.i);
+	p.state    = cpu_to_be32(drbd_state_to_int(mdev->state.i));
 
 	return drbd_send_cmd(mdev,USE_DATA_SOCKET,ReportState,
 			     (Drbd_Header*)&p,sizeof(p));
@@ -1376,8 +1376,8 @@ STATIC int drbd_send_state_req(drbd_dev 
 {
 	Drbd_Req_State_Packet p;
 
-	p.mask    = cpu_to_be32(mask.i);
-	p.val     = cpu_to_be32(val.i);
+	p.mask    = cpu_to_be32(drbd_state_to_int(mask.i));
+	p.val     = cpu_to_be32(drbd_state_to_int(val.i));
 
 	return drbd_send_cmd(mdev,USE_DATA_SOCKET,StateChgRequest,
 			     (Drbd_Header*)&p,sizeof(p));
@@ -3179,15 +3179,15 @@ _dump_packet(drbd_dev *mdev, struct sock
 		break;
 
 	case ReportState:
-		v.i = be32_to_cpu(p->State.state);
+		v.i = int_to_drbd_state(be32_to_cpu(p->State.state));
 		m.i = 0xffffffff;
 		dump_st(tmp,sizeof(tmp),m,v);
 		INFOP("%s (s %x {%s})\n", cmdname(cmd), v.i, tmp);
 		break;
 
 	case StateChgRequest:
-		m.i = be32_to_cpu(p->ReqState.mask);
-		v.i = be32_to_cpu(p->ReqState.val);
+		m.i = int_to_drbd_state(be32_to_cpu(p->ReqState.mask));
+		v.i = int_to_drbd_state(be32_to_cpu(p->ReqState.val));
 		dump_st(tmp,sizeof(tmp),m,v);
 		INFOP("%s (m %x v %x {%s})\n", cmdname(cmd), m.i, v.i, tmp);
 		break;
Index: linux-2.6.21_mvlcge500/drivers/block/drbd/drbd_receiver.c
===================================================================
--- linux-2.6.21_mvlcge500.orig/drivers/block/drbd/drbd_receiver.c
+++ linux-2.6.21_mvlcge500/drivers/block/drbd/drbd_receiver.c
@@ -2369,8 +2369,8 @@ STATIC int receive_req_state(drbd_dev *m
 	if (drbd_recv(mdev, h->payload, h->length) != h->length)
 		return FALSE;
 
-	mask.i = be32_to_cpu(p->mask);
-	val.i = be32_to_cpu(p->val);
+	mask.i = int_to_drbd_state(be32_to_cpu(p->mask));
+	val.i =  int_to_drbd_state(be32_to_cpu(p->val));
 
 	if (test_bit(DISCARD_CONCURRENT,&mdev->flags)) drbd_state_lock(mdev);
 
@@ -2401,7 +2401,7 @@ STATIC int receive_state(drbd_dev *mdev,
 	nconn = mdev->state.conn;
 	if (nconn == WFReportParams ) nconn = Connected;
 
-	peer_state.i = be32_to_cpu(p->state);
+	peer_state.i =  int_to_drbd_state(be32_to_cpu(p->state));
 
 	if (mdev->p_uuid && mdev->state.conn <= Connected &&
 	    inc_local_if_state(mdev,Negotiating) &&
Index: linux-2.6.21_mvlcge500/include/linux/drbd_config.h
===================================================================
--- linux-2.6.21_mvlcge500.orig/include/linux/drbd_config.h
+++ linux-2.6.21_mvlcge500/include/linux/drbd_config.h
@@ -24,7 +24,7 @@ extern const char * drbd_buildtag(void);
 
 #define REL_VERSION "8.0.3"
 #define API_VERSION 86
-#define PRO_VERSION 86
+#define PRO_VERSION 88
 
 // undef if you need the workaround in drbd_receiver
 #define HAVE_UML_TO_VIRT 1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Drbd-dev] [DRBD][PATCH] drbd_bitfild_endian.patch
  2007-11-26 15:06 ` Lars Ellenberg
@ 2007-11-26 19:28   ` Maxim Uvarov
       [not found]   ` <474C12E0.6020306@ru.mvista.com>
  1 sibling, 0 replies; 5+ messages in thread
From: Maxim Uvarov @ 2007-11-26 19:28 UTC (permalink / raw)
  To: Lars Ellenberg; +Cc: drbd-dev

Lars Ellenberg wrote:

>On Mon, Nov 26, 2007 at 04:32:44PM +0000, Maxim Uvarov wrote:
>  
>
>>Hello all,
>>
>>We have found that drbd  state is passed incorrectly between machines  
>>with different endianness.
>>Attached patch fixes this problem by converting bitfild to int.
>>
>>Replacing all bitfileds (both in kernel and userland) isn't safe way 
>>(there are
>>about hundred lines to replace). Maybe adding converter from bitfield to 
>>int32
>>is appropriate solution ? Then sending state we do 
>>"drbd_state_to_int(state)"
>>and then receive we restore state by "int_to_drbd_state(i)". only 9 lines of
>>kernel code was changed (No need to change userspace). Additional 
>>overhead is
>>neglible.
>>    
>>
>
>right you are.
>but how about this instead:
>
>diff --git a/drbd/linux/drbd.h b/drbd/linux/drbd.h
>index b6ea313..ecf3a27 100644
>--- a/drbd/linux/drbd.h
>+++ b/drbd/linux/drbd.h
>@@ -28,6 +28,7 @@
> #include <linux/drbd_config.h>
> 
> #include <asm/types.h>
>+#include <asm/byteorder.h>
> 
> #ifdef __KERNEL__
> #include <linux/types.h>
>@@ -176,18 +177,39 @@ typedef enum {
> 	disk_mask=15
> } drbd_disks_t;
> 
>+/* bitfields are not endian safe.
>+ * pointed out by Maxim Uvarov <muvarov@ru.mvista.com>
>+ * even though we transmit as "cpu_to_be32(state)",
>+ * the offsets of the bitfields still need to be swapped
>+ * on different endianess.
>+ */
> typedef union {
> 	struct {
>-		unsigned role : 2 ;   // 3/4      primary/secondary/unknown
>-		unsigned peer : 2 ;   // 3/4      primary/secondary/unknown
>-		unsigned conn : 5 ;   // 17/32    cstates
>-		unsigned disk : 4 ;   // 8/16     from Diskless to UpToDate
>-		unsigned pdsk : 4 ;   // 8/16     from Diskless to UpToDate
>-		unsigned susp : 1 ;   // 2/2      IO suspended  no/yes
>-		unsigned aftr_isp : 1 ; // isp .. imposed sync pause
>+#if defined(__LITTLE_ENDIAN)
>+		unsigned role : 2 ;   /* 3/4      primary/secondary/unknown */
>+		unsigned peer : 2 ;   /* 3/4      primary/secondary/unknown */
>+		unsigned conn : 5 ;   /* 17/32    cstates */
>+		unsigned disk : 4 ;   /* 8/16     from Diskless to UpToDate */
>+		unsigned pdsk : 4 ;   /* 8/16     from Diskless to UpToDate */
>+		unsigned susp : 1 ;   /* 2/2      IO suspended  no/yes */
>+		unsigned aftr_isp : 1 ; /* isp .. imposed sync pause */
> 		unsigned peer_isp : 1 ;
> 		unsigned user_isp : 1 ;
>-		unsigned _pad : 11;   // 0        unused
>+		unsigned _pad : 11;   /* 0        unused */
>+#elif defined(__BIG_ENDIAN)
>+		unsigned _pad : 11;   /* 0        unused */
>+		unsigned user_isp : 1 ;
>+		unsigned peer_isp : 1 ;
>+		unsigned aftr_isp : 1 ; /* isp .. imposed sync pause */
>+		unsigned susp : 1 ;   /* 2/2      IO suspended  no/yes */
>+		unsigned pdsk : 4 ;   /* 8/16     from Diskless to UpToDate */
>+		unsigned disk : 4 ;   /* 8/16     from Diskless to UpToDate */
>+		unsigned conn : 5 ;   /* 17/32    cstates */
>+		unsigned peer : 2 ;   /* 3/4      primary/secondary/unknown */
>+		unsigned role : 2 ;   /* 3/4      primary/secondary/unknown */
>+#else
>+# error "this endianess is not supported"
>+#endif
> 	};
> 	unsigned int i;
> } drbd_state_t;
>
>
>  
>
Actually I did it at first time  but condition was  
__BIG_ENDIAN_BITFIELD. It require also changes
in default settings. And define _BITFIELD in user space. And  this 
version was denied by our architect with comment:
<cut>
Sending bit fields like this, unfortunately, is not portable. There is no
guarantee of bitfield layout event between compiler versions, let alone 
between
different architectures even of the same endian-ness.
</cut>

So it's not good idea to depend on compiler version.




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Drbd-dev] [DRBD][PATCH] drbd_bitfild_endian.patch
       [not found]   ` <474C12E0.6020306@ru.mvista.com>
@ 2007-11-27 10:55     ` Lars Ellenberg
  2007-12-03 16:52       ` Philipp Reisner
  0 siblings, 1 reply; 5+ messages in thread
From: Lars Ellenberg @ 2007-11-27 10:55 UTC (permalink / raw)
  To: Maxim Uvarov; +Cc: drbd-dev

On Tue, Nov 27, 2007 at 12:51:44PM +0000, Maxim Uvarov wrote:
> By the way, Lars,  don't  use if defined(__LITTLE_ENDIAN) condition 
> because  __LITTLE_ENDIAN and __BIG_ENDIAN are always defined :)

in userland, yes.
in kernel, no.

in kernel there is no "if BYTE_ORDER == __LITTLE_ENDIAN",
there is only "ifdef __LITTLE_ENDIAN".

unless you prove me wrong.

-- 
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Drbd-dev] [DRBD][PATCH] drbd_bitfild_endian.patch
  2007-11-27 10:55     ` Lars Ellenberg
@ 2007-12-03 16:52       ` Philipp Reisner
  0 siblings, 0 replies; 5+ messages in thread
From: Philipp Reisner @ 2007-12-03 16:52 UTC (permalink / raw)
  To: drbd-dev; +Cc: Maxim Uvarov, Lars Ellenberg

On Tuesday 27 November 2007 11:55:12 Lars Ellenberg wrote:
> On Tue, Nov 27, 2007 at 12:51:44PM +0000, Maxim Uvarov wrote:
> > By the way, Lars,  don't  use if defined(__LITTLE_ENDIAN) condition
> > because  __LITTLE_ENDIAN and __BIG_ENDIAN are always defined :)
>
> in userland, yes.
> in kernel, no.
>
> in kernel there is no "if BYTE_ORDER == __LITTLE_ENDIAN",
> there is only "ifdef __LITTLE_ENDIAN".
>
> unless you prove me wrong.

Hi,

I have decided to go with Maxim's initial approach == Lars' approach.

I decided for that because according to GCC's documentation the
layout of bit-fields is part of the platform's ABI. It is not
compiler version dependant...

Please review commit 
d43cf82288b7cbd703aa0b587c4ab32605fb97c5

Thanks!

Changing the protocol is not an option for drbd-8.0. The protocol
is frozen (for little endian).

That breaks of course rolling upgrades for users of existing
clusters on bit-endian machines.... if there are any.

BTW: If we really want we could even fix this, since we know
     that _pad is always 0 .... but it is not worth the effort
     when there are no users out there.

-phil
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Vivenotgasse 48, 1120 Vienna, Austria        http://www.linbit.com :

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-12-03 16:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-26 16:32 [Drbd-dev] [DRBD][PATCH] drbd_bitfild_endian.patch Maxim Uvarov
2007-11-26 15:06 ` Lars Ellenberg
2007-11-26 19:28   ` Maxim Uvarov
     [not found]   ` <474C12E0.6020306@ru.mvista.com>
2007-11-27 10:55     ` Lars Ellenberg
2007-12-03 16:52       ` Philipp Reisner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox