Distributed Replicated Block Device (DRBD) development
* [Drbd-dev] DRBD8: failed to complete sync due to receiving bitmap in unexpected state
@ 2006-12-11 22:16 Montrose, Ernest
  2006-12-12 10:19 ` Lars Ellenberg
  0 siblings, 1 reply; 2+ messages in thread
From: Montrose, Ernest @ 2006-12-11 22:16 UTC
  To: drbd-dev

Hi all,
We are seeing a case where a sync happened, data is marked consistent
on both sides, and the target went to Connected state, but the source
DID NOT CHANGE FROM WFBitMapS state. The clocks on the two systems seem
to be not quite synchronized, but it seems that:

1. The two nodes connected, realised they needed to resync, and worked
   out that one node had the good data.
2. Because other syncing was going on, the sync process was paused.
3. Later on, sync resumed; the good side's connection went to WFBitMapS,
   the bad side's to WFBitMapT.
4. Sync happened, data was marked consistent on both sides, and the
   target went to Connected state; the source DID NOT CHANGE FROM
   WFBitMapS.

Now, the only oddity I see is on the target side where we see:

Dec 10 04:52:52 george kernel: drbd1: unexpected cstate (PausedSyncT) in receive_bitmap

This did NOT stop the resync, but I would suspect it meant that a
critical message was never sent which left the source side in WFBitmapS.

Presumably there is a window where one side is out of the paused state
before the other.
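
To make the suspected window concrete, here is a toy Python model of the
hypothesis. It is not DRBD code: the state names come from the log and the
steps above, while the "target replies to the bitmap so the source can
leave WFBitMapS" handshake, the SyncSource/SyncTarget states, and the
function names are assumptions made purely for illustration.

```python
# Toy model of the suspected race; NOT taken from drbd_receive.c.
# Assumption: the source only leaves WFBitMapS after the target replies.

def receive_bitmap(target_cstate):
    """Model the target handling an incoming bitmap.

    Returns (new_target_cstate, reply_sent, log_line).
    """
    if target_cstate == "WFBitMapT":
        # Expected path: the target answers and starts syncing.
        return "SyncTarget", True, None
    # Unexpected path: warn and keep resyncing, but (hypothetically)
    # never send the reply the source is waiting for.
    log = "unexpected cstate (%s) in receive_bitmap" % target_cstate
    return "SyncTarget", False, log


def run(target_unpaused_first):
    """Play one resync; return (source_cstate, target_cstate, log_line)."""
    source = "WFBitMapS"  # the source has already left the paused state
    target = "WFBitMapT" if target_unpaused_first else "PausedSyncT"

    target, reply_sent, log = receive_bitmap(target)
    if reply_sent:
        source = "SyncSource"

    # The resync itself completes on the target in both cases.
    target = "Connected"
    if source == "SyncSource":
        source = "Connected"
    return source, target, log


if __name__ == "__main__":
    print(run(target_unpaused_first=True))
    print(run(target_unpaused_first=False))
```

In this model the second run ends with the target Connected and the source
still in WFBitMapS, which matches what we observed, but only under the
assumed handshake.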
 
Simon Grham actually did a bit of analysis of this and thinks that the
problem might be a race condition in drbd_receive.c:receive_bitmap().
Any ideas? I cannot reproduce this reliably at this time.
 
EM--

* Re: [Drbd-dev] DRBD8: failed to complete sync due to receiving bitmap in unexpected state
  2006-12-11 22:16 [Drbd-dev] DRBD8: failed to complete sync due to receiving bitmap in unexpected state Montrose, Ernest
@ 2006-12-12 10:19 ` Lars Ellenberg
  0 siblings, 0 replies; 2+ messages in thread
From: Lars Ellenberg @ 2006-12-12 10:19 UTC
  To: drbd-dev

/ 2006-12-11 17:16:50 -0500
\ Montrose, Ernest:
> Hi all,
> We are seeing a case where a sync happened, data is marked consistent
> on both sides, and the target went to Connected state, but the source
> DID NOT CHANGE FROM WFBitMapS state. The clocks on the two systems seem
> to be not quite synchronized, but it seems that:
> 
> 1. The two nodes connected, realised they needed to resync, and worked
>    out that one node had the good data.
> 2. Because other syncing was going on, the sync process was paused.
> 3. Later on, sync resumed; the good side's connection went to WFBitMapS,
>    the bad side's to WFBitMapT.
> 4. Sync happened, data was marked consistent on both sides, and the
>    target went to Connected state; the source DID NOT CHANGE FROM
>    WFBitMapS.
> 
> Now, the only oddity I see is on the target side where we see:
> 
> Dec 10 04:52:52 george kernel: drbd1: unexpected cstate (PausedSyncT) in receive_bitmap
> 
> This did NOT stop the resync, but I would suspect it meant that a
> critical message was never sent which left the source side in WFBitmapS.
> 
> Presumably there is a window where one side is out of the paused state
> before the other.
>  
> Simon Grham actually did a bit of analysis of this and thinks that the
> problem might be a race condition in drbd_receive.c:receive_bitmap().
> Any ideas? I cannot reproduce this reliably at this time.

Not yet...
Is any state change Secondary->Primary involved,
or are they only (re)connecting?

-- 
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
