From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] DRBD Reconnection Bug
Date: Tue, 1 Jul 2008 11:08:41 +0200
Message-ID: <20080701090841.GA20245@soda.linbit>
In-Reply-To: <6fbb8fd90806302329i8e041f3kc21e834e8ddf3b5f@mail.gmail.com>
On Mon, Jun 30, 2008 at 11:29:13PM -0700, John Muth wrote:
> I've been testing DRBD 8.2.6 on top of Centos 5.1 for use in a lights-out
> environment.
>
> I think I ran across a bug. Has anyone seen anything like this:
>
> Background:
>
> I'm running DRBD on top of LVM. I have an LVM volume named 'sm-vol' on two nodes
> and I've got /dev/drbd0 mapped to it. I replicate using 'protocol C'. At the
> beginning of the test node A is the primary and node B is the secondary. My
> test script does:
>
> 1. Write verifiable data to /dev/drbd0 on node A. Create an LVM snapshot on
> node B and verify the data.
> 2. Bring down node B using 'service drbd stop'. Write more data from node A.
> Bring node B back up with 'service drbd start'. When re-sync is complete,
> create an LVM snapshot on node B and verify that the contents match what was
> written by node A.
> 3. Bring down node A with 'service drbd stop'. Make node B the
> primary (drbdadm primary sm-vol). Write more data to /dev/drbd0 from node
> B. Bring node A back with 'service drbd start'. Wait for resync to complete.
> Create an LVM snapshot on node A and verify the data.
> 4. Run 'drbdadm secondary sm-vol' on node B followed by 'drbdadm primary
> sm-vol' on node A.
> 5. Repeat.
>
> Once every 100 or so loops, DRBD gets wedged during a reconnection. When the
> system gets wedged, I do 'cat /proc/drbd' on both nodes and I see the following:
>
> On primary:
>
> [root@vm5 ~]# cat /proc/drbd
> version: 8.2.6 (api:88/proto:86-88)
> GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> buildsvn@c5-i386-build, 2008-06-26 16:40:17
> 0: cs:WFBitMapS st:Primary/Secondary ds:UpToDate/Outdated C r---
> ns:288064 nr:94516 dw:613836 dr:845384 al:122 bm:31 lo:0 pe:0 ua:0 ap:0
> oos:95888
>
> On secondary:
>
> [root@vm6 ~]# cat /proc/drbd
> version: 8.2.6 (api:88/proto:86-88)
> GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> buildsvn@c5-i386-build, 2008-06-26 16:40:17
> 0: cs:WFBitMapT st:Secondary/Primary ds:Negotiating/UpToDate C r---
> ns:0 nr:0 dw:0 dr:0 al:106 bm:93 lo:0 pe:0 ua:0 ap:0 oos:95888
>
> /var/log/messages looks like:
>
> On primary:
>
> Jun 30 13:30:08 vm5 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected
> -> TearDown ) pdsk( UpToDate -> DUnknown )
>
> On secondary:
>
> Jun 30 13:28:46 vm6 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected
> -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
since you can "easily" reproduce this,
please get the clocks on those boxes in sync (ntp);
that makes the logs much easier to read.
then post new logs.
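
When comparing the new logs from both nodes, it can help to reduce each DRBD kernel message to its state transitions. The following is only a minimal sketch in Python, assuming the message format shown in the quoted /var/log/messages excerpts (each wrapped log entry joined back onto one line). The function name `parse_transitions` is made up for this illustration and is not part of DRBD or its tools.

```python
import re

# Matches fragments such as "conn( Connected -> TearDown )".
# Assumption: the format is the one seen in the quoted 8.2.6 log lines.
TRANSITION = re.compile(r"(\w+)\(\s*(\S+)\s*->\s*(\S+)\s*\)")


def parse_transitions(log_line):
    """Return {field: (old_state, new_state)} for one DRBD kernel log line."""
    return {
        field: (old, new)
        for field, old, new in TRANSITION.findall(log_line)
    }


if __name__ == "__main__":
    line = (
        "Jun 30 13:30:08 vm5 kernel: drbd0: peer( Secondary -> Unknown ) "
        "conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )"
    )
    for field, (old, new) in parse_transitions(line).items():
        print(f"{field}: {old} -> {new}")
```

With synchronized clocks, the transitions extracted on each node can be lined up by timestamp to see which side left the Connected state first.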
> Has anyone seen this before?
there have been numerous race conditions in the state handling code,
it is well possible that there are more.
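
For a looping test like the one described, a script could flag the stuck condition automatically instead of relying on manual inspection. The following is a rough sketch in Python, assuming the /proc/drbd layout shown in the quoted 8.2.6 output. The helper names (`parse_proc_drbd`, `is_waiting_for_bitmap`) are hypothetical, and treating WFBitMapS/WFBitMapT as the symptom is an assumption based on the states reported in this thread.

```python
import re

# Connection states reported on the two nodes in this thread.
# Assumption: staying in one of these for a long time is the "wedged" symptom.
BITMAP_WAIT_STATES = {"WFBitMapS", "WFBitMapT"}

# Device status line, e.g.
# " 0: cs:WFBitMapS st:Primary/Secondary ds:UpToDate/Outdated C r---"
DEVICE_LINE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "st": ..., "ds": ...}} from /proc/drbd text."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            devices[int(match.group("minor"))] = {
                "cs": match.group("cs"),
                "st": match.group("st"),
                "ds": match.group("ds"),
            }
    return devices


def is_waiting_for_bitmap(device):
    """True if the device is currently in a bitmap-exchange wait state."""
    return device["cs"] in BITMAP_WAIT_STATES


if __name__ == "__main__":
    with open("/proc/drbd") as status_file:
        for minor, device in parse_proc_drbd(status_file.read()).items():
            print(minor, device, is_waiting_for_bitmap(device))
```

A single sample in WFBitMapS or WFBitMapT is normal during a reconnect; a test loop would need to poll repeatedly and only report a hang if the state persists beyond some timeout of its own choosing.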
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :