From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] DRBD Reconnection Bug
Date: Tue, 1 Jul 2008 11:08:41 +0200
Message-ID: <20080701090841.GA20245@soda.linbit>
In-Reply-To: <6fbb8fd90806302329i8e041f3kc21e834e8ddf3b5f@mail.gmail.com>

On Mon, Jun 30, 2008 at 11:29:13PM -0700, John Muth wrote:
> I've been testing DRBD 8.2.6 on top of CentOS 5.1 for use in a lights-out
> environment.
>
> I think I ran across a bug. Has anyone seen anything like this?
>
> Background:
>
> I'm running DRBD on top of LVM. I have an LVM volume named 'sm-vol' on two nodes
> and I've got /dev/drbd0 mapped to it. I replicate using 'protocol C'. At the
> beginning of the test node A is the primary and node B is the secondary. My
> test script does:
>
> 1. Write verifiable data to /dev/drbd0 on node A. Create an LVM snapshot on
> node B and verify the data.
> 2. Bring down node B using 'service drbd stop'. Write more data from node A.
> Bring node B back up with 'service drbd start'. When re-sync is complete,
> create an LVM snapshot on node B and verify that the contents match what was
> written by node A.
> 3. Bring down node A with 'service drbd stop'. Make node B the primary
> (drbdadm primary sm-vol). Write more data to /dev/drbd0 from node B.
> Bring node A back with 'service drbd start'. Wait for resync to complete.
> Create an LVM snapshot on node A and verify the data.
> 4. Run 'drbdadm secondary sm-vol' on node B followed by 'drbdadm primary
> sm-vol' on node A.
> 5. Repeat.
>
> Once every 100 or so loops, DRBD gets wedged during a reconnection. When the
> system gets wedged, I do 'cat /proc/drbd' on both nodes and I see the following:
>
> On primary:
>
> [root@vm5 ~]# cat /proc/drbd
> version: 8.2.6 (api:88/proto:86-88)
> GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> buildsvn@c5-i386-build, 2008-06-26 16:40:17
> 0: cs:WFBitMapS st:Primary/Secondary ds:UpToDate/Outdated C r---
> ns:288064 nr:94516 dw:613836 dr:845384 al:122 bm:31 lo:0 pe:0 ua:0 ap:0
> oos:95888
>
> On secondary:
>
> [root@vm6 ~]# cat /proc/drbd
> version: 8.2.6 (api:88/proto:86-88)
> GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> buildsvn@c5-i386-build, 2008-06-26 16:40:17
> 0: cs:WFBitMapT st:Secondary/Primary ds:Negotiating/UpToDate C r---
> ns:0 nr:0 dw:0 dr:0 al:106 bm:93 lo:0 pe:0 ua:0 ap:0 oos:95888
>
> /var/log/messages looks like:
>
> On primary:
>
> Jun 30 13:30:08 vm5 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected
> -> TearDown ) pdsk( UpToDate -> DUnknown )
>
> On secondary:
>
> Jun 30 13:28:46 vm6 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected
> -> NetworkFailure ) pdsk( UpToDate -> DUnknown )

since you can "easily" reproduce this,
please get the clocks in sync on those boxes (ntp);
that makes the logs much easier to read.
then post new logs.
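
Once the clocks agree, the kernel logs from both nodes can be interleaved into a single timeline. A minimal Python sketch, assuming each line starts with a classic syslog stamp such as 'Jun 30 13:30:08'; the function names and the assumed year are illustrative only and not part of DRBD:

```python
from datetime import datetime


def syslog_time(line, year=2008):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line."""
    # Classic syslog stamps omit the year, so one is assumed here.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def merge_logs(*logs):
    """Merge per-node lists of syslog lines into one time-ordered list."""
    entries = [(syslog_time(line), line) for lines in logs for line in lines]
    # sorted() is stable, so lines with equal stamps keep their input order.
    return [line for _, line in sorted(entries, key=lambda e: e[0])]
```

The merged order is only as trustworthy as the clock synchronisation between the nodes, which is why ntp has to come first.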

> Has anyone seen this before?

there have been numerous race conditions in the state handling code;
it is well possible that there are more.
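
For a test loop like the one described above, a wedge could be flagged by polling /proc/drbd and checking whether the connection state stays in WFBitMapS or WFBitMapT. A minimal Python sketch, assuming the 8.2-style status line format quoted in the report; the helper names are hypothetical, and a single hit is not proof of a hang, since both states also occur briefly during a normal reconnect:

```python
import re

# Connection states seen in the wedged report; they only indicate a
# problem if they persist across several polls.
BITMAP_WAIT_STATES = {"WFBitMapS", "WFBitMapT"}

# Matches lines such as:
#  0: cs:WFBitMapS st:Primary/Secondary ds:UpToDate/Outdated C r---
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "st": ..., "ds": ...}} per device line."""
    devices = {}
    for line in text.splitlines():
        m = STATUS_RE.match(line)
        if m:
            devices[int(m.group("minor"))] = {
                "cs": m.group("cs"),
                "st": m.group("st"),
                "ds": m.group("ds"),
            }
    return devices


def waiting_for_bitmap(text):
    """Return minor numbers whose connection state is a bitmap wait state."""
    return sorted(
        minor
        for minor, dev in parse_proc_drbd(text).items()
        if dev["cs"] in BITMAP_WAIT_STATES
    )
```

A caller would read /proc/drbd every few seconds and treat a device as wedged only if it stays in the returned list for longer than a chosen timeout.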
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :