* Re: [Drbd-dev] DRBD Reconnection Bug
2008-07-01 9:53 ` Lars Ellenberg
@ 2008-07-01 14:28 ` John Muth
0 siblings, 0 replies; 4+ messages in thread
From: John Muth @ 2008-07-01 14:28 UTC (permalink / raw)
To: drbd-dev
[-- Attachment #1.1: Type: text/plain, Size: 5834 bytes --]
Hi Lars,
I've attached the script.
I'm using STAF to do the remote execution stuff. Calls to
'$mgr->start_process' starts a process on a remote note. '$mgr->wait_all'
waits for completion. Similarly '$mgr->start_copy_file' starts a file copy.
I'm afraid I'm pretty clueless WRT to building from source. If you could
send some instructions on how to get the git source and build it on my
system, I'll be glad to try it for you.
Thanks,
John Muth
On 7/1/08, Lars Ellenberg <lars.ellenberg@linbit.com> wrote:
>
> On Tue, Jul 01, 2008 at 11:08:41AM +0200, Lars Ellenberg wrote:
> > On Mon, Jun 30, 2008 at 11:29:13PM -0700, John Muth wrote:
> > > I've been testing DRBD 8.2.6 on top of Centos 5.1 for use in a
> lights-out
> > > environment.
> > >
> > > I think I ran across a bug.Ha sanyone seen anything like this:
> > >
> > > Background:
> > >
> > > I'm running DRBD on top of LVM. I have a LVM volume named 'sm-vol' on
> two nodes
> > > and I've got /dev/drbd0 mapped to it. I replicate using 'protocol C'.
> At the
> > > beginning of the test node A is the primary and node B is the
> secondary. My
> > > test script does:
> > >
> > > 1. Write verifiable data to /dev/drbd0 on node A. Create a LVM
> snapshot on
> > > node B and verify the data.
> > > 2. Bring down node B using 'service drbd stop' Write more data frm
> node A.
> > > Bring the node B back up with 'service drbd start'. When re-sync is
> complete,
> > > create a LVM snapshot on the node B and verify that the contents match
> what was
> > > written by the node A.
> > > 3. Bring down node A with 'service drbd stop'. Make the node B into
> the
> > > primary (drbdadm primary sm-vol). Write more data to the /dev/drbd0
> from node
> > > B.. Bring node A back with 'service drbd start'. Wait for resync to
> complete.
> > > Create LVM snapshot on node A and erify data.
> > > 4. Run 'drbdadm secondary sm-vol' on node B followed by 'drbdadm
> primary
> > > sm-vol' on node A.
> > > 5 Repeat
> > >
> > > Once every 100 or so loops, DRBD gets wedged during a reconnection.
> When the
> > > system gets wedged, I do 'cat /etc/drbd' on both nodes and I see the
> following:
> > >
> > > On primary:
> > >
> > > [root@vm5 ~]# cat /proc/drbd
> > > version: 8.2.6 (api:88/proto:86-88)
> > > GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> > > buildsvn@c5-i386-build, 2008-06-26 16:40:17
> > > 0: cs:WFBitMapS st:Primary/Secondary ds:UpToDate/Outdated C r---
> > > ns:288064 nr:94516 dw:613836 dr:845384 al:122 bm:31 lo:0 pe:0 ua:0
> ap:0
> > > oos:95888
> > >
> > > On secondary:
> > >
> > > [root@vm6 ~]# cat /proc/drbd
> > > version: 8.2.6 (api:88/proto:86-88)
> > > GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> > > buildsvn@c5-i386-build, 2008-06-26 16:40:17
> > > 0: cs:WFBitMapT st:Secondary/Primary ds:Negotiating/UpToDate C r---
> > > ns:0 nr:0 dw:0 dr:0 al:106 bm:93 lo:0 pe:0 ua:0 ap:0 oos:95888
> > >
> > > /var/log/messages looks like:
> > >
> > > On primary:
> > >
> > > Jun 30 13:30:08 vm5 kernel: drbd0: peer( Secondary -> Unknown ) conn(
> Connected
> > > -> TearDown ) pdsk( UpToDate -> DUnknown )
> >
> > > On secondary:
> > >
> > > Jun 30 13:28:46 vm6 kernel: drbd0: peer( Primary -> Unknown ) conn(
> Connected
> > > -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> >
> > since you can "easily" reproduce this,
> > please get the time in sync on those boxes (ntp),
> > makes it much easier to read logs.
> >
> > then post new logs.
>
>
> but.
>
>
> > Jun 30 13:28:46 vm6 kernel: drbd0: receiver (re)started
> > Jun 30 13:28:46 vm6 kernel: drbd0: conn( Unconnected -> WFConnection )
> > Jun 30 13:28:46 vm6 kernel: drbd0: disk( UpToDate -> Diskless )
> > Jun 30 13:28:46 vm6 kernel: drbd0: Handshake successful: Agreed network
> protocol version 88
> > Jun 30 13:28:46 vm6 kernel: drbd0: conn( WFConnection -> WFReportParams )
> > Jun 30 13:28:46 vm6 kernel: drbd0: Starting asender thread (from
> drbd0_receiver [9457])
> > Jun 30 13:28:46 vm6 kernel: drbd0: data-integrity-alg: <not-used>
> > Jun 30 13:28:46 vm6 kernel: drbd0: peer( Unknown -> Primary ) conn(
> WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
>
>
> hm. going Connected while still "Diskless"...
>
>
> > Jun 30 13:28:55 vm6 kernel: drbd0: disk( Diskless -> Attaching )
>
>
> then attaching the disk,
>
>
> > Jun 30 13:28:55 vm6 kernel: drbd0: Found 4 transactions (145 active
> extents) in activity log.
> > Jun 30 13:28:55 vm6 kernel: drbd0: max_segment_size ( = BIO size ) =
> 32768
> > Jun 30 13:28:55 vm6 kernel: drbd0: reading of bitmap took 1 jiffies
> > Jun 30 13:28:55 vm6 kernel: drbd0: recounting of set bits took additional
> 0 jiffies
> > Jun 30 13:28:55 vm6 kernel: drbd0: 0 KB (0 bits) marked out-of-sync by on
> disk bit-map.
> > Jun 30 13:28:55 vm6 kernel: drbd0: disk( Attaching -> Negotiating )
>
>
> going "Negotiating"
>
>
> > Jun 30 13:28:55 vm6 kernel: drbd0: Writing meta data super block now.
>
>
> > Jun 30 13:28:55 vm6 kernel: drbd0: unexpected cstate (Connected) in
> receive_bitmap
>
>
> right there.
>
>
> > Jun 30 13:28:55 vm6 kernel: drbd0: conn( Connected -> WFBitMapT )
>
>
> according to this, it likely is a race between attaching the disk and
> establishing the connection. we had similar races before, the were
> believed to be fixed :(
>
> do you mind retrying with the latest git checkout?
>
> can you please post your excerciser script?
> which drbdadm / drbdsetup commands get executed in which order?
>
>
>
> --
> : Lars Ellenberg Tel +43-1-8178292-55 :
> : LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
> : Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
> _______________________________________________
> drbd-dev mailing list
> drbd-dev@lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-dev
>
[-- Attachment #1.2: Type: text/html, Size: 7510 bytes --]
[-- Attachment #2: test_drbd.pl --]
[-- Type: application/octet-stream, Size: 11803 bytes --]
#!/usr/bin/perl
#
# Test LVM DRBD configuration for use in the Parascale cloud
#
use Getopt::Long;
use PSCSTAF;
sub usage {
print STDERR "usage ${0} <conf_file>\n";
}
my $node1 = "vm5.jm";
my $node2 = "vm6.jm";
my $drbd_dev = "/dev/drbd0";
my $drbd_vg = "sm-vg";
my $drbd_vol = "sm-vol";
my $dits_name = "${drbd_vol}";
my $drbd_name = "sm-vol";
my $loops = 1;
my $dits_loops = 100;
GetOptions( 'loops=i' => \$loops,
'vg=i' => \$drbd_vg,
'vol=i' => \$drbd_vol
);
# Get config file
my $config = shift;
if (!defined $config) {
usage();
exit 1;
}
if (! -f $config) {
print STDERR "configuration file '${config}' does not exist\n";
usage();
exit 1;
}
# load the config file
require ${config};
my $dits_dir = shift;
if (!defined $dits_dir) {
usage();
exit 1;
}
#if (! -d $dits-dir) {
# print STDERR "'${dits_dir}' is not a directory\n";
# exit 1;
#}
if (! -f "${dits_dir}/src/dits.c") {
print STDERR "'${dits_dir} does not contain dits.c\n";
exit 1;
}
my $mgr = PSCSTAF::Manager->new("/Lang/Perl/psc/vmware");
if (!defined $mgr) {
print STDERR "Error: ${PSCSTAF::errstr}\n";
exit 2;
}
if (!defined $snlist) {
print STDERR "no snlist defined. exiting\n";
exit 2;
}
if (@{$snlist} < 2) {
print STDERR "not enough entries in snlist need 2\n";
exit 2;
}
# === Config steps:
# Roll back $node1 and $node2 using VM API's
# install 'kmod-drbd82' on $node1 and $node2
# configure LVM VG's and Volume on both nodes (this is new product CN script)
# configure DRBD on both nodes (this is is a new product CN script)
# pick a primary and sync it
# tests
# 1. Run dits on primary. When finished, create an snapshot on secondary and
# verify contents. Delete the snapshot.
# 2. Stop DRBD on secondary. Run dits on primary. When finished, re-start
# DRBD on secondary and wait for resync. Verify contents.
my $rdits_dir = "/var/tmp/dits_src";
my $rc = install_dits($dits_dir, $rdits_dir, $node1, $node2);
if (!$rc) {
print STDERR "Error: install_dits Failed\n";
exit 1;
}
# initialize_dits($remotedir, $node, $drbd_dev, $size)
my $rc = initialize_dits("$rdits_dir/src", $drbd_vol, $node1, $drbd_dev, 1000000);
if (!$rc) {
print STDERR "Error: initialize_dits Failed\n";
exit 1;
}
# verify_dits($remote_dir, $name, $pnode, $pdev, $rnode, $rvg, $rvol)
my $rc = verify_dits("$rdits_dir/src", $drbd_vol, $node1, $drbd_dev, $node2, $drbd_vg, $drbd_vol);
if (!$rc) {
print STDERR "Error: verify_dits Failed\n";
exit 1;
}
print "****** ${loops} loops through tests - countdown\n";
while ($loops > 0) {
print "\n**** Start Loop: ${loops} remaining\n\n";
if (!test_basic()) {
print STDERR "Error: Failed Core replication\n";
exit 1;
}
if (!test_shutdown_secondary()) {
print STDERR "Error: Failed Secondary up/down\n";
exit 1;
}
if (!test_shutdown_primary()) {
print STDERR "Error: Failed Failover/Failback\n";
exit 1;
}
$loops--;
}
exit 0;
sub test_basic {
print "===== Basic Replication Test\n";
# run_dits($remotedir, $name, $node, $drbd_dev)
my $rc = run_dits("$rdits_dir/src", $drbd_vol, $node1, $drbd_dev,
$dits_loops);
if (!$rc) {
print STDERR "Error: run_dits Failed\n";
return undef;
}
my $rc = verify_dits("$rdits_dir/src", $drbd_vol, $node1, $drbd_dev,
$node2, $drbd_vg, $drbd_vol);
if (!$rc) {
print STDERR "Error: verify_dits Failed\n";
return undef;
}
return 1;
}
sub test_shutdown_secondary {
# Test case 2 - Ensure that a down secondary recovers
my $rc = drbd_down($node2);
if (!$rc) {
print STDERR "Error: drbd_down Failed\n";
return undef;
}
# run_dits($remotedir, $name, $node, $drbd_dev)
my $rc = run_dits("$rdits_dir/src", $drbd_vol, $node1, $drbd_dev,
$dits_loops);
if (!$rc) {
print STDERR "Error: verify_dits Failed\n";
return undef;
}
# DRBD Up
my $rc = drbd_up($node2);
if (!$rc) {
print STDERR "drbd_up(${node2}) failed\n";
return undef;
}
my $rc = drbd_wait_resync($node2, $drbd_name);
if (!$rc) {
print STDERR "drbd_wait_resync(${node2}) failed\n";
return undef;
}
my $rc = verify_dits("$rdits_dir/src", $drbd_vol, $node1, $drbd_dev,
$node2, $drbd_vg, $drbd_vol);
if (!$rc) {
print STDERR "Error: verify_dits Failed\n";
return undef;
}
return 1;
}
sub test_shutdown_primary {
my $rc;
$rc = drbd_down($node1);
if (!$rc) {
return undef;
}
$rc = drbd_make_primary($node2);
if (!$rc) {
return undef;
}
# run_dits($remotedir, $name, $node, $drbd_dev)
my $rc = run_dits("$rdits_dir/src", $drbd_vol, $node2, $drbd_dev,
$dits_loops);
if (!$rc) {
return undef;
}
# DRBD Up
my $rc = drbd_up($node1);
if (!$rc) {
return undef;
}
my $rc = drbd_wait_resync($node1, $drbd_name);
if (!$rc) {
return undef;
}
my $rc = verify_dits("$rdits_dir/src", $drbd_vol, $node2, $drbd_dev,
$node1, $drbd_vg, $drbd_vol);
if (!$rc) {
return undef;
}
$rc = drbd_make_secondary($node2);
if (!$rc) {
return undef;
}
$rc = drbd_make_primary($node1);
if (!$rc) {
return undef;
}
return 1;
}
sub log_proc_listener {
my $l = shift;
print "--- Remote Process Complete ---\n";
print "node: " . $l->node . "\n";
print "dir: " . $l->dir . "\n";
print "cmd: " . $l->cmdline . "\n";
print "rc: " . $l->rc . "\n";
print "output:\n";
foreach my $line (split /\n/, $l->stdout) {
print "\t${line}\n";
}
}
sub log_fs_listener {
my $l = shift;
print "--- File System Operation Complete ---\n";
print "node: " . $l->node . "\n";
print "cmd: " . $l->cmdline . "\n";
print "rc: " . $l->rc . "\n";
}
# drbd_make_primary($node)
sub drbd_make_primary {
my $node = shift;
print "=== Make ${node} secondary\n";
my $l = $mgr->start_process($node, "/", "drbdadm",
"primary ${drbd_name}");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
return 1;
}
# drbd_make_secondary($node)
sub drbd_make_secondary {
my $node = shift;
print "=== Make ${node} secondary\n";
my $l = $mgr->start_process($node, "/", "drbdadm",
"secondary ${drbd_name}");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
return 1;
}
# drbd_wait_resync($node, $drbd_name)
sub drbd_wait_resync {
my $node = shift;
my $drbd_name = shift;
print "=== Wait for DRDB sync: node: ${node} dev:${drbd_name}\n";
# wait for DRBD to resync
my $resync = 1;
while ($resync) {
sleep 5;
my $l = $mgr->start_process($node, "/", "drbdadm",
"dstate ${drbd_name}");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
print "dits check DRBD status failed on node " . $l->node .
" rc:" . $l->rc . "\n";
print $l->stdout;
return undef;
}
my $so = $l->stdout;
chomp $so;
if ($so eq "UpToDate/UpToDate") {
$resync = 0;
}
if ($so eq "Negotiating/UpToDate") {
print STDERR "We didn't come back\n";
return undef;
}
if ($so eq "UpToDate/Negotiating") {
print STDERR "We didn't come back\n";
return undef;
}
}
return 1;
}
# drbd_down($node)
sub drbd_down {
my $node = shift;
print "=== Stop DRBD on ${node}\n";
my $l = $mgr->start_process($node, "/", "service", "drbd stop");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
return 1;
}
# drbd_up($node)
sub drbd_up {
my $node = shift;
print "=== Start DRBD on ${node}\n";
my $l = $mgr->start_process($node, "/", "service", "drbd start");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
return 1;
}
# verify_dits($remote_dir, $name, $pnode, $pdev, $rnode, $rvg, $rvol)
sub verify_dits {
my $dir = shift;
my $name = shift;
my $pnode = shift;
my $pdev = shift;
my $rnode = shift;
my $rvg = shift;
my $rvol = shift;
print "=== Verify dits workarea: nodes: ${pnode} ${rnode}\n";
if (defined $rnode) {
# copy dits results to secondary
print "\n=== copy dits state to secondary\n";
my $dits_state = "/var/tmp/dits-rtck-${name}";
my $l = $mgr->start_copy_file($pnode, $dits_state, $rnode, $dits_state);
$mgr->wait_all();
log_fs_listener($l);
if ($l->rc) {
return undef;
}
# Create snapshot on secondary
print "\n=== create snapshot on secondary\n";
my $l = $mgr->start_process($rnode, "/", "lvcreate",
"-s -L20M -n ${rvol}-snap " .
"/dev/${rvg}/${rvol}");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
}
# The dits check can take a while, so do them in parallel.
print "\n=== Verify data\n";
my $llist = [];
if (defined $rnode) {
# Check drbd snapshot on secondary
my $l = $mgr->start_process($rnode, "${rdits_dir}/src",
"./dits",
"-N ${name} -cv" .
" /dev/${rvg}/${rvol}-snap");
push(@{$llist}, $l);
}
if ($pnode ne "") {
# Check drbd device on primary
my $l = $mgr->start_process($pnode, "${dir}", "./dits",
"-N ${name} -cv ${pdev}");
push(@{$llist}, $l);
}
$mgr->wait_all();
my $errors = 0;
foreach my $l (@{$llist}) {
log_proc_listener($l);
if ($l->rc) {
$errors++;
}
}
if ($errors) {
return undef;
}
# If we ran on remote, clean up snapshot
if (defined $rnode) {
# Remove snapshot on secondary
print "\n=== remove snapshot on secondary\n";
my $l = $mgr->start_process($rnode, "/", "lvremove",
"-f /dev/${rvg}/${rvol}-snap");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
}
return 1;
}
# run_dits($remotedir, $name, $node, $drbd_dev, $dits_loops)
sub run_dits {
my $dir = shift;
my $dits_name = shift;
my $node = shift;
my $drbd_dev = shift;
my $dits_loops = shift;
# write on dits file
print "\n=== run 'dits -tv -I${dits_loops} $drbd_dev}' on ${node}\n";
my $l = $mgr->start_process($node, "${dir}", "./dits", "-tv" .
" -N ${dits_name} -I${dits_loops}" .
" ${drbd_dev}");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
return 1;
}
# initialize_dits($remotedir, $name, $node, $drbd_dev, $size)
sub initialize_dits {
my $dir = shift;
my $name = shift;
my $node = shift;
my $drbd_dev = shift;
my $size = shift;
print "=== dits init: node:${node} dev:${drbd_dev} size:${size}\n";
my $l = $mgr->start_process($node, "${dir}", "./dits",
"-N ${name} -iv ${drbd_dev} ${size}");
$mgr->wait_all();
log_proc_listener($l);
if ($l->rc) {
return undef;
}
return 1;
}
# install_dits($localdir, $remotedir, $node ...)
sub install_dits {
my $dits_dir = shift;
my $rdits_dir = shift;
# remove old dits directory
print "\n=== Remove old dits source (if it exists)\n";
my $llist = [];
foreach my $n (@_) {
my $l = $mgr->start_rmdir($n, $rdits_dir);
push(@{$llist}, $l);
}
$mgr->wait_all();
my $errors = 0;
foreach my $l (@{$llist}) {
log_fs_listener($l);
if ($l->rc && $l->rc != 48) {
$errors++;
}
}
if ($errors) {
return undef;
}
# copy over new dits source
print "\n=== Copy dits source to remote nodes\n";
my $llist = [];
foreach my $n (@_) {
my $l = $mgr->start_cpdir("local", $dits_dir, $n, $rdits_dir);
push(@{$llist}, $l);
}
$mgr->wait_all();
my $errors = 0;
foreach my $l (@{$llist}) {
log_fs_listener($l);
if ($l->rc) {
$errors++;
}
}
if ($errors) {
return undef;
}
print "\n=== Configure dits build to remote nodes\n";
my $llist = [];
foreach my $n (@_) {
my $l = $mgr->start_process($n, $rdits_dir, "/bin/bash", "./configure");
push(@{$llist}, $l);
}
$mgr->wait_all();
my $errors = 0;
foreach my $l (@{$llist}) {
log_proc_listener($l);
if ($l->rc) {
$errors++;
}
}
if ($errors) {
return undef;
}
print "\n=== make dits build on remote nodes\n";
my $llist = [];
foreach my $n (@_) {
my $l = $mgr->start_process($n, $rdits_dir, "make");
push(@{$llist}, $l);
}
$mgr->wait_all();
my $errors = 0;
foreach my $l (@{$llist}) {
log_proc_listener($l);
if ($l->rc) {
$errors++;
}
}
if ($errors) {
return undef;
}
return 1;
}
^ permalink raw reply [flat|nested] 4+ messages in thread