* [Drbd-dev] Troubleshooting digest failures?
@ 2008-08-06 14:56 Gregor Mosheh
2008-08-06 16:04 ` Lars Ellenberg
0 siblings, 1 reply; 3+ messages in thread
From: Gregor Mosheh @ 2008-08-06 14:56 UTC
To: drbd-dev
Hey guys. I've gotten no response from the user list, so maybe it's time
for a different tack debugging DRBD's innards...
I've been having a problem which I describe here. The last posting is
probably the most relevant.
http://www.gossamer-threads.com/lists/drbd/users/15119
How would I go about debugging this? Is there extra logging or debugging
which I can enable? Have any of you seen this before?
--
Gregor Mosheh / Greg Allensworth, BS, A+
System Administrator
HostGIS cartographic development & hosting services
http://www.HostGIS.com/
"Remember that no one cares if you can back up,
only if you can restore." - AMANDA
* Re: [Drbd-dev] Troubleshooting digest failures?
From: Lars Ellenberg @ 2008-08-06 16:04 UTC
To: drbd-dev
On Wed, Aug 06, 2008 at 08:56:25AM -0600, Gregor Mosheh wrote:
> Hey guys.
Hello again.
Sorry for the joke, but I cannot help it.
You know the story about "The hare and the hedgehog"?
> I've gotten no response from the user list,
now, that is not entirely true ;)
> so maybe it's time
> for a different tack debugging DRBD's innards...
>
> I've been having a problem which I describe here. The last posting is
> probably the most relevant.
> http://www.gossamer-threads.com/lists/drbd/users/15119
>
> How would I go about debugging this? Is there extra logging or
> debugging which I can enable? Have any of you seen this before?
Anyways,
apart from what I wrote in your thread, and the
"What causes nodes to become out-of-sync?" thread,
http://www.gossamer-threads.com/lists/drbd/users/15081
there is not much else I can say.
You said you have another cluster, not yet in production, where it has
not occurred so far, and you suggest it may be just the missing load that
makes it "appear" healthy.
How about using it as a test setup, and generating load on it
until you can provoke the symptom there, too?
Conversely, if you cannot provoke the symptom there,
I'd still point to hardware issues on the affected cluster.
Cheers,
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
* Re: [Drbd-dev] Troubleshooting digest failures?
From: Lars Ellenberg @ 2008-08-06 17:09 UTC
To: drbd-dev
On Wed, Aug 06, 2008 at 06:04:09PM +0200, Lars Ellenberg wrote:
> On Wed, Aug 06, 2008 at 08:56:25AM -0600, Gregor Mosheh wrote:
> > Hey guys.
>
> Hello again.
>
> Sorry for the joke, but I cannot help it.
> You know the story about "The hare and the hedgehog"?
>
> > I've gotten no response from the user list,
>
> now, that is not entirely true ;)
>
> > so maybe it's time
> > for a different tack debugging DRBD's innards...
> >
> > I've been having a problem which I describe here. The last posting is
> > probably the most relevant.
> > http://www.gossamer-threads.com/lists/drbd/users/15119
> >
> > How would I go about debugging this? Is there extra logging or
> > debugging which I can enable? Have any of you seen this before?
>
> Anyways,
> apart from what I wrote in your thread, and the
> "What causes nodes to become out-of-sync?" thread,
> http://www.gossamer-threads.com/lists/drbd/users/15081
> there is not much else I can say.
>
> You said you have another cluster, not yet in production, where it did
> not occur so far, and you suggest it may be just the missing load that
> makes it "appear" healthy.
>
> How about using it as test setup, and generate load on it,
> until you can provoke the symptom there, too?
>
> To reverse that, if you cannot provoke the symptom there,
> I'd still point to hardware issues on the affected cluster.
Also, please have a look at this thread, where I try to explain
why modifying in-flight data buffers would lead to these symptoms.
http://www.gossamer-threads.com/lists/drbd/users/15189
Also, when online-verify reports the out-of-sync sectors,
please do the
  # dd iflag=direct if=/dev/whatever bs=512 \
       skip=sector-offset count=size \
       of=nodename.dump
  # diff -U0 <(xxd node0.dump) <(xxd node1.dump)
trick (explained in the "What causes nodes to become out-of-sync?"
thread) to get a diff of the hexdumps, so we can tell whether there are
  single bit flips,
  multiple-word data changes, or
  completely unrelated stuff
in the corresponding sectors on the different nodes.
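
(Editorial sketch, not part of the original message.) The three cases
above can also be told apart mechanically once both dumps are on one
machine. The following is a minimal Python sketch under stated
assumptions: the dumps were taken with bs=512 as in the dd command, the
function names are hypothetical, and the "more than half of the bytes
differ" cut-off for calling a sector unrelated is an arbitrary choice,
not anything DRBD defines.

```python
import sys
from pathlib import Path

SECTOR = 512  # assumed to match bs=512 in the dd command above


def classify_sector(a: bytes, b: bytes) -> str:
    """Roughly classify how two copies of the same sector differ."""
    diff_offsets = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if not diff_offsets:
        return "identical"
    # Count differing bits across all differing bytes.
    flipped_bits = sum(bin(a[i] ^ b[i]).count("1") for i in diff_offsets)
    if flipped_bits == 1:
        return "single-bit flip"
    # Arbitrary threshold (an assumption): if more than half of the
    # bytes differ, treat the two sectors as unrelated data.
    if len(diff_offsets) > len(a) // 2:
        return "unrelated"
    return "partial change"


def compare_dumps(path0, path1):
    """Return (sector_index, classification) for each differing sector."""
    d0 = Path(path0).read_bytes()
    d1 = Path(path1).read_bytes()
    results = []
    for off in range(0, min(len(d0), len(d1)), SECTOR):
        kind = classify_sector(d0[off:off + SECTOR], d1[off:off + SECTOR])
        if kind != "identical":
            results.append((off // SECTOR, kind))
    return results


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_dumps.py node0.dump node1.dump
    for index, kind in compare_dumps(sys.argv[1], sys.argv[2]):
        print(f"sector {index} (relative to dump start): {kind}")
```

The sector indices it prints are relative to the start of the dump, so
add the skip= value used with dd to get the offset on the device. This
only narrows things down; the hexdump diff is still what to post.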
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :