From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] Barrier assert failures with latest 8.0 sources
Date: Mon, 21 Jan 2008 17:36:38 +0100
Message-ID: <20080121163638.GC7594@barkeeper1.linbit>
In-Reply-To: <342BAC0A5467384983B586A6B0B3767107E89B46@EXNA.corp.stratus.com>
On Sat, Jan 19, 2008 at 11:40:35AM -0500, Graham, Simon wrote:
> > I'm attempting to run with the latest 8.0 sources from Git (plus a
> > couple of patches - basically the ones I have submitted that have not
> > yet been applied) and am seeing a lot of assert failures in the barrier
> > code since the latest change to send barriers as early as possible. A
> > representative trace for a device is attached - you will see that the
> > device gets connected then pauses resync (not sure if this is really
> > relevant) and then we start streaming the assert failures -- apparently
> > we are off by one barrier from this point on...
>
> Hmm.. maybe not as hard to diagnose as I thought -- when the drbd
> connection is lost, we end up calling tl_clear which clears out the
> transfer list _but_ leaves a single barrier in the list with number 4711
> and req-cnt 0 (so oldest_barrier and newest_barrier both point to this
> pseudo-barrier entry).
>
> When we reconnect and start processing requests again, when the first
> barrier is needed, it will be number 4712 and will get added to the list
> and the BarrierRq will be sent with this number. When the BarrierAck is
> received, oldest_barrier is still 4711 though, leading to the assert
> failure...
>
> I'm not sure why tl_clear leaves this pseudo-barrier in the list...
> shouldn't it simply leave the list completely empty just like tl_init
> does?
probably.
we have seen these ASSERTS, too, btw, also without this latest change in
the barrier code, so apparently it has been there all along.
unfortunately we are all sort of distracted right now.
but coding will resume shortly :)
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
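
The off-by-one described in the quoted message can be reproduced in a toy model. The sketch below is not DRBD code: the struct and function names (toy_tl, toy_send_barrier, toy_barrier_ack, count_barrier_mismatches) are hypothetical, and the only behaviour taken from the thread is that the list is left holding a single pseudo-barrier numbered 4711 with no requests, that the next barrier sent is 4712, and that each ack is checked against the oldest entry. The leave_pseudo == 0 case models the "leave the list completely empty" alternative Simon asks about; it is shown only for comparison, not as the actual fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not DRBD's real data structures. */
struct toy_barrier {
    unsigned int br_number;      /* barrier number sent on the wire */
    int n_req;                   /* requests in this epoch (unused here) */
    struct toy_barrier *next;
};

struct toy_tl {
    struct toy_barrier *oldest;  /* entry the next ack is checked against */
    struct toy_barrier *newest;  /* entry new requests would be added to */
    unsigned int last_nr;        /* last barrier number handed out */
};

/* Append a new barrier entry and return the number that would be sent. */
static unsigned int toy_send_barrier(struct toy_tl *tl)
{
    struct toy_barrier *b = malloc(sizeof(*b));
    assert(b != NULL);
    b->br_number = ++tl->last_nr;
    b->n_req = 0;
    b->next = NULL;
    if (tl->newest)
        tl->newest->next = b;
    else
        tl->oldest = b;
    tl->newest = b;
    return b->br_number;
}

/* Handle an ack: report whether it matches the oldest entry, then drop
 * that entry. Returns 1 on match, 0 on mismatch (the "assert failure"). */
static int toy_barrier_ack(struct toy_tl *tl, unsigned int nr)
{
    struct toy_barrier *b = tl->oldest;
    int ok = (b != NULL && b->br_number == nr);
    if (b) {
        tl->oldest = b->next;
        if (!tl->oldest)
            tl->newest = NULL;
        free(b);
    }
    return ok;
}

/* Reset the list as described in the thread (leave_pseudo != 0) or leave
 * it empty (leave_pseudo == 0), then run `rounds` send/ack cycles and
 * count how many acks did not match the oldest entry. */
int count_barrier_mismatches(int leave_pseudo, int rounds)
{
    struct toy_tl tl = { NULL, NULL, 4711 };
    int mismatches = 0;

    if (leave_pseudo) {
        /* Single pseudo-barrier: number 4711, no requests. */
        struct toy_barrier *b = malloc(sizeof(*b));
        assert(b != NULL);
        b->br_number = 4711;
        b->n_req = 0;
        b->next = NULL;
        tl.oldest = tl.newest = b;
    }

    for (int i = 0; i < rounds; i++) {
        unsigned int nr = toy_send_barrier(&tl);  /* 4712, 4713, ... */
        if (!toy_barrier_ack(&tl, nr))
            mismatches++;
    }

    /* Release whatever is still queued. */
    while (tl.oldest) {
        struct toy_barrier *b = tl.oldest;
        tl.oldest = b->next;
        free(b);
    }
    return mismatches;
}
```

In this model, count_barrier_mismatches(1, n) reports a mismatch on every one of the n acks, because the stale 4711 entry keeps the oldest pointer one barrier behind; count_barrier_mismatches(0, n) reports none.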