From: Lars Ellenberg <Lars.Ellenberg@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] DRBD8: Panic in drbd_bm_write_sect() after an io error during resync.
Date: Fri, 16 Feb 2007 18:31:50 +0100	[thread overview]
Message-ID: <20070216173150.GA9147@soda.linbit> (raw)
In-Reply-To: <BD7042533C2F8943A6A4257A9E31C45439C8FE@EXNA.corp.stratus.com>

/ 2007-02-16 09:55:12 -0500
\ Montrose, Ernest:
> Phil,
> Thanks!
> 
> I think these panics on I/O errors are all related to the same bug.
> 
> Your comments made me look at this from a different angle. Looking at the
> logs around the failure shows a problem on repeated I/O errors: the state
> machine gets somewhat confused. It essentially goes from
> UpToDate->Failed, which is fine, then from Failed->Diskless, also fine,
> then we go and wait for mdev->local_cnt to be false, like you explained.
> Then we get more I/O errors, and our problem starts.
> We go from Diskless->Failed again. (This does not seem correct, since we
> just went from this state.)

even though I dislike our overall state engine design,
it may be enough to do

--- drbd/drbd_main.c    (revision 2754)
+++ drbd/drbd_main.c    (working copy)
@@ -604,6 +604,11 @@
                dec_local(mdev);
        }

+       /* If we are Diskless, we can only go to Attaching. */
+       if ( (os.disk == Diskless) && (ns.disk != Attaching) ) {
+               ns.disk = Diskless;
+       }
+
        /* Early state sanitising. Dissalow the invalidate ioctl to
         * connect  */
        if( (ns.conn == StartingSyncS || ns.conn == StartingSyncT) &&
                os.conn < Connected ) {
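
To illustrate the intent of the check added above outside the kernel, here
is a minimal standalone sketch. The enum and the function name
sanitize_disk_state() are made up for this example and only mirror the
state names seen in the log; this is not the drbd code, just the rule
"once Diskless, only Attaching may be entered" in isolation.

```c
/* Hypothetical subset of disk states, named after the log output. */
enum disk_state { Diskless, Attaching, Failed, UpToDate };

/*
 * Return the disk state that is actually allowed for a requested
 * transition os -> ns. A Diskless device may only move to Attaching;
 * any other request (for example Diskless -> Failed from a late
 * I/O error) is ignored and the state stays Diskless.
 */
static enum disk_state sanitize_disk_state(enum disk_state os,
                                           enum disk_state ns)
{
	if (os == Diskless && ns != Attaching)
		return Diskless;
	return ns;
}
```

With this rule, the Diskless -> Failed request from the second I/O error
in the log would be dropped instead of restarting the detach sequence.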


> Then Failed->Diskless again.
> We get more I/O errors... (not good)
> mdev->bc is set to NULL eventually.
> We go and wait again for mdev->local_cnt to be false... (not good)
> Now we die an awful ungodly death..:)
> 
> Here is the full log around the failure:
> Feb 15 16:01:57 captain kernel: end_request: I/O error, dev sda, sector 17554615
> Feb 15 16:01:57 captain kernel: drbd0: disk( UpToDate -> Failed )
> Feb 15 16:01:57 captain kernel: drbd0: Local IO failed. Detaching...
> Feb 15 16:01:57 captain kernel: drbd_io_error: EM--****** Handling an IO error***mdev->bc is valid***********************
> Feb 15 16:01:57 captain kernel: drbd0: disk( Failed -> Diskless )
> Feb 15 16:01:57 captain kernel: drbd0: Notified peer that my disk is broken.
> Feb 15 16:01:57 captain kernel: after_state_ch: EM-- *******Waiting for mdev->local_cnt to be FALSE ******
> Feb 15 16:01:57 captain kernel: end_request: I/O error, dev sda, sector 17554623
> Feb 15 16:01:57 captain kernel: drbd0: disk( Diskless -> Failed )

right. this is not allowed.

but this also means that our reference counting of in-flight local
requests is not ok, since once local_cnt is zero, there should be no
more in-flight requests to the local disk that might trigger
the end_io handler.
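
As a rough userspace sketch of the invariant I mean (not the drbd
implementation; struct dev, get_local(), put_local() and
safe_to_free_backing_dev() are hypothetical names chosen for this
example): every request submitted to the local disk holds one reference
until its completion handler has finished, and the backing device may
only be torn down once the count has reached zero.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device bookkeeping. */
struct dev {
	atomic_int  local_cnt;  /* in-flight requests to the local disk */
	atomic_bool detaching;  /* set once the local disk is given up */
};

/*
 * Take a reference before submitting local I/O. The count is raised
 * first and the flag checked afterwards, so a request can never slip
 * in unnoticed after the detach path has seen a count of zero.
 */
static bool get_local(struct dev *d)
{
	atomic_fetch_add(&d->local_cnt, 1);
	if (atomic_load(&d->detaching)) {
		atomic_fetch_sub(&d->local_cnt, 1);
		return false;
	}
	return true;
}

/*
 * Drop the reference as the last action of the completion handler.
 * Returns true if this was the last in-flight request.
 */
static bool put_local(struct dev *d)
{
	return atomic_fetch_sub(&d->local_cnt, 1) == 1;
}

/* The backing device may go away only when nothing is in flight. */
static bool safe_to_free_backing_dev(struct dev *d)
{
	return atomic_load(&d->detaching) &&
	       atomic_load(&d->local_cnt) == 0;
}
```

If a completion handler can still run after the count was observed as
zero, as the log suggests, then some submission path is not covered by
a reference in this sense.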

-- 
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
