* [Drbd-dev] DRBD8: Panic in drbd_uuid_compare due to mdev->bc being null
@ 2007-06-11 20:17 Montrose, Ernest
2007-06-12 14:39 ` Philipp Reisner
From: Montrose, Ernest @ 2007-06-11 20:17 UTC (permalink / raw)
To: Philipp Reisner, drbd-dev
Hi all,
We are seeing a panic that occurs while syncing. Essentially, if you are
primary, are syncing, and get an I/O error, then the next attach can
panic, especially if that attach happens quickly after the
detach.
I think what's happening is this:
* The local disk dies and we transition to "Diskless".
* after_state_ch() is supposed to call drbd_free_bc() to free mdev->bc.
* But before we can free mdev->bc, mdev->local_cnt has to be 0, and
in this case it was not (not too sure why). So we wait for
mdev->local_cnt to become 0.
* While waiting, an "Attach" request comes in. The ASSERT complains that
mdev->bc is not NULL, but we brush it off, set a new bc and leak the old one.
* The wait in after_state_ch() is now over. We free the new mdev->bc that
the "attach" had set.
* We call drbd_sync_handshake(), access a NULL mdev->bc and we die.
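The ordering described in the list above can be replayed in a small user-space model. This is only a sketch: `struct backing_dev`, `struct device`, `attach_unguarded()`, `finish_detach()` and `replay_race()` are hypothetical stand-ins invented for illustration, not the real DRBD definitions, and the two steps are run back to back instead of on separate kernel contexts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRBD structures; NOT the real definitions. */
struct backing_dev { int id; };

struct device {
	struct backing_dev *bc;	/* models mdev->bc */
	int local_cnt;		/* models mdev->local_cnt */
};

/* Models an attach that arrives while the detach path is still waiting:
 * the old pointer is simply overwritten, so the old object is leaked. */
static void attach_unguarded(struct device *d, struct backing_dev *nbc)
{
	d->bc = nbc;
}

/* Models the end of the wait in the detach path: the reference count has
 * dropped, so whatever d->bc points at NOW is released and cleared. */
static void finish_detach(struct device *d)
{
	d->local_cnt = 0;
	d->bc = NULL;
}

/* Replays the interleaving and returns 1 if the code that runs after the
 * attach would find a NULL backing device, 0 otherwise. */
int replay_race(void)
{
	struct backing_dev old_bc = { 1 };
	struct backing_dev new_bc = { 2 };
	struct device d = { &old_bc, 1 };	/* disk failed, one reference held */

	attach_unguarded(&d, &new_bc);		/* attach wins the race */
	finish_detach(&d);			/* detach clears the NEW pointer */

	return d.bc == NULL;
}
```

Calling `replay_race()` reports whether the handshake step in this model would see a NULL pointer. The model says nothing about why the reference count was non-zero in the first place.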
A quick fix is to fail the attach request if mdev->bc is still
not NULL after a short wait. Patch included.
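A bounded wait of this kind can be sketched in user space as follows. The names `struct device`, `tick()` and `attach_guard()` are hypothetical, and `tick()` merely simulates the passage of time; in the kernel, the attached patch sleeps with `schedule_timeout(HZ/10)` instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the device; NOT the real DRBD structure. */
struct device {
	void *bc;		/* models mdev->bc */
	int ticks_until_free;	/* simulated time until the detach path clears bc */
};

/* Stand-in for one short sleep: advances simulated time by one tick and
 * clears bc once the simulated detach path has finished. */
static void tick(struct device *d)
{
	if (d->ticks_until_free > 0 && --d->ticks_until_free == 0)
		d->bc = NULL;
}

/* Returns 0 when the attach may proceed (bc is NULL), or -1 when bc is
 * still set after max_tries waits and the attach should be refused. */
int attach_guard(struct device *d, int max_tries)
{
	int ntries;

	for (ntries = 0; d->bc != NULL; ntries++) {
		if (ntries == max_tries)
			return -1;	/* give up: old backing device still present */
		tick(d);
	}
	return 0;
}
```

With `max_tries` set to 5, an attach that arrives while the old backing device is about to be released goes through after a few waits, and one that arrives while it is stuck is refused instead of overwriting the pointer.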
EM--
[-- Attachment #2: drbd_nl.patch --]
[-- Type: application/octet-stream, Size: 1107 bytes --]
Index: drbd/drbd_nl.c
===================================================================
--- drbd/drbd_nl.c (revision 14981)
+++ drbd/drbd_nl.c (working copy)
@@ -676,7 +676,7 @@
 	struct inode *inode, *inode2;
 	struct lru_cache* resync_lru = NULL;
 	drbd_state_t ns,os;
-	int rv;
+	int rv,ntries=0;
 
 	/* if you want to reconfigure, please tear down first */
 	if (mdev->state.disk > Diskless) {
@@ -684,6 +684,23 @@
 		goto fail;
 	}
 
+	/*
+	 * We may have gotten here very quickly from a detach. Wait for a bit
+	 * then fail.
+	 */
+	while(mdev->bc != NULL && ntries <= 5) {
+		if(ntries == 5)
+		{
+			WARN("drbd_nl_disk_conf: mdev->bc not NULL. Giving up!\n");
+			retcode=HaveDiskConfig;
+			goto fail;
+		}
+		WARN("drbd_nl_disk_conf: mdev->bc not NULL. Waiting..\n");
+		set_current_state(TASK_INTERRUPTIBLE);
+		schedule_timeout(HZ/10);
+		++ntries;
+	}/*End while*/
+
 	nbc = kmalloc(sizeof(struct drbd_backing_dev),GFP_KERNEL);
 	if(!nbc) {
 		retcode=KMallocFailed;
* Re: [Drbd-dev] DRBD8: Panic in drbd_uuid_compare due to mdev->bc being null
From: Philipp Reisner @ 2007-06-12 14:39 UTC (permalink / raw)
To: drbd-dev; +Cc: Montrose, Ernest
On Monday 11 June 2007 22:17:25 Montrose, Ernest wrote:
> Hi all,
>
> We are seeing a panic that occurs while syncing. Essentially, if you are
> primary, are syncing, and get an I/O error, then the next attach can
> panic, especially if that attach happens quickly after the
> detach.
>
> I think what's happening is this:
> * The local disk dies and we transition to "Diskless".
> * after_state_ch() is supposed to call drbd_free_bc() to free mdev->bc.
> * But before we can free mdev->bc, mdev->local_cnt has to be 0, and
> in this case it was not (not too sure why). So we wait for
> mdev->local_cnt to become 0.
> * While waiting, an "Attach" request comes in. The ASSERT complains that
> mdev->bc is not NULL, but we brush it off, set a new bc and leak the old one.
> * The wait in after_state_ch() is now over. We free the new mdev->bc that
> the "attach" had set.
> * We call drbd_sync_handshake(), access a NULL mdev->bc and we die.
>
> A quick fix is to fail the attach request if mdev->bc is still
> not NULL after a short wait. Patch included.
>
> EM--
Hi Ernest,
Thanks! I just changed it to Linux coding style and shortened it a
bit. It is committed.
-phil
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, 1120 Vienna, Austria http://www.linbit.com :