Distributed Replicated Block Device (DRBD) development
* RE: [Drbd-dev] DRBD-8 - handling data write errors
@ 2007-01-11  4:00 Graham, Simon
  2007-01-11 10:01 ` Lars Ellenberg
From: Graham, Simon @ 2007-01-11  4:00 UTC
  To: Graham, Simon, drbd-dev

> I'm not really sure how to fix this at the moment, but I'm considering
> the following:
> 
> 1. The side that gets the error marks the block as out of sync AND
>    marks the local disk as inconsistent.
> 2. Receipt of a NegAck causes the block to be marked as out of sync AND
>    the peer disk is made inconsistent
>    (not sure if I need this step since step 1 should cause this fact to
>    be broadcast but it seems safer).
> 

So - I've found there is some existing code in place already - for
example, set_out_of_sync is done in req_may_be_done if either local or
remote fails. However, this is not sufficient for a couple of reasons:

1. Need to get the failing disk set Inconsistent so that following reads
   do not attempt to use the local block.

2. It seems to me that the current code doesn't really handle
   set_out_of_sync being set whilst resync is in progress (i.e. if a
   write error occurs on an application write during resync).

I've also coded something that sends the Inconsistent state to the other
side, which will trigger resync immediately - perhaps I shouldn't do
this? Resync is not really going to be able to fix this problem
(although it might be worth trying if the error was transient)...

I wonder if we shouldn't instead simply always detach on error (i.e.
stop using PassOn at all) to get the best behavior... this would
certainly make things simpler (and we could remove the forcible detach
on meta-data error that I added earlier) -- if you want to be able to
handle errors then never use PassOn!
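
To make the trade-off between the two policies concrete, here is a minimal
sketch in Python. It is a toy model, not DRBD code: the names OnIoError,
LocalDisk, on_write_error and may_read_locally are all made up for
illustration, and the real handling lives in the kernel module.

```python
from enum import Enum


class OnIoError(Enum):
    # Illustrative names only; they mirror the two policies discussed above.
    PASS_ON = "pass_on"
    DETACH = "detach"


class LocalDisk:
    """Toy model of one node's backing disk (hypothetical, not DRBD code)."""

    def __init__(self, policy):
        self.policy = policy
        self.attached = True
        self.out_of_sync = set()  # stand-in for the out-of-sync bitmap

    def on_write_error(self, block):
        # Always remember which block failed so a later resync can find it.
        self.out_of_sync.add(block)
        if self.policy is OnIoError.DETACH:
            # Large hammer: stop using the local disk entirely.
            self.attached = False

    def may_read_locally(self, block):
        # With DETACH the single 'attached' flag is enough to keep reads
        # away from bad data; with PASS_ON every read also needs the
        # per-block check.
        return self.attached and block not in self.out_of_sync
```

With DETACH, one failed write makes every block unreadable locally, good
ones included. With PASS_ON only the failed block is excluded, but only if
the read path really consults the per-block state.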

Simon

* RE: [Drbd-dev] DRBD-8 - handling data write errors
@ 2007-01-11 15:50 Graham, Simon
From: Graham, Simon @ 2007-01-11 15:50 UTC
  To: Lars Ellenberg, drbd-dev

> this is not a short-term project, but
> how about this:
>  introduce an additional "badblocks" bitmap -- actually, I think
>  probably a "range-list" type of storage would be appropriate here.
> 
>  local read error:
>     mark dirty, read full blocks remotely (which may be more than the
>     application requested), write -- written ok: mark clean again.
>  local write error:
>     mark block (range) as bad,
>     mark system "degraded"
>  both blocks bad, or remote not reachable:
>     pass to upper layers.
> 

You are right - it's not short term! Also:
. I think it'd be necessary to write this new badblocks structure to the
  on-disk meta-data, so we'd need to allocate space for it.
. We'd then need to deal with the case of having no more space to record
  badblocks (the disk is pretty toasty in this case - maybe just detach).
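
As a rough illustration of the "range-list" idea together with the
out-of-space case, here is a small Python sketch. It is hypothetical
throughout: BadBlockList, its capacity parameter and the half-open
[start, end) sector ranges are assumptions made for the example, and nothing
here reflects an actual on-disk meta-data layout.

```python
class BadBlockList:
    """Fixed-capacity list of disjoint [start, end) bad ranges (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity  # number of range slots reserved in meta-data
        self.ranges = []          # sorted, non-overlapping, non-adjacent

    def add(self, start, end):
        """Record [start, end) as bad.

        Returns False, leaving the list unchanged, when no slot is left.
        The caller could then fall back to detaching the disk.
        """
        merged = []
        for s, e in self.ranges:
            if e < start or s > end:
                merged.append((s, e))  # unrelated range, keep as is
            else:
                # Overlapping or touching: fold it into the new range.
                start, end = min(s, start), max(e, end)
        merged.append((start, end))
        merged.sort()
        if len(merged) > self.capacity:
            return False
        self.ranges = merged
        return True

    def contains(self, sector):
        return any(s <= sector < e for s, e in self.ranges)
```

Merging adjacent ranges matters here: a run of neighbouring failures uses
one slot rather than one per block, which pushes back the point where the
list fills up.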

You know, the underlying disks already include a lot of this
functionality and the more I think about it, the more convinced I am
that detaching on any error is the right thing to do -- 
. DRBD already (I think) correctly handles things if you re-attach
  following this (it'll try to resync the failed blocks and if that
  fails it would detach again).
. Although this seems like you end up doing a lot of work, these errors
  are unlikely so I think it's OK to use a large hammer.
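
The re-attach behaviour described in the first point can be sketched as
below. This is only a model of the described cycle under stated assumptions:
reattach and write_block are invented names, and write_block stands in for
"copy this block from the peer and write it to the local disk".

```python
def reattach(out_of_sync, write_block):
    """Model a re-attach followed by resync of the failed blocks.

    out_of_sync: iterable of block numbers recorded before the detach.
    write_block: callable(block) -> bool, True if the local write succeeded.
    Returns (state, remaining), where state is "attached" or "detached".
    """
    remaining = set(out_of_sync)
    for block in sorted(remaining):
        if not write_block(block):
            # The error was not transient: detach again and keep the record.
            return "detached", remaining
        remaining.discard(block)
    return "attached", remaining
```

If the original error was transient, every write succeeds and the disk ends
up attached with nothing left to resync. If the block is really bad, the
disk goes straight back to detached and the bad block stays recorded.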

I'm going to do some experiments with the error handler set to Detach -
will report back on results.
Simon


* [Drbd-dev] DRBD-8 - handling data write errors
@ 2007-01-10 20:50 Graham, Simon
From: Graham, Simon @ 2007-01-10 20:50 UTC
  To: drbd-dev

A few months ago, I added support to handle write errors for the
meta-data portions of the disk (by forcibly detaching the disk when this
occurs).

Now I'm looking at handling write errors in the data portion and
wondering what the best approach would be - the current behavior is
definitely wrong because we end up with the two plexes having
inconsistent data with no record of the fact in the bitmap or other disk
state. This is true no matter what flavor of error handling is used.

I'm not really sure how to fix this at the moment, but I'm considering
the following:

1. The side that gets the error marks the block as out of sync AND marks
the local disk as inconsistent.
2. Receipt of a NegAck causes the block to be marked as out of sync AND
the peer disk is made inconsistent 
   (not sure if I need this step since step 1 should cause this fact to
be broadcast but it seems safer).
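
The two steps above can be sketched as a small state model. This is an
illustration only, under the assumption that each node tracks its own disk
state, its view of the peer's disk state, and a set of out-of-sync blocks
standing in for the bitmap. Node and its method names are invented for the
example.

```python
from dataclasses import dataclass, field
from enum import Enum


class DiskState(Enum):
    UP_TO_DATE = "UpToDate"
    INCONSISTENT = "Inconsistent"


@dataclass
class Node:
    """Hypothetical per-node view; not the real DRBD data structures."""
    disk: DiskState = DiskState.UP_TO_DATE        # our own disk
    peer_disk: DiskState = DiskState.UP_TO_DATE   # our view of the peer's disk
    out_of_sync: set = field(default_factory=set)  # stand-in for the bitmap

    def on_local_write_error(self, block):
        # Step 1: the side that saw the error records it and marks
        # its own disk inconsistent.
        self.out_of_sync.add(block)
        self.disk = DiskState.INCONSISTENT

    def on_neg_ack(self, block):
        # Step 2: the other side learns of the failure via the NegAck,
        # records the block and marks the peer disk inconsistent.
        self.out_of_sync.add(block)
        self.peer_disk = DiskState.INCONSISTENT
```

For a write to block 7 that fails on the secondary, the secondary would call
on_local_write_error(7) and the primary on_neg_ack(7). Both then agree that
block 7 is out of sync and that the secondary's disk is the inconsistent
one.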

I'm not sure if I need to bump the UUID info as well here to ensure
resync happens correctly in the future?

I also considered forcibly detaching the disk in this case but rejected
that as it makes the rest of the disk unavailable when the error might
only be in one block.

One last problem to be considered is when we get write errors in
different blocks on both disks -- they can't both be inconsistent (how
would we know which way to resync?); I'm not sure what the right answer
is here - anyone have any suggestions?

Thanks,
Simon

