From mboxrd@z Thu Jan 1 00:00:00 1970 From: Takahiro Yasui Subject: Re: [PATCH 7/7] Hold all write bios when errors are handled Date: Mon, 23 Nov 2009 12:54:47 -0500 Message-ID: <4B0ACC67.6060805@redhat.com> References: <20091123055835.GA28981@us.ibm.com> Reply-To: device-mapper development Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20091123055835.GA28981@us.ibm.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com To: dm-devel@redhat.com List-Id: dm-devel.ids On 11/23/09 00:58, malahal@us.ibm.com wrote: > Mikulas Patocka [mpatocka@redhat.com] wrote: >> @@ -730,10 +719,25 @@ static void do_failures(struct mirror_se >> if (!ms->log_failure) { >> ms->in_sync = 0; >> dm_rh_mark_nosync(ms->rh, bio); >> + } >> + /* >> + * If all the legs are dead, fail the I/O. >> + * >> + * If we are not using dmeventd, we pretend that the I/O >> + * succeeded. This is wrong (the failed leg might come online >> + * again after reboot and it would be replicated back to >> + * the good leg) but it is consistent with current behavior. >> + * For proper behavior, dm-raid1 shouldn't be used without >> + * dmeventd at all. >> + * >> + * If we use dmeventd, hold the bio until dmeventd does its job. >> + */ >> + if (!get_valid_mirror(ms)) >> + bio_endio(bio, -EIO); >> + else if (!errors_handled(ms)) >> bio_endio(bio, 0); >> - } else { >> + else >> hold_bio(ms, bio); > > You seem to be holding the I/O that has failed, but what about writes > that are issued after this failure. They seem to go through just fine. > Did I miss something? I think do_writes() needs to check for a leg > failure (just like it does for log_failure already), and put them on > failure/hold list, right? On 09/09/09 16:18, Takahiro Yasui wrote: > - As I mentioned before, bios which are sent to out-of-sync regions also > need to be blocked because bios to out-of-sync regions are processed by > generic_make_request() and would return the "success" to upper layer > without dm-raid1 notices it. This might cause data corruption. > https://www.redhat.com/archives/dm-devel/2009-July/msg00118.html I think you are right. I also commented above two months ago, but I haven't got comments from Mikulas. > Also, we do need to do the above work only if "primary" leg fails. We > can continue to work just like the old code if "secondary" legs fail, > right? Not sure if this is worth optimizing though, but I would like to > see it implemented as it is just a few extra checks. We can have > primary_failure field like log_failure field. Good idea. This data corruption we are talking could happen only if the primary leg fails. Keeping system running in case of non-primary legs looks reasonable. Taka