From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F93AC433EF for ; Tue, 28 Sep 2021 07:33:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7567660F46 for ; Tue, 28 Sep 2021 07:33:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239239AbhI1He4 (ORCPT ); Tue, 28 Sep 2021 03:34:56 -0400 Received: from mga11.intel.com ([192.55.52.93]:28679 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239099AbhI1He4 (ORCPT ); Tue, 28 Sep 2021 03:34:56 -0400 X-IronPort-AV: E=McAfee;i="6200,9189,10120"; a="221433385" X-IronPort-AV: E=Sophos;i="5.85,328,1624345200"; d="scan'208";a="221433385" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2021 00:33:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.85,328,1624345200"; d="scan'208";a="486424804" Received: from linux.intel.com ([10.54.29.200]) by orsmga008.jf.intel.com with ESMTP; 28 Sep 2021 00:33:14 -0700 Received: from [10.213.3.81] (mtkaczyk-MOBL1.ger.corp.intel.com [10.213.3.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by linux.intel.com (Postfix) with ESMTPS id 7ABDC5807C8; Tue, 28 Sep 2021 00:33:13 -0700 (PDT) Subject: Re: [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10 To: Song Liu Cc: linux-raid References: <20210917153452.5593-1-mariusz.tkaczyk@linux.intel.com> <20210917153452.5593-2-mariusz.tkaczyk@linux.intel.com> <83f06776-891f-dffa-7449-4128c419ada9@linux.intel.com> From: "Tkaczyk, Mariusz" Message-ID: Date: Tue, 28 Sep 2021 09:33:10 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org Hi Song, On 28.09.2021 00:59, Song Liu wrote: >>>>> + if (!test_bit(Faulty, &rdev->flags) && >>>>> + !test_bit(MD_BROKEN, &mddev->flags) && >>>>> + (bio->bi_opf & MD_FAILFAST)) { >>>> >>>> So with MD_BROKEN, we will not try to update the SB? >> Array is dead, is there added value in writing failed state? > > I think there is still value to remember this. Say a raid1 with 2 drives, > A and B. If B is unpluged from the system, we would like to update SB > on A to remember that. Otherwise, if B is pluged back later (maybe after > system reset), we won't tell which one has the latest data. > > Does this make sense? Removing one drive from raid1 array doesn't cause raid failure. So, removing B will be recorded on A. Raid1 is not good example because to fail array we need to remove all members, so MD_BROKEN doesn't matter because !test_bit(Faulty, &rdev->flags) is true. Let say that we have raid10 with member A, B, C, D. Member A is removed, and it is recorded correctly (we are degraded now). Next, member B is removed which causes array failure. W/ my patch: member B failure is not saved on members C and D. Raid is failed but it is not recorded in metadata. w/o my patch: member B failure is saved on C and D, and metadata is in failed state. >>>> >>>>> set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags); >>>>> set_bit(LastDev, &rdev->flags); >>>>> } >>>>> @@ -2979,7 +2980,8 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len) >>>>> int err = -EINVAL; >>>>> if (cmd_match(buf, "faulty") && rdev->mddev->pers) { >>>>> md_error(rdev->mddev, rdev); >>>>> - if (test_bit(Faulty, &rdev->flags)) >>>>> + >>>>> + if (!test_bit(MD_BROKEN, &rdev->mddev->flags)) >>>> >>>> I don't think this makes much sense. EBUSY for already failed array >>>> sounds weird. >>>> Also, shall we also set MD_BROKEN here? >>>> >>> Actually, we just called md_error above, so we don't need to set MD_BROKEN here. >>> But we shouldn't return EBUSY in such cases, right? >>> >> About EBUSY: >> This is how it is implemented in mdadm, we are expecting it in >> case of failure. See my fix[2]. >> I agree that it can be confusing, but this is how it is working. >> Do you want to change it across mdadm and md? >> This will break compatibility. >> >> About MD_BROKEN: >> As you see we are determining failure by checking rdev state, if "Faulty" >> in flags after md_error() is not set, then it assumes that array is >> failed and EBUSY is returned to userspace. > > This changed the behavior for raid0, no? > > W/o the change mdadm --fail on raid0 will get EBUSY. W/ this change, > it will get 0, and the device is NOT marked as faulty, right? > See commit mentioned in description. MD_BROKEN is used for raid0, so EBUSY is returned, same as w/o patch. Thanks, Mariusz