From: Andrei Borzenkov
To: Vladimir 'φ-coder/phcoder' Serbinenko
Cc: The development of GNU GRUB
Subject: Re: [RFD] diskfilter stale RAID member detection vs. lazy scanning
Date: Tue, 12 Jan 2016 00:35:14 +0300
Message-ID: <56942012.6090206@gmail.com>
In-Reply-To: <55A764E7.7060705@gmail.com>
References: <20150628210655.6dfdbd9a@opensuse.site> <55A6A104.6060407@gmail.com> <20150716064201.0249fe57@opensuse.site> <55A764E7.7060705@gmail.com>

On 16.07.2015 11:01, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> On 16.07.2015 05:42, Andrei Borzenkov wrote:
>> On Wed, 15 Jul 2015 20:05:56 +0200
>> Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>>
>>> On 28.06.2015 20:06, Andrei Borzenkov wrote:
>>>> I was looking at implementing detection of outdated RAID members.
>>>> Unfortunately it appears to be fundamentally incompatible with lazy
>>>> scanning as implemented currently by GRUB. We simply cannot stop
>>>> scanning for other copies of metadata once "enough" was seen.
>>>> Because any other disk may contain a more up-to-date copy, which
>>>> invalidates everything seen up to this point.
>>>>
>>>> So basically either we officially admit that GRUB is not able to detect
>>>> stale members or we drop lazy scanning.
>>>>
>>>> Comments, ideas?
>>>>
>>> We don't need to see all disks to decide that there is no staleness. If
>>> you have an array with N devices and you can lose at most K of them,
>>> then you can check for staleness after you have seen max(K+1, N-K)
>>> drives. Why?
>>>
>>
>> That's not the problem. The problem is what to do if you see a disk with
>> generation N+1 after you have assembled the array with generation N. This
>> can mean that what we have seen is an old copy and we should throw it away
>> and start collecting a new one. If I read the Linux MD code correctly,
>> that is what it actually does. And this means we cannot stop scanning
>> even after the array is complete.
>>
> While it's true that it's possible that all the members we have seen are
> stale, it shouldn't be common and it's not the biggest problem. The biggest
> problem is inconsistency.
> We can never guarantee that we have seen all the disks, as they may not
> even be visible through firmware, but that shouldn't stop us from fixing
> the inconsistency problem.
>> An extreme example is a three-piece mirror where each piece is actually
>> perfectly valid and usable by itself, so losing two of them still means
>> we can continue to work with the remaining one.
>>
> Mirrors get completely assembled in my patch.
>

I fixed a trivial read error in the raid1/raid10 case (see attached patch).
It works in naive testing. We need regression tests for stale data.
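
For illustration, the generation rule discussed above can be sketched in a few lines of C. This is only a model under stated assumptions: the structure and function names (array_scan, scan_add_member and so on) are hypothetical and are not taken from GRUB or Linux MD, and the threshold is simply the max(K+1, N-K) expression from the quoted message, not a verified bound.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-array bookkeeping; not a GRUB structure.  */
struct array_scan
{
  uint64_t generation;    /* newest generation seen so far */
  size_t seen_members;    /* all members seen, stale or not */
  size_t fresh_members;   /* members carrying the newest generation */
  size_t total_devices;   /* N */
  size_t max_lost;        /* K, assumed to be smaller than N */
};

/* max(K+1, N-K), as proposed in the quoted message.  Assumes k < n.  */
size_t
staleness_threshold (size_t n, size_t k)
{
  size_t a = k + 1;
  size_t b = n - k;
  return a > b ? a : b;
}

/* Record one member.  Returns true if the member belongs to the newest
   generation seen so far, false if it is stale.  */
bool
scan_add_member (struct array_scan *s, uint64_t member_generation)
{
  s->seen_members++;

  if (s->fresh_members == 0 || member_generation > s->generation)
    {
      /* A newer generation invalidates everything collected so far,
         so start collecting again from this member.  */
      s->generation = member_generation;
      s->fresh_members = 1;
      return true;
    }

  if (member_generation < s->generation)
    return false;

  s->fresh_members++;
  return true;
}

/* True once enough members have been seen for the proposed check.  */
bool
scan_can_check_staleness (const struct array_scan *s)
{
  return s->seen_members
	 >= staleness_threshold (s->total_devices, s->max_lost);
}
```

With N = 4 and K = 1 the threshold is 3. Note that scan_add_member can still reset the collection after that point, which is exactly the objection raised above: reaching the threshold does not rule out a later member with a higher generation.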

Attachment: 0001-Fix-reading-from-RAID1-and-RAID10.patch (text/x-patch)

From 2611f7a1649e9564cf65b1312bd76e5f3feb3a3e Mon Sep 17 00:00:00 2001
From: Andrei Borzenkov
Date: Mon, 11 Jan 2016 23:41:13 +0300
Subject: [PATCH] Fix reading from RAID1 and RAID10

Need to set error if current disk is stale.
---
 grub-core/disk/diskfilter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/grub-core/disk/diskfilter.c b/grub-core/disk/diskfilter.c
index 7fea4c0..d779a0a 100644
--- a/grub-core/disk/diskfilter.c
+++ b/grub-core/disk/diskfilter.c
@@ -782,6 +782,9 @@ read_segment (struct grub_diskfilter_segment *seg, grub_disk_addr_t sector,
 		  && err != GRUB_ERR_UNKNOWN_DEVICE)
 		return err;
 	    }
+	  else
+	    err = GRUB_ERR_READ_ERROR;
+
 	  k++;
 	  if (k == seg->node_count)
 	    k = 0;
-- 
1.9.1
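
To show why the else branch in the patch matters, here is a minimal self-contained model of a mirror read loop. It is a sketch only: mirror_copy, mirror_read and the status values are hypothetical names invented for this example, and the loop is a simplification, not the actual read_segment code from diskfilter.c. The point is that when a stale copy is skipped, the error variable has to be set explicitly, otherwise it can keep a "success" value even though nothing was read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes for this model only.  */
enum read_status { READ_OK = 0, READ_ERROR = 1 };

/* Hypothetical model of one mirror copy.  */
struct mirror_copy
{
  bool stale;      /* metadata is older than the array's generation */
  bool readable;   /* whether a read from this copy succeeds */
  int data;        /* stand-in for the sector contents */
};

/* Try every copy once, starting at index START and wrapping around.
   Returns READ_OK and fills *OUT on the first successful read from a
   copy that is not stale.  */
enum read_status
mirror_read (const struct mirror_copy *copies, size_t count,
	     size_t start, int *out)
{
  enum read_status err = READ_OK;
  size_t tried;
  size_t k;

  if (count == 0)
    return READ_ERROR;

  k = start % count;
  for (tried = 0; tried < count; tried++)
    {
      if (!copies[k].stale)
	{
	  if (copies[k].readable)
	    {
	      *out = copies[k].data;
	      return READ_OK;
	    }
	  err = READ_ERROR;
	}
      else
	/* Skipping a stale copy must count as a failed attempt.  Without
	   this assignment err could still be READ_OK after the loop.  */
	err = READ_ERROR;

      k++;
      if (k == count)
	k = 0;
    }

  return err;
}
```

In this model, if every copy is stale and the else branch is removed, mirror_read returns READ_OK without ever writing to *out. That appears to be the kind of error the patch description refers to.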