From mboxrd@z Thu Jan 1 00:00:00 1970
From: "H. Peter Anvin"
Subject: Re: RAID-10 keeps aborting
Date: Tue, 04 Jun 2013 19:39:58 -0700
Message-ID: <51AEA4FE.3060900@zytor.com>
References: <51AC1440.7020505@zytor.com> <51AC3283.4000403@zytor.com> <51ACBAA0.40604@zytor.com> <51ACD511.4030604@zytor.com> <51AE2A8C.4080508@zytor.com> <51AE3441.3000208@zytor.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: Dan Williams
Cc: "Martin K. Petersen", linux-raid
List-Id: linux-raid.ids

On 06/04/2013 11:56 AM, Dan Williams wrote:
> On Tue, Jun 4, 2013 at 11:38 AM, H. Peter Anvin wrote:
>> Well, if that is what the block device layer is defined to do then that
>> is what the block layer does. It makes sense from the point of view of
>> a disk, there block layer has to translate and redo, so if the block
>> layer is defined to do that, why not rely on it?
>>
>
> I'm just hung up on when we can safely mark the array as not dirty.
> At a minimum this means raid needs a "I have an ignored-write-failure
> in-flight, awaiting retry from upper layer" state.
>

Ah yes, if you rely on the block layer to retry on your behalf, you
don't see the beginning and end of the entire transaction, and at least
ideally the RAID -- and the specific blocks -- should be marked dirty
for the duration of that operation. The same presumably applies to
DISCARD.

Yuck, this suddenly got complex. Perhaps WRITE SAME should simply be
disabled on raid1/raid10 until this can be addressed? Do we need to do
the same for DISCARD?

-hpa