Date: Thu, 30 Dec 2010 04:13:35 +0100
From: Spelic
Subject: Re: [linux-lvm] pvmove painfully slow on parity RAID
To: linux-lvm@redhat.com

On 12/30/2010 03:42 AM, Stuart D. Gathman wrote:
> On Wed, 29 Dec 2010, Spelic wrote:
>> I tried multiple times for every device with consistent results, so I'm
>> pretty sure these are actual numbers.
>> What's happening?
>> Apart from the amazing difference of parity raid vs nonparity raid, with
>> parity raid it seems to vary randomly with the number of devices and the
>> chunksize..?
>
> This is pretty much my experience with parity raid all around. Which
> is why I stick with raid1 and raid10.

Parity raid is fast for me under normal filesystem operations; that's why I
suspect some strict sequentiality is being enforced here.

> That said, the sequential writes of pvmove should be fast for raid5 *if*
> the chunks are aligned so that there is no read/modify/write cycle.
>
> 1) Perhaps your test targets are not properly aligned?

Aligned to zero, yes (the arrays are empty right now), but all the raids
have different chunk sizes and stripe sizes, as I reported, and all of them
are bigger than the LVM chunk size, which is 1M for the VG.

> 2) Perhaps the raid5 implementation (hardware? linux md?
> experimental lvm raid5?) does a read modify write even when it
> doesn't have to.
>
> Your numbers sure look like read/modify/write is happening for some reason.

OK, but strict sequentiality is probably being enforced too aggressively.
There must be some barrier or flush-and-wait going on at each tiny piece of
data (at each LVM chunk, maybe?). Are you an LVM developer?

Consider that a sequential dd write runs at hundreds of megabytes per second
on my arrays, not hundreds of... kilobytes! Even random I/O goes *much*
faster than this, as long as one stripe does not have to wait for another
stripe to be fully updated (i.e. sequentiality is not enforced from the
application layer). If pvmove wrote out 100MB before each sync or flush, I'm
pretty sure I would see speeds almost 100 times higher.
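For what it's worth, this is the kind of quick comparison I have in mind
(just a sketch; the LV path and the 200MB size are made-up examples, and it
is destructive to whatever is on that LV): a plain O_DIRECT sequential write
versus the same write forced to sync after every 1M block, which is roughly
the flush-after-each-chunk pattern I suspect pvmove of:

    # Hypothetical scratch LV; do NOT run this against anything with data on it.
    # Plain O_DIRECT sequential write, 1M blocks, no per-block flush:
    dd if=/dev/zero of=/dev/testvg/scratchlv bs=1M count=200 oflag=direct

    # Same write, but synced after every 1M block (oflag=dsync),
    # i.e. a flush-and-wait after each small chunk:
    dd if=/dev/zero of=/dev/testvg/scratchlv bs=1M count=200 oflag=direct,dsync

On a parity array I would expect the first to run at full sequential speed
and the second to crawl, because every 1M write has to complete (including
any read-modify-write) before the next one is issued.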
Also, there is still the mystery of why the times seem *randomly* related to
the number of devices, chunk sizes, and stripe sizes! If the RMW cycle were
the culprit, how come I see:

raid5, 4 devices, 16384k chunk: 41sec (4.9MB/sec)
raid5, 6 devices, 4096k chunk: 2m18sec ?!?! (1.44MB/sec!?)

The first has a much larger stripe size (49152K); the second has 20480K!

Thank you
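P.S. To make the stripe arithmetic above explicit (stripe_kib is just a
throwaway helper for this mail, not an existing tool): an N-device raid5 has
N-1 data chunks per full stripe, so

    # full-stripe data width = (devices - 1) * chunk size
    stripe_kib() { echo $(( ($1 - 1) * $2 )); }
    stripe_kib 4 16384   # 49152 KiB  (4-device raid5, 16384k chunk)
    stripe_kib 6 4096    # 20480 KiB  (6-device raid5, 4096k chunk)

So the array that needs the *larger* aligned write to avoid a
read-modify-write is the one going roughly three times faster, which is
backwards from what the RMW theory predicts.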