From: Wido den Hollander
Subject: Re: "rbd rm image" slow with big images ?
Date: Thu, 31 May 2012 21:39:28 +0200
Message-ID: <4FC7C8F0.9040700@widodh.nl>
References: <4FC7B528.30609@widodh.nl> <4FC7B57C.8000403@profihost.ag>
In-Reply-To: <4FC7B57C.8000403@profihost.ag>
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Stefan Priebe
Cc: ceph-devel@vger.kernel.org

On 05/31/2012 08:16 PM, Stefan Priebe wrote:
> One note:
> he has written:
> "then just delete it, without having written anything to the image"

That is true, but RBD doesn't know that.

There is no record of which objects were created and which weren't, so
the removal process has to issue a removal for each RBD object that
might exist.

That is the nature of RBD. It keeps it simple and reliable.

Wido

>
> On 31.05.2012 20:15, Wido den Hollander wrote:
>> Hi,
>>
>> On 05/31/2012 09:12 AM, Alexandre DERUMIER wrote:
>>> Hi,
>>>
>>> I'm trying to delete some rbd images with rbd rm,
>>> and it seems to be "slow" with big images.
>>>
>>> I'm testing it by just creating a new image (1TB):
>>>
>>> # time rbd -p pool1 create --size 1000000 image2
>>>
>>> real 0m0.031s
>>> user 0m0.015s
>>> sys 0m0.010s
>>>
>>> then just delete it, without having written anything to the image
>>>
>>> # time rbd -p pool1 rm image2
>>> Removing image: 100% complete...done.
>>>
>>> real 1m45.558s
>>> user 0m14.683s
>>> sys 0m17.363s
>>>
>>> same test with 100GB
>>>
>>> # time rbd -p pool1 create --size 100000 image2
>>>
>>> real 0m0.032s
>>> user 0m0.016s
>>> sys 0m0.007s
>>>
>>> # time rbd -p pool1 rm image2
>>> Removing image: 100% complete...done.
>>>
>>> real 0m10.499s
>>> user 0m1.488s
>>> sys 0m1.720s
>>>
>>> I'm using journal in tmpfs, 3 servers, 15 osds with 1 disk 15K (xfs).
>>> Network bandwidth, disk I/O, and CPU usage are low.
>>>
>>> Is this the normal behaviour? Maybe some xfs tuning could help?
>>
>> It's in the nature of RBD.
>>
>> An RBD image consists of multiple 4MB (default) RADOS objects.
>>
>> Let's say you have a disk of 40GB: that will contain 10,000 4MB RADOS
>> objects. You can find those objects by running: rados -p rbd ls
>>
>> Now, when you create a new image only the header is written, but no
>> object is written.
>>
>> When you start writing to an RBD image you will be writing to one of
>> the 4MB objects. When it doesn't exist it will be created.
>>
>> So when you install your VM it will create objects, but not all of them.
>>
>> RBD knows which RADOS objects to access by three parameters:
>>
>> * Image name
>> * Image size
>> * Stripe size (4MB)
>>
>> So when your VM accesses bytes Y through Z on the disk, RBD knows which
>> object to access by calculating this.
>>
>> Now, when you start removing the image there is no way of knowing which
>> objects exist and which don't, so RBD will try to remove all objects.
>>
>> In the case of a fresh image this results in 10,000 RADOS remove
>> operations for non-existent objects, and that is slow.
>>
>> Wido
>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
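
The arithmetic behind the explanation in this thread can be sketched in a few lines of Python. This is only a rough illustration, not RBD's actual code: it assumes the `--size` value is a count of MB treated as MiB, that the object size is the 4MB default mentioned above, and that the wall-clock times are the `real` values reported in the thread. The function names are hypothetical.

```python
MIB = 1024 * 1024
OBJECT_SIZE = 4 * MIB  # assumed: the 4MB default object size from the thread


def object_count(image_size_bytes, object_size=OBJECT_SIZE):
    """Number of objects the image could span (ceiling division)."""
    return -(-image_size_bytes // object_size)


def objects_for_range(first_byte, last_byte, object_size=OBJECT_SIZE):
    """Indices of the objects covering bytes first_byte..last_byte (inclusive)."""
    return range(first_byte // object_size, last_byte // object_size + 1)


def removes_per_second(size_mb, wall_seconds, object_size=OBJECT_SIZE):
    """Removal requests per second if one remove is issued per possible object."""
    return object_count(size_mb * MIB, object_size) / wall_seconds


if __name__ == "__main__":
    # (--size value, "real" time of rbd rm in seconds), both taken from the thread
    for size_mb, seconds in ((1000000, 105.558), (100000, 10.499)):
        n = object_count(size_mb * MIB)
        rate = removes_per_second(size_mb, seconds)
        print(f"--size {size_mb}: {n} possible objects, {rate:.0f} removes/s")
```

Under these assumptions the 1TB image maps to 250000 possible objects and the 100GB image to 25000, and both timings work out to roughly 2,400 remove requests per second. That matches the point made above: the removal time scales with the image size, not with the amount of data actually written.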