From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:48456) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM5ZM-0007a0-4D for qemu-devel@nongnu.org; Thu, 12 Feb 2015 21:01:29 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YM5ZI-0007q0-PP for qemu-devel@nongnu.org; Thu, 12 Feb 2015 21:01:24 -0500 Received: from mail-ig0-x22b.google.com ([2607:f8b0:4001:c05::22b]:50450) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM5ZI-0007pt-JI for qemu-devel@nongnu.org; Thu, 12 Feb 2015 21:01:20 -0500 Received: by mail-ig0-f171.google.com with SMTP id h15so7831206igd.4 for ; Thu, 12 Feb 2015 18:01:19 -0800 (PST) Date: Fri, 13 Feb 2015 10:01:19 +0800 From: Liu Yuan Message-ID: <20150213020119.GS7801@ubuntu-trusty> References: <1422347727-13006-1-git-send-email-ishizaki.teruaki@lab.ntt.co.jp> <20150210111256.GG18727@ubuntu-trusty> <54DC071D.40503@lab.ntt.co.jp> <20150212021928.GI7801@ubuntu-trusty> <54DC10EC.3050502@lab.ntt.co.jp> <20150212025523.GJ7801@ubuntu-trusty> <54DD5450.1000304@lab.ntt.co.jp> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54DD5450.1000304@lab.ntt.co.jp> Subject: Re: [Qemu-devel] [PATCH v4] sheepdog: selectable object size support List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Teruaki Ishizaki Cc: kwolf@redhat.com, mitake.hitoshi@lab.ntt.co.jp, sheepdog@lists.wpkg.org, qemu-devel@nongnu.org, stefanha@redhat.com On Fri, Feb 13, 2015 at 10:33:04AM +0900, Teruaki Ishizaki wrote: > (2015/02/12 11:55), Liu Yuan wrote: > >On Thu, Feb 12, 2015 at 11:33:16AM +0900, Teruaki Ishizaki wrote: > >>(2015/02/12 11:19), Liu Yuan wrote: > >>>On Thu, Feb 12, 2015 at 10:51:25AM +0900, Teruaki Ishizaki wrote: > >>>>(2015/02/10 20:12), Liu Yuan wrote: > >>>>>On Tue, Jan 27, 2015 at 05:35:27PM +0900, Teruaki Ishizaki wrote: > >>>>>>Previously, qemu block driver of sheepdog used hard-coded VDI object size. > >>>>>>This patch enables users to handle "block_size_shift" value for > >>>>>>calculating VDI object size. > >>>>>> > >>>>>>When you start qemu, you don't need to specify additional command option. > >>>>>> > >>>>>>But when you create the VDI which doesn't have default object size > >>>>>>with qemu-img command, you specify block_size_shift option. > >>>>>> > >>>>>>If you want to create a VDI of 8MB(1 << 23) object size, > >>>>>>you need to specify following command option. > >>>>>> > >>>>>> # qemu-img create -o block_size_shift=23 sheepdog:test1 100M > >>>>>> > >>>>>>In addition, when you don't specify qemu-img command option, > >>>>>>a default value of sheepdog cluster is used for creating VDI. > >>>>>> > >>>>>> # qemu-img create sheepdog:test2 100M > >>>>>> > >>>>>>Signed-off-by: Teruaki Ishizaki > >>>>>>--- > >>>>>>V4: > >>>>>> - Limit a read/write buffer size for creating a preallocated VDI. > >>>>>> - Replace a parse function for the block_size_shift option. > >>>>>> - Fix an error message. > >>>>>> > >>>>>>V3: > >>>>>> - Delete the needless operation of buffer. > >>>>>> - Delete the needless operations of request header. > >>>>>> for SD_OP_GET_CLUSTER_DEFAULT. > >>>>>> - Fix coding style problems. > >>>>>> > >>>>>>V2: > >>>>>> - Fix coding style problem (white space). > >>>>>> - Add members, store_policy and block_size_shift to struct SheepdogVdiReq. > >>>>>> - Initialize request header to use block_size_shift specified by user. > >>>>>>--- > >>>>>> block/sheepdog.c | 138 ++++++++++++++++++++++++++++++++++++++------- > >>>>>> include/block/block_int.h | 1 + > >>>>>> 2 files changed, 119 insertions(+), 20 deletions(-) > >>>>>> > >>>>>>diff --git a/block/sheepdog.c b/block/sheepdog.c > >>>>>>index be3176f..a43b947 100644 > >>>>>>--- a/block/sheepdog.c > >>>>>>+++ b/block/sheepdog.c > >>>>>>@@ -37,6 +37,7 @@ > >>>>>> #define SD_OP_READ_VDIS 0x15 > >>>>>> #define SD_OP_FLUSH_VDI 0x16 > >>>>>> #define SD_OP_DEL_VDI 0x17 > >>>>>>+#define SD_OP_GET_CLUSTER_DEFAULT 0x18 > >>>>> > >>>>>This might not be necessary. For old qemu or the qemu-img without setting > >>>>>option, the block_size_shift will be 0. > >>>>> > >>>>>If we make 0 to represent 4MB object, then we don't need to get the default > >>>>>cluster object size. > >>>>> > >>>>>We migth even get rid of the idea of cluster default size. The downsize is that, > >>>>>if we want to create a vdi with different size not the default 4MB, > >>>>>we have to write it every time for qemu-img or dog. > >>>>> > >>>>>If we choose to keep the idea of cluster default size, I think we'd also try to > >>>>>avoid call this request from QEMU to make backward compatibility easier. In this > >>>>>scenario, 0 might be used to ask new sheep to decide to use cluster default size. > >>>>> > >>>>>Both old qemu and new QEMU will send 0 to sheep and both old and new sheep can > >>>>>handle 0 though it has different meanings. > >>>>> > >>>>>Table for this bit as 0: > >>>>>Qe: qemu > >>>>>SD: Sheep daemon > >>>>>CDS: Cluster Default Size > >>>>>Ign: Ignored by the sheep daemon > >>>>> > >>>>>Qe/sd new old > >>>>>new CDS Ign > >>>>>old CDS NULL > >>>>Does Ign mean that VDI is handled as 4MB object size? > >>> > >>>Yes, old sheep can only handle 4MB object and doesn't check this field at all. > >>> > >>>> > >>>>> > >>>>>I think this approach is acceptable. The difference to your patch is that > >>>>>we don't send SD_OP_GET_CLUSTER_DEFAULT to sheep daemon and > >>>>>SD_OP_GET_CLUSTER_DEFAULT can be removed. > >>>>When users create a new VDI with qemu-img, qemu's Sheepdog backend > >>>>driver calculates max limit VDI size. > >>> > >>>>But if block_size_shift option is not specified, qemu's Sheepdog backend > >>>>driver can't calculate max limit VDI size. > >>> > >>>If block_size_shift not specified, this means > >>> > >>>1 for old sheep, use 4MB size > >>>2 for new sheep, use cluster wide default value. > >>> > >>>And sheep then can calculate it on its own, no? > >>> > >>Dog command(client) calculate max size, so I think > >>that qemu's Sheepdog backend driver should calculate it > >>like dog command. > >> > >>Is that policy changeable? > > > >I checked the QEMU code and got your idea. In the past it was fixed size so very > >easy to hardcode the check in the client, no communication with sheep needed. > > > >Yes, if it is reasonable, we can change it. > > > >I think we can push the size calculation logic into sheep, if not the right size > >return INVALID_PARAMETER to clients. Clients just check this and report error > >back to users. > > > >There is no backward compability for this approach, since 4MB is the smallest > >size. > > > >OLD QEMU will limit the max_size as 4TB, which is no problem for new sheep. > > I have checked the Qemu and sheepdog code. > When we resize VDI, sd_truncate() is called and > resize value is handled by Qemu. > (Sorry I haven't noticed this operation) > > Then, sd_truncate() writes Sheepdog inode object directly. > So Sheepdog server can't handle maximum VDI size. > > As I thought, should we use SD_OP_GET_CLUSTER_DEFAULT? > Should maxmimum VDI size be calculated on client program? Based on your description, yes, we have to use it. I'd suggest rename SD_OP_GET_CLUSTER_DEFAULT as SD_OP_GET_DEFAULT_OBJECT_SIZE. If we use it, you need to take care of old sheep that will return INVALID_PARAMETER and handle it. Thanks Yuan