From: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
To: Liu Yuan <namei.unix@gmail.com>
Cc: kwolf@redhat.com, mitake.hitoshi@lab.ntt.co.jp,
	sheepdog@lists.wpkg.org, qemu-devel@nongnu.org,
	stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4] sheepdog: selectable object size support
Date: Fri, 13 Feb 2015 13:28:10 +0900
Message-ID: <54DD7D5A.40702@lab.ntt.co.jp>
In-Reply-To: <20150213020119.GS7801@ubuntu-trusty>

(2015/02/13 11:01), Liu Yuan wrote:
> On Fri, Feb 13, 2015 at 10:33:04AM +0900, Teruaki Ishizaki wrote:
>> (2015/02/12 11:55), Liu Yuan wrote:
>>> On Thu, Feb 12, 2015 at 11:33:16AM +0900, Teruaki Ishizaki wrote:
>>>> (2015/02/12 11:19), Liu Yuan wrote:
>>>>> On Thu, Feb 12, 2015 at 10:51:25AM +0900, Teruaki Ishizaki wrote:
>>>>>> (2015/02/10 20:12), Liu Yuan wrote:
>>>>>>> On Tue, Jan 27, 2015 at 05:35:27PM +0900, Teruaki Ishizaki wrote:
>>>>>>>> Previously, qemu block driver of sheepdog used hard-coded VDI object size.
>>>>>>>> This patch enables users to handle "block_size_shift" value for
>>>>>>>> calculating VDI object size.
>>>>>>>>
>>>>>>>> When you start qemu, you don't need to specify additional command option.
>>>>>>>>
>>>>>>>> But when you create the VDI which doesn't have default object size
>>>>>>>> with qemu-img command, you specify block_size_shift option.
>>>>>>>>
>>>>>>>> If you want to create a VDI of 8MB(1 << 23) object size,
>>>>>>>> you need to specify following command option.
>>>>>>>>
>>>>>>>>   # qemu-img create -o block_size_shift=23 sheepdog:test1 100M
>>>>>>>>
>>>>>>>> In addition, when you don't specify qemu-img command option,
>>>>>>>> a default value of sheepdog cluster is used for creating VDI.
>>>>>>>>
>>>>>>>>   # qemu-img create sheepdog:test2 100M
>>>>>>>>
>>>>>>>> Signed-off-by: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
>>>>>>>> ---
>>>>>>>> V4:
>>>>>>>>   - Limit a read/write buffer size for creating a preallocated VDI.
>>>>>>>>   - Replace a parse function for the block_size_shift option.
>>>>>>>>   - Fix an error message.
>>>>>>>>
>>>>>>>> V3:
>>>>>>>>   - Delete the needless operation of buffer.
>>>>>>>>   - Delete the needless operations of request header.
>>>>>>>>     for SD_OP_GET_CLUSTER_DEFAULT.
>>>>>>>>   - Fix coding style problems.
>>>>>>>>
>>>>>>>> V2:
>>>>>>>>   - Fix coding style problem (white space).
>>>>>>>>   - Add members, store_policy and block_size_shift to struct SheepdogVdiReq.
>>>>>>>>   - Initialize request header to use block_size_shift specified by user.
>>>>>>>> ---
>>>>>>>>   block/sheepdog.c          |  138 ++++++++++++++++++++++++++++++++++++++-------
>>>>>>>>   include/block/block_int.h |    1 +
>>>>>>>>   2 files changed, 119 insertions(+), 20 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/block/sheepdog.c b/block/sheepdog.c
>>>>>>>> index be3176f..a43b947 100644
>>>>>>>> --- a/block/sheepdog.c
>>>>>>>> +++ b/block/sheepdog.c
>>>>>>>> @@ -37,6 +37,7 @@
>>>>>>>>   #define SD_OP_READ_VDIS      0x15
>>>>>>>>   #define SD_OP_FLUSH_VDI      0x16
>>>>>>>>   #define SD_OP_DEL_VDI        0x17
>>>>>>>> +#define SD_OP_GET_CLUSTER_DEFAULT   0x18
>>>>>>>
>>>>>>> This might not be necessary. For old qemu or the qemu-img without setting
>>>>>>> option, the block_size_shift will be 0.
>>>>>>>
>>>>>>> If we make 0 to represent 4MB object, then we don't need to get the default
>>>>>>> cluster object size.
>>>>>>>
>>>>>>> We might even get rid of the idea of cluster default size. The downside is that,
>>>>>>> if we want to create a vdi with a size other than the default 4MB,
>>>>>>> we have to specify it every time for qemu-img or dog.
>>>>>>>
>>>>>>> If we choose to keep the idea of cluster default size, I think we should also try to
>>>>>>> avoid calling this request from QEMU to make backward compatibility easier. In this
>>>>>>> scenario, 0 might be used to ask new sheep to use the cluster default size.
>>>>>>>
>>>>>>> Both old qemu and new QEMU will send 0 to sheep and both old and new sheep can
>>>>>>> handle 0 though it has different meanings.
>>>>>>>
>>>>>>> Table for this bit as 0:
>>>>>>> Qe: qemu
>>>>>>> SD: Sheep daemon
>>>>>>> CDS: Cluster Default Size
>>>>>>> Ign: Ignored by the sheep daemon
>>>>>>>
>>>>>>> Qe/sd   new    old
>>>>>>> new     CDS    Ign
>>>>>>> old     CDS    NULL
>>>>>> Does Ign mean that VDI is handled as 4MB object size?
>>>>>
>>>>> Yes, old sheep can only handle 4MB object and doesn't check this field at all.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> I think this approach is acceptable. The difference to your patch is that
>>>>>>> we don't send SD_OP_GET_CLUSTER_DEFAULT to sheep daemon and
>>>>>>> SD_OP_GET_CLUSTER_DEFAULT can be removed.
>>>>>> When users create a new VDI with qemu-img, qemu's Sheepdog backend
>>>>>> driver calculates max limit VDI size.
>>>>>
>>>>>> But if block_size_shift option is not specified, qemu's Sheepdog backend
>>>>>> driver can't calculate max limit VDI size.
>>>>>
>>>>> If block_size_shift is not specified, this means
>>>>>
>>>>> 1 for old sheep, use 4MB size
>>>>> 2 for new sheep, use cluster wide default value.
>>>>>
>>>>> And sheep then can calculate it on its own, no?
>>>>>
>>>> The dog command (client) calculates the max size, so I think
>>>> that qemu's Sheepdog backend driver should calculate it
>>>> the same way the dog command does.
>>>>
>>>> Is that policy changeable?
>>>
>>> I checked the QEMU code and got your idea. In the past it was fixed size so very
>>> easy to hardcode the check in the client, no communication with sheep needed.
>>>
>>> Yes, if it is reasonable, we can change it.
>>>
>>> I think we can push the size calculation logic into sheep: if the size is not
>>> valid, sheep returns INVALID_PARAMETER to clients. Clients just check this and
>>> report the error back to users.
>>>
>>> There is no backward compatibility problem with this approach, since 4MB is the
>>> smallest size.
>>>
>>> Old QEMU will limit max_size to 4TB, which is no problem for new sheep.
>>
>> I have checked the QEMU and sheepdog code.
>> When we resize a VDI, sd_truncate() is called and
>> the resize value is handled by QEMU.
>> (Sorry, I hadn't noticed this operation.)
>>
>> sd_truncate() then writes the Sheepdog inode object directly,
>> so the Sheepdog server can't enforce the maximum VDI size.
>>
>> As I thought, should we use SD_OP_GET_CLUSTER_DEFAULT?
>> Should the maximum VDI size be calculated in the client program?
>
> Based on your description, yes, we have to use it. I'd suggest renaming
> SD_OP_GET_CLUSTER_DEFAULT to SD_OP_GET_DEFAULT_OBJECT_SIZE. If we use it, you
> need to take care of old sheep, which will return INVALID_PARAMETER, and handle it.
>
Currently, SD_OP_GET_CLUSTER_DEFAULT returns block_size_shift, copy_policy,
and copies information, so I think the name SD_OP_GET_DEFAULT_OBJECT_SIZE
doesn't fit.
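
To make the client-side calculation concrete, here is a minimal sketch
(not the patch code) of how a client could derive the object size and the
maximum VDI size from block_size_shift. It assumes one VDI can index 2^20
data objects, which matches the 4TB limit at 4MB objects mentioned above;
MAX_DATA_OBJS, object_size() and max_vdi_size() are names chosen for this
sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: a VDI inode can index 2^20 data objects
 * (4 TB maximum at the 4 MB default object size). */
#define MAX_DATA_OBJS (UINT64_C(1) << 20)

/* Object size in bytes for a given block_size_shift, e.g. 23 -> 8 MB. */
static uint64_t object_size(uint8_t block_size_shift)
{
    assert(block_size_shift < 64);
    return UINT64_C(1) << block_size_shift;
}

/* Largest VDI size in bytes that the client should accept on create/resize. */
static uint64_t max_vdi_size(uint8_t block_size_shift)
{
    /* shift + 20 must stay below 64 to avoid overflowing uint64_t. */
    assert(block_size_shift < 44);
    return object_size(block_size_shift) * MAX_DATA_OBJS;
}
```

With block_size_shift=22 this gives the familiar 4TB limit, and with
block_size_shift=23 (the qemu-img example in the patch description) it
gives 8TB.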

I'll change the patch to handle INVALID_PARAMETER.
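
A rough sketch of the fallback idea, under stated assumptions: if an old
sheep rejects the request with INVALID_PARAMETER, the client falls back to
the fixed 4MB object size (shift 22), since old sheep handles only 4MB
objects. The result codes and the function name below are hypothetical
stand-ins for this sketch, not the identifiers used in block/sheepdog.c.

```c
#include <stdint.h>

/* 4 MB, the only object size an old sheep daemon handles. */
#define DEFAULT_BLOCK_SIZE_SHIFT 22

/* Hypothetical result codes for this sketch only. */
typedef enum {
    RES_OK,
    RES_INVALID_PARAMETER,  /* old sheep does not know the request */
    RES_OTHER_ERROR
} sd_result_t;

/*
 * Decide which block_size_shift the client should use after asking sheep
 * for the cluster default. Returns 0 and stores the shift on success,
 * or -1 if the request failed for a reason other than an old sheep.
 */
static int resolve_block_size_shift(sd_result_t res, uint8_t reported_shift,
                                    uint8_t *shift_out)
{
    switch (res) {
    case RES_OK:
        /* New sheep: use the cluster-wide default it reported. */
        *shift_out = reported_shift;
        return 0;
    case RES_INVALID_PARAMETER:
        /* Old sheep: fall back to the fixed 4 MB object size. */
        *shift_out = DEFAULT_BLOCK_SIZE_SHIFT;
        return 0;
    default:
        return -1;
    }
}
```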

Thanks,
Teruaki
