From: Julien Grall <julien.grall@citrix.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>,
	"Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"Ian Campbell" <Ian.Campbell@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [RFC] Support of non-indirect grant backend on 64KB guest
Date: Wed, 19 Aug 2015 07:54:44 -0700
Message-ID: <55D498B4.2030804@citrix.com>
In-Reply-To: <55D44338.8030108@citrix.com>



On 19/08/2015 01:50, Roger Pau Monné wrote:
> On 18/08/15 at 20:45, Julien Grall wrote:
>> Hi Roger,
>>
>> On 18/08/2015 00:09, Roger Pau Monné wrote:
>>> Hello,
>>>
>>> On 18/08/15 at 8:29, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> Firstly, this patch is not ready at all and mostly here for
>>>> collecting comment about the way to do it. It's not clean so no need
>>>> to complain about the coding style.
>>>>
>>>> The qdisk backend in QEMU does not support indirect grants, which
>>>> means that a request can only carry 11 * 4KB = 44KB of data.
>>>>
>>>> When using 64KB pages, a Linux block request (struct request) may
>>>> contain up to 64KB of data, because the block segment size must be
>>>> at least the size of a Linux page.
>>>>
>>>> So when indirect descriptors are not supported by the backend, we
>>>> are not able to fit all the data in a single request. We therefore
>>>> need to create a second request to copy the rest of the data.
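
To make the numbers above concrete, here is a tiny stand-alone sketch of
the arithmetic (the names are simplified placeholders, not the actual
kernel or blkif identifiers):

#include <stdio.h>

#define GRANT_SIZE        4096u   /* grants are always 4KB                    */
#define SEGS_PER_REQUEST  11u     /* segments in a non-indirect ring request  */
#define GUEST_PAGE_SIZE   65536u  /* 64KB guest page granularity              */

int main(void)
{
    /* A non-indirect request can carry at most 11 * 4KB = 44KB ...          */
    unsigned int max_bytes = SEGS_PER_REQUEST * GRANT_SIZE;
    /* ... so a single 64KB block request has to be split across two of them
     * (44KB + 20KB).                                                         */
    unsigned int nr_reqs = (GUEST_PAGE_SIZE + max_bytes - 1) / max_bytes;

    printf("max bytes per non-indirect request: %u\n", max_bytes);
    printf("ring requests for one 64KB block request: %u\n", nr_reqs);
    return 0;
}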
>>>>
>>>> I wrote a patch last week which makes a 64KB guest boot with qdisk.
>>>> However, I'm not sure this is the right way to do it. I would
>>>> appreciate it if one of the block maintainers could give me some
>>>> insight about it.
>>>
>>> Maybe I'm missing some key data, but I see two ways to solve this: the
>>> first one is the one you describe above, and consists of allowing
>>> blkfront to split a request into multiple ring slots. The other solution
>>> would be to add indirect descriptor support to Qdisk; has this been
>>> looked into?
>>>
>>> AFAICT it looks more interesting, and x86 can also benefit from it.
>>> Since I would like to avoid adding more cruft to blkfront, I would
>>> rather prefer 64KB guests to require indirect descriptors in order to
>>> run.
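
Just as a rough comparison of the two options in terms of bytes per ring
slot (the 32-segment figure for indirect descriptors is only an assumed
typical blkfront default, not something mandated by the protocol):

#include <stdio.h>

#define GRANT_SIZE        4096u
#define NONINDIRECT_SEGS  11u    /* fixed by the non-indirect request layout */
#define INDIRECT_SEGS     32u    /* assumption: a typical blkfront default   */

int main(void)
{
    printf("non-indirect: %u bytes per request (a 64KB request must be split)\n",
           NONINDIRECT_SEGS * GRANT_SIZE);
    printf("indirect:     %u bytes per request (a 64KB request fits in one slot)\n",
           INDIRECT_SEGS * GRANT_SIZE);
    return 0;
}

Either way the grants themselves stay 4KB; the difference is only how many
of them a single ring slot can reference.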
>>
>> Actually, supporting indirect descriptors in Qdisk was one of our ideas.
>> While I agree this is a good improvement in general, we put this idea
>> aside for various reasons.
>>
>> The first one is that OpenStack uses the Qdisk backend by default, so a
>> Linux 64KB guest wouldn't be able to boot on current versions of Xen.
>> This is the only blocker to using 64KB guests; everything else is
>> working. Getting indirect grant support into QEMU for Xen 4.6 is not
>> realistic: there is only a month left and we are already in feature
>> freeze.
>
> From what you said, a Linux 64KB guest is already unable to boot on
> OpenStack, so users will need to update either their kernels or their
> QEMU (depending on what we end up patching).

A Linux 64KB guest is currently not able to boot as a Xen guest at all;
64KB support is aiming to be merged in Linux 4.3.

So with your suggestion, users would have to update both QEMU and Linux
rather than only one component. That means things would not work out of
the box (i.e. without any changes on the Xen side).

>> That would mean that any new distribution using Linux 64KB would not
>> work out of the box on Xen.
>>
>> Furthermore, not supporting non-indirect grants in the frontend means
>> that userspace backends won't be usable by Linux 64KB guests.
>>
>> Overall, I think we have to support non-indirect grants with Linux 64KB
>> guests. Many (but not all) distributions will only support 64KB pages,
>> so we can't wait until Xen 4.7 to get something running. That is not to
>> say I rule out requiring the user to upgrade the QEMU version in order
>> to run 64KB guests.
>
> Can't this be fixed on the QEMU side and the fix backported to 4.6.1?

And what about any other backend that does not support indirect grants?

> If you want to fix it in Linux you will also have to wait for the next
> release anyway, and then ask users to use a specific kernel version
> (because distros won't pick up the change that fast).

AArch64 distros are not officially out yet. There is still work going on,
and they are planning to target Linux 4.2/4.3. We have distributions
willing to take patches into their trees in order to support Xen guests.

However, if we have to patch QEMU, we would also need to push that change
to distributions that do not themselves need 64KB pages, just so they can
be used as DOM0...

> Overall I still think this should be fixed in QEMU; as said above, users
> will have to update anyway, either their kernels or their QEMU version.

Let's look at the problem another way. You are a big cloud provider using
AArch64 hardware. You decided to use Xen 4.5 (and not Xen 4.6) as the base
version, with a DOM0 using 4KB page granularity. Now one of your small
customers decides to use a distribution which has 64KB pages enabled; if
it boots using Qdisk, it won't work at all. Why on earth would the cloud
provider update his QEMU to support this kind of guest?

I think this is a really bad idea, given that 64KB guests should work out
of the box and not depend on the backend side.

Regards,

-- 
Julien Grall



Thread overview: 29+ messages
2015-08-18  6:29 [RFC] Support of non-indirect grant backend on 64KB guest Julien Grall
2015-08-18  7:09 ` Roger Pau Monné
2015-08-18  7:26   ` Jan Beulich
2015-08-18 18:45   ` Julien Grall
2015-08-19  8:50     ` Roger Pau Monné
2015-08-19 14:54       ` Julien Grall [this message]
2015-08-19 15:17         ` Roger Pau Monné
2015-08-19 15:52           ` Julien Grall
2015-08-19 23:44           ` Stefano Stabellini
2015-08-20  8:31             ` Roger Pau Monné
2015-08-20  9:43               ` David Vrabel
2015-08-20 16:16                 ` Julien Grall
2015-08-20 17:23                 ` Stefano Stabellini
2015-08-21 16:05                   ` Konrad Rzeszutek Wilk
2015-08-21 16:08                     ` David Vrabel
2015-08-21 16:49                       ` Stefano Stabellini
2015-08-21 17:10                       ` PAGE_SIZE (64KB), while block driver 'struct request' deals with < PAGE_SIZE (up to 44Kb). Was:Re: " Konrad Rzeszutek Wilk
2015-08-27 17:51                         ` Julien Grall
2015-09-04 14:04                           ` Stefano Stabellini
2015-09-04 15:41                             ` Konrad Rzeszutek Wilk
2015-09-04 16:15                               ` Julien Grall
2015-09-04 17:32                                 ` Konrad Rzeszutek Wilk
2015-09-04 22:05                                   ` Julien Grall
2015-08-20  9:37             ` Jan Beulich
2015-08-19  8:58     ` Jan Beulich
2015-08-19 15:25       ` Julien Grall
2015-08-20 17:42 ` David Vrabel
2015-08-21  1:30   ` Julien Grall
2015-08-21 16:07     ` Konrad Rzeszutek Wilk
