From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: xen-devel <xen-devel@lists.xen.org>, Wei Liu <wei.liu2@citrix.com>
Subject: Re: [Hackathon minutes] PV block improvements
Date: Tue, 2 Jul 2013 13:49:11 +0200
Message-ID: <51D2BE37.7080002@citrix.com>
In-Reply-To: <alpine.DEB.2.02.1306241201320.4782@kaball.uk.xensource.com>
On 24/06/13 13:06, Stefano Stabellini wrote:
> On Sat, 22 Jun 2013, Wei Liu wrote:
>> On Fri, Jun 21, 2013 at 04:16:25PM -0400, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jun 21, 2013 at 07:10:59PM +0200, Roger Pau Monné wrote:
>>>> Hello,
>>>>
>>>> While working on further block improvements I've found an issue with
>>>> persistent grants in blkfront.
>>>>
>>>> Persistent grants are allocated once and then never released, so both
>>>> blkfront and blkback keep reusing the same memory pages for all
>>>> transactions.
>>>>
>>>> This is not a problem in blkback, because we can dynamically choose how
>>>> many grants we want to map. Blkfront, on the other hand, can never revoke
>>>> access to those grants, because it doesn't know whether blkback still has
>>>> them mapped persistently or not.
>>>>
>>>> So if, for example, we expand the number of segments in indirect
>>>> requests to something like 512 segments per request, blkfront will
>>>> probably try to persistently map 512*32+512 = 16896 grants per device,
>>>> far more grants than the current default, which is 32*256 = 8192
>>>> (when using grant tables v2). This can cause serious problems for other
>>>> interfaces inside the DomU, since blkfront basically starts hoarding all
>>>> available grants, leaving the other interfaces completely starved.
>>>>
>>>> I've been thinking about different ways to solve this, but so far I
>>>> haven't been able to find a nice solution:
>>>>
>>>> 1. Limit the number of persistent grants a blkfront instance can use,
>>>> let's say that only the first X used grants will be persistently mapped
>>>> by both blkfront and blkback, and if more grants are needed the previous
>>>> map/unmap will be used.
>>>>
>>>> 2. Switch to grant copy in blkback and get rid of persistent grants (I
>>>> have not benchmarked this solution, but I'm quite sure it would involve a
>>>> performance regression, especially when scaling to a high number of domains).
>>>>
>>
>> Any chance that the speed of copying is fast enough for block devices?
>>
>>>> 3. Increase the size of the grant_table or the size of a single grant
>>>> (from 4k to 2M) (this is from Stefano Stabellini).
>>>>
>>>> 4. Introduce a new request type that we can use to request blkback to
>>>> unmap certain grefs so we can free them in blkfront.
>>>
>>>
>>> 5). Lift the limit of grant pages a domain can have.
>>
>> If I'm not mistaken, this is basically the same as "increase the size of
>> the grant_table" in #3.
>
> Yes, that was one of the things I was suggesting, but it needs
> investigating: I wouldn't want increasing the number of grant
> frames to run into a different scalability limit in the data structure.
I don't think there's any implicit scalability limit in the data
structure itself; it's just an array, with grants indexed as
array[gref]. I've discussed with Stefano using domain pages to
increase the size of the grant table: instead of using xenheap pages
we could use domain pages and thus remove the limitation (since we would
be consuming the domain's own memory). I have a very hacky prototype that
uses domain pages instead of xenheap pages to expand the grant table, but
I think it would be better to implement #4 first. Even if we use domain
pages to grow the grant table, we still need a way for blkfront to remove
persistent grants, or we will end up with a lot of unused pages in
blkfront after I/O bursts.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel