Re: [RFC] netif: staging grants for requests

From: Joao Martins <joao.m.martins@oracle.com>
To: Paul Durrant <Paul.Durrant@citrix.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>
Subject: Re: [RFC] netif: staging grants for requests
Date: Mon, 9 Jan 2017 13:01:05 +0000
Message-ID: <58738991.1010702@oracle.com>
In-Reply-To: <f4766244eb8d43f9bb1589e40668c74f@AMSPEX02CL03.citrite.net>

On 01/09/2017 08:56 AM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Joao Martins [mailto:joao.m.martins@oracle.com]
>> Sent: 06 January 2017 20:09
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano
>> Stabellini <sstabellini@kernel.org>
>> Subject: Re: [RFC] netif: staging grants for requests
>>
>> On 01/06/2017 09:33 AM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Joao Martins [mailto:joao.m.martins@oracle.com]
>>>> Sent: 14 December 2016 18:11
>>>> To: xen-devel@lists.xenproject.org
>>>> Cc: David Vrabel <david.vrabel@citrix.com>; Andrew Cooper
>>>> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Paul Durrant
>>>> <Paul.Durrant@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>
>>>> Subject: [RFC] netif: staging grants for requests
>>>>
>>>> Hey,
>>>>
>>>> Back in the Xen hackathon '16 networking session there were a couple of
>>>> ideas brought up. One of them was about exploring permanently mapped
>>>> grants between xen-netback/xen-netfront.
>>>>
>>>> I started experimenting and came up with sort of a design document (in
>>>> pandoc) on what I would like to propose. This is meant as a seed for
>>>> discussion and also as a request for input on whether this is a good
>>>> direction. Of course, I am willing to try alternatives that we come up
>>>> with beyond the contents of the spec, or any other suggested changes ;)
>>>>
>>>> Any comments or feedback is welcome!
>>>>
>>>
>>> Hi,
>> Hey!
>>
>>>
>>> Sorry for the delay... I've been OOTO for three weeks.
>> Thanks for the comments!
>>
>>> I like the general approach of pre-granting buffers for RX so that the
>>> backend can simply memcpy and tell the frontend which buffer a packet
>>> appears in
>> Cool,
>>
>>> but IIUC you are proposing use of a single pre-granted area for TX also,
>>> which would presumably require the frontend to always copy on the TX side?
>>> I wonder if we might go for a slightly different scheme...
>> I see.
>>
>>>
>>> The assumption is that the working set of TX buffers in the guest OS is
>>> fairly small (which is probably true for a small number of heavily used
>>> sockets and an OS that uses a slab allocator)...
>> Hmm, [speaking about linux] maybe for the skb allocation cache. For the
>> remaining packet pages maybe not, say for a scatter-gather list...? But I
>> guess it would need to be validated whether this working set is indeed kept
>> small, as this seems like a very strong assumption to hold across the
>> variety of possible workloads. Plus wouldn't we leak info from these pages
>> if they weren't used on the device but rather elsewhere in the guest stack?
> 
> Yes, potentially there is an information leak but I am assuming that the backend
> is also trusted by the frontend, which is pretty well baked into the protocol
> anyway.
I assumed the same - just thought it was worth clarifying.

> Also, if the working set (which is going to be OS/stack dependent) turned
> out to be a bit too large then the frontend can always fall back to a copy into a
> locally allocated buffer, as in your proposal, anyway.
Yeap.

>>> The guest TX code maintains a hash table of buffer addresses to grant refs.
>>> When a packet is sent the code looks to see if it has already granted the
>>> buffer and re-uses the existing ref if so, otherwise it grants the buffer
>>> and adds the new ref into the table.
>>
>>> The backend also maintains a hash of grant refs to addresses and, whenever
>>> it sees a new ref, it grant maps it and adds the address into the table.
>>> Otherwise it does a hash lookup and thus has a buffer address it can
>>> immediately memcpy from.
>>>
>>> If the frontend wants the backend to release a grant ref (e.g. because it's
>>> starting to run out of grant table) then a control message can be used to
>>> ask for it back, at which point the backend removes the ref from its cache
>>> and unmaps it.
>> Wouldn't this be somewhat similar to the persistent grants in xen block
>> drivers?
> 
> Yes, it would, and I'd rather that protocol was also re-worked in this fashion.
I guess then I could reuse part of my old series (persistent grants) in the
reworked fashion you suggest. I didn't go that route as I had the (apparently
wrong) impression that a persistent-grants-based approach was undesirable (as I
took it from past sessions).

>>> Using this scheme we allow a guest OS to still use either a zero-copy
>>> approach if it wishes to do so, or a static pre-grant... or something
>>> between (e.g. pre-grant for headers, zero copy for bulk data).
>>>
>>> Does that sound reasonable?
>> Not sure yet but it looks nice if we can indeed achieve the zero copy part.
>> But I have two concerns. First, a backend could be forced to always remove
>> refs because its cache is always full, with the frontend not being able to
>> reuse these pages (subject to its own allocator behavior, in case the
>> assumption above isn't satisfied), nullifying the backend's effort in
>> maintaining its mapped grefs table. The other concern is whether those pages
>> (assumed to be reused) might be leaking guest data to the backend (when not
>> used on netfront).
> 
> As I said, the protocol already requires the backend to be trusted by the frontend
> (since grants cannot be revoked, if for no other reason) so information leakage is
> not a particular concern. What I want to avoid is a protocol that denies any
> possibility of zero-copy, even in the best case, which is the way things currently
> are with persistent grants in blkif.
I'll be more reassured once I've verified that this frontend assumption holds
for a significant percentage of the sent pages; still, you've got a valid point
that the protocol shouldn't force the guest OS to always do a copy. Hence I
like the idea of having a control message to co-manage these gref tables.
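
To make the frontend side of that table a bit more concrete, here is a rough
sketch of the lookup/grant/release flow described above. Everything in it is
hypothetical and only for illustration: the names (gref_cache, gref_cache_get,
gref_cache_release), the table size, and the counter that stands in for a real
grant allocation are my assumptions, not netfront code. It scans the whole
(small) table on each lookup so that released slots cannot hide entries; a real
implementation would want a proper hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT        12u
#define GREF_CACHE_SLOTS  8u   /* hypothetical size, must be a power of two */
#define GREF_INVALID      0u   /* hypothetical "no grant available" marker */

typedef uint32_t gref_t;

struct gref_entry {
    uintptr_t page;            /* page-aligned buffer address, 0 = free slot */
    gref_t ref;
};

struct gref_cache {
    struct gref_entry slot[GREF_CACHE_SLOTS];
    gref_t next_ref;           /* counter standing in for a real grant allocator */
    unsigned int hits, misses;
};

static void gref_cache_init(struct gref_cache *c)
{
    for (size_t i = 0; i < GREF_CACHE_SLOTS; i++) {
        c->slot[i].page = 0;
        c->slot[i].ref = GREF_INVALID;
    }
    c->next_ref = 1;
    c->hits = 0;
    c->misses = 0;
}

/*
 * Return the grant ref covering addr, reusing a cached ref when the page was
 * granted before and "granting" it otherwise. Returns GREF_INVALID when the
 * table is full, in which case the caller would fall back to a copy.
 */
static gref_t gref_cache_get(struct gref_cache *c, uintptr_t addr)
{
    uintptr_t page = addr & ~(((uintptr_t)1 << PAGE_SHIFT) - 1);
    size_t start = (page >> PAGE_SHIFT) & (GREF_CACHE_SLOTS - 1);
    struct gref_entry *free_slot = NULL;

    assert(page != 0);         /* 0 is reserved to mark free slots */

    for (size_t i = 0; i < GREF_CACHE_SLOTS; i++) {
        struct gref_entry *e = &c->slot[(start + i) & (GREF_CACHE_SLOTS - 1)];

        if (e->page == page) {
            c->hits++;
            return e->ref;     /* already granted: reuse the ref */
        }
        if (e->page == 0 && free_slot == NULL)
            free_slot = e;     /* remember the first free slot, keep scanning */
    }
    if (free_slot == NULL)
        return GREF_INVALID;

    free_slot->page = page;
    free_slot->ref = c->next_ref++;
    c->misses++;
    return free_slot->ref;
}

/*
 * Forget a ref, e.g. once the backend has acknowledged a control message
 * asking it to unmap. Returns 1 if the ref was cached, 0 otherwise.
 */
static int gref_cache_release(struct gref_cache *c, gref_t ref)
{
    for (size_t i = 0; i < GREF_CACHE_SLOTS; i++) {
        if (c->slot[i].page != 0 && c->slot[i].ref == ref) {
            c->slot[i].page = 0;
            c->slot[i].ref = GREF_INVALID;
            return 1;
        }
    }
    return 0;
}
```

The hits/misses counters are there because the hit ratio is exactly what I
would want to measure to validate the small-working-set assumption: a low
ratio would mean the backend keeps mapping and unmapping for nothing.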

Joao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
