From: Zoltan Kiss <zoltan.kiss@huawei.com>
To: David Vrabel <david.vrabel@citrix.com>,
	"Xen-devel@lists.xen.org" <Xen-devel@lists.xen.org>
Subject: Re: Linux grant map/unmap improvement proposal (Draft B)
Date: Wed, 15 Oct 2014 18:45:06 +0100
Message-ID: <543EB2A2.7040005@huawei.com>
In-Reply-To: <543BD686.3080006@citrix.com>



On 13/10/2014 14:41, David Vrabel wrote:
[...]
> Packets with foreign pages from other sources cannot be successfully
> copied, since netback does not know the grant reference.  Once such
"... One such"
> configuration is a VM providing an iSCSI or other network-based
> storage that presents a block device in the backend that is then used
> by another VM on the same host.
If the packet coming from the storage target VM is delivered to L3 in 
Dom0's stack, the foreign pages will be swapped out with local copies. 
That's a feature of the zerocopy framework used by netback, mostly due 
to fears that strange things can happen in and above the IP layer.
So unless the storage backend in Dom0 implements its own TCP/IP stack and 
uses the vifX.Y device directly, it probably won't see foreign frames 
from the storage target. Of course it wouldn't be smart to rely on this 
in the long term; it would be good to remove that copy.
Or do you mean the other direction, when the guest using this storage 
writes to it, and that data is mapped by the block backend and used to 
construct an SKB? (By the time I finished the sentence I realized you 
meant this scenario, but I'll leave the above comments in for the sake of 
clarification.)

>
> Blkback and network storage
> ---------------------------
>
> Blkback unmaps the foreign pages in an I/O request when the request is
> completed.  If networked storage is used it is possible for requests
> to be completed while the skbs referring to those pages are still
> queued for transmit (e.g., because a retransmission was queued while
> the response to the original packet was in flight).
>
> When the network driver attempts to send the packet with the unmapped
> page it may:
>
> - Fault while trying to access the unmapped page.
>
> - Transmit from a frame that is no longer granted (potentially
>    transmitting sensitive guest or Xen data).
>
> The fault does not occur with userspace storage backends since gntdev
> replaces the foreign mapping with one to a local scratch page.  It
> uses GNTTABOP_unmap_and_replace which atomically replaces the foreign
> mapping with another (source) mapping.  However, this cannot be used
> with batched operations since it clears the source mapping and it does
> not protect against transmitting from a non-granted frame.
>
>
>
>
> Safe grant unmap
> ----------------
>
> Grant references will only be unmapped when they are no longer in use.
> i.e., the page reference count is one.
>
>      int gnttab_unmap_refs_async(struct gnttab_unmap_grant_ref *unmap_ops,
>          struct gnttab_unmap_grant_ref *kunmap_ops,
>          struct page **pages, unsigned int count,
>          void (*done)(void *data), void *data);
>
> The `gnttab_unmap_refs_async()` function will unmap the grant
> references using the supplied unmap operations and call `done(data)`.
> The grant unmap will only be done once all pages are no longer in use.
I'm a bit confused about this function. I guess it checks the refcount 
before the unmap. But then what does the done(data) function do?
>
> It shall run synchronously on the first attempt (this is expected to
> be the most common case).  If any page is in use, it shall queue the
> unmap request to be tried at a later time.
Who will own this queue? The caller (e.g. blkback)? How often should it 
retry? Is the retry triggered by a timer?
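To make the questions above concrete, here is a minimal userspace sketch of
the behaviour the draft describes. It is not the kernel implementation. The
names `sketch_page`, `unmap_req`, `retry_queue`, `unmap_refs_async()` and
`retry_pending_unmaps()` are hypothetical, and it assumes a single global
queue owned by the grant table code and that done(data) is a completion
callback. What triggers the retry pass (timer, workqueue, something else)
is deliberately left open, since the draft does not say.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the fields the sketch needs. */
struct sketch_page {
    int refcount;   /* 1 means only the grant mapping still holds a reference */
    bool mapped;    /* true while the foreign mapping is in place */
};

/* Hypothetical request, mirroring the arguments of gnttab_unmap_refs_async(). */
struct unmap_req {
    struct sketch_page **pages;
    unsigned int count;
    void (*done)(void *data);   /* assumed: completion callback */
    void *data;
    struct unmap_req *next;     /* link in the retry queue */
};

/* Assumption: one global queue, not one per caller. */
static struct unmap_req *retry_queue;

/* Unmap the whole batch only if no page is still in use elsewhere. */
static bool try_unmap(struct unmap_req *req)
{
    unsigned int i;

    for (i = 0; i < req->count; i++)
        if (req->pages[i]->refcount > 1)
            return false;       /* e.g. an skb still refers to this page */

    for (i = 0; i < req->count; i++)
        req->pages[i]->mapped = false;

    req->done(req->data);       /* tell the caller the unmap has happened */
    return true;
}

/* Synchronous first attempt; defer the request if any page is busy. */
static bool unmap_refs_async(struct unmap_req *req)
{
    if (try_unmap(req))
        return true;
    req->next = retry_queue;
    retry_queue = req;
    return false;
}

/* One retry pass over the queue; returns how many requests completed. */
static unsigned int retry_pending_unmaps(void)
{
    struct unmap_req **link = &retry_queue;
    unsigned int completed = 0;

    while (*link) {
        struct unmap_req *req = *link;
        struct unmap_req *next = req->next; /* done() may release req */

        if (try_unmap(req)) {
            *link = next;
            completed++;
        } else {
            link = &req->next;
        }
    }
    return completed;
}

/* Example completion callback: counts how many requests have finished. */
static void count_completion(void *data)
{
    *(int *)data += 1;
}
```

In this reading, a caller such as blkback would send its I/O response from
done() rather than right after the call returns, because the unmap may not
have happened yet at that point.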
>
> Only the blkback and gntdev devices need to use asynchronous unmaps.
>
[...]

>
> Identifying foreign pages
> -------------------------
>
> A new page flag is introduced: PG_foreign.  This will alias PG_pinned
> so it does not require an additional bit.
>
> If PG_foreign is set then `page->private` contains the grant reference
> and domid for this foreign page.  This information can only be packed
> into an unsigned long on 64-bit platforms.  32-bit platforms will have
> to allocate an additional structure to store the domid and gref.
>
> The aliasing of PG_foreign and PG_pinned is safe because:
>
> - Page table pages will never be foreign.
> - Foreign pages shall have `p2m[P] & FOREIGN_FRAME_BIT`.
>
> The use of the private field is safe because:
>
> - The page is allocated by the balloon driver and thus it owns the
>    private field.
>
> - The other fields in the union (ptl, slab_cache, and first_page) will
>    not be used because the page is not used in a page table, slab or
>    compound page.
>
This flag sounds similar to the flag used in classic for netback grant 
mapping. Would it be accepted upstream? Would aliasing PG_pinned make 
sure of that?
> Netback can thus:
>
> 1. Test PG_foreign.
> 2. Verify that the page is foreign via the p2m.
> 3. Extract the domid and gref from page->private.
>
> The PG_foreign test is not strictly necessary as the p2m lookup is
> sufficient, but it should be quicker for non-foreign pages.
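The three netback steps above can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: `sketch_page`,
`SKETCH_PG_FOREIGN`, `mark_foreign()` and `get_foreign()` are hypothetical
names, the `p2m_foreign` field stands in for the `p2m[P] & FOREIGN_FRAME_BIT`
lookup, and the bit layout (domid above a 32-bit gref) is one possible
packing, since the draft does not specify one. A 16-bit domid plus a 32-bit
gref needs 48 bits, which is why the packing only fits in an unsigned long
on 64-bit platforms.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t domid_t;       /* Xen domain IDs are 16 bits wide */
typedef uint32_t grant_ref_t;   /* Xen grant references are 32 bits wide */

#define SKETCH_PG_FOREIGN (1u << 0)  /* hypothetical flag bit, aliasing PG_pinned */

/* Hypothetical stand-in for struct page. */
struct sketch_page {
    unsigned int flags;
    uint64_t priv;      /* stands in for page->private on a 64-bit platform */
    bool p2m_foreign;   /* stands in for p2m[P] & FOREIGN_FRAME_BIT */
};

/* Record the grant details when the foreign page is mapped. */
static void mark_foreign(struct sketch_page *page, domid_t domid,
                         grant_ref_t gref)
{
    page->flags |= SKETCH_PG_FOREIGN;
    page->p2m_foreign = true;
    /* Assumed layout: domid in bits 32..47, gref in bits 0..31. */
    page->priv = ((uint64_t)domid << 32) | gref;
}

/* Returns true and fills in domid/gref only for a genuinely foreign page. */
static bool get_foreign(const struct sketch_page *page, domid_t *domid,
                        grant_ref_t *gref)
{
    /* 1. Cheap flag test; rejects most local pages without a p2m lookup. */
    if (!(page->flags & SKETCH_PG_FOREIGN))
        return false;

    /* 2. The flag aliases PG_pinned, so confirm via the p2m. */
    if (!page->p2m_foreign)
        return false;

    /* 3. Unpack the grant details. */
    *domid = (domid_t)(page->priv >> 32);
    *gref = (grant_ref_t)(page->priv & 0xffffffffu);
    return true;
}
```

Step 2 is what keeps the aliasing safe in this sketch: a pinned page table
page has the flag set but fails the p2m check, so its private field is never
interpreted as a grant.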
