linux-kernel.vger.kernel.org archive mirror
From: Julien Grall <julien.grall@citrix.com>
To: Wei Liu <wei.liu2@citrix.com>, Julien Grall <julien.grall@citrix.com>
Cc: <ian.campbell@citrix.com>, <stefano.stabellini@eu.citrix.com>,
	<netdev@vger.kernel.org>, <tim@xen.org>,
	<linux-kernel@vger.kernel.org>, <xen-devel@lists.xenproject.org>,
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
Date: Fri, 15 May 2015 13:35:42 +0100	[thread overview]
Message-ID: <5555E81E.8070803@citrix.com> (raw)
In-Reply-To: <20150515023534.GE19352@zion.uk.xensource.com>

Hi Wei,

Thank you for the review.

On 15/05/15 03:35, Wei Liu wrote:
> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
>> The PV network protocol is using 4KB page granularity. The goal of this
>> patch is to allow a Linux using 64KB page granularity working as a
>> network backend on a non-modified Xen.
>>
>> It's only necessary to adapt the ring size and break skb data in small
>> chunk of 4KB. The rest of the code is relying on the grant table code.
>>
>> Although only simple workload is working (dhcp request, ping). If I try
>> to use wget in the guest, it will stall until a tcpdump is started on
>> the vif interface in DOM0. I wasn't able to find why.
>>
> 
> I think in wget workload you're more likely to break down 64K pages to
> 4K pages. Some of your calculation of mfn, offset might be wrong.

If so, why would running tcpdump on the vif interface suddenly make wget
work? Does it make netback use a different path?

>> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
>> it's used for (I have limited knowledge on the network driver).
>>
> 
> This is the maximum slots a guest packet can use. AIUI the protocol
> still works on 4K granularity (you break 64K page to a bunch of 4K
> pages), you don't need to change this.

1 slot = 1 grant, right? If so, XEN_NETBK_RX_SLOTS_MAX is based on the
number of Linux pages, so we would have to base it on the number of Xen
pages instead.

I did try multiplying it by XEN_PFN_PER_PAGE (64KB / 4KB = 16), but
then it gets stuck in the loop.

>> ---
>>  drivers/net/xen-netback/common.h  |  7 ++++---
>>  drivers/net/xen-netback/netback.c | 27 ++++++++++++++-------------
>>  2 files changed, 18 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 8a495b3..0eda6e9 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -44,6 +44,7 @@
>>  #include <xen/interface/grant_table.h>
>>  #include <xen/grant_table.h>
>>  #include <xen/xenbus.h>
>> +#include <xen/page.h>
>>  #include <linux/debugfs.h>
>>  
>>  typedef unsigned int pending_ring_idx_t;
>> @@ -64,8 +65,8 @@ struct pending_tx_info {
>>  	struct ubuf_info callback_struct;
>>  };
>>  
>> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
>> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
>> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
>> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
>>  
>>  struct xenvif_rx_meta {
>>  	int id;
>> @@ -80,7 +81,7 @@ struct xenvif_rx_meta {
>>  /* Discriminate from any valid pending_idx value. */
>>  #define INVALID_PENDING_IDX 0xFFFF
>>  
>> -#define MAX_BUFFER_OFFSET PAGE_SIZE
>> +#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
>>  
>>  #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
>>  
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 9ae1d43..ea5ce84 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>>  {
>>  	struct gnttab_copy *copy_gop;
>>  	struct xenvif_rx_meta *meta;
>> -	unsigned long bytes;
>> +	unsigned long bytes, off_grant;
>>  	int gso_type = XEN_NETIF_GSO_TYPE_NONE;
>>  
>>  	/* Data must not cross a page boundary. */
>> @@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>>  		if (npo->copy_off == MAX_BUFFER_OFFSET)
>>  			meta = get_next_rx_buffer(queue, npo);
>>  
>> -		bytes = PAGE_SIZE - offset;
>> +		off_grant = offset & ~XEN_PAGE_MASK;
>> +		bytes = XEN_PAGE_SIZE - off_grant;
>>  		if (bytes > size)
>>  			bytes = size;
>>  
>> @@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>>  		} else {
>>  			copy_gop->source.domid = DOMID_SELF;
>>  			copy_gop->source.u.gmfn =
>> -				virt_to_mfn(page_address(page));
>> +				virt_to_mfn(page_address(page) + offset);
>>  		}
>> -		copy_gop->source.offset = offset;
>> +		copy_gop->source.offset = off_grant;
>>  
>>  		copy_gop->dest.domid = queue->vif->domid;
>>  		copy_gop->dest.offset = npo->copy_off;
>> @@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
>>  		first->size -= txp->size;
>>  		slots++;
>>  
>> -		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
>> +		if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) {
>>  			netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
>>  				 txp->offset, txp->size);
>>  			xenvif_fatal_tx_err(queue->vif);
>> @@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>>  		}
>>  
>>  		/* No crossing a page as the payload mustn't fragment. */
>> -		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
>> +		if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
>>  			netdev_err(queue->vif->dev,
>>  				   "txreq.offset: %x, size: %u, end: %lu\n",
>>  				   txreq.offset, txreq.size,
>> -				   (txreq.offset&~PAGE_MASK) + txreq.size);
>> +				   (txreq.offset&~XEN_PAGE_MASK) + txreq.size);
>>  			xenvif_fatal_tx_err(queue->vif);
>>  			break;
>>  		}
>> @@ -1287,7 +1288,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>>  			virt_to_mfn(skb->data);
> 
> You didn't change the calculation of MFN here. I think it returns the
> MFN of the first 4K sub-page of that 64K page.  Do I miss anything?

No change is required. On ARM, virt_to_mfn is implemented as:

pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT)

which returns a 4KB-granular frame number (see patch #23).

> 
>>  		queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>>  		queue->tx_copy_ops[*copy_ops].dest.offset =
>> -			offset_in_page(skb->data);
>> +			offset_in_page(skb->data) & ~XEN_PAGE_MASK;
>>  
>>  		queue->tx_copy_ops[*copy_ops].len = data_len;
>>  		queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref;
>> @@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
> 
> This function is to coalesce frag_list to a new SKB. It's completely
> fine to use the natural granularity of backend domain. The way you
> modified it can lead to waste of memory, i.e. you only use first 4K of a
> 64K page.

Thanks for explaining. I wasn't sure how the function worked, so I
changed it to be safe. I will redo the change.

FWIW, I'm sure there are other places in netback where we waste memory
with 64KB page granularity (such as the grant table). I need to track
them down.

Let me know if you have any places in mind where the memory usage can
be improved.

>>  			return -ENOMEM;
>>  		}
>>  
>> -		if (offset + PAGE_SIZE < skb->len)
>> -			len = PAGE_SIZE;
>> +		if (offset + XEN_PAGE_SIZE < skb->len)
>> +			len = XEN_PAGE_SIZE;
>>  		else
>>  			len = skb->len - offset;
>>  		if (skb_copy_bits(skb, offset, page_address(page), len))
>> @@ -1396,7 +1397,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
>>  	/* Fill the skb with the new (local) frags. */
>>  	memcpy(skb_shinfo(skb)->frags, frags, i * sizeof(skb_frag_t));
>>  	skb_shinfo(skb)->nr_frags = i;
>> -	skb->truesize += i * PAGE_SIZE;
>> +	skb->truesize += i * XEN_PAGE_SIZE;
> 
> The true size accounts for the actual memory occupied by this SKB. Since
> the page is allocated with alloc_page, the granularity should be
> PAGE_SIZE not XEN_PAGE_SIZE.

Ok. I will replace it with PAGE_SIZE.

Regards,

-- 
Julien Grall

