Netdev List
 help / color / mirror / Atom feed
* Re: Neterion and UFO handling [was: Re: [PATCH] ipv6: udp packets following an UFO enqueued packet need also be handled by UFO]
From: Jon Mason @ 2013-10-08  8:07 UTC (permalink / raw)
  To: Jon Mason, Jiri Pirko, netdev, yoshfuji, David Miller, kuznet,
	jmorris, kaber, herbert, Eric Dumazet
In-Reply-To: <20131002162730.GA3991@order.stressinduktion.org>

On Wed, Oct 2, 2013 at 9:27 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> Hi!
>
> I have a question regarding UFO and the neterion driver, which as the only one
> advertises hardware UFO support:
>
> The patch discusses in this thread
> http://thread.gmane.org/gmane.linux.network/284348/focus=285405 could change
> some semantics how packets are constructed before submitted to the driver.
>
> We currently guarantee that we have the MAC/IP/UDP header in skb->data and the
> payload is attached in the skb's frags. With the changes discussed in this
> thread it is possible that we also append to skb->data some amount of data
> which is not targeted for the header. From reading the driver sources it seems
> the hardware interprets the skb->data to skb_headlen as the header, so we
> could include some data in the fragments more than once.

>From my reading of the HW Spec and a quick look at the driver, it
appears that the driver is using one entry in the TX ring for the
header and another for the body of the packet to be fragmented (which
is what the hardware wants).  I don't understand what you are saying,
but if you are asking if simply appending a new header & data to the
end of skb->data will get it out on the wire correct, I don't believe
it will.

I do have hardware that I can try the patch on, if you can walk me
through the use case (unless it is as easy as setup an IPv6 connection
and ping).

Time for sleep now....

Thanks,
Jon

>
> Do you think this change is safe? Otherwise I would suggest that the UFO
> capability is switched off until the driver signals the hardware the start and
> end of the headers correctly?
>
> I left the mail below intact which points to the specific place in s2io.c
> where I think the problem is.
>
> On Wed, Oct 02, 2013 at 08:14:27AM -0700, Eric Dumazet wrote:
>> On Wed, 2013-10-02 at 15:03 +0200, Hannes Frederic Sowa wrote:
>> > On Wed, Oct 02, 2013 at 02:12:07PM +0200, Hannes Frederic Sowa wrote:
>> > > Hi Eric!
>> > >
>> > > On Wed, Oct 02, 2013 at 03:41:28AM -0700, Eric Dumazet wrote:
>> > > > On Wed, 2013-10-02 at 10:58 +0200, Jiri Pirko wrote:
>> > > > > Wed, Oct 02, 2013 at 01:25:34AM CEST, hannes@stressinduktion.org wrote:
>> > > > > >-    if (((length > mtu) || (skb && skb_is_gso(skb))) &&
>> > > > > >+    if (((length > mtu) || (skb && skb_has_frags(skb))) &&
>> > > >
>> > > > >
>> > > > > This seems correct to me. sk_is_gso would work as well is you apply my
>> > > > > patch "[patch net] ip6_output: do skb ufo init for peeked non ufo skb as
>> > > > > well" which does the setting of gso_size.
>> > > >
>> > > > Well, skb having frags or not should not be a concern :
>> > > > Thats an allocation choice (lets say to avoid high order allocations).
>> > > >
>> > > > Setting gso_size is probably better.
>> > >
>> > > e89e9cf539a28df7d0eb1d0a545368e9920b34ac ("[IPv4/IPv6]: UFO Scatter-gather
>> > > approach") states:
>> > >
>> > > "
>> > > skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[]
>> > > contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW
>> > > indicating that hardware has to do checksum calculation. Hardware should
>> > > compute the UDP checksum of complete datagram and also ip header checksum of
>> > > each fragmented IP packet.
>> > > "
>> > >
>> > > This is the reason why I tried not to update the gso_size. If it is ok, I am
>> > > fine with that.
>> >
>> > Especially, drivers/net/ethernet/neterion/s2io.c states that the first dma
>> > mapping (skb->data with skb_headlen, which is fine) is used as the inband
>> > header:
>> >
>> >         if (offload_type == SKB_GSO_UDP)
>> >                 frg_cnt++; /* as Txd0 was used for inband header */
>> >
>> > That is my only other hint that we maybe should not update gso_size and
>> > gso_type. I guess software fallback does not have this problem, but I won't
>> > have time to check until this evening.
>> >
>> > I am really not sure if just setting gso_size does not break neterion UFO
>> > offloading. :/
>>
>> Well, just ask Jon Mason to double check ;)
>>
>> I think the commit intent was to set gso_size :
>>
>>    skb_shinfo(skb)->ufo_size will indicate the length of data part in each IP
>>     fragment going out of the adapter after IP fragmentation by hardware.
>>
>> The fact that it states "skb->data will contain MAC/IP/UDP header and
>> skb_shinfo(skb)->frags[] contains the data payload." seems irrelevant.
>>
>> If Neterion driver mandates that skb->head *only* contains the
>> MAC/IP/UDP header, that should be handled in the driver itself.
>
> Thanks Eric for clearing this up.
>
> I really thought it would be the common pattern for UFO to have only headers
> in skb->data, so I didn't bother to ask in the first place.
>
> Thanks,
>
>   Hannes
>

^ permalink raw reply

* RE: [PATCH] mac80211: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))
From: David Laight @ 2013-10-08  8:33 UTC (permalink / raw)
  To: djduanjiong, Johannes Berg, John W. Linville, David S. Miller
  Cc: linux-wireless, netdev, Duan Jiong
In-Reply-To: <1381190953-6362-1-git-send-email-duanj.fnst@cn.fujitsu.com>

> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
> ---
>  net/mac80211/key.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/mac80211/key.c b/net/mac80211/key.c
> index 620677e..3e51dd7 100644
> --- a/net/mac80211/key.c
> +++ b/net/mac80211/key.c
> @@ -879,7 +879,7 @@ ieee80211_gtk_rekey_add(struct ieee80211_vif *vif,
>  				  keyconf->keylen, keyconf->key,
>  				  0, NULL);
>  	if (IS_ERR(key))
> -		return ERR_PTR(PTR_ERR(key));
> +		return ERR_CAST(key);

I suspect the old code is a deliberate attempt to indicate that it
the error value that is being returned rather than the pointer.

Looking higher up that file there seems to be a fubar when setting
up the TKIP key.
It seems to put the same 6 bytes into every TID.
(I'm sure seq_len shouldn't be ignored either.0

	David

^ permalink raw reply

* Re: [PATCH] mac80211: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))
From: Johannes Berg @ 2013-10-08  8:37 UTC (permalink / raw)
  To: David Laight
  Cc: djduanjiong-Re5JQEeQqe8AvxtiuMwx3w, John W. Linville,
	David S. Miller, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Duan Jiong
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B7379-CgBM+Bx2aUAnGFn1LkZF6NBPR1lH4CV8@public.gmane.org>

On Tue, 2013-10-08 at 09:33 +0100, David Laight wrote:

> >  	if (IS_ERR(key))
> > -		return ERR_PTR(PTR_ERR(key));
> > +		return ERR_CAST(key);
> 
> I suspect the old code is a deliberate attempt to indicate that it
> the error value that is being returned rather than the pointer.

I was under the impression that's what ERR_CAST means.

> Looking higher up that file there seems to be a fubar when setting
> up the TKIP key.
> It seems to put the same 6 bytes into every TID.
> (I'm sure seq_len shouldn't be ignored either.0

Both are fine.

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* RE: bug in passing file descriptors
From: David Laight @ 2013-10-08  8:43 UTC (permalink / raw)
  To: Steve Rago, David Miller; +Cc: luto, netdev, mtk.manpages, ebiederm
In-Reply-To: <5253199B.3000109@nec-labs.com>

> > There is no compatability issue.
> >
> > 32-bit tasks will always see the 4-byte align/length.
> > 64-bit tasks will always see the 8-byte align/length.
> >
> 
> Really?  So when I compile my application on a 32-bit Linux box and then try to run it on a 64-bit
> Linux box, you're not
> going to overrun my buffer when CMSG_SPACE led me to allocate an insufficient amount of memory needed
> to account for
> padding on the 64-bit platform?

Worth checking what happens if you try to send an fd between a 32 bit
application and a 64bit one.
Also is the fd actually closed if the receiving applications buffer
isn't big enough.

	David

^ permalink raw reply

* Re: [PATCH RFC 00/77] Re-design MSI/MSI-X interrupts enablement pattern
From: Alexander Gordeev @ 2013-10-08  9:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-kernel, Bjorn Helgaas, Ralf Baechle, Michael Ellerman,
	Benjamin Herrenschmidt, Martin Schwidefsky, Ingo Molnar,
	Dan Williams, Andy King, Jon Mason, Matt Porter, linux-pci,
	linux-mips, linuxppc-dev, linux390, linux-s390, x86, linux-ide,
	iss_storagedev, linux-nvme, linux-rdma, netdev, e1000-devel,
	linux-driver, So
In-Reply-To: <20131007182117.GC27396@htj.dyndns.org>

On Mon, Oct 07, 2013 at 02:21:17PM -0400, Tejun Heo wrote:
> Whee.... that's a lot more than I expected.  I was just scanning
> multiple msi users.  Maybe we can stage the work in more manageable
> steps so that you don't have to go through massive conversion only to
> do it all over again afterwards and likewise people don't get
> bombarded on each iteration?  Maybe we can first update pci / msi code
> proper, msi and then msix?

Multipe MSIs is just a handful of drivers, really. MSI-X impact still
will be huge. But if we opt a different name for the new pci_enable_msix()
then we could first update pci/msi, then drivers (in few stages possibly)
and finally remove the old implementation.

> tejun

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply

* I NEED YOUR TRUST,
From: Mr_Badmons @ 2013-10-08  9:27 UTC (permalink / raw)


I NEED YOUR TRUST,

I am Mr,Badmons Dukaz,The Head of file Department in Bank Of Africa.I seek your assistance and i assured of your capability to champion this business opportunity to remit $18.2 million U.S.A dollars. Into your account if you are interested let me know so that i can send you the details of this transaction.
I agree that 30% of this money will be for you 10% will be set aside for expenses incurred during the business and 60% would be for me.
Regards,
Badmons Dukaz. 

^ permalink raw reply

* [PATCH net-next] Revert "xen-netback: improve ring effeciency for guest RX"
From: Wei Liu @ 2013-10-08  9:54 UTC (permalink / raw)
  To: netdev
  Cc: xen-devel, Wei Liu, Ian Campbell, Annie Li, Matt Wilson, Xi Xiong,
	David Vrabel, Paul Durrant

This reverts commit 4f0581d25827d5e864bcf07b05d73d0d12a20a5c.

The named changeset is causing problem. Let's aim to make this part less
fragile before trying to improve things.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Matt Wilson <msw@amazon.com>
Cc: Xi Xiong <xixiong@amazon.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
---
 drivers/net/xen-netback/netback.c |  144 +++++++++++++++++++++----------------
 1 file changed, 83 insertions(+), 61 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index d0b0feb..f3e591c 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -47,14 +47,6 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/page.h>
 
-/* SKB control block overlay is used to store useful information when
- * doing guest RX.
- */
-struct skb_cb_overlay {
-	int meta_slots_used;
-	int peek_slots_count;
-};
-
 /* Provide an option to disable split event channels at load time as
  * event channels are limited resource. Split event channels are
  * enabled by default.
@@ -220,6 +212,49 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
 	return false;
 }
 
+struct xenvif_count_slot_state {
+	unsigned long copy_off;
+	bool head;
+};
+
+unsigned int xenvif_count_frag_slots(struct xenvif *vif,
+				     unsigned long offset, unsigned long size,
+				     struct xenvif_count_slot_state *state)
+{
+	unsigned count = 0;
+
+	offset &= ~PAGE_MASK;
+
+	while (size > 0) {
+		unsigned long bytes;
+
+		bytes = PAGE_SIZE - offset;
+
+		if (bytes > size)
+			bytes = size;
+
+		if (start_new_rx_buffer(state->copy_off, bytes, state->head)) {
+			count++;
+			state->copy_off = 0;
+		}
+
+		if (state->copy_off + bytes > MAX_BUFFER_OFFSET)
+			bytes = MAX_BUFFER_OFFSET - state->copy_off;
+
+		state->copy_off += bytes;
+
+		offset += bytes;
+		size -= bytes;
+
+		if (offset == PAGE_SIZE)
+			offset = 0;
+
+		state->head = false;
+	}
+
+	return count;
+}
+
 /*
  * Figure out how many ring slots we're going to need to send @skb to
  * the guest. This function is essentially a dry run of
@@ -227,53 +262,40 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
  */
 unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
 {
+	struct xenvif_count_slot_state state;
 	unsigned int count;
-	int i, copy_off;
-	struct skb_cb_overlay *sco;
+	unsigned char *data;
+	unsigned i;
 
-	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
+	state.head = true;
+	state.copy_off = 0;
 
-	copy_off = skb_headlen(skb) % PAGE_SIZE;
+	/* Slot for the first (partial) page of data. */
+	count = 1;
 
+	/* Need a slot for the GSO prefix for GSO extra data? */
 	if (skb_shinfo(skb)->gso_size)
 		count++;
 
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
-		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
-		unsigned long bytes;
-
-		offset &= ~PAGE_MASK;
-
-		while (size > 0) {
-			BUG_ON(offset >= PAGE_SIZE);
-			BUG_ON(copy_off > MAX_BUFFER_OFFSET);
-
-			bytes = PAGE_SIZE - offset;
-
-			if (bytes > size)
-				bytes = size;
+	data = skb->data;
+	while (data < skb_tail_pointer(skb)) {
+		unsigned long offset = offset_in_page(data);
+		unsigned long size = PAGE_SIZE - offset;
 
-			if (start_new_rx_buffer(copy_off, bytes, 0)) {
-				count++;
-				copy_off = 0;
-			}
+		if (data + size > skb_tail_pointer(skb))
+			size = skb_tail_pointer(skb) - data;
 
-			if (copy_off + bytes > MAX_BUFFER_OFFSET)
-				bytes = MAX_BUFFER_OFFSET - copy_off;
+		count += xenvif_count_frag_slots(vif, offset, size, &state);
 
-			copy_off += bytes;
+		data += size;
+	}
 
-			offset += bytes;
-			size -= bytes;
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
+		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
 
-			if (offset == PAGE_SIZE)
-				offset = 0;
-		}
+		count += xenvif_count_frag_slots(vif, offset, size, &state);
 	}
-
-	sco = (struct skb_cb_overlay *)skb->cb;
-	sco->peek_slots_count = count;
 	return count;
 }
 
@@ -305,11 +327,14 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif *vif,
 	return meta;
 }
 
-/* Set up the grant operations for this fragment. */
+/*
+ * Set up the grant operations for this fragment. If it's a flipping
+ * interface, we also set up the unmap request from here.
+ */
 static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 				 struct netrx_pending_operations *npo,
 				 struct page *page, unsigned long size,
-				 unsigned long offset, int head, int *first)
+				 unsigned long offset, int *head)
 {
 	struct gnttab_copy *copy_gop;
 	struct xenvif_rx_meta *meta;
@@ -333,12 +358,12 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		if (bytes > size)
 			bytes = size;
 
-		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
+		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
 			/*
 			 * Netfront requires there to be some data in the head
 			 * buffer.
 			 */
-			BUG_ON(*first);
+			BUG_ON(*head);
 
 			meta = get_next_rx_buffer(vif, npo);
 		}
@@ -372,10 +397,10 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		}
 
 		/* Leave a gap for the GSO descriptor. */
-		if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
+		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
 			vif->rx.req_cons++;
 
-		*first = 0; /* There must be something in this buffer now. */
+		*head = 0; /* There must be something in this buffer now. */
 
 	}
 }
@@ -401,7 +426,7 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	struct xen_netif_rx_request *req;
 	struct xenvif_rx_meta *meta;
 	unsigned char *data;
-	int first = 1;
+	int head = 1;
 	int old_meta_prod;
 
 	old_meta_prod = npo->meta_prod;
@@ -437,7 +462,7 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 			len = skb_tail_pointer(skb) - data;
 
 		xenvif_gop_frag_copy(vif, skb, npo,
-				     virt_to_page(data), len, offset, 1, &first);
+				     virt_to_page(data), len, offset, &head);
 		data += len;
 	}
 
@@ -446,7 +471,7 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 				     skb_frag_page(&skb_shinfo(skb)->frags[i]),
 				     skb_frag_size(&skb_shinfo(skb)->frags[i]),
 				     skb_shinfo(skb)->frags[i].page_offset,
-				     0, &first);
+				     &head);
 	}
 
 	return npo->meta_prod - old_meta_prod;
@@ -504,6 +529,10 @@ static void xenvif_add_frag_responses(struct xenvif *vif, int status,
 	}
 }
 
+struct skb_cb_overlay {
+	int meta_slots_used;
+};
+
 static void xenvif_kick_thread(struct xenvif *vif)
 {
 	wake_up(&vif->wq);
@@ -534,26 +563,19 @@ void xenvif_rx_action(struct xenvif *vif)
 	count = 0;
 
 	while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
-		RING_IDX old_rx_req_cons;
-
 		vif = netdev_priv(skb->dev);
 		nr_frags = skb_shinfo(skb)->nr_frags;
 
-		old_rx_req_cons = vif->rx.req_cons;
 		sco = (struct skb_cb_overlay *)skb->cb;
 		sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
 
-		count += vif->rx.req_cons - old_rx_req_cons;
+		count += nr_frags + 1;
 
 		__skb_queue_tail(&rxq, skb);
 
-		skb = skb_peek(&vif->rx_queue);
-		if (skb == NULL)
-			break;
-		sco = (struct skb_cb_overlay *)skb->cb;
-
 		/* Filled the batch queue? */
-		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
+		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
+		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
 			break;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related

* Re: [raw v1 2/4] [NET] Use raw_cpu ops for SNMP stats
From: Peter Zijlstra @ 2013-10-08 10:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Lameter, Tejun Heo, akpm, Eric Dumazet, netdev,
	Steven Rostedt, linux-kernel, Thomas Gleixner
In-Reply-To: <20131008072114.GA4455@gmail.com>

On Tue, Oct 08, 2013 at 09:21:14AM +0200, Ingo Molnar wrote:
> 
> * Christoph Lameter <cl@linux.com> wrote:
> 
> > SNMP stats are not protected by preemption but by bh handling.
> 
> Most forms of bh exclusion work via the preemption count though, and 
> softirq contexts themselves are generally not preemptible [to other CPUs] 
> either.
> 
> So the warnings should, in most cases, not trigger.

Right, so softirqs run either in the irq tail at which point
preempt_count += SOFTIRQ_OFFSET and thus preemption is disabled, or it
runs in ksoftirqd which has strict cpu affinity which also disables the
warning, and it also increments preempt_count with SOFTIRQ_OFFSET to
exclude the softirq from interrupts while its running, also disabling
the warning.

So it should very much not trigger.. if it does you want to know about
it.

^ permalink raw reply

* Re: [PATCH net-next] Revert "xen-netback: improve ring effeciency for guest RX"
From: Ian Campbell @ 2013-10-08 10:21 UTC (permalink / raw)
  To: Wei Liu
  Cc: netdev, xen-devel, Annie Li, Matt Wilson, Xi Xiong, David Vrabel,
	Paul Durrant
In-Reply-To: <1381226061-6521-1-git-send-email-wei.liu2@citrix.com>

On Tue, 2013-10-08 at 10:54 +0100, Wei Liu wrote:
> This reverts commit 4f0581d25827d5e864bcf07b05d73d0d12a20a5c.
> 
> The named changeset is causing problem. Let's aim to make this part less
> fragile before trying to improve things.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

Although I thought davem would just run git revert so I don't know if it
is needed.

^ permalink raw reply

* Re: [raw v1 2/4] [NET] Use raw_cpu ops for SNMP stats
From: Ingo Molnar @ 2013-10-08 10:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Lameter, Tejun Heo, akpm, Eric Dumazet, netdev,
	Steven Rostedt, linux-kernel, Thomas Gleixner
In-Reply-To: <20131008102142.GT3081@twins.programming.kicks-ass.net>


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Oct 08, 2013 at 09:21:14AM +0200, Ingo Molnar wrote:
> > 
> > * Christoph Lameter <cl@linux.com> wrote:
> > 
> > > SNMP stats are not protected by preemption but by bh handling.
> > 
> > Most forms of bh exclusion work via the preemption count though, and 
> > softirq contexts themselves are generally not preemptible [to other CPUs] 
> > either.
> > 
> > So the warnings should, in most cases, not trigger.
> 
> Right, so softirqs run either in the irq tail at which point 
> preempt_count += SOFTIRQ_OFFSET and thus preemption is disabled, or it 
> runs in ksoftirqd which has strict cpu affinity which also disables the 
> warning, and it also increments preempt_count with SOFTIRQ_OFFSET to 
> exclude the softirq from interrupts while its running, also disabling 
> the warning.

A third context would be syscall-level code that runs with 
local_bh_disable()/enable() - but that too ought to have the preempt count 
elevated.

> So it should very much not trigger.. if it does you want to know about 
> it.

Yes. If nothing else then for the education value.

Thanks,

	Ingo

^ permalink raw reply

* [PATCH net-next 0/5] xen-netback: IPv6 offload support
From: Paul Durrant @ 2013-10-08 10:44 UTC (permalink / raw)
  To: xen-devel, netdev

This patch series adds support for checksum and large packet offloads into
xen-netback.
Testing has mainly been done using the Microsoft network hardware
certification suite running in Server 2008R2 VMs with Citrix PV frontends.

^ permalink raw reply

* [PATCH net-next 1/5] xen-netback: add IPv6 checksum offload to guest
From: Paul Durrant @ 2013-10-08 10:44 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229081-17814-1-git-send-email-paul.durrant@citrix.com>

Check xenstore flag feature-ipv6-csum-offload to determine if a
guest is happy to accept IPv6 packets with only partial checksum.
Also check analogous feature-ip-csum-offload to determone if a
guest is happy to accept IPv4 packets with only partial checksum
as a replacement for a negated feature-no-csum-offload value and
add a comment to deprecate use of feature-no-csum-offload.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liuw2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/common.h    |    3 ++-
 drivers/net/xen-netback/interface.c |   10 +++++++---
 drivers/net/xen-netback/xenbus.c    |   24 +++++++++++++++++++++++-
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 5715318..b4a9a3c 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -153,7 +153,8 @@ struct xenvif {
 	u8 can_sg:1;
 	u8 gso:1;
 	u8 gso_prefix:1;
-	u8 csum:1;
+	u8 ip_csum:1;
+	u8 ipv6_csum:1;
 
 	/* Internal feature information. */
 	u8 can_queue:1;	    /* can queue packets for receiver? */
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 01bb854..8e92783 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -216,8 +216,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
 		features &= ~NETIF_F_SG;
 	if (!vif->gso && !vif->gso_prefix)
 		features &= ~NETIF_F_TSO;
-	if (!vif->csum)
+	if (!vif->ip_csum)
 		features &= ~NETIF_F_IP_CSUM;
+	if (!vif->ipv6_csum)
+		features &= ~NETIF_F_IPV6_CSUM;
 
 	return features;
 }
@@ -306,7 +308,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	vif->domid  = domid;
 	vif->handle = handle;
 	vif->can_sg = 1;
-	vif->csum = 1;
+	vif->ip_csum = 1;
 	vif->dev = dev;
 
 	vif->credit_bytes = vif->remaining_credit = ~0UL;
@@ -316,7 +318,9 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	vif->credit_timeout.expires = jiffies;
 
 	dev->netdev_ops	= &xenvif_netdev_ops;
-	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
+	dev->hw_features = NETIF_F_SG |
+		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
+		NETIF_F_TSO;
 	dev->features = dev->hw_features;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index b45bce2..5e45b00 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -567,10 +567,32 @@ static int connect_rings(struct backend_info *be)
 		val = 0;
 	vif->gso_prefix = !!val;
 
+	/* Before feature-ipv6-csum-offload was introduced, checksum offload
+	 * was turned on by a zero value in (or lack of)
+	 * feature-no-csum-offload. Therefore, for compatibility, assume
+	 * that if feature-ip-csum-offload is missing that IP (v4) offload
+	 * is turned on. If this is not the case then the flag will be
+	 * cleared when we subsequently sample feature-no-csum-offload.
+	 */
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ip-csum-offload",
+			 "%d", &val) < 0)
+		val = 1;
+	vif->ip_csum = !!val;
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ipv6-csum-offload",
+			 "%d", &val) < 0)
+		val = 0;
+	vif->ipv6_csum = !!val;
+
+	/* This is deprecated. Frontends should set IP v4 and v6 checksum
+	 * offload using feature-ip-csum-offload and
+	 * feature-ipv6-csum-offload respectively.
+	 */
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-no-csum-offload",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->csum = !val;
+	if (val)
+		vif->ip_csum = 0;
 
 	/* Map the shared frame, irq etc. */
 	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref,
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 2/5] xen-netback: Add support for IPv6 checksum offload from guest
From: Paul Durrant @ 2013-10-08 10:44 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229081-17814-1-git-send-email-paul.durrant@citrix.com>

For performance of VM to VM traffic on a single host it is better to avoid
calculation of TCP/UDP checksum in the sending frontend. To allow this this
patch adds the code necessary to set up partial checksum for IPv6 packets
and xenstore flag feature-ipv6-csum-offload to advertise that fact to
frontends.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liuw2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/netback.c |  224 ++++++++++++++++++++++++++++++-------
 drivers/net/xen-netback/xenbus.c  |    9 ++
 2 files changed, 191 insertions(+), 42 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f3e591c..35b1fa6 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -109,15 +109,10 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
 	return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
 }
 
-/*
- * This is the amount of packet we copy rather than map, so that the
- * guest can't fiddle with the contents of the headers while we do
- * packet processing on them (netfilter, routing, etc).
+/* This is a miniumum size for the linear area to avoid lots of
+ * calls to __pskb_pull_tail() as we set up checksum offsets.
  */
-#define PKT_PROT_LEN    (ETH_HLEN + \
-			 VLAN_HLEN + \
-			 sizeof(struct iphdr) + MAX_IPOPTLEN + \
-			 sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
+#define PKT_PROT_LEN 128
 
 static u16 frag_get_pending_idx(skb_frag_t *frag)
 {
@@ -1118,61 +1113,72 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
 	return 0;
 }
 
-static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+static inline void maybe_pull_tail(struct sk_buff *skb, unsigned int len)
 {
-	struct iphdr *iph;
+	if (skb_is_nonlinear(skb) && skb_headlen(skb) < len) {
+		int target = min_t(int, skb->len, len);
+		__pskb_pull_tail(skb, target - skb_headlen(skb));
+	}
+}
+
+static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
+			     int recalculate_partial_csum)
+{
+	struct iphdr *iph = (void *)skb->data;
+	unsigned int header_size;
+	unsigned int off;
 	int err = -EPROTO;
-	int recalculate_partial_csum = 0;
 
-	/*
-	 * A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
-	 * peers can fail to set NETRXF_csum_blank when sending a GSO
-	 * frame. In this case force the SKB to CHECKSUM_PARTIAL and
-	 * recalculate the partial checksum.
-	 */
-	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
-		vif->rx_gso_checksum_fixup++;
-		skb->ip_summed = CHECKSUM_PARTIAL;
-		recalculate_partial_csum = 1;
-	}
+	off = sizeof(struct iphdr);
 
-	/* A non-CHECKSUM_PARTIAL SKB does not require setup. */
-	if (skb->ip_summed != CHECKSUM_PARTIAL)
-		return 0;
+	header_size = skb->network_header + off + MAX_IPOPTLEN;
+	maybe_pull_tail(skb, header_size);
 
-	if (skb->protocol != htons(ETH_P_IP))
-		goto out;
+	off = iph->ihl * 4;
 
-	iph = (void *)skb->data;
 	switch (iph->protocol) {
 	case IPPROTO_TCP:
-		if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+		if (!skb_partial_csum_set(skb, off,
 					  offsetof(struct tcphdr, check)))
 			goto out;
 
 		if (recalculate_partial_csum) {
 			struct tcphdr *tcph = tcp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct tcphdr) +
+				MAX_TCP_OPTION_SPACE;
+			maybe_pull_tail(skb, header_size);
+
 			tcph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
-							 skb->len - iph->ihl*4,
+							 skb->len - off,
 							 IPPROTO_TCP, 0);
 		}
 		break;
 	case IPPROTO_UDP:
-		if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+		if (!skb_partial_csum_set(skb, off,
 					  offsetof(struct udphdr, check)))
 			goto out;
 
 		if (recalculate_partial_csum) {
 			struct udphdr *udph = udp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct udphdr);
+			maybe_pull_tail(skb, header_size);
+
 			udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
-							 skb->len - iph->ihl*4,
+							 skb->len - off,
 							 IPPROTO_UDP, 0);
 		}
 		break;
 	default:
 		if (net_ratelimit())
 			netdev_err(vif->dev,
-				   "Attempting to checksum a non-TCP/UDP packet, dropping a protocol %d packet\n",
+				   "Attempting to checksum a non-TCP/UDP packet, "
+				   "dropping a protocol %d packet\n",
 				   iph->protocol);
 		goto out;
 	}
@@ -1183,6 +1189,148 @@ out:
 	return err;
 }
 
+static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
+			       int recalculate_partial_csum)
+{
+	int err = -EPROTO;
+	struct ipv6hdr *ipv6h = (void *)skb->data;
+	u8 nexthdr;
+	unsigned int header_size;
+	unsigned int off;
+	bool done;
+
+	done = false;
+	off = sizeof(struct ipv6hdr);
+
+	header_size = skb->network_header + off;
+	maybe_pull_tail(skb, header_size);
+
+	nexthdr = ipv6h->nexthdr;
+
+	while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len)) &&
+	       !done) {
+		/* We only access at most the first 2 bytes of any option header
+		 * to determine the next one.
+		 */
+		header_size = skb->network_header + off + 2;
+		maybe_pull_tail(skb, header_size);
+
+		switch (nexthdr) {
+		case IPPROTO_FRAGMENT: {
+			struct frag_hdr *hp = (void *)(skb->data + off);
+
+			nexthdr = hp->nexthdr;
+			off += 8;
+			break;
+		}
+		case IPPROTO_DSTOPTS:
+		case IPPROTO_HOPOPTS:
+		case IPPROTO_ROUTING: {
+			struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
+
+			nexthdr = hp->nexthdr;
+			off += ipv6_optlen(hp);
+			break;
+		}
+		case IPPROTO_AH: {
+			struct ip_auth_hdr *hp = (void *)(skb->data + off);
+
+			nexthdr = hp->nexthdr;
+			off += (hp->hdrlen+2)<<2;
+			break;
+		}
+		default:
+			done = true;
+			break;
+		}
+	}
+
+	if (!done)
+		goto out;
+
+	switch (nexthdr) {
+	case IPPROTO_TCP:
+		if (!skb_partial_csum_set(skb, off,
+					  offsetof(struct tcphdr, check)))
+			goto out;
+
+		if (recalculate_partial_csum) {
+			struct tcphdr *tcph = tcp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct tcphdr) +
+				MAX_TCP_OPTION_SPACE;
+			maybe_pull_tail(skb, header_size);
+
+			tcph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+						       &ipv6h->daddr,
+						       skb->len - off,
+						       IPPROTO_TCP, 0);
+		}
+		break;
+	case IPPROTO_UDP:
+		if (!skb_partial_csum_set(skb, off,
+					  offsetof(struct udphdr, check)))
+			goto out;
+
+		if (recalculate_partial_csum) {
+			struct udphdr *udph = udp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct udphdr);
+			maybe_pull_tail(skb, header_size);
+
+			udph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+						       &ipv6h->daddr,
+						       skb->len - off,
+						       IPPROTO_UDP, 0);
+		}
+		break;
+	default:
+		if (net_ratelimit())
+			netdev_err(vif->dev,
+				   "Attempting to checksum a non-TCP/UDP packet, "
+				   "dropping a protocol %d packet\n",
+				   nexthdr);
+		goto out;
+	}
+
+	err = 0;
+
+out:
+	return err;
+}
+
+static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+{
+	int err = -EPROTO;
+	int recalculate_partial_csum = 0;
+
+	/* A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
+	 * peers can fail to set NETRXF_csum_blank when sending a GSO
+	 * frame. In this case force the SKB to CHECKSUM_PARTIAL and
+	 * recalculate the partial checksum.
+	 */
+	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
+		vif->rx_gso_checksum_fixup++;
+		skb->ip_summed = CHECKSUM_PARTIAL;
+		recalculate_partial_csum = 1;
+	}
+
+	/* A non-CHECKSUM_PARTIAL SKB does not require setup. */
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return 0;
+
+	if (skb->protocol == htons(ETH_P_IP))
+		err = checksum_setup_ip(vif, skb, recalculate_partial_csum);
+	else if (skb->protocol == htons(ETH_P_IPV6))
+		err = checksum_setup_ipv6(vif, skb, recalculate_partial_csum);
+
+	return err;
+}
+
 static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
 {
 	unsigned long now = jiffies;
@@ -1428,15 +1576,7 @@ static int xenvif_tx_submit(struct xenvif *vif, int budget)
 
 		xenvif_fill_frags(vif, skb);
 
-		/*
-		 * If the initial fragment was < PKT_PROT_LEN then
-		 * pull through some bytes from the other fragments to
-		 * increase the linear region to PKT_PROT_LEN bytes.
-		 */
-		if (skb_headlen(skb) < PKT_PROT_LEN && skb_is_nonlinear(skb)) {
-			int target = min_t(int, skb->len, PKT_PROT_LEN);
-			__pskb_pull_tail(skb, target - skb_headlen(skb));
-		}
+		maybe_pull_tail(skb, PKT_PROT_LEN);
 
 		skb->dev      = vif->dev;
 		skb->protocol = eth_type_trans(skb, skb->dev);
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 5e45b00..dff423b 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -101,6 +101,15 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		/* We support partial checksum setup for IPv6 packets */
+		err = xenbus_printf(xbt, dev->nodename,
+				    "feature-ipv6-csum-offload",
+				    "%d", 1);
+		if (err) {
+			message = "writing feature-ipv6-csum-offload";
+			goto abort_transaction;
+		}
+
 		/* We support rx-copy path. */
 		err = xenbus_printf(xbt, dev->nodename,
 				    "feature-rx-copy", "%d", 1);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest
From: Paul Durrant @ 2013-10-08 10:44 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229081-17814-1-git-send-email-paul.durrant@citrix.com>

This patch a xenstore flag, festure-gso-tcpv6, to adevrtise netback can
handle IPv6 TCP GSO packets and creates SKB_GSO_TCPV6 skbs if the frontend
passes an extra segment with the new type XEN_NETIF_GSO_TYPE_TCPV6 added to
netif.h.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liuw2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/netback.c |   11 ++++++++---
 drivers/net/xen-netback/xenbus.c  |    7 +++++++
 include/xen/interface/io/netif.h  |    3 ++-
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 35b1fa6..ac42f73 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1096,15 +1096,20 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
 		return -EINVAL;
 	}
 
-	/* Currently only TCPv4 S.O. is supported. */
-	if (gso->u.gso.type != XEN_NETIF_GSO_TYPE_TCPV4) {
+	switch (gso->u.gso.type) {
+	case XEN_NETIF_GSO_TYPE_TCPV4:
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+		break;
+	case XEN_NETIF_GSO_TYPE_TCPV6:
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+		break;
+	default:
 		netdev_err(vif->dev, "Bad GSO type %d.\n", gso->u.gso.type);
 		xenvif_fatal_tx_err(vif);
 		return -EINVAL;
 	}
 
 	skb_shinfo(skb)->gso_size = gso->u.gso.size;
-	skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
 
 	/* Header must be checked, and gso_segs computed. */
 	skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index dff423b..389fa72 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -101,6 +101,13 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv6",
+				    "%d", sg);
+		if (err) {
+			message = "writing feature-gso-tcpv6";
+			goto abort_transaction;
+		}
+
 		/* We support partial checksum setup for IPv6 packets */
 		err = xenbus_printf(xbt, dev->nodename,
 				    "feature-ipv6-csum-offload",
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index eb262e3..c19cc06 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -95,8 +95,9 @@ struct xen_netif_tx_request {
 #define _XEN_NETIF_EXTRA_FLAG_MORE	(0)
 #define  XEN_NETIF_EXTRA_FLAG_MORE	(1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
 
-/* GSO types - only TCPv4 currently supported. */
+/* GSO types */
 #define XEN_NETIF_GSO_TYPE_TCPV4	(1)
+#define XEN_NETIF_GSO_TYPE_TCPV6	(2)
 
 /*
  * This structure needs to fit within both netif_tx_request and
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM
From: Paul Durrant @ 2013-10-08 10:44 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229081-17814-1-git-send-email-paul.durrant@citrix.com>

There is no mechanism to insist that a guest always generates a packet
with good checksum (at least for IPv4) so we must handle checksum
offloading from the guest and hence should set NETIF_F_RXCSUM.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liuw2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/interface.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 8e92783..cb0d8ea 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -321,7 +321,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	dev->hw_features = NETIF_F_SG |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 		NETIF_F_TSO;
-	dev->features = dev->hw_features;
+	dev->features = dev->hw_features | NETIF_F_RXCSUM;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
 	dev->tx_queue_len = XENVIF_QUEUE_LENGTH;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 5/5] xen-netback: enable IPv6 TCP GSO to the guest
From: Paul Durrant @ 2013-10-08 10:44 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229081-17814-1-git-send-email-paul.durrant@citrix.com>

This patch adds code to handle SKB_GSO_TCPV6 skbs and construct appropriate
extra or prefix segments to pass the large packet to the frontend. New
xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are sampled
to determine if the frontend is capable of handling such packets.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liuw2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/common.h    |    6 +++--
 drivers/net/xen-netback/interface.c |    8 ++++--
 drivers/net/xen-netback/netback.c   |   47 +++++++++++++++++++++++++++--------
 drivers/net/xen-netback/xenbus.c    |   29 +++++++++++++++++++--
 4 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index b4a9a3c..720b1ca 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -87,6 +87,7 @@ struct pending_tx_info {
 struct xenvif_rx_meta {
 	int id;
 	int size;
+	int gso_type;
 	int gso_size;
 };
 
@@ -150,9 +151,10 @@ struct xenvif {
 	u8               fe_dev_addr[6];
 
 	/* Frontend feature information. */
+	int gso_mask;
+	int gso_prefix_mask;
+
 	u8 can_sg:1;
-	u8 gso:1;
-	u8 gso_prefix:1;
 	u8 ip_csum:1;
 	u8 ipv6_csum:1;
 
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index cb0d8ea..3d11387 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -214,8 +214,12 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
 
 	if (!vif->can_sg)
 		features &= ~NETIF_F_SG;
-	if (!vif->gso && !vif->gso_prefix)
+	if (~(vif->gso_mask | vif->gso_prefix_mask) &
+	    (1 << XEN_NETIF_GSO_TYPE_TCPV4))
 		features &= ~NETIF_F_TSO;
+	if (~(vif->gso_mask | vif->gso_prefix_mask) &
+	    (1 << XEN_NETIF_GSO_TYPE_TCPV6))
+		features &= ~NETIF_F_TSO6;
 	if (!vif->ip_csum)
 		features &= ~NETIF_F_IP_CSUM;
 	if (!vif->ipv6_csum)
@@ -320,7 +324,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	dev->netdev_ops	= &xenvif_netdev_ops;
 	dev->hw_features = NETIF_F_SG |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
-		NETIF_F_TSO;
+		NETIF_F_TSO | NETIF_F_TSO6;
 	dev->features = dev->hw_features | NETIF_F_RXCSUM;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index ac42f73..ee0d55c 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -140,7 +140,7 @@ static int max_required_rx_slots(struct xenvif *vif)
 	int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
 
 	/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
-	if (vif->can_sg || vif->gso || vif->gso_prefix)
+	if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
 		max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
 
 	return max;
@@ -334,6 +334,7 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 	struct gnttab_copy *copy_gop;
 	struct xenvif_rx_meta *meta;
 	unsigned long bytes;
+	int gso_type;
 
 	/* Data must not cross a page boundary. */
 	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
@@ -392,7 +393,14 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		}
 
 		/* Leave a gap for the GSO descriptor. */
-		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
+		if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+			gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+		else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+			gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+		else
+			gso_type = 0;
+
+		if (*head && ((1 << gso_type) & vif->gso_prefix_mask))
 			vif->rx.req_cons++;
 
 		*head = 0; /* There must be something in this buffer now. */
@@ -423,14 +431,28 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	unsigned char *data;
 	int head = 1;
 	int old_meta_prod;
+	int gso_type;
+	int gso_size;
 
 	old_meta_prod = npo->meta_prod;
 
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
+		gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+		gso_size = skb_shinfo(skb)->gso_size;
+	} else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
+		gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+		gso_size = skb_shinfo(skb)->gso_size;
+	} else {
+		gso_type = 0;
+		gso_size = 0;
+	}
+
 	/* Set up a GSO prefix descriptor, if necessary */
-	if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
+	if ((1 << gso_type) & vif->gso_prefix_mask) {
 		req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
 		meta = npo->meta + npo->meta_prod++;
-		meta->gso_size = skb_shinfo(skb)->gso_size;
+		meta->gso_type = gso_type;
+		meta->gso_size = gso_size;
 		meta->size = 0;
 		meta->id = req->id;
 	}
@@ -438,10 +460,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
 	meta = npo->meta + npo->meta_prod++;
 
-	if (!vif->gso_prefix)
-		meta->gso_size = skb_shinfo(skb)->gso_size;
-	else
+	if ((1 << gso_type) & vif->gso_mask) {
+		meta->gso_type = gso_type;
+		meta->gso_size = gso_size;
+	} else {
+		meta->gso_type = 0;
 		meta->gso_size = 0;
+	}
 
 	meta->size = 0;
 	meta->id = req->id;
@@ -587,7 +612,8 @@ void xenvif_rx_action(struct xenvif *vif)
 
 		vif = netdev_priv(skb->dev);
 
-		if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
+		if ((1 << vif->meta[npo.meta_cons].gso_type) &
+		    vif->gso_prefix_mask) {
 			resp = RING_GET_RESPONSE(&vif->rx,
 						 vif->rx.rsp_prod_pvt++);
 
@@ -624,7 +650,8 @@ void xenvif_rx_action(struct xenvif *vif)
 					vif->meta[npo.meta_cons].size,
 					flags);
 
-		if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix) {
+		if ((1 << vif->meta[npo.meta_cons].gso_type) &
+		    vif->gso_mask) {
 			struct xen_netif_extra_info *gso =
 				(struct xen_netif_extra_info *)
 				RING_GET_RESPONSE(&vif->rx,
@@ -632,8 +659,8 @@ void xenvif_rx_action(struct xenvif *vif)
 
 			resp->flags |= XEN_NETRXF_extra_info;
 
+			gso->u.gso.type = vif->meta[npo.meta_cons].gso_type;
 			gso->u.gso.size = vif->meta[npo.meta_cons].gso_size;
-			gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4;
 			gso->u.gso.pad = 0;
 			gso->u.gso.features = 0;
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 389fa72..4894256 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -573,15 +573,40 @@ static int connect_rings(struct backend_info *be)
 		val = 0;
 	vif->can_sg = !!val;
 
+	vif->gso_mask = 0;
+	vif->gso_prefix_mask = 0;
+
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->gso = !!val;
+	if (val)
+		vif->gso_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV4;
 
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-prefix",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->gso_prefix = !!val;
+	if (val)
+		vif->gso_prefix_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV4;
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6",
+			 "%d", &val) < 0)
+		val = 0;
+	if (val)
+		vif->gso_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV6;
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6-prefix",
+			 "%d", &val) < 0)
+		val = 0;
+	if (val)
+		vif->gso_prefix_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV6;
+
+	if (vif->gso_mask & vif->gso_prefix_mask) {
+		xenbus_dev_fatal(dev, err,
+				 "%s: gso and gso prefix flags are not "
+				 "mutually exclusive",
+				 dev->otherend);
+		return -EOPNOTSUPP;
+	}
 
 	/* Before feature-ipv6-csum-offload was introduced, checksum offload
 	 * was turned on by a zero value in (or lack of)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v2 0/5] xen-netback: IPv6 offload support
From: Paul Durrant @ 2013-10-08 10:58 UTC (permalink / raw)
  To: xen-devel, netdev

This patch series adds support for checksum and large packet offloads into
xen-netback.
Testing has mainly been done using the Microsoft network hardware
certification suite running in Server 2008R2 VMs with Citrix PV frontends.

v2:
- Fixed Wei's email address in Cc lines

^ permalink raw reply

* [PATCH net-next v2 1/5] xen-netback: add IPv6 checksum offload to guest
From: Paul Durrant @ 2013-10-08 10:58 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229896-18657-1-git-send-email-paul.durrant@citrix.com>

Check xenstore flag feature-ipv6-csum-offload to determine if a
guest is happy to accept IPv6 packets with only partial checksum.
Also check analogous feature-ip-csum-offload to determone if a
guest is happy to accept IPv4 packets with only partial checksum
as a replacement for a negated feature-no-csum-offload value and
add a comment to deprecate use of feature-no-csum-offload.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/common.h    |    3 ++-
 drivers/net/xen-netback/interface.c |   10 +++++++---
 drivers/net/xen-netback/xenbus.c    |   24 +++++++++++++++++++++++-
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 5715318..b4a9a3c 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -153,7 +153,8 @@ struct xenvif {
 	u8 can_sg:1;
 	u8 gso:1;
 	u8 gso_prefix:1;
-	u8 csum:1;
+	u8 ip_csum:1;
+	u8 ipv6_csum:1;
 
 	/* Internal feature information. */
 	u8 can_queue:1;	    /* can queue packets for receiver? */
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 01bb854..8e92783 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -216,8 +216,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
 		features &= ~NETIF_F_SG;
 	if (!vif->gso && !vif->gso_prefix)
 		features &= ~NETIF_F_TSO;
-	if (!vif->csum)
+	if (!vif->ip_csum)
 		features &= ~NETIF_F_IP_CSUM;
+	if (!vif->ipv6_csum)
+		features &= ~NETIF_F_IPV6_CSUM;
 
 	return features;
 }
@@ -306,7 +308,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	vif->domid  = domid;
 	vif->handle = handle;
 	vif->can_sg = 1;
-	vif->csum = 1;
+	vif->ip_csum = 1;
 	vif->dev = dev;
 
 	vif->credit_bytes = vif->remaining_credit = ~0UL;
@@ -316,7 +318,9 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	vif->credit_timeout.expires = jiffies;
 
 	dev->netdev_ops	= &xenvif_netdev_ops;
-	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
+	dev->hw_features = NETIF_F_SG |
+		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
+		NETIF_F_TSO;
 	dev->features = dev->hw_features;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index b45bce2..5e45b00 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -567,10 +567,32 @@ static int connect_rings(struct backend_info *be)
 		val = 0;
 	vif->gso_prefix = !!val;
 
+	/* Before feature-ipv6-csum-offload was introduced, checksum offload
+	 * was turned on by a zero value in (or lack of)
+	 * feature-no-csum-offload. Therefore, for compatibility, assume
+	 * that if feature-ip-csum-offload is missing that IP (v4) offload
+	 * is turned on. If this is not the case then the flag will be
+	 * cleared when we subsequently sample feature-no-csum-offload.
+	 */
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ip-csum-offload",
+			 "%d", &val) < 0)
+		val = 1;
+	vif->ip_csum = !!val;
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ipv6-csum-offload",
+			 "%d", &val) < 0)
+		val = 0;
+	vif->ipv6_csum = !!val;
+
+	/* This is deprecated. Frontends should set IP v4 and v6 checksum
+	 * offload using feature-ip-csum-offload and
+	 * feature-ipv6-csum-offload respectively.
+	 */
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-no-csum-offload",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->csum = !val;
+	if (val)
+		vif->ip_csum = 0;
 
 	/* Map the shared frame, irq etc. */
 	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref,
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v2 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest
From: Paul Durrant @ 2013-10-08 10:58 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229896-18657-1-git-send-email-paul.durrant@citrix.com>

This patch a xenstore flag, festure-gso-tcpv6, to adevrtise netback can
handle IPv6 TCP GSO packets and creates SKB_GSO_TCPV6 skbs if the frontend
passes an extra segment with the new type XEN_NETIF_GSO_TYPE_TCPV6 added to
netif.h.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/netback.c |   11 ++++++++---
 drivers/net/xen-netback/xenbus.c  |    7 +++++++
 include/xen/interface/io/netif.h  |    3 ++-
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 35b1fa6..ac42f73 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1096,15 +1096,20 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
 		return -EINVAL;
 	}
 
-	/* Currently only TCPv4 S.O. is supported. */
-	if (gso->u.gso.type != XEN_NETIF_GSO_TYPE_TCPV4) {
+	switch (gso->u.gso.type) {
+	case XEN_NETIF_GSO_TYPE_TCPV4:
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+		break;
+	case XEN_NETIF_GSO_TYPE_TCPV6:
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+		break;
+	default:
 		netdev_err(vif->dev, "Bad GSO type %d.\n", gso->u.gso.type);
 		xenvif_fatal_tx_err(vif);
 		return -EINVAL;
 	}
 
 	skb_shinfo(skb)->gso_size = gso->u.gso.size;
-	skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
 
 	/* Header must be checked, and gso_segs computed. */
 	skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index dff423b..389fa72 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -101,6 +101,13 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv6",
+				    "%d", sg);
+		if (err) {
+			message = "writing feature-gso-tcpv6";
+			goto abort_transaction;
+		}
+
 		/* We support partial checksum setup for IPv6 packets */
 		err = xenbus_printf(xbt, dev->nodename,
 				    "feature-ipv6-csum-offload",
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index eb262e3..c19cc06 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -95,8 +95,9 @@ struct xen_netif_tx_request {
 #define _XEN_NETIF_EXTRA_FLAG_MORE	(0)
 #define  XEN_NETIF_EXTRA_FLAG_MORE	(1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
 
-/* GSO types - only TCPv4 currently supported. */
+/* GSO types */
 #define XEN_NETIF_GSO_TYPE_TCPV4	(1)
+#define XEN_NETIF_GSO_TYPE_TCPV6	(2)
 
 /*
  * This structure needs to fit within both netif_tx_request and
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v2 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM
From: Paul Durrant @ 2013-10-08 10:58 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229896-18657-1-git-send-email-paul.durrant@citrix.com>

There is no mechanism to insist that a guest always generates a packet
with good checksum (at least for IPv4) so we must handle checksum
offloading from the guest and hence should set NETIF_F_RXCSUM.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/interface.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 8e92783..cb0d8ea 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -321,7 +321,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	dev->hw_features = NETIF_F_SG |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 		NETIF_F_TSO;
-	dev->features = dev->hw_features;
+	dev->features = dev->hw_features | NETIF_F_RXCSUM;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
 	dev->tx_queue_len = XENVIF_QUEUE_LENGTH;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v2 5/5] xen-netback: enable IPv6 TCP GSO to the guest
From: Paul Durrant @ 2013-10-08 10:58 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229896-18657-1-git-send-email-paul.durrant@citrix.com>

This patch adds code to handle SKB_GSO_TCPV6 skbs and construct appropriate
extra or prefix segments to pass the large packet to the frontend. New
xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are sampled
to determine if the frontend is capable of handling such packets.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/common.h    |    6 +++--
 drivers/net/xen-netback/interface.c |    8 ++++--
 drivers/net/xen-netback/netback.c   |   47 +++++++++++++++++++++++++++--------
 drivers/net/xen-netback/xenbus.c    |   29 +++++++++++++++++++--
 4 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index b4a9a3c..720b1ca 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -87,6 +87,7 @@ struct pending_tx_info {
 struct xenvif_rx_meta {
 	int id;
 	int size;
+	int gso_type;
 	int gso_size;
 };
 
@@ -150,9 +151,10 @@ struct xenvif {
 	u8               fe_dev_addr[6];
 
 	/* Frontend feature information. */
+	int gso_mask;
+	int gso_prefix_mask;
+
 	u8 can_sg:1;
-	u8 gso:1;
-	u8 gso_prefix:1;
 	u8 ip_csum:1;
 	u8 ipv6_csum:1;
 
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index cb0d8ea..3d11387 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -214,8 +214,12 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
 
 	if (!vif->can_sg)
 		features &= ~NETIF_F_SG;
-	if (!vif->gso && !vif->gso_prefix)
+	if (~(vif->gso_mask | vif->gso_prefix_mask) &
+	    (1 << XEN_NETIF_GSO_TYPE_TCPV4))
 		features &= ~NETIF_F_TSO;
+	if (~(vif->gso_mask | vif->gso_prefix_mask) &
+	    (1 << XEN_NETIF_GSO_TYPE_TCPV6))
+		features &= ~NETIF_F_TSO6;
 	if (!vif->ip_csum)
 		features &= ~NETIF_F_IP_CSUM;
 	if (!vif->ipv6_csum)
@@ -320,7 +324,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	dev->netdev_ops	= &xenvif_netdev_ops;
 	dev->hw_features = NETIF_F_SG |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
-		NETIF_F_TSO;
+		NETIF_F_TSO | NETIF_F_TSO6;
 	dev->features = dev->hw_features | NETIF_F_RXCSUM;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index ac42f73..ee0d55c 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -140,7 +140,7 @@ static int max_required_rx_slots(struct xenvif *vif)
 	int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
 
 	/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
-	if (vif->can_sg || vif->gso || vif->gso_prefix)
+	if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
 		max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
 
 	return max;
@@ -334,6 +334,7 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 	struct gnttab_copy *copy_gop;
 	struct xenvif_rx_meta *meta;
 	unsigned long bytes;
+	int gso_type;
 
 	/* Data must not cross a page boundary. */
 	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
@@ -392,7 +393,14 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		}
 
 		/* Leave a gap for the GSO descriptor. */
-		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
+		if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+			gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+		else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+			gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+		else
+			gso_type = 0;
+
+		if (*head && ((1 << gso_type) & vif->gso_prefix_mask))
 			vif->rx.req_cons++;
 
 		*head = 0; /* There must be something in this buffer now. */
@@ -423,14 +431,28 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	unsigned char *data;
 	int head = 1;
 	int old_meta_prod;
+	int gso_type;
+	int gso_size;
 
 	old_meta_prod = npo->meta_prod;
 
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
+		gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+		gso_size = skb_shinfo(skb)->gso_size;
+	} else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
+		gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+		gso_size = skb_shinfo(skb)->gso_size;
+	} else {
+		gso_type = 0;
+		gso_size = 0;
+	}
+
 	/* Set up a GSO prefix descriptor, if necessary */
-	if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
+	if ((1 << gso_type) & vif->gso_prefix_mask) {
 		req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
 		meta = npo->meta + npo->meta_prod++;
-		meta->gso_size = skb_shinfo(skb)->gso_size;
+		meta->gso_type = gso_type;
+		meta->gso_size = gso_size;
 		meta->size = 0;
 		meta->id = req->id;
 	}
@@ -438,10 +460,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
 	meta = npo->meta + npo->meta_prod++;
 
-	if (!vif->gso_prefix)
-		meta->gso_size = skb_shinfo(skb)->gso_size;
-	else
+	if ((1 << gso_type) & vif->gso_mask) {
+		meta->gso_type = gso_type;
+		meta->gso_size = gso_size;
+	} else {
+		meta->gso_type = 0;
 		meta->gso_size = 0;
+	}
 
 	meta->size = 0;
 	meta->id = req->id;
@@ -587,7 +612,8 @@ void xenvif_rx_action(struct xenvif *vif)
 
 		vif = netdev_priv(skb->dev);
 
-		if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
+		if ((1 << vif->meta[npo.meta_cons].gso_type) &
+		    vif->gso_prefix_mask) {
 			resp = RING_GET_RESPONSE(&vif->rx,
 						 vif->rx.rsp_prod_pvt++);
 
@@ -624,7 +650,8 @@ void xenvif_rx_action(struct xenvif *vif)
 					vif->meta[npo.meta_cons].size,
 					flags);
 
-		if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix) {
+		if ((1 << vif->meta[npo.meta_cons].gso_type) &
+		    vif->gso_mask) {
 			struct xen_netif_extra_info *gso =
 				(struct xen_netif_extra_info *)
 				RING_GET_RESPONSE(&vif->rx,
@@ -632,8 +659,8 @@ void xenvif_rx_action(struct xenvif *vif)
 
 			resp->flags |= XEN_NETRXF_extra_info;
 
+			gso->u.gso.type = vif->meta[npo.meta_cons].gso_type;
 			gso->u.gso.size = vif->meta[npo.meta_cons].gso_size;
-			gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4;
 			gso->u.gso.pad = 0;
 			gso->u.gso.features = 0;
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 389fa72..4894256 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -573,15 +573,40 @@ static int connect_rings(struct backend_info *be)
 		val = 0;
 	vif->can_sg = !!val;
 
+	vif->gso_mask = 0;
+	vif->gso_prefix_mask = 0;
+
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->gso = !!val;
+	if (val)
+		vif->gso_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV4;
 
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-prefix",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->gso_prefix = !!val;
+	if (val)
+		vif->gso_prefix_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV4;
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6",
+			 "%d", &val) < 0)
+		val = 0;
+	if (val)
+		vif->gso_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV6;
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6-prefix",
+			 "%d", &val) < 0)
+		val = 0;
+	if (val)
+		vif->gso_prefix_mask = 1 << XEN_NETIF_GSO_TYPE_TCPV6;
+
+	if (vif->gso_mask & vif->gso_prefix_mask) {
+		xenbus_dev_fatal(dev, err,
+				 "%s: gso and gso prefix flags are not "
+				 "mutually exclusive",
+				 dev->otherend);
+		return -EOPNOTSUPP;
+	}
 
 	/* Before feature-ipv6-csum-offload was introduced, checksum offload
 	 * was turned on by a zero value in (or lack of)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v2 2/5] xen-netback: Add support for IPv6 checksum offload from guest
From: Paul Durrant @ 2013-10-08 10:58 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381229896-18657-1-git-send-email-paul.durrant@citrix.com>

For performance of VM to VM traffic on a single host it is better to avoid
calculation of TCP/UDP checksum in the sending frontend. To allow this this
patch adds the code necessary to set up partial checksum for IPv6 packets
and xenstore flag feature-ipv6-csum-offload to advertise that fact to
frontends.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/netback.c |  224 ++++++++++++++++++++++++++++++-------
 drivers/net/xen-netback/xenbus.c  |    9 ++
 2 files changed, 191 insertions(+), 42 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f3e591c..35b1fa6 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -109,15 +109,10 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
 	return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
 }
 
-/*
- * This is the amount of packet we copy rather than map, so that the
- * guest can't fiddle with the contents of the headers while we do
- * packet processing on them (netfilter, routing, etc).
+/* This is a miniumum size for the linear area to avoid lots of
+ * calls to __pskb_pull_tail() as we set up checksum offsets.
  */
-#define PKT_PROT_LEN    (ETH_HLEN + \
-			 VLAN_HLEN + \
-			 sizeof(struct iphdr) + MAX_IPOPTLEN + \
-			 sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
+#define PKT_PROT_LEN 128
 
 static u16 frag_get_pending_idx(skb_frag_t *frag)
 {
@@ -1118,61 +1113,72 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
 	return 0;
 }
 
-static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+static inline void maybe_pull_tail(struct sk_buff *skb, unsigned int len)
 {
-	struct iphdr *iph;
+	if (skb_is_nonlinear(skb) && skb_headlen(skb) < len) {
+		int target = min_t(int, skb->len, len);
+		__pskb_pull_tail(skb, target - skb_headlen(skb));
+	}
+}
+
+static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
+			     int recalculate_partial_csum)
+{
+	struct iphdr *iph = (void *)skb->data;
+	unsigned int header_size;
+	unsigned int off;
 	int err = -EPROTO;
-	int recalculate_partial_csum = 0;
 
-	/*
-	 * A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
-	 * peers can fail to set NETRXF_csum_blank when sending a GSO
-	 * frame. In this case force the SKB to CHECKSUM_PARTIAL and
-	 * recalculate the partial checksum.
-	 */
-	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
-		vif->rx_gso_checksum_fixup++;
-		skb->ip_summed = CHECKSUM_PARTIAL;
-		recalculate_partial_csum = 1;
-	}
+	off = sizeof(struct iphdr);
 
-	/* A non-CHECKSUM_PARTIAL SKB does not require setup. */
-	if (skb->ip_summed != CHECKSUM_PARTIAL)
-		return 0;
+	header_size = skb->network_header + off + MAX_IPOPTLEN;
+	maybe_pull_tail(skb, header_size);
 
-	if (skb->protocol != htons(ETH_P_IP))
-		goto out;
+	off = iph->ihl * 4;
 
-	iph = (void *)skb->data;
 	switch (iph->protocol) {
 	case IPPROTO_TCP:
-		if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+		if (!skb_partial_csum_set(skb, off,
 					  offsetof(struct tcphdr, check)))
 			goto out;
 
 		if (recalculate_partial_csum) {
 			struct tcphdr *tcph = tcp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct tcphdr) +
+				MAX_TCP_OPTION_SPACE;
+			maybe_pull_tail(skb, header_size);
+
 			tcph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
-							 skb->len - iph->ihl*4,
+							 skb->len - off,
 							 IPPROTO_TCP, 0);
 		}
 		break;
 	case IPPROTO_UDP:
-		if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+		if (!skb_partial_csum_set(skb, off,
 					  offsetof(struct udphdr, check)))
 			goto out;
 
 		if (recalculate_partial_csum) {
 			struct udphdr *udph = udp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct udphdr);
+			maybe_pull_tail(skb, header_size);
+
 			udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
-							 skb->len - iph->ihl*4,
+							 skb->len - off,
 							 IPPROTO_UDP, 0);
 		}
 		break;
 	default:
 		if (net_ratelimit())
 			netdev_err(vif->dev,
-				   "Attempting to checksum a non-TCP/UDP packet, dropping a protocol %d packet\n",
+				   "Attempting to checksum a non-TCP/UDP packet, "
+				   "dropping a protocol %d packet\n",
 				   iph->protocol);
 		goto out;
 	}
@@ -1183,6 +1189,148 @@ out:
 	return err;
 }
 
+static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
+			       int recalculate_partial_csum)
+{
+	int err = -EPROTO;
+	struct ipv6hdr *ipv6h = (void *)skb->data;
+	u8 nexthdr;
+	unsigned int header_size;
+	unsigned int off;
+	bool done;
+
+	done = false;
+	off = sizeof(struct ipv6hdr);
+
+	header_size = skb->network_header + off;
+	maybe_pull_tail(skb, header_size);
+
+	nexthdr = ipv6h->nexthdr;
+
+	while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len)) &&
+	       !done) {
+		/* We only access at most the first 2 bytes of any option header
+		 * to determine the next one.
+		 */
+		header_size = skb->network_header + off + 2;
+		maybe_pull_tail(skb, header_size);
+
+		switch (nexthdr) {
+		case IPPROTO_FRAGMENT: {
+			struct frag_hdr *hp = (void *)(skb->data + off);
+
+			nexthdr = hp->nexthdr;
+			off += 8;
+			break;
+		}
+		case IPPROTO_DSTOPTS:
+		case IPPROTO_HOPOPTS:
+		case IPPROTO_ROUTING: {
+			struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
+
+			nexthdr = hp->nexthdr;
+			off += ipv6_optlen(hp);
+			break;
+		}
+		case IPPROTO_AH: {
+			struct ip_auth_hdr *hp = (void *)(skb->data + off);
+
+			nexthdr = hp->nexthdr;
+			off += (hp->hdrlen+2)<<2;
+			break;
+		}
+		default:
+			done = true;
+			break;
+		}
+	}
+
+	if (!done)
+		goto out;
+
+	switch (nexthdr) {
+	case IPPROTO_TCP:
+		if (!skb_partial_csum_set(skb, off,
+					  offsetof(struct tcphdr, check)))
+			goto out;
+
+		if (recalculate_partial_csum) {
+			struct tcphdr *tcph = tcp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct tcphdr) +
+				MAX_TCP_OPTION_SPACE;
+			maybe_pull_tail(skb, header_size);
+
+			tcph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+						       &ipv6h->daddr,
+						       skb->len - off,
+						       IPPROTO_TCP, 0);
+		}
+		break;
+	case IPPROTO_UDP:
+		if (!skb_partial_csum_set(skb, off,
+					  offsetof(struct udphdr, check)))
+			goto out;
+
+		if (recalculate_partial_csum) {
+			struct udphdr *udph = udp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct udphdr);
+			maybe_pull_tail(skb, header_size);
+
+			udph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+						       &ipv6h->daddr,
+						       skb->len - off,
+						       IPPROTO_UDP, 0);
+		}
+		break;
+	default:
+		if (net_ratelimit())
+			netdev_err(vif->dev,
+				   "Attempting to checksum a non-TCP/UDP packet, "
+				   "dropping a protocol %d packet\n",
+				   nexthdr);
+		goto out;
+	}
+
+	err = 0;
+
+out:
+	return err;
+}
+
+static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+{
+	int err = -EPROTO;
+	int recalculate_partial_csum = 0;
+
+	/* A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
+	 * peers can fail to set NETRXF_csum_blank when sending a GSO
+	 * frame. In this case force the SKB to CHECKSUM_PARTIAL and
+	 * recalculate the partial checksum.
+	 */
+	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
+		vif->rx_gso_checksum_fixup++;
+		skb->ip_summed = CHECKSUM_PARTIAL;
+		recalculate_partial_csum = 1;
+	}
+
+	/* A non-CHECKSUM_PARTIAL SKB does not require setup. */
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return 0;
+
+	if (skb->protocol == htons(ETH_P_IP))
+		err = checksum_setup_ip(vif, skb, recalculate_partial_csum);
+	else if (skb->protocol == htons(ETH_P_IPV6))
+		err = checksum_setup_ipv6(vif, skb, recalculate_partial_csum);
+
+	return err;
+}
+
 static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
 {
 	unsigned long now = jiffies;
@@ -1428,15 +1576,7 @@ static int xenvif_tx_submit(struct xenvif *vif, int budget)
 
 		xenvif_fill_frags(vif, skb);
 
-		/*
-		 * If the initial fragment was < PKT_PROT_LEN then
-		 * pull through some bytes from the other fragments to
-		 * increase the linear region to PKT_PROT_LEN bytes.
-		 */
-		if (skb_headlen(skb) < PKT_PROT_LEN && skb_is_nonlinear(skb)) {
-			int target = min_t(int, skb->len, PKT_PROT_LEN);
-			__pskb_pull_tail(skb, target - skb_headlen(skb));
-		}
+		maybe_pull_tail(skb, PKT_PROT_LEN);
 
 		skb->dev      = vif->dev;
 		skb->protocol = eth_type_trans(skb, skb->dev);
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 5e45b00..dff423b 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -101,6 +101,15 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		/* We support partial checksum setup for IPv6 packets */
+		err = xenbus_printf(xbt, dev->nodename,
+				    "feature-ipv6-csum-offload",
+				    "%d", 1);
+		if (err) {
+			message = "writing feature-ipv6-csum-offload";
+			goto abort_transaction;
+		}
+
 		/* We support rx-copy path. */
 		err = xenbus_printf(xbt, dev->nodename,
 				    "feature-rx-copy", "%d", 1);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH] mac80211: port CCMP to cryptoapi's CCM driver
From: Ard Biesheuvel @ 2013-10-08 11:31 UTC (permalink / raw)
  To: linux-wireless, netdev; +Cc: patches, johannes, linville, Ard Biesheuvel

Use the generic CCM aead chaining mode driver rather than a local
implementation that sits right on top of the core AES cipher.

This allows the use of accelerated implementations of either
CCM as a whole or the CTR mode which it encapsulates.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 net/mac80211/Kconfig   |   1 +
 net/mac80211/aes_ccm.c | 165 +++++++++++++++++--------------------------------
 net/mac80211/aes_ccm.h |   8 +--
 net/mac80211/key.h     |   2 +-
 net/mac80211/wpa.c     |  24 +++----
 5 files changed, 71 insertions(+), 129 deletions(-)

diff --git a/net/mac80211/Kconfig b/net/mac80211/Kconfig
index 62535fe..dc31ec3 100644
--- a/net/mac80211/Kconfig
+++ b/net/mac80211/Kconfig
@@ -4,6 +4,7 @@ config MAC80211
 	select CRYPTO
 	select CRYPTO_ARC4
 	select CRYPTO_AES
+	select CRYPTO_CCM
 	select CRC32
 	select AVERAGE
 	---help---
diff --git a/net/mac80211/aes_ccm.c b/net/mac80211/aes_ccm.c
index be7614b9..ef808d7 100644
--- a/net/mac80211/aes_ccm.c
+++ b/net/mac80211/aes_ccm.c
@@ -2,6 +2,8 @@
  * Copyright 2003-2004, Instant802 Networks, Inc.
  * Copyright 2005-2006, Devicescape Software, Inc.
  *
+ * Rewrite: Copyright (C) 2013 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
@@ -17,134 +19,77 @@
 #include "key.h"
 #include "aes_ccm.h"
 
-static void aes_ccm_prepare(struct crypto_cipher *tfm, u8 *scratch, u8 *a)
+void ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
+			       u8 *data, size_t data_len, u8 *cdata, u8 *mic)
 {
-	int i;
-	u8 *b_0, *aad, *b, *s_0;
+	struct scatterlist assoc, pt, ct[2];
 
-	b_0 = scratch + 3 * AES_BLOCK_SIZE;
-	aad = scratch + 4 * AES_BLOCK_SIZE;
-	b = scratch;
-	s_0 = scratch + AES_BLOCK_SIZE;
+	/* allocate the variable sized aead_request on the stack */
+	int l = DIV_ROUND_UP(crypto_aead_reqsize(tfm),
+			     sizeof(struct aead_request));
+	struct aead_request req[1 + l];
 
-	crypto_cipher_encrypt_one(tfm, b, b_0);
+	memset(req, 0, sizeof(*req) + crypto_aead_reqsize(tfm));
 
-	/* Extra Authenticate-only data (always two AES blocks) */
-	for (i = 0; i < AES_BLOCK_SIZE; i++)
-		aad[i] ^= b[i];
-	crypto_cipher_encrypt_one(tfm, b, aad);
+	sg_init_one(&pt, data, data_len);
+	sg_init_one(&assoc, &aad[2], be16_to_cpup((__be16 *)aad));
+	sg_init_table(ct, 2);
+	sg_set_buf(&ct[0], cdata, data_len);
+	sg_set_buf(&ct[1], mic, IEEE80211_CCMP_MIC_LEN);
 
-	aad += AES_BLOCK_SIZE;
+	aead_request_set_tfm(req, tfm);
+	aead_request_set_crypt(req, &pt, ct, data_len, b_0);
+	aead_request_set_assoc(req, &assoc, assoc.length);
 
-	for (i = 0; i < AES_BLOCK_SIZE; i++)
-		aad[i] ^= b[i];
-	crypto_cipher_encrypt_one(tfm, a, aad);
+	crypto_aead_encrypt(req);
+}
 
-	/* Mask out bits from auth-only-b_0 */
-	b_0[0] &= 0x07;
+int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
+			      u8 *cdata, size_t data_len, u8 *mic, u8 *data)
+{
+	struct scatterlist assoc, pt, ct[2];
 
-	/* S_0 is used to encrypt T (= MIC) */
-	b_0[14] = 0;
-	b_0[15] = 0;
-	crypto_cipher_encrypt_one(tfm, s_0, b_0);
-}
+	/* allocate the variable sized aead_request on the stack */
+	int l = DIV_ROUND_UP(crypto_aead_reqsize(tfm),
+			     sizeof(struct aead_request));
+	struct aead_request req[1 + l];
 
+	memset(req, 0, sizeof(*req) + crypto_aead_reqsize(tfm));
 
-void ieee80211_aes_ccm_encrypt(struct crypto_cipher *tfm, u8 *scratch,
-			       u8 *data, size_t data_len,
-			       u8 *cdata, u8 *mic)
-{
-	int i, j, last_len, num_blocks;
-	u8 *pos, *cpos, *b, *s_0, *e, *b_0;
-
-	b = scratch;
-	s_0 = scratch + AES_BLOCK_SIZE;
-	e = scratch + 2 * AES_BLOCK_SIZE;
-	b_0 = scratch + 3 * AES_BLOCK_SIZE;
-
-	num_blocks = DIV_ROUND_UP(data_len, AES_BLOCK_SIZE);
-	last_len = data_len % AES_BLOCK_SIZE;
-	aes_ccm_prepare(tfm, scratch, b);
-
-	/* Process payload blocks */
-	pos = data;
-	cpos = cdata;
-	for (j = 1; j <= num_blocks; j++) {
-		int blen = (j == num_blocks && last_len) ?
-			last_len : AES_BLOCK_SIZE;
-
-		/* Authentication followed by encryption */
-		for (i = 0; i < blen; i++)
-			b[i] ^= pos[i];
-		crypto_cipher_encrypt_one(tfm, b, b);
-
-		b_0[14] = (j >> 8) & 0xff;
-		b_0[15] = j & 0xff;
-		crypto_cipher_encrypt_one(tfm, e, b_0);
-		for (i = 0; i < blen; i++)
-			*cpos++ = *pos++ ^ e[i];
-	}
-
-	for (i = 0; i < IEEE80211_CCMP_MIC_LEN; i++)
-		mic[i] = b[i] ^ s_0[i];
-}
+	sg_init_one(&pt, data, data_len);
+	sg_init_one(&assoc, &aad[2], be16_to_cpup((__be16 *)aad));
+	sg_init_table(ct, 2);
+	sg_set_buf(&ct[0], cdata, data_len);
+	sg_set_buf(&ct[1], mic, IEEE80211_CCMP_MIC_LEN);
 
+	aead_request_set_tfm(req, tfm);
+	aead_request_set_crypt(req, ct, &pt, data_len + IEEE80211_CCMP_MIC_LEN,
+			       b_0);
+	aead_request_set_assoc(req, &assoc, assoc.length);
 
-int ieee80211_aes_ccm_decrypt(struct crypto_cipher *tfm, u8 *scratch,
-			      u8 *cdata, size_t data_len, u8 *mic, u8 *data)
-{
-	int i, j, last_len, num_blocks;
-	u8 *pos, *cpos, *b, *s_0, *a, *b_0;
-
-	b = scratch;
-	s_0 = scratch + AES_BLOCK_SIZE;
-	a = scratch + 2 * AES_BLOCK_SIZE;
-	b_0 = scratch + 3 * AES_BLOCK_SIZE;
-
-	num_blocks = DIV_ROUND_UP(data_len, AES_BLOCK_SIZE);
-	last_len = data_len % AES_BLOCK_SIZE;
-	aes_ccm_prepare(tfm, scratch, a);
-
-	/* Process payload blocks */
-	cpos = cdata;
-	pos = data;
-	for (j = 1; j <= num_blocks; j++) {
-		int blen = (j == num_blocks && last_len) ?
-			last_len : AES_BLOCK_SIZE;
-
-		/* Decryption followed by authentication */
-		b_0[14] = (j >> 8) & 0xff;
-		b_0[15] = j & 0xff;
-		crypto_cipher_encrypt_one(tfm, b, b_0);
-		for (i = 0; i < blen; i++) {
-			*pos = *cpos++ ^ b[i];
-			a[i] ^= *pos++;
-		}
-		crypto_cipher_encrypt_one(tfm, a, a);
-	}
-
-	for (i = 0; i < IEEE80211_CCMP_MIC_LEN; i++) {
-		if ((mic[i] ^ s_0[i]) != a[i])
-			return -1;
-	}
-
-	return 0;
+	return crypto_aead_decrypt(req);
 }
 
-
-struct crypto_cipher *ieee80211_aes_key_setup_encrypt(const u8 key[])
+struct crypto_aead *ieee80211_aes_key_setup_encrypt(const u8 key[])
 {
-	struct crypto_cipher *tfm;
+	struct crypto_aead *tfm;
+	int err;
 
-	tfm = crypto_alloc_cipher("aes", 0, CRYPTO_ALG_ASYNC);
-	if (!IS_ERR(tfm))
-		crypto_cipher_setkey(tfm, key, WLAN_KEY_LEN_CCMP);
+	tfm = crypto_alloc_aead("ccm(aes)", 0, CRYPTO_ALG_ASYNC);
+	if (IS_ERR(tfm))
+		return tfm;
 
-	return tfm;
-}
+	err = crypto_aead_setkey(tfm, key, WLAN_KEY_LEN_CCMP);
+	if (!err)
+		err = crypto_aead_setauthsize(tfm, IEEE80211_CCMP_MIC_LEN);
+	if (!err)
+		return tfm;
 
+	crypto_free_aead(tfm);
+	return ERR_PTR(err);
+}
 
-void ieee80211_aes_key_free(struct crypto_cipher *tfm)
+void ieee80211_aes_key_free(struct crypto_aead *tfm)
 {
-	crypto_free_cipher(tfm);
+	crypto_free_aead(tfm);
 }
diff --git a/net/mac80211/aes_ccm.h b/net/mac80211/aes_ccm.h
index 5b7d744..52650b3 100644
--- a/net/mac80211/aes_ccm.h
+++ b/net/mac80211/aes_ccm.h
@@ -12,13 +12,13 @@
 
 #include <linux/crypto.h>
 
-struct crypto_cipher *ieee80211_aes_key_setup_encrypt(const u8 key[]);
-void ieee80211_aes_ccm_encrypt(struct crypto_cipher *tfm, u8 *scratch,
+struct crypto_aead *ieee80211_aes_key_setup_encrypt(const u8 key[]);
+void ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
 			       u8 *data, size_t data_len,
 			       u8 *cdata, u8 *mic);
-int ieee80211_aes_ccm_decrypt(struct crypto_cipher *tfm, u8 *scratch,
+int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
 			      u8 *cdata, size_t data_len,
 			      u8 *mic, u8 *data);
-void ieee80211_aes_key_free(struct crypto_cipher *tfm);
+void ieee80211_aes_key_free(struct crypto_aead *tfm);
 
 #endif /* AES_CCM_H */
diff --git a/net/mac80211/key.h b/net/mac80211/key.h
index 036d57e..aaae0ed 100644
--- a/net/mac80211/key.h
+++ b/net/mac80211/key.h
@@ -83,7 +83,7 @@ struct ieee80211_key {
 			 * Management frames.
 			 */
 			u8 rx_pn[IEEE80211_NUM_TIDS + 1][IEEE80211_CCMP_PN_LEN];
-			struct crypto_cipher *tfm;
+			struct crypto_aead *tfm;
 			u32 replays; /* dot11RSNAStatsCCMPReplays */
 		} ccmp;
 		struct {
diff --git a/net/mac80211/wpa.c b/net/mac80211/wpa.c
index c9edfcb..c39ee61 100644
--- a/net/mac80211/wpa.c
+++ b/net/mac80211/wpa.c
@@ -301,22 +301,16 @@ ieee80211_crypto_tkip_decrypt(struct ieee80211_rx_data *rx)
 }
 
 
-static void ccmp_special_blocks(struct sk_buff *skb, u8 *pn, u8 *scratch,
+static void ccmp_special_blocks(struct sk_buff *skb, u8 *pn, u8 *b_0, u8 *aad,
 				int encrypted)
 {
 	__le16 mask_fc;
 	int a4_included, mgmt;
 	u8 qos_tid;
-	u8 *b_0, *aad;
 	u16 data_len, len_a;
 	unsigned int hdrlen;
 	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;
 
-	memset(scratch, 0, 6 * AES_BLOCK_SIZE);
-
-	b_0 = scratch + 3 * AES_BLOCK_SIZE;
-	aad = scratch + 4 * AES_BLOCK_SIZE;
-
 	/*
 	 * Mask FC: zero subtype b4 b5 b6 (if not mgmt)
 	 * Retry, PwrMgt, MoreData; set Protected
@@ -343,7 +337,7 @@ static void ccmp_special_blocks(struct sk_buff *skb, u8 *pn, u8 *scratch,
 		data_len -= IEEE80211_CCMP_MIC_LEN;
 
 	/* First block, b_0 */
-	b_0[0] = 0x59; /* flags: Adata: 1, M: 011, L: 001 */
+	b_0[0] = 0x1; /* set L := 1, M and Adata flags are implied */
 	/* Nonce: Nonce Flags | A2 | PN
 	 * Nonce Flags: Priority (b0..b3) | Management (b4) | Reserved (b5..b7)
 	 */
@@ -407,7 +401,8 @@ static int ccmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb)
 	u8 *pos;
 	u8 pn[6];
 	u64 pn64;
-	u8 scratch[6 * AES_BLOCK_SIZE];
+	u8 aad[2 * AES_BLOCK_SIZE];
+	u8 b_0[AES_BLOCK_SIZE];
 
 	if (info->control.hw_key &&
 	    !(info->control.hw_key->flags & IEEE80211_KEY_FLAG_GENERATE_IV) &&
@@ -460,8 +455,8 @@ static int ccmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb)
 		return 0;
 
 	pos += IEEE80211_CCMP_HDR_LEN;
-	ccmp_special_blocks(skb, pn, scratch, 0);
-	ieee80211_aes_ccm_encrypt(key->u.ccmp.tfm, scratch, pos, len,
+	ccmp_special_blocks(skb, pn, b_0, aad, 0);
+	ieee80211_aes_ccm_encrypt(key->u.ccmp.tfm, b_0, aad, pos, len,
 				  pos, skb_put(skb, IEEE80211_CCMP_MIC_LEN));
 
 	return 0;
@@ -525,12 +520,13 @@ ieee80211_crypto_ccmp_decrypt(struct ieee80211_rx_data *rx)
 	}
 
 	if (!(status->flag & RX_FLAG_DECRYPTED)) {
-		u8 scratch[6 * AES_BLOCK_SIZE];
+		u8 aad[2 * AES_BLOCK_SIZE];
+		u8 b_0[AES_BLOCK_SIZE];
 		/* hardware didn't decrypt/verify MIC */
-		ccmp_special_blocks(skb, pn, scratch, 1);
+		ccmp_special_blocks(skb, pn, b_0, aad, 1);
 
 		if (ieee80211_aes_ccm_decrypt(
-			    key->u.ccmp.tfm, scratch,
+			    key->u.ccmp.tfm, b_0, aad,
 			    skb->data + hdrlen + IEEE80211_CCMP_HDR_LEN,
 			    data_len,
 			    skb->data + skb->len - IEEE80211_CCMP_MIC_LEN,
-- 
1.8.1.2

^ permalink raw reply related

* You've won a Prize
From: Microsoft Iberica SL @ 2013-10-08 12:47 UTC (permalink / raw)


You've won a Prize
MICROSOFT IBERICA SL"
YOU 'VE WON.
ATTN:MICROSOFT IBERICA SL
Your email has won (EUR244,000,00)
(TWO HUNDRED AND FOURTY FOUR THOUSAND EURO)
Batch number:XL73276498AM
Ref number:QR352899526KC
This is a millennium scientific computer game in which
email addresses were used.It is a promotional program aimed at
encouraging internet users,therefore you do not need to buy ticket to enter
for it.
For further development,clarification and procedure please
Contact:Dr Eduardo Sanchez,
Email contact:payingroll446@yahoo.com.hk

^ permalink raw reply

* Re: [PATCH] mac80211: port CCMP to cryptoapi's CCM driver
From: Johannes Berg @ 2013-10-08 11:52 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux-wireless, netdev, patches, linville
In-Reply-To: <1381231915-24232-1-git-send-email-ard.biesheuvel@linaro.org>

On Tue, 2013-10-08 at 13:31 +0200, Ard Biesheuvel wrote:

Hmm, thanks I guess. I'll need to review this in more detail, but I have
a question first:

> +	/* allocate the variable sized aead_request on the stack */
> +	int l = DIV_ROUND_UP(crypto_aead_reqsize(tfm),
> +			     sizeof(struct aead_request));
> +	struct aead_request req[1 + l];

This looks a bit odd, why round up first and then add one? Why even
bother using a struct array rather than some local struct like

struct {
  struct aead_request req;
  u8 data[crypto_aed_reqsize(tfm)];
} req_data;

or so?

johannes

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox