Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] net: sfc: use new api ethtool_{get|set}_link_ksettings
From: Philippe Reynes @ 2016-12-15 21:50 UTC (permalink / raw)
  To: Jarod Wilson; +Cc: linux-net-drivers, ecree, bkenward, netdev, linux-kernel
In-Reply-To: <741a20e3-5b7a-bfe8-a7af-81bacd736e16@redhat.com>

Hi Jarod,

On 12/15/16, Jarod Wilson <jarod@redhat.com> wrote:
> On 2016-12-14 6:12 PM, Philippe Reynes wrote:
>> The ethtool api {get|set}_settings is deprecated.
>> We move this driver to new api {get|set}_link_ksettings.
>>
>> Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
>> ---
>>  drivers/net/ethernet/sfc/ethtool.c    |   35 ++++++++++++-------
>>  drivers/net/ethernet/sfc/mcdi_port.c  |   60
>> ++++++++++++++++++++------------
>>  drivers/net/ethernet/sfc/net_driver.h |   12 +++---
>>  3 files changed, 65 insertions(+), 42 deletions(-)
>
> What about drivers/net/ethernet/sfc/falcon/ethtool.c? Coming in a
> separate patch?

Yes, I send only one driver per patch.
I'll send another patch for this driver.

Philippe

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa @ 2016-12-15 21:45 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: David Laight, Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <CAHmME9p9cf1W3vhbu=YTRY1Xt=fmE1sVqY1XPt5iQwxfCfQUOA@mail.gmail.com>

On 15.12.2016 22:25, Jason A. Donenfeld wrote:
> On Thu, Dec 15, 2016 at 10:17 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> And I was exactly questioning this.
>>
>> static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
>>                                     const struct in6_addr *daddr)
>> {
>>         net_get_random_once(&ip6_frags.rnd, sizeof(ip6_frags.rnd));
>>         return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
>>                             (__force u32)id, ip6_frags.rnd);
>> }
> 
> For this example, the replacement is the function entitled siphash_4u32:
> 
>  static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
>                                      const struct in6_addr *daddr)
>  {
>          net_get_random_once(&ip6_frags.rnd, sizeof(ip6_frags.rnd));
>          return siphash_4u32(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
>                  (__force u32)id, 0, ip6_frags.rnd);
>  }
> 
> And then you make ip6_frags.rnd be of type siphash_key_t. Then
> everything is taken care of and works beautifully. Please see v5 of
> this patchset.

Sorry to not be specific enough, the Hash-DoS is in ipv6_addr_hash.
Maybe it was a silly example to start with, sorry. But anyway, your
proposal wouldn't have prevented the hash DoS. I wanted to show how it
can be difficult to make sure that all pointers come from an appropriate
aligned memory region.

The idea would be to actually factor out the key in the data structure
and align it with __aligned(SIPHASH_ALIGNMENT), make sure the padding
bits are all equal zero to not cause any bugs and irregularities with
the corresponding equality function. This might need some serious review
when switching to siphash to actually make use of it and prevent
HashDoS. Or simply use the unaligned version always...

>> I would be interested if the compiler can actually constant-fold the
>> address of the stack allocation with an simple if () or some
>> __builtin_constant_p fiddeling, so we don't have this constant review
>> overhead to which function we pass which data. This would also make
>> this whole discussion moot.
> 
> I'll play with it to see if the compiler is capable of doing that.
> Does anybody know off hand if it is or if there are other examples of
> the compiler doing that?

Not of the top of my head, but it should be easy to test.

> In any case, for all current replacement of jhash_1word, jhash_2words,
> jhash_3words, there's the siphash_2u32 or siphash_4u32 functions. This
> covers the majority of cases.

Agreed and this is also totally fine by me.

> For replacements of md5_transform, either the data is small and can
> fit in siphash_Nu{32,64}, or it can be put into a struct explicitly
> aligned on the stack.

> For the remaining use of jhash_nwords, either siphash() can be used or
> siphash_unaligned() can be used if the source is of unknown alignment.
> Both functions have their alignment requirements (or lack thereof)
> documented in a docbook comment.

I think the warning needs to be bigger, seriously. Most of the people
develop on 64 bit arch, where it will just work during testing and break
later on 32 bit. ;)

> I'll look into the constant folding to see if it actually works. If it
> does, I'll use it. If not, I believe the current solution works.
> 
> How's that sound?

I am still very much concerned about the API.

By the way, if you target net-next, it is currently closed. So no need
to hurry.

Bye,
Hannes

^ permalink raw reply

* [PATCH 2/2] gtp: Fix initialization of Flags octet in GTPv1 header
From: Pablo Neira Ayuso @ 2016-12-15 21:35 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, laforge
In-Reply-To: <1481837753-10317-1-git-send-email-pablo@netfilter.org>

From: Harald Welte <laforge@gnumonks.org>

When generating a GTPv1 header in gtp1_push_header(), initialize the
'reserved' bit to zero.  All 3GPP specifications for GTPv1 from Release
99 through Release 13 agree that a transmitter shall set this bit to
zero, see e.g. Note 0 of Figure 2 in Section 6 of 3GPP TS 29.060 v13.5.0
Release 13, available from
http://www.etsi.org/deliver/etsi_ts/129000_129099/129060/13.05.00_60/ts_129060v130500p.pdf

Signed-off-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 drivers/net/gtp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 6031d499f2be..8b6810bad54b 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -423,11 +423,11 @@ static inline void gtp1_push_header(struct sk_buff *skb, struct pdp_ctx *pctx)
 
 	/* Bits    8  7  6  5  4  3  2	1
 	 *	  +--+--+--+--+--+--+--+--+
-	 *	  |version |PT| 1| E| S|PN|
+	 *	  |version |PT| 0| E| S|PN|
 	 *	  +--+--+--+--+--+--+--+--+
 	 *	    0  0  1  1	1  0  0  0
 	 */
-	gtp1->flags	= 0x38; /* v1, GTP-non-prime. */
+	gtp1->flags	= 0x30; /* v1, GTP-non-prime. */
 	gtp1->type	= GTP_TPDU;
 	gtp1->length	= htons(payload_len);
 	gtp1->tid	= htonl(pctx->u.v1.o_tei);
-- 
2.1.4

^ permalink raw reply related

* [PATCH 1/2] gtp: gtp_check_src_ms_ipv4() always return success
From: Pablo Neira Ayuso @ 2016-12-15 21:35 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, laforge
In-Reply-To: <1481837753-10317-1-git-send-email-pablo@netfilter.org>

From: Lionel Gauthier <Lionel.Gauthier@eurecom.fr>

gtp_check_src_ms_ipv4() did not find the PDP context matching with the
UE IP address because the memory location is not right, but the result
is inverted by the Boolean "not" operator.  So whatever is the PDP
context, any call to this function is successful.

Signed-off-by: Lionel Gauthier <Lionel.Gauthier@eurecom.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 drivers/net/gtp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 98f10c216521..6031d499f2be 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -158,9 +158,9 @@ static bool gtp_check_src_ms_ipv4(struct sk_buff *skb, struct pdp_ctx *pctx,
 	if (!pskb_may_pull(skb, hdrlen + sizeof(struct iphdr)))
 		return false;
 
-	iph = (struct iphdr *)(skb->data + hdrlen + sizeof(struct iphdr));
+	iph = (struct iphdr *)(skb->data + hdrlen);
 
-	return iph->saddr != pctx->ms_addr_ip4.s_addr;
+	return iph->saddr == pctx->ms_addr_ip4.s_addr;
 }
 
 /* Check if the inner IP source address in this packet is assigned to any
-- 
2.1.4

^ permalink raw reply related

* [PATCH 0/2] GTP tunneling fixes for net
From: Pablo Neira Ayuso @ 2016-12-15 21:35 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, laforge

Hi David,

The following patchset contains two GTP tunneling fixes for your net
tree, they are:

1) Offset to IPv4 header in gtp_check_src_ms_ipv4() is incorrect, thus
   this function always succeeds and therefore this defeats this sanity
   check. This allows packets that have no PDP to go though, patch from
   Lionel Gauthier.

2) According to Note 0 of Figure 2 in Section 6 of 3GPP TS 29.060 v13.5.0
   Release 13, always set GTPv1 reserved bit to zero. This may cause
   interoperability problems, patch from Harald Welte.

Please, apply, thanks a lot!

Harald Welte (1):
  gtp: Fix initialization of Flags octet in GTPv1 header

Lionel Gauthier (1):
  gtp: gtp_check_src_ms_ipv4() always return success

 drivers/net/gtp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

-- 
2.1.4

^ permalink raw reply

* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-15 21:32 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Francois Romieu, bh74.an, ks.giri, vipul.pandya, peppe.cavallaro,
	alexandre.torgue, davem, linux-kernel, netdev
In-Reply-To: <20161215210324.GA13878@amd>

On 15.12.2016 22:03, Pavel Machek wrote:

> 
> I actually did experiment with adding locking there, too, and no, no
> luck. It seems stmmac_tx_err() is more broken than just locking.
> 

Ah ok. Then maybe priv->hw->dma->stop_tx() does not do the job correctly (stop the
tx path properly) and the HW is still active on the tx path while the tx buffers are
freed. OTOH stmmac_release() also stops the phy before the tx (and rx) paths are stopped.
Did you try to stop the phy fist in stmmac_tx_err_work(), too?

Regards,
Lino

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-15 21:25 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Laight, Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers

On Thu, Dec 15, 2016 at 10:17 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> And I was exactly questioning this.
>
> static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
>                                     const struct in6_addr *daddr)
> {
>         net_get_random_once(&ip6_frags.rnd, sizeof(ip6_frags.rnd));
>         return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
>                             (__force u32)id, ip6_frags.rnd);
> }

For this example, the replacement is the function entitled siphash_4u32:

 static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
                                     const struct in6_addr *daddr)
 {
         net_get_random_once(&ip6_frags.rnd, sizeof(ip6_frags.rnd));
         return siphash_4u32(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
                 (__force u32)id, 0, ip6_frags.rnd);
 }

And then you make ip6_frags.rnd be of type siphash_key_t. Then
everything is taken care of and works beautifully. Please see v5 of
this patchset.

> I would be interested if the compiler can actually constant-fold the
> address of the stack allocation with an simple if () or some
> __builtin_constant_p fiddeling, so we don't have this constant review
> overhead to which function we pass which data. This would also make
> this whole discussion moot.

I'll play with it to see if the compiler is capable of doing that.
Does anybody know off hand if it is or if there are other examples of
the compiler doing that?

In any case, for all current replacement of jhash_1word, jhash_2words,
jhash_3words, there's the siphash_2u32 or siphash_4u32 functions. This
covers the majority of cases.

For replacements of md5_transform, either the data is small and can
fit in siphash_Nu{32,64}, or it can be put into a struct explicitly
aligned on the stack.

For the remaining use of jhash_nwords, either siphash() can be used or
siphash_unaligned() can be used if the source is of unknown alignment.
Both functions have their alignment requirements (or lack thereof)
documented in a docbook comment.

I'll look into the constant folding to see if it actually works. If it
does, I'll use it. If not, I believe the current solution works.

How's that sound?

Jason

^ permalink raw reply

* [net-next PATCH v6 4/5] virtio_net: add XDP_TX support
From: John Fastabend @ 2016-12-15 20:14 UTC (permalink / raw)
  To: mst
  Cc: daniel, netdev, alexei.starovoitov, john.r.fastabend, brouer,
	tgraf, davem
In-Reply-To: <20161215200712.23639.53043.stgit@john-Precision-Tower-5810>

This adds support for the XDP_TX action to virtio_net. When an XDP
program is run and returns the XDP_TX action the virtio_net XDP
implementation will transmit the packet on a TX queue that aligns
with the current CPU that the XDP packet was processed on.

Before sending the packet the header is zeroed.  Also XDP is expected
to handle checksum correctly so no checksum offload  support is
provided.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |  100 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 93 insertions(+), 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 992ec5f..1f8300b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -330,12 +330,58 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	return skb;
 }
 
+static void virtnet_xdp_xmit(struct virtnet_info *vi,
+			     struct receive_queue *rq,
+			     struct send_queue *sq,
+			     struct xdp_buff *xdp)
+{
+	struct page *page = virt_to_head_page(xdp->data);
+	struct virtio_net_hdr_mrg_rxbuf *hdr;
+	unsigned int num_sg, len;
+	void *xdp_sent;
+	int err;
+
+	/* Free up any pending old buffers before queueing new ones. */
+	while ((xdp_sent = virtqueue_get_buf(sq->vq, &len)) != NULL) {
+		struct page *sent_page = virt_to_head_page(xdp_sent);
+
+		if (vi->mergeable_rx_bufs)
+			put_page(sent_page);
+		else
+			give_pages(rq, sent_page);
+	}
+
+	/* Zero header and leave csum up to XDP layers */
+	hdr = xdp->data;
+	memset(hdr, 0, vi->hdr_len);
+
+	num_sg = 1;
+	sg_init_one(sq->sg, xdp->data, xdp->data_end - xdp->data);
+	err = virtqueue_add_outbuf(sq->vq, sq->sg, num_sg,
+				   xdp->data, GFP_ATOMIC);
+	if (unlikely(err)) {
+		if (vi->mergeable_rx_bufs)
+			put_page(page);
+		else
+			give_pages(rq, page);
+		return; // On error abort to avoid unnecessary kick
+	} else if (!vi->mergeable_rx_bufs) {
+		/* If not mergeable bufs must be big packets so cleanup pages */
+		give_pages(rq, (struct page *)page->private);
+		page->private = 0;
+	}
+
+	virtqueue_kick(sq->vq);
+}
+
 static u32 do_xdp_prog(struct virtnet_info *vi,
+		       struct receive_queue *rq,
 		       struct bpf_prog *xdp_prog,
 		       struct page *page, int offset, int len)
 {
 	int hdr_padded_len;
 	struct xdp_buff xdp;
+	unsigned int qp;
 	u32 act;
 	u8 *buf;
 
@@ -353,9 +399,15 @@ static u32 do_xdp_prog(struct virtnet_info *vi,
 	switch (act) {
 	case XDP_PASS:
 		return XDP_PASS;
+	case XDP_TX:
+		qp = vi->curr_queue_pairs -
+			vi->xdp_queue_pairs +
+			smp_processor_id();
+		xdp.data = buf + (vi->mergeable_rx_bufs ? 0 : 4);
+		virtnet_xdp_xmit(vi, rq, &vi->sq[qp], &xdp);
+		return XDP_TX;
 	default:
 		bpf_warn_invalid_xdp_action(act);
-	case XDP_TX:
 	case XDP_ABORTED:
 	case XDP_DROP:
 		return XDP_DROP;
@@ -390,9 +442,17 @@ static struct sk_buff *receive_big(struct net_device *dev,
 
 		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
 			goto err_xdp;
-		act = do_xdp_prog(vi, xdp_prog, page, 0, len);
-		if (act == XDP_DROP)
+		act = do_xdp_prog(vi, rq, xdp_prog, page, 0, len);
+		switch (act) {
+		case XDP_PASS:
+			break;
+		case XDP_TX:
+			rcu_read_unlock();
+			goto xdp_xmit;
+		case XDP_DROP:
+		default:
 			goto err_xdp;
+		}
 	}
 	rcu_read_unlock();
 
@@ -407,6 +467,7 @@ static struct sk_buff *receive_big(struct net_device *dev,
 err:
 	dev->stats.rx_dropped++;
 	give_pages(rq, page);
+xdp_xmit:
 	return NULL;
 }
 
@@ -425,6 +486,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	struct bpf_prog *xdp_prog;
 	unsigned int truesize;
 
+	head_skb = NULL;
+
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (xdp_prog) {
@@ -448,9 +511,17 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
 			goto err_xdp;
 
-		act = do_xdp_prog(vi, xdp_prog, page, offset, len);
-		if (act == XDP_DROP)
+		act = do_xdp_prog(vi, rq, xdp_prog, page, offset, len);
+		switch (act) {
+		case XDP_PASS:
+			break;
+		case XDP_TX:
+			rcu_read_unlock();
+			goto xdp_xmit;
+		case XDP_DROP:
+		default:
 			goto err_xdp;
+		}
 	}
 	rcu_read_unlock();
 
@@ -528,6 +599,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 err_buf:
 	dev->stats.rx_dropped++;
 	dev_kfree_skb(head_skb);
+xdp_xmit:
 	return NULL;
 }
 
@@ -1713,6 +1785,16 @@ static void free_receive_page_frags(struct virtnet_info *vi)
 			put_page(vi->rq[i].alloc_frag.page);
 }
 
+static bool is_xdp_queue(struct virtnet_info *vi, int q)
+{
+	if (q < (vi->curr_queue_pairs - vi->xdp_queue_pairs))
+		return false;
+	else if (q < vi->curr_queue_pairs)
+		return true;
+	else
+		return false;
+}
+
 static void free_unused_bufs(struct virtnet_info *vi)
 {
 	void *buf;
@@ -1720,8 +1802,12 @@ static void free_unused_bufs(struct virtnet_info *vi)
 
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		struct virtqueue *vq = vi->sq[i].vq;
-		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
-			dev_kfree_skb(buf);
+		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL) {
+			if (!is_xdp_queue(vi, i))
+				dev_kfree_skb(buf);
+			else
+				put_page(virt_to_head_page(buf));
+		}
 	}
 
 	for (i = 0; i < vi->max_queue_pairs; i++) {

^ permalink raw reply related

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa @ 2016-12-15 21:17 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: David Laight, Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <CAHmME9rDCb=2rojJba13Uew9V9qAbxv1qcJGHwEAKoahxyE9QA@mail.gmail.com>

On 15.12.2016 21:43, Jason A. Donenfeld wrote:
> On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> ARM64 and x86-64 have memory operations that are not vector operations
>> that operate on 128 bit memory.
> 
> Fair enough. imull I guess.
> 
>> How do you know that the compiler for some architecture will not chose a
>> more optimized instruction to load a 64 bit memory value into two 32 bit
>> registers if you tell the compiler it is 8 byte aligned but it actually
>> isn't? I don't know the answer but telling the compiler some data is 8
>> byte aligned while it isn't really pretty much seems like a call for
>> trouble.
> 
> If a compiler is in the business of using special 64-bit instructions
> on 64-bit aligned data, then it is also the job of the compiler to
> align structs to 64-bits when passed __aligned(8), which is what we've
> done in this code. If the compiler were to hypothetically choose to
> ignore that and internally convert it to a __aligned(4), then it would
> only be able to do so with the knowledge that it will never use 64-bit
> aligned data instructions. But so far as I can tell, gcc always
> respects __aligned(8), which is why I use it in this patchset.
> 
> I think there might have been confusion here, because perhaps someone
> was hoping that since in6_addr is 128-bits, that the __aligned
> attribute would not be required and that the struct would just
> automatically be aligned to at least 8 bytes. But in fact, as I
> mentioned, in6_addr is actually composed of u32[4] and not u64[2], so
> it will only be aligned to 4 bytes, making the __aligned(8) necessary.
> 
> I think for the purposes of this patchset, this is a solved problem.
> There's the unaligned version of the function if you don't know about
> the data, and there's the aligned version if you're using
> __aligned(SIPHASH_ALIGNMENT) on your data. Plain and simple.

And I was exactly questioning this.

static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
				    const struct in6_addr *daddr)
{
	net_get_random_once(&ip6_frags.rnd, sizeof(ip6_frags.rnd));
	return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
			    (__force u32)id, ip6_frags.rnd);
}

This function had a hash DoS (and kind of still has), but it has been
mitigated by explicit checks, I hope.

So you start looking for all the pointers where ipv6 addresses could
come from and find some globally defined struct where I would need to
put the aligned(SIPHASH_ALIGNMENT) into to make this work on 32 bit
code? Otherwise just the unaligned version is safe on 32 bit code.

Who knows this? It isn't even obvious by looking at the header!

I would be interested if the compiler can actually constant-fold the
address of the stack allocation with an simple if () or some
__builtin_constant_p fiddeling, so we don't have this constant review
overhead to which function we pass which data. This would also make
this whole discussion mood.

Bye,
Hannes

^ permalink raw reply

* Re: Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-15 21:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kernel-hardening@lists.openwall.com, Hannes Frederic Sowa,
	David Laight, Netdev, Jean-Philippe Aumasson, LKML,
	Linux Crypto Mailing List, Daniel J . Bernstein, Eric Biggers

On Thu, Dec 15, 2016 at 10:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> I think you can/should just use the natural alignment for "u64".
>
> For architectures that need 8-byte alignment, u64 will already be
> properly aligned. For architectures (like x86-32) that only need
> 4-byte alignment, you get it.

I should have added mention of that with my previous email. For the
parameters that are always a multiple of u64 -- namely, the key -- I
now do that in v5 of the patchset. So this is already done.

^ permalink raw reply

* Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Linus Torvalds @ 2016-12-15 21:14 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: kernel-hardening@lists.openwall.com, Hannes Frederic Sowa,
	David Laight, Netdev, Jean-Philippe Aumasson, LKML,
	Linux Crypto Mailing List, Daniel J . Bernstein, Eric Biggers
In-Reply-To: <CAHmME9p+xEBTKz+Wfy5Ypav4HU7H+rnA-0hLgd1sMmthDzOmvw@mail.gmail.com>

On Thu, Dec 15, 2016 at 1:11 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Indeed, I stand corrected. But in any case, the use of __aligned(8) in
> the patchset ensures that things are fixed and that we don't have this
> issue.

I think you can/should just use the natural alignment for "u64".

For architectures that need 8-byte alignment, u64 will already be
properly aligned. For architectures (like x86-32) that only need
4-byte alignment, you get it.

              Linus

^ permalink raw reply

* Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-15 21:11 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Hannes Frederic Sowa, David Laight, Netdev,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <20161215210933.GY3207@twins.programming.kicks-ass.net>

On Thu, Dec 15, 2016 at 10:09 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Dec 15, 2016 at 07:50:36PM +0100, Jason A. Donenfeld wrote:
>> There's no 32-bit platform
>> that will trap on a 64-bit unaligned access because there's no such
>> thing as a 64-bit access there. In short, we're fine.
>
> ARMv7 LPAE is a 32bit architecture that has 64bit load/stores IIRC.
>
> x86 has cmpxchg8b that can do 64bit things and very much wants the u64
> aligned.
>
> Also, IIRC we have a few platforms where u64 doesn't carry 8 byte
> alignment, m68k or something like that, but yes, you likely don't care.

Indeed, I stand corrected. But in any case, the use of __aligned(8) in
the patchset ensures that things are fixed and that we don't have this
issue.

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Peter Zijlstra @ 2016-12-15 21:09 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Hannes Frederic Sowa, David Laight, Netdev,
	kernel-hardening@lists.openwall.com, Jean-Philippe Aumasson, LKML,
	Linux Crypto Mailing List, Daniel J . Bernstein, Linus Torvalds,
	Eric Biggers
In-Reply-To: <CAHmME9pTLFu3-4n6m_OMj5jVWGE-+yC-4CnkynD--H4Nt8_cpA@mail.gmail.com>

On Thu, Dec 15, 2016 at 07:50:36PM +0100, Jason A. Donenfeld wrote:
> There's no 32-bit platform
> that will trap on a 64-bit unaligned access because there's no such
> thing as a 64-bit access there. In short, we're fine.

ARMv7 LPAE is a 32bit architecture that has 64bit load/stores IIRC.

x86 has cmpxchg8b that can do 64bit things and very much wants the u64
aligned.

Also, IIRC we have a few platforms where u64 doesn't carry 8 byte
alignment, m68k or something like that, but yes, you likely don't care.

Just to make life interesting...

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa @ 2016-12-15 21:09 UTC (permalink / raw)
  To: Peter Zijlstra, Jason A. Donenfeld
  Cc: David Laight, Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <20161215210425.GX3207@twins.programming.kicks-ass.net>

On 15.12.2016 22:04, Peter Zijlstra wrote:
> On Thu, Dec 15, 2016 at 09:43:04PM +0100, Jason A. Donenfeld wrote:
>> On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>> ARM64 and x86-64 have memory operations that are not vector operations
>>> that operate on 128 bit memory.
>>
>> Fair enough. imull I guess.
> 
> imull is into rdx:rax, not memory. I suspect he's talking about
> cmpxchg16b.

Exactly and I think I saw a ll/sc 128 bit on armv8 to atomically
manipulate linked lists.

Bye,
Hannes

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Peter Zijlstra @ 2016-12-15 21:04 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Hannes Frederic Sowa, David Laight, Netdev,
	kernel-hardening@lists.openwall.com, Jean-Philippe Aumasson, LKML,
	Linux Crypto Mailing List, Daniel J . Bernstein, Linus Torvalds,
	Eric Biggers
In-Reply-To: <CAHmME9rDCb=2rojJba13Uew9V9qAbxv1qcJGHwEAKoahxyE9QA@mail.gmail.com>

On Thu, Dec 15, 2016 at 09:43:04PM +0100, Jason A. Donenfeld wrote:
> On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > ARM64 and x86-64 have memory operations that are not vector operations
> > that operate on 128 bit memory.
> 
> Fair enough. imull I guess.

imull is into rdx:rax, not memory. I suspect he's talking about
cmpxchg16b.

^ permalink raw reply

* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Pavel Machek @ 2016-12-15 21:03 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Francois Romieu, bh74.an, ks.giri, vipul.pandya, peppe.cavallaro,
	alexandre.torgue, davem, linux-kernel, netdev
In-Reply-To: <cfad6b84-de23-2e03-6d34-2b3451975179@gmx.de>

[-- Attachment #1: Type: text/plain, Size: 1784 bytes --]

Hi!

> sorry for the late reply.

No problem. Thanks for the help.

> On 11.12.2016 21:11, Pavel Machek wrote:
> > 
> > Do you understand what stmmac_tx_err(priv); is supposed to do? In
> > particular, if it is called while the driver is working ok -- should
> > the driver survive that?
> 
> As far as I understood it is supposed to fixup an errorneous tx path, e.g. a
> missing tx completion for transmitted frames.
> 
> Some drivers do this by restarting only the HW parts responsible for tx, some
> others by restarting the complete hardware. 
> But IMO it should also be ok to be called if the HW is still working fine.
> 
> > Because it does not currently, and I don't know how to test that
> > code. Unplugging the cable does not provoke that.
> > 
> > I tried
> > 
> >         } else if (unlikely(status == tx_hard_error))
> >                 stmmac_tx_err(priv);
> > +
> > +       {
> > +               static int i;
> > +               i++;
> > +               if (i==1000) {
> > +                       i = 0;
> > +                       printk("Simulated error\n");
> > +                       stmmac_tx_err(priv);
> > +               }
> > +       }
> >  }
> > 
> 
> Ok, there is this race that Francois mentioned so it is not surprising that
> the driver does not survive the call of stmmac_tx_err() as it is called now.
> Thats why I suggested to do a proper shutdown and restart of the tx path to
> avoid the race.

I actually did experiment with adding locking there, too, and no, no
luck. It seems stmmac_tx_err() is more broken than just locking.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-15 20:43 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Laight, Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <18d1e9d1-1e52-b9a6-de26-2f33859ec052@stressinduktion.org>

On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> ARM64 and x86-64 have memory operations that are not vector operations
> that operate on 128 bit memory.

Fair enough. imull I guess.

> How do you know that the compiler for some architecture will not chose a
> more optimized instruction to load a 64 bit memory value into two 32 bit
> registers if you tell the compiler it is 8 byte aligned but it actually
> isn't? I don't know the answer but telling the compiler some data is 8
> byte aligned while it isn't really pretty much seems like a call for
> trouble.

If a compiler is in the business of using special 64-bit instructions
on 64-bit aligned data, then it is also the job of the compiler to
align structs to 64-bits when passed __aligned(8), which is what we've
done in this code. If the compiler were to hypothetically choose to
ignore that and internally convert it to a __aligned(4), then it would
only be able to do so with the knowledge that it will never use 64-bit
aligned data instructions. But so far as I can tell, gcc always
respects __aligned(8), which is why I use it in this patchset.

I think there might have been confusion here, because perhaps someone
was hoping that since in6_addr is 128-bits, that the __aligned
attribute would not be required and that the struct would just
automatically be aligned to at least 8 bytes. But in fact, as I
mentioned, in6_addr is actually composed of u32[4] and not u64[2], so
it will only be aligned to 4 bytes, making the __aligned(8) necessary.

I think for the purposes of this patchset, this is a solved problem.
There's the unaligned version of the function if you don't know about
the data, and there's the aligned version if you're using
__aligned(SIPHASH_ALIGNMENT) on your data. Plain and simple.

Jason

^ permalink raw reply

* [net-next PATCH v6 2/5] virtio_net: Add XDP support
From: John Fastabend @ 2016-12-15 20:13 UTC (permalink / raw)
  To: mst
  Cc: daniel, netdev, alexei.starovoitov, john.r.fastabend, brouer,
	tgraf, davem
In-Reply-To: <20161215200712.23639.53043.stgit@john-Precision-Tower-5810>

From: John Fastabend <john.fastabend@gmail.com>

This adds XDP support to virtio_net. Some requirements must be
met for XDP to be enabled depending on the mode. First it will
only be supported with LRO disabled so that data is not pushed
across multiple buffers. Second the MTU must be less than a page
size to avoid having to handle XDP across multiple pages.

If mergeable receive is enabled this patch only supports the case
where header and data are in the same buf which we can check when
a packet is received by looking at num_buf. If the num_buf is
greater than 1 and a XDP program is loaded the packet is dropped
and a warning is thrown. When any_header_sg is set this does not
happen and both header and data is put in a single buffer as expected
so we check this when XDP programs are loaded.  Subsequent patches
will process the packet in a degraded mode to ensure connectivity
and correctness is not lost even if backend pushes packets into
multiple buffers.

If big packets mode is enabled and MTU/LRO conditions above are
met then XDP is allowed.

This patch was tested with qemu with vhost=on and vhost=off where
mergeable and big_packet modes were forced via hard coding feature
negotiation. Multiple buffers per packet was forced via a small
test patch to vhost.c in the vhost=on qemu mode.

Suggested-by: Shrijeet Mukherjee <shrijeet@gmail.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |  176 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 171 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b425fa1..93075da 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -22,6 +22,7 @@
 #include <linux/module.h>
 #include <linux/virtio.h>
 #include <linux/virtio_net.h>
+#include <linux/bpf.h>
 #include <linux/scatterlist.h>
 #include <linux/if_vlan.h>
 #include <linux/slab.h>
@@ -81,6 +82,8 @@ struct receive_queue {
 
 	struct napi_struct napi;
 
+	struct bpf_prog __rcu *xdp_prog;
+
 	/* Chain pages by the private ptr. */
 	struct page *pages;
 
@@ -324,6 +327,38 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	return skb;
 }
 
+static u32 do_xdp_prog(struct virtnet_info *vi,
+		       struct bpf_prog *xdp_prog,
+		       struct page *page, int offset, int len)
+{
+	int hdr_padded_len;
+	struct xdp_buff xdp;
+	u32 act;
+	u8 *buf;
+
+	buf = page_address(page) + offset;
+
+	if (vi->mergeable_rx_bufs)
+		hdr_padded_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+	else
+		hdr_padded_len = sizeof(struct padded_vnet_hdr);
+
+	xdp.data = buf + hdr_padded_len;
+	xdp.data_end = xdp.data + (len - vi->hdr_len);
+
+	act = bpf_prog_run_xdp(xdp_prog, &xdp);
+	switch (act) {
+	case XDP_PASS:
+		return XDP_PASS;
+	default:
+		bpf_warn_invalid_xdp_action(act);
+	case XDP_TX:
+	case XDP_ABORTED:
+	case XDP_DROP:
+		return XDP_DROP;
+	}
+}
+
 static struct sk_buff *receive_small(struct virtnet_info *vi, void *buf, unsigned int len)
 {
 	struct sk_buff * skb = buf;
@@ -340,14 +375,32 @@ static struct sk_buff *receive_big(struct net_device *dev,
 				   void *buf,
 				   unsigned int len)
 {
+	struct bpf_prog *xdp_prog;
 	struct page *page = buf;
-	struct sk_buff *skb = page_to_skb(vi, rq, page, 0, len, PAGE_SIZE);
+	struct sk_buff *skb;
 
+	rcu_read_lock();
+	xdp_prog = rcu_dereference(rq->xdp_prog);
+	if (xdp_prog) {
+		struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
+		u32 act;
+
+		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
+			goto err_xdp;
+		act = do_xdp_prog(vi, xdp_prog, page, 0, len);
+		if (act == XDP_DROP)
+			goto err_xdp;
+	}
+	rcu_read_unlock();
+
+	skb = page_to_skb(vi, rq, page, 0, len, PAGE_SIZE);
 	if (unlikely(!skb))
 		goto err;
 
 	return skb;
 
+err_xdp:
+	rcu_read_unlock();
 err:
 	dev->stats.rx_dropped++;
 	give_pages(rq, page);
@@ -365,11 +418,42 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	u16 num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
 	struct page *page = virt_to_head_page(buf);
 	int offset = buf - page_address(page);
-	unsigned int truesize = max(len, mergeable_ctx_to_buf_truesize(ctx));
+	struct sk_buff *head_skb, *curr_skb;
+	struct bpf_prog *xdp_prog;
+	unsigned int truesize;
+
+	rcu_read_lock();
+	xdp_prog = rcu_dereference(rq->xdp_prog);
+	if (xdp_prog) {
+		u32 act;
+
+		/* No known backend devices should send packets with
+		 * more than a single buffer when XDP conditions are
+		 * met. However it is not strictly illegal so the case
+		 * is handled as an exception and a warning is thrown.
+		 */
+		if (unlikely(num_buf > 1)) {
+			bpf_warn_invalid_xdp_buffer();
+			goto err_xdp;
+		}
+
+		/* Transient failure which in theory could occur if
+		 * in-flight packets from before XDP was enabled reach
+		 * the receive path after XDP is loaded. In practice I
+		 * was not able to create this condition.
+		 */
+		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
+			goto err_xdp;
+
+		act = do_xdp_prog(vi, xdp_prog, page, offset, len);
+		if (act == XDP_DROP)
+			goto err_xdp;
+	}
+	rcu_read_unlock();
 
-	struct sk_buff *head_skb = page_to_skb(vi, rq, page, offset, len,
-					       truesize);
-	struct sk_buff *curr_skb = head_skb;
+	truesize = max(len, mergeable_ctx_to_buf_truesize(ctx));
+	head_skb = page_to_skb(vi, rq, page, offset, len, truesize);
+	curr_skb = head_skb;
 
 	if (unlikely(!curr_skb))
 		goto err_skb;
@@ -423,6 +507,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	ewma_pkt_len_add(&rq->mrg_avg_pkt_len, head_skb->len);
 	return head_skb;
 
+err_xdp:
+	rcu_read_unlock();
 err_skb:
 	put_page(page);
 	while (--num_buf) {
@@ -1337,6 +1423,13 @@ static int virtnet_set_channels(struct net_device *dev,
 	if (queue_pairs > vi->max_queue_pairs || queue_pairs == 0)
 		return -EINVAL;
 
+	/* For now we don't support modifying channels while XDP is loaded
+	 * also when XDP is loaded all RX queues have XDP programs so we only
+	 * need to check a single RX queue.
+	 */
+	if (vi->rq[0].xdp_prog)
+		return -EINVAL;
+
 	get_online_cpus();
 	err = virtnet_set_queues(vi, queue_pairs);
 	if (!err) {
@@ -1428,6 +1521,70 @@ static void virtnet_init_settings(struct net_device *dev)
 	.set_settings = virtnet_set_settings,
 };
 
+static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
+{
+	unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
+	struct virtnet_info *vi = netdev_priv(dev);
+	struct bpf_prog *old_prog;
+	int i;
+
+	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
+	    virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6)) {
+		netdev_warn(dev, "can't set XDP while host is implementing LRO, disable LRO first\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (vi->mergeable_rx_bufs && !vi->any_header_sg) {
+		netdev_warn(dev, "XDP expects header/data in single page, any_header_sg required\n");
+		return -EINVAL;
+	}
+
+	if (dev->mtu > max_sz) {
+		netdev_warn(dev, "XDP requires MTU less than %lu\n", max_sz);
+		return -EINVAL;
+	}
+
+	if (prog) {
+		prog = bpf_prog_add(prog, vi->max_queue_pairs - 1);
+		if (IS_ERR(prog))
+			return PTR_ERR(prog);
+	}
+
+	for (i = 0; i < vi->max_queue_pairs; i++) {
+		old_prog = rtnl_dereference(vi->rq[i].xdp_prog);
+		rcu_assign_pointer(vi->rq[i].xdp_prog, prog);
+		if (old_prog)
+			bpf_prog_put(old_prog);
+	}
+
+	return 0;
+}
+
+static bool virtnet_xdp_query(struct net_device *dev)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int i;
+
+	for (i = 0; i < vi->max_queue_pairs; i++) {
+		if (vi->rq[i].xdp_prog)
+			return true;
+	}
+	return false;
+}
+
+static int virtnet_xdp(struct net_device *dev, struct netdev_xdp *xdp)
+{
+	switch (xdp->command) {
+	case XDP_SETUP_PROG:
+		return virtnet_xdp_set(dev, xdp->prog);
+	case XDP_QUERY_PROG:
+		xdp->prog_attached = virtnet_xdp_query(dev);
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
 static const struct net_device_ops virtnet_netdev = {
 	.ndo_open            = virtnet_open,
 	.ndo_stop   	     = virtnet_close,
@@ -1444,6 +1601,7 @@ static void virtnet_init_settings(struct net_device *dev)
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	.ndo_busy_poll		= virtnet_busy_poll,
 #endif
+	.ndo_xdp		= virtnet_xdp,
 };
 
 static void virtnet_config_changed_work(struct work_struct *work)
@@ -1505,12 +1663,20 @@ static void virtnet_free_queues(struct virtnet_info *vi)
 
 static void free_receive_bufs(struct virtnet_info *vi)
 {
+	struct bpf_prog *old_prog;
 	int i;
 
+	rtnl_lock();
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		while (vi->rq[i].pages)
 			__free_pages(get_a_page(&vi->rq[i], GFP_KERNEL), 0);
+
+		old_prog = rtnl_dereference(vi->rq[i].xdp_prog);
+		RCU_INIT_POINTER(vi->rq[i].xdp_prog, NULL);
+		if (old_prog)
+			bpf_prog_put(old_prog);
 	}
+	rtnl_unlock();
 }
 
 static void free_receive_page_frags(struct virtnet_info *vi)

^ permalink raw reply related

* [net-next PATCH v6 1/5] net: xdp: add invalid buffer warning
From: John Fastabend @ 2016-12-15 20:12 UTC (permalink / raw)
  To: mst
  Cc: daniel, netdev, alexei.starovoitov, john.r.fastabend, brouer,
	tgraf, davem
In-Reply-To: <20161215200712.23639.53043.stgit@john-Precision-Tower-5810>

This adds a warning for drivers to use when encountering an invalid
buffer for XDP. For normal cases this should not happen but to catch
this in virtual/qemu setups that I may not have expected from the
emulation layer having a standard warning is useful.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 include/linux/filter.h |    1 +
 net/core/filter.c      |    6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 6a16583..af8a180 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -602,6 +602,7 @@ int sk_get_filter(struct sock *sk, struct sock_filter __user *filter,
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
 				       const struct bpf_insn *patch, u32 len);
 void bpf_warn_invalid_xdp_action(u32 act);
+void bpf_warn_invalid_xdp_buffer(void);
 
 #ifdef CONFIG_BPF_JIT
 extern int bpf_jit_enable;
diff --git a/net/core/filter.c b/net/core/filter.c
index b146170..7190bd6 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2972,6 +2972,12 @@ void bpf_warn_invalid_xdp_action(u32 act)
 }
 EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action);
 
+void bpf_warn_invalid_xdp_buffer(void)
+{
+	WARN_ONCE(1, "Illegal XDP buffer encountered, expect throughput degradation\n");
+}
+EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_buffer);
+
 static u32 sk_filter_convert_ctx_access(enum bpf_access_type type, int dst_reg,
 					int src_reg, int ctx_off,
 					struct bpf_insn *insn_buf,

^ permalink raw reply related

* [PATCH net] sfc: skip ptp allocations on secondary interfaces
From: Jarod Wilson @ 2016-12-15 20:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jarod Wilson, linux-net-drivers, Edward Cree, Bert Kenward,
	netdev

Rather than allocating efx_ptp_data, a buffer, a workqueue, etc., and then
ultimately deciding not to call ptp_clock_register() for non-primary
interfaces, just exit out of efx_ptp_prober() earlier. Saves a little
memory and speeds up init and teardown slightly.

CC: linux-net-drivers@solarflare.com
CC: Edward Cree <ecree@solarflare.com>
CC: Bert Kenward <bkenward@solarflare.com>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 drivers/net/ethernet/sfc/ptp.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
index 60cdb97..45df9f0 100644
--- a/drivers/net/ethernet/sfc/ptp.c
+++ b/drivers/net/ethernet/sfc/ptp.c
@@ -1220,6 +1220,10 @@ int efx_ptp_probe(struct efx_nic *efx, struct efx_channel *channel)
 	int rc = 0;
 	unsigned int pos;
 
+	if (!(efx->mcdi->fn_flags &
+	    (1 << MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_PRIMARY)))
+		return 0;
+
 	ptp = kzalloc(sizeof(struct efx_ptp_data), GFP_KERNEL);
 	efx->ptp_data = ptp;
 	if (!efx->ptp_data)
@@ -1261,23 +1265,21 @@ int efx_ptp_probe(struct efx_nic *efx, struct efx_channel *channel)
 	if (rc < 0)
 		goto fail3;
 
-	if (efx->mcdi->fn_flags &
-	    (1 << MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_PRIMARY)) {
-		ptp->phc_clock_info = efx_phc_clock_info;
-		ptp->phc_clock = ptp_clock_register(&ptp->phc_clock_info,
-						    &efx->pci_dev->dev);
-		if (IS_ERR(ptp->phc_clock)) {
-			rc = PTR_ERR(ptp->phc_clock);
-			goto fail3;
-		} else if (ptp->phc_clock) {
-			INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker);
-			ptp->pps_workwq = create_singlethread_workqueue("sfc_pps");
-			if (!ptp->pps_workwq) {
-				rc = -ENOMEM;
-				goto fail4;
-			}
+	ptp->phc_clock_info = efx_phc_clock_info;
+	ptp->phc_clock = ptp_clock_register(&ptp->phc_clock_info,
+					    &efx->pci_dev->dev);
+	if (IS_ERR(ptp->phc_clock)) {
+		rc = PTR_ERR(ptp->phc_clock);
+		goto fail3;
+	} else if (ptp->phc_clock) {
+		INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker);
+		ptp->pps_workwq = create_singlethread_workqueue("sfc_pps");
+		if (!ptp->pps_workwq) {
+			rc = -ENOMEM;
+			goto fail4;
 		}
 	}
+
 	ptp->nic_ts_enabled = false;
 
 	return 0;
-- 
2.10.0

^ permalink raw reply related

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa @ 2016-12-15 20:31 UTC (permalink / raw)
  To: Jason A. Donenfeld, David Laight
  Cc: Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <CAHmME9pTLFu3-4n6m_OMj5jVWGE-+yC-4CnkynD--H4Nt8_cpA@mail.gmail.com>

Hello,

On 15.12.2016 19:50, Jason A. Donenfeld wrote:
> Hi David & Hannes,
> 
> This conversation is veering off course.

Why?

> I think this doesn't really
> matter at all. Gcc converts u64 into essentially a pair of u32 on
> 32-bit platforms, so the alignment requirements for 32-bit is at a
> maximum 32 bits. On 64-bit platforms the alignment requirements are
> related at a maximum to the biggest register size, so 64-bit
> alignment. For this reason, no matter the behavior of __aligned(8),
> we're okay. Likewise, even without __aligned(8), if gcc aligns structs
> by their biggest member, then we get 4 byte alignment on 32-bit and 8
> byte alignment on 64-bit, which is fine. There's no 32-bit platform
> that will trap on a 64-bit unaligned access because there's no such
> thing as a 64-bit access there. In short, we're fine.

ARM64 and x86-64 have memory operations that are not vector operations
that operate on 128 bit memory.

How do you know that the compiler for some architecture will not chose a
more optimized instruction to load a 64 bit memory value into two 32 bit
registers if you tell the compiler it is 8 byte aligned but it actually
isn't? I don't know the answer but telling the compiler some data is 8
byte aligned while it isn't really pretty much seems like a call for
trouble.

Why can't a compiler not vectorize this code if it can prove that it
doesn't conflict with other register users?

Bye,
Hannes

^ permalink raw reply

* [net-next PATCH v6 3/5] virtio_net: add dedicated XDP transmit queues
From: John Fastabend @ 2016-12-15 20:13 UTC (permalink / raw)
  To: mst
  Cc: daniel, netdev, alexei.starovoitov, john.r.fastabend, brouer,
	tgraf, davem
In-Reply-To: <20161215200712.23639.53043.stgit@john-Precision-Tower-5810>

XDP requires using isolated transmit queues to avoid interference
with normal networking stack (BQL, NETDEV_TX_BUSY, etc). This patch
adds a XDP queue per cpu when a XDP program is loaded and does not
expose the queues to the OS via the normal API call to
netif_set_real_num_tx_queues(). This way the stack will never push
an skb to these queues.

However virtio/vhost/qemu implementation only allows for creating
TX/RX queue pairs at this time so creating only TX queues was not
possible. And because the associated RX queues are being created I
went ahead and exposed these to the stack and let the backend use
them. This creates more RX queues visible to the network stack than
TX queues which is worth mentioning but does not cause any issues as
far as I can tell.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |   30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 93075da..992ec5f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -114,6 +114,9 @@ struct virtnet_info {
 	/* # of queue pairs currently used by the driver */
 	u16 curr_queue_pairs;
 
+	/* # of XDP queue pairs currently used by the driver */
+	u16 xdp_queue_pairs;
+
 	/* I like... big packets and I cannot lie! */
 	bool big_packets;
 
@@ -1526,7 +1529,8 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
 	struct virtnet_info *vi = netdev_priv(dev);
 	struct bpf_prog *old_prog;
-	int i;
+	u16 xdp_qp = 0, curr_qp;
+	int i, err;
 
 	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
 	    virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6)) {
@@ -1544,12 +1548,34 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 		return -EINVAL;
 	}
 
+	curr_qp = vi->curr_queue_pairs - vi->xdp_queue_pairs;
+	if (prog)
+		xdp_qp = nr_cpu_ids;
+
+	/* XDP requires extra queues for XDP_TX */
+	if (curr_qp + xdp_qp > vi->max_queue_pairs) {
+		netdev_warn(dev, "request %i queues but max is %i\n",
+			    curr_qp + xdp_qp, vi->max_queue_pairs);
+		return -ENOMEM;
+	}
+
+	err = virtnet_set_queues(vi, curr_qp + xdp_qp);
+	if (err) {
+		dev_warn(&dev->dev, "XDP Device queue allocation failure.\n");
+		return err;
+	}
+
 	if (prog) {
 		prog = bpf_prog_add(prog, vi->max_queue_pairs - 1);
-		if (IS_ERR(prog))
+		if (IS_ERR(prog)) {
+			virtnet_set_queues(vi, curr_qp);
 			return PTR_ERR(prog);
+		}
 	}
 
+	vi->xdp_queue_pairs = xdp_qp;
+	netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp);
+
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		old_prog = rtnl_dereference(vi->rq[i].xdp_prog);
 		rcu_assign_pointer(vi->rq[i].xdp_prog, prog);

^ permalink raw reply related

* [PATCH v5 4/4] random: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

This duplicates the current algorithm for get_random_int/long, but uses
siphash instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
separates hashed fields into three values instead of one, in order to
increase diffusion.

The previous MD5 algorithm used a per-cpu MD5 state, which caused
successive calls to the function to chain upon each other. While it's
not entirely clear that this kind of chaining is absolutely necessary
when using a secure PRF like siphash, it can't hurt, and the timing of
the call chain does add a degree of natural entropy. So, in keeping with
this design, instead of the massive per-cpu 64-byte MD5 state, there is
instead a per-cpu previously returned value for chaining.

The speed benefits are substantial:

                | siphash | md5    | speedup |
		------------------------------
get_random_long | 137130  | 415983 | 3.03x   |
get_random_int  | 86384   | 343323 | 3.97x   |

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Ted Tso <tytso@mit.edu>
---
 drivers/char/random.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d6876d506220..a51f0ff43f00 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -262,6 +262,7 @@
 #include <linux/syscalls.h>
 #include <linux/completion.h>
 #include <linux/uuid.h>
+#include <linux/siphash.h>
 #include <crypto/chacha20.h>
 
 #include <asm/processor.h>
@@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = {
 };
 #endif 	/* CONFIG_SYSCTL */
 
-static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned;
+static siphash_key_t random_int_secret;
 
 int random_int_secret_init(void)
 {
@@ -2050,8 +2051,7 @@ int random_int_secret_init(void)
 	return 0;
 }
 
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
-		__aligned(sizeof(unsigned long));
+static DEFINE_PER_CPU(u64, get_random_int_chaining);
 
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
@@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
  */
 unsigned int get_random_int(void)
 {
-	__u32 *hash;
 	unsigned int ret;
+	u64 *chaining;
 
 	if (arch_get_random_int(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = hash[0];
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_int);
@@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int);
  */
 unsigned long get_random_long(void)
 {
-	__u32 *hash;
 	unsigned long ret;
+	u64 *chaining;
 
 	if (arch_get_random_long(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = *(unsigned long *)hash;
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_long);
-- 
2.11.0

^ permalink raw reply related

* [PATCH v5 3/4] secure_seq: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

This gives a clear speed and security improvement. Siphash is both
faster and is more solid crypto than the aging MD5.

Rather than manually filling MD5 buffers, for IPv6, we simply create
a layout by a simple anonymous struct, for which gcc generates
rather efficient code. For IPv4, we pass the values directly to the
short input convenience functions.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Laight <David.Laight@aculab.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/core/secure_seq.c | 133 ++++++++++++++++++++------------------------------
 1 file changed, 52 insertions(+), 81 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 88a8e429fc3e..c80583bf3213 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. */
+
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/cryptohash.h>
@@ -8,14 +10,14 @@
 #include <linux/ktime.h>
 #include <linux/string.h>
 #include <linux/net.h>
-
+#include <linux/siphash.h>
 #include <net/secure_seq.h>
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
+#include <linux/in6.h>
 #include <net/tcp.h>
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
 
-static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
+static siphash_key_t net_secret;
 
 static __always_inline void net_secret_init(void)
 {
@@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq)
 u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 				 __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
+	u64 hash;
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash(&combined, sizeof(combined), net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 dport;
+		u16 padding1;
+		u32 padding2;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32) daddr[i];
-	secret[4] = net_secret[4] + (__force u32)dport;
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	return hash[0];
+	return siphash(&combined, sizeof(combined), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 			       __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	u64 hash;
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash_4u32(saddr, daddr, sport, dport, net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = (__force u32)dport ^ net_secret[14];
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	return hash[0];
+	return siphash_4u32(saddr, daddr, dport, 0, net_secret);
 }
 EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 #endif
@@ -126,21 +110,11 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
 	u64 seq;
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash_4u32(saddr, daddr, sport, dport, net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccp_sequence_number);
@@ -149,26 +123,23 @@ EXPORT_SYMBOL(secure_dccp_sequence_number);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
 				  __be16 sport, __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-	u32 i;
-
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash(&combined, sizeof(combined), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccpv6_sequence_number);
-- 
2.11.0

^ permalink raw reply related

* [PATCH v5 2/4] siphash: add Nu{32,64} helpers
From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

These restore parity with the jhash interface by providing high
performance helpers for common input sizes.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Tom Herbert <tom@herbertland.com>
---
 include/linux/siphash.h |  33 ++++++++++
 lib/siphash.c           | 157 +++++++++++++++++++++++++++++++++++++-----------
 lib/test_siphash.c      |  18 ++++++
 3 files changed, 172 insertions(+), 36 deletions(-)

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
index 145cf5667078..6f5a08a0fc7e 100644
--- a/include/linux/siphash.h
+++ b/include/linux/siphash.h
@@ -29,4 +29,37 @@ static inline u64 siphash_unaligned(const void *data, size_t len,
 u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key);
 #endif
 
+u64 siphash_1u64(const u64 a, const siphash_key_t key);
+u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key);
+u64 siphash_3u64(const u64 a, const u64 b, const u64 c,
+		 const siphash_key_t key);
+u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d,
+		 const siphash_key_t key);
+
+static inline u64 siphash_2u32(const u32 a, const u32 b, const siphash_key_t key)
+{
+	return siphash_1u64((u64)b << 32 | a, key);
+}
+
+static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const siphash_key_t key)
+{
+	return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key);
+}
+
+static inline u64 siphash_6u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const u32 e, const u32 f, const siphash_key_t key)
+{
+	return siphash_3u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e,
+			    key);
+}
+
+static inline u64 siphash_8u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const u32 e, const u32 f, const u32 g, const u32 h,
+			       const siphash_key_t key)
+{
+	return siphash_4u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e,
+			    (u64)h << 32 | g, key);
+}
+
 #endif /* _LINUX_SIPHASH_H */
diff --git a/lib/siphash.c b/lib/siphash.c
index afc13cbb1b78..970c083ab06a 100644
--- a/lib/siphash.c
+++ b/lib/siphash.c
@@ -25,6 +25,29 @@
 	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
 	} while(0)
 
+#define PREAMBLE(len) \
+	u64 v0 = 0x736f6d6570736575ULL; \
+	u64 v1 = 0x646f72616e646f6dULL; \
+	u64 v2 = 0x6c7967656e657261ULL; \
+	u64 v3 = 0x7465646279746573ULL; \
+	u64 b = ((u64)len) << 56; \
+	v3 ^= key[1]; \
+	v2 ^= key[0]; \
+	v1 ^= key[1]; \
+	v0 ^= key[0];
+
+#define POSTAMBLE \
+	v3 ^= b; \
+	SIPROUND; \
+	SIPROUND; \
+	v0 ^= b; \
+	v2 ^= 0xff; \
+	SIPROUND; \
+	SIPROUND; \
+	SIPROUND; \
+	SIPROUND; \
+	return (v0 ^ v1) ^ (v2 ^ v3);
+
 /**
  * siphash - compute 64-bit siphash PRF value
  * @data: buffer to hash, must be aligned to SIPHASH_ALIGNMENT
@@ -33,18 +56,10 @@
  */
 u64 siphash(const void *data, size_t len, const siphash_key_t key)
 {
-	u64 v0 = 0x736f6d6570736575ULL;
-	u64 v1 = 0x646f72616e646f6dULL;
-	u64 v2 = 0x6c7967656e657261ULL;
-	u64 v3 = 0x7465646279746573ULL;
-	u64 b = ((u64)len) << 56;
-	u64 m;
 	const u8 *end = data + len - (len % sizeof(u64));
 	const u8 left = len & (sizeof(u64) - 1);
-	v3 ^= key[1];
-	v2 ^= key[0];
-	v1 ^= key[1];
-	v0 ^= key[0];
+	u64 m;
+	PREAMBLE(len)
 	for (; data != end; data += sizeof(u64)) {
 		m = le64_to_cpup(data);
 		v3 ^= m;
@@ -67,16 +82,7 @@ u64 siphash(const void *data, size_t len, const siphash_key_t key)
 	case 1: b |= end[0];
 	}
 #endif
-	v3 ^= b;
-	SIPROUND;
-	SIPROUND;
-	v0 ^= b;
-	v2 ^= 0xff;
-	SIPROUND;
-	SIPROUND;
-	SIPROUND;
-	SIPROUND;
-	return (v0 ^ v1) ^ (v2 ^ v3);
+	POSTAMBLE
 }
 EXPORT_SYMBOL(siphash);
 
@@ -89,18 +95,10 @@ EXPORT_SYMBOL(siphash);
  */
 u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key)
 {
-	u64 v0 = 0x736f6d6570736575ULL;
-	u64 v1 = 0x646f72616e646f6dULL;
-	u64 v2 = 0x6c7967656e657261ULL;
-	u64 v3 = 0x7465646279746573ULL;
-	u64 b = ((u64)len) << 56;
-	u64 m;
 	const u8 *end = data + len - (len % sizeof(u64));
 	const u8 left = len & (sizeof(u64) - 1);
-	v3 ^= key[1];
-	v2 ^= key[0];
-	v1 ^= key[1];
-	v0 ^= key[0];
+	u64 m;
+	PREAMBLE(len)
 	for (; data != end; data += sizeof(u64)) {
 		m = get_unaligned_le64(data);
 		v3 ^= m;
@@ -123,16 +121,103 @@ u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key)
 	case 1: b |= bytes[0];
 	}
 #endif
-	v3 ^= b;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_unaligned);
+#endif
+
+/**
+ * siphash_1u64 - compute 64-bit siphash PRF value of a u64
+ * @first: first u64
+ * @key: the siphash key
+ */
+u64 siphash_1u64(const u64 first, const siphash_key_t key)
+{
+	PREAMBLE(8)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_1u64);
+
+/**
+ * siphash_2u64 - compute 64-bit siphash PRF value of 2 u64
+ * @first: first u64
+ * @second: second u64
+ * @key: the siphash key
+ */
+u64 siphash_2u64(const u64 first, const u64 second, const siphash_key_t key)
+{
+	PREAMBLE(16)
+	v3 ^= first;
 	SIPROUND;
 	SIPROUND;
-	v0 ^= b;
-	v2 ^= 0xff;
+	v0 ^= first;
+	v3 ^= second;
 	SIPROUND;
 	SIPROUND;
+	v0 ^= second;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_2u64);
+
+/**
+ * siphash_3u64 - compute 64-bit siphash PRF value of 3 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @key: the siphash key
+ */
+u64 siphash_3u64(const u64 first, const u64 second, const u64 third,
+		 const siphash_key_t key)
+{
+	PREAMBLE(24)
+	v3 ^= first;
 	SIPROUND;
 	SIPROUND;
-	return (v0 ^ v1) ^ (v2 ^ v3);
+	v0 ^= first;
+	v3 ^= second;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= second;
+	v3 ^= third;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= third;
+	POSTAMBLE
 }
-EXPORT_SYMBOL(siphash_unaligned);
-#endif
+EXPORT_SYMBOL(siphash_3u64);
+
+/**
+ * siphash_4u64 - compute 64-bit siphash PRF value of 4 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @forth: forth u64
+ * @key: the siphash key
+ */
+u64 siphash_4u64(const u64 first, const u64 second, const u64 third,
+		 const u64 forth, const siphash_key_t key)
+{
+	PREAMBLE(32)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	v3 ^= second;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= second;
+	v3 ^= third;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= third;
+	v3 ^= forth;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= forth;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_4u64);
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
index 93549e4e22c5..1635189c171f 100644
--- a/lib/test_siphash.c
+++ b/lib/test_siphash.c
@@ -67,6 +67,24 @@ static int __init siphash_test_init(void)
 			ret = -EINVAL;
 		}
 	}
+	if (siphash_1u64(0x0706050403020100ULL, test_key) != test_vectors[8]) {
+		pr_info("self-test 1u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key) != test_vectors[16]) {
+		pr_info("self-test 2u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+			 0x1716151413121110ULL, test_key) != test_vectors[24]) {
+		pr_info("self-test 3u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+			 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key) != test_vectors[32]) {
+		pr_info("self-test 4u64: FAIL\n");
+		ret = -EINVAL;
+	}
 	if (!ret)
 		pr_info("self-tests: pass\n");
 	return ret;
-- 
2.11.0

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox