Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jean-Philippe Aumasson @ 2016-12-16 13:22 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
	Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum,
	Daniel J . Bernstein
In-Reply-To: <CAHmME9pjoAsoct1sVDpFFuqaqutv9X7DGJ5OBQXRAS57KFimUA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 827 bytes --]

It needs some basic security review, which I'll try do next week (check for
security margin, optimality of rotation counts, etc.). But after a lot of
experience with this kind of construction (BLAKE, SipHash, NORX), I'm
confident it will be safe as it is.



On Fri, Dec 16, 2016 at 1:44 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> Hey JP,
>
> On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson
> <jeanphilippe.aumasson@gmail.com> wrote:
> > Here's a tentative HalfSipHash:
> > https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
> >
> > Haven't computed the cycle count nor measured its speed.
>
> This is incredible. Really. Wow!
>
> I'll integrate this into my patchset and will write up some
> documentation about when one should be used over the other.
>
> Thanks again. Quite exciting.
>
> Jason
>

[-- Attachment #2: Type: text/html, Size: 1706 bytes --]

^ permalink raw reply

* [PATCH 3/3] net: fec: Fix typo in error msg and comment
From: Peter Meerwald-Stadler @ 2016-12-16 13:23 UTC (permalink / raw)
  To: netdev; +Cc: Peter Meerwald-Stadler, Fugang Duan, trivial

Signed-off-by: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
Cc: Fugang Duan <fugang.duan@nxp.com>
Cc: trivial@kernel.org
---
 drivers/net/ethernet/freescale/fec_main.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 38160c2..250b790 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2001,7 +2001,7 @@ static int fec_enet_mii_init(struct platform_device *pdev)
 		mii_speed--;
 	if (mii_speed > 63) {
 		dev_err(&pdev->dev,
-			"fec clock (%lu) to fast to get right mii speed\n",
+			"fec clock (%lu) too fast to get right mii speed\n",
 			clk_get_rate(fep->clk_ipg));
 		err = -EINVAL;
 		goto err_out;
@@ -2950,7 +2950,7 @@ static void set_multicast_list(struct net_device *ndev)
 		}
 
 		/* only upper 6 bits (FEC_HASH_BITS) are used
-		 * which point to specific bit in he hash registers
+		 * which point to specific bit in the hash registers
 		 */
 		hash = (crc >> (32 - FEC_HASH_BITS)) & 0x3f;
 
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH] net: wan: Use dma_pool_zalloc
From: Souptick Joarder @ 2016-12-16 13:55 UTC (permalink / raw)
  To: Joe Perches; +Cc: Krzysztof Hałasa, netdev, Rameshwar Sahu
In-Reply-To: <1481868649.29291.85.camel@perches.com>

On Fri, Dec 16, 2016 at 11:40 AM, Joe Perches <joe@perches.com> wrote:
> On Fri, 2016-12-16 at 11:33 +0530, Souptick Joarder wrote:
>> On Thu, Dec 15, 2016 at 10:18 PM, Joe Perches <joe@perches.com> wrote:
>> > On Thu, 2016-12-15 at 10:41 +0530, Souptick Joarder wrote:
>> > > On Mon, Dec 12, 2016 at 10:12 AM, Souptick Joarder <jrdr.linux@gmail.com> wrote:
>> > > > On Fri, Dec 9, 2016 at 6:33 PM, Krzysztof Hałasa <khalasa@piap.pl> wrote:
>> > > > > Souptick Joarder <jrdr.linux@gmail.com> writes:
>> > > > >
>> > > > > > We should use dma_pool_zalloc instead of dma_pool_alloc/memset
>> >
>> > []
>> > > > > > diff --git a/drivers/net/wan/ixp4xx_hss.c b/drivers/net/wan/ixp4xx_hss.c
>> >
>> > []
>> > > > > > @@ -976,10 +976,9 @@ static int init_hdlc_queues(struct port *port)
>> > > > > >                       return -ENOMEM;
>> > > > > >       }
>> > > > > >
>> > > > > > -     if (!(port->desc_tab = dma_pool_alloc(dma_pool, GFP_KERNEL,
>> > > > > > -                                           &port->desc_tab_phys)))
>> > > > > > +     if (!(port->desc_tab = dma_pool_zalloc(dma_pool, GFP_KERNEL,
>> > > > > > +                                            &port->desc_tab_phys)))
>> > > > > >               return -ENOMEM;
>> > > > > > -     memset(port->desc_tab, 0, POOL_ALLOC_SIZE);
>> > > > > >       memset(port->rx_buff_tab, 0, sizeof(port->rx_buff_tab)); /* tables */
>> > > > > >       memset(port->tx_buff_tab, 0, sizeof(port->tx_buff_tab));
>> > > > >
>> > > > > This look fine, feel free to send it to the netdev mailing list for
>> > > > > inclusion.
>> > > >
>> > > > Including netdev mailing list based as requested.
>> > > > > Acked-by: Krzysztof Halasa <khalasa@piap.pl>
>> >
>> > []
>> > > Any comment on this patch ?
>> >
>> > Shouldn't the one in drivers/net/ethernet/xscale/ixp4xx_eth.c
>> > also be changed?
>>
>> Yes, you are right.   Do you want me to include it in same patch?
>
> Your choice.  I would use a single patch.

There are few other places where the same change is applicable.
I am planning to put all those changes in a single patch. It includes
changes in drivers/net/ethernet/xscale/ixp4xx_eth.c

You can review this patch separately.
>

^ permalink raw reply

* [PATCH iproute2/net-next 0/2] tc: flower: enhance mask support
From: Simon Horman @ 2016-12-16 13:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Simon Horman

Hi,

this short series enhances mask support for tc flower by:
* Documenting existing mask support for *_ip parameters
* Allowing *_mac options to accept a mask.
  This makes use of existing kernel features.

Based on net-next +
"[PATCH iproute2 0/2] Add dest UDP port to IP tunnel parameters"

Simon Horman (2):
  tc: flower: document that *_ip parameters take a PREFIX as an
    argument.
  tc: flower: Allow *_mac options to accept a mask

 man/man8/tc-flower.8 | 41 +++++++++++++++++++++++------------------
 tc/f_flower.c        | 51 ++++++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 63 insertions(+), 29 deletions(-)

-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply

* [PATCH iproute2/net-next 1/2] tc: flower: document that *_ip parameters take a PREFIX as an argument.
From: Simon Horman @ 2016-12-16 13:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Simon Horman
In-Reply-To: <1481896477-13497-1-git-send-email-simon.horman@netronome.com>

* The argument to src_ip, dst_ip, enc_src_ip and enc_dst_ip take an
  optional prefix length which is used to provide a mask to limit the scope
  of matching.
* This is documented as a PREFIX in keeping with ip-route(8).

Example of uses of IPv4 and IPv6 prefixes

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 dst_ip 192.168.1.1 action drop
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 src_ip 10.0.0.0/8 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 src_ip 2001:DB8:1::/48 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 dst_ip 2001:DB8::1 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 man/man8/tc-flower.8 | 28 ++++++++++++++--------------
 tc/f_flower.c        |  8 ++++----
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/man/man8/tc-flower.8 b/man/man8/tc-flower.8
index 88df83360b89..a383b6584dc6 100644
--- a/man/man8/tc-flower.8
+++ b/man/man8/tc-flower.8
@@ -31,8 +31,8 @@ flower \- flow based traffic control filter
 .IR ETH_TYPE " } | "
 .BR ip_proto " { " tcp " | " udp " | " sctp " | " icmp " | " icmpv6 " | "
 .IR IP_PROTO " } | { "
-.BR dst_ip " | " src_ip " } { "
-.IR ipv4_address " | " ipv6_address " } | { "
+.BR dst_ip " | " src_ip " } "
+.IR PREFIX " | { "
 .BR dst_port " | " src_port " } "
 .IR port_number " } | "
 .B enc_key_id
@@ -103,14 +103,14 @@ may be
 .BR tcp ", " udp ", " sctp ", " icmp ", " icmpv6
 or an unsigned 8bit value in hexadecimal format.
 .TP
-.BI dst_ip " ADDRESS"
+.BI dst_ip " PREFIX"
 .TQ
-.BI src_ip " ADDRESS"
+.BI src_ip " PREFIX"
 Match on source or destination IP address.
-.I ADDRESS
-must be a valid IPv4 or IPv6 address, depending on
-.BR protocol
-option of tc filter.
+.I PREFIX
+must be a valid IPv4 or IPv6 address, depending on the \fBprotocol\fR
+option to tc filter, optionally followed by a slash and the prefix length.
+If the prefix is missing, \fBtc\fR assumes a full-length host match.
 .TP
 .BI dst_port " NUMBER"
 .TQ
@@ -128,16 +128,16 @@ which have to be specified in beforehand.
 .TP
 .BI enc_key_id " NUMBER"
 .TQ
-.BI enc_dst_ip " ADDRESS"
+.BI enc_dst_ip " PREFIX"
 .TQ
-.BI enc_src_ip " ADDRESS"
-.TQ
-.BI enc_dst_port " NUMBER"
+.BI enc_src_ip " PREFIX"
 Match on IP tunnel metadata. Key id
 .I NUMBER
 is a 32 bit tunnel key id (e.g. VNI for VXLAN tunnel).
-.I ADDRESS
-must be a valid IPv4 or IPv6 address. Dst port
+.I PREFIX
+must be a valid IPv4 or IPv6 address optionally followed by a slash and the
+prefix length. If the prefix is missing, \fBtc\fR assumes a full-length
+host match.  Dst port
 .I NUMBER
 is a 16 bit UDP dst port.
 .SH NOTES
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 653dfefc060a..cdf74344f78f 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -48,14 +48,14 @@ static void explain(void)
 		"                       dst_mac MAC-ADDR |\n"
 		"                       src_mac MAC-ADDR |\n"
 		"                       ip_proto [tcp | udp | sctp | icmp | icmpv6 | IP-PROTO ] |\n"
-		"                       dst_ip [ IPV4-ADDR | IPV6-ADDR ] |\n"
-		"                       src_ip [ IPV4-ADDR | IPV6-ADDR ] |\n"
+		"                       dst_ip PREFIX |\n"
+		"                       src_ip PREFIX |\n"
 		"                       dst_port PORT-NUMBER |\n"
 		"                       src_port PORT-NUMBER |\n"
 		"                       type ICMP-TYPE |\n"
 		"                       code ICMP-CODE }\n"
-		"                       enc_dst_ip [ IPV4-ADDR | IPV6-ADDR ] |\n"
-		"                       enc_src_ip [ IPV4-ADDR | IPV6-ADDR ] |\n"
+		"                       enc_dst_ip PREFIX |\n"
+		"                       enc_src_ip PREFIX |\n"
 		"                       enc_key_id [ KEY-ID ] }\n"
 		"       FILTERID := X:Y:Z\n"
 		"       ACTION-SPEC := ... look at individual actions\n"
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related

* [PATCH iproute2/net-next 2/2] tc: flower: Allow *_mac options to accept a mask
From: Simon Horman @ 2016-12-16 13:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Simon Horman
In-Reply-To: <1481896477-13497-1-git-send-email-simon.horman@netronome.com>

* The argument to src_mac and dst_mac may now take an optional mask
  to limit the scope of matching.
* This address is is documented as a LLADDR in keeping with ip-link(8).
* The formats accepted match those already output when dumping flower
  filters from the kernel.

Example of use of LLADDR with and without a mask:

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:01:00:00:00/ff:ff:00:00:00:01 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:00:00:00:00/23 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:00:00:00:00 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 man/man8/tc-flower.8 | 13 +++++++++----
 tc/f_flower.c        | 43 ++++++++++++++++++++++++++++++++++++-------
 2 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/man/man8/tc-flower.8 b/man/man8/tc-flower.8
index a383b6584dc6..31c7d3b32f9b 100644
--- a/man/man8/tc-flower.8
+++ b/man/man8/tc-flower.8
@@ -22,7 +22,7 @@ flower \- flow based traffic control filter
 .BR skip_sw " | " skip_hw
 .R " | { "
 .BR dst_mac " | " src_mac " } "
-.IR mac_address " | "
+.IR MASKED_LLADDR " | "
 .B vlan_id
 .IR VID " | "
 .B vlan_prio
@@ -74,10 +74,15 @@ filter, or TC offload is not enabled for the interface, operation will fail.
 .BI skip_hw
 Do not process filter by hardware.
 .TP
-.BI dst_mac " mac_address"
+.BI dst_mac " MASKED_LLADDR"
 .TQ
-.BI src_mac " mac_address"
-Match on source or destination MAC address.
+.BI src_mac " MASKED_LLADDR"
+Match on source or destination MAC address.  A mask may be optionally
+provided to limit the bits of the address which are matched. A mask is
+provided by following the address with a slash and then the mask. It may be
+provided in LLADDR format, in which case it is a bitwise mask, or as a
+number of high bits to match. If the mask is missing then a match on all
+bits is assumed.
 .TP
 .BI vlan_id " VID"
 Match on vlan tag id.
diff --git a/tc/f_flower.c b/tc/f_flower.c
index cdf74344f78f..6d9a3b70afed 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -45,8 +45,8 @@ static void explain(void)
 		"                       vlan_id VID |\n"
 		"                       vlan_prio PRIORITY |\n"
 		"                       vlan_ethtype [ ipv4 | ipv6 | ETH-TYPE ] |\n"
-		"                       dst_mac MAC-ADDR |\n"
-		"                       src_mac MAC-ADDR |\n"
+		"                       dst_mac MASKED-LLADDR |\n"
+		"                       src_mac MASKED-LLADDR |\n"
 		"                       ip_proto [tcp | udp | sctp | icmp | icmpv6 | IP-PROTO ] |\n"
 		"                       dst_ip PREFIX |\n"
 		"                       src_ip PREFIX |\n"
@@ -58,6 +58,7 @@ static void explain(void)
 		"                       enc_src_ip PREFIX |\n"
 		"                       enc_key_id [ KEY-ID ] }\n"
 		"       FILTERID := X:Y:Z\n"
+		"       MASKED_LLADDR := { LLADDR | LLADDR/MASK | LLADDR/BITS }\n"
 		"       ACTION-SPEC := ... look at individual actions\n"
 		"\n"
 		"NOTE: CLASSID, IP-PROTO are parsed as hexadecimal input.\n"
@@ -68,16 +69,44 @@ static void explain(void)
 static int flower_parse_eth_addr(char *str, int addr_type, int mask_type,
 				 struct nlmsghdr *n)
 {
-	int ret;
-	char addr[ETH_ALEN];
+	int ret, err = -1;
+	char addr[ETH_ALEN], *slash;
+
+	slash = strchr(str, '/');
+	if (slash)
+		*slash = '\0';
 
 	ret = ll_addr_a2n(addr, sizeof(addr), str);
 	if (ret < 0)
-		return -1;
+		goto err;
 	addattr_l(n, MAX_MSG, addr_type, addr, sizeof(addr));
-	memset(addr, 0xff, ETH_ALEN);
+
+	if (slash) {
+		unsigned bits;
+
+		if (!get_unsigned(&bits, slash + 1, 10)) {
+			uint64_t mask;
+
+			/* Extra 16 bit shift to push mac address into
+			 * high bits of uint64_t
+			 */
+			mask = htonll(0xffffffffffffULL << (16 + 48 - bits));
+			memcpy(addr, &mask, ETH_ALEN);
+		} else {
+			ret = ll_addr_a2n(addr, sizeof(addr), slash + 1);
+			if (ret < 0)
+				goto err;
+		}
+	} else {
+		memset(addr, 0xff, ETH_ALEN);
+	}
 	addattr_l(n, MAX_MSG, mask_type, addr, sizeof(addr));
-	return 0;
+
+	err = 0;
+err:
+	if (slash)
+		*slash = '/';
+	return err;
 }
 
 static int flower_parse_vlan_eth_type(char *str, __be16 eth_type, int type,
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related

* Re: [PATCH perf/core REBASE 2/5] samples/bpf: Switch over to libbpf
From: Arnaldo Carvalho de Melo @ 2016-12-16 14:21 UTC (permalink / raw)
  To: Joe Stringer
  Cc: Arnaldo Carvalho de Melo, LKML, netdev, Wang Nan, ast,
	Daniel Borkmann
In-Reply-To: <CAPWQB7EAN7Fg8+vO3Tn7WYWcZW_JpKQFNCJzY_K100ckab6JRg@mail.gmail.com>

Em Thu, Dec 15, 2016 at 05:48:31PM -0800, Joe Stringer escreveu:
> On 15 December 2016 at 14:00, Joe Stringer <joe@ovn.org> wrote:
> > On 15 December 2016 at 10:34, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >> So, I'm stopping here so that I can push what I have to Ingo, then I'll get
> >> back to this, hopefully by then you beat me and I have just to retest 8-)

> > OK, thanks for the report. Looks like there was another difference
> > between the two libbpfs - one used total program size for its
> > load_program API; the actual kernel API uses instruction count. This
> > incremental should do the trick:

> > https://github.com/joestringer/linux/commit/6ff7726f20077bed66fb725f5189c13690154b6a
 
> The full branch with this change (fast-forward from your tmp branch)
> is available here:
> https://github.com/joestringer/linux/tree/submit/libbpf_samples_v5
 
> I tried running every selftest and BPF sample I could get my hands on;
> there's one or two that I couldn't run, but seemed more to do with my
> versions of TC/iproute and kernel config rather than libbpf changes.
> Let me know if you see any further trouble.

Thanks for doing that! I'll try and reproduce your tests as soon as I'm
back to the office, it looks like it all will go together in my next
pull to Ingo.

- Arnaldo

^ permalink raw reply

* RE: [PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat
From: Nogah Frankel @ 2016-12-16 12:11 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev@vger.kernel.org, roopa@cumulusnetworks.com, Jiri Pirko,
	Elad Raz, Yotam Gigi, Ido Schimmel, Or Gerlitz
In-Reply-To: <20161215093327.2daf60c6@xeon-e3>


> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, December 15, 2016 7:33 PM
> To: Nogah Frankel <nogahf@mellanox.com>
> Cc: netdev@vger.kernel.org; roopa@cumulusnetworks.com; Jiri Pirko
> <jiri@mellanox.com>; Elad Raz <eladr@mellanox.com>; Yotam Gigi
> <yotamg@mellanox.com>; Ido Schimmel <idosch@mellanox.com>; Or Gerlitz
> <ogerlitz@mellanox.com>
> Subject: Re: [PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat
> 
> On Thu, 15 Dec 2016 15:00:43 +0200
> Nogah Frankel <nogahf@mellanox.com> wrote:
> 
> > Extended stats are part of the RTM_GETSTATS method. This patch adds them
> > to ifstat.
> > While extended stats can come in many forms, we support only the
> > rtnl_link_stats64 struct for them (which is the 64 bits version of struct
> > rtnl_link_stats).
> > We support stats in the main nesting level, or one lower.
> > The extension can be called by its name or any shorten of it. If there is
> > more than one matched, the first one will be picked.
> >
> > To get the extended stats the flag -x <stats type> is used.
> >
> > Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
> > Reviewed-by: Jiri Pirko <jiri@mellanox.com>
> > ---
> >  misc/ifstat.c | 161
> ++++++++++++++++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 146 insertions(+), 15 deletions(-)
> >
> > diff --git a/misc/ifstat.c b/misc/ifstat.c
> > index 92d67b0..d17ae21 100644
> > --- a/misc/ifstat.c
> > +++ b/misc/ifstat.c
> > @@ -35,6 +35,7 @@
> >
> >  #include <SNAPSHOT.h>
> >
> > +#include "utils.h"
> >  int dump_zeros;
> >  int reset_history;
> >  int ignore_history;
> 
> Minor nit, please cleanup include order here (original code was wrong).
> 
> Standard practice is:
>  #include system headers (like stdio.h etc)
>  #include "xxx.h" local headers.
> 
> Should be:
> #include <getopt.h>
> 
> #include <linux/if.h>
> #include <linux/if_link.h>
> 
> #include "json_writer.h"
> #include "libnetlink.h"
> #include "utils.h"
> #include "SNAPSHOT.h"

Thanks, I'll fix it.

^ permalink raw reply

* Re: Soft lockup in inet_put_port on 4.6
From: Josef Bacik @ 2016-12-16 14:54 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Tom Herbert, Craig Gallek, Eric Dumazet,
	Linux Kernel Network Developers
In-Reply-To: <aca378b5-aae4-49cf-0542-d1167a5768b8@stressinduktion.org>

On Thu, Dec 15, 2016 at 7:07 PM, Hannes Frederic Sowa 
<hannes@stressinduktion.org> wrote:
> Hi Josef,
> 
> On 15.12.2016 19:53, Josef Bacik wrote:
>>  On Tue, Dec 13, 2016 at 6:32 PM, Tom Herbert <tom@herbertland.com> 
>> wrote:
>>>  On Tue, Dec 13, 2016 at 3:03 PM, Craig Gallek 
>>> <kraigatgoog@gmail.com>
>>>  wrote:
>>>>   On Tue, Dec 13, 2016 at 3:51 PM, Tom Herbert 
>>>> <tom@herbertland.com>
>>>>  wrote:
>>>>>   I think there may be some suspicious code in inet_csk_get_port. 
>>>>> At
>>>>>   tb_found there is:
>>>>> 
>>>>>                   if (((tb->fastreuse > 0 && reuse) ||
>>>>>                        (tb->fastreuseport > 0 &&
>>>>>                         !rcu_access_pointer(sk->sk_reuseport_cb) 
>>>>> &&
>>>>>                         sk->sk_reuseport && uid_eq(tb->fastuid,
>>>>>  uid))) &&
>>>>>                       smallest_size == -1)
>>>>>                           goto success;
>>>>>                   if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk,
>>>>>  tb, true)) {
>>>>>                           if ((reuse ||
>>>>>                                (tb->fastreuseport > 0 &&
>>>>>                                 sk->sk_reuseport &&
>>>>> 
>>>>>  !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>>>>                                 uid_eq(tb->fastuid, uid))) &&
>>>>>                               smallest_size != -1 && --attempts 
>>>>> >= 0) {
>>>>>                                   spin_unlock_bh(&head->lock);
>>>>>                                   goto again;
>>>>>                           }
>>>>>                           goto fail_unlock;
>>>>>                   }
>>>>> 
>>>>>   AFAICT there is redundancy in these two conditionals.  The same 
>>>>> clause
>>>>>   is being checked in both: (tb->fastreuseport > 0 &&
>>>>>   !rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
>>>>>   uid_eq(tb->fastuid, uid))) && smallest_size == -1. If this is 
>>>>> true the
>>>>>   first conditional should be hit, goto done,  and the second 
>>>>> will never
>>>>>   evaluate that part to true-- unless the sk is changed (do we 
>>>>> need
>>>>>   READ_ONCE for sk->sk_reuseport_cb?).
>>>>   That's an interesting point... It looks like this function also
>>>>   changed in 4.6 from using a single local_bh_disable() at the 
>>>> beginning
>>>>   with several spin_lock(&head->lock) to exclusively
>>>>   spin_lock_bh(&head->lock) at each locking point.  Perhaps the 
>>>> full bh
>>>>   disable variant was preventing the timers in your stack trace 
>>>> from
>>>>   running interleaved with this function before?
>>> 
>>>  Could be, although dropping the lock shouldn't be able to affect 
>>> the
>>>  search state. TBH, I'm a little lost in reading function, the
>>>  SO_REUSEPORT handling is pretty complicated. For instance,
>>>  rcu_access_pointer(sk->sk_reuseport_cb) is checked three times in 
>>> that
>>>  function and also in every call to inet_csk_bind_conflict. I 
>>> wonder if
>>>  we can simply this under the assumption that SO_REUSEPORT is only
>>>  allowed if the port number (snum) is explicitly specified.
>> 
>>  Ok first I have data for you Hannes, here's the time distributions
>>  before during and after the lockup (with all the debugging in place 
>> the
>>  box eventually recovers).  I've attached it as a text file since it 
>> is
>>  long.
> 
> Thanks a lot!
> 
>>  Second is I was thinking about why we would spend so much time 
>> doing the
>>  ->owners list, and obviously it's because of the massive amount of
>>  timewait sockets on the owners list.  I wrote the following dumb 
>> patch
>>  and tested it and the problem has disappeared completely.  Now I 
>> don't
>>  know if this is right at all, but I thought it was weird we weren't
>>  copying the soreuseport option from the original socket onto the 
>> twsk.
>>  Is there are reason we aren't doing this currently?  Does this help
>>  explain what is happening?  Thanks,
> 
> The patch is interesting and a good clue, but I am immediately a bit
> concerned that we don't copy/tag the socket with the uid also to keep
> the security properties for SO_REUSEPORT. I have to think a bit more
> about this.
> 
> We have seen hangs during connect. I am afraid this patch wouldn't 
> help
> there while also guaranteeing uniqueness.


Yeah so I looked at the code some more and actually my patch is really 
bad.  If sk2->sk_reuseport is set we'll look at sk2->sk_reuseport_cb, 
which is outside of the timewait sock, so that's definitely bad.

But we should at least be setting it to 0 so that we don't do this 
normally.  Unfortunately simply setting it to 0 doesn't fix the 
problem.  So for some reason having ->sk_reuseport set to 1 on a 
timewait socket makes this problem non-existent, which is strange.

So back to the drawing board I guess.  I wonder if doing what craig 
suggested and batching the timewait timer expires so it hurts less 
would accomplish the same results.  Thanks,

Josef

^ permalink raw reply

* Re: [PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat
From: Rami Rosen @ 2016-12-16 15:21 UTC (permalink / raw)
  To: Nogah Frankel
  Cc: Stephen Hemminger, netdev@vger.kernel.org,
	roopa@cumulusnetworks.com, Jiri Pirko, Elad Raz, Yotam Gigi,
	Ido Schimmel, Or Gerlitz
In-Reply-To: <DB5PR05MB1895991888E270CE790F1C16AC9C0@DB5PR05MB1895.eurprd05.prod.outlook.com>

Hi,

>Thanks, I'll fix it.
Another minor nit, on this occasion:

bool is_extanded should be: bool is_extended

Regards,
Rami Rosen

^ permalink raw reply

* Re: [PATCH perf/core REBASE 2/5] samples/bpf: Switch over to libbpf
From: Arnaldo Carvalho de Melo @ 2016-12-16 15:42 UTC (permalink / raw)
  To: Joe Stringer
  Cc: LKML, netdev, Wang Nan, ast, Daniel Borkmann,
	Arnaldo Carvalho de Melo
In-Reply-To: <CAPWQB7EAN7Fg8+vO3Tn7WYWcZW_JpKQFNCJzY_K100ckab6JRg@mail.gmail.com>

Em Thu, Dec 15, 2016 at 05:48:31PM -0800, Joe Stringer escreveu:
> On 15 December 2016 at 14:00, Joe Stringer <joe@ovn.org> wrote:
> > On 15 December 2016 at 10:34, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >> Em Thu, Dec 15, 2016 at 03:29:18PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>> Em Thu, Dec 15, 2016 at 12:50:22PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>> > Em Wed, Dec 14, 2016 at 02:43:39PM -0800, Joe Stringer escreveu:
> >>> > > Now that libbpf under tools/lib/bpf/* is synced with the version from
> >>> > > samples/bpf, we can get rid most of the libbpf library here.
> >>> > >
> >>> > > Signed-off-by: Joe Stringer <joe@ovn.org>
> >>> > > Cc: Alexei Starovoitov <ast@fb.com>
> >>> > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> >>> > > Cc: Wang Nan <wangnan0@huawei.com>
> >>> > > Link: http://lkml.kernel.org/r/20161209024620.31660-6-joe@ovn.org
> >>> > > [ Use -I$(srctree)/tools/lib/ to support out of source code tree builds, as noticed by Wang Nan ]
> >>>
> >>> So, the above comment no longer applied to this adjusted patch from you,
> >>> as you removed one hunk too much, that, after applied, gets samples/bpf/
> >>> to build successfully:
> >>>
> >>> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> >>> index add514e2984a..81b0ef2f7994 100644
> >>> --- a/samples/bpf/Makefile
> >>> +++ b/samples/bpf/Makefile
> >>> @@ -107,6 +107,7 @@ always += lwt_len_hist_kern.o
> >>>  always += xdp_tx_iptunnel_kern.o
> >>>
> >>>  HOSTCFLAGS += -I$(objtree)/usr/include
> >>> +HOSTCFLAGS += -I$(srctree)/tools/lib/
> >>>  HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/
> >>>
> >>>  HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable
> >>>
> >>> ---------------------
> >>>
> >>> I added it, continuing...
> >>
> >> But then, when I tried to run offwaketime with it, it fails:
> >>
> >> [root@jouet bpf]# ./offwaketime  ls
> >> bpf_load_program() err=22
> >> BPF_LDX uses reserved fields
> >> bpf_load_program() err=22
> >> BPF_LDX uses reserved fields
> >> [root@jouet bpf]#
> >>
> >> If I remove this patch and try again, it works:
> >>
> >> [root@jouet bpf]# ./offwaketime | head -4
> >> swapper/1;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 46
> >> chrome;return_from_SYSCALL_64;do_syscall_64;exit_to_usermode_loop;schedule;__schedule;-;try_to_wake_up;do_futex;sys_futex;do_syscall_64;return_from_SYSCALL_64;;Chrome_ChildIOT 1
> >> firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 3
> >> dockerd-current;entry_SYSCALL_64_fastpath;sys_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;futex_wake;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;dockerd-current 2
> >> [root@jouet bpf]#
> >>
> >>
> >> So, I'm stopping here so that I can push what I have to Ingo, then I'll get
> >> back to this, hopefully by then you beat me and I have just to retest 8-)
> >
> > OK, thanks for the report. Looks like there was another difference
> > between the two libbpfs - one used total program size for its
> > load_program API; the actual kernel API uses instruction count. This
> > incremental should do the trick:
> >
> > https://github.com/joestringer/linux/commit/6ff7726f20077bed66fb725f5189c13690154b6a
> 
> The full branch with this change (fast-forward from your tmp branch)
> is available here:
> https://github.com/joestringer/linux/tree/submit/libbpf_samples_v5

Can you please repost the patches you changed, and just those, sometimes
I'm with limited net connectivity, so not having to go use the github
interface, figure out how to download the patches, etc, is a win.

- Arnaldo
 
> I tried running every selftest and BPF sample I could get my hands on;
> there's one or two that I couldn't run, but seemed more to do with my
> versions of TC/iproute and kernel config rather than libbpf changes.
> Let me know if you see any further trouble.

^ permalink raw reply

* RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers
From: George Spelvin @ 2016-12-16 15:44 UTC (permalink / raw)
  To: ak, davem, David.Laight, ebiggers3, hannes, Jason,
	kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev,
	tom, torvalds, tytso, vegard.nossum
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0240EFA@AcuExch.aculab.com>

Jason A. Donenfeld wrote:
> Isn't that equivalent to:
>	v0 = key[0];
>	v1 = key[1];
>	v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL);
>	v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL);

(Pre-XORing key[] with the first two constants which, if the constants
are random in the first place, can be a no-op.)  Other than the typo
in the v2 line, yes.  If they key is non-public, then you can xor an
arbitrary constant in to both halves to slightly speed up the startup.

(Nits: There's a typo in the v2 line, you don't need to parenthesize
associative operators like xor, and the "ull" suffix is redundant here.)

> Those constants also look like ASCII strings.

They are.  The ASCII is "somepseudorandomlygeneratedbytes".

> What cryptographic analysis has been done on the values?

They're "nothing up my sleeve numbers".

They're arbitrary numbers, and almost any other values would do exactly
as well.  The main properties are:

1) They're different (particulatly v0 != v2 and v1 != v3), and
2) Neither they, nor their xor, is rotationally symmetric like 0x55555555.
   (Because SipHash is mostly rotationally symmetric, broken only by the
   interruption of the carry chain at the msbit, it helps slightly
   to break this up at the beginning.)

Those exact values only matter for portability.  If you don't need anyone
else to be able to compute matching outputs, then you could use any other
convenient constants (like the MD5 round constants).

^ permalink raw reply

* Re: [PATCH net] sfc: skip ptp allocations on secondary interfaces
From: Jarod Wilson @ 2016-12-16 15:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-net-drivers, Edward Cree, Bert Kenward, netdev
In-Reply-To: <20161215203104.12714-1-jarod@redhat.com>

On 2016-12-15 3:31 PM, Jarod Wilson wrote:
> Rather than allocating efx_ptp_data, a buffer, a workqueue, etc., and then
> ultimately deciding not to call ptp_clock_register() for non-primary
> interfaces, just exit out of efx_ptp_prober() earlier. Saves a little
> memory and speeds up init and teardown slightly.
>
> CC: linux-net-drivers@solarflare.com
> CC: Edward Cree <ecree@solarflare.com>
> CC: Bert Kenward <bkenward@solarflare.com>
> CC: netdev@vger.kernel.org
> Signed-off-by: Jarod Wilson <jarod@redhat.com>

Self-Nak on this one. Talking to Bert off-list, he pointed out that the 
driver uses the PTP structure allocated here for the secondary interface 
to do internal time-stamping and the like. While it might be ideal not 
to intermingle PTP structs with non-PTP work, disabling this right now 
is obviously a non-starter.

-- 
Jarod Wilson
jarod@redhat.com

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 15:51 UTC (permalink / raw)
  To: Jean-Philippe Aumasson
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
	Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum,
	Daniel J . Bernstein
In-Reply-To: <CAGiyFddB_HT3H2yhYQ5rprYZ487rJ4iCaH9uPJQD57hiPbn9ng@mail.gmail.com>

Hi JP & George,

My function names:
- SipHash -> siphash
- HalfSipHash -> hsiphash

It appears that hsiphash can produce either 32-bit output or 64-bit
output, with the output length parameter as part of the hash algorithm
in there. When I code this for my kernel patchset, I very likely will
only implement one output length size. Right now I'm leaning toward
32-bit. Questions:

- Is this a reasonable choice?
- When hsiphash is desired due to its faster speed, are there any
circumstances in which producing a 64-bit output would actually be
useful? Namely, are there any hashtables that could benefit from a
64-bit functions?
- Are there reasons why hsiphash with 64-bit output would be
reasonable? Or will we be fine sticking with 32-bit output only?

With both hsiphash and siphash, the division of usage will probably become:
- Use 64-bit output 128-bit key siphash for keyed RNG-like things,
such as syncookies and sequence numbers
- Use 64-bit output 128-bit key siphash for hashtables that must
absolutely be secure to an extremely high bandwidth attacker, such as
userspace directly DoSing a kernel hashtable
- Use 32-bit output 64-bit key hsiphash for quick hashtable functions
that still must be secure but do not require as large of a security
margin

Sound good?

Jason

^ permalink raw reply

* Re: [PATCH v5 3/4] secure_seq: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16 15:57 UTC (permalink / raw)
  To: David Laight
  Cc: Netdev, kernel-hardening@lists.openwall.com, LKML,
	linux-crypto@vger.kernel.org, Ted Tso, Hannes Frederic Sowa,
	Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin,
	Vegard Nossum, ak@linux.intel.com, davem@davemloft.net,
	luto@amacapital.net
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0240E66@AcuExch.aculab.com>

Hi David,

On Fri, Dec 16, 2016 at 10:59 AM, David Laight <David.Laight@aculab.com> wrote:
> You are still putting over-aligned data on stack.
> You only need to align it to the alignment of u64 (not the size of u64).
> If an on-stack item has a stronger alignment requirement than the stack
> the gcc has to generate two stack frames for the function.

Yesterday, folks were saying that sometimes 32-bit platforms need
8-byte alignment for certain 64-bit operations, so I shouldn't fall
back to 4-byte alignment there. But actually, looking at this more
closely, I can just make SIPHASH_ALIGNMENT == __alignof__(u64), which
will take care of all possible concerns, since gcc knows best which
platforms need what alignment. Thanks for making this clear to me with
"the alignment of u64 (not the size of u64)".

> Oh - and wait a bit longer between revisions.

Okay. We can be turtles.

Jason

^ permalink raw reply

* Re: Soft lockup in inet_put_port on 4.6
From: Josef Bacik @ 2016-12-16 15:21 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Tom Herbert, Craig Gallek, Eric Dumazet,
	Linux Kernel Network Developers
In-Reply-To: <1481900088.24490.6@smtp.office365.com>

On Fri, Dec 16, 2016 at 9:54 AM, Josef Bacik <jbacik@fb.com> wrote:
> On Thu, Dec 15, 2016 at 7:07 PM, Hannes Frederic Sowa 
> <hannes@stressinduktion.org> wrote:
>> Hi Josef,
>> 
>> On 15.12.2016 19:53, Josef Bacik wrote:
>>>  On Tue, Dec 13, 2016 at 6:32 PM, Tom Herbert <tom@herbertland.com> 
>>> wrote:
>>>>  On Tue, Dec 13, 2016 at 3:03 PM, Craig Gallek 
>>>> <kraigatgoog@gmail.com>
>>>>  wrote:
>>>>>   On Tue, Dec 13, 2016 at 3:51 PM, Tom Herbert 
>>>>> <tom@herbertland.com>
>>>>>  wrote:
>>>>>>   I think there may be some suspicious code in 
>>>>>> inet_csk_get_port. At
>>>>>>   tb_found there is:
>>>>>> 
>>>>>>                   if (((tb->fastreuse > 0 && reuse) ||
>>>>>>                        (tb->fastreuseport > 0 &&
>>>>>>                         !rcu_access_pointer(sk->sk_reuseport_cb) 
>>>>>> &&
>>>>>>                         sk->sk_reuseport && uid_eq(tb->fastuid,
>>>>>>  uid))) &&
>>>>>>                       smallest_size == -1)
>>>>>>                           goto success;
>>>>>>                   if 
>>>>>> (inet_csk(sk)->icsk_af_ops->bind_conflict(sk,
>>>>>>  tb, true)) {
>>>>>>                           if ((reuse ||
>>>>>>                                (tb->fastreuseport > 0 &&
>>>>>>                                 sk->sk_reuseport &&
>>>>>> 
>>>>>>  !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>>>>>                                 uid_eq(tb->fastuid, uid))) &&
>>>>>>                               smallest_size != -1 && --attempts 
>>>>>> >= 0) {
>>>>>>                                   spin_unlock_bh(&head->lock);
>>>>>>                                   goto again;
>>>>>>                           }
>>>>>>                           goto fail_unlock;
>>>>>>                   }
>>>>>> 
>>>>>>   AFAICT there is redundancy in these two conditionals.  The 
>>>>>> same clause
>>>>>>   is being checked in both: (tb->fastreuseport > 0 &&
>>>>>>   !rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
>>>>>>   uid_eq(tb->fastuid, uid))) && smallest_size == -1. If this is 
>>>>>> true the
>>>>>>   first conditional should be hit, goto done,  and the second 
>>>>>> will never
>>>>>>   evaluate that part to true-- unless the sk is changed (do we 
>>>>>> need
>>>>>>   READ_ONCE for sk->sk_reuseport_cb?).
>>>>>   That's an interesting point... It looks like this function also
>>>>>   changed in 4.6 from using a single local_bh_disable() at the 
>>>>> beginning
>>>>>   with several spin_lock(&head->lock) to exclusively
>>>>>   spin_lock_bh(&head->lock) at each locking point.  Perhaps the 
>>>>> full bh
>>>>>   disable variant was preventing the timers in your stack trace 
>>>>> from
>>>>>   running interleaved with this function before?
>>>> 
>>>>  Could be, although dropping the lock shouldn't be able to affect 
>>>> the
>>>>  search state. TBH, I'm a little lost in reading function, the
>>>>  SO_REUSEPORT handling is pretty complicated. For instance,
>>>>  rcu_access_pointer(sk->sk_reuseport_cb) is checked three times in 
>>>> that
>>>>  function and also in every call to inet_csk_bind_conflict. I 
>>>> wonder if
>>>>  we can simply this under the assumption that SO_REUSEPORT is only
>>>>  allowed if the port number (snum) is explicitly specified.
>>> 
>>>  Ok first I have data for you Hannes, here's the time distributions
>>>  before during and after the lockup (with all the debugging in 
>>> place the
>>>  box eventually recovers).  I've attached it as a text file since 
>>> it is
>>>  long.
>> 
>> Thanks a lot!
>> 
>>>  Second is I was thinking about why we would spend so much time 
>>> doing the
>>>  ->owners list, and obviously it's because of the massive amount of
>>>  timewait sockets on the owners list.  I wrote the following dumb 
>>> patch
>>>  and tested it and the problem has disappeared completely.  Now I 
>>> don't
>>>  know if this is right at all, but I thought it was weird we weren't
>>>  copying the soreuseport option from the original socket onto the 
>>> twsk.
>>>  Is there are reason we aren't doing this currently?  Does this help
>>>  explain what is happening?  Thanks,
>> 
>> The patch is interesting and a good clue, but I am immediately a bit
>> concerned that we don't copy/tag the socket with the uid also to keep
>> the security properties for SO_REUSEPORT. I have to think a bit more
>> about this.
>> 
>> We have seen hangs during connect. I am afraid this patch wouldn't 
>> help
>> there while also guaranteeing uniqueness.
> 
> 
> Yeah so I looked at the code some more and actually my patch is 
> really bad.  If sk2->sk_reuseport is set we'll look at 
> sk2->sk_reuseport_cb, which is outside of the timewait sock, so 
> that's definitely bad.
> 
> But we should at least be setting it to 0 so that we don't do this 
> normally.  Unfortunately simply setting it to 0 doesn't fix the 
> problem.  So for some reason having ->sk_reuseport set to 1 on a 
> timewait socket makes this problem non-existent, which is strange.
> 
> So back to the drawing board I guess.  I wonder if doing what craig 
> suggested and batching the timewait timer expires so it hurts less 
> would accomplish the same results.  Thanks,

Wait no I lied, we access the sk->sk_reuseport_cb, not sk2's.  This is 
the code

                        if ((!reuse || !sk2->sk_reuse ||
                            sk2->sk_state == TCP_LISTEN) &&
                            (!reuseport || !sk2->sk_reuseport ||
                             rcu_access_pointer(sk->sk_reuseport_cb) ||
                             (sk2->sk_state != TCP_TIME_WAIT &&
                             !uid_eq(uid, sock_i_uid(sk2))))) {

                                if (!sk2->sk_rcv_saddr || 
!sk->sk_rcv_saddr ||
                                    sk2->sk_rcv_saddr == 
sk->sk_rcv_saddr)
                                        break;
                        }

so in my patches case we now have reuseport == 1, sk2->sk_reuseport == 
1.  But now we are using reuseport, so sk->sk_reuseport_cb should be 
non-NULL right?  So really setting the timewait sock's sk_reuseport 
should have no bearing on how this loop plays out right?  Thanks,

Josef

^ permalink raw reply

* [PATCH] cgroup: Fix CGROUP_BPF config
From: Andy Lutomirski @ 2016-12-16 16:33 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Mack; +Cc: Network Development, Andy Lutomirski

CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually
enabled, making it rather challenging to turn CGROUP_BPF on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 init/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/init/Kconfig b/init/Kconfig
index aafafeb0c117..223b734abccd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1156,7 +1156,8 @@ config CGROUP_PERF
 
 config CGROUP_BPF
 	bool "Support for eBPF programs attached to cgroups"
-	depends on BPF_SYSCALL && SOCK_CGROUP_DATA
+	depends on BPF_SYSCALL
+	select SOCK_CGROUP_DATA
 	help
 	  Allow attaching eBPF programs to a cgroup using the bpf(2)
 	  syscall command BPF_PROG_ATTACH.
-- 
2.9.3

^ permalink raw reply related

* [PATCH] ip: vfinfo: remove code duplication for IFLA_VF_RSS_QUERY_EN
From: Julien Fortin @ 2016-12-16 16:36 UTC (permalink / raw)
  To: stephen, netdev; +Cc: phil, Julien Fortin

From: Julien Fortin <julien@cumulusnetworks.com>

Fixes: 4fb4a10e120b1 ("ipaddress: Print IFLA_VF_QUERY_RSS_EN setting”)

Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
---
 ip/ipaddress.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index de64877..400ebb4 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -414,14 +414,6 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
 			fprintf(fp, ", query_rss %s",
 			        rss_query->setting ? "on" : "off");
 	}
-	if (vf[IFLA_VF_RSS_QUERY_EN]) {
-		struct ifla_vf_rss_query_en *rss_query =
-			RTA_DATA(vf[IFLA_VF_RSS_QUERY_EN]);
-
-		if (rss_query->setting != -1)
-			fprintf(fp, ", query_rss %s",
-			        rss_query->setting ? "on" : "off");
-	}
 	if (vf[IFLA_VF_STATS] && show_stats)
 		print_vf_stats64(fp, vf[IFLA_VF_STATS]);
 }
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH] ip: vfinfo: remove code duplication for IFLA_VF_RSS_QUERY_EN
From: Phil Sutter @ 2016-12-16 16:43 UTC (permalink / raw)
  To: Julien Fortin; +Cc: stephen, netdev
In-Reply-To: <20161216163605.19728-1-julien@cumulusnetworks.com>

On Fri, Dec 16, 2016 at 05:36:05PM +0100, Julien Fortin wrote:
> From: Julien Fortin <julien@cumulusnetworks.com>
> 
> Fixes: 4fb4a10e120b1 ("ipaddress: Print IFLA_VF_QUERY_RSS_EN setting”)
> 
> Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>

Oh! Looks like Git messed up during a rebase and the obviously wrong
result was overlooked.

Acked-by: Phil Sutter <phil@nwl.cc>

^ permalink raw reply

* "rfkill: Add rfkill-any LED trigger" causes deadlock
From: Михаил Кринкин @ 2016-12-16 16:46 UTC (permalink / raw)
  To: kernel, johannes.berg, linux-wireless, davem, netdev
In-Reply-To: <20161216163707.GA2629@gmail.com>

Hi,

with recent i can't load my thinkpad with recent linux-next, bisect points
at the commit 73f4f76a196d7adb ("rfkill: Add rfkill-any LED trigger").

Problem occurs because thinkapd_acpi rfkill set_block handler
tpacpi_rfk_hook_set_block calls rfkill_set_sw_state, which in turn calls
new rfkill_any_led_trigger_event. So when rfkill_set_block called from
rfkill_register we have deadlock.

I added WARN to __rfkill_any_led_trigger_event to see how deadlock occurs,
here is backtrace:

[    6.090079] WARNING: CPU: 2 PID: 307 at net/rfkill/core.c:184
rfkill_any_led_trigger_event+0x2d/0x40
[    6.090080] reached __rfkill_any_led_trigger_event
[    6.090081] Modules linked in:
[    6.090082]  snd_pcm sg thinkpad_acpi(+) mei_me nvram snd_seq mei
idma64 virt_dma intel_pch_thermal ucsi intel_lpss_pci snd_seq_device
hci_uart snd_timer snd btbcm soundcore btqca led_class btintel
bluetooth i8042 rtc_cmos serio evdev intel_lpss_acpi intel_lpss
acpi_pad tpm_infineon mfd_core kvm_intel kvm irqbypass ipv6 autofs4
ext4 jbd2 mbcache sd_mod i915 i2c_algo_bit drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm ahci libahci video
pinctrl_sunrisepoint pinctrl_intel
[    6.090112] CPU: 2 PID: 307 Comm: systemd-udevd Tainted: G        W
      4.9.0-01741-g73f4f76-dirty #106
[    6.090113] Hardware name: LENOVO 20GJ004ERT/20GJ004ERT, BIOS
R0CET21W (1.09 ) 03/07/2016
[    6.090114]  ffffb199c023f998 ffffffffb4334b6b ffffb199c023f9e8
0000000000000000
[    6.090117]  ffffb199c023f9d8 ffffffffb40658ab 000000b84dd75d28
0000000080000006
[    6.090119]  ffff99ea4dd75d28 0000000000000001 0000000000000001
0000000000000000
[    6.090122] Call Trace:
[    6.090126]  [<ffffffffb4334b6b>] dump_stack+0x4d/0x72
[    6.090128]  [<ffffffffb40658ab>] __warn+0xcb/0xf0
[    6.090130]  [<ffffffffb406592f>] warn_slowpath_fmt+0x5f/0x80
[    6.090132]  [<ffffffffb4631bfd>] rfkill_any_led_trigger_event+0x2d/0x40
[    6.090135]  [<ffffffffb46322d9>] rfkill_set_sw_state+0x89/0xd0
[    6.090142]  [<ffffffffc06c9921>]
tpacpi_rfk_update_swstate+0x31/0x50 [thinkpad_acpi]
[    6.090148]  [<ffffffffc06c9972>]
tpacpi_rfk_hook_set_block+0x32/0x70 [thinkpad_acpi]
[    6.090150]  [<ffffffffb46327f0>] rfkill_set_block+0x90/0x160
[    6.090152]  [<ffffffffb4632930>] __rfkill_switch_all+0x70/0xa0
[    6.090154]  [<ffffffffb4633028>] rfkill_register+0x278/0x2b0
[    6.090159]  [<ffffffffc06e1f0e>] tpacpi_new_rfkill+0xef/0x15d
[thinkpad_acpi]
[    6.090164]  [<ffffffffc06e2407>] bluetooth_init+0x15e/0x1a9 [thinkpad_acpi]
[    6.090169]  [<ffffffffc06e2fb5>]
thinkpad_acpi_module_init.part.47+0x5e7/0x996 [thinkpad_acpi]
[    6.090174]  [<ffffffffc06e3364>] ?
thinkpad_acpi_module_init.part.47+0x996/0x996 [thinkpad_acpi]
[    6.090179]  [<ffffffffc06e36af>]
thinkpad_acpi_module_init+0x34b/0xc9c [thinkpad_acpi]
[    6.090181]  [<ffffffffb4000450>] do_one_initcall+0x50/0x180
[    6.090184]  [<ffffffffb4177e12>] ? do_init_module+0x27/0x1ee
[    6.090186]  [<ffffffffb41cc4b6>] ? kmem_cache_alloc_trace+0x156/0x1a0
[    6.090188]  [<ffffffffb4177e4a>] do_init_module+0x5f/0x1ee
[    6.090191]  [<ffffffffb40ed562>] load_module+0x22c2/0x28d0
[    6.090192]  [<ffffffffb40ea0f0>] ? __symbol_put+0x60/0x60
[    6.090195]  [<ffffffffb42ee15d>] ? ima_post_read_file+0x7d/0xa0
[    6.090198]  [<ffffffffb42a809b>] ? security_kernel_post_read_file+0x6b/0x80
[    6.090200]  [<ffffffffb40eddcf>] SYSC_finit_module+0xdf/0x110
[    6.090202]  [<ffffffffb40ede1e>] SyS_finit_module+0xe/0x10
[    6.090204]  [<ffffffffb463d524>] entry_SYSCALL_64_fastpath+0x17/0x98
[    6.090206] ---[ end trace ea6da61c1ec208d1 ]---
[    6.090207] rfkill_any_led_trigger unlocked
[    6.091664] ------------[ cut here ]------------

^ permalink raw reply

* RE: [upstream-release] [PATCH net 2/4] fsl/fman: arm: call of_platform_populate() for arm64 platfrom
From: Madalin-Cristian Bucur @ 2016-12-16 17:01 UTC (permalink / raw)
  To: Scott Wood, netdev@vger.kernel.org, mpe@ellerman.id.au
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	davem@davemloft.net
In-Reply-To: <DB5PR0401MB1928F163614C64900D1458ED919D0@DB5PR0401MB1928.eurprd04.prod.outlook.com>

> From: Scott Wood
> Sent: Thursday, December 15, 2016 8:42 PM
> 
> On 12/15/2016 07:11 AM, Madalin Bucur wrote:
> > From: Igal Liberman <igal.liberman@freescale.com>
> >
> > Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
> > ---
> >  drivers/net/ethernet/freescale/fman/fman.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/freescale/fman/fman.c
> b/drivers/net/ethernet/freescale/fman/fman.c
> > index dafd9e1..f36b4eb 100644
> > --- a/drivers/net/ethernet/freescale/fman/fman.c
> > +++ b/drivers/net/ethernet/freescale/fman/fman.c
> > @@ -2868,6 +2868,16 @@ static struct fman *read_dts_node(struct
> platform_device *of_dev)
> >
> >  	fman->dev = &of_dev->dev;
> >
> > +#ifdef CONFIG_ARM64
> > +	/* call of_platform_populate in order to probe sub-nodes on arm64 */
> > +	err = of_platform_populate(fm_node, NULL, NULL, &of_dev->dev);
> > +	if (err) {
> > +		dev_err(&of_dev->dev, "%s: of_platform_populate() failed\n",
> > +			__func__);
> > +		goto fman_free;
> > +	}
> > +#endif
> 
> Should we remove fsl,fman from the PPC of_device_ids[], so this doesn't
> need an ifdef?
> 
> Why is it #ifdef CONFIG_ARM64 rather than #ifndef CONFIG_PPC?
> 
> -Scott

Igal was working on adding ARM64 support when this patch was created, thus the
choice of #ifdef CONFIG_ARM64. Unifying this for PPC and ARM64 by always calling
of_platform_populate() sounds like the best approach. I would need to synchronize
the introduction of this code with the removal of the fsl,fman entry from the
of_device_ids[] array.

Dave, Michael, Scott, is it ok to add to v2 of this patch set the patch that removes
the compatible "fsl,fman" from arch/powerpc/platforms/85xx/corenet_generic.c?

Thanks,
Madalin

^ permalink raw reply

* RE: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: David Laight @ 2016-12-16 17:06 UTC (permalink / raw)
  To: 'George Spelvin', ak@linux.intel.com, davem@davemloft.net,
	ebiggers3@gmail.com, hannes@stressinduktion.org, Jason@zx2c4.com,
	jeanphilippe.aumasson@gmail.com,
	kernel-hardening@lists.openwall.com, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org, luto@amacapital.net,
	netdev@vger.kernel.org, tom@herbertland.com,
	torvalds@linux-foundation.org, "tytso@mit.edu" <tytso
  Cc: djb@cr.yp.to
In-Reply-To: <20161215232840.22459.qmail@ns.sciencehorizons.net>

From: George Spelvin
> Sent: 15 December 2016 23:29
> > If a halved version of SipHash can bring significant performance boost
> > (with 32b words instead of 64b words) with an acceptable security level
> > (64-bit enough?) then we may design such a version.
> 
> I was thinking if the key could be pushed to 80 bits, that would be nice,
> but honestly 64 bits is fine.  This is DoS protection, and while it's
> possible to brute-force a 64 bit secret, there are more effective (DDoS)
> attacks possible for the same cost.

A 32bit hash would also remove all the issues about the alignment
of IP addresses (etc) on 64bit systems.

> (I'd suggest a name of "HalfSipHash" to convey the reduced security
> effectively.)
> 
> > Regarding output size, are 64 bits sufficient?
> 
> As a replacement for jhash, 32 bits are sufficient.  It's for
> indexing an in-memory hash table on a 32-bit machine.

It is also worth remembering that if the intent is to generate
a hash table index (not a unique fingerprint) you will always
get collisions on the final value.
Randomness could still give overlong hash chains - which might
still need rehashing with a different key.

	David

^ permalink raw reply

* Re: [PULL] virtio, vhost: new device, fixes, speedups
From: Michael S. Tsirkin @ 2016-12-16 17:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: arend.vanspriel, KVM list, Network Development,
	Linux Kernel Mailing List, virtualization
In-Reply-To: <CA+55aFxwG=jj6=2fKjNkHCOnbKr-mNvNucbA=CSdPvb4kzW3FA@mail.gmail.com>

On Thu, Dec 15, 2016 at 06:20:40PM -0800, Linus Torvalds wrote:
> On Thu, Dec 15, 2016 at 3:05 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
> 
> Pulled, but I wonder...
> 
> >  Documentation/translations/zh_CN/sparse.txt        |   7 +-
> >  arch/arm/plat-samsung/include/plat/gpio-cfg.h      |   2 +-
> >  drivers/crypto/virtio/virtio_crypto_common.h       | 128 +++++
> [...]
> 
> what are you generating these diffstats with? Because they are pretty bogus..
> 
> The end result is correct:
> 
> >  86 files changed, 2106 insertions(+), 280 deletions(-)
> 
> but the file order in the diffstat is completely random, which makes
> it very hard to compare with what I get. It also makes it hard to see
> what you changed, because it's not alphabetical like it should be
> (strictly speaking the git pathname ordering isnt' really
> alphabetical, since the '/' sorts as the NUL character, but close
> enough).
> 
> I can't see the logic to the re-ordering of the lines, so I'm
> intrigued how you even generated it.
> 
>             Linus


Oh, that's because I set orderfile globally rather than
just for the qemu project which wants it.
Fixed, sorry about that.

-- 
MST

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 17:09 UTC (permalink / raw)
  To: David Laight
  Cc: George Spelvin, ak@linux.intel.com, davem@davemloft.net,
	ebiggers3@gmail.com, hannes@stressinduktion.org,
	jeanphilippe.aumasson@gmail.com,
	kernel-hardening@lists.openwall.com, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org, luto@amacapital.net,
	netdev@vger.kernel.org, tom@herbertland.com,
	torvalds@linux-foundation.org, tytso@mit.edu,
	vegard.nossum@gmail.com, djb@cr.yp.to
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0241238@AcuExch.aculab.com>

Hi David,

On Fri, Dec 16, 2016 at 6:06 PM, David Laight <David.Laight@aculab.com> wrote:
> A 32bit hash would also remove all the issues about the alignment
> of IP addresses (etc) on 64bit systems.

The current replacements of md5_transform with siphash in the v6 patch
series will continue to use the original siphash, since the 128-bit
key is rather important for these kinds of secrets. Additionally,
64-bit siphash is already faster than the md5_transform that it
replaces. So the alignment concerns (now, non-issues; problems have
been solved, I believe) still remain.

Jason

^ permalink raw reply

* "rfkill: Add rfkill-any LED trigger" causes deadlock
From: Mike Krinkin @ 2016-12-16 17:09 UTC (permalink / raw)
  To: kernel-ePNcKBjznIDVItvQsEIGlw,
	johannes.berg-ral2JQCrhuEAvxtiuMwx3w,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CACpa5=_chGuU3zVcVR3nkNySwz1cGAQTs39v+BjwcDNSvOqhTw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, Dec 16, 2016 at 07:46:06PM +0300, Михаил Кринкин wrote:
> Hi,
> 
> with recent i can't load my thinkpad with recent linux-next, bisect points
> at the commit 73f4f76a196d7adb ("rfkill: Add rfkill-any LED trigger").
> 
> Problem occurs because thinkapd_acpi rfkill set_block handler
> tpacpi_rfk_hook_set_block calls rfkill_set_sw_state, which in turn calls
> new rfkill_any_led_trigger_event. So when rfkill_set_block called from
> rfkill_register we have deadlock.
> 
> I added WARN to __rfkill_any_led_trigger_event to see how deadlock occurs,
> here is backtrace:
> 
> [    6.090079] WARNING: CPU: 2 PID: 307 at net/rfkill/core.c:184
> rfkill_any_led_trigger_event+0x2d/0x40
> [    6.090080] reached __rfkill_any_led_trigger_event
> [    6.090081] Modules linked in:
> [    6.090082]  snd_pcm sg thinkpad_acpi(+) mei_me nvram snd_seq mei
> idma64 virt_dma intel_pch_thermal ucsi intel_lpss_pci snd_seq_device
> hci_uart snd_timer snd btbcm soundcore btqca led_class btintel
> bluetooth i8042 rtc_cmos serio evdev intel_lpss_acpi intel_lpss
> acpi_pad tpm_infineon mfd_core kvm_intel kvm irqbypass ipv6 autofs4
> ext4 jbd2 mbcache sd_mod i915 i2c_algo_bit drm_kms_helper syscopyarea
> sysfillrect sysimgblt fb_sys_fops drm ahci libahci video
> pinctrl_sunrisepoint pinctrl_intel
> [    6.090112] CPU: 2 PID: 307 Comm: systemd-udevd Tainted: G        W
>       4.9.0-01741-g73f4f76-dirty #106
> [    6.090113] Hardware name: LENOVO 20GJ004ERT/20GJ004ERT, BIOS
> R0CET21W (1.09 ) 03/07/2016
> [    6.090114]  ffffb199c023f998 ffffffffb4334b6b ffffb199c023f9e8
> 0000000000000000
> [    6.090117]  ffffb199c023f9d8 ffffffffb40658ab 000000b84dd75d28
> 0000000080000006
> [    6.090119]  ffff99ea4dd75d28 0000000000000001 0000000000000001
> 0000000000000000
> [    6.090122] Call Trace:
> [    6.090126]  [<ffffffffb4334b6b>] dump_stack+0x4d/0x72
> [    6.090128]  [<ffffffffb40658ab>] __warn+0xcb/0xf0
> [    6.090130]  [<ffffffffb406592f>] warn_slowpath_fmt+0x5f/0x80
> [    6.090132]  [<ffffffffb4631bfd>] rfkill_any_led_trigger_event+0x2d/0x40
> [    6.090135]  [<ffffffffb46322d9>] rfkill_set_sw_state+0x89/0xd0
> [    6.090142]  [<ffffffffc06c9921>]
> tpacpi_rfk_update_swstate+0x31/0x50 [thinkpad_acpi]
> [    6.090148]  [<ffffffffc06c9972>]
> tpacpi_rfk_hook_set_block+0x32/0x70 [thinkpad_acpi]
> [    6.090150]  [<ffffffffb46327f0>] rfkill_set_block+0x90/0x160
> [    6.090152]  [<ffffffffb4632930>] __rfkill_switch_all+0x70/0xa0
> [    6.090154]  [<ffffffffb4633028>] rfkill_register+0x278/0x2b0
> [    6.090159]  [<ffffffffc06e1f0e>] tpacpi_new_rfkill+0xef/0x15d
> [thinkpad_acpi]
> [    6.090164]  [<ffffffffc06e2407>] bluetooth_init+0x15e/0x1a9 [thinkpad_acpi]
> [    6.090169]  [<ffffffffc06e2fb5>]
> thinkpad_acpi_module_init.part.47+0x5e7/0x996 [thinkpad_acpi]
> [    6.090174]  [<ffffffffc06e3364>] ?
> thinkpad_acpi_module_init.part.47+0x996/0x996 [thinkpad_acpi]
> [    6.090179]  [<ffffffffc06e36af>]
> thinkpad_acpi_module_init+0x34b/0xc9c [thinkpad_acpi]
> [    6.090181]  [<ffffffffb4000450>] do_one_initcall+0x50/0x180
> [    6.090184]  [<ffffffffb4177e12>] ? do_init_module+0x27/0x1ee
> [    6.090186]  [<ffffffffb41cc4b6>] ? kmem_cache_alloc_trace+0x156/0x1a0
> [    6.090188]  [<ffffffffb4177e4a>] do_init_module+0x5f/0x1ee
> [    6.090191]  [<ffffffffb40ed562>] load_module+0x22c2/0x28d0
> [    6.090192]  [<ffffffffb40ea0f0>] ? __symbol_put+0x60/0x60
> [    6.090195]  [<ffffffffb42ee15d>] ? ima_post_read_file+0x7d/0xa0
> [    6.090198]  [<ffffffffb42a809b>] ? security_kernel_post_read_file+0x6b/0x80
> [    6.090200]  [<ffffffffb40eddcf>] SYSC_finit_module+0xdf/0x110
> [    6.090202]  [<ffffffffb40ede1e>] SyS_finit_module+0xe/0x10
> [    6.090204]  [<ffffffffb463d524>] entry_SYSCALL_64_fastpath+0x17/0x98
> [    6.090206] ---[ end trace ea6da61c1ec208d1 ]---
> [    6.090207] rfkill_any_led_trigger unlocked
> [    6.091664] ------------[ cut here ]------------

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox