Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 0/2] net/sched: Add hardware specific counters to TC actions
From: Eelco Chaudron @ 2018-08-16 12:02 UTC (permalink / raw)
  To: David Miller, jakub.kicinski
  Cc: netdev, jhs, xiyou.wangcong, jiri, simon.horman,
	Marcelo Ricardo Leitner
In-Reply-To: <20180811.120627.662252154567814394.davem@davemloft.net>

On 11 Aug 2018, at 21:06, David Miller wrote:

> From: Jakub Kicinski <jakub.kicinski@netronome.com>
> Date: Thu, 9 Aug 2018 20:26:08 -0700
>
>> It is not immediately clear why this is needed.  The memory and
>> updating two sets of counters won't come for free, so perhaps a
>> stronger justification than troubleshooting is due? :S
>>
>> Netdev has counters for fallback vs forwarded traffic, so you'd know
>> that traffic hits the SW datapath, plus the rules which are in_hw 
>> will
>> most likely not match as of today for flower (assuming correctness).

I strongly believe that these counters are a requirement for a mixed 
software/hardware (flow) based forwarding environment. The global 
counters will not help much here as you might have chosen to have 
certain traffic forwarded by software.

These counters are probably the only option you have to figure out why 
forwarding is not as fast as expected, and you want to blame the TC 
offload NIC.

>> I'm slightly concerned about potential performance impact, would you
>> be able to share some numbers for non-trivial number of flows (100k
>> active?)?
>
> Agreed, features used for diagnostics cannot have a harmful penalty 
> for
> fast path performance.

Fast path performance is not affected as these counters are not 
incremented there. They are only incremented by the nic driver when they 
gather their statistics from hardware.

However, the flow creation is effected, as this is where the extra 
memory gets allocated. I had done some 40K flow tests before and did not 
see any noticeable change in flow insertion performance. As requested by 
Jakub I did it again for 100K (and threw a Netronome blade in the mix 
;). I used Marcelo’s test tool, 
https://github.com/marceloleitner/perf-flower.git.

Here are the numbers (time in seconds) for 10 runs in sorted order:

+-------------+----------------+
| Base_kernel | Change_applied |
+-------------+----------------+
|    5.684019 |       5.656388 |
|    5.699658 |       5.674974 |
|    5.725220 |       5.722107 |
|    5.739285 |       5.839855 |
|    5.748088 |       5.865238 |
|    5.766231 |       5.873913 |
|    5.842264 |       5.909259 |
|    5.902202 |       5.912685 |
|    5.905391 |       5.947138 |
|    6.032997 |       5.997779 |
+-------------+----------------+

I guess the deviation is in the userspace part, which is where in real 
life flows get added anyway.


Let me know if more is unclear.


//Eelco

^ permalink raw reply

* [PATCH v2] net: phy: add tracepoints
From: Tobias Waldekranz @ 2018-08-16 15:24 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, David S. Miller, Steven Rostedt,
	Ingo Molnar, GitAuthor: Tobias Waldekranz, open list

Two tracepoints for now:

   * `phy_interrupt` Pretty self-explanatory.

   * `phy_state_change` Whenever the PHY's state machine is run, trace
     the old and the new state.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
v1 -> v2:
   * Actually include the entire patch, based on current net-next.
   * Pretty print the PHY's state according to Steven's suggestion.
---
 drivers/net/phy/phy.c      |   8 ++-
 include/trace/events/phy.h | 100 +++++++++++++++++++++++++++++++++++++
 2 files changed, 107 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/phy.h

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 1ee25877c4d1..74fc7151802c 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -38,6 +38,9 @@
 
 #include <asm/irq.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/phy.h>
+
 #define PHY_STATE_STR(_state)			\
 	case PHY_##_state:			\
 		return __stringify(_state);	\
@@ -783,6 +786,7 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
 	if (PHY_HALTED == phydev->state)
 		return IRQ_NONE;		/* It can't be ours.  */
 
+	trace_phy_interrupt(irq, phydev);
 	return phy_change(phydev);
 }
 
@@ -1114,10 +1118,12 @@ void phy_state_machine(struct work_struct *work)
 	if (err < 0)
 		phy_error(phydev);
 
-	if (old_state != phydev->state)
+	if (old_state != phydev->state) {
+		trace_phy_state_change(phydev, old_state);
 		phydev_dbg(phydev, "PHY state change %s -> %s\n",
 			   phy_state_to_str(old_state),
 			   phy_state_to_str(phydev->state));
+	}
 
 	/* Only re-schedule a PHY state machine change if we are polling the
 	 * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving
diff --git a/include/trace/events/phy.h b/include/trace/events/phy.h
new file mode 100644
index 000000000000..c5a0c5739d4a
--- /dev/null
+++ b/include/trace/events/phy.h
@@ -0,0 +1,100 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM phy
+
+#if !defined(_TRACE_PHY_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_PHY_H
+
+#include <linux/tracepoint.h>
+
+#define PHY_STATE_ENUMS \
+        EM( DOWN )      \
+        EM( STARTING )  \
+        EM( READY )     \
+        EM( PENDING )   \
+        EM( UP )        \
+        EM( AN )        \
+        EM( RUNNING )   \
+        EM( NOLINK )    \
+        EM( FORCING )   \
+        EM( CHANGELINK )\
+        EM( HALTED )    \
+        EMe(RESUMING)
+
+#undef EM
+#undef EMe
+
+#define EM(a) TRACE_DEFINE_ENUM( PHY_##a );
+#define EMe(a) TRACE_DEFINE_ENUM( PHY_##a );
+
+PHY_STATE_ENUMS
+
+#undef EM
+#undef EMe
+
+#define EM(a) { PHY_##a, #a },
+#define EMe(a) { PHY_##a, #a }
+
+TRACE_EVENT(phy_interrupt,
+	    TP_PROTO(int irq, struct phy_device *phydev),
+	    TP_ARGS(irq, phydev),
+	    TP_STRUCT__entry(
+		    __field(int, irq)
+		    __field(int, addr)
+		    __field(int, state)
+		    __array(char, ifname, IFNAMSIZ)
+		    ),
+	    TP_fast_assign(
+		    __entry->irq = irq;
+		    __entry->addr = phydev->mdio.addr;
+		    __entry->state = phydev->state;
+		    if (phydev->attached_dev)
+			    memcpy(__entry->ifname,
+				   netdev_name(phydev->attached_dev),
+				   IFNAMSIZ);
+		    else
+			    memset(__entry->ifname, 0, IFNAMSIZ);
+		    ),
+	    TP_printk("phy-%d-irq irq=%d ifname=%-.16s state=%s",
+		      __entry->addr,
+		      __entry->irq,
+		      __entry->ifname,
+		      __print_symbolic(__entry->state, PHY_STATE_ENUMS)
+		    )
+	);
+
+TRACE_EVENT(phy_state_change,
+	    TP_PROTO(struct phy_device *phydev, enum phy_state old_state),
+	    TP_ARGS(phydev, old_state),
+	    TP_STRUCT__entry(
+		    __field(int, addr)
+		    __field(int, state)
+		    __field(int, old_state)
+		    __array(char, ifname, IFNAMSIZ)
+		    ),
+	    TP_fast_assign(
+		    __entry->addr = phydev->mdio.addr;
+		    __entry->state = phydev->state;
+		    __entry->old_state = old_state;
+		    if (phydev->attached_dev)
+			    memcpy(__entry->ifname,
+				   netdev_name(phydev->attached_dev),
+				   IFNAMSIZ);
+		    else
+			    memset(__entry->ifname, 0, IFNAMSIZ);
+		    ),
+	    TP_printk("phy-%d-change ifname=%-.16s old_state=%s state=%s",
+		      __entry->addr,
+		      __entry->ifname,
+		      __print_symbolic(__entry->old_state, PHY_STATE_ENUMS),
+		      __print_symbolic(__entry->state, PHY_STATE_ENUMS)
+		    )
+	);
+
+#undef EM
+#undef EMe
+#undef PHY_STATE_ENUMS
+
+#endif /* _TRACE_PHY_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH v2] net: phy: add tracepoints
From: Steven Rostedt @ 2018-08-16 15:35 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: netdev, Andrew Lunn, Florian Fainelli, David S. Miller,
	Ingo Molnar, open list
In-Reply-To: <20180816152452.7834-1-tobias@waldekranz.com>

On Thu, 16 Aug 2018 17:24:48 +0200
Tobias Waldekranz <tobias@waldekranz.com> wrote:

> Two tracepoints for now:
> 
>    * `phy_interrupt` Pretty self-explanatory.
> 
>    * `phy_state_change` Whenever the PHY's state machine is run, trace
>      the old and the new state.
> 
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> v1 -> v2:
>    * Actually include the entire patch, based on current net-next.
>    * Pretty print the PHY's state according to Steven's suggestion.

Cool!

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve

> ---

^ permalink raw reply

* Re: [PATCH] net: phy: add tracepoints
From: Steven Rostedt @ 2018-08-16 12:46 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: Tobias Waldekranz, Andrew Lunn, Florian Fainelli, David S. Miller,
	Ingo Molnar, open list, open list:ETHERNET PHY LIBRARY
In-Reply-To: <20180816093056.2485-1-tobias@waldekranz.com>

On Thu, 16 Aug 2018 11:30:53 +0200
Tobias Waldekranz <tobias@waldekranz.com> wrote:

> Two tracepoints for now:
> 
>    * `phy_interrupt` Pretty self-explanatory.
> 
>    * `phy_state_change` Whenever the PHY's state machine is run, trace
>      the old and the new state.

>From a tracing perspective, this patch looks fine,

  Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

But below I have a possible improvement.

> 
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
> ---
>  drivers/net/phy/phy.c      |  4 +++
>  include/trace/events/phy.h | 68 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 72 insertions(+)
>  create mode 100644 include/trace/events/phy.h
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 9aabfa1a455a..8d22926f3962 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -38,6 +38,9 @@
>  
>  #include <asm/irq.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/phy.h>
> +
>  #define PHY_STATE_STR(_state)			\
>  	case PHY_##_state:			\
>  		return __stringify(_state);	\
> @@ -1039,6 +1042,7 @@ void phy_state_machine(struct work_struct *work)
>  	if (err < 0)
>  		phy_error(phydev);
>  
> +        trace_phy_state_change(phydev, old_state);

I see that you are passing in a enum phy_state.

>  	if (old_state != phydev->state)
>  		phydev_dbg(phydev, "PHY state change %s -> %s\n",
>  			   phy_state_to_str(old_state),
> diff --git a/include/trace/events/phy.h b/include/trace/events/phy.h
> new file mode 100644
> index 000000000000..7ba6c0dda47e
> --- /dev/null
> +++ b/include/trace/events/phy.h
> @@ -0,0 +1,68 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM phy

You can add this:

#define PHY_STATE_ENUMS	\
	EM( DOWN ) 	\
	EM( STARTING )	\
	EM( READY )	\
	EM( PENDING )	\
	EM( UP )	\
	EM( AN )	\
	EM( RUNNING )	\
	EM( NOLINK )	\
	EM( FORCING )	\
	EM( CHANGELINK )\
	EM( HALTED )	\
	EMe(RESUMING)

#undef EM
#undef EMe

#define EM(a) TRACE_DEFINE_ENUM( PHY_##a );
#define EMe(a) TRACE_DEFINE_ENUM( PHY_##a );

PHY_STATE_ENUMS

#undef EM
#undef EMe

#define EM(a) { PHY_##a, #a },
#define EMe(a) { PHY_##a, #a }

> +
> +#if !defined(_TRACE_PHY_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_PHY_H
> +
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(phy_interrupt,
> +	    TP_PROTO(int irq, struct phy_device *phydev),
> +	    TP_ARGS(irq, phydev),
> +	    TP_STRUCT__entry(
> +		    __field(int, irq)
> +		    __field(int, addr)
> +		    __field(int, state)
> +		    __array(char, ifname, IFNAMSIZ)
> +		    ),
> +	    TP_fast_assign(
> +		    __entry->irq = irq;
> +		    __entry->addr = phydev->mdio.addr;
> +		    __entry->state = phydev->state;
> +		    if (phydev->attached_dev)
> +			    memcpy(__entry->ifname,
> +				   netdev_name(phydev->attached_dev),
> +				   IFNAMSIZ);
> +		    else
> +			    memset(__entry->ifname, 0, IFNAMSIZ);
> +		    ),
> +	    TP_printk("phy-%d-irq irq=%d ifname=%16s state=%d",

Here you can have "state=%s"

> +		      __entry->addr,
> +		      __entry->irq,
> +		      __entry->ifname,
> +		      __entry->state

And here you can have:

			__print_symbolic(__entry->state,
					PHY_STATE_ENUMS )

> +		    )
> +	);
> +
> +TRACE_EVENT(phy_state_change,
> +	    TP_PROTO(struct phy_device *phydev, enum phy_state old_state),
> +	    TP_ARGS(phydev, old_state),
> +	    TP_STRUCT__entry(
> +		    __field(int, addr)
> +		    __field(int, state)
> +		    __field(int, old_state)
> +		    __array(char, ifname, IFNAMSIZ)
> +		    ),
> +	    TP_fast_assign(
> +		    __entry->addr = phydev->mdio.addr;
> +		    __entry->state = phydev->state;
> +		    __entry->old_state = old_state;
> +		    if (phydev->attached_dev)
> +			    memcpy(__entry->ifname,
> +				   netdev_name(phydev->attached_dev),
> +				   IFNAMSIZ);
> +		    else
> +			    memset(__entry->ifname, 0, IFNAMSIZ);
> +		    ),
> +	    TP_printk("phy-%d-change ifname=%16s old_state=%d state=%d",
> +		      __entry->addr,
> +		      __entry->ifname,
> +		      __entry->old_state,
> +		      __entry->state

And do the same above for both old_state and state.

Then instead of just printing numbers and having to remember what state
goes with what number (and god forbid those enums were to change), you
would get actual names of the states, and be easier to read the trace
output.

Note, I didn't test what I wrote, there could be a bug, but it
shouldn't be too far off from what I'm trying to get at.

-- Steve


> +		    )
> +	);
> +
> +#endif /* _TRACE_PHY_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>

^ permalink raw reply

* Re: [iproute PATCH v4] Make colored output configurable
From: David Ahern @ 2018-08-16 13:06 UTC (permalink / raw)
  To: Phil Sutter, Stephen Hemminger; +Cc: netdev, Till Maas
In-Reply-To: <20180816093703.23022-1-phil@nwl.cc>

On 8/16/18 3:37 AM, Phil Sutter wrote:
> Allow for -color={never,auto,always} to have colored output disabled,
> enabled only if stdout is a terminal or enabled regardless of stdout
> state.
> 
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
> Changes since v1:
> - Allow to override isatty() check by specifying '-color' flag more than
>   once.
> - Document new behaviour in man pages.
> 
> Changes since v2:
> - Implement new -color=foo syntax.
> - Update commit message and man page texts accordingly.
> 
> Changes since v3:
> - Fix typo in tc/tc.c causing compile error.
> ---
>  bridge/bridge.c   |  3 +--
>  include/color.h   |  7 +++++++
>  ip/ip.c           |  3 +--
>  lib/color.c       | 33 ++++++++++++++++++++++++++++++++-
>  man/man8/bridge.8 | 13 +++++++++++--
>  man/man8/ip.8     | 13 +++++++++++--
>  man/man8/tc.8     | 13 +++++++++++--
>  tc/tc.c           |  3 +--
>  8 files changed, 75 insertions(+), 13 deletions(-)
> 
> diff --git a/bridge/bridge.c b/bridge/bridge.c
> index 451d684e0bcfd..e35e5bdf7fb30 100644
> --- a/bridge/bridge.c
> +++ b/bridge/bridge.c
> @@ -173,8 +173,7 @@ main(int argc, char **argv)
>  			NEXT_ARG();
>  			if (netns_switch(argv[1]))
>  				exit(-1);
> -		} else if (matches(opt, "-color") == 0) {
> -			++color;
> +		} else if (matches_color(opt, &color) == 0) {
>  		} else if (matches(opt, "-compressvlans") == 0) {
>  			++compress_vlans;
>  		} else if (matches(opt, "-force") == 0) {
> diff --git a/include/color.h b/include/color.h
> index 4f2c918db7e43..42038dc2e7f87 100644
> --- a/include/color.h
> +++ b/include/color.h
> @@ -12,8 +12,15 @@ enum color_attr {
>  	COLOR_NONE
>  };
>  
> +enum color_opt {
> +	COLOR_OPT_NEVER = 0,
> +	COLOR_OPT_AUTO = 1,
> +	COLOR_OPT_ALWAYS = 2
> +};

The order of AUTO and ALWAYS is backwards. Existing users who do
something like:

    ip -c addr list | less -R
or
    ip -c addr list > /tmp/addr
    less -R /tmp/addr

should not be affected by this change. That is an existing command that
works and should continue to work the same after this change.

Users who add -c but don't want the codes if stdout is not a tty are the
ones who should be doing something new - be it adding another -c or
using -c=auto.

^ permalink raw reply

* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: Michal Kubecek @ 2018-08-16 16:06 UTC (permalink / raw)
  To: Greg KH
  Cc: maowenan, dwmw2, netdev, eric.dumazet, edumazet, davem, ycheng,
	jdw, stable, Takashi Iwai
In-Reply-To: <20180816152409.GK10648@kroah.com>

On Thu, Aug 16, 2018 at 05:24:09PM +0200, Greg KH wrote:
> On Thu, Aug 16, 2018 at 02:33:56PM +0200, Michal Kubecek wrote:
> > 
> > Anyway, even at this rate, I only get ~10% of one core (Intel E5-2697).
> > 
> > What I can see, though, is that with current stable 4.4 code, modified
> > testcase which sends something like
> > 
> >   2:3, 3:4, ..., 3001:3002, 3003:3004, 3004:3005, ... 6001:6002, ...
> > 
> > I quickly eat 6 MB of memory for receive queue of one socket while
> > earlier 4.4 kernels only take 200-300 KB. I didn't test latest 4.4 with
> > Takashi's follow-up yet but I'm pretty sure it will help while
> > preserving nice performance when using the original segmentsmack
> > testcase (with increased packet ratio).
> 
> Ok, for now I've applied Takashi's fix to the 4.4 stable queue and will
> push out a new 4.4-rc later tonight.  Can everyone standardize on that
> and test and let me know if it does, or does not, fix the reported
> issues?

I did repeat the tests with Takashi's fix and the CPU utilization is
similar to what we have now, i.e. 3-5% with 10K pkt/s. I could still
saturate one CPU somewhere around 50K pkt/s but that already requires
2.75 MB/s (22 Mb/s) of throughput. (My previous tests with Mao Wenan's
changes in fact used lower speeds as the change from 128 to 1024 would
need to be done in two places.)

Where Takashi's patch does help is that it does not prevent collapsing
of ranges of adjacent segments with total length shorter than ~4KB. It
took more time to verify: it cannot be checked by watching the socket
memory consumption with ss as tcp_collapse_ofo_queue isn't called until
we reach the limits. So I needed to trace when and how tcp_collpse() is
called with both current stable 4.4 code and one with Takashi's fix.
  
> If not, we can go from there and evaluate this much larger patch
> series.  But let's try the simple thing first.

At high packet rates (say 30K pkt/s and more), we can still saturate the
CPU. This is also mentioned in the announcement with claim that switch
to rbtree based queue would be necessary to fully address that. My tests
seem to confirm that but I'm still not sure it is worth backporting
something as intrusive into stable 4.4.

Michal Kubecek

^ permalink raw reply

* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: Greg KH @ 2018-08-16 16:20 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: maowenan, dwmw2, netdev, eric.dumazet, edumazet, davem, ycheng,
	jdw, stable, Takashi Iwai
In-Reply-To: <20180816160616.u3refk4mqpyqagzi@unicorn.suse.cz>

On Thu, Aug 16, 2018 at 06:06:16PM +0200, Michal Kubecek wrote:
> > If not, we can go from there and evaluate this much larger patch
> > series.  But let's try the simple thing first.
> 
> At high packet rates (say 30K pkt/s and more), we can still saturate the
> CPU. This is also mentioned in the announcement with claim that switch
> to rbtree based queue would be necessary to fully address that. My tests
> seem to confirm that but I'm still not sure it is worth backporting
> something as intrusive into stable 4.4.

No, it is not.  If you worry about those things, you should not be
running a 4.4 kernel, use 4.14 or newer please.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH] netfilter/x_tables: do not fail xt_alloc_table_info too easilly
From: Pablo Neira Ayuso @ 2018-08-16 16:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Florian Westphal, Vlastimil Babka, Georgi Nikolov, Andrew Morton,
	David S. Miller, netfilter-devel, coreteam, netdev, linux-kernel,
	linux-mm, Michal Hocko
In-Reply-To: <20180807195400.23687-1-mhocko@kernel.org>

On Tue, Aug 07, 2018 at 09:54:00PM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc()
> in xt_alloc_table_info()") has unintentionally fortified
> xt_alloc_table_info allocation when __GFP_RETRY has been dropped from
> the vmalloc fallback. Later on there was a syzbot report that this
> can lead to OOM killer invocations when tables are too large and
> 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
> has been merged to restore the original behavior. Georgi Nikolov however
> noticed that he is not able to install his iptables anymore so this can
> be seen as a regression.
> 
> The primary argument for 0537250fdc6c was that this allocation path
> shouldn't really trigger the OOM killer and kill innocent tasks. On the
> other hand the interface requires root and as such should allow what the
> admin asks for. Root inside a namespaces makes this more complicated
> because those might be not trusted in general. If they are not then such
> namespaces should be restricted anyway. Therefore drop the __GFP_NORETRY
> and replace it by __GFP_ACCOUNT to enfore memcg constrains on it.

Applied, thanks.

^ permalink raw reply

* Re: [iproute PATCH v4] Make colored output configurable
From: Phil Sutter @ 2018-08-16 15:00 UTC (permalink / raw)
  To: David Ahern; +Cc: Stephen Hemminger, netdev, Till Maas
In-Reply-To: <46aa7a15-0317-10e4-0404-580b38fe2efb@gmail.com>

On Thu, Aug 16, 2018 at 07:06:07AM -0600, David Ahern wrote:
> On 8/16/18 3:37 AM, Phil Sutter wrote:
> > Allow for -color={never,auto,always} to have colored output disabled,
> > enabled only if stdout is a terminal or enabled regardless of stdout
> > state.
> > 
> > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > ---
> > Changes since v1:
> > - Allow to override isatty() check by specifying '-color' flag more than
> >   once.
> > - Document new behaviour in man pages.
> > 
> > Changes since v2:
> > - Implement new -color=foo syntax.
> > - Update commit message and man page texts accordingly.
> > 
> > Changes since v3:
> > - Fix typo in tc/tc.c causing compile error.
> > ---
> >  bridge/bridge.c   |  3 +--
> >  include/color.h   |  7 +++++++
> >  ip/ip.c           |  3 +--
> >  lib/color.c       | 33 ++++++++++++++++++++++++++++++++-
> >  man/man8/bridge.8 | 13 +++++++++++--
> >  man/man8/ip.8     | 13 +++++++++++--
> >  man/man8/tc.8     | 13 +++++++++++--
> >  tc/tc.c           |  3 +--
> >  8 files changed, 75 insertions(+), 13 deletions(-)
> > 
> > diff --git a/bridge/bridge.c b/bridge/bridge.c
> > index 451d684e0bcfd..e35e5bdf7fb30 100644
> > --- a/bridge/bridge.c
> > +++ b/bridge/bridge.c
> > @@ -173,8 +173,7 @@ main(int argc, char **argv)
> >  			NEXT_ARG();
> >  			if (netns_switch(argv[1]))
> >  				exit(-1);
> > -		} else if (matches(opt, "-color") == 0) {
> > -			++color;
> > +		} else if (matches_color(opt, &color) == 0) {
> >  		} else if (matches(opt, "-compressvlans") == 0) {
> >  			++compress_vlans;
> >  		} else if (matches(opt, "-force") == 0) {
> > diff --git a/include/color.h b/include/color.h
> > index 4f2c918db7e43..42038dc2e7f87 100644
> > --- a/include/color.h
> > +++ b/include/color.h
> > @@ -12,8 +12,15 @@ enum color_attr {
> >  	COLOR_NONE
> >  };
> >  
> > +enum color_opt {
> > +	COLOR_OPT_NEVER = 0,
> > +	COLOR_OPT_AUTO = 1,
> > +	COLOR_OPT_ALWAYS = 2
> > +};
> 
> The order of AUTO and ALWAYS is backwards. Existing users who do
> something like:

Ordering color_opt this way felt natural since the default (0)
automatically matches the existing default (no colors).

>     ip -c addr list | less -R
> or
>     ip -c addr list > /tmp/addr
>     less -R /tmp/addr
> 
> should not be affected by this change. That is an existing command that
> works and should continue to work the same after this change.
> 
> Users who add -c but don't want the codes if stdout is not a tty are the
> ones who should be doing something new - be it adding another -c or
> using -c=auto.

My code is correct in that regard: Plain -c[olor] is equivalent to
-color=always.

Cheers, Phil

^ permalink raw reply

* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: Greg KH @ 2018-08-16 15:24 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: maowenan, dwmw2, netdev, eric.dumazet, edumazet, davem, ycheng,
	jdw, stable, Takashi Iwai
In-Reply-To: <20180816123356.47iq3bts427cpptx@unicorn.suse.cz>

On Thu, Aug 16, 2018 at 02:33:56PM +0200, Michal Kubecek wrote:
> On Thu, Aug 16, 2018 at 08:05:50PM +0800, maowenan wrote:
> > On 2018/8/16 19:39, Michal Kubecek wrote:
> > > 
> > > I suspect you may be doing something wrong with your tests. I checked
> > > the segmentsmack testcase and the CPU utilization on receiving side
> > > (with sending 10 times as many packets as default) went down from ~100%
> > > to ~3% even when comparing what is in stable 4.4 now against older 4.4
> > > kernel.
> > 
> > There seems no obvious problem when you send packets with default
> > parameter in Segmentsmack POC, Which is also very related with your
> > server's hardware configuration. Please try with below parameter to
> > form OFO packets
> 
> I did and even with these (questionable, see below) changes, I did not
> get more than 10% (of one core) by receiving ksoftirqd.
> 
> >       for (i = 0; i < 1024; i++)      // 128->1024
> ...
> >       usleep(10*1000); // Adjust this and packet count to match the target!, sleep 100ms->10ms
> 
> The comment in the testcase source suggests to do _one_ of these two
> changes so that you generate 10 times as many packets as the original
> testcase. You did both so that you end up sending 102400 packets per
> second. With 55 byte long packets, this kind of attack requires at least
> 5.5 MB/s (44 Mb/s) of throughput. This is no longer a "low packet rate
> DoS", I'm afraid.
> 
> Anyway, even at this rate, I only get ~10% of one core (Intel E5-2697).
> 
> What I can see, though, is that with current stable 4.4 code, modified
> testcase which sends something like
> 
>   2:3, 3:4, ..., 3001:3002, 3003:3004, 3004:3005, ... 6001:6002, ...
> 
> I quickly eat 6 MB of memory for receive queue of one socket while
> earlier 4.4 kernels only take 200-300 KB. I didn't test latest 4.4 with
> Takashi's follow-up yet but I'm pretty sure it will help while
> preserving nice performance when using the original segmentsmack
> testcase (with increased packet ratio).

Ok, for now I've applied Takashi's fix to the 4.4 stable queue and will
push out a new 4.4-rc later tonight.  Can everyone standardize on that
and test and let me know if it does, or does not, fix the reported
issues?

If not, we can go from there and evaluate this much larger patch series.
But let's try the simple thing first.

thanks,

greg k-h

^ permalink raw reply

* [PATCH 3.18 15/15] Bluetooth: hidp: buffer overflow in hidp_process_report
From: Greg Kroah-Hartman @ 2018-08-16 18:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Salyzyn, Marcel Holtmann,
	Johan Hedberg, David S. Miller, Kees Cook, Benjamin Tissoires,
	linux-bluetooth, netdev, security, kernel-team
In-Reply-To: <20180816171633.546734046@linuxfoundation.org>

3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Salyzyn <salyzyn@android.com>

commit 7992c18810e568b95c869b227137a2215702a805 upstream.

CVE-2018-9363

The buffer length is unsigned at all layers, but gets cast to int and
checked in hidp_process_report and can lead to a buffer overflow.
Switch len parameter to unsigned int to resolve issue.

This affects 3.18 and newer kernels.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: security@kernel.org
Cc: kernel-team@android.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/bluetooth/hidp/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -416,8 +416,8 @@ static void hidp_del_timer(struct hidp_s
 		del_timer(&session->timer);
 }
 
-static void hidp_process_report(struct hidp_session *session,
-				int type, const u8 *data, int len, int intr)
+static void hidp_process_report(struct hidp_session *session, int type,
+				const u8 *data, unsigned int len, int intr)
 {
 	if (len > HID_MAX_BUFFER_SIZE)
 		len = HID_MAX_BUFFER_SIZE;

^ permalink raw reply

* [PATCH 4.4 11/13] Bluetooth: hidp: buffer overflow in hidp_process_report
From: Greg Kroah-Hartman @ 2018-08-16 18:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Salyzyn, Marcel Holtmann,
	Johan Hedberg, David S. Miller, Kees Cook, Benjamin Tissoires,
	linux-bluetooth, netdev, security, kernel-team
In-Reply-To: <20180816171628.554317013@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Salyzyn <salyzyn@android.com>

commit 7992c18810e568b95c869b227137a2215702a805 upstream.

CVE-2018-9363

The buffer length is unsigned at all layers, but gets cast to int and
checked in hidp_process_report and can lead to a buffer overflow.
Switch len parameter to unsigned int to resolve issue.

This affects 3.18 and newer kernels.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: security@kernel.org
Cc: kernel-team@android.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/bluetooth/hidp/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -431,8 +431,8 @@ static void hidp_del_timer(struct hidp_s
 		del_timer(&session->timer);
 }
 
-static void hidp_process_report(struct hidp_session *session,
-				int type, const u8 *data, int len, int intr)
+static void hidp_process_report(struct hidp_session *session, int type,
+				const u8 *data, unsigned int len, int intr)
 {
 	if (len > HID_MAX_BUFFER_SIZE)
 		len = HID_MAX_BUFFER_SIZE;

^ permalink raw reply

* [PATCH 4.9 13/15] Bluetooth: hidp: buffer overflow in hidp_process_report
From: Greg Kroah-Hartman @ 2018-08-16 18:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Salyzyn, Marcel Holtmann,
	Johan Hedberg, David S. Miller, Kees Cook, Benjamin Tissoires,
	linux-bluetooth, netdev, security, kernel-team
In-Reply-To: <20180816171625.340082081@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Salyzyn <salyzyn@android.com>

commit 7992c18810e568b95c869b227137a2215702a805 upstream.

CVE-2018-9363

The buffer length is unsigned at all layers, but gets cast to int and
checked in hidp_process_report and can lead to a buffer overflow.
Switch len parameter to unsigned int to resolve issue.

This affects 3.18 and newer kernels.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: security@kernel.org
Cc: kernel-team@android.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/bluetooth/hidp/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -431,8 +431,8 @@ static void hidp_del_timer(struct hidp_s
 		del_timer(&session->timer);
 }
 
-static void hidp_process_report(struct hidp_session *session,
-				int type, const u8 *data, int len, int intr)
+static void hidp_process_report(struct hidp_session *session, int type,
+				const u8 *data, unsigned int len, int intr)
 {
 	if (len > HID_MAX_BUFFER_SIZE)
 		len = HID_MAX_BUFFER_SIZE;

^ permalink raw reply

* [PATCH 4.14 20/22] Bluetooth: hidp: buffer overflow in hidp_process_report
From: Greg Kroah-Hartman @ 2018-08-16 18:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Salyzyn, Marcel Holtmann,
	Johan Hedberg, David S. Miller, Kees Cook, Benjamin Tissoires,
	linux-bluetooth, netdev, security, kernel-team
In-Reply-To: <20180816171622.831964729@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Salyzyn <salyzyn@android.com>

commit 7992c18810e568b95c869b227137a2215702a805 upstream.

CVE-2018-9363

The buffer length is unsigned at all layers, but gets cast to int and
checked in hidp_process_report and can lead to a buffer overflow.
Switch len parameter to unsigned int to resolve issue.

This affects 3.18 and newer kernels.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: security@kernel.org
Cc: kernel-team@android.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/bluetooth/hidp/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -431,8 +431,8 @@ static void hidp_del_timer(struct hidp_s
 		del_timer(&session->timer);
 }
 
-static void hidp_process_report(struct hidp_session *session,
-				int type, const u8 *data, int len, int intr)
+static void hidp_process_report(struct hidp_session *session, int type,
+				const u8 *data, unsigned int len, int intr)
 {
 	if (len > HID_MAX_BUFFER_SIZE)
 		len = HID_MAX_BUFFER_SIZE;

^ permalink raw reply

* [PATCH 4.17 19/21] Bluetooth: hidp: buffer overflow in hidp_process_report
From: Greg Kroah-Hartman @ 2018-08-16 18:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Salyzyn, Marcel Holtmann,
	Johan Hedberg, David S. Miller, Kees Cook, Benjamin Tissoires,
	linux-bluetooth, netdev, security, kernel-team
In-Reply-To: <20180816171612.136242278@linuxfoundation.org>

4.17-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Salyzyn <salyzyn@android.com>

commit 7992c18810e568b95c869b227137a2215702a805 upstream.

CVE-2018-9363

The buffer length is unsigned at all layers, but gets cast to int and
checked in hidp_process_report and can lead to a buffer overflow.
Switch len parameter to unsigned int to resolve issue.

This affects 3.18 and newer kernels.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: security@kernel.org
Cc: kernel-team@android.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/bluetooth/hidp/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -431,8 +431,8 @@ static void hidp_del_timer(struct hidp_s
 		del_timer(&session->timer);
 }
 
-static void hidp_process_report(struct hidp_session *session,
-				int type, const u8 *data, int len, int intr)
+static void hidp_process_report(struct hidp_session *session, int type,
+				const u8 *data, unsigned int len, int intr)
 {
 	if (len > HID_MAX_BUFFER_SIZE)
 		len = HID_MAX_BUFFER_SIZE;

^ permalink raw reply

* [PATCH 4.18 20/22] Bluetooth: hidp: buffer overflow in hidp_process_report
From: Greg Kroah-Hartman @ 2018-08-16 18:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Salyzyn, Marcel Holtmann,
	Johan Hedberg, David S. Miller, Kees Cook, Benjamin Tissoires,
	linux-bluetooth, netdev, security, kernel-team
In-Reply-To: <20180816171556.502583508@linuxfoundation.org>

4.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Salyzyn <salyzyn@android.com>

commit 7992c18810e568b95c869b227137a2215702a805 upstream.

CVE-2018-9363

The buffer length is unsigned at all layers, but gets cast to int and
checked in hidp_process_report and can lead to a buffer overflow.
Switch len parameter to unsigned int to resolve issue.

This affects 3.18 and newer kernels.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: security@kernel.org
Cc: kernel-team@android.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/bluetooth/hidp/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -431,8 +431,8 @@ static void hidp_del_timer(struct hidp_s
 		del_timer(&session->timer);
 }
 
-static void hidp_process_report(struct hidp_session *session,
-				int type, const u8 *data, int len, int intr)
+static void hidp_process_report(struct hidp_session *session, int type,
+				const u8 *data, unsigned int len, int intr)
 {
 	if (len > HID_MAX_BUFFER_SIZE)
 		len = HID_MAX_BUFFER_SIZE;

^ permalink raw reply

* RE: TJA1100 100Base-T1 PHY features via ethtool?
From: Woojung.Huh @ 2018-08-16 15:58 UTC (permalink / raw)
  To: mgr, f.fainelli; +Cc: davem, netdev, kernel
In-Reply-To: <20180814071332.smqescmspo73zq6l@pengutronix.de>

Hi Florian & Michael,

> > ethtool is being converted to netlink, and that will be a much more
> > flexible interface to work with since it is basically easily extensible
> > (unlike the current ethtool + ioctl approach).
> 
> Yes, netlink sounds absolutely more useful here.
Is ethtool + netlink expected to be merged in net-next soon?
Couldn't find anything on the web except some experimental information.

Thanks.
Woojung

^ permalink raw reply

* Re: [PATCH] dt-bindings: net: ravb: Add support for r8a774a1 SoC
From: David Miller @ 2018-08-16 19:09 UTC (permalink / raw)
  To: fabrizio.castro
  Cc: robh+dt, mark.rutland, sergei.shtylyov, geert+renesas,
	horms+renesas, biju.das, yoshihiro.shimoda.uh, netdev,
	linux-renesas-soc, devicetree, linux-kernel, horms,
	Chris.Paterson2
In-Reply-To: <1534250017-15725-1-git-send-email-fabrizio.castro@bp.renesas.com>

From: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Date: Tue, 14 Aug 2018 13:33:37 +0100

> Document RZ/G2M (R8A774A1) SoC bindings.
> 
> Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
> Reviewed-by: Biju Das <biju.das@bp.renesas.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: David Miller @ 2018-08-16 19:21 UTC (permalink / raw)
  To: jian-hong; +Cc: nic_swsd, netdev, hkallweit1, linux-kernel, linux
In-Reply-To: <20180815062110.16155-1-jian-hong@endlessm.com>

From: <jian-hong@endlessm.com>
Date: Wed, 15 Aug 2018 14:21:10 +0800

> Found the ethernet network on ASUS X441UAR doesn't come back on resume
> from suspend when using MSI-X.  The chip is RTL8106e - version 39.

Heiner, please take a look at this.

You recently disabled MSI-X on RTL8168g for similar reasons.

Now that we've seen two chips like this, maybe there is some other
problem afoot.

Thanks.

^ permalink raw reply

* Re: [PATCH] isdn: Disable IIOCDBGVAR
From: David Miller @ 2018-08-16 19:26 UTC (permalink / raw)
  To: keescook; +Cc: viro, isdn, linux-kernel, netdev
In-Reply-To: <20180815191405.GA29528@beast>

From: Kees Cook <keescook@chromium.org>
Date: Wed, 15 Aug 2018 12:14:05 -0700

> It was possible to directly leak the kernel address where the isdn_dev
> structure pointer was stored. This is a kernel ASLR bypass for anyone
> with access to the ioctl. The code had been present since the beginning
> of git history, though this shouldn't ever be needed for normal operation,
> therefore remove it.
> 
> Reported-by: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Karsten Keil <isdn@linux-pingi.de>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
> netdev doesn't like explict stable markings, so I'll just ask here that it
> get included in -stable please. :)

Applied and queued up for -stable, thanks :)

^ permalink raw reply

* Re: [PATCH net-next] Documentation: networking: ti-cpsw: correct cbs parameters for Eth1 100Mb
From: David Miller @ 2018-08-16 19:27 UTC (permalink / raw)
  To: ivan.khoronzhuk
  Cc: grygorii.strashko, corbet, netdev, linux-doc, linux-kernel
In-Reply-To: <20180815202953.10137-1-ivan.khoronzhuk@linaro.org>

From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Date: Wed, 15 Aug 2018 23:29:53 +0300

> If set cbs parameters calculated for 1000Mb, but use on 100Mb port
> w/o h/w offload (for cpsw offload it doesn't matter), it works
> incorrectly. According to the example and testing board, second port
> is 100Mb interface. Correct them on recalculated for 100Mb interface.
> It allows to use the same command for CBS software implementation for
> board in example.
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 1/1] tap: comment fix
From: David Miller @ 2018-08-16 19:30 UTC (permalink / raw)
  To: jianjian.wang1
  Cc: jasowang, girish.moodalbail, mst, willemb, viro, wexu, netdev,
	linux-kernel
In-Reply-To: <CAP4sYWUDb8wtNVvgPO+3a5MWoiBGuafQpvG6KR30RWinf36Msw@mail.gmail.com>

From: Wang Jian <jianjian.wang1@gmail.com>
Date: Thu, 16 Aug 2018 21:01:27 +0800

> The tap_queue and the "tap_dev" are loosely coupled, not "macvlan_dev".
> 
> And I also change one rcu_read_lock's place, seems can reduce rcu
> critical section a little.
> 
> Signed-off-by: Wang Jian <jianjian.wang1@gmail.com>

This patch was corrupted by your email client, for example it turned
TAB characters into sequences of spaces.

Please fix this, email a test patch to yourself, and do not resend the
patch to this mailing list until you can successfully extract and
cleanly apply the test patch you email to yourself.

Thank you.

^ permalink raw reply

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: Heiner Kallweit @ 2018-08-16 19:37 UTC (permalink / raw)
  To: David Miller, jian-hong; +Cc: nic_swsd, netdev, linux-kernel, linux
In-Reply-To: <20180816.122131.604270853620318143.davem@davemloft.net>

On 16.08.2018 21:21, David Miller wrote:
> From: <jian-hong@endlessm.com>
> Date: Wed, 15 Aug 2018 14:21:10 +0800
> 
>> Found the ethernet network on ASUS X441UAR doesn't come back on resume
>> from suspend when using MSI-X.  The chip is RTL8106e - version 39.
> 
> Heiner, please take a look at this.
> 
> You recently disabled MSI-X on RTL8168g for similar reasons.
> 
> Now that we've seen two chips like this, maybe there is some other
> problem afoot.
> 
Thanks for the hint. I saw it already and just contacted Realtek
whether they are aware of any MSI-X issues with particular chip
versions. With the chip versions I have access to MSI-X works fine.

There's also the theoretical option that the issues are caused by
broken BIOS's. But so far only chip versions have been reported
which are very similar, at least with regard to version number
(2x VER_40, 1x VER_39). So they may share some buggy component.

Let's see whether Realtek can provide some hint.
If more chip versions are reported having problems with MSI-X,
then we could switch to a whitelist or disable MSI-X in general.

Heiner

> Thanks.
> 

^ permalink raw reply

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: David Miller @ 2018-08-16 19:39 UTC (permalink / raw)
  To: hkallweit1; +Cc: jian-hong, nic_swsd, netdev, linux-kernel, linux
In-Reply-To: <458efbf9-5971-653a-e7cd-8c56ba055648@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Thu, 16 Aug 2018 21:37:31 +0200

> On 16.08.2018 21:21, David Miller wrote:
>> From: <jian-hong@endlessm.com>
>> Date: Wed, 15 Aug 2018 14:21:10 +0800
>> 
>>> Found the ethernet network on ASUS X441UAR doesn't come back on resume
>>> from suspend when using MSI-X.  The chip is RTL8106e - version 39.
>> 
>> Heiner, please take a look at this.
>> 
>> You recently disabled MSI-X on RTL8168g for similar reasons.
>> 
>> Now that we've seen two chips like this, maybe there is some other
>> problem afoot.
>> 
> Thanks for the hint. I saw it already and just contacted Realtek
> whether they are aware of any MSI-X issues with particular chip
> versions. With the chip versions I have access to MSI-X works fine.
> 
> There's also the theoretical option that the issues are caused by
> broken BIOS's. But so far only chip versions have been reported
> which are very similar, at least with regard to version number
> (2x VER_40, 1x VER_39). So they may share some buggy component.
> 
> Let's see whether Realtek can provide some hint.
> If more chip versions are reported having problems with MSI-X,
> then we could switch to a whitelist or disable MSI-X in general.

It could be that we need to reprogram some register(s) on resume,
which normally might not be needed, and that is what is causing the
problem with some chips.

^ permalink raw reply

* [bpf-next RFC 0/3] Introduce eBPF flow dissector
From: Petar Penkov @ 2018-08-16 16:44 UTC (permalink / raw)
  To: netdev; +Cc: davem, ast, daniel, simon.horman, Petar Penkov

From: Petar Penkov <ppenkov@google.com>

This patch series hardens the RX stack by allowing flow dissection in BPF,
as previously discussed [1]. Because of the rigorous checks of the BPF
verifier, this provides significant security guarantees. In particular, the
BPF flow dissector cannot get inside of an infinite loop, as with
CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
read outside of packet bounds, because all memory accesses are checked.
Also, with BPF the administrator can decide which protocols to support,
reducing potential attack surface. Rarely encountered protocols can be
excluded from dissection and the program can be updated without kernel
recompile or reboot if a bug is discovered.

Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
This includes a new BPF program and attach type.

Patch 2 adds a flow dissector program in BPF. This parses most protocols in
__skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
and address types).

Patch 3 adds a selftest that attaches the BPF program to the flow dissector
and sends traffic with different levels of encapsulation.

This RFC patchset exposes a few design considerations:

1/ Because the flow dissector key definitions live in
include/linux/net/flow_dissector.h, they are not visible from userspace,
and the flow keys definitions need to be copied in the BPF program.

2/ An alternative to adding a new hook would have been to attach flow
dissection programs at the XDP hook. Because this hook is executed before
GRO, it would have to execute on every MSS, which would be more
computationally expensive. Furthermore, the XDP hook is executed before an
SKB has been allocated and there is no clear way to move the dissected keys
into the SKB after it has been allocated. Eventually, perhaps a single pass
can implement both GRO and flow dissection -- but napi_gro_cb shows that a
lot more flow state would need to be parsed for this.

3/ The BPF program cannot use direct packet access everywhere because it
uses an offset, initially supplied by the flow dissector.  Because the
initial value of this non-constant offset comes from outside of the
program, the verifier does not know what its value is, and it cannot verify
that it is within packet bounds. Therefore, direct packet access programs
get rejected.

4/ Loading and attaching the BPF program requires capable(), as opposed to
ns_capable(), because a malicious program might be able to return bad
values that would trigger bugs in the kernel, such as the nhoff value bug
fixed in commit 324f8305e59b ("net-backports: flow_dissector: properly cap
thoff field").

[1] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf

Petar Penkov (3):
  flow_dissector: implements flow dissector BPF hook
  flow_dissector: implements eBPF parser
  selftests/bpf: test bpf flow dissection

 include/linux/bpf_types.h                     |   1 +
 include/linux/skbuff.h                        |   7 +
 include/net/flow_dissector.h                  |  16 +
 include/uapi/linux/bpf.h                      |  14 +-
 kernel/bpf/syscall.c                          |   8 +
 kernel/bpf/verifier.c                         |   2 +
 net/core/filter.c                             | 157 ++++
 net/core/flow_dissector.c                     |  76 ++
 tools/bpf/bpftool/prog.c                      |   1 +
 tools/include/uapi/linux/bpf.h                |   5 +-
 tools/lib/bpf/libbpf.c                        |   2 +
 tools/testing/selftests/bpf/.gitignore        |   2 +
 tools/testing/selftests/bpf/Makefile          |   8 +-
 tools/testing/selftests/bpf/bpf_flow.c        | 542 ++++++++++++
 tools/testing/selftests/bpf/bpf_helpers.h     |   3 +
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/flow_dissector_load.c       | 140 ++++
 .../selftests/bpf/test_flow_dissector.c       | 782 ++++++++++++++++++
 .../selftests/bpf/test_flow_dissector.sh      | 115 +++
 tools/testing/selftests/bpf/with_addr.sh      |  54 ++
 tools/testing/selftests/bpf/with_tunnels.sh   |  36 +
 21 files changed, 1967 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/bpf_flow.c
 create mode 100644 tools/testing/selftests/bpf/flow_dissector_load.c
 create mode 100644 tools/testing/selftests/bpf/test_flow_dissector.c
 create mode 100755 tools/testing/selftests/bpf/test_flow_dissector.sh
 create mode 100755 tools/testing/selftests/bpf/with_addr.sh
 create mode 100755 tools/testing/selftests/bpf/with_tunnels.sh

-- 
2.18.0.865.gffc8e1a3cd6-goog

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox