Netdev List
* Re: [v3] net_sched: act_police: add 2 new attributes to support police 64bit rate and peakrate
From: Cong Wang @ 2019-09-05 17:04 UTC (permalink / raw)
  To: David Dai
  Cc: Jamal Hadi Salim, Jiri Pirko, David Miller,
	Linux Kernel Network Developers, LKML, zdai
In-Reply-To: <1567609423-26826-1-git-send-email-zdai@linux.vnet.ibm.com>

On Wed, Sep 4, 2019 at 8:03 AM David Dai <zdai@linux.vnet.ibm.com> wrote:
>
> For high-speed adapters like the Mellanox CX-5 card, bandwidth can reach up
> to 100 Gbits per second. Currently htb already supports 64bit rate in the tc
> utility. However, police action rate and peakrate are still limited to a
> 32bit value (up to 32 Gbits per second). Add 2 new attributes,
> TCA_POLICE_RATE64 and TCA_POLICE_PEAKRATE64, in the kernel for 64bit support
> so that the tc utility can use them for 64bit rate and peakrate values to
> break the 32bit limit while still keeping backward binary compatibility.
>
> Tested-by: David Dai <zdai@linux.vnet.ibm.com>
> Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>

Thanks.
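
For readers following along, here is a minimal sketch of why a 32-bit rate field runs out and how a 64-bit attribute can sit beside it without breaking old binaries. It is not the patch itself: `struct rate_msg`, `encode_rate()` and `decode_rate()` are hypothetical names, and the sketch assumes the legacy field carries bytes per second and is saturated at `UINT32_MAX` when the real rate does not fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message: a legacy 32-bit field plus an optional 64-bit
 * attribute, loosely modelled on the scheme described in the patch. */
struct rate_msg {
	uint32_t rate32;     /* legacy field, bytes per second (assumed) */
	bool     has_rate64; /* true when the 64-bit attribute is present */
	uint64_t rate64;     /* full rate, bytes per second */
};

/* Convert bits per second to bytes per second. */
static uint64_t bps_to_bytes(uint64_t bits_per_sec)
{
	return bits_per_sec / 8;
}

/* Sender side: always fill the legacy field, and add the 64-bit
 * attribute only when the rate does not fit in 32 bits. */
static struct rate_msg encode_rate(uint64_t bytes_per_sec)
{
	struct rate_msg m = { 0 };

	if (bytes_per_sec > UINT32_MAX) {
		m.rate32 = UINT32_MAX; /* saturate for old receivers */
		m.has_rate64 = true;
		m.rate64 = bytes_per_sec;
	} else {
		m.rate32 = (uint32_t)bytes_per_sec;
	}
	return m;
}

/* Receiver side: prefer the 64-bit attribute when it is present. */
static uint64_t decode_rate(const struct rate_msg *m)
{
	assert(m != NULL);
	return m->has_rate64 ? m->rate64 : m->rate32;
}
```

With these helpers, 100 Gbit/s becomes 12,500,000,000 bytes per second, which exceeds `UINT32_MAX` (4,294,967,295), so the 64-bit attribute is emitted; 1 Gbit/s still fits in the legacy field and old receivers see no change.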

* Re: [PATCH] net/skbuff: silence warnings under memory pressure
From: Steven Rostedt @ 2019-09-05 17:14 UTC (permalink / raw)
  To: Qian Cai
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, Michal Hocko,
	Eric Dumazet, davem, netdev, linux-mm, linux-kernel
In-Reply-To: <1567699393.5576.96.camel@lca.pw>

On Thu, 05 Sep 2019 12:03:13 -0400
Qian Cai <cai@lca.pw> wrote:

> > > and could deal with console hardware that involve irq_exit() anyway.  
> > 
> > printk->console_driver->write() does not involve irq.  
> 
> Hmm, from the article,
> 
> https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter
> 
> "Since transmission of a single or multiple characters may take a long time
> relative to CPU speeds, a UART maintains a flag showing busy status so that the
> host system knows if there is at least one character in the transmit buffer or
> shift register; "ready for next character(s)" may also be signaled with an
> interrupt."

I'm pretty sure all serial consoles do a busy loop on the UART and not
use interrupts to notify when it's available. That would require an
asynchronous implementation of printk() which would be quite complex to
implement.

-- Steve
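
As a rough illustration of the busy-loop approach described above, here is a userspace sketch of a polled write. Nothing here is taken from a real console driver: `struct fake_uart`, `FAKE_LSR_THRE` and the helper functions are invented for the example, and the "hardware" is just a struct that becomes ready again after a fixed number of status reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_LSR_THRE 0x20u /* hypothetical "transmitter ready" status bit */

/* Memory-backed stand-in for a UART, for illustration only. */
struct fake_uart {
	uint8_t  lsr;        /* status register */
	char     out[64];    /* bytes "transmitted" so far */
	size_t   out_len;
	unsigned busy_polls; /* status reads until ready again after a byte */
	unsigned countdown;  /* remaining status reads before ready */
};

/* Reading the status register advances the simulated hardware. */
static uint8_t fake_uart_read_lsr(struct fake_uart *u)
{
	if (u->countdown > 0 && --u->countdown == 0)
		u->lsr |= FAKE_LSR_THRE;
	return u->lsr;
}

/* Writing a byte makes the simulated transmitter busy for a while. */
static void fake_uart_write_thr(struct fake_uart *u, char c)
{
	if (u->out_len < sizeof(u->out))
		u->out[u->out_len++] = c;
	if (u->busy_polls > 0) {
		u->lsr &= (uint8_t)~FAKE_LSR_THRE;
		u->countdown = u->busy_polls;
	}
}

/* Polled write: spin on the status bit, send one byte, repeat.
 * No interrupt is involved. Returns the number of wasted polls. */
static unsigned long polled_write(struct fake_uart *u, const char *s, size_t n)
{
	unsigned long polls = 0;

	assert(u != NULL && s != NULL);
	for (size_t i = 0; i < n; i++) {
		while (!(fake_uart_read_lsr(u) & FAKE_LSR_THRE))
			polls++;
		fake_uart_write_thr(u, s[i]);
	}
	return polls;
}
```

The caller only returns once every byte has been handed to the device, which is what keeps the synchronous path simple at the cost of spinning.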

* Re: [PATCH 1/2] net: phy: dp83867: Add documentation for SGMII mode type
From: Trent Piepho @ 2019-09-05 17:06 UTC (permalink / raw)
  To: vitaly.gaiduk@cloudbear.ru, davem@davemloft.net,
	f.fainelli@gmail.com, robh+dt@kernel.org
  Cc: mark.rutland@arm.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, andrew@lunn.ch,
	devicetree@vger.kernel.org
In-Reply-To: <1567700761-14195-2-git-send-email-vitaly.gaiduk@cloudbear.ru>

On Thu, 2019-09-05 at 19:26 +0300, Vitaly Gaiduk wrote:
> Add documentation of ti,sgmii-type which can be used to select
> SGMII mode type (4 or 6-wire).
> 
> Signed-off-by: Vitaly Gaiduk <vitaly.gaiduk@cloudbear.ru>
> ---
>  Documentation/devicetree/bindings/net/ti,dp83867.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/ti,dp83867.txt b/Documentation/devicetree/bindings/net/ti,dp83867.txt
> index db6aa3f2215b..18e7fd52897f 100644
> --- a/Documentation/devicetree/bindings/net/ti,dp83867.txt
> +++ b/Documentation/devicetree/bindings/net/ti,dp83867.txt
> @@ -37,6 +37,7 @@ Optional property:
>  			      for applicable values.  The CLK_OUT pin can also
>  			      be disabled by this property.  When omitted, the
>  			      PHY's default will be left as is.
> +	- ti,sgmii-type - This denotes the fact which SGMII mode is used (4 or 6-wire).

Really should explain what kind of value it is and what the values
mean.  I.e., should this be ti,sgmii-type = <4> to select 4 wire?

Maybe a boolean, "sgmii-clock", to indicate the presence of sgmii rx
clock lines, would make more sense?

I also wonder if phy-mode = "sgmii-clk" or "sgmii-6wire", vs the
existing phy-mode = "sgmii", might also be a better way to describe
this instead of a new property.

* Re: [PATCH] net/skbuff: silence warnings under memory pressure
From: Steven Rostedt @ 2019-09-05 17:23 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Qian Cai, Petr Mladek, Sergey Senozhatsky, Michal Hocko,
	Eric Dumazet, davem, netdev, linux-mm, linux-kernel
In-Reply-To: <20190905113208.GA521@jagdpanzerIV>

On Thu, 5 Sep 2019 20:32:08 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> I think we can queue significantly fewer irq_work-s from printk().
> 
> Petr, Steven, what do you think?

What if we just rate limit the wake ups of klogd? I mean, really, do we
need to keep calling wake up if it probably never even executed?

-- Steve
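
One possible shape for such a limit, as a userspace sketch only: allow a wake-up if none has been issued within a minimum interval, otherwise count it as suppressed. `struct wake_limiter` and `wake_allowed()` are hypothetical names, time is passed in explicitly so the logic can be exercised without a clock, and nothing here reflects how printk or klogd are actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limiter state for wake-up requests. */
struct wake_limiter {
	uint64_t      min_interval_ns; /* minimum spacing between wake-ups */
	uint64_t      last_wake_ns;    /* time of the last allowed wake-up */
	bool          woke_before;     /* false until the first wake-up */
	unsigned long suppressed;      /* wake-ups skipped so far */
};

/* Return true if the caller should issue a wake-up at time now_ns. */
static bool wake_allowed(struct wake_limiter *l, uint64_t now_ns)
{
	assert(l != NULL);

	if (l->woke_before &&
	    now_ns - l->last_wake_ns < l->min_interval_ns) {
		l->suppressed++;
		return false;
	}
	l->woke_before = true;
	l->last_wake_ns = now_ns;
	return true;
}
```

A real implementation would also have to make sure that the last suppressed request is not lost, for example by issuing one deferred wake-up once the interval expires; the sketch leaves that out.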

* Re: [PATCH v2 1/2] ethtool: implement Energy Detect Powerdown support via phy-tunable
From: Florian Fainelli @ 2019-09-05 17:23 UTC (permalink / raw)
  To: Ardelean, Alexandru, andrew@lunn.ch
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, hkallweit1@gmail.com
In-Reply-To: <361eb94a4da73d1fa21893e8e294639f0fc0bcd2.camel@analog.com>

On 9/4/19 11:25 PM, Ardelean, Alexandru wrote:
> On Wed, 2019-09-04 at 21:53 +0200, Andrew Lunn wrote:
>> [External]
>>
>> On Wed, Sep 04, 2019 at 07:23:21PM +0300, Alexandru Ardelean wrote:
>>
>> Hi Alexandru
>>
>> Somewhere we need a comment stating what EDPD means. Here would be a
>> good place.
> 
> ack
> 
>>
>>> +#define ETHTOOL_PHY_EDPD_DFLT_TX_INTERVAL	0x7fff
>>> +#define ETHTOOL_PHY_EDPD_NO_TX			0x8000
>>> +#define ETHTOOL_PHY_EDPD_DISABLE		0
>>
>> I think you are passing a u16. So why not 0xfffe and 0xffff?  We also
>> need to make it clear what the units are for interval. This file
> 
> I initially thought about keeping this u8 and going with 0xff & 0xfe.
> But 254 or 253 could be too small to specify the value of an interval.
> 
> Also (maybe due to all the coding patterns that I have seen over time), I feel that I should add a
> flag somewhere.
> 
> Bottom line is: 0xfffe and 0xffff also work from my side, if it is acceptable (by the community).
> 
> Another approach I considered, was to maybe have this EDPD just do enable & disable (which is sufficient for the `adin`
> PHY & `micrel` as well).
> That would mean that if we would ever want to configure the TX interval (in the future), we would need an extra PHY-
> tunable parameter just for that; because changing the enable/disable behavior would be dangerous.
> And also, deferring the TX-interval configuration, does not sound like good design/pattern, since it can allow for tons
> of PHY-tunable parameters for every little knob.

It seems to me that the interval is a better way to deal with that: if
you specify a non-zero interval, you enable EDPD, even if your PHY can
only act on an enable/disable bit. For PHYs that do support setting a TX
interval, the non-zero interval can be translated into whatever unit is
appropriate. In all cases, a 0 interval means disable.

Andrew, does that work for you?
-- 
Florian
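
To make the proposed semantics concrete, here is a small sketch, assuming a driver-side helper that receives the tunable value. `struct fake_phy` and `phy_set_edpd()` are hypothetical and do not correspond to any existing phylib function; the only value taken from the quoted patch is that 0 means disable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EDPD_DISABLE 0 /* same meaning as ETHTOOL_PHY_EDPD_DISABLE above */

/* Hypothetical PHY model for illustration. */
struct fake_phy {
	bool     has_tx_interval; /* can the hardware program an interval? */
	bool     edpd_enabled;
	uint16_t tx_interval;     /* programmed interval, units left open */
};

/* Apply the tunable: 0 disables EDPD, any non-zero value enables it.
 * PHYs with only an enable/disable bit ignore the interval itself. */
static void phy_set_edpd(struct fake_phy *phy, uint16_t interval)
{
	assert(phy != NULL);

	if (interval == EDPD_DISABLE) {
		phy->edpd_enabled = false;
		phy->tx_interval = 0;
		return;
	}
	phy->edpd_enabled = true;
	phy->tx_interval = phy->has_tx_interval ? interval : 0;
}
```

The point of the sketch is that a single value covers both kinds of hardware, so no second tunable is needed later for PHYs that gain interval support.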

* Applied "spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages" to the spi tree
From: Mark Brown @ 2019-09-05 17:39 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown,
	mlichvar, netdev, richardcochran
In-Reply-To: <20190905010114.26718-2-olteanv@gmail.com>

The patch

   spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages

has been applied to the spi tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.4

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

From d1c44c9342c17e3314371325d9272684a075b65c Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <olteanv@gmail.com>
Date: Thu, 5 Sep 2019 04:01:11 +0300
Subject: [PATCH] spi: Use an abbreviated pointer to ctlr->cur_msg in
 __spi_pump_messages

This helps a bit with line fitting now (the list_first_entry call) as
well as during the next patch which needs to iterate through all
transfers of ctlr->cur_msg so it timestamps them.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20190905010114.26718-2-olteanv@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 drivers/spi/spi.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index aef55acb5ccd..b2890923d256 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1265,8 +1265,9 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer);
  */
 static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 {
-	unsigned long flags;
+	struct spi_message *msg;
 	bool was_busy = false;
+	unsigned long flags;
 	int ret;
 
 	/* Lock queue */
@@ -1325,10 +1326,10 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	}
 
 	/* Extract head of queue */
-	ctlr->cur_msg =
-		list_first_entry(&ctlr->queue, struct spi_message, queue);
+	msg = list_first_entry(&ctlr->queue, struct spi_message, queue);
+	ctlr->cur_msg = msg;
 
-	list_del_init(&ctlr->cur_msg->queue);
+	list_del_init(&msg->queue);
 	if (ctlr->busy)
 		was_busy = true;
 	else
@@ -1361,7 +1362,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 			if (ctlr->auto_runtime_pm)
 				pm_runtime_put(ctlr->dev.parent);
 
-			ctlr->cur_msg->status = ret;
+			msg->status = ret;
 			spi_finalize_current_message(ctlr);
 
 			mutex_unlock(&ctlr->io_mutex);
@@ -1369,28 +1370,28 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		}
 	}
 
-	trace_spi_message_start(ctlr->cur_msg);
+	trace_spi_message_start(msg);
 
 	if (ctlr->prepare_message) {
-		ret = ctlr->prepare_message(ctlr, ctlr->cur_msg);
+		ret = ctlr->prepare_message(ctlr, msg);
 		if (ret) {
 			dev_err(&ctlr->dev, "failed to prepare message: %d\n",
 				ret);
-			ctlr->cur_msg->status = ret;
+			msg->status = ret;
 			spi_finalize_current_message(ctlr);
 			goto out;
 		}
 		ctlr->cur_msg_prepared = true;
 	}
 
-	ret = spi_map_msg(ctlr, ctlr->cur_msg);
+	ret = spi_map_msg(ctlr, msg);
 	if (ret) {
-		ctlr->cur_msg->status = ret;
+		msg->status = ret;
 		spi_finalize_current_message(ctlr);
 		goto out;
 	}
 
-	ret = ctlr->transfer_one_message(ctlr, ctlr->cur_msg);
+	ret = ctlr->transfer_one_message(ctlr, msg);
 	if (ret) {
 		dev_err(&ctlr->dev,
 			"failed to transfer one message from queue\n");
-- 
2.20.1


* [PATCH bpf-next] kbuild: replace BASH-specific ${@:2} with shift and ${@}
From: Andrii Nakryiko @ 2019-09-05 17:59 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Stephen Rothwell,
	Masahiro Yamada

${@:2} is a BASH-specific extension, which makes link-vmlinux.sh rely on
BASH. Use shift and ${@} instead to fix this issue.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs")
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 scripts/link-vmlinux.sh | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 0d8f41db8cd6..8c59970a09dc 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -57,12 +57,16 @@ modpost_link()
 
 # Link of vmlinux
 # ${1} - output file
-# ${@:2} - optional extra .o files
+# ${2}, ${3}, ... - optional extra .o files
 vmlinux_link()
 {
 	local lds="${objtree}/${KBUILD_LDS}"
+	local output=${1}
 	local objects
 
+	# skip output file argument
+	shift
+
 	if [ "${SRCARCH}" != "um" ]; then
 		objects="--whole-archive			\
 			${KBUILD_VMLINUX_OBJS}			\
@@ -70,9 +74,10 @@ vmlinux_link()
 			--start-group				\
 			${KBUILD_VMLINUX_LIBS}			\
 			--end-group				\
-			${@:2}"
+			${@}"
 
-		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} -o ${1}	\
+		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}	\
+			-o ${output}				\
 			-T ${lds} ${objects}
 	else
 		objects="-Wl,--whole-archive			\
@@ -81,9 +86,10 @@ vmlinux_link()
 			-Wl,--start-group			\
 			${KBUILD_VMLINUX_LIBS}			\
 			-Wl,--end-group				\
-			${@:2}"
+			${@}"
 
-		${CC} ${CFLAGS_vmlinux} -o ${1}			\
+		${CC} ${CFLAGS_vmlinux}				\
+			-o ${output}				\
 			-Wl,-T,${lds}				\
 			${objects}				\
 			-lutil -lrt -lpthread
-- 
2.21.0


* [patch net-next] net: fib_notifier: move fib_notifier_ops from struct net into per-net struct
From: Jiri Pirko @ 2019-09-05 18:06 UTC (permalink / raw)
  To: netdev; +Cc: davem, idosch, dsahern, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

No need for fib_notifier_ops to be in struct net. It is used only by
fib_notifier as private data. Use net_generic to introduce a per-net
fib_notifier struct and move fib_notifier_ops there.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/net_namespace.h |  3 ---
 net/core/fib_notifier.c     | 29 +++++++++++++++++++++++------
 2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index ab40d7afdc54..64bcb589a610 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -103,9 +103,6 @@ struct net {
 	/* core fib_rules */
 	struct list_head	rules_ops;
 
-	struct list_head	fib_notifier_ops;  /* Populated by
-						    * register_pernet_subsys()
-						    */
 	struct net_device       *loopback_dev;          /* The loopback */
 	struct netns_core	core;
 	struct netns_mib	mib;
diff --git a/net/core/fib_notifier.c b/net/core/fib_notifier.c
index 13a40b831d6d..470a606d5e8d 100644
--- a/net/core/fib_notifier.c
+++ b/net/core/fib_notifier.c
@@ -5,8 +5,15 @@
 #include <linux/module.h>
 #include <linux/init.h>
 #include <net/net_namespace.h>
+#include <net/netns/generic.h>
 #include <net/fib_notifier.h>
 
+static unsigned int fib_notifier_net_id;
+
+struct fib_notifier_net {
+	struct list_head fib_notifier_ops;
+};
+
 static ATOMIC_NOTIFIER_HEAD(fib_chain);
 
 int call_fib_notifier(struct notifier_block *nb, struct net *net,
@@ -34,6 +41,7 @@ EXPORT_SYMBOL(call_fib_notifiers);
 
 static unsigned int fib_seq_sum(void)
 {
+	struct fib_notifier_net *fn_net;
 	struct fib_notifier_ops *ops;
 	unsigned int fib_seq = 0;
 	struct net *net;
@@ -41,8 +49,9 @@ static unsigned int fib_seq_sum(void)
 	rtnl_lock();
 	down_read(&net_rwsem);
 	for_each_net(net) {
+		fn_net = net_generic(net, fib_notifier_net_id);
 		rcu_read_lock();
-		list_for_each_entry_rcu(ops, &net->fib_notifier_ops, list) {
+		list_for_each_entry_rcu(ops, &fn_net->fib_notifier_ops, list) {
 			if (!try_module_get(ops->owner))
 				continue;
 			fib_seq += ops->fib_seq_read(net);
@@ -58,9 +67,10 @@ static unsigned int fib_seq_sum(void)
 
 static int fib_net_dump(struct net *net, struct notifier_block *nb)
 {
+	struct fib_notifier_net *fn_net = net_generic(net, fib_notifier_net_id);
 	struct fib_notifier_ops *ops;
 
-	list_for_each_entry_rcu(ops, &net->fib_notifier_ops, list) {
+	list_for_each_entry_rcu(ops, &fn_net->fib_notifier_ops, list) {
 		int err;
 
 		if (!try_module_get(ops->owner))
@@ -127,12 +137,13 @@ EXPORT_SYMBOL(unregister_fib_notifier);
 static int __fib_notifier_ops_register(struct fib_notifier_ops *ops,
 				       struct net *net)
 {
+	struct fib_notifier_net *fn_net = net_generic(net, fib_notifier_net_id);
 	struct fib_notifier_ops *o;
 
-	list_for_each_entry(o, &net->fib_notifier_ops, list)
+	list_for_each_entry(o, &fn_net->fib_notifier_ops, list)
 		if (ops->family == o->family)
 			return -EEXIST;
-	list_add_tail_rcu(&ops->list, &net->fib_notifier_ops);
+	list_add_tail_rcu(&ops->list, &fn_net->fib_notifier_ops);
 	return 0;
 }
 
@@ -167,18 +178,24 @@ EXPORT_SYMBOL(fib_notifier_ops_unregister);
 
 static int __net_init fib_notifier_net_init(struct net *net)
 {
-	INIT_LIST_HEAD(&net->fib_notifier_ops);
+	struct fib_notifier_net *fn_net = net_generic(net, fib_notifier_net_id);
+
+	INIT_LIST_HEAD(&fn_net->fib_notifier_ops);
 	return 0;
 }
 
 static void __net_exit fib_notifier_net_exit(struct net *net)
 {
-	WARN_ON_ONCE(!list_empty(&net->fib_notifier_ops));
+	struct fib_notifier_net *fn_net = net_generic(net, fib_notifier_net_id);
+
+	WARN_ON_ONCE(!list_empty(&fn_net->fib_notifier_ops));
 }
 
 static struct pernet_operations fib_notifier_net_ops = {
 	.init = fib_notifier_net_init,
 	.exit = fib_notifier_net_exit,
+	.id = &fib_notifier_net_id,
+	.size = sizeof(struct fib_notifier_net),
 };
 
 static int __init fib_notifier_init(void)
-- 
2.21.0


* Re: rtnl_lock() question
From: Rustad, Mark D @ 2019-09-05 18:07 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: jonathan.lemon@gmail.com, eric.dumazet@gmail.com,
	netdev@vger.kernel.org
In-Reply-To: <867cf373f204715aec3b2e04ef9f65454cf25a2e.camel@mellanox.com>

On Sep 4, 2019, at 4:23 PM, Saeed Mahameed <saeedm@mellanox.com> wrote:

> some allocations require parameters that should remain valid and
> constant across the whole reconfiguration procedure, such as
> params.num_channels, so they must be done inside the lock.

You could always check if those parameters have changed once under the lock  
and, if they did, drop the lock, reallocate and try again. Since such  
changes should be very infrequent, this is something that really should not  
loop multiple times.

--
Mark Rustad, Networking Division, Intel Corporation
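
A minimal userspace sketch of that pattern, using a pthread mutex in place of rtnl_lock(): snapshot the parameter under the lock, allocate with the lock dropped, then re-check under the lock and retry if the parameter changed. `struct dev_cfg` and `reconfigure()` are hypothetical and are not taken from any driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical device configuration guarded by a lock. */
struct dev_cfg {
	pthread_mutex_t lock;
	unsigned        num_channels; /* may change while the lock is dropped */
	int            *channels;     /* sized for channels_len entries */
	unsigned        channels_len;
};

/* Returns 0 on success, -1 if the allocation fails. */
static int reconfigure(struct dev_cfg *cfg)
{
	assert(cfg != NULL);

	for (;;) {
		unsigned want;
		int *buf;

		/* Snapshot the parameter while holding the lock. */
		pthread_mutex_lock(&cfg->lock);
		want = cfg->num_channels;
		pthread_mutex_unlock(&cfg->lock);

		/* Allocate with the lock dropped. */
		buf = calloc(want ? want : 1, sizeof(*buf));
		if (!buf)
			return -1;

		/* Re-check under the lock; commit only if still valid. */
		pthread_mutex_lock(&cfg->lock);
		if (cfg->num_channels == want) {
			free(cfg->channels);
			cfg->channels = buf;
			cfg->channels_len = want;
			pthread_mutex_unlock(&cfg->lock);
			return 0;
		}
		pthread_mutex_unlock(&cfg->lock);

		/* The parameter changed in between: discard and retry. */
		free(buf);
	}
}
```

Because changes to the parameter are expected to be rare, the loop should almost always finish on the first pass.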

* [PATCH net] net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list
From: Shmulik Ladkani @ 2019-09-05 18:36 UTC (permalink / raw)
  To: Daniel Borkmann, Eric Dumazet, Willem de Bruijn
  Cc: eyal, netdev, Shmulik Ladkani, Alexander Duyck

Historically, support for frag_list packets entering skb_segment() was
limited to frag_list members terminating on exact same gso_size
boundaries. This is verified with a BUG_ON since commit 89319d3801d1
("net: Add frag_list support to skb_segment"), quote:

    As such we require all frag_list members terminate on exact MSS
    boundaries.  This is checked using BUG_ON.
    As there should only be one producer in the kernel of such packets,
    namely GRO, this requirement should not be difficult to maintain.

However, since commit 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper"),
the "exact MSS boundaries" assumption no longer holds:
An eBPF program using bpf_skb_change_proto() DOES modify 'gso_size', but
leaves the frag_list members as originally merged by GRO with the
original 'gso_size'. Examples of such programs are bpf-based NAT46 or
NAT64.

This leads to a kernel BUG_ON for flows involving:
 - GRO generating a frag_list skb
 - bpf program performing bpf_skb_change_proto() or bpf_skb_adjust_room()
 - skb_segment() of the skb

See example BUG_ON reports in [0].

In commit 13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb"),
skb_segment() was modified to support the "gso_size mangling" case of
a frag_list GRO'ed skb, but *only* for frag_list members having
head_frag==true (having a page-fragment head).

Alas, GRO packets having frag_list members with a linear kmalloced head
(head_frag==false) still hit the BUG_ON.

This commit adds support to skb_segment() for a 'head_skb' packet having
a frag_list whose members are *non* head_frag, with gso_size mangled, by
disabling SG and thus falling-back to copying the data from the given
'head_skb' into the generated segmented skbs - as suggested by Willem de
Bruijn [1].

Since this approach involves the penalty of skb_copy_and_csum_bits()
when building the segments, care was taken in order to enable this
solution only when required:
 - untrusted gso_size, by testing SKB_GSO_DODGY is set
   (SKB_GSO_DODGY is set by any gso_size mangling functions in
    net/core/filter.c)
 - the frag_list is non empty, its item is a non head_frag, *and* the
   headlen of the given 'head_skb' does not match the gso_size.

[0]
https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/
https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/

[1]
https://lore.kernel.org/netdev/CA+FuTSfVsgNDi7c=GUU8nMg2hWxF2SjCNLXetHeVPdnxAW5K-w@mail.gmail.com/

Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper")
Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---
 net/core/skbuff.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ea8e8d332d85..c4bd1881acff 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3678,6 +3678,24 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 	sg = !!(features & NETIF_F_SG);
 	csum = !!can_checksum_protocol(features, proto);
 
+	if (mss != GSO_BY_FRAGS &&
+	    (skb_shinfo(head_skb)->gso_type & SKB_GSO_DODGY)) {
+		/* gso_size is untrusted.
+		 *
+		 * If head_skb has a frag_list with a linear non head_frag
+		 * item, and head_skb's headlen does not fit requested
+		 * gso_size, fall back to copying the skbs - by disabling sg.
+		 *
+		 * We assume checking the first frag suffices, i.e if either of
+		 * the frags have non head_frag data, then the first frag is
+		 * too.
+		 */
+		if (list_skb && skb_headlen(list_skb) && !list_skb->head_frag &&
+		    (mss != skb_headlen(head_skb) - doffset)) {
+			sg = false;
+		}
+	}
+
 	if (sg && csum && (mss != GSO_BY_FRAGS))  {
 		if (!(features & NETIF_F_GSO_PARTIAL)) {
 			struct sk_buff *iter;
-- 
2.19.1
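
The condition added by the patch can be read in isolation with the following userspace model. It is only a sketch: `struct fake_skb` keeps just the fields the check looks at, `FAKE_GSO_BY_FRAGS` stands in for `GSO_BY_FRAGS`, and `segment_may_use_sg()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_GSO_BY_FRAGS 0xFFFFu /* stand-in for GSO_BY_FRAGS */

/* Reduced model of an skb: only what the new check inspects. */
struct fake_skb {
	unsigned               headlen;   /* bytes in the linear area */
	bool                   head_frag; /* head is a page fragment */
	bool                   gso_dodgy; /* models SKB_GSO_DODGY being set */
	const struct fake_skb *frag_list; /* first frag_list member, or NULL */
};

/* Mirror of the added check: fall back to copying (sg = false) when
 * gso_size is untrusted, the first frag_list member has a linear
 * non-head_frag head, and head_skb's payload does not match mss. */
static bool segment_may_use_sg(const struct fake_skb *head, unsigned mss,
			       unsigned doffset, bool sg_feature)
{
	const struct fake_skb *list_skb;
	bool sg = sg_feature;

	assert(head != NULL);
	list_skb = head->frag_list;

	if (mss != FAKE_GSO_BY_FRAGS && head->gso_dodgy) {
		if (list_skb && list_skb->headlen && !list_skb->head_frag &&
		    mss != head->headlen - doffset)
			sg = false;
	}
	return sg;
}
```

For example, a head with 1454 linear bytes and a 54-byte header offset carries 1400 payload bytes; if a BPF program has changed gso_size to 1380 and the frag_list member has a kmalloced head, the helper returns false and the copy path is taken.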


* Re: [PATCH v2] net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)
From: Eric Dumazet @ 2019-09-05 18:49 UTC (permalink / raw)
  To: Maciej Żenczykowski, Maciej Żenczykowski,
	David S . Miller
  Cc: netdev, David Ahern, Lorenzo Colitti
In-Reply-To: <20190902162336.240405-1-zenczykowski@gmail.com>



On 9/2/19 6:23 PM, Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <maze@google.com>
> 
> There is a subtle change in behaviour introduced by:
>   commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
>   'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create'
> 
> Before that patch /proc/net/ipv6_route includes:
> 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo
> 
> Afterwards /proc/net/ipv6_route includes:
> 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo
> 
> ie. the above commit causes the ::1/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000).
> 
> AFAICT, this is incorrect since these routes are *not* coming from RA's.
> 
> As such, this patch restores the old behaviour.
> 
> Fixes: c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Lorenzo Colitti <lorenzo@google.com>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  net/ipv6/route.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 558c6c68855f..516b2e568dae 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -4365,13 +4365,14 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
>  	struct fib6_config cfg = {
>  		.fc_table = l3mdev_fib_table(idev->dev) ? : RT6_TABLE_LOCAL,
>  		.fc_ifindex = idev->dev->ifindex,
> -		.fc_flags = RTF_UP | RTF_ADDRCONF | RTF_NONEXTHOP,
> +		.fc_flags = RTF_UP | RTF_NONEXTHOP,
>  		.fc_dst = *addr,
>  		.fc_dst_len = 128,
>  		.fc_protocol = RTPROT_KERNEL,
>  		.fc_nlinfo.nl_net = net,
>  		.fc_ignore_dev_down = true,
>  	};
> +	struct fib6_info *f6i;
>  
>  	if (anycast) {
>  		cfg.fc_type = RTN_ANYCAST;
> @@ -4381,7 +4382,10 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
>  		cfg.fc_flags |= RTF_LOCAL;
>  	}
>  
> -	return ip6_route_info_create(&cfg, gfp_flags, NULL);
> +	f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);
> +	if (f6i)
> +		f6i->dst_nocount = true;

Shouldn't it use 

	if (!IS_ERR(f6i))
		f6i->dst_nocount = true;

???


> +	return f6i;
>  }
>  
>  /* remove deleted ip from prefsrc entries */
> 
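
To spell out the concern: when a function reports failure by encoding an errno value in the returned pointer, a plain NULL test is true for the error value as well. The sketch below is a userspace imitation of the kernel's `ERR_PTR()`/`PTR_ERR()`/`IS_ERR()` helpers; `struct route` and `route_create()` are made up for the example, and the sketch does not settle what `ip6_route_info_create()` can actually return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object and allocator for illustration. */
struct route {
	bool dst_nocount;
};

/* Reports failure with an error pointer, never with NULL. */
static struct route *route_create(bool fail)
{
	struct route *r;

	if (fail)
		return ERR_PTR(-ENOMEM);
	r = calloc(1, sizeof(*r));
	return r ? r : ERR_PTR(-ENOMEM);
}
```

With such an allocator, `if (r)` would let the error value through and the following store would dereference an invalid address, whereas `if (!IS_ERR(r))` skips it.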

* Re: [PATCH v2 00/10] Add definition for the number of standard PCI BARs
From: Denis Efremov @ 2019-09-05 19:02 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Bjorn Helgaas, linux-kernel, linux-pci, Sebastian Ott,
	Gerald Schaefer, H. Peter Anvin, Giuseppe Cavallaro,
	Alexandre Torgue, Matt Porter, Alexandre Bounine, Peter Jones,
	Bartlomiej Zolnierkiewicz, Cornelia Huck, Alex Williamson,
	Jose Abreu, kvm, linux-fbdev, netdev, x86, linux-s390
In-Reply-To: <20190816105128.GD14111@e119886-lin.cambridge.arm.com>



On 16.08.2019 13:51, Andrew Murray wrote:
> On Fri, Aug 16, 2019 at 12:24:27PM +0300, Denis Efremov wrote:
>> Code that iterates over all standard PCI BARs typically uses
>> PCI_STD_RESOURCE_END, but this is error-prone because it requires
>> "i <= PCI_STD_RESOURCE_END" rather than something like
>> "i < PCI_STD_NUM_BARS". We could add such a definition and use it the same
>> way PCI_SRIOV_NUM_BARS is used. There is already the definition
>> PCI_BAR_COUNT for s390 only. Thus, this patchset introduces it globally.
>>
>> Changes in v2:
>>   - Reverse checks in pci_iomap_range,pci_iomap_wc_range.
>>   - Refactor loops in vfio_pci to keep PCI_STD_RESOURCES.
>>   - Add 2 new patches to replace the magic constant with new define.
>>   - Split net patch in v1 to separate stmmac and dwc-xlgmac patches.
>>
>> Denis Efremov (10):
>>   PCI: Add define for the number of standard PCI BARs
>>   s390/pci: Loop using PCI_STD_NUM_BARS
>>   x86/PCI: Loop using PCI_STD_NUM_BARS
>>   stmmac: pci: Loop using PCI_STD_NUM_BARS
>>   net: dwc-xlgmac: Loop using PCI_STD_NUM_BARS
>>   rapidio/tsi721: Loop using PCI_STD_NUM_BARS
>>   efifb: Loop using PCI_STD_NUM_BARS
>>   vfio_pci: Loop using PCI_STD_NUM_BARS
>>   PCI: hv: Use PCI_STD_NUM_BARS
>>   PCI: Use PCI_STD_NUM_BARS
>>
>>  arch/s390/include/asm/pci.h                      |  5 +----
>>  arch/s390/include/asm/pci_clp.h                  |  6 +++---
>>  arch/s390/pci/pci.c                              | 16 ++++++++--------
>>  arch/s390/pci/pci_clp.c                          |  6 +++---
>>  arch/x86/pci/common.c                            |  2 +-
>>  drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c |  4 ++--
>>  drivers/net/ethernet/synopsys/dwc-xlgmac-pci.c   |  2 +-
>>  drivers/pci/controller/pci-hyperv.c              | 10 +++++-----
>>  drivers/pci/pci.c                                | 11 ++++++-----
>>  drivers/pci/quirks.c                             |  4 ++--
>>  drivers/rapidio/devices/tsi721.c                 |  2 +-
>>  drivers/vfio/pci/vfio_pci.c                      | 11 +++++++----
>>  drivers/vfio/pci/vfio_pci_config.c               | 10 ++++++----
>>  drivers/vfio/pci/vfio_pci_private.h              |  4 ++--
>>  drivers/video/fbdev/efifb.c                      |  2 +-
>>  include/linux/pci.h                              |  2 +-
>>  include/uapi/linux/pci_regs.h                    |  1 +
>>  17 files changed, 51 insertions(+), 47 deletions(-)
> 
> I've come across a few more places where this change can be made. There
> may be multiple instances in the same file, but only the first is shown
> below:
> 
> drivers/misc/pci_endpoint_test.c:       for (bar = BAR_0; bar <= BAR_5; bar++) {
> drivers/net/ethernet/intel/e1000/e1000_main.c:          for (i = BAR_1; i <= BAR_5; i++) {
> drivers/net/ethernet/intel/ixgb/ixgb_main.c:    for (i = BAR_1; i <= BAR_5; i++) {
> drivers/pci/controller/dwc/pci-dra7xx.c:        for (bar = BAR_0; bar <= BAR_5; bar++)
> drivers/pci/controller/dwc/pci-layerscape-ep.c: for (bar = BAR_0; bar <= BAR_5; bar++)
> drivers/pci/controller/dwc/pcie-artpec6.c:      for (bar = BAR_0; bar <= BAR_5; bar++)
> drivers/pci/controller/dwc/pcie-designware-plat.c:      for (bar = BAR_0; bar <= BAR_5; bar++)
> drivers/pci/endpoint/functions/pci-epf-test.c:  for (bar = BAR_0; bar <= BAR_5; bar++) {
> include/linux/pci-epc.h:        u64     bar_fixed_size[BAR_5 + 1];
> drivers/scsi/pm8001/pm8001_hwi.c:       for (bar = 0; bar < 6; bar++) {
> drivers/scsi/pm8001/pm8001_init.c:      for (bar = 0; bar < 6; bar++) {
> drivers/ata/sata_nv.c:  for (bar = 0; bar < 6; bar++)
> drivers/video/fbdev/core/fbmem.c:       for (idx = 0, bar = 0; bar < PCI_ROM_RESOURCE; bar++) {
> drivers/staging/gasket/gasket_core.c:   for (i = 0; i < GASKET_NUM_BARS; i++) {
> drivers/tty/serial/8250/8250_pci.c:     for (i = 0; i < PCI_NUM_BAR_RESOURCES; i++) { <-----------
> 
> It looks like BARs are often iterated with PCI_NUM_BAR_RESOURCES, there
> are a load of these too found with:
> 
> git grep PCI_ROM_RESOURCE | grep "< "
> 
> I'm happy to share patches if preferred.
> 

I'm not sure about lib/devres.c
265:#define PCIM_IOMAP_MAX      PCI_ROM_RESOURCE
268:    void __iomem *table[PCIM_IOMAP_MAX];
277:    for (i = 0; i < PCIM_IOMAP_MAX; i++)
324:    BUG_ON(bar >= PCIM_IOMAP_MAX);
352:    for (i = 0; i < PCIM_IOMAP_MAX; i++)
455:    for (i = 0; i < PCIM_IOMAP_MAX; i++) {

Is it worth changing?
#define PCIM_IOMAP_MAX  PCI_STD_NUM_BARS

Thanks,
Denis
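
For reference, the two loop styles discussed in this thread look like this side by side. The constants are local stand-ins that mirror the values of `PCI_STD_RESOURCE_END` and the proposed `PCI_STD_NUM_BARS`; the `FAKE_` names and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins mirroring the kernel values discussed above. */
enum {
	FAKE_PCI_STD_RESOURCES    = 0,
	FAKE_PCI_STD_RESOURCE_END = 5, /* index of the last standard BAR */
	FAKE_PCI_STD_NUM_BARS     = 6, /* number of standard BARs */
};

static_assert(FAKE_PCI_STD_NUM_BARS == FAKE_PCI_STD_RESOURCE_END + 1,
	      "count must be last index plus one");

/* Old style: inclusive bound, easy to get wrong with '<'. */
static unsigned long total_len_inclusive(const unsigned long *bar_len)
{
	unsigned long sum = 0;

	for (int i = FAKE_PCI_STD_RESOURCES; i <= FAKE_PCI_STD_RESOURCE_END; i++)
		sum += bar_len[i];
	return sum;
}

/* New style: the usual half-open loop over a count. */
static unsigned long total_len_count(const unsigned long *bar_len)
{
	unsigned long sum = 0;

	for (int i = 0; i < FAKE_PCI_STD_NUM_BARS; i++)
		sum += bar_len[i];
	return sum;
}
```

Both functions visit the same six entries; the second form simply matches the idiom used for array sizes elsewhere, which is the motivation given in the cover letter.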

* Re: linux-next: build failure after merge of the net-next tree
From: Andrii Nakryiko @ 2019-09-05 19:26 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Stephen Rothwell, David Miller, Networking,
	Linux Next Mailing List, Linux Kernel Mailing List,
	Andrii Nakryiko, Daniel Borkmann, Alexei Starovoitov
In-Reply-To: <CAK7LNAQEU6uu-Z=VeR2KNa8ezCLA7VHtpvM2tvAKsWtUTi6Eug@mail.gmail.com>

On Tue, Sep 3, 2019 at 11:20 PM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> On Wed, Sep 4, 2019 at 3:00 PM Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >
> > Hi all,
> >
> > After merging the net-next tree, today's linux-next build (arm
> > multi_v7_defconfig) failed like this:
> >
> > scripts/link-vmlinux.sh: 74: Bad substitution
> >
> > Caused by commit
> >
> >   341dfcf8d78e ("btf: expose BTF info through sysfs")
> >
> > interacting with commit
> >
> >   1267f9d3047d ("kbuild: add $(BASH) to run scripts with bash-extension")
> >
> > from the kbuild tree.
>
>
> I knew that they were using bash-extension
> in the #!/bin/sh script.  :-D
>
> In fact, I wrote my patch in order to break their code
> and  make btf people realize that they were doing wrong.

Was there a specific reason to wait until this would break during
Stephen's merge, instead of giving me a heads up (or just replying on
original patch) and letting me fix it and save everyone's time and
efforts?

Either way, I've fixed the issue in
https://patchwork.ozlabs.org/patch/1158620/ and will pay way more
attention to BASH-specific features going forward (I found it pretty
hard to verify stuff like this, unfortunately). But again, code review
process is the best place to catch this and I really hope in the
future we can keep this process productive. Thanks!

>
>
>
> > The change in the net-next tree turned link-vmlinux.sh into a bash script
> > (I think).
> >
> > I have applied the following patch for today:
>
>
> But, this is a temporary fix only for linux-next.
>
> scripts/link-vmlinux.sh does not need to use the
> bash-extension ${@:2} in the first place.
>
> I hope btf people will write the correct code.

I replaced ${@:2} with shift and ${@}, I hope that's a correct fix,
but if you think it's not, please reply on the patch and let me know.


>
> Thanks.
>
>
>
>
> > From: Stephen Rothwell <sfr@canb.auug.org.au>
> > Date: Wed, 4 Sep 2019 15:43:41 +1000
> > Subject: [PATCH] link-vmlinux.sh is now a bash script
> >
> > Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
> > ---
> >  Makefile                | 4 ++--
> >  scripts/link-vmlinux.sh | 2 +-
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/Makefile b/Makefile
> > index ac97fb282d99..523d12c5cebe 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -1087,7 +1087,7 @@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
> >
> >  # Final link of vmlinux with optional arch pass after final link
> >  cmd_link-vmlinux =                                                 \
> > -       $(CONFIG_SHELL) $< $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_vmlinux) ;    \
> > +       $(BASH) $< $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_vmlinux) ;    \
> >         $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
> >
> >  vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
> > @@ -1403,7 +1403,7 @@ clean: rm-files := $(CLEAN_FILES)
> >  PHONY += archclean vmlinuxclean
> >
> >  vmlinuxclean:
> > -       $(Q)$(CONFIG_SHELL) $(srctree)/scripts/link-vmlinux.sh clean
> > +       $(Q)$(BASH) $(srctree)/scripts/link-vmlinux.sh clean
> >         $(Q)$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) clean)
> >
> >  clean: archclean vmlinuxclean
> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > index f7edb75f9806..ea1f8673869d 100755
> > --- a/scripts/link-vmlinux.sh
> > +++ b/scripts/link-vmlinux.sh
> > @@ -1,4 +1,4 @@
> > -#!/bin/sh
> > +#!/bin/bash
> >  # SPDX-License-Identifier: GPL-2.0
> >  #
> >  # link vmlinux
> > --
> > 2.23.0.rc1
> >
> > --
> > Cheers,
> > Stephen Rothwell
>
>
>
> --
> Best Regards
> Masahiro Yamada

^ permalink raw reply

* Re: [PATCH net-next] net/mlx5: DR, Remove useless set memory to zero use memset()
From: Saeed Mahameed @ 2019-09-05 19:37 UTC (permalink / raw)
  To: Erez Shitrit, weiyongjun1@huawei.com, Mark Bloch, leon@kernel.org,
	Alex Vesker
  Cc: kernel-janitors@vger.kernel.org, netdev@vger.kernel.org,
	linux-rdma@vger.kernel.org
In-Reply-To: <20190905095326.127277-1-weiyongjun1@huawei.com>

On Thu, 2019-09-05 at 09:53 +0000, Wei Yongjun wrote:
> The memory returned by kzalloc() has already been set to zero, so
> remove the useless memset(0).
> 
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied to net-next-mlx5.
Thanks !

^ permalink raw reply

* Re: [PATCH net-next] net/mlx5: DR, Fix error return code in dr_domain_init_resources()
From: Saeed Mahameed @ 2019-09-05 19:37 UTC (permalink / raw)
  To: Erez Shitrit, weiyongjun1@huawei.com, Mark Bloch, leon@kernel.org,
	Alex Vesker
  Cc: kernel-janitors@vger.kernel.org, netdev@vger.kernel.org,
	linux-rdma@vger.kernel.org
In-Reply-To: <20190905095600.127371-1-weiyongjun1@huawei.com>

On Thu, 2019-09-05 at 09:56 +0000, Wei Yongjun wrote:
> Fix to return negative error code -ENOMEM from the error handling
> case instead of 0, as done elsewhere in this function.
> 
> Fixes: 4ec9e7b02697 ("net/mlx5: DR, Expose steering domain
> functionality")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
> 

Applied to net-next-mlx5.
Thanks !


^ permalink raw reply

* general protection fault in dev_map_hash_update_elem
From: syzbot @ 2019-09-05 20:08 UTC (permalink / raw)
  To: ast, bpf, daniel, davem, hawk, jakub.kicinski, john.fastabend,
	kafai, linux-kernel, netdev, songliubraving, syzkaller-bugs, yhs

Hello,

syzbot found the following crash on:

HEAD commit:    6d028043 Add linux-next specific files for 20190830
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=135c1a92600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=82a6bec43ab0cb69
dashboard link: https://syzkaller.appspot.com/bug?extid=4e7a85b1432052e8d6f8
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=109124e1600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+4e7a85b1432052e8d6f8@syzkaller.appspotmail.com

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 10235 Comm: syz-executor.0 Not tainted 5.3.0-rc6-next-20190830 #75
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__write_once_size include/linux/compiler.h:203 [inline]
RIP: 0010:__hlist_del include/linux/list.h:795 [inline]
RIP: 0010:hlist_del_rcu include/linux/rculist.h:475 [inline]
RIP: 0010:__dev_map_hash_update_elem kernel/bpf/devmap.c:668 [inline]
RIP: 0010:dev_map_hash_update_elem+0x3c8/0x6e0 kernel/bpf/devmap.c:691
Code: 48 89 f1 48 89 75 c8 48 c1 e9 03 80 3c 11 00 0f 85 d3 02 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 53 10 48 89 d6 48 c1 ee 03 <80> 3c 0e 00 0f 85 97 02 00 00 48 85 c0 48 89 02 74 38 48 89 55 b8
RSP: 0018:ffff88808d607c30 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8880a7f14580 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a7f14588
RBP: ffff88808d607c78 R08: 0000000000000004 R09: ffffed1011ac0f73
R10: ffffed1011ac0f72 R11: 0000000000000003 R12: ffff88809f4e9400
R13: ffff88809b06ba00 R14: 0000000000000000 R15: ffff88809f4e9528
FS:  00007f3a3d50c700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007feb3fcd0000 CR3: 00000000986b9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  map_update_elem+0xc82/0x10b0 kernel/bpf/syscall.c:966
  __do_sys_bpf+0x8b5/0x3350 kernel/bpf/syscall.c:2854
  __se_sys_bpf kernel/bpf/syscall.c:2825 [inline]
  __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:2825
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x459879
Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f3a3d50bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000459879
RDX: 0000000000000020 RSI: 0000000020000040 RDI: 0000000000000002
RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a3d50c6d4
R13: 00000000004bfc86 R14: 00000000004d1960 R15: 00000000ffffffff
Modules linked in:
---[ end trace 083223e21dbd0ae5 ]---
RIP: 0010:__write_once_size include/linux/compiler.h:203 [inline]
RIP: 0010:__hlist_del include/linux/list.h:795 [inline]
RIP: 0010:hlist_del_rcu include/linux/rculist.h:475 [inline]
RIP: 0010:__dev_map_hash_update_elem kernel/bpf/devmap.c:668 [inline]
RIP: 0010:dev_map_hash_update_elem+0x3c8/0x6e0 kernel/bpf/devmap.c:691
Code: 48 89 f1 48 89 75 c8 48 c1 e9 03 80 3c 11 00 0f 85 d3 02 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 53 10 48 89 d6 48 c1 ee 03 <80> 3c 0e 00 0f 85 97 02 00 00 48 85 c0 48 89 02 74 38 48 89 55 b8
RSP: 0018:ffff88808d607c30 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8880a7f14580 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a7f14588
RBP: ffff88808d607c78 R08: 0000000000000004 R09: ffffed1011ac0f73
R10: ffffed1011ac0f72 R11: 0000000000000003 R12: ffff88809f4e9400
R13: ffff88809b06ba00 R14: 0000000000000000 R15: ffff88809f4e9528
FS:  00007f3a3d50c700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007feb3fcd0000 CR3: 00000000986b9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* [PATCH net] tcp: ulp: fix possible crash in tcp_diag_get_aux_size()
From: Eric Dumazet @ 2019-09-05 20:20 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Luke Hsiao, Neal Cardwell,
	Davide Caratti

tcp_diag_get_aux_size() can be called with sockets in any state.

icsk_ulp_ops is only present for full sockets.

For SYN_RECV or TIME_WAIT ones we would access garbage.

Fixes: 61723b393292 ("tcp: ulp: add functions to dump ulp-specific information")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Luke Hsiao <lukehsiao@google.com>
Reported-by: Neal Cardwell <ncardwell@google.com>
Cc: Davide Caratti <dcaratti@redhat.com>
---
 net/ipv4/tcp_diag.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c
index babc156deabb573f11bba344215d2c3712c4a3cd..81a8221d650a94be53d17354c60ddd0c655eaccf 100644
--- a/net/ipv4/tcp_diag.c
+++ b/net/ipv4/tcp_diag.c
@@ -163,7 +163,7 @@ static size_t tcp_diag_get_aux_size(struct sock *sk, bool net_admin)
 	}
 #endif
 
-	if (net_admin) {
+	if (net_admin && sk_fullsock(sk)) {
 		const struct tcp_ulp_ops *ulp_ops;
 
 		ulp_ops = icsk->icsk_ulp_ops;
-- 
2.23.0.187.g17f5b7556c-goog
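
For readers unfamiliar with the full-socket distinction, here is a minimal
user-space sketch of the guard added above. Everything in it is a
hypothetical stand-in (struct sock_common, struct full_sock, is_fullsock,
aux_size, demo_info_size and the state values are illustrative, not the
kernel's definitions). It only models the idea that request and timewait
"mini" sockets share a common prefix with a full socket but do not carry
the ULP ops pointer, so the pointer must not be read for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative state numbers; assumed for this sketch only. */
enum state { ST_ESTABLISHED = 1, ST_TIME_WAIT = 6, ST_NEW_SYN_RECV = 12 };

/* Fields shared by mini sockets and full sockets. */
struct sock_common {
	int state;
};

struct ulp_ops {
	size_t (*get_info_size)(void);
};

/* A full socket starts with the common part and adds the ULP ops pointer. */
struct full_sock {
	struct sock_common c;
	const struct ulp_ops *ulp_ops;
};

/* True unless the socket is in a state that implies a mini socket. */
static bool is_fullsock(const struct sock_common *sk)
{
	unsigned int mini = (1u << ST_TIME_WAIT) | (1u << ST_NEW_SYN_RECV);

	return !((1u << sk->state) & mini);
}

/* Example ULP callback used only to exercise the sketch. */
static size_t demo_info_size(void)
{
	return 16;
}

/* Only touch ulp_ops when the object really is a full socket. */
static size_t aux_size(const struct sock_common *sk, bool net_admin)
{
	size_t size = 0;

	if (net_admin && is_fullsock(sk)) {
		const struct full_sock *fs = (const struct full_sock *)sk;

		if (fs->ulp_ops && fs->ulp_ops->get_info_size)
			size += fs->ulp_ops->get_info_size();
	}
	return size;
}
```

Without the is_fullsock() test, the cast would read past the end of a
mini socket, which is the "access garbage" case the changelog describes.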


^ permalink raw reply related

* Re: [PATCH net] tcp: ulp: fix possible crash in tcp_diag_get_aux_size()
From: Eric Dumazet @ 2019-09-05 20:21 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Luke Hsiao, Neal Cardwell, Davide Caratti
In-Reply-To: <20190905202041.138085-1-edumazet@google.com>

On Thu, Sep 5, 2019 at 10:20 PM Eric Dumazet <edumazet@google.com> wrote:
>
> tcp_diag_get_aux_size() can be called with sockets in any state.
>
> icsk_ulp_ops is only present for full sockets.
>
> For SYN_RECV or TIME_WAIT ones we would access garbage.
>
> Fixes: 61723b393292 ("tcp: ulp: add functions to dump ulp-specific information")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Luke Hsiao <lukehsiao@google.com>
> Reported-by: Neal Cardwell <ncardwell@google.com>
> Cc: Davide Caratti <dcaratti@redhat.com>
> ---

Sorry for the 'net' tag. This patch targets net-next tree only.

^ permalink raw reply

* RE: [Intel-wired-lan] [PATCH] iavf: fix MAC address setting for VFs when filter is rejected
From: Bowers, AndrewX @ 2019-09-05 20:32 UTC (permalink / raw)
  To: intel-wired-lan@lists.osuosl.org; +Cc: netdev@vger.kernel.org
In-Reply-To: <20190905063422.28743-1-sassmann@kpanic.de>

> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Stefan Assmann
> Sent: Wednesday, September 4, 2019 11:34 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; davem@davemloft.net; sassmann@kpanic.de
> Subject: [Intel-wired-lan] [PATCH] iavf: fix MAC address setting for VFs when
> filter is rejected
> 
> Currently iavf unconditionally applies MAC address change requests. This
> puts the VF into a state where it is no longer able to pass traffic if the PF
> rejects a MAC filter change for the VF.
> A typical scenario for a rejected MAC filter is for an untrusted VF to request
> to change the MAC address when an administratively set MAC is present.
> 
> To keep iavf working in this scenario the MAC filter handling in iavf needs to
> act on the PF reply regarding the MAC filter change. In the case of an ack the
> new MAC address gets set, whereas in the case of a nack the previous MAC
> address needs to stay in place.
> 
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c     | 1 -
>  drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 7 +++++++
>  2 files changed, 7 insertions(+), 1 deletion(-)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>



^ permalink raw reply

* [net-next 03/16] ice: Check root pointer for validity
From: Jeff Kirsher @ 2019-09-05 20:33 UTC (permalink / raw)
  To: davem
  Cc: Anirudh Venkataramanan, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190905203406.4152-1-jeffrey.t.kirsher@intel.com>

From: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>

ice_sched_get_tc_node uses pi->root without checking for NULL. Add a
check to prevent NULL pointer dereference.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c
index 79d64f9ed609..fc624b73d05d 100644
--- a/drivers/net/ethernet/intel/ice/ice_sched.c
+++ b/drivers/net/ethernet/intel/ice/ice_sched.c
@@ -284,7 +284,7 @@ struct ice_sched_node *ice_sched_get_tc_node(struct ice_port_info *pi, u8 tc)
 {
 	u8 i;
 
-	if (!pi)
+	if (!pi || !pi->root)
 		return NULL;
 	for (i = 0; i < pi->root->num_children; i++)
 		if (pi->root->children[i]->tc_num == tc)
-- 
2.21.0
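
The effect of the added !pi->root test can be seen in a small user-space
sketch. The types and the function name below (struct sched_node,
struct port_info, get_tc_node) are simplified, hypothetical stand-ins
and not the driver's definitions. Only the shape of the check is taken
from the hunk above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scheduler node. */
struct sched_node {
	uint8_t tc_num;
	uint8_t num_children;
	struct sched_node **children;
};

/* Hypothetical port info holding the scheduler tree root. */
struct port_info {
	struct sched_node *root;
};

/* Return the child of the root that serves traffic class tc, or NULL. */
static struct sched_node *get_tc_node(struct port_info *pi, uint8_t tc)
{
	uint8_t i;

	/* Bail out before pi->root is dereferenced in the loop below. */
	if (!pi || !pi->root)
		return NULL;
	for (i = 0; i < pi->root->num_children; i++)
		if (pi->root->children[i]->tc_num == tc)
			return pi->root->children[i];
	return NULL;
}
```

With only the !pi test, a port whose tree has not been built yet (root
still NULL) would fault on the first loop condition.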


^ permalink raw reply related

* [net-next 00/16][pull request] 100GbE Intel Wired LAN Driver Updates 2019-09-05
From: Jeff Kirsher @ 2019-09-05 20:33 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series contains updates to ice driver.

Brett fixes the setting of num_q_vectors by using the maximum number
between the allocated transmit and receive queues.

Anirudh simplifies the code to use a helper function to return the main
VSI, which is the first element in the pf->vsi array.  Adds a pointer
check to prevent a NULL pointer dereference.  Adds a check to ensure we
do not initialize DCB on devices that are not DCB capable.  Does some
housekeeping on the code to remove unnecessary indirection and reduce
the PF structure by removing elements that are not needed since the
values they were storing can be readily gotten from
ice_get_avail_*_count()'s.  Updates the printed strings to make it
easier to search the logs for driver capabilities.

Jesse cleans up unnecessary function arguments.  Updated the code to use
prefetch() to add some efficiency to the driver to avoid a cache miss.
Did some housekeeping on the code to remove the configurable transmit
work limit via ethtool which ended up creating performance overhead.
Made additional performance enhancements by updating the driver to start
out with a reasonable number of descriptors by changing the default to
2048.

Mitch fixes the reset logic for VFs by clearing VF_MBX_ARQLEN register
when the source of the reset is not PFR.

Lukasz updates the driver with a fix similar to the one in the i40e
driver, reporting link down for VFs when the PF queues are not enabled.

Akeem updates the driver to report the VF link status once we get VF
resources so that we can reflect the link status similarly to how the PF
reports link speed.

Ashish updates the transmit context structure based on recent changes to
the hardware specification.

Dave updates the DCB logic to allow a delayed registration for MIB
change events so that the driver is not accepting events before it is
ready for them.

The following are changes since commit 0e5b36bc4c1fccfc18dd851d960781589c16dae8:
  r8152: adjust the settings of ups flags
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 100GbE

Akeem G Abodunrin (1):
  ice: Report VF link status with opcode to get resources

Anirudh Venkataramanan (5):
  ice: Add ice_get_main_vsi to get PF/main VSI
  ice: Check root pointer for validity
  ice: Check for DCB capability before initializing DCB
  ice: Minor refactor in queue management
  ice: Rework around device/function capabilities

Ashish Shah (1):
  ice: update Tx context struct

Brett Creeley (1):
  ice: Update fields in ice_vsi_set_num_qs when reconfiguring

Dave Ertman (1):
  ice: Allow for delayed LLDP MIB change registration

Jesse Brandeburg (5):
  ice: clean up arguments
  ice: move code closer together
  ice: small efficiency fixes
  ice: change work limit to a constant
  ice: change default number of receive descriptors

Lukasz Czapnik (1):
  ice: report link down for VF when PF's queues are not enabled

Mitch Williams (1):
  ice: Reliably reset VFs

 drivers/net/ethernet/intel/ice/ice.h          | 46 +++---------
 drivers/net/ethernet/intel/ice/ice_common.c   | 43 +++++------
 drivers/net/ethernet/intel/ice/ice_dcb.c      | 39 +++++++++-
 drivers/net/ethernet/intel/ice/ice_dcb.h      | 11 +--
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c  |  7 +-
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 24 +++---
 .../net/ethernet/intel/ice/ice_lan_tx_rx.h    |  1 +
 drivers/net/ethernet/intel/ice/ice_lib.c      | 29 ++++----
 drivers/net/ethernet/intel/ice/ice_main.c     | 73 +++++++++++--------
 drivers/net/ethernet/intel/ice/ice_sched.c    |  2 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c     | 53 +++++++-------
 .../net/ethernet/intel/ice/ice_virtchnl_pf.c  | 36 +++++----
 12 files changed, 195 insertions(+), 169 deletions(-)

-- 
2.21.0


^ permalink raw reply

* [net-next 04/16] ice: clean up arguments
From: Jeff Kirsher @ 2019-09-05 20:33 UTC (permalink / raw)
  To: davem
  Cc: Jesse Brandeburg, netdev, nhorman, sassmann, Tony Nguyen,
	Andrew Bowers, Jeff Kirsher
In-Reply-To: <20190905203406.4152-1-jeffrey.t.kirsher@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

There are a couple of functions that don't need two arguments
passed in when the second argument already provides access to
the object passed as the first.

Remove the unnecessary arguments.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 43 +++++++++++------------
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 5bf5c179a738..4fe1b332e67e 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -95,17 +95,16 @@ void ice_free_tx_ring(struct ice_ring *tx_ring)
 
 /**
  * ice_clean_tx_irq - Reclaim resources after transmit completes
- * @vsi: the VSI we care about
  * @tx_ring: Tx ring to clean
  * @napi_budget: Used to determine if we are in netpoll
  *
  * Returns true if there's any budget left (e.g. the clean is finished)
  */
-static bool
-ice_clean_tx_irq(struct ice_vsi *vsi, struct ice_ring *tx_ring, int napi_budget)
+static bool ice_clean_tx_irq(struct ice_ring *tx_ring, int napi_budget)
 {
 	unsigned int total_bytes = 0, total_pkts = 0;
-	unsigned int budget = vsi->work_lmt;
+	unsigned int budget = ICE_DFLT_IRQ_WORK;
+	struct ice_vsi *vsi = tx_ring->vsi;
 	s16 i = tx_ring->next_to_clean;
 	struct ice_tx_desc *tx_desc;
 	struct ice_tx_buf *tx_buf;
@@ -114,6 +113,8 @@ ice_clean_tx_irq(struct ice_vsi *vsi, struct ice_ring *tx_ring, int napi_budget)
 	tx_desc = ICE_TX_DESC(tx_ring, i);
 	i -= tx_ring->count;
 
+	prefetch(&vsi->state);
+
 	do {
 		struct ice_tx_desc *eop_desc = tx_buf->next_to_watch;
 
@@ -206,7 +207,7 @@ ice_clean_tx_irq(struct ice_vsi *vsi, struct ice_ring *tx_ring, int napi_budget)
 		smp_mb();
 		if (__netif_subqueue_stopped(tx_ring->netdev,
 					     tx_ring->q_index) &&
-		   !test_bit(__ICE_DOWN, vsi->state)) {
+		    !test_bit(__ICE_DOWN, vsi->state)) {
 			netif_wake_subqueue(tx_ring->netdev,
 					    tx_ring->q_index);
 			++tx_ring->tx_stats.restart_q;
@@ -879,7 +880,7 @@ ice_rx_hash(struct ice_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc,
 
 /**
  * ice_rx_csum - Indicate in skb if checksum is good
- * @vsi: the VSI we care about
+ * @ring: the ring we care about
  * @skb: skb currently being received and modified
  * @rx_desc: the receive descriptor
  * @ptype: the packet type decoded by hardware
@@ -887,7 +888,7 @@ ice_rx_hash(struct ice_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc,
  * skb->protocol must be set before this function is called
  */
 static void
-ice_rx_csum(struct ice_vsi *vsi, struct sk_buff *skb,
+ice_rx_csum(struct ice_ring *ring, struct sk_buff *skb,
 	    union ice_32b_rx_flex_desc *rx_desc, u8 ptype)
 {
 	struct ice_rx_ptype_decoded decoded;
@@ -904,7 +905,7 @@ ice_rx_csum(struct ice_vsi *vsi, struct sk_buff *skb,
 	skb_checksum_none_assert(skb);
 
 	/* check if Rx checksum is enabled */
-	if (!(vsi->netdev->features & NETIF_F_RXCSUM))
+	if (!(ring->netdev->features & NETIF_F_RXCSUM))
 		return;
 
 	/* check if HW has decoded the packet and checksum */
@@ -944,7 +945,7 @@ ice_rx_csum(struct ice_vsi *vsi, struct sk_buff *skb,
 	return;
 
 checksum_fail:
-	vsi->back->hw_csum_rx_error++;
+	ring->vsi->back->hw_csum_rx_error++;
 }
 
 /**
@@ -968,7 +969,7 @@ ice_process_skb_fields(struct ice_ring *rx_ring,
 	/* modifies the skb - consumes the enet header */
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
 
-	ice_rx_csum(rx_ring->vsi, skb, rx_desc, ptype);
+	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
 }
 
 /**
@@ -1354,14 +1355,13 @@ static u32 ice_buildreg_itr(u16 itr_idx, u16 itr)
 
 /**
  * ice_update_ena_itr - Update ITR and re-enable MSIX interrupt
- * @vsi: the VSI associated with the q_vector
  * @q_vector: q_vector for which ITR is being updated and interrupt enabled
  */
-static void
-ice_update_ena_itr(struct ice_vsi *vsi, struct ice_q_vector *q_vector)
+static void ice_update_ena_itr(struct ice_q_vector *q_vector)
 {
 	struct ice_ring_container *tx = &q_vector->tx;
 	struct ice_ring_container *rx = &q_vector->rx;
+	struct ice_vsi *vsi = q_vector->vsi;
 	u32 itr_val;
 
 	/* when exiting WB_ON_ITR lets set a low ITR value and trigger
@@ -1419,15 +1419,14 @@ ice_update_ena_itr(struct ice_vsi *vsi, struct ice_q_vector *q_vector)
 			q_vector->itr_countdown--;
 	}
 
-	if (!test_bit(__ICE_DOWN, vsi->state))
-		wr32(&vsi->back->hw,
+	if (!test_bit(__ICE_DOWN, q_vector->vsi->state))
+		wr32(&q_vector->vsi->back->hw,
 		     GLINT_DYN_CTL(q_vector->reg_idx),
 		     itr_val);
 }
 
 /**
  * ice_set_wb_on_itr - set WB_ON_ITR for this q_vector
- * @vsi: pointer to the VSI structure
  * @q_vector: q_vector to set WB_ON_ITR on
  *
  * We need to tell hardware to write-back completed descriptors even when
@@ -1440,9 +1439,10 @@ ice_update_ena_itr(struct ice_vsi *vsi, struct ice_q_vector *q_vector)
  * value that's not 0 due to ITR granularity. Also, set the INTENA_MSK bit to
  * make sure hardware knows we aren't meddling with the INTENA_M bit.
  */
-static void
-ice_set_wb_on_itr(struct ice_vsi *vsi, struct ice_q_vector *q_vector)
+static void ice_set_wb_on_itr(struct ice_q_vector *q_vector)
 {
+	struct ice_vsi *vsi = q_vector->vsi;
+
 	/* already in WB_ON_ITR mode no need to change it */
 	if (q_vector->itr_countdown == ICE_IN_WB_ON_ITR_MODE)
 		return;
@@ -1473,7 +1473,6 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
 {
 	struct ice_q_vector *q_vector =
 				container_of(napi, struct ice_q_vector, napi);
-	struct ice_vsi *vsi = q_vector->vsi;
 	bool clean_complete = true;
 	struct ice_ring *ring;
 	int budget_per_ring;
@@ -1483,7 +1482,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
 	 * budget and be more aggressive about cleaning up the Tx descriptors.
 	 */
 	ice_for_each_ring(ring, q_vector->tx)
-		if (!ice_clean_tx_irq(vsi, ring, budget))
+		if (!ice_clean_tx_irq(ring, budget))
 			clean_complete = false;
 
 	/* Handle case where we are called by netpoll with a budget of 0 */
@@ -1519,9 +1518,9 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
 	 * poll us due to busy-polling
 	 */
 	if (likely(napi_complete_done(napi, work_done)))
-		ice_update_ena_itr(vsi, q_vector);
+		ice_update_ena_itr(q_vector);
 	else
-		ice_set_wb_on_itr(vsi, q_vector);
+		ice_set_wb_on_itr(q_vector);
 
 	return min_t(int, work_done, budget - 1);
 }
-- 
2.21.0
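
The argument cleanup in this patch follows a common pattern, sketched
below in user-space C. The names (struct vsi, struct ring,
count_csum_error_old, count_csum_error) are hypothetical and greatly
simplified. The sketch only assumes, as the diff shows, that a ring
keeps a back-pointer to its owning VSI.

```c
#include <assert.h>

/* Hypothetical owner object with an error counter. */
struct vsi {
	unsigned int csum_errors;
};

/* Hypothetical ring that keeps a back-pointer to its owner. */
struct ring {
	struct vsi *vsi;
};

/* Before: the caller passes both the owner and the ring. */
static void count_csum_error_old(struct vsi *vsi, struct ring *ring)
{
	(void)ring;	/* the ring is not needed to reach the counter */
	vsi->csum_errors++;
}

/* After: the owner is reached through the ring's back-pointer. */
static void count_csum_error(struct ring *ring)
{
	ring->vsi->csum_errors++;
}
```

Dropping the redundant parameter also removes the chance of a caller
passing a VSI that does not match the ring.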


^ permalink raw reply related

* [net-next 02/16] ice: Add ice_get_main_vsi to get PF/main VSI
From: Jeff Kirsher @ 2019-09-05 20:33 UTC (permalink / raw)
  To: davem
  Cc: Anirudh Venkataramanan, netdev, nhorman, sassmann, Tony Nguyen,
	Andrew Bowers, Jeff Kirsher
In-Reply-To: <20190905203406.4152-1-jeffrey.t.kirsher@intel.com>

From: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>

There are multiple places where we currently use ice_find_vsi_by_type
to get the PF (a.k.a. main) VSI. The PF VSI by definition is always
the first element in the pf->vsi array (i.e. pf->vsi[0]). So instead
add and use a new helper function ice_get_main_vsi, which just returns
pf->vsi[0].

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h      | 20 +++++++-------------
 drivers/net/ethernet/intel/ice/ice_main.c |  6 +++---
 2 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index fb2bc836b20a..bbb3c290a0bf 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -425,21 +425,15 @@ ice_irq_dynamic_ena(struct ice_hw *hw, struct ice_vsi *vsi,
 }
 
 /**
- * ice_find_vsi_by_type - Find and return VSI of a given type
- * @pf: PF to search for VSI
- * @type: Value indicating type of VSI we are looking for
+ * ice_get_main_vsi - Get the PF VSI
+ * @pf: PF instance
+ *
+ * returns pf->vsi[0], which by definition is the PF VSI
  */
-static inline struct ice_vsi *
-ice_find_vsi_by_type(struct ice_pf *pf, enum ice_vsi_type type)
+static inline struct ice_vsi *ice_get_main_vsi(struct ice_pf *pf)
 {
-	int i;
-
-	for (i = 0; i < pf->num_alloc_vsi; i++) {
-		struct ice_vsi *vsi = pf->vsi[i];
-
-		if (vsi && vsi->type == type)
-			return vsi;
-	}
+	if (pf->vsi)
+		return pf->vsi[0];
 
 	return NULL;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 50a17a0337be..703fc7bf2b31 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -120,7 +120,7 @@ static int ice_init_mac_fltr(struct ice_pf *pf)
 	u8 broadcast[ETH_ALEN];
 	struct ice_vsi *vsi;
 
-	vsi = ice_find_vsi_by_type(pf, ICE_VSI_PF);
+	vsi = ice_get_main_vsi(pf);
 	if (!vsi)
 		return -EINVAL;
 
@@ -826,7 +826,7 @@ ice_link_event(struct ice_pf *pf, struct ice_port_info *pi, bool link_up,
 	if (link_up == old_link && link_speed == old_link_speed)
 		return result;
 
-	vsi = ice_find_vsi_by_type(pf, ICE_VSI_PF);
+	vsi = ice_get_main_vsi(pf);
 	if (!vsi || !vsi->port_info)
 		return -EINVAL;
 
@@ -1439,7 +1439,7 @@ static void ice_check_media_subtask(struct ice_pf *pf)
 	struct ice_vsi *vsi;
 	int err;
 
-	vsi = ice_find_vsi_by_type(pf, ICE_VSI_PF);
+	vsi = ice_get_main_vsi(pf);
 	if (!vsi)
 		return;
 
-- 
2.21.0
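
To illustrate why the helper can replace the search, here is a small
user-space sketch with hypothetical, simplified types (enum vsi_type,
struct vsi, struct pf, find_vsi_by_type and get_main_vsi are stand-ins,
not the driver's code). It assumes the invariant stated in the
changelog, namely that slot 0 of the array always holds the PF VSI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of VSI types. */
enum vsi_type { VSI_PF, VSI_VF };

struct vsi {
	enum vsi_type type;
};

struct pf {
	struct vsi **vsi;	/* array of VSI pointers, may be NULL */
	int num_alloc_vsi;
};

/* Old approach: linear search for the first VSI of the given type. */
static struct vsi *find_vsi_by_type(struct pf *pf, enum vsi_type type)
{
	int i;

	for (i = 0; i < pf->num_alloc_vsi; i++) {
		struct vsi *vsi = pf->vsi[i];

		if (vsi && vsi->type == type)
			return vsi;
	}
	return NULL;
}

/* New approach: rely on the invariant that slot 0 is the PF VSI. */
static struct vsi *get_main_vsi(struct pf *pf)
{
	if (pf->vsi)
		return pf->vsi[0];

	return NULL;
}
```

Both functions return the same pointer as long as the invariant holds.
The helper avoids the loop and makes the intent explicit at each call
site.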


^ permalink raw reply related

* [net-next 10/16] ice: Check for DCB capability before initializing DCB
From: Jeff Kirsher @ 2019-09-05 20:34 UTC (permalink / raw)
  To: davem
  Cc: Anirudh Venkataramanan, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190905203406.4152-1-jeffrey.t.kirsher@intel.com>

From: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>

Check the ICE_FLAG_DCB_CAPABLE before calling ice_init_pf_dcb.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c |  3 ---
 drivers/net/ethernet/intel/ice/ice_main.c    | 15 ++++++++-------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index e922adf1fa15..20f440a64650 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -474,7 +474,6 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
 		}
 
 		pf->dcbx_cap = DCB_CAP_DCBX_HOST | DCB_CAP_DCBX_VER_IEEE;
-		set_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
 		return 0;
 	}
 
@@ -483,8 +482,6 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
 	/* DCBX in FW and LLDP enabled in FW */
 	pf->dcbx_cap = DCB_CAP_DCBX_LLD_MANAGED | DCB_CAP_DCBX_VER_IEEE;
 
-	set_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
-
 	err = ice_dcb_init_cfg(pf, locked);
 	if (err)
 		goto dcb_init_err;
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 703fc7bf2b31..8bb3b81876a9 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2252,6 +2252,8 @@ static void ice_deinit_pf(struct ice_pf *pf)
 static int ice_init_pf(struct ice_pf *pf)
 {
 	bitmap_zero(pf->flags, ICE_PF_FLAGS_NBITS);
+	if (pf->hw.func_caps.common_cap.dcb)
+		set_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
 #ifdef CONFIG_PCI_IOV
 	if (pf->hw.func_caps.common_cap.sr_iov_1_1) {
 		struct ice_hw *hw = &pf->hw;
@@ -2529,13 +2531,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 		goto err_init_pf_unroll;
 	}
 
-	err = ice_init_pf_dcb(pf, false);
-	if (err) {
-		clear_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
-		clear_bit(ICE_FLAG_DCB_ENA, pf->flags);
-
-		/* do not fail overall init if DCB init fails */
-		err = 0;
+	if (test_bit(ICE_FLAG_DCB_CAPABLE, pf->flags)) {
+		/* Note: DCB init failure is non-fatal to load */
+		if (ice_init_pf_dcb(pf, false)) {
+			clear_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
+			clear_bit(ICE_FLAG_DCB_ENA, pf->flags);
+		}
 	}
 
 	ice_determine_q_usage(pf);
-- 
2.21.0


^ permalink raw reply related

