Netdev List

* [PATCH V2 net-next 14/17] net: hns3: add Asym Pause support to phy default features
From: Lipeng @ 2017-12-19  4:02 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, linuxarm, salil.mehta, lipeng321
In-Reply-To: <1513656159-127589-1-git-send-email-lipeng321@huawei.com>

From: Fuyun Liang <liangfuyun1@huawei.com>

commit c4fb2cdf575d ("net: hns3: fix a bug for phy supported feature
initialization") adds default supported features for phy, but our hardware
also supports Asym Pause. This patch adds Asym Pause support to the phy
default features so that Asym Pause can be advertised when the phy
negotiates flow control.

Fixes: c4fb2cdf575d ("net: hns3: fix a bug for phy supported feature initialization")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
index 3745153..c1dea3a 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
@@ -17,6 +17,7 @@
 #define HCLGE_PHY_SUPPORTED_FEATURES	(SUPPORTED_Autoneg | \
 					 SUPPORTED_TP | \
 					 SUPPORTED_Pause | \
+					 SUPPORTED_Asym_Pause | \
 					 PHY_10BT_FEATURES | \
 					 PHY_100BT_FEATURES | \
 					 PHY_1000BT_FEATURES)
-- 
1.9.1

^ permalink raw reply related
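
As a rough illustration of why the extra bit in the default feature mask matters, here is a minimal userspace sketch. It assumes that the advertised modes are limited to what the supported mask contains. All EX_* names and the helper functions are hypothetical stand-ins, not the kernel's SUPPORTED_* definitions or the hns3 driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the ethtool SUPPORTED_* bits; the real
 * definitions live in the kernel's ethtool UAPI header.
 */
#define EX_SUPPORTED_AUTONEG	(1u << 6)
#define EX_SUPPORTED_TP		(1u << 7)
#define EX_SUPPORTED_PAUSE	(1u << 13)
#define EX_SUPPORTED_ASYM_PAUSE	(1u << 14)

/* Default mask before the patch: symmetric pause only. */
#define EX_FEATURES_OLD	(EX_SUPPORTED_AUTONEG | EX_SUPPORTED_TP | \
			 EX_SUPPORTED_PAUSE)
/* Default mask after the patch: asymmetric pause added. */
#define EX_FEATURES_NEW	(EX_FEATURES_OLD | EX_SUPPORTED_ASYM_PAUSE)

/* Hypothetical helper: a requested advertisement is clamped to the
 * bits present in the supported mask.
 */
static uint32_t ex_clamp_advertising(uint32_t requested, uint32_t supported)
{
	return requested & supported;
}

/* True if an Asym Pause request survives clamping against @supported. */
static bool ex_can_advertise_asym_pause(uint32_t supported)
{
	return (ex_clamp_advertising(EX_SUPPORTED_ASYM_PAUSE, supported) &
		EX_SUPPORTED_ASYM_PAUSE) != 0;
}
```

With the old mask the Asym Pause request is dropped by the clamp; with the new mask it is kept.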

* Re: [trivial PATCH] treewide: Align function definition open/close braces
From: Martin K. Petersen @ 2017-12-19  3:31 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-rtc, alsa-devel, linuxppc-dev, Jiri Kosina, linux-scsi,
	MPT-FusionLinux.pdl, acpi4asus-user, linux-wireless, linux-kernel,
	dri-devel, platform-driver-x86, linux-xfs, linux-acpi,
	linux-audit, amd-gfx, netdev, linux-fsdevel, Linus Torvalds,
	ocfs2-devel, linux-media
In-Reply-To: <1513556924.31581.51.camel@perches.com>


Joe,

> Some functions definitions have either the initial open brace and/or
> the closing brace outside of column 1.
>
> Move those braces to column 1.

SCSI bits look OK.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply

* [PATCH v2 3/3] trace: print address if symbol not found
From: Tobin C. Harding @ 2017-12-19  3:28 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Tobin C. Harding, Steven Rostedt, Tycho Andersen, Linus Torvalds,
	Kees Cook, Andrew Morton, Daniel Borkmann, Masahiro Yamada,
	Alexei Starovoitov, linux-kernel, Network Development
In-Reply-To: <1513654094-16832-1-git-send-email-me@tobin.cc>

Fixes behaviour modified by: commit 40eee173a35e ("kallsyms: don't leak
address when symbol not found")

Previous patch changed behaviour of kallsyms function sprint_symbol() to
return an error code instead of printing the address if a symbol was not
found. Ftrace relies on the original behaviour. We should not break
tracing when applying the previous patch. We can maintain the original
behaviour by checking the return code on calls to sprint_symbol() and
friends.

Check return code and print actual address on error (i.e. symbol not
found).

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---
 kernel/trace/trace.h             | 24 ++++++++++++++++++++++++
 kernel/trace/trace_events_hist.c |  6 +++---
 kernel/trace/trace_output.c      |  2 +-
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 2a6d0325a761..881b1a577d75 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1814,4 +1814,28 @@ static inline void trace_event_eval_update(struct trace_eval_map **map, int len)
 
 extern struct trace_iterator *tracepoint_print_iter;
 
+static inline int
+trace_sprint_symbol(char *buffer, unsigned long address)
+{
+	int ret;
+
+	ret = sprint_symbol(buffer, address);
+	if (ret == -1)
+		ret = sprintf(buffer, "0x%lx", address);
+
+	return ret;
+}
+
+static inline int
+trace_sprint_symbol_no_offset(char *buffer, unsigned long address)
+{
+	int ret;
+
+	ret = sprint_symbol_no_offset(buffer, address);
+	if (ret == -1)
+		ret = sprintf(buffer, "0x%lx", address);
+
+	return ret;
+}
+
 #endif /* _LINUX_KERNEL_TRACE_H */
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 1e1558c99d56..ca523327c058 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -982,7 +982,7 @@ static void hist_trigger_stacktrace_print(struct seq_file *m,
 			return;
 
 		seq_printf(m, "%*c", 1 + spaces, ' ');
-		sprint_symbol(str, stacktrace_entries[i]);
+		trace_sprint_symbol(str, stacktrace_entries[i]);
 		seq_printf(m, "%s\n", str);
 	}
 }
@@ -1014,12 +1014,12 @@ hist_trigger_entry_print(struct seq_file *m,
 			seq_printf(m, "%s: %llx", field_name, uval);
 		} else if (key_field->flags & HIST_FIELD_FL_SYM) {
 			uval = *(u64 *)(key + key_field->offset);
-			sprint_symbol_no_offset(str, uval);
+			trace_sprint_symbol_no_offset(str, uval);
 			seq_printf(m, "%s: [%llx] %-45s", field_name,
 				   uval, str);
 		} else if (key_field->flags & HIST_FIELD_FL_SYM_OFFSET) {
 			uval = *(u64 *)(key + key_field->offset);
-			sprint_symbol(str, uval);
+			trace_sprint_symbol(str, uval);
 			seq_printf(m, "%s: [%llx] %-55s", field_name,
 				   uval, str);
 		} else if (key_field->flags & HIST_FIELD_FL_EXECNAME) {
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 90db994ac900..f3c3a0a60f72 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -365,7 +365,7 @@ seq_print_sym_offset(struct trace_seq *s, const char *fmt,
 #ifdef CONFIG_KALLSYMS
 	const char *name;
 
-	sprint_symbol(str, address);
+	trace_sprint_symbol(str, address);
 	name = kretprobed(str);
 
 	if (name && strlen(name)) {
-- 
2.7.4

^ permalink raw reply related
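
The wrapper pattern in the patch above can be tried outside the kernel with a small sketch. ex_sprint_symbol() below is a hypothetical stand-in for sprint_symbol() as it behaves after patch 1/3 (it knows exactly one made-up symbol); only the fallback logic mirrors the patch.

```c
#include <stdio.h>

/* Hypothetical stand-in for sprint_symbol() after patch 1/3: it
 * resolves a single made-up address and otherwise empties the buffer
 * and returns -1.
 */
static int ex_sprint_symbol(char *buffer, unsigned long address)
{
	if (address == 0x1000UL)
		return sprintf(buffer, "ex_known_symbol+0x0/0x10");

	buffer[0] = '\0';
	return -1;
}

/* Same shape as trace_sprint_symbol() in the patch: on lookup failure,
 * fall back to printing the raw address.
 */
static int ex_trace_sprint_symbol(char *buffer, unsigned long address)
{
	int ret;

	ret = ex_sprint_symbol(buffer, address);
	if (ret == -1)
		ret = sprintf(buffer, "0x%lx", address);

	return ret;
}
```

Callers that want the old output simply switch to the wrapper; callers that must not leak addresses keep using the plain function and handle the -1 themselves.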

* [PATCH v2 0/3] kallsyms: don't leak address
From: Tobin C. Harding @ 2017-12-19  3:28 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Tobin C. Harding, Steven Rostedt, Tycho Andersen, Linus Torvalds,
	Kees Cook, Andrew Morton, Daniel Borkmann, Masahiro Yamada,
	Alexei Starovoitov, linux-kernel, Network Development

This set plugs a kernel address leak that occurs if kallsyms symbol
look up fails. This set was prompted by a leaking address found using
scripts/leaking_addresses.pl on a PowerPC machine in the wild.

$ perl scripts/leaking_addresses.pl		[address sanitized]
...
/proc/8025/task/8025/stack: [<0000000000000000>] 0xc0000001XXXXXXXX

$ uname -r
4.4.0-79-powerpc64-smp

Patch set does not change behaviour when KALLSYMS is not defined
(suggested by Linus).

Comments on version 1 indicated that current behaviour may be useful for
debugging. This version adds a kernel command-line parameter in order to
be able to preserve current behaviour (print raw address if kallsyms
symbol look up fails). (Command-line parameter suggested by Steve.)

New command-line parameter is documented only in the kernel-doc for
kallsyms functions sprint_symbol() and sprint_symbol_no_offset(). Is
this sufficient? Perhaps an entry in printk-formats.txt also?

Patch 1 - return error code if symbol look up fails unless new
	  command-line parameter 'insecure_print_all_symbols' is enabled.
Patch 2 - print <symbol not found> to buffer if symbol look up returns
	  an error.
Patch 3 - maintain current behaviour in ftrace.

thanks,
Tobin.

v2:
 - Add kernel command-line parameter.
 - Remove unnecessary function.
 - Fix broken ftrace code (and actually build and test ftrace code).

Patch 1 and 2 tested. Patch 3 (ftrace) tested but not all code paths
executed (discussed with Steve in another thread).

Tobin C. Harding (3):
  kallsyms: don't leak address when symbol not found
  vsprintf: print <no-symbol> if symbol not found
  trace: print address if symbol not found

 kernel/kallsyms.c                | 31 +++++++++++++++++++++++++------
 kernel/trace/trace.h             | 24 ++++++++++++++++++++++++
 kernel/trace/trace_events_hist.c |  6 +++---
 kernel/trace/trace_output.c      |  2 +-
 lib/vsprintf.c                   | 11 ++++++++---
 5 files changed, 61 insertions(+), 13 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH v2 2/3] vsprintf: print <no-symbol> if symbol not found
From: Tobin C. Harding @ 2017-12-19  3:28 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Tobin C. Harding, Steven Rostedt, Tycho Andersen, Linus Torvalds,
	Kees Cook, Andrew Morton, Daniel Borkmann, Masahiro Yamada,
	Alexei Starovoitov, linux-kernel, Network Development
In-Reply-To: <1513654094-16832-1-git-send-email-me@tobin.cc>

Depends on: commit 40eee173a35e ("kallsyms: don't leak address when
symbol not found")

Currently vsprintf for specifiers %p[SsB] relies on the behaviour of
kallsyms (sprint_symbol()) and prints the actual address if a symbol is
not found. Previous patch changes this behaviour so that sprint_symbol()
returns an error if symbol not found. With this patch in place we can
print a sanitized message '<symbol not found>' instead of leaking the
address.

Print '<symbol not found>' for printk specifier %p[sSB] if symbol look
up fails.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---
 lib/vsprintf.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 01c3957b2de6..820ed4fe6e6c 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -674,6 +674,8 @@ char *symbol_string(char *buf, char *end, void *ptr,
 	unsigned long value;
 #ifdef CONFIG_KALLSYMS
 	char sym[KSYM_SYMBOL_LEN];
+	const char *sym_not_found = "<symbol not found>";
+	int ret;
 #endif
 
 	if (fmt[1] == 'R')
@@ -682,11 +684,14 @@ char *symbol_string(char *buf, char *end, void *ptr,
 
 #ifdef CONFIG_KALLSYMS
 	if (*fmt == 'B')
-		sprint_backtrace(sym, value);
+		ret = sprint_backtrace(sym, value);
 	else if (*fmt != 'f' && *fmt != 's')
-		sprint_symbol(sym, value);
+		ret = sprint_symbol(sym, value);
 	else
-		sprint_symbol_no_offset(sym, value);
+		ret = sprint_symbol_no_offset(sym, value);
+
+	if (ret == -1)
+		strcpy(sym, sym_not_found);
 
 	return string(buf, end, sym, spec);
 #else
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 1/3] kallsyms: don't leak address when symbol not found
From: Tobin C. Harding @ 2017-12-19  3:28 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Tobin C. Harding, Steven Rostedt, Tycho Andersen, Linus Torvalds,
	Kees Cook, Andrew Morton, Daniel Borkmann, Masahiro Yamada,
	Alexei Starovoitov, linux-kernel, Network Development
In-Reply-To: <1513654094-16832-1-git-send-email-me@tobin.cc>

Currently if kallsyms_lookup() fails to find the symbol then the address
is printed. This potentially leaks sensitive information but is useful
for debugging. We would like to stop the leak but keep the current
behaviour when needed for debugging. To achieve this we can add a
command-line parameter that if enabled maintains the current
behaviour. If the command-line parameter is not enabled we can return an
error instead of printing the address giving the calling code the option
of how to handle the look up failure.

Add command-line parameter 'insecure_print_all_symbols'. If parameter is
not enabled return an error value instead of printing the raw address.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---
 kernel/kallsyms.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index d5fa4116688a..2707cf751437 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -383,6 +383,16 @@ int lookup_symbol_attrs(unsigned long addr, unsigned long *size,
 	return lookup_module_symbol_attrs(addr, size, offset, modname, name);
 }
 
+/* Enables printing of raw address when symbol look up fails */
+static bool insecure_print_all_symbols;
+
+static int __init enable_insecure_print_all_symbols(char *unused)
+{
+	insecure_print_all_symbols = true;
+	return 0;
+}
+early_param("insecure_print_all_symbols", enable_insecure_print_all_symbols);
+
 /* Look up a kernel symbol and return it in a text buffer. */
 static int __sprint_symbol(char *buffer, unsigned long address,
 			   int symbol_offset, int add_offset)
@@ -394,8 +404,15 @@ static int __sprint_symbol(char *buffer, unsigned long address,
 
 	address += symbol_offset;
 	name = kallsyms_lookup(address, &size, &offset, &modname, buffer);
-	if (!name)
-		return sprintf(buffer, "0x%lx", address - symbol_offset);
+	if (insecure_print_all_symbols) {
+		if (!name)
+			return sprintf(buffer, "0x%lx", address - symbol_offset);
+	} else {
+		if (!name) {
+			buffer[0] = '\0';
+			return -1;
+		}
+	}
 
 	if (name != buffer)
 		strcpy(buffer, name);
@@ -417,8 +434,9 @@ static int __sprint_symbol(char *buffer, unsigned long address,
  * @address: address to lookup
  *
  * This function looks up a kernel symbol with @address and stores its name,
- * offset, size and module name to @buffer if possible. If no symbol was found,
- * just saves its @address as is.
+ * offset, size and module name to @buffer if possible. If no symbol was found
+ * returns -1 unless kernel command-line parameter 'insecure_print_all_symbols'
+ * is enabled, in which case saves @address as is to buffer.
  *
  * This function returns the number of bytes stored in @buffer.
  */
@@ -434,8 +452,9 @@ EXPORT_SYMBOL_GPL(sprint_symbol);
  * @address: address to lookup
  *
  * This function looks up a kernel symbol with @address and stores its name
- * and module name to @buffer if possible. If no symbol was found, just saves
- * its @address as is.
+ * and module name to @buffer if possible. If no symbol was found, returns -1
+ * unless kernel command-line parameter 'insecure_print_all_symbols' is enabled,
+ * in which case saves @address as is to buffer.
  *
  * This function returns the number of bytes stored in @buffer.
  */
-- 
2.7.4

^ permalink raw reply related
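
A minimal userspace sketch of the gated behaviour described in patch 1/3 follows. The lookup function, the flag variable and the single known address are hypothetical; in the kernel the flag is set from the command line via early_param(), which is not modelled here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the flag set by the 'insecure_print_all_symbols'
 * command-line parameter.
 */
static bool ex_insecure_print_all_symbols;

/* Hypothetical symbol lookup: resolves one made-up address only. */
static const char *ex_lookup(unsigned long address)
{
	return address == 0x1000UL ? "ex_known_symbol" : NULL;
}

/* On lookup failure, print the raw address only if the flag is set;
 * otherwise leave an empty string in the buffer and return -1.
 */
static int ex_gated_sprint_symbol(char *buffer, unsigned long address)
{
	const char *name = ex_lookup(address);

	if (!name) {
		if (ex_insecure_print_all_symbols)
			return sprintf(buffer, "0x%lx", address);

		buffer[0] = '\0';
		return -1;
	}

	return sprintf(buffer, "%s", name);
}
```

The sketch tests !name once and then branches on the flag, which is intended to be behaviourally equivalent to the nested form in the patch.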

* Re: [PATCH v7 2/3] sock: Move the socket inuse to namespace.
From: David Miller @ 2017-12-19  3:22 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: xiangxia.m.yue, netdev
In-Reply-To: <CAM_iQpXhuQ03LB+DsVU4z=m5BCOWKTP1j1oUnBX7-rWC=z_oSA@mail.gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 18 Dec 2017 13:38:39 -0800

> On Mon, Dec 18, 2017 at 11:30 AM, David Miller <davem@davemloft.net> wrote:
>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>> Date: Thu, 14 Dec 2017 05:51:58 -0800
>>
>>> In some case, we want to know how many sockets are in use in
>>> different _net_ namespaces. It's a key resource metric.
>>
>> Useful or not, you're not exporting this value.
>>
>> All this patch series does is convert the existing export of the
>> global tally to add up the per-net values.
>>
>> So if you're not exporting the per-net value on it's own in any way,
>> this patch series isn't achieving the stated goal.
>>
>> I'm not applying this series, sorry.
> 
> 
> This value is already exported via procfs:
> sockstat_seq_show() -> socket_seq_show().
> 
> And the proc file itself should already be per-net:
> 
> static int sockstat_seq_open(struct inode *inode, struct file *file)
> {
>         return single_open_net(inode, file, sockstat_seq_show);
> }
> 
> 
> This patch just makes that value to be per-net too.

You're right, my bad.

I'll keep reviewing this.

^ permalink raw reply

* Re: [PATCH 3/3] trace: print address if symbol not found
From: Tobin C. Harding @ 2017-12-19  3:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: kernel-hardening, Tycho Andersen, Linus Torvalds, Kees Cook,
	Andrew Morton, Daniel Borkmann, Masahiro Yamada,
	Alexei Starovoitov, linux-kernel, Network Development
In-Reply-To: <20171219030011.GH19604@eros>

On Tue, Dec 19, 2017 at 02:00:11PM +1100, Tobin C. Harding wrote:
> On Mon, Dec 18, 2017 at 06:51:43PM -0500, Steven Rostedt wrote:
> > On Tue, 19 Dec 2017 08:16:14 +1100
> > "Tobin C. Harding" <me@tobin.cc> wrote:
> > 
> > > > >  #endif /* _LINUX_KERNEL_TRACE_H */
> > > > > diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> > > > > index 1e1558c99d56..3e28522a76f4 100644
> > > > > --- a/kernel/trace/trace_events_hist.c
> > > > > +++ b/kernel/trace/trace_events_hist.c
> > > > > @@ -982,7 +982,7 @@ static void hist_trigger_stacktrace_print(struct seq_file *m,
> > > > >  			return;
> > > > >  
> > > > >  		seq_printf(m, "%*c", 1 + spaces, ' ');
> > > > > -		sprint_symbol(str, stacktrace_entries[i]);
> > > > > +		trace_sprint_symbol_addr(str, stacktrace_entries[i]);  
> > > > 
> > 
> > > 
> > > If you have the time to give me some brief pointers on how I should go
> > > about testing this I'd love to test it before the next version. I know
> > > very little about ftrace.
> > 
> > For hitting the histogram stacktrace trigger (this code path), make
> > sure you have CONFIG_HIST_TRIGGERS enabled. And then do:
> > 
> >  # cd /sys/kernel/debug/tracing
> >  # echo 'hist:keys=common_pid.execname,stacktrace:vals=prev_state' > \
> >      events/sched/sched_switch/trigger
> >  # cat events/sched/sched_switch/hist
> > 
> > For the "sym" part, you can do (from the same directory):
> > 
> >  # echo 'hist:keys=call_site.sym:vals=bytes_req' > \
> >      events/kmem/kmalloc/trigger
> >  # cat events/kmem/kmalloc/hist
> > 
> > 
> > And for sym-offset:
> > 
> >  # echo 'hist:keys=call_site.sym-offset:vals=bytes_req' > \
> >     events/kmem/kmalloc/trigger
> >  # cat events/kmem/kmalloc/hist
> 
> I ran through these as outlined here for the new version (v4). This hits

Should have been:

                                                            v2

thanks,
Tobin.

^ permalink raw reply

* Re: [PATCH 3/3] trace: print address if symbol not found
From: Tobin C. Harding @ 2017-12-19  3:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: kernel-hardening, Tycho Andersen, Linus Torvalds, Kees Cook,
	Andrew Morton, Daniel Borkmann, Masahiro Yamada,
	Alexei Starovoitov, linux-kernel, Network Development
In-Reply-To: <20171218185143.4046a71b@gandalf.local.home>

On Mon, Dec 18, 2017 at 06:51:43PM -0500, Steven Rostedt wrote:
> On Tue, 19 Dec 2017 08:16:14 +1100
> "Tobin C. Harding" <me@tobin.cc> wrote:
> 
> > > >  #endif /* _LINUX_KERNEL_TRACE_H */
> > > > diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> > > > index 1e1558c99d56..3e28522a76f4 100644
> > > > --- a/kernel/trace/trace_events_hist.c
> > > > +++ b/kernel/trace/trace_events_hist.c
> > > > @@ -982,7 +982,7 @@ static void hist_trigger_stacktrace_print(struct seq_file *m,
> > > >  			return;
> > > >  
> > > >  		seq_printf(m, "%*c", 1 + spaces, ' ');
> > > > -		sprint_symbol(str, stacktrace_entries[i]);
> > > > +		trace_sprint_symbol_addr(str, stacktrace_entries[i]);  
> > > 
> 
> > 
> > If you have the time to give me some brief pointers on how I should go
> > about testing this I'd love to test it before the next version. I know
> > very little about ftrace.
> 
> For hitting the histogram stacktrace trigger (this code path), make
> sure you have CONFIG_HIST_TRIGGERS enabled. And then do:
> 
>  # cd /sys/kernel/debug/tracing
>  # echo 'hist:keys=common_pid.execname,stacktrace:vals=prev_state' > \
>      events/sched/sched_switch/trigger
>  # cat events/sched/sched_switch/hist
> 
> For the "sym" part, you can do (from the same directory):
> 
>  # echo 'hist:keys=call_site.sym:vals=bytes_req' > \
>      events/kmem/kmalloc/trigger
>  # cat events/kmem/kmalloc/hist
> 
> 
> And for sym-offset:
> 
>  # echo 'hist:keys=call_site.sym-offset:vals=bytes_req' > \
>     events/kmem/kmalloc/trigger
>  # cat events/kmem/kmalloc/hist

I ran through these as outlined here for the new version (v4). This hits
the modified code but doesn't test symbol look up failure.

I also configured kernel with 'Perform a startup test on ftrace' for
good luck.

Are you happy with this level of testing?

thanks,
Tobin.

^ permalink raw reply

* [PATCH net] net: always reevaluate autoflowlabel setting for reset packet
From: Shaohua Li @ 2017-12-19  2:58 UTC (permalink / raw)
  To: netdev, davem; +Cc: Kernel Team, Shaohua Li, Martin KaFai Lau

From: Shaohua Li <shli@fb.com>

ipv6_pinfo.autoflowlabel is set at sock creation. If
sysctl.ipv6.auto_flowlabels is changed later, ipv6_pinfo.autoflowlabel is
not updated, so the sock keeps the old auto flowlabel behavior. Reset
packets suffer from this problem because they are sent from a special
control socket, which is created at boot time. Since
sysctl.ipv6.auto_flowlabels is 2 by default, the control socket will
always have its ipv6_pinfo.autoflowlabel set, even after the user sets
sysctl.ipv6.auto_flowlabels to 1, so reset packets will always have a
flowlabel.

To fix this, we always reevaluate the autoflowlabel setting for reset
packets. Normal socks have the same issue, but since
sysctl.ipv6.auto_flowlabels is usually set at host startup, this isn't a
big issue for normal socks.

Cc: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 net/ipv6/tcp_ipv6.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 7178476..fc35233 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -789,7 +789,9 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 	unsigned int tot_len = sizeof(struct tcphdr);
 	struct dst_entry *dst;
 	__be32 *topt;
+	struct ipv6_pinfo *np = inet6_sk(ctl_sk);

+	np->autoflowlabel = ip6_default_np_autolabel(net);
 	if (tsecr)
 		tot_len += TCPOLEN_TSTAMP_ALIGNED;
 #ifdef CONFIG_TCP_MD5SIG
-- 
2.9.5

^ permalink raw reply related

* Re: [Patch net-next] net_sched: properly check for empty skb array on error path
From: Cong Wang @ 2017-12-19  2:20 UTC (permalink / raw)
  To: John Fastabend; +Cc: Linux Kernel Network Developers
In-Reply-To: <11149665-47bb-ec52-fdf0-db7bfa67152e@gmail.com>

On Mon, Dec 18, 2017 at 5:25 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 12/18/2017 02:34 PM, Cong Wang wrote:
>> First, the check of &q->ring.queue against NULL is wrong, it
>> is always false. We should check the value rather than the address.
>>
>
> Thanks.
>
>> Secondly, we need the same check in pfifo_fast_reset() too,
>> as both ->reset() and ->destroy() are called in qdisc_destroy().
>>
>
> not that it hurts to have the check here, but if init fails
> in qdisc_create it seems only ->destroy() is called without
> a ->reset().
>
> Is there another path for init() to fail that I'm missing.

Pretty sure ->reset() is called in qdisc_destroy() and also before
->destroy():


void qdisc_destroy(struct Qdisc *qdisc)
{
        const struct Qdisc_ops  *ops = qdisc->ops;
        struct sk_buff *skb, *tmp;

        if (qdisc->flags & TCQ_F_BUILTIN ||
            !refcount_dec_and_test(&qdisc->refcnt))
                return;

#ifdef CONFIG_NET_SCHED
        qdisc_hash_del(qdisc);

        qdisc_put_stab(rtnl_dereference(qdisc->stab));
#endif
        gen_kill_estimator(&qdisc->rate_est);
        if (ops->reset)
                ops->reset(qdisc);
        if (ops->destroy)
                ops->destroy(qdisc);

^ permalink raw reply
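
The first point in the quoted patch description (checking the address of a member versus its value) can be shown with a small sketch. The structure and function names are hypothetical and only loosely modelled on an array wrapper with a queue pointer inside it; they are not the pfifo_fast code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring with a queue pointer that stays NULL until it is
 * allocated.
 */
struct ex_ring {
	void **queue;
};

struct ex_array {
	struct ex_ring ring;
};

/* Buggy form: tests the address of the member. For any valid @q this
 * address is non-NULL, so the result is always false.
 */
static bool ex_is_empty_by_address(struct ex_array *q)
{
	void ***slot = &q->ring.queue;

	return slot == NULL;
}

/* Intended form: tests the value stored in the member. */
static bool ex_is_empty_by_value(struct ex_array *q)
{
	return q->ring.queue == NULL;
}
```

Only the second helper can tell an unallocated queue from an allocated one, which is what an error path needs before touching the array.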

* RE: [Patch v2] net: phy: marvell: Limit 88m1101 autoneg errata to 88E1145 as well.
From: Qiang Zhao @ 2017-12-19  2:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <20171218.131957.1315720305748482852.davem@davemloft.net>

 From: David Miller <davem@davemloft.net>
 Date: Tue, 19 Dec 2017 2:20AM
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Tuesday, December 19, 2017 2:20 AM
> To: Qiang Zhao <qiang.zhao@nxp.com>
> Cc: netdev@vger.kernel.org
> Subject: Re: [Patch v2] net: phy: marvell: Limit 88m1101 autoneg errata to
> 88E1145 as well.
> 
> From: Zhao Qiang <qiang.zhao@nxp.com>
> Date: Mon, 18 Dec 2017 10:26:43 +0800
> 
> > 88E1145 also need this autoneg errata.
> >
> > Fixes: f2899788353c ("net: phy: marvell: Limit errata to 88m1101")
> > Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com>
> > ---
> > Changes for v2
> > 	- modify the commit msg in a proper way.
> 
> Applied and queued up for -stable.

Thank you!

Best Regards
Qiang Zhao

^ permalink raw reply

* Re: [PATCH v4 16/36] nds32: System calls handling
From: Vincent Chen @ 2017-12-19  2:10 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Greentime Hu, Greentime, Linux Kernel Mailing List, linux-arch,
	Thomas Gleixner, Jason Cooper, Marc Zyngier, Rob Herring,
	Networking, DTML, Al Viro, David Howells, Will Deacon,
	Daniel Lezcano, linux-serial-u79uwXL29TY76Z2rM5mHXA,
	Geert Uytterhoeven, Linus Walleij, Mark Rutland, Greg KH,
	ren_guo-Y+KPrCd2zL4AvxtiuMwx3w, Philipp
In-Reply-To: <CAK8P3a2Yas3rWdx_qYx48PECundOzRSKOsqkJnUTzGW86OjJVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

2017-12-18 19:19 GMT+08:00 Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>:
> On Mon, Dec 18, 2017 at 7:46 AM, Greentime Hu <green.hu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>
>> new file mode 100644
>> index 0000000..90da745
>> --- /dev/null
>> +++ b/arch/nds32/include/uapi/asm/unistd.h
>> @@ -0,0 +1,12 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (C) 2005-2017 Andes Technology Corporation
>> +
>> +#define __ARCH_WANT_SYNC_FILE_RANGE2
>> +
>> +/* Use the standard ABI for syscalls */
>> +#include <asm-generic/unistd.h>
>> +
>> +/* Additional NDS32 specific syscalls. */
>> +#define __NR_cacheflush                (__NR_arch_specific_syscall)
>> +#define __NR__llseek             __NR_llseek
>> +__SYSCALL(__NR_cacheflush, sys_cacheflush)
>
> I'm still confused by __NR__llseek here, why do you need that one?
>

Dear Arnd:
We hoped to solve the ABI register alignment problem for llseek in glibc
by using __NR__llseek.
After checking glibc again, I found that glibc has the same __NR__llseek
macro, and it is better to solve this problem there.
So I will remove this definition in the next version of the patch.


>> +SYSCALL_DEFINE6(mmap2, unsigned long, addr, unsigned long, len,
>> +              unsigned long, prot, unsigned long, flags,
>> +              unsigned long, fd, unsigned long, pgoff)
>> +{
>> +       if (pgoff & (~PAGE_MASK >> 12))
>> +               return -EINVAL;
>> +
>> +       return sys_mmap_pgoff(addr, len, prot, flags, fd,
>> +                             pgoff >> (PAGE_SHIFT - 12));
>> +}
>> +
>> +SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
>> +              unsigned long, prot, unsigned long, flags,
>> +              unsigned long, fd, unsigned long, pgoff)
>> +{
>> +       if (unlikely(pgoff & ~PAGE_MASK))
>> +               return -EINVAL;
>> +
>> +       return sys_mmap_pgoff(addr, len, prot, flags, fd,
>> +                             pgoff >> PAGE_SHIFT);
>> +}
>
> And I don't see why you define sys_mmap() in addition to sys_mmap2().
>
This is my mistake. I will remove it in the next version patch.

> The rest of the syscall handling looks good now.
>
>          Arnd


Thanks
Vincent

^ permalink raw reply
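
The offset handling in the quoted mmap2 code can be worked through with a small sketch. It assumes that mmap2 takes its offset in 4096-byte units and uses a hypothetical 8 KiB page size so that the check and the shift are not no-ops. The EX_* names and the helper are illustrative only.

```c
#include <errno.h>

#define EX_PAGE_SHIFT	13UL			/* hypothetical 8 KiB pages */
#define EX_PAGE_SIZE	(1UL << EX_PAGE_SHIFT)
#define EX_PAGE_MASK	(~(EX_PAGE_SIZE - 1))

/* Convert an offset given in 4096-byte units into a page offset.
 * Offsets that are not a multiple of the page size are rejected with
 * -EINVAL, mirroring the check in the quoted mmap2 code.
 */
static long ex_mmap2_pgoff(unsigned long pgoff_4k)
{
	/* ~EX_PAGE_MASK >> 12 is 0x1 here: the low bit of the 4 KiB
	 * offset must be clear for an 8 KiB-aligned file offset.
	 */
	if (pgoff_4k & (~EX_PAGE_MASK >> 12))
		return -EINVAL;

	return (long)(pgoff_4k >> (EX_PAGE_SHIFT - 12));
}
```

With a 4 KiB page size the mask becomes 0 and the shift becomes 0, so every offset is accepted and passed through unchanged.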

* [PATCH net-next v2] cxgb4: RSS table is 4k for T6
From: Ganesh Goudar @ 2017-12-19  1:52 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, indranil, venkatesh, Ganesh Goudar

RSS table is 4k for T6 and later cards, add check for the
same.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
v2: Not a series, It is single patch
---
 drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c     |  5 ++--
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |  1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c   |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |  7 ++---
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c         | 13 +++++++--
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h         | 31 +++++++++++-----------
 6 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
index d73fb6a..336670d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
@@ -1004,9 +1004,10 @@ int cudbg_collect_rss(struct cudbg_init *pdbg_init,
 {
 	struct adapter *padap = pdbg_init->adap;
 	struct cudbg_buffer temp_buff = { 0 };
-	int rc;
+	int rc, nentries;
 
-	rc = cudbg_get_buff(dbg_buff, RSS_NENTRIES * sizeof(u16), &temp_buff);
+	nentries = t4_chip_rss_size(padap);
+	rc = cudbg_get_buff(dbg_buff, nentries * sizeof(u16), &temp_buff);
 	if (rc)
 		return rc;
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index b1df2aa..69d0b64 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -1528,6 +1528,7 @@ int t4_init_portinfo(struct port_info *pi, int mbox,
 		     int port, int pf, int vf, u8 mac[]);
 int t4_port_init(struct adapter *adap, int mbox, int pf, int vf);
 void t4_fatal_err(struct adapter *adapter);
+unsigned int t4_chip_rss_size(struct adapter *adapter);
 int t4_config_rss_range(struct adapter *adapter, int mbox, unsigned int viid,
 			int start, int n, const u16 *rspq, unsigned int nrspq);
 int t4_config_glbl_rss(struct adapter *adapter, int mbox, unsigned int mode,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
index 41c8736..581d628 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
@@ -179,7 +179,7 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
 		len = cudbg_mbytes_to_bytes(len);
 		break;
 	case CUDBG_RSS:
-		len = RSS_NENTRIES * sizeof(u16);
+		len = t4_chip_rss_size(adap) * sizeof(u16);
 		break;
 	case CUDBG_RSS_VF_CONF:
 		len = adap->params.arch.vfcount *
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index d8efcd9..d3ced04 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -2021,11 +2021,12 @@ static int rss_show(struct seq_file *seq, void *v, int idx)
 
 static int rss_open(struct inode *inode, struct file *file)
 {
-	int ret;
-	struct seq_tab *p;
 	struct adapter *adap = inode->i_private;
+	int ret, nentries;
+	struct seq_tab *p;
 
-	p = seq_open_tab(file, RSS_NENTRIES / 8, 8 * sizeof(u16), 0, rss_show);
+	nentries = t4_chip_rss_size(adap);
+	p = seq_open_tab(file, nentries / 8, 8 * sizeof(u16), 0, rss_show);
 	if (!p)
 		return -ENOMEM;
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index f044717..242bcdd 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -4927,6 +4927,14 @@ void t4_intr_disable(struct adapter *adapter)
 	t4_set_reg_field(adapter, PL_INT_MAP0_A, 1 << pf, 0);
 }
 
+unsigned int t4_chip_rss_size(struct adapter *adap)
+{
+	if (CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5)
+		return RSS_NENTRIES;
+	else
+		return T6_RSS_NENTRIES;
+}
+
 /**
  *	t4_config_rss_range - configure a portion of the RSS mapping table
  *	@adapter: the adapter
@@ -5065,10 +5073,11 @@ static int rd_rss_row(struct adapter *adap, int row, u32 *val)
  */
 int t4_read_rss(struct adapter *adapter, u16 *map)
 {
+	int i, ret, nentries;
 	u32 val;
-	int i, ret;
 
-	for (i = 0; i < RSS_NENTRIES / 2; ++i) {
+	nentries = t4_chip_rss_size(adapter);
+	for (i = 0; i < nentries / 2; ++i) {
 		ret = rd_rss_row(adapter, i, &val);
 		if (ret)
 			return ret;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
index 872a91b..361d503 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
@@ -38,21 +38,22 @@
 #include <linux/types.h>
 
 enum {
-	NCHAN          = 4,     /* # of HW channels */
-	MAX_MTU        = 9600,  /* max MAC MTU, excluding header + FCS */
-	EEPROMSIZE     = 17408, /* Serial EEPROM physical size */
-	EEPROMVSIZE    = 32768, /* Serial EEPROM virtual address space size */
-	EEPROMPFSIZE   = 1024,  /* EEPROM writable area size for PFn, n>0 */
-	RSS_NENTRIES   = 2048,  /* # of entries in RSS mapping table */
-	TCB_SIZE       = 128,   /* TCB size */
-	NMTUS          = 16,    /* size of MTU table */
-	NCCTRL_WIN     = 32,    /* # of congestion control windows */
-	NTX_SCHED      = 8,     /* # of HW Tx scheduling queues */
-	PM_NSTATS      = 5,     /* # of PM stats */
-	T6_PM_NSTATS   = 7,     /* # of PM stats in T6 */
-	MBOX_LEN       = 64,    /* mailbox size in bytes */
-	TRACE_LEN      = 112,   /* length of trace data and mask */
-	FILTER_OPT_LEN = 36,    /* filter tuple width for optional components */
+	NCHAN           = 4,    /* # of HW channels */
+	MAX_MTU         = 9600, /* max MAC MTU, excluding header + FCS */
+	EEPROMSIZE      = 17408,/* Serial EEPROM physical size */
+	EEPROMVSIZE     = 32768,/* Serial EEPROM virtual address space size */
+	EEPROMPFSIZE    = 1024, /* EEPROM writable area size for PFn, n>0 */
+	RSS_NENTRIES    = 2048, /* # of entries in RSS mapping table */
+	T6_RSS_NENTRIES = 4096, /* # of entries in RSS mapping table */
+	TCB_SIZE        = 128,  /* TCB size */
+	NMTUS           = 16,   /* size of MTU table */
+	NCCTRL_WIN      = 32,   /* # of congestion control windows */
+	NTX_SCHED       = 8,    /* # of HW Tx scheduling queues */
+	PM_NSTATS       = 5,    /* # of PM stats */
+	T6_PM_NSTATS    = 7,    /* # of PM stats in T6 */
+	MBOX_LEN        = 64,   /* mailbox size in bytes */
+	TRACE_LEN       = 112,  /* length of trace data and mask */
+	FILTER_OPT_LEN  = 36,   /* filter tuple width for optional components */
 };
 
 enum {
-- 
2.1.0

^ permalink raw reply related

* Re: [PATCH net-next 17/17] net: hns3: change TM sched mode to TC-based mode when SRIOV enabled
From: lipeng (Y) @ 2017-12-19  1:41 UTC (permalink / raw)
  To: Sergei Shtylyov, davem; +Cc: netdev, linux-kernel, linuxarm, salil.mehta
In-Reply-To: <378b5b7e-e1fc-4c35-d198-8cd9c61b0db9@cogentembedded.com>



On 2017/12/18 17:08, Sergei Shtylyov wrote:
> On 12/18/2017 12:31 PM, Lipeng wrote:
>
>> TC-based sched mode supports SRIOV enabled and SRIOV disabled. This
>> patch changes the TM sched mode to TC-based mode in initialization
>> process.
>>
>> Fixes: cc9bb43 (net: hns3: Add tc-based TM support for sriov enabled 
>> port)
>
>    Need at least 12 hex digits.
>

Agreed, some hex digits may have been lost; will fix it.

>> Signed-off-by: Lipeng <lipeng321@huawei.com>
> [...]
>
> MBR, Sergei
>
>

^ permalink raw reply
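
The review comment above asks for at least 12 hex digits in the Fixes: tag. A minimal sketch of such a check in standalone C follows; the function name `fixes_tag_ok` is hypothetical and this is not how any kernel tooling is known to implement it, just an illustration of the rule.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the line starts with "Fixes: " followed by at least
 * 12 hexadecimal digits. Hypothetical helper, for illustration only. */
static bool fixes_tag_ok(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t n = 0;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	/* Skip the prefix and count the leading hex digits of the hash. */
	line += sizeof(prefix) - 1;
	while (isxdigit((unsigned char)line[n]))
		n++;

	return n >= 12;
}
```

Under this sketch, the 7-digit hash quoted in the thread would be rejected, while a 12-digit hash such as c4fb2cdf575d would pass.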

* Re: [PATCH net-next 14/17] net: hns3: add Asym Pause support to phy default features
From: lipeng (Y) @ 2017-12-19  1:40 UTC (permalink / raw)
  To: Sergei Shtylyov, davem; +Cc: netdev, linux-kernel, linuxarm, salil.mehta
In-Reply-To: <b3ec4aee-5a79-5f5a-9c11-f0645ae4f237@cogentembedded.com>



On 2017/12/18 17:07, Sergei Shtylyov wrote:
> Hello!
>
> On 12/18/2017 12:31 PM, Lipeng wrote:
>
>> From: Fuyun Liang <liangfuyun1@huawei.com>
>>
>> commit c4fb2cdf575d (net: hns3: fix a bug for phy supported feature
>> initialization) adds default supported features for phy, but our 
>> hardware
>
>    The cited commit's summary needs to be enclosed in (""), not just 
> ()...
>
Thanks, will fix it.

>> also supports Asym Pause. This patch adds Asym Pause support to phy
>> default features so that Asym Pause can be advertised when the phy
>> negotiates flow control.
>>
>> Fixes: c4fb2cdf575d (net: hns3: fix a bug for phy supported feature 
>> initialization)
>
>    Here as well...
>
Will fix here too.

Thanks

>> Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
>> Signed-off-by: Lipeng <lipeng321@huawei.com>
> [...]
>
> MBR, Sergei
>
>

^ permalink raw reply

* [PATCH net-next 1/2] cxgb4: RSS table is 4k for T6
From: Ganesh Goudar @ 2017-12-19  1:39 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, indranil, venkatesh, Ganesh Goudar

The RSS table has 4k entries on T6 and later cards; add a check for
that.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c     |  5 ++--
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |  1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c   |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |  7 ++---
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c         | 13 +++++++--
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h         | 31 +++++++++++-----------
 6 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
index d73fb6a..336670d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
@@ -1004,9 +1004,10 @@ int cudbg_collect_rss(struct cudbg_init *pdbg_init,
 {
 	struct adapter *padap = pdbg_init->adap;
 	struct cudbg_buffer temp_buff = { 0 };
-	int rc;
+	int rc, nentries;
 
-	rc = cudbg_get_buff(dbg_buff, RSS_NENTRIES * sizeof(u16), &temp_buff);
+	nentries = t4_chip_rss_size(padap);
+	rc = cudbg_get_buff(dbg_buff, nentries * sizeof(u16), &temp_buff);
 	if (rc)
 		return rc;
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index b1df2aa..69d0b64 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -1528,6 +1528,7 @@ int t4_init_portinfo(struct port_info *pi, int mbox,
 		     int port, int pf, int vf, u8 mac[]);
 int t4_port_init(struct adapter *adap, int mbox, int pf, int vf);
 void t4_fatal_err(struct adapter *adapter);
+unsigned int t4_chip_rss_size(struct adapter *adapter);
 int t4_config_rss_range(struct adapter *adapter, int mbox, unsigned int viid,
 			int start, int n, const u16 *rspq, unsigned int nrspq);
 int t4_config_glbl_rss(struct adapter *adapter, int mbox, unsigned int mode,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
index 41c8736..581d628 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
@@ -179,7 +179,7 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
 		len = cudbg_mbytes_to_bytes(len);
 		break;
 	case CUDBG_RSS:
-		len = RSS_NENTRIES * sizeof(u16);
+		len = t4_chip_rss_size(adap) * sizeof(u16);
 		break;
 	case CUDBG_RSS_VF_CONF:
 		len = adap->params.arch.vfcount *
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 4956e42..200bf67 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -2021,11 +2021,12 @@ static int rss_show(struct seq_file *seq, void *v, int idx)
 
 static int rss_open(struct inode *inode, struct file *file)
 {
-	int ret;
-	struct seq_tab *p;
 	struct adapter *adap = inode->i_private;
+	int ret, nentries;
+	struct seq_tab *p;
 
-	p = seq_open_tab(file, RSS_NENTRIES / 8, 8 * sizeof(u16), 0, rss_show);
+	nentries = t4_chip_rss_size(adap);
+	p = seq_open_tab(file, nentries / 8, 8 * sizeof(u16), 0, rss_show);
 	if (!p)
 		return -ENOMEM;
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index f044717..242bcdd 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -4927,6 +4927,14 @@ void t4_intr_disable(struct adapter *adapter)
 	t4_set_reg_field(adapter, PL_INT_MAP0_A, 1 << pf, 0);
 }
 
+unsigned int t4_chip_rss_size(struct adapter *adap)
+{
+	if (CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5)
+		return RSS_NENTRIES;
+	else
+		return T6_RSS_NENTRIES;
+}
+
 /**
  *	t4_config_rss_range - configure a portion of the RSS mapping table
  *	@adapter: the adapter
@@ -5065,10 +5073,11 @@ static int rd_rss_row(struct adapter *adap, int row, u32 *val)
  */
 int t4_read_rss(struct adapter *adapter, u16 *map)
 {
+	int i, ret, nentries;
 	u32 val;
-	int i, ret;
 
-	for (i = 0; i < RSS_NENTRIES / 2; ++i) {
+	nentries = t4_chip_rss_size(adapter);
+	for (i = 0; i < nentries / 2; ++i) {
 		ret = rd_rss_row(adapter, i, &val);
 		if (ret)
 			return ret;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
index 872a91b..361d503 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
@@ -38,21 +38,22 @@
 #include <linux/types.h>
 
 enum {
-	NCHAN          = 4,     /* # of HW channels */
-	MAX_MTU        = 9600,  /* max MAC MTU, excluding header + FCS */
-	EEPROMSIZE     = 17408, /* Serial EEPROM physical size */
-	EEPROMVSIZE    = 32768, /* Serial EEPROM virtual address space size */
-	EEPROMPFSIZE   = 1024,  /* EEPROM writable area size for PFn, n>0 */
-	RSS_NENTRIES   = 2048,  /* # of entries in RSS mapping table */
-	TCB_SIZE       = 128,   /* TCB size */
-	NMTUS          = 16,    /* size of MTU table */
-	NCCTRL_WIN     = 32,    /* # of congestion control windows */
-	NTX_SCHED      = 8,     /* # of HW Tx scheduling queues */
-	PM_NSTATS      = 5,     /* # of PM stats */
-	T6_PM_NSTATS   = 7,     /* # of PM stats in T6 */
-	MBOX_LEN       = 64,    /* mailbox size in bytes */
-	TRACE_LEN      = 112,   /* length of trace data and mask */
-	FILTER_OPT_LEN = 36,    /* filter tuple width for optional components */
+	NCHAN           = 4,    /* # of HW channels */
+	MAX_MTU         = 9600, /* max MAC MTU, excluding header + FCS */
+	EEPROMSIZE      = 17408,/* Serial EEPROM physical size */
+	EEPROMVSIZE     = 32768,/* Serial EEPROM virtual address space size */
+	EEPROMPFSIZE    = 1024, /* EEPROM writable area size for PFn, n>0 */
+	RSS_NENTRIES    = 2048, /* # of entries in RSS mapping table */
+	T6_RSS_NENTRIES = 4096, /* # of entries in RSS mapping table */
+	TCB_SIZE        = 128,  /* TCB size */
+	NMTUS           = 16,   /* size of MTU table */
+	NCCTRL_WIN      = 32,   /* # of congestion control windows */
+	NTX_SCHED       = 8,    /* # of HW Tx scheduling queues */
+	PM_NSTATS       = 5,    /* # of PM stats */
+	T6_PM_NSTATS    = 7,    /* # of PM stats in T6 */
+	MBOX_LEN        = 64,   /* mailbox size in bytes */
+	TRACE_LEN       = 112,  /* length of trace data and mask */
+	FILTER_OPT_LEN  = 36,   /* filter tuple width for optional components */
 };
 
 enum {
-- 
2.1.0

^ permalink raw reply related
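
The chip-dependent table size introduced by the patch above can be sketched in standalone C. This is an illustrative user-space sketch, not driver code: the `chip_version` enum and the `rss_table_size()` and `rss_dump_bytes()` helpers are hypothetical stand-ins for `CHELSIO_CHIP_VERSION()` and `t4_chip_rss_size()`; only the 2048 and 4096 entry counts are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's chip version values. */
enum chip_version { CHIP_T4 = 4, CHIP_T5 = 5, CHIP_T6 = 6 };

enum {
	RSS_NENTRIES    = 2048, /* RSS table entries up to T5 (from the patch) */
	T6_RSS_NENTRIES = 4096, /* RSS table entries on T6 and later (from the patch) */
};

/* Return the number of RSS table entries for a given chip version. */
static unsigned int rss_table_size(enum chip_version ver)
{
	return ver <= CHIP_T5 ? RSS_NENTRIES : T6_RSS_NENTRIES;
}

/* Bytes needed to dump the whole table, one 16-bit value per entry. */
static size_t rss_dump_bytes(enum chip_version ver)
{
	return (size_t)rss_table_size(ver) * sizeof(uint16_t);
}
```

Deriving every buffer length and loop bound from one helper, as the patch does, avoids having callers that still assume the smaller table.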

* Re: [v2 PATCH -tip 3/6] net: sctp: Add SCTP ACK tracking trace event
From: Masami Hiramatsu @ 2017-12-19  1:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Ian McDonald, Vlad Yasevich, Stephen Hemminger,
	Peter Zijlstra, Thomas Gleixner, LKML, H . Peter Anvin,
	Gerrit Renker, David S . Miller, Neil Horman, dccp, netdev,
	linux-sctp, Stephen Rothwell
In-Reply-To: <20171218120516.2d4398b2@gandalf.local.home>

On Mon, 18 Dec 2017 12:05:16 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 18 Dec 2017 17:12:15 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > Add SCTP ACK tracking trace event to trace the changes of SCTP
> > association state in response to incoming packets.
> > It is used for debugging SCTP congestion control algorithms,
> > and will replace sctp_probe module.
> > 
> > Note that this event is a bit tricky. Since it consists of 2
> > events (sctp_probe and sctp_probe_path), you have to enable
> > both events as below.
> > 
> >   # cd /sys/kernel/debug/tracing
> >   # echo 1 > events/sctp/sctp_probe/enable
> >   # echo 1 > events/sctp/sctp_probe_path/enable
> > 
> > Or, you can enable all the events under sctp.
> > 
> >   # echo 1 > events/sctp/enable
> > 
> > Since sctp_probe_path event is always invoked from sctp_probe
> > event, you can not see any output if you only enable
> > sctp_probe_path.
> 
> I have to ask, why did you do it this way?
> 
> 
> > +#include <trace/define_trace.h>
> > diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
> > index 8f8ccded13e4..c5f92b2cc5c3 100644
> > --- a/net/sctp/sm_statefuns.c
> > +++ b/net/sctp/sm_statefuns.c
> > @@ -59,6 +59,9 @@
> >  #include <net/sctp/sm.h>
> >  #include <net/sctp/structs.h>
> >  
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/sctp.h>
> > +
> >  static struct sctp_packet *sctp_abort_pkt_new(
> >  					struct net *net,
> >  					const struct sctp_endpoint *ep,
> > @@ -3219,6 +3222,8 @@ enum sctp_disposition sctp_sf_eat_sack_6_2(struct net *net,
> >  	struct sctp_sackhdr *sackh;
> >  	__u32 ctsn;
> >  
> > +	trace_sctp_probe(ep, asoc, chunk);
> 
> What about doing this right after this probe:
> 
> 	if (trace_sctp_probe_path_enabled()) {
> 		struct sctp_transport *sp;
> 
> 		list_for_each_entry(sp, &asoc->peer.transport_addr_list,
> 				    transports) {
> 			trace_sctp_probe_path(sp, asoc);
> 		}
> 	}
> 
> The "trace_sctp_probe_path_enabled()" is a static branch, which means
> it's a nop just like a tracepoint is, and will not add any overhead if
> the trace_sctp_probe_path is not enabled.

That's a good idea! I'll update to use it :)

Thank you,

> 
> -- Steve
> 
> > +
> >  	if (!sctp_vtag_verify(chunk, asoc))
> >  		return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
> >  
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply
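
Steven's suggestion above, checking whether the per-path event is enabled before walking the transport list, can be sketched in user-space C. A plain `bool` stands in for the kernel's static branch here, and the `transport`, `assoc`, and `probe_path_*` names are hypothetical, made up for illustration; the real code would use `trace_sctp_probe_path_enabled()` and `list_for_each_entry()` as in the quoted snippet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for sctp_transport and sctp_association. */
struct transport {
	struct transport *next;
};

struct assoc {
	struct transport *transports; /* singly linked list of peer paths */
};

/* Plain flag standing in for the tracepoint's static branch. */
static bool probe_path_enabled;

/* Counts emitted per-path events so the sketch has observable output. */
static unsigned int probe_path_events;

/* Stand-in for the per-path trace event. */
static void probe_path(const struct transport *t, const struct assoc *a)
{
	(void)t;
	(void)a;
	probe_path_events++;
}

/* Walk the transport list only when the per-path event is enabled. */
static void probe_all_paths(const struct assoc *a)
{
	const struct transport *t;

	if (!probe_path_enabled)
		return; /* disabled: skip the list walk entirely */

	for (t = a->transports; t; t = t->next)
		probe_path(t, a);
}
```

With the flag clear, the loop is skipped entirely, which is the property the static branch gives the kernel version without a conditional load on the fast path.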

* Re: [Patch net-next] net_sched: properly check for empty skb array on error path
From: John Fastabend @ 2017-12-19  1:25 UTC (permalink / raw)
  To: Cong Wang, netdev
In-Reply-To: <20171218223426.4685-1-xiyou.wangcong@gmail.com>

On 12/18/2017 02:34 PM, Cong Wang wrote:
> First, the check of &q->ring.queue against NULL is wrong, it
> is always false. We should check the value rather than the address.
> 

Thanks.

> Secondly, we need the same check in pfifo_fast_reset() too,
> as both ->reset() and ->destroy() are called in qdisc_destroy().
> 

Not that it hurts to have the check here, but if init fails
in qdisc_create it seems only ->destroy() is called without
a ->reset().

Is there another path for init() to fail that I'm missing?

> Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  net/sched/sch_generic.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 

^ permalink raw reply
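
The first point in the quoted commit message, that comparing the address of a struct member against NULL can never detect an unallocated queue, can be illustrated in standalone C. The `ring` and `band` structs are simplified, hypothetical stand-ins for `ptr_ring` and `skb_array`, not the kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for ptr_ring and skb_array. */
struct ring {
	void **queue; /* NULL until the ring has been allocated */
};

struct band {
	struct ring ring;
};

/* Buggy check: the address of an embedded member is never NULL. */
static bool band_is_unallocated_buggy(const struct band *q)
{
	const void *addr = &q->ring.queue;

	return addr == NULL; /* always false for a valid q */
}

/* Fixed check: test the pointer value stored in the member. */
static bool band_is_unallocated(const struct band *q)
{
	return q->ring.queue == NULL;
}
```

Only the second form distinguishes a band whose allocation failed from one that is ready to use.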

* linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2017-12-19  0:51 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, Zhao Qiang,
	Heiner Kallweit

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/phy/marvell.c

between commit:

  c505873eaece ("net: phy: marvell: Limit 88m1101 autoneg errata to 88E1145 as well.")

from the net tree and commit:

  80274abafc60 ("net: phy: remove generic settings for callbacks config_aneg and read_status from drivers")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/phy/marvell.c
index 82104edca393,2fc026dc170a..000000000000
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@@ -2085,8 -2070,7 +2082,7 @@@ static struct phy_driver marvell_driver
  		.flags = PHY_HAS_INTERRUPT,
  		.probe = marvell_probe,
  		.config_init = &m88e1145_config_init,
 -		.config_aneg = &marvell_config_aneg,
 +		.config_aneg = &m88e1101_config_aneg,
- 		.read_status = &genphy_read_status,
  		.ack_interrupt = &marvell_ack_interrupt,
  		.config_intr = &marvell_config_intr,
  		.resume = &genphy_resume,

^ permalink raw reply

* Re: [PATCH net-next] bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog
From: Daniel Borkmann @ 2017-12-19  0:46 UTC (permalink / raw)
  To: Yonghong Song, ast, netdev; +Cc: guro, kernel-team
In-Reply-To: <20171218181344.2000185-1-yhs@fb.com>

On 12/18/2017 07:13 PM, Yonghong Song wrote:
> The tools/testing/selftests/bpf test program
> test_dev_cgroup fails with the following error
> when compiled with llvm 6.0. (I did not try
> with earlier versions.)
> 
>   libbpf: load bpf program failed: Permission denied
>   libbpf: -- BEGIN DUMP LOG ---
>   libbpf:
>   0: (61) r2 = *(u32 *)(r1 +4)
>   1: (b7) r0 = 0
>   2: (55) if r2 != 0x1 goto pc+8
>    R0=inv0 R1=ctx(id=0,off=0,imm=0) R2=inv1 R10=fp0
>   3: (69) r2 = *(u16 *)(r1 +0)
>   invalid bpf_context access off=0 size=2
>   ...
> 
> The culprit is the following statement in dev_cgroup.c:
>   short type = ctx->access_type & 0xFFFF;
> This code is typical as the ctx->access_type is assigned
> as below in kernel/bpf/cgroup.c:
>   struct bpf_cgroup_dev_ctx ctx = {
>         .access_type = (access << 16) | dev_type,
>         .major = major,
>         .minor = minor,
>   };
> 
> The compiler converts it to u16 access while
> the verifier cgroup_dev_is_valid_access rejects
> any non u32 access.
> 
> This patch permits the field access_type to be accessible
> with type u16 and u8 as well.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> Tested-by: Roman Gushchin <guro@fb.com>

Looks good, applied to bpf-next, thanks Yonghong!

^ permalink raw reply
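
The packing of `access_type` described in the quoted commit message can be sketched in standalone C. The helper names are hypothetical; the layout (access in the upper 16 bits, device type in the lower 16) is taken from the `kernel/bpf/cgroup.c` snippet quoted above.

```c
#include <stdint.h>

/* Pack the access mask into the upper 16 bits and the device type
 * into the lower 16 bits, as in the quoted kernel snippet. */
static uint32_t pack_access_type(uint16_t access, uint16_t dev_type)
{
	return ((uint32_t)access << 16) | dev_type;
}

/* Lower half: masking with 0xFFFF is the pattern a compiler may turn
 * into a narrower 16-bit load when the field is read from memory. */
static uint16_t unpack_dev_type(uint32_t access_type)
{
	return (uint16_t)(access_type & 0xFFFF);
}

/* Upper half: the access mask. */
static uint16_t unpack_access(uint32_t access_type)
{
	return (uint16_t)(access_type >> 16);
}
```

Because only the low half is needed, a compiler is free to read just those bytes, which is why the verifier has to accept narrower loads of this field.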

* [RFC PATCH] virtio_net: Extend virtio to use VF datapath when available
From: Sridhar Samudrala @ 2017-12-19  0:40 UTC (permalink / raw)
  To: mst, stephen, netdev, virtualization, alexander.duyck,
	sridhar.samudrala

This patch enables virtio to switch over to a VF datapath when a VF netdev
is present with the same MAC address.  It allows live migration of a VM
with a directly attached VF without the need to set up a bond/team between a
VF and virtio net device in the guest.

The hypervisor needs to unplug the VF device from the guest on the source
host and reset the MAC filter of the VF to initiate failover of datapath to
virtio before starting the migration. After the migration is completed, the
destination hypervisor sets the MAC filter on the VF and plugs it back to
the guest to switch over to VF datapath.

It is entirely based on netvsc implementation and it should be possible to
make this code generic and move it to a common location that can be shared
by netvsc and virtio.

Also, I think we should make this a negotiated feature that is off by
default via a new feature bit.

This patch is based on the discussion initiated by Jesse on this thread.
https://marc.info/?l=linux-virtualization&m=151189725224231&w=2

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
 drivers/net/virtio_net.c | 341 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 339 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 559b215c0169..a34c717bb15b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -31,6 +31,8 @@
 #include <linux/average.h>
 #include <linux/filter.h>
 #include <net/route.h>
+#include <linux/netdevice.h>
+#include <linux/netpoll.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -56,6 +58,8 @@ module_param(napi_tx, bool, 0644);
  */
 DECLARE_EWMA(pkt_len, 0, 64)
 
+#define VF_TAKEOVER_INT	(HZ / 10)
+
 #define VIRTNET_DRIVER_VERSION "1.0.0"
 
 static const unsigned long guest_offloads[] = {
@@ -117,6 +121,15 @@ struct receive_queue {
 	char name[40];
 };
 
+struct virtnet_vf_pcpu_stats {
+	u64	rx_packets;
+	u64	rx_bytes;
+	u64	tx_packets;
+	u64	tx_bytes;
+	struct u64_stats_sync   syncp;
+	u32	tx_dropped;
+};
+
 struct virtnet_info {
 	struct virtio_device *vdev;
 	struct virtqueue *cvq;
@@ -179,6 +192,11 @@ struct virtnet_info {
 	u32 speed;
 
 	unsigned long guest_offloads;
+
+	/* State to manage the associated VF interface. */
+	struct net_device __rcu *vf_netdev;
+	struct virtnet_vf_pcpu_stats __percpu *vf_stats;
+	struct delayed_work vf_takeover;
 };
 
 struct padded_vnet_hdr {
@@ -1300,16 +1318,51 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
 }
 
+/* Send skb on the slave VF device. */
+static int virtnet_vf_xmit(struct net_device *dev, struct net_device *vf_netdev,
+			   struct sk_buff *skb)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	unsigned int len = skb->len;
+	int rc;
+
+	skb->dev = vf_netdev;
+	skb->queue_mapping = qdisc_skb_cb(skb)->slave_dev_queue_mapping;
+
+	rc = dev_queue_xmit(skb);
+	if (likely(rc == NET_XMIT_SUCCESS || rc == NET_XMIT_CN)) {
+		struct virtnet_vf_pcpu_stats *pcpu_stats
+			= this_cpu_ptr(vi->vf_stats);
+
+		u64_stats_update_begin(&pcpu_stats->syncp);
+		pcpu_stats->tx_packets++;
+		pcpu_stats->tx_bytes += len;
+		u64_stats_update_end(&pcpu_stats->syncp);
+	} else {
+		this_cpu_inc(vi->vf_stats->tx_dropped);
+	}
+
+	return rc;
+}
+
 static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
 	int qnum = skb_get_queue_mapping(skb);
 	struct send_queue *sq = &vi->sq[qnum];
+	struct net_device *vf_netdev;
 	int err;
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
 	bool kick = !skb->xmit_more;
 	bool use_napi = sq->napi.weight;
 
+	/* if VF is present and up then redirect packets
+	 * called with rcu_read_lock_bh
+	 */
+	vf_netdev = rcu_dereference_bh(vi->vf_netdev);
+	if (vf_netdev && netif_running(vf_netdev) && !netpoll_tx_running(dev))
+		return virtnet_vf_xmit(dev, vf_netdev, skb);
+
 	/* Free up any pending old buffers before queueing new ones. */
 	free_old_xmit_skbs(sq);
 
@@ -1456,10 +1509,41 @@ static int virtnet_set_mac_address(struct net_device *dev, void *p)
 	return ret;
 }
 
+static void virtnet_get_vf_stats(struct net_device *dev,
+				 struct virtnet_vf_pcpu_stats *tot)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int i;
+
+	memset(tot, 0, sizeof(*tot));
+
+	for_each_possible_cpu(i) {
+		const struct virtnet_vf_pcpu_stats *stats
+				= per_cpu_ptr(vi->vf_stats, i);
+		u64 rx_packets, rx_bytes, tx_packets, tx_bytes;
+		unsigned int start;
+
+		do {
+			start = u64_stats_fetch_begin_irq(&stats->syncp);
+			rx_packets = stats->rx_packets;
+			tx_packets = stats->tx_packets;
+			rx_bytes = stats->rx_bytes;
+			tx_bytes = stats->tx_bytes;
+		} while (u64_stats_fetch_retry_irq(&stats->syncp, start));
+
+		tot->rx_packets += rx_packets;
+		tot->tx_packets += tx_packets;
+		tot->rx_bytes   += rx_bytes;
+		tot->tx_bytes   += tx_bytes;
+		tot->tx_dropped += stats->tx_dropped;
+	}
+}
+
 static void virtnet_stats(struct net_device *dev,
 			  struct rtnl_link_stats64 *tot)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
+	struct virtnet_vf_pcpu_stats vf_stats;
 	int cpu;
 	unsigned int start;
 
@@ -1490,6 +1574,13 @@ static void virtnet_stats(struct net_device *dev,
 	tot->rx_dropped = dev->stats.rx_dropped;
 	tot->rx_length_errors = dev->stats.rx_length_errors;
 	tot->rx_frame_errors = dev->stats.rx_frame_errors;
+
+	virtnet_get_vf_stats(dev, &vf_stats);
+	tot->rx_packets += vf_stats.rx_packets;
+	tot->tx_packets += vf_stats.tx_packets;
+	tot->rx_bytes += vf_stats.rx_bytes;
+	tot->tx_bytes += vf_stats.tx_bytes;
+	tot->tx_dropped += vf_stats.tx_dropped;
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
@@ -2508,6 +2599,47 @@ static int virtnet_validate(struct virtio_device *vdev)
 	return 0;
 }
 
+static void __virtnet_vf_setup(struct net_device *ndev,
+			       struct net_device *vf_netdev)
+{
+	int ret;
+
+	/* Align MTU of VF with master */
+	ret = dev_set_mtu(vf_netdev, ndev->mtu);
+	if (ret)
+		netdev_warn(vf_netdev,
+			    "unable to change mtu to %u\n", ndev->mtu);
+
+	if (netif_running(ndev)) {
+		ret = dev_open(vf_netdev);
+		if (ret)
+			netdev_warn(vf_netdev,
+				    "unable to open: %d\n", ret);
+	}
+}
+
+/* Setup VF as slave of the virtio device.
+ * Runs in workqueue to avoid recursion in netlink callbacks.
+ */
+static void virtnet_vf_setup(struct work_struct *w)
+{
+	struct virtnet_info *vi
+		= container_of(w, struct virtnet_info, vf_takeover.work);
+	struct net_device *ndev = vi->dev;
+	struct net_device *vf_netdev;
+
+	if (!rtnl_trylock()) {
+		schedule_delayed_work(&vi->vf_takeover, 0);
+		return;
+	}
+
+	vf_netdev = rtnl_dereference(vi->vf_netdev);
+	if (vf_netdev)
+		__virtnet_vf_setup(ndev, vf_netdev);
+
+	rtnl_unlock();
+}
+
 static int virtnet_probe(struct virtio_device *vdev)
 {
 	int i, err;
@@ -2600,6 +2732,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 	}
 
 	INIT_WORK(&vi->config_work, virtnet_config_changed_work);
+	INIT_DELAYED_WORK(&vi->vf_takeover, virtnet_vf_setup);
+
+	vi->vf_stats = netdev_alloc_pcpu_stats(struct virtnet_vf_pcpu_stats);
+	if (!vi->vf_stats)
+		goto free_stats;
 
 	/* If we can receive ANY GSO packets, we must allocate large ones. */
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
@@ -2634,7 +2771,7 @@ static int virtnet_probe(struct virtio_device *vdev)
 			 */
 			dev_err(&vdev->dev, "device MTU appears to have changed "
 				"it is now %d < %d", mtu, dev->min_mtu);
-			goto free_stats;
+			goto free_vf_stats;
 		}
 
 		dev->mtu = mtu;
@@ -2658,7 +2795,7 @@ static int virtnet_probe(struct virtio_device *vdev)
 	/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
 	err = init_vqs(vi);
 	if (err)
-		goto free_stats;
+		goto free_vf_stats;
 
 #ifdef CONFIG_SYSFS
 	if (vi->mergeable_rx_bufs)
@@ -2712,6 +2849,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	cancel_delayed_work_sync(&vi->refill);
 	free_receive_page_frags(vi);
 	virtnet_del_vqs(vi);
+free_vf_stats:
+	free_percpu(vi->vf_stats);
 free_stats:
 	free_percpu(vi->stats);
 free:
@@ -2733,19 +2872,178 @@ static void remove_vq_common(struct virtnet_info *vi)
 	virtnet_del_vqs(vi);
 }
 
+static struct net_device *get_virtio_bymac(const u8 *mac)
+{
+	struct net_device *dev;
+
+	ASSERT_RTNL();
+
+	for_each_netdev(&init_net, dev) {
+		if (dev->netdev_ops != &virtnet_netdev)
+			continue;       /* not a virtio_net device */
+
+		if (ether_addr_equal(mac, dev->perm_addr))
+			return dev;
+	}
+
+	return NULL;
+}
+
+static struct net_device *get_virtio_byref(struct net_device *vf_netdev)
+{
+	struct net_device *dev;
+
+	ASSERT_RTNL();
+
+	for_each_netdev(&init_net, dev) {
+		struct virtnet_info *vi;
+
+		if (dev->netdev_ops != &virtnet_netdev)
+			continue;	/* not a virtio_net device */
+
+		vi = netdev_priv(dev);
+		if (rtnl_dereference(vi->vf_netdev) == vf_netdev)
+			return dev;	/* a match */
+	}
+
+	return NULL;
+}
+
+/* Called when VF is injecting data into network stack.
+ * Change the associated network device from VF to virtio.
+ * note: already called with rcu_read_lock
+ */
+static rx_handler_result_t virtnet_vf_handle_frame(struct sk_buff **pskb)
+{
+	struct sk_buff *skb = *pskb;
+	struct net_device *ndev = rcu_dereference(skb->dev->rx_handler_data);
+	struct virtnet_info *vi = netdev_priv(ndev);
+	struct virtnet_vf_pcpu_stats *pcpu_stats =
+				this_cpu_ptr(vi->vf_stats);
+
+	skb->dev = ndev;
+
+	u64_stats_update_begin(&pcpu_stats->syncp);
+	pcpu_stats->rx_packets++;
+	pcpu_stats->rx_bytes += skb->len;
+	u64_stats_update_end(&pcpu_stats->syncp);
+
+	return RX_HANDLER_ANOTHER;
+}
+
+static int virtnet_vf_join(struct net_device *vf_netdev,
+			   struct net_device *ndev)
+{
+	struct virtnet_info *vi = netdev_priv(ndev);
+	int ret;
+
+	ret = netdev_rx_handler_register(vf_netdev,
+					 virtnet_vf_handle_frame, ndev);
+	if (ret != 0) {
+		netdev_err(vf_netdev,
+			   "can not register virtio VF receive handler (err = %d)\n",
+			   ret);
+		goto rx_handler_failed;
+	}
+
+	ret = netdev_upper_dev_link(vf_netdev, ndev, NULL);
+	if (ret != 0) {
+		netdev_err(vf_netdev,
+			   "can not set master device %s (err = %d)\n",
+			   ndev->name, ret);
+		goto upper_link_failed;
+	}
+
+	/* set slave flag before open to prevent IPv6 addrconf */
+	vf_netdev->flags |= IFF_SLAVE;
+
+	schedule_delayed_work(&vi->vf_takeover, VF_TAKEOVER_INT);
+
+	call_netdevice_notifiers(NETDEV_JOIN, vf_netdev);
+
+	netdev_info(vf_netdev, "joined to %s\n", ndev->name);
+	return 0;
+
+upper_link_failed:
+	netdev_rx_handler_unregister(vf_netdev);
+rx_handler_failed:
+	return ret;
+}
+
+static int virtnet_register_vf(struct net_device *vf_netdev)
+{
+	struct net_device *ndev;
+	struct virtnet_info *vi;
+
+	if (vf_netdev->addr_len != ETH_ALEN)
+		return NOTIFY_DONE;
+
+	/* We will use the MAC address to locate the virtio_net interface to
+	 * associate with the VF interface. If we don't find a matching
+	 * virtio interface, move on.
+	 */
+	ndev = get_virtio_bymac(vf_netdev->perm_addr);
+	if (!ndev)
+		return NOTIFY_DONE;
+
+	vi = netdev_priv(ndev);
+	if (rtnl_dereference(vi->vf_netdev))
+		return NOTIFY_DONE;
+
+	if (virtnet_vf_join(vf_netdev, ndev) != 0)
+		return NOTIFY_DONE;
+
+	netdev_info(ndev, "VF registering %s\n", vf_netdev->name);
+
+	dev_hold(vf_netdev);
+	rcu_assign_pointer(vi->vf_netdev, vf_netdev);
+
+	return NOTIFY_OK;
+}
+
+static int virtnet_unregister_vf(struct net_device *vf_netdev)
+{
+	struct net_device *ndev;
+	struct virtnet_info *vi;
+
+	ndev = get_virtio_byref(vf_netdev);
+	if (!ndev)
+		return NOTIFY_DONE;
+
+	vi = netdev_priv(ndev);
+	cancel_delayed_work_sync(&vi->vf_takeover);
+
+	netdev_info(ndev, "VF unregistering %s\n", vf_netdev->name);
+
+	netdev_rx_handler_unregister(vf_netdev);
+	netdev_upper_dev_unlink(vf_netdev, ndev);
+	RCU_INIT_POINTER(vi->vf_netdev, NULL);
+	dev_put(vf_netdev);
+
+	return NOTIFY_OK;
+}
+
 static void virtnet_remove(struct virtio_device *vdev)
 {
 	struct virtnet_info *vi = vdev->priv;
+	struct net_device *vf_netdev;
 
 	virtnet_cpu_notif_remove(vi);
 
 	/* Make sure no work handler is accessing the device. */
 	flush_work(&vi->config_work);
 
+	rtnl_lock();
+	vf_netdev = rtnl_dereference(vi->vf_netdev);
+	if (vf_netdev)
+		virtnet_unregister_vf(vf_netdev);
+	rtnl_unlock();
+
 	unregister_netdev(vi->dev);
 
 	remove_vq_common(vi);
 
+	free_percpu(vi->vf_stats);
 	free_percpu(vi->stats);
 	free_netdev(vi->dev);
 }
@@ -2823,6 +3121,42 @@ static struct virtio_driver virtio_net_driver = {
 #endif
 };
 
+static int virtio_netdev_event(struct notifier_block *this,
+			       unsigned long event, void *ptr)
+{
+	struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
+
+	/* Skip our own events */
+	if (event_dev->netdev_ops == &virtnet_netdev)
+		return NOTIFY_DONE;
+
+	/* Avoid non-Ethernet type devices */
+	if (event_dev->type != ARPHRD_ETHER)
+		return NOTIFY_DONE;
+
+	/* Avoid Vlan dev with same MAC registering as VF */
+	if (is_vlan_dev(event_dev))
+		return NOTIFY_DONE;
+
+	/* Avoid Bonding master dev with same MAC registering as VF */
+	if ((event_dev->priv_flags & IFF_BONDING) &&
+	    (event_dev->flags & IFF_MASTER))
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_REGISTER:
+		return virtnet_register_vf(event_dev);
+	case NETDEV_UNREGISTER:
+		return virtnet_unregister_vf(event_dev);
+	default:
+		return NOTIFY_DONE;
+	}
+}
+
+static struct notifier_block virtio_netdev_notifier = {
+	.notifier_call = virtio_netdev_event,
+};
+
 static __init int virtio_net_driver_init(void)
 {
 	int ret;
@@ -2841,6 +3175,8 @@ static __init int virtio_net_driver_init(void)
         ret = register_virtio_driver(&virtio_net_driver);
 	if (ret)
 		goto err_virtio;
+
+	register_netdevice_notifier(&virtio_netdev_notifier);
 	return 0;
 err_virtio:
 	cpuhp_remove_multi_state(CPUHP_VIRT_NET_DEAD);
@@ -2853,6 +3189,7 @@ module_init(virtio_net_driver_init);
 
 static __exit void virtio_net_driver_exit(void)
 {
+	unregister_netdevice_notifier(&virtio_netdev_notifier);
 	unregister_virtio_driver(&virtio_net_driver);
 	cpuhp_remove_multi_state(CPUHP_VIRT_NET_DEAD);
 	cpuhp_remove_multi_state(virtionet_online);
-- 
2.14.3

^ permalink raw reply related

* Re: [PATCH] bpf: make function xdp_do_generic_redirect_map() static
From: Daniel Borkmann @ 2017-12-19  0:38 UTC (permalink / raw)
  To: Xiongwei Song, ast, davem; +Cc: netdev, linux-kernel
In-Reply-To: <20171218231715.3227-1-sxwjean@gmail.com>

On 12/19/2017 12:17 AM, Xiongwei Song wrote:
> The function xdp_do_generic_redirect_map() is only used in this file, so
> make it static.
> 
> Clean up sparse warning:
> net/core/filter.c:2687:5: warning: no previous prototype
> for 'xdp_do_generic_redirect_map' [-Wmissing-prototypes]
> 
> Signed-off-by: Xiongwei Song <sxwjean@gmail.com>

Applied to bpf-next, thanks Xiongwei!
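
For readers unfamiliar with the warning quoted above, here is a minimal
sketch of the pattern, not the actual net/core/filter.c code. The names
redirect_example() and redirect_map_example() are hypothetical. A non-static
function defined without a prior prototype triggers -Wmissing-prototypes;
marking a file-local helper static gives it internal linkage, so the
warning no longer applies.

```c
/* Public entry point: its prototype would normally live in a header. */
int redirect_example(int ifindex);

/*
 * File-local helper (hypothetical name). 'static' gives it internal
 * linkage, so the compiler does not expect a previous prototype.
 */
static int redirect_map_example(int ifindex)
{
	return ifindex > 0 ? 0 : -1;
}

/* The only caller lives in the same file, so the helper can stay static. */
int redirect_example(int ifindex)
{
	return redirect_map_example(ifindex);
}
```

Dropping the static keyword from the helper and building with
-Wmissing-prototypes should reproduce a warning of the same kind.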

^ permalink raw reply

* Re: [PATCH bpf-next] selftests/bpf: add netdevsim to config
From: Daniel Borkmann @ 2017-12-19  0:36 UTC (permalink / raw)
  To: Jakub Kicinski, alexei.starovoitov; +Cc: netdev, oss-drivers
In-Reply-To: <20171218231130.24619-1-jakub.kicinski@netronome.com>

On 12/19/2017 12:11 AM, Jakub Kicinski wrote:
> BPF offload tests (test_offload.py) will require netdevsim
> to be built, add it to config.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>

Applied to bpf-next, thanks Jakub!

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: arm64: fix uninitialized variable
From: Daniel Borkmann @ 2017-12-19  0:34 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexei Starovoitov, David S . Miller
  Cc: Arnd Bergmann, netdev, kernel-team
In-Reply-To: <68b024dd-4113-8c8a-a606-7b4b0206973d@fb.com>

On 12/18/2017 07:36 PM, Alexei Starovoitov wrote:
> On 12/18/17 10:19 AM, Daniel Borkmann wrote:
>> On 12/18/2017 07:09 PM, Alexei Starovoitov wrote:
>>> From: Alexei Starovoitov <ast@fb.com>
>>>
>>> fix the following issue:
>>> arch/arm64/net/bpf_jit_comp.c: In function 'bpf_int_jit_compile':
>>> arch/arm64/net/bpf_jit_comp.c:982:18: error: 'image_size' may be used
>>> uninitialized in this function [-Werror=maybe-uninitialized]
>>>
>>> Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs")
>>> Reported-by: Arnd Bergmann <arnd@arndb.de>
>>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>>> ---
>>>  arch/arm64/net/bpf_jit_comp.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>>> index 396490cf7316..acaa935ed977 100644
>>> --- a/arch/arm64/net/bpf_jit_comp.c
>>> +++ b/arch/arm64/net/bpf_jit_comp.c
>>> @@ -897,6 +897,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>>>          image_ptr = jit_data->image;
>>>          header = jit_data->header;
>>>          extra_pass = true;
>>> +        image_size = sizeof(u32) * ctx.idx;
>>>          goto skip_init_ctx;
>>>      }
>>>      memset(&ctx, 0, sizeof(ctx));
>>
>> I don't really mind, but it feels more complex than it needs to be
>> imho, since in the initial pass you fetch 'image_size' in fake pass
>> from ctx.idx, then we set ctx.idx to 0 again, do another pass and
>> use the cached ctx.idx from that second pass instead of the first
>> one where we set 'image_size' originally, so we definitely need to
>> take that into consideration in future reviews at least.
> 
> not sure what you mean.
> This check: ctx.idx != jit_data->ctx.idx matters the most.
> After the first alloc, the 'image_size' variable is used for dumping only.
> That's why the JITing itself worked fine. We could have removed it
> since it's computable from idx, but imo it's fine this way.

Fair enough. Given that the final ctx.idx value must be guaranteed to never
change in future between pass #1 and pass #2 of the first
bpf_int_jit_compile() run, let's go with this smaller version; applied to
bpf-next, thanks Alexei!
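
The pass structure discussed in this thread can be sketched in plain C as
follows. This is a simplified userspace model, not the arm64 JIT itself:
struct jit_ctx, struct jit_data, emit_all() and jit_compile() are
hypothetical stand-ins, and real instruction emission is reduced to
counting. It only illustrates why image_size has to be set on the
extra-pass path that jumps to skip_init_ctx, and why the
ctx.idx != jit_data->ctx.idx check is the part that matters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the JIT state; not the kernel's real types. */
struct jit_ctx {
	int idx;		/* number of 32-bit instructions emitted */
};

struct jit_data {
	struct jit_ctx ctx;	/* context cached by the first compile run */
	bool valid;		/* true once a first run has completed */
};

/* Pretend to emit n_insns instructions; only the count matters here. */
static void emit_all(struct jit_ctx *ctx, int n_insns)
{
	for (int i = 0; i < n_insns; i++)
		ctx->idx++;
}

/* Returns the image size in bytes, or 0 if the extra pass changed length. */
static size_t jit_compile(struct jit_data *data, int n_insns)
{
	struct jit_ctx ctx;
	size_t image_size;

	if (data->valid) {
		/* Extra pass: restore the cached context and derive the
		 * size from the cached idx before skipping the sizing pass.
		 */
		ctx = data->ctx;
		image_size = sizeof(uint32_t) * ctx.idx;
		goto skip_init_ctx;
	}

	/* Pass 1: sizing pass, nothing is written, only idx advances. */
	ctx.idx = 0;
	emit_all(&ctx, n_insns);
	image_size = sizeof(uint32_t) * ctx.idx;

skip_init_ctx:
	/* Pass 2: reset idx and emit for real. */
	ctx.idx = 0;
	emit_all(&ctx, n_insns);

	/* On an extra pass the instruction count must not have changed. */
	if (data->valid && ctx.idx != data->ctx.idx)
		return 0;

	data->ctx = ctx;
	data->valid = true;
	return image_size;
}
```

In this model, without the assignment in the extra-pass branch, image_size
would be read uninitialized at the return, which mirrors the
-Werror=maybe-uninitialized report quoted above.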

^ permalink raw reply
