Netdev List
* Re: [PATCH] net:ethernet:samsung:initialize cur_rx_qnum
From: Francois Romieu @ 2016-12-09 22:42 UTC (permalink / raw)
  To: Rayagond Kokatanur; +Cc: siva.kallam, bh74.an, ks.giri, vipul.pandya, netdev
In-Reply-To: <1481285645-6028-1-git-send-email-rayagond.kokatanur@gmail.com>

Rayagond Kokatanur <rayagond.kokatanur@gmail.com> :
> This patch initializes cur_rx_qnum upon occurrence of an rx interrupt;
> without this initialization the driver will not work with multiple rx queue
> configurations.
> 
> NOTE: This patch is not tested on actual hw.

(your patch should include a Signed-off-by)

Imho the driver needs more changes to support multiple rx queues.

- rx interrupt for queue A -> priv->cur_rx_qnum = A
- rx interrupt for queue B -> priv->cur_rx_qnum = B
- rx napi processing       -> Err...

Please start turning priv->cur_rx_qnum into a SXGBE_RX_QUEUES sized bitmap.
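
The bitmap idea can be sketched in plain userspace C as below. This is only an
illustration under assumptions, not driver code: NUM_RX_QUEUES, pending_rx,
rx_irq() and rx_poll_next() are hypothetical names, and a real driver would
more likely use the kernel's atomic bitops on a field in priv.

```c
#include <stdint.h>

#define NUM_RX_QUEUES 8	/* hypothetical stand-in for SXGBE_RX_QUEUES */

/* One bit per rx queue; a single int would only remember the last irq. */
static uint32_t pending_rx;

/* Called from the (simulated) rx interrupt of queue qnum. */
static void rx_irq(unsigned int qnum)
{
	if (qnum < NUM_RX_QUEUES)
		pending_rx |= UINT32_C(1) << qnum;
}

/*
 * Called from the (simulated) napi poll: returns the lowest pending queue
 * and clears its bit, or -1 when nothing is pending.
 */
static int rx_poll_next(void)
{
	unsigned int q;

	for (q = 0; q < NUM_RX_QUEUES; q++) {
		if (pending_rx & (UINT32_C(1) << q)) {
			pending_rx &= ~(UINT32_C(1) << q);
			return (int)q;
		}
	}
	return -1;
}
```

With this shape, interrupts for queues A and B both leave a bit set, so the
poll side can service each of them instead of only the last one recorded.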

-- 
Ueimor


* Re: Synopsys Ethernet QoS
From: Florian Fainelli @ 2016-12-09 22:52 UTC (permalink / raw)
  To: Andy Shevchenko, David Miller
  Cc: Joao Pinto, Giuseppe CAVALLARO, lars.persson, rabin.vincent,
	netdev, CARLOS.PALMINHA
In-Reply-To: <CAHp75VcKURaTQX9=SY8+46GGATuwO1oXAi8eMS+uwk58sjKx5Q@mail.gmail.com>

On 12/09/2016 02:25 PM, Andy Shevchenko wrote:
> On Fri, Dec 9, 2016 at 5:41 PM, David Miller <davem@davemloft.net> wrote:
> 
>> But one thing I am against is changing the driver name for existing
>> users.  If an existing chip is supported by the stmmac driver for
>> existing users, they should still continue to use the "stmmac" driver.
>>
>> Therefore, if consolidation changes the driver module name for
>> existing users, then that is not a good plan at all.
> 
> You have at least one supporter here. Though I jumped into the
> discussion very late, not sure if everyone has time to answer
> that.

I don't have much of a stake in the stmmac driver (or other Synopsys drivers
for that matter), but renaming seems like a terrible idea that is going
to make backporting of fixes difficult for distributions.

While moving the driver into a separate directory could be done, and git
knows how to track files, renaming the driver entirely would break many
platforms (including but not limited to, Device Tree) that you may not
have visibility over (compatible strings, properties, and platform
device driver name for instance).

It's kind of sad that customers of that IP (stmmac, amd-xgbe, sxgbe) did
actually pioneer the upstreaming effort, but it is good to see people
from Synopsys willing to fix that in the future.
-- 
Florian


* Re: [PATCH 2/6] net: ethernet: ti: cpts: add support for ext rftclk selection
From: Grygorii Strashko @ 2016-12-09 23:29 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Richard Cochran, Murali Karicheri, David S. Miller,
	netdev-u79uwXL29TY76Z2rM5mHXA, Mugunthan V N, Sekhar Nori,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Rob Herring,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Wingman Kwok,
	linux-clk-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161209004745.GJ5423-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>



On 12/08/2016 06:47 PM, Stephen Boyd wrote:
> On 12/06, Grygorii Strashko wrote:
>> Subject: [PATCH] cpts refclk sel
>>
>> Signed-off-by: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
>> ---
>>  arch/arm/boot/dts/keystone-k2e-netcp.dtsi | 10 +++++-
>>  drivers/net/ethernet/ti/cpts.c            | 52 ++++++++++++++++++++++++++++++-
>>  2 files changed, 60 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/boot/dts/keystone-k2e-netcp.dtsi b/arch/arm/boot/dts/keystone-k2e-netcp.dtsi
>> index 919e655..b27aa22 100644
>> --- a/arch/arm/boot/dts/keystone-k2e-netcp.dtsi
>> +++ b/arch/arm/boot/dts/keystone-k2e-netcp.dtsi
>> @@ -138,7 +138,7 @@ netcp: netcp@24000000 {
>>  	/* NetCP address range */
>>  	ranges = <0 0x24000000 0x1000000>;
>>
>> -	clocks = <&clkpa>, <&clkcpgmac>, <&chipclk12>;
>> +	clocks = <&clkpa>, <&clkcpgmac>, <&cpts_mux>;

					^^ mux clock used here

>>  	clock-names = "pa_clk", "ethss_clk", "cpts";
>>  	dma-coherent;
>>
>> @@ -162,6 +162,14 @@ netcp: netcp@24000000 {
>>  			cpts-ext-ts-inputs = <6>;
>>  			cpts-ts-comp-length;
>>
>> +			cpts_mux: cpts_refclk_mux {
>> +				#clock-cells = <0>;
>> +				clocks = <&chipclk12>, <&chipclk13>;
>> +				cpts-mux-tbl = <0>, <1>;
>> +				assigned-clocks = <&cpts_mux>;
>> +				assigned-clock-parents = <&chipclk12>;
> 
> Is there a binding update?
 
this was pure RFC-DEV patch just to check the possibility of modeling 
CPTS_RFTCLK_SEL register as mux clock. 
Original patch:
https://lkml.org/lkml/2016/11/28/780

I plan to resend it using the clk framework.

> Why the subnode?

Sorry, I did not get this question. Is there another way to pass a phandle to
a clock in the clocks list property? Am I missing something?

Sorry, this is my first clock :)

> Why not have it as part of the netcp node?

cpts is part of gbe ethss, which is part of netcp.

Only netcp is modeled as DD; cpts and gbe ethss are implemented without using the DD model,
so generic resources are acquired by netcp and then passed to cpts and gbe ethss.

CPTS has a register to control an external multiplexer that selects
one of up to 32 clocks for the time sync reference (RFTCLK).

> Does the cpts-mux-tbl property change?

On Keystone 2 66AK2e (as an example) the following clocks can be selected
as ref clocks (the list is different for other SoCs):
0000 = SYSCLK2
0001 = SYSCLK3
0010 = TIMI0
0011 = TIMI1
0100 = TSIPCLKA
1000 = TSREFCLK
1100 = TSIPCLKB
Others = Reserved

Only 0 and 1 are internal; the others are external and board specific
(their parameters are unknown and the corresponding inputs can be used for
other purposes), so I can't define all parent clocks, only the internal ones:

clocks = <&chipclk12>, <&chipclk13>;
cpts-mux-tbl = <0>, <1>;

To use another, external clock, it should be explicitly defined in the board file:

timi1clk: timi1clk {
	#clock-cells = <0>;
	compatible = "fixed-clock";
...

&cpts_mux {
	clocks = <&chipclk12>, <&chipclk13>, <&timi1clk>;
						^^^ i can't predict value here
	cpts-mux-tbl = <0>, <1>, <3>;
				^^i can't predict value here
	assigned-clocks = <&cpts_mux>;
	assigned-clock-parents = <&timi1clk>;
};

Or did I misunderstand your question?
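
To make the role of cpts-mux-tbl concrete, here is a small userspace sketch of
a table-based mux: parent index i is selected by writing tbl[i] into the
5-bit select field. It is only a model under assumptions; mux_tbl,
mux_set_parent() and mux_get_parent() are hypothetical names, not the clk
framework API, and the table mirrors the board example above (0, 1, 3).

```c
#include <stddef.h>
#include <stdint.h>

#define RFTCLK_SEL_MASK 0x1Fu	/* 5-bit select field, as in the patch */

/* Hypothetical mirror of the DT example: parent i is selected by tbl[i]. */
static const uint32_t mux_tbl[] = { 0, 1, 3 };	/* SYSCLK2, SYSCLK3, TIMI1 */
#define NUM_PARENTS (sizeof(mux_tbl) / sizeof(mux_tbl[0]))

/* Write the register value for parent 'index'; returns -1 if out of range. */
static int mux_set_parent(uint32_t *reg, size_t index)
{
	if (index >= NUM_PARENTS)
		return -1;
	*reg = (*reg & ~RFTCLK_SEL_MASK) | (mux_tbl[index] & RFTCLK_SEL_MASK);
	return 0;
}

/* Map the current register value back to a parent index, or -1 if unknown. */
static int mux_get_parent(uint32_t reg)
{
	size_t i;

	for (i = 0; i < NUM_PARENTS; i++)
		if (mux_tbl[i] == (reg & RFTCLK_SEL_MASK))
			return (int)i;
	return -1;
}
```

The table is what lets a board list only the parents it actually wires up:
the position in "clocks" stays dense while the register values may be sparse.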

> 
>> +			};
>> +
>>  			interfaces {
>>  				gbe0: interface-0 {
>>  					slave-port = <0>;
>> diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
>> index 938de22..ef94316 100644
>> --- a/drivers/net/ethernet/ti/cpts.c
>> +++ b/drivers/net/ethernet/ti/cpts.c
>> @@ -17,6 +17,7 @@
>>   * along with this program; if not, write to the Free Software
>>   * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA
>>   */
>> +#include <linux/clk-provider.h>
>>  #include <linux/err.h>
>>  #include <linux/if.h>
>>  #include <linux/hrtimer.h>
>> @@ -672,6 +673,7 @@ int cpts_register(struct cpts *cpts)
>>  	cpts->phc_index = ptp_clock_index(cpts->clock);
>>
>>  	schedule_delayed_work(&cpts->overflow_work, cpts->ov_check_period);
>> +
> 
> Maybe in another patch.
> 

sure

>>  	return 0;
>>
>>  err_ptp:
>> @@ -741,6 +743,54 @@ static void cpts_calc_mult_shift(struct cpts *cpts)
>>  		 freq, cpts->cc_mult, cpts->cc.shift, (ns - NSEC_PER_SEC));
>>  }
>>

...

>> +
>> +	reg = &cpts->reg->rftclk_sel;
>> +
>> +	clk = clk_register_mux_table(cpts->dev, refclk_np->name,
>> +				     parent_names, num_parents,
>> +				     0, reg, 0, 0x1F, 0, mux_table, NULL);
>> +	if (IS_ERR(clk))
>> +		return PTR_ERR(clk);
>> +
>> +	return of_clk_add_provider(refclk_np, of_clk_src_simple_get, clk);
> 
> Can you please use the clk_hw APIs instead?
> 

ok

-- 
regards,
-grygorii


* fib_frontend: Add network specific broadcasts, when it takes a sense
From: Brandon Philips @ 2016-12-09 23:41 UTC (permalink / raw)
  To: netdev, Tom Denham, Aaron Levy, Brad Ison

Hello-

A number of us are working on an OSS overlay network system called
flannel. It is used in a variety of Linux container systems and one of
the backends is VXLAN.

The issue we have: when creating the VXLAN interface and assigning it
an address we see a broadcast route being added by the Kernel. For
example if we have 10.4.0.0/16 a broadcast route to 10.4.0.0 is
created. This route is unwanted because we assign 10.4.0.0 to one of
our VXLAN interfaces.
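
For reference, the two addresses involved can be computed as in the sketch
below. It assumes (from a reading of the linked code, so treat it as an
assumption) that the kernel adds broadcast routes for both the prefix itself
and the prefix with all host bits set. The helper names are hypothetical and
addresses are in host byte order.

```c
#include <stdint.h>

/* Netmask for a prefix length in host byte order; plen must be 0..32. */
static uint32_t prefix_mask(unsigned int plen)
{
	return plen ? ~UINT32_C(0) << (32 - plen) : 0;
}

/* "Network" broadcast: the prefix itself, e.g. 10.4.0.0 for 10.4.0.0/16. */
static uint32_t net_bcast(uint32_t addr, unsigned int plen)
{
	return addr & prefix_mask(plen);
}

/* Directed broadcast: all host bits set, e.g. 10.4.255.255. */
static uint32_t dir_bcast(uint32_t addr, unsigned int plen)
{
	return addr | ~prefix_mask(plen);
}
```

For 10.4.0.0/16 the first helper yields 10.4.0.0, which is exactly the
address we want to assign to an interface.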

However, the Kernel interface bring-up comment reads: Add network
specific broadcasts, when it takes a sense. The code is here:
https://github.com/torvalds/linux/blob/master/net/ipv4/fib_frontend.c#L859-L872

Can someone explain why creation of the broadcast route is
non-optional? Would a patch to make it optional be acceptable? Is it
safe for us to simply delete the route? We have a patch that simply
deletes the broadcast route after interface creation but don't know
why the Kernel code "makes sense".

You can read more information about the issue here:
https://github.com/coreos/flannel/pull/569

Thank You,

Brandon


* Re: [PATCH V2 03/22] bnxt_re: register with the NIC driver
From: Jonathan Toppins @ 2016-12-10  0:03 UTC (permalink / raw)
  To: Selvin Xavier, dledford, linux-rdma
  Cc: netdev, Eddie Wai, Devesh Sharma, Somnath Kotur,
	Sriharsha Basavapatna
In-Reply-To: <1481266096-23331-4-git-send-email-selvin.xavier@broadcom.com>

On 12/09/2016 01:47 AM, Selvin Xavier wrote:
> This patch handles the registration with bnxt_en driver. The driver registers
> with netdev notifier chain. Upon receiving NETDEV_REGISTER event, the driver
> in turn registers with bnxt_en driver.
> 	1. bnxt_en's ulp_probe function returns a structure that contains information
> 	   about the device and additional entry points.
> 	2. bnxt_en driver returns 'struct bnxt_eth_dev' that contains set of operation
> 	   vectors that RoCE driver invokes later.
> 	3. bnxt_request_msix() allows the RoCE driver to specify the number of MSI-X
> 	   vectors that are needed.
> 	4. bnxt_send_fw_msg () can be used to send messages to the FW
> 	5. bnxt_register_async_events() can be used to register for async event
> 	   callbacks.
> 
> v2: Remove some sparse warning. Also, remove some unused code from unreg path.
> 
> Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> ---
>  drivers/infiniband/hw/bnxtre/bnxt_re.h      |  48 +++
>  drivers/infiniband/hw/bnxtre/bnxt_re_main.c | 436 ++++++++++++++++++++++++++++
>  2 files changed, 484 insertions(+)
> 

[...]

>  #endif
> diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
> index ebe1c69..029824a 100644
> --- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
> +++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
> +
> +static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
> +{
> +	int i, j, rc;
> +
> +	/* Registered a new RoCE device instance to netdev */
> +	rc = bnxt_re_register_netdev(rdev);
> +	if (rc) {
> +		pr_err("Failed to register with netedev: %#x\n", rc);
> +		return -EINVAL;
> +	}
> +	set_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
> +
> +	rc = bnxt_re_request_msix(rdev);
> +	if (rc) {
> +		pr_err("Failed to get MSI-X vectors: %#x\n", rc);
> +		rc = -EINVAL;
> +		goto fail;
> +	}
> +	set_bit(BNXT_RE_FLAG_GOT_MSIX, &rdev->flags);

Though this exit path looks correct (need to verify) once all patches
are applied, this looks incorrect if only considering this specific
patch. I think you need the following:

+ return 0;

> +
> +fail:
> +	bnxt_re_ib_unreg(rdev, true);
> +	return rc;
> +}
> +
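
A reduced userspace model of the exit path shows the problem. The helpers
below are hypothetical stand-ins for the register/unregister steps, not the
bnxt_re functions; the point is only that, without an explicit return before
the label, the success path runs the cleanup too.

```c
/* Hypothetical stand-ins for the register/unregister steps in the patch. */
static int registered;

static int do_register(void) { registered = 1; return 0; }
static void do_unregister(void) { registered = 0; }

/* As posted: the success path falls through into the fail label. */
static int reg_as_posted(void)
{
	int rc;

	rc = do_register();
	if (rc)
		goto fail;
fail:
	do_unregister();
	return rc;
}

/* With the suggested "return 0": cleanup runs only on error. */
static int reg_fixed(void)
{
	int rc;

	rc = do_register();
	if (rc)
		goto fail;
	return 0;
fail:
	do_unregister();
	return rc;
}
```

In the first variant the function reports success while leaving the device
unregistered, which is why the return is needed in this patch on its own.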


* fib_frontend: Add network specific broadcasts, when it takes a sense
From: Brandon Philips @ 2016-12-10  0:07 UTC (permalink / raw)
  To: netdev, Tom Denham, Aaron Levy, Brad Ison

Hello-

A number of us are working on an OSS overlay network system called flannel.
It is used in a variety of Linux container systems and one of the backends
is VXLAN.

The issue we have: when creating the VXLAN interface and assigning it an
address we see a broadcast route being added by the Kernel. For example if
we have 10.4.0.0/16 a broadcast route to 10.4.0.0 is created. This route is
unwanted because we assign 10.4.0.0 to one of our VXLAN interfaces.

However, the Kernel interface bring-up comment reads: Add network specific
broadcasts, when it takes a sense. The code is here:
https://github.com/torvalds/linux/blob/master/net/ipv4/fib_frontend.c#L859-L872

Can someone explain why creation of the broadcast route is non-optional?
Would a patch to make it optional be acceptable? Is it safe for us to
simply delete the route? We have a patch that simply deletes the broadcast
route after interface creation but don't know why the Kernel code "makes
sense".

You can read more information about the issue here:
https://github.com/coreos/flannel/pull/569

Thank You,

Brandon


* Re: Synopsys Ethernet QoS
From: Andy Shevchenko @ 2016-12-10  0:16 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: David Miller, Joao Pinto, Giuseppe CAVALLARO, lars.persson,
	rabin.vincent, netdev, CARLOS.PALMINHA
In-Reply-To: <3aee5a67-5e19-34e6-1719-ff13c7b914ea@gmail.com>

On Sat, Dec 10, 2016 at 12:52 AM, Florian Fainelli <f.fainelli@gmail.com> wrote:

> It's kind of sad that customers of that IP (stmmac, amd-xgbe, sxgbe) did
> actually pioneer the upstreaming effort, but it is good to see people
> from Synopsys willing to fix that in the future.

Wait, are you saying that we have more than 2 drivers for the
same (okay, same vendor) IP?!
It's better to unify them early than to have n+ copies.

P.S. Though, I don't see how sxgbe got on the list. A first glance at
the code doesn't show similarities.

-- 
With Best Regards,
Andy Shevchenko


* [PATCH net v2] ibmveth: set correct gso_size and gso_type
From: Thomas Falcon @ 2016-12-10  1:31 UTC (permalink / raw)
  To: netdev; +Cc: brking, pradeeps, marcelo.leitner, jmaxwell37, zdai, eric.dumazet
In-Reply-To: <1481236803-4807-1-git-send-email-tlfalcon@linux.vnet.ibm.com>

This patch is based on an earlier one submitted
by Jon Maxwell with the following commit message:

"We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because the
one side was advertising a zero window repeatedly.

We narrowed this down to the fact the ibmveth driver did not set gso_size
which is translated by TCP into the MSS later up the stack. The MSS is
used to calculate the TCP window size and as that was abnormally large,
it was calculating a zero window, even though the socket's receive buffer
was completely empty."

We rely on the Virtual I/O Server partition in a pseries
environment to provide the MSS through the TCP header checksum
field. The stipulation is that users should not disable checksum
offloading if rx packet aggregation is enabled through VIOS.

Some firmware offerings provide the MSS in the RX buffer.
This is signalled by a bit in the RX queue descriptor.

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Reviewed-by: David Dai <zdai@us.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
---
v2: calculate gso_segs after Eric Dumazet's comments on the earlier patch
    and make sure everyone is included on CC
---
 drivers/net/ethernet/ibm/ibmveth.c | 72 ++++++++++++++++++++++++++++++++++++--
 drivers/net/ethernet/ibm/ibmveth.h |  1 +
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index ebe6071..f0c3ae7 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -58,7 +58,7 @@
 
 static const char ibmveth_driver_name[] = "ibmveth";
 static const char ibmveth_driver_string[] = "IBM Power Virtual Ethernet Driver";
-#define ibmveth_driver_version "1.05"
+#define ibmveth_driver_version "1.06"
 
 MODULE_AUTHOR("Santiago Leon <santil@linux.vnet.ibm.com>");
 MODULE_DESCRIPTION("IBM Power Virtual Ethernet Driver");
@@ -137,6 +137,11 @@ static inline int ibmveth_rxq_frame_offset(struct ibmveth_adapter *adapter)
 	return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_OFF_MASK;
 }
 
+static inline int ibmveth_rxq_large_packet(struct ibmveth_adapter *adapter)
+{
+	return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_LRG_PKT;
+}
+
 static inline int ibmveth_rxq_frame_length(struct ibmveth_adapter *adapter)
 {
 	return be32_to_cpu(adapter->rx_queue.queue_addr[adapter->rx_queue.index].length);
@@ -1174,6 +1179,52 @@ static netdev_tx_t ibmveth_start_xmit(struct sk_buff *skb,
 	goto retry_bounce;
 }
 
+static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
+{
+	struct tcphdr *tcph;
+	int offset = 0;
+	int hdr_len;
+
+	/* only TCP packets will be aggregated */
+	if (skb->protocol == htons(ETH_P_IP)) {
+		struct iphdr *iph = (struct iphdr *)skb->data;
+
+		if (iph->protocol == IPPROTO_TCP) {
+			offset = iph->ihl * 4;
+			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+		} else {
+			return;
+		}
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		struct ipv6hdr *iph6 = (struct ipv6hdr *)skb->data;
+
+		if (iph6->nexthdr == IPPROTO_TCP) {
+			offset = sizeof(struct ipv6hdr);
+			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+		} else {
+			return;
+		}
+	} else {
+		return;
+	}
+	/* if mss is not set through Large Packet bit/mss in rx buffer,
+	 * expect that the mss will be written to the tcp header checksum.
+	 */
+	tcph = (struct tcphdr *)(skb->data + offset);
+	hdr_len = offset + tcph->doff * 4;
+	if (lrg_pkt) {
+		skb_shinfo(skb)->gso_size = mss;
+		skb_shinfo(skb)->gso_segs =
+					DIV_ROUND_UP(skb->len - hdr_len, mss);
+	} else if (offset) {
+		skb_shinfo(skb)->gso_size = ntohs(tcph->check);
+		skb_shinfo(skb)->gso_segs =
+				DIV_ROUND_UP(skb->len - hdr_len,
+					     skb_shinfo(skb)->gso_size);
+		tcph->check = 0;
+	}
+}
+
 static int ibmveth_poll(struct napi_struct *napi, int budget)
 {
 	struct ibmveth_adapter *adapter =
@@ -1182,6 +1233,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
 	int frames_processed = 0;
 	unsigned long lpar_rc;
 	struct iphdr *iph;
+	u16 mss = 0;
 
 restart_poll:
 	while (frames_processed < budget) {
@@ -1199,9 +1251,21 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
 			int length = ibmveth_rxq_frame_length(adapter);
 			int offset = ibmveth_rxq_frame_offset(adapter);
 			int csum_good = ibmveth_rxq_csum_good(adapter);
+			int lrg_pkt = ibmveth_rxq_large_packet(adapter);
 
 			skb = ibmveth_rxq_get_buffer(adapter);
 
+			/* if the large packet bit is set in the rx queue
+			 * descriptor, the mss will be written by PHYP eight
+			 * bytes from the start of the rx buffer, which is
+			 * skb->data at this stage
+			 */
+			if (lrg_pkt) {
+				__be64 *rxmss = (__be64 *)(skb->data + 8);
+
+				mss = (u16)be64_to_cpu(*rxmss);
+			}
+
 			new_skb = NULL;
 			if (length < rx_copybreak)
 				new_skb = netdev_alloc_skb(netdev, length);
@@ -1235,11 +1299,15 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
 					if (iph->check == 0xffff) {
 						iph->check = 0;
 						iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
-						adapter->rx_large_packets++;
 					}
 				}
 			}
 
+			if (length > netdev->mtu + ETH_HLEN) {
+				ibmveth_rx_mss_helper(skb, mss, lrg_pkt);
+				adapter->rx_large_packets++;
+			}
+
 			napi_gro_receive(napi, skb);	/* send it up */
 
 			netdev->stats.rx_packets++;
diff --git a/drivers/net/ethernet/ibm/ibmveth.h b/drivers/net/ethernet/ibm/ibmveth.h
index 4eade67..7acda04 100644
--- a/drivers/net/ethernet/ibm/ibmveth.h
+++ b/drivers/net/ethernet/ibm/ibmveth.h
@@ -209,6 +209,7 @@ struct ibmveth_rx_q_entry {
 #define IBMVETH_RXQ_TOGGLE		0x80000000
 #define IBMVETH_RXQ_TOGGLE_SHIFT	31
 #define IBMVETH_RXQ_VALID		0x40000000
+#define IBMVETH_RXQ_LRG_PKT		0x04000000
 #define IBMVETH_RXQ_NO_CSUM		0x02000000
 #define IBMVETH_RXQ_CSUM_GOOD		0x01000000
 #define IBMVETH_RXQ_OFF_MASK		0x0000FFFF
-- 
1.8.3.1


* Re: Synopsys Ethernet QoS
From: Florian Fainelli @ 2016-12-10  1:44 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: David Miller, Joao Pinto, Giuseppe CAVALLARO, lars.persson,
	rabin.vincent, netdev, CARLOS.PALMINHA, Jie.Deng1
In-Reply-To: <CAHp75VfT9B3O5jU0eHoKtgYc48K2ZjCQ-g9ZQ9nX1Hew6tz-zw@mail.gmail.com>

Le 12/09/16 à 16:16, Andy Shevchenko a écrit :
> On Sat, Dec 10, 2016 at 12:52 AM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> 
>> It's kind of sad that customers of that IP (stmmac, amd-xgbe, sxgbe) did
>> actually pioneer the upstreaming effort, but it is good to see people
>> from Synopsys willing to fix that in the future.
> 
> Wait, you would like to tell that we have more than 2 drivers for the
> same (okay, same vendor) IP?!
> It's better to unify them earlier, than have n+ copies.

Unfortunately that is the case, see this email:

https://www.mail-archive.com/netdev@vger.kernel.org/msg142796.html

dwc_eth_qos and stmmac have some overlap. There seems to be work
underway to unify these two to begin with.

> 
> P.S. Though, I don't see how sxgbe got in the list. First glance on
> the code doesn't show similarities.

Well samsung/sxgbe looks potentially similar to amd/xgbe, but that's
just my cursory look at the code, it may very well be something entirely
different. The descriptor formats just look suspiciously similar.
-- 
Florian


* Re: Soft lockup in inet_put_port on 4.6
From: Josef Bacik @ 2016-12-10  1:59 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, Tom Herbert,
	Linux Kernel Network Developers
In-Reply-To: <6C6EE0ED-7E78-4866-8AAF-D75FD4719EF3@fb.com>

On Thu, Dec 8, 2016 at 8:01 PM, Josef Bacik <jbacik@fb.com> wrote:
> 
>>  On Dec 8, 2016, at 7:32 PM, Eric Dumazet <eric.dumazet@gmail.com> 
>> wrote:
>> 
>>>  On Thu, 2016-12-08 at 16:36 -0500, Josef Bacik wrote:
>>> 
>>>  We can reproduce the problem at will, still trying to run down the
>>>  problem.  I'll try and find one of the boxes that dumped a core 
>>> and get
>>>  a bt of everybody.  Thanks,
>> 
>>  OK, sounds good.
>> 
>>  I had a look and :
>>  - could not spot a fix that came after 4.6.
>>  - could not spot an obvious bug.
>> 
>>  Anything special in the program triggering the issue ?
>>  SO_REUSEPORT and/or special socket options ?
>> 
> 
> So they recently started using SO_REUSEPORT, that's what triggered 
> it, if they don't use it then everything is fine.
> 
> I added some instrumentation for get_port to see if it was looping in 
> there and none of my printk's triggered.  The softlockup messages are 
> always on the inet_bind_bucket lock, sometimes in the process context 
> in get_port or in the softirq context either through inet_put_port or 
> inet_kill_twsk.  On the box that I have a coredump for there's only 
> one processor in the inet code so I'm not sure what to make of that.  
> That was a box from last week so I'll look at a more recent core and 
> see if it's different.  Thanks,

Ok more investigation today, a few bullet points

- With all the debugging turned on the boxes seem to recover after 
about a minute.  I'd get the spam of the soft lockup messages all on 
the inet_bind_bucket, and then the box would be fine.
- I looked at a core I had from before I started investigating things 
and there's only one process trying to get the inet_bind_bucket of all 
the 48 cpus.
- I noticed that there were over 100k twsk's in that original core.
- I put a global counter of the twsk's (since most of the softlockup 
messages have the twsk timers in the stack) and noticed with the 
debugging kernel it started around 16k twsk's and once it recovered it 
was down to less than a thousand.  There's a jump where it goes from 8k 
to 2k and then there's only one more softlockup message and the box is 
fine.
- This happens when we restart the service with the config option to 
start using SO_REUSEPORT.

The application is our load balancing app, so obviously has lots of 
connections opened at any given time.  What I'm wondering and will test 
on Monday is if the SO_REUSEPORT change even matters, or if simply 
restarting the service is what triggers the problem.  One thing I 
forgot to mention is that it's also using TCP_FASTOPEN in both the 
non-reuseport and reuseport variants.

What I suspect is happening is the service stops, all of the sockets it 
had open go into TIMEWAIT with relatively the same timer period, and 
then suddenly all wake up at the same time which coupled with the 
massive amount of traffic that we see per box anyway results in so much 
contention and ksoftirqd usage that the box livelocks for a while.  
With the lock debugging and stuff turned on we aren't able to service 
as much traffic so it recovers relatively quickly, whereas a normal 
production kernel never recovers.

Please keep in mind that I'm a file system developer, so my conclusions
may be completely insane; any guidance would be welcome.  I'll continue
hammering on this on Monday.  Thanks,

Josef


* Re: [PATCH 0/2 v3] net: qcom/emac: simplify support for different SOCs
From: David Miller @ 2016-12-10  2:06 UTC (permalink / raw)
  To: timur; +Cc: netdev, alokc
In-Reply-To: <1481225061-30962-1-git-send-email-timur@codeaurora.org>

From: Timur Tabi <timur@codeaurora.org>
Date: Thu,  8 Dec 2016 13:24:19 -0600

> On SOCs that have the Qualcomm EMAC network controller, the internal
> PHY block is always different.  Sometimes the differences are small, 
> sometimes it might be a completely different IP.  Either way, using version
> numbers to differentiate them and putting all of the init code in one
> file does not scale.
> 
> This patchset does two things:  The first breaks up the current code into
> different files, and the second patch adds support for a third SOC, the
> Qualcomm Technologies QDF2400 ARM Server SOC.

Series applied.


* Re: Synopsys Ethernet QoS
From: Jie Deng @ 2016-12-10  2:13 UTC (permalink / raw)
  To: Andy Shevchenko, Florian Fainelli
  Cc: David Miller, Joao Pinto, Giuseppe CAVALLARO, lars.persson,
	rabin.vincent, netdev, CARLOS.PALMINHA
In-Reply-To: <CAHp75VfT9B3O5jU0eHoKtgYc48K2ZjCQ-g9ZQ9nX1Hew6tz-zw@mail.gmail.com>



On 2016/12/10 8:16, Andy Shevchenko wrote:
> On Sat, Dec 10, 2016 at 12:52 AM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>> It's kind of sad that customers of that IP (stmmac, amd-xgbe, sxgbe) did
>> actually pioneer the upstreaming effort, but it is good to see people
>> from Synopsys willing to fix that in the future.
> Wait, you would like to tell that we have more than 2 drivers for the
> same (okay, same vendor) IP?!
> It's better to unify them earlier, than have n+ copies.
>
> P.S. Though, I don't see how sxgbe got in the list. First glance on
> the code doesn't show similarities.
A glance at sxgbe_reg.h suggests the registers are from the Synopsys XGMAC IP.
Probably amd-xgbe and sxgbe target the same IP.


* Re: [PATCH net-next 1/2] net: phy: add extension of phy-mode for XLGMII
From: Jie Deng @ 2016-12-10  2:16 UTC (permalink / raw)
  To: Andrew Lunn, Jie Deng
  Cc: Florian Fainelli, davem, netdev, linux-kernel, CARLOS.PALMINHA,
	lars.persson, thomas.lendacky
In-Reply-To: <20161209163905.GG9923@lunn.ch>



On 2016/12/10 0:39, Andrew Lunn wrote:
> On Fri, Dec 09, 2016 at 01:19:07PM +0800, Jie Deng wrote:
>>
>> On 2016/12/9 6:15, Florian Fainelli wrote:
>>> On 12/06/2016 07:57 PM, Jie Deng wrote:
>>>> This patch adds phy-mode support for Synopsys XLGMAC
>>> The functional changes look good, but I would like to see some
>>> description of what the XL part stands for here.
>>>
>>> While you are modifying this, do you also mind submitting a Device Tree
>>> specification change:
>>>
>>> https://www.devicetree.org/specifications/
>>>
>>> Thanks!
>> Thank you for the information.
>>
>> Currently, the XLGMAC is a new IP from Synopsys.
> I think Florian wants to know about the IEEE standard or whatever
> defines what the phy-mode XLGMAC is, in the same way there are
> standards for RGMII, SGMII, etc.
>
> 	  Andrew
Understood! Thank you !


* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-10  2:25 UTC (permalink / raw)
  To: Pavel Machek, Francois Romieu
  Cc: bh74.an, ks.giri, vipul.pandya, peppe.cavallaro, alexandre.torgue,
	davem, linux-kernel, netdev
In-Reply-To: <20161209112142.GA22710@amd>

Hi,

On 09.12.2016 12:21, Pavel Machek wrote:
> On Fri 2016-12-09 00:19:43, Francois Romieu wrote:
>> Lino Sanfilippo <LinoSanfilippo@gmx.de> :
>> [...]
>> > OTOH Pavel said that he actually could produce a deadlock. Now I wonder if
>> > this is caused by that locking scheme (in a way I have not figured out yet)
>> > or if it is a different issue.
>> 
>> stmmac_tx_err races with stmmac_xmit.
> 
> Umm, yes, that looks real.
> 
> And that means that removing tx_lock will not be completely trivial
> :-(. Lino, any ideas there?
> 

Ok, the race is there, but it looks like a problem that is not related to
the use or removal of the private lock.
From a glimpse into other drivers (e.g. sky2 or e1000), a possible way to handle a
tx error is to start a separate task and restart the tx path in that task instead of
in the irq handler (or the timer, in the case of the watchdog).

In that task we could do:
1. deactivate napi
2. deactivate irqs
3. wait for running napi/irqs to complete (_sync)
4. call stmmac_tx_err()
5. reenable napi
6. reenable irqs

We have to ensure that no xmit() is executing while stmmac_tx_err() does the cleanup,
so stmmac_tx_err() should IMO rather call netif_tx_disable() instead of netif_stop_queue()
(the former grabs the xmit lock before it sets __QUEUE_STATE_DRV_XOFF to disable
the queue).

Regards,
Lino


* Re: [PATCH v3 net-next 0/4] udp: receive path optimizations
From: David Miller @ 2016-12-10  3:13 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, pabeni, eric.dumazet
In-Reply-To: <1481226117-31288-1-git-send-email-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Thu,  8 Dec 2016 11:41:53 -0800

> This patch series provides about 100 % performance increase under flood. 
> 
> v2: added Paolo feedback on udp_rmem_release() for tiny sk_rcvbuf
>     added the last patch touching sk_rmem_alloc later

Series applied, thanks.


* Re: [PATCH net v2] ibmveth: set correct gso_size and gso_type
From: Eric Dumazet @ 2016-12-10  3:28 UTC (permalink / raw)
  To: Thomas Falcon; +Cc: netdev, brking, pradeeps, marcelo.leitner, jmaxwell37, zdai
In-Reply-To: <1481333480-10827-1-git-send-email-tlfalcon@linux.vnet.ibm.com>

On Fri, 2016-12-09 at 19:31 -0600, Thomas Falcon wrote:
> This patch is based on an earlier one submitted
> by Jon Maxwell with the following commit message:
> 

> +					DIV_ROUND_UP(skb->len - hdr_len, mss);
> +	} else if (offset) {
> +		skb_shinfo(skb)->gso_size = ntohs(tcph->check);
> +		skb_shinfo(skb)->gso_segs =
> +				DIV_ROUND_UP(skb->len - hdr_len,
> +					     skb_shinfo(skb)->gso_size);
> +		tcph->check = 0;
> +	}

Are you sure that tcph->check can never be 0 in some cases?

That would crash with a divide by 0.
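
One way to guard the division, as a userspace sketch only: gso_segs() below is
a hypothetical helper (not part of the patch or the kernel), and DIV_ROUND_UP
is redefined locally so the block stands alone. Returning 0 for a zero mss is
an assumption about how the caller would want to skip the GSO setup.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Number of segments for 'payload' bytes at segment size 'mss'.
 * Returns 0 when mss is 0 so the caller can skip the GSO setup
 * instead of dividing by zero.
 */
static uint32_t gso_segs(uint32_t payload, uint16_t mss)
{
	if (mss == 0)
		return 0;
	return DIV_ROUND_UP(payload, (uint32_t)mss);
}
```

A caller would then set gso_size and gso_segs only when the result is
non-zero, and leave the skb untouched otherwise.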


* Re: Soft lockup in inet_put_port on 4.6
From: Eric Dumazet @ 2016-12-10  3:47 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Hannes Frederic Sowa, Tom Herbert,
	Linux Kernel Network Developers
In-Reply-To: <1481335192.3663.0@smtp.office365.com>

On Fri, 2016-12-09 at 20:59 -0500, Josef Bacik wrote:
> On Thu, Dec 8, 2016 at 8:01 PM, Josef Bacik <jbacik@fb.com> wrote:
> > 
> >>  On Dec 8, 2016, at 7:32 PM, Eric Dumazet <eric.dumazet@gmail.com> 
> >> wrote:
> >> 
> >>>  On Thu, 2016-12-08 at 16:36 -0500, Josef Bacik wrote:
> >>> 
> >>>  We can reproduce the problem at will, still trying to run down the
> >>>  problem.  I'll try and find one of the boxes that dumped a core 
> >>> and get
> >>>  a bt of everybody.  Thanks,
> >> 
> >>  OK, sounds good.
> >> 
> >>  I had a look and :
> >>  - could not spot a fix that came after 4.6.
> >>  - could not spot an obvious bug.
> >> 
> >>  Anything special in the program triggering the issue ?
> >>  SO_REUSEPORT and/or special socket options ?
> >> 
> > 
> > So they recently started using SO_REUSEPORT, that's what triggered 
> > it, if they don't use it then everything is fine.
> > 
> > I added some instrumentation for get_port to see if it was looping in 
> > there and none of my printk's triggered.  The softlockup messages are 
> > always on the inet_bind_bucket lock, sometimes in the process context 
> > in get_port or in the softirq context either through inet_put_port or 
> > inet_kill_twsk.  On the box that I have a coredump for there's only 
> > one processor in the inet code so I'm not sure what to make of that.  
> > That was a box from last week so I'll look at a more recent core and 
> > see if it's different.  Thanks,
> 
> Ok more investigation today, a few bullet points
> 
> - With all the debugging turned on the boxes seem to recover after 
> about a minute.  I'd get the spam of the soft lockup messages all on 
> the inet_bind_bucket, and then the box would be fine.
> - I looked at a core I had from before I started investigating things 
> and there's only one process trying to get the inet_bind_bucket of all 
> the 48 cpus.
> - I noticed that there was over 100k twsk's in that original core.
> - I put a global counter of the twsk's (since most of the softlockup 
> messages have the twsk timers in the stack) and noticed with the 
> debugging kernel it started around 16k twsk's and once it recovered it 
> was down to less than a thousand.  There's a jump where it goes from 8k 
> to 2k and then there's only one more softlockup message and the box is 
> fine.
> - This happens when we restart the service with the config option to 
> start using SO_REUSEPORT.
> 
> The application is our load balancing app, so obviously has lots of 
> connections opened at any given time.  What I'm wondering and will test 
> on Monday is if the SO_REUSEPORT change even matters, or if simply 
> restarting the service is what triggers the problem.  One thing I 
> forgot to mention is that it's also using TCP_FASTOPEN in both the 
> non-reuseport and reuseport variants.
> 
> What I suspect is happening is the service stops, all of the sockets it 
> had open go into TIMEWAIT with relatively the same timer period, and 
> then suddenly all wake up at the same time which coupled with the 
> massive amount of traffic that we see per box anyway results in so much 
> contention and ksoftirqd usage that the box livelocks for a while.  
> With the lock debugging and stuff turned on we aren't able to service 
> as much traffic so it recovers relatively quickly, whereas a normal 
> production kernel never recovers.
> 
> Please keep in mind that I'm a file system developer so my conclusions 
> may be completely insane, any guidance would be welcome.  I'll continue 
> hammering on this on Monday.  Thanks,

Hmm... Does your ephemeral port range include the port your load
balancing app is using?
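
The range is exposed as two integers in
/proc/sys/net/ipv4/ip_local_port_range, so this is quick to check. A
small sketch follows; the helper names are hypothetical, the sanity
bounds are an assumption, and the sample values (32768-60999, port 443)
are illustrative only and not taken from this report.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse the "low high" pair as found in
 * /proc/sys/net/ipv4/ip_local_port_range.
 * Returns 0 on success, -1 on malformed or implausible input.
 */
static int parse_port_range(const char *text, int *low, int *high)
{
	if (sscanf(text, "%d %d", low, high) != 2)
		return -1;
	if (*low < 1 || *high > 65535 || *low > *high)
		return -1;
	return 0;
}

/* Non-zero if port falls inside the inclusive range [low, high]. */
static int port_is_ephemeral(int port, int low, int high)
{
	return port >= low && port <= high;
}

/* Illustrative values only. */
static void port_range_examples(void)
{
	int low, high;

	assert(parse_port_range("32768\t60999\n", &low, &high) == 0);
	assert(!port_is_ephemeral(443, low, high));	/* outside the range */
	assert(port_is_ephemeral(40000, low, high));	/* inside the range */
}
```

If the listening port does fall inside the range, outgoing connections
and the listeners compete for the same bind bucket.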

* Re: [PATCH] ibmveth: set correct gso_size and gso_type
From: David Miller @ 2016-12-10  3:48 UTC (permalink / raw)
  To: tlfalcon; +Cc: netdev
In-Reply-To: <1481236803-4807-1-git-send-email-tlfalcon@linux.vnet.ibm.com>

From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Date: Thu,  8 Dec 2016 16:40:03 -0600

> This patch is based on an earlier one submitted
> by Jon Maxwell with the following commit message:
> 
> "We recently encountered a bug where a few customers using ibmveth on the
> same LPAR hit an issue where a TCP session hung when large receive was
> enabled. Closer analysis revealed that the session was stuck because
> one side was advertising a zero window repeatedly.
> 
> We narrowed this down to the fact the ibmveth driver did not set gso_size
> which is translated by TCP into the MSS later up the stack. The MSS is
> used to calculate the TCP window size and as that was abnormally large,
> it was calculating a zero window, even though the socket's receive buffer
> was completely empty."
> 
> We rely on the Virtual I/O Server partition in a pseries
> environment to provide the MSS through the TCP header checksum
> field. The stipulation is that users should not disable checksum
> offloading if rx packet aggregation is enabled through VIOS.
> 
> Some firmware offerings provide the MSS in the RX buffer.
> This is signalled by a bit in the RX queue descriptor.
> 
> Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
> Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
> Reviewed-by: David Dai <zdai@us.ibm.com>
> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>

Applied, although mis-using the TCP checksum field for this is kind of
bogus.  I'm surprised there wasn't some other place you could stick
this value, which wouldn't modify the packet contents.

* Re: [PATCH net-next] net: macb: Added PCI wrapper for Platform Driver.
From: David Miller @ 2016-12-10  3:56 UTC (permalink / raw)
  To: bfolta
  Cc: nicolas.ferre, niklas.cassel, alexandre.torgue, satananda.burla,
	rvatsavayi, simon.horman, linux-kernel, netdev, rafalo
In-Reply-To: <SN1PR0701MB1951518D661B27AB9C63FA59CC870@SN1PR0701MB1951.namprd07.prod.outlook.com>

From: Bartosz Folta <bfolta@cadence.com>
Date: Fri, 9 Dec 2016 10:05:46 +0000

> There are hardware PCI implementations of the Cadence GEM network controller. This patch allows such hardware to be used by reusing the existing platform driver.

Please properly format your commit message text to 80 columns.

> 
> Signed-off-by: Bartosz Folta <bfolta@cadence.com>
> ---
>  drivers/net/ethernet/cadence/Kconfig    |   9 ++
>  drivers/net/ethernet/cadence/Makefile   |   1 +
>  drivers/net/ethernet/cadence/macb.c     |  31 +++++--
>  drivers/net/ethernet/cadence/macb_pci.c | 152 ++++++++++++++++++++++++++++++++
>  include/linux/platform_data/macb.h      |   6 ++
>  5 files changed, 194 insertions(+), 5 deletions(-)
>  create mode 100644 drivers/net/ethernet/cadence/macb_pci.c

This patch doesn't apply to net-next, please respin.

* Re: pull-request: mac80211-next 2016-12-09
From: David Miller @ 2016-12-10  3:59 UTC (permalink / raw)
  To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <20161209120014.20292-1-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri,  9 Dec 2016 13:00:13 +0100

> Closing net-next caught me by surprise, so I had to rebase a bit,
> but these three patches really should go in soon. I'm not sending
> them for 4.9 this late though.
> 
> Please pull and let me know if there's any problem.

Pulled, thanks Johannes.

* Re: [PATCH] net: smsc911x: back out silently on probe deferrals
From: David Miller @ 2016-12-10  4:05 UTC (permalink / raw)
  To: linus.walleij
  Cc: netdev, steve.glendinning, linux, jeremy.linton, kamlakant.patel,
	p.fedin
In-Reply-To: <1481289480-22096-1-git-send-email-linus.walleij@linaro.org>

From: Linus Walleij <linus.walleij@linaro.org>
Date: Fri,  9 Dec 2016 14:18:00 +0100

> When trying to get a regulator we may get deferred and we see
> this noise:
> 
> smsc911x 1b800000.ethernet-ebi2 (unnamed net_device) (uninitialized):
>    couldn't get regulators -517
> 
> Then the driver continues anyway. Which means that the regulator
> may not be properly retrieved and reference counted, and may be
> switched off in case no one else is using it.
> 
> Fix this by returning silently on deferred probe and let the
> system work it out.
> 
> Cc: Jeremy Linton <jeremy.linton@arm.com>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

Looks good, applied, thanks.
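
For reference, the pattern described in the commit message can be
sketched in userspace as follows. This is not the smsc911x code:
struct probe_result and model_probe() are hypothetical, and the value
517 for EPROBE_DEFER is taken from the -517 in the quoted log line. How
the real driver treats other regulator errors is not shown in this
message; propagating them here is an assumption.

```c
#include <assert.h>

#define EPROBE_DEFER 517	/* matches the -517 in the quoted log */

/* Hypothetical outcome of a probe step that requests its regulators. */
struct probe_result {
	int ret;	/* value handed back to the caller */
	int warned;	/* 1 if an error message would have been printed */
};

static struct probe_result model_probe(int regulator_ret)
{
	struct probe_result r = { 0, 0 };

	if (regulator_ret == -EPROBE_DEFER) {
		/* Not a failure: stay silent and let the probe be retried. */
		r.ret = -EPROBE_DEFER;
		return r;
	}
	if (regulator_ret < 0) {
		/* Genuine error: report it (propagating is an assumption). */
		r.warned = 1;
		r.ret = regulator_ret;
		return r;
	}
	return r;	/* regulators acquired, probing continues */
}

/* Example inputs only; -22 stands for some unrelated error. */
static void model_probe_examples(void)
{
	assert(model_probe(-EPROBE_DEFER).warned == 0);
	assert(model_probe(-EPROBE_DEFER).ret == -EPROBE_DEFER);
	assert(model_probe(-22).warned == 1);
	assert(model_probe(0).ret == 0);
}
```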

* Re: [PATCH net-next] net: skb_condense() can also deal with empty skbs
From: David Miller @ 2016-12-10  4:07 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1481299325.4930.183.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 09 Dec 2016 08:02:05 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> It seems attackers can also send UDP packets with no payload at all.
> 
> skb_condense() can still be a win in this case.
> 
> It will be possible to replace the custom code in tcp_add_backlog()
> to get full benefit from skb_condense()
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied.

* Re: [PATCH] i40e: don't truncate match_method assignment
From: David Miller @ 2016-12-10  4:07 UTC (permalink / raw)
  To: jacob.e.keller
  Cc: intel-wired-lan, jeffrey.t.kirsher, netdev, sfr, bimmy.pujari
In-Reply-To: <20161209213921.26451-1-jacob.e.keller@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>
Date: Fri,  9 Dec 2016 13:39:21 -0800

> The .match_method field is a u8, so we shouldn't be casting to a u16,
> and because it is only one byte, we do not need to byte swap anything.
> Just assign the value directly. This avoids issues on Big Endian
> architectures which would have byte swapped and then incorrectly
> truncated the value.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> Cc: Bimmy Pujari <bimmy.pujari@intel.com>
> ---
> Not sure if this was already in Jeff's queue, but since it's an obvious
> fix for the issue found by Stephen, I thought I'd send it out now just
> to make sure. Thanks for catching this, and sorry we didn't find the fix
> earlier.

Jeff, what do you want me to do with this?
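
For anyone following along, the truncation described in the quoted
commit message can be reproduced in userspace. The original i40e
expression is not quoted in this message, so the sketch assumes it had
the shape "convert to 16-bit little endian, then store in a u8"; the
function names are hypothetical, and swap16() stands in for what such a
conversion does on a big endian host.

```c
#include <assert.h>
#include <stdint.h>

/* What a cpu-to-little-endian 16-bit conversion does on big endian. */
static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Assumed old pattern on big endian: swap as u16, then truncate to u8. */
static uint8_t assign_swapped_then_truncated(uint8_t method)
{
	return (uint8_t)swap16((uint16_t)method);
}

/* Fixed pattern: a single byte has no byte order, so assign directly. */
static uint8_t assign_direct(uint8_t method)
{
	return method;
}

/* Example values only. */
static void match_method_examples(void)
{
	/* The swap moves the value into the high byte, which is cut off. */
	assert(assign_swapped_then_truncated(0x01) == 0x00);
	assert(assign_swapped_then_truncated(0xff) == 0x00);
	assert(assign_direct(0x01) == 0x01);
}
```

On a little endian host the conversion is a no-op, which is why the
problem would only show up on big endian.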

* Re: [PATCH] net: mlx5: Fix Kconfig help text
From: David Miller @ 2016-12-10  4:09 UTC (permalink / raw)
  To: cov-sgV2jX0FEOL9JmXXK+q4OQ
  Cc: saeedm-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	leonro-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <20161209215306.721-1-cov-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>

From: Christopher Covington <cov-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
Date: Fri,  9 Dec 2016 16:53:05 -0500

> Since the following commit, Infiniband and Ethernet have not been
> mutually exclusive.
> 
> Fixes: 4aa17b28 mlx5: Enable mutual support for IB and Ethernet
> 
> Signed-off-by: Christopher Covington <cov-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>

Applied.

* Re: netlink: GPF in sock_sndtimeo
From: Cong Wang @ 2016-12-10  4:13 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller,
	Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev,
	LKML, syzkaller
In-Reply-To: <20161209110155.GW22655@madcap2.tricolour.ca>

On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2016-12-08 22:57, Cong Wang wrote:
>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a
>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
>> > Eliminating the lock since the sock is dead anyway eliminates the error.
>> >
>> > Is it safe?  I'll resubmit if this looks remotely sane.  Meanwhile I'll try to
>> > get the test case to compile.
>>
>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid'
>> are updated as a whole and race between audit_receive_msg() and
>> NETLINK_URELEASE.
>
> This is what I expected and why I originally added the mutex lock in the
> callback...  The dumps I got were bare with no wrapper identifying the
> process context or specific error, so I'm at a bit of a loss how to
> solve this (without thinking more about it) other than instinctively
> removing the mutex.

Netlink notifier can safely be converted to blocking one, I will send
a patch.

But I seriously doubt you really need NETLINK_URELEASE here;
it adds nothing but overhead, because the netlink notifier is called for
every netlink socket in the system, whereas the net exit path is a
relatively slow path.

Also, kauditd_send_skb() needs audit_cmd_mutex too.

I will send a formal patch.

Thanks.
