Netdev List
 help / color / mirror / Atom feed
* [PATCH v4 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: Jason A. Donenfeld @ 2017-04-25 15:04 UTC (permalink / raw)
  To: netdev, linux-kernel, davem, David.Laight, kernel-hardening
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425.104731.631398016575024152.davem@davemloft.net>

This is a defense-in-depth measure in response to bugs like
4d6fa57b4dab ("macsec: avoid heap overflow in skb_to_sgvec")

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
v4 fixes the commit message and moves the check into the inner-most if.

 net/core/skbuff.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f86bf69cfb8d..e75640006d78 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3489,7 +3489,9 @@ void __init skb_init(void)
  *	@len: Length of buffer space to be mapped
  *
  *	Fill the specified scatter-gather list with mappings/pointers into a
- *	region of the buffer space attached to a socket buffer.
+ *	region of the buffer space attached to a socket buffer. Returns either
+ *	the number of scatterlist items used, or -EMSGSIZE if the contents
+ *	could not fit.
  */
 static int
 __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
@@ -3517,6 +3519,8 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
 		end = start + skb_frag_size(&skb_shinfo(skb)->frags[i]);
 		if ((copy = end - offset) > 0) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+			if (elt && sg_is_last(&sg[elt - 1]))
+				return -EMSGSIZE;
 
 			if (copy > len)
 				copy = len;
@@ -3537,6 +3541,9 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
 
 		end = start + frag_iter->len;
 		if ((copy = end - offset) > 0) {
+			if (elt && sg_is_last(&sg[elt - 1]))
+				return -EMSGSIZE;
+
 			if (copy > len)
 				copy = len;
 			elt += __skb_to_sgvec(frag_iter, sg+elt, offset - start,
@@ -3581,6 +3588,9 @@ int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int le
 {
 	int nsg = __skb_to_sgvec(skb, sg, offset, len);
 
+	if (nsg <= 0)
+		return nsg;
+
 	sg_mark_end(&sg[nsg - 1]);
 
 	return nsg;
-- 
2.12.2

^ permalink raw reply related

* Re: [Intel-wired-lan] [PATCH] ixgbe: initialize u64_stats_sync structures early at ixgbe_probe
From: Alexander Duyck @ 2017-04-25 15:07 UTC (permalink / raw)
  To: Singh, Krishneil K
  Cc: Song, Liwei (Wind River), Kirsher, Jeffrey T,
	netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <EF50CC0C76A53A44BAD798C58E0A157E50B87CBB@ORSMSX103.amr.corp.intel.com>

On Mon, Apr 24, 2017 at 4:00 PM, Singh, Krishneil K
<krishneil.k.singh@intel.com> wrote:
>
>
>
> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@lists.osuosl.org] On Behalf Of Liwei Song
> Sent: Sunday, December 4, 2016 7:41 PM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>
> Cc: netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org; linux-kernel@vger.kernel.org; Song, Liwei (Wind River) <liwei.song@windriver.com>
> Subject: [Intel-wired-lan] [PATCH] ixgbe: initialize u64_stats_sync structures early at ixgbe_probe
>
> Fix the following CallTrace:
> INFO: trying to register non-static key.
> the code is fine but needs lockdep annotation.
> turning off the locking correctness validator.
> CPU: 71 PID: 1 Comm: swapper/0 Not tainted 4.8.8-WR9.0.0.1_standard #11 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
>  00200086 00200086 eb5e1ab8 c144dd70 00000000 00000000 eb5e1af8 c10af89a
>  c1d23de4 eb5e1af8 00000009 eb5d8600 eb5d8638 eb5e1af8 c10b14d8 00000009  0000000a c1d32911 00000000 00000000 e44c826c eb5d8000 eb5e1b74 c10b214e Call Trace:
>  [<c144dd70>] dump_stack+0x5f/0x8f
>  [<c10af89a>] register_lock_class+0x25a/0x4c0  [<c10b14d8>] ? check_irq_usage+0x88/0xc0  [<c10b214e>] __lock_acquire+0x5e/0x17a0  [<c1abdb9b>] ? _raw_spin_unlock_irqrestore+0x3b/0x70
>  [<c10cf14a>] ? rcu_read_lock_sched_held+0x8a/0x90
>  [<c10b3c5f>] lock_acquire+0x9f/0x1f0
>  [<c1922dcf>] ? dev_get_stats+0x5f/0x110  [<c176e6b3>] ixgbe_get_stats64+0x113/0x320  [<c1922dcf>] ? dev_get_stats+0x5f/0x110  [<c1922dcf>] dev_get_stats+0x5f/0x110  [<c1ab5415>] rtnl_fill_stats+0x40/0x105  [<c193dd45>] rtnl_fill_ifinfo+0x4c5/0xd20  [<c11c5115>] ? __kmalloc_node_track_caller+0x1a5/0x410
>  [<c1917487>] ? __kmalloc_reserve.isra.42+0x27/0x80
>  [<c191754f>] ? __alloc_skb+0x6f/0x270
>  [<c1942291>] rtmsg_ifinfo_build_skb+0x71/0xd0  [<c194230a>] rtmsg_ifinfo.part.23+0x1a/0x50  [<c1923dad>] ? call_netdevice_notifiers_info+0x2d/0x60
>  [<c194236b>] rtmsg_ifinfo+0x2b/0x40
>  [<c192f997>] register_netdevice+0x3d7/0x4d0  [<c192faa7>] register_netdev+0x17/0x30  [<c177b83d>] ixgbe_probe+0x118d/0x1610  [<c1498202>] local_pci_probe+0x32/0x80  [<c1498172>] ? pci_match_device+0xd2/0x100  [<c14991e0>] pci_device_probe+0xc0/0x110  [<c1652cc5>] driver_probe_device+0x1c5/0x280  [<c1498172>] ? pci_match_device+0xd2/0x100  [<c1652e09>] __driver_attach+0x89/0x90  [<c1652d80>] ? driver_probe_device+0x280/0x280  [<c165114f>] bus_for_each_dev+0x4f/0x80  [<c165269e>] driver_attach+0x1e/0x20  [<c1652d80>] ? driver_probe_device+0x280/0x280  [<c1652317>] bus_add_driver+0x1a7/0x220  [<c1653a79>] driver_register+0x59/0xe0  [<c1f897b8>] ? igb_init_module+0x49/0x49  [<c1497b2a>] __pci_register_driver+0x4a/0x50  [<c1f8985d>] ixgbe_init_module+0xa5/0xc4  [<c1000485>] do_one_initcall+0x35/0x150  [<c107e818>] ? parameq+0x18/0x70  [<c1f395d8>] ? repair_env_string+0x12/0x51  [<c107ead0>] ? parse_args+0x260/0x3b0  [<c1074f73>] ? __usermodehelper_set_disable_depth+0x43/0x50
>  [<c1f39e90>] kernel_init_freeable+0x19b/0x267  [<c1f395c6>] ? set_debug_rodata+0xf/0xf  [<c10b1e7b>] ? trace_hardirqs_on+0xb/0x10  [<c1abdc02>] ? _raw_spin_unlock_irq+0x32/0x50  [<c1085f0b>] ? finish_task_switch+0xab/0x1f0  [<c1085ec9>] ? finish_task_switch+0x69/0x1f0  [<c1ab6a30>] kernel_init+0x10/0x110  [<c108bd65>] ? schedule_tail+0x25/0x80  [<c1abe422>] ret_from_kernel_thread+0xe/0x24  [<c1ab6a20>] ? rest_init+0x130/0x130
>
> This CallTrace occurred on 32-bit kernel with CONFIG_PROVE_LOCKING enabled.
>
> This happens at ixgbe driver probe hardware stage, when comes to ixgbe_get_stats64, the seqcount/seqlock still not initialize, although this was initialize in TX/RX resources setup routin, but it was too late, then lockdep give this Warning.
>
> To fix this, move the u64_stats_init function to driver probe stage, which before we get the status of seqcount and after the RX/TX ring was finished init.
>
> Signed-off-by: Liwei Song <liwei.song@windriver.com>
> ---
>
> Tested-by: Krishneil Singh  <krishneil.k.singh@intel.com>

This patch doesn't look right to me. I would suggest rejecting it.

The call to initialize the stats should be done when the ring is
allocated, not in ixgbe_probe(). This should probably be done in
ixgbe_alloc_q_vector() instead.

- Alex

^ permalink raw reply

* Re: [PATCH] macsec: avoid heap overflow in skb_to_sgvec
From: Jason A. Donenfeld @ 2017-04-25 15:08 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Netdev, LKML, David Miller, stable, security
In-Reply-To: <20170425145340.GA25241@bistromath.localdomain>

Hi Sabrina,

On Tue, Apr 25, 2017 at 4:53 PM, Sabrina Dubroca <sd@queasysnail.net> wrote:
> Ugh, good catch :/
>
> AFAICT this patch doesn't really help, because NETIF_F_FRAGLIST
> doesn't get tested in paths that can lead to triggering this.

You're right. This fixes the xmit() path, but not the receive path,
which appears to take skbs directly from the upper device.

> I'll post a patch to allocate a properly-sized sg array.

I just posted this series, which should fix things in a robust way:

https://patchwork.ozlabs.org/patch/754861/

If you want to handle fraglists, it might be a decent idea to allocate
things of the proper size, I guess. It's a good opportunity to call
skb_cow_data, which you need to do anyway when modifying skbs, I
think.

Jason

^ permalink raw reply

* [PATCH net-next 0/2] Move sub crq init out of interrupt context
From: Nathan Fontenot @ 2017-04-25 19:00 UTC (permalink / raw)
  To: netdev; +Cc: jallen, tlfalcon

The sub crqs are currently intialized in interrupt context when
handling a crq response fromn the vios server. There is no reason
they must be initialized there.

Moving the initialization of the sub crqs to the ibmvnic_init routine
allows us to do the initialization outside of interrupt context and
make all of the allocations with GFP_KERNEL instead of GFP_ATOMIC.

-Nathan

---

Nathan Fontenot (2):
      ibmvnic: Split initialization of scrqs to its own routine
      ibmvnic: Move initialization of sub crqs to ibmvnic_init


 drivers/net/ethernet/ibm/ibmvnic.c |  114 +++++++++++++++++++-----------------
 1 file changed, 61 insertions(+), 53 deletions(-)

^ permalink raw reply

* [PATCH net-next 1/2] ibmvnic: Split initialization of scrqs to its own routine
From: Nathan Fontenot @ 2017-04-25 19:01 UTC (permalink / raw)
  To: netdev; +Cc: jallen, tlfalcon
In-Reply-To: <20170425185704.41126.65738.stgit@ltcalpine2-lp23.aus.stglabs.ibm.com>

Split the sending of capability request crqs and the initialization
of sub crqs into their own routines. This is a first step to moving
the allocation of sub-crqs out of interrupt context.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c |  101 +++++++++++++++++++-----------------
 1 file changed, 54 insertions(+), 47 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 51bf337..c2e260c 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1678,48 +1678,20 @@ static int init_sub_crq_irqs(struct ibmvnic_adapter *adapter)
 	return rc;
 }
 
-static void init_sub_crqs(struct ibmvnic_adapter *adapter, int retry)
+static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 {
 	struct device *dev = &adapter->vdev->dev;
 	struct ibmvnic_sub_crq_queue **allqueues;
 	int registered_queues = 0;
-	union ibmvnic_crq crq;
 	int total_queues;
 	int more = 0;
 	int i;
 
-	if (!retry) {
-		/* Sub-CRQ entries are 32 byte long */
-		int entries_page = 4 * PAGE_SIZE / (sizeof(u64) * 4);
-
-		if (adapter->min_tx_entries_per_subcrq > entries_page ||
-		    adapter->min_rx_add_entries_per_subcrq > entries_page) {
-			dev_err(dev, "Fatal, invalid entries per sub-crq\n");
-			goto allqueues_failed;
-		}
-
-		/* Get the minimum between the queried max and the entries
-		 * that fit in our PAGE_SIZE
-		 */
-		adapter->req_tx_entries_per_subcrq =
-		    adapter->max_tx_entries_per_subcrq > entries_page ?
-		    entries_page : adapter->max_tx_entries_per_subcrq;
-		adapter->req_rx_add_entries_per_subcrq =
-		    adapter->max_rx_add_entries_per_subcrq > entries_page ?
-		    entries_page : adapter->max_rx_add_entries_per_subcrq;
-
-		adapter->req_tx_queues = adapter->opt_tx_comp_sub_queues;
-		adapter->req_rx_queues = adapter->opt_rx_comp_queues;
-		adapter->req_rx_add_queues = adapter->max_rx_add_queues;
-
-		adapter->req_mtu = adapter->netdev->mtu + ETH_HLEN;
-	}
-
 	total_queues = adapter->req_tx_queues + adapter->req_rx_queues;
 
 	allqueues = kcalloc(total_queues, sizeof(*allqueues), GFP_ATOMIC);
 	if (!allqueues)
-		goto allqueues_failed;
+		return -1;
 
 	for (i = 0; i < total_queues; i++) {
 		allqueues[i] = init_sub_crq_queue(adapter);
@@ -1776,6 +1748,56 @@ static void init_sub_crqs(struct ibmvnic_adapter *adapter, int retry)
 		adapter->rx_scrq[i]->scrq_num = i;
 	}
 
+	kfree(allqueues);
+	return 0;
+
+rx_failed:
+	kfree(adapter->tx_scrq);
+	adapter->tx_scrq = NULL;
+tx_failed:
+	for (i = 0; i < registered_queues; i++)
+		release_sub_crq_queue(adapter, allqueues[i]);
+	kfree(allqueues);
+	return -1;
+}
+
+static void ibmvnic_send_req_caps(struct ibmvnic_adapter *adapter, int retry)
+{
+	struct device *dev = &adapter->vdev->dev;
+	union ibmvnic_crq crq;
+	int rc;
+
+	if (!retry) {
+		/* Sub-CRQ entries are 32 byte long */
+		int entries_page = 4 * PAGE_SIZE / (sizeof(u64) * 4);
+
+		if (adapter->min_tx_entries_per_subcrq > entries_page ||
+		    adapter->min_rx_add_entries_per_subcrq > entries_page) {
+			dev_err(dev, "Fatal, invalid entries per sub-crq\n");
+			return;
+		}
+
+		/* Get the minimum between the queried max and the entries
+		 * that fit in our PAGE_SIZE
+		 */
+		adapter->req_tx_entries_per_subcrq =
+		    adapter->max_tx_entries_per_subcrq > entries_page ?
+		    entries_page : adapter->max_tx_entries_per_subcrq;
+		adapter->req_rx_add_entries_per_subcrq =
+		    adapter->max_rx_add_entries_per_subcrq > entries_page ?
+		    entries_page : adapter->max_rx_add_entries_per_subcrq;
+
+		adapter->req_tx_queues = adapter->opt_tx_comp_sub_queues;
+		adapter->req_rx_queues = adapter->opt_rx_comp_queues;
+		adapter->req_rx_add_queues = adapter->max_rx_add_queues;
+
+		adapter->req_mtu = adapter->netdev->mtu + ETH_HLEN;
+	}
+
+	rc = init_sub_crqs(adapter);
+	if (rc)
+		return;
+
 	memset(&crq, 0, sizeof(crq));
 	crq.request_capability.first = IBMVNIC_CRQ_CMD;
 	crq.request_capability.cmd = REQUEST_CAPABILITY;
@@ -1829,20 +1851,6 @@ static void init_sub_crqs(struct ibmvnic_adapter *adapter, int retry)
 		atomic_inc(&adapter->running_cap_crqs);
 		ibmvnic_send_crq(adapter, &crq);
 	}
-
-	kfree(allqueues);
-
-	return;
-
-rx_failed:
-	kfree(adapter->tx_scrq);
-	adapter->tx_scrq = NULL;
-tx_failed:
-	for (i = 0; i < registered_queues; i++)
-		release_sub_crq_queue(adapter, allqueues[i]);
-	kfree(allqueues);
-allqueues_failed:
-	ibmvnic_remove(adapter->vdev);
 }
 
 static int pending_scrq(struct ibmvnic_adapter *adapter,
@@ -2568,7 +2576,7 @@ static void handle_request_cap_rsp(union ibmvnic_crq *crq,
 					       number), name);
 		release_sub_crqs(adapter);
 		*req_value = be64_to_cpu(crq->request_capability_rsp.number);
-		init_sub_crqs(adapter, 1);
+		ibmvnic_send_req_caps(adapter, 1);
 		return;
 	default:
 		dev_err(dev, "Error %d in request cap rsp\n",
@@ -2881,8 +2889,7 @@ static void handle_query_cap_rsp(union ibmvnic_crq *crq,
 out:
 	if (atomic_read(&adapter->running_cap_crqs) == 0) {
 		adapter->wait_capability = false;
-		init_sub_crqs(adapter, 0);
-		/* We're done querying the capabilities, initialize sub-crqs */
+		ibmvnic_send_req_caps(adapter, 0);
 	}
 }
 

^ permalink raw reply related

* [PATCH net-next 2/2] ibmvnic: Move initialization of sub crqs to ibmvnic_init
From: Nathan Fontenot @ 2017-04-25 19:01 UTC (permalink / raw)
  To: netdev; +Cc: jallen, tlfalcon
In-Reply-To: <20170425185704.41126.65738.stgit@ltcalpine2-lp23.aus.stglabs.ibm.com>

The sub crq structures are initialized in interrupt context while
handling the response to crqs when negotiating capabilities for
the driver. The sub crqs do not need to be initialized at this point
and can be moved to being done from ibmvnic_init. Moving the init
of the sub crqs to ibmvnic_init also allows use to allocate the
memory with GFP_KERNEL instead of GFP_ATOMIC.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c |   23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index c2e260c..4fcd2f0 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1390,12 +1390,12 @@ static struct ibmvnic_sub_crq_queue *init_sub_crq_queue(struct ibmvnic_adapter
 	struct ibmvnic_sub_crq_queue *scrq;
 	int rc;
 
-	scrq = kzalloc(sizeof(*scrq), GFP_ATOMIC);
+	scrq = kzalloc(sizeof(*scrq), GFP_KERNEL);
 	if (!scrq)
 		return NULL;
 
 	scrq->msgs =
-		(union sub_crq *)__get_free_pages(GFP_ATOMIC | __GFP_ZERO, 2);
+		(union sub_crq *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 2);
 	if (!scrq->msgs) {
 		dev_warn(dev, "Couldn't allocate crq queue messages page\n");
 		goto zero_page_failed;
@@ -1689,7 +1689,7 @@ static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 
 	total_queues = adapter->req_tx_queues + adapter->req_rx_queues;
 
-	allqueues = kcalloc(total_queues, sizeof(*allqueues), GFP_ATOMIC);
+	allqueues = kcalloc(total_queues, sizeof(*allqueues), GFP_KERNEL);
 	if (!allqueues)
 		return -1;
 
@@ -1729,7 +1729,7 @@ static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 	}
 
 	adapter->tx_scrq = kcalloc(adapter->req_tx_queues,
-				   sizeof(*adapter->tx_scrq), GFP_ATOMIC);
+				   sizeof(*adapter->tx_scrq), GFP_KERNEL);
 	if (!adapter->tx_scrq)
 		goto tx_failed;
 
@@ -1739,7 +1739,7 @@ static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 	}
 
 	adapter->rx_scrq = kcalloc(adapter->req_rx_queues,
-				   sizeof(*adapter->rx_scrq), GFP_ATOMIC);
+				   sizeof(*adapter->rx_scrq), GFP_KERNEL);
 	if (!adapter->rx_scrq)
 		goto rx_failed;
 
@@ -1765,7 +1765,6 @@ static void ibmvnic_send_req_caps(struct ibmvnic_adapter *adapter, int retry)
 {
 	struct device *dev = &adapter->vdev->dev;
 	union ibmvnic_crq crq;
-	int rc;
 
 	if (!retry) {
 		/* Sub-CRQ entries are 32 byte long */
@@ -1794,10 +1793,6 @@ static void ibmvnic_send_req_caps(struct ibmvnic_adapter *adapter, int retry)
 		adapter->req_mtu = adapter->netdev->mtu + ETH_HLEN;
 	}
 
-	rc = init_sub_crqs(adapter);
-	if (rc)
-		return;
-
 	memset(&crq, 0, sizeof(crq));
 	crq.request_capability.first = IBMVNIC_CRQ_CMD;
 	crq.request_capability.cmd = REQUEST_CAPABILITY;
@@ -3317,7 +3312,13 @@ static int ibmvnic_init(struct ibmvnic_adapter *adapter)
 		return -1;
 	}
 
-	return 0;
+	rc = init_sub_crqs(adapter);
+	if (rc) {
+		dev_err(dev, "Initialization of sub crqs failed\n");
+		release_crq_queue(adapter);
+	}
+
+	return rc;
 }
 
 static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id)

^ permalink raw reply related

* Re: [PATCH] macsec: avoid heap overflow in skb_to_sgvec
From: Sabrina Dubroca @ 2017-04-25 15:12 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Netdev, LKML, David Miller, stable, security
In-Reply-To: <CAHmME9ovUDJRDA+Td4Y0bfQZEG5pZQo1JC0nYDVqWMOdxAe5kQ@mail.gmail.com>

2017-04-25, 17:08:28 +0200, Jason A. Donenfeld wrote:
> Hi Sabrina,
> 
> On Tue, Apr 25, 2017 at 4:53 PM, Sabrina Dubroca <sd@queasysnail.net> wrote:
> > Ugh, good catch :/
> >
> > AFAICT this patch doesn't really help, because NETIF_F_FRAGLIST
> > doesn't get tested in paths that can lead to triggering this.
> 
> You're right. This fixes the xmit() path, but not the receive path,
> which appears to take skbs directly from the upper device.
> 
> > I'll post a patch to allocate a properly-sized sg array.
> 
> I just posted this series, which should fix things in a robust way:
> 
> https://patchwork.ozlabs.org/patch/754861/

Yes, that prevents the overflow, but now you're just dropping
packets. I'll review that later, let's fix the overflow without
breaking connectivity for now.

-- 
Sabrina

^ permalink raw reply

* Re: [PATCH] macsec: avoid heap overflow in skb_to_sgvec
From: Jason A. Donenfeld @ 2017-04-25 15:13 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Netdev, LKML, David Miller, stable, security
In-Reply-To: <20170425151248.GB25241@bistromath.localdomain>

On Tue, Apr 25, 2017 at 5:12 PM, Sabrina Dubroca <sd@queasysnail.net> wrote:
>> https://patchwork.ozlabs.org/patch/754861/
>
> Yes, that prevents the overflow, but now you're just dropping
> packets.

Right, it's a so-called "defense-in-depth" measure.

> I'll review that later, let's fix the overflow without
> breaking connectivity for now.

Agreed.

^ permalink raw reply

* Re: [PATCH 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Jon Mason @ 2017-04-25 15:15 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Eric Anholt, Florian Fainelli, Vivien Didelot, Andrew Lunn,
	Network Development, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	linux-arm-kernel, open list, BCM Kernel Feedback, Ray Jui,
	Scott Branden, Jon Mason
In-Reply-To: <f870339f-2fa1-411d-5ce8-adbe8802d8a8-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>

On Tue, Apr 25, 2017 at 5:40 AM, Sergei Shtylyov
<sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org> wrote:
> Hello.
>
> On 4/25/2017 12:50 AM, Eric Anholt wrote:
>
>> Cygnus has a single amac controller connected to the B53 switch with 2
>> PHYs.  On the BCM911360_EP platform, those two PHYs are connected to
>> the external ethernet jacks.
>
>
>    My spell checker trips on "amac" and "ethernet" -- perhaps they need
> capitalization?
>
>> Signed-off-by: Eric Anholt <eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
>> ---
>>  arch/arm/boot/dts/bcm-cygnus.dtsi      | 60
>> ++++++++++++++++++++++++++++++++++
>>  arch/arm/boot/dts/bcm911360_entphn.dts |  8 +++++
>>  2 files changed, 68 insertions(+)
>>
>> diff --git a/arch/arm/boot/dts/bcm-cygnus.dtsi
>> b/arch/arm/boot/dts/bcm-cygnus.dtsi
>> index 009f1346b817..318899df9972 100644
>> --- a/arch/arm/boot/dts/bcm-cygnus.dtsi
>> +++ b/arch/arm/boot/dts/bcm-cygnus.dtsi
>> @@ -142,6 +142,56 @@
>>                         interrupts = <0>;
>>                 };
>>
>> +               mdio: mdio@18002000 {
>> +                       compatible = "brcm,iproc-mdio";
>> +                       reg = <0x18002000 0x8>;
>> +                       #size-cells = <1>;
>> +                       #address-cells = <0>;
>> +
>> +                       gphy0: eth-gphy@0 {
>
>
>    The node anmes must be generic, the DT spec has standardized
> "ethernet-phy" name for this case.
>
>> +                               reg = <0>;
>> +                               max-speed = <1000>;
>> +                       };
>> +
>> +                       gphy1: eth-gphy@1 {
>> +                               reg = <1>;
>> +                               max-speed = <1000>;
>> +                       };
>> +               };
>
> [...]
>>
>> @@ -295,6 +345,16 @@
>>                         status = "disabled";
>>                 };
>>
>> +               eth0: enet@18042000 {
>> +                       compatible = "brcm,amac";
>> +                       reg = <0x18042000 0x1000>,
>> +                             <0x18110000 0x1000>;
>> +                       reg-names = "amac_base", "idm_base";
>
>
>    I don't think "_base" suffixes are necessary here.

100% necessary, per the driver.  See
drivers/net/ethernet/broadcom/bgmac-platform.c


>
> [...]
>
> MBR, Sergei
>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v4 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: David Miller @ 2017-04-25 15:17 UTC (permalink / raw)
  To: Jason; +Cc: netdev, linux-kernel, David.Laight, kernel-hardening
In-Reply-To: <20170425150419.2060-1-Jason@zx2c4.com>


John this isn't how it works.

When you submit new versions of a patch that are part of a patch
series, you must resubmit the entire series not just the patches which
are changing.

Thanks.

^ permalink raw reply

* Re: [PATCH net] sfc: tx ring can only have 2048 entries for all EF10 NICs
From: David Miller @ 2017-04-25 15:19 UTC (permalink / raw)
  To: bkenward; +Cc: davem, netdev, linux-net-drivers, ptalbert
In-Reply-To: <8a13989f-959d-7d93-c9a7-7e890514d99c@solarflare.com>

From: Bert Kenward <bkenward@solarflare.com>
Date: Tue, 25 Apr 2017 13:44:54 +0100

> Fixes: dd248f1bc65b ("sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC")
> Reported-by: Patrick Talbert <ptalbert@redhat.com>
> Signed-off-by: Bert Kenward <bkenward@solarflare.com>

Applied, thanks.

^ permalink raw reply

* Re: macvlan: Fix device ref leak when purging bc_queue
From: Joe.Ghalam @ 2017-04-25 15:19 UTC (permalink / raw)
  To: davem, herbert; +Cc: Clifford.Wichmann, netdev
In-Reply-To: <20170425.104254.1884694182452846750.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Sent: Tuesday, April 25, 2017 7:42 AM
To: herbert@gondor.apana.org.au
Cc: Ghalam, Joe; Wichmann, Clifford; netdev@vger.kernel.org
Subject: Re: macvlan: Fix device ref leak when purging bc_queue

> Applied and queued up for -stable, thanks Herbert.

Herbert and David,
Glad to report that we had 10 iterations of the test that was showing 100% failure with Herbert's changes, and we have not seen any failures. The fix is good.
Thanks for the help.
Joe

^ permalink raw reply

* [PATCH net] macsec: avoid heap overflow in skb_to_sgvec on receive
From: Sabrina Dubroca @ 2017-04-25 15:19 UTC (permalink / raw)
  To: netdev; +Cc: Sabrina Dubroca, Jason A. Donenfeld

The previous fix for this issue, commit 4d6fa57b4dab ("macsec: avoid
heap overflow in skb_to_sgvec"), doesn't really fix much. It removed the
NETIF_F_FRAGLIST flag from MACsec device features, but this flag isn't
checked anywhere in the codepaths leading to a macsec_decrypt() call.

On TX, macsec could already handle a frag_list because an skb with a
frag_list will get linearized in skb_copy_expand() since it lacks the
necessary tailroom. Removing the NETIF_F_FRAGLIST makes sure
macsec_encrypt() will never see a frag_list.

On RX, we can simply get the number of necessary scatterlist items by
calling skb_cow_data().

Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Fixes: CVE-2017-7477
Reported-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 drivers/net/macsec.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index dbab05afcdbe..c9cc40d2349c 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -617,7 +617,8 @@ static void macsec_encrypt_done(struct crypto_async_request *base, int err)
 
 static struct aead_request *macsec_alloc_req(struct crypto_aead *tfm,
 					     unsigned char **iv,
-					     struct scatterlist **sg)
+					     struct scatterlist **sg,
+					     int nfrags)
 {
 	size_t size, iv_offset, sg_offset;
 	struct aead_request *req;
@@ -629,7 +630,7 @@ static struct aead_request *macsec_alloc_req(struct crypto_aead *tfm,
 
 	size = ALIGN(size, __alignof__(struct scatterlist));
 	sg_offset = size;
-	size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1);
+	size += sizeof(struct scatterlist) * nfrags;
 
 	tmp = kmalloc(size, GFP_ATOMIC);
 	if (!tmp)
@@ -723,7 +724,7 @@ static struct sk_buff *macsec_encrypt(struct sk_buff *skb,
 		return ERR_PTR(-EINVAL);
 	}
 
-	req = macsec_alloc_req(tx_sa->key.tfm, &iv, &sg);
+	req = macsec_alloc_req(tx_sa->key.tfm, &iv, &sg, MAX_SKB_FRAGS + 1);
 	if (!req) {
 		macsec_txsa_put(tx_sa);
 		kfree_skb(skb);
@@ -921,13 +922,21 @@ static struct sk_buff *macsec_decrypt(struct sk_buff *skb,
 	struct aead_request *req;
 	struct macsec_eth_header *hdr;
 	u16 icv_len = secy->icv_len;
+	struct sk_buff *trailer;
+	int nfrags;
 
 	macsec_skb_cb(skb)->valid = false;
 	skb = skb_share_check(skb, GFP_ATOMIC);
 	if (!skb)
 		return ERR_PTR(-ENOMEM);
 
-	req = macsec_alloc_req(rx_sa->key.tfm, &iv, &sg);
+	nfrags = skb_cow_data(skb, 0, &trailer);
+	if (nfrags < 0) {
+		kfree_skb(skb);
+		return ERR_PTR(nfrags);
+	}
+
+	req = macsec_alloc_req(rx_sa->key.tfm, &iv, &sg, nfrags);
 	if (!req) {
 		kfree_skb(skb);
 		return ERR_PTR(-ENOMEM);
@@ -936,7 +945,7 @@ static struct sk_buff *macsec_decrypt(struct sk_buff *skb,
 	hdr = (struct macsec_eth_header *)skb->data;
 	macsec_fill_iv(iv, sci, ntohl(hdr->packet_number));
 
-	sg_init_table(sg, MAX_SKB_FRAGS + 1);
+	sg_init_table(sg, nfrags);
 	skb_to_sgvec(skb, sg, 0, skb->len);
 
 	if (hdr->tci_an & MACSEC_TCI_E) {
-- 
2.12.2

^ permalink raw reply related

* Re: pull-request: can 2017-04-25
From: David Miller @ 2017-04-25 15:21 UTC (permalink / raw)
  To: mkl; +Cc: netdev, linux-can, kernel
In-Reply-To: <20170425121645.9946-1-mkl@pengutronix.de>

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Tue, 25 Apr 2017 14:16:42 +0200

> this is a pull request of three patches for net/master.
> 
> There are two patches by Stephane Grosjean for that add a new variant to the
> PCAN-Chip USB driver. The other patch is by Maksim Salau, which swtiches the
> memory for USB transfers from heap to stack.

I think you meant "from stack to heap", but anyways pulled thanks!

^ permalink raw reply

* [PATCH] macsec: dynamically allocate space for sglist
From: Jason A. Donenfeld @ 2017-04-25 15:23 UTC (permalink / raw)
  To: Netdev, LKML, David Miller, stable, security
  Cc: Jason A. Donenfeld, Sabrina Dubroca
In-Reply-To: <CAHmME9rZ29X=R-quViXaEO_ouxeB2EpzR2bkR93rrN4n--QDqw@mail.gmail.com>

We call skb_cow_data, which is good anyway to ensure we can actually
modify the skb as such (another error from prior). Now that we have the
number of fragments required, we can safely allocate exactly that amount
of memory.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: security@kernel.org
Cc: stable@vger.kernel.org
---
 drivers/net/macsec.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index dbab05afcdbe..56dafdee4c9c 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -617,7 +617,8 @@ static void macsec_encrypt_done(struct crypto_async_request *base, int err)
 
 static struct aead_request *macsec_alloc_req(struct crypto_aead *tfm,
 					     unsigned char **iv,
-					     struct scatterlist **sg)
+					     struct scatterlist **sg,
+					     int num_frags)
 {
 	size_t size, iv_offset, sg_offset;
 	struct aead_request *req;
@@ -629,7 +630,7 @@ static struct aead_request *macsec_alloc_req(struct crypto_aead *tfm,
 
 	size = ALIGN(size, __alignof__(struct scatterlist));
 	sg_offset = size;
-	size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1);
+	size += sizeof(struct scatterlist) * num_frags;
 
 	tmp = kmalloc(size, GFP_ATOMIC);
 	if (!tmp)
@@ -649,6 +650,7 @@ static struct sk_buff *macsec_encrypt(struct sk_buff *skb,
 {
 	int ret;
 	struct scatterlist *sg;
+	struct sk_buff *trailer;
 	unsigned char *iv;
 	struct ethhdr *eth;
 	struct macsec_eth_header *hh;
@@ -723,7 +725,14 @@ static struct sk_buff *macsec_encrypt(struct sk_buff *skb,
 		return ERR_PTR(-EINVAL);
 	}
 
-	req = macsec_alloc_req(tx_sa->key.tfm, &iv, &sg);
+	ret = skb_cow_data(skb, 0, &trailer);
+	if (unlikely(ret < 0)) {
+		macsec_txsa_put(tx_sa);
+		kfree_skb(skb);
+		return ERR_PTR(ret);
+	}
+
+	req = macsec_alloc_req(tx_sa->key.tfm, &iv, &sg, ret);
 	if (!req) {
 		macsec_txsa_put(tx_sa);
 		kfree_skb(skb);
@@ -917,6 +926,7 @@ static struct sk_buff *macsec_decrypt(struct sk_buff *skb,
 {
 	int ret;
 	struct scatterlist *sg;
+	struct sk_buff *trailer;
 	unsigned char *iv;
 	struct aead_request *req;
 	struct macsec_eth_header *hdr;
@@ -927,7 +937,12 @@ static struct sk_buff *macsec_decrypt(struct sk_buff *skb,
 	if (!skb)
 		return ERR_PTR(-ENOMEM);
 
-	req = macsec_alloc_req(rx_sa->key.tfm, &iv, &sg);
+	ret = skb_cow_data(skb, 0, &trailer);
+	if (unlikely(ret < 0)) {
+		kfree_skb(skb);
+		return ERR_PTR(ret);
+	}
+	req = macsec_alloc_req(rx_sa->key.tfm, &iv, &sg, ret);
 	if (!req) {
 		kfree_skb(skb);
 		return ERR_PTR(-ENOMEM);
@@ -2716,7 +2731,7 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb,
 }
 
 #define MACSEC_FEATURES \
-	(NETIF_F_SG | NETIF_F_HIGHDMA)
+	(NETIF_F_SG | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST)
 static struct lock_class_key macsec_netdev_addr_lock_key;
 
 static int macsec_dev_init(struct net_device *dev)
-- 
2.12.2

^ permalink raw reply related

* Re: [PATCH 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Sergei Shtylyov @ 2017-04-25 15:23 UTC (permalink / raw)
  To: Jon Mason
  Cc: Mark Rutland, Andrew Lunn, Florian Fainelli, Scott Branden,
	Vivien Didelot, Jon Mason, Network Development, open list,
	Eric Anholt,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Rob Herring, BCM Kernel Feedback, Ray Jui, linux-arm-kernel
In-Reply-To: <CAC3K-4qn5oCXvyMCkFESGqMHqHXGiNUW6wfRx2LUZ8qZ39qroQ@mail.gmail.com>

Hello!

On 04/25/2017 06:15 PM, Jon Mason wrote:

>>> Cygnus has a single amac controller connected to the B53 switch with 2
>>> PHYs.  On the BCM911360_EP platform, those two PHYs are connected to
>>> the external ethernet jacks.

[...]

>>> Signed-off-by: Eric Anholt <eric@anholt.net>
>>> ---
>>>  arch/arm/boot/dts/bcm-cygnus.dtsi      | 60
>>> ++++++++++++++++++++++++++++++++++
>>>  arch/arm/boot/dts/bcm911360_entphn.dts |  8 +++++
>>>  2 files changed, 68 insertions(+)
>>>
>>> diff --git a/arch/arm/boot/dts/bcm-cygnus.dtsi
>>> b/arch/arm/boot/dts/bcm-cygnus.dtsi
>>> index 009f1346b817..318899df9972 100644
>>> --- a/arch/arm/boot/dts/bcm-cygnus.dtsi
>>> +++ b/arch/arm/boot/dts/bcm-cygnus.dtsi
[...]
>>> @@ -295,6 +345,16 @@
>>>                         status = "disabled";
>>>                 };
>>>
>>> +               eth0: enet@18042000 {
>>> +                       compatible = "brcm,amac";
>>> +                       reg = <0x18042000 0x1000>,
>>> +                             <0x18110000 0x1000>;
>>> +                       reg-names = "amac_base", "idm_base";
>>
>>
>>    I don't think "_base" suffixes are necessary here.
>
> 100% necessary, per the driver.  See
> drivers/net/ethernet/broadcom/bgmac-platform.c

    I'd recommend to fix the driver/bindings then...

MBR, Sergei

^ permalink raw reply

* Re: [PATCH v1] net: phy: fix auto-negotiation stall due to unavailable interrupt
From: Alexander Kochetkov @ 2017-04-25 15:25 UTC (permalink / raw)
  To: David Miller
  Cc: Florian Fainelli, netdev, LKML, Sergei Shtylyov, Roger Quadros,
	Madalin Bucur, Alexandre Belloni
In-Reply-To: <20170425.103644.75674234677063517.davem@davemloft.net>

Hello David!

> 25 апр. 2017 г., в 17:36, David Miller <davem@davemloft.net> написал(а):
> 
> So... what are we doing here?
> 
> My understanding is that this should fix the same problem that commit
> 99f81afc139c6edd14d77a91ee91685a414a1c66 ("phy: micrel: Disable auto
> negotiation on startup") fixed and that this micrel commit should thus
> be reverted to improve MAC startup times which regressed.
> 
> Florian, any guidance?

Yes, this should be done.

I aksed Alexandre to check if 99f81afc139c6edd14d77a91ee91685a414a1c66 ("phy: micrel: Disable auto
negotiation on startup») can be reverted, and he answered what it may do that
sometime this/next week.

Alexander.

^ permalink raw reply

* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
From: Stephen Hemminger @ 2017-04-25 15:27 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: netdev
In-Reply-To: <58F9FD64.80506@aoifes.com>

On Fri, 21 Apr 2017 14:39:00 +0200
Rafa Corvillo <rafael.corvillo@aoifes.com> wrote:

> We are working in an ARMv7 embedded system running kernel 4.9 (LEDE build).
> It is an imx6 board with 2 ethernet interfaces. One of them is connected to
> a Marvell switch.
> 
> The schema of the system is the following:
> 
>   +-------------------+ eth0
>   |                   +--+
>   |                   |  |
>   | Embedded system   +--+
>   |                   |
>   |      ARMv7        |
>   |                   | Marvell 88E8057(sky2) +-------------+
>   |                   +--+ +--+             +--+ eth1
>   |                   |  +---------------------+ |             |  +------+
>   |                   +--+      CPU port       +--+ mv88e6176  +--+
>   +------+--+---------+ |             |
> emulated|  | |             |
> GPIO    +--+ +--+             +--+ eth2
> MDIO      +-----------------------------------+ |             |  +------+
>                                MDIO +--+             +--+
> +-------------+
> 
> There is a bridge (br-lan) which includes eth0/eth1/eth2
> 
> If I connect the eth1/eth2, the link is up and I can do ping through it. 
> But, once
> I start sending a heavy traffic load the link fails and the kernel sends the
> following messages:
> 
> [   48.557140] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.564964] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.572110] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.579263] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.586417] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.593573] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.600718] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   54.877567] net_ratelimit: 6 callbacks suppressed
> [   54.882293] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   61.413552] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518

The status error bits are in sky2.h
0x5f20010 is
     05f2 frame length => 1522 
     0010 Too long err

That means the packet was longer than the configured MTU.
You are probably getting packets with VLAN tag but have not configured
a VLAN.

^ permalink raw reply

* Re: [PATCH net-next 1/1] net sched actions: Complete the JUMPX opcode
From: David Miller @ 2017-04-25 15:30 UTC (permalink / raw)
  To: jhs; +Cc: netdev, jiri, xiyou.wangcong, daniel, alexei.starovoitov
In-Reply-To: <1492967848-10165-1-git-send-email-jhs@emojatatu.com>

From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Sun, 23 Apr 2017 13:17:28 -0400

> From: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> per discussion at netconf/netdev:
> When we have an action that is capable of branching (example a policer),
> we can achieve a continuation of the action graph by programming a
> "continue" where we find an exact replica of the same filter rule with a lower
> priority and the remainder of the action graph. When you have 100s of thousands
> of filters which require such a feature it gets very inefficient to do two
> lookups.
> 
> This patch completes a leftover feature of action codes. Its time has come.
> 
> Example below where a user labels packets with a different skbmark on ingress
> of a port depending on whether they have/not exceeded the configured rate.
> This mark is then used to make further decisions on some egress port.
> 
>  #rate control, very low so we can easily see the effect
> sudo $TC actions add action police rate 1kbit burst 90k \
> conform-exceed pipe/jump 2 index 10
>  # skbedit index 11 will be used if the user conforms
> sudo $TC actions add action skbedit mark 11 ok index 11
>  # skbedit index 12 will be used if the user does not conform
> sudo $TC actions add action skbedit mark 12 ok index 12
> 
>  #lets bind the user ..
> sudo $TC filter add dev $ETH parent ffff: protocol ip prio 8 u32 \
> match ip dst 127.0.0.8/32 flowid 1:10 \
> action police index 10 \
> action skbedit index 11 \
> action skbedit index 12
> 
>  #run a ping -f and see what happens..
>  #
> jhs@foobar:~$ sudo $TC -s filter ls dev $ETH parent ffff: protocol ip
> filter pref 8 u32
> filter pref 8 u32 fh 800: ht divisor 1
> filter pref 8 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 2800 success 1005)
 ...
> Not bad, about 28% non-conforming packets..
> 
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

Applied, thanks Jamal.

^ permalink raw reply

* Re: pull-request: can 2017-04-25
From: Marc Kleine-Budde @ 2017-04-25 15:33 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-can, kernel
In-Reply-To: <20170425.112126.232541678435322858.davem@davemloft.net>


[-- Attachment #1.1: Type: text/plain, Size: 811 bytes --]

On 04/25/2017 05:21 PM, David Miller wrote:
> From: Marc Kleine-Budde <mkl@pengutronix.de>
> Date: Tue, 25 Apr 2017 14:16:42 +0200
> 
>> this is a pull request of three patches for net/master.
>>
>> There are two patches by Stephane Grosjean for that add a new variant to the
>> PCAN-Chip USB driver. The other patch is by Maksim Salau, which swtiches the
>> memory for USB transfers from heap to stack.
> 
> I think you meant "from stack to heap", but anyways pulled thanks!

Yes and that's what the patch does.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH net] team: fix memory leaks
From: David Miller @ 2017-04-25 15:35 UTC (permalink / raw)
  To: bianpan2016; +Cc: jiri, netdev, linux-kernel
In-Reply-To: <1493029756-13171-1-git-send-email-bianpan2016@163.com>

From: Pan Bian <bianpan2016@163.com>
Date: Mon, 24 Apr 2017 18:29:16 +0800

> In functions team_nl_send_port_list_get() and
> team_nl_send_options_get(), pointer skb keeps the return value of
> nlmsg_new(). When the call to genlmsg_put() fails, the memory is not
> freed(). This will result in memory leak bugs.
> 
> Fixes: 9b00cf2d1024 ("team: implement multipart netlink messages for options transfers")
> Signed-off-by: Pan Bian <bianpan2016@163.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH RFC] ptr_ring: add ptr_ring_unconsume
From: Michael S. Tsirkin @ 2017-04-25 15:35 UTC (permalink / raw)
  To: Jason Wang; +Cc: linux-kernel, netdev
In-Reply-To: <91ae69e6-4dac-3db2-4778-c4163dfe6f91@redhat.com>

On Tue, Apr 25, 2017 at 12:07:01PM +0800, Jason Wang wrote:
> 
> 
> On 2017年04月24日 20:00, Michael S. Tsirkin wrote:
> > On Mon, Apr 24, 2017 at 07:54:18PM +0800, Jason Wang wrote:
> > > On 2017年04月24日 07:28, Michael S. Tsirkin wrote:
> > > > On Tue, Apr 18, 2017 at 11:07:42AM +0800, Jason Wang wrote:
> > > > > On 2017年04月17日 07:19, Michael S. Tsirkin wrote:
> > > > > > Applications that consume a batch of entries in one go
> > > > > > can benefit from ability to return some of them back
> > > > > > into the ring.
> > > > > > 
> > > > > > Add an API for that - assuming there's space. If there's no space
> > > > > > naturally we can't do this and have to drop entries, but this implies
> > > > > > ring is full so we'd likely drop some anyway.
> > > > > > 
> > > > > > Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> > > > > > ---
> > > > > > 
> > > > > > Jason, in my mind the biggest issue with your batching patchset is the
> > > > > > backet drops on disconnect.  This API will help avoid that in the common
> > > > > > case.
> > > > > Ok, I will rebase the series on top of this. (Though I don't think we care
> > > > > the packet loss).
> > > > E.g. I care - I often start sending packets to VM before it's
> > > > fully booted. Several vhost resets might follow.
> > > Ok.
> > > 
> > > > > > I would still prefer that we understand what's going on,
> > > > > I try to reply in another thread, does it make sense?
> > > > > 
> > > > > >     and I would
> > > > > > like to know what's the smallest batch size that's still helpful,
> > > > > Yes, I've replied in another thread, the result is:
> > > > > 
> > > > > 
> > > > > no batching   1.88Mpps
> > > > > RX_BATCH=1    1.93Mpps
> > > > > RX_BATCH=4    2.11Mpps
> > > > > RX_BATCH=16   2.14Mpps
> > > > > RX_BATCH=64   2.25Mpps
> > > > > RX_BATCH=256  2.18Mpps
> > > > Essentially 4 is enough, other stuf looks more like noise
> > > > to me. What about 2?
> > > The numbers are pretty stable, so probably not noise. Retested on top of
> > > batch zeroing:
> > > 
> > > no  1.97Mpps
> > > 1   2.09Mpps
> > > 2   2.11Mpps
> > > 4   2.16Mpps
> > > 8   2.19Mpps
> > > 16  2.21Mpps
> > > 32  2.25Mpps
> > > 64  2.30Mpps
> > > 128 2.21Mpps
> > > 256 2.21Mpps
> > > 
> > > 64 performs best.
> > > 
> > > Thanks
> > OK but it might be e.g. a function of the ring size, host cache size or
> > whatever. As we don't really understand the why, if we just optimize for
> > your setup we risk regressions in others.  64 entries is a lot, it
> > increases the queue size noticeably.  Could this be part of the effect?
> > Could you try changing the queue size to see what happens?
> 
> I increase tx_queue_len to 1100, but only see less than 1% improvement on
> pps number (batch = 1) in my machine. If you care about the regression, we
> probably can leave the choice to user through e.g module parameter. But I'm
> afraid we have already had too much choices for them. Or I can test this
> with different CPU types.
> 
> Thanks
> 

I agree here. Let's keep it a constant. Testing on more machines would
be nice but not strictly required.  I just dislike not understanding why
it helps because it means we can easily break it by mistake.  So my only
request really is that you wrap access to this internal buffer in an
API. Let's see - I think we need

struct vhost_net_buf
vhost_net_buf_get_ptr
vhost_net_buf_get_size
vhost_net_buf_is_empty
vhost_net_buf_peek
vhost_net_buf_consume
vhost_net_buf_produce



-- 
MST

^ permalink raw reply

* Re: [PATCH net] macsec: avoid heap overflow in skb_to_sgvec on receive
From: Jason A. Donenfeld @ 2017-04-25 15:39 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Netdev
In-Reply-To: <f50ef5b9e525d83938f5e419557abb0307ee023b.1493133423.git.sd@queasysnail.net>

Hi Sabrina,

I think I may have beaten you to the punch here by a few minutes. :)

The difference between our two versions is that you don't re-add the
FRAGLIST attribute, whereas my patch does, and then it does the
dynamic allocation. I suspect this might be a bit more robust. It also
ensures that skb_cow_data is called on both paths. So perhaps let's
roll with mine?

Jason

^ permalink raw reply

* Aw: Re: [Intel-wired-lan] [PATCH] ixgbe: initialize u64_stats_sync structures early at ixgbe_probe
From: Lino Sanfilippo @ 2017-04-25 15:39 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Singh, Krishneil K, Song, Liwei (Wind River), Kirsher, Jeffrey T,
	netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAKgT0UeTdY9j1qDHP=DscgAZF2pPFCb+=ACySqKBK+UuBsaNBA@mail.gmail.com>

Hi,

> This patch doesn't look right to me. I would suggest rejecting it.
> 
> The call to initialize the stats should be done when the ring is
> allocated, not in ixgbe_probe(). This should probably be done in
> ixgbe_alloc_q_vector() instead.
> 

AFAICS ixgbe_alloc_q_vector() is also called in probe() (by ixgbe_init_interrupt_scheme()).
Furthermore it is also called in resume() which would lead to multiple initialization of
the u64_stats_sync in case of resume.

IMHO the u64_stats_sync variables have to be initialized before register_netdev() is called
since this is the point from which userspace can call ixgbe_get_stats64(). I would say the
best place to do so is the probe() function as it is done in this patch.

Just my 2 cents.

Regards,
Lino

^ permalink raw reply

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Doug Ledford @ 2017-04-25 15:40 UTC (permalink / raw)
  To: Or Gerlitz, Erez Shitrit, Paolo Abeni
  Cc: Honggang LI, Erez Shitrit,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Linux Netdev List, David Miller
In-Reply-To: <CAJ3xEMgwS1Bq8+eZC1iAr6xi8ZPHrchsOJ5r4LNJxR8P+6VipA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Tue, 2017-04-25 at 17:39 +0300, Or Gerlitz wrote:
> On Tue, Apr 25, 2017 at 2:43 PM, Erez Shitrit <erezsh-LDSdmyG8hGUWn1LaSxwUnA@public.gmane.org
> .il> wrote:
> > 
> > On Tue, Apr 25, 2017 at 2:14 PM, Or Gerlitz <gerlitz.or-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > wrote:
> 
> > 
> > > 
> > > thanks for the info. Is this bug there since ipoib/bonding day
> > > one (and hence my bug...)
> > > or was indeed introduced later? if later, can you explain how
> > > fc791b633515 introduced that or you only know it by bisection?
> 
> > 
> > commit "fc791b633515" changes the size of the dev_hardlen to be 24
> > and
> > required 24 extra bytes in the skb, before it was only 4, if skb is
> > aligned to eth "mode" it already has 14 bytes for hard-header.
> > So only after that commit we have the issue.
> 
> If got you right, Paolo's commit introduced a regression, so we (I
> guess you and
> Paolo) need to either solve it or we (community) should consider a
> revert, please suggest.

It's a little more complex than that.  Paolo's commit *re-introduced* a
regression.  If you recall, long ago the IPoIB layer stuck the dgid
into the skb, then pulled it out later.  However, we didn't actually
declare things properly back then, but it worked anyway.  Then we had
the commit you authored to start using the skb->cb to store the dgid,
and our usage of hard header dropped to only 4 bytes.  Paolo's commit
went back to the old way of doing things, but also did the proper
accounting and setup to tell the netstack what we were doing (which the
initial implementation never did IIRC).  So, this issue should be
reproducible either after Paolo's commit or with any kernel prior to
your commit to use the skb->cb area to store the DGID, but it probably
requires the specific series of actions in this bug to trigger it.  A
normal, clean shutdown of the interface doesn't demonstrate the issue.

> The bug is now in stable and distro kernels, so please act.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox