Netdev List
 help / color / mirror / Atom feed
* [PATCH net] macsec: avoid heap overflow in skb_to_sgvec on receive
From: Sabrina Dubroca @ 2017-04-25 15:19 UTC (permalink / raw)
  To: netdev; +Cc: Sabrina Dubroca, Jason A. Donenfeld

The previous fix for this issue, commit 4d6fa57b4dab ("macsec: avoid
heap overflow in skb_to_sgvec"), doesn't really fix much. It removed the
NETIF_F_FRAGLIST flag from MACsec device features, but this flag isn't
checked anywhere in the codepaths leading to a macsec_decrypt() call.

On TX, macsec could already handle a frag_list because an skb with a
frag_list will get linearized in skb_copy_expand() since it lacks the
necessary tailroom. Removing the NETIF_F_FRAGLIST makes sure
macsec_encrypt() will never see a frag_list.

On RX, we can simply get the number of necessary scatterlist items by
calling skb_cow_data().

Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Fixes: CVE-2017-7477
Reported-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 drivers/net/macsec.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index dbab05afcdbe..c9cc40d2349c 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -617,7 +617,8 @@ static void macsec_encrypt_done(struct crypto_async_request *base, int err)
 
 static struct aead_request *macsec_alloc_req(struct crypto_aead *tfm,
 					     unsigned char **iv,
-					     struct scatterlist **sg)
+					     struct scatterlist **sg,
+					     int nfrags)
 {
 	size_t size, iv_offset, sg_offset;
 	struct aead_request *req;
@@ -629,7 +630,7 @@ static struct aead_request *macsec_alloc_req(struct crypto_aead *tfm,
 
 	size = ALIGN(size, __alignof__(struct scatterlist));
 	sg_offset = size;
-	size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1);
+	size += sizeof(struct scatterlist) * nfrags;
 
 	tmp = kmalloc(size, GFP_ATOMIC);
 	if (!tmp)
@@ -723,7 +724,7 @@ static struct sk_buff *macsec_encrypt(struct sk_buff *skb,
 		return ERR_PTR(-EINVAL);
 	}
 
-	req = macsec_alloc_req(tx_sa->key.tfm, &iv, &sg);
+	req = macsec_alloc_req(tx_sa->key.tfm, &iv, &sg, MAX_SKB_FRAGS + 1);
 	if (!req) {
 		macsec_txsa_put(tx_sa);
 		kfree_skb(skb);
@@ -921,13 +922,21 @@ static struct sk_buff *macsec_decrypt(struct sk_buff *skb,
 	struct aead_request *req;
 	struct macsec_eth_header *hdr;
 	u16 icv_len = secy->icv_len;
+	struct sk_buff *trailer;
+	int nfrags;
 
 	macsec_skb_cb(skb)->valid = false;
 	skb = skb_share_check(skb, GFP_ATOMIC);
 	if (!skb)
 		return ERR_PTR(-ENOMEM);
 
-	req = macsec_alloc_req(rx_sa->key.tfm, &iv, &sg);
+	nfrags = skb_cow_data(skb, 0, &trailer);
+	if (nfrags < 0) {
+		kfree_skb(skb);
+		return ERR_PTR(nfrags);
+	}
+
+	req = macsec_alloc_req(rx_sa->key.tfm, &iv, &sg, nfrags);
 	if (!req) {
 		kfree_skb(skb);
 		return ERR_PTR(-ENOMEM);
@@ -936,7 +945,7 @@ static struct sk_buff *macsec_decrypt(struct sk_buff *skb,
 	hdr = (struct macsec_eth_header *)skb->data;
 	macsec_fill_iv(iv, sci, ntohl(hdr->packet_number));
 
-	sg_init_table(sg, MAX_SKB_FRAGS + 1);
+	sg_init_table(sg, nfrags);
 	skb_to_sgvec(skb, sg, 0, skb->len);
 
 	if (hdr->tci_an & MACSEC_TCI_E) {
-- 
2.12.2

^ permalink raw reply related

* Re: macvlan: Fix device ref leak when purging bc_queue
From: Joe.Ghalam @ 2017-04-25 15:19 UTC (permalink / raw)
  To: davem, herbert; +Cc: Clifford.Wichmann, netdev
In-Reply-To: <20170425.104254.1884694182452846750.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Sent: Tuesday, April 25, 2017 7:42 AM
To: herbert@gondor.apana.org.au
Cc: Ghalam, Joe; Wichmann, Clifford; netdev@vger.kernel.org
Subject: Re: macvlan: Fix device ref leak when purging bc_queue

> Applied and queued up for -stable, thanks Herbert.

Herbert and David,
Glad to report that we had 10 iterations of the test that was showing 100% failure with Herbert's changes, and we have not seen any failures. The fix is good.
Thanks for the help.
Joe

^ permalink raw reply

* Re: [PATCH net] sfc: tx ring can only have 2048 entries for all EF10 NICs
From: David Miller @ 2017-04-25 15:19 UTC (permalink / raw)
  To: bkenward; +Cc: davem, netdev, linux-net-drivers, ptalbert
In-Reply-To: <8a13989f-959d-7d93-c9a7-7e890514d99c@solarflare.com>

From: Bert Kenward <bkenward@solarflare.com>
Date: Tue, 25 Apr 2017 13:44:54 +0100

> Fixes: dd248f1bc65b ("sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC")
> Reported-by: Patrick Talbert <ptalbert@redhat.com>
> Signed-off-by: Bert Kenward <bkenward@solarflare.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v4 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: David Miller @ 2017-04-25 15:17 UTC (permalink / raw)
  To: Jason; +Cc: netdev, linux-kernel, David.Laight, kernel-hardening
In-Reply-To: <20170425150419.2060-1-Jason@zx2c4.com>


John this isn't how it works.

When you submit new versions of a patch that are part of a patch
series, you must resubmit the entire series not just the patches which
are changing.

Thanks.

^ permalink raw reply

* Re: [PATCH 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Jon Mason @ 2017-04-25 15:15 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Eric Anholt, Florian Fainelli, Vivien Didelot, Andrew Lunn,
	Network Development, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	linux-arm-kernel, open list, BCM Kernel Feedback, Ray Jui,
	Scott Branden, Jon Mason
In-Reply-To: <f870339f-2fa1-411d-5ce8-adbe8802d8a8-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>

On Tue, Apr 25, 2017 at 5:40 AM, Sergei Shtylyov
<sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org> wrote:
> Hello.
>
> On 4/25/2017 12:50 AM, Eric Anholt wrote:
>
>> Cygnus has a single amac controller connected to the B53 switch with 2
>> PHYs.  On the BCM911360_EP platform, those two PHYs are connected to
>> the external ethernet jacks.
>
>
>    My spell checker trips on "amac" and "ethernet" -- perhaps they need
> capitalization?
>
>> Signed-off-by: Eric Anholt <eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
>> ---
>>  arch/arm/boot/dts/bcm-cygnus.dtsi      | 60
>> ++++++++++++++++++++++++++++++++++
>>  arch/arm/boot/dts/bcm911360_entphn.dts |  8 +++++
>>  2 files changed, 68 insertions(+)
>>
>> diff --git a/arch/arm/boot/dts/bcm-cygnus.dtsi
>> b/arch/arm/boot/dts/bcm-cygnus.dtsi
>> index 009f1346b817..318899df9972 100644
>> --- a/arch/arm/boot/dts/bcm-cygnus.dtsi
>> +++ b/arch/arm/boot/dts/bcm-cygnus.dtsi
>> @@ -142,6 +142,56 @@
>>                         interrupts = <0>;
>>                 };
>>
>> +               mdio: mdio@18002000 {
>> +                       compatible = "brcm,iproc-mdio";
>> +                       reg = <0x18002000 0x8>;
>> +                       #size-cells = <1>;
>> +                       #address-cells = <0>;
>> +
>> +                       gphy0: eth-gphy@0 {
>
>
>    The node anmes must be generic, the DT spec has standardized
> "ethernet-phy" name for this case.
>
>> +                               reg = <0>;
>> +                               max-speed = <1000>;
>> +                       };
>> +
>> +                       gphy1: eth-gphy@1 {
>> +                               reg = <1>;
>> +                               max-speed = <1000>;
>> +                       };
>> +               };
>
> [...]
>>
>> @@ -295,6 +345,16 @@
>>                         status = "disabled";
>>                 };
>>
>> +               eth0: enet@18042000 {
>> +                       compatible = "brcm,amac";
>> +                       reg = <0x18042000 0x1000>,
>> +                             <0x18110000 0x1000>;
>> +                       reg-names = "amac_base", "idm_base";
>
>
>    I don't think "_base" suffixes are necessary here.

100% necessary, per the driver.  See
drivers/net/ethernet/broadcom/bgmac-platform.c


>
> [...]
>
> MBR, Sergei
>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] macsec: avoid heap overflow in skb_to_sgvec
From: Jason A. Donenfeld @ 2017-04-25 15:13 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Netdev, LKML, David Miller, stable, security
In-Reply-To: <20170425151248.GB25241@bistromath.localdomain>

On Tue, Apr 25, 2017 at 5:12 PM, Sabrina Dubroca <sd@queasysnail.net> wrote:
>> https://patchwork.ozlabs.org/patch/754861/
>
> Yes, that prevents the overflow, but now you're just dropping
> packets.

Right, it's a so-called "defense-in-depth" measure.

> I'll review that later, let's fix the overflow without
> breaking connectivity for now.

Agreed.

^ permalink raw reply

* Re: [PATCH] macsec: avoid heap overflow in skb_to_sgvec
From: Sabrina Dubroca @ 2017-04-25 15:12 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Netdev, LKML, David Miller, stable, security
In-Reply-To: <CAHmME9ovUDJRDA+Td4Y0bfQZEG5pZQo1JC0nYDVqWMOdxAe5kQ@mail.gmail.com>

2017-04-25, 17:08:28 +0200, Jason A. Donenfeld wrote:
> Hi Sabrina,
> 
> On Tue, Apr 25, 2017 at 4:53 PM, Sabrina Dubroca <sd@queasysnail.net> wrote:
> > Ugh, good catch :/
> >
> > AFAICT this patch doesn't really help, because NETIF_F_FRAGLIST
> > doesn't get tested in paths that can lead to triggering this.
> 
> You're right. This fixes the xmit() path, but not the receive path,
> which appears to take skbs directly from the upper device.
> 
> > I'll post a patch to allocate a properly-sized sg array.
> 
> I just posted this series, which should fix things in a robust way:
> 
> https://patchwork.ozlabs.org/patch/754861/

Yes, that prevents the overflow, but now you're just dropping
packets. I'll review that later, let's fix the overflow without
breaking connectivity for now.

-- 
Sabrina

^ permalink raw reply

* [PATCH net-next 2/2] ibmvnic: Move initialization of sub crqs to ibmvnic_init
From: Nathan Fontenot @ 2017-04-25 19:01 UTC (permalink / raw)
  To: netdev; +Cc: jallen, tlfalcon
In-Reply-To: <20170425185704.41126.65738.stgit@ltcalpine2-lp23.aus.stglabs.ibm.com>

The sub crq structures are initialized in interrupt context while
handling the response to crqs when negotiating capabilities for
the driver. The sub crqs do not need to be initialized at this point
and can be moved to being done from ibmvnic_init. Moving the init
of the sub crqs to ibmvnic_init also allows use to allocate the
memory with GFP_KERNEL instead of GFP_ATOMIC.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c |   23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index c2e260c..4fcd2f0 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1390,12 +1390,12 @@ static struct ibmvnic_sub_crq_queue *init_sub_crq_queue(struct ibmvnic_adapter
 	struct ibmvnic_sub_crq_queue *scrq;
 	int rc;
 
-	scrq = kzalloc(sizeof(*scrq), GFP_ATOMIC);
+	scrq = kzalloc(sizeof(*scrq), GFP_KERNEL);
 	if (!scrq)
 		return NULL;
 
 	scrq->msgs =
-		(union sub_crq *)__get_free_pages(GFP_ATOMIC | __GFP_ZERO, 2);
+		(union sub_crq *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 2);
 	if (!scrq->msgs) {
 		dev_warn(dev, "Couldn't allocate crq queue messages page\n");
 		goto zero_page_failed;
@@ -1689,7 +1689,7 @@ static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 
 	total_queues = adapter->req_tx_queues + adapter->req_rx_queues;
 
-	allqueues = kcalloc(total_queues, sizeof(*allqueues), GFP_ATOMIC);
+	allqueues = kcalloc(total_queues, sizeof(*allqueues), GFP_KERNEL);
 	if (!allqueues)
 		return -1;
 
@@ -1729,7 +1729,7 @@ static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 	}
 
 	adapter->tx_scrq = kcalloc(adapter->req_tx_queues,
-				   sizeof(*adapter->tx_scrq), GFP_ATOMIC);
+				   sizeof(*adapter->tx_scrq), GFP_KERNEL);
 	if (!adapter->tx_scrq)
 		goto tx_failed;
 
@@ -1739,7 +1739,7 @@ static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 	}
 
 	adapter->rx_scrq = kcalloc(adapter->req_rx_queues,
-				   sizeof(*adapter->rx_scrq), GFP_ATOMIC);
+				   sizeof(*adapter->rx_scrq), GFP_KERNEL);
 	if (!adapter->rx_scrq)
 		goto rx_failed;
 
@@ -1765,7 +1765,6 @@ static void ibmvnic_send_req_caps(struct ibmvnic_adapter *adapter, int retry)
 {
 	struct device *dev = &adapter->vdev->dev;
 	union ibmvnic_crq crq;
-	int rc;
 
 	if (!retry) {
 		/* Sub-CRQ entries are 32 byte long */
@@ -1794,10 +1793,6 @@ static void ibmvnic_send_req_caps(struct ibmvnic_adapter *adapter, int retry)
 		adapter->req_mtu = adapter->netdev->mtu + ETH_HLEN;
 	}
 
-	rc = init_sub_crqs(adapter);
-	if (rc)
-		return;
-
 	memset(&crq, 0, sizeof(crq));
 	crq.request_capability.first = IBMVNIC_CRQ_CMD;
 	crq.request_capability.cmd = REQUEST_CAPABILITY;
@@ -3317,7 +3312,13 @@ static int ibmvnic_init(struct ibmvnic_adapter *adapter)
 		return -1;
 	}
 
-	return 0;
+	rc = init_sub_crqs(adapter);
+	if (rc) {
+		dev_err(dev, "Initialization of sub crqs failed\n");
+		release_crq_queue(adapter);
+	}
+
+	return rc;
 }
 
 static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id)

^ permalink raw reply related

* [PATCH net-next 1/2] ibmvnic: Split initialization of scrqs to its own routine
From: Nathan Fontenot @ 2017-04-25 19:01 UTC (permalink / raw)
  To: netdev; +Cc: jallen, tlfalcon
In-Reply-To: <20170425185704.41126.65738.stgit@ltcalpine2-lp23.aus.stglabs.ibm.com>

Split the sending of capability request crqs and the initialization
of sub crqs into their own routines. This is a first step to moving
the allocation of sub-crqs out of interrupt context.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c |  101 +++++++++++++++++++-----------------
 1 file changed, 54 insertions(+), 47 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 51bf337..c2e260c 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1678,48 +1678,20 @@ static int init_sub_crq_irqs(struct ibmvnic_adapter *adapter)
 	return rc;
 }
 
-static void init_sub_crqs(struct ibmvnic_adapter *adapter, int retry)
+static int init_sub_crqs(struct ibmvnic_adapter *adapter)
 {
 	struct device *dev = &adapter->vdev->dev;
 	struct ibmvnic_sub_crq_queue **allqueues;
 	int registered_queues = 0;
-	union ibmvnic_crq crq;
 	int total_queues;
 	int more = 0;
 	int i;
 
-	if (!retry) {
-		/* Sub-CRQ entries are 32 byte long */
-		int entries_page = 4 * PAGE_SIZE / (sizeof(u64) * 4);
-
-		if (adapter->min_tx_entries_per_subcrq > entries_page ||
-		    adapter->min_rx_add_entries_per_subcrq > entries_page) {
-			dev_err(dev, "Fatal, invalid entries per sub-crq\n");
-			goto allqueues_failed;
-		}
-
-		/* Get the minimum between the queried max and the entries
-		 * that fit in our PAGE_SIZE
-		 */
-		adapter->req_tx_entries_per_subcrq =
-		    adapter->max_tx_entries_per_subcrq > entries_page ?
-		    entries_page : adapter->max_tx_entries_per_subcrq;
-		adapter->req_rx_add_entries_per_subcrq =
-		    adapter->max_rx_add_entries_per_subcrq > entries_page ?
-		    entries_page : adapter->max_rx_add_entries_per_subcrq;
-
-		adapter->req_tx_queues = adapter->opt_tx_comp_sub_queues;
-		adapter->req_rx_queues = adapter->opt_rx_comp_queues;
-		adapter->req_rx_add_queues = adapter->max_rx_add_queues;
-
-		adapter->req_mtu = adapter->netdev->mtu + ETH_HLEN;
-	}
-
 	total_queues = adapter->req_tx_queues + adapter->req_rx_queues;
 
 	allqueues = kcalloc(total_queues, sizeof(*allqueues), GFP_ATOMIC);
 	if (!allqueues)
-		goto allqueues_failed;
+		return -1;
 
 	for (i = 0; i < total_queues; i++) {
 		allqueues[i] = init_sub_crq_queue(adapter);
@@ -1776,6 +1748,56 @@ static void init_sub_crqs(struct ibmvnic_adapter *adapter, int retry)
 		adapter->rx_scrq[i]->scrq_num = i;
 	}
 
+	kfree(allqueues);
+	return 0;
+
+rx_failed:
+	kfree(adapter->tx_scrq);
+	adapter->tx_scrq = NULL;
+tx_failed:
+	for (i = 0; i < registered_queues; i++)
+		release_sub_crq_queue(adapter, allqueues[i]);
+	kfree(allqueues);
+	return -1;
+}
+
+static void ibmvnic_send_req_caps(struct ibmvnic_adapter *adapter, int retry)
+{
+	struct device *dev = &adapter->vdev->dev;
+	union ibmvnic_crq crq;
+	int rc;
+
+	if (!retry) {
+		/* Sub-CRQ entries are 32 byte long */
+		int entries_page = 4 * PAGE_SIZE / (sizeof(u64) * 4);
+
+		if (adapter->min_tx_entries_per_subcrq > entries_page ||
+		    adapter->min_rx_add_entries_per_subcrq > entries_page) {
+			dev_err(dev, "Fatal, invalid entries per sub-crq\n");
+			return;
+		}
+
+		/* Get the minimum between the queried max and the entries
+		 * that fit in our PAGE_SIZE
+		 */
+		adapter->req_tx_entries_per_subcrq =
+		    adapter->max_tx_entries_per_subcrq > entries_page ?
+		    entries_page : adapter->max_tx_entries_per_subcrq;
+		adapter->req_rx_add_entries_per_subcrq =
+		    adapter->max_rx_add_entries_per_subcrq > entries_page ?
+		    entries_page : adapter->max_rx_add_entries_per_subcrq;
+
+		adapter->req_tx_queues = adapter->opt_tx_comp_sub_queues;
+		adapter->req_rx_queues = adapter->opt_rx_comp_queues;
+		adapter->req_rx_add_queues = adapter->max_rx_add_queues;
+
+		adapter->req_mtu = adapter->netdev->mtu + ETH_HLEN;
+	}
+
+	rc = init_sub_crqs(adapter);
+	if (rc)
+		return;
+
 	memset(&crq, 0, sizeof(crq));
 	crq.request_capability.first = IBMVNIC_CRQ_CMD;
 	crq.request_capability.cmd = REQUEST_CAPABILITY;
@@ -1829,20 +1851,6 @@ static void init_sub_crqs(struct ibmvnic_adapter *adapter, int retry)
 		atomic_inc(&adapter->running_cap_crqs);
 		ibmvnic_send_crq(adapter, &crq);
 	}
-
-	kfree(allqueues);
-
-	return;
-
-rx_failed:
-	kfree(adapter->tx_scrq);
-	adapter->tx_scrq = NULL;
-tx_failed:
-	for (i = 0; i < registered_queues; i++)
-		release_sub_crq_queue(adapter, allqueues[i]);
-	kfree(allqueues);
-allqueues_failed:
-	ibmvnic_remove(adapter->vdev);
 }
 
 static int pending_scrq(struct ibmvnic_adapter *adapter,
@@ -2568,7 +2576,7 @@ static void handle_request_cap_rsp(union ibmvnic_crq *crq,
 					       number), name);
 		release_sub_crqs(adapter);
 		*req_value = be64_to_cpu(crq->request_capability_rsp.number);
-		init_sub_crqs(adapter, 1);
+		ibmvnic_send_req_caps(adapter, 1);
 		return;
 	default:
 		dev_err(dev, "Error %d in request cap rsp\n",
@@ -2881,8 +2889,7 @@ static void handle_query_cap_rsp(union ibmvnic_crq *crq,
 out:
 	if (atomic_read(&adapter->running_cap_crqs) == 0) {
 		adapter->wait_capability = false;
-		init_sub_crqs(adapter, 0);
-		/* We're done querying the capabilities, initialize sub-crqs */
+		ibmvnic_send_req_caps(adapter, 0);
 	}
 }
 

^ permalink raw reply related

* [PATCH net-next 0/2] Move sub crq init out of interrupt context
From: Nathan Fontenot @ 2017-04-25 19:00 UTC (permalink / raw)
  To: netdev; +Cc: jallen, tlfalcon

The sub crqs are currently intialized in interrupt context when
handling a crq response fromn the vios server. There is no reason
they must be initialized there.

Moving the initialization of the sub crqs to the ibmvnic_init routine
allows us to do the initialization outside of interrupt context and
make all of the allocations with GFP_KERNEL instead of GFP_ATOMIC.

-Nathan

---

Nathan Fontenot (2):
      ibmvnic: Split initialization of scrqs to its own routine
      ibmvnic: Move initialization of sub crqs to ibmvnic_init


 drivers/net/ethernet/ibm/ibmvnic.c |  114 +++++++++++++++++++-----------------
 1 file changed, 61 insertions(+), 53 deletions(-)

^ permalink raw reply

* Re: [PATCH] macsec: avoid heap overflow in skb_to_sgvec
From: Jason A. Donenfeld @ 2017-04-25 15:08 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Netdev, LKML, David Miller, stable, security
In-Reply-To: <20170425145340.GA25241@bistromath.localdomain>

Hi Sabrina,

On Tue, Apr 25, 2017 at 4:53 PM, Sabrina Dubroca <sd@queasysnail.net> wrote:
> Ugh, good catch :/
>
> AFAICT this patch doesn't really help, because NETIF_F_FRAGLIST
> doesn't get tested in paths that can lead to triggering this.

You're right. This fixes the xmit() path, but not the receive path,
which appears to take skbs directly from the upper device.

> I'll post a patch to allocate a properly-sized sg array.

I just posted this series, which should fix things in a robust way:

https://patchwork.ozlabs.org/patch/754861/

If you want to handle fraglists, it might be a decent idea to allocate
things of the proper size, I guess. It's a good opportunity to call
skb_cow_data, which you need to do anyway when modifying skbs, I
think.

Jason

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH] ixgbe: initialize u64_stats_sync structures early at ixgbe_probe
From: Alexander Duyck @ 2017-04-25 15:07 UTC (permalink / raw)
  To: Singh, Krishneil K
  Cc: Song, Liwei (Wind River), Kirsher, Jeffrey T,
	netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <EF50CC0C76A53A44BAD798C58E0A157E50B87CBB@ORSMSX103.amr.corp.intel.com>

On Mon, Apr 24, 2017 at 4:00 PM, Singh, Krishneil K
<krishneil.k.singh@intel.com> wrote:
>
>
>
> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@lists.osuosl.org] On Behalf Of Liwei Song
> Sent: Sunday, December 4, 2016 7:41 PM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>
> Cc: netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org; linux-kernel@vger.kernel.org; Song, Liwei (Wind River) <liwei.song@windriver.com>
> Subject: [Intel-wired-lan] [PATCH] ixgbe: initialize u64_stats_sync structures early at ixgbe_probe
>
> Fix the following CallTrace:
> INFO: trying to register non-static key.
> the code is fine but needs lockdep annotation.
> turning off the locking correctness validator.
> CPU: 71 PID: 1 Comm: swapper/0 Not tainted 4.8.8-WR9.0.0.1_standard #11 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
>  00200086 00200086 eb5e1ab8 c144dd70 00000000 00000000 eb5e1af8 c10af89a
>  c1d23de4 eb5e1af8 00000009 eb5d8600 eb5d8638 eb5e1af8 c10b14d8 00000009  0000000a c1d32911 00000000 00000000 e44c826c eb5d8000 eb5e1b74 c10b214e Call Trace:
>  [<c144dd70>] dump_stack+0x5f/0x8f
>  [<c10af89a>] register_lock_class+0x25a/0x4c0  [<c10b14d8>] ? check_irq_usage+0x88/0xc0  [<c10b214e>] __lock_acquire+0x5e/0x17a0  [<c1abdb9b>] ? _raw_spin_unlock_irqrestore+0x3b/0x70
>  [<c10cf14a>] ? rcu_read_lock_sched_held+0x8a/0x90
>  [<c10b3c5f>] lock_acquire+0x9f/0x1f0
>  [<c1922dcf>] ? dev_get_stats+0x5f/0x110  [<c176e6b3>] ixgbe_get_stats64+0x113/0x320  [<c1922dcf>] ? dev_get_stats+0x5f/0x110  [<c1922dcf>] dev_get_stats+0x5f/0x110  [<c1ab5415>] rtnl_fill_stats+0x40/0x105  [<c193dd45>] rtnl_fill_ifinfo+0x4c5/0xd20  [<c11c5115>] ? __kmalloc_node_track_caller+0x1a5/0x410
>  [<c1917487>] ? __kmalloc_reserve.isra.42+0x27/0x80
>  [<c191754f>] ? __alloc_skb+0x6f/0x270
>  [<c1942291>] rtmsg_ifinfo_build_skb+0x71/0xd0  [<c194230a>] rtmsg_ifinfo.part.23+0x1a/0x50  [<c1923dad>] ? call_netdevice_notifiers_info+0x2d/0x60
>  [<c194236b>] rtmsg_ifinfo+0x2b/0x40
>  [<c192f997>] register_netdevice+0x3d7/0x4d0  [<c192faa7>] register_netdev+0x17/0x30  [<c177b83d>] ixgbe_probe+0x118d/0x1610  [<c1498202>] local_pci_probe+0x32/0x80  [<c1498172>] ? pci_match_device+0xd2/0x100  [<c14991e0>] pci_device_probe+0xc0/0x110  [<c1652cc5>] driver_probe_device+0x1c5/0x280  [<c1498172>] ? pci_match_device+0xd2/0x100  [<c1652e09>] __driver_attach+0x89/0x90  [<c1652d80>] ? driver_probe_device+0x280/0x280  [<c165114f>] bus_for_each_dev+0x4f/0x80  [<c165269e>] driver_attach+0x1e/0x20  [<c1652d80>] ? driver_probe_device+0x280/0x280  [<c1652317>] bus_add_driver+0x1a7/0x220  [<c1653a79>] driver_register+0x59/0xe0  [<c1f897b8>] ? igb_init_module+0x49/0x49  [<c1497b2a>] __pci_register_driver+0x4a/0x50  [<c1f8985d>] ixgbe_init_module+0xa5/0xc4  [<c1000485>] do_one_initcall+0x35/0x150  [<c107e818>] ? parameq+0x18/0x70  [<c1f395d8>] ? repair_env_string+0x12/0x51  [<c107ead0>] ? parse_args+0x260/0x3b0  [<c1074f73>] ? __usermodehelper_set_disable_depth+0x43/0x50
>  [<c1f39e90>] kernel_init_freeable+0x19b/0x267  [<c1f395c6>] ? set_debug_rodata+0xf/0xf  [<c10b1e7b>] ? trace_hardirqs_on+0xb/0x10  [<c1abdc02>] ? _raw_spin_unlock_irq+0x32/0x50  [<c1085f0b>] ? finish_task_switch+0xab/0x1f0  [<c1085ec9>] ? finish_task_switch+0x69/0x1f0  [<c1ab6a30>] kernel_init+0x10/0x110  [<c108bd65>] ? schedule_tail+0x25/0x80  [<c1abe422>] ret_from_kernel_thread+0xe/0x24  [<c1ab6a20>] ? rest_init+0x130/0x130
>
> This CallTrace occurred on 32-bit kernel with CONFIG_PROVE_LOCKING enabled.
>
> This happens at ixgbe driver probe hardware stage, when comes to ixgbe_get_stats64, the seqcount/seqlock still not initialize, although this was initialize in TX/RX resources setup routin, but it was too late, then lockdep give this Warning.
>
> To fix this, move the u64_stats_init function to driver probe stage, which before we get the status of seqcount and after the RX/TX ring was finished init.
>
> Signed-off-by: Liwei Song <liwei.song@windriver.com>
> ---
>
> Tested-by: Krishneil Singh  <krishneil.k.singh@intel.com>

This patch doesn't look right to me. I would suggest rejecting it.

The call to initialize the stats should be done when the ring is
allocated, not in ixgbe_probe(). This should probably be done in
ixgbe_alloc_q_vector() instead.

- Alex

^ permalink raw reply

* [PATCH v4 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: Jason A. Donenfeld @ 2017-04-25 15:04 UTC (permalink / raw)
  To: netdev, linux-kernel, davem, David.Laight, kernel-hardening
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425.104731.631398016575024152.davem@davemloft.net>

This is a defense-in-depth measure in response to bugs like
4d6fa57b4dab ("macsec: avoid heap overflow in skb_to_sgvec")

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
v4 fixes the commit message and moves the check into the inner-most if.

 net/core/skbuff.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f86bf69cfb8d..e75640006d78 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3489,7 +3489,9 @@ void __init skb_init(void)
  *	@len: Length of buffer space to be mapped
  *
  *	Fill the specified scatter-gather list with mappings/pointers into a
- *	region of the buffer space attached to a socket buffer.
+ *	region of the buffer space attached to a socket buffer. Returns either
+ *	the number of scatterlist items used, or -EMSGSIZE if the contents
+ *	could not fit.
  */
 static int
 __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
@@ -3517,6 +3519,8 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
 		end = start + skb_frag_size(&skb_shinfo(skb)->frags[i]);
 		if ((copy = end - offset) > 0) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+			if (elt && sg_is_last(&sg[elt - 1]))
+				return -EMSGSIZE;
 
 			if (copy > len)
 				copy = len;
@@ -3537,6 +3541,9 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
 
 		end = start + frag_iter->len;
 		if ((copy = end - offset) > 0) {
+			if (elt && sg_is_last(&sg[elt - 1]))
+				return -EMSGSIZE;
+
 			if (copy > len)
 				copy = len;
 			elt += __skb_to_sgvec(frag_iter, sg+elt, offset - start,
@@ -3581,6 +3588,9 @@ int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int le
 {
 	int nsg = __skb_to_sgvec(skb, sg, offset, len);
 
+	if (nsg <= 0)
+		return nsg;
+
 	sg_mark_end(&sg[nsg - 1]);
 
 	return nsg;
-- 
2.12.2

^ permalink raw reply related

* [PATCHv2 net] bridge: move bridge multicast cleanup to ndo_uninit
From: Xin Long @ 2017-04-25 14:58 UTC (permalink / raw)
  To: network dev; +Cc: davem, nikolay, stephen

During removing a bridge device, if the bridge is still up, a new mdb entry
still can be added in br_multicast_add_group() after all mdb entries are
removed in br_multicast_dev_del(). Like the path:

  mld_ifc_timer_expire ->
    mld_sendpack -> ...
      br_multicast_rcv ->
        br_multicast_add_group

The new mp's timer will be set up. If the timer expires after the bridge
is freed, it may cause use-after-free panic in br_multicast_group_expired.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa07ed2c8>] br_multicast_group_expired+0x28/0xb0 [bridge]
Call Trace:
 <IRQ>
 [<ffffffff81094536>] call_timer_fn+0x36/0x110
 [<ffffffffa07ed2a0>] ? br_mdb_free+0x30/0x30 [bridge]
 [<ffffffff81096967>] run_timer_softirq+0x237/0x340
 [<ffffffff8108dcbf>] __do_softirq+0xef/0x280
 [<ffffffff8169889c>] call_softirq+0x1c/0x30
 [<ffffffff8102c275>] do_softirq+0x65/0xa0
 [<ffffffff8108e055>] irq_exit+0x115/0x120
 [<ffffffff81699515>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff81697a5d>] apic_timer_interrupt+0x6d/0x80

Nikolay also found it would cause a memory leak - the mdb hash is
reallocated and not freed due to the mdb rehash.

unreferenced object 0xffff8800540ba800 (size 2048):
  backtrace:
    [<ffffffff816e2287>] kmemleak_alloc+0x67/0xc0
    [<ffffffff81260bea>] __kmalloc+0x1ba/0x3e0
    [<ffffffffa05c60ee>] br_mdb_rehash+0x5e/0x340 [bridge]
    [<ffffffffa05c74af>] br_multicast_new_group+0x43f/0x6e0 [bridge]
    [<ffffffffa05c7aa3>] br_multicast_add_group+0x203/0x260 [bridge]
    [<ffffffffa05ca4b5>] br_multicast_rcv+0x945/0x11d0 [bridge]
    [<ffffffffa05b6b10>] br_dev_xmit+0x180/0x470 [bridge]
    [<ffffffff815c781b>] dev_hard_start_xmit+0xbb/0x3d0
    [<ffffffff815c8743>] __dev_queue_xmit+0xb13/0xc10
    [<ffffffff815c8850>] dev_queue_xmit+0x10/0x20
    [<ffffffffa02f8d7a>] ip6_finish_output2+0x5ca/0xac0 [ipv6]
    [<ffffffffa02fbfc6>] ip6_finish_output+0x126/0x2c0 [ipv6]
    [<ffffffffa02fc245>] ip6_output+0xe5/0x390 [ipv6]
    [<ffffffffa032b92c>] NF_HOOK.constprop.44+0x6c/0x240 [ipv6]
    [<ffffffffa032bd16>] mld_sendpack+0x216/0x3e0 [ipv6]
    [<ffffffffa032d5eb>] mld_ifc_timer_expire+0x18b/0x2b0 [ipv6]

This could happen when ip link remove a bridge or destroy a netns with a
bridge device inside.

With Nikolay's suggestion, this patch is to clean up bridge multicast in
ndo_uninit after bridge dev is shutdown, instead of br_dev_delete, so
that netif_running check in br_multicast_add_group can avoid this issue.

v1->v2:
  - fix this issue by moving br_multicast_dev_del to ndo_uninit, instead
    of calling dev_close in br_dev_delete.

Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/bridge/br_device.c | 1 +
 net/bridge/br_if.c     | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 90f49a1..430b53e 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -123,6 +123,7 @@ static void br_dev_uninit(struct net_device *dev)
 {
 	struct net_bridge *br = netdev_priv(dev);
 
+	br_multicast_dev_del(br);
 	br_multicast_uninit_stats(br);
 	br_vlan_flush(br);
 	free_percpu(br->stats);
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 56a2a72..a8d0ed2 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -311,7 +311,6 @@ void br_dev_delete(struct net_device *dev, struct list_head *head)
 
 	br_fdb_delete_by_port(br, NULL, 0, 1);
 
-	br_multicast_dev_del(br);
 	cancel_delayed_work_sync(&br->gc_work);
 
 	br_sysfs_delbr(br->dev);
-- 
2.1.0

^ permalink raw reply related

* Re: [RFC 0/4] xdp: use netlink extended ACK reporting
From: David Ahern @ 2017-04-25 14:53 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: davem, johannes, daniel, alexei.starovoitov, bblanco,
	john.fastabend, kubakici, oss-drivers
In-Reply-To: <20170425080644.122536-1-jakub.kicinski@netronome.com>

On 4/25/17 2:06 AM, Jakub Kicinski wrote:

> Also - is anyone working on adding proper extack support to iproute2?
> The code I have right now is a bit of a hack...

This is what I have done:
    https://github.com/dsahern/iproute2/commits/ext-ack

Basically, added the parsing code and then a new rtnl_talk_extack
function that takes a callback to invoke with the extack data. The last
patch (of 3) purposely breaks ip set link mtu -- sending the mtu as a
u16 rather than a u32 just to work on the plumbing for parsing the
returned message:

$ ip li set dummy1 mtu 1490
Error with rtnetlink attribute IFLA_MTU

If an errmsg is returned it is printed as well.

^ permalink raw reply

* Re: pull-request: can-next 2017-04-25,pull-request: can-next 2017-04-25
From: David Miller @ 2017-04-25 14:53 UTC (permalink / raw)
  To: mkl; +Cc: netdev, kernel, linux-can
In-Reply-To: <5a2d15d7-eb0f-f423-5814-df0babe5b749@pengutronix.de>

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Tue, 25 Apr 2017 10:44:26 +0200

> this is a pull request of 21 patches for net-next/master.
> 
> There are 4 patches by Stephane Grosjean for the PEAK PCAN-PCIe FD
> CAN-FD boards. The next 7 patches are by Mario Huettel, which add
> support for M_CAN IP version >= v3.1.x to the m_can driver. A patch by
> Remigiusz Kołłątaj adds support for the Microchip CAN BUS Analyzer. 8
> patches by Oliver Hartkopp complete the initial CAN network namespace
> support. Wei Yongjun's patch for the ti_hecc driver fixes the return
> value check in the probe function.

Pulled, thanks Marc.

^ permalink raw reply

* Re: [PATCH] macsec: avoid heap overflow in skb_to_sgvec
From: Sabrina Dubroca @ 2017-04-25 14:53 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: netdev, linux-kernel, davem, stable, security
In-Reply-To: <20170421211448.16995-1-Jason@zx2c4.com>

2017-04-21, 23:14:48 +0200, Jason A. Donenfeld wrote:
> While this may appear as a humdrum one line change, it's actually quite
> important. An sk_buff stores data in three places:
> 
> 1. A linear chunk of allocated memory in skb->data. This is the easiest
>    one to work with, but it precludes using scatterdata since the memory
>    must be linear.
> 2. The array skb_shinfo(skb)->frags, which is of maximum length
>    MAX_SKB_FRAGS. This is nice for scattergather, since these fragments
>    can point to different pages.
> 3. skb_shinfo(skb)->frag_list, which is a pointer to another sk_buff,
>    which in turn can have data in either (1) or (2).
> 
> The first two are rather easy to deal with, since they're of a fixed
> maximum length, while the third one is not, since there can be
> potentially limitless chains of fragments. Fortunately dealing with
> frag_list is opt-in for drivers, so drivers don't actually have to deal
> with this mess. For whatever reason, macsec decided it wanted pain, and
> so it explicitly specified NETIF_F_FRAGLIST.
> 
> Because dealing with (1), (2), and (3) is insane, most users of sk_buff
> doing any sort of crypto or paging operation calls a convenient function
> called skb_to_sgvec (which happens to be recursive if (3) is in use!).
> This takes a sk_buff as input, and writes into its output pointer an
> array of scattergather list items. Sometimes people like to declare a
> fixed size scattergather list on the stack; othertimes people like to
> allocate a fixed size scattergather list on the heap. However, if you're
> doing it in a fixed-size fashion, you really shouldn't be using
> NETIF_F_FRAGLIST too (unless you're also ensuring the sk_buff and its
> frag_list children arent't shared and then you check the number of
> fragments in total required.)
> 
> Macsec specifically does this:
> 
>         size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1);
>         tmp = kmalloc(size, GFP_ATOMIC);
>         *sg = (struct scatterlist *)(tmp + sg_offset);
> 	...
>         sg_init_table(sg, MAX_SKB_FRAGS + 1);
>         skb_to_sgvec(skb, sg, 0, skb->len);
> 
> Specifying MAX_SKB_FRAGS + 1 is the right answer usually, but not if you're
> using NETIF_F_FRAGLIST, in which case the call to skb_to_sgvec will
> overflow the heap, and disaster ensues.

Ugh, good catch :/

AFAICT this patch doesn't really help, because NETIF_F_FRAGLIST
doesn't get tested in paths that can lead to triggering this.

I'll post a patch to allocate a properly-sized sg array.

-- 
Sabrina

^ permalink raw reply

* [vhost:vhost 6/19] drivers/net/virtio_net.c:2089:19: error: assignment of read-only location '*(ctx + (sizetype)rxq2vq(i))'
From: kbuild test robot @ 2017-04-25 14:52 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, kbuild-all, kvm, virtualization

[-- Attachment #1: Type: text/plain, Size: 1435 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost
head:   674c124665ca2ff1bcf81b1b92a207f71a326742
commit: e43eed6b8068f1c570551fe33bed12ef840c956b [6/19] virtio_net: allow specifying context for rx
config: x86_64-acpi-redef (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        git checkout e43eed6b8068f1c570551fe33bed12ef840c956b
        # save the attached .config to linux build tree
        make ARCH=x86_64 

Note: the vhost/vhost HEAD 674c124665ca2ff1bcf81b1b92a207f71a326742 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   drivers/net/virtio_net.c: In function 'virtnet_find_vqs':
>> drivers/net/virtio_net.c:2089:19: error: assignment of read-only location '*(ctx + (sizetype)rxq2vq(i))'
       ctx[rxq2vq(i)] = true;
                      ^

vim +2089 drivers/net/virtio_net.c

  2083			callbacks[txq2vq(i)] = skb_xmit_done;
  2084			sprintf(vi->rq[i].name, "input.%d", i);
  2085			sprintf(vi->sq[i].name, "output.%d", i);
  2086			names[rxq2vq(i)] = vi->rq[i].name;
  2087			names[txq2vq(i)] = vi->sq[i].name;
  2088			if (ctx)
> 2089				ctx[rxq2vq(i)] = true;
  2090		}
  2091	
  2092		ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31327 bytes --]

[-- Attachment #3: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next] rhashtable: remove insecure_max_entries param
From: David Miller @ 2017-04-25 14:48 UTC (permalink / raw)
  To: fw; +Cc: herbert, netdev
In-Reply-To: <20170425141749.GD11322@breakpoint.cc>

From: Florian Westphal <fw@strlen.de>
Date: Tue, 25 Apr 2017 16:17:49 +0200

> I'd have less of an issue with this if we'd be talking about
> something computationally expensive, but this is about storing
> an extra value inside a struct just to avoid one "shr" in insert path...

Agreed, this shift is probably filling an available cpu cycle :-)

^ permalink raw reply

* Re: [PATCH 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: David Miller @ 2017-04-25 14:47 UTC (permalink / raw)
  To: Jason; +Cc: netdev, linux-kernel, David.Laight, kernel-hardening
In-Reply-To: <20170425140809.23881-1-Jason@zx2c4.com>

From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: Tue, 25 Apr 2017 16:08:05 +0200

> This is a defense-in-depth measure in response to bugs like
> 4d6fa57b4dab0d77f4d8e9d9c73d1e63f6fe8fee.
> 
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Please refer to commits in the form:

$(SHA1_ID) ("Commit header line.")

That is, 12 bytes of SHA1_ID followed by the commit header line text
in both double quotes and parenthesis, like this:

4d6fa57b4dab ("macsec: avoid heap overflow in skb_to_sgvec")

Otherwise when changes get backported or applied to different trees,
they have different SHA1_ID values.  The commit header text removes
any and all ambiguity.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next] ipvlan: use pernet operations and restrict l3s hooks to master netns
From: David Miller @ 2017-04-25 14:43 UTC (permalink / raw)
  To: fw; +Cc: netdev, maheshb
In-Reply-To: <20170420160815.7201-1-fw@strlen.de>

From: Florian Westphal <fw@strlen.de>
Date: Thu, 20 Apr 2017 18:08:15 +0200

> commit 4fbae7d83c98c30efc ("ipvlan: Introduce l3s mode") added
> registration of netfilter hooks via nf_register_hooks().
> 
> This API provides the illusion of 'global' netfilter hooks by placing the
> hooks in all current and future network namespaces.
> 
> In case of ipvlan the hook appears to be only needed in the namespace
> that contains the ipvlan master device (i.e., usually init_net), so
> placing them in all namespaces is not needed.
> 
> This switches ipvlan driver to pernet operations, and then only registers
> hooks in namespaces where a ipvlan master device is set to l3s mode.
> 
> Extra care has to be taken when the master device is moved to another
> namespace, as we might have to 'move' the netfilter hooks too.
> 
> This is done by storing the namespace the ipvlan port was created in.
> On REGISTER event, do (un)register operations in the old/new namespaces.
> 
> This will also allow removal of the nf_register_hooks() in a future patch.
> 
> Cc: Mahesh Bandewar <maheshb@google.com>
> Signed-off-by: Florian Westphal <fw@strlen.de>

Applied, thanks Florian.

^ permalink raw reply

* Re: macvlan: Fix device ref leak when purging bc_queue
From: David Miller @ 2017-04-25 14:42 UTC (permalink / raw)
  To: herbert; +Cc: Joe.Ghalam, Clifford.Wichmann, netdev
In-Reply-To: <20170420125512.GA9113@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 20 Apr 2017 20:55:12 +0800

> When a parent macvlan device is destroyed we end up purging its
> broadcast queue without dropping the device reference count on
> the packet source device.  This causes the source device to linger.
> 
> This patch drops that reference count.
> 
> Fixes: 260916dfb48c ("macvlan: Fix potential use-after free for...")
> Reported-by: Joe Ghalam <Joe.Ghalam@dell.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied and queued up for -stable, thanks Herbert.

^ permalink raw reply

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
From: Jiri Slaby @ 2017-04-25 14:41 UTC (permalink / raw)
  To: David Miller
  Cc: alexei.starovoitov, mingo, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet
In-Reply-To: <20170424.142420.290668473718207530.davem@davemloft.net>

On 04/24/2017, 08:24 PM, David Miller wrote:
> From: Jiri Slaby <jslaby@suse.cz>
> Date: Mon, 24 Apr 2017 19:51:54 +0200
> 
>> For example what's the point of making the sk_load_word_positive_offset
>> label a global, callable function? Note that this is exactly the reason
>> why this particular two hunks look weird to you even though the
>> annotations only mechanically paraphrase what is in the current code.
> 
> So that it can be referenced by the eBPF JIT, because these are
> helpers for eBPF JIT generated code.  Every architecture implementing
> an eBPF JIT has this "mess".

I completely understand the needs for this, but I am complaining about
the way it is written. That is not the best -- unbalanced annotations, C
macros in lowercase (apart from that, C macros in .S need semicolons &
backslashes), FUNC macro, etc.

> You can't even put a tracepoint or kprobe on these things and expect
> to see "arguments" or "return PC" values in the usual spots.  This
> code has special calling conventions and register usage as Alexei
> explained.

Yes, I can see that.

> I would suggest that you read and understand how this assembler is
> designed, how it is called from the generated JIT code, and what it's
> semantics and register usage are, before trying to annotating it.

Of course I studied the code. I only missed macro CHOOSE_LOAD_FUNC which
I see now. So that answers why sk_load_word_positive_offset & similar
are marked as .globl.


But the original question I asked still remains: why do you mind calling
them BPF_FUNC_START & *_END, given:

1) the functions are marked by "FUNC" already:
$ git grep FUNC linus/master arch/x86/net/bpf_jit.S
linus/master:arch/x86/net/bpf_jit.S:#define FUNC(name) \
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_word)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_word_positive_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_half)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_half_positive_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_byte)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_byte_positive_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_word_negative_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_half_negative_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_byte_negative_offset)

2) they _are_ all callable from within the JIT code:
EMIT1_off32(0xE8, jmp_offset);

Yes, I fucked up the ENDs. They should be on different locations. But
the pieces are still functions from my POV and should be annotated
accordingly.

thanks,
-- 
js
suse labs

^ permalink raw reply

* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Or Gerlitz @ 2017-04-25 14:39 UTC (permalink / raw)
  To: Erez Shitrit, Paolo Abeni
  Cc: Honggang LI, Erez Shitrit, Doug Ledford,
	linux-rdma@vger.kernel.org, Linux Netdev List, David Miller
In-Reply-To: <CAAk-MO-B8Liv6RyGye_Fx-9OELCUrGMj5SwK9UsciC=khSO5fA@mail.gmail.com>

On Tue, Apr 25, 2017 at 2:43 PM, Erez Shitrit <erezsh@dev.mellanox.co.il> wrote:
> On Tue, Apr 25, 2017 at 2:14 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:

>> thanks for the info. Is this bug there since ipoib/bonding day one (and hence my bug...)
>> or was indeed introduced later? if later, can you explain how
>> fc791b633515 introduced that or you only know it by bisection?

> commit "fc791b633515" changes the size of the dev_hardlen to be 24 and
> required 24 extra bytes in the skb, before it was only 4, if skb is
> aligned to eth "mode" it already has 14 bytes for hard-header.
> So only after that commit we have the issue.

If got you right, Paolo's commit introduced a regression, so we (I
guess you and
Paolo) need to either solve it or we (community) should consider a
revert, please suggest.

The bug is now in stable and distro kernels, so please act.

Or.

^ permalink raw reply

* Re: [PATCH v1] net: phy: fix auto-negotiation stall due to unavailable interrupt
From: David Miller @ 2017-04-25 14:36 UTC (permalink / raw)
  To: al.kochet
  Cc: f.fainelli, netdev, linux-kernel, sergei.shtylyov, rogerq,
	madalin.bucur
In-Reply-To: <1492686004-30527-2-git-send-email-al.kochet@gmail.com>

From: Alexander Kochetkov <al.kochet@gmail.com>
Date: Thu, 20 Apr 2017 14:00:04 +0300

> The Ethernet link on an interrupt driven PHY was not coming up if the Ethernet
> cable was plugged before the Ethernet interface was brought up.
> 
> The patch trigger PHY state machine to update link state if PHY was requested to
> do auto-negotiation and auto-negotiation complete flag already set.
> 
> During power-up cycle the PHY do auto-negotiation, generate interrupt and set
> auto-negotiation complete flag. Interrupt is handled by PHY state machine but
> doesn't update link state because PHY is in PHY_READY state. After some time
> MAC bring up, start and request PHY to do auto-negotiation. If there are no new
> settings to advertise genphy_config_aneg() doesn't start PHY auto-negotiation.
> PHY continue to stay in auto-negotiation complete state and doesn't fire
> interrupt. At the same time PHY state machine expect that PHY started
> auto-negotiation and is waiting for interrupt from PHY and it won't get it.
> 
> Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
> Cc: stable <stable@vger.kernel.org> # v4.9+

So... what are we doing here?

My understanding is that this should fix the same problem that commit
99f81afc139c6edd14d77a91ee91685a414a1c66 ("phy: micrel: Disable auto
negotiation on startup") fixed and that this micrel commit should thus
be reverted to improve MAC startup times which regressed.

Florian, any guidance?

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox