* [PATCH net-next] hyperv: Fix some variable name typos in send-buffer init/revoke
From: Haiyang Zhang @ 2014-12-20 2:25 UTC (permalink / raw)
To: davem, netdev; +Cc: olaf, jasowang, driverdev-devel, linux-kernel, haiyangz
The renamed fields are union members of the same size, so the existing code
still works. This patch switches the code over to the correct names.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
---
drivers/net/hyperv/hyperv_net.h | 1 +
drivers/net/hyperv/netvsc.c | 15 ++++++++-------
2 files changed, 9 insertions(+), 7 deletions(-)
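Why the old spelling still worked can be shown with a small user-space
sketch. The demo_* types below are hypothetical stand-ins, assumed to mirror
the revoke message bodies as two single-field structs; the real definitions
are in hyperv_net.h and may differ.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two revoke message bodies. */
struct demo_revoke_recv_buf { uint16_t id; };
struct demo_revoke_send_buf { uint16_t id; };

/* Both bodies share storage, as members of one message union. */
union demo_v1_msg {
	struct demo_revoke_recv_buf revoke_recv_buf;
	struct demo_revoke_send_buf revoke_send_buf;
};

/* Old spelling: stores the id through the receive-buffer member. */
void demo_fill_old(union demo_v1_msg *msg, uint16_t id)
{
	memset(msg, 0, sizeof(*msg));
	msg->revoke_recv_buf.id = id;
}

/* New spelling: stores the id through the send-buffer member. */
void demo_fill_new(union demo_v1_msg *msg, uint16_t id)
{
	memset(msg, 0, sizeof(*msg));
	msg->revoke_send_buf.id = id;
}

/* Returns 1 when both spellings produce byte-identical messages. */
int demo_same_wire_bytes(uint16_t id)
{
	union demo_v1_msg a, b;

	demo_fill_old(&a, id);
	demo_fill_new(&b, id);
	return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Under that layout assumption, demo_same_wire_bytes() reports identical bytes
for any id, which is why the rename is a readability fix rather than a
behaviour change.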
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 2f48f79..384ca4f 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -590,6 +590,7 @@ struct nvsp_message {
#define NETVSC_RECEIVE_BUFFER_ID 0xcafe
+#define NETVSC_SEND_BUFFER_ID 0
#define NETVSC_PACKET_SIZE 4096
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index dd867e6..9f49c01 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -161,8 +161,8 @@ static int netvsc_destroy_buf(struct netvsc_device *net_device)
/* Deal with the send buffer we may have setup.
* If we got a send section size, it means we received a
- * SendsendBufferComplete msg (ie sent
- * NvspMessage1TypeSendReceiveBuffer msg) therefore, we need
+ * NVSP_MSG1_TYPE_SEND_SEND_BUF_COMPLETE msg (ie sent
+ * NVSP_MSG1_TYPE_SEND_SEND_BUF msg) therefore, we need
* to send a revoke msg here
*/
if (net_device->send_section_size) {
@@ -172,7 +172,8 @@ static int netvsc_destroy_buf(struct netvsc_device *net_device)
revoke_packet->hdr.msg_type =
NVSP_MSG1_TYPE_REVOKE_SEND_BUF;
- revoke_packet->msg.v1_msg.revoke_recv_buf.id = 0;
+ revoke_packet->msg.v1_msg.revoke_send_buf.id =
+ NETVSC_SEND_BUFFER_ID;
ret = vmbus_sendpacket(net_device->dev->channel,
revoke_packet,
@@ -204,7 +205,7 @@ static int netvsc_destroy_buf(struct netvsc_device *net_device)
net_device->send_buf_gpadl_handle = 0;
}
if (net_device->send_buf) {
- /* Free up the receive buffer */
+ /* Free up the send buffer */
vfree(net_device->send_buf);
net_device->send_buf = NULL;
}
@@ -339,9 +340,9 @@ static int netvsc_init_buf(struct hv_device *device)
init_packet = &net_device->channel_init_pkt;
memset(init_packet, 0, sizeof(struct nvsp_message));
init_packet->hdr.msg_type = NVSP_MSG1_TYPE_SEND_SEND_BUF;
- init_packet->msg.v1_msg.send_recv_buf.gpadl_handle =
+ init_packet->msg.v1_msg.send_send_buf.gpadl_handle =
net_device->send_buf_gpadl_handle;
- init_packet->msg.v1_msg.send_recv_buf.id = 0;
+ init_packet->msg.v1_msg.send_send_buf.id = NETVSC_SEND_BUFFER_ID;
/* Send the gpadl notification request */
ret = vmbus_sendpacket(device->channel, init_packet,
@@ -364,7 +365,7 @@ static int netvsc_init_buf(struct hv_device *device)
netdev_err(ndev, "Unable to complete send buffer "
"initialization with NetVsp - status %d\n",
init_packet->msg.v1_msg.
- send_recv_buf_complete.status);
+ send_send_buf_complete.status);
ret = -EINVAL;
goto cleanup;
}
--
1.7.1
^ permalink raw reply related
* Re: net: Detect drivers that reschedule NAPI and exhaust budget
From: David Miller @ 2014-12-20 2:40 UTC (permalink / raw)
To: eric.dumazet
Cc: herbert, david.vrabel, netdev, xen-devel, konrad.wilk,
boris.ostrovsky, edumazet
In-Reply-To: <1419039288.11185.4.camel@edumazet-glaptop2.roam.corp.google.com>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 19 Dec 2014 17:34:48 -0800
>> @@ -4620,7 +4620,11 @@ static void net_rx_action(struct softirq_action *h)
>> */
>> napi_gro_flush(n, HZ >= 1000);
>> }
>> - list_add_tail(&n->poll_list, &repoll);
>> + /* Some drivers may have called napi_schedule
>> + * prior to exhausting their budget.
>> + */
>> + if (!WARN_ON_ONCE(!list_empty(&n->poll_list)))
>> + list_add_tail(&n->poll_list, &repoll);
>> }
>> }
>>
>
> I do not think stack trace will point to the buggy driver ?
>
> IMO it would be better to print a single line with the netdev name ?
Right, we are already back from the poll routine and will just end
up seeing the call trace leading to the software interrupt.
^ permalink raw reply
* OVS + BPF, make sense?
From: Andy Zhou @ 2014-12-20 2:49 UTC (permalink / raw)
To: dev-yBygre7rU0SM8Zsap4Y0gw@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hi, OVS and netdev developers:
On 12/17/2014, Alexei Starovoitov and a few of the OVS developers
(Joe, Andy, Jesse, Pravin and Justin) got together to discuss possible
ways for OVS to harness the power of BPF in recent Linux kernels. During
the meeting, we felt that the content of the discussion may be of
interest to many OVS and Linux kernel developers, so it seemed a good
idea to post the meeting minutes.
The meeting minutes can be found below. I am cross-posting them to both
the ovs-dev and netdev mailing lists. Apologies if you receive this email
twice.
We don't have a concrete plan at this point on how BPF can be applied.
However, we are interested in exploring further and exchanging ideas with
the developer communities. We can probably meet again in early 2015 if
there is sufficient interest in this topic.
Regards,
Andy
BPF current status:
===========================
* Linux kernel upstreaming is ongoing, currently focused on tracing.
  Other enhancements are planned, for example JIT opcode obfuscation
  as a security enhancement.
* The first set of LLVM upstreaming should land in Q1'15. More
  enhancements will follow.
* A GCC backend is possible, but not planned at this time.
* New features planned: per-cpu data structures, streaming interfaces
  reusing the trace buffer infrastructure.
Possible use cases of BPF in OVS Linux kernel datapath
===========================================
1. Using BPF to implement a single action:
It may make sense for OVS to have its own program type. However, the
bpf_register_prog_type() API is currently not exported. This
means the program type and related helper functions cannot be provided
by the OVS kernel module, but have to be upstreamed into the kernel
core. This may affect how the OVS kernel module can provide backward
compatibility. Alexei explained that this is mainly driven by
concerns about the complexity of tracking module load/unload while
BPF programs are running, and about possible side-stepping of the GPL.
A BPF action may need to access kernel data structures, such as skb,
in kernel-version-agnostic ways. Alexei is aware of this requirement,
and is considering multiple potential solutions: a) bpf helper
functions, b) using a pseudo skb, c) asking the kernel about offsets for
each data field of interest... This is work in progress.
2. Using BPF to implement the entire action list:
This is a bigger task than 1, but can bring more of the benefits of BPF
to OVS. Current OVS action lists are executed sequentially. BPF
provides if-then-else and other types of control capabilities to
'upgrade' an action list to a true program.
Those BPF programs need to be generated at run time by OVS user
space. Alexei thinks this may not be hard within the scope of current
OVS actions. Jesse suggested looking at the libpcap style of program
generation for reference.
3. Using BPF to implement OVS flow extract:
Flow extract functions are sweet spots for applying BPF. BPF can
be the back end of the current OpenFlow match parser, or even the back end
of a more flexible parser such as P4.
4. Using BPF to implement overall OVS kernel module functionality:
Alexei likes this approach the most. The potential benefits are:
a) flexible parser and flow data structure
b) user space and kernel data structures are always in sync, thus
removing the complexity of version compatibility handling and error
checking.
c) possibly higher performance than the current kernel module, with
JITed BPF code.
d) The helper functions can be more easily planned out. This can be
important in case dynamic helper function registration is not
possible.
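To make use case 2 a little more concrete, here is a toy user-space sketch
of what "upgrading" a sequential action list to a program with branches
could look like. Everything in it (the demo_* names, the three opcodes, the
forward-only jump rule) is invented for illustration; it is neither BPF nor
the OVS action format.

```c
#include <stddef.h>

/* Hypothetical opcodes: two plain actions, one conditional jump, one stop. */
enum demo_op { DEMO_SET_MARK, DEMO_OUTPUT, DEMO_JUMP_IF_MARK_NE, DEMO_END };

struct demo_insn {
	enum demo_op op;
	int arg;	/* value to set, port to output to, or value to compare */
	int target;	/* jump destination, only used by the jump opcode */
};

struct demo_pkt { int mark; int out_port; };

/* Runs a program over a packet. Returns 0 on success, -1 on a bad jump. */
int demo_run(const struct demo_insn *prog, size_t len, struct demo_pkt *pkt)
{
	size_t pc = 0;

	while (pc < len) {
		const struct demo_insn *in = &prog[pc];

		switch (in->op) {
		case DEMO_SET_MARK:
			pkt->mark = in->arg;
			pc++;
			break;
		case DEMO_OUTPUT:
			pkt->out_port = in->arg;
			pc++;
			break;
		case DEMO_JUMP_IF_MARK_NE:
			/* Only forward, in-range jumps are accepted, so the
			 * loop is guaranteed to terminate.
			 */
			if (in->target <= (int)pc || (size_t)in->target > len)
				return -1;
			pc = (pkt->mark != in->arg) ? (size_t)in->target : pc + 1;
			break;
		case DEMO_END:
			return 0;
		}
	}
	return 0;
}

/* "if (mark == 7) output to port 1; else output to port 2;" */
int demo_route(int mark)
{
	static const struct demo_insn prog[] = {
		{ DEMO_JUMP_IF_MARK_NE, 7, 3 },
		{ DEMO_OUTPUT, 1, 0 },
		{ DEMO_END, 0, 0 },
		{ DEMO_OUTPUT, 2, 0 },
		{ DEMO_END, 0, 0 },
	};
	struct demo_pkt pkt = { mark, -1 };

	if (demo_run(prog, sizeof(prog) / sizeof(prog[0]), &pkt) != 0)
		return -1;
	return pkt.out_port;
}
```

A purely sequential action list corresponds to a program with no jump
opcodes; the single conditional jump is what lets one program send
differently marked packets to different ports.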
^ permalink raw reply
* Re: net: Detect drivers that reschedule NAPI and exhaust budget
From: Herbert Xu @ 2014-12-20 6:55 UTC (permalink / raw)
To: David Miller
Cc: eric.dumazet, david.vrabel, netdev, xen-devel, konrad.wilk,
boris.ostrovsky, edumazet
In-Reply-To: <20141219.214000.819506179607476836.davem@davemloft.net>
On Fri, Dec 19, 2014 at 09:40:00PM -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 19 Dec 2014 17:34:48 -0800
>
> >> @@ -4620,7 +4620,11 @@ static void net_rx_action(struct softirq_action *h)
> >> */
> >> napi_gro_flush(n, HZ >= 1000);
> >> }
> >> - list_add_tail(&n->poll_list, &repoll);
> >> + /* Some drivers may have called napi_schedule
> >> + * prior to exhausting their budget.
> >> + */
> >> + if (!WARN_ON_ONCE(!list_empty(&n->poll_list)))
> >> + list_add_tail(&n->poll_list, &repoll);
> >> }
> >> }
> >>
> >
> > I do not think stack trace will point to the buggy driver ?
> >
> > IMO it would be better to print a single line with the netdev name ?
>
> Right, we are already back from the poll routine and will just end
> up seeing the call trace leading to the software interrupt.
Good point Eric.
-- >8 --
The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
interrupt masking in NAPI) required drivers to leave poll_list
empty if the entire budget is consumed.
We have already had two broken drivers so let's add a check for
this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/net/core/dev.c b/net/core/dev.c
index f411c28..47fdc5c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4620,7 +4620,13 @@ static void net_rx_action(struct softirq_action *h)
*/
napi_gro_flush(n, HZ >= 1000);
}
- list_add_tail(&n->poll_list, &repoll);
+ /* Some drivers may have called napi_schedule
+ * prior to exhausting their budget.
+ */
+ if (unlikely(!list_empty(&n->poll_list)))
+ pr_warn("%s: Budget exhausted after napi rescheduled\n", n->dev ? n->dev->name : "backlog");
+ else
+ list_add_tail(&n->poll_list, &repoll);
}
}
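For readers following along, the check can be modelled in user space. This
is only a sketch: demo_list, demo_napi and demo_requeue are made-up
stand-ins for list_head, napi_struct and the tail of net_rx_action, and the
device name is passed directly instead of going through n->dev.

```c
#include <stdio.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct demo_list { struct demo_list *next, *prev; };

void demo_list_init(struct demo_list *l)
{
	l->next = l;
	l->prev = l;
}

int demo_list_empty(const struct demo_list *l)
{
	return l->next == l;
}

void demo_list_add_tail(struct demo_list *entry, struct demo_list *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

struct demo_napi {
	struct demo_list poll_list;
	const char *name;	/* NULL stands for the backlog NAPI */
};

/* Called once the poll routine has used its whole budget.
 * Returns 1 if the instance was queued on repoll, 0 if it was already
 * linked somewhere (the driver rescheduled it) and only a warning is printed.
 */
int demo_requeue(struct demo_napi *n, struct demo_list *repoll)
{
	if (!demo_list_empty(&n->poll_list)) {
		fprintf(stderr, "%s: Budget exhausted after napi rescheduled\n",
			n->name ? n->name : "backlog");
		return 0;
	}
	demo_list_add_tail(&n->poll_list, repoll);
	return 1;
}

/* Builds one scenario: a well-behaved driver (0) or one that rescheduled (1). */
int demo_case(int driver_rescheduled)
{
	struct demo_list poll, repoll;
	struct demo_napi n = { .name = "eth0" };

	demo_list_init(&poll);
	demo_list_init(&repoll);
	demo_list_init(&n.poll_list);
	if (driver_rescheduled)
		demo_list_add_tail(&n.poll_list, &poll);
	return demo_requeue(&n, &repoll);
}
```

In this model, adding an already linked entry to repoll a second time would
corrupt both lists, which is the situation the check is meant to catch.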
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related
* [PATCH]e100 in linux-3.18.0: some potential bugs
From: Jia-Ju Bai @ 2014-12-20 7:40 UTC (permalink / raw)
To: todd.fujinaka; +Cc: netdev, Linux-nics, linux.nics, e1000-devel
[-- Attachment #1: Type: text/plain, Size: 2245 bytes --]
I have tested the e100 driver on real hardware (Intel 82559 PCI
Ethernet Controller) and found some bugs.
The target file is drivers/net/ethernet/intel/e100.c, which is used to build
e100.ko.
(1) pci_pool_create is called by e100_probe when initializing the Ethernet
card driver. If pci_pool_create fails, it returns NULL, which is stored in
nic->cbs_pool, and the system crashes later because pci_pool_alloc (in
e100_alloc_cbs, called from e100_up, called from e100_open) uses
nic->cbs_pool to allocate resources. I suggest adding a check for a NULL
return from pci_pool_create.
(2) In the normal path, netif_napi_add is called in e100_probe, but
netif_napi_del is not called in e100_remove. Many other Ethernet card
drivers, such as r8169 and igb, call them in pairs, even in the error
handling paths.
I have also written a patch to fix these bugs. I have run the patched
driver on the hardware; it works normally and fixes the bugs above.
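The shape of the fix for (1) is the usual check-and-unwind pattern. A
user-space sketch follows, assuming a made-up demo_nic with one early
allocation, a registration flag and a pool; demo_alloc is a hypothetical
allocator that can be told to fail, and none of these names exist in e100.c.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_nic {
	void *mem;		/* stands in for earlier probe allocations */
	int registered;		/* stands in for register_netdev() state */
	void *cbs_pool;		/* stands in for the pool created last */
};

/* Hypothetical allocator that can be forced to fail. */
static void *demo_alloc(int fail)
{
	return fail ? NULL : malloc(16);
}

/* Returns 0 on success or -ENOMEM, leaving nothing allocated on failure. */
int demo_probe(struct demo_nic *nic, int fail_pool)
{
	int err;

	nic->mem = demo_alloc(0);
	if (!nic->mem) {
		err = -ENOMEM;
		goto err_out;
	}
	nic->registered = 1;

	nic->cbs_pool = demo_alloc(fail_pool);
	if (!nic->cbs_pool) {
		/* Without this check, a later open would use a NULL pool. */
		err = -ENOMEM;
		goto err_out_pool;
	}
	return 0;

err_out_pool:
	/* Undo the steps in reverse order of setup. */
	nic->registered = 0;
	free(nic->mem);
	nic->mem = NULL;
err_out:
	return err;
}

/* Tears down a successfully probed device. */
void demo_remove(struct demo_nic *nic)
{
	free(nic->cbs_pool);
	nic->cbs_pool = NULL;
	nic->registered = 0;
	free(nic->mem);
	nic->mem = NULL;
}
```

The point of the sketch is only that the failing step jumps to a label that
undoes everything done before it, so a failed probe leaves no half-built
device behind.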
diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c
index 781065e..2631d3f 100644
--- a/drivers/net/ethernet/intel/e100.c
+++ b/drivers/net/ethernet/intel/e100.c
@@ -2969,6 +2969,11 @@ static int e100_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
nic->params.cbs.max * sizeof(struct cb),
sizeof(u32),
0);
+ if(!(nic->cbs_pool))
+ {
+ err = -ENOMEM;
+ goto err_out_pool;
+ }
netif_info(nic, probe, nic->netdev,
"addr 0x%llx, irq %d, MAC addr %pM\n",
(unsigned long long)pci_resource_start(pdev, use_io ? 1 : 0),
@@ -2976,6 +2981,8 @@ static int e100_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
return 0;
+err_out_pool:
+ unregister_netdev(netdev);
err_out_free:
e100_free(nic);
err_out_iounmap:
@@ -2985,6 +2992,7 @@ err_out_free_res:
err_out_disable_pdev:
pci_disable_device(pdev);
err_out_free_dev:
+ netif_napi_del(&nic->napi);
free_netdev(netdev);
return err;
}
@@ -2995,6 +3003,7 @@ static void e100_remove(struct pci_dev *pdev)
if (netdev) {
struct nic *nic = netdev_priv(netdev);
+ netif_napi_del(&nic->napi);
unregister_netdev(netdev);
e100_free(nic);
pci_iounmap(pdev, nic->csr);
^ permalink raw reply related
* [PATCH] e1000 in linux-3.18.0: a potential bug
From: Jia-Ju Bai @ 2014-12-20 7:50 UTC (permalink / raw)
To: todd.fujinaka, netdev; +Cc: linux.nics, e1000-devel
I have tested the e1000 driver on real hardware (Intel 82540EM PCI
Gigabit Ethernet Controller) and found a potential bug.
The target file is drivers/net/ethernet/intel/e1000/e1000_main.c, which is
used to build e1000.ko.
(1) In the normal path, netif_napi_add is called in e1000_probe, but
netif_napi_del is not called in e1000_remove. Many other Ethernet card
drivers, such as r8169 and igb, call them in pairs, even in the error
handling paths.
I have also written a patch to fix the bug. I have run the patched driver
on the hardware; it works normally and fixes the bug above.
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 24f3986..f6def7b 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -1004,7 +1004,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
/* make ready for any if (hw->...) below */
err = e1000_init_hw_struct(adapter, hw);
if (err)
- goto err_sw_init;
+ goto err_dma;
/* there is a workaround being applied below that limits
* 64-bit DMA addresses to 64-bit hardware. There are some
@@ -1239,8 +1239,9 @@ err_eeprom:
iounmap(hw->flash_address);
kfree(adapter->tx_ring);
kfree(adapter->rx_ring);
-err_dma:
err_sw_init:
+ netif_napi_del(&adapter->napi);
+err_dma:
err_mdio_ioremap:
iounmap(hw->ce4100_gbe_mdio_base_virt);
iounmap(hw->hw_addr);
@@ -1271,6 +1272,7 @@ static void e1000_remove(struct pci_dev *pdev)
e1000_down_and_stop(adapter);
e1000_release_manageability(adapter);
+ netif_napi_del(&adapter->napi);
unregister_netdev(netdev);
e1000_phy_hw_reset(hw);
Thanks!
^ permalink raw reply related
* [PATCH] e1000e in linux-3.18.0: some potential bugs
From: Jia-Ju Bai @ 2014-12-20 8:02 UTC (permalink / raw)
To: todd.fujinaka, netdev; +Cc: e1000-devel, linux.nics
I have tested the e1000e driver on real hardware (Intel 82572EI
PCI-E Gigabit Ethernet Controller) and found some potential bugs.
The target file is drivers/net/ethernet/intel/e1000e/netdev.c, which is used
to build e1000e.ko.
(1) In the normal path, netif_napi_add is called in e1000_probe, but
netif_napi_del is not called in e1000_remove. Many other Ethernet card
drivers, such as r8169 and igb, call them in pairs, even in the error
handling paths.
(2) vzalloc is called by e1000e_setup_rx_resources (in e1000_open) when
initializing the Ethernet card driver. If vzalloc fails, the "err" path in
e1000e_setup_rx_resources returns an error, and then
e1000e_free_tx_resources in the "err_setup_rx" path of e1000_open is
executed. However, the "writel(0, tx_ring->head)" statement in
e1000_clean_tx_ring (called from e1000e_free_tx_resources) crashes the
system, because "tx_ring->head" has not been assigned yet. In the code,
"tx_ring->head" is initialized in e1000_configure_tx (called from
e1000_configure), which runs after e1000e_setup_rx_resources.
(3) The same crash happens when kcalloc in e1000e_setup_rx_resources fails
(returns NULL).
(4) The same crash happens when e1000_alloc_ring_dma in
e1000e_setup_rx_resources fails (returns an error code).
(5) In the normal path of e1000e, pci_enable_pcie_error_reporting and
pci_disable_pcie_error_reporting are called in pairs in e1000_probe and
e1000_remove. However, when pci_enable_pcie_error_reporting has been called
and pci_save_state in e1000_probe fails, the "err_alloc_etherdev" path in
e1000_probe is executed immediately to exit, and
pci_disable_pcie_error_reporting is not called.
(6) The same happens when alloc_etherdev_mqs in e1000_probe fails.
(7) The same happens when ioremap in e1000_probe fails.
(8) The same happens when e1000_sw_init in e1000_probe fails.
(9) The same happens when register_netdev in e1000_probe fails.
(10) When request_irq in e1000_request_irq fails, pm_qos_add_request in
e1000_open has already been called, but pm_qos_remove_request is not called.
I have also written a patch to fix these bugs. I have run the patched
driver on the hardware; it works normally and fixes the bugs above.
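The crash in (2) to (4) comes from cleanup code touching a pointer that a
later setup step would have filled in. A user-space sketch of the guard,
with demo_ring as a hypothetical stand-in for the ring structure and a plain
store standing in for writel():

```c
#include <stddef.h>
#include <stdint.h>

struct demo_ring {
	uint32_t *head;		/* mapped head register; NULL until configured */
	unsigned int next_to_use;
	unsigned int next_to_clean;
};

/* Resets the ring indices and, if the ring was configured, the head register.
 * Returns 1 when the register was written and 0 when the write was skipped.
 */
int demo_clean_ring(struct demo_ring *ring)
{
	ring->next_to_use = 0;
	ring->next_to_clean = 0;

	if (!ring->head)
		return 0;	/* ring never reached the configure step */

	*ring->head = 0;	/* stands in for writel(0, ring->head) */
	return 1;
}
```

The guard is only meaningful if head is known to be NULL before setup,
which is why the sketch assumes the pointer is cleared up front.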
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 247335d..02d1e67 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -2444,6 +2444,8 @@ static void e1000_clean_tx_ring(struct e1000_ring *tx_ring)
tx_ring->next_to_use = 0;
tx_ring->next_to_clean = 0;
+ if(!(tx_ring->head))
+ return;
writel(0, tx_ring->head);
if (adapter->flags2 & FLAG2_PCIM2PCI_ARBITER_WA)
e1000e_update_tdt_wa(tx_ring, 0);
@@ -4358,11 +4360,13 @@ static int e1000_open(struct net_device *netdev)
netif_carrier_off(netdev);
/* allocate transmit descriptors */
+ adapter->tx_ring->head = NULL;
err = e1000e_setup_tx_resources(adapter->tx_ring);
if (err)
goto err_setup_tx;
/* allocate receive descriptors */
+ adapter->rx_ring->head = NULL;
err = e1000e_setup_rx_resources(adapter->rx_ring);
if (err)
goto err_setup_rx;
@@ -4430,6 +4434,7 @@ static int e1000_open(struct net_device *netdev)
return 0;
err_req_irq:
+ pm_qos_remove_request(&adapter->netdev->pm_qos_req);
e1000e_release_hw_control(adapter);
e1000_power_down_phy(adapter);
e1000e_free_rx_resources(adapter->rx_ring);
@@ -7045,6 +7050,7 @@ err_hw_init:
kfree(adapter->tx_ring);
kfree(adapter->rx_ring);
err_sw_init:
+ netif_napi_del(&adapter->napi);
if (adapter->hw.flash_address)
iounmap(adapter->hw.flash_address);
e1000e_reset_interrupt_capability(adapter);
@@ -7053,6 +7059,7 @@ err_flashmap:
err_ioremap:
free_netdev(netdev);
err_alloc_etherdev:
+ pci_disable_pcie_error_reporting(pdev);
pci_release_selected_regions(pdev,
pci_select_bars(pdev, IORESOURCE_MEM));
err_pci_reg:
@@ -7103,6 +7110,7 @@ static void e1000_remove(struct pci_dev *pdev)
/* Don't lie to e1000_close() down the road. */
if (!down)
clear_bit(__E1000_DOWN, &adapter->state);
+ netif_napi_del(&adapter->napi);
unregister_netdev(netdev);
if (pci_dev_run_wake(pdev))
Thanks!
^ permalink raw reply related
* [PATCH] igb in linux-3.18.0: some potential bugs
From: Jia-Ju Bai @ 2014-12-20 8:11 UTC (permalink / raw)
To: todd.fujinaka, netdev; +Cc: e1000-devel, linux.nics
I have tested the igb driver on real hardware (Intel 82575EB PCI-E
Gigabit Ethernet Controller) and found some potential bugs.
The target file is drivers/net/ethernet/intel/igb/igb_main.c.
(1) In the normal path of igb, pci_enable_pcie_error_reporting and
pci_disable_pcie_error_reporting are called in pairs in igb_probe and
igb_remove. However, when pci_enable_pcie_error_reporting has been called
and alloc_etherdev_mqs in igb_probe fails, the "err_alloc_etherdev" path
in igb_probe is executed immediately to exit, and
pci_disable_pcie_error_reporting is not called.
(2) The same happens when pci_iomap in igb_probe fails.
(3) The same happens when igb_sw_init in igb_probe fails.
(4) The same happens when register_netdev in igb_probe fails.
(5) The same happens when igb_init_i2c in igb_probe fails.
(6) kcalloc is called by igb_sw_init when initializing the Ethernet card
driver, but kfree is not called when register_netdev in igb_probe fails,
which may cause a memory leak.
(7) The same happens when igb_init_i2c in igb_probe fails.
(8) The same happens when kzalloc in igb_alloc_q_vector fails.
(9) The same happens when igb_alloc_q_vector in igb_alloc_q_vectors fails.
(10) When igb_init_i2c in igb_probe fails, igb_enable_sriov has already
been called in igb_probe_vfs, but igb_disable_sriov is not called.
(11) The same as (10) happens when register_netdev in igb_probe fails.
I have also written a patch to fix these bugs. I have run the patched
driver on the hardware; it works normally and fixes the bugs above.
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 487cd9c..cd9364a 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -179,6 +179,7 @@ static void igb_check_vf_rate_limit(struct igb_adapter *);
#ifdef CONFIG_PCI_IOV
static int igb_vf_configure(struct igb_adapter *adapter, int vf);
static int igb_pci_enable_sriov(struct pci_dev *dev, int num_vfs);
+static int igb_disable_sriov(struct pci_dev *pdev);
#endif
#ifdef CONFIG_PM
@@ -2653,17 +2654,22 @@ err_register:
igb_release_hw_control(adapter);
memset(&adapter->i2c_adap, 0, sizeof(adapter->i2c_adap));
err_eeprom:
+#ifdef CONFIG_PCI_IOV
+ igb_disable_sriov(pdev);
+#endif
if (!igb_check_reset_block(hw))
igb_reset_phy(hw);
if (hw->flash_address)
iounmap(hw->flash_address);
err_sw_init:
+ kfree(adapter->shadow_vfta);
igb_clear_interrupt_scheme(adapter);
pci_iounmap(pdev, hw->hw_addr);
err_ioremap:
free_netdev(netdev);
err_alloc_etherdev:
+ pci_disable_pcie_error_reporting(pdev);
pci_release_selected_regions(pdev,
pci_select_bars(pdev, IORESOURCE_MEM));
err_pci_reg:
Thanks!
^ permalink raw reply related
* Re: [linux-nics] [PATCH]e100 in linux-3.18.0: some potential bugs
From: Jeff Kirsher @ 2014-12-20 10:18 UTC (permalink / raw)
To: Jia-Ju Bai; +Cc: todd.fujinaka, e1000-devel, netdev, linux.nics, Linux-nics
In-Reply-To: <000001d01c28$41c937c0$c55ba740$@163.com>
[-- Attachment #1: Type: text/plain, Size: 2155 bytes --]
On Sat, 2014-12-20 at 15:40 +0800, Jia-Ju Bai wrote:
> I have tested the e100 driver on real hardware (Intel 82559 PCI
> Ethernet Controller) and found some bugs.
> The target file is drivers/net/ethernet/intel/e100.c, which is used to
> build e100.ko.
>
> (1) pci_pool_create is called by e100_probe when initializing the
> Ethernet card driver. If pci_pool_create fails, it returns NULL, which is
> stored in nic->cbs_pool, and the system crashes later because
> pci_pool_alloc (in e100_alloc_cbs, called from e100_up, called from
> e100_open) uses nic->cbs_pool to allocate resources. I suggest adding a
> check for a NULL return from pci_pool_create.
> (2) In the normal path, netif_napi_add is called in e100_probe, but
> netif_napi_del is not called in e100_remove. Many other Ethernet card
> drivers, such as r8169 and igb, call them in pairs, even in the error
> handling paths.
>
> I have also written a patch to fix these bugs. I have run the patched
> driver on the hardware; it works normally and fixes the bugs above.
Did you actually experience an issue? Or is this a theoretical issue
that was never actually seen?
>
> diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c
> index 781065e..2631d3f 100644
> --- a/drivers/net/ethernet/intel/e100.c
> +++ b/drivers/net/ethernet/intel/e100.c
> @@ -2969,6 +2969,11 @@ static int e100_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> nic->params.cbs.max * sizeof(struct cb),
> sizeof(u32),
> 0);
> + if(!(nic->cbs_pool))
> + {
> + err = -ENOMEM;
> + goto err_out_pool;
> + }
> netif_info(nic, probe, nic->netdev,
Minor nit-pick but your open bracket needs to be on the same line as the
if statement AND you need a space between the 'if' and (). So the above
code should look like:
	if (!nic->cbs_pool) {
		err = -ENOMEM;
		goto err_out_pool;
	}
^ permalink raw reply
* Re: [linux-nics] [PATCH] e1000e in linux-3.18.0: some potential bugs
From: Jeff Kirsher @ 2014-12-20 10:22 UTC (permalink / raw)
To: Jia-Ju Bai; +Cc: todd.fujinaka, netdev, e1000-devel, linux.nics
In-Reply-To: <000f01d01c2b$4af1b3b0$e0d51b10$@163.com>
[-- Attachment #1: Type: text/plain, Size: 3185 bytes --]
On Sat, 2014-12-20 at 16:02 +0800, Jia-Ju Bai wrote:
> I have tested the e1000e driver on real hardware (Intel 82572EI
> PCI-E Gigabit Ethernet Controller) and found some potential bugs.
> The target file is drivers/net/ethernet/intel/e1000e/netdev.c, which is
> used to build e1000e.ko.
>
> (1) In the normal path, netif_napi_add is called in e1000_probe, but
> netif_napi_del is not called in e1000_remove. Many other Ethernet card
> drivers, such as r8169 and igb, call them in pairs, even in the error
> handling paths.
>
> (2) vzalloc is called by e1000e_setup_rx_resources (in e1000_open) when
> initializing the Ethernet card driver. If vzalloc fails, the "err" path
> in e1000e_setup_rx_resources returns an error, and then
> e1000e_free_tx_resources in the "err_setup_rx" path of e1000_open is
> executed. However, the "writel(0, tx_ring->head)" statement in
> e1000_clean_tx_ring (called from e1000e_free_tx_resources) crashes the
> system, because "tx_ring->head" has not been assigned yet. In the code,
> "tx_ring->head" is initialized in e1000_configure_tx (called from
> e1000_configure), which runs after e1000e_setup_rx_resources.
> (3) The same crash happens when kcalloc in e1000e_setup_rx_resources
> fails (returns NULL).
> (4) The same crash happens when e1000_alloc_ring_dma in
> e1000e_setup_rx_resources fails (returns an error code).
>
> (5) In the normal path of e1000e, pci_enable_pcie_error_reporting and
> pci_disable_pcie_error_reporting are called in pairs in e1000_probe and
> e1000_remove. However, when pci_enable_pcie_error_reporting has been
> called and pci_save_state in e1000_probe fails, the "err_alloc_etherdev"
> path in e1000_probe is executed immediately to exit, and
> pci_disable_pcie_error_reporting is not called.
> (6) The same happens when alloc_etherdev_mqs in e1000_probe fails.
> (7) The same happens when ioremap in e1000_probe fails.
> (8) The same happens when e1000_sw_init in e1000_probe fails.
> (9) The same happens when register_netdev in e1000_probe fails.
>
> (10) When request_irq in e1000_request_irq fails, pm_qos_add_request in
> e1000_open has already been called, but pm_qos_remove_request is not
> called.
>
> I have also written a patch to fix these bugs. I have run the patched
> driver on the hardware; it works normally and fixes the bugs above.
Again, is this an issue you saw or a theoretical issue?
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 247335d..02d1e67 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -2444,6 +2444,8 @@ static void e1000_clean_tx_ring(struct e1000_ring *tx_ring)
> tx_ring->next_to_use = 0;
> tx_ring->next_to_clean = 0;
>
> + if(!(tx_ring->head))
> + return;
Need a space between the 'if' and the (). Please check your patches by
running checkpatch.pl on them before sending them out.
^ permalink raw reply
* Re: [linux-nics] [PATCH] e1000 in linux-3.18.0: a potential bug
From: Jeff Kirsher @ 2014-12-20 10:34 UTC (permalink / raw)
To: Jia-Ju Bai; +Cc: todd.fujinaka, netdev, e1000-devel, linux.nics
In-Reply-To: <000d01d01c29$a9edb870$fdc92950$@163.com>
[-- Attachment #1: Type: text/plain, Size: 936 bytes --]
On Sat, 2014-12-20 at 15:50 +0800, Jia-Ju Bai wrote:
> I have tested the e1000 driver on real hardware (Intel 82540EM PCI
> Gigabit Ethernet Controller) and found a potential bug.
> The target file is drivers/net/ethernet/intel/e1000/e1000_main.c, which
> is used to build e1000.ko.
>
> (1) In the normal path, netif_napi_add is called in e1000_probe, but
> netif_napi_del is not called in e1000_remove. Many other Ethernet card
> drivers, such as r8169 and igb, call them in pairs, even in the error
> handling paths.
>
> I have also written a patch to fix the bug. I have run the patched
> driver on the hardware; it works normally and fixes the bug above.
Was this a bug you actually saw? Or a theoretical bug based on code
review?
I do not mind adding this to my queue so that we can review and test the
patch, although this will cause a fair amount of regression testing.
^ permalink raw reply
* Re: [linux-nics] [PATCH] igb in linux-3.18.0: some potential bugs
From: Jeff Kirsher @ 2014-12-20 10:35 UTC (permalink / raw)
To: Jia-Ju Bai; +Cc: todd.fujinaka, netdev, e1000-devel, linux.nics
In-Reply-To: <001101d01c2c$8dcc0040$a96400c0$@163.com>
[-- Attachment #1: Type: text/plain, Size: 2066 bytes --]
On Sat, 2014-12-20 at 16:11 +0800, Jia-Ju Bai wrote:
> I have tested the igb driver on real hardware (Intel 82575EB PCI-E
> Gigabit Ethernet Controller) and found some potential bugs.
> The target file is drivers/net/ethernet/intel/igb/igb_main.c.
>
> (1) In the normal path of igb, pci_enable_pcie_error_reporting and
> pci_disable_pcie_error_reporting are called in pairs in igb_probe and
> igb_remove. However, when pci_enable_pcie_error_reporting has been
> called and alloc_etherdev_mqs in igb_probe fails, the
> "err_alloc_etherdev" path in igb_probe is executed immediately to exit,
> and pci_disable_pcie_error_reporting is not called.
> (2) The same happens when pci_iomap in igb_probe fails.
> (3) The same happens when igb_sw_init in igb_probe fails.
> (4) The same happens when register_netdev in igb_probe fails.
> (5) The same happens when igb_init_i2c in igb_probe fails.
>
> (6) kcalloc is called by igb_sw_init when initializing the Ethernet card
> driver, but kfree is not called when register_netdev in igb_probe fails,
> which may cause a memory leak.
> (7) The same happens when igb_init_i2c in igb_probe fails.
> (8) The same happens when kzalloc in igb_alloc_q_vector fails.
> (9) The same happens when igb_alloc_q_vector in igb_alloc_q_vectors
> fails.
>
> (10) When igb_init_i2c in igb_probe fails, igb_enable_sriov has already
> been called in igb_probe_vfs, but igb_disable_sriov is not called.
> (11) The same as (10) happens when register_netdev in igb_probe fails.
>
> I have also written a patch to fix these bugs. I have run the patched
> driver on the hardware; it works normally and fixes the bugs above.
Was this a bug you actually saw? Or a theoretical bug based on code
review?
I do not mind adding this to my queue so that we can review and test the
patch, although this will cause a fair amount of regression testing.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
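The pairing problem listed in the report above (something enabled or allocated early in probe, but not released on every failure path) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the igb code: fake_dev, fail_point, fake_probe, fake_remove and fake_balanced are hypothetical names, and the counters merely stand in for calls such as pci_enable_pcie_error_reporting()/pci_disable_pcie_error_reporting() and kcalloc()/kfree(). A failure is injected at a chosen step, roughly as the reporter describes doing, and the error labels undo the earlier steps in reverse order.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; the counters track call balance only. */
struct fake_dev {
	int aer_enabled;   /* enable calls minus disable calls */
	int mem_allocated; /* alloc calls minus free calls */
};

/* Step at which a failure is injected (hypothetical). */
enum fail_point { FAIL_NONE, FAIL_ALLOC, FAIL_REGISTER };

/* Probe-like function: each step that succeeded is undone on failure. */
int fake_probe(struct fake_dev *dev, enum fail_point fail)
{
	int err;

	dev->aer_enabled++;           /* stands in for the enable call */

	if (fail == FAIL_ALLOC) {     /* injected allocation failure */
		err = -ENOMEM;
		goto err_alloc;
	}
	dev->mem_allocated++;         /* stands in for the allocation */

	if (fail == FAIL_REGISTER) {  /* injected registration failure */
		err = -ENODEV;
		goto err_register;
	}
	return 0;

err_register:
	dev->mem_allocated--;         /* stands in for the free */
err_alloc:
	dev->aer_enabled--;           /* stands in for the disable call */
	return err;
}

/* Remove-like function: undoes everything a successful probe did. */
void fake_remove(struct fake_dev *dev)
{
	dev->mem_allocated--;
	dev->aer_enabled--;
}

/* True when every enable/alloc has been matched by a disable/free. */
bool fake_balanced(const struct fake_dev *dev)
{
	return dev->aer_enabled == 0 && dev->mem_allocated == 0;
}
```

Calling fake_probe() with each fail_point in turn and checking fake_balanced() afterwards is a small-scale version of the "let some functions fail on purpose and monitor the calls" approach; a missing decrement under one of the labels would show up as an unbalanced counter.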
* [PATCH] net: wireless: ipw2x00: ipw2200.c: Remove unused function
From: Rickard Strandqvist @ 2014-12-20 12:29 UTC (permalink / raw)
To: Stanislav Yakovlev, John W. Linville
Cc: Rickard Strandqvist, linux-wireless, netdev, linux-kernel
Remove the function ipw_alive() that is not used anywhere.
This was partially found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
---
drivers/net/wireless/ipw2x00/ipw2200.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/drivers/net/wireless/ipw2x00/ipw2200.c b/drivers/net/wireless/ipw2x00/ipw2200.c
index edc3443..2f830ca 100644
--- a/drivers/net/wireless/ipw2x00/ipw2200.c
+++ b/drivers/net/wireless/ipw2x00/ipw2200.c
@@ -3021,20 +3021,6 @@ static void ipw_remove_current_network(struct ipw_priv *priv)
spin_unlock_irqrestore(&priv->ieee->lock, flags);
}
-/**
- * Check that card is still alive.
- * Reads debug register from domain0.
- * If card is present, pre-defined value should
- * be found there.
- *
- * @param priv
- * @return 1 if card is present, 0 otherwise
- */
-static inline int ipw_alive(struct ipw_priv *priv)
-{
- return ipw_read32(priv, 0x90) == 0xd55555d5;
-}
^ permalink raw reply related
* [PATCH] net: ceph: ceph_strings.c: Remove unused function
From: Rickard Strandqvist @ 2014-12-20 12:34 UTC (permalink / raw)
To: Sage Weil, David S. Miller
Cc: Rickard Strandqvist, ceph-devel, netdev, linux-kernel
Remove the function ceph_pool_op_name() that is not used anywhere.
This was partially found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
---
include/linux/ceph/ceph_fs.h | 2 --
net/ceph/ceph_strings.c | 14 --------------
2 files changed, 16 deletions(-)
diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index 3c97d5e..0684f9e 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -191,8 +191,6 @@ struct ceph_mon_statfs_reply {
struct ceph_statfs st;
} __attribute__ ((packed));
-const char *ceph_pool_op_name(int op);
-
struct ceph_mon_poolop {
struct ceph_mon_request_header monhdr;
struct ceph_fsid fsid;
diff --git a/net/ceph/ceph_strings.c b/net/ceph/ceph_strings.c
index 3056020..139a9cb 100644
--- a/net/ceph/ceph_strings.c
+++ b/net/ceph/ceph_strings.c
@@ -42,17 +42,3 @@ const char *ceph_osd_state_name(int s)
return "???";
}
}
-
-const char *ceph_pool_op_name(int op)
-{
- switch (op) {
- case POOL_OP_CREATE: return "create";
- case POOL_OP_DELETE: return "delete";
- case POOL_OP_AUID_CHANGE: return "auid change";
- case POOL_OP_CREATE_SNAP: return "create snap";
- case POOL_OP_DELETE_SNAP: return "delete snap";
- case POOL_OP_CREATE_UNMANAGED_SNAP: return "create unmanaged snap";
- case POOL_OP_DELETE_UNMANAGED_SNAP: return "delete unmanaged snap";
- }
- return "???";
-}
--
1.7.10.4
^ permalink raw reply related
* Re: Re:Re: [linux-nics] [PATCH] e1000e in linux-3.18.0: some potential bugs
From: Jeff Kirsher @ 2014-12-20 12:47 UTC (permalink / raw)
To: 白家驹; +Cc: netdev, e1000-devel
In-Reply-To: <aaab622.119b.14a67ba656c.Coremail.baijiaju1990@163.com>
[-- Attachment #1: Type: text/plain, Size: 503 bytes --]
On Sat, 2014-12-20 at 20:44 +0800, 白家驹 wrote:
> Thank for the reply!
>
> For the first reply:
> I let some functions fail on purpose to test error handling code, and
> then run the driver in reality as well as monitor the function calls
> in runtime.
> The results are in my report.
>
> For the second reply:
> I admit you are right, and my code style need to be improved.
Adding netdev and e1000-devel mailing lists back onto the CC, since
白家驹 removed them in his reply.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* [PATCH] net: wireless: rtlwifi: rtl8192ee: trx.c: Remove unused function
From: Rickard Strandqvist @ 2014-12-20 12:48 UTC (permalink / raw)
To: Larry Finger, Chaoming Li
Cc: Rickard Strandqvist, John W. Linville, Greg Kroah-Hartman,
Rasmus Villemoes, Joe Perches, linux-wireless, netdev,
linux-kernel
Remove the function rtl92ee_get_available_desc() that is not used anywhere.
This was partially found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
---
drivers/net/wireless/rtlwifi/rtl8192ee/trx.c | 21 ---------------------
drivers/net/wireless/rtlwifi/rtl8192ee/trx.h | 1 -
2 files changed, 22 deletions(-)
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
index 2fcbef1..8186ed2 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
@@ -710,27 +710,6 @@ static u16 get_desc_addr_fr_q_idx(u16 queue_index)
return desc_address;
}
-void rtl92ee_get_available_desc(struct ieee80211_hw *hw, u8 q_idx)
-{
- struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
- struct rtl_priv *rtlpriv = rtl_priv(hw);
- u16 point_diff = 0;
- u16 current_tx_read_point = 0, current_tx_write_point = 0;
- u32 tmp_4byte;
-
- tmp_4byte = rtl_read_dword(rtlpriv,
- get_desc_addr_fr_q_idx(q_idx));
- current_tx_read_point = (u16)((tmp_4byte >> 16) & 0x0fff);
- current_tx_write_point = (u16)((tmp_4byte) & 0x0fff);
-
- point_diff = ((current_tx_read_point > current_tx_write_point) ?
- (current_tx_read_point - current_tx_write_point) :
- (TX_DESC_NUM_92E - current_tx_write_point +
- current_tx_read_point));
-
- rtlpci->tx_ring[q_idx].avl_desc = point_diff;
-}
-
void rtl92ee_pre_fill_tx_bd_desc(struct ieee80211_hw *hw,
u8 *tx_bd_desc, u8 *desc, u8 queue_index,
struct sk_buff *skb, dma_addr_t addr)
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h
index 6f9be1c..4426c49 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h
@@ -829,7 +829,6 @@ void rtl92ee_rx_check_dma_ok(struct ieee80211_hw *hw, u8 *header_desc,
u8 queue_index);
u16 rtl92ee_rx_desc_buff_remained_cnt(struct ieee80211_hw *hw,
u8 queue_index);
-void rtl92ee_get_available_desc(struct ieee80211_hw *hw, u8 queue_index);
void rtl92ee_pre_fill_tx_bd_desc(struct ieee80211_hw *hw,
u8 *tx_bd_desc, u8 *desc, u8 queue_index,
struct sk_buff *skb, dma_addr_t addr);
--
1.7.10.4
^ permalink raw reply related
* Re: Re:Re: [linux-nics] [PATCH] e1000 in linux-3.18.0: a potential bug
From: Jeff Kirsher @ 2014-12-20 12:49 UTC (permalink / raw)
To: 白家驹; +Cc: netdev, e1000-devel
In-Reply-To: <79ea17da.119c.14a67bcadd1.Coremail.baijiaju1990@163.com>
[-- Attachment #1: Type: text/plain, Size: 1389 bytes --]
On Sat, 2014-12-20 at 20:47 +0800, 白家驹 wrote:
> Thanks for the reply!
> I run the driver normally, and monitor all function calls in runtime,
> and then find this violation.
Adding netdev and e1000-devel back onto the CC since Jia-Ju Bai removed
them in his reply...
>
> At 2014-12-20 18:34:20,"Jeff Kirsher" <jeffrey.t.kirsher@intel.com> wrote:
> >On Sat, 2014-12-20 at 15:50 +0800, Jia-Ju Bai wrote:
> >> I have actually tested e1000 driver on the real hardware(Intel 82540EM
> >> PCI
> >> Gigabit Ethernet Controller), and find a potential bug:
> >> The target file is drivers/net/ethernet/intel/e1000/e1000_main.c,
> >> which is
> >> used to build e1000.ko.
> >>
> >> (1) In the normal process, netif_napi_add is called in e1000_probe,
> >> but
> >> netif_napi_del is not called in e1000_remove. However, many other
> >> ethernet
> >> card drivers call them in pairs, even in the error handling paths,
> >> such as
> >> r8169 and igb.
> >>
> >> Meanwhile, I also write the patch to fix the bug. I have run the patch
> >> on
> >> the hardware, it can work normally and fix the above bug.
> >
> >Was this a bug you actually saw? Or a theoretical bug based on code
> >review?
> >
> >I do not mind adding this to my queue so that we can review and test the
> >patch, although this will cause a fair amount of regression testing.
>
>
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* Re: Re:Re: [linux-nics] [PATCH]e100 in linux-3.18.0: some potential bugs
From: Jeff Kirsher @ 2014-12-20 12:52 UTC (permalink / raw)
To: 白家驹; +Cc: netdev, e1000-devel
In-Reply-To: <1e83d62.1197.14a67b6f11a.Coremail.baijiaju1990@163.com>
[-- Attachment #1: Type: text/plain, Size: 2870 bytes --]
On Sat, 2014-12-20 at 20:40 +0800, 白家驹 wrote:
> Thank for the reply!
>
> For the first reply:
> I let pci_pool_create fail on purpose to simulate insufficient memory,
> and then run the driver in reality, but the driver crashes.
>
> For the second reply:
> I admit you are right, and my code style need to be improved.
>
Adding netdev and e1000-devel back onto the CC, since Jia-Ju Bai
removed them in his reply.
>
> At 2014-12-20 18:18:09,"Jeff Kirsher" <jeffrey.t.kirsher@intel.com> wrote:
> >On Sat, 2014-12-20 at 15:40 +0800, Jia-Ju Bai wrote:
> >> I have actually tested e100 driver on the real hardware(Intel 82559
> >> PCI
> >> Ethernet Controller), and find some bugs:
> >> The target file is drivers/net/ethernet/intel/e100.c, which is used to
> >> build
> >> e100.ko.
> >>
> >> (1) The function pci_pool_create is called by e100_probe when
> >> initializing
> >> the ethernet card driver. But when pci_pool_create is failed, which
> >> means
> >> that it returns NULL to nic->cbs_pool, the system crash will happen.
> >> Because
> >> pci_pool_alloc (in e100_alloc_cbs in e100_up in e100_open) need to use
> >> nic->cbs_pool to allocate the resource, but it is NULL. I suggest that
> >> a
> >> check can be added in the code to detect whether pci_pool_create
> >> returns
> >> NULL.
> >> (2) In the normal process, netif_napi_add is called in e100_probe, but
> >> netif_napi_del is not called in e100_remove. However, many other
> >> ethernet
> >> card drivers call them in pairs, even in the error handling paths,
> >> such as
> >> r8169 and igb.
> >>
> >> Meanwhile, I also write the patch to fix the bugs. I have run the
> >> patch on
> >> the hardware, it can work normally and fix the above bugs.
> >
> >Did you actually experience an issue? Or is this a theoretical issue
> >that was never actually seen?
> >
> >>
> >> diff --git a/drivers/net/ethernet/intel/e100.c
> >> b/drivers/net/ethernet/intel/e100.c
> >> index 781065e..2631d3f 100644
> >> --- a/drivers/net/ethernet/intel/e100.c
> >> +++ b/drivers/net/ethernet/intel/e100.c
> >> @@ -2969,6 +2969,11 @@ static int e100_probe(struct pci_dev *pdev,
> >> const
> >> struct pci_device_id *ent)
> >> nic->params.cbs.max * sizeof(struct cb),
> >> sizeof(u32),
> >> 0);
> >> + if(!(nic->cbs_pool))
> >> + {
> >> + err = -ENOMEM;
> >> + goto err_out_pool;
> >> + }
> >> netif_info(nic, probe, nic->netdev,
> >
> >Minor nit-pick but your open bracket needs to be on the same line as the
> >if statement AND you need a space between the 'if' and (). So the above
> >code should look like:
> > if (!nic->cbs_pool) {
> > err = -ENOMEM;
> > goto err_out_pool;
> > }
>
>
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
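The NULL check discussed in the thread above can be sketched in userspace C. This is a minimal illustration, not the e100 code: sketch_nic, sketch_pool_create, sketch_nic_probe and sketch_nic_remove are hypothetical names, malloc() stands in for pci_pool_create(), and the simulate_oom flag plays the role of the deliberately forced failure the reporter mentions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private struct. */
struct sketch_nic {
	void *cbs_pool; /* plays the role of nic->cbs_pool */
};

/* Hypothetical pool constructor; returns NULL when told to fail. */
void *sketch_pool_create(bool simulate_oom)
{
	return simulate_oom ? NULL : malloc(64);
}

/* Probe-like function: refuse to continue without a pool. */
int sketch_nic_probe(struct sketch_nic *nic, bool simulate_oom)
{
	nic->cbs_pool = sketch_pool_create(simulate_oom);
	if (!nic->cbs_pool)
		return -ENOMEM; /* fail here instead of crashing later */
	return 0;
}

/* Remove-like function: release the pool again. */
void sketch_nic_remove(struct sketch_nic *nic)
{
	free(nic->cbs_pool);
	nic->cbs_pool = NULL;
}
```

The point of checking at probe time is that a later user of the pool (the report names pci_pool_alloc reached via e100_open) never sees a NULL pool, because the device was never set up in that case.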
* Re: [linux-nics] [PATCH] igb in linux-3.18.0: some potential bugs
From: Jia-Ju Bai @ 2014-12-20 12:56 UTC (permalink / raw)
To: 'Jeff Kirsher'; +Cc: e1000-devel, netdev, linux.nics
In-Reply-To: <1419071709.2461.86.camel@jtkirshe-mobl.home>
Thanks for the reply!
For the first reply:
I made some functions fail on purpose to test the error handling code, then ran the driver on real hardware while monitoring the function calls at runtime.
The results are in my report.
For the second reply:
I admit you are right, and my code style needs to be improved.
On Sat, 2014-12-20 at 16:11 +0800, Jia-Ju Bai wrote:
> I have actually tested igb driver on the real hardware(Intel 82575EB
> PCI-E Gigabit Ethernet Controller), and find some potential bugs:
> The target file is drivers/net/ethernet/intel/igb/igb_main.c
>
> (1) In the normal process of igb, pci_enable_pcie_error_reporting and
> pci_disable_pcie_error_reporting is called in pairs in igb_probe and
> igb_remove. However, when pci_enable_pcie_error_reporting has been
> called and alloc_etherdev_mqs in igb_probe is failed,
> "err_alloc_etherdev"
> segment
> in igb_probe is executed immediately to exit, but
> pci_disable_pcie_error_reporting is not called.
> (2) The same situation happens when pci_iomap in igb_probe is failed.
> (3) The same situation happens when igb_sw_init in igb_probe is
> failed.
> (4) The same situation happens when register_netdev in igb_probe is
> failed.
> (5) The same situation happens when igb_init_i2c in igb_probe is
> failed.
>
> (6) The function kcalloc is called by igb_sw_init when initializing
> the ethernet card driver, but kfree is not called when register_netdev
> in igb_probe is failed, which may cause memory leak.
> (7) The same situation happens when igb_init_i2c in igb_probe is
> failed.
> (8) The same situation happens when kzalloc in igb_alloc_q_vector is
> failed.
> (9) The same situation happens when igb_alloc_q_vector in
> igb_alloc_q_vectors is failed.
>
> (10) When igb_init_i2c in igb_probe is failed, igb_enable_sriov is
> called in igb_probe_vfs, but igb_disable_sriov is not called.
> (11) The same situation with [10] happens when register_netdev in
> igb_probe is failed.
>
> Meanwhile, I also write the patch to fix the bugs. I have run the
> patch on the hardware, it can work normally and fix the above bugs.
Was this a bug you actually saw? Or a theoretical bug based on code review?
I do not mind adding this to my queue so that we can review and test the patch, although this will cause a fair amount of regression testing.
------------------------------------------------------------------------------
Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
from Actuate! Instantly Supercharge Your Business Reports and Dashboards
with Interactivity, Sharing, Native Excel Exports, App Integration & more
Get technology previously reserved for billion-dollar corporations, FREE
http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Stable fixes for batman-adv
From: Sven Eckelmann @ 2014-12-20 12:48 UTC (permalink / raw)
To: davem; +Cc: netdev
Hi,
it seems that patches aren't forwarded anymore (since August?) from batman-adv
to the netdev mailing list. Please correct me if I am wrong.
I would hereby try to send some patches directly to the netdev mailing list
instead of waiting any longer for the patches to be forwarded. There are more
non-feature patches [1] waiting in the batman-adv repo (everything after
"batman-adv: fix alignment" from 2014-05-15) but these don't seem to
be related to crashes.
At least the first patch caused crashes in real world scenarios [2] and could
also be used to crash a mesh node on-demand. The last patch has its bug report
in the linux bug tracker [3]. The second patch is a wrong size calculation but
no problem was yet observed in the wild.
All patches fix problems which were introduced in Linux 3.13.
Kind regards,
Sven
[1] http://git.open-mesh.org/batman-adv.git/shortlog/refs/heads/maint
[2] https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2014-November/012561.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=84061
^ permalink raw reply
* [PATCH 2/3] batman-adv: Unify fragment size calculation
From: Sven Eckelmann @ 2014-12-20 12:48 UTC (permalink / raw)
To: davem; +Cc: netdev, Sven Eckelmann
In-Reply-To: <1419079737-31107-1-git-send-email-sven@narfation.org>
The fragmentation code was replaced in 610bfc6bc99bc83680d190ebc69359a05fc7f605
("batman-adv: Receive fragmented packets and merge") by an implementation which
can handle up to 16 fragments of a packet. The packet is prepared for the split
in fragments by the function batadv_frag_send_packet and the actual split is
done by batadv_frag_create.
Both functions calculate the size of a fragment themselves. But their calculation
differs because batadv_frag_send_packet also subtracts ETH_HLEN. Therefore,
the check in batadv_frag_send_packet "can a full fragment be created?" may
return true even when batadv_frag_create cannot create a full fragment.
The function batadv_frag_create doesn't check the size of the skb before
splitting it and therefore might try to create a larger fragment than the
remaining buffer. This creates an integer underflow and an invalid len is given
to skb_split.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
---
Problem is in the kernel since v3.13 and may be important for the stable tree.
net/batman-adv/fragmentation.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index 8af3461..00f9e14 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -434,7 +434,7 @@ bool batadv_frag_send_packet(struct sk_buff *skb,
* fragments larger than BATADV_FRAG_MAX_FRAG_SIZE
*/
mtu = min_t(unsigned, mtu, BATADV_FRAG_MAX_FRAG_SIZE);
- max_fragment_size = (mtu - header_size - ETH_HLEN);
+ max_fragment_size = mtu - header_size;
max_packet_size = max_fragment_size * BATADV_FRAG_MAX_FRAGMENTS;
/* Don't even try to fragment, if we need more than 16 fragments */
--
2.1.4
^ permalink raw reply related
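The size mismatch described in the patch above can be illustrated with a few lines of userspace C. This is a sketch under assumptions, not the batman-adv code: the function names are hypothetical, the 20-byte fragment header is an assumed value, and the mtu and packet lengths used with it are made up for the example. It only shows how a length that passes a check based on the smaller size can be shorter than the size the splitting side subtracts, so that the unsigned subtraction wraps around.

```c
#include <stdbool.h>

#define ETH_HLEN_SKETCH 14u /* Ethernet header length */
#define FRAG_HDR_SKETCH 20u /* assumed fragment header size */

/* Fragment size as computed by the sending side before the fix. */
unsigned int sender_size_old(unsigned int mtu)
{
	return mtu - FRAG_HDR_SKETCH - ETH_HLEN_SKETCH;
}

/* Fragment size as computed by the side that performs the split. */
unsigned int creator_size(unsigned int mtu)
{
	return mtu - FRAG_HDR_SKETCH;
}

/* Split position: remaining length minus one fragment, in unsigned math. */
unsigned int split_offset(unsigned int len, unsigned int mtu)
{
	return len - creator_size(mtu);
}

/*
 * True when a check against sender_size says "one more full fragment
 * fits" although len is smaller than what the creator will subtract.
 */
bool would_underflow(unsigned int len, unsigned int sender_size,
		     unsigned int mtu)
{
	return len > sender_size && len < creator_size(mtu);
}
```

With a hypothetical mtu of 1400, the two sizes are 1366 and 1380, so any remaining length between them passes the old check while split_offset() wraps to a very large value. Using the same size on both sides, as the patch does, closes that window.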
* [PATCH 3/3] batman-adv: avoid NULL dereferences and fix if check
From: Sven Eckelmann @ 2014-12-20 12:48 UTC (permalink / raw)
To: davem; +Cc: netdev, Antonio Quartulli, Marek Lindner
In-Reply-To: <1419079737-31107-1-git-send-email-sven@narfation.org>
From: Antonio Quartulli <antonio@meshcoding.com>
Gateways having bandwidth_down equal to zero are not accepted
at all and so are never added to the Gateway list.
For this reason checking the bandwidth_down member in
batadv_gw_out_of_range() is useless.
This is probably a copy/paste error and this check was supposed
to be "!gw_node" only. Moreover, the way the check is written
now may also lead to a NULL dereference.
Fix this by rewriting the if-condition properly.
Introduced by 414254e342a0d58144de40c3da777521ebaeeb07
("batman-adv: tvlv - gateway download/upload bandwidth container")
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
---
Problem is in the kernel since v3.13 and may be important for the stable tree.
net/batman-adv/gateway_client.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/gateway_client.c b/net/batman-adv/gateway_client.c
index 90cff58..e0bcf9e 100644
--- a/net/batman-adv/gateway_client.c
+++ b/net/batman-adv/gateway_client.c
@@ -810,7 +810,7 @@ bool batadv_gw_out_of_range(struct batadv_priv *bat_priv,
goto out;
gw_node = batadv_gw_node_get(bat_priv, orig_dst_node);
- if (!gw_node->bandwidth_down == 0)
+ if (!gw_node)
goto out;
switch (atomic_read(&bat_priv->gw_mode)) {
--
2.1.4
^ permalink raw reply related
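The if-condition replaced by the patch above is easy to misread, so here is a small userspace sketch of how C evaluates it. The struct and function names are hypothetical; only the shape of the two conditions is taken from the diff. Unary ! binds tighter than ==, so "!p->bandwidth_down == 0" means "(!p->bandwidth_down) == 0", which is true for any non-zero bandwidth and also dereferences p before anything has checked it for NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the gateway node. */
struct gw_node_sketch {
	unsigned int bandwidth_down;
};

/*
 * Old condition, written with the parentheses the compiler implies.
 * It dereferences p, so it must not be called with a NULL pointer.
 */
bool old_condition(const struct gw_node_sketch *p)
{
	return (!p->bandwidth_down) == 0;
}

/* New condition: only tests whether a node was found at all. */
bool new_condition(const struct gw_node_sketch *p)
{
	return !p;
}
```

In this sketch old_condition() returns true for every node with a non-zero bandwidth_down, while new_condition() is true only for a NULL pointer, which is the case the commit message says the check was meant to catch.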
* [PATCH 1/3] batman-adv: Calculate extra tail size based on queued fragments
From: Sven Eckelmann @ 2014-12-20 12:48 UTC (permalink / raw)
To: davem; +Cc: netdev, Sven Eckelmann
In-Reply-To: <1419079737-31107-1-git-send-email-sven@narfation.org>
The fragmentation code was replaced in 610bfc6bc99bc83680d190ebc69359a05fc7f605
("batman-adv: Receive fragmented packets and merge"). The new code provided a
mostly unused parameter skb for the merging function. It is used inside the
function to calculate the additionally needed skb tailroom. But instead of
increasing its own tailroom, it is only increasing the tailroom of the first
queued skb. This is not correct in some situations because the first queued
entry can be a different one than the parameter.
An observed problem was:
1. packet with size 104, total_size 1464, fragno 1 was received
- packet is queued
2. packet with size 1400, total_size 1464, fragno 0 was received
- packet is queued at the end of the list
3. enough data was received and can be given to the merge function
(1464 == (1400 - 20) + (104 - 20))
- merge function gets the 1400 byte large packet as skb argument
4. merge function gets first entry in queue (104 byte)
- stored as skb_out
5. merge function calculates the required extra tail as total_size - skb->len
- pskb_expand_head tail of skb_out with 64 bytes
6. merge function tries to squeeze the extra 1380 bytes from the second queued
skb (1400 byte aka skb parameter) in the 64 extra tail bytes of skb_out
Instead, calculate the extra required tail bytes for skb_out using skb_out
itself rather than the parameter skb. The skb parameter is only used to get the
total_size from the last received packet. This is also the total_size used to
decide that all fragments were received.
Reported-by: Philipp Psurek <philipp.psurek@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Martin Hundebøll <martin@hundeboll.net>
---
Problem is in the kernel since v3.13 and may be important for the stable tree.
net/batman-adv/fragmentation.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index fc1835c..8af3461 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -251,7 +251,7 @@ batadv_frag_merge_packets(struct hlist_head *chain, struct sk_buff *skb)
kfree(entry);
/* Make room for the rest of the fragments. */
- if (pskb_expand_head(skb_out, 0, size - skb->len, GFP_ATOMIC) < 0) {
+ if (pskb_expand_head(skb_out, 0, size - skb_out->len, GFP_ATOMIC) < 0) {
kfree_skb(skb_out);
skb_out = NULL;
goto free;
--
2.1.4
^ permalink raw reply related
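The arithmetic in the observed problem from the patch above can be reproduced with two small helper functions. This is only a worked example in userspace C: frag_payload and extra_tail are hypothetical names, and the 20-byte header is taken from the "- 20" terms in the commit message. The helpers compute the payload of each fragment and the extra tail that would be requested when the subtraction uses one skb length or the other.

```c
#define FRAG_HDR_LEN_SKETCH 20 /* header size implied by the "- 20" terms */

/* Payload bytes carried by a fragment of the given total length. */
int frag_payload(int frag_len)
{
	return frag_len - FRAG_HDR_LEN_SKETCH;
}

/*
 * Extra tail requested when expanding the output skb: total_size minus
 * the length of whichever skb is used as the reference.
 */
int extra_tail(int total_size, int ref_len)
{
	return total_size - ref_len;
}
```

With the numbers from the commit message, frag_payload(1400) + frag_payload(104) is 1464, extra_tail(1464, 1400) is 64 (the old expression, based on the skb parameter) and extra_tail(1464, 104) is 1360 (the new expression, based on the first queued entry). The 64 bytes are far short of the 1380 payload bytes of the second fragment, which matches step 6 of the description.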
* Re: [PATCH] igb in linux-3.18.0: some potential bugs
From: Jia-Ju Bai @ 2014-12-20 12:59 UTC (permalink / raw)
To: 'Jeff Kirsher'; +Cc: e1000-devel, netdev, linux.nics
Thanks for the reply!
For the first reply:
I made some functions fail on purpose to test the error handling code, then ran the driver on real hardware while monitoring the function calls at runtime.
The results are in my report.
For the second reply:
I admit you are right, and my code style needs to be improved.
On Sat, 2014-12-20 at 16:11 +0800, Jia-Ju Bai wrote:
> I have actually tested igb driver on the real hardware(Intel 82575EB
> PCI-E Gigabit Ethernet Controller), and find some potential bugs:
> The target file is drivers/net/ethernet/intel/igb/igb_main.c
>
> (1) In the normal process of igb, pci_enable_pcie_error_reporting and
> pci_disable_pcie_error_reporting is called in pairs in igb_probe and
> igb_remove. However, when pci_enable_pcie_error_reporting has been
> called and alloc_etherdev_mqs in igb_probe is failed,
> "err_alloc_etherdev"
> segment
> in igb_probe is executed immediately to exit, but
> pci_disable_pcie_error_reporting is not called.
> (2) The same situation happens when pci_iomap in igb_probe is failed.
> (3) The same situation happens when igb_sw_init in igb_probe is
> failed.
> (4) The same situation happens when register_netdev in igb_probe is
> failed.
> (5) The same situation happens when igb_init_i2c in igb_probe is
> failed.
>
> (6) The function kcalloc is called by igb_sw_init when initializing
> the ethernet card driver, but kfree is not called when register_netdev
> in igb_probe is failed, which may cause memory leak.
> (7) The same situation happens when igb_init_i2c in igb_probe is
> failed.
> (8) The same situation happens when kzalloc in igb_alloc_q_vector is
> failed.
> (9) The same situation happens when igb_alloc_q_vector in
> igb_alloc_q_vectors is failed.
>
> (10) When igb_init_i2c in igb_probe is failed, igb_enable_sriov is
> called in igb_probe_vfs, but igb_disable_sriov is not called.
> (11) The same situation with [10] happens when register_netdev in
> igb_probe is failed.
>
> Meanwhile, I also write the patch to fix the bugs. I have run the
> patch on the hardware, it can work normally and fix the above bugs.
>Was this a bug you actually saw? Or a theoretical bug based on code review?
>I do not mind adding this to my queue so that we can review and test the patch, although this will cause a fair amount of regression testing.
^ permalink raw reply