3c59x: pci_unmap_single() oops

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* 3c59x: pci_unmap_single() oops
@ 2017-12-29 14:51 tedheadster
  2017-12-29 16:40 ` [PATCH] 3c59x: fix missing dma_mapping_error check Neil Horman
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: tedheadster @ 2017-12-29 14:51 UTC (permalink / raw)
  To: netdev, nhorman

In the 4.15.0-rc5 kernel (and likely earlier) I get the following oops.

3c59x 0000:00:0c.0 enp0s12: renamed from eth0
enp0s12:  setting half-duplex.
------------[ cut here ]------------
3c59x 0000:00:0c.0: DMA-API: device driver failed to check map
error[device address=0x0000000009e1b040] [size=1536 bytes] [mapped as
single]
WARNING: CPU: 0 PID: 1 at check_unmap+0x559/0x695
Modules linked in: ohci_pci ohci_hcd ehci_pci ehci_hcd usbcore pcspkr
serio_raw 3c59x mii usb_common ipv6
CPU: 0 PID: 1 Comm: systemd Not tainted 4.15.0-rc5.i486 #10
EIP: check_unmap+0x559/0x695
EFLAGS: 00010096 CPU: 0
EAX: 0000008c EBX: cb8a8660 ECX: c0881544 EDX: 00000001
ESI: cb8e5280 EDI: c06e8b9f EBP: cb821e50 ESP: cb821df8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 004f1700 CR3: 0b2f9000 CR4: 00000000
Call Trace:
 ? mntput_no_expire+0x13/0x105
 debug_dma_unmap_page+0x61/0x69
 pci_unmap_single+0x4c/0x56 [3c59x]
 boomerang_rx+0x250/0x42a [3c59x]
 boomerang_interrupt+0xde/0x3ea [3c59x]
 __handle_irq_event_percpu+0x2a/0xaf
 handle_irq_event_percpu+0x17/0x3d
 handle_irq_event+0x22/0x3b
 handle_level_irq+0x55/0x7a
 handle_irq+0x4f/0x58
 do_IRQ+0x35/0x95
 common_interrupt+0x34/0x40
EIP: 0xb7ae6970
EFLAGS: 00000246 CPU: 0
EAX: b7e1c3d8 EBX: b7eef344 ECX: 00000000 EDX: 007048c4
ESI: 00000013 EDI: 007048c4 EBP: bf8d81c8 ESP: bf8d8094
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
Code: 01 00 00 8b 58 08 e9 4a 01 00 00 bb 1b 2f 70 c0 89 d8 57 ff 75
e4 ff 75 e0 ff 75 dc ff 75 d8 53 50 68 bd 16 71 c0 e8 1f 3c e1 ff <0f>
ff 83 c4 20 83 3d 44 d5 7e c0 00 75 0f a1 f0 e6 7b c0 85 c0
---[ end trace 8b519628d8703199 ]---

This may relate to "3c59x: Add dma error checking and recovery"

- Matthew Whitehead

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] 3c59x: fix missing dma_mapping_error check
  2017-12-29 14:51 3c59x: pci_unmap_single() oops tedheadster
@ 2017-12-29 16:40 ` Neil Horman
  2018-01-03  2:48   ` David Miller
  2018-01-03 14:44 ` [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic Neil Horman
  2018-01-03 18:09 ` [PATCHv3] " Neil Horman
  2 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2017-12-29 16:40 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Neil Horman, Steffen Klassert, David S. Miller

A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
WARN_ONS to trigger.  Clean those up.

Signed-off-by: Neil Horman <nhorman@redhat.com>
CC: Steffen Klassert <klassert@mathematik.tu-chemnitz.de>
CC: "David S. Miller" <davem@davemloft.net>
Reported-by: tedheadster@gmail.com
---
 drivers/net/ethernet/3com/3c59x.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index f4e13a7014bd..6be9212f9093 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -1729,6 +1729,7 @@ vortex_open(struct net_device *dev)
 	struct vortex_private *vp = netdev_priv(dev);
 	int i;
 	int retval;
+	dma_addr_t dma;
 
 	/* Use the now-standard shared IRQ implementation. */
 	if ((retval = request_irq(dev->irq, vp->full_bus_master_rx ?
@@ -1753,7 +1754,11 @@ vortex_open(struct net_device *dev)
 				break;			/* Bad news!  */
 
 			skb_reserve(skb, NET_IP_ALIGN);	/* Align IP on 16 byte boundaries */
-			vp->rx_ring[i].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
+			dma = pci_map_single(VORTEX_PCI(vp), skb->data,
+					     PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+			if (dma_mapping_error(&VORTEX_PCI(vp)->dev, dma))
+				break;
+			vp->rx_ring[i].addr = cpu_to_le32(dma);
 		}
 		if (i != RX_RING_SIZE) {
 			pr_emerg("%s: no memory for rx ring\n", dev->name);
@@ -2067,6 +2072,9 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		int len = (skb->len + 3) & ~3;
 		vp->tx_skb_dma = pci_map_single(VORTEX_PCI(vp), skb->data, len,
 						PCI_DMA_TODEVICE);
+		if (dma_mapping_error(&VORTEX_PCI(vp)->dev, vp->tx_skb_dma))
+			return NETDEV_TX_OK;
+
 		spin_lock_irq(&vp->window_lock);
 		window_set(vp, 7);
 		iowrite32(vp->tx_skb_dma, ioaddr + Wn7_MasterAddr);
@@ -2594,6 +2602,7 @@ boomerang_rx(struct net_device *dev)
 	void __iomem *ioaddr = vp->ioaddr;
 	int rx_status;
 	int rx_work_limit = vp->dirty_rx + RX_RING_SIZE - vp->cur_rx;
+	dma_addr_t dma;
 
 	if (vortex_debug > 5)
 		pr_debug("boomerang_rx(): status %4.4x\n", ioread16(ioaddr+EL3_STATUS));
@@ -2673,7 +2682,11 @@ boomerang_rx(struct net_device *dev)
 				break;			/* Bad news!  */
 			}
 
-			vp->rx_ring[entry].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
+			dma = pci_map_single(VORTEX_PCI(vp), skb->data,
+					     PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+			if (dma_mapping_error(&VORTEX_PCI(vp)->dev, dma))
+				break;
+			vp->rx_ring[entry].addr = cpu_to_le32(dma);
 			vp->rx_skbuff[entry] = skb;
 		}
 		vp->rx_ring[entry].status = 0;	/* Clear complete bit. */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] 3c59x: fix missing dma_mapping_error check
  2017-12-29 16:40 ` [PATCH] 3c59x: fix missing dma_mapping_error check Neil Horman
@ 2018-01-03  2:48   ` David Miller
  2018-01-03 10:42     ` Neil Horman
  0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2018-01-03  2:48 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, nhorman, klassert

From: Neil Horman <nhorman@tuxdriver.com>
Date: Fri, 29 Dec 2017 11:40:10 -0500

> @@ -2067,6 +2072,9 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  		int len = (skb->len + 3) & ~3;
>  		vp->tx_skb_dma = pci_map_single(VORTEX_PCI(vp), skb->data, len,
>  						PCI_DMA_TODEVICE);
> +		if (dma_mapping_error(&VORTEX_PCI(vp)->dev, vp->tx_skb_dma))
> +			return NETDEV_TX_OK;
> +

This leaks the SKB, right?

And for the RX cases, it allows the RX ring to deplete to empty which
tends to hang most chips.  You need to make the DMA failure detection
early and recycle the RX buffer back to the chip instead of passing
it up to the stack.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] 3c59x: fix missing dma_mapping_error check
  2018-01-03  2:48   ` David Miller
@ 2018-01-03 10:42     ` Neil Horman
  2018-01-03 14:53       ` David Miller
  0 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2018-01-03 10:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, nhorman, klassert

On Tue, Jan 02, 2018 at 09:48:27PM -0500, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Fri, 29 Dec 2017 11:40:10 -0500
> 
> > @@ -2067,6 +2072,9 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  		int len = (skb->len + 3) & ~3;
> >  		vp->tx_skb_dma = pci_map_single(VORTEX_PCI(vp), skb->data, len,
> >  						PCI_DMA_TODEVICE);
> > +		if (dma_mapping_error(&VORTEX_PCI(vp)->dev, vp->tx_skb_dma))
> > +			return NETDEV_TX_OK;
> > +
> 
> This leaks the SKB, right?
> 
Crud, I think you're right, I'll respin this to address today.

> And for the RX cases, it allows the RX ring to deplete to empty which
> tends to hang most chips.  You need to make the DMA failure detection
> early and recycle the RX buffer back to the chip instead of passing
> it up to the stack.
> 
Strictly speaking, I think we're ok here, because the dirty_rx counter creates a
contiguous area to refill, and we will just pick up where we left off on the
next napi poll.

That said, it can still depelete if we get several consecutive
dma_mapping_errors in a sufficiently long period of time that the ring doesn't
ever refill, so yeah, recycling will be better.  That will take a bigger rewrite
of this code.  I'll post something asap.

Neil

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] 3c59x: fix missing dma_mapping_error check
  2018-01-03 10:42     ` Neil Horman
@ 2018-01-03 14:53       ` David Miller
  0 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2018-01-03 14:53 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, nhorman, klassert

From: Neil Horman <nhorman@tuxdriver.com>
Date: Wed, 3 Jan 2018 05:42:04 -0500

> On Tue, Jan 02, 2018 at 09:48:27PM -0500, David Miller wrote:
>> And for the RX cases, it allows the RX ring to deplete to empty which
>> tends to hang most chips.  You need to make the DMA failure detection
>> early and recycle the RX buffer back to the chip instead of passing
>> it up to the stack.
>> 
> Strictly speaking, I think we're ok here, because the dirty_rx counter creates a
> contiguous area to refill, and we will just pick up where we left off on the
> next napi poll.

If you continually fail the mappings, even NAPI poll, eventually the
RX ring will empty.

I don't think we're ok here.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2017-12-29 14:51 3c59x: pci_unmap_single() oops tedheadster
  2017-12-29 16:40 ` [PATCH] 3c59x: fix missing dma_mapping_error check Neil Horman
@ 2018-01-03 14:44 ` Neil Horman
  2018-01-03 14:58   ` David Miller
  2018-01-03 18:09 ` [PATCHv3] " Neil Horman
  2 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2018-01-03 14:44 UTC (permalink / raw)
  To: netdev
  Cc: tedheadster, Neil Horman, Neil Horman, Steffen Klassert,
	David S. Miller

A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
refill code a bit so that if skb allocation or dma mapping fails, we
recycle the existing buffer.  This prevents holes in the rx ring, and
makes for much simpler logic

Note: This is compile only tested.  Ted, if you could run this and
confirm that it continues to work properly, I would appreciate it, as I
currently don't have access to this hardware

Signed-off-by: Neil Horman <nhorman@redhat.com>
CC: Steffen Klassert <klassert@mathematik.tu-chemnitz.de>
CC: "David S. Miller" <davem@davemloft.net>
Reported-by: tedheadster@gmail.com

---
Change notes:

v2)
* Fixed tx path to free skb on mapping error
* Refactored rx path to recycle skbs on allocation/mapping error
* Used refactoring to remove oom timer and dirty_rx index
---
 drivers/net/ethernet/3com/3c59x.c | 91 +++++++++++++++++----------------------
 1 file changed, 39 insertions(+), 52 deletions(-)

diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index f4e13a7014bd..2928c9a4b477 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -602,7 +602,7 @@ struct vortex_private {
 	struct sk_buff* rx_skbuff[RX_RING_SIZE];
 	struct sk_buff* tx_skbuff[TX_RING_SIZE];
 	unsigned int cur_rx, cur_tx;		/* The next free ring entry */
-	unsigned int dirty_rx, dirty_tx;	/* The ring entries to be free()ed. */
+	unsigned int dirty_tx;	/* The ring entries to be free()ed. */
 	struct vortex_extra_stats xstats;	/* NIC-specific extra stats */
 	struct sk_buff *tx_skb;				/* Packet being eaten by bus master ctrl.  */
 	dma_addr_t tx_skb_dma;				/* Allocated DMA address for bus master ctrl DMA.   */
@@ -618,7 +618,6 @@ struct vortex_private {
 
 	/* The remainder are related to chip state, mostly media selection. */
 	struct timer_list timer;			/* Media selection timer. */
-	struct timer_list rx_oom_timer;		/* Rx skb allocation retry timer */
 	int options;						/* User-settable misc. driver options. */
 	unsigned int media_override:4, 		/* Passed-in media type. */
 		default_media:4,				/* Read from the EEPROM/Wn3_Config. */
@@ -760,7 +759,6 @@ static void mdio_sync(struct vortex_private *vp, int bits);
 static int mdio_read(struct net_device *dev, int phy_id, int location);
 static void mdio_write(struct net_device *vp, int phy_id, int location, int value);
 static void vortex_timer(struct timer_list *t);
-static void rx_oom_timer(struct timer_list *t);
 static netdev_tx_t vortex_start_xmit(struct sk_buff *skb,
 				     struct net_device *dev);
 static netdev_tx_t boomerang_start_xmit(struct sk_buff *skb,
@@ -1601,7 +1599,6 @@ vortex_up(struct net_device *dev)
 
 	timer_setup(&vp->timer, vortex_timer, 0);
 	mod_timer(&vp->timer, RUN_AT(media_tbl[dev->if_port].wait));
-	timer_setup(&vp->rx_oom_timer, rx_oom_timer, 0);
 
 	if (vortex_debug > 1)
 		pr_debug("%s: Initial media type %s.\n",
@@ -1676,7 +1673,7 @@ vortex_up(struct net_device *dev)
 	window_write16(vp, 0x0040, 4, Wn4_NetDiag);
 
 	if (vp->full_bus_master_rx) { /* Boomerang bus master. */
-		vp->cur_rx = vp->dirty_rx = 0;
+		vp->cur_rx = 0;
 		/* Initialize the RxEarly register as recommended. */
 		iowrite16(SetRxThreshold + (1536>>2), ioaddr + EL3_CMD);
 		iowrite32(0x0020, ioaddr + PktStatus);
@@ -1729,6 +1726,7 @@ vortex_open(struct net_device *dev)
 	struct vortex_private *vp = netdev_priv(dev);
 	int i;
 	int retval;
+	dma_addr_t dma;
 
 	/* Use the now-standard shared IRQ implementation. */
 	if ((retval = request_irq(dev->irq, vp->full_bus_master_rx ?
@@ -1753,7 +1751,11 @@ vortex_open(struct net_device *dev)
 				break;			/* Bad news!  */
 
 			skb_reserve(skb, NET_IP_ALIGN);	/* Align IP on 16 byte boundaries */
-			vp->rx_ring[i].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
+			dma = pci_map_single(VORTEX_PCI(vp), skb->data,
+					     PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+			if (dma_mapping_error(&VORTEX_PCI(vp)->dev, dma))
+				break;
+			vp->rx_ring[i].addr = cpu_to_le32(dma);
 		}
 		if (i != RX_RING_SIZE) {
 			pr_emerg("%s: no memory for rx ring\n", dev->name);
@@ -2067,6 +2069,12 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		int len = (skb->len + 3) & ~3;
 		vp->tx_skb_dma = pci_map_single(VORTEX_PCI(vp), skb->data, len,
 						PCI_DMA_TODEVICE);
+		if (dma_mapping_error(&VORTEX_PCI(vp)->dev, vp->tx_skb_dma)) {
+			dev_kfree_skb_any(skb);
+			dev->stats.tx_dropped++;
+			return NETDEV_TX_OK;
+		}
+
 		spin_lock_irq(&vp->window_lock);
 		window_set(vp, 7);
 		iowrite32(vp->tx_skb_dma, ioaddr + Wn7_MasterAddr);
@@ -2593,7 +2601,8 @@ boomerang_rx(struct net_device *dev)
 	int entry = vp->cur_rx % RX_RING_SIZE;
 	void __iomem *ioaddr = vp->ioaddr;
 	int rx_status;
-	int rx_work_limit = vp->dirty_rx + RX_RING_SIZE - vp->cur_rx;
+	int rx_work_limit = RX_RING_SIZE;
+	dma_addr_t dma;
 
 	if (vortex_debug > 5)
 		pr_debug("boomerang_rx(): status %4.4x\n", ioread16(ioaddr+EL3_STATUS));
@@ -2614,7 +2623,8 @@ boomerang_rx(struct net_device *dev)
 		} else {
 			/* The packet length: up to 4.5K!. */
 			int pkt_len = rx_status & 0x1fff;
-			struct sk_buff *skb;
+			struct sk_buff *skb, *newskb;
+			dma_addr_t newdma;
 			dma_addr_t dma = le32_to_cpu(vp->rx_ring[entry].addr);
 
 			if (vortex_debug > 4)
@@ -2633,9 +2643,27 @@ boomerang_rx(struct net_device *dev)
 				pci_dma_sync_single_for_device(VORTEX_PCI(vp), dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
 				vp->rx_copy++;
 			} else {
+				/* Pre-allocate the replacement skb.  If it or its
+				 * mapping fails then recycle the buffer thats already
+				 * in place
+				 */
+				newskb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
+				if (!newskb) {
+					dev->stats.rx_dropped++;
+					goto clear_complete;
+				}
+				newdma = pci_map_single(VORTEX_PCI(vp), newskb->data,
+							PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+				if (dma_mapping_error(&VORTEX_PCI(vp)->dev, newdma)) {
+					dev->stats.rx_dropped++;
+					consume_skb(newskb);
+					goto clear_complete;
+				}
+
 				/* Pass up the skbuff already on the Rx ring. */
 				skb = vp->rx_skbuff[entry];
-				vp->rx_skbuff[entry] = NULL;
+				vp->rx_skbuff[entry] = newskb;
+				vp->rx_ring[entry].addr = cpu_to_le32(newdma);
 				skb_put(skb, pkt_len);
 				pci_unmap_single(VORTEX_PCI(vp), dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
 				vp->rx_nocopy++;
@@ -2653,55 +2681,15 @@ boomerang_rx(struct net_device *dev)
 			netif_rx(skb);
 			dev->stats.rx_packets++;
 		}
-		entry = (++vp->cur_rx) % RX_RING_SIZE;
-	}
-	/* Refill the Rx ring buffers. */
-	for (; vp->cur_rx - vp->dirty_rx > 0; vp->dirty_rx++) {
-		struct sk_buff *skb;
-		entry = vp->dirty_rx % RX_RING_SIZE;
-		if (vp->rx_skbuff[entry] == NULL) {
-			skb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
-			if (skb == NULL) {
-				static unsigned long last_jif;
-				if (time_after(jiffies, last_jif + 10 * HZ)) {
-					pr_warn("%s: memory shortage\n",
-						dev->name);
-					last_jif = jiffies;
-				}
-				if ((vp->cur_rx - vp->dirty_rx) == RX_RING_SIZE)
-					mod_timer(&vp->rx_oom_timer, RUN_AT(HZ * 1));
-				break;			/* Bad news!  */
-			}
 
-			vp->rx_ring[entry].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
-			vp->rx_skbuff[entry] = skb;
-		}
+clear_complete:
 		vp->rx_ring[entry].status = 0;	/* Clear complete bit. */
 		iowrite16(UpUnstall, ioaddr + EL3_CMD);
+		entry = (++vp->cur_rx) % RX_RING_SIZE;
 	}
 	return 0;
 }
 
-/*
- * If we've hit a total OOM refilling the Rx ring we poll once a second
- * for some memory.  Otherwise there is no way to restart the rx process.
- */
-static void
-rx_oom_timer(struct timer_list *t)
-{
-	struct vortex_private *vp = from_timer(vp, t, rx_oom_timer);
-	struct net_device *dev = vp->mii.dev;
-
-	spin_lock_irq(&vp->lock);
-	if ((vp->cur_rx - vp->dirty_rx) == RX_RING_SIZE)	/* This test is redundant, but makes me feel good */
-		boomerang_rx(dev);
-	if (vortex_debug > 1) {
-		pr_debug("%s: rx_oom_timer %s\n", dev->name,
-			((vp->cur_rx - vp->dirty_rx) != RX_RING_SIZE) ? "succeeded" : "retrying");
-	}
-	spin_unlock_irq(&vp->lock);
-}
-
 static void
 vortex_down(struct net_device *dev, int final_down)
 {
@@ -2711,7 +2699,6 @@ vortex_down(struct net_device *dev, int final_down)
 	netdev_reset_queue(dev);
 	netif_stop_queue(dev);
 
-	del_timer_sync(&vp->rx_oom_timer);
 	del_timer_sync(&vp->timer);
 
 	/* Turn off statistics ASAP.  We update dev->stats below. */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-03 14:44 ` [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic Neil Horman
@ 2018-01-03 14:58   ` David Miller
  2018-01-03 15:13     ` Neil Horman
  0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2018-01-03 14:58 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, tedheadster, nhorman, klassert

From: Neil Horman <nhorman@tuxdriver.com>
Date: Wed,  3 Jan 2018 09:44:15 -0500

> A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
> WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
> refill code a bit so that if skb allocation or dma mapping fails, we
> recycle the existing buffer.  This prevents holes in the rx ring, and
> makes for much simpler logic
> 
> Note: This is compile only tested.  Ted, if you could run this and
> confirm that it continues to work properly, I would appreciate it, as I
> currently don't have access to this hardware
> 
> Signed-off-by: Neil Horman <nhorman@redhat.com>
> CC: Steffen Klassert <klassert@mathematik.tu-chemnitz.de>
> CC: "David S. Miller" <davem@davemloft.net>
> Reported-by: tedheadster@gmail.com

See my other reply.

Your RX handling must become more sophisticated.

This is exactly what we tell driver authors to do.  If you cannot allocate
or DMA map a replacement RX buffer, you _MUST_ recycle the existing buffer
back to the chip rather than pass it up to the stack.

Thank you.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-03 14:58   ` David Miller
@ 2018-01-03 15:13     ` Neil Horman
  2018-01-03 15:26       ` David Miller
  0 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2018-01-03 15:13 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, netdev, tedheadster, klassert

On Wed, Jan 03, 2018 at 09:58:49AM -0500, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Wed,  3 Jan 2018 09:44:15 -0500
> 
> > A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
> > WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
> > refill code a bit so that if skb allocation or dma mapping fails, we
> > recycle the existing buffer.  This prevents holes in the rx ring, and
> > makes for much simpler logic
> > 
> > Note: This is compile only tested.  Ted, if you could run this and
> > confirm that it continues to work properly, I would appreciate it, as I
> > currently don't have access to this hardware
> > 
> > Signed-off-by: Neil Horman <nhorman@redhat.com>
> > CC: Steffen Klassert <klassert@mathematik.tu-chemnitz.de>
> > CC: "David S. Miller" <davem@davemloft.net>
> > Reported-by: tedheadster@gmail.com
> 
> See my other reply.
> 
> Your RX handling must become more sophisticated.
> 
Yes, I understood your previous reply.

> This is exactly what we tell driver authors to do.  If you cannot allocate
> or DMA map a replacement RX buffer, you _MUST_ recycle the existing buffer
> back to the chip rather than pass it up to the stack.
> 
Thats exactly what this patch does, instead of creating a second loop to
traverse all the emptied ring buffers, now I:

1) Pre-allocate a new skb when I know I'm going to receive the in-place skb
2) Map the skb into the appropriate dma device domain
3) If (1) and (2) succede, then I swap the newly allocate skb and dma address
with the old one and recieve the old into the network stack
4) If (1) or (2) fail, then I goto clear_complete, which leaves the old skb and
dma address in place, sets the buffer status back to 0 (indicating completion),
and write the new ring status back to the hardware

This is what you wanted, a pre-allocate and swap-if-successful, recycle-if-not
approach, rather than the leave-a-hole-in-the-ring approach that is there
currently, no?  Or did I miss something else?

Neil

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-03 15:13     ` Neil Horman
@ 2018-01-03 15:26       ` David Miller
  2018-01-03 15:28         ` David Miller
  0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2018-01-03 15:26 UTC (permalink / raw)
  To: nhorman; +Cc: nhorman, netdev, tedheadster, klassert

From: Neil Horman <nhorman@redhat.com>
Date: Wed, 3 Jan 2018 10:13:33 -0500

> Thats exactly what this patch does, instead of creating a second loop to
> traverse all the emptied ring buffers, now I:
> 
> 1) Pre-allocate a new skb when I know I'm going to receive the in-place skb
> 2) Map the skb into the appropriate dma device domain
> 3) If (1) and (2) succede, then I swap the newly allocate skb and dma address
> with the old one and recieve the old into the network stack
> 4) If (1) or (2) fail, then I goto clear_complete, which leaves the old skb and
> dma address in place, sets the buffer status back to 0 (indicating completion),
> and write the new ring status back to the hardware
> 
> This is what you wanted, a pre-allocate and swap-if-successful, recycle-if-not
> approach, rather than the leave-a-hole-in-the-ring approach that is there
> currently, no?  Or did I miss something else?

I misread the code sorry, you're absolutely right.

I'll apply this patch, thanks Neil. :)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-03 15:26       ` David Miller
@ 2018-01-03 15:28         ` David Miller
  2018-01-03 16:41           ` Neil Horman
  0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2018-01-03 15:28 UTC (permalink / raw)
  To: nhorman; +Cc: nhorman, netdev, tedheadster, klassert

From: David Miller <davem@davemloft.net>
Date: Wed, 03 Jan 2018 10:26:06 -0500 (EST)

> From: Neil Horman <nhorman@redhat.com>
> Date: Wed, 3 Jan 2018 10:13:33 -0500
> 
>> Thats exactly what this patch does, instead of creating a second loop to
>> traverse all the emptied ring buffers, now I:
>> 
>> 1) Pre-allocate a new skb when I know I'm going to receive the in-place skb
>> 2) Map the skb into the appropriate dma device domain
>> 3) If (1) and (2) succede, then I swap the newly allocate skb and dma address
>> with the old one and recieve the old into the network stack
>> 4) If (1) or (2) fail, then I goto clear_complete, which leaves the old skb and
>> dma address in place, sets the buffer status back to 0 (indicating completion),
>> and write the new ring status back to the hardware
>> 
>> This is what you wanted, a pre-allocate and swap-if-successful, recycle-if-not
>> approach, rather than the leave-a-hole-in-the-ring approach that is there
>> currently, no?  Or did I miss something else?
> 
> I misread the code sorry, you're absolutely right.
> 
> I'll apply this patch, thanks Neil. :)

Hmmm, maybe we need a V3 after all :)

  CC [M]  drivers/net/ethernet/3com/3c59x.o
drivers/net/ethernet/3com/3c59x.c: In function ‘boomerang_rx’:
drivers/net/ethernet/3com/3c59x.c:2605:13: warning: unused variable ‘dma’ [-Wunused-variable]
  dma_addr_t dma;
             ^~~

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-03 15:28         ` David Miller
@ 2018-01-03 16:41           ` Neil Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Neil Horman @ 2018-01-03 16:41 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, netdev, tedheadster, klassert

On Wed, Jan 03, 2018 at 10:28:31AM -0500, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Wed, 03 Jan 2018 10:26:06 -0500 (EST)
> 
> > From: Neil Horman <nhorman@redhat.com>
> > Date: Wed, 3 Jan 2018 10:13:33 -0500
> > 
> >> Thats exactly what this patch does, instead of creating a second loop to
> >> traverse all the emptied ring buffers, now I:
> >> 
> >> 1) Pre-allocate a new skb when I know I'm going to receive the in-place skb
> >> 2) Map the skb into the appropriate dma device domain
> >> 3) If (1) and (2) succede, then I swap the newly allocate skb and dma address
> >> with the old one and recieve the old into the network stack
> >> 4) If (1) or (2) fail, then I goto clear_complete, which leaves the old skb and
> >> dma address in place, sets the buffer status back to 0 (indicating completion),
> >> and write the new ring status back to the hardware
> >> 
> >> This is what you wanted, a pre-allocate and swap-if-successful, recycle-if-not
> >> approach, rather than the leave-a-hole-in-the-ring approach that is there
> >> currently, no?  Or did I miss something else?
> > 
> > I misread the code sorry, you're absolutely right.
> > 
> > I'll apply this patch, thanks Neil. :)
> 
> Hmmm, maybe we need a V3 after all :)
> 
>   CC [M]  drivers/net/ethernet/3com/3c59x.o
> drivers/net/ethernet/3com/3c59x.c: In function ‘boomerang_rx’:
> drivers/net/ethernet/3com/3c59x.c:2605:13: warning: unused variable ‘dma’ [-Wunused-variable]
>   dma_addr_t dma;
>              ^~~
Thats....odd, I built it twice here, and it didn't bomb out.  I must not have
-Werror enabled, apologies.  I'll respin
Neil

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCHv3] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2017-12-29 14:51 3c59x: pci_unmap_single() oops tedheadster
  2017-12-29 16:40 ` [PATCH] 3c59x: fix missing dma_mapping_error check Neil Horman
  2018-01-03 14:44 ` [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic Neil Horman
@ 2018-01-03 18:09 ` Neil Horman
  2018-01-03 18:44   ` David Miller
  2 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2018-01-03 18:09 UTC (permalink / raw)
  To: netdev
  Cc: tedheadster, Neil Horman, Neil Horman, Steffen Klassert,
	David S. Miller

A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
refill code a bit so that if skb allocation or dma mapping fails, we
recycle the existing buffer.  This prevents holes in the rx ring, and
makes for much simpler logic

Note: This is compile only tested.  Ted, if you could run this and
confirm that it continues to work properly, I would appreciate it, as I
currently don't have access to this hardware

Signed-off-by: Neil Horman <nhorman@redhat.com>
CC: Steffen Klassert <klassert@mathematik.tu-chemnitz.de>
CC: "David S. Miller" <davem@davemloft.net>
Reported-by: tedheadster@gmail.com

---
Change notes:

v2)
* Fixed tx path to free skb on mapping error
* Refactored rx path to recycle skbs on allocation/mapping error
* Used refactoring to remove oom timer and dirty_rx index

v3)
* Removed unused variable that was causing a warning
---
 drivers/net/ethernet/3com/3c59x.c | 90 +++++++++++++++++----------------------
 1 file changed, 38 insertions(+), 52 deletions(-)

diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index f4e13a7014bd..36c8950dbd2d 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -602,7 +602,7 @@ struct vortex_private {
 	struct sk_buff* rx_skbuff[RX_RING_SIZE];
 	struct sk_buff* tx_skbuff[TX_RING_SIZE];
 	unsigned int cur_rx, cur_tx;		/* The next free ring entry */
-	unsigned int dirty_rx, dirty_tx;	/* The ring entries to be free()ed. */
+	unsigned int dirty_tx;	/* The ring entries to be free()ed. */
 	struct vortex_extra_stats xstats;	/* NIC-specific extra stats */
 	struct sk_buff *tx_skb;				/* Packet being eaten by bus master ctrl.  */
 	dma_addr_t tx_skb_dma;				/* Allocated DMA address for bus master ctrl DMA.   */
@@ -618,7 +618,6 @@ struct vortex_private {
 
 	/* The remainder are related to chip state, mostly media selection. */
 	struct timer_list timer;			/* Media selection timer. */
-	struct timer_list rx_oom_timer;		/* Rx skb allocation retry timer */
 	int options;						/* User-settable misc. driver options. */
 	unsigned int media_override:4, 		/* Passed-in media type. */
 		default_media:4,				/* Read from the EEPROM/Wn3_Config. */
@@ -760,7 +759,6 @@ static void mdio_sync(struct vortex_private *vp, int bits);
 static int mdio_read(struct net_device *dev, int phy_id, int location);
 static void mdio_write(struct net_device *vp, int phy_id, int location, int value);
 static void vortex_timer(struct timer_list *t);
-static void rx_oom_timer(struct timer_list *t);
 static netdev_tx_t vortex_start_xmit(struct sk_buff *skb,
 				     struct net_device *dev);
 static netdev_tx_t boomerang_start_xmit(struct sk_buff *skb,
@@ -1601,7 +1599,6 @@ vortex_up(struct net_device *dev)
 
 	timer_setup(&vp->timer, vortex_timer, 0);
 	mod_timer(&vp->timer, RUN_AT(media_tbl[dev->if_port].wait));
-	timer_setup(&vp->rx_oom_timer, rx_oom_timer, 0);
 
 	if (vortex_debug > 1)
 		pr_debug("%s: Initial media type %s.\n",
@@ -1676,7 +1673,7 @@ vortex_up(struct net_device *dev)
 	window_write16(vp, 0x0040, 4, Wn4_NetDiag);
 
 	if (vp->full_bus_master_rx) { /* Boomerang bus master. */
-		vp->cur_rx = vp->dirty_rx = 0;
+		vp->cur_rx = 0;
 		/* Initialize the RxEarly register as recommended. */
 		iowrite16(SetRxThreshold + (1536>>2), ioaddr + EL3_CMD);
 		iowrite32(0x0020, ioaddr + PktStatus);
@@ -1729,6 +1726,7 @@ vortex_open(struct net_device *dev)
 	struct vortex_private *vp = netdev_priv(dev);
 	int i;
 	int retval;
+	dma_addr_t dma;
 
 	/* Use the now-standard shared IRQ implementation. */
 	if ((retval = request_irq(dev->irq, vp->full_bus_master_rx ?
@@ -1753,7 +1751,11 @@ vortex_open(struct net_device *dev)
 				break;			/* Bad news!  */
 
 			skb_reserve(skb, NET_IP_ALIGN);	/* Align IP on 16 byte boundaries */
-			vp->rx_ring[i].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
+			dma = pci_map_single(VORTEX_PCI(vp), skb->data,
+					     PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+			if (dma_mapping_error(&VORTEX_PCI(vp)->dev, dma))
+				break;
+			vp->rx_ring[i].addr = cpu_to_le32(dma);
 		}
 		if (i != RX_RING_SIZE) {
 			pr_emerg("%s: no memory for rx ring\n", dev->name);
@@ -2067,6 +2069,12 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		int len = (skb->len + 3) & ~3;
 		vp->tx_skb_dma = pci_map_single(VORTEX_PCI(vp), skb->data, len,
 						PCI_DMA_TODEVICE);
+		if (dma_mapping_error(&VORTEX_PCI(vp)->dev, vp->tx_skb_dma)) {
+			dev_kfree_skb_any(skb);
+			dev->stats.tx_dropped++;
+			return NETDEV_TX_OK;
+		}
+
 		spin_lock_irq(&vp->window_lock);
 		window_set(vp, 7);
 		iowrite32(vp->tx_skb_dma, ioaddr + Wn7_MasterAddr);
@@ -2593,7 +2601,7 @@ boomerang_rx(struct net_device *dev)
 	int entry = vp->cur_rx % RX_RING_SIZE;
 	void __iomem *ioaddr = vp->ioaddr;
 	int rx_status;
-	int rx_work_limit = vp->dirty_rx + RX_RING_SIZE - vp->cur_rx;
+	int rx_work_limit = RX_RING_SIZE;
 
 	if (vortex_debug > 5)
 		pr_debug("boomerang_rx(): status %4.4x\n", ioread16(ioaddr+EL3_STATUS));
@@ -2614,7 +2622,8 @@ boomerang_rx(struct net_device *dev)
 		} else {
 			/* The packet length: up to 4.5K!. */
 			int pkt_len = rx_status & 0x1fff;
-			struct sk_buff *skb;
+			struct sk_buff *skb, *newskb;
+			dma_addr_t newdma;
 			dma_addr_t dma = le32_to_cpu(vp->rx_ring[entry].addr);
 
 			if (vortex_debug > 4)
@@ -2633,9 +2642,27 @@ boomerang_rx(struct net_device *dev)
 				pci_dma_sync_single_for_device(VORTEX_PCI(vp), dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
 				vp->rx_copy++;
 			} else {
+				/* Pre-allocate the replacement skb.  If it or its
+				 * mapping fails then recycle the buffer thats already
+				 * in place
+				 */
+				newskb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
+				if (!newskb) {
+					dev->stats.rx_dropped++;
+					goto clear_complete;
+				}
+				newdma = pci_map_single(VORTEX_PCI(vp), newskb->data,
+							PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+				if (dma_mapping_error(&VORTEX_PCI(vp)->dev, newdma)) {
+					dev->stats.rx_dropped++;
+					consume_skb(newskb);
+					goto clear_complete;
+				}
+
 				/* Pass up the skbuff already on the Rx ring. */
 				skb = vp->rx_skbuff[entry];
-				vp->rx_skbuff[entry] = NULL;
+				vp->rx_skbuff[entry] = newskb;
+				vp->rx_ring[entry].addr = cpu_to_le32(newdma);
 				skb_put(skb, pkt_len);
 				pci_unmap_single(VORTEX_PCI(vp), dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
 				vp->rx_nocopy++;
@@ -2653,55 +2680,15 @@ boomerang_rx(struct net_device *dev)
 			netif_rx(skb);
 			dev->stats.rx_packets++;
 		}
-		entry = (++vp->cur_rx) % RX_RING_SIZE;
-	}
-	/* Refill the Rx ring buffers. */
-	for (; vp->cur_rx - vp->dirty_rx > 0; vp->dirty_rx++) {
-		struct sk_buff *skb;
-		entry = vp->dirty_rx % RX_RING_SIZE;
-		if (vp->rx_skbuff[entry] == NULL) {
-			skb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
-			if (skb == NULL) {
-				static unsigned long last_jif;
-				if (time_after(jiffies, last_jif + 10 * HZ)) {
-					pr_warn("%s: memory shortage\n",
-						dev->name);
-					last_jif = jiffies;
-				}
-				if ((vp->cur_rx - vp->dirty_rx) == RX_RING_SIZE)
-					mod_timer(&vp->rx_oom_timer, RUN_AT(HZ * 1));
-				break;			/* Bad news!  */
-			}
 
-			vp->rx_ring[entry].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
-			vp->rx_skbuff[entry] = skb;
-		}
+clear_complete:
 		vp->rx_ring[entry].status = 0;	/* Clear complete bit. */
 		iowrite16(UpUnstall, ioaddr + EL3_CMD);
+		entry = (++vp->cur_rx) % RX_RING_SIZE;
 	}
 	return 0;
 }
 
-/*
- * If we've hit a total OOM refilling the Rx ring we poll once a second
- * for some memory.  Otherwise there is no way to restart the rx process.
- */
-static void
-rx_oom_timer(struct timer_list *t)
-{
-	struct vortex_private *vp = from_timer(vp, t, rx_oom_timer);
-	struct net_device *dev = vp->mii.dev;
-
-	spin_lock_irq(&vp->lock);
-	if ((vp->cur_rx - vp->dirty_rx) == RX_RING_SIZE)	/* This test is redundant, but makes me feel good */
-		boomerang_rx(dev);
-	if (vortex_debug > 1) {
-		pr_debug("%s: rx_oom_timer %s\n", dev->name,
-			((vp->cur_rx - vp->dirty_rx) != RX_RING_SIZE) ? "succeeded" : "retrying");
-	}
-	spin_unlock_irq(&vp->lock);
-}
-
 static void
 vortex_down(struct net_device *dev, int final_down)
 {
@@ -2711,7 +2698,6 @@ vortex_down(struct net_device *dev, int final_down)
 	netdev_reset_queue(dev);
 	netif_stop_queue(dev);
 
-	del_timer_sync(&vp->rx_oom_timer);
 	del_timer_sync(&vp->timer);
 
 	/* Turn off statistics ASAP.  We update dev->stats below. */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-03 18:09 ` [PATCHv3] " Neil Horman
@ 2018-01-03 18:44   ` David Miller
  2018-01-22  6:27     ` tedheadster
  0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2018-01-03 18:44 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, tedheadster, nhorman, klassert

From: Neil Horman <nhorman@tuxdriver.com>
Date: Wed,  3 Jan 2018 13:09:23 -0500

> A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
> WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
> refill code a bit so that if skb allocation or dma mapping fails, we
> recycle the existing buffer.  This prevents holes in the rx ring, and
> makes for much simpler logic
> 
> Note: This is compile only tested.  Ted, if you could run this and
> confirm that it continues to work properly, I would appreciate it, as I
> currently don't have access to this hardware
> 
> Signed-off-by: Neil Horman <nhorman@redhat.com>
> CC: Steffen Klassert <klassert@mathematik.tu-chemnitz.de>
> CC: "David S. Miller" <davem@davemloft.net>
> Reported-by: tedheadster@gmail.com
> 
> ---
> Change notes:
> 
> v2)
> * Fixed tx path to free skb on mapping error
> * Refactored rx path to recycle skbs on allocation/mapping error
> * Used refactoring to remove oom timer and dirty_rx index
> 
> v3)
> * Removed unused variable that was causing a warning

Applied.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-03 18:44   ` David Miller
@ 2018-01-22  6:27     ` tedheadster
  2018-01-22 12:36       ` Neil Horman
  0 siblings, 1 reply; 15+ messages in thread
From: tedheadster @ 2018-01-22  6:27 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, netdev, nhorman, klassert

On Wed, Jan 3, 2018 at 1:44 PM, David Miller <davem@davemloft.net> wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Wed,  3 Jan 2018 13:09:23 -0500
>
>> A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
>> WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
>> refill code a bit so that if skb allocation or dma mapping fails, we
>> recycle the existing buffer.  This prevents holes in the rx ring, and
>> makes for much simpler logic
>>
>> Note: This is compile only tested.  Ted, if you could run this and
>> confirm that it continues to work properly, I would appreciate it, as I
>> currently don't have access to this hardware
>>

Neil,
  I was able to test this patch. I did not get any WARN_ON messages.
However, I am getting a lot of dropped receive packets; uptime is 11
minutes and it has already dropped 214 of 743 receive packets.

Admittedly this is on a slow i486 regression testing system, but the
drop rate is approximately 30% which seems high even for this system
because it is on a very quiet switched network.

I enabled some debugging messages by setting msglvl to 4 and
recompiling with DYNAMIC_DEBUG=y. I did not see any messages of the
form "No memory to allocate a sk_buff of size" so that leaves the
following two cases:

boomerang_rx()
...
newskb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
if (!newskb) {
  dev->stats.rx_dropped++;
  goto clear_complete;
  }
  newdma = pci_map_single(VORTEX_PCI(vp), newskb->data,
                                                PKT_BUF_SZ, PCI_DMA_FROMDEVICE)
  if (dma_mapping_error(&VORTEX_PCI(vp)->dev, newdma)) {
    dev->stats.rx_dropped++;
    consume_skb(newskb);
    goto clear_complete;
  }

What shall we do to determine if it is hitting the pci_map_single() or
netdev_alloc_skb_ip_align() failure?

- Matthew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] 3c59x: fix missing dma_mapping_error check and bad ring refill logic
  2018-01-22  6:27     ` tedheadster
@ 2018-01-22 12:36       ` Neil Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Neil Horman @ 2018-01-22 12:36 UTC (permalink / raw)
  To: whiteheadm; +Cc: David Miller, netdev, nhorman, klassert

On Mon, Jan 22, 2018 at 01:27:19AM -0500, tedheadster wrote:
> On Wed, Jan 3, 2018 at 1:44 PM, David Miller <davem@davemloft.net> wrote:
> > From: Neil Horman <nhorman@tuxdriver.com>
> > Date: Wed,  3 Jan 2018 13:09:23 -0500
> >
> >> A few spots in 3c59x missed calls to dma_mapping_error checks, casuing
> >> WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
> >> refill code a bit so that if skb allocation or dma mapping fails, we
> >> recycle the existing buffer.  This prevents holes in the rx ring, and
> >> makes for much simpler logic
> >>
> >> Note: This is compile only tested.  Ted, if you could run this and
> >> confirm that it continues to work properly, I would appreciate it, as I
> >> currently don't have access to this hardware
> >>
> 
> Neil,
>   I was able to test this patch. I did not get any WARN_ON messages.
> However, I am getting a lot of dropped receive packets; uptime is 11
> minutes and it has already dropped 214 of 743 receive packets.
> 
> Admittedly this is on a slow i486 regression testing system, but the
> drop rate is approximately 30% which seems high even for this system
> because it is on a very quiet switched network.
> 
> I enabled some debugging messages by setting msglvl to 4 and
> recompiling with DYNAMIC_DEBUG=y. I did not see any messages of the
> form "No memory to allocate a sk_buff of size" so that leaves the
> following two cases:
> 
> boomerang_rx()
> ...
> newskb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
> if (!newskb) {
>   dev->stats.rx_dropped++;
>   goto clear_complete;
>   }
>   newdma = pci_map_single(VORTEX_PCI(vp), newskb->data,
>                                                 PKT_BUF_SZ, PCI_DMA_FROMDEVICE)
>   if (dma_mapping_error(&VORTEX_PCI(vp)->dev, newdma)) {
>     dev->stats.rx_dropped++;
>     consume_skb(newskb);
>     goto clear_complete;
>   }
> 
> What shall we do to determine if it is hitting the pci_map_single() or
> netdev_alloc_skb_ip_align() failure?
> 
> - Matthew
> 

Well, I would suggest that you either modify and rebuild the kernel to add a
printk in both of those locations or (if you don't want to go to all that
trouble), write a systemtap script to probe both of those locations and print a
warning if those paths are executed.

That said, while I understand its good for your understanding of the problem,
knowing which cse you hit likely won't help you fix much.  Depending on which
path you hit, it means your either low on ram from which to allocate skbs, or
your out of space in your iommu (i'm guessing your using a software iotlb lib).
Both are pretty limited resources on the system you describe.

Neil

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-01-22 12:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-29 14:51 3c59x: pci_unmap_single() oops tedheadster
2017-12-29 16:40 ` [PATCH] 3c59x: fix missing dma_mapping_error check Neil Horman
2018-01-03  2:48   ` David Miller
2018-01-03 10:42     ` Neil Horman
2018-01-03 14:53       ` David Miller
2018-01-03 14:44 ` [PATCHv2] 3c59x: fix missing dma_mapping_error check and bad ring refill logic Neil Horman
2018-01-03 14:58   ` David Miller
2018-01-03 15:13     ` Neil Horman
2018-01-03 15:26       ` David Miller
2018-01-03 15:28         ` David Miller
2018-01-03 16:41           ` Neil Horman
2018-01-03 18:09 ` [PATCHv3] " Neil Horman
2018-01-03 18:44   ` David Miller
2018-01-22  6:27     ` tedheadster
2018-01-22 12:36       ` Neil Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).