* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? [not found] ` <CAEdQ38HcPswBk3pUHzQerFZ=4KjPc5nVYTqNnGQNMk7QbPXuOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2017-12-08 13:42 ` Eric Dumazet [not found] ` <CANn89iJKGRLVNAE99JWiyXcOXveytkjbQAiZ9XPiJc6fyEdFVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2017-12-08 13:42 UTC (permalink / raw) To: Matt Turner Cc: linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Paolo Abeni, Hannes Frederic Sowa, Peter Zijlstra (Intel), Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev On Thu, Dec 7, 2017 at 11:54 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > On Thu, Dec 7, 2017 at 11:00 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >> On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>> On a Broadcom BCM91250a MIPS system I can reliably trigger NFS >>> corruption on the first file read. >>> >>> To demonstrate, I downloaded five identical copies of the gcc-5.4.0 >>> source tarball. On the NFS server, they hash to the same value: >>> >>> server distfiles # md5sum gcc-5.4.0.tar.bz2* >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.2 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 >>> >>> On the MIPS system (the NFS client): >>> >>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2 >>> 35346975989954df8a8db2b034da610d gcc-5.4.0.tar.bz2.2 >>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2* >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 >>> 35346975989954df8a8db2b034da610d gcc-5.4.0.tar.bz2.2 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 >>> >>> The first file read will contain some corruption, and it is persistent until... >>> >>> bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches >>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2* >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.2 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 >>> 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 >>> >>> the caches are dropped, at which point it reads back properly. >>> >>> Note that the corruption is different across reboots, both in the size >>> of the corruption and the location. I saw 1900~ and 1400~ byte >>> sequences corrupted on separate occasions, which don't correspond to >>> the system's 16kB page size. >>> >>> I've tested kernels from v3.19 to 4.11-rc1+ (master branch from >>> today). All exhibit this behavior with differing frequencies. Earlier >>> kernels seem to reproduce the issue less often, while more recent >>> kernels reliably exhibit the problem every boot. >>> >>> How can I further debug this? >> >> I think I was wrong about the statement about kernels v3.19 to >> 4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then >> bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq: Let >> ksoftirqd do its job"). Still reproduces with current tip of Linus' >> tree. >> >> Any ideas? The board's ethernet is an uncommon device supported by >> CONFIG_SB1250_MAC. Something about the ethernet driver maybe? > > With the patch reverted on master (reverts cleanly), NFS corruption no > longer happens. Hi Matt. Thanks for bisecting. Patch simply exposes an existing bug more often by changing the way driver functions are scheduled. Which is probably a good thing. sbmac_intr() looks extremely suspicious to me. A NAPI driver hard interrupt should simply schedule NAPI. Apparently, if sbmac_intr() can not grab NAPIF_STATE_SCHED bit, it directly calls sbdma_rx_process() from hard interrupt context. Insane really. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <CANn89iJKGRLVNAE99JWiyXcOXveytkjbQAiZ9XPiJc6fyEdFVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? [not found] ` <CANn89iJKGRLVNAE99JWiyXcOXveytkjbQAiZ9XPiJc6fyEdFVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2017-12-08 13:52 ` Eric Dumazet [not found] ` <1512741164.25033.28.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2017-12-08 13:52 UTC (permalink / raw) To: Eric Dumazet, Matt Turner Cc: linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Paolo Abeni, Hannes Frederic Sowa, Peter Zijlstra (Intel), Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev On Fri, 2017-12-08 at 05:42 -0800, Eric Dumazet wrote: > On Thu, Dec 7, 2017 at 11:54 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > wrote: > > On Thu, Dec 7, 2017 at 11:00 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > > wrote: > > > On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > > > wrote: > > > > On a Broadcom BCM91250a MIPS system I can reliably trigger NFS > > > > corruption on the first file read. > > > > > > > > To demonstrate, I downloaded five identical copies of the gcc- > > > > 5.4.0 > > > > source tarball. On the NFS server, they hash to the same value: > > > > > > > > server distfiles # md5sum gcc-5.4.0.tar.bz2* > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.2 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 > > > > > > > > On the MIPS system (the NFS client): > > > > > > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2 > > > > 35346975989954df8a8db2b034da610d gcc-5.4.0.tar.bz2.2 > > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2* > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 > > > > 35346975989954df8a8db2b034da610d gcc-5.4.0.tar.bz2.2 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 > > > > > > > > The first file read will contain some corruption, and it is > > > > persistent until... > > > > > > > > bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches > > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2* > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.2 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 > > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 > > > > > > > > the caches are dropped, at which point it reads back properly. > > > > > > > > Note that the corruption is different across reboots, both in > > > > the size > > > > of the corruption and the location. I saw 1900~ and 1400~ byte > > > > sequences corrupted on separate occasions, which don't > > > > correspond to > > > > the system's 16kB page size. > > > > > > > > I've tested kernels from v3.19 to 4.11-rc1+ (master branch from > > > > today). All exhibit this behavior with differing frequencies. > > > > Earlier > > > > kernels seem to reproduce the issue less often, while more > > > > recent > > > > kernels reliably exhibit the problem every boot. > > > > > > > > How can I further debug this? > > > > > > I think I was wrong about the statement about kernels v3.19 to > > > 4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then > > > bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq: > > > Let > > > ksoftirqd do its job"). Still reproduces with current tip of > > > Linus' > > > tree. > > > > > > Any ideas? The board's ethernet is an uncommon device supported > > > by > > > CONFIG_SB1250_MAC. Something about the ethernet driver maybe? > > > > With the patch reverted on master (reverts cleanly), NFS corruption > > no > > longer happens. > > Hi Matt. > > Thanks for bisecting. > > Patch simply exposes an existing bug more often by changing the way > driver functions are scheduled. > > Which is probably a good thing. > > sbmac_intr() looks extremely suspicious to me. > > A NAPI driver hard interrupt should simply schedule NAPI. > > Apparently, if sbmac_intr() can not grab NAPIF_STATE_SCHED bit, it > directly calls sbdma_rx_process() from > hard interrupt context. > > Insane really. Please try this fix (not compiled on my x86 laptop, and I had no coffee yet, so it might have some trivial errors) diff --git a/drivers/net/ethernet/broadcom/sb1250-mac.c b/drivers/net/ethernet/broadcom/sb1250-mac.c index ecdef42f0ae63641419a603f0b4eec2fc213c334..3ddd9ca469b280e70509b22fd7d3f449c81fbedc 100644 --- a/drivers/net/ethernet/broadcom/sb1250-mac.c +++ b/drivers/net/ethernet/broadcom/sb1250-mac.c @@ -287,8 +287,6 @@ static int sbdma_add_rcvbuffer(struct sbmac_softc *sc, struct sbmacdma *d, static int sbdma_add_txbuffer(struct sbmacdma *d, struct sk_buff *m); static void sbdma_emptyring(struct sbmacdma *d); static void sbdma_fillring(struct sbmac_softc *sc, struct sbmacdma *d); -static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d, - int work_to_do, int poll); static void sbdma_tx_process(struct sbmac_softc *sc, struct sbmacdma *d, int poll); static int sbmac_initctx(struct sbmac_softc *s); @@ -1063,7 +1061,7 @@ static void sbmac_netpoll(struct net_device *netdev) ********************************************************************* */ static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d, - int work_to_do, int poll) + int work_to_do) { struct net_device *dev = sc->sbm_dev; int curidx; @@ -1076,7 +1074,6 @@ static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d, prefetch(d); -again: /* Check if the HW dropped any frames */ dev->stats.rx_fifo_errors += __raw_readq(sc->sbm_rxdma.sbdma_oodpktlost) & 0xffff; @@ -1169,10 +1166,7 @@ static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d, } prefetch(sb->data); prefetch((const void *)(((char *)sb->data)+32)); - if (poll) - dropped = netif_receive_skb(sb); - else - dropped = netif_rx(sb); + dropped = netif_receive_skb(sb); if (dropped == NET_RX_DROP) { dev->stats.rx_dropped++; @@ -1201,10 +1195,6 @@ static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d, d->sbdma_remptr = SBDMA_NEXTBUF(d,sbdma_remptr); work_done++; } - if (!poll) { - work_to_do = 32; - goto again; /* collect fifo drop statistics again */ - } done: return work_done; } @@ -2006,11 +1996,6 @@ static irqreturn_t sbmac_intr(int irq,void *dev_instance) __napi_schedule(&sc->napi); /* Depend on the exit from poll to reenable intr */ } - else { - /* may leave some packets behind */ - sbdma_rx_process(sc,&(sc->sbm_rxdma), - SBMAC_MAX_RXDESCR * 2, 0); - } } return IRQ_RETVAL(handled); } @@ -2529,7 +2514,7 @@ static int sbmac_poll(struct napi_struct *napi, int budget) struct sbmac_softc *sc = container_of(napi, struct sbmac_softc, napi); int work_done; - work_done = sbdma_rx_process(sc, &(sc->sbm_rxdma), budget, 1); + work_done = sbdma_rx_process(sc, &(sc->sbm_rxdma), budget); sbdma_tx_process(sc, &(sc->sbm_txdma), 1); if (work_done < budget) { -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 6+ messages in thread
[parent not found: <1512741164.25033.28.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>]
* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? [not found] ` <1512741164.25033.28.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> @ 2017-12-08 20:26 ` Matt Turner [not found] ` <CAEdQ38HEduSTY38Noj4peaMN_G++5sLJfqzCMkd3M4pPNTpU_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 6+ messages in thread From: Matt Turner @ 2017-12-08 20:26 UTC (permalink / raw) To: Eric Dumazet Cc: Eric Dumazet, linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Paolo Abeni, Hannes Frederic Sowa, Peter Zijlstra (Intel), Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev On Fri, Dec 8, 2017 at 5:52 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > On Fri, 2017-12-08 at 05:42 -0800, Eric Dumazet wrote: >> On Thu, Dec 7, 2017 at 11:54 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> >> wrote: >> > On Thu, Dec 7, 2017 at 11:00 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> >> > wrote: >> > > On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> >> > > wrote: >> > > > On a Broadcom BCM91250a MIPS system I can reliably trigger NFS >> > > > corruption on the first file read. >> > > > >> > > > To demonstrate, I downloaded five identical copies of the gcc- >> > > > 5.4.0 >> > > > source tarball. On the NFS server, they hash to the same value: >> > > > >> > > > server distfiles # md5sum gcc-5.4.0.tar.bz2* >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.2 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 >> > > > >> > > > On the MIPS system (the NFS client): >> > > > >> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2 >> > > > 35346975989954df8a8db2b034da610d gcc-5.4.0.tar.bz2.2 >> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2* >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 >> > > > 35346975989954df8a8db2b034da610d gcc-5.4.0.tar.bz2.2 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 >> > > > >> > > > The first file read will contain some corruption, and it is >> > > > persistent until... >> > > > >> > > > bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches >> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2* >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.1 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.2 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.3 >> > > > 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2.4 >> > > > >> > > > the caches are dropped, at which point it reads back properly. >> > > > >> > > > Note that the corruption is different across reboots, both in >> > > > the size >> > > > of the corruption and the location. I saw 1900~ and 1400~ byte >> > > > sequences corrupted on separate occasions, which don't >> > > > correspond to >> > > > the system's 16kB page size. >> > > > >> > > > I've tested kernels from v3.19 to 4.11-rc1+ (master branch from >> > > > today). All exhibit this behavior with differing frequencies. >> > > > Earlier >> > > > kernels seem to reproduce the issue less often, while more >> > > > recent >> > > > kernels reliably exhibit the problem every boot. >> > > > >> > > > How can I further debug this? >> > > >> > > I think I was wrong about the statement about kernels v3.19 to >> > > 4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then >> > > bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq: >> > > Let >> > > ksoftirqd do its job"). Still reproduces with current tip of >> > > Linus' >> > > tree. >> > > >> > > Any ideas? The board's ethernet is an uncommon device supported >> > > by >> > > CONFIG_SB1250_MAC. Something about the ethernet driver maybe? >> > >> > With the patch reverted on master (reverts cleanly), NFS corruption >> > no >> > longer happens. >> >> Hi Matt. >> >> Thanks for bisecting. >> >> Patch simply exposes an existing bug more often by changing the way >> driver functions are scheduled. >> >> Which is probably a good thing. >> >> sbmac_intr() looks extremely suspicious to me. >> >> A NAPI driver hard interrupt should simply schedule NAPI. >> >> Apparently, if sbmac_intr() can not grab NAPIF_STATE_SCHED bit, it >> directly calls sbdma_rx_process() from >> hard interrupt context. >> >> Insane really. > > Please try this fix (not compiled on my x86 laptop, and I had no coffee > yet, so it might have some trivial errors) Thanks for the quick reply! I tried the patch on top of master, but unfortunately the corruption still occurs. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <CAEdQ38HEduSTY38Noj4peaMN_G++5sLJfqzCMkd3M4pPNTpU_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? [not found] ` <CAEdQ38HEduSTY38Noj4peaMN_G++5sLJfqzCMkd3M4pPNTpU_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2017-12-08 21:16 ` Eric Dumazet [not found] ` <1512767781.25033.30.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2017-12-08 21:16 UTC (permalink / raw) To: Matt Turner Cc: Eric Dumazet, linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Paolo Abeni, Hannes Frederic Sowa, Peter Zijlstra (Intel), Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev On Fri, 2017-12-08 at 12:26 -0800, Matt Turner wrote: > > Thanks for the quick reply! > > I tried the patch on top of master, but unfortunately the corruption > still occurs. You might try replacing in sbdma_add_rcvbuffer() sb_new = netdev_alloc_skb(dev, size); by sb_new = alloc_skb(size, GFP_ATOMIC); Maybe the device does not like having a frame spanning 2 pages. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <1512767781.25033.30.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>]
* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? [not found] ` <1512767781.25033.30.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> @ 2017-12-09 21:03 ` Matt Turner 2017-12-09 21:37 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Matt Turner @ 2017-12-09 21:03 UTC (permalink / raw) To: Eric Dumazet Cc: Eric Dumazet, linux-mips-6z/3iImG2C8G8FEW9MqTrA@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Paolo Abeni, Hannes Frederic Sowa, Peter Zijlstra (Intel), Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev On Fri, Dec 8, 2017 at 1:16 PM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > On Fri, 2017-12-08 at 12:26 -0800, Matt Turner wrote: >> >> Thanks for the quick reply! >> >> I tried the patch on top of master, but unfortunately the corruption >> still occurs. > > You might try replacing in sbdma_add_rcvbuffer() > > sb_new = netdev_alloc_skb(dev, size); > > by > > sb_new = alloc_skb(size, GFP_ATOMIC); > > Maybe the device does not like having a frame spanning 2 pages. No such luck. I also gave changing the page size from 16K to 4K a shot without success. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? 2017-12-09 21:03 ` Matt Turner @ 2017-12-09 21:37 ` Eric Dumazet 0 siblings, 0 replies; 6+ messages in thread From: Eric Dumazet @ 2017-12-09 21:37 UTC (permalink / raw) To: Matt Turner Cc: Eric Dumazet, linux-mips@linux-mips.org, linux-nfs, Paolo Abeni, Hannes Frederic Sowa, Peter Zijlstra (Intel), Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev On Sat, 2017-12-09 at 13:03 -0800, Matt Turner wrote: > On Fri, Dec 8, 2017 at 1:16 PM, Eric Dumazet <eric.dumazet@gmail.com> > wrote: > > On Fri, 2017-12-08 at 12:26 -0800, Matt Turner wrote: > > > > > > Thanks for the quick reply! > > > > > > I tried the patch on top of master, but unfortunately the > > > corruption > > > still occurs. > > > > You might try replacing in sbdma_add_rcvbuffer() > > > > sb_new = netdev_alloc_skb(dev, size); > > > > by > > > > sb_new = alloc_skb(size, GFP_ATOMIC); > > > > Maybe the device does not like having a frame spanning 2 pages. > > No such luck. I also gave changing the page size from 16K to 4K a > shot > without success. If your hist is SMP, could you try running it with one CPU only ? Sorry, I have no more ideas :/ ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-12-09 21:37 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CAEdQ38HcOgAT6wJWWKY3P0hzYwkBGSQkRSQ2a=eaGmD6c6rwXA@mail.gmail.com>
[not found] ` <CAEdQ38G4VTXDGOarmmTac=hP92VJbQHRFxQTaSWQ3j4d63pogg@mail.gmail.com>
[not found] ` <CAEdQ38HcPswBk3pUHzQerFZ=4KjPc5nVYTqNnGQNMk7QbPXuOQ@mail.gmail.com>
[not found] ` <CAEdQ38HcPswBk3pUHzQerFZ=4KjPc5nVYTqNnGQNMk7QbPXuOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-12-08 13:42 ` NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? Eric Dumazet
[not found] ` <CANn89iJKGRLVNAE99JWiyXcOXveytkjbQAiZ9XPiJc6fyEdFVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-12-08 13:52 ` Eric Dumazet
[not found] ` <1512741164.25033.28.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-08 20:26 ` Matt Turner
[not found] ` <CAEdQ38HEduSTY38Noj4peaMN_G++5sLJfqzCMkd3M4pPNTpU_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-12-08 21:16 ` Eric Dumazet
[not found] ` <1512767781.25033.30.camel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-09 21:03 ` Matt Turner
2017-12-09 21:37 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).