From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Dorian Gray <yourfavouritegod@gmail.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
Alan Stern <stern@rowland.harvard.edu>,
Suman Tripathi <stripathi@apm.com>,
iommu@lists.linux-foundation.org,
USB list <linux-usb@vger.kernel.org>,
Kernel development list <linux-kernel@vger.kernel.org>
Subject: Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]
Date: Mon, 20 Apr 2015 09:03:03 -0400
Message-ID: <20150420130303.GB9002@l.oracle.com>
In-Reply-To: <CAJ2095rrd2ZR9mNJegBn1qO2-obod51GXL1bOJTtB+wH2MF+7g@mail.gmail.com>
On Sun, Apr 19, 2015 at 05:43:18PM +0200, Dorian Gray wrote:
> I think the case is closed.
> Now that I know it's not USB, but wireless driver, I looked through
> the new k3.19.5's changelog and saw this:
>
>
> commit b943e69d33fac1e5f6db57868e061096b0aae67a
> Author: Larry Finger <Larry.Finger@lwfinger.net>
> Date: Sat Mar 21 15:16:05 2015 -0500
>
> rtlwifi: Fix IOMMU mapping leak in AP mode
>
> commit be0b5e635883678bfbc695889772fed545f3427d upstream.
>
> Transmission of an AP beacon does not call the TX interrupt service routine,
> which usually does the cleanup. Instead, cleanup is handled in a tasklet
> completion routine. Unfortunately, this routine has a serious bug in that
> it does not release the DMA mapping before it frees the skb, thus one IOMMU
> mapping is leaked for each beacon. The test system failed with no free IOMMU
> mapping slots approximately one hour after hostapd was used to start an AP.
>
> This issue was reported and tested at
> https://github.com/lwfinger/rtlwifi_new/issues/30.
>
> Reported-and-tested-by: Kevin Mullican <kevin@mullican.com>
> Cc: Kevin Mullican <kevin@mullican.com>
> Signed-off-by: Shao Fu <shaofu@realtek.com>
> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
>
> Looks very related, especially because my wireless card is also always
> in AP mode, however I haven't been actually using it lately, so
> probably that's why I didn't notice anything related to it (and kept
> focused on USB), until I used dump_dma.
>
> Well, due to my minimal knowledge regarding kernel's internals I can't
> be 100% sure that this was it, but so far 3.19.5 is working stable
> (uptime 6hrs and counting).
Sweet!
>
> Thank you Konrad (and everyone else involved) for helping me
> pinpoint the actual culprit.
Sure thing. Happy to have been able to help!
> Jake
>
>
> On 18 April 2015 at 21:59, Dorian Gray <yourfavouritegod@gmail.com> wrote:
> > On 18 April 2015 at 12:10, Dorian Gray <yourfavouritegod@gmail.com> wrote:
> >> On 17 April 2015 at 22:06, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >>> On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
> >>>> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >>>> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> >>>> > and then load the attached module.
> >>>> >
> >>>> > That should tell you who and what else is holding on to the buffers.
> >>>>
> >>>> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you sent me.
> >>>> Now, I'm not sure if I've done it right - I waited until the error
> >>>> occurred and then modprobe'd dump_dma.
> >>>> I have attached the kernel log, but it tells me not much, if anything...
> >>>
> >>> The network driver is quite hungry for DMA. Did it do the same thing
> >>> in the earlier kernels?
> >>>
> >>> Thanks.
> >>>>
> >>>> Thanks again.
> >>>> Jake
> >>>
> >>>
> >>
> >> Yeah, you're right:
> >>
> >> # grep rtl8192se dump_dma_k3.19.4.log | wc -l
> >> 6789
> >> #
> >> # grep rtl8192se dump_dma_k3.17.8.log | wc -l
> >> 162
> >> #
> >>
> >> So, wlan driver would be the real culprit then..?
> >> I would have never thought...
> >>
> >> I guess I'm gonna test 3.19.4 once more (just to be sure) with
> >> rtl8192se removed and see what happens.
> >>
> >> Thanks!
> >> Jake
> >
> >
> > [update]
> >
> > Ok, 6 hours of uptime (3.19.4 + blacklisted rtl8192se) and everything
> > was fine...
> > However, I was checking periodically and noticed that 'radeon' also
> > tends to grow continuously over time, whereas ethernet driver sticks
> > to, more or less, the same range:
> >
> > # uname -r
> > 3.19.4
> > #
> > # grep -Eo 'radeon|r8169' L1.log | sort | uniq -c
> > 62 r8169
> > 4183 radeon
> > #
> > # grep -Eo 'radeon|r8169' L2.log | sort | uniq -c
> > 33 r8169
> > 5582 radeon
> > #
> > # grep -Eo 'radeon|r8169' L3.log | sort | uniq -c
> > 54 r8169
> > 7007 radeon
> > #
> > # grep -Eo 'radeon|r8169' L4.log | sort | uniq -c
> > 49 r8169
> > 7429 radeon
> > #
> > # grep -Eo 'radeon|r8169' L5.log | sort | uniq -c
> > 34 r8169
> > 9360 radeon
> > #
> >
> > It doesn't grow that much in 3.17.8:
> >
> > # uname -r
> > 3.17.8
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L1.log | sort | uniq -c
> > 265 r8169
> > 1229 radeon
> > 142 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L2.log | sort | uniq -c
> > 187 r8169
> > 3159 radeon
> > 124 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L3.log | sort | uniq -c
> > 41 r8169
> > 1894 radeon
> > 39 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L4.log | sort | uniq -c
> > 64 r8169
> > 3370 radeon
> > 77 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L5.log | sort | uniq -c
> > 52 r8169
> > 2597 radeon
> > 49 rtl8192se
> > #
> >
> >
> > Btw, at some point (3.19.4) I encountered this:
> > [21631.181909] DMA-API: debugging out of memory - disabling
> >
> > Jake
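
For anyone repeating the check quoted above, here is a minimal sketch (not from
the thread) that wraps the grep | sort | uniq -c pipeline in a shell function.
The name tally_drivers and the "file driver count" output format are
assumptions made for illustration, and it assumes each line in a dump_dma log
names the driver that owns the mapping, as the quoted commands imply.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count how often each driver name
# appears in one or more dump_dma logs, so snapshots can be compared.
# Assumption: every outstanding mapping is logged on a line naming its driver.
tally_drivers() {
    pattern=$1    # extended regex of driver names, e.g. 'radeon|r8169'
    shift
    for log in "$@"; do
        # grep -o prints each match on its own line; uniq -c counts them.
        # awk then rewrites "count driver" as "file driver count".
        grep -Eo "$pattern" "$log" | LC_ALL=C sort | uniq -c |
            awk -v name="$log" '{ printf "%s %s %d\n", name, $2, $1 }'
    done
}
```

Calling it as tally_drivers 'radeon|r8169|rtl8192se' L1.log L2.log L3.log
prints one row per driver and log file. A driver whose count keeps climbing
from one snapshot to the next is a candidate for a mapping leak, though growth
alone does not prove one.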