linux-wireless.vger.kernel.org archive mirror
From: "François Valenduc" <francoisvalenduc@gmail.com>
To: Larry Finger <Larry.Finger@lwfinger.net>, linux-wireless@vger.kernel.org
Subject: Re: Kernel crash while copying big files since kernel 3.18
Date: Sun, 11 Jan 2015 15:35:43 +0100
Message-ID: <54B28A3F.3000904@gmail.com>
In-Reply-To: <54AAE532.6050104@lwfinger.net>

On 05/01/15 20:25, Larry Finger wrote:
> On 01/05/2015 12:46 PM, François Valenduc wrote:
>> On 05/01/15 18:25, Larry Finger wrote:
>>> On 01/05/2015 01:12 AM, François Valenduc wrote:
>>>> Hello everybody,
>>>>
>>>> Since kernel 3.18, I get a kernel crash every time I copy a
>>>> big file (around 12 GB) from an external USB drive to the hard drive of
>>>> my laptop.
>>>> I tried a bisection between kernels 3.17 and 3.18 and was surprised to
>>>> find that this has to do with the driver of the wireless card
>>>> (rtl8188ee). Indeed, I don't have problems if I copy the file while the
>>>> rtl8188 module is not loaded. Unfortunately, the results of git-bisect
>>>> are not totally conclusive because the kernel crashes during boot when
>>>> the wireless connection is established. Here are the last steps of the
>>>> bisection:
>>>>
>>>> # bad: [c151aed6aa146e9587590051aba9da68b9370f9b] rtlwifi: rtl8188ee: Update driver to match Realtek release of 06282014
>>>> git bisect bad c151aed6aa146e9587590051aba9da68b9370f9b
>>>> # good: [fd09ff958777cf583d7541f180991c0fc50bd2f7] rtlwifi: Remove extra workqueue for enter/leave power state
>>>> git bisect good fd09ff958777cf583d7541f180991c0fc50bd2f7
>>>> # skip: [9afa2e44f4d8f9d031f815c32bb8f225f0f6746b] rtlwifi: Modify base.{c,h} for new drivers
>>>> git bisect skip 9afa2e44f4d8f9d031f815c32bb8f225f0f6746b
>>>> # skip: [3c67b8f9f3b5bb1207c9bb198e5ef04ff56921dd] rtlwifi: Modify cam.{c,h} and efuse.{c,h} for new drivers
>>>> git bisect skip 3c67b8f9f3b5bb1207c9bb198e5ef04ff56921dd
>>>> # skip: [f7953b2ad66cc5fc66e13d5c0a40e61b45cdfca8] rtlwifi: Modify core.c for new drivers
>>>> git bisect skip f7953b2ad66cc5fc66e13d5c0a40e61b45cdfca8
>>>> # skip: [d3feae41a3473a0f7b431d6af4e092865d586e52] rtlwifi: Update power-save routines for 062814 driver
>>>> git bisect skip d3feae41a3473a0f7b431d6af4e092865d586e52
>>>> # skip: [38506ecefab911785d5e1aa5889f6eeb462e0954] rtlwifi: rtl_pci: Start modification for new drivers
>>>> git bisect skip 38506ecefab911785d5e1aa5889f6eeb462e0954
>>>> # skip: [f3a97e93814aeac3f13e857a0071726acc9bd626] rtlwifi: Finish modifying core routines for new drivers
>>>> git bisect skip f3a97e93814aeac3f13e857a0071726acc9bd626
>>>> # only skipped commits left to test
>>>> # possible first bad commit: [c151aed6aa146e9587590051aba9da68b9370f9b] rtlwifi: rtl8188ee: Update driver to match Realtek release of 06282014
>>>> # possible first bad commit: [f3a97e93814aeac3f13e857a0071726acc9bd626] rtlwifi: Finish modifying core routines for new drivers
>>>> # possible first bad commit: [d3feae41a3473a0f7b431d6af4e092865d586e52] rtlwifi: Update power-save routines for 062814 driver
>>>> # possible first bad commit: [3c67b8f9f3b5bb1207c9bb198e5ef04ff56921dd] rtlwifi: Modify cam.{c,h} and efuse.{c,h} for new drivers
>>>> # possible first bad commit: [9afa2e44f4d8f9d031f815c32bb8f225f0f6746b] rtlwifi: Modify base.{c,h} for new drivers
>>>> # possible first bad commit: [f7953b2ad66cc5fc66e13d5c0a40e61b45cdfca8] rtlwifi: Modify core.c for new drivers
>>>> # possible first bad commit: [38506ecefab911785d5e1aa5889f6eeb462e0954] rtlwifi: rtl_pci: Start modification for new drivers
>>>>
>>>> Can somebody explain what's happening? I do the copy via Dolphin in
>>>> KDE; the screen goes black and the computer becomes totally
>>>> unresponsive, so I don't have access to the logs to see a trace of
>>>> the problem.
>>>>
>>>> Thanks in advance for your help,
>>>
>>> There is a bug in 3.18 that is triggered when an order-3 memory
>>> allocation fails. There is a patch to fix this at
>>> http://marc.info/?l=linux-netdev&m=141999680927473&w=2 that has been
>>> merged into wireless-drivers as commit
>>> e9538cf4f90713eca71b1d6a74b4eae1d445c664. It will be applied to 3.18.X
>>> when it makes it into mainline 3.19-rcY, but that has not yet happened.
>>>
>>> You could manually apply that patch to your kernel source, or you
>>> could pull the git repo at http://github.com/lwfinger/rtlwifi_new.git.
>>> That code has this patch already applied.
>>>
>>> If this patch does not fix the problem, you might be able to capture
>>> at least part of the backtrace by starting the transfer and then
>>> switching to the logging console. When a crash happens, photograph the
>>> screen. On my system, I display it with CTRL-ALT-F10. I return to the
>>> normal graphical console with CTRL-ALT-F7, but your distro may use
>>> different virtual consoles.
>>>
>>> Larry
>>>
>>>
>> Thanks for your help; it seems that your patch solves the problem. The
>> system no longer crashes when copying the same large file as
>> yesterday. I also see this message in the log, which is added by your
>> patch:
>> rtl_pci: Allocation of new skb failed in _rtl_pci_rx_interrupt
>> Should I worry about this failure? Or is it expected?
>
> That is the positive proof that the new patch worked. Getting to that
> condition without the patch would have crashed the system. That printk
> is there to see if we were actually getting the condition and
> recovering. As the code is obviously working now, that line will be
> removed soon.
>
> Thanks,
>
> Larry
>
Do you still intend to remove the line about the allocation failure from
the log? I made a backup of my root partition, compressed with pixz, and
that line appeared 1350 times, so I removed the code that adds it. Is it
really expected to occur so often? pixz uses multithreading to compress
files, so at least 3 of the 4 CPUs are busy for around 20 minutes, but
are you sure there are no other problems?
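
For reference, here is a user-space sketch of one common way a receive
path can recover from a failed buffer allocation: allocate the
replacement buffer first and, if that fails, drop the frame and leave the
old buffer in the ring. This is only an illustration of the general
pattern under that assumption, not the actual rtl_pci code. The names
rx_ring, alloc_buf, rx_one and the fail_every knob are all hypothetical,
and malloc/free stand in for skb allocation and delivery.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RING_SIZE 4
#define BUF_SIZE  64

/* Hypothetical stand-in for an rx ring; not the rtl_pci layout. */
struct rx_ring {
    unsigned char *buf[RING_SIZE];
    unsigned long alloc_calls; /* number of refill attempts so far */
    unsigned long delivered;   /* frames handed to the upper layer */
    unsigned long dropped;     /* frames dropped because refill failed */
};

/* Simulated allocator: fails on every fail_every-th call (0 = never fail). */
static unsigned char *alloc_buf(struct rx_ring *r, unsigned long fail_every)
{
    r->alloc_calls++;
    if (fail_every && r->alloc_calls % fail_every == 0)
        return NULL;
    return malloc(BUF_SIZE);
}

/* Fill every slot with a buffer; returns 0 on success, -1 on failure. */
static int ring_init(struct rx_ring *r)
{
    memset(r, 0, sizeof(*r));
    for (int i = 0; i < RING_SIZE; i++) {
        r->buf[i] = malloc(BUF_SIZE);
        if (!r->buf[i])
            return -1;
    }
    return 0;
}

/*
 * Handle one received frame in slot idx.
 * Returns 1 if the frame was delivered, 0 if it was dropped.
 */
static int rx_one(struct rx_ring *r, int idx, unsigned long fail_every)
{
    assert(idx >= 0 && idx < RING_SIZE);

    /* Get the replacement first, so the slot is never left empty. */
    unsigned char *new_buf = alloc_buf(r, fail_every);
    if (!new_buf) {
        r->dropped++;          /* drop the frame; the slot keeps its old buffer */
        return 0;
    }

    unsigned char *filled = r->buf[idx];
    r->buf[idx] = new_buf;
    r->delivered++;
    free(filled);              /* stand-in for passing the frame up the stack */
    return 1;
}

/* Release all buffers still owned by the ring. */
static void ring_free(struct rx_ring *r)
{
    for (int i = 0; i < RING_SIZE; i++) {
        free(r->buf[i]);
        r->buf[i] = NULL;
    }
}
```

In this sketch every failed refill costs exactly one dropped frame and
nothing else, so under sustained memory pressure the failure counter can
grow large while the ring itself stays fully populated.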

Thanks for your help,

François Valenduc


Thread overview: 9+ messages
2015-01-05  7:12 Kernel crash while copying big files since kernel 3.18 François Valenduc
2015-01-05 16:13 ` François Valenduc
2015-01-05 17:25 ` Larry Finger
2015-01-05 18:46   ` François Valenduc
2015-01-05 19:25     ` Larry Finger
2015-01-11 14:35       ` François Valenduc [this message]
2015-01-11 17:00         ` Larry Finger
2015-01-25 19:38           ` François Valenduc
2015-01-25 20:24             ` Larry Finger
