From: Alan Jenkins <alan.christopher.jenkins@googlemail.com>
To: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Cc: "Luis R. Rodriguez" <lrodriguez@atheros.com>,
"ath5k-devel@lists.ath5k.org" <ath5k-devel@lists.ath5k.org>
Subject: Re: [ath5k] OOPS bisected - ath5k_eeprom_init_header() - "XX: This value is likely too big still, waiting on a better value"
Date: Mon, 20 Jun 2011 11:34:06 +0100 [thread overview]
Message-ID: <4DFF221E.6010901@googlemail.com> (raw)
In-Reply-To: <4DFF2156.4000100@googlemail.com>
[Removed "XXX" from subject in order to get past VGER spam filter]
On 20/06/11 11:30, Alan Jenkins wrote:
> ## Summary of bug report ##
> Confidence: high.
> (Bisected. Crash message captured, see end of message).
>
> Severity: low.
> (Users can easily avoid the trigger for this crash).
>
>
> ## Scenario: ##
> Asus EeePC 701.
> Wireless-toggle button, implemented in firmware, disables / enables
> the entire PCI slot. (See platform driver eeepc-laptop).
>
> ## Expected results: ##
> Ath5k (like any other PCI driver), must tolerate removal of PCI
> devices without warning, and *never crash*.
>
> ## Actual results: ##
> Ath5k generally handles removal very well, but crashes in a corner
> case. This is a regression introduced by v2.6.33 (commit commit
> 359207c687cc8f4f9845c8dadd0d6dabad44e584 "ath5k: Fix eeprom checksum
> check for custom sized eeproms").
>
> ## How to reproduce: ##
> Repeatedly press the wireless-toggle button as fast as you can. It
> shouldn't take more than about 20 presses before the OOPS occurs.
>
>
> ## Results of git-bisect ##
>
> commit 359207c687cc8f4f9845c8dadd0d6dabad44e584
> Author: Luis R. Rodriguez <lrodriguez@atheros.com>
> Date: Mon Jan 4 10:40:39 2010 -0500
>
> ath5k: Fix eeprom checksum check for custom sized eeproms
>
> Commit 8bf3d79bc401ca417ccf9fc076d3295d1a71dbf5 enabled EEPROM
> checksum checks to avoid bogus bug reports but failed to address
> updating the code to consider devices with custom EEPROM sizes.
> Devices with custom sized EEPROMs have the upper limit size stuffed
> in the EEPROM. Use this as the upper limit instead of the static
> default size. In case of a checksum error also provide back the
> max size and whether or not this was the default size or a custom
> one. If the EEPROM is busted we add a failsafe check to ensure
> we don't loop forever or try to read bogus areas of hardware.
>
> This closes bug 14874
>
> http://bugzilla.kernel.org/show_bug.cgi?id=14874
>
> Cc: stable@kernel.org
> Cc: David Quan <david.quan@atheros.com>
> Cc: Stephen Beahm <stephenbeahm@comcast.net>
> Reported-by: Joshua Covington <joshuacov@googlemail.com>
> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
> Signed-off-by: John W. Linville <linville@tuxdriver.com>
>
> ...
>
> - for (cksum = 0, offset = 0; offset < AR5K_EEPROM_INFO_MAX;
> offset++) {
> + AR5K_EEPROM_READ(AR5K_EEPROM_SIZE_UPPER, val);
> + if (val) {
> + eep_max = (val & AR5K_EEPROM_SIZE_UPPER_MASK) <<
> + AR5K_EEPROM_SIZE_ENDLOC_SHIFT;
> + AR5K_EEPROM_READ(AR5K_EEPROM_SIZE_LOWER, val);
> + eep_max = (eep_max | val) - AR5K_EEPROM_INFO_BASE;
> +
> + /*
> + * Fail safe check to prevent stupid loops due
> + * to busted EEPROMs. XXX: This value is likely too
> + * big still, waiting on a better value.
> + */
> + if (eep_max > (3 * AR5K_EEPROM_INFO_MAX)) {
> + ATH5K_ERR(ah->ah_sc, "Invalid max custom
> EEPROM size: "
> + "%d (0x%04x) max expected: %d
> (0x%04x)\n",
> + eep_max, eep_max,
> + 3 * AR5K_EEPROM_INFO_MAX,
> + 3 * AR5K_EEPROM_INFO_MAX);
> + return -EIO;
> + }
> + }
> +
> + for (cksum = 0, offset = 0; offset < eep_max; offset++) {
>
>
> ### Example OOPS message ###
>
> pci 0000:01:00.0: reg 10: [mem 0x00000000-0x0000ffff 64bit]
> pci 0000:01:00.0: BAR 0: assigned [mem 0xf8000000-0xf800ffff 64bit]
> pci 0000:01:00.0: BAR 0: set to [mem 0xf8000000-0xf800ffff 64bit] (PCI
> address [0xf8000000-0xf800ffff]
> ath5k 0000:01:00.0: enabling device (0000 -> 0002)
> ath5k 0000:01:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
> ath5k 0000:01:00.0: setting latency timer to 64
> ath5k 0000:01:00.0: registered as 'phy9'
> BUG: unable to handle kernel paging request at 1ef700e0
> IP: [<c05a098c>] iowrite32+0xd/0x30
> *pde = 00000000
> Oops: 0002 [#1] SMP
> last sysfs file: /sys/devices/platform/eeepc/rfkill/rfkill0/uevent
> Modules linked in: aes_i586 aes_generic snd_hda_codec_realtek arc4 ecb
> snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device
> ip6t_REJECT ath5k snd_pcm nf_conntrack_ipv6 mac80211 uvcvideo iTCO_wdt
> ip6table_filter ath iTCO_vendor_support videodev snd_timer
> eeepc_laptop microcode ip6_tables v4l1_compat joydev i2c_i801 cfg80211
> snd sparse_keymap atl2 rfkill soundcore snd_page_alloc ipv6 autofs4
> i915 drm_kms_helper drm usb_storage i2c_algo_bit i2c_core video output
>
> Pid: 17, comm: kacpi_notify Not tainted 2.6.33.3-85.fc13.i686 #1 701/701
> EIP: 0060:[<c05a098c>] EFLAGS: 00010216 CPU: 0
> EIP is at iowrite32+0xd/0x30
> EAX: 00000020 EBX: de324000 ECX: 1ef700e0 EDX: 1ef700e0
> ESI: 00000020 EDI: decabdba EBP: decabd90 ESP: decabd90
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Process kacpi_notify (pid: 17, ti=decaa000 task=dec84c80
> task.ti=decaa000)
> Stack:
> decabda4 e08af61e de0e0a40 00000007 decabdba decabdc8 e08af6df decabe3a
> <0> de324000 00000000 00000000 de0e0a40 00000007 decabe38 decabe54
> e08c041c
> <0> 0000000c decabe1c 0000000c e09a0000 de325680 def6a500 1ef6a500
> de324000
> Call Trace:
> [<e08af61e>] ? ath5k_hw_eeprom_read+0x80/0x116 [ath5k]
> [<e08af6df>] ? ath5k_eeprom_read_mac+0x2b/0x8f [ath5k]
> [<e08c041c>] ? ath5k_pci_probe+0xbd4/0x1115 [ath5k]
> [<c05ab416>] ? local_pci_probe+0xe/0x10
> [<c05abdef>] ? pci_device_probe+0x43/0x66
> [<c062e3f6>] ? driver_probe_device+0xc5/0x1cd
> [<c062e587>] ? __device_attach+0x2a/0x2e
> [<c062d832>] ? bus_for_each_drv+0x3d/0x67
> [<c062e5f6>] ? device_attach+0x4c/0x6c
> [<c062e55d>] ? __device_attach+0x0/0x2e
> [<c062d697>] ? bus_probe_device+0x18/0x2d
> [<c062c31f>] ? device_add+0x30d/0x49c
> [<c075f8ea>] ? pci_bus_assign_resources+0xac/0x3a7
> [<c05a723d>] ? pci_device_add+0xca/0xce
> [<c05a6f26>] ? pci_bus_add_device+0xf/0x30
> [<e07480df>] ? eeepc_rfkill_hotplug+0x87/0xc0 [eeepc_laptop]
> [<e0748126>] ? eeepc_rfkill_notify+0xe/0x10 [eeepc_laptop]
> [<c05df721>] ? acpi_ev_notify_dispatch+0x4f/0x5a
> [<c05d205d>] ? acpi_os_execute_deferred+0x1d/0x28
> [<c0448d63>] ? worker_thread+0x13b/0x1b4
> [<c05d2040>] ? acpi_os_execute_deferred+0x0/0x28
> [<c044bd89>] ? autoremove_wake_function+0x0/0x2f
> [<c0448c28>] ? worker_thread+0x0/0x1b4
> [<c044ba47>] ? kthread+0x5f/0x64
> [<c044b9e8>] ? kthread+0x0/0x64
> [<c040383e>] ? kernel_thread_helper+0x6/0x10
> Code: 00 76 0d 25 ff ff 00 00 89 d7 89 c2 f3 6c eb 0a ba ad 60 8b c0
> e8 4e fe ff ff 5b 5f 5d c3 55 81 fa ff ff 03 00 89 e5 89 d1 76 04 <89>
> 02 eb 1d 81 fa 00 00 01 00 76 09 81 e2 ff ff 00 00 ef eb 0c
> EIP: [<c05a098c>] iowrite32+0xd/0x30 SS:ESP 0068:decabd90
> CR2: 000000001ef700e0
> ---[ end trace 2761b2bae95de764 ]---
>
> ## Analysis ##
>
> When the card is disabled, reads will return with all bits set. So we
> think the EEPROM is of maximum size, and we end up reading outside the
> memory mapping for the device?
>
> There are two obvious solutions -
>
> 1. Bounds-check the size read from the EEPROM against the appropriate
> memory mapping.
> 2. Check for "all ones" when reading the size of the EEPROM.
>
>
> I have a feeling the bounds-checking is more annoying than it sounds.
> When I looked at ath5k_hw_eeprom_read(), I saw two types of hardware.
> In one, the EEPROM was mapped directly into PCI space, which is what
> seems to be happening here. In other hardware, the access was
> indirected through just a couple of registers at fixed addresses - in
> this case the ROM could be bigger than the PCI space.
>
> The second option is a bit of a hack though, especially considering
> the "XXX:" comment.
>
>
> Best wishes
> Alan
WARNING: multiple messages have this Message-ID (diff)
From: Alan Jenkins <alan.christopher.jenkins@googlemail.com>
To: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Cc: "Luis R. Rodriguez" <lrodriguez@atheros.com>,
"ath5k-devel@lists.ath5k.org" <ath5k-devel@venema.h4ckr.net>
Subject: Re: [ath5k] OOPS bisected - ath5k_eeprom_init_header() - "XX: This value is likely too big still, waiting on a better value"
Date: Mon, 20 Jun 2011 11:34:06 +0100 [thread overview]
Message-ID: <4DFF221E.6010901@googlemail.com> (raw)
In-Reply-To: <4DFF2156.4000100@googlemail.com>
[Removed "XXX" from subject in order to get past VGER spam filter]
On 20/06/11 11:30, Alan Jenkins wrote:
> ## Summary of bug report ##
> Confidence: high.
> (Bisected. Crash message captured, see end of message).
>
> Severity: low.
> (Users can easily avoid the trigger for this crash).
>
>
> ## Scenario: ##
> Asus EeePC 701.
> Wireless-toggle button, implemented in firmware, disables / enables
> the entire PCI slot. (See platform driver eeepc-laptop).
>
> ## Expected results: ##
> Ath5k (like any other PCI driver), must tolerate removal of PCI
> devices without warning, and *never crash*.
>
> ## Actual results: ##
> Ath5k generally handles removal very well, but crashes in a corner
> case. This is a regression introduced by v2.6.33 (commit commit
> 359207c687cc8f4f9845c8dadd0d6dabad44e584 "ath5k: Fix eeprom checksum
> check for custom sized eeproms").
>
> ## How to reproduce: ##
> Repeatedly press the wireless-toggle button as fast as you can. It
> shouldn't take more than about 20 presses before the OOPS occurs.
>
>
> ## Results of git-bisect ##
>
> commit 359207c687cc8f4f9845c8dadd0d6dabad44e584
> Author: Luis R. Rodriguez <lrodriguez@atheros.com>
> Date: Mon Jan 4 10:40:39 2010 -0500
>
> ath5k: Fix eeprom checksum check for custom sized eeproms
>
> Commit 8bf3d79bc401ca417ccf9fc076d3295d1a71dbf5 enabled EEPROM
> checksum checks to avoid bogus bug reports but failed to address
> updating the code to consider devices with custom EEPROM sizes.
> Devices with custom sized EEPROMs have the upper limit size stuffed
> in the EEPROM. Use this as the upper limit instead of the static
> default size. In case of a checksum error also provide back the
> max size and whether or not this was the default size or a custom
> one. If the EEPROM is busted we add a failsafe check to ensure
> we don't loop forever or try to read bogus areas of hardware.
>
> This closes bug 14874
>
> http://bugzilla.kernel.org/show_bug.cgi?id=14874
>
> Cc: stable@kernel.org
> Cc: David Quan <david.quan@atheros.com>
> Cc: Stephen Beahm <stephenbeahm@comcast.net>
> Reported-by: Joshua Covington <joshuacov@googlemail.com>
> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
> Signed-off-by: John W. Linville <linville@tuxdriver.com>
>
> ...
>
> - for (cksum = 0, offset = 0; offset < AR5K_EEPROM_INFO_MAX;
> offset++) {
> + AR5K_EEPROM_READ(AR5K_EEPROM_SIZE_UPPER, val);
> + if (val) {
> + eep_max = (val & AR5K_EEPROM_SIZE_UPPER_MASK) <<
> + AR5K_EEPROM_SIZE_ENDLOC_SHIFT;
> + AR5K_EEPROM_READ(AR5K_EEPROM_SIZE_LOWER, val);
> + eep_max = (eep_max | val) - AR5K_EEPROM_INFO_BASE;
> +
> + /*
> + * Fail safe check to prevent stupid loops due
> + * to busted EEPROMs. XXX: This value is likely too
> + * big still, waiting on a better value.
> + */
> + if (eep_max > (3 * AR5K_EEPROM_INFO_MAX)) {
> + ATH5K_ERR(ah->ah_sc, "Invalid max custom
> EEPROM size: "
> + "%d (0x%04x) max expected: %d
> (0x%04x)\n",
> + eep_max, eep_max,
> + 3 * AR5K_EEPROM_INFO_MAX,
> + 3 * AR5K_EEPROM_INFO_MAX);
> + return -EIO;
> + }
> + }
> +
> + for (cksum = 0, offset = 0; offset < eep_max; offset++) {
>
>
> ### Example OOPS message ###
>
> pci 0000:01:00.0: reg 10: [mem 0x00000000-0x0000ffff 64bit]
> pci 0000:01:00.0: BAR 0: assigned [mem 0xf8000000-0xf800ffff 64bit]
> pci 0000:01:00.0: BAR 0: set to [mem 0xf8000000-0xf800ffff 64bit] (PCI
> address [0xf8000000-0xf800ffff]
> ath5k 0000:01:00.0: enabling device (0000 -> 0002)
> ath5k 0000:01:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
> ath5k 0000:01:00.0: setting latency timer to 64
> ath5k 0000:01:00.0: registered as 'phy9'
> BUG: unable to handle kernel paging request at 1ef700e0
> IP: [<c05a098c>] iowrite32+0xd/0x30
> *pde = 00000000
> Oops: 0002 [#1] SMP
> last sysfs file: /sys/devices/platform/eeepc/rfkill/rfkill0/uevent
> Modules linked in: aes_i586 aes_generic snd_hda_codec_realtek arc4 ecb
> snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device
> ip6t_REJECT ath5k snd_pcm nf_conntrack_ipv6 mac80211 uvcvideo iTCO_wdt
> ip6table_filter ath iTCO_vendor_support videodev snd_timer
> eeepc_laptop microcode ip6_tables v4l1_compat joydev i2c_i801 cfg80211
> snd sparse_keymap atl2 rfkill soundcore snd_page_alloc ipv6 autofs4
> i915 drm_kms_helper drm usb_storage i2c_algo_bit i2c_core video output
>
> Pid: 17, comm: kacpi_notify Not tainted 2.6.33.3-85.fc13.i686 #1 701/701
> EIP: 0060:[<c05a098c>] EFLAGS: 00010216 CPU: 0
> EIP is at iowrite32+0xd/0x30
> EAX: 00000020 EBX: de324000 ECX: 1ef700e0 EDX: 1ef700e0
> ESI: 00000020 EDI: decabdba EBP: decabd90 ESP: decabd90
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Process kacpi_notify (pid: 17, ti=decaa000 task=dec84c80
> task.ti=decaa000)
> Stack:
> decabda4 e08af61e de0e0a40 00000007 decabdba decabdc8 e08af6df decabe3a
> <0> de324000 00000000 00000000 de0e0a40 00000007 decabe38 decabe54
> e08c041c
> <0> 0000000c decabe1c 0000000c e09a0000 de325680 def6a500 1ef6a500
> de324000
> Call Trace:
> [<e08af61e>] ? ath5k_hw_eeprom_read+0x80/0x116 [ath5k]
> [<e08af6df>] ? ath5k_eeprom_read_mac+0x2b/0x8f [ath5k]
> [<e08c041c>] ? ath5k_pci_probe+0xbd4/0x1115 [ath5k]
> [<c05ab416>] ? local_pci_probe+0xe/0x10
> [<c05abdef>] ? pci_device_probe+0x43/0x66
> [<c062e3f6>] ? driver_probe_device+0xc5/0x1cd
> [<c062e587>] ? __device_attach+0x2a/0x2e
> [<c062d832>] ? bus_for_each_drv+0x3d/0x67
> [<c062e5f6>] ? device_attach+0x4c/0x6c
> [<c062e55d>] ? __device_attach+0x0/0x2e
> [<c062d697>] ? bus_probe_device+0x18/0x2d
> [<c062c31f>] ? device_add+0x30d/0x49c
> [<c075f8ea>] ? pci_bus_assign_resources+0xac/0x3a7
> [<c05a723d>] ? pci_device_add+0xca/0xce
> [<c05a6f26>] ? pci_bus_add_device+0xf/0x30
> [<e07480df>] ? eeepc_rfkill_hotplug+0x87/0xc0 [eeepc_laptop]
> [<e0748126>] ? eeepc_rfkill_notify+0xe/0x10 [eeepc_laptop]
> [<c05df721>] ? acpi_ev_notify_dispatch+0x4f/0x5a
> [<c05d205d>] ? acpi_os_execute_deferred+0x1d/0x28
> [<c0448d63>] ? worker_thread+0x13b/0x1b4
> [<c05d2040>] ? acpi_os_execute_deferred+0x0/0x28
> [<c044bd89>] ? autoremove_wake_function+0x0/0x2f
> [<c0448c28>] ? worker_thread+0x0/0x1b4
> [<c044ba47>] ? kthread+0x5f/0x64
> [<c044b9e8>] ? kthread+0x0/0x64
> [<c040383e>] ? kernel_thread_helper+0x6/0x10
> Code: 00 76 0d 25 ff ff 00 00 89 d7 89 c2 f3 6c eb 0a ba ad 60 8b c0
> e8 4e fe ff ff 5b 5f 5d c3 55 81 fa ff ff 03 00 89 e5 89 d1 76 04 <89>
> 02 eb 1d 81 fa 00 00 01 00 76 09 81 e2 ff ff 00 00 ef eb 0c
> EIP: [<c05a098c>] iowrite32+0xd/0x30 SS:ESP 0068:decabd90
> CR2: 000000001ef700e0
> ---[ end trace 2761b2bae95de764 ]---
>
> ## Analysis ##
>
> When the card is disabled, reads will return with all bits set. So we
> think the EEPROM is of maximum size, and we end up reading outside the
> memory mapping for the device?
>
> There are two obvious solutions -
>
> 1. Bounds-check the size read from the EEPROM against the appropriate
> memory mapping.
> 2. Check for "all ones" when reading the size of the EEPROM.
>
>
> I have a feeling the bounds-checking is more annoying than it sounds.
> When I looked at ath5k_hw_eeprom_read(), I saw two types of hardware.
> In one, the EEPROM was mapped directly into PCI space, which is what
> seems to be happening here. In other hardware, the access was
> indirected through just a couple of registers at fixed addresses - in
> this case the ROM could be bigger than the PCI space.
>
> The second option is a bit of a hack though, especially considering
> the "XXX:" comment.
>
>
> Best wishes
> Alan
next parent reply other threads:[~2011-06-20 10:34 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4DFF2156.4000100@googlemail.com>
2011-06-20 10:34 ` Alan Jenkins [this message]
2011-06-20 10:34 ` [ath5k] OOPS bisected - ath5k_eeprom_init_header() - "XX: This value is likely too big still, waiting on a better value" Alan Jenkins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DFF221E.6010901@googlemail.com \
--to=alan.christopher.jenkins@googlemail.com \
--cc=ath5k-devel@lists.ath5k.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-wireless@vger.kernel.org \
--cc=lrodriguez@atheros.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.