* Re: Null pointer dereference when station associates [introduced by 4.0.5?]
From: Johannes Berg @ 2015-06-29 8:14 UTC
To: Tom Hughes, linux-wireless; +Cc: stable

On Sat, 2015-06-27 at 16:34 +0100, Tom Hughes wrote:
>
> Interestingly from what I can see this is trying to create a file
> for the station at a path something like:
>
> ieee80211/phy0/netdev:XXXX/stations/XXXXXX

indeed.

> but in my (currently working) boot under 4.0.4 there is no netdev
> directory under phy0 in debugfs... but then maybe that is the problem
> as well if the inode pointer was null?
>

This is pretty strange - if the dentry pointer (sdata
->debugfs.subdir_stations) was NULL or an ERR_PTR(), the code would
return pretty much immediately.

So it looks like that pointer is valid, but its ->d_inode was NULL?

I'm not really sure how that could happen.

Since 4.0.4 was stable, and 4.0.5 crashes, you'd think there's
something wrong between those two kernels and there were no changes to
mac80211 related to these code paths in there.

johannes
* Re: Null pointer dereference when station associates [introduced by 4.0.5?]
From: Tom Hughes @ 2015-06-29 8:30 UTC
To: Johannes Berg, linux-wireless; +Cc: stable

On 29/06/15 09:14, Johannes Berg wrote:
> On Sat, 2015-06-27 at 16:34 +0100, Tom Hughes wrote:
>>
>> Interestingly from what I can see this is trying to create a file
>> for the station at a path something like:
>>
>> ieee80211/phy0/netdev:XXXX/stations/XXXXXX
>
> indeed.
>
>> but in my (currently working) boot under 4.0.4 there is no netdev
>> directory under phy0 in debugfs... but then maybe that is the problem
>> as well if the inode pointer was null?
>>
>
> This is pretty strange - if the dentry pointer (sdata
> ->debugfs.subdir_stations) was NULL or an ERR_PTR(), the code would
> return pretty much immediately.
>
> So it looks like that pointer is valid, but its ->d_inode was NULL?
>
> I'm not really sure how that could happen.

Indeed I'm a bit puzzled...

I can't see anything obvious in the kernel logs indicating a problem,
but here's a listing of the phy0 directory:

[root@gosford]/home/tom# uname -a
Linux gosford.compton.nu 4.0.4-301.fc22.i686+PAE #1 SMP Thu May 21 13:27:48 UTC 2015 i686 i686 i386 GNU/Linux
[root@gosford]/home/tom# ls /sys/kernel/debug/ieee80211/phy0
ath9k                    keys              rc                 statistics
fragmentation_threshold  long_retry_limit  reset              total_ps_buffered
ht40allow_map            power             rts_threshold      user_power
hwflags                  queues            short_retry_limit  wep_iv

with no netdev directory at all.

Interestingly I just tried a different machine running on more or less
the same kernel with a USB wireless stick and that did get a netdev
directory...

> Since 4.0.4 was stable, and 4.0.5 crashes, you'd think there's
> something wrong between those two kernels and there were no changes to
> mac80211 related to these code paths in there.

Well 4.0.4 did hit it eventually, but it had been running stably for
a month first.

I then rebooted (because networking is basically wedged after this
happens) and got 4.0.5 which hit it immediately as did several more
reboots before I went back to the older kernel.

Tom

--
Tom Hughes (tom@compton.nu)
http://compton.nu/
* Re: Null pointer dereference when station associates [introduced by 4.0.5?]
From: Tom Hughes @ 2015-06-29 9:20 UTC
To: Johannes Berg, linux-wireless; +Cc: stable

On 29/06/15 09:30, Tom Hughes wrote:
> On 29/06/15 09:14, Johannes Berg wrote:
>> On Sat, 2015-06-27 at 16:34 +0100, Tom Hughes wrote:
>>>
>>> Interestingly from what I can see this is trying to create a file
>>> for the station at a path something like:
>>>
>>> ieee80211/phy0/netdev:XXXX/stations/XXXXXX
>>
>> indeed.
>>
>>> but in my (currently working) boot under 4.0.4 there is no netdev
>>> directory under phy0 in debugfs... but then maybe that is the problem
>>> as well if the inode pointer was null?
>>>
>>
>> This is pretty strange - if the dentry pointer (sdata
>> ->debugfs.subdir_stations) was NULL or an ERR_PTR(), the code would
>> return pretty much immediately.
>>
>> So it looks like that pointer is valid, but its ->d_inode was NULL?
>>
>> I'm not really sure how that could happen.
>
> Indeed I'm a bit puzzled...

It looks like hostapd has something to do with it... If I stop hostapd
and remove ath9k and then reprobe it then the netdev dir appears:

gosford [~] % sudo modprobe ath9k
gosford [~] % sudo ls /sys/kernel/debug/ieee80211/phy1
ath9k                    long_retry_limit  reset              user_power
fragmentation_threshold  netdev:wlp2s0     rts_threshold      wep_iv
ht40allow_map            power             short_retry_limit
hwflags                  queues            statistics
keys                     rc                total_ps_buffered

Then I start hostapd and it vanishes:

gosford [~] % sudo systemctl start hostapd
gosford [~] % sudo ls /sys/kernel/debug/ieee80211/phy1
ath9k                    keys              rc                 statistics
fragmentation_threshold  long_retry_limit  reset              total_ps_buffered
ht40allow_map            power             rts_threshold      user_power
hwflags                  queues            short_retry_limit  wep_iv

Tom

--
Tom Hughes (tom@compton.nu)
http://compton.nu/
* Re: Null pointer dereference when station associates [introduced by 4.0.5?]
From: Tom Hughes @ 2015-06-29 9:44 UTC
To: Johannes Berg, linux-wireless; +Cc: stable

On 29/06/15 10:20, Tom Hughes wrote:
> On 29/06/15 09:30, Tom Hughes wrote:
>> On 29/06/15 09:14, Johannes Berg wrote:
>>> On Sat, 2015-06-27 at 16:34 +0100, Tom Hughes wrote:
>>>>
>>>> Interestingly from what I can see this is trying to create a file
>>>> for the station at a path something like:
>>>>
>>>> ieee80211/phy0/netdev:XXXX/stations/XXXXXX
>>>
>>> indeed.
>>>
>>>> but in my (currently working) boot under 4.0.4 there is no netdev
>>>> directory under phy0 in debugfs... but then maybe that is the problem
>>>> as well if the inode pointer was null?
>>>>
>>>
>>> This is pretty strange - if the dentry pointer (sdata
>>> ->debugfs.subdir_stations) was NULL or an ERR_PTR(), the code would
>>> return pretty much immediately.
>>>
>>> So it looks like that pointer is valid, but its ->d_inode was NULL?
>>>
>>> I'm not really sure how that could happen.
>>
>> Indeed I'm a bit puzzled...
>
> It looks like hostapd has something to do with it... If I stop hostapd and
> remove ath9k and then reprobe it then the netdev dir appears:
>
> gosford [~] % sudo modprobe ath9k
> gosford [~] % sudo ls /sys/kernel/debug/ieee80211/phy1
> ath9k                    long_retry_limit  reset              user_power
> fragmentation_threshold  netdev:wlp2s0     rts_threshold      wep_iv
> ht40allow_map            power             short_retry_limit
> hwflags                  queues            statistics
> keys                     rc                total_ps_buffered
>
> Then I start hostapd and it vanishes:

...and you also need to have selinux in enforcing mode.

It appears hostapd is trying to do something with debugfs and is
being denied directory search access:

time->Mon Jun 29 10:39:34 2015
type=PROCTITLE msg=audit(1435570774.085:16533): proctitle=2F7573722F7362696E2F686F7374617064002F6574632F686F73746170642F686F73746170642E636F6E66002D50002F72756E2F686F73746170642E706964002D42
type=SYSCALL msg=audit(1435570774.085:16533): arch=40000003 syscall=102 success=yes exit=36 a0=10 a1=bf93c910 a2=b777d000 a3=90517e8 items=0 ppid=1 pid=7241 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="hostapd" exe="/usr/sbin/hostapd" subj=system_u:system_r:hostapd_t:s0 key=(null)
type=AVC msg=audit(1435570774.085:16533): avc: denied { search } for pid=7241 comm="hostapd" name="phy7" dev="debugfs" ino=5626659 scontext=system_u:system_r:hostapd_t:s0 tcontext=system_u:object_r:debugfs_t:s0 tclass=dir permissive=1

It must then do something that breaks the kernel...

Tom

--
Tom Hughes (tom@compton.nu)
http://compton.nu/
* Re: Null pointer dereference when station associates [introduced by 4.0.5?]
From: Tom Hughes @ 2015-06-29 10:24 UTC
To: Johannes Berg, linux-wireless; +Cc: stable

On 29/06/15 10:44, Tom Hughes wrote:
> On 29/06/15 10:20, Tom Hughes wrote:
>> On 29/06/15 09:30, Tom Hughes wrote:
>>> On 29/06/15 09:14, Johannes Berg wrote:
>>>> On Sat, 2015-06-27 at 16:34 +0100, Tom Hughes wrote:
>>>>>
>>>>> Interestingly from what I can see this is trying to create a file
>>>>> for the station at a path something like:
>>>>>
>>>>> ieee80211/phy0/netdev:XXXX/stations/XXXXXX
>>>>
>>>> indeed.
>>>>
>>>>> but in my (currently working) boot under 4.0.4 there is no netdev
>>>>> directory under phy0 in debugfs... but then maybe that is the problem
>>>>> as well if the inode pointer was null?
>>>>>
>>>>
>>>> This is pretty strange - if the dentry pointer (sdata
>>>> ->debugfs.subdir_stations) was NULL or an ERR_PTR(), the code would
>>>> return pretty much immediately.
>>>>
>>>> So it looks like that pointer is valid, but its ->d_inode was NULL?
>>>>
>>>> I'm not really sure how that could happen.
>>>
>>> Indeed I'm a bit puzzled...
>>
>> It looks like hostapd has something to do with it... If I stop hostapd and
>> remove ath9k and then reprobe it then the netdev dir appears:
>>
>> gosford [~] % sudo modprobe ath9k
>> gosford [~] % sudo ls /sys/kernel/debug/ieee80211/phy1
>> ath9k                    long_retry_limit  reset              user_power
>> fragmentation_threshold  netdev:wlp2s0     rts_threshold      wep_iv
>> ht40allow_map            power             short_retry_limit
>> hwflags                  queues            statistics
>> keys                     rc                total_ps_buffered
>>
>> Then I start hostapd and it vanishes:
>
> ...and you also need to have selinux in enforcing mode.
>
> It appears hostapd is trying to do something with debugfs and is
> being denied directory search access:

So I think this happens when hostapd switches the interface
to AP mode, which causes the netdev to be torn down and then
recreated, and the debugfs directory along with it.

Except that if the netlink message to change the mode was
sent from a daemon whose selinux context prevents searching
debugfs the recreation somehow fails and leaves an invalid
state that later causes the null pointer deref.

Tom

--
Tom Hughes (tom@compton.nu)
http://compton.nu/
* Re: Null pointer dereference when station associates [introduced by 4.0.5?]
From: Tom Hughes @ 2015-06-29 10:28 UTC
To: Johannes Berg, linux-wireless; +Cc: stable

On 29/06/15 11:24, Tom Hughes wrote:

> So I think this happens when hostapd switches the interface
> to AP mode, which causes the netdev to be torn down and then
> recreated, and the debugfs directory along with it.
>
> Except that if the netlink message to change the mode was
> sent from a daemon whose selinux context prevents searching
> debugfs the recreation somehow fails and leaves an invalid
> state that later causes the null pointer deref.

Think I have it...

The teardown runs ieee80211_debugfs_remove_netdev which clears
sdata->vif.debugfs_dir but does not clear
sdata->debugfs.subdir_stations so that when
ieee80211_debugfs_add_netdev later fails to create the top level
netdev directory we are left with a bogus pointer for the stations
directory.

Then when we try and add an entry to the stations directory things
blow up.

Tom

--
Tom Hughes (tom@compton.nu)
http://compton.nu/
* [PATCH] Clear subdir_stations when stations directory is removed (was Re: Null pointer dereference when station associates [introduced by 4.0.5?])
From: Tom Hughes @ 2015-06-29 18:41 UTC
To: Johannes Berg, linux-wireless; +Cc: stable

On 29/06/15 11:28, Tom Hughes wrote:
> On 29/06/15 11:24, Tom Hughes wrote:
>
>> So I think this happens when hostapd switches the interface
>> to AP mode, which causes the netdev to be torn down and then
>> recreated, and the debugfs directory along with it.
>>
>> Except that if the netlink message to change the mode was
>> sent from a daemon whose selinux context prevents searching
>> debugfs the recreation somehow fails and leaves an invalid
>> state that later causes the null pointer deref.
>
> Think I have it...
>
> The teardown runs ieee80211_debugfs_remove_netdev
> which clears sdata->vif.debugfs_dir but does not clear
> sdata->debugfs.subdir_stations so that when ieee80211_debugfs_add_netdev
> later fails to create the top level
> netdev directory we are left with a bogus pointer for the stations
> directory.
>
> Then when we try and add an entry to the stations directory things blow up.

Here's a proposed patch. I have booted 4.0.6 with this applied and so far
it hasn't failed even with selinux in enforcing mode.

commit 30624496e9f411081d7ea1a407deabe0e32d0c62
Author: Tom Hughes <tom@compton.nu>
Date:   Mon Jun 29 11:31:04 2015 +0100

    Clear subdir_stations when stations directory is removed

    If we don't do this, and we then fail to recreate the debugfs
    directory during a mode change, then we will fail later trying
    to add stations to this now bogus directory:

    BUG: unable to handle kernel NULL pointer dereference at 0000006c
    IP: [<c0a92202>] mutex_lock+0x12/0x30
    Call Trace:
    [<c0678ab4>] start_creating+0x44/0xc0
    [<c0679203>] debugfs_create_dir+0x13/0xf0
    [<f8a938ae>] ieee80211_sta_debugfs_add+0x6e/0x490 [mac80211]

    Signed-off-by: Tom Hughes <tom@compton.nu>

diff --git a/net/mac80211/debugfs_netdev.c b/net/mac80211/debugfs_netdev.c
index 29236e8..c09c013 100644
--- a/net/mac80211/debugfs_netdev.c
+++ b/net/mac80211/debugfs_netdev.c
@@ -723,6 +723,7 @@ void ieee80211_debugfs_remove_netdev(struct ieee80211_sub_if_data *sdata)
 
 	debugfs_remove_recursive(sdata->vif.debugfs_dir);
 	sdata->vif.debugfs_dir = NULL;
+	sdata->debugfs.subdir_stations = NULL;
 }
 
 void ieee80211_debugfs_rename_netdev(struct ieee80211_sub_if_data *sdata)

Tom

--
Tom Hughes (tom@compton.nu)
http://compton.nu/
* Re: [PATCH] Clear subdir_stations when stations directory is removed (was Re: Null pointer dereference when station associates [introduced by 4.0.5?])
From: Johannes Berg @ 2015-07-17 8:53 UTC
To: Tom Hughes, linux-wireless; +Cc: stable

On Mon, 2015-06-29 at 19:41 +0100, Tom Hughes wrote:
> On 29/06/15 11:28, Tom Hughes wrote:
> > On 29/06/15 11:24, Tom Hughes wrote:
> >
> > > So I think this happens when hostapd switches the interface
> > > to AP mode, which causes the netdev to be torn down and then
> > > recreated, and the debugfs directory along with it.
> > >
> > > Except that if the netlink message to change the mode was
> > > sent from a daemon whose selinux context prevents searching
> > > debugfs the recreation somehow fails and leaves an invalid
> > > state that later causes the null pointer deref.
> >
> > Think I have it...
> >
> > The teardown runs ieee80211_debugfs_remove_netdev
> > which clears sdata->vif.debugfs_dir but does not clear
> > sdata->debugfs.subdir_stations so that when
> > ieee80211_debugfs_add_netdev
> > later fails to create the top level
> > netdev directory we are left with a bogus pointer for the stations
> > directory.
> >
> > Then when we try and add an entry to the stations directory things
> > blow up.
>
> Here's a proposed patch. I have booted 4.0.6 with this applied and so
> far
> it hasn't failed even with selinux in enforcing mode.
>
> commit 30624496e9f411081d7ea1a407deabe0e32d0c62
> Author: Tom Hughes <tom@compton.nu>
> Date: Mon Jun 29 11:31:04 2015 +0100
>
> Clear subdir_stations when stations directory is removed
>
> If we don't do this, and we then fail to recreate the debugfs
> directory during a mode change, then we will fail later trying
> to add stations to this now bogus directory:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000006c
> IP: [<c0a92202>] mutex_lock+0x12/0x30
> Call Trace:
> [<c0678ab4>] start_creating+0x44/0xc0
> [<c0679203>] debugfs_create_dir+0x13/0xf0
> [<f8a938ae>] ieee80211_sta_debugfs_add+0x6e/0x490 [mac80211]
>
> Signed-off-by: Tom Hughes <tom@compton.nu>
>

Applied.

johannes
Thread overview: 8+ messages
[not found] <558EC27A.60804@compton.nu>
2015-06-29 8:14 ` Null pointer dereference when station associates [introduced by 4.0.5?] Johannes Berg
2015-06-29 8:30 ` Tom Hughes
2015-06-29 9:20 ` Tom Hughes
2015-06-29 9:44 ` Tom Hughes
2015-06-29 10:24 ` Tom Hughes
2015-06-29 10:28 ` Tom Hughes
2015-06-29 18:41 ` [PATCH] Clear subdir_stations when stations directory is removed (was Re: Null pointer dereference when station associates [introduced by 4.0.5?]) Tom Hughes
2015-07-17 8:53 ` Johannes Berg