Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Sedat Dilek @ 2012-04-28 22:49 UTC (permalink / raw)
  To: Jim Cromie
  Cc: Lennert Buytenhek, John W. Linville, linux-wireless, netdev,
	lautriv
In-Reply-To: <CAJfuBxw7YFAC9CLAGNmg3nM_g2xH2vuiz07D-3xOcyn2XuP5vA@mail.gmail.com>

On Sun, Apr 29, 2012 at 12:36 AM, Jim Cromie <jim.cromie@gmail.com> wrote:
> [dropped lkml from cc]
>
> On Sat, Apr 28, 2012 at 4:25 PM, Sedat Dilek <sedat.dilek@googlemail.com> wrote:
>> On Fri, Apr 27, 2012 at 11:53 PM, Sedat Dilek
>> <sedat.dilek@googlemail.com> wrote:
>>> On Fri, Apr 27, 2012 at 8:58 PM, Lennert Buytenhek
>>> <buytenh@wantstofly.org> wrote:
>>>> On Fri, Apr 27, 2012 at 03:29:26PM +0200, Sedat Dilek wrote:
>>>>
>>>>> >>> >> >> >> Are you planning to or even working on support (for) 8361 devices?
>>>>> >>> >> >> >
>>>>> >>> >> >> > I don't have any such plans, and I don't know of anyone who does.
>>>>> >>> >> >>
>>>>> >>> >> >> Does 8361 require firmware support?
>>>>> >>> >> >
>>>>> >>> >> > What do you mean by that?
>>>>> >>> >>
>>>>> >>> >> IIRC ath5k for example needs no external firmware file.
>>>>> >>> >
>>>>> >>> > The 8361 requires firmware to be loaded into it before it'll do
>>>>> >>> > anything useful.
>>>>> >>> >
>>>>> >>> >
>>>>> >>> >> >> Does a firmware file exist (name?)?
>>>>> >>> >> >
>>>>> >>> >> > There's firmware for the 8361 out there, however, that version of
>>>>> >>> >> > the firmware implements a firmware API that is different from the
>>>>> >>> >> > one that mwl8k currently implements.
>>>>> >>> >> >
>>>>> >>> >> > You could add 8361 support to mwl8k, but then you'd have to go over
>>>>> >>> >> > all the firmware command invocations in mwl8k and make sure that they
>>>>> >>> >> > will work on the 8361 firmware that you're trying to support as well.
>>>>> >>> >>
>>>>> >>> >> Without having a 8361 this will be even harder to walk through.
>>>>> >>> >>
>>>>> >>> >> Anyway, thanks for your detailed explanations.
>>>>> >>> >>
>>>>> >>> >> What's the alternative for such affected users?
>>>>> >>> >> Use ndis-wrapper?
>>>>> >>> >
>>>>> >>> > I'm not sure.  I've never tried to get a 8361 work under Linux.
>>>>> >>>
>>>>> >>> Just found on [1] this same wrong patch in [2] :-).
>>>>> >>> Dunno if [3] worked and from where they have stolen fw-files.
>>>>> >>>
>>>>> >>> - Sedat -
>>>>> >>>
>>>>> >>> [1] https://dev.openwrt.org/ticket/7209
>>>>> >>> [2] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_fix_pci_id.patch
>>>>> >>> [3] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_8361p.patch
>>>>> >>
>>>>> >> I doubt that [3] is really all that's needed to make it work.  But if
>>>>> >> there's someone for whom it works, I'd like them to run some tests on
>>>>> >> mwl8k + [3] on 88w8361p.
>>>>> >
>>>>> > Just asked lautriv to do so (he will do against linux-3.3.3), lets' see.
>>>>> > Also, I refreshed and adapted a bit the instructions (see attachment).
>>>>> >
>>>>> > - Sedat -
>>>>>
>>>>> [ CC Jim Cromie ]
>>>>>
>>>>> Now, with extracted firmware files renamed & copied to
>>>>> /lib/firmware/mwl8k/ and applied patch (see attachment) against
>>>>> Linux-3.4-rc4+ I am seeing with modinfo:
>>>>>
>>>>> $ sudo modinfo mwl8k
>>>>> filename:
>>>>> /lib/modules/3.4.0-rc4-5-generic/kernel/drivers/net/wireless/mwl8k.ko
>>>>> license:        GPL
>>>>> author:         Lennert Buytenhek <buytenh@marvell.com>
>>>>> version:        0.13
>>>>> description:    Marvell TOPDOG(R) 802.11 Wireless Network Driver
>>>>> firmware:       mwl8k/fmimage_8366_ap-2.fw
>>>>> firmware:       mwl8k/fmimage_8366.fw
>>>>> firmware:       mwl8k/helper_8366.fw
>>>>> firmware:       mwl8k/fmimage_8687.fw
>>>>> firmware:       mwl8k/helper_8687.fw
>>>>> firmware:       mwl8k/fmimage_8363.fw
>>>>> firmware:       mwl8k/helper_8363.fw
>>>>> firmware:       mwl8k/fmimage_8361p.fw
>>>>> firmware:       mwl8k/helper_8361p.fw
>>>>> srcversion:     9E1479A05C8D67E6AE90746
>>>>> alias:          pci:v000011ABd00002A43sv*sd*bc*sc*i*
>>>>> alias:          pci:v000011ABd00002A40sv*sd*bc*sc*i*
>>>>> alias:          pci:v000011ABd00002A30sv*sd*bc*sc*i*
>>>>> alias:          pci:v000011ABd00002A2Bsv*sd*bc*sc*i*
>>>>> alias:          pci:v000011ABd00002A24sv*sd*bc*sc*i*
>>>>> alias:          pci:v000011ABd00002A0Csv*sd*bc*sc*i*
>>>>> alias:          pci:v000011ABd00002A0Asv*sd*bc*sc*i*
>>>>> alias:          pci:v000011ABd00002A02sv*sd*bc*sc*i* <--- 8361P:
>>>>> 0x2a02 PCI device-id
>>>>> depends:        mac80211,cfg80211
>>>>> vermagic:       3.4.0-rc4-5-generic SMP mod_unload modversions
>>>>> parm:           ap_mode_default:Set to 1 to make ap mode the default
>>>>> instead of sta mode (bool)
>>>>>
>>>>> $ ls -l /lib/firmware/mwl8k/
>>>>> insgesamt 456
>>>>> -rw-r--r-- 1 root root  75848 Apr 27 13:49 fmimage_8361p.fw <---
>>>>> 8361P: Firmware image
>>>>> -rw-r--r-- 1 root root 101780 Mär 19 19:32 fmimage_8366_ap-1.fw
>>>>> -rw-r--r-- 1 root root 101976 Mär 19 19:32 fmimage_8366_ap-2.fw
>>>>> -rw-r--r-- 1 root root  96664 Mär 19 19:32 fmimage_8366.fw
>>>>> -rw-r--r-- 1 root root  73252 Feb 23 20:07 fmimage_8687.fw
>>>>> -rw-r--r-- 1 root root   2476 Apr 27 13:49 helper_8361p.fw <--- 8361P:
>>>>> Helper image
>>>>> -rw-r--r-- 1 root root   2476 Mär 19 19:32 helper_8366.fw
>>>>> -rw-r--r-- 1 root root   2476 Feb 23 20:07 helper_8687.fw
>>>>>
>>>>> As said... ***compile-tested*** only here.
>>>>
>>>> Which doesn't say much at all, but..
>>>>
>>>
>>> As already pointed out, no Marwell WLAN hardware here. Marvell comics
>>> of course :-).
>>>
>>>>
>>>>> I had a short query with lautriv on #linux-wireless this afternoon:
>>>>> * mwl8k kernel-module was autoloaded
>>>>> * wlan0 interface got fired up
>>>>> * ESSID was accepted
>>>>> * logs reported 802.11bgn support is active
>>>>> * (he tested on Linux-3.3.3)
>>>>>
>>>>> With WPA/WPA2 lautriv had some problems as his installation was
>>>>> missing wpasupplicant.
>>>>> He setup a classic /etc/network/interfaces.
>>>>> After resetting his router mwl8k worked nicely.
>>>>>
>>>>> Unfortunately, he was on the run and promised me to send logs and do
>>>>> more testing this evening.
>>>>
>>>> ..but this is pretty interesting.  I thought that the only available
>>>> 8361p firmware used an incompatible firmware API, and I didn't know
>>>> of the existence of an apparently compatible firmware.
>>>>
>>>>
>>>>> So, Lennert if you want more testing - What? How? etc.
>>>>
>>>> For one, the output of 'iw phy', please.
>>>>
>>>> Also, does monitor mode work?  Do you get plausible channel/rxpower
>>>> values in tcpdump in monitor mode?
>>>>
>>>> Are there any messages in the syslog about failing commands?
>>>>
>>>>
>>>> thanks,
>>>> Lennert
>>>
>>> On 1st sight, logs look fine:
>>>
>>> [21:52:52] <lautriv> [    6.050967] ieee80211 phy0: 88w8361p v4,
>>> 00173f3bdde3, STA firmware 2.1.4.25
>>>
>>> But WLAN connection is not that fast and stable as lautriv reports
>>> (several abnormalities were observed).
>>>
>>> I requested a tarball which includes:
>>> * dmesg (Linux-3.3.3)
>>> * e_n_a (/etc/network/interfaces)
>>> * ifconfig output
>>> * iwconfig output
>>> * iw_phy output
>>> * ps_axu (WPA) output
>>>
>>> lautriv will be so kind to be around on #linux-wireless/Freenode the
>>> next days (UTC+2: German/Swiss local-time).
>>> Just ping him.
>>>
>>> Hope you have fun, together!
>>>
>>> - Sedat -
>>
>> A new tarball from lautriv with same outputs as before, but now tested
>> with Linux-3.4-rc4.
>>
>> - Sedat -
>
>
> heres my logs, using firmware extracted by Sedat's script,
> and the patch on mwl8k.c
>
> bottom-line, it appears to be working.
>
> its contents are a bit more pedantic, and includes data for
> another wifi card (rtl8180 based) also in the box.
> It was obtained by this script:
>
> #!/bin/bash
>
> # dmesg (Linux-3.3.3)
> # e_n_a (/etc/network/interfaces)
> # ifconfig output
> # iwconfig output
> # iw_phy output
> # ps_axu (WPA) output
>
> devs="wlan0 wlan1"
> apmac=00:14:d1:e8:65:0a
>
> loudly () {
>    echo "# $@"
>    fname=`echo $@ | sed -e 's/ /-/g'`
>    $@ 2> $fname-err | tee $fname
>    [ $? != 0 ] && echo non-zero exit on $fname: $?
>    [ -s $fname-err ] || rm $fname-err
> }
>
> ( iw --debug event -f > iw-event-f )&
> pid_event=$!
>
> for N in 0 1 ; do
>    loudly iw dev wlan$N interface add fish$N type monitor # flags none
>    loudly iw dev fish$N set channel 8
>    loudly ifconfig fish$N up
>    ( tcpdump -i fish$N -s 65000 -p -U -w  fish$N.dump )&
>    pid_dump_fish$N=$!
> done
>
> loudly iw list
>
> #loudly iwspy
> # gives: Interface doesn't support wireless statistic collection
>
> for dev in $devs ; do
>    loudly ifconfig $dev
>    loudly iwconfig $dev
>    loudly iwlist $dev scan
>    loudly iw dev $dev info
>    loudly iw dev $dev link
>    loudly iw dev $dev scan
>    loudly iw dev $dev survey dump
> done
>
> for phy in $phys ; do
>    loudly iw phy $phy info
> done
>
> # these are unsupported on wlan0
> loudly iw dev wlan1 survey dump
> loudly iw dev wlan1 station dump
> loudly iw dev wlan1 station get $apmac
>
>
> for N in 0 1 ; do
>    loudly iw dev fish$N del
> done
>
> kill $pid_dump_fish0 $pid_dump_fish0
> kill $pid_event
>
> dmesg > dmesg
>
> grep -vE '^#|key' /etc/network/interfaces > e_n_a
>
> exit

Hi Jim,

thanks for your testing and the nice testcase-script!

lautriv you wanna run some more tests with Jim's script?

Jim, how stable/fast/reliable is your WLAN connection?
Suspend/resume tested?

Hope this helps to get native Linux support for 8361p.

Regards,
- Sedat -

P.S.: BTW, only to clarify it should be "e_n_i" as short-form for
/etc/network/interfaces file, but e_n_a sounds more female and nicer
:-).

^ permalink raw reply

* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Jim Cromie @ 2012-04-28 23:11 UTC (permalink / raw)
  To: sedat.dilek
  Cc: Lennert Buytenhek, John W. Linville, linux-wireless, netdev,
	lautriv
In-Reply-To: <CA+icZUVT6kvqj47x26FU2qjbMHc=jr9_Lso=zduf_cmcSrv=UQ@mail.gmail.com>

On Sat, Apr 28, 2012 at 4:49 PM, Sedat Dilek <sedat.dilek@googlemail.com> wrote:
> On Sun, Apr 29, 2012 at 12:36 AM, Jim Cromie <jim.cromie@gmail.com> wrote:

>>>> As already pointed out, no Marwell WLAN hardware here. Marvell comics
>>>> of course :-).

(I had to leave that one..:-)



>>>
>>> A new tarball from lautriv with same outputs as before, but now tested
>>> with Linux-3.4-rc4.
>>>
>>> - Sedat -
>>
>>
>> heres my logs, using firmware extracted by Sedat's script,
>> and the patch on mwl8k.c
>>
>> bottom-line, it appears to be working.
>>
>> its contents are a bit more pedantic, and includes data for
>> another wifi card (rtl8180 based) also in the box.
>> It was obtained by this script:
>>
>> #!/bin/bash
>>
>> # dmesg (Linux-3.3.3)
>> # e_n_a (/etc/network/interfaces)
>> # ifconfig output
>> # iwconfig output
>> # iw_phy output
>> # ps_axu (WPA) output
>>
>> devs="wlan0 wlan1"
>> apmac=00:14:d1:e8:65:0a
>>
>> loudly () {
>>    echo "# $@"
>>    fname=`echo $@ | sed -e 's/ /-/g'`
>>    $@ 2> $fname-err | tee $fname
>>    [ $? != 0 ] && echo non-zero exit on $fname: $?
>>    [ -s $fname-err ] || rm $fname-err
>> }
>>
>> ( iw --debug event -f > iw-event-f )&
>> pid_event=$!
>>
>> for N in 0 1 ; do
>>    loudly iw dev wlan$N interface add fish$N type monitor # flags none
>>    loudly iw dev fish$N set channel 8
>>    loudly ifconfig fish$N up
>>    ( tcpdump -i fish$N -s 65000 -p -U -w  fish$N.dump )&
>>    pid_dump_fish$N=$!
>> done
>>
>> loudly iw list
>>
>> #loudly iwspy
>> # gives: Interface doesn't support wireless statistic collection
>>
>> for dev in $devs ; do
>>    loudly ifconfig $dev
>>    loudly iwconfig $dev
>>    loudly iwlist $dev scan
>>    loudly iw dev $dev info
>>    loudly iw dev $dev link
>>    loudly iw dev $dev scan
>>    loudly iw dev $dev survey dump
>> done
>>
>> for phy in $phys ; do
>>    loudly iw phy $phy info
>> done
>>
>> # these are unsupported on wlan0
>> loudly iw dev wlan1 survey dump
>> loudly iw dev wlan1 station dump
>> loudly iw dev wlan1 station get $apmac
>>
>>
>> for N in 0 1 ; do
>>    loudly iw dev fish$N del
>> done
>>
>> kill $pid_dump_fish0 $pid_dump_fish0
>> kill $pid_event
>>
>> dmesg > dmesg
>>
>> grep -vE '^#|key' /etc/network/interfaces > e_n_a
>>
>> exit
>
> Hi Jim,
>
> thanks for your testing and the nice testcase-script!
>
> lautriv you wanna run some more tests with Jim's script?
>
> Jim, how stable/fast/reliable is your WLAN connection?
> Suspend/resume tested?

I havent tested reliability in any way.
in fact, I havent tested any data-xfer per se,
will do an iperf test soon.

That said, bitrate is quite low, I havent looked at why.

jimc@chumly:~/projects/lx/wifi/mwl8k-8361p-logs$ grep -i MBit *
iw-dev-wlan0-link:	tx bitrate: 11.0 MBit/s
iw-dev-wlan1-link:	tx bitrate: 1.0 MBit/s
iw-dev-wlan1-station-dump:	tx bitrate:	1.0 MBit/s
iw-dev-wlan1-station-get-00:14:d1:e8:65:0a:	tx bitrate:	1.0 MBit/s

my laptop is much faster than both cards in the soekris box, to same AP

Connected to 00:14:d1:e8:65:0a (on wlan0)
	SSID: yoduh
	freq: 2447
	RX: 191134302 bytes (2120068 packets)
	TX: 17440426 bytes (120666 packets)
	signal: -45 dBm
	tx bitrate: 54.0 MBit/s

	bss flags:	
	dtim period:	0
	beacon int:	100


If you all have some suggestions on this, Id like to hear them.
And of course, any other testing you'd like too.


>
> Hope this helps to get native Linux support for 8361p.

hear hear.
FWIW,  I pulled this card out of a dead Netgear WNR854T,
which is linux based (and GPL compliant)

> Regards,
> - Sedat -
>
> P.S.: BTW, only to clarify it should be "e_n_i" as short-form for
> /etc/network/interfaces file, but e_n_a sounds more female and nicer
> :-).

I caught that, but it wasnt worth "correcting" ;-)

thanks
Jim

^ permalink raw reply

* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Sedat Dilek @ 2012-04-28 23:26 UTC (permalink / raw)
  To: Jim Cromie
  Cc: Lennert Buytenhek, John W. Linville,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, lautriv
In-Reply-To: <CAJfuBxwz2PmNY37VB7Wx+GdQcOcS6Y96EM_JNtvs-dyv-imL6g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Sun, Apr 29, 2012 at 1:11 AM, Jim Cromie <jim.cromie-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Sat, Apr 28, 2012 at 4:49 PM, Sedat Dilek <sedat.dilek@googlemail.com> wrote:
>> On Sun, Apr 29, 2012 at 12:36 AM, Jim Cromie <jim.cromie-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>>>>> As already pointed out, no Marwell WLAN hardware here. Marvell comics
>>>>> of course :-).
>
> (I had to leave that one..:-)
>
>
>
>>>>
>>>> A new tarball from lautriv with same outputs as before, but now tested
>>>> with Linux-3.4-rc4.
>>>>
>>>> - Sedat -
>>>
>>>
>>> heres my logs, using firmware extracted by Sedat's script,
>>> and the patch on mwl8k.c
>>>
>>> bottom-line, it appears to be working.
>>>
>>> its contents are a bit more pedantic, and includes data for
>>> another wifi card (rtl8180 based) also in the box.
>>> It was obtained by this script:
>>>
>>> #!/bin/bash
>>>
>>> # dmesg (Linux-3.3.3)
>>> # e_n_a (/etc/network/interfaces)
>>> # ifconfig output
>>> # iwconfig output
>>> # iw_phy output
>>> # ps_axu (WPA) output
>>>
>>> devs="wlan0 wlan1"
>>> apmac=00:14:d1:e8:65:0a
>>>
>>> loudly () {
>>>    echo "# $@"
>>>    fname=`echo $@ | sed -e 's/ /-/g'`
>>>    $@ 2> $fname-err | tee $fname
>>>    [ $? != 0 ] && echo non-zero exit on $fname: $?
>>>    [ -s $fname-err ] || rm $fname-err
>>> }
>>>
>>> ( iw --debug event -f > iw-event-f )&
>>> pid_event=$!
>>>
>>> for N in 0 1 ; do
>>>    loudly iw dev wlan$N interface add fish$N type monitor # flags none
>>>    loudly iw dev fish$N set channel 8
>>>    loudly ifconfig fish$N up
>>>    ( tcpdump -i fish$N -s 65000 -p -U -w  fish$N.dump )&
>>>    pid_dump_fish$N=$!
>>> done
>>>
>>> loudly iw list
>>>
>>> #loudly iwspy
>>> # gives: Interface doesn't support wireless statistic collection
>>>
>>> for dev in $devs ; do
>>>    loudly ifconfig $dev
>>>    loudly iwconfig $dev
>>>    loudly iwlist $dev scan
>>>    loudly iw dev $dev info
>>>    loudly iw dev $dev link
>>>    loudly iw dev $dev scan
>>>    loudly iw dev $dev survey dump
>>> done
>>>
>>> for phy in $phys ; do
>>>    loudly iw phy $phy info
>>> done
>>>
>>> # these are unsupported on wlan0
>>> loudly iw dev wlan1 survey dump
>>> loudly iw dev wlan1 station dump
>>> loudly iw dev wlan1 station get $apmac
>>>
>>>
>>> for N in 0 1 ; do
>>>    loudly iw dev fish$N del
>>> done
>>>
>>> kill $pid_dump_fish0 $pid_dump_fish0
>>> kill $pid_event
>>>
>>> dmesg > dmesg
>>>
>>> grep -vE '^#|key' /etc/network/interfaces > e_n_a
>>>
>>> exit
>>
>> Hi Jim,
>>
>> thanks for your testing and the nice testcase-script!
>>
>> lautriv you wanna run some more tests with Jim's script?
>>
>> Jim, how stable/fast/reliable is your WLAN connection?
>> Suspend/resume tested?
>
> I havent tested reliability in any way.
> in fact, I havent tested any data-xfer per se,
> will do an iperf test soon.
>
> That said, bitrate is quite low, I havent looked at why.
>
> jimc@chumly:~/projects/lx/wifi/mwl8k-8361p-logs$ grep -i MBit *
> iw-dev-wlan0-link:      tx bitrate: 11.0 MBit/s
> iw-dev-wlan1-link:      tx bitrate: 1.0 MBit/s
> iw-dev-wlan1-station-dump:      tx bitrate:     1.0 MBit/s
> iw-dev-wlan1-station-get-00:14:d1:e8:65:0a:     tx bitrate:     1.0 MBit/s
>
> my laptop is much faster than both cards in the soekris box, to same AP
>
> Connected to 00:14:d1:e8:65:0a (on wlan0)
>        SSID: yoduh
>        freq: 2447
>        RX: 191134302 bytes (2120068 packets)
>        TX: 17440426 bytes (120666 packets)
>        signal: -45 dBm
>        tx bitrate: 54.0 MBit/s
>
>        bss flags:
>        dtim period:    0
>        beacon int:     100
>
>
> If you all have some suggestions on this, Id like to hear them.
> And of course, any other testing you'd like too.
>
>
>>
>> Hope this helps to get native Linux support for 8361p.
>
> hear hear.
> FWIW,  I pulled this card out of a dead Netgear WNR854T,
> which is linux based (and GPL compliant)
>
>> Regards,
>> - Sedat -
>>
>> P.S.: BTW, only to clarify it should be "e_n_i" as short-form for
>> /etc/network/interfaces file, but e_n_a sounds more female and nicer
>> :-).
>
> I caught that, but it wasnt worth "correcting" ;-)
>
> thanks
> Jim

Unfortunately, [1] says not much about debugging.
Anyway, Lennert has some new informations.
Let's see what the experts will say.

- Sedat -

[1] http://wireless.kernel.org/en/users/Drivers/mwl8k
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/2] AT91: Remove fixed mapping for AT91RM9200 ethernet
From: David Miller @ 2012-04-29  1:49 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: netdev, linux-arm-kernel, linux-kernel, avictor.za, plagnioj
In-Reply-To: <3ad9b1273c4f30d42dcc3bdf6d03470111246f14.1335435759.git.nicolas.ferre@atmel.com>

From: Nicolas Ferre <nicolas.ferre@atmel.com>
Date: Thu, 26 Apr 2012 12:30:42 +0200

> From: Andrew Victor <avictor.za@gmail.com>
> 
> The AT91RM9200 Ethernet controller still has a fixed IO mapping.
> So:
> * Remove the fixed IO mapping and AT91_VA_BASE_EMAC definition.
> * Pass the physical base-address via platform-resources to the driver.
> * Convert at91_ether.c driver to perform an ioremap().
> * Ethernet PHY detection needs to be performed during the driver
> initialization process, it can no longer be done first.
> 
> Signed-off-by: Andrew Victor <linux@maxim.org.za>
> Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH 2/2] net/at91_ether: use gpio_to_irq for phy IRQ line
From: David Miller @ 2012-04-29  1:49 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: netdev, linux-arm-kernel, linux-kernel, avictor.za, plagnioj
In-Reply-To: <cfc991548b5b73c8038f41877da1cd0104945cf5.1335435759.git.nicolas.ferre@atmel.com>

From: Nicolas Ferre <nicolas.ferre@atmel.com>
Date: Thu, 26 Apr 2012 12:30:43 +0200

> Use the gpio_to_irq() function to retrieve the phy IRQ line
> from the GPIO pin specification.
> This fix is needed now that we have moved to irqdomains on AT91.
> 
> Reported-by: Jamie Iles <jamie@jamieiles.com>
> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH] net: smsc75xx: fix MDIO access
From: David Miller @ 2012-04-29  1:55 UTC (permalink / raw)
  To: fillods; +Cc: netdev, steve.glendinning
In-Reply-To: <20120426195517.GU5277@charybde.local>

From: Stephane Fillod <fillods@users.sf.net>
Date: Thu, 26 Apr 2012 21:55:17 +0200

> * make MDIO read/write to work, the MII_ACCESS_BUSY bit was missing
>   to actually trigger the I/O. Rem: the smsc75xx is different from
>   the smsc95xx in that regard.
> * fix PHY interrupt acknowledge, which needs a mdio_write
>   to clear PHY_INT_SRC instead of a usual read like in smsc95xx.
> * fix bug in phy_init loop that was ignoring BMCR reset bit,
>   akin to smsc95xx's d946092000698fd204d82a9d239103c656fb63bf
> * mark link down on startup and let PHY interrupt deal with carrier
>   changes, akin to 07d69d4238418746a7b85c5d05ec17c658a2a390
> * declare the smsc75xx's MII as GMII capable
> 
> Tested on ARM/Omap3 with LAN7500-CEB.
> 
> Signed-off-by: Stephane Fillod <fillods@users.sf.net>
> Cc: Steve Glendinning <steve.glendinning@shawell.net>

Steve, could you give this a quick review?

Thanks.

^ permalink raw reply

* Re: [PATCH net-next 1/2] drivers/net/oki-semi: Remove the definition of PCH_GBE_ETH_ALEN
From: David Miller @ 2012-04-29  2:00 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1335510073-14018-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Fri, 27 Apr 2012 15:01:12 +0800

> From: RongQing.Li <roy.qing.li@gmail.com>
> 
> PCH_GBE_ETH_ALEN is equal to ETH_ALEN, so we can replace it with
> ETH_ALEN.
> 
> If they are not equal, it must be a bug, since this is ethernet,
> and the address has been already stored to mc_addr_list as ETH_ALEN
> bytes when call pch_gbe_mac_mc_addr_list_update.
> 
> Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next 2/2] drivers/net/oki-semi: Donot recompute IP header checksum
From: David Miller @ 2012-04-29  2:00 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1335510073-14018-2-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Fri, 27 Apr 2012 15:01:13 +0800

> From: RongQing.Li <roy.qing.li@gmail.com>
> 
> If I understand correct, NETIF_F_IP_CSUM only means the hardware 
> will compute the TCP/UDP checksum, IP checksum is always computed 
> in software
> 
> So as a workround of hardware unable to compute small packages 
> checksum, do not need to compute IP header checksum.
> 
> Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>

Applied.

^ permalink raw reply

* Re: [net-next 0/6][pull request] Intel Wired LAN Driver Updates
From: David Miller @ 2012-04-29  2:08 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann
In-Reply-To: <1335519413-1892-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Fri, 27 Apr 2012 02:36:47 -0700

> This series of patches contains updates for e1000e, igb and
> ixgbe.
> 
> The following are changes since commit 82981930125abfd39d7c8378a9cfdf5e1be2002b:
>   net: cleanups in sock_setsockopt()
> and are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master

Pulled.

I super appreciate the descriptive commit messages, especialy
wrt. hardware workarounds.

Thanks.

^ permalink raw reply

* Re: [PATCH] usbnet: fix skb traversing races during unlink
From: David Miller @ 2012-04-29  2:18 UTC (permalink / raw)
  To: tom.leiming; +Cc: gregkh, netdev, linux-usb, huajun.li.lee, oneukum, stable
In-Reply-To: <1335522095-9012-1-git-send-email-tom.leiming@gmail.com>

From: Ming Lei <tom.leiming@gmail.com>
Date: Fri, 27 Apr 2012 18:21:35 +0800

> +/*The caller must hold list->lock*/

Please put spaces in your comments, like this:

/* The caller must hold list->lock */

> +
> +		/*speedup unlink by blocking resubmit*/

Same here.
>  
> -		entry = (struct skb_data *) skb->cb;
> +		skb_queue_walk(q, skb) {
> +			entry = (struct skb_data *) skb->cb;
> +			if (entry->state != unlink_start)
> +				break;
> +		}
> +		if (skb == (struct sk_buff *)q)
> +			break;

Please do not expose the internal details of SKB lists
with a test like this.  Eventually this will all be
converted to struct list_head and this kind of test
will cause unnecessary pain for such a conversion.

Instead, code it like this, as you would for a loop
using list_for_each*() or similar:

	skb_queue_walk(q, skb) {
		if (condition)
			goto found;
	}
	/* No matching entry. */
	break;
found:

Thanks.

^ permalink raw reply

* Re: [PATCH v2 1/4 net-next] net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb
From: David Miller @ 2012-04-29  2:31 UTC (permalink / raw)
  To: bcrl; +Cc: jchapman, netdev
In-Reply-To: <20120427182321.GE17021@kvack.org>

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Fri, 27 Apr 2012 14:23:21 -0400

> This is the first step in reworking the IPv6 UDP code to be structured more
> like the IPv4 UDP code.  This patch creates __udpv6_queue_rcv_skb() with
> the equivalent sematics to __udp_queue_rcv_skb(), and wires it up to the
> backlog_rcv method.
> 
> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

Applied.

^ permalink raw reply

* Re: [PATCH v2 2/4 net-next] net/ipv6/udp: UDP encapsulation: move socket locking into udpv6_queue_rcv_skb()
From: David Miller @ 2012-04-29  2:31 UTC (permalink / raw)
  To: bcrl; +Cc: jchapman, netdev
In-Reply-To: <20120427182358.GF17021@kvack.org>

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Fri, 27 Apr 2012 14:23:59 -0400

> In order to make sure that when the encap_rcv() hook is introduced it is
> not called with the socket lock held, move socket locking from callers into
> udpv6_queue_rcv_skb(), matching what happens in IPv4.
> 
> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

Applied.

^ permalink raw reply

* Re: [PATCH v2 3/4 net-next] net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6
From: David Miller @ 2012-04-29  2:31 UTC (permalink / raw)
  To: bcrl; +Cc: jchapman, netdev
In-Reply-To: <20120427182408.GG17021@kvack.org>

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Fri, 27 Apr 2012 14:24:08 -0400

> Now that the sematics of udpv6_queue_rcv_skb() match IPv4's
> udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6.
> 
> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

Applied.

^ permalink raw reply

* Re: [PATCH v2 4/4 net-next] net/l2tp: add support for L2TP over IPv6 UDP
From: David Miller @ 2012-04-29  2:31 UTC (permalink / raw)
  To: bcrl; +Cc: jchapman, netdev
In-Reply-To: <20120427182418.GH17021@kvack.org>

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Fri, 27 Apr 2012 14:24:18 -0400

> Now that encap_rcv() works on IPv6 UDP sockets, wire L2TP up to IPv6.
> Support has been tested with and without hardware offloading.  This
> version fixes the L2TP over localhost issue with incorrect checksums
> being reported.
> 
> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

Applied.

^ permalink raw reply

* Re: [PATCH] 9p: disconnect channel when PCI device is removed
From: Sasha Levin @ 2012-04-29  3:21 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: davem, ericvh, jvrao, rusty, netdev, linux-kernel, davej
In-Reply-To: <87d36rafid.fsf@skywalker.in.ibm.com>

On Sat, Apr 28, 2012 at 8:16 PM, Aneesh Kumar K.V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> Sasha Levin <levinsasha928@gmail.com> writes:
>
>> On Mon, Apr 16, 2012 at 12:59 PM, Aneesh Kumar K.V
>> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>
>>>> On Sun, Apr 15, 2012 at 2:27 PM, Aneesh Kumar K.V
>>>> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>>>
>>>>>> When a virtio_9p pci device is being removed, we should close down any
>>>>>> active channels and free up resources, we're not supposed to BUG() if there's
>>>>>> still an open channel since it's a valid case when removing the PCI device.
>>>>>>
>>>>>> Otherwise, removing the PCI device with an open channel would cause the
>>>>>> following BUG():
>>>>>>
>>>>> ...
>>>>>
>>>>>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>>>>>> index 3d43206..5af18d1 100644
>>>>>> --- a/net/9p/trans_virtio.c
>>>>>> +++ b/net/9p/trans_virtio.c
>>>>>> @@ -615,7 +615,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
>>>>>>  {
>>>>>>       struct virtio_chan *chan = vdev->priv;
>>>>>>
>>>>>> -     BUG_ON(chan->inuse);
>>>>>> +     if (chan->inuse)
>>>>>> +             p9_virtio_close(chan->client);
>>>>>>       vdev->config->del_vqs(vdev);
>>>>>>
>>>>>>       mutex_lock(&virtio_9p_lock);
>>>>>
>>>>> But an umount should have resulted in p9_virtio_close ? How are you
>>>>> removing the device ? Are you removing the device with file system
>>>>> mounted  ?. In that case may be we should return EBUSY ?
>>>>
>>>> I signal the underlying PCI device to remove (echo 1 >
>>>> /sys/devices/pci0000\:00/[...]/remove), we can't really prevent that
>>>> thing so we must clean up ourselves.
>>>
>>> What does that mean for the mounted file system ? What would happen to
>>> the pending fs operations in that case ?
>>
>> I'm guessing that all of them should be canceled.
>
> Pending operation we can cancel, but what about dirty pages in cached
> mode ? Also how does virtio-blk handle this ?

virtio-blk does pretty much the same thing:

sh-4.2# echo 1 > 0000\:00\:05.0/remove
sh-4.2# ls -al
[hang]

>> virtio-pci simulates
>> a PCI device, if the PCI device is unplugged there's not much to do
>> about the filesystem or pending requests.
>
> Ideal thing to do would be to make remove return -EBUSY; let the user
> umount. But looking at the code, I guess there is no easy way to return
> error from remove callback ?

There's no way to return error, the removal callback is just a notifier.

^ permalink raw reply

* Re: "martian source" mixes up source and destination
From: Michael Tokarev @ 2012-04-29  4:52 UTC (permalink / raw)
  To: Mantas M.; +Cc: linux-kernel, netdev
In-Reply-To: <jnh9tl$rth$1@dough.gmane.org>

On 28.04.2012 21:37, Mantas M. wrote:
> After enabling the /proc/sys/net/ipv4/conf/all/log_martians sysctl, I've
> been receiving the following messages on dmesg:

Cc'ing netdev@vger for network-related question.

>> martian source 255.255.255.255 from 192.168.1.115, on dev wlan0
>> ll header: ff:ff:ff:ff:ff:ff:f4:6d:04:85:fc:d7:08:00
>> martian source 192.168.1.255 from 192.168.1.115, on dev wlan0
>> ll header: ff:ff:ff:ff:ff:ff:f4:6d:04:85:fc:d7:08:00
>> martian source 192.168.1.255 from 192.168.1.115, on dev wlan0
>> ll header: ff:ff:ff:ff:ff:ff:f4:6d:04:85:fc:d7:08:00
>> <later>
>> martian source 255.255.255.255 from 192.168.1.183, on dev eth0
>> ll header: ff:ff:ff:ff:ff:ff:48:5d:60:e8:65:8f:08:00

It has been this way since very long time.  The meaning of this
message is:

  there is a martian source sending packets to 255.255.255.255
  from 192.168.1.183 on dev eth0

> However, the packets are *not* martians, but legitimate broadcast
> UDP/IPv4 packets: they have the *destination* as 255.255.255.255 or
> 192.168.1.255, with the source being 192.168.1.115 (wlan0 on this
> device) for the first three packets, and 192.168.1.183 (eth0) on the
> fourth.

log_martians does not check destination address, it checks _source_
address.  At least it did so in the past.  Please show output of
`ip route' from your machine.

> Linux 3.3.4-1-ARCH on Arch Linux

Thanks,

/mjt

^ permalink raw reply

* Re: kernel panic, inconsistent lock, vanilla 3.3.4, 32bit, tcp
From: Eric Dumazet @ 2012-04-29  7:32 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev, Glauber Costa
In-Reply-To: <04009a9cbb27bf97cb7d86612191b751@visp.net.lb>

On Sun, 2012-04-29 at 00:45 +0300, Denys Fedoryshchenko wrote:
> Hi
> 
> Just got repeatable kernel panic on loaded proxy server
> 
> Here is the log:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903868]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903880] 
> =================================
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903890] [ INFO: inconsistent 
> lock state ]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903903] 3.3.4-build-0061 #8 
> Not tainted
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904133] 
> ---------------------------------
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904344] inconsistent 
> {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] squid/1603 
> [HC0[0]:SC0[0]:HE1:SE1] takes:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  (key#3){+.?...}, at: 
> [<c0232cc4>] __percpu_counter_sum+0xd/0x58
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] {IN-SOFTIRQ-W} state 
> was registered at:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0158b84>] 
> __lock_acquire+0x284/0xc26
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c01598e8>] 
> lock_acquire+0x71/0x85
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0349765>] 
> _raw_spin_lock+0x33/0x40
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0232c93>] 
> __percpu_counter_add+0x58/0x7c
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02cfde1>] 
> sk_clone_lock+0x1e5/0x200
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0303ee4>] 
> inet_csk_clone_lock+0xe/0x78
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0315778>] 
> tcp_create_openreq_child+0x1b/0x404
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c031339c>] 
> tcp_v4_syn_recv_sock+0x32/0x1c1
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c031615a>] 
> tcp_check_req+0x1fd/0x2d7
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0313f77>] 
> tcp_v4_do_rcv+0xab/0x194
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c03153bb>] 
> tcp_v4_rcv+0x3b3/0x5cc
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc0c4>] 
> ip_local_deliver_finish+0x13a/0x1e9
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc539>] 
> NF_HOOK.clone.11+0x46/0x4d
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc652>] 
> ip_local_deliver+0x41/0x45
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc4d1>] 
> ip_rcv_finish+0x31a/0x33c
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc539>] 
> NF_HOOK.clone.11+0x46/0x4d
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc857>] 
> ip_rcv+0x201/0x23e
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02daa3a>] 
> __netif_receive_skb+0x319/0x368
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02dac07>] 
> netif_receive_skb+0x4e/0x7d
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02dacf6>] 
> napi_skb_finish+0x1e/0x34
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02db122>] 
> napi_gro_receive+0x20/0x24
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<f85d1743>] 
> e1000_receive_skb+0x3f/0x45 [e1000e]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<f85d3464>] 
> e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<f85d3926>] 
> e1000_clean+0x62/0x1f4 [e1000e]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02db228>] 
> net_rx_action+0x90/0x160
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c012a445>] 
> __do_softirq+0x7b/0x118
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] irq event stamp: 
> 156915469
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] hardirqs last  enabled 
> at (156915469): [<c019b4f4>] __slab_alloc.clone.58.clone.63+0xc4/0x2de
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] hardirqs last disabled 
> at (156915468): [<c019b452>] __slab_alloc.clone.58.clone.63+0x22/0x2de
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] softirqs last  enabled 
> at (156915466): [<c02ce677>] lock_sock_nested+0x64/0x6c
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] softirqs last disabled 
> at (156915464): [<c0349914>] _raw_spin_lock_bh+0xe/0x45
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] other info that might 
> help us debug this:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  Possible unsafe 
> locking scenario:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]        CPU0
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]        ----
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   lock(key#3);
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   <Interrupt>
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]     lock(key#3);
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  *** DEADLOCK ***
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] 1 lock held by 
> squid/1603:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  #0:  
> (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>] lock_sock+0xa/0xc
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] stack backtrace:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] Pid: 1603, comm: squid 
> Not tainted 3.3.4-build-0061 #8
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] Call Trace:
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0347b73>] ? 
> printk+0x18/0x1d
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015873a>] 
> valid_state+0x1f6/0x201
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0158816>] 
> mark_lock+0xd1/0x1bb
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015876b>] ? 
> mark_lock+0x26/0x1bb
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015805d>] ? 
> check_usage_forwards+0x77/0x77
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0158bf8>] 
> __lock_acquire+0x2f8/0xc26
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0159b8e>] ? 
> mark_held_locks+0x5d/0x7b
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0159cf6>] ? 
> trace_hardirqs_on+0xb/0xd
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0158dd4>] ? 
> __lock_acquire+0x4d4/0xc26
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01598e8>] 
> lock_acquire+0x71/0x85
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0232cc4>] ? 
> __percpu_counter_sum+0xd/0x58
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0349765>] 
> _raw_spin_lock+0x33/0x40
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0232cc4>] ? 
> __percpu_counter_sum+0xd/0x58
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0232cc4>] 
> __percpu_counter_sum+0xd/0x58
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c02cebc4>] 
> __sk_mem_schedule+0xdd/0x1c7
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c02d178d>] ? 
> __alloc_skb+0x76/0x100
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0305e8e>] 
> sk_wmem_schedule+0x21/0x2d
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0306370>] 
> sk_stream_alloc_skb+0x42/0xaa
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0306567>] 
> tcp_sendmsg+0x18f/0x68b
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c031f3dc>] ? 
> ip_fast_csum+0x30/0x30
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0320193>] 
> inet_sendmsg+0x53/0x5a
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c02cb633>] 
> sock_aio_write+0xd2/0xda
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015876b>] ? 
> mark_lock+0x26/0x1bb
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a1017>] 
> do_sync_write+0x9f/0xd9
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a2111>] ? 
> file_free_rcu+0x2f/0x2f
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a17a1>] 
> vfs_write+0x8f/0xab
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a284d>] ? 
> fget_light+0x75/0x7c
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a1900>] 
> sys_write+0x3d/0x5e
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0349ec9>] 
> syscall_call+0x7/0xb
> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0340000>] ? 
> rp_sidt+0x41/0x83
> 
> 
> ---
> Denys Fedoryshchenko

Seems to be related to Glauber Costa work

sk_memory_allocated_add() seems to assume __percpu_counter_add() is irq
safe (truth is : it is not safe)

^ permalink raw reply

* inconsistent lock/deadlock crash, vanilla 3.3.4, 32bit, tcp
From: Denys Fedoryshchenko @ 2012-04-29  7:41 UTC (permalink / raw)
  To: netdev, davem, jmorris, yoshfuji, kaber, linux-kernel

I apologize for late night post, and a lot of trash left in report.
I will cleanup it up now, and send with CC to maintainers.

Server job are proxy, with very high rate of new connections.
Deadlock at peaktime can be easily reproduced in 10-15 minutes.

Deadlock occured on almost all 3.3-stable versions (tried 3.3.3 - 
3.3.4). It is not easy to try older kernel,
but if required i can try.
Usually also, because SYN rate very high, i can see:
[   51.612987] TCP: Possible SYN flooding on port 8080. Sending 
cookies.  Check SNMP counters.

  [  762.903868]
  [  762.903880] =================================
  [  762.903890] [ INFO: inconsistent lock state ]
  [  762.903903] 3.3.4-build-0061 #8 Not tainted
  [  762.904133] ---------------------------------
  [  762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
  [  762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
  [  762.904542]  (key#3){+.?...}, at: [<c0232cc4>] 
__percpu_counter_sum+0xd/0x58
  [  762.904542] {IN-SOFTIRQ-W} state was registered at:
  [  762.904542]   [<c0158b84>] __lock_acquire+0x284/0xc26
  [  762.904542]   [<c01598e8>] lock_acquire+0x71/0x85
  [  762.904542]   [<c0349765>] _raw_spin_lock+0x33/0x40
  [  762.904542]   [<c0232c93>] __percpu_counter_add+0x58/0x7c
  [  762.904542]   [<c02cfde1>] sk_clone_lock+0x1e5/0x200
  [  762.904542]   [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
  [  762.904542]   [<c0315778>] tcp_create_openreq_child+0x1b/0x404
  [  762.904542]   [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
  [  762.904542]   [<c031615a>] tcp_check_req+0x1fd/0x2d7
  [  762.904542]   [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
  [  762.904542]   [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
  [  762.904542]   [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
  [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
  [  762.904542]   [<c02fc652>] ip_local_deliver+0x41/0x45
  [  762.904542]   [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
  [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
  [  762.904542]   [<c02fc857>] ip_rcv+0x201/0x23e
  [  762.904542]   [<c02daa3a>] __netif_receive_skb+0x319/0x368
  [  762.904542]   [<c02dac07>] netif_receive_skb+0x4e/0x7d
  [  762.904542]   [<c02dacf6>] napi_skb_finish+0x1e/0x34
  [  762.904542]   [<c02db122>] napi_gro_receive+0x20/0x24
  [  762.904542]   [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
  [  762.904542]   [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
  [  762.904542]   [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
  [  762.904542]   [<c02db228>] net_rx_action+0x90/0x160
  [  762.904542]   [<c012a445>] __do_softirq+0x7b/0x118
  [  762.904542] irq event stamp: 156915469
  [  762.904542] hardirqs last  enabled at (156915469): [<c019b4f4>] 
__slab_alloc.clone.58.clone.63+0xc4/0x2de
  [  762.904542] hardirqs last disabled at (156915468): [<c019b452>] 
__slab_alloc.clone.58.clone.63+0x22/0x2de
  [  762.904542] softirqs last  enabled at (156915466): [<c02ce677>] 
lock_sock_nested+0x64/0x6c
  [  762.904542] softirqs last disabled at (156915464): [<c0349914>] 
_raw_spin_lock_bh+0xe/0x45
  [  762.904542]
  [  762.904542] other info that might help us debug this:
  [  762.904542]  Possible unsafe locking scenario:
  [  762.904542]
  [  762.904542]        CPU0
  [  762.904542]        ----
  [  762.904542]   lock(key#3);
  [  762.904542]   <Interrupt>
  [  762.904542]     lock(key#3);
  [  762.904542]
  [  762.904542]  *** DEADLOCK ***
  [  762.904542]
  [  762.904542] 1 lock held by squid/1603:
  [  762.904542]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>] 
lock_sock+0xa/0xc
  [  762.904542]
  [  762.904542] stack backtrace:
  [  762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
  [  762.904542] Call Trace:
  [  762.904542]  [<c0347b73>] ? printk+0x18/0x1d
  [  762.904542]  [<c015873a>] valid_state+0x1f6/0x201
  [  762.904542]  [<c0158816>] mark_lock+0xd1/0x1bb
  [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
  [  762.904542]  [<c015805d>] ? check_usage_forwards+0x77/0x77
  [  762.904542]  [<c0158bf8>] __lock_acquire+0x2f8/0xc26
  [  762.904542]  [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
  [  762.904542]  [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
  [  762.904542]  [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
  [  762.904542]  [<c01598e8>] lock_acquire+0x71/0x85
  [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
  [  762.904542]  [<c0349765>] _raw_spin_lock+0x33/0x40
  [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
  [  762.904542]  [<c0232cc4>] __percpu_counter_sum+0xd/0x58
  [  762.904542]  [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
  [  762.904542]  [<c02d178d>] ? __alloc_skb+0x76/0x100
  [  762.904542]  [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
  [  762.904542]  [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
  [  762.904542]  [<c0306567>] tcp_sendmsg+0x18f/0x68b
  [  762.904542]  [<c031f3dc>] ? ip_fast_csum+0x30/0x30
  [  762.904542]  [<c0320193>] inet_sendmsg+0x53/0x5a
  [  762.904542]  [<c02cb633>] sock_aio_write+0xd2/0xda
  [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
  [  762.904542]  [<c01a1017>] do_sync_write+0x9f/0xd9
  [  762.904542]  [<c01a2111>] ? file_free_rcu+0x2f/0x2f
  [  762.904542]  [<c01a17a1>] vfs_write+0x8f/0xab
  [  762.904542]  [<c01a284d>] ? fget_light+0x75/0x7c
  [  762.904542]  [<c01a1900>] sys_write+0x3d/0x5e
  [  762.904542]  [<c0349ec9>] syscall_call+0x7/0xb
  [  762.904542]  [<c0340000>] ? rp_sidt+0x41/0x83


---
Denys Fedoryshchenko

---
Network engineer
Denys Fedoryshchenko

Dora Highway - Center Cebaco - 2nd Floor
Beirut, Lebanon
Tel:	+961 1 247373
E-Mail: denys@visp.net.lb

^ permalink raw reply

* Re: kernel panic, inconsistent lock, vanilla 3.3.4, 32bit, tcp
From: Denys Fedoryshchenko @ 2012-04-29  7:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Glauber Costa
In-Reply-To: <1335684762.2900.69.camel@edumazet-glaptop>

On 2012-04-29 10:32, Eric Dumazet wrote:
> On Sun, 2012-04-29 at 00:45 +0300, Denys Fedoryshchenko wrote:
>> Hi
>>
>> Just got repeatable kernel panic on loaded proxy server
>>
>> Here is the log:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903868]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903880]
>> =================================
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903890] [ INFO: 
>> inconsistent
>> lock state ]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.903903] 3.3.4-build-0061 #8
>> Not tainted
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904133]
>> ---------------------------------
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904344] inconsistent
>> {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] squid/1603
>> [HC0[0]:SC0[0]:HE1:SE1] takes:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  (key#3){+.?...}, 
>> at:
>> [<c0232cc4>] __percpu_counter_sum+0xd/0x58
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] {IN-SOFTIRQ-W} 
>> state
>> was registered at:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0158b84>]
>> __lock_acquire+0x284/0xc26
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c01598e8>]
>> lock_acquire+0x71/0x85
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0349765>]
>> _raw_spin_lock+0x33/0x40
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0232c93>]
>> __percpu_counter_add+0x58/0x7c
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02cfde1>]
>> sk_clone_lock+0x1e5/0x200
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0303ee4>]
>> inet_csk_clone_lock+0xe/0x78
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0315778>]
>> tcp_create_openreq_child+0x1b/0x404
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c031339c>]
>> tcp_v4_syn_recv_sock+0x32/0x1c1
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c031615a>]
>> tcp_check_req+0x1fd/0x2d7
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c0313f77>]
>> tcp_v4_do_rcv+0xab/0x194
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c03153bb>]
>> tcp_v4_rcv+0x3b3/0x5cc
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc0c4>]
>> ip_local_deliver_finish+0x13a/0x1e9
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc539>]
>> NF_HOOK.clone.11+0x46/0x4d
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc652>]
>> ip_local_deliver+0x41/0x45
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc4d1>]
>> ip_rcv_finish+0x31a/0x33c
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc539>]
>> NF_HOOK.clone.11+0x46/0x4d
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02fc857>]
>> ip_rcv+0x201/0x23e
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02daa3a>]
>> __netif_receive_skb+0x319/0x368
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02dac07>]
>> netif_receive_skb+0x4e/0x7d
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02dacf6>]
>> napi_skb_finish+0x1e/0x34
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02db122>]
>> napi_gro_receive+0x20/0x24
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<f85d1743>]
>> e1000_receive_skb+0x3f/0x45 [e1000e]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<f85d3464>]
>> e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<f85d3926>]
>> e1000_clean+0x62/0x1f4 [e1000e]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c02db228>]
>> net_rx_action+0x90/0x160
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   [<c012a445>]
>> __do_softirq+0x7b/0x118
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] irq event stamp:
>> 156915469
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] hardirqs last  
>> enabled
>> at (156915469): [<c019b4f4>] 
>> __slab_alloc.clone.58.clone.63+0xc4/0x2de
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] hardirqs last 
>> disabled
>> at (156915468): [<c019b452>] 
>> __slab_alloc.clone.58.clone.63+0x22/0x2de
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] softirqs last  
>> enabled
>> at (156915466): [<c02ce677>] lock_sock_nested+0x64/0x6c
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] softirqs last 
>> disabled
>> at (156915464): [<c0349914>] _raw_spin_lock_bh+0xe/0x45
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] other info that 
>> might
>> help us debug this:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  Possible unsafe
>> locking scenario:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]        CPU0
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]        ----
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   lock(key#3);
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]   <Interrupt>
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]     lock(key#3);
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  *** DEADLOCK ***
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] 1 lock held by
>> squid/1603:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  #0:
>> (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>] lock_sock+0xa/0xc
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] stack backtrace:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] Pid: 1603, comm: 
>> squid
>> Not tainted 3.3.4-build-0061 #8
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542] Call Trace:
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0347b73>] ?
>> printk+0x18/0x1d
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015873a>]
>> valid_state+0x1f6/0x201
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0158816>]
>> mark_lock+0xd1/0x1bb
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015876b>] ?
>> mark_lock+0x26/0x1bb
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015805d>] ?
>> check_usage_forwards+0x77/0x77
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0158bf8>]
>> __lock_acquire+0x2f8/0xc26
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0159b8e>] ?
>> mark_held_locks+0x5d/0x7b
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0159cf6>] ?
>> trace_hardirqs_on+0xb/0xd
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0158dd4>] ?
>> __lock_acquire+0x4d4/0xc26
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01598e8>]
>> lock_acquire+0x71/0x85
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0232cc4>] ?
>> __percpu_counter_sum+0xd/0x58
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0349765>]
>> _raw_spin_lock+0x33/0x40
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0232cc4>] ?
>> __percpu_counter_sum+0xd/0x58
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0232cc4>]
>> __percpu_counter_sum+0xd/0x58
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c02cebc4>]
>> __sk_mem_schedule+0xdd/0x1c7
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c02d178d>] ?
>> __alloc_skb+0x76/0x100
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0305e8e>]
>> sk_wmem_schedule+0x21/0x2d
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0306370>]
>> sk_stream_alloc_skb+0x42/0xaa
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0306567>]
>> tcp_sendmsg+0x18f/0x68b
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c031f3dc>] ?
>> ip_fast_csum+0x30/0x30
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0320193>]
>> inet_sendmsg+0x53/0x5a
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c02cb633>]
>> sock_aio_write+0xd2/0xda
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c015876b>] ?
>> mark_lock+0x26/0x1bb
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a1017>]
>> do_sync_write+0x9f/0xd9
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a2111>] ?
>> file_free_rcu+0x2f/0x2f
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a17a1>]
>> vfs_write+0x8f/0xab
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a284d>] ?
>> fget_light+0x75/0x7c
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c01a1900>]
>> sys_write+0x3d/0x5e
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0349ec9>]
>> syscall_call+0x7/0xb
>> Apr 29 00:38:17 10.0.15.2 kernel: [  762.904542]  [<c0340000>] ?
>> rp_sidt+0x41/0x83
>>
>>
>> ---
>> Denys Fedoryshchenko
>
> Seems to be related to Glauber Costa work
>
> sk_memory_allocated_add() seems to assume __percpu_counter_add() is 
> irq
> safe (truth is : it is not safe)

Thanks Eric, and sorry for mangled post (long lines). I was very 
sleepy, and resent it properly now.
I will try to check his patches.

^ permalink raw reply

* Re: kernel panic, inconsistent lock, vanilla 3.3.4, 32bit, tcp
From: Denys Fedoryshchenko @ 2012-04-29  8:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Glauber Costa
In-Reply-To: <1335684762.2900.69.camel@edumazet-glaptop>

On 2012-04-29 10:32, Eric Dumazet wrote:
> On Sun, 2012-04-29 at 00:45 +0300, Denys Fedoryshchenko wrote:
>> Hi
>>
>> Just got repeatable kernel panic on loaded proxy server
>>
>> Here is the log:

>>
>> ---
>> Denys Fedoryshchenko
>
> Seems to be related to Glauber Costa work
>
> sk_memory_allocated_add() seems to assume __percpu_counter_add() is 
> irq
> safe (truth is : it is not safe)

Not sure if it is related, but at my case, CONFIG_CGROUPS is not set

^ permalink raw reply

* Re: kernel panic, inconsistent lock, vanilla 3.3.4, 32bit, tcp
From: Eric Dumazet @ 2012-04-29  8:17 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev, Glauber Costa
In-Reply-To: <6e1a75b751372f0ca5a58c2d0122a3cc@visp.net.lb>

On Sun, 2012-04-29 at 11:00 +0300, Denys Fedoryshchenko wrote:
> On 2012-04-29 10:32, Eric Dumazet wrote:
> > On Sun, 2012-04-29 at 00:45 +0300, Denys Fedoryshchenko wrote:
> >> Hi
> >>
> >> Just got repeatable kernel panic on loaded proxy server
> >>
> >> Here is the log:
> 
> >>
> >> ---
> >> Denys Fedoryshchenko
> >
> > Seems to be related to Glauber Costa work
> >
> > sk_memory_allocated_add() seems to assume __percpu_counter_add() is 
> > irq
> > safe (truth is : it is not safe)
> 
> Not sure if it is related, but at my case, CONFIG_CGROUPS is not set
> 


Hmm, so its an old bug...

I'll take a look, thanks a lot !

(I am going to reply on your new thread since its better formatted)

^ permalink raw reply

* Re: inconsistent lock/deadlock crash, vanilla 3.3.4, 32bit, tcp
From: Eric Dumazet @ 2012-04-29  8:27 UTC (permalink / raw)
  To: Denys Fedoryshchenko
  Cc: netdev, davem, jmorris, yoshfuji, kaber, linux-kernel,
	Glauber Costa
In-Reply-To: <13fc59c8e3be34ddb3c7bb5c17b1bdf8@visp.net.lb>

On Sun, 2012-04-29 at 10:41 +0300, Denys Fedoryshchenko wrote:
> I apologize for late night post, and a lot of trash left in report.
> I will cleanup it up now, and send with CC to maintainers.
> 
> Server job are proxy, with very high rate of new connections.
> Deadlock at peaktime can be easily reproduced in 10-15 minutes.
> 
> Deadlock occured on almost all 3.3-stable versions (tried 3.3.3 - 
> 3.3.4). It is not easy to try older kernel,
> but if required i can try.
> Usually also, because SYN rate very high, i can see:
> [   51.612987] TCP: Possible SYN flooding on port 8080. Sending 
> cookies.  Check SNMP counters.
> 
>   [  762.903868]
>   [  762.903880] =================================
>   [  762.903890] [ INFO: inconsistent lock state ]
>   [  762.903903] 3.3.4-build-0061 #8 Not tainted
>   [  762.904133] ---------------------------------
>   [  762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>   [  762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
>   [  762.904542]  (key#3){+.?...}, at: [<c0232cc4>] 
> __percpu_counter_sum+0xd/0x58
>   [  762.904542] {IN-SOFTIRQ-W} state was registered at:
>   [  762.904542]   [<c0158b84>] __lock_acquire+0x284/0xc26
>   [  762.904542]   [<c01598e8>] lock_acquire+0x71/0x85
>   [  762.904542]   [<c0349765>] _raw_spin_lock+0x33/0x40
>   [  762.904542]   [<c0232c93>] __percpu_counter_add+0x58/0x7c
>   [  762.904542]   [<c02cfde1>] sk_clone_lock+0x1e5/0x200
>   [  762.904542]   [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
>   [  762.904542]   [<c0315778>] tcp_create_openreq_child+0x1b/0x404
>   [  762.904542]   [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
>   [  762.904542]   [<c031615a>] tcp_check_req+0x1fd/0x2d7
>   [  762.904542]   [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
>   [  762.904542]   [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
>   [  762.904542]   [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
>   [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
>   [  762.904542]   [<c02fc652>] ip_local_deliver+0x41/0x45
>   [  762.904542]   [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
>   [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
>   [  762.904542]   [<c02fc857>] ip_rcv+0x201/0x23e
>   [  762.904542]   [<c02daa3a>] __netif_receive_skb+0x319/0x368
>   [  762.904542]   [<c02dac07>] netif_receive_skb+0x4e/0x7d
>   [  762.904542]   [<c02dacf6>] napi_skb_finish+0x1e/0x34
>   [  762.904542]   [<c02db122>] napi_gro_receive+0x20/0x24
>   [  762.904542]   [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
>   [  762.904542]   [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
>   [  762.904542]   [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
>   [  762.904542]   [<c02db228>] net_rx_action+0x90/0x160
>   [  762.904542]   [<c012a445>] __do_softirq+0x7b/0x118
>   [  762.904542] irq event stamp: 156915469
>   [  762.904542] hardirqs last  enabled at (156915469): [<c019b4f4>] 
> __slab_alloc.clone.58.clone.63+0xc4/0x2de
>   [  762.904542] hardirqs last disabled at (156915468): [<c019b452>] 
> __slab_alloc.clone.58.clone.63+0x22/0x2de
>   [  762.904542] softirqs last  enabled at (156915466): [<c02ce677>] 
> lock_sock_nested+0x64/0x6c
>   [  762.904542] softirqs last disabled at (156915464): [<c0349914>] 
> _raw_spin_lock_bh+0xe/0x45
>   [  762.904542]
>   [  762.904542] other info that might help us debug this:
>   [  762.904542]  Possible unsafe locking scenario:
>   [  762.904542]
>   [  762.904542]        CPU0
>   [  762.904542]        ----
>   [  762.904542]   lock(key#3);
>   [  762.904542]   <Interrupt>
>   [  762.904542]     lock(key#3);
>   [  762.904542]
>   [  762.904542]  *** DEADLOCK ***
>   [  762.904542]
>   [  762.904542] 1 lock held by squid/1603:
>   [  762.904542]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>] 
> lock_sock+0xa/0xc
>   [  762.904542]
>   [  762.904542] stack backtrace:
>   [  762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
>   [  762.904542] Call Trace:
>   [  762.904542]  [<c0347b73>] ? printk+0x18/0x1d
>   [  762.904542]  [<c015873a>] valid_state+0x1f6/0x201
>   [  762.904542]  [<c0158816>] mark_lock+0xd1/0x1bb
>   [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
>   [  762.904542]  [<c015805d>] ? check_usage_forwards+0x77/0x77
>   [  762.904542]  [<c0158bf8>] __lock_acquire+0x2f8/0xc26
>   [  762.904542]  [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
>   [  762.904542]  [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
>   [  762.904542]  [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
>   [  762.904542]  [<c01598e8>] lock_acquire+0x71/0x85
>   [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
>   [  762.904542]  [<c0349765>] _raw_spin_lock+0x33/0x40
>   [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
>   [  762.904542]  [<c0232cc4>] __percpu_counter_sum+0xd/0x58
>   [  762.904542]  [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
>   [  762.904542]  [<c02d178d>] ? __alloc_skb+0x76/0x100
>   [  762.904542]  [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
>   [  762.904542]  [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
>   [  762.904542]  [<c0306567>] tcp_sendmsg+0x18f/0x68b
>   [  762.904542]  [<c031f3dc>] ? ip_fast_csum+0x30/0x30
>   [  762.904542]  [<c0320193>] inet_sendmsg+0x53/0x5a
>   [  762.904542]  [<c02cb633>] sock_aio_write+0xd2/0xda
>   [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
>   [  762.904542]  [<c01a1017>] do_sync_write+0x9f/0xd9
>   [  762.904542]  [<c01a2111>] ? file_free_rcu+0x2f/0x2f
>   [  762.904542]  [<c01a17a1>] vfs_write+0x8f/0xab
>   [  762.904542]  [<c01a284d>] ? fget_light+0x75/0x7c
>   [  762.904542]  [<c01a1900>] sys_write+0x3d/0x5e
>   [  762.904542]  [<c0349ec9>] syscall_call+0x7/0xb
>   [  762.904542]  [<c0340000>] ? rp_sidt+0x41/0x83
> 


OK, so when we have memory pressure we can call
percpu_counter_read_positive() with SOFTIRQ enabled, and lockdep
complains...

This bug was probably added in 2008, in commit 1748376b6626a
(net: Use a percpu_counter for sockets_allocated)

I'll have to backport the following patch, can you test it please ?

Thanks !

 net/core/sock.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/sock.c b/net/core/sock.c
index b2e14c0..08fc929 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1818,7 +1818,9 @@ int __sk_mem_schedule(struct sock *sk, int size, int kind)
 
 		if (!sk_under_memory_pressure(sk))
 			return 1;
+		local_bh_disable();
 		alloc = sk_sockets_allocated_read_positive(sk);
+		local_bh_enable();
 		if (sk_prot_mem_limits(sk, 2) > alloc *
 		    sk_mem_pages(sk->sk_wmem_queued +
 				 atomic_read(&sk->sk_rmem_alloc) +

^ permalink raw reply related

* Re: inconsistent lock/deadlock crash, vanilla 3.3.4, 32bit, tcp
From: Eric Dumazet @ 2012-04-29  8:40 UTC (permalink / raw)
  To: Denys Fedoryshchenko
  Cc: netdev, davem, jmorris, yoshfuji, kaber, linux-kernel,
	Glauber Costa
In-Reply-To: <1335688036.2900.80.camel@edumazet-glaptop>

On Sun, 2012-04-29 at 10:27 +0200, Eric Dumazet wrote:
> On Sun, 2012-04-29 at 10:41 +0300, Denys Fedoryshchenko wrote:
> > I apologize for late night post, and a lot of trash left in report.
> > I will cleanup it up now, and send with CC to maintainers.
> > 
> > Server job are proxy, with very high rate of new connections.
> > Deadlock at peaktime can be easily reproduced in 10-15 minutes.
> > 
> > Deadlock occured on almost all 3.3-stable versions (tried 3.3.3 - 
> > 3.3.4). It is not easy to try older kernel,
> > but if required i can try.
> > Usually also, because SYN rate very high, i can see:
> > [   51.612987] TCP: Possible SYN flooding on port 8080. Sending 
> > cookies.  Check SNMP counters.
> > 
> >   [  762.903868]
> >   [  762.903880] =================================
> >   [  762.903890] [ INFO: inconsistent lock state ]
> >   [  762.903903] 3.3.4-build-0061 #8 Not tainted
> >   [  762.904133] ---------------------------------
> >   [  762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> >   [  762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
> >   [  762.904542]  (key#3){+.?...}, at: [<c0232cc4>] 
> > __percpu_counter_sum+0xd/0x58
> >   [  762.904542] {IN-SOFTIRQ-W} state was registered at:
> >   [  762.904542]   [<c0158b84>] __lock_acquire+0x284/0xc26
> >   [  762.904542]   [<c01598e8>] lock_acquire+0x71/0x85
> >   [  762.904542]   [<c0349765>] _raw_spin_lock+0x33/0x40
> >   [  762.904542]   [<c0232c93>] __percpu_counter_add+0x58/0x7c
> >   [  762.904542]   [<c02cfde1>] sk_clone_lock+0x1e5/0x200
> >   [  762.904542]   [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
> >   [  762.904542]   [<c0315778>] tcp_create_openreq_child+0x1b/0x404
> >   [  762.904542]   [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
> >   [  762.904542]   [<c031615a>] tcp_check_req+0x1fd/0x2d7
> >   [  762.904542]   [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
> >   [  762.904542]   [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
> >   [  762.904542]   [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
> >   [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
> >   [  762.904542]   [<c02fc652>] ip_local_deliver+0x41/0x45
> >   [  762.904542]   [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
> >   [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
> >   [  762.904542]   [<c02fc857>] ip_rcv+0x201/0x23e
> >   [  762.904542]   [<c02daa3a>] __netif_receive_skb+0x319/0x368
> >   [  762.904542]   [<c02dac07>] netif_receive_skb+0x4e/0x7d
> >   [  762.904542]   [<c02dacf6>] napi_skb_finish+0x1e/0x34
> >   [  762.904542]   [<c02db122>] napi_gro_receive+0x20/0x24
> >   [  762.904542]   [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
> >   [  762.904542]   [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
> >   [  762.904542]   [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
> >   [  762.904542]   [<c02db228>] net_rx_action+0x90/0x160
> >   [  762.904542]   [<c012a445>] __do_softirq+0x7b/0x118
> >   [  762.904542] irq event stamp: 156915469
> >   [  762.904542] hardirqs last  enabled at (156915469): [<c019b4f4>] 
> > __slab_alloc.clone.58.clone.63+0xc4/0x2de
> >   [  762.904542] hardirqs last disabled at (156915468): [<c019b452>] 
> > __slab_alloc.clone.58.clone.63+0x22/0x2de
> >   [  762.904542] softirqs last  enabled at (156915466): [<c02ce677>] 
> > lock_sock_nested+0x64/0x6c
> >   [  762.904542] softirqs last disabled at (156915464): [<c0349914>] 
> > _raw_spin_lock_bh+0xe/0x45
> >   [  762.904542]
> >   [  762.904542] other info that might help us debug this:
> >   [  762.904542]  Possible unsafe locking scenario:
> >   [  762.904542]
> >   [  762.904542]        CPU0
> >   [  762.904542]        ----
> >   [  762.904542]   lock(key#3);
> >   [  762.904542]   <Interrupt>
> >   [  762.904542]     lock(key#3);
> >   [  762.904542]
> >   [  762.904542]  *** DEADLOCK ***
> >   [  762.904542]
> >   [  762.904542] 1 lock held by squid/1603:
> >   [  762.904542]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>] 
> > lock_sock+0xa/0xc
> >   [  762.904542]
> >   [  762.904542] stack backtrace:
> >   [  762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
> >   [  762.904542] Call Trace:
> >   [  762.904542]  [<c0347b73>] ? printk+0x18/0x1d
> >   [  762.904542]  [<c015873a>] valid_state+0x1f6/0x201
> >   [  762.904542]  [<c0158816>] mark_lock+0xd1/0x1bb
> >   [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
> >   [  762.904542]  [<c015805d>] ? check_usage_forwards+0x77/0x77
> >   [  762.904542]  [<c0158bf8>] __lock_acquire+0x2f8/0xc26
> >   [  762.904542]  [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
> >   [  762.904542]  [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
> >   [  762.904542]  [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
> >   [  762.904542]  [<c01598e8>] lock_acquire+0x71/0x85
> >   [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
> >   [  762.904542]  [<c0349765>] _raw_spin_lock+0x33/0x40
> >   [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
> >   [  762.904542]  [<c0232cc4>] __percpu_counter_sum+0xd/0x58
> >   [  762.904542]  [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
> >   [  762.904542]  [<c02d178d>] ? __alloc_skb+0x76/0x100
> >   [  762.904542]  [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
> >   [  762.904542]  [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
> >   [  762.904542]  [<c0306567>] tcp_sendmsg+0x18f/0x68b
> >   [  762.904542]  [<c031f3dc>] ? ip_fast_csum+0x30/0x30
> >   [  762.904542]  [<c0320193>] inet_sendmsg+0x53/0x5a
> >   [  762.904542]  [<c02cb633>] sock_aio_write+0xd2/0xda
> >   [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
> >   [  762.904542]  [<c01a1017>] do_sync_write+0x9f/0xd9
> >   [  762.904542]  [<c01a2111>] ? file_free_rcu+0x2f/0x2f
> >   [  762.904542]  [<c01a17a1>] vfs_write+0x8f/0xab
> >   [  762.904542]  [<c01a284d>] ? fget_light+0x75/0x7c
> >   [  762.904542]  [<c01a1900>] sys_write+0x3d/0x5e
> >   [  762.904542]  [<c0349ec9>] syscall_call+0x7/0xb
> >   [  762.904542]  [<c0340000>] ? rp_sidt+0x41/0x83
> > 
> 
> 
> OK, so when we have memory pressure we can call
> percpu_counter_read_positive() with SOFTIRQ enabled, and lockdep
> complains...
> 
> This bug was probably added in 2008, in commit 1748376b6626a
> (net: Use a percpu_counter for sockets_allocated)

Hmm, no this patch was fine.

Bug was in fact added by Glauber Costa in commit 180d8cd942ce336b2c869
(foundations of per-cgroup memory pressure controlling.)

Because he replaced the safe percpu_counter_read_positive() call to
unsafe percpu_counter_sum_positive() in
sk_sockets_allocated_read_positive()

But anyway the patch I sent should fix the problem.

> 
> I'll have to backport the following patch, can you test it please ?
> 
> Thanks !
> 
>  net/core/sock.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index b2e14c0..08fc929 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1818,7 +1818,9 @@ int __sk_mem_schedule(struct sock *sk, int size, int kind)
>  
>  		if (!sk_under_memory_pressure(sk))
>  			return 1;
> +		local_bh_disable();
>  		alloc = sk_sockets_allocated_read_positive(sk);
> +		local_bh_enable();
>  		if (sk_prot_mem_limits(sk, 2) > alloc *
>  		    sk_mem_pages(sk->sk_wmem_queued +
>  				 atomic_read(&sk->sk_rmem_alloc) +
> 
> 
> 
> 
> 
> 
> 

^ permalink raw reply

* Re: inconsistent lock/deadlock crash, vanilla 3.3.4, 32bit, tcp
From: Eric Dumazet @ 2012-04-29  8:44 UTC (permalink / raw)
  To: Denys Fedoryshchenko
  Cc: netdev, davem, jmorris, yoshfuji, kaber, linux-kernel,
	Glauber Costa
In-Reply-To: <1335688804.2900.83.camel@edumazet-glaptop>

On Sun, 2012-04-29 at 10:40 +0200, Eric Dumazet wrote:
> On Sun, 2012-04-29 at 10:27 +0200, Eric Dumazet wrote:
> > On Sun, 2012-04-29 at 10:41 +0300, Denys Fedoryshchenko wrote:
> > > I apologize for late night post, and a lot of trash left in report.
> > > I will cleanup it up now, and send with CC to maintainers.
> > > 
> > > Server job are proxy, with very high rate of new connections.
> > > Deadlock at peaktime can be easily reproduced in 10-15 minutes.
> > > 
> > > Deadlock occured on almost all 3.3-stable versions (tried 3.3.3 - 
> > > 3.3.4). It is not easy to try older kernel,
> > > but if required i can try.
> > > Usually also, because SYN rate very high, i can see:
> > > [   51.612987] TCP: Possible SYN flooding on port 8080. Sending 
> > > cookies.  Check SNMP counters.
> > > 
> > >   [  762.903868]
> > >   [  762.903880] =================================
> > >   [  762.903890] [ INFO: inconsistent lock state ]
> > >   [  762.903903] 3.3.4-build-0061 #8 Not tainted
> > >   [  762.904133] ---------------------------------
> > >   [  762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> > >   [  762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > >   [  762.904542]  (key#3){+.?...}, at: [<c0232cc4>] 
> > > __percpu_counter_sum+0xd/0x58
> > >   [  762.904542] {IN-SOFTIRQ-W} state was registered at:
> > >   [  762.904542]   [<c0158b84>] __lock_acquire+0x284/0xc26
> > >   [  762.904542]   [<c01598e8>] lock_acquire+0x71/0x85
> > >   [  762.904542]   [<c0349765>] _raw_spin_lock+0x33/0x40
> > >   [  762.904542]   [<c0232c93>] __percpu_counter_add+0x58/0x7c
> > >   [  762.904542]   [<c02cfde1>] sk_clone_lock+0x1e5/0x200
> > >   [  762.904542]   [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
> > >   [  762.904542]   [<c0315778>] tcp_create_openreq_child+0x1b/0x404
> > >   [  762.904542]   [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
> > >   [  762.904542]   [<c031615a>] tcp_check_req+0x1fd/0x2d7
> > >   [  762.904542]   [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
> > >   [  762.904542]   [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
> > >   [  762.904542]   [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
> > >   [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
> > >   [  762.904542]   [<c02fc652>] ip_local_deliver+0x41/0x45
> > >   [  762.904542]   [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
> > >   [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
> > >   [  762.904542]   [<c02fc857>] ip_rcv+0x201/0x23e
> > >   [  762.904542]   [<c02daa3a>] __netif_receive_skb+0x319/0x368
> > >   [  762.904542]   [<c02dac07>] netif_receive_skb+0x4e/0x7d
> > >   [  762.904542]   [<c02dacf6>] napi_skb_finish+0x1e/0x34
> > >   [  762.904542]   [<c02db122>] napi_gro_receive+0x20/0x24
> > >   [  762.904542]   [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
> > >   [  762.904542]   [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
> > >   [  762.904542]   [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
> > >   [  762.904542]   [<c02db228>] net_rx_action+0x90/0x160
> > >   [  762.904542]   [<c012a445>] __do_softirq+0x7b/0x118
> > >   [  762.904542] irq event stamp: 156915469
> > >   [  762.904542] hardirqs last  enabled at (156915469): [<c019b4f4>] 
> > > __slab_alloc.clone.58.clone.63+0xc4/0x2de
> > >   [  762.904542] hardirqs last disabled at (156915468): [<c019b452>] 
> > > __slab_alloc.clone.58.clone.63+0x22/0x2de
> > >   [  762.904542] softirqs last  enabled at (156915466): [<c02ce677>] 
> > > lock_sock_nested+0x64/0x6c
> > >   [  762.904542] softirqs last disabled at (156915464): [<c0349914>] 
> > > _raw_spin_lock_bh+0xe/0x45
> > >   [  762.904542]
> > >   [  762.904542] other info that might help us debug this:
> > >   [  762.904542]  Possible unsafe locking scenario:
> > >   [  762.904542]
> > >   [  762.904542]        CPU0
> > >   [  762.904542]        ----
> > >   [  762.904542]   lock(key#3);
> > >   [  762.904542]   <Interrupt>
> > >   [  762.904542]     lock(key#3);
> > >   [  762.904542]
> > >   [  762.904542]  *** DEADLOCK ***
> > >   [  762.904542]
> > >   [  762.904542] 1 lock held by squid/1603:
> > >   [  762.904542]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>] 
> > > lock_sock+0xa/0xc
> > >   [  762.904542]
> > >   [  762.904542] stack backtrace:
> > >   [  762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
> > >   [  762.904542] Call Trace:
> > >   [  762.904542]  [<c0347b73>] ? printk+0x18/0x1d
> > >   [  762.904542]  [<c015873a>] valid_state+0x1f6/0x201
> > >   [  762.904542]  [<c0158816>] mark_lock+0xd1/0x1bb
> > >   [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
> > >   [  762.904542]  [<c015805d>] ? check_usage_forwards+0x77/0x77
> > >   [  762.904542]  [<c0158bf8>] __lock_acquire+0x2f8/0xc26
> > >   [  762.904542]  [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
> > >   [  762.904542]  [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
> > >   [  762.904542]  [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
> > >   [  762.904542]  [<c01598e8>] lock_acquire+0x71/0x85
> > >   [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
> > >   [  762.904542]  [<c0349765>] _raw_spin_lock+0x33/0x40
> > >   [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
> > >   [  762.904542]  [<c0232cc4>] __percpu_counter_sum+0xd/0x58
> > >   [  762.904542]  [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
> > >   [  762.904542]  [<c02d178d>] ? __alloc_skb+0x76/0x100
> > >   [  762.904542]  [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
> > >   [  762.904542]  [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
> > >   [  762.904542]  [<c0306567>] tcp_sendmsg+0x18f/0x68b
> > >   [  762.904542]  [<c031f3dc>] ? ip_fast_csum+0x30/0x30
> > >   [  762.904542]  [<c0320193>] inet_sendmsg+0x53/0x5a
> > >   [  762.904542]  [<c02cb633>] sock_aio_write+0xd2/0xda
> > >   [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
> > >   [  762.904542]  [<c01a1017>] do_sync_write+0x9f/0xd9
> > >   [  762.904542]  [<c01a2111>] ? file_free_rcu+0x2f/0x2f
> > >   [  762.904542]  [<c01a17a1>] vfs_write+0x8f/0xab
> > >   [  762.904542]  [<c01a284d>] ? fget_light+0x75/0x7c
> > >   [  762.904542]  [<c01a1900>] sys_write+0x3d/0x5e
> > >   [  762.904542]  [<c0349ec9>] syscall_call+0x7/0xb
> > >   [  762.904542]  [<c0340000>] ? rp_sidt+0x41/0x83
> > > 
> > 
> > 
> > OK, so when we have memory pressure we can call
> > percpu_counter_read_positive() with SOFTIRQ enabled, and lockdep
> > complains...
> > 
> > This bug was probably added in 2008, in commit 1748376b6626a
> > (net: Use a percpu_counter for sockets_allocated)
> 
> Hmm, no this patch was fine.
> 
> Bug was in fact added by Glauber Costa in commit 180d8cd942ce336b2c869
> (foundations of per-cgroup memory pressure controlling.)
> 
> Because he replaced the safe percpu_counter_read_positive() call to
> unsafe percpu_counter_sum_positive() in
> sk_sockets_allocated_read_positive()
> 
> But anyway the patch I sent should fix the problem.

Thinking again, I am not sure why Glauber did this change. He probably
made a typo or something.

Maybe we should revert and use the following patch instead, since we
dont need a precise sum at this point.

This better matches "sk_sockets_allocated_read_positive" intent anyway

diff --git a/include/net/sock.h b/include/net/sock.h
index 188532e..5a0a58a 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1129,9 +1129,9 @@ sk_sockets_allocated_read_positive(struct sock *sk)
 	struct proto *prot = sk->sk_prot;
 
 	if (mem_cgroup_sockets_enabled && sk->sk_cgrp)
-		return percpu_counter_sum_positive(sk->sk_cgrp->sockets_allocated);
+		return percpu_counter_read_positive(sk->sk_cgrp->sockets_allocated);
 
-	return percpu_counter_sum_positive(prot->sockets_allocated);
+	return percpu_counter_read_positive(prot->sockets_allocated);
 }
 
 static inline int

^ permalink raw reply related

* Re: inconsistent lock/deadlock crash, vanilla 3.3.4, 32bit, tcp
From: Denys Fedoryshchenko @ 2012-04-29  8:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, jmorris, yoshfuji, kaber, linux-kernel,
	Glauber Costa
In-Reply-To: <1335688036.2900.80.camel@edumazet-glaptop>

On 2012-04-29 11:27, Eric Dumazet wrote:
>
>
> OK, so when we have memory pressure we can call
> percpu_counter_read_positive() with SOFTIRQ enabled, and lockdep
> complains...
>
> This bug was probably added in 2008, in commit 1748376b6626a
> (net: Use a percpu_counter for sockets_allocated)
>
> I'll have to backport the following patch, can you test it please ?
>
> Thanks !
>
>  net/core/sock.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index b2e14c0..08fc929 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1818,7 +1818,9 @@ int __sk_mem_schedule(struct sock *sk, int
> size, int kind)
>
>  		if (!sk_under_memory_pressure(sk))
>  			return 1;
> +		local_bh_disable();
>  		alloc = sk_sockets_allocated_read_positive(sk);
> +		local_bh_enable();
>  		if (sk_prot_mem_limits(sk, 2) > alloc *
>  		    sk_mem_pages(sk->sk_wmem_queued +
>  				 atomic_read(&sk->sk_rmem_alloc) +

I will try to deploy it and test as soon as someone will powercycle 
server on other side.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox