netdev.vger.kernel.org archive mirror
* r8169+NAPI soft lockup
@ 2006-05-09 15:44 Richard Gregory
  2006-05-09 16:43 ` Francois Romieu
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Richard Gregory @ 2006-05-09 15:44 UTC
  To: netdev

I'm seeing the crash below using a custom 2.6.16.11 kernel on a system
based on RedHat FC2. The main culprit appears to be the r8169+NAPI
module, although the it821x module (with noraid=1) seems to bring out
the bug, maybe because it uses the same interrupt.
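
The shared-interrupt guess can be checked from /proc/interrupts. What follows is only a minimal sketch: the helper name shared_irqs and the device names eth0 and ide are assumptions for illustration, not taken from the report.

```shell
#!/bin/sh
# shared_irqs FILE PAT_A PAT_B   (hypothetical helper)
# Print the IRQ number of every line in FILE (in /proc/interrupts layout)
# whose text matches both patterns, i.e. lines where the two devices
# share an interrupt line.
shared_irqs() {
    awk -v a="$2" -v b="$3" '
        $0 ~ a && $0 ~ b {
            irq = $1
            sub(/:$/, "", irq)   # "11:" -> "11"
            print irq
        }
    ' "$1"
}

# Example (device names are assumptions, adjust to the real ones):
#   shared_irqs /proc/interrupts eth0 ide
```

An empty result would suggest the two devices do not share a line, at least under the names used in the patterns.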

The machine is an Athlon 2200+ with 1.5G of RAM and an NForce2 chipset.
Two 40 gig drives form an ext3 software raid1 OS partition, and eight
160 gig drives form a software raid5 partition, 1.1TB in size, using
reiserfs. The eight raid5 drives use four ITE8182 PCI cards; the r8169
gigabit card is in the middle of 5 PCI slots. The two raid1 boot drives
use the onboard IDE.

BUG: soft lockup detected on CPU#0!

Pid: 11413, comm:                 cpio
EIP: 0060:[<c027d551>] CPU: 0
EIP is at ide_intr+0x41/0xe0
  EFLAGS: 00000286    Not tainted  (2.6.16.11 #1)
EAX: 00000050 EBX: f6f81d80 ECX: e81a9b1c EDX: 0000d007
ESI: 04000000 EDI: 00000286 EBP: c04a8740 DS: 007b ES: 007b
CR0: 8005003b CR2: b7fa5000 CR3: 2a319000 CR4: 000006d0
  [<c012c5c0>] handle_IRQ_event+0x21/0x4a
  [<c012c63c>] __do_IRQ+0x53/0x8f
  [<c0103fab>] do_IRQ+0x19/0x24
  [<c0102b6e>] common_interrupt+0x1a/0x20
  [<c030fb30>] netif_receive_skb+0x108/0x1a9
  [<f883017e>] rtl8169_rx_interrupt+0x287/0x31e [r8169]
  [<f8830798>] pci_unmap_single+0x0/0x10 [r8169]
  [<f88303ad>] rtl8169_poll+0x37/0xb5 [r8169]
  [<c030fd18>] net_rx_action+0x75/0x10a
  [<c0117be1>] __do_softirq+0x35/0x7d
  [<c0117c4b>] do_softirq+0x22/0x26
  [<c0103fb0>] do_IRQ+0x1e/0x24
  [<c0102b6e>] common_interrupt+0x1a/0x20
  [<c0187b38>] is_internal+0x37/0x6f
  [<c018832c>] search_by_key+0x713/0xb06
  [<c0177a4d>] make_cpu_key+0x2a/0x2f
  [<c01799a8>] reiserfs_update_sd_size+0x77/0x17b
  [<c01220d2>] autoremove_wake_function+0x0/0x2d
  [<c017d8ba>] reiserfs_prepare_file_region_for_write+0x47f/0x749
  [<c018ec49>] journal_begin+0x8c/0xcd
  [<c01819c4>] reiserfs_dirty_inode+0x47/0x61
  [<c015e1b7>] __mark_inode_dirty+0x27/0x14b
  [<c017d246>] reiserfs_submit_file_region_for_write+0x150/0x1d2
  [<c017e02e>] reiserfs_file_write+0x4aa/0x58e
  [<c03364e2>] tcp_v4_do_rcv+0x1b/0xb6
  [<c033699f>] tcp_v4_rcv+0x422/0x66e
  [<c01220df>] autoremove_wake_function+0xd/0x2d
  [<c01179a2>] current_fs_time+0x3a/0x50
  [<c0157c9d>] touch_atime+0x65/0xa6
  [<c014d82d>] pipe_readv+0x242/0x24e
  [<c01444b1>] vfs_write+0x87/0x123
  [<c01445eb>] sys_write+0x3c/0x62
  [<c0102947>] sysenter_past_esp+0x54/0x75

(a full trace is available, 
http://www.csc.liv.ac.uk/~greg/r8169bug.tar.gz , bug1.txt)


A similar lockup (with no info, as sysrq wasn't enabled) has been seen
with 2.6.14. In fact, when using the r8169+NAPI driver, the lockup would
occur any time the it821x driver was used instead of ITE's own driver.
With the ITE driver, the system has seen 400 days uptime with r8169+NAPI.

Without NAPI, the r8169 driver is stable with it821x. It can transfer at
~35meg/second for hours with only a single unexplained 10 second pause
(and link down/link up in dmesg).

The lockup requires both r8169 io and it821x based disk io; onboard IDE
disk io with r8169 io does not crash the system. A raid5 sync or slocate
has never yet led to a lockup, even though the raid5 partition is 89%
full. The crash above happened while using cpio to back up another
machine via rsh. The machine froze an hour or so into this operation,
having been up and running for at least a day with the raid5 partition
mounted read/write but mostly unaccessed. Other tests showed this wasn't
a reiserfs issue: reading the block device also crashed the machine.

After rebooting, a raid5 resync was required, which completed without
problems. raid5 was mounted read only for all these tests, if it was
mounted at all. linuxbox is another machine, also with an r8169 card,
but without the NAPI option. The discard daemon was running on port 9.

# locked up after 63 mins. Output in bug2.txt
$ find raid5 -xdev | cpio -oHnewc > /dev/tcp/linuxbox/9

# was fine for 180 mins, rebooted and did next test.
$ find raid5 -xdev | cpio -oHnewc > /dev/tcp/localhost/9

# locked up in ~30 mins. Output in bug3.txt
$ find raid5 -xdev | cpio -oHnewc > /dev/tcp/linuxbox/9

# locked up in 15 mins. md0 is the raid5 drive. Output in bug4.txt
# This showed raid5 module and reiserfs were not part of the problem.
$ cat /dev/md0 > /dev/tcp/linuxbox/9

# ran to the end, md0 wasn't enabled. md1 is the onboard IDE based raid1.
# `seq 0 26` is enough to transfer 1.1TB of data.
$ for i in `seq 0 26` ; do cat /dev/md1 > /dev/tcp/linuxbox/9 ; done

# locked in 1 min. Output in bug5.txt
$ for i in `seq 0 26` ; do cat /dev/md1 > /dev/tcp/linuxbox/9 ; done &
$ cat /dev/md0 > /dev/tcp/localhost/9

# locked in 150 mins. no raid5 module. Output in bug6.txt
$ for i in e g i k m o q s ; do cat /dev/hd${i}1 > /dev/tcp/linuxbox/9 ; done

# locked in 7 hours. no raid5 module. Output in bug7.txt
$ for i in e g i k m o q s ; do cat /dev/hd${i}1 > /dev/udp/linuxbox/9 ; done

# test onboard LAN, 100meg forcedeth module. Fine for 8 hours (approx
# 320gig transferred)
$ for i in g i k m o q s e ; do cat /dev/hd${i}1 > /dev/tcp/linuxbox/9 ; done &
$ for i in e g i k m o q s ; do cat /dev/hd${i}1 > /dev/tcp/linuxbox/9 ; done

# test r8169+NAPI at 100 meg. Locked in 110 mins. Output in bug8.txt
$ for i in g i k m o q s e ; do cat /dev/hd${i}1 > /dev/tcp/linuxbox/9 ; done &
$ for i in e g i k m o q s ; do cat /dev/hd${i}1 > /dev/tcp/linuxbox/9 ; done

# test r8169 at gigabit, with RX polling option disabled.
# Ran for 9 hours, so we have the winner. But why the NAPI interaction
# problem with it821x and not ITE?
$ for i in g i k m o q s e ; do cat /dev/hd${i}1 > /dev/tcp/linuxbox/9 ; done &
$ for i in e g i k m o q s ; do cat /dev/hd${i}1 > /dev/tcp/linuxbox/9 ; done &

# Again without NAPI. Ran to the end.
$ cat /dev/md0 > /dev/tcp/linuxbox/9

These tests were done a few days ago. Since then the system has been
stable with r8169 (without NAPI) and it821x.
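
Lockup times like the ones quoted for the tests above could be recorded with a heartbeat file that survives the freeze. This is a rough sketch under stated assumptions: the helper names heartbeat and minutes_alive and the file path in the example are hypothetical, and the file is assumed to live on a disk that is still readable after the reboot.

```shell
#!/bin/sh
# heartbeat FILE INTERVAL COUNT   (hypothetical helper)
# Append a Unix timestamp to FILE every INTERVAL seconds, COUNT times,
# flushing to disk after each write so the last record survives a freeze.
heartbeat() {
    hb_file=$1; hb_interval=$2; hb_count=$3
    while [ "$hb_count" -gt 0 ]; do
        date +%s >> "$hb_file"
        sync                        # push the record out before a possible lockup
        sleep "$hb_interval"
        hb_count=$((hb_count - 1))
    done
}

# minutes_alive FILE   (hypothetical helper)
# Print the whole minutes between the first and last heartbeat in FILE.
minutes_alive() {
    awk 'NR == 1 { first = $1 }
         { last = $1 }
         END { if (NR) print int((last - first) / 60) }' "$1"
}

# Example: start the heartbeat, then run one of the tests above.
#   heartbeat /var/tmp/hb.log 60 100000 &
# After the reboot:
#   minutes_alive /var/tmp/hb.log
```

With a 60 second interval the result is accurate to about a minute, which is enough to tell a 15 minute lockup from a 7 hour one.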


Am willing to test patches,

Richard


* Re: r8169+NAPI soft lockup
  2006-05-09 15:44 r8169+NAPI soft lockup Richard Gregory
@ 2006-05-09 16:43 ` Francois Romieu
  2006-05-09 18:17   ` Richard Gregory
  2006-05-09 20:26 ` Francois Romieu
  2006-05-10 21:50 ` Francois Romieu
  2 siblings, 1 reply; 8+ messages in thread
From: Francois Romieu @ 2006-05-09 16:43 UTC
  To: Richard Gregory; +Cc: netdev

Richard Gregory <R.Gregory@liverpool.ac.uk> :
> I'm seeing the crash below using 2.6.16.11 custom based on RedHat FC2. 
> The main culprit being the r8169+NAPI module, although the it821x module 
> (with noraid=1) seems to bring out the bug, maybe because it uses the 
> same interrupt.

(lot of things to digest...)

Is netconsole enabled ?

-- 
Ueimor


* Re: r8169+NAPI soft lockup
  2006-05-09 16:43 ` Francois Romieu
@ 2006-05-09 18:17   ` Richard Gregory
  2006-05-09 18:53     ` Francois Romieu
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Gregory @ 2006-05-09 18:17 UTC
  To: Francois Romieu; +Cc: netdev

Francois Romieu wrote:
> Richard Gregory <R.Gregory@liverpool.ac.uk> :
> 
>>I'm seeing the crash below using 2.6.16.11 custom based on RedHat FC2. 
>>The main culprit being the r8169+NAPI module, although the it821x module 
>>(with noraid=1) seems to bring out the bug, maybe because it uses the 
>>same interrupt.
> 
> 
> (lot of things to digest...)
> 
> Is netconsole enabled ?

It can be. How much output do you need, a single soft lockup?


Richard


* Re: r8169+NAPI soft lockup
  2006-05-09 18:17   ` Richard Gregory
@ 2006-05-09 18:53     ` Francois Romieu
  0 siblings, 0 replies; 8+ messages in thread
From: Francois Romieu @ 2006-05-09 18:53 UTC
  To: Richard Gregory; +Cc: netdev

Richard Gregory <R.Gregory@liverpool.ac.uk> :
> Francois Romieu wrote:
[...]
> >Is netconsole enabled ?
> 
> It can be. How much output do you need, a single soft lockup?

Nonononono. Keep it disabled.

-- 
Ueimor


* Re: r8169+NAPI soft lockup
  2006-05-09 15:44 r8169+NAPI soft lockup Richard Gregory
  2006-05-09 16:43 ` Francois Romieu
@ 2006-05-09 20:26 ` Francois Romieu
  2006-05-10 21:50 ` Francois Romieu
  2 siblings, 0 replies; 8+ messages in thread
From: Francois Romieu @ 2006-05-09 20:26 UTC
  To: Richard Gregory; +Cc: netdev

Richard Gregory <R.Gregory@liverpool.ac.uk> :
[...]

Can you send me your drivers/ide/ide-io.o ?

-- 
Ueimor


* Re: r8169+NAPI soft lockup
  2006-05-09 15:44 r8169+NAPI soft lockup Richard Gregory
  2006-05-09 16:43 ` Francois Romieu
  2006-05-09 20:26 ` Francois Romieu
@ 2006-05-10 21:50 ` Francois Romieu
  2006-05-11  1:47   ` Richard Gregory
  2006-06-03 15:05   ` Richard Gregory
  2 siblings, 2 replies; 8+ messages in thread
From: Francois Romieu @ 2006-05-10 21:50 UTC
  To: Richard Gregory; +Cc: netdev

Richard Gregory <R.Gregory@liverpool.ac.uk> :
[...]
> # locked in 1 min. Output in bug5.txt
> $ for i in `seq 0 26` ; do cat /dev/md1 > /dev/tcp/linuxbox/9 &
> $ cat /dev/md0 > /dev/tcp/localhost/9

Can you replace /dev/tcp/foo with a simple /dev/null and send the output
of 'vmstat 1' during 2 minutes of test ?

A few seconds of 'vmstat 1' during a simple dd from /dev/md0
(resp. /dev/md1) would be welcome too.
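
A capture like the one requested here could be scripted so that sampling and workload start together. This is a minimal sketch, not the procedure actually used in the thread: capture_during and the SAMPLER override are hypothetical names, and the sampler is assumed to accept "delay count" arguments the way vmstat does.

```shell
#!/bin/sh
# capture_during LOG SECONDS CMD...   (hypothetical helper)
# Record one sample per second for SECONDS seconds into LOG while CMD
# runs in the foreground. SAMPLER is an assumed override and defaults
# to vmstat, which takes "delay count" as its arguments.
capture_during() {
    cap_log=$1; cap_secs=$2
    shift 2
    "${SAMPLER:-vmstat}" 1 "$cap_secs" > "$cap_log" &
    cap_pid=$!
    "$@"              # the workload, e.g. dd if=/dev/md0 of=/dev/null
    wait "$cap_pid"   # let the sampler finish its remaining samples
}

# Example: two minutes of samples during a read of the raid5 device.
#   capture_during vmstat-md0.txt 120 dd if=/dev/md0 of=/dev/null
```

If the workload outlives the sampling window, the log simply stops after the requested number of samples while the workload keeps running.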

-- 
Ueimor


* Re: r8169+NAPI soft lockup
  2006-05-10 21:50 ` Francois Romieu
@ 2006-05-11  1:47   ` Richard Gregory
  2006-06-03 15:05   ` Richard Gregory
  1 sibling, 0 replies; 8+ messages in thread
From: Richard Gregory @ 2006-05-11  1:47 UTC
  To: Francois Romieu, netdev

[-- Attachment #1: Type: text/plain, Size: 483 bytes --]

Francois Romieu wrote:
> Richard Gregory <R.Gregory@liverpool.ac.uk> :
> [...]
> 
>># locked in 1 min. Output in bug5.txt
>>$ for i in `seq 0 26` ; do cat /dev/md1 > /dev/tcp/linuxbox/9 &
>>$ cat /dev/md0 > /dev/tcp/localhost/9
> 
> 
> Can you replace /dev/tcp/foo with a simple /dev/null and send the output
> of 'vmstat 1' during 2 minutes of test ?
> 
> A few seconds of 'vmstat 1' during a simple dd from /dev/md0
> (resp. /dev/md1) would be welcome too.
> 

Attached.


Richard

[-- Attachment #2: r8169vmstat.txt --]
[-- Type: text/plain, Size: 15893 bytes --]

# for i in `seq 0 26` ; do cat /dev/md1 > /dev/null ; done &
# cat /dev/md0 > /dev/null

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0   7280 631436 848704  25264    0    0     0    37   31    39  1 14 83  2
 0  0   7280 631436 848712  25264    0    0     0    56  232    26  0  0 100  0
 0  0   7280 631436 848712  25264    0    0     0     0  212    12  0  0 100  0
 0  0   7280 631436 848712  25264    0    0     0     0  221    24  0  0 100  0
 0  0   7280 631436 848712  25264    0    0     0     0  212    10  0  0 100  0
 0  0   7280 631436 848712  25264    0    0     0     0  208    12  0  0 100  0
 0  0   7280 631436 848712  25264    0    0     0     0  209     6  0  0 100  0
 0  0   7280 631436 848712  25264    0    0     0     0  212    14  0  0 100  0
 0  0   7280 631436 848712  25264    0    0     0     0  209     8  0  0 100  0
 0  1   7280 630572 849708  25328    0    0 17948     0  361   479  0  5 52 43
 0  1   7280 630864 849404  25328    0    0 50384     0  621  1249  0 14  0 86
 0  1   7280 630204 850044  25328    0    0 47360     0  583  1161  0 17  0 83
 1  2   7280 631044 849040  25328    0    0 102292     0 1314  1319  0 65  0 35
 0  2   7280 630984 849296  25328    0    0 122112     0 1627  1303  0 77  0 23
 2  1   7280 630932 849296  25328    0    0 118400    60 1590  1291  1 78  0 21
 1  1   7280 631044 849232  25328    0    0 123584     0 1582  1315  0 78  0 22
 0  2   7280 630984 849252  25328    0    0 123264     0 1619  1315  0 74  0 26
 1  2   7280 631104 848932  25328    0    0 116672    64 1681  1282  2 76  0 22
 2  1   7280 631108 848964  25328    0    0 120704     0 1559  1299  1 74  0 25
 2  1   7280 630572 849684  25328    0    0 124224     0 1670  1354  1 76  0 23
 1  2   7280 631104 848988  25328    0    0 120128    32 1655  1327  0 77  0 23
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  2   7280 630984 849252  25328    0    0 126336     0 1821  1416  1 83  0 16
 0  2   7280 631112 849124  25328    0    0 126592     0 1818  1413  0 88  0 12
 1  2   7280 631164 848888  25328    0    0 127104     0 1824  1401  1 87  0 12
 1  2   7280 631292 848760  25328    0    0 122880     0 1678  1311  1 77  0 22
 2  1   7280 630392 849856  25328    0    0 124864    32 1842  1392  1 85  0 14
 2  1   7280 631232 848840  25328    0    0 123776     0 1744  1385  0 83  0 17
 2  1   7280 630984 849288  25328    0    0 122816     0 1702  1319  0 78  0 22
 2  2   7280 630932 849352  25328    0    0 127040     0 1786  1402  1 83  0 16
 1  1   7280 631104 848976  25328    0    0 123520    12 1637  1351  0 79  0 21
 0  2   7280 631112 849188  25328    0    0 118080    20 1666  1263  3 70  2 25
 2  1   7280 630504 849772  25328    0    0 128960     0 1925  1462  1 89  0 10
 2  1   7280 630144 850156  25328    0    0 122880     0 1641  1352  1 76  0 23
 2  1   7280 630744 849520  25328    0    0 125952     0 1777  1383  1 81  0 18
 1  3   7280 630444 849848  25328    0    0 122820    20 1671  1369  3 76  0 21
 2  1   7280 630924 849340  25328    0    0 120832    12 1689  1322  2 78  0 20
 2  2   7280 631164 848960  25328    0    0 123904     0 1705  1378  2 82  0 16
 1  1   7280 631044 849232  25328    0    0 126336     0 1752  1385  2 78  0 20
 2  1   7280 630324 849956  25328    0    0 124228    40 1883  1442  1 88  0 11
 2  1   7280 630624 849700  25328    0    0 122240     0 1628  1297  0 79  0 21
 1  2   7280 631104 848996  25328    0    0 123072     0 1620  1346  2 77  0 21
 1  1   7280 631052 849252  25328    0    0 123520     0 1697  1358  0 77  0 23
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  2   7280 630924 849380  25328    0    0 127488     0 1901  1420  1 87  0 12
 1  2   7280 630144 850156  25328    0    0 120064    32 1664  1321  0 83  0 17
 1  2   7280 631224 848880  25328    0    0 124544     0 1666  1343  0 81  0 19
 2  2   7280 631104 849228  25328    0    0 122816     0 1686  1362  1 83  0 16
 2  1   7280 630812 849484  25328    0    0 120448     0 1552  1307  0 70  0 30
 1  1   7280 630512 849804  25328    0    0 124864     0 1753  1389  1 80  0 19
 2  1   7280 630444 849876  25328    0    0 120384    32 1711  1317  0 77  0 23
 1  2   7280 630804 849492  25328    0    0 120192     0 1596  1364  3 76  0 21
 1  2   7280 631164 848988  25328    0    0 121472     0 1549  1300  0 83  0 17
 1  2   7280 631224 848924  25328    0    0 125504     0 1777  1383  0 77  0 23
 2  2   7280 630624 849716  25328    0    0 121728    20 1776  1337  0 85  0 15
 0  2   7280 630864 849468  25328    0    0 115712    32 1548  1246  1 66  2 31
 2  1   7280 631232 848892  25328    0    0 123456     0 1720  1389  3 79  0 18
 0  2   7280 631104 849220  25328    0    0 128704     0 1870  1412  0 90  0 10
 2  1   7280 630384 849924  25328    0    0 126016     0 1801  1407  1 85  0 14
 0  2   7280 630984 849348  25328    0    0 123200     0 1625  1340  1 77  0 22
 2  1   7280 630504 849804  25328    0    0 118208    32 1597  1307  3 70  2 25
 2  1   7280 630392 849952  25328    0    0 128128     0 1867  1412  0 86  0 14
 0  2   7280 630444 849896  25328    0    0 120768     0 1570  1313  0 71  0 29
 0  2   7280 631044 849256  25328    0    0 124672     0 1710  1365  1 79  0 20
 0  2   7280 630924 849384  25328    0    0 125952     0 1789  1406  0 89  0 11
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  1   7280 630504 849840  25328    0    0 121408    32 1773  1385  2 81  0 17
 1  1   7280 630812 849520  25328    0    0 123200     0 1705  1376  1 77  0 22
 0  2   7280 631224 849144  25328    0    0 126976     0 1700  1379  0 77  0 23
 1  2   7280 630924 849420  25328    0    0 123776     0 1658  1358  1 78  0 21
 1  2   7280 630384 849932  25328    0    0 122368     0 1592  1308  0 75  0 25
 2  2   7280 630744 849620  25328    0    0 121920    32 1741  1338  2 79  0 19
 0  2   7280 631104 849236  25328    0    0 125440     0 1737  1373  1 85  0 14
 2  1   7280 630444 849884  25328    0    0 123648     0 1652  1343  0 80  0 20
 1  1   7280 630804 849564  25328    0    0 125504     0 1716  1387  3 80  0 17
 3  2   7280 631164 849180  25328    0    0 126208     0 1855  1407  2 85  0 13
 1  1   7280 630932 849400  25328    0    0 119104    32 1726  1354  0 82  2 16
 2  1   7280 630384 849976  25328    0    0 121152     0 1506  1304  0 66  0 34
 1  2   7280 631224 848960  25328    0    0 124800     0 1733  1369  0 81  0 19
 2  1   7280 630564 849792  25328    0    0 125888     0 1821  1439  0 85  0 15
 1  1   7280 630452 849920  25328    0    0 127616     0 1840  1406  1 83  0 16
 1  2   7280 630684 849672  25328    0    0 114048    32 1538  1233  1 71  0 28
 0  2   7280 630564 849800  25328    0    0 124160     0 1672  1360  0 83  0 17
 0  2   7280 631104 849252  25328    0    0 122944     0 1652  1332  0 76  0 24
 2  1   7280 630572 849764  25328    0    0 121472     0 1596  1307  1 76  0 23
 2  1   7280 630564 849828  25328    0    0 122176     0 1604  1329  1 77  0 22
 2  1   7280 631232 848940  25328    0    0 118272    32 1643  1285  0 75  2 23
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  2   7280 630444 849900  25328    0    0 123072     0 1656  1314  0 80  0 20
 2  1   7280 630144 850228  25328    0    0 126144     0 1783  1356  1 81  0 18
 2  1   7280 630812 849588  25328    0    0 125568     0 1788  1376  1 85  0 14
 2  1   7280 631112 849288  25328    0    0 123968     0 1705  1335  0 77  0 23
 2  1   7280 630624 849744  25328    0    0 120512    32 1678  1314  3 76  0 21
 1  1   7280 630624 849744  25328    0    0 125312     0 1761  1388  2 79  0 19
 2  1   7280 630744 849624  25328    0    0 124160     0 1718  1329  0 81  0 19
 2  1   7280 630392 850000  25328    0    0 121592     0 1696  1355  2 74  0 24
 1  2   7280 630932 849424  25328    0    0 124224     0 1734  1363  3 81  0 16
 2  1   7280 630384 850008  25328    0    0 118592    32 1657  1290  0 77  2 21
 1  1   7280 631284 849068  25328    0    0 124864     0 1771  1352  0 84  0 16
 1  2   7280 630804 849588  25328    0    0 124544     0 1762  1402  0 82  0 18
 2  1   7280 630572 849844  25328    0    0 124032     0 1695  1335  0 82  0 18
 0  2   7280 630992 849396  25328    0    0 123328     0 1651  1319  0 83  0 17
 2  1   7280 630324 850044  25328    0    0 115968    32 1537  1250  1 71  1 27
 1  1   7280 631044 849148  25328    0    0 126592     0 1795  1399  0 84  0 16
 1  1   7280 630324 850052  25328    0    0 123776     0 1672  1353  0 79  0 21
 2  1   7280 630864 849556  25328    0    0 122496     0 1705  1359  0 80  0 20
 2  2   7280 630444 849944  25328    0    0 121088     0 1603  1316  1 77  0 22
 2  1   7280 631052 849376  25328    0    0 114880    32 1494  1234  0 72  2 26
 2  1   7280 630264 850144  25328    0    0 123392     0 1653  1326  0 76  0 24
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  1   7280 630324 850088  25328    0    0 121408     0 1548  1326  1 71  0 28
 2  2   7280 631044 849384  25328    0    0 124352     0 1775  1399  0 83  0 17
 1  2   7280 630864 849512  25328    0    0 123136     0 1682  1352  0 83  0 17
 0  2   7280 631104 849268  25328    0    0 115456    32 1585  1319  2 70  2 26
 2  1   7280 630444 849988  25328    0    0 121024     0 1593  1321  0 77  0 23
 2  1   7280 630564 849868  25328    0    0 123904     0 1729  1351  0 83  0 17
 2  1   7280 631224 848972  25328    0    0 124672     0 1728  1359  1 77  0 22
 2  1   7280 630924 849484  25328    0    0 123008     0 1740  1323  2 80  0 18
 2  1   7280 630444 850004  25328    0    0 121600    32 1821  1359  1 81  2 16
 0  2   7280 630984 849428  25328    0    0 126656     0 1792  1395  2 82  0 16
 2  1   7280 630384 850012  25328    0    0 123200     0 1647  1348  0 78  0 22
 0  2   7280 630324 850096  25328    0    0 122048     0 1634  1306  1 76  0 23
 0  2   7280 630984 849456  25328    0    0 125184     0 1779  1403  0 80  0 20
 2  2   7280 630812 849592  25328    0    0 119552    32 1682  1288  0 76  0 24
 2  1   7280 630264 850168  25328    0    0 121920     0 1586  1317  1 75  0 24
 1  2   7280 631112 849344  25328    0    0 127424     0 1834  1383  1 84  0 15
 2  1   7280 631044 849216  25328    0    0 125952     0 1736  1368  0 77  0 23
 0  2   7280 630684 849728  25328    0    0 121600     0 1591  1306  1 73  0 26
 2  1   7280 630512 849948  25328    0    0 119360    32 1668  1327  1 77  0 22
 0  2   7280 630744 849692  25328    0    0 126336     0 1821  1403  1 80  0 19
 2  1   7280 630504 849956  25328    0    0 124032     0 1733  1371  1 82  0 17
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  1   7280 630744 849700  25328    0    0 121856     0 1652  1298  2 77  0 21
 0  2   7280 630924 849508  25328    0    0 129856     0 1947  1488  1 91  0  8
 1  2   7280 630864 849580  25328    0    0 119616    32 1675  1328  0 80  0 20
 2  1   7280 630324 850092  25328    0    0 123520     0 1654  1343  0 81  0 19
 1  2   7280 631224 849032  25328    0    0 121408     0 1561  1292  1 75  0 24
 0  2   7280 631164 849288  25328    0    0 127104     0 1862  1404  0 89  0 11
 0  0   7280 631052 849420  25328    0    0 120320    12 1595  1303  0 80  2 18
 2  1   7280 630444 850000  25328    0    0 121920    20 1709  1357  2 80  0 18
 2  1   7280 631164 849108  25328    0    0 124928     0 1800  1418  1 82  0 17
 2  1   7280 630564 849880  25328    0    0 125568     0 1768  1395  2 80  0 18
 1  2   7280 631104 849176  25328    0    0 123840     0 1691  1354  1 74  0 25
 0  1   7280 630864 849588  25328    0    0 97408    32 1339  1212  1 58  2 39
 0  1   7280 630504 849972  25328    0    0 45824     0  573  1129  1 14  0 85
 0  1   7280 630384 850100  25328    0    0 46976     0  575  1145  0 14  0 86
 0  1   7280 630212 850228  25328    0    0 46976     0  589  1163  0 15  0 85
 0  0   7280 630520 849972  25328    0    0 40064     0  539  1000  0 10 15 75
 0  0   7280 630520 849980  25328    0    0     0    28  224    24  0  0 100  0
 0  0   7280 630520 849980  25328    0    0     0     0  208     8  0  0 100  0
 0  0   7280 630536 849980  25328    0    0     0     0  209     8  0  0 100  0
 0  0   7280 630536 849980  25328    0    0     0     0  212    10  0  0 100  0

# dd if=/dev/md0 > /dev/null
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0   7280 630828 849780  25340    0    0    28    37   32    39  1 14 83  2
 0  0   7280 630828 849780  25340    0    0     0     0  216     8  0  0 100  0
 0  0   7280 630828 849780  25340    0    0     0     0  209     8  0  0 100  0
 0  0   7280 630828 849780  25340    0    0     0     0  209     8  0  0 100  0
 0  1   7280 630648 849964  25372    0    0 15308    68  450   197  5 59 16 20
 1  0   7280 630528 850092  25372    0    0 65408     0 1099   485  5 60  0 35
 0  1   7280 631120 849324  25372    0    0 65408     0 1078   474  4 52  0 44
 1  0   7280 630460 850156  25372    0    0 65856     0 1104   501  5 50  0 45
 0  1   7280 630520 850092  25372    0    0 64960     0 1080   483  4 49  0 48
 1  0   7280 630880 849780  25372    0    0 63168    28 1058   474  2 59  0 39
 1  0   7280 630880 849780  25372    0    0 66304     0 1117   526  6 52  0 42
 0  1   7280 630948 849716  25372    0    0 64960     0 1060   466  4 57  0 39
 0  0   7280 630460 850164  25372    0    0 47936     0  861   369  2 41 27 31
 0  0   7280 630468 850164  25372    0    0     0     0  208     8  0  0 100  0
 0  0   7280 630468 850172  25372    0    0     0    28  227    26  0  1 99  0

# dd if=/dev/md1 > /dev/null
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0   7280 630468 850172  25372    0    0    29    37   32    39  1 14 83  2
 0  0   7280 630468 850172  25372    0    0     0     0  217     8  0  0 100  0
 0  0   7280 630468 850172  25372    0    0     0     0  237    56  0  0 100  0
 0  0   7280 630468 850172  25372    0    0     0     0  217    19  0  0 100  0
 0  0   7280 630468 850172  25372    0    0     0    40  236    25  0  0 100  0
 1  1   7280 630468 850200  25372    0    0 36492    52  512   920  1 18 18 63
 0  1   7280 630700 849948  25372    0    0 46464     4  573  1139  2 21  0 77
 1  0   7280 631180 849436  25372    0    0 45056     8  567  1102  4 25  0 71
 0  1   7280 630580 850076  25372    0    0 47360     0  579  1156  0 30  0 70
 0  1   7280 631180 849308  25372    0    0 47360     0  578  1157  2 27  0 71
 0  1   7280 630820 849820  25372    0    0 47360     0  578  1152  3 24  0 73
 0  1   7280 630700 849948  25372    0    0 43008    28  560  1063  4 19  2 75
 0  1   7280 630156 850460  25372    0    0 47360     0  578  1152  3 22  0 75
 0  1   7280 630820 849820  25372    0    0 47360     0  580  1158  1 28  0 71
 0  0   7280 630828 849820  25372    0    0 10496     0  301   276  0  9 78 13
 0  0   7280 630836 849820  25372    0    0     0     0  208    10  0  0 100  0
 0  0   7280 630836 849820  25372    0    0     0    28  225    24  0  1 99  0
 0  0   7280 630836 849820  25372    0    0     0     0  208     8  0  0 100  0



* Re: r8169+NAPI soft lockup
  2006-05-10 21:50 ` Francois Romieu
  2006-05-11  1:47   ` Richard Gregory
@ 2006-06-03 15:05   ` Richard Gregory
  1 sibling, 0 replies; 8+ messages in thread
From: Richard Gregory @ 2006-06-03 15:05 UTC
  To: Francois Romieu, netdev

A little more info to the dead thread...

The machine died this morning; network and serial console were
unresponsive. On rebooting, the only unusual messages in the logs were
the last two:
Jun  3 01:58:39 loft -- MARK --
Jun  3 02:03:39 loft -- MARK --
Jun  3 02:08:40 loft -- MARK --
Jun  3 02:10:26 loft smartd[18830]: Device: /dev/hda, starting scheduled Short Self-Test.
Jun  3 02:10:27 loft smartd[18830]: Device: /dev/hdc, starting scheduled Short Self-Test.
Jun  3 02:10:27 loft smartd[18830]: Device: /dev/hde, starting scheduled Short Self-Test.
Jun  3 02:10:28 loft smartd[18830]: Device: /dev/hdg, starting scheduled Short Self-Test.
Jun  3 02:10:28 loft smartd[18830]: Device: /dev/hdi, starting scheduled Short Self-Test.
Jun  3 02:10:29 loft smartd[18830]: Device: /dev/hdk, starting scheduled Short Self-Test.
Jun  3 02:10:30 loft smartd[18830]: Device: /dev/hdm, starting scheduled Short Self-Test.
Jun  3 02:10:31 loft smartd[18830]: Device: /dev/hdo, starting scheduled Short Self-Test.
Jun  3 02:10:32 loft smartd[18830]: Device: /dev/hdq, starting scheduled Short Self-Test.
Jun  3 02:10:32 loft smartd[18830]: Device: /dev/hds, starting scheduled Short Self-Test.
Jun  3 02:13:26 loft kernel: hdo: lost interrupt
Jun  3 02:13:46 loft kernel: hdo: dma_timer_expiry: dma status == 0x21

'lost interrupt' has never been seen before in the logs.

I guess this more strongly implicates the it821x driver.
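
Whether 'lost interrupt' has appeared before, and on which drives, can be checked across older logs. This is a minimal sketch, assuming syslog lines in the format shown above; the helper name lost_irq_by_drive and the log path in the example are assumptions.

```shell
#!/bin/sh
# lost_irq_by_drive FILE...   (hypothetical helper)
# Count "lost interrupt" kernel messages per drive in syslog-style files,
# where field 6 is the device name followed by a colon ("hdo:").
lost_irq_by_drive() {
    awk '/lost interrupt/ {
             dev = $6
             sub(/:$/, "", dev)   # "hdo:" -> "hdo"
             n[dev]++
         }
         END { for (d in n) print d, n[d] }' "$@"
}

# Example (log path is an assumption):
#   lost_irq_by_drive /var/log/messages
```

Drives that only ever show up behind the ITE cards would be consistent with the it821x suspicion, though a single event is thin evidence either way.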


Richard


