netdev.vger.kernel.org archive mirror
* 2.6.24 BUG: soft lockup - CPU#X
@ 2008-03-26 16:46 Matheos Worku
  2008-03-26 17:31 ` Rick Jones
  2008-03-26 20:14 ` Jarek Poplawski
  0 siblings, 2 replies; 36+ messages in thread
From: Matheos Worku @ 2008-03-26 16:46 UTC (permalink / raw)
  To: netdev

All,

While running iperf (transmit) on the NIU driver I get several soft
lockup errors.  I have seen similar soft lockup errors on kernels 2.6.18
and later (with other 10G NIC drivers with LSO disabled), but it is more
frequent/pronounced on the 2.6.24 kernel.  oprofile is not giving much
of a clue as to where the CPUs are spending their time (at least it
doesn't indicate anything abnormal in the driver(s)).  The lockup
location indicated by the error message is not consistent, varies from
time to time, and could be outside the driver as well.  I have attached
several lockup error traces and the corresponding profile data.  Any clues?

Regards
matheos


The test system is an 8 x dual-core AMD Opteron box, with 8 CPUs booted.
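
For reference, the test and the profile were driven with commands along
these lines (just a rough sketch; the receiver address, iperf options and
vmlinux path here are assumptions, not my exact setup):

  opcontrol --vmlinux=/path/to/vmlinux --event=CPU_CLK_UNHALTED:100000 --separate=cpu
  opcontrol --start
  iperf -c <receiver> -P 8 -t 300    # multi-stream TCP transmit test
  opcontrol --stop
  opreport -l                        # per-symbol profile, as attached below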


[root@nsn57-110 ~]# more /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 33
model name      : Dual Core AMD Opteron(tm) Processor 885
stepping        : 2
cpu MHz         : 1000.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext 
fxsr_opt lm 3dnowext 3dnow rep_good pni lahf_lm cmp_legacy
bogomips        : 2019.84
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp


nsn57-110 login: BUG: soft lockup - CPU#3 stuck for 11s! [iperf:9064]
CPU 3:
Modules linked in: oprofile niu nfs lockd nfs_acl autofs4 hidp rfcomm 
l2cap bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm rdma_cm 
iw_cm ib_addr ib_srp scsi_transport_srp  ib_cm ib_ipoib ib_sa ib_uverbs 
ib_umad ib_mad ib_core dm_multipath battery ac parport_pc lp parport 
joydev sr_mod sg e1000 button i2c_nforce2 pcspkr shpchp i2c_core 
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 9064, comm: iperf Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff80406fb8>]  [<ffffffff80406fb8>] 
pfifo_fast_dequeue+0x3b/0x59
RSP: 0018:ffff8101bece5a20  EFLAGS: 00000286
RAX: 00000000000005ea RBX: ffff8101ced38000 RCX: ffff8102f397a990
RDX: ffff8102facead10 RSI: 000000000000004d RDI: ffff8102faceac00
RBP: ffff8103e10ab700 R08: 0000000000000003 R09: 0400000000000000
R10: 00000003e9d5d700 R11: ffffffff80221995 R12: ffffffff882ee016
R13: ffff8103e9d5d100 R14: 0000000000000006 R15: ffff8103e58a8ed0
FS:  000000004700d940(0063) GS:ffff8101fb072e00(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000032e6e95770 CR3: 00000001d051a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff80406d65>] __qdisc_run+0x31/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff8041a17c>] ip_queue_xmit+0x343/0x396
 [<ffffffff8028596d>] add_partial_tail+0x12/0x3f
 [<ffffffff80267f23>] get_pageblock_flags_group+0x3e/0x7f
 [<ffffffff80267f23>] get_pageblock_flags_group+0x3e/0x7f
 [<ffffffff80429176>] tcp_transmit_skb+0x69f/0x6dc
 [<ffffffff8042aee0>] __tcp_push_pending_frames+0x79f/0x878
 [<ffffffff80427d69>] tcp_rcv_established+0x108/0x6f3
 [<ffffffff8042dab0>] tcp_v4_do_rcv+0x30/0x3a1
 [<ffffffff803ece3e>] release_sock+0x60/0xbf
 [<ffffffff8042048d>] tcp_sendmsg+0x968/0xa67
 [<ffffffff803ea812>] sock_aio_write+0xf8/0x110
 [<ffffffff8028c3b0>] do_sync_write+0xc9/0x10c
 [<ffffffff8027302b>] handle_mm_fault+0x668/0x69b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8028cb6b>] vfs_write+0xd9/0x14f
 [<ffffffff8028d0ae>] sys_write+0x45/0x6e
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1

BUG: soft lockup - CPU#3 stuck for 11s! [iperf:9064]
CPU 3:
Modules linked in: oprofile niu nfs lockd nfs_acl autofs4 hidp rfcomm 
l2cap bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm rdma_cm 
iw_cm ib_addr ib_srp scsi_transport_srp  ib_cm ib_ipoib ib_sa ib_uverbs 
ib_umad ib_mad ib_core dm_multipath battery ac parport_pc lp parport 
joydev sr_mod sg e1000 button i2c_nforce2 pcspkr shpchp i2c_core 
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 9064, comm: iperf Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff802219c8>]  [<ffffffff802219c8>] 
gart_map_single+0x33/0x70
RSP: 0018:ffff8101bece5968  EFLAGS: 00000246
RAX: 00000002f656e100 RBX: 0000000000000600 RCX: 0000000000000001
RDX: 0000000000000600 RSI: ffff8102f656e100 RDI: ffff8102f656e100
RBP: 0000000000000000 R08: 4000111900000000 R09: 0050000000000000
R10: 0000000000000000 R11: ffffffff80221995 R12: 0000000000000286
R13: ffff8103fba11200 R14: ffff8103e9913100 R15: ffffffff8026fc8b
FS:  000000004700d940(0063) GS:ffff8101fb072e00(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000032e6e95770 CR3: 00000001d051a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff802219c1>] gart_map_single+0x2c/0x70
 [<ffffffff882edf65>] :niu:niu_start_xmit+0x339/0x5af
 [<ffffffff80221995>] gart_map_single+0x0/0x70
 [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246
 [<ffffffff80406fa1>] pfifo_fast_dequeue+0x24/0x59
 [<ffffffff80406dab>] __qdisc_run+0x77/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff8041a17c>] ip_queue_xmit+0x343/0x396
 [<ffffffff8028596d>] add_partial_tail+0x12/0x3f
 [<ffffffff80267f23>] get_pageblock_flags_group+0x3e/0x7f
 [<ffffffff80267f23>] get_pageblock_flags_group+0x3e/0x7f
 [<ffffffff80429176>] tcp_transmit_skb+0x69f/0x6dc
 [<ffffffff8042aee0>] __tcp_push_pending_frames+0x79f/0x878
 [<ffffffff80427d69>] tcp_rcv_established+0x108/0x6f3
 [<ffffffff8042dab0>] tcp_v4_do_rcv+0x30/0x3a1
 [<ffffffff803ece3e>] release_sock+0x60/0xbf
 [<ffffffff8042048d>] tcp_sendmsg+0x968/0xa67
 [<ffffffff803ea812>] sock_aio_write+0xf8/0x110
 [<ffffffff8028c3b0>] do_sync_write+0xc9/0x10c
 [<ffffffff8027302b>] handle_mm_fault+0x668/0x69b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8028cb6b>] vfs_write+0xd9/0x14f
 [<ffffffff8028d0ae>] sys_write+0x45/0x6e
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1


CPU: AMD64 processors, speed 1000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a 
unit mask of 0x00 (No unit mask) count 100000
Samples on CPU 0
Samples on CPU 1
Samples on CPU 2
Samples on CPU 3
Samples on CPU 4
Samples on CPU 5
Samples on CPU 6
Samples on CPU 7
samples  %        samples  %        samples  %        samples  %        
samples  %        samples  %        samples  %        samples  %        
image name               app name                 symbol name
35847    10.0121  15879     5.5917  83932     5.8174  176125   19.8328  
339527   17.1392  290339   14.6522  316262   17.6201  391473   17.5484  
vmlinux
                vmlinux                  _raw_spin_lock
33542     9.3683  13947     4.9113  4596      0.3186  54781     6.1687  
5327      0.2689  4976      0.2511  14967     0.8339  13636     0.6113  
niu.ko
                niu                      niu_start_xmit
20350     5.6838  21907     7.7144  54143     3.7527  69738     7.8530  
239428   12.0862  208502   10.5222  285560   15.9096  184698    8.2794  
vmlinux
                vmlinux                  copy_user_generic_string
13818     3.8594  12931     4.5535  9246      0.6408  24130     2.7172  
66874     3.3758  60730     3.0648  52617     2.9315  83159     3.7277  
vmlinux
                vmlinux                  tcp_ack
12375     3.4563  10647     3.7492  6462      0.4479  16099     1.8129  
45097     2.2765  41207     2.0795  33129     1.8457  56178     2.5183  
vmlinux
                vmlinux                  __tcp_select_window
11995     3.3502  8815      3.1041  5267      0.3651  15376     1.7314  
42396     2.1401  39193     1.9779  29124     1.6226  57981     2.5991  
vmlinux
                vmlinux                  __copy_skb_header
10350     2.8908  7941      2.7964  11467     0.7948  27716     3.1210  
64291     3.2454  56789     2.8659  52322     2.9151  80111     3.5911  
vmlinux
                vmlinux                  __tcp_push_pending_frames
9191      2.5671  2121      0.7469  1062      0.0736  13214     1.4880  
1180      0.0596  1195      0.0603  4256      0.2371  3715      0.1665  
vmlinux
                vmlinux                  pfifo_fast_dequeue
9141      2.5531  5952      2.0959  4051      0.2808  10781     1.2140  
30384     1.5338  28521     1.4393  22391     1.2475  41852     1.8761  
vmlinux
                vmlinux                  free_hot_cold_page
8095      2.2609  2488      0.8761  1697      0.1176  17957     2.0221  
1959      0.0989  1626      0.0821  5020      0.2797  4570      0.2049  
vmlinux
                vmlinux                  __qdisc_run
7922      2.2126  5276      1.8579  6511      0.4513  15156     1.7067  
43016     2.1714  39552     1.9960  32309     1.8001  52778     2.3659  
vmlinux
                vmlinux                  kmem_cache_free
7700      2.1506  6904      2.4312  7348      0.5093  17601     1.9820  
48613     2.4540  40715     2.0547  35488     1.9772  57007     2.5554  
vmlinux
                vmlinux                  ip_queue_xmit
7543      2.1068  6779      2.3872  23039     1.5969  50172     5.6497  
132212    6.6740  112382    5.6715  107581    5.9937  161639    7.2457  
vmlinux
                vmlinux                  dev_queue_xmit
7404      2.0679  8373      2.9485  193876   13.4377  2406      0.2709  
7016      0.3542  45116     2.2768  6269      0.3493  7755      0.3476  
vmlinux
                vmlinux                  skb_release_data
7369      2.0582  8834      3.1108  77903     5.3995  19075     2.1480  
47375     2.3915  61162     3.0866  36048     2.0084  57243     2.5660  
niu.ko
                niu                      niu_interrupt
7207      2.0129  8108      2.8552  56756     3.9338  16089     1.8117  
40057     2.0221  49214     2.4836  33659     1.8753  51010     2.2866  
niu.ko
                niu                      niu_poll
6914      1.9311  4099      1.4434  1058      0.0733  3846      0.4331  
11434     0.5772  11337     0.5721  6269      0.3493  19662     0.8814  
vmlinux
                vmlinux                  get_pageblock_flags_group
5883      1.6431  2139      0.7532  793       0.0550  10490     1.1812  
824       0.0416  779       0.0393  2586      0.1441  2272      0.1018  
vmlinux
                vmlinux                  sock_wfree
5855      1.6353  5284      1.8607  4817      0.3339  16055     1.8079  
35248     1.7793  32311     1.6306  25091     1.3979  49224     2.2065  
vmlinux
                vmlinux                  skb_copy_bits
5665      1.5822  6456      2.2734  12491     0.8658  15495     1.7448  
56038     2.8288  49153     2.4806  69654     3.8807  43563     1.9528  
vmlinux
                vmlinux                  tcp_sendmsg
5094      1.4228  3513      1.2371  1318      0.0914  4933      0.5555  
16424     0.8291  15262     0.7702  10637     0.5926  21869     0.9803  
vmlinux
                vmlinux                  tcp_set_skb_tso_segs
4958      1.3848  4813      1.6949  165349   11.4605  6882      0.7750  
16981     0.8572  44073     2.2242  9780      0.5449  20187     0.9049  
vmlinux
                vmlinux                  skb_release_all
4767      1.3314  4185      1.4737  16709     1.1581  36711     4.1339  
69252     3.4958  57613     2.9075  62623     3.4890  101863    4.5662  
vmlinux
                vmlinux                  pfifo_fast_enqueue
4678      1.3066  3174      1.1177  3419      0.2370  10220     1.1508  
27198     1.3729  24120     1.2172  20724     1.1546  37027     1.6598  
vmlinux
                vmlinux                  tcp_transmit_skb
4532      1.2658  3820      1.3452  12096     0.8384  38766     4.3653  
14964     0.7554  13336     0.6730  17124     0.9540  18142     0.8132  
vmlinux
                vmlinux                  _raw_spin_unlock
4440      1.2401  5000      1.7607  144565   10.0199  0              0  
0              0  40207     2.0291  0              0  0              0  
niu.ko
                niu                      niu_pci_unmap_single
4421      1.2348  4625      1.6287  77354     5.3615  4        4.5e-04  
2        1.0e-04  15377     0.7760  5        2.8e-04  1        4.5e-05  
vmlinux
                vmlinux                  kfree_skb
4264      1.1909  4006      1.4107  3971      0.2752  10260     1.1553  
30687     1.5491  27278     1.3766  24134     1.3446  36029     1.6151  
vmlinux
                vmlinux                  kmem_cache_alloc_node
4094      1.1435  3503      1.2336  3712      0.2573  7679      0.8647  
23417     1.1821  21114     1.0655  21857     1.2177  28370     1.2717  
vmlinux
                vmlinux                  kfree
2964      0.8278  2503      0.8814  2183      0.1513  7702      0.8673  
20190     1.0192  18987     0.9582  13373     0.7451  26578     1.1914  
vmlinux
                vmlinux                  tcp_v4_rcv
2822      0.7882  2502      0.8811  1811      0.1255  6020      0.6779  
20457     1.0327  19217     0.9698  15003     0.8359  27615     1.2379  
vmlinux
                vmlinux                  local_bh_enable
2815      0.7862  2004      0.7057  1814      0.1257  3225      0.3632  
14068     0.7101  13196     0.6659  8887      0.4951  17759     0.7961  
vmlinux
                vmlinux                  __slab_free
2731      0.7628  3904      1.3748  236108   16.3649  0              0  
0              0  51620     2.6051  0              0  0              0  
niu.ko
                niu                      release_tx_packet
2693      0.7522  897       0.3159  402       0.0279  3862      0.4349  
859       0.0434  740       0.0373  1872      0.1043  1605      0.0719  
vmlinux
                vmlinux                  dev_hard_start_xmit
2633      0.7354  2183      0.7687  2805      0.1944  5772      0.6500  
15536     0.7843  13160     0.6641  11056     0.6160  15467     0.6933  
vmlinux
                vmlinux                  ip_finish_output
2567      0.7170  1542      0.5430  3680      0.2551  4625      0.5208  
14971     0.7557  13076     0.6599  16538     0.9214  11771     0.5277  
vmlinux
                vmlinux                  get_page_from_freelist
2504      0.6994  1623      0.5715  1093      0.0758  3293      0.3708  
10124     0.5111  9167      0.4626  7378      0.4111  14708     0.6593  
vmlinux
                vmlinux                  skb_clone
2437      0.6807  816       0.2873  374       0.0259  3701      0.4168  
696       0.0351  618       0.0312  1463      0.0815  1226      0.0550  
vmlinux
                vmlinux                  gart_map_single
2424      0.6770  2060      0.7254  84        0.0058  206       0.0232  
543       0.0274  1183      0.0597  593       0.0330  820       0.0368  
vmlinux
                vmlinux                  eth_header
2007      0.5606  2392      0.8423  7641      0.5296  12076     1.3598  
38413     1.9391  33796     1.7055  40611     2.2626  34932     1.5659  
vmlinux
                vmlinux                  __alloc_skb
1958      0.5469  856       0.3014  2011      0.1394  6385      0.7190  
19324     0.9755  16077     0.8113  11571     0.6447  22939     1.0283  
vmlinux
                vmlinux                  __phys_addr
1868      0.5217  1310      0.4613  4119      0.2855  5267      0.5931  
17008     0.8586  15217     0.7679  19030     1.0602  15463     0.6932  
vmlinux
                vmlinux                  memset_c
1834      0.5122  558       0.1965  344       0.0238  4411      0.4967  
572       0.0289  528       0.0266  1401      0.0781  1332      0.0597  
niu.ko
                niu                      niu_set_txd
1702      0.4754  1381      0.4863  148       0.0103  859       0.0967  
2748      0.1387  2943      0.1485  1009      0.0562  5099      0.2286  
vmlinux
                vmlinux                  free_pages_bulk
1402      0.3916  1614      0.5684  1566      0.1085  5598      0.6304  
10886     0.5495  10396     0.5246  8331      0.4642  13281     0.5953  
vmlinux
                vmlinux                  __pskb_pull_tail
1321      0.3690  1058      0.3726  1253      0.0868  2939      0.3310  
9312      0.4701  7869      0.3971  6736      0.3753  10654     0.4776  
vmlinux
                vmlinux                  tcp_rcv_established
1314      0.3670  11301     3.9795  8380      0.5808  2592      0.2919  
11391     0.5750  1007      0.0508  4551      0.2536  2900      0.1300  
oprofiled                oprofiled                (no symbols)
1311      0.3662  1701      0.5990  1919      0.1330  527       0.0593  
5967      0.3012  5769      0.2911  11960     0.6663  1958      0.0878  
vmlinux
                vmlinux                  __rmqueue
1218      0.3402  635       0.2236  287       0.0199  486       0.0547  
1996      0.1008  2162      0.1091  1289      0.0718  2629      0.1178  
vmlinux
                vmlinux                  __mod_zone_page_state
1177      0.3287  1256      0.4423  1969      0.1365  6394      0.7200  
15622     0.7886  14234     0.7183  11927     0.6645  20082     0.9002  
vmlinux
                vmlinux                  netif_receive_skb
1072      0.2994  1093      0.3849  2235      0.1549  5437      0.6122  
13295     0.6711  12614     0.6366  11498     0.6406  17651     0.7912  
vmlinux
                vmlinux                  ip_route_input
1043      0.2913  680       0.2395  737       0.0511  1932      0.2176  
5400      0.2726  5268      0.2659  4528      0.2523  6865      0.3077  
vmlinux
                vmlinux                  put_page
1034      0.2888  447       0.1574  2398      0.1662  4145      0.4668  
26089     1.3170  24902     1.2567  13333     0.7428  14704     0.6591  
vmlinux
                vmlinux                  __delay
996       0.2782  751       0.2645  21850     1.5144  258       0.0291  
809       0.0408  6957      0.3511  557       0.0310  1027      0.0460  
vmlinux
                vmlinux                  gart_unmap_single
953       0.2662  711       0.2504  1331      0.0923  3741      0.4213  
16598     0.8379  14805     0.7471  5559      0.3097  17815     0.7986  
vmlinux
                vmlinux                  ip_local_deliver
947       0.2645  250       0.0880  1315      0.0911  56        0.0063  
876       0.0442  1013      0.0511  1247      0.0695  255       0.0114  
iperf
                iperf                    reporter_condprintstats
930       0.2597  939       0.3307  936       0.0649  3379      0.3805  
6289      0.3175  6103      0.3080  5442      0.3032  9742      0.4367  
niu.ko
                niu                      niu_find_rxpage
885       0.2472  1082      0.3810  76280     5.2870  533       0.0600  
1716      0.0866  21803     1.1003  1219      0.0679  2108      0.0945  
vmlinux
                vmlinux                  __kfree_skb
876       0.2447  775       0.2729  869       0.0602  2878      0.3241  
7079      0.3573  6356      0.3208  3648      0.2032  8924      0.4000  
vmlinux
                vmlinux                  ip_local_deliver_finish
864       0.2413  552       0.1944  3143      0.2178  6539      0.7363  
18190     0.9182  16338     0.8245  12059     0.6719  20170     0.9042  
vmlinux
                vmlinux                  pfn_to_page
831       0.2321  815       0.2870  71        0.0049  167       0.0188  
440       0.0222  698       0.0352  422       0.0235  650       0.0291  
vmlinux
                vmlinux                  _raw_read_lock
779       0.2176  679       0.2391  2057      0.1426  2551      0.2873  
11112     0.5609  10998     0.5550  13292     0.7405  10674     0.4785  
vmlinux
                vmlinux                  __kmalloc_node_track_caller
700       0.1955  358       0.1261  46        0.0032  81        0.0091  
206       0.0104  294       0.0148  271       0.0151  262       0.0117  
vmlinux
                vmlinux                  local_bh_enable_ip
682       0.1905  853       0.3004  29        0.0020  82        0.0092  
117       0.0059  238       0.0120  170       0.0095  236       0.0106  
vmlinux
                vmlinux                  neigh_resolve_output
640       0.1788  398       0.1402  457       0.0317  838       0.0944  
2527      0.1276  2174      0.1097  3696      0.2059  4044      0.1813  
vmlinux
                vmlinux                  tcp_init_tso_segs
639       0.1785  428       0.1507  541       0.0375  348       0.0392  
1888      0.0953  1729      0.0873  3567      0.1987  838       0.0376  
vmlinux
                vmlinux                  __rmqueue_smallest
626       0.1748  485       0.1708  678       0.0470  1686      0.1899  
3472      0.1753  3666      0.1850  2181      0.1215  4042      0.1812  
vmlinux
                vmlinux                  page_to_pfn
596       0.1665  571       0.2011  202       0.0140  434       0.0489  
1472      0.0743  1303      0.0658  1399      0.0779  1285      0.0576  
vmlinux
                vmlinux                  new_slab
535       0.1494  838       0.2951  854       0.0592  2850      0.3209  
7345      0.3708  5461      0.2756  8328      0.4640  7682      0.3444  
vmlinux
                vmlinux                  ip_rcv_finish
519       0.1450  505       0.1778  3538      0.2452  3250      0.3660  
16741     0.8451  16927     0.8542  23509     1.3098  14544     0.6520  
vmlinux
                vmlinux                  __slab_alloc
509       0.1422  438       0.1542  467       0.0324  1647      0.1855  
4471      0.2257  3949      0.1993  2904      0.1618  5552      0.2489  
vmlinux
                vmlinux                  ___pskb_trim
497       0.1388  402       0.1416  422       0.0292  1373      0.1546  
3719      0.1877  3156      0.1593  2957      0.1647  4635      0.2078  
vmlinux
                vmlinux                  tcp_rtt_estimator
487       0.1360  348       0.1225  502       0.0348  160       0.0180  
1626      0.0821  1474      0.0744  2784      0.1551  684       0.0307  
vmlinux
                vmlinux                  rmqueue_bulk
455       0.1271  360       0.1268  355       0.0246  483       0.0544  
3794      0.1915  3716      0.1875  3955      0.2203  5553      0.2489  
vmlinux
                vmlinux                  add_partial_tail


======

nsn57-110 login: BUG: soft lockup - CPU#3 stuck for 11s! [iperf:10566]
CPU 3:
Modules linked in: oprofile niu nfs lockd nfs_acl autofs4 hidp rfcomm 
l2cap bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm rdma_cm 
iw_cm ib_addr ib_srp scsi_transport_srp ib_cm ib_ipoib ib_sa ib_uverbs 
ib_umad ib_mad ib_core dm_multipath battery ac parport_pc lp parport 
joydev sr_mod sg e1000 button i2c_nforce2 pcspkr shpchp i2c_core 
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 10566, comm: iperf Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff80316d29>]  [<ffffffff80316d29>] 
_raw_spin_lock+0x7d/0xf6
RSP: 0018:ffff8101f99bb9f8  EFLAGS: 00000246
RAX: 0000000000000000 RBX: 000000000000000e RCX: 00000000a0c5b122
RDX: 0000000000003b06 RSI: 0000000000000033 RDI: 0000000000000001
RBP: ffffffff803eecfd R08: 0000000000000003 R09: 0400000000000000
R10: 00000003e24a9700 R11: ffffffff80221995 R12: ffffffff802219c1
R13: ffff8103db913450 R14: ffff8101ced38000 R15: ffff8103dd5c0700
FS:  000000004580a940(0063) GS:ffff8101fb072e00(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b762a36c050 CR3: 00000001c6950000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff80316d3a>] _raw_spin_lock+0x8e/0xf6
 [<ffffffff80406e8a>] __qdisc_run+0x156/0x174
 [<ffffffff80406dd2>] __qdisc_run+0x9e/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff8041a17c>] ip_queue_xmit+0x343/0x396
 [<ffffffff80429176>] tcp_transmit_skb+0x69f/0x6dc
 [<ffffffff8042aee0>] __tcp_push_pending_frames+0x79f/0x878
 [<ffffffff80427d69>] tcp_rcv_established+0x108/0x6f3
 [<ffffffff8042dab0>] tcp_v4_do_rcv+0x30/0x3a1
 [<ffffffff803ece3e>] release_sock+0x60/0xbf
 [<ffffffff8042048d>] tcp_sendmsg+0x968/0xa67
 [<ffffffff8026fc8b>] zone_statistics+0x3f/0x60
 [<ffffffff803ea812>] sock_aio_write+0xf8/0x110
 [<ffffffff8042048d>] tcp_sendmsg+0x968/0xa67
 [<ffffffff8028c3b0>] do_sync_write+0xc9/0x10c
 [<ffffffff8027302b>] handle_mm_fault+0x668/0x69b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8022b995>] __dequeue_entity+0x1c/0x32
 [<ffffffff8045e4a1>] _spin_unlock_irq+0x9/0xc
 [<ffffffff802314ce>] finish_task_switch+0x37/0x83
 [<ffffffff8028cb6b>] vfs_write+0xd9/0x14f
 [<ffffffff8028d0ae>] sys_write+0x45/0x6e
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1

BUG: soft lockup - CPU#3 stuck for 11s! [iperf:10557]
CPU 3:
Modules linked in: oprofile niu nfs lockd nfs_acl autofs4 hidp rfcomm 
l2cap bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm rdma_cm 
iw_cm ib_addr ib_srp scsi_transport_srp  ib_cm ib_ipoib ib_sa ib_uverbs 
ib_umad ib_mad ib_core dm_multipath battery ac parport_pc lp parport 
joydev sr_mod sg e1000 button i2c_nforce2 pcspkr shpchp i2c_core 
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 10557, comm: iperf Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff882edf6c>]  [<ffffffff882edf6c>] 
:niu:niu_start_xmit+0x340/0x5af
RSP: 0018:ffff8101ac1f5998  EFLAGS: 00000293
RAX: 00000003ec4b7100 RBX: ffff8103e54cb610 RCX: 00000003ec4b7100
RDX: 00000fffffffffff RSI: ffff8103ec4b7100 RDI: ffff8103ec4b7100
RBP: ffffffff80221995 R08: ffffffff882edf84 R09: 0050000000000000
R10: 0000000000000000 R11: ffffffff80221995 R12: ffff8102f04d4ed0
R13: 0050000000000000 R14: 0000000000000000 R15: ffffffff80221995
FS:  0000000041001940(0063) GS:ffff8101fb072e00(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b762a36c050 CR3: 00000001c6950000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff882edf65>] :niu:niu_start_xmit+0x339/0x5af
 [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246
 [<ffffffff80406e8a>] __qdisc_run+0x156/0x174
 [<ffffffff80406dab>] __qdisc_run+0x77/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff8041a17c>] ip_queue_xmit+0x343/0x396
 [<ffffffff8028596d>] add_partial_tail+0x12/0x3f
 [<ffffffff80267f23>] get_pageblock_flags_group+0x3e/0x7f
 [<ffffffff802868a5>] __slab_free+0x7a/0x2d5
 [<ffffffff80429176>] tcp_transmit_skb+0x69f/0x6dc
 [<ffffffff8042aee0>] __tcp_push_pending_frames+0x79f/0x878
 [<ffffffff80427d69>] tcp_rcv_established+0x108/0x6f3
 [<ffffffff8042dab0>] tcp_v4_do_rcv+0x30/0x3a1
 [<ffffffff803ece3e>] release_sock+0x60/0xbf
 [<ffffffff8042048d>] tcp_sendmsg+0x968/0xa67
 [<ffffffff802ec44d>] dummy_socket_sendmsg+0x0/0x3
 [<ffffffff803ea812>] sock_aio_write+0xf8/0x110
 [<ffffffff8042048d>] tcp_sendmsg+0x968/0xa67
 [<ffffffff8028c3b0>] do_sync_write+0xc9/0x10c
 [<ffffffff8027302b>] handle_mm_fault+0x668/0x69b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8022b995>] __dequeue_entity+0x1c/0x32
 [<ffffffff8028cb6b>] vfs_write+0xd9/0x14f
 [<ffffffff8028d0ae>] sys_write+0x45/0x6e
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1



samples  %        samples  %        samples  %        samples  %        
samples  %        samples  %        samples  %        samples  %        
image name               app name                 symbol name
7382      5.6251  88        0.0707  39484    17.0818  23        0.0662  
222       0.0663  272       0.0697  483       0.1890  541       0.1645  
vmlinux
                vmlinux                  skb_release_data
5373      4.0942  0              0  29098    12.5885  0              0  
0              0  0              0  0              0  0              0  
niu.ko
                niu                      niu_pci_unmap_single
5143      3.9190  2411      1.9382  1314      0.5685  625       1.8001  
14759     4.4101  7958      2.0390  25960    10.1604  26348     8.0092  
vmlinux
                vmlinux                  _raw_spin_lock
4794      3.6530  4198      3.3748  4016      1.7374  1899      5.4695  
55928    16.7117  29035     7.4392  37442    14.6543  40278    12.2436  
vmlinux
                iperf                    copy_user_generic_string
4660      3.5509  248       0.1994  24570    10.6296  37        0.1066  
767       0.2292  648       0.1660  1125      0.4403  1301      0.3955  
vmlinux
                vmlinux                  skb_release_all
4633      3.5304  1274      1.0242  0              0  137       0.3946  
2946      0.8803  2362      0.6052  4198      1.6430  4657      1.4156  
vmlinux
                vmlinux                  tcp_ack
4620      3.5205  0              0  13861     5.9966  0              0  
0              0  0              0  0              0  0              0  
vmlinux
                vmlinux                  kfree_skb
4557      3.4724  0              0  41087    17.7753  0              0  
0              0  0              0  0              0  0              0  
niu.ko
                niu                      release_tx_packet
4518      3.4427  1410      1.1335  0              0  112       0.3226  
2227      0.6654  1844      0.4725  3439      1.3460  3756      1.1417  
vmlinux
                vmlinux                  __tcp_select_window
4080      3.1090  799       0.6423  0              0  288       0.8295  
6662      1.9907  4099      1.0502  10426     4.0806  11095     3.3726  
vmlinux
                vmlinux                  dev_queue_xmit
4023      3.0655  779       0.6263  66        0.0286  84        0.2419  
1632      0.4877  1249      0.3200  2576      1.0082  3005      0.9135  
vmlinux
                vmlinux                  free_hot_cold_page
3980      3.0328  1254      1.0081  0              0  105       0.3024  
2095      0.6260  1723      0.4415  3342      1.3080  3640      1.1065  
vmlinux
                vmlinux                  __copy_skb_header
3751      2.8583  1133      0.9108  0              0  165       0.4752  
3165      0.9457  2097      0.5373  5288      2.0697  5409      1.6442  
vmlinux
                vmlinux                  __tcp_push_pending_frames
3504      2.6701  611       0.4912  68        0.0294  61        0.1757  
1193      0.3565  917       0.2349  1990      0.7789  2300      0.6991  
vmlinux
                vmlinux                  get_pageblock_flags_group
3274      2.4948  488       0.3923  12477     5.3979  61        0.1757  
1828      0.5462  1409      0.3610  4512      1.7659  5579      1.6959  
niu.ko
                niu                      niu_interrupt
3197      2.4361  8745      7.0303  1519      0.6572  5717     16.4660  
33696    10.0686  51587    13.2174  15227     5.9596  25442     7.7338  
vmlinux
                iperf                    _raw_spin_lock
3172      2.4171  953       0.7661  66        0.0286  103       0.2967  
2313      0.6911  1719      0.4404  3461      1.3546  3618      1.0998  
vmlinux
                vmlinux                  kmem_cache_free
3168      2.4140  493       0.3963  9766      4.2250  67        0.1930  
1712      0.5116  1153      0.2954  3655      1.4305  4368      1.3278  
niu.ko
                niu                      niu_poll
2693      2.0521  436       0.3505  1        4.3e-04  191       0.5501  
3350      1.0010  2245      0.5752  6262      2.4509  7050      2.1430  
vmlinux
                vmlinux                  pfifo_fast_enqueue
2402      1.8303  676       0.5434  0              0  78        0.2247  
1472      0.4398  1137      0.2913  2471      0.9671  3098      0.9417  
vmlinux
                vmlinux                  skb_copy_bits
2297      1.7503  904       0.7267  1        4.3e-04  111       0.3197  
2129      0.6362  1700      0.4356  3091      1.2098  4294      1.3053  
vmlinux
                vmlinux                  ip_queue_xmit
1944      1.4813  2137      1.7180  0              0  0              0  
0              0  0              0  32        0.0125  353       0.1073  
niu.ko
                niu                      niu_start_xmit
1809      1.3785  624       0.5016  0              0  48        0.1382  
1044      0.3120  783       0.2006  1766      0.6912  1912      0.5812  
vmlinux
                vmlinux                  tcp_set_skb_tso_segs
1779      1.3556  644       0.5177  2        8.7e-04  48        0.1382  
1207      0.3607  736       0.1886  1831      0.7166  2233      0.6788  
vmlinux
                vmlinux                  tcp_transmit_skb
1534      1.1689  356       0.2862  6         0.0026  41        0.1181  
933       0.2788  604       0.1548  1465      0.5734  1527      0.4642  
vmlinux
                vmlinux                  kfree
1368      1.0424  1213      0.9752  1054      0.4560  511       1.4718  
13895     4.1519  6859      1.7574  10097     3.9518  10450     3.1766  
vmlinux
                iperf                    tcp_sendmsg
1341      1.0218  320       0.2573  0              0  43        0.1238  
893       0.2668  763       0.1955  2021      0.7910  2460      0.7478  
vmlinux
                vmlinux                  kmem_cache_alloc_node
1260      0.9601  30        0.0241  13743     5.9456  6         0.0173  
65        0.0194  60        0.0154  103       0.0403  124       0.0377  
vmlinux
                vmlinux                  __kfree_skb
1187      0.9045  365       0.2934  0              0  33        0.0950  
670       0.2002  472       0.1209  967       0.3785  1199      0.3645  
vmlinux
                vmlinux                  skb_clone
1185      0.9030  231       0.1857  0              0  29        0.0835  
843       0.2519  624       0.1599  1343      0.5256  1783      0.5420  
vmlinux
                vmlinux                  local_bh_enable
1107      0.8435  288       0.2315  0              0  40        0.1152  
878       0.2624  792       0.2029  1299      0.5084  1456      0.4426  
vmlinux
                vmlinux                  tcp_v4_rcv
1094      0.8336  296       0.2380  0              0  39        0.1123  
480       0.1434  412       0.1056  947       0.3706  1777      0.5402  
vmlinux
                vmlinux                  ip_finish_output
1051      0.8009  217       0.1744  13        0.0056  28        0.0806  
438       0.1309  312       0.0799  769       0.3010  818       0.2487  
vmlinux
                vmlinux                  free_pages_bulk
1016      0.7742  7593      6.1041  203       0.0878  2511      7.2321  
518       0.1548  2476      0.6344  388       0.1519  369       0.1122  
niu.ko
                iperf                    niu_start_xmit
935       0.7125  202       0.1624  32        0.0138  23        0.0662  
495       0.1479  355       0.0910  899       0.3519  742       0.2256  
vmlinux
                vmlinux                  __slab_free
902       0.6873  537       0.4317  0              0  0              0  
0              0  0              0  10        0.0039  99        0.0301  
vmlinux
                vmlinux                  __qdisc_run
787       0.5997  0              0  0              0  0              0  
462       0.1380  1080      0.2767  0              0  0              0  
libperl.so               perl                     (no symbols)
708       0.5395  8         0.0064  4378      1.8940  0              0  
21        0.0063  25        0.0064  60        0.0235  42        0.0128  
vmlinux
                vmlinux                  gart_unmap_single
660       0.5029  153       0.1230  0              0  27        0.0778  
554       0.1655  477       0.1222  914       0.3577  1055      0.3207  
vmlinux
                vmlinux                  ip_route_input


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-26 16:46 2.6.24 BUG: soft lockup - CPU#X Matheos Worku
@ 2008-03-26 17:31 ` Rick Jones
  2008-03-26 20:14 ` Jarek Poplawski
  1 sibling, 0 replies; 36+ messages in thread
From: Rick Jones @ 2008-03-26 17:31 UTC (permalink / raw)
  To: Matheos Worku; +Cc: netdev

Matheos Worku wrote:
> All,
> 
> While running iperf (transmit) on the NIU driver I get several soft
> lockup errors.  I have seen similar soft lockup errors on kernels 2.6.18
> and later (with other 10G NIC drivers with LSO disabled), but it is more
> frequent/pronounced on the 2.6.24 kernel.  oprofile is not giving much
> of a clue as to where the CPUs are spending their time (at least it
> doesn't indicate anything abnormal in the driver(s)).

Is oprofile, running on an AMD processor, able to profile where/when
interrupts are disabled?  If not, there could be a fair bit missing from
the profile.
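
One quick way to check (a sketch; the exact messages and paths may vary
with the oprofile/kernel version) is whether the profiler is really in
NMI mode or has silently fallen back to the timer interrupt, since
timer-based samples cannot land in code running with interrupts off:

  dmesg | grep -i oprofile      # "using NMI interrupt" vs "using timer interrupt"
  cat /dev/oprofile/cpu_type    # "timer" means no NMI/perf-counter sampling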

rick jones

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-26 16:46 2.6.24 BUG: soft lockup - CPU#X Matheos Worku
  2008-03-26 17:31 ` Rick Jones
@ 2008-03-26 20:14 ` Jarek Poplawski
  2008-03-26 20:26   ` Matheos Worku
  1 sibling, 1 reply; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-26 20:14 UTC (permalink / raw)
  To: Matheos Worku; +Cc: netdev

Matheos Worku wrote, On 03/26/2008 05:46 PM:
...

> outside the driver as well.  I have attached several lockup  error 
> traces and corresponding profile data. Any clues?

Are the network cards' IRQs being balanced? If so, could you reproduce
this with their affinity set?
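
Something along these lines would do it (just a sketch; the IRQ number
and CPU mask below are placeholders):

  killall irqbalance                       # or: service irqbalance stop
  echo 4 > /proc/irq/<IRQ>/smp_affinity    # e.g. pin this IRQ to CPU 2
  cat /proc/interrupts                     # verify only that CPU's counter moves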

Regards,
Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-26 20:14 ` Jarek Poplawski
@ 2008-03-26 20:26   ` Matheos Worku
  2008-03-26 21:46     ` Jarek Poplawski
  2008-03-27 10:33     ` Jarek Poplawski
  0 siblings, 2 replies; 36+ messages in thread
From: Matheos Worku @ 2008-03-26 20:26 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev

Jarek Poplawski wrote:
> Matheos Worku wrote, On 03/26/2008 05:46 PM:
> ...
>
>   
>> outside the driver as well.  I have attached several lockup  error 
>> traces and corresponding profile data. Any clues?
>>     
>
> Are network cards' irqs balanced? If so, could you reproduce this
> with affinity set?
>
> Regards,
> Jarek P.
>   
Jarek,

I reproduced the lockup with irqbalance disabled and with a single
source of interrupts (the TX interrupt, UDP transmit).  The lockup
appears in a different location, though.

Regards
matheos

irq of interest: 454 (TX interrupt)


454:      19249      93234     907186       2691          0        188          0        160   PCI-MSI-edge      eth6
455:      22607      15083          5      13104      25569     161519      62514      25637   PCI-MSI-edge      eth6
456:      22390      14921          5      24605      37438     110453     251315         66   PCI-MSI-edge      eth6
457:      11109      26849          2      58895     251720         84          0      67420   PCI-MSI-edge      eth6
458:      22348      15859          1      21978      27839      10231          0     267743   PCI-MSI-edge      eth6
459:      19922      15331          2      59275          0     149788      12394      82549   PCI-MSI-edge      eth6
460:      22928      19058          4       1268      49775     183189     160901      25150   PCI-MSI-edge      eth6
461:        497      32134          1      31428          0      69182      68889      45407   PCI-MSI-edge      eth6
462:      11932      23212         10      11355     120509      47588          1     118637   PCI-MSI-edge      eth6
463:          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth6
464:          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth6
465:          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth6



.......

454:      19249     126519     907186       2691          0        188          0        160   PCI-MSI-edge      eth6
455:      22609      15083          5      13104      25569     161519      62514      25637   PCI-MSI-edge      eth6
456:      22390      14923          5      24605      37438     110453     251315         66   PCI-MSI-edge      eth6
457:      11109      26849          2      58895     251720         84          0      67420   PCI-MSI-edge      eth6
458:      22348      15867          1      21978      27839      10231          0     267744   PCI-MSI-edge      eth6
459:      19922      15331          2      59275          0     149788      12394      82549   PCI-MSI-edge      eth6
460:      22928      19058          4       1268      49775     183189     160901      25150   PCI-MSI-edge      eth6
461:        498      32134          1      31428          0      69182      68889      45407   PCI-MSI-edge      eth6
462:      11932      23216         10      11355     120509      47588          1     118637   PCI-MSI-edge      eth6
463:          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth6
464:          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth6
465:          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth6




nsn57-110 login: BUG: soft lockup - CPU#2 stuck for 11s! 
[uperf.x86_64:16606]
CPU 2:
Modules linked in: ixgbe oprofile niu nfs lockd nfs_acl autofs4 hidp 
rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm 
rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp ib_cm ib_ipoib ib_sa 
ib_uverbs ib_umad ib_mad ib_core dm_multipath battery ac parport_pc lp 
parport joydev sr_mod sg e1000 button i2c_nforce2 pcspkr shpchp i2c_core 
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 16606, comm: uperf.x86_64 Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff803ef525>]  [<ffffffff803ef525>] 
__copy_skb_header+0x10d/0x134
RSP: 0018:ffff8101ae14ba38  EFLAGS: 00000246
RAX: 0000000020000000 RBX: ffff8101d059a400 RCX: 000000000000000c
RDX: 0000000000000000 RSI: ffff8101d059a468 RDI: ffff8101f7db4868
RBP: ffff8101ffe50d80 R08: ffff8101f7db4800 R09: ffff8101d059a400
R10: 00000001b1c64660 R11: ffffffff80221995 R12: 0000000000000000
R13: 0000000100000000 R14: ffffffff802858e4 R15: ffff8101fec71900
FS:  0000000040800940(0063) GS:ffff8101fb072700(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000044005f48 CR3: 00000001d0513000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
 [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
 [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
 [<ffffffff80221995>] gart_map_single+0x0/0x70
 [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246
 [<ffffffff80406fb8>] pfifo_fast_dequeue+0x3b/0x59
 [<ffffffff80406dab>] __qdisc_run+0x77/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff80417faf>] ip_push_pending_frames+0x32c/0x3a0
 [<ffffffff80419676>] ip_generic_getfrag+0x0/0x8b
 [<ffffffff8043359f>] udp_push_pending_frames+0x2ba/0x337
 [<ffffffff80434794>] udp_sendmsg+0x4c8/0x606
 [<ffffffff803eafbb>] sock_sendmsg+0xe2/0xff
 [<ffffffff8029e1a1>] iput+0x42/0x7b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80275d0c>] find_extend_vma+0x16/0x59
 [<ffffffff8045e4d3>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff80311d88>] __up_read+0x13/0x8a
 [<ffffffff803eba5c>] sys_sendto+0x128/0x151
 [<ffffffff8045e3ed>] _spin_unlock_bh+0x9/0x15
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1

BUG: soft lockup - CPU#2 stuck for 11s! [uperf.x86_64:16606]
CPU 2:
Modules linked in: ixgbe oprofile niu nfs lockd nfs_acl autofs4 hidp 
rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm 
rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp ib_cm ib_ipoib ib_sa 
ib_uverbs ib_umad ib_mad ib_core dm_multipath battery ac parport_pc lp 
parport joydev sr_mod sg e1000 button i2c_nforce2 pcspkr shpchp i2c_core 
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 16606, comm: uperf.x86_64 Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff803ef462>]  [<ffffffff803ef462>] 
__copy_skb_header+0x4a/0x134
RSP: 0018:ffff8101ae14ba38  EFLAGS: 00000202
RAX: ffff8101fa048300 RBX: ffff8103fb35c100 RCX: ffffffff803f0453
RDX: ffff8101fa1e5d00 RSI: ffff8103fb35c100 RDI: ffff8101fa1e5d00
RBP: 0000000000000020 R08: ffff8101fa1e5d00 R09: ffff8103fb35c100
R10: 00000001c6920e60 R11: ffffffff80221995 R12: ffff810100052cc0
R13: ffffffff805abb88 R14: ffff8101ff231b80 R15: 0000000000000000
FS:  0000000040800940(0063) GS:ffff8101fb072700(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000044005f48 CR3: 00000001d0513000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
 [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
 [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
 [<ffffffff80221995>] gart_map_single+0x0/0x70
 [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246
 [<ffffffff80406daf>] __qdisc_run+0x7b/0x174
 [<ffffffff80406dab>] __qdisc_run+0x77/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff80417faf>] ip_push_pending_frames+0x32c/0x3a0
 [<ffffffff80419676>] ip_generic_getfrag+0x0/0x8b
 [<ffffffff8043359f>] udp_push_pending_frames+0x2ba/0x337
 [<ffffffff80434794>] udp_sendmsg+0x4c8/0x606
 [<ffffffff803eafbb>] sock_sendmsg+0xe2/0xff
 [<ffffffff8029e1a1>] iput+0x42/0x7b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80275d0c>] find_extend_vma+0x16/0x59
 [<ffffffff8045e4d3>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff80311d88>] __up_read+0x13/0x8a
 [<ffffffff803eba5c>] sys_sendto+0x128/0x151
 [<ffffffff8045e3ed>] _spin_unlock_bh+0x9/0x15
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1

BUG: soft lockup - CPU#2 stuck for 11s! [uperf.x86_64:16606]
CPU 2:
Modules linked in: ixgbe oprofile niu nfs lockd nfs_acl autofs4 hidp 
rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm 
rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp ib_cm ib_ipoib ib_sa 
ib_uverbs ib_umad ib_mad ib_core dm_multipath battery ac parport_pc lp 
parport joydev sr_mod sg e1000 button i2c_nforce2 pcspkr shpchp i2c_core 
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 16606, comm: uperf.x86_64 Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff803f065e>]  [<ffffffff803f065e>] 
pskb_expand_head+0x73/0x147
RSP: 0018:ffff8101ae14ba18  EFLAGS: 00000286
RAX: 0000000000000080 RBX: ffff8101c6476080 RCX: 000000000000059f
RDX: 0000000000000138 RSI: ffff8103f64ad841 RDI: ffff8101c64760c1
RBP: 0000000000000000 R08: ffff8101fb0722cb R09: 0000000000000002
R10: 0000000000000001 R11: 0000000000000002 R12: ffffffff8028725b
R13: ffff8101c6478000 R14: ffff8101ff191d80 R15: ffffffff805abb88
FS:  0000000040800940(0063) GS:ffff8101fb072700(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000044005f48 CR3: 00000001d0513000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff803f0630>] pskb_expand_head+0x45/0x147
 [<ffffffff803f154b>] skb_realloc_headroom+0x4d/0x63
 [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
 [<ffffffff80221995>] gart_map_single+0x0/0x70
 [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246
 [<ffffffff80406fb8>] pfifo_fast_dequeue+0x3b/0x59
 [<ffffffff80406dab>] __qdisc_run+0x77/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff80417faf>] ip_push_pending_frames+0x32c/0x3a0
 [<ffffffff80419676>] ip_generic_getfrag+0x0/0x8b
 [<ffffffff8043359f>] udp_push_pending_frames+0x2ba/0x337
 [<ffffffff80434794>] udp_sendmsg+0x4c8/0x606
 [<ffffffff803eafbb>] sock_sendmsg+0xe2/0xff
 [<ffffffff8029e1a1>] iput+0x42/0x7b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80275d0c>] find_extend_vma+0x16/0x59
 [<ffffffff8045e4d3>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff80311d88>] __up_read+0x13/0x8a
 [<ffffffff803eba5c>] sys_sendto+0x128/0x151
 [<ffffffff8045e3ed>] _spin_unlock_bh+0x9/0x15
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-26 20:26   ` Matheos Worku
@ 2008-03-26 21:46     ` Jarek Poplawski
  2008-03-26 21:53       ` Jarek Poplawski
  2008-03-27 10:33     ` Jarek Poplawski
  1 sibling, 1 reply; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-26 21:46 UTC (permalink / raw)
  To: Matheos Worku; +Cc: netdev

On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
...
> Reproduced the lockup with irqbalance disabled and with single src of  
> interrupt (TX interrupt, UDP transmit).  Lockup appears in different  
> location though.
>
> Regards
> matheos
>
> irq of interest: 454 (TX interrupt)
>
>
> 454:      19249      93234     907186       2691          0        188    
>       0        160   PCI-MSI-edge      eth6
...
> .......
>
> 454:      19249     126519     907186       2691          0        188    
>       0        160   PCI-MSI-edge      eth6


Hmm... Is this from before disabling irqbalance, or am I missing something?

Of course, my hint could be wrong here, but there are some issues
around NAPI vs. IRQ balancing, so it would be interesting to check
this with all of the network cards' IRQ affinities set (and respected).

This could be interesting:
http://marc.info/?l=linux-netdev&m=120349568512423&w=2
(but I suspect there are other possible reasons as well)

Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-26 21:46     ` Jarek Poplawski
@ 2008-03-26 21:53       ` Jarek Poplawski
  0 siblings, 0 replies; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-26 21:53 UTC (permalink / raw)
  To: Matheos Worku; +Cc: netdev

On Wed, Mar 26, 2008 at 10:46:48PM +0100, Jarek Poplawski wrote:
> On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
...
> > 454:      19249      93234     907186       2691          0        188    
> >       0        160   PCI-MSI-edge      eth6
> ...
> > .......
> >
> > 454:      19249     126519     907186       2691          0        188    
> >       0        160   PCI-MSI-edge      eth6
> 
> 
> Hmm... Is this before disabling or I miss something?

OOPS! I see - the counters just weren't cleared. So it's OK!

Sorry,
Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-26 20:26   ` Matheos Worku
  2008-03-26 21:46     ` Jarek Poplawski
@ 2008-03-27 10:33     ` Jarek Poplawski
  2008-03-27 23:18       ` Brandeburg, Jesse
  1 sibling, 1 reply; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-27 10:33 UTC (permalink / raw)
  To: Matheos Worku; +Cc: netdev

On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
...
> nsn57-110 login: BUG: soft lockup - CPU#2 stuck for 11s!  
...
> Call Trace:
> [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
> [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
> [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
> [<ffffffff80221995>] gart_map_single+0x0/0x70
> [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246
...

Maybe I'm wrong about this again, but I wonder about gart_map_single
showing up in almost all the traces; it is probably not supposed to be
seen here.  Did you try with some memory re-configuration/debugging?
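
For instance (just a sketch; the right options depend on your kernel
config), it might be worth checking how the GART IOMMU is set up and
whether the lockups still happen with it taken out of the DMA mapping
path:

  dmesg | grep -i -e gart -e iommu    # is the AGP/GART aperture in use?
  # then try booting with e.g. "iommu=soft" (force swiotlb bounce
  # buffers) instead of the GART IOMMU, or with memory debugging enabled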

Regards,
Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-27 10:33     ` Jarek Poplawski
@ 2008-03-27 23:18       ` Brandeburg, Jesse
  2008-03-27 23:45         ` Matheos Worku
  0 siblings, 1 reply; 36+ messages in thread
From: Brandeburg, Jesse @ 2008-03-27 23:18 UTC (permalink / raw)
  To: Jarek Poplawski, Matheos Worku; +Cc: netdev

Jarek Poplawski wrote:
> On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
> ...
>> nsn57-110 login: BUG: soft lockup - CPU#2 stuck for 11s! ... Call
>> Trace: [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
>> [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
>> [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
>> [<ffffffff80221995>] gart_map_single+0x0/0x70
>> [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246 ...
> 
> Maybe I'm wrong with this again, but I wonder about this
> gart_map_single on almost all traces, and probably not supposed to be
> seen here. Did you try with some memory re-config/debugging?

I have some more examples of this, but with the ixgbe driver.  We are
running heavy bidirectional stress with multiple rx (non-NAPI, yeah I
know) interrupts by default (and userspace irqbalance is probably on;
I'll have the lab try it without).

Most of our traces are from the Red Hat 2.6.18 kernel or 2.6.21.5; I'll
try to get a more recent kernel tested.

Here is some related data, if that helps at all.

2.6.21.5
=====================================

[  180.487937] BUG: soft lockup detected on CPU#0!
[  180.492461] 
[  180.492461] Call Trace:
[  180.496426]  <IRQ>  [<ffffffff8025d788>] softlockup_tick+0xd4/0xe9
[  180.502650]  [<ffffffff8023bc57>] run_local_timers+0x13/0x15
[  180.508317]  [<ffffffff8023bf37>] update_process_times+0x4c/0x78
[  180.514323]  [<ffffffff802181d0>] smp_local_timer_interrupt+0x34/0x51
[  180.520761]  [<ffffffff80218966>] smp_apic_timer_interrupt+0x43/0x5d
[  180.527104]  [<ffffffff8020a31b>] apic_timer_interrupt+0x6b/0x70
[  180.533113]  [<ffffffff80533e2e>] _spin_unlock_irqrestore+0x3f/0x69
[  180.539372]  [<ffffffff80533e31>] _spin_unlock_irqrestore+0x42/0x69
[  180.545648]  [<ffffffff804ba219>] skb_queue_tail+0x41/0x49
[  180.551133]  [<ffffffff804b990e>] sock_queue_rcv_skb+0xe2/0x11c
[  180.557055]  [<ffffffff804f440e>] udp_queue_rcv_skb+0xb6/0x180
[  180.562877]  [<ffffffff804f492e>] __udp4_lib_rcv+0x456/0x594
[  180.568538]  [<ffffffff804f4a7e>] udp_rcv+0x12/0x14
[  180.573439]  [<ffffffff804d5572>] ip_local_deliver+0xca/0x171
[  180.579191]  [<ffffffff804d5b28>] ip_rcv+0x424/0x464
[  180.584171]  [<ffffffff804bf912>] netif_receive_skb+0x25a/0x30f
[  180.590091]  [<ffffffff804c182c>] process_backlog+0xab/0x10f
[  180.595741]  [<ffffffff804c1acc>] net_rx_action+0xbb/0x254
[  180.601220]  [<ffffffff80237cb5>] __do_softirq+0x5e/0xe2
[  180.606544]  [<ffffffff8020a86c>] call_softirq+0x1c/0x28
[  180.611850]  [<ffffffff8020bcd2>] do_softirq+0x39/0x9f
[  180.616983]  [<ffffffff80237c4b>] irq_exit+0x45/0x51
[  180.621941]  [<ffffffff8020be7b>] do_IRQ+0x143/0x168
[  180.626913]  [<ffffffff80209b86>] ret_from_intr+0x0/0xf
[  180.632131]  <EOI>  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
[  180.638424]  [<ffffffff8028391c>] kfree+0x204/0x216
[  180.643296]  [<ffffffff8028391f>] kfree+0x207/0x216
[  180.648188]  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
[  180.653829]  [<ffffffff804bb02c>] kfree_skbmem+0x11/0x7e
[  180.659145]  [<ffffffff804bb13d>] __kfree_skb+0xa4/0xa9
[  180.664365]  [<ffffffff804bb16c>] kfree_skb+0x2a/0x2c
[  180.669412]  [<ffffffff804bd07d>] skb_free_datagram+0xc/0xe
[  180.674985]  [<ffffffff804f5649>] udp_recvmsg+0x1e1/0x25f
[  180.680388]  [<ffffffff804b725e>] sock_common_recvmsg+0x32/0x47
[  180.686322]  [<ffffffff804b5d7d>] sock_recvmsg+0xe5/0xfe
[  180.691629]  [<ffffffff8024e84b>] __lock_acquire+0xb4a/0xbde
[  180.697299]  [<ffffffff802464bd>] autoremove_wake_function+0x0/0x38
[  180.703562]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
[  180.709825]  [<ffffffff804bb095>] kfree_skbmem+0x7a/0x7e
[  180.715148]  [<ffffffff80209be0>] restore_args+0x0/0x30
[  180.720381]  [<ffffffff80338e30>] copy_user_generic_string+0x0/0x40
[  180.726652]  [<ffffffff804b6b64>] sys_recvfrom+0xce/0x122
[  180.732045]  [<ffffffff8024d864>] trace_hardirqs_on+0x125/0x150
[  180.737964]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
[  180.744222]  [<ffffffff8020963e>] system_call+0x7e/0x83
[  180.749457] 
[  190.731184] BUG: soft lockup detected on CPU#0!
[  190.735704] 
[  190.735705] Call Trace:
[  190.739661]  <IRQ>  [<ffffffff8025d788>] softlockup_tick+0xd4/0xe9
[  190.745886]  [<ffffffff8023bc57>] run_local_timers+0x13/0x15
[  190.751555]  [<ffffffff8023bf37>] update_process_times+0x4c/0x78
[  190.751561]  [<ffffffff802181d0>] smp_local_timer_interrupt+0x34/0x51
[  190.751566]  [<ffffffff80218966>] smp_apic_timer_interrupt+0x43/0x5d
[  190.751571]  [<ffffffff8020a31b>] apic_timer_interrupt+0x6b/0x70
[  190.751577]  [<ffffffff80533e2e>] _spin_unlock_irqrestore+0x3f/0x69
[  190.751582]  [<ffffffff80533e31>] _spin_unlock_irqrestore+0x42/0x69
[  190.751587]  [<ffffffff804ba219>] skb_queue_tail+0x41/0x49
[  190.751590]  [<ffffffff804b990e>] sock_queue_rcv_skb+0xe2/0x11c
[  190.751595]  [<ffffffff804f440e>] udp_queue_rcv_skb+0xb6/0x180
[  190.751599]  [<ffffffff804f492e>] __udp4_lib_rcv+0x456/0x594
[  190.751603]  [<ffffffff804f4a7e>] udp_rcv+0x12/0x14
[  190.751609]  [<ffffffff804d5572>] ip_local_deliver+0xca/0x171
[  190.751613]  [<ffffffff804d5b28>] ip_rcv+0x424/0x464
[  190.751617]  [<ffffffff804bf912>] netif_receive_skb+0x25a/0x30f
[  190.751621]  [<ffffffff804c182c>] process_backlog+0xab/0x10f
[  190.751625]  [<ffffffff804c1acc>] net_rx_action+0xbb/0x254
[  190.751630]  [<ffffffff80237cb5>] __do_softirq+0x5e/0xe2
[  190.751634]  [<ffffffff8020a86c>] call_softirq+0x1c/0x28
[  190.751638]  [<ffffffff8020bcd2>] do_softirq+0x39/0x9f
[  190.751642]  [<ffffffff80237c4b>] irq_exit+0x45/0x51
[  190.751645]  [<ffffffff8020be7b>] do_IRQ+0x143/0x168
[  190.751649]  [<ffffffff80209b86>] ret_from_intr+0x0/0xf
[  190.751652]  <EOI>  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
[  190.751660]  [<ffffffff8028391c>] kfree+0x204/0x216
[  190.751664]  [<ffffffff8028391f>] kfree+0x207/0x216
[  190.751669]  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
[  190.751672]  [<ffffffff804bb02c>] kfree_skbmem+0x11/0x7e
[  190.751676]  [<ffffffff804bb13d>] __kfree_skb+0xa4/0xa9
[  190.751680]  [<ffffffff804bb16c>] kfree_skb+0x2a/0x2c
[  190.751684]  [<ffffffff804bd07d>] skb_free_datagram+0xc/0xe
[  190.751688]  [<ffffffff804f5649>] udp_recvmsg+0x1e1/0x25f
[  190.751695]  [<ffffffff804b725e>] sock_common_recvmsg+0x32/0x47
[  190.751699]  [<ffffffff804b5d7d>] sock_recvmsg+0xe5/0xfe
[  190.751703]  [<ffffffff8024e84b>] __lock_acquire+0xb4a/0xbde
[  190.751709]  [<ffffffff802464bd>] autoremove_wake_function+0x0/0x38
[  190.751713]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
[  190.751717]  [<ffffffff804bb095>] kfree_skbmem+0x7a/0x7e
[  190.751721]  [<ffffffff80209be0>] restore_args+0x0/0x30
[  190.751727]  [<ffffffff80338e30>] copy_user_generic_string+0x0/0x40
[  190.751731]  [<ffffffff804b6b64>] sys_recvfrom+0xce/0x122
[  190.751736]  [<ffffffff8024d864>] trace_hardirqs_on+0x125/0x150
[  190.751740]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
[  190.751745]  [<ffffffff8020963e>] system_call+0x7e/0x83
[  190.751749] 
[  200.738822] BUG: soft lockup detected on CPU#0!
[  200.743364] 
[  200.743364] Call Trace:
[  200.747304]  <IRQ>  [<ffffffff8025d788>] softlockup_tick+0xd4/0xe9
[  200.753531]  [<ffffffff8023bc57>] run_local_timers+0x13/0x15
[  200.759188]  [<ffffffff8023bf37>] update_process_times+0x4c/0x78
[  200.765189]  [<ffffffff802181d0>] smp_local_timer_interrupt+0x34/0x51
[  200.771634]  [<ffffffff80218966>] smp_apic_timer_interrupt+0x43/0x5d
[  200.777987]  [<ffffffff8020a31b>] apic_timer_interrupt+0x6b/0x70
[  200.783997]  [<ffffffff80533e2e>] _spin_unlock_irqrestore+0x3f/0x69
[  200.790261]  [<ffffffff80533e31>] _spin_unlock_irqrestore+0x42/0x69
[  200.796519]  [<ffffffff804ba219>] skb_queue_tail+0x41/0x49
[  200.802004]  [<ffffffff804b990e>] sock_queue_rcv_skb+0xe2/0x11c
[  200.807925]  [<ffffffff804f440e>] udp_queue_rcv_skb+0xb6/0x180
[  200.813757]  [<ffffffff804f492e>] __udp4_lib_rcv+0x456/0x594
[  200.819401]  [<ffffffff804f4a7e>] udp_rcv+0x12/0x14
[  200.824294]  [<ffffffff804d5572>] ip_local_deliver+0xca/0x171
[  200.830030]  [<ffffffff804d5b28>] ip_rcv+0x424/0x464
[  200.835003]  [<ffffffff804bf912>] netif_receive_skb+0x25a/0x30f
[  200.840919]  [<ffffffff804c182c>] process_backlog+0xab/0x10f
[  200.846582]  [<ffffffff804c1acc>] net_rx_action+0xbb/0x254
[  200.852070]  [<ffffffff80237cb5>] __do_softirq+0x5e/0xe2
[  200.857375]  [<ffffffff8020a86c>] call_softirq+0x1c/0x28
[  200.862680]  [<ffffffff8020bcd2>] do_softirq+0x39/0x9f
[  200.867830]  [<ffffffff80237c4b>] irq_exit+0x45/0x51
[  200.872788]  [<ffffffff8020be7b>] do_IRQ+0x143/0x168
[  200.877748]  [<ffffffff80209b86>] ret_from_intr+0x0/0xf
[  200.882977]  <EOI>  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
[  200.889287]  [<ffffffff8028391c>] kfree+0x204/0x216
[  200.894168]  [<ffffffff8028391f>] kfree+0x207/0x216
[  200.899042]  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
[  200.904685]  [<ffffffff804bb02c>] kfree_skbmem+0x11/0x7e
[  200.909999]  [<ffffffff804bb13d>] __kfree_skb+0xa4/0xa9
[  200.915220]  [<ffffffff804bb16c>] kfree_skb+0x2a/0x2c
[  200.920279]  [<ffffffff804bd07d>] skb_free_datagram+0xc/0xe
[  200.925844]  [<ffffffff804f5649>] udp_recvmsg+0x1e1/0x25f
[  200.931249]  [<ffffffff804b725e>] sock_common_recvmsg+0x32/0x47
[  200.937183]  [<ffffffff804b5d7d>] sock_recvmsg+0xe5/0xfe
[  200.942490]  [<ffffffff8024e84b>] __lock_acquire+0xb4a/0xbde
[  200.948150]  [<ffffffff802464bd>] autoremove_wake_function+0x0/0x38
[  200.954417]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
[  200.960670]  [<ffffffff804bb095>] kfree_skbmem+0x7a/0x7e
[  200.965984]  [<ffffffff80209be0>] restore_args+0x0/0x30
[  200.971215]  [<ffffffff80338e30>] copy_user_generic_string+0x0/0x40
[  200.977480]  [<ffffffff804b6b64>] sys_recvfrom+0xce/0x122
[  200.982881]  [<ffffffff8024d864>] trace_hardirqs_on+0x125/0x150
[  200.988791]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
[  200.995059]  [<ffffffff8020963e>] system_call+0x7e/0x83

2.6.18-5-xen
==================================================

BUG: soft lockup detected on CPU#0!

Call Trace:
 <IRQ>  [<ffffffff802a76d8>] softlockup_tick+0xdb/0xed
 [<ffffffff8026ba8c>] timer_interrupt+0x396/0x3f2
 [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
 [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
 [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
 [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
 [<ffffffff8847b58b>] :ixgbe:ixgbe_clean_rx_irq+0x16b/0xa40
 [<ffffffff8847abb7>] :ixgbe:ixgbe_clean_tx_irq+0xe7/0x330
 [<ffffffff8847bf4d>] :ixgbe:ixgbe_intr+0xed/0x220
 [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
 [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
 [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
 [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
 [<ffffffff80226aac>] tcp_v4_rcv+0x6bc/0xa59
 [<ffffffff802268e9>] tcp_v4_rcv+0x4f9/0xa59
 [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
 [<ffffffff8023463b>] ip_local_deliver+0x19d/0x263
 [<ffffffff8023591a>] ip_rcv+0x49c/0x4df
 [<ffffffff8021fe67>] netif_receive_skb+0x361/0x3df
 [<ffffffff802307e3>] process_backlog+0x96/0xff
 [<ffffffff8020c5ca>] net_rx_action+0xa8/0x1b4
 [<ffffffff80211ed3>] __do_softirq+0x62/0xdd
 [<ffffffff8025d0cc>] call_softirq+0x1c/0x28
 <EOI>  [<ffffffff80269b59>] do_softirq+0x31/0x98
 [<ffffffff8028714b>] local_bh_enable_ip+0x48/0x59
 [<ffffffff8021cde4>] tcp_recvmsg+0x9db/0xaf0
 [<ffffffff80231b2b>] sock_common_recvmsg+0x2d/0x42
 [<ffffffff8023066b>] sock_recvmsg+0x101/0x120
 [<ffffffff8029448d>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80246001>] sprintf+0x51/0x59
 [<ffffffff8022fcfe>] dev_queue_xmit+0x2f2/0x313
 [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
 [<ffffffff8023356b>] lock_sock+0xa7/0xb2
 [<ffffffff80260711>] _spin_lock_bh+0x9/0x14
 [<ffffffff80230e57>] release_sock+0x13/0xaa
 [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
 [<ffffffff8023356b>] lock_sock+0xa7/0xb2
 [<ffffffff8023af7b>] d_instantiate+0x4a/0x82
 [<ffffffff802408aa>] d_rehash+0x21/0x34
 [<ffffffff8022b16f>] sys_recvfrom+0xd4/0x137
 [<ffffffff802a67f9>] audit_syscall_exit+0x2b/0x2ec
 [<ffffffff8025c621>] tracesys+0xa7/0xb2

BUG: soft lockup detected on CPU#0!

Call Trace:
 <IRQ>  [<ffffffff802a76d8>] softlockup_tick+0xdb/0xed
 [<ffffffff8026ba8c>] timer_interrupt+0x396/0x3f2
 [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
 [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
 [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
 [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
 [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
 [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
 [<ffffffff8038d649>] force_evtchn_callback+0xa/0xb
 [<ffffffff8020ab34>] kmem_cache_alloc+0x62/0x6d
 [<ffffffff802350a4>] alloc_skb_from_cache+0x52/0x13c
 [<ffffffff803f183f>] __netdev_alloc_skb+0x12/0x2d
 [<ffffffff8847b31b>] :ixgbe:ixgbe_alloc_rx_buffers+0xfb/0x200
 [<ffffffff8847bcd2>] :ixgbe:ixgbe_clean_rx_irq+0x8b2/0xa40
 [<ffffffff8847aac4>] :ixgbe:ixgbe_unmap_and_free_tx_resource+0x44/0x50
 [<ffffffff8847bf4d>] :ixgbe:ixgbe_intr+0xed/0x220
 [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
 [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
 [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
 [<ffffffff80270394>] xen_send_IPI_mask+0xa5/0xaa
 [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8026fa43>] smp_send_reschedule+0x4b/0x50
 [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
 [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
 [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
 [<ffffffff8023dcca>] lock_timer_base+0x1b/0x3c
 [<ffffffff8038d649>] force_evtchn_callback+0xa/0xb
 [<ffffffff8021bf22>] __mod_timer+0xb0/0xbe
 [<ffffffff80253b8d>] sk_reset_timer+0xf/0x19
 [<ffffffff80226a6f>] tcp_v4_rcv+0x67f/0xa59
 [<ffffffff8025e499>] __sched_text_start+0x2d9/0x97b
 [<ffffffff8023463b>] ip_local_deliver+0x19d/0x263
 [<ffffffff8023591a>] ip_rcv+0x49c/0x4df
 [<ffffffff8021fe67>] netif_receive_skb+0x361/0x3df
 [<ffffffff802307e3>] process_backlog+0x96/0xff
 [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
 [<ffffffff8020c5ca>] net_rx_action+0xa8/0x1b4
 [<ffffffff80211ed3>] __do_softirq+0x62/0xdd
 [<ffffffff8025d0cc>] call_softirq+0x1c/0x28
 <EOI>  [<ffffffff80269b59>] do_softirq+0x31/0x98
 [<ffffffff8022b2e6>] local_bh_enable+0x93/0xa4
 [<ffffffff8025308f>] __tcp_checksum_complete_user+0x19/0x2c
 [<ffffffff8021aeb2>] tcp_rcv_established+0x45c/0x917
 [<ffffffff8023b449>] tcp_v4_do_rcv+0x2a/0x300
 [<ffffffff80230e98>] release_sock+0x54/0xaa
 [<ffffffff8021cde4>] tcp_recvmsg+0x9db/0xaf0
 [<ffffffff80231b2b>] sock_common_recvmsg+0x2d/0x42
 [<ffffffff8023066b>] sock_recvmsg+0x101/0x120
 [<ffffffff8029448d>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80246001>] sprintf+0x51/0x59
 [<ffffffff8022fcfe>] dev_queue_xmit+0x2f2/0x313
 [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
 [<ffffffff8023356b>] lock_sock+0xa7/0xb2
 [<ffffffff80260711>] _spin_lock_bh+0x9/0x14
 [<ffffffff80230e57>] release_sock+0x13/0xaa
 [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
 [<ffffffff8023356b>] lock_sock+0xa7/0xb2
 [<ffffffff8023af7b>] d_instantiate+0x4a/0x82
 [<ffffffff802408aa>] d_rehash+0x21/0x34
 [<ffffffff8022b16f>] sys_recvfrom+0xd4/0x137
 [<ffffffff803ed57c>] move_addr_to_user+0x5d/0x78
 [<ffffffff803ed641>] sys_getpeername+0x9f/0xb5
 [<ffffffff8025c621>] tracesys+0xa7/0xb2



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-27 23:18       ` Brandeburg, Jesse
@ 2008-03-27 23:45         ` Matheos Worku
  2008-03-28  0:02           ` David Miller
  0 siblings, 1 reply; 36+ messages in thread
From: Matheos Worku @ 2008-03-27 23:45 UTC (permalink / raw)
  To: Brandeburg, Jesse; +Cc: Jarek Poplawski, netdev

Brandeburg, Jesse wrote:
> Jarek Poplawski wrote:
>   
>> On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
>> ...
>>     
>>> nsn57-110 login: BUG: soft lockup - CPU#2 stuck for 11s! ... Call
>>> Trace: [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
>>> [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
>>> [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
>>> [<ffffffff80221995>] gart_map_single+0x0/0x70
>>> [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246 ...
>>>       
>> Maybe I'm wrong with this again, but I wonder about this
>> gart_map_single on almost all traces, and probably not supposed to be
>> seen here. Did you try with some memory re-config/debugging?
>>     
>
> I have some more examples of this but with the ixgbe driver.  We are
> running heavy bidirectional stress with multiple rx (non-napi, yeah I
> know) interrupts by default (and userspace irqbalance is probably on,
> I'll have the lab try it without)
>   

I have seen the lockup on kernels 2.6.18 and newer, mostly on TX traffic. 
I have seen it on another 10G driver (the off-tree sibling of the niu 
driver, nxge).  The nxge driver doesn't use any TX interrupts, and I have 
seen it with UDP TX, irqbalance disabled, with no irq activity at all.  
Some example traces are included.
Regards
Matheos


nsn57-110 login: BUG: soft lockup - CPU#4 stuck for 11s! [uperf.x86_64:6697]
CPU 4:
Modules linked in: nxge nfs lockd nfs_acl autofs4 hidp rfcomm l2cap 
bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm rdma_cm iw_cm 
ib_addr ib_srp scsi_transport_srp ib_cm ib_ipoib ib_sa ib_uverbs ib_umad 
ib_mad ib_core dm_multipath battery ac parport_pc lp parport joydev 
sr_mod sg e1000 i2c_nforce2 button i2c_core shpchp pcspkr dm_snapshot 
dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 6697, comm: uperf.x86_64 Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff80316c64>]  [<ffffffff80316c64>] 
_raw_spin_unlock+0x37/0x7f
RSP: 0018:ffff8101e00c3af8  EFLAGS: 00000246
RAX: ffff8101e084e000 RBX: ffff8101f0908000 RCX: ffff8101f0908780
RDX: 0000000000000000 RSI: 0000000000000116 RDI: ffff8101f0908300
RBP: 00000000000003d5 R08: 0000000000000004 R09: 0000000000000115
R10: ffff8101e00c3968 R11: 0000000000000000 R12: ffff8101e00c3968
R13: 0000000000000000 R14: ffff8101f0908000 R15: ffffffff882e77ef
FS:  0000000041001940(0063) GS:ffff8102fba13580(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000004f81cf48 CR3: 00000001e05b4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff80406dca>] __qdisc_run+0x96/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff80417faf>] ip_push_pending_frames+0x32c/0x3a0
 [<ffffffff80419676>] ip_generic_getfrag+0x0/0x8b
 [<ffffffff8043359f>] udp_push_pending_frames+0x2ba/0x337
 [<ffffffff80434794>] udp_sendmsg+0x4c8/0x606
 [<ffffffff803eafbb>] sock_sendmsg+0xe2/0xff
 [<ffffffff8029e1a1>] iput+0x42/0x7b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80275d0c>] find_extend_vma+0x16/0x59
 [<ffffffff8045e4d3>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff80311d88>] __up_read+0x13/0x8a
 [<ffffffff803eba5c>] sys_sendto+0x128/0x151
 [<ffffffff8045e3ed>] _spin_unlock_bh+0x9/0x15
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1

BUG: soft lockup - CPU#4 stuck for 11s! [uperf.x86_64:6697]
CPU 4:
Modules linked in: nxge nfs lockd nfs_acl autofs4 hidp rfcomm l2cap 
bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm rdma_cm iw_cm 
ib_addr ib_srp scsi_transport_srp ib_cm ib_ipoib ib_sa ib_uverbs ib_umad 
ib_mad ib_core dm_multipath battery ac parport_pc lp parport joydev 
sr_mod sg e1000 i2c_nforce2 button i2c_core shpchp pcspkr dm_snapshot 
dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 6697, comm: uperf.x86_64 Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff80316d29>]  [<ffffffff80316d29>] 
_raw_spin_lock+0x7d/0xf6
RSP: 0018:ffff8101e00c3ad8  EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000036 RCX: 00000000dabeab42
RDX: 0000000000000e24 RSI: 0000000000000392 RDI: 0000000000000001
RBP: ffffffff882e77ef R08: 0000000000000004 R09: 0000000000000391
R10: ffff8101e00c3968 R11: 0000000000000000 R12: ffffffffffffff10
R13: 0000000000000011 R14: ffff8102f9db8010 R15: 00000000000005de
FS:  0000000041001940(0063) GS:ffff8102fba13580(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000004f81cf48 CR3: 00000001e05b4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff80316d3a>] _raw_spin_lock+0x8e/0xf6
 [<ffffffff80406fa1>] pfifo_fast_dequeue+0x24/0x59
 [<ffffffff80406dd2>] __qdisc_run+0x9e/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff80417faf>] ip_push_pending_frames+0x32c/0x3a0
 [<ffffffff80419676>] ip_generic_getfrag+0x0/0x8b
 [<ffffffff8043359f>] udp_push_pending_frames+0x2ba/0x337
 [<ffffffff80434794>] udp_sendmsg+0x4c8/0x606
 [<ffffffff803eafbb>] sock_sendmsg+0xe2/0xff
 [<ffffffff8029e1a1>] iput+0x42/0x7b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80275d0c>] find_extend_vma+0x16/0x59
 [<ffffffff8045e4d3>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff80311d88>] __up_read+0x13/0x8a
 [<ffffffff803eba5c>] sys_sendto+0x128/0x151
 [<ffffffff8045e3ed>] _spin_unlock_bh+0x9/0x15
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1

BUG: soft lockup - CPU#4 stuck for 11s! [uperf.x86_64:6697]
CPU 4:
Modules linked in: nxge nfs lockd nfs_acl autofs4 hidp rfcomm l2cap 
bluetooth sunrpc ipv6 cpufreq_ondemand rdma_ucm ib_ucm rdma_cm iw_cm 
ib_addr ib_srp scsi_transport_srp ib_cm ib_ipoib ib_sa ib_uverbs ib_umad 
ib_mad ib_core dm_multipath battery ac parport_pc lp parport joydev 
sr_mod sg e1000 i2c_nforce2 button i2c_core shpchp pcspkr dm_snapshot 
dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase 
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 6697, comm: uperf.x86_64 Not tainted 2.6.24-mati #3
RIP: 0010:[<ffffffff80316d29>]  [<ffffffff80316d29>] 
_raw_spin_lock+0x7d/0xf6
RSP: 0018:ffff8101e00c3ad8  EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000026 RCX: 000000002a2092e7
RDX: 0000000000000e28 RSI: 00000000000001d8 RDI: 0000000000000001
RBP: ffffffff882e77ef R08: 0000000000000004 R09: 00000000000001d7
R10: ffff8101e00c3968 R11: 0000000000000000 R12: ffffffffffffff10
R13: 0000000000000011 R14: ffff8102f686e010 R15: 00000000000005de
FS:  0000000041001940(0063) GS:ffff8102fba13580(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000004f81cf48 CR3: 00000001e05b4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff80316d3a>] _raw_spin_lock+0x8e/0xf6
 [<ffffffff80406e8a>] __qdisc_run+0x156/0x174
 [<ffffffff80406dd2>] __qdisc_run+0x9e/0x174
 [<ffffffff803f8139>] dev_queue_xmit+0x141/0x270
 [<ffffffff80417faf>] ip_push_pending_frames+0x32c/0x3a0
 [<ffffffff80419676>] ip_generic_getfrag+0x0/0x8b
 [<ffffffff8043359f>] udp_push_pending_frames+0x2ba/0x337
 [<ffffffff80434794>] udp_sendmsg+0x4c8/0x606
 [<ffffffff803eafbb>] sock_sendmsg+0xe2/0xff
 [<ffffffff8029e1a1>] iput+0x42/0x7b
 [<ffffffff802480e0>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80275d0c>] find_extend_vma+0x16/0x59
 [<ffffffff8045e4d3>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff80311d88>] __up_read+0x13/0x8a
 [<ffffffff803eba5c>] sys_sendto+0x128/0x151
 [<ffffffff8045e3ed>] _spin_unlock_bh+0x9/0x15
 [<ffffffff8020b7fc>] tracesys+0xdc/0xe1




> Most of our traces are from redhat 2.6.18 kernel, or 2.6.21.5, I'll try
> to get a more recent kernel tested.
>
> here is some related data if that helps at all.
>
> 2.6.21.5
> =====================================
>
> [  180.487937] BUG: soft lockup detected on CPU#0!
> [  180.492461] 
> [  180.492461] Call Trace:
> [  180.496426]  <IRQ>  [<ffffffff8025d788>] softlockup_tick+0xd4/0xe9
> [  180.502650]  [<ffffffff8023bc57>] run_local_timers+0x13/0x15
> [  180.508317]  [<ffffffff8023bf37>] update_process_times+0x4c/0x78
> [  180.514323]  [<ffffffff802181d0>] smp_local_timer_interrupt+0x34/0x51
> [  180.520761]  [<ffffffff80218966>] smp_apic_timer_interrupt+0x43/0x5d
> [  180.527104]  [<ffffffff8020a31b>] apic_timer_interrupt+0x6b/0x70
> [  180.533113]  [<ffffffff80533e2e>] _spin_unlock_irqrestore+0x3f/0x69
> [  180.539372]  [<ffffffff80533e31>] _spin_unlock_irqrestore+0x42/0x69
> [  180.545648]  [<ffffffff804ba219>] skb_queue_tail+0x41/0x49
> [  180.551133]  [<ffffffff804b990e>] sock_queue_rcv_skb+0xe2/0x11c
> [  180.557055]  [<ffffffff804f440e>] udp_queue_rcv_skb+0xb6/0x180
> [  180.562877]  [<ffffffff804f492e>] __udp4_lib_rcv+0x456/0x594
> [  180.568538]  [<ffffffff804f4a7e>] udp_rcv+0x12/0x14
> [  180.573439]  [<ffffffff804d5572>] ip_local_deliver+0xca/0x171
> [  180.579191]  [<ffffffff804d5b28>] ip_rcv+0x424/0x464
> [  180.584171]  [<ffffffff804bf912>] netif_receive_skb+0x25a/0x30f
> [  180.590091]  [<ffffffff804c182c>] process_backlog+0xab/0x10f
> [  180.595741]  [<ffffffff804c1acc>] net_rx_action+0xbb/0x254
> [  180.601220]  [<ffffffff80237cb5>] __do_softirq+0x5e/0xe2
> [  180.606544]  [<ffffffff8020a86c>] call_softirq+0x1c/0x28
> [  180.611850]  [<ffffffff8020bcd2>] do_softirq+0x39/0x9f
> [  180.616983]  [<ffffffff80237c4b>] irq_exit+0x45/0x51
> [  180.621941]  [<ffffffff8020be7b>] do_IRQ+0x143/0x168
> [  180.626913]  [<ffffffff80209b86>] ret_from_intr+0x0/0xf
> [  180.632131]  <EOI>  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
> [  180.638424]  [<ffffffff8028391c>] kfree+0x204/0x216
> [  180.643296]  [<ffffffff8028391f>] kfree+0x207/0x216
> [  180.648188]  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
> [  180.653829]  [<ffffffff804bb02c>] kfree_skbmem+0x11/0x7e
> [  180.659145]  [<ffffffff804bb13d>] __kfree_skb+0xa4/0xa9
> [  180.664365]  [<ffffffff804bb16c>] kfree_skb+0x2a/0x2c
> [  180.669412]  [<ffffffff804bd07d>] skb_free_datagram+0xc/0xe
> [  180.674985]  [<ffffffff804f5649>] udp_recvmsg+0x1e1/0x25f
> [  180.680388]  [<ffffffff804b725e>] sock_common_recvmsg+0x32/0x47
> [  180.686322]  [<ffffffff804b5d7d>] sock_recvmsg+0xe5/0xfe
> [  180.691629]  [<ffffffff8024e84b>] __lock_acquire+0xb4a/0xbde
> [  180.697299]  [<ffffffff802464bd>] autoremove_wake_function+0x0/0x38
> [  180.703562]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
> [  180.709825]  [<ffffffff804bb095>] kfree_skbmem+0x7a/0x7e
> [  180.715148]  [<ffffffff80209be0>] restore_args+0x0/0x30
> [  180.720381]  [<ffffffff80338e30>] copy_user_generic_string+0x0/0x40
> [  180.726652]  [<ffffffff804b6b64>] sys_recvfrom+0xce/0x122
> [  180.732045]  [<ffffffff8024d864>] trace_hardirqs_on+0x125/0x150
> [  180.737964]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
> [  180.744222]  [<ffffffff8020963e>] system_call+0x7e/0x83
> [  180.749457] 
> [  190.731184] BUG: soft lockup detected on CPU#0!
> [  190.735704] 
> [  190.735705] Call Trace:
> [  190.739661]  <IRQ>  [<ffffffff8025d788>] softlockup_tick+0xd4/0xe9
> [  190.745886]  [<ffffffff8023bc57>] run_local_timers+0x13/0x15
> [  190.751555]  [<ffffffff8023bf37>] update_process_times+0x4c/0x78
> [  190.751561]  [<ffffffff802181d0>] smp_local_timer_interrupt+0x34/0x51
> [  190.751566]  [<ffffffff80218966>] smp_apic_timer_interrupt+0x43/0x5d
> [  190.751571]  [<ffffffff8020a31b>] apic_timer_interrupt+0x6b/0x70
> [  190.751577]  [<ffffffff80533e2e>] _spin_unlock_irqrestore+0x3f/0x69
> [  190.751582]  [<ffffffff80533e31>] _spin_unlock_irqrestore+0x42/0x69
> [  190.751587]  [<ffffffff804ba219>] skb_queue_tail+0x41/0x49
> [  190.751590]  [<ffffffff804b990e>] sock_queue_rcv_skb+0xe2/0x11c
> [  190.751595]  [<ffffffff804f440e>] udp_queue_rcv_skb+0xb6/0x180
> [  190.751599]  [<ffffffff804f492e>] __udp4_lib_rcv+0x456/0x594
> [  190.751603]  [<ffffffff804f4a7e>] udp_rcv+0x12/0x14
> [  190.751609]  [<ffffffff804d5572>] ip_local_deliver+0xca/0x171
> [  190.751613]  [<ffffffff804d5b28>] ip_rcv+0x424/0x464
> [  190.751617]  [<ffffffff804bf912>] netif_receive_skb+0x25a/0x30f
> [  190.751621]  [<ffffffff804c182c>] process_backlog+0xab/0x10f
> [  190.751625]  [<ffffffff804c1acc>] net_rx_action+0xbb/0x254
> [  190.751630]  [<ffffffff80237cb5>] __do_softirq+0x5e/0xe2
> [  190.751634]  [<ffffffff8020a86c>] call_softirq+0x1c/0x28
> [  190.751638]  [<ffffffff8020bcd2>] do_softirq+0x39/0x9f
> [  190.751642]  [<ffffffff80237c4b>] irq_exit+0x45/0x51
> [  190.751645]  [<ffffffff8020be7b>] do_IRQ+0x143/0x168
> [  190.751649]  [<ffffffff80209b86>] ret_from_intr+0x0/0xf
> [  190.751652]  <EOI>  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
> [  190.751660]  [<ffffffff8028391c>] kfree+0x204/0x216
> [  190.751664]  [<ffffffff8028391f>] kfree+0x207/0x216
> [  190.751669]  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
> [  190.751672]  [<ffffffff804bb02c>] kfree_skbmem+0x11/0x7e
> [  190.751676]  [<ffffffff804bb13d>] __kfree_skb+0xa4/0xa9
> [  190.751680]  [<ffffffff804bb16c>] kfree_skb+0x2a/0x2c
> [  190.751684]  [<ffffffff804bd07d>] skb_free_datagram+0xc/0xe
> [  190.751688]  [<ffffffff804f5649>] udp_recvmsg+0x1e1/0x25f
> [  190.751695]  [<ffffffff804b725e>] sock_common_recvmsg+0x32/0x47
> [  190.751699]  [<ffffffff804b5d7d>] sock_recvmsg+0xe5/0xfe
> [  190.751703]  [<ffffffff8024e84b>] __lock_acquire+0xb4a/0xbde
> [  190.751709]  [<ffffffff802464bd>] autoremove_wake_function+0x0/0x38
> [  190.751713]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
> [  190.751717]  [<ffffffff804bb095>] kfree_skbmem+0x7a/0x7e
> [  190.751721]  [<ffffffff80209be0>] restore_args+0x0/0x30
> [  190.751727]  [<ffffffff80338e30>] copy_user_generic_string+0x0/0x40
> [  190.751731]  [<ffffffff804b6b64>] sys_recvfrom+0xce/0x122
> [  190.751736]  [<ffffffff8024d864>] trace_hardirqs_on+0x125/0x150
> [  190.751740]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
> [  190.751745]  [<ffffffff8020963e>] system_call+0x7e/0x83
> [  190.751749] 
> [  200.738822] BUG: soft lockup detected on CPU#0!
> [  200.743364] 
> [  200.743364] Call Trace:
> [  200.747304]  <IRQ>  [<ffffffff8025d788>] softlockup_tick+0xd4/0xe9
> [  200.753531]  [<ffffffff8023bc57>] run_local_timers+0x13/0x15
> [  200.759188]  [<ffffffff8023bf37>] update_process_times+0x4c/0x78
> [  200.765189]  [<ffffffff802181d0>] smp_local_timer_interrupt+0x34/0x51
> [  200.771634]  [<ffffffff80218966>] smp_apic_timer_interrupt+0x43/0x5d
> [  200.777987]  [<ffffffff8020a31b>] apic_timer_interrupt+0x6b/0x70
> [  200.783997]  [<ffffffff80533e2e>] _spin_unlock_irqrestore+0x3f/0x69
> [  200.790261]  [<ffffffff80533e31>] _spin_unlock_irqrestore+0x42/0x69
> [  200.796519]  [<ffffffff804ba219>] skb_queue_tail+0x41/0x49
> [  200.802004]  [<ffffffff804b990e>] sock_queue_rcv_skb+0xe2/0x11c
> [  200.807925]  [<ffffffff804f440e>] udp_queue_rcv_skb+0xb6/0x180
> [  200.813757]  [<ffffffff804f492e>] __udp4_lib_rcv+0x456/0x594
> [  200.819401]  [<ffffffff804f4a7e>] udp_rcv+0x12/0x14
> [  200.824294]  [<ffffffff804d5572>] ip_local_deliver+0xca/0x171
> [  200.830030]  [<ffffffff804d5b28>] ip_rcv+0x424/0x464
> [  200.835003]  [<ffffffff804bf912>] netif_receive_skb+0x25a/0x30f
> [  200.840919]  [<ffffffff804c182c>] process_backlog+0xab/0x10f
> [  200.846582]  [<ffffffff804c1acc>] net_rx_action+0xbb/0x254
> [  200.852070]  [<ffffffff80237cb5>] __do_softirq+0x5e/0xe2
> [  200.857375]  [<ffffffff8020a86c>] call_softirq+0x1c/0x28
> [  200.862680]  [<ffffffff8020bcd2>] do_softirq+0x39/0x9f
> [  200.867830]  [<ffffffff80237c4b>] irq_exit+0x45/0x51
> [  200.872788]  [<ffffffff8020be7b>] do_IRQ+0x143/0x168
> [  200.877748]  [<ffffffff80209b86>] ret_from_intr+0x0/0xf
> [  200.882977]  <EOI>  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
> [  200.889287]  [<ffffffff8028391c>] kfree+0x204/0x216
> [  200.894168]  [<ffffffff8028391f>] kfree+0x207/0x216
> [  200.899042]  [<ffffffff804bb256>] skb_release_data+0x95/0x9a
> [  200.904685]  [<ffffffff804bb02c>] kfree_skbmem+0x11/0x7e
> [  200.909999]  [<ffffffff804bb13d>] __kfree_skb+0xa4/0xa9
> [  200.915220]  [<ffffffff804bb16c>] kfree_skb+0x2a/0x2c
> [  200.920279]  [<ffffffff804bd07d>] skb_free_datagram+0xc/0xe
> [  200.925844]  [<ffffffff804f5649>] udp_recvmsg+0x1e1/0x25f
> [  200.931249]  [<ffffffff804b725e>] sock_common_recvmsg+0x32/0x47
> [  200.937183]  [<ffffffff804b5d7d>] sock_recvmsg+0xe5/0xfe
> [  200.942490]  [<ffffffff8024e84b>] __lock_acquire+0xb4a/0xbde
> [  200.948150]  [<ffffffff802464bd>] autoremove_wake_function+0x0/0x38
> [  200.954417]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
> [  200.960670]  [<ffffffff804bb095>] kfree_skbmem+0x7a/0x7e
> [  200.965984]  [<ffffffff80209be0>] restore_args+0x0/0x30
> [  200.971215]  [<ffffffff80338e30>] copy_user_generic_string+0x0/0x40
> [  200.977480]  [<ffffffff804b6b64>] sys_recvfrom+0xce/0x122
> [  200.982881]  [<ffffffff8024d864>] trace_hardirqs_on+0x125/0x150
> [  200.988791]  [<ffffffff805333ad>] trace_hardirqs_on_thunk+0x35/0x37
> [  200.995059]  [<ffffffff8020963e>] system_call+0x7e/0x83
>
> 2.6.18-5-xen
> ==================================================
>
> BUG: soft lockup detected on CPU#0!
>
> Call Trace:
>  <IRQ>  [<ffffffff802a76d8>] softlockup_tick+0xdb/0xed
>  [<ffffffff8026ba8c>] timer_interrupt+0x396/0x3f2
>  [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
>  [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
>  [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
>  [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
>  [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
>  [<ffffffff8847b58b>] :ixgbe:ixgbe_clean_rx_irq+0x16b/0xa40
>  [<ffffffff8847abb7>] :ixgbe:ixgbe_clean_tx_irq+0xe7/0x330
>  [<ffffffff8847bf4d>] :ixgbe:ixgbe_intr+0xed/0x220
>  [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
>  [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
>  [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
>  [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
>  [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
>  [<ffffffff80226aac>] tcp_v4_rcv+0x6bc/0xa59
>  [<ffffffff802268e9>] tcp_v4_rcv+0x4f9/0xa59
>  [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
>  [<ffffffff8023463b>] ip_local_deliver+0x19d/0x263
>  [<ffffffff8023591a>] ip_rcv+0x49c/0x4df
>  [<ffffffff8021fe67>] netif_receive_skb+0x361/0x3df
>  [<ffffffff802307e3>] process_backlog+0x96/0xff
>  [<ffffffff8020c5ca>] net_rx_action+0xa8/0x1b4
>  [<ffffffff80211ed3>] __do_softirq+0x62/0xdd
>  [<ffffffff8025d0cc>] call_softirq+0x1c/0x28
>  <EOI>  [<ffffffff80269b59>] do_softirq+0x31/0x98
>  [<ffffffff8028714b>] local_bh_enable_ip+0x48/0x59
>  [<ffffffff8021cde4>] tcp_recvmsg+0x9db/0xaf0
>  [<ffffffff80231b2b>] sock_common_recvmsg+0x2d/0x42
>  [<ffffffff8023066b>] sock_recvmsg+0x101/0x120
>  [<ffffffff8029448d>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff80246001>] sprintf+0x51/0x59
>  [<ffffffff8022fcfe>] dev_queue_xmit+0x2f2/0x313
>  [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
>  [<ffffffff8023356b>] lock_sock+0xa7/0xb2
>  [<ffffffff80260711>] _spin_lock_bh+0x9/0x14
>  [<ffffffff80230e57>] release_sock+0x13/0xaa
>  [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
>  [<ffffffff8023356b>] lock_sock+0xa7/0xb2
>  [<ffffffff8023af7b>] d_instantiate+0x4a/0x82
>  [<ffffffff802408aa>] d_rehash+0x21/0x34
>  [<ffffffff8022b16f>] sys_recvfrom+0xd4/0x137
>  [<ffffffff802a67f9>] audit_syscall_exit+0x2b/0x2ec
>  [<ffffffff8025c621>] tracesys+0xa7/0xb2
>
> BUG: soft lockup detected on CPU#0!
>
> Call Trace:
>  <IRQ>  [<ffffffff802a76d8>] softlockup_tick+0xdb/0xed
>  [<ffffffff8026ba8c>] timer_interrupt+0x396/0x3f2
>  [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
>  [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
>  [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
>  [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
>  [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
>  [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
>  [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
>  [<ffffffff8038d649>] force_evtchn_callback+0xa/0xb
>  [<ffffffff8020ab34>] kmem_cache_alloc+0x62/0x6d
>  [<ffffffff802350a4>] alloc_skb_from_cache+0x52/0x13c
>  [<ffffffff803f183f>] __netdev_alloc_skb+0x12/0x2d
>  [<ffffffff8847b31b>] :ixgbe:ixgbe_alloc_rx_buffers+0xfb/0x200
>  [<ffffffff8847bcd2>] :ixgbe:ixgbe_clean_rx_irq+0x8b2/0xa40
>  [<ffffffff8847aac4>] :ixgbe:ixgbe_unmap_and_free_tx_resource+0x44/0x50
>  [<ffffffff8847bf4d>] :ixgbe:ixgbe_intr+0xed/0x220
>  [<ffffffff80210a87>] handle_IRQ_event+0x2d/0x60
>  [<ffffffff802a7a17>] __do_IRQ+0xa4/0x105
>  [<ffffffff802699cf>] do_IRQ+0xe7/0xf5
>  [<ffffffff80270394>] xen_send_IPI_mask+0xa5/0xaa
>  [<ffffffff8038dede>] evtchn_do_upcall+0x86/0xe0
>  [<ffffffff8026fa43>] smp_send_reschedule+0x4b/0x50
>  [<ffffffff8025cbfe>] do_hypervisor_callback+0x1e/0x2c
>  [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
>  [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
>  [<ffffffff8023dcca>] lock_timer_base+0x1b/0x3c
>  [<ffffffff8038d649>] force_evtchn_callback+0xa/0xb
>  [<ffffffff8021bf22>] __mod_timer+0xb0/0xbe
>  [<ffffffff80253b8d>] sk_reset_timer+0xf/0x19
>  [<ffffffff80226a6f>] tcp_v4_rcv+0x67f/0xa59
>  [<ffffffff8025e499>] __sched_text_start+0x2d9/0x97b
>  [<ffffffff8023463b>] ip_local_deliver+0x19d/0x263
>  [<ffffffff8023591a>] ip_rcv+0x49c/0x4df
>  [<ffffffff8021fe67>] netif_receive_skb+0x361/0x3df
>  [<ffffffff802307e3>] process_backlog+0x96/0xff
>  [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
>  [<ffffffff8020c5ca>] net_rx_action+0xa8/0x1b4
>  [<ffffffff80211ed3>] __do_softirq+0x62/0xdd
>  [<ffffffff8025d0cc>] call_softirq+0x1c/0x28
>  <EOI>  [<ffffffff80269b59>] do_softirq+0x31/0x98
>  [<ffffffff8022b2e6>] local_bh_enable+0x93/0xa4
>  [<ffffffff8025308f>] __tcp_checksum_complete_user+0x19/0x2c
>  [<ffffffff8021aeb2>] tcp_rcv_established+0x45c/0x917
>  [<ffffffff8023b449>] tcp_v4_do_rcv+0x2a/0x300
>  [<ffffffff80230e98>] release_sock+0x54/0xaa
>  [<ffffffff8021cde4>] tcp_recvmsg+0x9db/0xaf0
>  [<ffffffff80231b2b>] sock_common_recvmsg+0x2d/0x42
>  [<ffffffff8023066b>] sock_recvmsg+0x101/0x120
>  [<ffffffff8029448d>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff80246001>] sprintf+0x51/0x59
>  [<ffffffff8022fcfe>] dev_queue_xmit+0x2f2/0x313
>  [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
>  [<ffffffff8023356b>] lock_sock+0xa7/0xb2
>  [<ffffffff80260711>] _spin_lock_bh+0x9/0x14
>  [<ffffffff80230e57>] release_sock+0x13/0xaa
>  [<ffffffff8022b25c>] local_bh_enable+0x9/0xa4
>  [<ffffffff8023356b>] lock_sock+0xa7/0xb2
>  [<ffffffff8023af7b>] d_instantiate+0x4a/0x82
>  [<ffffffff802408aa>] d_rehash+0x21/0x34
>  [<ffffffff8022b16f>] sys_recvfrom+0xd4/0x137
>  [<ffffffff803ed57c>] move_addr_to_user+0x5d/0x78
>  [<ffffffff803ed641>] sys_getpeername+0x9f/0xb5
>  [<ffffffff8025c621>] tracesys+0xa7/0xb2
>
>
>   


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-27 23:45         ` Matheos Worku
@ 2008-03-28  0:02           ` David Miller
  2008-03-28  0:19             ` Matheos Worku
  0 siblings, 1 reply; 36+ messages in thread
From: David Miller @ 2008-03-28  0:02 UTC (permalink / raw)
  To: Matheos.Worku; +Cc: jesse.brandeburg, jarkao2, netdev

From: Matheos Worku <Matheos.Worku@Sun.COM>
Date: Thu, 27 Mar 2008 16:45:06 -0700

> Brandeburg, Jesse wrote:
> > Jarek Poplawski wrote:
> >   
> >> On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
> >> ...
> >>     
> >>> nsn57-110 login: BUG: soft lockup - CPU#2 stuck for 11s! ... Call
> >>> Trace: [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
> >>> [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
> >>> [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
> >>> [<ffffffff80221995>] gart_map_single+0x0/0x70
> >>> [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246 ...
> >>>       
> >> Maybe I'm wrong with this again, but I wonder about this
> >> gart_map_single on almost all traces, and probably not supposed to be
> >> seen here. Did you try with some memory re-config/debugging?
> >>     
> >
> > I have some more examples of this but with the ixgbe driver.  We are
> > running heavy bidirectional stress with multiple rx (non-napi, yeah I
> > know) interrupts by default (and userspace irqbalance is probably on,
> > I'll have the lab try it without)
> >   
> 
> I have seen the lockup on kernels 2.6.18 and newer mostly on TX traffic. 
> I have seen it on another 10G driver (off the tree niu driver sibling, 
> nxge).  The nxge driver doesn't use any TX interrupts and I have seen it 
> with UDP TX, irqbalance disabled, with no irq activity at all.  some 
> example traces included.

Interesting.

Are you running uperf in a way such that there are multiple
processors doing TX's in parallel?  That might be a clue.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  0:02           ` David Miller
@ 2008-03-28  0:19             ` Matheos Worku
  2008-03-28  0:34               ` David Miller
  0 siblings, 1 reply; 36+ messages in thread
From: Matheos Worku @ 2008-03-28  0:19 UTC (permalink / raw)
  To: David Miller; +Cc: jesse.brandeburg, jarkao2, netdev

David Miller wrote:
> From: Matheos Worku <Matheos.Worku@Sun.COM>
> Date: Thu, 27 Mar 2008 16:45:06 -0700
>
>   
>> Brandeburg, Jesse wrote:
>>     
>>> Jarek Poplawski wrote:
>>>   
>>>       
>>>> On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
>>>> ...
>>>>     
>>>>         
>>>>> nsn57-110 login: BUG: soft lockup - CPU#2 stuck for 11s! ... Call
>>>>> Trace: [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
>>>>> [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
>>>>> [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
>>>>> [<ffffffff80221995>] gart_map_single+0x0/0x70
>>>>> [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246 ...
>>>>>       
>>>>>           
>>>> Maybe I'm wrong with this again, but I wonder about this
>>>> gart_map_single on almost all traces, and probably not supposed to be
>>>> seen here. Did you try with some memory re-config/debugging?
>>>>     
>>>>         
>>> I have some more examples of this but with the ixgbe driver.  We are
>>> running heavy bidirectional stress with multiple rx (non-napi, yeah I
>>> know) interrupts by default (and userspace irqbalance is probably on,
>>> I'll have the lab try it without)
>>>   
>>>       
>> I have seen the lockup on kernels 2.6.18 and newer mostly on TX traffic. 
>> I have seen it on another 10G driver (off the tree niu driver sibling, 
>> nxge).  The nxge driver doesn't use any TX interrupts and I have seen it 
>> with UDP TX, irqbalance disabled, with no irq activity at all.  some 
>> example traces included.
>>     
>
> Interesting.
>
> Are you running uperf in a way such that there are multiple
> processors doing TX's in parallel?  That might be a clue.
>   
Dave,
Actually I am running a version of the nxge driver which uses only one 
TX ring, with no LLTX enabled, so the driver does single-threaded TX. On 
the other hand, uperf (or iperf, netperf) is running multiple TX 
connections in parallel, and the connections are bound to multiple 
processors, hence they run in parallel.

Regards
Matheos


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  0:19             ` Matheos Worku
@ 2008-03-28  0:34               ` David Miller
  2008-03-28  1:22                 ` Herbert Xu
  0 siblings, 1 reply; 36+ messages in thread
From: David Miller @ 2008-03-28  0:34 UTC (permalink / raw)
  To: Matheos.Worku; +Cc: jesse.brandeburg, jarkao2, netdev, herbert, hadi

From: Matheos Worku <Matheos.Worku@Sun.COM>
Date: Thu, 27 Mar 2008 17:19:42 -0700

> Actually I am running a version of the nxge driver which uses only one 
> TX ring, no LLTX enabled so the driver does single threaded TX.

Ok.

> On the other hand, uperf (or iperf, netperf ) is running multiple TX
> connections in parallel and the connections are bound on multiple
> processors, hence they are running in parallel.

Yes, this is what I was interested in.

I think I know what's wrong.

If one cpu gets into the "qdisc remove, give to device" loop, other
cpus will simply add to the qdisc and that first cpu will do all
of the TX processing.

This helps performance, but in your case it is clear that if the
device is fast enough and there are enough other cpus generating TX
traffic, it is quite trivial to get a cpu wedged there and never exit.

The code in question is net/sched/sch_generic.c:__qdisc_run(); it just
loops there until the device TX queue fills up or there are no more
packets in the qdisc queue.

qdisc_run() (in include/linux/pkt_sched.h) sets
__LINK_STATE_QDISC_RUNNING to tell other cpus that there is a cpu
processing the queue inside of __qdisc_run().
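
For reference, the wrapper being described looks roughly like this (a
paraphrased sketch of the 2.6.24-era helper, not a verbatim copy):

	/* Only one cpu wins the test_and_set_bit(); the others just leave
	 * their skb on the qdisc and return. */
	static inline void qdisc_run(struct net_device *dev)
	{
		if (!netif_queue_stopped(dev) &&
		    !test_and_set_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
			__qdisc_run(dev);
	}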

net/core/dev.c:dev_queue_xmit() then goes:

	if (q->enqueue) {
		/* Grab device queue */
		spin_lock(&dev->queue_lock);
		q = dev->qdisc;
		if (q->enqueue) {
			/* reset queue_mapping to zero */
			skb_set_queue_mapping(skb, 0);
			rc = q->enqueue(skb, q);
			qdisc_run(dev);
			spin_unlock(&dev->queue_lock);

			rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
			goto out;
		}
		spin_unlock(&dev->queue_lock);
	}

The first cpu will get into __qdisc_run(), but the other
ones will just q->enqueue() and exit since the first cpu
has indicated it is processing the qdisc.

I'm not sure how we should fix this at the moment; we want
to keep the behavior, but on the other hand we need to
break out of this so we don't get stuck here for too long.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  0:34               ` David Miller
@ 2008-03-28  1:22                 ` Herbert Xu
  2008-03-28  1:38                   ` David Miller
  2008-03-28  1:58                   ` Matheos Worku
  0 siblings, 2 replies; 36+ messages in thread
From: Herbert Xu @ 2008-03-28  1:22 UTC (permalink / raw)
  To: David Miller; +Cc: Matheos.Worku, jesse.brandeburg, jarkao2, netdev, hadi

On Thu, Mar 27, 2008 at 05:34:18PM -0700, David Miller wrote:
>
> The first cpu will get into __qdisc_run(), but the other
> ones will just q->enqueue() and exit since the first cpu
> has indicated it is processing the qdisc.

Indeed.  Since I created the problem it's only fair that I get
to fix it too :)

[NET]: Add preemption point in qdisc_run

The qdisc_run loop is currently unbounded and runs entirely
in a softirq.  This is bad as it may create an unbounded softirq
run.

This patch fixes this by calling need_resched and breaking out
if necessary.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 10b5c08..80d53dd 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -187,6 +187,10 @@ void __qdisc_run(struct net_device *dev)
 	do {
 		if (!qdisc_restart(dev))
 			break;
+		if (need_resched()) {
+			netif_schedule(dev);
+			break;
+		}
 	} while (!netif_queue_stopped(dev));
 
 	clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  1:22                 ` Herbert Xu
@ 2008-03-28  1:38                   ` David Miller
  2008-03-28 10:29                     ` Herbert Xu
  2008-03-28  1:58                   ` Matheos Worku
  1 sibling, 1 reply; 36+ messages in thread
From: David Miller @ 2008-03-28  1:38 UTC (permalink / raw)
  To: herbert; +Cc: Matheos.Worku, jesse.brandeburg, jarkao2, netdev, hadi

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 28 Mar 2008 09:22:34 +0800

> @@ -187,6 +187,10 @@ void __qdisc_run(struct net_device *dev)
>  	do {
>  		if (!qdisc_restart(dev))
>  			break;
> +		if (need_resched()) {
> +			netif_schedule(dev);
> +			break;
> +		}
>  	} while (!netif_queue_stopped(dev));
>  
>  	clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);

This runs from softirqs; the local thread's scheduling
state is updated from timers, which also run from softirqs,
so this need_resched() test won't work.

Probably you'll need something similar to the ->poll()
NAPI breakout logic, which uses 2 jiffies as the breakout
point.  (It uses 2 because we could be very close to
transitioning to the next jiffy and thus don't want
to break out prematurely in that case.)
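
For readers without the NAPI code at hand, the breakout pattern being
referred to is roughly the following (an illustrative sketch; more_work(),
do_unit_of_work() and reschedule_later() are placeholder names, not real
kernel functions):

	unsigned long time_limit = jiffies + 2;	/* allow ~1-2 ticks of work */

	while (more_work()) {
		do_unit_of_work();
		if (time_after(jiffies, time_limit)) {
			reschedule_later();	/* defer the rest to a later run */
			break;
		}
	}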

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  1:22                 ` Herbert Xu
  2008-03-28  1:38                   ` David Miller
@ 2008-03-28  1:58                   ` Matheos Worku
  2008-03-28 10:33                     ` jamal
  2008-03-28 10:38                     ` Herbert Xu
  1 sibling, 2 replies; 36+ messages in thread
From: Matheos Worku @ 2008-03-28  1:58 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, jesse.brandeburg, jarkao2, netdev, hadi

Herbert Xu wrote:
> On Thu, Mar 27, 2008 at 05:34:18PM -0700, David Miller wrote:
>   
>> The first cpu will get into __qdisc_run(), but the other
>> ones will just q->enqueue() and exit since the first cpu
>> has indicated it is processing the qdisc.
>>     
>
> Indeed.  Since I created the problem it's only fair that I get
> to fix it too :)
>
> [NET]: Add preemption point in qdisc_run
>
> The qdisc_run loop is currently unbounded and runs entirely
> in a softirq.  This is bad as it may create an unbounded softirq
> run.
>
> This patch fixes this by calling need_resched and breaking out
> if necessary.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Thanks,
>   
Great.

In general, while the TX serialization improves performance in terms of 
lock contention, wouldn't it reduce throughput since only one guy is 
doing the actual TX at any given time?  Wondering if it would be 
worthwhile to have an enable/disable option, especially for multi-queue TX.

Regards
Matheos


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  1:38                   ` David Miller
@ 2008-03-28 10:29                     ` Herbert Xu
  2008-03-28 10:56                       ` Ingo Molnar
  0 siblings, 1 reply; 36+ messages in thread
From: Herbert Xu @ 2008-03-28 10:29 UTC (permalink / raw)
  To: David Miller
  Cc: Matheos.Worku, jesse.brandeburg, jarkao2, netdev, hadi,
	Ingo Molnar

On Thu, Mar 27, 2008 at 06:38:44PM -0700, David Miller wrote:
>
> This runs from softirqs, the local thread's scheduling
> state is updated from timers which also run from softirqs,
> so this need_resched() test won't work.

I had a trawl through the scheduler/timer code and it appears
that even with softirqs disabled we should be able to set the flag
through this call chain (on x86-32):

timer_interrupt => do_timer_interrupt_hook => tick_handle_periodic =>
tick_periodic => update_process_times => scheduler_tick

Ingo, could you confirm that the scheduler is capable of setting
need_resched even with BH disabled?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  1:58                   ` Matheos Worku
@ 2008-03-28 10:33                     ` jamal
  2008-03-28 17:00                       ` Matheos Worku
  2008-03-28 10:38                     ` Herbert Xu
  1 sibling, 1 reply; 36+ messages in thread
From: jamal @ 2008-03-28 10:33 UTC (permalink / raw)
  To: Matheos Worku; +Cc: Herbert Xu, David Miller, jesse.brandeburg, jarkao2, netdev

On Thu, 2008-27-03 at 18:58 -0700, Matheos Worku wrote:

> In general, while the TX serialization  improves performance in terms to 
> lock contention, wouldn't it reduce throughput since only one guy is 
> doing the actual TX at any given time.  Wondering if it would be 
> worthwhile to have an  enable/disable option specially for multi queue TX.

Empirical evidence so far says at some point the bottleneck is going to
be the wire, i.e. modern CPUs are "fast enough" that sooner rather than
later they will fill up the DMA ring of the transmitting driver and go
back to doing other things.
It is hard to create the condition you seem to have come across. I had
access to a dual-core Opteron but found it very hard with parallel UDP
sessions to keep the TX CPU locked in that region (while the other 3
were busy pumping packets). My folly could have been that I had a GigE
wire, and maybe a 10G would have recreated the condition.
If you can reproduce this at will, can you try to reduce the number of
sending TX u/iperfs and see when it begins to happen?
Are all the iperfs destined out of the same netdevice?

[Typically the TX path on the driver side is inefficient either because
of coding (ex: unnecessary locks) or expensive IO. But this has not
mattered much thus far (given fast enough CPUs).
It all could be improved by reducing the per-packet operations the
driver incurs - as an example, the CPU (to the driver) could batch a
set of packets to the device, then kick the device DMA once for the
batch, etc.]

cheers,
jamal


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28  1:58                   ` Matheos Worku
  2008-03-28 10:33                     ` jamal
@ 2008-03-28 10:38                     ` Herbert Xu
  2008-03-28 13:38                       ` Jarek Poplawski
  1 sibling, 1 reply; 36+ messages in thread
From: Herbert Xu @ 2008-03-28 10:38 UTC (permalink / raw)
  To: Matheos Worku; +Cc: David Miller, jesse.brandeburg, jarkao2, netdev, hadi

On Thu, Mar 27, 2008 at 06:58:18PM -0700, Matheos Worku wrote:
>
> In general, while the TX serialization  improves performance in terms to 
> lock contention, wouldn't it reduce throughput since only one guy is 
> doing the actual TX at any given time.  Wondering if it would be 
> worthwhile to have an  enable/disable option specially for multi queue TX.

Unless you have a truly lockless driver where two CPUs can transmit
at the same time, then regardless of what we do only one CPU can
transmit at a time.

Given that premise, we might as well let one CPU transmit as much
as possible, since moving to another CPU after each packet is going
to bounce a lot more than just the spin lock, and that is going to
be expensive.

Of course getting stuck in that loop forever when there is other
work to be done (such as running the watchdog :) is bad.  But
hopefully that's what this patch will fix.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 10:29                     ` Herbert Xu
@ 2008-03-28 10:56                       ` Ingo Molnar
  2008-03-28 11:06                         ` Herbert Xu
  0 siblings, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2008-03-28 10:56 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, Matheos.Worku, jesse.brandeburg, jarkao2, netdev,
	hadi


* Herbert Xu <herbert@gondor.apana.org.au> wrote:

> > This runs from softirqs, the local thread's scheduling state is 
> > updated from timers which also run from softirqs, so this 
> > need_resched() test won't work.
> 
> I had a trawl through the scheduler/timer code and it appears that 
> even with softirqs disabled we should able to set the flag through 
> this call chain (on x86-32):
> 
> timer_interrupt => do_timer_interrupt_hook => tick_handle_periodic => 
> tick_periodic => update_process_times => scheduler_tick
> 
> Ingo, could you confirm that the scheduler is capable of setting 
> need_resched even with BH disabled?

hm, what's the context of this discussion? The call chain looks ok; 
that's how we preempt tasks from the timer tick. But other code besides 
the scheduler shouldn't do this.

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 10:56                       ` Ingo Molnar
@ 2008-03-28 11:06                         ` Herbert Xu
  2008-03-28 11:29                           ` Herbert Xu
  2008-03-28 14:09                           ` Ingo Molnar
  0 siblings, 2 replies; 36+ messages in thread
From: Herbert Xu @ 2008-03-28 11:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Miller, Matheos.Worku, jesse.brandeburg, jarkao2, netdev,
	hadi

On Fri, Mar 28, 2008 at 11:56:29AM +0100, Ingo Molnar wrote:
>
> hm, what's the context of this discussion? The call chain looks ok, 
> that's how we preempt tasks from the timer tick. But other code besides 
> the scheduler shouldnt do this.

The code under discussion is __qdisc_run from net/sched/sch_generic.c.

It runs with BH off from either process context or softirq context.
As it is, it can keep running forever.  We were discussing adding
a need_resched check in there.  So the question is: would need_resched
ever get updated while BH is disabled?

Anyway, I've just realised that even if it does get updated, we still
need a better bound to avoid starving other local softirq events so
this is probably moot.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 11:06                         ` Herbert Xu
@ 2008-03-28 11:29                           ` Herbert Xu
  2008-03-28 12:19                             ` jamal
  2008-03-28 23:25                             ` David Miller
  2008-03-28 14:09                           ` Ingo Molnar
  1 sibling, 2 replies; 36+ messages in thread
From: Herbert Xu @ 2008-03-28 11:29 UTC (permalink / raw)
  To: David S. Miller
  Cc: Ingo Molnar, Matheos.Worku, jesse.brandeburg, jarkao2, netdev,
	hadi

On Fri, Mar 28, 2008 at 07:06:21PM +0800, Herbert Xu wrote:
>
> Anyway, I've just realised that even if it does get updated, we still
> need a better bound to avoid starving other local softirq events so
> this is probably moot.

OK, since we don't really have any good ways of balancing softirq
events with each other, I've taken Dave's suggestion of checking
the jiffies as is done with NAPI.  I've kept the need_resched to
minimise the scheduling latency.

[NET]: Add preemption point in qdisc_run

The qdisc_run loop is currently unbounded and runs entirely
in a softirq.  This is bad as it may create an unbounded softirq
run.

This patch fixes this by calling need_resched and breaking out
if necessary.

It also adds a breakout if the jiffies value changes, since that
would indicate we've been transmitting for too long, which starves
other softirqs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 10b5c08..b741618 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -184,10 +184,22 @@ static inline int qdisc_restart(struct net_device *dev)
 
 void __qdisc_run(struct net_device *dev)
 {
-	do {
-		if (!qdisc_restart(dev))
+	unsigned long start_time = jiffies;
+
+	while (qdisc_restart(dev)) {
+		if (netif_queue_stopped(dev))
+			break;
+
+		/*
+		 * Postpone processing if
+		 * 1. another process needs the CPU;
+		 * 2. we've been doing it for too long.
+		 */
+		if (need_resched() || jiffies != start_time) {
+			netif_schedule(dev);
 			break;
-	} while (!netif_queue_stopped(dev));
+		}
+	}
 
 	clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
 }

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 11:29                           ` Herbert Xu
@ 2008-03-28 12:19                             ` jamal
  2008-03-28 13:26                               ` Herbert Xu
  2008-03-28 23:25                             ` David Miller
  1 sibling, 1 reply; 36+ messages in thread
From: jamal @ 2008-03-28 12:19 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David S. Miller, Ingo Molnar, Matheos.Worku, jesse.brandeburg,
	jarkao2, netdev

On Fri, 2008-28-03 at 19:29 +0800, Herbert Xu wrote:

> OK, since we don't really have any good ways of balancing softirq
> events with each other, I've taken Dave's suggestion of checking
> the jiffies as is done with NAPI.  I've kept the need_resched to
> minimise the scheduling latency.

I think the need_resched check would be effective.

**** Here's something that has not yet resolved itself in my mind (even
for the NAPI case, which was inherited from code that has been there
forever). There's probably a very simple answer ;->
Say we have two machines: one faster, where we pick HZ = 1000, meaning
1ms per jiffy; the other slower, where we pick HZ = 100, i.e. a 10ms
jiffy. Logically, shouldn't the slower machine be allocated less time
running the loop, since it has fewer resources? In the above case it
would spend 10 times more.

cheers,
jamal


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 12:19                             ` jamal
@ 2008-03-28 13:26                               ` Herbert Xu
  2008-03-28 14:07                                 ` jamal
  2008-03-28 14:12                                 ` Ingo Molnar
  0 siblings, 2 replies; 36+ messages in thread
From: Herbert Xu @ 2008-03-28 13:26 UTC (permalink / raw)
  To: jamal
  Cc: David S. Miller, Ingo Molnar, Matheos.Worku, jesse.brandeburg,
	jarkao2, netdev

On Fri, Mar 28, 2008 at 08:19:19AM -0400, jamal wrote:
>
> **** Here's something that has not yet resolved itself in my mind (even
> for the NAPI case, which was inherited from code that has been there
> forever). There's probably a very simple answer ;->
> Say we have two machines: one faster, where we pick HZ = 1000, meaning
> 1ms per jiffy; the other slower, where we pick HZ = 100, i.e. a 10ms
> jiffy. Logically, shouldn't the slower machine be allocated less time
> running the loop, since it has fewer resources? In the above case it
> would spend 10 times more.

I agree that using jiffies is a pretty coarse approximation of
proper scheduling.  However, in the absence of a better solution
we have to live with it.
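
(Purely as an aside, here is a rough user-space sketch of what an
HZ-independent bound on such a drain loop could look like: a fixed
wall-clock budget instead of a jiffy edge.  The names drain_one_packet
and want_resched and the 1 ms figure are invented for this example;
this is not kernel code and not the applied patch.)

/*
 * Illustrative user-space analogy only -- not kernel code.  Break out of
 * the drain loop after a fixed wall-clock budget (1 ms here) regardless
 * of the machine's timer frequency, or when someone else wants the CPU.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define BUDGET_NS (1 * 1000 * 1000)     /* 1 ms, independent of HZ */

static volatile bool want_resched;      /* stands in for need_resched() */

static uint64_t now_ns(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* Pretend to transmit one packet; returns false once the queue is empty. */
static bool drain_one_packet(void)
{
        static int left = 1000000;

        return --left > 0;
}

static void bounded_drain(void)
{
        uint64_t start = now_ns();
        unsigned long sent = 0;

        while (drain_one_packet()) {
                sent++;
                /*
                 * Postpone the rest if someone else needs the CPU or the
                 * time budget is used up -- the same shape as the patch,
                 * but the bound is a fixed time, not a jiffy boundary.
                 */
                if (want_resched || now_ns() - start > BUDGET_NS)
                        break;
        }
        printf("drained %lu packets in this pass\n", sent);
}

int main(void)
{
        bounded_drain();
        return 0;
}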

Perhaps running these out of process context is the correct
approach.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 10:38                     ` Herbert Xu
@ 2008-03-28 13:38                       ` Jarek Poplawski
  2008-03-28 13:53                         ` Herbert Xu
  0 siblings, 1 reply; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-28 13:38 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Fri, Mar 28, 2008 at 06:38:09PM +0800, Herbert Xu wrote:
...
> Given that premise, we might as well let one CPU transmit as much
> as possible since moving to another CPU after each packet is going
> to bounce a lot more than just the spin lock and that is going to
> be expensive.

Considering this, I wonder why using this __LINK_STATE_QDISC_RUNNING
flag to control enqueuing as well would be a wrong idea? Wouldn't this
enforce pseudo affinity?

Regards,
Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 13:38                       ` Jarek Poplawski
@ 2008-03-28 13:53                         ` Herbert Xu
  2008-03-28 14:39                           ` Jarek Poplawski
  0 siblings, 1 reply; 36+ messages in thread
From: Herbert Xu @ 2008-03-28 13:53 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Fri, Mar 28, 2008 at 02:38:45PM +0100, Jarek Poplawski wrote:
> 
> Considering this, I wonder why using this __LINK_STATE_QDISC_RUNNING
> flag to control enqueuing as well would be a wrong idea? Wouldn't this
> enforce pseudo affinity?

Could you elaborate on how you intend to use it to control the
act of enqueueing?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 13:26                               ` Herbert Xu
@ 2008-03-28 14:07                                 ` jamal
  2008-03-28 14:12                                 ` Ingo Molnar
  1 sibling, 0 replies; 36+ messages in thread
From: jamal @ 2008-03-28 14:07 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David S. Miller, Ingo Molnar, Matheos.Worku, jesse.brandeburg,
	jarkao2, netdev

On Fri, 2008-28-03 at 21:26 +0800, Herbert Xu wrote:
> I agree that using jiffies is a pretty coarse approximation of
> proper scheduling.  However, in the absence of a better solution
> we have to live with it.

I know it works well enough and has for years; sure, you could do better,
but I wasn't questioning the solution - more than anything it was
curiosity on my part.

> Perhaps running these out of process context is the correct
> approach.

I saw something from Max Krasnyansky along those lines but haven't quite
followed it up. It should be interesting to see the effect over a large
range of workloads.

cheers,
jamal


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 11:06                         ` Herbert Xu
  2008-03-28 11:29                           ` Herbert Xu
@ 2008-03-28 14:09                           ` Ingo Molnar
  1 sibling, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2008-03-28 14:09 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, Matheos.Worku, jesse.brandeburg, jarkao2, netdev,
	hadi


* Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Fri, Mar 28, 2008 at 11:56:29AM +0100, Ingo Molnar wrote:
> >
> > hm, what's the context of this discussion? The call chain looks OK -
> > that's how we preempt tasks from the timer tick. But other code besides
> > the scheduler shouldn't do this.
> 
> The code under discussion is __qdisc_run from net/sched/sch_generic.c.
> 
> It runs with BH off from either process context or softirq context. As
> it is, it can keep running forever.  We were discussing adding a
> need_resched check in there.  So the question is: would need_resched
> ever get updated while BH is disabled?

yes, certainly - as long as the timer irq is allowed (i.e. hardirqs are
not disabled). The scheduler tick runs from hardirq context, not from
softirq context. (That's how we can preempt softirq threads on the -rt
kernel, for example.)

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 13:26                               ` Herbert Xu
  2008-03-28 14:07                                 ` jamal
@ 2008-03-28 14:12                                 ` Ingo Molnar
  1 sibling, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2008-03-28 14:12 UTC (permalink / raw)
  To: Herbert Xu
  Cc: jamal, David S. Miller, Matheos.Worku, jesse.brandeburg, jarkao2,
	netdev


* Herbert Xu <herbert@gondor.apana.org.au> wrote:

> I agree that using jiffies is a pretty coarse approximation of proper 
> scheduling.  However, in the absence of a better solution we have to 
> live with it.
> 
> Perhaps running these out of process context is the correct approach.

yes. Such anonymous work loops inside softirq context are a disaster for
TCP determinism and a disaster for scheduling in general (the wrong guy
gets credited with the overhead). Softirqs were a neat hack 10 years
ago; now, if we know the target task for some workload, we should execute
as much of the workload in that task's context as possible. (And even
for stuff where we don't have a 'target task' - routing, filtering, etc.
- it might be better to use kernel threads.)

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 13:53                         ` Herbert Xu
@ 2008-03-28 14:39                           ` Jarek Poplawski
  2008-03-28 14:56                             ` Herbert Xu
  0 siblings, 1 reply; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-28 14:39 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Fri, Mar 28, 2008 at 09:53:38PM +0800, Herbert Xu wrote:
> On Fri, Mar 28, 2008 at 02:38:45PM +0100, Jarek Poplawski wrote:
> > 
> > Considering this, I wonder why using this __LINK_STATE_QDISC_RUNNING
> > flag to control enqueuing as well would be a wrong idea? Wouldn't this
> > enforce pseudo affinity?
> 
> Could you elaborate on how you intend to use it to control the
> act of enqueueing?

Probably some backlog would be necessary, similarly to RX. The main
question is how much of a cache problem such enqueuing from other CPUs is.

Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 14:39                           ` Jarek Poplawski
@ 2008-03-28 14:56                             ` Herbert Xu
  2008-03-28 15:29                               ` Jarek Poplawski
  0 siblings, 1 reply; 36+ messages in thread
From: Herbert Xu @ 2008-03-28 14:56 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Fri, Mar 28, 2008 at 03:39:53PM +0100, Jarek Poplawski wrote:
> 
> Probably some backlog would be necessary, similarly to RX. The main

Sorry but I still don't understand how this would differ from
what we do now.  As it stands the packet would be queued for
later processing if another CPU is doing a qdisc run.

> question is how much of a cache problem such enqueuing from other CPUs is.

It's pretty bad actually.  In fact, multiple TX queues aren't
going to be useful until we address this bottleneck.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 14:56                             ` Herbert Xu
@ 2008-03-28 15:29                               ` Jarek Poplawski
  2008-03-28 15:47                                 ` Jarek Poplawski
  2008-03-29  1:06                                 ` Herbert Xu
  0 siblings, 2 replies; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-28 15:29 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Fri, Mar 28, 2008 at 10:56:34PM +0800, Herbert Xu wrote:
> On Fri, Mar 28, 2008 at 03:39:53PM +0100, Jarek Poplawski wrote:
> > 
> > Probably some backlog would be necessary, similarly to RX. The main
> 
> Sorry but I still don't understand how this would differ from
> what we do now.  As it stands the packet would be queued for
> later processing if another CPU is doing a qdisc run.

But during this, now limited, time of qdisc_run() there is a contention
for queue_lock and probably some additional cache updating because of
this other enqueuing, which could be delayed especially if queue length
is above some level.

Cheers,
Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 15:29                               ` Jarek Poplawski
@ 2008-03-28 15:47                                 ` Jarek Poplawski
  2008-03-29  1:06                                 ` Herbert Xu
  1 sibling, 0 replies; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-28 15:47 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Fri, Mar 28, 2008 at 04:29:53PM +0100, Jarek Poplawski wrote:
...
> But during this, now limited, time of qdisc_run() there is a contention
> for queue_lock and probably some additional cache updating because of
> this other enqueuing, which could be delayed especially if queue length
> is above some level.

...Hm... this would still need some special treatment for RT traffic...

Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 10:33                     ` jamal
@ 2008-03-28 17:00                       ` Matheos Worku
  0 siblings, 0 replies; 36+ messages in thread
From: Matheos Worku @ 2008-03-28 17:00 UTC (permalink / raw)
  To: hadi; +Cc: Herbert Xu, David Miller, jesse.brandeburg, jarkao2, netdev

jamal wrote:
> On Thu, 2008-27-03 at 18:58 -0700, Matheos Worku wrote:
>
>   
>> In general, while the TX serialization improves performance in terms of
>> lock contention, wouldn't it reduce throughput since only one guy is
>> doing the actual TX at any given time?  Wondering if it would be
>> worthwhile to have an enable/disable option, especially for multi-queue TX.
>>     
>
> Empirical evidence so far says that at some point the bottleneck is going
> to be the wire, i.e. modern CPUs are "fast enough" that sooner rather
> than later they will fill up the DMA ring of the transmitting driver and
> go back to doing other things.
>   

> It is hard to create the condition you seem to have come across. I had
> access to a dual-core Opteron but found it very hard, with parallel UDP
> sessions, to keep the TX CPU locked in that region (while the other 3
> were busy pumping packets). My folly could have been that I had a GigE
> wire, and maybe a 10G would have recreated the condition.
> If you can reproduce this at will, can you try to reduce the number of
> sending TX u/iperfs and see when it begins to happen?
> Are all the iperfs destined out of the same netdevice?
>   
I am using a 10G NIC at this time. With the same driver, I haven't come
across the lockup on a 1G NIC, though I haven't really tried to reproduce
it. Regarding the number of connections it takes to create the situation,
I have noticed the lockup with 3 or more UDP connections. Also, with TSO
disabled, I have come across it with lots of TCP connections.


> [Typically the TX path on the driver side is inefficient either because
> of coding (e.g. unnecessary locks) or expensive IO. But this has not
> mattered much thus far (given fast enough CPUs).
>   
That could be true, though oprofile is not providing obvious clues,
at least not yet.
> It all could be improved by reducing the per-packet operations the
> driver incurs - as an example, the CPU (via the driver) could batch a
> set of packets to the device, then kick the device DMA once for the
> whole batch, etc.]
>   
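
(For illustration of that batching point -- hypothetical names, not taken
from niu or any real driver -- the per-packet cost is typically the MMIO
doorbell write; posting several descriptors and kicking the device once
per burst amortises it.  A toy user-space sketch:)

/*
 * Hypothetical sketch of TX doorbell batching.  Instead of kicking the
 * device for every packet, descriptors are accumulated and the doorbell
 * is written once per burst.  tx_ring, post_descriptor and kick_doorbell
 * are invented names.
 */
#include <stdio.h>

#define RING_SIZE 256
#define BURST     16

struct tx_ring {
        int tail;               /* next free descriptor slot */
        int pending;            /* descriptors posted but not yet kicked */
        long doorbell_writes;   /* counts the simulated MMIO writes */
};

static void post_descriptor(struct tx_ring *r, int pkt)
{
        (void)pkt;              /* a real driver would fill the slot here */
        r->tail = (r->tail + 1) % RING_SIZE;
        r->pending++;
}

static void kick_doorbell(struct tx_ring *r)
{
        /* One (expensive) MMIO write tells the NIC about all pending work. */
        r->doorbell_writes++;
        r->pending = 0;
}

int main(void)
{
        struct tx_ring ring = { 0 };
        int pkt;

        for (pkt = 0; pkt < 1000; pkt++) {
                post_descriptor(&ring, pkt);
                /* Kick once per BURST packets instead of once per packet. */
                if (ring.pending == BURST)
                        kick_doorbell(&ring);
        }
        if (ring.pending)       /* flush the final partial burst */
                kick_doorbell(&ring);

        printf("1000 packets, %ld doorbell writes\n", ring.doorbell_writes);
        return 0;
}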
Regards
matheos

> cheers,
> jamal
>
>   


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 11:29                           ` Herbert Xu
  2008-03-28 12:19                             ` jamal
@ 2008-03-28 23:25                             ` David Miller
  1 sibling, 0 replies; 36+ messages in thread
From: David Miller @ 2008-03-28 23:25 UTC (permalink / raw)
  To: herbert; +Cc: mingo, Matheos.Worku, jesse.brandeburg, jarkao2, netdev, hadi

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 28 Mar 2008 19:29:28 +0800

> [NET]: Add preemption point in qdisc_run
> 
> The qdisc_run loop is currently unbounded and runs entirely
> in a softirq.  This is bad as it may create an unbounded softirq
> run.
> 
> This patch fixes this by calling need_resched and breaking out
> if necessary.
> 
> It also adds a break out if the jiffies value changes since that
> would indicate we've been transmitting for too long which starves
> other softirqs.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, and I'll queue this up for -stable, thanks!

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-28 15:29                               ` Jarek Poplawski
  2008-03-28 15:47                                 ` Jarek Poplawski
@ 2008-03-29  1:06                                 ` Herbert Xu
  2008-03-29  9:11                                   ` Jarek Poplawski
  1 sibling, 1 reply; 36+ messages in thread
From: Herbert Xu @ 2008-03-29  1:06 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Fri, Mar 28, 2008 at 04:29:53PM +0100, Jarek Poplawski wrote:
>
> But during this, now limited, time of qdisc_run() there is a contention
> for queue_lock and probably some additional cache updating because of
> this other enqueuing, which could be delayed especially if queue length
> is above some level.

You mean delaying into a per-cpu queue? That sounds interesting.
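
(Purely as illustration of this per-cpu idea -- invented names and
structure, nothing like this exists in the tree -- a rough user-space
sketch: each producer thread appends packets to its own small backlog
under its own lock, and the single "qdisc-running" side splices all the
backlogs and transmits them, so producers never contend on one global
queue lock.)

/*
 * Hypothetical user-space sketch of a per-CPU (here per-thread) TX
 * backlog.  Producers only touch their own backlog; the drain step
 * splices every backlog and "transmits" with no lock held.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NPROD    4
#define PER_PROD 100000

struct pkt {
        struct pkt *next;
        int payload;
};

struct backlog {
        pthread_mutex_t lock;
        struct pkt *head, *tail;
} backlogs[NPROD];

/* Producer: enqueue onto this thread's private backlog only. */
static void *producer(void *arg)
{
        struct backlog *b = &backlogs[(long)arg];
        int i;

        for (i = 0; i < PER_PROD; i++) {
                struct pkt *p = malloc(sizeof(*p));

                p->payload = i;
                p->next = NULL;
                pthread_mutex_lock(&b->lock);
                if (b->tail)
                        b->tail->next = p;
                else
                        b->head = p;
                b->tail = p;
                pthread_mutex_unlock(&b->lock);
        }
        return NULL;
}

/* "qdisc run": splice every backlog, then transmit with no lock held. */
static long drain_once(void)
{
        long sent = 0;
        int i;

        for (i = 0; i < NPROD; i++) {
                struct backlog *b = &backlogs[i];
                struct pkt *p;

                pthread_mutex_lock(&b->lock);
                p = b->head;
                b->head = b->tail = NULL;
                pthread_mutex_unlock(&b->lock);

                while (p) {
                        struct pkt *next = p->next;

                        free(p);        /* "transmit" */
                        p = next;
                        sent++;
                }
        }
        return sent;
}

int main(void)
{
        pthread_t tid[NPROD];
        long i;

        for (i = 0; i < NPROD; i++) {
                pthread_mutex_init(&backlogs[i].lock, NULL);
                pthread_create(&tid[i], NULL, producer, (void *)i);
        }
        for (i = 0; i < NPROD; i++)
                pthread_join(tid[i], NULL);
        printf("transmitted %ld packets\n", drain_once());
        return 0;
}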

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.24 BUG: soft lockup - CPU#X
  2008-03-29  1:06                                 ` Herbert Xu
@ 2008-03-29  9:11                                   ` Jarek Poplawski
  0 siblings, 0 replies; 36+ messages in thread
From: Jarek Poplawski @ 2008-03-29  9:11 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Matheos Worku, David Miller, jesse.brandeburg, netdev, hadi

On Sat, Mar 29, 2008 at 09:06:10AM +0800, Herbert Xu wrote:
> On Fri, Mar 28, 2008 at 04:29:53PM +0100, Jarek Poplawski wrote:
> >
> > But during this, now limited, time of qdisc_run() there is a contention
> > for queue_lock and probably some additional cache updating because of
> > this other enqueuing, which could be delayed especially if queue length
> > is above some level.
> 
> You mean delaying into a per-cpu queue? That sounds interesting.

I mean that some delaying could be necessary here. After rethinking it,
it seems to me this solution with the flag could be wrong even after the
current fix.

The owner of the flag has to give up queue_lock for some time, and
because of this its chances of regaining the lock are worse: other CPUs
could take it in a loop, winning the cache and adding packets, which
are immediately dumped (or requeued).

So, it would create a kind of reverse lockup situation. Then, even normal
contention for both locks seems safer against such races: throughput
could be worse, but probably without such a "(soft)lockup" risk.

Regards,
Jarek P.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2008-03-29  9:06 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-26 16:46 2.6.24 BUG: soft lockup - CPU#X Matheos Worku
2008-03-26 17:31 ` Rick Jones
2008-03-26 20:14 ` Jarek Poplawski
2008-03-26 20:26   ` Matheos Worku
2008-03-26 21:46     ` Jarek Poplawski
2008-03-26 21:53       ` Jarek Poplawski
2008-03-27 10:33     ` Jarek Poplawski
2008-03-27 23:18       ` Brandeburg, Jesse
2008-03-27 23:45         ` Matheos Worku
2008-03-28  0:02           ` David Miller
2008-03-28  0:19             ` Matheos Worku
2008-03-28  0:34               ` David Miller
2008-03-28  1:22                 ` Herbert Xu
2008-03-28  1:38                   ` David Miller
2008-03-28 10:29                     ` Herbert Xu
2008-03-28 10:56                       ` Ingo Molnar
2008-03-28 11:06                         ` Herbert Xu
2008-03-28 11:29                           ` Herbert Xu
2008-03-28 12:19                             ` jamal
2008-03-28 13:26                               ` Herbert Xu
2008-03-28 14:07                                 ` jamal
2008-03-28 14:12                                 ` Ingo Molnar
2008-03-28 23:25                             ` David Miller
2008-03-28 14:09                           ` Ingo Molnar
2008-03-28  1:58                   ` Matheos Worku
2008-03-28 10:33                     ` jamal
2008-03-28 17:00                       ` Matheos Worku
2008-03-28 10:38                     ` Herbert Xu
2008-03-28 13:38                       ` Jarek Poplawski
2008-03-28 13:53                         ` Herbert Xu
2008-03-28 14:39                           ` Jarek Poplawski
2008-03-28 14:56                             ` Herbert Xu
2008-03-28 15:29                               ` Jarek Poplawski
2008-03-28 15:47                                 ` Jarek Poplawski
2008-03-29  1:06                                 ` Herbert Xu
2008-03-29  9:11                                   ` Jarek Poplawski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).