From: Ingo Molnar <mingo@elte.hu>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Bill Fink <billfink@mindspring.com>,
	Linux Network Developers <netdev@vger.kernel.org>,
	brice@myri.com, gallatin@myri.com
Subject: Re: Receive side performance issue with multi-10-GigE and NUMA
Date: Wed, 26 Aug 2009 20:15:02 +0200
Message-ID: <20090826181502.GC13632@elte.hu>
In-Reply-To: <20090826180811.GB10816@hmsreliant.think-freely.org>


* Neil Horman <nhorman@tuxdriver.com> wrote:

> On Wed, Aug 26, 2009 at 07:00:13AM -0400, Neil Horman wrote:
> > On Wed, Aug 26, 2009 at 03:10:57AM -0400, Bill Fink wrote:
> > > On Fri, 21 Aug 2009, Neil Horman wrote:
> > > 
> > > > On Fri, Aug 21, 2009 at 12:14:21AM -0400, Bill Fink wrote:
> > > > > On Thu, 20 Aug 2009, Neil Horman wrote:
> > > > > 
> > > > > > On Thu, Aug 20, 2009 at 03:50:44AM -0400, Bill Fink wrote:
> > > > > > 
> > > > > > > When I tried an actual nuttcp performance test, even when rate limiting
> > > > > > > to just 1 Mbps, I immediately got a kernel oops.  I tried to get a
> > > > > > > crashdump via kexec/kdump, but the kexec kernel, instead of just
> > > > > > > generating a crashdump, fully booted the new kernel, which was
> > > > > > > extremely sluggish until I rebooted it through a BIOS re-init,
> > > > > > > and never produced a crashdump.  I tried this several times and
> > > > > > > an immediate kernel oops was always the result (with either a TCP
> > > > > > > or UDP test).  A ping test of 1000 9000-byte packets with an interval
> > > > > > > of 0.001 seconds (which is 72 Mbps for 1 second) on the other hand
> > > > > > > worked just fine.
> > > > > > 
> > > > > > The sluggishness is expected, since the kdump kernel operates out of such
> > > > > > > limited memory.  I don't know why you booted to a full system rather than doing
> > > > > > > a crash recovery.  I don't suppose you got a backtrace, did you?
> > > > > 
> > > > > There was a backtrace on the screen but I didn't have a chance to
> > > > > record it.  BTW did anyone ever think to print the backtrace in
> > > > > reverse (first to some reserved memory and then output to the display)
> > > > > so the more interesting parts wouldn't have scrolled off the top of
> > > > > the screen?
> > > > > 
> > > > The real solution is to use a console to which the output doesn't scroll off the
> > > > screen.  Normally people use a serial console they can log, or a RAC card that
> > > > they can record. Even on a regular vga monitor in text mode, you can set up the
> > > > vt iirc to allow for scrolling.
> > > 
> > > None of our Asus P6T6 systems have serial consoles.  I don't know of
> > > any RAC cards for them either, nor are there spare PCI slots available
> > > in many cases.  I wouldn't think the Shift-PageUp trick would work
> > > with a crashed kernel, but I admit I didn't try it.  I haven't checked
> > > out netconsole yet either, but I'm not sure it would help either in a
> > > case like this that was a network related kernel crash.
> > > 
> > Any USB ports that you can attach a serial dongle to?  That would work as well,
> > or, as previously mentioned, netconsole also does the trick.
> > 
> > > In any case, a simple kernel command line that would provide a reversed
> > > backtrace would be a simple thing to facilitate Linux users providing
> > > useful info to Linux kernel developers in helping to debug kernel
> > > problems.  The most useful info would still be on the screen, so it
> > > could be transcribed or a photo image of the screen could be taken.
> > > 
> > I understand what you're saying, I'm just saying there are currently several
> > options for you that have already solved this problem in different ways.
> > 
> > > Fortunately, in this specific case, the SuperMicro X8DAH+-F system
> > > does have a serial console, and after a fair amount of effort I was
> > > able to get it to work as desired, and was able to finally capture
> > > a backtrace of the kernel oops.  BTW I believe the reason the
> > > kexec/kdump didn't work was probably because it couldn't find
> > > a /proc/vmcore file, although I don't know why that would be,
> > > and the Fedora 10 /etc/init.d/kdump script will then just boot
> > > up normally if it fails to find the /proc/vmcore file (or it's
> > > zero size).
> > > 
> > I take care of kdump for fedora and RHEL.  If you file a bug on this, I'd be
> > happy to look into it further.
> > 
> > > The following shows a simple ping test usage of the skb_sources
> > > tracing feature:
> > > 
> > > [root@xeontest1 tracing]# numactl --membind=1 taskset -c 4 ping -c 5 -s 1472 192.168.1.10
> > > PING 192.168.1.10 (192.168.1.10) 1472(1500) bytes of data.
> > > 1480 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=0.139 ms
> > > 1480 bytes from 192.168.1.10: icmp_seq=2 ttl=64 time=0.182 ms
> > > 1480 bytes from 192.168.1.10: icmp_seq=3 ttl=64 time=0.178 ms
> > > 1480 bytes from 192.168.1.10: icmp_seq=4 ttl=64 time=0.188 ms
> > > 1480 bytes from 192.168.1.10: icmp_seq=5 ttl=64 time=0.178 ms
> > > 
> > > --- 192.168.1.10 ping statistics ---
> > > 5 packets transmitted, 5 received, 0% packet loss, time 3999ms
> > > rtt min/avg/max/mdev = 0.139/0.173/0.188/0.017 ms
> > > 
> > > [root@xeontest1 tracing]# cat trace
> > > # tracer: skb_sources
> > > #
> > > #       PID     ANID    CNID    IFC     RXQ     CCPU    LEN
> > > #        |       |       |       |       |       |       |
> > >         4217    1       1       eth2    0       4       1500
> > >         4217    1       1       eth2    0       4       1500
> > >         4217    1       1       eth2    0       4       1500
> > >         4217    1       1       eth2    0       4       1500
> > >         4217    1       1       eth2    0       4       1500
> > > 
> > > All is as was expected.
> > > 
> > > But if I try an actual nuttcp performance test (even rate limited
> > > to 1 Mbps), I get the following kernel oops:
> > > 
> > thank you, I think I see the problem, I'll have a patch for you in just a bit
> > 
> > Thanks
> > Neil
> > 
> > > [root@xeontest1 tracing]# numactl --membind=1 nuttcp -In2 -Ri1m -xc4/0 192.168.1.10
> > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> > > IP: [<ffffffff810b01ab>] probe_skb_dequeue+0xf7/0x152
> > > PGD 337d12067 PUD 337d11067 PMD 0
> > > Oops: 0000 [#1] SMP
> > > last sysfs file: /sys/devices/pci0000:80/0000:80:07.0/0000:8b:00.0/0000:8c:04.0e
> > > CPU 4
> > > Modules linked in: w83627ehf hwmon_vid coretemp hwmon ipv6 dm_multipath uinput ]
> > > Pid: 4222, comm: nuttcp Not tainted 2.6.31-rc6-bf #3 X8DAH
> > > RIP: 0010:[<ffffffff810b01ab>]  [<ffffffff810b01ab>] probe_skb_dequeue+0xf7/0x12
> > > RSP: 0018:ffff8801a5811a88  EFLAGS: 00010213
> > > RAX: 0000000000000000 RBX: ffff88033906d154 RCX: 000000000000000d
> > > RDX: 000000000000f88c RSI: 000000000000000b RDI: ffff8803383d3044
> > > RBP: ffff8801a5811ab8 R08: 0000000000000001 R09: ffff8801ab311a00
> > > R10: 0000000000000005 R11: ffffc9000080e2b0 R12: ffff880337c45400
> > > R13: ffff88033906d150 R14: 0000000000000014 R15: ffffffff818bb890
> > > FS:  00007fa976d326f0(0000) GS:ffffc90000800000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > CR2: 0000000000000038 CR3: 000000033801e000 CR4: 00000000000006e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > Process nuttcp (pid: 4222, threadinfo ffff8801a5810000, task ffff8801ab2e5d00)
> > > Stack:
> > >  ffff8801a5811ab8 ffff8801b35d4ab0 0000000000000014 0000000000000000
> > > <0> 0000000000000014 0000000000000014 ffff8801a5811b18 ffffffff81366ae8
> > > <0> ffff8801a5811ed8 0000001439084000 ffff880337c45400 00000001001416ef
> > > Call Trace:
> > >  [<ffffffff81366ae8>] skb_copy_datagram_iovec+0x50/0x1f5
> > >  [<ffffffff813ac875>] tcp_rcv_established+0x278/0x6db
> > >  [<ffffffff813b3ef5>] tcp_v4_do_rcv+0x1b8/0x366
> > >  [<ffffffff8135f99e>] ? release_sock+0xab/0xb4
> > >  [<ffffffff8136004d>] ? sk_wait_data+0xc8/0xd6
> > >  [<ffffffff813a32d6>] tcp_prequeue_process+0x79/0x8f
> > >  [<ffffffff813a455d>] tcp_recvmsg+0x4e8/0xaa0
> > >  [<ffffffff8135ec90>] sock_common_recvmsg+0x37/0x4c
> > >  [<ffffffff8135cb06>] __sock_recvmsg+0x72/0x7f
> > >  [<ffffffff8135cbdd>] sock_aio_read+0xca/0xda
> > >  [<ffffffff810d9536>] ? vma_merge+0x2a0/0x318
> > >  [<ffffffff810f6d4f>] do_sync_read+0xec/0x132
> > >  [<ffffffff81067ddc>] ? autoremove_wake_function+0x0/0x3d
> > >  [<ffffffff811b646c>] ? security_file_permission+0x16/0x18
> > >  [<ffffffff810f785c>] vfs_read+0xc0/0x107
> > >  [<ffffffff810f7971>] sys_read+0x4c/0x75
> > >  [<ffffffff81011c82>] system_call_fastpath+0x16/0x1b
> > > Code: 44 89 73 30 89 43 14 41 0f b7 84 24 ac 00 00 00 89 43 28 65 8b 04 25 98 e
> > > RIP  [<ffffffff810b01ab>] probe_skb_dequeue+0xf7/0x152
> > >  RSP <ffff8801a5811a88>
> > > CR2: 0000000000000038
> > > 
> > > 						-Thanks
> > > 
> > > 						-Bill
> > > 
> 
> 
> Here  you go, I think this will fix your oops.
> 
> 
>     Fix NULL pointer deref in skb sources ftracer
>     
>     It's possible that skb->sk will be NULL in this path, so we shouldn't just assume
>     we can pass it to sock_net.
>     
>     Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> 
>  trace_skb_sources.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

OK if this is just a temporary fix until TRACE_EVENT() is done, but 
we'll get rid of this and do TRACE_EVENT() before net-next-2.6 is 
pushed to .32, right?

	Ingo
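
For readers following the quoted patch description: the diff itself is not
included in this message, so the following is only a minimal user-space
sketch of the kind of guard the changelog describes (check skb->sk for NULL
before handing it to an accessor that dereferences it). The struct layouts
and the names sock_net_sketch() and skb_net_or_null() are hypothetical
stand-ins, not the kernel's definitions and not the actual fix.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real layouts. */
struct net { int id; };
struct sock { struct net *net; };
struct sk_buff { struct sock *sk; unsigned int len; };

/* Mimics an accessor that dereferences its argument unconditionally,
 * which is why a NULL socket pointer must never reach it. */
static struct net *sock_net_sketch(const struct sock *sk)
{
    return sk->net;
}

/* Guarded lookup: returns NULL instead of dereferencing a NULL skb->sk. */
static struct net *skb_net_or_null(const struct sk_buff *skb)
{
    if (!skb || !skb->sk)
        return NULL;
    return sock_net_sketch(skb->sk);
}
```

A caller would test the result of skb_net_or_null() and skip any
namespace-dependent work when it is NULL, rather than assuming every skb
seen on the dequeue path has an owning socket.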
