From mboxrd@z Thu Jan 1 00:00:00 1970
From: Or Gerlitz
Subject: Re: can't get IB link with the for-next branch / IBoE patches
Date: Tue, 26 Oct 2010 14:19:43 +0200
Message-ID: <4CC6C75F.8030103@Voltaire.com>
References: <20101024075835.GA11359@mtldesk30>
 <20101024160018.GA32499@mtldesk30>
 <4CC54D4E.7050203@Voltaire.com>
 <4CC5604D.2080803@Voltaire.com>
 <20101025161730.GA9335@mtldesk30>
 <4CC6A051.3010703@Voltaire.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4CC6A051.3010703-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Roland Dreier
Cc: Eli Cohen, RDMA list, Andy Grover
List-Id: linux-rdma@vger.kernel.org

> I have IB port coming to active and basic IPoIB, opensm working okay
> on the node with the current for-next/IBoE bits

Doing a little bit of stress testing, I came across the oops below when
running IPoIB with a couple of iperf/UDP sessions. It doesn't look like
a problem in the IB stack.

Also, with RDS, using rds-stress from rds-tools-1.5-1.el5 and running

"rds-stress -s 192.168.20.18 -p 4000 -t 1 -q 1K -a 1K -D 1M"

on the client side, the node running the for-next/IBoE bits, which was
acting as the passive side of the test, hung. Here too, this could be a
bug in RDS and not in the IBoE patches. I know that the RDS guys queued
about a hundred (!) patches for 2.6.37, so with those patches things
might be better. I have the oops trace as a jpg and will send it to
Andy, Roland and Eli.

I guess we can continue these tests for 2-3 days and have the push over
the weekend, or push before that and get fixes, if needed, through the
-rc cycle.
Oct 26 12:36:30 nsg2 kernel: BUG: spinlock bad magic on CPU#0, iperf/20845
Oct 26 12:36:30 nsg2 kernel:  lock: ffffffff81663ef8, .magic: 00000000, .owner: /-1, .owner_cpu: 0
Oct 26 12:36:30 nsg2 kernel: Pid: 20845, comm: iperf Not tainted 2.6.36-rc5-42052-gce806e1 #1
Oct 26 12:36:30 nsg2 kernel: Call Trace:
Oct 26 12:36:30 nsg2 kernel:  [] ? do_raw_spin_lock+0x22/0x122
Oct 26 12:36:30 nsg2 kernel:  [] ? dev_queue_xmit+0x10d/0x346
Oct 26 12:36:30 nsg2 kernel:  [] ? ip_push_pending_frames+0x2bf/0x318
Oct 26 12:36:30 nsg2 kernel:  [] ? udp_push_pending_frames+0x2d2/0x351
Oct 26 12:36:30 nsg2 kernel:  [] ? udp_sendmsg+0x4b0/0x59c
Oct 26 12:36:30 nsg2 kernel:  [] ? cap_socket_sendmsg+0x0/0x3
Oct 26 12:36:30 nsg2 kernel:  [] ? common_interrupt+0xe/0x13
Oct 26 12:36:30 nsg2 kernel:  [] ? cap_socket_sendmsg+0x0/0x3
Oct 26 12:36:30 nsg2 kernel:  [] ? sock_aio_write+0xf5/0x10d
Oct 26 12:36:30 nsg2 kernel:  [] ? reschedule_interrupt+0xe/0x20
Oct 26 12:36:30 nsg2 kernel:  [] ? common_interrupt+0xe/0x13
Oct 26 12:36:30 nsg2 kernel:  [] ? common_interrupt+0xe/0x13
Oct 26 12:36:30 nsg2 kernel:  [] ? do_sync_write+0xab/0xeb
Oct 26 12:36:30 nsg2 kernel:  [] ? _raw_spin_unlock_irq+0x9/0xd
Oct 26 12:36:30 nsg2 kernel:  [] ? security_file_permission+0x18/0x6b
Oct 26 12:36:30 nsg2 kernel:  [] ? vfs_write+0xbe/0x132
Oct 26 12:36:30 nsg2 kernel:  [] ? sys_write+0x45/0x6e
Oct 26 12:36:30 nsg2 kernel:  [] ? system_call_fastpath+0x16/0x1b