From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Vasquez
Subject: Re: kernel 2.6.26.3 qla2xxx oopsing on Fire 280R
Date: Mon, 8 Sep 2008 14:13:31 -0700
Message-ID: <20080908211331.GC22598@plap4-2.qlogic.org>
References: <20080904093929.GA29006@orion.carnet.hr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8BIT
Return-path:
Received: from avexch1.qlogic.com ([198.70.193.115]:35045 "EHLO
	avexch1.qlogic.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751254AbYIHVNc convert rfc822-to-8bit (ORCPT );
	Mon, 8 Sep 2008 17:13:32 -0400
Content-Disposition: inline
In-Reply-To: <20080904093929.GA29006@orion.carnet.hr>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Josip Rodin
Cc: sparclinux@vger.kernel.org, linux-scsi@vger.kernel.org

On Thu, 04 Sep 2008, Josip Rodin wrote:

> Here we go again :/ This is the failing boot log, attached is the config of
> the kernel that doesn't work.
>
> boot: linux
> Allocated 8 Megs of memory at 0x40000000 for kernel
> qla2xxx 0001:00:04.0: LIP reset occured (f8ef).
> scsi1 : qla2xxx
> qla2xxx 0001:00:04.0: LIP occured (f8ef).
> qla2xxx 0001:00:04.0: LOOP UP detected (1 Gbps).
> Unable to handle kernel NULL pointer dereference
> tsk->{mm,active_mm}->context = 00000000000000e0
> tsk->{mm,active_mm}->pgd = fffff8007c90e000
>               \|/ ____ \|/
>               "@'/ .. \`@"
>               /_| \__/ |_\
>                  \__U_/
> qla2xxx_1_dpc(771): Oops [#1]
> TSTATE: 0000004480009604 TPC: 000000000058ecf4 TNPC: 000000000058ecf8 Y: 000003
> Not tainted
> TPC:
> g0: fffff8007cf770a1 g1: 0000000000000000 g2: fffff8007c014000 g3: 000000040020
> g4: fffff8007e0d0580 g5: fffff8007f6a0000 g6: fffff8007cf74000 g7: 20000004cf2b
> o0: 0000000000692588 o1: 0000000000000003 o2: 0000000000000001 o3: 000000000000
> o4: 0000000000000000 o5: 00007fffffffe000 sp: fffff8007cf770c1 ret_pc: 00000000044e90c
> RPC:
> l0: fffff8007c016940 l1: 000000000000000f l2: fffff8007e4000b0 l3: 000000000000000
> l4: fffff8007e054870 l5: 0000000000000008 l6: fffff8007c89c000 l7: 00000000004fc40
> i0: fffff8007c014000 i1: 0000000000000000 i2: 0000000000000000 i3: fffff8007c984a0
> i4: 0000000000000000 i5: 0000000000000000 i6: fffff8007cf77181 i7: 0000000000512ac
> I7:
> Caller[00000000005912ac]: fc_remote_port_add+0x18/0x664
> Caller[000000001003c0e4]: qla2x00_update_fcport+0x2b0/0x368 [qla2xxx]
> Caller[000000001003ca98]: qla2x00_configure_loop+0x85c/0x1a18 [qla2xxx]
> Caller[000000001003dcd0]: qla2x00_loop_resync+0x7c/0x10c [qla2xxx]
> Caller[0000000010039f50]: qla2x00_do_dpc+0x60c/0x6e8 [qla2xxx]
> Caller[000000000046aec4]: kthread+0x4c/0x78
> Caller[0000000000426df8]: kernel_thread+0x38/0x48
> Caller[000000000046acec]: kthreadd+0xc4/0x1a0

That's odd, as fc_flush_work() is quite minimal:

	static void
	fc_flush_work(struct Scsi_Host *shost)
	{
		if (!fc_host_work_q(shost)) {
			printk(KERN_ERR
				"ERROR: FC host '%s' attempted to flush work, "
				"when no workqueue created.\n", shost->hostt->name);
			dump_stack();
			return;
		}

		flush_workqueue(fc_host_work_q(shost));
	}

there's not much chance here for a NULL-dereference.

Since we have known good and bad points, could you possibly git-bisect
this to help troubleshoot? Also, are you seeing similar problems with
Linus' latest tree?

--
AV
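
For readers following the reasoning about fc_flush_work(), here is a minimal userspace sketch of the same guard pattern. It is not the kernel code: every mock_* name is hypothetical, and the macro body is an assumption about how fc_host_work_q() reaches the workqueue (through the host's shost_data pointer). Under that assumption, the only pointers read before the flush are shost, shost->shost_data, and, on the error path, shost->hostt.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct mock_workqueue { int flushed; };
struct mock_fc_host_attrs { struct mock_workqueue *work_q; };
struct mock_host_template { const char *name; };
struct mock_scsi_host {
	struct mock_host_template *hostt;
	void *shost_data;	/* assumed to point at the FC host attributes */
};

/* Assumed shape of the accessor: it dereferences shost and shost_data. */
#define mock_fc_host_work_q(x) \
	(((struct mock_fc_host_attrs *)(x)->shost_data)->work_q)

/* Returns 0 after "flushing", -1 if no workqueue was created. */
int mock_fc_flush_work(struct mock_scsi_host *shost)
{
	if (!mock_fc_host_work_q(shost)) {
		fprintf(stderr,
			"ERROR: FC host '%s' attempted to flush work, "
			"when no workqueue created.\n", shost->hostt->name);
		return -1;
	}

	/* Stand-in for flush_workqueue(): just record that it happened. */
	mock_fc_host_work_q(shost)->flushed = 1;
	return 0;
}
```

In this sketch a NULL work_q is caught by the check, so a fault could only come from shost or shost_data being invalid before the check runs. Whether that matches the oops above is not established here; a bisect would still be needed to confirm.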