public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Martin Lorenz <martin@lorenz.eu.org>
To: linux-thinkpad@linux-thinkpad.org
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	linux-kernel@vger.kernel.org, "Brown, Len" <len.brown@intel.com>,
	acpi-devel@kernel.org, Pavel Machek <pavel@suse.cz>,
	Andrew Morton <akpm@osdl.org>
Subject: Re: [ltp] Re: X60s w/t kern 2.6.19-rc1-git: two BUG warnings
Date: Wed, 11 Oct 2006 09:06:50 +0200	[thread overview]
Message-ID: <20061011070650.GA7003@gimli> (raw)
In-Reply-To: <m17iz7oj3w.fsf@ebiederm.dsl.xmission.com>

On Tue, Oct 10, 2006 at 05:18:27PM -0600, Eric W. Biederman wrote:
> Jeremy Fitzhardinge <jeremy@goop.org> writes:
> 
> > Martin Lorenz wrote:
> >> Dear kernel gurus,
> >>
> >
> > (CC:s added to various interested parties.)
> >
> >> whatever I do and whic problem I seem to get fixed new ones arise:
> >>
> >> now I loose ACPI events after suspend/resume. not every time, but roughly 3
> >> out of 4 times.
> >>
> >
> > Yes, I'm seeing this as well.  It used to be a 100% failure, so there has been
> > some improvement.  On the other hand, before that it used to work perfectly...
> > Unfortunately I don't have a good feeling for when these changes happened.
> 
> So the first symptom is not the failure in e1000 open but this.
> [ 6727.082000] Trying to free already-free IRQ 20
> Which I suspect comes from e1000_close as it happens after
> the e1000 watchdog timer goes off because it hasn't gotten
> interrupts for a while.
> 
> We have other bits of suspicious code as well.
> The e1000 rolls it's own version of pci_save_state but doesn't
> call any of the msi save/restore state routines.
> 
> The bug is from the attempt to allocate an already allocated irq.
> So it appears somehow in the save/restore mess the msi code
> thought the irq code was allocates but the irq code did not?
> 

this morning I tried and booted the machine with pci=nomsi
the BUG does not come up as expected but the symptom of loosing ACPI after
suspend/resume remains...

I put up some information on my current configuration on
http://www.lorenz.eu.org/~mlo/kernel/
tell me, what I should add to help 

BTW: can someone tell me, why I can't pull from
     git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
     anymore?
     I keep getting 'fatal: unable to connect a socket (Connection refused)'
     since yesterday afternoon (first noticed ~2pm UTC)

> I have scratched the msi code enough lately that something may have
> shaken loose.   In particular in commit: 
> d7bd007e480672c99d8656c7b7b12ef0549432c37 I explicitly disallowed some
> weird double allocation/deallocation code that could have masked this
> problem.  
> 
> The e1000 watchdog doesn't seem to be going off in the functioning
> case so that may not be it.
> 
> My problem is I have been given two incomplete dmesg traces one
> that works and one that doesn't and without any background in this
> area beyond noting the e1000 watchdog timer is firing I can't say what
> happened.
> 
> In the working cases do you have trouble if you up and down the
> network interface?
> 
> We seem to be at that nasty interface layer where every piece is
> making assumptions about how someone else's piece works, and someone
> has their assumptions wrong.
> 
> If someone who understands the how suspend/resume is supposed to work
> for these kinds of things can speak up that would help.
> 
> >
> >> the only errornous things I see in the logs are those:
> >>
> >> [ 6727.089000] BUG: warning at drivers/pci/msi.c:680/pci_enable_msi()
> >> [ 6727.089000]  [<c0103bd9>] dump_trace+0x69/0x1af
> >> [ 6727.089000]  [<c0103d37>] show_trace_log_lvl+0x18/0x2c
> >> [ 6727.089000]  [<c01043d6>] show_trace+0xf/0x11
> >> [ 6727.089000]  [<c01044d9>] dump_stack+0x15/0x17
> >> [ 6727.089000]  [<c0208cd6>] pci_enable_msi+0x78/0x22e
> >> [ 6727.090000]  [<c0257fe5>] e1000_open+0x64/0x176
> >> [ 6727.091000]  [<c029b121>] dev_open+0x2b/0x62
> >> [ 6727.092000]  [<c0299c2f>] dev_change_flags+0x47/0xe4
> >> [ 6727.093000]  [<c02cdceb>] devinet_ioctl+0x252/0x556
> >> [ 6727.094000]  [<c0290e9a>] sock_ioctl+0x19e/0x1c2
> >> [ 6727.095000]  [<c016979b>] do_ioctl+0x1f/0x62
> >> [ 6727.095000]  [<c0169a23>] vfs_ioctl+0x245/0x257
> >> [ 6727.096000]  [<c0169a81>] sys_ioctl+0x4c/0x67
> >> [ 6727.096000]  [<c0102dc3>] syscall_call+0x7/0xb
> >> [ 6727.096000] DWARF2 unwinder stuck at syscall_call+0x7/0xb
> >> [ 6727.096000]
> >> [ 6727.096000] Leftover inexact backtrace:
> >> [ 6727.096000]
> >> [ 6727.096000]  [<c02e007b>] unix_stream_connect+0x15b/0x367
> >> [ 6727.096000]  =======================
> >>
> >> which occurs in variations but very often
> >>
> >
> > This just seems to be something thinking the IRQ isn't MSI-capable.  It doesn't
> > seem to have any negative effect - the e1000 works fine anyway.
> 
> It is a BUG.  The first symptoms of which where the failure to free
> the irq in by the e1000 watchdog.  The normal irq code said you didn't
> have an irq handler registered for that irq.  
> 
> Does anyone know how the irq handler could go aways in the weird
> suspend/resume world?
> 
> >> [ 6708.389000] e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full
> > Duplex
> >> [ 6708.389000] e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
> >> [ 6727.082000] Trying to free already-free IRQ 20
> >> [ 6727.089000] BUG: warning at drivers/pci/msi.c:680/pci_enable_msi()
> >> [ 6727.089000]  [<c0103bd9>] dump_trace+0x69/0x1af
> >> [ 6727.089000]  [<c0103d37>] show_trace_log_lvl+0x18/0x2c
> >> [ 6727.089000]  [<c01043d6>] show_trace+0xf/0x11
> >> [ 6727.089000]  [<c01044d9>] dump_stack+0x15/0x17
> >> [ 6727.089000]  [<c0208cd6>] pci_enable_msi+0x78/0x22e
> >> [ 6727.090000]  [<c0257fe5>] e1000_open+0x64/0x176
> >> [ 6727.091000]  [<c029b121>] dev_open+0x2b/0x62
> >> [ 6727.092000]  [<c0299c2f>] dev_change_flags+0x47/0xe4
> >> [ 6727.093000]  [<c02cdceb>] devinet_ioctl+0x252/0x556
> >> [ 6727.094000]  [<c0290e9a>] sock_ioctl+0x19e/0x1c2
> >> [ 6727.095000]  [<c016979b>] do_ioctl+0x1f/0x62
> >> [ 6727.095000]  [<c0169a23>] vfs_ioctl+0x245/0x257
> >> [ 6727.096000]  [<c0169a81>] sys_ioctl+0x4c/0x67
> >> [ 6727.096000]  [<c0102dc3>] syscall_call+0x7/0xb
> >> [ 6727.096000] DWARF2 unwinder stuck at syscall_call+0x7/0xb
> >> [ 6727.096000] [ 6727.096000] Leftover inexact backtrace:
> >> [ 6727.096000] [ 6727.096000]  [<c02e007b>] unix_stream_connect+0x15b/0x367
> >> [ 6727.096000]  =======================
> >> [ 6728.719000] e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full
> > Duplex
> >> [ 6728.719000] e1000: eth0: e1000_watchdog: 10/100 speed: disabling
> >TSO
> 
> Eric
> -- 
> The linux-thinkpad mailing list home page is at:
> http://mailman.linux-thinkpad.org/mailman/listinfo/linux-thinkpad

gruss
  mlo
--
Dipl.-Ing. Martin Lorenz

            They that can give up essential liberty 
	    to obtain a little temporary safety 
	    deserve neither liberty nor safety.
                                   Benjamin Franklin

please encrypt your mail to me
GnuPG key-ID: F1AAD37D
get it here:
http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xF1AAD37D

ICQ UIN: 33588107

  reply	other threads:[~2006-10-11  7:10 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-10-10  6:28 X60s w/t kern 2.6.19-rc1-git: two BUG warnings Martin Lorenz
2006-10-10 18:55 ` Jeremy Fitzhardinge
2006-10-10 23:18   ` Eric W. Biederman
2006-10-11  7:06     ` Martin Lorenz [this message]
2006-10-12  7:01       ` [ltp] " Martin Lorenz
2006-10-12  7:25         ` Pavel Machek
2006-10-12 15:24         ` Martin Lorenz
2006-10-13 15:47 ` Adrian Bunk
2006-10-16  6:32   ` [ltp] " Martin Lorenz
2006-10-16  6:53     ` Martin Lorenz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20061011070650.GA7003@gimli \
    --to=martin@lorenz.eu.org \
    --cc=acpi-devel@kernel.org \
    --cc=akpm@osdl.org \
    --cc=jeremy@goop.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-thinkpad@linux-thinkpad.org \
    --cc=pavel@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox