public inbox for linux-kernel@vger.kernel.org
From: Jean Delvare <khali@linux-fr.org>
To: Robert Norris <robn@opera.com>
Cc: linux-kernel@vger.kernel.org, Linux I2C <linux-i2c@vger.kernel.org>
Subject: Re: PROBLEM: modprobe hang at startup (3.8.x, 3.9.x, IBM x3550)
Date: Wed, 15 May 2013 21:49:23 +0200
Message-ID: <20130515214923.036dabdb@endymion.delvare>
In-Reply-To: <20130515112741.GA23766@pyro.melbourne.osa>

Robert,

On Wed, 15 May 2013 21:27:41 +1000, Robert Norris wrote:
> On Wed, May 15, 2013 at 11:20:44AM +0200, Jean Delvare wrote:
> > Can you share the full output of lspci -s 00:1f.3 -vv?
> 
> 00:1f.3 SMBus: Intel Corporation 631xESB/632xESB/3100 Chipset SMBus Controller (rev 09)
>     Subsystem: IBM Device 02dd
>     Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
>     Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>     Interrupt: pin B routed to IRQ 0

Hmm, this "IRQ 0" is quite odd. I'm wondering if this could be the
reason for the hang. Was this output taken with the i2c-i801 driver
loaded, or blacklisted? Please check if it makes a difference.
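
To compare the two cases, the routed IRQ can be pulled out of the lspci
output. This is a small sketch only: routed_irq is a hypothetical helper
name, and it assumes the "Interrupt: pin X routed to IRQ N" wording
shown in the output above.

```shell
# Print the IRQ number from `lspci -vv` output read on stdin.
# Assumes a line of the form "Interrupt: pin B routed to IRQ 0".
routed_irq() {
    sed -n 's/.*Interrupt: pin [A-D] routed to IRQ \([0-9][0-9]*\).*/\1/p'
}

# Usage on a live system, once with i2c-i801 loaded and once without:
# lspci -s 00:1f.3 -vv | routed_irq
```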

Do you see the same IRQ 0 (and, more generally, this hang) on one, some
or all of your x3550 servers?

Are you using IPMI on these machines?

>     Region 4: I/O ports at 0440 [size=32]
> 
> > I'm also curious if the SMBus controller shares its interrupt line
> > with another chip. /proc/interrupts should tell but you'll have to
> > make one of your systems hang again.
> 
> I'm not sure how to read it, so here it is (3.9.2, immediately after
> boot, no options to i2c_i801):
> 
>            CPU0       CPU1       CPU2       CPU3       
> (...)
>  20:          0          0          0          0   IO-APIC-fasteoi   i801_smbus

Here the IRQ looks correct, and it isn't shared. But I am surprised
that the counters are all 0. If an SMBus transaction had been
attempted, there should be a 1 somewhere, even if the transaction
ultimately failed.
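
One way to check the counters after each attempt, as a minimal sketch:
sum the per-CPU values on the i801_smbus line of /proc/interrupts. The
helper name irq_total is hypothetical, and it assumes the handler name
is the last field on the line, which would not hold for every handler
on a shared IRQ.

```shell
# Sum the per-CPU counters for the handler named in $1.
# Reads /proc/interrupts-formatted text on stdin; exits 1 if not found.
irq_total() {
    awk -v name="$1" '
        $NF == name {
            total = 0
            # Field 1 is "IRQ:"; per-CPU counts follow until the first
            # non-numeric field (the interrupt controller type).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
            print total
            found = 1
        }
        END { if (!found) exit 1 }
    '
}

# Usage on a live system:
# irq_total i801_smbus < /proc/interrupts
```

A total that is still 0 after an attempted transaction would match the
oddity described above.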

> (...)
> I went with blacklisting for now because this driver doesn't appear to
> be doing anything useful for us (sensors etc are working without it).
> I'll confess to not really knowing much about its purpose though.

It all depends on what I2C/SMBus slaves are connected to the SMBus.
Often there are the SPD EEPROMs from your memory modules, sometimes
with integrated thermal sensors (on DDR3 only; the driver is jc42). In
your case there is a clock chip as well, for which IBM contributed a
driver.

> > (...)
> > As far as debugging goes, please tell me if you have any I2C/SMBus
> > slave device driver loaded (check in /sys/bus/i2c/drivers.) Loading the
> > i2c-i801 driver doesn't do much on its own if there are no slave device
> > drivers using it.
> 
> $ modprobe i2c-i801 disable_features=0x10
> $ dmesg | tail
> ...
> [28876.193408] i801_smbus 0000:00:1f.3: Interrupt disabled by user
> [28876.201168] ics932s401 4-0069: ics932s401 chip found
> $ ls /sys/bus/i2c/drivers
> dummy  ics932s401

The dummy driver is a helper stub for i2c-core; it doesn't actually
access the SMBus. ics932s401 is for the clock chip, and I know clock
chips can be tricky and error-prone. OTOH I can only guess that IBM had
a good reason to contribute the driver and make it auto-load on the
x3550.

I would appreciate it if you could test the following:
* Blacklist i2c-i801 and ics932s401 so that neither of them gets
  auto-loaded.
* Manually load i2c-i801 with interrupts enabled, and see what happens.
* If no hang happens, load i2c-dev and find the i801 bus number with
  i2cdetect -l (from the i2c-tools package). It should be 4 according
  to what you reported so far, but there is no guarantee that it won't
  change across reboots. Then do a simple read from a random address
  with:
  # i2cget 4 0x50 0x00
  (Adjust the bus number as needed.)
  I am curious whether this will hang as well, or whether the hang only
  happens when accessing the clock chip at address 0x69.
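
The manual steps above could be scripted roughly as follows. This is a
sketch under assumptions, not a tested procedure: i801_bus and
probe_i801 are hypothetical names, both modules are assumed to be
blacklisted already, and the adapter is assumed to show up in
i2cdetect -l with "I801" in its name.

```shell
# Extract the i801 bus number from `i2cdetect -l` output on stdin.
# Assumes lines starting with "i2c-N" and an adapter name containing "I801".
i801_bus() {
    awk '/I801/ { sub(/^i2c-/, "", $1); print $1; exit }'
}

# Test sequence, to be run as root. Defined here but not invoked.
probe_i801() {
    modprobe i2c-i801 || return 1   # no disable_features: interrupts stay on
    modprobe i2c-dev || return 1
    bus=$(i2cdetect -l | i801_bus)
    [ -n "$bus" ] || { echo "i801 bus not found" >&2; return 1; }
    # Simple read from address 0x50, register 0x00, as suggested above.
    i2cget "$bus" 0x50 0x00
}
```

Looking the bus number up each time avoids relying on it staying at 4
across reboots.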

Thanks,
-- 
Jean Delvare


