linux-pci.vger.kernel.org archive mirror
From: Bjorn Helgaas <helgaas@kernel.org>
To: Roland Singer <roland.singer@desertbit.com>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: Re: Kernel Freeze with American Megatrends BIOS
Date: Tue, 30 Aug 2016 08:06:34 -0500
Message-ID: <20160830130634.GA16426@localhost>
In-Reply-To: <1d1bfdc2-f23d-9816-e4e3-ae676105dc39@desertbit.com>

On Tue, Aug 30, 2016 at 12:08:57PM +0200, Roland Singer wrote:
> Thanks for pointing it out.
> 
> Yeah, that's right. The system will hang randomly a few minutes later,
> because certain actions in the graphical user session trigger
> the freeze.
> 
> I had a look at the function body of pci_read_config_dword:
> 
>   #define PCI_OP_READ(size, type, len) \
>   int pci_bus_read_config_##size \
> 	(struct pci_bus *bus, unsigned int devfn, int pos, type *value)	\
>   {									\
> 	int res;							\
> 	unsigned long flags;						\
> 	u32 data = 0;							\
> 	if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;	\
> 	raw_spin_lock_irqsave(&pci_lock, flags);			\
> 	res = bus->ops->read(bus, devfn, pos, len, &data);		\
> 	*value = (type)data;						\
> 	raw_spin_unlock_irqrestore(&pci_lock, flags);		\
> 	return res;							\
>   }
> 
> I guess that bus->ops->read(...) might be the trigger.
> Any hints on how to continue debugging?

It's not likely that the problem is in the bus->ops->read() path.  That
is used by every device driver, so a problem there would cause more
serious problems than what you're seeing.
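
To illustrate why every config access shares that path, here is a minimal
userspace sketch of the PCI_OP_READ() pattern quoted above.  It is not the
kernel code: the mock_* names, the fake 256-byte config space, and the
MOCK_* return values are assumptions made up for this example, and the
locking is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's return codes. */
#define MOCK_OK            0
#define MOCK_BAD_REGISTER  (-1)

struct mock_bus;

/* Imitates the role of struct pci_ops: one read hook per bus. */
struct mock_ops {
	int (*read)(struct mock_bus *bus, unsigned int devfn,
		    int pos, int len, uint32_t *val);
};

/* Imitates struct pci_bus, with a fake config space for one device. */
struct mock_bus {
	const struct mock_ops *ops;
	uint8_t cfg[256];
	unsigned int reads;	/* accesses that reached ops->read */
};

/* Backend read: assembles a little-endian value from the fake space. */
int mock_read(struct mock_bus *bus, unsigned int devfn,
	      int pos, int len, uint32_t *val)
{
	uint32_t v = 0;

	(void)devfn;
	if (pos < 0 || pos + len > (int)sizeof(bus->cfg))
		return MOCK_BAD_REGISTER;
	bus->reads++;
	for (int i = 0; i < len; i++)
		v |= (uint32_t)bus->cfg[pos + i] << (8 * i);
	*val = v;
	return MOCK_OK;
}

const struct mock_ops mock_default_ops = { .read = mock_read };

/*
 * One accessor per width, generated the same way PCI_OP_READ() does.
 * The alignment test stands in for the PCI_##size##_BAD checks.
 */
#define MOCK_OP_READ(size, type, len)					\
int mock_bus_read_config_##size(struct mock_bus *bus,			\
				unsigned int devfn, int pos, type *value) \
{									\
	uint32_t data = 0;						\
	int res;							\
	assert(bus != NULL && value != NULL);				\
	if (pos & ((len) - 1))						\
		return MOCK_BAD_REGISTER;				\
	res = bus->ops->read(bus, devfn, pos, len, &data);		\
	*value = (type)data;						\
	return res;							\
}

MOCK_OP_READ(byte, uint8_t, 1)
MOCK_OP_READ(word, uint16_t, 2)
MOCK_OP_READ(dword, uint32_t, 4)
```

All three generated accessors end up in the same ops->read() hook, so a
defect in that hook would be visible to every caller, not just to one
driver.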

My guess would be some problem in the video driver or the bbswitch
thing.
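
For reference, the check that bbswitch's is_card_disabled() (quoted
further down) applies to the value it reads can be tried in userspace.
This is only a sketch of the bit test: the function names are invented
for the example, and the device ID in the usage note is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same test as "return !~cfg_word;" in the quoted bbswitch code:
 * true only when all 32 bits of the first config dword are set,
 * which bbswitch takes to mean the card is off.
 */
bool cfg_word_means_off(uint32_t cfg_word)
{
	return !~cfg_word;
}

/* The first config dword holds the Vendor ID in its low 16 bits ... */
uint16_t cfg_word_vendor(uint32_t cfg_word)
{
	return (uint16_t)(cfg_word & 0xffff);
}

/* ... and the Device ID in its high 16 bits. */
uint16_t cfg_word_device(uint32_t cfg_word)
{
	return (uint16_t)(cfg_word >> 16);
}
```

With these helpers, cfg_word_means_off(0xffffffff) is true, while a
value such as 0x123410de (NVIDIA's vendor ID 0x10de plus an arbitrary
device ID) is treated as "ON".  Note that this only interprets the value;
it says nothing about whether the read itself is safe to issue while the
card is powered down.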

> On 30.08.2016 at 01:54, Bjorn Helgaas wrote:
> > On Mon, Aug 29, 2016 at 09:55:56PM +0200, Roland Singer wrote:
> >> Just tried it and the system didn't freeze. However, it will freeze
> >> after some time (a few minutes while working).
> >>
> >> Seems to be pci_read_config_dword. Where exactly is this defined?
> > 
> > pci_read_config_dword() is defined in include/linux/pci.h.  It calls
> > pci_bus_read_config_dword() which is defined by the PCI_OP_READ() macro
> > in drivers/pci/access.c.
> > 
> > If I understand correctly, this:
> > 
> >   dis_dev_get();
> >   pci_read_config_dword(dis_dev, 0, &cfg_word);
> >   dis_dev_put();
> > 
> > causes an immediate system hang, but if you only do this:
> > 
> >   dis_dev_get();
> >   dis_dev_put();
> > 
> > the system hangs a few minutes later.  Right?
> > 
> >> On 29.08.2016 at 21:07, Bjorn Helgaas wrote:
> >>> On Mon, Aug 29, 2016 at 08:46:17PM +0200, Roland Singer wrote:
> >>>> Hi Bjorn,
> >>>>
> >>>> I am using the bbswitch kernel module to switch off/on the GPU and
> >>>> to obtain the GPU power state.
> >>>> Obtaining the GPU state immediately after starting the graphical user
> >>>> session freezes the system.
> >>>>
> >>>> This code triggers something, which is responsible for the freeze.
> >>>>
> >>>> ---
> >>>> // Returns 1 if the card is disabled, 0 if enabled
> >>>> static int is_card_disabled(void) {
> >>>>     u32 cfg_word;
> >>>>     // read first config word which contains Vendor and Device ID. If all bits
> >>>>     // are enabled, the device is assumed to be off
> >>>>     pci_read_config_dword(dis_dev, 0, &cfg_word);
> >>>>     // if one of the bits is not enabled (the card is enabled), the inverted
> >>>>     // result will be non-zero and hence logical not will make it 0 ("false")
> >>>>     return !~cfg_word;
> >>>> }
> >>>>
> >>>> static int bbswitch_proc_show(struct seq_file *seqfp, void *p) {
> >>>>     // show the card state. Example output: 0000:01:00:00 ON
> >>>>     dis_dev_get();
> >>>>     seq_printf(seqfp, "%s %s\n", dev_name(&dis_dev->dev),
> >>>>              is_card_disabled() ? "OFF" : "ON");
> >>>>     dis_dev_put();
> >>>>     return 0;
> >>>> }
> >>>> ---
> >>>>
> >>>> Either dis_dev_get or pci_read_config_dword is the trigger.
> >>>
> >>> What happens if you remove the call to is_card_disabled()?  Does the
> >>> system still freeze if you only do the dis_dev_get()/dis_dev_put()?
> >>>
> >>>> Link to the bbswitch module source code:
> >>>> https://github.com/Bumblebee-Project/bbswitch/blob/master/bbswitch.c#L333
> >>>>
> >>>>
> >>>> On 29.08.2016 at 18:02, Bjorn Helgaas wrote:
> >>>>> [+cc linux-acpi, linux-kernel, dri-devel]
> >>>>>
> >>>>> Hi Roland,
> >>>>>
> >>>>> I have no idea how to debug this problem.  Are you seeing something
> >>>>> that suggests it may be a PCI problem?
> >>>>>
> >>>>> On Tue, Aug 23, 2016 at 11:23:45AM +0200, Roland Singer wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> hope somebody can help me fix this kernel problem which affects the following machines:
> >>>>>>
> >>>>>> - Clevo P651RA (i7-6700HQ/GTX 965M, part of the P6xxRx family which are also affected)
> >>>>>> - MSI GE62 Apache Pro (i7-6700HQ/GTX 960M)
> >>>>>> - Gigabyte P35V5 (i7-6700HQ/GTX 970M)
> >>>>>> - Razer Blade 14" (2016) (i7-6700HQ/GTX 970M) (BIOS 5.11, 04/07/2016)
> >>>>>>
> >>>>>>
> >>>>>> The kernel freezes if the graphical user session (Xorg & Wayland) is
> >>>>>> started with a switched off discrete GPU card (NVIDIA).
> >>>>>> If the discrete GPU is switched off after the graphical session start,
> >>>>>> then everything works as expected, until the graphical session is restarted.
> >>>>>>
> >>>>>> This problem seems to be linked to specific BIOS settings. If the computer
> >>>>>> is started with the following command line:
> >>>>>>
> >>>>>> acpi_osi=! acpi_osi="Windows 2009"
> >>>>>>
> >>>>>> then the kernel freeze no longer occurs. However, this required a special
> >>>>>> ACPI DSDT firmware patch for the Razer Blade 2016 laptop:
> >>>>>>
> >>>>>> https://github.com/m4ng0squ4sh/razer_blade_14_2016_acpi_dsdt
> >>>>>>
> >>>>>> I strongly recommend fixing this in the kernel, and I am ready to help
> >>>>>> solve this problem given some guidance.
> >>>>>>
> >>>>>> Here is a link to the GitHub issue with further information:
> >>>>>>
> >>>>>> https://github.com/Bumblebee-Project/Bumblebee/issues/764#issuecomment-241212595
> >>>>>>
> >>>>>> Here is some more detailed information:
> >>>>>>
> >>>>>> https://github.com/Lekensteyn/acpi-stuff/blob/master/Clevo-P651RA/notes.txt
> >>>>>>
> >>>>>> Hope somebody can help.
> >>>>
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> >>>> the body of a message to majordomo@vger.kernel.org
> >>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Thread overview: 32+ messages
2016-08-23  9:23 Kernel Freeze with American Megatrends BIOS Roland Singer
2016-08-29  7:56 ` Roland Singer
2016-08-29 16:02 ` Bjorn Helgaas
2016-08-29 18:46   ` Roland Singer
2016-08-29 19:07     ` Bjorn Helgaas
2016-08-29 19:55       ` Roland Singer
2016-08-29 23:54         ` Bjorn Helgaas
2016-08-30 10:08           ` Roland Singer
2016-08-30 13:06             ` Bjorn Helgaas [this message]
2016-08-30 14:08               ` Emil Velikov
2016-08-30 15:25                 ` Roland Singer
2016-08-30 15:44                   ` Ilia Mirkin
2016-08-30 15:48                     ` Ilia Mirkin
2016-08-30 15:48                   ` Emil Velikov
2016-08-30 17:37                     ` Roland Singer
2016-08-30 17:43                       ` Ilia Mirkin
2016-08-30 18:02                         ` Roland Singer
2016-08-30 18:13                           ` Ilia Mirkin
2016-08-30 19:21                             ` Peter Wu
2016-08-31 11:12                               ` Roland Singer
2016-08-31 11:11                             ` Roland Singer
2016-08-30 18:09                       ` Emil Velikov
2016-08-30 18:10                         ` Emil Velikov
2016-08-31 10:51                           ` Roland Singer
2016-08-30 19:53   ` Peter Wu
2016-08-31 11:27     ` Roland Singer
2016-08-31 11:46       ` Peter Wu
2016-08-31 12:21         ` Roland Singer
2016-08-31 12:34           ` Peter Wu
2016-08-31 13:13             ` Roland Singer
2016-08-31 20:06               ` Roland Singer
2016-08-31 20:16                 ` Roland Singer
