From: Konrad Rzeszutek Wilk <konrad@kernel.org>
To: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Jan Beulich <JBeulich@suse.com>,
xen-devel@lists.xen.org
Subject: Re: dom0 / hypervisor hang on dom0 boot
Date: Fri, 17 May 2013 18:28:16 -0400
Message-ID: <20130517222814.GA3255@localhost.localdomain>
In-Reply-To: <1630888.LbRauWP15S@amur.mch.fsc.net>
[-- Attachment #1: Type: text/plain, Size: 2581 bytes --]
On Thu, May 16, 2013 at 01:07:05PM +0200, Dietmar Hahn wrote:
> On Wednesday, 15 May 2013, 10:42:17, Jan Beulich wrote:
> > >>> On 15.05.13 at 11:12, Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> wrote:
> > > On Wednesday, 15 May 2013, 09:35:46, Jan Beulich wrote:
> > >> >>> On 15.05.13 at 08:53, Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> wrote:
> > >> > I tried iommu=debug and I can't see any fault messages, but I am not
> > >> > familiar with this code.
> > >> > I attached the log; maybe someone can have a look at it.
> >
> > Perhaps only (if at all) by instrumenting the hypervisor. The
> > question of course is how easily/quickly you can narrow down the
> > code region that it might be dying in. And whether it's a hypervisor
> > action at all that causes the hang (as opposed to something the
> > DRM code in Dom0 does).
>
> I added some debug code to the linux kernel and could track down the
> point of the hang. I used openSuSE kernel 3.7.10-1.4 but I looked at newer
> kernels and found that the code is similar.
>
> i915_gem_init_global_gtt(...)
> ...
> intel_gtt_clear_range(start / PAGE_SIZE, (end-start) / PAGE_SIZE);
> ...
>
> void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
> {
> unsigned int i;
>
> ---> A printk(...) here is seen on serial line!
>
> for (i = first_entry; i < (first_entry + num_entries); i++) {
> intel_private.driver->write_entry(intel_private.base.scratch_page_dma,
> i, 0);
> }
>
> ---> A printk(...) here is never seen!
>
> readl(intel_private.gtt+i-1);
> }
>
> The function behind the pointer intel_private.driver->write_entry is
> i965_write_entry(). And the interesting instruction seems to be:
> writel(addr | pte_flags, intel_private.gtt + entry);
>
> I added another printk() at the start of the function i965_write_entry().
> Surprisingly, after printing a lot of messages, the kernel came up!
> But now I had other problems, like losing the audio device (maybe timeouts).
> So maybe the hang is a timing problem?
>
> What I wanted to check is what the hypervisor is doing while the system hangs.
> Does anybody have an idea, maybe a timer that prints a stack dump of all CPUs
> after 30s?
Yes. Can you try the two attached patches, please?
> Thanks.
>
> Dietmar.
>
>
> --
> Company details: http://ts.fujitsu.com/imprint.html
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>
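For reference, the ordering that the second patch is concerned with (map,
write, sync for the device, read, unmap) can be illustrated with a toy
user-space model. This is only a sketch under stated assumptions: it does
not use the kernel DMA API, every toy_* name and TOY_PAGE_SIZE is
hypothetical, and it assumes the device-visible copy is a separate bounce
buffer. Whether this is what actually causes the hang is not established.

```c
#include <assert.h>   /* for callers that want to check the results */
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16  /* tiny, hypothetical stand-in for PAGE_SIZE */

/* Hypothetical toy model of a bounce-buffered mapping; NOT the kernel API. */
struct toy_mapping {
	unsigned char *cpu_buf;              /* buffer the driver writes to */
	unsigned char bounce[TOY_PAGE_SIZE]; /* copy the "device" reads from */
};

/* Models pci_map_page(): the device-visible copy is taken at map time. */
static void toy_map(struct toy_mapping *m, unsigned char *cpu_buf)
{
	m->cpu_buf = cpu_buf;
	memcpy(m->bounce, cpu_buf, TOY_PAGE_SIZE);
}

/* Models pci_dma_sync_single_for_device(): push CPU writes to the device copy. */
static void toy_sync_for_device(struct toy_mapping *m)
{
	memcpy(m->bounce, m->cpu_buf, TOY_PAGE_SIZE);
}

/* Models the device reading one byte through its DMA address. */
static unsigned char toy_device_read(const struct toy_mapping *m, size_t off)
{
	return m->bounce[off];
}

/*
 * Returns what the device sees at offset 0 after the CPU writes `value`
 * to an already mapped, zero-filled page, with or without the sync step.
 */
static unsigned char toy_write_then_device_read(unsigned char value, int do_sync)
{
	unsigned char page[TOY_PAGE_SIZE] = {0};
	struct toy_mapping m;

	toy_map(&m, page);
	page[0] = value;              /* CPU write after the mapping exists */
	if (do_sync)
		toy_sync_for_device(&m);  /* the step the patch adds */
	return toy_device_read(&m, 0);
}
```

In this model, toy_write_then_device_read(0xAB, 1) lets the device see the
new value, while toy_write_then_device_read(0xAB, 0) leaves it reading the
stale zero taken at map time.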
[-- Attachment #2: 0001-drm-i915-Don-t-leak-a-page-in-case-of-DMA-error-mapp.patch --]
[-- Type: text/plain, Size: 1947 bytes --]
From 4201962b743a44325ff848ba6387d3710343c123 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 17 May 2013 18:13:35 -0400
Subject: [PATCH 1/2] drm/i915: Don't leak a page in case of DMA error mapping.
We don't free the allocated page if we fail to set up the DMA
mapping. This fixes it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
drivers/char/agp/intel-gtt.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index dbd901e..701b328 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -294,9 +294,10 @@ static int intel_gtt_setup_scratch_page(void)
if (intel_private.base.needs_dmar) {
dma_addr = pci_map_page(intel_private.pcidev, page, 0,
PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
- if (pci_dma_mapping_error(intel_private.pcidev, dma_addr))
+ if (pci_dma_mapping_error(intel_private.pcidev, dma_addr)) {
+ __intel_gtt_teardown_scratch_page();
return -EINVAL;
-
+ }
intel_private.base.scratch_page_dma = dma_addr;
} else
intel_private.base.scratch_page_dma = page_to_phys(page);
@@ -542,15 +543,18 @@ static unsigned int intel_gtt_mappable_entries(void)
return aperture_size >> PAGE_SHIFT;
}
-
-static void intel_gtt_teardown_scratch_page(void)
+static void __intel_gtt_teardown_scratch_page(void)
{
set_pages_wb(intel_private.scratch_page, 1);
- pci_unmap_page(intel_private.pcidev, intel_private.base.scratch_page_dma,
- PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
put_page(intel_private.scratch_page);
__free_page(intel_private.scratch_page);
}
+static void intel_gtt_teardown_scratch_page(void)
+{
+ pci_unmap_page(intel_private.pcidev, intel_private.base.scratch_page_dma,
+ PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+ __intel_gtt_teardown_scratch_page();
+}
static void intel_gtt_cleanup(void)
{
--
1.8.1.2
[-- Attachment #3: 0002-drm-i915-Sync-the-scratch-page-after-writting-values.patch --]
[-- Type: text/plain, Size: 1221 bytes --]
From 51908f611fb00195d98f1a552106c6d1709720c0 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 17 May 2013 18:20:46 -0400
Subject: [PATCH 2/2] drm/i915: Sync the scratch page after writing values to
it.
We don't sync the page after we have written to it. This is what
you are supposed to do when doing:
pci_map_page
.. write some values
[ was missing a call to pci_dma_sync_single_for_device]
.. read some values
pci_unmap_page
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
drivers/char/agp/intel-gtt.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 701b328..89dd698 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -902,6 +902,9 @@ void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
intel_private.driver->write_entry(intel_private.base.scratch_page_dma,
i, 0);
}
+ pci_dma_sync_single_for_device(intel_private.pcidev,
+ intel_private.base.scratch_page_dma,
+ PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
readl(intel_private.gtt+i-1);
}
EXPORT_SYMBOL(intel_gtt_clear_range);
--
1.8.1.2