public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: fenghua.yu@intel.com
Cc: dwmw2@infradead.org, amluto@gmail.com, kyle@redhat.com,
	mgross@linux.intel.com, linux-kernel@vger.kernel.org,
	iommu@lists.linux-foundation.org
Subject: Re: [patch 1/2] Intel IOMMU Suspend/Resume Support for DMAR
Date: Wed, 25 Mar 2009 21:12:28 +0100	[thread overview]
Message-ID: <20090325201228.GA26204@elte.hu> (raw)
In-Reply-To: <20090325184635.374969000@intel.com>


* fenghua.yu@intel.com <fenghua.yu@intel.com> wrote:

> The attached patch implements the suspend and resume feature for 
> DMAR. It hooks to kernel suspend and resume interface.  When 
> suspend happens, it saves necessary hardware registers. When 
> resume happens it restores the registers and restarts DMAR.

The structure looks pretty clean. I suspect David will go over the 
lowlevel details - i only have a few nitpicks:

>  #include <linux/intel-iommu.h>
> +#include <linux/sysdev.h>
> +#include <asm/i8259.h>

why is that needed?

>  #include <asm/cacheflush.h>
>  #include <asm/iommu.h>
>  #include "pci.h"
> @@ -2563,6 +2565,161 @@ static void __init init_no_remapping_devices(void)
>  	}
>  }
>  
> +#ifdef CONFIG_PM
> +static int init_iommu_hw(void)
> +{
> +	struct dmar_drhd_unit *drhd;
> +	struct intel_iommu *iommu;
> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +
> +		iommu = drhd->iommu;
> +
> +		if (iommu->qi)
> +			dmar_reenable_qi(iommu);
> +	}
> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;

i suggested the following cleanliness detail on the previous 
submission a month or two ago, and it's still valid: why isnt the 
drhd->ignored skipping integrated into for_each_drhd_unit() ?

Perhaps named: for_each_active_drhd_unit().

Also, in practice, we iterate over active iommus, so this whole 
sequence:

> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +
> +		iommu = drhd->iommu;

Could perhaps be replaced with for_each_active_iommu(iommu).

> +
> +		iommu_flush_write_buffer(iommu);
> +
> +		iommu_set_root_entry(iommu);
> +
> +		iommu->flush.flush_context(iommu, 0, 0, 0, DMA_CCMD_GLOBAL_INVL,
> +					   0);
> +		iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH,
> +					 0);

just put those '0);' lines to the end of the call and ignore 
checkpatch.

> +		iommu_disable_protect_mem_regions(iommu);
> +		iommu_enable_translation(iommu);
> +	}
> +
> +	return 0;
> +}
> +
> +static void iommu_flush_all(void)
> +{
> +	struct dmar_drhd_unit *drhd;
> +	struct intel_iommu *iommu;
> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +
> +		iommu = drhd->iommu;
> +		iommu->flush.flush_context(iommu, 0, 0, 0, DMA_CCMD_GLOBAL_INVL,
> +					   0);
> +		iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH,
> +					 0);

ditto.

> +	}
> +}
> +
> +static u32 iommu_state[MAX_IOMMUS][MAX_IOMMU_REGS];

hm, this goes up to MAX_IOMMUS which you define to 32.

But dmar_drhd_unit registration in dmar_register_drhd_unit() is 
open-ended and has no such limit.

While i'm sure there's a limit in the spec, there's the chance for 
there to be more than 32 units enumerated (in future hardware) - and 
this will break silently.

It would be far cleaner to embedd the IOMMU state in the IOMMU 
driver data structure directly. That gets rid of the limit and it 
also makes this array allocated dynamically.

> +
> +static int iommu_suspend(struct sys_device *dev, pm_message_t state)
> +{
> +	struct dmar_drhd_unit *drhd;
> +	struct intel_iommu *iommu;
> +	unsigned long flag;
> +	int    i = 0;

(the tab before the 'i' looks a bit weird.)

> +
> +	iommu_flush_all();
> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +
> +		iommu = drhd->iommu;
> +
> +		iommu_disable_translation(iommu);
> +
> +		spin_lock_irqsave(&iommu->register_lock, flag);
> +
> +		iommu_state[i][DMAR_FECTL_REG] =
> +			(u32) readl(iommu->reg + DMAR_FECTL_REG);
> +		iommu_state[i][DMAR_FEDATA_REG] =
> +			(u32) readl(iommu->reg + DMAR_FEDATA_REG);
> +		iommu_state[i][DMAR_FEADDR_REG] =
> +			(u32) readl(iommu->reg + DMAR_FEADDR_REG);
> +		iommu_state[i][DMAR_FEUADDR_REG] =
> +			(u32) readl(iommu->reg + DMAR_FEUADDR_REG);

no need for the cast - readl returns u32.

> +
> +		spin_unlock_irqrestore(&iommu->register_lock, flag);
> +		i++;
> +	}
> +	return 0;
> +}
> +
> +static int iommu_resume(struct sys_device *dev)
> +{
> +	struct dmar_drhd_unit *drhd;
> +	struct intel_iommu *iommu;
> +	unsigned long flag;
> +	int i = 0;
> +
> +	if (init_iommu_hw())
> +		panic("IOMMU setup failed, DMAR can not start!\n");

Please dont panic boxes ... insert a WARN() and return.

> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +
> +		iommu = drhd->iommu;
> +
> +		spin_lock_irqsave(&iommu->register_lock, flag);
> +
> +		writel((u32) iommu_state[i][DMAR_FECTL_REG],
> +			iommu->reg + DMAR_FECTL_REG);
> +		writel((u32) iommu_state[i][DMAR_FEDATA_REG],
> +			iommu->reg + DMAR_FEDATA_REG);
> +		writel((u32) iommu_state[i][DMAR_FEADDR_REG],
> +			iommu->reg + DMAR_FEADDR_REG);
> +		writel((u32) iommu_state[i][DMAR_FEUADDR_REG],
> +			iommu->reg + DMAR_FEUADDR_REG);

there's no need for the (u32) case. iommu_state[] has u32 types, and 
writel takes u32.

> +
> +		spin_unlock_irqrestore(&iommu->register_lock, flag);
> +		i++;
> +	}
> +	return 0;
> +}
> +
> +static struct sysdev_class iommu_sysclass = {
> +	.name		= "iommu",
> +	.resume		= iommu_resume,
> +	.suspend	= iommu_suspend,
> +};
> +
> +static struct sys_device device_iommu = {
> +	.cls	= &iommu_sysclass,
> +};
> +
> +static int __init init_iommu_sysfs(void)
> +{
> +	int error;
> +
> +	error = sysdev_class_register(&iommu_sysclass);
> +	if (error)
> +		return error;
> +
> +	error = sysdev_register(&device_iommu);
> +	if (error)
> +		sysdev_class_unregister(&iommu_sysclass);
> +
> +	return error;
> +}
> +
> +#else
> +static init __init init_iommu_sysfs(void)
> +{
> +	return 0;
> +}
> +#endif	/* CONFIG_PM */
> +
>  int __init intel_iommu_init(void)
>  {
>  	int ret = 0;
> @@ -2598,6 +2755,7 @@ int __init intel_iommu_init(void)
>  	init_timer(&unmap_timer);
>  	force_iommu = 1;
>  	dma_ops = &intel_dma_ops;
> +	init_iommu_sysfs();
>  
>  	register_iommu(&intel_iommu_ops);
>  
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 1d6c71d..5ec836b 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -205,6 +205,9 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
>  /* low 64 bit */
>  #define dma_frcd_page_addr(d) (d & (((u64)-1) << PAGE_SHIFT))
>  
> +#define MAX_IOMMUS      32
> +#define MAX_IOMMU_REGS  0xc0

MAX_IOMMU_REGS ... would be nice to add some reference here where 
that limit comes from. Specification / documentation coordinates.

That will give those people who ever extend the hw side (and it will 
happen) a chance to notice this limit in the Linux kernel.

Also, is there a chance that MAX_IOMMU_REGS can be probed from the 
hardware directly? If that's possible and reliable then it would be 
nicer to make this limit dynamic. The registers are saved/restored 
straight and in a linear fashion - no reason to not make that 
extensible if we have a chance for it.

Thanks,

	Ingo

  reply	other threads:[~2009-03-25 20:13 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-25 18:45 [patch 0/2] Intel IOMMU Suspend/Resume Support fenghua.yu
2009-03-25 18:45 ` [patch 1/2] Intel IOMMU Suspend/Resume Support for DMAR fenghua.yu
2009-03-25 20:12   ` Ingo Molnar [this message]
2009-03-25 23:47     ` David Woodhouse
2009-03-26  9:58       ` Ingo Molnar
2009-03-25 18:45 ` [patch 2/2] Intel IOMMU Suspend/Resume Support for Queued Invalidation fenghua.yu
2009-03-25 20:28   ` Ingo Molnar
2009-03-25 20:53 ` [patch 0/2] Intel IOMMU Suspend/Resume Support Andrew Lutomirski
2009-03-25 20:57   ` Yu, Fenghua
2009-03-25 21:21     ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2009-03-24 19:58 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled Andrew Lutomirski
2009-03-24 20:32 ` Ingo Molnar
2009-03-24 22:36   ` [PATCH 1/2] Intel IOMMU Suspend/Resume Support for DMAR Fenghua Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090325201228.GA26204@elte.hu \
    --to=mingo@elte.hu \
    --cc=amluto@gmail.com \
    --cc=dwmw2@infradead.org \
    --cc=fenghua.yu@intel.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=kyle@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgross@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox