* [PATCH v2 0/2] iommu/vt-d: Fault logging improvements
@ 2016-03-17 20:12 Alex Williamson
2016-03-17 20:12 ` [PATCH v2 1/2] iommu/vt-d: Ratelimit fault handler Alex Williamson
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Alex Williamson @ 2016-03-17 20:12 UTC (permalink / raw)
To: iommu, dwmw2; +Cc: joe, joro, linux-kernel
Ratelimit and improve formatting.
v2:
 - Use a single ratelimit state as suggested by Joe Perches, except
   I chose to move it up to dmar_fault() so that it includes the
   "handling fault status reg" pr_err and we can avoid collecting
   entries for logging if we don't plan to print them.
 - Added reformatting changes suggested by Joe Perches.
 - While there is clearly more that could be done with disabling
   fault handling for specific context entries on storm and sending
   errors to drivers, this makes a marked improvement on its own.
Thanks,
Alex
---
Alex Williamson (2):
      iommu/vt-d: Ratelimit fault handler
      iommu/vt-d: Improve fault handler error messages

 drivers/iommu/dmar.c | 47 +++++++++++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 20 deletions(-)
* [PATCH v2 1/2] iommu/vt-d: Ratelimit fault handler
2016-03-17 20:12 [PATCH v2 0/2] iommu/vt-d: Fault logging improvements Alex Williamson
@ 2016-03-17 20:12 ` Alex Williamson
2016-03-17 20:33 ` Joe Perches
2016-03-17 20:12 ` [PATCH v2 2/2] iommu/vt-d: Improve fault handler error messages Alex Williamson
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Alex Williamson @ 2016-03-17 20:12 UTC (permalink / raw)
To: iommu, dwmw2; +Cc: joe, joro, linux-kernel
Fault rates can easily overwhelm the console and make the system
unresponsive. Ratelimit to allow an opportunity for maintenance.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
drivers/iommu/dmar.c | 33 ++++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 8ffd756..8f8bfff 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1602,10 +1602,17 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
 	int reg, fault_index;
 	u32 fault_status;
 	unsigned long flag;
+	bool ratelimited;
+	static DEFINE_RATELIMIT_STATE(rs,
+				      DEFAULT_RATELIMIT_INTERVAL,
+				      DEFAULT_RATELIMIT_BURST);
+
+	/* Disable printing, simply clear the fault when ratelimited */
+	ratelimited = !__ratelimit(&rs);
 
 	raw_spin_lock_irqsave(&iommu->register_lock, flag);
 	fault_status = readl(iommu->reg + DMAR_FSTS_REG);
-	if (fault_status)
+	if (fault_status && !ratelimited)
 		pr_err("DRHD: handling fault status reg %x\n", fault_status);
 
 	/* TBD: ignore advanced fault log currently */
@@ -1627,24 +1634,28 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
 		if (!(data & DMA_FRCD_F))
 			break;
 
-		fault_reason = dma_frcd_fault_reason(data);
-		type = dma_frcd_type(data);
+		if (!ratelimited) {
+			fault_reason = dma_frcd_fault_reason(data);
+			type = dma_frcd_type(data);
 
-		data = readl(iommu->reg + reg +
-				fault_index * PRIMARY_FAULT_REG_LEN + 8);
-		source_id = dma_frcd_source_id(data);
+			data = readl(iommu->reg + reg +
+				     fault_index * PRIMARY_FAULT_REG_LEN + 8);
+			source_id = dma_frcd_source_id(data);
+
+			guest_addr = dmar_readq(iommu->reg + reg +
+					fault_index * PRIMARY_FAULT_REG_LEN);
+			guest_addr = dma_frcd_page_addr(guest_addr);
+		}
 
-		guest_addr = dmar_readq(iommu->reg + reg +
-				fault_index * PRIMARY_FAULT_REG_LEN);
-		guest_addr = dma_frcd_page_addr(guest_addr);
 		/* clear the fault */
 		writel(DMA_FRCD_F, iommu->reg + reg +
 			fault_index * PRIMARY_FAULT_REG_LEN + 12);
 
 		raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
 
-		dmar_fault_do_one(iommu, type, fault_reason,
-				source_id, guest_addr);
+		if (!ratelimited)
+			dmar_fault_do_one(iommu, type, fault_reason,
+					  source_id, guest_addr);
 
 		fault_index++;
 		if (fault_index >= cap_num_fault_regs(iommu->cap))
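
The control flow this patch introduces can be modelled outside the kernel: decide once on entry whether this invocation may log, then walk the pending fault records, always clearing each one and decoding and printing it only when not ratelimited. What follows is a user-space sketch, not the driver code. `struct fault_rec`, `drain_faults()` and the `log_allowed` flag are hypothetical names, with the flag standing in for the result of `__ratelimit()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one primary fault recording register. */
struct fault_rec {
	bool pending;			/* models the DMA_FRCD_F bit */
	unsigned int source_id;
	unsigned long long addr;
};

/*
 * Model of the loop in dmar_fault(): every pending record is cleared,
 * but it is decoded and printed only when log_allowed is true.
 * Returns the number of records that were printed.
 */
static size_t drain_faults(struct fault_rec *recs, size_t n, bool log_allowed)
{
	size_t printed = 0;

	assert(recs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!recs[i].pending)
			break;	/* same early exit as the F-bit check */

		if (log_allowed) {
			printf("fault: source %04x addr %llx\n",
			       recs[i].source_id, recs[i].addr);
			printed++;
		}

		/* clear the fault unconditionally */
		recs[i].pending = false;
	}

	return printed;
}
```

The point of the sketch is that suppression only affects logging: a ratelimited call still leaves no record pending, so the hardware can keep reporting new faults.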
* [PATCH v2 2/2] iommu/vt-d: Improve fault handler error messages
2016-03-17 20:12 [PATCH v2 0/2] iommu/vt-d: Fault logging improvements Alex Williamson
2016-03-17 20:12 ` [PATCH v2 1/2] iommu/vt-d: Ratelimit fault handler Alex Williamson
@ 2016-03-17 20:12 ` Alex Williamson
2016-03-17 20:25 ` [PATCH v2 0/2] iommu/vt-d: Fault logging improvements Joe Perches
2016-04-05 14:20 ` Joerg Roedel
3 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2016-03-17 20:12 UTC (permalink / raw)
To: iommu, dwmw2; +Cc: joe, joro, linux-kernel
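Remove the embedded newline from the fault error messages so that each
fault is logged on a single line, and drop the explicit prefixes that
duplicate what pr_fmt already adds.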
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
drivers/iommu/dmar.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 8f8bfff..6a86b5d 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1579,18 +1579,14 @@ static int dmar_fault_do_one(struct intel_iommu *iommu, int type,
 	reason = dmar_get_fault_reason(fault_reason, &fault_type);
 
 	if (fault_type == INTR_REMAP)
-		pr_err("INTR-REMAP: Request device [[%02x:%02x.%d] "
-		       "fault index %llx\n"
-			"INTR-REMAP:[fault reason %02d] %s\n",
-			(source_id >> 8), PCI_SLOT(source_id & 0xFF),
+		pr_err("[INTR-REMAP] Request device [%02x:%02x.%d] fault index %llx [fault reason %02d] %s\n",
+			source_id >> 8, PCI_SLOT(source_id & 0xFF),
 			PCI_FUNC(source_id & 0xFF), addr >> 48,
 			fault_reason, reason);
 	else
-		pr_err("DMAR:[%s] Request device [%02x:%02x.%d] "
-		       "fault addr %llx \n"
-			"DMAR:[fault reason %02d] %s\n",
-			(type ? "DMA Read" : "DMA Write"),
-			(source_id >> 8), PCI_SLOT(source_id & 0xFF),
+		pr_err("[%s] Request device [%02x:%02x.%d] fault addr %llx [fault reason %02d] %s\n",
+			type ? "DMA Read" : "DMA Write",
+			source_id >> 8, PCI_SLOT(source_id & 0xFF),
 			PCI_FUNC(source_id & 0xFF), addr, fault_reason, reason);
 	return 0;
 }
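
For readers decoding these messages, the `[%02x:%02x.%d]` field is the 16-bit source ID split into PCI bus, slot and function. The sketch below reproduces that split in user space and builds the single-line DMA message. The `PCI_SLOT()`/`PCI_FUNC()` definitions mirror the kernel's; `struct bdf`, `decode_source_id()` and `format_dma_fault()` are hypothetical helpers that exist only for this illustration, and the prefix that pr_fmt adds is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same bit layout as the kernel's PCI_SLOT()/PCI_FUNC() helpers. */
#define PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)	((devfn) & 0x07)

/* Hypothetical container for a decoded source ID. */
struct bdf {
	unsigned int bus;
	unsigned int slot;
	unsigned int func;
};

/* Split a 16-bit source ID into bus (high byte) and devfn (low byte). */
static struct bdf decode_source_id(unsigned int source_id)
{
	struct bdf b = {
		.bus  = (source_id >> 8) & 0xFF,
		.slot = PCI_SLOT(source_id & 0xFF),
		.func = PCI_FUNC(source_id & 0xFF),
	};

	return b;
}

/*
 * Build the one-line DMA fault message in the layout used by the
 * reworked pr_err() call. Returns what snprintf() returns.
 */
static int format_dma_fault(char *buf, size_t len, bool is_read,
			    unsigned int source_id, unsigned long long addr,
			    int fault_reason, const char *reason)
{
	struct bdf b = decode_source_id(source_id);

	assert(buf != NULL && reason != NULL);

	/* %u is used for the function number because b.func is unsigned */
	return snprintf(buf, len,
			"[%s] Request device [%02x:%02x.%u] fault addr %llx [fault reason %02d] %s",
			is_read ? "DMA Read" : "DMA Write",
			b.bus, b.slot, b.func, addr, fault_reason, reason);
}
```

With everything on one line, a single `grep` on the device field returns the fault address and the reason together, which the two-line format did not allow.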
* Re: [PATCH v2 0/2] iommu/vt-d: Fault logging improvements
2016-03-17 20:12 [PATCH v2 0/2] iommu/vt-d: Fault logging improvements Alex Williamson
2016-03-17 20:12 ` [PATCH v2 1/2] iommu/vt-d: Ratelimit fault handler Alex Williamson
2016-03-17 20:12 ` [PATCH v2 2/2] iommu/vt-d: Improve fault handler error messages Alex Williamson
@ 2016-03-17 20:25 ` Joe Perches
2016-04-05 14:20 ` Joerg Roedel
3 siblings, 0 replies; 7+ messages in thread
From: Joe Perches @ 2016-03-17 20:25 UTC (permalink / raw)
To: Alex Williamson, iommu, dwmw2; +Cc: joro, linux-kernel
On Thu, 2016-03-17 at 14:12 -0600, Alex Williamson wrote:
> Ratelimit and improve formatting.
Makes sense, thanks.
* Re: [PATCH v2 1/2] iommu/vt-d: Ratelimit fault handler
2016-03-17 20:12 ` [PATCH v2 1/2] iommu/vt-d: Ratelimit fault handler Alex Williamson
@ 2016-03-17 20:33 ` Joe Perches
2016-03-17 20:46 ` Alex Williamson
0 siblings, 1 reply; 7+ messages in thread
From: Joe Perches @ 2016-03-17 20:33 UTC (permalink / raw)
To: Alex Williamson, iommu, dwmw2; +Cc: joro, linux-kernel
On Thu, 2016-03-17 at 14:12 -0600, Alex Williamson wrote:
> Fault rates can easily overwhelm the console and make the system
> unresponsive. Ratelimit to allow an opportunity for maintenance.
[]
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
[]
> @@ -1602,10 +1602,17 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
>  	int reg, fault_index;
>  	u32 fault_status;
>  	unsigned long flag;
> +	bool ratelimited;
> +	static DEFINE_RATELIMIT_STATE(rs,
> +				      DEFAULT_RATELIMIT_INTERVAL,
> +				      DEFAULT_RATELIMIT_BURST);
Are these the appropriate limits for dmar?
include/linux/ratelimit.h:#define DEFAULT_RATELIMIT_INTERVAL (5 * HZ)
include/linux/ratelimit.h:#define DEFAULT_RATELIMIT_BURST 10
* Re: [PATCH v2 1/2] iommu/vt-d: Ratelimit fault handler
2016-03-17 20:33 ` Joe Perches
@ 2016-03-17 20:46 ` Alex Williamson
0 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2016-03-17 20:46 UTC (permalink / raw)
To: Joe Perches; +Cc: iommu, dwmw2, joro, linux-kernel
On Thu, 17 Mar 2016 13:33:30 -0700
Joe Perches <joe@perches.com> wrote:
> On Thu, 2016-03-17 at 14:12 -0600, Alex Williamson wrote:
> > Fault rates can easily overwhelm the console and make the system
> > unresponsive. Ratelimit to allow an opportunity for maintenance.
> []
> > diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> []
> > @@ -1602,10 +1602,17 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
> >  	int reg, fault_index;
> >  	u32 fault_status;
> >  	unsigned long flag;
> > +	bool ratelimited;
> > +	static DEFINE_RATELIMIT_STATE(rs,
> > +				      DEFAULT_RATELIMIT_INTERVAL,
> > +				      DEFAULT_RATELIMIT_BURST);
>
> Are these the appropriate limits for dmar?
>
> include/linux/ratelimit.h:#define DEFAULT_RATELIMIT_INTERVAL (5 * HZ)
> include/linux/ratelimit.h:#define DEFAULT_RATELIMIT_BURST 10
They seem OK to me. I've got a test running that continuously generates
DMA read faults, and I get 20 lines of log every 5 seconds. That seems
like enough to know there's an issue, that it's ongoing, and maybe to
see some patterns in the fault addresses. I expect we could turn up the
burst value, but generally when I'm looking at the logs I'm only looking
for things like: is it a single target address, is it a sequential
address, or what's the general address space, so that I can tell whether
it should or should not be a valid fault address. Thanks,
Alex
* Re: [PATCH v2 0/2] iommu/vt-d: Fault logging improvements
2016-03-17 20:12 [PATCH v2 0/2] iommu/vt-d: Fault logging improvements Alex Williamson
` (2 preceding siblings ...)
2016-03-17 20:25 ` [PATCH v2 0/2] iommu/vt-d: Fault logging improvements Joe Perches
@ 2016-04-05 14:20 ` Joerg Roedel
3 siblings, 0 replies; 7+ messages in thread
From: Joerg Roedel @ 2016-04-05 14:20 UTC (permalink / raw)
To: Alex Williamson; +Cc: iommu, dwmw2, joe, linux-kernel
On Thu, Mar 17, 2016 at 02:12:19PM -0600, Alex Williamson wrote:
> Alex Williamson (2):
> iommu/vt-d: Ratelimit fault handler
> iommu/vt-d: Improve fault handler error messages
Applied, thanks Alex.