Linux IOMMU Development
 help / color / mirror / Atom feed
From: Pranjal Shrivastava <praan@google.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Mostafa Saleh <smostafa@google.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	iommu@lists.linux.dev, Jason Gunthorpe <jgg@nvidia.com>
Subject: Re: [PATCH v4 2/3] iommu/arm-smmu-v3: Log better event records
Date: Fri, 1 Nov 2024 16:06:31 +0000	[thread overview]
Message-ID: <ZyT8hzuV_xnXgpf-@google.com> (raw)
In-Reply-To: <08498356-d892-4756-89e1-45b2654faa42@arm.com>

On Fri, Nov 01, 2024 at 03:05:24PM +0000, Robin Murphy wrote:
> On 2024-10-18 7:00 pm, Pranjal Shrivastava wrote:
> > Currently, the driver dumps the raw hex for a received event record.
> > Improve this by leveraging `struct arm_smmu_event` for event fields
> > and log human-readable event records with meaningful information.
> > 
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 ++++++++++++++++++--
> >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  13 +++
> >   2 files changed, 113 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 2f1108e5de51..4477cf86cb8e 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -83,6 +83,34 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
> >   	{ 0, NULL},
> >   };
> > +static const char * const event_str[] = {
> > +	/* Bad config events */
> > +	[EVT_ID_BAD_SID_CONFIG] = "C_BAD_STREAMID",
> > +	[EVT_ID_BAD_STE_CONFIG] = "C_BAD_STE",
> > +	[EVT_ID_BAD_CD_CONFIG] = "C_BAD_CD",
> > +	[EVT_ID_BAD_SSID_CONFIG] = "C_BAD_SUBSTREAMID",
> > +	[EVT_ID_STREAM_DISABLED] = "F_STREAM_DISABLED",
> > +
> > +	/* Bad translation events */
> > +	[EVT_ID_TRANSLATION_FAULT] = "F_TRANSLATION",
> > +	[EVT_ID_ADDR_SIZE_FAULT] = "F_ADDR_SIZE",
> > +	[EVT_ID_ACCESS_FAULT] = "F_ACCESS",
> > +	[EVT_ID_PERMISSION_FAULT] = "F_PERMISSION",
> > +
> > +	/* Bad fetch events */
> > +	[EVT_ID_STE_FETCH_FAULT] = "F_STE_FETCH",
> > +	[EVT_ID_CD_FETCH_FAULT] = "F_CD_FETCH",
> > +	[EVT_ID_VMS_FETCH_FAULT] = "F_VMS_FAULT",
> > +	[EVT_ID_MAX] = NULL,
> 
> "(null)" is not a particularly obvious or helpful description of an
> unknown/imp-def event.

We can either add all the event strings or have a null check before we
print, i.e. if the sting fetched is NULL we can print "UNKNOWN" or
something. I like the latter approach better, LMK what you think?

> 
> > +};
> > +
> > +static const char * const event_class_str[] = {
> > +	[0] = "CD fetch",
> > +	[1] = "Stage 1 translation table fetch",
> > +	[2] = "Input address caused fault ",
> 
> I'm wondering how to parse "caused fault" here, but either way it doesn't
> seem to line up with the other two descriptions.
> 

Hmm, this was added as per the review comments in v2 [1], it was decided
to keep the spec-defined strings for event_class as is, if we have a
strong preference against this, then maybe we can go ahead with the
following spec-defined abbreviations:

static const char * const class_str[] = {
	[0] = "CD",  // or "CD Fetch"
	[1] = "TTD", // or "S1 TT fetch"
	[2] = "IN",
	[3] = "RES",
};

> > +	[3] = "Reserved",
> > +};
> > +
> >   static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> >   				    struct arm_smmu_device *smmu, u32 flags);
> >   static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> > @@ -1756,6 +1784,60 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
> >   	return rb_entry(node, struct arm_smmu_stream, node)->master;
> >   }
> > +static void arm_smmu_dump_raw_event(struct arm_smmu_event *event)
> > +{
> > +	int i;
> > +	struct arm_smmu_device *smmu = event->smmu;
> > +
> > +	dev_err(smmu->dev, "event 0x%02x received: master %s:\n",
> 
> I'm mildly wary of automated parsing tools scanning for things like IOMMU
> faults, so I think it would be safest to leave any existing messages exactly
> as they are unless there's a particularly compelling reason to change them.
> And I don't believe that redundantly showing the same thing twice in one
> report is that.

Ack. Will leave this as is.

> 
> Arguably more personal preference, but as before with the v2 driver, I also
> think it seems neater and more logical to show the raw data first in the
> context of the notification, and then have the decode follow as a
> clarification.
> 

Ack. I'll log the raw event first and then the decoded message.
However, in the legacy log, we use dev_err for each dword which *may* be
interrupted by other logs, so should we have some message explaining
that the following is just a decoded version of the above log? 
For example:

[   85.411681] arm-smmu-v3 arm-smmu-v3.0.auto: event 0x13 received: master 0000:00:01.0:
[   85.412045] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000800000013
[   85.412347] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000020000000000
[   85.412630] arm-smmu-v3 arm-smmu-v3.0.auto:  0x00000000ffffc0d0
[   85.412921] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000000000000
[   85.410290] arm-smmu-v3 arm-smmu-v3.0.auto: event 0x13: F_PERMISSION
[   85.410290]  master: 0000:00:01.0 sid: 0x00000008.0x00000
[   85.410290]  iova = 0xffffc0d0 ipa = 0x0
[   85.410290]  Unpriv | Data | Write | S1 | Input address caused fault
[   85.410290]  STAG = 0x0 Stall = 0x0

Personally, I think the above log looks better..

> > +		event->id, event->master_name);
> > +
> > +	for (i = 0; i < EVTQ_ENT_DWORDS; ++i)
> > +		dev_err(smmu->dev, "\t0x%016llx\n", event->raw[i]);
> > +}
> > +
> > +static void arm_smmu_dump_event(struct arm_smmu_event *evt, struct ratelimit_state *rs)
> > +{
> > +	struct arm_smmu_device *smmu = evt->smmu;
> > +	char title[100] = {0};
> > +	char mastr[100] = {0};
> > +	char addrs[100] = {0};
> > +	char flags[100] = {0};
> > +	char other[50] = {0};
> > +
> > +	if (!__ratelimit(rs))
> > +		return;
> > +
> > +	snprintf(title, 100, "Unexpected event received: %s\n",	event_str[evt->id]);
> 
> ...that also makes it easier to avoid redundant waffle like this and look
> less like two separate things have happened.
> 

Ack. Please refer to the log above, I think it improves this as well..

> > +	snprintf(mastr, 100, "\tmaster: %s sid: 0x%08x.0x%05x\n",
> 
> TBH I find that StreamID format even harder to read than the raw event
> record itself :/
> 

Ouch! Nicolin didn't like this too :( 
I was just trying to follow the format used in the arm_smmu_handle_ppr
logs. Let's log the streamID and SSID separately then?
Maybe something like:

[   85.410290]  master: 0000:00:01.0 sid: 0x00000008 ssid: 0x00000

> > +		 evt->master_name, evt->sid, evt->ssid);
> > +
> > +	switch (evt->id) {
> > +	case EVT_ID_TRANSLATION_FAULT:
> > +	case EVT_ID_ADDR_SIZE_FAULT:
> > +	case EVT_ID_ACCESS_FAULT:
> > +	case EVT_ID_PERMISSION_FAULT:
> > +		snprintf(addrs, 100, "\tiova = %#llx ipa = %#llx\n", evt->iova, evt->ipa);
> > +		snprintf(other, 50, "\tSTAG = %#x Stall = %#x\n", evt->stag, evt->stall);
> > +		snprintf(flags, 100, "\t%s%s%s%s%s%s\n",
> > +			 evt->privileged ? "Priv | " : "Unpriv | ",
> > +			 evt->instruction ? "Inst | " : "Data | ",
> > +			 evt->read ? "Read | " : "Write | ",
> > +			 evt->s2 ? "S2 | " : "S1 | ", event_class_str[evt->class],
> > +			 evt->ttrnw_valid ? (evt->ttrnw ? "| TTD Read" : "| TTD Write") : "");
> 
> Why needlessly duplicate all the constant parts which could be in the format
> string itself?
> 

Umm.. I'm not sure I understand? Are you referring to the spaces and
the ' | ' characters?

> > +		break;
> > +
> > +	case EVT_ID_STE_FETCH_FAULT:
> > +	case EVT_ID_CD_FETCH_FAULT:
> > +	case EVT_ID_VMS_FETCH_FAULT:
> > +		snprintf(addrs, 100, "\tFetch address: %#llx\n", evt->ipa);
> > +		break;
> > +	}
> > +
> > +	dev_err(smmu->dev, "%s%s%s%s%s", title, mastr, addrs, flags, other);
> > +	arm_smmu_dump_raw_event(evt);
> > +}
> > +
> >   /* IRQ and event handlers */
> >   static int arm_smmu_handle_evt(struct arm_smmu_event *event)
> >   {
> > @@ -1819,6 +1901,8 @@ static int arm_smmu_handle_evt(struct arm_smmu_event *event)
> >   static void arm_smmu_get_event_from_raw(struct arm_smmu_device *smmu,
> >   					struct arm_smmu_event *event)
> >   {
> > +	struct arm_smmu_master *master;
> > +
> >   	/* Pick out the good stuff */
> >   	event->id = FIELD_GET(EVTQ_0_ID, event->raw[0]);
> >   	event->sid = FIELD_GET(EVTQ_0_SID, event->raw[0]);
> > @@ -1833,12 +1917,24 @@ static void arm_smmu_get_event_from_raw(struct arm_smmu_device *smmu,
> >   	event->class = FIELD_GET(EVTQ_1_CLASS, event->raw[1]);
> >   	event->iova = FIELD_GET(EVTQ_2_ADDR, event->raw[2]);
> >   	event->ipa = FIELD_GET(EVTQ_3_IPA, event->raw[3]);
> > +	event->ttrnw = FIELD_GET(EVTQ_1_TT_READ, event->raw[1]);
> > +	event->ttrnw_valid = false;
> >   	event->smmu = smmu;
> > +	event->dev = NULL;
> > +
> > +	if (event->id == EVT_ID_PERMISSION_FAULT)
> > +		event->ttrnw_valid = (event->class == EVTQ_1_CLASS_TT);
> > +
> > +	mutex_lock(&smmu->streams_mutex);
> > +	master = arm_smmu_find_master(smmu, event->sid);
> > +	if (master)
> > +		event->dev = get_device(master->dev);
> > +	mutex_unlock(&smmu->streams_mutex);
> > +	event->master_name = event->dev ? dev_name(event->dev) : "(unassigned sid)";
> 
> As I said before, this shouldn't be here; we've saved the device, so we can
> derive its name at the point where we actually need its name.

Ack, will move this within `arm_smmu_dump_event`

> 
> Thanks,
> Robin.
> 

Thanks,
Praan

[1] https://lore.kernel.org/linux-iommu/ZtZEFKnKasX6433r@Asurada-Nvidia/

> >   }
> >   static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> >   {
> > -	int i, ret;
> >   	struct arm_smmu_event evt;
> >   	struct arm_smmu_device *smmu = dev;
> >   	struct arm_smmu_queue *q = &smmu->evtq.q;
> > @@ -1850,15 +1946,10 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> >   		while (!queue_remove_raw(q, evt.raw)) {
> >   			arm_smmu_get_event_from_raw(smmu, &evt);
> > -			ret = arm_smmu_handle_evt(&evt);
> > -			if (!ret || !__ratelimit(&rs))
> > -				continue;
> > -
> > -			dev_info(smmu->dev, "event 0x%02x received:\n", evt.id);
> > -			for (i = 0; i < EVTQ_ENT_DWORDS; ++i)
> > -				dev_info(smmu->dev, "\t0x%016llx\n",
> > -					(unsigned long long)evt.raw[i]);
> > +			if (arm_smmu_handle_evt(&evt))
> > +				arm_smmu_dump_event(&evt, &rs);
> > +			put_device(evt.dev);
> >   			cond_resched();
> >   		}
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index 8a42d7b701fb..820b08f872af 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -438,10 +438,19 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
> >   #define EVTQ_0_ID			GENMASK_ULL(7, 0)
> >   /* Events */
> > +#define EVT_ID_BAD_SID_CONFIG		0x02
> > +#define EVT_ID_STE_FETCH_FAULT		0x03
> > +#define EVT_ID_BAD_STE_CONFIG		0x04
> > +#define EVT_ID_STREAM_DISABLED		0x06
> > +#define EVT_ID_BAD_SSID_CONFIG		0x08
> > +#define EVT_ID_CD_FETCH_FAULT		0x09
> > +#define EVT_ID_BAD_CD_CONFIG		0x0a
> >   #define EVT_ID_TRANSLATION_FAULT	0x10
> >   #define EVT_ID_ADDR_SIZE_FAULT		0x11
> >   #define EVT_ID_ACCESS_FAULT		0x12
> >   #define EVT_ID_PERMISSION_FAULT		0x13
> > +#define EVT_ID_VMS_FETCH_FAULT		0x25
> > +#define EVT_ID_MAX			0xff
> >   #define EVTQ_0_SSV			(1UL << 11)
> >   #define EVTQ_0_SSID			GENMASK_ULL(31, 12)
> > @@ -774,7 +783,9 @@ struct arm_smmu_stream {
> >   };
> >   struct arm_smmu_event {
> > +	const char			*master_name;
> >   	struct arm_smmu_device		*smmu;
> > +	struct device			*dev;
> >   	u8				id;
> >   	u8				class;
> >   	u16				stag;
> > @@ -789,6 +800,8 @@ struct arm_smmu_event {
> >   	bool				instruction;
> >   	bool				s2;
> >   	bool				read;
> > +	bool				ttrnw;
> > +	bool				ttrnw_valid;
> >   };
> >   /* SMMU private data for each master */
> 

  reply	other threads:[~2024-11-01 16:06 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-18 18:00 [PATCH v4 0/3] iommu/arm-smmu-v3: Parse out event records Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 1/3] iommu/arm-smmu-v3: Introduce struct arm_smmu_event Pranjal Shrivastava
2024-10-19  1:56   ` Nicolin Chen
2024-10-21  6:20     ` Pranjal Shrivastava
2024-10-24 13:11   ` Will Deacon
2024-10-24 14:20     ` Pranjal Shrivastava
2024-10-24 17:02     ` Pranjal Shrivastava
2024-10-24 17:03       ` Jason Gunthorpe
2024-10-24 17:37         ` Pranjal Shrivastava
2024-10-28 12:23           ` Jason Gunthorpe
2024-10-28 14:46             ` Pranjal Shrivastava
2024-11-04 17:23       ` Daniel Mentz
2024-11-04 18:16         ` Pranjal Shrivastava
2024-11-04 18:19           ` Pranjal Shrivastava
2024-11-01 14:41   ` Robin Murphy
2024-11-01 15:08     ` Pranjal Shrivastava
2024-11-04  5:25       ` Daniel Mentz
2024-11-04  8:31         ` Pranjal Shrivastava
2024-11-07  0:10           ` Daniel Mentz
2024-11-07 14:33             ` Pranjal Shrivastava
2024-11-07  0:16   ` Daniel Mentz
2024-11-07 14:57     ` Pranjal Shrivastava
2024-11-11 22:20       ` Daniel Mentz
2024-11-12  0:52         ` Pranjal Shrivastava
2024-11-12  4:01           ` Daniel Mentz
2024-11-12  8:12             ` Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 2/3] iommu/arm-smmu-v3: Log better event records Pranjal Shrivastava
2024-10-19  2:06   ` Nicolin Chen
2024-10-19  4:51     ` Nicolin Chen
2024-10-21  6:29       ` Pranjal Shrivastava
2024-10-21  6:26     ` Pranjal Shrivastava
2024-10-21 22:53       ` Nicolin Chen
2024-10-24 13:15   ` Will Deacon
2024-10-24 14:14     ` Pranjal Shrivastava
2024-10-29 18:53       ` Will Deacon
2024-10-29 19:59         ` Pranjal Shrivastava
2024-10-24 19:00     ` Nicolin Chen
2024-10-29 18:49       ` Will Deacon
2024-11-01 15:05   ` Robin Murphy
2024-11-01 16:06     ` Pranjal Shrivastava [this message]
2024-11-04  6:36   ` Daniel Mentz
2024-11-04 10:51     ` Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 3/3] iommu/arm-smmu-v3: Avoid redundant master lookup in events Pranjal Shrivastava
2024-10-19  2:08   ` Nicolin Chen
2024-10-19  1:45 ` [PATCH v4 0/3] iommu/arm-smmu-v3: Parse out event records Nicolin Chen
2024-10-21  6:33   ` Pranjal Shrivastava
2024-10-21 22:51     ` Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZyT8hzuV_xnXgpf-@google.com \
    --to=praan@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox