All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pranjal Shrivastava <praan@google.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Mostafa Saleh <smostafa@google.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	iommu@lists.linux.dev, Jason Gunthorpe <jgg@nvidia.com>
Subject: Re: [PATCH v4 2/3] iommu/arm-smmu-v3: Log better event records
Date: Fri, 1 Nov 2024 16:06:31 +0000	[thread overview]
Message-ID: <ZyT8hzuV_xnXgpf-@google.com> (raw)
In-Reply-To: <08498356-d892-4756-89e1-45b2654faa42@arm.com>

On Fri, Nov 01, 2024 at 03:05:24PM +0000, Robin Murphy wrote:
> On 2024-10-18 7:00 pm, Pranjal Shrivastava wrote:
> > Currently, the driver dumps the raw hex for a received event record.
> > Improve this by leveraging `struct arm_smmu_event` for event fields
> > and log human-readable event records with meaningful information.
> > 
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 ++++++++++++++++++--
> >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  13 +++
> >   2 files changed, 113 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 2f1108e5de51..4477cf86cb8e 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -83,6 +83,34 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
> >   	{ 0, NULL},
> >   };
> > +static const char * const event_str[] = {
> > +	/* Bad config events */
> > +	[EVT_ID_BAD_SID_CONFIG] = "C_BAD_STREAMID",
> > +	[EVT_ID_BAD_STE_CONFIG] = "C_BAD_STE",
> > +	[EVT_ID_BAD_CD_CONFIG] = "C_BAD_CD",
> > +	[EVT_ID_BAD_SSID_CONFIG] = "C_BAD_SUBSTREAMID",
> > +	[EVT_ID_STREAM_DISABLED] = "F_STREAM_DISABLED",
> > +
> > +	/* Bad translation events */
> > +	[EVT_ID_TRANSLATION_FAULT] = "F_TRANSLATION",
> > +	[EVT_ID_ADDR_SIZE_FAULT] = "F_ADDR_SIZE",
> > +	[EVT_ID_ACCESS_FAULT] = "F_ACCESS",
> > +	[EVT_ID_PERMISSION_FAULT] = "F_PERMISSION",
> > +
> > +	/* Bad fetch events */
> > +	[EVT_ID_STE_FETCH_FAULT] = "F_STE_FETCH",
> > +	[EVT_ID_CD_FETCH_FAULT] = "F_CD_FETCH",
> > +	[EVT_ID_VMS_FETCH_FAULT] = "F_VMS_FAULT",
> > +	[EVT_ID_MAX] = NULL,
> 
> "(null)" is not a particularly obvious or helpful description of an
> unknown/imp-def event.

We can either add all the event strings or have a null check before we
print, i.e. if the sting fetched is NULL we can print "UNKNOWN" or
something. I like the latter approach better, LMK what you think?

> 
> > +};
> > +
> > +static const char * const event_class_str[] = {
> > +	[0] = "CD fetch",
> > +	[1] = "Stage 1 translation table fetch",
> > +	[2] = "Input address caused fault ",
> 
> I'm wondering how to parse "caused fault" here, but either way it doesn't
> seem to line up with the other two descriptions.
> 

Hmm, this was added as per the review comments in v2 [1], it was decided
to keep the spec-defined strings for event_class as is, if we have a
strong preference against this, then maybe we can go ahead with the
following spec-defined abbreviations:

static const char * const class_str[] = {
	[0] = "CD",  // or "CD Fetch"
	[1] = "TTD", // or "S1 TT fetch"
	[2] = "IN",
	[3] = "RES",
};

> > +	[3] = "Reserved",
> > +};
> > +
> >   static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> >   				    struct arm_smmu_device *smmu, u32 flags);
> >   static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> > @@ -1756,6 +1784,60 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
> >   	return rb_entry(node, struct arm_smmu_stream, node)->master;
> >   }
> > +static void arm_smmu_dump_raw_event(struct arm_smmu_event *event)
> > +{
> > +	int i;
> > +	struct arm_smmu_device *smmu = event->smmu;
> > +
> > +	dev_err(smmu->dev, "event 0x%02x received: master %s:\n",
> 
> I'm mildly wary of automated parsing tools scanning for things like IOMMU
> faults, so I think it would be safest to leave any existing messages exactly
> as they are unless there's a particularly compelling reason to change them.
> And I don't believe that redundantly showing the same thing twice in one
> report is that.

Ack. Will leave this as is.

> 
> Arguably more personal preference, but as before with the v2 driver, I also
> think it seems neater and more logical to show the raw data first in the
> context of the notification, and then have the decode follow as a
> clarification.
> 

Ack. I'll log the raw event first and then the decoded message.
However, in the legacy log, we use dev_err for each dword which *may* be
interrupted by other logs, so should we have some message explaining
that the following is just a decoded version of the above log? 
For example:

[   85.411681] arm-smmu-v3 arm-smmu-v3.0.auto: event 0x13 received: master 0000:00:01.0:
[   85.412045] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000800000013
[   85.412347] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000020000000000
[   85.412630] arm-smmu-v3 arm-smmu-v3.0.auto:  0x00000000ffffc0d0
[   85.412921] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000000000000
[   85.410290] arm-smmu-v3 arm-smmu-v3.0.auto: event 0x13: F_PERMISSION
[   85.410290]  master: 0000:00:01.0 sid: 0x00000008.0x00000
[   85.410290]  iova = 0xffffc0d0 ipa = 0x0
[   85.410290]  Unpriv | Data | Write | S1 | Input address caused fault
[   85.410290]  STAG = 0x0 Stall = 0x0

Personally, I think the above log looks better..

> > +		event->id, event->master_name);
> > +
> > +	for (i = 0; i < EVTQ_ENT_DWORDS; ++i)
> > +		dev_err(smmu->dev, "\t0x%016llx\n", event->raw[i]);
> > +}
> > +
> > +static void arm_smmu_dump_event(struct arm_smmu_event *evt, struct ratelimit_state *rs)
> > +{
> > +	struct arm_smmu_device *smmu = evt->smmu;
> > +	char title[100] = {0};
> > +	char mastr[100] = {0};
> > +	char addrs[100] = {0};
> > +	char flags[100] = {0};
> > +	char other[50] = {0};
> > +
> > +	if (!__ratelimit(rs))
> > +		return;
> > +
> > +	snprintf(title, 100, "Unexpected event received: %s\n",	event_str[evt->id]);
> 
> ...that also makes it easier to avoid redundant waffle like this and look
> less like two separate things have happened.
> 

Ack. Please refer to the log above, I think it improves this as well..

> > +	snprintf(mastr, 100, "\tmaster: %s sid: 0x%08x.0x%05x\n",
> 
> TBH I find that StreamID format even harder to read than the raw event
> record itself :/
> 

Ouch! Nicolin didn't like this too :( 
I was just trying to follow the format used in the arm_smmu_handle_ppr
logs. Let's log the streamID and SSID separately then?
Maybe something like:

[   85.410290]  master: 0000:00:01.0 sid: 0x00000008 ssid: 0x00000

> > +		 evt->master_name, evt->sid, evt->ssid);
> > +
> > +	switch (evt->id) {
> > +	case EVT_ID_TRANSLATION_FAULT:
> > +	case EVT_ID_ADDR_SIZE_FAULT:
> > +	case EVT_ID_ACCESS_FAULT:
> > +	case EVT_ID_PERMISSION_FAULT:
> > +		snprintf(addrs, 100, "\tiova = %#llx ipa = %#llx\n", evt->iova, evt->ipa);
> > +		snprintf(other, 50, "\tSTAG = %#x Stall = %#x\n", evt->stag, evt->stall);
> > +		snprintf(flags, 100, "\t%s%s%s%s%s%s\n",
> > +			 evt->privileged ? "Priv | " : "Unpriv | ",
> > +			 evt->instruction ? "Inst | " : "Data | ",
> > +			 evt->read ? "Read | " : "Write | ",
> > +			 evt->s2 ? "S2 | " : "S1 | ", event_class_str[evt->class],
> > +			 evt->ttrnw_valid ? (evt->ttrnw ? "| TTD Read" : "| TTD Write") : "");
> 
> Why needlessly duplicate all the constant parts which could be in the format
> string itself?
> 

Umm.. I'm not sure I understand? Are you referring to the spaces and
the ' | ' characters?

> > +		break;
> > +
> > +	case EVT_ID_STE_FETCH_FAULT:
> > +	case EVT_ID_CD_FETCH_FAULT:
> > +	case EVT_ID_VMS_FETCH_FAULT:
> > +		snprintf(addrs, 100, "\tFetch address: %#llx\n", evt->ipa);
> > +		break;
> > +	}
> > +
> > +	dev_err(smmu->dev, "%s%s%s%s%s", title, mastr, addrs, flags, other);
> > +	arm_smmu_dump_raw_event(evt);
> > +}
> > +
> >   /* IRQ and event handlers */
> >   static int arm_smmu_handle_evt(struct arm_smmu_event *event)
> >   {
> > @@ -1819,6 +1901,8 @@ static int arm_smmu_handle_evt(struct arm_smmu_event *event)
> >   static void arm_smmu_get_event_from_raw(struct arm_smmu_device *smmu,
> >   					struct arm_smmu_event *event)
> >   {
> > +	struct arm_smmu_master *master;
> > +
> >   	/* Pick out the good stuff */
> >   	event->id = FIELD_GET(EVTQ_0_ID, event->raw[0]);
> >   	event->sid = FIELD_GET(EVTQ_0_SID, event->raw[0]);
> > @@ -1833,12 +1917,24 @@ static void arm_smmu_get_event_from_raw(struct arm_smmu_device *smmu,
> >   	event->class = FIELD_GET(EVTQ_1_CLASS, event->raw[1]);
> >   	event->iova = FIELD_GET(EVTQ_2_ADDR, event->raw[2]);
> >   	event->ipa = FIELD_GET(EVTQ_3_IPA, event->raw[3]);
> > +	event->ttrnw = FIELD_GET(EVTQ_1_TT_READ, event->raw[1]);
> > +	event->ttrnw_valid = false;
> >   	event->smmu = smmu;
> > +	event->dev = NULL;
> > +
> > +	if (event->id == EVT_ID_PERMISSION_FAULT)
> > +		event->ttrnw_valid = (event->class == EVTQ_1_CLASS_TT);
> > +
> > +	mutex_lock(&smmu->streams_mutex);
> > +	master = arm_smmu_find_master(smmu, event->sid);
> > +	if (master)
> > +		event->dev = get_device(master->dev);
> > +	mutex_unlock(&smmu->streams_mutex);
> > +	event->master_name = event->dev ? dev_name(event->dev) : "(unassigned sid)";
> 
> As I said before, this shouldn't be here; we've saved the device, so we can
> derive its name at the point where we actually need its name.

Ack, will move this within `arm_smmu_dump_event`

> 
> Thanks,
> Robin.
> 

Thanks,
Praan

[1] https://lore.kernel.org/linux-iommu/ZtZEFKnKasX6433r@Asurada-Nvidia/

> >   }
> >   static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> >   {
> > -	int i, ret;
> >   	struct arm_smmu_event evt;
> >   	struct arm_smmu_device *smmu = dev;
> >   	struct arm_smmu_queue *q = &smmu->evtq.q;
> > @@ -1850,15 +1946,10 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> >   		while (!queue_remove_raw(q, evt.raw)) {
> >   			arm_smmu_get_event_from_raw(smmu, &evt);
> > -			ret = arm_smmu_handle_evt(&evt);
> > -			if (!ret || !__ratelimit(&rs))
> > -				continue;
> > -
> > -			dev_info(smmu->dev, "event 0x%02x received:\n", evt.id);
> > -			for (i = 0; i < EVTQ_ENT_DWORDS; ++i)
> > -				dev_info(smmu->dev, "\t0x%016llx\n",
> > -					(unsigned long long)evt.raw[i]);
> > +			if (arm_smmu_handle_evt(&evt))
> > +				arm_smmu_dump_event(&evt, &rs);
> > +			put_device(evt.dev);
> >   			cond_resched();
> >   		}
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index 8a42d7b701fb..820b08f872af 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -438,10 +438,19 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
> >   #define EVTQ_0_ID			GENMASK_ULL(7, 0)
> >   /* Events */
> > +#define EVT_ID_BAD_SID_CONFIG		0x02
> > +#define EVT_ID_STE_FETCH_FAULT		0x03
> > +#define EVT_ID_BAD_STE_CONFIG		0x04
> > +#define EVT_ID_STREAM_DISABLED		0x06
> > +#define EVT_ID_BAD_SSID_CONFIG		0x08
> > +#define EVT_ID_CD_FETCH_FAULT		0x09
> > +#define EVT_ID_BAD_CD_CONFIG		0x0a
> >   #define EVT_ID_TRANSLATION_FAULT	0x10
> >   #define EVT_ID_ADDR_SIZE_FAULT		0x11
> >   #define EVT_ID_ACCESS_FAULT		0x12
> >   #define EVT_ID_PERMISSION_FAULT		0x13
> > +#define EVT_ID_VMS_FETCH_FAULT		0x25
> > +#define EVT_ID_MAX			0xff
> >   #define EVTQ_0_SSV			(1UL << 11)
> >   #define EVTQ_0_SSID			GENMASK_ULL(31, 12)
> > @@ -774,7 +783,9 @@ struct arm_smmu_stream {
> >   };
> >   struct arm_smmu_event {
> > +	const char			*master_name;
> >   	struct arm_smmu_device		*smmu;
> > +	struct device			*dev;
> >   	u8				id;
> >   	u8				class;
> >   	u16				stag;
> > @@ -789,6 +800,8 @@ struct arm_smmu_event {
> >   	bool				instruction;
> >   	bool				s2;
> >   	bool				read;
> > +	bool				ttrnw;
> > +	bool				ttrnw_valid;
> >   };
> >   /* SMMU private data for each master */
> 

  reply	other threads:[~2024-11-01 16:06 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-18 18:00 [PATCH v4 0/3] iommu/arm-smmu-v3: Parse out event records Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 1/3] iommu/arm-smmu-v3: Introduce struct arm_smmu_event Pranjal Shrivastava
2024-10-19  1:56   ` Nicolin Chen
2024-10-21  6:20     ` Pranjal Shrivastava
2024-10-24 13:11   ` Will Deacon
2024-10-24 14:20     ` Pranjal Shrivastava
2024-10-24 17:02     ` Pranjal Shrivastava
2024-10-24 17:03       ` Jason Gunthorpe
2024-10-24 17:37         ` Pranjal Shrivastava
2024-10-28 12:23           ` Jason Gunthorpe
2024-10-28 14:46             ` Pranjal Shrivastava
2024-11-04 17:23       ` Daniel Mentz
2024-11-04 18:16         ` Pranjal Shrivastava
2024-11-04 18:19           ` Pranjal Shrivastava
2024-11-01 14:41   ` Robin Murphy
2024-11-01 15:08     ` Pranjal Shrivastava
2024-11-04  5:25       ` Daniel Mentz
2024-11-04  8:31         ` Pranjal Shrivastava
2024-11-07  0:10           ` Daniel Mentz
2024-11-07 14:33             ` Pranjal Shrivastava
2024-11-07  0:16   ` Daniel Mentz
2024-11-07 14:57     ` Pranjal Shrivastava
2024-11-11 22:20       ` Daniel Mentz
2024-11-12  0:52         ` Pranjal Shrivastava
2024-11-12  4:01           ` Daniel Mentz
2024-11-12  8:12             ` Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 2/3] iommu/arm-smmu-v3: Log better event records Pranjal Shrivastava
2024-10-19  2:06   ` Nicolin Chen
2024-10-19  4:51     ` Nicolin Chen
2024-10-21  6:29       ` Pranjal Shrivastava
2024-10-21  6:26     ` Pranjal Shrivastava
2024-10-21 22:53       ` Nicolin Chen
2024-10-24 13:15   ` Will Deacon
2024-10-24 14:14     ` Pranjal Shrivastava
2024-10-29 18:53       ` Will Deacon
2024-10-29 19:59         ` Pranjal Shrivastava
2024-10-24 19:00     ` Nicolin Chen
2024-10-29 18:49       ` Will Deacon
2024-11-01 15:05   ` Robin Murphy
2024-11-01 16:06     ` Pranjal Shrivastava [this message]
2024-11-04  6:36   ` Daniel Mentz
2024-11-04 10:51     ` Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 3/3] iommu/arm-smmu-v3: Avoid redundant master lookup in events Pranjal Shrivastava
2024-10-19  2:08   ` Nicolin Chen
2024-10-19  1:45 ` [PATCH v4 0/3] iommu/arm-smmu-v3: Parse out event records Nicolin Chen
2024-10-21  6:33   ` Pranjal Shrivastava
2024-10-21 22:51     ` Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZyT8hzuV_xnXgpf-@google.com \
    --to=praan@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.