From: Pranjal Shrivastava <praan@google.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
Mostafa Saleh <smostafa@google.com>,
Nicolin Chen <nicolinc@nvidia.com>,
iommu@lists.linux.dev, Jason Gunthorpe <jgg@nvidia.com>
Subject: Re: [PATCH v4 2/3] iommu/arm-smmu-v3: Log better event records
Date: Fri, 1 Nov 2024 16:06:31 +0000 [thread overview]
Message-ID: <ZyT8hzuV_xnXgpf-@google.com> (raw)
In-Reply-To: <08498356-d892-4756-89e1-45b2654faa42@arm.com>
On Fri, Nov 01, 2024 at 03:05:24PM +0000, Robin Murphy wrote:
> On 2024-10-18 7:00 pm, Pranjal Shrivastava wrote:
> > Currently, the driver dumps the raw hex for a received event record.
> > Improve this by leveraging `struct arm_smmu_event` for event fields
> > and log human-readable event records with meaningful information.
> >
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 ++++++++++++++++++--
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 13 +++
> > 2 files changed, 113 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 2f1108e5de51..4477cf86cb8e 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -83,6 +83,34 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
> > { 0, NULL},
> > };
> > +static const char * const event_str[] = {
> > + /* Bad config events */
> > + [EVT_ID_BAD_SID_CONFIG] = "C_BAD_STREAMID",
> > + [EVT_ID_BAD_STE_CONFIG] = "C_BAD_STE",
> > + [EVT_ID_BAD_CD_CONFIG] = "C_BAD_CD",
> > + [EVT_ID_BAD_SSID_CONFIG] = "C_BAD_SUBSTREAMID",
> > + [EVT_ID_STREAM_DISABLED] = "F_STREAM_DISABLED",
> > +
> > + /* Bad translation events */
> > + [EVT_ID_TRANSLATION_FAULT] = "F_TRANSLATION",
> > + [EVT_ID_ADDR_SIZE_FAULT] = "F_ADDR_SIZE",
> > + [EVT_ID_ACCESS_FAULT] = "F_ACCESS",
> > + [EVT_ID_PERMISSION_FAULT] = "F_PERMISSION",
> > +
> > + /* Bad fetch events */
> > + [EVT_ID_STE_FETCH_FAULT] = "F_STE_FETCH",
> > + [EVT_ID_CD_FETCH_FAULT] = "F_CD_FETCH",
> > + [EVT_ID_VMS_FETCH_FAULT] = "F_VMS_FAULT",
> > + [EVT_ID_MAX] = NULL,
>
> "(null)" is not a particularly obvious or helpful description of an
> unknown/imp-def event.
We can either add all the event strings or have a null check before we
print, i.e. if the sting fetched is NULL we can print "UNKNOWN" or
something. I like the latter approach better, LMK what you think?
>
> > +};
> > +
> > +static const char * const event_class_str[] = {
> > + [0] = "CD fetch",
> > + [1] = "Stage 1 translation table fetch",
> > + [2] = "Input address caused fault ",
>
> I'm wondering how to parse "caused fault" here, but either way it doesn't
> seem to line up with the other two descriptions.
>
Hmm, this was added as per the review comments in v2 [1], it was decided
to keep the spec-defined strings for event_class as is, if we have a
strong preference against this, then maybe we can go ahead with the
following spec-defined abbreviations:
static const char * const class_str[] = {
[0] = "CD", // or "CD Fetch"
[1] = "TTD", // or "S1 TT fetch"
[2] = "IN",
[3] = "RES",
};
> > + [3] = "Reserved",
> > +};
> > +
> > static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> > struct arm_smmu_device *smmu, u32 flags);
> > static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> > @@ -1756,6 +1784,60 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
> > return rb_entry(node, struct arm_smmu_stream, node)->master;
> > }
> > +static void arm_smmu_dump_raw_event(struct arm_smmu_event *event)
> > +{
> > + int i;
> > + struct arm_smmu_device *smmu = event->smmu;
> > +
> > + dev_err(smmu->dev, "event 0x%02x received: master %s:\n",
>
> I'm mildly wary of automated parsing tools scanning for things like IOMMU
> faults, so I think it would be safest to leave any existing messages exactly
> as they are unless there's a particularly compelling reason to change them.
> And I don't believe that redundantly showing the same thing twice in one
> report is that.
Ack. Will leave this as is.
>
> Arguably more personal preference, but as before with the v2 driver, I also
> think it seems neater and more logical to show the raw data first in the
> context of the notification, and then have the decode follow as a
> clarification.
>
Ack. I'll log the raw event first and then the decoded message.
However, in the legacy log, we use dev_err for each dword which *may* be
interrupted by other logs, so should we have some message explaining
that the following is just a decoded version of the above log?
For example:
[ 85.411681] arm-smmu-v3 arm-smmu-v3.0.auto: event 0x13 received: master 0000:00:01.0:
[ 85.412045] arm-smmu-v3 arm-smmu-v3.0.auto: 0x0000000800000013
[ 85.412347] arm-smmu-v3 arm-smmu-v3.0.auto: 0x0000020000000000
[ 85.412630] arm-smmu-v3 arm-smmu-v3.0.auto: 0x00000000ffffc0d0
[ 85.412921] arm-smmu-v3 arm-smmu-v3.0.auto: 0x0000000000000000
[ 85.410290] arm-smmu-v3 arm-smmu-v3.0.auto: event 0x13: F_PERMISSION
[ 85.410290] master: 0000:00:01.0 sid: 0x00000008.0x00000
[ 85.410290] iova = 0xffffc0d0 ipa = 0x0
[ 85.410290] Unpriv | Data | Write | S1 | Input address caused fault
[ 85.410290] STAG = 0x0 Stall = 0x0
Personally, I think the above log looks better..
> > + event->id, event->master_name);
> > +
> > + for (i = 0; i < EVTQ_ENT_DWORDS; ++i)
> > + dev_err(smmu->dev, "\t0x%016llx\n", event->raw[i]);
> > +}
> > +
> > +static void arm_smmu_dump_event(struct arm_smmu_event *evt, struct ratelimit_state *rs)
> > +{
> > + struct arm_smmu_device *smmu = evt->smmu;
> > + char title[100] = {0};
> > + char mastr[100] = {0};
> > + char addrs[100] = {0};
> > + char flags[100] = {0};
> > + char other[50] = {0};
> > +
> > + if (!__ratelimit(rs))
> > + return;
> > +
> > + snprintf(title, 100, "Unexpected event received: %s\n", event_str[evt->id]);
>
> ...that also makes it easier to avoid redundant waffle like this and look
> less like two separate things have happened.
>
Ack. Please refer to the log above, I think it improves this as well..
> > + snprintf(mastr, 100, "\tmaster: %s sid: 0x%08x.0x%05x\n",
>
> TBH I find that StreamID format even harder to read than the raw event
> record itself :/
>
Ouch! Nicolin didn't like this too :(
I was just trying to follow the format used in the arm_smmu_handle_ppr
logs. Let's log the streamID and SSID separately then?
Maybe something like:
[ 85.410290] master: 0000:00:01.0 sid: 0x00000008 ssid: 0x00000
> > + evt->master_name, evt->sid, evt->ssid);
> > +
> > + switch (evt->id) {
> > + case EVT_ID_TRANSLATION_FAULT:
> > + case EVT_ID_ADDR_SIZE_FAULT:
> > + case EVT_ID_ACCESS_FAULT:
> > + case EVT_ID_PERMISSION_FAULT:
> > + snprintf(addrs, 100, "\tiova = %#llx ipa = %#llx\n", evt->iova, evt->ipa);
> > + snprintf(other, 50, "\tSTAG = %#x Stall = %#x\n", evt->stag, evt->stall);
> > + snprintf(flags, 100, "\t%s%s%s%s%s%s\n",
> > + evt->privileged ? "Priv | " : "Unpriv | ",
> > + evt->instruction ? "Inst | " : "Data | ",
> > + evt->read ? "Read | " : "Write | ",
> > + evt->s2 ? "S2 | " : "S1 | ", event_class_str[evt->class],
> > + evt->ttrnw_valid ? (evt->ttrnw ? "| TTD Read" : "| TTD Write") : "");
>
> Why needlessly duplicate all the constant parts which could be in the format
> string itself?
>
Umm.. I'm not sure I understand? Are you referring to the spaces and
the ' | ' characters?
> > + break;
> > +
> > + case EVT_ID_STE_FETCH_FAULT:
> > + case EVT_ID_CD_FETCH_FAULT:
> > + case EVT_ID_VMS_FETCH_FAULT:
> > + snprintf(addrs, 100, "\tFetch address: %#llx\n", evt->ipa);
> > + break;
> > + }
> > +
> > + dev_err(smmu->dev, "%s%s%s%s%s", title, mastr, addrs, flags, other);
> > + arm_smmu_dump_raw_event(evt);
> > +}
> > +
> > /* IRQ and event handlers */
> > static int arm_smmu_handle_evt(struct arm_smmu_event *event)
> > {
> > @@ -1819,6 +1901,8 @@ static int arm_smmu_handle_evt(struct arm_smmu_event *event)
> > static void arm_smmu_get_event_from_raw(struct arm_smmu_device *smmu,
> > struct arm_smmu_event *event)
> > {
> > + struct arm_smmu_master *master;
> > +
> > /* Pick out the good stuff */
> > event->id = FIELD_GET(EVTQ_0_ID, event->raw[0]);
> > event->sid = FIELD_GET(EVTQ_0_SID, event->raw[0]);
> > @@ -1833,12 +1917,24 @@ static void arm_smmu_get_event_from_raw(struct arm_smmu_device *smmu,
> > event->class = FIELD_GET(EVTQ_1_CLASS, event->raw[1]);
> > event->iova = FIELD_GET(EVTQ_2_ADDR, event->raw[2]);
> > event->ipa = FIELD_GET(EVTQ_3_IPA, event->raw[3]);
> > + event->ttrnw = FIELD_GET(EVTQ_1_TT_READ, event->raw[1]);
> > + event->ttrnw_valid = false;
> > event->smmu = smmu;
> > + event->dev = NULL;
> > +
> > + if (event->id == EVT_ID_PERMISSION_FAULT)
> > + event->ttrnw_valid = (event->class == EVTQ_1_CLASS_TT);
> > +
> > + mutex_lock(&smmu->streams_mutex);
> > + master = arm_smmu_find_master(smmu, event->sid);
> > + if (master)
> > + event->dev = get_device(master->dev);
> > + mutex_unlock(&smmu->streams_mutex);
> > + event->master_name = event->dev ? dev_name(event->dev) : "(unassigned sid)";
>
> As I said before, this shouldn't be here; we've saved the device, so we can
> derive its name at the point where we actually need its name.
Ack, will move this within `arm_smmu_dump_event`
>
> Thanks,
> Robin.
>
Thanks,
Praan
[1] https://lore.kernel.org/linux-iommu/ZtZEFKnKasX6433r@Asurada-Nvidia/
> > }
> > static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> > {
> > - int i, ret;
> > struct arm_smmu_event evt;
> > struct arm_smmu_device *smmu = dev;
> > struct arm_smmu_queue *q = &smmu->evtq.q;
> > @@ -1850,15 +1946,10 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> > while (!queue_remove_raw(q, evt.raw)) {
> > arm_smmu_get_event_from_raw(smmu, &evt);
> > - ret = arm_smmu_handle_evt(&evt);
> > - if (!ret || !__ratelimit(&rs))
> > - continue;
> > -
> > - dev_info(smmu->dev, "event 0x%02x received:\n", evt.id);
> > - for (i = 0; i < EVTQ_ENT_DWORDS; ++i)
> > - dev_info(smmu->dev, "\t0x%016llx\n",
> > - (unsigned long long)evt.raw[i]);
> > + if (arm_smmu_handle_evt(&evt))
> > + arm_smmu_dump_event(&evt, &rs);
> > + put_device(evt.dev);
> > cond_resched();
> > }
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index 8a42d7b701fb..820b08f872af 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -438,10 +438,19 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
> > #define EVTQ_0_ID GENMASK_ULL(7, 0)
> > /* Events */
> > +#define EVT_ID_BAD_SID_CONFIG 0x02
> > +#define EVT_ID_STE_FETCH_FAULT 0x03
> > +#define EVT_ID_BAD_STE_CONFIG 0x04
> > +#define EVT_ID_STREAM_DISABLED 0x06
> > +#define EVT_ID_BAD_SSID_CONFIG 0x08
> > +#define EVT_ID_CD_FETCH_FAULT 0x09
> > +#define EVT_ID_BAD_CD_CONFIG 0x0a
> > #define EVT_ID_TRANSLATION_FAULT 0x10
> > #define EVT_ID_ADDR_SIZE_FAULT 0x11
> > #define EVT_ID_ACCESS_FAULT 0x12
> > #define EVT_ID_PERMISSION_FAULT 0x13
> > +#define EVT_ID_VMS_FETCH_FAULT 0x25
> > +#define EVT_ID_MAX 0xff
> > #define EVTQ_0_SSV (1UL << 11)
> > #define EVTQ_0_SSID GENMASK_ULL(31, 12)
> > @@ -774,7 +783,9 @@ struct arm_smmu_stream {
> > };
> > struct arm_smmu_event {
> > + const char *master_name;
> > struct arm_smmu_device *smmu;
> > + struct device *dev;
> > u8 id;
> > u8 class;
> > u16 stag;
> > @@ -789,6 +800,8 @@ struct arm_smmu_event {
> > bool instruction;
> > bool s2;
> > bool read;
> > + bool ttrnw;
> > + bool ttrnw_valid;
> > };
> > /* SMMU private data for each master */
>
next prev parent reply other threads:[~2024-11-01 16:06 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-18 18:00 [PATCH v4 0/3] iommu/arm-smmu-v3: Parse out event records Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 1/3] iommu/arm-smmu-v3: Introduce struct arm_smmu_event Pranjal Shrivastava
2024-10-19 1:56 ` Nicolin Chen
2024-10-21 6:20 ` Pranjal Shrivastava
2024-10-24 13:11 ` Will Deacon
2024-10-24 14:20 ` Pranjal Shrivastava
2024-10-24 17:02 ` Pranjal Shrivastava
2024-10-24 17:03 ` Jason Gunthorpe
2024-10-24 17:37 ` Pranjal Shrivastava
2024-10-28 12:23 ` Jason Gunthorpe
2024-10-28 14:46 ` Pranjal Shrivastava
2024-11-04 17:23 ` Daniel Mentz
2024-11-04 18:16 ` Pranjal Shrivastava
2024-11-04 18:19 ` Pranjal Shrivastava
2024-11-01 14:41 ` Robin Murphy
2024-11-01 15:08 ` Pranjal Shrivastava
2024-11-04 5:25 ` Daniel Mentz
2024-11-04 8:31 ` Pranjal Shrivastava
2024-11-07 0:10 ` Daniel Mentz
2024-11-07 14:33 ` Pranjal Shrivastava
2024-11-07 0:16 ` Daniel Mentz
2024-11-07 14:57 ` Pranjal Shrivastava
2024-11-11 22:20 ` Daniel Mentz
2024-11-12 0:52 ` Pranjal Shrivastava
2024-11-12 4:01 ` Daniel Mentz
2024-11-12 8:12 ` Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 2/3] iommu/arm-smmu-v3: Log better event records Pranjal Shrivastava
2024-10-19 2:06 ` Nicolin Chen
2024-10-19 4:51 ` Nicolin Chen
2024-10-21 6:29 ` Pranjal Shrivastava
2024-10-21 6:26 ` Pranjal Shrivastava
2024-10-21 22:53 ` Nicolin Chen
2024-10-24 13:15 ` Will Deacon
2024-10-24 14:14 ` Pranjal Shrivastava
2024-10-29 18:53 ` Will Deacon
2024-10-29 19:59 ` Pranjal Shrivastava
2024-10-24 19:00 ` Nicolin Chen
2024-10-29 18:49 ` Will Deacon
2024-11-01 15:05 ` Robin Murphy
2024-11-01 16:06 ` Pranjal Shrivastava [this message]
2024-11-04 6:36 ` Daniel Mentz
2024-11-04 10:51 ` Pranjal Shrivastava
2024-10-18 18:00 ` [PATCH v4 3/3] iommu/arm-smmu-v3: Avoid redundant master lookup in events Pranjal Shrivastava
2024-10-19 2:08 ` Nicolin Chen
2024-10-19 1:45 ` [PATCH v4 0/3] iommu/arm-smmu-v3: Parse out event records Nicolin Chen
2024-10-21 6:33 ` Pranjal Shrivastava
2024-10-21 22:51 ` Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZyT8hzuV_xnXgpf-@google.com \
--to=praan@google.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=nicolinc@nvidia.com \
--cc=robin.murphy@arm.com \
--cc=smostafa@google.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox