* [PATCH v2 0/6] drm/i915: expose RCS topology to userspace
@ 2018-01-11 19:53 Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 1/6] drm/i915: store all subslice masks Lionel Landwerlin
` (7 more replies)
0 siblings, 8 replies; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-11 19:53 UTC (permalink / raw)
To: intel-gfx
Hi all,
Tvrtko found a few bugs in the previous iteration of this series and
also made quite a few recommendations on the stored internal state as
well uapi, so here is an update. Details in the patches.
Thanks a lot,
Lionel Landwerlin (6):
drm/i915: store all subslice masks
drm/i915/debugfs: reuse max slice/subslices already stored in sseu
drm/i915/debugfs: add rcs topology entry
drm/i915: add rcs topology to error state
drm/i915: add query uAPI
drm/i915: expose rcs topology through query uAPI
drivers/gpu/drm/i915/Makefile | 1 +
drivers/gpu/drm/i915/i915_debugfs.c | 94 +++++++++++----
drivers/gpu/drm/i915/i915_drv.c | 3 +-
drivers/gpu/drm/i915/i915_drv.h | 3 +
drivers/gpu/drm/i915/i915_gpu_error.c | 40 +++++++
drivers/gpu/drm/i915/i915_query.c | 184 +++++++++++++++++++++++++++++
drivers/gpu/drm/i915/intel_device_info.c | 196 +++++++++++++++++++++++--------
drivers/gpu/drm/i915/intel_device_info.h | 36 +++++-
drivers/gpu/drm/i915/intel_lrc.c | 2 +-
drivers/gpu/drm/i915/intel_query_info.c | 88 ++++++++++++++
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
include/uapi/drm/i915_drm.h | 82 +++++++++++++
12 files changed, 652 insertions(+), 79 deletions(-)
create mode 100644 drivers/gpu/drm/i915/i915_query.c
create mode 100644 drivers/gpu/drm/i915/intel_query_info.c
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
@ 2018-01-11 19:53 ` Lionel Landwerlin
2018-01-12 10:15 ` Tvrtko Ursulin
2018-01-11 19:53 ` [PATCH v2 2/6] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
` (6 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-11 19:53 UTC (permalink / raw)
To: intel-gfx
Up to now, subslice mask was assumed to be uniform across slices. But
starting with Cannonlake, slices can be asymmetric (for example slice0
has different number of subslices as slice1+). This change stores all
subslices masks for all slices rather than having a single mask that
applies to all slices.
v2: Rework how we store total numbers in sseu_dev_info (Tvrtko)
Fix CHV eu masks, was reading disabled as enabled (Tvrtko)
Readability changes (Tvrtko)
Add EU index helper (Tvrtko)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 25 ++--
drivers/gpu/drm/i915/i915_drv.c | 2 +-
drivers/gpu/drm/i915/intel_device_info.c | 196 +++++++++++++++++++++++--------
drivers/gpu/drm/i915/intel_device_info.h | 36 +++++-
drivers/gpu/drm/i915/intel_lrc.c | 2 +-
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
6 files changed, 200 insertions(+), 63 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2bb63073d73f..463029f72a0b 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4285,7 +4285,7 @@ static void cherryview_sseu_device_status(struct drm_i915_private *dev_priv,
continue;
sseu->slice_mask = BIT(0);
- sseu->subslice_mask |= BIT(ss);
+ sseu->subslice_mask[0] |= BIT(ss);
eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
@@ -4332,7 +4332,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
continue;
sseu->slice_mask |= BIT(s);
- sseu->subslice_mask = info->sseu.subslice_mask;
+ sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
for (ss = 0; ss < ss_max; ss++) {
unsigned int eu_cnt;
@@ -4387,8 +4387,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
sseu->slice_mask |= BIT(s);
if (IS_GEN9_BC(dev_priv))
- sseu->subslice_mask =
- INTEL_INFO(dev_priv)->sseu.subslice_mask;
+ sseu->subslice_mask[s] =
+ INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
for (ss = 0; ss < ss_max; ss++) {
unsigned int eu_cnt;
@@ -4398,7 +4398,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
/* skip disabled subslice */
continue;
- sseu->subslice_mask |= BIT(ss);
+ sseu->subslice_mask[s] |= BIT(ss);
}
eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
@@ -4420,9 +4420,12 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
if (sseu->slice_mask) {
- sseu->subslice_mask = INTEL_INFO(dev_priv)->sseu.subslice_mask;
sseu->eu_per_subslice =
INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
+ for (s = 0; s < fls(sseu->slice_mask); s++) {
+ sseu->subslice_mask[s] =
+ INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
+ }
sseu->eu_total = sseu->eu_per_subslice *
sseu_subslice_total(sseu);
@@ -4441,6 +4444,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
{
struct drm_i915_private *dev_priv = node_to_i915(m->private);
const char *type = is_available_info ? "Available" : "Enabled";
+ int s;
seq_printf(m, " %s Slice Mask: %04x\n", type,
sseu->slice_mask);
@@ -4448,10 +4452,11 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
hweight8(sseu->slice_mask));
seq_printf(m, " %s Subslice Total: %u\n", type,
sseu_subslice_total(sseu));
- seq_printf(m, " %s Subslice Mask: %04x\n", type,
- sseu->subslice_mask);
- seq_printf(m, " %s Subslice Per Slice: %u\n", type,
- hweight8(sseu->subslice_mask));
+ for (s = 0; s < fls(sseu->slice_mask); s++) {
+ seq_printf(m, " %s Slice%i %u subslices, mask=%04x\n", type,
+ s, hweight8(sseu->subslice_mask[s]),
+ sseu->subslice_mask[s]);
+ }
seq_printf(m, " %s EU Total: %u\n", type,
sseu->eu_total);
seq_printf(m, " %s EU Per Subslice: %u\n", type,
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6c8da9d20c33..969835d3cbcd 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -414,7 +414,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
return -ENODEV;
break;
case I915_PARAM_SUBSLICE_MASK:
- value = INTEL_INFO(dev_priv)->sseu.subslice_mask;
+ value = INTEL_INFO(dev_priv)->sseu.subslice_mask[0];
if (!value)
return -ENODEV;
break;
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index d28592e43512..0bf6ef51cdb7 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -80,12 +80,17 @@ void intel_device_info_dump_flags(const struct intel_device_info *info,
static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
{
+ int s;
+
drm_printf(p, "slice mask: %04x\n", sseu->slice_mask);
drm_printf(p, "slice total: %u\n", hweight8(sseu->slice_mask));
drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
- drm_printf(p, "subslice mask %04x\n", sseu->subslice_mask);
- drm_printf(p, "subslice per slice: %u\n",
- hweight8(sseu->subslice_mask));
+ for (s = 0; s < ARRAY_SIZE(sseu->subslice_mask); s++) {
+ drm_printf(p, "slice%d subslice mask %04x\n",
+ s, sseu->subslice_mask[s]);
+ drm_printf(p, "slice%d subslice per slice: %u\n",
+ s, hweight8(sseu->subslice_mask[s]));
+ }
drm_printf(p, "EU total: %u\n", sseu->eu_total);
drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
drm_printf(p, "has slice power gating: %s\n",
@@ -119,22 +124,88 @@ void intel_device_info_dump(const struct intel_device_info *info,
intel_device_info_dump_flags(info, p);
}
+static u16 compute_eu_total(const struct sseu_dev_info *sseu)
+{
+ u16 i, total = 0;
+
+ for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++)
+ total += hweight8(sseu->eu_mask[i]);
+
+ return total;
+}
+
+static u16 compute_subslice_total(const struct sseu_dev_info *sseu)
+{
+ u16 i, total = 0;
+
+ for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
+ total += hweight8(sseu->subslice_mask[i]);
+
+ return total;
+}
+
static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
{
struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
const u32 fuse2 = I915_READ(GEN8_FUSE2);
+ int s, ss;
+ const int eu_mask = 0xff;
+ u32 subslice_mask, eu_en;
sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
GEN10_F2_S_ENA_SHIFT;
- sseu->subslice_mask = (1 << 4) - 1;
- sseu->subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
- GEN10_F2_SS_DIS_SHIFT);
+ sseu->max_slices = 6;
+ sseu->max_subslices = 4;
+ sseu->max_eus_per_subslice = 8;
- sseu->eu_total = hweight32(~I915_READ(GEN8_EU_DISABLE0));
- sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE1));
- sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE2));
- sseu->eu_total += hweight8(~(I915_READ(GEN10_EU_DISABLE3) &
- GEN10_EU_DIS_SS_MASK));
+ subslice_mask = (1 << 4) - 1;
+ subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
+ GEN10_F2_SS_DIS_SHIFT);
+
+ /* Slice0 can have up to 3 subslices, but there are only 2 in
+ * slice1/2.
+ */
+ sseu->subslice_mask[0] = subslice_mask;
+ for (s = 1; s < sseu->max_slices; s++)
+ sseu->subslice_mask[s] = subslice_mask & 0x3;
+
+ /* Slice0 */
+ eu_en = ~I915_READ(GEN8_EU_DISABLE0);
+ for (ss = 0; ss < sseu->max_subslices; ss++) {
+ sseu->eu_mask[sseu_eu_idx(sseu, 0, ss, 0)] =
+ (eu_en >> (8 * ss)) & eu_mask;
+ }
+ /* Slice1 */
+ sseu->eu_mask[sseu_eu_idx(sseu, 1, 0, 0)] = (eu_en >> 24) & eu_mask;
+ eu_en = ~I915_READ(GEN8_EU_DISABLE1);
+ sseu->eu_mask[sseu_eu_idx(sseu, 1, 1, 0)] = eu_en & eu_mask;
+ /* Slice2 */
+ sseu->eu_mask[sseu_eu_idx(sseu, 2, 0, 0)] = (eu_en >> 8) & eu_mask;
+ sseu->eu_mask[sseu_eu_idx(sseu, 2, 1, 0)] = (eu_en >> 16) & eu_mask;
+ /* Slice3 */
+ sseu->eu_mask[sseu_eu_idx(sseu, 3, 0, 0)] = (eu_en >> 24) & eu_mask;
+ eu_en = ~I915_READ(GEN8_EU_DISABLE2);
+ sseu->eu_mask[sseu_eu_idx(sseu, 3, 1, 0)] = eu_en & eu_mask;
+ /* Slice4 */
+ sseu->eu_mask[sseu_eu_idx(sseu, 4, 0, 0)] = (eu_en >> 8) & eu_mask;
+ sseu->eu_mask[sseu_eu_idx(sseu, 4, 1, 0)] = (eu_en >> 16) & eu_mask;
+ /* Slice5 */
+ sseu->eu_mask[sseu_eu_idx(sseu, 5, 0, 0)] = (eu_en >> 24) & eu_mask;
+ eu_en = ~I915_READ(GEN10_EU_DISABLE3);
+ sseu->eu_mask[sseu_eu_idx(sseu, 5, 1, 0)] = eu_en & eu_mask;
+
+ /* Do a second pass where we marked the subslices disabled if all
+ * their eus are off.
+ */
+ for (s = 0; s < sseu->max_slices; s++) {
+ for (ss = 0; ss < sseu->max_subslices; ss++) {
+ if (sseu_eu_mask(sseu, s, ss, 0) == 0)
+ sseu->subslice_mask[s] &= ~BIT(ss);
+ }
+ }
+
+ sseu->subslice_total = compute_subslice_total(sseu);
+ sseu->eu_total = compute_eu_total(sseu);
/*
* CNL is expected to always have a uniform distribution
@@ -155,26 +226,31 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
{
struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
- u32 fuse, eu_dis;
+ u32 fuse;
fuse = I915_READ(CHV_FUSE_GT);
sseu->slice_mask = BIT(0);
+ sseu->max_slices = 1;
+ sseu->max_subslices = 2;
+ sseu->max_eus_per_subslice = 8;
if (!(fuse & CHV_FGT_DISABLE_SS0)) {
- sseu->subslice_mask |= BIT(0);
- eu_dis = fuse & (CHV_FGT_EU_DIS_SS0_R0_MASK |
- CHV_FGT_EU_DIS_SS0_R1_MASK);
- sseu->eu_total += 8 - hweight32(eu_dis);
+ sseu->subslice_mask[0] |= BIT(0);
+ sseu->eu_mask[0] = ~((fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT);
+ sseu->eu_mask[0] |= ~(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
+ sseu->subslice_mask[0] = 1;
}
if (!(fuse & CHV_FGT_DISABLE_SS1)) {
- sseu->subslice_mask |= BIT(1);
- eu_dis = fuse & (CHV_FGT_EU_DIS_SS1_R0_MASK |
- CHV_FGT_EU_DIS_SS1_R1_MASK);
- sseu->eu_total += 8 - hweight32(eu_dis);
+ sseu->subslice_mask[0] |= BIT(1);
+ sseu->eu_mask[1] = ~((fuse & CHV_FGT_EU_DIS_SS1_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT);
+ sseu->eu_mask[1] |= ~(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
}
+ sseu->subslice_total = compute_subslice_total(sseu);
+ sseu->eu_total = compute_eu_total(sseu);
+
/*
* CHV expected to always have a uniform distribution of EU
* across subslices.
@@ -196,41 +272,53 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
{
struct intel_device_info *info = mkwrite_device_info(dev_priv);
struct sseu_dev_info *sseu = &info->sseu;
- int s_max = 3, ss_max = 4, eu_max = 8;
int s, ss;
- u32 fuse2, eu_disable;
- u8 eu_mask = 0xff;
+ u32 fuse2, eu_disable, subslice_mask;
+ const u8 eu_mask = 0xff;
fuse2 = I915_READ(GEN8_FUSE2);
sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
+ /* BXT has a single slice and at most 3 subslices. */
+ sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
+ sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
+ sseu->max_eus_per_subslice = 8;
+
/*
* The subslice disable field is global, i.e. it applies
* to each of the enabled slices.
*/
- sseu->subslice_mask = (1 << ss_max) - 1;
- sseu->subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
- GEN9_F2_SS_DIS_SHIFT);
+ subslice_mask = (1 << sseu->max_subslices) - 1;
+ subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
+ GEN9_F2_SS_DIS_SHIFT);
/*
* Iterate through enabled slices and subslices to
* count the total enabled EU.
*/
- for (s = 0; s < s_max; s++) {
+ for (s = 0; s < sseu->max_slices; s++) {
if (!(sseu->slice_mask & BIT(s)))
/* skip disabled slice */
continue;
+ sseu->subslice_mask[s] = subslice_mask;
+
eu_disable = I915_READ(GEN9_EU_DISABLE(s));
- for (ss = 0; ss < ss_max; ss++) {
+ for (ss = 0; ss < sseu->max_subslices; ss++) {
int eu_per_ss;
+ u8 eu_disabled_mask;
- if (!(sseu->subslice_mask & BIT(ss)))
+ if (!(sseu->subslice_mask[s] & BIT(ss)))
/* skip disabled subslice */
continue;
- eu_per_ss = eu_max - hweight8((eu_disable >> (ss*8)) &
- eu_mask);
+ eu_disabled_mask = (eu_disable >> (ss*8)) & eu_mask;
+
+ sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] =
+ ~eu_disabled_mask;
+
+ eu_per_ss = sseu->max_eus_per_subslice -
+ hweight8(eu_disabled_mask);
/*
* Record which subslice(s) has(have) 7 EUs. we
@@ -239,11 +327,12 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
*/
if (eu_per_ss == 7)
sseu->subslice_7eu[s] |= BIT(ss);
-
- sseu->eu_total += eu_per_ss;
}
}
+ sseu->subslice_total = compute_subslice_total(sseu);
+ sseu->eu_total = compute_eu_total(sseu);
+
/*
* SKL is expected to always have a uniform distribution
* of EU across subslices with the exception that any one
@@ -269,8 +358,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
sseu->has_eu_pg = sseu->eu_per_subslice > 2;
if (IS_GEN9_LP(dev_priv)) {
-#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask & BIT(ss)))
- info->has_pooled_eu = hweight8(sseu->subslice_mask) == 3;
+#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask[0] & BIT(ss)))
+ info->has_pooled_eu = hweight8(sseu->subslice_mask[0]) == 3;
sseu->min_eu_in_pool = 0;
if (info->has_pooled_eu) {
@@ -288,19 +377,22 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
{
struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
- const int s_max = 3, ss_max = 3, eu_max = 8;
int s, ss;
- u32 fuse2, eu_disable[3]; /* s_max */
+ u32 fuse2, subslice_mask, eu_disable[3]; /* s_max */
fuse2 = I915_READ(GEN8_FUSE2);
sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
+ sseu->max_slices = 3;
+ sseu->max_subslices = 3;
+ sseu->max_eus_per_subslice = 8;
+
/*
* The subslice disable field is global, i.e. it applies
* to each of the enabled slices.
*/
- sseu->subslice_mask = GENMASK(ss_max - 1, 0);
- sseu->subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
- GEN8_F2_SS_DIS_SHIFT);
+ subslice_mask = GENMASK(sseu->max_subslices - 1, 0);
+ subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
+ GEN8_F2_SS_DIS_SHIFT);
eu_disable[0] = I915_READ(GEN8_EU_DISABLE0) & GEN8_EU_DIS0_S0_MASK;
eu_disable[1] = (I915_READ(GEN8_EU_DISABLE0) >> GEN8_EU_DIS0_S1_SHIFT) |
@@ -314,30 +406,40 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
* Iterate through enabled slices and subslices to
* count the total enabled EU.
*/
- for (s = 0; s < s_max; s++) {
+ for (s = 0; s < sseu->max_slices; s++) {
if (!(sseu->slice_mask & BIT(s)))
/* skip disabled slice */
continue;
- for (ss = 0; ss < ss_max; ss++) {
+ sseu->subslice_mask[s] = subslice_mask;
+
+ for (ss = 0; ss < sseu->max_subslices; ss++) {
+ u8 eu_disabled_mask;
u32 n_disabled;
- if (!(sseu->subslice_mask & BIT(ss)))
+ if (!(sseu->subslice_mask[ss] & BIT(ss)))
/* skip disabled subslice */
continue;
- n_disabled = hweight8(eu_disable[s] >> (ss * eu_max));
+ eu_disabled_mask =
+ eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
+
+ sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] =
+ ~eu_disabled_mask;
+
+ n_disabled = hweight8(eu_disabled_mask);
/*
* Record which subslices have 7 EUs.
*/
- if (eu_max - n_disabled == 7)
+ if (sseu->max_eus_per_subslice - n_disabled == 7)
sseu->subslice_7eu[s] |= 1 << ss;
-
- sseu->eu_total += eu_max - n_disabled;
}
}
+ sseu->subslice_total = compute_subslice_total(sseu);
+ sseu->eu_total = compute_eu_total(sseu);
+
/*
* BDW is expected to always have a uniform distribution of EU across
* subslices with the exception that any one EU in any one subslice may
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 49cb27bd04c1..6eb8bcf5a213 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -110,10 +110,14 @@ enum intel_platform {
func(supports_tv); \
func(has_ipc);
+#define GEN_MAX_SLICES (6) /* CNL upper bound */
+#define GEN_MAX_SUBSLICES (7)
+
struct sseu_dev_info {
u8 slice_mask;
- u8 subslice_mask;
- u8 eu_total;
+ u8 subslice_mask[GEN_MAX_SUBSLICES];
+ u16 subslice_total;
+ u16 eu_total;
u8 eu_per_subslice;
u8 min_eu_in_pool;
/* For each slice, which subslice(s) has(have) 7 EUs (bitfield)? */
@@ -121,6 +125,17 @@ struct sseu_dev_info {
u8 has_slice_pg:1;
u8 has_subslice_pg:1;
u8 has_eu_pg:1;
+
+ /* Topology fields */
+ u8 max_slices;
+ u8 max_subslices;
+ u8 max_eus_per_subslice;
+
+ /* We don't have more than 8 eus per subslice at the moment and as we
+ * store eus enabled using bits, no need to multiply by eus per
+ * subslice.
+ */
+ u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
};
struct intel_device_info {
@@ -167,7 +182,22 @@ struct intel_device_info {
static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
{
- return hweight8(sseu->slice_mask) * hweight8(sseu->subslice_mask);
+ return sseu->subslice_total;
+}
+
+static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
+ int slice, int subslice, int eu_group)
+{
+ int subslice_stride = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
+ int slice_stride = sseu->max_subslices * subslice_stride;
+
+ return slice * slice_stride + subslice * subslice_stride + eu_group;
+}
+
+static inline int sseu_eu_mask(const struct sseu_dev_info *sseu,
+ int slice, int subslice, int eu_group)
+{
+ return sseu->eu_mask[sseu_eu_idx(sseu, slice, subslice, eu_group)];
}
const char *intel_platform_name(enum intel_platform platform);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ff25f209d0a5..ac7896031b8d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2098,7 +2098,7 @@ make_rpcs(struct drm_i915_private *dev_priv)
if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
- rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask) <<
+ rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]) <<
GEN8_RPCS_SS_CNT_SHIFT;
rpcs |= GEN8_RPCS_ENABLE;
}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c5ff203e42d6..23ae9a957fab 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -90,7 +90,7 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
#define instdone_subslice_mask(dev_priv__) \
(INTEL_GEN(dev_priv__) == 7 ? \
- 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask)
+ 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask[0])
#define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
for ((slice__) = 0, (subslice__) = 0; \
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 2/6] drm/i915/debugfs: reuse max slice/subslices already stored in sseu
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 1/6] drm/i915: store all subslice masks Lionel Landwerlin
@ 2018-01-11 19:53 ` Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 3/6] drm/i915/debugfs: add rcs topology entry Lionel Landwerlin
` (5 subsequent siblings)
7 siblings, 0 replies; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-11 19:53 UTC (permalink / raw)
To: intel-gfx
Now that we have that information in topology fields, let's just reused it.
v2: Style tweaks (Tvrtko)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 463029f72a0b..2d1c9cce5fe4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4300,11 +4300,11 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
struct sseu_dev_info *sseu)
{
const struct intel_device_info *info = INTEL_INFO(dev_priv);
- int s_max = 6, ss_max = 4;
int s, ss;
- u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2];
+ u32 s_reg[info->sseu.max_slices];
+ u32 eu_reg[2 * info->sseu.max_subslices], eu_mask[2];
- for (s = 0; s < s_max; s++) {
+ for (s = 0; s < info->sseu.max_slices; s++) {
/*
* FIXME: Valid SS Mask respects the spec and read
* only valid bits for those registers, excluding reserverd
@@ -4326,7 +4326,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
GEN9_PGCTL_SSB_EU210_ACK |
GEN9_PGCTL_SSB_EU311_ACK;
- for (s = 0; s < s_max; s++) {
+ for (s = 0; s < info->sseu.max_slices; s++) {
if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
/* skip disabled slice */
continue;
@@ -4334,7 +4334,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
sseu->slice_mask |= BIT(s);
sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
- for (ss = 0; ss < ss_max; ss++) {
+ for (ss = 0; ss < info->sseu.max_subslices; ss++) {
unsigned int eu_cnt;
if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
@@ -4354,17 +4354,12 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
struct sseu_dev_info *sseu)
{
- int s_max = 3, ss_max = 4;
+ const struct intel_device_info *info = INTEL_INFO(dev_priv);
int s, ss;
- u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2];
-
- /* BXT has a single slice and at most 3 subslices. */
- if (IS_GEN9_LP(dev_priv)) {
- s_max = 1;
- ss_max = 3;
- }
+ u32 s_reg[info->sseu.max_slices];
+ u32 eu_reg[2 * info->sseu.max_subslices], eu_mask[2];
- for (s = 0; s < s_max; s++) {
+ for (s = 0; s < info->sseu.max_slices; s++) {
s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s));
eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
@@ -4379,7 +4374,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
GEN9_PGCTL_SSB_EU210_ACK |
GEN9_PGCTL_SSB_EU311_ACK;
- for (s = 0; s < s_max; s++) {
+ for (s = 0; s < info->sseu.max_slices; s++) {
if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
/* skip disabled slice */
continue;
@@ -4390,7 +4385,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
sseu->subslice_mask[s] =
INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
- for (ss = 0; ss < ss_max; ss++) {
+ for (ss = 0; ss < info->sseu.max_subslices; ss++) {
unsigned int eu_cnt;
if (IS_GEN9_LP(dev_priv)) {
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 3/6] drm/i915/debugfs: add rcs topology entry
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 1/6] drm/i915: store all subslice masks Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 2/6] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
@ 2018-01-11 19:53 ` Lionel Landwerlin
2018-01-12 10:21 ` Tvrtko Ursulin
2018-01-11 19:53 ` [PATCH v2 4/6] drm/i915: add rcs topology to error state Lionel Landwerlin
` (4 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-11 19:53 UTC (permalink / raw)
To: intel-gfx
While the end goal is to make this information available to userspace
through a new ioctl, there is no reason we can't display it in a human
readable fashion through debugfs.
slice0: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice1: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice2: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
v2: Reformat debugfs printing (Tvrtko)
Use the new EU mask helper (Tvrtko)
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 42 +++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2d1c9cce5fe4..83af1029b907 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3162,6 +3162,47 @@ static int i915_engine_info(struct seq_file *m, void *unused)
return 0;
}
+static int i915_rcs_topology(struct seq_file *m, void *unused)
+{
+ struct drm_i915_private *dev_priv = node_to_i915(m->private);
+ const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+ int s, ss;
+ int subslice_stride =
+ DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
+
+ if (sseu->max_slices == 0) {
+ seq_printf(m, "Unavailable\n");
+ return 0;
+ }
+
+ for (s = 0; s < sseu->max_slices; s++) {
+ seq_printf(m, "slice%i: %u subslice(s) (0x%hhx):\n",
+ s, hweight8(sseu->subslice_mask[s]),
+ sseu->subslice_mask[s]);
+
+ for (ss = 0; ss < sseu->max_subslices; ss++) {
+ int eu_group, n_subslice_eus = 0;
+
+ for (eu_group = 0; eu_group < subslice_stride; eu_group++) {
+ n_subslice_eus +=
+ hweight8(sseu_eu_mask(sseu, s, ss, eu_group));
+ }
+
+ seq_printf(m, "\tsubslice%i: %u EUs (", ss, n_subslice_eus);
+ for (eu_group = 0;
+ eu_group < max(0, subslice_stride - 1);
+ eu_group++) {
+ u8 val = sseu_eu_mask(sseu, s, ss, eu_group);
+ seq_printf(m, "0x%hhx, ", val);
+ }
+ seq_printf(m, "0x%hhx)\n",
+ sseu_eu_mask(sseu, s, ss, subslice_stride - 1));
+ }
+ }
+
+ return 0;
+}
+
static int i915_shrinker_info(struct seq_file *m, void *unused)
{
struct drm_i915_private *i915 = node_to_i915(m->private);
@@ -4692,6 +4733,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_dmc_info", i915_dmc_info, 0},
{"i915_display_info", i915_display_info, 0},
{"i915_engine_info", i915_engine_info, 0},
+ {"i915_rcs_topology", i915_rcs_topology, 0},
{"i915_shrinker_info", i915_shrinker_info, 0},
{"i915_shared_dplls_info", i915_shared_dplls_info, 0},
{"i915_dp_mst_info", i915_dp_mst_info, 0},
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 4/6] drm/i915: add rcs topology to error state
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
` (2 preceding siblings ...)
2018-01-11 19:53 ` [PATCH v2 3/6] drm/i915/debugfs: add rcs topology entry Lionel Landwerlin
@ 2018-01-11 19:53 ` Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 5/6] drm/i915: add query uAPI Lionel Landwerlin
` (3 subsequent siblings)
7 siblings, 0 replies; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-11 19:53 UTC (permalink / raw)
To: intel-gfx
This might be useful information for developers looking at an error
state.
v2: Place topology towards the end of the error state (Chris)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 40 +++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 944059322daa..cc7f53cc9a77 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -605,6 +605,45 @@ static void err_print_uc(struct drm_i915_error_state_buf *m,
print_error_obj(m, NULL, "GuC log buffer", error_uc->guc_log);
}
+static void err_print_rcs_topology(struct drm_i915_error_state_buf *m,
+ const struct sseu_dev_info *sseu)
+{
+ int s, ss;
+ int subslice_stride =
+ DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
+
+ /* Unavailable prior to Gen 8. */
+ if (sseu->max_slices == 0)
+ return;
+
+ err_printf(m, "RCS topology:\n");
+
+ for (s = 0; s < sseu->max_slices; s++) {
+ err_printf(m, " slice%i %u subslice(s) (0x%hhx):\n",
+ s, hweight8(sseu->subslice_mask[s]),
+ sseu->subslice_mask[s]);
+
+ for (ss = 0; ss < sseu->max_subslices; ss++) {
+ int eu_group, n_subslice_eus = 0;
+
+ for (eu_group = 0; eu_group < subslice_stride; eu_group++) {
+ n_subslice_eus +=
+ hweight8(sseu_eu_mask(sseu, s, ss, eu_group));
+ }
+
+ err_printf(m, " subslice%i: %u EUs (", ss, n_subslice_eus);
+ for (eu_group = 0;
+ eu_group < max(0, subslice_stride - 1);
+ eu_group++) {
+ u8 val = sseu_eu_mask(sseu, s, ss, eu_group);
+ err_printf(m, " 0x%hhx", val);
+ }
+ err_printf(m, "0x%hhx)\n",
+ sseu_eu_mask(sseu, s, ss, subslice_stride - 1));
+ }
+ }
+}
+
int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
const struct i915_gpu_state *error)
{
@@ -787,6 +826,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
intel_display_print_error_state(m, error->display);
err_print_capabilities(m, &error->device_info);
+ err_print_rcs_topology(m, &INTEL_INFO(dev_priv)->sseu);
err_print_params(m, &error->params);
err_print_uc(m, &error->uc);
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 5/6] drm/i915: add query uAPI
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
` (3 preceding siblings ...)
2018-01-11 19:53 ` [PATCH v2 4/6] drm/i915: add rcs topology to error state Lionel Landwerlin
@ 2018-01-11 19:53 ` Lionel Landwerlin
2018-01-12 12:06 ` Tvrtko Ursulin
2018-01-11 19:53 ` [PATCH v2 6/6] drm/i915: expose rcs topology through " Lionel Landwerlin
` (2 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-11 19:53 UTC (permalink / raw)
To: intel-gfx
There are a number of information that are readable from hardware
registers and that we would like to make accessible to userspace. One
particular example is the topology of the execution units (how are
execution units grouped in subslices and slices and also which ones
have been fused off for die recovery).
At the moment the GET_PARAM ioctl covers some basic needs, but
generally is only able to return a single value for each defined
parameter. This is a bit problematic with topology descriptions which
are array/maps of available units.
This change introduces a new ioctl that can deal with requests to fill
structures of potentially variable lengths. The user is expected fill
a query with length fields set at 0 on the first call, the kernel then
sets the length fields to the their expected values. A second call to
the kernel with length fields at their expected values will trigger a
copy of the data to the pointed memory locations.
The scope of this uAPI is only to provide information to userspace,
not to allow configuration of the device.
v2: Simplify dispatcher code iteration (Tvrtko)
Tweak uapi drm_i915_query_item structure (Tvrtko)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
drivers/gpu/drm/i915/Makefile | 1 +
drivers/gpu/drm/i915/i915_drv.c | 1 +
drivers/gpu/drm/i915/i915_drv.h | 3 +++
drivers/gpu/drm/i915/i915_query.c | 51 +++++++++++++++++++++++++++++++++++++++
include/uapi/drm/i915_drm.h | 31 ++++++++++++++++++++++++
5 files changed, 87 insertions(+)
create mode 100644 drivers/gpu/drm/i915/i915_query.c
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3bddd8a06806..b0415a3e2d59 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -69,6 +69,7 @@ i915-y += i915_cmd_parser.o \
i915_gem_timeline.o \
i915_gem_userptr.o \
i915_gemfs.o \
+ i915_query.o \
i915_trace_points.o \
i915_vma.o \
intel_breadcrumbs.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 969835d3cbcd..d92e1b7236fc 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2824,6 +2824,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
};
static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a689396d0ff6..de0eb6ce2fcd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3623,6 +3623,9 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
extern void i915_perf_register(struct drm_i915_private *dev_priv);
extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
+/* i915_query.c */
+int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
+
/* i915_suspend.c */
extern int i915_save_state(struct drm_i915_private *dev_priv);
extern int i915_restore_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
new file mode 100644
index 000000000000..5694cfea4553
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -0,0 +1,51 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include <uapi/drm/i915_drm.h>
+
+int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
+{
+ struct drm_i915_query *args = data;
+ struct drm_i915_query_item __user *user_item_ptr =
+ u64_to_user_ptr(args->items_ptr);
+ u32 i;
+
+ for (i = 0; i < args->num_items; i++, user_item_ptr++) {
+ struct drm_i915_query_item item;
+
+ if (copy_from_user(&item, user_item_ptr, sizeof(item)))
+ return -EFAULT;
+
+ switch (item.query_id) {
+ default:
+ return -EINVAL;
+ }
+
+ if (copy_to_user(user_item_ptr, &item, sizeof(item)))
+ return -EFAULT;
+ }
+
+ return 0;
+}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 536ee4febd74..39e93f10f2cd 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -318,6 +318,7 @@ typedef struct _drm_i915_sarea {
#define DRM_I915_PERF_OPEN 0x36
#define DRM_I915_PERF_ADD_CONFIG 0x37
#define DRM_I915_PERF_REMOVE_CONFIG 0x38
+#define DRM_I915_QUERY 0x39
#define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
#define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -375,6 +376,7 @@ typedef struct _drm_i915_sarea {
#define DRM_IOCTL_I915_PERF_OPEN DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
#define DRM_IOCTL_I915_PERF_ADD_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
#define DRM_IOCTL_I915_PERF_REMOVE_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
+#define DRM_IOCTL_I915_QUERY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
/* Allow drivers to submit batchbuffers directly to hardware, relying
* on the security mechanisms provided by hardware.
@@ -1613,6 +1615,35 @@ struct drm_i915_perf_oa_config {
__u64 flex_regs_ptr;
};
+
+struct drm_i915_query_item {
+ __u64 query_id;
+
+ /*
+ * When set to zero by userspace, this is filled with the size of the
+ * data to be written at the data_ptr pointer.
+ */
+ __u32 length;
+ __u32 pad;
+
+ /*
+ * Data will be written at the location pointed by data_ptr when the
+ * value of length matches the length of the data to be written by the
+ * kernel.
+ */
+ __u64 data_ptr;
+};
+
+struct drm_i915_query {
+ __u32 num_items;
+ __u32 _pad;
+
+ /*
+ * This point to an array of num_items drm_i915_query_item structures.
+ */
+ __u64 items_ptr;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 6/6] drm/i915: expose rcs topology through query uAPI
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
` (4 preceding siblings ...)
2018-01-11 19:53 ` [PATCH v2 5/6] drm/i915: add query uAPI Lionel Landwerlin
@ 2018-01-11 19:53 ` Lionel Landwerlin
2018-01-12 12:27 ` Tvrtko Ursulin
2018-01-12 10:06 ` ✓ Fi.CI.BAT: success for drm/i915: expose RCS topology to userspace Patchwork
2018-01-12 11:49 ` ✗ Fi.CI.IGT: warning " Patchwork
7 siblings, 1 reply; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-11 19:53 UTC (permalink / raw)
To: intel-gfx
With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available. Here we introduce a more detailed way of querying the
Gen's GPU topology that doesn't aggregate numbers.
This is essential for monitoring parts of the GPU with the OA unit,
because counters need to be normalized to the number of
EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
not gives us sufficient information.
As a bonus we can draw representations of the GPU :
https://imgur.com/a/vuqpa
v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
Add uapi macros to read data from *_info structs (Tvrtko)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
drivers/gpu/drm/i915/i915_query.c | 133 ++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/intel_query_info.c | 88 +++++++++++++++++++++
include/uapi/drm/i915_drm.h | 51 ++++++++++++
3 files changed, 272 insertions(+)
create mode 100644 drivers/gpu/drm/i915/intel_query_info.c
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index 5694cfea4553..1d9f5a15323c 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -25,8 +25,128 @@
#include "i915_drv.h"
#include <uapi/drm/i915_drm.h>
+static int query_slices_info(struct drm_i915_private *dev_priv,
+ struct drm_i915_query_item *query_item)
+{
+ const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+ struct drm_i915_query_slices_info slices_info;
+ u32 data_length, length;
+
+ if (sseu->max_slices == 0)
+ return -ENODEV;
+
+ data_length = sizeof(u8);
+ length = sizeof(slices_info) + data_length;
+
+ /*
+ * If we ever change the internal slice mask data type, we'll need to
+ * update this function.
+ */
+ BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
+
+ if (query_item->length == 0) {
+ query_item->length = length;
+ return 0;
+ }
+
+ if (query_item->length != length)
+ return -EINVAL;
+
+ memset(&slices_info, 0, sizeof(slices_info));
+ slices_info.max_slices = sseu->max_slices;
+
+ if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), &slices_info,
+ sizeof(slices_info)))
+ return -EFAULT;
+
+ if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
+ offsetof(struct drm_i915_query_slices_info, data)),
+ &sseu->slice_mask, data_length))
+ return -EFAULT;
+
+ return 0;
+}
+
+static int query_subslices_info(struct drm_i915_private *dev_priv,
+ struct drm_i915_query_item *query_item)
+{
+ const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+ struct drm_i915_query_subslices_info subslices_info;
+ u32 data_length, length;
+
+ if (sseu->max_slices == 0)
+ return -ENODEV;
+
+ memset(&subslices_info, 0, sizeof(subslices_info));
+ subslices_info.max_slices = sseu->max_slices;
+ subslices_info.max_subslices = sseu->max_subslices;
+
+ data_length = subslices_info.max_slices *
+ DIV_ROUND_UP(subslices_info.max_subslices, BITS_PER_BYTE);
+ length = sizeof(subslices_info) + data_length;
+
+ if (query_item->length == 0) {
+ query_item->length = length;
+ return 0;
+ }
+
+ if (query_item->length != length)
+ return -EINVAL;
+
+ if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), &subslices_info,
+ sizeof(subslices_info)))
+ return -EFAULT;
+
+ if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
+ offsetof(struct drm_i915_query_subslices_info, data)),
+ sseu->subslice_mask, data_length))
+ return -EFAULT;
+
+ return 0;
+}
+
+static int query_eus_info(struct drm_i915_private *dev_priv,
+ struct drm_i915_query_item *query_item)
+{
+ const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+ struct drm_i915_query_eus_info eus_info;
+ u32 data_length, length;
+
+ if (sseu->max_slices == 0)
+ return -ENODEV;
+
+ memset(&eus_info, 0, sizeof(eus_info));
+ eus_info.max_slices = sseu->max_slices;
+ eus_info.max_subslices = sseu->max_subslices;
+ eus_info.max_eus_per_subslice = sseu->max_eus_per_subslice;
+
+ data_length = eus_info.max_slices * eus_info.max_subslices *
+ DIV_ROUND_UP(eus_info.max_eus_per_subslice, BITS_PER_BYTE);
+ length = sizeof(eus_info) + data_length;
+
+ if (query_item->length == 0) {
+ query_item->length = length;
+ return 0;
+ }
+
+ if (query_item->length != length)
+ return -EINVAL;
+
+ if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), &eus_info,
+ sizeof(eus_info)))
+ return -EFAULT;
+
+ if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
+ offsetof(struct drm_i915_query_eus_info, data)),
+ sseu->eu_mask, data_length))
+ return -EFAULT;
+
+ return 0;
+}
+
int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
{
+ struct drm_i915_private *dev_priv = to_i915(dev);
struct drm_i915_query *args = data;
struct drm_i915_query_item __user *user_item_ptr =
u64_to_user_ptr(args->items_ptr);
@@ -34,15 +154,28 @@ int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
for (i = 0; i < args->num_items; i++, user_item_ptr++) {
struct drm_i915_query_item item;
+ int ret;
if (copy_from_user(&item, user_item_ptr, sizeof(item)))
return -EFAULT;
switch (item.query_id) {
+ case DRM_I915_QUERY_ID_SLICES_INFO:
+ ret = query_slices_info(dev_priv, &item);
+ break;
+ case DRM_I915_QUERY_ID_SUBSLICES_INFO:
+ ret = query_subslices_info(dev_priv, &item);
+ break;
+ case DRM_I915_QUERY_ID_EUS_INFO:
+ ret = query_eus_info(dev_priv, &item);
+ break;
default:
return -EINVAL;
}
+ if (ret)
+ return ret;
+
if (copy_to_user(user_item_ptr, &item, sizeof(item)))
return -EFAULT;
}
diff --git a/drivers/gpu/drm/i915/intel_query_info.c b/drivers/gpu/drm/i915/intel_query_info.c
new file mode 100644
index 000000000000..79b03be9f51a
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_query_info.c
@@ -0,0 +1,88 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include <uapi/drm/i915_drm.h>
+
+static int query_info_rcs_topology(struct drm_i915_private *dev_priv,
+ struct drm_i915_query_info *args)
+{
+ const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+ struct drm_i915_rcs_topology_info __user *user_topology =
+ u64_to_user_ptr(args->info_ptr);
+ struct drm_i915_rcs_topology_info topology;
+ u32 data_size, total_size;
+ const u8 *data = NULL;
+ int ret;
+
+ /* Not supported on gen < 8. */
+ if (sseu->max_slices == 0)
+ return -ENODEV;
+
+ switch (args->query_params[0]) {
+ case I915_RCS_TOPOLOGY_SLICE:
+ topology.params[0] = sseu->max_slices;
+ data_size = sizeof(sseu->slice_mask);
+ data = &sseu->slice_mask;
+ break;
+
+ case I915_RCS_TOPOLOGY_SUBSLICE:
+ topology.params[0] = sseu->max_slices;
+ topology.params[1] = ALIGN(sseu->max_subslices, 8) / 8;
+ data_size = sseu->max_slices * topology.params[1];
+ data = sseu->subslices_mask;
+ break;
+
+ case I915_RCS_TOPOLOGY_EU:
+ topology.params[2] = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
+ topology.params[1] = sseu->max_subslices * topology.params[2];
+ topology.params[0] = sseu->max_slices;
+ data_size = sseu->max_slices * topology.params[1];
+ data = sseu->eu_mask;
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ total_size = sizeof(topology) + data_size;
+
+ if (args->info_ptr_len == 0) {
+ args->info_ptr_len = total_size;
+ return 0;
+ }
+
+ if (args->info_ptr_len < total_size)
+ return -EINVAL;
+
+ ret = copy_to_user(user_topology, &topology, sizeof(topology));
+ if (ret)
+ return -EFAULT;
+
+ ret = copy_to_user(user_topology + 1, data, data_size);
+ if (ret)
+ return -EFAULT;
+
+ return 0;
+}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 39e93f10f2cd..5a8e2f7d5dc3 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1618,6 +1618,9 @@ struct drm_i915_perf_oa_config {
struct drm_i915_query_item {
__u64 query_id;
+#define DRM_I915_QUERY_ID_SLICES_INFO 0x01
+#define DRM_I915_QUERY_ID_SUBSLICES_INFO 0x02
+#define DRM_I915_QUERY_ID_EUS_INFO 0x03
/*
* When set to zero by userspace, this is filled with the size of the
@@ -1644,6 +1647,54 @@ struct drm_i915_query {
__u64 items_ptr;
};
+/* Data written by the kernel with query DRM_I915_QUERY_ID_SLICES_INFO :
+ *
+ * data: each bit indicates whether a slice is available (1) or fused off (0).
+ * Use DRM_I915_QUERY_SLICE_AVAILABLE() to query a given slice's
+ * availability.
+ */
+struct drm_i915_query_slices_info {
+ __u32 max_slices;
+
+#define DRM_I915_QUERY_SLICE_AVAILABLE(info, slice) \
+ (((info)->data[(slice) / 8] >> ((slice) % 8)) & 1)
+ __u8 data[];
+};
+
+/* Data written by the kernel with query DRM_I915_QUERY_ID_SUBSLICES_INFO :
+ *
+ * data: each bit indicates whether a subslice is available (1) or fused off
+ * (0). Use DRM_I915_QUERY_SUBSLICE_AVAILABLE() to query a given
+ * subslice's availability.
+ */
+struct drm_i915_query_subslices_info {
+ __u32 max_slices;
+ __u32 max_subslices;
+
+#define DRM_I915_QUERY_SUBSLICE_AVAILABLE(info, slice, subslice) \
+ (((info)->data[(slice) * ALIGN((info)->max_subslices, 8) / 8 + \
+ (subslice) / 8] >> ((subslice) % 8)) & 1)
+ __u8 data[];
+};
+
+/* Data written by the kernel with query DRM_I915_QUERY_ID_EUS_INFO :
+ *
+ * data: Each bit indicates whether a subslice is available (1) or fused off
+ * (0). Use DRM_I915_QUERY_EU_AVAILABLE() to query a given EU's
+ * availability.
+ */
+struct drm_i915_query_eus_info {
+ __u32 max_slices;
+ __u32 max_subslices;
+ __u32 max_eus_per_subslice;
+
+#define DRM_I915_QUERY_EU_AVAILABLE(info, slice, subslice, eu) \
+ (((info)->data[(slice) * ALIGN((info)->max_eus_per_subslice, 8) / 8 * (info)->max_subslices + \
+ (subslice) * ALIGN((info)->max_eus_per_subslice, 8) / 8 + \
+ (eu) / 8] >> ((eu) % 8)) & 1)
+ __u8 data[];
+};
+
#if defined(__cplusplus)
}
#endif
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 21+ messages in thread
* ✓ Fi.CI.BAT: success for drm/i915: expose RCS topology to userspace
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
` (5 preceding siblings ...)
2018-01-11 19:53 ` [PATCH v2 6/6] drm/i915: expose rcs topology through " Lionel Landwerlin
@ 2018-01-12 10:06 ` Patchwork
2018-01-12 11:49 ` ✗ Fi.CI.IGT: warning " Patchwork
7 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2018-01-12 10:06 UTC (permalink / raw)
To: Lionel Landwerlin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: expose RCS topology to userspace
URL : https://patchwork.freedesktop.org/series/36353/
State : success
== Summary ==
Series 36353v1 drm/i915: expose RCS topology to userspace
https://patchwork.freedesktop.org/api/1.0/series/36353/revisions/1/mbox/
Test debugfs_test:
Subgroup read_all_entries:
dmesg-warn -> DMESG-FAIL (fi-elk-e7500) fdo#103989
incomplete -> PASS (fi-snb-2520m) fdo#103713
fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:417s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:427s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:374s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:492s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:282s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:485s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:488s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:467s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:454s
fi-elk-e7500 total:224 pass:168 dwarn:9 dfail:1 fail:0 skip:45
fi-gdg-551 total:288 pass:179 dwarn:0 dfail:0 fail:1 skip:108 time:274s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:512s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:394s
fi-hsw-4770r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:400s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:413s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:459s
fi-ivb-3770 total:288 pass:255 dwarn:0 dfail:0 fail:0 skip:33 time:418s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:468s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:499s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:450s
fi-kbl-r total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:504s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:585s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:434s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:513s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:528s
fi-skl-6700k2 total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:490s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:472s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:433s
fi-snb-2520m total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:533s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:398s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:565s
fi-glk-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:470s
353fa2d3affffef324005ed2553da6eb174a2f0b drm-tip: 2018y-01m-12d-09h-21m-50s UTC integration manifest
23b30dea8c0e drm/i915: expose rcs topology through query uAPI
beeda81a3f3c drm/i915: add query uAPI
09de4f4c4d5f drm/i915: add rcs topology to error state
cef5aec09af7 drm/i915/debugfs: add rcs topology entry
e4220f75ba30 drm/i915/debugfs: reuse max slice/subslices already stored in sseu
aac15151dddb drm/i915: store all subslice masks
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7653/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-11 19:53 ` [PATCH v2 1/6] drm/i915: store all subslice masks Lionel Landwerlin
@ 2018-01-12 10:15 ` Tvrtko Ursulin
2018-01-12 10:58 ` Lionel Landwerlin
0 siblings, 1 reply; 21+ messages in thread
From: Tvrtko Ursulin @ 2018-01-12 10:15 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 11/01/2018 19:53, Lionel Landwerlin wrote:
> Up to now, subslice mask was assumed to be uniform across slices. But
> starting with Cannonlake, slices can be asymmetric (for example slice0
> has different number of subslices as slice1+). This change stores all
> subslices masks for all slices rather than having a single mask that
> applies to all slices.
>
> v2: Rework how we store total numbers in sseu_dev_info (Tvrtko)
> Fix CHV eu masks, was reading disabled as enabled (Tvrtko)
> Readability changes (Tvrtko)
> Add EU index helper (Tvrtko)
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 25 ++--
> drivers/gpu/drm/i915/i915_drv.c | 2 +-
> drivers/gpu/drm/i915/intel_device_info.c | 196 +++++++++++++++++++++++--------
> drivers/gpu/drm/i915/intel_device_info.h | 36 +++++-
> drivers/gpu/drm/i915/intel_lrc.c | 2 +-
> drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
> 6 files changed, 200 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2bb63073d73f..463029f72a0b 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4285,7 +4285,7 @@ static void cherryview_sseu_device_status(struct drm_i915_private *dev_priv,
> continue;
>
> sseu->slice_mask = BIT(0);
> - sseu->subslice_mask |= BIT(ss);
> + sseu->subslice_mask[0] |= BIT(ss);
> eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
> ((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
> ((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
> @@ -4332,7 +4332,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
> continue;
>
> sseu->slice_mask |= BIT(s);
> - sseu->subslice_mask = info->sseu.subslice_mask;
> + sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
>
> for (ss = 0; ss < ss_max; ss++) {
> unsigned int eu_cnt;
> @@ -4387,8 +4387,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
> sseu->slice_mask |= BIT(s);
>
> if (IS_GEN9_BC(dev_priv))
> - sseu->subslice_mask =
> - INTEL_INFO(dev_priv)->sseu.subslice_mask;
> + sseu->subslice_mask[s] =
> + INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
>
> for (ss = 0; ss < ss_max; ss++) {
> unsigned int eu_cnt;
> @@ -4398,7 +4398,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
> /* skip disabled subslice */
> continue;
>
> - sseu->subslice_mask |= BIT(ss);
> + sseu->subslice_mask[s] |= BIT(ss);
> }
>
> eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
> @@ -4420,9 +4420,12 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
> sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
>
> if (sseu->slice_mask) {
> - sseu->subslice_mask = INTEL_INFO(dev_priv)->sseu.subslice_mask;
> sseu->eu_per_subslice =
> INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
> + for (s = 0; s < fls(sseu->slice_mask); s++) {
> + sseu->subslice_mask[s] =
> + INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
> + }
> sseu->eu_total = sseu->eu_per_subslice *
> sseu_subslice_total(sseu);
>
> @@ -4441,6 +4444,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
> {
> struct drm_i915_private *dev_priv = node_to_i915(m->private);
> const char *type = is_available_info ? "Available" : "Enabled";
> + int s;
>
> seq_printf(m, " %s Slice Mask: %04x\n", type,
> sseu->slice_mask);
> @@ -4448,10 +4452,11 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
> hweight8(sseu->slice_mask));
> seq_printf(m, " %s Subslice Total: %u\n", type,
> sseu_subslice_total(sseu));
> - seq_printf(m, " %s Subslice Mask: %04x\n", type,
> - sseu->subslice_mask);
> - seq_printf(m, " %s Subslice Per Slice: %u\n", type,
> - hweight8(sseu->subslice_mask));
> + for (s = 0; s < fls(sseu->slice_mask); s++) {
> + seq_printf(m, " %s Slice%i %u subslices, mask=%04x\n", type,
> + s, hweight8(sseu->subslice_mask[s]),
> + sseu->subslice_mask[s]);
> + }
> seq_printf(m, " %s EU Total: %u\n", type,
> sseu->eu_total);
> seq_printf(m, " %s EU Per Subslice: %u\n", type,
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6c8da9d20c33..969835d3cbcd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -414,7 +414,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
> return -ENODEV;
> break;
> case I915_PARAM_SUBSLICE_MASK:
> - value = INTEL_INFO(dev_priv)->sseu.subslice_mask;
> + value = INTEL_INFO(dev_priv)->sseu.subslice_mask[0];
> if (!value)
> return -ENODEV;
> break;
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index d28592e43512..0bf6ef51cdb7 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -80,12 +80,17 @@ void intel_device_info_dump_flags(const struct intel_device_info *info,
>
> static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
> {
> + int s;
> +
> drm_printf(p, "slice mask: %04x\n", sseu->slice_mask);
> drm_printf(p, "slice total: %u\n", hweight8(sseu->slice_mask));
> drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
> - drm_printf(p, "subslice mask %04x\n", sseu->subslice_mask);
> - drm_printf(p, "subslice per slice: %u\n",
> - hweight8(sseu->subslice_mask));
> + for (s = 0; s < ARRAY_SIZE(sseu->subslice_mask); s++) {
> + drm_printf(p, "slice%d subslice mask %04x\n",
> + s, sseu->subslice_mask[s]);
> + drm_printf(p, "slice%d subslice per slice: %u\n",
> + s, hweight8(sseu->subslice_mask[s]));
Cosmetic only but consider condensing this into one line if you don't
have a preference to either.
> + }
> drm_printf(p, "EU total: %u\n", sseu->eu_total);
> drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
> drm_printf(p, "has slice power gating: %s\n",
> @@ -119,22 +124,88 @@ void intel_device_info_dump(const struct intel_device_info *info,
> intel_device_info_dump_flags(info, p);
> }
>
> +static u16 compute_eu_total(const struct sseu_dev_info *sseu)
> +{
> + u16 i, total = 0;
> +
> + for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++)
> + total += hweight8(sseu->eu_mask[i]);
> +
> + return total;
> +}
> +
> +static u16 compute_subslice_total(const struct sseu_dev_info *sseu)
> +{
> + u16 i, total = 0;
> +
> + for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> + total += hweight8(sseu->subslice_mask[i]);
> +
> + return total;
> +}
> +
> static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
> {
> struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
> const u32 fuse2 = I915_READ(GEN8_FUSE2);
> + int s, ss;
> + const int eu_mask = 0xff;
> + u32 subslice_mask, eu_en;
>
> sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
> GEN10_F2_S_ENA_SHIFT;
> - sseu->subslice_mask = (1 << 4) - 1;
> - sseu->subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> - GEN10_F2_SS_DIS_SHIFT);
> + sseu->max_slices = 6;
> + sseu->max_subslices = 4;
> + sseu->max_eus_per_subslice = 8;
>
> - sseu->eu_total = hweight32(~I915_READ(GEN8_EU_DISABLE0));
> - sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE1));
> - sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE2));
> - sseu->eu_total += hweight8(~(I915_READ(GEN10_EU_DISABLE3) &
> - GEN10_EU_DIS_SS_MASK));
> + subslice_mask = (1 << 4) - 1;
> + subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> + GEN10_F2_SS_DIS_SHIFT);
> +
> + /* Slice0 can have up to 3 subslices, but there are only 2 in
> + * slice1/2.
> + */
Sorry for not spotting this earlier - this comment style is frowned
upon. Instead for multi-line please use:
/*
* Blah blah...
*/
> + sseu->subslice_mask[0] = subslice_mask;
> + for (s = 1; s < sseu->max_slices; s++)
> + sseu->subslice_mask[s] = subslice_mask & 0x3;
> +
> + /* Slice0 */
> + eu_en = ~I915_READ(GEN8_EU_DISABLE0);
> + for (ss = 0; ss < sseu->max_subslices; ss++) {
> + sseu->eu_mask[sseu_eu_idx(sseu, 0, ss, 0)] =
> + (eu_en >> (8 * ss)) & eu_mask;
> + }
> + /* Slice1 */
> + sseu->eu_mask[sseu_eu_idx(sseu, 1, 0, 0)] = (eu_en >> 24) & eu_mask;
> + eu_en = ~I915_READ(GEN8_EU_DISABLE1);
> + sseu->eu_mask[sseu_eu_idx(sseu, 1, 1, 0)] = eu_en & eu_mask;
> + /* Slice2 */
> + sseu->eu_mask[sseu_eu_idx(sseu, 2, 0, 0)] = (eu_en >> 8) & eu_mask;
> + sseu->eu_mask[sseu_eu_idx(sseu, 2, 1, 0)] = (eu_en >> 16) & eu_mask;
> + /* Slice3 */
> + sseu->eu_mask[sseu_eu_idx(sseu, 3, 0, 0)] = (eu_en >> 24) & eu_mask;
> + eu_en = ~I915_READ(GEN8_EU_DISABLE2);
> + sseu->eu_mask[sseu_eu_idx(sseu, 3, 1, 0)] = eu_en & eu_mask;
> + /* Slice4 */
> + sseu->eu_mask[sseu_eu_idx(sseu, 4, 0, 0)] = (eu_en >> 8) & eu_mask;
> + sseu->eu_mask[sseu_eu_idx(sseu, 4, 1, 0)] = (eu_en >> 16) & eu_mask;
> + /* Slice5 */
> + sseu->eu_mask[sseu_eu_idx(sseu, 5, 0, 0)] = (eu_en >> 24) & eu_mask;
> + eu_en = ~I915_READ(GEN10_EU_DISABLE3);
> + sseu->eu_mask[sseu_eu_idx(sseu, 5, 1, 0)] = eu_en & eu_mask;
Indices are guaranteed to be unique in these operations? No need to RMW?
> +
> + /* Do a second pass where we marked the subslices disabled if all
> + * their eus are off.
> + */
> + for (s = 0; s < sseu->max_slices; s++) {
> + for (ss = 0; ss < sseu->max_subslices; ss++) {
> + if (sseu_eu_mask(sseu, s, ss, 0) == 0)
> + sseu->subslice_mask[s] &= ~BIT(ss);
> + }
> + }
> +
> + sseu->subslice_total = compute_subslice_total(sseu);
> + sseu->eu_total = compute_eu_total(sseu);
>
> /*
> * CNL is expected to always have a uniform distribution
> @@ -155,26 +226,31 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
> static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
> {
> struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
> - u32 fuse, eu_dis;
> + u32 fuse;
>
> fuse = I915_READ(CHV_FUSE_GT);
>
> sseu->slice_mask = BIT(0);
> + sseu->max_slices = 1;
> + sseu->max_subslices = 2;
> + sseu->max_eus_per_subslice = 8;
>
> if (!(fuse & CHV_FGT_DISABLE_SS0)) {
> - sseu->subslice_mask |= BIT(0);
> - eu_dis = fuse & (CHV_FGT_EU_DIS_SS0_R0_MASK |
> - CHV_FGT_EU_DIS_SS0_R1_MASK);
> - sseu->eu_total += 8 - hweight32(eu_dis);
> + sseu->subslice_mask[0] |= BIT(0);
> + sseu->eu_mask[0] = ~((fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT);
> + sseu->eu_mask[0] |= ~(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
Use sseu_eu_idx?
> + sseu->subslice_mask[0] = 1;
> }
>
> if (!(fuse & CHV_FGT_DISABLE_SS1)) {
> - sseu->subslice_mask |= BIT(1);
> - eu_dis = fuse & (CHV_FGT_EU_DIS_SS1_R0_MASK |
> - CHV_FGT_EU_DIS_SS1_R1_MASK);
> - sseu->eu_total += 8 - hweight32(eu_dis);
> + sseu->subslice_mask[0] |= BIT(1);
> + sseu->eu_mask[1] = ~((fuse & CHV_FGT_EU_DIS_SS1_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT);
> + sseu->eu_mask[1] |= ~(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
> }
>
> + sseu->subslice_total = compute_subslice_total(sseu);
> + sseu->eu_total = compute_eu_total(sseu);
> +
> /*
> * CHV expected to always have a uniform distribution of EU
> * across subslices.
> @@ -196,41 +272,53 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
> {
> struct intel_device_info *info = mkwrite_device_info(dev_priv);
> struct sseu_dev_info *sseu = &info->sseu;
> - int s_max = 3, ss_max = 4, eu_max = 8;
> int s, ss;
> - u32 fuse2, eu_disable;
> - u8 eu_mask = 0xff;
> + u32 fuse2, eu_disable, subslice_mask;
> + const u8 eu_mask = 0xff;
>
> fuse2 = I915_READ(GEN8_FUSE2);
> sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
>
> + /* BXT has a single slice and at most 3 subslices. */
> + sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
> + sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
> + sseu->max_eus_per_subslice = 8;
> +
> /*
> * The subslice disable field is global, i.e. it applies
> * to each of the enabled slices.
> */
> - sseu->subslice_mask = (1 << ss_max) - 1;
> - sseu->subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
> - GEN9_F2_SS_DIS_SHIFT);
> + subslice_mask = (1 << sseu->max_subslices) - 1;
> + subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
> + GEN9_F2_SS_DIS_SHIFT);
>
> /*
> * Iterate through enabled slices and subslices to
> * count the total enabled EU.
> */
> - for (s = 0; s < s_max; s++) {
> + for (s = 0; s < sseu->max_slices; s++) {
> if (!(sseu->slice_mask & BIT(s)))
> /* skip disabled slice */
> continue;
>
> + sseu->subslice_mask[s] = subslice_mask;
> +
> eu_disable = I915_READ(GEN9_EU_DISABLE(s));
> - for (ss = 0; ss < ss_max; ss++) {
> + for (ss = 0; ss < sseu->max_subslices; ss++) {
> int eu_per_ss;
> + u8 eu_disabled_mask;
>
> - if (!(sseu->subslice_mask & BIT(ss)))
> + if (!(sseu->subslice_mask[s] & BIT(ss)))
> /* skip disabled subslice */
> continue;
>
> - eu_per_ss = eu_max - hweight8((eu_disable >> (ss*8)) &
> - eu_mask);
> + eu_disabled_mask = (eu_disable >> (ss*8)) & eu_mask; > +
> + sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] =
> + ~eu_disabled_mask;
> +
> + eu_per_ss = sseu->max_eus_per_subslice -
> + hweight8(eu_disabled_mask);
>
> /*
> * Record which subslice(s) has(have) 7 EUs. we
> @@ -239,11 +327,12 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
> */
> if (eu_per_ss == 7)
> sseu->subslice_7eu[s] |= BIT(ss);
> -
> - sseu->eu_total += eu_per_ss;
> }
> }
>
> + sseu->subslice_total = compute_subslice_total(sseu);
> + sseu->eu_total = compute_eu_total(sseu);
> +
> /*
> * SKL is expected to always have a uniform distribution
> * of EU across subslices with the exception that any one
> @@ -269,8 +358,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
> sseu->has_eu_pg = sseu->eu_per_subslice > 2;
>
> if (IS_GEN9_LP(dev_priv)) {
> -#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask & BIT(ss)))
> - info->has_pooled_eu = hweight8(sseu->subslice_mask) == 3;
> +#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask[0] & BIT(ss)))
> + info->has_pooled_eu = hweight8(sseu->subslice_mask[0]) == 3;
>
> sseu->min_eu_in_pool = 0;
> if (info->has_pooled_eu) {
> @@ -288,19 +377,22 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
> static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
> {
> struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
> - const int s_max = 3, ss_max = 3, eu_max = 8;
> int s, ss;
> - u32 fuse2, eu_disable[3]; /* s_max */
> + u32 fuse2, subslice_mask, eu_disable[3]; /* s_max */
>
> fuse2 = I915_READ(GEN8_FUSE2);
> sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
> + sseu->max_slices = 3;
> + sseu->max_subslices = 3;
> + sseu->max_eus_per_subslice = 8;
> +
> /*
> * The subslice disable field is global, i.e. it applies
> * to each of the enabled slices.
> */
> - sseu->subslice_mask = GENMASK(ss_max - 1, 0);
> - sseu->subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
> - GEN8_F2_SS_DIS_SHIFT);
> + subslice_mask = GENMASK(sseu->max_subslices - 1, 0);
> + subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
> + GEN8_F2_SS_DIS_SHIFT);
>
> eu_disable[0] = I915_READ(GEN8_EU_DISABLE0) & GEN8_EU_DIS0_S0_MASK;
> eu_disable[1] = (I915_READ(GEN8_EU_DISABLE0) >> GEN8_EU_DIS0_S1_SHIFT) |
> @@ -314,30 +406,40 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
> * Iterate through enabled slices and subslices to
> * count the total enabled EU.
> */
> - for (s = 0; s < s_max; s++) {
> + for (s = 0; s < sseu->max_slices; s++) {
> if (!(sseu->slice_mask & BIT(s)))
> /* skip disabled slice */
> continue;
>
> - for (ss = 0; ss < ss_max; ss++) {
> + sseu->subslice_mask[s] = subslice_mask;
> +
> + for (ss = 0; ss < sseu->max_subslices; ss++) {
> + u8 eu_disabled_mask;
> u32 n_disabled;
>
> - if (!(sseu->subslice_mask & BIT(ss)))
> + if (!(sseu->subslice_mask[ss] & BIT(ss)))
> /* skip disabled subslice */
> continue;
>
> - n_disabled = hweight8(eu_disable[s] >> (ss * eu_max));
> + eu_disabled_mask =
> + eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
> +
> + sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] =
> + ~eu_disabled_mask;
> +
> + n_disabled = hweight8(eu_disabled_mask);
>
> /*
> * Record which subslices have 7 EUs.
> */
> - if (eu_max - n_disabled == 7)
> + if (sseu->max_eus_per_subslice - n_disabled == 7)
> sseu->subslice_7eu[s] |= 1 << ss;
> -
> - sseu->eu_total += eu_max - n_disabled;
> }
> }
>
> + sseu->subslice_total = compute_subslice_total(sseu);
> + sseu->eu_total = compute_eu_total(sseu);
> +
> /*
> * BDW is expected to always have a uniform distribution of EU across
> * subslices with the exception that any one EU in any one subslice may
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 49cb27bd04c1..6eb8bcf5a213 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -110,10 +110,14 @@ enum intel_platform {
> func(supports_tv); \
> func(has_ipc);
>
> +#define GEN_MAX_SLICES (6) /* CNL upper bound */
> +#define GEN_MAX_SUBSLICES (7)
> +
> struct sseu_dev_info {
> u8 slice_mask;
> - u8 subslice_mask;
> - u8 eu_total;
> + u8 subslice_mask[GEN_MAX_SUBSLICES];
> + u16 subslice_total;
> + u16 eu_total;
> u8 eu_per_subslice;
> u8 min_eu_in_pool;
> /* For each slice, which subslice(s) has(have) 7 EUs (bitfield)? */
> @@ -121,6 +125,17 @@ struct sseu_dev_info {
> u8 has_slice_pg:1;
> u8 has_subslice_pg:1;
> u8 has_eu_pg:1;
> +
> + /* Topology fields */
> + u8 max_slices;
> + u8 max_subslices;
> + u8 max_eus_per_subslice;
> +
> + /* We don't have more than 8 eus per subslice at the moment and as we
> + * store eus enabled using bits, no need to multiply by eus per
> + * subslice.
> + */
> + u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
> };
>
> struct intel_device_info {
> @@ -167,7 +182,22 @@ struct intel_device_info {
>
> static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
> {
> - return hweight8(sseu->slice_mask) * hweight8(sseu->subslice_mask);
> + return sseu->subslice_total;
> +}
> +
> +static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
> + int slice, int subslice, int eu_group)
What is eu_group for? Will it be used at some point?
> +{
> + int subslice_stride = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
What are these eights? sizeof(eu_mask[0]) ? Is this whole statement
DIV_ROUND_UP also?
> + int slice_stride = sseu->max_subslices * subslice_stride;
> +
> + return slice * slice_stride + subslice * subslice_stride + eu_group;
> +}
> +
> +static inline int sseu_eu_mask(const struct sseu_dev_info *sseu,
> + int slice, int subslice, int eu_group)
> +{
> + return sseu->eu_mask[sseu_eu_idx(sseu, slice, subslice, eu_group)];
> }
>
> const char *intel_platform_name(enum intel_platform platform);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ff25f209d0a5..ac7896031b8d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -2098,7 +2098,7 @@ make_rpcs(struct drm_i915_private *dev_priv)
>
> if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
> rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
> - rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask) <<
> + rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]) <<
> GEN8_RPCS_SS_CNT_SHIFT;
> rpcs |= GEN8_RPCS_ENABLE;
> }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index c5ff203e42d6..23ae9a957fab 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -90,7 +90,7 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
>
> #define instdone_subslice_mask(dev_priv__) \
> (INTEL_GEN(dev_priv__) == 7 ? \
> - 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask)
> + 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask[0])
>
> #define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
> for ((slice__) = 0, (subslice__) = 0; \
>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/6] drm/i915/debugfs: add rcs topology entry
2018-01-11 19:53 ` [PATCH v2 3/6] drm/i915/debugfs: add rcs topology entry Lionel Landwerlin
@ 2018-01-12 10:21 ` Tvrtko Ursulin
2018-01-12 11:02 ` Lionel Landwerlin
0 siblings, 1 reply; 21+ messages in thread
From: Tvrtko Ursulin @ 2018-01-12 10:21 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 11/01/2018 19:53, Lionel Landwerlin wrote:
> While the end goal is to make this information available to userspace
> through a new ioctl, there is no reason we can't display it in a human
> readable fashion through debugfs.
>
> slice0: 3 subslice(s) (0x7):
> subslice0: 8 EUs (0xff)
> subslice1: 8 EUs (0xff)
> subslice2: 8 EUs (0xff)
> subslice3: 0 EUs (0x0)
> slice1: 3 subslice(s) (0x7):
> subslice0: 8 EUs (0xff)
> subslice1: 8 EUs (0xff)
> subslice2: 8 EUs (0xff)
> subslice3: 0 EUs (0x0)
> slice2: 3 subslice(s) (0x7):
> subslice0: 8 EUs (0xff)
> subslice1: 8 EUs (0xff)
> subslice2: 8 EUs (0xff)
> subslice3: 0 EUs (0x0)
>
> v2: Reformat debugfs printing (Tvrtko)
> Use the new EU mask helper (Tvrtko)
>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 42 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 42 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2d1c9cce5fe4..83af1029b907 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3162,6 +3162,47 @@ static int i915_engine_info(struct seq_file *m, void *unused)
> return 0;
> }
>
> +static int i915_rcs_topology(struct seq_file *m, void *unused)
> +{
> + struct drm_i915_private *dev_priv = node_to_i915(m->private);
> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
> + int s, ss;
> + int subslice_stride =
> + DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
BITS_PER_BYTE is sizeof(eu_mask[0]) * BITS_PER_BYTE ?
> +
> + if (sseu->max_slices == 0) {
> + seq_printf(m, "Unavailable\n");
> + return 0;
> + }
> +
> + for (s = 0; s < sseu->max_slices; s++) {
> + seq_printf(m, "slice%i: %u subslice(s) (0x%hhx):\n",
> + s, hweight8(sseu->subslice_mask[s]),
> + sseu->subslice_mask[s]);
> +
> + for (ss = 0; ss < sseu->max_subslices; ss++) {
> + int eu_group, n_subslice_eus = 0;
> +
> + for (eu_group = 0; eu_group < subslice_stride; eu_group++) {
> + n_subslice_eus +=
> + hweight8(sseu_eu_mask(sseu, s, ss, eu_group));
> + }
Still trying to understand eu_group concept - is this just to handle
more than 8 EUs couple with the fact you chose eu_mask to be u8? Or is a
hw concept?
Regards,
Tvrtko
> +
> + seq_printf(m, "\tsubslice%i: %u EUs (", ss, n_subslice_eus);
> + for (eu_group = 0;
> + eu_group < max(0, subslice_stride - 1);
> + eu_group++) {
> + u8 val = sseu_eu_mask(sseu, s, ss, eu_group);
> + seq_printf(m, "0x%hhx, ", val);
> + }
> + seq_printf(m, "0x%hhx)\n",
> + sseu_eu_mask(sseu, s, ss, subslice_stride - 1));
> + }
> + }
> +
> + return 0;
> +}
> +
> static int i915_shrinker_info(struct seq_file *m, void *unused)
> {
> struct drm_i915_private *i915 = node_to_i915(m->private);
> @@ -4692,6 +4733,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
> {"i915_dmc_info", i915_dmc_info, 0},
> {"i915_display_info", i915_display_info, 0},
> {"i915_engine_info", i915_engine_info, 0},
> + {"i915_rcs_topology", i915_rcs_topology, 0},
> {"i915_shrinker_info", i915_shrinker_info, 0},
> {"i915_shared_dplls_info", i915_shared_dplls_info, 0},
> {"i915_dp_mst_info", i915_dp_mst_info, 0},
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-12 10:15 ` Tvrtko Ursulin
@ 2018-01-12 10:58 ` Lionel Landwerlin
2018-01-12 11:05 ` Tvrtko Ursulin
2018-01-12 12:01 ` Tvrtko Ursulin
0 siblings, 2 replies; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-12 10:58 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
On 12/01/18 10:15, Tvrtko Ursulin wrote:
>
> On 11/01/2018 19:53, Lionel Landwerlin wrote:
>> Up to now, subslice mask was assumed to be uniform across slices. But
>> starting with Cannonlake, slices can be asymmetric (for example slice0
>> has different number of subslices as slice1+). This change stores all
>> subslices masks for all slices rather than having a single mask that
>> applies to all slices.
>>
>> v2: Rework how we store total numbers in sseu_dev_info (Tvrtko)
>> Fix CHV eu masks, was reading disabled as enabled (Tvrtko)
>> Readability changes (Tvrtko)
>> Add EU index helper (Tvrtko)
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_debugfs.c | 25 ++--
>> drivers/gpu/drm/i915/i915_drv.c | 2 +-
>> drivers/gpu/drm/i915/intel_device_info.c | 196
>> +++++++++++++++++++++++--------
>> drivers/gpu/drm/i915/intel_device_info.h | 36 +++++-
>> drivers/gpu/drm/i915/intel_lrc.c | 2 +-
>> drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
>> 6 files changed, 200 insertions(+), 63 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 2bb63073d73f..463029f72a0b 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -4285,7 +4285,7 @@ static void
>> cherryview_sseu_device_status(struct drm_i915_private *dev_priv,
>> continue;
>> sseu->slice_mask = BIT(0);
>> - sseu->subslice_mask |= BIT(ss);
>> + sseu->subslice_mask[0] |= BIT(ss);
>> eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
>> ((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
>> ((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
>> @@ -4332,7 +4332,7 @@ static void gen10_sseu_device_status(struct
>> drm_i915_private *dev_priv,
>> continue;
>> sseu->slice_mask |= BIT(s);
>> - sseu->subslice_mask = info->sseu.subslice_mask;
>> + sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
>> for (ss = 0; ss < ss_max; ss++) {
>> unsigned int eu_cnt;
>> @@ -4387,8 +4387,8 @@ static void gen9_sseu_device_status(struct
>> drm_i915_private *dev_priv,
>> sseu->slice_mask |= BIT(s);
>> if (IS_GEN9_BC(dev_priv))
>> - sseu->subslice_mask =
>> - INTEL_INFO(dev_priv)->sseu.subslice_mask;
>> + sseu->subslice_mask[s] =
>> + INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
>> for (ss = 0; ss < ss_max; ss++) {
>> unsigned int eu_cnt;
>> @@ -4398,7 +4398,7 @@ static void gen9_sseu_device_status(struct
>> drm_i915_private *dev_priv,
>> /* skip disabled subslice */
>> continue;
>> - sseu->subslice_mask |= BIT(ss);
>> + sseu->subslice_mask[s] |= BIT(ss);
>> }
>> eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
>> @@ -4420,9 +4420,12 @@ static void
>> broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
>> sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
>> if (sseu->slice_mask) {
>> - sseu->subslice_mask = INTEL_INFO(dev_priv)->sseu.subslice_mask;
>> sseu->eu_per_subslice =
>> INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
>> + for (s = 0; s < fls(sseu->slice_mask); s++) {
>> + sseu->subslice_mask[s] =
>> + INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
>> + }
>> sseu->eu_total = sseu->eu_per_subslice *
>> sseu_subslice_total(sseu);
>> @@ -4441,6 +4444,7 @@ static void i915_print_sseu_info(struct
>> seq_file *m, bool is_available_info,
>> {
>> struct drm_i915_private *dev_priv = node_to_i915(m->private);
>> const char *type = is_available_info ? "Available" : "Enabled";
>> + int s;
>> seq_printf(m, " %s Slice Mask: %04x\n", type,
>> sseu->slice_mask);
>> @@ -4448,10 +4452,11 @@ static void i915_print_sseu_info(struct
>> seq_file *m, bool is_available_info,
>> hweight8(sseu->slice_mask));
>> seq_printf(m, " %s Subslice Total: %u\n", type,
>> sseu_subslice_total(sseu));
>> - seq_printf(m, " %s Subslice Mask: %04x\n", type,
>> - sseu->subslice_mask);
>> - seq_printf(m, " %s Subslice Per Slice: %u\n", type,
>> - hweight8(sseu->subslice_mask));
>> + for (s = 0; s < fls(sseu->slice_mask); s++) {
>> + seq_printf(m, " %s Slice%i %u subslices, mask=%04x\n", type,
>> + s, hweight8(sseu->subslice_mask[s]),
>> + sseu->subslice_mask[s]);
>> + }
>> seq_printf(m, " %s EU Total: %u\n", type,
>> sseu->eu_total);
>> seq_printf(m, " %s EU Per Subslice: %u\n", type,
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index 6c8da9d20c33..969835d3cbcd 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -414,7 +414,7 @@ static int i915_getparam(struct drm_device *dev,
>> void *data,
>> return -ENODEV;
>> break;
>> case I915_PARAM_SUBSLICE_MASK:
>> - value = INTEL_INFO(dev_priv)->sseu.subslice_mask;
>> + value = INTEL_INFO(dev_priv)->sseu.subslice_mask[0];
>> if (!value)
>> return -ENODEV;
>> break;
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c
>> b/drivers/gpu/drm/i915/intel_device_info.c
>> index d28592e43512..0bf6ef51cdb7 100644
>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>> @@ -80,12 +80,17 @@ void intel_device_info_dump_flags(const struct
>> intel_device_info *info,
>> static void sseu_dump(const struct sseu_dev_info *sseu, struct
>> drm_printer *p)
>> {
>> + int s;
>> +
>> drm_printf(p, "slice mask: %04x\n", sseu->slice_mask);
>> drm_printf(p, "slice total: %u\n", hweight8(sseu->slice_mask));
>> drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
>> - drm_printf(p, "subslice mask %04x\n", sseu->subslice_mask);
>> - drm_printf(p, "subslice per slice: %u\n",
>> - hweight8(sseu->subslice_mask));
>> + for (s = 0; s < ARRAY_SIZE(sseu->subslice_mask); s++) {
>> + drm_printf(p, "slice%d subslice mask %04x\n",
>> + s, sseu->subslice_mask[s]);
>> + drm_printf(p, "slice%d subslice per slice: %u\n",
>> + s, hweight8(sseu->subslice_mask[s]));
>
> Cosmetic only but consider condensing this into one line if you don't
> have a preference to either.
Sure, just conscious about the 80characters.
>
>> + }
>> drm_printf(p, "EU total: %u\n", sseu->eu_total);
>> drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
>> drm_printf(p, "has slice power gating: %s\n",
>> @@ -119,22 +124,88 @@ void intel_device_info_dump(const struct
>> intel_device_info *info,
>> intel_device_info_dump_flags(info, p);
>> }
>> +static u16 compute_eu_total(const struct sseu_dev_info *sseu)
>> +{
>> + u16 i, total = 0;
>> +
>> + for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++)
>> + total += hweight8(sseu->eu_mask[i]);
>> +
>> + return total;
>> +}
>> +
>> +static u16 compute_subslice_total(const struct sseu_dev_info *sseu)
>> +{
>> + u16 i, total = 0;
>> +
>> + for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
>> + total += hweight8(sseu->subslice_mask[i]);
>> +
>> + return total;
>> +}
>> +
>> static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
>> {
>> struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
>> const u32 fuse2 = I915_READ(GEN8_FUSE2);
>> + int s, ss;
>> + const int eu_mask = 0xff;
>> + u32 subslice_mask, eu_en;
>> sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
>> GEN10_F2_S_ENA_SHIFT;
>> - sseu->subslice_mask = (1 << 4) - 1;
>> - sseu->subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
>> - GEN10_F2_SS_DIS_SHIFT);
>> + sseu->max_slices = 6;
>> + sseu->max_subslices = 4;
>> + sseu->max_eus_per_subslice = 8;
>> - sseu->eu_total = hweight32(~I915_READ(GEN8_EU_DISABLE0));
>> - sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE1));
>> - sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE2));
>> - sseu->eu_total += hweight8(~(I915_READ(GEN10_EU_DISABLE3) &
>> - GEN10_EU_DIS_SS_MASK));
>> + subslice_mask = (1 << 4) - 1;
>> + subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
>> + GEN10_F2_SS_DIS_SHIFT);
>> +
>> + /* Slice0 can have up to 3 subslices, but there are only 2 in
>> + * slice1/2.
>> + */
>
> Sorry for not spotting this earlier - this comment style is frowned
> upon. Instead for multi-line please use:
>
> /*
> * Blah blah...
> */
Done.
>
>> + sseu->subslice_mask[0] = subslice_mask;
>> + for (s = 1; s < sseu->max_slices; s++)
>> + sseu->subslice_mask[s] = subslice_mask & 0x3;
>> +
>> + /* Slice0 */
>> + eu_en = ~I915_READ(GEN8_EU_DISABLE0);
>> + for (ss = 0; ss < sseu->max_subslices; ss++) {
>> + sseu->eu_mask[sseu_eu_idx(sseu, 0, ss, 0)] =
>> + (eu_en >> (8 * ss)) & eu_mask;
>> + }
>> + /* Slice1 */
>> + sseu->eu_mask[sseu_eu_idx(sseu, 1, 0, 0)] = (eu_en >> 24) &
>> eu_mask;
>> + eu_en = ~I915_READ(GEN8_EU_DISABLE1);
>> + sseu->eu_mask[sseu_eu_idx(sseu, 1, 1, 0)] = eu_en & eu_mask;
>> + /* Slice2 */
>> + sseu->eu_mask[sseu_eu_idx(sseu, 2, 0, 0)] = (eu_en >> 8) & eu_mask;
>> + sseu->eu_mask[sseu_eu_idx(sseu, 2, 1, 0)] = (eu_en >> 16) &
>> eu_mask;
>> + /* Slice3 */
>> + sseu->eu_mask[sseu_eu_idx(sseu, 3, 0, 0)] = (eu_en >> 24) &
>> eu_mask;
>> + eu_en = ~I915_READ(GEN8_EU_DISABLE2);
>> + sseu->eu_mask[sseu_eu_idx(sseu, 3, 1, 0)] = eu_en & eu_mask;
>> + /* Slice4 */
>> + sseu->eu_mask[sseu_eu_idx(sseu, 4, 0, 0)] = (eu_en >> 8) & eu_mask;
>> + sseu->eu_mask[sseu_eu_idx(sseu, 4, 1, 0)] = (eu_en >> 16) &
>> eu_mask;
>> + /* Slice5 */
>> + sseu->eu_mask[sseu_eu_idx(sseu, 5, 0, 0)] = (eu_en >> 24) &
>> eu_mask;
>> + eu_en = ~I915_READ(GEN10_EU_DISABLE3);
>> + sseu->eu_mask[sseu_eu_idx(sseu, 5, 1, 0)] = eu_en & eu_mask;
>
> Indices are guaranteed to be unique in these operations? No need to RMW?
For a given max_slices/max_subslices/DIV_ROUND_UP(max_eus_per_subslice,
BITS_PER_BYTE) indices are unique.
>
>> +
>> + /* Do a second pass where we marked the subslices disabled if all
>> + * their eus are off.
>> + */
>> + for (s = 0; s < sseu->max_slices; s++) {
>> + for (ss = 0; ss < sseu->max_subslices; ss++) {
>> + if (sseu_eu_mask(sseu, s, ss, 0) == 0)
>> + sseu->subslice_mask[s] &= ~BIT(ss);
>> + }
>> + }
>> +
>> + sseu->subslice_total = compute_subslice_total(sseu);
>> + sseu->eu_total = compute_eu_total(sseu);
>> /*
>> * CNL is expected to always have a uniform distribution
>> @@ -155,26 +226,31 @@ static void gen10_sseu_info_init(struct
>> drm_i915_private *dev_priv)
>> static void cherryview_sseu_info_init(struct drm_i915_private
>> *dev_priv)
>> {
>> struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
>> - u32 fuse, eu_dis;
>> + u32 fuse;
>> fuse = I915_READ(CHV_FUSE_GT);
>> sseu->slice_mask = BIT(0);
>> + sseu->max_slices = 1;
>> + sseu->max_subslices = 2;
>> + sseu->max_eus_per_subslice = 8;
>> if (!(fuse & CHV_FGT_DISABLE_SS0)) {
>> - sseu->subslice_mask |= BIT(0);
>> - eu_dis = fuse & (CHV_FGT_EU_DIS_SS0_R0_MASK |
>> - CHV_FGT_EU_DIS_SS0_R1_MASK);
>> - sseu->eu_total += 8 - hweight32(eu_dis);
>> + sseu->subslice_mask[0] |= BIT(0);
>> + sseu->eu_mask[0] = ~((fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >>
>> CHV_FGT_EU_DIS_SS0_R0_SHIFT);
>> + sseu->eu_mask[0] |= ~(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK)
>> >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>
> Use sseu_eu_idx?
Done.
>
>> + sseu->subslice_mask[0] = 1;
>> }
>> if (!(fuse & CHV_FGT_DISABLE_SS1)) {
>> - sseu->subslice_mask |= BIT(1);
>> - eu_dis = fuse & (CHV_FGT_EU_DIS_SS1_R0_MASK |
>> - CHV_FGT_EU_DIS_SS1_R1_MASK);
>> - sseu->eu_total += 8 - hweight32(eu_dis);
>> + sseu->subslice_mask[0] |= BIT(1);
>> + sseu->eu_mask[1] = ~((fuse & CHV_FGT_EU_DIS_SS1_R0_MASK) >>
>> CHV_FGT_EU_DIS_SS0_R0_SHIFT);
>> + sseu->eu_mask[1] |= ~(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK)
>> >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>> }
>> + sseu->subslice_total = compute_subslice_total(sseu);
>> + sseu->eu_total = compute_eu_total(sseu);
>> +
>> /*
>> * CHV expected to always have a uniform distribution of EU
>> * across subslices.
>> @@ -196,41 +272,53 @@ static void gen9_sseu_info_init(struct
>> drm_i915_private *dev_priv)
>> {
>> struct intel_device_info *info = mkwrite_device_info(dev_priv);
>> struct sseu_dev_info *sseu = &info->sseu;
>> - int s_max = 3, ss_max = 4, eu_max = 8;
>> int s, ss;
>> - u32 fuse2, eu_disable;
>> - u8 eu_mask = 0xff;
>> + u32 fuse2, eu_disable, subslice_mask;
>> + const u8 eu_mask = 0xff;
>> fuse2 = I915_READ(GEN8_FUSE2);
>> sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >>
>> GEN8_F2_S_ENA_SHIFT;
>> + /* BXT has a single slice and at most 3 subslices. */
>> + sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
>> + sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
>> + sseu->max_eus_per_subslice = 8;
>> +
>> /*
>> * The subslice disable field is global, i.e. it applies
>> * to each of the enabled slices.
>> */
>> - sseu->subslice_mask = (1 << ss_max) - 1;
>> - sseu->subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
>> - GEN9_F2_SS_DIS_SHIFT);
>> + subslice_mask = (1 << sseu->max_subslices) - 1;
>> + subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
>> + GEN9_F2_SS_DIS_SHIFT);
>> /*
>> * Iterate through enabled slices and subslices to
>> * count the total enabled EU.
>> */
>> - for (s = 0; s < s_max; s++) {
>> + for (s = 0; s < sseu->max_slices; s++) {
>> if (!(sseu->slice_mask & BIT(s)))
>> /* skip disabled slice */
>> continue;
>> + sseu->subslice_mask[s] = subslice_mask;
>> +
>> eu_disable = I915_READ(GEN9_EU_DISABLE(s));
>> - for (ss = 0; ss < ss_max; ss++) {
>> + for (ss = 0; ss < sseu->max_subslices; ss++) {
>> int eu_per_ss;
>> + u8 eu_disabled_mask;
>> - if (!(sseu->subslice_mask & BIT(ss)))
>> + if (!(sseu->subslice_mask[s] & BIT(ss)))
>> /* skip disabled subslice */
>> continue;
>> - eu_per_ss = eu_max - hweight8((eu_disable >> (ss*8)) &
>> - eu_mask);
>> + eu_disabled_mask = (eu_disable >> (ss*8)) & eu_mask; > +
>> + sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] =
>> + ~eu_disabled_mask;
>> +
>> + eu_per_ss = sseu->max_eus_per_subslice -
>> + hweight8(eu_disabled_mask);
>> /*
>> * Record which subslice(s) has(have) 7 EUs. we
>> @@ -239,11 +327,12 @@ static void gen9_sseu_info_init(struct
>> drm_i915_private *dev_priv)
>> */
>> if (eu_per_ss == 7)
>> sseu->subslice_7eu[s] |= BIT(ss);
>> -
>> - sseu->eu_total += eu_per_ss;
>> }
>> }
>> + sseu->subslice_total = compute_subslice_total(sseu);
>> + sseu->eu_total = compute_eu_total(sseu);
>> +
>> /*
>> * SKL is expected to always have a uniform distribution
>> * of EU across subslices with the exception that any one
>> @@ -269,8 +358,8 @@ static void gen9_sseu_info_init(struct
>> drm_i915_private *dev_priv)
>> sseu->has_eu_pg = sseu->eu_per_subslice > 2;
>> if (IS_GEN9_LP(dev_priv)) {
>> -#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask & BIT(ss)))
>> - info->has_pooled_eu = hweight8(sseu->subslice_mask) == 3;
>> +#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask[0] & BIT(ss)))
>> + info->has_pooled_eu = hweight8(sseu->subslice_mask[0]) == 3;
>> sseu->min_eu_in_pool = 0;
>> if (info->has_pooled_eu) {
>> @@ -288,19 +377,22 @@ static void gen9_sseu_info_init(struct
>> drm_i915_private *dev_priv)
>> static void broadwell_sseu_info_init(struct drm_i915_private
>> *dev_priv)
>> {
>> struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
>> - const int s_max = 3, ss_max = 3, eu_max = 8;
>> int s, ss;
>> - u32 fuse2, eu_disable[3]; /* s_max */
>> + u32 fuse2, subslice_mask, eu_disable[3]; /* s_max */
>> fuse2 = I915_READ(GEN8_FUSE2);
>> sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >>
>> GEN8_F2_S_ENA_SHIFT;
>> + sseu->max_slices = 3;
>> + sseu->max_subslices = 3;
>> + sseu->max_eus_per_subslice = 8;
>> +
>> /*
>> * The subslice disable field is global, i.e. it applies
>> * to each of the enabled slices.
>> */
>> - sseu->subslice_mask = GENMASK(ss_max - 1, 0);
>> - sseu->subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
>> - GEN8_F2_SS_DIS_SHIFT);
>> + subslice_mask = GENMASK(sseu->max_subslices - 1, 0);
>> + subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
>> + GEN8_F2_SS_DIS_SHIFT);
>> eu_disable[0] = I915_READ(GEN8_EU_DISABLE0) &
>> GEN8_EU_DIS0_S0_MASK;
>> eu_disable[1] = (I915_READ(GEN8_EU_DISABLE0) >>
>> GEN8_EU_DIS0_S1_SHIFT) |
>> @@ -314,30 +406,40 @@ static void broadwell_sseu_info_init(struct
>> drm_i915_private *dev_priv)
>> * Iterate through enabled slices and subslices to
>> * count the total enabled EU.
>> */
>> - for (s = 0; s < s_max; s++) {
>> + for (s = 0; s < sseu->max_slices; s++) {
>> if (!(sseu->slice_mask & BIT(s)))
>> /* skip disabled slice */
>> continue;
>> - for (ss = 0; ss < ss_max; ss++) {
>> + sseu->subslice_mask[s] = subslice_mask;
>> +
>> + for (ss = 0; ss < sseu->max_subslices; ss++) {
>> + u8 eu_disabled_mask;
>> u32 n_disabled;
>> - if (!(sseu->subslice_mask & BIT(ss)))
>> + if (!(sseu->subslice_mask[ss] & BIT(ss)))
>> /* skip disabled subslice */
>> continue;
>> - n_disabled = hweight8(eu_disable[s] >> (ss * eu_max));
>> + eu_disabled_mask =
>> + eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
>> +
>> + sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] =
>> + ~eu_disabled_mask;
>> +
>> + n_disabled = hweight8(eu_disabled_mask);
>> /*
>> * Record which subslices have 7 EUs.
>> */
>> - if (eu_max - n_disabled == 7)
>> + if (sseu->max_eus_per_subslice - n_disabled == 7)
>> sseu->subslice_7eu[s] |= 1 << ss;
>> -
>> - sseu->eu_total += eu_max - n_disabled;
>> }
>> }
>> + sseu->subslice_total = compute_subslice_total(sseu);
>> + sseu->eu_total = compute_eu_total(sseu);
>> +
>> /*
>> * BDW is expected to always have a uniform distribution of EU
>> across
>> * subslices with the exception that any one EU in any one
>> subslice may
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.h
>> b/drivers/gpu/drm/i915/intel_device_info.h
>> index 49cb27bd04c1..6eb8bcf5a213 100644
>> --- a/drivers/gpu/drm/i915/intel_device_info.h
>> +++ b/drivers/gpu/drm/i915/intel_device_info.h
>> @@ -110,10 +110,14 @@ enum intel_platform {
>> func(supports_tv); \
>> func(has_ipc);
>> +#define GEN_MAX_SLICES (6) /* CNL upper bound */
>> +#define GEN_MAX_SUBSLICES (7)
>> +
>> struct sseu_dev_info {
>> u8 slice_mask;
>> - u8 subslice_mask;
>> - u8 eu_total;
>> + u8 subslice_mask[GEN_MAX_SUBSLICES];
>> + u16 subslice_total;
>> + u16 eu_total;
>> u8 eu_per_subslice;
>> u8 min_eu_in_pool;
>> /* For each slice, which subslice(s) has(have) 7 EUs
>> (bitfield)? */
>> @@ -121,6 +125,17 @@ struct sseu_dev_info {
>> u8 has_slice_pg:1;
>> u8 has_subslice_pg:1;
>> u8 has_eu_pg:1;
>> +
>> + /* Topology fields */
>> + u8 max_slices;
>> + u8 max_subslices;
>> + u8 max_eus_per_subslice;
>> +
>> + /* We don't have more than 8 eus per subslice at the moment and
>> as we
>> + * store eus enabled using bits, no need to multiply by eus per
>> + * subslice.
>> + */
>> + u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
>> };
>> struct intel_device_info {
>> @@ -167,7 +182,22 @@ struct intel_device_info {
>> static inline unsigned int sseu_subslice_total(const struct
>> sseu_dev_info *sseu)
>> {
>> - return hweight8(sseu->slice_mask) * hweight8(sseu->subslice_mask);
>> + return sseu->subslice_total;
>> +}
>> +
>> +static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
>> + int slice, int subslice, int eu_group)
>
> What is eu_group for? Will it be used at some point?
In case we ever have more than 8 EUs per subslice.
>
>> +{
>> + int subslice_stride = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
>
> What are these eights? sizeof(eu_mask[0]) ? Is this whole statement
> DIV_ROUND_UP also?
Sorry, forgot to replace some of them. Done.
>
>> + int slice_stride = sseu->max_subslices * subslice_stride;
>> +
>> + return slice * slice_stride + subslice * subslice_stride +
>> eu_group;
>> +}
>> +
>> +static inline int sseu_eu_mask(const struct sseu_dev_info *sseu,
>> + int slice, int subslice, int eu_group)
>> +{
>> + return sseu->eu_mask[sseu_eu_idx(sseu, slice, subslice, eu_group)];
>> }
>> const char *intel_platform_name(enum intel_platform platform);
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index ff25f209d0a5..ac7896031b8d 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -2098,7 +2098,7 @@ make_rpcs(struct drm_i915_private *dev_priv)
>> if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
>> rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
>> - rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask) <<
>> + rpcs |=
>> hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]) <<
>> GEN8_RPCS_SS_CNT_SHIFT;
>> rpcs |= GEN8_RPCS_ENABLE;
>> }
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index c5ff203e42d6..23ae9a957fab 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -90,7 +90,7 @@ hangcheck_action_to_str(const enum
>> intel_engine_hangcheck_action a)
>> #define instdone_subslice_mask(dev_priv__) \
>> (INTEL_GEN(dev_priv__) == 7 ? \
>> - 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask)
>> + 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask[0])
>> #define for_each_instdone_slice_subslice(dev_priv__, slice__,
>> subslice__) \
>> for ((slice__) = 0, (subslice__) = 0; \
>>
>
> Regards,
>
> Tvrtko
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/6] drm/i915/debugfs: add rcs topology entry
2018-01-12 10:21 ` Tvrtko Ursulin
@ 2018-01-12 11:02 ` Lionel Landwerlin
0 siblings, 0 replies; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-12 11:02 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
On 12/01/18 10:21, Tvrtko Ursulin wrote:
>
> On 11/01/2018 19:53, Lionel Landwerlin wrote:
>> While the end goal is to make this information available to userspace
>> through a new ioctl, there is no reason we can't display it in a human
>> readable fashion through debugfs.
>>
>> slice0: 3 subslice(s) (0x7):
>> subslice0: 8 EUs (0xff)
>> subslice1: 8 EUs (0xff)
>> subslice2: 8 EUs (0xff)
>> subslice3: 0 EUs (0x0)
>> slice1: 3 subslice(s) (0x7):
>> subslice0: 8 EUs (0xff)
>> subslice1: 8 EUs (0xff)
>> subslice2: 8 EUs (0xff)
>> subslice3: 0 EUs (0x0)
>> slice2: 3 subslice(s) (0x7):
>> subslice0: 8 EUs (0xff)
>> subslice1: 8 EUs (0xff)
>> subslice2: 8 EUs (0xff)
>> subslice3: 0 EUs (0x0)
>>
>> v2: Reformat debugfs printing (Tvrtko)
>> Use the new EU mask helper (Tvrtko)
>>
>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_debugfs.c | 42
>> +++++++++++++++++++++++++++++++++++++
>> 1 file changed, 42 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 2d1c9cce5fe4..83af1029b907 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -3162,6 +3162,47 @@ static int i915_engine_info(struct seq_file
>> *m, void *unused)
>> return 0;
>> }
>> +static int i915_rcs_topology(struct seq_file *m, void *unused)
>> +{
>> + struct drm_i915_private *dev_priv = node_to_i915(m->private);
>> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
>> + int s, ss;
>> + int subslice_stride =
>> + DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
>
> BITS_PER_BYTE is sizeof(eu_mask[0]) * BITS_PER_BYTE ?
If you have 8 or less EUs per subslice, yes.
>
>> +
>> + if (sseu->max_slices == 0) {
>> + seq_printf(m, "Unavailable\n");
>> + return 0;
>> + }
>> +
>> + for (s = 0; s < sseu->max_slices; s++) {
>> + seq_printf(m, "slice%i: %u subslice(s) (0x%hhx):\n",
>> + s, hweight8(sseu->subslice_mask[s]),
>> + sseu->subslice_mask[s]);
>> +
>> + for (ss = 0; ss < sseu->max_subslices; ss++) {
>> + int eu_group, n_subslice_eus = 0;
>> +
>> + for (eu_group = 0; eu_group < subslice_stride;
>> eu_group++) {
>> + n_subslice_eus +=
>> + hweight8(sseu_eu_mask(sseu, s, ss, eu_group));
>> + }
>
> Still trying to understand eu_group concept - is this just to handle
> more than 8 EUs couple with the fact you chose eu_mask to be u8? Or is
> a hw concept?
It's not a hw concept. I just wanted to make sure we had a plan if one
day we end up with more than 8 EUs per subslice.
>
> Regards,
>
> Tvrtko
>
>> +
>> + seq_printf(m, "\tsubslice%i: %u EUs (", ss,
>> n_subslice_eus);
>> + for (eu_group = 0;
>> + eu_group < max(0, subslice_stride - 1);
>> + eu_group++) {
>> + u8 val = sseu_eu_mask(sseu, s, ss, eu_group);
>> + seq_printf(m, "0x%hhx, ", val);
>> + }
>> + seq_printf(m, "0x%hhx)\n",
>> + sseu_eu_mask(sseu, s, ss, subslice_stride - 1));
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static int i915_shrinker_info(struct seq_file *m, void *unused)
>> {
>> struct drm_i915_private *i915 = node_to_i915(m->private);
>> @@ -4692,6 +4733,7 @@ static const struct drm_info_list
>> i915_debugfs_list[] = {
>> {"i915_dmc_info", i915_dmc_info, 0},
>> {"i915_display_info", i915_display_info, 0},
>> {"i915_engine_info", i915_engine_info, 0},
>> + {"i915_rcs_topology", i915_rcs_topology, 0},
>> {"i915_shrinker_info", i915_shrinker_info, 0},
>> {"i915_shared_dplls_info", i915_shared_dplls_info, 0},
>> {"i915_dp_mst_info", i915_dp_mst_info, 0},
>>
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-12 10:58 ` Lionel Landwerlin
@ 2018-01-12 11:05 ` Tvrtko Ursulin
2018-01-12 11:31 ` Lionel Landwerlin
2018-01-12 12:01 ` Tvrtko Ursulin
1 sibling, 1 reply; 21+ messages in thread
From: Tvrtko Ursulin @ 2018-01-12 11:05 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 12/01/2018 10:58, Lionel Landwerlin wrote:
> On 12/01/18 10:15, Tvrtko Ursulin wrote:
[snip]
>>
>>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>>> @@ -80,12 +80,17 @@ void intel_device_info_dump_flags(const struct
>>> intel_device_info *info,
>>> static void sseu_dump(const struct sseu_dev_info *sseu, struct
>>> drm_printer *p)
>>> {
>>> + int s;
>>> +
>>> drm_printf(p, "slice mask: %04x\n", sseu->slice_mask);
>>> drm_printf(p, "slice total: %u\n", hweight8(sseu->slice_mask));
>>> drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
>>> - drm_printf(p, "subslice mask %04x\n", sseu->subslice_mask);
>>> - drm_printf(p, "subslice per slice: %u\n",
>>> - hweight8(sseu->subslice_mask));
>>> + for (s = 0; s < ARRAY_SIZE(sseu->subslice_mask); s++) {
>>> + drm_printf(p, "slice%d subslice mask %04x\n",
>>> + s, sseu->subslice_mask[s]);
>>> + drm_printf(p, "slice%d subslice per slice: %u\n",
>>> + s, hweight8(sseu->subslice_mask[s]));
>>
>> Cosmetic only but consider condensing this into one line if you don't
>> have a preference to either.
>
> Sure, just conscious about the 80characters.
I wasn't 100% clear here - I meant the kernel log messages, not the
source code. I think there is no point in logging two lines per slice
where one only contains a mask, and second a count.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-12 11:05 ` Tvrtko Ursulin
@ 2018-01-12 11:31 ` Lionel Landwerlin
0 siblings, 0 replies; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-12 11:31 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
On 12/01/18 11:05, Tvrtko Ursulin wrote:
>
> On 12/01/2018 10:58, Lionel Landwerlin wrote:
>> On 12/01/18 10:15, Tvrtko Ursulin wrote:
>
> [snip]
>
>>>
>>>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>>>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>>>> @@ -80,12 +80,17 @@ void intel_device_info_dump_flags(const struct
>>>> intel_device_info *info,
>>>> static void sseu_dump(const struct sseu_dev_info *sseu, struct
>>>> drm_printer *p)
>>>> {
>>>> + int s;
>>>> +
>>>> drm_printf(p, "slice mask: %04x\n", sseu->slice_mask);
>>>> drm_printf(p, "slice total: %u\n", hweight8(sseu->slice_mask));
>>>> drm_printf(p, "subslice total: %u\n",
>>>> sseu_subslice_total(sseu));
>>>> - drm_printf(p, "subslice mask %04x\n", sseu->subslice_mask);
>>>> - drm_printf(p, "subslice per slice: %u\n",
>>>> - hweight8(sseu->subslice_mask));
>>>> + for (s = 0; s < ARRAY_SIZE(sseu->subslice_mask); s++) {
>>>> + drm_printf(p, "slice%d subslice mask %04x\n",
>>>> + s, sseu->subslice_mask[s]);
>>>> + drm_printf(p, "slice%d subslice per slice: %u\n",
>>>> + s, hweight8(sseu->subslice_mask[s]));
>>>
>>> Cosmetic only but consider condensing this into one line if you
>>> don't have a preference to either.
>>
>> Sure, just conscious about the 80characters.
>
> I wasn't 100% clear here - I meant the kernel log messages, not the
> source code. I think there is no point in logging two lines per slice
> where one only contains a mask, and second a count.
>
> Regards,
>
> Tvrtko
>
Ahaha :)
Okay, done.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* ✗ Fi.CI.IGT: warning for drm/i915: expose RCS topology to userspace
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
` (6 preceding siblings ...)
2018-01-12 10:06 ` ✓ Fi.CI.BAT: success for drm/i915: expose RCS topology to userspace Patchwork
@ 2018-01-12 11:49 ` Patchwork
7 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2018-01-12 11:49 UTC (permalink / raw)
To: Lionel Landwerlin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: expose RCS topology to userspace
URL : https://patchwork.freedesktop.org/series/36353/
State : warning
== Summary ==
Test gem_tiled_swapping:
Subgroup non-threaded:
incomplete -> PASS (shard-hsw) fdo#104218 +1
Test kms_cursor_crc:
Subgroup cursor-256x85-random:
notrun -> INCOMPLETE (shard-hsw)
Test gem_eio:
Subgroup in-flight-contexts:
dmesg-warn -> PASS (shard-snb) fdo#104058
Test kms_frontbuffer_tracking:
Subgroup fbc-1p-offscren-pri-shrfb-draw-blt:
pass -> FAIL (shard-snb) fdo#101623
Subgroup fbc-1p-primscrn-pri-indfb-draw-pwrite:
fail -> PASS (shard-snb) fdo#103167
Test kms_flip:
Subgroup wf_vblank-vs-dpms-interruptible:
pass -> DMESG-WARN (shard-hsw) fdo#102614
Test drv_suspend:
Subgroup debugfs-reader:
pass -> SKIP (shard-snb)
Test drv_hangman:
Subgroup error-state-capture-blt:
dmesg-warn -> PASS (shard-snb)
Test gem_wait:
Subgroup write-busy-bsd:
skip -> PASS (shard-snb)
Test gem_pwrite_snooped:
fail -> PASS (shard-snb)
Test perf_pmu:
Subgroup busy-check-all-rcs0:
skip -> PASS (shard-snb)
Test gem_exec_parallel:
Subgroup default-fds:
skip -> PASS (shard-snb)
Test gem_partial_pwrite_pread:
Subgroup write:
skip -> PASS (shard-snb)
Test prime_vgem:
Subgroup fence-wait-blt:
skip -> PASS (shard-snb)
fdo#104218 https://bugs.freedesktop.org/show_bug.cgi?id=104218
fdo#104058 https://bugs.freedesktop.org/show_bug.cgi?id=104058
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
shard-hsw total:2695 pass:1528 dwarn:2 dfail:0 fail:10 skip:1154 time:8868s
shard-snb total:2713 pass:1309 dwarn:1 dfail:0 fail:11 skip:1392 time:7849s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7653/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-12 10:58 ` Lionel Landwerlin
2018-01-12 11:05 ` Tvrtko Ursulin
@ 2018-01-12 12:01 ` Tvrtko Ursulin
2018-01-12 13:53 ` Lionel Landwerlin
1 sibling, 1 reply; 21+ messages in thread
From: Tvrtko Ursulin @ 2018-01-12 12:01 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 12/01/2018 10:58, Lionel Landwerlin wrote:
> On 12/01/18 10:15, Tvrtko Ursulin wrote:
>>
[snip]
>>> +static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
>>> + int slice, int subslice, int eu_group)
>>
>> What is eu_group for? Will it be used at some point?
>
> In case we ever have more than 8 EUs per subslice.
I am thinking if we could hide that from the call sites, to avoid it
being passed as zeros, and to avoid having to write loops in other
patches which reference eu_groups, when it is not immediately obvious
what that means.
Could we for instance have a helper which would clear/set numbered EUs
in sseu_dev_info, and so hide all the implementation details?
sseu_enable_eus(sseu, slice, subslice, start, end);
Then when you have code like:
sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] = ~eu_disabled_mask;
You would write it as:
/* On this slice/subslice mark EUs 0 to N as enabled. */
sseu_enable_eus(sseu, s, ss, 0, fls(~eu_disabled_mask));
Helper would internally know the size of the underlying storage and
dtrt. There would be no need to manually manage eu_groups. In the
initial implementation you could simply GEM_BUG_ON if the EU range does
not fit into the current storage. Later u8 could be turned into u16 or
similar. You also wouldn't have any iteration over eu_groups in this
version.
I think that would be cleaner and easier to extend in the future. Unless
I overlooked some important detail?
Or even simplify it by passing bitmask instead of start/end, and just
have no support for more than 8 EUs in this version? No eu_group etc.
When the need arises to have more, bump the eu_mask type to u16. That
would require you to put back the stride parameter in the uAPI I think.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 5/6] drm/i915: add query uAPI
2018-01-11 19:53 ` [PATCH v2 5/6] drm/i915: add query uAPI Lionel Landwerlin
@ 2018-01-12 12:06 ` Tvrtko Ursulin
0 siblings, 0 replies; 21+ messages in thread
From: Tvrtko Ursulin @ 2018-01-12 12:06 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 11/01/2018 19:53, Lionel Landwerlin wrote:
> There are a number of information that are readable from hardware
> registers and that we would like to make accessible to userspace. One
> particular example is the topology of the execution units (how are
> execution units grouped in subslices and slices and also which ones
> have been fused off for die recovery).
>
> At the moment the GET_PARAM ioctl covers some basic needs, but
> generally is only able to return a single value for each defined
> parameter. This is a bit problematic with topology descriptions which
> are array/maps of available units.
>
> This change introduces a new ioctl that can deal with requests to fill
> structures of potentially variable lengths. The user is expected fill
> a query with length fields set at 0 on the first call, the kernel then
> sets the length fields to the their expected values. A second call to
> the kernel with length fields at their expected values will trigger a
> copy of the data to the pointed memory locations.
>
> The scope of this uAPI is only to provide information to userspace,
> not to allow configuration of the device.
>
> v2: Simplify dispatcher code iteration (Tvrtko)
> Tweak uapi drm_i915_query_item structure (Tvrtko)
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
> drivers/gpu/drm/i915/Makefile | 1 +
> drivers/gpu/drm/i915/i915_drv.c | 1 +
> drivers/gpu/drm/i915/i915_drv.h | 3 +++
> drivers/gpu/drm/i915/i915_query.c | 51 +++++++++++++++++++++++++++++++++++++++
> include/uapi/drm/i915_drm.h | 31 ++++++++++++++++++++++++
> 5 files changed, 87 insertions(+)
> create mode 100644 drivers/gpu/drm/i915/i915_query.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 3bddd8a06806..b0415a3e2d59 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -69,6 +69,7 @@ i915-y += i915_cmd_parser.o \
> i915_gem_timeline.o \
> i915_gem_userptr.o \
> i915_gemfs.o \
> + i915_query.o \
> i915_trace_points.o \
> i915_vma.o \
> intel_breadcrumbs.o \
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 969835d3cbcd..d92e1b7236fc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -2824,6 +2824,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
> DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
> DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
> DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
> + DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
> };
>
> static struct drm_driver driver = {
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a689396d0ff6..de0eb6ce2fcd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3623,6 +3623,9 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
> extern void i915_perf_register(struct drm_i915_private *dev_priv);
> extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>
> +/* i915_query.c */
> +int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
> +
> /* i915_suspend.c */
> extern int i915_save_state(struct drm_i915_private *dev_priv);
> extern int i915_restore_state(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> new file mode 100644
> index 000000000000..5694cfea4553
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -0,0 +1,51 @@
> +/*
> + * Copyright © 2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include "i915_drv.h"
> +#include <uapi/drm/i915_drm.h>
> +
> +int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> +{
> + struct drm_i915_query *args = data;
> + struct drm_i915_query_item __user *user_item_ptr =
> + u64_to_user_ptr(args->items_ptr);
> + u32 i;
> +
> + for (i = 0; i < args->num_items; i++, user_item_ptr++) {
> + struct drm_i915_query_item item;
> +
> + if (copy_from_user(&item, user_item_ptr, sizeof(item)))
> + return -EFAULT;
> +
> + switch (item.query_id) {
> + default:
> + return -EINVAL;
> + }
> +
> + if (copy_to_user(user_item_ptr, &item, sizeof(item)))
> + return -EFAULT;
> + }
> +
> + return 0;
> +}
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 536ee4febd74..39e93f10f2cd 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -318,6 +318,7 @@ typedef struct _drm_i915_sarea {
> #define DRM_I915_PERF_OPEN 0x36
> #define DRM_I915_PERF_ADD_CONFIG 0x37
> #define DRM_I915_PERF_REMOVE_CONFIG 0x38
> +#define DRM_I915_QUERY 0x39
>
> #define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> #define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
> @@ -375,6 +376,7 @@ typedef struct _drm_i915_sarea {
> #define DRM_IOCTL_I915_PERF_OPEN DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
> #define DRM_IOCTL_I915_PERF_ADD_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
> #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
> +#define DRM_IOCTL_I915_QUERY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>
> /* Allow drivers to submit batchbuffers directly to hardware, relying
> * on the security mechanisms provided by hardware.
> @@ -1613,6 +1615,35 @@ struct drm_i915_perf_oa_config {
> __u64 flex_regs_ptr;
> };
>
> +
> +struct drm_i915_query_item {
> + __u64 query_id;
> +
> + /*
> + * When set to zero by userspace, this is filled with the size of the
> + * data to be written at the data_ptr pointer.
> + */
> + __u32 length;
> + __u32 pad;
> +
> + /*
> + * Data will be written at the location pointed by data_ptr when the
> + * value of length matches the length of the data to be written by the
> + * kernel.
> + */
> + __u64 data_ptr;
> +};
> +
> +struct drm_i915_query {
> + __u32 num_items;
> + __u32 _pad;
> +
> + /*
> + * This point to an array of num_items drm_i915_query_item structures.
> + */
> + __u64 items_ptr;
> +};
> +
> #if defined(__cplusplus)
> }
> #endif
>
Looks good to me and I will be able to use it for the engine info stuff
I need.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
You'll need to ping some more people to get more approvals though.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 6/6] drm/i915: expose rcs topology through query uAPI
2018-01-11 19:53 ` [PATCH v2 6/6] drm/i915: expose rcs topology through " Lionel Landwerlin
@ 2018-01-12 12:27 ` Tvrtko Ursulin
2018-01-12 14:04 ` Lionel Landwerlin
0 siblings, 1 reply; 21+ messages in thread
From: Tvrtko Ursulin @ 2018-01-12 12:27 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 11/01/2018 19:53, Lionel Landwerlin wrote:
> With the introduction of asymmetric slices in CNL, we cannot rely on
> the previous SUBSLICE_MASK getparam to tell userspace what subslices
> are available. Here we introduce a more detailed way of querying the
> Gen's GPU topology that doesn't aggregate numbers.
>
> This is essential for monitoring parts of the GPU with the OA unit,
> because counters need to be normalized to the number of
> EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
> not gives us sufficient information.
>
> As a bonus we can draw representations of the GPU :
>
> https://imgur.com/a/vuqpa
>
> v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
> Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
> Add uapi macros to read data from *_info structs (Tvrtko)
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_query.c | 133 ++++++++++++++++++++++++++++++++
> drivers/gpu/drm/i915/intel_query_info.c | 88 +++++++++++++++++++++
> include/uapi/drm/i915_drm.h | 51 ++++++++++++
> 3 files changed, 272 insertions(+)
> create mode 100644 drivers/gpu/drm/i915/intel_query_info.c
>
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> index 5694cfea4553..1d9f5a15323c 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -25,8 +25,128 @@
> #include "i915_drv.h"
> #include <uapi/drm/i915_drm.h>
>
> +static int query_slices_info(struct drm_i915_private *dev_priv,
> + struct drm_i915_query_item *query_item)
> +{
> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
> + struct drm_i915_query_slices_info slices_info;
> + u32 data_length, length;
> +
> + if (sseu->max_slices == 0)
> + return -ENODEV;
> +
> + data_length = sizeof(u8);
sizeof(sseu->slice_mask) ?
> + length = sizeof(slices_info) + data_length;
> +
> + /*
> + * If we ever change the internal slice mask data type, we'll need to
> + * update this function.
> + */
> + BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
> +
> + if (query_item->length == 0) {
> + query_item->length = length;
> + return 0;
> + }
> +
> + if (query_item->length != length)
> + return -EINVAL;
> +
> + memset(&slices_info, 0, sizeof(slices_info));
> + slices_info.max_slices = sseu->max_slices;
> +
> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), &slices_info,
> + sizeof(slices_info)))
> + return -EFAULT;
> +
> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
> + offsetof(struct drm_i915_query_slices_info, data)),
> + &sseu->slice_mask, data_length))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +static int query_subslices_info(struct drm_i915_private *dev_priv,
> + struct drm_i915_query_item *query_item)
> +{
> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
> + struct drm_i915_query_subslices_info subslices_info;
> + u32 data_length, length;
> +
> + if (sseu->max_slices == 0)
> + return -ENODEV;
> +
> + memset(&subslices_info, 0, sizeof(subslices_info));
> + subslices_info.max_slices = sseu->max_slices;
> + subslices_info.max_subslices = sseu->max_subslices;
> +
> + data_length = subslices_info.max_slices *
> + DIV_ROUND_UP(subslices_info.max_subslices, BITS_PER_BYTE);
s/BITS_PER_BYTE/sizeof(sseu->subslice_mask[0]) * BITS_PER_BYTE/ ?
> + length = sizeof(subslices_info) + data_length;
> +
> + if (query_item->length == 0) {
> + query_item->length = length;
> + return 0;
> + }
> +
> + if (query_item->length != length)
> + return -EINVAL;
> +
> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), &subslices_info,
> + sizeof(subslices_info)))
> + return -EFAULT;
> +
> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
> + offsetof(struct drm_i915_query_subslices_info, data)),
> + sseu->subslice_mask, data_length))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +static int query_eus_info(struct drm_i915_private *dev_priv,
> + struct drm_i915_query_item *query_item)
> +{
> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
> + struct drm_i915_query_eus_info eus_info;
> + u32 data_length, length;
> +
> + if (sseu->max_slices == 0)
> + return -ENODEV;
> +
> + memset(&eus_info, 0, sizeof(eus_info));
> + eus_info.max_slices = sseu->max_slices;
> + eus_info.max_subslices = sseu->max_subslices;
> + eus_info.max_eus_per_subslice = sseu->max_eus_per_subslice;
> +
> + data_length = eus_info.max_slices * eus_info.max_subslices *
> + DIV_ROUND_UP(eus_info.max_eus_per_subslice, BITS_PER_BYTE);
> + length = sizeof(eus_info) + data_length;
> +
> + if (query_item->length == 0) {
> + query_item->length = length;
> + return 0;
> + }
> +
> + if (query_item->length != length)
> + return -EINVAL;
> +
> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), &eus_info,
> + sizeof(eus_info)))
> + return -EFAULT;
> +
> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
> + offsetof(struct drm_i915_query_eus_info, data)),
> + sseu->eu_mask, data_length))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> {
> + struct drm_i915_private *dev_priv = to_i915(dev);
> struct drm_i915_query *args = data;
> struct drm_i915_query_item __user *user_item_ptr =
> u64_to_user_ptr(args->items_ptr);
> @@ -34,15 +154,28 @@ int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>
> for (i = 0; i < args->num_items; i++, user_item_ptr++) {
> struct drm_i915_query_item item;
> + int ret;
>
> if (copy_from_user(&item, user_item_ptr, sizeof(item)))
> return -EFAULT;
>
> switch (item.query_id) {
> + case DRM_I915_QUERY_ID_SLICES_INFO:
> + ret = query_slices_info(dev_priv, &item);
> + break;
> + case DRM_I915_QUERY_ID_SUBSLICES_INFO:
> + ret = query_subslices_info(dev_priv, &item);
> + break;
> + case DRM_I915_QUERY_ID_EUS_INFO:
> + ret = query_eus_info(dev_priv, &item);
> + break;
> default:
> return -EINVAL;
> }
>
> + if (ret)
> + return ret;
> +
> if (copy_to_user(user_item_ptr, &item, sizeof(item)))
> return -EFAULT;
> }
> diff --git a/drivers/gpu/drm/i915/intel_query_info.c b/drivers/gpu/drm/i915/intel_query_info.c
> new file mode 100644
> index 000000000000..79b03be9f51a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_query_info.c
Ooops!
> @@ -0,0 +1,88 @@
> +/*
> + * Copyright © 2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include "i915_drv.h"
> +#include <uapi/drm/i915_drm.h>
> +
> +static int query_info_rcs_topology(struct drm_i915_private *dev_priv,
> + struct drm_i915_query_info *args)
> +{
> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
> + struct drm_i915_rcs_topology_info __user *user_topology =
> + u64_to_user_ptr(args->info_ptr);
> + struct drm_i915_rcs_topology_info topology;
> + u32 data_size, total_size;
> + const u8 *data = NULL;
> + int ret;
> +
> + /* Not supported on gen < 8. */
> + if (sseu->max_slices == 0)
> + return -ENODEV;
> +
> + switch (args->query_params[0]) {
> + case I915_RCS_TOPOLOGY_SLICE:
> + topology.params[0] = sseu->max_slices;
> + data_size = sizeof(sseu->slice_mask);
> + data = &sseu->slice_mask;
> + break;
> +
> + case I915_RCS_TOPOLOGY_SUBSLICE:
> + topology.params[0] = sseu->max_slices;
> + topology.params[1] = ALIGN(sseu->max_subslices, 8) / 8;
> + data_size = sseu->max_slices * topology.params[1];
> + data = sseu->subslices_mask;
> + break;
> +
> + case I915_RCS_TOPOLOGY_EU:
> + topology.params[2] = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
> + topology.params[1] = sseu->max_subslices * topology.params[2];
> + topology.params[0] = sseu->max_slices;
> + data_size = sseu->max_slices * topology.params[1];
> + data = sseu->eu_mask;
> + break;
> +
> + default:
> + return -EINVAL;
> + }
> +
> + total_size = sizeof(topology) + data_size;
> +
> + if (args->info_ptr_len == 0) {
> + args->info_ptr_len = total_size;
> + return 0;
> + }
> +
> + if (args->info_ptr_len < total_size)
> + return -EINVAL;
> +
> + ret = copy_to_user(user_topology, &topology, sizeof(topology));
> + if (ret)
> + return -EFAULT;
> +
> + ret = copy_to_user(user_topology + 1, data, data_size);
> + if (ret)
> + return -EFAULT;
> +
> + return 0;
> +}
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 39e93f10f2cd..5a8e2f7d5dc3 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1618,6 +1618,9 @@ struct drm_i915_perf_oa_config {
>
> struct drm_i915_query_item {
> __u64 query_id;
> +#define DRM_I915_QUERY_ID_SLICES_INFO 0x01
> +#define DRM_I915_QUERY_ID_SUBSLICES_INFO 0x02
> +#define DRM_I915_QUERY_ID_EUS_INFO 0x03
>
> /*
> * When set to zero by userspace, this is filled with the size of the
> @@ -1644,6 +1647,54 @@ struct drm_i915_query {
> __u64 items_ptr;
> };
>
> +/* Data written by the kernel with query DRM_I915_QUERY_ID_SLICES_INFO :
> + *
> + * data: each bit indicates whether a slice is available (1) or fused off (0).
> + * Use DRM_I915_QUERY_SLICE_AVAILABLE() to query a given slice's
> + * availability.
> + */
> +struct drm_i915_query_slices_info {
> + __u32 max_slices;
> +
> +#define DRM_I915_QUERY_SLICE_AVAILABLE(info, slice) \
> + (((info)->data[(slice) / 8] >> ((slice) % 8)) & 1)
Alternatively, (data[slice / 8] & BIT(slice % 8)), but it is probably a
personal preference as what is more readable.
> + __u8 data[];
> +};
> +
> +/* Data written by the kernel with query DRM_I915_QUERY_ID_SUBSLICES_INFO :
> + *
> + * data: each bit indicates whether a subslice is available (1) or fused off
> + * (0). Use DRM_I915_QUERY_SUBSLICE_AVAILABLE() to query a given
> + * subslice's availability.
> + */
> +struct drm_i915_query_subslices_info {
> + __u32 max_slices;
> + __u32 max_subslices;
> +
> +#define DRM_I915_QUERY_SUBSLICE_AVAILABLE(info, slice, subslice) \
> + (((info)->data[(slice) * ALIGN((info)->max_subslices, 8) / 8 + \
> + (subslice) / 8] >> ((subslice) % 8)) & 1)
> + __u8 data[];
> +};
> +
> +/* Data written by the kernel with query DRM_I915_QUERY_ID_EUS_INFO :
> + *
> + * data: Each bit indicates whether a subslice is available (1) or fused off
> + * (0). Use DRM_I915_QUERY_EU_AVAILABLE() to query a given EU's
> + * availability.
> + */
> +struct drm_i915_query_eus_info {
> + __u32 max_slices;
> + __u32 max_subslices;
> + __u32 max_eus_per_subslice;
> +
> +#define DRM_I915_QUERY_EU_AVAILABLE(info, slice, subslice, eu) \
> + (((info)->data[(slice) * ALIGN((info)->max_eus_per_subslice, 8) / 8 * (info)->max_subslices + \
> + (subslice) * ALIGN((info)->max_eus_per_subslice, 8) / 8 + \
> + (eu) / 8] >> ((eu) % 8)) & 1)
> + __u8 data[];
> +};
To many hardcoded eights in all of the above. But not sure what to
suggests. If someone says to lose the zero-length array or not it will
be slightly different.
Do you want to prefix the macros with !! so they return 0/1?
Is the ALIGN(x, 8) / 8 pattern equal to DIV_ROUND_UP(x, 8)? I think I
commented about that already somewhere.
> +
> #if defined(__cplusplus)
> }
> #endif
>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-12 12:01 ` Tvrtko Ursulin
@ 2018-01-12 13:53 ` Lionel Landwerlin
2018-01-12 17:37 ` Tvrtko Ursulin
0 siblings, 1 reply; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-12 13:53 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
On 12/01/18 12:01, Tvrtko Ursulin wrote:
>
> On 12/01/2018 10:58, Lionel Landwerlin wrote:
>> On 12/01/18 10:15, Tvrtko Ursulin wrote:
>>>
>
> [snip]
>
>>>> +static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
>>>> + int slice, int subslice, int eu_group)
>>>
>>> What is eu_group for? Will it be used at some point?
>>
>> In case we ever have more than 8 EUs per subslice.
>
> I am thinking if we could hide that from the call sites, to avoid it
> being passed as zeros, and to avoid having to write loops in other
> patches which reference eu_groups, when it is not immediately obvious
> what that means.
>
> Could we for instance have a helper which would clear/set numbered EUs
> in sseu_dev_info, and so hide all the implementation details?
>
> sseu_enable_eus(sseu, slice, subslice, start, end);
>
> Then when you have code like:
>
> sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] = ~eu_disabled_mask;
>
> You would write it as:
>
> /* On this slice/subslice mark EUs 0 to N as enabled. */
> sseu_enable_eus(sseu, s, ss, 0, fls(~eu_disabled_mask));
Hmm... I don't think that works if you have gaps, right?
Like a BXT 2x6 where a row of EUs has been fused off. It would be
something like 0b01110111 or 0b10111011.
>
> Helper would internally know the size of the underlying storage and
> dtrt. There would be no need to manually manage eu_groups. In the
> initial implementation you could simply GEM_BUG_ON if the EU range
> does not fit into the current storage. Later u8 could be turned into
> u16 or similar. You also wouldn't have any iteration over eu_groups in
> this version.
>
> I think that would be cleaner and easier to extend in the future.
> Unless I overlooked some important detail?
>
> Or even simplify it by passing bitmask instead of start/end, and just
> have no support for more than 8 EUs in this version? No eu_group etc.
> When the need arises to have more, bump the eu_mask type to u16. That
> would require you to put back the stride parameter in the uAPI I think.
I'm not really a fan of having the data field in userspace be
reinterpreted (as u16 or u32) based on one of the other field.
It might be easier on the kernel side, but complicates userspace.
I would prefer to stick to u8 and have everybody think of
slice/subslice/eus availability as array of u8 bit fields which you
might need to iterate more than one if there are more than 8 elements.
>
> Regards,
>
> Tvrtko
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 6/6] drm/i915: expose rcs topology through query uAPI
2018-01-12 12:27 ` Tvrtko Ursulin
@ 2018-01-12 14:04 ` Lionel Landwerlin
0 siblings, 0 replies; 21+ messages in thread
From: Lionel Landwerlin @ 2018-01-12 14:04 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
On 12/01/18 12:27, Tvrtko Ursulin wrote:
>
> On 11/01/2018 19:53, Lionel Landwerlin wrote:
>> With the introduction of asymmetric slices in CNL, we cannot rely on
>> the previous SUBSLICE_MASK getparam to tell userspace what subslices
>> are available. Here we introduce a more detailed way of querying the
>> Gen's GPU topology that doesn't aggregate numbers.
>>
>> This is essential for monitoring parts of the GPU with the OA unit,
>> because counters need to be normalized to the number of
>> EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
>> not gives us sufficient information.
>>
>> As a bonus we can draw representations of the GPU :
>>
>> https://imgur.com/a/vuqpa
>>
>> v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
>> Report max_slice/subslice/eus_per_subslice rather than strides
>> (Tvrtko)
>> Add uapi macros to read data from *_info structs (Tvrtko)
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_query.c | 133
>> ++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/i915/intel_query_info.c | 88 +++++++++++++++++++++
>> include/uapi/drm/i915_drm.h | 51 ++++++++++++
>> 3 files changed, 272 insertions(+)
>> create mode 100644 drivers/gpu/drm/i915/intel_query_info.c
>>
>> diff --git a/drivers/gpu/drm/i915/i915_query.c
>> b/drivers/gpu/drm/i915/i915_query.c
>> index 5694cfea4553..1d9f5a15323c 100644
>> --- a/drivers/gpu/drm/i915/i915_query.c
>> +++ b/drivers/gpu/drm/i915/i915_query.c
>> @@ -25,8 +25,128 @@
>> #include "i915_drv.h"
>> #include <uapi/drm/i915_drm.h>
>> +static int query_slices_info(struct drm_i915_private *dev_priv,
>> + struct drm_i915_query_item *query_item)
>> +{
>> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
>> + struct drm_i915_query_slices_info slices_info;
>> + u32 data_length, length;
>> +
>> + if (sseu->max_slices == 0)
>> + return -ENODEV;
>> +
>> + data_length = sizeof(u8);
>
> sizeof(sseu->slice_mask) ?
Sure.
>
>> + length = sizeof(slices_info) + data_length;
>> +
>> + /*
>> + * If we ever change the internal slice mask data type, we'll
>> need to
>> + * update this function.
>> + */
>> + BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
>> +
>> + if (query_item->length == 0) {
>> + query_item->length = length;
>> + return 0;
>> + }
>> +
>> + if (query_item->length != length)
>> + return -EINVAL;
>> +
>> + memset(&slices_info, 0, sizeof(slices_info));
>> + slices_info.max_slices = sseu->max_slices;
>> +
>> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr),
>> &slices_info,
>> + sizeof(slices_info)))
>> + return -EFAULT;
>> +
>> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
>> + offsetof(struct drm_i915_query_slices_info,
>> data)),
>> + &sseu->slice_mask, data_length))
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> +
>> +static int query_subslices_info(struct drm_i915_private *dev_priv,
>> + struct drm_i915_query_item *query_item)
>> +{
>> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
>> + struct drm_i915_query_subslices_info subslices_info;
>> + u32 data_length, length;
>> +
>> + if (sseu->max_slices == 0)
>> + return -ENODEV;
>> +
>> + memset(&subslices_info, 0, sizeof(subslices_info));
>> + subslices_info.max_slices = sseu->max_slices;
>> + subslices_info.max_subslices = sseu->max_subslices;
>> +
>> + data_length = subslices_info.max_slices *
>> + DIV_ROUND_UP(subslices_info.max_subslices, BITS_PER_BYTE);
>
> s/BITS_PER_BYTE/sizeof(sseu->subslice_mask[0]) * BITS_PER_BYTE/ ?
Okay.
>
>> + length = sizeof(subslices_info) + data_length;
>> +
>> + if (query_item->length == 0) {
>> + query_item->length = length;
>> + return 0;
>> + }
>> +
>> + if (query_item->length != length)
>> + return -EINVAL;
>> +
>> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr),
>> &subslices_info,
>> + sizeof(subslices_info)))
>> + return -EFAULT;
>> +
>> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
>> + offsetof(struct drm_i915_query_subslices_info,
>> data)),
>> + sseu->subslice_mask, data_length))
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> +
>> +static int query_eus_info(struct drm_i915_private *dev_priv,
>> + struct drm_i915_query_item *query_item)
>> +{
>> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
>> + struct drm_i915_query_eus_info eus_info;
>> + u32 data_length, length;
>> +
>> + if (sseu->max_slices == 0)
>> + return -ENODEV;
>> +
>> + memset(&eus_info, 0, sizeof(eus_info));
>> + eus_info.max_slices = sseu->max_slices;
>> + eus_info.max_subslices = sseu->max_subslices;
>> + eus_info.max_eus_per_subslice = sseu->max_eus_per_subslice;
>> +
>> + data_length = eus_info.max_slices * eus_info.max_subslices *
>> + DIV_ROUND_UP(eus_info.max_eus_per_subslice, BITS_PER_BYTE);
>> + length = sizeof(eus_info) + data_length;
>> +
>> + if (query_item->length == 0) {
>> + query_item->length = length;
>> + return 0;
>> + }
>> +
>> + if (query_item->length != length)
>> + return -EINVAL;
>> +
>> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), &eus_info,
>> + sizeof(eus_info)))
>> + return -EFAULT;
>> +
>> + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
>> + offsetof(struct drm_i915_query_eus_info, data)),
>> + sseu->eu_mask, data_length))
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> +
>> int i915_query_ioctl(struct drm_device *dev, void *data, struct
>> drm_file *file)
>> {
>> + struct drm_i915_private *dev_priv = to_i915(dev);
>> struct drm_i915_query *args = data;
>> struct drm_i915_query_item __user *user_item_ptr =
>> u64_to_user_ptr(args->items_ptr);
>> @@ -34,15 +154,28 @@ int i915_query_ioctl(struct drm_device *dev,
>> void *data, struct drm_file *file)
>> for (i = 0; i < args->num_items; i++, user_item_ptr++) {
>> struct drm_i915_query_item item;
>> + int ret;
>> if (copy_from_user(&item, user_item_ptr, sizeof(item)))
>> return -EFAULT;
>> switch (item.query_id) {
>> + case DRM_I915_QUERY_ID_SLICES_INFO:
>> + ret = query_slices_info(dev_priv, &item);
>> + break;
>> + case DRM_I915_QUERY_ID_SUBSLICES_INFO:
>> + ret = query_subslices_info(dev_priv, &item);
>> + break;
>> + case DRM_I915_QUERY_ID_EUS_INFO:
>> + ret = query_eus_info(dev_priv, &item);
>> + break;
>> default:
>> return -EINVAL;
>> }
>> + if (ret)
>> + return ret;
>> +
>> if (copy_to_user(user_item_ptr, &item, sizeof(item)))
>> return -EFAULT;
>> }
>> diff --git a/drivers/gpu/drm/i915/intel_query_info.c
>> b/drivers/gpu/drm/i915/intel_query_info.c
>> new file mode 100644
>> index 000000000000..79b03be9f51a
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/intel_query_info.c
>
> Ooops!
Again?? WTH??
>
>> @@ -0,0 +1,88 @@
>> +/*
>> + * Copyright © 2017 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person
>> obtaining a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom
>> the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including
>> the next
>> + * paragraph) shall be included in all copies or substantial
>> portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
>> EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
>> OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#include "i915_drv.h"
>> +#include <uapi/drm/i915_drm.h>
>> +
>> +static int query_info_rcs_topology(struct drm_i915_private *dev_priv,
>> + struct drm_i915_query_info *args)
>> +{
>> + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
>> + struct drm_i915_rcs_topology_info __user *user_topology =
>> + u64_to_user_ptr(args->info_ptr);
>> + struct drm_i915_rcs_topology_info topology;
>> + u32 data_size, total_size;
>> + const u8 *data = NULL;
>> + int ret;
>> +
>> + /* Not supported on gen < 8. */
>> + if (sseu->max_slices == 0)
>> + return -ENODEV;
>> +
>> + switch (args->query_params[0]) {
>> + case I915_RCS_TOPOLOGY_SLICE:
>> + topology.params[0] = sseu->max_slices;
>> + data_size = sizeof(sseu->slice_mask);
>> + data = &sseu->slice_mask;
>> + break;
>> +
>> + case I915_RCS_TOPOLOGY_SUBSLICE:
>> + topology.params[0] = sseu->max_slices;
>> + topology.params[1] = ALIGN(sseu->max_subslices, 8) / 8;
>> + data_size = sseu->max_slices * topology.params[1];
>> + data = sseu->subslices_mask;
>> + break;
>> +
>> + case I915_RCS_TOPOLOGY_EU:
>> + topology.params[2] = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
>> + topology.params[1] = sseu->max_subslices * topology.params[2];
>> + topology.params[0] = sseu->max_slices;
>> + data_size = sseu->max_slices * topology.params[1];
>> + data = sseu->eu_mask;
>> + break;
>> +
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + total_size = sizeof(topology) + data_size;
>> +
>> + if (args->info_ptr_len == 0) {
>> + args->info_ptr_len = total_size;
>> + return 0;
>> + }
>> +
>> + if (args->info_ptr_len < total_size)
>> + return -EINVAL;
>> +
>> + ret = copy_to_user(user_topology, &topology, sizeof(topology));
>> + if (ret)
>> + return -EFAULT;
>> +
>> + ret = copy_to_user(user_topology + 1, data, data_size);
>> + if (ret)
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 39e93f10f2cd..5a8e2f7d5dc3 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -1618,6 +1618,9 @@ struct drm_i915_perf_oa_config {
>> struct drm_i915_query_item {
>> __u64 query_id;
>> +#define DRM_I915_QUERY_ID_SLICES_INFO 0x01
>> +#define DRM_I915_QUERY_ID_SUBSLICES_INFO 0x02
>> +#define DRM_I915_QUERY_ID_EUS_INFO 0x03
>> /*
>> * When set to zero by userspace, this is filled with the size
>> of the
>> @@ -1644,6 +1647,54 @@ struct drm_i915_query {
>> __u64 items_ptr;
>> };
>> +/* Data written by the kernel with query
>> DRM_I915_QUERY_ID_SLICES_INFO :
>> + *
>> + * data: each bit indicates whether a slice is available (1) or
>> fused off (0).
>> + * Use DRM_I915_QUERY_SLICE_AVAILABLE() to query a given slice's
>> + * availability.
>> + */
>> +struct drm_i915_query_slices_info {
>> + __u32 max_slices;
>> +
>> +#define DRM_I915_QUERY_SLICE_AVAILABLE(info, slice) \
>> + (((info)->data[(slice) / 8] >> ((slice) % 8)) & 1)
>
> Alternatively, (data[slice / 8] & BIT(slice % 8)), but it is probably
> a personal preference as what is more readable.
Nice, thanks!
>
>> + __u8 data[];
>> +};
>> +
>> +/* Data written by the kernel with query
>> DRM_I915_QUERY_ID_SUBSLICES_INFO :
>> + *
>> + * data: each bit indicates whether a subslice is available (1) or
>> fused off
>> + * (0). Use DRM_I915_QUERY_SUBSLICE_AVAILABLE() to query a given
>> + * subslice's availability.
>> + */
>> +struct drm_i915_query_subslices_info {
>> + __u32 max_slices;
>> + __u32 max_subslices;
>> +
>> +#define DRM_I915_QUERY_SUBSLICE_AVAILABLE(info, slice, subslice) \
>> + (((info)->data[(slice) * ALIGN((info)->max_subslices, 8) / 8 + \
>> + (subslice) / 8] >> ((subslice) % 8)) & 1)
>> + __u8 data[];
>> +};
>> +
>> +/* Data written by the kernel with query DRM_I915_QUERY_ID_EUS_INFO :
>> + *
>> + * data: Each bit indicates whether a subslice is available (1) or
>> fused off
>> + * (0). Use DRM_I915_QUERY_EU_AVAILABLE() to query a given EU's
>> + * availability.
>> + */
>> +struct drm_i915_query_eus_info {
>> + __u32 max_slices;
>> + __u32 max_subslices;
>> + __u32 max_eus_per_subslice;
>> +
>> +#define DRM_I915_QUERY_EU_AVAILABLE(info, slice, subslice, eu) \
>> + (((info)->data[(slice) * ALIGN((info)->max_eus_per_subslice, 8)
>> / 8 * (info)->max_subslices + \
>> + (subslice) * ALIGN((info)->max_eus_per_subslice, 8) /
>> 8 + \
>> + (eu) / 8] >> ((eu) % 8)) & 1)
>> + __u8 data[];
>> +};
>
> To many hardcoded eights in all of the above. But not sure what to
> suggests. If someone says to lose the zero-length array or not it will
> be slightly different.
>
> Do you want to prefix the macros with !! so they return 0/1?
Sure :)
>
> Is the ALIGN(x, 8) / 8 pattern equal to DIV_ROUND_UP(x, 8)? I think I
> commented about that already somewhere.
I avoided DIV_ROUND_UP() because it doesn't look like it's used by
anybody else in uapi.
I think they're equivalent.
>
>> +
>> #if defined(__cplusplus)
>> }
>> #endif
>>
>
> Regards,
>
> Tvrtko
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/6] drm/i915: store all subslice masks
2018-01-12 13:53 ` Lionel Landwerlin
@ 2018-01-12 17:37 ` Tvrtko Ursulin
0 siblings, 0 replies; 21+ messages in thread
From: Tvrtko Ursulin @ 2018-01-12 17:37 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 12/01/2018 13:53, Lionel Landwerlin wrote:
> On 12/01/18 12:01, Tvrtko Ursulin wrote:
>>
>> On 12/01/2018 10:58, Lionel Landwerlin wrote:
>>> On 12/01/18 10:15, Tvrtko Ursulin wrote:
>>>>
>>
>> [snip]
>>
>>>>> +static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
>>>>> + int slice, int subslice, int eu_group)
>>>>
>>>> What is eu_group for? Will it be used at some point?
>>>
>>> In case we ever have more than 8 EUs per subslice.
>>
>> I am thinking if we could hide that from the call sites, to avoid it
>> being passed as zeros, and to avoid having to write loops in other
>> patches which reference eu_groups, when it is not immediately obvious
>> what that means.
>>
>> Could we for instance have a helper which would clear/set numbered EUs
>> in sseu_dev_info, and so hide all the implementation details?
>>
>> sseu_enable_eus(sseu, slice, subslice, start, end);
>>
>> Then when you have code like:
>>
>> sseu->eu_mask[sseu_eu_idx(sseu, s, ss, 0)] = ~eu_disabled_mask;
>>
>> You would write it as:
>>
>> /* On this slice/subslice mark EUs 0 to N as enabled. */
>> sseu_enable_eus(sseu, s, ss, 0, fls(~eu_disabled_mask));
>
> Hmm... I don't think that works if you have gaps, right?
> Like a BXT 2x6 where a row of EUs has been fused off. It would be
> something like 0b01110111 or 0b10111011.
Oops, you are right. I just don't like having this eu_group field which
is internal data storage detail, leaked out, is always zero, and even
unused today. So I am trying to come up with a nicer api.
What about sseu_set_eus(sseu, slice, subslice, mask)?
Then you can replace call sites like:
sseu->eu_mask[sseu_eu_idx(sseu, 0, 0, 0)] =
~((fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >>
CHV_FGT_EU_DIS_SS0_R0_SHIFT);
sseu->eu_mask[sseu_eu_idx(sseu, 0, 0, 0)] |=
~(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
With:
mask = (fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >>
CHV_FGT_EU_DIS_SS0_R0_SHIFT;
mask |= ((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
sseu_set_eus(sseu, 0, 0, ~mask);
sseu_set_eus can then be the only place which understand the storage
format. You can even later extend the mask size past u8 and it can still
do the magic inside it.
>>
>> Helper would internally know the size of the underlying storage and
>> dtrt. There would be no need to manually manage eu_groups. In the
>> initial implementation you could simply GEM_BUG_ON if the EU range
>> does not fit into the current storage. Later u8 could be turned into
>> u16 or similar. You also wouldn't have any iteration over eu_groups in
>> this version.
>>
>> I think that would be cleaner and easier to extend in the future.
>> Unless I overlooked some important detail?
>>
>> Or even simplify it by passing bitmask instead of start/end, and just
>> have no support for more than 8 EUs in this version? No eu_group etc.
>> When the need arises to have more, bump the eu_mask type to u16. That
>> would require you to put back the stride parameter in the uAPI I think.
>
> I'm not really a fan of having the data field in userspace be
> reinterpreted (as u16 or u32) based on one of the other field.
> It might be easier on the kernel side, but complicates userspace.
>
> I would prefer to stick to u8 and have everybody think of
> slice/subslice/eus availability as array of u8 bit fields which you
> might need to iterate more than one if there are more than 8 elements.
You are thinking about bit-ordering? Yeah, if it is easier keep the
formats the same. Even though strictly speaking internal and ABI data
representation do not need to be the same. As said above, I am just
looking for a solution which hides the eu_group parameter from the
callers and also is a bit shorter to read.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2018-01-12 17:37 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-11 19:53 [PATCH v2 0/6] drm/i915: expose RCS topology to userspace Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 1/6] drm/i915: store all subslice masks Lionel Landwerlin
2018-01-12 10:15 ` Tvrtko Ursulin
2018-01-12 10:58 ` Lionel Landwerlin
2018-01-12 11:05 ` Tvrtko Ursulin
2018-01-12 11:31 ` Lionel Landwerlin
2018-01-12 12:01 ` Tvrtko Ursulin
2018-01-12 13:53 ` Lionel Landwerlin
2018-01-12 17:37 ` Tvrtko Ursulin
2018-01-11 19:53 ` [PATCH v2 2/6] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 3/6] drm/i915/debugfs: add rcs topology entry Lionel Landwerlin
2018-01-12 10:21 ` Tvrtko Ursulin
2018-01-12 11:02 ` Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 4/6] drm/i915: add rcs topology to error state Lionel Landwerlin
2018-01-11 19:53 ` [PATCH v2 5/6] drm/i915: add query uAPI Lionel Landwerlin
2018-01-12 12:06 ` Tvrtko Ursulin
2018-01-11 19:53 ` [PATCH v2 6/6] drm/i915: expose rcs topology through " Lionel Landwerlin
2018-01-12 12:27 ` Tvrtko Ursulin
2018-01-12 14:04 ` Lionel Landwerlin
2018-01-12 10:06 ` ✓ Fi.CI.BAT: success for drm/i915: expose RCS topology to userspace Patchwork
2018-01-12 11:49 ` ✗ Fi.CI.IGT: warning " Patchwork
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox