* [PATCH 0/3] Fix up cmd parser OACONTROL handling + refactorings
@ 2014-03-27 18:43 bradley.d.volkin
2014-03-27 18:43 ` [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted bradley.d.volkin
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: bradley.d.volkin @ 2014-03-27 18:43 UTC (permalink / raw)
To: intel-gfx
From: Brad Volkin <bradley.d.volkin@intel.com>
Patches 1 and 2 do some cleanups suggested as part of the review process.
Patch 3 continues the OACONTROL handling fixes from the other day.
I think patches 1 and 2 are valuable on their own. I think the need/benefit
for the tracking provided by patch 3 is somewhat unclear. Per Ken:
"I don't really buy the snooping problem, though...just because I leave
OACONTROL set doesn't mean I'll get useful data. Another context might
clobber it, and empirically the numbers seem to reset across RC6 anyway.
So in actuality, they're likely to get bogus data.
Even if they did somehow miraculously get decent values, it basically
gives information akin to 'top', which is unprivileged on every system
I've ever used."
That argument makes sense to me, but I've gone ahead and written the patch in
the event that we do want it.
Brad Volkin (3):
drm/i915: BUG_ON() when cmd/reg tables are not sorted
drm/i915: Refactor cmd parser checks into a function
drm/i915: Track OACONTROL register enable/disable during parsing
drivers/gpu/drm/i915/i915_cmd_parser.c | 198 +++++++++++++++++++--------------
1 file changed, 117 insertions(+), 81 deletions(-)
--
1.8.3.2
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted
2014-03-27 18:43 [PATCH 0/3] Fix up cmd parser OACONTROL handling + refactorings bradley.d.volkin
@ 2014-03-27 18:43 ` bradley.d.volkin
2014-03-27 21:47 ` Kenneth Graunke
2014-04-02 9:11 ` Jani Nikula
2014-03-27 18:43 ` [PATCH 2/3] drm/i915: Refactor cmd parser checks into a function bradley.d.volkin
2014-03-27 18:43 ` [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing bradley.d.volkin
2 siblings, 2 replies; 19+ messages in thread
From: bradley.d.volkin @ 2014-03-27 18:43 UTC (permalink / raw)
To: intel-gfx
From: Brad Volkin <bradley.d.volkin@intel.com>
As suggested during review, this makes it much more obvious
when the tables are not sorted.
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
---
drivers/gpu/drm/i915/i915_cmd_parser.c | 31 +++++++++++++++++++++----------
1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 788bd96..8a93db3 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -493,12 +493,13 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
return 0;
}
-static void validate_cmds_sorted(struct intel_ring_buffer *ring)
+static bool validate_cmds_sorted(struct intel_ring_buffer *ring)
{
int i;
+ bool ret = true;
if (!ring->cmd_tables || ring->cmd_table_count == 0)
- return;
+ return true;
for (i = 0; i < ring->cmd_table_count; i++) {
const struct drm_i915_cmd_table *table = &ring->cmd_tables[i];
@@ -510,35 +511,45 @@ static void validate_cmds_sorted(struct intel_ring_buffer *ring)
&table->table[i];
u32 curr = desc->cmd.value & desc->cmd.mask;
- if (curr < previous)
+ if (curr < previous) {
DRM_ERROR("CMD: table not sorted ring=%d table=%d entry=%d cmd=0x%08X prev=0x%08X\n",
ring->id, i, j, curr, previous);
+ ret = false;
+ }
previous = curr;
}
}
+
+ return ret;
}
-static void check_sorted(int ring_id, const u32 *reg_table, int reg_count)
+static bool check_sorted(int ring_id, const u32 *reg_table, int reg_count)
{
int i;
u32 previous = 0;
+ bool ret = true;
for (i = 0; i < reg_count; i++) {
u32 curr = reg_table[i];
- if (curr < previous)
+ if (curr < previous) {
DRM_ERROR("CMD: table not sorted ring=%d entry=%d reg=0x%08X prev=0x%08X\n",
ring_id, i, curr, previous);
+ ret = false;
+ }
previous = curr;
}
+
+ return ret;
}
-static void validate_regs_sorted(struct intel_ring_buffer *ring)
+static bool validate_regs_sorted(struct intel_ring_buffer *ring)
{
- check_sorted(ring->id, ring->reg_table, ring->reg_count);
- check_sorted(ring->id, ring->master_reg_table, ring->master_reg_count);
+ return check_sorted(ring->id, ring->reg_table, ring->reg_count) &&
+ check_sorted(ring->id, ring->master_reg_table,
+ ring->master_reg_count);
}
/**
@@ -617,8 +628,8 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
BUG();
}
- validate_cmds_sorted(ring);
- validate_regs_sorted(ring);
+ BUG_ON(!validate_cmds_sorted(ring));
+ BUG_ON(!validate_regs_sorted(ring));
}
static const struct drm_i915_cmd_descriptor*
--
1.8.3.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/3] drm/i915: Refactor cmd parser checks into a function
2014-03-27 18:43 [PATCH 0/3] Fix up cmd parser OACONTROL handling + refactorings bradley.d.volkin
2014-03-27 18:43 ` [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted bradley.d.volkin
@ 2014-03-27 18:43 ` bradley.d.volkin
2014-03-27 21:49 ` Kenneth Graunke
2014-03-27 18:43 ` [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing bradley.d.volkin
2 siblings, 1 reply; 19+ messages in thread
From: bradley.d.volkin @ 2014-03-27 18:43 UTC (permalink / raw)
To: intel-gfx
From: Brad Volkin <bradley.d.volkin@intel.com>
This brings the code a little more in line with kernel coding style.
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
---
drivers/gpu/drm/i915/i915_cmd_parser.c | 136 +++++++++++++++++----------------
1 file changed, 71 insertions(+), 65 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 8a93db3..2eb2aca 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -758,6 +758,76 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
return (i915.enable_cmd_parser == 1);
}
+static bool check_cmd(const struct intel_ring_buffer *ring,
+ const struct drm_i915_cmd_descriptor *desc,
+ const u32 *cmd,
+ const bool is_master)
+{
+ if (desc->flags & CMD_DESC_REJECT) {
+ DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
+ return false;
+ }
+
+ if ((desc->flags & CMD_DESC_MASTER) && !is_master) {
+ DRM_DEBUG_DRIVER("CMD: Rejected master-only command: 0x%08X\n",
+ *cmd);
+ return false;
+ }
+
+ if (desc->flags & CMD_DESC_REGISTER) {
+ u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
+
+ if (!valid_reg(ring->reg_table,
+ ring->reg_count, reg_addr)) {
+ if (!is_master ||
+ !valid_reg(ring->master_reg_table,
+ ring->master_reg_count,
+ reg_addr)) {
+ DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (ring=%d)\n",
+ reg_addr,
+ *cmd,
+ ring->id);
+ return false;
+ }
+ }
+ }
+
+ if (desc->flags & CMD_DESC_BITMASK) {
+ int i;
+
+ for (i = 0; i < MAX_CMD_DESC_BITMASKS; i++) {
+ u32 dword;
+
+ if (desc->bits[i].mask == 0)
+ break;
+
+ if (desc->bits[i].condition_mask != 0) {
+ u32 offset =
+ desc->bits[i].condition_offset;
+ u32 condition = cmd[offset] &
+ desc->bits[i].condition_mask;
+
+ if (condition == 0)
+ continue;
+ }
+
+ dword = cmd[desc->bits[i].offset] &
+ desc->bits[i].mask;
+
+ if (dword != desc->bits[i].expected) {
+ DRM_DEBUG_DRIVER("CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n",
+ *cmd,
+ desc->bits[i].mask,
+ desc->bits[i].expected,
+ dword, ring->id);
+ return false;
+ }
+ }
+ }
+
+ return true;
+}
+
#define LENGTH_BIAS 2
/**
@@ -830,75 +900,11 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
break;
}
- if (desc->flags & CMD_DESC_REJECT) {
- DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
+ if (!check_cmd(ring, desc, cmd, is_master)) {
ret = -EINVAL;
break;
}
- if ((desc->flags & CMD_DESC_MASTER) && !is_master) {
- DRM_DEBUG_DRIVER("CMD: Rejected master-only command: 0x%08X\n",
- *cmd);
- ret = -EINVAL;
- break;
- }
-
- if (desc->flags & CMD_DESC_REGISTER) {
- u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
-
- if (!valid_reg(ring->reg_table,
- ring->reg_count, reg_addr)) {
- if (!is_master ||
- !valid_reg(ring->master_reg_table,
- ring->master_reg_count,
- reg_addr)) {
- DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (ring=%d)\n",
- reg_addr,
- *cmd,
- ring->id);
- ret = -EINVAL;
- break;
- }
- }
- }
-
- if (desc->flags & CMD_DESC_BITMASK) {
- int i;
-
- for (i = 0; i < MAX_CMD_DESC_BITMASKS; i++) {
- u32 dword;
-
- if (desc->bits[i].mask == 0)
- break;
-
- if (desc->bits[i].condition_mask != 0) {
- u32 offset =
- desc->bits[i].condition_offset;
- u32 condition = cmd[offset] &
- desc->bits[i].condition_mask;
-
- if (condition == 0)
- continue;
- }
-
- dword = cmd[desc->bits[i].offset] &
- desc->bits[i].mask;
-
- if (dword != desc->bits[i].expected) {
- DRM_DEBUG_DRIVER("CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n",
- *cmd,
- desc->bits[i].mask,
- desc->bits[i].expected,
- dword, ring->id);
- ret = -EINVAL;
- break;
- }
- }
-
- if (ret)
- break;
- }
-
cmd += length;
}
--
1.8.3.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-27 18:43 [PATCH 0/3] Fix up cmd parser OACONTROL handling + refactorings bradley.d.volkin
2014-03-27 18:43 ` [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted bradley.d.volkin
2014-03-27 18:43 ` [PATCH 2/3] drm/i915: Refactor cmd parser checks into a function bradley.d.volkin
@ 2014-03-27 18:43 ` bradley.d.volkin
2014-03-27 21:58 ` Kenneth Graunke
2014-03-28 17:21 ` [PATCH v2 " bradley.d.volkin
2 siblings, 2 replies; 19+ messages in thread
From: bradley.d.volkin @ 2014-03-27 18:43 UTC (permalink / raw)
To: intel-gfx
From: Brad Volkin <bradley.d.volkin@intel.com>
There is some thought that the data from the performance counters enabled
via OACONTROL should only be available to the process that enabled counting.
To limit snooping, require that any batch buffer which sets OACONTROL to a
non-zero value also sets it back to 0 before the end of the batch.
This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
so that we can access the value being written. This should be in line with
the expected use case for writing OACONTROL.
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
---
drivers/gpu/drm/i915/i915_cmd_parser.c | 35 ++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 2eb2aca..779e14c 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -407,12 +407,7 @@ static const u32 gen7_render_regs[] = {
REG64(CL_PRIMITIVES_COUNT),
REG64(PS_INVOCATION_COUNT),
REG64(PS_DEPTH_COUNT),
- /*
- * FIXME: This is just to keep mesa working for now, we need to check
- * that mesa resets this again and that it doesn't use any of the
- * special modes which write into the gtt.
- */
- OACONTROL,
+ OACONTROL, /* Only allowed for LRI and SRM. See below. */
REG64(GEN7_SO_NUM_PRIMS_WRITTEN(0)),
REG64(GEN7_SO_NUM_PRIMS_WRITTEN(1)),
REG64(GEN7_SO_NUM_PRIMS_WRITTEN(2)),
@@ -761,7 +756,8 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
static bool check_cmd(const struct intel_ring_buffer *ring,
const struct drm_i915_cmd_descriptor *desc,
const u32 *cmd,
- const bool is_master)
+ const bool is_master,
+ bool *oacontrol_set)
{
if (desc->flags & CMD_DESC_REJECT) {
DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
@@ -777,6 +773,23 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
if (desc->flags & CMD_DESC_REGISTER) {
u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
+ /*
+ * OACONTROL requires some special handling for writes. We
+ * want to make sure that any batch which enables OA also
+ * disables it before the end of the batch. The goal is to
+ * prevent one process from snooping on the perf data from
+ * another process. To do that, we need to check the value
+ * that will be written to the register. Hence, limit
+ * OACONTROL writes to only MI_LOAD_REGISTER_IMM commands.
+ */
+ if (reg_addr == OACONTROL) {
+ if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
+ return false;
+
+ if (desc->cmd.value == MI_LOAD_REGISTER_IMM(1))
+ *oacontrol_set = (cmd[2] != 0) ? true : false;
+ }
+
if (!valid_reg(ring->reg_table,
ring->reg_count, reg_addr)) {
if (!is_master ||
@@ -851,6 +864,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
u32 *cmd, *batch_base, *batch_end;
struct drm_i915_cmd_descriptor default_desc = { 0 };
int needs_clflush = 0;
+ bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */
ret = i915_gem_obj_prepare_shmem_read(batch_obj, &needs_clflush);
if (ret) {
@@ -900,7 +914,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
break;
}
- if (!check_cmd(ring, desc, cmd, is_master)) {
+ if (!check_cmd(ring, desc, cmd, is_master, &oacontrol_set)) {
ret = -EINVAL;
break;
}
@@ -908,6 +922,11 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
cmd += length;
}
+ if (oacontrol_set) {
+ DRM_DEBUG_DRIVER("CMD: batch set OACONTROL but did not clear it\n");
+ ret = -EINVAL;
+ }
+
if (cmd >= batch_end) {
DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
ret = -EINVAL;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted
2014-03-27 18:43 ` [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted bradley.d.volkin
@ 2014-03-27 21:47 ` Kenneth Graunke
2014-03-27 23:56 ` Volkin, Bradley D
2014-04-02 9:11 ` Jani Nikula
1 sibling, 1 reply; 19+ messages in thread
From: Kenneth Graunke @ 2014-03-27 21:47 UTC (permalink / raw)
To: bradley.d.volkin, intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 4097 bytes --]
On 03/27/2014 11:43 AM, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> As suggested during review, this makes it much more obvious
> when the tables are not sorted.
>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 31 +++++++++++++++++++++----------
> 1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 788bd96..8a93db3 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -493,12 +493,13 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
> return 0;
> }
>
> -static void validate_cmds_sorted(struct intel_ring_buffer *ring)
> +static bool validate_cmds_sorted(struct intel_ring_buffer *ring)
> {
> int i;
> + bool ret = true;
>
> if (!ring->cmd_tables || ring->cmd_table_count == 0)
> - return;
> + return true;
>
> for (i = 0; i < ring->cmd_table_count; i++) {
> const struct drm_i915_cmd_table *table = &ring->cmd_tables[i];
> @@ -510,35 +511,45 @@ static void validate_cmds_sorted(struct intel_ring_buffer *ring)
> &table->table[i];
> u32 curr = desc->cmd.value & desc->cmd.mask;
>
> - if (curr < previous)
> + if (curr < previous) {
> DRM_ERROR("CMD: table not sorted ring=%d table=%d entry=%d cmd=0x%08X prev=0x%08X\n",
> ring->id, i, j, curr, previous);
> + ret = false;
> + }
>
> previous = curr;
> }
> }
> +
> + return ret;
> }
>
> -static void check_sorted(int ring_id, const u32 *reg_table, int reg_count)
> +static bool check_sorted(int ring_id, const u32 *reg_table, int reg_count)
> {
> int i;
> u32 previous = 0;
> + bool ret = true;
>
> for (i = 0; i < reg_count; i++) {
> u32 curr = reg_table[i];
>
> - if (curr < previous)
> + if (curr < previous) {
> DRM_ERROR("CMD: table not sorted ring=%d entry=%d reg=0x%08X prev=0x%08X\n",
> ring_id, i, curr, previous);
> + ret = false;
> + }
>
> previous = curr;
> }
> +
> + return ret;
> }
>
> -static void validate_regs_sorted(struct intel_ring_buffer *ring)
> +static bool validate_regs_sorted(struct intel_ring_buffer *ring)
> {
> - check_sorted(ring->id, ring->reg_table, ring->reg_count);
> - check_sorted(ring->id, ring->master_reg_table, ring->master_reg_count);
> + return check_sorted(ring->id, ring->reg_table, ring->reg_count) &&
> + check_sorted(ring->id, ring->master_reg_table,
> + ring->master_reg_count);
> }
>
> /**
> @@ -617,8 +628,8 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
> BUG();
> }
>
> - validate_cmds_sorted(ring);
> - validate_regs_sorted(ring);
> + BUG_ON(!validate_cmds_sorted(ring));
> + BUG_ON(!validate_regs_sorted(ring));
> }
>
> static const struct drm_i915_cmd_descriptor*
Does any code actually rely on the tables being sorted?
I didn't see any early breaks or returns once the parser has gone past a
command...it seems to keep looking through the entire list.
As an aside:
It seems like find_cmd could be implemented a lot more efficiently.
Currently, most 3DSTATE commands are not in the tables - they only
appear to contain commands that need special handling. This means that
find_cmd will walk through every command in every table before falling
back to the default info struct.
As an example, my window manager (KWin) submitted a render ring batch
with 700 commands in it. Most of those commands need to walk through
every command in every table, which is about 48 entries. That's 33,600
iterations through tables, which seems...kind of excessive.
For example, 3D commands all have the form 0x78xx or 0x79xx. We could
use the xx as an array index into a lookup table, and get O(1) access
instead of O(n). Or, at least something similar.
Thanks again for your work on this, Brad.
--Ken
[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/3] drm/i915: Refactor cmd parser checks into a function
2014-03-27 18:43 ` [PATCH 2/3] drm/i915: Refactor cmd parser checks into a function bradley.d.volkin
@ 2014-03-27 21:49 ` Kenneth Graunke
2014-04-02 9:14 ` Jani Nikula
0 siblings, 1 reply; 19+ messages in thread
From: Kenneth Graunke @ 2014-03-27 21:49 UTC (permalink / raw)
To: bradley.d.volkin, intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 474 bytes --]
On 03/27/2014 11:43 AM, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> This brings the code a little more in line with kernel coding style.
>
> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 136 +++++++++++++++++----------------
> 1 file changed, 71 insertions(+), 65 deletions(-)
I like the refactor.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-27 18:43 ` [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing bradley.d.volkin
@ 2014-03-27 21:58 ` Kenneth Graunke
2014-03-28 0:06 ` Volkin, Bradley D
2014-03-28 17:21 ` [PATCH v2 " bradley.d.volkin
1 sibling, 1 reply; 19+ messages in thread
From: Kenneth Graunke @ 2014-03-27 21:58 UTC (permalink / raw)
To: bradley.d.volkin, intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 4664 bytes --]
On 03/27/2014 11:43 AM, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> There is some thought that the data from the performance counters enabled
> via OACONTROL should only be available to the process that enabled counting.
> To limit snooping, require that any batch buffer which sets OACONTROL to a
> non-zero value also sets it back to 0 before the end of the batch.
>
> This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
> so that we can access the value being written. This should be in line with
> the expected use case for writing OACONTROL.
>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 35 ++++++++++++++++++++++++++--------
> 1 file changed, 27 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 2eb2aca..779e14c 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -407,12 +407,7 @@ static const u32 gen7_render_regs[] = {
> REG64(CL_PRIMITIVES_COUNT),
> REG64(PS_INVOCATION_COUNT),
> REG64(PS_DEPTH_COUNT),
> - /*
> - * FIXME: This is just to keep mesa working for now, we need to check
> - * that mesa resets this again and that it doesn't use any of the
> - * special modes which write into the gtt.
> - */
> - OACONTROL,
> + OACONTROL, /* Only allowed for LRI and SRM. See below. */
> REG64(GEN7_SO_NUM_PRIMS_WRITTEN(0)),
> REG64(GEN7_SO_NUM_PRIMS_WRITTEN(1)),
> REG64(GEN7_SO_NUM_PRIMS_WRITTEN(2)),
> @@ -761,7 +756,8 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
> static bool check_cmd(const struct intel_ring_buffer *ring,
> const struct drm_i915_cmd_descriptor *desc,
> const u32 *cmd,
> - const bool is_master)
> + const bool is_master,
> + bool *oacontrol_set)
> {
> if (desc->flags & CMD_DESC_REJECT) {
> DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
> @@ -777,6 +773,23 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
> if (desc->flags & CMD_DESC_REGISTER) {
> u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
>
> + /*
> + * OACONTROL requires some special handling for writes. We
> + * want to make sure that any batch which enables OA also
> + * disables it before the end of the batch. The goal is to
> + * prevent one process from snooping on the perf data from
> + * another process. To do that, we need to check the value
> + * that will be written to the register. Hence, limit
> + * OACONTROL writes to only MI_LOAD_REGISTER_IMM commands.
> + */
> + if (reg_addr == OACONTROL) {
> + if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
> + return false;
While I don't need the ability to use LRM on OACONTROL, I don't see any
specific reason to prohibit it, as long as there's a later LRI of 0.
I think you could just do:
if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
oacontrol_set = true;
which will later get cleared if it sees a LRI of 0, and if not, you
reject the batch.
> +
> + if (desc->cmd.value == MI_LOAD_REGISTER_IMM(1))
> + *oacontrol_set = (cmd[2] != 0) ? true : false;
You shouldn't need to do both != 0 and ? true : false...
*oacontrol_set = cmd[2] != 0;
Thanks for implementing this! That was surprisingly less code than I
thought.
> + }
> +
> if (!valid_reg(ring->reg_table,
> ring->reg_count, reg_addr)) {
> if (!is_master ||
> @@ -851,6 +864,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> u32 *cmd, *batch_base, *batch_end;
> struct drm_i915_cmd_descriptor default_desc = { 0 };
> int needs_clflush = 0;
> + bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */
>
> ret = i915_gem_obj_prepare_shmem_read(batch_obj, &needs_clflush);
> if (ret) {
> @@ -900,7 +914,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> break;
> }
>
> - if (!check_cmd(ring, desc, cmd, is_master)) {
> + if (!check_cmd(ring, desc, cmd, is_master, &oacontrol_set)) {
> ret = -EINVAL;
> break;
> }
> @@ -908,6 +922,11 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> cmd += length;
> }
>
> + if (oacontrol_set) {
> + DRM_DEBUG_DRIVER("CMD: batch set OACONTROL but did not clear it\n");
> + ret = -EINVAL;
> + }
> +
> if (cmd >= batch_end) {
> DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
> ret = -EINVAL;
>
[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted
2014-03-27 21:47 ` Kenneth Graunke
@ 2014-03-27 23:56 ` Volkin, Bradley D
0 siblings, 0 replies; 19+ messages in thread
From: Volkin, Bradley D @ 2014-03-27 23:56 UTC (permalink / raw)
To: Kenneth Graunke; +Cc: intel-gfx@lists.freedesktop.org
On Thu, Mar 27, 2014 at 02:47:03PM -0700, Kenneth Graunke wrote:
> Does any code actually rely on the tables being sorted?
Not today. The idea was to make it easier to move to an algorithm that does
in the future. For example, I thought binary search might be an easy win.
>
> I didn't see any early breaks or returns once the parser has gone past a
> command...it seems to keep looking through the entire list.
>
> As an aside:
>
> It seems like find_cmd could be implemented a lot more efficiently.
Yes, it's definitely a naive implementation. I've been hesitant to do too
much in the way of optimizing the searches until we have some real perf
regressions to look at. I've received several suggested improvements that,
at least with quick implementations, didn't show gains in the benchmarks
I had available (e.g. binary search; using kmap_atomic instead of vmap). My
hope is that with the feature now generally available, we'll see some data
from benchmarks people care about and go from there.
I like your suggestion though and I'll add it to the list.
Thanks,
Brad
> Currently, most 3DSTATE commands are not in the tables - they only
> appear to contain commands that need special handling. This means that
> find_cmd will walk through every command in every table before falling
> back to the default info struct.
>
> As an example, my window manager (KWin) submitted a render ring batch
> with 700 commands in it. Most of those commands need to walk through
> every command in every table, which is about 48 entries. That's 33,600
> iterations through tables, which seems...kind of excessive.
>
> For example, 3D commands all have the form 0x78xx or 0x79xx. We could
> use the xx as an array index into a lookup table, and get O(1) access
> instead of O(n). Or, at least something similar.
>
> Thanks again for your work on this, Brad.
>
> --Ken
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-27 21:58 ` Kenneth Graunke
@ 2014-03-28 0:06 ` Volkin, Bradley D
2014-03-28 7:51 ` Daniel Vetter
0 siblings, 1 reply; 19+ messages in thread
From: Volkin, Bradley D @ 2014-03-28 0:06 UTC (permalink / raw)
To: Kenneth Graunke; +Cc: intel-gfx
On Thu, Mar 27, 2014 at 02:58:01PM -0700, Kenneth Graunke wrote:
> On 03/27/2014 11:43 AM, bradley.d.volkin@intel.com wrote:
> > From: Brad Volkin <bradley.d.volkin@intel.com>
> >
> > There is some thought that the data from the performance counters enabled
> > via OACONTROL should only be available to the process that enabled counting.
> > To limit snooping, require that any batch buffer which sets OACONTROL to a
> > non-zero value also sets it back to 0 before the end of the batch.
> >
> > This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
> > so that we can access the value being written. This should be in line with
> > the expected use case for writing OACONTROL.
> >
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_cmd_parser.c | 35 ++++++++++++++++++++++++++--------
> > 1 file changed, 27 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > index 2eb2aca..779e14c 100644
> > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > @@ -407,12 +407,7 @@ static const u32 gen7_render_regs[] = {
> > REG64(CL_PRIMITIVES_COUNT),
> > REG64(PS_INVOCATION_COUNT),
> > REG64(PS_DEPTH_COUNT),
> > - /*
> > - * FIXME: This is just to keep mesa working for now, we need to check
> > - * that mesa resets this again and that it doesn't use any of the
> > - * special modes which write into the gtt.
> > - */
> > - OACONTROL,
> > + OACONTROL, /* Only allowed for LRI and SRM. See below. */
> > REG64(GEN7_SO_NUM_PRIMS_WRITTEN(0)),
> > REG64(GEN7_SO_NUM_PRIMS_WRITTEN(1)),
> > REG64(GEN7_SO_NUM_PRIMS_WRITTEN(2)),
> > @@ -761,7 +756,8 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
> > static bool check_cmd(const struct intel_ring_buffer *ring,
> > const struct drm_i915_cmd_descriptor *desc,
> > const u32 *cmd,
> > - const bool is_master)
> > + const bool is_master,
> > + bool *oacontrol_set)
> > {
> > if (desc->flags & CMD_DESC_REJECT) {
> > DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
> > @@ -777,6 +773,23 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
> > if (desc->flags & CMD_DESC_REGISTER) {
> > u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
> >
> > + /*
> > + * OACONTROL requires some special handling for writes. We
> > + * want to make sure that any batch which enables OA also
> > + * disables it before the end of the batch. The goal is to
> > + * prevent one process from snooping on the perf data from
> > + * another process. To do that, we need to check the value
> > + * that will be written to the register. Hence, limit
> > + * OACONTROL writes to only MI_LOAD_REGISTER_IMM commands.
> > + */
> > + if (reg_addr == OACONTROL) {
> > + if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
> > + return false;
>
> While I don't need the ability to use LRM on OACONTROL, I don't see any
> specific reason to prohibit it, as long as there's a later LRI of 0.
>
> I think you could just do:
>
> if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
> oacontrol_set = true;
>
> which will later get cleared if it sees a LRI of 0, and if not, you
> reject the batch.
Fair enough. Rejecting LRM was as much a carryover from thinking we needed to
check bits within the value. I'll make this change if there are no objections.
>
> > +
> > + if (desc->cmd.value == MI_LOAD_REGISTER_IMM(1))
> > + *oacontrol_set = (cmd[2] != 0) ? true : false;
>
> You shouldn't need to do both != 0 and ? true : false...
>
> *oacontrol_set = cmd[2] != 0;
Will fix.
>
> Thanks for implementing this! That was surprisingly less code than I
> thought.
This is the "you told me there's only one register that needs this" implementation :)
A more general solution would probably be uglier.
Thanks,
Brad
>
> > + }
> > +
> > if (!valid_reg(ring->reg_table,
> > ring->reg_count, reg_addr)) {
> > if (!is_master ||
> > @@ -851,6 +864,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> > u32 *cmd, *batch_base, *batch_end;
> > struct drm_i915_cmd_descriptor default_desc = { 0 };
> > int needs_clflush = 0;
> > + bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */
> >
> > ret = i915_gem_obj_prepare_shmem_read(batch_obj, &needs_clflush);
> > if (ret) {
> > @@ -900,7 +914,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> > break;
> > }
> >
> > - if (!check_cmd(ring, desc, cmd, is_master)) {
> > + if (!check_cmd(ring, desc, cmd, is_master, &oacontrol_set)) {
> > ret = -EINVAL;
> > break;
> > }
> > @@ -908,6 +922,11 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> > cmd += length;
> > }
> >
> > + if (oacontrol_set) {
> > + DRM_DEBUG_DRIVER("CMD: batch set OACONTROL but did not clear it\n");
> > + ret = -EINVAL;
> > + }
> > +
> > if (cmd >= batch_end) {
> > DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
> > ret = -EINVAL;
> >
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-28 0:06 ` Volkin, Bradley D
@ 2014-03-28 7:51 ` Daniel Vetter
0 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2014-03-28 7:51 UTC (permalink / raw)
To: Volkin, Bradley D; +Cc: intel-gfx
On Thu, Mar 27, 2014 at 05:06:33PM -0700, Volkin, Bradley D wrote:
> On Thu, Mar 27, 2014 at 02:58:01PM -0700, Kenneth Graunke wrote:
> > On 03/27/2014 11:43 AM, bradley.d.volkin@intel.com wrote:
> > > From: Brad Volkin <bradley.d.volkin@intel.com>
> > >
> > > There is some thought that the data from the performance counters enabled
> > > via OACONTROL should only be available to the process that enabled counting.
> > > To limit snooping, require that any batch buffer which sets OACONTROL to a
> > > non-zero value also sets it back to 0 before the end of the batch.
> > >
> > > This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
> > > so that we can access the value being written. This should be in line with
> > > the expected use case for writing OACONTROL.
> > >
> > > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 35 ++++++++++++++++++++++++++--------
> > > 1 file changed, 27 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > index 2eb2aca..779e14c 100644
> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > @@ -407,12 +407,7 @@ static const u32 gen7_render_regs[] = {
> > > REG64(CL_PRIMITIVES_COUNT),
> > > REG64(PS_INVOCATION_COUNT),
> > > REG64(PS_DEPTH_COUNT),
> > > - /*
> > > - * FIXME: This is just to keep mesa working for now, we need to check
> > > - * that mesa resets this again and that it doesn't use any of the
> > > - * special modes which write into the gtt.
> > > - */
> > > - OACONTROL,
> > > + OACONTROL, /* Only allowed for LRI and SRM. See below. */
> > > REG64(GEN7_SO_NUM_PRIMS_WRITTEN(0)),
> > > REG64(GEN7_SO_NUM_PRIMS_WRITTEN(1)),
> > > REG64(GEN7_SO_NUM_PRIMS_WRITTEN(2)),
> > > @@ -761,7 +756,8 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
> > > static bool check_cmd(const struct intel_ring_buffer *ring,
> > > const struct drm_i915_cmd_descriptor *desc,
> > > const u32 *cmd,
> > > - const bool is_master)
> > > + const bool is_master,
> > > + bool *oacontrol_set)
> > > {
> > > if (desc->flags & CMD_DESC_REJECT) {
> > > DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
> > > @@ -777,6 +773,23 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
> > > if (desc->flags & CMD_DESC_REGISTER) {
> > > u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
> > >
> > > + /*
> > > + * OACONTROL requires some special handling for writes. We
> > > + * want to make sure that any batch which enables OA also
> > > + * disables it before the end of the batch. The goal is to
> > > + * prevent one process from snooping on the perf data from
> > > + * another process. To do that, we need to check the value
> > > + * that will be written to the register. Hence, limit
> > > + * OACONTROL writes to only MI_LOAD_REGISTER_IMM commands.
> > > + */
> > > + if (reg_addr == OACONTROL) {
> > > + if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
> > > + return false;
> >
> > While I don't need the ability to use LRM on OACONTROL, I don't see any
> > specific reason to prohibit it, as long as there's a later LRI of 0.
> >
> > I think you could just do:
> >
> > if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
> > oacontrol_set = true;
> >
> > which will later get cleared if it sees a LRI of 0, and if not, you
> > reject the batch.
>
> Fair enough. Rejecting LRM was as much a carryover from thinking we needed to
> check bits within the value. I'll make this change if there are no objections.
Imo if there's no user there's no need for that.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-27 18:43 ` [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing bradley.d.volkin
2014-03-27 21:58 ` Kenneth Graunke
@ 2014-03-28 17:21 ` bradley.d.volkin
2014-03-28 17:37 ` Kenneth Graunke
` (2 more replies)
1 sibling, 3 replies; 19+ messages in thread
From: bradley.d.volkin @ 2014-03-28 17:21 UTC (permalink / raw)
To: intel-gfx
From: Brad Volkin <bradley.d.volkin@intel.com>
There is some thought that the data from the performance counters enabled
via OACONTROL should only be available to the process that enabled counting.
To limit snooping, require that any batch buffer which sets OACONTROL to a
non-zero value also sets it back to 0 before the end of the batch.
This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
so that we can access the value being written. This should be in line with
the expected use case for writing OACONTROL.
v2: Drop an unnecessary '? true : false'
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
---
drivers/gpu/drm/i915/i915_cmd_parser.c | 35 ++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 2eb2aca..34e2d45 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -407,12 +407,7 @@ static const u32 gen7_render_regs[] = {
REG64(CL_PRIMITIVES_COUNT),
REG64(PS_INVOCATION_COUNT),
REG64(PS_DEPTH_COUNT),
- /*
- * FIXME: This is just to keep mesa working for now, we need to check
- * that mesa resets this again and that it doesn't use any of the
- * special modes which write into the gtt.
- */
- OACONTROL,
+ OACONTROL, /* Only allowed for LRI and SRM. See below. */
REG64(GEN7_SO_NUM_PRIMS_WRITTEN(0)),
REG64(GEN7_SO_NUM_PRIMS_WRITTEN(1)),
REG64(GEN7_SO_NUM_PRIMS_WRITTEN(2)),
@@ -761,7 +756,8 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
static bool check_cmd(const struct intel_ring_buffer *ring,
const struct drm_i915_cmd_descriptor *desc,
const u32 *cmd,
- const bool is_master)
+ const bool is_master,
+ bool *oacontrol_set)
{
if (desc->flags & CMD_DESC_REJECT) {
DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
@@ -777,6 +773,23 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
if (desc->flags & CMD_DESC_REGISTER) {
u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
+ /*
+ * OACONTROL requires some special handling for writes. We
+ * want to make sure that any batch which enables OA also
+ * disables it before the end of the batch. The goal is to
+ * prevent one process from snooping on the perf data from
+ * another process. To do that, we need to check the value
+ * that will be written to the register. Hence, limit
+ * OACONTROL writes to only MI_LOAD_REGISTER_IMM commands.
+ */
+ if (reg_addr == OACONTROL) {
+ if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
+ return false;
+
+ if (desc->cmd.value == MI_LOAD_REGISTER_IMM(1))
+ *oacontrol_set = (cmd[2] != 0);
+ }
+
if (!valid_reg(ring->reg_table,
ring->reg_count, reg_addr)) {
if (!is_master ||
@@ -851,6 +864,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
u32 *cmd, *batch_base, *batch_end;
struct drm_i915_cmd_descriptor default_desc = { 0 };
int needs_clflush = 0;
+ bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */
ret = i915_gem_obj_prepare_shmem_read(batch_obj, &needs_clflush);
if (ret) {
@@ -900,7 +914,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
break;
}
- if (!check_cmd(ring, desc, cmd, is_master)) {
+ if (!check_cmd(ring, desc, cmd, is_master, &oacontrol_set)) {
ret = -EINVAL;
break;
}
@@ -908,6 +922,11 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
cmd += length;
}
+ if (oacontrol_set) {
+ DRM_DEBUG_DRIVER("CMD: batch set OACONTROL but did not clear it\n");
+ ret = -EINVAL;
+ }
+
if (cmd >= batch_end) {
DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
ret = -EINVAL;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-28 17:21 ` [PATCH v2 " bradley.d.volkin
@ 2014-03-28 17:37 ` Kenneth Graunke
2014-04-02 9:35 ` Jani Nikula
2014-04-10 6:07 ` Zhenyu Wang
2 siblings, 0 replies; 19+ messages in thread
From: Kenneth Graunke @ 2014-03-28 17:37 UTC (permalink / raw)
To: bradley.d.volkin, intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 1046 bytes --]
On 03/28/2014 10:21 AM, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> There is some thought that the data from the performance counters enabled
> via OACONTROL should only be available to the process that enabled counting.
> To limit snooping, require that any batch buffer which sets OACONTROL to a
> non-zero value also sets it back to 0 before the end of the batch.
>
> This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
> so that we can access the value being written. This should be in line with
> the expected use case for writing OACONTROL.
>
> v2: Drop an unnecessary '? true : false'
>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
I still don't see any rationale for prohibiting LRM - there's zero
security benefit, and you can write to every other register with either
LRI or LRM, so it's just extra inconsistency. But, Daniel.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted
2014-03-27 18:43 ` [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted bradley.d.volkin
2014-03-27 21:47 ` Kenneth Graunke
@ 2014-04-02 9:11 ` Jani Nikula
2014-04-02 9:35 ` Daniel Vetter
1 sibling, 1 reply; 19+ messages in thread
From: Jani Nikula @ 2014-04-02 9:11 UTC (permalink / raw)
To: bradley.d.volkin, intel-gfx
On Thu, 27 Mar 2014, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> As suggested during review, this makes it much more obvious
> when the tables are not sorted.
Let's look at the optimizations later.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 31 +++++++++++++++++++++----------
> 1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 788bd96..8a93db3 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -493,12 +493,13 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
> return 0;
> }
>
> -static void validate_cmds_sorted(struct intel_ring_buffer *ring)
> +static bool validate_cmds_sorted(struct intel_ring_buffer *ring)
> {
> int i;
> + bool ret = true;
>
> if (!ring->cmd_tables || ring->cmd_table_count == 0)
> - return;
> + return true;
>
> for (i = 0; i < ring->cmd_table_count; i++) {
> const struct drm_i915_cmd_table *table = &ring->cmd_tables[i];
> @@ -510,35 +511,45 @@ static void validate_cmds_sorted(struct intel_ring_buffer *ring)
> &table->table[i];
> u32 curr = desc->cmd.value & desc->cmd.mask;
>
> - if (curr < previous)
> + if (curr < previous) {
> DRM_ERROR("CMD: table not sorted ring=%d table=%d entry=%d cmd=0x%08X prev=0x%08X\n",
> ring->id, i, j, curr, previous);
> + ret = false;
> + }
>
> previous = curr;
> }
> }
> +
> + return ret;
> }
>
> -static void check_sorted(int ring_id, const u32 *reg_table, int reg_count)
> +static bool check_sorted(int ring_id, const u32 *reg_table, int reg_count)
> {
> int i;
> u32 previous = 0;
> + bool ret = true;
>
> for (i = 0; i < reg_count; i++) {
> u32 curr = reg_table[i];
>
> - if (curr < previous)
> + if (curr < previous) {
> DRM_ERROR("CMD: table not sorted ring=%d entry=%d reg=0x%08X prev=0x%08X\n",
> ring_id, i, curr, previous);
> + ret = false;
> + }
>
> previous = curr;
> }
> +
> + return ret;
> }
>
> -static void validate_regs_sorted(struct intel_ring_buffer *ring)
> +static bool validate_regs_sorted(struct intel_ring_buffer *ring)
> {
> - check_sorted(ring->id, ring->reg_table, ring->reg_count);
> - check_sorted(ring->id, ring->master_reg_table, ring->master_reg_count);
> + return check_sorted(ring->id, ring->reg_table, ring->reg_count) &&
> + check_sorted(ring->id, ring->master_reg_table,
> + ring->master_reg_count);
> }
>
> /**
> @@ -617,8 +628,8 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
> BUG();
> }
>
> - validate_cmds_sorted(ring);
> - validate_regs_sorted(ring);
> + BUG_ON(!validate_cmds_sorted(ring));
> + BUG_ON(!validate_regs_sorted(ring));
> }
>
> static const struct drm_i915_cmd_descriptor*
> --
> 1.8.3.2
>
--
Jani Nikula, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/3] drm/i915: Refactor cmd parser checks into a function
2014-03-27 21:49 ` Kenneth Graunke
@ 2014-04-02 9:14 ` Jani Nikula
0 siblings, 0 replies; 19+ messages in thread
From: Jani Nikula @ 2014-04-02 9:14 UTC (permalink / raw)
To: Kenneth Graunke, bradley.d.volkin, intel-gfx
On Thu, 27 Mar 2014, Kenneth Graunke <kenneth@whitecape.org> wrote:
> On 03/27/2014 11:43 AM, bradley.d.volkin@intel.com wrote:
>> From: Brad Volkin <bradley.d.volkin@intel.com>
>>
>> This brings the code a little more in line with kernel coding style.
>>
>> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_cmd_parser.c | 136 +++++++++++++++++----------------
>> 1 file changed, 71 insertions(+), 65 deletions(-)
>
> I like the refactor.
>
> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ditto.
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Jani Nikula, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-28 17:21 ` [PATCH v2 " bradley.d.volkin
2014-03-28 17:37 ` Kenneth Graunke
@ 2014-04-02 9:35 ` Jani Nikula
2014-04-10 6:07 ` Zhenyu Wang
2 siblings, 0 replies; 19+ messages in thread
From: Jani Nikula @ 2014-04-02 9:35 UTC (permalink / raw)
To: bradley.d.volkin, intel-gfx
On Fri, 28 Mar 2014, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> There is some thought that the data from the performance counters enabled
> via OACONTROL should only be available to the process that enabled counting.
> To limit snooping, require that any batch buffer which sets OACONTROL to a
> non-zero value also sets it back to 0 before the end of the batch.
>
> This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
> so that we can access the value being written. This should be in line with
> the expected use case for writing OACONTROL.
>
> v2: Drop an unnecessary '? true : false'
>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 35 ++++++++++++++++++++++++++--------
> 1 file changed, 27 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 2eb2aca..34e2d45 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -407,12 +407,7 @@ static const u32 gen7_render_regs[] = {
> REG64(CL_PRIMITIVES_COUNT),
> REG64(PS_INVOCATION_COUNT),
> REG64(PS_DEPTH_COUNT),
> - /*
> - * FIXME: This is just to keep mesa working for now, we need to check
> - * that mesa resets this again and that it doesn't use any of the
> - * special modes which write into the gtt.
> - */
> - OACONTROL,
> + OACONTROL, /* Only allowed for LRI and SRM. See below. */
> REG64(GEN7_SO_NUM_PRIMS_WRITTEN(0)),
> REG64(GEN7_SO_NUM_PRIMS_WRITTEN(1)),
> REG64(GEN7_SO_NUM_PRIMS_WRITTEN(2)),
> @@ -761,7 +756,8 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
> static bool check_cmd(const struct intel_ring_buffer *ring,
> const struct drm_i915_cmd_descriptor *desc,
> const u32 *cmd,
> - const bool is_master)
> + const bool is_master,
> + bool *oacontrol_set)
> {
> if (desc->flags & CMD_DESC_REJECT) {
> DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
> @@ -777,6 +773,23 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
> if (desc->flags & CMD_DESC_REGISTER) {
> u32 reg_addr = cmd[desc->reg.offset] & desc->reg.mask;
>
> + /*
> + * OACONTROL requires some special handling for writes. We
> + * want to make sure that any batch which enables OA also
> + * disables it before the end of the batch. The goal is to
> + * prevent one process from snooping on the perf data from
> + * another process. To do that, we need to check the value
> + * that will be written to the register. Hence, limit
> + * OACONTROL writes to only MI_LOAD_REGISTER_IMM commands.
> + */
> + if (reg_addr == OACONTROL) {
> + if (desc->cmd.value == MI_LOAD_REGISTER_MEM)
> + return false;
Given the comment "Only allowed for LRI and SRM" above, would it be
better to just whitelist the allowed commands here? *shrug*
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
> +
> + if (desc->cmd.value == MI_LOAD_REGISTER_IMM(1))
> + *oacontrol_set = (cmd[2] != 0);
> + }
> +
> if (!valid_reg(ring->reg_table,
> ring->reg_count, reg_addr)) {
> if (!is_master ||
> @@ -851,6 +864,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> u32 *cmd, *batch_base, *batch_end;
> struct drm_i915_cmd_descriptor default_desc = { 0 };
> int needs_clflush = 0;
> + bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */
>
> ret = i915_gem_obj_prepare_shmem_read(batch_obj, &needs_clflush);
> if (ret) {
> @@ -900,7 +914,7 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> break;
> }
>
> - if (!check_cmd(ring, desc, cmd, is_master)) {
> + if (!check_cmd(ring, desc, cmd, is_master, &oacontrol_set)) {
> ret = -EINVAL;
> break;
> }
> @@ -908,6 +922,11 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
> cmd += length;
> }
>
> + if (oacontrol_set) {
> + DRM_DEBUG_DRIVER("CMD: batch set OACONTROL but did not clear it\n");
> + ret = -EINVAL;
> + }
> +
> if (cmd >= batch_end) {
> DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
> ret = -EINVAL;
> --
> 1.8.3.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Jani Nikula, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted
2014-04-02 9:11 ` Jani Nikula
@ 2014-04-02 9:35 ` Daniel Vetter
0 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2014-04-02 9:35 UTC (permalink / raw)
To: Jani Nikula; +Cc: intel-gfx
On Wed, Apr 02, 2014 at 12:11:37PM +0300, Jani Nikula wrote:
> On Thu, 27 Mar 2014, bradley.d.volkin@intel.com wrote:
> > From: Brad Volkin <bradley.d.volkin@intel.com>
> >
> > As suggested during review, this makes it much more obvious
> > when the tables are not sorted.
>
> Let's look at the optimizations later.
>
> Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Pulled in all three, thanks for patches&review.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-03-28 17:21 ` [PATCH v2 " bradley.d.volkin
2014-03-28 17:37 ` Kenneth Graunke
2014-04-02 9:35 ` Jani Nikula
@ 2014-04-10 6:07 ` Zhenyu Wang
2014-04-10 6:43 ` Daniel Vetter
2 siblings, 1 reply; 19+ messages in thread
From: Zhenyu Wang @ 2014-04-10 6:07 UTC (permalink / raw)
To: bradley.d.volkin; +Cc: intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 809 bytes --]
On 2014.03.28 10:21:50 -0700, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> There is some thought that the data from the performance counters enabled
> via OACONTROL should only be available to the process that enabled counting.
> To limit snooping, require that any batch buffer which sets OACONTROL to a
> non-zero value also sets it back to 0 before the end of the batch.
>
I think this might be too strict although there's good point in this.
But it makes almost impossible to insert mi report count command to probe
GPU statistics from kind of third-party tool or lib.
Could we have a i915.enable_cmd_parser config that can disable this?
--
Open Source Technology Center, Intel ltd.
$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-04-10 6:07 ` Zhenyu Wang
@ 2014-04-10 6:43 ` Daniel Vetter
2014-04-10 7:51 ` Zhenyu Wang
0 siblings, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2014-04-10 6:43 UTC (permalink / raw)
To: Zhenyu Wang; +Cc: intel-gfx
On Thu, Apr 10, 2014 at 02:07:21PM +0800, Zhenyu Wang wrote:
> On 2014.03.28 10:21:50 -0700, bradley.d.volkin@intel.com wrote:
> > From: Brad Volkin <bradley.d.volkin@intel.com>
> >
> > There is some thought that the data from the performance counters enabled
> > via OACONTROL should only be available to the process that enabled counting.
> > To limit snooping, require that any batch buffer which sets OACONTROL to a
> > non-zero value also sets it back to 0 before the end of the batch.
> >
>
> I think this might be too strict although there's good point in this.
> But it makes almost impossible to insert mi report count command to probe
> GPU statistics from kind of third-party tool or lib.
At least in the case of mesa you _must_ use mesa (or whatever your gl
library is) to insert the MI_PERF cmd at just the rigth spots in the
command stream and ofc flush all outstanding vertices and similar things.
If you want a global perf sampling support then I think we need to wire up
the time-based sampling the hw provides to the perf subsystem. And figure
out how to coordinate between mesa wanting to use OA and perf (probably
just reject mesa batches or something like that).
> Could we have a i915.enable_cmd_parser config that can disable this?
As is I don't see a compelling reason. Also if you require the users of
your perf tuning tooling to use a module option, you're doing it wrong.
This stuff should Just Work.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] drm/i915: Track OACONTROL register enable/disable during parsing
2014-04-10 6:43 ` Daniel Vetter
@ 2014-04-10 7:51 ` Zhenyu Wang
0 siblings, 0 replies; 19+ messages in thread
From: Zhenyu Wang @ 2014-04-10 7:51 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 984 bytes --]
On 2014.04.10 08:43:40 +0200, Daniel Vetter wrote:
>
> At least in the case of mesa you _must_ use mesa (or whatever your gl
> library is) to insert the MI_PERF cmd at just the rigth spots in the
> command stream and ofc flush all outstanding vertices and similar things.
>
> If you want a global perf sampling support then I think we need to wire up
> the time-based sampling the hw provides to the perf subsystem. And figure
> out how to coordinate between mesa wanting to use OA and perf (probably
> just reject mesa batches or something like that).
>
yeah, that'll be an interesting direction.
> > Could we have a i915.enable_cmd_parser config that can disable this?
>
> As is I don't see a compelling reason. Also if you require the users of
> your perf tuning tooling to use a module option, you're doing it wrong.
> This stuff should Just Work.
--
Open Source Technology Center, Intel ltd.
$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2014-04-10 7:51 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-27 18:43 [PATCH 0/3] Fix up cmd parser OACONTROL handling + refactorings bradley.d.volkin
2014-03-27 18:43 ` [PATCH 1/3] drm/i915: BUG_ON() when cmd/reg tables are not sorted bradley.d.volkin
2014-03-27 21:47 ` Kenneth Graunke
2014-03-27 23:56 ` Volkin, Bradley D
2014-04-02 9:11 ` Jani Nikula
2014-04-02 9:35 ` Daniel Vetter
2014-03-27 18:43 ` [PATCH 2/3] drm/i915: Refactor cmd parser checks into a function bradley.d.volkin
2014-03-27 21:49 ` Kenneth Graunke
2014-04-02 9:14 ` Jani Nikula
2014-03-27 18:43 ` [PATCH 3/3] drm/i915: Track OACONTROL register enable/disable during parsing bradley.d.volkin
2014-03-27 21:58 ` Kenneth Graunke
2014-03-28 0:06 ` Volkin, Bradley D
2014-03-28 7:51 ` Daniel Vetter
2014-03-28 17:21 ` [PATCH v2 " bradley.d.volkin
2014-03-28 17:37 ` Kenneth Graunke
2014-04-02 9:35 ` Jani Nikula
2014-04-10 6:07 ` Zhenyu Wang
2014-04-10 6:43 ` Daniel Vetter
2014-04-10 7:51 ` Zhenyu Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox