* [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+ @ 2015-03-10 21:17 jeff.mcgee 2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee 2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee 0 siblings, 2 replies; 10+ messages in thread From: jeff.mcgee @ 2015-03-10 21:17 UTC (permalink / raw) To: intel-gfx From: Jeff McGee <jeff.mcgee@intel.com> New IGT testing to cover the RC6/SSEU issue recently resolved on SKL. http://lists.freedesktop.org/archives/intel-gfx/2015-February/060058.html Jeff McGee (2): lib: Add media spin tests/pm_sseu: Create new test pm_sseu lib/Makefile.sources | 2 + lib/intel_batchbuffer.c | 24 +++ lib/intel_batchbuffer.h | 22 ++ lib/media_spin.c | 540 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/media_spin.h | 39 ++++ tests/.gitignore | 1 + tests/Makefile.sources | 1 + tests/pm_sseu.c | 373 +++++++++++++++++++++++++++++++++ 8 files changed, 1002 insertions(+) create mode 100644 lib/media_spin.c create mode 100644 lib/media_spin.h create mode 100644 tests/pm_sseu.c -- 2.3.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH i-g-t 1/2] lib: Add media spin 2015-03-10 21:17 [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+ jeff.mcgee @ 2015-03-10 21:17 ` jeff.mcgee 2015-03-12 17:52 ` [PATCH i-g-t 1/2 v2] " jeff.mcgee 2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee 1 sibling, 1 reply; 10+ messages in thread From: jeff.mcgee @ 2015-03-10 21:17 UTC (permalink / raw) To: intel-gfx From: Jeff McGee <jeff.mcgee@intel.com> The media spin utility is derived from media fill. The purpose is to create a simple means to keep the render engine (media pipeline) busy for a controlled amount of time. It does so by emitting a batch with a single execution thread that spins in a tight loop the requested number of times. Each spin increments a counter whose final 32-bit value is written to the destination buffer on completion for checking. The implementation supports Gen8, Gen8lp, and Gen9. Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> --- lib/Makefile.sources | 2 + lib/intel_batchbuffer.c | 24 +++ lib/intel_batchbuffer.h | 22 ++ lib/media_spin.c | 540 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/media_spin.h | 39 ++++ 5 files changed, 627 insertions(+) create mode 100644 lib/media_spin.c create mode 100644 lib/media_spin.h diff --git a/lib/Makefile.sources b/lib/Makefile.sources index 76f353a..3d93629 100644 --- a/lib/Makefile.sources +++ b/lib/Makefile.sources @@ -29,6 +29,8 @@ libintel_tools_la_SOURCES = \ media_fill_gen8.c \ media_fill_gen8lp.c \ media_fill_gen9.c \ + media_spin.h \ + media_spin.c \ gen7_media.h \ gen8_media.h \ rendercopy_i915.c \ diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c index c70f6d8..14970e4 100644 --- a/lib/intel_batchbuffer.c +++ b/lib/intel_batchbuffer.c @@ -39,6 +39,7 @@ #include "intel_reg.h" #include "rendercopy.h" #include "media_fill.h" +#include "media_spin.h" #include <i915_drm.h> /** @@ -530,3 +531,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid) return fill; } + +/** + * 
igt_get_media_spinfunc: + * @devid: pci device id + * + * Returns: + * + * The platform-specific media spin function pointer for the device specified + * with @devid. Will return NULL when no media spin function is implemented. + */ +igt_media_spinfunc_t igt_get_media_spinfunc(int devid) +{ + igt_media_spinfunc_t spin = NULL; + + if (IS_GEN9(devid)) + spin = gen9_media_spinfunc; + else if (IS_BROADWELL(devid)) + spin = gen8_media_spinfunc; + else if (IS_CHERRYVIEW(devid)) + spin = gen8lp_media_spinfunc; + + return spin; +} diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h index 12f7be1..13b356a 100644 --- a/lib/intel_batchbuffer.h +++ b/lib/intel_batchbuffer.h @@ -265,4 +265,26 @@ typedef void (*igt_fillfunc_t)(struct intel_batchbuffer *batch, igt_fillfunc_t igt_get_media_fillfunc(int devid); igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid); +/** + * igt_media_spinfunc_t: + * @batch: batchbuffer object + * @dst: destination i-g-t buffer object + * @spins: number of loops to execute + * + * This is the type of the per-platform media spin functions. The + * platform-specific implementation can be obtained by calling + * igt_get_media_spinfunc(). + * + * The media spin function emits a batchbuffer for the render engine with + * the media pipeline selected. The workload consists of a single thread + * which spins in a tight loop the requested number of times. Each spin + * increments a counter whose final 32-bit value is written to the + * destination buffer on completion. This utility provides a simple way + * to keep the render engine busy for a set time for various tests. 
+ */ +typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +igt_media_spinfunc_t igt_get_media_spinfunc(int devid); + #endif diff --git a/lib/media_spin.c b/lib/media_spin.c new file mode 100644 index 0000000..b44c55a --- /dev/null +++ b/lib/media_spin.c @@ -0,0 +1,540 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ * + * Authors: + * Jeff McGee <jeff.mcgee@intel.com> + */ + +#include <intel_bufmgr.h> +#include <i915_drm.h> +#include "intel_reg.h" +#include "drmtest.h" +#include "intel_batchbuffer.h" +#include "gen8_media.h" +#include "media_spin.h" + +static const uint32_t spin_kernel[][4] = { + { 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov (8)r4.0<1>:ud r0.0<8;8;1>:ud */ + { 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov (2)r4.0<1>:ud r2.0<2;2;1>:ud */ + { 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov (1)r4.8<1>:ud 0x3 */ + { 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov (1)r5.0<1>:ud 0 */ + { 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */ + { 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0 (1)null<1> r1<0;1;0> r5<0;1;0> */ + { 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1) -32 */ + { 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1 (16)null<1> r4.0<0;1;0> 0x040a8000 */ + { 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov (8)r112<1>:ud r0.0<8;8;1>:ud */ + { 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts (16)null<1> r112<0;1;0>:d 0x82000010 */ +}; + +static uint32_t +batch_used(struct intel_batchbuffer *batch) +{ + return batch->ptr - batch->buffer; +} + +static uint32_t +batch_align(struct intel_batchbuffer *batch, uint32_t align) +{ + uint32_t offset = batch_used(batch); + offset = ALIGN(offset, align); + batch->ptr = batch->buffer + offset; + return offset; +} + +static void * +batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align) +{ + uint32_t offset = batch_align(batch, align); + batch->ptr += size; + return memset(batch->buffer + offset, 0, size); +} + +static uint32_t +batch_offset(struct intel_batchbuffer *batch, void *ptr) +{ + return (uint8_t *)ptr - batch->buffer; +} + +static uint32_t +batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, + uint32_t align) 
+{ + return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size)); +} + +static void +gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end) +{ + int ret; + + ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer); + if (ret == 0) + ret = drm_intel_gem_bo_context_exec(batch->bo, NULL, + batch_end, 0); + igt_assert(ret == 0); +} + +static uint32_t +gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch, + uint32_t iters) +{ + uint32_t *curbe_buffer; + uint32_t offset; + + curbe_buffer = batch_alloc(batch, 64, 64); + offset = batch_offset(batch, curbe_buffer); + *curbe_buffer = iters; + + return offset; +} + +static uint32_t +gen8_spin_surface_state(struct intel_batchbuffer *batch, + struct igt_buf *buf, + uint32_t format, + int is_dst) +{ + struct gen8_surface_state *ss; + uint32_t write_domain, read_domain, offset; + int ret; + + if (is_dst) { + write_domain = read_domain = I915_GEM_DOMAIN_RENDER; + } else { + write_domain = 0; + read_domain = I915_GEM_DOMAIN_SAMPLER; + } + + ss = batch_alloc(batch, sizeof(*ss), 64); + offset = batch_offset(batch, ss); + + ss->ss0.surface_type = GEN8_SURFACE_2D; + ss->ss0.surface_format = format; + ss->ss0.render_cache_read_write = 1; + ss->ss0.vertical_alignment = 1; /* align 4 */ + ss->ss0.horizontal_alignment = 1; /* align 4 */ + + if (buf->tiling == I915_TILING_X) + ss->ss0.tiled_mode = 2; + else if (buf->tiling == I915_TILING_Y) + ss->ss0.tiled_mode = 3; + + ss->ss8.base_addr = buf->bo->offset; + + ret = drm_intel_bo_emit_reloc(batch->bo, + batch_offset(batch, ss) + 8 * 4, + buf->bo, 0, + read_domain, write_domain); + igt_assert(ret == 0); + + ss->ss2.height = igt_buf_height(buf) - 1; + ss->ss2.width = igt_buf_width(buf) - 1; + ss->ss3.pitch = buf->stride - 1; + + ss->ss7.shader_chanel_select_r = 4; + ss->ss7.shader_chanel_select_g = 5; + ss->ss7.shader_chanel_select_b = 6; + ss->ss7.shader_chanel_select_a = 7; + + return offset; +} + +static uint32_t 
+gen8_spin_binding_table(struct intel_batchbuffer *batch, + struct igt_buf *dst) +{ + uint32_t *binding_table, offset; + + binding_table = batch_alloc(batch, 32, 64); + offset = batch_offset(batch, binding_table); + + binding_table[0] = gen8_spin_surface_state(batch, dst, + GEN8_SURFACEFORMAT_R8_UNORM, 1); + + return offset; +} + +static uint32_t +gen8_spin_media_kernel(struct intel_batchbuffer *batch, + const uint32_t kernel[][4], + size_t size) +{ + uint32_t offset; + + offset = batch_copy(batch, kernel, size, 64); + + return offset; +} + +static uint32_t +gen8_spin_interface_descriptor(struct intel_batchbuffer *batch, + struct igt_buf *dst) +{ + struct gen8_interface_descriptor_data *idd; + uint32_t offset; + uint32_t binding_table_offset, kernel_offset; + + binding_table_offset = gen8_spin_binding_table(batch, dst); + kernel_offset = gen8_spin_media_kernel(batch, spin_kernel, + sizeof(spin_kernel)); + + idd = batch_alloc(batch, sizeof(*idd), 64); + offset = batch_offset(batch, idd); + + idd->desc0.kernel_start_pointer = (kernel_offset >> 6); + + idd->desc2.single_program_flow = 1; + idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754; + + idd->desc3.sampler_count = 0; /* 0 samplers used */ + idd->desc3.sampler_state_pointer = 0; + + idd->desc4.binding_table_entry_count = 0; + idd->desc4.binding_table_pointer = (binding_table_offset >> 5); + + idd->desc5.constant_urb_entry_read_offset = 0; + idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */ + + return offset; +} + +static void +gen8_emit_state_base_address(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2)); + + /* general */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + OUT_BATCH(0); + + /* stateless data port */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + + /* surface */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY); + + /* dynamic */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, + 0, BASE_ADDRESS_MODIFY); + + /* 
indirect */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* instruction */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY); + + /* general state buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* dynamic state buffer size */ + OUT_BATCH(1 << 12 | 1); + /* indirect object buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* instruction buffer size; the modify enable bit must be set, otherwise it may result in a GPU hang */ + OUT_BATCH(1 << 12 | 1); +} + +static void +gen9_emit_state_base_address(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2)); + + /* general */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + OUT_BATCH(0); + + /* stateless data port */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + + /* surface */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY); + + /* dynamic */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, + 0, BASE_ADDRESS_MODIFY); + + /* indirect */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* instruction */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY); + + /* general state buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* dynamic state buffer size */ + OUT_BATCH(1 << 12 | 1); + /* indirect object buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* instruction buffer size; the modify enable bit must be set, otherwise it may result in a GPU hang */ + OUT_BATCH(1 << 12 | 1); + + /* Bindless surface state base address */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + OUT_BATCH(0); + OUT_BATCH(0xfffff000); +} + +static void +gen8_emit_vfe_state(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2)); + + /* scratch buffer */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* number of threads & urb entries */ + OUT_BATCH(2 << 8); + + OUT_BATCH(0); + + /* urb entry size & curbe size */ + OUT_BATCH(2 << 16 | + 2); + + /* scoreboard */ + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); +} + +static void +gen8_emit_curbe_load(struct intel_batchbuffer *batch, 
uint32_t curbe_buffer) +{ + OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2)); + OUT_BATCH(0); + /* curbe total data length */ + OUT_BATCH(64); + /* curbe data start address, relative to the dynamic state base address */ + OUT_BATCH(curbe_buffer); +} + +static void +gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, + uint32_t interface_descriptor) +{ + OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2)); + OUT_BATCH(0); + /* interface descriptor data length */ + OUT_BATCH(sizeof(struct gen8_interface_descriptor_data)); + /* interface descriptor address, relative to the dynamic state base address */ + OUT_BATCH(interface_descriptor); +} + +static void +gen8_emit_media_state_flush(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2)); + OUT_BATCH(0); +} + +static void +gen8_emit_media_objects(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); + + /* interface descriptor offset */ + OUT_BATCH(0); + + /* without indirect data */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* scoreboard */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* inline data (xoffset, yoffset) */ + OUT_BATCH(0); + OUT_BATCH(0); + gen8_emit_media_state_flush(batch); +} + +static void +gen8lp_emit_media_objects(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); + + /* interface descriptor offset */ + OUT_BATCH(0); + + /* without indirect data */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* scoreboard */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* inline data (xoffset, yoffset) */ + OUT_BATCH(0); + OUT_BATCH(0); +} + +/* + * This sets up the media pipeline: + * + * +---------------+ <---- 4096 + * | ^ | + * | | | + * | various | + * | state | + * | | | + * |_______|_______| <---- 2048 + ? + * | ^ | + * | | | + * | batch | + * | commands | + * | | | + * | | | + * +---------------+ <---- 0 + ? 
+ * + */ + +#define BATCH_STATE_SPLIT 2048 + +void +gen8_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins) +{ + uint32_t curbe_buffer, interface_descriptor; + uint32_t batch_end; + + intel_batchbuffer_flush_with_context(batch, NULL); + + /* setup states */ + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; + + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); + igt_assert(batch->ptr < &batch->buffer[4095]); + + /* media pipeline */ + batch->ptr = batch->buffer; + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); + gen8_emit_state_base_address(batch); + + gen8_emit_vfe_state(batch); + + gen8_emit_curbe_load(batch, curbe_buffer); + + gen8_emit_interface_descriptor_load(batch, interface_descriptor); + + gen8_emit_media_objects(batch); + + OUT_BATCH(MI_BATCH_BUFFER_END); + + batch_end = batch_align(batch, 8); + igt_assert(batch_end < BATCH_STATE_SPLIT); + + gen8_render_flush(batch, batch_end); + intel_batchbuffer_reset(batch); +} + +void +gen8lp_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins) +{ + uint32_t curbe_buffer, interface_descriptor; + uint32_t batch_end; + + intel_batchbuffer_flush_with_context(batch, NULL); + + /* setup states */ + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; + + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); + igt_assert(batch->ptr < &batch->buffer[4095]); + + /* media pipeline */ + batch->ptr = batch->buffer; + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); + gen8_emit_state_base_address(batch); + + gen8_emit_vfe_state(batch); + + gen8_emit_curbe_load(batch, curbe_buffer); + + gen8_emit_interface_descriptor_load(batch, interface_descriptor); + + gen8lp_emit_media_objects(batch); + + OUT_BATCH(MI_BATCH_BUFFER_END); + + batch_end = batch_align(batch, 8); + igt_assert(batch_end < 
BATCH_STATE_SPLIT); + + gen8_render_flush(batch, batch_end); + intel_batchbuffer_reset(batch); +} + +void +gen9_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins) +{ + uint32_t curbe_buffer, interface_descriptor; + uint32_t batch_end; + + intel_batchbuffer_flush_with_context(batch, NULL); + + /* setup states */ + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; + + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); + igt_assert(batch->ptr < &batch->buffer[4095]); + + /* media pipeline */ + batch->ptr = batch->buffer; + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | + GEN9_FORCE_MEDIA_AWAKE_ENABLE | + GEN9_SAMPLER_DOP_GATE_DISABLE | + GEN9_PIPELINE_SELECTION_MASK | + GEN9_SAMPLER_DOP_GATE_MASK | + GEN9_FORCE_MEDIA_AWAKE_MASK); + gen9_emit_state_base_address(batch); + + gen8_emit_vfe_state(batch); + + gen8_emit_curbe_load(batch, curbe_buffer); + + gen8_emit_interface_descriptor_load(batch, interface_descriptor); + + gen8_emit_media_objects(batch); + + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | + GEN9_FORCE_MEDIA_AWAKE_DISABLE | + GEN9_SAMPLER_DOP_GATE_ENABLE | + GEN9_PIPELINE_SELECTION_MASK | + GEN9_SAMPLER_DOP_GATE_MASK | + GEN9_FORCE_MEDIA_AWAKE_MASK); + + OUT_BATCH(MI_BATCH_BUFFER_END); + + batch_end = batch_align(batch, 8); + igt_assert(batch_end < BATCH_STATE_SPLIT); + + gen8_render_flush(batch, batch_end); + intel_batchbuffer_reset(batch); +} diff --git a/lib/media_spin.h b/lib/media_spin.h new file mode 100644 index 0000000..8bc4829 --- /dev/null +++ b/lib/media_spin.h @@ -0,0 +1,39 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, 
sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + * Jeff McGee <jeff.mcgee@intel.com> + */ + +#ifndef MEDIA_SPIN_H +#define MEDIA_SPIN_H + +void gen8_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +void gen8lp_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +void gen9_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +#endif /* MEDIA_SPIN_H */ -- 2.3.0
* [PATCH i-g-t 1/2 v2] lib: Add media spin 2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee @ 2015-03-12 17:52 ` jeff.mcgee 2015-03-25 2:50 ` He, Shuang 0 siblings, 1 reply; 10+ messages in thread From: jeff.mcgee @ 2015-03-12 17:52 UTC (permalink / raw) To: intel-gfx From: Jeff McGee <jeff.mcgee@intel.com> The media spin utility is derived from media fill. The purpose is to create a simple means to keep the render engine (media pipeline) busy for a controlled amount of time. It does so by emitting a batch with a single execution thread that spins in a tight loop the requested number of times. Each spin increments a counter whose final 32-bit value is written to the destination buffer on completion for checking. The implementation supports Gen8, Gen8lp, and Gen9. v2: Apply the recommendations of igt.cocci. Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> --- lib/Makefile.sources | 2 + lib/intel_batchbuffer.c | 24 +++ lib/intel_batchbuffer.h | 22 ++ lib/media_spin.c | 540 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/media_spin.h | 39 ++++ 5 files changed, 627 insertions(+) create mode 100644 lib/media_spin.c create mode 100644 lib/media_spin.h diff --git a/lib/Makefile.sources b/lib/Makefile.sources index 76f353a..3d93629 100644 --- a/lib/Makefile.sources +++ b/lib/Makefile.sources @@ -29,6 +29,8 @@ libintel_tools_la_SOURCES = \ media_fill_gen8.c \ media_fill_gen8lp.c \ media_fill_gen9.c \ + media_spin.h \ + media_spin.c \ gen7_media.h \ gen8_media.h \ rendercopy_i915.c \ diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c index 666c323..195ccc4 100644 --- a/lib/intel_batchbuffer.c +++ b/lib/intel_batchbuffer.c @@ -40,6 +40,7 @@ #include "rendercopy.h" #include "media_fill.h" #include "ioctl_wrappers.h" +#include "media_spin.h" #include <i915_drm.h> @@ -785,3 +786,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid) return fill; } + +/** + * igt_get_media_spinfunc: + * @devid: pci device id + * + * Returns: + * + * The 
platform-specific media spin function pointer for the device specified + * with @devid. Will return NULL when no media spin function is implemented. + */ +igt_media_spinfunc_t igt_get_media_spinfunc(int devid) +{ + igt_media_spinfunc_t spin = NULL; + + if (IS_GEN9(devid)) + spin = gen9_media_spinfunc; + else if (IS_BROADWELL(devid)) + spin = gen8_media_spinfunc; + else if (IS_CHERRYVIEW(devid)) + spin = gen8lp_media_spinfunc; + + return spin; +} diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h index fa8875b..62c8396 100644 --- a/lib/intel_batchbuffer.h +++ b/lib/intel_batchbuffer.h @@ -300,4 +300,26 @@ typedef void (*igt_fillfunc_t)(struct intel_batchbuffer *batch, igt_fillfunc_t igt_get_media_fillfunc(int devid); igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid); +/** + * igt_media_spinfunc_t: + * @batch: batchbuffer object + * @dst: destination i-g-t buffer object + * @spins: number of loops to execute + * + * This is the type of the per-platform media spin functions. The + * platform-specific implementation can be obtained by calling + * igt_get_media_spinfunc(). + * + * The media spin function emits a batchbuffer for the render engine with + * the media pipeline selected. The workload consists of a single thread + * which spins in a tight loop the requested number of times. Each spin + * increments a counter whose final 32-bit value is written to the + * destination buffer on completion. This utility provides a simple way + * to keep the render engine busy for a set time for various tests. 
+ */ +typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +igt_media_spinfunc_t igt_get_media_spinfunc(int devid); + #endif diff --git a/lib/media_spin.c b/lib/media_spin.c new file mode 100644 index 0000000..580c109 --- /dev/null +++ b/lib/media_spin.c @@ -0,0 +1,540 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ * + * Authors: + * Jeff McGee <jeff.mcgee@intel.com> + */ + +#include <intel_bufmgr.h> +#include <i915_drm.h> +#include "intel_reg.h" +#include "drmtest.h" +#include "intel_batchbuffer.h" +#include "gen8_media.h" +#include "media_spin.h" + +static const uint32_t spin_kernel[][4] = { + { 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov (8)r4.0<1>:ud r0.0<8;8;1>:ud */ + { 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov (2)r4.0<1>:ud r2.0<2;2;1>:ud */ + { 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov (1)r4.8<1>:ud 0x3 */ + { 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov (1)r5.0<1>:ud 0 */ + { 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */ + { 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0 (1)null<1> r1<0;1;0> r5<0;1;0> */ + { 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1) -32 */ + { 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1 (16)null<1> r4.0<0;1;0> 0x040a8000 */ + { 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov (8)r112<1>:ud r0.0<8;8;1>:ud */ + { 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts (16)null<1> r112<0;1;0>:d 0x82000010 */ +}; + +static uint32_t +batch_used(struct intel_batchbuffer *batch) +{ + return batch->ptr - batch->buffer; +} + +static uint32_t +batch_align(struct intel_batchbuffer *batch, uint32_t align) +{ + uint32_t offset = batch_used(batch); + offset = ALIGN(offset, align); + batch->ptr = batch->buffer + offset; + return offset; +} + +static void * +batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align) +{ + uint32_t offset = batch_align(batch, align); + batch->ptr += size; + return memset(batch->buffer + offset, 0, size); +} + +static uint32_t +batch_offset(struct intel_batchbuffer *batch, void *ptr) +{ + return (uint8_t *)ptr - batch->buffer; +} + +static uint32_t +batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, + uint32_t align) 
+{ + return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size)); +} + +static void +gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end) +{ + int ret; + + ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer); + if (ret == 0) + ret = drm_intel_gem_bo_context_exec(batch->bo, NULL, + batch_end, 0); + igt_assert_eq(ret, 0); +} + +static uint32_t +gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch, + uint32_t iters) +{ + uint32_t *curbe_buffer; + uint32_t offset; + + curbe_buffer = batch_alloc(batch, 64, 64); + offset = batch_offset(batch, curbe_buffer); + *curbe_buffer = iters; + + return offset; +} + +static uint32_t +gen8_spin_surface_state(struct intel_batchbuffer *batch, + struct igt_buf *buf, + uint32_t format, + int is_dst) +{ + struct gen8_surface_state *ss; + uint32_t write_domain, read_domain, offset; + int ret; + + if (is_dst) { + write_domain = read_domain = I915_GEM_DOMAIN_RENDER; + } else { + write_domain = 0; + read_domain = I915_GEM_DOMAIN_SAMPLER; + } + + ss = batch_alloc(batch, sizeof(*ss), 64); + offset = batch_offset(batch, ss); + + ss->ss0.surface_type = GEN8_SURFACE_2D; + ss->ss0.surface_format = format; + ss->ss0.render_cache_read_write = 1; + ss->ss0.vertical_alignment = 1; /* align 4 */ + ss->ss0.horizontal_alignment = 1; /* align 4 */ + + if (buf->tiling == I915_TILING_X) + ss->ss0.tiled_mode = 2; + else if (buf->tiling == I915_TILING_Y) + ss->ss0.tiled_mode = 3; + + ss->ss8.base_addr = buf->bo->offset; + + ret = drm_intel_bo_emit_reloc(batch->bo, + batch_offset(batch, ss) + 8 * 4, + buf->bo, 0, + read_domain, write_domain); + igt_assert_eq(ret, 0); + + ss->ss2.height = igt_buf_height(buf) - 1; + ss->ss2.width = igt_buf_width(buf) - 1; + ss->ss3.pitch = buf->stride - 1; + + ss->ss7.shader_chanel_select_r = 4; + ss->ss7.shader_chanel_select_g = 5; + ss->ss7.shader_chanel_select_b = 6; + ss->ss7.shader_chanel_select_a = 7; + + return offset; +} + +static uint32_t 
+gen8_spin_binding_table(struct intel_batchbuffer *batch, + struct igt_buf *dst) +{ + uint32_t *binding_table, offset; + + binding_table = batch_alloc(batch, 32, 64); + offset = batch_offset(batch, binding_table); + + binding_table[0] = gen8_spin_surface_state(batch, dst, + GEN8_SURFACEFORMAT_R8_UNORM, 1); + + return offset; +} + +static uint32_t +gen8_spin_media_kernel(struct intel_batchbuffer *batch, + const uint32_t kernel[][4], + size_t size) +{ + uint32_t offset; + + offset = batch_copy(batch, kernel, size, 64); + + return offset; +} + +static uint32_t +gen8_spin_interface_descriptor(struct intel_batchbuffer *batch, + struct igt_buf *dst) +{ + struct gen8_interface_descriptor_data *idd; + uint32_t offset; + uint32_t binding_table_offset, kernel_offset; + + binding_table_offset = gen8_spin_binding_table(batch, dst); + kernel_offset = gen8_spin_media_kernel(batch, spin_kernel, + sizeof(spin_kernel)); + + idd = batch_alloc(batch, sizeof(*idd), 64); + offset = batch_offset(batch, idd); + + idd->desc0.kernel_start_pointer = (kernel_offset >> 6); + + idd->desc2.single_program_flow = 1; + idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754; + + idd->desc3.sampler_count = 0; /* 0 samplers used */ + idd->desc3.sampler_state_pointer = 0; + + idd->desc4.binding_table_entry_count = 0; + idd->desc4.binding_table_pointer = (binding_table_offset >> 5); + + idd->desc5.constant_urb_entry_read_offset = 0; + idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */ + + return offset; +} + +static void +gen8_emit_state_base_address(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2)); + + /* general */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + OUT_BATCH(0); + + /* stateless data port */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + + /* surface */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY); + + /* dynamic */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, + 0, BASE_ADDRESS_MODIFY); + + /* 
indirect */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* instruction */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY); + + /* general state buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* dynamic state buffer size */ + OUT_BATCH(1 << 12 | 1); + /* indirect object buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* instruction buffer size; the modify enable bit must be set, otherwise the GPU may hang */ + OUT_BATCH(1 << 12 | 1); +} + +static void +gen9_emit_state_base_address(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2)); + + /* general */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + OUT_BATCH(0); + + /* stateless data port */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + + /* surface */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY); + + /* dynamic */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, + 0, BASE_ADDRESS_MODIFY); + + /* indirect */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* instruction */ + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY); + + /* general state buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* dynamic state buffer size */ + OUT_BATCH(1 << 12 | 1); + /* indirect object buffer size */ + OUT_BATCH(0xfffff000 | 1); + /* instruction buffer size; the modify enable bit must be set, otherwise the GPU may hang */ + OUT_BATCH(1 << 12 | 1); + + /* Bindless surface state base address */ + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); + OUT_BATCH(0); + OUT_BATCH(0xfffff000); +} + +static void +gen8_emit_vfe_state(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2)); + + /* scratch buffer */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* number of threads & urb entries */ + OUT_BATCH(2 << 8); + + OUT_BATCH(0); + + /* urb entry size & curbe size */ + OUT_BATCH(2 << 16 | + 2); + + /* scoreboard */ + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); +} + +static void +gen8_emit_curbe_load(struct intel_batchbuffer *batch, 
uint32_t curbe_buffer) +{ + OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2)); + OUT_BATCH(0); + /* curbe total data length */ + OUT_BATCH(64); + /* curbe data start address, relative to the dynamic state base address */ + OUT_BATCH(curbe_buffer); +} + +static void +gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, + uint32_t interface_descriptor) +{ + OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2)); + OUT_BATCH(0); + /* interface descriptor data length */ + OUT_BATCH(sizeof(struct gen8_interface_descriptor_data)); + /* interface descriptor address, relative to the dynamic state base address */ + OUT_BATCH(interface_descriptor); +} + +static void +gen8_emit_media_state_flush(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2)); + OUT_BATCH(0); +} + +static void +gen8_emit_media_objects(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); + + /* interface descriptor offset */ + OUT_BATCH(0); + + /* without indirect data */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* scoreboard */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* inline data (xoffset, yoffset) */ + OUT_BATCH(0); + OUT_BATCH(0); + gen8_emit_media_state_flush(batch); +} + +static void +gen8lp_emit_media_objects(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); + + /* interface descriptor offset */ + OUT_BATCH(0); + + /* without indirect data */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* scoreboard */ + OUT_BATCH(0); + OUT_BATCH(0); + + /* inline data (xoffset, yoffset) */ + OUT_BATCH(0); + OUT_BATCH(0); +} + +/* + * This sets up the media pipeline: + * + * +---------------+ <---- 4096 + * | ^ | + * | | | + * | various | + * | state | + * | | | + * |_______|_______| <---- 2048 + ? + * | ^ | + * | | | + * | batch | + * | commands | + * | | | + * | | | + * +---------------+ <---- 0 + ? 
+ * + */ + +#define BATCH_STATE_SPLIT 2048 + +void +gen8_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins) +{ + uint32_t curbe_buffer, interface_descriptor; + uint32_t batch_end; + + intel_batchbuffer_flush_with_context(batch, NULL); + + /* setup states */ + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; + + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); + igt_assert(batch->ptr < &batch->buffer[4095]); + + /* media pipeline */ + batch->ptr = batch->buffer; + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); + gen8_emit_state_base_address(batch); + + gen8_emit_vfe_state(batch); + + gen8_emit_curbe_load(batch, curbe_buffer); + + gen8_emit_interface_descriptor_load(batch, interface_descriptor); + + gen8_emit_media_objects(batch); + + OUT_BATCH(MI_BATCH_BUFFER_END); + + batch_end = batch_align(batch, 8); + igt_assert(batch_end < BATCH_STATE_SPLIT); + + gen8_render_flush(batch, batch_end); + intel_batchbuffer_reset(batch); +} + +void +gen8lp_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins) +{ + uint32_t curbe_buffer, interface_descriptor; + uint32_t batch_end; + + intel_batchbuffer_flush_with_context(batch, NULL); + + /* setup states */ + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; + + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); + igt_assert(batch->ptr < &batch->buffer[4095]); + + /* media pipeline */ + batch->ptr = batch->buffer; + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); + gen8_emit_state_base_address(batch); + + gen8_emit_vfe_state(batch); + + gen8_emit_curbe_load(batch, curbe_buffer); + + gen8_emit_interface_descriptor_load(batch, interface_descriptor); + + gen8lp_emit_media_objects(batch); + + OUT_BATCH(MI_BATCH_BUFFER_END); + + batch_end = batch_align(batch, 8); + igt_assert(batch_end < 
BATCH_STATE_SPLIT); + + gen8_render_flush(batch, batch_end); + intel_batchbuffer_reset(batch); +} + +void +gen9_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins) +{ + uint32_t curbe_buffer, interface_descriptor; + uint32_t batch_end; + + intel_batchbuffer_flush_with_context(batch, NULL); + + /* setup states */ + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; + + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); + igt_assert(batch->ptr < &batch->buffer[4095]); + + /* media pipeline */ + batch->ptr = batch->buffer; + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | + GEN9_FORCE_MEDIA_AWAKE_ENABLE | + GEN9_SAMPLER_DOP_GATE_DISABLE | + GEN9_PIPELINE_SELECTION_MASK | + GEN9_SAMPLER_DOP_GATE_MASK | + GEN9_FORCE_MEDIA_AWAKE_MASK); + gen9_emit_state_base_address(batch); + + gen8_emit_vfe_state(batch); + + gen8_emit_curbe_load(batch, curbe_buffer); + + gen8_emit_interface_descriptor_load(batch, interface_descriptor); + + gen8_emit_media_objects(batch); + + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | + GEN9_FORCE_MEDIA_AWAKE_DISABLE | + GEN9_SAMPLER_DOP_GATE_ENABLE | + GEN9_PIPELINE_SELECTION_MASK | + GEN9_SAMPLER_DOP_GATE_MASK | + GEN9_FORCE_MEDIA_AWAKE_MASK); + + OUT_BATCH(MI_BATCH_BUFFER_END); + + batch_end = batch_align(batch, 8); + igt_assert(batch_end < BATCH_STATE_SPLIT); + + gen8_render_flush(batch, batch_end); + intel_batchbuffer_reset(batch); +} diff --git a/lib/media_spin.h b/lib/media_spin.h new file mode 100644 index 0000000..8bc4829 --- /dev/null +++ b/lib/media_spin.h @@ -0,0 +1,39 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, 
sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + * Jeff McGee <jeff.mcgee@intel.com> + */ + +#ifndef MEDIA_SPIN_H +#define MEDIA_SPIN_H + +void gen8_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +void gen8lp_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +void gen9_media_spinfunc(struct intel_batchbuffer *batch, + struct igt_buf *dst, uint32_t spins); + +#endif /* MEDIA_SPIN_H */ -- 2.3.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 10+ messages in thread
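[Editorial note: the spin kernel's observable contract described in the commit message — loop the requested number of times, then write the final 32-bit counter to the destination buffer — can be modelled on the CPU side. The sketch below is illustrative only; the helper names are not part of IGT, and the assumption that the counter lands in dword 0 of the destination is ours, not stated in the patch.]

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of the spin kernel's observable effect: increment a
 * counter once per loop iteration, then store the final 32-bit value
 * to dword 0 of the destination buffer (assumed location). */
static void spin_model(uint32_t *dst, uint32_t spins)
{
	uint32_t counter = 0;

	for (uint32_t i = 0; i < spins; i++)
		counter++;

	dst[0] = counter;
}

/* The check a caller could perform on the destination buffer after the
 * batch completes: the written counter must equal the requested spins. */
static int spin_result_ok(const uint32_t *dst, uint32_t spins)
{
	return dst[0] == spins;
}
```

A caller would map the destination buffer after waiting for the batch and apply `spin_result_ok()`; the loop length, not the counter check, is what keeps the render engine busy for a controlled time.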
* Re: [PATCH i-g-t 1/2 v2] lib: Add media spin 2015-03-12 17:52 ` [PATCH i-g-t 1/2 v2] " jeff.mcgee @ 2015-03-25 2:50 ` He, Shuang 2015-03-25 18:07 ` Thomas Wood 0 siblings, 1 reply; 10+ messages in thread From: He, Shuang @ 2015-03-25 2:50 UTC (permalink / raw) To: Mcgee, Jeff, intel-gfx@lists.freedesktop.org; +Cc: Liu, Lei A (He Shuang on behalf of Liu Lei) Tested-by: Lei,Liu lei.a.liu@intel.com I-G-T test result: ./pm_sseu IGT-Version: 1.9-g07be8fe (x86_64) (Linux: 4.0.0-rc3_drm-intel-nightly_c09a3b_20150310+ x86_64) Subtest full-enable: SUCCESS (0.010s) Manual test result: SSEU Device Info Available Slice Total: 1 Available Subslice Total: 3 Available Subslice Per Slice: 3 Available EU Total: 23 Available EU Per Subslice: 8 Has Slice Power Gating: no Has Subslice Power Gating: no Has EU Power Gating: yes SSEU Device Status Enabled Slice Total: 1 Enabled Subslice Total: 3 Enabled Subslice Per Slice: 3 Enabled EU Total: 24 Enabled EU Per Subslice: 8 EUs are enabled in pairs. Because one EU in a pair can be fused off, it is possible to see a case where the reported enabled EU count is greater than the reported available EU count. The IGT test allows for this discrepancy and only fails if enabled is less than available, which can only happen if unwanted power gating is applied. Best wishes Liu,Lei > -----Original Message----- > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf > Of jeff.mcgee@intel.com > Sent: Friday, March 13, 2015 1:52 AM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH i-g-t 1/2 v2] lib: Add media spin > > From: Jeff McGee <jeff.mcgee@intel.com> > > The media spin utility is derived from media fill. The purpose > is to create a simple means to keep the render engine (media > pipeline) busy for a controlled amount of time. It does so by > emitting a batch with a single execution thread that spins in > a tight loop the requested number of times. 
Each spin increments > a counter whose final 32-bit value is written to the destination > buffer on completion for checking. The implementation supports > Gen8, Gen8lp, and Gen9. > > v2: Apply the recommendations of igt.cocci. > > Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> > --- > lib/Makefile.sources | 2 + > lib/intel_batchbuffer.c | 24 +++ > lib/intel_batchbuffer.h | 22 ++ > lib/media_spin.c | 540 > ++++++++++++++++++++++++++++++++++++++++++++++++ > lib/media_spin.h | 39 ++++ > 5 files changed, 627 insertions(+) > create mode 100644 lib/media_spin.c > create mode 100644 lib/media_spin.h > > diff --git a/lib/Makefile.sources b/lib/Makefile.sources > index 76f353a..3d93629 100644 > --- a/lib/Makefile.sources > +++ b/lib/Makefile.sources > @@ -29,6 +29,8 @@ libintel_tools_la_SOURCES = \ > media_fill_gen8.c \ > media_fill_gen8lp.c \ > media_fill_gen9.c \ > + media_spin.h \ > + media_spin.c \ > gen7_media.h \ > gen8_media.h \ > rendercopy_i915.c \ > diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c > index 666c323..195ccc4 100644 > --- a/lib/intel_batchbuffer.c > +++ b/lib/intel_batchbuffer.c > @@ -40,6 +40,7 @@ > #include "rendercopy.h" > #include "media_fill.h" > #include "ioctl_wrappers.h" > +#include "media_spin.h" > > #include <i915_drm.h> > > @@ -785,3 +786,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid) > > return fill; > } > + > +/** > + * igt_get_media_spinfunc: > + * @devid: pci device id > + * > + * Returns: > + * > + * The platform-specific media spin function pointer for the device specified > + * with @devid. Will return NULL when no media spin function is > implemented. 
> + */ > +igt_media_spinfunc_t igt_get_media_spinfunc(int devid) > +{ > + igt_media_spinfunc_t spin = NULL; > + > + if (IS_GEN9(devid)) > + spin = gen9_media_spinfunc; > + else if (IS_BROADWELL(devid)) > + spin = gen8_media_spinfunc; > + else if (IS_CHERRYVIEW(devid)) > + spin = gen8lp_media_spinfunc; > + > + return spin; > +} > diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h > index fa8875b..62c8396 100644 > --- a/lib/intel_batchbuffer.h > +++ b/lib/intel_batchbuffer.h > @@ -300,4 +300,26 @@ typedef void (*igt_fillfunc_t)(struct > intel_batchbuffer *batch, > igt_fillfunc_t igt_get_media_fillfunc(int devid); > igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid); > > +/** > + * igt_media_spinfunc_t: > + * @batch: batchbuffer object > + * @dst: destination i-g-t buffer object > + * @spins: number of loops to execute > + * > + * This is the type of the per-platform media spin functions. The > + * platform-specific implementation can be obtained by calling > + * igt_get_media_spinfunc(). > + * > + * The media spin function emits a batchbuffer for the render engine with > + * the media pipeline selected. The workload consists of a single thread > + * which spins in a tight loop the requested number of times. Each spin > + * increments a counter whose final 32-bit value is written to the > + * destination buffer on completion. This utility provides a simple way > + * to keep the render engine busy for a set time for various tests. 
> + */ > +typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch, > + struct igt_buf *dst, uint32_t spins); > + > +igt_media_spinfunc_t igt_get_media_spinfunc(int devid); > + > #endif > diff --git a/lib/media_spin.c b/lib/media_spin.c > new file mode 100644 > index 0000000..580c109 > --- /dev/null > +++ b/lib/media_spin.c > @@ -0,0 +1,540 @@ > +/* > + * Copyright © 2015 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO > EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, > DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > OTHER DEALINGS > + * IN THE SOFTWARE. 
> + * > + * Authors: > + * Jeff McGee <jeff.mcgee@intel.com> > + */ > + > +#include <intel_bufmgr.h> > +#include <i915_drm.h> > +#include "intel_reg.h" > +#include "drmtest.h" > +#include "intel_batchbuffer.h" > +#include "gen8_media.h" > +#include "media_spin.h" > + > +static const uint32_t spin_kernel[][4] = { > + { 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov > (8)r4.0<1>:ud r0.0<8;8;1>:ud */ > + { 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov > (2)r4.0<1>.ud r2.0<2;2;1>:ud */ > + { 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov > (1)r4.8<1>:ud 0x3 */ > + { 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov > (1)r5.0<1>:ud 0 */ > + { 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add > (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */ > + { 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0 > (1)null<1> r1<0;1;0> r5<0;1;0> */ > + { 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1) > -32 */ > + { 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1 > (16)null<1> r4.0<0;1;0> 0x040a8000 */ > + { 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov > (8)r112<1>:ud r0.0<8;8;1>:ud */ > + { 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts > (16)null<1> r112<0;1;0>:d 0x82000010 */ > +}; > + > +static uint32_t > +batch_used(struct intel_batchbuffer *batch) > +{ > + return batch->ptr - batch->buffer; > +} > + > +static uint32_t > +batch_align(struct intel_batchbuffer *batch, uint32_t align) > +{ > + uint32_t offset = batch_used(batch); > + offset = ALIGN(offset, align); > + batch->ptr = batch->buffer + offset; > + return offset; > +} > + > +static void * > +batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align) > +{ > + uint32_t offset = batch_align(batch, align); > + batch->ptr += size; > + return memset(batch->buffer + offset, 0, size); > +} > + > +static uint32_t > +batch_offset(struct intel_batchbuffer *batch, void *ptr) > +{ > + return (uint8_t *)ptr - 
batch->buffer; > +} > + > +static uint32_t > +batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, > + uint32_t align) > +{ > + return batch_offset(batch, memcpy(batch_alloc(batch, size, align), > ptr, size)); > +} > + > +static void > +gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end) > +{ > + int ret; > + > + ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer); > + if (ret == 0) > + ret = drm_intel_gem_bo_context_exec(batch->bo, NULL, > + batch_end, 0); > + igt_assert_eq(ret, 0); > +} > + > +static uint32_t > +gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch, > + uint32_t iters) > +{ > + uint32_t *curbe_buffer; > + uint32_t offset; > + > + curbe_buffer = batch_alloc(batch, 64, 64); > + offset = batch_offset(batch, curbe_buffer); > + *curbe_buffer = iters; > + > + return offset; > +} > + > +static uint32_t > +gen8_spin_surface_state(struct intel_batchbuffer *batch, > + struct igt_buf *buf, > + uint32_t format, > + int is_dst) > +{ > + struct gen8_surface_state *ss; > + uint32_t write_domain, read_domain, offset; > + int ret; > + > + if (is_dst) { > + write_domain = read_domain = > I915_GEM_DOMAIN_RENDER; > + } else { > + write_domain = 0; > + read_domain = I915_GEM_DOMAIN_SAMPLER; > + } > + > + ss = batch_alloc(batch, sizeof(*ss), 64); > + offset = batch_offset(batch, ss); > + > + ss->ss0.surface_type = GEN8_SURFACE_2D; > + ss->ss0.surface_format = format; > + ss->ss0.render_cache_read_write = 1; > + ss->ss0.vertical_alignment = 1; /* align 4 */ > + ss->ss0.horizontal_alignment = 1; /* align 4 */ > + > + if (buf->tiling == I915_TILING_X) > + ss->ss0.tiled_mode = 2; > + else if (buf->tiling == I915_TILING_Y) > + ss->ss0.tiled_mode = 3; > + > + ss->ss8.base_addr = buf->bo->offset; > + > + ret = drm_intel_bo_emit_reloc(batch->bo, > + batch_offset(batch, ss) + 8 * 4, > + buf->bo, 0, > + read_domain, write_domain); > + igt_assert_eq(ret, 0); > + > + ss->ss2.height = igt_buf_height(buf) - 1; > + 
ss->ss2.width = igt_buf_width(buf) - 1; > + ss->ss3.pitch = buf->stride - 1; > + > + ss->ss7.shader_chanel_select_r = 4; > + ss->ss7.shader_chanel_select_g = 5; > + ss->ss7.shader_chanel_select_b = 6; > + ss->ss7.shader_chanel_select_a = 7; > + > + return offset; > +} > + > +static uint32_t > +gen8_spin_binding_table(struct intel_batchbuffer *batch, > + struct igt_buf *dst) > +{ > + uint32_t *binding_table, offset; > + > + binding_table = batch_alloc(batch, 32, 64); > + offset = batch_offset(batch, binding_table); > + > + binding_table[0] = gen8_spin_surface_state(batch, dst, > + > GEN8_SURFACEFORMAT_R8_UNORM, 1); > + > + return offset; > +} > + > +static uint32_t > +gen8_spin_media_kernel(struct intel_batchbuffer *batch, > + const uint32_t kernel[][4], > + size_t size) > +{ > + uint32_t offset; > + > + offset = batch_copy(batch, kernel, size, 64); > + > + return offset; > +} > + > +static uint32_t > +gen8_spin_interface_descriptor(struct intel_batchbuffer *batch, > + struct igt_buf *dst) > +{ > + struct gen8_interface_descriptor_data *idd; > + uint32_t offset; > + uint32_t binding_table_offset, kernel_offset; > + > + binding_table_offset = gen8_spin_binding_table(batch, dst); > + kernel_offset = gen8_spin_media_kernel(batch, spin_kernel, > + sizeof(spin_kernel)); > + > + idd = batch_alloc(batch, sizeof(*idd), 64); > + offset = batch_offset(batch, idd); > + > + idd->desc0.kernel_start_pointer = (kernel_offset >> 6); > + > + idd->desc2.single_program_flow = 1; > + idd->desc2.floating_point_mode = > GEN8_FLOATING_POINT_IEEE_754; > + > + idd->desc3.sampler_count = 0; /* 0 samplers used */ > + idd->desc3.sampler_state_pointer = 0; > + > + idd->desc4.binding_table_entry_count = 0; > + idd->desc4.binding_table_pointer = (binding_table_offset >> 5); > + > + idd->desc5.constant_urb_entry_read_offset = 0; > + idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */ > + > + return offset; > +} > + > +static void > +gen8_emit_state_base_address(struct intel_batchbuffer 
*batch) > +{ > + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2)); > + > + /* general */ > + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + > + /* stateless data port */ > + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); > + > + /* surface */ > + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, > BASE_ADDRESS_MODIFY); > + > + /* dynamic */ > + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | > I915_GEM_DOMAIN_INSTRUCTION, > + 0, BASE_ADDRESS_MODIFY); > + > + /* indirect */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* instruction */ > + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, > BASE_ADDRESS_MODIFY); > + > + /* general state buffer size */ > + OUT_BATCH(0xfffff000 | 1); > + /* dynamic state buffer size */ > + OUT_BATCH(1 << 12 | 1); > + /* indirect object buffer size */ > + OUT_BATCH(0xfffff000 | 1); > + /* intruction buffer size, must set modify enable bit, otherwise it > may result in GPU hang */ > + OUT_BATCH(1 << 12 | 1); > +} > + > +static void > +gen9_emit_state_base_address(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2)); > + > + /* general */ > + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + > + /* stateless data port */ > + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); > + > + /* surface */ > + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, > BASE_ADDRESS_MODIFY); > + > + /* dynamic */ > + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | > I915_GEM_DOMAIN_INSTRUCTION, > + 0, BASE_ADDRESS_MODIFY); > + > + /* indirect */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* instruction */ > + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, > BASE_ADDRESS_MODIFY); > + > + /* general state buffer size */ > + OUT_BATCH(0xfffff000 | 1); > + /* dynamic state buffer size */ > + OUT_BATCH(1 << 12 | 1); > + /* indirect object buffer size */ > + OUT_BATCH(0xfffff000 | 1); > + /* intruction buffer size, must set modify enable bit, otherwise it > may result in GPU hang */ > + OUT_BATCH(1 << 12 | 1); > + > + /* Bindless 
surface state base address */ > + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + OUT_BATCH(0xfffff000); > +} > + > +static void > +gen8_emit_vfe_state(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2)); > + > + /* scratch buffer */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* number of threads & urb entries */ > + OUT_BATCH(2 << 8); > + > + OUT_BATCH(0); > + > + /* urb entry size & curbe size */ > + OUT_BATCH(2 << 16 | > + 2); > + > + /* scoreboard */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > +} > + > +static void > +gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t > curbe_buffer) > +{ > + OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2)); > + OUT_BATCH(0); > + /* curbe total data length */ > + OUT_BATCH(64); > + /* curbe data start address, is relative to the dynamics base address > */ > + OUT_BATCH(curbe_buffer); > +} > + > +static void > +gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, > + uint32_t interface_descriptor) > +{ > + OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2)); > + OUT_BATCH(0); > + /* interface descriptor data length */ > + OUT_BATCH(sizeof(struct gen8_interface_descriptor_data)); > + /* interface descriptor address, is relative to the dynamics base > address */ > + OUT_BATCH(interface_descriptor); > +} > + > +static void > +gen8_emit_media_state_flush(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2)); > + OUT_BATCH(0); > +} > + > +static void > +gen8_emit_media_objects(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); > + > + /* interface descriptor offset */ > + OUT_BATCH(0); > + > + /* without indirect data */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* scoreboard */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* inline data (xoffset, yoffset) */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + gen8_emit_media_state_flush(batch); > +} > + > +static void > 
+gen8lp_emit_media_objects(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); > + > + /* interface descriptor offset */ > + OUT_BATCH(0); > + > + /* without indirect data */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* scoreboard */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* inline data (xoffset, yoffset) */ > + OUT_BATCH(0); > + OUT_BATCH(0); > +} > + > +/* > + * This sets up the media pipeline, > + * > + * +---------------+ <---- 4096 > + * | ^ | > + * | | | > + * | various | > + * | state | > + * | | | > + * |_______|_______| <---- 2048 + ? > + * | ^ | > + * | | | > + * | batch | > + * | commands | > + * | | | > + * | | | > + * +---------------+ <---- 0 + ? > + * > + */ > + > +#define BATCH_STATE_SPLIT 2048 > + > +void > +gen8_media_spinfunc(struct intel_batchbuffer *batch, > + struct igt_buf *dst, uint32_t spins) > +{ > + uint32_t curbe_buffer, interface_descriptor; > + uint32_t batch_end; > + > + intel_batchbuffer_flush_with_context(batch, NULL); > + > + /* setup states */ > + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; > + > + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); > + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); > + igt_assert(batch->ptr < &batch->buffer[4095]); > + > + /* media pipeline */ > + batch->ptr = batch->buffer; > + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); > + gen8_emit_state_base_address(batch); > + > + gen8_emit_vfe_state(batch); > + > + gen8_emit_curbe_load(batch, curbe_buffer); > + > + gen8_emit_interface_descriptor_load(batch, interface_descriptor); > + > + gen8_emit_media_objects(batch); > + > + OUT_BATCH(MI_BATCH_BUFFER_END); > + > + batch_end = batch_align(batch, 8); > + igt_assert(batch_end < BATCH_STATE_SPLIT); > + > + gen8_render_flush(batch, batch_end); > + intel_batchbuffer_reset(batch); > +} > + > +void > +gen8lp_media_spinfunc(struct intel_batchbuffer *batch, > + struct igt_buf *dst, uint32_t spins) > +{ > + uint32_t curbe_buffer, 
interface_descriptor; > + uint32_t batch_end; > + > + intel_batchbuffer_flush_with_context(batch, NULL); > + > + /* setup states */ > + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; > + > + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); > + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); > + igt_assert(batch->ptr < &batch->buffer[4095]); > + > + /* media pipeline */ > + batch->ptr = batch->buffer; > + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); > + gen8_emit_state_base_address(batch); > + > + gen8_emit_vfe_state(batch); > + > + gen8_emit_curbe_load(batch, curbe_buffer); > + > + gen8_emit_interface_descriptor_load(batch, interface_descriptor); > + > + gen8lp_emit_media_objects(batch); > + > + OUT_BATCH(MI_BATCH_BUFFER_END); > + > + batch_end = batch_align(batch, 8); > + igt_assert(batch_end < BATCH_STATE_SPLIT); > + > + gen8_render_flush(batch, batch_end); > + intel_batchbuffer_reset(batch); > +} > + > +void > +gen9_media_spinfunc(struct intel_batchbuffer *batch, > + struct igt_buf *dst, uint32_t spins) > +{ > + uint32_t curbe_buffer, interface_descriptor; > + uint32_t batch_end; > + > + intel_batchbuffer_flush_with_context(batch, NULL); > + > + /* setup states */ > + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; > + > + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); > + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); > + igt_assert(batch->ptr < &batch->buffer[4095]); > + > + /* media pipeline */ > + batch->ptr = batch->buffer; > + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | > + GEN9_FORCE_MEDIA_AWAKE_ENABLE | > + GEN9_SAMPLER_DOP_GATE_DISABLE | > + GEN9_PIPELINE_SELECTION_MASK | > + GEN9_SAMPLER_DOP_GATE_MASK | > + GEN9_FORCE_MEDIA_AWAKE_MASK); > + gen9_emit_state_base_address(batch); > + > + gen8_emit_vfe_state(batch); > + > + gen8_emit_curbe_load(batch, curbe_buffer); > + > + gen8_emit_interface_descriptor_load(batch, interface_descriptor); > + > + 
gen8_emit_media_objects(batch); > + > + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | > + GEN9_FORCE_MEDIA_AWAKE_DISABLE | > + GEN9_SAMPLER_DOP_GATE_ENABLE | > + GEN9_PIPELINE_SELECTION_MASK | > + GEN9_SAMPLER_DOP_GATE_MASK | > + GEN9_FORCE_MEDIA_AWAKE_MASK); > + > + OUT_BATCH(MI_BATCH_BUFFER_END); > + > + batch_end = batch_align(batch, 8); > + igt_assert(batch_end < BATCH_STATE_SPLIT); > + > + gen8_render_flush(batch, batch_end); > + intel_batchbuffer_reset(batch); > +} > diff --git a/lib/media_spin.h b/lib/media_spin.h > new file mode 100644 > index 0000000..8bc4829 > --- /dev/null > +++ b/lib/media_spin.h > @@ -0,0 +1,39 @@ > +/* > + * Copyright © 2015 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO > EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, > DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > OTHER DEALINGS > + * IN THE SOFTWARE. 
> + * > + * Authors: > + * Jeff McGee <jeff.mcgee@intel.com> > + */ > + > +#ifndef MEDIA_SPIN_H > +#define MEDIA_SPIN_H > + > +void gen8_media_spinfunc(struct intel_batchbuffer *batch, > + struct igt_buf *dst, uint32_t spins); > + > +void gen8lp_media_spinfunc(struct intel_batchbuffer *batch, > + struct igt_buf *dst, uint32_t spins); > + > +void gen9_media_spinfunc(struct intel_batchbuffer *batch, > + struct igt_buf *dst, uint32_t spins); > + > +#endif /* MEDIA_SPIN_H */ > -- > 2.3.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
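[Editorial note: the pass/fail rule described in the tested-by report above reduces to a one-line comparison. The helper below is an illustrative sketch, not the actual pm_sseu code: because EUs are enabled in pairs and one EU of a pair can be fused off, the enabled count may legitimately exceed the available count, so only enabled < available indicates unwanted power gating.]

```c
#include <assert.h>

/* Illustrative helper (name is ours, not pm_sseu's): EU-pair fusing
 * means 'enabled' may exceed 'available'; fail only when power gating
 * has left the enabled count below the available count. */
static int sseu_eu_status_ok(int eu_enabled, int eu_available)
{
	return eu_enabled >= eu_available;
}
```

On the reported SKL part this accepts 24 enabled vs 23 available, exactly the discrepancy the report explains.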
* Re: [PATCH i-g-t 1/2 v2] lib: Add media spin 2015-03-25 2:50 ` He, Shuang @ 2015-03-25 18:07 ` Thomas Wood 0 siblings, 0 replies; 10+ messages in thread From: Thomas Wood @ 2015-03-25 18:07 UTC (permalink / raw) To: He, Shuang; +Cc: intel-gfx@lists.freedesktop.org, Liu, Lei A On 25 March 2015 at 02:50, He, Shuang <shuang.he@intel.com> wrote: > (He Shuang on behalf of Liu Lei) > Tested-by: Lei,Liu lei.a.liu@intel.com Thanks, both patches in this series are now merged. > > I-G-T test result: > ./pm_sseu > IGT-Version: 1.9-g07be8fe (x86_64) (Linux: 4.0.0-rc3_drm-intel-nightly_c09a3b_20150310+ x86_64) > Subtest full-enable: SUCCESS (0.010s) > > Manually test result: > SSEU Device Info > Available Slice Total: 1 > Available Subslice Total: 3 > Available Subslice Per Slice: 3 > Available EU Total: 23 > Available EU Per Subslice: 8 > Has Slice Power Gating: no > Has Subslice Power Gating: no > Has EU Power Gating: yes > SSEU Device Status > Enabled Slice Total: 1 > Enabled Subslice Total: 3 > Enabled Subslice Per Slice: 3 > Enabled EU Total: 24 > Enabled EU Per Subslice: 8 > > EU are enabled in pairs. Because one EU in a pair can be fused-off, it is possible to see such case where reported EU enabled is greater than reported EU available. The IGT test allows for this discrepancy and only fails if enabled is less than available, which can only happen if unwanted power gating is applied > > Best wishes > Liu,Lei > >> -----Original Message----- >> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf >> Of jeff.mcgee@intel.com >> Sent: Friday, March 13, 2015 1:52 AM >> To: intel-gfx@lists.freedesktop.org >> Subject: [Intel-gfx] [PATCH i-g-t 1/2 v2] lib: Add media spin >> >> From: Jeff McGee <jeff.mcgee@intel.com> >> >> The media spin utility is derived from media fill. The purpose >> is to create a simple means to keep the render engine (media >> pipeline) busy for a controlled amount of time. 
It does so by >> emitting a batch with a single execution thread that spins in >> a tight loop the requested number of times. Each spin increments >> a counter whose final 32-bit value is written to the destination >> buffer on completion for checking. The implementation supports >> Gen8, Gen8lp, and Gen9. >> >> v2: Apply the recommendations of igt.cocci. >> >> Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> >> --- >> lib/Makefile.sources | 2 + >> lib/intel_batchbuffer.c | 24 +++ >> lib/intel_batchbuffer.h | 22 ++ >> lib/media_spin.c | 540 >> ++++++++++++++++++++++++++++++++++++++++++++++++ >> lib/media_spin.h | 39 ++++ >> 5 files changed, 627 insertions(+) >> create mode 100644 lib/media_spin.c >> create mode 100644 lib/media_spin.h >> >> diff --git a/lib/Makefile.sources b/lib/Makefile.sources >> index 76f353a..3d93629 100644 >> --- a/lib/Makefile.sources >> +++ b/lib/Makefile.sources >> @@ -29,6 +29,8 @@ libintel_tools_la_SOURCES = \ >> media_fill_gen8.c \ >> media_fill_gen8lp.c \ >> media_fill_gen9.c \ >> + media_spin.h \ >> + media_spin.c \ >> gen7_media.h \ >> gen8_media.h \ >> rendercopy_i915.c \ >> diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c >> index 666c323..195ccc4 100644 >> --- a/lib/intel_batchbuffer.c >> +++ b/lib/intel_batchbuffer.c >> @@ -40,6 +40,7 @@ >> #include "rendercopy.h" >> #include "media_fill.h" >> #include "ioctl_wrappers.h" >> +#include "media_spin.h" >> >> #include <i915_drm.h> >> >> @@ -785,3 +786,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid) >> >> return fill; >> } >> + >> +/** >> + * igt_get_media_spinfunc: >> + * @devid: pci device id >> + * >> + * Returns: >> + * >> + * The platform-specific media spin function pointer for the device specified >> + * with @devid. Will return NULL when no media spin function is >> implemented. 
>> + */ >> +igt_media_spinfunc_t igt_get_media_spinfunc(int devid) >> +{ >> + igt_media_spinfunc_t spin = NULL; >> + >> + if (IS_GEN9(devid)) >> + spin = gen9_media_spinfunc; >> + else if (IS_BROADWELL(devid)) >> + spin = gen8_media_spinfunc; >> + else if (IS_CHERRYVIEW(devid)) >> + spin = gen8lp_media_spinfunc; >> + >> + return spin; >> +} >> diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h >> index fa8875b..62c8396 100644 >> --- a/lib/intel_batchbuffer.h >> +++ b/lib/intel_batchbuffer.h >> @@ -300,4 +300,26 @@ typedef void (*igt_fillfunc_t)(struct >> intel_batchbuffer *batch, >> igt_fillfunc_t igt_get_media_fillfunc(int devid); >> igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid); >> >> +/** >> + * igt_media_spinfunc_t: >> + * @batch: batchbuffer object >> + * @dst: destination i-g-t buffer object >> + * @spins: number of loops to execute >> + * >> + * This is the type of the per-platform media spin functions. The >> + * platform-specific implementation can be obtained by calling >> + * igt_get_media_spinfunc(). >> + * >> + * The media spin function emits a batchbuffer for the render engine with >> + * the media pipeline selected. The workload consists of a single thread >> + * which spins in a tight loop the requested number of times. Each spin >> + * increments a counter whose final 32-bit value is written to the >> + * destination buffer on completion. This utility provides a simple way >> + * to keep the render engine busy for a set time for various tests. 
>> + */ >> +typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch, >> + struct igt_buf *dst, uint32_t spins); >> + >> +igt_media_spinfunc_t igt_get_media_spinfunc(int devid); >> + >> #endif >> diff --git a/lib/media_spin.c b/lib/media_spin.c >> new file mode 100644 >> index 0000000..580c109 >> --- /dev/null >> +++ b/lib/media_spin.c >> @@ -0,0 +1,540 @@ >> +/* >> + * Copyright © 2015 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the next >> + * paragraph) shall be included in all copies or substantial portions of the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >> EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, >> DAMAGES OR OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >> OTHER DEALINGS >> + * IN THE SOFTWARE. 
>> + * >> + * Authors: >> + * Jeff McGee <jeff.mcgee@intel.com> >> + */ >> + >> +#include <intel_bufmgr.h> >> +#include <i915_drm.h> >> +#include "intel_reg.h" >> +#include "drmtest.h" >> +#include "intel_batchbuffer.h" >> +#include "gen8_media.h" >> +#include "media_spin.h" >> + >> +static const uint32_t spin_kernel[][4] = { >> + { 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov >> (8)r4.0<1>:ud r0.0<8;8;1>:ud */ >> + { 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov >> (2)r4.0<1>.ud r2.0<2;2;1>:ud */ >> + { 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov >> (1)r4.8<1>:ud 0x3 */ >> + { 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov >> (1)r5.0<1>:ud 0 */ >> + { 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add >> (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */ >> + { 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0 >> (1)null<1> r1<0;1;0> r5<0;1;0> */ >> + { 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1) >> -32 */ >> + { 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1 >> (16)null<1> r4.0<0;1;0> 0x040a8000 */ >> + { 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov >> (8)r112<1>:ud r0.0<8;8;1>:ud */ >> + { 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts >> (16)null<1> r112<0;1;0>:d 0x82000010 */ >> +}; >> + >> +static uint32_t >> +batch_used(struct intel_batchbuffer *batch) >> +{ >> + return batch->ptr - batch->buffer; >> +} >> + >> +static uint32_t >> +batch_align(struct intel_batchbuffer *batch, uint32_t align) >> +{ >> + uint32_t offset = batch_used(batch); >> + offset = ALIGN(offset, align); >> + batch->ptr = batch->buffer + offset; >> + return offset; >> +} >> + >> +static void * >> +batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align) >> +{ >> + uint32_t offset = batch_align(batch, align); >> + batch->ptr += size; >> + return memset(batch->buffer + offset, 0, size); >> +} >> + >> +static uint32_t >> +batch_offset(struct 
intel_batchbuffer *batch, void *ptr) >> +{ >> + return (uint8_t *)ptr - batch->buffer; >> +} >> + >> +static uint32_t >> +batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, >> + uint32_t align) >> +{ >> + return batch_offset(batch, memcpy(batch_alloc(batch, size, align), >> ptr, size)); >> +} >> + >> +static void >> +gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end) >> +{ >> + int ret; >> + >> + ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer); >> + if (ret == 0) >> + ret = drm_intel_gem_bo_context_exec(batch->bo, NULL, >> + batch_end, 0); >> + igt_assert_eq(ret, 0); >> +} >> + >> +static uint32_t >> +gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch, >> + uint32_t iters) >> +{ >> + uint32_t *curbe_buffer; >> + uint32_t offset; >> + >> + curbe_buffer = batch_alloc(batch, 64, 64); >> + offset = batch_offset(batch, curbe_buffer); >> + *curbe_buffer = iters; >> + >> + return offset; >> +} >> + >> +static uint32_t >> +gen8_spin_surface_state(struct intel_batchbuffer *batch, >> + struct igt_buf *buf, >> + uint32_t format, >> + int is_dst) >> +{ >> + struct gen8_surface_state *ss; >> + uint32_t write_domain, read_domain, offset; >> + int ret; >> + >> + if (is_dst) { >> + write_domain = read_domain = >> I915_GEM_DOMAIN_RENDER; >> + } else { >> + write_domain = 0; >> + read_domain = I915_GEM_DOMAIN_SAMPLER; >> + } >> + >> + ss = batch_alloc(batch, sizeof(*ss), 64); >> + offset = batch_offset(batch, ss); >> + >> + ss->ss0.surface_type = GEN8_SURFACE_2D; >> + ss->ss0.surface_format = format; >> + ss->ss0.render_cache_read_write = 1; >> + ss->ss0.vertical_alignment = 1; /* align 4 */ >> + ss->ss0.horizontal_alignment = 1; /* align 4 */ >> + >> + if (buf->tiling == I915_TILING_X) >> + ss->ss0.tiled_mode = 2; >> + else if (buf->tiling == I915_TILING_Y) >> + ss->ss0.tiled_mode = 3; >> + >> + ss->ss8.base_addr = buf->bo->offset; >> + >> + ret = drm_intel_bo_emit_reloc(batch->bo, >> + batch_offset(batch, ss) + 8 
* 4, >> + buf->bo, 0, >> + read_domain, write_domain); >> + igt_assert_eq(ret, 0); >> + >> + ss->ss2.height = igt_buf_height(buf) - 1; >> + ss->ss2.width = igt_buf_width(buf) - 1; >> + ss->ss3.pitch = buf->stride - 1; >> + >> + ss->ss7.shader_chanel_select_r = 4; >> + ss->ss7.shader_chanel_select_g = 5; >> + ss->ss7.shader_chanel_select_b = 6; >> + ss->ss7.shader_chanel_select_a = 7; >> + >> + return offset; >> +} >> + >> +static uint32_t >> +gen8_spin_binding_table(struct intel_batchbuffer *batch, >> + struct igt_buf *dst) >> +{ >> + uint32_t *binding_table, offset; >> + >> + binding_table = batch_alloc(batch, 32, 64); >> + offset = batch_offset(batch, binding_table); >> + >> + binding_table[0] = gen8_spin_surface_state(batch, dst, >> + >> GEN8_SURFACEFORMAT_R8_UNORM, 1); >> + >> + return offset; >> +} >> + >> +static uint32_t >> +gen8_spin_media_kernel(struct intel_batchbuffer *batch, >> + const uint32_t kernel[][4], >> + size_t size) >> +{ >> + uint32_t offset; >> + >> + offset = batch_copy(batch, kernel, size, 64); >> + >> + return offset; >> +} >> + >> +static uint32_t >> +gen8_spin_interface_descriptor(struct intel_batchbuffer *batch, >> + struct igt_buf *dst) >> +{ >> + struct gen8_interface_descriptor_data *idd; >> + uint32_t offset; >> + uint32_t binding_table_offset, kernel_offset; >> + >> + binding_table_offset = gen8_spin_binding_table(batch, dst); >> + kernel_offset = gen8_spin_media_kernel(batch, spin_kernel, >> + sizeof(spin_kernel)); >> + >> + idd = batch_alloc(batch, sizeof(*idd), 64); >> + offset = batch_offset(batch, idd); >> + >> + idd->desc0.kernel_start_pointer = (kernel_offset >> 6); >> + >> + idd->desc2.single_program_flow = 1; >> + idd->desc2.floating_point_mode = >> GEN8_FLOATING_POINT_IEEE_754; >> + >> + idd->desc3.sampler_count = 0; /* 0 samplers used */ >> + idd->desc3.sampler_state_pointer = 0; >> + >> + idd->desc4.binding_table_entry_count = 0; >> + idd->desc4.binding_table_pointer = (binding_table_offset >> 5); >> + >> + 
idd->desc5.constant_urb_entry_read_offset = 0; >> + idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */ >> + >> + return offset; >> +} >> + >> +static void >> +gen8_emit_state_base_address(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2)); >> + >> + /* general */ >> + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + >> + /* stateless data port */ >> + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); >> + >> + /* surface */ >> + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, >> BASE_ADDRESS_MODIFY); >> + >> + /* dynamic */ >> + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | >> I915_GEM_DOMAIN_INSTRUCTION, >> + 0, BASE_ADDRESS_MODIFY); >> + >> + /* indirect */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* instruction */ >> + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, >> BASE_ADDRESS_MODIFY); >> + >> + /* general state buffer size */ >> + OUT_BATCH(0xfffff000 | 1); >> + /* dynamic state buffer size */ >> + OUT_BATCH(1 << 12 | 1); >> + /* indirect object buffer size */ >> + OUT_BATCH(0xfffff000 | 1); >> + /* intruction buffer size, must set modify enable bit, otherwise it >> may result in GPU hang */ >> + OUT_BATCH(1 << 12 | 1); >> +} >> + >> +static void >> +gen9_emit_state_base_address(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2)); >> + >> + /* general */ >> + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + >> + /* stateless data port */ >> + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); >> + >> + /* surface */ >> + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, >> BASE_ADDRESS_MODIFY); >> + >> + /* dynamic */ >> + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | >> I915_GEM_DOMAIN_INSTRUCTION, >> + 0, BASE_ADDRESS_MODIFY); >> + >> + /* indirect */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* instruction */ >> + OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, >> BASE_ADDRESS_MODIFY); >> + >> + /* general state buffer size */ >> + OUT_BATCH(0xfffff000 | 
1); >> + /* dynamic state buffer size */ >> + OUT_BATCH(1 << 12 | 1); >> + /* indirect object buffer size */ >> + OUT_BATCH(0xfffff000 | 1); >> + /* intruction buffer size, must set modify enable bit, otherwise it >> may result in GPU hang */ >> + OUT_BATCH(1 << 12 | 1); >> + >> + /* Bindless surface state base address */ >> + OUT_BATCH(0 | BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + OUT_BATCH(0xfffff000); >> +} >> + >> +static void >> +gen8_emit_vfe_state(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2)); >> + >> + /* scratch buffer */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* number of threads & urb entries */ >> + OUT_BATCH(2 << 8); >> + >> + OUT_BATCH(0); >> + >> + /* urb entry size & curbe size */ >> + OUT_BATCH(2 << 16 | >> + 2); >> + >> + /* scoreboard */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> +} >> + >> +static void >> +gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t >> curbe_buffer) >> +{ >> + OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2)); >> + OUT_BATCH(0); >> + /* curbe total data length */ >> + OUT_BATCH(64); >> + /* curbe data start address, is relative to the dynamics base address >> */ >> + OUT_BATCH(curbe_buffer); >> +} >> + >> +static void >> +gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, >> + uint32_t interface_descriptor) >> +{ >> + OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2)); >> + OUT_BATCH(0); >> + /* interface descriptor data length */ >> + OUT_BATCH(sizeof(struct gen8_interface_descriptor_data)); >> + /* interface descriptor address, is relative to the dynamics base >> address */ >> + OUT_BATCH(interface_descriptor); >> +} >> + >> +static void >> +gen8_emit_media_state_flush(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2)); >> + OUT_BATCH(0); >> +} >> + >> +static void >> +gen8_emit_media_objects(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); >> + >> + 
/* interface descriptor offset */ >> + OUT_BATCH(0); >> + >> + /* without indirect data */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* scoreboard */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* inline data (xoffset, yoffset) */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + gen8_emit_media_state_flush(batch); >> +} >> + >> +static void >> +gen8lp_emit_media_objects(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2)); >> + >> + /* interface descriptor offset */ >> + OUT_BATCH(0); >> + >> + /* without indirect data */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* scoreboard */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* inline data (xoffset, yoffset) */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> +} >> + >> +/* >> + * This sets up the media pipeline, >> + * >> + * +---------------+ <---- 4096 >> + * | ^ | >> + * | | | >> + * | various | >> + * | state | >> + * | | | >> + * |_______|_______| <---- 2048 + ? >> + * | ^ | >> + * | | | >> + * | batch | >> + * | commands | >> + * | | | >> + * | | | >> + * +---------------+ <---- 0 + ? 
>> + * >> + */ >> + >> +#define BATCH_STATE_SPLIT 2048 >> + >> +void >> +gen8_media_spinfunc(struct intel_batchbuffer *batch, >> + struct igt_buf *dst, uint32_t spins) >> +{ >> + uint32_t curbe_buffer, interface_descriptor; >> + uint32_t batch_end; >> + >> + intel_batchbuffer_flush_with_context(batch, NULL); >> + >> + /* setup states */ >> + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; >> + >> + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); >> + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); >> + igt_assert(batch->ptr < &batch->buffer[4095]); >> + >> + /* media pipeline */ >> + batch->ptr = batch->buffer; >> + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); >> + gen8_emit_state_base_address(batch); >> + >> + gen8_emit_vfe_state(batch); >> + >> + gen8_emit_curbe_load(batch, curbe_buffer); >> + >> + gen8_emit_interface_descriptor_load(batch, interface_descriptor); >> + >> + gen8_emit_media_objects(batch); >> + >> + OUT_BATCH(MI_BATCH_BUFFER_END); >> + >> + batch_end = batch_align(batch, 8); >> + igt_assert(batch_end < BATCH_STATE_SPLIT); >> + >> + gen8_render_flush(batch, batch_end); >> + intel_batchbuffer_reset(batch); >> +} >> + >> +void >> +gen8lp_media_spinfunc(struct intel_batchbuffer *batch, >> + struct igt_buf *dst, uint32_t spins) >> +{ >> + uint32_t curbe_buffer, interface_descriptor; >> + uint32_t batch_end; >> + >> + intel_batchbuffer_flush_with_context(batch, NULL); >> + >> + /* setup states */ >> + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; >> + >> + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); >> + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); >> + igt_assert(batch->ptr < &batch->buffer[4095]); >> + >> + /* media pipeline */ >> + batch->ptr = batch->buffer; >> + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA); >> + gen8_emit_state_base_address(batch); >> + >> + gen8_emit_vfe_state(batch); >> + >> + gen8_emit_curbe_load(batch, curbe_buffer); >> + >> + 
gen8_emit_interface_descriptor_load(batch, interface_descriptor); >> + >> + gen8lp_emit_media_objects(batch); >> + >> + OUT_BATCH(MI_BATCH_BUFFER_END); >> + >> + batch_end = batch_align(batch, 8); >> + igt_assert(batch_end < BATCH_STATE_SPLIT); >> + >> + gen8_render_flush(batch, batch_end); >> + intel_batchbuffer_reset(batch); >> +} >> + >> +void >> +gen9_media_spinfunc(struct intel_batchbuffer *batch, >> + struct igt_buf *dst, uint32_t spins) >> +{ >> + uint32_t curbe_buffer, interface_descriptor; >> + uint32_t batch_end; >> + >> + intel_batchbuffer_flush_with_context(batch, NULL); >> + >> + /* setup states */ >> + batch->ptr = &batch->buffer[BATCH_STATE_SPLIT]; >> + >> + curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins); >> + interface_descriptor = gen8_spin_interface_descriptor(batch, dst); >> + igt_assert(batch->ptr < &batch->buffer[4095]); >> + >> + /* media pipeline */ >> + batch->ptr = batch->buffer; >> + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | >> + GEN9_FORCE_MEDIA_AWAKE_ENABLE | >> + GEN9_SAMPLER_DOP_GATE_DISABLE | >> + GEN9_PIPELINE_SELECTION_MASK | >> + GEN9_SAMPLER_DOP_GATE_MASK | >> + GEN9_FORCE_MEDIA_AWAKE_MASK); >> + gen9_emit_state_base_address(batch); >> + >> + gen8_emit_vfe_state(batch); >> + >> + gen8_emit_curbe_load(batch, curbe_buffer); >> + >> + gen8_emit_interface_descriptor_load(batch, interface_descriptor); >> + >> + gen8_emit_media_objects(batch); >> + >> + OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA | >> + GEN9_FORCE_MEDIA_AWAKE_DISABLE | >> + GEN9_SAMPLER_DOP_GATE_ENABLE | >> + GEN9_PIPELINE_SELECTION_MASK | >> + GEN9_SAMPLER_DOP_GATE_MASK | >> + GEN9_FORCE_MEDIA_AWAKE_MASK); >> + >> + OUT_BATCH(MI_BATCH_BUFFER_END); >> + >> + batch_end = batch_align(batch, 8); >> + igt_assert(batch_end < BATCH_STATE_SPLIT); >> + >> + gen8_render_flush(batch, batch_end); >> + intel_batchbuffer_reset(batch); >> +} >> diff --git a/lib/media_spin.h b/lib/media_spin.h >> new file mode 100644 >> index 0000000..8bc4829 >> --- 
/dev/null >> +++ b/lib/media_spin.h >> @@ -0,0 +1,39 @@ >> +/* >> + * Copyright © 2015 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the next >> + * paragraph) shall be included in all copies or substantial portions of the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >> EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, >> DAMAGES OR OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >> OTHER DEALINGS >> + * IN THE SOFTWARE. 
>> + * >> + * Authors: >> + * Jeff McGee <jeff.mcgee@intel.com> >> + */ >> + >> +#ifndef MEDIA_SPIN_H >> +#define MEDIA_SPIN_H >> + >> +void gen8_media_spinfunc(struct intel_batchbuffer *batch, >> + struct igt_buf *dst, uint32_t spins); >> + >> +void gen8lp_media_spinfunc(struct intel_batchbuffer *batch, >> + struct igt_buf *dst, uint32_t spins); >> + >> +void gen9_media_spinfunc(struct intel_batchbuffer *batch, >> + struct igt_buf *dst, uint32_t spins); >> + >> +#endif /* MEDIA_SPIN_H */ >> -- >> 2.3.0 >> >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu 2015-03-10 21:17 [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+ jeff.mcgee 2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee @ 2015-03-10 21:17 ` jeff.mcgee 2015-03-12 12:09 ` Thomas Wood 2015-03-12 17:54 ` [PATCH i-g-t 2/2 v2] " jeff.mcgee 1 sibling, 2 replies; 10+ messages in thread From: jeff.mcgee @ 2015-03-10 21:17 UTC (permalink / raw) To: intel-gfx From: Jeff McGee <jeff.mcgee@intel.com> New test pm_sseu is intended for any subtest related to the slice/subslice/EU power gating feature. The sole initial subtest, 'full-enable', confirms that the slice/subslice/EU state is at full enablement when the render engine is active. Starting with Gen9 SKL, the render power gating feature can leave SSEU in a partially enabled state upon resumption of render work unless explicit action is taken. Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> --- tests/.gitignore | 1 + tests/Makefile.sources | 1 + tests/pm_sseu.c | 373 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 375 insertions(+) create mode 100644 tests/pm_sseu.c diff --git a/tests/.gitignore b/tests/.gitignore index 7b4dd94..23094ce 100644 --- a/tests/.gitignore +++ b/tests/.gitignore @@ -144,6 +144,7 @@ pm_psr pm_rc6_residency pm_rpm pm_rps +pm_sseu prime_nv_api prime_nv_pcopy prime_nv_test diff --git a/tests/Makefile.sources b/tests/Makefile.sources index 51e8376..74106c0 100644 --- a/tests/Makefile.sources +++ b/tests/Makefile.sources @@ -82,6 +82,7 @@ TESTS_progs_M = \ pm_rpm \ pm_rps \ pm_rc6_residency \ + pm_sseu \ prime_self_import \ template \ $(NULL) diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c new file mode 100644 index 0000000..45aeef3 --- /dev/null +++ b/tests/pm_sseu.c @@ -0,0 +1,373 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in 
the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + * Jeff McGee <jeff.mcgee@intel.com> + */ + +#include <fcntl.h> +#include <unistd.h> +#include <string.h> +#include <errno.h> +#include <time.h> +#include "drmtest.h" +#include "i915_drm.h" +#include "intel_io.h" +#include "intel_bufmgr.h" +#include "intel_batchbuffer.h" +#include "intel_chipset.h" +#include "ioctl_wrappers.h" +#include "igt_debugfs.h" +#include "media_spin.h" + +static double +to_dt(const struct timespec *start, const struct timespec *end) +{ + double dt; + + dt = (end->tv_sec - start->tv_sec) * 1e3; + dt += (end->tv_nsec - start->tv_nsec) * 1e-6; + + return dt; +} + +struct status { + struct { + int slice_total; + int subslice_total; + int subslice_per; + int eu_total; + int eu_per; + bool has_slice_pg; + bool has_subslice_pg; + bool has_eu_pg; + } info; + struct { + int slice_total; + int subslice_total; + int subslice_per; + int eu_total; + int eu_per; + } hw; +}; + +#define DBG_STATUS_BUF_SIZE 4096 + +struct { + int init; + int status_fd; + char status_buf[DBG_STATUS_BUF_SIZE]; +} dbg; + 
+static void +dbg_get_status_section(const char *title, char **first, char **last) +{ + char *pos; + + *first = strstr(dbg.status_buf, title); + igt_assert(*first != NULL); + + pos = *first; + do { + pos = strchr(pos, '\n'); + igt_assert(pos != NULL); + pos++; + } while (*pos == ' '); /* lines in the section begin with a space */ + *last = pos - 1; +} + +static int +dbg_get_int(const char *first, const char *last, const char *name) +{ + char *pos; + + pos = strstr(first, name); + igt_assert(pos != NULL); + pos = strstr(pos, ":"); + igt_assert(pos != NULL); + pos += 2; + igt_assert(pos < last); + + return strtol(pos, &pos, 10); +} + +static bool +dbg_get_bool(const char *first, const char *last, const char *name) +{ + char *pos; + + pos = strstr(first, name); + igt_assert(pos != NULL); + pos = strstr(pos, ":"); + igt_assert(pos != NULL); + pos += 2; + igt_assert(pos < last); + + if (*pos == 'y') + return true; + if (*pos == 'n') + return false; + + igt_assert(false); + return false; +} + +static void +dbg_get_status(struct status *stat) +{ + char *first, *last; + int nread; + + lseek(dbg.status_fd, 0, SEEK_SET); + nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE); + igt_assert(nread < DBG_STATUS_BUF_SIZE); + dbg.status_buf[nread] = '\0'; + + memset(stat, 0, sizeof(*stat)); + + dbg_get_status_section("SSEU Device Info", &first, &last); + stat->info.slice_total = + dbg_get_int(first, last, "Available Slice Total:"); + stat->info.subslice_total = + dbg_get_int(first, last, "Available Subslice Total:"); + stat->info.subslice_per = + dbg_get_int(first, last, "Available Subslice Per Slice:"); + stat->info.eu_total = + dbg_get_int(first, last, "Available EU Total:"); + stat->info.eu_per = + dbg_get_int(first, last, "Available EU Per Subslice:"); + stat->info.has_slice_pg = + dbg_get_bool(first, last, "Has Slice Power Gating:"); + stat->info.has_subslice_pg = + dbg_get_bool(first, last, "Has Subslice Power Gating:"); + stat->info.has_eu_pg = + 
dbg_get_bool(first, last, "Has EU Power Gating:"); + + dbg_get_status_section("SSEU Device Status", &first, &last); + stat->hw.slice_total = + dbg_get_int(first, last, "Enabled Slice Total:"); + stat->hw.subslice_total = + dbg_get_int(first, last, "Enabled Subslice Total:"); + stat->hw.subslice_per = + dbg_get_int(first, last, "Enabled Subslice Per Slice:"); + stat->hw.eu_total = + dbg_get_int(first, last, "Enabled EU Total:"); + stat->hw.eu_per = + dbg_get_int(first, last, "Enabled EU Per Subslice:"); +} + +static void +dbg_init(void) +{ + dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY); + igt_assert(dbg.status_fd != -1); + dbg.init = 1; +} + +static void +dbg_deinit(void) +{ + switch (dbg.init) + { + case 1: + close(dbg.status_fd); + } +} + +struct { + int init; + int drm_fd; + int devid; + int gen; + int has_ppgtt; + drm_intel_bufmgr *bufmgr; + struct intel_batchbuffer *batch; + igt_media_spinfunc_t spinfunc; + struct igt_buf buf; + uint32_t spins_per_msec; +} gem; + +static void +gem_check_spin(uint32_t spins) +{ + uint32_t *data; + + data = (uint32_t*)gem.buf.bo->virtual; + igt_assert(*data == spins); +} + +static uint32_t +gem_get_target_spins(double dt) +{ + struct timespec tstart, tdone; + double prev_dt, cur_dt; + uint32_t spins; + int i, ret; + + /* Double increments until we bound the target time */ + prev_dt = 0.0; + for (i = 0; i < 32; i++) { + spins = 1 << i; + clock_gettime(CLOCK_MONOTONIC, &tstart); + + gem.spinfunc(gem.batch, &gem.buf, spins); + ret = drm_intel_bo_map(gem.buf.bo, 0); + igt_assert (ret == 0); + clock_gettime(CLOCK_MONOTONIC, &tdone); + + gem_check_spin(spins); + drm_intel_bo_unmap(gem.buf.bo); + + cur_dt = to_dt(&tstart, &tdone); + if (cur_dt > dt) + break; + prev_dt = cur_dt; + } + igt_assert(i != 32); + + /* Linearly interpolate between i and i-1 to get target increments */ + spins = 1 << (i-1); /* lower bound spins */ + spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */ + + return spins; +} + 
+static void +gem_init(void) +{ + gem.drm_fd = drm_open_any(); + gem.init = 1; + + gem.devid = intel_get_drm_devid(gem.drm_fd); + gem.gen = intel_gen(gem.devid); + gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd); + + gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096); + igt_assert(gem.bufmgr); + gem.init = 2; + + drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr); + + gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid); + igt_assert(gem.batch); + gem.init = 3; + + gem.spinfunc = igt_get_media_spinfunc(gem.devid); + igt_assert(gem.spinfunc); + + gem.buf.stride = sizeof(uint32_t); + gem.buf.tiling = I915_TILING_NONE; + gem.buf.size = gem.buf.stride; + gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096); + igt_assert(gem.buf.bo); + gem.init = 4; + + gem.spins_per_msec = gem_get_target_spins(100) / 100; +} + +static void +gem_deinit(void) +{ + switch (gem.init) + { + case 4: + drm_intel_bo_unmap(gem.buf.bo); + drm_intel_bo_unreference(gem.buf.bo); + case 3: + intel_batchbuffer_free(gem.batch); + case 2: + drm_intel_bufmgr_destroy(gem.bufmgr); + case 1: + close(gem.drm_fd); + } +} + +static void +check_full_enable(struct status *stat) +{ + igt_assert(stat->hw.slice_total == stat->info.slice_total); + igt_assert(stat->hw.subslice_total == stat->info.subslice_total); + igt_assert(stat->hw.subslice_per == stat->info.subslice_per); + + /* + * EU are powered in pairs, but it is possible for one EU in the pair + * to be non-functional due to fusing. The determination of enabled + * EU does not account for this and can therefore actually exceed the + * available count. Allow for this small discrepancy in our + * comparison. + */ + igt_assert(stat->hw.eu_total >= stat->info.eu_total); + igt_assert(stat->hw.eu_per >= stat->info.eu_per); +} + +static void +full_enable(void) +{ + struct status stat; + const int spin_msec = 10; + int ret, spins; + + /* Simulation doesn't currently model slice/subslice/EU power gating. 
*/ + igt_skip_on_simulation(); + + /* + * Gen9 SKL is the first case in which render power gating can leave + * slice/subslice/EU in a partially enabled state upon resumption of + * render work. So start checking that this is prevented as of Gen9. + */ + igt_require(gem.gen >= 9); + + spins = spin_msec * gem.spins_per_msec; + + gem.spinfunc(gem.batch, &gem.buf, spins); + + usleep(2000); /* 2ms wait to make sure batch is running */ + dbg_get_status(&stat); + + ret = drm_intel_bo_map(gem.buf.bo, 0); + igt_assert (ret == 0); + + gem_check_spin(spins); + drm_intel_bo_unmap(gem.buf.bo); + + check_full_enable(&stat); +} + +static void +exit_handler(int sig) +{ + gem_deinit(); + dbg_deinit(); +} + +igt_main +{ + igt_fixture { + igt_install_exit_handler(exit_handler); + + dbg_init(); + gem_init(); + } + + igt_subtest("full-enable") + full_enable(); +} -- 2.3.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu 2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee @ 2015-03-12 12:09 ` Thomas Wood 2015-03-18 16:51 ` Jeff McGee 2015-03-12 17:54 ` [PATCH i-g-t 2/2 v2] " jeff.mcgee 1 sibling, 1 reply; 10+ messages in thread From: Thomas Wood @ 2015-03-12 12:09 UTC (permalink / raw) To: jeff.mcgee; +Cc: Intel Graphics Development On 10 March 2015 at 21:17, <jeff.mcgee@intel.com> wrote: > From: Jeff McGee <jeff.mcgee@intel.com> > > New test pm_sseu is intended for any subtest related to the > slice/subslice/EU power gating feature. The sole initial subtest, > 'full-enable', confirms that the slice/subslice/EU state is at > full enablement when the render engine is active. Starting with > Gen9 SKL, the render power gating feature can leave SSEU in a > partially enabled state upon resumption of render work unless > explicit action is taken. Please add a short description to the test using the IGT_TEST_DESCRIPTION macro, so that it is included in the documentation and help output. 
> > Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> > --- > tests/.gitignore | 1 + > tests/Makefile.sources | 1 + > tests/pm_sseu.c | 373 +++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 375 insertions(+) > create mode 100644 tests/pm_sseu.c > > diff --git a/tests/.gitignore b/tests/.gitignore > index 7b4dd94..23094ce 100644 > --- a/tests/.gitignore > +++ b/tests/.gitignore > @@ -144,6 +144,7 @@ pm_psr > pm_rc6_residency > pm_rpm > pm_rps > +pm_sseu > prime_nv_api > prime_nv_pcopy > prime_nv_test > diff --git a/tests/Makefile.sources b/tests/Makefile.sources > index 51e8376..74106c0 100644 > --- a/tests/Makefile.sources > +++ b/tests/Makefile.sources > @@ -82,6 +82,7 @@ TESTS_progs_M = \ > pm_rpm \ > pm_rps \ > pm_rc6_residency \ > + pm_sseu \ > prime_self_import \ > template \ > $(NULL) > diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c > new file mode 100644 > index 0000000..45aeef3 > --- /dev/null > +++ b/tests/pm_sseu.c > @@ -0,0 +1,373 @@ > +/* > + * Copyright © 2015 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS > + * IN THE SOFTWARE. > + * > + * Authors: > + * Jeff McGee <jeff.mcgee@intel.com> > + */ > + > +#include <fcntl.h> > +#include <unistd.h> > +#include <string.h> > +#include <errno.h> > +#include <time.h> > +#include "drmtest.h" > +#include "i915_drm.h" > +#include "intel_io.h" > +#include "intel_bufmgr.h" > +#include "intel_batchbuffer.h" > +#include "intel_chipset.h" > +#include "ioctl_wrappers.h" > +#include "igt_debugfs.h" > +#include "media_spin.h" > + > +static double > +to_dt(const struct timespec *start, const struct timespec *end) > +{ > + double dt; > + > + dt = (end->tv_sec - start->tv_sec) * 1e3; > + dt += (end->tv_nsec - start->tv_nsec) * 1e-6; > + > + return dt; > +} > + > +struct status { > + struct { > + int slice_total; > + int subslice_total; > + int subslice_per; > + int eu_total; > + int eu_per; > + bool has_slice_pg; > + bool has_subslice_pg; > + bool has_eu_pg; > + } info; > + struct { > + int slice_total; > + int subslice_total; > + int subslice_per; > + int eu_total; > + int eu_per; > + } hw; > +}; > + > +#define DBG_STATUS_BUF_SIZE 4096 > + > +struct { > + int init; > + int status_fd; > + char status_buf[DBG_STATUS_BUF_SIZE]; > +} dbg; > + > +static void > +dbg_get_status_section(const char *title, char **first, char **last) > +{ > + char *pos; > + > + *first = strstr(dbg.status_buf, title); > + igt_assert(*first != NULL); > + > + pos = *first; > + do { > + pos = strchr(pos, '\n'); > + igt_assert(pos != NULL); > + pos++; > + } while (*pos == ' '); /* lines in the section begin with a space */ > + *last = pos - 1; > +} > + > +static int > +dbg_get_int(const char *first, const char *last, const char *name) > +{ > + char *pos; > + > + pos = strstr(first, name); > + igt_assert(pos != 
NULL); > + pos = strstr(pos, ":"); > + igt_assert(pos != NULL); > + pos += 2; > + igt_assert(pos < last); > + > + return strtol(pos, &pos, 10); > +} > + > +static bool > +dbg_get_bool(const char *first, const char *last, const char *name) > +{ > + char *pos; > + > + pos = strstr(first, name); > + igt_assert(pos != NULL); > + pos = strstr(pos, ":"); > + igt_assert(pos != NULL); > + pos += 2; > + igt_assert(pos < last); > + > + if (*pos == 'y') > + return true; > + if (*pos == 'n') > + return false; > + > + igt_assert(false); Perhaps use igt_assert_f() to add a more detailed error message? > + return false; > +} > + > +static void > +dbg_get_status(struct status *stat) > +{ > + char *first, *last; > + int nread; > + > + lseek(dbg.status_fd, 0, SEEK_SET); > + nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE); > + igt_assert(nread < DBG_STATUS_BUF_SIZE); igt_assert_lt() would produce a better error message here. Using igt.cocci will suggest other similar changes elsewhere too. 
> + dbg.status_buf[nread] = '\0'; > + > + memset(stat, 0, sizeof(*stat)); > + > + dbg_get_status_section("SSEU Device Info", &first, &last); > + stat->info.slice_total = > + dbg_get_int(first, last, "Available Slice Total:"); > + stat->info.subslice_total = > + dbg_get_int(first, last, "Available Subslice Total:"); > + stat->info.subslice_per = > + dbg_get_int(first, last, "Available Subslice Per Slice:"); > + stat->info.eu_total = > + dbg_get_int(first, last, "Available EU Total:"); > + stat->info.eu_per = > + dbg_get_int(first, last, "Available EU Per Subslice:"); > + stat->info.has_slice_pg = > + dbg_get_bool(first, last, "Has Slice Power Gating:"); > + stat->info.has_subslice_pg = > + dbg_get_bool(first, last, "Has Subslice Power Gating:"); > + stat->info.has_eu_pg = > + dbg_get_bool(first, last, "Has EU Power Gating:"); > + > + dbg_get_status_section("SSEU Device Status", &first, &last); > + stat->hw.slice_total = > + dbg_get_int(first, last, "Enabled Slice Total:"); > + stat->hw.subslice_total = > + dbg_get_int(first, last, "Enabled Subslice Total:"); > + stat->hw.subslice_per = > + dbg_get_int(first, last, "Enabled Subslice Per Slice:"); > + stat->hw.eu_total = > + dbg_get_int(first, last, "Enabled EU Total:"); > + stat->hw.eu_per = > + dbg_get_int(first, last, "Enabled EU Per Subslice:"); > +} > + > +static void > +dbg_init(void) > +{ > + dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY); > + igt_assert(dbg.status_fd != -1); > + dbg.init = 1; > +} > + > +static void > +dbg_deinit(void) > +{ > + switch (dbg.init) > + { > + case 1: > + close(dbg.status_fd); > + } > +} > + > +struct { > + int init; > + int drm_fd; > + int devid; > + int gen; > + int has_ppgtt; > + drm_intel_bufmgr *bufmgr; > + struct intel_batchbuffer *batch; > + igt_media_spinfunc_t spinfunc; > + struct igt_buf buf; > + uint32_t spins_per_msec; > +} gem; > + > +static void > +gem_check_spin(uint32_t spins) > +{ > + uint32_t *data; > + > + data = (uint32_t*)gem.buf.bo->virtual; > 
+ igt_assert(*data == spins); > +} > + > +static uint32_t > +gem_get_target_spins(double dt) > +{ > + struct timespec tstart, tdone; > + double prev_dt, cur_dt; > + uint32_t spins; > + int i, ret; > + > + /* Double increments until we bound the target time */ > + prev_dt = 0.0; > + for (i = 0; i < 32; i++) { > + spins = 1 << i; > + clock_gettime(CLOCK_MONOTONIC, &tstart); > + > + gem.spinfunc(gem.batch, &gem.buf, spins); > + ret = drm_intel_bo_map(gem.buf.bo, 0); > + igt_assert (ret == 0); > + clock_gettime(CLOCK_MONOTONIC, &tdone); > + > + gem_check_spin(spins); > + drm_intel_bo_unmap(gem.buf.bo); > + > + cur_dt = to_dt(&tstart, &tdone); > + if (cur_dt > dt) > + break; > + prev_dt = cur_dt; > + } > + igt_assert(i != 32); > + > + /* Linearly interpolate between i and i-1 to get target increments */ > + spins = 1 << (i-1); /* lower bound spins */ > + spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */ > + > + return spins; > +} > + > +static void > +gem_init(void) > +{ > + gem.drm_fd = drm_open_any(); > + gem.init = 1; > + > + gem.devid = intel_get_drm_devid(gem.drm_fd); > + gem.gen = intel_gen(gem.devid); > + gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd); > + > + gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096); > + igt_assert(gem.bufmgr); > + gem.init = 2; > + > + drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr); > + > + gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid); > + igt_assert(gem.batch); > + gem.init = 3; > + > + gem.spinfunc = igt_get_media_spinfunc(gem.devid); > + igt_assert(gem.spinfunc); > + > + gem.buf.stride = sizeof(uint32_t); > + gem.buf.tiling = I915_TILING_NONE; > + gem.buf.size = gem.buf.stride; > + gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096); > + igt_assert(gem.buf.bo); > + gem.init = 4; > + > + gem.spins_per_msec = gem_get_target_spins(100) / 100; > +} > + > +static void > +gem_deinit(void) > +{ > + switch (gem.init) > + { > + case 4: > + drm_intel_bo_unmap(gem.buf.bo); > + 
drm_intel_bo_unreference(gem.buf.bo); > + case 3: > + intel_batchbuffer_free(gem.batch); > + case 2: > + drm_intel_bufmgr_destroy(gem.bufmgr); > + case 1: > + close(gem.drm_fd); > + } > +} > + > +static void > +check_full_enable(struct status *stat) > +{ > + igt_assert(stat->hw.slice_total == stat->info.slice_total); > + igt_assert(stat->hw.subslice_total == stat->info.subslice_total); > + igt_assert(stat->hw.subslice_per == stat->info.subslice_per); > + > + /* > + * EU are powered in pairs, but it is possible for one EU in the pair > + * to be non-functional due to fusing. The determination of enabled > + * EU does not account for this and can therefore actually exceed the > + * available count. Allow for this small discrepancy in our > + * comparison. > + */ > + igt_assert(stat->hw.eu_total >= stat->info.eu_total); > + igt_assert(stat->hw.eu_per >= stat->info.eu_per); > +} > + > +static void > +full_enable(void) > +{ > + struct status stat; > + const int spin_msec = 10; > + int ret, spins; > + > + /* Simulation doesn't currently model slice/subslice/EU power gating. */ > + igt_skip_on_simulation(); > + > + /* > + * Gen9 SKL is the first case in which render power gating can leave > + * slice/subslice/EU in a partially enabled state upon resumption of > + * render work. So start checking that this is prevented as of Gen9. 
> + */ > + igt_require(gem.gen >= 9); > + > + spins = spin_msec * gem.spins_per_msec; > + > + gem.spinfunc(gem.batch, &gem.buf, spins); > + > + usleep(2000); /* 2ms wait to make sure batch is running */ > + dbg_get_status(&stat); > + > + ret = drm_intel_bo_map(gem.buf.bo, 0); > + igt_assert (ret == 0); > + > + gem_check_spin(spins); > + drm_intel_bo_unmap(gem.buf.bo); > + > + check_full_enable(&stat); > +} > + > +static void > +exit_handler(int sig) > +{ > + gem_deinit(); > + dbg_deinit(); > +} > + > +igt_main > +{ > + igt_fixture { > + igt_install_exit_handler(exit_handler); > + > + dbg_init(); > + gem_init(); > + } > + > + igt_subtest("full-enable") > + full_enable(); > +} > -- > 2.3.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu 2015-03-12 12:09 ` Thomas Wood @ 2015-03-18 16:51 ` Jeff McGee 0 siblings, 0 replies; 10+ messages in thread From: Jeff McGee @ 2015-03-18 16:51 UTC (permalink / raw) To: Thomas Wood; +Cc: Intel Graphics Development On Thu, Mar 12, 2015 at 12:09:50PM +0000, Thomas Wood wrote: > On 10 March 2015 at 21:17, <jeff.mcgee@intel.com> wrote: > > From: Jeff McGee <jeff.mcgee@intel.com> > > > > New test pm_sseu is intended for any subtest related to the > > slice/subslice/EU power gating feature. The sole initial subtest, > > 'full-enable', confirms that the slice/subslice/EU state is at > > full enablement when the render engine is active. Starting with > > Gen9 SKL, the render power gating feature can leave SSEU in a > > partially enabled state upon resumption of render work unless > > explicit action is taken. > > Please add a short description to the test using the > IGT_TEST_DESCRIPTION macro, so that it is included in the > documentation and help output. > Hi Thomas. I have posted v2 patches to address this and your other comments. Can you please have a second look? Thanks -Jeff _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH i-g-t 2/2 v2] tests/pm_sseu: Create new test pm_sseu 2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee 2015-03-12 12:09 ` Thomas Wood @ 2015-03-12 17:54 ` jeff.mcgee 2015-03-24 23:20 ` [PATCH i-g-t 2/2 v3] " jeff.mcgee 1 sibling, 1 reply; 10+ messages in thread From: jeff.mcgee @ 2015-03-12 17:54 UTC (permalink / raw) To: intel-gfx From: Jeff McGee <jeff.mcgee@intel.com> New test pm_sseu is intended for any subtest related to the slice/subslice/EU power gating feature. The sole initial subtest, 'full-enable', confirms that the slice/subslice/EU state is at full enablement when the render engine is active. Starting with Gen9 SKL, the render power gating feature can leave SSEU in a partially enabled state upon resumption of render work unless explicit action is taken. v2: Add test description and apply recommendations of igt.cocci (Thomas Wood). Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> --- tests/.gitignore | 1 + tests/Makefile.sources | 1 + tests/pm_sseu.c | 375 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 377 insertions(+) create mode 100644 tests/pm_sseu.c diff --git a/tests/.gitignore b/tests/.gitignore index 426cc67..a1ec1b5 100644 --- a/tests/.gitignore +++ b/tests/.gitignore @@ -143,6 +143,7 @@ pm_lpsp pm_rc6_residency pm_rpm pm_rps +pm_sseu prime_nv_api prime_nv_pcopy prime_nv_test diff --git a/tests/Makefile.sources b/tests/Makefile.sources index 51e8376..74106c0 100644 --- a/tests/Makefile.sources +++ b/tests/Makefile.sources @@ -82,6 +82,7 @@ TESTS_progs_M = \ pm_rpm \ pm_rps \ pm_rc6_residency \ + pm_sseu \ prime_self_import \ template \ $(NULL) diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c new file mode 100644 index 0000000..7196dcb --- /dev/null +++ b/tests/pm_sseu.c @@ -0,0 +1,375 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the 
"Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + * Jeff McGee <jeff.mcgee@intel.com> + */ + +#include <fcntl.h> +#include <unistd.h> +#include <string.h> +#include <errno.h> +#include <time.h> +#include "drmtest.h" +#include "i915_drm.h" +#include "intel_io.h" +#include "intel_bufmgr.h" +#include "intel_batchbuffer.h" +#include "intel_chipset.h" +#include "ioctl_wrappers.h" +#include "igt_debugfs.h" +#include "media_spin.h" + +IGT_TEST_DESCRIPTION("Tests slice/subslice/EU power gating functionality.\n"); + +static double +to_dt(const struct timespec *start, const struct timespec *end) +{ + double dt; + + dt = (end->tv_sec - start->tv_sec) * 1e3; + dt += (end->tv_nsec - start->tv_nsec) * 1e-6; + + return dt; +} + +struct status { + struct { + int slice_total; + int subslice_total; + int subslice_per; + int eu_total; + int eu_per; + bool has_slice_pg; + bool has_subslice_pg; + bool has_eu_pg; + } info; + struct { + int slice_total; + int subslice_total; + int subslice_per; + int eu_total; + int eu_per; + } hw; +}; + +#define 
DBG_STATUS_BUF_SIZE 4096 + +struct { + int init; + int status_fd; + char status_buf[DBG_STATUS_BUF_SIZE]; +} dbg; + +static void +dbg_get_status_section(const char *title, char **first, char **last) +{ + char *pos; + + *first = strstr(dbg.status_buf, title); + igt_assert(*first != NULL); + + pos = *first; + do { + pos = strchr(pos, '\n'); + igt_assert(pos != NULL); + pos++; + } while (*pos == ' '); /* lines in the section begin with a space */ + *last = pos - 1; +} + +static int +dbg_get_int(const char *first, const char *last, const char *name) +{ + char *pos; + + pos = strstr(first, name); + igt_assert(pos != NULL); + pos = strstr(pos, ":"); + igt_assert(pos != NULL); + pos += 2; + igt_assert(pos != last); + + return strtol(pos, &pos, 10); +} + +static bool +dbg_get_bool(const char *first, const char *last, const char *name) +{ + char *pos; + + pos = strstr(first, name); + igt_assert(pos != NULL); + pos = strstr(pos, ":"); + igt_assert(pos != NULL); + pos += 2; + igt_assert(pos < last); + + if (*pos == 'y') + return true; + if (*pos == 'n') + return false; + + igt_assert_f(false, "Could not read boolean value for %s.\n", name); + return false; +} + +static void +dbg_get_status(struct status *stat) +{ + char *first, *last; + int nread; + + lseek(dbg.status_fd, 0, SEEK_SET); + nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE); + igt_assert_lt(nread, DBG_STATUS_BUF_SIZE); + dbg.status_buf[nread] = '\0'; + + memset(stat, 0, sizeof(*stat)); + + dbg_get_status_section("SSEU Device Info", &first, &last); + stat->info.slice_total = + dbg_get_int(first, last, "Available Slice Total:"); + stat->info.subslice_total = + dbg_get_int(first, last, "Available Subslice Total:"); + stat->info.subslice_per = + dbg_get_int(first, last, "Available Subslice Per Slice:"); + stat->info.eu_total = + dbg_get_int(first, last, "Available EU Total:"); + stat->info.eu_per = + dbg_get_int(first, last, "Available EU Per Subslice:"); + stat->info.has_slice_pg = + 
dbg_get_bool(first, last, "Has Slice Power Gating:"); + stat->info.has_subslice_pg = + dbg_get_bool(first, last, "Has Subslice Power Gating:"); + stat->info.has_eu_pg = + dbg_get_bool(first, last, "Has EU Power Gating:"); + + dbg_get_status_section("SSEU Device Status", &first, &last); + stat->hw.slice_total = + dbg_get_int(first, last, "Enabled Slice Total:"); + stat->hw.subslice_total = + dbg_get_int(first, last, "Enabled Subslice Total:"); + stat->hw.subslice_per = + dbg_get_int(first, last, "Enabled Subslice Per Slice:"); + stat->hw.eu_total = + dbg_get_int(first, last, "Enabled EU Total:"); + stat->hw.eu_per = + dbg_get_int(first, last, "Enabled EU Per Subslice:"); +} + +static void +dbg_init(void) +{ + dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY); + igt_assert_neq(dbg.status_fd, -1); + dbg.init = 1; +} + +static void +dbg_deinit(void) +{ + switch (dbg.init) + { + case 1: + close(dbg.status_fd); + } +} + +struct { + int init; + int drm_fd; + int devid; + int gen; + int has_ppgtt; + drm_intel_bufmgr *bufmgr; + struct intel_batchbuffer *batch; + igt_media_spinfunc_t spinfunc; + struct igt_buf buf; + uint32_t spins_per_msec; +} gem; + +static void +gem_check_spin(uint32_t spins) +{ + uint32_t *data; + + data = (uint32_t*)gem.buf.bo->virtual; + igt_assert_eq_u32(*data, spins); +} + +static uint32_t +gem_get_target_spins(double dt) +{ + struct timespec tstart, tdone; + double prev_dt, cur_dt; + uint32_t spins; + int i, ret; + + /* Double increments until we bound the target time */ + prev_dt = 0.0; + for (i = 0; i < 32; i++) { + spins = 1 << i; + clock_gettime(CLOCK_MONOTONIC, &tstart); + + gem.spinfunc(gem.batch, &gem.buf, spins); + ret = drm_intel_bo_map(gem.buf.bo, 0); + igt_assert_eq(ret, 0); + clock_gettime(CLOCK_MONOTONIC, &tdone); + + gem_check_spin(spins); + drm_intel_bo_unmap(gem.buf.bo); + + cur_dt = to_dt(&tstart, &tdone); + if (cur_dt > dt) + break; + prev_dt = cur_dt; + } + igt_assert_neq(i, 32); + + /* Linearly interpolate between i 
and i-1 to get target increments */ + spins = 1 << (i-1); /* lower bound spins */ + spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */ + + return spins; +} + +static void +gem_init(void) +{ + gem.drm_fd = drm_open_any(); + gem.init = 1; + + gem.devid = intel_get_drm_devid(gem.drm_fd); + gem.gen = intel_gen(gem.devid); + gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd); + + gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096); + igt_assert(gem.bufmgr); + gem.init = 2; + + drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr); + + gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid); + igt_assert(gem.batch); + gem.init = 3; + + gem.spinfunc = igt_get_media_spinfunc(gem.devid); + igt_assert(gem.spinfunc); + + gem.buf.stride = sizeof(uint32_t); + gem.buf.tiling = I915_TILING_NONE; + gem.buf.size = gem.buf.stride; + gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096); + igt_assert(gem.buf.bo); + gem.init = 4; + + gem.spins_per_msec = gem_get_target_spins(100) / 100; +} + +static void +gem_deinit(void) +{ + switch (gem.init) + { + case 4: + drm_intel_bo_unmap(gem.buf.bo); + drm_intel_bo_unreference(gem.buf.bo); + case 3: + intel_batchbuffer_free(gem.batch); + case 2: + drm_intel_bufmgr_destroy(gem.bufmgr); + case 1: + close(gem.drm_fd); + } +} + +static void +check_full_enable(struct status *stat) +{ + igt_assert_eq(stat->hw.slice_total, stat->info.slice_total); + igt_assert_eq(stat->hw.subslice_total, stat->info.subslice_total); + igt_assert_eq(stat->hw.subslice_per, stat->info.subslice_per); + + /* + * EU are powered in pairs, but it is possible for one EU in the pair + * to be non-functional due to fusing. The determination of enabled + * EU does not account for this and can therefore actually exceed the + * available count. Allow for this small discrepancy in our + * comparison. 
+ */ + igt_assert_lte(stat->info.eu_total, stat->hw.eu_total); + igt_assert_lte(stat->info.eu_per, stat->hw.eu_per); +} + +static void +full_enable(void) +{ + struct status stat; + const int spin_msec = 10; + int ret, spins; + + /* Simulation doesn't currently model slice/subslice/EU power gating. */ + igt_skip_on_simulation(); + + /* + * Gen9 SKL is the first case in which render power gating can leave + * slice/subslice/EU in a partially enabled state upon resumption of + * render work. So start checking that this is prevented as of Gen9. + */ + igt_require(gem.gen >= 9); + + spins = spin_msec * gem.spins_per_msec; + + gem.spinfunc(gem.batch, &gem.buf, spins); + + usleep(2000); /* 2ms wait to make sure batch is running */ + dbg_get_status(&stat); + + ret = drm_intel_bo_map(gem.buf.bo, 0); + igt_assert_eq(ret, 0); + + gem_check_spin(spins); + drm_intel_bo_unmap(gem.buf.bo); + + check_full_enable(&stat); +} + +static void +exit_handler(int sig) +{ + gem_deinit(); + dbg_deinit(); +} + +igt_main +{ + igt_fixture { + igt_install_exit_handler(exit_handler); + + dbg_init(); + gem_init(); + } + + igt_subtest("full-enable") + full_enable(); +} -- 2.3.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH i-g-t 2/2 v3] tests/pm_sseu: Create new test pm_sseu 2015-03-12 17:54 ` [PATCH i-g-t 2/2 v2] " jeff.mcgee @ 2015-03-24 23:20 ` jeff.mcgee 0 siblings, 0 replies; 10+ messages in thread From: jeff.mcgee @ 2015-03-24 23:20 UTC (permalink / raw) To: intel-gfx From: Jeff McGee <jeff.mcgee@intel.com> New test pm_sseu is intended for any subtest related to the slice/subslice/EU power gating feature. The sole initial subtest, 'full-enable', confirms that the slice/subslice/EU state is at full enablement when the render engine is active. Starting with Gen9 SKL, the render power gating feature can leave SSEU in a partially enabled state upon resumption of render work unless explicit action is taken. v2: Add test description and apply recommendations of igt.cocci (Thomas Wood). v3: Skip instead of fail if debugfs entry i915_sseu_status is not available. Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> --- tests/.gitignore | 1 + tests/Makefile.sources | 1 + tests/pm_sseu.c | 376 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 378 insertions(+) create mode 100644 tests/pm_sseu.c diff --git a/tests/.gitignore b/tests/.gitignore index 402e062..35b3289 100644 --- a/tests/.gitignore +++ b/tests/.gitignore @@ -146,6 +146,7 @@ pm_lpsp pm_rc6_residency pm_rpm pm_rps +pm_sseu prime_nv_api prime_nv_pcopy prime_nv_test diff --git a/tests/Makefile.sources b/tests/Makefile.sources index a165978..798cb75 100644 --- a/tests/Makefile.sources +++ b/tests/Makefile.sources @@ -86,6 +86,7 @@ TESTS_progs_M = \ pm_rpm \ pm_rps \ pm_rc6_residency \ + pm_sseu \ prime_self_import \ template \ $(NULL) diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c new file mode 100644 index 0000000..34465db --- /dev/null +++ b/tests/pm_sseu.c @@ -0,0 +1,376 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software 
without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <time.h>
+#include "drmtest.h"
+#include "i915_drm.h"
+#include "intel_io.h"
+#include "intel_bufmgr.h"
+#include "intel_batchbuffer.h"
+#include "intel_chipset.h"
+#include "ioctl_wrappers.h"
+#include "igt_debugfs.h"
+#include "media_spin.h"
+
+IGT_TEST_DESCRIPTION("Tests slice/subslice/EU power gating functionality.\n");
+
+static double
+to_dt(const struct timespec *start, const struct timespec *end)
+{
+	double dt;
+
+	dt = (end->tv_sec - start->tv_sec) * 1e3;
+	dt += (end->tv_nsec - start->tv_nsec) * 1e-6;
+
+	return dt;
+}
+
+struct status {
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+		bool has_slice_pg;
+		bool has_subslice_pg;
+		bool has_eu_pg;
+	} info;
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+	} hw;
+};
+
+#define DBG_STATUS_BUF_SIZE 4096
+
+struct {
+	int init;
+	int status_fd;
+	char status_buf[DBG_STATUS_BUF_SIZE];
+} dbg;
+
+static void
+dbg_get_status_section(const char *title, char **first, char **last)
+{
+	char *pos;
+
+	*first = strstr(dbg.status_buf, title);
+	igt_assert(*first != NULL);
+
+	pos = *first;
+	do {
+		pos = strchr(pos, '\n');
+		igt_assert(pos != NULL);
+		pos++;
+	} while (*pos == ' '); /* lines in the section begin with a space */
+	*last = pos - 1;
+}
+
+static int
+dbg_get_int(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos != last);
+
+	return strtol(pos, &pos, 10);
+}
+
+static bool
+dbg_get_bool(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos < last);
+
+	if (*pos == 'y')
+		return true;
+	if (*pos == 'n')
+		return false;
+
+	igt_assert_f(false, "Could not read boolean value for %s.\n", name);
+	return false;
+}
+
+static void
+dbg_get_status(struct status *stat)
+{
+	char *first, *last;
+	int nread;
+
+	lseek(dbg.status_fd, 0, SEEK_SET);
+	nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE);
+	igt_assert_lt(nread, DBG_STATUS_BUF_SIZE);
+	dbg.status_buf[nread] = '\0';
+
+	memset(stat, 0, sizeof(*stat));
+
+	dbg_get_status_section("SSEU Device Info", &first, &last);
+	stat->info.slice_total =
+		dbg_get_int(first, last, "Available Slice Total:");
+	stat->info.subslice_total =
+		dbg_get_int(first, last, "Available Subslice Total:");
+	stat->info.subslice_per =
+		dbg_get_int(first, last, "Available Subslice Per Slice:");
+	stat->info.eu_total =
+		dbg_get_int(first, last, "Available EU Total:");
+	stat->info.eu_per =
+		dbg_get_int(first, last, "Available EU Per Subslice:");
+	stat->info.has_slice_pg =
+		dbg_get_bool(first, last, "Has Slice Power Gating:");
+	stat->info.has_subslice_pg =
+		dbg_get_bool(first, last, "Has Subslice Power Gating:");
+	stat->info.has_eu_pg =
+		dbg_get_bool(first, last, "Has EU Power Gating:");
+
+	dbg_get_status_section("SSEU Device Status", &first, &last);
+	stat->hw.slice_total =
+		dbg_get_int(first, last, "Enabled Slice Total:");
+	stat->hw.subslice_total =
+		dbg_get_int(first, last, "Enabled Subslice Total:");
+	stat->hw.subslice_per =
+		dbg_get_int(first, last, "Enabled Subslice Per Slice:");
+	stat->hw.eu_total =
+		dbg_get_int(first, last, "Enabled EU Total:");
+	stat->hw.eu_per =
+		dbg_get_int(first, last, "Enabled EU Per Subslice:");
+}
+
+static void
+dbg_init(void)
+{
+	dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY);
+	igt_skip_on_f(dbg.status_fd == -1,
+		      "debugfs entry 'i915_sseu_status' not found\n");
+	dbg.init = 1;
+}
+
+static void
+dbg_deinit(void)
+{
+	switch (dbg.init)
+	{
+	case 1:
+		close(dbg.status_fd);
+	}
+}
+
+struct {
+	int init;
+	int drm_fd;
+	int devid;
+	int gen;
+	int has_ppgtt;
+	drm_intel_bufmgr *bufmgr;
+	struct intel_batchbuffer *batch;
+	igt_media_spinfunc_t spinfunc;
+	struct igt_buf buf;
+	uint32_t spins_per_msec;
+} gem;
+
+static void
+gem_check_spin(uint32_t spins)
+{
+	uint32_t *data;
+
+	data = (uint32_t*)gem.buf.bo->virtual;
+	igt_assert_eq_u32(*data, spins);
+}
+
+static uint32_t
+gem_get_target_spins(double dt)
+{
+	struct timespec tstart, tdone;
+	double prev_dt, cur_dt;
+	uint32_t spins;
+	int i, ret;
+
+	/* Double increments until we bound the target time */
+	prev_dt = 0.0;
+	for (i = 0; i < 32; i++) {
+		spins = 1 << i;
+		clock_gettime(CLOCK_MONOTONIC, &tstart);
+
+		gem.spinfunc(gem.batch, &gem.buf, spins);
+		ret = drm_intel_bo_map(gem.buf.bo, 0);
+		igt_assert_eq(ret, 0);
+		clock_gettime(CLOCK_MONOTONIC, &tdone);
+
+		gem_check_spin(spins);
+		drm_intel_bo_unmap(gem.buf.bo);
+
+		cur_dt = to_dt(&tstart, &tdone);
+		if (cur_dt > dt)
+			break;
+		prev_dt = cur_dt;
+	}
+	igt_assert_neq(i, 32);
+
+	/* Linearly interpolate between i and i-1 to get target increments */
+	spins = 1 << (i-1); /* lower bound spins */
+	spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */
+
+	return spins;
+}
+
+static void
+gem_init(void)
+{
+	gem.drm_fd = drm_open_any();
+	gem.init = 1;
+
+	gem.devid = intel_get_drm_devid(gem.drm_fd);
+	gem.gen = intel_gen(gem.devid);
+	gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd);
+
+	gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096);
+	igt_assert(gem.bufmgr);
+	gem.init = 2;
+
+	drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr);
+
+	gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid);
+	igt_assert(gem.batch);
+	gem.init = 3;
+
+	gem.spinfunc = igt_get_media_spinfunc(gem.devid);
+	igt_assert(gem.spinfunc);
+
+	gem.buf.stride = sizeof(uint32_t);
+	gem.buf.tiling = I915_TILING_NONE;
+	gem.buf.size = gem.buf.stride;
+	gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096);
+	igt_assert(gem.buf.bo);
+	gem.init = 4;
+
+	gem.spins_per_msec = gem_get_target_spins(100) / 100;
+}
+
+static void
+gem_deinit(void)
+{
+	switch (gem.init)
+	{
+	case 4:
+		drm_intel_bo_unmap(gem.buf.bo);
+		drm_intel_bo_unreference(gem.buf.bo);
+	case 3:
+		intel_batchbuffer_free(gem.batch);
+	case 2:
+		drm_intel_bufmgr_destroy(gem.bufmgr);
+	case 1:
+		close(gem.drm_fd);
+	}
+}
+
+static void
+check_full_enable(struct status *stat)
+{
+	igt_assert_eq(stat->hw.slice_total, stat->info.slice_total);
+	igt_assert_eq(stat->hw.subslice_total, stat->info.subslice_total);
+	igt_assert_eq(stat->hw.subslice_per, stat->info.subslice_per);
+
+	/*
+	 * EU are powered in pairs, but it is possible for one EU in the pair
+	 * to be non-functional due to fusing. The determination of enabled
+	 * EU does not account for this and can therefore actually exceed the
+	 * available count. Allow for this small discrepancy in our
+	 * comparison.
+	 */
+	igt_assert_lte(stat->info.eu_total, stat->hw.eu_total);
+	igt_assert_lte(stat->info.eu_per, stat->hw.eu_per);
+}
+
+static void
+full_enable(void)
+{
+	struct status stat;
+	const int spin_msec = 10;
+	int ret, spins;
+
+	/* Simulation doesn't currently model slice/subslice/EU power gating. */
+	igt_skip_on_simulation();
+
+	/*
+	 * Gen9 SKL is the first case in which render power gating can leave
+	 * slice/subslice/EU in a partially enabled state upon resumption of
+	 * render work. So start checking that this is prevented as of Gen9.
+	 */
+	igt_require(gem.gen >= 9);
+
+	spins = spin_msec * gem.spins_per_msec;
+
+	gem.spinfunc(gem.batch, &gem.buf, spins);
+
+	usleep(2000); /* 2ms wait to make sure batch is running */
+	dbg_get_status(&stat);
+
+	ret = drm_intel_bo_map(gem.buf.bo, 0);
+	igt_assert_eq(ret, 0);
+
+	gem_check_spin(spins);
+	drm_intel_bo_unmap(gem.buf.bo);
+
+	check_full_enable(&stat);
+}
+
+static void
+exit_handler(int sig)
+{
+	gem_deinit();
+	dbg_deinit();
+}
+
+igt_main
+{
+	igt_fixture {
+		igt_install_exit_handler(exit_handler);
+
+		dbg_init();
+		gem_init();
+	}
+
+	igt_subtest("full-enable")
+		full_enable();
+}
--
2.3.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread
end of thread, other threads:[~2015-03-25 18:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-10 21:17 [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+ jeff.mcgee
2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee
2015-03-12 17:52   ` [PATCH i-g-t 1/2 v2] " jeff.mcgee
2015-03-25  2:50     ` He, Shuang
2015-03-25 18:07       ` Thomas Wood
2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee
2015-03-12 12:09   ` Thomas Wood
2015-03-18 16:51     ` Jeff McGee
2015-03-12 17:54   ` [PATCH i-g-t 2/2 v2] " jeff.mcgee
2015-03-24 23:20     ` [PATCH i-g-t 2/2 v3] " jeff.mcgee