* [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy
@ 2018-06-11 16:14 Ville Syrjala
From: Ville Syrjala @ 2018-06-11 16:14 UTC
To: igt-dev
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add a rendercopy implementation for gen4/5. The basic structure is
copied from the gen6 implementation, and the gen4/5-specific bits
were mostly lifted from SNA (xf86-video-intel).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
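Reviewer aid, not part of the patch: a minimal standalone sketch of three bits of arithmetic the patch relies on. These are the GEN4_3D() command encoding with the "total dwords - 2" length convention used by the OUT_BATCH() calls, the GEN4_GRF_BLOCKS() rounding, and the cumulative URB fence layout from gen4_emit_urb(). The two macros mirror the patch; gen4_cmd_header(), struct urb_fences and gen4_urb_fences() are hypothetical names introduced only for illustration, and the entry sizes are hard-coded from the URB_*_ENTRY_SIZE defines.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as GEN4_3D() in the patch's lib/gen4_render.h. */
#define GEN4_3D(pipeline, opcode, subopcode) \
	((3u << 29) | ((uint32_t)(pipeline) << 27) | \
	 ((uint32_t)(opcode) << 24) | ((uint32_t)(subopcode) << 16))

/* Same rounding as GEN4_GRF_BLOCKS() in lib/rendercopy_gen4.c:
 * registers are counted in blocks of 16, and the result is blocks - 1. */
#define GEN4_GRF_BLOCKS(nreg) (((nreg) + 15) / 16 - 1)

/* Hypothetical helper: the patch writes this inline as "CMD | (n - 2)". */
static inline uint32_t gen4_cmd_header(uint32_t cmd, uint32_t total_dwords)
{
	assert(total_dwords >= 2);
	return cmd | (total_dwords - 2);
}

/* Hypothetical struct: cumulative URB fence positions. */
struct urb_fences {
	int vs_end, gs_end, cl_end, sf_end, cs_end;
};

/* Hypothetical helper following gen4_emit_urb(): GS, CLIP and CS get
 * zero entries, VS entries have size 1 and SF entries have size 2. */
static inline struct urb_fences gen4_urb_fences(int vs_entries, int sf_entries)
{
	struct urb_fences f;

	f.vs_end = vs_entries * 1;		/* URB_VS_ENTRY_SIZE */
	f.gs_end = f.vs_end;			/* gs_entries == 0 */
	f.cl_end = f.gs_end;			/* cl_entries == 0 */
	f.sf_end = f.cl_end + sf_entries * 2;	/* URB_SF_ENTRY_SIZE */
	f.cs_end = f.sf_end;			/* cs_entries == 0 */
	return f;
}
```

With the limits returned by the helpers in the patch, gen4 (32 VS and 64 SF entries) ends at 160 of 256 URB rows and gen5 (256 VS and 128 SF entries) ends at 512 of 1024, so the assert in gen4_emit_urb() should hold on both.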
lib/Makefile.sources | 2 +
lib/gen4_render.h | 628 ++++++++++++++++++++++++++++++++++++++++++
lib/intel_batchbuffer.c | 2 +
lib/meson.build | 1 +
lib/rendercopy.h | 5 +
lib/rendercopy_gen4.c | 704 ++++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 1342 insertions(+)
create mode 100644 lib/gen4_render.h
create mode 100644 lib/rendercopy_gen4.c
diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 042c1d3bb44a..e0ebd02c1661 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -71,10 +71,12 @@ lib_source_list = \
gen8_media.h \
rendercopy_i915.c \
rendercopy_i830.c \
+ gen4_render.h \
gen6_render.h \
gen7_render.h \
gen8_render.h \
gen9_render.h \
+ rendercopy_gen4.c \
rendercopy_gen6.c \
rendercopy_gen7.c \
rendercopy_gen8.c \
diff --git a/lib/gen4_render.h b/lib/gen4_render.h
new file mode 100644
index 000000000000..ab1158e3c6d2
--- /dev/null
+++ b/lib/gen4_render.h
@@ -0,0 +1,628 @@
+#ifndef GEN4_RENDER_H
+#define GEN4_RENDER_H
+
+#include <stdint.h>
+
+#define GEN4_3D(Pipeline,Opcode,Subopcode) ((3 << 29) | \
+ ((Pipeline) << 27) | \
+ ((Opcode) << 24) | \
+ ((Subopcode) << 16))
+
+#define GEN4_URB_FENCE GEN4_3D(0, 0, 0)
+# define UF0_CS_REALLOC (1 << 13)
+# define UF0_VFE_REALLOC (1 << 12)
+# define UF0_SF_REALLOC (1 << 11)
+# define UF0_CLIP_REALLOC (1 << 10)
+# define UF0_GS_REALLOC (1 << 9)
+# define UF0_VS_REALLOC (1 << 8)
+# define UF1_CLIP_FENCE_SHIFT 20
+# define UF1_GS_FENCE_SHIFT 10
+# define UF1_VS_FENCE_SHIFT 0
+# define UF2_CS_FENCE_SHIFT 20
+# define UF2_VFE_FENCE_SHIFT 10
+# define UF2_SF_FENCE_SHIFT 0
+
+#define GEN4_CS_URB_STATE GEN4_3D(0, 0, 1)
+
+#define GEN4_STATE_BASE_ADDRESS GEN4_3D(0, 1, 1)
+# define BASE_ADDRESS_MODIFY (1 << 0)
+
+#define GEN4_STATE_SIP GEN4_3D(0, 1, 2)
+
+#define GEN4_PIPELINE_SELECT GEN4_3D(0, 1, 4)
+#define G4X_PIPELINE_SELECT GEN4_3D(1, 1, 4)
+# define PIPELINE_SELECT_3D 0
+# define PIPELINE_SELECT_MEDIA 1
+
+#define GEN4_3DSTATE_PIPELINED_POINTERS GEN4_3D(3, 0, 0)
+# define GEN4_GS_DISABLE 0
+# define GEN4_GS_ENABLE 1
+# define GEN4_CLIP_DISABLE 0
+# define GEN4_CLIP_ENABLE 1
+
+#define GEN4_3DSTATE_BINDING_TABLE_POINTERS GEN4_3D(3, 0, 1)
+
+#define GEN4_3DSTATE_VERTEX_BUFFERS GEN4_3D(3, 0, 8)
+# define VB0_BUFFER_INDEX_SHIFT 27
+# define VB0_VERTEXDATA (0 << 26)
+# define VB0_INSTANCEDATA (1 << 26)
+# define VB0_BUFFER_PITCH_SHIFT 0
+
+#define GEN4_3DSTATE_VERTEX_ELEMENTS GEN4_3D(3, 0, 9)
+# define VE0_VERTEX_BUFFER_INDEX_SHIFT 27
+# define VE0_VALID (1 << 26)
+# define VE0_FORMAT_SHIFT 16
+# define VE0_OFFSET_SHIFT 0
+# define VE1_VFCOMPONENT_0_SHIFT 28
+# define VE1_VFCOMPONENT_1_SHIFT 24
+# define VE1_VFCOMPONENT_2_SHIFT 20
+# define VE1_VFCOMPONENT_3_SHIFT 16
+# define VE1_DESTINATION_ELEMENT_OFFSET_SHIFT 0
+
+#define GEN4_VFCOMPONENT_NOSTORE 0
+#define GEN4_VFCOMPONENT_STORE_SRC 1
+#define GEN4_VFCOMPONENT_STORE_0 2
+#define GEN4_VFCOMPONENT_STORE_1_FLT 3
+#define GEN4_VFCOMPONENT_STORE_1_INT 4
+#define GEN4_VFCOMPONENT_STORE_VID 5
+#define GEN4_VFCOMPONENT_STORE_IID 6
+#define GEN4_VFCOMPONENT_STORE_PID 7
+
+#define GEN4_3DSTATE_DRAWING_RECTANGLE GEN4_3D(3, 1, 0)
+
+#define GEN4_3DSTATE_DEPTH_BUFFER GEN4_3D(3, 1, 5)
+# define GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT 29
+# define GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT 18
+
+#define GEN4_DEPTHFORMAT_D32_FLOAT_S8X24_UINT 0
+#define GEN4_DEPTHFORMAT_D32_FLOAT 1
+#define GEN4_DEPTHFORMAT_D24_UNORM_S8_UINT 2
+#define GEN4_DEPTHFORMAT_D24_UNORM_X8_UINT 3
+#define GEN4_DEPTHFORMAT_D16_UNORM 5
+
+#define GEN4_3DSTATE_CLEAR_PARAMS GEN4_3D(3, 1, 0x10)
+# define GEN4_3DSTATE_DEPTH_CLEAR_VALID (1 << 15)
+
+#define GEN4_3DPRIMITIVE GEN4_3D(3, 3, 0)
+# define GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL (0 << 15)
+# define GEN4_3DPRIMITIVE_VERTEX_RANDOM (1 << 15)
+# define GEN4_3DPRIMITIVE_TOPOLOGY_SHIFT 10
+
+#define _3DPRIM_POINTLIST 0x01
+#define _3DPRIM_LINELIST 0x02
+#define _3DPRIM_LINESTRIP 0x03
+#define _3DPRIM_TRILIST 0x04
+#define _3DPRIM_TRISTRIP 0x05
+#define _3DPRIM_TRIFAN 0x06
+#define _3DPRIM_QUADLIST 0x07
+#define _3DPRIM_QUADSTRIP 0x08
+#define _3DPRIM_LINELIST_ADJ 0x09
+#define _3DPRIM_LINESTRIP_ADJ 0x0A
+#define _3DPRIM_TRILIST_ADJ 0x0B
+#define _3DPRIM_TRISTRIP_ADJ 0x0C
+#define _3DPRIM_TRISTRIP_REVERSE 0x0D
+#define _3DPRIM_POLYGON 0x0E
+#define _3DPRIM_RECTLIST 0x0F
+#define _3DPRIM_LINELOOP 0x10
+#define _3DPRIM_POINTLIST_BF 0x11
+#define _3DPRIM_LINESTRIP_CONT 0x12
+#define _3DPRIM_LINESTRIP_BF 0x13
+#define _3DPRIM_LINESTRIP_CONT_BF 0x14
+#define _3DPRIM_TRIFAN_NOSTIPPLE 0x15
+
+#define GEN4_CULLMODE_BOTH 0
+#define GEN4_CULLMODE_NONE 1
+#define GEN4_CULLMODE_FRONT 2
+#define GEN4_CULLMODE_BACK 3
+
+#define GEN4_BORDER_COLOR_MODE_DEFAULT 0
+#define GEN4_BORDER_COLOR_MODE_LEGACY 1
+
+#define GEN4_MAPFILTER_NEAREST 0
+#define GEN4_MAPFILTER_LINEAR 1
+#define GEN4_MAPFILTER_ANISOTROPIC 2
+#define GEN4_MAPFILTER_MONO 6
+
+#define GEN4_MIPFILTER_NONE 0
+#define GEN4_MIPFILTER_NEAREST 1
+#define GEN4_MIPFILTER_LINEAR 3
+
+#define GEN4_PREFILTER_ALWAYS 0
+#define GEN4_PREFILTER_NEVER 1
+#define GEN4_PREFILTER_LESS 2
+#define GEN4_PREFILTER_EQUAL 3
+#define GEN4_PREFILTER_LEQUAL 4
+#define GEN4_PREFILTER_GREATER 5
+#define GEN4_PREFILTER_NOTEQUAL 6
+#define GEN4_PREFILTER_GEQUAL 7
+
+#define GEN4_TEXCOORDMODE_WRAP 0
+#define GEN4_TEXCOORDMODE_MIRROR 1
+#define GEN4_TEXCOORDMODE_CLAMP 2
+#define GEN4_TEXCOORDMODE_CUBE 3
+#define GEN4_TEXCOORDMODE_CLAMP_BORDER 4
+#define GEN4_TEXCOORDMODE_MIRROR_ONCE 5
+
+#define GEN4_LOD_PRECLAMP_D3D 0
+#define GEN4_LOD_PRECLAMP_OGL 1
+
+/* The hardware supports two different modes for border color. The
+ * default (OpenGL) mode uses floating-point color channels, while the
+ * legacy mode uses 4 bytes.
+ *
+ * More significantly, the legacy mode respects the components of the
+ * border color for channels not present in the source, (whereas the
+ * default mode will ignore the border color's alpha channel and use
+ * alpha==1 for an RGB source, for example).
+ *
+ * The legacy mode matches the semantics specified by the Render
+ * extension.
+ */
+struct gen4_sampler_default_border_color {
+ float color[4];
+};
+
+struct gen4_sampler_legacy_border_color {
+ uint8_t color[4];
+};
+
+struct gen4_sampler_state {
+ struct {
+ uint32_t shadow_function:3;
+ uint32_t lod_bias:11;
+ uint32_t min_filter:3;
+ uint32_t mag_filter:3;
+ uint32_t mip_filter:2;
+ uint32_t base_level:5;
+ uint32_t pad0:1;
+ uint32_t lod_preclamp:1;
+ uint32_t border_color_mode:1;
+ uint32_t pad1:1;
+ uint32_t disable:1;
+ } ss0;
+
+ struct {
+ uint32_t r_wrap_mode:3;
+ uint32_t t_wrap_mode:3;
+ uint32_t s_wrap_mode:3;
+ uint32_t cube_ctlr_mode:1;
+ uint32_t pad:2;
+ uint32_t max_lod:10;
+ uint32_t min_lod:10;
+ } ss1;
+
+ struct {
+ uint32_t pad:5;
+ uint32_t border_color_pointer:27;
+ } ss2;
+
+ struct {
+ uint32_t pad:13;
+ uint32_t address_rounding_enable:6;
+ uint32_t max_aniso:3;
+ uint32_t chroma_key_mode:1;
+ uint32_t chroma_key_index:2;
+ uint32_t chroma_key_enable:1;
+ uint32_t monochrome_filter_width:3;
+ uint32_t monochrome_filter_height:3;
+ } ss3;
+};
+
+typedef enum {
+ SAMPLER_FILTER_NEAREST = 0,
+ SAMPLER_FILTER_BILINEAR,
+ FILTER_COUNT
+} sampler_filter_t;
+
+typedef enum {
+ SAMPLER_EXTEND_NONE = 0,
+ SAMPLER_EXTEND_REPEAT,
+ SAMPLER_EXTEND_PAD,
+ SAMPLER_EXTEND_REFLECT,
+ EXTEND_COUNT
+} sampler_extend_t;
+
+struct gen4_surface_state {
+ struct {
+ unsigned int cube_pos_z:1;
+ unsigned int cube_neg_z:1;
+ unsigned int cube_pos_y:1;
+ unsigned int cube_neg_y:1;
+ unsigned int cube_pos_x:1;
+ unsigned int cube_neg_x:1;
+ unsigned int media_boundary_pixel_mode:2;
+ unsigned int render_cache_read_mode:1;
+ unsigned int cube_corner_mode:1;
+ unsigned int mipmap_layout_mode:1;
+ unsigned int vert_line_stride_ofs:1;
+ unsigned int vert_line_stride:1;
+ unsigned int color_blend:1;
+ unsigned int writedisable_blue:1;
+ unsigned int writedisable_green:1;
+ unsigned int writedisable_red:1;
+ unsigned int writedisable_alpha:1;
+ unsigned int surface_format:9;
+ unsigned int data_return_format:1;
+ unsigned int pad0:1;
+ unsigned int surface_type:3;
+ } ss0;
+
+ struct {
+ unsigned int base_addr;
+ } ss1;
+
+ struct {
+ unsigned int render_target_rotation:2;
+ unsigned int mip_count:4;
+ unsigned int width:13;
+ unsigned int height:13;
+ } ss2;
+
+ struct {
+ unsigned int tile_walk:1;
+ unsigned int tiled_surface:1;
+ unsigned int pad0:1;
+ unsigned int pitch:17;
+ unsigned int pad1:1;
+ unsigned int depth:11;
+ } ss3;
+
+ struct {
+ unsigned int pad:8;
+ unsigned int render_target_view_extent:9;
+ unsigned int min_array_elt:11;
+ unsigned int min_lod:4;
+ } ss4;
+
+ struct {
+ unsigned int pad:20;
+ unsigned int y_offset:4;
+ unsigned int pad1:1;
+ unsigned int x_offset:7;
+ } ss5;
+};
+
+struct gen4_cc_viewport {
+ float min_depth;
+ float max_depth;
+};
+
+struct gen4_vs_state {
+ struct {
+ unsigned int pad0:1;
+ unsigned int grf_reg_count:3;
+ unsigned int pad1:2;
+ unsigned int kernel_start_pointer:26;
+ } vs0;
+
+ struct {
+ unsigned int pad0:7;
+ unsigned int sw_exception_enable:1;
+ unsigned int pad1:3;
+ unsigned int mask_stack_exception_enable:1;
+ unsigned int pad2:1;
+ unsigned int illegal_op_exception_enable:1;
+ unsigned int pad3:2;
+ unsigned int floating_point_mode:1;
+ unsigned int thread_priority:1;
+ unsigned int binding_table_entry_count:8;
+ unsigned int pad4:5;
+ unsigned int single_program_flow:1;
+ } vs1;
+
+ struct {
+ unsigned int per_thread_scratch_space:4;
+ unsigned int pad0:6;
+ unsigned int scratch_space_pointer:22;
+ } vs2;
+
+ struct {
+ unsigned int dispatch_grf_start_reg:4;
+ unsigned int urb_entry_read_offset:6;
+ unsigned int pad0:1;
+ unsigned int urb_entry_read_length:6;
+ unsigned int pad1:1;
+ unsigned int const_urb_entry_read_offset:6;
+ unsigned int pad2:1;
+ unsigned int const_urb_entry_read_length:6;
+ unsigned int pad3:1;
+ } vs3;
+
+ struct {
+ unsigned int pad0:10;
+ unsigned int stats_enable:1;
+ unsigned int nr_urb_entries:7;
+ unsigned int pad1:1;
+ unsigned int urb_entry_allocation_size:5;
+ unsigned int pad2:1;
+ unsigned int max_threads:6;
+ unsigned int pad3:1;
+ } vs4;
+
+ struct {
+ unsigned int sampler_count:3;
+ unsigned int pad:2;
+ unsigned int sampler_state_pointer:27;
+ } vs5;
+
+ struct {
+ unsigned int vs_enable:1;
+ unsigned int vert_cache_disable:1;
+ unsigned int pad:30;
+ } vs6;
+};
+
+struct gen4_sf_state {
+ struct {
+ unsigned int pad0:1;
+ unsigned int grf_reg_count:3;
+ unsigned int pad1:2;
+ unsigned int kernel_start_pointer:26;
+ } sf0;
+
+ struct {
+ unsigned int barycentric_interp:1; /* ilk */
+ unsigned int pad0:6;
+ unsigned int sw_exception_enable:1;
+ unsigned int pad1:3;
+ unsigned int mask_stack_exception_enable:1;
+ unsigned int pad2:1;
+ unsigned int illegal_op_exception_enable:1;
+ unsigned int pad3:2;
+ unsigned int floating_point_mode:1;
+ unsigned int thread_priority:1;
+ unsigned int binding_table_entry_count:8;
+ unsigned int pad4:6;
+ } sf1;
+
+ struct {
+ unsigned int per_thread_scratch_space:4;
+ unsigned int pad0:6;
+ unsigned int scratch_space_pointer:22;
+ } sf2;
+
+ struct {
+ unsigned int dispatch_grf_start_reg:4;
+ unsigned int urb_entry_read_offset:6;
+ unsigned int pad0:1;
+ unsigned int urb_entry_read_length:7;
+ unsigned int const_urb_entry_read_offset:6;
+ unsigned int pad1:1;
+ unsigned int const_urb_entry_read_length:6;
+ unsigned int pad2:1;
+ } sf3;
+
+ struct {
+ unsigned int pad0:10;
+ unsigned int stats_enable:1;
+ unsigned int nr_urb_entries:8;
+ unsigned int urb_entry_allocation_size:6;
+ unsigned int max_threads:6;
+ unsigned int pad2:1;
+ } sf4;
+
+ struct {
+ unsigned int front_winding:1;
+ unsigned int viewport_transform:1;
+ unsigned int pad:3;
+ unsigned int sf_viewport_state_offset:27;
+ } sf5;
+
+ struct {
+ unsigned int pad:9;
+ unsigned int dest_org_vbias:4;
+ unsigned int dest_org_hbias:4;
+ unsigned int scissor:1;
+ unsigned int disable_2x2_trifilter:1;
+ unsigned int disable_zero_trifilter:1;
+ unsigned int point_rast_rule:2;
+ unsigned int line_endcap_aa_region_width:2;
+ unsigned int line_width:4;
+ unsigned int fast_scissor_disable:1;
+ unsigned int cull_mode:2;
+ unsigned int aa_enable:1;
+ } sf6;
+
+ struct {
+ unsigned int point_size:11;
+ unsigned int use_point_size_state:1;
+ unsigned int subpixel_precision:1;
+ unsigned int sprite_point:1;
+ unsigned int aa_line_dist_mode:1;
+ unsigned int pad:10;
+ unsigned int trifan_pv:2;
+ unsigned int linestrip_pv:2;
+ unsigned int tristrip_pv:2;
+ unsigned int line_last_pixel_enable:1;
+ } sf7;
+};
+
+struct gen4_wm_state {
+ struct {
+ unsigned int pad0:1;
+ unsigned int grf_reg_count:3;
+ unsigned int pad1:2;
+ unsigned int kernel_start_pointer:26;
+ } wm0;
+
+ struct {
+ unsigned int pad0:1;
+ unsigned int sw_exception_enable:1;
+ unsigned int mask_stack_exception_enable:1;
+ unsigned int pad2:1;
+ unsigned int illegal_op_exception_enable:1;
+ unsigned int pad3:3;
+ unsigned int depth_coeff_urb_read_offset:6;
+ unsigned int pad4:2;
+ unsigned int floating_point_mode:1;
+ unsigned int thread_priority:1;
+ unsigned int binding_table_entry_count:8;
+ unsigned int pad5:5;
+ unsigned int single_program_flow:1;
+ } wm1;
+
+ struct {
+ unsigned int per_thread_scratch_space:4;
+ unsigned int pad0:6;
+ unsigned int scratch_space_pointer:22;
+ } wm2;
+
+ struct {
+ unsigned int dispatch_grf_start_reg:4;
+ unsigned int urb_entry_read_offset:6;
+ unsigned int pad0:1;
+ unsigned int urb_entry_read_length:7;
+ unsigned int const_urb_entry_read_offset:6;
+ unsigned int pad1:1;
+ unsigned int const_urb_entry_read_length:6;
+ unsigned int pad2:1;
+ } wm3;
+
+ struct {
+ unsigned int stats_enable:1;
+ unsigned int pad0:1;
+ unsigned int sampler_count:3;
+ unsigned int sampler_state_pointer:27;
+ } wm4;
+
+ struct {
+ unsigned int enable_8_pix:1;
+ unsigned int enable_16_pix:1;
+ unsigned int enable_32_pix:1;
+ unsigned int enable_cont_32_pix:1; /* ctg+ */
+ unsigned int enable_cont_64_pix:1; /* ctg+ */
+ unsigned int pad0:1;
+ unsigned int fast_span_coverage:1; /* ilk */
+ unsigned int depth_clear:1; /* ilk */
+ unsigned int depth_resolve:1; /* ilk */
+ unsigned int hier_depth_resolve:1; /* ilk */
+ unsigned int legacy_global_depth_bias:1;
+ unsigned int line_stipple:1;
+ unsigned int depth_offset:1;
+ unsigned int polygon_stipple:1;
+ unsigned int line_aa_region_width:2;
+ unsigned int line_endcap_aa_region_width:2;
+ unsigned int early_depth_test:1;
+ unsigned int thread_dispatch_enable:1;
+ unsigned int program_uses_depth:1;
+ unsigned int program_computes_depth:1;
+ unsigned int program_uses_killpixel:1;
+ unsigned int legacy_line_rast:1;
+ unsigned int transposed_urb_read:1;
+ unsigned int max_threads:7;
+ } wm5;
+
+ struct {
+ float global_depth_offset_constant;
+ } wm6;
+
+ struct {
+ float global_depth_offset_scale;
+ } wm7;
+
+ /* ilk only from now on */
+ struct {
+ unsigned int pad0:1;
+ unsigned int grf_reg_count_1:3;
+ unsigned int pad1:2;
+ unsigned int kernel_start_pointer_1:26;
+ } wm8;
+
+ struct {
+ unsigned int pad0:1;
+ unsigned int grf_reg_count_2:3;
+ unsigned int pad1:2;
+ unsigned int kernel_start_pointer_2:26;
+ } wm9;
+
+ struct {
+ unsigned int pad0:1;
+ unsigned int grf_reg_count_3:3;
+ unsigned int pad1:2;
+ unsigned int kernel_start_pointer_3:26;
+ } wm10;
+};
+
+struct gen4_color_calc_state {
+ struct {
+ unsigned int pad0:3;
+ unsigned int bf_stencil_pass_depth_pass_op:3;
+ unsigned int bf_stencil_pass_depth_fail_op:3;
+ unsigned int bf_stencil_fail_op:3;
+ unsigned int bf_stencil_func:3;
+ unsigned int bf_stencil_enable:1;
+ unsigned int pad1:2;
+ unsigned int stencil_write_enable:1;
+ unsigned int stencil_pass_depth_pass_op:3;
+ unsigned int stencil_pass_depth_fail_op:3;
+ unsigned int stencil_fail_op:3;
+ unsigned int stencil_func:3;
+ unsigned int stencil_enable:1;
+ } cc0;
+
+ struct {
+ unsigned int bf_stencil_ref:8;
+ unsigned int stencil_write_mask:8;
+ unsigned int stencil_test_mask:8;
+ unsigned int stencil_ref:8;
+ } cc1;
+
+ struct {
+ unsigned int logicop_enable:1;
+ unsigned int pad0:10;
+ unsigned int depth_write_enable:1;
+ unsigned int depth_test_function:3;
+ unsigned int depth_test:1;
+ unsigned int bf_stencil_write_mask:8;
+ unsigned int bf_stencil_test_mask:8;
+ } cc2;
+
+ struct {
+ unsigned int pad0:8;
+ unsigned int alpha_test_func:3;
+ unsigned int alpha_test:1;
+ unsigned int blend_enable:1;
+ unsigned int ia_blend_enable:1;
+ unsigned int pad1:1;
+ unsigned int alpha_test_format:1;
+ unsigned int pad2:16;
+ } cc3;
+
+ struct {
+ unsigned int pad0:5;
+ unsigned int cc_viewport_state_offset:27;
+ } cc4;
+
+ struct {
+ unsigned int pad0:2;
+ unsigned int ia_dest_blend_factor:5;
+ unsigned int ia_src_blend_factor:5;
+ unsigned int ia_blend_function:3;
+ unsigned int stats_enable:1;
+ unsigned int logicop_func:4;
+ unsigned int pad1:10;
+ unsigned int round_disable:1;
+ unsigned int dither_enable:1;
+ } cc5;
+
+ struct {
+ unsigned int clamp_post_alpha_blend:1;
+ unsigned int clamp_pre_alpha_blend:1;
+ unsigned int clamp_range:2;
+ unsigned int pad0:11;
+ unsigned int y_dither_offset:2;
+ unsigned int x_dither_offset:2;
+ unsigned int dest_blend_factor:5;
+ unsigned int src_blend_factor:5;
+ unsigned int blend_function:3;
+ } cc6;
+
+ struct {
+ union {
+ float f;
+ unsigned char ub[4];
+ } alpha_ref;
+ } cc7;
+};
+
+#endif
diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index a85c760c6242..8eb101cdcf80 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -833,6 +833,8 @@ igt_render_copyfunc_t igt_get_render_copyfunc(int devid)
copy = gen2_render_copyfunc;
else if (IS_GEN3(devid))
copy = gen3_render_copyfunc;
+ else if (IS_GEN4(devid) || IS_GEN5(devid))
+ copy = gen4_render_copyfunc;
else if (IS_GEN6(devid))
copy = gen6_render_copyfunc;
else if (IS_GEN7(devid))
diff --git a/lib/meson.build b/lib/meson.build
index 1a355414ec5b..78590c0b5630 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -32,6 +32,7 @@ lib_sources = [
'gpu_cmds.c',
'rendercopy_i915.c',
'rendercopy_i830.c',
+ 'rendercopy_gen4.c',
'rendercopy_gen6.c',
'rendercopy_gen7.c',
'rendercopy_gen8.c',
diff --git a/lib/rendercopy.h b/lib/rendercopy.h
index fdc3cabb20d7..19b29dbb118d 100644
--- a/lib/rendercopy.h
+++ b/lib/rendercopy.h
@@ -43,6 +43,11 @@ void gen6_render_copyfunc(struct intel_batchbuffer *batch,
struct igt_buf *src, unsigned src_x, unsigned src_y,
unsigned width, unsigned height,
struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
+void gen4_render_copyfunc(struct intel_batchbuffer *batch,
+ drm_intel_context *context,
+ struct igt_buf *src, unsigned src_x, unsigned src_y,
+ unsigned width, unsigned height,
+ struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
void gen3_render_copyfunc(struct intel_batchbuffer *batch,
drm_intel_context *context,
struct igt_buf *src, unsigned src_x, unsigned src_y,
diff --git a/lib/rendercopy_gen4.c b/lib/rendercopy_gen4.c
new file mode 100644
index 000000000000..acd4be8de9da
--- /dev/null
+++ b/lib/rendercopy_gen4.c
@@ -0,0 +1,704 @@
+#include "rendercopy.h"
+#include "intel_chipset.h"
+#include "gen4_render.h"
+#include "surfaceformat.h"
+
+#include <assert.h>
+
+#define VERTEX_SIZE (3*4)
+
+#define URB_VS_ENTRY_SIZE 1
+#define URB_GS_ENTRY_SIZE 0
+#define URB_CL_ENTRY_SIZE 0
+#define URB_SF_ENTRY_SIZE 2
+#define URB_CS_ENTRY_SIZE 1
+
+#define GEN4_GRF_BLOCKS(nreg) (((nreg) + 15) / 16 - 1)
+#define SF_KERNEL_NUM_GRF 16
+#define PS_KERNEL_NUM_GRF 32
+
+static const uint32_t gen4_sf_kernel_nomask[][4] = {
+ { 0x00400031, 0x20c01fbd, 0x0069002c, 0x01110001 },
+ { 0x00600001, 0x206003be, 0x00690060, 0x00000000 },
+ { 0x00600040, 0x20e077bd, 0x00690080, 0x006940a0 },
+ { 0x00600041, 0x202077be, 0x008d00e0, 0x000000c0 },
+ { 0x00600040, 0x20e077bd, 0x006900a0, 0x00694060 },
+ { 0x00600041, 0x204077be, 0x008d00e0, 0x000000c8 },
+ { 0x00600031, 0x20001fbc, 0x008d0000, 0x8640c800 },
+};
+
+static const uint32_t gen5_sf_kernel_nomask[][4] = {
+ { 0x00400031, 0x20c01fbd, 0x1069002c, 0x02100001 },
+ { 0x00600001, 0x206003be, 0x00690060, 0x00000000 },
+ { 0x00600040, 0x20e077bd, 0x00690080, 0x006940a0 },
+ { 0x00600041, 0x202077be, 0x008d00e0, 0x000000c0 },
+ { 0x00600040, 0x20e077bd, 0x006900a0, 0x00694060 },
+ { 0x00600041, 0x204077be, 0x008d00e0, 0x000000c8 },
+ { 0x00600031, 0x20001fbc, 0x648d0000, 0x8808c800 },
+};
+
+static const uint32_t gen4_ps_kernel_nomask_affine[][4] = {
+ { 0x00800040, 0x23c06d29, 0x00480028, 0x10101010 },
+ { 0x00800040, 0x23806d29, 0x0048002a, 0x11001100 },
+ { 0x00802040, 0x2100753d, 0x008d03c0, 0x00004020 },
+ { 0x00802040, 0x2140753d, 0x008d0380, 0x00004024 },
+ { 0x00802059, 0x200077bc, 0x00000060, 0x008d0100 },
+ { 0x00802048, 0x204077be, 0x00000064, 0x008d0140 },
+ { 0x00802059, 0x200077bc, 0x00000070, 0x008d0100 },
+ { 0x00802048, 0x208077be, 0x00000074, 0x008d0140 },
+ { 0x00600201, 0x20200022, 0x008d0000, 0x00000000 },
+ { 0x00000201, 0x20280062, 0x00000000, 0x00000000 },
+ { 0x01800031, 0x21801d09, 0x008d0000, 0x02580001 },
+ { 0x00600001, 0x204003be, 0x008d0180, 0x00000000 },
+ { 0x00601001, 0x20c003be, 0x008d01a0, 0x00000000 },
+ { 0x00600001, 0x206003be, 0x008d01c0, 0x00000000 },
+ { 0x00601001, 0x20e003be, 0x008d01e0, 0x00000000 },
+ { 0x00600001, 0x208003be, 0x008d0200, 0x00000000 },
+ { 0x00601001, 0x210003be, 0x008d0220, 0x00000000 },
+ { 0x00600001, 0x20a003be, 0x008d0240, 0x00000000 },
+ { 0x00601001, 0x212003be, 0x008d0260, 0x00000000 },
+ { 0x00600201, 0x202003be, 0x008d0020, 0x00000000 },
+ { 0x00800031, 0x20001d28, 0x008d0000, 0x85a04800 },
+};
+
+static const uint32_t gen5_ps_kernel_nomask_affine[][4] = {
+ { 0x00800040, 0x23c06d29, 0x00480028, 0x10101010 },
+ { 0x00800040, 0x23806d29, 0x0048002a, 0x11001100 },
+ { 0x00802040, 0x2100753d, 0x008d03c0, 0x00004020 },
+ { 0x00802040, 0x2140753d, 0x008d0380, 0x00004024 },
+ { 0x00802059, 0x200077bc, 0x00000060, 0x008d0100 },
+ { 0x00802048, 0x204077be, 0x00000064, 0x008d0140 },
+ { 0x00802059, 0x200077bc, 0x00000070, 0x008d0100 },
+ { 0x00802048, 0x208077be, 0x00000074, 0x008d0140 },
+ { 0x01800031, 0x21801fa9, 0x208d0000, 0x0a8a0001 },
+ { 0x00802001, 0x304003be, 0x008d0180, 0x00000000 },
+ { 0x00802001, 0x306003be, 0x008d01c0, 0x00000000 },
+ { 0x00802001, 0x308003be, 0x008d0200, 0x00000000 },
+ { 0x00802001, 0x30a003be, 0x008d0240, 0x00000000 },
+ { 0x00600201, 0x202003be, 0x008d0020, 0x00000000 },
+ { 0x00800031, 0x20001d28, 0x548d0000, 0x94084800 },
+};
+
+static uint32_t
+batch_used(struct intel_batchbuffer *batch)
+{
+ return batch->ptr - batch->buffer;
+}
+
+static uint32_t
+batch_round_upto(struct intel_batchbuffer *batch, uint32_t divisor)
+{
+ uint32_t offset = batch_used(batch);
+ offset = (offset + divisor - 1) / divisor * divisor;
+ batch->ptr = batch->buffer + offset;
+ return offset;
+}
+
+static int gen4_max_vs_nr_urb_entries(uint32_t devid)
+{
+ return IS_GEN5(devid) ? 256 : 32;
+}
+
+static int gen4_max_sf_nr_urb_entries(uint32_t devid)
+{
+ return IS_GEN5(devid) ? 128 : 64;
+}
+
+static int gen4_urb_size(uint32_t devid)
+{
+ return IS_GEN5(devid) ? 1024 : IS_G4X(devid) ? 384 : 256;
+}
+
+static int gen4_max_sf_threads(uint32_t devid)
+{
+ return IS_GEN5(devid) ? 48 : 24;
+}
+
+static int gen4_max_wm_threads(uint32_t devid)
+{
+ return IS_GEN5(devid) ? 72 : IS_G4X(devid) ? 50 : 32;
+}
+
+static void
+gen4_render_flush(struct intel_batchbuffer *batch,
+ drm_intel_context *context, uint32_t batch_end)
+{
+ int ret;
+
+ ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
+ if (ret == 0)
+ ret = drm_intel_gem_bo_context_exec(batch->bo, context,
+ batch_end, 0);
+ assert(ret == 0);
+}
+
+static uint32_t
+gen4_bind_buf(struct intel_batchbuffer *batch,
+ struct igt_buf *buf,
+ uint32_t format, int is_dst)
+{
+ struct gen4_surface_state *ss;
+ uint32_t write_domain, read_domain;
+ int ret;
+
+ if (is_dst) {
+ write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
+ } else {
+ write_domain = 0;
+ read_domain = I915_GEM_DOMAIN_SAMPLER;
+ }
+
+ ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 32);
+
+ ss->ss0.surface_type = SURFACE_2D;
+ ss->ss0.surface_format = format;
+
+ ss->ss0.data_return_format = SURFACERETURNFORMAT_FLOAT32;
+ ss->ss0.color_blend = 1;
+ ss->ss1.base_addr = buf->bo->offset;
+
+ ret = drm_intel_bo_emit_reloc(batch->bo,
+ intel_batchbuffer_subdata_offset(batch, ss) + 4,
+ buf->bo, 0,
+ read_domain, write_domain);
+ assert(ret == 0);
+
+ ss->ss2.height = igt_buf_height(buf) - 1;
+ ss->ss2.width = igt_buf_width(buf) - 1;
+ ss->ss3.pitch = buf->stride - 1;
+ ss->ss3.tiled_surface = buf->tiling != I915_TILING_NONE;
+ ss->ss3.tile_walk = buf->tiling == I915_TILING_Y;
+
+ return intel_batchbuffer_subdata_offset(batch, ss);
+}
+
+static uint32_t
+gen4_bind_surfaces(struct intel_batchbuffer *batch,
+ struct igt_buf *src,
+ struct igt_buf *dst)
+{
+ uint32_t *binding_table;
+
+ binding_table = intel_batchbuffer_subdata_alloc(batch, 32, 32);
+
+ binding_table[0] =
+ gen4_bind_buf(batch, dst, SURFACEFORMAT_B8G8R8A8_UNORM, 1);
+ binding_table[1] =
+ gen4_bind_buf(batch, src, SURFACEFORMAT_B8G8R8A8_UNORM, 0);
+
+ return intel_batchbuffer_subdata_offset(batch, binding_table);
+}
+
+static void
+gen4_emit_sip(struct intel_batchbuffer *batch)
+{
+ OUT_BATCH(GEN4_STATE_SIP | (2 - 2));
+ OUT_BATCH(0);
+}
+
+static void
+gen4_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+ if (IS_GEN5(batch->devid)) {
+ OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (8 - 2));
+ OUT_RELOC(batch->bo, /* general */
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ BASE_ADDRESS_MODIFY);
+ OUT_RELOC(batch->bo, /* surface */
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ BASE_ADDRESS_MODIFY);
+ OUT_BATCH(0); /* media */
+ OUT_RELOC(batch->bo, /* instruction */
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ BASE_ADDRESS_MODIFY);
+
+ /* upper bounds, disable */
+ OUT_BATCH(BASE_ADDRESS_MODIFY); /* general */
+ OUT_BATCH(0); /* media */
+ OUT_BATCH(BASE_ADDRESS_MODIFY); /* instruction */
+ } else {
+ OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (6 - 2));
+ OUT_RELOC(batch->bo, /* general */
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ BASE_ADDRESS_MODIFY);
+ OUT_RELOC(batch->bo, /* surface */
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ BASE_ADDRESS_MODIFY);
+ OUT_BATCH(0); /* media */
+
+ /* upper bounds, disable */
+ OUT_BATCH(BASE_ADDRESS_MODIFY); /* general */
+ OUT_BATCH(0); /* media */
+ }
+}
+
+static void
+gen4_emit_pipelined_pointers(struct intel_batchbuffer *batch,
+ uint32_t vs, uint32_t sf,
+ uint32_t wm, uint32_t cc)
+{
+ OUT_BATCH(GEN4_3DSTATE_PIPELINED_POINTERS | (7 - 2));
+ OUT_BATCH(vs);
+ OUT_BATCH(GEN4_GS_DISABLE);
+ OUT_BATCH(GEN4_CLIP_DISABLE);
+ OUT_BATCH(sf);
+ OUT_BATCH(wm);
+ OUT_BATCH(cc);
+}
+
+static void
+gen4_emit_urb(struct intel_batchbuffer *batch)
+{
+ int vs_entries = gen4_max_vs_nr_urb_entries(batch->devid);
+ int gs_entries = 0;
+ int cl_entries = 0;
+ int sf_entries = gen4_max_sf_nr_urb_entries(batch->devid);
+ int cs_entries = 0;
+
+ int urb_vs_end = vs_entries * URB_VS_ENTRY_SIZE;
+ int urb_gs_end = urb_vs_end + gs_entries * URB_GS_ENTRY_SIZE;
+ int urb_cl_end = urb_gs_end + cl_entries * URB_CL_ENTRY_SIZE;
+ int urb_sf_end = urb_cl_end + sf_entries * URB_SF_ENTRY_SIZE;
+ int urb_cs_end = urb_sf_end + cs_entries * URB_CS_ENTRY_SIZE;
+
+ assert(urb_cs_end <= gen4_urb_size(batch->devid));
+
+ intel_batchbuffer_align(batch, 16);
+
+ OUT_BATCH(GEN4_URB_FENCE |
+ UF0_CS_REALLOC |
+ UF0_SF_REALLOC |
+ UF0_CLIP_REALLOC |
+ UF0_GS_REALLOC |
+ UF0_VS_REALLOC |
+ (3 - 2));
+ OUT_BATCH(urb_cl_end << UF1_CLIP_FENCE_SHIFT |
+ urb_gs_end << UF1_GS_FENCE_SHIFT |
+ urb_vs_end << UF1_VS_FENCE_SHIFT);
+ OUT_BATCH(urb_cs_end << UF2_CS_FENCE_SHIFT |
+ urb_sf_end << UF2_SF_FENCE_SHIFT);
+
+ OUT_BATCH(GEN4_CS_URB_STATE | (2 - 2));
+ OUT_BATCH((URB_CS_ENTRY_SIZE - 1) << 4 | cs_entries << 0);
+}
+
+static void
+gen4_emit_null_depth_buffer(struct intel_batchbuffer *batch)
+{
+ if (IS_G4X(batch->devid) || IS_GEN5(batch->devid)) {
+ OUT_BATCH(GEN4_3DSTATE_DEPTH_BUFFER | (6 - 2));
+ OUT_BATCH(SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
+ GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ } else {
+ OUT_BATCH(GEN4_3DSTATE_DEPTH_BUFFER | (5 - 2));
+ OUT_BATCH(SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
+ GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ }
+
+ if (IS_GEN5(batch->devid)) {
+ OUT_BATCH(GEN4_3DSTATE_CLEAR_PARAMS | (2 - 2));
+ OUT_BATCH(0);
+ }
+}
+
+static void
+gen4_emit_invariant(struct intel_batchbuffer *batch)
+{
+ OUT_BATCH(MI_FLUSH | MI_INHIBIT_RENDER_CACHE_FLUSH);
+
+ if (IS_GEN5(batch->devid) || IS_G4X(batch->devid))
+ OUT_BATCH(G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+ else
+ OUT_BATCH(GEN4_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+}
+
+static uint32_t
+gen4_create_vs_state(struct intel_batchbuffer *batch)
+{
+ struct gen4_vs_state *vs;
+ int nr_urb_entries;
+
+ vs = intel_batchbuffer_subdata_alloc(batch, sizeof(*vs), 32);
+
+ /* Set up the vertex shader to be disabled (passthrough) */
+ nr_urb_entries = gen4_max_vs_nr_urb_entries(batch->devid);
+ if (IS_GEN5(batch->devid))
+ nr_urb_entries >>= 2;
+ vs->vs4.nr_urb_entries = nr_urb_entries;
+ vs->vs4.urb_entry_allocation_size = URB_VS_ENTRY_SIZE - 1;
+ vs->vs6.vs_enable = 0;
+ vs->vs6.vert_cache_disable = 1;
+
+ return intel_batchbuffer_subdata_offset(batch, vs);
+}
+
+static uint32_t
+gen4_create_sf_state(struct intel_batchbuffer *batch,
+ uint32_t kernel)
+{
+ struct gen4_sf_state *sf;
+
+ sf = intel_batchbuffer_subdata_alloc(batch, sizeof(*sf), 32);
+
+ sf->sf0.grf_reg_count = GEN4_GRF_BLOCKS(SF_KERNEL_NUM_GRF);
+ sf->sf0.kernel_start_pointer = kernel >> 6;
+
+ sf->sf3.urb_entry_read_length = 1; /* 1 URB per vertex */
+ /* don't smash vertex header, read start from dw8 */
+ sf->sf3.urb_entry_read_offset = 1;
+ sf->sf3.dispatch_grf_start_reg = 3;
+
+ sf->sf4.max_threads = gen4_max_sf_threads(batch->devid) - 1;
+ sf->sf4.urb_entry_allocation_size = URB_SF_ENTRY_SIZE - 1;
+ sf->sf4.nr_urb_entries = gen4_max_sf_nr_urb_entries(batch->devid);
+
+ sf->sf6.cull_mode = GEN4_CULLMODE_NONE;
+ sf->sf6.dest_org_vbias = 0x8;
+ sf->sf6.dest_org_hbias = 0x8;
+
+ return intel_batchbuffer_subdata_offset(batch, sf);
+}
+
+static uint32_t
+gen4_create_wm_state(struct intel_batchbuffer *batch,
+ uint32_t kernel,
+ uint32_t sampler)
+{
+ struct gen4_wm_state *wm;
+
+ wm = intel_batchbuffer_subdata_alloc(batch, sizeof(*wm), 32);
+
+ assert((kernel & 63) == 0);
+ wm->wm0.kernel_start_pointer = kernel >> 6;
+ wm->wm0.grf_reg_count = GEN4_GRF_BLOCKS(PS_KERNEL_NUM_GRF);
+
+ wm->wm3.urb_entry_read_offset = 0;
+ wm->wm3.dispatch_grf_start_reg = 3;
+
+ assert((sampler & 31) == 0);
+ wm->wm4.sampler_state_pointer = sampler >> 5;
+ wm->wm4.sampler_count = 1;
+
+ wm->wm5.max_threads = gen4_max_wm_threads(batch->devid) - 1;
+ wm->wm5.thread_dispatch_enable = 1;
+ wm->wm5.enable_16_pix = 1;
+ wm->wm5.early_depth_test = 1;
+
+ if (IS_GEN5(batch->devid))
+ wm->wm1.binding_table_entry_count = 0;
+ else
+ wm->wm1.binding_table_entry_count = 2;
+ wm->wm3.urb_entry_read_length = 2;
+
+ return intel_batchbuffer_subdata_offset(batch, wm);
+}
+
+static void
+gen4_emit_binding_table(struct intel_batchbuffer *batch,
+ uint32_t wm_table)
+{
+ OUT_BATCH(GEN4_3DSTATE_BINDING_TABLE_POINTERS | (6 - 2));
+ OUT_BATCH(0); /* vs */
+ OUT_BATCH(0); /* gs */
+ OUT_BATCH(0); /* clip */
+ OUT_BATCH(0); /* sf */
+ OUT_BATCH(wm_table); /* ps */
+}
+
+static void
+gen4_emit_drawing_rectangle(struct intel_batchbuffer *batch,
+ struct igt_buf *dst)
+{
+ OUT_BATCH(GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
+ OUT_BATCH(0);
+ OUT_BATCH((igt_buf_height(dst) - 1) << 16 |
+ (igt_buf_width(dst) - 1));
+ OUT_BATCH(0);
+}
+
+static void
+gen4_emit_vertex_elements(struct intel_batchbuffer *batch)
+{
+
+ if (IS_GEN5(batch->devid)) {
+ /* The VUE layout
+ * dword 0-3: pad (0.0, 0.0, 0.0, 0.0),
+ * dword 4-7: position (x, y, 1.0, 1.0),
+ * dword 8-11: texture coordinate 0 (u0, v0, 0, 0)
+ *
+ * dword 4-11 are fetched from vertex buffer
+ */
+ OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
+
+ /* pad */
+ OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
+ SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
+ 0 << VE0_OFFSET_SHIFT);
+ OUT_BATCH(GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+ GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+ GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+ GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+
+ /* x,y */
+ OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
+ SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+ 0 << VE0_OFFSET_SHIFT);
+ OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+ GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+ GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
+ GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+
+ /* u0, v0 */
+ OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
+ SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
+ 4 << VE0_OFFSET_SHIFT);
+ OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+ GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+ GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+ GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+ } else {
+ /* The VUE layout
+ * dword 0-3: position (x, y, 1.0, 1.0),
+ * dword 4-7: texture coordinate 0 (u0, v0, 0, 0)
+ *
+ * dword 0-7 are fetched from vertex buffer
+ */
+ OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS | (2 * 2 + 1 - 2));
+
+ /* x,y */
+ OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
+ SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+ 0 << VE0_OFFSET_SHIFT);
+ OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+ GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+ GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
+ GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT |
+ 4 << VE1_DESTINATION_ELEMENT_OFFSET_SHIFT);
+
+ /* u0, v0 */
+ OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
+ SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
+ 4 << VE0_OFFSET_SHIFT);
+ OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+ GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+ GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+ GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT |
+ 8 << VE1_DESTINATION_ELEMENT_OFFSET_SHIFT);
+ }
+}
+
+static uint32_t
+gen4_create_cc_viewport(struct intel_batchbuffer *batch)
+{
+ struct gen4_cc_viewport *vp;
+
+ vp = intel_batchbuffer_subdata_alloc(batch, sizeof(*vp), 32);
+
+ vp->min_depth = -1.e35;
+ vp->max_depth = 1.e35;
+
+ return intel_batchbuffer_subdata_offset(batch, vp);
+}
+
+static uint32_t
+gen4_create_cc_state(struct intel_batchbuffer *batch,
+ uint32_t cc_vp)
+{
+ struct gen4_color_calc_state *cc;
+
+ cc = intel_batchbuffer_subdata_alloc(batch, sizeof(*cc), 64);
+
+ cc->cc4.cc_viewport_state_offset = cc_vp;
+
+ return intel_batchbuffer_subdata_offset(batch, cc);
+}
+
+static uint32_t
+gen4_create_sf_kernel(struct intel_batchbuffer *batch)
+{
+ if (IS_GEN5(batch->devid))
+ return intel_batchbuffer_copy_data(batch, gen5_sf_kernel_nomask,
+ sizeof(gen5_sf_kernel_nomask),
+ 64);
+ else
+ return intel_batchbuffer_copy_data(batch, gen4_sf_kernel_nomask,
+ sizeof(gen4_sf_kernel_nomask),
+ 64);
+}
+
+static uint32_t
+gen4_create_ps_kernel(struct intel_batchbuffer *batch)
+{
+ if (IS_GEN5(batch->devid))
+ return intel_batchbuffer_copy_data(batch, gen5_ps_kernel_nomask_affine,
+ sizeof(gen5_ps_kernel_nomask_affine),
+ 64);
+ else
+ return intel_batchbuffer_copy_data(batch, gen4_ps_kernel_nomask_affine,
+ sizeof(gen4_ps_kernel_nomask_affine),
+ 64);
+}
+
+static uint32_t
+gen4_create_sampler(struct intel_batchbuffer *batch,
+ sampler_filter_t filter,
+ sampler_extend_t extend)
+{
+ struct gen4_sampler_state *ss;
+
+ ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 32);
+
+ ss->ss0.lod_preclamp = GEN4_LOD_PRECLAMP_OGL;
+
+ /* We use the legacy mode to get the semantics specified by
+ * the Render extension. */
+ ss->ss0.border_color_mode = GEN4_BORDER_COLOR_MODE_LEGACY;
+
+ switch (filter) {
+ default:
+ case SAMPLER_FILTER_NEAREST:
+ ss->ss0.min_filter = GEN4_MAPFILTER_NEAREST;
+ ss->ss0.mag_filter = GEN4_MAPFILTER_NEAREST;
+ break;
+ case SAMPLER_FILTER_BILINEAR:
+ ss->ss0.min_filter = GEN4_MAPFILTER_LINEAR;
+ ss->ss0.mag_filter = GEN4_MAPFILTER_LINEAR;
+ break;
+ }
+
+ switch (extend) {
+ default:
+ case SAMPLER_EXTEND_NONE:
+ ss->ss1.r_wrap_mode = GEN4_TEXCOORDMODE_CLAMP_BORDER;
+ ss->ss1.s_wrap_mode = GEN4_TEXCOORDMODE_CLAMP_BORDER;
+ ss->ss1.t_wrap_mode = GEN4_TEXCOORDMODE_CLAMP_BORDER;
+ break;
+ case SAMPLER_EXTEND_REPEAT:
+ ss->ss1.r_wrap_mode = GEN4_TEXCOORDMODE_WRAP;
+ ss->ss1.s_wrap_mode = GEN4_TEXCOORDMODE_WRAP;
+ ss->ss1.t_wrap_mode = GEN4_TEXCOORDMODE_WRAP;
+ break;
+ case SAMPLER_EXTEND_PAD:
+ ss->ss1.r_wrap_mode = GEN4_TEXCOORDMODE_CLAMP;
+ ss->ss1.s_wrap_mode = GEN4_TEXCOORDMODE_CLAMP;
+ ss->ss1.t_wrap_mode = GEN4_TEXCOORDMODE_CLAMP;
+ break;
+ case SAMPLER_EXTEND_REFLECT:
+ ss->ss1.r_wrap_mode = GEN4_TEXCOORDMODE_MIRROR;
+ ss->ss1.s_wrap_mode = GEN4_TEXCOORDMODE_MIRROR;
+ ss->ss1.t_wrap_mode = GEN4_TEXCOORDMODE_MIRROR;
+ break;
+ }
+
+ return intel_batchbuffer_subdata_offset(batch, ss);
+}
+
+static void gen4_emit_vertex_buffer(struct intel_batchbuffer *batch)
+{
+ OUT_BATCH(GEN4_3DSTATE_VERTEX_BUFFERS | (5 - 2));
+ OUT_BATCH(VB0_VERTEXDATA |
+ 0 << VB0_BUFFER_INDEX_SHIFT |
+ VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
+ OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, 0);
+ if (IS_GEN5(batch->devid))
+ OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, batch->bo->size - 1);
+ else
+ OUT_BATCH(batch->bo->size / VERTEX_SIZE - 1);
+ OUT_BATCH(0);
+}
+
+static uint32_t gen4_emit_primitive(struct intel_batchbuffer *batch)
+{
+ uint32_t offset;
+
+ OUT_BATCH(GEN4_3DPRIMITIVE |
+ GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL |
+ _3DPRIM_RECTLIST << GEN4_3DPRIMITIVE_TOPOLOGY_SHIFT |
+ 0 << 9 |
+ (6 - 2));
+ OUT_BATCH(3); /* vertex count */
+ offset = batch_used(batch);
+ OUT_BATCH(0); /* vertex_index */
+ OUT_BATCH(1); /* single instance */
+ OUT_BATCH(0); /* start instance location */
+ OUT_BATCH(0); /* index buffer offset, ignored */
+
+ return offset;
+}
+
+void gen4_render_copyfunc(struct intel_batchbuffer *batch,
+ drm_intel_context *context,
+ struct igt_buf *src, unsigned src_x, unsigned src_y,
+ unsigned width, unsigned height,
+ struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
+{
+ uint32_t cc, cc_vp;
+ uint32_t wm, wm_sampler, wm_kernel, wm_table;
+ uint32_t sf, sf_kernel;
+ uint32_t vs;
+ uint32_t offset, batch_end;
+
+ intel_batchbuffer_flush_with_context(batch, context);
+
+ batch->ptr = batch->buffer + 1024;
+ intel_batchbuffer_subdata_alloc(batch, 64, 64);
+
+ vs = gen4_create_vs_state(batch);
+
+ sf_kernel = gen4_create_sf_kernel(batch);
+ sf = gen4_create_sf_state(batch, sf_kernel);
+
+ wm_table = gen4_bind_surfaces(batch, src, dst);
+ wm_kernel = gen4_create_ps_kernel(batch);
+ wm_sampler = gen4_create_sampler(batch,
+ SAMPLER_FILTER_NEAREST,
+ SAMPLER_EXTEND_NONE);
+ wm = gen4_create_wm_state(batch, wm_kernel, wm_sampler);
+
+ cc_vp = gen4_create_cc_viewport(batch);
+ cc = gen4_create_cc_state(batch, cc_vp);
+
+ batch->ptr = batch->buffer;
+
+ gen4_emit_invariant(batch);
+ gen4_emit_state_base_address(batch);
+ gen4_emit_sip(batch);
+ gen4_emit_null_depth_buffer(batch);
+
+ gen4_emit_drawing_rectangle(batch, dst);
+ gen4_emit_binding_table(batch, wm_table);
+ gen4_emit_vertex_elements(batch);
+ gen4_emit_pipelined_pointers(batch, vs, sf, wm, cc);
+ gen4_emit_urb(batch);
+
+ gen4_emit_vertex_buffer(batch);
+ offset = gen4_emit_primitive(batch);
+
+ OUT_BATCH(MI_BATCH_BUFFER_END);
+ batch_end = intel_batchbuffer_align(batch, 8);
+
+ *(uint32_t*)(batch->buffer + offset) =
+ batch_round_upto(batch, VERTEX_SIZE)/VERTEX_SIZE;
+
+ emit_vertex_2s(batch, dst_x + width, dst_y + height);
+ emit_vertex_normalized(batch, src_x + width, igt_buf_width(src));
+ emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+
+ emit_vertex_2s(batch, dst_x, dst_y + height);
+ emit_vertex_normalized(batch, src_x, igt_buf_width(src));
+ emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+
+ emit_vertex_2s(batch, dst_x, dst_y);
+ emit_vertex_normalized(batch, src_x, igt_buf_width(src));
+ emit_vertex_normalized(batch, src_y, igt_buf_height(src));
+
+ gen4_render_flush(batch, context, batch_end);
+ intel_batchbuffer_reset(batch);
+}
--
2.16.4
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [igt-dev] ✓ Fi.CI.BAT: success for lib/rendercopy: Add gen4/5 rendercopy
2018-06-11 16:14 [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy Ville Syrjala
@ 2018-06-11 18:01 ` Patchwork
2018-06-12 0:14 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2018-06-11 18:01 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: igt-dev
== Series Details ==
Series: lib/rendercopy: Add gen4/5 rendercopy
URL : https://patchwork.freedesktop.org/series/44577/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4302 -> IGTPW_1439 =
== Summary - WARNING ==
Minor unknown changes coming with IGTPW_1439 need to be verified
manually.
If you think the reported changes have nothing to do with the changes
introduced in IGTPW_1439, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://patchwork.freedesktop.org/api/1.0/series/44577/revisions/1/mbox/
== Possible new issues ==
Here are the unknown changes that may have been introduced in IGTPW_1439:
=== IGT changes ===
==== Warnings ====
igt@gem_exec_gttfill@basic:
fi-pnv-d510: SKIP -> PASS
igt@gem_mmap_gtt@basic-small-bo-tiledy:
fi-gdg-551: PASS -> SKIP
igt@gem_render_linear_blits@basic:
fi-bwr-2160: SKIP -> PASS +1
igt@gem_render_tiled_blits@basic:
fi-elk-e7500: SKIP -> PASS +1
fi-ilk-650: SKIP -> PASS +2
== Known issues ==
Here are the changes found in IGTPW_1439 that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@gem_exec_suspend@basic-s3:
fi-skl-gvtdvm: NOTRUN -> INCOMPLETE (fdo#105600, fdo#104108)
igt@kms_flip@basic-flip-vs-dpms:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#106097)
igt@kms_frontbuffer_tracking@basic:
fi-hsw-4200u: PASS -> DMESG-FAIL (fdo#106103, fdo#102614)
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c:
fi-bxt-dsi: PASS -> INCOMPLETE (fdo#103927)
==== Possible fixes ====
igt@gem_mmap_gtt@basic-small-bo-tiledx:
fi-gdg-551: FAIL (fdo#102575) -> SKIP
igt@kms_pipe_crc_basic@read-crc-pipe-b-frame-sequence:
fi-glk-j4005: FAIL (fdo#103481) -> PASS
igt@kms_pipe_crc_basic@read-crc-pipe-c:
fi-glk-j4005: DMESG-WARN (fdo#106000, fdo#106097) -> PASS
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
fi-cnl-psr: DMESG-WARN (fdo#104951) -> PASS
fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
fdo#103481 https://bugs.freedesktop.org/show_bug.cgi?id=103481
fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
fdo#104108 https://bugs.freedesktop.org/show_bug.cgi?id=104108
fdo#104951 https://bugs.freedesktop.org/show_bug.cgi?id=104951
fdo#105600 https://bugs.freedesktop.org/show_bug.cgi?id=105600
fdo#106000 https://bugs.freedesktop.org/show_bug.cgi?id=106000
fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
fdo#106103 https://bugs.freedesktop.org/show_bug.cgi?id=106103
== Participating hosts (41 -> 38) ==
Additional (2): fi-bdw-gvtdvm fi-skl-gvtdvm
Missing (5): fi-ctg-p8600 fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-skl-6700hq
== Build changes ==
* IGT: IGT_4513 -> IGTPW_1439
CI_DRM_4302: ef129f260b2bd362959651fe8e20e369bf3c977e @ git://anongit.freedesktop.org/gfx-ci/linux
IGTPW_1439: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1439/
IGT_4513: 7b6838781441cfbc7f6c18f421f127dfb02b44cf @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1439/issues.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* [igt-dev] ✓ Fi.CI.IGT: success for lib/rendercopy: Add gen4/5 rendercopy
2018-06-11 16:14 [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy Ville Syrjala
2018-06-11 18:01 ` [igt-dev] ✓ Fi.CI.BAT: success for " Patchwork
@ 2018-06-12 0:14 ` Patchwork
2018-06-13 10:35 ` [igt-dev] [PATCH i-g-t] " Kalamarz, Lukasz
2018-06-14 13:13 ` Katarzyna Dec
3 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2018-06-12 0:14 UTC (permalink / raw)
To: Ville Syrjala; +Cc: igt-dev
== Series Details ==
Series: lib/rendercopy: Add gen4/5 rendercopy
URL : https://patchwork.freedesktop.org/series/44577/
State : success
== Summary ==
= CI Bug Log - changes from IGT_4513_full -> IGTPW_1439_full =
== Summary - WARNING ==
Minor unknown changes coming with IGTPW_1439_full need to be verified
manually.
If you think the reported changes have nothing to do with the changes
introduced in IGTPW_1439_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://patchwork.freedesktop.org/api/1.0/series/44577/revisions/1/mbox/
== Possible new issues ==
Here are the unknown changes that may have been introduced in IGTPW_1439_full:
=== IGT changes ===
==== Warnings ====
igt@gem_exec_schedule@deep-bsd2:
shard-kbl: PASS -> SKIP +3
igt@gem_mocs_settings@mocs-rc6-bsd2:
shard-kbl: SKIP -> PASS
igt@pm_rc6_residency@rc6-accuracy:
shard-snb: PASS -> SKIP
== Known issues ==
Here are the changes found in IGTPW_1439_full that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@drv_selftest@live_gtt:
shard-kbl: PASS -> FAIL (fdo#105347)
igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy:
shard-hsw: PASS -> FAIL (fdo#105767)
igt@kms_flip@flip-vs-absolute-wf_vblank-interruptible:
shard-glk: PASS -> FAIL (fdo#100368)
igt@kms_flip_tiling@flip-to-x-tiled:
shard-glk: PASS -> FAIL (fdo#104724) +1
igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-mmap-wc:
shard-glk: PASS -> FAIL (fdo#104724, fdo#103167) +1
igt@kms_hdmi_inject@inject-audio:
shard-snb: PASS -> FAIL (fdo#102370)
igt@kms_rotation_crc@primary-rotation-180:
shard-kbl: PASS -> FAIL (fdo#104724, fdo#103925)
igt@kms_vblank@pipe-c-ts-continuation-suspend:
shard-kbl: PASS -> FAIL (fdo#104894)
igt@perf_pmu@other-read-3:
shard-snb: PASS -> INCOMPLETE (fdo#105411)
igt@perf_pmu@rc6-runtime-pm-long:
shard-apl: PASS -> FAIL (fdo#105010)
shard-glk: PASS -> FAIL (fdo#105010)
==== Possible fixes ====
igt@drv_suspend@shrink:
shard-glk: INCOMPLETE (k.org#198133, fdo#103359) -> PASS
igt@gem_eio@hibernate:
shard-snb: INCOMPLETE (fdo#105411) -> PASS
igt@gem_exec_big:
shard-hsw: INCOMPLETE (fdo#103540) -> PASS
igt@kms_flip@2x-flip-vs-blocking-wf-vblank:
shard-glk: FAIL (fdo#100368) -> PASS
igt@kms_flip@plain-flip-ts-check:
shard-hsw: FAIL (fdo#100368) -> PASS
igt@kms_flip_tiling@flip-x-tiled:
shard-glk: FAIL (fdo#104724) -> PASS
igt@kms_rotation_crc@sprite-rotation-180:
shard-snb: FAIL (fdo#104724, fdo#103925) -> PASS
igt@perf@polling:
shard-hsw: FAIL (fdo#102252) -> PASS
==== Warnings ====
igt@drv_selftest@live_gtt:
shard-glk: FAIL (fdo#105347) -> INCOMPLETE (k.org#198133, fdo#103359)
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
fdo#102370 https://bugs.freedesktop.org/show_bug.cgi?id=102370
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#103359 https://bugs.freedesktop.org/show_bug.cgi?id=103359
fdo#103540 https://bugs.freedesktop.org/show_bug.cgi?id=103540
fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
fdo#104894 https://bugs.freedesktop.org/show_bug.cgi?id=104894
fdo#105010 https://bugs.freedesktop.org/show_bug.cgi?id=105010
fdo#105347 https://bugs.freedesktop.org/show_bug.cgi?id=105347
fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
fdo#105767 https://bugs.freedesktop.org/show_bug.cgi?id=105767
k.org#198133 https://bugzilla.kernel.org/show_bug.cgi?id=198133
== Participating hosts (5 -> 5) ==
No changes in participating hosts
== Build changes ==
* IGT: IGT_4513 -> IGTPW_1439
* Linux: CI_DRM_4294 -> CI_DRM_4302
CI_DRM_4294: af0889384edc6de2f91494325d571c66dffea83f @ git://anongit.freedesktop.org/gfx-ci/linux
CI_DRM_4302: ef129f260b2bd362959651fe8e20e369bf3c977e @ git://anongit.freedesktop.org/gfx-ci/linux
IGTPW_1439: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1439/
IGT_4513: 7b6838781441cfbc7f6c18f421f127dfb02b44cf @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1439/shards.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy
2018-06-11 16:14 [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy Ville Syrjala
2018-06-11 18:01 ` [igt-dev] ✓ Fi.CI.BAT: success for " Patchwork
2018-06-12 0:14 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
@ 2018-06-13 10:35 ` Kalamarz, Lukasz
2018-06-14 13:23 ` Ville Syrjälä
2018-06-14 13:13 ` Katarzyna Dec
3 siblings, 1 reply; 7+ messages in thread
From: Kalamarz, Lukasz @ 2018-06-13 10:35 UTC (permalink / raw)
To: ville.syrjala@linux.intel.com, igt-dev@lists.freedesktop.org
On Mon, 2018-06-11 at 19:14 +0300, Ville Syrjala wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add rendercopy implementation for gen4/5. Basic structure
> copied from the gen6 implementation,
After refactoring parts of the rendercopy libs, I don't like that
sentence :(
> and the gen4/5 specific
> bits were mostly lifted from sna.
>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> lib/Makefile.sources | 2 +
> lib/gen4_render.h | 628
> ++++++++++++++++++++++++++++++++++++++++++
> lib/intel_batchbuffer.c | 2 +
> lib/meson.build | 1 +
> lib/rendercopy.h | 5 +
> lib/rendercopy_gen4.c | 704
> ++++++++++++++++++++++++++++++++++++++++++++++++
> 6 files changed, 1342 insertions(+)
> create mode 100644 lib/gen4_render.h
> create mode 100644 lib/rendercopy_gen4.c
>
> diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> index 042c1d3bb44a..e0ebd02c1661 100644
> --- a/lib/Makefile.sources
> +++ b/lib/Makefile.sources
> @@ -71,10 +71,12 @@ lib_source_list = \
> gen8_media.h \
> rendercopy_i915.c \
> rendercopy_i830.c \
> + gen4_render.h \
> gen6_render.h \
> gen7_render.h \
> gen8_render.h \
> gen9_render.h \
> + rendercopy_gen4.c \
> rendercopy_gen6.c \
> rendercopy_gen7.c \
> rendercopy_gen8.c \
> diff --git a/lib/gen4_render.h b/lib/gen4_render.h
> new file mode 100644
> index 000000000000..ab1158e3c6d2
> --- /dev/null
> +++ b/lib/gen4_render.h
Keeping in mind the refactoring of the genX_render libs introduced in the patch
series https://patchwork.freedesktop.org/series/44624/, could you check
whether the registers defined here with the GEN4/5 prefix are reimplemented in
gen6_render? If so, it may be a good idea to modify those
definitions rather than add more duplicates.
> @@ -0,0 +1,628 @@
> +#ifndef GEN4_RENDER_H
> +#define GEN4_RENDER_H
<snip>
> diff --git a/lib/rendercopy_gen4.c b/lib/rendercopy_gen4.c
> new file mode 100644
> index 000000000000..acd4be8de9da
> --- /dev/null
> +++ b/lib/rendercopy_gen4.c
> @@ -0,0 +1,704 @@
> +#include "rendercopy.h"
> +#include "intel_chipset.h"
> +#include "gen4_render.h"
> +#include "surfaceformat.h"
> +
> +#include <assert.h>
> +
> +#define VERTEX_SIZE (3*4)
> +
> +#define URB_VS_ENTRY_SIZE 1
> +#define URB_GS_ENTRY_SIZE 0
> +#define URB_CL_ENTRY_SIZE 0
> +#define URB_SF_ENTRY_SIZE 2
> +#define URB_CS_ENTRY_SIZE 1
> +
> +#define GEN4_GRF_BLOCKS(nreg) (((nreg) + 15) / 16 - 1)
> +#define SF_KERNEL_NUM_GRF 16
> +#define PS_KERNEL_NUM_GRF 32
> +
> +static const uint32_t gen4_sf_kernel_nomask[][4] = {
> + { 0x00400031, 0x20c01fbd, 0x0069002c, 0x01110001 },
> + { 0x00600001, 0x206003be, 0x00690060, 0x00000000 },
> + { 0x00600040, 0x20e077bd, 0x00690080, 0x006940a0 },
> + { 0x00600041, 0x202077be, 0x008d00e0, 0x000000c0 },
> + { 0x00600040, 0x20e077bd, 0x006900a0, 0x00694060 },
> + { 0x00600041, 0x204077be, 0x008d00e0, 0x000000c8 },
> + { 0x00600031, 0x20001fbc, 0x008d0000, 0x8640c800 },
> +};
> +
> +static const uint32_t gen5_sf_kernel_nomask[][4] = {
> + { 0x00400031, 0x20c01fbd, 0x1069002c, 0x02100001 },
> + { 0x00600001, 0x206003be, 0x00690060, 0x00000000 },
> + { 0x00600040, 0x20e077bd, 0x00690080, 0x006940a0 },
> + { 0x00600041, 0x202077be, 0x008d00e0, 0x000000c0 },
> + { 0x00600040, 0x20e077bd, 0x006900a0, 0x00694060 },
> + { 0x00600041, 0x204077be, 0x008d00e0, 0x000000c8 },
> + { 0x00600031, 0x20001fbc, 0x648d0000, 0x8808c800 },
> +};
> +
> +static const uint32_t gen4_ps_kernel_nomask_affine[][4] = {
> + { 0x00800040, 0x23c06d29, 0x00480028, 0x10101010 },
> + { 0x00800040, 0x23806d29, 0x0048002a, 0x11001100 },
> + { 0x00802040, 0x2100753d, 0x008d03c0, 0x00004020 },
> + { 0x00802040, 0x2140753d, 0x008d0380, 0x00004024 },
> + { 0x00802059, 0x200077bc, 0x00000060, 0x008d0100 },
> + { 0x00802048, 0x204077be, 0x00000064, 0x008d0140 },
> + { 0x00802059, 0x200077bc, 0x00000070, 0x008d0100 },
> + { 0x00802048, 0x208077be, 0x00000074, 0x008d0140 },
> + { 0x00600201, 0x20200022, 0x008d0000, 0x00000000 },
> + { 0x00000201, 0x20280062, 0x00000000, 0x00000000 },
> + { 0x01800031, 0x21801d09, 0x008d0000, 0x02580001 },
> + { 0x00600001, 0x204003be, 0x008d0180, 0x00000000 },
> + { 0x00601001, 0x20c003be, 0x008d01a0, 0x00000000 },
> + { 0x00600001, 0x206003be, 0x008d01c0, 0x00000000 },
> + { 0x00601001, 0x20e003be, 0x008d01e0, 0x00000000 },
> + { 0x00600001, 0x208003be, 0x008d0200, 0x00000000 },
> + { 0x00601001, 0x210003be, 0x008d0220, 0x00000000 },
> + { 0x00600001, 0x20a003be, 0x008d0240, 0x00000000 },
> + { 0x00601001, 0x212003be, 0x008d0260, 0x00000000 },
> + { 0x00600201, 0x202003be, 0x008d0020, 0x00000000 },
> + { 0x00800031, 0x20001d28, 0x008d0000, 0x85a04800 },
> +};
> +
> +static const uint32_t gen5_ps_kernel_nomask_affine[][4] = {
> + { 0x00800040, 0x23c06d29, 0x00480028, 0x10101010 },
> + { 0x00800040, 0x23806d29, 0x0048002a, 0x11001100 },
> + { 0x00802040, 0x2100753d, 0x008d03c0, 0x00004020 },
> + { 0x00802040, 0x2140753d, 0x008d0380, 0x00004024 },
> + { 0x00802059, 0x200077bc, 0x00000060, 0x008d0100 },
> + { 0x00802048, 0x204077be, 0x00000064, 0x008d0140 },
> + { 0x00802059, 0x200077bc, 0x00000070, 0x008d0100 },
> + { 0x00802048, 0x208077be, 0x00000074, 0x008d0140 },
> + { 0x01800031, 0x21801fa9, 0x208d0000, 0x0a8a0001 },
> + { 0x00802001, 0x304003be, 0x008d0180, 0x00000000 },
> + { 0x00802001, 0x306003be, 0x008d01c0, 0x00000000 },
> + { 0x00802001, 0x308003be, 0x008d0200, 0x00000000 },
> + { 0x00802001, 0x30a003be, 0x008d0240, 0x00000000 },
> + { 0x00600201, 0x202003be, 0x008d0020, 0x00000000 },
> + { 0x00800031, 0x20001d28, 0x548d0000, 0x94084800 },
> +};
> +
> +static uint32_t
> +batch_used(struct intel_batchbuffer *batch)
> +{
> + return batch->ptr - batch->buffer;
> +}
> +
> +static uint32_t
> +batch_round_upto(struct intel_batchbuffer *batch, uint32_t divisor)
> +{
> + uint32_t offset = batch_used(batch);
> + offset = (offset + divisor - 1) / divisor * divisor;
> + batch->ptr = batch->buffer + offset;
> + return offset;
> +}
Since the same functions are used in two libs (gen4 and gen6), maybe
it would be worth moving them to intel_batchbuffer (as was done
with the previously copy/pasted functions)?
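For illustration, the rounding that batch_round_upto() performs can be shown in isolation. This is a minimal stand-in and not the proposed intel_batchbuffer helper: struct demo_batch and the demo_ function names are hypothetical, and only the arithmetic is taken from the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct intel_batchbuffer: only the two
 * fields the quoted helpers touch. */
struct demo_batch {
	uint8_t buffer[4096];
	uint8_t *ptr;
};

/* Bytes consumed so far, as in the quoted batch_used(). */
static uint32_t demo_batch_used(const struct demo_batch *batch)
{
	return (uint32_t)(batch->ptr - batch->buffer);
}

/* Advance ptr to the next multiple of divisor and return that offset,
 * as in the quoted batch_round_upto(). The divide/multiply form works
 * for any non-zero divisor, not only powers of two (VERTEX_SIZE is 12). */
static uint32_t demo_batch_round_upto(struct demo_batch *batch, uint32_t divisor)
{
	uint32_t offset = demo_batch_used(batch);

	assert(divisor != 0);
	offset = (offset + divisor - 1) / divisor * divisor;
	assert(offset <= sizeof(batch->buffer));
	batch->ptr = batch->buffer + offset;
	return offset;
}
```

With 100 bytes used and a 12-byte vertex, the helper moves ptr to offset 108, so the start vertex index patched into 3DPRIMITIVE would be 108 / 12 = 9.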
<snip>
----
Lukasz
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy
2018-06-13 10:35 ` [igt-dev] [PATCH i-g-t] " Kalamarz, Lukasz
@ 2018-06-14 13:23 ` Ville Syrjälä
0 siblings, 0 replies; 7+ messages in thread
From: Ville Syrjälä @ 2018-06-14 13:23 UTC (permalink / raw)
To: Kalamarz, Lukasz; +Cc: igt-dev@lists.freedesktop.org
On Wed, Jun 13, 2018 at 10:35:56AM +0000, Kalamarz, Lukasz wrote:
> On Mon, 2018-06-11 at 19:14 +0300, Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Add rendercopy implementation for gen4/5. Basic structure
> > copied from the gen6 implementation,
>
> After refactoring parts of the rendercopy libs, I don't like that
> sentence :(
>
> > and the gen4/5 specific
> > bits were mostly lifted from sna.
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > lib/Makefile.sources | 2 +
> > lib/gen4_render.h | 628
> > ++++++++++++++++++++++++++++++++++++++++++
> > lib/intel_batchbuffer.c | 2 +
> > lib/meson.build | 1 +
> > lib/rendercopy.h | 5 +
> > lib/rendercopy_gen4.c | 704
> > ++++++++++++++++++++++++++++++++++++++++++++++++
> > 6 files changed, 1342 insertions(+)
> > create mode 100644 lib/gen4_render.h
> > create mode 100644 lib/rendercopy_gen4.c
> >
> > diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> > index 042c1d3bb44a..e0ebd02c1661 100644
> > --- a/lib/Makefile.sources
> > +++ b/lib/Makefile.sources
> > @@ -71,10 +71,12 @@ lib_source_list = \
> > gen8_media.h \
> > rendercopy_i915.c \
> > rendercopy_i830.c \
> > + gen4_render.h \
> > gen6_render.h \
> > gen7_render.h \
> > gen8_render.h \
> > gen9_render.h \
> > + rendercopy_gen4.c \
> > rendercopy_gen6.c \
> > rendercopy_gen7.c \
> > rendercopy_gen8.c \
> > diff --git a/lib/gen4_render.h b/lib/gen4_render.h
> > new file mode 100644
> > index 000000000000..ab1158e3c6d2
> > --- /dev/null
> > +++ b/lib/gen4_render.h
>
> Keeping in mind the refactoring of the genX_render libs introduced in the patch
> series https://patchwork.freedesktop.org/series/44624/, could you check
> whether the registers defined here with the GEN4/5 prefix are reimplemented in
> gen6_render? If so, it may be a good idea to modify those
> definitions rather than add more duplicates.
There are many things that didn't change from gen4 onwards. I think the
correct option would be to remove the duplicate definitions from gen6+
and inherit the gen4 stuff.
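As a sketch of what inheriting could look like, the opcode builder from gen4_render.h is defined once and a later-generation name is expressed in terms of the gen4 one. The DEMO_ names are hypothetical, and the (3, 1, 0) encoding for the drawing rectangle command is an assumption about the header contents, not something quoted in this thread.

```c
#include <stdint.h>

/* Opcode builder as defined in the patch's gen4_render.h. */
#define GEN4_3D(Pipeline, Opcode, Subopcode) ((3 << 29) | \
					      ((Pipeline) << 27) | \
					      ((Opcode) << 24) | \
					      ((Subopcode) << 16))

/* Assumed encoding of the drawing rectangle command; hypothetical name. */
#define DEMO_GEN4_3DSTATE_DRAWING_RECTANGLE GEN4_3D(3, 1, 0)

/* A later generation reuses the gen4 value instead of redefining it. */
#define DEMO_GEN6_3DSTATE_DRAWING_RECTANGLE DEMO_GEN4_3DSTATE_DRAWING_RECTANGLE

/* Command header dword: opcode plus (total length in dwords - 2),
 * mirroring the "| (4 - 2)" used by gen4_emit_drawing_rectangle(). */
static uint32_t demo_drawing_rectangle_header(void)
{
	return DEMO_GEN6_3DSTATE_DRAWING_RECTANGLE | (4 - 2);
}
```

Under that assumption both names expand to the same constant, so callers written against either generation emit an identical header dword.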
>
> > @@ -0,0 +1,628 @@
> > +#ifndef GEN4_RENDER_H
> > +#define GEN4_RENDER_H
>
> <snip>
>
> > diff --git a/lib/rendercopy_gen4.c b/lib/rendercopy_gen4.c
> > new file mode 100644
> > index 000000000000..acd4be8de9da
> > --- /dev/null
> > +++ b/lib/rendercopy_gen4.c
> > @@ -0,0 +1,704 @@
> > +#include "rendercopy.h"
> > +#include "intel_chipset.h"
> > +#include "gen4_render.h"
> > +#include "surfaceformat.h"
> > +
> > +#include <assert.h>
> > +
> > +#define VERTEX_SIZE (3*4)
> > +
> > +#define URB_VS_ENTRY_SIZE 1
> > +#define URB_GS_ENTRY_SIZE 0
> > +#define URB_CL_ENTRY_SIZE 0
> > +#define URB_SF_ENTRY_SIZE 2
> > +#define URB_CS_ENTRY_SIZE 1
> > +
> > +#define GEN4_GRF_BLOCKS(nreg) (((nreg) + 15) / 16 - 1)
> > +#define SF_KERNEL_NUM_GRF 16
> > +#define PS_KERNEL_NUM_GRF 32
> > +
> > +static const uint32_t gen4_sf_kernel_nomask[][4] = {
> > + { 0x00400031, 0x20c01fbd, 0x0069002c, 0x01110001 },
> > + { 0x00600001, 0x206003be, 0x00690060, 0x00000000 },
> > + { 0x00600040, 0x20e077bd, 0x00690080, 0x006940a0 },
> > + { 0x00600041, 0x202077be, 0x008d00e0, 0x000000c0 },
> > + { 0x00600040, 0x20e077bd, 0x006900a0, 0x00694060 },
> > + { 0x00600041, 0x204077be, 0x008d00e0, 0x000000c8 },
> > + { 0x00600031, 0x20001fbc, 0x008d0000, 0x8640c800 },
> > +};
> > +
> > +static const uint32_t gen5_sf_kernel_nomask[][4] = {
> > + { 0x00400031, 0x20c01fbd, 0x1069002c, 0x02100001 },
> > + { 0x00600001, 0x206003be, 0x00690060, 0x00000000 },
> > + { 0x00600040, 0x20e077bd, 0x00690080, 0x006940a0 },
> > + { 0x00600041, 0x202077be, 0x008d00e0, 0x000000c0 },
> > + { 0x00600040, 0x20e077bd, 0x006900a0, 0x00694060 },
> > + { 0x00600041, 0x204077be, 0x008d00e0, 0x000000c8 },
> > + { 0x00600031, 0x20001fbc, 0x648d0000, 0x8808c800 },
> > +};
> > +
> > +static const uint32_t gen4_ps_kernel_nomask_affine[][4] = {
> > + { 0x00800040, 0x23c06d29, 0x00480028, 0x10101010 },
> > + { 0x00800040, 0x23806d29, 0x0048002a, 0x11001100 },
> > + { 0x00802040, 0x2100753d, 0x008d03c0, 0x00004020 },
> > + { 0x00802040, 0x2140753d, 0x008d0380, 0x00004024 },
> > + { 0x00802059, 0x200077bc, 0x00000060, 0x008d0100 },
> > + { 0x00802048, 0x204077be, 0x00000064, 0x008d0140 },
> > + { 0x00802059, 0x200077bc, 0x00000070, 0x008d0100 },
> > + { 0x00802048, 0x208077be, 0x00000074, 0x008d0140 },
> > + { 0x00600201, 0x20200022, 0x008d0000, 0x00000000 },
> > + { 0x00000201, 0x20280062, 0x00000000, 0x00000000 },
> > + { 0x01800031, 0x21801d09, 0x008d0000, 0x02580001 },
> > + { 0x00600001, 0x204003be, 0x008d0180, 0x00000000 },
> > + { 0x00601001, 0x20c003be, 0x008d01a0, 0x00000000 },
> > + { 0x00600001, 0x206003be, 0x008d01c0, 0x00000000 },
> > + { 0x00601001, 0x20e003be, 0x008d01e0, 0x00000000 },
> > + { 0x00600001, 0x208003be, 0x008d0200, 0x00000000 },
> > + { 0x00601001, 0x210003be, 0x008d0220, 0x00000000 },
> > + { 0x00600001, 0x20a003be, 0x008d0240, 0x00000000 },
> > + { 0x00601001, 0x212003be, 0x008d0260, 0x00000000 },
> > + { 0x00600201, 0x202003be, 0x008d0020, 0x00000000 },
> > + { 0x00800031, 0x20001d28, 0x008d0000, 0x85a04800 },
> > +};
> > +
> > +static const uint32_t gen5_ps_kernel_nomask_affine[][4] = {
> > + { 0x00800040, 0x23c06d29, 0x00480028, 0x10101010 },
> > + { 0x00800040, 0x23806d29, 0x0048002a, 0x11001100 },
> > + { 0x00802040, 0x2100753d, 0x008d03c0, 0x00004020 },
> > + { 0x00802040, 0x2140753d, 0x008d0380, 0x00004024 },
> > + { 0x00802059, 0x200077bc, 0x00000060, 0x008d0100 },
> > + { 0x00802048, 0x204077be, 0x00000064, 0x008d0140 },
> > + { 0x00802059, 0x200077bc, 0x00000070, 0x008d0100 },
> > + { 0x00802048, 0x208077be, 0x00000074, 0x008d0140 },
> > + { 0x01800031, 0x21801fa9, 0x208d0000, 0x0a8a0001 },
> > + { 0x00802001, 0x304003be, 0x008d0180, 0x00000000 },
> > + { 0x00802001, 0x306003be, 0x008d01c0, 0x00000000 },
> > + { 0x00802001, 0x308003be, 0x008d0200, 0x00000000 },
> > + { 0x00802001, 0x30a003be, 0x008d0240, 0x00000000 },
> > + { 0x00600201, 0x202003be, 0x008d0020, 0x00000000 },
> > + { 0x00800031, 0x20001d28, 0x548d0000, 0x94084800 },
> > +};
> > +
> > +static uint32_t
> > +batch_used(struct intel_batchbuffer *batch)
> > +{
> > + return batch->ptr - batch->buffer;
> > +}
> > +
> > +static uint32_t
> > +batch_round_upto(struct intel_batchbuffer *batch, uint32_t divisor)
> > +{
> > + uint32_t offset = batch_used(batch);
> > + offset = (offset + divisor - 1) / divisor * divisor;
> > + batch->ptr = batch->buffer + offset;
> > + return offset;
> > +}
>
> Since the same functions are used in two libs (gen4 and gen6), maybe
> it would be worth moving them to intel_batchbuffer (as was done
> with the previously copy/pasted functions)?
Probably a better approach is to change gen4/6 to create the
vertex buffer the same way as gen7+.
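A minimal sketch of that idea, assuming it means filling the three RECTLIST vertices into a separately allocated array up front rather than appending them after MI_BATCH_BUFFER_END and patching the start vertex index afterwards. struct demo_vertex and demo_fill_vertices() are hypothetical names; the vertex order and the 12-byte layout (two signed 16-bit coordinates plus two normalized floats) follow the posted patch, and the actual gen7+ code may differ.

```c
#include <assert.h>
#include <stdint.h>

/* One vertex as the patch emits it: x,y as signed 16-bit values, then
 * u,v as floats normalized to the source size (VERTEX_SIZE == 3 * 4). */
struct demo_vertex {
	int16_t x, y;
	float u, v;
};

_Static_assert(sizeof(struct demo_vertex) == 12, "vertex must be 12 bytes");

/* Fill the three RECTLIST vertices in the order used by the patch:
 * bottom-right, bottom-left, top-left. */
static void demo_fill_vertices(struct demo_vertex v[3],
			       unsigned src_x, unsigned src_y,
			       unsigned src_w, unsigned src_h,
			       unsigned dst_x, unsigned dst_y,
			       unsigned width, unsigned height)
{
	assert(src_w != 0 && src_h != 0);

	v[0].x = (int16_t)(dst_x + width);
	v[0].y = (int16_t)(dst_y + height);
	v[0].u = (float)(src_x + width) / (float)src_w;
	v[0].v = (float)(src_y + height) / (float)src_h;

	v[1].x = (int16_t)dst_x;
	v[1].y = (int16_t)(dst_y + height);
	v[1].u = (float)src_x / (float)src_w;
	v[1].v = (float)(src_y + height) / (float)src_h;

	v[2].x = (int16_t)dst_x;
	v[2].y = (int16_t)dst_y;
	v[2].u = (float)src_x / (float)src_w;
	v[2].v = (float)src_y / (float)src_h;
}
```

Because the array has a known offset before any commands are emitted, the vertex buffer state could point at it directly and the batch_round_upto() fixup would no longer be needed.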
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy
2018-06-11 16:14 [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy Ville Syrjala
` (2 preceding siblings ...)
2018-06-13 10:35 ` [igt-dev] [PATCH i-g-t] " Kalamarz, Lukasz
@ 2018-06-14 13:13 ` Katarzyna Dec
2018-06-14 13:24 ` Ville Syrjälä
3 siblings, 1 reply; 7+ messages in thread
From: Katarzyna Dec @ 2018-06-14 13:13 UTC (permalink / raw)
To: Ville Syrjala; +Cc: igt-dev
On Mon, Jun 11, 2018 at 07:14:18PM +0300, Ville Syrjala wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add rendercopy implementation for gen4/5. Basic structure
> copied from the gen6 implementation, and the gen4/5 specific
> bits were mostly lifted from sna.
>
I have pushed the remaining rendercopy code to the public ML.
Could you please remove the duplicated register definitions?
I've talked to Lukasz and he will remove the duplication in
other places.
Maybe you could split this into several patches instead of one?
We are waiting for your patches :)
Kasia :)
* Re: [igt-dev] [PATCH i-g-t] lib/rendercopy: Add gen4/5 rendercopy
2018-06-14 13:13 ` Katarzyna Dec
@ 2018-06-14 13:24 ` Ville Syrjälä
0 siblings, 0 replies; 7+ messages in thread
From: Ville Syrjälä @ 2018-06-14 13:24 UTC (permalink / raw)
To: Katarzyna Dec; +Cc: igt-dev
On Thu, Jun 14, 2018 at 03:13:13PM +0200, Katarzyna Dec wrote:
> On Mon, Jun 11, 2018 at 07:14:18PM +0300, Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Add rendercopy implementation for gen4/5. Basic structure
> > copied from the gen6 implementation, and the gen4/5 specific
> > bits were mostly lifted from sna.
> >
> I have pushed the remaining rendercopy code to the public ML.
> Could you please remove the duplicated register definitions?
> I've talked to Lukasz and he will remove the duplication in
> other places.
> Maybe you could split this into several patches instead of one?
>
> We are waiting for your patches :)
Since you're already working on the deduplication, I don't know if I
should stick my fingers in that pie. I'm afraid my OCD will kick in
and I'll just move almost everything from the gen6+ files to the gen4
file, which will probably conflict badly with what you're already
doing.
--
Ville Syrjälä
Intel