From: Francisco Jerez
Subject: Re: [PATCH v4] drm/i915 : Added Programming of the MOCS
Date: Tue, 30 Jun 2015 20:00:25 +0300
Message-ID: <878ub13zdi.fsf@riseup.net>
References: <1434554362-22384-1-git-send-email-peter.antoine@intel.com> <87ioact03d.fsf@riseup.net>
To: Peter Antoine
Cc: intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

Peter Antoine writes:

> On Mon, 29 Jun 2015, Peter Antoine wrote:
>
>> On Thu, 25 Jun 2015, Francisco Jerez wrote:
>>
>>> Peter Antoine writes:
>>>
>>>> This change adds the programming of the MOCS registers to the gen 9+
>>>> platforms. This change set programs the MOCS register values to a set
>>>> of values that are defined to be optimal.
>>>>
>>>> It creates a fixed register set that is programmed across the different
>>>> engines so that all engines have the same table. This is done as the
>>>> main RCS context only holds the registers for itself and the shared
>>>> L3 values. By trying to keep the registers consistent across the
>>>> different engines it should make the programming for the registers
>>>> consistent.
>>>>
>>>> v2:
>>>> -'static const' for private data structures and style changes.(Matt Turner)
>>>> v3:
>>>> - Make the tables "slightly" more readable. (Damien Lespiau)
>>>> - Updated tables fix performance regression.
>>>> v4:
>>>> - Code formatting. (Chris Wilson)
>>>> - re-privatised mocs code. (Daniel Vetter)
>>>>
>>>> Signed-off-by: Peter Antoine
>>>> ---
>>>>  drivers/gpu/drm/i915/Makefile     |   1 +
>>>>  drivers/gpu/drm/i915/i915_reg.h   |   9 +
>>>>  drivers/gpu/drm/i915/intel_lrc.c  |  10 +-
>>>>  drivers/gpu/drm/i915/intel_lrc.h  |   4 +
>>>>  drivers/gpu/drm/i915/intel_mocs.c | 373 ++++++++++++++++++++++++++++++++++++++
>>>>  drivers/gpu/drm/i915/intel_mocs.h |  64 +++++++
>>>>  6 files changed, 460 insertions(+), 1 deletion(-)
>>>>  create mode 100644 drivers/gpu/drm/i915/intel_mocs.c
>>>>  create mode 100644 drivers/gpu/drm/i915/intel_mocs.h
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>>>> index b7ddf48..c781e19 100644
>>>> --- a/drivers/gpu/drm/i915/Makefile
>>>> +++ b/drivers/gpu/drm/i915/Makefile
>>>> @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
>>>>  	  i915_irq.o \
>>>>  	  i915_trace_points.o \
>>>>  	  intel_lrc.o \
>>>> +	  intel_mocs.o \
>>>>  	  intel_ringbuffer.o \
>>>>  	  intel_uncore.o
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>>>> index 7213224..3a435b5 100644
>>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>>> @@ -7829,4 +7829,13 @@ enum skl_disp_power_wells {
>>>>  #define _PALETTE_A (dev_priv->info.display_mmio_offset + 0xa000)
>>>>  #define _PALETTE_B (dev_priv->info.display_mmio_offset + 0xa800)
>>>>
>>>> +/* MOCS (Memory Object Control State) registers */
>>>> +#define GEN9_LNCFCMOCS0	(0xB020)	/* L3 Cache Control base */
>>>> +
>>>> +#define GEN9_GFX_MOCS_0	(0xc800)	/* Graphics MOCS base register*/
>>>> +#define GEN9_MFX0_MOCS_0	(0xc900)	/* Media 0 MOCS base register*/
>>>> +#define GEN9_MFX1_MOCS_0	(0xcA00)	/* Media 1 MOCS base register*/
>>>> +#define GEN9_VEBOX_MOCS_0	(0xcB00)	/* Video MOCS base register*/
>>>> +#define GEN9_BLT_MOCS_0	(0xcc00)	/* Blitter MOCS base register*/
>>>> +
>>>>  #endif /* _I915_REG_H_ */
>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>>> index 9f5485d..73b919d 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>> @@ -135,6 +135,7 @@
>>>>  #include
>>>>  #include
>>>>  #include "i915_drv.h"
>>>> +#include "intel_mocs.h"
>>>>
>>>>  #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
>>>>  #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
>>>> @@ -796,7 +797,7 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>>>>   *
>>>>   * Return: non-zero if the ringbuffer is not ready to be written to.
>>>>   */
>>>> -static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
>>>> +int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
>>>>  			     struct intel_context *ctx, int num_dwords)
>>>>  {
>>>>  	struct intel_engine_cs *ring = ringbuf->ring;
>>>> @@ -1379,6 +1380,13 @@ static int gen8_init_rcs_context(struct intel_engine_cs *ring,
>>>>  	if (ret)
>>>>  		return ret;
>>>>
>>>> +	/*
>>>> +	 * Failing to program the MOCS is non-fatal.The system will not
>>>> +	 * run at peak performance. So generate a warning and carry on.
>>>> +	 */
>>>> +	if (gen9_program_mocs(ring, ctx) != 0)
>>>> +		DRM_ERROR("MOCS failed to program: expect performance issues.");
>>>> +
>>>>  	return intel_lr_context_render_state_init(ring, ctx);
>>>>  }
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
>>>> index 04d3a6d..dbbd6af 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>>>> @@ -44,6 +44,10 @@ int intel_logical_rings_init(struct drm_device *dev);
>>>>
>>>>  int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf,
>>>>  				  struct intel_context *ctx);
>>>> +
>>>> +int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
>>>> +			     struct intel_context *ctx, int num_dwords);
>>>> +
>>>>  /**
>>>>   * intel_logical_ring_advance() - advance the ringbuffer tail
>>>>   * @ringbuf: Ringbuffer to advance.
>>>> diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
>>>> new file mode 100644
>>>> index 0000000..7c09e67
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/i915/intel_mocs.c
>>>> @@ -0,0 +1,373 @@
>>>> +/*
>>>> + * Copyright (c) 2015 Intel Corporation
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>>> + * copy of this software and associated documentation files (the "Software"),
>>>> + * to deal in the Software without restriction, including without limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>>> + * Software is furnished to do so, subject to the following conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice (including the next
>>>> + * paragraph) shall be included in all copies or substantial portions of the
>>>> + * Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>>>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
>>>> + * SOFTWARE.
>>>> + *
>>>> + * Authors:
>>>> + * Peter Antoine
>>>> + */
>>>> +
>>>> +#include "intel_mocs.h"
>>>> +#include "intel_lrc.h"
>>>> +#include "intel_ringbuffer.h"
>>>> +
>>>> +/* structures required */
>>>> +struct drm_i915_mocs_entry {
>>>> +	u32 control_value;
>>>> +	u16 l3cc_value;
>>>> +};
>>>> +
>>>> +struct drm_i915_mocs_table {
>>>> +	u32 size;
>>>> +	const struct drm_i915_mocs_entry *table;
>>>> +};
>>>> +
>>>> +/* Defines for the tables (XXX_MOCS_0 - XXX_MOCS_63) */
>>>> +#define MOCS_CACHEABILITY(value)	(value << 0)
>>>> +#define MOCS_TGT_CACHE(value)	(value << 2)
>>>> +#define MOCS_LRUM(value)	(value << 4)
>>>> +#define MOCS_AOM(value)	(value << 6)
>>>> +#define MOCS_LECC_ESC(value)	(value << 7)
>>>> +#define MOCS_LECC_SCC(value)	(value << 8)
>>>> +#define MOC_PFM(value)	(value << 11)
>>>> +#define MOCS_SCF(value)	(value << 14)
>>>> +
>>>> +/* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
>>>> +#define MOCS_ESC(value)	(value << 0)
>>>> +#define MOCS_SCC(value)	(value << 1)
>>>> +#define MOCS_L3_CACHEABILITY(value)	(value << 4)
>>>> +
>>>> +/* Helper defines */
>>>> +#define GEN9_NUM_MOCS_RINGS	(5)	/* Number of mocs engines to program */
>>>> +#define GEN9_NUM_MOCS_ENTRIES	(63)	/* 63 out of 64 - 64 is rsvrd */
>>>> +
>>>> +/* EDRAM Caching options */
>>>> +#define EDRAM_PAGETABLE	(0)
>>>> +#define EDRAM_UC	(1)
>>>> +#define EDRAM_RESERVED	(2)
>>>
>>> According to the BSpec this is WT rather than reserved?
>> Just checked the BSpec and you are correct, changing the text.
>> As well as for the items below.
> Just to add - I was looking at the wrong gen.
>>>
>>>> +#define EDRAM_WB	(3)
>>>> +
>>>> +/* L3 Caching options */
>>>> +#define L3_DIRECT	(0)
>>>> +#define L3_UC	(1)
>>>> +#define L3_RESERVED	(2)
>>>> +#define L3_WB	(3)
>>>> +
>>>> +/* target cache */
>>>> +#define ELLC	(0)
>>>
>>> BSpec says that this is "Use TC/LRU controls from page table", but upon
>>> a closer look it seems like the BSpec is wrong and your patch is
>>> correct. Can you confirm that this is what you intended?
>> These values look good, they are bits 3:2 for the XXX_MOCS_N registers
>> (c800) and friends.
>>>
>>>> +#define LLC	(1)
>>>> +#define LLC_ELLC	(2)
>>>> +
>>>> +/*
>>>> + * MOCS tables
>>>> + *
>>>> + * These are the MOCS tables that are programmed across all the rings.
>>>> + * The control value is programmed to all the rings that support the
>>>> + * MOCS registers. While the l3cc_values are only programmed to the
>>>> + * LNCFCMOCS0 - LNCFCMOCS32 registers.
>>>> + *
>>>> + * NOTE: These tables MUST start with being uncached and the length MUST be
>>>> + * less than 63 as the last two registers are reserved by the hardware.
>>>> + */
>>>> +static struct drm_i915_mocs_entry skylake_mocs_table[] = {
>>>> +	/* {0x00000009, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC_ELLC) |
>>>> +	  MOCS_LRUM(0) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x0000003b, 0x0030} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC_ELLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_WB))},
>>>> +	/* {0x00000039, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC_ELLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x00000017, 0x0030} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(1) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_WB))},
>>>> +	/* {0x00000017, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(1) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x00000019, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC_ELLC) |
>>>> +	  MOCS_LRUM(1) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x00000037, 0x0030} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_WB))},
>>>> +	/* {0x00000037, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x0000003b, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC_ELLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +};
>>>
>>> Mesa will want an additional entry with TC=LLC/eLLC, LeCC=PTE, L3CC=WB,
>>> everything else unset, I'll reply with a userspace patch making use of
>>> your change if you add such an entry.
> Ok. I think what you want is, same as entry two, but use the underlying
> pagetable settings and not specify the EDRAM settings. Please confirm in
> the new patchset.

Yeah, that sounds good.

>>>
>>> Another thing worth mentioning is that entries 0, 2 and 5 seem to do the
>>> same thing suspiciously, the only difference is the LRUM field which
>>> AFAIK doesn't have any effect for LeCC=UC. Is my understanding correct?
>>>
>> These tables are generated via requests and then boiled down to the above.
>> So some of the entries are by request. Swings and roundabouts, can remove
>> the ones that look redundant but then the tuning that has been done won't
>> match. I'll add the new entry at the end of the table.
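
For anyone following the thread, the `/* {control, l3cc} */` comments in the quoted tables can be checked by hand against the field macros. The following is only a minimal standalone sketch, assuming the shift positions and enum values from the quoted patch; `make_mocs_entry()` and `struct mocs_entry` are hypothetical names introduced for illustration and are not part of the patch, and only the fields the tables actually vary (LeCC, TC, LRUM, L3CC) are modelled.

```c
#include <stdint.h>

/* Shift positions as in the quoted patch; arguments parenthesised here. */
#define MOCS_CACHEABILITY(value)	((uint32_t)(value) << 0)
#define MOCS_TGT_CACHE(value)		((uint32_t)(value) << 2)
#define MOCS_LRUM(value)		((uint32_t)(value) << 4)
#define MOCS_L3_CACHEABILITY(value)	((uint16_t)((value) << 4))

/* Values as defined in the quoted patch. */
enum { EDRAM_PAGETABLE = 0, EDRAM_UC = 1, EDRAM_WB = 3 };
enum { L3_UC = 1, L3_WB = 3 };
enum { ELLC = 0, LLC = 1, LLC_ELLC = 2 };

/* Hypothetical stand-in for struct drm_i915_mocs_entry. */
struct mocs_entry {
	uint32_t control_value;
	uint16_t l3cc_value;
};

/* Hypothetical helper: build one entry from the four varying fields. */
static struct mocs_entry make_mocs_entry(unsigned int lecc, unsigned int tc,
					 unsigned int lrum, unsigned int l3cc)
{
	struct mocs_entry e = {
		.control_value = MOCS_CACHEABILITY(lecc) |
				 MOCS_TGT_CACHE(tc) |
				 MOCS_LRUM(lrum),
		.l3cc_value = MOCS_L3_CACHEABILITY(l3cc),
	};

	return e;
}
```

With these definitions, `make_mocs_entry(EDRAM_WB, LLC_ELLC, 3, L3_WB)` reproduces the second Skylake entry, {0x0000003b, 0x0030}. By the same arithmetic, the additional entry discussed above (TC=LLC/eLLC, LeCC=PTE, L3CC=WB, everything else unset) would come out as {0x00000008, 0x0030}; that figure follows from the patch's macros only and has not been checked against the BSpec or a later revision of the patch.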
>>>> +
>>>> +static struct drm_i915_mocs_entry broxton_mocs_table[] = {
>>>> +	/* {0x00000001, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(ELLC) |
>>>> +	  MOCS_LRUM(0) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x00000005, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(0) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x00000005, 0x0030} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(0) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_WB))},
>>>> +	/* {0x00000017, 0x0030} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(1) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_WB))},
>>>> +	/* {0x00000017, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(1) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x00000019, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC_ELLC) |
>>>> +	  MOCS_LRUM(1) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x00000037, 0x0030} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_WB))},
>>>> +	/* {0x00000037, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +	/* {0x0000003b, 0x0010} */
>>>> +	{(MOCS_CACHEABILITY(EDRAM_WB) | MOCS_TGT_CACHE(LLC_ELLC) |
>>>> +	  MOCS_LRUM(3) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
>>>> +	  MOC_PFM(0) | MOCS_SCF(0)),
>>>> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
>>>> +};
>>>> +
>>>
>>> Wouldn't it be a good idea to have BXT's entries match SKL's for a given
>>> index? The TC, LeCC and LRUM settings you do here arguably don't have
>>> any effect on BXT, L3CC does but it doesn't match SKL's setting for
>>> entries 1 and 2. Is there any reason for this?
>> As mentioned above this table is auto-generated and matches another tuned
>> table, simply keeping them the same allows for the tuning to be consistent
>> across platforms.
>>
>> Peter.
>>>
>>> Other than that looks good.
>>>
>>>> +/**
>>>> + * get_mocs_settings
>>>> + *
>>>> + * This function will return the values of the MOCS table that needs to
>>>> + * be programmed for the platform. It will return the values that need
>>>> + * to be programmed and if they need to be programmed.
>>>> + *
>>>> + * If the return values is false then the registers do not need programming.
>>>> + */
>>>> +static bool get_mocs_settings(struct drm_device *dev,
>>>> +			      struct drm_i915_mocs_table *table) {
>>>> +	bool result = false;
>>>> +
>>>> +	if (IS_SKYLAKE(dev)) {
>>>> +		table->size = ARRAY_SIZE(skylake_mocs_table);
>>>> +		table->table = skylake_mocs_table;
>>>> +		result = true;
>>>> +	} else if (IS_BROXTON(dev)) {
>>>> +		table->size = ARRAY_SIZE(broxton_mocs_table);
>>>> +		table->table = broxton_mocs_table;
>>>> +		result = true;
>>>> +	} else {
>>>> +		/* Platform that should have a MOCS table does not */
>>>> +		WARN_ON(INTEL_INFO(dev)->gen >= 9);
>>>> +	}
>>>> +
>>>> +	return result;
>>>> +}
>>>> +
>>>> +/**
>>>> + * emit_mocs_control_table() - emit the mocs control table
>>>> + * @ringbuf: DRM device.
>>>> + * @table: The values to program into the control regs.
>>>> + * @reg_base: The base for the Engine that needs to be programmed.
>>>> + *
>>>> + * This function simply emits a MI_LOAD_REGISTER_IMM command for the
>>>> + * given table starting at the given address.
>>>> + *
>>>> + * Return: Nothing.
>>>> + */
>>>> +static void emit_mocs_control_table(struct intel_ringbuffer *ringbuf,
>>>> +				    struct drm_i915_mocs_table *table,
>>>> +				    u32 reg_base)
>>>> +{
>>>> +	unsigned int index;
>>>> +
>>>> +	intel_logical_ring_emit(ringbuf,
>>>> +			MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES));
>>>> +
>>>> +	for (index = 0; index < table->size; index++) {
>>>> +		intel_logical_ring_emit(ringbuf, reg_base + (index * 4));
>>>> +		intel_logical_ring_emit(ringbuf,
>>>> +					table->table[index].control_value);
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Ok, now set the unused entries to uncached. These entries are
>>>> +	 * officially undefined and no contact is given for the contents and
>>>> +	 * settings is given for these entries.
>>>> +	 *
>>>> +	 * Entry 0 in the table is uncached - so we are just written that
>>>> +	 * value to all the used entries.
>>>> +	 */
>>>> +	for (; index < GEN9_NUM_MOCS_ENTRIES; index++) {
>>>> +		intel_logical_ring_emit(ringbuf, reg_base + (index * 4));
>>>> +		intel_logical_ring_emit(ringbuf, table->table[0].control_value);
>>>> +	}
>>>> +
>>>> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
>>>> +}
>>>> +
>>>> +/**
>>>> + * emit_mocs_l3cc_table() - emit the mocs control table
>>>> + * @ringbuf: DRM device.
>>>> + * @table: The values to program into the control regs.
>>>> + *
>>>> + * This function simply emits a MI_LOAD_REGISTER_IMM command for the
>>>> + * given table starting at the given address. This register set is programmed
>>>> + * in pairs.
>>>> + *
>>>> + * Return: Nothing.
>>>> + */
>>>> +static void emit_mocs_l3cc_table(struct intel_ringbuffer *ringbuf,
>>>> +				 struct drm_i915_mocs_table *table) {
>>>> +	unsigned int count;
>>>> +	unsigned int i;
>>>> +	u32 value;
>>>> +	u32 filler = (table->table[0].l3cc_value & 0xffff) |
>>>> +		     ((table->table[0].l3cc_value & 0xffff) << 16);
>>>> +
>>>> +	intel_logical_ring_emit(ringbuf,
>>>> +			MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES / 2));
>>>> +
>>>> +	for (i = 0, count = 0; i < table->size / 2; i++, count += 2) {
>>>> +		value = (table->table[count].l3cc_value & 0xffff) |
>>>> +			((table->table[count + 1].l3cc_value & 0xffff) << 16);
>>>> +
>>>> +		intel_logical_ring_emit(ringbuf, GEN9_LNCFCMOCS0 + (i * 4));
>>>> +		intel_logical_ring_emit(ringbuf, value);
>>>> +	}
>>>> +
>>>> +	if (table->size & 0x01) {
>>>> +		/* Odd table size - 1 left over */
>>>> +		value = (table->table[count].l3cc_value & 0xffff) |
>>>> +			((table->table[0].l3cc_value & 0xffff) << 16);
>>>> +	} else
>>>> +		value = filler;
>>>> +
>>>> +	/*
>>>> +	 * Now set the rest of the table to uncached - use entry 0 as this
>>>> +	 * will be uncached. Leave the last pair as initialised as they are
>>>> +	 * reserved by the hardware.
>>>> +	 */
>>>> +	for (; i < (GEN9_NUM_MOCS_ENTRIES / 2) - 1; i++) {
>>>> +		intel_logical_ring_emit(ringbuf, GEN9_LNCFCMOCS0 + (i * 4));
>>>> +		intel_logical_ring_emit(ringbuf, value);
>>>> +
>>>> +		value = filler;
>>>> +	}
>>>> +
>>>> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
>>>> +}
>>>> +
>>>> +/*
>>>> + * gen9_program_mocs() - program the MOCS register.
>>>> + *
>>>> + * ring: The ring that the programming batch will be run in.
>>>> + * ctx: The intel_context to be used.
>>>> + *
>>>> + * This function will emit a batch buffer with the values required for
>>>> + * programming the MOCS register values for all the currently supported
>>>> + * rings.
>>>> + *
>>>> + * These registers are partially stored in the RCS context, so they are
>>>> + * emitted at the same time so that when a context is created these registers
>>>> + * are set up. These registers have to be emitted into the start of the
>>>> + * context as setting the ELSP will re-init some of these registers back
>>>> + * to the hw values.
>>>> + *
>>>> + * Return: 0 on success, otherwise the error status.
>>>> + */
>>>> +int gen9_program_mocs(struct intel_engine_cs *ring,
>>>> +		      struct intel_context *ctx)
>>>> +{
>>>> +	int ret = 0;
>>>> +
>>>> +	struct drm_i915_mocs_table t;
>>>> +	struct drm_device *dev = ring->dev;
>>>> +	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
>>>> +
>>>> +	if (get_mocs_settings(dev, &t)) {
>>>> +		u32 table_size;
>>>> +
>>>> +		/*
>>>> +		 * OK. For each supported ring:
>>>> +		 * number of mocs entries * 2 dwords for each control_value
>>>> +		 * plus number of mocs entries /2 dwords for l3cc values.
>>>> +		 *
>>>> +		 * Plus 1 for the load command and 1 for the NOOP per ring
>>>> +		 * and the l3cc programming.
>>>> +		 */
>>>> +		table_size = GEN9_NUM_MOCS_RINGS *
>>>> +			((2 * GEN9_NUM_MOCS_ENTRIES) + 2) +
>>>> +			GEN9_NUM_MOCS_ENTRIES + 2;
>>>> +		ret = intel_logical_ring_begin(ringbuf, ctx, table_size);
>>>> +		if (ret) {
>>>> +			DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
>>>> +			return ret;
>>>> +		}
>>>> +
>>>> +		/* program the control registers */
>>>> +		emit_mocs_control_table(ringbuf, &t, GEN9_GFX_MOCS_0);
>>>> +		emit_mocs_control_table(ringbuf, &t, GEN9_MFX0_MOCS_0);
>>>> +		emit_mocs_control_table(ringbuf, &t, GEN9_MFX1_MOCS_0);
>>>> +		emit_mocs_control_table(ringbuf, &t, GEN9_VEBOX_MOCS_0);
>>>> +		emit_mocs_control_table(ringbuf, &t, GEN9_BLT_MOCS_0);
>>>> +
>>>> +		/* now program the l3cc registers */
>>>> +		emit_mocs_l3cc_table(ringbuf, &t);
>>>> +
>>>> +		intel_logical_ring_advance(ringbuf);
>>>> +
>>>> +		DRM_DEBUG("MOCS: Table set in Context\n");
>>>> +	} else {
>>>> +		DRM_DEBUG("MOCS: Table Not supported on platform\n");
>>>> +	}
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> diff --git a/drivers/gpu/drm/i915/intel_mocs.h b/drivers/gpu/drm/i915/intel_mocs.h
>>>> new file mode 100644
>>>> index 0000000..e2780ce
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/i915/intel_mocs.h
>>>> @@ -0,0 +1,64 @@
>>>> +/*
>>>> + * Copyright (c) 2015 Intel Corporation
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>>> + * copy of this software and associated documentation files (the "Software"),
>>>> + * to deal in the Software without restriction, including without limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>>> + * Software is furnished to do so, subject to the following conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice (including the next
>>>> + * paragraph) shall be included in all copies or substantial portions of the
>>>> + * Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>>>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
>>>> + * SOFTWARE.
>>>> + *
>>>> + * Authors:
>>>> + * Peter Antoine
>>>> + */
>>>> +
>>>> +#ifndef INTEL_MOCS_H
>>>> +#define INTEL_MOCS_H
>>>> +
>>>> +/**
>>>> + * DOC: Memory Objects Control State (MOCS)
>>>> + *
>>>> + * Motivation:
>>>> + * In previous Gens the MOCS settings was a value that was set by user land as
>>>> + * part of the batch. In Gen9 this has changed to be a single table (per ring)
>>>> + * that all batches now reference by index instead of programming the MOCS
>>>> + * directly.
>>>> + *
>>>> + * The one wrinkle in this is that only PART of the MOCS tables are included
>>>> + * in context (The GFX_MOCS_0 - GFX_MOCS_64 and the LNCFCMOCS0 - LNCFCMOCS32
>>>> + * registers). The rest are not (the settings for the other rings).
>>>> + *
>>>> + * This table needs to be set at system start-up because the way the table
>>>> + * interacts with the contexts and the GmmLib interface.
>>>> + *
>>>> + *
>>>> + * Implementation:
>>>> + *
>>>> + * The table is programmed on a platform basis from a table that is generated
>>>> + * from the one that has been agreed by the different responsible parties. This
>>>> + * tables (one per supported platform) is defined in intel_mocs.c and is
>>>> + * programmed in the first batch after the context is loaded (with the hardware
>>>> + * workarounds). This will then let the usual context handling keep the MOCS in
>>>> + * step.
>>>> + */
>>>> +
>>>> +#include
>>>> +#include "i915_drv.h"
>>>> +
>>>> +int gen9_program_mocs(struct intel_engine_cs *ring,
>>>> +		      struct intel_context *ctx);
>>>> +
>>>> +#endif
>>>> +
>>>> --
>>>> 1.9.1
>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>
>>
>> --
>> Peter Antoine (Android Graphics Driver Software Engineer)
>> ---------------------------------------------------------------------
>> Intel Corporation (UK) Limited
>> Registered No. 1134945 (England)
>> Registered Office: Pipers Way, Swindon SN3 1RJ
>> VAT No: 860 2173 47
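
As a reading aid for the quoted emit_mocs_l3cc_table(), the pairing of 16-bit l3cc values into 32-bit LNCFCMOCS registers can be modelled outside the driver. This is a rough sketch under stated assumptions, not the patch's code: `pack_l3cc()` is a hypothetical helper, it writes into a plain array instead of emitting ring commands, and it assumes the table has at least one entry and that entry 0 is the uncached filler, as the quoted NOTE requires.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: fill n_regs 32-bit words
 * from n 16-bit l3cc values, two per word, with the lower-numbered entry
 * in the low half. Slots past the end of the table reuse l3cc[0], which
 * the quoted tables require to be the uncached entry. Assumes n >= 1.
 */
static void pack_l3cc(const uint16_t *l3cc, size_t n,
		      uint32_t *regs, size_t n_regs)
{
	for (size_t i = 0; i < n_regs; i++) {
		uint16_t lo = (2 * i < n) ? l3cc[2 * i] : l3cc[0];
		uint16_t hi = (2 * i + 1 < n) ? l3cc[2 * i + 1] : l3cc[0];

		regs[i] = (uint32_t)lo | ((uint32_t)hi << 16);
	}
}
```

Called with the nine l3cc values of the quoted Skylake table and n_regs = 30, which is (GEN9_NUM_MOCS_ENTRIES / 2) - 1 in the quoted loop bound, the first word is 0x00300010 (entries 0 and 1), the fifth word pairs the leftover entry 8 with the filler, and every later word is the filler 0x00100010.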