From: Francisco Jerez <currojerez@riseup.net>
To: Zhigang Gong <zhigang.gong@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org, Zhigang Gong <zhigang.gong@intel.com>
Subject: Re: [PATCH] drm/i915/hsw: enable atomic in L3 for some steppings.
Date: Tue, 06 Jan 2015 16:19:08 +0200 [thread overview]
Message-ID: <874ms4gffn.fsf@riseup.net> (raw)
In-Reply-To: <20150106080101.GF28045@ivb-gt2-rev4>
[-- Attachment #1.1.1: Type: text/plain, Size: 6426 bytes --]
Zhigang Gong <zhigang.gong@linux.intel.com> writes:
> On Mon, Jan 05, 2015 at 07:27:26PM +0200, Francisco Jerez wrote:
>> Zhigang Gong <zhigang.gong@linux.intel.com> writes:
>>
>> > On Mon, Jan 05, 2015 at 05:03:16AM +0200, Francisco Jerez wrote:
>> >> Zhigang Gong <zhigang.gong@intel.com> writes:
>> >>
>> >> > According to bspec, ROW_CHICKEN3's bit 6 which is to disable
>> >> > move of cacheable global atomics to L3 is needed for GT3 D
>> >> > stepping.
>> >> >
>> >> > I enabled it and tested it with HSW GT2 D stepping and GT3 E stepping.
>> >> > The atomics works fine in beignet. And it could get more than 10x performance
>> >> > improvement with some workload, for an example, the "splat" kernel in darktable,
>> >> > without this patch, it consumes 50 seconds in one large raw picture processing.
>> >> > But with this patch, the same process only takes less than 1 second.
>> >> >
>> >>
>> >> I tried this already (on HSW GT2 D as well) and I don't think it's
>> >> enough to get L3 atomics working reliably. Even though they did seem to
>> >> work OK at first glance I observed some corruption issues (e.g. atomic
>> >> writes not landing in system memory) when doing atomic writes to
>> >> contiguous (as in within the same cache-line) locations in memory. The
>> >> "unused" ARB_shader_image_load_store test [1] I sent to the Piglit
>> >> mailing list some time ago exposes this IIRC, and probably a couple of
>> >> other tests too.
>> > Ok, I will find that case and have a try on my systems. I just tested all
>> > the atomic related OpenCL conformance test cases without any issues.
>
> I just found the patchset hasn't been accepted by piglit. I took a look at
> the source code. And realize that that shader will not work correctly if
> mesa doesn't allocate some DC in L3 space. Don't know whether you already
> allocated some DC when you did that test.
>
Of course, and it still fails no matter how you allocate the L3.
>> >
>> >>
>> >> Also this change is going to cause an instant lock-up anytime Mesa uses
>> >> atomics because Mesa doesn't change the default L3 way allocation for
>> >> the DC, which turns out to be 0 on HSW.
>> >
>> > This is another issue, IMHO, if the application wants to use atomics,
>> > it's better to allocate some L3 space for DC. Otherwise, it could
>> > never leverage the "atomics in L3 feature". Based on my test,
>> > the performance impact as huge as more than 10x for some workloads.
>> >
>> Sure, but this change alone will cause a regression (an irrecoverable
>> system hang) with current releases of Mesa.
>>
>> I agree that Mesa should be fixed eventually to assign L3 space to the
>> DC for some workloads, but it doesn't seem like we have a satisfactory
>> API to do that right now -- The current mechanism used by Beignet
>> (poking the L3 control registers from the batch buffer) is slightly
>> concerning from a security point of view because it allows an arbitrary
>> userspace application to cause misrendering or impact the performance of
> I agree with you that the L3 configuration is very dangerous. It's better
> to do more checking in the kernel space and make sure the user space
> application will not crash the system via L3 configuration.
> And, if the kernel provide new API we will use it in beignet ASAP in beignet.
>
>> other clients (since the L3 config registers are not being saved and
>> restored as part of the context) and even crash the whole system. It
> I found the L3 config registers are part of the IVB context, but not part
> of HSW's. So if two user applications want to use different L3 config,
> the user application need to do the config registers' store/restore manually.
> Is there a way to do this type of thing gracefully in KMD? The new API
> may need to consider the safely context switching regards to L3 config change.
>
Yeah, relying on userspace saving and restoring the L3 config values
sounds rather fragile to me, and may not work if e.g. a context hangs
and gets kicked out. IMHO it would be preferrable to keep track of the
L3 config and save/restore it from the kernel on context switch.
>> also doesn't look like it could be made to work with the hardware
>> command checker.
>
> As to the hardware command checker:
> I just found with current nightly kernel, if we boot the kernel with
> "i915.enable_ppgtt=2", then we can configure the L3 related registers
> in user space. But the command checker still don't allow user space to
> chagne L3 config by default.
>
Ugh, apparently that's because the hardware command checker is not
enabled at all in that case...
> Thanks,
> Zhigang Gong.
>
>>
>> > Thanks,
>> > Zhigang Gong.
>> >
>> >>
>> >> [1] http://lists.freedesktop.org/archives/piglit/2014-December/013571.html
>> >>
>> >> > Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
>> >> > ---
>> >> > drivers/gpu/drm/i915/intel_pm.c | 10 ++++++----
>> >> > 1 file changed, 6 insertions(+), 4 deletions(-)
>> >> >
>> >> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> >> > index 7d99a9c..8a27802 100644
>> >> > --- a/drivers/gpu/drm/i915/intel_pm.c
>> >> > +++ b/drivers/gpu/drm/i915/intel_pm.c
>> >> > @@ -5938,10 +5938,12 @@ static void haswell_init_clock_gating(struct drm_device *dev)
>> >> >
>> >> > ilk_init_lp_watermarks(dev);
>> >> >
>> >> > - /* L3 caching of data atomics doesn't work -- disable it. */
>> >> > - I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
>> >> > - I915_WRITE(HSW_ROW_CHICKEN3,
>> >> > - _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
>> >> > + if (IS_HSW_GT3(dev) && dev->pdev->revision <= 6) {
>> >> > + /* L3 caching of data atomics doesn't work -- disable it. */
>> >> > + I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
>> >> > + I915_WRITE(HSW_ROW_CHICKEN3,
>> >> > + _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
>> >> > + }
>> >> >
>> >> > /* This is required by WaCatErrorRejectionIssue:hsw */
>> >> > I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
>> >> > --
>> >> > 1.8.3.2
>> >
>> >
>> >
>> >
>> >> _______________________________________________
>> >> Intel-gfx mailing list
>> >> Intel-gfx@lists.freedesktop.org
>> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[-- Attachment #1.2: Type: application/pgp-signature, Size: 212 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-01-06 14:22 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-05 1:05 [PATCH] drm/i915/hsw: enable atomic in L3 for some steppings Zhigang Gong
2015-01-05 3:03 ` Francisco Jerez
2015-01-05 2:36 ` Zhigang Gong
2015-01-05 17:27 ` Francisco Jerez
2015-01-06 8:01 ` Zhigang Gong
2015-01-06 14:19 ` Francisco Jerez [this message]
2015-01-05 7:05 ` shuang.he
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=874ms4gffn.fsf@riseup.net \
--to=currojerez@riseup.net \
--cc=intel-gfx@lists.freedesktop.org \
--cc=zhigang.gong@intel.com \
--cc=zhigang.gong@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox