Re: [PATCH v4] drm/amdgpu: Ensure the DMA engine is deactivated during set ups

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Lang Yu <Lang.Yu@amd.com>
To: Haohui Mai <ricetons@gmail.com>
Cc: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Zhang, Hawking" <Hawking.Zhang@amd.com>,
	"Chen, Guchun" <Guchun.Chen@amd.com>,
	amd-gfx@lists.freedesktop.org,
	"Yifan Zhang" <yifan1.zhang@amd.com>
Subject: Re: [PATCH v4] drm/amdgpu: Ensure the DMA engine is deactivated during set ups
Date: Fri, 29 Apr 2022 16:32:31 +0800	[thread overview]
Message-ID: <Ymuin6/LGeiz4SE9@lang-desktop> (raw)
In-Reply-To: <CAHpOOhEA-GEgLU9B8RrYo9VnATj62sqnRm+6e3NsHo2H02pD7g@mail.gmail.com>

On 04/29/ , Haohui Mai wrote:
> Thanks for pointing it out. The v5 patch added the code back.
> 
> ~Haohui

But you didn't remove the following codes added in v4.

 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
+		sdma_v5_2_ctx_switch_disable_all(adev);
+		sdma_v5_2_halt(adev);
+

sdma_v5_2_halt() has been called in sdma_v5_2_load_microcode().

No need to call these two functions here.

Regards,
Lang

> On Thu, Apr 28, 2022 at 10:05 PM Lang Yu <Lang.Yu@amd.com> wrote:
> >
> > On 04/28/ , Haohui Mai wrote:
> > > If I understand correctly, the original code will disable the HALT bit
> > > of the register mmSDMA0_F32_CNTL twice on other code paths -- one in
> > > the sdma_v5_2_ctx_switch_enable() and the other one in
> > > sdma_v5_2_enable().
> > >
> > > The change ensures that the bit is only disabled once. Just wondering,
> > > which one is the expected behavior?
> > >
> > > ~Haohui
> >
> > The HALT bit of the register mmSDMA0_F32_CNTL is not set/reset in original sdma_v5_2_ctx_switch_enable()
> > (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fagd5f%2Flinux%2F-%2Fblob%2Famd-staging-drm-next%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Fsdma_v5_2.c%23L523&amp;data=05%7C01%7CLang.Yu%40amd.com%7C2df035ef6ad5411215a208da299c0253%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637868047861293833%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=%2FLpd6TUyxSe96rEzbyvV0zXNPZ%2BIokiT8KtGAKea8Co%3D&amp;reserved=0)
> >
> > Regards,
> > Lang
> >
> > > On Thu, Apr 28, 2022 at 6:27 PM Lang Yu <Lang.Yu@amd.com> wrote:
> > > >
> > > > On 04/28/ , ricetons@gmail.com wrote:
> > > > > From: Haohui Mai <ricetons@gmail.com>
> > > > >
> > > > > The patch fully deactivates the DMA engine before setting up the ring
> > > > > buffer to avoid potential data races and crashes.
> > > > >
> > > > > The v4 patch addresses the comments from Lang Yu <Lang.Yu@amd.com>.
> > > > >
> > > > > Signed-off-by: Haohui Mai <ricetons@gmail.com>
> > > > > ---
> > > > >  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 113 ++++++++++++++-----------
> > > > >  1 file changed, 62 insertions(+), 51 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > > > > index 013d2dec81d0..b000117b77d0 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > > > > @@ -459,7 +459,6 @@ static void sdma_v5_2_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se
> > > > >       }
> > > > >  }
> > > > >
> > > > > -
> > > > >  /**
> > > > >   * sdma_v5_2_gfx_stop - stop the gfx async dma engines
> > > > >   *
> > > > > @@ -505,17 +504,21 @@ static void sdma_v5_2_rlc_stop(struct amdgpu_device *adev)
> > > > >  }
> > > > >
> > > > >  /**
> > > > > - * sdma_v5_2_ctx_switch_enable - stop the async dma engines context switch
> > > > > + * sdma_v5_2_ctx_switch_enable_for_instance - start the async dma engines
> > > > > + * context switch for an instance
> > > > >   *
> > > > >   * @adev: amdgpu_device pointer
> > > > > - * @enable: enable/disable the DMA MEs context switch.
> > > > > + * @instance_idx: the index of the SDMA instance
> > > > >   *
> > > > > - * Halt or unhalt the async dma engines context switch.
> > > > > + * Unhalt the async dma engines context switch.
> > > > >   */
> > > > > -static void sdma_v5_2_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
> > > > > +static void sdma_v5_2_ctx_switch_enable_for_instance(struct amdgpu_device *adev, int instance_idx)
> > > > >  {
> > > > >       u32 f32_cntl, phase_quantum = 0;
> > > > > -     int i;
> > > > > +
> > > > > +     if (WARN_ON(instance_idx >= adev->sdma.num_instances)) {
> > > > > +             return;
> > > > > +     }
> > > > >
> > > > >       if (amdgpu_sdma_phase_quantum) {
> > > > >               unsigned value = amdgpu_sdma_phase_quantum;
> > > > > @@ -539,61 +542,71 @@ static void sdma_v5_2_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
> > > > >               phase_quantum =
> > > > >                       value << SDMA0_PHASE0_QUANTUM__VALUE__SHIFT |
> > > > >                       unit  << SDMA0_PHASE0_QUANTUM__UNIT__SHIFT;
> > > > > -     }
> > > > > -
> > > > > -     for (i = 0; i < adev->sdma.num_instances; i++) {
> > > > > -             if (enable && amdgpu_sdma_phase_quantum) {
> > > > > -                     WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_PHASE0_QUANTUM),
> > > > > -                            phase_quantum);
> > > > > -                     WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_PHASE1_QUANTUM),
> > > > > -                            phase_quantum);
> > > > > -                     WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_PHASE2_QUANTUM),
> > > > > -                            phase_quantum);
> > > > > -             }
> > > > >
> > > > > -             if (!amdgpu_sriov_vf(adev)) {
> > > > > -                     f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL));
> > > > > -                     f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
> > > > > -                                     AUTO_CTXSW_ENABLE, enable ? 1 : 0);
> > > > > -                     WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), f32_cntl);
> > > > > -             }
> > > > > +             WREG32_SOC15_IP(GC,
> > > > > +                     sdma_v5_2_get_reg_offset(adev, instance_idx, mmSDMA0_PHASE0_QUANTUM),
> > > > > +                     phase_quantum);
> > > > > +             WREG32_SOC15_IP(GC,
> > > > > +                     sdma_v5_2_get_reg_offset(adev, instance_idx, mmSDMA0_PHASE1_QUANTUM),
> > > > > +                 phase_quantum);
> > > > > +             WREG32_SOC15_IP(GC,
> > > > > +                     sdma_v5_2_get_reg_offset(adev, instance_idx, mmSDMA0_PHASE2_QUANTUM),
> > > > > +                 phase_quantum);
> > > > >       }
> > > > >
> > > > > +     if (!amdgpu_sriov_vf(adev)) {
> > > > > +             f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, instance_idx, mmSDMA0_CNTL));
> > > > > +             f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
> > > > > +                             AUTO_CTXSW_ENABLE, 1);
> > > > > +             WREG32(sdma_v5_2_get_reg_offset(adev, instance_idx, mmSDMA0_CNTL), f32_cntl);
> > > > > +     }
> > > > >  }
> > > > >
> > > > >  /**
> > > > > - * sdma_v5_2_enable - stop the async dma engines
> > > > > + * sdma_v5_2_ctx_switch_disable_all - stop the async dma engines context switch
> > > > >   *
> > > > >   * @adev: amdgpu_device pointer
> > > > > - * @enable: enable/disable the DMA MEs.
> > > > >   *
> > > > > - * Halt or unhalt the async dma engines.
> > > > > + * Halt the async dma engines context switch.
> > > > >   */
> > > > > -static void sdma_v5_2_enable(struct amdgpu_device *adev, bool enable)
> > > > > +static void sdma_v5_2_ctx_switch_disable_all(struct amdgpu_device *adev)
> > > > >  {
> > > > >       u32 f32_cntl;
> > > > >       int i;
> > > > >
> > > > > -     if (!enable) {
> > > > > -             sdma_v5_2_gfx_stop(adev);
> > > > > -             sdma_v5_2_rlc_stop(adev);
> > > > > -     }
> > > > > +     if (amdgpu_sriov_vf(adev))
> > > > > +             return;
> > > > >
> > > > > -     if (!amdgpu_sriov_vf(adev)) {
> > > > > -             for (i = 0; i < adev->sdma.num_instances; i++) {
> > > > > -                     f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL));
> > > > > -                     f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, enable ? 0 : 1);
> > > > > -                     WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl);
> > > > > -             }
> > > > > +     for (i = 0; i < adev->sdma.num_instances; i++) {
> > > > > +             f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL));
> > > > > +             f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
> > > > > +                             AUTO_CTXSW_ENABLE, 0);
> > > > > +             WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), f32_cntl);
> > > > >       }
> > > > >  }
> > > > >
> > > > > +/**
> > > > > + * sdma_v5_2_halt - stop the async dma engines
> > > > > + *
> > > > > + * @adev: amdgpu_device pointer
> > > > > + *
> > > > > + * Halt the async dma engines.
> > > > > + */
> > > > > +static void sdma_v5_2_halt(struct amdgpu_device *adev)
> > > > > +{
> > > > > +     sdma_v5_2_gfx_stop(adev);
> > > > > +     sdma_v5_2_rlc_stop(adev);
> > > > > +}
> > > > > +
> > > > >  /**
> > > > >   * sdma_v5_2_gfx_resume - setup and start the async dma engines
> > > > >   *
> > > > >   * @adev: amdgpu_device pointer
> > > > >   *
> > > > >   * Set up the gfx DMA ring buffers and enable them.
> > > > > + * It assumes that the dma engine is stopped for each instance.
> > > > > + * The function enables the engine and preemptions sequentially for each instance.
> > > > > + *
> > > > >   * Returns 0 for success, error for failure.
> > > > >   */
> > > > >  static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
> > > > > @@ -737,10 +750,7 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
> > > > >
> > > > >               ring->sched.ready = true;
> > > > >
> > > > > -             if (amdgpu_sriov_vf(adev)) { /* bare-metal sequence doesn't need below to lines */
> > > > > -                     sdma_v5_2_ctx_switch_enable(adev, true);
> > > > > -                     sdma_v5_2_enable(adev, true);
> > > > > -             }
> > > > > +             sdma_v5_2_ctx_switch_enable_for_instance(adev, i);
> > > > >
> > > > >               r = amdgpu_ring_test_ring(ring);
> > > > >               if (r) {
> > > > > @@ -784,7 +794,7 @@ static int sdma_v5_2_load_microcode(struct amdgpu_device *adev)
> > > > >       int i, j;
> > > > >
> > > > >       /* halt the MEs */
> > > > > -     sdma_v5_2_enable(adev, false);
> > > > > +     sdma_v5_2_halt(adev);
> > > > >
> > > > >       for (i = 0; i < adev->sdma.num_instances; i++) {
> > > > >               if (!adev->sdma.instance[i].fw)
> > > > > @@ -856,8 +866,8 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
> > > > >       int r = 0;
> > > > >
> > > > >       if (amdgpu_sriov_vf(adev)) {
> > > > > -             sdma_v5_2_ctx_switch_enable(adev, false);
> > > > > -             sdma_v5_2_enable(adev, false);
> > > > > +             sdma_v5_2_ctx_switch_disable_all(adev);
> > > > > +             sdma_v5_2_halt(adev);
> > > > >
> > > > >               /* set RB registers */
> > > > >               r = sdma_v5_2_gfx_resume(adev);
> > > > > @@ -865,6 +875,9 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
> > > > >       }
> > > > >
> > > > >       if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
> > > > > +             sdma_v5_2_ctx_switch_disable_all(adev);
> > > > > +             sdma_v5_2_halt(adev);
> > > >
> > > > You don't need call these functions here.
> > > >
> > > > In original code, you can see sdma_v5_2_load_microcode() calls
> > > > sdma_v5_2_enable(adev, false) to halt the engine before load ucode.
> > > >
> > > > What I mean is sdma_v5_2_halt(adev) should have no functional change with
> > > > original sdma_v5_2_enable(adev, false), like this,
> > > >
> > > > static void sdma_v5_2_halt(struct amdgpu_device *adev) {
> > > >         u32 f32_cntl;
> > > >         int i;
> > > >
> > > >         sdma_v5_2_gfx_stop(adev);
> > > >         sdma_v5_2_rlc_stop(adev);
> > > >
> > > >         if (!amdgpu_sriov_vf(adev)) {
> > > >                 for (i = 0; i < adev->sdma.num_instances; i++) {
> > > >                         f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL));
> > > >                         f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1);
> > > >                         WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl);
> > > >                 }
> > > >         }
> > > > }
> > > >
> > > > But you remove the codes to set the HLAT bit of register mmSDMA0_F32_CNTL.
> > > >
> > > > Regards,
> > > > Lang
> > > >
> > > > >               r = sdma_v5_2_load_microcode(adev);
> > > > >               if (r)
> > > > >                       return r;
> > > > > @@ -881,12 +894,10 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
> > > > >               amdgpu_gfx_off_ctrl(adev, false);
> > > > >
> > > > >       sdma_v5_2_soft_reset(adev);
> > > > > -     /* unhalt the MEs */
> > > > > -     sdma_v5_2_enable(adev, true);
> > > > > -     /* enable sdma ring preemption */
> > > > > -     sdma_v5_2_ctx_switch_enable(adev, true);
> > > > >
> > > > > -     /* start the gfx rings and rlc compute queues */
> > > > > +     /* Soft reset supposes to disable the dma engine and preemption.
> > > > > +      * Now start the gfx rings and rlc compute queues.
> > > > > +      */
> > > > >       r = sdma_v5_2_gfx_resume(adev);
> > > > >       if (adev->in_s0ix)
> > > > >               amdgpu_gfx_off_ctrl(adev, true);
> > > > > @@ -1340,8 +1351,8 @@ static int sdma_v5_2_hw_fini(void *handle)
> > > > >       if (amdgpu_sriov_vf(adev))
> > > > >               return 0;
> > > > >
> > > > > -     sdma_v5_2_ctx_switch_enable(adev, false);
> > > > > -     sdma_v5_2_enable(adev, false);
> > > > > +     sdma_v5_2_ctx_switch_disable_all(adev);
> > > > > +     sdma_v5_2_halt(adev);
> > > > >
> > > > >       return 0;
> > > > >  }
> > > > > --
> > > > > 2.25.1
> > > > >

next prev parent reply	other threads:[~2022-04-29  8:32 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-28  9:53 [PATCH v4] drm/amdgpu: Ensure the DMA engine is deactivated during set ups ricetons
2022-04-28 10:27 ` Lang Yu
2022-04-28 13:26   ` Haohui Mai
2022-04-28 14:05     ` Lang Yu
2022-04-29  4:51       ` Haohui Mai
2022-04-29  8:32         ` Lang Yu [this message]
2022-04-29  9:02           ` Haohui Mai
2022-04-29 13:18             ` Lang Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ymuin6/LGeiz4SE9@lang-desktop \
    --to=lang.yu@amd.com \
    --cc=Guchun.Chen@amd.com \
    --cc=Hawking.Zhang@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=ckoenig.leichtzumerken@gmail.com \
    --cc=ricetons@gmail.com \
    --cc=yifan1.zhang@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.