AMD-GFX Archive on lore.kernel.org
From: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
To: "Kim, Jonathan" <Jonathan.Kim@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Kuehling, Felix" <Felix.Kuehling@amd.com>,
	"Yao, Yiqing(James)" <YiQing.Yao@amd.com>
Subject: Re: [PATCH] drm/amdkfd: sever xgmi io link if host driver has disable sharing
Date: Mon, 21 Oct 2024 13:35:11 -0400
Message-ID: <4a4be2d3-da50-4b60-abe6-2be01867aa5d@amd.com>
In-Reply-To: <CY8PR12MB7435752CDBFDEF81C128F44D85462@CY8PR12MB7435.namprd12.prod.outlook.com>

Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>

On 2024-10-16 15:08, Kim, Jonathan wrote:
> [Public]
> 
> Messed up James' email in Tested-by tag.  CC'ing James.
> 
>> -----Original Message-----
>> From: Kim, Jonathan <Jonathan.Kim@amd.com>
>> Sent: Wednesday, October 16, 2024 11:59 AM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Kasiviswanathan, Harish <Harish.Kasiviswanathan@amd.com>; Kuehling, Felix
>> <Felix.Kuehling@amd.com>; Kim, Jonathan <Jonathan.Kim@amd.com>; Kim,
>> Jonathan <Jonathan.Kim@amd.com>; James Yao <yiqing@yao.amd.com>
>> Subject: [PATCH] drm/amdkfd: sever xgmi io link if host driver has disable sharing
>>
>> From: Jonathan Kim <Jonathan.Kim@amd.com>
>>
>> Host drivers can create partial hives per guest by disabling xgmi sharing
>> between certain peers in the main hive.
>> Typically, these partial hives are fully connected per guest session.
>> In the event that the host makes a mistake by adding a non-shared node
>> to a guest session, have the KFD reflect sharing disabled by severing
>> the IO link.
>>
>> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
>> Tested-by: James Yao <yiqing@yao.amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 17 +++++++++++++++++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h |  2 ++
>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c    |  3 +++
>>  3 files changed, 22 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> index fcdbcff57632..1d50f327eb08 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> @@ -801,6 +801,23 @@ int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
>>       return  -EINVAL;
>>  }
>>
>> +bool amdgpu_xgmi_get_is_sharing_enabled(struct amdgpu_device *adev,
>> +                                     struct amdgpu_device *peer_adev)
>> +{
>> +     struct psp_xgmi_topology_info *top = &adev->psp.xgmi_context.top_info;
>> +     int i;
>> +
>> +     /* Sharing should always be enabled for non-SRIOV. */
>> +     if (!amdgpu_sriov_vf(adev))
>> +             return true;
>> +
>> +     for (i = 0 ; i < top->num_nodes; ++i)
>> +             if (top->nodes[i].node_id == peer_adev->gmc.xgmi.node_id)
>> +                     return !!top->nodes[i].is_sharing_enabled;
>> +
>> +     return false;
>> +}
>> +
>>  /*
>>   * Devices that support extended data require the entire hive to initialize with
>>   * the shared memory buffer flag set.
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> index 41d5f97fc77a..8cc7ab38db7c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> @@ -66,6 +66,8 @@ int amdgpu_xgmi_get_hops_count(struct amdgpu_device *adev,
>>               struct amdgpu_device *peer_adev);
>>  int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
>>               struct amdgpu_device *peer_adev);
>> +bool amdgpu_xgmi_get_is_sharing_enabled(struct amdgpu_device *adev,
>> +                                     struct amdgpu_device *peer_adev);
>>  uint64_t amdgpu_xgmi_get_relative_phy_addr(struct amdgpu_device *adev,
>>                                          uint64_t addr);
>>  static inline bool amdgpu_xgmi_same_hive(struct amdgpu_device *adev,
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> index 48caecf7e72e..723f1220e1cc 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> @@ -28,6 +28,7 @@
>>  #include "kfd_topology.h"
>>  #include "amdgpu.h"
>>  #include "amdgpu_amdkfd.h"
>> +#include "amdgpu_xgmi.h"
>>
>>  /* GPU Processor ID base for dGPUs for which VCRAT needs to be created.
>>   * GPU processor ID are expressed with Bit[31]=1.
>> @@ -2329,6 +2330,8 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>                               continue;
>>                       if (peer_dev->gpu->kfd->hive_id != kdev->kfd->hive_id)
>>                               continue;
>> +                     if (!amdgpu_xgmi_get_is_sharing_enabled(kdev->adev, peer_dev->gpu->adev))
>> +                             continue;
>>                       sub_type_hdr = (typeof(sub_type_hdr))(
>>                               (char *)sub_type_hdr +
>>                               sizeof(struct crat_subtype_iolink));
>> --
>> 2.34.1
> 
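
For anyone following the thread, the check added by the quoted patch can
be modelled outside the kernel. The sketch below is only an illustration
under stated assumptions: struct node_info, struct topology,
sharing_enabled() and count_io_links() are hypothetical stand-ins, not
the real psp_xgmi_topology_info layout or KFD code. It mirrors the logic
in the diff: a non-SRIOV device always reports sharing enabled, a VF
looks the peer up by node ID, and a peer missing from the topology is
treated as not shared, so no IO link is created for it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one topology entry; not the kernel layout. */
struct node_info {
	uint64_t node_id;
	uint8_t is_sharing_enabled;
};

/* Hypothetical stand-in for the per-device topology table. */
struct topology {
	uint32_t num_nodes;
	struct node_info nodes[8];
};

/*
 * Same decision order as the quoted amdgpu_xgmi_get_is_sharing_enabled():
 * bare-metal (non-VF) always shares; a VF must find the peer in its
 * topology table and honour that entry's flag; unknown peers are not shared.
 */
static bool sharing_enabled(const struct topology *top, bool is_vf,
			    uint64_t peer_node_id)
{
	uint32_t i;

	if (!is_vf)
		return true;

	for (i = 0; i < top->num_nodes; ++i)
		if (top->nodes[i].node_id == peer_node_id)
			return !!top->nodes[i].is_sharing_enabled;

	return false;
}

/*
 * Models the effect of the added "continue" in the IO link loop:
 * peers without sharing are skipped, so no link is counted for them.
 */
static int count_io_links(const struct topology *top, bool is_vf,
			  const uint64_t *peers, int num_peers)
{
	int i, links = 0;

	for (i = 0; i < num_peers; ++i) {
		if (!sharing_enabled(top, is_vf, peers[i]))
			continue;
		links++;
	}

	return links;
}
```

With a two-entry table where only one node has sharing enabled, a VF
would count a single link across three candidate peers, while a
non-SRIOV device would count all three.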
