AMD-GFX Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: "Marek Olšák" <maraeo@gmail.com>, "Alex Deucher" <alexdeucher@gmail.com>
Cc: "Christian König" <christian.koenig@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO
Date: Mon, 16 Jan 2023 12:31:00 +0100
Message-ID: <35c28517-cafc-168f-eee9-2c4cbc572814@amd.com>
In-Reply-To: <CAAxE2A5xivt_4PK2uEkVS_v08viJVJP9E39DfTb4VsVCvDMkTw@mail.gmail.com>


Mhm, that sucks. Could we have the automated builds check for padding 
in the UAPI data structures?

Christian.
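(For illustration, one way such a check could look -- a minimal sketch, not
existing CI; the expected size below is a placeholder, not the real sizeof.
A pahole run in CI could additionally grep the struct dump for "XXX ... hole"
markers to catch interior padding before it becomes ABI:)

    #include <linux/build_bug.h>
    #include <drm/amdgpu_drm.h>

    /* Fails the build whenever the UAPI struct layout drifts.
     * 352 is a placeholder -- use the currently expected size. */
    static_assert(sizeof(struct drm_amdgpu_info_device) == 352,
                  "drm_amdgpu_info_device UAPI layout changed");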

On 14.01.23 at 00:33, Marek Olšák wrote:
> There is no hole on 32-bit unfortunately. It looks like the hole on 
> 64-bit is now ABI.
>
> I moved the field to replace _pad1. The patch is attached (with your Rb).
>
> Marek
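(For reference, the shape of the change Marek describes -- a sketch only; the
surrounding fields are elided and only the _pad1 reuse is taken from the mail:)

    struct drm_amdgpu_info_device {
            ...
            /* was: __u32 _pad1; -- reusing the padding slot keeps every
             * later field at the same offset on 32-bit and 64-bit */
            __u32 pcie_gen;
            ...
    };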
>
> On Fri, Jan 13, 2023 at 4:20 PM Alex Deucher <alexdeucher@gmail.com> 
> wrote:
>
>     On Fri, Jan 13, 2023 at 4:02 PM Marek Olšák <maraeo@gmail.com> wrote:
>     >
>     > I've added the comments and indeed pahole shows the hole as
>     expected.
>
>     What about on 32-bit?
>
>     Alex
>
>     >
>     > Marek
>     >
>     > On Thu, Jan 12, 2023 at 11:44 AM Alex Deucher
>     <alexdeucher@gmail.com> wrote:
>     >>
>     >> On Thu, Jan 12, 2023 at 6:50 AM Christian König
>     >> <christian.koenig@amd.com> wrote:
>     >> >
>     >> > On 11.01.23 at 21:48, Alex Deucher wrote:
>     >> > > On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák
>     <maraeo@gmail.com> wrote:
>     >> > >> Yes, it's meant to be like a spec sheet. We are not
>     interested in the current bandwidth utilization.
>     >> > > After chatting with Marek on IRC and thinking about this
>     more, I think
>     >> > > this patch is fine.  It's not really meant for bandwidth
>     per se, but
>     >> > > rather as a limit to determine what the driver should do in
>     certain
>     >> > > cases (i.e., when does it make sense to copy to vram vs
>     not).  It's
>     >> > > not straightforward for userspace to parse the full topology to
>     >> > > determine what links may be slow.  I guess one potential
>     pitfall would
>     >> > > be that if you pass the device into a VM, the driver may
>     report the
>     >> > > wrong values.  Generally in a VM the VM doesn't get the
>     full view up
>     >> > > to the root port.  I don't know if the hypervisors report
>     properly for
>     >> > > pcie_bandwidth_available() in a VM or if it just shows the
>     info about
>     >> > > the endpoint in the VM.
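(The helper Alex refers to is a real kernel API; a minimal usage sketch,
assuming some struct pci_dev *pdev for the endpoint:)

    #include <linux/pci.h>

    enum pci_bus_speed speed;
    enum pcie_link_width width;
    /* Walks each link from pdev up toward the root complex and returns
     * the bandwidth (Mb/s) of the slowest one, plus its speed/width.
     * Inside a VM it can only walk whatever topology the hypervisor
     * exposes, which is exactly the concern raised above. */
    u32 bw = pcie_bandwidth_available(pdev, NULL, &speed, &width);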
>     >> >
>     >> > So this basically doesn't return the gen and lanes of the
>     device, but
>     >> > rather what was negotiated between the device and the
>     upstream root port?
>     >>
>     >> Correct. It exposes the max gen and lanes of the slowest link
>     between
>     >> the device and the root port.
>     >>
>     >> >
>     >> > If I got that correctly then we should probably document that,
>     >> > because otherwise somebody will try to "fix" it at some point.
>     >>
>     >> Good point.
>     >>
>     >> Alex
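(A sketch of the kind of comment being suggested -- the wording is
illustrative, only the field names come from the patch under discussion:)

    /* PCIe gen of the slowest link negotiated between the device and
     * the root port, NOT the endpoint's own capability. This is
     * deliberate; don't "fix" it to report the device cap. */
    __u32 pcie_gen;
    /* Lane count of that same slowest link. */
    __u32 pcie_num_lanes;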
>     >>
>     >> >
>     >> > Christian.
>     >> >
>     >> > >
>     >> > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>     >> > >
>     >> > > Alex
>     >> > >
>     >> > >> Marek
>     >> > >>
>     >> > >> On Wed, Jan 4, 2023 at 10:33 AM Lazar, Lijo
>     <Lijo.Lazar@amd.com> wrote:
>     >> > >>>
>     >> > >>> To clarify, with DPM in place, the current bandwidth will
>     be changing based on the load.
>     >> > >>>
>     >> > >>> If apps/UMDs already have a way to know the current bandwidth
>     utilisation, then the possible maximum could also be part of the same
>     API. Otherwise, this only looks like duplicate information. We have the
>     same information in the sysfs DPM nodes.
>     >> > >>>
>     >> > >>> BTW, I don't know to what extent app/umd really makes use
>     of this. Take that memory frequency as an example (I'm reading it
>     as 16GHz). It only looks like a spec sheet.
>     >> > >>>
>     >> > >>> Thanks,
>     >> > >>> Lijo
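(The sysfs DPM node Lijo presumably means is pp_dpm_pcie; a hedged example of
reading it -- the exact levels shown are illustrative:)

    $ cat /sys/class/drm/card0/device/pp_dpm_pcie
    0: 2.5GT/s, x8
    1: 16.0GT/s, x16 *

The '*' marks the currently selected PCIe DPM level, which moves with load.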
>     >> > >>> ________________________________
>     >> > >>> From: Marek Olšák <maraeo@gmail.com>
>     >> > >>> Sent: Wednesday, January 4, 2023 8:40:00 PM
>     >> > >>> To: Lazar, Lijo <Lijo.Lazar@amd.com>
>     >> > >>> Cc: amd-gfx@lists.freedesktop.org
>     <amd-gfx@lists.freedesktop.org>
>     >> > >>> Subject: Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen
>     and lanes from the INFO
>     >> > >>>
>     >> > >>> On Wed, Jan 4, 2023 at 9:19 AM Lazar, Lijo
>     <lijo.lazar@amd.com> wrote:
>     >> > >>>
>     >> > >>>
>     >> > >>>
>     >> > >>> On 1/4/2023 7:43 PM, Marek Olšák wrote:
>     >> > >>>> On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo
>     <lijo.lazar@amd.com
>     >> > >>>> <mailto:lijo.lazar@amd.com>> wrote:
>     >> > >>>>
>     >> > >>>>
>     >> > >>>>
>     >> > >>>>      On 1/4/2023 4:11 AM, Marek Olšák wrote:
>     >> > >>>>       > I see. Well, those sysfs files are not usable,
>     and I don't think it
>     >> > >>>>       > would be important even if they were usable, but
>     for completeness:
>     >> > >>>>       >
>     >> > >>>>       > The ioctl returns:
>     >> > >>>>       >      pcie_gen = 1
>     >> > >>>>       > pcie_num_lanes = 16
>     >> > >>>>       >
>     >> > >>>>       > Theoretical bandwidth from those values: 4.0 GB/s
>     >> > >>>>       > My DMA test shows this write bandwidth: 3.5 GB/s
>     >> > >>>>       > It matches the expectation.
>     >> > >>>>       >
>     >> > >>>>       > Let's see the devices (there is only 1 GPU
>     Navi21 in the system):
>     >> > >>>>       > $ lspci |egrep '(PCI|VGA).*Navi'
>     >> > >>>>       > 0a:00.0 PCI bridge: Advanced Micro Devices, Inc.
>     [AMD/ATI] Navi
>     >> > >>>>      10 XL
>     >> > >>>>       > Upstream Port of PCI Express Switch (rev c3)
>     >> > >>>>       > 0b:00.0 PCI bridge: Advanced Micro Devices, Inc.
>     [AMD/ATI] Navi
>     >> > >>>>      10 XL
>     >> > >>>>       > Downstream Port of PCI Express Switch
>     >> > >>>>       > 0c:00.0 VGA compatible controller: Advanced
>     Micro Devices, Inc.
>     >> > >>>>       > [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900
>     XT] (rev c3)
>     >> > >>>>       >
>     >> > >>>>       > Let's read sysfs:
>     >> > >>>>       >
>     >> > >>>>       > $ cat
>     /sys/bus/pci/devices/0000:0a:00.0/current_link_width
>     >> > >>>>       > 16
>     >> > >>>>       > $ cat
>     /sys/bus/pci/devices/0000:0b:00.0/current_link_width
>     >> > >>>>       > 16
>     >> > >>>>       > $ cat
>     /sys/bus/pci/devices/0000:0c:00.0/current_link_width
>     >> > >>>>       > 16
>     >> > >>>>       > $ cat
>     /sys/bus/pci/devices/0000:0a:00.0/current_link_speed
>     >> > >>>>       > 2.5 GT/s PCIe
>     >> > >>>>       > $ cat
>     /sys/bus/pci/devices/0000:0b:00.0/current_link_speed
>     >> > >>>>       > 16.0 GT/s PCIe
>     >> > >>>>       > $ cat
>     /sys/bus/pci/devices/0000:0c:00.0/current_link_speed
>     >> > >>>>       > 16.0 GT/s PCIe
>     >> > >>>>       >
>     >> > >>>>       > Problem 1: None of the speed numbers match 4 GB/s.
>     >> > >>>>
>     >> > >>>>      US bridge = 2.5GT/s means operating at PCIe Gen 1
>     speed. Total
>     >> > >>>>      theoretical bandwidth is then derived based on
>     encoding and total
>     >> > >>>>      number
>     >> > >>>>      of lanes.
>     >> > >>>>
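(Worked out for the numbers in this thread: Gen1 is 2.5 GT/s per lane with
8b/10b encoding, so:)

    2.5 GT/s x 8/10     = 2.0 Gbit/s = 250 MB/s per lane
    250 MB/s x 16 lanes = 4.0 GB/s theoretical

which matches both the 4.0 GB/s derived from the ioctl values and the
measured 3.5 GB/s write bandwidth above.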
>     >> > >>>>       > Problem 2: Userspace doesn't know the bus index
>     of the bridges,
>     >> > >>>>      and it's
>     >> > >>>>       > not clear which bridge should be used.
>     >> > >>>>
>     >> > >>>>      In general, modern ones have this architecture: US->DS->EP.
>     >> > >>>>      The US (upstream) bridge is the one connected to the
>     >> > >>>>      physical link.
>     >> > >>>>
>     >> > >>>>       > Problem 3: The PCIe gen number is missing.
>     >> > >>>>
>     >> > >>>>      Current link speed is based on whether it's
>     Gen1/2/3/4/5.
>     >> > >>>>
>     >> > >>>>      BTW, your patch makes use of capability flags, which give
>     the maximum speed/width supported by the device. That may not
>     necessarily reflect the current negotiated speed/width. I guess in NV,
>     this info is already obtained from PMFW and made available through the
>     metrics table.
>     >> > >>>>
>     >> > >>>>
>     >> > >>>> It computes the minimum of the device PCIe gen and the
>     motherboard/slot
>     >> > >>>> PCIe gen to get the final value. These 2 lines do that.
>     The low 16 bits
>     >> > >>>> of the mask contain the device PCIe gen mask. The high
>     16 bits of the
>     >> > >>>> mask contain the slot PCIe gen mask.
>     >> > >>>> + pcie_gen_mask = adev->pm.pcie_gen_mask & (adev->pm.pcie_gen_mask >> 16);
>     >> > >>>> + dev_info->pcie_gen = fls(pcie_gen_mask);
>     >> > >>>>
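(A worked example with made-up mask values, following the layout described
above -- gen1 in bit 0, one bit per gen, slot mask in the high 16 bits:)

    adev->pm.pcie_gen_mask        = 0x0007000f  (slot: gen1-3, device: gen1-4)
    adev->pm.pcie_gen_mask >> 16  = 0x00000007
    ANDed together                = 0x00000007  (gen1-3 usable end to end)
    fls(0x00000007)               = 3           -> dev_info->pcie_gen = 3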
>     >> > >>> With DPM in place on some ASICs, how much does this
>     static info help for
>     >> > >>> upper level apps?
>     >> > >>>
>     >> > >>>
>     >> > >>> It helps UMDs make better decisions if they know the
>     maximum achievable bandwidth. UMDs also compute the maximum memory
>     bandwidth and compute performance (FLOPS). Right now it's printed
>     by Mesa to give users detailed information about their GPU. For
>     example:
>     >> > >>>
>     >> > >>> $ AMD_DEBUG=info glxgears
>     >> > >>> Device info:
>     >> > >>>      name = NAVI21
>     >> > >>>      marketing_name = AMD Radeon RX 6800
>     >> > >>>      num_se = 3
>     >> > >>>      num_rb = 12
>     >> > >>>      num_cu = 60
>     >> > >>>      max_gpu_freq = 2475 MHz
>     >> > >>>      max_gflops = 19008 GFLOPS
>     >> > >>>      l0_cache_size = 16 KB
>     >> > >>>      l1_cache_size = 128 KB
>     >> > >>>      l2_cache_size = 4096 KB
>     >> > >>>      l3_cache_size = 128 MB
>     >> > >>>      memory_channels = 16 (TCC blocks)
>     >> > >>>      memory_size = 16 GB (16384 MB)
>     >> > >>>      memory_freq = 16 GHz
>     >> > >>>      memory_bus_width = 256 bits
>     >> > >>>      memory_bandwidth = 512 GB/s
>     >> > >>>      pcie_gen = 1
>     >> > >>>      pcie_num_lanes = 16
>     >> > >>>      pcie_bandwidth = 4.0 GB/s
>     >> > >>>
>     >> > >>> Marek
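(For reference, the spec-sheet numbers above are self-consistent, assuming
the usual RDNA2 figures of 64 lanes per CU and 2 FLOPs per lane per clock
for FMA:)

    max_gflops       = 60 CUs x 64 x 2 x 2.475 GHz = 19008 GFLOPS
    memory_bandwidth = 16 GHz x 256 bits / 8       = 512 GB/s
    pcie_bandwidth   = 250 MB/s x 16 lanes (Gen1)  = 4.0 GB/s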
>     >> >
>



Thread overview: 24+ messages
2022-12-30 22:07 [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO Marek Olšák
2023-01-02 15:54 ` Christian König
2023-01-02 17:55   ` Marek Olšák
2023-01-03  8:31     ` Christian König
2023-01-03 22:41       ` Marek Olšák
2023-01-04 11:50         ` Lazar, Lijo
2023-01-04 14:13           ` Marek Olšák
2023-01-04 14:18             ` Lazar, Lijo
2023-01-04 15:10               ` Marek Olšák
2023-01-04 15:33                 ` Lazar, Lijo
2023-01-04 20:17                   ` Marek Olšák
2023-01-11 20:48                     ` Alex Deucher
2023-01-11 20:50                       ` Alex Deucher
2023-01-12  2:39                         ` Marek Olšák
2023-01-12 11:46                         ` Christian König
2023-01-12 11:50                       ` Christian König
2023-01-12 16:43                         ` Alex Deucher
2023-01-13 21:01                           ` Marek Olšák
2023-01-13 21:20                             ` Alex Deucher
2023-01-13 23:33                               ` Marek Olšák
2023-01-13 23:38                                 ` Marek Olšák
2023-01-16 11:31                                 ` Christian König [this message]
2023-01-17 19:21                                 ` Alex Deucher
2023-01-04 14:15           ` Alex Deucher
