public inbox for linux-doc@vger.kernel.org
* [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
@ 2026-03-23  6:52 Lukas Wunner
  2026-03-23 11:03 ` Mika Westerberg
  2026-03-23 16:50 ` Bjorn Helgaas
  0 siblings, 2 replies; 8+ messages in thread
From: Lukas Wunner @ 2026-03-23  6:52 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jonathan Corbet, linux-pci, linux-doc, Mika Westerberg,
	Ilpo Jarvinen, Maciej Grochowski, Kai-Heng Feng

The prefix/header of the TLP that caused an error is recorded by the Root
Complex and emitted to the kernel log in raw hex format.  Document the
existence and usage of tlp-tool, which allows decoding the TLP Header
into human-readable form.

The TLP Header hints at the root cause of an error, yet is often ignored
because of its seeming opaqueness.  Instead, PCIe errors are frequently
worked around by a change in the kernel without fully understanding the
actual source of the problem.  With more documentation on available tools
we'll hopefully come up with better solutions.

There are also wireshark dissectors for TLPs, but it seems they expect a
complete TLP, not just the header, and they cannot grok the hex format
emitted by the kernel directly.  tlp-tool appears to be the most cut and
dried solution out there.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Maciej Grochowski <mx2pg@pm.me>
---
We could also go one step further and point users to this tool
in a printk_once() message when the first error occurs.
For now, just amending the documentation is probably sufficient.

 Documentation/PCI/pcieaer-howto.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
index 3210c47..90fdfdd 100644
--- a/Documentation/PCI/pcieaer-howto.rst
+++ b/Documentation/PCI/pcieaer-howto.rst
@@ -85,6 +85,16 @@ In the example, 'Requester ID' means the ID of the device that sent
 the error message to the Root Port. Please refer to PCIe specs for other
 fields.
 
+The 'TLP Header' is the prefix/header of the TLP that caused the error
+in raw hex format. To decode the TLP Header into human-readable form
+one may use tlp-tool:
+
+https://github.com/mmpg-x86/tlp-tool
+
+Example usage::
+
+  curl -L https://git.kernel.org/linus/2ca1c94ce0b6 | rtlp-tool --aer
+
 AER Ratelimits
 --------------
 
-- 
2.51.0



* Re: [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
  2026-03-23  6:52 [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages Lukas Wunner
@ 2026-03-23 11:03 ` Mika Westerberg
  2026-03-23 16:50 ` Bjorn Helgaas
  1 sibling, 0 replies; 8+ messages in thread
From: Mika Westerberg @ 2026-03-23 11:03 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Jonathan Corbet, linux-pci, linux-doc,
	Ilpo Jarvinen, Maciej Grochowski, Kai-Heng Feng

On Mon, Mar 23, 2026 at 07:52:39AM +0100, Lukas Wunner wrote:
> The prefix/header of the TLP that caused an error is recorded by the Root
> Complex and emitted to the kernel log in raw hex format.  Document the
> existence and usage of tlp-tool, which allows decoding the TLP Header
> into human-readable form.
> 
> The TLP Header hints at the root cause of an error, yet is often ignored
> because of its seeming opaqueness.  Instead, PCIe errors are frequently
> worked around by a change in the kernel without fully understanding the
> actual source of the problem.  With more documentation on available tools
> we'll hopefully come up with better solutions.
> 
> There are also wireshark dissectors for TLPs, but it seems they expect a
> complete TLP, not just the header, and they cannot grok the hex format
> emitted by the kernel directly.  tlp-tool appears to be the most cut and
> dried solution out there.
> 
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: Maciej Grochowski <mx2pg@pm.me>

Good idea, this is useful.

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


* Re: [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
  2026-03-23  6:52 [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages Lukas Wunner
  2026-03-23 11:03 ` Mika Westerberg
@ 2026-03-23 16:50 ` Bjorn Helgaas
  2026-03-24  5:53   ` mx2pg
  1 sibling, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2026-03-23 16:50 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Jonathan Corbet, linux-pci, linux-doc, Mika Westerberg,
	Ilpo Jarvinen, Maciej Grochowski, Kai-Heng Feng

On Mon, Mar 23, 2026 at 07:52:39AM +0100, Lukas Wunner wrote:
> The prefix/header of the TLP that caused an error is recorded by the Root
> Complex and emitted to the kernel log in raw hex format.  Document the
> existence and usage of tlp-tool, which allows decoding the TLP Header
> into human-readable form.
> 
> The TLP Header hints at the root cause of an error, yet is often ignored
> because of its seeming opaqueness.  Instead, PCIe errors are frequently
> worked around by a change in the kernel without fully understanding the
> actual source of the problem.  With more documentation on available tools
> we'll hopefully come up with better solutions.
> 
> There are also wireshark dissectors for TLPs, but it seems they expect a
> complete TLP, not just the header, and they cannot grok the hex format
> emitted by the kernel directly.  tlp-tool appears to be the most cut and
> dried solution out there.
> 
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: Maciej Grochowski <mx2pg@pm.me>

Applied to pci/for-linus for v7.0, thanks!

I tweaked the commit log to note that the Header Log is in the AER
Capability, which may be in any PCIe function.


* Re: [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
  2026-03-23 16:50 ` Bjorn Helgaas
@ 2026-03-24  5:53   ` mx2pg
  2026-03-24 10:09     ` Lukas Wunner
  2026-03-24 11:18     ` Ilpo Järvinen
  0 siblings, 2 replies; 8+ messages in thread
From: mx2pg @ 2026-03-24  5:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lukas Wunner, Jonathan Corbet, linux-pci, linux-doc,
	Mika Westerberg, Ilpo Jarvinen, Kai-Heng Feng

  One thing worth calling out: starting with PCIe 6.0, Flit Mode is
  mandatory at 64.0 GT/s and supported at all PCIe link speeds, so a
  Flit-capable PCIe 6.x link may operate below 64.0 GT/s and still be
  in Flit Mode.  The raw TLP Header bytes do not encode the framing:
  the same four bytes decode to entirely different packet types in
  non-Flit vs Flit framing.  The negotiated mode can be read from the
  Flit Mode Status bit in Link Status 2, or via lspci -vv on a recent
  pciutils build.
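
To illustrate why the framing matters, here is a minimal sketch (not
tlp-tool's code) of how byte 0 of the first logged dword is split under
each framing.  It assumes header byte 0 sits in bits 31:24 of the dword
as printed by the kernel; the function name and the sample value are
hypothetical.

```python
def split_first_dword(dw0: int) -> dict:
    """Show how header byte 0 is interpreted under each framing."""
    # Assumption: header byte 0 is the most significant byte of the dword.
    byte0 = (dw0 >> 24) & 0xFF
    return {
        # Non-Flit framing: Fmt[2:0] in bits 7:5, Type[4:0] in bits 4:0.
        "non_flit": {"fmt": byte0 >> 5, "type": byte0 & 0x1F},
        # Flit framing: all of byte 0 is one 8-bit Type field.
        "flit": {"type": byte0},
    }


# Hypothetical first dword, used only for illustration.
fields = split_first_dword(0x40000001)
print(fields)
```

In non-Flit framing, Fmt 0b010 with Type 0b00000 is a 32-bit Memory
Write.  In Flit framing the same byte is looked up as a single 8-bit
Type code, and the meaning has to come from the Flit Mode encoding table
in the spec.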
                                                                                                                                                                                              
  tlp-tool defaults to non-Flit, which is correct for the vast majority
  of hardware deployed today.  That will change: as PCIe 6.x adoption
  grows, a significant share of TLP debugging will involve Flit Mode
  links, and this is already a concern among switch and device vendors
  working through the transition.  Users on Flit Mode links must pass
  --flit:

    # non-Flit link (default, most common today)
    curl -L https://git.kernel.org/linus/2ca1c94ce0b6 | rtlp-tool --aer

    # Flit Mode link
    curl -L https://git.kernel.org/linus/2ca1c94ce0b6 | rtlp-tool --aer --flit

  It may be worth a one-liner in the Documentation patch:

    For PCIe 6.x links with Flit Mode negotiated (check Flit Mode Status
    in Link Status 2, or lspci -vv), pass --flit to rtlp-tool.

  Maciej




* Re: [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
  2026-03-24  5:53   ` mx2pg
@ 2026-03-24 10:09     ` Lukas Wunner
  2026-03-24 10:22       ` Lukas Wunner
  2026-03-24 11:18     ` Ilpo Järvinen
  1 sibling, 1 reply; 8+ messages in thread
From: Lukas Wunner @ 2026-03-24 10:09 UTC (permalink / raw)
  To: Maciej Grochowski
  Cc: Bjorn Helgaas, Jonathan Corbet, linux-pci, linux-doc,
	Mika Westerberg, Ilpo Jarvinen, Kai-Heng Feng

On Tue, Mar 24, 2026 at 05:53:30AM +0000, mx2pg@pm.me wrote:
> One thing worth calling out: starting with PCIe 6.0, Flit Mode is
> mandatory at 64.0 GT/s and supported at all PCIe link speeds, so a
> Flit-capable PCIe 6.x link may operate below 64.0 GT/s and still be
> in Flit Mode.  The raw TLP Header bytes do not encode the framing:
> the same four bytes decode to entirely different packet types in
> non-Flit vs Flit framing.  The negotiated mode can be read from the
> Flit Mode Status bit in Link Status 2, or via lspci -vv on a recent
> pciutils build.

Thanks Maciej for chiming in.

Ilpo (who is cc'ed) amended the kernel last year with commit
7e077e6707b3 ("PCI/ERR: Handle TLP Log in Flit mode") to suffix
the hexdump with " (Flit)" in Flit Mode:

https://git.kernel.org/linus/7e077e6707b3

Any chance you could amend tlp-tool to auto-detect Flit Mode
if the suffix is present?

As an aside:  Prior to that, commit f68ea779d98a ("PCI: Add
pcie_print_tlp_log() to print TLP Header and Prefix Log")
changed the log message to prefix each dword with "0x".
But I tested it and it seems that tlp-tool groks both the
old and the new format without any code change:

https://git.kernel.org/linus/f68ea779d98a

But we need to be careful going forward not to break user space
tooling that might rely on the syntax.
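
For illustration, a user space parser that tolerates both formats and
picks up the Flit marker could look roughly like the sketch below.  This
is not tlp-tool's code.  The function name and sample lines are
hypothetical, and the marker is searched for anywhere after
"TLP Header" rather than at an assumed fixed position.  End-End prefix
dwords, if printed on the same line, are not separated out here.

```python
import re

# One 32-bit dword, with or without the "0x" prefix added in v6.14.
DWORD_RE = re.compile(r"\b(?:0x)?([0-9a-fA-F]{8})\b")


def parse_tlp_header_line(line: str):
    """Return (dwords, is_flit) for a 'TLP Header' log line, else None."""
    _, found, rest = line.partition("TLP Header")
    if not found:
        return None
    # Only text after "TLP Header" is scanned, so a PCI address such as
    # 0000:00:1c.0 earlier on the line is never mistaken for a dword.
    is_flit = "(Flit)" in rest
    dwords = [int(h, 16) for h in DWORD_RE.findall(rest)]
    return dwords, is_flit


# Hypothetical sample lines; the hex values are made up.
old_style = "pcieport 0000:00:1c.0:   TLP Header: 40000001 0000000f f7d00000 00000000"
new_style = "pcieport 0000:00:1c.0:   TLP Header: 0x40000001 0x0000000f 0xf7d00000 0x00000000"

print(parse_tlp_header_line(old_style))
print(parse_tlp_header_line(new_style))
```

Both sample lines yield the same four dwords, which is the property any
future change to the log syntax would need to preserve for such parsers.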

Thanks,

Lukas


* Re: [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
  2026-03-24 10:09     ` Lukas Wunner
@ 2026-03-24 10:22       ` Lukas Wunner
  0 siblings, 0 replies; 8+ messages in thread
From: Lukas Wunner @ 2026-03-24 10:22 UTC (permalink / raw)
  To: Maciej Grochowski
  Cc: Bjorn Helgaas, Jonathan Corbet, linux-pci, linux-doc,
	Mika Westerberg, Ilpo Jarvinen, Kai-Heng Feng

On Tue, Mar 24, 2026 at 11:09:35AM +0100, Lukas Wunner wrote:
> Ilpo (who is cc'ed) amended the kernel last year with commit
> 7e077e6707b3 ("PCI/ERR: Handle TLP Log in Flit mode") to suffix
> the hexdump with " (Flit)" in Flit Mode:
> 
> https://git.kernel.org/linus/7e077e6707b3

Sorry, I should have mentioned that this commit first appeared in
v6.15, so auto-detection of Flit mode logs in user space tools
should be possible from that kernel version onward.

> As an aside:  Prior to that, commit f68ea779d98a ("PCI: Add
> pcie_print_tlp_log() to print TLP Header and Prefix Log")
> changed the log message to prefix each dword with "0x".
> But I tested it and it seems that tlp-tool groks both the
> old and the new format without any code change:
> 
> https://git.kernel.org/linus/f68ea779d98a

This landed in v6.14.

Thanks,

Lukas


* Re: [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
  2026-03-24  5:53   ` mx2pg
  2026-03-24 10:09     ` Lukas Wunner
@ 2026-03-24 11:18     ` Ilpo Järvinen
  2026-03-25  5:18       ` mx2pg
  1 sibling, 1 reply; 8+ messages in thread
From: Ilpo Järvinen @ 2026-03-24 11:18 UTC (permalink / raw)
  To: mx2pg
  Cc: Bjorn Helgaas, Lukas Wunner, Jonathan Corbet, linux-pci,
	linux-doc, Mika Westerberg, Kai-Heng Feng


On Tue, 24 Mar 2026, mx2pg@pm.me wrote:

>   One thing worth calling out: starting with PCIe 6.0, Flit Mode is
>   mandatory at 64.0 GT/s and supported at all PCIe link speeds, so a
>   Flit-capable PCIe 6.x link may operate below 64.0 GT/s and still be
>   in Flit Mode.  The raw TLP Header bytes do not encode the framing:
>   the same four bytes decode to entirely different packet types in
>   non-Flit vs Flit framing.  The negotiated mode can be read from the
>   Flit Mode Status bit in Link Status 2, or via lspci -vv on a recent
>   pciutils build.

There's one caveat in using the Link Status 2 Flit Mode Status bit: it can
only be used as the indicator when the Link is Up, which may matter in
troubleshooting scenarios.

The kernel code tries to hide that by indicating the Flit mode explicitly
in the log message it prints out.

Sadly, TLP Logging on the DPC side was botched in the PCIe spec, so it
doesn't indicate the Flit/non-Flit mode information explicitly (in contrast
to AER, which has a flag that tells in which mode the TLP Log was captured).
To work around that limitation, the kernel has to save the Link Status 2
contents and hope the information is not stale when DPC has brought the
Link Down (it seems relatively likely to remain valid, but it's still a
fundamentally racy way to get the Flit/non-Flit information).
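
A minimal sketch of the Link Up caveat, for tooling that reads the
registers itself.  The mask names and values are quoted from memory of
the Linux pci_regs.h header and should be verified there.  Using Data
Link Layer Link Active as the "Link is Up" test is an assumption that
only holds on ports that support reporting it, and the function name is
hypothetical.

```python
# Assumed to match include/uapi/linux/pci_regs.h; verify before relying on them.
PCI_EXP_LNKSTA_DLLLA = 0x2000   # Link Status: Data Link Layer Link Active
PCI_EXP_LNKSTA2_FLIT = 0x0400   # Link Status 2: Flit Mode Status


def flit_mode_from_link_status(lnksta: int, lnksta2: int):
    """Return True/False for Flit Mode, or None if the link is not up."""
    if not lnksta & PCI_EXP_LNKSTA_DLLLA:
        # With the link down, the Flit Mode Status bit is not a usable
        # indicator, so report "unknown" instead of guessing.
        return None
    return bool(lnksta2 & PCI_EXP_LNKSTA2_FLIT)


# Hypothetical register values for illustration.
print(flit_mode_from_link_status(0x2000, 0x0400))
print(flit_mode_from_link_status(0x0000, 0x0400))
```

Returning None rather than False keeps "link down, mode unknown" apart
from "link up in non-Flit mode", which is the distinction at issue here.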

-- 
 i.


* Re: [PATCH] Documentation: PCI: Document decoding of TLP Header in AER messages
  2026-03-24 11:18     ` Ilpo Järvinen
@ 2026-03-25  5:18       ` mx2pg
  0 siblings, 0 replies; 8+ messages in thread
From: mx2pg @ 2026-03-25  5:18 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Bjorn Helgaas, Lukas Wunner, Jonathan Corbet, linux-pci,
	linux-doc, Mika Westerberg, Kai-Heng Feng

Thanks Lukas for the suggestion and Ilpo for the caveat on
Link Status 2.

I'll add auto-detection of the (Flit) suffix (7e077e6707b3,
v6.15+) to --aer mode so mixed flit / non-flit TLPs in the same
log are each parsed with the correct framing -- no --flit needed.
For --lspci I'll also pick up Flit+ from LnkSta2: per device.

The --flit flag stays as a global override for inputs without
auto-detection markers.  This will be a patch release (v0.5.1) --
fully backward compatible.
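
Roughly, the per-line selection described above could be sketched as
follows.  This is an illustration only, not the planned tlp-tool
implementation: the function name, the force_flit parameter (standing in
for a global --flit override) and the sample log lines are hypothetical.

```python
def choose_framing(lines, force_flit=False):
    """Yield (line, framing) for every 'TLP Header' line in a log."""
    for line in lines:
        if "TLP Header" not in line:
            continue  # skip unrelated kernel messages
        # A global override wins; otherwise rely on the kernel's marker,
        # which v6.15+ kernels add for logs captured in Flit Mode.
        flit = force_flit or "(Flit)" in line
        yield line, ("flit" if flit else "non-flit")


# Hypothetical mixed log; hex values are made up.
log = [
    "pcieport 0000:00:1c.0: AER: Uncorrectable (Non-Fatal) error",
    "pcieport 0000:00:1c.0:   TLP Header: 0x40000001 0x0000000f 0xf7d00000 0x00000000",
    "pcieport 0000:80:01.0:   TLP Header (Flit): 0x40000001 0x0000000f 0xf7d00000 0x00000000",
]

for line, framing in choose_framing(log):
    print(framing, "<-", line)
```

Older kernels never emit the marker, so the override remains the only
way to decode their Flit Mode logs correctly.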

Maciej



