From: Jason Gunthorpe <jgg@nvidia.com>
To: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: Ankit Soni <Ankit.Soni@amd.com>, Jonathan Corbet <corbet@lwn.net>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	Justin Stitt <justinstitt@google.com>,
	Kevin Tian <kevin.tian@intel.com>,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	llvm@lists.linux.dev, Bill Wendling <morbo@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Shuah Khan <shuah@kernel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Will Deacon <will@kernel.org>, Alexey Kardashevskiy <aik@amd.com>,
	James Gowans <jgowans@amazon.com>,
	Michael Roth <michael.roth@amd.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	patches@lists.linux.dev
Subject: Re: [PATCH v2 03/15] iommupt: Add the basic structure of the iommu implementation
Date: Thu, 15 May 2025 16:32:59 -0300
Message-ID: <20250515193259.GB613512@nvidia.com>
In-Reply-To: <2459f14b-4f4e-47b4-8f79-6af784ef6686@oracle.com>

On Wed, May 14, 2025 at 04:08:09PM -0400, Alejandro Jimenez wrote:
> 
> 
> On 5/14/25 11:54 AM, Jason Gunthorpe wrote:
> > On Wed, May 14, 2025 at 09:23:49AM +0000, Ankit Soni wrote:
> > > I am experiencing a system hang with a 5-level v2 page table mode, on boot.
> > > The NVMe boot drive is not initializing.
> > > Below are the relevant dmesg logs with some prints I had added:
> > > 
> > > [    6.386439] AMD-Vi v2 domain init
> > > [    6.390132] AMD-Vi v2 pt init
> > > [    6.390133] AMD-Vi aperture end last va ffffffffffffff
> > > ...
> > > [   10.315372] AMD-Vi gen pt MAP PAGES iova ffffffffffffe000 paddr 19351b000
> > > ...
> > > [   72.171930] nvme nvme0: I/O tag 0 (0000) QID 0 timeout, disable controller
> > > [   72.179618] nvme nvme1: I/O tag 24 (0018) QID 0 timeout, disable controller
> > > [   72.197176] nvme nvme0: Identify Controller failed (-4)
> > > [   72.203063] nvme nvme1: Identify Controller failed (-4)
> > > [   72.209237] nvme 0000:05:00.0: probe with driver nvme failed with error -5
> > > [   72.209336] nvme 0000:44:00.0: probe with driver nvme failed with error -5
> > > ...
> > > Timed out waiting for the udev queue to be empty.
> > > 
> > > According to the dmesg logs above, the IOVA for the v2 page table appears
> > > incorrect and is not aligned with domain->geometry.aperture_end. Fixing this
> > > requires adding domain->geometry.force_aperture = true; at the
> > > appropriate location. Probably here!
> 
> Thank you for pointing out this issue and its cause. I originally tested on
> a host with SCSI storage, and after your report I tried but couldn't
> reproduce the hang on a Zen4 host with an nvme boot drive. I wanted to see
> if it was a pattern common to NVME, but I suppose it depends on the DMA mask
> chosen by the specific driver.

Yeah, that's a good point.

I've also been thinking that dma-iommu.c needs some updating here, as
allocating top-down like the log above shows completely defeats the
dynamic top optimization feature AMD has. IOVA ffff_ffff_ffff_e000
will immediately expand to a 6-level table.
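
To make the arithmetic concrete, here is a rough sketch of why that IOVA
needs six levels. It assumes 4 KiB leaf pages and 9 index bits per level;
the helper names and constants are made up for illustration and are not
taken from the iommupt code.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative assumptions: 4 KiB leaf pages, 9 index bits per level. */
#define PAGE_SHIFT_BITS 12u
#define BITS_PER_LEVEL  9u

/* Number of significant address bits (highest set bit position + 1). */
static unsigned int addr_bits(uint64_t iova)
{
	unsigned int bits = 0;

	while (iova) {
		bits++;
		iova >>= 1;
	}
	return bits;
}

/*
 * Fewest table levels whose index bits, plus the page offset,
 * cover every significant bit of the given IOVA.
 */
static unsigned int levels_needed(uint64_t iova)
{
	unsigned int bits = addr_bits(iova);
	unsigned int levels = 1;

	assert(BITS_PER_LEVEL > 0);
	while (PAGE_SHIFT_BITS + levels * BITS_PER_LEVEL < bits)
		levels++;
	return levels;
}
```

Under these assumptions five levels cover only 12 + 5*9 = 57 bits, so an
IOVA with bit 63 set, such as 0xffffffffffffe000, needs a sixth level,
while a low IOVA that a bottom-up allocator might hand out fits in far
fewer.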

I think dynamic top can be made to work with VT-d and RISC-V with some
effort.

Jason
