public inbox for kvm@vger.kernel.org
From: Alex Williamson <alex.williamson@redhat.com>
To: Mitchell Augustin <mitchell.augustin@canonical.com>
Cc: linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: drivers/pci: (and/or KVM): Slow PCI initialization during VM boot with passthrough of large BAR Nvidia GPUs on DGX H100
Date: Tue, 3 Dec 2024 15:06:20 -0700
Message-ID: <20241203150620.15431c5c.alex.williamson@redhat.com>
In-Reply-To: <CAHTA-uZWGmoLr0R4L608xzvBAxnr7zQPMDbX0U4MTfN3BAsfTQ@mail.gmail.com>

On Tue, 3 Dec 2024 14:33:10 -0600
Mitchell Augustin <mitchell.augustin@canonical.com> wrote:

> Thanks.
> 
> I'm thinking about the cleanest way to accomplish this:
> 
> 1. I'm wondering if replacing the pci_info() calls with equivalent
> printk_deferred() calls might be sufficient here. This works in my
> initial test, but I'm not sure if this is definitive proof that we
> wouldn't have any issues in all deployments, or if my configuration is
> just not impacted by this kind of deadlock.

Just switching to printk_deferred() alone seems like wishful thinking,
but if you were also to wrap the code in console_{un}lock(), that might
be a possible low-impact solution.

> 2. I did also draft a patch that would just eliminate the redundancy
> and disable the impacted logs by default, and allow them to be
> re-enabled with a new kernel command line option
> "pci=bar_logging_enabled" (at the cost of the performance gains due to
> reduced redundancy). This works well in all of my tests.

I suspect Bjorn would prefer not to add yet another pci command line
option and as we've seen here, the logs are useful by default.
 
> Do you think either of those approaches would work / be appropriate?
> Ultimately I am trying to avoid messy changes that would require
> actually propagating all of the info needed for these logs back up to
> pci_read_bases(), if at all possible, since there seems like no
> obvious way to do that without changing the signature of
> __pci_read_base() or tracking additional state.

The calling convention of __pci_read_base() is already changing if
we're having the caller disable decoding and it doesn't have a lot of
callers, so I don't think I'd worry about changing the signature.

I think maybe another alternative that doesn't hold off the console
would be to split the BAR sizing and resource processing into separate
steps.  For example pci_read_bases() might pass arrays like:

        u32 bars[PCI_STD_NUM_BARS] = { 0 };
        u32 romsz = 0;

To a function like:

void __pci_read_bars(struct pci_dev *dev, u32 *bars, u32 *romsz,
                     int num_bars, int rom)
{
        u16 orig_cmd = 0;
        u32 tmp;
        int i;

        /* Disable decoding while the BARs temporarily hold sizing values */
        if (!dev->mmio_always_on) {
                pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
                if (orig_cmd & PCI_COMMAND_DECODE_ENABLE) {
                        pci_write_config_word(dev, PCI_COMMAND,
                                orig_cmd & ~PCI_COMMAND_DECODE_ENABLE);
                }
        }

        /* Size each BAR dword: save, write all 1s, read back, restore */
        for (i = 0; i < num_bars; i++) {
                unsigned int pos = PCI_BASE_ADDRESS_0 + (i << 2);

                pci_read_config_dword(dev, pos, &tmp);
                pci_write_config_dword(dev, pos, ~0);
                pci_read_config_dword(dev, pos, &bars[i]);
                pci_write_config_dword(dev, pos, tmp);
        }

        /* Same sequence for the expansion ROM BAR, if an offset was given */
        if (rom) {
                pci_read_config_dword(dev, rom, &tmp);
                pci_write_config_dword(dev, rom, PCI_ROM_ADDRESS_MASK);
                pci_read_config_dword(dev, rom, romsz);
                pci_write_config_dword(dev, rom, tmp);
        }

        /* Restore decoding only if we disabled it above */
        if (!dev->mmio_always_on && (orig_cmd & PCI_COMMAND_DECODE_ENABLE))
                pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
}

pci_read_bases() would then iterate in a similar way that it does now,
passing pointers to the stashed data to __pci_read_base(), which would
then only do the resource processing and could freely print.
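
As a rough userspace illustration of that two-step split (not kernel
code, and not a proposed patch): the sketch below models config space
with a made-up struct fake_dev, collects the raw sizing masks quietly in
one pass, and only decodes and prints afterwards.  All names here
(fake_dev, cfg_read(), cfg_write(), read_bars_raw(), bar_size32(),
process_bars()) are hypothetical, and only 32-bit memory BARs are
modeled; I/O and 64-bit BARs are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_BARS      6     /* mirrors PCI_STD_NUM_BARS */
#define MEM_FLAG_MASK 0xfu  /* low 4 bits of a memory BAR are flag bits */

/* Hypothetical stand-in for a device's BAR registers. */
struct fake_dev {
        uint32_t bar[NUM_BARS];       /* current register contents */
        uint32_t writable[NUM_BARS];  /* address bits the device implements */
        int decode_enabled;           /* models PCI_COMMAND decode bits */
};

static uint32_t cfg_read(const struct fake_dev *d, int i)
{
        return d->bar[i];
}

/* Only implemented address bits take the written value. */
static void cfg_write(struct fake_dev *d, int i, uint32_t v)
{
        d->bar[i] = (v & d->writable[i]) | (d->bar[i] & ~d->writable[i]);
}

/* Step 1: size every BAR with decode off; no printing in here. */
static void read_bars_raw(struct fake_dev *d, uint32_t *masks, int num_bars)
{
        int was_enabled = d->decode_enabled;
        int i;

        assert(num_bars <= NUM_BARS);
        d->decode_enabled = 0;

        for (i = 0; i < num_bars; i++) {
                uint32_t orig = cfg_read(d, i);

                cfg_write(d, i, ~0u);
                masks[i] = cfg_read(d, i);
                cfg_write(d, i, orig);
        }

        d->decode_enabled = was_enabled;
}

/* Decode a 32-bit memory BAR sizing mask; 0 means unimplemented. */
static uint64_t bar_size32(uint32_t mask)
{
        uint32_t addr_bits = mask & ~MEM_FLAG_MASK;

        if (!addr_bits)
                return 0;
        return (uint64_t)(~addr_bits) + 1;
}

/* Step 2: decode stashed masks; logging is safe since decode is restored. */
static int process_bars(const uint32_t *masks, int num_bars, uint64_t *sizes)
{
        int i, printed = 0;

        for (i = 0; i < num_bars; i++) {
                sizes[i] = bar_size32(masks[i]);
                if (sizes[i]) {
                        printf("BAR %d: [mem size 0x%llx]\n", i,
                               (unsigned long long)sizes[i]);
                        printed++;
                }
        }
        return printed;
}
```

A caller would invoke read_bars_raw() once per device and then hand the
stashed masks to process_bars(), so the window with decode disabled
contains only config accesses and no console output.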

To me that seems better than blocking the console... Maybe there are
other ideas on the list.  Thanks,

Alex



Thread overview: 19+ messages
2024-11-25 22:46 drivers/pci: (and/or KVM): Slow PCI initialization during VM boot with passthrough of large BAR Nvidia GPUs on DGX H100 Mitchell Augustin
2024-11-26 17:34 ` Alex Williamson
2024-11-26 22:18   ` Mitchell Augustin
2024-11-26 22:41     ` Alex Williamson
2024-11-26 23:08       ` Mitchell Augustin
2024-11-27  0:02         ` Alex Williamson
2024-11-27  1:12           ` Mitchell Augustin
2024-11-27 17:22             ` Alex Williamson
2024-12-02 19:36               ` Mitchell Augustin
2024-12-03 18:34                 ` Mitchell Augustin
2024-12-03 19:20                 ` Alex Williamson
2024-12-03 20:33                   ` Mitchell Augustin
2024-12-03 22:06                     ` Alex Williamson [this message]
2024-12-03 23:09                       ` Mitchell Augustin
2024-12-03 23:30                         ` Alex Williamson
2024-12-06  0:09                           ` Mitchell Augustin
2025-01-08 23:06                             ` Mitchell Augustin
2025-01-13 18:22                               ` Alex Williamson
2025-01-13 19:43                                 ` Mitchell Augustin
