All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Dongli Zhang <dongli.zhang@oracle.com>
Cc: qemu-devel@nongnu.org, marcel.apfelbaum@gmail.com,
	Jason Wang <jasowang@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/1] msix: correct pba size calculation used for msix_exclusive_bar initialization
Date: Mon, 14 Jan 2019 18:14:13 -0500	[thread overview]
Message-ID: <20190114175247-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <a8026b40-9556-c9c1-1bbc-e2991d6473cb@oracle.com>

On Mon, Dec 17, 2018 at 11:15:46AM +0800, Dongli Zhang wrote:
> Hi Michael,
> 
> CCed Jason as this is related to commit a0ccd2123ee2 "pci: remove hard-coded bar
> size in msix_init_exclusive_bar()"
> 
> Please see my reply inline.
> 
> On 12/17/2018 10:24 AM, Michael S. Tsirkin wrote:
> > On Mon, Dec 17, 2018 at 07:34:39AM +0800, Dongli Zhang wrote:
> >> The bar_pba_size is more than what the pba is expected to have, although
> >> this usually would not affect the bar_size used for dev->msix_exclusive_bar
> >> initialization.
> >>
> >> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> > 
> > 
> > If this does ever have an effect, we need a compat config
> > for old machine types.
> > Could you explain a bit more? Are there configs affected?
> > What are these?
> 
> Here I copy parts of msix_init_exclusive_bar() and msix_init().
> 
> 'bar_pba_size' is computed (line 348) in msix_init_exclusive_bar(), while
> 'pba_size' is computed (line 291) in msix_init().
> 
> msix_init_exclusive_bar() is one of callers of msix_init().
> 
> I think both 'bar_pba_size' and 'pba_size' are indicating the size of pba. The
> difference is, while 'bar_pba_size' is used to initialize the size of
> MemoryRegion (dev->msix_exclusive_bar), 'pba_size' is used to initialize the
> config space in msix_init().
> 
> However, the equations to compute the two variables are different. Why would we
> use different size for MemoryRegion and config space?

The math in msix_init_exclusive_bar allocates
too much memory in some cases.

For example consider nentries = 8.
msix_exclusive_bar will give us bar_pba_size = 16.
So 16 bytes.  However 8 bytes would be enough - this
is all that the spec requires.

So in practice bar_pba_size sometimes allocates an
extra 8 bytes but never more.

Since each MSIX entry size is 16 bytes, and since
we make sure that table+pba is a power of two,
this always leaves a multiple of 16 bytes for the PBA,
so extra 8 bytes have no effect.

I have merged your patch with the above explanation.

Thanks for the contribution!



> 
> Thanks to line 365, both equations would reach at the same bar_size.
> 
> For instance, assuming nvme with 1333 queues,
> 
> with "uint32_t bar_pba_size = (nentries / 8 + 1) * 8;", bar_pba_size=1336 (line
> 348) and bar_size=32768 (line 365).
> 
> with "uint32_t bar_pba_size = QEMU_ALIGN_UP(nentries, 64) / 8;",
> bar_pba_size=168 (line 348) and bar_size=32768 (line 365).
> 
> That's why I mentioned "this usually would not affect the bar_size used for
> dev->msix_exclusive_bar initialization."
> 
> 
> 
> 341 int msix_init_exclusive_bar(PCIDevice *dev, unsigned short nentries,
> 342                             uint8_t bar_nr, Error **errp)
> 343 {
> 344     int ret;
> 345     char *name;
> 346     uint32_t bar_size = 4096;
> 347     uint32_t bar_pba_offset = bar_size / 2;
> 348     uint32_t bar_pba_size = (nentries / 8 + 1) * 8;
> 349
> 350     /*
> 351      * Migration compatibility dictates that this remains a 4k
> 352      * BAR with the vector table in the lower half and PBA in
> 353      * the upper half for nentries which is lower or equal to 128.
> 354      * No need to care about using more than 65 entries for legacy
> 355      * machine types who has at most 64 queues.
> 356      */
> 357     if (nentries * PCI_MSIX_ENTRY_SIZE > bar_pba_offset) {
> 358         bar_pba_offset = nentries * PCI_MSIX_ENTRY_SIZE;
> 359     }
> 360
> 361     if (bar_pba_offset + bar_pba_size > 4096) {
> 362         bar_size = bar_pba_offset + bar_pba_size;
> 363     }
> 364
> 365     bar_size = pow2ceil(bar_size);
> 366
> 367     name = g_strdup_printf("%s-msix", dev->name);
> 368     memory_region_init(&dev->msix_exclusive_bar, OBJECT(dev), name, bar_size);
> 369     g_free(name);
> 370
> 371     ret = msix_init(dev, nentries, &dev->msix_exclusive_bar, bar_nr,
> 372                     0, &dev->msix_exclusive_bar,
> 373                     bar_nr, bar_pba_offset,
> 374                     0, errp);
> 375     if (ret) {
> 376         return ret;
> 377     }
> 378
> 379     pci_register_bar(dev, bar_nr, PCI_BASE_ADDRESS_SPACE_MEMORY,
> 380                      &dev->msix_exclusive_bar);
> 381
> 382     return 0;
> 383 }
> 
> 
> 269 int msix_init(struct PCIDevice *dev, unsigned short nentries,
> 270               MemoryRegion *table_bar, uint8_t table_bar_nr,
> 271               unsigned table_offset, MemoryRegion *pba_bar,
> 272               uint8_t pba_bar_nr, unsigned pba_offset, uint8_t cap_pos,
> 273               Error **errp)
> 274 {
> 275     int cap;
> 276     unsigned table_size, pba_size;
> 277     uint8_t *config;
> 278
> 279     /* Nothing to do if MSI is not supported by interrupt controller */
> 280     if (!msi_nonbroken) {
> 281         error_setg(errp, "MSI-X is not supported by interrupt controller");
> 282         return -ENOTSUP;
> 283     }
> 284
> 285     if (nentries < 1 || nentries > PCI_MSIX_FLAGS_QSIZE + 1) {
> 286         error_setg(errp, "The number of MSI-X vectors is invalid");
> 287         return -EINVAL;
> 288     }
> 289
> 290     table_size = nentries * PCI_MSIX_ENTRY_SIZE;
> 291     pba_size = QEMU_ALIGN_UP(nentries, 64) / 8;
> 
> 
> Dongli Zhang
> 
> 
> > 
> > 
> >> ---
> >>  hw/pci/msix.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> >> index 702dac4..234c0a3 100644
> >> --- a/hw/pci/msix.c
> >> +++ b/hw/pci/msix.c
> >> @@ -345,7 +345,7 @@ int msix_init_exclusive_bar(PCIDevice *dev, unsigned short nentries,
> >>      char *name;
> >>      uint32_t bar_size = 4096;
> >>      uint32_t bar_pba_offset = bar_size / 2;
> >> -    uint32_t bar_pba_size = (nentries / 8 + 1) * 8;
> >> +    uint32_t bar_pba_size = QEMU_ALIGN_UP(nentries, 64) / 8;
> >>  
> >>      /*
> >>       * Migration compatibility dictates that this remains a 4k
> >> -- 
> >> 2.7.4

      reply	other threads:[~2019-01-14 23:14 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-16 23:34 [Qemu-devel] [PATCH 1/1] msix: correct pba size calculation used for msix_exclusive_bar initialization Dongli Zhang
2018-12-17  2:24 ` Michael S. Tsirkin
2018-12-17  3:15   ` Dongli Zhang
2019-01-14 23:14     ` Michael S. Tsirkin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190114175247-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=dongli.zhang@oracle.com \
    --cc=jasowang@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.