From: Evgenii Shatokhin <e.shatokhin@yadro.com>
To: Serge Semin <fancer.lancer@gmail.com>
Cc: "Serge Semin" <Sergey.Semin@baikalelectronics.ru>,
"Robin Murphy" <robin.murphy@arm.com>,
kernel-team@android.com, "Vidya Sagar" <vidyas@nvidia.com>,
"Christoph Hellwig" <hch@infradead.org>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
"Gustavo Pimentel" <gustavo.pimentel@synopsys.com>,
"Jingoo Han" <jingoohan1@gmail.com>,
"Rob Herring" <robh@kernel.org>,
"Krzysztof Wilczyński" <kw@linux.com>,
"Bjorn Helgaas" <bhelgaas@google.com>,
linux@yadro.com, "Lorenzo Pieralisi" <lpieralisi@kernel.org>,
"Will McVicker" <willmcvicker@google.com>
Subject: Re: [PATCH v6 0/2] PCI: dwc: Add support for 64-bit MSI target addresses
Date: Thu, 9 Feb 2023 11:03:11 +0300 [thread overview]
Message-ID: <7f59568a-eebe-95c8-4aea-c01dd726efa7@yadro.com> (raw)
In-Reply-To: <20230209004837.n62af6wxgjj4kxt6@mobilestation>
On 09.02.2023 03:48, Serge Semin wrote:
> On Mon, Feb 06, 2023 at 02:27:41PM +0300, Evgenii Shatokhin wrote:
>> Hi Sergey,
>>
>> First of all, thank you for the detailed explanation. It is clearer now what
>> is going on and why it is that way.
>>
>> On 04.02.2023 01:12, Serge Semin wrote:
>>> Hi Evgenii
>>>
>>> On Wed, Feb 01, 2023 at 04:54:55PM +0300, Evgenii Shatokhin wrote:
>>>> On 31.01.2023 15:42, Robin Murphy wrote:
>>>>>
>>>>> On 2023-01-31 12:29, Evgenii Shatokhin wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 26.08.2022 02:54, Will McVicker wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I've update patch 2/2 to address Robin's suggestions. This includes:
>>>>>>>
>>>>>>> * Dropping the while-loop for retrying with a 64-bit mask in favor of
>>>>>>> retrying within the error if-statement.
>>>>>>> * Using an int for the DMA mask instead of a bool and ternary
>>>>>>> operation.
>>>>>>>
>>>>>>> Thanks again for the reviews and sorry for the extra revision today!
>>>>>>> Hopefully this is the last one :) If not, I'd be fine to submit
>>>>>>> patch 1/2
>>>>>>> without 2/2 to avoid resending patch 1/2 for future revisions of patch
>>>>>>> 2/2
>>>>>>> (unless I don't need to do that anyway).
>>>>>>
>>>>>> The first patch of the series made it into the mainline kernel, but, it
>>>>>> seems, the second one ("PCI: dwc: Add support for 64-bit MSI target
>>>>>> address") did not. As of 6.2-rc6, it is still missing.
>>>>>>
>>>>>> Was it intentionally dropped because of some issues or, perhaps, just by
>>>>>> accident? If it was by accident, could you please queue it for inclusion
>>>>>> into mainline again?
>>>>>
>>>>> Yes, it was dropped due to the PCI_MSI_FLAGS_64BIT usage apparently
>>>>> being incorrect, and some other open debate (which all happened on the
>>>>> v5 thread):
>>>>>
>>>>> https://lore.kernel.org/linux-pci/YzVTmy9MWh+AjshC@lpieralisi/
>>>>
>>>
>>>> I see. If I understand it correctly, the problem was that
>>>> PCI_MSI_FLAGS_64BIT flag did not guarantee that 64-bit mask could be used
>>>> for that particular allocation. Right?
>>>>
>>>
>>> William was trying to utilize for only software cause. Setting
>>> PCI_MSI_FLAGS_64BIT didn't actually change the hardware behavior.
>>> He could have as well provided just a driver private capability
>>> flag. (see below for a more detailed problem description)
>>>
>>>>>
>>>>> The DMA mask issues have now been sorted out,
>>>>
>>>> I suppose, you mean https://lore.kernel.org/all/20230113171409.30470-26-Sergey.Semin@baikalelectronics.ru/?
>>>
>>> Well, the way the DMA-mask issue has been solved was a bit of the
>>> hacky. I wouldn't call it a fully proper solution. The problem with
>>> pointlessly allocating physical memory for the iMSI-RX engine (it
>>> doesn't perform any DMA) and artificially restricting the coherent-DMA
>>> mask is still there. The patch in the subject was a compromise in
>>> order to at least permit unrestricted streaming DMAs but limiting the
>>> coherent DMAs for the MSI setup to work properly for all peripheral
>>> devices.
>>>
>>>>
>>>> It still breaks our particular case when the SoC has no 32-bit-addressable
>>>> RAM. We'd set DMA masks to DMA_BIT_MASK(36) in the platform-specific driver
>>>> before calling dw_pcie_host_init(). However, dw_pcie_msi_host_init() resets
>>>> it to 32-bit, tries dmam_alloc_coherent() and fails.
>>>
>>> Yeah. That's another problem with the implemented approach. But are
>>> your sure the driver had worked even before this patch? AFAICS the
>>> driver allocated the MSI-targeted page from DMA32 zone before this
>>> modification. So the allocation must have failed on your platform too.
>>
>> You are right. I did not notice earlier that the kernel based on 6.0-stable
>> we used before did actually contain our SoC-specific workaround for this.
>> Without that custom patch, initialization of PCIe host does not work. So,
>> yes, the problem was present earlier too.
>>
>>>
>>>>
>>>> With 36-bit masks, the kernel seems to play well with the devices in our
>>>> case.
>>>>
>>>> I saw your comment in https://lore.kernel.org/linux-pci/4dc31a63-00b1-f379-c5ac-7dc9425937f4@arm.com/
>>>> that drivers should always explicitly set their masks.
>>>>
>>>
>>>> Is it a really bad idea to check the current coherent mask's bits in
>>>> dw_pcie_msi_host_init() and if it is more than 32 - just issue a warning
>>>> rather than reset it to 32-bit unconditionally? That would help in our case.
>>>> Or, perhaps, there is a better workaround.
>>>
>>> The problem isn't in the value the mask is set to. The problem is
>>> two-leveled, but is mainly connected with the PCIe device detected on
>>> the PCIe bus. There are some of them which can't send MSI TLPs to the
>>> 64-bit addresses. Since we can't predict whether such devices exist on
>>> the bus beforehand the LLDD probe is performed together with the
>>> MSI-engine initialization, the solution was to just restrict the MSIs
>>> base address to be allocated within the lowest 4GB. Moreover as I said
>>> above the iMSI-RX engine doesn't actually cause any DMA thus there is
>>> no need in any memory allocation. Instead reserving some PCIe-bus
>>> space/DWORD for MSIs would be enough. Alas the PCIe-subsystem doesn't
>>> provide a way to do so. That's why we have what you see in the driver:
>>> DMA mask restriction and coherent DMA memory allocation.
>>
>
>> So, if I understand you correctly, what is needed here is a small area of
>> PCIe address space accessible to any of the connected PCIe devices. As the
>> kernel does not know in advance, which restrictions the devices have, it
>> tries to allocate 32-bit-addressable memory, suitable for DMA. This way, it
>> would be OK for any attached PCIe device. Right?
>
> Right. The restriction is the 64-bit MSI capability. If any PCIe
> peripheral device has no PCI_MSI_64_BIT_ADDR_CAP flag set and the MSI
> target address is selected from the space above 4GB then such device
> MSIs won't be handled.
>
> Note as I said above no DMA actually performed if at least one MSI
> vector is enabled. The driver just needs a DWORD within the PCIe bus
> space for the MSI MWr TLPs target address and EP+vector data. The MSI
> TLP data is decoded by the iMSI-RX engine in order to set the
> corresponding flag in the MSI IRQ status register. Such TLPs won't be
> passed to the master AXI-bus.
>
>>
>>>
>>> If only we had a way to auto-detected the PCIe-bus space with no
>>> physical memory behind it and take out a DWORD from it to initialize
>>> the iMSI-RX engine we could have immediately got rid from the mask
>>> setting operation and the memory allocation. It would have solved your
>>> problem too.
>>
>
>> Yes, it would solve our issue too. I do not know, however, if a generic
>> solution is possible here, but I am no expert in PCIe.
>
> Currently the kernel PCIe subsystem doesn't provide a way to reserve a
> range within the PCIe bus memory with no physical RAM behind and left
> unused during the BARs resource initialization. Implementing such
> functionality (perhaps in the framework of the P2P module or based on
> it) would give the generic solution.
>
>>
>> For now, we are probably better off with SoC-specific patches, when we know
>> which PCIe devices can possibly be used and what their restrictions are.
>
> Since you know that there is no any RAM below 4GB and you have
> matching CPU and PCIe address spaces, then you can just take any
> address below 4GB and use it to initialize the MSI-target address
> (dw_pcie_rp.msi_data). But make sure that the peripheral PCIe-devices
> don't use it for something application-specific (like accessing CPU
> MMIO devices mapped to that base address). That seems like the most
> universal solution for your case.
Interesting idea!
Thank you, Sergey.
>
> -Serge(y)
>
>>
>>>
>>> -Serge(y)
>>>
>>>>
>>>> Looking forward to your comments.
>>>
>>>
>>>
>>>>
>>>>
>>>>> so you, or Will, or anyone
>>>>> else interested should be free to rework this on top of linux-next
>>>>> (although at this point, more realistically on top of 6.3-rc1 in a few
>>>>> weeks).
>>>>>
>>>>> Thanks,
>>>>> Robin.
>>>>>
>>>>>> Support for 64-bit MSI target addresses is needed for some of our SoCs.
>>>>>> I ran into a situation when there was no available RAM in ZONE_DMA32
>>>>>> during initialization of PCIe host. Hence, dmam_alloc_coherent() failed
>>>>>> in dw_pcie_msi_host_init() and initialization failed with -ENOMEM:
>>>>>>
>>>>>> [ 0.374834] dw-pcie 4000000.pcie0: host bridge /soc/pcie0@4000000
>>>>>> ranges:
>>>>>> [ 0.375813] dw-pcie 4000000.pcie0: MEM
>>>>>> 0x0041000000..0x004fffffff -> 0x0041000000
>>>>>> [ 0.376171] dw-pcie 4000000.pcie0: IB MEM
>>>>>> 0x0400000000..0x07ffffffff -> 0x0400000000
>>>>>> [ 0.377914] dw-pcie 4000000.pcie0: Failed to alloc and map MSI data
>>>>>> [ 0.378191] dw-pcie 4000000.pcie0: Failed to initialize host
>>>>>> [ 0.378255] dw-pcie: probe of 4000000.pcie0 failed with error -12
>>>>>>
>>>>>> Mainline kernel 6.2-rc6 was used in that test.
>>>>>>
>>>>>> The hardware supports 64-bit target addresses, so the patch "PCI: dwc:
>>>>>> Add support for 64-bit MSI target address" should help with this
>>>>>> particular failure.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Will
>>>>>>>
>>>>>>> Will McVicker (2):
>>>>>>> PCI: dwc: Drop dependency on ZONE_DMA32
>>>>>>>
>>>>>>> v6:
>>>>>>> * Retrying DMA allocation with 64-bit mask within the error
>>>>>>> if-statement.
>>>>>>> * Use an int for the DMA mask instead of a bool and ternary operation.
>>>>>>>
>>>>>>> v5:
>>>>>>> * Updated patch 2/2 to first try with a 32-bit DMA mask. On failure,
>>>>>>> retry with a 64-bit mask if supported.
>>>>>>>
>>>>>>> v4:
>>>>>>> * Updated commit descriptions.
>>>>>>> * Renamed msi_64b -> msi_64bit.
>>>>>>> * Dropped msi_64bit ternary use.
>>>>>>> * Dropped export of dw_pcie_msi_capabilities.
>>>>>>>
>>>>>>> v3:
>>>>>>> * Switched to a managed DMA allocation.
>>>>>>> * Simplified the DMA allocation cleanup.
>>>>>>> * Dropped msi_page from struct dw_pcie_rp.
>>>>>>> * Allocating a u64 instead of a full page.
>>>>>>>
>>>>>>> v2:
>>>>>>> * Fixed build error caught by kernel test robot
>>>>>>> * Fixed error handling reported by Isaac Manjarres
>>>>>>> PCI: dwc: Add support for 64-bit MSI target address
>>>>>>>
>>>>>>> .../pci/controller/dwc/pcie-designware-host.c | 43 +++++++++----------
>>>>>>> drivers/pci/controller/dwc/pcie-designware.c | 8 ++++
>>>>>>> drivers/pci/controller/dwc/pcie-designware.h | 2 +-
>>>>>>> 3 files changed, 30 insertions(+), 23 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>> base-commit: 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>>>>>
>>>>>> Thank you in advance.
>>
>> Regards,
>> Evgenii
>>
>>
>
next prev parent reply other threads:[~2023-02-09 8:03 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-25 23:54 [PATCH v6 0/2] PCI: dwc: Add support for 64-bit MSI target addresses Will McVicker
2022-08-25 23:54 ` [PATCH v6 1/2] PCI: dwc: Drop dependency on ZONE_DMA32 Will McVicker
2022-08-25 23:54 ` [PATCH v6 2/2] PCI: dwc: Add support for 64-bit MSI target address Will McVicker
2022-08-29 8:03 ` [PATCH v6 0/2] PCI: dwc: Add support for 64-bit MSI target addresses Lorenzo Pieralisi
2023-01-31 12:29 ` Evgenii Shatokhin
2023-01-31 12:42 ` Robin Murphy
2023-02-01 13:54 ` Evgenii Shatokhin
2023-02-03 22:12 ` Serge Semin
2023-02-06 11:27 ` Evgenii Shatokhin
2023-02-09 0:48 ` Serge Semin
2023-02-09 8:03 ` Evgenii Shatokhin [this message]
2023-02-09 15:54 ` Robin Murphy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7f59568a-eebe-95c8-4aea-c01dd726efa7@yadro.com \
--to=e.shatokhin@yadro.com \
--cc=Sergey.Semin@baikalelectronics.ru \
--cc=bhelgaas@google.com \
--cc=fancer.lancer@gmail.com \
--cc=gustavo.pimentel@synopsys.com \
--cc=hch@infradead.org \
--cc=jingoohan1@gmail.com \
--cc=kernel-team@android.com \
--cc=kw@linux.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux@yadro.com \
--cc=lpieralisi@kernel.org \
--cc=robh@kernel.org \
--cc=robin.murphy@arm.com \
--cc=vidyas@nvidia.com \
--cc=willmcvicker@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox