* [PATCH v6 0/6] Improve PCI memory mapping API
@ 2024-10-12 11:32 Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 1/6] PCI: endpoint: Introduce pci_epc_function_is_valid() Damien Le Moal
` (7 more replies)
0 siblings, 8 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 11:32 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci
Cc: Rick Wertenbroek, Niklas Cassel
This series introduces the new functions pci_epc_mem_map() and
pci_epc_mem_unmap() to improve handling of the PCI address mapping
alignment constraints of endpoint controllers in a controller
independent manner.
The issue addressed is that the alignment defined by the "align" field
of struct pci_epc_features applies to inbound ATU entries (e.g.
BARs) and is a fixed value, whereas some controllers need a PCI address
alignment that depends on the PCI address itself. For instance, the
rk3399 SoC controller in endpoint mode uses the lower bits of the local
endpoint memory address as the lower bits for the PCI addresses for data
transfers. That is, when mapping local memory, one must take into
account the number of RC PCI address bits that change within the
mapped range, relative to the start address of the mapping.
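To make the changing-bits rule above concrete, here is a minimal,
self-contained sketch of how such an alignment could be derived. The
function name and the exact mask computation are illustrative
assumptions, not the actual driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch (not the kernel implementation): for a
 * RK3399-style controller, compute the aligned PCI base address to
 * map, the offset that the low PCI address bits impose on the
 * controller memory, and the total controller memory size needed to
 * cover [pci_addr, pci_addr + size - 1].
 */
static uint64_t align_pci_addr(uint64_t pci_addr, size_t size,
			       size_t *map_size, size_t *offset)
{
	uint64_t end = pci_addr + size - 1;
	/*
	 * The address bits changing within the range are exactly those
	 * that differ between its first and last address.
	 */
	uint64_t diff = pci_addr ^ end;
	uint64_t mask = 0;

	/* Smallest all-ones mask covering every changing bit. */
	while (mask < diff)
		mask = (mask << 1) | 1;

	*offset = pci_addr & mask;	/* low bits carried into the mapping */
	*map_size = *offset + size;	/* controller memory actually needed */
	return pci_addr & ~mask;	/* aligned PCI address to map */
}
```

For example, mapping 0x100 bytes at PCI address 0x1234 would, under
this sketch, require mapping from the aligned address 0x1200 and
accessing the data at offset 0x34 into the allocated controller memory.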
To fix this, the new endpoint controller method .align_addr is
introduced and called from the new pci_epc_mem_map() function. This
method is optional: for controllers that do not define it, it is
assumed that the controller has no PCI address alignment constraint.
The function pci_epc_mem_map() is a helper which obtains the mapping
information, allocates endpoint controller memory according to the
obtained mapping size, and maps that memory. pci_epc_mem_unmap()
unmaps and frees the endpoint memory.
This series is organized as follows:
- Patch 1 introduces a small helper to clean up the epc core code
- Patch 2 improves pci_epc_mem_alloc_addr()
- Patch 3 introduces the new align_addr() endpoint controller method
and the epc functions pci_epc_mem_map() and pci_epc_mem_unmap().
- Patch 4 documents these new functions.
- Patch 5 modifies the test endpoint function driver to use
pci_epc_mem_map() and pci_epc_mem_unmap() to illustrate the use of
these functions.
- Finally, patch 6 implements the RK3588 endpoint controller driver
.align_addr() operation to satisfy that controller's PCI address
alignment constraint.
This patch series was tested using the PCI endpoint test driver (as-is
and a modified version removing memory allocation alignment on the host
side) as well as with a prototype NVMe endpoint function driver (where
data transfers use addresses that are never aligned to any specific
boundary).
Changes from v5:
- Changed patch 3 to rename the new controller operation to align_addr
and change its interface. Patch 6 is changed accordingly.
Changes from v4:
- Removed the patch adding the pci_epc_map_align() function. The former
.map_align controller operation is now introduced in patch 3 as
"get_mem_map()" and used directly in the new pci_epf_mem_map()
function.
- Modified the documentation patch 4 to reflect the previous change.
- Changed patch 6 title and modified it to rename map_align to
get_mem_map in accordance with the new patch 3.
- Added review tags
Changes from v3:
- Addressed Niklas's comments (improved patch 2 commit message, added
comments in the pci_epc_map_align() function in patch 3, typos and
improvements in patch 5, patch 7 commit message).
- Added review tags
Changes from v2:
- Dropped all patches for the rockchip-ep. These patches will be sent
later as a separate series.
- Added patch 2 and 5
- Added review tags to patch 1
Changes from v1:
- Changed pci_epc_check_func() to pci_epc_function_is_valid() in patch
1.
- Removed patch "PCI: endpoint: Improve pci_epc_mem_alloc_addr()"
(former patch 2 of v1)
- Various typos cleanups all over. Also fixed some blank space
indentation.
- Added review tags
Damien Le Moal (6):
PCI: endpoint: Introduce pci_epc_function_is_valid()
PCI: endpoint: Improve pci_epc_mem_alloc_addr()
PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
PCI: endpoint: Update documentation
PCI: endpoint: test: Use pci_epc_mem_map/unmap()
PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation
Documentation/PCI/endpoint/pci-endpoint.rst | 29 ++
.../pci/controller/dwc/pcie-designware-ep.c | 15 +
drivers/pci/endpoint/functions/pci-epf-test.c | 372 +++++++++---------
drivers/pci/endpoint/pci-epc-core.c | 182 ++++++---
drivers/pci/endpoint/pci-epc-mem.c | 9 +-
include/linux/pci-epc.h | 38 ++
6 files changed, 415 insertions(+), 230 deletions(-)
--
2.47.0
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v6 1/6] PCI: endpoint: Introduce pci_epc_function_is_valid()
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
@ 2024-10-12 11:32 ` Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 2/6] PCI: endpoint: Improve pci_epc_mem_alloc_addr() Damien Le Moal
` (6 subsequent siblings)
7 siblings, 0 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 11:32 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci
Cc: Rick Wertenbroek, Niklas Cassel
Introduce the epc core helper function pci_epc_function_is_valid() to
verify that an epc pointer, a physical function number and a virtual
function number are all valid. This avoids repeating the code pattern:
if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
return err;
if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
return err;
in many functions of the endpoint controller core code.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
---
drivers/pci/endpoint/pci-epc-core.c | 79 +++++++++++------------------
1 file changed, 31 insertions(+), 48 deletions(-)
diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index 17f007109255..b854f1bab26f 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -128,6 +128,18 @@ enum pci_barno pci_epc_get_next_free_bar(const struct pci_epc_features
}
EXPORT_SYMBOL_GPL(pci_epc_get_next_free_bar);
+static bool pci_epc_function_is_valid(struct pci_epc *epc,
+ u8 func_no, u8 vfunc_no)
+{
+ if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
+ return false;
+
+ if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ return false;
+
+ return true;
+}
+
/**
* pci_epc_get_features() - get the features supported by EPC
* @epc: the features supported by *this* EPC device will be returned
@@ -145,10 +157,7 @@ const struct pci_epc_features *pci_epc_get_features(struct pci_epc *epc,
{
const struct pci_epc_features *epc_features;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
- return NULL;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return NULL;
if (!epc->ops->get_features)
@@ -218,10 +227,7 @@ int pci_epc_raise_irq(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
{
int ret;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
- return -EINVAL;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return -EINVAL;
if (!epc->ops->raise_irq)
@@ -262,10 +268,7 @@ int pci_epc_map_msi_irq(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
{
int ret;
- if (IS_ERR_OR_NULL(epc))
- return -EINVAL;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return -EINVAL;
if (!epc->ops->map_msi_irq)
@@ -293,10 +296,7 @@ int pci_epc_get_msi(struct pci_epc *epc, u8 func_no, u8 vfunc_no)
{
int interrupt;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
- return 0;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return 0;
if (!epc->ops->get_msi)
@@ -329,11 +329,10 @@ int pci_epc_set_msi(struct pci_epc *epc, u8 func_no, u8 vfunc_no, u8 interrupts)
int ret;
u8 encode_int;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions ||
- interrupts < 1 || interrupts > 32)
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return -EINVAL;
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (interrupts < 1 || interrupts > 32)
return -EINVAL;
if (!epc->ops->set_msi)
@@ -361,10 +360,7 @@ int pci_epc_get_msix(struct pci_epc *epc, u8 func_no, u8 vfunc_no)
{
int interrupt;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
- return 0;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return 0;
if (!epc->ops->get_msix)
@@ -397,11 +393,10 @@ int pci_epc_set_msix(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
{
int ret;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions ||
- interrupts < 1 || interrupts > 2048)
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return -EINVAL;
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (interrupts < 1 || interrupts > 2048)
return -EINVAL;
if (!epc->ops->set_msix)
@@ -428,10 +423,7 @@ EXPORT_SYMBOL_GPL(pci_epc_set_msix);
void pci_epc_unmap_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
phys_addr_t phys_addr)
{
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
- return;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return;
if (!epc->ops->unmap_addr)
@@ -459,10 +451,7 @@ int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
{
int ret;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
- return -EINVAL;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return -EINVAL;
if (!epc->ops->map_addr)
@@ -489,12 +478,11 @@ EXPORT_SYMBOL_GPL(pci_epc_map_addr);
void pci_epc_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
struct pci_epf_bar *epf_bar)
{
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions ||
- (epf_bar->barno == BAR_5 &&
- epf_bar->flags & PCI_BASE_ADDRESS_MEM_TYPE_64))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return;
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (epf_bar->barno == BAR_5 &&
+ epf_bar->flags & PCI_BASE_ADDRESS_MEM_TYPE_64)
return;
if (!epc->ops->clear_bar)
@@ -521,18 +509,16 @@ int pci_epc_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
int ret;
int flags = epf_bar->flags;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions ||
- (epf_bar->barno == BAR_5 &&
- flags & PCI_BASE_ADDRESS_MEM_TYPE_64) ||
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
+ return -EINVAL;
+
+ if ((epf_bar->barno == BAR_5 && flags & PCI_BASE_ADDRESS_MEM_TYPE_64) ||
(flags & PCI_BASE_ADDRESS_SPACE_IO &&
flags & PCI_BASE_ADDRESS_IO_MASK) ||
(upper_32_bits(epf_bar->size) &&
!(flags & PCI_BASE_ADDRESS_MEM_TYPE_64)))
return -EINVAL;
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
- return -EINVAL;
-
if (!epc->ops->set_bar)
return 0;
@@ -561,10 +547,7 @@ int pci_epc_write_header(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
{
int ret;
- if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
- return -EINVAL;
-
- if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no]))
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
return -EINVAL;
/* Only Virtual Function #1 has deviceID */
--
2.47.0
* [PATCH v6 2/6] PCI: endpoint: Improve pci_epc_mem_alloc_addr()
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 1/6] PCI: endpoint: Introduce pci_epc_function_is_valid() Damien Le Moal
@ 2024-10-12 11:32 ` Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap() Damien Le Moal
` (5 subsequent siblings)
7 siblings, 0 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 11:32 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci
Cc: Rick Wertenbroek, Niklas Cassel
There is no point in attempting to allocate memory from an endpoint
controller memory window if the requested size is larger than the memory
window size. Add a check to skip bitmap_find_free_region() calls in
such cases. This check can be done without holding the mem->lock mutex
as memory window sizes are constant and never modified at runtime.
Also change the final return statement to return NULL, simplifying the
code.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
drivers/pci/endpoint/pci-epc-mem.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/endpoint/pci-epc-mem.c b/drivers/pci/endpoint/pci-epc-mem.c
index a9c028f58da1..218a60e945db 100644
--- a/drivers/pci/endpoint/pci-epc-mem.c
+++ b/drivers/pci/endpoint/pci-epc-mem.c
@@ -178,7 +178,7 @@ EXPORT_SYMBOL_GPL(pci_epc_mem_exit);
void __iomem *pci_epc_mem_alloc_addr(struct pci_epc *epc,
phys_addr_t *phys_addr, size_t size)
{
- void __iomem *virt_addr = NULL;
+ void __iomem *virt_addr;
struct pci_epc_mem *mem;
unsigned int page_shift;
size_t align_size;
@@ -188,10 +188,13 @@ void __iomem *pci_epc_mem_alloc_addr(struct pci_epc *epc,
for (i = 0; i < epc->num_windows; i++) {
mem = epc->windows[i];
- mutex_lock(&mem->lock);
+ if (size > mem->window.size)
+ continue;
+
align_size = ALIGN(size, mem->window.page_size);
order = pci_epc_mem_get_order(mem, align_size);
+ mutex_lock(&mem->lock);
pageno = bitmap_find_free_region(mem->bitmap, mem->pages,
order);
if (pageno >= 0) {
@@ -211,7 +214,7 @@ void __iomem *pci_epc_mem_alloc_addr(struct pci_epc *epc,
mutex_unlock(&mem->lock);
}
- return virt_addr;
+ return NULL;
}
EXPORT_SYMBOL_GPL(pci_epc_mem_alloc_addr);
--
2.47.0
* [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 1/6] PCI: endpoint: Introduce pci_epc_function_is_valid() Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 2/6] PCI: endpoint: Improve pci_epc_mem_alloc_addr() Damien Le Moal
@ 2024-10-12 11:32 ` Damien Le Moal
2024-10-12 11:47 ` Manivannan Sadhasivam
2024-10-13 9:06 ` Niklas Cassel
2024-10-12 11:32 ` [PATCH v6 4/6] PCI: endpoint: Update documentation Damien Le Moal
` (4 subsequent siblings)
7 siblings, 2 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 11:32 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci
Cc: Rick Wertenbroek, Niklas Cassel
Some endpoint controllers have requirements on the alignment of the
controller physical memory address that must be used to map a RC PCI
address region. For instance, the endpoint controller of the RK3399 SoC
uses at most the lower 20 bits of a physical memory address region as
the lower bits of a RC PCI address region. For mapping a PCI address
region of size bytes starting from pci_addr, the exact number of
address bits used is the number of address bits changing in the address
range [pci_addr..pci_addr + size - 1]. For this example, this creates
the following constraints:
1) The offset into the controller physical memory allocated for a
mapping depends on the mapping size *and* the starting PCI address
for the mapping.
2) A mapping size cannot exceed the controller window size (1MB) minus
the offset needed into the allocated physical memory, which can
result in a usable size smaller than the desired mapping size.
Handling these constraints in an endpoint function driver,
independently of the controller being used, is not possible with the
current EPC API: only the ->align field in struct pci_epc_features is
provided, and it is used only for BAR (inbound ATU) mappings. A new API
is needed for function drivers to discover mapping constraints and
handle non-static requirements based on the RC PCI address range to
access.
Introduce the endpoint controller operation ->align_addr() to allow
the EPC core functions to obtain the size and the offset into a
controller address region that must be allocated and mapped to access
a RC PCI address region. The size of the mapping provided by the
align_addr() operation can then be used as the size argument for the
function pci_epc_mem_alloc_addr(), and the provided offset into the
allocated controller memory can be used to correctly handle data
transfers. For endpoint controllers that have PCI address alignment
constraints, the align_addr() operation may indicate upon return an
effective PCI address mapping size that is smaller (but not 0) than the
requested PCI address region size.
The controller ->align_addr() operation is optional: controllers that
do not have any alignment constraints for mapping RC PCI address regions
do not need to implement this operation. For such controllers, it is
always assumed that the mapping size is equal to the requested size of
the PCI region and that the mapping offset is 0.
The function pci_epc_mem_map() is introduced to use this new controller
operation (if it is defined) to handle controller memory allocation and
mapping to a RC PCI address region in endpoint function drivers.
This function first uses the ->align_addr() controller operation to
determine the controller memory size (and the offset into it) needed
for mapping a RC PCI address region. The result of this operation is
used to allocate a controller physical memory region using
pci_epc_mem_alloc_addr() and then to map that memory to the RC PCI
address space with pci_epc_map_addr().
Since ->align_addr() may indicate that not all of a RC PCI address
region can be mapped, pci_epc_mem_map() may only partially map the
specified RC PCI address region. It is the responsibility of the caller
(an endpoint function driver) to handle such smaller mappings by
repeatedly using pci_epc_mem_map() over the desired PCI address range.
The counterpart of pci_epc_mem_map() to unmap and free a mapped
controller memory address region is pci_epc_mem_unmap().
Both functions operate on the new struct pci_epc_map data structure.
This structure describes the mapped PCI address, the effective mapping
size, and the size of the controller memory needed for the mapping, as
well as the physical and virtual CPU addresses of the mapping (phys_base
and virt_base fields). For convenience, the physical and virtual CPU
addresses to use within that mapping to access the target RC PCI address
region are also provided (phys_addr and virt_addr fields).
Endpoint function drivers can use struct pci_epc_map to access the
mapped RC PCI address region using the ->virt_addr and ->pci_size
fields.
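As an illustration of the caller-side loop described above, here is a
self-contained sketch using a mock in place of pci_epc_mem_map(). The
mock, its 256-byte window limit, and all names here are assumptions for
illustration only, not the kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pretend the controller can map at most 256 bytes per mapping. */
#define MOCK_WINDOW_SIZE 0x100

struct mock_map {
	uint64_t pci_addr;
	size_t pci_size;	/* effective bytes mapped from pci_addr */
};

/*
 * Mock of pci_epc_mem_map(): may return a mapping covering fewer bytes
 * than requested when the range crosses a window boundary.
 */
static int mock_epc_mem_map(uint64_t pci_addr, size_t size,
			    struct mock_map *map)
{
	size_t offset = pci_addr % MOCK_WINDOW_SIZE;

	map->pci_addr = pci_addr;
	map->pci_size = size;
	if (offset + size > MOCK_WINDOW_SIZE)
		map->pci_size = MOCK_WINDOW_SIZE - offset; /* partial */
	return 0;
}

/*
 * Caller-side pattern: keep mapping, using the effective pci_size of
 * each mapping to advance, until the whole range is covered.
 */
static size_t transfer_all(uint64_t pci_addr, size_t size)
{
	struct mock_map map;
	size_t done = 0;

	while (size) {
		if (mock_epc_mem_map(pci_addr, size, &map))
			break;
		/* ... access map.pci_size bytes here, then unmap ... */
		pci_addr += map.pci_size;
		size -= map.pci_size;
		done += map.pci_size;
	}
	return done;
}
```

With this pattern, a function driver never needs to know the
controller's alignment constraints: it simply consumes whatever
effective size each call reports.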
Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
drivers/pci/endpoint/pci-epc-core.c | 103 ++++++++++++++++++++++++++++
include/linux/pci-epc.h | 38 ++++++++++
2 files changed, 141 insertions(+)
diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index b854f1bab26f..04a85d2f7e2a 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -466,6 +466,109 @@ int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
}
EXPORT_SYMBOL_GPL(pci_epc_map_addr);
+/**
+ * pci_epc_mem_map() - allocate and map a PCI address to a CPU address
+ * @epc: the EPC device on which the CPU address is to be allocated and mapped
+ * @func_no: the physical endpoint function number in the EPC device
+ * @vfunc_no: the virtual endpoint function number in the physical function
+ * @pci_addr: PCI address to which the CPU address should be mapped
+ * @pci_size: the number of bytes to map starting from @pci_addr
+ * @map: where to return the mapping information
+ *
+ * Allocate a controller memory address region and map it to a RC PCI address
+ * region, taking into account the controller physical address mapping
+ * constraints using the controller operation align_addr(). If this operation is
+ * not defined, we assume that there are no alignment constraints for the
+ * mapping.
+ *
+ * The effective size of the PCI address range mapped from @pci_addr is
+ * indicated by @map->pci_size. This size may be less than the requested
+ * @pci_size. The local virtual CPU address for the mapping is indicated by
+ * @map->virt_addr (@map->phys_addr indicates the physical address).
+ * The size and CPU address of the controller memory allocated and mapped are
+ * respectively indicated by @map->map_size and @map->virt_base (and
+ * @map->phys_base for the physical address of @map->virt_base).
+ *
+ * Returns 0 on success and a negative error code in case of error.
+ */
+int pci_epc_mem_map(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+ u64 pci_addr, size_t pci_size, struct pci_epc_map *map)
+{
+ size_t map_size = pci_size;
+ size_t map_offset = 0;
+ int ret;
+
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
+ return -EINVAL;
+
+ if (!pci_size || !map)
+ return -EINVAL;
+
+ /*
+ * Align the PCI address to map. If the controller defines the
+ * .align_addr() operation, use it to determine the PCI address to map
+ * and the size of the mapping. Otherwise, assume that the controller
+ * has no alignment constraint.
+ */
+ memset(map, 0, sizeof(*map));
+ map->pci_addr = pci_addr;
+ if (epc->ops->align_addr)
+ map->map_pci_addr =
+ epc->ops->align_addr(epc, pci_addr,
+ &map_size, &map_offset);
+ else
+ map->map_pci_addr = pci_addr;
+ map->map_size = map_size;
+ if (map->map_pci_addr + map->map_size < pci_addr + pci_size)
+ map->pci_size = map->map_pci_addr + map->map_size - pci_addr;
+ else
+ map->pci_size = pci_size;
+
+ map->virt_base = pci_epc_mem_alloc_addr(epc, &map->phys_base,
+ map->map_size);
+ if (!map->virt_base)
+ return -ENOMEM;
+
+ map->phys_addr = map->phys_base + map_offset;
+ map->virt_addr = map->virt_base + map_offset;
+
+ ret = pci_epc_map_addr(epc, func_no, vfunc_no, map->phys_base,
+ map->map_pci_addr, map->map_size);
+ if (ret) {
+ pci_epc_mem_free_addr(epc, map->phys_base, map->virt_base,
+ map->map_size);
+ return ret;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pci_epc_mem_map);
+
+/**
+ * pci_epc_mem_unmap() - unmap and free a CPU address region
+ * @epc: the EPC device on which the CPU address is allocated and mapped
+ * @func_no: the physical endpoint function number in the EPC device
+ * @vfunc_no: the virtual endpoint function number in the physical function
+ * @map: the mapping information
+ *
+ * Unmap and free a CPU address region that was allocated and mapped with
+ * pci_epc_mem_map().
+ */
+void pci_epc_mem_unmap(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+ struct pci_epc_map *map)
+{
+ if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
+ return;
+
+ if (!map || !map->virt_base)
+ return;
+
+ pci_epc_unmap_addr(epc, func_no, vfunc_no, map->phys_base);
+ pci_epc_mem_free_addr(epc, map->phys_base, map->virt_base,
+ map->map_size);
+}
+EXPORT_SYMBOL_GPL(pci_epc_mem_unmap);
+
/**
* pci_epc_clear_bar() - reset the BAR
* @epc: the EPC device for which the BAR has to be cleared
diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
index 42ef06136bd1..f4b8dc37e447 100644
--- a/include/linux/pci-epc.h
+++ b/include/linux/pci-epc.h
@@ -32,11 +32,43 @@ pci_epc_interface_string(enum pci_epc_interface_type type)
}
}
+/**
+ * struct pci_epc_map - information about EPC memory for mapping a RC PCI
+ * address range
+ * @pci_addr: start address of the RC PCI address range to map
+ * @pci_size: size of the RC PCI address range mapped from @pci_addr
+ * @map_pci_addr: RC PCI address used as the first address mapped (may be lower
+ * than @pci_addr)
+ * @map_size: size of the controller memory needed for mapping the RC PCI address
+ * range @pci_addr..@pci_addr+@pci_size
+ * @phys_base: base physical address of the allocated EPC memory for mapping the
+ * RC PCI address range
+ * @phys_addr: physical address at which @pci_addr is mapped
+ * @virt_base: base virtual address of the allocated EPC memory for mapping the
+ * RC PCI address range
+ * @virt_addr: virtual address at which @pci_addr is mapped
+ */
+struct pci_epc_map {
+ phys_addr_t pci_addr;
+ size_t pci_size;
+
+ phys_addr_t map_pci_addr;
+ size_t map_size;
+
+ phys_addr_t phys_base;
+ phys_addr_t phys_addr;
+ void __iomem *virt_base;
+ void __iomem *virt_addr;
+};
+
/**
* struct pci_epc_ops - set of function pointers for performing EPC operations
* @write_header: ops to populate configuration space header
* @set_bar: ops to configure the BAR
* @clear_bar: ops to reset the BAR
+ * @align_addr: operation to get the mapping address, mapping size and offset
+ * into a controller memory window needed to map an RC PCI address
+ * region
* @map_addr: ops to map CPU address to PCI address
* @unmap_addr: ops to unmap CPU address and PCI address
* @set_msi: ops to set the requested number of MSI interrupts in the MSI
@@ -61,6 +93,8 @@ struct pci_epc_ops {
struct pci_epf_bar *epf_bar);
void (*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
struct pci_epf_bar *epf_bar);
+ phys_addr_t (*align_addr)(struct pci_epc *epc, phys_addr_t pci_addr,
+ size_t *size, size_t *offset);
int (*map_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
phys_addr_t addr, u64 pci_addr, size_t size);
void (*unmap_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
@@ -278,6 +312,10 @@ void __iomem *pci_epc_mem_alloc_addr(struct pci_epc *epc,
phys_addr_t *phys_addr, size_t size);
void pci_epc_mem_free_addr(struct pci_epc *epc, phys_addr_t phys_addr,
void __iomem *virt_addr, size_t size);
+int pci_epc_mem_map(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+ u64 pci_addr, size_t pci_size, struct pci_epc_map *map);
+void pci_epc_mem_unmap(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+ struct pci_epc_map *map);
#else
static inline void pci_epc_init_notify(struct pci_epc *epc)
--
2.47.0
* [PATCH v6 4/6] PCI: endpoint: Update documentation
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
` (2 preceding siblings ...)
2024-10-12 11:32 ` [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap() Damien Le Moal
@ 2024-10-12 11:32 ` Damien Le Moal
2024-10-12 11:48 ` Manivannan Sadhasivam
2024-10-12 11:32 ` [PATCH v6 5/6] PCI: endpoint: test: Use pci_epc_mem_map/unmap() Damien Le Moal
` (3 subsequent siblings)
7 siblings, 1 reply; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 11:32 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci
Cc: Rick Wertenbroek, Niklas Cassel
Document the new functions pci_epc_mem_map() and pci_epc_mem_unmap().
Also add the documentation for the functions pci_epc_map_addr() and
pci_epc_unmap_addr() that were missing.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
---
Documentation/PCI/endpoint/pci-endpoint.rst | 29 +++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/Documentation/PCI/endpoint/pci-endpoint.rst b/Documentation/PCI/endpoint/pci-endpoint.rst
index 21507e3cc238..35f82f2d45f5 100644
--- a/Documentation/PCI/endpoint/pci-endpoint.rst
+++ b/Documentation/PCI/endpoint/pci-endpoint.rst
@@ -117,6 +117,35 @@ by the PCI endpoint function driver.
The PCI endpoint function driver should use pci_epc_mem_free_addr() to
free the memory space allocated using pci_epc_mem_alloc_addr().
+* pci_epc_map_addr()
+
+ A PCI endpoint function driver should use pci_epc_map_addr() to map to a RC
+ PCI address the CPU address of local memory obtained with
+ pci_epc_mem_alloc_addr().
+
+* pci_epc_unmap_addr()
+
+ A PCI endpoint function driver should use pci_epc_unmap_addr() to unmap the
+ CPU address of local memory mapped to a RC address with pci_epc_map_addr().
+
+* pci_epc_mem_map()
+
+ A PCI endpoint controller may impose constraints on the RC PCI addresses that
+ can be mapped. The function pci_epc_mem_map() allows endpoint function
+ drivers to allocate and map controller memory while handling such
+ constraints. This function will determine the size of the memory that must be
+ allocated with pci_epc_mem_alloc_addr() for successfully mapping a RC PCI
+ address range. This function will also indicate the size of the PCI address
+ range that was actually mapped, which can be less than the requested size, as
+ well as the offset into the allocated memory to use for accessing the mapped
+ RC PCI address range.
+
+* pci_epc_mem_unmap()
+
+ A PCI endpoint function driver can use pci_epc_mem_unmap() to unmap and free
+ controller memory that was allocated and mapped using pci_epc_mem_map().
+
+
Other EPC APIs
~~~~~~~~~~~~~~
--
2.47.0
* [PATCH v6 5/6] PCI: endpoint: test: Use pci_epc_mem_map/unmap()
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
` (3 preceding siblings ...)
2024-10-12 11:32 ` [PATCH v6 4/6] PCI: endpoint: Update documentation Damien Le Moal
@ 2024-10-12 11:32 ` Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 6/6] PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation Damien Le Moal
` (2 subsequent siblings)
7 siblings, 0 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 11:32 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci
Cc: Rick Wertenbroek, Niklas Cassel
Modify the endpoint test driver to use the functions pci_epc_mem_map()
and pci_epc_mem_unmap() for the read, write and copy tests. For each
test case, the transfers (DMA or MMIO) are executed in a loop to ensure
that potentially partial mappings returned by pci_epc_mem_map() are
correctly handled.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
drivers/pci/endpoint/functions/pci-epf-test.c | 372 +++++++++---------
1 file changed, 193 insertions(+), 179 deletions(-)
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 7c2ed6eae53a..a73bc0771d35 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -317,91 +317,92 @@ static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
static void pci_epf_test_copy(struct pci_epf_test *epf_test,
struct pci_epf_test_reg *reg)
{
- int ret;
- void __iomem *src_addr;
- void __iomem *dst_addr;
- phys_addr_t src_phys_addr;
- phys_addr_t dst_phys_addr;
+ int ret = 0;
struct timespec64 start, end;
struct pci_epf *epf = epf_test->epf;
- struct device *dev = &epf->dev;
struct pci_epc *epc = epf->epc;
+ struct device *dev = &epf->dev;
+ struct pci_epc_map src_map, dst_map;
+ u64 src_addr = reg->src_addr;
+ u64 dst_addr = reg->dst_addr;
+ size_t copy_size = reg->size;
+ ssize_t map_size = 0;
+ void *copy_buf = NULL, *buf;
- src_addr = pci_epc_mem_alloc_addr(epc, &src_phys_addr, reg->size);
- if (!src_addr) {
- dev_err(dev, "Failed to allocate source address\n");
- reg->status = STATUS_SRC_ADDR_INVALID;
- ret = -ENOMEM;
- goto err;
- }
-
- ret = pci_epc_map_addr(epc, epf->func_no, epf->vfunc_no, src_phys_addr,
- reg->src_addr, reg->size);
- if (ret) {
- dev_err(dev, "Failed to map source address\n");
- reg->status = STATUS_SRC_ADDR_INVALID;
- goto err_src_addr;
- }
-
- dst_addr = pci_epc_mem_alloc_addr(epc, &dst_phys_addr, reg->size);
- if (!dst_addr) {
- dev_err(dev, "Failed to allocate destination address\n");
- reg->status = STATUS_DST_ADDR_INVALID;
- ret = -ENOMEM;
- goto err_src_map_addr;
- }
-
- ret = pci_epc_map_addr(epc, epf->func_no, epf->vfunc_no, dst_phys_addr,
- reg->dst_addr, reg->size);
- if (ret) {
- dev_err(dev, "Failed to map destination address\n");
- reg->status = STATUS_DST_ADDR_INVALID;
- goto err_dst_addr;
- }
-
- ktime_get_ts64(&start);
if (reg->flags & FLAG_USE_DMA) {
if (epf_test->dma_private) {
dev_err(dev, "Cannot transfer data using DMA\n");
ret = -EINVAL;
- goto err_map_addr;
+ goto set_status;
}
-
- ret = pci_epf_test_data_transfer(epf_test, dst_phys_addr,
- src_phys_addr, reg->size, 0,
- DMA_MEM_TO_MEM);
- if (ret)
- dev_err(dev, "Data transfer failed\n");
} else {
- void *buf;
-
- buf = kzalloc(reg->size, GFP_KERNEL);
- if (!buf) {
+ copy_buf = kzalloc(copy_size, GFP_KERNEL);
+ if (!copy_buf) {
ret = -ENOMEM;
- goto err_map_addr;
+ goto set_status;
}
-
- memcpy_fromio(buf, src_addr, reg->size);
- memcpy_toio(dst_addr, buf, reg->size);
- kfree(buf);
+ buf = copy_buf;
}
- ktime_get_ts64(&end);
- pci_epf_test_print_rate(epf_test, "COPY", reg->size, &start, &end,
- reg->flags & FLAG_USE_DMA);
-err_map_addr:
- pci_epc_unmap_addr(epc, epf->func_no, epf->vfunc_no, dst_phys_addr);
+ while (copy_size) {
+ ret = pci_epc_mem_map(epc, epf->func_no, epf->vfunc_no,
+ src_addr, copy_size, &src_map);
+ if (ret) {
+ dev_err(dev, "Failed to map source address\n");
+ reg->status = STATUS_SRC_ADDR_INVALID;
+ goto free_buf;
+ }
+
+ ret = pci_epc_mem_map(epf->epc, epf->func_no, epf->vfunc_no,
+ dst_addr, copy_size, &dst_map);
+ if (ret) {
+ dev_err(dev, "Failed to map destination address\n");
+ reg->status = STATUS_DST_ADDR_INVALID;
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no,
+ &src_map);
+ goto free_buf;
+ }
+
+ map_size = min_t(size_t, dst_map.pci_size, src_map.pci_size);
+
+ ktime_get_ts64(&start);
+ if (reg->flags & FLAG_USE_DMA) {
+ ret = pci_epf_test_data_transfer(epf_test,
+ dst_map.phys_addr, src_map.phys_addr,
+ map_size, 0, DMA_MEM_TO_MEM);
+ if (ret) {
+ dev_err(dev, "Data transfer failed\n");
+ goto unmap;
+ }
+ } else {
+ memcpy_fromio(buf, src_map.virt_addr, map_size);
+ memcpy_toio(dst_map.virt_addr, buf, map_size);
+ buf += map_size;
+ }
+ ktime_get_ts64(&end);
-err_dst_addr:
- pci_epc_mem_free_addr(epc, dst_phys_addr, dst_addr, reg->size);
+ copy_size -= map_size;
+ src_addr += map_size;
+ dst_addr += map_size;
-err_src_map_addr:
- pci_epc_unmap_addr(epc, epf->func_no, epf->vfunc_no, src_phys_addr);
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &dst_map);
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &src_map);
+ map_size = 0;
+ }
-err_src_addr:
- pci_epc_mem_free_addr(epc, src_phys_addr, src_addr, reg->size);
+ pci_epf_test_print_rate(epf_test, "COPY", reg->size, &start,
+ &end, reg->flags & FLAG_USE_DMA);
-err:
+unmap:
+ if (map_size) {
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &dst_map);
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &src_map);
+ }
+
+free_buf:
+ kfree(copy_buf);
+
+set_status:
if (!ret)
reg->status |= STATUS_COPY_SUCCESS;
else
@@ -411,82 +412,89 @@ static void pci_epf_test_copy(struct pci_epf_test *epf_test,
static void pci_epf_test_read(struct pci_epf_test *epf_test,
struct pci_epf_test_reg *reg)
{
- int ret;
- void __iomem *src_addr;
- void *buf;
+ int ret = 0;
+ void *src_buf, *buf;
u32 crc32;
- phys_addr_t phys_addr;
+ struct pci_epc_map map;
phys_addr_t dst_phys_addr;
struct timespec64 start, end;
struct pci_epf *epf = epf_test->epf;
- struct device *dev = &epf->dev;
struct pci_epc *epc = epf->epc;
+ struct device *dev = &epf->dev;
struct device *dma_dev = epf->epc->dev.parent;
+ u64 src_addr = reg->src_addr;
+ size_t src_size = reg->size;
+ ssize_t map_size = 0;
- src_addr = pci_epc_mem_alloc_addr(epc, &phys_addr, reg->size);
- if (!src_addr) {
- dev_err(dev, "Failed to allocate address\n");
- reg->status = STATUS_SRC_ADDR_INVALID;
+ src_buf = kzalloc(src_size, GFP_KERNEL);
+ if (!src_buf) {
ret = -ENOMEM;
- goto err;
+ goto set_status;
}
+ buf = src_buf;
- ret = pci_epc_map_addr(epc, epf->func_no, epf->vfunc_no, phys_addr,
- reg->src_addr, reg->size);
- if (ret) {
- dev_err(dev, "Failed to map address\n");
- reg->status = STATUS_SRC_ADDR_INVALID;
- goto err_addr;
- }
-
- buf = kzalloc(reg->size, GFP_KERNEL);
- if (!buf) {
- ret = -ENOMEM;
- goto err_map_addr;
- }
+ while (src_size) {
+ ret = pci_epc_mem_map(epc, epf->func_no, epf->vfunc_no,
+ src_addr, src_size, &map);
+ if (ret) {
+ dev_err(dev, "Failed to map address\n");
+ reg->status = STATUS_SRC_ADDR_INVALID;
+ goto free_buf;
+ }
- if (reg->flags & FLAG_USE_DMA) {
- dst_phys_addr = dma_map_single(dma_dev, buf, reg->size,
- DMA_FROM_DEVICE);
- if (dma_mapping_error(dma_dev, dst_phys_addr)) {
- dev_err(dev, "Failed to map destination buffer addr\n");
- ret = -ENOMEM;
- goto err_dma_map;
+ map_size = map.pci_size;
+ if (reg->flags & FLAG_USE_DMA) {
+ dst_phys_addr = dma_map_single(dma_dev, buf, map_size,
+ DMA_FROM_DEVICE);
+ if (dma_mapping_error(dma_dev, dst_phys_addr)) {
+ dev_err(dev,
+ "Failed to map destination buffer addr\n");
+ ret = -ENOMEM;
+ goto unmap;
+ }
+
+ ktime_get_ts64(&start);
+ ret = pci_epf_test_data_transfer(epf_test,
+ dst_phys_addr, map.phys_addr,
+ map_size, src_addr, DMA_DEV_TO_MEM);
+ if (ret)
+ dev_err(dev, "Data transfer failed\n");
+ ktime_get_ts64(&end);
+
+ dma_unmap_single(dma_dev, dst_phys_addr, map_size,
+ DMA_FROM_DEVICE);
+
+ if (ret)
+ goto unmap;
+ } else {
+ ktime_get_ts64(&start);
+ memcpy_fromio(buf, map.virt_addr, map_size);
+ ktime_get_ts64(&end);
}
- ktime_get_ts64(&start);
- ret = pci_epf_test_data_transfer(epf_test, dst_phys_addr,
- phys_addr, reg->size,
- reg->src_addr, DMA_DEV_TO_MEM);
- if (ret)
- dev_err(dev, "Data transfer failed\n");
- ktime_get_ts64(&end);
+ src_size -= map_size;
+ src_addr += map_size;
+ buf += map_size;
- dma_unmap_single(dma_dev, dst_phys_addr, reg->size,
- DMA_FROM_DEVICE);
- } else {
- ktime_get_ts64(&start);
- memcpy_fromio(buf, src_addr, reg->size);
- ktime_get_ts64(&end);
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &map);
+ map_size = 0;
}
- pci_epf_test_print_rate(epf_test, "READ", reg->size, &start, &end,
- reg->flags & FLAG_USE_DMA);
+ pci_epf_test_print_rate(epf_test, "READ", reg->size, &start,
+ &end, reg->flags & FLAG_USE_DMA);
- crc32 = crc32_le(~0, buf, reg->size);
+ crc32 = crc32_le(~0, src_buf, reg->size);
if (crc32 != reg->checksum)
ret = -EIO;
-err_dma_map:
- kfree(buf);
-
-err_map_addr:
- pci_epc_unmap_addr(epc, epf->func_no, epf->vfunc_no, phys_addr);
+unmap:
+ if (map_size)
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &map);
-err_addr:
- pci_epc_mem_free_addr(epc, phys_addr, src_addr, reg->size);
+free_buf:
+ kfree(src_buf);
-err:
+set_status:
if (!ret)
reg->status |= STATUS_READ_SUCCESS;
else
@@ -496,71 +504,79 @@ static void pci_epf_test_read(struct pci_epf_test *epf_test,
static void pci_epf_test_write(struct pci_epf_test *epf_test,
struct pci_epf_test_reg *reg)
{
- int ret;
- void __iomem *dst_addr;
- void *buf;
- phys_addr_t phys_addr;
+ int ret = 0;
+ void *dst_buf, *buf;
+ struct pci_epc_map map;
phys_addr_t src_phys_addr;
struct timespec64 start, end;
struct pci_epf *epf = epf_test->epf;
- struct device *dev = &epf->dev;
struct pci_epc *epc = epf->epc;
+ struct device *dev = &epf->dev;
struct device *dma_dev = epf->epc->dev.parent;
+ u64 dst_addr = reg->dst_addr;
+ size_t dst_size = reg->size;
+ ssize_t map_size = 0;
- dst_addr = pci_epc_mem_alloc_addr(epc, &phys_addr, reg->size);
- if (!dst_addr) {
- dev_err(dev, "Failed to allocate address\n");
- reg->status = STATUS_DST_ADDR_INVALID;
+ dst_buf = kzalloc(dst_size, GFP_KERNEL);
+ if (!dst_buf) {
ret = -ENOMEM;
- goto err;
+ goto set_status;
}
+ get_random_bytes(dst_buf, dst_size);
+ reg->checksum = crc32_le(~0, dst_buf, dst_size);
+ buf = dst_buf;
- ret = pci_epc_map_addr(epc, epf->func_no, epf->vfunc_no, phys_addr,
- reg->dst_addr, reg->size);
- if (ret) {
- dev_err(dev, "Failed to map address\n");
- reg->status = STATUS_DST_ADDR_INVALID;
- goto err_addr;
- }
-
- buf = kzalloc(reg->size, GFP_KERNEL);
- if (!buf) {
- ret = -ENOMEM;
- goto err_map_addr;
- }
-
- get_random_bytes(buf, reg->size);
- reg->checksum = crc32_le(~0, buf, reg->size);
-
- if (reg->flags & FLAG_USE_DMA) {
- src_phys_addr = dma_map_single(dma_dev, buf, reg->size,
- DMA_TO_DEVICE);
- if (dma_mapping_error(dma_dev, src_phys_addr)) {
- dev_err(dev, "Failed to map source buffer addr\n");
- ret = -ENOMEM;
- goto err_dma_map;
+ while (dst_size) {
+ ret = pci_epc_mem_map(epc, epf->func_no, epf->vfunc_no,
+ dst_addr, dst_size, &map);
+ if (ret) {
+ dev_err(dev, "Failed to map address\n");
+ reg->status = STATUS_DST_ADDR_INVALID;
+ goto free_buf;
}
- ktime_get_ts64(&start);
+ map_size = map.pci_size;
+ if (reg->flags & FLAG_USE_DMA) {
+ src_phys_addr = dma_map_single(dma_dev, buf, map_size,
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(dma_dev, src_phys_addr)) {
+ dev_err(dev,
+ "Failed to map source buffer addr\n");
+ ret = -ENOMEM;
+ goto unmap;
+ }
+
+ ktime_get_ts64(&start);
+
+ ret = pci_epf_test_data_transfer(epf_test,
+ map.phys_addr, src_phys_addr,
+ map_size, dst_addr,
+ DMA_MEM_TO_DEV);
+ if (ret)
+ dev_err(dev, "Data transfer failed\n");
+ ktime_get_ts64(&end);
+
+ dma_unmap_single(dma_dev, src_phys_addr, map_size,
+ DMA_TO_DEVICE);
+
+ if (ret)
+ goto unmap;
+ } else {
+ ktime_get_ts64(&start);
+ memcpy_toio(map.virt_addr, buf, map_size);
+ ktime_get_ts64(&end);
+ }
- ret = pci_epf_test_data_transfer(epf_test, phys_addr,
- src_phys_addr, reg->size,
- reg->dst_addr,
- DMA_MEM_TO_DEV);
- if (ret)
- dev_err(dev, "Data transfer failed\n");
- ktime_get_ts64(&end);
+ dst_size -= map_size;
+ dst_addr += map_size;
+ buf += map_size;
- dma_unmap_single(dma_dev, src_phys_addr, reg->size,
- DMA_TO_DEVICE);
- } else {
- ktime_get_ts64(&start);
- memcpy_toio(dst_addr, buf, reg->size);
- ktime_get_ts64(&end);
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &map);
+ map_size = 0;
}
- pci_epf_test_print_rate(epf_test, "WRITE", reg->size, &start, &end,
- reg->flags & FLAG_USE_DMA);
+ pci_epf_test_print_rate(epf_test, "WRITE", reg->size, &start,
+ &end, reg->flags & FLAG_USE_DMA);
/*
* wait 1ms inorder for the write to complete. Without this delay L3
@@ -568,16 +584,14 @@ static void pci_epf_test_write(struct pci_epf_test *epf_test,
*/
usleep_range(1000, 2000);
-err_dma_map:
- kfree(buf);
-
-err_map_addr:
- pci_epc_unmap_addr(epc, epf->func_no, epf->vfunc_no, phys_addr);
+unmap:
+ if (map_size)
+ pci_epc_mem_unmap(epc, epf->func_no, epf->vfunc_no, &map);
-err_addr:
- pci_epc_mem_free_addr(epc, phys_addr, dst_addr, reg->size);
+free_buf:
+ kfree(dst_buf);
-err:
+set_status:
if (!ret)
reg->status |= STATUS_WRITE_SUCCESS;
else
--
2.47.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v6 6/6] PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
` (4 preceding siblings ...)
2024-10-12 11:32 ` [PATCH v6 5/6] PCI: endpoint: test: Use pci_epc_mem_map/unmap() Damien Le Moal
@ 2024-10-12 11:32 ` Damien Le Moal
2024-10-12 11:53 ` Manivannan Sadhasivam
2024-10-12 11:57 ` [PATCH v6 0/6] Improve PCI memory mapping API Manivannan Sadhasivam
2024-10-21 22:19 ` Bjorn Helgaas
7 siblings, 1 reply; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 11:32 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci
Cc: Rick Wertenbroek, Niklas Cassel
The function dw_pcie_prog_outbound_atu() used to program outbound ATU
entries for mapping RC PCI addresses to local CPU addresses does not
allow PCI addresses that are not aligned to the value of region_align
of struct dw_pcie. This value is determined from the iATU hardware
registers during probing of the iATU (done by dw_pcie_iatu_detect()).
This value is thus valid for all DWC PCIe controllers, regardless of
the hardware configuration used when synthesizing the DWC PCIe
controller.
Implement the ->align_addr() endpoint controller operation to allow
this mapping alignment to be transparently handled by endpoint function
drivers through the function pci_epc_mem_map().
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
drivers/pci/controller/dwc/pcie-designware-ep.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 43ba5c6738df..ad602b213ec4 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -268,6 +268,20 @@ static int dw_pcie_find_index(struct dw_pcie_ep *ep, phys_addr_t addr,
return -EINVAL;
}
+static phys_addr_t dw_pcie_ep_align_addr(struct pci_epc *epc,
+ phys_addr_t pci_addr, size_t *pci_size, size_t *offset)
+{
+ struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+ struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+ phys_addr_t mask = pci->region_align - 1;
+ size_t ofst = pci_addr & mask;
+
+ *pci_size = ALIGN(ofst + *pci_size, ep->page_size);
+ *offset = ofst;
+
+ return pci_addr & ~mask;
+}
+
static void dw_pcie_ep_unmap_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
phys_addr_t addr)
{
@@ -444,6 +458,7 @@ static const struct pci_epc_ops epc_ops = {
.write_header = dw_pcie_ep_write_header,
.set_bar = dw_pcie_ep_set_bar,
.clear_bar = dw_pcie_ep_clear_bar,
+ .align_addr = dw_pcie_ep_align_addr,
.map_addr = dw_pcie_ep_map_addr,
.unmap_addr = dw_pcie_ep_unmap_addr,
.set_msi = dw_pcie_ep_set_msi,
--
2.47.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
2024-10-12 11:32 ` [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap() Damien Le Moal
@ 2024-10-12 11:47 ` Manivannan Sadhasivam
2024-10-13 9:06 ` Niklas Cassel
1 sibling, 0 replies; 29+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-12 11:47 UTC (permalink / raw)
To: Damien Le Moal
Cc: Krzysztof Wilczyński, Kishon Vijay Abraham I, Bjorn Helgaas,
Lorenzo Pieralisi, Rob Herring, Jonathan Corbet, Jingoo Han,
linux-pci, Rick Wertenbroek, Niklas Cassel
On Sat, Oct 12, 2024 at 08:32:43PM +0900, Damien Le Moal wrote:
> Some endpoint controllers have requirements on the alignment of the
> controller physical memory address that must be used to map a RC PCI
> address region. For instance, the endpoint controller of the RK3399 SoC
> uses at most the lower 20 bits of a physical memory address region as
> the lower bits of a RC PCI address region. For mapping a PCI address
> region of size bytes starting from pci_addr, the exact number of
> address bits used is the number of address bits changing in the address
> range [pci_addr..pci_addr + size - 1]. For this example, this creates
> the following constraints:
> 1) The offset into the controller physical memory allocated for a
> mapping depends on the mapping size *and* the starting PCI address
> for the mapping.
> 2) A mapping size cannot exceed the controller windows size (1MB) minus
> the offset needed into the allocated physical memory, which can end
> up being a smaller size than the desired mapping size.
>
> Handling these constraints independently of the controller being used
> in an endpoint function driver is not possible with the current EPC
> API as only the ->align field in struct pci_epc_features is provided
> but used for BAR (inbound ATU mappings) mapping only. A new API is
> needed for function drivers to discover mapping constraints and handle
> non-static requirements based on the RC PCI address range to access.
>
> Introduce the endpoint controller operation ->align_addr() to allow
> the EPC core functions to obtain the size and the offset into a
> controller address region that must be allocated and mapped to access
> a RC PCI address region. The size of the mapping provided by the
> align_addr() operation can then be used as the size argument for the
> function pci_epc_mem_alloc_addr() and the offset into the allocated
> controller memory provided can be used to correctly handle data
> transfers. For endpoint controllers that have PCI address alignment
> constraints, the align_addr() operation may indicate upon return an
> effective PCI address mapping size that is smaller (but not 0) than the
> requested PCI address region size.
>
> The controller ->align_addr() operation is optional: controllers that
> do not have any alignment constraints for mapping RC PCI address regions
> do not need to implement this operation. For such controllers, it is
> always assumed that the mapping size is equal to the requested size of
> the PCI region and that the mapping offset is 0.
>
> The function pci_epc_mem_map() is introduced to use this new controller
> operation (if it is defined) to handle controller memory allocation and
> mapping to a RC PCI address region in endpoint function drivers.
>
> This function first uses the ->align_addr() controller operation to
> determine the controller memory address size (and offset into) needed
> for mapping an RC PCI address region. The result of this operation is
> used to allocate a controller physical memory region using
> pci_epc_mem_alloc_addr() and then to map that memory to the RC PCI
> address space with pci_epc_map_addr().
>
> Since ->align_addr() may indicate that not all of a RC PCI address
> region can be mapped, pci_epc_mem_map() may only partially map the RC
> PCI address region specified. It is the responsibility of the caller
> (an endpoint function driver) to handle such smaller mappings by
> repeatedly using pci_epc_mem_map() over the desired PCI address range.
>
> The counterpart of pci_epc_mem_map() to unmap and free a mapped
> controller memory address region is pci_epc_mem_unmap().
>
> Both functions operate using the new struct pci_epc_map data structure.
> This new structure represents a mapping PCI address, mapping effective
> size, the size of the controller memory needed for the mapping as well
> as the physical and virtual CPU addresses of the mapping (phys_base and
> virt_base fields). For convenience, the physical and virtual CPU
> addresses within that mapping to use to access the target RC PCI address
> region are also provided (phys_addr and virt_addr fields).
>
> Endpoint function drivers can use struct pci_epc_map to access the
> mapped RC PCI address region using the ->virt_addr and ->pci_size
> fields.
>
> Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
- Mani
> ---
> drivers/pci/endpoint/pci-epc-core.c | 103 ++++++++++++++++++++++++++++
> include/linux/pci-epc.h | 38 ++++++++++
> 2 files changed, 141 insertions(+)
>
> diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> index b854f1bab26f..04a85d2f7e2a 100644
> --- a/drivers/pci/endpoint/pci-epc-core.c
> +++ b/drivers/pci/endpoint/pci-epc-core.c
> @@ -466,6 +466,109 @@ int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> }
> EXPORT_SYMBOL_GPL(pci_epc_map_addr);
>
> +/**
> + * pci_epc_mem_map() - allocate and map a PCI address to a CPU address
> + * @epc: the EPC device on which the CPU address is to be allocated and mapped
> + * @func_no: the physical endpoint function number in the EPC device
> + * @vfunc_no: the virtual endpoint function number in the physical function
> + * @pci_addr: PCI address to which the CPU address should be mapped
> + * @pci_size: the number of bytes to map starting from @pci_addr
> + * @map: where to return the mapping information
> + *
> + * Allocate a controller memory address region and map it to a RC PCI address
> + * region, taking into account the controller physical address mapping
> + * constraints using the controller operation align_addr(). If this operation is
> + * not defined, we assume that there are no alignment constraints for the
> + * mapping.
> + *
> + * The effective size of the PCI address range mapped from @pci_addr is
> + * indicated by @map->pci_size. This size may be less than the requested
> + * @pci_size. The local virtual CPU address for the mapping is indicated by
> + * @map->virt_addr (@map->phys_addr indicates the physical address).
> + * The size and CPU address of the controller memory allocated and mapped are
> + * respectively indicated by @map->map_size and @map->virt_base (and
> + * @map->phys_base for the physical address of @map->virt_base).
> + *
> + * Returns 0 on success and a negative error code in case of error.
> + */
> +int pci_epc_mem_map(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + u64 pci_addr, size_t pci_size, struct pci_epc_map *map)
> +{
> + size_t map_size = pci_size;
> + size_t map_offset = 0;
> + int ret;
> +
> + if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
> + return -EINVAL;
> +
> + if (!pci_size || !map)
> + return -EINVAL;
> +
> + /*
> + * Align the PCI address to map. If the controller defines the
> + * .align_addr() operation, use it to determine the PCI address to map
> + * and the size of the mapping. Otherwise, assume that the controller
> + * has no alignment constraint.
> + */
> + memset(map, 0, sizeof(*map));
> + map->pci_addr = pci_addr;
> + if (epc->ops->align_addr)
> + map->map_pci_addr =
> + epc->ops->align_addr(epc, pci_addr,
> + &map_size, &map_offset);
> + else
> + map->map_pci_addr = pci_addr;
> + map->map_size = map_size;
> + if (map->map_pci_addr + map->map_size < pci_addr + pci_size)
> + map->pci_size = map->map_pci_addr + map->map_size - pci_addr;
> + else
> + map->pci_size = pci_size;
> +
> + map->virt_base = pci_epc_mem_alloc_addr(epc, &map->phys_base,
> + map->map_size);
> + if (!map->virt_base)
> + return -ENOMEM;
> +
> + map->phys_addr = map->phys_base + map_offset;
> + map->virt_addr = map->virt_base + map_offset;
> +
> + ret = pci_epc_map_addr(epc, func_no, vfunc_no, map->phys_base,
> + map->map_pci_addr, map->map_size);
> + if (ret) {
> + pci_epc_mem_free_addr(epc, map->phys_base, map->virt_base,
> + map->map_size);
> + return ret;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_epc_mem_map);
> +
> +/**
> + * pci_epc_mem_unmap() - unmap and free a CPU address region
> + * @epc: the EPC device on which the CPU address is allocated and mapped
> + * @func_no: the physical endpoint function number in the EPC device
> + * @vfunc_no: the virtual endpoint function number in the physical function
> + * @map: the mapping information
> + *
> + * Unmap and free a CPU address region that was allocated and mapped with
> + * pci_epc_mem_map().
> + */
> +void pci_epc_mem_unmap(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + struct pci_epc_map *map)
> +{
> + if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
> + return;
> +
> + if (!map || !map->virt_base)
> + return;
> +
> + pci_epc_unmap_addr(epc, func_no, vfunc_no, map->phys_base);
> + pci_epc_mem_free_addr(epc, map->phys_base, map->virt_base,
> + map->map_size);
> +}
> +EXPORT_SYMBOL_GPL(pci_epc_mem_unmap);
> +
> /**
> * pci_epc_clear_bar() - reset the BAR
> * @epc: the EPC device for which the BAR has to be cleared
> diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
> index 42ef06136bd1..f4b8dc37e447 100644
> --- a/include/linux/pci-epc.h
> +++ b/include/linux/pci-epc.h
> @@ -32,11 +32,43 @@ pci_epc_interface_string(enum pci_epc_interface_type type)
> }
> }
>
> +/**
> + * struct pci_epc_map - information about EPC memory for mapping a RC PCI
> + * address range
> + * @pci_addr: start address of the RC PCI address range to map
> + * @pci_size: size of the RC PCI address range mapped from @pci_addr
> + * @map_pci_addr: RC PCI address used as the first address mapped (may be lower
> + * than @pci_addr)
> + * @map_size: size of the controller memory needed for mapping the RC PCI address
> + * range @pci_addr..@pci_addr+@pci_size
> + * @phys_base: base physical address of the allocated EPC memory for mapping the
> + * RC PCI address range
> + * @phys_addr: physical address at which @pci_addr is mapped
> + * @virt_base: base virtual address of the allocated EPC memory for mapping the
> + * RC PCI address range
> + * @virt_addr: virtual address at which @pci_addr is mapped
> + */
> +struct pci_epc_map {
> + phys_addr_t pci_addr;
> + size_t pci_size;
> +
> + phys_addr_t map_pci_addr;
> + size_t map_size;
> +
> + phys_addr_t phys_base;
> + phys_addr_t phys_addr;
> + void __iomem *virt_base;
> + void __iomem *virt_addr;
> +};
> +
> /**
> * struct pci_epc_ops - set of function pointers for performing EPC operations
> * @write_header: ops to populate configuration space header
> * @set_bar: ops to configure the BAR
> * @clear_bar: ops to reset the BAR
> + * @align_addr: operation to get the mapping address, mapping size and offset
> + * into a controller memory window needed to map an RC PCI address
> + * region
> * @map_addr: ops to map CPU address to PCI address
> * @unmap_addr: ops to unmap CPU address and PCI address
> * @set_msi: ops to set the requested number of MSI interrupts in the MSI
> @@ -61,6 +93,8 @@ struct pci_epc_ops {
> struct pci_epf_bar *epf_bar);
> void (*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> struct pci_epf_bar *epf_bar);
> + phys_addr_t (*align_addr)(struct pci_epc *epc, phys_addr_t pci_addr,
> + size_t *size, size_t *offset);
> int (*map_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> phys_addr_t addr, u64 pci_addr, size_t size);
> void (*unmap_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> @@ -278,6 +312,10 @@ void __iomem *pci_epc_mem_alloc_addr(struct pci_epc *epc,
> phys_addr_t *phys_addr, size_t size);
> void pci_epc_mem_free_addr(struct pci_epc *epc, phys_addr_t phys_addr,
> void __iomem *virt_addr, size_t size);
> +int pci_epc_mem_map(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + u64 pci_addr, size_t pci_size, struct pci_epc_map *map);
> +void pci_epc_mem_unmap(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + struct pci_epc_map *map);
>
> #else
> static inline void pci_epc_init_notify(struct pci_epc *epc)
> --
> 2.47.0
>
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 4/6] PCI: endpoint: Update documentation
2024-10-12 11:32 ` [PATCH v6 4/6] PCI: endpoint: Update documentation Damien Le Moal
@ 2024-10-12 11:48 ` Manivannan Sadhasivam
0 siblings, 0 replies; 29+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-12 11:48 UTC (permalink / raw)
To: Damien Le Moal
Cc: Krzysztof Wilczyński, Kishon Vijay Abraham I, Bjorn Helgaas,
Lorenzo Pieralisi, Rob Herring, Jonathan Corbet, Jingoo Han,
linux-pci, Rick Wertenbroek, Niklas Cassel
On Sat, Oct 12, 2024 at 08:32:44PM +0900, Damien Le Moal wrote:
> Document the new functions pci_epc_mem_map() and pci_epc_mem_unmap().
> Also add the documentation for the functions pci_epc_map_addr() and
> pci_epc_unmap_addr() that were missing.
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
- Mani
> Reviewed-by: Niklas Cassel <cassel@kernel.org>
> ---
> Documentation/PCI/endpoint/pci-endpoint.rst | 29 +++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/Documentation/PCI/endpoint/pci-endpoint.rst b/Documentation/PCI/endpoint/pci-endpoint.rst
> index 21507e3cc238..35f82f2d45f5 100644
> --- a/Documentation/PCI/endpoint/pci-endpoint.rst
> +++ b/Documentation/PCI/endpoint/pci-endpoint.rst
> @@ -117,6 +117,35 @@ by the PCI endpoint function driver.
> The PCI endpoint function driver should use pci_epc_mem_free_addr() to
> free the memory space allocated using pci_epc_mem_alloc_addr().
>
> +* pci_epc_map_addr()
> +
> + A PCI endpoint function driver should use pci_epc_map_addr() to map to a RC
> + PCI address the CPU address of local memory obtained with
> + pci_epc_mem_alloc_addr().
> +
> +* pci_epc_unmap_addr()
> +
> + A PCI endpoint function driver should use pci_epc_unmap_addr() to unmap the
> + CPU address of local memory mapped to a RC address with pci_epc_map_addr().
> +
> +* pci_epc_mem_map()
> +
> + A PCI endpoint controller may impose constraints on the RC PCI addresses that
> + can be mapped. The function pci_epc_mem_map() allows endpoint function
> + drivers to allocate and map controller memory while handling such
> + constraints. This function will determine the size of the memory that must be
> + allocated with pci_epc_mem_alloc_addr() for successfully mapping a RC PCI
> + address range. This function will also indicate the size of the PCI address
> + range that was actually mapped, which can be less than the requested size, as
> + well as the offset into the allocated memory to use for accessing the mapped
> + RC PCI address range.
> +
> +* pci_epc_mem_unmap()
> +
> + A PCI endpoint function driver can use pci_epc_mem_unmap() to unmap and free
> + controller memory that was allocated and mapped using pci_epc_mem_map().
> +
> +
> Other EPC APIs
> ~~~~~~~~~~~~~~
>
> --
> 2.47.0
>
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 6/6] PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation
2024-10-12 11:32 ` [PATCH v6 6/6] PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation Damien Le Moal
@ 2024-10-12 11:53 ` Manivannan Sadhasivam
0 siblings, 0 replies; 29+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-12 11:53 UTC (permalink / raw)
To: Damien Le Moal
Cc: Krzysztof Wilczyński, Kishon Vijay Abraham I, Bjorn Helgaas,
Lorenzo Pieralisi, Rob Herring, Jonathan Corbet, Jingoo Han,
linux-pci, Rick Wertenbroek, Niklas Cassel
On Sat, Oct 12, 2024 at 08:32:46PM +0900, Damien Le Moal wrote:
> The function dw_pcie_prog_outbound_atu() used to program outbound ATU
> entries for mapping RC PCI addresses to local CPU addresses does not
> allow PCI addresses that are not aligned to the value of region_align
> of struct dw_pcie. This value is determined from the iATU hardware
> registers during probing of the iATU (done by dw_pcie_iatu_detect()).
> This value is thus valid for all DWC PCIe controllers, and valid
> regardless of the hardware configuration used when synthesizing the
> DWC PCIe controller.
>
> Implement the ->align_addr() endpoint controller operation to allow
> this mapping alignment to be transparently handled by endpoint function
> drivers through the function pci_epc_mem_map().
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
- Mani
> ---
> drivers/pci/controller/dwc/pcie-designware-ep.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 43ba5c6738df..ad602b213ec4 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -268,6 +268,20 @@ static int dw_pcie_find_index(struct dw_pcie_ep *ep, phys_addr_t addr,
> return -EINVAL;
> }
>
> +static phys_addr_t dw_pcie_ep_align_addr(struct pci_epc *epc,
> + phys_addr_t pci_addr, size_t *pci_size, size_t *offset)
> +{
> + struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> + struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> + phys_addr_t mask = pci->region_align - 1;
> + size_t ofst = pci_addr & mask;
> +
> + *pci_size = ALIGN(ofst + *pci_size, ep->page_size);
> + *offset = ofst;
> +
> + return pci_addr & ~mask;
> +}
> +
> static void dw_pcie_ep_unmap_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> phys_addr_t addr)
> {
> @@ -444,6 +458,7 @@ static const struct pci_epc_ops epc_ops = {
> .write_header = dw_pcie_ep_write_header,
> .set_bar = dw_pcie_ep_set_bar,
> .clear_bar = dw_pcie_ep_clear_bar,
> + .align_addr = dw_pcie_ep_align_addr,
> .map_addr = dw_pcie_ep_map_addr,
> .unmap_addr = dw_pcie_ep_unmap_addr,
> .set_msi = dw_pcie_ep_set_msi,
> --
> 2.47.0
>
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
` (5 preceding siblings ...)
2024-10-12 11:32 ` [PATCH v6 6/6] PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation Damien Le Moal
@ 2024-10-12 11:57 ` Manivannan Sadhasivam
2024-10-12 12:03 ` Damien Le Moal
2024-10-21 22:19 ` Bjorn Helgaas
7 siblings, 1 reply; 29+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-12 11:57 UTC (permalink / raw)
To: Damien Le Moal
Cc: Krzysztof Wilczyński, Kishon Vijay Abraham I, Bjorn Helgaas,
Lorenzo Pieralisi, Rob Herring, Jonathan Corbet, Jingoo Han,
linux-pci, Rick Wertenbroek, Niklas Cassel
On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
> This series introduces the new functions pci_epc_mem_map() and
> pci_epc_mem_unmap() to improve handling of the PCI address mapping
> alignment constraints of endpoint controllers in a controller
> independent manner.
>
> The issue fixed is that the fixed alignment defined by the "align" field
> of struct pci_epc_features is defined for inbound ATU entries (e.g.
> BARs) and is a fixed value, whereas some controllers need a PCI address
> alignment that depends on the PCI address itself. For instance, the
> rk3399 SoC controller in endpoint mode uses the lower bits of the local
> endpoint memory address as the lower bits for the PCI addresses for data
> transfers. That is, when mapping local memory, one must take into
> account the number of bits of the RC PCI address that change from the
> start address of the mapping.
>
> To fix this, the new endpoint controller method .align_addr is
> introduced and called from the new pci_epc_mem_map() function. This
> method is optional and for controllers that do not define it, it is
> assumed that the controller has no PCI address constraint.
>
> The function pci_epc_mem_map() is a helper which obtains
> the mapping information, allocates endpoint controller memory according
> to the mapping size obtained and maps the memory. pci_epc_mem_unmap()
> unmaps and frees the endpoint memory.
>
> This series is organized as follows:
> - Patch 1 introduces a small helper to clean up the epc core code
> - Patch 2 improves pci_epc_mem_alloc_addr()
> - Patch 3 introduces the new align_addr() endpoint controller method
> and the epc functions pci_epc_mem_map() and pci_epc_mem_unmap().
> - Patch 4 documents these new functions.
> - Patch 5 modifies the test endpoint function driver to use
> pci_epc_mem_map() and pci_epc_mem_unmap() to illustrate the use of
> these functions.
> - Finally, patch 6 implements the RK3588 endpoint controller driver
> .align_addr() operation to satisfy that controller PCI address
> alignment constraint.
>
> This patch series was tested using the pci endpoint test driver (as-is
> and a modified version removing memory allocation alignment on the host
> side) as well as with a prototype NVMe endpoint function driver (where
> data transfers use addresses that are never aligned to any specific
> boundary).
>
Applied to pci/endpoint!
- Mani
> Changes from v5:
> - Changed patch 3 to rename the new controller operation to align_addr
> and change its interface. Patch 6 is changed accordingly.
>
> Changes from v4:
> - Removed the patch adding the pci_epc_map_align() function. The former
> .map_align controller operation is now introduced in patch 3 as
> "get_mem_map()" and used directly in the new pci_epf_mem_map()
> function.
> - Modified the documentation patch 4 to reflect the previous change.
> - Changed patch 6 title and modified it to rename map_align to
> get_mem_map in accordance with the new patch 3.
> - Added review tags
>
> Changes from v3:
> - Addressed Niklas comments (improved patch 2 commit message, added
> comments in the pci_epc_map_align() function in patch 3, typos and
> improvements in patch 5, patch 7 commit message).
> - Added review tags
>
> Changes from v2:
> - Dropped all patches for the rockchip-ep. These patches will be sent
> later as a separate series.
> - Added patch 2 and 5
> - Added review tags to patch 1
>
> Changes from v1:
> - Changed pci_epc_check_func() to pci_epc_function_is_valid() in patch
> 1.
> - Removed patch "PCI: endpoint: Improve pci_epc_mem_alloc_addr()"
> (former patch 2 of v1)
> - Various typos cleanups all over. Also fixed some blank space
> indentation.
> - Added review tags
>
> Damien Le Moal (6):
> PCI: endpoint: Introduce pci_epc_function_is_valid()
> PCI: endpoint: Improve pci_epc_mem_alloc_addr()
> PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
> PCI: endpoint: Update documentation
> PCI: endpoint: test: Use pci_epc_mem_map/unmap()
> PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation
>
> Documentation/PCI/endpoint/pci-endpoint.rst | 29 ++
> .../pci/controller/dwc/pcie-designware-ep.c | 15 +
> drivers/pci/endpoint/functions/pci-epf-test.c | 372 +++++++++---------
> drivers/pci/endpoint/pci-epc-core.c | 182 ++++++---
> drivers/pci/endpoint/pci-epc-mem.c | 9 +-
> include/linux/pci-epc.h | 38 ++
> 6 files changed, 415 insertions(+), 230 deletions(-)
>
> --
> 2.47.0
>
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-12 11:57 ` [PATCH v6 0/6] Improve PCI memory mapping API Manivannan Sadhasivam
@ 2024-10-12 12:03 ` Damien Le Moal
0 siblings, 0 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-12 12:03 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Krzysztof Wilczyński, Kishon Vijay Abraham I, Bjorn Helgaas,
Lorenzo Pieralisi, Rob Herring, Jonathan Corbet, Jingoo Han,
linux-pci, Rick Wertenbroek, Niklas Cassel
On 10/12/24 20:57, Manivannan Sadhasivam wrote:
> On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>> This series introduces the new functions pci_epc_mem_map() and
>> pci_epc_mem_unmap() to improve handling of the PCI address mapping
>> alignment constraints of endpoint controllers in a controller
>> independent manner.
>>
>> The issue fixed is that the fixed alignment defined by the "align" field
>> of struct pci_epc_features is defined for inbound ATU entries (e.g.
>> BARs) and is a fixed value, whereas some controllers need a PCI address
>> alignment that depends on the PCI address itself. For instance, the
>> rk3399 SoC controller in endpoint mode uses the lower bits of the local
>> endpoint memory address as the lower bits for the PCI addresses for data
>> transfers. That is, when mapping local memory, one must take into
>> account the number of bits of the RC PCI address that change from the
>> start address of the mapping.
>>
>> To fix this, the new endpoint controller method .align_addr is
>> introduced and called from the new pci_epc_mem_map() function. This
>> method is optional and for controllers that do not define it, it is
>> assumed that the controller has no PCI address constraint.
>>
>> The function pci_epc_mem_map() is a helper which obtains
>> the mapping information, allocates endpoint controller memory according
>> to the mapping size obtained and maps the memory. pci_epc_mem_unmap()
>> unmaps and frees the endpoint memory.
>>
>> This series is organized as follows:
>> - Patch 1 introduces a small helper to clean up the epc core code
>> - Patch 2 improves pci_epc_mem_alloc_addr()
>> - Patch 3 introduces the new align_addr() endpoint controller method
>> and the epc functions pci_epc_mem_map() and pci_epc_mem_unmap().
>> - Patch 4 documents these new functions.
>> - Patch 5 modifies the test endpoint function driver to use
>> pci_epc_mem_map() and pci_epc_mem_unmap() to illustrate the use of
>> these functions.
>> - Finally, patch 6 implements the RK3588 endpoint controller driver
>> .align_addr() operation to satisfy that controller PCI address
>> alignment constraint.
>>
>> This patch series was tested using the pci endpoint test driver (as-is
>> and a modified version removing memory allocation alignment on the host
>> side) as well as with a prototype NVMe endpoint function driver (where
>> data transfers use addresses that are never aligned to any specific
>> boundary).
>>
>
> Applied to pci/endpoint!
Awesome ! Thanks a lot !
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
2024-10-12 11:32 ` [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap() Damien Le Moal
2024-10-12 11:47 ` Manivannan Sadhasivam
@ 2024-10-13 9:06 ` Niklas Cassel
2024-10-14 13:09 ` Damien Le Moal
1 sibling, 1 reply; 29+ messages in thread
From: Niklas Cassel @ 2024-10-13 9:06 UTC (permalink / raw)
To: Damien Le Moal
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On Sat, Oct 12, 2024 at 08:32:43PM +0900, Damien Le Moal wrote:
> Some endpoint controllers have requirements on the alignment of the
> controller physical memory address that must be used to map a RC PCI
> address region. For instance, the endpoint controller of the RK3399 SoC
> uses at most the lower 20 bits of a physical memory address region as
> the lower bits of a RC PCI address region. For mapping a PCI address
> region of size bytes starting from pci_addr, the exact number of
> address bits used is the number of address bits changing in the address
> range [pci_addr..pci_addr + size - 1]. For this example, this creates
> the following constraints:
> 1) The offset into the controller physical memory allocated for a
> mapping depends on the mapping size *and* the starting PCI address
> for the mapping.
> 2) A mapping size cannot exceed the controller window size (1 MB) minus
> the offset needed into the allocated physical memory, which can end
> up being a smaller size than the desired mapping size.
>
> Handling these constraints independently of the controller being used
> in an endpoint function driver is not possible with the current EPC
> API as only the ->align field in struct pci_epc_features is provided
> but used for BAR (inbound ATU mappings) mapping only. A new API is
> needed for function drivers to discover mapping constraints and handle
> non-static requirements based on the RC PCI address range to access.
>
> Introduce the endpoint controller operation ->align_addr() to allow
> the EPC core functions to obtain the size and the offset into a
> controller address region that must be allocated and mapped to access
> a RC PCI address region. The size of the mapping provided by the
> align_addr() operation can then be used as the size argument for the
> function pci_epc_mem_alloc_addr() and the offset into the allocated
> controller memory provided can be used to correctly handle data
> transfers. For endpoint controllers that have PCI address alignment
> constraints, the align_addr() operation may indicate upon return an
> effective PCI address mapping size that is smaller (but not 0) than the
> requested PCI address region size.
>
> The controller ->align_addr() operation is optional: controllers that
> do not have any alignment constraints for mapping RC PCI address regions
> do not need to implement this operation. For such controllers, it is
> always assumed that the mapping size is equal to the requested size of
> the PCI region and that the mapping offset is 0.
>
> The function pci_epc_mem_map() is introduced to use this new controller
> operation (if it is defined) to handle controller memory allocation and
> mapping to a RC PCI address region in endpoint function drivers.
>
> This function first uses the ->align_addr() controller operation to
> determine the controller memory address size (and offset into) needed
> for mapping an RC PCI address region. The result of this operation is
> used to allocate a controller physical memory region using
> pci_epc_mem_alloc_addr() and then to map that memory to the RC PCI
> address space with pci_epc_map_addr().
>
> Since ->align_addr() may indicate that not all of a RC PCI address
> region can be mapped, pci_epc_mem_map() may only partially map the RC
> PCI address region specified. It is the responsibility of the caller
> (an endpoint function driver) to handle such smaller mapping by
> repeatedly using pci_epc_mem_map() over the desired PCI address range.
>
> The counterpart of pci_epc_mem_map() to unmap and free a mapped
> controller memory address region is pci_epc_mem_unmap().
>
> Both functions operate using the new struct pci_epc_map data structure.
> This new structure represents a mapping PCI address, mapping effective
> size, the size of the controller memory needed for the mapping as well
> as the physical and virtual CPU addresses of the mapping (phys_base and
> virt_base fields). For convenience, the physical and virtual CPU
> addresses within that mapping to use to access the target RC PCI address
> region are also provided (phys_addr and virt_addr fields).
>
> Endpoint function drivers can use struct pci_epc_map to access the
> mapped RC PCI address region using the ->virt_addr and ->pci_size
> fields.
>
> Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
> drivers/pci/endpoint/pci-epc-core.c | 103 ++++++++++++++++++++++++++++
> include/linux/pci-epc.h | 38 ++++++++++
> 2 files changed, 141 insertions(+)
>
> diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> index b854f1bab26f..04a85d2f7e2a 100644
> --- a/drivers/pci/endpoint/pci-epc-core.c
> +++ b/drivers/pci/endpoint/pci-epc-core.c
> @@ -466,6 +466,109 @@ int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> }
> EXPORT_SYMBOL_GPL(pci_epc_map_addr);
>
> +/**
> + * pci_epc_mem_map() - allocate and map a PCI address to a CPU address
> + * @epc: the EPC device on which the CPU address is to be allocated and mapped
> + * @func_no: the physical endpoint function number in the EPC device
> + * @vfunc_no: the virtual endpoint function number in the physical function
> + * @pci_addr: PCI address to which the CPU address should be mapped
> + * @pci_size: the number of bytes to map starting from @pci_addr
> + * @map: where to return the mapping information
> + *
> + * Allocate a controller memory address region and map it to a RC PCI address
> + * region, taking into account the controller physical address mapping
> + * constraints using the controller operation align_addr(). If this operation is
> + * not defined, we assume that there are no alignment constraints for the
> + * mapping.
> + *
> + * The effective size of the PCI address range mapped from @pci_addr is
> + * indicated by @map->pci_size. This size may be less than the requested
> + * @pci_size. The local virtual CPU address for the mapping is indicated by
> + * @map->virt_addr (@map->phys_addr indicates the physical address).
> + * The size and CPU address of the controller memory allocated and mapped are
> + * respectively indicated by @map->map_size and @map->virt_base (and
> + * @map->phys_base for the physical address of @map->virt_base).
> + *
> + * Returns 0 on success and a negative error code in case of error.
> + */
> +int pci_epc_mem_map(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + u64 pci_addr, size_t pci_size, struct pci_epc_map *map)
> +{
> + size_t map_size = pci_size;
> + size_t map_offset = 0;
> + int ret;
> +
> + if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
> + return -EINVAL;
> +
> + if (!pci_size || !map)
> + return -EINVAL;
> +
> + /*
> + * Align the PCI address to map. If the controller defines the
> + * .align_addr() operation, use it to determine the PCI address to map
> + * and the size of the mapping. Otherwise, assume that the controller
> + * has no alignment constraint.
> + */
> + memset(map, 0, sizeof(*map));
> + map->pci_addr = pci_addr;
> + if (epc->ops->align_addr)
> + map->map_pci_addr =
> + epc->ops->align_addr(epc, pci_addr,
> + &map_size, &map_offset);
> + else
> + map->map_pci_addr = pci_addr;
> + map->map_size = map_size;
> + if (map->map_pci_addr + map->map_size < pci_addr + pci_size)
> + map->pci_size = map->map_pci_addr + map->map_size - pci_addr;
> + else
> + map->pci_size = pci_size;
> +
> + map->virt_base = pci_epc_mem_alloc_addr(epc, &map->phys_base,
> + map->map_size);
> + if (!map->virt_base)
> + return -ENOMEM;
> +
> + map->phys_addr = map->phys_base + map_offset;
> + map->virt_addr = map->virt_base + map_offset;
> +
> + ret = pci_epc_map_addr(epc, func_no, vfunc_no, map->phys_base,
> + map->map_pci_addr, map->map_size);
> + if (ret) {
> + pci_epc_mem_free_addr(epc, map->phys_base, map->virt_base,
> + map->map_size);
> + return ret;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_epc_mem_map);
> +
> +/**
> + * pci_epc_mem_unmap() - unmap and free a CPU address region
> + * @epc: the EPC device on which the CPU address is allocated and mapped
> + * @func_no: the physical endpoint function number in the EPC device
> + * @vfunc_no: the virtual endpoint function number in the physical function
> + * @map: the mapping information
> + *
> + * Unmap and free a CPU address region that was allocated and mapped with
> + * pci_epc_mem_map().
> + */
> +void pci_epc_mem_unmap(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + struct pci_epc_map *map)
> +{
> + if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
> + return;
> +
> + if (!map || !map->virt_base)
> + return;
> +
> + pci_epc_unmap_addr(epc, func_no, vfunc_no, map->phys_base);
> + pci_epc_mem_free_addr(epc, map->phys_base, map->virt_base,
> + map->map_size);
> +}
> +EXPORT_SYMBOL_GPL(pci_epc_mem_unmap);
> +
> /**
> * pci_epc_clear_bar() - reset the BAR
> * @epc: the EPC device for which the BAR has to be cleared
> diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
> index 42ef06136bd1..f4b8dc37e447 100644
> --- a/include/linux/pci-epc.h
> +++ b/include/linux/pci-epc.h
> @@ -32,11 +32,43 @@ pci_epc_interface_string(enum pci_epc_interface_type type)
> }
> }
>
> +/**
> + * struct pci_epc_map - information about EPC memory for mapping a RC PCI
> + * address range
> + * @pci_addr: start address of the RC PCI address range to map
> + * @pci_size: size of the RC PCI address range mapped from @pci_addr
> + * @map_pci_addr: RC PCI address used as the first address mapped (may be lower
> + * than @pci_addr)
> + * @map_size: size of the controller memory needed for mapping the RC PCI address
> + * range @pci_addr..@pci_addr+@pci_size
> + * @phys_base: base physical address of the allocated EPC memory for mapping the
> + * RC PCI address range
> + * @phys_addr: physical address at which @pci_addr is mapped
> + * @virt_base: base virtual address of the allocated EPC memory for mapping the
> + * RC PCI address range
> + * @virt_addr: virtual address at which @pci_addr is mapped
> + */
> +struct pci_epc_map {
> + phys_addr_t pci_addr;
I think that it is really confusing that this is type phys_addr_t.
I would much prefer:
-dma_addr_t
or, less commonly used (but still much better than phys_addr_t):
-pci_bus_addr_t
-u64
To make it more clear that this is NOT a physical address.
In drivers/pci, we usually only use phys_addr_t for the "CPU address".
> + size_t pci_size;
> +
> + phys_addr_t map_pci_addr;
> + size_t map_size;
> +
> + phys_addr_t phys_base;
> + phys_addr_t phys_addr;
> + void __iomem *virt_base;
> + void __iomem *virt_addr;
> +};
> +
> /**
> * struct pci_epc_ops - set of function pointers for performing EPC operations
> * @write_header: ops to populate configuration space header
> * @set_bar: ops to configure the BAR
> * @clear_bar: ops to reset the BAR
> + * @align_addr: operation to get the mapping address, mapping size and offset
> + * into a controller memory window needed to map an RC PCI address
> + * region
I think this text should be more clear that it is about the PCI address.
Perhaps:
Operation to get the PCI address to map and the size of the mapping,
in order to satisfy address translation requirements of the controller.
> * @map_addr: ops to map CPU address to PCI address
> * @unmap_addr: ops to unmap CPU address and PCI address
> * @set_msi: ops to set the requested number of MSI interrupts in the MSI
> @@ -61,6 +93,8 @@ struct pci_epc_ops {
> struct pci_epf_bar *epf_bar);
> void (*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> struct pci_epf_bar *epf_bar);
> + phys_addr_t (*align_addr)(struct pci_epc *epc, phys_addr_t pci_addr,
> + size_t *size, size_t *offset);
This function returns an aligned PCI address.
Making it return a phys_addr_t for someone used to reading code in
drivers/pci is very confusing, as you automatically assume that this is
then the "CPU address" (which is not the case here).
Please change the return type (basically the same as my first comment in
this reply) in order to make the API more clear.
> int (*map_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> phys_addr_t addr, u64 pci_addr, size_t size);
> void (*unmap_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> @@ -278,6 +312,10 @@ void __iomem *pci_epc_mem_alloc_addr(struct pci_epc *epc,
> phys_addr_t *phys_addr, size_t size);
> void pci_epc_mem_free_addr(struct pci_epc *epc, phys_addr_t phys_addr,
> void __iomem *virt_addr, size_t size);
> +int pci_epc_mem_map(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + u64 pci_addr, size_t pci_size, struct pci_epc_map *map);
> +void pci_epc_mem_unmap(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + struct pci_epc_map *map);
>
> #else
> static inline void pci_epc_init_notify(struct pci_epc *epc)
> --
> 2.47.0
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
2024-10-13 9:06 ` Niklas Cassel
@ 2024-10-14 13:09 ` Damien Le Moal
2024-10-15 6:01 ` Manivannan Sadhasivam
0 siblings, 1 reply; 29+ messages in thread
From: Damien Le Moal @ 2024-10-14 13:09 UTC (permalink / raw)
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On 10/13/24 18:06, Niklas Cassel wrote:
>> * @map_addr: ops to map CPU address to PCI address
>> * @unmap_addr: ops to unmap CPU address and PCI address
>> * @set_msi: ops to set the requested number of MSI interrupts in the MSI
>> @@ -61,6 +93,8 @@ struct pci_epc_ops {
>> struct pci_epf_bar *epf_bar);
>> void (*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>> struct pci_epf_bar *epf_bar);
>> + phys_addr_t (*align_addr)(struct pci_epc *epc, phys_addr_t pci_addr,
>> + size_t *size, size_t *offset);
>
> This functions returns an aligned PCI address.
> Making it return a phys_addr_t for someone used to reading code in
> drivers/pci is very confusing, as you automatically assume that this is
> then the "CPU address" (which is not the case here).
>
> Please change the return type (basically the same as my first comment in
> this reply) in order to make the API more clear.
Sure, I can send an incremental patch to change this to use u64 like the other
operations (e.g. map_addr) for the PCI address.
Mani,
Are you OK with that ?
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
2024-10-14 13:09 ` Damien Le Moal
@ 2024-10-15 6:01 ` Manivannan Sadhasivam
0 siblings, 0 replies; 29+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-15 6:01 UTC (permalink / raw)
To: Damien Le Moal
Cc: Niklas Cassel, Krzysztof Wilczyński, Kishon Vijay Abraham I,
Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Jonathan Corbet,
Jingoo Han, linux-pci, Rick Wertenbroek
On Mon, Oct 14, 2024 at 10:09:23PM +0900, Damien Le Moal wrote:
> On 10/13/24 18:06, Niklas Cassel wrote:
> >> * @map_addr: ops to map CPU address to PCI address
> >> * @unmap_addr: ops to unmap CPU address and PCI address
> >> * @set_msi: ops to set the requested number of MSI interrupts in the MSI
> >> @@ -61,6 +93,8 @@ struct pci_epc_ops {
> >> struct pci_epf_bar *epf_bar);
> >> void (*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> >> struct pci_epf_bar *epf_bar);
> >> + phys_addr_t (*align_addr)(struct pci_epc *epc, phys_addr_t pci_addr,
> >> + size_t *size, size_t *offset);
> >
> > This functions returns an aligned PCI address.
> > Making it return a phys_addr_t for someone used to reading code in
> > drivers/pci is very confusing, as you automatically assume that this is
> > then the "CPU address" (which is not the case here).
> >
> > Please change the return type (basically the same as my first comment in
> > this reply) in order to make the API more clear.
>
> Sure, I can send an incremental patch to change this to use u64 like the other
> operations (e.g. map_addr) for the PCI address.
>
> Mani,
>
> Are you OK with that ?
>
Sounds good to me.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
` (6 preceding siblings ...)
2024-10-12 11:57 ` [PATCH v6 0/6] Improve PCI memory mapping API Manivannan Sadhasivam
@ 2024-10-21 22:19 ` Bjorn Helgaas
2024-10-22 1:51 ` Damien Le Moal
7 siblings, 1 reply; 29+ messages in thread
From: Bjorn Helgaas @ 2024-10-21 22:19 UTC (permalink / raw)
To: Damien Le Moal
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek, Niklas Cassel
On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
> This series introduces the new functions pci_epc_mem_map() and
> pci_epc_mem_unmap() to improve handling of the PCI address mapping
> alignment constraints of endpoint controllers in a controller
> independent manner.
Hi Damien, I know this is obvious to everybody except me, but who uses
this mapping? A driver running on the endpoint that does MMIO? DMA
(Memory Reads or Writes that target an endpoint BAR)? I'm still
trying to wrap my head around the whole endpoint driver model.
Bjorn
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-21 22:19 ` Bjorn Helgaas
@ 2024-10-22 1:51 ` Damien Le Moal
2024-10-22 8:38 ` Niklas Cassel
2024-10-22 20:47 ` Bjorn Helgaas
0 siblings, 2 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-22 1:51 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek, Niklas Cassel
On 10/22/24 07:19, Bjorn Helgaas wrote:
> On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>> This series introduces the new functions pci_epc_mem_map() and
>> pci_epc_mem_unmap() to improve handling of the PCI address mapping
>> alignment constraints of endpoint controllers in a controller
>> independent manner.
>
> Hi Damien, I know this is obvious to everybody except me, but who uses
> this mapping? A driver running on the endpoint that does MMIO? DMA
> (Memory Reads or Writes that target an endpoint BAR)? I'm still
> trying to wrap my head around the whole endpoint driver model.
The mapping API is for mmio or DMA using endpoint controller memory mapped to a
host PCI address range. It is not for BARs. BAR setup does not use the same API
and has not changed with these patches.
BARs can still be accessed on the endpoint (within the EPF driver) with regular
READ_ONCE()/WRITE_ONCE() once they are set. But any data transfer between the
PCI RC host and the EPF driver on the EP host that uses mmio or the generic DMA
channel (memory copy offload channel) needs the new mapping API. DMA transfers
that can be done using dedicated DMA rx/tx channels associated with the endpoint
controller do not need to use this API, as the mapping to the host PCI address
space is automatically handled by the DMA driver.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 1:51 ` Damien Le Moal
@ 2024-10-22 8:38 ` Niklas Cassel
2024-10-22 11:57 ` Damien Le Moal
2024-10-22 13:56 ` Manivannan Sadhasivam
2024-10-22 20:47 ` Bjorn Helgaas
1 sibling, 2 replies; 29+ messages in thread
From: Niklas Cassel @ 2024-10-22 8:38 UTC (permalink / raw)
To: Damien Le Moal
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
> On 10/22/24 07:19, Bjorn Helgaas wrote:
> > On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>
> DMA transfers that can be done using dedicated DMA rx/tx channels associated
> with the endpoint controller do not need to use this API as the mapping to
> the host PCI address space is automatically handled by the DMA driver.
I disagree with this part. I think it depends on the DMA controller.
Looking at the manual for e.g. the embedded DMA controller on the DWC
controller (eDMA):
""
When you do not want the iATU to translate outbound requests that are generated by the
internal DMA module, then you must implement one of the following approaches:
- Ensure that the combination of DMA channel programming and iATU control register
programming, causes no translation of DMA traffic to be done in the iATU.
or
- Activate the DMA bypass mode to allow request TLPs which are initiated by the DMA
controller to pass through the iATU untranslated. You can activate DMA bypass mode by
setting the DMA Bypass field of the iATU Control 2 Register
(IATU_REGION_CTRL_OFF_2_OUTBOUND_0).
""
However, we also know that, if there is no match in the iATU table:
""
The default behavior of the ATU when there is no address match in the outbound direction or no
TLP attribute match in the inbound direction, is to pass the transaction through.
""
I.e. if there is a iATU mapping (which is what pci_epc_map_addr() sets up),
the eDMA will take that into account. If there is none, it will go through
untranslated.
So for the eDMA embedded on the DWC controller, we do not strictly need to
call pci_epc_map_addr(). (However, we probably want to do it anyway, as long
as we haven't enabled DMA bypass mode, just to make sure that there is no
other iATU mapping in the mapping table that will affect our transfer.)
However, if you think about a generic DMA controller, e.g. an ARM primecell
pl330, I don't see how that DMA controller would be able to perform
transfers correctly if there is no iATU mapping for the region that it
is reading/writing to.
So the safest thing, that will work with any DMA controller is probably to
always call pci_epc_map_addr() before doing the transfer.
However, as I pointed out in:
https://lore.kernel.org/lkml/Zg5oeDzq5u3jmKIu@ryzen/
This behavior is still inconsistent between PCI EPF drivers:
E.g. for pci-epf-test.c:
When using dedicated DMA rx/tx channels (epf_test->dma_private == true),
and when FLAG_USE_DMA is set, it currently always calls pci_epc_map_addr()
before doing the transfer.
However for pci-epf-mhi.c:
When using dedicated DMA rx/tx channels and when MHI_EPF_USE_DMA is set,
it currently does not call pci_epc_map_addr() before doing the transfer.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 8:38 ` Niklas Cassel
@ 2024-10-22 11:57 ` Damien Le Moal
2024-10-22 13:56 ` Manivannan Sadhasivam
1 sibling, 0 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-22 11:57 UTC (permalink / raw)
To: Niklas Cassel
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On 10/22/24 17:38, Niklas Cassel wrote:
> On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
>> On 10/22/24 07:19, Bjorn Helgaas wrote:
>>> On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>>
>> DMA transfers that can be done using dedicated DMA rx/tx channels associated
>> with the endpoint controller do not need to use this API as the mapping to
>> the host PCI address space is automatically handled by the DMA driver.
>
> I disagree with this part. It think that it depends on the DMA controller.
>
> Looking at the manual for e.g. the embedded DMA controller on the DWC
> controller (eDMA):
> ""
> When you do not want the iATU to translate outbound requests that are generated by the
> internal DMA module, then you must implement one of the following approaches:
> - Ensure that the combination of DMA channel programming and iATU control register
> programming, causes no translation of DMA traffic to be done in the iATU.
> or
> - Activate the DMA bypass mode to allow request TLPs which are initiated by the DMA
> controller to pass through the iATU untranslated. You can activate DMA bypass mode by
> setting the DMA Bypass field of the iATU Control 2 Register (IATU_REGION_C-
> TRL_OFF_2_OUTBOUND_0).
> ""
>
> However, we also know that, if there is no match in the iATU table:
> ""
> The default behavior of the ATU when there is no address match in the outbound direction or no
> TLP attribute match in the inbound direction, is to pass the transaction through.
> ""
>
> I.e. if there is a iATU mapping (which is what pci_epc_map_addr() sets up),
> the eDMA will take that into account. If there is none, it will go through
> untranslated.
>
>
> So for the eDMA embedded on the DWC controller, we do not strictly need to
> call pci_epc_map_addr(). (However, we probably want to do it anyway, as long
> as we haven't enabled DMA bypass mode, just to make sure that there is no
> other iATU mapping in the mapping table that will affect our transfer.)
>
> However, if you think about a generic DMA controller, e.g. an ARM primecell
> pl330, I don't see how it that DMA controller will be able to perform
> transfers correctly if there is not an iATU mapping for the region that it
> is reading/writing to.
>
> So the safest thing, that will work with any DMA controller is probably to
> always call pci_epc_map_addr() before doing the transfer.
Indeed, I think you are right. Not doing the mapping before DMA with the RK3588
works fine, but that may not be the case for all controllers.
--
Damien Le Moal
Western Digital Research
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 8:38 ` Niklas Cassel
2024-10-22 11:57 ` Damien Le Moal
@ 2024-10-22 13:56 ` Manivannan Sadhasivam
2024-10-22 14:16 ` Niklas Cassel
1 sibling, 1 reply; 29+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-22 13:56 UTC (permalink / raw)
To: Niklas Cassel
Cc: Damien Le Moal, Bjorn Helgaas, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On Tue, Oct 22, 2024 at 10:38:58AM +0200, Niklas Cassel wrote:
> On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
> > On 10/22/24 07:19, Bjorn Helgaas wrote:
> > > On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
> >
> > DMA transfers that can be done using dedicated DMA rx/tx channels associated
> > with the endpoint controller do not need to use this API as the mapping to
> > the host PCI address space is automatically handled by the DMA driver.
>
> I disagree with this part. It think that it depends on the DMA controller.
>
> Looking at the manual for e.g. the embedded DMA controller on the DWC
> controller (eDMA):
> ""
> When you do not want the iATU to translate outbound requests that are generated by the
> internal DMA module, then you must implement one of the following approaches:
> - Ensure that the combination of DMA channel programming and iATU control register
> programming, causes no translation of DMA traffic to be done in the iATU.
> or
> - Activate the DMA bypass mode to allow request TLPs which are initiated by the DMA
> controller to pass through the iATU untranslated. You can activate DMA bypass mode by
> setting the DMA Bypass field of the iATU Control 2 Register (IATU_REGION_C-
> TRL_OFF_2_OUTBOUND_0).
> ""
>
> However, we also know that, if there is no match in the iATU table:
> ""
> The default behavior of the ATU when there is no address match in the outbound direction or no
> TLP attribute match in the inbound direction, is to pass the transaction through.
> ""
>
> I.e. if there is a iATU mapping (which is what pci_epc_map_addr() sets up),
> the eDMA will take that into account. If there is none, it will go through
> untranslated.
>
>
> So for the eDMA embedded on the DWC controller, we do not strictly need to
> call pci_epc_map_addr(). (However, we probably want to do it anyway, as long
> as we haven't enabled DMA bypass mode, just to make sure that there is no
> other iATU mapping in the mapping table that will affect our transfer.)
>
I do not recommend that. The EPF driver *should* know how to isolate the
address spaces between DMA and iATU. Creating an iATU mapping for the DMA
address just defeats the purpose of using DMA.
If any overlapping mapping is present, then the EPF driver has a bug!
> However, if you think about a generic DMA controller, e.g. an ARM primecell
> pl330, I don't see how it that DMA controller will be able to perform
> transfers correctly if there is not an iATU mapping for the region that it
> is reading/writing to.
>
I don't think a generic DMA controller can be used to read/write to remote
memory. It needs to be integrated with the PCIe IP so that it can issue PCIe
transactions.
> So the safest thing, that will work with any DMA controller is probably to
> always call pci_epc_map_addr() before doing the transfer.
>
>
> However, as I pointed out in:
> https://lore.kernel.org/lkml/Zg5oeDzq5u3jmKIu@ryzen/
>
> This behavior is still inconsistent between PCI EPF drivers:
> E.g. for pci-epf-test.c:
> When using dedicated DMA rx/tx channels (epf_test->dma_private == true),
> and when FLAG_USE_DMA is set, it currently always calls pci_epc_map_addr()
> before doing the transfer.
>
> However for pci-epf-mhi.c:
> When using dedicated DMA rx/tx channels and when MHI_EPF_USE_DMA is set,
> it currently does not call pci_epc_map_addr() before doing the transfer.
>
Because there are not going to be any overlapping mappings. But I do agree
about the inconsistency between EPF drivers...
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 13:56 ` Manivannan Sadhasivam
@ 2024-10-22 14:16 ` Niklas Cassel
2024-10-22 15:18 ` Frank Li
2024-10-22 15:30 ` Manivannan Sadhasivam
0 siblings, 2 replies; 29+ messages in thread
From: Niklas Cassel @ 2024-10-22 14:16 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Damien Le Moal, Bjorn Helgaas, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On Tue, Oct 22, 2024 at 07:26:31PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Oct 22, 2024 at 10:38:58AM +0200, Niklas Cassel wrote:
> > On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
> > > On 10/22/24 07:19, Bjorn Helgaas wrote:
> > > > On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>
> > However, if you think about a generic DMA controller, e.g. an ARM primecell
> > pl330, I don't see how it that DMA controller will be able to perform
> > transfers correctly if there is not an iATU mapping for the region that it
> > is reading/writing to.
> >
>
> I don't think the generic DMA controller can be used to read/write to remote
> memory. It needs to be integrated with the PCIe IP so that it can issue PCIe
> transactions.
I'm not an expert, so I might of course be misunderstanding how things work.
When the CPU performs an AXI read/write to a MMIO address within the PCIe
controller (specifically the PCIe controller's outbound memory window),
the PCIe controller translates that incoming read/write to a read/write on the
PCIe bus.
(The PCI address of the generated PCIe transaction will depend on how the iATU
has been configured, which determines how reads/writes to the PCIe controller's
outbound memory window should be translated to PCIe addresses.)
If that is how it works when the CPU does the AXI read/write, why wouldn't
things work the same if it is a generic DMA controller performing the AXI
read/write to the MMIO address targeting the PCIe controller's outbound memory
window?
Kind regards,
Niklas
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 14:16 ` Niklas Cassel
@ 2024-10-22 15:18 ` Frank Li
2024-10-22 15:30 ` Manivannan Sadhasivam
1 sibling, 0 replies; 29+ messages in thread
From: Frank Li @ 2024-10-22 15:18 UTC (permalink / raw)
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Damien Le Moal, Bjorn Helgaas,
Krzysztof Wilczyński, Kishon Vijay Abraham I, Bjorn Helgaas,
Lorenzo Pieralisi, Rob Herring, Jonathan Corbet, Jingoo Han,
linux-pci, Rick Wertenbroek
On Tue, Oct 22, 2024 at 04:16:24PM +0200, Niklas Cassel wrote:
> On Tue, Oct 22, 2024 at 07:26:31PM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Oct 22, 2024 at 10:38:58AM +0200, Niklas Cassel wrote:
> > > On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
> > > > On 10/22/24 07:19, Bjorn Helgaas wrote:
> > > > > On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
> >
> > > However, if you think about a generic DMA controller, e.g. an ARM primecell
> > > pl330, I don't see how it that DMA controller will be able to perform
> > > transfers correctly if there is not an iATU mapping for the region that it
> > > is reading/writing to.
> > >
> >
> > I don't think the generic DMA controller can be used to read/write to remote
> > memory. It needs to be integrated with the PCIe IP so that it can issue PCIe
> > transactions.
>
> I'm not an expert, so I might of course be misunderstanding how things work.
>
> When the CPU performs an AXI read/write to a MMIO address within the PCIe
> controller (specifically the PCIe controller's outbound memory window),
> the PCIe controller translates that incoming read/write to a read/write on the
> PCIe bus.
>
> (The PCI address of the generated PCIe transaction will depend on how the iATU
> has been configured, which determines how reads/writes to the PCIe controller's
> outbound memory window should be translated to PCIe addresses.)
>
> If that is how it works when the CPU does the AXI read/write, why wouldn't
> things work the same if it is an generic DMA controller performing the AXI
> read/write to the MMIO address targeting the PCIe controller's outbound memory
> window?
A generic DMA controller can perform memory-to-memory transfers (with a PCI
mapped window as one end), but that MMIO window first needs to be mapped by the
PCI controller's iATU.
For example, to copy EP memory at 0x4000_0000 to the PCI host's 0xA_0000_0000:
EP memory (0x4000_0000) -> DMA -> PCI mapped window (iATU) (0x8000_0000) ->
PCI bus (0xA_0000_0000) -> Host memory (0xA_0000_0000)
But the embedded DMA can bypass the iATU and directly send out TLPs:
EP memory (0x4000_0000) -> PCI controller DMA -> PCI bus (0xA_0000_0000)
-> Host memory (0xA_0000_0000)
In other words, the eDMA can copy data to any place in PCI host memory, while a
generic DMA controller can only copy data to the EP's PCI mapped window.
Frank
>
>
> Kind regards,
> Niklas
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 14:16 ` Niklas Cassel
2024-10-22 15:18 ` Frank Li
@ 2024-10-22 15:30 ` Manivannan Sadhasivam
2024-10-22 22:12 ` Damien Le Moal
1 sibling, 1 reply; 29+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-22 15:30 UTC (permalink / raw)
To: Niklas Cassel
Cc: Damien Le Moal, Bjorn Helgaas, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On Tue, Oct 22, 2024 at 04:16:24PM +0200, Niklas Cassel wrote:
> On Tue, Oct 22, 2024 at 07:26:31PM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Oct 22, 2024 at 10:38:58AM +0200, Niklas Cassel wrote:
> > > On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
> > > > On 10/22/24 07:19, Bjorn Helgaas wrote:
> > > > > On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
> >
> > > However, if you think about a generic DMA controller, e.g. an ARM primecell
> > > pl330, I don't see how it that DMA controller will be able to perform
> > > transfers correctly if there is not an iATU mapping for the region that it
> > > is reading/writing to.
> > >
> >
> > I don't think the generic DMA controller can be used to read/write to remote
> > memory. It needs to be integrated with the PCIe IP so that it can issue PCIe
> > transactions.
>
> I'm not an expert, so I might of course be misunderstanding how things work.
>
Neither am I :) I'm just sharing my understanding based on reading the DWC spec
and I'm open to being corrected if I'm wrong.
> When the CPU performs an AXI read/write to a MMIO address within the PCIe
> controller (specifically the PCIe controller's outbound memory window),
> the PCIe controller translates that incoming read/write to a read/write on the
> PCIe bus.
>
I don't think the *PCIe controller* translates the read/writes, but the iATU.
If we use the iATU, then the remote address needs to be mapped to the endpoint
DDR, and if the CPU performs an AXI read/write to that address, then the iATU
will translate the DDR address to the remote address and issue PCIe
transactions (together with the PCIe controller).
And if DMA is used, then DMA controller can issue PCIe transactions to the
remote memory itself (again, together with the PCIe controller). So no mapping
is required here.
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 1:51 ` Damien Le Moal
2024-10-22 8:38 ` Niklas Cassel
@ 2024-10-22 20:47 ` Bjorn Helgaas
2024-10-22 22:05 ` Damien Le Moal
1 sibling, 1 reply; 29+ messages in thread
From: Bjorn Helgaas @ 2024-10-22 20:47 UTC (permalink / raw)
To: Damien Le Moal
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek, Niklas Cassel
On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
> On 10/22/24 07:19, Bjorn Helgaas wrote:
> > On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
> >> This series introduces the new functions pci_epc_mem_map() and
> >> pci_epc_mem_unmap() to improve handling of the PCI address mapping
> >> alignment constraints of endpoint controllers in a controller
> >> independent manner.
> >
> > Hi Damien, I know this is obvious to everybody except me, but who
> > uses this mapping? A driver running on the endpoint that does
> > MMIO? DMA (Memory Reads or Writes that target an endpoint BAR)?
> > I'm still trying to wrap my head around the whole endpoint driver
> > model.
>
> The mapping API is for mmio or DMA using endpoint controller memory
> mapped to a host PCI address range. It is not for BARs. BARs setup
> does not use the same API and has not changed with these patches.
>
> BARs can still be accessed on the endpoint (within the EPF driver)
> with regular READ_ONCE()/WRITE_ONCE() once they are set. But any
> data transfer between the PCI RC host and the EPF driver on the EP
> host that use mmio or DMA generic channel (memory copy offload
> channel) needs the new mapping API. DMA transfers that can be done
> using dedicated DMA rx/tx channels associated with the endpoint
> controller do not need to use this API as the mapping to the host
> PCI address space is automatically handled by the DMA driver.
Sorry I'm dense. I'm really not used to thinking from the endpoint
point of view. Correct me where I go off the rails:
- This code (pci_epc_mem_map()) runs on an endpoint, basically as
part of some endpoint firmware, right?
- On the endpoint's PCIe side, it receives memory read/write TLPs?
- These TLPs would be generated elsewhere in the PCIe fabric, e.g.,
by a driver on the host doing MMIO to the endpoint, or possibly
another endpoint doing peer-to-peer DMA.
- Mem read/write TLPs are routed by address, and the endpoint
accepts them if the address matches one of its BARs.
- This is a little different from a Root Complex, which would
basically treat reads/writes to anything outside the Root Port
windows as incoming DMA headed to physical memory.
- A Root Complex would use the TLP address (the PCI bus address)
directly as a CPU physical memory address unless the TLP address
is translated by an IOMMU.
- For the endpoint, you want the BAR to be an aperture to physical
memory in the address space of the SoC driving the endpoint.
- The SoC physical memory address may be different from the PCI bus
address in the TLP, and pci_epc_mem_map() is the way to account
for this?
- IOMMU translations from PCI to CPU physical address space are
pretty arbitrary and needn't be contiguous on the CPU side.
- pci_epc_mem_map() sets up a conceptually similar PCI to CPU
address space translation, but it's much simpler because it
basically applies just a constant offset?
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 20:47 ` Bjorn Helgaas
@ 2024-10-22 22:05 ` Damien Le Moal
2024-10-22 23:49 ` Bjorn Helgaas
0 siblings, 1 reply; 29+ messages in thread
From: Damien Le Moal @ 2024-10-22 22:05 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek, Niklas Cassel
On 10/23/24 05:47, Bjorn Helgaas wrote:
> On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
>> On 10/22/24 07:19, Bjorn Helgaas wrote:
>>> On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>>>> This series introduces the new functions pci_epc_mem_map() and
>>>> pci_epc_mem_unmap() to improve handling of the PCI address mapping
>>>> alignment constraints of endpoint controllers in a controller
>>>> independent manner.
>>>
>>> Hi Damien, I know this is obvious to everybody except me, but who
>>> uses this mapping? A driver running on the endpoint that does
>>> MMIO? DMA (Memory Reads or Writes that target an endpoint BAR)?
>>> I'm still trying to wrap my head around the whole endpoint driver
>>> model.
>>
>> The mapping API is for mmio or DMA using endpoint controller memory
>> mapped to a host PCI address range. It is not for BARs. BARs setup
>> does not use the same API and has not changed with these patches.
>>
>> BARs can still be accessed on the endpoint (within the EPF driver)
>> with regular READ_ONCE()/WRITE_ONCE() once they are set. But any
>> data transfer between the PCI RC host and the EPF driver on the EP
>> host that use mmio or DMA generic channel (memory copy offload
>> channel) needs the new mapping API. DMA transfers that can be done
>> using dedicated DMA rx/tx channels associated with the endpoint
>> controller do not need to use this API as the mapping to the host
>> PCI address space is automatically handled by the DMA driver.
>
> Sorry I'm dense. I'm really not used to thinking in the endpoint
> point of view. Correct me where I go off the rails:
>
> - This code (pci_epc_mem_map()) runs on an endpoint, basically as
> part of some endpoint firmware, right?
Not sure what you mean by "firmware" here. pci_epc_mem_map() is intended for use
by a PCI endpoint function driver on the EP side, nothing else.
> - On the endpoint's PCIe side, it receives memory read/write TLPs?
pci_epc_mem_map() does not receive anything itself. It only allocates local EP
controller memory from outbound ATU windows and maps that to a PCI address
region of the host (RC side). Here "map" means programming the ATU using a
correctly aligned PCI address that satisfies the EP PCI controller alignment
constraints.
The memory read/write TLPs will happen once the EP function driver accesses the
mapped memory with memcpy_fromio/toio() (or uses the generic memory copy
offload DMA channel).
> - These TLPs would be generated elsewhere in the PCIe fabric, e.g.,
> by a driver on the host doing MMIO to the endpoint, or possibly
> another endpoint doing peer-to-peer DMA.
MMIO or DMA, yes.
>
> - Mem read/write TLPs are routed by address, and the endpoint
> accepts them if the address matches one of its BARs.
Yes, but pci_epc_mem_map() is not used for BARs on the EP side. BAR setup on
EP PCI controllers uses inbound ATU windows, whereas pci_epc_mem_map() programs
outbound ATU windows. A BAR is set up using pci_epf_alloc_space() +
pci_epc_set_bar().
>
> - This is a little different from a Root Complex, which would
> basically treat reads/writes to anything outside the Root Port
> windows as incoming DMA headed to physical memory.
>
> - A Root Complex would use the TLP address (the PCI bus address)
> directly as a CPU physical memory address unless the TLP address
> is translated by an IOMMU.
>
> - For the endpoint, you want the BAR to be an aperture to physical
> memory in the address space of the SoC driving the endpoint.
Yes, and as said above this is done with pci_epf_alloc_space() +
pci_epc_set_bar(), not pci_epc_mem_map().
> - The SoC physical memory address may be different from the PCI but
> address in the TLP, and pci_epc_mem_map() is the way to account
> for this?
pci_epc_mem_map() is essentially a call to pci_epc_mem_alloc_addr() +
pci_epc_map_addr(). pci_epc_mem_alloc_addr() allocates memory from the EP PCI
controller outbound windows and pci_epc_map_addr() programs an EP PCI controller
outbound ATU entry to map that memory to some PCI address region of the host (RC).
pci_epc_mem_map() was created because the current epc API does not hide any of
the EP PCI controller PCI address alignment constraints for programming outbound
ATU entries. This means that when using pci_epc_mem_alloc_addr() +
pci_epc_map_addr() directly, things may or may not work (often the latter)
depending on the PCI address region (start address and size) that is to be
mapped and accessed by the endpoint function driver.
Most EP PCI controllers at least require a PCI address to be aligned to some
fixed boundary (e.g. 64K for the RK3588 SoC on the Rock5B board). Even such
alignment boundary/mask is not exposed through the API (epc_features->align is
for BARs and should not be used for outbound ATU programming). Worse yet, some
EP PCI controllers use the lower bits of a local memory window address range as
the lower bits of the RC PCI address range when generating TLPs (e.g. RK3399 and
Cadence EP controller). For these EP PCI controllers, the programming of
outbound ATU entries for mapping an RC PCI address region thus ends up depending
on the start PCI address of the region *and* the region size (as the size drives
the number of lower bits of the address that will change over the entire region).
The above API issues were essentially unsolvable from a regular endpoint driver
correctly coded to be EP controller independent. I could not get my prototype
nvme endpoint function driver to work without introducing pci_epc_mem_map()
(nvme does not require any special alignment of command buffers, so on an nvme
EPF driver we end up having to deal with essentially random PCI addresses that
have no special alignment).
pci_epc_mem_map() relies on the new ->align_addr() EP controller operation to
get information about the start address and the size to use for mapping to local
memory a PCI address region. The size given is used for pci_epc_mem_alloc_addr()
and the start address and size given are used with pci_epc_map_addr(). This
generates a correct mapping regardless of the PCI address and size given.
And for the actual data access by the EP function driver, pci_epc_mem_map()
gives the address within the mapping that corresponds to the PCI address that we
wanted mapped.
Et voila, problem solved :)
>
> - IOMMU translations from PCI to CPU physical address space are
> pretty arbitrary and needn't be contiguous on the CPU side.
>
> - pci_epc_mem_map() sets up a conceptually similar PCI to CPU
> address space translation, but it's much simpler because it
> basically applies just a constant offset?
--
Damien Le Moal
Western Digital Research
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 15:30 ` Manivannan Sadhasivam
@ 2024-10-22 22:12 ` Damien Le Moal
0 siblings, 0 replies; 29+ messages in thread
From: Damien Le Moal @ 2024-10-22 22:12 UTC (permalink / raw)
To: Manivannan Sadhasivam, Niklas Cassel
Cc: Bjorn Helgaas, Krzysztof Wilczyński, Kishon Vijay Abraham I,
Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Jonathan Corbet,
Jingoo Han, linux-pci, Rick Wertenbroek
On 10/23/24 00:30, Manivannan Sadhasivam wrote:
> On Tue, Oct 22, 2024 at 04:16:24PM +0200, Niklas Cassel wrote:
>> On Tue, Oct 22, 2024 at 07:26:31PM +0530, Manivannan Sadhasivam wrote:
>>> On Tue, Oct 22, 2024 at 10:38:58AM +0200, Niklas Cassel wrote:
>>>> On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
>>>>> On 10/22/24 07:19, Bjorn Helgaas wrote:
>>>>>> On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>>>
>>>> However, if you think about a generic DMA controller, e.g. an ARM primecell
>>>> pl330, I don't see how it that DMA controller will be able to perform
>>>> transfers correctly if there is not an iATU mapping for the region that it
>>>> is reading/writing to.
>>>>
>>>
>>> I don't think the generic DMA controller can be used to read/write to remote
>>> memory. It needs to be integrated with the PCIe IP so that it can issue PCIe
>>> transactions.
>>
>> I'm not an expert, so I might of course be misunderstanding how things work.
>>
>
> Neither am I :) I'm just sharing my understanding based on reading the DWC spec
> and open to get corrected if I'm wrong.
>
>> When the CPU performs an AXI read/write to a MMIO address within the PCIe
>> controller (specifically the PCIe controller's outbound memory window),
>> the PCIe controller translates that incoming read/write to a read/write on the
>> PCIe bus.
>>
>
> I don't think the *PCIe controller* translates the read/writes, but the iATU. If
> we use iATU, then the remote address needs to be mapped to the endpoint DDR and
> if CPU performs AXI read/write to that address, then iATU will translate the DDR
> address to remote address and then issue PCIe transactions (together with the
> PCIe controller).
>
> And if DMA is used, then DMA controller can issue PCIe transactions to the
> remote memory itself (again, together with the PCIe controller). So no mapping
> is required here.
Caveat: DMA here cannot mean "using the generic memory copy offload DMA
channel". It must be DMA driven by hardware rx/tx DMA channels. For the memory
copy offload DMA channel, an EPF must map the address range it wants to "DMA"
to (which is really memcpy_toio/fromio(), but the DMA API will hide that...).
Would love to have some EPF wrapper API around all that mess to simplify EPF
drivers. They all end up having to do the exact same things.
>
> - Mani
>
--
Damien Le Moal
Western Digital Research
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 22:05 ` Damien Le Moal
@ 2024-10-22 23:49 ` Bjorn Helgaas
2024-10-23 2:51 ` Damien Le Moal
0 siblings, 1 reply; 29+ messages in thread
From: Bjorn Helgaas @ 2024-10-22 23:49 UTC (permalink / raw)
To: Damien Le Moal
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek, Niklas Cassel
On Wed, Oct 23, 2024 at 07:05:54AM +0900, Damien Le Moal wrote:
> On 10/23/24 05:47, Bjorn Helgaas wrote:
> > On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
> >> On 10/22/24 07:19, Bjorn Helgaas wrote:
> >>> On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
> >>>> This series introduces the new functions pci_epc_mem_map() and
> >>>> pci_epc_mem_unmap() to improve handling of the PCI address mapping
> >>>> alignment constraints of endpoint controllers in a controller
> >>>> independent manner.
> >>>
> >>> Hi Damien, I know this is obvious to everybody except me, but who
> >>> uses this mapping? A driver running on the endpoint that does
> >>> MMIO? DMA (Memory Reads or Writes that target an endpoint BAR)?
> >>> I'm still trying to wrap my head around the whole endpoint driver
> >>> model.
> >>
> >> The mapping API is for mmio or DMA using endpoint controller memory
> >> mapped to a host PCI address range. It is not for BARs. BARs setup
> >> does not use the same API and has not changed with these patches.
> >>
> >> BARs can still be accessed on the endpoint (within the EPF driver)
> >> with regular READ_ONCE()/WRITE_ONCE() once they are set. But any
> >> data transfer between the PCI RC host and the EPF driver on the EP
> >> host that use mmio or DMA generic channel (memory copy offload
> >> channel) needs the new mapping API. DMA transfers that can be done
> >> using dedicated DMA rx/tx channels associated with the endpoint
> >> controller do not need to use this API as the mapping to the host
> >> PCI address space is automatically handled by the DMA driver.
> >
> > Sorry I'm dense. I'm really not used to thinking in the endpoint
> > point of view. Correct me where I go off the rails:
> >
> > - This code (pci_epc_mem_map()) runs on an endpoint, basically as
> > part of some endpoint firmware, right?
>
> Not sure what you mean by "firmware" here. pci_epc_mem_map() is
> intended for use by a PCI endpoint function driver on the EP side,
> nothing else.
I mean the software operating the endpoint from the "inside", not from
the PCIe side.
In a non-endpoint framework scenario, the kernel running on the host
enumerates PCI devices using config reads, and a driver like nvme runs
on the host and operates an NVMe device with config accesses and MMIO
to the NVMe BARs.
I referred to "firmware" because if the endpoint were on a PCIe
plug-in card, an SoC on the card could run firmware that uses the
endpoint framework to respond to PCIe activity generated by the host.
(Of course, if the host driver programs the endpoint appropriately,
the endpoint might also initiate its own PCIe activity for DMA.)
> > - On the endpoint's PCIe side, it receives memory read/write TLPs?
>
> pci_epc_mem_map() does not receive anything itself. It only
> allocates local EP controller memory from outbound ATU windows and
> map that to a PCI address region of the host (RC side). Here "map"
> means programming the ATU using a correctly aligned PCI address that
> satisfies the EP PCI controller alignment constraints.
I assumed we were talking about an endpoint responding to PCIe traffic
generated elsewhere, but I think my assumption was wrong.
I don't have any specs for these endpoint controllers, and there's
nothing in Documentation/PCI/endpoint/ about ATUs. Some doc and a few
pictures would go a long way.
> The memory read/write TLPs will happen once the EP function driver
> access the mapped memory with memcpy_fromio/toio() (or use generic
> memory copy offload DMA channel).
This would be the endpoint itself *initiating* DMA TLPs on PCIe, not
responding to incoming DMA, I guess.
> > - These TLPs would be generated elsewhere in the PCIe fabric, e.g.,
> > by a driver on the host doing MMIO to the endpoint, or possibly
> > another endpoint doing peer-to-peer DMA.
>
> MMIO or DMA, yes.
>
> > - Mem read/write TLPs are routed by address, and the endpoint
> > accepts them if the address matches one of its BARs.
>
> Yes, but pci_epc_mem_map() is not for use for BARs on the EP side.
> BARs setup on EP PCI controllers use inbound ATU windows.
> pci_epc_mem_map() programs outbound ATU windows. BARs setup use
> pci_epf_alloc_space() + pci_epc_set_bar() to setup a BAR.
>
> > - This is a little different from a Root Complex, which would
> > basically treat reads/writes to anything outside the Root Port
> > windows as incoming DMA headed to physical memory.
> >
> > - A Root Complex would use the TLP address (the PCI bus address)
> > directly as a CPU physical memory address unless the TLP address
> > is translated by an IOMMU.
> >
> > - For the endpoint, you want the BAR to be an aperture to physical
> > memory in the address space of the SoC driving the endpoint.
>
> Yes, and as said above this is done with pci_epf_alloc_space() +
> pci_epc_set_bar(), not pci_epc_mem_map().
And if the endpoint is *initiating* DMA, BARs of that endpoint are not
involved at all. The target of the DMA would be either host memory or
a BAR of another endpoint.
> > - The SoC physical memory address may be different from the PCI bus
> > address in the TLP, and pci_epc_mem_map() is the way to account
> > for this?
>
> pci_epc_mem_map() is essentially a call to pci_epc_mem_alloc_addr()
> + pci_epc_map_addr(). pci_epc_mem_alloc_addr() allocates memory from
> the EP PCI controller outbound windows and pci_epc_map_addr()
> programs an EP PCI controller outbound ATU entry to map that memory
> to some PCI address region of the host (RC).
I see "RC" and "RC address" used often in this area, but I don't quite
understand the RC connection. It sounds like this might be the PCI
memory address space, i.e., the set of addresses that might appear in
PCIe TLPs?
Does "EPC addr space" refer to the physical address space of the
processor operating the endpoint, or does it mean something more
specific?
> pci_epc_mem_map() was created because the current epc API does not
> hide any of the EP PCI controller PCI address alignment constraints
> for programming outbound ATU entries. This means that using directly
> pci_epc_mem_alloc_addr() + pci_epc_map_addr(), things may or may not
> work (often the latter) depending on the PCI address region (start
> address and its size) that is to be mapped and accessed by the
> endpoint function driver.
>
> Most EP PCI controllers at least require a PCI address to be aligned
> to some fixed boundary (e.g. 64K for the RK3588 SoC on the Rock5B
> board). Even such alignment boundary/mask is not exposed through the
> API (epc_features->align is for BARs and should not be used for
> outbound ATU programming). Worse yet, some EP PCI controllers use
> the lower bits of a local memory window address range as the lower
> bits of the RC PCI address range when generating TLPs (e.g. RK3399
> and Cadence EP controller). For these EP PCI controllers, the
> programming of outbound ATU entries for mapping a RC PCI address
> region thus ends up depending on the start PCI address of the region
> *and* the region size (as the size drives the number of lower bits
> of address that will change over the entire region).
>
> The above API issues were essentially unsolvable from a regular
> endpoint driver correctly coded to be EP controller independent. I
> could not get my prototype nvme endpoint function driver to work
> without introducing pci_epc_mem_map() (nvme does not require any
> special alignment of command buffers, so on an nvme EPF driver we
> end up having to deal with essentially random PCI addresses that
> have no special alignment).
>
> pci_epc_mem_map() relies on the new ->align_addr() EP controller
> operation to get information about the start address and the size to
> use for mapping to local memory a PCI address region. The size given
> is used for pci_epc_mem_alloc_addr() and the start address and size
> given are used with pci_epc_map_addr(). This generates a correct
> mapping regardless of the PCI address and size given. And for the
> actual data access by the EP function driver, pci_epc_mem_map()
> gives the address within the mapping that corresponds to the PCI
> address that we wanted mapped.
> > - IOMMU translations from PCI to CPU physical address space are
> > pretty arbitrary and needn't be contiguous on the CPU side.
> >
> > - pci_epc_mem_map() sets up a conceptually similar PCI to CPU
> > address space translation, but it's much simpler because it
> > basically applies just a constant offset?
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-22 23:49 ` Bjorn Helgaas
@ 2024-10-23 2:51 ` Damien Le Moal
2024-10-23 9:29 ` Niklas Cassel
0 siblings, 1 reply; 29+ messages in thread
From: Damien Le Moal @ 2024-10-23 2:51 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek, Niklas Cassel
On 10/23/24 08:49, Bjorn Helgaas wrote:
> On Wed, Oct 23, 2024 at 07:05:54AM +0900, Damien Le Moal wrote:
>> On 10/23/24 05:47, Bjorn Helgaas wrote:
>>> On Tue, Oct 22, 2024 at 10:51:53AM +0900, Damien Le Moal wrote:
>>>> On 10/22/24 07:19, Bjorn Helgaas wrote:
>>>>> On Sat, Oct 12, 2024 at 08:32:40PM +0900, Damien Le Moal wrote:
>>>>>> This series introduces the new functions pci_epc_mem_map() and
>>>>>> pci_epc_mem_unmap() to improve handling of the PCI address mapping
>>>>>> alignment constraints of endpoint controllers in a controller
>>>>>> independent manner.
>>>>>
>>>>> Hi Damien, I know this is obvious to everybody except me, but who
>>>>> uses this mapping? A driver running on the endpoint that does
>>>>> MMIO? DMA (Memory Reads or Writes that target an endpoint BAR)?
>>>>> I'm still trying to wrap my head around the whole endpoint driver
>>>>> model.
>>>>
>>>> The mapping API is for mmio or DMA using endpoint controller memory
>>>> mapped to a host PCI address range. It is not for BARs. BARs setup
>>>> does not use the same API and has not changed with these patches.
>>>>
>>>> BARs can still be accessed on the endpoint (within the EPF driver)
>>>> with regular READ_ONCE()/WRITE_ONCE() once they are set. But any
>>>> data transfer between the PCI RC host and the EPF driver on the EP
>>>> host that use mmio or DMA generic channel (memory copy offload
>>>> channel) needs the new mapping API. DMA transfers that can be done
>>>> using dedicated DMA rx/tx channels associated with the endpoint
>>>> controller do not need to use this API as the mapping to the host
>>>> PCI address space is automatically handled by the DMA driver.
>>>
>>> Sorry I'm dense. I'm really not used to thinking in the endpoint
>>> point of view. Correct me where I go off the rails:
>>>
>>> - This code (pci_epc_mem_map()) runs on an endpoint, basically as
>>> part of some endpoint firmware, right?
>>
>> Not sure what you mean by "firmware" here. pci_epc_mem_map() is
>> intended for use by a PCI endpoint function driver on the EP side,
>> nothing else.
>
> I mean the software operating the endpoint from the "inside", not from
> the PCIe side.
Got it. So yes: either real device firmware, or a PCI endpoint function driver
on a machine running the PCI endpoint framework.
> In a non-endpoint framework scenario, the kernel running on the host
> enumerates PCI devices using config reads, and a driver like nvme runs
> on the host and operates an NVMe device with config accesses and MMIO
> to the NVMe BARs.
Yep, I know that.
>> pci_epc_mem_map() does not receive anything itself. It only
>> allocates local EP controller memory from outbound ATU windows and
>> map that to a PCI address region of the host (RC side). Here "map"
>> means programming the ATU using a correctly aligned PCI address that
>> satisfies the EP PCI controller alignment constraints.
>
> I assumed we were talking about an endpoint responding to PCIe traffic
> generated elsewhere, but I think my assumption was wrong.
Not at all. See below.
> I don't have any specs for these endpoint controllers, and there's
> nothing in Documentation/PCI/endpoint/ about ATUs. Some doc and a few
> pictures would go a long way.
Agree. The PCI endpoint API documentation is far from great and should be
improved. Will try to do something about it.
>> The memory read/write TLPs will happen once the EP function driver
>> access the mapped memory with memcpy_fromio/toio() (or use generic
>> memory copy offload DMA channel).
>
> This would be the endpoint itself *initiating* DMA TLPs on PCIe, not
> responding to incoming DMA, I guess.
It can be both. What differs is how the endpoint gets the PCI address to read/write.
For an endpoint initiated transfer, typically, the PCI address is obtained from
some "command" received through a BAR or through DMA. The command has the PCI
addresses to use for transferring data to/from the host (e.g. an nvme rw command
uses PRPs or SGLs to specify the PCI address segments for the data buffer of a
command). For this case, the EPF driver calls pci_epc_mem_map() for the
command buffer, does the transfer (memcpy_toio/fromio()) and unmaps with
pci_epc_mem_unmap(). Note though that here, if an eDMA channel is used for the
transfer, the DMA engine will do the mapping automatically and the EPF does not
need to call pci_epc_mem_map()/pci_epc_mem_unmap(). There is still an issue in
this area, which is that it is *not* clear whether the DMA channel used can
actually do the mapping automatically. E.g. the generic DMA channel (mem copy
offload engine) will not. So there is still some API improvement needed to
abstract more HW-dependent things here.
For an RC-initiated transfer, the transfers are not done to random addresses.
E.g. using NVMe CMB (controller memory buffer), the host will tell the endpoint
"I want to use the CMB with this PCI address". Then the endpoint can use
pci_epc_mem_map() to map the CMB to said PCI address and exchange data with the
host through it.
>>> - These TLPs would be generated elsewhere in the PCIe fabric, e.g.,
>>> by a driver on the host doing MMIO to the endpoint, or possibly
>>> another endpoint doing peer-to-peer DMA.
>>
>> MMIO or DMA, yes.
>>
>>> - Mem read/write TLPs are routed by address, and the endpoint
>>> accepts them if the address matches one of its BARs.
>>
>> Yes, but pci_epc_mem_map() is not for use for BARs on the EP side.
>> BARs setup on EP PCI controllers use inbound ATU windows.
>> pci_epc_mem_map() programs outbound ATU windows. BARs setup use
>> pci_epf_alloc_space() + pci_epc_set_bar() to setup a BAR.
>>
>>> - This is a little different from a Root Complex, which would
>>> basically treat reads/writes to anything outside the Root Port
>>> windows as incoming DMA headed to physical memory.
>>>
>>> - A Root Complex would use the TLP address (the PCI bus address)
>>> directly as a CPU physical memory address unless the TLP address
>>> is translated by an IOMMU.
>>>
>>> - For the endpoint, you want the BAR to be an aperture to physical
>>> memory in the address space of the SoC driving the endpoint.
>>
>> Yes, and as said above this is done with pci_epf_alloc_space() +
>> pci_epc_set_bar(), not pci_epc_mem_map().
>
> And if the endpoint is *initiating* DMA, BARs of that endpoint are not
> involved at all. The target of the DMA would be either host memory or
> a BAR of another endpoint.
Yes.
>>> - The SoC physical memory address may be different from the PCI bus
>>> address in the TLP, and pci_epc_mem_map() is the way to account
>>> for this?
>>
>> pci_epc_mem_map() is essentially a call to pci_epc_mem_alloc_addr()
>> + pci_epc_map_addr(). pci_epc_mem_alloc_addr() allocates memory from
>> the EP PCI controller outbound windows and pci_epc_map_addr()
>> programs an EP PCI controller outbound ATU entry to map that memory
>> to some PCI address region of the host (RC).
>
> I see "RC" and "RC address" used often in this area, but I don't quite
> understand the RC connection. It sounds like this might be the PCI
> memory address space, i.e., the set of addresses that might appear in
> PCIe TLPs?
Yes. Sorry about that. PCI address == RC address in my head, since the endpoint
does not "own" the PCI address space; the host RC does.
> Does "EPC addr space" refer to the physical address space of the
> processor operating the endpoint, or does it mean something more
> specific?
Correct. An EPC address is a local memory address on the endpoint, so a CPU
address. For something that maps to an RC PCI address, that generally means
addresses within the EP PCI controller outbound memory regions.
--
Damien Le Moal
Western Digital Research
* Re: [PATCH v6 0/6] Improve PCI memory mapping API
2024-10-23 2:51 ` Damien Le Moal
@ 2024-10-23 9:29 ` Niklas Cassel
0 siblings, 0 replies; 29+ messages in thread
From: Niklas Cassel @ 2024-10-23 9:29 UTC (permalink / raw)
To: Damien Le Moal
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Bjorn Helgaas, Lorenzo Pieralisi,
Rob Herring, Jonathan Corbet, Jingoo Han, linux-pci,
Rick Wertenbroek
On Wed, Oct 23, 2024 at 11:51:41AM +0900, Damien Le Moal wrote:
(snip)
> For an endpoint initiated transfer, typically, the PCI address is obtained from
> some "command" received through a BAR or through DMA. The command has the PCI
> addresses to use for transferring data to/from the host (e.g. an nvme rw command
> uses PRPs or SGLs to specify the PCI address segments for the data buffer of a
> command). For this case, the EPF driver calls pci_epc_mem_map() for the
> command buffer, does the transfer (memcpy_toio/fromio()) and unmaps with
> pci_epc_mem_unmap(). Note though that here, if an eDMA channel is used for the
> transfer, the DMA engine will do the mapping automatically and the EPF does not
> need to call pci_epc_mem_map()/pci_epc_mem_unmap(). There is still an issue in
> this area, which is that it is *not* clear whether the DMA channel used can
> actually do the mapping automatically. E.g. the generic DMA channel (mem copy
> offload engine) will not. So there is still some API improvement needed to
> abstract more HW-dependent things here.
FWIW, in my final reply here:
https://lore.kernel.org/lkml/ZiYuIaX7ZV0exKMt@ryzen/
"
I did suggest that DWC-based drivers could set a DMA_SLAVE_SKIP_MEM_MAP flag
or similar when registering the eDMA, which pci-epf-test then could check,
but I got no response if anyone else thought that this was a good idea.
"
For DMA_SLAVE (private tx/rx DMA channels):
For DWC-based controllers, we can definitely set DMA_SLAVE_SKIP_MEM_MAP when
registering the eDMA (e.g. in dw_pcie_edma_detect()).
However, I don't know how the DMA hardware (if any) in:
drivers/pci/controller/cadence/pcie-cadence-ep.c
drivers/pci/controller/pcie-rcar-ep.c
drivers/pci/controller/pcie-rockchip-ep.c
works, so I'm not sure if those drivers can set DMA_SLAVE_SKIP_MEM_MAP
(the safest thing is to not set that flag, until we know how they work).
For DMA_MEMCPY:
We know that we need to perform pci_epc_mem_map() when using
DMA API + "dummy" memcpy dma-channel (DMA_MEMCPY).
I could imagine that some embedded DMA controllers also provide
DMA_MEMCPY capabilities (in addition to DMA_SLAVE).
So I guess the safest thing is to call the flag something like:
PCI_EPC_DMA_SKIP_MEM_MAP
(rather than PCI_EPC_DMA_SLAVE_SKIP_MEM_MAP).
(since the embedded DMA controller might provide both DMA_SLAVE and DMA_MEMCPY).
And let the EPC driver (e.g. dw_pcie_edma_detect()), or possibly the DMA
driver itself, provide/set this flag.
Kind regards,
Niklas
end of thread, other threads:[~2024-10-23 9:29 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
2024-10-12 11:32 [PATCH v6 0/6] Improve PCI memory mapping API Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 1/6] PCI: endpoint: Introduce pci_epc_function_is_valid() Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 2/6] PCI: endpoint: Improve pci_epc_mem_alloc_addr() Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 3/6] PCI: endpoint: Introduce pci_epc_mem_map()/unmap() Damien Le Moal
2024-10-12 11:47 ` Manivannan Sadhasivam
2024-10-13 9:06 ` Niklas Cassel
2024-10-14 13:09 ` Damien Le Moal
2024-10-15 6:01 ` Manivannan Sadhasivam
2024-10-12 11:32 ` [PATCH v6 4/6] PCI: endpoint: Update documentation Damien Le Moal
2024-10-12 11:48 ` Manivannan Sadhasivam
2024-10-12 11:32 ` [PATCH v6 5/6] PCI: endpoint: test: Use pci_epc_mem_map/unmap() Damien Le Moal
2024-10-12 11:32 ` [PATCH v6 6/6] PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation Damien Le Moal
2024-10-12 11:53 ` Manivannan Sadhasivam
2024-10-12 11:57 ` [PATCH v6 0/6] Improve PCI memory mapping API Manivannan Sadhasivam
2024-10-12 12:03 ` Damien Le Moal
2024-10-21 22:19 ` Bjorn Helgaas
2024-10-22 1:51 ` Damien Le Moal
2024-10-22 8:38 ` Niklas Cassel
2024-10-22 11:57 ` Damien Le Moal
2024-10-22 13:56 ` Manivannan Sadhasivam
2024-10-22 14:16 ` Niklas Cassel
2024-10-22 15:18 ` Frank Li
2024-10-22 15:30 ` Manivannan Sadhasivam
2024-10-22 22:12 ` Damien Le Moal
2024-10-22 20:47 ` Bjorn Helgaas
2024-10-22 22:05 ` Damien Le Moal
2024-10-22 23:49 ` Bjorn Helgaas
2024-10-23 2:51 ` Damien Le Moal
2024-10-23 9:29 ` Niklas Cassel