* [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
@ 2024-03-18 19:30 Niklas Cassel
2024-03-18 20:02 ` Arnd Bergmann
0 siblings, 1 reply; 9+ messages in thread
From: Niklas Cassel @ 2024-03-18 19:30 UTC (permalink / raw)
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Arnd Bergmann, Greg Kroah-Hartman
Cc: Damien Le Moal, Niklas Cassel, linux-pci
The current code uses writel()/readl(), which has an implicit memory
barrier for every single readl()/writel().
Additionally, reading 4 bytes at a time over the PCI bus is not really
optimal, considering that this code is running in an ioctl handler.
Use memcpy_toio()/memcpy_fromio() for BAR tests.
Before patch with a 4MB BAR:
$ time /usr/bin/pcitest -b 1
BAR1: OKAY
real 0m 1.56s
After patch with a 4MB BAR:
$ time /usr/bin/pcitest -b 1
BAR1: OKAY
real 0m 0.54s
Signed-off-by: Niklas Cassel <cassel@kernel.org>
---
drivers/misc/pci_endpoint_test.c | 52 ++++++++++++++++++++++++--------
1 file changed, 40 insertions(+), 12 deletions(-)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
index 705029ad8eb5..cb6c9ccf3a5f 100644
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -272,31 +272,59 @@ static const u32 bar_test_pattern[] = {
0xA5A5A5A5,
};
+static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
+ enum pci_barno barno, int offset,
+ void *write_buf, void *read_buf,
+ int size)
+{
+ memset(write_buf, bar_test_pattern[barno], size);
+ memcpy_toio(test->bar[barno] + offset, write_buf, size);
+
+ /* Make sure that reads are performed after writes. */
+ mb();
+ memcpy_fromio(read_buf, test->bar[barno] + offset, size);
+
+ return memcmp(write_buf, read_buf, size);
+}
+
static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
enum pci_barno barno)
{
- int j;
- u32 val;
- int size;
+ int j, bar_size, buf_size, iters, remain;
+ void *write_buf;
+ void *read_buf;
struct pci_dev *pdev = test->pdev;
+ struct device *dev = &pdev->dev;
if (!test->bar[barno])
return false;
- size = pci_resource_len(pdev, barno);
+ bar_size = pci_resource_len(pdev, barno);
if (barno == test->test_reg_bar)
- size = 0x4;
+ bar_size = 0x4;
- for (j = 0; j < size; j += 4)
- pci_endpoint_test_bar_writel(test, barno, j,
- bar_test_pattern[barno]);
+ buf_size = min(SZ_1M, bar_size);
- for (j = 0; j < size; j += 4) {
- val = pci_endpoint_test_bar_readl(test, barno, j);
- if (val != bar_test_pattern[barno])
+ write_buf = kmalloc(buf_size, GFP_KERNEL);
+ if (!write_buf)
+ return false;
+
+ read_buf = kmalloc(buf_size, GFP_KERNEL);
+ if (!read_buf)
+ return false;
+
+ iters = bar_size / buf_size;
+ for (j = 0; j < iters; j++)
+ if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * j,
+ write_buf, read_buf, buf_size))
+ return false;
+
+ remain = bar_size % buf_size;
+ if (remain)
+ if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * iters,
+ write_buf, read_buf, remain))
return false;
- }
return true;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-18 19:30 [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
@ 2024-03-18 20:02 ` Arnd Bergmann
2024-03-19 4:31 ` Manivannan Sadhasivam
2024-03-19 16:40 ` Niklas Cassel
0 siblings, 2 replies; 9+ messages in thread
From: Arnd Bergmann @ 2024-03-18 20:02 UTC (permalink / raw)
To: Niklas Cassel, Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Greg Kroah-Hartman
Cc: Damien Le Moal, linux-pci
On Mon, Mar 18, 2024, at 20:30, Niklas Cassel wrote:
> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> index 705029ad8eb5..cb6c9ccf3a5f 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -272,31 +272,59 @@ static const u32 bar_test_pattern[] = {
> 0xA5A5A5A5,
> };
>
> +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> + enum pci_barno barno, int offset,
> + void *write_buf, void *read_buf,
> + int size)
> +{
> + memset(write_buf, bar_test_pattern[barno], size);
> + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> +
> + /* Make sure that reads are performed after writes. */
> + mb();
> + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
Did you see actual bugs without the barrier? On normal PCI
semantics, a read will always force a write to be flushed first.
Arnd
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-18 20:02 ` Arnd Bergmann
@ 2024-03-19 4:31 ` Manivannan Sadhasivam
2024-03-19 6:41 ` Arnd Bergmann
2024-03-19 16:40 ` Niklas Cassel
1 sibling, 1 reply; 9+ messages in thread
From: Manivannan Sadhasivam @ 2024-03-19 4:31 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Niklas Cassel, Krzysztof Wilczyński, Kishon Vijay Abraham I,
Greg Kroah-Hartman, Damien Le Moal, linux-pci
On Mon, Mar 18, 2024 at 09:02:21PM +0100, Arnd Bergmann wrote:
> On Mon, Mar 18, 2024, at 20:30, Niklas Cassel wrote:
> > diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> > index 705029ad8eb5..cb6c9ccf3a5f 100644
> > --- a/drivers/misc/pci_endpoint_test.c
> > +++ b/drivers/misc/pci_endpoint_test.c
> > @@ -272,31 +272,59 @@ static const u32 bar_test_pattern[] = {
> > 0xA5A5A5A5,
> > };
> >
> > +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> > + enum pci_barno barno, int offset,
> > + void *write_buf, void *read_buf,
> > + int size)
> > +{
> > + memset(write_buf, bar_test_pattern[barno], size);
> > + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> > +
> > + /* Make sure that reads are performed after writes. */
> > + mb();
> > + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
>
> Did you see actual bugs without the barrier? On normal PCI
> semantics, a read will always force a write to be flushed first.
>
Even for non-PCI cases, read/writes to the same region are not reordered, right?
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-19 4:31 ` Manivannan Sadhasivam
@ 2024-03-19 6:41 ` Arnd Bergmann
0 siblings, 0 replies; 9+ messages in thread
From: Arnd Bergmann @ 2024-03-19 6:41 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Niklas Cassel, Krzysztof Wilczyński, Kishon Vijay Abraham I,
Greg Kroah-Hartman, Damien Le Moal, linux-pci
On Tue, Mar 19, 2024, at 05:31, Manivannan Sadhasivam wrote:
> On Mon, Mar 18, 2024 at 09:02:21PM +0100, Arnd Bergmann wrote:
>> On Mon, Mar 18, 2024, at 20:30, Niklas Cassel wrote:
>> > diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
>> > index 705029ad8eb5..cb6c9ccf3a5f 100644
>> > --- a/drivers/misc/pci_endpoint_test.c
>> > +++ b/drivers/misc/pci_endpoint_test.c
>> > @@ -272,31 +272,59 @@ static const u32 bar_test_pattern[] = {
>> > 0xA5A5A5A5,
>> > };
>> >
>> > +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
>> > + enum pci_barno barno, int offset,
>> > + void *write_buf, void *read_buf,
>> > + int size)
>> > +{
>> > + memset(write_buf, bar_test_pattern[barno], size);
>> > + memcpy_toio(test->bar[barno] + offset, write_buf, size);
>> > +
>> > + /* Make sure that reads are performed after writes. */
>> > + mb();
>> > + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
>>
>> Did you see actual bugs without the barrier? On normal PCI
>> semantics, a read will always force a write to be flushed first.
>>
>
> Even for non-PCI cases, read/writes to the same region are not reordered, right?
Correct. The only thing that is special about PCI here is
that writes are explicitly allowed to get posted in the
first place, on other buses the memcpy_toio() by itself
might already require non-posted behavior.
Arnd
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-18 20:02 ` Arnd Bergmann
2024-03-19 4:31 ` Manivannan Sadhasivam
@ 2024-03-19 16:40 ` Niklas Cassel
2024-03-19 16:48 ` Manivannan Sadhasivam
1 sibling, 1 reply; 9+ messages in thread
From: Niklas Cassel @ 2024-03-19 16:40 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Greg Kroah-Hartman, Damien Le Moal,
linux-pci
Hello Arnd,
On Mon, Mar 18, 2024 at 09:02:21PM +0100, Arnd Bergmann wrote:
> On Mon, Mar 18, 2024, at 20:30, Niklas Cassel wrote:
> > diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> > index 705029ad8eb5..cb6c9ccf3a5f 100644
> > --- a/drivers/misc/pci_endpoint_test.c
> > +++ b/drivers/misc/pci_endpoint_test.c
> > @@ -272,31 +272,59 @@ static const u32 bar_test_pattern[] = {
> > 0xA5A5A5A5,
> > };
> >
> > +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> > + enum pci_barno barno, int offset,
> > + void *write_buf, void *read_buf,
> > + int size)
> > +{
> > + memset(write_buf, bar_test_pattern[barno], size);
> > + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> > +
> > + /* Make sure that reads are performed after writes. */
> > + mb();
> > + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
>
> Did you see actual bugs without the barrier? On normal PCI
> semantics, a read will always force a write to be flushed first.
I'm aware that a Read Request must not pass a Posted Request under
normal PCI transaction ordering rules.
(As defined in PCIe 6.0, Table 2-42 Ordering Rules Summary)
I was more worried about the compiler or CPU reordering things:
https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt#L1876-L1878
I did try the patch without the barrier, and did not see any issues.
I did also see this comment:
https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt#L2785-L2790
Do you think that we need to perform any flushing after the memset(),
to ensure that the data written using memcpy_toio() is actually what
we expect it to me?
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-19 16:40 ` Niklas Cassel
@ 2024-03-19 16:48 ` Manivannan Sadhasivam
2024-03-19 16:53 ` Niklas Cassel
0 siblings, 1 reply; 9+ messages in thread
From: Manivannan Sadhasivam @ 2024-03-19 16:48 UTC (permalink / raw)
To: Niklas Cassel
Cc: Arnd Bergmann, Krzysztof Wilczyński, Kishon Vijay Abraham I,
Greg Kroah-Hartman, Damien Le Moal, linux-pci
On Tue, Mar 19, 2024 at 05:40:33PM +0100, Niklas Cassel wrote:
> Hello Arnd,
>
> On Mon, Mar 18, 2024 at 09:02:21PM +0100, Arnd Bergmann wrote:
> > On Mon, Mar 18, 2024, at 20:30, Niklas Cassel wrote:
> > > diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> > > index 705029ad8eb5..cb6c9ccf3a5f 100644
> > > --- a/drivers/misc/pci_endpoint_test.c
> > > +++ b/drivers/misc/pci_endpoint_test.c
> > > @@ -272,31 +272,59 @@ static const u32 bar_test_pattern[] = {
> > > 0xA5A5A5A5,
> > > };
> > >
> > > +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> > > + enum pci_barno barno, int offset,
> > > + void *write_buf, void *read_buf,
> > > + int size)
> > > +{
> > > + memset(write_buf, bar_test_pattern[barno], size);
> > > + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> > > +
> > > + /* Make sure that reads are performed after writes. */
> > > + mb();
> > > + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
> >
> > Did you see actual bugs without the barrier? On normal PCI
> > semantics, a read will always force a write to be flushed first.
>
> I'm aware that a Read Request must not pass a Posted Request under
> normal PCI transaction ordering rules.
> (As defined in PCIe 6.0, Table 2-42 Ordering Rules Summary)
>
> I was more worried about the compiler or CPU reordering things:
> https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt#L1876-L1878
>
> I did try the patch without the barrier, and did not see any issues.
>
There shouldn't be an issue without the barrier.
>
> I did also see this comment:
> https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt#L2785-L2790
>
> Do you think that we need to perform any flushing after the memset(),
> to ensure that the data written using memcpy_toio() is actually what
> we expect it to me?
>
The documentation recommends cache flushing only if the normal memory write and
MMIO access are dependent. But here you are just accessing the MMIO. So no
explicit ordering or cache flushing is required.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-19 16:48 ` Manivannan Sadhasivam
@ 2024-03-19 16:53 ` Niklas Cassel
2024-03-19 19:55 ` Arnd Bergmann
0 siblings, 1 reply; 9+ messages in thread
From: Niklas Cassel @ 2024-03-19 16:53 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Arnd Bergmann, Krzysztof Wilczyński, Kishon Vijay Abraham I,
Greg Kroah-Hartman, Damien Le Moal, linux-pci
On Tue, Mar 19, 2024 at 10:18:26PM +0530, Manivannan Sadhasivam wrote:
> >
> > I did also see this comment:
> > https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt#L2785-L2790
> >
> > Do you think that we need to perform any flushing after the memset(),
> > to ensure that the data written using memcpy_toio() is actually what
> > we expect it to me?
> >
>
> The documentation recommends cache flushing only if the normal memory write and
> MMIO access are dependent. But here you are just accessing the MMIO. So no
> explicit ordering or cache flushing is required.
What does dependent mean in this case then?
Since the data that we are writing to the device is the data that was
just written to memory using memset().
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-19 16:53 ` Niklas Cassel
@ 2024-03-19 19:55 ` Arnd Bergmann
2024-03-19 21:36 ` Niklas Cassel
0 siblings, 1 reply; 9+ messages in thread
From: Arnd Bergmann @ 2024-03-19 19:55 UTC (permalink / raw)
To: Niklas Cassel, Manivannan Sadhasivam
Cc: Krzysztof Wilczyński, Kishon Vijay Abraham I,
Greg Kroah-Hartman, Damien Le Moal, linux-pci
On Tue, Mar 19, 2024, at 17:53, Niklas Cassel wrote:
> On Tue, Mar 19, 2024 at 10:18:26PM +0530, Manivannan Sadhasivam wrote:
>> >
>> > I did also see this comment:
>> > https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt#L2785-L2790
>> >
>> > Do you think that we need to perform any flushing after the memset(),
>> > to ensure that the data written using memcpy_toio() is actually what
>> > we expect it to me?
>> >
>>
>> The documentation recommends cache flushing only if the normal memory write and
>> MMIO access are dependent. But here you are just accessing the MMIO. So no
>> explicit ordering or cache flushing is required.
>
> What does dependent mean in this case then?
>
> Since the data that we are writing to the device is the data that was
> just written to memory using memset().
You need a barrier for the case where the memset() writes to
a buffer in RAM and then you write the address of that buffer
into a device register, which triggers a DMA read from the
buffer. Without a barrier, the side effect of the MMIO write
may come before the data in the RAM buffer is visible.
A memcpy_fromio() only involves a single master accessing
the memory (i.e. the CPU executing both the memset() and the
memcpy()), so there is no way this can go wrong.
Arnd
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-19 19:55 ` Arnd Bergmann
@ 2024-03-19 21:36 ` Niklas Cassel
0 siblings, 0 replies; 9+ messages in thread
From: Niklas Cassel @ 2024-03-19 21:36 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Greg Kroah-Hartman, Damien Le Moal,
linux-pci
Hello Arnd,
On Tue, Mar 19, 2024 at 08:55:19PM +0100, Arnd Bergmann wrote:
> On Tue, Mar 19, 2024, at 17:53, Niklas Cassel wrote:
> > On Tue, Mar 19, 2024 at 10:18:26PM +0530, Manivannan Sadhasivam wrote:
> >> >
> >> > I did also see this comment:
> >> > https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt#L2785-L2790
> >> >
> >> > Do you think that we need to perform any flushing after the memset(),
> >> > to ensure that the data written using memcpy_toio() is actually what
> >> > we expect it to me?
> >> >
> >>
> >> The documentation recommends cache flushing only if the normal memory write and
> >> MMIO access are dependent. But here you are just accessing the MMIO. So no
> >> explicit ordering or cache flushing is required.
> >
> > What does dependent mean in this case then?
> >
> > Since the data that we are writing to the device is the data that was
> > just written to memory using memset().
>
> You need a barrier for the case where the memset() writes to
> a buffer in RAM and then you write the address of that buffer
> into a device register, which triggers a DMA read from the
> buffer. Without a barrier, the side effect of the MMIO write
> may come before the data in the RAM buffer is visible.
>
> A memcpy_fromio() only involves a single master accessing
> the memory (i.e. the CPU executing both the memset() and the
> memcpy()), so there is no way this can go wrong.
Thank you for the clarification.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-03-19 21:36 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-18 19:30 [PATCH] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
2024-03-18 20:02 ` Arnd Bergmann
2024-03-19 4:31 ` Manivannan Sadhasivam
2024-03-19 6:41 ` Arnd Bergmann
2024-03-19 16:40 ` Niklas Cassel
2024-03-19 16:48 ` Manivannan Sadhasivam
2024-03-19 16:53 ` Niklas Cassel
2024-03-19 19:55 ` Arnd Bergmann
2024-03-19 21:36 ` Niklas Cassel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox