* [PATCH 0/3] DAPL support on s390x platform
@ 2014-10-10 9:34 Alexey Ishchuk
[not found] ` <1412933657-52641-1-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Alexey Ishchuk @ 2014-10-10 9:34 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
schwidefsky-tA70FqPdS9bQT0dZR+AlfA,
gmuelas-tA70FqPdS9bQT0dZR+AlfA, utz.bacher-tA70FqPdS9bQT0dZR+AlfA,
roland-DgEjT+Ai2ygdnm+yROfE0A, yishaih-VPRAkNaXOzVWk0Htik3J/w,
Alexey Ishchuk
The s390 kernel maintainer is going to include the new system calls for
PCI I/O memory access into the s390 Linux kernel. To make the final
decision we would like to receive feedback from the libmlx4 and libibverbs
user space library maintainers with a direct answer to the question: will
those system calls be used in the libmlx4 user space library to provide
support for the s390x platform?
This patch series contains the changes to the kernel and user space
libraries required to provide support for the DAPL API on the s390x
platform.
The current implementation of Infiniband verbs uses mapped memory areas to
directly access the device UAR and Blueflame pages, which are located in
the PCI I/O memory, from user space. On the s390x platform the PCI I/O
memory can be accessed only using special privileged CPU instructions that
cannot be used directly in user space programs. This rules out using
mapped memory areas to access the PCI I/O memory on the s390x platform.
This version of the changes introduces two new kernel system calls which
execute the privileged CPU instructions in kernel space on request from
user space programs. One system call allows user space programs to write
data to a PCI I/O memory page; the other reads data from PCI I/O memory
into a user space program buffer. Both take mapped memory area addresses
as arguments.
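As an illustration only, a user space wrapper around the two proposed calls
might look like the sketch below. The helper names pci_mmio_write() and
pci_mmio_read() are hypothetical and do not appear in any posted patch. The
argument order (mapped address, user buffer, length) follows the kernel patch
in this series. The ENOSYS fallback for other architectures is an assumption
added so that the sketch builds everywhere.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for the syscall() declaration in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical helpers. On s390x, with headers that define the new
 * syscall numbers, they forward the request to the kernel. Everywhere
 * else they fail with ENOSYS (an assumption made for this sketch).
 */
static inline long pci_mmio_write(unsigned long mmio_addr,
				  const void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
#if defined(__s390x__) && defined(__NR_s390_pci_mmio_write)
	return syscall(__NR_s390_pci_mmio_write, mmio_addr, buf, len);
#else
	(void)mmio_addr;
	errno = ENOSYS;
	return -1;
#endif
}

static inline long pci_mmio_read(unsigned long mmio_addr,
				 void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
#if defined(__s390x__) && defined(__NR_s390_pci_mmio_read)
	return syscall(__NR_s390_pci_mmio_read, mmio_addr, buf, len);
#else
	(void)mmio_addr;
	errno = ENOSYS;
	return -1;
#endif
}
```

A caller would pass an address inside an area it obtained by mapping the
device pages, and would check the return value and errno as with any other
system call.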
This approach to DAPL API support on the s390x platform has the following
advantages:
* the current Infiniband and mlx4 support kernel modules remain
unchanged;
* the changes are confined to the platform specific kernel
directory;
* no conditional compilation directives are used in the kernel
source code;
* no changes are required to the kernel virtual memory management;
* only minor changes are required in the user space DAPL API
components.
There is 1 patch for the Linux kernel and 2 patches for the DAPL API user
space components. The kernel patch has not changed since the previous post
and is included in this patch series for reference only. The patches for
the perftest and dapl packages are not included in this series because
those changes have already been applied to the corresponding packages by
the maintainers.
[PATCH 1/3] s390/kernel: add system calls for access PCI memory
This patch contains the new system call implementation required for the PCI
I/O memory access from user space programs on s390x platform.
[PATCH 2/3] libibverbs: add support for the s390x platform
This patch contains the changes to the libibverbs user space library to
provide support of the s390x platform.
[PATCH 3/3] libmlx4: add support for the s390x platform
This patch contains the changes to the libmlx4 user space library intended
to provide the PCI I/O memory access on the s390x platform. The direct
access to mapped memory areas is replaced by appropriate system call
invocation.
Alexey Ishchuk (3):
s390/kernel: add system calls for access PCI memory
libibverbs: add s390x platform support
libmlx4: add s390x platform support
--
1.8.5.5
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[parent not found: <1412933657-52641-1-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>]
* [PATCH 1/3] s390/kernel: add system calls for access PCI memory
[not found] ` <1412933657-52641-1-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2014-10-10 9:34 ` Alexey Ishchuk
[not found] ` <1412933657-52641-2-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2014-10-10 9:34 ` [PATCH 2/3] libibverbs: add support for the s390x platform Alexey Ishchuk
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Alexey Ishchuk @ 2014-10-10 9:34 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
schwidefsky-tA70FqPdS9bQT0dZR+AlfA,
gmuelas-tA70FqPdS9bQT0dZR+AlfA, utz.bacher-tA70FqPdS9bQT0dZR+AlfA,
roland-DgEjT+Ai2ygdnm+yROfE0A, yishaih-VPRAkNaXOzVWk0Htik3J/w,
Alexey Ishchuk, Alexey Ishchuk

Add the new __NR_s390_pci_mmio_write and __NR_s390_pci_mmio_read
system calls to allow user space applications to access device PCI I/O
memory pages on s390x platform.

Signed-off-by: Alexey Ishchuk <alexey_ishchuk-JgLE2wv1ufrQT0dZR+AlfA@public.gmane.org>
---
 arch/s390/include/uapi/asm/unistd.h |   4 +-
 arch/s390/kernel/Makefile           |   1 +
 arch/s390/kernel/entry.h            |   4 +
 arch/s390/kernel/pci_mmio.c         | 207 ++++++++++++++++++++++++++++++++++++
 arch/s390/kernel/syscalls.S         |   2 +
 5 files changed, 217 insertions(+), 1 deletion(-)
 create mode 100644 arch/s390/kernel/pci_mmio.c

diff --git a/arch/s390/include/uapi/asm/unistd.h b/arch/s390/include/uapi/asm/unistd.h
index 3802d2d..ab49d1d 100644
--- a/arch/s390/include/uapi/asm/unistd.h
+++ b/arch/s390/include/uapi/asm/unistd.h
@@ -283,7 +283,9 @@
 #define __NR_sched_setattr	345
 #define __NR_sched_getattr	346
 #define __NR_renameat2		347
-#define NR_syscalls 348
+#define __NR_s390_pci_mmio_write	348
+#define __NR_s390_pci_mmio_read		349
+#define NR_syscalls 350

 /*
  * There are some system calls that are not present on 64 bit, some
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
index 33225e8..3e71b7e 100644
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -60,6 +60,7 @@ ifdef CONFIG_64BIT
 obj-$(CONFIG_PERF_EVENTS)	+= perf_event.o perf_cpum_cf.o perf_cpum_sf.o \
 				   perf_cpum_cf_events.o
 obj-y				+= runtime_instr.o cache.o
+obj-y				+= pci_mmio.o
 endif

 # vdso
diff --git a/arch/s390/kernel/entry.h b/arch/s390/kernel/entry.h
index 6ac7819..a36b6f9 100644
--- a/arch/s390/kernel/entry.h
+++ b/arch/s390/kernel/entry.h
@@ -70,4 +70,8 @@ struct old_sigaction;
 long sys_s390_personality(unsigned int personality);
 long sys_s390_runtime_instr(int command, int signum);

+long sys_s390_pci_mmio_write(const unsigned long mmio_addr,
+			     const void *user_buffer, const size_t length);
+long sys_s390_pci_mmio_read(const unsigned long mmio_addr,
+			    void *user_buffer, const size_t length);
 #endif /* _ENTRY_H */
diff --git a/arch/s390/kernel/pci_mmio.c b/arch/s390/kernel/pci_mmio.c
new file mode 100644
index 0000000..f318207
--- /dev/null
+++ b/arch/s390/kernel/pci_mmio.c
@@ -0,0 +1,207 @@
+/*
+ * Access to PCI I/O memory from user space programs.
+ *
+ * Copyright IBM Corp. 2014
+ * Author(s): Alexey Ishchuk <aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
+ */
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/errno.h>
+#include <linux/pci.h>
+
+union value_buffer {
+	u8 buf8;
+	u16 buf16;
+	u32 buf32;
+	u64 buf64;
+	u8 buf_large[64];
+};
+
+static long get_pfn(const unsigned long user_addr,
+		    const unsigned long access,
+		    unsigned long *pfn)
+{
+	struct vm_area_struct *vma = NULL;
+	long ret = 0L;
+
+	if (!pfn)
+		return -EINVAL;
+
+	down_read(&current->mm->mmap_sem);
+	vma = find_vma(current->mm, user_addr);
+	if (vma) {
+		if (!(vma->vm_flags & access))
+			ret = -EACCES;
+		else
+			ret = follow_pfn(vma, user_addr, pfn);
+	} else {
+		ret = -EINVAL;
+	}
+	up_read(&current->mm->mmap_sem);
+
+	return ret;
+}
+
+static inline int verify_page_addr(const unsigned long page_addr)
+{
+	return !(page_addr < ZPCI_IOMAP_ADDR_BASE ||
+		page_addr > (ZPCI_IOMAP_ADDR_BASE | ZPCI_IOMAP_ADDR_IDX_MASK));
+}
+
+static long choose_buffer(const size_t length,
+			  union value_buffer *value,
+			  void **buf)
+{
+	long ret = 0L;
+
+	if (length > sizeof(value->buf_large)) {
+		*buf = kmalloc(length, GFP_KERNEL);
+		if (!*buf)
+			return -ENOMEM;
+		ret = 1;
+	} else {
+		*buf = value->buf_large;
+	}
+	return ret;
+}
+
+SYSCALL_DEFINE3(s390_pci_mmio_write,
+		const unsigned long, mmio_addr,
+		const void __user *, user_buffer,
+		const size_t, length)
+{
+	long ret = 0L;
+	void *buf = NULL;
+	long buf_allocated = 0;
+	void __iomem *io_addr = NULL;
+	unsigned long pfn = 0UL;
+	unsigned long offset = 0UL;
+	unsigned long page_addr = 0UL;
+	union value_buffer value;
+
+	if (!length)
+		return -EINVAL;
+	if (!zpci_is_enabled())
+		return -ENODEV;
+
+	ret = get_pfn(mmio_addr, VM_WRITE, &pfn);
+	if (ret)
+		return ret;
+
+	page_addr = pfn << PAGE_SHIFT;
+	if (!verify_page_addr(page_addr))
+		return -EFAULT;
+
+	offset = mmio_addr & ~PAGE_MASK;
+	if (offset + length > PAGE_SIZE)
+		return -EINVAL;
+	io_addr = (void *)(page_addr | offset);
+
+	buf_allocated = choose_buffer(length, &value, &buf);
+	if (buf_allocated < 0L)
+		return -ENOMEM;
+
+	switch (length) {
+	case 1:
+		ret = get_user(value.buf8, ((u8 *)user_buffer));
+		break;
+	case 2:
+		ret = get_user(value.buf16, ((u16 *)user_buffer));
+		break;
+	case 4:
+		ret = get_user(value.buf32, ((u32 *)user_buffer));
+		break;
+	case 8:
+		ret = get_user(value.buf64, ((u64 *)user_buffer));
+		break;
+	default:
+		ret = copy_from_user(buf, user_buffer, length);
+	}
+	if (ret)
+		goto out;
+
+	switch (length) {
+	case 1:
+		__raw_writeb(value.buf8, io_addr);
+		break;
+	case 2:
+		__raw_writew(value.buf16, io_addr);
+		break;
+	case 4:
+		__raw_writel(value.buf32, io_addr);
+		break;
+	case 8:
+		__raw_writeq(value.buf64, io_addr);
+		break;
+	default:
+		memcpy_toio(io_addr, buf, length);
+	}
+out:
+	if (buf_allocated > 0L)
+		kfree(buf);
+	return ret;
+}
+
+SYSCALL_DEFINE3(s390_pci_mmio_read,
+		const unsigned long, mmio_addr,
+		void __user *, user_buffer,
+		const size_t, length)
+{
+	long ret = 0L;
+	void *buf = NULL;
+	long buf_allocated = 0L;
+	void __iomem *io_addr = NULL;
+	unsigned long pfn = 0UL;
+	unsigned long offset = 0UL;
+	unsigned long page_addr = 0UL;
+	union value_buffer value;
+
+	if (!length)
+		return -EINVAL;
+	if (!zpci_is_enabled())
+		return -ENODEV;
+
+	ret = get_pfn(mmio_addr, VM_READ, &pfn);
+	if (ret)
+		return ret;
+
+	page_addr = pfn << PAGE_SHIFT;
+	if (!verify_page_addr(page_addr))
+		return -EFAULT;
+
+	offset = mmio_addr & ~PAGE_MASK;
+	if (offset + length > PAGE_SIZE)
+		return -EINVAL;
+	io_addr = (void *)(page_addr | offset);
+
+	buf_allocated = choose_buffer(length, &value, &buf);
+	if (buf_allocated < 0L)
+		return -ENOMEM;
+
+	switch (length) {
+	case 1:
+		value.buf8 = __raw_readb(io_addr);
+		ret = put_user(value.buf8, ((u8 *)user_buffer));
+		break;
+	case 2:
+		value.buf16 = __raw_readw(io_addr);
+		ret = put_user(value.buf16, ((u16 *)user_buffer));
+		break;
+	case 4:
+		value.buf32 = __raw_readl(io_addr);
+		ret = put_user(value.buf32, ((u32 *)user_buffer));
+		break;
+	case 8:
+		value.buf64 = __raw_readq(io_addr);
+		ret = put_user(value.buf64, ((u64 *)user_buffer));
+		break;
+	default:
+		memcpy_fromio(buf, io_addr, length);
+		ret = copy_to_user(user_buffer, buf, length);
+	}
+	if (buf_allocated > 0L)
+		kfree(buf);
+	return ret;
+}
diff --git a/arch/s390/kernel/syscalls.S b/arch/s390/kernel/syscalls.S
index fe5cdf2..1faa942 100644
--- a/arch/s390/kernel/syscalls.S
+++ b/arch/s390/kernel/syscalls.S
@@ -356,3 +356,5 @@ SYSCALL(sys_finit_module,sys_finit_module,compat_sys_finit_module)
 SYSCALL(sys_sched_setattr,sys_sched_setattr,compat_sys_sched_setattr) /* 345 */
 SYSCALL(sys_sched_getattr,sys_sched_getattr,compat_sys_sched_getattr)
 SYSCALL(sys_renameat2,sys_renameat2,compat_sys_renameat2)
+SYSCALL(sys_ni_syscall,sys_s390_pci_mmio_write,sys_ni_syscall)
+SYSCALL(sys_ni_syscall,sys_s390_pci_mmio_read,sys_ni_syscall)
--
1.8.5.5
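Both system calls in the patch above reject a request whose offset plus
length would run past the end of the mapped page. A caller could apply the
same rule before issuing the call. The sketch below is only an illustration
under stated assumptions: the function name mmio_span_ok() is hypothetical,
and the page size is passed in explicitly (for example from
sysconf(_SC_PAGESIZE)) instead of coming from kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if [addr, addr + len) stays within a
 * single page and len is non-zero, otherwise 0. page_size must be a
 * power of two.
 */
static inline int mmio_span_ok(unsigned long addr, size_t len,
			       unsigned long page_size)
{
	unsigned long offset;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* Same value as "mmio_addr & ~PAGE_MASK" in the kernel patch. */
	offset = addr & (page_size - 1);

	if (len == 0)
		return 0;	/* the kernel returns -EINVAL for length 0 */

	/* Equivalent to !(offset + len > page_size), without overflow. */
	return len <= page_size - offset;
}
```

Checking in user space does not replace the kernel check. It only lets a
library split a large copy into page-sized pieces before calling into the
kernel.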
[parent not found: <1412933657-52641-2-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>]
* RE: [PATCH 1/3] s390/kernel: add system calls for access PCI memory
[not found] ` <1412933657-52641-2-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2014-10-12 11:52 ` Shachar Raindel
[not found] ` <f061a36c713c42c9b71530183a6e8644-Vl31pUvGNwELId+1UC+8EGu6+pknBqLbXA4E9RH9d+qIuWR1G4zioA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Shachar Raindel @ 2014-10-12 11:52 UTC (permalink / raw)
To: Alexey Ishchuk, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org,
schwidefsky-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org,
gmuelas-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org,
utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org,
roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org, Yishai Hadas,
Alexey Ishchuk

> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Alexey Ishchuk
> Sent: Friday, October 10, 2014 12:34 PM
> To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org; schwidefsky-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org;
> gmuelas-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org; utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org; roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org; Yishai
> Hadas; Alexey Ishchuk; Alexey Ishchuk
> Subject: [PATCH 1/3] s390/kernel: add system calls for access PCI memory
>
> Add the new __NR_s390_pci_mmio_write and __NR_s390_pci_mmio_read
> system calls to allow user space applications to access device PCI I/O
> memory pages on s390x platform.
>
> Signed-off-by: Alexey Ishchuk <alexey_ishchuk-JgLE2wv1ufrQT0dZR+AlfA@public.gmane.org>
> ---
>  arch/s390/include/uapi/asm/unistd.h |   4 +-
>  arch/s390/kernel/Makefile           |   1 +
>  arch/s390/kernel/entry.h            |   4 +
>  arch/s390/kernel/pci_mmio.c         | 207
> ++++++++++++++++++++++++++++++++++++
>  arch/s390/kernel/syscalls.S         |   2 +
>  5 files changed, 217 insertions(+), 1 deletion(-)
>  create mode 100644 arch/s390/kernel/pci_mmio.c
>
> diff --git a/arch/s390/include/uapi/asm/unistd.h
> b/arch/s390/include/uapi/asm/unistd.h
> index 3802d2d..ab49d1d 100644
> --- a/arch/s390/include/uapi/asm/unistd.h
> +++ b/arch/s390/include/uapi/asm/unistd.h
> @@ -283,7 +283,9 @@
>  #define __NR_sched_setattr	345
>  #define __NR_sched_getattr	346
>  #define __NR_renameat2		347
> -#define NR_syscalls 348
> +#define __NR_s390_pci_mmio_write	348
> +#define __NR_s390_pci_mmio_read		349
> +#define NR_syscalls 350
>
>  /*
>   * There are some system calls that are not present on 64 bit, some
> diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
> index 33225e8..3e71b7e 100644
> --- a/arch/s390/kernel/Makefile
> +++ b/arch/s390/kernel/Makefile
> @@ -60,6 +60,7 @@ ifdef CONFIG_64BIT
>  obj-$(CONFIG_PERF_EVENTS)	+= perf_event.o perf_cpum_cf.o perf_cpum_sf.o
> \
>  				   perf_cpum_cf_events.o
>  obj-y				+= runtime_instr.o cache.o
> +obj-y				+= pci_mmio.o
>  endif
>
>  # vdso
> diff --git a/arch/s390/kernel/entry.h b/arch/s390/kernel/entry.h
> index 6ac7819..a36b6f9 100644
> --- a/arch/s390/kernel/entry.h
> +++ b/arch/s390/kernel/entry.h
> @@ -70,4 +70,8 @@ struct old_sigaction;
>  long sys_s390_personality(unsigned int personality);
>  long sys_s390_runtime_instr(int command, int signum);
>
> +long sys_s390_pci_mmio_write(const unsigned long mmio_addr,
> +			     const void *user_buffer, const size_t length);
> +long sys_s390_pci_mmio_read(const unsigned long mmio_addr,
> +			    void *user_buffer, const size_t length);
>  #endif /* _ENTRY_H */
> diff --git a/arch/s390/kernel/pci_mmio.c b/arch/s390/kernel/pci_mmio.c
> new file mode 100644
> index 0000000..f318207
> --- /dev/null
> +++ b/arch/s390/kernel/pci_mmio.c
> @@ -0,0 +1,207 @@
> +/*
> + * Access to PCI I/O memory from user space programs.
> + *
> + * Copyright IBM Corp. 2014
> + * Author(s): Alexey Ishchuk <aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> + */
> +#include <linux/kernel.h>
> +#include <linux/syscalls.h>
> +#include <linux/init.h>
> +#include <linux/mm.h>
> +#include <linux/errno.h>
> +#include <linux/pci.h>
> +
> +union value_buffer {
> +	u8 buf8;
> +	u16 buf16;
> +	u32 buf32;
> +	u64 buf64;
> +	u8 buf_large[64];
> +};
> +
> +static long get_pfn(const unsigned long user_addr,
> +		    const unsigned long access,
> +		    unsigned long *pfn)
> +{
> +	struct vm_area_struct *vma = NULL;
> +	long ret = 0L;
> +
> +	if (!pfn)
> +		return -EINVAL;
> +
> +	down_read(&current->mm->mmap_sem);
> +	vma = find_vma(current->mm, user_addr);
> +	if (vma) {
> +		if (!(vma->vm_flags & access))
> +			ret = -EACCES;
> +		else
> +			ret = follow_pfn(vma, user_addr, pfn);
> +	} else {
> +		ret = -EINVAL;
> +	}
> +	up_read(&current->mm->mmap_sem);
> +
> +	return ret;
> +}
> +
> +static inline int verify_page_addr(const unsigned long page_addr)
> +{
> +	return !(page_addr < ZPCI_IOMAP_ADDR_BASE ||
> +		page_addr > (ZPCI_IOMAP_ADDR_BASE | ZPCI_IOMAP_ADDR_IDX_MASK));
> +}
> +
> +static long choose_buffer(const size_t length,
> +			  union value_buffer *value,
> +			  void **buf)
> +{
> +	long ret = 0L;
> +
> +	if (length > sizeof(value->buf_large)) {
> +		*buf = kmalloc(length, GFP_KERNEL);
> +		if (!*buf)
> +			return -ENOMEM;
> +		ret = 1;
> +	} else {
> +		*buf = value->buf_large;
> +	}
> +	return ret;
> +}
> +
> +SYSCALL_DEFINE3(s390_pci_mmio_write,
> +		const unsigned long, mmio_addr,
> +		const void __user *, user_buffer,
> +		const size_t, length)
> +{
> +	long ret = 0L;
> +	void *buf = NULL;
> +	long buf_allocated = 0;
> +	void __iomem *io_addr = NULL;
> +	unsigned long pfn = 0UL;
> +	unsigned long offset = 0UL;
> +	unsigned long page_addr = 0UL;
> +	union value_buffer value;
> +
> +	if (!length)
> +		return -EINVAL;
> +	if (!zpci_is_enabled())
> +		return -ENODEV;
> +
> +	ret = get_pfn(mmio_addr, VM_WRITE, &pfn);
> +	if (ret)
> +		return ret;
> +
> +	page_addr = pfn << PAGE_SHIFT;
> +	if (!verify_page_addr(page_addr))
> +		return -EFAULT;
> +
> +	offset = mmio_addr & ~PAGE_MASK;
> +	if (offset + length > PAGE_SIZE)
> +		return -EINVAL;
> +	io_addr = (void *)(page_addr | offset);
> +
> +	buf_allocated = choose_buffer(length, &value, &buf);
> +	if (buf_allocated < 0L)
> +		return -ENOMEM;
> +
> +	switch (length) {
> +	case 1:
> +		ret = get_user(value.buf8, ((u8 *)user_buffer));

This cast (and similar casts across the code) kills the __user
annotation of the user buffer pointer.
First - fix this to help various static verification tools such
as sparse work on your code.
Second - are you sure this switch-case block achieves any
performance gain compared to always using copy_from_user?
If so, why not just push it into the S390 copy from user code?

> +		break;
> +	case 2:
> +		ret = get_user(value.buf16, ((u16 *)user_buffer));
> +		break;
> +	case 4:
> +		ret = get_user(value.buf32, ((u32 *)user_buffer));
> +		break;
> +	case 8:
> +		ret = get_user(value.buf64, ((u64 *)user_buffer));
> +		break;
> +	default:
> +		ret = copy_from_user(buf, user_buffer, length);
> +	}
> +	if (ret)
> +		goto out;
> +
> +	switch (length) {
> +	case 1:
> +		__raw_writeb(value.buf8, io_addr);
> +		break;
> +	case 2:
> +		__raw_writew(value.buf16, io_addr);
> +		break;
> +	case 4:
> +		__raw_writel(value.buf32, io_addr);
> +		break;
> +	case 8:
> +		__raw_writeq(value.buf64, io_addr);
> +		break;
> +	default:
> +		memcpy_toio(io_addr, buf, length);
> +	}
> +out:
> +	if (buf_allocated > 0L)
> +		kfree(buf);
> +	return ret;
> +}
> +
> +SYSCALL_DEFINE3(s390_pci_mmio_read,
> +		const unsigned long, mmio_addr,
> +		void __user *, user_buffer,
> +		const size_t, length)
> +{
> +	long ret = 0L;
> +	void *buf = NULL;
> +	long buf_allocated = 0L;
> +	void __iomem *io_addr = NULL;
> +	unsigned long pfn = 0UL;
> +	unsigned long offset = 0UL;
> +	unsigned long page_addr = 0UL;
> +	union value_buffer value;
> +
> +	if (!length)
> +		return -EINVAL;
> +	if (!zpci_is_enabled())
> +		return -ENODEV;
> +
> +	ret = get_pfn(mmio_addr, VM_READ, &pfn);
> +	if (ret)
> +		return ret;
> +
> +	page_addr = pfn << PAGE_SHIFT;
> +	if (!verify_page_addr(page_addr))
> +		return -EFAULT;
> +
> +	offset = mmio_addr & ~PAGE_MASK;
> +	if (offset + length > PAGE_SIZE)
> +		return -EINVAL;
> +	io_addr = (void *)(page_addr | offset);
> +
> +	buf_allocated = choose_buffer(length, &value, &buf);
> +	if (buf_allocated < 0L)
> +		return -ENOMEM;
> +
> +	switch (length) {
> +	case 1:
> +		value.buf8 = __raw_readb(io_addr);
> +		ret = put_user(value.buf8, ((u8 *)user_buffer));

Add __user annotations in this code block as well.

> +		break;
> +	case 2:
> +		value.buf16 = __raw_readw(io_addr);
> +		ret = put_user(value.buf16, ((u16 *)user_buffer));
> +		break;
> +	case 4:
> +		value.buf32 = __raw_readl(io_addr);
> +		ret = put_user(value.buf32, ((u32 *)user_buffer));
> +		break;
> +	case 8:
> +		value.buf64 = __raw_readq(io_addr);
> +		ret = put_user(value.buf64, ((u64 *)user_buffer));
> +		break;
> +	default:
> +		memcpy_fromio(buf, io_addr, length);
> +		ret = copy_to_user(user_buffer, buf, length);
> +	}
> +	if (buf_allocated > 0L)
> +		kfree(buf);
> +	return ret;
> +}
> diff --git a/arch/s390/kernel/syscalls.S b/arch/s390/kernel/syscalls.S
> index fe5cdf2..1faa942 100644
> --- a/arch/s390/kernel/syscalls.S
> +++ b/arch/s390/kernel/syscalls.S
> @@ -356,3 +356,5 @@
> SYSCALL(sys_finit_module,sys_finit_module,compat_sys_finit_module)
>  SYSCALL(sys_sched_setattr,sys_sched_setattr,compat_sys_sched_setattr)
> /* 345 */
>  SYSCALL(sys_sched_getattr,sys_sched_getattr,compat_sys_sched_getattr)
>  SYSCALL(sys_renameat2,sys_renameat2,compat_sys_renameat2)
> +SYSCALL(sys_ni_syscall,sys_s390_pci_mmio_write,sys_ni_syscall)
> +SYSCALL(sys_ni_syscall,sys_s390_pci_mmio_read,sys_ni_syscall)

Generally speaking, looks OK once the __user annotation is added.

I suspect you might need ack/review from the S390 maintainer as
well for this to be pushed, as the syscall is generic to the
entire S390 subsystem.

--Shachar
[parent not found: <f061a36c713c42c9b71530183a6e8644-Vl31pUvGNwELId+1UC+8EGu6+pknBqLbXA4E9RH9d+qIuWR1G4zioA@public.gmane.org>]
* Re: [PATCH 1/3] s390/kernel: add system calls for access PCI memory
[not found] ` <f061a36c713c42c9b71530183a6e8644-Vl31pUvGNwELId+1UC+8EGu6+pknBqLbXA4E9RH9d+qIuWR1G4zioA@public.gmane.org>
@ 2014-10-13 8:39 ` Martin Schwidefsky
0 siblings, 0 replies; 12+ messages in thread
From: Martin Schwidefsky @ 2014-10-13 8:39 UTC (permalink / raw)
To: Shachar Raindel
Cc: Alexey Ishchuk, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org,
gmuelas-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org,
utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org,
roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org, Yishai Hadas,
Alexey Ishchuk

On Sun, 12 Oct 2014 11:52:55 +0000
Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:

> > +	switch (length) {
> > +	case 1:
> > +		ret = get_user(value.buf8, ((u8 *)user_buffer));
>
> This cast (and similar casts across the code) kills the __user
> annotation of the user buffer pointer.
> First - fix this to help various static verification tools such
> as sparse work on your code.
> Second - are you sure this switch-case block achieves any
> performance gain compared to always using copy_from_user?
> If so, why not just push it into the S390 copy from user code?

The __user annotation is indeed missing. Whether the switch improves
performance remains to be seen: with the compile options set for z10
the get_user is inlined while the copy_from_user calls a function.
For compiles < z10 all 5 switch cases will call the same
__copy_from_user function. So it depends; as long as the switch is
correct I am ok with the code block for now.

> > +	switch (length) {
> > +	case 1:
> > +		value.buf8 = __raw_readb(io_addr);
> > +		ret = put_user(value.buf8, ((u8 *)user_buffer));
>
> Add __user annotations in this code block as well.

Yes, please add.

> Generally speaking, looks OK once the __user annotation is added.
>
> I suspect you might need ack/review from the S390 maintainer as
> well for this to be pushed, as the syscall is generic to the
> entire S390 subsystem.

With the missing __user annotations added:
Acked-by: Martin Schwidefsky <schwidefsky-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
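The annotation both reviewers ask for amounts to keeping __user in the cast.
The fragment below is a sketch of that idea, not the revised patch. It
compiles in user space only because __user is defined away and get_user() is
replaced by a stand-in macro. Both are assumptions made for illustration; in
the kernel they come from the kernel headers, and the function name
fetch_u8() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: outside the kernel there is no __user, so define it away. */
#ifndef __user
#define __user
#endif

/*
 * Stand-in for the kernel's get_user(): stores *ptr in x and evaluates
 * to 0. The real macro can also fail with -EFAULT.
 */
#define get_user(x, ptr) ((x) = *(ptr), 0)

/* Hypothetical helper showing a cast that keeps the annotation. */
static inline long fetch_u8(uint8_t *dst, const void __user *user_buffer)
{
	assert(dst != NULL);

	/*
	 * The cast names __user explicitly, so a checker such as sparse
	 * still sees a user pointer. A plain (u8 *) cast would drop it.
	 */
	return get_user(*dst, (const uint8_t __user *)user_buffer);
}
```

The same pattern would apply to the u16, u32 and u64 cases and to the
put_user() calls in the read path.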
* [PATCH 2/3] libibverbs: add support for the s390x platform
[not found] ` <1412933657-52641-1-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2014-10-10 9:34 ` [PATCH 1/3] s390/kernel: add system calls for access PCI memory Alexey Ishchuk
@ 2014-10-10 9:34 ` Alexey Ishchuk
2014-10-10 9:34 ` [PATCH 3/3] libmlx4: " Alexey Ishchuk
2014-11-05 15:04 ` [PATCH 0/3] DAPL support on " Utz Bacher
3 siblings, 0 replies; 12+ messages in thread
From: Alexey Ishchuk @ 2014-10-10 9:34 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
schwidefsky-tA70FqPdS9bQT0dZR+AlfA,
gmuelas-tA70FqPdS9bQT0dZR+AlfA, utz.bacher-tA70FqPdS9bQT0dZR+AlfA,
roland-DgEjT+Ai2ygdnm+yROfE0A, yishaih-VPRAkNaXOzVWk0Htik3J/w,
Alexey Ishchuk

This patch adds the required platform specific code to allow
execution of the libibverbs functions on the s390x platform.

Signed-off-by: Alexey Ishchuk <aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 include/infiniband/arch.h | 7 +++++++
 1 file changed, 7 insertions(+)

--- a/include/infiniband/arch.h
+++ b/include/infiniband/arch.h
@@ -115,6 +115,13 @@ static inline uint64_t ntohll(uint64_t x
 #define wmb()	 mb()
 #define wc_wmb() wmb()

+#elif defined(__s390x__)
+
+#define mb()	 { asm volatile("" : : : "memory"); }	/* for s390x */
+#define rmb()	 mb()					/* for s390x */
+#define wmb()	 mb()					/* for s390x */
+#define wc_wmb() wmb()					/* for s390x */
+
 #else

 #warning No architecture specific defines found.  Using generic implementation.
* [PATCH 3/3] libmlx4: add support for the s390x platform [not found] ` <1412933657-52641-1-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> 2014-10-10 9:34 ` [PATCH 1/3] s390/kernel: add system calls for access PCI memory Alexey Ishchuk 2014-10-10 9:34 ` [PATCH 2/3] libibverbs: add support for the s390x platform Alexey Ishchuk @ 2014-10-10 9:34 ` Alexey Ishchuk 2014-11-05 15:04 ` [PATCH 0/3] DAPL support on " Utz Bacher 3 siblings, 0 replies; 12+ messages in thread From: Alexey Ishchuk @ 2014-10-10 9:34 UTC (permalink / raw) To: linux-rdma-u79uwXL29TY76Z2rM5mHXA Cc: blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, schwidefsky-tA70FqPdS9bQT0dZR+AlfA, gmuelas-tA70FqPdS9bQT0dZR+AlfA, utz.bacher-tA70FqPdS9bQT0dZR+AlfA, roland-DgEjT+Ai2ygdnm+yROfE0A, yishaih-VPRAkNaXOzVWk0Htik3J/w, Alexey Ishchuk Since, s390x platform requires execution of privileged CPU instructions to work with PCI I/O memory, the PCI I/O memory cannot be directly accessed from the userspace programs via the mapped memory areas. The current implementation of the Inifiniband verbs uses mapped memory areas to write data to device UAR and Blueflame page to initiate the I/O operations, these verbs currently cannot be used on the s390x platfrom without modification. This patch contains the changes to the libmlx4 userspace Mellanox device driver library required to provide support for the DAPL API on the s390x platform. The original code that directly used mapped memory areas to access the PCI I/O memory of the Mellanox networking device is replaced with the new system call invocation for writing the data to mapped memory areas. 
Signed-off-by: Alexey Ishchuk <aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 Makefile.am    |    2
 Makefile.in    |    2
 src/doorbell.h |    8 ++-
 src/mlx4.h     |    2
 src/mmio.h     |  115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/qp.c       |   17 --------
 6 files changed, 126 insertions(+), 20 deletions(-)

--- a/Makefile.am
+++ b/Makefile.am
@@ -12,7 +12,7 @@ src_libmlx4_la_LDFLAGS = -avoid-version
 mlx4confdir = $(sysconfdir)/libibverbs.d
 mlx4conf_DATA = mlx4.driver
 
-EXTRA_DIST = src/doorbell.h src/mlx4.h src/mlx4-abi.h src/wqe.h \
+EXTRA_DIST = src/doorbell.h src/mlx4.h src/mlx4-abi.h src/wqe.h src/mmio.h \
     src/mlx4.map libmlx4.spec.in mlx4.driver
 
 dist-hook: libmlx4.spec
--- a/Makefile.in
+++ b/Makefile.in
@@ -353,7 +353,7 @@ src_libmlx4_la_LDFLAGS = -avoid-version
 mlx4confdir = $(sysconfdir)/libibverbs.d
 mlx4conf_DATA = mlx4.driver
 
-EXTRA_DIST = src/doorbell.h src/mlx4.h src/mlx4-abi.h src/wqe.h \
+EXTRA_DIST = src/doorbell.h src/mlx4.h src/mlx4-abi.h src/wqe.h src/mmio.h \
     src/mlx4.map libmlx4.spec.in mlx4.driver
 
 all: config.h
--- a/src/doorbell.h
+++ b/src/doorbell.h
@@ -33,6 +33,8 @@
 #ifndef DOORBELL_H
 #define DOORBELL_H
 
+#include "mmio.h"
+
 #if SIZEOF_LONG == 8
 
 #if __BYTE_ORDER == __LITTLE_ENDIAN
@@ -45,7 +47,7 @@
 
 static inline void mlx4_write64(uint32_t val[2], struct mlx4_context *ctx, int offset)
 {
-	*(volatile uint64_t *) (ctx->uar + offset) = MLX4_PAIR_TO_64(val);
+	mmio_writeq((unsigned long)(ctx->uar + offset), MLX4_PAIR_TO_64(val));
 }
 
 #else
@@ -53,8 +55,8 @@ static inline void mlx4_write64(uint32_t
 static inline void mlx4_write64(uint32_t val[2], struct mlx4_context *ctx, int offset)
 {
 	pthread_spin_lock(&ctx->uar_lock);
-	*(volatile uint32_t *) (ctx->uar + offset) = val[0];
-	*(volatile uint32_t *) (ctx->uar + offset + 4) = val[1];
+	mmio_writel((unsigned long)(ctx->uar + offset), val[0]);
+	mmio_writel((unsigned long)(ctx->uar + offset + 4), val[1]);
 	pthread_spin_unlock(&ctx->uar_lock);
 }
 
--- a/src/mlx4.h
+++ b/src/mlx4.h
@@ -74,6 +74,8 @@
 #define wc_wmb() asm volatile("sfence" ::: "memory")
 #elif defined(__ia64__)
 #define wc_wmb() asm volatile("fwb" ::: "memory")
+#elif defined(__s390x__)
+#define wc_wmb() asm volatile("" : : : "memory")
 #else
 #define wc_wmb() wmb()
 #endif
--- /dev/null
+++ b/src/mmio.h
@@ -0,0 +1,115 @@
+#ifndef MMIO_H
+#define MMIO_H
+
+#include <unistd.h>
+#include <asm/unistd.h>
+#include <sys/syscall.h>
+#ifdef __s390x__
+
+static inline long mmio_writeb(const unsigned long mmio_addr,
+			       const uint8_t val)
+{
+	return syscall(__NR_s390_pci_mmio_write, mmio_addr, &val, sizeof(val));
+}
+
+static inline long mmio_writew(const unsigned long mmio_addr,
+			       const uint16_t val)
+{
+	return syscall(__NR_s390_pci_mmio_write, mmio_addr, &val, sizeof(val));
+}
+
+static inline long mmio_writel(const unsigned long mmio_addr,
+			       const uint32_t val)
+{
+	return syscall(__NR_s390_pci_mmio_write, mmio_addr, &val, sizeof(val));
+}
+
+static inline long mmio_writeq(const unsigned long mmio_addr,
+			       const uint64_t val)
+{
+	return syscall(__NR_s390_pci_mmio_write, mmio_addr, &val, sizeof(val));
+}
+
+static inline long mmio_write(const unsigned long mmio_addr,
+			      const void *val,
+			      const size_t length)
+{
+	return syscall(__NR_s390_pci_mmio_write, mmio_addr, val, length);
+}
+
+static inline long mmio_readb(const unsigned long mmio_addr, uint8_t *val)
+{
+	return syscall(__NR_s390_pci_mmio_read, mmio_addr, val, sizeof(*val));
+}
+
+static inline long mmio_readw(const unsigned long mmio_addr, uint16_t *val)
+{
+	return syscall(__NR_s390_pci_mmio_read, mmio_addr, val, sizeof(*val));
+}
+
+static inline long mmio_readl(const unsigned long mmio_addr, uint32_t *val)
+{
+	return syscall(__NR_s390_pci_mmio_read, mmio_addr, val, sizeof(*val));
+}
+
+static inline long mmio_readq(const unsigned long mmio_addr, uint64_t *val)
+{
+	return syscall(__NR_s390_pci_mmio_read, mmio_addr, val, sizeof(*val));
+}
+
+static inline long mmio_read(const unsigned long mmio_addr,
+			     void *val,
+			     const size_t length)
+{
+	return syscall(__NR_s390_pci_mmio_read, mmio_addr, val, length);
+}
+
+static inline void mlx4_bf_copy(unsigned long *dst,
+				unsigned long *src,
+				unsigned bytecnt)
+{
+	mmio_write((unsigned long)dst, src, bytecnt);
+}
+
+#else
+
+#define mmio_writeb(addr, value) \
+	(*((volatile uint8_t *)addr) = value)
+#define mmio_writew(addr, value) \
+	(*((volatile uint16_t *)addr) = value)
+#define mmio_writel(addr, value) \
+	(*((volatile uint32_t *)addr) = value)
+#define mmio_writeq(addr, value) \
+	(*((volatile uint64_t *)addr) = value)
+#define mmio_write(addr, value, length) \
+	memcpy(addr, value, length)
+
+#define mmio_readb(addr, value) \
+	(value = *((volatile uint8_t *)addr))
+#define mmio_readw(addr, value) \
+	(value = *((volatile uint16_t *)addr))
+#define mmio_readl(addr, value) \
+	(value = *((volatile uint32_t *)addr))
+#define mmio_readq(addr, value) \
+	(value = *((volatile uint64_t *)addr))
+#define mmio_read(addr, value, length) \
+	memcpy(value, addr, length)
+
+/*
+ * Avoid using memcpy() to copy to BlueFlame page, since memcpy()
+ * implementations may use move-string-buffer assembler instructions,
+ * which do not guarantee order of copying.
+ */
+static inline void mlx4_bf_copy(unsigned long *dst,
+				unsigned long *src,
+				unsigned bytecnt)
+{
+	while (bytecnt > 0) {
+		*dst++ = *src++;
+		*dst++ = *src++;
+		bytecnt -= 2 * sizeof(long);
+	}
+}
+#endif
+
+#endif
--- a/src/qp.c
+++ b/src/qp.c
@@ -173,20 +173,6 @@ static void set_data_seg(struct mlx4_wqe
 	dseg->byte_count = htonl(sg->length);
 }
 
-/*
- * Avoid using memcpy() to copy to BlueFlame page, since memcpy()
- * implementations may use move-string-buffer assembler instructions,
- * which do not guarantee order of copying.
- */
-static void mlx4_bf_copy(unsigned long *dst, unsigned long *src, unsigned bytecnt)
-{
-	while (bytecnt > 0) {
-		*dst++ = *src++;
-		*dst++ = *src++;
-		bytecnt -= 2 * sizeof (long);
-	}
-}
-
 int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 			  struct ibv_send_wr **bad_wr)
 {
@@ -434,7 +420,8 @@ out:
 		 */
 		wmb();
 
-		*(uint32_t *) (ctx->uar + MLX4_SEND_DOORBELL) = qp->doorbell_qpn;
+		mmio_writel((unsigned long)(ctx->uar + MLX4_SEND_DOORBELL),
+			    qp->doorbell_qpn);
 	}
 
 	if (nreq)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
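The non-s390x mlx4_bf_copy() in the patch above moves two words per iteration and decrements an unsigned byte count, so it only terminates cleanly when the count is a multiple of 2 * sizeof(long). A minimal sketch of that loop with the precondition made explicit follows. The name demo_bf_copy() and the assert() guard are illustrative assumptions, not part of libmlx4, and the volatile qualifier only keeps the compiler from reordering or merging the stores; it says nothing about what the hardware does with them.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical illustration of the word-by-word BlueFlame copy.
 * demo_bf_copy() is not a libmlx4 function.
 */
static inline void demo_bf_copy(volatile unsigned long *dst,
				const unsigned long *src, size_t bytecnt)
{
	/* The loop consumes two words per pass; any other size would
	 * make the unsigned counter wrap around instead of reaching 0. */
	assert(bytecnt % (2 * sizeof(long)) == 0);

	while (bytecnt > 0) {
		*dst++ = *src++;	/* stores are issued in ascending order */
		*dst++ = *src++;
		bytecnt -= 2 * sizeof(long);
	}
}
```

A caller would pass a byte count such as 64, which is a multiple of 2 * sizeof(long) on both 32-bit and 64-bit builds.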
* Re: [PATCH 0/3] DAPL support on s390x platform
[not found] ` <1412933657-52641-1-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
` (2 preceding siblings ...)
2014-10-10 9:34 ` [PATCH 3/3] libmlx4: " Alexey Ishchuk
@ 2014-11-05 15:04 ` Utz Bacher
[not found] ` <OF2939B020.E253C5B0-ONC1257D87.0050AE0F-C1257D87.0052C3AA-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
3 siblings, 1 reply; 12+ messages in thread
From: Utz Bacher @ 2014-11-05 15:04 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A
Cc: Alexey Ishchuk, blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Gonzalo Muelas Serrano, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
yishaih-VPRAkNaXOzVWk0Htik3J/w

Hi Roland,

Alexey Ishchuk <aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote on 10.10.2014 11:34:14:
> This patch series contains the changes to kernel and user space libraries
> required to provide support for the DAPL API on s390x platform.
[...]
> [PATCH 1/3] s390/kernel: add system calls for access PCI memory
> This patch contains the new system call implementation required for the PCI
> I/O memory access from user space programs on s390x platform.
> [PATCH 2/3] libibverbs: add support for the s390x platform
> This patch contains the changes to the libibverbs user space library to
> provide support of the s390x platform.
> [PATCH 3/3] libmlx4: add support for the s390x platform
> This patch contains the changes to the libmlx4 user space library intended
> to provide the PCI I/O memory access on the s390x platform. The direct
> access to mapped memory areas is replaced by appropriate system call
> invocation.

we have continued to work on this and every player seems to be willing to
proceed if you nod as the last piece in the chain for the overall patch
set, too. Let me wrap up motivation of this approach (A) and status of the
individual patches (B):

(A) Motivation
PCI-Express support has been added to System z (s390x) a while ago. This is
an implementation according to all the standards, but the memory subsystem
does not implement real MMIO with arbitrary memory writes triggering PCI
activity. Instead, a privileged instruction needs to be used to write to
mapped memory.

Emulating MMIO via a page fault handler is something we discussed for a
while, but eventually, a correct implementation is very expensive: System z
has got a rich instruction set (CISC) with ~1000 instructions and over a
third of those touch memory, and some of them very much live up to the C in
CISC. To have a correct implementation, all instructions touching memory
would have to be emulated in case of a page fault, even if in *most* cases,
just a few will be used. And we keep changing the code as the platform
evolves.

The intention is to keep any changes as minimally intrusive and clean as
possible. Therefore small helper functions ("static inline long mmio_writel
()") in libmlx4 provide an abstraction of MMIO access (similar to the
kernel approach). These functions are no change (even in the binary) on
most platforms, but work against a system call on s390x.

BTW, for system integrity reasons, z firmware makes only whitelisted PCI
devices available to the Operating System level. Therefore, the Mellanox
RoCE adapter is the only NIC currently supported by System z, and this is
the reason we only touch libmlx4 with the mmio abstraction.

(B) Status of patches
1. kernel code -- the new system call: reviewed, acked and accepted by the
s390x maintainer Martin Schwidefsky (2014/10/13), so we will have that
system call in the s390x kernel.
2. libibverbs -- define barriers on s390x: Looking for your feedback. We
understand there have been no general objections so far.
3. libmlx4 -- provide MMIO abstraction: reviewed by the Mellanox
maintainers and we understand they would apply this once you give the go
for the overall set.
Previously, a patch to DAPL to build on s390x has been accepted already
(Arlin Davis, 2014/09/02).

We gave your concern on MMIO handling on s390x serious consideration from
various angles, but the page fault handler does not appear workable. OTOH,
Mellanox is fine with the MMIO abstraction in libmlx4, and we didn't hear
of significant other concerns. With that, could you please consider the
patch set again to add s390x to the list of supported platforms? Happy to
repost the patches for convenience.

Thanks,
Utz
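The helper-function abstraction described in the message above can be sketched as a single inline function with two build-time paths. This is only an illustration under stated assumptions: demo_mmio_writel() is a hypothetical name, the pointer-typed first argument differs from the unsigned long used in the posted mmio.h, and the s390x branch assumes the kernel headers export __NR_s390_pci_mmio_write with the (address, buffer, length) argument order shown in the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical illustration of the MMIO write abstraction.
 * demo_mmio_writel() is not a libmlx4 function.
 */
static inline long demo_mmio_writel(volatile void *mmio_addr, uint32_t val)
{
	assert(mmio_addr != NULL);
#if defined(__s390x__) && defined(__NR_s390_pci_mmio_write)
	/* s390x: the kernel performs the privileged PCI store on our behalf. */
	return syscall(__NR_s390_pci_mmio_write,
		       (unsigned long)mmio_addr, &val, sizeof(val));
#else
	/* Other platforms: a plain volatile store to the mapped page. */
	*(volatile uint32_t *)mmio_addr = val;
	return 0;
#endif
}
```

On a non-s390x build the function reduces to the same store the driver issued before, which is the sense in which the message calls the helpers "no change" there. Off s390x, an ordinary variable can stand in for the doorbell register when trying the sketch out.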
* Re: [PATCH 0/3] DAPL support on s390x platform
[not found] ` <OF2939B020.E253C5B0-ONC1257D87.0050AE0F-C1257D87.0052C3AA-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
@ 2014-11-11 23:22 ` Roland Dreier
[not found] ` <CAL1RGDVLeNCAXrrq8mEhKxcm_z_xwgLUfO_wzhQE9xJz8Lnq7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Roland Dreier @ 2014-11-11 23:22 UTC (permalink / raw)
To: Utz Bacher
Cc: Alexey Ishchuk, blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Gonzalo Muelas Serrano,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Yishai Hadas

On Wed, Nov 5, 2014 at 7:04 AM, Utz Bacher <utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> wrote:
> (B) Status of patches
> 1. kernel code -- the new system call: reviewed, acked and accepted by the
> s390x maintainer Martin Schwidefsky (2014/10/13), so we will have that
> system call in the s390x kernel.
> 2. libibverbs -- define barriers on s390x: Looking for your feedback. We
> understand there have been no general objections so far.
> 3. libmlx4 -- provide MMIO abstraction: reviewed by the Mellanox
> maintainers and we understand they would apply this once you give the go
> for the overall set.
> Previously, a patch to DAPL to build on s390x has been accepted already
> (Arlin Davis, 2014/09/02).
>
> We gave your concern on MMIO handling on s390x serious consideration from
> various angles, but the page fault handler does not appear workable. OTOH,
> Mellanox is fine with the MMIO abstraction in libmlx4, and we didn't hear
> of significant other concerns. With that, could you please consider the
> patch set again to add s390x to the list of supported platforms? Happy to
> repost the patches for convenience.

If Mellanox is willing to take on the maintenance burden of changing
all MMIO access to an inline function, and if you're willing to take
on the burden of knowing that every new adapter you support means
tracking down and convincing the maintainer of the driver library,
then I'm OK with adding the simple barrier patch to libibverbs. Could
you please send the latest version of that patch to me?

It does seem a little strange to be adding a new system call to
simulate kernel bypass, but I guess you do what you gotta do...

- R.
* Re: [PATCH 0/3] DAPL support on s390x platform
[not found] ` <CAL1RGDVLeNCAXrrq8mEhKxcm_z_xwgLUfO_wzhQE9xJz8Lnq7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-11-12 9:34 ` Martin Schwidefsky
2014-11-12 12:38 ` Yishai Hadas
1 sibling, 0 replies; 12+ messages in thread
From: Martin Schwidefsky @ 2014-11-12 9:34 UTC (permalink / raw)
To: Roland Dreier
Cc: Utz Bacher, Alexey Ishchuk,
blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Gonzalo Muelas Serrano,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Yishai Hadas

On Tue, 11 Nov 2014 15:22:20 -0800
Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:

> On Wed, Nov 5, 2014 at 7:04 AM, Utz Bacher <utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> wrote:
> > (B) Status of patches
> > 1. kernel code -- the new system call: reviewed, acked and accepted by the
> > s390x maintainer Martin Schwidefsky (2014/10/13), so we will have that
> > system call in the s390x kernel.
> > 2. libibverbs -- define barriers on s390x: Looking for your feedback. We
> > understand there have been no general objections so far.
> > 3. libmlx4 -- provide MMIO abstraction: reviewed by the Mellanox
> > maintainers and we understand they would apply this once you give the go
> > for the overall set.
> > Previously, a patch to DAPL to build on s390x has been accepted already
> > (Arlin Davis, 2014/09/02).
> >
> > We gave your concern on MMIO handling on s390x serious consideration from
> > various angles, but the page fault handler does not appear workable. OTOH,
> > Mellanox is fine with the MMIO abstraction in libmlx4, and we didn't hear
> > of significant other concerns. With that, could you please consider the
> > patch set again to add s390x to the list of supported platforms? Happy to
> > repost the patches for convenience.
>
> If Mellanox is willing to take on the maintenance burden of changing
> all MMIO access to an inline function, and if you're willing to take
> on the burden of knowing that every new adapter you support means
> tracking down and convincing the maintainer of the driver library,
> then I'm OK with adding the simple barrier patch to libibverbs. Could
> you please send the latest version of that patch to me?
>
> It does seem a little strange to be adding a new system call to
> simulate kernel bypass, but I guess you do what you gotta do...

Well, the virtualization of PCI on s390 requires using these special
instructions to access the PCI memory. And they are privileged for
a good reason; otherwise user space could do nasty things. Given these
preconditions, the system calls are the least painful method of doing
what you have to do.

Thanks for the confirmation; we will go forward with the system call
approach.

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
* Re: [PATCH 0/3] DAPL support on s390x platform
[not found] ` <CAL1RGDVLeNCAXrrq8mEhKxcm_z_xwgLUfO_wzhQE9xJz8Lnq7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-12 9:34 ` Martin Schwidefsky
@ 2014-11-12 12:38 ` Yishai Hadas
[not found] ` <546354D9.9050601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
1 sibling, 1 reply; 12+ messages in thread
From: Yishai Hadas @ 2014-11-12 12:38 UTC (permalink / raw)
To: Roland Dreier, Utz Bacher
Cc: Alexey Ishchuk, blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Gonzalo Muelas Serrano,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Yishai Hadas,
amira-VPRAkNaXOzVWk0Htik3J/w

On 11/12/2014 1:22 AM, Roland Dreier wrote:
> On Wed, Nov 5, 2014 at 7:04 AM, Utz Bacher <utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> wrote:
>> (B) Status of patches
>> 1. kernel code -- the new system call: reviewed, acked and accepted by the
>> s390x maintainer Martin Schwidefsky (2014/10/13), so we will have that
>> system call in the s390x kernel.
>> 2. libibverbs -- define barriers on s390x: Looking for your feedback. We
>> understand there have been no general objections so far.
>> 3. libmlx4 -- provide MMIO abstraction: reviewed by the Mellanox
>> maintainers and we understand they would apply this once you give the go
>> for the overall set.
>> Previously, a patch to DAPL to build on s390x has been accepted already
>> (Arlin Davis, 2014/09/02).
>>
>> We gave your concern on MMIO handling on s390x serious consideration from
>> various angles, but the page fault handler does not appear workable. OTOH,
>> Mellanox is fine with the MMIO abstraction in libmlx4, and we didn't hear
>> of significant other concerns. With that, could you please consider the
>> patch set again to add s390x to the list of supported platforms? Happy to
>> repost the patches for convenience.
>
> If Mellanox is willing to take on the maintenance burden of changing
> all MMIO access to an inline function, and if you're willing to take
> on the burden of knowing that every new adapter you support means
> tracking down and convincing the maintainer of the driver library,
> then I'm OK with adding the simple barrier patch to libibverbs. Could
> you please send the latest version of that patch to me?
>

The libmlx4 patch should pass performance regression testing before it
can be approved; this task is in progress on our side.

> It does seem a little strange to be adding a new system call to
> simulate kernel bypass, but I guess you do what you gotta do...
>
> - R.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* Re: [PATCH 0/3] DAPL support on s390x platform
[not found] ` <546354D9.9050601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-12-10 16:27 ` Utz Bacher
[not found] ` <OFA12CE4AE.0DA527AB-ONC1257DAA.0057DC75-C1257DAA.005A6BB7-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Utz Bacher @ 2014-12-10 16:27 UTC (permalink / raw)
To: Yishai Hadas, Roland Dreier
Cc: Alexey Ishchuk, amira-VPRAkNaXOzVWk0Htik3J/w,
blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Gonzalo Muelas Serrano,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Yishai Hadas

Hi Roland and Yishai,

Yishai Hadas <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote on 12.11.2014 13:38:49:
[...]
> On 11/12/2014 1:22 AM, Roland Dreier wrote:
> > On Wed, Nov 5, 2014 at 7:04 AM, Utz Bacher <utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> wrote:
> >> (B) Status of patches
> >> 1. kernel code -- the new system call: reviewed, acked and accepted by the
> >> s390x maintainer Martin Schwidefsky (2014/10/13), so we will have that
> >> system call in the s390x kernel.
> >> 2. libibverbs -- define barriers on s390x: Looking for your feedback. We
> >> understand there have been no general objections so far.
> >> 3. libmlx4 -- provide MMIO abstraction: reviewed by the Mellanox
> >> maintainers and we understand they would apply this once you give the go
> >> for the overall set.
> >> Previously, a patch to DAPL to build on s390x has been accepted already
> >> (Arlin Davis, 2014/09/02).
[...]
> > If Mellanox is willing to take on the maintenance burden of changing
> > all MMIO access to an inline function, and if you're willing to take
> > on the burden of knowing that every new adapter you support means
> > tracking down and convincing the maintainer of the driver library,
> > then I'm OK with adding the simple barrier patch to libibverbs. Could
> > you please send the latest version of that patch to me?
> >
>
> The libmlx4 patch should pass performance regression testing before it
> can be approved; this task is in progress on our side.

so once the libmlx4 patch has passed the performance regression runs
successfully, we would assume Roland and Yishai acknowledge the changes.
Can we expect these patches (without further changes) to appear in libmlx4
on top of 1.0.6 and libibverbs on top of 1.1.8?
Just checking before we talk to distribution partners, so that everyone
stays in sync with upstream. Martin Schwidefsky is applying the kernel
changes, too.

Thanks in advance,
Utz
* Re: [PATCH 0/3] DAPL support on s390x platform
[not found] ` <OFA12CE4AE.0DA527AB-ONC1257DAA.0057DC75-C1257DAA.005A6BB7-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
@ 2014-12-12 8:19 ` Martin Schwidefsky
0 siblings, 0 replies; 12+ messages in thread
From: Martin Schwidefsky @ 2014-12-12 8:19 UTC (permalink / raw)
To: Utz Bacher
Cc: Yishai Hadas, Roland Dreier, Alexey Ishchuk,
amira-VPRAkNaXOzVWk0Htik3J/w,
blaschka-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Gonzalo Muelas Serrano,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Yishai Hadas

On Wed, 10 Dec 2014 17:27:38 +0100
Utz Bacher <utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> wrote:

> Yishai Hadas <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote on 12.11.2014 13:38:49:
> [...]
> > On 11/12/2014 1:22 AM, Roland Dreier wrote:
> > > On Wed, Nov 5, 2014 at 7:04 AM, Utz Bacher <utz.bacher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> wrote:
> > >> (B) Status of patches
> > >> 1. kernel code -- the new system call: reviewed, acked and accepted by the
> > >> s390x maintainer Martin Schwidefsky (2014/10/13), so we will have that
> > >> system call in the s390x kernel.
> > >> 2. libibverbs -- define barriers on s390x: Looking for your feedback. We
> > >> understand there have been no general objections so far.
> > >> 3. libmlx4 -- provide MMIO abstraction: reviewed by the Mellanox
> > >> maintainers and we understand they would apply this once you give the go
> > >> for the overall set.
> > >> Previously, a patch to DAPL to build on s390x has been accepted already
> > >> (Arlin Davis, 2014/09/02).
> [...]
> > > If Mellanox is willing to take on the maintenance burden of changing
> > > all MMIO access to an inline function, and if you're willing to take
> > > on the burden of knowing that every new adapter you support means
> > > tracking down and convincing the maintainer of the driver library,
> > > then I'm OK with adding the simple barrier patch to libibverbs. Could
> > > you please send the latest version of that patch to me?
> > >
> >
> > The libmlx4 patch should pass performance regression testing before it
> > can be approved; this task is in progress on our side.
>
> so once the libmlx4 patch has passed the performance regression runs
> successfully, we would assume Roland and Yishai acknowledge the changes.
> Can we expect these patches (without further changes) to appear in libmlx4
> on top of 1.0.6 and libibverbs on top of 1.1.8?
> Just checking before we talk to distribution partners, so that everyone
> stays in sync with upstream. Martin Schwidefsky is applying the kernel
> changes, too.

The kernel part just went upstream, with the following git commit:

commit 4eafad7febd482092b331ea72c37274d745956be
Author: Alexey Ishchuk <aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date:   Fri Nov 14 14:27:58 2014 +0100

    s390/kernel: add system calls for PCI memory access

    Add the new __NR_s390_pci_mmio_write and __NR_s390_pci_mmio_read
    system calls to allow user space applications to access device PCI I/O
    memory pages on s390x platform.

    [ Martin Schwidefsky: some code beautification ]

    Signed-off-by: Alexey Ishchuk <aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
    Signed-off-by: Martin Schwidefsky <schwidefsky-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
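For readers who want to call the new system calls directly rather than through libmlx4, the read side might be wrapped as in the sketch below. It is a hedged illustration, not the kernel's or the library's interface definition: demo_pci_mmio_read() is a hypothetical name, the (address, buffer, length) argument order is taken from the mmio.h helpers posted earlier in this thread, and the ENOSYS fallback for builds whose headers lack __NR_s390_pci_mmio_read is an assumption added for portability.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical wrapper around the s390x PCI MMIO read system call.
 * Returns the raw syscall(2) result: -1 with errno set on failure.
 */
static long demo_pci_mmio_read(unsigned long mmio_addr, void *buf, size_t len)
{
	assert(buf != NULL);
#ifdef __NR_s390_pci_mmio_read
	return syscall(__NR_s390_pci_mmio_read, mmio_addr, buf, len);
#else
	/* Headers without the syscall number: report "not implemented". */
	(void)mmio_addr;
	(void)len;
	errno = ENOSYS;
	return -1;
#endif
}
```

Unlike a plain load from a mapped page, this call can fail, so callers would normally check the return value before using the buffer; the helpers in the posted libmlx4 patch return the value but their call sites do not inspect it.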
Thread overview: 12+ messages
2014-10-10 9:34 [PATCH 0/3] DAPL support on s390x platform Alexey Ishchuk
[not found] ` <1412933657-52641-1-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2014-10-10 9:34 ` [PATCH 1/3] s390/kernel: add system calls for access PCI memory Alexey Ishchuk
[not found] ` <1412933657-52641-2-git-send-email-aishchuk-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2014-10-12 11:52 ` Shachar Raindel
[not found] ` <f061a36c713c42c9b71530183a6e8644-Vl31pUvGNwELId+1UC+8EGu6+pknBqLbXA4E9RH9d+qIuWR1G4zioA@public.gmane.org>
2014-10-13 8:39 ` Martin Schwidefsky
2014-10-10 9:34 ` [PATCH 2/3] libibverbs: add support for the s390x platform Alexey Ishchuk
2014-10-10 9:34 ` [PATCH 3/3] libmlx4: " Alexey Ishchuk
2014-11-05 15:04 ` [PATCH 0/3] DAPL support on " Utz Bacher
[not found] ` <OF2939B020.E253C5B0-ONC1257D87.0050AE0F-C1257D87.0052C3AA-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2014-11-11 23:22 ` Roland Dreier
[not found] ` <CAL1RGDVLeNCAXrrq8mEhKxcm_z_xwgLUfO_wzhQE9xJz8Lnq7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-12 9:34 ` Martin Schwidefsky
2014-11-12 12:38 ` Yishai Hadas
[not found] ` <546354D9.9050601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-12-10 16:27 ` Utz Bacher
[not found] ` <OFA12CE4AE.0DA527AB-ONC1257DAA.0057DC75-C1257DAA.005A6BB7-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2014-12-12 8:19 ` Martin Schwidefsky