public inbox for linux-parisc@vger.kernel.org
 help / color / mirror / Atom feed
* Possible io_uring bug in PA-RISC kernel 6.1.46
@ 2023-08-27 11:25 Vidra.Jonas
  2023-08-27 15:09 ` Helge Deller
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Vidra.Jonas @ 2023-08-27 11:25 UTC (permalink / raw)
  To: linux-parisc

Hello,

on my C8000 workstation, I'm getting libuv aborts connected to io_uring
usage. I'm not sure whether this is a libuv bug or a kernel one, but
since libuv is pretty well tested on other archs, it's probably a
PA-RISC issue. I've encountered the problem in kernel versions 6.1.41
and 6.1.46. 6.1.4 seemed fine, but I updated other system packages in
the meantime, so I can't be sure.

The issue is that a call to
`io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
returns 0, and libuv reacts to that by aborting, probably on this line:
https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
(I'm saying probably, because gdb seems to be buggy on my machine and I
don't really trust its output, so I rely on strace instead, but that
doesn't support backtraces on the PA-RISC.)

Since min_complete is 2 here and IORING_SETUP_IOPOLL was not specified
during setup (the only flags that can be passed to io_uring_setup are 0
and IORING_SETUP_SQPOLL), the call should (if I understand the docs
correctly) block instead of returning.

The bug manifests as aborts in CMake when compiling KDE packages, as any
call of `cmake -E cmake_autogen ...` fails.

Is this a known issue? I've seen a number of io_uring patches on the ML,
but none seem relevant.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 11:25 Possible io_uring bug in PA-RISC kernel 6.1.46 Vidra.Jonas
@ 2023-08-27 15:09 ` Helge Deller
  2023-08-27 20:35 ` John David Anglin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Helge Deller @ 2023-08-27 15:09 UTC (permalink / raw)
  To: Vidra.Jonas, linux-parisc

Hello Vidra,

On 8/27/23 13:25, Vidra.Jonas@seznam.cz wrote:
> on my C8000 workstation, I'm getting libuv aborts connected to io_uring
> usage. I'm not sure whether this is a libuv bug or a kernel one, but
> since libuv is pretty well tested on other archs, it's probably a
> PA-RISC issue.

I'd assume it's a problem in libuv.
I'm not saying libuv is doing stuff wrong, but probably not the way
it needs to be to fulfill cache requirements on parisc.

> I've encountered the problem in kernel versions 6.1.41
> and 6.1.46. 6.1.4 seemed fine, but I updated other system packages in
> the meantime, so I can't be sure.

Maybe you could try kernel git head (or 6.4.stable) too?
Those have some more io_uring and parisc patches...

> The issue is that a call to
> `io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
> returns 0, and libuv reacts to that by aborting, probably on this line:
> https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
> (I'm saying probably, because gdb seems to be buggy on my machine and I
> don't really trust its output, so I rely on strace instead, but that
> doesn't support backtraces on the PA-RISC.)

I have not yet looked into this problem, but if the io_uring
syscalls return 0, it's often an issue that the kernel doesn't
see the same memory (due to cache aliasing) as the userspace.
The memory regions allocated with mmap() need to be done by the
kernel. The  https://github.com/axboe/liburing library should
do it right, but it seems libuv does everything itself.

> Since min_complete is 2 here and IORING_SETUP_IOPOLL was not specified
> during setup (the only flags that can be passed to io_uring_setup are 0
> and IORING_SETUP_SQPOLL), the call should (if I understand the docs
> correctly) block instead of returning.
>
> The bug manifests as aborts in CMake when compiling KDE packages, as any
> call of `cmake -E cmake_autogen ...` fails.
>
> Is this a known issue? I've seen a number of io_uring patches on the ML,
> but none seem relevant.

The best thing would be to have a small reproducer.
Do you think you can come up with something?

Helge

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 11:25 Possible io_uring bug in PA-RISC kernel 6.1.46 Vidra.Jonas
  2023-08-27 15:09 ` Helge Deller
@ 2023-08-27 20:35 ` John David Anglin
  2023-08-27 20:41   ` Helge Deller
  2023-08-27 21:23 ` Sam James
  2023-08-31 13:38 ` Helge Deller
  3 siblings, 1 reply; 12+ messages in thread
From: John David Anglin @ 2023-08-27 20:35 UTC (permalink / raw)
  To: Vidra.Jonas, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 678 bytes --]

On 2023-08-27 7:25 a.m., Vidra.Jonas@seznam.cz wrote:
> The issue is that a call to
> `io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
> returns 0, and libuv reacts to that by aborting, probably on this line:
> https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
> (I'm saying probably, because gdb seems to be buggy on my machine and I
> don't really trust its output, so I rely on strace instead, but that
> doesn't support backtraces on the PA-RISC.)
I have the attached io_uring patch on 6.1.46.  It's back ported from 6.4.

libuv make check was successful on my system.

Dave

-- 
John David Anglin  dave.anglin@bell.net


[-- Attachment #2: io_uring-v4.patch --]
[-- Type: text/plain, Size: 11624 bytes --]

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 9eff86acdfec..576e8eadb838 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -379,6 +379,9 @@ enum {
 #define IORING_OFF_SQ_RING		0ULL
 #define IORING_OFF_CQ_RING		0x8000000ULL
 #define IORING_OFF_SQES			0x10000000ULL
+#define IORING_OFF_PBUF_RING		0x80000000ULL
+#define IORING_OFF_PBUF_SHIFT		16
+#define IORING_OFF_MMAP_MASK		0xf8000000ULL
 
 /*
  * Filled with the offset for mmap(2)
@@ -621,12 +624,26 @@ struct io_uring_buf_ring {
 	};
 };
 
+/*
+ * Flags for IORING_REGISTER_PBUF_RING.
+ *
+ * IOU_PBUF_RING_MMAP:	If set, kernel will allocate the memory for the ring.
+ *			The application must not set a ring_addr in struct
+ *			io_uring_buf_reg, instead it must subsequently call
+ *			mmap(2) with the offset set as:
+ *			IORING_OFF_PBUF_RING | (bgid << IORING_OFF_PBUF_SHIFT)
+ *			to get a virtual mapping for the ring.
+ */
+enum {
+	IOU_PBUF_RING_MMAP	= 1,
+};
+
 /* argument for IORING_(UN)REGISTER_PBUF_RING */
 struct io_uring_buf_reg {
 	__u64	ring_addr;
 	__u32	ring_entries;
 	__u16	bgid;
-	__u16	pad;
+	__u16	flags;
 	__u64	resv[3];
 };
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ed8e9deae284..6cdb78e18041 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -72,6 +72,7 @@
 #include <linux/io_uring.h>
 #include <linux/audit.h>
 #include <linux/security.h>
+#include <asm/shmparam.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -3075,7 +3076,7 @@ static void *io_uring_validate_mmap_request(struct file *file,
 	struct page *page;
 	void *ptr;
 
-	switch (offset) {
+	switch (offset & IORING_OFF_MMAP_MASK) {
 	case IORING_OFF_SQ_RING:
 	case IORING_OFF_CQ_RING:
 		ptr = ctx->rings;
@@ -3083,6 +3084,17 @@ static void *io_uring_validate_mmap_request(struct file *file,
 	case IORING_OFF_SQES:
 		ptr = ctx->sq_sqes;
 		break;
+	case IORING_OFF_PBUF_RING: {
+		unsigned int bgid;
+
+		bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
+		mutex_lock(&ctx->uring_lock);
+		ptr = io_pbuf_get_address(ctx, bgid);
+		mutex_unlock(&ctx->uring_lock);
+		if (!ptr)
+			return ERR_PTR(-EINVAL);
+		break;
+		}
 	default:
 		return ERR_PTR(-EINVAL);
 	}
@@ -3110,6 +3122,49 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
 }
 
+static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+			unsigned long addr, unsigned long len,
+			unsigned long pgoff, unsigned long flags)
+{
+	void *ptr;
+
+	/*
+	 * Do not allow to map to user-provided address to avoid breaking the
+	 * aliasing rules. Userspace is not able to guess the offset address of
+	 * kernel kmalloc()ed memory area.
+	 */
+	if (addr)
+		return -EINVAL;
+
+	ptr = io_uring_validate_mmap_request(filp, pgoff, len);
+	if (IS_ERR(ptr))
+		return -ENOMEM;
+
+	/*
+	 * Some architectures have strong cache aliasing requirements.
+	 * For such architectures we need a coherent mapping which aliases
+	 * kernel memory *and* userspace memory. To achieve that:
+	 * - use a NULL file pointer to reference physical memory, and
+	 * - use the kernel virtual address of the shared io_uring context
+	 *   (instead of the userspace-provided address, which has to be 0UL
+	 *   anyway).
+	 * - use the same pgoff which the get_unmapped_area() uses to
+	 *   calculate the page colouring.
+	 * For architectures without such aliasing requirements, the
+	 * architecture will return any suitable mapping because addr is 0.
+	 */
+	filp = NULL;
+	flags |= MAP_SHARED;
+	pgoff = 0;	/* has been translated to ptr above */
+#ifdef SHM_COLOUR
+	addr = (uintptr_t) ptr;
+	pgoff = addr >> PAGE_SHIFT;
+#else
+	addr = 0UL;
+#endif
+	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+}
+
 #else /* !CONFIG_MMU */
 
 static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
@@ -3324,6 +3379,8 @@ static const struct file_operations io_uring_fops = {
 #ifndef CONFIG_MMU
 	.get_unmapped_area = io_uring_nommu_get_unmapped_area,
 	.mmap_capabilities = io_uring_nommu_mmap_capabilities,
+#else
+	.get_unmapped_area = io_uring_mmu_get_unmapped_area,
 #endif
 	.poll		= io_uring_poll,
 #ifdef CONFIG_PROC_FS
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index acc37e5a6d4e..5d44964fc41e 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -137,7 +137,8 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
 		return NULL;
 
 	head &= bl->mask;
-	if (head < IO_BUFFER_LIST_BUF_PER_PAGE) {
+	/* mmaped buffers are always contig */
+	if (bl->is_mmap || head < IO_BUFFER_LIST_BUF_PER_PAGE) {
 		buf = &br->bufs[head];
 	} else {
 		int off = head & (IO_BUFFER_LIST_BUF_PER_PAGE - 1);
@@ -179,7 +180,7 @@ void __user *io_buffer_select(struct io_kiocb *req, size_t *len,
 
 	bl = io_buffer_get_list(ctx, req->buf_index);
 	if (likely(bl)) {
-		if (bl->buf_nr_pages)
+		if (bl->is_mapped)
 			ret = io_ring_buffer_select(req, len, bl, issue_flags);
 		else
 			ret = io_provided_buffer_select(req, len, bl);
@@ -214,17 +215,30 @@ static int __io_remove_buffers(struct io_ring_ctx *ctx,
 	if (!nbufs)
 		return 0;
 
-	if (bl->buf_nr_pages) {
-		int j;
-
+	if (bl->is_mapped) {
 		i = bl->buf_ring->tail - bl->head;
-		for (j = 0; j < bl->buf_nr_pages; j++)
-			unpin_user_page(bl->buf_pages[j]);
-		kvfree(bl->buf_pages);
-		bl->buf_pages = NULL;
-		bl->buf_nr_pages = 0;
+		if (bl->is_mmap) {
+			if (bl->buf_ring) {
+				struct page *page;
+
+				page = virt_to_head_page(bl->buf_ring);
+				if (put_page_testzero(page))
+					free_compound_page(page);
+				bl->buf_ring = NULL;
+			}
+			bl->is_mmap = 0;
+		} else if (bl->buf_nr_pages) {
+			int j;
+
+			for (j = 0; j < bl->buf_nr_pages; j++)
+				unpin_user_page(bl->buf_pages[j]);
+			kvfree(bl->buf_pages);
+			bl->buf_pages = NULL;
+			bl->buf_nr_pages = 0;
+		}
 		/* make sure it's seen as empty */
 		INIT_LIST_HEAD(&bl->buf_list);
+		bl->is_mapped = 0;
 		return i;
 	}
 
@@ -304,7 +318,7 @@ int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
 	if (bl) {
 		ret = -EINVAL;
 		/* can't use provide/remove buffers command on mapped buffers */
-		if (!bl->buf_nr_pages)
+		if (!bl->is_mapped)
 			ret = __io_remove_buffers(ctx, bl, p->nbufs);
 	}
 	if (ret < 0)
@@ -452,7 +466,7 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
 		}
 	}
 	/* can't add buffers via this command for a mapped buffer ring */
-	if (bl->buf_nr_pages) {
+	if (bl->is_mapped) {
 		ret = -EINVAL;
 		goto err;
 	}
@@ -468,23 +482,78 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
 	return IOU_ISSUE_SKIP_COMPLETE;
 }
 
-int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
+static int io_pin_pbuf_ring(struct io_uring_buf_reg *reg,
+			    struct io_buffer_list *bl)
 {
 	struct io_uring_buf_ring *br;
-	struct io_uring_buf_reg reg;
-	struct io_buffer_list *bl, *free_bl = NULL;
 	struct page **pages;
 	int nr_pages;
 
+	pages = io_pin_pages(reg->ring_addr,
+			     flex_array_size(br, bufs, reg->ring_entries),
+			     &nr_pages);
+	if (IS_ERR(pages))
+		return PTR_ERR(pages);
+
+	br = page_address(pages[0]);
+#ifdef SHM_COLOUR
+	if ((reg->ring_addr | (unsigned long) br) & (SHM_COLOUR - 1)) {
+		int i;
+
+		for (i = 0; i < nr_pages; i++)
+			unpin_user_page(pages[i]);
+		return -EINVAL;
+	}
+#endif
+	bl->buf_pages = pages;
+	bl->buf_nr_pages = nr_pages;
+	bl->buf_ring = br;
+	bl->is_mapped = 1;
+	bl->is_mmap = 0;
+	return 0;
+}
+
+static int io_alloc_pbuf_ring(struct io_uring_buf_reg *reg,
+			      struct io_buffer_list *bl)
+{
+	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
+	size_t ring_size;
+	void *ptr;
+
+	ring_size = reg->ring_entries * sizeof(struct io_uring_buf_ring);
+	ptr = (void *) __get_free_pages(gfp, get_order(ring_size));
+	if (!ptr)
+		return -ENOMEM;
+
+	bl->buf_ring = ptr;
+	bl->is_mapped = 1;
+	bl->is_mmap = 1;
+	return 0;
+}
+
+int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
+{
+	struct io_uring_buf_reg reg;
+	struct io_buffer_list *bl, *free_bl = NULL;
+	int ret;
+
 	if (copy_from_user(&reg, arg, sizeof(reg)))
 		return -EFAULT;
 
-	if (reg.pad || reg.resv[0] || reg.resv[1] || reg.resv[2])
+	if (reg.resv[0] || reg.resv[1] || reg.resv[2])
 		return -EINVAL;
-	if (!reg.ring_addr)
-		return -EFAULT;
-	if (reg.ring_addr & ~PAGE_MASK)
+	if (reg.flags & ~IOU_PBUF_RING_MMAP)
 		return -EINVAL;
+	if (!(reg.flags & IOU_PBUF_RING_MMAP)) {
+		if (!reg.ring_addr)
+			return -EFAULT;
+		if (reg.ring_addr & ~PAGE_MASK)
+			return -EINVAL;
+	} else {
+		if (reg.ring_addr)
+			return -EINVAL;
+	}
+
 	if (!is_power_of_2(reg.ring_entries))
 		return -EINVAL;
 
@@ -501,7 +570,7 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 	bl = io_buffer_get_list(ctx, reg.bgid);
 	if (bl) {
 		/* if mapped buffer ring OR classic exists, don't allow */
-		if (bl->buf_nr_pages || !list_empty(&bl->buf_list))
+		if (bl->is_mapped || !list_empty(&bl->buf_list))
 			return -EEXIST;
 	} else {
 		free_bl = bl = kzalloc(sizeof(*bl), GFP_KERNEL);
@@ -509,22 +578,21 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 			return -ENOMEM;
 	}
 
-	pages = io_pin_pages(reg.ring_addr,
-			     flex_array_size(br, bufs, reg.ring_entries),
-			     &nr_pages);
-	if (IS_ERR(pages)) {
-		kfree(free_bl);
-		return PTR_ERR(pages);
+	if (!(reg.flags & IOU_PBUF_RING_MMAP))
+		ret = io_pin_pbuf_ring(&reg, bl);
+	else
+		ret = io_alloc_pbuf_ring(&reg, bl);
+
+	if (!ret) {
+		bl->nr_entries = reg.ring_entries;
+		bl->mask = reg.ring_entries - 1;
+
+		io_buffer_add_list(ctx, bl, reg.bgid);
+		return 0;
 	}
 
-	br = page_address(pages[0]);
-	bl->buf_pages = pages;
-	bl->buf_nr_pages = nr_pages;
-	bl->nr_entries = reg.ring_entries;
-	bl->buf_ring = br;
-	bl->mask = reg.ring_entries - 1;
-	io_buffer_add_list(ctx, bl, reg.bgid);
-	return 0;
+	kfree(free_bl);
+	return ret;
 }
 
 int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
@@ -534,13 +602,15 @@ int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 
 	if (copy_from_user(&reg, arg, sizeof(reg)))
 		return -EFAULT;
-	if (reg.pad || reg.resv[0] || reg.resv[1] || reg.resv[2])
+	if (reg.resv[0] || reg.resv[1] || reg.resv[2])
+		return -EINVAL;
+	if (reg.flags)
 		return -EINVAL;
 
 	bl = io_buffer_get_list(ctx, reg.bgid);
 	if (!bl)
 		return -ENOENT;
-	if (!bl->buf_nr_pages)
+	if (!bl->is_mapped)
 		return -EINVAL;
 
 	__io_remove_buffers(ctx, bl, -1U);
@@ -550,3 +620,14 @@ int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 	}
 	return 0;
 }
+
+void *io_pbuf_get_address(struct io_ring_ctx *ctx, unsigned long bgid)
+{
+	struct io_buffer_list *bl;
+
+	bl = io_buffer_get_list(ctx, bgid);
+	if (!bl || !bl->is_mmap)
+		return NULL;
+
+	return bl->buf_ring;
+}
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index c23e15d7d3ca..d14345ef61fc 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -23,6 +23,11 @@ struct io_buffer_list {
 	__u16 nr_entries;
 	__u16 head;
 	__u16 mask;
+
+	/* ring mapped provided buffers */
+	__u8 is_mapped;
+	/* ring mapped provided buffers, but mmap'ed by application */
+	__u8 is_mmap;
 };
 
 struct io_buffer {
@@ -50,6 +55,8 @@ unsigned int __io_put_kbuf(struct io_kiocb *req, unsigned issue_flags);
 
 void io_kbuf_recycle_legacy(struct io_kiocb *req, unsigned issue_flags);
 
+void *io_pbuf_get_address(struct io_ring_ctx *ctx, unsigned long bgid);
+
 static inline void io_kbuf_recycle_ring(struct io_kiocb *req)
 {
 	/*

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 20:35 ` John David Anglin
@ 2023-08-27 20:41   ` Helge Deller
  2023-08-27 21:01     ` John David Anglin
  2023-08-28  1:45     ` Sam James
  0 siblings, 2 replies; 12+ messages in thread
From: Helge Deller @ 2023-08-27 20:41 UTC (permalink / raw)
  To: John David Anglin, Vidra.Jonas, linux-parisc

On 8/27/23 22:35, John David Anglin wrote:
> On 2023-08-27 7:25 a.m., Vidra.Jonas@seznam.cz wrote:
>> The issue is that a call to
>> `io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
>> returns 0, and libuv reacts to that by aborting, probably on this line:
>> https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
>> (I'm saying probably, because gdb seems to be buggy on my machine and I
>> don't really trust its output, so I rely on strace instead, but that
>> doesn't support backtraces on the PA-RISC.)
> I have the attached io_uring patch on 6.1.46.  It's back ported from 6.4.
>
> libuv make check was successful on my system.

Did the libuv check failed on 6.1.46 for you?
I just tried plain 6.1.49 and my machine crashed....

Helge

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 20:41   ` Helge Deller
@ 2023-08-27 21:01     ` John David Anglin
  2023-08-27 21:31       ` John David Anglin
  2023-08-28  1:45     ` Sam James
  1 sibling, 1 reply; 12+ messages in thread
From: John David Anglin @ 2023-08-27 21:01 UTC (permalink / raw)
  To: Helge Deller, Vidra.Jonas, linux-parisc

On 2023-08-27 4:41 p.m., Helge Deller wrote:
> On 8/27/23 22:35, John David Anglin wrote:
>> On 2023-08-27 7:25 a.m., Vidra.Jonas@seznam.cz wrote:
>>> The issue is that a call to
>>> `io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
>>> returns 0, and libuv reacts to that by aborting, probably on this line:
>>> https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
>>> (I'm saying probably, because gdb seems to be buggy on my machine and I
>>> don't really trust its output, so I rely on strace instead, but that
>>> doesn't support backtraces on the PA-RISC.)
>> I have the attached io_uring patch on 6.1.46.  It's back ported from 6.4.
>>
>> libuv make check was successful on my system.
>
> Did the libuv check failed on 6.1.46 for you?
No. It was successful.
> I just tried plain 6.1.49 and my machine crashed....
Dave

-- 
John David Anglin  dave.anglin@bell.net


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 11:25 Possible io_uring bug in PA-RISC kernel 6.1.46 Vidra.Jonas
  2023-08-27 15:09 ` Helge Deller
  2023-08-27 20:35 ` John David Anglin
@ 2023-08-27 21:23 ` Sam James
  2023-08-31 13:38 ` Helge Deller
  3 siblings, 0 replies; 12+ messages in thread
From: Sam James @ 2023-08-27 21:23 UTC (permalink / raw)
  To: Vidra.Jonas; +Cc: linux-parisc


<Vidra.Jonas@seznam.cz> writes:

> Hello,
>
> on my C8000 workstation, I'm getting libuv aborts connected to io_uring
> usage. I'm not sure whether this is a libuv bug or a kernel one, but
> since libuv is pretty well tested on other archs, it's probably a
> PA-RISC issue. I've encountered the problem in kernel versions 6.1.41
> and 6.1.46. 6.1.4 seemed fine, but I updated other system packages in
> the meantime, so I can't be sure.
>
> The issue is that a call to
> `io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
> returns 0, and libuv reacts to that by aborting, probably on this line:
> https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
> (I'm saying probably, because gdb seems to be buggy on my machine and I
> don't really trust its output, so I rely on strace instead, but that
> doesn't support backtraces on the PA-RISC.)

* elfutils needs me to finish off the port - I have some old patches
rebased.

* I wasn't aware of libunwind being broken (you mentioned it on IRC)
though.

* Can you file a bug for your gdb issue at
  https://sourceware.org/bugzilla? so far I've tried and I can't
  reproduce any problems with it




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 21:01     ` John David Anglin
@ 2023-08-27 21:31       ` John David Anglin
  2023-08-27 21:41         ` Helge Deller
  0 siblings, 1 reply; 12+ messages in thread
From: John David Anglin @ 2023-08-27 21:31 UTC (permalink / raw)
  To: Helge Deller, Vidra.Jonas, linux-parisc

On 2023-08-27 5:01 p.m., John David Anglin wrote:
>> Did the libuv check failed on 6.1.46 for you?
> No. It was successful.
I ran the testsuite a few more times.  Sometimes metrics_pool_events times out:

not ok 179 - metrics_pool_events
# timeout
# Output from process `metrics_pool_events`: (no output)

All other tests are ok.

Dave

-- 
John David Anglin  dave.anglin@bell.net


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 21:31       ` John David Anglin
@ 2023-08-27 21:41         ` Helge Deller
       [not found]           ` <1wd.NvQz.Gj}B593TXc.1azrF9@seznam.cz>
  0 siblings, 1 reply; 12+ messages in thread
From: Helge Deller @ 2023-08-27 21:41 UTC (permalink / raw)
  To: John David Anglin, Vidra.Jonas, linux-parisc

On 8/27/23 23:31, John David Anglin wrote:
> On 2023-08-27 5:01 p.m., John David Anglin wrote:
>>> Did the libuv check failed on 6.1.46 for you?
>> No. It was successful.
> I ran the testsuite a few more times.  Sometimes metrics_pool_events times out:
>
> not ok 179 - metrics_pool_events
> # timeout
> # Output from process `metrics_pool_events`: (no output)
>
> All other tests are ok.

That's strange. For me all "plain" > 6.1.45 kernels fail, often even HPMC/crash
the machine.
I did not tested your patch (yet).

Anyway, my git branch "6.1-stable-buildd-6.1.49" is 6.1.49 plus some io_uring
patches (and others) backported:
  https://github.com/hdeller/linux/tree/6.1-stable-buildd-6.1.49

With that one, all libuv checks run OK, I tested a few times.
I think we should ask for backporting some of those patches, but
we should be clear which ones...

Helge


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 20:41   ` Helge Deller
  2023-08-27 21:01     ` John David Anglin
@ 2023-08-28  1:45     ` Sam James
  1 sibling, 0 replies; 12+ messages in thread
From: Sam James @ 2023-08-28  1:45 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, Vidra.Jonas, linux-parisc


Helge Deller <deller@gmx.de> writes:

> On 8/27/23 22:35, John David Anglin wrote:
>> On 2023-08-27 7:25 a.m., Vidra.Jonas@seznam.cz wrote:
>>> The issue is that a call to
>>> `io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
>>> returns 0, and libuv reacts to that by aborting, probably on this line:
>>> https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
>>> (I'm saying probably, because gdb seems to be buggy on my machine and I
>>> don't really trust its output, so I rely on strace instead, but that
>>> doesn't support backtraces on the PA-RISC.)
>> I have the attached io_uring patch on 6.1.46.  It's back ported from 6.4.
>>
>> libuv make check was successful on my system.
>
> Did the libuv check failed on 6.1.46 for you?
> I just tried plain 6.1.49 and my machine crashed....
>

I can't test 6.1.x right now, but fwiw, 6.4.12 passes libuv's test suite.

best,
sam

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
  2023-08-27 11:25 Possible io_uring bug in PA-RISC kernel 6.1.46 Vidra.Jonas
                   ` (2 preceding siblings ...)
  2023-08-27 21:23 ` Sam James
@ 2023-08-31 13:38 ` Helge Deller
  3 siblings, 0 replies; 12+ messages in thread
From: Helge Deller @ 2023-08-31 13:38 UTC (permalink / raw)
  To: Vidra.Jonas, linux-parisc; +Cc: Sam James

On 8/27/23 13:25, Vidra.Jonas@seznam.cz wrote:
> Hello,
>
> on my C8000 workstation, I'm getting libuv aborts connected to io_uring
> usage. I'm not sure whether this is a libuv bug or a kernel one, but
> since libuv is pretty well tested on other archs, it's probably a
> PA-RISC issue. I've encountered the problem in kernel versions 6.1.41
> and 6.1.46. 6.1.4 seemed fine, but I updated other system packages in
> the meantime, so I can't be sure.
>
> The issue is that a call to
> `io_uring_enter(fd, 2, 2, IORING_ENTER_GETEVENTS)`
> returns 0, and libuv reacts to that by aborting, probably on this line:
> https://github.com/libuv/libuv/blob/65dc822d6c20a9130fa100c7b46d751f8cf4d233/src/unix/linux.c#L1252
> (I'm saying probably, because gdb seems to be buggy on my machine and I
> don't really trust its output, so I rely on strace instead, but that
> doesn't support backtraces on the PA-RISC.)
>
> Since min_complete is 2 here and IORING_SETUP_IOPOLL was not specified
> during setup (the only flags that can be passed to io_uring_setup are 0
> and IORING_SETUP_SQPOLL), the call should (if I understand the docs
> correctly) block instead of returning.
>
> The bug manifests as aborts in CMake when compiling KDE packages, as any
> call of `cmake -E cmake_autogen ...` fails.
>
> Is this a known issue? I've seen a number of io_uring patches on the ML,
> but none seem relevant.

Just to sum up this mail thread:
The upcoming 6.1.51 stable kernel should have the relevant patches.
Kernel 6.4 and above should work out of the box.

Helge

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
       [not found]             ` <1xN.NvP9.1{VV8Csbz0.1azrZI@seznam.cz>
@ 2023-09-05 17:12               ` Vidra.Jonas
       [not found]                 ` <1M4b1y-1qfpu42R6s-001mPw@mail.gmx.net>
  0 siblings, 1 reply; 12+ messages in thread
From: Vidra.Jonas @ 2023-09-05 17:12 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, linux-parisc

On 2023-09-05 18:35:30, Jonas Vidra wrote:
> On 2023-09-05 18:14:01, Jonas Vidra wrote:
>> I will try building a vanilla 6.1.46 with the patch by John David Anglin
>> applied and I'll report back. It might take me a while, though, as the
>> PA-RISC system in question is not available 24/7.
>
> I did a dirty rebuild (no `make clean`) with the patch and it fixes the
> issue as well.

I was too hasty. The failure is still present on patched 6.1.46, although
it is now intermittent. With a vanilla 6.1.46, I hit the issue perhaps 10
times, and it seemed that the problematic command fails every single time.
On a patched 6.1.46, the problematic command ran 3 times but only failed
twice – it finished fine on the first run.

It might be just pure chance, though, if the actual failure frequency is
something like 4/5 – it is possible that both versions are identically
buggy.

I have no idea if the Helge's tree I tested previously is similarly
affected – the failure rate there is 0/4, which too might be due to
just exceptionally good luck. But I still hope it is fixed there.


> Is a minimal reproducer of the issue still desired? I can try to create
> something from the strace output, but I'm not sure whether it's still
> necessary.

I'm now assuming it's really needed. I will prepare it, hopefully in the
next few days.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Possible io_uring bug in PA-RISC kernel 6.1.46
       [not found]                 ` <1M4b1y-1qfpu42R6s-001mPw@mail.gmx.net>
@ 2023-09-09  6:11                   ` Vidra.Jonas
  0 siblings, 0 replies; 12+ messages in thread
From: Vidra.Jonas @ 2023-09-09  6:11 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, linux-parisc

On 2023-09-07 11:39:14, Helge Deller wrote:
> Please try kernel 6.1.51 or newer.

I just tested 6.1.52 and it seems to work fine.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-09-09  6:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-27 11:25 Possible io_uring bug in PA-RISC kernel 6.1.46 Vidra.Jonas
2023-08-27 15:09 ` Helge Deller
2023-08-27 20:35 ` John David Anglin
2023-08-27 20:41   ` Helge Deller
2023-08-27 21:01     ` John David Anglin
2023-08-27 21:31       ` John David Anglin
2023-08-27 21:41         ` Helge Deller
     [not found]           ` <1wd.NvQz.Gj}B593TXc.1azrF9@seznam.cz>
     [not found]             ` <1xN.NvP9.1{VV8Csbz0.1azrZI@seznam.cz>
2023-09-05 17:12               ` Vidra.Jonas
     [not found]                 ` <1M4b1y-1qfpu42R6s-001mPw@mail.gmx.net>
2023-09-09  6:11                   ` Vidra.Jonas
2023-08-28  1:45     ` Sam James
2023-08-27 21:23 ` Sam James
2023-08-31 13:38 ` Helge Deller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox