public inbox for dev@dpdk.org
* [PATCH] eal/linux: handle epoll error conditions
@ 2026-01-28 12:20 Kevin Traynor
From: Kevin Traynor @ 2026-01-28 12:20 UTC
  To: dev; +Cc: thomas, david.marchand, dsosnowski, viacheslavo, Kevin Traynor,
	stable

Add handling for epoll error conditions EPOLLERR, EPOLLHUP and
EPOLLRDHUP. These events indicate that the interrupt file descriptor
is in an error state or there has been a hangup.

This may happen when the interrupt file descriptor is deleted or, for
mlx5 devices, when the device is unbound from the mlx5 kernel driver or
removed by the mlx5 kernel driver as part of LAG setup.

Previously, the interrupt was read but the condition was not cleared,
which could leave the interrupt firing continuously and cause a
busy-loop processing it.

Now, when this condition is detected, a warning is logged and the
interrupt source is removed to prevent busy-looping.

Also handle the case where no bytes are read even though epoll
indicated that there was something to read: the interrupt source is
removed in that case as well.

Bugzilla ID: 1873
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
---
 lib/eal/linux/eal_interrupts.c | 63 ++++++++++++++++++++++++----------
 1 file changed, 44 insertions(+), 19 deletions(-)

diff --git a/lib/eal/linux/eal_interrupts.c b/lib/eal/linux/eal_interrupts.c
index 9db978923a..eedc75d776 100644
--- a/lib/eal/linux/eal_interrupts.c
+++ b/lib/eal/linux/eal_interrupts.c
@@ -887,4 +887,21 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
 }
 
+static void
+eal_intr_source_free(struct rte_intr_source *src)
+{
+	struct rte_intr_callback *cb, *next;
+
+	/* Free all callbacks */
+	for (cb = TAILQ_FIRST(&src->callbacks); cb; cb = next) {
+		next = TAILQ_NEXT(cb, next);
+		TAILQ_REMOVE(&src->callbacks, cb, next);
+		free(cb);
+	}
+
+	/* Free the interrupt source */
+	rte_intr_instance_free(src->intr_handle);
+	free(src);
+}
+
 static int
 eal_intr_process_interrupts(struct epoll_event *events, int nfds)
@@ -918,4 +935,21 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 		}
 
+		/* Check for error conditions on the fd before processing. */
+		if (events[n].events & (EPOLLRDHUP | EPOLLERR | EPOLLHUP)) {
+			EAL_LOG(WARNING, "Disconnect condition on fd %d "
+				"(events=0x%x), removing from epoll",
+				events[n].data.fd, events[n].events);
+			/*
+			 * There is an error or a hangup. Remove the
+			 * interrupt source and return to force the wait list
+			 * to be rebuilt.
+			 */
+			TAILQ_REMOVE(&intr_sources, src, next);
+			rte_spinlock_unlock(&intr_lock);
+
+			eal_intr_source_free(src);
+			return -1;
+		}
+
 		/* mark this interrupt source as active and release the lock. */
 		src->active = 1;
@@ -957,5 +991,7 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 			 */
 			bytes_read = read(events[n].data.fd, &buf, bytes_read);
-			if (bytes_read < 0) {
+			if (bytes_read > 0) {
+				call = true;
+			} else if (bytes_read < 0) {
 				if (errno == EINTR || errno == EWOULDBLOCK)
 					continue;
@@ -965,27 +1001,16 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 					events[n].data.fd,
 					strerror(errno));
-				/*
-				 * The device is unplugged or buggy, remove
-				 * it as an interrupt source and return to
-				 * force the wait list to be rebuilt.
-				 */
+			} else { /* bytes_read == 0 */
+				EAL_LOG(WARNING, "Read nothing from file "
+					"descriptor %d", events[n].data.fd);
+			}
+			if (bytes_read <= 0) {
 				rte_spinlock_lock(&intr_lock);
 				TAILQ_REMOVE(&intr_sources, src, next);
 				rte_spinlock_unlock(&intr_lock);
 
-				for (cb = TAILQ_FIRST(&src->callbacks); cb;
-							cb = next) {
-					next = TAILQ_NEXT(cb, next);
-					TAILQ_REMOVE(&src->callbacks, cb, next);
-					free(cb);
-				}
-				rte_intr_instance_free(src->intr_handle);
-				free(src);
+				eal_intr_source_free(src);
 				return -1;
-			} else if (bytes_read == 0)
-				EAL_LOG(ERR, "Read nothing from file "
-					"descriptor %d", events[n].data.fd);
-			else
-				call = true;
+			}
 		}
 
-- 
2.52.0
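
For readers who want to see the condition outside DPDK, the following is
a minimal standalone sketch. It is not part of the patch and is only an
illustration under stated assumptions. It uses a plain pipe in place of
an interrupt fd and assumes Linux: once the only writer is closed,
epoll reports a hangup on the read end and read() returns 0, which are
the two cases the patch handles. The helper names is_disconnect_event()
and probe_closed_pipe() are hypothetical and do not exist in DPDK.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: the same event mask the patch checks for. */
static int
is_disconnect_event(uint32_t events)
{
	return (events & (EPOLLRDHUP | EPOLLERR | EPOLLHUP)) != 0;
}

/*
 * Hypothetical probe: watch the read end of a pipe, close the writer,
 * then inspect what epoll and read() report.
 * Returns 1 if a disconnect event was reported and read() returned 0,
 * 0 if not, and -1 on setup failure or timeout.
 */
static int
probe_closed_pipe(void)
{
	struct epoll_event ev = { .events = EPOLLIN }, out;
	char buf[8];
	int fds[2], epfd, ret = -1;

	if (pipe(fds) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd < 0)
		goto close_pipe;
	ev.data.fd = fds[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0)
		goto close_epfd;

	/* Closing the only writer leaves the read end in a hangup state. */
	close(fds[1]);
	fds[1] = -1;

	if (epoll_wait(epfd, &out, 1, 1000) == 1)
		ret = is_disconnect_event(out.events) &&
		      read(fds[0], buf, sizeof(buf)) == 0;

	/*
	 * The wait is level-triggered, so the hangup would be reported on
	 * every subsequent epoll_wait() until the fd is removed.
	 */
	epoll_ctl(epfd, EPOLL_CTL_DEL, fds[0], NULL);
close_epfd:
	close(epfd);
close_pipe:
	close(fds[0]);
	if (fds[1] >= 0)
		close(fds[1]);
	return ret;
}
```

Calling probe_closed_pipe() from a small test program should return 1
on Linux. Skipping the EPOLL_CTL_DEL step and calling epoll_wait() in a
loop would show the same event being returned on every iteration, which
is the busy-loop described in the commit message.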


Thread overview: 33+ messages
2026-01-28 12:20 [PATCH] eal/linux: handle epoll error conditions Kevin Traynor
2026-01-29 12:51 ` Kevin Traynor
2026-02-06 17:20 ` [PATCH v2 0/2] interrupt epoll event handling Kevin Traynor
2026-02-06 17:20   ` [PATCH v2 1/2] net/mlx5: check for no data read in devx interrupt Kevin Traynor
2026-02-07  6:09     ` Stephen Hemminger
2026-02-10 15:05       ` Kevin Traynor
2026-02-10 17:05         ` Slava Ovsiienko
2026-02-10 19:07           ` Kevin Traynor
2026-02-10 20:58             ` Slava Ovsiienko
2026-02-19 14:44               ` Kevin Traynor
2026-02-06 17:20   ` [PATCH v2 2/2] eal/linux: handle interrupt epoll events Kevin Traynor
2026-02-07  6:11     ` Stephen Hemminger
2026-02-10 13:35       ` Kevin Traynor
2026-02-10  9:17     ` David Marchand
2026-02-10 14:47       ` Kevin Traynor
2026-02-10 18:06 ` [PATCH v3 0/2] interrupt epoll event handling Kevin Traynor
2026-02-10 18:06   ` [PATCH v3 1/2] net/mlx5: check for no data read in devx interrupt Kevin Traynor
2026-02-10 18:06   ` [PATCH v3 2/2] eal/linux: handle interrupt epoll events Kevin Traynor
2026-02-19 14:37 ` [PATCH v4 0/3] interrupt disconnect/error event handling Kevin Traynor
2026-02-19 14:38 ` Kevin Traynor
2026-02-19 14:38   ` [PATCH v4 1/3] eal/linux: handle interrupt epoll events Kevin Traynor
2026-02-19 14:38   ` [PATCH v4 2/3] eal/interrupt: add interrupt event info Kevin Traynor
2026-02-26 15:41     ` David Marchand
2026-03-02 11:47       ` Kevin Traynor
2026-02-19 14:38   ` [PATCH v4 3/3] net/mlx5: check devx disconnect/error interrupt events Kevin Traynor
2026-03-03 16:16     ` Slava Ovsiienko
2026-02-19 18:52   ` [PATCH v4 0/3] interrupt disconnect/error event handling Stephen Hemminger
2026-03-02 11:41     ` Kevin Traynor
2026-03-03 18:58 ` [PATCH v5 " Kevin Traynor
2026-03-03 18:58   ` [PATCH v5 1/3] eal/linux: handle interrupt epoll events Kevin Traynor
2026-03-03 18:58   ` [PATCH v5 2/3] eal/interrupt: add interrupt event info Kevin Traynor
2026-03-03 18:58   ` [PATCH v5 3/3] net/mlx5: check devx disconnect/error interrupt events Kevin Traynor
2026-03-04 11:09   ` [PATCH v5 0/3] interrupt disconnect/error event handling David Marchand
