public inbox for dev@dpdk.org
From: Kevin Traynor <ktraynor@redhat.com>
To: dev@dpdk.org
Cc: thomas@monjalon.net, david.marchand@redhat.com,
	dsosnowski@nvidia.com, viacheslavo@nvidia.com,
	hkalra@marvell.com, stephen@networkplumber.org,
	Kevin Traynor <ktraynor@redhat.com>,
	stable@dpdk.org
Subject: [PATCH v5 1/3] eal/linux: handle interrupt epoll events
Date: Tue,  3 Mar 2026 18:58:16 +0000
Message-ID: <20260303185818.28689-2-ktraynor@redhat.com>
In-Reply-To: <20260303185818.28689-1-ktraynor@redhat.com>

Add handling for epoll error and disconnect conditions EPOLLERR,
EPOLLHUP and EPOLLRDHUP.

These events indicate that the interrupt file descriptor is in
an error state or there has been a hangup.

Only do this for interrupts that are read in EAL. Interrupts that
are read outside EAL should handle disconnect/error events as
appropriate to their functionality. For example, virtio interrupt
handling has reconnect mechanisms for some cases.

Also, treat a read that returns zero bytes as an error condition.

Bugzilla ID: 1873
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
---
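Illustration only, not part of the patch: a minimal standalone sketch of
the condition the commit message describes, using a plain pipe rather
than a device interrupt fd. The names fd_is_disconnected() and
demo_pipe_hangup() are hypothetical and do not exist in DPDK. It assumes
Linux, where closing the write end of a pipe makes epoll report EPOLLHUP
on the read end and a subsequent read() returns 0.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: nonzero if the mask has an error or hangup bit. */
static int
fd_is_disconnected(uint32_t events)
{
	return (events & (EPOLLERR | EPOLLHUP | EPOLLRDHUP)) != 0;
}

/*
 * Hypothetical demo: returns 1 if a closed pipe writer is reported as a
 * disconnect and read() then returns 0 bytes, 0 if not, and -1 if the
 * setup or the wait itself failed.
 */
static int
demo_pipe_hangup(void)
{
	struct epoll_event ev = { .events = EPOLLIN | EPOLLRDHUP };
	struct epoll_event got;
	char buf[8];
	int pfd[2];
	int epfd;
	int ret = -1;

	if (pipe(pfd) < 0)
		return -1;

	/* The writer goes away before any data is written. */
	close(pfd[1]);

	epfd = epoll_create1(0);
	if (epfd < 0)
		goto out_pipe;

	ev.data.fd = pfd[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) < 0)
		goto out_epoll;

	/* Wait up to one second for the hangup to be reported. */
	if (epoll_wait(epfd, &got, 1, 1000) == 1)
		ret = fd_is_disconnected(got.events) &&
		      read(pfd[0], buf, sizeof(buf)) == 0;

out_epoll:
	close(epfd);
out_pipe:
	close(pfd[0]);
	return ret;
}
```

Checking the event mask before calling read() mirrors the order used in
the patch: the fd is treated as gone as soon as epoll flags it, without
relying on read() to fail.
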
 lib/eal/linux/eal_interrupts.c | 72 ++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/lib/eal/linux/eal_interrupts.c b/lib/eal/linux/eal_interrupts.c
index 9db978923a..f3f6bdd01d 100644
--- a/lib/eal/linux/eal_interrupts.c
+++ b/lib/eal/linux/eal_interrupts.c
@@ -887,4 +887,26 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
 }
 
+static void
+eal_intr_source_remove_and_free(struct rte_intr_source *src)
+{
+	struct rte_intr_callback *cb, *next;
+
+	/* Remove the interrupt source */
+	rte_spinlock_lock(&intr_lock);
+	TAILQ_REMOVE(&intr_sources, src, next);
+	rte_spinlock_unlock(&intr_lock);
+
+	/* Free callbacks */
+	for (cb = TAILQ_FIRST(&src->callbacks); cb; cb = next) {
+		next = TAILQ_NEXT(cb, next);
+		TAILQ_REMOVE(&src->callbacks, cb, next);
+		free(cb);
+	}
+
+	/* Free the interrupt source */
+	rte_intr_instance_free(src->intr_handle);
+	free(src);
+}
+
 static int
 eal_intr_process_interrupts(struct epoll_event *events, int nfds)
@@ -952,40 +974,38 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 
 		if (bytes_read > 0) {
-			/**
+			/*
+			 * Check for epoll error or disconnect events for
+			 * interrupts that are read directly in eal.
+			 */
+			if (events[n].events & (EPOLLERR | EPOLLHUP | EPOLLRDHUP)) {
+				EAL_LOG(ERR, "Disconnect condition on fd %d "
+					"(events=0x%x), removing from epoll",
+					events[n].data.fd, events[n].events);
+				eal_intr_source_remove_and_free(src);
+				return -1;
+			}
+
+			/*
 			 * read out to clear the ready-to-be-read flag
 			 * for epoll_wait.
 			 */
 			bytes_read = read(events[n].data.fd, &buf, bytes_read);
-			if (bytes_read < 0) {
+			if (bytes_read > 0) {
+				call = true;
+			} else if (bytes_read < 0) {
 				if (errno == EINTR || errno == EWOULDBLOCK)
 					continue;
 
-				EAL_LOG(ERR, "Error reading from file "
-					"descriptor %d: %s",
+				EAL_LOG(ERR, "Error reading from file descriptor %d: %s",
 					events[n].data.fd,
 					strerror(errno));
-				/*
-				 * The device is unplugged or buggy, remove
-				 * it as an interrupt source and return to
-				 * force the wait list to be rebuilt.
-				 */
-				rte_spinlock_lock(&intr_lock);
-				TAILQ_REMOVE(&intr_sources, src, next);
-				rte_spinlock_unlock(&intr_lock);
-
-				for (cb = TAILQ_FIRST(&src->callbacks); cb;
-							cb = next) {
-					next = TAILQ_NEXT(cb, next);
-					TAILQ_REMOVE(&src->callbacks, cb, next);
-					free(cb);
-				}
-				rte_intr_instance_free(src->intr_handle);
-				free(src);
+			} else {
+				EAL_LOG(ERR, "Read nothing from file descriptor %d",
+					events[n].data.fd);
+			}
+			if (bytes_read <= 0) {
+				eal_intr_source_remove_and_free(src);
 				return -1;
-			} else if (bytes_read == 0)
-				EAL_LOG(ERR, "Read nothing from file "
-					"descriptor %d", events[n].data.fd);
-			else
-				call = true;
+			}
 		}
 
-- 
2.53.0
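
Illustration only, not part of the patch: the callback cleanup in
eal_intr_source_remove_and_free() saves the next element before freeing
the current one. A minimal standalone sketch of that idiom with
<sys/queue.h> follows. The struct demo_cb type and the demo_cb_add() and
demo_cb_free_all() functions are hypothetical stand-ins, not the EAL
types.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a callback list entry. */
struct demo_cb {
	TAILQ_ENTRY(demo_cb) next;
	int id;
};

TAILQ_HEAD(demo_cb_list, demo_cb);

/* Append one entry to the list; returns 0 on success, -1 on failure. */
static int
demo_cb_add(struct demo_cb_list *list, int id)
{
	struct demo_cb *cb = calloc(1, sizeof(*cb));

	if (cb == NULL)
		return -1;
	cb->id = id;
	TAILQ_INSERT_TAIL(list, cb, next);
	return 0;
}

/*
 * Remove and free every entry. The successor is read before free(cb)
 * because the link field lives inside the element being freed.
 * Returns the number of entries freed.
 */
static unsigned int
demo_cb_free_all(struct demo_cb_list *list)
{
	struct demo_cb *cb, *next;
	unsigned int freed = 0;

	for (cb = TAILQ_FIRST(list); cb != NULL; cb = next) {
		next = TAILQ_NEXT(cb, next);
		TAILQ_REMOVE(list, cb, next);
		free(cb);
		freed++;
	}
	return freed;
}
```

A caller would initialise the head with TAILQ_HEAD_INITIALIZER, add a few
entries with demo_cb_add(), and then call demo_cb_free_all(); afterwards
TAILQ_EMPTY() on the head is true.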


Thread overview: 33+ messages
2026-01-28 12:20 [PATCH] eal/linux: handle epoll error conditions Kevin Traynor
2026-01-29 12:51 ` Kevin Traynor
2026-02-06 17:20 ` [PATCH v2 0/2] interrupt epoll event handling Kevin Traynor
2026-02-06 17:20   ` [PATCH v2 1/2] net/mlx5: check for no data read in devx interrupt Kevin Traynor
2026-02-07  6:09     ` Stephen Hemminger
2026-02-10 15:05       ` Kevin Traynor
2026-02-10 17:05         ` Slava Ovsiienko
2026-02-10 19:07           ` Kevin Traynor
2026-02-10 20:58             ` Slava Ovsiienko
2026-02-19 14:44               ` Kevin Traynor
2026-02-06 17:20   ` [PATCH v2 2/2] eal/linux: handle interrupt epoll events Kevin Traynor
2026-02-07  6:11     ` Stephen Hemminger
2026-02-10 13:35       ` Kevin Traynor
2026-02-10  9:17     ` David Marchand
2026-02-10 14:47       ` Kevin Traynor
2026-02-10 18:06 ` [PATCH v3 0/2] interrupt epoll event handling Kevin Traynor
2026-02-10 18:06   ` [PATCH v3 1/2] net/mlx5: check for no data read in devx interrupt Kevin Traynor
2026-02-10 18:06   ` [PATCH v3 2/2] eal/linux: handle interrupt epoll events Kevin Traynor
2026-02-19 14:37 ` [PATCH v4 0/3] interrupt disconnect/error event handling Kevin Traynor
2026-02-19 14:38 ` Kevin Traynor
2026-02-19 14:38   ` [PATCH v4 1/3] eal/linux: handle interrupt epoll events Kevin Traynor
2026-02-19 14:38   ` [PATCH v4 2/3] eal/interrupt: add interrupt event info Kevin Traynor
2026-02-26 15:41     ` David Marchand
2026-03-02 11:47       ` Kevin Traynor
2026-02-19 14:38   ` [PATCH v4 3/3] net/mlx5: check devx disconnect/error interrupt events Kevin Traynor
2026-03-03 16:16     ` Slava Ovsiienko
2026-02-19 18:52   ` [PATCH v4 0/3] interrupt disconnect/error event handling Stephen Hemminger
2026-03-02 11:41     ` Kevin Traynor
2026-03-03 18:58 ` [PATCH v5 " Kevin Traynor
2026-03-03 18:58   ` Kevin Traynor [this message]
2026-03-03 18:58   ` [PATCH v5 2/3] eal/interrupt: add interrupt event info Kevin Traynor
2026-03-03 18:58   ` [PATCH v5 3/3] net/mlx5: check devx disconnect/error interrupt events Kevin Traynor
2026-03-04 11:09   ` [PATCH v5 0/3] interrupt disconnect/error event handling David Marchand
