* [PATCH net] net: ps3_gelic_net: handle skb allocation failures
@ 2025-11-10 11:45 Florian Fuchs
2025-11-12 2:04 ` Jakub Kicinski
0 siblings, 1 reply; 4+ messages in thread
From: Florian Fuchs @ 2025-11-10 11:45 UTC (permalink / raw)
To: Geoff Levand, netdev, Jakub Kicinski
Cc: Andrew Lunn, David S . Miller, Eric Dumazet, Paolo Abeni,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, linuxppc-dev, linux-kernel, fuchsfl
Handle skb allocation failures in RX path, to avoid NULL pointer
dereference and RX stalls under memory pressure. If the refill fails
with -ENOMEM, complete napi polling and wake up later to retry via timer.
Also explicitly re-enable RX DMA after oom, so the dmac doesn't remain
stopped in this situation.
Previously, memory pressure could lead to skb allocation failures and
subsequent Oops like:
Oops: Kernel access of bad area, sig: 11 [#2]
Hardware name: SonyPS3 Cell Broadband Engine 0x701000 PS3
NIP [c0003d0000065900] gelic_net_poll+0x6c/0x2d0 [ps3_gelic] (unreliable)
LR [c0003d00000659c4] gelic_net_poll+0x130/0x2d0 [ps3_gelic]
Call Trace:
gelic_net_poll+0x130/0x2d0 [ps3_gelic] (unreliable)
__napi_poll+0x44/0x168
net_rx_action+0x178/0x290
Steps to reproduce the issue:
1. Start a continuous network traffic, like scp of a 20GB file
2. Inject failslab errors using the kernel fault injection:
echo -1 > /sys/kernel/debug/failslab/times
echo 30 > /sys/kernel/debug/failslab/interval
echo 100 > /sys/kernel/debug/failslab/probability
3. After some time, traces start to appear, kernel Oopses
and the system stops
Step 2 is not always necessary, as it is usually already triggered by
the transfer of a big enough file.
Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3")
Signed-off-by: Florian Fuchs <fuchsfl@gmail.com>
---
drivers/net/ethernet/toshiba/ps3_gelic_net.c | 54 +++++++++++++++-----
drivers/net/ethernet/toshiba/ps3_gelic_net.h | 1 +
2 files changed, 42 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 5ee8e8980393..a8121f7583f9 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -259,6 +259,7 @@ void gelic_card_down(struct gelic_card *card)
mutex_lock(&card->updown_lock);
if (atomic_dec_if_positive(&card->users) == 0) {
pr_debug("%s: real do\n", __func__);
+ timer_delete_sync(&card->rx_oom_timer);
napi_disable(&card->napi);
/*
* Disable irq. Wireless interrupts will
@@ -970,7 +971,8 @@ static void gelic_net_pass_skb_up(struct gelic_descr *descr,
* gelic_card_decode_one_descr - processes an rx descriptor
* @card: card structure
*
- * returns 1 if a packet has been sent to the stack, otherwise 0
+ * returns 1 if a packet has been sent to the stack, -ENOMEM on skb alloc
+ * failure, otherwise 0
*
* processes an rx descriptor by iommu-unmapping the data buffer and passing
* the packet up to the stack
@@ -981,16 +983,17 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
struct gelic_descr_chain *chain = &card->rx_chain;
struct gelic_descr *descr = chain->head;
struct net_device *netdev = NULL;
- int dmac_chain_ended;
+ int dmac_chain_ended = 0;
status = gelic_descr_get_status(descr);
if (status == GELIC_DESCR_DMA_CARDOWNED)
return 0;
- if (status == GELIC_DESCR_DMA_NOT_IN_USE) {
+ if (status == GELIC_DESCR_DMA_NOT_IN_USE || !descr->skb) {
dev_dbg(ctodev(card), "dormant descr? %p\n", descr);
- return 0;
+ dmac_chain_ended = 1;
+ goto refill;
}
/* netdevice select */
@@ -1048,9 +1051,10 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
refill:
/* is the current descriptor terminated with next_descr == NULL? */
- dmac_chain_ended =
- be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
- GELIC_DESCR_RX_DMA_CHAIN_END;
+ if (!dmac_chain_ended)
+ dmac_chain_ended =
+ be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
+ GELIC_DESCR_RX_DMA_CHAIN_END;
/*
* So that always DMAC can see the end
* of the descriptor chain to avoid
@@ -1062,10 +1066,12 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
/*
- * this call can fail, but for now, just leave this
- * descriptor without skb
+ * this call can fail, propagate the error
*/
- gelic_descr_prepare_rx(card, descr);
+ int ret = gelic_descr_prepare_rx(card, descr);
+
+ if (ret)
+ return ret;
chain->tail = descr;
chain->head = descr->next;
@@ -1087,6 +1093,17 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
return 1;
}
+/**
+ * gelic_rx_oom_timer - Restart napi poll if oom occurred
+ * @t: timer list
+ */
+static void gelic_rx_oom_timer(struct timer_list *t)
+{
+ struct gelic_card *card = timer_container_of(card, t, rx_oom_timer);
+
+ napi_schedule(&card->napi);
+}
+
/**
* gelic_net_poll - NAPI poll function called by the stack to return packets
* @napi: napi structure
@@ -1099,12 +1116,21 @@ static int gelic_net_poll(struct napi_struct *napi, int budget)
{
struct gelic_card *card = container_of(napi, struct gelic_card, napi);
int packets_done = 0;
+ int work_result = 0;
while (packets_done < budget) {
- if (!gelic_card_decode_one_descr(card))
- break;
+ work_result = gelic_card_decode_one_descr(card);
+ if (work_result == 1) {
+ packets_done++;
+ continue;
+ }
+ break;
+ }
- packets_done++;
+ if (work_result == -ENOMEM) {
+ napi_complete_done(napi, packets_done);
+ mod_timer(&card->rx_oom_timer, jiffies + 1);
+ return packets_done;
}
if (packets_done < budget) {
@@ -1576,6 +1602,8 @@ static struct gelic_card *gelic_alloc_card_net(struct net_device **netdev)
mutex_init(&card->updown_lock);
atomic_set(&card->users, 0);
+ timer_setup(&card->rx_oom_timer, gelic_rx_oom_timer, 0);
+
return card;
}
diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.h b/drivers/net/ethernet/toshiba/ps3_gelic_net.h
index f7d7931e51b7..c10f1984a5a1 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.h
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.h
@@ -268,6 +268,7 @@ struct gelic_vlan_id {
struct gelic_card {
struct napi_struct napi;
struct net_device *netdev[GELIC_PORT_MAX];
+ struct timer_list rx_oom_timer;
/*
* hypervisor requires irq_status should be
* 8 bytes aligned, but u64 member is
base-commit: 96a9178a29a6b84bb632ebeb4e84cf61191c73d5
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net] net: ps3_gelic_net: handle skb allocation failures
2025-11-10 11:45 [PATCH net] net: ps3_gelic_net: handle skb allocation failures Florian Fuchs
@ 2025-11-12 2:04 ` Jakub Kicinski
2025-11-12 9:34 ` Florian Fuchs
0 siblings, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2025-11-12 2:04 UTC (permalink / raw)
To: Florian Fuchs
Cc: Geoff Levand, netdev, Andrew Lunn, David S . Miller, Eric Dumazet,
Paolo Abeni, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, linuxppc-dev, linux-kernel
On Mon, 10 Nov 2025 12:45:23 +0100 Florian Fuchs wrote:
> Handle skb allocation failures in RX path, to avoid NULL pointer
> dereference and RX stalls under memory pressure. If the refill fails
> with -ENOMEM, complete napi polling and wake up later to retry via timer.
> Also explicitly re-enable RX DMA after oom, so the dmac doesn't remain
> stopped in this situation.
>
> Previously, memory pressure could lead to skb allocation failures and
> subsequent Oops like:
>
> Oops: Kernel access of bad area, sig: 11 [#2]
> Hardware name: SonyPS3 Cell Broadband Engine 0x701000 PS3
> NIP [c0003d0000065900] gelic_net_poll+0x6c/0x2d0 [ps3_gelic] (unreliable)
> LR [c0003d00000659c4] gelic_net_poll+0x130/0x2d0 [ps3_gelic]
> Call Trace:
> gelic_net_poll+0x130/0x2d0 [ps3_gelic] (unreliable)
> __napi_poll+0x44/0x168
> net_rx_action+0x178/0x290
>
> Steps to reproduce the issue:
> 1. Start a continuous network traffic, like scp of a 20GB file
> 2. Inject failslab errors using the kernel fault injection:
> echo -1 > /sys/kernel/debug/failslab/times
> echo 30 > /sys/kernel/debug/failslab/interval
> echo 100 > /sys/kernel/debug/failslab/probability
> 3. After some time, traces start to appear, kernel Oopses
> and the system stops
>
> Step 2 is not always necessary, as it is usually already triggered by
> the transfer of a big enough file.
Have you actually tested this on a real device?
Please describe the testing you have done rather that "how to test".
> Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3")
> Signed-off-by: Florian Fuchs <fuchsfl@gmail.com>
> ---
> drivers/net/ethernet/toshiba/ps3_gelic_net.c | 54 +++++++++++++++-----
> drivers/net/ethernet/toshiba/ps3_gelic_net.h | 1 +
> 2 files changed, 42 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index 5ee8e8980393..a8121f7583f9 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -259,6 +259,7 @@ void gelic_card_down(struct gelic_card *card)
> mutex_lock(&card->updown_lock);
> if (atomic_dec_if_positive(&card->users) == 0) {
> pr_debug("%s: real do\n", __func__);
> + timer_delete_sync(&card->rx_oom_timer);
> napi_disable(&card->napi);
I think the ordering here should be inverted
> /*
> * Disable irq. Wireless interrupts will
> @@ -970,7 +971,8 @@ static void gelic_net_pass_skb_up(struct gelic_descr *descr,
> * gelic_card_decode_one_descr - processes an rx descriptor
> * @card: card structure
> *
> - * returns 1 if a packet has been sent to the stack, otherwise 0
> + * returns 1 if a packet has been sent to the stack, -ENOMEM on skb alloc
> + * failure, otherwise 0
> *
> * processes an rx descriptor by iommu-unmapping the data buffer and passing
> * the packet up to the stack
> @@ -981,16 +983,17 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
> struct gelic_descr_chain *chain = &card->rx_chain;
> struct gelic_descr *descr = chain->head;
> struct net_device *netdev = NULL;
> - int dmac_chain_ended;
> + int dmac_chain_ended = 0;
>
> status = gelic_descr_get_status(descr);
>
> if (status == GELIC_DESCR_DMA_CARDOWNED)
> return 0;
>
> - if (status == GELIC_DESCR_DMA_NOT_IN_USE) {
> + if (status == GELIC_DESCR_DMA_NOT_IN_USE || !descr->skb) {
> dev_dbg(ctodev(card), "dormant descr? %p\n", descr);
> - return 0;
> + dmac_chain_ended = 1;
> + goto refill;
> }
>
> /* netdevice select */
> @@ -1048,9 +1051,10 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
> refill:
>
> /* is the current descriptor terminated with next_descr == NULL? */
> - dmac_chain_ended =
> - be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
> - GELIC_DESCR_RX_DMA_CHAIN_END;
> + if (!dmac_chain_ended)
> + dmac_chain_ended =
> + be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
> + GELIC_DESCR_RX_DMA_CHAIN_END;
TBH handling the OOM inside the Rx function seems a little fragile.
What if there is a packet to Rx as we enter. I don't see any loop here
it just replaces the used buffer..
> /*
> * So that always DMAC can see the end
> * of the descriptor chain to avoid
> @@ -1062,10 +1066,12 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
> gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
>
> /*
> - * this call can fail, but for now, just leave this
> - * descriptor without skb
> + * this call can fail, propagate the error
> */
> - gelic_descr_prepare_rx(card, descr);
> + int ret = gelic_descr_prepare_rx(card, descr);
> +
please dont declare variables half way thru a function and dont
separate function call from its error check with empty lines
> + if (ret)
> + return ret;
>
> chain->tail = descr;
> chain->head = descr->next;
> @@ -1087,6 +1093,17 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
> return 1;
> }
>
> +/**
> + * gelic_rx_oom_timer - Restart napi poll if oom occurred
> + * @t: timer list
> + */
This kdoc is worthless
> +static void gelic_rx_oom_timer(struct timer_list *t)
> +{
> + struct gelic_card *card = timer_container_of(card, t, rx_oom_timer);
> +
> + napi_schedule(&card->napi);
> +}
> +
> /**
> * gelic_net_poll - NAPI poll function called by the stack to return packets
> * @napi: napi structure
> @@ -1099,12 +1116,21 @@ static int gelic_net_poll(struct napi_struct *napi, int budget)
> {
> struct gelic_card *card = container_of(napi, struct gelic_card, napi);
> int packets_done = 0;
> + int work_result = 0;
>
> while (packets_done < budget) {
> - if (!gelic_card_decode_one_descr(card))
> - break;
> + work_result = gelic_card_decode_one_descr(card);
> + if (work_result == 1) {
> + packets_done++;
> + continue;
> + }
> + break;
common / success path should be the less indented one.
> + }
>
> - packets_done++;
> + if (work_result == -ENOMEM) {
> + napi_complete_done(napi, packets_done);
> + mod_timer(&card->rx_oom_timer, jiffies + 1);
> + return packets_done;
> }
>
> if (packets_done < budget) {
> @@ -1576,6 +1602,8 @@ static struct gelic_card *gelic_alloc_card_net(struct net_device **netdev)
> mutex_init(&card->updown_lock);
> atomic_set(&card->users, 0);
>
> + timer_setup(&card->rx_oom_timer, gelic_rx_oom_timer, 0);
> +
> return card;
> }
>
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.h b/drivers/net/ethernet/toshiba/ps3_gelic_net.h
> index f7d7931e51b7..c10f1984a5a1 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.h
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.h
> @@ -268,6 +268,7 @@ struct gelic_vlan_id {
> struct gelic_card {
> struct napi_struct napi;
> struct net_device *netdev[GELIC_PORT_MAX];
> + struct timer_list rx_oom_timer;
> /*
> * hypervisor requires irq_status should be
> * 8 bytes aligned, but u64 member is
>
> base-commit: 96a9178a29a6b84bb632ebeb4e84cf61191c73d5
--
pw-bot: cr
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net] net: ps3_gelic_net: handle skb allocation failures
2025-11-12 2:04 ` Jakub Kicinski
@ 2025-11-12 9:34 ` Florian Fuchs
2025-11-12 14:31 ` Jakub Kicinski
0 siblings, 1 reply; 4+ messages in thread
From: Florian Fuchs @ 2025-11-12 9:34 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Geoff Levand, netdev, Andrew Lunn, David S . Miller, Eric Dumazet,
Paolo Abeni, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, linuxppc-dev, linux-kernel
Hi Jakub,
On 11 Nov 18:04, Jakub Kicinski wrote:
> On Mon, 10 Nov 2025 12:45:23 +0100 Florian Fuchs wrote:
> > Steps to reproduce the issue:
> > 1. Start a continuous network traffic, like scp of a 20GB file
> > 2. Inject failslab errors using the kernel fault injection:
> > echo -1 > /sys/kernel/debug/failslab/times
> > echo 30 > /sys/kernel/debug/failslab/interval
> > echo 100 > /sys/kernel/debug/failslab/probability
> > 3. After some time, traces start to appear, kernel Oopses
> > and the system stops
> >
> > Step 2 is not always necessary, as it is usually already triggered by
> > the transfer of a big enough file.
>
> Have you actually tested this on a real device?
> Please describe the testing you have done rather that "how to test".
Yes, of course, I intensively tested the patch on a Sony PS3 (CECHL04
PAL). I ran the final fix for many hours, with continuous system load
and high network transfer load. I am happy to get feedback on better or
acceptable testing.
My testing consisted of:
1. Produce Oops: Test the kernel without any gelic patches, scp a big
file to usb stick and create high cpu/memory load (like compiling
some software) or extract verbose, tar xv, a big file via ssh
2. Safely re-produce the Oops using failslab injection, so I dont need
to wait for it
3. Develop against that failslab injection, high load and network
transfer
4. First solution was to just always refill the chain, which resulted in
RX stall after some time, as the dmac seemed to be stopped, when buffer
was full and NOT_IN_USE head found and needed rmmod/modprobe to work
again
5. Run the final fix for many hours while injecting failslabs, high load,
and high network load with continuous scp and netcat
6. Further massive improvement is to convert the driver to use
napi_gro_receive and napi_skb_alloc, but this would be a separate
patch
> > --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> > +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> > @@ -259,6 +259,7 @@ void gelic_card_down(struct gelic_card *card)
> > mutex_lock(&card->updown_lock);
> > if (atomic_dec_if_positive(&card->users) == 0) {
> > pr_debug("%s: real do\n", __func__);
> > + timer_delete_sync(&card->rx_oom_timer);
> > napi_disable(&card->napi);
>
> I think the ordering here should be inverted
I thought, that there might be a race condition in the inverted order
like that napi gets re-enabled by the timer in between of the down:
1. napi_disable
2. rx_oom_timer runs and calls napi_schedule again
3. timer_delete_sync
So the timer is deleted first, to prevent any possibility to run.
> TBH handling the OOM inside the Rx function seems a little fragile.
> What if there is a packet to Rx as we enter. I don't see any loop here
> it just replaces the used buffer..
I am not sure, the handling needs to happen, when the skb allocation
fails, and that happens in the rx function, right? I am open to better
fitting fix position.
Thank you for your feedback!
Florian
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net] net: ps3_gelic_net: handle skb allocation failures
2025-11-12 9:34 ` Florian Fuchs
@ 2025-11-12 14:31 ` Jakub Kicinski
0 siblings, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2025-11-12 14:31 UTC (permalink / raw)
To: Florian Fuchs
Cc: Geoff Levand, netdev, Andrew Lunn, David S . Miller, Eric Dumazet,
Paolo Abeni, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, linuxppc-dev, linux-kernel
On Wed, 12 Nov 2025 10:34:01 +0100 Florian Fuchs wrote:
> > > --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> > > +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> > > @@ -259,6 +259,7 @@ void gelic_card_down(struct gelic_card *card)
> > > mutex_lock(&card->updown_lock);
> > > if (atomic_dec_if_positive(&card->users) == 0) {
> > > pr_debug("%s: real do\n", __func__);
> > > + timer_delete_sync(&card->rx_oom_timer);
> > > napi_disable(&card->napi);
> >
> > I think the ordering here should be inverted
>
> I thought, that there might be a race condition in the inverted order
> like that napi gets re-enabled by the timer in between of the down:
>
> 1. napi_disable
> 2. rx_oom_timer runs and calls napi_schedule again
> 3. timer_delete_sync
>
> So the timer is deleted first, to prevent any possibility to run.
napi_disable() makes napi_schedule() a nop (it makes it look like it's
already scheduled).
> > TBH handling the OOM inside the Rx function seems a little fragile.
> > What if there is a packet to Rx as we enter. I don't see any loop here
> > it just replaces the used buffer..
>
> I am not sure, the handling needs to happen, when the skb allocation
> fails, and that happens in the rx function, right? I am open to better
> fitting fix position.
Purely from the structure of the code PoV it'd be cleaner if the
alloc/refill was separate from the processing so we can call just
that part.
But looking closer I think the handling is fine as is. So I think
just addressing the nits is fine for v2
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-11-12 14:31 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-10 11:45 [PATCH net] net: ps3_gelic_net: handle skb allocation failures Florian Fuchs
2025-11-12 2:04 ` Jakub Kicinski
2025-11-12 9:34 ` Florian Fuchs
2025-11-12 14:31 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).