* [patch netdrvr sis900] net: come alive after temporary memory shortage
@ 2005-09-26 12:19 Konstantin Khorenko
2005-09-26 13:06 ` Daniele Venzano
0 siblings, 1 reply; 3+ messages in thread
From: Konstantin Khorenko @ 2005-09-26 12:19 UTC (permalink / raw)
To: Daniele Venzano
Cc: Vasily Averin, Stanislav Protassov, Ollie Lho, linux-net,
linux-kernel
[-- Attachment #1: Type: text/plain, Size: 4123 bytes --]
This patch solves the following problems:
1) A forgotten counter increment in sis900_rx() when no memory can be
allocated for an skb, which leads to failure of the whole interface.
The problem is accompanied by the messages:
eth0: Memory squeeze,deferring packet.
eth0: NULL pointer encountered in Rx ring, skipping
2) If the counter cur_rx overflows while there is a temporary memory
shortage, the buffer can't be recreated later, when memory IS available.
3) Limit the work done in the handler, to prevent endless packet processing
if new packets arrive faster than they are handled.
Signed-off-by: Konstantin Khorenko <khorenko@sw.ru>
Signed-off-by: Vasily Averin <vvs@sw.ru>
-----------------------------------
We had a customer who complained about a problem with a network card
supported by the sis900 driver.
Problem description: at random times the card suddenly stops working, and
only a reboot brings it back.
The failure is accompanied by these messages in /var/log/messages:
eth0: Memory squeeze,deferring packet.
eth0: NULL pointer encountered in Rx ring, skipping
eth0: NULL pointer encountered in Rx ring, skipping
eth0: NULL pointer encountered in Rx ring, skipping
(till reboot)
We discovered that this problem is already known:
http://www.ussg.iu.edu/hypermail/linux/kernel/0407.3/0566.html
http://www.kernelnewbies.org/documents/kdoc/sis900/problems.html
Nevertheless it has not been fixed so far, so we tried to fix it.
(1) Function sis900_rx().
During normal execution dirty_rx <= cur_rx ALWAYS holds.
Let's assume we are short of memory.
	unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC;
	...
	while (rx_status & OWN) {
		...
		if (some error check) {
			...
		} else {
			/* 5. Next call: the previous one left
			 *    rx_skbuff[cur_rx % NUM_RX_DESC] == NULL,
			 *    which means rx_skbuff[entry] == NULL. */
			if (sis_priv->rx_skbuff[entry] == NULL) {
				printk(KERN_INFO "%s: NULL pointer "
				       "encountered in Rx ring, skipping\n",
				       net_dev->name);
				/* 6. Print and exit the while() loop. */
				break;
			}
			...
			/* 1. The allocation fails here. */
			if ((skb = dev_alloc_skb(RX_BUF_SIZE)) == NULL) {
				...
				printk(KERN_INFO "%s: Memory squeeze,"
				       "deferring packet.\n",
				       net_dev->name);
				sis_priv->rx_skbuff[entry] = NULL;
				/* 2. Now rx_skbuff[cur_rx % NUM_RX_DESC] == NULL. */
				...
				/* 3. We exit the while() loop here without
				 *    incrementing cur_rx! */
				break;
			}
			...
		} /* of else */
		sis_priv->cur_rx++;
		entry = sis_priv->cur_rx % NUM_RX_DESC;
		...
	} /* of while */
	/* 4. We refill all buffers rx_skbuff[entry] where entry < cur_rx;
	 *    rx_skbuff[cur_rx % NUM_RX_DESC] is NULL before and AFTER the loop. */
	for (; sis_priv->cur_rx > sis_priv->dirty_rx; sis_priv->dirty_rx++) {
		entry = sis_priv->dirty_rx % NUM_RX_DESC;
		if (sis_priv->rx_skbuff[entry] == NULL) {
			...
			sis_priv->rx_skbuff[entry] = skb;
			...
		}
	}
No matter how many times the function is called, cur_rx won't be incremented,
and thus rx_skbuff[cur_rx % NUM_RX_DESC] will stay NULL forever, which results
in never-ending messages and dropped packets.
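The stall described above can be reproduced outside the kernel with a small model. This is only a sketch: `struct ring`, `ring_init()` and `rx_pass()` are hypothetical stand-ins invented for illustration, `NUM_RX_DESC` is assumed to be 16, and each pass handles a single packet. The `alloc_ok` flag models whether dev_alloc_skb() succeeds, and `advance_on_failure` selects the behaviour proposed by the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_RX_DESC 16	/* assumed ring size, for illustration only */

/* Hypothetical stand-in for the driver's private Rx state. */
struct ring {
	unsigned int cur_rx;
	unsigned int dirty_rx;
	bool has_buf[NUM_RX_DESC];	/* true: slot holds a buffer */
};

static void ring_init(struct ring *r)
{
	r->cur_rx = 0;
	r->dirty_rx = 0;
	for (int i = 0; i < NUM_RX_DESC; i++)
		r->has_buf[i] = true;
}

/*
 * One simplified Rx pass that tries to receive a single packet.
 * Returns true if a packet was delivered.
 */
static bool rx_pass(struct ring *r, bool alloc_ok, bool advance_on_failure)
{
	unsigned int entry = r->cur_rx % NUM_RX_DESC;
	bool delivered = false;

	if (!r->has_buf[entry]) {
		/* "NULL pointer encountered in Rx ring, skipping" */
	} else if (!alloc_ok) {
		/* "Memory squeeze,deferring packet." leaves a hole. */
		r->has_buf[entry] = false;
		if (advance_on_failure)
			r->cur_rx++;	/* the increment added by the patch */
	} else {
		/* Packet handed up, a fresh buffer takes the slot. */
		r->cur_rx++;
		delivered = true;
	}

	/* Refill loop: only visits slots between dirty_rx and cur_rx. */
	for (; r->cur_rx != r->dirty_rx; r->dirty_rx++) {
		entry = r->dirty_rx % NUM_RX_DESC;
		if (r->has_buf[entry])
			continue;
		if (!alloc_ok)
			break;		/* still no memory, retry later */
		r->has_buf[entry] = true;
	}
	return delivered;
}
```

Without the increment, one failed allocation leaves cur_rx equal to dirty_rx, so the refill loop never reaches the hole and every later pass skips. With the increment, the hole falls between dirty_rx and cur_rx and is refilled as soon as memory is available again.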
------
(2) The same function sis900_rx().
	for (; sis_priv->cur_rx > sis_priv->dirty_rx; sis_priv->dirty_rx++) {
		entry = sis_priv->dirty_rx % NUM_RX_DESC;
		if (sis_priv->rx_skbuff[entry] == NULL) {
			...skb = dev_alloc_skb(RX_BUF_SIZE)...
			...
			sis_priv->rx_skbuff[entry] = skb;
			...
		}
	}
Assume cur_rx overflowed during a previous execution of the while() loop, but
dirty_rx has NOT, and we really need to refill buffers.
The comparison sis_priv->cur_rx > sis_priv->dirty_rx will fail and the buffers
won't be refilled.
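The wraparound case can be checked in isolation. The sketch below assumes both counters are of type unsigned int (an assumption, the field declarations are not shown in this message); the three helper functions are hypothetical and exist only to compare the conditions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: cur_rx and dirty_rx are both unsigned int. */

/* The comparison as written in the description above. */
static bool need_refill_gt(unsigned int cur_rx, unsigned int dirty_rx)
{
	return cur_rx > dirty_rx;
}

/* The comparison used by the patch. */
static bool need_refill_ne(unsigned int cur_rx, unsigned int dirty_rx)
{
	return cur_rx != dirty_rx;
}

/*
 * Number of slots waiting for a refill. Unsigned subtraction is done
 * modulo 2^N, so the distance stays correct across a wrap of cur_rx.
 */
static unsigned int pending(unsigned int cur_rx, unsigned int dirty_rx)
{
	return cur_rx - dirty_rx;
}
```

With dirty_rx at UINT_MAX - 1 and cur_rx three steps ahead (wrapped to 1), the `>` form reports nothing to refill while `!=` and the unsigned distance both report pending work. Under the same assumption, a condition written as `cur_rx - dirty_rx > 0` behaves like `!=`, because an unsigned difference is greater than zero exactly when the two values differ.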
----------
(3) The same function sis900_rx().
Assume the whole buffer is filled, there is no memory shortage, and the
network card receives packets faster than the kernel processes them in the
while (rx_status & OWN) loop of sis900_rx(): execution will never
leave the loop.
sis900_rx() is called from the interrupt handler, so it's not a good idea to `do
"too much" work here` (a sentence from the sources :) )
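A bounded loop of this kind can be sketched as follows. `packet_ready()` and `rx_bounded()` are hypothetical names, and `work_limit()` only evaluates the expression used in the attached diff under two assumptions that are not confirmed by this message: the counters are unsigned int and NUM_RX_DESC is 16.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_RX_DESC 16	/* assumed ring size, for illustration only */

/* Hypothetical packet source that never runs dry: it models a card that
 * hands over a new descriptor as fast as the loop consumes one. */
static bool packet_ready(void)
{
	return true;
}

/* Process at most `budget` packets, then return so the handler can exit.
 * Returns the number of packets processed. */
static int rx_bounded(int budget)
{
	int rx_work_limit = budget;
	int done = 0;

	while (packet_ready()) {
		if (--rx_work_limit < 0)
			break;
		done++;		/* stands in for the per-packet work */
	}
	return done;
}

/* The limit expression from the diff, assuming unsigned int counters. */
static int work_limit(unsigned int cur_rx, unsigned int dirty_rx)
{
	return (int)((dirty_rx - cur_rx) % NUM_RX_DESC);
}
```

The loop ends after `budget` iterations even though the source always has another packet. Under the stated assumptions the limit expression gives NUM_RX_DESC minus the distance when cur_rx is ahead of dirty_rx, but 0 when the two counters are equal, and a budget of 0 makes the loop exit before handling anything, so the starting value deserves a check against the real field types.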
----------
Hope you'll check these changes and find them useful. :)
Kernels with the patches compile, but are untested.
This patch is against the mainline 2.6.13.1 kernel.
--
Best regards,
Konstantin Khorenko,
SWsoft, Inc.
[-- Attachment #2: diff-sis900-2.6.13.1 --]
[-- Type: text/plain, Size: 2298 bytes --]
--- ./drivers/net/sis900.c.sis900 2005-08-29 03:41:01.000000000 +0400
+++ ./drivers/net/sis900.c 2005-09-19 14:34:42.000000000 +0400
@@ -1696,6 +1696,14 @@ static int sis900_rx(struct net_device *
long ioaddr = net_dev->base_addr;
unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC;
u32 rx_status = sis_priv->rx_ring[entry].cmdsts;
+ /*
+ * If cur > dirty, then limit = NUM_RX_DESC - cur + dirty =
+ * NUM_RX_DESC + (dirty - cur)
+ * If cur < dirty (cur overflowed, dirty - not), then
+ * limit = dirty - cur
+ */
+ int rx_work_limit =
+ (sis_priv->dirty_rx - sis_priv->cur_rx) % NUM_RX_DESC;
if (netif_msg_rx_status(sis_priv))
printk(KERN_DEBUG "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d "
@@ -1705,6 +1713,8 @@ static int sis900_rx(struct net_device *
while (rx_status & OWN) {
unsigned int rx_size;
+ if (--rx_work_limit < 0)
+ break;
rx_size = (rx_status & DSIZE) - CRC_SIZE;
if (rx_status & (ABORT|OVERRUN|TOOLONG|RUNT|RXISERR|CRCERR|FAERR)) {
@@ -1770,6 +1780,7 @@ static int sis900_rx(struct net_device *
sis_priv->rx_ring[entry].cmdsts = 0;
sis_priv->rx_ring[entry].bufptr = 0;
sis_priv->stats.rx_dropped++;
+ sis_priv->cur_rx++;
break;
}
skb->dev = net_dev;
@@ -1787,7 +1798,7 @@ static int sis900_rx(struct net_device *
/* refill the Rx buffer, what if the rate of refilling is slower
* than consuming ?? */
- for (;sis_priv->cur_rx - sis_priv->dirty_rx > 0; sis_priv->dirty_rx++) {
+ for (; sis_priv->cur_rx != sis_priv->dirty_rx; sis_priv->dirty_rx++) {
struct sk_buff *skb;
entry = sis_priv->dirty_rx % NUM_RX_DESC;
#
# Patch solves following problems:
# 1) Forgotten counter incrementation in sis900_rx() in case
# it doesn't get memory for skb, that leads to whole interface failure.
# Problem is accompanied with messages:
# eth0: Memory squeeze,deferring packet.
# eth0: NULL pointer encountered in Rx ring, skipping
# 2) If counter cur_rx overflows and there'll be temporary memory problems
# buffer can't be recreated later, when memory IS avaliable.
# 3) Limit the work in handler to prevent the endless packets processing if
# new packets are generated faster then handled.
#
# Signed-off-by: Konstantin Khorenko <khorenko@sw.ru>
# Signed-off-by: Vasily Averin <vvs@sw.ru>
* [patch netdrvr sis900] net: come alive after temporary memory shortage
@ 2005-09-26 12:26 Konstantin Khorenko
0 siblings, 0 replies; 3+ messages in thread
From: Konstantin Khorenko @ 2005-09-26 12:26 UTC (permalink / raw)
To: Daniele Venzano
Cc: Vasily Averin, Stanislav Protassov, linux-net, linux-kernel,
marcelo
[-- Attachment #1: Type: text/plain, Size: 4121 bytes --]
This patch solves the following problems:
1) A forgotten counter increment in sis900_rx() when no memory can be
allocated for an skb, which leads to failure of the whole interface.
The problem is accompanied by the messages:
eth0: Memory squeeze,deferring packet.
eth0: NULL pointer encountered in Rx ring, skipping
2) If the counter cur_rx overflows while there is a temporary memory
shortage, the buffer can't be recreated later, when memory IS available.
3) Limit the work done in the handler, to prevent endless packet processing
if new packets arrive faster than they are handled.
Signed-off-by: Konstantin Khorenko <khorenko@sw.ru>
Signed-off-by: Vasily Averin <vvs@sw.ru>
-----------------------------------
We had a customer who complained about a problem with a network card
supported by the sis900 driver.
Problem description: at random times the card suddenly stops working, and
only a reboot brings it back.
The failure is accompanied by these messages in /var/log/messages:
eth0: Memory squeeze,deferring packet.
eth0: NULL pointer encountered in Rx ring, skipping
eth0: NULL pointer encountered in Rx ring, skipping
eth0: NULL pointer encountered in Rx ring, skipping
(till reboot)
We discovered that this problem is already known:
http://www.ussg.iu.edu/hypermail/linux/kernel/0407.3/0566.html
http://www.kernelnewbies.org/documents/kdoc/sis900/problems.html
Nevertheless it has not been fixed so far, so we tried to fix it.
(1) Function sis900_rx().
During normal execution dirty_rx <= cur_rx ALWAYS holds.
Let's assume we are short of memory.
	unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC;
	...
	while (rx_status & OWN) {
		...
		if (some error check) {
			...
		} else {
			/* 5. Next call: the previous one left
			 *    rx_skbuff[cur_rx % NUM_RX_DESC] == NULL,
			 *    which means rx_skbuff[entry] == NULL. */
			if (sis_priv->rx_skbuff[entry] == NULL) {
				printk(KERN_INFO "%s: NULL pointer "
				       "encountered in Rx ring, skipping\n",
				       net_dev->name);
				/* 6. Print and exit the while() loop. */
				break;
			}
			...
			/* 1. The allocation fails here. */
			if ((skb = dev_alloc_skb(RX_BUF_SIZE)) == NULL) {
				...
				printk(KERN_INFO "%s: Memory squeeze,"
				       "deferring packet.\n",
				       net_dev->name);
				sis_priv->rx_skbuff[entry] = NULL;
				/* 2. Now rx_skbuff[cur_rx % NUM_RX_DESC] == NULL. */
				...
				/* 3. We exit the while() loop here without
				 *    incrementing cur_rx! */
				break;
			}
			...
		} /* of else */
		sis_priv->cur_rx++;
		entry = sis_priv->cur_rx % NUM_RX_DESC;
		...
	} /* of while */
	/* 4. We refill all buffers rx_skbuff[entry] where entry < cur_rx;
	 *    rx_skbuff[cur_rx % NUM_RX_DESC] is NULL before and AFTER the loop. */
	for (; sis_priv->cur_rx > sis_priv->dirty_rx; sis_priv->dirty_rx++) {
		entry = sis_priv->dirty_rx % NUM_RX_DESC;
		if (sis_priv->rx_skbuff[entry] == NULL) {
			...
			sis_priv->rx_skbuff[entry] = skb;
			...
		}
	}
No matter how many times the function is called, cur_rx won't be incremented,
and thus rx_skbuff[cur_rx % NUM_RX_DESC] will stay NULL forever, which results
in never-ending messages and dropped packets.
------
(2) The same function sis900_rx().
	for (; sis_priv->cur_rx > sis_priv->dirty_rx; sis_priv->dirty_rx++) {
		entry = sis_priv->dirty_rx % NUM_RX_DESC;
		if (sis_priv->rx_skbuff[entry] == NULL) {
			...skb = dev_alloc_skb(RX_BUF_SIZE)...
			...
			sis_priv->rx_skbuff[entry] = skb;
			...
		}
	}
Assume cur_rx overflowed during a previous execution of the while() loop, but
dirty_rx has NOT, and we really need to refill buffers.
The comparison sis_priv->cur_rx > sis_priv->dirty_rx will fail and the buffers
won't be refilled.
----------
(3) The same function sis900_rx().
Assume the whole buffer is filled, there is no memory shortage, and the
network card receives packets faster than the kernel processes them in the
while (rx_status & OWN) loop of sis900_rx(): execution will never
leave the loop.
sis900_rx() is called from the interrupt handler, so it's not a good idea to `do
"too much" work here` (a sentence from the sources :) )
----------
Hope you'll check these changes and find them useful. :)
Kernels with the patches compile, but are untested.
This patch is against the mainline 2.4.31 kernel.
--
Best regards,
Konstantin Khorenko,
SWsoft, Inc.
[-- Attachment #2: diff-sis900-2.4.31 --]
[-- Type: text/plain, Size: 2285 bytes --]
--- ./drivers/net/sis900.c.sis900 2004-08-08 03:26:05.000000000 +0400
+++ ./drivers/net/sis900.c 2005-09-16 17:17:27.000000000 +0400
@@ -1613,6 +1613,14 @@ static int sis900_rx(struct net_device *
long ioaddr = net_dev->base_addr;
unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC;
u32 rx_status = sis_priv->rx_ring[entry].cmdsts;
+ /*
+ * If cur > dirty, then limit = NUM_RX_DESC - cur + dirty =
+ * NUM_RX_DESC + (dirty - cur)
+ * If cur < dirty (cur overflowed, dirty - not), then
+ * limit = dirty - cur
+ */
+ int rx_work_limit =
+ (sis_priv->dirty_rx - sis_priv->cur_rx) % NUM_RX_DESC;
if (sis900_debug > 3)
printk(KERN_INFO "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d "
@@ -1622,6 +1630,8 @@ static int sis900_rx(struct net_device *
while (rx_status & OWN) {
unsigned int rx_size;
+ if (--rx_work_limit < 0)
+ break;
rx_size = (rx_status & DSIZE) - CRC_SIZE;
if (rx_status & (ABORT|OVERRUN|TOOLONG|RUNT|RXISERR|CRCERR|FAERR)) {
@@ -1688,6 +1698,7 @@ static int sis900_rx(struct net_device *
sis_priv->rx_ring[entry].cmdsts = 0;
sis_priv->rx_ring[entry].bufptr = 0;
sis_priv->stats.rx_dropped++;
+ sis_priv->cur_rx++;
break;
}
skb->dev = net_dev;
@@ -1705,7 +1716,7 @@ static int sis900_rx(struct net_device *
/* refill the Rx buffer, what if the rate of refilling is slower than
consuming ?? */
- for (;sis_priv->cur_rx - sis_priv->dirty_rx > 0; sis_priv->dirty_rx++) {
+ for (; sis_priv->cur_rx != sis_priv->dirty_rx; sis_priv->dirty_rx++) {
struct sk_buff *skb;
entry = sis_priv->dirty_rx % NUM_RX_DESC;
#
# Patch solves following problems:
# 1) Forgotten counter incrementation in sis900_rx() in case
# it doesn't get memory for skb, that leads to whole interface failure.
# Problem is accompanied with messages:
# eth0: Memory squeeze,deferring packet.
# eth0: NULL pointer encountered in Rx ring, skipping
# 2) If counter cur_rx overflows and there'll be temporary memory problems
# buffer can't be recreated later, when memory IS avaliable.
# 3) Limit the work in handler to prevent the endless packets processing if
# new packets are generated faster then handled.
#
# Signed-off-by: Konstantin Khorenko <khorenko@sw.ru>
# Signed-off-by: Vasily Averin <vvs@sw.ru>
* Re: [patch netdrvr sis900] net: come alive after temporary memory shortage
2005-09-26 12:19 Konstantin Khorenko
@ 2005-09-26 13:06 ` Daniele Venzano
0 siblings, 0 replies; 3+ messages in thread
From: Daniele Venzano @ 2005-09-26 13:06 UTC (permalink / raw)
To: Konstantin Khorenko
Cc: Vasily Averin, Stanislav Protassov, linux-net,
Linux Kernel Mailing List
On 26 Sep 2005, at 14:19, Konstantin Khorenko wrote:
> Hope you'll check these changes and find them useful. :)
> Kernels with patches compile but untested.
> This patch is against mainstream 2.6.13.1 kernel.
> --- ./drivers/net/sis900.c.sis900 2005-08-29 03:41:01.000000000 +0400
> +++ ./drivers/net/sis900.c 2005-09-19 14:34:42.000000000 +0400
Please create the diff one directory above the root sources directory
so that it is possible to apply with 'patch -p1'.
> @@ -1696,6 +1696,14 @@ static int sis900_rx(struct net_device *
> long ioaddr = net_dev->base_addr;
> unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC;
> u32 rx_status = sis_priv->rx_ring[entry].cmdsts;
> + /*
> + * If cur > dirty, then limit = NUM_RX_DESC - cur + dirty =
> + * NUM_RX_DESC + (dirty - cur)
> + * If cur < dirty (cur overflowed, dirty - not), then
> + * limit = dirty - cur
> + */
> + int rx_work_limit =
> + (sis_priv->dirty_rx - sis_priv->cur_rx) % NUM_RX_DESC;
Remove this comment, or move it to the description of the function
above the sis900_rx() declaration.
>
> if (netif_msg_rx_status(sis_priv))
> printk(KERN_DEBUG "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d "
> @@ -1705,6 +1713,8 @@ static int sis900_rx(struct net_device *
> while (rx_status & OWN) {
> unsigned int rx_size;
>
> + if (--rx_work_limit < 0)
> + break;
> rx_size = (rx_status & DSIZE) - CRC_SIZE;
>
> if (rx_status & (ABORT|OVERRUN|TOOLONG|RUNT|RXISERR|CRCERR|FAERR)) {
> @@ -1770,6 +1780,7 @@ static int sis900_rx(struct net_device *
> sis_priv->rx_ring[entry].cmdsts = 0;
> sis_priv->rx_ring[entry].bufptr = 0;
> sis_priv->stats.rx_dropped++;
> + sis_priv->cur_rx++;
> break;
> }
> skb->dev = net_dev;
> @@ -1787,7 +1798,7 @@ static int sis900_rx(struct net_device *
>
> /* refill the Rx buffer, what if the rate of refilling is slower
> * than consuming ?? */
> - for (;sis_priv->cur_rx - sis_priv->dirty_rx > 0; sis_priv->dirty_rx++) {
> + for (; sis_priv->cur_rx != sis_priv->dirty_rx; sis_priv->dirty_rx++) {
> struct sk_buff *skb;
>
> entry = sis_priv->dirty_rx % NUM_RX_DESC;
With those corrections, the patch should be resent to me, to Jeff
Garzik and to the netdev mailing list for review and possibly inclusion.
Thanks for the contribution.
--
Daniele Venzano
http://www.brownhat.org