From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jeff Garzik
Subject: Re: [PATCH] sis900: Allocate rx replacement buffer before rx operation
Date: Fri, 27 Apr 2007 20:17:00 -0400
Message-ID: <4632927C.7010801@garzik.org>
References: <20070420135458.GB16191@hmsreliant.homelinux.net> <462E33A8.5060107@garzik.org> <20070426174736.GC6000@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, davem@davemloft.net, venza@brownhat.org
To: Neil Horman
Return-path:
Received: from srv5.dvmed.net ([207.36.208.214]:48239 "EHLO mail.dvmed.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754621AbXD1ARD (ORCPT ); Fri, 27 Apr 2007 20:17:03 -0400
In-Reply-To: <20070426174736.GC6000@localhost.localdomain>
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

Neil Horman wrote:
> On Tue, Apr 24, 2007 at 12:43:20PM -0400, Jeff Garzik wrote:
>> Neil Horman wrote:
>>> Hey there-
>>> The sis900 driver appears to have a bug in which the receive routine
>>> passes the skbuff holding the received frame to the network stack before
>>> refilling the buffer in the rx ring. If a new skbuff cannot be allocated,
>>> the driver simply leaves a hole in the rx ring, which causes the driver
>>> to stop receiving frames and become non-recoverable without an
>>> rmmod/insmod, according to reporters. This patch reverses that order,
>>> attempting to allocate a replacement buffer first, and receiving the new
>>> frame only if one can be allocated. If no skbuff can be allocated, the
>>> current skbuff in the rx ring is recycled, dropping the current frame,
>>> but keeping the NIC operational.
>>>
>>> Thanks & Regards
>>> Neil
>
> Just found a hole in my last patch. Shortly after we integrated this patch,
> it was reported to me that an oops took place inside of netif_rx when using
> the sis900 driver.
> Looking at my original patch, I noted that there was a spot between the
> new skb_alloc and the refill_rx_ring label where skb got reassigned to the
> pointer currently held in the rx_ring for the purposes of receiving the
> frame. The result, however, is that the buffer that gets passed to
> netif_rx (if it is called) then gets placed right back into the rx_ring.
> So if you receive frames fast enough, the skb being processed by the
> network stack can get corrupted. The reporter is testing out the fix I've
> written for this below (I'm not near my hardware at the moment to test
> myself), but I wanted to post it for review ASAP. I'll post test results
> when I hear them, but I think this is a pretty straightforward fix. It
> just uses a separate pointer to do the rx operation, so that we don't
> improperly reassign the pointer that we use to refill the rx ring.
>
> Thanks & Regards
> Neil
>
> Signed-off-by: Neil Horman
>
>
>  sis900.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
>
> diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
> index a6a0f09..7e44939 100644
> --- a/drivers/net/sis900.c
> +++ b/drivers/net/sis900.c
> @@ -1754,6 +1754,7 @@ static int sis900_rx(struct net_device *net_dev)
>  			sis_priv->rx_ring[entry].cmdsts = RX_BUF_SIZE;
>  		} else {
>  			struct sk_buff * skb;
> +			struct sk_buff * rx_skb;
>
>  			pci_unmap_single(sis_priv->pci_dev,
>  				sis_priv->rx_ring[entry].bufptr, RX_BUF_SIZE,
> @@ -1787,10 +1788,10 @@ static int sis900_rx(struct net_device *net_dev)
>  			}
>
>  			/* give the socket buffer to upper layers */
> -			skb = sis_priv->rx_skbuff[entry];
> -			skb_put(skb, rx_size);
> -			skb->protocol = eth_type_trans(skb, net_dev);
> -			netif_rx(skb);
> +			rx_skb = sis_priv->rx_skbuff[entry];
> +			skb_put(rx_skb, rx_size);
> +			skb->protocol = eth_type_trans(rx_skb, net_dev);

applied this, and the one-line fix to this
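
For readers following the thread, the two-pointer idea described in the quoted text can be sketched in user space roughly as below. This is only an illustration under stated assumptions, not the sis900 code: struct buf, struct rx_ring, alloc_buf(), deliver() and rx_one() are hypothetical stand-ins for struct sk_buff, the driver's rx_skbuff[] array, the skb allocation, netif_rx() and the body of sis900_rx() respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_RX_DESC 4

/* Hypothetical stand-in for struct sk_buff; not the kernel type. */
struct buf {
	int owned_by_stack;	/* set once the buffer is handed upstream */
};

/* Hypothetical stand-in for the driver's per-descriptor buffer array. */
struct rx_ring {
	struct buf *slot[NUM_RX_DESC];
};

/* Stand-in for netif_rx(): after this call the stack owns the buffer. */
static void deliver(struct buf *b)
{
	b->owned_by_stack = 1;
}

/* Stand-in allocator; 'fail' forces an allocation failure for testing. */
static struct buf *alloc_buf(int fail)
{
	return fail ? NULL : calloc(1, sizeof(struct buf));
}

/*
 * Receive one frame from ring slot 'entry'.
 * Returns the buffer handed upstream, or NULL if the frame was dropped.
 */
static struct buf *rx_one(struct rx_ring *ring, size_t entry, int alloc_fails)
{
	struct buf *skb;	/* replacement that refills the ring */
	struct buf *rx_skb;	/* buffer holding the received frame */

	/* Allocate the replacement before touching the received frame. */
	skb = alloc_buf(alloc_fails);
	if (skb == NULL) {
		/* No replacement: keep the old buffer in the ring, drop the frame. */
		return NULL;
	}

	/* Use a separate pointer for the rx operation ... */
	rx_skb = ring->slot[entry];
	deliver(rx_skb);

	/* ... so the refill stores the new buffer, never the delivered one. */
	ring->slot[entry] = skb;
	return rx_skb;
}
```

If rx_skb were folded back into skb before the refill, the slot would end up pointing at a buffer the stack already owns, which is the corruption described in the quoted report. With the allocation forced to fail, the slot keeps its old buffer and the frame is dropped, so the ring never has a hole.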