From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: net_tx_action race condition?
Date: Wed, 28 Mar 2018 09:32:18 -0700
Message-ID: <7f70cdb4-4205-169a-0204-fd5cd72b44f1@gmail.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Cc: "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org", Sarvendra Vikram Singh, Kunal Sharma
To: Saurabh Kr, Angelo Rizzi
Return-path:
In-Reply-To:
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

On 03/28/2018 12:30 AM, Saurabh Kr wrote:
> Hi Eric/Angelo,
>
> We are seeing the assertion error in linux kernel 2.4.29:
> "kernel: KERNEL: assertion (atomic_read(&skb->users) == 0) failed at dev.c(1397)".
> Based on the patch provided (https://patchwork.kernel.org/patch/5368051/)
> we merged the changes into linux kernel 2.4.29, but we are still facing
> the assertion error at dev.c(1397). Please let me know your thoughts.
>
> Before Merge (linux 2.4.29)
> ---------------------------------
>
> static void net_tx_action(struct softirq_action *h)
> {
>         int cpu = smp_processor_id();
>
>         if (softnet_data[cpu].completion_queue) {
>                 struct sk_buff *clist;
>
>                 local_irq_disable();
>                 clist = softnet_data[cpu].completion_queue; // Existing code
>                 softnet_data[cpu].completion_queue = NULL;
>                 local_irq_enable();
>
>                 while (clist != NULL) {
>                         struct sk_buff *skb = clist;
>                         clist = clist->next;
>
>                         BUG_TRAP(atomic_read(&skb->users) == 0);
>                         __kfree_skb(skb);
>                 }
>         }
>
>          ---------
>
> After Merge the changes based on available patch (linux 2.4.29):
> ------------------------------------------------------------------------------
>
> static void net_tx_action(struct softirq_action *h)
> {
>         int cpu = smp_processor_id();
>
>         if (softnet_data[cpu].completion_queue) {
>                 struct sk_buff *clist;
>
>                 local_irq_disable();
>                 clist = *(volatile typeof(softnet_data[cpu].completion_queue) *)&(softnet_data[cpu].completion_queue);  // Modified line based on available patch
>                 softnet_data[cpu].completion_queue = NULL;
>                 local_irq_enable();
>
>                 while (clist != NULL) {
>                         struct sk_buff *skb = clist;
>                         clist = clist->next;
>
>                         BUG_TRAP(atomic_read(&skb->users) == 0);
>                         __kfree_skb(skb);
>                 }
>         }
>   ………….
>
> Thanks & regards,
> Saurabh
>

That simply proves (again) that this 'fix' was not the proper one.

I have no idea what is wrong, and there is no way I am going to look at a 2.4.29 kernel...
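
As a rough illustration of why a volatile read by itself would not be expected to silence this assertion, here is a minimal userspace sketch of the detach-and-drain pattern from the quoted net_tx_action(). Everything in it is hypothetical: struct buf stands in for struct sk_buff, the global completion_queue for softnet_data[cpu].completion_queue, and queue_completed() and drain_completion_queue() are made-up names. There is no interrupt masking here, so it says nothing about where a stray reference would actually come from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the fields used below. */
struct buf {
    struct buf *next;
    int users;              /* reference count, expected to be 0 when queued */
};

/* Hypothetical stand-in for softnet_data[cpu].completion_queue. */
static struct buf *completion_queue;

/* Producer side: push a completed buffer onto the head of the list. */
static void queue_completed(struct buf *b)
{
    b->next = completion_queue;
    completion_queue = b;
}

/*
 * Consumer side: detach the whole list, then walk the private copy.
 * Stores the number of buffers visited in *drained and returns how many
 * of them still had a nonzero reference count (the condition BUG_TRAP
 * reports in the quoted code).
 */
static int drain_completion_queue(int *drained)
{
    struct buf *clist;
    int still_referenced = 0;

    assert(drained != NULL);

    /* Volatile-qualified load, same idea as the cast in the quoted patch.
     * The kernel code does this with local interrupts disabled; this
     * sketch has no equivalent of that. */
    clist = *(struct buf *volatile *)&completion_queue;
    completion_queue = NULL;
    *drained = 0;

    while (clist != NULL) {
        struct buf *b = clist;
        clist = clist->next;

        if (b->users != 0)
            still_referenced++;
        (*drained)++;
    }
    return still_referenced;
}
```

Queuing one buffer with users == 0 and one with users == 1 and then draining visits both and counts one as still referenced. The cast changes how the list head is loaded, not what the reference counts are.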