From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matan Barak
Subject: Re: [PATCH V1 net-next 03/10] net/mlx4_core: Use tasklet for user-space CQ completion events
Date: Wed, 10 Dec 2014 17:47:31 +0200
Message-ID: <54886B13.5040409@mellanox.com>
References: <1418216999-17012-1-git-send-email-ogerlitz@mellanox.com>
 <1418216999-17012-4-git-send-email-ogerlitz@mellanox.com>
 <1418225599.27198.18.camel@edumazet-glaptop2.roam.corp.google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "David S. Miller" , , "Amir Vadai" , Tal Alon , Jack Morgenstein
To: Eric Dumazet , Or Gerlitz
Return-path:
Received: from mail-db3on0089.outbound.protection.outlook.com ([157.55.234.89]:42346
 "EHLO emea01-db3-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL)
 by vger.kernel.org with ESMTP id S1754093AbaLJQXc (ORCPT );
 Wed, 10 Dec 2014 11:23:32 -0500
In-Reply-To: <1418225599.27198.18.camel@edumazet-glaptop2.roam.corp.google.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On 12/10/2014 5:33 PM, Eric Dumazet wrote:
> On Wed, 2014-12-10 at 15:09 +0200, Or Gerlitz wrote:
>> From: Matan Barak
>>
>> Previously, we've fired all our completion callbacks straight from our ISR.
>>
>> Some of those callbacks were lightweight (for example, mlx4_en's and
>> IPoIB napi callbacks), but some of them did more work (for example,
>> the user-space RDMA stack uverbs' completion handler). Besides that,
>> doing more than the minimal work in ISR is generally considered wrong,
>> it could even lead to a hard lockup of the system. Since when a lot
>> of completion events are generated by the hardware, the loop over those
>> events could be so long, that we'll get into a hard lockup by the system
>> watchdog.
>
> ...
>
>> +#define TASKLET_THRESHOLD 1000
>> +
>> +void mlx4_cq_tasklet_cb(unsigned long data)
>> +{
>> +	unsigned long flags;
>> +	unsigned int i = 0;
>> +	struct mlx4_eq_tasklet *ctx = (struct mlx4_eq_tasklet *)data;
>> +	struct mlx4_cq *mcq, *temp;
>> +
>> +	spin_lock_irqsave(&ctx->lock, flags);
>> +	list_splice_tail_init(&ctx->list, &ctx->process_list);
>> +	spin_unlock_irqrestore(&ctx->lock, flags);
>> +
>> +	list_for_each_entry_safe(mcq, temp, &ctx->process_list, tasklet_ctx.list) {
>> +		list_del_init(&mcq->tasklet_ctx.list);
>> +		mcq->tasklet_ctx.comp(mcq);
>> +		if (atomic_dec_and_test(&mcq->refcount))
>> +			complete(&mcq->free);
>> +		if (++i == TASKLET_THRESHOLD)
>> +			break;
>> +	}
>> +
>> +	if (i == TASKLET_THRESHOLD)
>> +		tasklet_schedule(&ctx->task);
>> +}
>> +
>
> What is the max duration of doing this loop up to 1000 times ?
>
> I suspect it might be too long, but not necessarily detected by
> conventional watchdog.
>
> __do_softirq() uses both a counter and a test against jiffies, with a 2
> ms limit.

You're right - we'll measure it accurately, but I think it took over 2ms
(on a system with 400 CQs opened), including the spin_lock on the
list_splice.
We'll add the jiffies test to V2.

Thanks.

>
> Thanks.
>
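
For illustration only, here is a rough user-space sketch of the "counter plus
time budget" idea discussed above. It is not the V2 patch and not kernel code:
struct work_queue, drain_bounded(), tick_after(), MAX_ITEMS and MAX_TICKS are
all hypothetical names, and a simulated tick counter stands in for jiffies.
The loop stops when the queue is empty, when the item limit is reached, or
when the time budget is spent, and it asks for a reschedule only if work is
actually left.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
struct work_queue {
	size_t pending;	/* items still waiting to be processed */
	uint64_t now;	/* simulated clock in ticks (stands in for jiffies) */
	uint64_t cost;	/* ticks consumed by handling one item */
};

#define MAX_ITEMS 1000	/* counter limit, in the spirit of TASKLET_THRESHOLD */
#define MAX_TICKS 2	/* assumed time budget, in simulated ticks */

/* Wrap-safe "a is later than b", the same idea as the kernel's time_after(). */
static int tick_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Process items until the queue is empty, the item limit is hit, or the
 * time budget is exceeded. Stores the number of items handled in *done.
 * Returns 1 if work remains (the caller should reschedule), 0 otherwise.
 */
static int drain_bounded(struct work_queue *q, size_t *done)
{
	uint64_t end;
	size_t i = 0;

	assert(q != NULL && done != NULL);
	end = q->now + MAX_TICKS;

	while (q->pending > 0) {
		q->pending--;
		q->now += q->cost;	/* simulate the completion handler running */
		i++;
		if (i == MAX_ITEMS || tick_after(q->now, end))
			break;
	}

	*done = i;
	return q->pending > 0;
}
```

One difference from the quoted V1 loop is that this sketch decides whether to
reschedule from the remaining work rather than from the counter value, so a
queue that holds exactly MAX_ITEMS entries does not trigger an extra pass.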