Message-ID: <46A125F4.3030504@nortel.com>
Date: Fri, 20 Jul 2007 15:15:32 -0600
From: "Chris Friesen"
To: linux-kernel@vger.kernel.org
Subject: possible latency issues in seq_read

We've run into an issue (on 2.6.10) where running "lsof" causes lost
packets on our server. Preempt is disabled, and NAPI is enabled.

It appears that for some reason the networking softirq is not being
handled in a timely fashion, which means that the rx ring buffer fills
up and incoming packets are dropped.

It appears that the problem path is:

seq_read
  tcp_seq_next
    established_get_next
      read_lock/read_unlock

The issue appears to be related to the amount of time that this syscall
takes. While we're in the syscall the ksoftirqd thread cannot run, and so
the rx ring buffer is not being drained.

The fact that there are kmalloc(GFP_KERNEL) calls in seq_read() seems to
indicate that sleeping is safe there, so would it be reasonable to call
schedule() periodically (maybe based on elapsed time) to ensure that
system latency is kept under control?

Thanks,

Chris
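To make the "call schedule() periodically, based on elapsed time" idea concrete, here is a minimal userspace sketch of the pattern, not kernel code and not a proposed patch. It walks a number of entries and offers up the CPU with sched_yield() once a time budget has been used. The names walk_with_yield(), count_visit(), now_ns() and the budget parameter are all hypothetical, invented for illustration. In the kernel the analogous call would presumably be cond_resched() or schedule(), made at a point where no spinlock or rwlock is held, i.e. outside the read_lock/read_unlock pair in the path above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (hypothetical helper). */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Example visitor: just counts how many entries were visited. */
static void count_visit(size_t idx, void *ctx)
{
    (void)idx;
    (*(size_t *)ctx)++;
}

/*
 * Visit `count` entries. After each entry, if at least `budget_ns`
 * nanoseconds have passed since the last yield, give up the CPU.
 * Returns the number of times we yielded.
 *
 * Userspace stand-in only: sched_yield() plays the role that
 * cond_resched()/schedule() would play in the kernel. The yield
 * happens after visit() returns, so any lock taken inside visit()
 * has already been released.
 */
static size_t walk_with_yield(size_t count, int64_t budget_ns,
                              void (*visit)(size_t idx, void *ctx),
                              void *ctx)
{
    size_t yields = 0;
    int64_t last = now_ns();

    assert(visit != NULL);

    for (size_t i = 0; i < count; i++) {
        visit(i, ctx);

        if (now_ns() - last >= budget_ns) {
            sched_yield();
            yields++;
            last = now_ns();
        }
    }
    return yields;
}
```

A caller would pick the budget according to how long it can tolerate other work being held off, for example walk_with_yield(n, 1000000, count_visit, &visited) for a budget of roughly 1 ms. That value is arbitrary and would need to be chosen against the rx ring size and packet rate.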