From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932239AbXG0MxS (ORCPT ); Fri, 27 Jul 2007 08:53:18 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1755502AbXG0MxK (ORCPT ); Fri, 27 Jul 2007 08:53:10 -0400 Received: from srv5.dvmed.net ([207.36.208.214]:57240 "EHLO mail.dvmed.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755507AbXG0MxJ (ORCPT ); Fri, 27 Jul 2007 08:53:09 -0400 Message-ID: <46A9EAB0.3090306@garzik.org> Date: Fri, 27 Jul 2007 08:53:04 -0400 From: Jeff Garzik User-Agent: Thunderbird 1.5.0.12 (X11/20070719) MIME-Version: 1.0 To: Trond Myklebust , Linux Kernel Mailing List CC: Andrew Morton , Michal Piotrowski Subject: NFSv4 poops itself Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -4.3 (----) X-Spam-Report: SpamAssassin version 3.1.9 on srv5.dvmed.net summary: Content analysis details: (-4.3 points, 5.0 required) Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Background: Server: x86-64 dual core Intel, kernel 2.6.23-rc1 (my home fileserver) Exporting NFS/NFSv4 mounts. Client count: 1 Uptime: 4 days Client: x86-64 dual core Intel, kernel 2.6.23-rc1 (my main workstation) NFS mount setup: pretzel:/ on /g type nfs4 (rw,noatime,proto=tcp,addr=10.10.10.1) Uptime: 4 days Home directory mounted via NFSv4. Problem: My workstation has been happily talking to my file server for several days without incident. An hour ago, my numeric keypad stopping working (unrelated problem... USB or X bug?). The solution to the keypad problem is usually to log out of X and log back in, or worse case, reboot. So, I log out, and log back in. At first, a few shell windows open and successfully initialize themselves (read bash profile over NFS, etc.) Then, as more shell windows open, things start hanging. I can easily switch to console and ssh to the fileserver, so it is clear this is an NFS hang. No adverse messages at all on the client. On the server, I see NFSv4 spamming dmesg with hundreds of thousands of messages: Jul 27 08:20:53 pretzel kernel: NFSD: preprocess_seqid_op: old stateid! Jul 27 08:21:24 pretzel last message repeated 167966 times Jul 27 08:21:55 pretzel last message repeated 173628 times Jul 27 08:21:55 pretzel kernel: NFSD: preprocess_seqid_op: old stateid! Jul 27 08:22:26 pretzel last message repeated 171286 times Jul 27 08:23:27 pretzel last message repeated 344461 times Jul 27 08:23:30 pretzel last message repeated 18656 times I rebooted the client, the problem disappeared, and everything is happy again... but clearly NFSv4 shat itself. And now I am worried this will happen again. In all my quite-heavy use of NFSv4 I've never seen this behavior before, so I would call this a regression. I always run vanilla linux-2.6.git self-built kernels on both client and server. Jeff