From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jeff Layton <jlayton@redhat.com>
Subject: [PATCH] have cifs_reconnect handle signals appropriately
Date: Wed, 30 May 2007 17:46:51 -0400
Message-ID: <20070530174651.2af67a97.jlayton@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
To: linux-fsdevel@vger.kernel.org
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from mx1.redhat.com ([66.187.233.31]:35007 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1760258AbXE3Vqx (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Wed, 30 May 2007 17:46:53 -0400
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254])
	by mx1.redhat.com (8.13.1/8.13.1) with ESMTP id l4ULkqjj019924
	for <linux-fsdevel@vger.kernel.org>; Wed, 30 May 2007 17:46:52 -0400
Received: from pobox.corp.redhat.com (pobox.corp.redhat.com [10.11.255.20])
	by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l4ULkqgW013917
	for <linux-fsdevel@vger.kernel.org>; Wed, 30 May 2007 17:46:52 -0400
Received: from tleilax.poochiereds.net (vpn-14-61.rdu.redhat.com [10.11.14.61])
	by pobox.corp.redhat.com (8.13.1/8.13.1) with SMTP id l4ULkpXb022835
	for <linux-fsdevel@vger.kernel.org>; Wed, 30 May 2007 17:46:51 -0400
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

This patch is the result of a fairly long, drawn-out case. The problem
goes something like this:

1) mount a samba share using CIFS
2) start some continuous I/O on the mount (a loop that creates a
tarball on the mount and removes it seems to work)
3) shut down the samba server
4) suspend the process doing I/O (via ^z)
5) kill -9 pid_of_cifsd_kthread (I have no idea why they're doing this,
but bear with me)
6) umount -l the mount
7) start up samba again

After this, you cannot remount the samba share. Mount attempts all
return either -ENOTDIR or -EAGAIN. The only fix seems to be to reboot
the box.

While the steps for this reproducer are pathological, I think they
expose a problem with how cifsd handles signals. If we're in
cifs_reconnect and cifsd is signalled, then the connect calls will all
start returning -ERESTARTSYS and we'll never exit from the while loop.
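
To illustrate the loop problem, here is a minimal user-space sketch. It
is not the kernel code: fake_connect() and reconnect_attempts() are
hypothetical names, ERESTARTSYS is defined by hand because the kernel
does not export it to user space, and max_tries exists only so the demo
terminates. It assumes that once a signal is pending, every connect
attempt fails with -ERESTARTSYS.

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel-internal errno value; defined here only for the simulation. */
#define ERESTARTSYS 512

/* Hypothetical stand-in for the connect call made during reconnect. */
static int fake_connect(bool signal_pending)
{
	return signal_pending ? -ERESTARTSYS : 0;
}

/*
 * Retry loop modelled loosely on the one in cifs_reconnect.
 * Returns the number of connect attempts made. The max_tries cap is
 * an assumption of this sketch; the loop described above has no cap.
 */
static int reconnect_attempts(bool signal_pending, bool bail_on_signal,
			      int max_tries)
{
	int tries = 0;

	assert(max_tries > 0);

	while (tries < max_tries) {
		int rc = fake_connect(signal_pending);

		tries++;
		if (rc == 0)
			break;		/* reconnected */
		if (rc == -ERESTARTSYS && bail_on_signal)
			break;		/* stop retrying once signalled */
		/* the real loop sleeps for 3 seconds here before retrying */
	}
	return tries;
}
```

With a signal pending and no check, reconnect_attempts(true, false, n)
uses up all n tries, which corresponds to never leaving the loop. With
the check enabled it gives up after the first attempt. How the real
code should react at that point (stop the thread, retry later, etc.) is
a separate question that this sketch does not answer.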

I *think* the following patch (or something like it) might be
appropriate. I've tested a similar patch on Steve's backported 1.48a
CIFS code and it seems to fix the problem there, but that code doesn't
have the kthread changes. Does this look reasonable, or am I missing
something important? :-)
-- 
Jeff Layton <jlayton@redhat.com>

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index f4e9266..d369dd0 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -197,6 +197,11 @@ cifs_reconnect(struct TCP_Server_Info *server)
 					server->server_RFC1001_name);
 		}
 		if(rc) {
+			if (rc == -ERESTARTSYS) {
+				cFYI(1,("reconnect interrupted by signal"));
+				kthread_stop(server->tsk);
+				continue;
+			}
 			cFYI(1,("reconnect error %d",rc));
 			msleep(3000);
 		} else {