public inbox for linux-kernel@vger.kernel.org
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Eric Van Hensbergen <ericvh@gmail.com>,
	linux-nfs@vger.kernel.org
Subject: [PATCH] forgetting to cancel request in interrupted zero-copy 9P RPC (was Re: [git pull] vfs part 2)
Date: Fri, 3 Jul 2015 16:00:01 +0100
Message-ID: <20150703150000.GS17109@ZenIV.linux.org.uk>
In-Reply-To: <20150703094210.GR17109@ZenIV.linux.org.uk>

On Fri, Jul 03, 2015 at 10:42:10AM +0100, Al Viro wrote:

> AFAICS, we get occasional stray responses from somewhere.  And no, it doesn't
> seem to be related to flushes or to dropping chan->lock in req_done() (this
> run had been with chan->lock taken on the outside of the loop).
> 
> What I really don't understand is WTF is it playing with p9_tag_lookup() -
> it's stashing req->tc via virtqueue_add_sgs() opaque data argument, fetches
> it back in req_done(), then picks ->tag from it and uses p9_tag_lookup() to
> find req.  Why not simply pass req instead?  I had been wrong about that
> p9_tag_lookup() being able to return NULL, but why bother with it at all?


Got it.  What happens is that on the zero-copy path a signal hitting at the
end of p9_virtio_zc_request() is treated as "it hadn't been sent, got
an error, fuck off and mark the tag ready for reuse".  No TFLUSH issued,
etc.  As a result, when the reply finally *does* arrive (we had actually
sent the request), it plays hell on the entire thing - the tag might very
well have been reused by then and an unrelated request sent with the
same tag.  Depending on the timing, results can get rather ugly.
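
To make the failure mode concrete, here is a toy userspace model of the race.
It is only a sketch: none of these names (toy_req, tag_alloc, interrupt_rpc,
deliver_reply, stale_reply_misdelivered) exist in net/9p, the "flush" is
reduced to keeping the tag reserved until the late reply shows up, and there
is no real transport involved.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TAGS 4

/* Hypothetical request states for this toy model, not the kernel's. */
enum req_state { REQ_FREE, REQ_SENT, REQ_FLUSHING, REQ_DONE };

struct toy_req {
	enum req_state state;
	int owner;		/* logical RPC that currently owns the tag */
	int completed_by;	/* RPC whose reply completed it, -1 if none */
};

static struct toy_req table[MAX_TAGS];

static void table_reset(void)
{
	for (int tag = 0; tag < MAX_TAGS; tag++) {
		table[tag].state = REQ_FREE;
		table[tag].owner = -1;
		table[tag].completed_by = -1;
	}
}

/* Hand out the lowest free tag and mark the request as sent. */
static int tag_alloc(int owner)
{
	for (int tag = 0; tag < MAX_TAGS; tag++) {
		if (table[tag].state == REQ_FREE) {
			table[tag].state = REQ_SENT;
			table[tag].owner = owner;
			table[tag].completed_by = -1;
			return tag;
		}
	}
	return -1;
}

/* A signal arrives after the request has already gone out on the wire. */
static void interrupt_rpc(int tag, bool cancel_properly)
{
	assert(tag >= 0 && tag < MAX_TAGS);
	if (cancel_properly)
		table[tag].state = REQ_FLUSHING; /* tag stays reserved */
	else
		table[tag].state = REQ_FREE;	 /* bug: reusable too early */
}

/* The server's reply for RPC 'reply_for' arrives carrying 'tag'. */
static void deliver_reply(int tag, int reply_for)
{
	assert(tag >= 0 && tag < MAX_TAGS);
	if (table[tag].state == REQ_SENT) {
		table[tag].state = REQ_DONE;
		table[tag].completed_by = reply_for;
	} else if (table[tag].state == REQ_FLUSHING) {
		table[tag].state = REQ_FREE;	 /* late reply absorbed */
	}
}

/* True if RPC 2 ends up completed by the reply that was meant for RPC 1. */
bool stale_reply_misdelivered(bool cancel_properly)
{
	table_reset();
	int tag_a = tag_alloc(1);		/* RPC 1 is sent */
	interrupt_rpc(tag_a, cancel_properly);	/* ...and interrupted */
	int tag_b = tag_alloc(2);		/* RPC 2 is sent next */
	deliver_reply(tag_a, 1);		/* RPC 1's reply arrives late */
	return table[tag_b].owner == 2 && table[tag_b].completed_by == 1;
}
```

In this model stale_reply_misdelivered(false) reports the mix-up, since RPC 2
picks up the tag RPC 1 has just dropped, while stale_reply_misdelivered(true)
does not, because the tag stays out of circulation until the late reply has
been seen.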

There are still other bogosities found in this thread, and at the very
least we need to cope with genuinely corrupted responses from the server,
but the patch below fixes the problem with stray responses here and stops
the "what do you mean, you'd written 4K?  I've only sent 30 bytes!"
problems.  10 minutes of trinity running without triggering it, while
without that patch it triggers in 2-3 minutes.

Could you verify that the patch below deals with your setup as well?
If it does, I'm going to put it into tonight's pull request, after I get
some sleep...  Right now I'm about to crawl in the direction of bed - 25
hours of uptime is a bit too much... ;-/

diff --git a/net/9p/client.c b/net/9p/client.c
index 6f4c4c8..8c4941d 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -843,7 +843,8 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 	if (err < 0) {
 		if (err == -EIO)
 			c->status = Disconnected;
-		goto reterr;
+		if (err != -ERESTARTSYS)
+			goto reterr;
 	}
 	if (req->status == REQ_STATUS_ERROR) {
 		p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
@@ -1647,7 +1648,10 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
 		if (*err) {
 			trace_9p_protocol_dump(clnt, req->rc);
 			p9_free_req(clnt, req);
+			break;
 		}
+		if (rsize < count)
+			pr_err("mismatched reply [tag = %d]\n", req->tc->tag);
 
 		p9_debug(P9_DEBUG_9P, "<<< RWRITE count %d\n", count);
 
