From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758736AbYCZC1b (ORCPT ); Tue, 25 Mar 2008 22:27:31 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754018AbYCZC1X (ORCPT ); Tue, 25 Mar 2008 22:27:23 -0400 Received: from mail.fieldses.org ([66.93.2.214]:36657 "EHLO fieldses.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752965AbYCZC1X (ORCPT ); Tue, 25 Mar 2008 22:27:23 -0400 Date: Tue, 25 Mar 2008 22:27:19 -0400 To: Linus Torvalds Cc: linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org, Neil Brown , Tom Tucker , Roland Dreier Subject: [PATCH] SVCRDMA: Check num_sge when setting LAST_CTXT bit Message-ID: <20080326022718.GA25657@fieldses.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.17+20080114 (2008-01-14) From: "J. Bruce Fields" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Tom Tucker The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly when the last chunk in the read-list spanned multiple pages. This resulted in a kernel panic when the wrong context was used to build the RPC iovec page list. RDMA_READ is used to fetch RPC data from the client for NFS_WRITE requests. A scatter-gather is used to map the advertised client side buffer to the server-side iovec and associated page list. WR contexts are used to convey which scatter-gather entries are handled by each WR. When the write data is large, a single RPC may require multiple RDMA_READ requests so the contexts for a single RPC are chained together in a linked list. The last context in this list is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes, the CQ handler code can enqueue the RPC for processing. The code in rdma_read_xdr was setting this bit on the last two contexts on this list when the last read-list chunk spanned multiple pages. This caused the svc_rdma_recvfrom logic to incorrectly build the RPC and caused the kernel to crash because the second-to-last context doesn't contain the iovec page list. Modified the condition that sets this bit so that it correctly detects the last context for the RPC. Signed-off-by: Tom Tucker Tested-by: Roland Dreier Signed-off-by: J. Bruce Fields --- net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 21 +++++++++++---------- 1 files changed, 11 insertions(+), 10 deletions(-) One more server-side rdma patch for 2.6.25. --b. diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c index 9712716..c22d6b6 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c @@ -322,15 +322,6 @@ next_sge: ctxt->direction = DMA_FROM_DEVICE; clear_bit(RDMACTXT_F_READ_DONE, &ctxt->flags); clear_bit(RDMACTXT_F_LAST_CTXT, &ctxt->flags); - if ((ch+1)->rc_discrim == 0) { - /* - * Checked in sq_cq_reap to see if we need to - * be enqueued - */ - set_bit(RDMACTXT_F_LAST_CTXT, &ctxt->flags); - ctxt->next = hdr_ctxt; - hdr_ctxt->next = head; - } /* Prepare READ WR */ memset(&read_wr, 0, sizeof read_wr); @@ -348,7 +339,17 @@ next_sge: rdma_set_ctxt_sge(ctxt, &sge[ch_sge_ary[ch_no].start], &sgl_offset, read_wr.num_sge); - + if (((ch+1)->rc_discrim == 0) && + (read_wr.num_sge == ch_sge_ary[ch_no].count)) { + /* + * Mark the last RDMA_READ with a bit to + * indicate all RPC data has been fetched from + * the client and the RPC needs to be enqueued. + */ + set_bit(RDMACTXT_F_LAST_CTXT, &ctxt->flags); + ctxt->next = hdr_ctxt; + hdr_ctxt->next = head; + } /* Post the read */ err = svc_rdma_send(xprt, &read_wr); if (err) { -- 1.5.5.rc1