linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Boaz Harrosh <bharrosh@panasas.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>,
	NFS list <linux-nfs@vger.kernel.org>,
	open-osd <osd-dev@open-osd.org>
Cc: Benny Halevy <bhalevy@tonian.com>, Peng Tao <bergwolf@gmail.com>
Subject: [RFC 0/5] CRASHFIX: pnfs-obj: NONE-RPC LD must not call rpc_restart_call_prepare()
Date: Fri, 8 Jun 2012 13:06:32 +0300	[thread overview]
Message-ID: <4FD1CEA8.3090105@panasas.com> (raw)


NONE-RPC layout-Drivers call nfs_writeback_done() as part
of their completion of IO. (through pnfs_ld_write_done())

Inside nfs_writeback_done() there is code that does:

	else if (resp->count < argp->count) {
		...

		/* This a short write! */
		nfs_inc_stats(inode, NFSIOS_SHORTWRITE);

		... /* Prepare the remainder */

		rpc_restart_call_prepare(task);
	}

But for none-rpc LDs (objects, blocks) there is no task->tk_ops
and this code will crash.

The same code exists in read.c::nfs_readpage_result_common() with
same crash.

Therefor: NONE-RPC LDs *must not* call pnfs_ld_read/write_done with
partial/short IO. Either complete everything or fail everything and
IO will be resubmitted through MDS.

Below are a set of patches that fix this problem for objects
layout, blocks maintainer should check that code does not allow a short
IO as well.

The Fix is much easier at the LD level then at NFS-Core. Though it
could be fixed there, I do not think it is worth it. I will submit a
comment that warns for this. Certainly NFS-Core changes could not
be so simple as below to be candidates for @Stable

These where lightly tested and are submitted for review. I will conduct
heavy testing and will put them in linux-next only next week.

Here are the list of patches:

[PATCH 1/6] ore: Fix NFS crash by supporting any unaligned RAID IO
[PATCH 2/6] ore: Remove support of partial IO request (NFS crash)
[PATCH 3/6] pnfs-obj: Fix __r4w_get_page in the case of offset beyond i_size

	These 3 completely plug the crash above, and are destined
	for Stable@

	Trond please ACK the 3rd patch since I want to push this
	through my tree, as these are for ore mostly.
	(Please also review all of them, watch my back)

[PATCH 4/6] pnfs-obj: don't leak objio_state if ore_write/read fails.

	A memory leak found on the way, in the error case.
	Should it be for Stable@ ?

[PATCH 4/5] NFS41: add pg_layout_private to nfs_pageio_descriptor

	This is the patch by Peng Tao <bergwolf@gmail.com> which is
	needed for below fix. Minus the has_moreio flag.

[PATCH 5/5] pnfs-obj: Better IO pattern in case of unaligned offset

	Though this patch is an optimization it fixes a very bad
	IO pattern. Please consider for the 3.5-rcX Kernel and
	don't postpone to 3.6 Merge window. If possible

Thanks for any review
Boaz

             reply	other threads:[~2012-06-08 10:06 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-08 10:06 Boaz Harrosh [this message]
2012-06-08 12:06 ` [PATCH 1/6] ore: Fix NFS crash by supporting any unaligned RAID IO Boaz Harrosh
2012-06-08 12:09 ` [PATCH 2/6] ore: Remove support of partial IO request (NFS crash) Boaz Harrosh
2012-06-08 12:13 ` [PATCH 3/6] pnfs-obj: Fix __r4w_get_page when offset is beyond i_size Boaz Harrosh
2012-06-13 13:53   ` Peng Tao
2012-06-08 12:14 ` [PATCH 4/6] pnfs-obj: don't leak objio_state if ore_write/read fails Boaz Harrosh
2012-06-08 12:15 ` [PATCH 5/6] NFS41: add pg_layout_private to nfs_pageio_descriptor Boaz Harrosh
2012-06-08 12:21 ` [PATCH 6/6] pnfs-obj: Better IO pattern in case of unaligned offset Boaz Harrosh
2012-06-13 13:57 ` [RFC 0/5] CRASHFIX: pnfs-obj: NONE-RPC LD must not call rpc_restart_call_prepare() Peng Tao
2012-07-12 14:11 ` Boaz Harrosh
2012-07-18 23:07   ` Myklebust, Trond
2012-07-19  9:48     ` Boaz Harrosh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FD1CEA8.3090105@panasas.com \
    --to=bharrosh@panasas.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=bergwolf@gmail.com \
    --cc=bhalevy@tonian.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=osd-dev@open-osd.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).