All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Yan, Zheng" <zheng.z.yan@intel.com>
To: Sage Weil <sage@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: [PATCH] mds: sort dentries when committing dir fragment
Date: Mon, 26 Nov 2012 16:54:32 +0800	[thread overview]
Message-ID: <50B32E48.50403@intel.com> (raw)
In-Reply-To: <50B2D196.8010509@intel.com>

On 11/26/2012 10:19 AM, Yan, Zheng wrote:
> On 11/26/2012 06:32 AM, Sage Weil wrote:
>> I pushed an alternative approach to wip-tmap.
>>
>> This sorting is an artifact of tmap's crummy implementation, and the mds 
>> workaround will need to get reverted when we switch to omap.  Instead, fix 
>> tmap so that it will tolerate unsorted keys.  (Also, drop the ENOENT on rm 
>> on missing key.)
>>
>> Eventually we can deprecate and remove tmap entirely...
>>
>> What do you think?
> 
> This approach is cleaner than mine. But I think your fix isn't enough because
> MDS may provide tmap contains misordered items to the TMAPPUT method. Misordered
> items will confuse future TMAPUP. This fix is either sorting items when handling
> TMAPPUT or searching forward for any potential misordered items when TMAP_SET
> wants to add a new item or TMAP_RM fails to find an item.
>

how about the patch attached below

-----
From e3c4fb68dc6c7592b7f53ab7a98b561167b567df Mon Sep 17 00:00:00 2001
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Date: Mon, 26 Nov 2012 12:28:30 +0800
Subject: [PATCH] osd: check misordered items in TMAP

There is a bug in the MDS that causes misordered items in TMAPs
that store dir fragments. Misordered items confuse TMAP updates.

Fix this by adding code to do_tmapup() to check if there are
misordered items that may affect the correctness of TMAP update.
Fall back to use do_tmapup_slow if misordered items are found.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
 src/osd/ReplicatedPG.cc | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/src/osd/ReplicatedPG.cc b/src/osd/ReplicatedPG.cc
index 48cba10..71f5363 100644
--- a/src/osd/ReplicatedPG.cc
+++ b/src/osd/ReplicatedPG.cc
@@ -1803,8 +1803,17 @@ int ReplicatedPG::do_tmapup(OpContext *ctx, bufferlist::iterator& bp, OSDOp& osd
 	  nkeys--;
 	}
 	if (!ip.end()) {
+	  string last_key = nextkey;
+
 	  ::decode(nextkey, ip);
 	  ::decode(nextval, ip);
+
+	  if (nextkey <= last_key) {
+	    dout(0) << "tmapup warning: key '" << key << "' < previous key '" << last_key
+		    << "', falling back to an inefficient (unsorted) update" << dendl;
+	    bp = orig_bp;
+	    return do_tmapup_slow(ctx, bp, osd_op, newop.outdata);
+	  }
 	} else {
 	  have_next = false;
 	}
@@ -1848,6 +1857,35 @@ int ReplicatedPG::do_tmapup(OpContext *ctx, bufferlist::iterator& bp, OSDOp& osd
       ::encode(nextkey, newkeydata);
       ::encode(nextval, newkeydata);
       dout(20) << "  keep " << nextkey << " " << nextval.length() << dendl;
+
+      /*
+       * TMAPs for storing dir fragments may contain misordered items.
+       * Only items corresponding to dentries that have the same name
+       * prefix can be out of order.
+       */
+      size_t lastlen = nextkey.find_last_of('_');
+      if (lastlen > 0 && lastlen != string::npos) {
+	string last_key = nextkey;
+	bufferlist::iterator tmp_ip = ip;
+	while (!tmp_ip.end()) {
+	  ::decode(nextkey, tmp_ip);
+	  ::decode(nextval, tmp_ip);
+
+	  if (nextkey <= last_key) {
+	    dout(0) << "tmapup warning: key '" << nextkey << "' < previous key '" << last_key
+		    << "', falling back to an inefficient (unsorted) update" << dendl;
+	    bp = orig_bp;
+	    return do_tmapup_slow(ctx, bp, osd_op, newop.outdata);
+	  }
+
+	  size_t len = nextkey.find_last_of('_');
+	  if (len == 0 || len == string::npos)
+	    break;
+	  len = min(len, lastlen);
+	  if (last_key.compare(0, len, nextkey, 0, len) < 0)
+	    break;
+	}
+      }
     }
     if (!ip.end()) {
       bufferlist rest;
-- 
1.7.11.7


  reply	other threads:[~2012-11-26  8:54 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-25 12:17 [PATCH] mds: sort dentries when committing dir fragment Yan, Zheng
2012-11-25 22:32 ` Sage Weil
2012-11-26  2:19   ` Yan, Zheng
2012-11-26  8:54     ` Yan, Zheng [this message]
2012-11-26 19:06       ` Sage Weil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50B32E48.50403@intel.com \
    --to=zheng.z.yan@intel.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=sage@inktank.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.