* [PATCH 02/12] mds: Consider stopping MDS when finding peer inode
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 03/12] mds: Add finish callback to waiting_for_base_ino wait queue Yan, Zheng
` (9 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
To migrate strays, the receiving MDS needs to find the stopping MDS' mdsdir.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/MDCache.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
index efa6671..1a93d9b 100644
--- a/src/mds/MDCache.cc
+++ b/src/mds/MDCache.cc
@@ -7225,8 +7225,9 @@ void MDCache::find_ino_peers(inodeno_t ino, Context *c, int hint)
void MDCache::_do_find_ino_peer(find_ino_peer_info_t& fip)
{
set<int> all, active;
- mds->mdsmap->get_active_mds_set(active);
mds->mdsmap->get_mds_set(all);
+ mds->mdsmap->get_active_mds_set(active);
+ mds->mdsmap->get_mds_set(active, MDSMap::STATE_STOPPING);
dout(10) << "_do_find_ino_peer " << fip.tid << " " << fip.ino
<< " active " << active << " all " << all
--
1.7.11.4
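The effect of the change above is that ranks in the stopping state join the set of peers that get asked about an inode. A minimal standalone sketch of that idea follows; `RankState` and `peer_candidates` are hypothetical names for illustration and are not part of MDSMap.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical stand-in for the MDS rank states; not the real MDSMap values.
enum class RankState { Active, Stopping, Stopped };

// Collect the ranks worth querying for an inode: active ranks plus
// stopping ranks, so a stopping rank's mdsdir can still be located.
std::set<int> peer_candidates(const std::map<int, RankState>& ranks) {
  std::set<int> out;
  for (const auto& kv : ranks) {
    if (kv.second == RankState::Active || kv.second == RankState::Stopping)
      out.insert(kv.first);
  }
  return out;
}
```

A rank that has fully stopped is left out, which mirrors the patch adding only STATE_STOPPING to the active set.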
* [PATCH 03/12] mds: Add finish callback to waiting_for_base_ino wait queue
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
2012-10-02 8:55 ` [PATCH 02/12] mds: Consider stopping MDS when finding peer inode Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 04/12] mds: Allow rename request for stray migration/reintegration Yan, Zheng
` (8 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/MDCache.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
index 1a93d9b..1348b3e 100644
--- a/src/mds/MDCache.cc
+++ b/src/mds/MDCache.cc
@@ -8341,8 +8341,8 @@ void MDCache::discover_base_ino(inodeno_t want_ino,
discover_info_t& d = _create_discover(from);
d.ino = want_ino;
_send_discover(d);
- waiting_for_base_ino[from][want_ino].push_back(onfinish);
}
+ waiting_for_base_ino[from][want_ino].push_back(onfinish);
}
--
1.7.11.4
* [PATCH 04/12] mds: Allow rename request for stray migration/reintegration
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
2012-10-02 8:55 ` [PATCH 02/12] mds: Consider stopping MDS when finding peer inode Yan, Zheng
2012-10-02 8:55 ` [PATCH 03/12] mds: Add finish callback to waiting_for_base_ino wait queue Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 05/12] mds: Fix xlock imports Yan, Zheng
` (7 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Allow a rename request to modify a system directory if it is for stray
migration or reintegration.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/Server.cc | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/src/mds/Server.cc b/src/mds/Server.cc
index 4c52ca2..8584e60 100644
--- a/src/mds/Server.cc
+++ b/src/mds/Server.cc
@@ -1983,7 +1983,7 @@ CDentry* Server::rdlock_path_xlock_dentry(MDRequest *mdr, int n,
}
CInode *diri = dir->get_inode();
- if (diri->is_system() && !diri->is_root()) {
+ if (!mdr->reqid.name.is_mds() && diri->is_system() && !diri->is_root()) {
reply_request(mdr, -EROFS);
return 0;
}
@@ -5126,10 +5126,12 @@ void Server::handle_client_rename(MDRequest *mdr)
pdn = pdn->get_dir()->inode->parent;
}
- // is this a stray reintegration or merge? (sanity checks!)
+ // is this a stray migration, reintegration or merge? (sanity checks!)
if (mdr->reqid.name.is_mds() &&
- (!destdnl->is_remote() ||
- destdnl->get_remote_ino() != srci->ino())) {
+ !(MDS_INO_IS_STRAY(srcpath.get_ino()) &&
+ MDS_INO_IS_STRAY(destpath.get_ino())) &&
+ !(destdnl->is_remote() &&
+ destdnl->get_remote_ino() == srci->ino())) {
reply_request(mdr, -EINVAL); // actually, this won't reply, but whatev.
return;
}
--
1.7.11.4
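The rewritten sanity check in handle_client_rename() can be read as "refuse an MDS-originated rename unless it is a stray-to-stray move or a rename onto a remote link to the same inode". A minimal sketch of that predicate, with the conditions flattened into booleans; `RenameFacts` and `reject_mds_rename` are hypothetical names and not Ceph code.

```cpp
#include <cassert>

// Hypothetical flattened view of the facts the check looks at.
struct RenameFacts {
  bool from_mds;          // request was issued by an MDS, not a client
  bool src_in_stray;      // source path is rooted at a stray directory
  bool dest_in_stray;     // destination path is rooted at a stray directory
  bool dest_is_remote;    // destination dentry is a remote link
  bool dest_links_to_src; // that remote link points at the source inode
};

// Returns true when the rename should be refused (-EINVAL in the patch).
bool reject_mds_rename(const RenameFacts& f) {
  bool migration = f.src_in_stray && f.dest_in_stray;
  bool reintegration = f.dest_is_remote && f.dest_links_to_src;
  return f.from_mds && !migration && !reintegration;
}
```

Client requests never hit this rejection, since the first term is false for them.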
* [PATCH 05/12] mds: Fix xlock imports
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (2 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 04/12] mds: Allow rename request for stray migration/reintegration Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 06/12] mds: Set metablob.renamed_dirino in do_rename_rollback() Yan, Zheng
` (6 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Xlock imports and capability imports are uncorrelated; we should call
xlock_import() even if there is no capability import.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/Server.cc | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/src/mds/Server.cc b/src/mds/Server.cc
index 8584e60..7659b23 100644
--- a/src/mds/Server.cc
+++ b/src/mds/Server.cc
@@ -5838,16 +5838,16 @@ void Server::_rename_apply(MDRequest *mdr, CDentry *srcdn, CDentry *destdn, CDen
if (mdr->more()->cap_imports.count(destdnl->get_inode())) {
mds->mdcache->migrator->finish_import_inode_caps(destdnl->get_inode(), srcdn->authority().first,
mdr->more()->cap_imports[destdnl->get_inode()]);
- /* hack: add an auth pin for each xlock we hold. These were
- * remote xlocks previously but now they're local and
- * we're going to try and unpin when we xlock_finish. */
- for (set<SimpleLock *>::iterator i = mdr->xlocks.begin();
- i != mdr->xlocks.end();
- ++i)
- if ((*i)->get_parent() == destdnl->get_inode() &&
- !(*i)->is_locallock())
- mds->locker->xlock_import(*i, mdr);
}
+ /* hack: add an auth pin for each xlock we hold. These were
+ * remote xlocks previously but now they're local and
+ * we're going to try and unpin when we xlock_finish. */
+ for (set<SimpleLock *>::iterator i = mdr->xlocks.begin();
+ i != mdr->xlocks.end();
+ ++i)
+ if ((*i)->get_parent() == destdnl->get_inode() &&
+ !(*i)->is_locallock())
+ mds->locker->xlock_import(*i, mdr);
// hack: fix auth bit
in->state_set(CInode::STATE_AUTH);
--
1.7.11.4
* [PATCH 06/12] mds: Set metablob.renamed_dirino in do_rename_rollback()
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (3 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 05/12] mds: Fix xlock imports Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 07/12] mds: Avoid save unnecessary parent snaprealm Yan, Zheng
` (5 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/Server.cc | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/mds/Server.cc b/src/mds/Server.cc
index 7659b23..e16800e 100644
--- a/src/mds/Server.cc
+++ b/src/mds/Server.cc
@@ -6395,8 +6395,11 @@ void Server::do_rename_rollback(bufferlist &rbl, int master, MDRequest *mdr)
le->commit.add_null_dentry(straydn, true);
}
- if (in->is_dir())
+ if (in->is_dir()) {
+ dout(10) << " noting renamed dir ino " << in->ino() << " in metablob" << dendl;
+ le->commit.renamed_dirino = in->ino();
mdcache->project_subtree_rename(in, destdir, srcdir);
+ }
mdlog->submit_entry(le, new C_MDS_LoggedRenameRollback(this, mut, mdr,
srcdnl->get_inode(), destdir));
--
1.7.11.4
* [PATCH 07/12] mds: Avoid save unnecessary parent snaprealm
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (4 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 06/12] mds: Set metablob.renamed_dirino in do_rename_rollback() Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 08/12] mds: Allow export subtrees in other MDS' stray directory Yan, Zheng
` (4 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
We can avoid saving the parent snaprealm if current_parent_since is greater
than the parent snaprealm's newest sequence.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/CInode.cc | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/mds/CInode.cc b/src/mds/CInode.cc
index 8fed424..8566e55 100644
--- a/src/mds/CInode.cc
+++ b/src/mds/CInode.cc
@@ -386,8 +386,10 @@ void CInode::project_past_snaprealm_parent(SnapRealm *newparent)
if (newparent != oldparent) {
snapid_t oldparentseq = oldparent->get_newest_seq();
- new_snap->past_parents[oldparentseq].ino = oldparent->inode->ino();
- new_snap->past_parents[oldparentseq].first = new_snap->current_parent_since;
+ if (oldparentseq + 1 > new_snap->current_parent_since) {
+ new_snap->past_parents[oldparentseq].ino = oldparent->inode->ino();
+ new_snap->past_parents[oldparentseq].first = new_snap->current_parent_since;
+ }
new_snap->current_parent_since = MAX(oldparentseq, newparent->get_last_created()) + 1;
}
}
--
1.7.11.4
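The guard added above only records the old parent when its newest sequence is at least current_parent_since. A minimal standalone sketch of that bookkeeping, assuming simplified types; `RealmSketch`, `PastParent` and `switch_parent` are hypothetical names and do not match the real SnapRealm layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>

typedef uint64_t snapid; // hypothetical stand-in for snapid_t

struct PastParent {
  uint64_t ino;  // inode number of the former parent realm
  snapid first;  // first snapid for which that parent applied
};

struct RealmSketch {
  snapid current_parent_since;
  std::map<snapid, PastParent> past_parents;
};

// Switch to a new parent realm. The old parent is remembered only if it
// could have taken a snapshot while it was the parent, i.e. its newest
// sequence is not older than current_parent_since.
void switch_parent(RealmSketch& r, uint64_t old_parent_ino,
                   snapid old_parent_newest_seq,
                   snapid new_parent_last_created) {
  if (old_parent_newest_seq + 1 > r.current_parent_since) {
    PastParent pp = {old_parent_ino, r.current_parent_since};
    r.past_parents[old_parent_newest_seq] = pp;
  }
  r.current_parent_since =
      std::max(old_parent_newest_seq, new_parent_last_created) + 1;
}
```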
* [PATCH 08/12] mds: Allow export subtrees in other MDS' stray directory
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (5 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 07/12] mds: Avoid save unnecessary parent snaprealm Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 09/12] mds: Properly update dirty dir fragstat during log replay Yan, Zheng
` (3 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Stray migration is implemented by rename, so it may create auth subtrees
in another MDS' stray directory.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/Migrator.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mds/Migrator.cc b/src/mds/Migrator.cc
index 5f727a9..98ab4e1 100644
--- a/src/mds/Migrator.cc
+++ b/src/mds/Migrator.cc
@@ -640,7 +640,8 @@ void Migrator::export_dir(CDir *dir, int dest)
return;
}
- if (!dir->inode->is_base() && dir->inode->get_parent_dir()->get_inode()->is_stray()) {
+ if (!dir->inode->is_base() &&
+ dir->inode->get_parent_dir()->ino() == MDS_INO_MDSDIR(mds->get_nodeid())) {
dout(7) << "i won't export anything in stray" << dendl;
return;
}
--
1.7.11.4
* [PATCH 09/12] mds: Properly update dirty dir fragstat during log replay
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (6 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 08/12] mds: Allow export subtrees in other MDS' stray directory Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 10/12] mds: Trim non auth subtree directory Yan, Zheng
` (2 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Dirty dir fragstat is managed by filelock instead of nestlock.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/journal.cc | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/mds/journal.cc b/src/mds/journal.cc
index 38f8c0e..0de3f39 100644
--- a/src/mds/journal.cc
+++ b/src/mds/journal.cc
@@ -488,14 +488,20 @@ void EMetaBlob::replay(MDS *mds, LogSegment *logseg)
dir->get_inode()->filelock.mark_dirty();
dir->get_inode()->nestlock.mark_dirty();
- if (!(dir->fnode.rstat == dir->fnode.accounted_rstat) ||
- !(dir->fnode.fragstat == dir->fnode.accounted_fragstat)) {
+ if (!(dir->fnode.rstat == dir->fnode.accounted_rstat)) {
dout(10) << "EMetaBlob.replay dirty nestinfo on " << *dir << dendl;
mds->locker->mark_updated_scatterlock(&dir->inode->nestlock);
logseg->dirty_dirfrag_nest.push_back(&dir->inode->item_dirty_dirfrag_nest);
} else {
dout(10) << "EMetaBlob.replay clean nestinfo on " << *dir << dendl;
}
+ if (!(dir->fnode.fragstat == dir->fnode.accounted_fragstat)) {
+ dout(10) << "EMetaBlob.replay dirty fragstat on " << *dir << dendl;
+ mds->locker->mark_updated_scatterlock(&dir->inode->filelock);
+ logseg->dirty_dirfrag_dir.push_back(&dir->inode->item_dirty_dirfrag_dir);
+ } else {
+ dout(10) << "EMetaBlob.replay clean fragstat on " << *dir << dendl;
+ }
}
if (lump.is_new())
dir->mark_new(logseg);
--
1.7.11.4
* [PATCH 10/12] mds: Trim non auth subtree directory
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (7 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 09/12] mds: Properly update dirty dir fragstat during log replay Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 11/12] mds: Properly re-calculate mdsdir inode's auth bit Yan, Zheng
2012-10-02 8:55 ` [PATCH 12/12] mds: Avoid creating unnecessary snaprealm Yan, Zheng
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Trim a non-auth subtree directory if all of its dentries were trimmed
and it is not a bound of an auth subtree.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/MDCache.cc | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
index 1348b3e..397b991 100644
--- a/src/mds/MDCache.cc
+++ b/src/mds/MDCache.cc
@@ -5412,6 +5412,15 @@ bool MDCache::trim(int max)
++i)
lru.lru_insert_mid(*i);
+ for (map<CDir*, set<CDir*> >::iterator p = subtrees.begin();
+ p != subtrees.end();) {
+ CDir *dir = p->first;
+ p++;
+ if (!dir->is_auth() && !dir->get_inode()->is_auth()) {
+ if (dir->get_num_ref() == 1) // subtree pin
+ trim_dirfrag(dir, 0, expiremap);
+ }
+ }
// trim root?
if (max == 0 && root) {
list<CDir*> ls;
--
1.7.11.4
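The loop added to MDCache::trim() reads p->first and advances p before calling trim_dirfrag(), presumably because trimming can remove that entry from the subtrees map. A minimal sketch of the same iterate-then-erase pattern on a plain std::map; `Cache` and `trim_unreferenced` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical cache: key -> reference count.
typedef std::map<std::string, int> Cache;

// Removes entries whose only reference is the cache's own pin (count == 1).
// The loop iterator is advanced before the erase, so it never points at a
// node that has been removed.
int trim_unreferenced(Cache& cache) {
  int trimmed = 0;
  for (Cache::iterator p = cache.begin(); p != cache.end();) {
    Cache::iterator cur = p;
    ++p;
    if (cur->second == 1) {
      cache.erase(cur);
      ++trimmed;
    }
  }
  return trimmed;
}
```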
* [PATCH 11/12] mds: Properly re-calculate mdsdir inode's auth bit
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (8 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 10/12] mds: Trim non auth subtree directory Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 8:55 ` [PATCH 12/12] mds: Avoid creating unnecessary snaprealm Yan, Zheng
10 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/MDCache.cc | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
index 397b991..8b02b8b 100644
--- a/src/mds/MDCache.cc
+++ b/src/mds/MDCache.cc
@@ -3132,6 +3132,11 @@ void MDCache::recalc_auth_bits()
for (map<CDir*,set<CDir*> >::iterator p = subtrees.begin();
p != subtrees.end();
++p) {
+
+ CInode *inode = p->first->get_inode();
+ if (inode->is_mdsdir() && inode->ino() != MDS_INO_MDSDIR(mds->get_nodeid()))
+ inode->state_clear(CInode::STATE_AUTH);
+
list<CDir*> dfq; // dirfrag queue
dfq.push_back(p->first);
--
1.7.11.4
* [PATCH 12/12] mds: Avoid creating unnecessary snaprealm
2012-10-02 8:55 [PATCH 01/12] mds: Don't drop client request from MDS Yan, Zheng
` (9 preceding siblings ...)
2012-10-02 8:55 ` [PATCH 11/12] mds: Properly re-calculate mdsdir inode's auth bit Yan, Zheng
@ 2012-10-02 8:55 ` Yan, Zheng
2012-10-02 18:31 ` Sage Weil
10 siblings, 1 reply; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 8:55 UTC
To: sage, ceph-devel; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
When moving a directory between snaprealms, we can avoid creating a snaprealm
if the directory doesn't have its own snaprealm and was created after both
realms' newest snapshot.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/Server.cc | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/src/mds/Server.cc b/src/mds/Server.cc
index e16800e..b706b5a 100644
--- a/src/mds/Server.cc
+++ b/src/mds/Server.cc
@@ -4577,7 +4577,8 @@ void Server::_unlink_local(MDRequest *mdr, CDentry *dn, CDentry *straydn)
mdcache->predirty_journal_parents(mdr, &le->metablob, in, straydn->get_dir(), PREDIRTY_PRIMARY|PREDIRTY_DIR, 1);
// project snaprealm, too
- in->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
+ if (in->snaprealm || follows + 1 > dn->first)
+ in->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
le->metablob.add_primary_dentry(straydn, true, in);
} else {
@@ -5247,11 +5248,16 @@ void Server::handle_client_rename(MDRequest *mdr)
}
// moving between snaprealms?
- if (srcdnl->is_primary() && !srci->snaprealm &&
- srci->find_snaprealm() != destdn->get_dir()->inode->find_snaprealm()) {
- dout(10) << " renaming between snaprealms, creating snaprealm for " << *srci << dendl;
- mds->mdcache->snaprealm_create(mdr, srci);
- return;
+ if (srcdnl->is_primary() && srci->is_multiversion() && !srci->snaprealm) {
+ SnapRealm *srcrealm = srci->find_snaprealm();
+ SnapRealm *destrealm = destdn->get_dir()->inode->find_snaprealm();
+ if (srcrealm != destrealm &&
+ (srcrealm->get_newest_seq() + 1 > srcdn->first ||
+ destrealm->get_newest_seq() + 1 > srcdn->first)) {
+ dout(10) << " renaming between snaprealms, creating snaprealm for " << *srci << dendl;
+ mds->mdcache->snaprealm_create(mdr, srci);
+ return;
+ }
}
assert(g_conf->mds_kill_rename_at != 1);
@@ -5650,6 +5656,7 @@ void Server::_rename_prepare(MDRequest *mdr,
if (destdn->is_auth())
mdcache->predirty_journal_parents(mdr, metablob, srci, destdn->get_dir(), flags, 1);
+ SnapRealm *src_realm = srci->find_snaprealm();
SnapRealm *dest_realm = destdn->get_dir()->inode->find_snaprealm();
snapid_t next_dest_snap = dest_realm->get_newest_seq() + 1;
@@ -5659,7 +5666,8 @@ void Server::_rename_prepare(MDRequest *mdr,
if (destdnl->is_primary()) {
if (destdn->is_auth()) {
// project snaprealm, too
- oldin->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
+ if (oldin->snaprealm || src_realm->get_newest_seq() + 1 > srcdn->first)
+ oldin->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
straydn->first = MAX(oldin->first, next_dest_snap);
metablob->add_primary_dentry(straydn, true, oldin);
}
@@ -5703,7 +5711,8 @@ void Server::_rename_prepare(MDRequest *mdr,
}
} else if (srcdnl->is_primary()) {
// project snap parent update?
- if (destdn->is_auth() && srci->snaprealm)
+ if (destdn->is_auth() &&
+ (srci->snaprealm || src_realm->get_newest_seq() + 1 > srcdn->first))
srci->project_past_snaprealm_parent(dest_realm);
if (destdn->is_auth() && !destdnl->is_null())
--
1.7.11.4
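The new condition in handle_client_rename() can be summarized as a single predicate: a snaprealm is created only when the inode crosses realms and at least one realm has a snapshot sequence at or after the dentry's first snapid. A minimal sketch with the inputs flattened into plain values; `needs_snaprealm` and its parameter names are hypothetical and not taken from Server.cc.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t snapid; // hypothetical stand-in for snapid_t

// Decide whether a rename between realms needs a fresh snaprealm.
// A realm's newest_seq + 1 > dentry_first means that realm may hold a
// snapshot covering the dentry, so its history has to be preserved.
bool needs_snaprealm(bool is_primary, bool is_multiversion,
                     bool has_own_realm, bool same_realm,
                     snapid src_newest_seq, snapid dest_newest_seq,
                     snapid dentry_first) {
  if (!is_primary || !is_multiversion || has_own_realm || same_realm)
    return false;
  return src_newest_seq + 1 > dentry_first ||
         dest_newest_seq + 1 > dentry_first;
}
```

A directory whose dentry first snapid is newer than both realms' sequences falls through to false, which is the case the commit message describes.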
* Re: [PATCH 12/12] mds: Avoid creating unnecessary snaprealm
2012-10-02 8:55 ` [PATCH 12/12] mds: Avoid creating unnecessary snaprealm Yan, Zheng
@ 2012-10-02 18:31 ` Sage Weil
2012-10-02 23:45 ` Yan, Zheng
0 siblings, 1 reply; 16+ messages in thread
From: Sage Weil @ 2012-10-02 18:31 UTC
To: Yan, Zheng; +Cc: ceph-devel
Hi Yan,
This whole series looks great! Sticking it in wip-mds and running it
through the fs qa suite before merging it.
How are you testing these? If you haven't seen it yet, there is an 'mds
thrash exports' option that makes MDSs randomly migrate subtrees to each
other, which is great for shaking out bugs. That and periodic daemon
restarts (one of the first things we need to do on the clustered mds front
is to get daemon restarting integrated into teuthology).
Thanks!
sage
On Tue, 2 Oct 2012, Yan, Zheng wrote:
> From: "Yan, Zheng" <zheng.z.yan@intel.com>
>
> When moving directory between snaprealms, we can avoid creating snaprealm
> if the directory doesn't has its own snaprealm and directory was created
> after both realms' newest snapshot.
>
> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> ---
> src/mds/Server.cc | 25 +++++++++++++++++--------
> 1 file changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/src/mds/Server.cc b/src/mds/Server.cc
> index e16800e..b706b5a 100644
> --- a/src/mds/Server.cc
> +++ b/src/mds/Server.cc
> @@ -4577,7 +4577,8 @@ void Server::_unlink_local(MDRequest *mdr, CDentry *dn, CDentry *straydn)
> mdcache->predirty_journal_parents(mdr, &le->metablob, in, straydn->get_dir(), PREDIRTY_PRIMARY|PREDIRTY_DIR, 1);
>
> // project snaprealm, too
> - in->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
> + if (in->snaprealm || follows + 1 > dn->first)
> + in->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
>
> le->metablob.add_primary_dentry(straydn, true, in);
> } else {
> @@ -5247,11 +5248,16 @@ void Server::handle_client_rename(MDRequest *mdr)
> }
>
> // moving between snaprealms?
> - if (srcdnl->is_primary() && !srci->snaprealm &&
> - srci->find_snaprealm() != destdn->get_dir()->inode->find_snaprealm()) {
> - dout(10) << " renaming between snaprealms, creating snaprealm for " << *srci << dendl;
> - mds->mdcache->snaprealm_create(mdr, srci);
> - return;
> + if (srcdnl->is_primary() && srci->is_multiversion() && !srci->snaprealm) {
> + SnapRealm *srcrealm = srci->find_snaprealm();
> + SnapRealm *destrealm = destdn->get_dir()->inode->find_snaprealm();
> + if (srcrealm != destrealm &&
> + (srcrealm->get_newest_seq() + 1 > srcdn->first ||
> + destrealm->get_newest_seq() + 1 > srcdn->first)) {
> + dout(10) << " renaming between snaprealms, creating snaprealm for " << *srci << dendl;
> + mds->mdcache->snaprealm_create(mdr, srci);
> + return;
> + }
> }
>
> assert(g_conf->mds_kill_rename_at != 1);
> @@ -5650,6 +5656,7 @@ void Server::_rename_prepare(MDRequest *mdr,
> if (destdn->is_auth())
> mdcache->predirty_journal_parents(mdr, metablob, srci, destdn->get_dir(), flags, 1);
>
> + SnapRealm *src_realm = srci->find_snaprealm();
> SnapRealm *dest_realm = destdn->get_dir()->inode->find_snaprealm();
> snapid_t next_dest_snap = dest_realm->get_newest_seq() + 1;
>
> @@ -5659,7 +5666,8 @@ void Server::_rename_prepare(MDRequest *mdr,
> if (destdnl->is_primary()) {
> if (destdn->is_auth()) {
> // project snaprealm, too
> - oldin->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
> + if (oldin->snaprealm || src_realm->get_newest_seq() + 1 > srcdn->first)
> + oldin->project_past_snaprealm_parent(straydn->get_dir()->inode->find_snaprealm());
> straydn->first = MAX(oldin->first, next_dest_snap);
> metablob->add_primary_dentry(straydn, true, oldin);
> }
> @@ -5703,7 +5711,8 @@ void Server::_rename_prepare(MDRequest *mdr,
> }
> } else if (srcdnl->is_primary()) {
> // project snap parent update?
> - if (destdn->is_auth() && srci->snaprealm)
> + if (destdn->is_auth() &&
> + (srci->snaprealm || src_realm->get_newest_seq() + 1 > srcdn->first))
> srci->project_past_snaprealm_parent(dest_realm);
>
> if (destdn->is_auth() && !destdnl->is_null())
> --
> 1.7.11.4
* Re: [PATCH 12/12] mds: Avoid creating unnecessary snaprealm
2012-10-02 18:31 ` Sage Weil
@ 2012-10-02 23:45 ` Yan, Zheng
2012-10-03 0:12 ` Sage Weil
0 siblings, 1 reply; 16+ messages in thread
From: Yan, Zheng @ 2012-10-02 23:45 UTC
To: Sage Weil; +Cc: ceph-devel
On 10/03/2012 02:31 AM, Sage Weil wrote:
> Hi Yan,
>
> This whole series looks great! Sticking it in wip-mds and running it
> through the fs qa suite before merging it.
>
> How are you testing these? If you haven't seen it yet, there is an 'mds
> thrash exports' option that will make MDSs random migrate subtrees to each
> other that is great for shaking out bugs. That and periodic daemon
> restarts (one of the first things we need to do on the clustered mds front
> is to get daemon restarting integrated into teuthology).
>
The patches are fixes for problems I encountered while experimenting with MDS
shutdown. I set up a 2-MDS cephfs, copied some data into it, deleted some
directories whose authority is MDS.1, then shut down MDS.1.
Most patches in this series are obvious. The two snaprealm-related patches are
workarounds for a bug: a replica inode's snaprealm->open is not true. The bug
triggers an assertion in CInode::pop_projected_snaprealm() if a snaprealm is
involved in a cross-authority rename.
Regards
Yan, Zheng
* Re: [PATCH 12/12] mds: Avoid creating unnecessary snaprealm
2012-10-02 23:45 ` Yan, Zheng
@ 2012-10-03 0:12 ` Sage Weil
2012-10-03 11:44 ` Yan, Zheng
0 siblings, 1 reply; 16+ messages in thread
From: Sage Weil @ 2012-10-03 0:12 UTC
To: Yan, Zheng; +Cc: ceph-devel
On Wed, 3 Oct 2012, Yan, Zheng wrote:
> On 10/03/2012 02:31 AM, Sage Weil wrote:
> > Hi Yan,
> >
> > This whole series looks great! Sticking it in wip-mds and running it
> > through the fs qa suite before merging it.
> >
> > How are you testing these? If you haven't seen it yet, there is an 'mds
> > thrash exports' option that will make MDSs random migrate subtrees to each
> > other that is great for shaking out bugs. That and periodic daemon
> > restarts (one of the first things we need to do on the clustered mds front
> > is to get daemon restarting integrated into teuthology).
> >
>
> The patches are fixes for problems I encountered during playing MDS shutdown.
> I setup a 2 MDS cephfs and copied some data into it, deleted some directories
> whose authority is MDS.1, then shutdown MDS.1.
>
> Most patches in this series are obvious. The two snaprealm related patches are
> workaround for a bug: replica inode's snaprealm->open is not true. The bug triggers
> assertion in CInode::pop_projected_snaprealm() if snaprealm is involved in cross
> authority rename.
Do you mind opening a ticket at tracker.newdream.net so we don't lose
track of it?
Fsstress on a single mds turned up this:
2012-10-02T17:09:09.359 INFO:teuthology.task.ceph.mds.a.err:*** Caught signal (Segmentation fault) **
2012-10-02T17:09:09.359 INFO:teuthology.task.ceph.mds.a.err: in thread 7f8873a41700
2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: ceph version 0.52-949-ge8df6a7 (commit:e8df6a74cae66accb6682129c9c5ad33797f458c)
2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x812b21]
2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 2: (()+0xfcb0) [0x7f88787b3cb0]
2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 3: (Server::handle_client_rename(MDRequest*)+0xa28) [0x53dc88]
2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 4: (Server::dispatch_client_request(MDRequest*)+0x4fb) [0x54123b]
2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 5: (Server::handle_client_request(MClientRequest*)+0x51d) [0x544a6d]
2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 6: (Server::dispatch(Message*)+0x2d3) [0x5452e3]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 7: (MDS::handle_deferrable_message(Message*)+0x91f) [0x4bc32f]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 8: (MDS::_dispatch(Message*)+0x9b6) [0x4cf8b6]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 9: (MDS::ms_dispatch(Message*)+0x21b) [0x4d0c3b]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 10: (DispatchQueue::entry()+0x711) [0x7eb301]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x7713dd]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 12: (()+0x7e9a) [0x7f88787abe9a]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 13: (clone()+0x6d) [0x7f8876d534bd]
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err:2012-10-02 17:09:09.349272 7f8873a41700 -1 *** Caught signal (Segmentation fault) **
2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: in thread 7f8873a41700
I don't have time right now to hunt this down, but you should be able to
reproduce with qa/workunits/suites/fsstress.sh on top of ceph-fuse with 1
mds.
Thanks!
sage
* Re: [PATCH 12/12] mds: Avoid creating unnecessary snaprealm
2012-10-03 0:12 ` Sage Weil
@ 2012-10-03 11:44 ` Yan, Zheng
0 siblings, 0 replies; 16+ messages in thread
From: Yan, Zheng @ 2012-10-03 11:44 UTC
To: Sage Weil; +Cc: ceph-devel
On 10/03/2012 08:12 AM, Sage Weil wrote:
> On Wed, 3 Oct 2012, Yan, Zheng wrote:
>> On 10/03/2012 02:31 AM, Sage Weil wrote:
>>> Hi Yan,
>>>
>>> This whole series looks great! Sticking it in wip-mds and running it
>>> through the fs qa suite before merging it.
>>>
>>> How are you testing these? If you haven't seen it yet, there is an 'mds
>>> thrash exports' option that will make MDSs random migrate subtrees to each
>>> other that is great for shaking out bugs. That and periodic daemon
>>> restarts (one of the first things we need to do on the clustered mds front
>>> is to get daemon restarting integrated into teuthology).
>>>
>>
>> The patches are fixes for problems I encountered during playing MDS shutdown.
>> I setup a 2 MDS cephfs and copied some data into it, deleted some directories
>> whose authority is MDS.1, then shutdown MDS.1.
>>
>> Most patches in this series are obvious. The two snaprealm related patches are
>> workaround for a bug: replica inode's snaprealm->open is not true. The bug triggers
>> assertion in CInode::pop_projected_snaprealm() if snaprealm is involved in cross
>> authority rename.
>
> Do you mind opening a ticket at tracker.newdream.net so we don't lose
> track of it?
will do
>
> Fsstress on a single mds turned up this:
>
> 2012-10-02T17:09:09.359 INFO:teuthology.task.ceph.mds.a.err:*** Caught signal (Segmentation fault) **
> 2012-10-02T17:09:09.359 INFO:teuthology.task.ceph.mds.a.err: in thread 7f8873a41700
> 2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: ceph version 0.52-949-ge8df6a7 (commit:e8df6a74cae66accb6682129c9c5ad33797f458c)
> 2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x812b21]
> 2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 2: (()+0xfcb0) [0x7f88787b3cb0]
> 2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 3: (Server::handle_client_rename(MDRequest*)+0xa28) [0x53dc88]
> 2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 4: (Server::dispatch_client_request(MDRequest*)+0x4fb) [0x54123b]
> 2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 5: (Server::handle_client_request(MClientRequest*)+0x51d) [0x544a6d]
> 2012-10-02T17:09:09.361 INFO:teuthology.task.ceph.mds.a.err: 6: (Server::dispatch(Message*)+0x2d3) [0x5452e3]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 7: (MDS::handle_deferrable_message(Message*)+0x91f) [0x4bc32f]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 8: (MDS::_dispatch(Message*)+0x9b6) [0x4cf8b6]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 9: (MDS::ms_dispatch(Message*)+0x21b) [0x4d0c3b]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 10: (DispatchQueue::entry()+0x711) [0x7eb301]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x7713dd]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 12: (()+0x7e9a) [0x7f88787abe9a]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: 13: (clone()+0x6d) [0x7f8876d534bd]
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err:2012-10-02 17:09:09.349272 7f8873a41700 -1 *** Caught signal (Segmentation fault) **
> 2012-10-02T17:09:09.362 INFO:teuthology.task.ceph.mds.a.err: in thread 7f8873a41700
>
> I don't have time right now to hunt this down, but you should be able to
> reproduce with qa/workunits/suites/fsstress.sh on top of ceph-fuse with 1
> mds.
>
This is an old stray reintegration bug; I just sent a patch to fix it.
Regards
Yan, Zheng
> Thanks!
> sage
>