* [PATCH V2] mds: fix CDir::_commit_partial() bug
@ 2012-11-22 16:52 Yan, Zheng
2012-11-24 11:05 ` Yan, Zheng
2012-12-01 13:58 ` Yan, Zheng
0 siblings, 2 replies; 8+ messages in thread
From: Yan, Zheng @ 2012-11-22 16:52 UTC (permalink / raw)
To: ceph-devel, sage; +Cc: Yan, Zheng
From: "Yan, Zheng" <zheng.z.yan@intel.com>
When a null dentry is encountered, CDir::_commit_partial() adds
a OSD_TMAP_RM command to delete the dentry. But if the dentry is
new, the osd will not find the dentry when handling the command
and the tmap update operation will fail totally.
This patch also makes sure dentries are properly marked as new
when preparing new dentries and exporting dentries.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
src/mds/CDentry.h | 2 ++
src/mds/CDir.cc | 11 ++++++++---
src/mds/CDir.h | 2 +-
src/mds/MDCache.cc | 9 ++++++---
src/mds/Server.cc | 3 +++
5 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/src/mds/CDentry.h b/src/mds/CDentry.h
index 480e562..5755c55 100644
--- a/src/mds/CDentry.h
+++ b/src/mds/CDentry.h
@@ -347,6 +347,8 @@ public:
// twiddle
state = 0;
state_set(CDentry::STATE_AUTH);
+ if (nstate & STATE_NEW)
+ mark_new();
if (nstate & STATE_DIRTY)
_mark_dirty(ls);
if (!replica_map.empty())
diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
index c5220ed..411d864 100644
--- a/src/mds/CDir.cc
+++ b/src/mds/CDir.cc
@@ -1696,7 +1696,7 @@ class C_Dir_Committed : public Context {
public:
C_Dir_Committed(CDir *d, version_t v, version_t lrv) : dir(d), version(v), last_renamed_version(lrv) { }
void finish(int r) {
- dir->_committed(version, last_renamed_version);
+ dir->_committed(version, last_renamed_version, r);
}
};
@@ -1802,6 +1802,10 @@ CDir::map_t::iterator CDir::_commit_partial(ObjectOperation& m,
continue; // skip clean dentries
if (dn->get_linkage()->is_null()) {
+ if (dn->is_new()) {
+ dn->mark_clean();
+ continue;
+ }
dout(10) << " rm " << dn->name << " " << *dn << dendl;
finalbl.append(CEPH_OSD_TMAP_RM);
dn->key().encode(finalbl);
@@ -1997,10 +2001,11 @@ void CDir::_commit(version_t want)
*
* @param v version i just committed
*/
-void CDir::_committed(version_t v, version_t lrv)
+void CDir::_committed(version_t v, version_t lrv, int ret)
{
- dout(10) << "_committed v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
+ dout(10) << "_committed ret " << ret << " v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
assert(is_auth());
+ assert(ret == 0);
bool stray = inode->is_stray();
diff --git a/src/mds/CDir.h b/src/mds/CDir.h
index 2222418..274e38b 100644
--- a/src/mds/CDir.h
+++ b/src/mds/CDir.h
@@ -487,7 +487,7 @@ private:
unsigned max_write_size=-1,
map_t::iterator last_committed_dn=map_t::iterator());
void _encode_dentry(CDentry *dn, bufferlist& bl, const set<snapid_t> *snaps);
- void _committed(version_t v, version_t last_renamed_version);
+ void _committed(version_t v, version_t last_renamed_version, int ret);
void wait_for_commit(Context *c, version_t v=0);
// -- dirtyness --
diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
index f8b1c8f..e69a49f 100644
--- a/src/mds/MDCache.cc
+++ b/src/mds/MDCache.cc
@@ -657,12 +657,15 @@ CDentry *MDCache::get_or_create_stray_dentry(CInode *in)
CDir *straydir = strayi->get_dirfrag(fg);
assert(straydir);
CDentry *straydn = straydir->lookup(straydname);
- if (!straydn) {
+
+ if (!straydn)
straydn = straydir->add_null_dentry(straydname);
- straydn->mark_new();
- } else
+ else
assert(straydn->get_projected_linkage()->is_null());
+ if (!straydn->is_dirty())
+ straydn->mark_new();
+
return straydn;
}
diff --git a/src/mds/Server.cc b/src/mds/Server.cc
index ec0d5d5..228fede 100644
--- a/src/mds/Server.cc
+++ b/src/mds/Server.cc
@@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
}
}
+ if (!dn->is_dirty())
+ dn->mark_new();
+
return dn;
}
--
1.7.11.7
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH V2] mds: fix CDir::_commit_partial() bug
2012-11-22 16:52 [PATCH V2] mds: fix CDir::_commit_partial() bug Yan, Zheng
@ 2012-11-24 11:05 ` Yan, Zheng
2012-11-24 16:21 ` Sage Weil
2012-12-01 13:58 ` Yan, Zheng
1 sibling, 1 reply; 8+ messages in thread
From: Yan, Zheng @ 2012-11-24 11:05 UTC (permalink / raw)
To: sage@inktank.com >> Sage Weil; +Cc: ceph-devel
Hi, Sage
I found this fix is still not enough because we don't know if a dentry has been
successfully deleted from the directory fragment after MDS restart. There are several
options to fix this, which one do you like?
1. Make OSD ignore the TMAP_RM commands if it can't the find items.
2. Add a new item deletion command for tmap, the new command doesn't make whole tmap
update operation fail if it can't the item. When committing directory fragments,
use the new command for null dentries that were added by MDS log replay.
3. When committing a directory fragment, Re-fetch it if it contains dirty+null dentries
that were added by MDS log replay. Find already deleted items and mark corresponding
dentries as new.
Regards
Yan, Zheng
On 11/23/2012 12:52 AM, Yan, Zheng wrote:
> From: "Yan, Zheng" <zheng.z.yan@intel.com>
>
> When a null dentry is encountered, CDir::_commit_partial() adds
> a OSD_TMAP_RM command to delete the dentry. But if the dentry is
> new, the osd will not find the dentry when handling the command
> and the tmap update operation will fail totally.
>
> This patch also makes sure dentries are properly marked as new
> when preparing new dentries and exporting dentries.
>
> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> ---
> src/mds/CDentry.h | 2 ++
> src/mds/CDir.cc | 11 ++++++++---
> src/mds/CDir.h | 2 +-
> src/mds/MDCache.cc | 9 ++++++---
> src/mds/Server.cc | 3 +++
> 5 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/src/mds/CDentry.h b/src/mds/CDentry.h
> index 480e562..5755c55 100644
> --- a/src/mds/CDentry.h
> +++ b/src/mds/CDentry.h
> @@ -347,6 +347,8 @@ public:
> // twiddle
> state = 0;
> state_set(CDentry::STATE_AUTH);
> + if (nstate & STATE_NEW)
> + mark_new();
> if (nstate & STATE_DIRTY)
> _mark_dirty(ls);
> if (!replica_map.empty())
> diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
> index c5220ed..411d864 100644
> --- a/src/mds/CDir.cc
> +++ b/src/mds/CDir.cc
> @@ -1696,7 +1696,7 @@ class C_Dir_Committed : public Context {
> public:
> C_Dir_Committed(CDir *d, version_t v, version_t lrv) : dir(d), version(v), last_renamed_version(lrv) { }
> void finish(int r) {
> - dir->_committed(version, last_renamed_version);
> + dir->_committed(version, last_renamed_version, r);
> }
> };
>
> @@ -1802,6 +1802,10 @@ CDir::map_t::iterator CDir::_commit_partial(ObjectOperation& m,
> continue; // skip clean dentries
>
> if (dn->get_linkage()->is_null()) {
> + if (dn->is_new()) {
> + dn->mark_clean();
> + continue;
> + }
> dout(10) << " rm " << dn->name << " " << *dn << dendl;
> finalbl.append(CEPH_OSD_TMAP_RM);
> dn->key().encode(finalbl);
> @@ -1997,10 +2001,11 @@ void CDir::_commit(version_t want)
> *
> * @param v version i just committed
> */
> -void CDir::_committed(version_t v, version_t lrv)
> +void CDir::_committed(version_t v, version_t lrv, int ret)
> {
> - dout(10) << "_committed v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> + dout(10) << "_committed ret " << ret << " v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> assert(is_auth());
> + assert(ret == 0);
>
> bool stray = inode->is_stray();
>
> diff --git a/src/mds/CDir.h b/src/mds/CDir.h
> index 2222418..274e38b 100644
> --- a/src/mds/CDir.h
> +++ b/src/mds/CDir.h
> @@ -487,7 +487,7 @@ private:
> unsigned max_write_size=-1,
> map_t::iterator last_committed_dn=map_t::iterator());
> void _encode_dentry(CDentry *dn, bufferlist& bl, const set<snapid_t> *snaps);
> - void _committed(version_t v, version_t last_renamed_version);
> + void _committed(version_t v, version_t last_renamed_version, int ret);
> void wait_for_commit(Context *c, version_t v=0);
>
> // -- dirtyness --
> diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
> index f8b1c8f..e69a49f 100644
> --- a/src/mds/MDCache.cc
> +++ b/src/mds/MDCache.cc
> @@ -657,12 +657,15 @@ CDentry *MDCache::get_or_create_stray_dentry(CInode *in)
> CDir *straydir = strayi->get_dirfrag(fg);
> assert(straydir);
> CDentry *straydn = straydir->lookup(straydname);
> - if (!straydn) {
> +
> + if (!straydn)
> straydn = straydir->add_null_dentry(straydname);
> - straydn->mark_new();
> - } else
> + else
> assert(straydn->get_projected_linkage()->is_null());
>
> + if (!straydn->is_dirty())
> + straydn->mark_new();
> +
> return straydn;
> }
>
> diff --git a/src/mds/Server.cc b/src/mds/Server.cc
> index ec0d5d5..228fede 100644
> --- a/src/mds/Server.cc
> +++ b/src/mds/Server.cc
> @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
> }
> }
>
> + if (!dn->is_dirty())
> + dn->mark_new();
> +
> return dn;
> }
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2] mds: fix CDir::_commit_partial() bug
2012-11-24 11:05 ` Yan, Zheng
@ 2012-11-24 16:21 ` Sage Weil
2012-11-24 17:45 ` Yan, Zheng
0 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2012-11-24 16:21 UTC (permalink / raw)
To: Yan, Zheng; +Cc: ceph-devel
On Sat, 24 Nov 2012, Yan, Zheng wrote:
> Hi, Sage
>
> I found this fix is still not enough because we don't know if a dentry has been
> successfully deleted from the directory fragment after MDS restart. There are several
> options to fix this, which one do you like?
>
> 1. Make OSD ignore the TMAP_RM commands if it can't the find items.
This is certainly the simplest. Since TMAP is effectively deprecated and
everyone should be using the omap stuff anyway, it gets my vote.
Speaking of, we should transition the MDS to use the omap operations soon
as well, since TMAP is horribly inefficient for large directories. Also,
the rmkey method in that case does not have this problem.
Let's ignore enoent on TMAP_RM for bobtail, and transition to omap in the
next release... do you want to test the patch that drops the !key_exists
check?
Thanks!
sage
> 2. Add a new item deletion command for tmap, the new command doesn't make whole tmap
> update operation fail if it can't the item. When committing directory fragments,
> use the new command for null dentries that were added by MDS log replay.
>
> 3. When committing a directory fragment, Re-fetch it if it contains dirty+null dentries
> that were added by MDS log replay. Find already deleted items and mark corresponding
> dentries as new.
>
> Regards
> Yan, Zheng
>
> On 11/23/2012 12:52 AM, Yan, Zheng wrote:
> > From: "Yan, Zheng" <zheng.z.yan@intel.com>
> >
> > When a null dentry is encountered, CDir::_commit_partial() adds
> > a OSD_TMAP_RM command to delete the dentry. But if the dentry is
> > new, the osd will not find the dentry when handling the command
> > and the tmap update operation will fail totally.
> >
> > This patch also makes sure dentries are properly marked as new
> > when preparing new dentries and exporting dentries.
> >
> > Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> > ---
> > src/mds/CDentry.h | 2 ++
> > src/mds/CDir.cc | 11 ++++++++---
> > src/mds/CDir.h | 2 +-
> > src/mds/MDCache.cc | 9 ++++++---
> > src/mds/Server.cc | 3 +++
> > 5 files changed, 20 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/mds/CDentry.h b/src/mds/CDentry.h
> > index 480e562..5755c55 100644
> > --- a/src/mds/CDentry.h
> > +++ b/src/mds/CDentry.h
> > @@ -347,6 +347,8 @@ public:
> > // twiddle
> > state = 0;
> > state_set(CDentry::STATE_AUTH);
> > + if (nstate & STATE_NEW)
> > + mark_new();
> > if (nstate & STATE_DIRTY)
> > _mark_dirty(ls);
> > if (!replica_map.empty())
> > diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
> > index c5220ed..411d864 100644
> > --- a/src/mds/CDir.cc
> > +++ b/src/mds/CDir.cc
> > @@ -1696,7 +1696,7 @@ class C_Dir_Committed : public Context {
> > public:
> > C_Dir_Committed(CDir *d, version_t v, version_t lrv) : dir(d), version(v), last_renamed_version(lrv) { }
> > void finish(int r) {
> > - dir->_committed(version, last_renamed_version);
> > + dir->_committed(version, last_renamed_version, r);
> > }
> > };
> >
> > @@ -1802,6 +1802,10 @@ CDir::map_t::iterator CDir::_commit_partial(ObjectOperation& m,
> > continue; // skip clean dentries
> >
> > if (dn->get_linkage()->is_null()) {
> > + if (dn->is_new()) {
> > + dn->mark_clean();
> > + continue;
> > + }
> > dout(10) << " rm " << dn->name << " " << *dn << dendl;
> > finalbl.append(CEPH_OSD_TMAP_RM);
> > dn->key().encode(finalbl);
> > @@ -1997,10 +2001,11 @@ void CDir::_commit(version_t want)
> > *
> > * @param v version i just committed
> > */
> > -void CDir::_committed(version_t v, version_t lrv)
> > +void CDir::_committed(version_t v, version_t lrv, int ret)
> > {
> > - dout(10) << "_committed v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> > + dout(10) << "_committed ret " << ret << " v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> > assert(is_auth());
> > + assert(ret == 0);
> >
> > bool stray = inode->is_stray();
> >
> > diff --git a/src/mds/CDir.h b/src/mds/CDir.h
> > index 2222418..274e38b 100644
> > --- a/src/mds/CDir.h
> > +++ b/src/mds/CDir.h
> > @@ -487,7 +487,7 @@ private:
> > unsigned max_write_size=-1,
> > map_t::iterator last_committed_dn=map_t::iterator());
> > void _encode_dentry(CDentry *dn, bufferlist& bl, const set<snapid_t> *snaps);
> > - void _committed(version_t v, version_t last_renamed_version);
> > + void _committed(version_t v, version_t last_renamed_version, int ret);
> > void wait_for_commit(Context *c, version_t v=0);
> >
> > // -- dirtyness --
> > diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
> > index f8b1c8f..e69a49f 100644
> > --- a/src/mds/MDCache.cc
> > +++ b/src/mds/MDCache.cc
> > @@ -657,12 +657,15 @@ CDentry *MDCache::get_or_create_stray_dentry(CInode *in)
> > CDir *straydir = strayi->get_dirfrag(fg);
> > assert(straydir);
> > CDentry *straydn = straydir->lookup(straydname);
> > - if (!straydn) {
> > +
> > + if (!straydn)
> > straydn = straydir->add_null_dentry(straydname);
> > - straydn->mark_new();
> > - } else
> > + else
> > assert(straydn->get_projected_linkage()->is_null());
> >
> > + if (!straydn->is_dirty())
> > + straydn->mark_new();
> > +
> > return straydn;
> > }
> >
> > diff --git a/src/mds/Server.cc b/src/mds/Server.cc
> > index ec0d5d5..228fede 100644
> > --- a/src/mds/Server.cc
> > +++ b/src/mds/Server.cc
> > @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
> > }
> > }
> >
> > + if (!dn->is_dirty())
> > + dn->mark_new();
> > +
> > return dn;
> > }
> >
> >
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2] mds: fix CDir::_commit_partial() bug
2012-11-24 16:21 ` Sage Weil
@ 2012-11-24 17:45 ` Yan, Zheng
2012-11-24 17:51 ` Sage Weil
0 siblings, 1 reply; 8+ messages in thread
From: Yan, Zheng @ 2012-11-24 17:45 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 11/25/2012 12:21 AM, Sage Weil wrote:
> On Sat, 24 Nov 2012, Yan, Zheng wrote:
>> Hi, Sage
>>
>> I found this fix is still not enough because we don't know if a dentry has been
>> successfully deleted from the directory fragment after MDS restart. There are several
>> options to fix this, which one do you like?
>>
>> 1. Make OSD ignore the TMAP_RM commands if it can't the find items.
>
> This is certainly the simplest. Since TMAP is effectively deprecated and
> everyone should be using the omap stuff anyway, it gets my vote.
>
> Speaking of, we should transition the MDS to use the omap operations soon
> as well, since TMAP is horribly inefficient for large directories. Also,
> the rmkey method in that case does not have this problem.
I can do this, it should be easy.
>
> Let's ignore enoent on TMAP_RM for bobtail, and transition to omap in the
> next release... do you want to test the patch that drops the !key_exists
> check?
I will modify the code and test it. but I think it is hiding bugs ;)
Another issue of tmap is that we may get different results when comparing two dentry_key_t
and their string representation. So the dentries in tmap can be out of order, setkey method
may add duplicated dentry and rmkey method may fail to find existing dentry. Since tmap is
so buggy, let's switch the MDS to use the omap ASAP.
Regards
Yan, Zheng
>
> Thanks!
> sage
>
>
>> 2. Add a new item deletion command for tmap, the new command doesn't make whole tmap
>> update operation fail if it can't the item. When committing directory fragments,
>> use the new command for null dentries that were added by MDS log replay.
>>
>> 3. When committing a directory fragment, Re-fetch it if it contains dirty+null dentries
>> that were added by MDS log replay. Find already deleted items and mark corresponding
>> dentries as new.
>>
>> Regards
>> Yan, Zheng
>>
>> On 11/23/2012 12:52 AM, Yan, Zheng wrote:
>>> From: "Yan, Zheng" <zheng.z.yan@intel.com>
>>>
>>> When a null dentry is encountered, CDir::_commit_partial() adds
>>> a OSD_TMAP_RM command to delete the dentry. But if the dentry is
>>> new, the osd will not find the dentry when handling the command
>>> and the tmap update operation will fail totally.
>>>
>>> This patch also makes sure dentries are properly marked as new
>>> when preparing new dentries and exporting dentries.
>>>
>>> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
>>> ---
>>> src/mds/CDentry.h | 2 ++
>>> src/mds/CDir.cc | 11 ++++++++---
>>> src/mds/CDir.h | 2 +-
>>> src/mds/MDCache.cc | 9 ++++++---
>>> src/mds/Server.cc | 3 +++
>>> 5 files changed, 20 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/src/mds/CDentry.h b/src/mds/CDentry.h
>>> index 480e562..5755c55 100644
>>> --- a/src/mds/CDentry.h
>>> +++ b/src/mds/CDentry.h
>>> @@ -347,6 +347,8 @@ public:
>>> // twiddle
>>> state = 0;
>>> state_set(CDentry::STATE_AUTH);
>>> + if (nstate & STATE_NEW)
>>> + mark_new();
>>> if (nstate & STATE_DIRTY)
>>> _mark_dirty(ls);
>>> if (!replica_map.empty())
>>> diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
>>> index c5220ed..411d864 100644
>>> --- a/src/mds/CDir.cc
>>> +++ b/src/mds/CDir.cc
>>> @@ -1696,7 +1696,7 @@ class C_Dir_Committed : public Context {
>>> public:
>>> C_Dir_Committed(CDir *d, version_t v, version_t lrv) : dir(d), version(v), last_renamed_version(lrv) { }
>>> void finish(int r) {
>>> - dir->_committed(version, last_renamed_version);
>>> + dir->_committed(version, last_renamed_version, r);
>>> }
>>> };
>>>
>>> @@ -1802,6 +1802,10 @@ CDir::map_t::iterator CDir::_commit_partial(ObjectOperation& m,
>>> continue; // skip clean dentries
>>>
>>> if (dn->get_linkage()->is_null()) {
>>> + if (dn->is_new()) {
>>> + dn->mark_clean();
>>> + continue;
>>> + }
>>> dout(10) << " rm " << dn->name << " " << *dn << dendl;
>>> finalbl.append(CEPH_OSD_TMAP_RM);
>>> dn->key().encode(finalbl);
>>> @@ -1997,10 +2001,11 @@ void CDir::_commit(version_t want)
>>> *
>>> * @param v version i just committed
>>> */
>>> -void CDir::_committed(version_t v, version_t lrv)
>>> +void CDir::_committed(version_t v, version_t lrv, int ret)
>>> {
>>> - dout(10) << "_committed v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
>>> + dout(10) << "_committed ret " << ret << " v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
>>> assert(is_auth());
>>> + assert(ret == 0);
>>>
>>> bool stray = inode->is_stray();
>>>
>>> diff --git a/src/mds/CDir.h b/src/mds/CDir.h
>>> index 2222418..274e38b 100644
>>> --- a/src/mds/CDir.h
>>> +++ b/src/mds/CDir.h
>>> @@ -487,7 +487,7 @@ private:
>>> unsigned max_write_size=-1,
>>> map_t::iterator last_committed_dn=map_t::iterator());
>>> void _encode_dentry(CDentry *dn, bufferlist& bl, const set<snapid_t> *snaps);
>>> - void _committed(version_t v, version_t last_renamed_version);
>>> + void _committed(version_t v, version_t last_renamed_version, int ret);
>>> void wait_for_commit(Context *c, version_t v=0);
>>>
>>> // -- dirtyness --
>>> diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
>>> index f8b1c8f..e69a49f 100644
>>> --- a/src/mds/MDCache.cc
>>> +++ b/src/mds/MDCache.cc
>>> @@ -657,12 +657,15 @@ CDentry *MDCache::get_or_create_stray_dentry(CInode *in)
>>> CDir *straydir = strayi->get_dirfrag(fg);
>>> assert(straydir);
>>> CDentry *straydn = straydir->lookup(straydname);
>>> - if (!straydn) {
>>> +
>>> + if (!straydn)
>>> straydn = straydir->add_null_dentry(straydname);
>>> - straydn->mark_new();
>>> - } else
>>> + else
>>> assert(straydn->get_projected_linkage()->is_null());
>>>
>>> + if (!straydn->is_dirty())
>>> + straydn->mark_new();
>>> +
>>> return straydn;
>>> }
>>>
>>> diff --git a/src/mds/Server.cc b/src/mds/Server.cc
>>> index ec0d5d5..228fede 100644
>>> --- a/src/mds/Server.cc
>>> +++ b/src/mds/Server.cc
>>> @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
>>> }
>>> }
>>>
>>> + if (!dn->is_dirty())
>>> + dn->mark_new();
>>> +
>>> return dn;
>>> }
>>>
>>>
>>
>>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2] mds: fix CDir::_commit_partial() bug
2012-11-24 17:45 ` Yan, Zheng
@ 2012-11-24 17:51 ` Sage Weil
2012-11-24 18:14 ` Yan, Zheng
0 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2012-11-24 17:51 UTC (permalink / raw)
To: Yan, Zheng; +Cc: ceph-devel
On Sun, 25 Nov 2012, Yan, Zheng wrote:
> On 11/25/2012 12:21 AM, Sage Weil wrote:
> > On Sat, 24 Nov 2012, Yan, Zheng wrote:
> >> Hi, Sage
> >>
> >> I found this fix is still not enough because we don't know if a dentry has been
> >> successfully deleted from the directory fragment after MDS restart. There are several
> >> options to fix this, which one do you like?
> >>
> >> 1. Make OSD ignore the TMAP_RM commands if it can't the find items.
> >
> > This is certainly the simplest. Since TMAP is effectively deprecated and
> > everyone should be using the omap stuff anyway, it gets my vote.
> >
> > Speaking of, we should transition the MDS to use the omap operations soon
> > as well, since TMAP is horribly inefficient for large directories. Also,
> > the rmkey method in that case does not have this problem.
>
> I can do this, it should be easy.
>
> >
> > Let's ignore enoent on TMAP_RM for bobtail, and transition to omap in the
> > next release... do you want to test the patch that drops the !key_exists
> > check?
>
> I will modify the code and test it. but I think it is hiding bugs ;)
>
> Another issue of tmap is that we may get different results when comparing two dentry_key_t
> and their string representation. So the dentries in tmap can be out of order, setkey method
> may add duplicated dentry and rmkey method may fail to find existing dentry. Since tmap is
> so buggy, let's switch the MDS to use the omap ASAP.
Yeah. I think the question is what is the minimal set of fixes we should
do to the TMAP implementation for bobtail before changing to omap. The
dentry_key_t ordering fix you posted the other day and ignoring the rm
failures should catch these, right?
sage
>
> Regards
> Yan, Zheng
>
>
> >
> > Thanks!
> > sage
> >
> >
> >> 2. Add a new item deletion command for tmap, the new command doesn't make whole tmap
> >> update operation fail if it can't the item. When committing directory fragments,
> >> use the new command for null dentries that were added by MDS log replay.
> >>
> >> 3. When committing a directory fragment, Re-fetch it if it contains dirty+null dentries
> >> that were added by MDS log replay. Find already deleted items and mark corresponding
> >> dentries as new.
> >>
> >> Regards
> >> Yan, Zheng
> >>
> >> On 11/23/2012 12:52 AM, Yan, Zheng wrote:
> >>> From: "Yan, Zheng" <zheng.z.yan@intel.com>
> >>>
> >>> When a null dentry is encountered, CDir::_commit_partial() adds
> >>> a OSD_TMAP_RM command to delete the dentry. But if the dentry is
> >>> new, the osd will not find the dentry when handling the command
> >>> and the tmap update operation will fail totally.
> >>>
> >>> This patch also makes sure dentries are properly marked as new
> >>> when preparing new dentries and exporting dentries.
> >>>
> >>> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> >>> ---
> >>> src/mds/CDentry.h | 2 ++
> >>> src/mds/CDir.cc | 11 ++++++++---
> >>> src/mds/CDir.h | 2 +-
> >>> src/mds/MDCache.cc | 9 ++++++---
> >>> src/mds/Server.cc | 3 +++
> >>> 5 files changed, 20 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/src/mds/CDentry.h b/src/mds/CDentry.h
> >>> index 480e562..5755c55 100644
> >>> --- a/src/mds/CDentry.h
> >>> +++ b/src/mds/CDentry.h
> >>> @@ -347,6 +347,8 @@ public:
> >>> // twiddle
> >>> state = 0;
> >>> state_set(CDentry::STATE_AUTH);
> >>> + if (nstate & STATE_NEW)
> >>> + mark_new();
> >>> if (nstate & STATE_DIRTY)
> >>> _mark_dirty(ls);
> >>> if (!replica_map.empty())
> >>> diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
> >>> index c5220ed..411d864 100644
> >>> --- a/src/mds/CDir.cc
> >>> +++ b/src/mds/CDir.cc
> >>> @@ -1696,7 +1696,7 @@ class C_Dir_Committed : public Context {
> >>> public:
> >>> C_Dir_Committed(CDir *d, version_t v, version_t lrv) : dir(d), version(v), last_renamed_version(lrv) { }
> >>> void finish(int r) {
> >>> - dir->_committed(version, last_renamed_version);
> >>> + dir->_committed(version, last_renamed_version, r);
> >>> }
> >>> };
> >>>
> >>> @@ -1802,6 +1802,10 @@ CDir::map_t::iterator CDir::_commit_partial(ObjectOperation& m,
> >>> continue; // skip clean dentries
> >>>
> >>> if (dn->get_linkage()->is_null()) {
> >>> + if (dn->is_new()) {
> >>> + dn->mark_clean();
> >>> + continue;
> >>> + }
> >>> dout(10) << " rm " << dn->name << " " << *dn << dendl;
> >>> finalbl.append(CEPH_OSD_TMAP_RM);
> >>> dn->key().encode(finalbl);
> >>> @@ -1997,10 +2001,11 @@ void CDir::_commit(version_t want)
> >>> *
> >>> * @param v version i just committed
> >>> */
> >>> -void CDir::_committed(version_t v, version_t lrv)
> >>> +void CDir::_committed(version_t v, version_t lrv, int ret)
> >>> {
> >>> - dout(10) << "_committed v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> >>> + dout(10) << "_committed ret " << ret << " v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> >>> assert(is_auth());
> >>> + assert(ret == 0);
> >>>
> >>> bool stray = inode->is_stray();
> >>>
> >>> diff --git a/src/mds/CDir.h b/src/mds/CDir.h
> >>> index 2222418..274e38b 100644
> >>> --- a/src/mds/CDir.h
> >>> +++ b/src/mds/CDir.h
> >>> @@ -487,7 +487,7 @@ private:
> >>> unsigned max_write_size=-1,
> >>> map_t::iterator last_committed_dn=map_t::iterator());
> >>> void _encode_dentry(CDentry *dn, bufferlist& bl, const set<snapid_t> *snaps);
> >>> - void _committed(version_t v, version_t last_renamed_version);
> >>> + void _committed(version_t v, version_t last_renamed_version, int ret);
> >>> void wait_for_commit(Context *c, version_t v=0);
> >>>
> >>> // -- dirtyness --
> >>> diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
> >>> index f8b1c8f..e69a49f 100644
> >>> --- a/src/mds/MDCache.cc
> >>> +++ b/src/mds/MDCache.cc
> >>> @@ -657,12 +657,15 @@ CDentry *MDCache::get_or_create_stray_dentry(CInode *in)
> >>> CDir *straydir = strayi->get_dirfrag(fg);
> >>> assert(straydir);
> >>> CDentry *straydn = straydir->lookup(straydname);
> >>> - if (!straydn) {
> >>> +
> >>> + if (!straydn)
> >>> straydn = straydir->add_null_dentry(straydname);
> >>> - straydn->mark_new();
> >>> - } else
> >>> + else
> >>> assert(straydn->get_projected_linkage()->is_null());
> >>>
> >>> + if (!straydn->is_dirty())
> >>> + straydn->mark_new();
> >>> +
> >>> return straydn;
> >>> }
> >>>
> >>> diff --git a/src/mds/Server.cc b/src/mds/Server.cc
> >>> index ec0d5d5..228fede 100644
> >>> --- a/src/mds/Server.cc
> >>> +++ b/src/mds/Server.cc
> >>> @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
> >>> }
> >>> }
> >>>
> >>> + if (!dn->is_dirty())
> >>> + dn->mark_new();
> >>> +
> >>> return dn;
> >>> }
> >>>
> >>>
> >>
> >>
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2] mds: fix CDir::_commit_partial() bug
2012-11-24 17:51 ` Sage Weil
@ 2012-11-24 18:14 ` Yan, Zheng
0 siblings, 0 replies; 8+ messages in thread
From: Yan, Zheng @ 2012-11-24 18:14 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 11/25/2012 01:51 AM, Sage Weil wrote:
> On Sun, 25 Nov 2012, Yan, Zheng wrote:
>> On 11/25/2012 12:21 AM, Sage Weil wrote:
>>> On Sat, 24 Nov 2012, Yan, Zheng wrote:
>>>> Hi, Sage
>>>>
>>>> I found this fix is still not enough because we don't know if a dentry has been
>>>> successfully deleted from the directory fragment after MDS restart. There are several
>>>> options to fix this, which one do you like?
>>>>
>>>> 1. Make OSD ignore the TMAP_RM commands if it can't the find items.
>>>
>>> This is certainly the simplest. Since TMAP is effectively deprecated and
>>> everyone should be using the omap stuff anyway, it gets my vote.
>>>
>>> Speaking of, we should transition the MDS to use the omap operations soon
>>> as well, since TMAP is horribly inefficient for large directories. Also,
>>> the rmkey method in that case does not have this problem.
>>
>> I can do this, it should be easy.
>>
>>>
>>> Let's ignore enoent on TMAP_RM for bobtail, and transition to omap in the
>>> next release... do you want to test the patch that drops the !key_exists
>>> check?
>>
>> I will modify the code and test it. but I think it is hiding bugs ;)
>>
>> Another issue of tmap is that we may get different results when comparing two dentry_key_t
>> and their string representation. So the dentries in tmap can be out of order, setkey method
>> may add duplicated dentry and rmkey method may fail to find existing dentry. Since tmap is
>> so buggy, let's switch the MDS to use the omap ASAP.
>
> Yeah. I think the question is what is the minimal set of fixes we should
> do to the TMAP implementation for bobtail before changing to omap. The
> dentry_key_t ordering fix you posted the other day and ignoring the rm
> failures should catch these, right?
Yes, that bug was caught by the !key_exists check. Actually the patch I previously sent
is buggy too, it breaks CDir::lookup. I have a patch that sorts dentries in CDir::_commit_full
and CDir::_commit_partial. It survives over night fsstress test, but change is ugly and
relatively large. I will send it out tomorrow.
Regards
Yan, Zheng
>
> sage
>
>
>>
>> Regards
>> Yan, Zheng
>>
>>
>>>
>>> Thanks!
>>> sage
>>>
>>>
>>>> 2. Add a new item deletion command for tmap, the new command doesn't make whole tmap
>>>> update operation fail if it can't the item. When committing directory fragments,
>>>> use the new command for null dentries that were added by MDS log replay.
>>>>
>>>> 3. When committing a directory fragment, Re-fetch it if it contains dirty+null dentries
>>>> that were added by MDS log replay. Find already deleted items and mark corresponding
>>>> dentries as new.
>>>>
>>>> Regards
>>>> Yan, Zheng
>>>>
>>>> On 11/23/2012 12:52 AM, Yan, Zheng wrote:
>>>>> From: "Yan, Zheng" <zheng.z.yan@intel.com>
>>>>>
>>>>> When a null dentry is encountered, CDir::_commit_partial() adds
>>>>> a OSD_TMAP_RM command to delete the dentry. But if the dentry is
>>>>> new, the osd will not find the dentry when handling the command
>>>>> and the tmap update operation will fail totally.
>>>>>
>>>>> This patch also makes sure dentries are properly marked as new
>>>>> when preparing new dentries and exporting dentries.
>>>>>
>>>>> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
>>>>> ---
>>>>> src/mds/CDentry.h | 2 ++
>>>>> src/mds/CDir.cc | 11 ++++++++---
>>>>> src/mds/CDir.h | 2 +-
>>>>> src/mds/MDCache.cc | 9 ++++++---
>>>>> src/mds/Server.cc | 3 +++
>>>>> 5 files changed, 20 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/src/mds/CDentry.h b/src/mds/CDentry.h
>>>>> index 480e562..5755c55 100644
>>>>> --- a/src/mds/CDentry.h
>>>>> +++ b/src/mds/CDentry.h
>>>>> @@ -347,6 +347,8 @@ public:
>>>>> // twiddle
>>>>> state = 0;
>>>>> state_set(CDentry::STATE_AUTH);
>>>>> + if (nstate & STATE_NEW)
>>>>> + mark_new();
>>>>> if (nstate & STATE_DIRTY)
>>>>> _mark_dirty(ls);
>>>>> if (!replica_map.empty())
>>>>> diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
>>>>> index c5220ed..411d864 100644
>>>>> --- a/src/mds/CDir.cc
>>>>> +++ b/src/mds/CDir.cc
>>>>> @@ -1696,7 +1696,7 @@ class C_Dir_Committed : public Context {
>>>>> public:
>>>>> C_Dir_Committed(CDir *d, version_t v, version_t lrv) : dir(d), version(v), last_renamed_version(lrv) { }
>>>>> void finish(int r) {
>>>>> - dir->_committed(version, last_renamed_version);
>>>>> + dir->_committed(version, last_renamed_version, r);
>>>>> }
>>>>> };
>>>>>
>>>>> @@ -1802,6 +1802,10 @@ CDir::map_t::iterator CDir::_commit_partial(ObjectOperation& m,
>>>>> continue; // skip clean dentries
>>>>>
>>>>> if (dn->get_linkage()->is_null()) {
>>>>> + if (dn->is_new()) {
>>>>> + dn->mark_clean();
>>>>> + continue;
>>>>> + }
>>>>> dout(10) << " rm " << dn->name << " " << *dn << dendl;
>>>>> finalbl.append(CEPH_OSD_TMAP_RM);
>>>>> dn->key().encode(finalbl);
>>>>> @@ -1997,10 +2001,11 @@ void CDir::_commit(version_t want)
>>>>> *
>>>>> * @param v version i just committed
>>>>> */
>>>>> -void CDir::_committed(version_t v, version_t lrv)
>>>>> +void CDir::_committed(version_t v, version_t lrv, int ret)
>>>>> {
>>>>> - dout(10) << "_committed v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
>>>>> + dout(10) << "_committed ret " << ret << " v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
>>>>> assert(is_auth());
>>>>> + assert(ret == 0);
>>>>>
>>>>> bool stray = inode->is_stray();
>>>>>
>>>>> diff --git a/src/mds/CDir.h b/src/mds/CDir.h
>>>>> index 2222418..274e38b 100644
>>>>> --- a/src/mds/CDir.h
>>>>> +++ b/src/mds/CDir.h
>>>>> @@ -487,7 +487,7 @@ private:
>>>>> unsigned max_write_size=-1,
>>>>> map_t::iterator last_committed_dn=map_t::iterator());
>>>>> void _encode_dentry(CDentry *dn, bufferlist& bl, const set<snapid_t> *snaps);
>>>>> - void _committed(version_t v, version_t last_renamed_version);
>>>>> + void _committed(version_t v, version_t last_renamed_version, int ret);
>>>>> void wait_for_commit(Context *c, version_t v=0);
>>>>>
>>>>> // -- dirtyness --
>>>>> diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
>>>>> index f8b1c8f..e69a49f 100644
>>>>> --- a/src/mds/MDCache.cc
>>>>> +++ b/src/mds/MDCache.cc
>>>>> @@ -657,12 +657,15 @@ CDentry *MDCache::get_or_create_stray_dentry(CInode *in)
>>>>> CDir *straydir = strayi->get_dirfrag(fg);
>>>>> assert(straydir);
>>>>> CDentry *straydn = straydir->lookup(straydname);
>>>>> - if (!straydn) {
>>>>> +
>>>>> + if (!straydn)
>>>>> straydn = straydir->add_null_dentry(straydname);
>>>>> - straydn->mark_new();
>>>>> - } else
>>>>> + else
>>>>> assert(straydn->get_projected_linkage()->is_null());
>>>>>
>>>>> + if (!straydn->is_dirty())
>>>>> + straydn->mark_new();
>>>>> +
>>>>> return straydn;
>>>>> }
>>>>>
>>>>> diff --git a/src/mds/Server.cc b/src/mds/Server.cc
>>>>> index ec0d5d5..228fede 100644
>>>>> --- a/src/mds/Server.cc
>>>>> +++ b/src/mds/Server.cc
>>>>> @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
>>>>> }
>>>>> }
>>>>>
>>>>> + if (!dn->is_dirty())
>>>>> + dn->mark_new();
>>>>> +
>>>>> return dn;
>>>>> }
>>>>>
>>>>>
>>>>
>>>>
>>
>>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2] mds: fix CDir::_commit_partial() bug
2012-11-22 16:52 [PATCH V2] mds: fix CDir::_commit_partial() bug Yan, Zheng
2012-11-24 11:05 ` Yan, Zheng
@ 2012-12-01 13:58 ` Yan, Zheng
2012-12-01 20:56 ` Sage Weil
1 sibling, 1 reply; 8+ messages in thread
From: Yan, Zheng @ 2012-12-01 13:58 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
Hi Sage,
This patch is buggy, please drop it from wip-mds.
On 11/23/2012 12:52 AM, Yan, Zheng wrote:
> From: "Yan, Zheng" <zheng.z.yan@intel.com>
>
> When a null dentry is encountered, CDir::_commit_partial() adds
> a OSD_TMAP_RM command to delete the dentry. But if the dentry is
> new, the osd will not find the dentry when handling the command
> and the tmap update operation will fail totally.
>
> This patch also makes sure dentries are properly marked as new
> when preparing new dentries and exporting dentries.
>
> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> ---
SNIP
> diff --git a/src/mds/Server.cc b/src/mds/Server.cc
> index ec0d5d5..228fede 100644
> --- a/src/mds/Server.cc
> +++ b/src/mds/Server.cc
> @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
> }
> }
>
> + if (!dn->is_dirty())
> + dn->mark_new();
> +
This should be something like 'else if (!dn->is_dirty())'. This issue
has been fixed on the osd side, this patch is no longer needed.
Regards
Yan, Zheng
> return dn;
> }
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2] mds: fix CDir::_commit_partial() bug
2012-12-01 13:58 ` Yan, Zheng
@ 2012-12-01 20:56 ` Sage Weil
0 siblings, 0 replies; 8+ messages in thread
From: Sage Weil @ 2012-12-01 20:56 UTC (permalink / raw)
To: Yan, Zheng; +Cc: ceph-devel
On Sat, 1 Dec 2012, Yan, Zheng wrote:
> Hi Sage,
>
> This patch is buggy, please drop it from wip-mds.
Done.
> On 11/23/2012 12:52 AM, Yan, Zheng wrote:
> > From: "Yan, Zheng" <zheng.z.yan@intel.com>
> >
> > When a null dentry is encountered, CDir::_commit_partial() adds
> > a OSD_TMAP_RM command to delete the dentry. But if the dentry is
> > new, the osd will not find the dentry when handling the command
> > and the tmap update operation will fail totally.
> >
> > This patch also makes sure dentries are properly marked as new
> > when preparing new dentries and exporting dentries.
> >
> > Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> > ---
> SNIP
> > diff --git a/src/mds/Server.cc b/src/mds/Server.cc
> > index ec0d5d5..228fede 100644
> > --- a/src/mds/Server.cc
> > +++ b/src/mds/Server.cc
> > @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
> > }
> > }
> >
> > + if (!dn->is_dirty())
> > + dn->mark_new();
> > +
> This should be something like 'else if (!dn->is_dirty())'. This issue
> has been fixed on the osd side, this patch is no longer needed.
Yeah. It would still be nice to avoid including those ops in the osd
request, as an optimization, but that can happen later.
Repushed wip-mds, and scheduled the fs suite to run against it. That
only covers single mds configurations still, but as long as there aren't
regressions there I'm happy relying on your testing for now.
Real Soon Now we need to flesh out that suite to include clustered mds
tests, export thrashing, and build a mds_thrasher teuthology task that
restarts, adds, and stops mds's to exercise the failure paths. Eventually
we'll want to hook that into the failure injection sites in Migration.cc
as well so we deliberatel exercise each failure point...
sage
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2012-12-01 20:56 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-22 16:52 [PATCH V2] mds: fix CDir::_commit_partial() bug Yan, Zheng
2012-11-24 11:05 ` Yan, Zheng
2012-11-24 16:21 ` Sage Weil
2012-11-24 17:45 ` Yan, Zheng
2012-11-24 17:51 ` Sage Weil
2012-11-24 18:14 ` Yan, Zheng
2012-12-01 13:58 ` Yan, Zheng
2012-12-01 20:56 ` Sage Weil
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.