* [RFC] Should lockd get into grace_period when statd start but not stop?
@ 2010-03-10 10:22 Mi Jinlong
2010-03-10 18:24 ` J. Bruce Fields
0 siblings, 1 reply; 8+ messages in thread
From: Mi Jinlong @ 2010-03-10 10:22 UTC (permalink / raw)
To: Trond.Myklebust, J. Bruce Fields, Chuck Lever, NFSv3 list
Hi,
When I restart the nfslock service on RHEL (kernel 2.6.31) with
"service nfslock stop" followed by "service nfslock start", and the start
comes more than one grace period after the stop, the locks that lockd held
before the restart can't be reclaimed, because the grace period has expired.

So, IMO, lockd should enter its grace period when statd starts, not when it stops?

Some code in the kernel, fs/lockd/svclock.c:
....
	if (locks_in_grace() && !reclaim) {
		ret = nlm_lck_denied_grace_period;
		goto out;
	}
	if (reclaim && !locks_in_grace()) {
		ret = nlm_lck_denied_grace_period;
		goto out;
	}
....
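As a minimal sketch of what the two checks above do, the following standalone C function reproduces the same decision outside the kernel. The enum, the function name grace_check and the self-test are hypothetical names introduced here for illustration; only the two conditions come from the quoted svclock.c fragment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NLM status codes; illustrative only. */
enum nlm_result {
	NLM_PROCEED,             /* request passes both grace checks */
	NLM_DENIED_GRACE_PERIOD  /* mirrors nlm_lck_denied_grace_period */
};

/*
 * in_grace: what locks_in_grace() would return at this moment.
 * reclaim:  whether the client marked the request as a reclaim.
 */
enum nlm_result grace_check(bool in_grace, bool reclaim)
{
	/* New (non-reclaim) locks are refused during the grace period. */
	if (in_grace && !reclaim)
		return NLM_DENIED_GRACE_PERIOD;
	/* Reclaims are refused once the grace period has ended. */
	if (reclaim && !in_grace)
		return NLM_DENIED_GRACE_PERIOD;
	return NLM_PROCEED;
}

/* Exercises all four combinations of the two inputs. */
void grace_check_selftest(void)
{
	assert(grace_check(true, false) == NLM_DENIED_GRACE_PERIOD);
	assert(grace_check(false, true) == NLM_DENIED_GRACE_PERIOD);
	assert(grace_check(true, true) == NLM_PROCEED);
	assert(grace_check(false, false) == NLM_PROCEED);
}
```

Calling grace_check(false, true) models a reclaim that arrives after the grace period has ended, which is the case this report is about.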
I think it could be implemented like this:
1) When statd stops, it sends a KILL signal to lockd, and lockd only
   releases the locks but doesn't enter the grace period.
2) When statd starts, it sends some other signal to lockd, and lockd
   only enters the grace period. The locks can then be reclaimed.
thanks,
Mi Jinlong
* Re: [RFC] Should lockd get into grace_period when statd start but not stop?
2010-03-10 10:22 [RFC] Should lockd get into grace_period when statd start but not stop? Mi Jinlong
@ 2010-03-10 18:24 ` J. Bruce Fields
2010-03-11 1:02 ` Mi Jinlong
0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2010-03-10 18:24 UTC (permalink / raw)
To: Mi Jinlong; +Cc: Trond.Myklebust, Chuck Lever, NFSv3 list
On Wed, Mar 10, 2010 at 06:22:35PM +0800, Mi Jinlong wrote:
> Hi,
>
> When I restart the nfslock service on RHEL (kernel 2.6.31) with
> "service nfslock stop" followed by "service nfslock start", and the start
> comes more than one grace period after the stop, the locks that lockd held
> before the restart can't be reclaimed, because the grace period has expired.
>
> So, IMO, lockd should enter its grace period when statd starts, not when it stops?
Sorry, I'm not sure I understand.
Are you saying that lockd's grace period starts when the last lock is
shut down, instead of when the new lockd is started? That would be a
bug, I agree.
But are you really shutting down lockd completely?
--b.
* Re: [RFC] Should lockd get into grace_period when statd start but not stop?
2010-03-10 18:24 ` J. Bruce Fields
@ 2010-03-11 1:02 ` Mi Jinlong
2010-03-11 15:58 ` J. Bruce Fields
0 siblings, 1 reply; 8+ messages in thread
From: Mi Jinlong @ 2010-03-11 1:02 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Trond.Myklebust, Chuck Lever, NFSv3 list
J. Bruce Fields :
> On Wed, Mar 10, 2010 at 06:22:35PM +0800, Mi Jinlong wrote:
>> Hi,
>>
>> When I restart the nfslock service on RHEL (kernel 2.6.31) with
>> "service nfslock stop" followed by "service nfslock start", and the start
>> comes more than one grace period after the stop, the locks that lockd held
>> before the restart can't be reclaimed, because the grace period has expired.
>>
>> So, IMO, lockd should enter its grace period when statd starts, not when it stops?
>
> Sorry, I'm not sure I understand.
>
> Are you saying that lockd's grace period starts when the last lock is
> shut down, instead of when the new lockd is started? That would be a
> bug, I agree.
I mean that on RHEL, "service nfslock stop" only causes statd to stop;
lockd is still running.

When statd stops, it sends a KILL signal to lockd. lockd then releases all
locks and enters the grace period to wait for reclaim requests. In other
words, lockd enters the grace period when statd stops.

But clients send reclaim requests only after they receive the SM_NOTIFY
that the server's statd sends when it starts. If statd starts when lockd is
no longer in its grace period, the locks can't be reclaimed.

So I think lockd should enter the grace period when statd starts, not when it stops.
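The timing problem described above can be sketched as a small model. This is an illustration under stated assumptions, not lockd code: the struct and function names are hypothetical, the grace period length is a parameter because the thread does not state it, and network and notification delays are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* All times are seconds on one arbitrary clock. */
struct restart_timeline {
	long statd_stop;   /* "service nfslock stop": lockd gets KILL */
	long statd_start;  /* "service nfslock start": statd sends SM_NOTIFY */
	long grace_secs;   /* length of lockd's grace period (assumed value) */
};

/* True if time t falls inside a grace period that began at grace_begin. */
static bool in_grace_at(long t, long grace_begin, long grace_secs)
{
	return t >= grace_begin && t < grace_begin + grace_secs;
}

/* Behaviour described in the report: grace begins when statd stops. */
bool reclaim_ok_grace_at_stop(const struct restart_timeline *tl)
{
	assert(tl->statd_start >= tl->statd_stop && tl->grace_secs > 0);
	/* Clients reclaim only after SM_NOTIFY, i.e. at statd_start. */
	return in_grace_at(tl->statd_start, tl->statd_stop, tl->grace_secs);
}

/* Behaviour proposed in the report: grace begins when statd starts. */
bool reclaim_ok_grace_at_start(const struct restart_timeline *tl)
{
	assert(tl->statd_start >= tl->statd_stop && tl->grace_secs > 0);
	return in_grace_at(tl->statd_start, tl->statd_start, tl->grace_secs);
}
```

For example, with a 90 second gap between stop and start and a 60 second grace period (both purely illustrative values), reclaim_ok_grace_at_stop() returns false while reclaim_ok_grace_at_start() returns true.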
>
> But are you really shutting down lockd completely?
lockd doesn't shut down when the nfslock service stops; on RHEL, stopping
the service only causes statd to stop.
--
Regards
Mi Jinlong
* Re: [RFC] Should lockd get into grace_period when statd start but not stop?
2010-03-11 1:02 ` Mi Jinlong
@ 2010-03-11 15:58 ` J. Bruce Fields
2010-03-12 9:42 ` Mi Jinlong
0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2010-03-11 15:58 UTC (permalink / raw)
To: Mi Jinlong; +Cc: Trond.Myklebust, Chuck Lever, NFSv3 list
On Thu, Mar 11, 2010 at 09:02:55AM +0800, Mi Jinlong wrote:
>
> J. Bruce Fields :
> > On Wed, Mar 10, 2010 at 06:22:35PM +0800, Mi Jinlong wrote:
> >> Hi,
> >>
> >> When I restart the nfslock service on RHEL (kernel 2.6.31) with
> >> "service nfslock stop" followed by "service nfslock start", and the start
> >> comes more than one grace period after the stop, the locks that lockd held
> >> before the restart can't be reclaimed, because the grace period has expired.
> >>
> >> So, IMO, lockd should enter its grace period when statd starts, not when it stops?
> >
> > Sorry, I'm not sure I understand.
> >
> > Are you saying that lockd's grace period starts when the last lock is
> > shut down, instead of when the new lockd is started? That would be a
> > bug, I agree.
>
> I mean that on RHEL, "service nfslock stop" only causes statd to stop;
> lockd is still running.
>
> When statd stops, it sends a KILL signal to lockd. lockd then releases all
> locks and enters the grace period to wait for reclaim requests. In other
> words, lockd enters the grace period when statd stops.
>
> But clients send reclaim requests only after they receive the SM_NOTIFY
> that the server's statd sends when it starts. If statd starts when lockd is
> no longer in its grace period, the locks can't be reclaimed.
>
> So I think lockd should enter the grace period when statd starts, not when it stops.
>
> >
> > But are you really shutting down lockd completely?
>
> lockd doesn't shut down when the nfslock service stops; on RHEL, stopping
> the service only causes statd to stop.
OK, so that's the problem. Lockd doesn't shut down completely, if I
remember correctly, until nfsd server and all clients do.
Our current NFS implementation just isn't designed to be able to shut
down some components while leaving others running.
Is there some reason you *need* to do what you're doing?
--b.
* Re: [RFC] Should lockd get into grace_period when statd start but not stop?
2010-03-11 15:58 ` J. Bruce Fields
@ 2010-03-12 9:42 ` Mi Jinlong
2010-03-12 23:08 ` J. Bruce Fields
0 siblings, 1 reply; 8+ messages in thread
From: Mi Jinlong @ 2010-03-12 9:42 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Trond.Myklebust, Chuck Lever, NFSv3 list
J. Bruce Fields:
> On Thu, Mar 11, 2010 at 09:02:55AM +0800, Mi Jinlong wrote:
>> J. Bruce Fields :
>>> On Wed, Mar 10, 2010 at 06:22:35PM +0800, Mi Jinlong wrote:
>>>> Hi,
>>>>
>>>> When I restart the nfslock service on RHEL (kernel 2.6.31) with
>>>> "service nfslock stop" followed by "service nfslock start", and the start
>>>> comes more than one grace period after the stop, the locks that lockd held
>>>> before the restart can't be reclaimed, because the grace period has expired.
>>>>
>>>> So, IMO, lockd should enter its grace period when statd starts, not when it stops?
>>> Sorry, I'm not sure I understand.
>>>
>>> Are you saying that lockd's grace period starts when the last lock is
>>> shut down, instead of when the new lockd is started? That would be a
>>> bug, I agree.
>> I mean that on RHEL, "service nfslock stop" only causes statd to stop;
>> lockd is still running.
>>
>> When statd stops, it sends a KILL signal to lockd. lockd then releases all
>> locks and enters the grace period to wait for reclaim requests. In other
>> words, lockd enters the grace period when statd stops.
>>
>> But clients send reclaim requests only after they receive the SM_NOTIFY
>> that the server's statd sends when it starts. If statd starts when lockd is
>> no longer in its grace period, the locks can't be reclaimed.
>>
>> So I think lockd should enter the grace period when statd starts, not when it stops.
>>
>>> But are you really shutting down lockd completely?
>> lockd doesn't shut down when the nfslock service stops; on RHEL, stopping
>> the service only causes statd to stop.
>
> OK, so that's the problem. Lockd doesn't shut down completely, if I
> remember correctly, until nfsd server and all clients do.
>
> Our current NFS implementation just isn't designed to be able to shut
> down some components while leaving others running.
Really? But lockd is started when the nfs service starts, not the nfslock service.
And lockd can't be stopped at the same time as statd.
Sometimes lockd gets out of sync with statd. Maybe this problem is a good example.
>
> Is there some reason you *need* to do what you're doing?
When using NFSv3 on RHEL, I restart the nfslock service with a 90s wait between
stop and start, and the locks that lockd held before can't be reclaimed.

I think it's a kernel bug, so I'd like to get some opinions.
If you think it's a bug too, I will try to make a patch that fixes it as follows:
1) When statd stops, it sends a KILL signal to lockd, and lockd only
   releases the locks but doesn't enter the grace period.
2) When statd starts, it sends some other signal to lockd, and lockd
   only enters the grace period. The locks can then be reclaimed.
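The two steps above could be modelled roughly as follows. This is only a sketch of the proposed behaviour, not a patch: struct lockd_model and all four functions are hypothetical names, and the real signal delivery between statd and lockd is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of lockd state; not the kernel's data structures. */
struct lockd_model {
	int  held_locks;  /* locks currently held on behalf of clients */
	bool in_grace;    /* whether reclaim requests are being accepted */
};

/* Step 1: statd stops. Release the locks, but do not open a grace period. */
void on_statd_stop(struct lockd_model *ld)
{
	assert(ld);
	ld->held_locks = 0;
}

/* Step 2: statd starts. Open the grace period so that reclaims
 * triggered by SM_NOTIFY arrive inside it. */
void on_statd_start(struct lockd_model *ld)
{
	assert(ld);
	ld->in_grace = true;
}

/* The grace timer has run out. */
void on_grace_expired(struct lockd_model *ld)
{
	assert(ld);
	ld->in_grace = false;
}

/* A reclaim is accepted only while in grace, as in the svclock.c check. */
bool try_reclaim(struct lockd_model *ld)
{
	assert(ld);
	if (!ld->in_grace)
		return false;
	ld->held_locks++;
	return true;
}
```

In this model the length of the gap between on_statd_stop() and on_statd_start() no longer matters, because the grace period is opened by the same event that makes clients start reclaiming.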
thanks,
Mi Jinlong
* Re: [RFC] Should lockd get into grace_period when statd start but not stop?
2010-03-12 9:42 ` Mi Jinlong
@ 2010-03-12 23:08 ` J. Bruce Fields
2010-03-16 10:05 ` Mi Jinlong
0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2010-03-12 23:08 UTC (permalink / raw)
To: Mi Jinlong; +Cc: Trond.Myklebust, Chuck Lever, NFSv3 list
On Fri, Mar 12, 2010 at 05:42:18PM +0800, Mi Jinlong wrote:
>
>
> J. Bruce Fields:
> > Our current NFS implementation just isn't designed to be able to shut
> > down some components while leaving others running.
>
> Really? But lockd is started when the nfs service starts, not the nfslock service.
> And lockd can't be stopped at the same time as statd.
> Sometimes lockd gets out of sync with statd. Maybe this problem is a good example.
I'm sorry, I still don't understand.
Please take a look at section 3.1, 3.2, and 3.3 of the nfs-utils README
file. That describes the order in which servers should be started and
stopped.
--b.
* Re: [RFC] Should lockd get into grace_period when statd start but not stop?
2010-03-12 23:08 ` J. Bruce Fields
@ 2010-03-16 10:05 ` Mi Jinlong
2010-03-16 16:56 ` J. Bruce Fields
0 siblings, 1 reply; 8+ messages in thread
From: Mi Jinlong @ 2010-03-16 10:05 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Trond.Myklebust, Chuck Lever, NFSv3 list
J. Bruce Fields :
> On Fri, Mar 12, 2010 at 05:42:18PM +0800, Mi Jinlong wrote:
>>
>> J. Bruce Fields:
>>> Our current NFS implementation just isn't designed to be able to shut
>>> down some components while leaving others running.
>> Really? But lockd is started when the nfs service starts, not the nfslock service.
>> And lockd can't be stopped at the same time as statd.
>> Sometimes lockd gets out of sync with statd. Maybe this problem is a good example.
>
> I'm sorry, I still don't understand.
>
> Please take a look at section 3.1, 3.2, and 3.3 of the nfs-utils README
> file. That describes the order in which servers should be started and
> stopped.
Maybe that's my problem.
The state of lockd and statd during my test:

     lockd                    statd
       |                        |     <== service nfslock stop
 get KILL signal             stopped        ^
 and enter grace_period         |           |
       |                        |           | more than grace_period time
       |                        |           v
       |                        |     <== service nfslock start
  normal state                  |
       |                      start      Client receives SM_NOTIFY and reclaims locks,
       |                        |        but the grace period has already ended.
       v                        v

As shown above, after the nfslock service starts, the client cannot reclaim its locks.
thanks,
Mi Jinlong
* Re: [RFC] Should lockd get into grace_period when statd start but not stop?
2010-03-16 10:05 ` Mi Jinlong
@ 2010-03-16 16:56 ` J. Bruce Fields
0 siblings, 0 replies; 8+ messages in thread
From: J. Bruce Fields @ 2010-03-16 16:56 UTC (permalink / raw)
To: Mi Jinlong; +Cc: Trond.Myklebust, Chuck Lever, NFSv3 list
On Tue, Mar 16, 2010 at 06:05:33PM +0800, Mi Jinlong wrote:
>
>
> J. Bruce Fields :
> > On Fri, Mar 12, 2010 at 05:42:18PM +0800, Mi Jinlong wrote:
> >>
> >> J. Bruce Fields:
> >>> Our current NFS implementation just isn't designed to be able to shut
> >>> down some components while leaving others running.
> >> Really? But lockd is started when the nfs service starts, not the nfslock service.
> >> And lockd can't be stopped at the same time as statd.
> >> Sometimes lockd gets out of sync with statd. Maybe this problem is a good example.
> >
> > I'm sorry, I still don't understand.
> >
> > Please take a look at section 3.1, 3.2, and 3.3 of the nfs-utils README
> > file. That describes the order in which servers should be started and
> > stopped.
>
> Maybe that's my problem.
> The state of lockd and statd during my test:
>
>      lockd                    statd
>        |                        |     <== service nfslock stop
>  get KILL signal             stopped        ^
>  and enter grace_period         |           |
I believe you also need to shut down nfsd here.
--b.
>        |                        |           | more than grace_period time
>        |                        |           v
>        |                        |     <== service nfslock start
>   normal state                  |
>        |                      start      Client receives SM_NOTIFY and reclaims locks,
>        |                        |        but the grace period has already ended.
>        v                        v
>
> As shown above, after the nfslock service starts, the client cannot reclaim its locks.