* [PATCH] rbd: prevent busy loop when requesting exclusive lock
@ 2023-08-01 22:22 Ilya Dryomov
From: Ilya Dryomov @ 2023-08-01 22:22 UTC (permalink / raw)
To: ceph-devel; +Cc: Dongsheng Yang
Due to rbd_try_acquire_lock() effectively swallowing all but
EBLOCKLISTED error from rbd_try_lock() ("request lock anyway") and
rbd_request_lock() returning ETIMEDOUT error not only for an actual
notify timeout but also when the lock owner doesn't respond, a busy
loop inside of rbd_acquire_lock() between rbd_try_acquire_lock() and
rbd_request_lock() is possible.

Requesting the lock on EBUSY error (returned by get_lock_owner_info()
if an incompatible lock or invalid lock owner is detected) makes very
little sense.  The same goes for ETIMEDOUT error (might pop up pretty
much anywhere if osd_request_timeout option is set) and many others.

Just fail I/O requests on rbd_dev->acquiring_list immediately on any
error from rbd_try_lock().

Cc: stable@vger.kernel.org # 588159009d5b: rbd: retrieve and check lock owner twice before blocklisting
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
---
drivers/block/rbd.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)
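
The retry pattern described in the commit message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: every sketch_* name and SKETCH_MAX_ROUNDS is hypothetical and merely stands in for rbd_try_lock(), rbd_request_lock(), rbd_try_acquire_lock() and rbd_acquire_lock(), which in the driver take a struct rbd_device and do far more. The stubs hard-code one failure mode (the try-lock step failing with -EBUSY while the lock owner never responds), and SKETCH_EBLOCKLISTED is a stand-in value chosen for the sketch.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_EBLOCKLISTED ESHUTDOWN	/* stand-in value, assumption of this sketch */
#define SKETCH_MAX_ROUNDS 5		/* cap so the model terminates; the real loop has none */

/* Hypothetical stub for rbd_try_lock(): an incompatible lock was detected. */
static int sketch_try_lock(void)
{
	return -EBUSY;
}

/* Hypothetical stub for rbd_request_lock(): the lock owner never responds. */
static int sketch_request_lock(void)
{
	return -ETIMEDOUT;
}

/* Old behaviour: every error except -EBLOCKLISTED turns into "request lock anyway". */
static int sketch_try_acquire_old(void)
{
	int ret = sketch_try_lock();

	if (ret < 0) {
		if (ret == -SKETCH_EBLOCKLISTED)
			return ret;
		ret = 1;	/* request lock anyway */
	}
	return ret;
}

/* New behaviour: any error from the try-lock step is propagated. */
static int sketch_try_acquire_new(void)
{
	return sketch_try_lock();
}

/*
 * Model of the retry loop: returns how many rounds were spent and stores
 * the final status in *result.
 */
static int sketch_acquire(int (*try_acquire)(void), int *result)
{
	int rounds = 0;

	assert(try_acquire && result);
	while (rounds < SKETCH_MAX_ROUNDS) {
		int ret;

		rounds++;
		ret = try_acquire();
		if (ret <= 0) {		/* acquired (0) or hard error (< 0) */
			*result = ret;
			return rounds;
		}
		ret = sketch_request_lock();
		if (ret != -ETIMEDOUT) {
			*result = ret;
			return rounds;
		}
		/* -ETIMEDOUT is treated as a dead owner: retry immediately */
	}
	*result = -ETIMEDOUT;
	return rounds;
}
```

In this model, sketch_acquire(sketch_try_acquire_old, &result) spins until the artificial cap is hit, while the sketch_try_acquire_new variant stops after a single round with -EBUSY, which is the point at which the waiting I/O requests would be failed.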
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 24afcc93ac01..2328cc05be36 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3675,7 +3675,7 @@ static int rbd_lock(struct rbd_device *rbd_dev)
 	ret = ceph_cls_lock(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc,
 			    RBD_LOCK_NAME, CEPH_CLS_LOCK_EXCLUSIVE, cookie,
 			    RBD_LOCK_TAG, "", 0);
-	if (ret)
+	if (ret && ret != -EEXIST)
 		return ret;
 
 	__rbd_lock(rbd_dev, cookie);
@@ -3878,7 +3878,7 @@ static struct ceph_locker *get_lock_owner_info(struct rbd_device *rbd_dev)
 				 &rbd_dev->header_oloc, RBD_LOCK_NAME,
 				 &lock_type, &lock_tag, &lockers, &num_lockers);
 	if (ret) {
-		rbd_warn(rbd_dev, "failed to retrieve lockers: %d", ret);
+		rbd_warn(rbd_dev, "failed to get header lockers: %d", ret);
 		return ERR_PTR(ret);
 	}
 
@@ -3940,8 +3940,10 @@ static int find_watcher(struct rbd_device *rbd_dev,
 	ret = ceph_osdc_list_watchers(osdc, &rbd_dev->header_oid,
 				      &rbd_dev->header_oloc, &watchers,
 				      &num_watchers);
-	if (ret)
+	if (ret) {
+		rbd_warn(rbd_dev, "failed to get watchers: %d", ret);
 		return ret;
+	}
 
 	sscanf(locker->id.cookie, RBD_LOCK_COOKIE_PREFIX " %llu", &cookie);
 	for (i = 0; i < num_watchers; i++) {
@@ -3985,8 +3987,12 @@ static int rbd_try_lock(struct rbd_device *rbd_dev)
 		locker = refreshed_locker = NULL;
 
 		ret = rbd_lock(rbd_dev);
-		if (ret != -EBUSY)
+		if (!ret)
+			goto out;
+		if (ret != -EBUSY) {
+			rbd_warn(rbd_dev, "failed to lock header: %d", ret);
 			goto out;
+		}
 
 		/* determine if the current lock holder is still alive */
 		locker = get_lock_owner_info(rbd_dev);
@@ -4089,11 +4095,8 @@ static int rbd_try_acquire_lock(struct rbd_device *rbd_dev)
 
 	ret = rbd_try_lock(rbd_dev);
 	if (ret < 0) {
-		rbd_warn(rbd_dev, "failed to lock header: %d", ret);
-		if (ret == -EBLOCKLISTED)
-			goto out;
-
-		ret = 1; /* request lock anyway */
+		rbd_warn(rbd_dev, "failed to acquire lock: %d", ret);
+		goto out;
 	}
 	if (ret > 0) {
 		up_write(&rbd_dev->lock_rwsem);
@@ -6627,12 +6630,11 @@ static int rbd_add_acquire_lock(struct rbd_device *rbd_dev)
 		cancel_delayed_work_sync(&rbd_dev->lock_dwork);
 		if (!ret)
 			ret = -ETIMEDOUT;
-	}
 
-	if (ret) {
-		rbd_warn(rbd_dev, "failed to acquire exclusive lock: %ld", ret);
-		return ret;
+		rbd_warn(rbd_dev, "failed to acquire lock: %ld", ret);
 	}
+	if (ret)
+		return ret;
 
 	/*
 	 * The lock may have been released by now, unless automatic lock
--
2.41.0
* Re: [PATCH] rbd: prevent busy loop when requesting exclusive lock
@ 2023-08-02 6:35 ` Dongsheng Yang
From: Dongsheng Yang @ 2023-08-02 6:35 UTC (permalink / raw)
To: Ilya Dryomov, ceph-devel
Hi Ilya
On Wed, 2023/8/2 at 6:22 AM, Ilya Dryomov wrote:
> @@ -3675,7 +3675,7 @@ static int rbd_lock(struct rbd_device *rbd_dev)
>  	ret = ceph_cls_lock(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc,
>  			    RBD_LOCK_NAME, CEPH_CLS_LOCK_EXCLUSIVE, cookie,
>  			    RBD_LOCK_TAG, "", 0);
> -	if (ret)
> +	if (ret && ret != -EEXIST)
>  		return ret;
>
>  	__rbd_lock(rbd_dev, cookie);
If we get -EEXIST here, we call __rbd_lock() and return 0. -EEXIST
means the lock is already held by this client, so is it necessary to
call __rbd_lock()?
* Re: [PATCH] rbd: prevent busy loop when requesting exclusive lock
@ 2023-08-02 6:41 ` Ilya Dryomov
From: Ilya Dryomov @ 2023-08-02 6:41 UTC (permalink / raw)
To: Dongsheng Yang; +Cc: ceph-devel
On Wed, Aug 2, 2023 at 8:36 AM Dongsheng Yang
<dongsheng.yang@easystack.cn> wrote:
>
> Hi Ilya
>
> On Wed, 2023/8/2 at 6:22 AM, Ilya Dryomov wrote:
> > @@ -3675,7 +3675,7 @@ static int rbd_lock(struct rbd_device *rbd_dev)
> >  	ret = ceph_cls_lock(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc,
> >  			    RBD_LOCK_NAME, CEPH_CLS_LOCK_EXCLUSIVE, cookie,
> >  			    RBD_LOCK_TAG, "", 0);
> > -	if (ret)
> > +	if (ret && ret != -EEXIST)
> >  		return ret;
> >
> >  	__rbd_lock(rbd_dev, cookie);
>
> If we get -EEXIST here, we call __rbd_lock() and return 0. -EEXIST
> means the lock is already held by this client, so is it necessary to
> call __rbd_lock()?
Hi Dongsheng,
Yes, because the reason rbd_lock() gets called in the first place is
that the kernel client doesn't "know" that it's still holding the lock
in RADOS. This can happen if the unlock operation times out, for
example.
Notice
    WARN_ON(__rbd_is_lock_owner(rbd_dev) ||
            rbd_dev->lock_cookie[0] != '\0');
at the top of rbd_lock().
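
The scenario can be illustrated with a small userspace model. It is a sketch under assumptions, not the driver code: the sketch_* structs and functions and the "auto 1" cookie string are hypothetical, sketch_cls_lock() only imitates the three outcomes relevant here (0, -EEXIST, -EBUSY), and sketch_record_lock() stands in for the client-side bookkeeping done by __rbd_lock().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the lock state kept in RADOS. */
struct sketch_server {
	char holder_cookie[32];
};

/* Hypothetical model of what the client believes about the lock. */
struct sketch_client {
	char lock_cookie[32];
	bool is_owner;
};

/* Imitates the lock call: 0 if free, -EEXIST if we already hold it, else -EBUSY. */
static int sketch_cls_lock(struct sketch_server *srv, const char *cookie)
{
	if (srv->holder_cookie[0] == '\0') {
		snprintf(srv->holder_cookie, sizeof(srv->holder_cookie),
			 "%s", cookie);
		return 0;
	}
	if (strcmp(srv->holder_cookie, cookie) == 0)
		return -EEXIST;
	return -EBUSY;
}

/* Stand-in for __rbd_lock(): record ownership on the client side. */
static void sketch_record_lock(struct sketch_client *cl, const char *cookie)
{
	snprintf(cl->lock_cookie, sizeof(cl->lock_cookie), "%s", cookie);
	cl->is_owner = true;
}

/* Mirrors the patched check: -EEXIST is treated like success. */
static int sketch_lock(struct sketch_server *srv, struct sketch_client *cl,
		       const char *cookie)
{
	int ret = sketch_cls_lock(srv, cookie);

	if (ret && ret != -EEXIST)
		return ret;

	sketch_record_lock(cl, cookie);
	return 0;
}

/* Unlock that timed out: the client forgets the lock, the server keeps it. */
static void sketch_unlock_timed_out(struct sketch_client *cl)
{
	cl->lock_cookie[0] = '\0';
	cl->is_owner = false;
}
```

After sketch_unlock_timed_out(), the client state is empty even though the server still records the lock, so the next sketch_lock() with the same cookie sees -EEXIST. Without the client-side bookkeeping step the two views would stay out of sync.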
Thanks,
Ilya
* Re: [PATCH] rbd: prevent busy loop when requesting exclusive lock
@ 2023-08-02 6:52 ` Dongsheng Yang
From: Dongsheng Yang @ 2023-08-02 6:52 UTC (permalink / raw)
To: Ilya Dryomov; +Cc: ceph-devel
On Wed, 2023/8/2 at 2:41 PM, Ilya Dryomov wrote:
> On Wed, Aug 2, 2023 at 8:36 AM Dongsheng Yang
> <dongsheng.yang@easystack.cn> wrote:
>>
>> Hi Ilya
>>
>> On Wed, 2023/8/2 at 6:22 AM, Ilya Dryomov wrote:
>>> @@ -3675,7 +3675,7 @@ static int rbd_lock(struct rbd_device *rbd_dev)
>>>  	ret = ceph_cls_lock(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc,
>>>  			    RBD_LOCK_NAME, CEPH_CLS_LOCK_EXCLUSIVE, cookie,
>>>  			    RBD_LOCK_TAG, "", 0);
>>> -	if (ret)
>>> +	if (ret && ret != -EEXIST)
>>>  		return ret;
>>>
>>>  	__rbd_lock(rbd_dev, cookie);
>>
>> If we get -EEXIST here, we call __rbd_lock() and return 0. -EEXIST
>> means the lock is already held by this client, so is it necessary to
>> call __rbd_lock()?
>
> Hi Dongsheng,
>
> Yes, because the reason rbd_lock() gets called in the first place is
> that the kernel client doesn't "know" that it's still holding the lock
> in RADOS. This can happen if the unlock operation times out, for
> example.
>
> Notice
>
>     WARN_ON(__rbd_is_lock_owner(rbd_dev) ||
>             rbd_dev->lock_cookie[0] != '\0');
>
> at the top of rbd_lock().
Okay, then
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> Thanks,
>
> Ilya