* [lustre-devel] [PATCH 1/7] lustre: ptlrpc: unregister reply buffer on rq_err
2022-04-19 0:30 [lustre-devel] [PATCH 0/7] lustre: OpenSFS updates April 18, 2022 James Simmons
@ 2022-04-19 0:30 ` James Simmons
2022-04-19 0:30 ` [lustre-devel] [PATCH 2/7] lustre: llite: Fix use of uninitialized fields James Simmons
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: James Simmons @ 2022-04-19 0:30 UTC (permalink / raw)
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Alexey Lyashkov, Alexander Zarochentsev, Lustre Development List
From: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Unregister reply buffer on rq_err and prevent a late reply from
modifying request flags in INTERPRET state.
HPE-bug-id: LUS-10717
Fixes: b06c1d17e488 ("lustre: mgc: do not ignore target registration failure")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15435
Lustre-commit: d8012811cc6ff9c7f ("LU-15435 ptlrpc: unregister reply buffer on rq_err")
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/46132
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
fs/lustre/ptlrpc/client.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/lustre/ptlrpc/client.c b/fs/lustre/ptlrpc/client.c
index ec0cd5f..685d6e2 100644
--- a/fs/lustre/ptlrpc/client.c
+++ b/fs/lustre/ptlrpc/client.c
@@ -1857,6 +1857,11 @@ int ptlrpc_check_set(const struct lu_env *env, struct ptlrpc_request_set *set)
}
if (req->rq_err) {
+ if (!ptlrpc_unregister_reply(req, 1)) {
+ ptlrpc_unregister_bulk(req, 1);
+ continue;
+ }
+
spin_lock(&req->rq_lock);
req->rq_replied = 0;
spin_unlock(&req->rq_lock);
--
1.8.3.1
_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
* [lustre-devel] [PATCH 2/7] lustre: llite: Fix use of uninitialized fields
From: James Simmons @ 2022-04-19 0:30 UTC (permalink / raw)
To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List
From: Patrick Farrell <pfarrell@whamcloud.com>
We use data from ci_rw to set io_start_index and
io_end_index, which is a problem for mmap because mmap does
not use ci_rw.
When ci_rand_read is set or readahead is disabled, we use
these values to decide how much data to read.
In the mmap case ci_rw is uninitialized, and if the values are non-zero,
we may try to read data beyond the locks we took for our
I/O.
If there is no lock (either because there was never one or
it was cancelled), this results in an LBUG in
osc_req_attr_set when it verifies the pages are covered by
a lock.
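For illustration only, a user-space sketch of the index selection described
above. SKETCH_PAGE_SHIFT, struct sketch_range and sketch_io_range() are
made-up names for this note and are not part of the patch; the 4 KiB page
size is an assumption. The point is simply that the mmap path must not read
the byte range at all, because nothing filled it in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

struct sketch_range {
	uint64_t start_index;
	uint64_t end_index;
};

/*
 * Pick the page range a read is allowed to cover.
 * - read()-style I/O: derive it from the byte range (pos, count).
 * - mmap fault: pos and count were never set, so fall back to the
 *   index of the faulting page alone.
 */
static struct sketch_range sketch_io_range(bool is_mmap, uint64_t pos,
					   uint64_t count,
					   uint64_t fault_index)
{
	struct sketch_range r;

	if (!is_mmap) {
		assert(count > 0);	/* "count - 1" below must not wrap */
		r.start_index = pos >> SKETCH_PAGE_SHIFT;
		r.end_index = (pos + count - 1) >> SKETCH_PAGE_SHIFT;
	} else {
		/* pos and count are deliberately ignored here */
		r.start_index = fault_index;
		r.end_index = fault_index;
	}
	return r;
}
```

With is_mmap set, whatever garbage sits in pos and count has no effect on
the result, which is the property the fix relies on.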
WC-bug-id: https://jira.whamcloud.com/browse/LU-15637
Lustre-commit: 9884f37985c1108fb ("LU-15637 llite: Fix use of uninitialized fields")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46776
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
fs/lustre/llite/rw.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c
index b8cffde..0ddd920 100644
--- a/fs/lustre/llite/rw.c
+++ b/fs/lustre/llite/rw.c
@@ -1627,6 +1627,8 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
struct ll_readahead_state *ras = NULL;
struct cl_2queue *queue = &io->ci_queue;
struct ll_sb_info *sbi = ll_i2sbi(inode);
+ struct vvp_io *vio = vvp_env_io(env);
+ bool mmap = !vio->vui_ra_valid;
struct cl_sync_io *anchor = NULL;
pgoff_t ra_start_index = 0;
pgoff_t io_start_index;
@@ -1644,12 +1646,11 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
uptodate = vpg->vpg_defer_uptodate;
if (ll_readahead_enabled(sbi) && !vpg->vpg_ra_updated && ras) {
- struct vvp_io *vio = vvp_env_io(env);
enum ras_update_flags flags = 0;
if (uptodate)
flags |= LL_RAS_HIT;
- if (!vio->vui_ra_valid)
+ if (mmap)
flags |= LL_RAS_MMAP;
ras_update(sbi, inode, ras, vvp_index(vpg), flags, io);
}
@@ -1667,9 +1668,16 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
cl_page_list_add(&queue->c2_qin, page, true);
}
- io_start_index = cl_index(io->ci_obj, io->u.ci_rw.crw_pos);
- io_end_index = cl_index(io->ci_obj, io->u.ci_rw.crw_pos +
- io->u.ci_rw.crw_count - 1);
+ /* mmap does not set the ci_rw fields */
+ if (!mmap) {
+ io_start_index = cl_index(io->ci_obj, io->u.ci_rw.crw_pos);
+ io_end_index = cl_index(io->ci_obj, io->u.ci_rw.crw_pos +
+ io->u.ci_rw.crw_count - 1);
+ } else {
+ io_start_index = vvp_index(vpg);
+ io_end_index = vvp_index(vpg);
+ }
+
if (ll_readahead_enabled(sbi) && ras && !io->ci_rand_read) {
pgoff_t skip_index = 0;
--
1.8.3.1
* [lustre-devel] [PATCH 3/7] lustre: lov: remove lo_trunc_stripeno
From: James Simmons @ 2022-04-19 0:31 UTC (permalink / raw)
To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List
From: "John L. Hammond" <jhammond@whamcloud.com>
Remove the lo_trunc_stripeno member of struct lov_layout_raid0 and add
an lis_trunc_stripe_index array to struct lov_io. This makes the
truncate stripe index information belong to the IO and not to the
concurrently accessed object. This is needed because we do not have
locking that protects it from its initialization in lov_io_iter_init()
to its use in lov_lock_sub_init(). Also remove the unused
lo_write_lock member of struct lov_object.
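As a rough user-space illustration of the ownership change (not the kernel
code): struct sketch_io, sketch_io_iter_init() and sketch_io_iter_fini() are
hypothetical stand-ins for struct lov_io and its iter_init/iter_fini hooks.
Unlike the patch, which resets only the entries it visits, this sketch
initializes every slot to -1 to keep it short.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct lov_io: state owned by a single I/O. */
struct sketch_io {
	int entry_count;
	int *trunc_stripe_index;	/* one slot per layout entry, -1 = none */
};

/* Allocate the per-I/O array and mark every entry "no truncate stripe". */
static int sketch_io_iter_init(struct sketch_io *io, int entry_count)
{
	int i;

	assert(entry_count > 0);
	io->trunc_stripe_index = calloc(entry_count,
					sizeof(io->trunc_stripe_index[0]));
	if (!io->trunc_stripe_index)
		return -ENOMEM;

	io->entry_count = entry_count;
	for (i = 0; i < entry_count; i++)
		io->trunc_stripe_index[i] = -1;
	return 0;
}

/* Release the array when the iteration ends; safe to call twice. */
static void sketch_io_iter_fini(struct sketch_io *io)
{
	free(io->trunc_stripe_index);
	io->trunc_stripe_index = NULL;
}
```

Because each I/O carries its own array, two truncates running against the
same object cannot overwrite each other's stripe index, so no extra lock is
needed between initialization and use.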
Fixes: d83ed47d35 ("lustre: lov: correctly set OST obj size")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15702
Lustre-commit: 42a6d1fdb6818f1b3 ("LU-15702 lov: remove lo_trunc_stripeno")
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46940
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
fs/lustre/lov/lov_cl_internal.h | 17 +++++++----------
fs/lustre/lov/lov_io.c | 40 ++++++++++++++++++++++++++++------------
fs/lustre/lov/lov_lock.c | 35 +++++++++++++++++++----------------
fs/lustre/lov/lov_object.c | 2 --
4 files changed, 54 insertions(+), 40 deletions(-)
diff --git a/fs/lustre/lov/lov_cl_internal.h b/fs/lustre/lov/lov_cl_internal.h
index 42fd10a..6b96543 100644
--- a/fs/lustre/lov/lov_cl_internal.h
+++ b/fs/lustre/lov/lov_cl_internal.h
@@ -175,11 +175,6 @@ struct lov_comp_layout_entry_ops {
struct lov_layout_raid0 {
unsigned int lo_nr;
/**
- * record the stripe no before the truncate size, used for setting OST
- * object size for truncate. LU-14128.
- */
- int lo_trunc_stripeno;
- /**
* When this is true, lov_object::lo_attr contains
* valid up to date attributes for a top-level
* object. This field is reset to 0 when attributes of
@@ -325,11 +320,6 @@ struct lov_object {
*/
int lo_preferred_mirror;
/**
- * For FLR: the lock to protect access to
- * lo_preferred_mirror.
- */
- spinlock_t lo_write_lock;
- /**
* For FLR: Number of (valid) mirrors.
*/
unsigned int lo_mirror_count;
@@ -562,6 +552,13 @@ struct lov_io {
loff_t lis_io_endpos;
/**
+ * Record the stripe index before the truncate size, used for setting
+ * OST object size for truncate. LU-14128. lis_trunc_stripe_index[i]
+ * refers to lov_object.u.composite.lo_entries[i].
+ */
+ int *lis_trunc_stripe_index;
+
+ /**
* starting position within a file, for the current io loop iteration
* (stripe), used by ci_io_loop().
*/
diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c
index 904bafd..38dacd35 100644
--- a/fs/lustre/lov/lov_io.c
+++ b/fs/lustre/lov/lov_io.c
@@ -782,6 +782,7 @@ static int lov_io_iter_init(const struct lu_env *env,
{
struct lov_io *lio = cl2lov_io(env, ios);
struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
+ bool is_trunc = cl_io_is_trunc(ios->cis_io);
struct lov_io_sub *sub;
struct lu_extent ext;
int rc = 0;
@@ -790,6 +791,16 @@ static int lov_io_iter_init(const struct lu_env *env,
ext.e_start = lio->lis_pos;
ext.e_end = lio->lis_endpos;
+ if (is_trunc) {
+ int count = lio->lis_object->u.composite.lo_entry_count;
+
+ lio->lis_trunc_stripe_index = kcalloc(count,
+ sizeof(lio->lis_trunc_stripe_index[0]),
+ GFP_NOFS);
+ if (!lio->lis_trunc_stripe_index)
+ return -ENOMEM;
+ }
+
lov_foreach_io_layout(index, lio, &ext) {
struct lov_layout_entry *le = lov_entry(lio->lis_object, index);
struct lov_layout_raid0 *r0 = &le->lle_raid0;
@@ -798,7 +809,8 @@ static int lov_io_iter_init(const struct lu_env *env,
u64 start;
u64 end;
- r0->lo_trunc_stripeno = -1;
+ if (is_trunc)
+ lio->lis_trunc_stripe_index[index] = -1;
CDEBUG(D_VFSTRACE, "component[%d] flags %#x\n",
index, lsm->lsm_entries[index]->lsme_flags);
@@ -832,8 +844,7 @@ static int lov_io_iter_init(const struct lu_env *env,
continue;
}
- if (cl_io_is_trunc(ios->cis_io) &&
- !tested_trunc_stripe) {
+ if (is_trunc && !tested_trunc_stripe) {
int prev;
u64 tr_start;
@@ -848,20 +859,22 @@ static int lov_io_iter_init(const struct lu_env *env,
if (ext.e_start <
lsm->lsm_entries[index]->lsme_extent.e_start) {
/* need previous stripe involvement */
- r0->lo_trunc_stripeno = prev;
+ lio->lis_trunc_stripe_index[index] = prev;
} else {
tr_start = ext.e_start;
tr_start = lov_do_div64(tr_start,
stripe_width(lsm, index));
/* tr_start %= stripe_swidth */
- if (tr_start == stripe * lsm->lsm_entries[index]->lsme_stripe_size)
- r0->lo_trunc_stripeno = prev;
+ if (tr_start ==
+ stripe * lsm->lsm_entries[index]->lsme_stripe_size)
+ lio->lis_trunc_stripe_index[index] = prev;
}
}
/* if the last stripe is the trunc stripeno */
- if (r0->lo_trunc_stripeno == stripe)
- r0->lo_trunc_stripeno = -1;
+ if (is_trunc &&
+ lio->lis_trunc_stripe_index[index] == stripe)
+ lio->lis_trunc_stripe_index[index] = -1;
sub = lov_sub_get(env, lio,
lov_comp_index(index, stripe));
@@ -875,10 +888,10 @@ static int lov_io_iter_init(const struct lu_env *env,
if (rc != 0)
break;
- if (r0->lo_trunc_stripeno != -1) {
- stripe = r0->lo_trunc_stripeno;
+ if (is_trunc && lio->lis_trunc_stripe_index[index] != -1) {
+ stripe = lio->lis_trunc_stripe_index[index];
if (unlikely(!r0->lo_sub[stripe])) {
- r0->lo_trunc_stripeno = -1;
+ lio->lis_trunc_stripe_index[index] = -1;
continue;
}
sub = lov_sub_get(env, lio,
@@ -892,7 +905,7 @@ static int lov_io_iter_init(const struct lu_env *env,
* read get wrong kms.
*/
if (!list_empty(&sub->sub_linkage)) {
- r0->lo_trunc_stripeno = -1;
+ lio->lis_trunc_stripe_index[index] = -1;
continue;
}
@@ -1091,6 +1104,9 @@ static void lov_io_iter_fini(const struct lu_env *env,
struct lov_io *lio = cl2lov_io(env, ios);
int rc;
+ kfree(lio->lis_trunc_stripe_index);
+ lio->lis_trunc_stripe_index = NULL;
+
rc = lov_io_call(env, lio, lov_io_iter_fini_wrapper);
LASSERT(rc == 0);
while (!list_empty(&lio->lis_active))
diff --git a/fs/lustre/lov/lov_lock.c b/fs/lustre/lov/lov_lock.c
index d137614..313c09a 100644
--- a/fs/lustre/lov/lov_lock.c
+++ b/fs/lustre/lov/lov_lock.c
@@ -115,6 +115,8 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
struct cl_lock *lock)
{
struct lov_object *lov = cl2lov(obj);
+ struct lov_io *lio = lov_env_io(env);
+ bool is_trunc = cl_io_is_trunc(io);
struct lov_lock *lovlck;
struct lu_extent ext;
int result = 0;
@@ -124,6 +126,8 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
u64 start;
u64 end;
+ LASSERT(ergo(is_trunc, lio->lis_trunc_stripe_index != NULL));
+
ext.e_start = cl_offset(obj, lock->cll_descr.cld_start);
if (lock->cll_descr.cld_end == CL_PAGE_EOF)
ext.e_end = OBD_OBJECT_EOF;
@@ -131,16 +135,16 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
ext.e_end = cl_offset(obj, lock->cll_descr.cld_end + 1);
nr = 0;
- lov_foreach_io_layout(index, lov_env_io(env), &ext) {
+ lov_foreach_io_layout(index, lio, &ext) {
struct lov_layout_raid0 *r0 = lov_r0(lov, index);
for (i = 0; i < r0->lo_nr; i++) {
if (likely(r0->lo_sub[i])) { /* spare layout */
- if (lov_stripe_intersects(lov->lo_lsm, index, i,
- &ext, &start, &end))
- nr++;
- else if (cl_io_is_trunc(io) &&
- r0->lo_trunc_stripeno == i)
+ if (lov_stripe_intersects(lov->lo_lsm, index,
+ i, &ext, &start,
+ &end) ||
+ (is_trunc &&
+ i == lio->lis_trunc_stripe_index[index]))
nr++;
}
}
@@ -162,24 +166,23 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
struct lov_layout_raid0 *r0 = lov_r0(lov, index);
for (i = 0; i < r0->lo_nr; ++i) {
- struct lov_lock_sub *lls = &lovlck->lls_sub[nr];
- struct cl_lock_descr *descr = &lls->sub_lock.cll_descr;
- bool intersect = false;
+ struct lov_lock_sub *lls;
+ struct cl_lock_descr *descr;
if (unlikely(!r0->lo_sub[i]))
continue;
- intersect = lov_stripe_intersects(lov->lo_lsm, index, i,
- &ext, &start, &end);
- if (intersect)
- goto init_sublock;
-
- if (cl_io_is_trunc(io) && i == r0->lo_trunc_stripeno)
+ if (lov_stripe_intersects(lov->lo_lsm, index, i, &ext,
+ &start, &end) ||
+ (is_trunc &&
+ i == lio->lis_trunc_stripe_index[index]))
goto init_sublock;
continue;
-
init_sublock:
+ LASSERT(nr < lovlck->lls_nr);
+ lls = &lovlck->lls_sub[nr];
+ descr = &lls->sub_lock.cll_descr;
LASSERT(!descr->cld_obj);
descr->cld_obj = lovsub2cl(r0->lo_sub[i]);
descr->cld_start = cl_index(descr->cld_obj, start);
diff --git a/fs/lustre/lov/lov_object.c b/fs/lustre/lov/lov_object.c
index ff0f7fa..d9eaf15 100644
--- a/fs/lustre/lov/lov_object.c
+++ b/fs/lustre/lov/lov_object.c
@@ -214,7 +214,6 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
spin_lock_init(&r0->lo_sub_lock);
r0->lo_nr = lse->lsme_stripe_count;
- r0->lo_trunc_stripeno = -1;
flags = memalloc_nofs_save();
r0->lo_sub = kvmalloc_array(r0->lo_nr, sizeof(r0->lo_sub[0]),
@@ -641,7 +640,6 @@ static int lov_init_composite(const struct lu_env *env, struct lov_device *dev,
entry_count = lsm->lsm_entry_count;
- spin_lock_init(&comp->lo_write_lock);
comp->lo_flags = lsm->lsm_flags;
comp->lo_mirror_count = lsm->lsm_mirror_count + 1;
comp->lo_entry_count = lsm->lsm_entry_count;
--
1.8.3.1
* [lustre-devel] [PATCH 4/7] lustre: lmv: change default hash back to fnv_1a_64
From: James Simmons @ 2022-04-19 0:31 UTC (permalink / raw)
To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List
From: Andreas Dilger <adilger@whamcloud.com>
Until the performance issue is resolved, change the default directory
hash type from 'crush' back to 'fnv_1a_64'.
Fixes: 92fb134e43a0 ("lustre: lmv: change default hash type to crush")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15692
Lustre-commit: 0090b6f6f6cfd65fc ("LU-15692 lmv: change default hash back to fnv_1a_64")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46950
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
include/uapi/linux/lustre/lustre_user.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index 3017148..fa01c28 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -707,7 +707,7 @@ static __attribute__((unused)) const char *mdt_hash_name[] = {
"crush",
};
-#define LMV_HASH_TYPE_DEFAULT LMV_HASH_TYPE_CRUSH
+#define LMV_HASH_TYPE_DEFAULT LMV_HASH_TYPE_FNV_1A_64
/* Right now only the lower part(0-16bits) of lmv_hash_type is being used,
* and the higher part will be the flag to indicate the status of object,
--
1.8.3.1
* [lustre-devel] [PATCH 5/7] lnet: only update gateway NI status on discovery
From: James Simmons @ 2022-04-19 0:31 UTC (permalink / raw)
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Chris Horn, Amir Shehata, Lustre Development List
From: Chris Horn <chris.horn@hpe.com>
Move the NI status from DOWN to UP only when receiving
a discovery PING. The discovery PING should be the only
message which should update the NI status since it's used
as the gateway NI keep alive mechanism.
This is done to avoid the following scenario:
The gateway itself can push its updates to peers which
have removed it from their routing tables. The peers would
respond to the PUSH with an ACK, the ACK will bring the
gateway's NI status to up. Therefore other peers which have
avoid_asym_router_failure=1 will have their route status
remain up even though the symmetrical route is gone.
Note: there is no way for the gateway to differentiate between
a keep alive discovery and a manually triggered discovery or ping.
However, this is a narrow case which will not be handled.
net_last_alive is converted to use ktime_get_seconds() instead of
ktime_get_real_seconds() since the NTP adjustment is not needed.
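A stand-alone sketch of the resulting condition, for readers who want it
outside the diff context. All names here (struct sketch_msg, the SKETCH_*
constants, sketch_should_refresh_alive()) are invented for illustration;
the kernel code tests LNET_MSG_GET, LNET_RESERVED_PORTAL and
lnet_islocalnid() directly, and the numeric values below are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder message types and portal number for this sketch only. */
enum sketch_msg_type {
	SKETCH_MSG_ACK,
	SKETCH_MSG_PUT,
	SKETCH_MSG_GET,
	SKETCH_MSG_REPLY,
};
#define SKETCH_RESERVED_PORTAL 0u

struct sketch_msg {
	enum sketch_msg_type type;
	uint32_t portal;	/* portal index carried by a GET */
	bool src_is_local;	/* true if the sender is one of our own NIDs */
};

/*
 * Decide whether an incoming message may refresh the network's
 * last-alive stamp: only when routing is enabled, the message is a GET
 * on the reserved portal (a ping), it did not come from ourselves, and
 * the stamp is not already current.
 */
static bool sketch_should_refresh_alive(bool routing,
					const struct sketch_msg *m,
					int64_t last_alive, int64_t now)
{
	assert(m != NULL);

	return routing &&
	       m->type == SKETCH_MSG_GET &&
	       m->portal == SKETCH_RESERVED_PORTAL &&
	       !m->src_is_local &&
	       last_alive != now;
}
```

An ACK sent in response to the gateway's own PUSH fails the type test, so it
can no longer move the NI status back to up.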
WC-bug-id: https://jira.whamcloud.com/browse/LU-13714
Lustre-commit: 3e3f70eb1ec95f32d ("LU-13714 lnet: only update gateway NI status on discovery")
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/39176
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
net/lnet/lnet/config.c | 2 +-
net/lnet/lnet/lib-move.c | 16 ++++++++++++----
net/lnet/lnet/router.c | 2 +-
net/lnet/lnet/router_proc.c | 2 +-
4 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/net/lnet/lnet/config.c b/net/lnet/lnet/config.c
index f499c91..da3d20e 100644
--- a/net/lnet/lnet/config.c
+++ b/net/lnet/lnet/config.c
@@ -350,7 +350,7 @@ struct lnet_net *
spin_lock_init(&net->net_lock);
net->net_id = net_id;
- net->net_last_alive = ktime_get_real_seconds();
+ net->net_last_alive = ktime_get_seconds();
net->net_sel_priority = LNET_MAX_SELECTION_PRIORITY;
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 3ad13d0..0b3986e 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -4250,6 +4250,7 @@ void lnet_monitor_thr_stop(void)
u32 type;
int rc = 0;
int cpt;
+ time64_t now = ktime_get_seconds();
LASSERT(!in_interrupt());
@@ -4301,11 +4302,18 @@ void lnet_monitor_thr_stop(void)
return -EPROTO;
}
- if (the_lnet.ln_routing &&
- ni->ni_net->net_last_alive != ktime_get_real_seconds()) {
+ /* Only update net_last_alive for incoming GETs on the reserved portal
+ * (i.e. incoming lnet/discovery pings).
+ * This avoids situations where the router's own traffic results in NI
+ * status changes
+ */
+ if (the_lnet.ln_routing && type == LNET_MSG_GET &&
+ hdr->msg.get.ptl_index == LNET_RESERVED_PORTAL &&
+ !lnet_islocalnid(&src_nid) &&
+ ni->ni_net->net_last_alive != now) {
lnet_ni_lock(ni);
spin_lock(&ni->ni_net->net_lock);
- ni->ni_net->net_last_alive = ktime_get_real_seconds();
+ ni->ni_net->net_last_alive = now;
spin_unlock(&ni->ni_net->net_lock);
push = lnet_ni_set_status_locked(ni, LNET_NI_STATUS_UP);
lnet_ni_unlock(ni);
@@ -4480,7 +4488,7 @@ void lnet_monitor_thr_stop(void)
}
}
- lpni->lpni_last_alive = ktime_get_seconds();
+ lpni->lpni_last_alive = now;
msg->msg_rxpeer = lpni;
msg->msg_rxni = ni;
diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index beded3e..60ae15d 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -1044,7 +1044,7 @@ int lnet_get_rtr_pool_cfg(int cpt, struct lnet_ioctl_pool_cfg *pool_cfg)
timeout = router_ping_timeout + alive_router_check_interval;
- now = ktime_get_real_seconds();
+ now = ktime_get_seconds();
list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
if (net->net_lnd->lnd_type == LOLND)
continue;
diff --git a/net/lnet/lnet/router_proc.c b/net/lnet/lnet/router_proc.c
index a53d6fa..f231da1 100644
--- a/net/lnet/lnet/router_proc.c
+++ b/net/lnet/lnet/router_proc.c
@@ -663,7 +663,7 @@ static int proc_lnet_nis(struct ctl_table *table, int write,
if (ni) {
struct lnet_tx_queue *tq;
char *stat;
- time64_t now = ktime_get_real_seconds();
+ time64_t now = ktime_get_seconds();
time64_t last_alive = -1;
int i;
int j;
--
1.8.3.1
* [lustre-devel] [PATCH 6/7] lnet: ln_api_mutex deadlocks
From: James Simmons @ 2022-04-19 0:31 UTC (permalink / raw)
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Chris Horn, Lustre Development List
From: Chris Horn <chris.horn@hpe.com>
LNetNIFini() acquires the ln_api_mutex and holds onto it throughout
various shutdown routines. Meanwhile, LND threads (via
lnet_nid2peerni_locked()) or the discovery thread (via
lnet_peer_data_present()) may need to acquire this mutex in order to
progress.
Address these potential deadlocks by setting the_lnet.ln_state to
LNET_STATE_STOPPING earlier in LNetNIFini(), and by releasing the mutex
before any call into an LND module and before any wait.
LNetNIInit() is modified to return -ESHUTDOWN if it finds that there
is a concurrent shutdown in progress.
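The pattern can be sketched in user space with a pthread mutex. This is only
an illustration under stated assumptions: every sketch_* name is
hypothetical, the blocking step is modelled as a callback, and the callback
runs on the same thread purely to show that the mutex is free at that point.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

enum sketch_state { SKETCH_SHUTDOWN, SKETCH_RUNNING, SKETCH_STOPPING };

static pthread_mutex_t sketch_api_mutex = PTHREAD_MUTEX_INITIALIZER;
static enum sketch_state sketch_ln_state = SKETCH_RUNNING;
static int sketch_probe_rc;

/* An API entry point: refuse new work once shutdown has begun. */
static int sketch_api_call(void)
{
	int rc = 0;

	pthread_mutex_lock(&sketch_api_mutex);
	if (sketch_ln_state != SKETCH_RUNNING)
		rc = -ESHUTDOWN;
	/* real work would happen here, still under the mutex */
	pthread_mutex_unlock(&sketch_api_mutex);
	return rc;
}

/* Models a helper that needs the API mutex while shutdown is underway. */
static void sketch_probe(void)
{
	sketch_probe_rc = sketch_api_call();
}

/*
 * Shutdown: publish STOPPING first, then drop the mutex around anything
 * that may block on work which itself needs the mutex.
 */
static void sketch_shutdown(void (*blocking_step)(void))
{
	pthread_mutex_lock(&sketch_api_mutex);
	sketch_ln_state = SKETCH_STOPPING;

	pthread_mutex_unlock(&sketch_api_mutex);
	if (blocking_step)
		blocking_step();	/* would deadlock if we still held it */
	pthread_mutex_lock(&sketch_api_mutex);

	/* nobody may restart the stack while we were unlocked */
	assert(sketch_ln_state == SKETCH_STOPPING);
	sketch_ln_state = SKETCH_SHUTDOWN;
	pthread_mutex_unlock(&sketch_api_mutex);
}
```

Setting the state before the first unlock is what makes the unlock safe:
anyone who acquires the mutex in the gap sees a non-running state and backs
out with -ESHUTDOWN instead of starting new work.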
HPE-bug-id: LUS-10681
WC-bug-id: https://jira.whamcloud.com/browse/LU-15616
Lustre-commit: 22de0bd145b649768 ("LU-15616 lnet: ln_api_mutex deadlocks")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/46727
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
net/lnet/lnet/api-ni.c | 40 +++++++++++++++++++++++++++++++++++-----
net/lnet/lnet/lib-move.c | 2 ++
net/lnet/lnet/peer.c | 11 ++++++++---
3 files changed, 45 insertions(+), 8 deletions(-)
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 1978905..44d5014 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -2244,14 +2244,16 @@ static void lnet_push_target_fini(void)
islo = ni->ni_net->net_lnd->lnd_type == LOLND;
LASSERT(!in_interrupt());
- /* Holding the mutex makes it safe for lnd_shutdown
+ /* Holding the LND mutex makes it safe for lnd_shutdown
* to call module_put(). Module unload cannot finish
* until lnet_unregister_lnd() completes, and that
- * requires the mutex.
+ * requires the LND mutex.
*/
+ mutex_unlock(&the_lnet.ln_api_mutex);
mutex_lock(&the_lnet.ln_lnd_mutex);
net->net_lnd->lnd_shutdown(ni);
mutex_unlock(&the_lnet.ln_lnd_mutex);
+ mutex_lock(&the_lnet.ln_api_mutex);
if (!islo)
CDEBUG(D_LNI, "Removed LNI %s\n",
@@ -2323,7 +2325,8 @@ static void lnet_push_target_fini(void)
/* NB called holding the global mutex */
/* All quiet on the API front */
- LASSERT(the_lnet.ln_state == LNET_STATE_RUNNING);
+ LASSERT(the_lnet.ln_state == LNET_STATE_RUNNING ||
+ the_lnet.ln_state == LNET_STATE_STOPPING);
LASSERT(!the_lnet.ln_refcount);
lnet_net_lock(LNET_LOCK_EX);
@@ -2823,6 +2826,11 @@ void lnet_lib_exit(void)
CDEBUG(D_OTHER, "refs %d\n", the_lnet.ln_refcount);
+ if (the_lnet.ln_state == LNET_STATE_STOPPING) {
+ mutex_unlock(&the_lnet.ln_api_mutex);
+ return -ESHUTDOWN;
+ }
+
if (the_lnet.ln_refcount > 0) {
rc = the_lnet.ln_refcount++;
mutex_unlock(&the_lnet.ln_api_mutex);
@@ -2968,6 +2976,10 @@ void lnet_lib_exit(void)
} else {
LASSERT(!the_lnet.ln_niinit_self);
+ lnet_net_lock(LNET_LOCK_EX);
+ the_lnet.ln_state = LNET_STATE_STOPPING;
+ lnet_net_unlock(LNET_LOCK_EX);
+
lnet_fault_fini();
lnet_router_debugfs_fini();
lnet_monitor_thr_stop();
@@ -3433,6 +3445,10 @@ static int lnet_handle_legacy_ip2nets(char *ip2nets,
lnet_set_tune_defaults(tun);
mutex_lock(&the_lnet.ln_api_mutex);
+ if (the_lnet.ln_state != LNET_STATE_RUNNING) {
+ rc = -ESHUTDOWN;
+ goto out;
+ }
while ((net = list_first_entry_or_null(&net_head,
struct lnet_net,
net_list)) != NULL) {
@@ -3498,8 +3514,10 @@ int lnet_dyn_add_ni(struct lnet_ioctl_config_ni *conf)
lnet_set_tune_defaults(tun);
mutex_lock(&the_lnet.ln_api_mutex);
-
- rc = lnet_add_net_common(net, tun);
+ if (the_lnet.ln_state != LNET_STATE_RUNNING)
+ rc = -ESHUTDOWN;
+ else
+ rc = lnet_add_net_common(net, tun);
mutex_unlock(&the_lnet.ln_api_mutex);
@@ -3522,6 +3540,10 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
return -EINVAL;
mutex_lock(&the_lnet.ln_api_mutex);
+ if (the_lnet.ln_state != LNET_STATE_RUNNING) {
+ rc = -ESHUTDOWN;
+ goto unlock_api_mutex;
+ }
lnet_net_lock(0);
@@ -3615,6 +3637,10 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
return rc == 0 ? -EINVAL : rc;
mutex_lock(&the_lnet.ln_api_mutex);
+ if (the_lnet.ln_state != LNET_STATE_RUNNING) {
+ rc = -ESHUTDOWN;
+ goto out_unlock_clean;
+ }
if (rc > 1) {
rc = -EINVAL; /* only add one network per call */
@@ -3668,6 +3694,10 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
return -EINVAL;
mutex_lock(&the_lnet.ln_api_mutex);
+ if (the_lnet.ln_state != LNET_STATE_RUNNING) {
+ rc = -ESHUTDOWN;
+ goto out;
+ }
lnet_net_lock(0);
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 0b3986e..0496bf5 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -3872,7 +3872,9 @@ void lnet_monitor_thr_stop(void)
complete(&the_lnet.ln_mt_wait_complete);
/* block until monitor thread signals that it's done */
+ mutex_unlock(&the_lnet.ln_api_mutex);
wait_for_completion(&the_lnet.ln_mt_signal);
+ mutex_lock(&the_lnet.ln_api_mutex);
LASSERT(the_lnet.ln_mt_state == LNET_MT_STATE_SHUTDOWN);
/* perform cleanup tasks */
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 98f71dd..714326a 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -3244,12 +3244,15 @@ static int lnet_peer_deletion(struct lnet_peer *lp)
if (lp->lp_state & LNET_PEER_MARK_DELETED)
return 0;
- if (the_lnet.ln_dc_state != LNET_DC_STATE_RUNNING)
- return -ESHUTDOWN;
-
spin_unlock(&lp->lp_lock);
mutex_lock(&the_lnet.ln_api_mutex);
+ if (the_lnet.ln_state != LNET_STATE_RUNNING ||
+ the_lnet.ln_dc_state != LNET_DC_STATE_RUNNING) {
+ mutex_unlock(&the_lnet.ln_api_mutex);
+ spin_lock(&lp->lp_lock);
+ return -ESHUTDOWN;
+ }
lnet_net_lock(LNET_LOCK_EX);
/* remove the peer from the discovery work
@@ -3929,8 +3932,10 @@ void lnet_peer_discovery_stop(void)
else
wake_up(&the_lnet.ln_dc_waitq);
+ mutex_unlock(&the_lnet.ln_api_mutex);
wait_event(the_lnet.ln_dc_waitq,
the_lnet.ln_dc_state == LNET_DC_STATE_SHUTDOWN);
+ mutex_lock(&the_lnet.ln_api_mutex);
LASSERT(list_empty(&the_lnet.ln_dc_request));
LASSERT(list_empty(&the_lnet.ln_dc_working));
--
1.8.3.1
* [lustre-devel] [PATCH 7/7] lustre: clio: Disable lockless for DIO with O_APPEND
From: James Simmons @ 2022-04-19 0:31 UTC (permalink / raw)
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Shaun Tancheff, Lustre Development List
From: Shaun Tancheff <shaun.tancheff@hpe.com>
Lockless O_DIRECT with O_APPEND can allow interleaved / racy
appends from concurrent I/O.
Disable lockless I/O when O_APPEND is set.
HPE-bug-id: LUS-9776
WC-bug-id: https://jira.whamcloud.com/browse/LU-15670
Lustre-commit: 649d638467c037579 ("LU-15670 clio: Disable lockless for DIO with O_APPEND")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-on: https://review.whamcloud.com/46890
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
fs/lustre/llite/file.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 4855156..1ac3e4f 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -1673,6 +1673,8 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot,
io = vvp_env_thread_io(env);
if (file->f_flags & O_DIRECT) {
+ if (file->f_flags & O_APPEND)
+ dio_lock = 1;
if (!is_sync_kiocb(args->u.normal.via_iocb))
is_aio = true;
--
1.8.3.1