* [PATCH 0/3] afs: fix the usage of read_seqbegin_or_lock()
@ 2023-11-30 11:55 Oleg Nesterov
From: Oleg Nesterov @ 2023-11-30 11:55 UTC (permalink / raw)
To: David Howells; +Cc: Al Viro, Marc Dionne, linux-afs, linux-kernel
Hello,
Every usage of read_seqbegin_or_lock() in fs/afs is wrong: the counter is
always even, so read_seqbegin_or_lock() can never take the lock.

The users should either make it odd on the 2nd pass or use read_seqbegin()
instead.
Oleg.
---
fs/afs/callback.c | 3 ++-
fs/afs/inode.c | 15 ++++++---------
fs/afs/server.c | 7 ++++---
3 files changed, 12 insertions(+), 13 deletions(-)
* [PATCH 1/3] afs: fix the usage of read_seqbegin_or_lock() in afs_lookup_volume_rcu()
From: Oleg Nesterov @ 2023-11-30 11:56 UTC (permalink / raw)
To: David Howells; +Cc: Al Viro, Marc Dionne, linux-afs, linux-kernel
David Howells says:
(2) afs_lookup_volume_rcu().
There can be a lot of volumes known by a system. A thousand would
require a 10-step walk and this is drivable by remote operation, so I
think this should probably take a lock on the second pass too.
Make the "seq" counter odd on the 2nd pass; otherwise read_seqbegin_or_lock()
never takes the lock.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/afs/callback.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index a484fa642808..90f9b2a46ff4 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -110,13 +110,14 @@ static struct afs_volume *afs_lookup_volume_rcu(struct afs_cell *cell,
{
struct afs_volume *volume = NULL;
struct rb_node *p;
- int seq = 0;
+ int seq = 1;
do {
/* Unfortunately, rbtree walking doesn't give reliable results
* under just the RCU read lock, so we have to check for
* changes.
*/
+ seq++; /* 2 on the 1st/lockless path, otherwise odd */
read_seqbegin_or_lock(&cell->volume_lock, &seq);
p = rcu_dereference_raw(cell->volumes.rb_node);
--
2.25.1.362.g51ebf55
* [PATCH 2/3] afs: fix the usage of read_seqbegin_or_lock() in afs_find_server*()
From: Oleg Nesterov @ 2023-11-30 11:56 UTC (permalink / raw)
To: David Howells; +Cc: Al Viro, Marc Dionne, linux-afs, linux-kernel
David Howells says:
(5) afs_find_server().
There could be a lot of servers in the list and each server can have
multiple addresses, so I think this would be better with an exclusive
second pass.
The server list isn't likely to change all that often, but when it does
change, there's a good chance several servers are going to be
added/removed one after the other. Further, this is only going to be
used for incoming cache management/callback requests from the server,
which hopefully aren't going to happen too often - but it is remotely
drivable.
(6) afs_find_server_by_uuid().
Similarly to (5), there could be a lot of servers to search through, but
they are in a tree not a flat list, so it should be faster to process.
Again, it's not likely to change that often and, again, when it does
change it's likely to involve multiple changes. This can be driven
remotely by an incoming cache management request but is mostly going to
be driven by setting up or reconfiguring a volume's server list -
something that also isn't likely to happen often.
Make the "seq" counter odd on the 2nd pass; otherwise read_seqbegin_or_lock()
never takes the lock.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/afs/server.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/afs/server.c b/fs/afs/server.c
index b5237206eac3..0bd2f5ba6900 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -27,7 +27,7 @@ struct afs_server *afs_find_server(struct afs_net *net,
const struct afs_addr_list *alist;
struct afs_server *server = NULL;
unsigned int i;
- int seq = 0, diff;
+ int seq = 1, diff;
rcu_read_lock();
@@ -35,6 +35,7 @@ struct afs_server *afs_find_server(struct afs_net *net,
if (server)
afs_unuse_server_notime(net, server, afs_server_trace_put_find_rsq);
server = NULL;
+ seq++; /* 2 on the 1st/lockless path, otherwise odd */
read_seqbegin_or_lock(&net->fs_addr_lock, &seq);
if (srx->transport.family == AF_INET6) {
@@ -90,7 +91,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu
{
struct afs_server *server = NULL;
struct rb_node *p;
- int diff, seq = 0;
+ int diff, seq = 1;
_enter("%pU", uuid);
@@ -102,7 +103,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu
if (server)
afs_unuse_server(net, server, afs_server_trace_put_uuid_rsq);
server = NULL;
-
+ seq++; /* 2 on the 1st/lockless path, otherwise odd */
read_seqbegin_or_lock(&net->fs_lock, &seq);
p = net->fs_servers.rb_node;
--
2.25.1.362.g51ebf55
* [PATCH 3/3] afs: use read_seqbegin() in afs_check_validity() and afs_getattr()
From: Oleg Nesterov @ 2023-11-30 11:56 UTC (permalink / raw)
To: David Howells; +Cc: Al Viro, Marc Dionne, linux-afs, linux-kernel
David Howells says:
(3) afs_check_validity().
(4) afs_getattr().
These are both pretty short, so your solution is probably good for them.
That said, afs_vnode_commit_status() can spend a long time under the
write lock - and pretty much every file RPC op returns a status update.
Change these functions to use read_seqbegin(). This simplifies the code and
doesn't change the current behaviour: the "seq" counter is always even, so
read_seqbegin_or_lock() can never take the lock anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/afs/inode.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 78efc9719349..a6ae74d5b698 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -629,10 +629,10 @@ bool afs_check_validity(struct afs_vnode *vnode)
enum afs_cb_break_reason need_clear = afs_cb_break_no_break;
time64_t now = ktime_get_real_seconds();
unsigned int cb_break;
- int seq = 0;
+ int seq;
do {
- read_seqbegin_or_lock(&vnode->cb_lock, &seq);
+ seq = read_seqbegin(&vnode->cb_lock);
cb_break = vnode->cb_break;
if (test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
@@ -650,9 +650,7 @@ bool afs_check_validity(struct afs_vnode *vnode)
need_clear = afs_cb_break_no_promise;
}
- } while (need_seqretry(&vnode->cb_lock, seq));
-
- done_seqretry(&vnode->cb_lock, seq);
+ } while (read_seqretry(&vnode->cb_lock, seq));
if (need_clear == afs_cb_break_no_break)
return true;
@@ -755,7 +753,7 @@ int afs_getattr(struct mnt_idmap *idmap, const struct path *path,
struct inode *inode = d_inode(path->dentry);
struct afs_vnode *vnode = AFS_FS_I(inode);
struct key *key;
- int ret, seq = 0;
+ int ret, seq;
_enter("{ ino=%lu v=%u }", inode->i_ino, inode->i_generation);
@@ -772,7 +770,7 @@ int afs_getattr(struct mnt_idmap *idmap, const struct path *path,
}
do {
- read_seqbegin_or_lock(&vnode->cb_lock, &seq);
+ seq = read_seqbegin(&vnode->cb_lock);
generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
if (test_bit(AFS_VNODE_SILLY_DELETED, &vnode->flags) &&
stat->nlink > 0)
@@ -784,9 +782,8 @@ int afs_getattr(struct mnt_idmap *idmap, const struct path *path,
*/
if (S_ISDIR(inode->i_mode))
stat->size = vnode->netfs.remote_i_size;
- } while (need_seqretry(&vnode->cb_lock, seq));
+ } while (read_seqretry(&vnode->cb_lock, seq));
- done_seqretry(&vnode->cb_lock, seq);
return 0;
}
--
2.25.1.362.g51ebf55