* [PATCH 0/1] fsopen: fsconfig syscall restart fix
@ 2020-09-23 16:46 Alexander Mikhalitsyn
  2020-09-23 16:46 ` [PATCH 1/1] " Alexander Mikhalitsyn
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Mikhalitsyn @ 2020-09-23 16:46 UTC (permalink / raw)
  To: viro; +Cc: avagin, Alexander Mikhalitsyn, linux-fsdevel, linux-kernel
Hi guys!
Sometimes ago our CRIU CI started reporting hardly-reproducible (on developer
environment) error "EBUSY" from fsconfig syscall in the CRIU cgroups code.
https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/cgroup.c#L585
The machine on which we caught this problem is PowerPC (POWER8). After tracing
and debugging that we realized that problem is that we get ERESTARTNOINR
from vfs_get_tree function in vfs_fsconfig_locked. After a more deep
investigation, we found that the source is cgroup1_get_tree() and
restart_syscall() at the end. I personally have no idea why we caught this only
on POWER8 VM and have no such problem on amd64. But anyway this is incorrect
behaviour and our patch should fix this problem and make this impossible on all
architectures.
Big thanks to Andrei Vagin. He helped me a lot to fully understand the
problem and prepare this patch.
Regards, Alex
Alexander Mikhalitsyn (1):
  fsopen: fsconfig syscall restart fix
 fs/fsopen.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)
-- 
2.25.1
^ permalink raw reply	[flat|nested] 5+ messages in thread
* [PATCH 1/1] fsopen: fsconfig syscall restart fix
  2020-09-23 16:46 [PATCH 0/1] fsopen: fsconfig syscall restart fix Alexander Mikhalitsyn
@ 2020-09-23 16:46 ` Alexander Mikhalitsyn
  2020-09-23 17:03   ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Mikhalitsyn @ 2020-09-23 16:46 UTC (permalink / raw)
  To: viro; +Cc: avagin, Alexander Mikhalitsyn, linux-fsdevel, linux-kernel
During execution of vfs_fsconfig_locked function we can get ERESTARTNOINTR
error (or other interrupt error). But we changing fs context fc->phase
field to transient states and our entry fc->phase checks in switch cases
(see FS_CONTEXT_CREATE_PARAMS, FS_CONTEXT_RECONF_PARAMS) will always fail
after syscall restart which will lead to returning -EBUSY to the userspace.
The idea of the fix is to save entry-time fs_context phase field value and
recover fc->phase value to the original one before exiting with
"interrupt error" (ERESTARTNOINTR or similar).
Many thanks to Andrei Vagin <avagin@gmail.com> for help with that.
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
---
 fs/fsopen.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/fs/fsopen.c b/fs/fsopen.c
index 27a890aa493a..70e6d163c169 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -209,6 +209,18 @@ SYSCALL_DEFINE3(fspick, int, dfd, const char __user *, path, unsigned int, flags
 	return ret;
 }
 
+static inline bool is_interrupt_error(int error)
+{
+	switch (error) {
+	case -EINTR:
+	case -ERESTARTSYS:
+	case -ERESTARTNOHAND:
+	case -ERESTARTNOINTR:
+		return true;
+	}
+	return false;
+}
+
 /*
  * Check the state and apply the configuration.  Note that this function is
  * allowed to 'steal' the value by setting param->xxx to NULL before returning.
@@ -217,11 +229,20 @@ static int vfs_fsconfig_locked(struct fs_context *fc, int cmd,
 			       struct fs_parameter *param)
 {
 	struct super_block *sb;
+	enum fs_context_phase entry_phase;
 	int ret;
 
 	ret = finish_clean_context(fc);
 	if (ret)
 		return ret;
+
+	/* We changing fc->phase in the code below but we need to
+	 * return fc->phase to the original value if we get
+	 * "interrupt error" during the process to make fsconfig
+	 * syscall restart procedure work correctly.
+	 */
+	entry_phase = fc->phase;
+
 	switch (cmd) {
 	case FSCONFIG_CMD_CREATE:
 		if (fc->phase != FS_CONTEXT_CREATE_PARAMS)
@@ -264,7 +285,16 @@ static int vfs_fsconfig_locked(struct fs_context *fc, int cmd,
 
 		return vfs_parse_fs_param(fc, param);
 	}
-	fc->phase = FS_CONTEXT_FAILED;
+
+	/* We should fail context only if we get real error.
+	 * If we get ERESTARTNOINTR we can safely restart
+	 * fsconfig syscall.
+	 */
+	if (is_interrupt_error(ret))
+		fc->phase = entry_phase;
+	else
+		fc->phase = FS_CONTEXT_FAILED;
+
 	return ret;
 }
 
-- 
2.25.1
^ permalink raw reply related	[flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] fsopen: fsconfig syscall restart fix
  2020-09-23 16:46 ` [PATCH 1/1] " Alexander Mikhalitsyn
@ 2020-09-23 17:03   ` Al Viro
  2020-09-23 17:19     ` Alexander Mikhalitsyn
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2020-09-23 17:03 UTC (permalink / raw)
  To: Alexander Mikhalitsyn; +Cc: avagin, linux-fsdevel, linux-kernel
On Wed, Sep 23, 2020 at 07:46:36PM +0300, Alexander Mikhalitsyn wrote:
> During execution of vfs_fsconfig_locked function we can get ERESTARTNOINTR
> error (or other interrupt error). But we changing fs context fc->phase
> field to transient states and our entry fc->phase checks in switch cases
> (see FS_CONTEXT_CREATE_PARAMS, FS_CONTEXT_RECONF_PARAMS) will always fail
> after syscall restart which will lead to returning -EBUSY to the userspace.
> 
> The idea of the fix is to save entry-time fs_context phase field value and
> recover fc->phase value to the original one before exiting with
> "interrupt error" (ERESTARTNOINTR or similar).
If you have e.g. vfs_create_tree() fail in the middle of ->get_tree(),
the only thing you can do to that thing is to discard it.  The state is
*NOT* required to be recoverable after a failure exit - quite a bit of
config might've been consumed and freed by that point.
CREATE and RECONFIGURE are simply not restartable.
^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] fsopen: fsconfig syscall restart fix
  2020-09-23 17:03   ` Al Viro
@ 2020-09-23 17:19     ` Alexander Mikhalitsyn
  2020-09-24 15:31       ` Alexander Mikhalitsyn
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Mikhalitsyn @ 2020-09-23 17:19 UTC (permalink / raw)
  To: Al Viro; +Cc: avagin, linux-fsdevel, linux-kernel
On Wed, 23 Sep 2020 18:03:22 +0100
Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Wed, Sep 23, 2020 at 07:46:36PM +0300, Alexander Mikhalitsyn wrote:
> > During execution of vfs_fsconfig_locked function we can get ERESTARTNOINTR
> > error (or other interrupt error). But we changing fs context fc->phase
> > field to transient states and our entry fc->phase checks in switch cases
> > (see FS_CONTEXT_CREATE_PARAMS, FS_CONTEXT_RECONF_PARAMS) will always fail
> > after syscall restart which will lead to returning -EBUSY to the userspace.
> > 
> > The idea of the fix is to save entry-time fs_context phase field value and
> > recover fc->phase value to the original one before exiting with
> > "interrupt error" (ERESTARTNOINTR or similar).
> 
> If you have e.g. vfs_create_tree() fail in the middle of ->get_tree(),
> the only thing you can do to that thing is to discard it.  The state is
> *NOT* required to be recoverable after a failure exit - quite a bit of
> config might've been consumed and freed by that point.
> 
> CREATE and RECONFIGURE are simply not restartable.
Thank you for quick response!
I got you idea. But as far as I understand fsopen/fsconfig API is in
early-development stage and we can think about convenience here.
Consider the typical code here:
int fsfd;
fsfd = fsopen("somefs", 0);
// a lot of:
fsconfig(fsfd, FSCONFIG_SET_FLAG, ...);
fsconfig(fsfd, FSCONFIG_SET_STRING, ...);
fsconfig(fsfd, FSCONFIG_SET_BINARY, ...);
//...
// now call:
fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0)
-> get signal here or something else
-> syscall restarted but this doesn't work because
of broken fc->phase state
-> get EBUSY
-> now we need to repeat *all* steps with
fsconfig(fsfd, FSCONFIG_SET_FLAG/FSCONFIG_SET_STRING, ...).
Speaking honestly, this looks weird.
Regards,
Alex.
^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] fsopen: fsconfig syscall restart fix
  2020-09-23 17:19     ` Alexander Mikhalitsyn
@ 2020-09-24 15:31       ` Alexander Mikhalitsyn
  0 siblings, 0 replies; 5+ messages in thread
From: Alexander Mikhalitsyn @ 2020-09-24 15:31 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: christian, dhowells
I've sent the copy to Christian and David
Cc: Christian Brauner <christian@brauner.io>
Cc: David Howells <dhowells@redhat.com>
Guys, please take a look once time permit.
Thank you.
Regards, Alex
On Wed, 23 Sep 2020 20:19:58 +0300
Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> wrote:
> On Wed, 23 Sep 2020 18:03:22 +0100
> Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> > On Wed, Sep 23, 2020 at 07:46:36PM +0300, Alexander Mikhalitsyn wrote:
> > > During execution of vfs_fsconfig_locked function we can get ERESTARTNOINTR
> > > error (or other interrupt error). But we changing fs context fc->phase
> > > field to transient states and our entry fc->phase checks in switch cases
> > > (see FS_CONTEXT_CREATE_PARAMS, FS_CONTEXT_RECONF_PARAMS) will always fail
> > > after syscall restart which will lead to returning -EBUSY to the userspace.
> > > 
> > > The idea of the fix is to save entry-time fs_context phase field value and
> > > recover fc->phase value to the original one before exiting with
> > > "interrupt error" (ERESTARTNOINTR or similar).
> > 
> > If you have e.g. vfs_create_tree() fail in the middle of ->get_tree(),
> > the only thing you can do to that thing is to discard it.  The state is
> > *NOT* required to be recoverable after a failure exit - quite a bit of
> > config might've been consumed and freed by that point.
> > 
> > CREATE and RECONFIGURE are simply not restartable.
> 
> Thank you for quick response!
> 
> I got you idea. But as far as I understand fsopen/fsconfig API is in
> early-development stage and we can think about convenience here.
> 
> Consider the typical code here:
> int fsfd;
> fsfd = fsopen("somefs", 0);
> // a lot of:
> fsconfig(fsfd, FSCONFIG_SET_FLAG, ...);
> fsconfig(fsfd, FSCONFIG_SET_STRING, ...);
> fsconfig(fsfd, FSCONFIG_SET_BINARY, ...);
> //...
> 
> // now call:
> fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0)
> -> get signal here or something else
> -> syscall restarted but this doesn't work because
> of broken fc->phase state
> -> get EBUSY
> -> now we need to repeat *all* steps with
> fsconfig(fsfd, FSCONFIG_SET_FLAG/FSCONFIG_SET_STRING, ...).
> Speaking honestly, this looks weird.
> 
> Regards,
> Alex.
^ permalink raw reply	[flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-09-24 15:32 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-09-23 16:46 [PATCH 0/1] fsopen: fsconfig syscall restart fix Alexander Mikhalitsyn
2020-09-23 16:46 ` [PATCH 1/1] " Alexander Mikhalitsyn
2020-09-23 17:03   ` Al Viro
2020-09-23 17:19     ` Alexander Mikhalitsyn
2020-09-24 15:31       ` Alexander Mikhalitsyn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).