From: Minchan Kim <minchan@kernel.org>
To: Pavel Emelyanov <xemul@parallels.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
linux-mm@kvack.org
Subject: Re: Change soft-dirty interface?
Date: Fri, 14 Jun 2013 20:22:22 +0900 [thread overview]
Message-ID: <20130614112222.GB306@gmail.com> (raw)
In-Reply-To: <51BAE9F3.5030301@parallels.com>
Hello Pavel,
On Fri, Jun 14, 2013 at 02:01:23PM +0400, Pavel Emelyanov wrote:
> >>>>> If it's not allowed, another approach should be new system call.
> >>>>>
> >>>>> int sys_softdirty(pid_t pid, void *addr, size_t len);
> >>>>
> >>>> This looks like existing sys_madvise() one.
> >>>
> >>> Except pid part. It is added by your purpose, which external task
> >>> can control any process.
>
> In CRIU we can work with pid-less syscalls just fine :) So extending regular
> madvise would work.
I didn't know that.
Just out of curiosity. How can CRIU control other tasks without pid?
>
> >>>>
> >>>>> If we approach new system call, we don't need to maintain current
> >>>>> proc interface and it would be very handy to get a information
> >>>>> without pagemap (open/read/close) so we can add a parameter to
> >>>>> get a dirty information easily.
> >>>>>
> >>>>> int sys_softdirty(pid_t pid, void *addr, size_t len, unsigned char *vec)
> >>>>>
> >>>>> What do you think about it?
> >>>>>
> >>>>
> >>>> This is OK for me, though there's another issue with this API I'd like
> >>>> to mention -- consider your app is doing these tricks with soft-dirty
> >>>> and at the same time CRIU tools live-migrate it using the soft-dirty bits
> >>>> to optimize the freeze time.
> >>>>
> >>>> In that case soft-dirty bits would be in wrong state for both -- you app
> >>>> and CRIU, but with the proc API we could compare the ctime-s of the
> >>>> clear_refs file and find out, that someone spoiled the soft-dirty state
> >>>> from last time we messed with it and handle it somehow (copy all the memory
> >>>> in the worst case). Can we somehow handle this with your proposal?
> >>>
> >>> Good point I didn't think over that.
> >>> A simple idea popped from my mind is we can use read/write lock
> >>> so if pid is equal to calling process's one or pid is NULL,
> >>> we use read side lock, which can allow marking soft-dirty
> >>> several vmas with parallel. And pid is not equal to calling
> >>> process's one, the API should try to hold write-side lock
> >>> then, if it's fail, the API should return EAGAIN so that CRIU
> >>> can progress other processes and retry it after a while.
> >>> Of course, it would make live-lock so that sys_softdirty might
> >>> need another argument like "int block".
> >>
> >> And we need a flag to show SELF_SOFT_DIRTY or EXTERNAL_SOFT_DIRTY
> >> and the flag will be protected by above lock. It could prevent mixed
> >> case by self and external.
> >
> > I realized it's not enough. Another idea is here.
> > The intenion is followin as,
> >
> > self softdirty VS self softdirty -> NOT exclusive
> > self softdirty VS external softdirty -> exclusive
> > external softdirty VS external softdirty-> excluisve
>
> I think it might work for us. However, I have two comments to the
> implementation, please see below.
>
> > struct softdirty token {
> > u64 external;
> > u64 internal;
> > };
> >
> > int sys_set_softdirty(pid_t pid, unsigned long start, size_t len,
> > struct softdirty *token);
I should have mentioned that start and len are ignored if pid is not eqaul
to caller's pid.
> > int sys_get_softdirty(pid_t pid, unsigned long start, size_t len,
> > struct softdirty token, char *vec);
>
> Can you please show an example how to use these two, I don't quite get how
> can I do external soft-dirty tracking in atomic manner.
Hmm, I don't know how CRIU works but ...
while(1) {
struct softdirty token;
sys_set_softdirty(tracked_pid, 0, 0, &token);
...
...
...
if (!sys_get_softdirty(tacked_pid, 0, 0, token, NULL))
break;
}
Maybe do you have a concern about live-lock?
>
> >
> > SYSCALL(set_softdirty, ..., token)
> > {
> > struct task_struct *tsk = task_from_pid(pid);
> > mutex_lock(&mm->st_lock);
> > if (tsk == current)
> > tsk->mm->token.internal++;
> > else
> > tsk->mm->token.external++;
> > token->external = mm->token.external;
> > token->internal = mm->token.internal;
> > mutex_unlock(&mm->st_lock);
> > ..
> > ..
> >
> > }
> >
> > SYSCALL(get_softdirty, ..., token, ...)
> > {
> > struct task_struct *tsk = task_from_pid(pid);
> > mutex_lock(&mm->st_lock);
> > if (tsk == current) {
> > if (tsk->mm->token.external != token.external) {
> > mutex_unlock
> > return -EAGAIN;
> > }
> > } else {
> > if (tsk->mm->token.external != token.external ||
> > tsk->mm->token.internal != token.internal) {
> > mutex_unlock;
> > return -EAGAIN;
> > }
> > }
> > mutex_unlock(&mm->st_lock);
>
> Presumably the critical section should be longer, as if tokens match and we
> release the lock and proceed with working on pagemap, the concurrent call
> to set_softdirty can proceed and spoil the picture.
True.
--
Kind regards,
Minchan Kim
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-06-14 11:22 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-13 1:53 Change soft-dirty interface? Minchan Kim
2013-06-13 9:10 ` Pavel Emelyanov
2013-06-14 0:32 ` Minchan Kim
2013-06-14 0:41 ` Minchan Kim
2013-06-14 5:07 ` Minchan Kim
2013-06-14 10:01 ` Pavel Emelyanov
2013-06-14 11:22 ` Minchan Kim [this message]
2013-06-14 11:37 ` Pavel Emelyanov
2013-06-15 6:41 ` Minchan Kim
2013-06-19 9:31 ` Pavel Emelyanov
2013-06-21 1:41 ` Minchan Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130614112222.GB306@gmail.com \
--to=minchan@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=kosaki.motohiro@gmail.com \
--cc=linux-mm@kvack.org \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.