All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Oren Laadan <orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Avi Kivity <avi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Alexey Dobriyan
	<adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH] c/r: tolerate X86_EFLAGS_RF on restart
Date: Mon, 26 Oct 2009 09:57:47 -0500	[thread overview]
Message-ID: <20091026145746.GA23564@us.ibm.com> (raw)
In-Reply-To: <1256323024-20268-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>

Quoting Oren Laadan (orenl@librato.com):
> On restart, the c/r code that sanitizes a task's EFLAGS failed with an
> error if the X86_EFLAGS_RF was set. However, there are cases in which
> a task legitimately has this flag recorded at checkpoint, such as:
> 
> 1) We are running inside a KVM guest, and the guest is being debugged
> (see arch/x86/kvm/{svm,vmx}.c, via callbacks from qemu.
> 
> 2) The kernel itself is being debugged using kgbd.
> 
> 3) Exceptions that occur under certain condition will set this flag
> on the currently running task. For details see page 679 in Intel's
> manual (http://www.intel.com/Assets/PDF/manual/253668.pdf), like:
> 
>   When calling an event handler, Intel 64 and IA-32 processors
>   establish the value of the RF flag in the EFLAGS image pushed on the
>   stack:
>     • For any fault-class exception except a debug exception
>       generated in response to an instruction breakpoint, the value
>       pushed for RF is 1.
>     • For any interrupt arriving after any iteration of a repeated
>       string instruction but the last iteration, the value pushed for
>       RF is 1.
>     • For any trap-class exception generated by any iteration of a
>       repeated string instruction but the last iteration, the value
>       pushed for RF is 1.

I wasn't quite sure what a 'trap-class exception' is,
http://www.yashks.com/exception-basics.html says:

Example: INT 3(or CC or CD 21) is a trap instruction. If a program has this
instruction and if the debugger is running, it will return the next instruction
address to this debugger. Where debugger can continue it debugging.

Does that not mean that if we checkpoint t1 and t2 where t1 is ptracing t2, t2
might have RF set to 1?  So this isn't generally safe, right?

IIUC it also could be set for things like a "General Protection Exception(Type:
Fault)" at checkpoint in which case we could really mess up the sw at
restart.

Or maybe I'm totally misunderstanding...?

>     • For other cases, the value pushed for RF is the value that was
>       in EFLAG.RF at the time the event handler was called.
>   [...]
> 
> Case #3 is consistent with the fact that in all cases where we
> observed the issue there was a sendmail process doing IO.
> 
> Clearly, the X86_EFLAGS_RF is valid to be set.
> 
> This patches fixes the restart to:
> 
> 1) Silently ignore the X86_EFLAGS_RF if was set at checkpoint: this
> allows restart to tolerate the flag and safely complete the restart.
> 
> 2) Preserve the existing X86_EFLAGS_RF of a restarting task (instead
> of overwriting it): this ensures that restarting does not interfere
> with kvm, kgdb, etc.
> 
> Signed-off-by: Oren Laadan <orenl@cs.columbia.edu>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Avi Kivity <avi@redhat.com>
> ---
>  arch/x86/mm/checkpoint.c |   40 ++++++++++++++++++++++++++++++++++++++--
>  1 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/mm/checkpoint.c b/arch/x86/mm/checkpoint.c
> index 9dd8e12..a93bb6f 100644
> --- a/arch/x86/mm/checkpoint.c
> +++ b/arch/x86/mm/checkpoint.c
> @@ -27,13 +27,49 @@ static int check_eflags(__u32 eflags)
>  #define X86_EFLAGS_CKPT_MASK  \
>  	(X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF | X86_EFLAGS_ZF | \
>  	 X86_EFLAGS_SF | X86_EFLAGS_TF | X86_EFLAGS_DF | X86_EFLAGS_OF | \
> -	 X86_EFLAGS_NT | X86_EFLAGS_AC | X86_EFLAGS_ID)
> +	 X86_EFLAGS_NT | X86_EFLAGS_AC | X86_EFLAGS_ID | X86_EFLAGS_RF)
>  
>  	if ((eflags & ~X86_EFLAGS_CKPT_MASK) != (X86_EFLAGS_IF | 0x2))
>  		return 0;
>  	return 1;
>  }
>  
> +static void restore_eflags(struct pt_regs *regs, __u32 eflags)
> +{
> +	/*
> +	 * A task may have had X86_EFLAGS_RF set at checkpoint, .e.g:
> +	 * 1) It ran in a KVM guest, and the guest was being debugged,
> +	 * 2) The kernel was debugged using kgbd,
> +	 * 3) From Intel's manual: "When calling an event handler, 
> +	 *    Intel 64 and IA-32 processors establish the value of the
> +	 *    RF flag in the EFLAGS image pushed on the stack:
> +	 *  - For any fault-class exception except a debug exception
> +	 *    generated in response to an instruction breakpoint, the
> +	 *    value pushed for RF is 1.
> +	 *  - For any interrupt arriving after any iteration of a
> +	 *    repeated string instruction but the last iteration, the
> +	 *    value pushed for RF is 1.
> +	 *  - For any trap-class exception generated by any iteration
> +	 *    of a repeated string instruction but the last iteration,
> +	 *    the value pushed for RF is 1.
> +	 *  - For other cases, the value pushed for RF is the value
> +	 *    that was in EFLAG.RF at the time the event handler was
> +	 *    called.
> +	 *  [from: http://www.intel.com/Assets/PDF/manual/253668.pdf]
> +	 *
> +	 * So: (1) silently ignore X86_EFLAGS_RF from checkpoint time,
> +	 * as it's irrelevant for the current process, (2) preserve
> +	 * the existing X86_EFLAGS_RF of this process as it it may be
> +	 * used by kvm/kgdb. Disable preemption to protect test and
> +	 * change of our eflags.
> +	 */
> +	preempt_disable();
> +	eflags &= ~X86_EFLAGS_RF;
> +	eflags |= (regs->flags & X86_EFLAGS_RF);
> +	regs->flags = eflags;
> +	preempt_enable();
> +}
> +
>  static int check_tls(struct desc_struct *desc)
>  {
>  	if (!desc->a && !desc->b)
> @@ -433,7 +469,7 @@ static int load_cpu_regs(struct ckpt_hdr_cpu *h, struct task_struct *t)
>  	regs->orig_ax = h->orig_ax;
>  	regs->ip = h->ip;
>  
> -	regs->flags = h->flags;
> +	restore_eflags(regs, h->flags);
>  	regs->sp = h->sp;
>  
>  	regs->ds = decode_segment(h->ds);
> -- 
> 1.6.0.4
> 
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

  parent reply	other threads:[~2009-10-26 14:57 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-23 18:37 [PATCH] c/r: tolerate X86_EFLAGS_RF on restart Oren Laadan
     [not found] ` <1256323024-20268-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-10-26 14:57   ` Serge E. Hallyn [this message]
     [not found]     ` <20091026145746.GA23564-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-29  1:20       ` Oren Laadan
     [not found]         ` <4AE8EDC4.5010202-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-10-29 14:31           ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091026145746.GA23564@us.ibm.com \
    --to=serue-r/jw6+rmf7hqt0dzr+alfa@public.gmane.org \
    --cc=adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=avi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.