* [PATCH v2 0/2] Improvements for domain_crash_synchronous
@ 2013-09-09 12:46 Andrew Cooper
2013-09-09 12:46 ` [PATCH v2 1/2] x86/traps: Record last extable faulting address Andrew Cooper
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Andrew Cooper @ 2013-09-09 12:46 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich
This patch series provides improvements for identifying the cause of a
domain_crash_synchronous from assembly code.
All changes for v2 are in the 2nd patch, which has been rebased on top of
staging, now the prerequisite series have been committed.
~Andrew
--
1.7.10.4
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/2] x86/traps: Record last extable faulting address
2013-09-09 12:46 [PATCH v2 0/2] Improvements for domain_crash_synchronous Andrew Cooper
@ 2013-09-09 12:46 ` Andrew Cooper
2013-09-09 12:46 ` [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous Andrew Cooper
2013-09-09 13:44 ` [PATCH v2 0/2] Improvements for domain_crash_synchronous Keir Fraser
2 siblings, 0 replies; 7+ messages in thread
From: Andrew Cooper @ 2013-09-09 12:46 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich
... so the following patch can identify the location of faults leading to a
decision to crash a domain.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <JBeulich@suse.com>
---
xen/arch/x86/traps.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 9db42c82..50fb6ba 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -89,6 +89,7 @@ static char __read_mostly opt_nmi[10] = "fatal";
string_param("nmi", opt_nmi);
DEFINE_PER_CPU(u64, efer);
+static DEFINE_PER_CPU(unsigned long, last_extable_addr);
DEFINE_PER_CPU_READ_MOSTLY(u32, ler_msr);
@@ -550,6 +551,7 @@ static inline void do_trap(
{
dprintk(XENLOG_ERR, "Trap %d: %p -> %p\n",
trapnr, _p(regs->eip), _p(fixup));
+ this_cpu(last_extable_addr) = regs->eip;
regs->eip = fixup;
return;
}
@@ -1038,6 +1040,7 @@ void do_invalid_op(struct cpu_user_regs *regs)
die:
if ( (fixup = search_exception_table(regs->eip)) != 0 )
{
+ this_cpu(last_extable_addr) = regs->eip;
regs->eip = fixup;
return;
}
@@ -1370,6 +1373,7 @@ void do_page_fault(struct cpu_user_regs *regs)
perfc_incr(copy_user_faults);
if ( unlikely(regs->error_code & PFEC_reserved_bit) )
reserved_bit_page_fault(addr, regs);
+ this_cpu(last_extable_addr) = regs->eip;
regs->eip = fixup;
return;
}
@@ -3062,6 +3066,7 @@ void do_general_protection(struct cpu_user_regs *regs)
{
dprintk(XENLOG_INFO, "GPF (%04x): %p -> %p\n",
regs->error_code, _p(regs->eip), _p(fixup));
+ this_cpu(last_extable_addr) = regs->eip;
regs->eip = fixup;
return;
}
--
1.7.10.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous
2013-09-09 12:46 [PATCH v2 0/2] Improvements for domain_crash_synchronous Andrew Cooper
2013-09-09 12:46 ` [PATCH v2 1/2] x86/traps: Record last extable faulting address Andrew Cooper
@ 2013-09-09 12:46 ` Andrew Cooper
2013-09-09 13:45 ` Jan Beulich
2013-09-09 13:44 ` [PATCH v2 0/2] Improvements for domain_crash_synchronous Keir Fraser
2 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2013-09-09 12:46 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich
As it currently stands, the string "domain_crash_sync called from entry.S" is
not helpful at identifying why the domain was crashed, and a debug build of
Xen doesn't help the matter
This patch improves the information printed, by pointing to where the crash
decision was made.
Specific improvements include:
* Moving the ascii string "domain_crash_sync called from entry.S\n" away from
some semi-hot code cache lines.
* Moving the printk into C code (especially as this_cpu() is miserable to use
in assembly code)
* Undo the previous confusing situation of having the
domain_crash_synchronous() as a macro in C code, yet a global symbol in
assembly code.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <JBeulich@suse.com>
---
Changes in v2:
* Remove incorrect .globl's for {compat_,}create_bounce_frame
* Tweak printed string format
* Introduce and use UNLIKELY_ENTRY_LABEL() to save open-coding it
* Use __UNLIKELY_END() now it is available
* Remove redundant ud2's from the unlikely codepath - being preceded by a
jmp rather than a call, there was no way they could actually be executed
---
xen/arch/x86/traps.c | 12 +++++++++
xen/arch/x86/x86_64/compat/entry.S | 11 ++++++---
xen/arch/x86/x86_64/entry.S | 48 ++++++++++++++++++------------------
xen/include/asm-x86/asm_defns.h | 4 +++
xen/include/xen/sched.h | 7 ++++++
5 files changed, 54 insertions(+), 28 deletions(-)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 50fb6ba..bc0d513 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -3748,6 +3748,18 @@ unsigned long do_get_debugreg(int reg)
return -EINVAL;
}
+void asm_domain_crash_synchronous(unsigned long addr)
+{
+ if ( addr == 0 )
+ addr = this_cpu(last_extable_addr);
+
+ printk("domain_crash_sync called from entry.S\n"
+ " fault at %p: ", _p(addr));
+ print_symbol("%s\n", addr);
+
+ __domain_crash_synchronous();
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index c0afe2c..4ec1df7 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -329,7 +329,10 @@ UNLIKELY_END(compat_bounce_failsafe)
movzwl TRAPBOUNCE_cs(%rdx),%eax
/* Null selectors (0-3) are not allowed. */
testl $~3,%eax
- jz domain_crash_synchronous
+UNLIKELY_START(z, compat_bounce_null_selector)
+ lea UNLIKELY_ENTRY_LABEL(compat_bounce_null_selector)(%rip), %rdi
+ jmp asm_domain_crash_synchronous /* Does not return */
+__UNLIKELY_END(compat_bounce_null_selector)
movl %eax,UREGS_cs+8(%rsp)
movl TRAPBOUNCE_eip(%rdx),%eax
movl %eax,UREGS_rip+8(%rsp)
@@ -339,10 +342,10 @@ UNLIKELY_END(compat_bounce_failsafe)
xorl %edi,%edi
jmp .Lft13
.previous
- _ASM_EXTABLE(.Lft1, domain_crash_synchronous)
+ _ASM_EXTABLE(.Lft1, dom_crash_sync_extable)
_ASM_EXTABLE(.Lft2, compat_crash_page_fault)
_ASM_EXTABLE(.Lft3, compat_crash_page_fault_4)
- _ASM_EXTABLE(.Lft4, domain_crash_synchronous)
+ _ASM_EXTABLE(.Lft4, dom_crash_sync_extable)
_ASM_EXTABLE(.Lft5, compat_crash_page_fault_4)
_ASM_EXTABLE(.Lft6, compat_crash_page_fault_8)
_ASM_EXTABLE(.Lft7, compat_crash_page_fault)
@@ -363,7 +366,7 @@ compat_crash_page_fault:
.Lft14: mov %edi,%fs
movl %esi,%edi
call show_page_walk
- jmp domain_crash_synchronous
+ jmp dom_crash_sync_extable
.section .fixup,"ax"
.Lfx14:
xorl %edi,%edi
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index f64e871..d44a6c1 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -370,7 +370,10 @@ create_bounce_frame:
sbb %ecx,%ecx # In +ve address space? Then okay.
cmpq %rax,%rsi
adc %ecx,%ecx # Above Xen private area? Then okay.
- jg domain_crash_synchronous
+UNLIKELY_START(g, create_bounce_frame_bad_sp)
+ lea UNLIKELY_ENTRY_LABEL(create_bounce_frame_bad_sp)(%rip), %rdi
+ jmp asm_domain_crash_synchronous /* Does not return */
+__UNLIKELY_END(create_bounce_frame_bad_sp)
movb TRAPBOUNCE_flags(%rdx),%cl
subq $40,%rsi
movq UREGS_ss+8(%rsp),%rax
@@ -429,26 +432,26 @@ UNLIKELY_END(bounce_failsafe)
movq $FLAT_KERNEL_CS,UREGS_cs+8(%rsp)
movq TRAPBOUNCE_eip(%rdx),%rax
testq %rax,%rax
- jz domain_crash_synchronous
+UNLIKELY_START(z, create_bounce_frame_bad_bounce_ip)
+ lea UNLIKELY_ENTRY_LABEL(create_bounce_frame_bad_bounce_ip)(%rip), %rdi
+ jmp asm_domain_crash_synchronous /* Does not return */
+__UNLIKELY_END(create_bounce_frame_bad_bounce_ip)
movq %rax,UREGS_rip+8(%rsp)
ret
- _ASM_EXTABLE(.Lft2, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft3, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft4, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft5, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft6, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft7, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft8, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft9, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft10, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft11, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft12, domain_crash_synchronous)
- _ASM_EXTABLE(.Lft13, domain_crash_synchronous)
-
-domain_crash_synchronous_string:
- .asciz "domain_crash_sync called from entry.S\n"
-
-ENTRY(domain_crash_synchronous)
+ _ASM_EXTABLE(.Lft2, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft3, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft4, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft5, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft6, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft7, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft8, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft9, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft10, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft11, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft12, dom_crash_sync_extable)
+ _ASM_EXTABLE(.Lft13, dom_crash_sync_extable)
+
+ENTRY(dom_crash_sync_extable)
# Get out of the guest-save area of the stack.
GET_STACK_BASE(%rax)
leaq STACK_CPUINFO_FIELD(guest_cpu_user_regs)(%rax),%rsp
@@ -459,11 +462,8 @@ ENTRY(domain_crash_synchronous)
setz %al
leal (%rax,%rax,2),%eax
orb %al,UREGS_cs(%rsp)
- # printk(domain_crash_synchronous_string)
- leaq domain_crash_synchronous_string(%rip),%rdi
- xorl %eax,%eax
- call printk
- jmp __domain_crash_synchronous
+ xorl %edi,%edi
+ jmp asm_domain_crash_synchronous /* Does not return */
/* No special register assumptions. */
ENTRY(ret_from_intr)
diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index 25032d5..34034f3 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -35,10 +35,14 @@ void ret_from_intr(void);
#ifdef __ASSEMBLY__
#define UNLIKELY_START(cond, tag) \
+ .Lunlikely.entry.tag: \
j##cond .Lunlikely.tag; \
.subsection 1; \
.Lunlikely.tag:
+#define UNLIKELY_ENTRY_LABEL(tag) \
+ .Lunlikely.entry.tag
+
#define UNLIKELY_DONE(cond, tag) \
j##cond .Llikely.tag
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index ae6a3b8..8e66d6b 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -541,6 +541,13 @@ void __domain_crash_synchronous(void) __attribute__((noreturn));
__domain_crash_synchronous(); \
} while (0)
+/*
+ * Called from assembly code, with an optional address to help indicate why
+ * the crash occured. If addr is 0, look up address from last extable
+ * redirection.
+ */
+void asm_domain_crash_synchronous(unsigned long addr) __attribute__((noreturn));
+
#define set_current_state(_s) do { current->state = (_s); } while (0)
void scheduler_init(void);
int sched_init_vcpu(struct vcpu *v, unsigned int processor);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 0/2] Improvements for domain_crash_synchronous
2013-09-09 12:46 [PATCH v2 0/2] Improvements for domain_crash_synchronous Andrew Cooper
2013-09-09 12:46 ` [PATCH v2 1/2] x86/traps: Record last extable faulting address Andrew Cooper
2013-09-09 12:46 ` [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous Andrew Cooper
@ 2013-09-09 13:44 ` Keir Fraser
2 siblings, 0 replies; 7+ messages in thread
From: Keir Fraser @ 2013-09-09 13:44 UTC (permalink / raw)
To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich
On 09/09/2013 05:46, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> This patch series provides improvements for identifying the cause of a
> domain_crash_synchronous from assembly code.
>
> All changes for v2 are in the 2nd patch, which has been rebased on top of
> staging, now the prerequisite series have been committed.
>
> ~Andrew
Acked-by: Keir Fraser <keir@xen.org>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous
2013-09-09 12:46 ` [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous Andrew Cooper
@ 2013-09-09 13:45 ` Jan Beulich
2013-09-09 13:49 ` Andrew Cooper
0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2013-09-09 13:45 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser
>>> On 09.09.13 at 14:46, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> +void asm_domain_crash_synchronous(unsigned long addr)
> +{
> + if ( addr == 0 )
> + addr = this_cpu(last_extable_addr);
> +
> + printk("domain_crash_sync called from entry.S\n"
> + " fault at %p: ", _p(addr));
> + print_symbol("%s\n", addr);
I'd prefer if all output went on a single line, so that grep-ing
though a log would turn up the fault locations. Perhaps the
"fault at" could go in parentheses at the end of the original
message?
> #define UNLIKELY_START(cond, tag) \
> + .Lunlikely.entry.tag: \
> j##cond .Lunlikely.tag; \
> .subsection 1; \
> .Lunlikely.tag:
I have to admit that I still dislike this dead label, albeit in the v2
shape it doesn't look as bad anymore. Nevertheless - why can't
you just use .Llikely.tag? That is in the original function, always
available (i.e. even - as done here - when using __UNLIKELY_END()),
and only very slightly off (pointing past the conditional branch
rather than at it).
And if we decided to stay with it, it still ask for it to be named
sensibly: It is not marking the entry of an unlikely code section
(as it sits in the "normal" code flow).
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous
2013-09-09 13:45 ` Jan Beulich
@ 2013-09-09 13:49 ` Andrew Cooper
2013-09-09 14:06 ` Jan Beulich
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2013-09-09 13:49 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Keir Fraser
On 09/09/13 14:45, Jan Beulich wrote:
>>>> On 09.09.13 at 14:46, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> +void asm_domain_crash_synchronous(unsigned long addr)
>> +{
>> + if ( addr == 0 )
>> + addr = this_cpu(last_extable_addr);
>> +
>> + printk("domain_crash_sync called from entry.S\n"
>> + " fault at %p: ", _p(addr));
>> + print_symbol("%s\n", addr);
> I'd prefer if all output went on a single line, so that grep-ing
> though a log would turn up the fault locations. Perhaps the
> "fault at" could go in parentheses at the end of the original
> message?
Certainly - I shall respin.
>
>> #define UNLIKELY_START(cond, tag) \
>> + .Lunlikely.entry.tag: \
>> j##cond .Lunlikely.tag; \
>> .subsection 1; \
>> .Lunlikely.tag:
> I have to admit that I still dislike this dead label, albeit in the v2
> shape it doesn't look as bad anymore. Nevertheless - why can't
> you just use .Llikely.tag? That is in the original function, always
> available (i.e. even - as done here - when using __UNLIKELY_END()),
> and only very slightly off (pointing past the conditional branch
> rather than at it).
>
> And if we decided to stay with it, it still ask for it to be named
> sensibly: It is not marking the entry of an unlikely code section
> (as it sits in the "normal" code flow).
>
> Jan
I suppose pointing at the end of the unlikely section is ok, but I still
prefer pointing to the actual instruction which made the decsion.
What name would you suggest? I admit that UNLIKELY_ENTRY_LABEL() is not
the best name but I couldn't think of a better name.
~Andrew
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous
2013-09-09 13:49 ` Andrew Cooper
@ 2013-09-09 14:06 ` Jan Beulich
0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2013-09-09 14:06 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser
>>> On 09.09.13 at 15:49, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 09/09/13 14:45, Jan Beulich wrote:
>>>>> On 09.09.13 at 14:46, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> +void asm_domain_crash_synchronous(unsigned long addr)
>>> +{
>>> + if ( addr == 0 )
>>> + addr = this_cpu(last_extable_addr);
>>> +
>>> + printk("domain_crash_sync called from entry.S\n"
>>> + " fault at %p: ", _p(addr));
>>> + print_symbol("%s\n", addr);
>> I'd prefer if all output went on a single line, so that grep-ing
>> though a log would turn up the fault locations. Perhaps the
>> "fault at" could go in parentheses at the end of the original
>> message?
>
> Certainly - I shall respin.
>
>>
>>> #define UNLIKELY_START(cond, tag) \
>>> + .Lunlikely.entry.tag: \
>>> j##cond .Lunlikely.tag; \
>>> .subsection 1; \
>>> .Lunlikely.tag:
>> I have to admit that I still dislike this dead label, albeit in the v2
>> shape it doesn't look as bad anymore. Nevertheless - why can't
>> you just use .Llikely.tag? That is in the original function, always
>> available (i.e. even - as done here - when using __UNLIKELY_END()),
>> and only very slightly off (pointing past the conditional branch
>> rather than at it).
>>
>> And if we decided to stay with it, it still ask for it to be named
>> sensibly: It is not marking the entry of an unlikely code section
>> (as it sits in the "normal" code flow).
>
> I suppose pointing at the end of the unlikely section is ok, but I still
> prefer pointing to the actual instruction which made the decsion.
Just so we talk the same "language" - to me "unlikely section"
means the portion of moved out of the normal code flow. And
using .Llikely.tag you'd already not have the label point into the
unlikely section (for, as we both appear to agree on, it being
slightly more involved to re-associate that out-of-line location
with the origin). All we diverge on is whether the label points
at the beginning or the end of the branch instruction sitting in
the normal code flow.
> What name would you suggest? I admit that UNLIKELY_ENTRY_LABEL() is not
> the best name but I couldn't think of a better name.
As per above, my favorite would be to not have another label
at all. Failing that, .Ldispatch.tag would come to mind. All I'm
asking for is that the label name not cause confusion through
spelling out something that it doesn't fulfill.
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-09-09 14:06 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-09 12:46 [PATCH v2 0/2] Improvements for domain_crash_synchronous Andrew Cooper
2013-09-09 12:46 ` [PATCH v2 1/2] x86/traps: Record last extable faulting address Andrew Cooper
2013-09-09 12:46 ` [PATCH v2 2/2] Xen/x86: Improve information from domain_crash_synchronous Andrew Cooper
2013-09-09 13:45 ` Jan Beulich
2013-09-09 13:49 ` Andrew Cooper
2013-09-09 14:06 ` Jan Beulich
2013-09-09 13:44 ` [PATCH v2 0/2] Improvements for domain_crash_synchronous Keir Fraser
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).