All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@elte.hu>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Zachary Amsden <zach@vmware.com>
Subject: Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
Date: Mon, 3 Mar 2008 23:12:30 -0500	[thread overview]
Message-ID: <20080304041230.GA22753@Krystal> (raw)
In-Reply-To: <47CC7897.1080703@goop.org>

* Jeremy Fitzhardinge (jeremy@goop.org) wrote:
> Mathieu Desnoyers wrote:
>> * Jeremy Fitzhardinge (jeremy@goop.org) wrote:
>>   
>>> Andi Kleen wrote:
>>>     
>>>> On Monday 03 March 2008 18:58:03 Jeremy Fitzhardinge wrote:
>>>>
>>>>         
>>>>> Perhaps, though that's uncached by default.
>>>>>             
>>>> ioremap_cached()
>>>>       
>>> Sure.  But given that from the perspective of this problem ioremap* is 
>>> just a wrapper for vmap, we may as well use it directly and avoid getting 
>>> tangled up in any current or future io-related stuff that ioremap may 
>>> want to do.
>>>
>>>    J
>>>     
>>
>> Would you have a quick hint on why I get a page fault with the following
>> implementation ? There is probably a fundamental detail I missed.
>>
>>
>> void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
>> {
>>         char *vaddr;
>>         struct page *pages[1];
>>
>>         BUG_ON(len > sizeof(long));
>>         BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
>>                 - ((long)addr & ~(sizeof(long) - 1)));
>>         if (kernel_text_address((unsigned long)addr))
>>                 pages[0] = virt_to_page(addr);
>>         else
>>                 vaddr = addr;
>>   
>

Hi Jeremy,

(Connecting brain...)

> What's this for?  You just overwrite vaddr with the vmap on the next line.
>

Ah, it's supposed to be

if (kernel_text_address((unsigned long)addr)) {
  pages[0] = virt_to_page(addr);
  vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
  memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
} else
  memcpy(addr, opcode, len);



>>         vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
>>   
> Do you need to handle the case of an instruction spanning a page boundary?
>

Not really :

1-byte changes :

- kprobes is a 1-byte breakpoint
- smp alternatives changes an f0 prefix into a 1-byte nop when a single
  CPU is active (the opposite is done when bringing up a second CPU).

Aligned on word size :
- immediate values are aligned on word size

At boot time only, before kernel pages are marked RO :
- alternatives and paravirt are applied at boot time only

But yes, I guess it's cheap and allows supporting users which does not
have their instructions aligned. I'll change it.

>>         memcpy(&vaddr[(unsigned long)addr & PAGE_MASK], opcode, len);
>>         if (kernel_text_address((unsigned long)addr))
>>                 vunmap(vaddr);
>>         sync_core();
>>         /* Could also do a CLFLUSH here to speed up CPU recovery; but
>>            that causes hangs on some VIA CPUs. */
>>         return addr;
>> }
>>
>>
>> [    0.149856] SMP alternatives: switching to UP code                      
>>      [    0.152009] BUG: unable to handle kernel paging request at 
>> b8902000            
> That's a userspace address, unless you've got 2G:2G, and the error code 
> says there's nothing mapped there.
>

<banging head on the wall> & PAGE_MASK -> & ~PAGE_MASK  :)

Thanks!

Mathieu

>> [    0.152009] IP: [<c03acfa1>] text_poke+0x85/0xb4                        
>>      [    0.152009] *pde = 00000000                                        
>>           [    0.152009] Oops: 0002 [#1] SMP                               
>>                [    0.152009] LTT NESTING LEVEL : 0 <0>                    
>>                     [    0.152009] Modules linked in:
>> [    0.152009] [    0.152009] Pid: 0, comm: swapper Not tainted 
>> (2.6.25-rc3-testssmp #744)
>> [    0.152009] EIP: 0060:[<c03acfa1>] EFLAGS: 00010002 CPU: 0
>> [    0.152009] EIP is at text_poke+0x85/0xb4
>> [    0.152009] EAX: f8800000 EBX: 00000001 ECX: 00000001 EDX: f8800000
>> [    0.152009] ESI: c04a3fab EDI: b8902000 EBP: c0102114 ESP: c04a3f88
>> [    0.152009]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> [    0.152009] Process swapper (pid: 0, ti=c04a2000 task=c04703a0 
>> task.ti=c04a2)
>> [    0.152009] Stack: 00000163 f8800000 c1002040 c04a400c c04a7cac 
>> c0100000 c03 [    0.152009]        90411244 00000206 c04e072c c17fb72c 
>> 0000672c c04aaefe c03 [    0.152009]        c04ab76a c17f5000 c04a896a 
>> 00000092 c04a80d7 00000008 000 [    0.152009] Call Trace:                  
>>                                     [    0.152009]  [<c03b0139>] 
>> _etext+0x0/0xf7ec7                                 [    0.152009]  
>> [<c0107814>] alternatives_smp_unlock+0x57/0x59                  [    
>> 0.152009]  [<c04aaefe>] alternative_instructions+0x152/0x157               
>> [    0.152009]  [<c03b0139>] _etext+0x0/0xf7ec7                            
>>      [    0.152009]  [<c04ab76a>] check_bugs+0x131/0x14e                   
>>           [    0.152009]  [<c04a896a>] start_kernel+0x2d1/0x327            
>>                [    0.152009]  [<c04a80d7>] unknown_bootoption+0x0/0x1e3   
>>                     [    0.152009]  =======================                
>>                          [    0.152009] Code: b9 04 00 00 00 ba 01 00 00 
>> 00 e8 a6 87 db ff 89 44 24 04 8 [    0.152009] EIP: [<c03acfa1>] 
>> text_poke+0x85/0xb4 SS:ESP 0068:c04a3f88       [    0.152009] ---[ end 
>> trace ca143223eefdc828 ]---                             [    0.152009] 
>> Kernel panic - not syncing: Attempted to kill the idle task!       
>
>    J

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2008-03-04  4:12 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-29 20:39 bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives" Jeremy Fitzhardinge
2008-02-29 20:56 ` H. Peter Anvin
2008-02-29 21:09   ` Jeremy Fitzhardinge
2008-03-03 17:30     ` Mathieu Desnoyers
2008-03-03 17:35       ` Andi Kleen
2008-03-03 17:58       ` Jeremy Fitzhardinge
2008-03-03 18:03         ` Andi Kleen
2008-03-03 18:05           ` Jeremy Fitzhardinge
2008-03-03 20:53             ` Mathieu Desnoyers
2008-03-03 22:15               ` Jeremy Fitzhardinge
2008-03-04  4:12                 ` Mathieu Desnoyers [this message]
2008-02-29 21:24 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080304041230.GA22753@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=ak@suse.de \
    --cc=hpa@zytor.com \
    --cc=jeremy@goop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=zach@vmware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.