public inbox for linux-kernel@vger.kernel.org
From: "Thomas Hellström" <thellstrom@vmware.com>
To: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Linux kernel mailing list <linux-kernel@vger.kernel.org>,
	"Siddha, Suresh B" <suresh.b.siddha@intel.com>
Subject: Re: 2.6.29 pat issue
Date: Fri, 06 Feb 2009 10:51:26 +0100
Message-ID: <498C081E.80100@vmware.com>
In-Reply-To: <1233875311.4286.127.camel@localhost.localdomain>

Pallipadi, Venkatesh wrote:
> On Thu, 2009-02-05 at 13:32 -0800, Thomas Hellstrom wrote:
>   
>> Pallipadi, Venkatesh wrote:
>>     
>>> Only place where vm_pgoff is getting set for a PFNMAP vma is in
>>> remap_pfn_range() which maps the entire range. vm_insert_pfn() which may
>>> have sparsely populated ranges does not set vm_pgoff. What interface are
>>> you using to map discontig pages, where you are seeing these errors?
>>>
>>>       
>> Since vm_pgoff can be nonzero upon every call to a device driver's mmap
>> method (it corresponds to the @offset parameter, page shifted, given by
>> the user's mmap call), _any_ VM_PFNMAP vma can practically be assumed to
>> be linear by is_linear_pfn_mapping(), and that's an invalid assumption.
>>
>> In this particular case, we set VM_PFNMAP explicitly in the mmap method
>> and use fault() and vm_insert_pfn() to populate the vmas with PTEs 
>> pointing to private memory pages or io-space depending on where the data 
>> is currently located. The member vma->vm_pgoff is, as mentioned, set by 
>> the user-space mmap call, indicating what part of the device address 
>> space needs to be mapped.
>>
>> So in the end, we're hitting the WARN_ON_ONCE(1) near line 637 in 
>> arch/x86/mm/pat.c. We should never have ended up in reserve_pfn_range() 
>> in the first place.
>>
>>     
>
> OK. Now I understand how you are seeing that warning. I am not sure what is
> the simple way around this. There are no bits available in vm_flags that
> we can use to identify linear_pfn_mapping. I don't think you have any
> way around in the driver other than using pgoff, in order to do
> vm_insert_pfn.
> One possible way is to overload some existing flag + PFNMAP to mean
> linear pfn map. Will send a patch for this as an RFC soon.
>   
Thanks, Venki. There are a couple of other issues as well. This wasn't
the root cause of the problem; please look at the mail I just sent out.
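
To make the vm_pgoff point quoted above concrete, here is a minimal
user-space model. It assumes the check amounts to "VM_PFNMAP is set and
vm_pgoff is nonzero", which is how the discussion describes it. The
model_ names, the struct, and the flag value are hypothetical stand-ins,
not the kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical flag value; only the bit test matters in this model. */
#define MODEL_VM_PFNMAP 0x1UL

/* Stripped-down stand-in for struct vm_area_struct. */
struct model_vma {
	unsigned long vm_flags;
	unsigned long vm_pgoff;	/* mmap offset in pages */
};

/* Assumed heuristic: PFNMAP plus a nonzero vm_pgoff is treated as linear. */
static bool model_is_linear_pfn_mapping(const struct model_vma *vma)
{
	return (vma->vm_flags & MODEL_VM_PFNMAP) && vma->vm_pgoff != 0;
}

/*
 * Models a driver mmap method that sets VM_PFNMAP itself and later fills
 * PTEs one page at a time from fault(). vm_pgoff comes straight from the
 * user's mmap offset, page shifted.
 */
static struct model_vma model_driver_mmap(unsigned long user_offset,
					  unsigned int page_shift)
{
	struct model_vma vma = { MODEL_VM_PFNMAP, user_offset >> page_shift };
	return vma;
}
```

In this model, any nonzero user offset makes a sparsely populated vma
look linear, while the same vma mapped at offset 0 does not.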

>   
>>> The result of not having the caching attribute right can be bad enough
>>> to hang/crash the system. So, having this only in debug is not
>>> enough, IMO. The kernel has to enforce that UC and WC caching types are
>>> consistent at all times. And we also have to keep the identity map and
>>> other mappings that may be present for that address consistent.
>>>       
>> Indeed, it's crucial to keep the mappings consistent, but failure to do 
>> so is a kernel driver bug, it should never be the result of invalid user 
>> data.
>>
>> There are other more common kernel bugs that can be even worse and hang 
>> / crash the system. For example using uninitialized spinlocks, writing 
>> to kfreed memory etc. There is code in the kernel to detect these as 
>> well, but this code is behind debug defines.
>>
>> IMHO checking each vm_insert_pfn() for caching attribute correctness is 
>> not something that should be enabled by default, due to the CPU 
>> overhead. Production drivers should never violate this.
>>
>>     
>
> It is not a question of a single production driver. There are many
> variables here. Different drivers can be mapping the same region. There
> can be mapping from /dev/mem. There are also kernel identity and text
> mappings. So, any change of cacheability by one driver has to make sure
> it is not stepping over some other users of that pte. Kernel has to make
> sure different things co-exist in a sane way.
>   
Yes, I understand the need for this check now.
> There is an alternative to checking this in each vm_insert_pfn, as long
> as mappings are going to be contiguous (even though they may be inserted
> individually). As in include/linux/io-mapping.h, we can have a
> create_mapping which reserves the entire space, and individual map and
> unmap, which don't have to check. Maybe we need a new API for your
> use case though...
>   
I think when the issues in the previous mail are fixed, this will in the
end reduce to a possible performance problem when doing vm_insert_pfn()
into a contiguous range. A create_mapping API could be a way around this.
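
As a rough illustration of that idea, the following user-space sketch
does the range-wide attribute check once at creation time and leaves
only a bounds check for each page inserted afterwards. All names here
(model_mapping, model_create_mapping, model_insert_pfn) are hypothetical;
this is not the io-mapping.h interface or any existing kernel API.

```c
#include <stdbool.h>

enum model_cache_type { MODEL_WB, MODEL_WC, MODEL_UC };

/* Hypothetical handle describing a reserved contiguous pfn range. */
struct model_mapping {
	unsigned long base_pfn;
	unsigned long nr_pages;
	enum model_cache_type type;
	unsigned long range_checks;	/* range-wide attribute checks done */
};

/*
 * Reserve the whole range once. A real implementation would check the
 * requested caching type against other users of the range here.
 */
static bool model_create_mapping(struct model_mapping *m,
				 unsigned long base_pfn,
				 unsigned long nr_pages,
				 enum model_cache_type type)
{
	if (nr_pages == 0)
		return false;
	m->base_pfn = base_pfn;
	m->nr_pages = nr_pages;
	m->type = type;
	m->range_checks = 1;
	return true;
}

/* Per-page insert: a bounds check only, no attribute lookup. */
static bool model_insert_pfn(const struct model_mapping *m, unsigned long pfn)
{
	return pfn >= m->base_pfn && pfn - m->base_pfn < m->nr_pages;
}
```

The point of the split is that the expensive check is paid once per
mapping, not once per faulted page.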

Thanks,
Thomas



> Thanks,
> Venki
>
>   

