From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Nouveau on dom0
Date: Thu, 4 Mar 2010 13:25:18 -0500
Message-ID: <20100304182518.GB20263@phenom.dumpdata.com>
References: <20100225125552.GC9040@phenom.dumpdata.com>
 <20100225174411.GA13270@phenom.dumpdata.com>
 <20100301160130.GB7881@phenom.dumpdata.com>
 <20100303181303.GA21078@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Arvind R
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Thu, Mar 04, 2010 at 02:47:58PM +0530, Arvind R wrote:
> On Wed, Mar 3, 2010 at 11:43 PM, Konrad Rzeszutek Wilk wrote:
> >> > aio-write -
> >>
> >> which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault
> >> and finally ttm_bo_vm_fault.
>
> > I've attached a simple patch I wrote some time ago to get the real MFNs
> > and their page protection. I think you can adapt it (print_data function to be exact)
> > to peek at the PTE and its protection values.
> Have patched - did not apply clean. Will compile and get some info.

Right. I don't think it would help you immediately - I was thinking
you could take the print_data function and just jam it in the
ttm_bo_vm_fault code and use it to print the PTE data.

>
> > There is an extra flag that the PTE can have when running under Xen: _PAGE_IOMAP.
> > This signifies that the PFN is actually the MFN. In this case though
> > it shouldn't be enabled b/c the memory is actually gathered from
> > alloc_page. But if it is, it might be the culprit.
>
> >> What can possibly cause the fault-handler to repeat endlessly?
> > FYI: about 2000 times a second - slowed by printk
>
> >> If a wrong page is backed at the user-address, it should create bad_access or
> >> some other subsequent events - but the system is running fine minus all local
> > So you see this fault handler being called endlessly while the machine
> > is still running and other pieces of code work just fine, right?
> Right. Can ssh in - but no local console
>
> >> ttm_tt_get_page calls alloc in a loop - so it may allocate multiple pages from
> >> start/end depending on Highmem memory or not - implying asynchronous allocation
> >> and mapping.
> >
> > I thought it had some logic to figure out that it already handled this
> > page and would return an already allocated page?
> Right.
>
> I think the problem lies in the vm_insert_pfn/page/mixed family of functions.
> These are only used (grep'ed kernel tree) and invariably for mmapping.
> Scsi-tgt, mspec, some media/video, poch, android in staging and ttm
> - and, surprise - xen/blktap/ring.c and device.c
> - which both check XENFEAT_auto_translated_physmap
>
> Pls. look at xen/blktap/ring.c - it looks to be what we need

Let me take a look at it tomorrow. Bit swamped.
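
In the meantime, here is a rough user-space sketch of the kind of decoding
I have in mind for the PTE data. This is not the print_data function from
the patch; pte_frame and pte_describe are made-up helper names. Bits 0-6
and the NX bit follow the x86 page-table layout. Treating bit 10 as
_PAGE_IOMAP is an assumption based on the Xen pvops trees, so check it
against pgtable_types.h in the tree you are building.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* x86 PTE flag bits. Bit 10 as _PAGE_IOMAP is an ASSUMPTION (Xen pvops
 * trees); verify against your kernel's pgtable_types.h. */
#define PTE_PRESENT    (1ULL << 0)
#define PTE_RW         (1ULL << 1)
#define PTE_USER       (1ULL << 2)
#define PTE_PWT        (1ULL << 3)
#define PTE_PCD        (1ULL << 4)
#define PTE_ACCESSED   (1ULL << 5)
#define PTE_DIRTY      (1ULL << 6)
#define PTE_IOMAP      (1ULL << 10)   /* assumed bit position */
#define PTE_NX         (1ULL << 63)
#define PTE_FRAME_MASK 0x000ffffffffff000ULL

struct pte_flag {
    uint64_t bit;
    const char *name;
};

static const struct pte_flag pte_flags[] = {
    { PTE_PRESENT, "PRESENT" }, { PTE_RW, "RW" },
    { PTE_USER, "USER" },       { PTE_PWT, "PWT" },
    { PTE_PCD, "PCD" },         { PTE_ACCESSED, "ACCESSED" },
    { PTE_DIRTY, "DIRTY" },     { PTE_IOMAP, "IOMAP" },
    { PTE_NX, "NX" },
};

/* Hypothetical helper: frame number stored in the PTE. Under Xen PV this
 * is the MFN, not the guest's PFN. */
static uint64_t pte_frame(uint64_t pte)
{
    return (pte & PTE_FRAME_MASK) >> 12;
}

/* Hypothetical helper: write "frame=0x... FLAG FLAG ..." into buf. */
static void pte_describe(uint64_t pte, char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL && len > 0);
    snprintf(buf, len, "frame=0x%llx", (unsigned long long)pte_frame(pte));
    for (i = 0; i < sizeof(pte_flags) / sizeof(pte_flags[0]); i++) {
        if (pte & pte_flags[i].bit) {
            size_t used = strlen(buf);
            /* snprintf truncates safely if buf is too small */
            snprintf(buf + used, len - used, " %s", pte_flags[i].name);
        }
    }
}
```

Inside ttm_bo_vm_fault you would feed this the raw value from pte_val()
and print the result with printk. If IOMAP shows up on a page that came
from alloc_page, that is the thing to chase.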