From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261466AbVGGRc6 (ORCPT ); Thu, 7 Jul 2005 13:32:58 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261492AbVGGR20 (ORCPT ); Thu, 7 Jul 2005 13:28:26 -0400 Received: from wproxy.gmail.com ([64.233.184.199]:53419 "EHLO wproxy.gmail.com") by vger.kernel.org with ESMTP id S261513AbVGGR2M convert rfc822-to-8bit (ORCPT ); Thu, 7 Jul 2005 13:28:12 -0400 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=fH3pLYy0SghW5FKYg/xtlwarXcMi9FW+FWG84LUxnta5mOggs8trgwjXzQrwELZZwBm0Ya2jWTZFM3zrGOC7TmA+qrVIx1Iwptedx2jm/GjGgZqrG/kGV6Ij27G6CObzdB1pJFe1C/gyiKBoAUP625114pEO/IBKHi4foRktDQ0= Message-ID: Date: Thu, 7 Jul 2005 12:28:09 -0500 From: Sizhao Yang Reply-To: Sizhao Yang To: linux-kernel@vger.kernel.org Subject: ASPLOV miss ratio porting to planet labs kernel Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7BIT Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Hi all, I was wondering if someone could help me with this. I'm porting an ASPLOV paper miss ratio curve from 2.4.20 2.6.11.6 and eventually to Planet Labs kernel. It's a novel idea for memory management. In porting I at run time I'm consistently hitting kernel bugs at four different places bad_page, bad_range, in rmap.c BUG(page_mapcount(page)< 0), and failing at apm_do_idle. All of these functions except apm_do_idle seem to be new functions from 2.4.20 to 2.6.11.6. I'm pretty sure I'm forgetting to account for certain things when modifying the pages, but I'm not sure where. What I'm doing in the port is resetting protection bits so that when it page faults. It will calculate a miss ratio based on the number of accessed bits and other information. After I gather the information I will reset the accessed bits. Then based on previous miss ratios and current miss ratio it will give out memory to different processes based on that. That's the general idea. For more specifics: http://carmen.cs.uiuc.edu/paper/ASPLOS04-Zhou.pdf I've narrowed it down to primarily when I call the following functions: ptep_test_and_clear_young, static inline pte_t pte_mknominor(pte_t pte) { (pte).pte_low &= ~_PAGE_PROTNONE; return pte; } static inline pte_t pte_mkminor(pte_t pte) { (pte).pte_low |= _PAGE_PROTNONE; return pte; } static inline pte_t pte_mkpresent(pte_t pte) { (pte).pte_low |= _PAGE_PRESENT; return pte; } static inline pte_t pte_mkabsent(pte_t pte) { (pte).pte_low &= ~_PAGE_PRESENT; return pte; } When I don't have those functions in my code the kernel doesn't crash, but when I do they crash. So, my question is am I to page accounting aspects? I looked at rmap functions for incrementing the _mapcount but they seem to be only for when a pte is copied. Should I be incrementing the pagecount at any point? rmap.c BUG(page_mapcount(page)< 0) is invoked when the accessed bits are cleared in zap_pte, but I don't know how the page is being corrupted. So, my main issue is if anyone can help point me to the direction where a page could be corrupted I would greatly appreciate. I'd appreciate any general direction anyone can point me at. Thanks in advance. I look forward to hearing from anyone. Zao