Re: [PATCH] 64K page support for kexec

From: Luke Browning <lukebr@linux.vnet.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org, Paul Mackerras <paulus@samba.org>,
	cbe-oss-dev@ozlabs.org, Arnd Bergmann <arnd@arndb.de>
Subject: Re: [PATCH] 64K page support for kexec
Date: Wed, 25 Apr 2007 10:06:00 -0300
Message-ID: <1177506360.24866.29.camel@luke-laptop>
In-Reply-To: <1177454894.14873.142.camel@localhost.localdomain>

On Wed, 2007-04-25 at 08:48 +1000, Benjamin Herrenschmidt wrote:
> Getting better :-)
> 
> Sorry for the constant nagging, let's say I'm a bit perfectionist...
> 
So am I.  That is why I redid the patch instead of going with the
previous version.

> > -static unsigned long slot2va(unsigned long hpte_v, unsigned long slot)
> > -{
> > -	unsigned long avpn = HPTE_V_AVPN_VAL(hpte_v);
> > -	unsigned long va;
> > -
> > -	va = avpn << 23;
> > -
> > -	if (! (hpte_v & HPTE_V_LARGE)) {
> > -		unsigned long vpi, pteg;
> > -
> > -		pteg = slot / HPTES_PER_GROUP;
> > -		if (hpte_v & HPTE_V_SECONDARY)
> > -			pteg = ~pteg;
> 
> Hrm... hpte_decode ends up being a pretty big function... I suppose
> that's ok.
> 
It is bigger because it combines two functions.  I think it is better
this way as it will make it easier to add support for 16G pages and 1T
segments in the future. 

> > +#define LP_SHIFT	12
> > +#define LP_BITS		8
> > +#define LP_MASK(i)	((((1 << LP_BITS) - 1) >> (i)) << LP_SHIFT)
> > +
> > +static void hpte_decode(hpte_t *hpte, unsigned long slot, 
> > +			int *psize, unsigned long *va)
> > +{
> > +	unsigned long hpte_r = hpte->r;
> > +	unsigned long hpte_v = hpte->v;
> > +	unsigned long avpn;
> > +	int i, size, shift, penc, avpnm_bits;
> > +		
> > +	if (!(hpte_v & HPTE_V_LARGE))
> > +		size = MMU_PAGE_4K;
> > +#if 0
> > +	else if (hpte_v & 0x4000000000000000UL)
> > +		size = MMU_PAGE_16G;
> > +#endif
> 
> Remove the above. I don't think it's right anyway. We'll deal with 16G
> pages when we start using them.

OK.  I put it there mainly as a placeholder.  How many page sizes do
you expect to support in a 1TB segment?

> 
> > +	else if (!(hpte_r & LP_MASK(0)))
> > +		size = MMU_PAGE_16M;
> 
> Is that correct ? (The above). 
> I haven't quite noticed in the previous
> instances of the patch, sorry about that but I don't think that's the
> way to detect the "old style" 16M pages... I -suspect- that the normal
> algorithm will work for them, that is, they'll have a penc of 0 which
> will match what's in the mmu_psize_defs[MMU_PAGE_16M] but it's worth
> actually testing it.

OK.
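
For reference, a minimal sketch of what the LP_MASK() test in the patch
actually checks.  The macro is copied from the patch (with 1UL instead of
1 so the arithmetic is done in unsigned long); lp_field_is_zero() is a
hypothetical helper name used only for illustration.  It shows that
!(hpte_r & LP_MASK(0)) is true whenever all eight LP bits are clear, which
is the assumption being questioned above.

```c
#include <assert.h>

#define LP_SHIFT	12
#define LP_BITS		8
#define LP_MASK(i)	((((1UL << LP_BITS) - 1) >> (i)) << LP_SHIFT)

/* LP_MASK(0) covers all eight LP bits, i.e. bits 12..19 of hpte_r. */
static_assert(LP_MASK(0) == 0xff000UL, "LP field spans bits 12..19");

/* Hypothetical helper: true when every LP bit in hpte_r is zero. */
static int lp_field_is_zero(unsigned long hpte_r)
{
	return !(hpte_r & LP_MASK(0));
}
```

Each increment of i drops the top bit of the mask, so LP_MASK(1) is
0x7f000 and LP_MASK(8) is 0.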

> 
> Now that I think about it, it's possible that I led you on the wrong
> track there initially... Sorry about that.
> 
> I think your above code will treat anything with a penc of 0 as a 16M
> page, which might be true with current implementations but is, I
> think, not mandated by the arch, is it? (I don't have my 2.03 at hand
> as I'm writing this email).

Right.  The encoding is implementation dependent.  There is a minimum
size requirement that comes into play, but that works the other way.
A penc of zero has a minimum page size requirement of 8K.

I will remove the test and use the array for the comparison.
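
A minimal sketch of the array comparison, assuming a simplified stand-in
for mmu_psize_defs[].  The struct, the enum, the table values and the
function name lookup_large_psize() are all hypothetical and only
illustrate the idea; the real table is filled in at boot.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the kernel's mmu_psize_defs[]. */
struct psize_def {
	unsigned int shift;	/* log2 of the page size, 0 = unsupported */
	int penc;		/* page size encoding found in the HPTE */
};

enum { PSIZE_4K, PSIZE_64K, PSIZE_1M, PSIZE_16M, PSIZE_COUNT };

/* Illustrative values only; PSIZE_1M is left unsupported (shift == 0). */
static const struct psize_def psize_defs[PSIZE_COUNT] = {
	[PSIZE_4K]  = { .shift = 12, .penc = 0 },
	[PSIZE_64K] = { .shift = 16, .penc = 1 },
	[PSIZE_16M] = { .shift = 24, .penc = 0 },
};

/*
 * Map a penc taken from a large-page HPTE to a page size index.
 * Returns -1 when no supported size uses that encoding, rather than
 * silently falling through to 16M.
 */
static int lookup_large_psize(int penc)
{
	int size;

	assert(penc >= 0);
	for (size = PSIZE_64K; size < PSIZE_COUNT; size++) {
		if (!psize_defs[size].shift)
			continue;	/* size not supported on this CPU */
		if (penc == psize_defs[size].penc)
			return size;
	}
	return -1;
}
```

The shift check matters here: the unsupported entry is zero-filled, so
without it a penc of 0 would match that entry before reaching 16M.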

> 
> > +	else {
> > +		for (i = 0; i < LP_BITS; i++) {
> > +			if ((hpte_r & LP_MASK(i+1)) == LP_MASK(i+1))
> > +				break;
> > +		}
> > +		penc = LP_MASK(i+1) >> LP_SHIFT;
> > +		for (size = MMU_PAGE_64K; size < MMU_PAGE_16M; size++) {
> > +			if (!mmu_psize_defs[size].shift)
> > +				continue;
> > +			if (penc == mmu_psize_defs[size].penc)
> > +				break;
> > +		}
> > +	}
> >  
> > -		vpi = ((va >> 28) ^ pteg) & htab_hash_mask;
> > +	/*
> > +	 * FIXME, this could be made more efficient by storing the type 
> > +	 * of hash algorithm in mmu_psize_defs[].  The code below assumes 
> > +	 * the number of bits in the va representing the offset in the
> > +	 * page is less than 23. This affects the hash algorithm that is
> > +	 * used. When 16G pages are supported, a new hash algorithm
> > +	 * needs to be provided.  See POWER ISA Book III.
> > +	 *
> > +	 * The code below works for 16M, 64K, and 4K pages.
> > +	 */
> 
> I'm not 100% certain about your comment. By type of hash algorithm
> you mean the segment size, right? This is not directly related to the
> page size. While 1T segments are mandatory for 16G pages, they can also
> hold normal page sizes... If we're going to implement support for 1T
> segment, we should get the segment size (and thus the hash algorithm)
> from the B bit of the PTE.

Yes, what you said is correct.  I was thinking that you might want to
double the number of array elements so that you could apply a different
hash algorithm based on the type of segment.
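
A rough sketch of taking the segment size from the PTE instead, under
the assumption that the B field occupies the top two bits of the first
HPTE doubleword, with 0 meaning a 256MB segment and 1 meaning a 1TB
segment.  B_SHIFT, the enum and seg_shift_from_hpte_v() are hypothetical
names, not existing kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

#define B_SHIFT	62	/* assumed position of the B field in hpte_v */

enum seg_size { SEG_256M = 0, SEG_1T = 1 };

/*
 * Return the segment shift (28 for 256MB, 40 for 1TB) encoded in the
 * first doubleword of an HPTE.  The hash function would then be chosen
 * from this value, independently of the page size.
 */
static unsigned int seg_shift_from_hpte_v(uint64_t hpte_v)
{
	unsigned int b = (unsigned int)(hpte_v >> B_SHIFT);

	assert(b <= SEG_1T);	/* other encodings not handled in this sketch */
	return b == SEG_1T ? 40 : 28;
}
```

With something like this, mmu_psize_defs[] would not need to grow; the
decode path would pick the hash from the segment shift alone.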

> 
> In fact, when doing 1T segments, we'll have to deal with them regardless
> of the page size (the hashing will be different for all page sizes).
> 
> > +	shift = mmu_psize_defs[size].shift;
> > +	if (mmu_psize_defs[size].avpnm)
> > +		avpnm_bits = __ilog2_u64(mmu_psize_defs[size].avpnm) + 1;
> > +	else
> > +		avpnm_bits = 0;
> > +	if (shift - avpnm_bits <= 23) {
> > +		avpn = HPTE_V_AVPN_VAL(hpte_v) << 23;
> > +
> > +		if (shift < 23) {
> > +			unsigned long vpi, pteg;
> > +
> > +			pteg = slot / HPTES_PER_GROUP;
> > +			if (hpte_v & HPTE_V_SECONDARY)
> > +				pteg = ~pteg;
> > +			vpi = ((avpn >> 28) ^ pteg) & htab_hash_mask;
> > +			avpn |= (vpi << mmu_psize_defs[size].shift);
> > +		}
> > +	}
> > +#if 0
> > +	/* 16GB page hash, p > 23 */
> > +	else {
> >  
> > -		va |= vpi << PAGE_SHIFT;
> >  	}
> > +#endif
> 
> Just don't keep the code in #if 0, just a comment about something
> needing to be done for 16G ...
> 
OK.
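
For reference, a self-contained sketch of the reconstruction that the
quoted hunk performs for pages smaller than 8MB in a 256MB segment.  The
AVPN only holds va bits 23 and up, so the missing bits are recovered by
inverting the hash using the slot number.  va_to_slot() and slot_to_va()
are hypothetical names, and the forward hash is a simplified model
(vsid XOR page index) written only to check the round trip.

```c
#include <assert.h>
#include <stdint.h>

#define HPTES_PER_GROUP	8
#define SEG_SHIFT	28	/* 256MB segments only in this sketch */

/* Simplified forward hash: first slot of the group a va would map to. */
static uint64_t va_to_slot(uint64_t va, unsigned int page_shift,
			   uint64_t hash_mask, int secondary)
{
	uint64_t vsid = va >> SEG_SHIFT;
	uint64_t page_index = (va & ((1ULL << SEG_SHIFT) - 1)) >> page_shift;
	uint64_t hash = vsid ^ page_index;

	if (secondary)
		hash = ~hash;
	return (hash & hash_mask) * HPTES_PER_GROUP;
}

/* Inverse: rebuild the va from the AVPN and the slot the HPTE sits in. */
static uint64_t slot_to_va(uint64_t avpn, uint64_t slot,
			   unsigned int page_shift, uint64_t hash_mask,
			   int secondary)
{
	uint64_t va = avpn << 23;
	uint64_t pteg, vpi;

	if (page_shift >= 23)
		return va;	/* AVPN already holds every page-number bit */

	/* The hash mask must cover the bits the AVPN does not hold. */
	assert(hash_mask >= (1ULL << (23 - page_shift)) - 1);

	pteg = slot / HPTES_PER_GROUP;
	if (secondary)
		pteg = ~pteg;
	vpi = ((va >> SEG_SHIFT) ^ pteg) & hash_mask;
	return va | (vpi << page_shift);
}
```

Any bits of vpi that overlap the AVPN carry the same values as the AVPN
bits, so the final OR does not corrupt them.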

> > -	return va;
> > +	*va = avpn;
> > +	*psize = size;
> >  }
> >  
> >  /*
> > @@ -374,8 +420,6 @@ static unsigned long slot2va(unsigned lo
> >   *
> >   * TODO: add batching support when enabled.  remember, no dynamic memory here,
> >   * athough there is the control page available...
> > - *
> > - * XXX FIXME: 4k only for now !
> >   */
> >  static void native_hpte_clear(void)
> >  {
> > @@ -383,6 +427,7 @@ static void native_hpte_clear(void)
> >  	hpte_t *hptep = htab_address;
> >  	unsigned long hpte_v;
> >  	unsigned long pteg_count;
> > +	int psize;
> >  
> >  	pteg_count = htab_hash_mask + 1;
> >  
> > @@ -408,8 +453,9 @@ static void native_hpte_clear(void)
> >  		 * already hold the native_tlbie_lock.
> >  		 */
> >  		if (hpte_v & HPTE_V_VALID) {
> > +			hpte_decode(hptep, slot, &psize, &hpte_v);
> >  			hptep->v = 0;
> > -			__tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K);
> > +			__tlbie(hpte_v, psize);
> >  		}
> >  	}
> >  
> > 
>
