public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: Andi Kleen <ak@suse.de>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Venki Pallipadi <venkatesh.pallipadi@intel.com>,
	suresh.b.siddha@intel.com, Arjan van de Ven <arjan@infradead.org>,
	Dave Jones <davej@redhat.com>
Subject: Re: CPA patchset
Date: Fri, 11 Jan 2008 08:19:36 +0100
Message-ID: <20080111071936.GA16175@elte.hu>
In-Reply-To: <20080110111248.GR25945@bingen.suse.de>


* Andi Kleen <ak@suse.de> wrote:

> > > > but that's not too smart: why don't they use WB plus clflush 
> > > > instead?
> > > 
> > > Because they need to access it WC for performance.
> > 
> > I think you have it fundamentally backwards: the best for 
> > performance is WB + clflush. What would WC offer for performance 
> > that clflush cannot do?
> 
> Cached requires the cache line to be read first before you can write 
> it.

nonsense, and you should know it. It is perfectly possible to construct 
fully written cachelines without reading the cacheline first. MOVDQA is 
SSE2, so it is in basically every CPU today - it does 16-byte aligned 
stores and can generate full cacheline writes, _without_ filling in the 
cacheline first. Bulk ops (string ops, etc.) will do full cacheline 
writes too, without filling in the cacheline. Especially with high 
performance 3D ops we do _NOT_ need any funky reads from anywhere, 
because 3D software can stream a lot of writes out: we construct a full 
frame or a portion of a frame, or upload vertices, shader scripts, 
textures, etc.

( also, _even_ when there is a cache fill pending for a partially
  written cacheline, that might go on in parallel and it is not 
  necessarily holding up the CPU unless it has an actual data dependency 
  on that. )
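
A minimal user-space sketch of the "WB + clflush" store pattern described 
above, using the SSE2 intrinsics from <emmintrin.h>. The helper name 
fill_and_flush() and the 64-byte CACHELINE constant are assumptions made 
for illustration; they are not taken from the patchset. Whether the CPU 
really skips the line fill for such stores depends on the 
microarchitecture, so this only shows the shape of the code, not a 
measurement.

```c
#include <assert.h>
#include <emmintrin.h>	/* SSE2: _mm_store_si128, _mm_clflush, _mm_mfence */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHELINE 64	/* assumed line size; real code should query CPUID */

/*
 * Hypothetical helper: fill 'len' bytes at 'dst' with 'value' using four
 * aligned 16-byte stores per cacheline, then flush every line.
 * 'dst' must be CACHELINE-aligned and 'len' a multiple of CACHELINE.
 */
static void fill_and_flush(void *dst, size_t len, uint8_t value)
{
	const __m128i v = _mm_set1_epi8((char)value);
	char *p = dst;
	size_t off;

	assert(((uintptr_t)dst % CACHELINE) == 0 && (len % CACHELINE) == 0);

	for (off = 0; off < len; off += CACHELINE) {
		/* four aligned 16-byte stores cover the whole 64-byte line */
		_mm_store_si128((__m128i *)(p + off), v);
		_mm_store_si128((__m128i *)(p + off + 16), v);
		_mm_store_si128((__m128i *)(p + off + 32), v);
		_mm_store_si128((__m128i *)(p + off + 48), v);
	}

	_mm_mfence();			/* conservative fence before flushing */
	for (off = 0; off < len; off += CACHELINE)
		_mm_clflush(p + off);	/* write the line back and evict it */
	_mm_mfence();			/* order the flushes before later accesses */
}
```

A caller would pass a buffer from aligned_alloc(CACHELINE, len) or a 
similar aligned allocator; the data remains readable afterwards, it is 
just no longer held in the cache.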

but that's totally beside the point anyway. With WC or WB accesses 
alike, if a 3D app or a driver does high-freq change_page_attr() calls, 
it will _lose_ the performance game:

> > also, it's irrelevant to change_page_attr() call frequency. Just map 
> > in everything from the card and use it. In graphics, if you remap 
> > anything on the fly and it's not a slowpath you've lost the 
> > performance game even before you began it.
> 
> The typical case would be lots of user space DRI clients supplying 
> their own buffers on the fly. There's not really a fixed pool in this 
> case, but it all varies dynamically. In some scenarios that could 
> happen quite often.

in what scenarios? Please give me in-tree examples of such high-freq 
change_page_attr() cases, where the driver authors would like to call it 
with high frequency but are unable to do it and see performance problems 
due to the WBINVD.
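
To make the "map in everything and use it" argument concrete, here is a 
small user-space sketch. Everything in it is hypothetical: 
fake_change_page_attr() is a stand-in that only counts calls, and the 
pool_* helpers and POOL_PAGES are invented for illustration. It shows one 
attribute change in the slowpath and none in the fastpath; it does not 
model the real kernel interface or any in-tree driver.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_PAGES 8	/* hypothetical pool size */

/* Hypothetical stand-in for change_page_attr(); it only counts calls. */
static unsigned long cpa_calls;

static void fake_change_page_attr(int first_page, int nr_pages)
{
	(void)first_page;
	(void)nr_pages;
	cpa_calls++;	/* in the kernel this is where a flush could be paid */
}

struct pool {
	int in_use[POOL_PAGES];
};

/* Slowpath: convert the attributes of the whole pool once. */
static void pool_init(struct pool *p)
{
	int i;

	for (i = 0; i < POOL_PAGES; i++)
		p->in_use[i] = 0;
	fake_change_page_attr(0, POOL_PAGES);
}

/* Fastpath: hand out an already-converted page, no attribute change. */
static int pool_get(struct pool *p)
{
	int i;

	for (i = 0; i < POOL_PAGES; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = 1;
			return i;
		}
	}
	return -1;	/* pool exhausted */
}

/* Fastpath: return a page to the pool, again without touching attributes. */
static void pool_put(struct pool *p, int page)
{
	p->in_use[page] = 0;
}
```

However many get/put cycles run, cpa_calls stays at the single call made 
by pool_init(), which is the amortization the argument relies on.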

	Ingo
