From: Lee Revell <rlrevell@joe-job.com>
To: Timothy Miller <miller@techsource.com>
Cc: Mark_H_Johnson@raytheon.com, Ingo Molnar <mingo@elte.hu>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"K.R. Foley" <kr@cybsft.com>,
	Felipe Alfaro Solana <lkml@felipe-alfaro.com>,
	Daniel Schmitt <pnambic@unu.nu>
Subject: Re: GPU driver misbehavior  [Re: [patch] voluntary-preempt-2.6.9-rc1-bk4-Q9]
Date: Tue, 05 Oct 2004 17:12:12 -0400
Message-ID: <1097010731.28100.54.camel@krustophenia.net>
In-Reply-To: <4163078F.8080709@techsource.com>

On Tue, 2004-10-05 at 16:43, Timothy Miller wrote:
> Lee Revell wrote:
> 
> > 
> > "Misbehaving video card drivers are another source of significant delays
> > in scheduling user code. A number of video cards manufacturers recently
> > began employing a hack to save a PCI bus transaction for each display
> > operation in order to gain a few percentage points on their WinBench
> > [Ziff-Davis 98] Graphics WinMark performance.
> > 
> > The video cards have a command FIFO that is written to via the PCI bus.
> > They also have a status register, read via the PCI bus, which says
> > whether the command FIFO is full or not. The hack is to not check
> > whether the command FIFO is full before attempting to write to it, thus
> > saving a PCI bus read.
> > 
> > The problem with this is that the result of attempting to write to the
> > FIFO when it is full is to stall the CPU waiting on the PCI bus write
> > until a command has been completed and space becomes available to accept
> > the new command. In fact, this not only causes the CPU to stall waiting
> > on the PCI bus, but since the PCI controller chip also controls the ISA
> > bus and mediates interrupts, ISA traffic and interrupt requests are
> > stalled as well. Even the clock interrupts stop.
> > 
> > These video cards will stall the machine, for instance, when the user
> > drags a window. For windows occupying most of a 1024x768 screen on a
> > 333MHz Pentium II with an AccelStar II AGP video board (which is based
> > on the 3D Labs Permedia 2 chip set) this will stall the machine for
> > 25-30ms at a time!"
> 
> I would expect that I'm not the first to think of this, but I haven't 
> seen it mentioned, so it makes me wonder.  Therefore, I offer my solution.
> 
> Whenever you read the status register, keep a copy of the "number of 
> free fifo entries" field.  Whenever you're going to do a group of writes 
> to the fifo, you first must check for enough free entries.  The macro 
> that does this checks the copy of the status register to see if there 
> were enough free the last time you checked.  If so, deduct the number of 
> free slots you're about to use, and move on.  If not, re-read the status 
> register and loop or sleep if you don't have enough free.
> 
> The copy of the status register will always be "correct" in that it will 
> always report a number of free entries less than or equal to the actual 
> number, and it will never report a number greater than what is available 
> (barring a hardware glitch or a bug, which is bad for other reasons). 
> This is because you're assuming the fifo doesn't drain, when in fact, it 
> does.
> 
> This results in nearly optimal performance, because usually you end up 
> reading the status register mostly when the fifo is full (a time when 
> extra bus reads don't hurt anything).  If you have a 256-entry fifo, 
> then you end up reading the status register once for every 256 writes, 
> for a performance loss of only 0.39%, and you ONLY get this performance 
> loss when the fifo drains faster than you can fill it.
> 
> One challenge to this is when you have more than one entity trying to 
> access the same resource.  But in that case, you'll already have to be 
> using some sort of mutex mechanism anyhow.
> 
> 
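
For reference, a minimal sketch of the cached free-count scheme described
above. Everything here is hypothetical and is not taken from any real
driver: the names (sim_gpu, fifo_cache, fifo_reserve), the 256-entry depth,
and the simulated device that stands in for real MMIO status reads and FIFO
writes. The cached count can only underestimate the free space, so a
successful reserve never writes to a full FIFO.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 256u   /* assumed depth, matching the 256-entry example */

/* Hypothetical simulated device; a real driver would use MMIO accessors. */
struct sim_gpu {
    unsigned used;          /* entries currently queued in the command FIFO */
    unsigned status_reads;  /* number of (expensive) status register reads */
};

/* Driver-side copy of the "free entries" field from the last status read,
 * minus the slots handed out since then. */
struct fifo_cache {
    unsigned free_cached;
};

/* Stand-in for the PCI read of the status register. */
static unsigned read_status_free(struct sim_gpu *gpu)
{
    gpu->status_reads++;
    return FIFO_DEPTH - gpu->used;
}

/* Stand-in for the hardware completing n queued commands. */
static void sim_drain(struct sim_gpu *gpu, unsigned n)
{
    gpu->used = (n > gpu->used) ? 0 : gpu->used - n;
}

/* Reserve n slots before a group of writes. Returns 1 on success, or 0 if
 * even a fresh status read shows too little room, in which case the caller
 * should loop or sleep and try again. */
static int fifo_reserve(struct sim_gpu *gpu, struct fifo_cache *c, unsigned n)
{
    if (c->free_cached < n) {
        /* Cached value is too small: refresh it from the hardware. */
        c->free_cached = read_status_free(gpu);
        if (c->free_cached < n)
            return 0;
    }
    c->free_cached -= n;   /* deduct the slots we are about to use */
    return 1;
}

/* Stand-in for one command write; only call after a successful reserve. */
static void fifo_write(struct sim_gpu *gpu, uint32_t cmd)
{
    (void)cmd;
    assert(gpu->used < FIFO_DEPTH);  /* real hardware would stall the bus here */
    gpu->used++;
}
```

When the FIFO never drains, this costs one status read per 256 writes
(1/256, about 0.39%), which is the figure quoted above. With more than one
writer, fifo_reserve and the writes that follow it would need to sit under
the same lock.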

AFAIK only one driver (VIA Unichrome) has had this problem recently. 
Thomas Hellstrom fixed it, so I added him to the cc: list.  Thomas, you
mentioned there was a performance hit associated with the fix; would
this be an improvement over what you did?

Also I should add that I was quoting a research.microsoft.com whitepaper
above.  But s/AccelStar II AGP/VIA CLE266/ and it applies exactly to my
results.  Just want to give credit where it's due...

Lee

