From: Lee Revell <rlrevell@joe-job.com>
To: Timothy Miller <miller@techsource.com>
Cc: Mark_H_Johnson@raytheon.com, Ingo Molnar <mingo@elte.hu>,
linux-kernel <linux-kernel@vger.kernel.org>,
"K.R. Foley" <kr@cybsft.com>,
Felipe Alfaro Solana <lkml@felipe-alfaro.com>,
Daniel Schmitt <pnambic@unu.nu>
Subject: Re: GPU driver misbehavior [Re: [patch] voluntary-preempt-2.6.9-rc1-bk4-Q9]
Date: Tue, 05 Oct 2004 17:12:12 -0400
Message-ID: <1097010731.28100.54.camel@krustophenia.net>
In-Reply-To: <4163078F.8080709@techsource.com>

On Tue, 2004-10-05 at 16:43, Timothy Miller wrote:
> Lee Revell wrote:
>
> >
> > "Misbehaving video card drivers are another source of significant delays
> > in scheduling user code. A number of video card manufacturers recently
> > began employing a hack to save a PCI bus transaction for each display
> > operation in order to gain a few percentage points on their WinBench
> > [Ziff-Davis 98] Graphics WinMark performance.
> >
> > The video cards have a command FIFO that is written to via the PCI bus.
> > They also have a status register, read via the PCI bus, which says
> > whether the command FIFO is full or not. The hack is to not check
> > whether the command FIFO is full before attempting to write to it, thus
> > saving a PCI bus read.
> >
> > The problem with this is that the result of attempting to write to the
> > FIFO when it is full is to stall the CPU waiting on the PCI bus write
> > until a command has been completed and space becomes available to accept
> > the new command. In fact, this not only causes the CPU to stall waiting
> > on the PCI bus, but since the PCI controller chip also controls the ISA
> > bus and mediates interrupts, ISA traffic and interrupt requests are
> > stalled as well. Even the clock interrupts stop.
> >
> > These video cards will stall the machine, for instance, when the user
> > drags a window. For windows occupying most of a 1024x768 screen on a
> > 333MHz Pentium II with an AccelStar II AGP video board (which is based
> > on the 3D Labs Permedia 2 chip set) this will stall the machine for
> > 25-30ms at a time!"
>
> I would expect that I'm not the first to think of this, but I haven't
> seen it mentioned, so it makes me wonder. Therefore, I offer my solution.
>
> Whenever you read the status register, keep a copy of the "number of
> free fifo entries" field. Whenever you're going to do a group of writes
> to the fifo, you first must check for enough free entries. The macro
> that does this checks the copy of the status register to see if there
> were enough free the last time you checked. If so, deduct the number of
> free slots you're about to use, and move on. If not, re-read the status
> register and loop or sleep if you don't have enough free.
>
> The copy of the status register will always be "correct" in that it will
> always report a number of free entries less than or equal to the actual
> number, and it will never report a number greater than what is available
> (barring a hardware glitch or a bug, which is bad for other reasons).
> This is because you're assuming the fifo doesn't drain, when in fact, it
> does.
>
> This results in nearly optimal performance, because usually you end up
> reading the status register mostly when the fifo is full (a time when
> extra bus reads don't hurt anything). If you have a 256-entry fifo,
> then you end up reading the status register once for every 256 writes,
> for a performance loss of only 0.39%, and you ONLY get this performance
> loss when the fifo drains faster than you can fill it.
>
> One challenge to this is when you have more than one entity trying to
> access the same resource. But in that case, you'll already have to be
> using some sort of mutex mechanism anyhow.
>
>
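
For reference, the cached-free-count scheme quoted above can be sketched
in C roughly as follows. This is only an illustration under stated
assumptions, not code from any real driver: the device is simulated in
memory rather than accessed through MMIO, and every name here (sim_gpu,
gpu_read_free, gpu_write_cmd, fifo_ctx, fifo_reserve, fifo_write,
count_status_reads) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 256u /* assumed depth, matching the 256-entry example */

/* Hypothetical simulated device standing in for real MMIO registers. */
struct sim_gpu {
    unsigned used;              /* entries currently queued in the FIFO */
    unsigned drain_per_read;    /* entries retired between status reads */
    unsigned long status_reads; /* number of (expensive) status reads   */
};

/* Simulated status-register read: returns the number of free entries. */
static unsigned gpu_read_free(struct sim_gpu *g)
{
    g->status_reads++;
    /* Model the device making progress since the last status read. */
    if (g->used > g->drain_per_read)
        g->used -= g->drain_per_read;
    else
        g->used = 0;
    return FIFO_DEPTH - g->used;
}

/* Simulated command write. On real hardware, writing to a full FIFO
 * would stall the bus; here an assertion fails instead. */
static void gpu_write_cmd(struct sim_gpu *g, uint32_t cmd)
{
    (void)cmd;
    assert(g->used < FIFO_DEPTH);
    g->used++;
}

struct fifo_ctx {
    struct sim_gpu *gpu;
    unsigned cached_free; /* conservative lower bound on free entries */
};

/* Reserve n entries, touching the status register only when the cached
 * count is too small. The cache never overestimates, because the device
 * only ever frees entries behind our back. */
static void fifo_reserve(struct fifo_ctx *c, unsigned n)
{
    assert(n <= FIFO_DEPTH);
    while (c->cached_free < n) {
        c->cached_free = gpu_read_free(c->gpu);
        /* A real driver might sleep or yield here instead of spinning. */
    }
    c->cached_free -= n;
}

/* Write a group of n commands after reserving space for all of them. */
static void fifo_write(struct fifo_ctx *c, const uint32_t *cmds, unsigned n)
{
    fifo_reserve(c, n);
    for (unsigned i = 0; i < n; i++)
        gpu_write_cmd(c->gpu, cmds[i]);
}

/* Issue total_writes single-command writes and report how many status
 * reads were needed. */
static unsigned long count_status_reads(unsigned long total_writes,
                                        unsigned drain_per_read)
{
    struct sim_gpu gpu = { .drain_per_read = drain_per_read };
    struct fifo_ctx ctx = { .gpu = &gpu, .cached_free = 0 };
    uint32_t cmd = 0;

    for (unsigned long i = 0; i < total_writes; i++)
        fifo_write(&ctx, &cmd, 1);
    return gpu.status_reads;
}
```

With a simulated device that always drains faster than it is filled
(drain_per_read equal to FIFO_DEPTH), 25600 writes cost 100 status
reads, i.e. one read per 256 writes, which is the 1/256 = 0.39%
overhead mentioned above. With a slow device the FIFO stays nearly
full and the status register is read far more often, but that is the
case where the extra reads cost little. If several contexts share the
FIFO, fifo_reserve and the writes that follow it would need to run
under whatever lock already serializes access to the hardware.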
AFAIK only one driver (VIA unichrome) has had this problem recently.
Thomas Hellstrom fixed it, so I added him to the cc: list. Thomas, you
mentioned there was a performance hit associated with the fix; would
this be an improvement over what you did?

Also I should add that I was quoting a research.microsoft.com whitepaper
above. But s/AccelStar II AGP/VIA CLE266/ and it applies exactly to my
results. Just want to give credit where it's due...

Lee