linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Garzik <jgarzik@pobox.com>
To: Andy Warner <andyw@pobox.com>
Cc: Andrew Morton <akpm@osdl.org>,
	torvalds@osdl.org, benh@kernel.crashing.org,
	linux-kernel@vger.kernel.org,
	"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>
Subject: Re: page fault scalability patch V12 [0/7]: Overview and performance tests
Date: Thu, 06 Jan 2005 18:40:40 -0500	[thread overview]
Message-ID: <41DDCC78.9080201@pobox.com> (raw)
In-Reply-To: <20041202083030.A30927@florence.linkmargin.com>

Andy Warner wrote:
> Jeff Garzik wrote:
> 
>>[...]
>>I am currently chasing a 2.6.8->2.6.9 SATA regression, which causes 
>>ata_piix (Intel ICH5/6/7) to not-find some SATA devices on x86-64 SMP, 
>>but works on UP.  Potentially related to >=4GB of RAM.
>>
>>
>>
>>Details, in case anyone is interested:
>>Unless my code is screwed up (certainly possible), PIO data-in [using 
>>the insw() call] seems to return all zeroes on a true-blue SMP machine, 
>>for the identify-device command.  When this happens, libata (correctly) 
>>detects a bad id page and bails.  (problem doesn't show up on single CPU 
>>w/ HT)
> 
> 
> Ah, I might have been here recently, with the pass-thru stuff.
> 
> What I saw was that in an SMP machine:
> 
> 1. queue_work() can result in the work running (on another
>    CPU) instantly.
> 
> 2. Having one CPU beat on PIO registers reading data from one port
>    would significantly alter the timing of the CMD->BSY->DRQ sequence
>    used in PIO. This behaviour was far worse for competing ports
>    within one chip, which I put down to arbitration problems.
> 
> 3. CPU utilisation would go through the roof. Effectively the
>    entire pio_task state machine reduced to a busy spin loop.
> 
> 4. The state machine needed some tweaks, especially in error
>    handling cases.
> 
> I made some changes, which effectively solved the problem for promise
> TX4-150 cards, and was going to test the results on other chipsets
> next week before speaking up. Specifically, I have seen some
> issues with SiI 3114 cards.
> 
> I was trying to explore using interrupts instead of polling state
> but for some reason, I was not getting them for PIO data operations,
> or I misunderstand the spec, after removing ata_qc_set_polling() - again
> I saw a difference in behaviour between the Promise & SiI cards
> here.
> 
> I'm about to go offline for 3 days, and hadn't prepared for this
> yet. The best I can do is provide a patch (attached) that applies
> against 2.6.9. It also seems to apply against libata-2.6, but
> barfs a bit against libata-dev-2.6.
> 
> The changes boil down to these:
> 
> 1. Minor changes in how status/error regs are read.
>    Including attempts to use altstatus, while I was
>    exploring interrupts.
> 
> 2. State machine logic changes.
> 
> 3. Replace calls to queue_work() with queue_delayed_work()
>    to stop SMP machines going crazy.
> 
> With these changes, on a platform consisting of 2.6.9 and
> Promise TX4-150 cards, I can move terabytes of parallel
> PIO data, without error.
> 
> My gut says that the PIO mechanism should be overhauled, I
> composed a "how much should we pay for this muffler" email
> to linux-ide at least twice while working on this, but never
> sent it - wanting to send a solution in rather than just
> making more comments from the peanut gallery.
> 
> I'll pick up the thread on this next week, when I'm back online.
> I hope this helps.

Please let me know if you still have problems?

The PIO SMP problems seem to be fixed here.

	Jeff

      reply	other threads:[~2005-01-06 23:40 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <Pine.LNX.4.44.0411221457240.2970-100000@localhost.localdomain>
     [not found] ` <Pine.LNX.4.58.0411221343410.22895@schroedinger.engr.sgi.com>
     [not found]   ` <Pine.LNX.4.58.0411221419440.20993@ppc970.osdl.org>
     [not found]     ` <Pine.LNX.4.58.0411221424580.22895@schroedinger.engr.sgi.com>
     [not found]       ` <Pine.LNX.4.58.0411221429050.20993@ppc970.osdl.org>
     [not found]         ` <Pine.LNX.4.58.0412011539170.5721@schroedinger.engr.sgi.com>
     [not found]           ` <Pine.LNX.4.58.0412011608500.22796@ppc970.osdl.org>
     [not found]             ` <41AEB44D.2040805@pobox.com>
     [not found]               ` <20041201223441.3820fbc0.akpm@osdl.org>
2004-12-02  7:00                 ` page fault scalability patch V12 [0/7]: Overview and performance tests Jeff Garzik
2004-12-02  7:05                   ` Benjamin Herrenschmidt
2004-12-02  7:11                     ` Jeff Garzik
2004-12-02 11:16                       ` Benjamin Herrenschmidt
2004-12-02 14:30                   ` Andy Warner
2005-01-06 23:40                     ` Jeff Garzik [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41DDCC78.9080201@pobox.com \
    --to=jgarzik@pobox.com \
    --cc=akpm@osdl.org \
    --cc=andyw@pobox.com \
    --cc=benh@kernel.crashing.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).