public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>, linux-kernel@vger.kernel.org
Subject: Re: alpha iommu fixes
Date: Sun, 20 May 2001 04:40:13 +0200	[thread overview]
Message-ID: <20010520044013.A18119@athlon.random> (raw)
In-Reply-To: <20010518214617.A701@jurassic.park.msu.ru> <20010519155502.A16482@athlon.random> <20010519231131.A2840@jurassic.park.msu.ru>
In-Reply-To: <20010519231131.A2840@jurassic.park.msu.ru>; from ink@jurassic.park.msu.ru on Sat, May 19, 2001 at 11:11:31PM +0400

On Sat, May 19, 2001 at 11:11:31PM +0400, Ivan Kokshaysky wrote:
> On Sat, May 19, 2001 at 03:55:02PM +0200, Andrea Arcangeli wrote:
> > Reading the tsunami specs I learnt 1 tlb entry caches 8 pagetables (not 1)
> > so the tlb flush will be invalidate immediatly by any PCI DMA run after
> > the flush on any of the other 7 mappings cached in the same tlb entry.
> 
> I have neither tsunami docs nor the tsunami box to play with :-(
> so my guesses might be totally wrong...
> But -- assuming that tsunami is similar to cia/pyxis, that is incorrect.
> We're invalidating not the cached ptes, but the TLB tags, with all 4 (on
> pyxis, and 8 on tsunami, I guess) associated ptes. The reason why we

exactly.

> align new entries at 4*PAGE_SIZE on cia/pyxis is a hardware bug -- if cached
> pte is invalid, it doesn't cause TLB miss. I wouldn't be surprised at all if
> tsunami has the same bug; in this case your fix is urgently needed, of course.

It only depends on the specs if it has to be called a bug or a feature.

> BTW, look at Richard's code in core_cia.c/verify_tb_operation() for
> "valid tag invalid pte reload" test, it could be easily ported to tsunami.

I didn't checked very closely what this code is doing but it seems it's
not triggering any DMA transaction from a DMA bus master so it shouldn't
be able to trigger the race, and as far I can tell as soon as you do DMA
on an invalid pagetable cached in a tlb the machine will lock hard. So I
expect if you try to probe if you need the 8 alignment at runtime you
won't be able to finish the probe ;).

> > then I also enlarged the pci SG space to 1G beause runing out of entries
> > right now breaks the whole world:
> 
> It would just delay the painful death, I think ;-)

I _have_ to completly hide the painful death because as soon as I run
out of entries the machine crashes immedatly because lots of drivers
aren't checking for running out of ptes.

Fixing that is a brainer thing, it may need to partly redesign the
driver so you can take a fail path where you coulnd't previously, in
some place you may need to re-issue a softirq later to try again, in
others you must run_task_queue(&tq_disk) and sched_yield and try again
later (you have the guarantee those entries will return available so it
would be not deadlock prone to do that), in other places you can just
drop the skb.  Each driver has to be fixed in its right way. It seems
for a lot of cases people just replaced the virt_to_bus with the
pci_map_single and they didn't thought pci_map_single may even return
0 which _doesn't_ mean bus address 0 ;)

The reason it's not too bad to hide it is that you can usually calculate
an high bound of how many pci mappings a certain given machine may need
at the same time at runtime, so I can give you the guarantee that you
won't be able to reproduce any of that kind of driver bugs on a certain
given machine, this is the only point of the change, just to get this
guarantee on a larger subset of machines until all those bugs are fixed.

> I'm almost sure that all these "pci_map_sg failed" reports are caused
> by some buggy driver[s], which calls pci_map_xx() without proper
> pci_unmap_xx(). This is harmless on i386, and on alpha if all IO is going

I'm not talking about that kind of leak bug. The fact some driver is
leaking ptes is a completly different kind of bug.

I was only talking about when you get the "pci_map_sg failed" because
you have not 3 but 300 scsi disks connected to your system and you are
writing to all them at the same time allocating zillons of pte, and one
of your drivers (possibly not even a storage driver) is actually not
checking the reval of the pci_map_* functions. You don't need a pte
memleak to trigger it, even regardless of the fact I grown the dynamic
window to 1G which makes it 8 times harder to trigger than in mainline.

For now with a a couple of disks and a few nics and a 1G of dynamic
window size it doesn't trigger and the 1G thing gives a fairly large
margin for most machines out there. I cannot care less about the 2M-128k
memory wastage at this point in time, but as I said I wanted at least
to optimize the 2M pte arena allocation away completly if the machine
has less than 2G, that would be a very worthwhile optimization.

> I've got some debugging code checking for this (perhaps it worth
> posting or even porting to i386 ;-)
> For now I can confirm that all drivers I'm currently using are fine
> wrt pci_map/unmap:
> 3c59x, tulip, sym53c8xx, IDE.

all the drivers I'm using are definitely not leaking pte entries either,
but if you give me not 10 but 1000 scsi disks be sure I will trigger the
missing checks for pci_map_* retval without the need of any driver
leaking ptes.

The bug you are talking about is even worse and I never experienced it
myself, but I agree it's likely some driver has it too because it's not
reproducible on x86, so if you have the x86 debugging code that's
welcome so we will be able to reproduce it on a larger variety of
machines. thanks!

Andrea

  reply	other threads:[~2001-05-20  2:40 UTC|newest]

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-05-18 17:46 alpha iommu fixes Ivan Kokshaysky
2001-05-19  2:34 ` Tom Vier
2001-05-19 10:48   ` Ivan Kokshaysky
2001-05-19 20:58     ` Tom Vier
2001-05-19 13:55 ` Andrea Arcangeli
2001-05-19 19:11   ` Ivan Kokshaysky
2001-05-20  2:40     ` Andrea Arcangeli [this message]
2001-05-20 12:12       ` Ivan Kokshaysky
2001-05-20 13:40         ` Andrea Arcangeli
2001-05-20 14:23         ` Gérard Roudier
     [not found]       ` <3B07AF49.5A85205F@uow.edu.au>
2001-05-20 13:49         ` Andrea Arcangeli
2001-05-20 14:05           ` Andrew Morton
2001-05-20 14:33             ` Andrea Arcangeli
2001-05-21  1:01               ` David S. Miller
2001-05-21  1:47                 ` Andrea Arcangeli
2001-05-21  7:05                   ` David S. Miller
2001-05-21  8:59                     ` Andrea Arcangeli
2001-05-21  9:02                       ` David S. Miller
2001-05-21  9:23                         ` Andi Kleen
2001-05-21  9:30                           ` David S. Miller
2001-05-21  9:42                             ` Andi Kleen
2001-05-21 10:00                               ` David S. Miller
2001-05-21 10:27                                 ` Andi Kleen
2001-05-21 10:34                                   ` David S. Miller
2001-05-21 10:42                                     ` Andi Kleen
2001-05-21 10:55                                       ` David S. Miller
2001-05-21 11:08                                         ` Andi Kleen
2001-05-21 11:36                                           ` David S. Miller
2001-05-21 11:41                                             ` Andi Kleen
2001-05-21 22:22                                   ` Jens Axboe
2001-05-21 10:02                               ` Andrea Arcangeli
2001-05-21 10:17                                 ` Alan Cox
2001-05-21  9:56                         ` Andrea Arcangeli
2001-05-21 10:11                           ` David S. Miller
2001-05-21 10:19                             ` David S. Miller
2001-05-21 11:00                               ` Andrea Arcangeli
2001-05-21 11:04                                 ` David S. Miller
2001-05-21 11:27                                   ` Andrea Arcangeli
2001-05-21 12:16                                     ` Peter Rival
2001-05-21 13:55                               ` Jonathan Lundell
2001-05-21 14:17                                 ` Ivan Kokshaysky
2001-05-21 15:47                                   ` Jonathan Lundell
2001-05-22 11:12                               ` Chris Wedgwood
2001-05-22 17:51                                 ` Jonathan Lundell
2001-05-21 10:50                             ` Andrea Arcangeli
2001-05-21 10:59                               ` David S. Miller
2001-05-21 11:19                                 ` Andrea Arcangeli
2001-05-21 11:51                                   ` Ivan Kokshaysky
2001-05-21 17:53                                     ` Richard Henderson
2001-05-22  0:56                                       ` Andrea Arcangeli
2001-05-22 14:29                                         ` Andrea Arcangeli
2001-05-22 14:44                                           ` Ivan Kokshaysky
2001-05-22 15:00                                             ` Andrea Arcangeli
2001-05-22 20:28                                               ` Richard Henderson
2001-05-22 20:40                                                 ` Jeff Garzik
2001-05-22 20:52                                                   ` Andrea Arcangeli
2001-05-22 20:57                                                   ` Richard Henderson
2001-05-22 21:09                                                   ` Alan Cox
2001-05-22 20:48                                                 ` Jonathan Lundell
2001-05-22 21:02                                                   ` Richard Henderson
2001-05-22 21:10                                                     ` Alan Cox
2001-05-22 21:17                                                     ` Jonathan Lundell
2001-05-22 21:24                                                       ` Alan Cox
2001-05-22 21:34                                                         ` Jonathan Lundell
2001-05-22 21:08                                                 ` Alan Cox
2001-05-22 15:18                                             ` Andrea Arcangeli
2001-05-22 15:55                                               ` Ivan Kokshaysky
2001-05-22 16:06                                                 ` Andrea Arcangeli
2001-05-22 13:22                                       ` Andrea Arcangeli
2001-05-21  9:50                       ` Gerd Knorr
2001-05-21  1:00             ` David S. Miller
2001-05-21  7:47               ` Alan Cox
2001-05-21  7:53                 ` David S. Miller
2001-05-21  8:03                   ` Alan Cox
2001-05-21  8:11                     ` David S. Miller
2001-05-20 16:18           ` Andrea Arcangeli
2001-05-20 16:21             ` Andrew Morton
2001-05-20 16:44               ` Andrea Arcangeli
2001-05-20 16:54                 ` Andrew Morton
2001-05-20 17:12                   ` Andrea Arcangeli
2001-05-21  1:07                     ` David S. Miller
2001-05-21  1:37                       ` Andrea Arcangeli
2001-05-21  6:53                         ` David S. Miller
2001-05-21  7:59                           ` Alan Cox
2001-05-21  8:09                             ` David S. Miller
2001-05-21  8:09                               ` Alan Cox
2001-05-21  8:06                           ` Chris Wedgwood
2001-05-23  0:05                           ` Albert D. Cahalan
2001-05-22 13:11                       ` Pavel Machek
2001-05-22 23:02                         ` David S. Miller
2001-05-20 17:16             ` Jeff Garzik
2001-05-20 17:37               ` Andrea Arcangeli
2001-05-21  1:03             ` David S. Miller
2001-05-21  1:58               ` Richard Henderson
2001-05-20  1:11 ` Richard Henderson
2001-05-20 12:05   ` Ivan Kokshaysky
2001-05-21  0:37     ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20010520044013.A18119@athlon.random \
    --to=andrea@suse.de \
    --cc=ink@jurassic.park.msu.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox