* Re: [Linux-ia64] IO/TLB bounce buffer space
2001-06-12 8:44 [Linux-ia64] IO/TLB bounce buffer space Martin Wilck
@ 2001-06-12 16:23 ` root
2001-06-12 17:01 ` Martin Wilck
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: root @ 2001-06-12 16:23 UTC (permalink / raw)
To: linux-ia64
>>>>> On Tue, 12 Jun 2001 10:44:12 +0200 (CEST), Martin Wilck <Martin.Wilck@fujitsu-siemens.com> said:
Martin> it is becoming apparent that the limited bounce buffer space
Martin> is the reason for the crashes with the new aic7xxx driver
Martin> and linux-IA64 with >=4GB RAM that I reported previously.
Martin> If I understand it right, the driver uses up to 253 buffers
Martin> per device, each of which can be 8kB in size. Consequently,
Martin> it almost completely fills up the available IO/TLB space.
The deeper question is of course: is this really a good idea? From a
latency perspective, such long queues may not make a lot of sense.
Even from a throughput perspective the benefit of using a 253 entry
queue is probably negligible compared to a shorter queue. Reading
aic7xxx.h, I get the impression that the author chose 253 as the max
queue length because s/he could. I don't see anything that suggests
that this length is optimal in any sense. It might be worth
experimenting with.
Martin> Question: Would it hurt to increase IO/TLB space on machines
Martin> with large memory? Would it be possible and make sense to
Martin> make IO/TLB space size a kernel configuration option?
An alternative is to use GFP_ATOMIC allocation. I'm not sure that
counts as a "solution" though as it arguably just hides the real
issue.
--david
^ permalink raw reply [flat|nested] 8+ messages in thread

* Re: [Linux-ia64] IO/TLB bounce buffer space
2001-06-12 8:44 [Linux-ia64] IO/TLB bounce buffer space Martin Wilck
2001-06-12 16:23 ` root
@ 2001-06-12 17:01 ` Martin Wilck
2001-06-12 17:52 ` root
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Martin Wilck @ 2001-06-12 17:01 UTC (permalink / raw)
To: linux-ia64
On Tue, 12 Jun 2001, David wrote:
> The deeper question is of course: is this really a good idea? From a
> latency perspective, such long queues may not make a lot of sense.
> Even from a throughput perspective the benefit of using a 253 entry
> queue is probably negligible compared to a shorter queue. Reading
> aic7xxx.h, I get the impression that the author chose 253 as the max
> queue length because s/he could. I don't see anything that suggests
> that this length is optimal in any sense. It might be worth
> experimenting with.
Actually, I found that if the driver "survives" the crash test (currently
with <4 GB RAM only), it eventually cuts the queue length for the disk
in question from 253 to 64 (the max supported by the drive).
In any case, I assume some sort of compromise must be found. On a server
scale, a machine with 4 or even more Adaptec boards may well exist. Each
board supports 16 devices (32 if it's a 39160) and the default driver
setting is to use 253 buffers for each device.
Thus, worst case buffer usage would be 8kB * 253 * 16 * 4 = 126 MB only
for this driver! Of course it is highly unlikely that all these buffers
will be in use simultaneously. Even with only 16 SCBs per device, the
driver would still need 8MB of bounce buffers.
I guess it will be necessary to use both lower SCB numbers in the aic7xxx
driver and increase IO-TLB space.
All of this can already be done with kernel command line options, but
will be quite cumbersome to figure out (and tune right) for
administrators.
There will always be some danger of IO-TLB overflow, however, unless
a way is found to gracefully abort an operation if the IO-TLB space is
full (or unless all hardware vendors make their boards 64-bit capable).
Btw: will _hardware_ IO-TLB support be available some time soon?
(I figure that's what's being done on Alpha, right?)
I am currently trying to write a kernel module that allows monitoring of
IO-TLB usage.
Regards,
Martin
--
Martin Wilck <Martin.Wilck@fujitsu-siemens.com>
FSC EP PS DS1, Paderborn Tel. +49 5251 8 15113
* Re: [Linux-ia64] IO/TLB bounce buffer space
2001-06-12 8:44 [Linux-ia64] IO/TLB bounce buffer space Martin Wilck
2001-06-12 16:23 ` root
2001-06-12 17:01 ` Martin Wilck
@ 2001-06-12 17:52 ` root
2001-06-22 22:50 ` Rich Altmaier
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: root @ 2001-06-12 17:52 UTC (permalink / raw)
To: linux-ia64
>>>>> On Tue, 12 Jun 2001 19:01:10 +0200 (CEST), Martin Wilck <Martin.Wilck@fujitsu-siemens.com> said:
Martin> I guess it will be necessary to use both lower SCB numbers
Martin> in the aic7xxx driver and increase IO-TLB space. All of
Martin> this can already be done with kernel command line options,
Martin> but will be quite cumbersome to figure out (and tune right)
Martin> for administrators.
Yes, that's certainly not ideal. A machine should boot without
requiring any tuning-type command-line options.
Martin> There will always be some danger of IO-TLB overflow,
Martin> however, unless a way is found to gracefully abort an
Martin> operation if the IO-TLB space is full (or unless all
Martin> hardware vendors make their boards 64-bit capable).
There isn't. Dave Miller blames me for not bringing up this issue
early enough when the PCI DMA interface was designed. Not that it
would have made much of a difference: it's difficult to recover
gracefully in interrupt handlers, especially considering how many
drivers are out there that would have to be updated for this...
We can either try to improve the heuristic that guesses the right
static size of the I/O TLB buffers or we can make it more dynamic by
using atomic allocations. Either way, it looks to me like the aic7xxx
driver should be tuned to not generate so many concurrent requests.
Martin> Btw: will _hardware_ IO-TLB support be available some time
Martin> soon? (I figure that's what's being done on Alpha, right?)
As soon as there are chipsets that support it!
That's not going to happen for Itanium. For McKinley, we just have to
wait and see...
--david
* Re: [Linux-ia64] IO/TLB bounce buffer space
2001-06-12 8:44 [Linux-ia64] IO/TLB bounce buffer space Martin Wilck
` (2 preceding siblings ...)
2001-06-12 17:52 ` root
@ 2001-06-22 22:50 ` Rich Altmaier
2001-06-24 15:20 ` Rik van Riel
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Rich Altmaier @ 2001-06-22 22:50 UTC (permalink / raw)
To: linux-ia64
This is just a note to comment on the mention of:
Thus, worst case buffer usage would be 8kB * 253 * 16 * 4 = 126 MB only
for this driver! Of course it is highly unlikely that all these buffers
will be in use simultaneously.
In my experience, the question of the peak demand a system may
experience must be governed by
1. what you, the supplier, spec as the peak demand it supports,
2. what the customer understands is the max load the system
can be utilized to perform, and has paid to obtain,
3. and any shortfall in actual delivered performance from your spec,
which you can be sure the customer will demand that you, the supplier,
pay to make up!
This leaves no room for statistical analysis! The customer will not
accept that using all the resources at once is unlikely and not expected.
If you spec 4 controllers of 32 devices, each needing 253 buffers for
peak operation, then the system had better deliver this.
Or you should change your spec and reduce the numbers.
Please don't design for 4 controllers of 32 devices that just limp along.
Either they run at full performance all the time, or reduce the numbers
to what can be supported! How can the customer, who lays out the
dollars for 4*32 devices, interpret this otherwise?
Of course you can also take the approach of spec'ing a max aggregate
bandwidth, to be "fairly" allocated across all devices. This is another
approach that can work, although you will spend a year or so getting
"fairly" to satisfy all customers. Better to pull down your config
until peak really works.
Thanks, Rich
Rich Altmaier
SGI
* Re: [Linux-ia64] IO/TLB bounce buffer space
2001-06-12 8:44 [Linux-ia64] IO/TLB bounce buffer space Martin Wilck
` (3 preceding siblings ...)
2001-06-22 22:50 ` Rich Altmaier
@ 2001-06-24 15:20 ` Rik van Riel
2001-06-25 11:37 ` Martin Wilck
2001-06-25 21:14 ` David Mosberger
6 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2001-06-24 15:20 UTC (permalink / raw)
To: linux-ia64
On Fri, 22 Jun 2001, Rich Altmaier wrote:
> This is just a note to comment on the mention of:
> Thus, worst case buffer usage would be 8kB * 253 * 16 * 4 = 126 MB only
> for this driver! Of course it is highly unlikely that all these buffers
> will be in use simultaneously.
> This leaves no room for statistical analysis!
Yup. Even though it may be the architecture formerly
known as Itanic, people still CARE ABOUT STABILITY.
Coding device drivers or the OS in such a way that
kernel memory resources can be overcommitted is just
irresponsible and foolish.
regards,
Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...
http://www.surriel.com/ http://distro.conectiva.com/
Send all your spam to aardvark@nl.linux.org (spam digging piggy)
* Re: [Linux-ia64] IO/TLB bounce buffer space
2001-06-12 8:44 [Linux-ia64] IO/TLB bounce buffer space Martin Wilck
` (4 preceding siblings ...)
2001-06-24 15:20 ` Rik van Riel
@ 2001-06-25 11:37 ` Martin Wilck
2001-06-25 21:14 ` David Mosberger
6 siblings, 0 replies; 8+ messages in thread
From: Martin Wilck @ 2001-06-25 11:37 UTC (permalink / raw)
To: linux-ia64
Dear Rich, all,
> In my experience, the question of the peak demand a system may
> experience must be governed by
> 1. what you, the supplier, spec as the peak demand it supports,
> 2. what the customer understands is the max load the system
> can be utilized to perform, and has paid to obtain,
> 3. and any shortfall in actual delivered performance from your spec,
> which you can be sure the customer will demand that you, the supplier,
> pay to make up!
>
> This leaves no room for statistical analysis! The customer will not
> accept that using all the resources at once is unlikely and not expected.
> If you spec 4 controllers of 32 devices, each needing 253 buffers for
> peak operation, then the system had better deliver this.
> Or you should change your spec and reduce the numbers.
I completely agree with you. I just didn't want to sound too harsh in my
analysis of the problem (the harsh version is: this controller/driver
combination is not suitable for production systems on IA-64).
Justin Gibbs is implementing 39-bit addressing in the aic7xxx
driver right now. Once finished, this will solve the problem.
Regards,
Martin
--
Martin Wilck <Martin.Wilck@fujitsu-siemens.com>
FSC EP PS DS1, Paderborn Tel. +49 5251 8 15113
* Re: [Linux-ia64] IO/TLB bounce buffer space
2001-06-12 8:44 [Linux-ia64] IO/TLB bounce buffer space Martin Wilck
` (5 preceding siblings ...)
2001-06-25 11:37 ` Martin Wilck
@ 2001-06-25 21:14 ` David Mosberger
6 siblings, 0 replies; 8+ messages in thread
From: David Mosberger @ 2001-06-25 21:14 UTC (permalink / raw)
To: linux-ia64
>>>>> On Mon, 25 Jun 2001 13:37:43 +0200 (CEST), Martin Wilck <Martin.Wilck@fujitsu-siemens.com> said:
Martin> Justin Gibbs is implementing 39-bit addressing in the
Martin> aic7xxx driver right now.
That's good to hear. While this won't be a problem for aic7xxx
anymore, as a general rule, drivers should be careful not to
oversubscribe the I/O TLB.
--david