From mboxrd@z Thu Jan 1 00:00:00 1970
From: W. van den Akker
Date: Mon, 23 Feb 2009 18:45:41 +0100
Subject: [ath9k-devel] [RFC v2] Serialization of IO
In-Reply-To: <20090223173145.GA4264@tesla>
References: <20090211080717.GN4248@tesla> <200902222251.38358.listsrv@wilsoft.nl> <20090223173145.GA4264@tesla>
Message-ID: <200902231845.41715.listsrv@wilsoft.nl>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ath9k-devel@lists.ath9k.org

On Monday 23 February 2009 18:31:46 Luis R. Rodriguez wrote:
> On Sun, Feb 22, 2009 at 01:51:38PM -0800, W. van den Akker wrote:
> > On Monday 16 February 2009 11:18:28 W. van den Akker wrote:
> > > > On Wed, Feb 11, 2009 at 12:07 AM, Luis R. Rodriguez wrote:
> > > >> So I've gone back to the drawing board, and reviewed this issue
> > > >> as thoroughly as I can. The issue is PCI reads/writes can overlap
> > > >> with each other (not just writes). This shouldn't generally be an
> > > >> issue but if some reads take a while, for example, there could be
> > > >> another read/write on its way on another CPU and at least for our
> > > >> PCI 11n devices that will make them angry. Some PCI hosts don't seem
> > > >> to do this but some others do. It should be noted this issue is not
> > > >> present on our pre-802.11n devices or our new 11n PCI-express
> > > >> devices.
> > > >>
> > > >> So with that clarified, here's a second attempt at serialization.
> > > >> The first patch wasn't doing anything because we never initialized
> > > >> ah->config.serialize_regmode. We do that now only on non-UP systems.
> > > >> The last patch in the series is perhaps overkill -- but it would
> > > >> deal with the rare case of a UP system coming up and you hotplugging
> > > >> a second CPU later. It may also help with suspend, but don't quote me
> > > >> on that yet.
> > > >>
> > > >> Anyway, here's the latest stab at it:
> > > >>
> > > >> http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2009-02-11/serialization-v2.patch
> > > >>
> > > >> This applies against today's wireless-testing/compat-wireless
> > > >> updates.
> > > >>
> > > >> Please test and let me know if the issues with ath9k PCI devices on
> > > >> HT/Multi-CPU systems are corrected by it.
> > > >>
> > > >> Known issue: ping flood in a terminal makes it painful to come back.
> > > >>
> > > >> I've been trying to look for a neater way to guarantee
> > > >> serialization, but so far this is what I have. I do wonder, for
> > > >> example, if some of the atomic.h (atomic_inc_and_test()) stuff may
> > > >> let us use it to somehow serialize CPU entry into a read/write.
> > > >> Although it's not designed for that, it may be worth considering. I
> > > >> also saw some of the most evil code I've seen lately in
> > > >> drivers/pci/quirks.c and did wonder if there was a fix we can re-use
> > > >> through there but didn't see anything. If you have any other ideas
> > > >> please let me know.
> > > >
> > > > Can someone who is able to reproduce the SMP issue please try these
> > > > patches?
> > > >
> > > > Luis
> > > > _______________________________________________
> > > > ath9k-devel mailing list
> > > > ath9k-devel at lists.ath9k.org
> > > > https://lists.ath9k.org/mailman/listinfo/ath9k-devel
> > > >
> > > > --
> > > > This message has been scanned for viruses and
> > > > dangerous content by MailScanner, and is
> > > > believed to be clean.
> > >
> > > Hi Luis,
> > >
> > > I am currently on holiday. I have patched the system, but had some
> > > issues with UDEV, and my notebook wouldn't connect to the AP
> > > anymore.
> > > I have not had the time to dig into it.
> > > Sunday I will investigate it further.
> > >
> > > Is it possible to create the patch against the mainstream RC instead of
> > > the RC4-wl? Then I can test it faster.
> > >
> > > Sorry for the delay.
> >
> > I grabbed the latest git-testing today and applied the patch.
> > The server has been running for almost 2 hours now with dual CPUs. No
> > problems found yet. I have stressed the system, but no hangups.
> >
> > Also the previous problems with UDEV are gone with this testing version.
> >
> > So.... it looks like the patch is working for SMP systems
> >
> > I will report tomorrow if it's still up and running. But it looks
> > promising.
>
> Willem,
>
> thanks for the feedback so far -- you're the first to report back success
> on the patches curing your issues. I believe you may have also enabled
> maxcpus=1 before so just want to confirm that if you did have that that
> you removed that from your grub conf for the shiny new kernel +
> serialization patches.
>
> Luis

Correct, I downloaded the latest wireless-testing, applied the patch and
removed the maxcpus=1 from the grub conf.
The server is showing 2 CPUs. It has now been up and running for almost
24 hours without any problems.

gr,
Willem