From mboxrd@z Thu Jan 1 00:00:00 1970
From: Luis R. Rodriguez
Date: Mon, 23 Feb 2009 09:31:46 -0800
Subject: [ath9k-devel] [RFC v2] Serialization of IO
In-Reply-To: <200902222251.38358.listsrv@wilsoft.nl>
References: <20090211080717.GN4248@tesla>
 <43e72e890902141729s79101fb6x2f602f9b6948f226@mail.gmail.com>
 <8edeb164ccc18f3ad5635eb1d7d0ff4e.squirrel@www.wilsoft.nl>
 <200902222251.38358.listsrv@wilsoft.nl>
Message-ID: <20090223173145.GA4264@tesla>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ath9k-devel@lists.ath9k.org

On Sun, Feb 22, 2009 at 01:51:38PM -0800, W. van den Akker wrote:
> On Monday 16 February 2009 11:18:28 W. van den Akker wrote:
> > > On Wed, Feb 11, 2009 at 12:07 AM, Luis R. Rodriguez wrote:
> > >> So I've gone back to the drawing board, and reviewed this issue
> > >> as thoroughly as I can. The issue is PCI reads/writes can overlap
> > >> with each other (not just writes). This shouldn't generally be an
> > >> issue but if some reads take a while, for example, there could be
> > >> another read/write on its way on another CPU and at least for our
> > >> PCI 11n devices that will make them angry. Some PCI hosts don't seem
> > >> to do this but some others do. It should be noted this issue is not
> > >> present on our pre-802.11n devices or our new 11n PCI-express
> > >> devices.
> > >>
> > >> So with that clarified, here's a second attempt at serialization.
> > >> The first patch wasn't doing anything because we never initialized
> > >> ah->config.serialize_regmode. We do that now only on non-UP systems.
> > >> The last patch in the series is perhaps overkill -- but it would deal
> > >> with the rare case of a UP system coming up and you hotplugging a
> > >> second CPU later. It may also help with suspend, but don't quote me
> > >> on that yet.
> > >>
> > >> Anyway, here's the latest stab at it:
> > >>
> > >> http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2009-02-11/serialization-v2.patch
> > >>
> > >> This applies against today's wireless-testing/compat-wireless updates.
> > >>
> > >> Please test and let me know whether the issues with ath9k PCI devices
> > >> on HT/multi-CPU systems are corrected by it.
> > >>
> > >> Known issue: ping flood in a terminal makes it painful to come back.
> > >>
> > >> I've been trying to look for a neater way to guarantee serialization
> > >> but so far this is what I have. I do wonder, for example, if some of
> > >> the atomic.h (atomic_inc_and_test()) stuff may let us use it to somehow
> > >> serialize CPU entry into a read/write. Although it's not designed for
> > >> that, it may be worth considering. I also looked at some of the most
> > >> evil code I've seen lately in drivers/pci/quirks.c and did wonder if
> > >> there was a fix we can re-use through there but didn't see anything.
> > >> If you have any other ideas please let me know.
> > >
> > > Can someone who is able to reproduce the SMP issue please try these
> > > patches?
> > >
> > > Luis
> >
> > Hi Luis,
> >
> > I am currently on holiday. I have patched the system, but had some
> > issues with UDEV, and my notebook wouldn't connect to the AP anymore.
> > I have not had the time to dig into it.
> > Sunday I will investigate it further.
> >
> > Is it possible to create the patch against the mainstream RC instead of
> > the RC4-wl? Then I can test it faster.
> >
> > Sorry for the delay.
> >
>
> I grabbed the latest git-testing today and applied the patch.
> The server has been running for almost 2 hours now with dual CPUs. No
> problems found yet. I have stressed the system, but no hangs.
>
> Also the previous problems with UDEV are gone with this testing version.
>
> So... it looks like the patch is working for SMP systems.
>
> I will report tomorrow if it's still up and running. But it looks
> promising.

Willem, thanks for the feedback so far -- you're the first to report back
success on the patches curing your issues. I believe you may have also
booted with maxcpus=1 before, so I just want to confirm that, if you did,
you removed it from your grub conf for the shiny new kernel + serialization
patches.

  Luis
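
For readers following the thread, the serialization idea discussed above can be sketched in userspace C. This is not the patch from the thread and not ath9k code: `struct fake_dev`, `raw_access()`, `dev_access()` and `count_overlaps()` are hypothetical names invented for illustration, and a pthread mutex stands in for whatever lock a real driver would use. Only the `serialize_regmode` name is borrowed from the message. The sketch assumes a C11 compiler with pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_THREADS 4
#define ACCESSES_PER_THREAD 10000
#define NUM_REGS 16

/* Hypothetical stand-in for a device that dislikes overlapping accesses. */
struct fake_dev {
    pthread_mutex_t serial_rw;        /* plays the role of a driver lock */
    int serialize_regmode;            /* non-zero: serialize every access */
    atomic_int in_flight;             /* accesses currently in progress */
    atomic_int overlaps;              /* accesses that began during another */
    _Atomic uint32_t regs[NUM_REGS];  /* simulated register file */
};

/* One simulated register read or write; records any overlap it observes. */
static uint32_t raw_access(struct fake_dev *dev, unsigned reg,
                           uint32_t val, int is_write)
{
    uint32_t out;

    assert(reg < NUM_REGS);
    if (atomic_fetch_add(&dev->in_flight, 1) != 0)
        atomic_fetch_add(&dev->overlaps, 1);

    if (is_write) {
        atomic_store(&dev->regs[reg], val);
        out = val;
    } else {
        out = atomic_load(&dev->regs[reg]);
    }

    atomic_fetch_sub(&dev->in_flight, 1);
    return out;
}

/* Access wrapper: takes the lock only when serialization is enabled. */
static uint32_t dev_access(struct fake_dev *dev, unsigned reg,
                           uint32_t val, int is_write)
{
    uint32_t out;

    if (!dev->serialize_regmode)
        return raw_access(dev, reg, val, is_write);

    pthread_mutex_lock(&dev->serial_rw);
    out = raw_access(dev, reg, val, is_write);
    pthread_mutex_unlock(&dev->serial_rw);
    return out;
}

/* Each worker alternates reads and writes across the register file. */
static void *worker(void *arg)
{
    struct fake_dev *dev = arg;
    unsigned i;

    for (i = 0; i < ACCESSES_PER_THREAD; i++)
        dev_access(dev, i % NUM_REGS, i, i & 1);
    return NULL;
}

/* Runs the workers and returns how many overlapping accesses were seen. */
int count_overlaps(int serialize)
{
    struct fake_dev dev;
    pthread_t threads[NUM_THREADS];
    int started = 0;
    int i;

    pthread_mutex_init(&dev.serial_rw, NULL);
    dev.serialize_regmode = serialize;
    atomic_init(&dev.in_flight, 0);
    atomic_init(&dev.overlaps, 0);
    for (i = 0; i < NUM_REGS; i++)
        atomic_init(&dev.regs[i], 0);

    for (i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, worker, &dev) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&dev.serial_rw);
    return atomic_load(&dev.overlaps);
}
```

Calling `count_overlaps(1)` from a `main()` should always return 0, since the lock admits one access at a time. `count_overlaps(0)` may return a non-zero count on a multi-CPU machine, depending on scheduling. Build with something like `cc -std=c11 -pthread`.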