* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management [not found] <eac2ad3017b5f160d24c.1141922822@localhost.localdomain> @ 2006-03-09 23:20 ` Roland Dreier 2006-03-09 23:39 ` Bryan O'Sullivan 2006-03-09 23:24 ` Roland Dreier 2006-03-09 23:26 ` Roland Dreier 2 siblings, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-09 23:20 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general Bryan> The ipath_sma.c file supports a lightweight userspace Bryan> subnet management agent (SMA). This is used in deployments Bryan> (such as HPC clusters) where a full Infiniband protocol Bryan> stack is not needed. I've never understood what forces you to maintain two separate SMAs. Why can't you pick one of the two SMAs and use that unconditionally? - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:20 ` [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management Roland Dreier @ 2006-03-09 23:39 ` Bryan O'Sullivan 2006-03-09 23:47 ` Roland Dreier 2006-03-10 15:54 ` Michael S. Tsirkin 0 siblings, 2 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-09 23:39 UTC (permalink / raw) To: Roland Dreier; +Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 15:20 -0800, Roland Dreier wrote: > I've never understood what forces you to maintain two separate SMAs. > Why can't you pick one of the two SMAs and use that unconditionally? Three reasons. * OpenSM wasn't usable when we wrote our SMA. We have customers using ours now, so we have to support it. * Our SMA does some setup for the layered ethernet emulation driver. * Our SMA works without an IB stack of any kind present. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:39 ` Bryan O'Sullivan @ 2006-03-09 23:47 ` Roland Dreier 2006-03-09 23:50 ` Bryan O'Sullivan 2006-03-10 15:54 ` Michael S. Tsirkin 1 sibling, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-09 23:47 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general > Three reasons. > > * OpenSM wasn't usable when we wrote our SMA. We have customers > using ours now, so we have to support it. Huh? What does OpenSM working or not have to do with the SMA? > * Our SMA does some setup for the layered ethernet emulation > driver. > * Our SMA works without an IB stack of any kind present. That's fine. So then I guess the question is, why can't you use your SMA all the time? And does that mean that the verbs SMA doesn't support ethernet emulation, so you can't use ethernet emulation and verbs at the same time? - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:47 ` Roland Dreier @ 2006-03-09 23:50 ` Bryan O'Sullivan 2006-03-09 23:52 ` Roland Dreier 0 siblings, 1 reply; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-09 23:50 UTC (permalink / raw) To: Roland Dreier; +Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 15:47 -0800, Roland Dreier wrote: > That's fine. So then I guess the question is, why can't you use your > SMA all the time? We do. It coexists with OpenSM if OpenSM is present. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:50 ` Bryan O'Sullivan @ 2006-03-09 23:52 ` Roland Dreier 0 siblings, 0 replies; 31+ messages in thread From: Roland Dreier @ 2006-03-09 23:52 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general Roland> That's fine. So then I guess the question is, why can't Roland> you use your SMA all the time? Bryan> We do. It coexists with OpenSM if OpenSM is present. So can we kill the other SMA? - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:39 ` Bryan O'Sullivan 2006-03-09 23:47 ` Roland Dreier @ 2006-03-10 15:54 ` Michael S. Tsirkin 2006-03-10 16:05 ` Bryan O'Sullivan 1 sibling, 1 reply; 31+ messages in thread From: Michael S. Tsirkin @ 2006-03-10 15:54 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Roland Dreier, akpm, gregkh, linux-kernel, openib-general, davem Quoting r. Bryan O'Sullivan <bos@pathscale.com>: > Subject: Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management > > On Thu, 2006-03-09 at 15:20 -0800, Roland Dreier wrote: > > > I've never understood what forces you to maintain two separate SMAs. > > Why can't you pick one of the two SMAs and use that unconditionally? > > Three reasons. > > * OpenSM wasn't usable when we wrote our SMA. We have customers > using ours now, so we have to support it. Presumably you mean the ib_mad SMA - OpenSM is not an SMA. I don't understand this. You don't talk to an SMA directly - its jobs is to receive and send management packets that the card gets from a subnet manager. So what do customers care which SMA implementation is used, as long as it formats the management packets correctly? If you want to extend the SMA in non-standard ways, you can snoop and send management packets by loading ib_umad. > * Our SMA does some setup for the layered ethernet emulation > driver. You are doing this from userspace, right? But you can already send/receive MADs from userspace by loading ib_umad. What is there that is not sufficient for your purposes? By the way, this might also be a clean way for you to get at the port info node info and port counters atomically like you wanted to in another thread: post a management packet to the local device, get a packet back. No need for extra sysfs files. > * Our SMA works without an IB stack of any kind present. The stack is pretty slim, AFAIK it mostly consists of an SMA implementation. So where's the benefit in avoiding it? Just link against ib_mad, it will get loaded atomatically. -- Michael S. Tsirkin Staff Engineer, Mellanox Technologies ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 15:54 ` Michael S. Tsirkin @ 2006-03-10 16:05 ` Bryan O'Sullivan 0 siblings, 0 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 16:05 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Roland Dreier, akpm, gregkh, linux-kernel, openib-general, davem On Fri, 2006-03-10 at 17:54 +0200, Michael S. Tsirkin wrote: > > * OpenSM wasn't usable when we wrote our SMA. We have customers > > using ours now, so we have to support it. > > Presumably you mean the ib_mad SMA - OpenSM is not an SMA. Yes, I already mentioned that I got my terms swapped in another message. > So what do customers care which SMA > implementation is used, as long as it formats the management packets > correctly? Many, perhaps most right now, of our customers don't have a full IB stack loaded. That's why we have this small userspace SMA. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management [not found] <eac2ad3017b5f160d24c.1141922822@localhost.localdomain> 2006-03-09 23:20 ` [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management Roland Dreier @ 2006-03-09 23:24 ` Roland Dreier 2006-03-09 23:49 ` Bryan O'Sullivan 2006-03-09 23:26 ` Roland Dreier 2 siblings, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-09 23:24 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general > +static int ipath_sma_open(struct inode *in, struct file *fp) > +{ > + int s; > + > + if (ipath_sma_alive) { > + ipath_dbg("SMA already running (pid %u), failing\n", > + ipath_sma_alive); > + return -EBUSY; > + } > + > + for (s = 0; s < atomic_read(&ipath_max); s++) { > + struct ipath_devdata *dd = ipath_lookup(s); > + /* we need at least one infinipath device to be initialized. */ > + if (dd && dd->ipath_flags & IPATH_INITTED) { > + ipath_sma_alive = current->pid; It seems there's a window here where two processes can both pass the if (ipath_sma_alive) test and then proceed to step on each other. - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:24 ` Roland Dreier @ 2006-03-09 23:49 ` Bryan O'Sullivan 2006-03-09 23:51 ` Roland Dreier 0 siblings, 1 reply; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-09 23:49 UTC (permalink / raw) To: Roland Dreier; +Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 15:24 -0800, Roland Dreier wrote: > It seems there's a window here where two processes can both pass the > if (ipath_sma_alive) test and then proceed to step on each other. Yep, this is a real race, albeit incredibly unlikely. I just turned ipath_sma_alive into an atomic_t, and wrapped the open/close code in spinlocks. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:49 ` Bryan O'Sullivan @ 2006-03-09 23:51 ` Roland Dreier 0 siblings, 0 replies; 31+ messages in thread From: Roland Dreier @ 2006-03-09 23:51 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general Bryan> Yep, this is a real race, albeit incredibly unlikely. I Bryan> just turned ipath_sma_alive into an atomic_t, and wrapped Bryan> the open/close code in spinlocks. How does making it an atomic_t help? I think you're only going to be using atomic_set() and atomic_read(), and atomic_t doesn't provide anything in that case. - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management [not found] <eac2ad3017b5f160d24c.1141922822@localhost.localdomain> 2006-03-09 23:20 ` [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management Roland Dreier 2006-03-09 23:24 ` Roland Dreier @ 2006-03-09 23:26 ` Roland Dreier 2006-03-09 23:52 ` Bryan O'Sullivan 2 siblings, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-09 23:26 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general > +static int ipath_sma_release(struct inode *in, struct file *fp) > +{ > + int s; > + > + ipath_sma_alive = 0; > + ipath_cdbg(SMA, "Closing SMA device\n"); > + for (s = 0; s < atomic_read(&ipath_max); s++) { > + struct ipath_devdata *dd = ipath_lookup(s); > + > + if (!dd || !(dd->ipath_flags & IPATH_INITTED)) > + continue; > + *dd->ipath_statusp &= ~IPATH_STATUS_SMA; > + if (dd->verbs_layer.l_flags & IPATH_VERBS_KERNEL_SMA) > + *dd->ipath_statusp |= IPATH_STATUS_OIB_SMA; > + } > + return 0; > +} Similarly what protects against another process opening the device right after the ipath_sma_alive = 0 setting, but before you do all the cleanup that's after that? And what protects against a hot unplug of a device after the test of s against ipath_max? - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:26 ` Roland Dreier @ 2006-03-09 23:52 ` Bryan O'Sullivan 2006-03-10 0:00 ` Roland Dreier 2006-03-10 0:45 ` Greg KH 0 siblings, 2 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-09 23:52 UTC (permalink / raw) To: Roland Dreier; +Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 15:26 -0800, Roland Dreier wrote: > Similarly what protects against another process opening the device > right after the ipath_sma_alive = 0 setting, but before you do all the > cleanup that's after that? This is fixed by the stuff I just did in response to your earlier message. > And what protects against a hot unplug of a device after the test of s > against ipath_max? We don't support hotplugged devices at the moment. If you're asking whether an rmmod at the wrong time could cause something bad to happen, I don't *think* so. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:52 ` Bryan O'Sullivan @ 2006-03-10 0:00 ` Roland Dreier 2006-03-10 0:04 ` Bryan O'Sullivan 2006-03-10 0:45 ` Greg KH 1 sibling, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-10 0:00 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general Bryan> We don't support hotplugged devices at the moment. If Bryan> you're asking whether an rmmod at the wrong time could Bryan> cause something bad to happen, I don't *think* so. How do you stop someone from hot plugging a PCIe device? - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 0:00 ` Roland Dreier @ 2006-03-10 0:04 ` Bryan O'Sullivan 0 siblings, 0 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 0:04 UTC (permalink / raw) To: Roland Dreier; +Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 16:00 -0800, Roland Dreier wrote: > Bryan> We don't support hotplugged devices at the moment. > > How do you stop someone from hot plugging a PCIe device? You say "we don't support this yet" somewhere in big letters. The chips and cards support hotplug electrically and logically. We just haven't had time yet to do the driver support work, and won't for a while. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-09 23:52 ` Bryan O'Sullivan 2006-03-10 0:00 ` Roland Dreier @ 2006-03-10 0:45 ` Greg KH 2006-03-10 0:48 ` Bryan O'Sullivan 1 sibling, 1 reply; 31+ messages in thread From: Greg KH @ 2006-03-10 0:45 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Thu, Mar 09, 2006 at 03:52:47PM -0800, Bryan O'Sullivan wrote: > On Thu, 2006-03-09 at 15:26 -0800, Roland Dreier wrote: > > > Similarly what protects against another process opening the device > > right after the ipath_sma_alive = 0 setting, but before you do all the > > cleanup that's after that? > > This is fixed by the stuff I just did in response to your earlier > message. > > > And what protects against a hot unplug of a device after the test of s > > against ipath_max? > > We don't support hotplugged devices at the moment. Why not? Your cards can't be placed in a machine that supports PCI Hotplug (or PCI-E hotplug)? You can't really tell users that (no matter how often I have wished I could...) thanks, greg k-h ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 0:45 ` Greg KH @ 2006-03-10 0:48 ` Bryan O'Sullivan 2006-03-10 1:04 ` Greg KH 0 siblings, 1 reply; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 0:48 UTC (permalink / raw) To: Greg KH; +Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 16:45 -0800, Greg KH wrote: > > We don't support hotplugged devices at the moment. > > Why not? Your cards can't be placed in a machine that supports PCI > Hotplug (or PCI-E hotplug)? No, the driver and userspace code doesn't support it yet. That's all. > You can't really tell users that (no matter > how often I have wished I could...) I don't expect this to be a practical problem. We're planning to add hotplug support to the driver once we have some cycles free. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 0:48 ` Bryan O'Sullivan @ 2006-03-10 1:04 ` Greg KH 2006-03-10 4:41 ` Bryan O'Sullivan 0 siblings, 1 reply; 31+ messages in thread From: Greg KH @ 2006-03-10 1:04 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Thu, Mar 09, 2006 at 04:48:45PM -0800, Bryan O'Sullivan wrote: > On Thu, 2006-03-09 at 16:45 -0800, Greg KH wrote: > > > > We don't support hotplugged devices at the moment. > > > > Why not? Your cards can't be placed in a machine that supports PCI > > Hotplug (or PCI-E hotplug)? > > No, the driver and userspace code doesn't support it yet. That's all. > > > You can't really tell users that (no matter > > how often I have wished I could...) > > I don't expect this to be a practical problem. We're planning to add > hotplug support to the driver once we have some cycles free. Ugh, that means it's never going to be there. All new PCI drivers have the requirement that they work properly in hotplug systems, as they should follow the PCI core api. If not, odds are they will not be accepted into the tree :( thanks, greg k-h ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 1:04 ` Greg KH @ 2006-03-10 4:41 ` Bryan O'Sullivan 2006-03-10 5:48 ` Greg KH 2006-03-10 5:55 ` Roland Dreier 0 siblings, 2 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 4:41 UTC (permalink / raw) To: Greg KH; +Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 17:04 -0800, Greg KH wrote: > > I don't expect this to be a practical problem. We're planning to add > > hotplug support to the driver once we have some cycles free. > > Ugh, that means it's never going to be there. > > All new PCI drivers have the requirement that they work properly in > hotplug systems, as they should follow the PCI core api. If not, odds > are they will not be accepted into the tree :( Okay, maybe we're talking at cross purposes here. We do follow the PCI core API. We have a __devinit probe and __devexit remove routine, a MODULE_DEVICE_TABLE, the kernel generates hotplug events when a device is detected or the driver is unloaded, and so on. I *assumed* that there was something more that we would need to do in order to support real hotplug of actual physical cards, but now that I look more closely, it doesn't appear that there is. At least, there's nothing in Documentation/pci.txt or LDD3 that indicates to me that we ought to be doing more. Am I missing something? <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 4:41 ` Bryan O'Sullivan @ 2006-03-10 5:48 ` Greg KH 2006-03-10 13:40 ` Bryan O'Sullivan 2006-03-10 5:55 ` Roland Dreier 1 sibling, 1 reply; 31+ messages in thread From: Greg KH @ 2006-03-10 5:48 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Thu, Mar 09, 2006 at 08:41:36PM -0800, Bryan O'Sullivan wrote: > On Thu, 2006-03-09 at 17:04 -0800, Greg KH wrote: > > > > I don't expect this to be a practical problem. We're planning to add > > > hotplug support to the driver once we have some cycles free. > > > > Ugh, that means it's never going to be there. > > > > All new PCI drivers have the requirement that they work properly in > > hotplug systems, as they should follow the PCI core api. If not, odds > > are they will not be accepted into the tree :( > > Okay, maybe we're talking at cross purposes here. We do follow the PCI > core API. We have a __devinit probe and __devexit remove routine, a > MODULE_DEVICE_TABLE, the kernel generates hotplug events when a device > is detected or the driver is unloaded, and so on. > > I *assumed* that there was something more that we would need to do in > order to support real hotplug of actual physical cards, but now that I > look more closely, it doesn't appear that there is. At least, there's > nothing in Documentation/pci.txt or LDD3 that indicates to me that we > ought to be doing more. > > Am I missing something? Nope, that's all that you need to do. Your driver will be notified that the device will be going away by calling the disconnect function. So great, nothing needs to be done :) Oh, and you can test this out if you don't have a pci hotplug system by using the fakephp driver and disconnecting your device that way. thanks, greg k-h ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 5:48 ` Greg KH @ 2006-03-10 13:40 ` Bryan O'Sullivan 0 siblings, 0 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 13:40 UTC (permalink / raw) To: Greg KH; +Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 21:48 -0800, Greg KH wrote: > Oh, and you can test this out if you don't have a pci hotplug system by > using the fakephp driver and disconnecting your device that way. That's good to know, thanks. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 4:41 ` Bryan O'Sullivan 2006-03-10 5:48 ` Greg KH @ 2006-03-10 5:55 ` Roland Dreier 2006-03-10 13:43 ` Bryan O'Sullivan 1 sibling, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-10 5:55 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Greg KH, rolandd, akpm, davem, linux-kernel, openib-general Bryan> I *assumed* that there was something more that we would Bryan> need to do in order to support real hotplug of actual Bryan> physical cards, but now that I look more closely, it Bryan> doesn't appear that there is. At least, there's nothing in Bryan> Documentation/pci.txt or LDD3 that indicates to me that we Bryan> ought to be doing more. Bryan> Am I missing something? No, the only problems are with the way the various pieces of your drivers refer to devices by index. There are obvious races such as > +int __init ipath_verbs_init(void) > +{ > + int i; > + > + number_of_devices = ipath_layer_get_num_of_dev(); > + i = number_of_devices * sizeof(struct ipath_ibdev *); > + ipath_devices = kmalloc(i, GFP_ATOMIC); > + if (ipath_devices == NULL) > + return -ENOMEM; > + > + for (i = 0; i < number_of_devices; i++) { > + struct ipath_devdata *dd; > + int ret = ipath_verbs_register(i, ipath_ib_piobufavail, > + ipath_ib_rcv, ipath_ib_timer, > + &dd); suppose number_of_devices gets set to 5 but by the time you call ipath_verbs_register(5,...), the 5th device has been unplugged? Also you only do this when the module is loaded, so you won't handle devices that are hot-plugged later. And I don't see anything that would handle hot unplug either. Pretty much any use of ipath_max is probably broken by hot plug. - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 5:55 ` Roland Dreier @ 2006-03-10 13:43 ` Bryan O'Sullivan 2006-03-10 16:58 ` Greg KH 2006-03-10 17:08 ` Roland Dreier 0 siblings, 2 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 13:43 UTC (permalink / raw) To: Roland Dreier; +Cc: Greg KH, rolandd, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 21:55 -0800, Roland Dreier wrote: > No, the only problems are with the way the various pieces of your > drivers refer to devices by index. OK. What's a safe way to iterate over the devices in the presence of hotplug, then? I assume it's list_for_each_mumble; I just don't know what mumble is :-) > Also you only do this when the module is loaded, so you won't handle > devices that are hot-plugged later. No, ipath_max is updated any time a probe routine is called. > And I don't see anything that > would handle hot unplug either. What would this anything look like, if I were hoping for an example to emulate? There's nothing in LDD3 about this, so I'm kind of in the dark. <b -- Bryan O'Sullivan <bos@pathscale.com> ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 13:43 ` Bryan O'Sullivan @ 2006-03-10 16:58 ` Greg KH 2006-03-10 17:05 ` Bryan O'Sullivan 2006-03-10 17:08 ` Roland Dreier 1 sibling, 1 reply; 31+ messages in thread From: Greg KH @ 2006-03-10 16:58 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Fri, Mar 10, 2006 at 05:43:50AM -0800, Bryan O'Sullivan wrote: > On Thu, 2006-03-09 at 21:55 -0800, Roland Dreier wrote: > > > No, the only problems are with the way the various pieces of your > > drivers refer to devices by index. > > OK. What's a safe way to iterate over the devices in the presence of > hotplug, then? I assume it's list_for_each_mumble; I just don't know > what mumble is :-) You keep an internal list of devices, if you really need to do such a thing. > > Also you only do this when the module is loaded, so you won't handle > > devices that are hot-plugged later. > > No, ipath_max is updated any time a probe routine is called. > > > And I don't see anything that > > would handle hot unplug either. > > What would this anything look like, if I were hoping for an example to > emulate? There's nothing in LDD3 about this, so I'm kind of in the > dark. It's just the "disconnect" PCI function being called, which can happen at any time. thanks, greg k-h ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 16:58 ` Greg KH @ 2006-03-10 17:05 ` Bryan O'Sullivan 0 siblings, 0 replies; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 17:05 UTC (permalink / raw) To: Greg KH; +Cc: Roland Dreier, rolandd, akpm, davem, linux-kernel, openib-general On Fri, 2006-03-10 at 08:58 -0800, Greg KH wrote: > You keep an internal list of devices, if you really need to do such a > thing. OK. I do think we need it, because we have a dozen or so cases where we do "iterate over all of the known devices". > It's just the "disconnect" PCI function being called, which can happen > at any time. Thanks. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 13:43 ` Bryan O'Sullivan 2006-03-10 16:58 ` Greg KH @ 2006-03-10 17:08 ` Roland Dreier 2006-03-10 17:32 ` Bryan O'Sullivan 1 sibling, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-10 17:08 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Greg KH, rolandd, akpm, davem, linux-kernel, openib-general Bryan> OK. What's a safe way to iterate over the devices in the Bryan> presence of hotplug, then? I assume it's Bryan> list_for_each_mumble; I just don't know what mumble is :-) You need something that takes a reference to each device while you're looking at it, like for_each_pci_dev(). But in general iterating through devices is usually the wrong thing to do, because devices can come and go in the middle of your loop. It's better to be driven by the add and remove callbacks. Bryan> No, ipath_max is updated any time a probe routine is Bryan> called. Yes, that's true. (BTW, what does making ipath_max an atomic_t get you? The updates are protected by a lock anyway). But I was talking about the code in ipath_verbs_init(), which is the only place you call ipath_verbs_register() that I could find. You make one pass through the devices that are present when ipath_verbs_init() is called at module load time, and any devices that get added later are missed. Similarly, if a device is unplugged while the verbs module is loaded, there's no notification from the core driver of that, and you'll go ahead and do ipath_verbs_unregister() on a device that is long gone when you get to ipath_verbs_cleanup(). - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 17:08 ` Roland Dreier @ 2006-03-10 17:32 ` Bryan O'Sullivan 2006-03-10 22:20 ` Roland Dreier 0 siblings, 1 reply; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 17:32 UTC (permalink / raw) To: Roland Dreier; +Cc: Greg KH, rolandd, akpm, davem, linux-kernel, openib-general On Fri, 2006-03-10 at 09:08 -0800, Roland Dreier wrote: > Bryan> OK. What's a safe way to iterate over the devices in the > Bryan> presence of hotplug, then? I assume it's > Bryan> list_for_each_mumble; I just don't know what mumble is :-) > > You need something that takes a reference to each device while you're > looking at it, like for_each_pci_dev(). OK, thanks. It seems like the thing to do to be fully safe might be to have ipath_get() (just rename ipath_lookup) and ipath_put() functions. Embed a kref inside ipath_devdata, and have ipath_dev_get increment the ref count on both the ipath_devdata and the pci_dev. Is that sane, or am I way off base? > But in general iterating > through devices is usually the wrong thing to do, because devices can > come and go in the middle of your loop. I understand that. In practice, though, I think this is not a good approach in many of the cases we're dealing with. Every use of ipath_max iterates over all devices for a reason. For example, the diag open routine wants to find at least one device that's up. We could maintain a separate list of devices that are in the state that it needs, so that it can just get the first entry off that list or fail if the list is empty, but that's more complex than simply iterating over every device and checking each one. > (BTW, what does making ipath_max an atomic_t get > you? The updates are protected by a lock anyway). Probably not much. The motivation was to ensure that if it got incremented during an iteration, whoever was iterating would see the update in a timely fashion. > But I was talking > about the code in ipath_verbs_init(), which is the only place you call > ipath_verbs_register() that I could find. You make one pass through > the devices that are present when ipath_verbs_init() is called at > module load time, and any devices that get added later are missed. Yes, that code ought to be restructured. Thanks for the helpful feedback. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 17:32 ` Bryan O'Sullivan @ 2006-03-10 22:20 ` Roland Dreier 0 siblings, 0 replies; 31+ messages in thread From: Roland Dreier @ 2006-03-10 22:20 UTC (permalink / raw) To: Bryan O'Sullivan Cc: Greg KH, rolandd, akpm, davem, linux-kernel, openib-general Bryan> Probably not much. The motivation was to ensure that if it Bryan> got incremented during an iteration, whoever was iterating Bryan> would see the update in a timely fashion. But as far as I can see, you never do atomic_inc() -- only atomic_set() under a spinlock. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 0 of 20] [RFC] ipath driver - another round for review
@ 2006-03-10 0:35 Bryan O'Sullivan
2006-03-10 0:35 ` [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management Bryan O'Sullivan
0 siblings, 1 reply; 31+ messages in thread
From: Bryan O'Sullivan @ 2006-03-10 0:35 UTC (permalink / raw)
To: rolandd, gregkh, akpm, davem; +Cc: linux-kernel, openib-general
[-- Attachment #1: Type: text/plain, Size: 2750 bytes --]
I posted these patches for review this morning, but due to a bug in my
posting script, only Roland actually received them. In fact, this version
of these patches contains a few changes in response to his comments.
Thanks, Roland!
The original text from this morning follows.
Here is another set of ipath driver patches for review. The list of
changes compared to the last patch set I posted is huge, so I won't go
into it here. Suffice it to say that we've taken every reviewer
comment into account, and done a *lot* of work to clean things up.
I'll point out a few things that I think are worth attention.
- We've introduced support for our PCI Express chips, so the driver
is no longer HyperTransport-specific. It's still a 64-bit driver,
because 32-bit platforms don't implement readq or writeq. (It
does compile cleanly on i386, but of course fails to link.)
- We've added an ethernet emulation driver so that if you're not
using Infiniband support, you still have a high-performance net
device (lower latency and higher bandwidth than IPoIB) for IP
traffic.
- There are no longer any fixed tables of device structures.
Instead we allocate device structures dynamically using
pci_alloc_consistent or dma_alloc_coherent, and use the
<linux/idr.h> stuff to number them.
- There are no more ioctls anywhere.
- Huge source files have been split up into digestible, logical
chunks.
- A few more sparse annotations.
- Buckets of other cleanups. Code reformatting, comment
reformatting, trimming code to <= 76 cols, you name it.
There are still a few things left to do that I know of.
- Since the core driver isn't really an IB driver at all, perhaps it
belongs in drivers/char instead of drivers/infiniband/hw?
- Our hardware only supports MSI interrupts. I don't know how to
program it to interrupt us if CONFIG_PCI_MSI is not set. Right
now, we have a timer-based hack in place to emulate interrupts.
- Not all of the code is 80- or 76-col clean yet. I'm working on
this.
- I guess we need to face the music and use sysfs binary attributes
in the two cases where we're not at the moment :-)
- There's clearly something wrong with the way we're pinning some
pages into memory, but I don't actually know what it is. I'm
pretty sure our use of get_user_pages is correct, so I suspect it
must be the code that's doing SetPageReserved (see ipath_driver.c
and ipath_file_ops.c).
I've spent some time trying to figure out what the problem is, but
am stumped. If someone knows what we should be doing instead, I'd
be delighted to hear from them.
If you have any comments or suggestions, please let me know.
<b
^ permalink raw reply [flat|nested] 31+ messages in thread* [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 0:35 [PATCH 0 of 20] [RFC] ipath driver - another round for review Bryan O'Sullivan @ 2006-03-10 0:35 ` Bryan O'Sullivan 2006-03-10 0:45 ` Roland Dreier 0 siblings, 1 reply; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 0:35 UTC (permalink / raw) To: rolandd, gregkh, akpm, davem; +Cc: linux-kernel, openib-general The ipath_diag.c file permits userspace diagnostic tools to read and write a chip's registers. It is different in purpose from the mmap interfaces to the /sys/bus/pci resource files. The ipath_sma.c file supports a lightweight userspace subnet management agent (SMA). This is used in deployments (such as HPC clusters) where a full Infiniband protocol stack is not needed. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> diff -r 1123028ac13a -r 28bb276205de drivers/infiniband/hw/ipath/ipath_diag.c --- /dev/null Thu Jan 1 00:00:00 1970 +0000 +++ b/drivers/infiniband/hw/ipath/ipath_diag.c Thu Mar 9 16:16:04 2006 -0800 @@ -0,0 +1,377 @@ +/* + * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +/* + * This file contains support for diagnostic functions. It is accessed by + * opening the ipath_diag device, normally minor number 129. Diagnostic use + * of the InfiniPath chip may render the chip or board unusable until the + * driver is unloaded, or in some cases, until the system is rebooted. + * + * Accesses to the chip through this interface are not similar to going + * through the /sys/bus/pci resource mmap interface. + */ + +#include <linux/pci.h> +#include <linux/version.h> +#include <asm/io.h> +#include <asm/uaccess.h> + +#include "ipath_common.h" +#include "ipath_kernel.h" +#include "ips_common.h" +#include "ipath_layer.h" + +__kernel_pid_t ipath_diag_alive; /* PID of diags, if running */ +static int diag_set_link; + +static int ipath_diag_open(struct inode *in, struct file *fp); +static int ipath_diag_release(struct inode *in, struct file *fp); +static ssize_t ipath_diag_read(struct file *fp, char __user *data, + size_t count, loff_t *off); +static ssize_t ipath_diag_write(struct file *fp, const char __user *data, + size_t count, loff_t *off); + +static struct file_operations diag_file_ops = { + .owner = THIS_MODULE, + .write = ipath_diag_write, + .read = ipath_diag_read, + .open = ipath_diag_open, + .release = ipath_diag_release +}; + +static struct cdev *diag_cdev; +static struct class_device *diag_class_dev; + +int ipath_diag_init(void) +{ + return ipath_cdev_init(IPATH_DIAG_MINOR, "ipath_diag", + &diag_file_ops, &diag_cdev, &diag_class_dev); +} + +void ipath_diag_cleanup(void) +{ + ipath_cdev_cleanup(&diag_cdev, &diag_class_dev); +} + +/** + * ipath_read_umem64 - read a 64-bit quantity from the chip into user space + * @dd: the infinipath device + * @uaddr: the location to store the data in user memory + * @caddr: the source chip address (full pointer, not offset) + * @count: number of bytes to copy (multiple of 32 bits) + * + * This function also localizes all chip memory accesses. + * The copy should be written such that we read full cacheline packets + * from the chip. This is usually used for a single qword + * + * NOTE: This assumes the chip address is 64-bit aligned. + */ +static int ipath_read_umem64(struct ipath_devdata *dd, void __user *uaddr, + const void __iomem *caddr, size_t count) +{ + const u64 __iomem *reg_addr = caddr; + const u64 __iomem *reg_end = reg_addr + (count / sizeof(u64)); + int ret; + + /* not very efficient, but it works for now */ + if (reg_addr < dd->ipath_kregbase || + reg_end > dd->ipath_kregend) { + ret = -EINVAL; + goto bail; + } + while (reg_addr < reg_end) { + u64 data = readq(reg_addr); + if (copy_to_user(uaddr, &data, sizeof(u64))) { + ret = -EFAULT; + goto bail; + } + reg_addr++; + uaddr++; + } + ret = 0; +bail: + return ret; +} + +/** + * ipath_write_umem64 - write a 64-bit quantity to the chip from user space + * @dd: the infinipath device + * @caddr: the destination chip address (full pointer, not offset) + * @uaddr: the source of the data in user memory + * @count: the number of bytes to copy (multiple of 32 bits) + * + * This is usually used for a single qword + * NOTE: This assumes the chip address is 64-bit aligned. + */ + +static int ipath_write_umem64(struct ipath_devdata *dd, void __iomem *caddr, + const void __user *uaddr, size_t count) +{ + u64 __iomem *reg_addr = caddr; + const u64 __iomem *reg_end = reg_addr + (count / sizeof(u64)); + int ret; + + /* not very efficient, but it works for now */ + if (reg_addr < dd->ipath_kregbase || + reg_end > dd->ipath_kregend) { + ret = -EINVAL; + goto bail; + } + while (reg_addr < reg_end) { + u64 data; + if (copy_from_user(&data, uaddr, sizeof(data))) { + ret = -EFAULT; + goto bail; + } + writeq(data, reg_addr); + + reg_addr++; + uaddr++; + } + ret = 0; +bail: + return ret; +} + +/** + * ipath_read_umem32 - read a 32-bit quantity from the chip into user space + * @dd: the infinipath device + * @uaddr: the location to store the data in user memory + * @caddr: the source chip address (full pointer, not offset) + * @count: number of bytes to copy + * + * read 32 bit values, not 64 bit; for memories that only + * support 32 bit reads; usually a single dword. + */ +static int ipath_read_umem32(struct ipath_devdata *dd, void __user *uaddr, + const void __iomem *caddr, size_t count) +{ + const u32 __iomem *reg_addr = caddr; + const u32 __iomem *reg_end = reg_addr + (count / sizeof(u32)); + int ret; + + if (reg_addr < (u32 __iomem *) dd->ipath_kregbase || + reg_end > (u32 __iomem *) dd->ipath_kregend) { + ret = -EINVAL; + goto bail; + } + /* not very efficient, but it works for now */ + while (reg_addr < reg_end) { + u32 data = readl(reg_addr); + if (copy_to_user(uaddr, &data, sizeof(data))) { + ret = -EFAULT; + goto bail; + } + + reg_addr++; + uaddr++; + } + ret = 0; +bail: + return ret; +} + +/** + * ipath_write_umem32 - write a 32-bit quantity to the chip from user space + * @dd: the infinipath device + * @caddr: the destination chip address (full pointer, not offset) + * @uaddr: the source of the data in user memory + * @count: number of bytes to copy + * + * write 32 bit values, not 64 bit; for memories that only + * support 32 bit write; usually a single dword. + */ + +static int ipath_write_umem32(struct ipath_devdata *dd, void __iomem *caddr, + const void __user *uaddr, size_t count) +{ + u32 __iomem *reg_addr = caddr; + const u32 __iomem *reg_end = reg_addr + (count / sizeof(u32)); + int ret; + + if (reg_addr < (u32 __iomem *) dd->ipath_kregbase || + reg_end > (u32 __iomem *) dd->ipath_kregend) { + ret = -EINVAL; + return ret; + } + while (reg_addr < reg_end) { + u32 data; + if (copy_from_user(&data, uaddr, sizeof(data))) { + ret = -EFAULT; + goto bail; + } + writel(data, reg_addr); + + reg_addr++; + uaddr++; + } + ret = 0; +bail: + return ret; +} + +static int ipath_diag_open(struct inode *in, struct file *fp) +{ + int i, ret; + + if (ipath_diag_alive) { + ipath_dbg("Diags already running (pid %u), failing\n", + ipath_diag_alive); + ret = -EBUSY; + goto bail; + } + + for (i = 0; i < atomic_read(&ipath_max); i++) { + struct ipath_devdata *dd = ipath_lookup(i); + + if (!dd || !(dd->ipath_flags & IPATH_PRESENT) || + !dd->ipath_kregbase) + continue; + + /* + * we need at least one infinipath device to be + * present (don't use INITTED, because we want to be + * able to open even if device is in freeze mode, + * which cleared INITTED). There is s small amount of + * risk to this, which is why we also verify kregbase + * is set. + */ + + ipath_diag_alive = current->pid; + diag_set_link = 0; + ipath_dbg("diag device now open, active as PID %u\n", + ipath_diag_alive); + + /* Only expose a way to reset the device if we + make it into diag mode. */ + ipath_expose_reset(&dd->pcidev->dev); + + ret = 0; + goto bail; + } + + ipath_dbg("No hardware yet found and initted, failing diags\n"); + ret = -ENODEV; + +bail: + return ret; +} + +static int ipath_diag_release(struct inode *i, struct file *f) +{ + ipath_diag_alive = 0; + ipath_dbg("Closing DIAG device\n"); + return 0; +} + +static ssize_t ipath_diag_read(struct file *fp, char __user *data, + size_t count, loff_t *off) +{ + int unit = 0; /* XXX this is bogus */ + struct ipath_devdata *dd; + void __iomem *kreg_base; + ssize_t ret; + + dd = ipath_lookup(unit); + if (!dd) { + ret = -ENODEV; + goto bail; + } + + kreg_base = dd->ipath_kregbase; + + if (count == 0) + ret = 0; + else if ((count % 4) || (*off % 4)) + /* address or length is not 32-bit aligned, hence invalid */ + ret = -EINVAL; + else if ((count % 8) || (*off % 8)) + /* address or length not 64-bit aligned; do 32-bit reads */ + ret = ipath_read_umem32(dd, data, kreg_base + *off, count); + else + ret = ipath_read_umem64(dd, data, kreg_base + *off, count); + + if (ret >= 0) { + *off += count; + ret = count; + } + +bail: + return ret; +} + +static ssize_t ipath_diag_write(struct file *fp, const char __user *data, + size_t count, loff_t *off) +{ + int unit = 0; /* XXX this is bogus */ + struct ipath_devdata *dd; + void __iomem *kreg_base; + ssize_t ret; + + dd = ipath_lookup(unit); + if (!dd) { + ret = -ENODEV; + goto bail; + } + kreg_base = dd->ipath_kregbase; + + if (count == 0) + ret = 0; + else if ((count % 4) || (*off % 4)) + /* address or length is not 32-bit aligned, hence invalid */ + ret = -EINVAL; + else if ((count % 8) || (*off % 8)) + /* address or length not 64-bit aligned; do 32-bit writes */ + ret = ipath_write_umem32(dd, kreg_base + *off, data, count); + else + ret = ipath_write_umem64(dd, kreg_base + *off, data, count); + + if (ret >= 0) { + *off += count; + ret = count; + } + +bail: + return ret; +} + +void ipath_diag_bringup_link(struct ipath_devdata *dd) +{ + if (diag_set_link || (dd->ipath_flags & IPATH_LINKACTIVE)) + return; + + diag_set_link = 1; + ipath_cdbg(VERBOSE, "Trying to set to set link active for " + "diag pkt\n"); + ipath_layer_set_linkstate(dd, IPATH_IB_LINKARM); + ipath_layer_set_linkstate(dd, IPATH_IB_LINKACTIVE); +} diff -r 1123028ac13a -r 28bb276205de drivers/infiniband/hw/ipath/ipath_sma.c --- /dev/null Thu Jan 1 00:00:00 1970 +0000 +++ b/drivers/infiniband/hw/ipath/ipath_sma.c Thu Mar 9 16:16:04 2006 -0800 @@ -0,0 +1,385 @@ +/* + * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +#include <linux/io.h> +#include <linux/pci.h> +#include <linux/cdev.h> +#include <linux/vmalloc.h> +#include <asm/uaccess.h> + +#include "ipath_kernel.h" +#include "ips_common.h" +#include "ipath_layer.h" + +/* PID of SMA, if it's running. Atomic because it's accessed without + * a lock outside this file. */ +atomic_t ipath_sma_alive; +DEFINE_SPINLOCK(ipath_sma_lock); /* SMA receive */ + +/* max SM received packets we'll queue; we keep the most recent packets. */ + +struct _ipath_sma_rpkt ipath_sma_data[IPATH_NUM_SMA_PKTS]; + +unsigned ipath_sma_first; /* oldest sma packet index */ +unsigned ipath_sma_next; /* next sma packet index to use */ + +/* + * ipath_sma_data_bufs has one extra, pointed to by ipath_sma_data_spare, + * so we can exchange buffers to do copy_to_user, and not hold the lock + * across the copy_to_user(). + */ + +u8 ipath_sma_data_bufs[IPATH_NUM_SMA_PKTS + 1][IPATH_SMA_MAX_PKTSZ]; +u8 *ipath_sma_data_spare; +/* sma waits globally on all units */ +wait_queue_head_t ipath_sma_wait; +wait_queue_head_t ipath_sma_state_wait; + +static int ipath_sma_open(struct inode *in, struct file *fp); +static int ipath_sma_release(struct inode *in, struct file *fp); +static ssize_t ipath_sma_read(struct file *fp, char __user *data, + size_t count, loff_t *off); +static ssize_t ipath_sma_write(struct file *fp, const char __user *data, + size_t count, loff_t *off); + +static struct file_operations sma_file_ops = { + .owner = THIS_MODULE, + .write = ipath_sma_write, + .read = ipath_sma_read, + .open = ipath_sma_open, + .release = ipath_sma_release +}; + +static struct cdev *sma_cdev; +static struct class_device *sma_class_dev; + +int ipath_sma_init(void) +{ + return ipath_cdev_init(IPATH_SMA_MINOR, "ipath_sma", &sma_file_ops, + &sma_cdev, &sma_class_dev); +} + +void ipath_sma_cleanup(void) +{ + ipath_cdev_cleanup(&sma_cdev, &sma_class_dev); +} + +static int ipath_sma_open(struct inode *in, struct file *fp) +{ + unsigned long flags; + __kernel_pid_t pid; + int s, ret; + + spin_lock_irqsave(&ipath_sma_lock, flags); + + pid = atomic_read(&ipath_sma_alive); + if (pid) { + ipath_dbg("SMA already running (pid %u), failing\n", pid); + ret = -EBUSY; + goto bail; + } + + for (s = 0; s < atomic_read(&ipath_max); s++) { + struct ipath_devdata *dd = ipath_lookup(s); + /* we need at least one infinipath device to be initialized. */ + if (dd && dd->ipath_flags & IPATH_INITTED) { + pid = current->pid; + *dd->ipath_statusp |= IPATH_STATUS_SMA; + *dd->ipath_statusp &= ~IPATH_STATUS_OIB_SMA; + } + } + if (pid) { + ipath_cdbg(SMA, "SMA device now open, SMA active as PID %u\n", + pid); + atomic_set(&ipath_sma_alive, pid); + ret = 0; + } else { + ret = -ENODEV; + ipath_dbg("No hardware yet found and initted, failing\n"); + } +bail: + spin_unlock_irqrestore(&ipath_sma_lock, flags); + return ret; +} + +static int ipath_sma_release(struct inode *in, struct file *fp) +{ + unsigned long flags; + int s; + + spin_lock_irqsave(&ipath_sma_lock, flags); + + atomic_set(&ipath_sma_alive, 0); + ipath_cdbg(SMA, "Closing SMA device\n"); + for (s = 0; s < atomic_read(&ipath_max); s++) { + struct ipath_devdata *dd = ipath_lookup(s); + + if (!dd || !(dd->ipath_flags & IPATH_INITTED)) + continue; + *dd->ipath_statusp &= ~IPATH_STATUS_SMA; + if (dd->verbs_layer.l_flags & IPATH_VERBS_KERNEL_SMA) + *dd->ipath_statusp |= IPATH_STATUS_OIB_SMA; + } + spin_unlock_irqrestore(&ipath_sma_lock, flags); + return 0; +} + +/** + * ipath_sma_read - receive an IB packet with QP 0 or 1 + * @fp: the SMA device file pointer + * @data: ipath_sma_pkt structure saying where to place the SMA data + * @count: unused by this code + * @off: unused by this code + * + * This receives from all/any of the active chips, and we currently do not + * allow specifying just one (we could, by filling in unit in the library + * before the syscall, and checking here). + */ +static ssize_t ipath_sma_read(struct file *fp, char __user *data, + size_t count, loff_t *off) +{ + struct _ipath_sma_rpkt *smpkt; + struct ipath_sma_pkt rp; + unsigned long flags; + ssize_t ret; + u32 rp_len; + int i, any; + u8 *sdata; + u32 slen; + + if (copy_from_user(&rp, data, sizeof(rp))) { + ret = -EFAULT; + goto bail; + } + + if (!ipath_sma_data_spare) { + ipath_dbg("can't do receive, sma not initialized\n"); + ret = -ENETDOWN; + goto bail; + } + + for (any = i = 0; i < atomic_read(&ipath_max); i++) { + struct ipath_devdata *dd = ipath_lookup(i); + if (dd && (dd->ipath_flags & IPATH_INITTED)) + any++; + } + + if (!any) { /* no hardware, freeze, etc. */ + ipath_cdbg(SMA, "Didn't find any initialized and usable chips\n"); + ret = -ENODEV; + goto bail; + } + + wait_event_interruptible(ipath_sma_wait, + ipath_sma_data[ipath_sma_first].len); + + spin_lock_irqsave(&ipath_sma_lock, flags); + if (ipath_sma_data[ipath_sma_first].len == 0) { + /* usually means SMA process received a signal */ + spin_unlock_irqrestore(&ipath_sma_lock, flags); + ret = -EAGAIN; + goto bail; + } + + smpkt = &ipath_sma_data[ipath_sma_first]; + + /* + * we swap out the buffer we are going to use with the spare buffer + * and set spare to that buffer. This code is the only code that + * ever manipulates spare, other than the initialization code. + * This code should never be entered by more than one process at + * a time, and if it is, the user code doing so deserves what it gets; + * it won't break anything in the driver by doing so. We do it this + * way to avoid holding a lock across the copy_to_user, which could + * fault, or delay a long time while paging occurs; ditto for printks + */ + + rp_len = rp.len; + rp.len = smpkt->len; + rp.unit = smpkt->unit; + + slen = smpkt->len; + sdata = smpkt->buf; + smpkt->buf = ipath_sma_data_spare; + ipath_sma_data_spare = sdata; + smpkt->len = 0; /* it's available again */ + if (++ipath_sma_first >= IPATH_NUM_SMA_PKTS) + ipath_sma_first = 0; + spin_unlock_irqrestore(&ipath_sma_lock, flags); + + if (copy_to_user((void __user *) (unsigned long) rp.data, + sdata, min(rp_len, slen))) { + ret = -EFAULT; + goto bail; + } + + if (copy_to_user(data, &rp, sizeof(rp))) { + ret = -EFAULT; + goto bail; + } + + ret = sizeof(rp); +bail: + return ret; +} + +/** + * ipath_sma_write - receive an IB packet with QP 0 or 1 + * @fp: the SMA device file pointer + * @data: ipath_sma_pkt structure saying where to get the SMA packet + * @count: size of data to write + * @off: unused by this code + * + * This routine is no longer on any critical paths; it is used only + * for sending SMA packets, and some diagnostic usage. + * Because it's currently sma only, there are no checks to see if the + * link is up; sma must be able to send in the not fully initialized state + */ +static ssize_t ipath_sma_write(struct file *fp, const char __user *data, + size_t count, loff_t *off) +{ + u32 __iomem *piobuf; + u32 plen, clen, pbufn; + struct ipath_sma_pkt sp; + u32 *tmpbuf = NULL; + struct ipath_devdata *dd; + ssize_t ret = 0; + u64 val; + + if (count < sizeof(sp)) { + ret = -EINVAL; + goto bail; + } + + if (copy_from_user(&sp, data, sizeof(sp))) { + ret = -EFAULT; + goto bail; + } + + // send count must be an exact number of dwords + if (sp.len & 3) { + ret = -EINVAL; + goto bail; + } + + clen = sp.len >> 2; + + dd = ipath_lookup(sp.unit); + if (!dd || !(dd->ipath_flags & IPATH_PRESENT) || !dd->ipath_kregbase) { + ipath_cdbg(SMA, "illegal unit %u for sma send\n", sp.unit); + ret = -ENODEV; + goto bail; + } + + if (ipath_diag_alive) + ipath_diag_bringup_link(dd); + + if (!(dd->ipath_flags & IPATH_INITTED)) { + /* no hardware, freeze, etc. */ + ipath_cdbg(SMA, "unit %u not usable\n", dd->ipath_unit); + ret = -ENODEV; + goto bail; + } + val = dd->ipath_lastibcstat & IPATH_IBSTATE_MASK; + if (val != IPATH_IBSTATE_INIT && val != IPATH_IBSTATE_ARM && + val != IPATH_IBSTATE_ACTIVE) { + ipath_cdbg(SMA, "unit %u not ready (state %llx)\n", + dd->ipath_unit, (unsigned long long) val); + ret = -EINVAL; + goto bail; + } + + /* need total length before first word written */ + plen = sizeof(u32) + sp.len; /* +1 word is for the qword padding */ + + if ((plen + 4) > dd->ipath_ibmaxlen) { + ipath_dbg("Pkt len 0x%x > ibmaxlen %x\n", + plen - 4, dd->ipath_ibmaxlen); + ret = -EINVAL; + goto bail; /* before writing pbc */ + } + tmpbuf = vmalloc(plen); + if (!tmpbuf) { + dev_info(&dd->pcidev->dev, "Unable to allocate tmp buffer, " + "failing\n"); + ret = -ENOMEM; + goto bail; + } + + if (copy_from_user(tmpbuf, + (const void __user *) (unsigned long) sp.data, + sp.len)) { + ret = -EFAULT; + goto bail; + } + + piobuf = ipath_getpiobuf(dd, &pbufn); + if (!piobuf) { + ipath_cdbg(SMA, "No PIO buffers available unit %u %u times\n", + dd->ipath_unit, dd->ipath_nosma_bufs); + dd->ipath_nosma_bufs++; + ret = -EBUSY; + goto bail; + } + if (dd->ipath_nosma_bufs) { + ipath_cdbg(SMA, "Unit %u got SMA send buffer after %u failures, %u seconds\n", + dd->ipath_unit, dd->ipath_nosma_bufs, + dd->ipath_nosma_secs); + dd->ipath_nosma_bufs = 0; + dd->ipath_nosma_secs = 0; + } + + plen >>= 2; /* in dwords */ + + if (ipath_debug & __IPATH_PKTDBG) // SMA and PKT, both + ipath_cdbg(SMA, "unit %u 0x%x+1w pio%d\n", + dd->ipath_unit, plen - 1, pbufn); + + /* we have to flush after the PBC for correctness on some cpus + * or WC buffer can be written out of order */ + writeq(plen, piobuf); + ipath_flush_wc(); + /* copy all by the trigger word, then flush, so it's written + * to chip before trigger word, then write trigger word, then + * flush again, so packet is sent. */ + __iowrite32_copy(piobuf + 2, tmpbuf, clen - 1); + ipath_flush_wc(); + __raw_writel(tmpbuf[clen - 1], piobuf + clen + 1); + ipath_flush_wc(); + ipath_stats.sps_sma_spkts++; + + ret = sizeof(sp); + +bail: + vfree(tmpbuf); + return ret; +} ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 0:35 ` [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management Bryan O'Sullivan @ 2006-03-10 0:45 ` Roland Dreier 2006-03-10 0:47 ` Bryan O'Sullivan 0 siblings, 1 reply; 31+ messages in thread From: Roland Dreier @ 2006-03-10 0:45 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general OK, now I can see the change you made: > +atomic_t ipath_sma_alive; > +DEFINE_SPINLOCK(ipath_sma_lock); /* SMA receive */ So why is ipath_sma_alive an atomic_t (and why isn't it static)? You never modify ipath_sma_alive outside of your spinlock, so I don't see what having it be atomic buys you. - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 0:45 ` Roland Dreier @ 2006-03-10 0:47 ` Bryan O'Sullivan 2006-03-10 0:52 ` Roland Dreier 0 siblings, 1 reply; 31+ messages in thread From: Bryan O'Sullivan @ 2006-03-10 0:47 UTC (permalink / raw) To: Roland Dreier; +Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general On Thu, 2006-03-09 at 16:45 -0800, Roland Dreier wrote: > So why is ipath_sma_alive an atomic_t (and why isn't it static)? > You never modify ipath_sma_alive outside of your spinlock, so I don't > see what having it be atomic buys you. It's read outside of this file, without a lock held. <b ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management 2006-03-10 0:47 ` Bryan O'Sullivan @ 2006-03-10 0:52 ` Roland Dreier 0 siblings, 0 replies; 31+ messages in thread From: Roland Dreier @ 2006-03-10 0:52 UTC (permalink / raw) To: Bryan O'Sullivan Cc: rolandd, gregkh, akpm, davem, linux-kernel, openib-general Bryan> It's read outside of this file, without a lock held. I missed the other reference in another patch. But the central point still stands: if all you do is atomic_set() and atomic_read(), then using atomic_t doesn't buy you anything. Just look at what atomic_read() expands to -- using it isn't protecting you against anything, so either you have a race, or you were safe without atomic_t. The only point to atomic_t is so that you can safely do read-modify-write things like atomic_inc(). - R. ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2006-03-10 22:21 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <eac2ad3017b5f160d24c.1141922822@localhost.localdomain>
2006-03-09 23:20 ` [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management Roland Dreier
2006-03-09 23:39 ` Bryan O'Sullivan
2006-03-09 23:47 ` Roland Dreier
2006-03-09 23:50 ` Bryan O'Sullivan
2006-03-09 23:52 ` Roland Dreier
2006-03-10 15:54 ` Michael S. Tsirkin
2006-03-10 16:05 ` Bryan O'Sullivan
2006-03-09 23:24 ` Roland Dreier
2006-03-09 23:49 ` Bryan O'Sullivan
2006-03-09 23:51 ` Roland Dreier
2006-03-09 23:26 ` Roland Dreier
2006-03-09 23:52 ` Bryan O'Sullivan
2006-03-10 0:00 ` Roland Dreier
2006-03-10 0:04 ` Bryan O'Sullivan
2006-03-10 0:45 ` Greg KH
2006-03-10 0:48 ` Bryan O'Sullivan
2006-03-10 1:04 ` Greg KH
2006-03-10 4:41 ` Bryan O'Sullivan
2006-03-10 5:48 ` Greg KH
2006-03-10 13:40 ` Bryan O'Sullivan
2006-03-10 5:55 ` Roland Dreier
2006-03-10 13:43 ` Bryan O'Sullivan
2006-03-10 16:58 ` Greg KH
2006-03-10 17:05 ` Bryan O'Sullivan
2006-03-10 17:08 ` Roland Dreier
2006-03-10 17:32 ` Bryan O'Sullivan
2006-03-10 22:20 ` Roland Dreier
2006-03-10 0:35 [PATCH 0 of 20] [RFC] ipath driver - another round for review Bryan O'Sullivan
2006-03-10 0:35 ` [PATCH 9 of 20] ipath - char devices for diagnostics and lightweight subnet management Bryan O'Sullivan
2006-03-10 0:45 ` Roland Dreier
2006-03-10 0:47 ` Bryan O'Sullivan
2006-03-10 0:52 ` Roland Dreier
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.