* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. [not found] <20021218094714.43C712C076@lists.samba.org> @ 2002-12-18 12:57 ` Ed Tomlinson 2002-12-18 16:29 ` Dave Jones [not found] ` <200212181803.23279.tomlins@cam.org> 1 sibling, 1 reply; 18+ messages in thread From: Ed Tomlinson @ 2002-12-18 12:57 UTC (permalink / raw) To: Rusty Russell, Dave Jones; +Cc: linux-kernel On December 18, 2002 04:46 am, Rusty Russell wrote: > In message <20021218102004.A4947@suse.de> you write: > > On Wed, Dec 18, 2002 at 06:07:48PM +1100, Rusty Russell wrote: > > > Dave, it's true that it's my fault, but I'm afraid it looks like your > > > bug 8). This is most likely karma for not reporting bugs in the > > > module code when you had problems 8) > > > > Bah, my bad. 8) > > That's OK, you're in good company as the first I heard from Alan was > that he gave up kernel work because "modules was so broken". 8( > > > > static int __init agp_backend_initialize(struct pci_dev *dev) > > > > > > .... > > > int agp_register_driver (struct pci_dev *dev) > > > { > > > .... > > > ret_val = agp_backend_initialize(dev); > > > > Whoops. Thanks, fixed up here. > > I'll bet I duped that bug in other places too, as I've rejigged > > stuff around. I'll double check those paths in a mo. > > > > > Ed, does it work if you take all the __init out of the agp code? > > > > My moneys on it working. The oops looked like it was jumping to oblivion > > when it called agp_backend_initialize. So the new modutils discards > > __init sections ? That's a new feature isn't it ? > > Yeah, kind of a bonus. It's actually arch-dependent. > > I've added an item in my TODO list to check for relocations on the > non-init section which point into the init sections. It'd be cute. > It'll probably never go into the main kernel, and it's no actually an > error, you can imagine code which does: > > if (initializing) > some_init_func(); Dave when you have this in a bk tree let me know and I will pull and verify it working here. Thanks to both of you Ed Tomlinson ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-18 12:57 ` [drm:drm_init] *ERROR* Cannot initialize the agpgart module Ed Tomlinson @ 2002-12-18 16:29 ` Dave Jones 0 siblings, 0 replies; 18+ messages in thread From: Dave Jones @ 2002-12-18 16:29 UTC (permalink / raw) To: Ed Tomlinson; +Cc: Rusty Russell, linux-kernel On Wed, Dec 18, 2002 at 07:57:53AM -0500, Ed Tomlinson wrote: > > > > Ed, does it work if you take all the __init out of the agp code? > > > My moneys on it working. The oops looked like it was jumping to oblivion > > > when it called agp_backend_initialize. > Dave when you have this in a bk tree let me know and I will pull and > verify it working here. bk://linux-dj.bkbits.net/agpgart I've given it a compile testing, but not booted it yet. Scream if necessary. Dave -- | Dave Jones. http://www.codemonkey.org.uk ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <200212181803.23279.tomlins@cam.org>]
[parent not found: <20021219105909.GE29122@suse.de>]
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. [not found] ` <20021219105909.GE29122@suse.de> @ 2002-12-20 23:29 ` Ed Tomlinson 2002-12-21 14:22 ` Dave Jones 0 siblings, 1 reply; 18+ messages in thread From: Ed Tomlinson @ 2002-12-20 23:29 UTC (permalink / raw) To: Dave Jones; +Cc: linux-kernel On December 19, 2002 05:59 am, Dave Jones wrote: > On Wed, Dec 18, 2002 at 06:03:23PM -0500, Ed Tomlinson wrote: > > both with and without agp 3.0 enabled I get: > > > > Dec 18 17:51:10 oscar kernel: Linux agpgart interface v0.100 (c) Dave > > Jones Dec 18 17:51:29 oscar kernel: via_agp: Unknown symbol > > agp_generic_agp_3_0_enable > > I don't get this. Can you mail me your .config ? Dave, with the pull from this morning (8am EST), it almost works modular. I get: Dec 20 18:20:19 oscar upsd[636]: Communication established Dec 20 18:20:47 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones Dec 20 18:20:47 oscar kernel: agpgart: Detected VIA MVP3 chipset Dec 20 18:20:47 oscar kernel: agpgart: AGP aperture is 64M @ 0xe0000000 Dec 20 18:20:58 oscar kernel: [drm] Initialized mga 3.1.0 20021029 on minor 0 Dec 20 18:20:58 oscar kernel: Module agpgart cannot be unloaded due to unsafe usage in drivers/char/ag p/backend.c:58 but find this in the X startup log. (EE) MGA: Failed to load module "mga_hal" (module does not exist, 0) (EE) MGA(0): [agp] Out of memory (-1014) (EE) MGA(0): [drm] failed to remove DRM signal handler DRIUnlock called when not locked ideas? Ed Tomlinson ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-20 23:29 ` Ed Tomlinson @ 2002-12-21 14:22 ` Dave Jones 2002-12-22 4:41 ` Ed Tomlinson 2002-12-23 1:10 ` Rusty Russell 0 siblings, 2 replies; 18+ messages in thread From: Dave Jones @ 2002-12-21 14:22 UTC (permalink / raw) To: Ed Tomlinson; +Cc: Rusty Russell, linux-kernel On Fri, Dec 20, 2002 at 06:29:18PM -0500, Ed Tomlinson wrote: > Dave, with the pull from this morning (8am EST), it almost works modular. > I get: > > Dec 20 18:20:19 oscar upsd[636]: Communication established > Dec 20 18:20:47 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones > Dec 20 18:20:47 oscar kernel: agpgart: Detected VIA MVP3 chipset > Dec 20 18:20:47 oscar kernel: agpgart: AGP aperture is 64M @ 0xe0000000 > Dec 20 18:20:58 oscar kernel: [drm] Initialized mga 3.1.0 20021029 on minor 0 > Dec 20 18:20:58 oscar kernel: Module agpgart cannot be unloaded due to unsafe usage in drivers/char/ag > p/backend.c:58 This one is due to the way AGPGART does (or has done for the last 3 years) its module locking. It does a MOD_INC_USE_COUNT as soon as someone calls the acquire routines. (So you can't unload agpgart whilst you've a 3d using app (like X) open). This seems quite sensible, but these days you can't unload agpgart.ko anyway because the chipset module (via-agp.ko in your case) already has it 'in use', so I'm tempted to drop those bits. > but find this in the X startup log. > (EE) MGA: Failed to load module "mga_hal" (module does not exist, 0) That's matrox's binary only X blob. Not my fault. > (EE) MGA(0): [agp] Out of memory (-1014) This one is. But it may be a knock-on effect from the bug above. I'll nail that one first. > (EE) MGA(0): [drm] failed to remove DRM signal handler > DRIUnlock called when not locked That one's a problem for the DRI folks. Dave -- | Dave Jones. http://www.codemonkey.org.uk ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-21 14:22 ` Dave Jones @ 2002-12-22 4:41 ` Ed Tomlinson 2002-12-22 12:28 ` Dave Jones 2002-12-23 1:10 ` Rusty Russell 1 sibling, 1 reply; 18+ messages in thread From: Ed Tomlinson @ 2002-12-22 4:41 UTC (permalink / raw) To: Dave Jones; +Cc: linux-kernel On December 21, 2002 09:22 am, Dave Jones wrote: > On Fri, Dec 20, 2002 at 06:29:18PM -0500, Ed Tomlinson wrote: > > but find this in the X startup log. > > (EE) MGA: Failed to load module "mga_hal" (module does not exist, 0) > > That's matrox's binary only X blob. Not my fault. Know all about this one... My G400 runs just fine without with X setups. > > (EE) MGA(0): [agp] Out of memory (-1014) > > This one is. But it may be a knock-on effect from the bug above. > I'll nail that one first. Thanks > > (EE) MGA(0): [drm] failed to remove DRM signal handler > > DRIUnlock called when not locked > > That one's a problem for the DRI folks. Yep Now for something new. With bk current (6pm EST) I get: Dec 21 23:30:56 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones Dec 21 23:30:56 oscar kernel: agpgart: Detected VIA MVP3 chipset Dec 21 23:30:56 oscar kernel: printing eip: Dec 21 23:30:56 oscar kernel: e0db9000 Dec 21 23:30:56 oscar kernel: Oops: 0000 Dec 21 23:30:56 oscar kernel: CPU: 0 Dec 21 23:30:56 oscar kernel: EIP: 0060:[<e0db9000>] Not tainted Dec 21 23:30:56 oscar kernel: EFLAGS: 00010296 Dec 21 23:30:56 oscar kernel: EIP is at 0xe0db9000 Dec 21 23:30:56 oscar kernel: eax: 00000000 ebx: c15d8800 ecx: 000000a4 edx: 07000203 Dec 21 23:30:56 oscar kernel: esi: 00000000 edi: c15d884c ebp: c9fc3ee0 esp: c9fc3ec8 Dec 21 23:30:56 oscar kernel: ds: 0068 es: 0068 ss: 0068 Dec 21 23:30:56 oscar kernel: Process modprobe (pid: 2175, threadinfo=c9fc2000 task=d60e92a0) Dec 21 23:30:56 oscar kernel: Stack: e0de9108 c15d8800 00000000 c15d884c 00000000 00000000 c9fc3ef0 e0de92f8 Dec 21 23:30:56 oscar kernel: c15d8800 c15d8800 c9fc3f04 e0dd035a c15d8800 e0dcb4a8 a0dcb4a8 c9fc3f28 Dec 21 23:30:56 oscar kernel: c019d252 c15d8800 e0dd046c c15d884c e0dcb4a8 ffffffed c15d8800 e0dcb480 Dec 21 23:30:56 oscar kernel: Call Trace: Dec 21 23:30:56 oscar kernel: [<e0de9108>] agp_backend_initialize+0x1c/0x168 [agpgart] Dec 21 23:30:56 oscar kernel: [<e0de92f8>] agp_register_driver+0x2c/0xac [agpgart] Dec 21 23:30:56 oscar kernel: [<e0dd035a>] agp_via_probe+0x62/0x6c [via_agp] Dec 21 23:30:56 oscar kernel: [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [pci_device_probe+70/96] pci_device_probe+0x46/0x60 Dec 21 23:30:56 oscar kernel: [<e0dd046c>] agp_via_pci_table+0x0/0x38 [via_agp] Dec 21 23:30:56 oscar kernel: [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [<e0dcb480>] agp_via_pci_driver+0x0/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [bus_match+56/108] bus_match+0x38/0x6c Dec 21 23:30:56 oscar kernel: [driver_attach+66/108] driver_attach+0x42/0x6c Dec 21 23:30:56 oscar kernel: [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [bus_add_driver+172/204] bus_add_driver+0xac/0xcc Dec 21 23:30:56 oscar kernel: [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [<e0dcb4c8>] agp_via_pci_driver+0x48/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [driver_register+54/56] driver_register+0x36/0x38 Dec 21 23:30:56 oscar kernel: [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [pci_register_driver+68/80] pci_register_driver+0x44/0x50 Dec 21 23:30:56 oscar kernel: [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [<e0dd0372>] agp_via_init+0xe/0x44 [via_agp] Dec 21 23:30:56 oscar kernel: [<e0dcb480>] agp_via_pci_driver+0x0/0xa0 [via_agp] Dec 21 23:30:56 oscar kernel: [sys_init_module+274/408] sys_init_module+0x112/0x198 Dec 21 23:30:56 oscar kernel: [syscall_call+7/11] syscall_call+0x7/0xb Dec 21 23:30:56 oscar kernel: Dec 21 23:30:56 oscar kernel: Code: Bad EIP value. when modprobing via_agp Ed ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-22 4:41 ` Ed Tomlinson @ 2002-12-22 12:28 ` Dave Jones 2002-12-22 14:08 ` Ed Tomlinson 0 siblings, 1 reply; 18+ messages in thread From: Dave Jones @ 2002-12-22 12:28 UTC (permalink / raw) To: Ed Tomlinson; +Cc: linux-kernel On Sat, Dec 21, 2002 at 11:41:29PM -0500, Ed Tomlinson wrote: > > Now for something new. With bk current (6pm EST) I get: > Dec 21 23:30:56 oscar kernel: Call Trace: > Dec 21 23:30:56 oscar kernel: [<e0de9108>] agp_backend_initialize+0x1c/0x168 [agpgart] > Dec 21 23:30:56 oscar kernel: [<e0de92f8>] agp_register_driver+0x2c/0xac [agpgart] I already fixed a bug with the same call-trace. This looks like you've still got old .o files around. Can you make clean and rebuild just to make sure ? Dave -- | Dave Jones. http://www.codemonkey.org.uk ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-22 12:28 ` Dave Jones @ 2002-12-22 14:08 ` Ed Tomlinson 0 siblings, 0 replies; 18+ messages in thread From: Ed Tomlinson @ 2002-12-22 14:08 UTC (permalink / raw) To: Dave Jones; +Cc: linux-kernel On December 22, 2002 07:28 am, Dave Jones wrote: > On Sat, Dec 21, 2002 at 11:41:29PM -0500, Ed Tomlinson wrote: > > Now for something new. With bk current (6pm EST) I get: > > Dec 21 23:30:56 oscar kernel: Call Trace: > > Dec 21 23:30:56 oscar kernel: [<e0de9108>] > > agp_backend_initialize+0x1c/0x168 [agpgart] Dec 21 23:30:56 oscar > > kernel: [<e0de92f8>] agp_register_driver+0x2c/0xac [agpgart] > > I already fixed a bug with the same call-trace. This looks like > you've still got old .o files around. Can you make clean and rebuild > just to make sure ? Rebuilt with make clean - I am getting the same oops... Ed ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-21 14:22 ` Dave Jones 2002-12-22 4:41 ` Ed Tomlinson @ 2002-12-23 1:10 ` Rusty Russell 2002-12-23 12:10 ` Dave Jones 1 sibling, 1 reply; 18+ messages in thread From: Rusty Russell @ 2002-12-23 1:10 UTC (permalink / raw) To: Dave Jones; +Cc: Ed Tomlinson, linux-kernel In message <20021221142226.GA24941@suse.de> you write: > On Fri, Dec 20, 2002 at 06:29:18PM -0500, Ed Tomlinson wrote: > > Dave, with the pull from this morning (8am EST), it almost works modular. > > I get: > > > > Dec 20 18:20:19 oscar upsd[636]: Communication established > > Dec 20 18:20:47 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jone s > > Dec 20 18:20:47 oscar kernel: agpgart: Detected VIA MVP3 chipset > > Dec 20 18:20:47 oscar kernel: agpgart: AGP aperture is 64M @ 0xe0000000 > > Dec 20 18:20:58 oscar kernel: [drm] Initialized mga 3.1.0 20021029 on mino r 0 > > Dec 20 18:20:58 oscar kernel: Module agpgart cannot be unloaded due to uns afe usage in drivers/char/ag > > p/backend.c:58 > > This one is due to the way AGPGART does (or has done for the last 3 > years) its module locking. It does a MOD_INC_USE_COUNT as soon as > someone calls the acquire routines. Which is racy under SMP, and under preempt, which is why it's deprecated. > (So you can't unload agpgart whilst you've a 3d using app (like X) > open). This seems quite sensible, but these days you can't unload > agpgart.ko anyway because the chipset module (via-agp.ko in your > case) already has it 'in use', so I'm tempted to drop those bits. If this is true (it usually is), you can simply drop them. There are other cases where the caller is not grabbing references, so MOD_INC_USE_COUNT is better than nothing (should the warning stay for 2.6? Good question). Hope that helps, Rusty. -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-23 1:10 ` Rusty Russell @ 2002-12-23 12:10 ` Dave Jones 2002-12-26 3:51 ` Rusty Russell 0 siblings, 1 reply; 18+ messages in thread From: Dave Jones @ 2002-12-23 12:10 UTC (permalink / raw) To: Rusty Russell; +Cc: Ed Tomlinson, linux-kernel On Mon, Dec 23, 2002 at 12:10:47PM +1100, Rusty Russell wrote: > > This one is due to the way AGPGART does (or has done for the last 3 > > years) its module locking. It does a MOD_INC_USE_COUNT as soon as > > someone calls the acquire routines. > Which is racy under SMP, and under preempt, which is why it's > deprecated. Crapola. I've just realised why this is no longer relevant. I've moved what this was protecting into the per chipset modules. Right now its possible to load modules, start x (which loads DRM), then rmmod via_agp from under its feet. result - boom when something tries to use 3d. Sure it's unlikely someone would be crazy enough to try and do this, and they deserve what they get, but it's not exactly clean, or nice. So where is the documentation describing module locking de jour ? > > (So you can't unload agpgart whilst you've a 3d using app (like X) > > open). This seems quite sensible, but these days you can't unload > > agpgart.ko anyway because the chipset module (via-agp.ko in your > > case) already has it 'in use', so I'm tempted to drop those bits. > If this is true (it usually is), you can simply drop them. I'll need 'something' in the chipset drivers. The first thing that jumps to mind is to give the chipset drivers an 'acquire' op which does the locking much like the old agp_backend_acquire() does. > There are other cases where the caller is not grabbing references, so > MOD_INC_USE_COUNT is better than nothing (should the warning stay for > 2.6? Good question). Why exactly isn't it safe any more? If there's documentation on this, I'd love to read it. If there isn't, there really should be. Dave -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-23 12:10 ` Dave Jones @ 2002-12-26 3:51 ` Rusty Russell 2002-12-28 0:20 ` Roman Zippel 0 siblings, 1 reply; 18+ messages in thread From: Rusty Russell @ 2002-12-26 3:51 UTC (permalink / raw) To: Dave Jones; +Cc: Ed Tomlinson, linux-kernel In message <20021223121022.GA32080@suse.de> you write: > On Mon, Dec 23, 2002 at 12:10:47PM +1100, Rusty Russell wrote: > > > This one is due to the way AGPGART does (or has done for the last 3 > > > years) its module locking. It does a MOD_INC_USE_COUNT as soon as > > > someone calls the acquire routines. > > Which is racy under SMP, and under preempt, which is why it's > > deprecated. > > Crapola. I've just realised why this is no longer relevant. > I've moved what this was protecting into the per chipset modules. > Right now its possible to load modules, start x (which loads DRM), > then rmmod via_agp from under its feet. result - boom when something > tries to use 3d. > > Sure it's unlikely someone would be crazy enough to try and do this, > and they deserve what they get, but it's not exactly clean, or nice. > > So where is the documentation describing module locking de jour ? Here's the FAQ again. Note that the init stuff is not currently true (you can enter a module doing init, with possibly bad results), because Linus patched my code. It's on my TODO list to revisit this issue. > > > (So you can't unload agpgart whilst you've a 3d using app (like X) > > > open). This seems quite sensible, but these days you can't unload > > > agpgart.ko anyway because the chipset module (via-agp.ko in your > > > case) already has it 'in use', so I'm tempted to drop those bits. > > If this is true (it usually is), you can simply drop them. > > I'll need 'something' in the chipset drivers. The first thing that > jumps to mind is to give the chipset drivers an 'acquire' op > which does the locking much like the old agp_backend_acquire() does. The problem is, a module basically can't lock itself. Hence you should expose an "owner" field when you register an interface, and have the caller do the locking. There are other possibly solutions, but none of them were as simple as "lock down before you call in". > > There are other cases where the caller is not grabbing references, so > > MOD_INC_USE_COUNT is better than nothing (should the warning stay for > > 2.6? Good question). > > Why exactly isn't it safe any more? If there's documentation on this, > I'd love to read it. If there isn't, there really should be. Consider the fictitious my_module.c: static void my_function_called_through_ptr(void) { MOD_INC_USE_COUNT; ... MOD_DEC_USE_COUNT; return; } If you get preempted before the MOD_INC_USE_COUNT, an unload could occur, boom. If you get preempted after the MOD_DEC_USE_COUNT, but before the return, and an unload occurs, boom. If it were not called through a pointer, (ie. called by name) then it means the module calling it has a reference (this is done automatically in symbol resolution, or symbol_get()), so no problem. [ As an implementation detail, this is *not* a problem on SMP without preemption. But that's an implementation detail, and it wasn't true with the previous module implementation. ] Hope this helps, Rusty. -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell. Golden Rule: If you are calling though a function pointer into a (different) module, you must hold a reference to that module. Otherwise you risk sleeping in the module while it is unloaded. Q: How do I get a reference to a module? A: Usually, a successful call to try_module_get(owner). You don't need to check for owner != NULL, BTW. Q: When does try_module_get(owner) fail? A: When the module is not ready to be entered (ie. still in init_module) or it is being removed. This prevents you entering the module as it is being discarded (init might fail, or it's being removed). Q: But the modules' init routine calls my register() routine which wants to call back into one of the function pointers immediately, and so try_module_get() fails! (because the module is not finished initializing yet) A: You're being called from the module, so someone already has a reference (unless there's a bug), so you don't need a try_module_get(). This does mean that if you were to register a structure for *another* module (does anyone do this?) you'd need to have a reference to it. Q: How do I put the reference back? A: Using module_put(owner) (owner == NULL is OK). Q: Do I really need to put try_module_get() before every function ptr call? A: If the function does not sleep (any cannot be preempted) ie. is called in softirq or hardirq context, you can omit this step, since you obviously won't sleep inside the module. Also, most structs have clear "start" and "stop" functions (eg. mount/umount), so you only need one try_module_get() on start, and module_put() on stop. Q: Is it safe to call try_module_get() and module_put() from an interrtupt / softirq? A: Yes. Q: My code use "MOD_INC_USE_COUNT". Do I still need to adjust my module count when someone calls one of my functions? A: No, you never need to adjust your own module count. There are five ways a function in your module can get called: firstly, it could be your module_init() function, in which case the module code holds a reference. It could be another module using one of your EXPORT_SYMBOL'ed functions, in which case you cannot be removed since they would have to be removed first. It could be a module which found an EXPORT_SYMBOL'ed function using symbol_get(), in which case they hold a reference count. It could be through a function pointer which your module gave out previously, which is discussed above. Finally, it could be from within your own module, in which case someone must already hold a reference. Q: My code uses "__MOD_INC_USE_COUNT(reg->owner)", but now I get a warning at runtime that it is unsafe. What do I need to do? A: You need to use try_module_get(), and not call into the module if it fails (act as if it hasn't registered yet). Note that you no longer need to check for NULL yourself, try_module_get() does that. Q: My code used "GET_USE_COUNT(module)" to get the reference count. A: Don't do that. If module unloading is disabled, there is no reference count, and there is never a single value you can assign to. Q: My code used "try_inc_mod_count(module)" to get the reference count. Should I change it? A: No hurry. try_module_get() is exactly the same: the new name reflects that this is now the only way to get a reference. Q: How does the code in try_module_get() work? A: It disables preemption for a moment, checks the live flag, and then increments a per-cpu counter if the module is live. This is even lighter-weight (in icache and cycles) than using a brlock, but has the same effect. If CONFIG_MODULE_UNLOAD=n, it just becomes a check that the module is live. Q: How does the module remove code work? A: It stops the machine by scheduling threads for every other CPU, then they all disable interrupts. At this stage we know that noone is in try_module_get(), so we can reliably read the counter. If zero, or the rmmod user specified --wait, we set the live flag to false. After this, the reference count should not increase, and each module_put() will wake us up, so we can check the counter again. Q: Are these changes all so you could implement an in-kernel module linker? A: No, they were to prevent load and unload races without altering every module, nor introducing drastic new requirements. Q: Doesn't putting linking code into the kernel just add bloat? A: The total linking code is about 200 generic lines, 100 x86-specific lines. The ia64 linking code is about 500 lines (it's the most complex). Richard Henderson has a great suggestion for simplifying it furthur, which I'm implementing now. "insmod" is now a single portable system call, meaning insmod can be written in about 20 lines of code. The previous code had to implement the two module loading system calls, the module querying system call, and the /proc/ksyms output, required a little more code than the current x86 linker. Q: Why not just fix the old code? A: Because having something so intimate with the kernel in userspace greatly restricts what changes the kernel can make. Moving into the kernel means I have implemented modversions, typesafe extensible module parameters and kallsyms without altering userspace in any way. Future extensions won't have to worry about the version of modutils problem. Q: Why not implement two-stage insert / two-stage delete? A: Because I implemented it first and it sucked. And because this *is* two-stage insert and two-stage delete, without exposing it to the modules using it, with the added advantage that the second stage is atomic (activation/deactivation is simply changing mod->live, modulo locking implementation magic detailed above). This prevents the race between deactivating the module and finding that someone has starting using it as you are deactivating it. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-26 3:51 ` Rusty Russell @ 2002-12-28 0:20 ` Roman Zippel 0 siblings, 0 replies; 18+ messages in thread From: Roman Zippel @ 2002-12-28 0:20 UTC (permalink / raw) To: Rusty Russell; +Cc: Dave Jones, Ed Tomlinson, linux-kernel Hi, Rusty Russell wrote: > > So where is the documentation describing module locking de jour ? > > Here's the FAQ again. Note that the init stuff is not currently true > (you can enter a module doing init, with possibly bad results), > because Linus patched my code. It's on my TODO list to revisit this > issue. Rusty, you should do this rather soon. Could you please explain, how you intend to solve the module init races? The "don't enter module until init is finished" concept doesn't work, so what else do you want to do? > Q: How does the code in try_module_get() work? > A: It disables preemption for a moment, checks the live flag, and then > increments a per-cpu counter if the module is live. This is even > lighter-weight (in icache and cycles) than using a brlock, but has > the same effect. Q: But if I have a large number of modules (e.g. netfilter) and I have to use try_module_get() for each of them, won't it still affect performance? > Q: Are these changes all so you could implement an in-kernel module > linker? > A: No, they were to prevent load and unload races without altering > every module, nor introducing drastic new requirements. Q: Wasn't it possible to do the rewrite in several steps? What made it so urgent that everything had to be 2.6? > Q: Doesn't putting linking code into the kernel just add bloat? > A: The total linking code is about 200 generic lines, 100 > x86-specific lines. [...] > > The previous code had to implement the two module loading > system calls, the module querying system call, and the /proc/ksyms > output, required a little more code than the current x86 linker. 2.4.20: wc kernel/module.c include/linux/module.h 1283 3800 29616 kernel/module.c 415 1609 13173 include/linux/module.h 1698 5409 42789 total 2.5.52: wc kernel/module.c kernel/params.c arch/i386/kernel/module.c include/linux/module* 1414 4400 35284 kernel/module.c 338 1093 8182 kernel/params.c 124 412 3347 arch/i386/kernel/module.c 368 1174 9954 include/linux/module.h 61 244 1958 include/linux/moduleloader.h 127 619 5199 include/linux/moduleparam.h 2432 7942 63924 total Q: Doesn't the querying system call provide the same information as /proc/modules and /proc/ksyms? Is procfs now required to load modules? Q: How can I debug modules with ksymoops? Q: Why was it necessary to remove the system calls? Wasn't it possible to emulate them? > Q: Why not just fix the old code? > A: Because having something so intimate with the kernel in userspace > greatly restricts what changes the kernel can make. Moving into > the kernel means I have implemented modversions, typesafe > extensible module parameters and kallsyms without altering > userspace in any way. Future extensions won't have to worry about > the version of modutils problem. Q: Can I see the modversions implementation? Q: Was there an analysis, which discussed the limitation of the old code? Was it really impossible to design a more flexible interface, which left as much as possible in userspace? Rusty, could you _please_ start answering questions instead of ignoring them or is there no discussion required anymore, now that your code is in the kernel? bye, Roman ^ permalink raw reply [flat|nested] 18+ messages in thread
* [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
@ 2002-12-17 1:49 Ed Tomlinson
2002-12-17 8:06 ` Paul P Komkoff Jr
2002-12-17 12:33 ` Dave Jones
0 siblings, 2 replies; 18+ messages in thread
From: Ed Tomlinson @ 2002-12-17 1:49 UTC (permalink / raw)
To: linux-kernel; +Cc: Dave Jones
I am getting the above message in 2.5.51, 52, and 52+bk current.
Pci info follows:
oscar# lspci -vv
00:00.0 Host bridge: VIA Technologies, Inc. VT82C598 [Apollo MVP3] (rev 04)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR+
Latency: 16
Region 0: Memory at e0000000 (32-bit, prefetchable) [size=64M]
Capabilities: [a0] AGP version 1.0
Status: RQ=7 SBA+ 64bit- FW- Rate=x1,x2
Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>
with a mga G400
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04) (prog-if 00 [VGA])
Subsystem: Matrox Graphics, Inc. Millennium G400 MAX/Dual Head 32Mb
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64 (4000ns min, 8000ns max), cache line size 08
Interrupt: pin A routed to IRQ 11
Region 0: Memory at e8000000 (32-bit, prefetchable) [size=32M]
Region 1: Memory at e4000000 (32-bit, non-prefetchable) [size=16K]
Region 2: Memory at e5000000 (32-bit, non-prefetchable) [size=8M]
Expansion ROM at <unassigned> [disabled] [size=64K]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [f0] AGP version 2.0
Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
Command: RQ=31 SBA+ AGP+ 64bit- FW- Rate=x1
What else would help to debug this? The drm error above is all I find in the logs...
TIA
Ed Tomlinson
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-17 1:49 Ed Tomlinson @ 2002-12-17 8:06 ` Paul P Komkoff Jr 2002-12-17 12:50 ` Ed Tomlinson 2002-12-17 12:33 ` Dave Jones 1 sibling, 1 reply; 18+ messages in thread From: Paul P Komkoff Jr @ 2002-12-17 8:06 UTC (permalink / raw) To: linux-kernel Replying to Ed Tomlinson: > I am getting the above message in 2.5.51, 52, and 52+bk current. > Pci info follows: > What else would help to debug this? The drm error above is all I find in the logs... If you mount devfs somewhere you also don't find misc/agpgart inside ? :))) And nothing about agp aperture in dmesg? -- Paul P 'Stingray' Komkoff 'Greatest' Jr /// (icq)23200764 /// (http)stingr.net When you're invisible, the only one really watching you is you (my keychain) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-17 8:06 ` Paul P Komkoff Jr @ 2002-12-17 12:50 ` Ed Tomlinson 2002-12-20 1:30 ` Matt Bernstein 0 siblings, 1 reply; 18+ messages in thread From: Ed Tomlinson @ 2002-12-17 12:50 UTC (permalink / raw) To: Paul P Komkoff Jr, linux-kernel, Dave Jones, Rusty Russell Paul P Komkoff Jr wrote: > Replying to Ed Tomlinson: >> I am getting the above message in 2.5.51, 52, and 52+bk current. >> Pci info follows: >> What else would help to debug this? The drm error above is all I find in >> the logs... > > If you mount devfs somewhere you also don't find misc/agpgart inside ? > :))) > > And nothing about agp aperture in dmesg? Not normally. If I modprobe via-agp modprobe segfaults (a Rusty's bug), but via_agp and agpgart get loaded (note that - changed to _ when the module is loaded - it has dash in file in the directory). Doing it this time gets an oops (52bk as of last night): Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected VIA MVP3 chipset Unable to handle kernel paging request at virtual address e0db9080 printing eip: e0db9080 *pde = 1ed12067 *pte = 00000000 Oops: 0000 CPU: 0 EIP: 0060:[<e0db9080>] Not tainted EFLAGS: 00010297 EIP is at 0xe0db9080 eax: 00000000 ebx: c15d5800 ecx: c0291b68 edx: 00000282 esi: 00000000 edi: c15d584c ebp: c15d5800 esp: d93e5f08 ds: 0068 es: 0068 ss: 0068 Process modprobe (pid: 19122, threadinfo=d93e4000 task=df09aca0) Stack: e0de816f c15d5800 c15d5800 e0dc81a1 c15d5800 e0dc84a8 c019f5c8 c15d5800 e0dcd2ac c15d584c e0dc84a8 ffffffed e0dc84a8 e0dc8480 c01ac0af c15d584c c15d584c c15d5854 c02bd934 c01ac183 c15d584c e0dc84a8 c02bd840 e0dc83b8 Call Trace: [<e0de816f>] agp_register_driver+0x27/0x9c [agpgart] [<e0dc81a1>] agp_via_probe+0x35/0x3c [via_agp] [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<c019f5c8>] pci_device_probe+0x40/0x5c [<e0dcd2ac>] agp_via_pci_table+0x0/0x38 [via_agp] [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<e0dc8480>] agp_via_pci_driver+0x0/0xa0 [via_agp] [<c01ac0af>] bus_match+0x37/0x6c [<c01ac183>] driver_attach+0x37/0x60 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<e0dc83b8>] __module_pci_device_size+0x10/0x18 [via_agp] [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<c01ac44c>] bus_add_driver+0xa4/0xc4 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<e0dc84c8>] agp_via_pci_driver+0x48/0xa0 [via_agp] [<c01ac83c>] driver_register+0x34/0x38 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<c019f6c2>] pci_register_driver+0x42/0x50 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp] [<e0dcd1af>] agp_via_init+0xb/0x44 [via_agp] [<e0dc8480>] agp_via_pci_driver+0x0/0xa0 [via_agp] [<c0125752>] sys_init_module+0x116/0x1a4 [<c0108937>] syscall_call+0x7/0xb Code: Bad EIP value. The module load occured after X was started. Ed Tomlinson ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-17 12:50 ` Ed Tomlinson @ 2002-12-20 1:30 ` Matt Bernstein 2002-12-20 1:43 ` Randy.Dunlap 0 siblings, 1 reply; 18+ messages in thread From: Matt Bernstein @ 2002-12-20 1:30 UTC (permalink / raw) To: Ed Tomlinson; +Cc: Paul P Komkoff Jr, linux-kernel, Dave Jones, Rusty Russell On Dec 17 Ed Tomlinson wrote: >Not normally. If I modprobe via-agp modprobe segfaults (a Rusty's bug), >but via_agp and agpgart get loaded (note that - changed to _ when the module >is loaded - it has dash in file in the directory). Doing it this time gets >an oops (52bk as of last night): [snip] I get a very similar oops, but with amd_k7_agp (2.5.52-mm2). I'm not bk-savvy as yet, but if pointed at a diff, would be happy to verify it. Matt ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-20 1:30 ` Matt Bernstein @ 2002-12-20 1:43 ` Randy.Dunlap 2002-12-20 9:33 ` Dave Jones 0 siblings, 1 reply; 18+ messages in thread From: Randy.Dunlap @ 2002-12-20 1:43 UTC (permalink / raw) To: Matt Bernstein Cc: Ed Tomlinson, Paul P Komkoff Jr, linux-kernel, Dave Jones, Rusty Russell On Fri, 20 Dec 2002, Matt Bernstein wrote: | On Dec 17 Ed Tomlinson wrote: | | >Not normally. If I modprobe via-agp modprobe segfaults (a Rusty's bug), | >but via_agp and agpgart get loaded (note that - changed to _ when the module | >is loaded - it has dash in file in the directory). Doing it this time gets | >an oops (52bk as of last night): | [snip] | | I get a very similar oops, but with amd_k7_agp (2.5.52-mm2). I'm not | bk-savvy as yet, but if pointed at a diff, would be happy to verify it. 2.5.zz kernel diff snapshots (from bk) are available at http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/ e.g., latest is: http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/patch-2.5.52-bk4.bz2 -- ~Randy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-20 1:43 ` Randy.Dunlap @ 2002-12-20 9:33 ` Dave Jones 0 siblings, 0 replies; 18+ messages in thread From: Dave Jones @ 2002-12-20 9:33 UTC (permalink / raw) To: Randy.Dunlap Cc: Matt Bernstein, Ed Tomlinson, Paul P Komkoff Jr, linux-kernel, Rusty Russell On Thu, Dec 19, 2002 at 05:43:14PM -0800, Randy.Dunlap wrote: > | > | I get a very similar oops, but with amd_k7_agp (2.5.52-mm2). I'm not > | bk-savvy as yet, but if pointed at a diff, would be happy to verify it. > > 2.5.zz kernel diff snapshots (from bk) are available at > http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/ > e.g., latest is: > http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/patch-2.5.52-bk4.bz2 Latest AGP bits aren't in Linus tree yet. A few more bits to nail down, and then I'll ask him to pull again. Dave -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module. 2002-12-17 1:49 Ed Tomlinson 2002-12-17 8:06 ` Paul P Komkoff Jr @ 2002-12-17 12:33 ` Dave Jones 1 sibling, 0 replies; 18+ messages in thread From: Dave Jones @ 2002-12-17 12:33 UTC (permalink / raw) To: Ed Tomlinson; +Cc: linux-kernel, rusty On Mon, Dec 16, 2002 at 08:49:16PM -0500, Ed Tomlinson wrote: > I am getting the above message in 2.5.51, 52, and 52+bk current. > Pci info follows: > > What else would help to debug this? The drm error above is all I find in the logs... There are a bunch of pending fixes at bk://linux-dj.bkbits.net/agpgart, but nothing that should be relevant to this problem. Are you using agpgart as modules? Which ones loaded ? I've a feeling agpgart.ko loaded here, but not via-agp.ko What needs to happen is when agpgart.ko is loaded, all the chipset drivers also get pulled in as dependancies. Since the new module stuff went in, I'm not sure how this works, if it works at all.[1] Dave [1] I'm yet another developer who has had a rough time with the new modules stuff. I'll try it again soon. -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2002-12-28 1:48 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20021218094714.43C712C076@lists.samba.org>
2002-12-18 12:57 ` [drm:drm_init] *ERROR* Cannot initialize the agpgart module Ed Tomlinson
2002-12-18 16:29 ` Dave Jones
[not found] ` <200212181803.23279.tomlins@cam.org>
[not found] ` <20021219105909.GE29122@suse.de>
2002-12-20 23:29 ` Ed Tomlinson
2002-12-21 14:22 ` Dave Jones
2002-12-22 4:41 ` Ed Tomlinson
2002-12-22 12:28 ` Dave Jones
2002-12-22 14:08 ` Ed Tomlinson
2002-12-23 1:10 ` Rusty Russell
2002-12-23 12:10 ` Dave Jones
2002-12-26 3:51 ` Rusty Russell
2002-12-28 0:20 ` Roman Zippel
2002-12-17 1:49 Ed Tomlinson
2002-12-17 8:06 ` Paul P Komkoff Jr
2002-12-17 12:50 ` Ed Tomlinson
2002-12-20 1:30 ` Matt Bernstein
2002-12-20 1:43 ` Randy.Dunlap
2002-12-20 9:33 ` Dave Jones
2002-12-17 12:33 ` Dave Jones
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.