* [PATCH] Fix device removal on net and block frontend drivers
@ 2005-11-21 15:43 Murillo Fernandes Bernardes
  2005-11-21 16:07 ` Stefan Berger
  2005-11-21 17:48 ` Adam Heath
  0 siblings, 2 replies; 10+ messages in thread
From: Murillo Fernandes Bernardes @ 2005-11-21 15:43 UTC
To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 352 bytes --]

Frontend devices are not being unregistered when in closed state. The
following patch fixes that.

Fix bug #420.

Makes "05_attach_and_dettach_device_repeatedly_pos" and
"09_attach_and_dettach_device_check_data_pos" tests pass.

Signed-off-by: Murillo Fernandes Bernardes <mfb@br.ibm.com>

--
Murillo Fernandes Bernardes
IBM Linux Technology Center

[-- Attachment #2: frontend_unregister_device.patch --]
[-- Type: text/x-diff, Size: 1117 bytes --]

diff -r 6a666940fa04 linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c
--- a/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	Sun Nov 20 09:19:38 2005
+++ b/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	Mon Nov 21 14:58:42 2005
@@ -273,7 +273,6 @@
 	case XenbusStateInitialising:
 	case XenbusStateInitWait:
 	case XenbusStateInitialised:
-	case XenbusStateClosed:
 		break;
 
 	case XenbusStateConnected:
@@ -282,6 +281,10 @@
 
 	case XenbusStateClosing:
 		blkfront_closing(dev);
+		break;
+
+	case XenbusStateClosed:
+		device_unregister(&dev->dev);
 		break;
 	}
 }
diff -r 6a666940fa04 linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c
--- a/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c	Sun Nov 20 09:19:38 2005
+++ b/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c	Mon Nov 21 14:58:42 2005
@@ -406,11 +406,14 @@
 	case XenbusStateInitialised:
 	case XenbusStateConnected:
 	case XenbusStateUnknown:
-	case XenbusStateClosed:
 		break;
 
 	case XenbusStateClosing:
 		netfront_closing(dev);
+		break;
+
+	case XenbusStateClosed:
+		device_unregister(&dev->dev);
 		break;
 	}
 }

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

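The combined effect of the two hunks can be sketched outside the kernel. What follows is a minimal user-space model, not the driver code: struct fake_frontend and backend_changed_sketch are hypothetical names, the grouping of no-op states follows the shape of the netfront hunk, and clearing the registered field merely stands in for the device_unregister(&dev->dev) call that the patch adds. The state numbering is assumed to match Xen's XenbusState enum.

```c
/* User-space model of the patched backend_changed() switch (netfront shape).
 * struct fake_frontend and backend_changed_sketch are hypothetical names. */

/* Assumed to match the numbering of Xen's XenbusState enum. */
enum xenbus_state {
    XenbusStateUnknown      = 0,
    XenbusStateInitialising = 1,
    XenbusStateInitWait     = 2,
    XenbusStateInitialised  = 3,
    XenbusStateConnected    = 4,
    XenbusStateClosing      = 5,
    XenbusStateClosed       = 6
};

struct fake_frontend {
    int closing_calls; /* how often the Closing path ran */
    int registered;    /* 1 while the device is still registered */
};

void backend_changed_sketch(struct fake_frontend *fe,
                            enum xenbus_state backend_state)
{
    switch (backend_state) {
    case XenbusStateInitialising:
    case XenbusStateInitWait:
    case XenbusStateInitialised:
    case XenbusStateConnected:
    case XenbusStateUnknown:
        break;                  /* nothing to do in this model */

    case XenbusStateClosing:
        fe->closing_calls++;    /* stands in for netfront_closing(dev) */
        break;

    case XenbusStateClosed:
        fe->registered = 0;     /* stands in for device_unregister(&dev->dev) */
        break;
    }
}
```

Before the patch, Closed sat in the no-op group, so in this model the registered field would have stayed at 1 after the backend reached Closed.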
* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 15:43 [PATCH] Fix device removal on net and block frontend drivers Murillo Fernandes Bernardes
@ 2005-11-21 16:07 ` Stefan Berger
  2005-11-21 16:33   ` Murillo Fernandes Bernardes
  2005-11-21 17:48 ` Adam Heath
  1 sibling, 1 reply; 10+ messages in thread
From: Stefan Berger @ 2005-11-21 16:07 UTC
To: Murillo Fernandes Bernardes; +Cc: xen-devel

[-- Attachment #1.1: Type: text/plain, Size: 932 bytes --]

xen-devel-bounces@lists.xensource.com wrote on 11/21/2005 10:43:01 AM:

>
> Frontend devices are not being unregistered when in closed state. The
> following patch fixes that.
>
> Fix bug #420.
>
> Makes "05_attach_and_dettach_device_repeatedly_pos" and
> "09_attach_and_dettach_device_check_data_pos" tests pass.

Did you test this with suspending / resuming a dom U? The reason I am
asking is that when suspending the driver immediately gets into state
'Closed' and when resuming into state 'Connected', but now your device is
unregistered.

   Stefan

>
>
> Signed-off-by: Murillo Fernandes Bernardes <mfb@br.ibm.com>
>
> --
> Murillo Fernandes Bernardes
> IBM Linux Technology Center
> [attachment "frontend_unregister_device.patch" deleted by Stefan
> Berger/Watson/IBM]

[-- Attachment #1.2: Type: text/html, Size: 1236 bytes --]

* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 16:07 ` Stefan Berger
@ 2005-11-21 16:33   ` Murillo Fernandes Bernardes
  2005-11-21 16:49     ` Stefan Berger
  0 siblings, 1 reply; 10+ messages in thread
From: Murillo Fernandes Bernardes @ 2005-11-21 16:33 UTC
To: Stefan Berger; +Cc: xen-devel

On Monday 21 November 2005 14:07, Stefan Berger wrote:
> xen-devel-bounces@lists.xensource.com wrote on 11/21/2005 10:43:01 AM:
> > Frontend devices are not being unregistered when in closed state. The
> > following patch fixes that.
> >
> > Fix bug #420.
> >
> > Makes "05_attach_and_dettach_device_repeatedly_pos" and
> > "09_attach_and_dettach_device_check_data_pos" tests pass.
>
> Did you test this with suspending / resuming a dom U? The reason I am
> asking is that when suspending the driver immediately gets into state
> 'Closed' and when resuming into state 'Connected', but now your device is
> unregistered.

No, I did not test suspend/resume.

I really don't see why it should get into Closed on suspend, but anyway, is
this really happening? I could not find any switch to Closed in the suspend
code, nor in the resume code.

How do I test suspend/resume on a domU? It has neither /sys/power/state
nor /proc/sleep.

--
Murillo Fernandes Bernardes
IBM Linux Technology Center

* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 16:33   ` Murillo Fernandes Bernardes
@ 2005-11-21 16:49     ` Stefan Berger
  2005-11-21 18:31       ` Ewan Mellor
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Berger @ 2005-11-21 16:49 UTC
To: Murillo Fernandes Bernardes; +Cc: xen-devel

[-- Attachment #1.1: Type: text/plain, Size: 1868 bytes --]

Murillo Fernandes Bernardes <mfb@br.ibm.com> wrote on 11/21/2005 11:33:31 AM:

> On Monday 21 November 2005 14:07, Stefan Berger wrote:
> > xen-devel-bounces@lists.xensource.com wrote on 11/21/2005 10:43:01 AM:
> > > Frontend devices are not being unregistered when in closed state. The
> > > following patch fixes that.
> > >
> > > Fix bug #420.
> > >
> > > Makes "05_attach_and_dettach_device_repeatedly_pos" and
> > > "09_attach_and_dettach_device_check_data_pos" tests pass.
> >
> > Did you test this with suspending / resuming a dom U? The reason I am
> > asking is that when suspending the driver immediately gets into state
> > 'Closed' and when resuming into state 'Connected', but now your device is
> > unregistered.
>
> No, I did not test suspend/resume.
>
> I really don't see why it should get into Closed on suspend, but anyway, is
> this really happening? I could not find any switch to Closed in the suspend
> code, nor in the resume code.
>

What I am seeing is that after a suspend / resume the interface 'eth0' is
completely gone. 'ifconfig -a' shows everything, but no eth0.

You might only want to unregister if the domain was not suspended. So you
probably need to implement the .suspend function in the frontend and set a
state variable to know whether the domain is being hibernated, and you
clear that variable in the .resume. You check that variable when the
driver is going into the 'Closed' state and only unregister if not in
'suspend' mode.

> How do I test suspend/resume on a domU? It has neither /sys/power/state
> nor /proc/sleep.

'xm save <dom id> <dom state filename>' lets you suspend a domain.
'xm restore <dom state filename>' lets you resume a domain.

I would only use the network driver for testing this by booting into a
RAMDisk.

   Stefan

>
> --
> Murillo Fernandes Bernardes
> IBM Linux Technology Center

[-- Attachment #1.2: Type: text/html, Size: 2486 bytes --]

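The suspend-flag idea described in this message can be modelled in a few lines. This is only a sketch under stated assumptions: struct fake_frontend and the three *_sketch functions are hypothetical names, not xenbus or netfront APIs, and flipping the registered field stands in for the real device_unregister() call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-device frontend state. */
struct fake_frontend {
    bool suspended;   /* set in the suspend hook, cleared in the resume hook */
    bool registered;  /* true while the device is still registered */
};

/* Would be wired to the driver's .suspend callback. */
void frontend_suspend_sketch(struct fake_frontend *fe)
{
    fe->suspended = true;
}

/* Would be wired to the driver's .resume callback. */
void frontend_resume_sketch(struct fake_frontend *fe)
{
    fe->suspended = false;
}

/* Called when the backend is observed in state Closed. */
void backend_closed_sketch(struct fake_frontend *fe)
{
    if (!fe->suspended)
        fe->registered = false;  /* stands in for device_unregister() */
}
```

With this guard, a Closed seen between suspend and resume leaves the device in place, while a Closed seen in normal operation still removes it.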
* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 16:49     ` Stefan Berger
@ 2005-11-21 18:31       ` Ewan Mellor
  2005-11-21 20:02         ` Stefan Berger
  2005-11-21 21:11         ` Murillo Fernandes Bernardes
  0 siblings, 2 replies; 10+ messages in thread
From: Ewan Mellor @ 2005-11-21 18:31 UTC
To: Stefan Berger; +Cc: Murillo Fernandes Bernardes, xen-devel

On Mon, Nov 21, 2005 at 11:49:19AM -0500, Stefan Berger wrote:

> Murillo Fernandes Bernardes <mfb@br.ibm.com> wrote on 11/21/2005 11:33:31 AM:
>
> > On Monday 21 November 2005 14:07, Stefan Berger wrote:
> > > xen-devel-bounces@lists.xensource.com wrote on 11/21/2005 10:43:01 AM:
> > > > Frontend devices are not being unregistered when in closed state. The
> > > > following patch fixes that.
> > > >
> > > > Fix bug #420.
> > > >
> > > > Makes "05_attach_and_dettach_device_repeatedly_pos" and
> > > > "09_attach_and_dettach_device_check_data_pos" tests pass.
> > >
> > > Did you test this with suspending / resuming a dom U? The reason I am
> > > asking is that when suspending the driver immediately gets into state
> > > 'Closed' and when resuming into state 'Connected', but now your device is
> > > unregistered.
> >
> > No, I did not test suspend/resume.
> >
> > I really don't see why it should get into Closed on suspend, but anyway, is
> > this really happening? I could not find any switch to Closed in the suspend
> > code, nor in the resume code.

xenbus_read_driver_state returns Closed if the backend path is no longer
present. Maybe this is where the Closed has come from. However,
xenbus_probe.c:otherend_changed is supposed to be protecting us from watches
that have fired immediately after a resume.

Could you please enable the DPRINTK in xenbus_probe and see whether the Closed
is coming through the test in otherend_changed? This would help diagnose the
problem.

The intention with xenbus_read_driver_state returning Closed was that this was
the correct way of forcing the driver to close down if the path goes away, as
in normal use the backend path should not just disappear, and for resumption
we have a way to detect that. Perhaps one or other of these things should
change, but it's not clear to me which one it is, or if indeed this is the
problem at all.

> What I am seeing is that after a suspend / resume the interface 'eth0' is
> completely gone. 'ifconfig -a' shows everything, but no eth0.
>
> You might only want to unregister if the domain was not suspended. So you
> probably need to implement the .suspend function in the frontend and set a
> state variable to know whether the domain is being hibernated, and you
> clear that variable in the .resume. You check that variable when the
> driver is going into the 'Closed' state and only unregister if not in
> 'suspend' mode.

If this is necessary, and it's not clear to me that it is, then this is a
facility that Xenbus should provide in general, rather than each driver having
to hack around the problem itself.

Returning to Murillo's patch, I assumed that the unregister_netdev in
close_netdev would implicitly call device_unregister, and that this was the
correct way to close down the device. Is this not the case?

My intention for closedown of the device was that the backend would move to
state Closing, triggering a graceful shutdown of the frontend (in this case
through netfront_closing, close_netdev, etc.). AFAIK, Xend is correctly
setting the backend to state Closing, so I expect unregister_netdev to be
called.

There is the different issue that Xend does not check for the existence or
state of a device before hotplugging a new one. This means that the frontend
might not have time to see the Closing before having a chance to close down,
for example. This is a problem with Xend that needs to be fixed there. Xend
should refuse to hotplug a device if the frontend for the old one has not yet
closed down. This is not to say that Murillo's patch is wrong, but simply to
say that I expect wider issues than can be fixed by this patch alone.

Ewan.

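The behaviour attributed to xenbus_read_driver_state in this message, reporting Closed when the other end's path no longer exists, can be illustrated with a small stand-alone model. This is a sketch, not the xenbus implementation: read_driver_state_sketch is a hypothetical function, a NULL argument models a missing state node, and the constant 6 for Closed is assumed from Xen's XenbusState numbering.

```c
#include <stdlib.h>

/* Assumed to match Xen's XenbusState numbering for Closed. */
enum { SKETCH_STATE_CLOSED = 6 };

/*
 * value is the string stored at ".../state", or NULL when the path has
 * disappeared from the store. A missing path is reported as Closed, which
 * is how a vanished backend can look like an orderly close to the frontend.
 */
int read_driver_state_sketch(const char *value)
{
    if (value == NULL)
        return SKETCH_STATE_CLOSED;
    return atoi(value);
}
```

In this model a frontend cannot tell "backend wrote 6" apart from "backend path is gone", which is why the thread asks whether the resume path filters such events before they reach the driver.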
* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 18:31       ` Ewan Mellor
@ 2005-11-21 20:02         ` Stefan Berger
  2005-11-21 21:11         ` Murillo Fernandes Bernardes
  1 sibling, 0 replies; 10+ messages in thread
From: Stefan Berger @ 2005-11-21 20:02 UTC
To: Ewan Mellor; +Cc: Murillo Fernandes Bernardes, xen-devel

[-- Attachment #1.1: Type: text/plain, Size: 6982 bytes --]

xen-devel-bounces@lists.xensource.com wrote on 11/21/2005 01:31:48 PM:

> On Mon, Nov 21, 2005 at 11:49:19AM -0500, Stefan Berger wrote:
>
> > Murillo Fernandes Bernardes <mfb@br.ibm.com> wrote on 11/21/2005 11:33:31 AM:
> >
> > > On Monday 21 November 2005 14:07, Stefan Berger wrote:
> > > > xen-devel-bounces@lists.xensource.com wrote on 11/21/2005 10:43:01 AM:
> > > > > Frontend devices are not being unregistered when in closed state. The
> > > > > following patch fixes that.
> > > > >
> > > > > Fix bug #420.
> > > > >
> > > > > Makes "05_attach_and_dettach_device_repeatedly_pos" and
> > > > > "09_attach_and_dettach_device_check_data_pos" tests pass.
> > > >
> > > > Did you test this with suspending / resuming a dom U? The reason I am
> > > > asking is that when suspending the driver immediately gets into state
> > > > 'Closed' and when resuming into state 'Connected', but now your device is
> > > > unregistered.
> > >
> > > No, I did not test suspend/resume.
> > >
> > > I really don't see why it should get into Closed on suspend, but anyway, is
> > > this really happening? I could not find any switch to Closed in the suspend
> > > code, nor in the resume code.
>
> xenbus_read_driver_state returns Closed if the backend path is no longer
> present. Maybe this is where the Closed has come from. However,
> xenbus_probe.c:otherend_changed is supposed to be protecting us from watches
> that have fired immediately after a resume.
>
> Could you please enable the DPRINTK in xenbus_probe and see whether the Closed
> is coming through the test in otherend_changed? This would help diagnose the
> problem.

Here's the log from domain 0's /var/log/messages:

Nov 21 14:54:43 jlfb-2 gpm[2667]: *** info [startup.c(95)]:
Nov 21 14:54:43 jlfb-2 gpm[2667]: Started gpm successfully. Entered daemon mode.
Nov 21 14:54:43 jlfb-2 gpm[2667]: *** info [mice.c(1766)]:
Nov 21 14:54:43 jlfb-2 gpm[2667]: imps2: Auto-detected intellimouse PS/2
Nov 21 14:54:52 jlfb-2 fstab-sync[2817]: removed all generated mount points
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (xenbus_probe_backend:639) .
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (xenbus_probe_backend_unit:624) backend/vif/15/0
Nov 21 14:54:55 jlfb-2 kernel: .
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (xenbus_probe_backend:639) .
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (xenbus_probe_backend_unit:624) backend/vtpm/15/6
Nov 21 14:54:55 jlfb-2 kernel: .
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (xenbus_probe_backend:639) .
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (xenbus_probe_backend_unit:624) backend/vtpm/2/6
Nov 21 14:54:55 jlfb-2 kernel: .
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (frontend_changed:763) .
Nov 21 14:54:55 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:54:55 jlfb-2 gconfd (root-2848): starting (version 2.10.0), pid 2848 user 'root'
Nov 21 14:54:55 jlfb-2 gconfd (root-2848): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
Nov 21 14:54:55 jlfb-2 gconfd (root-2848): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
Nov 21 14:54:56 jlfb-2 gconfd (root-2848): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
Nov 21 14:55:03 jlfb-2 gconfd (root-2848): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 0

below: starting user domain

Nov 21 14:55:40 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:55:40 jlfb-2 kernel: xenbus;_probe (xenbus_hotplug_backend:232) .
Nov 21 14:55:40 jlfb-2 kernel: xenbus;_probe (xenbus_dev_probe:338) .
Nov 21 14:55:40 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:55:40 jlfb-2 last message repeated 5 times
Nov 21 14:55:40 jlfb-2 kernel: xenbus;_probe (otherend_changed:307) state is 1, /local/domain/1/device/vif/0/state, /local/domain/1/device/vif/0/state.
Nov 21 14:55:40 jlfb-2 kernel: xenbus;_probe (xenbus_hotplug_backend:232) .
Nov 21 14:55:40 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:55:41 jlfb-2 kernel: device vif1.0 entered promiscuous mode
Nov 21 14:55:41 jlfb-2 kernel: xenbr0: port 1(vif1.0) entering learning state
Nov 21 14:55:41 jlfb-2 kernel: xenbr0: topology change detected, propagating
Nov 21 14:55:41 jlfb-2 kernel: xenbr0: port 1(vif1.0) entering forwarding state
Nov 21 14:55:41 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:55:45 jlfb-2 kernel: xenbus;_probe (otherend_changed:307) state is 4, /local/domain/1/device/vif/0/state, /local/domain/1/device/vif/0/state.
Nov 21 14:55:45 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .

below: suspending user domain

Nov 21 14:56:09 jlfb-2 kernel: xenbus;_probe (otherend_changed:307) state is 6, /local/domain/1/device/vif/0/state, /local/domain/1/device/vif/0/state.
Nov 21 14:56:09 jlfb-2 kernel: xenbus;_probe (xenbus_dev_remove:376) .
Nov 21 14:56:09 jlfb-2 kernel: xenbr0: port 1(vif1.0) entering disabled state
Nov 21 14:56:09 jlfb-2 kernel: xenbus;_probe (xenbus_hotplug_backend:232) .
Nov 21 14:56:09 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:56:09 jlfb-2 kernel: device vif1.0 left promiscuous mode
Nov 21 14:56:09 jlfb-2 kernel: xenbr0: port 1(vif1.0) entering disabled state
Nov 21 14:56:09 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .

below: resuming user domain

Nov 21 14:56:30 jlfb-2 last message repeated 2 times
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (xenbus_hotplug_backend:232) .
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (xenbus_dev_probe:338) .
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:56:30 jlfb-2 last message repeated 5 times
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (otherend_changed:307) state is 1, /local/domain/2/device/vif/0/state, /local/domain/2/device/vif/0/state.
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (xenbus_hotplug_backend:232) .
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (otherend_changed:307) state is 6, /local/domain/2/device/vif/0/state, /local/domain/2/device/vif/0/state.
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (xenbus_dev_remove:376) .
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (xenbus_hotplug_backend:232) .
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .
Nov 21 14:56:30 jlfb-2 kernel: device vif2.0 entered promiscuous mode
Nov 21 14:56:30 jlfb-2 kernel: xenbr0: port 1(vif2.0) entering learning state
Nov 21 14:56:30 jlfb-2 kernel: xenbr0: topology change detected, propagating
Nov 21 14:56:30 jlfb-2 kernel: xenbr0: port 1(vif2.0) entering forwarding state
Nov 21 14:56:30 jlfb-2 kernel: xenbus;_probe (backend_changed:771) .

Result: eth0 is gone.

This is a user domain that is booting into a RAM disk. Memory of the user
domain is 64M.

   Stefan

[-- Attachment #1.2: Type: text/html, Size: 8178 bytes --]

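The otherend_changed lines in the log above report states only as numbers (1, 4 and 6). A small decoder makes them easier to read. It is a sketch that assumes the numbering of Xen's XenbusState enum (Unknown = 0 through Closed = 6); xenbus_state_name_sketch is a hypothetical helper, not a kernel function.

```c
/*
 * Map a numeric xenbus state, as printed by the DPRINTK output, to its name.
 * The table assumes Xen's XenbusState numbering; anything outside 0..6 is
 * reported as "invalid".
 */
const char *xenbus_state_name_sketch(int state)
{
    static const char *const names[] = {
        "Unknown",       /* 0 */
        "Initialising",  /* 1 */
        "InitWait",      /* 2 */
        "Initialised",   /* 3 */
        "Connected",     /* 4 */
        "Closing",       /* 5 */
        "Closed"         /* 6 */
    };

    if (state < 0 || state > 6)
        return "invalid";
    return names[state];
}
```

Read this way, the dom0 log shows the frontend going Initialising, then Connected, then Closed at suspend, and after the restore the new domain's frontend going Initialising and then straight to Closed.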
* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 18:31       ` Ewan Mellor
  2005-11-21 20:02         ` Stefan Berger
@ 2005-11-21 21:11         ` Murillo Fernandes Bernardes
  1 sibling, 0 replies; 10+ messages in thread
From: Murillo Fernandes Bernardes @ 2005-11-21 21:11 UTC
To: Ewan Mellor; +Cc: xen-devel, Stefan Berger

On Monday 21 November 2005 16:31, Ewan Mellor wrote:
> On Mon, Nov 21, 2005 at 11:49:19AM -0500, Stefan Berger wrote:

> Could you please enable the DPRINTK in xenbus_probe and see whether the
> Closed is coming through the test in otherend_changed? This would help
> diagnose the problem.
>

on DomU:

xenbus_probe (xenbus_suspend:831) .
xenbus_probe (suspend_dev:786) .
xenbus_probe (suspend_dev:786) .
xenbus_probe (suspend_dev:786) .
xenbus_probe (resume_dev:806) .
xenbus_probe (otherend_changed:301) state is 6, /local/domain/0/backend/vif/5/0/state, /local/domain/0/backend/vif/5/0/state.
xenbus_probe (otherend_changed:301) state is 6, /local/domain/0/backend/vbd/5/770/state, /local/domain/0/backend/vbd/5/770/state.
xenbus_probe (resume_dev:806) .
xenbus_probe (backend_changed:764) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (resume_dev:806) .
xenbus_probe (otherend_changed:301) state is 4, /local/domain/0/backend/vbd/6/769/state, /local/domain/0/backend/vbd/6/769/state.
xenbus_probe (frontend_changed:756) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (otherend_changed:301) state is 4, /local/domain/0/backend/vbd/6/769/state, /local/domain/0/backend/vbd/6/769/state.
xenbus_probe (otherend_changed:301) state is 4, /local/domain/0/backend/vbd/6/770/state, /local/domain/0/backend/vbd/6/770/state.
xenbus_probe (frontend_changed:756) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (otherend_changed:301) state is 4, /local/domain/0/backend/vbd/6/770/state, /local/domain/0/backend/vbd/6/770/state.
xenbus_probe (otherend_changed:301) state is 4, /local/domain/0/backend/vif/6/0/state, /local/domain/0/backend/vif/6/0/state.
xenbus_probe (frontend_changed:756) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (frontend_changed:756) .
xenbus_probe (otherend_changed:301) state is 4, /local/domain/0/backend/vif/6/0/state, /local/domain/0/backend/vif/6/0/state.

> The intention with xenbus_read_driver_state returning Closed was that this
> was the correct way of forcing the driver to close down if the path goes
> away, as in normal use the backend path should not just disappear, and for
> resumption we have a way to detect that. Perhaps one or other of these
> things should change, but it's not clear to me which one it is, or if
> indeed this is the problem at all.
>
> > What I am seeing is that after a suspend / resume the interface 'eth0' is
> > completely gone. 'ifconfig -a' shows everything, but no eth0.
> >
> > You might only want to unregister if the domain was not suspended. So you
> > probably need to implement the .suspend function in the frontend and set
> > a state variable to know whether the domain is being hibernated, and you
> > clear that variable in the .resume. You check that variable when the
> > driver is going into the 'Closed' state and only unregister if not in
> > 'suspend' mode.
>
> If this is necessary, and it's not clear to me that it is, then this is a
> facility that Xenbus should provide in general, rather than each driver
> having to hack around the problem itself.

What about a XenbusStateSuspended?

>
> Returning to Murillo's patch, I assumed that the unregister_netdev in
> close_netdev would implicitly call device_unregister, and that this was the
> correct way to close down the device. Is this not the case?
>

It is not happening. Have all references to the device been cleared? I'm not
sure whether that is needed in this case.

> There is the different issue that Xend does not check for the existence or
> state of a device before hotplugging a new one. This means that the
> frontend might not have time to see the Closing before having a chance to
> close down, for example. This is a problem with Xend that needs to be
> fixed there. Xend should refuse to hotplug a device if the frontend for
> the old one has not yet closed down. This is not to say that Murillo's
> patch is wrong, but simply to say that I expect wider issues than can be
> fixed by this patch alone.
>

--
Murillo Fernandes Bernardes
IBM Linux Technology Center

* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 15:43 [PATCH] Fix device removal on net and block frontend drivers Murillo Fernandes Bernardes
  2005-11-21 16:07 ` Stefan Berger
@ 2005-11-21 17:48 ` Adam Heath
  2005-11-21 18:40   ` Ewan Mellor
  1 sibling, 1 reply; 10+ messages in thread
From: Adam Heath @ 2005-11-21 17:48 UTC
To: Murillo Fernandes Bernardes; +Cc: xen-devel@lists.xensource.com

On Mon, 21 Nov 2005, Murillo Fernandes Bernardes wrote:

>
> Frontend devices are not being unregistered when in closed state. The
> following patch fixes that.
>
> Fix bug #420.
>
> Makes "05_attach_and_dettach_device_repeatedly_pos" and
> "09_attach_and_dettach_device_check_data_pos" tests pass.
>
>
> Signed-off-by: Murillo Fernandes Bernardes <mfb@br.ibm.com>

Hmm. I have a way to make dom0 in unstable kernel-oops. If I attempt to
set up a virtual block device to a /dev/nbN (enbd), but never actually
configure the /dev/nbN, I get a kernel oops in dom0 when the domU shuts
down. I can then no longer reboot or shut down the dom0.

Would this be related?

* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 17:48 ` Adam Heath
@ 2005-11-21 18:40   ` Ewan Mellor
  2005-11-21 21:30     ` Adam Heath
  0 siblings, 1 reply; 10+ messages in thread
From: Ewan Mellor @ 2005-11-21 18:40 UTC
To: Adam Heath; +Cc: Murillo Fernandes Bernardes, xen-devel@lists.xensource.com

On Mon, Nov 21, 2005 at 11:48:10AM -0600, Adam Heath wrote:

> On Mon, 21 Nov 2005, Murillo Fernandes Bernardes wrote:
>
> >
> > Frontend devices are not being unregistered when in closed state. The
> > following patch fixes that.
> >
> > Fix bug #420.
> >
> > Makes "05_attach_and_dettach_device_repeatedly_pos" and
> > "09_attach_and_dettach_device_check_data_pos" tests pass.
> >
> >
> > Signed-off-by: Murillo Fernandes Bernardes <mfb@br.ibm.com>
>
> Hmm. I have a way to make dom0 in unstable kernel-oops. If I attempt to
> set up a virtual block device to a /dev/nbN (enbd), but never actually
> configure the /dev/nbN, I get a kernel oops in dom0 when the domU shuts
> down. I can then no longer reboot or shut down the dom0.
>
> Would this be related?

This doesn't seem very related, no. What does your oops look like?

Ewan.

* Re: [PATCH] Fix device removal on net and block frontend drivers
  2005-11-21 18:40   ` Ewan Mellor
@ 2005-11-21 21:30     ` Adam Heath
  0 siblings, 0 replies; 10+ messages in thread
From: Adam Heath @ 2005-11-21 21:30 UTC
To: Ewan Mellor; +Cc: Murillo Fernandes Bernardes, xen-devel@lists.xensource.com

On Mon, 21 Nov 2005, Ewan Mellor wrote:

> On Mon, Nov 21, 2005 at 11:48:10AM -0600, Adam Heath wrote:
>
> > On Mon, 21 Nov 2005, Murillo Fernandes Bernardes wrote:
> >
> > >
> > > Frontend devices are not being unregistered when in closed state. The
> > > following patch fixes that.
> > >
> > > Fix bug #420.
> > >
> > > Makes "05_attach_and_dettach_device_repeatedly_pos" and
> > > "09_attach_and_dettach_device_check_data_pos" tests pass.
> > >
> > >
> > > Signed-off-by: Murillo Fernandes Bernardes <mfb@br.ibm.com>
> >
> > Hmm. I have a way to make dom0 in unstable kernel-oops. If I attempt to
> > set up a virtual block device to a /dev/nbN (enbd), but never actually
> > configure the /dev/nbN, I get a kernel oops in dom0 when the domU shuts
> > down. I can then no longer reboot or shut down the dom0.
> >
> > Would this be related?
>
> This doesn't seem very related, no. What does your oops look like?

From the config file:
==
disk = [ 'phy:/dev/space/xen-0-16-swap-0,hda,w',
         'phy:/dev/space/xen-0-16-tmp,hdb,w',
         'phy:/dev/nda,hdc,w',
         'phy:/dev/ndb,hdd,w' ]
==

/dev/nda and /dev/ndb have not been configured yet.

From the domU:
==
[61776.756910] Registering block device major 3
[61776.756999] hda: unknown partition table
[61776.782867] hdb: unknown partition table
[61776.805007] Registering block device major 22
[61776.805096] hdc:end_request: I/O error, dev hdc, sector 0
[61776.805225] Buffer I/O error on device hdc, logical block 0
[61776.805372] end_request: I/O error, dev hdc, sector 0
[61776.805379] Buffer I/O error on device hdc, logical block 0
[61776.805392] unable to read partition table
==

And from dom0, the oops (plus leading lines from syslog):

Nov 21 14:21:38 xen-3 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/768
Nov 21 14:21:39 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/768/physical-device 0xfe01 backend/vbd/1/768/node /dev/space/xen-0-16-swap-0 to xenstore.
Nov 21 14:21:39 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/768/hotplug-status connected to xenstore.
Nov 21 14:21:39 xen-3 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/832
Nov 21 14:21:40 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/832/physical-device 0xfe02 backend/vbd/1/832/node /dev/space/xen-0-16-tmp to xenstore.
Nov 21 14:21:40 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/832/hotplug-status connected to xenstore.
Nov 21 14:21:40 xen-3 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/5632
Nov 21 14:21:40 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/5632/physical-device 0x2b00 backend/vbd/1/5632/node /dev/nda to xenstore.
Nov 21 14:21:40 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/5632/hotplug-status connected to xenstore.
Nov 21 14:21:41 xen-3 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/5696
Nov 21 14:21:41 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/5696/physical-device 0x2b10 backend/vbd/1/5696/node /dev/ndb to xenstore.
Nov 21 14:21:41 xen-3 logger: /etc/xen/scripts/block: Writing backend/vbd/1/5696/hotplug-status connected to xenstore.
Nov 21 14:21:42 xen-3 logger: /etc/xen/scripts/vif-route: online XENBUS_PATH=backend/vif/1/0
Nov 21 14:21:43 xen-3 kernel: [61773.038490] ip_tables: (C) 2000-2002 Netfilter core team
Nov 21 14:21:43 xen-3 logger: /etc/xen/scripts/vif-route: Writing backend/vif/1/0/hotplug-status connected to xenstore.
Nov 21 14:21:46 xen-3 kernel: [61776.805155] nbd0: Request when not-ready
Nov 21 14:21:46 xen-3 kernel: [61776.805187] end_request: I/O error, dev nbd0, sector 0
Nov 21 14:21:46 xen-3 kernel: [61776.805323] nbd0: Request when not-ready
Nov 21 14:21:46 xen-3 kernel: [61776.805343] end_request: I/O error, dev nbd0, sector 0
Nov 21 14:21:46 xen-3 kernel: [61776.820855] general protection fault: 0000 [#1]
Nov 21 14:21:46 xen-3 kernel: [61776.820879] SMP
Nov 21 14:21:46 xen-3 kernel: [61776.820898] Modules linked in: ipt_physdev iptable_filter ip_tables i2c_i801 i2c_core dm_mod nbd sd_mod ata_piix libata scsi_mod
Nov 21 14:21:46 xen-3 kernel: [61776.820979] CPU: 0
Nov 21 14:21:46 xen-3 kernel: [61776.820980] EIP: 0061:[<c0160b60>] Not tainted VLI
Nov 21 14:21:46 xen-3 kernel: [61776.820982] EFLAGS: 00010282 (2.6.12.6-xen)
Nov 21 14:21:46 xen-3 kernel: [61776.821039] EIP is at blkdev_put+0x9/0x13c
Nov 21 14:21:46 xen-3 kernel: [61776.821059] eax: fffffffa ebx: fffffffa ecx: 00000000 edx: 00000106
Nov 21 14:21:46 xen-3 kernel: [61776.821083] esi: c59c46a0 edi: c005a800 ebp: c59c4658 esp: c0427f40
Nov 21 14:21:46 xen-3 kernel: [61776.821107] ds: 007b es: 007b ss: 0069
Nov 21 14:21:46 xen-3 kernel: [61776.821126] Process events/0 (pid: 4, threadinfo=c0426000 task=c0057a20)
Nov 21 14:21:46 xen-3 kernel: [61776.821137] Stack: 00000000 c59c467c c59c46a0 c005a800 c59c4658 c025e0bb c59c4658 c025df6c
Nov 21 14:21:46 xen-3 kernel: [61776.821207] 00000000 c012c343 00000000 00000002 c1114c60 000de3dd c005a80c c005a814
Nov 21 14:21:46 xen-3 kernel: [61776.821272] c0426000 c59c469c c025df4a 00000001 00000000 c1114c60 00010000 00000000
Nov 21 14:21:46 xen-3 kernel: [61776.821337] Call Trace:
Nov 21 14:21:46 xen-3 kernel: [61776.821366] [<c025e0bb>] vbd_free+0xf/0x18
Nov 21 14:21:46 xen-3 kernel: [61776.821394] [<c025df6c>] free_blkif+0x22/0x4c
Nov 21 14:21:46 xen-3 kernel: [61776.821421] [<c012c343>] worker_thread+0x175/0x242
Nov 21 14:21:46 xen-3 kernel: [61776.821450] [<c025df4a>] free_blkif+0x0/0x4c
Nov 21 14:21:46 xen-3 kernel: [61776.822757] [<c0118fd3>] default_wake_function+0x0/0xc
Nov 21 14:21:46 xen-3 kernel: [61776.822786] [<c012c1ce>] worker_thread+0x0/0x242
Nov 21 14:21:46 xen-3 kernel: [61776.822814] [<c012ff99>] kthread+0x93/0x97
Nov 21 14:21:46 xen-3 kernel: [61776.822840] [<c012ff06>] kthread+0x0/0x97
Nov 21 14:21:46 xen-3 kernel: [61776.822867] [<c01070b5>] kernel_thread_helper+0x5/0xb
Nov 21 14:21:46 xen-3 kernel: [61776.822894] Code: 89 d8 5b 5e 5f c3 89 f2 89 f8 e8 b6 fa ff ff 89 c3 85 c0 74 e9 89 f8 e8 06 00 00 00 89 d8 5b 5e 5f c3 55 57 56 53 83 ec 04 89 c3 <8b> 70 04 8b 78 58 8d 40 0c 89 04 24 f0 ff 4b 0c 0f 88 3d 03 00
==

Doing an objdump of fs/block_dev.c (where blkdev_put exists), I get this:
==
00000ca7 <blkdev_put>:
     ca7:  55                 push   %ebp
     ca8:  57                 push   %edi
     ca9:  56                 push   %esi
     caa:  53                 push   %ebx
     cab:  83 ec 04           sub    $0x4,%esp
     cae:  89 c3              mov    %eax,%ebx
     cb0:  8b 70 04           mov    0x4(%eax),%esi
     cb3:  8b 78 58           mov    0x58(%eax),%edi
     cb6:  8d 40 0c           lea    0xc(%eax),%eax
     cb9:  89 04 24           mov    %eax,(%esp)
     cbc:  f0 ff 4b 0c        lock decl 0xc(%ebx)
     cc0:  0f 88 3d 03 00 00  js     1003 <.text.lock.block_dev+0x7e>
     cc6:  e8 fc ff ff ff     call   cc7 <lock_kernel+0xcc7>
     ccb:  8b 43 08           mov    0x8(%ebx),%eax
==

So, address cb0 is at fault. The snippet from block_dev.c has this:
==
int blkdev_put(struct block_device *bdev)
{
	int ret = 0;
	struct inode *bd_inode = bdev->bd_inode;
	struct gendisk *disk = bdev->bd_disk;
==

bdev->bd_inode is at offset 4. Combined with eax being fffffffa, we get
either a wrap-around, or overflow, on the addressing.

This all points, however, to vbd->bdev not being initialized properly. In
fact, eax being what it is (-6 in signed int land) makes me believe some
error condition isn't being checked.

However, after I read the log again, it looks like an error occurs once,
something is freed, an error occurs again, and then the oops occurs. However,
that may just be coincidence.

Anyways, that should be enough info for someone a bit more knowledgeable to
debug this.

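The register arithmetic in this analysis can be checked with a few lines of C. The sketch below decodes eax = fffffffa as a signed 32-bit value, computes the address touched by the faulting mov 0x4(%eax),%esi, and applies a rough test for "this pointer is really an error code". The helper names are hypothetical, and the 4095 bound in looks_like_err_ptr is an assumption of the sketch modelled on the kernel's convention of storing small negative errno values in pointers. The decoded value is -6, which on Linux corresponds to -ENXIO; whether blkback really kept an unchecked error pointer in vbd->bdev remains the poster's hypothesis, which this arithmetic supports but does not prove.

```c
#include <stdint.h>

/* Interpret a 32-bit register value as a two's-complement signed number. */
int32_t reg_as_signed(uint32_t reg)
{
    if (reg & 0x80000000u)
        return -(int32_t)(~reg) - 1;  /* 0xfffffffa: ~reg is 5, so -5 - 1 */
    return (int32_t)reg;
}

/* Address touched by "mov off(%eax), ..." with 32-bit wrap-around. */
uint32_t field_address(uint32_t base, uint32_t off)
{
    return base + off;
}

/*
 * Rough analogue of the kernel's error-pointer check: errno values stored
 * in a pointer land at the very top of the address space. The 4095 bound
 * is an assumption of this sketch.
 */
int looks_like_err_ptr(uint32_t value)
{
    return value >= (uint32_t)-4095;
}
```

For the values in the oops, the bd_inode load at offset 4 from fffffffa targets fffffffe, an address no valid struct block_device could occupy, while a real kernel pointer from the same dump such as c59c4658 does not pass the error-pointer test.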