public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Garzik <jgarzik@mandrakesoft.com>
To: Patrick Mochel <mochelp@infinity.powertie.org>
Cc: linux-kernel@vger.kernel.org, Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [RFC] New Driver Model for 2.5
Date: Thu, 18 Oct 2001 02:23:36 -0400	[thread overview]
Message-ID: <3BCE7568.1DAB9FF0@mandrakesoft.com> (raw)
In-Reply-To: <Pine.LNX.4.21.0110171617460.15653-100000@marty.infinity.powertie.org>

Patrick Mochel wrote:
> 
> One July afternoon, while hacking on the pm_dev layer for the purpose of
> system-wide power management support, I decided that I was quite tired of
> trying to make this layer look like a tree and feel like a tree, but not
> have any real integration with the actual device drivers..
> 
> I had read the accounts of what the goals were for 2.5. And, after some
> conversations with Linus and the (gasp) ACPI guys, I realized that I had a
> good chunk of the infrastructural code written; it was a matter of working
> out a few crucial details and massaging it in nicely.
> 
> I have had the chance this week (after moving and vacationing) to update
> the (read: write some) documentation for it. I will not go into details,
> and will let the document speak for itself.
> 
> With all luck, this should go into the early stages of 2.5, and allow a
> significant cleanup of many drivers. Such a model will also allow for neat
> tricks like full device power management support, and Plug N Play
> capabilities.
> 
> In order to support the new driver model, I have written a small in-memory
> filesystem, called ddfs, to export a unified interface to userland. It is
> mentioned in the doc, and is pretty self-explanatory. More information
> will be available soon.
> 
> There is code available for the model and ddfs at:
> 
> http://kernel.org/pub/linux/kernel/people/mochel/device/
> 
> but there are some fairly large caveats concerning it.
> 
> First, I feel comfortable with the device layer code and the ddfs
> code. Though, the PCI code is still work in progress. I am still working
> out some of the finer details concerning it.
> 
> Next is the environment under which I developed it all. It was on an ia32
> box, with only PCI support, and using ACPI. The latter didn't have too
> much of an effect on the development, but there are a few items explicitly
> inspired by it..
> 
> I am hoping both the PCI code, and the structure and in general can be
> further improved based on the input of the driver maintainers.
> 
> This model is not final, and may be way off from what most people actually
> want. It has gotten tentative blessing from all those that have seen it,
> though they number but a few. It's definitely not the only solution...
> 
> That said, enjoy; and have at it.
> 
>         -pat
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> The (New) Linux Kernel Driver Model
> 
> Version 0.01
> 
> 17 October 2001
> 
> Overview
> ~~~~~~~~
> 
> This driver model is a unification of all the current, disparate driver models
> that are currently in the kernel. It is intended is to augment the
> bus-specific drivers for bridges and devices by consolidating a set of data
> and operations into globally accessible data structures.
> 
> Current driver models implement some sort of tree-like structure (sometimes
> just a list) for the devices they control. But, there is no linkage between
> the different bus types.
> 
> A common data structure can provide this linkage with little overhead: when a
> bus driver discovers a particular device, it can insert it into the global
> tree as well as its local tree. In fact, the local tree becomes just a subset
> of the global tree.
> 
> Common data fields can also be moved out of the local bus models into the
> global model. Some of the manipulation of these fields can also be
> consolidated. Most likely, manipulation functions will become a set
> of helper functions, which the bus drivers wrap around to include any
> bus-specific items.
> 
> The common device and bridge interface currently reflects the goals of the
> modern PC: namely the ability to do seamless Plug and Play, power management,
> and hot plug. (The model dictated by Intel and Microsoft (read: ACPI) ensures
> us that any device in the system may fit any of these criteria.)
> 
> In reality, not every bus will be able to support such operations. But, most
> buses will support a majority of those operations, and all future buses will.
> In other words, a bus that doesn't support an operation is the exception,
> instead of the other way around.
> 
> Drivers
> ~~~~~~~
> 
> The callbacks for bridges and devices are intended to be singular for a
> particular type of bus. For each type of bus that has support compiled in the
> kernel, there should be one statically allocated structure with the
> appropriate callbacks that each device (or bridge) of that type share.
> 
> Each bus layer should implement the callbacks for these drivers. It then
> forwards the calls on to the device-specific callbacks. This means that
> device-specific drivers must still implement callbacks for each operation.
> But, they are not called from the top level driver layer.
> 
> This does add another layer of indirection for calling one of these functions,
> but there are benefits that are believed to outweigh this slowdown.
> 
> First, it prevents device-specific drivers from having to know about the
> global device layer. This speeds up integration time incredibly. It also
> allows drivers to be more portable across kernel versions. Note that the
> former was intentional, the latter is an added bonus.
> 
> Second, this added indirection allows the bus to perform any additional logic
> necessary for its child devices. A bus layer may add additional information to
> the call, or translate it into something meaningful for its children.
> 
> This could be done in the driver, but if it happens for every object of a
> particular type, it is best done at a higher level.
> 
> Recap
> ~~~~~
> 
> Instances of devices and bridges are allocated dynamically as the system
> discovers their existence. Their fields describe the individual object.
> Drivers - in the global sense - are statically allocated and singular for a
> particular type of bus. They describe a set of operations that every type of
> bus could implement, the implementation following the bus's semantics.
> 
> Downstream Access
> ~~~~~~~~~~~~~~~~~
> 
> Common data fields have been moved out of individual bus layers into a common
> data structure. But, these fields must still be accessed by the bus layers,
> and
> sometimes by the device-specific drivers.
> 
> Other bus layers are encouraged to do what has been done for the PCI layer.
> struct pci_dev now looks like this:
> 
> struct pci_dev {
>         ...
> 
>         struct device device;
> };
> 
> Note first that it is statically allocated. This means only one allocation on
> device discovery. Note also that it is at the _end_ of struct pci_dev. This is
> to make people think about what they're doing when switching between the bus
> driver and the global driver; and to prevent against mindless casts between
> the two.
> 
> The PCI bus layer freely accesses the fields of struct device. It knows about
> the structure of struct pci_dev, and it should know the structure of struct
> device. PCI devices that have been converted generally do not touch the fields
> of struct device. More precisely, device-specific drivers should not touch
> fields of struct device unless there is a strong compelling reason to do so.
> 
> This abstraction is prevention of unnecessary pain during transitional phases.
> If the name of the field changes or is removed, then every downstream driver
> will break. On the other hand, if only the bus layer (and not the device
> layer) accesses struct device, it is only those that need to change.
> 
> User Interface
> ~~~~~~~~~~~~~~
> 
> By virtue of having a complete hierarchical view of all the devices in the
> system, exporting a complete hierarchical view to userspace becomes relatively
> easy. Whenever a device is inserted into the tree, a file or directory can be
> created for it.
> 
> In this model, a directory is created for each bridge and each device. When it
> is created, it is populated with a set of default files, first at the global
> layer, then at the bus layer. The device layer may then add its own files.
> 
> These files export data about the driver and can be used to modify behavior of
> the driver or even device.
> 
> For example, at the global layer, a file named 'status' is created for each
> device. When read, it reports to the user the name of the device, its bus ID,
> its current power state, and the name of the driver its using.
> 
> By writing to this file, you can have control over the device. By writing
> "suspend 3" to this file, one could place the device into power state "3".
> Basically, by writing to this file, the user has access to the operations
> defined in struct device_driver.
> 
> The PCI layer also adds default files. For devices, it adds a "resource" file
> and a "wake" file. The former reports the BAR information for the device; the
> latter reports the wake capabilities of the device.
> 
> The device layer could also add files for device-specific data reporting and
> control.
> 
> The dentry to the device's directory is kept in struct device. It also keeps a
> linked list of all the files in the directory, with pointers to their read and
> write callbacks. This allows the driver layer to maintain full control of its
> destiny. If it desired to override the default behavior of a file, or simply
> remove it, it could easily do so. (It is assumed that the files added upstream
> will always be a known quantity.)
> 
> These features were initially implemented using procfs. However, after one
> conversation with Linus, a new filesystem - ddfs - was created to implement
> these features. It is an in-memory filesystem, based heavily off of ramfs,
> though it uses procfs as inspiration for its callback functionality.
> 
> Device Structures
> ~~~~~~~~~~~~~~~~~
> 
> struct device {
>         struct list_head        bus_list;
>         struct io_bus           *parent;
>         struct io_bus           *subordinate;
> 
>         char                    name[DEVICE_NAME_SIZE];
>         char                    bus_id[BUS_ID_SIZE];
> 
>         struct dentry           *dentry;
>         struct list_head        files;
> 
>         struct  semaphore       lock;
> 
>         struct device_driver    *driver;
>         void                    *driver_data;
>         void                    *platform_data;
> 
>         u32                     current_state;
>         unsigned char           *saved_state;
> };
> 
> bus_list:
>         List of all devices on a particular bus; i.e. the device's siblings
> 
> parent:
>         The parent bridge for the device.
> 
> subordinate:
>         If the device is a bridge itself, this points to the struct io_bus that is
>         created for it.
> 
> name:
>         Human readable (descriptive) name of device. E.g. "Intel EEPro 100"
> 
> bus_id:
>         Parsable (yet ASCII) bus id. E.g. "00:04.00" (PCI Bus 0, Device 4, Function
>         0). It is necessary to have a searchable bus id for each device; making it
>         ASCII allows us to use it for its directory name without translating it.
> 
> dentry:
>         Pointer to driver's ddfs directory.
> 
> files:
>         Linked list of all the files that a driver has in its ddfs directory.
> 
> lock:
>         Driver specific lock.
> 
> driver:
>         Pointer to a struct device_driver, the common operations for each device. See
>         next section.
> 
> driver_data:
>         Private data for the driver.
>         Much like the PCI implementation of this field, this allows device-specific
>         drivers to keep a pointer to a device-specific data.
> 
> platform_data:
>         Data that the platform (firmware) provides about the device.
>         For example, the ACPI BIOS or EFI may have additional information about the
>         device that is not directly mappable to any existing kernel data structure.
>         It also allows the platform driver (e.g. ACPI) to a driver without the driver
>         having to have explicit knowledge of (atrocities like) ACPI.
> 
> current_state:
>         Current power state of the device. For PCI and other modern devices, this is
>         0-3, though it's not necessarily limited to those values.
> 
> saved_state:
>         Pointer to driver-specific set of saved state.
>         Having it here allows modules to be unloaded on system suspend and reloaded
>         on resume and maintain state across transitions.
>         It also allows generic drivers to maintain state across system state
>         transitions.
>         (I've implemented a generic PCI driver for devices that don't have a
>         device-specific driver. Instead of managing some vector of saved state
>         for each device the generic driver supports, it can simply store it here.)
> 
> struct device_driver {
>         int     (*probe)        (struct device *dev);
>         int     (*remove)       (struct device *dev);
> 
>         int     (*init)         (struct device *dev);
>         int     (*shutdown)     (struct device *dev);
> 
>         int     (*save_state)   (struct device *dev, u32 state);
>         int     (*restore_state)(struct device *dev);
> 
>         int     (*suspend)      (struct device *dev, u32 state);
>         int     (*resume)       (struct device *dev);
> }
> 
> probe:
>         Check for device existence and associate driver with it.
> 
> remove:
>         Dissociate driver with device. Releases device so that it could be used by
>         another driver. Also, if it is a hotplug device (hotplug PCI, Cardbus), an
>         ejection event could take place here.
> 
> init:
>         Initialise the device - allocate resources, irqs, etc.
> 
> shutdown:
>         "De-initialise" the device - release resources, free memory, etc.
> 
> save_state:
>         Save current device state before entering suspend state.
> 
> restore_state:
>         Restore device state, after coming back from suspend state.
> 
> suspend:
>         Physically enter suspend state.
> 
> resume:
>         Physically leave suspend state and re-initialise hardware.
> 
> Initially, the probe/remove sequence followed the PCI semantics exactly, but
> have since been broken up into a four-stage process: probe(), remove(),
> init(), and shutdown().
> 
> While it's not entirely necessary in all environments, breaking them up so
> each routine does only one thing makes sense.
> 
> Hot-pluggable devices may also benefit from this model, especially ones that
> can be subjected to suprise removals - only the remove function would be
> called, and the driver could easily know if the there was still hardware there
> to shutdown.
> 
> Drivers that are controlling failing, or buggy, hardware, by allowing the user
> to trigger a removal of the driver from userspace, without trying to shutdown
> down the device.
> 
> In each case that remove() is called without a shutdown(), it's important to
> note that resources will still need to be freed; it's only the hardware that
> cannot be assumed to be present.

So, remove() might be called without a shutdown(), and then asked to
perform the duties normally performed by shutdown()?  That sounds like
API dain bramage.  :)

Your proposal sounds ok, my one objection is separating probe/remove
further into init/shutdown.  Can you give real-life cases where this
will be useful?  I don't see it causing much except headache.

The preferred way of doing things (IMHO) is to do some simply sanity
checking of the h/w device at probe time, and then perform lots of
initialization and such at device/interface open time.  You ideally want
a device driver lifecycle to look like

probe:
	register interface
	sanity check h/w to make sure it's there and alive
	stop DMA/interrupts/etc., just in case
	start timer to powerdown h/w in N seconds

dev_open:
	wake up device, if necessary
	init device

dev_close:
	stop DMA/interrupts/etc.
	start timer to powerdown h/w in N seconds

With that in mind, init -really- happens at device open, and in
additional is driven more through normal user interaction via standard
APIs, than the PCI and PM subsystems.

-- 
Jeff Garzik      | "Mind if I drive?" -Sam
Building 1024    | "Not if you don't mind me clawing at the dash
MandrakeSoft     |  and shrieking like a cheerleader." -Max

  reply	other threads:[~2001-10-18  6:23 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-10-17 23:52 [RFC] New Driver Model for 2.5 Patrick Mochel
2001-10-18  6:23 ` Jeff Garzik [this message]
2001-10-18 12:13   ` Benjamin Herrenschmidt
2001-10-18 16:19     ` Patrick Mochel
2001-10-18 17:38       ` Tim Jansen
2001-10-18 22:06       ` Benjamin Herrenschmidt
2001-10-19 17:09         ` Kai Henningsen
2001-10-18 22:10       ` Kai Henningsen
2001-10-19 18:26         ` Patrick Mochel
2001-10-19 19:02           ` Tim Jansen
2001-10-19 19:21             ` Mike Fedyk
2001-10-19 20:07               ` Tim Jansen
2001-10-19 20:24                 ` Mike Fedyk
2001-10-19 22:25                   ` Tim Jansen
2001-10-20 13:47                   ` Kai Henningsen
2001-10-20  1:41               ` john slee
2001-10-20 13:52               ` Kai Henningsen
2001-10-22 11:02                 ` Padraig Brady
2001-10-27 11:01                   ` Kai Henningsen
2001-10-19  7:57     ` Henning P. Schmiedehausen
2001-10-19  8:09       ` Jeff Garzik
2001-10-19  8:31         ` Keith Owens
2001-10-19  8:43           ` Jeff Garzik
2001-10-19 18:50       ` Tim Jansen
2001-10-19 15:21     ` Taral
2001-10-19 23:30     ` Benjamin Herrenschmidt
2001-10-19 23:54       ` Benjamin Herrenschmidt
2001-10-18 15:17   ` Patrick Mochel
2001-10-18 16:08 ` Taral
2001-10-18 16:52   ` Jonathan Lundell
2001-10-18 17:38     ` Patrick Mochel
2001-10-18 17:41       ` Patrick Mochel
2001-10-18 18:28       ` Jonathan Lundell
2001-10-18 19:49         ` Patrick Mochel
2001-10-18 20:40           ` Jeff Garzik
2001-10-18 21:32           ` John Alvord
2001-10-18 22:23             ` Benjamin Herrenschmidt
2001-10-18 22:26             ` Jeff Garzik
2001-10-18 22:18           ` Benjamin Herrenschmidt
2001-10-18 23:30             ` Patrick Mochel
2001-10-18 23:44               ` Benjamin Herrenschmidt
2001-10-18 23:52                 ` Jeff Garzik
     [not found]         ` <3BCF3941.D4B79FE1@mandrakesoft.com>
2001-10-19 17:12           ` Jonathan Lundell
2001-10-18 17:05 ` Jonathan Corbet
2001-10-18 17:33   ` Patrick Mochel
  -- strict thread matches above, loose matches on Subject: below --
2001-10-19 17:01 Kevin Easton
2001-10-19 18:40 ` Patrick Mochel
2001-10-19 21:43 Grover, Andrew
2001-10-19 23:33 Benjamin Herrenschmidt
2001-10-20  0:09 ` Linus Torvalds
2001-10-20  9:28   ` Benjamin Herrenschmidt
2001-10-21 17:09     ` Pavel Machek
2001-10-23  0:19       ` Patrick Mochel
2001-10-23  0:31         ` Alan Cox
2001-10-23  0:29           ` Patrick Mochel
2001-10-23  7:53             ` Alan Cox
2001-10-23 15:10               ` Jonathan Lundell
2001-10-23 15:49                 ` Alan Cox
2001-10-23 20:22                 ` Benjamin Herrenschmidt
2001-10-23 20:54                   ` Alan Cox
2001-10-24  0:26                     ` Benjamin Herrenschmidt
2001-10-24  9:57                       ` Alan Cox
2001-10-24 10:34                         ` Benjamin Herrenschmidt
2001-10-24 10:54                           ` Alan Cox
2001-10-24 13:04                             ` Benjamin Herrenschmidt
2001-10-24 13:25                               ` Alan Cox
2001-10-24 16:19                                 ` Linus Torvalds
2001-10-24 16:36                                   ` Michael H. Warfield
2001-10-24 16:45                                     ` Linus Torvalds
2001-10-24 22:48                                   ` Alan Cox
2001-10-24 16:15                               ` Linus Torvalds
2001-10-24 16:46                                 ` Xavier Bestel
2001-10-24 16:54                                   ` Patrick Mochel
2001-10-24 16:55                                   ` Linus Torvalds
2001-10-24 22:45                                     ` Alan Cox
2001-10-24 17:33                                 ` Benjamin Herrenschmidt
2001-10-24 22:41                                   ` Alan Cox
2001-10-24 22:41                                     ` Linus Torvalds
2001-10-25  7:58                                     ` Benjamin Herrenschmidt
2001-10-25 12:22                                       ` Alan Cox
2001-10-25 14:57                                         ` Benjamin Herrenschmidt
2001-10-25  8:03                                     ` Benjamin Herrenschmidt
2001-10-25  8:09                                       ` Benjamin Herrenschmidt
2001-10-25 12:20                                       ` Alan Cox
2001-10-25 21:47                                   ` Pavel Machek
2001-10-24 22:50                                 ` Alan Cox
2001-10-25  4:14                                   ` Linus Torvalds
2001-10-25 12:42                                     ` Alan Cox
2001-10-25 21:52                                       ` Xavier Bestel
2001-10-25 23:53                                         ` Benjamin Herrenschmidt
2001-10-25 23:53                                           ` Alan Cox
2001-10-26 11:35                                       ` Helge Hafting
2001-10-26 12:38                                         ` Alan Cox
2001-10-25  8:27                                 ` Rob Turk
2001-10-25 10:01                                   ` Benjamin Herrenschmidt
2001-10-25 10:02                                   ` Helge Hafting
2001-10-25 14:20                                   ` Victor Yodaiken
2001-10-25 14:44                                     ` Jeff Garzik
2001-10-25 14:45                                     ` Jeff Garzik
2001-10-25 15:22                                     ` Rob Turk
2001-10-25 15:44                                       ` Jonathan Lundell
2001-10-25 16:26                                     ` David Lang
2001-10-25 21:59                                   ` Pavel Machek
2001-10-25 21:32                                     ` Rob Turk
2001-10-24 17:01                               ` Mike Anderson
2001-10-25  9:02                               ` Eric W. Biederman
2001-10-25  9:29                                 ` Linus Torvalds
2001-10-25  9:47                                   ` Benjamin Herrenschmidt
2001-10-25 10:11                                   ` Eric W. Biederman
2001-10-25 10:59                                     ` Linus Torvalds
2001-10-24 15:18                         ` Jonathan Lundell
2001-10-24 15:41                         ` Linus Torvalds
2001-10-24 15:59                           ` Alan Cox
2001-10-24 15:56                             ` Linus Torvalds
2001-10-23  9:44             ` Pavel Machek
2001-10-23 11:03               ` Benjamin Herrenschmidt
2001-10-23 11:49                 ` Benjamin Herrenschmidt
2001-10-23 10:54           ` Benjamin Herrenschmidt
2001-10-24 17:56 Grover, Andrew
2001-10-24 18:45 ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3BCE7568.1DAB9FF0@mandrakesoft.com \
    --to=jgarzik@mandrakesoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mochelp@infinity.powertie.org \
    --cc=torvalds@transmeta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox