From mboxrd@z Thu Jan  1 00:00:00 1970
From: Hiroshi Aono
Date: Mon, 28 Jan 2002 01:36:35 +0000
Subject: Hot Plug I/O Node spec 0.2.1
Message-Id:
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-hotplug@vger.kernel.org

Hello hotplug developers,

I'm working on Hot Plug I/O Node support, which is part of the Atlas
project. I've been considering the specification of the Hot Plug IO
node and I've started development on it.

Below is the Hot Plug I/O Node specification I want to implement.
It isn't fixed yet. If you're interested, please feel free to add
your ideas.

Best regards,
---
Hiroshi Aono, NEC Solutions (h-aono@ap.jp.nec.com)

----------------8<----------------

IO Node Hot Plug Design Notes
==============

Revision 0.2.1
by Hiroshi Aono (h-aono@ap.jp.nec.com)

1. Outline

1.1. Development objective

This document describes the functional design of the IO node hot plug
specification that is being discussed in the Atlas project.

The primary objective is to implement support in Linux for the 82870
(for the upcoming McKinley processor) hardware features that allow IO
node hot plug for nodes that do not contain legacy devices. The IO
node hot plug capability is restricted to IO nodes that contain only
hot pluggable buses and is heavily dependent on PCI hot plug device
support.

1.2. IO node

Expected system layout:

      +---+   +---+
      |CPU|...|CPU|
      +---+   +---+
        |       |
      +-----------+
      |    PS     |
      +-----------+
   ->   |       |   <- Hot-pluggable
      +---+   +---+
      |ION|...|ION|
      +---+   +---+

      PS:  PortSwitch
      ION: IO Node

Figure 1 Hot-pluggable System
----------------------------

An IO node can be hot-plugged at the connection between the PS (Port
Switch) and the IO node. An IO node consists of IO bridges, interrupt
controllers, some PCI buses, many PCI slots and so on.

IONode
+-----------------------------------------------+
|          BRIDGE0               BRIDGE1 ...... |
|+-------------------------+   +---------+      |
||  +-------+              |   |         |      |
||  |IOSAPIC|              |   |         |      |
||  +-------+              |   |         |      |
|| PCIBUS0  PCIBUS1  ..... |   |         |      |
||| --+-- || --+--SLOT     |   |         |      |
||| --+-- || --+--SLOT     |   |         |      |
||| --+-- || --+-- |       |   |         |      |
||| --+-- || --+-- |       |   |         |      |
||+-------++-------+       |   |         |      |
|+-------------------------+   +---------+      |
+-----------------------------------------------+

Figure 2 IO Node
---------------

A BRIDGE is not necessarily a PCI bridge; it may also be a
next-generation IO bridge such as InfiniBand, 3GIO and so on.

The interface between the OS and the hardware for both hot-plugging
and removal is mainly provided by ACPI. When an IO node is inserted or
removed, a single interrupt (called SCI) is generated. Instead of the
interrupt, we can also use an interface in Linux, such as a GUI/CUI.
(This is for debugging, or for a machine that doesn't support hardware
interrupts on insertion/removal. It provides a command interface and
executes the interrupt handler for insertion/removal on user request.)

All hardware resources such as nodes, PCI buses, and slots are
described in the ACPI table. An IO node will be described as a Module
device. (See ACPI2.0 spec p250)

2. Impact on other components

2.1. Domain Resource Manager

The IO node manager has an interface to communicate with the Domain
Resource Manager. We will have to decide the details of the interface.

2.2. ACPI module device support

The firmware (BIOS) should support the ACPI module device.

2.3. ACPI hot plug support

The BIOS should support hot plug using ACPI.

2.4. IO/Memory space reservation function support

The BIOS should support a function that can reserve some IO/memory
spaces (IO/Memory gap) and bus numbers before booting.

2.5. Expand/Narrow Memory/IO region interface

We recommend that the firmware (BIOS) support a function that can
re-program the IO node memory region and bus numbers.

3. Functions

3.1. Functions overview

This is the IO node hot plug function block diagram.
+---------------+                  +-------------+
|Hotplug        |       +---------/Configuration/
|Application    |       |        /    File     /
+---------------+       |       +-------------+
---------|--------------|------------------------------------
+---------------+  +-------------------+      [kernel]
|   Hotplug     |  |Object Registration|
|   Filesystem  |  |Interface          |      User API
+---------------+  +-------------------+
         |              |
         |              |
              +---------------------+
         |    |IO node resource     |         Kernel module
         |    |manager              |----+
         |    +---------------------+    |
         |        |             |        |
         |  +-----------+   +-------+    |
         +->|PCI hotplug|   |IO node|    |
            |Function   |   |Driver |    |
            +-----------+   +-------+    |
                  |             |        |
            +--------------+--------------+
            |   PCI slot   |Module Device |
            |    Driver    |Support Driver|
            +--------------+--------------+
            |           ACPI-CA           |
            +-----------------------------+
-------------------------------------------------------------
            +-----------------------------+
            |          Firmware           |
            +-----------------------------+

Figure 3 Function overview
--------------------------

Each task in the Atlas IO node hotplug project corresponds to the
following components:

o Hotplug PCI device support
    -> PCI hotplug Function, PCI slot Driver
o Per IO node resource manager
    -> IO node resource manager, Object Registration Interface
o ACPI Hotplug IO node states support
    -> IO node driver, Module Device Support Driver
o System management interface for notifying OS to add/remove an
  IO node
    -> Hotplug application, Hotplug Filesystem

3.2. PCI device hot plug function

3.2.1 PCI hotplug function

The PCI hot plug function has already been implemented in the official
2.4 Linux kernel. That implementation is for PCI hotplug controllers,
so we are working on making it use ACPI methods.

3.2.2 PCI slot driver

This will be included in ACPI-CA. This component has the following
APIs:

o SCI interrupt handler registration
o Getting slot resources
o Getting slot status

These functions are used by the PCI hotplug function.

3.3. Hotplug Filesystem

We'd like to use the hotplug filesystem.
It currently serves as the PCI hotplug filesystem, and we feel it is
the most suitable basis for a generic hotplug interface. The hotplug
filesystem will be used for user interfaces. We will extend it to
handle a hierarchical structure for describing IO node objects.

The IO node hotplug hierarchical structure will be as follows:

--+ ionodeXX
    + bridgeXX
        + slotXX
        + slotXX
        + pic
    + bridgeYYY
        + slotYY
        + slotYY
        + pic

Each node is a directory containing the following files:

o adapter
o attention
o latch
o power
o test

These files are used for controlling hotplug functions.

3.4. IO node resource manager

The IO node resource manager is the central component of the IO node
hot plug function. It manages IO resources such as IO spaces, IRQs,
and bus numbers. The IO node resource manager reads the device
structure via the IO node driver and manages those objects.

3.4.1. Management objects

The IO node resource manager handles the following objects:

o IO nodes
o IOSAPICs (Interrupt controller)
o BRIDGEs
o bus numbers
o IRQs

The following objects are managed by the PCI hotplug function:

o PCI buses
o Slots

3.4.2. Interface between IO node manager and PCI hotplug function

IO node hotplug uses the /sbin/hotplug script. When an IO node is
added, a script for the IO node will be executed and the PCI hotplug
driver will be loaded. Conversely, when an IO node is removed, a
script for the IO node will be executed and the PCI hotplug driver
will stop all PCI devices under the IO node.

3.4.3. Solutions for assignment of bus numbers, I/O addresses and
       interrupts

Our first solution is to use the resources that the BIOS has allocated
at boot time. In this case, we expect the BIOS to support reserving an
IO/Memory space gap and bus numbers for hotplugging.

3.4.4. IRQ management

(To be discussed later.)

3.4.5. Advanced solution

For more advanced solutions, we should consider active configuration
of the IO node memory region and bus number region.
For example, when we want to expand region A because the IO memory
area is insufficient, we should do the following things:

o Narrow region B.
o Expand region A.

  MEMORY MAP                        MEMORY MAP
+-------------+                   +-------------+
|             |                   |             |
+-------------+--                 +-------------+--
|             |A (IO node 1)      |             |A (IO node 1)
|             |             ->    |             |
+-------------+--                 |             |
|             |B (IO node 2)      +-------------+
|             |                   |             |B' (IO node 2)
+-------------+--                 +-------------+--
|             |                   |             |

Figure 4 Advanced solution
--------------------------

In addition, when some PCI cards are in use on IO node 2 (region B),
we should consider the following things:

o Suspend all of the devices on IO node 1
o Suspend the cards working on IO node 2
o Move card resources to region B'
o Re-program the configuration space for cards on IO node 2
o Resume the cards on IO node 2
o Resume all the devices on IO node 1

Also, the hardware should support the Expand/Narrow region function.

When changing the resources, the module device support driver will use
AcpiGetPossibleResources/AcpiSetCurrentResources (ACPI-CA functions).

3.5. IO node driver

The IO node driver has the following functions:

o Interrupt handler for removing/adding IO nodes.
o Initialize/cleanup IOSAPICs, BRIDGEs and Buses on an IO node.

3.5.1 Interrupt handler

The IO node driver registers an interrupt handler that is called when
an IO node is hot-added or is removed without proper preparation
(surprise removal). When adding an IO node, the IO node driver will
detect all the devices under the hot-added IO node and initialize
them. When removing an IO node, the driver will clean up the devices
under the IO node.

3.6. Module device support driver

This will be included in ACPI-CA. This component has the following
APIs:

o SCI interrupt handler registration
o Getting resources (IO nodes, IOSAPICs, BRIDGEs, Buses)
o Getting IO node status
o Setting IO node memory region, bus number (Advanced)

These will be used by the IO node driver.

3.7. Object Registration Interface

This component provides an interface for registering IO node objects.
This is for non-ACPI machines and for debugging. We can add the
objects via this interface instead of using the IO node driver.

3.8. State and transition for IO node

This is the outline of the hot plug procedure:

1 Not Present
   |  (Insertion)
      (1.1) FW: raises an interrupt (SCI)
            (Note: FW: Firmware)
   |
   v
2 Present but not communicating
      (2.1) OS: ACPI driver handles the SCI and sends the message
                "Inserted IO nodeXX" to the syslog. (for debugging)
      (2.2) FW: raises an interrupt (SCI)
   |
   v
3 Communicating
      (3.1) OS: ACPI driver handles the SCI and sends the message
                "Communicating" to the syslog. (for debugging)
   |
   v
4 Ready to Join
      (4.1) FW: scans buses.
      (4.2) FW: maps IO spaces.
      (4.3) OS: scans buses and devices on the IO node.
      (4.4) OS: adds IO node resources.
      (4.5) OS: sets interrupt vectors.
      (4.6) OS: calls /sbin/hotplug with notification of the new
                devices seen.
   |
   v
5 Running
   |  (Removal)
      (5.1) FW: raises an interrupt (SCI), or the user requests
                removal
   |
   v
6 Prepare to disable
      (6.1) OS: ACPI driver handles the SCI.
      (6.2) OS: ACPI driver or node manager calls the IO node event
                handler functions for pre-removal.
      (6.3) OS: calls the remove method of the port drivers on the
                IO node and stops the IO requests.
      (6.4) OS: changes the IO node state to unavailable.
      (6.5) OS: executes the _PS3 and _EJ0 methods.
                (Power off and Eject)
      (6.6) OS: executes _STA to verify that the node was ejected
                successfully.
   |
   v
7 Ready to Remove (Present but not communicating)
   |  (Physical removal)
      (7.1) User: removes the IO node.
      (7.2) FW: raises an interrupt (SCI)
      (7.3) OS: ACPI driver handles the SCI.
      (7.4) OS: deletes the IO node resource.
   |
   v
Go to 1

3.9. IO node states

We define the following IO node states.

o Online
    The state is in 5 (Running).
o Offline
    The state is between 7 (Ready to Remove) and 1 (Not Present).

o Unavailable
    The state is in one of the following ranges:
    - Between after 1 (Insertion) and after 4 (Ready to Join)
    - Between after 5 (Removal) and after 6 (Prepare to disable)
    - Between after 6 (Prepare to disable) and before 7 (Ready to
      Remove)

The state can only change in ascending order. Reverse order is not
permitted.
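
The ordering rule from 3.8 and 3.9 can be sketched in C as follows.
This is only an illustration, not code from the Atlas project: all
enum and function names are hypothetical. It assumes that each step
may only move to the next numbered step, that step 7 wraps back to
step 1 (the "Go to 1" in the outline), and that steps 1 and 7 count as
Offline, step 5 as Online, and every other step as Unavailable, which
is one reading of the definitions in 3.9.

```c
#include <assert.h>
#include <stdbool.h>

/* Steps of the hot plug procedure, numbered as in section 3.8.
 * The identifier names are hypothetical. */
enum ionode_step {
    IONODE_NOT_PRESENT        = 1,
    IONODE_PRESENT_NOT_COMM   = 2,
    IONODE_COMMUNICATING      = 3,
    IONODE_READY_TO_JOIN      = 4,
    IONODE_RUNNING            = 5,
    IONODE_PREPARE_TO_DISABLE = 6,
    IONODE_READY_TO_REMOVE    = 7
};

/* Externally visible IO node states from section 3.9. */
enum ionode_state {
    IONODE_ONLINE,
    IONODE_OFFLINE,
    IONODE_UNAVAILABLE
};

/* Return true if moving from 'from' to 'to' respects the ascending
 * order rule. The only permitted wrap is 7 -> 1 ("Go to 1"). */
bool ionode_step_allowed(enum ionode_step from, enum ionode_step to)
{
    if (from == IONODE_READY_TO_REMOVE)
        return to == IONODE_NOT_PRESENT;
    return (int)to == (int)from + 1;
}

/* Map a procedure step to Online/Offline/Unavailable.
 * Assumption: steps 1 and 7 are Offline, step 5 is Online, and the
 * transitional steps 2, 3, 4 and 6 are Unavailable. */
enum ionode_state ionode_state_of(enum ionode_step step)
{
    assert(step >= IONODE_NOT_PRESENT && step <= IONODE_READY_TO_REMOVE);

    switch (step) {
    case IONODE_RUNNING:
        return IONODE_ONLINE;
    case IONODE_NOT_PRESENT:
    case IONODE_READY_TO_REMOVE:
        return IONODE_OFFLINE;
    default:
        return IONODE_UNAVAILABLE;
    }
}
```

An event handler could call ionode_step_allowed() before acting on an
SCI and reject any request that would move the node backwards, for
example from 5 (Running) to 4 (Ready to Join).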