From: Yijing Wang <wangyijing@huawei.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org, Wuyun <wuyun.wu@huawei.com>,
Russell King <linux@arm.linux.org.uk>,
Paul.Mundt@huawei.com, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org,
"James E.J. Bottomley" <jejb@parisc-linux.org>,
Xinwei Hu <huxinwei@huawei.com>,
Hanjun Guo <guohanjun@huawei.com>,
Bjorn Helgaas <bhelgaas@google.com>,
virtualization@lists.linux-foundation.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
Date: Mon, 4 Aug 2014 14:43:07 +0800 [thread overview]
Message-ID: <53DF2B7B.4000605@huawei.com> (raw)
In-Reply-To: <201408011552.32135.arnd@arndb.de>
On 2014/8/1 21:52, Arnd Bergmann wrote:
> On Wednesday 30 July 2014, Yijing Wang wrote:
>> On 2014/7/29 22:08, Arnd Bergmann wrote:
>>> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>>>
>>>> The new data struct for generic MSI driver.
>>>> struct msi_irqs {
>>>> u8 msi_enabled:1; /* Enable flag */
>>>> u8 msix_enabled:1;
>>>> struct list_head msi_list; /* MSI desc list */
>>>> void *data; /* help to find the MSI device */
>>>> struct msi_ops *ops; /* MSI device specific hook */
>>>> };
>>>> struct msi_irqs is used to manage MSI related informations. Every device supports
>>>> MSI should contain this data struct and allocate it.
>>>
>>> I think you should have a stronger association with the 'struct
>>> device' here. Can you replace the 'void *data' with 'struct device *dev'?
>>
>> Actually, I used the struct device *dev in my first draft, finally, I replaced
>> it with void *data, because some MSI devices don't have a struct device *dev,
>> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
>>
>> Of course, we can make the MSI devices have their own struct device, and register to
>> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.
>
> It doesn't have to be in the (OF) device tree, but I think it absolutely makes
> sense to use the 'struct device' infrastructure here, as almost everything uses
> a device, and the ones that don't do that today can be easily changed.
I will try to use "struct device" infrastructure, thanks for your suggestion. :)
>
>>> The other part I'm not completely sure about is how you want to
>>> have MSIs map into normal IRQ descriptors. At the moment, all
>>> MSI users are based on IRQ numbers, but this has known scalability problems.
>>
>> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
>> I'm sorry, I don't understand you meaning.
>> What are the scalability problems you mentioned ?
>> For device drivers, they always process interrupt in two steps.
>> If irq is the legacy interrupt, drivers will first
>> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
>> Then drivers will call the request_irq() to register the interrupt handler.
>> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
>> request_irq() to register interrupt handler.
>
> The method you describe here makes sense for PCI devices that are required to support
> legacy interrupts and may or may not support MSI on a given system, but not so much
> for platform devices for which we know exactly whether we want to use MSI
> or legacy interrupts.
>
> In particular if you have a device that can only do MSI, the entire pci_enable_msi
> step is pointless: all we need to do is program the correct MSI target address/message
> pair into the device and register the handler.
Yes, I almost agree if we won't change the existing hundreds of drivers, what
I worried about is some drivers may want to know the IRQ numbers, and use the IRQ
number to process different things, as I mentioned in another reply.
But we can also provide the interface which integrate MSI configuration and request_irq(),
if most drivers don't care the IRQ number.
>
>>> I wonder if we can do the interface in a way that
>>> hides the interrupt number from generic device drivers and just
>>> passes a 'struct irq_desc'. Note that there are long-term plans to
>>> get rid of IRQ numbers entirely, but those plans have existed for
>>> a long time already without anybody seriously addressing the device
>>> driver interfaces so far, so it might never really happen.
>>>
>>
>> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
>> this in a separate title.
>
> Sorry for being unclear here: I did suggest changing all drivers now. What I meant
> is that we use a different API for non-PCI devices that works without IRQ numbers.
> I don't think we should touch the PCI interfaces at this point.
OK, I got it.
>>> What I'd envision as the API from the device driver perspective is something
>>> as simple like this:
>>>
>>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
>>> unsigned long flags, const char *name, struct device *dev);
>>>
>>> which would get an msi descriptor that is valid for this device (dev)
>>> connected to a particular msi_chip, and associate a handler function
>>> with it. The device driver can call that function and retrieve the
>>> address/message pair from the msi_desc in order to store it in its own
>>> device specific registers. The request_irq() can be handled internally
>>> to msi_request().
>>
>> This is a huge change for device drivers, and some device drivers don't know which msi_chip
>> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
>> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
>> binding for the MSI device to the target msi_chip.
>
> Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
> but only platform drivers that do native MSI, so we only have to change drivers that
> do not support any MSI at all yet and that need to be changed anyway in order to add
> support.
I mean platform device drivers, because we can find the target msi_chip by some platform
interfaces(like the existing of_pci_find_msi_chip_by_node()). So we no need to explicitly provide
the msi_chip as the function argument.
>
>> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
>> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
>> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
>> by device_node.
>> I don't know whether there are other situations, we should provide a generic interface that
>> every MSI device under every platform can use it to find its msi_chip exactly.
>
> We have introduced the "msi-parent" property to mirror the "interrupt-parent" property.
> For the PCI case, this property is only needed in the PCI host controller, and there
> can be a system-wide default, by putting the "msi-parent" property into the root device
> node or the node of a bus that is parent to all devices supporting MSI.
>
> For non-PCI devices, it should be possible to override the "msi-parent" property per
> device, but those can also use the global property.
>
> The main use case that I see are PCI host controllers that have their own MSI catcher
> included, so meaning that any PCI device can either send its MSIs there, or to a
> system-wide GICv3 instance, and we need a way to select which one.
Yes, agree.
>
> A degenerate case of this would be a system where a PCI device sends its MSI into
> the host controller, that generates a legacy interrupt and that in turn gets
> sent to an irqchip which turns it back into an MSI for the GICv3. This would of
> course be very inefficient, but I think we should be able to express this with
> both the binding and the in-kernel framework just to be on the safe side.
Yes, the best way to tell the kernel which msi_chip should deliver to is describe
the binding in DTS file. If a real degenerate case found, we can update the platform
interface which is responsible for getting the match msi_chip in future.
>
>
> Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
> .
>
--
Thanks!
Yijing
WARNING: multiple messages have this Message-ID (diff)
From: Yijing Wang <wangyijing@huawei.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
Russell King <linux@arm.linux.org.uk>,
Paul.Mundt@huawei.com, Marc Zyngier <marc.zyngier@arm.com>,
linux-pci@vger.kernel.org,
"James E.J. Bottomley" <jejb@parisc-linux.org>,
virtualization@lists.linux-foundation.org,
Xinwei Hu <huxinwei@huawei.com>,
Hanjun Guo <guohanjun@huawei.com>,
Bjorn Helgaas <bhelgaas@google.com>, Wuyun <wuyun.wu@huawei.com>
Subject: Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
Date: Mon, 4 Aug 2014 14:43:07 +0800 [thread overview]
Message-ID: <53DF2B7B.4000605@huawei.com> (raw)
Message-ID: <20140804064307.lOGTF1HFapqEXVD3P6UR7Pzso8UbU4wj7-Tw4NIXT9E@z> (raw)
In-Reply-To: <201408011552.32135.arnd@arndb.de>
On 2014/8/1 21:52, Arnd Bergmann wrote:
> On Wednesday 30 July 2014, Yijing Wang wrote:
>> On 2014/7/29 22:08, Arnd Bergmann wrote:
>>> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>>>
>>>> The new data struct for generic MSI driver.
>>>> struct msi_irqs {
>>>> u8 msi_enabled:1; /* Enable flag */
>>>> u8 msix_enabled:1;
>>>> struct list_head msi_list; /* MSI desc list */
>>>> void *data; /* help to find the MSI device */
>>>> struct msi_ops *ops; /* MSI device specific hook */
>>>> };
>>>> struct msi_irqs is used to manage MSI related informations. Every device supports
>>>> MSI should contain this data struct and allocate it.
>>>
>>> I think you should have a stronger association with the 'struct
>>> device' here. Can you replace the 'void *data' with 'struct device *dev'?
>>
>> Actually, I used the struct device *dev in my first draft, finally, I replaced
>> it with void *data, because some MSI devices don't have a struct device *dev,
>> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
>>
>> Of course, we can make the MSI devices have their own struct device, and register to
>> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.
>
> It doesn't have to be in the (OF) device tree, but I think it absolutely makes
> sense to use the 'struct device' infrastructure here, as almost everything uses
> a device, and the ones that don't do that today can be easily changed.
I will try to use "struct device" infrastructure, thanks for your suggestion. :)
>
>>> The other part I'm not completely sure about is how you want to
>>> have MSIs map into normal IRQ descriptors. At the moment, all
>>> MSI users are based on IRQ numbers, but this has known scalability problems.
>>
>> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
>> I'm sorry, I don't understand you meaning.
>> What are the scalability problems you mentioned ?
>> For device drivers, they always process interrupt in two steps.
>> If irq is the legacy interrupt, drivers will first
>> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
>> Then drivers will call the request_irq() to register the interrupt handler.
>> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
>> request_irq() to register interrupt handler.
>
> The method you describe here makes sense for PCI devices that are required to support
> legacy interrupts and may or may not support MSI on a given system, but not so much
> for platform devices for which we know exactly whether we want to use MSI
> or legacy interrupts.
>
> In particular if you have a device that can only do MSI, the entire pci_enable_msi
> step is pointless: all we need to do is program the correct MSI target address/message
> pair into the device and register the handler.
Yes, I almost agree if we won't change the existing hundreds of drivers, what
I worried about is some drivers may want to know the IRQ numbers, and use the IRQ
number to process different things, as I mentioned in another reply.
But we can also provide the interface which integrate MSI configuration and request_irq(),
if most drivers don't care the IRQ number.
>
>>> I wonder if we can do the interface in a way that
>>> hides the interrupt number from generic device drivers and just
>>> passes a 'struct irq_desc'. Note that there are long-term plans to
>>> get rid of IRQ numbers entirely, but those plans have existed for
>>> a long time already without anybody seriously addressing the device
>>> driver interfaces so far, so it might never really happen.
>>>
>>
>> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
>> this in a separate title.
>
> Sorry for being unclear here: I did suggest changing all drivers now. What I meant
> is that we use a different API for non-PCI devices that works without IRQ numbers.
> I don't think we should touch the PCI interfaces at this point.
OK, I got it.
>>> What I'd envision as the API from the device driver perspective is something
>>> as simple like this:
>>>
>>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
>>> unsigned long flags, const char *name, struct device *dev);
>>>
>>> which would get an msi descriptor that is valid for this device (dev)
>>> connected to a particular msi_chip, and associate a handler function
>>> with it. The device driver can call that function and retrieve the
>>> address/message pair from the msi_desc in order to store it in its own
>>> device specific registers. The request_irq() can be handled internally
>>> to msi_request().
>>
>> This is a huge change for device drivers, and some device drivers don't know which msi_chip
>> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
>> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
>> binding for the MSI device to the target msi_chip.
>
> Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
> but only platform drivers that do native MSI, so we only have to change drivers that
> do not support any MSI at all yet and that need to be changed anyway in order to add
> support.
I mean platform device drivers, because we can find the target msi_chip by some platform
interfaces(like the existing of_pci_find_msi_chip_by_node()). So we no need to explicitly provide
the msi_chip as the function argument.
>
>> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
>> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
>> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
>> by device_node.
>> I don't know whether there are other situations, we should provide a generic interface that
>> every MSI device under every platform can use it to find its msi_chip exactly.
>
> We have introduced the "msi-parent" property to mirror the "interrupt-parent" property.
> For the PCI case, this property is only needed in the PCI host controller, and there
> can be a system-wide default, by putting the "msi-parent" property into the root device
> node or the node of a bus that is parent to all devices supporting MSI.
>
> For non-PCI devices, it should be possible to override the "msi-parent" property per
> device, but those can also use the global property.
>
> The main use case that I see are PCI host controllers that have their own MSI catcher
> included, so meaning that any PCI device can either send its MSIs there, or to a
> system-wide GICv3 instance, and we need a way to select which one.
Yes, agree.
>
> A degenerate case of this would be a system where a PCI device sends its MSI into
> the host controller, that generates a legacy interrupt and that in turn gets
> sent to an irqchip which turns it back into an MSI for the GICv3. This would of
> course be very inefficient, but I think we should be able to express this with
> both the binding and the in-kernel framework just to be on the safe side.
Yes, the best way to tell the kernel which msi_chip should deliver to is describe
the binding in DTS file. If a real degenerate case found, we can update the platform
interface which is responsible for getting the match msi_chip in future.
>
>
> Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
> .
>
--
Thanks!
Yijing
WARNING: multiple messages have this Message-ID (diff)
From: Yijing Wang <wangyijing@huawei.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: <linux-arm-kernel@lists.infradead.org>,
<linux-kernel@vger.kernel.org>, <linux-arch@vger.kernel.org>,
Russell King <linux@arm.linux.org.uk>, <Paul.Mundt@huawei.com>,
Marc Zyngier <marc.zyngier@arm.com>, <linux-pci@vger.kernel.org>,
"James E.J. Bottomley" <jejb@parisc-linux.org>,
<virtualization@lists.linux-foundation.org>,
Xinwei Hu <huxinwei@huawei.com>,
Hanjun Guo <guohanjun@huawei.com>,
Bjorn Helgaas <bhelgaas@google.com>, Wuyun <wuyun.wu@huawei.com>
Subject: Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
Date: Mon, 4 Aug 2014 14:43:07 +0800 [thread overview]
Message-ID: <53DF2B7B.4000605@huawei.com> (raw)
In-Reply-To: <201408011552.32135.arnd@arndb.de>
On 2014/8/1 21:52, Arnd Bergmann wrote:
> On Wednesday 30 July 2014, Yijing Wang wrote:
>> On 2014/7/29 22:08, Arnd Bergmann wrote:
>>> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>>>
>>>> The new data struct for generic MSI driver.
>>>> struct msi_irqs {
>>>> u8 msi_enabled:1; /* Enable flag */
>>>> u8 msix_enabled:1;
>>>> struct list_head msi_list; /* MSI desc list */
>>>> void *data; /* help to find the MSI device */
>>>> struct msi_ops *ops; /* MSI device specific hook */
>>>> };
>>>> struct msi_irqs is used to manage MSI related informations. Every device supports
>>>> MSI should contain this data struct and allocate it.
>>>
>>> I think you should have a stronger association with the 'struct
>>> device' here. Can you replace the 'void *data' with 'struct device *dev'?
>>
>> Actually, I used the struct device *dev in my first draft, finally, I replaced
>> it with void *data, because some MSI devices don't have a struct device *dev,
>> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
>>
>> Of course, we can make the MSI devices have their own struct device, and register to
>> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.
>
> It doesn't have to be in the (OF) device tree, but I think it absolutely makes
> sense to use the 'struct device' infrastructure here, as almost everything uses
> a device, and the ones that don't do that today can be easily changed.
I will try to use "struct device" infrastructure, thanks for your suggestion. :)
>
>>> The other part I'm not completely sure about is how you want to
>>> have MSIs map into normal IRQ descriptors. At the moment, all
>>> MSI users are based on IRQ numbers, but this has known scalability problems.
>>
>> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
>> I'm sorry, I don't understand you meaning.
>> What are the scalability problems you mentioned ?
>> For device drivers, they always process interrupt in two steps.
>> If irq is the legacy interrupt, drivers will first
>> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
>> Then drivers will call the request_irq() to register the interrupt handler.
>> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
>> request_irq() to register interrupt handler.
>
> The method you describe here makes sense for PCI devices that are required to support
> legacy interrupts and may or may not support MSI on a given system, but not so much
> for platform devices for which we know exactly whether we want to use MSI
> or legacy interrupts.
>
> In particular if you have a device that can only do MSI, the entire pci_enable_msi
> step is pointless: all we need to do is program the correct MSI target address/message
> pair into the device and register the handler.
Yes, I almost agree if we won't change the existing hundreds of drivers, what
I worried about is some drivers may want to know the IRQ numbers, and use the IRQ
number to process different things, as I mentioned in another reply.
But we can also provide the interface which integrate MSI configuration and request_irq(),
if most drivers don't care the IRQ number.
>
>>> I wonder if we can do the interface in a way that
>>> hides the interrupt number from generic device drivers and just
>>> passes a 'struct irq_desc'. Note that there are long-term plans to
>>> get rid of IRQ numbers entirely, but those plans have existed for
>>> a long time already without anybody seriously addressing the device
>>> driver interfaces so far, so it might never really happen.
>>>
>>
>> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
>> this in a separate title.
>
> Sorry for being unclear here: I did suggest changing all drivers now. What I meant
> is that we use a different API for non-PCI devices that works without IRQ numbers.
> I don't think we should touch the PCI interfaces at this point.
OK, I got it.
>>> What I'd envision as the API from the device driver perspective is something
>>> as simple like this:
>>>
>>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
>>> unsigned long flags, const char *name, struct device *dev);
>>>
>>> which would get an msi descriptor that is valid for this device (dev)
>>> connected to a particular msi_chip, and associate a handler function
>>> with it. The device driver can call that function and retrieve the
>>> address/message pair from the msi_desc in order to store it in its own
>>> device specific registers. The request_irq() can be handled internally
>>> to msi_request().
>>
>> This is a huge change for device drivers, and some device drivers don't know which msi_chip
>> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
>> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
>> binding for the MSI device to the target msi_chip.
>
> Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
> but only platform drivers that do native MSI, so we only have to change drivers that
> do not support any MSI at all yet and that need to be changed anyway in order to add
> support.
I mean platform device drivers, because we can find the target msi_chip by some platform
interfaces(like the existing of_pci_find_msi_chip_by_node()). So we no need to explicitly provide
the msi_chip as the function argument.
>
>> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
>> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
>> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
>> by device_node.
>> I don't know whether there are other situations, we should provide a generic interface that
>> every MSI device under every platform can use it to find its msi_chip exactly.
>
> We have introduced the "msi-parent" property to mirror the "interrupt-parent" property.
> For the PCI case, this property is only needed in the PCI host controller, and there
> can be a system-wide default, by putting the "msi-parent" property into the root device
> node or the node of a bus that is parent to all devices supporting MSI.
>
> For non-PCI devices, it should be possible to override the "msi-parent" property per
> device, but those can also use the global property.
>
> The main use case that I see are PCI host controllers that have their own MSI catcher
> included, so meaning that any PCI device can either send its MSIs there, or to a
> system-wide GICv3 instance, and we need a way to select which one.
Yes, agree.
>
> A degenerate case of this would be a system where a PCI device sends its MSI into
> the host controller, that generates a legacy interrupt and that in turn gets
> sent to an irqchip which turns it back into an MSI for the GICv3. This would of
> course be very inefficient, but I think we should be able to express this with
> both the binding and the in-kernel framework just to be on the safe side.
Yes, the best way to tell the kernel which msi_chip should deliver to is describe
the binding in DTS file. If a real degenerate case found, we can update the platform
interface which is responsible for getting the match msi_chip in future.
>
>
> Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
> .
>
--
Thanks!
Yijing
WARNING: multiple messages have this Message-ID (diff)
From: wangyijing@huawei.com (Yijing Wang)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
Date: Mon, 4 Aug 2014 14:43:07 +0800 [thread overview]
Message-ID: <53DF2B7B.4000605@huawei.com> (raw)
In-Reply-To: <201408011552.32135.arnd@arndb.de>
On 2014/8/1 21:52, Arnd Bergmann wrote:
> On Wednesday 30 July 2014, Yijing Wang wrote:
>> On 2014/7/29 22:08, Arnd Bergmann wrote:
>>> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>>>
>>>> The new data struct for generic MSI driver.
>>>> struct msi_irqs {
>>>> u8 msi_enabled:1; /* Enable flag */
>>>> u8 msix_enabled:1;
>>>> struct list_head msi_list; /* MSI desc list */
>>>> void *data; /* help to find the MSI device */
>>>> struct msi_ops *ops; /* MSI device specific hook */
>>>> };
>>>> struct msi_irqs is used to manage MSI related informations. Every device supports
>>>> MSI should contain this data struct and allocate it.
>>>
>>> I think you should have a stronger association with the 'struct
>>> device' here. Can you replace the 'void *data' with 'struct device *dev'?
>>
>> Actually, I used the struct device *dev in my first draft, finally, I replaced
>> it with void *data, because some MSI devices don't have a struct device *dev,
>> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
>>
>> Of course, we can make the MSI devices have their own struct device, and register to
>> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.
>
> It doesn't have to be in the (OF) device tree, but I think it absolutely makes
> sense to use the 'struct device' infrastructure here, as almost everything uses
> a device, and the ones that don't do that today can be easily changed.
I will try to use "struct device" infrastructure, thanks for your suggestion. :)
>
>>> The other part I'm not completely sure about is how you want to
>>> have MSIs map into normal IRQ descriptors. At the moment, all
>>> MSI users are based on IRQ numbers, but this has known scalability problems.
>>
>> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
>> I'm sorry, I don't understand you meaning.
>> What are the scalability problems you mentioned ?
>> For device drivers, they always process interrupt in two steps.
>> If irq is the legacy interrupt, drivers will first
>> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
>> Then drivers will call the request_irq() to register the interrupt handler.
>> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
>> request_irq() to register interrupt handler.
>
> The method you describe here makes sense for PCI devices that are required to support
> legacy interrupts and may or may not support MSI on a given system, but not so much
> for platform devices for which we know exactly whether we want to use MSI
> or legacy interrupts.
>
> In particular if you have a device that can only do MSI, the entire pci_enable_msi
> step is pointless: all we need to do is program the correct MSI target address/message
> pair into the device and register the handler.
Yes, I almost agree if we won't change the existing hundreds of drivers, what
I worried about is some drivers may want to know the IRQ numbers, and use the IRQ
number to process different things, as I mentioned in another reply.
But we can also provide the interface which integrate MSI configuration and request_irq(),
if most drivers don't care the IRQ number.
>
>>> I wonder if we can do the interface in a way that
>>> hides the interrupt number from generic device drivers and just
>>> passes a 'struct irq_desc'. Note that there are long-term plans to
>>> get rid of IRQ numbers entirely, but those plans have existed for
>>> a long time already without anybody seriously addressing the device
>>> driver interfaces so far, so it might never really happen.
>>>
>>
>> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
>> this in a separate title.
>
> Sorry for being unclear here: I did suggest changing all drivers now. What I meant
> is that we use a different API for non-PCI devices that works without IRQ numbers.
> I don't think we should touch the PCI interfaces at this point.
OK, I got it.
>>> What I'd envision as the API from the device driver perspective is something
>>> as simple like this:
>>>
>>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
>>> unsigned long flags, const char *name, struct device *dev);
>>>
>>> which would get an msi descriptor that is valid for this device (dev)
>>> connected to a particular msi_chip, and associate a handler function
>>> with it. The device driver can call that function and retrieve the
>>> address/message pair from the msi_desc in order to store it in its own
>>> device specific registers. The request_irq() can be handled internally
>>> to msi_request().
>>
>> This is a huge change for device drivers, and some device drivers don't know which msi_chip
>> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
>> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
>> binding for the MSI device to the target msi_chip.
>
> Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
> but only platform drivers that do native MSI, so we only have to change drivers that
> do not support any MSI at all yet and that need to be changed anyway in order to add
> support.
I mean platform device drivers, because we can find the target msi_chip by some platform
interfaces(like the existing of_pci_find_msi_chip_by_node()). So we no need to explicitly provide
the msi_chip as the function argument.
>
>> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
>> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
>> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
>> by device_node.
>> I don't know whether there are other situations, we should provide a generic interface that
>> every MSI device under every platform can use it to find its msi_chip exactly.
>
> We have introduced the "msi-parent" property to mirror the "interrupt-parent" property.
> For the PCI case, this property is only needed in the PCI host controller, and there
> can be a system-wide default, by putting the "msi-parent" property into the root device
> node or the node of a bus that is parent to all devices supporting MSI.
>
> For non-PCI devices, it should be possible to override the "msi-parent" property per
> device, but those can also use the global property.
>
> The main use case that I see are PCI host controllers that have their own MSI catcher
> included, so meaning that any PCI device can either send its MSIs there, or to a
> system-wide GICv3 instance, and we need a way to select which one.
Yes, agree.
>
> A degenerate case of this would be a system where a PCI device sends its MSI into
> the host controller, that generates a legacy interrupt and that in turn gets
> sent to an irqchip which turns it back into an MSI for the GICv3. This would of
> course be very inefficient, but I think we should be able to express this with
> both the binding and the in-kernel framework just to be on the safe side.
Yes, the best way to tell the kernel which msi_chip should deliver to is describe
the binding in DTS file. If a real degenerate case found, we can update the platform
interface which is responsible for getting the match msi_chip in future.
>
>
> Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
> .
>
--
Thanks!
Yijing
next prev parent reply other threads:[~2014-08-04 6:43 UTC|newest]
Thread overview: 146+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-26 3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 01/11] PCI/MSI: Use pci_dev->msi_cap instead of msi_desc->msi_attrib.pos Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 02/11] PCI/MSI: Use new MSI type macro instead of PCI MSI flags Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled() Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-08-05 22:35 ` Stuart Yoder
2014-08-05 22:35 ` Stuart Yoder
2014-08-05 22:35 ` Stuart Yoder
2014-08-06 1:23 ` Yijing Wang
2014-08-06 1:23 ` Yijing Wang
2014-08-06 1:23 ` Yijing Wang
2014-08-06 1:23 ` Yijing Wang
2014-08-20 5:57 ` Bharat.Bhushan
2014-08-20 5:57 ` Bharat.Bhushan
2014-08-20 5:57 ` Bharat.Bhushan at freescale.com
2014-08-20 5:57 ` Bharat.Bhushan
2014-08-20 5:57 ` Bharat.Bhushan
2014-08-20 6:30 ` Yijing Wang
2014-08-20 6:30 ` Yijing Wang
2014-08-20 6:30 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 04/11] PCI/MSI: Move MSIX table address mapping out of msix_capability_init Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 05/11] PCI/MSI: Move populate_msi_sysfs() out of msi_capability_init() Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 06/11] PCI/MSI: Save MSI irq in PCI MSI layer Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 07/11] PCI/MSI: Mask MSI-X entry in msix_setup_entries() Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 08/11] PCI/MSI: Introduce new struct msi_irqs and struct msi_ops Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-08-20 6:06 ` Bharat.Bhushan
2014-08-20 6:06 ` Bharat.Bhushan at freescale.com
2014-08-20 6:06 ` Bharat.Bhushan
2014-08-20 6:34 ` Yijing Wang
2014-08-20 6:34 ` Yijing Wang
2014-08-20 6:34 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-08-20 6:18 ` Bharat.Bhushan
2014-08-20 6:18 ` Bharat.Bhushan at freescale.com
2014-08-20 6:18 ` Bharat.Bhushan
2014-08-20 6:43 ` Yijing Wang
2014-08-20 6:43 ` Yijing Wang
2014-08-20 6:43 ` Yijing Wang
2014-07-26 3:08 ` [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-07-26 3:08 ` Yijing Wang
2014-08-20 6:20 ` Bharat.Bhushan
2014-08-20 6:20 ` Bharat.Bhushan at freescale.com
2014-08-20 6:20 ` Bharat.Bhushan
2014-08-20 7:01 ` Yijing Wang
2014-08-20 7:01 ` Yijing Wang
2014-08-20 7:01 ` Yijing Wang
2014-07-29 14:08 ` [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Arnd Bergmann
2014-07-29 14:08 ` Arnd Bergmann
2014-07-29 14:08 ` Arnd Bergmann
2014-07-30 2:45 ` Yijing Wang
2014-07-30 2:45 ` Yijing Wang
2014-07-30 2:45 ` Yijing Wang
2014-07-30 2:45 ` Yijing Wang
2014-07-30 6:47 ` Jiang Liu
2014-07-30 6:47 ` Jiang Liu
2014-07-30 7:20 ` Yijing Wang
2014-07-30 7:20 ` Yijing Wang
2014-07-30 7:20 ` Yijing Wang
2014-08-01 13:16 ` Arnd Bergmann
2014-08-01 13:16 ` Arnd Bergmann
2014-08-01 13:16 ` Arnd Bergmann
2014-08-04 3:32 ` Yijing Wang
2014-08-04 3:32 ` Yijing Wang
2014-08-04 3:32 ` Yijing Wang
2014-08-04 14:45 ` Arnd Bergmann
2014-08-04 14:45 ` Arnd Bergmann
2014-08-04 14:45 ` Arnd Bergmann
2014-08-05 2:20 ` Yijing Wang
2014-08-05 2:20 ` Yijing Wang
2014-08-05 2:20 ` Yijing Wang
2014-08-05 2:20 ` Yijing Wang
2014-08-04 3:32 ` Yijing Wang
2014-07-30 7:20 ` Yijing Wang
2014-07-30 6:47 ` Jiang Liu
2014-08-01 13:52 ` Arnd Bergmann
2014-08-01 13:52 ` Arnd Bergmann
2014-08-01 13:52 ` Arnd Bergmann
2014-08-04 6:43 ` Yijing Wang [this message]
2014-08-04 6:43 ` Yijing Wang
2014-08-04 6:43 ` Yijing Wang
2014-08-04 6:43 ` Yijing Wang
2014-08-04 14:59 ` Arnd Bergmann
2014-08-04 14:59 ` Arnd Bergmann
2014-08-04 14:59 ` Arnd Bergmann
2014-08-05 2:12 ` Yijing Wang
2014-08-05 2:12 ` Yijing Wang
2014-08-05 2:12 ` Yijing Wang
2014-08-05 2:12 ` Yijing Wang
2014-08-01 10:27 ` arnab.basu
2014-08-01 10:27 ` arnab.basu at freescale.com
2014-08-04 3:03 ` Yijing Wang
2014-08-04 3:03 ` Yijing Wang
2014-08-04 3:03 ` Yijing Wang
2014-08-20 5:44 ` Bharat.Bhushan
2014-08-20 5:44 ` Bharat.Bhushan at freescale.com
2014-08-20 5:44 ` Bharat.Bhushan
2014-08-20 6:28 ` Yijing Wang
2014-08-20 6:28 ` Yijing Wang
2014-08-20 7:41 ` Bharat.Bhushan
2014-08-20 7:41 ` Bharat.Bhushan
2014-08-20 7:41 ` Bharat.Bhushan at freescale.com
2014-08-20 7:55 ` Yijing Wang
2014-08-20 7:55 ` Yijing Wang
2014-08-20 7:55 ` Yijing Wang
2014-09-03 7:15 ` Yijing Wang
2014-09-03 7:15 ` Yijing Wang
2014-09-03 7:15 ` Yijing Wang
2014-08-20 6:28 ` Yijing Wang
2014-08-01 10:27 ` arnab.basu
-- strict thread matches above, loose matches on Subject: below --
2014-07-26 3:08 Yijing Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53DF2B7B.4000605@huawei.com \
--to=wangyijing@huawei.com \
--cc=Paul.Mundt@huawei.com \
--cc=arnd@arndb.de \
--cc=bhelgaas@google.com \
--cc=guohanjun@huawei.com \
--cc=huxinwei@huawei.com \
--cc=jejb@parisc-linux.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux@arm.linux.org.uk \
--cc=virtualization@lists.linux-foundation.org \
--cc=wuyun.wu@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.