From: ebiederm@xmission.com (Eric W. Biederman)
To: Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Kok, Auke" <auke-jan.h.kok@intel.com>,
Michal Piotrowski <michal.k.k.piotrowski@gmail.com>,
Jeff Garzik <jeff@garzik.org>,
Greg Kroah-Hartman <gregkh@suse.de>,
linux-pm@lists.osdl.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Adrian Bunk <bunk@stusta.de>,
michael@ellerman.id.au, linux-pci@atrey.karlin.mff.cuni.cz,
Pavel Machek <pavel@ucw.cz>, Jens Axboe <jens.axboe@oracle.com>,
"Michael S. Tsirkin" <mst@mellanox.co.il>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>
Subject: [PATCH 1/2] msi: Safer state caching.
Date: Thu, 08 Mar 2007 13:04:57 -0700 [thread overview]
Message-ID: <m1ps7j5wnq.fsf_-_@ebiederm.dsl.xmission.com> (raw)
In-Reply-To: <m1tzwv5wz9.fsf_-_@ebiederm.dsl.xmission.com> (Eric W. Biederman's message of "Thu, 08 Mar 2007 12:58:02 -0700")
There are two ways pci_save_state and pci_restore_state are used. As
helper functions during suspend/resume, and as helper functions around
a hardware reset event. When used as helper functions around a hardware
reset event there is no reason to believe the calls will be paired, nor
is there a good reason to believe that if we restore the msi state from
before the reset that it will match the current msi state. Since arch
code may change the msi message without going through the driver, drivers
currently do not have enough information to even know when to call
pci_save_state to ensure they will have msi state in sync with the other
kernel irq reception data structures.
It turns out the solution is straight forward, cache the state in the
existing msi data structures (not the magic pci saved things) and
have the msi code update the cached state each time we write to the hardware.
This means we never need to read the hardware to figure out what the hardware
state should be.
By modifying the caching in this manner we get to remove our save_state
routines and only need to provide restore_state routines.
The only fields that were at all tricky to regenerate were the msi and msi-x
control registers and the way we regenerate them currently is a bit dependent
upon assumptions on how we use the allow msi registers to be configured and used
making the code a little bit brittle. If we ever change what cases we allow
or how we configure the msi bits we can address the fragility then.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
drivers/pci/msi.c | 150 ++++++++--------------------------------------
drivers/pci/pci.c | 2 -
drivers/pci/pci.h | 2 -
include/linux/msi.h | 8 +--
include/linux/pci_regs.h | 1 +
5 files changed, 29 insertions(+), 134 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 01869b1..ad33e01 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -100,6 +100,7 @@ static void msi_set_mask_bit(unsigned int irq, int flag)
BUG();
break;
}
+ entry->msi_attrib.masked = !!flag;
}
void read_msi_msg(unsigned int irq, struct msi_msg *msg)
@@ -179,6 +180,7 @@ void write_msi_msg(unsigned int irq, struct msi_msg *msg)
default:
BUG();
}
+ entry->msg = *msg;
}
void mask_msi_irq(unsigned int irq)
@@ -225,164 +227,60 @@ static struct msi_desc* alloc_msi_entry(void)
}
#ifdef CONFIG_PM
-static int __pci_save_msi_state(struct pci_dev *dev)
-{
- int pos, i = 0;
- u16 control;
- struct pci_cap_saved_state *save_state;
- u32 *cap;
-
- if (!dev->msi_enabled)
- return 0;
-
- pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
- if (pos <= 0)
- return 0;
-
- save_state = kzalloc(sizeof(struct pci_cap_saved_state) + sizeof(u32) * 5,
- GFP_KERNEL);
- if (!save_state) {
- printk(KERN_ERR "Out of memory in pci_save_msi_state\n");
- return -ENOMEM;
- }
- cap = &save_state->data[0];
-
- pci_read_config_dword(dev, pos, &cap[i++]);
- control = cap[0] >> 16;
- pci_read_config_dword(dev, pos + PCI_MSI_ADDRESS_LO, &cap[i++]);
- if (control & PCI_MSI_FLAGS_64BIT) {
- pci_read_config_dword(dev, pos + PCI_MSI_ADDRESS_HI, &cap[i++]);
- pci_read_config_dword(dev, pos + PCI_MSI_DATA_64, &cap[i++]);
- } else
- pci_read_config_dword(dev, pos + PCI_MSI_DATA_32, &cap[i++]);
- if (control & PCI_MSI_FLAGS_MASKBIT)
- pci_read_config_dword(dev, pos + PCI_MSI_MASK_BIT, &cap[i++]);
- save_state->cap_nr = PCI_CAP_ID_MSI;
- pci_add_saved_cap(dev, save_state);
- return 0;
-}
-
static void __pci_restore_msi_state(struct pci_dev *dev)
{
- int i = 0, pos;
+ int pos;
u16 control;
- struct pci_cap_saved_state *save_state;
- u32 *cap;
+ struct msi_desc *entry;
if (!dev->msi_enabled)
return;
- save_state = pci_find_saved_cap(dev, PCI_CAP_ID_MSI);
- pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
- if (!save_state || pos <= 0)
- return;
- cap = &save_state->data[0];
+ entry = get_irq_msi(dev->irq);
+ pos = entry->msi_attrib.pos;
pci_intx(dev, 0); /* disable intx */
- control = cap[i++] >> 16;
msi_set_enable(dev, 0);
- pci_write_config_dword(dev, pos + PCI_MSI_ADDRESS_LO, cap[i++]);
- if (control & PCI_MSI_FLAGS_64BIT) {
- pci_write_config_dword(dev, pos + PCI_MSI_ADDRESS_HI, cap[i++]);
- pci_write_config_dword(dev, pos + PCI_MSI_DATA_64, cap[i++]);
- } else
- pci_write_config_dword(dev, pos + PCI_MSI_DATA_32, cap[i++]);
- if (control & PCI_MSI_FLAGS_MASKBIT)
- pci_write_config_dword(dev, pos + PCI_MSI_MASK_BIT, cap[i++]);
+ write_msi_msg(dev->irq, &entry->msg);
+ if (entry->msi_attrib.maskbit)
+ msi_set_mask_bit(dev->irq, entry->msi_attrib.masked);
+
+ pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &control);
+ control &= ~(PCI_MSI_FLAGS_QSIZE | PCI_MSI_FLAGS_ENABLE);
+ if (entry->msi_attrib.maskbit || !entry->msi_attrib.masked)
+ control |= PCI_MSI_FLAGS_ENABLE;
pci_write_config_word(dev, pos + PCI_MSI_FLAGS, control);
- pci_remove_saved_cap(save_state);
- kfree(save_state);
-}
-
-static int __pci_save_msix_state(struct pci_dev *dev)
-{
- int pos;
- int irq, head, tail = 0;
- u16 control;
- struct pci_cap_saved_state *save_state;
-
- if (!dev->msix_enabled)
- return 0;
-
- pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
- if (pos <= 0)
- return 0;
-
- /* save the capability */
- pci_read_config_word(dev, msi_control_reg(pos), &control);
- save_state = kzalloc(sizeof(struct pci_cap_saved_state) + sizeof(u16),
- GFP_KERNEL);
- if (!save_state) {
- printk(KERN_ERR "Out of memory in pci_save_msix_state\n");
- return -ENOMEM;
- }
- *((u16 *)&save_state->data[0]) = control;
-
- /* save the table */
- irq = head = dev->first_msi_irq;
- while (head != tail) {
- struct msi_desc *entry;
-
- entry = get_irq_msi(irq);
- read_msi_msg(irq, &entry->msg_save);
-
- tail = entry->link.tail;
- irq = tail;
- }
-
- save_state->cap_nr = PCI_CAP_ID_MSIX;
- pci_add_saved_cap(dev, save_state);
- return 0;
-}
-
-int pci_save_msi_state(struct pci_dev *dev)
-{
- int rc;
-
- rc = __pci_save_msi_state(dev);
- if (rc)
- return rc;
-
- rc = __pci_save_msix_state(dev);
-
- return rc;
}
static void __pci_restore_msix_state(struct pci_dev *dev)
{
- u16 save;
int pos;
int irq, head, tail = 0;
struct msi_desc *entry;
- struct pci_cap_saved_state *save_state;
+ u16 control;
if (!dev->msix_enabled)
return;
- save_state = pci_find_saved_cap(dev, PCI_CAP_ID_MSIX);
- if (!save_state)
- return;
- save = *((u16 *)&save_state->data[0]);
- pci_remove_saved_cap(save_state);
- kfree(save_state);
-
- pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
- if (pos <= 0)
- return;
-
/* route the table */
pci_intx(dev, 0); /* disable intx */
msix_set_enable(dev, 0);
irq = head = dev->first_msi_irq;
+ entry = get_irq_msi(irq);
+ pos = entry->msi_attrib.pos;
while (head != tail) {
entry = get_irq_msi(irq);
- write_msi_msg(irq, &entry->msg_save);
+ write_msi_msg(irq, &entry->msg);
+ msi_set_mask_bit(irq, entry->msi_attrib.masked);
tail = entry->link.tail;
irq = tail;
}
- pci_write_config_word(dev, msi_control_reg(pos), save);
+ pci_read_config_word(dev, pos + PCI_MSIX_FLAGS, &control);
+ control &= ~PCI_MSIX_FLAGS_MASKALL;
+ control |= PCI_MSIX_FLAGS_ENABLE;
+ pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
}
void pci_restore_msi_state(struct pci_dev *dev)
@@ -420,6 +318,7 @@ static int msi_capability_init(struct pci_dev *dev)
entry->msi_attrib.is_64 = is_64bit_address(control);
entry->msi_attrib.entry_nr = 0;
entry->msi_attrib.maskbit = is_mask_bit_support(control);
+ entry->msi_attrib.masked = 1;
entry->msi_attrib.default_irq = dev->irq; /* Save IOAPIC IRQ */
entry->msi_attrib.pos = pos;
if (is_mask_bit_support(control)) {
@@ -507,6 +406,7 @@ static int msix_capability_init(struct pci_dev *dev,
entry->msi_attrib.is_64 = 1;
entry->msi_attrib.entry_nr = j;
entry->msi_attrib.maskbit = 1;
+ entry->msi_attrib.masked = 1;
entry->msi_attrib.default_irq = dev->irq;
entry->msi_attrib.pos = pos;
entry->dev = dev;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index df49530..6fb78df 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -638,8 +638,6 @@ pci_save_state(struct pci_dev *dev)
/* XXX: 100% dword access ok here? */
for (i = 0; i < 16; i++)
pci_read_config_dword(dev, i * 4,&dev->saved_config_space[i]);
- if ((i = pci_save_msi_state(dev)) != 0)
- return i;
if ((i = pci_save_pcie_state(dev)) != 0)
return i;
if ((i = pci_save_pcix_state(dev)) != 0)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index ae7a975..62ea04c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -52,10 +52,8 @@ static inline void pci_no_msi(void) { }
#endif
#if defined(CONFIG_PCI_MSI) && defined(CONFIG_PM)
-int pci_save_msi_state(struct pci_dev *dev);
void pci_restore_msi_state(struct pci_dev *dev);
#else
-static inline int pci_save_msi_state(struct pci_dev *dev) { return 0; }
static inline void pci_restore_msi_state(struct pci_dev *dev) {}
#endif
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 74c8a2e..e38fe68 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -17,7 +17,7 @@ struct msi_desc {
struct {
__u8 type : 5; /* {0: unused, 5h:MSI, 11h:MSI-X} */
__u8 maskbit : 1; /* mask-pending bit supported ? */
- __u8 unused : 1;
+ __u8 masked : 1;
__u8 is_64 : 1; /* Address size: 0=32bit 1=64bit */
__u8 pos; /* Location of the msi capability */
__u16 entry_nr; /* specific enabled entry */
@@ -32,10 +32,8 @@ struct msi_desc {
void __iomem *mask_base;
struct pci_dev *dev;
-#ifdef CONFIG_PM
- /* PM save area for MSIX address/data */
- struct msi_msg msg_save;
-#endif
+ /* Last set MSI message */
+ struct msi_msg msg;
};
/*
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index f09cce2..495d368 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -296,6 +296,7 @@
#define PCI_MSIX_FLAGS 2
#define PCI_MSIX_FLAGS_QSIZE 0x7FF
#define PCI_MSIX_FLAGS_ENABLE (1 << 15)
+#define PCI_MSIX_FLAGS_MASKALL (1 << 14)
#define PCI_MSIX_FLAGS_BIRMASK (7 << 0)
#define PCI_MSIX_FLAGS_BITMASK (1 << 0)
--
1.5.0.g53756
next prev parent reply other threads:[~2007-03-08 20:04 UTC|newest]
Thread overview: 104+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <Pine.LNX.4.64.0702202043280.4043@woody.linux-foundation.org>
2007-02-25 17:52 ` 2.6.21-rc1: known regressions (part 1) Adrian Bunk
2007-02-28 18:16 ` Karasyov, Konstantin A
2007-02-25 17:55 ` 2.6.21-rc1: known regressions (part 2) Adrian Bunk
2007-02-27 10:02 ` Jens Axboe
2007-02-27 10:21 ` Pavel Machek
2007-02-27 10:30 ` Jens Axboe
2007-02-27 10:34 ` Ingo Molnar
2007-02-27 10:59 ` Jens Axboe
2007-02-27 11:15 ` Jens Axboe
2007-02-27 13:09 ` Jens Axboe
2007-03-01 9:34 ` Ingo Molnar
2007-03-01 10:41 ` Ingo Molnar
2007-03-01 14:52 ` Ingo Molnar
2007-03-01 16:12 ` Rafael J. Wysocki
2007-03-02 0:26 ` Linus Torvalds
2007-03-02 0:41 ` Linus Torvalds
2007-03-02 7:14 ` Ingo Molnar
2007-03-02 7:21 ` Ingo Molnar
2007-03-02 8:04 ` Ingo Molnar
2007-03-02 10:20 ` Ingo Molnar
2007-03-02 10:22 ` [patch] KVM: T60 resume fix Ingo Molnar
2007-03-02 11:39 ` Michael S. Tsirkin
2007-03-03 8:22 ` Avi Kivity
2007-03-03 8:21 ` Avi Kivity
2007-03-03 11:57 ` Andrew Morton
2007-03-03 12:07 ` Junio C Hamano
2007-03-05 8:22 ` Ingo Molnar
2007-03-05 8:50 ` Avi Kivity
2007-03-05 8:44 ` Ingo Molnar
2007-03-05 8:57 ` Ingo Molnar
2007-03-05 9:27 ` Avi Kivity
2007-03-05 10:05 ` Ingo Molnar
2007-03-05 10:33 ` Avi Kivity
2007-03-05 10:33 ` Ingo Molnar
2007-03-05 10:40 ` Michael S. Tsirkin
2007-03-05 12:54 ` Michael S. Tsirkin
2007-03-05 12:50 ` Ingo Molnar
2007-03-05 13:26 ` Michael S. Tsirkin
2007-03-05 13:32 ` Ingo Molnar
2007-03-05 10:23 ` Michael S. Tsirkin
2007-03-05 10:29 ` Ingo Molnar
2007-03-05 15:44 ` 2.6.21-rc1: known regressions (part 2) Michael S. Tsirkin
2007-03-05 16:14 ` Michael S. Tsirkin
2007-03-05 16:41 ` Ingo Molnar
2007-03-05 18:16 ` Jens Axboe
2007-03-01 23:36 ` Linus Torvalds
2007-03-02 10:07 ` Pavel Machek
2007-03-05 8:42 ` Michael S. Tsirkin
2007-03-05 10:11 ` SATA resume slowness, e1000 MSI warning Ingo Molnar
2007-03-06 5:30 ` Jeff Garzik
2007-03-06 6:35 ` Kok, Auke
2007-03-06 9:04 ` Ingo Molnar
2007-03-06 15:34 ` Kok, Auke
2007-03-07 4:15 ` Eric W. Biederman
2007-03-07 16:31 ` Kok, Auke
2007-03-07 16:45 ` Kok, Auke
2007-03-07 19:28 ` Eric W. Biederman
2007-03-08 2:53 ` Andrew Morton
2007-03-08 6:58 ` Eric W. Biederman
2007-03-08 9:55 ` Jeff Garzik
2007-03-08 17:27 ` Eric W. Biederman
2007-03-08 19:58 ` [PATCH 0/2] Repair pci_restore_state when used with device resets Eric W. Biederman
2007-03-08 20:04 ` Eric W. Biederman [this message]
2007-03-08 20:06 ` [PATCH 2/2] pci: Repair pci_save/restore_state so we can restore one save many times Eric W. Biederman
2007-03-12 22:46 ` Kok, Auke
2007-03-08 20:08 ` [PATCH 0/2] Repair pci_restore_state when used with device resets Ingo Molnar
2007-03-08 20:26 ` Eric W. Biederman
2007-03-08 10:23 ` SATA resume slowness, e1000 MSI warning Michael S. Tsirkin
2007-03-11 11:11 ` Eric W. Biederman
2007-03-11 11:24 ` Michael S. Tsirkin
2007-03-11 17:37 ` Eric W. Biederman
2007-03-11 18:03 ` Michael S. Tsirkin
2007-03-11 18:27 ` Eric W. Biederman
2007-03-11 18:37 ` Michael S. Tsirkin
2007-03-11 19:50 ` Eric W. Biederman
2007-03-12 4:35 ` Michael S. Tsirkin
2007-04-16 19:56 ` Michael S. Tsirkin
2007-03-09 23:06 ` Kok, Auke
2007-03-10 3:41 ` Eric W. Biederman
2007-03-06 9:06 ` Ingo Molnar
2007-03-06 16:26 ` Thomas Gleixner
2007-03-06 16:52 ` Linus Torvalds
2007-03-06 17:09 ` Kok, Auke
2007-03-09 6:44 ` 2.6.21-rc1: known regressions (part 2) Pavel Machek
2007-03-05 15:34 ` Michael S. Tsirkin
2007-02-27 22:09 ` Adrian Bunk
2007-02-28 7:41 ` Jens Axboe
2007-02-26 22:01 ` 2.6.21-rc1: known regressions (v2) (part 1) Adrian Bunk
2007-02-27 4:09 ` Sergio Monteiro Basto
2007-02-27 12:50 ` S.Çağlar Onur
2007-02-27 13:25 ` Ismail Dönmez
2007-02-28 21:13 ` Michael S. Tsirkin
2007-02-28 21:27 ` Thomas Gleixner
2007-02-28 21:40 ` Michael S. Tsirkin
2007-03-01 3:45 ` Jeff Chua
2007-03-02 12:26 ` [linux-pm] " Pavel Machek
2007-03-03 11:17 ` Jens Axboe
2007-03-05 0:04 ` Adrian Bunk
2007-03-06 1:32 ` Jeff Chua
2007-03-06 12:03 ` Jeff Chua
2007-03-06 12:08 ` Michael S. Tsirkin
2007-03-06 12:12 ` Jeff Chua
2007-03-19 15:32 ` Pavel Machek
2007-03-19 21:23 ` Rafael J. Wysocki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m1ps7j5wnq.fsf_-_@ebiederm.dsl.xmission.com \
--to=ebiederm@xmission.com \
--cc=akpm@linux-foundation.org \
--cc=auke-jan.h.kok@intel.com \
--cc=bunk@stusta.de \
--cc=gregkh@suse.de \
--cc=jeff@garzik.org \
--cc=jens.axboe@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@atrey.karlin.mff.cuni.cz \
--cc=linux-pm@lists.osdl.org \
--cc=michael@ellerman.id.au \
--cc=michal.k.k.piotrowski@gmail.com \
--cc=mingo@elte.hu \
--cc=mst@mellanox.co.il \
--cc=pavel@ucw.cz \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox