* Re: [RFC PATCH] scsi: libsas: fix WARN on device removal
From: John Garry @ 2016-11-09 12:28 UTC (permalink / raw)
To: martin.petersen, jejb
Cc: linux-scsi, linuxarm, linux-kernel, dan.j.williams, john.garry2,
jinpu.wang, lindar_liu, tk
In-Reply-To: <1478185120-5509-1-git-send-email-john.garry@huawei.com>
On 03/11/2016 14:58, John Garry wrote:
> The following patch introduces an annoying WARN
> when a device is removed from the SAS topology:
> [SCSI] libsas: prevent domain rediscovery competing with ata error handling
>
Are there any views on this patch? I would have thought that the parties
who use the drivers based on libsas would be interested in fixing this bug.
BTW, we are testing this internally, hence the RFC.
Thanks in advance,
John
> A sample WARN is as follows:
> [ 236.842227] WARNING: CPU: 7 PID: 1520 at fs/sysfs/group.c:237 sysfs_remove_group+0x90/0x98
> [ 236.850465] Modules linked in:
> [ 236.853544]
> [ 236.855045] CPU: 7 PID: 1520 Comm: kworker/u64:4 Tainted: G W 4.9.0-rc1-15310-g3fbc29e-dirty #676
> [ 236.865010] Hardware name: Huawei Taishan 2180 /D03, BIOS Estuary v2.3 D03 UEFI 08/17/2016
> [ 236.873249] Workqueue: scsi_wq_0 sas_destruct_devices
> [ 236.878317] task: ffff8027ba31b200 task.stack: ffff8027b9d44000
> [ 236.884225] PC is at sysfs_remove_group+0x90/0x98
> [ 236.888920] LR is at sysfs_remove_group+0x90/0x98
> [ 236.893616] pc : [<ffff000008256df8>] lr : [<ffff000008256df8>] pstate: 60000145
> [ 236.900989] sp : ffff8027b9d47bf0
>
> < snip >
>
> [ 237.116463] [<ffff000008256df8>] sysfs_remove_group+0x90/0x98
> [ 237.122197] [<ffff00000851fe68>] dpm_sysfs_remove+0x58/0x68
> [ 237.127758] [<ffff000008513678>] device_del+0x40/0x218
> [ 237.132886] [<ffff000008513864>] device_unregister+0x14/0x2c
> [ 237.138536] [<ffff0000083670c4>] bsg_unregister_queue+0x5c/0xa0
> [ 237.144442] [<ffff00000855b984>] sas_rphy_remove+0x44/0x80
> [ 237.149915] [<ffff00000855b9d4>] sas_rphy_delete+0x14/0x28
> [ 237.155388] [<ffff00000855f9d8>] sas_destruct_devices+0x64/0x98
> [ 237.161293] [<ffff0000080d2c1c>] process_one_work+0x128/0x2e4
> [ 237.167027] [<ffff0000080d2e30>] worker_thread+0x58/0x434
> [ 237.172415] [<ffff0000080d8c24>] kthread+0xd4/0xe8
> [ 237.177198] [<ffff000008082e80>] ret_from_fork+0x10/0x50
> [ 237.182557] sysfs group 'power' not found for kobject 'end_device-0:0:5'
>
> (this can be really huge when an expander is unplugged)
>
> The problem is with the process of sas_port and domain_device
> destruction in domain revalidation. There is a 2-stage process:
> In domain revalidation (which runs in work queue context), if a
> domain_device is discovered to be gone, then the following happens:
> - the domain_device is queued for destruction in a separate work item
> - the associated sas_port is destroyed immediately
>
> This causes a problem in that the sas_port associated with
> a domain_device is destroyed prior to the domain_device: this causes
> the sysfs WARN. Essentially the "rug has been pulled from underneath".
>
> Also, likewise, when a root port is deformed due to loss of signal,
> we have the same issue.
>
> To solve this, destroy the sas_port in a work item separate from the one
> in which we do the domain revalidation, using a new discovery event, as follows:
> - When a domain_device is detected to be gone, the domain_device is
> queued for destruction in a separate work item. The associated
> sas_port is also queued for destruction in another separate work item
> (needs to be queued 2nd)
> - the domain_device is destroyed
> - the sas_port is destroyed
> [The same is done for the loss of signal event, in sas_deform_port()].
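To illustrate the ordering described above, here is a minimal userspace
sketch. It is not libsas code: the queue, the event names and the helpers
are all made up for illustration, and it assumes that work items run
strictly in the order in which they were queued. The pending bitmask only
mimics the idea of disc.pending, so that an event already queued is not
queued a second time.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the two discovery events involved. */
enum event { EV_DEV_DESTRUCT, EV_PORT_DESTRUCT };

#define QUEUE_LEN 8

struct workqueue {
	enum event items[QUEUE_LEN];
	int head, tail;
	unsigned int pending;	/* one bit per event that is already queued */
};

/* Queue an event unless it is already pending; returns 1 if it was queued. */
static int queue_event(struct workqueue *wq, enum event ev)
{
	if (wq->pending & (1u << ev))
		return 0;
	assert(wq->tail < QUEUE_LEN);
	wq->pending |= 1u << ev;
	wq->items[wq->tail++] = ev;
	return 1;
}

/* Run all queued work in FIFO order, appending each step to log. */
static void run_queue(struct workqueue *wq, char *log, size_t len)
{
	while (wq->head < wq->tail) {
		enum event ev = wq->items[wq->head++];

		wq->pending &= ~(1u << ev);
		strncat(log, ev == EV_DEV_DESTRUCT ? "dev;" : "port;",
			len - strlen(log) - 1);
	}
}

/* Returns the processing order when a device is found to be gone. */
static const char *demo_order(void)
{
	static char log[64];
	struct workqueue wq = { .head = 0 };

	log[0] = '\0';
	queue_event(&wq, EV_DEV_DESTRUCT);	/* device destruction first */
	queue_event(&wq, EV_PORT_DESTRUCT);	/* port destruction second */
	queue_event(&wq, EV_DEV_DESTRUCT);	/* already pending, ignored */
	run_queue(&wq, log, sizeof(log));
	return log;
}
```

Calling demo_order() gives the device step before the port step, which is
the property the fix relies on.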
>
> Fixes: 87c8331fcf72e501c3a3c0cdc5c [SCSI] libsas: prevent domain
> rediscovery competing with ata error handling
>
> Signed-off-by: John Garry <john.garry@huawei.com>
>
> diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> index 60de662..01d0fe2 100644
> --- a/drivers/scsi/libsas/sas_discover.c
> +++ b/drivers/scsi/libsas/sas_discover.c
> @@ -361,7 +361,7 @@ static void sas_destruct_devices(struct work_struct *work)
>
> clear_bit(DISCE_DESTRUCT, &port->disc.pending);
>
> - list_for_each_entry_safe(dev, n, &port->destroy_list, disco_list_node) {
> + list_for_each_entry_safe(dev, n, &port->dev_destroy_list, disco_list_node) {
> list_del_init(&dev->disco_list_node);
>
> sas_remove_children(&dev->rphy->dev);
> @@ -383,7 +383,7 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
>
> if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> sas_rphy_unlink(dev->rphy);
> - list_move_tail(&dev->disco_list_node, &port->destroy_list);
> + list_move_tail(&dev->disco_list_node, &port->dev_destroy_list);
> sas_discover_event(dev->port, DISCE_DESTRUCT);
> }
> }
> @@ -525,6 +525,28 @@ static void sas_revalidate_domain(struct work_struct *work)
> mutex_unlock(&ha->disco_mutex);
> }
>
> +/* ---------- Async Port destruct ---------- */
> +static void sas_async_port_destruct(struct work_struct *work)
> +{
> + struct sas_discovery_event *ev = to_sas_discovery_event(work);
> + struct asd_sas_port *port = ev->port;
> + struct sas_port *sas_port, *n;
> +
> + clear_bit(DISCE_PORT_DESTRUCT, &port->disc.pending);
> +
> + list_for_each_entry_safe(sas_port, n, &port->port_destroy_list, destroy_list) {
> + list_del_init(&sas_port->destroy_list);
> +
> + sas_port_delete(sas_port);
> + }
> +}
> +
> +void sas_port_destruct(struct asd_sas_port *port, struct sas_port *sas_port)
> +{
> + list_move_tail(&sas_port->destroy_list, &port->port_destroy_list);
> + sas_discover_event(port, DISCE_PORT_DESTRUCT);
> +}
> +
> /* ---------- Events ---------- */
>
> static void sas_chain_work(struct sas_ha_struct *ha, struct sas_work *sw)
> @@ -582,6 +604,7 @@ void sas_init_disc(struct sas_discovery *disc, struct asd_sas_port *port)
> [DISCE_SUSPEND] = sas_suspend_devices,
> [DISCE_RESUME] = sas_resume_devices,
> [DISCE_DESTRUCT] = sas_destruct_devices,
> + [DISCE_PORT_DESTRUCT] = sas_async_port_destruct,
> };
>
> disc->pending = 0;
> diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
> index 022bb6e..f9522a0 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -1900,10 +1900,11 @@ static void sas_unregister_devs_sas_addr(struct domain_device *parent,
> }
> memset(phy->attached_sas_addr, 0, SAS_ADDR_SIZE);
> if (phy->port) {
> + struct asd_sas_port *port = found->port;
> sas_port_delete_phy(phy->port, phy->phy);
> sas_device_set_phy(found, phy->port);
> if (phy->port->num_phys == 0)
> - sas_port_delete(phy->port);
> + sas_port_destruct(port, phy->port);
> phy->port = NULL;
> }
> }
> diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
> index d3c5297..1a32f86 100644
> --- a/drivers/scsi/libsas/sas_port.c
> +++ b/drivers/scsi/libsas/sas_port.c
> @@ -219,7 +219,7 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone)
>
> if (port->num_phys == 1) {
> sas_unregister_domain_devices(port, gone);
> - sas_port_delete(port->port);
> + sas_port_destruct(port, port->port);
> port->port = NULL;
> } else {
> sas_port_delete_phy(port->port, phy->phy);
> @@ -322,7 +322,8 @@ static void sas_init_port(struct asd_sas_port *port,
> port->id = i;
> INIT_LIST_HEAD(&port->dev_list);
> INIT_LIST_HEAD(&port->disco_list);
> - INIT_LIST_HEAD(&port->destroy_list);
> + INIT_LIST_HEAD(&port->dev_destroy_list);
> + INIT_LIST_HEAD(&port->port_destroy_list);
> spin_lock_init(&port->phy_list_lock);
> INIT_LIST_HEAD(&port->phy_list);
> port->ha = sas_ha;
> diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
> index 60b651b..062c03c 100644
> --- a/drivers/scsi/scsi_transport_sas.c
> +++ b/drivers/scsi/scsi_transport_sas.c
> @@ -934,6 +934,7 @@ struct sas_port *sas_port_alloc(struct device *parent, int port_id)
>
> mutex_init(&port->phy_list_mutex);
> INIT_LIST_HEAD(&port->phy_list);
> + INIT_LIST_HEAD(&port->destroy_list);
>
> if (scsi_is_sas_expander_device(parent)) {
> struct sas_rphy *rphy = dev_to_rphy(parent);
> diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
> index dae99d7..a7953c8 100644
> --- a/include/scsi/libsas.h
> +++ b/include/scsi/libsas.h
> @@ -91,7 +91,8 @@ enum discover_event {
> DISCE_SUSPEND = 4,
> DISCE_RESUME = 5,
> DISCE_DESTRUCT = 6,
> - DISC_NUM_EVENTS = 7,
> + DISCE_PORT_DESTRUCT = 7,
> + DISC_NUM_EVENTS,
> };
>
> /* ---------- Expander Devices ---------- */
> @@ -268,7 +269,8 @@ struct asd_sas_port {
> spinlock_t dev_list_lock;
> struct list_head dev_list;
> struct list_head disco_list;
> - struct list_head destroy_list;
> + struct list_head dev_destroy_list;
> + struct list_head port_destroy_list;
> enum sas_linkrate linkrate;
>
> struct sas_work work;
> @@ -702,6 +704,7 @@ extern int sas_bios_param(struct scsi_device *,
> int sas_ex_revalidate_domain(struct domain_device *);
>
> void sas_unregister_domain_devices(struct asd_sas_port *port, int gone);
> +void sas_port_destruct(struct asd_sas_port *port, struct sas_port *sas_port);
> void sas_init_disc(struct sas_discovery *disc, struct asd_sas_port *);
> int sas_discover_event(struct asd_sas_port *, enum discover_event ev);
>
> diff --git a/include/scsi/scsi_transport_sas.h b/include/scsi/scsi_transport_sas.h
> index 73d8709..b495aac 100644
> --- a/include/scsi/scsi_transport_sas.h
> +++ b/include/scsi/scsi_transport_sas.h
> @@ -154,6 +154,7 @@ struct sas_port {
>
> struct mutex phy_list_mutex;
> struct list_head phy_list;
> + struct list_head destroy_list; /* only used by libsas */
> };
>
> #define dev_to_sas_port(d) \
>
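The port destruct loop is an instance of the usual unlink-then-free walk
over a list. As a rough userspace sketch of that pattern (the list helpers
below are simplified re-implementations written for this example, not the
kernel's list.h, and build_list()/destroy_all() are hypothetical names):
the successor is saved before the current entry goes away, and it is the
entry's own node that gets unlinked before the entry is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node {
	struct node *next, *prev;
};

static void list_init(struct node *n)
{
	n->next = n->prev = n;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n from whatever list it is on and leave it self-linked. */
static void list_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Append count heap-allocated entries; returns how many were added. */
static int build_list(struct node *head, int count)
{
	int added = 0;

	while (added < count) {
		struct node *n = malloc(sizeof(*n));

		if (!n)
			break;
		list_add_tail(n, head);
		added++;
	}
	return added;
}

/* Unlink and free every entry; returns how many were freed. */
static int destroy_all(struct node *head)
{
	int freed = 0;
	struct node *pos = head->next;

	while (pos != head) {
		struct node *n = pos->next;	/* save successor before pos is freed */

		list_del_init(pos);		/* unlink the entry, not the head */
		free(pos);
		freed++;
		pos = n;
	}
	assert(head->next == head && head->prev == head);
	return freed;
}
```

After destroy_all() the head is empty again, so the list can be reused for
the next batch of entries.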
^ permalink raw reply
* [PATCH for-4.8] x86/svm: Don't clobber eax and edx if an RDMSR intercept fails
From: Andrew Cooper @ 2016-11-09 12:28 UTC (permalink / raw)
To: Xen-devel
Cc: Andrew Cooper, Boris Ostrovsky, Wei Liu, Suravee Suthikulpanit,
Jan Beulich
The original code has a bug; eax and edx get unconditionally updated even when
hvm_msr_read_intercept() doesn't return X86EMUL_OKAY.
It is only by blind luck (vmce_rdmsr() eagerly initialising its msr_content
pointer) that this isn't an information leak into guests.
While fixing this bug, reduce the scope of msr_content and initialise it to 0.
This makes it obvious that a stack leak won't occur, even if there were to be
a buggy codepath in hvm_msr_read_intercept().
Also make some non-functional improvements. Make the insn_len calculation
common, and reduce the quantity of explicit casting by making better use of
the existing register names.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
xen/arch/x86/hvm/svm/svm.c | 32 +++++++++++++++++---------------
1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 16427f6..6530e22 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1948,26 +1948,28 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
static void svm_do_msr_access(struct cpu_user_regs *regs)
{
- int rc, inst_len;
struct vcpu *v = current;
- struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
- uint64_t msr_content;
+ bool rdmsr = v->arch.hvm_svm.vmcb->exitinfo1 == 0;
+ int rc, inst_len = __get_instruction_length(
+ v, rdmsr ? INSTR_RDMSR : INSTR_WRMSR);
+
+ if ( inst_len == 0 )
+ return;
- if ( vmcb->exitinfo1 == 0 )
+ if ( rdmsr )
{
- if ( (inst_len = __get_instruction_length(v, INSTR_RDMSR)) == 0 )
- return;
- rc = hvm_msr_read_intercept(regs->ecx, &msr_content);
- regs->eax = (uint32_t)msr_content;
- regs->edx = (uint32_t)(msr_content >> 32);
+ uint64_t msr_content = 0;
+
+ rc = hvm_msr_read_intercept(regs->_ecx, &msr_content);
+ if ( rc == X86EMUL_OKAY )
+ {
+ regs->rax = (uint32_t)msr_content;
+ regs->rdx = (uint32_t)(msr_content >> 32);
+ }
}
else
- {
- if ( (inst_len = __get_instruction_length(v, INSTR_WRMSR)) == 0 )
- return;
- msr_content = ((uint64_t)regs->edx << 32) | (uint32_t)regs->eax;
- rc = hvm_msr_write_intercept(regs->ecx, msr_content, 1);
- }
+ rc = hvm_msr_write_intercept(regs->_ecx,
+ (regs->rdx << 32) | regs->_eax, 1);
if ( rc == X86EMUL_OKAY )
__update_guest_eip(regs, inst_len);
--
2.1.4
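For anyone skimming the diff, the behavioural change on the RDMSR side
boils down to the following standalone sketch. It is not Xen code: struct
regs, read_intercept() and the EMUL_* values are stand-ins invented for
illustration. The point is only that the destination registers are written
when, and only when, the intercept reports success, and that the temporary
starts out as 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EMUL_OKAY      0	/* stand-in for X86EMUL_OKAY */
#define EMUL_EXCEPTION 2	/* stand-in for any failure code */

struct regs {
	uint64_t rax, rcx, rdx;
};

/* Hypothetical intercept: only MSR 0x10 is readable. */
static int read_intercept(uint32_t msr, uint64_t *val)
{
	if (msr != 0x10)
		return EMUL_EXCEPTION;	/* *val is deliberately left untouched */
	*val = 0x1122334455667788ULL;
	return EMUL_OKAY;
}

/* Emulate RDMSR: split the 64-bit value into rax/rdx only on success. */
static int do_rdmsr(struct regs *regs)
{
	uint64_t msr_content = 0;	/* never hand uninitialised stack to a guest */
	int rc;

	assert(regs != NULL);
	rc = read_intercept((uint32_t)regs->rcx, &msr_content);
	if (rc == EMUL_OKAY) {
		regs->rax = (uint32_t)msr_content;
		regs->rdx = (uint32_t)(msr_content >> 32);
	}
	return rc;
}
```

With rcx set to an MSR the stub rejects, rax and rdx keep whatever they
held before the call.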
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply related
* Re: [Ksummit-discuss] Including images on Sphinx documents
From: Mauro Carvalho Chehab @ 2016-11-09 12:27 UTC (permalink / raw)
To: Jani Nikula; +Cc: linux-media, linux-doc, linux-kernel, ksummit-discuss
In-Reply-To: <87wpgf8ssc.fsf@intel.com>
On Mon, 07 Nov 2016 12:53:55 +0200
Jani Nikula <jani.nikula@intel.com> wrote:
> On Mon, 07 Nov 2016, Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote:
> > Hi Jon,
> >
> > I'm trying to sort out the next steps after KS with regard to
> > images included in RST files.
> >
> > The issue is that Sphinx image support highly depends on the output
> > format. Also, despite TexLive support for svg and png images[1], Sphinx
> > doesn't produce the right LaTeX commands to use svg[2]. In my tests
> > on my notebook, it didn't seem to do the right thing for
> > PNG either. So, it seems that the only safe way to support images is
> > to convert all of them to PDF for latex/pdf build.
> >
> > [1] On Fedora, via texlive-dvipng and texlive-svg
> > [2] https://github.com/sphinx-doc/sphinx/issues/1907
> >
> > As far as I understand from KS, two decisions were taken:
> >
> > - We're not adding a sphinx extension to run generic commands;
> > - The PDF images should be built at build time from their source files
> > (either svg or bitmap), and we should no longer ship the corresponding
> > PDF files generated from those sources.
> >
> > As you know, we use several images at the media documentation:
> > https://www.kernel.org/doc/html/latest/_images/
> >
> > Those images are tightly coupled with the explanation texts. So,
> > maintaining them away from the documentation is not an option.
> >
> > I was originally thinking that adding a graphviz extension would solve the
> > issue, but, in fact, most of the images aren't diagrams. Instead, there are
> > several ones with images showing the result of passing certain parameters to
> > the ioctls, explaining things like scale and cropping and how bytes are
> > packed on some image formats.
> >
> > Linus proposed calling some image conversion tool like ImageMagick or
> > inkscape to convert them to PDF when building the pdfdocs or latexdocs
> > target in the Makefile, but there's an issue with that: Sphinx doesn't read
> > files from Documentation/output, and writing them directly to the
> > source dir would go against what is expected when the "O=" argument
> > is passed to make.
> >
> > So, we have a few alternatives:
> >
> > 1) copy (or symlink) all rst files to Documentation/output (or to the
> > build dir specified via O= directive) and generate the *.pdf there,
> > and produce those converted images via Makefile;
> >
> > 2) add a Sphinx extension that would internally call ImageMagick and/or
> > inkscape to convert the bitmap;
> >
> > 3) if possible, add an extension to trick Sphinx into considering the
> > output dir as a source dir too.
>
> Looking at the available extensions, and the images to be displayed,
> seems to me making svg work, somehow, is the right approach. (As opposed
> to trying to represent the images in graphviz or whatnot.)
I guess I answered this one already, but it got lost somehow...
The problem is not just with svg. Sphinx also does the wrong thing with
PNG, despite apparently generating the right LaTeX image include command.
> IIUC texlive supports displaying svg directly, but the problem is that
> Sphinx produces bad latex for that. Can we make it work by manually
> writing the latex? If yes, we wouldn't need to use an external tool to
> convert the svg to something else, but rather fix the latex. Thus:
>
> 4a) See if this works:
>
> .. only:: html
>
> .. image:: foo.svg
We're currently using .. figure:: instead, as it allows an optional caption
and legend, but I got the idea.
> .. raw:: latex
>
> <the correct latex commands required to display foo.svg>
That is a horrible hack, and will lose other attributes of
image:: (or figure::), like :align:
Also, it won't solve the problem, as the images will still need to be
copied to the build dir via Makefile, since Sphinx only copies the images
it recognizes.
So, in practice, the only difference is that Makefile would be calling
"cp" instead of "convert", plus we'll have to hack all ReST sources.
> 4b) Add a directive extension to make the above happen automatically.
If doable, I agree that this is the best solution. Any volunteers to write
such an extension?
> Of course, the correct fix is to have this fixed in upstream Sphinx, but
> as a workaround an extension doing the above seems plausible, and not
> too much effort - provided that we can make the raw latex work.
Yeah, fixing it in upstream Sphinx would be best, but we'll still
need to maintain the workaround for a while for the unpatched versions
of Sphinx.
Thanks,
Mauro
^ permalink raw reply
* Re: Could receive allow updating an existing subvolume?
From: Austin S. Hemmelgarn @ 2016-11-09 12:26 UTC (permalink / raw)
To: Ian Kelling, Hugo Mills; +Cc: linux-btrfs
In-Reply-To: <1478646934.2753701.781728689.373545C7@webmail.messagingengine.com>
On 2016-11-08 18:15, Ian Kelling wrote:
> On Tue, Nov 8, 2016, at 03:00 PM, Hugo Mills wrote:
>> On Tue, Nov 08, 2016 at 02:48:56PM -0800, Ian Kelling wrote:
>>> It seems to be an artificially imposed limitation which hurts
>>> its usefulness. Let me know if this makes sense. If so, perhaps it
>>> can be implemented eventually. It seems a bit obvious but I couldn't
>>> find any existing discussion of it.
>>
>> It's not artificial -- it's ensuring safety of operation.
>
> No, it doesn't ensure the subvolume is not modified, so it IS
> artificial. I can still set the subvolume to rw before or probably
> during the send and modify a file and mess things up.
>
>>
>> If the sender sends an incremental stream, that assumes an *exact*
>> subvol state on the receiving side. If the subvol on the receiving
>> side is modified, then the receive can fail.
>
> No. The reading program never needs to have access to rw files if it's
> reading from a read-only mountpoint while the subvolume is rw and
> mounted as such elsewhere. And a reading program does not magically risk
> writes.
That assumes that the reading program is bug free (and perfectly
secured) running on 100% reliable hardware with no chance of any kind of
failure, which is a pretty significant requirement that's functionally
impossible to enforce.
There's also the fact that things which have files opened read-only
generally expect that the file will not change under them, and that
you'd need to restart most software anyway so that it would pick up on
any renames and new or deleted paths.
>
>>
>> So, the assumption is that the reference subvol on the receiving
>> side (equivalent to the -p subvol on the sending side) hasn't been
>> changed since it was received. The same assumption applies to the -p
>> subvol on the sending side.
>>
>> Now, receive is a fully userspace tool, so it would have to set the
>> subvol to RW, then update it, then set it to RO. The subvol risks
>> being modified by other processes during that window -- *particularly*
>> if it's actively being read by those other processes.
>
> No. The reading program never needs to have access to rw files if it's
> reading from a read-only mountpoint while the subvolume is rw and
> mounted as such elsewhere. And a reading program does not magically risk
> writes.
>
>>
>> Note that this is still an issue with the current situation, but
>> the expectation is that nothing's going to be actively reading that
>> location at the time the receive is running. But, if something does go
>> wrong with the receive, it's possible to abort and restart the
>> process. If you're modifying an existing subvol, there's no
>> recoverability if something goes wrong halfway through.
>
> No. You could recover using the snapshot that I mentioned.
>
>> Hugo.
>
> So my question still stands.
Given the use case you're describing, it sounds like `rsync --inplace`
plus snapshots is a better fit for what you want to do than
send/receive. It's worth pointing out though that this is _NOT_ a safe
way to handle things if you're actually serving data based on the contents
of those files, because any read while the file is being updated will
likely return half-updated data. The only case where I would ever
consider doing something like this is on a system which is an
active-backup for another system, because it's pretty much guaranteed to
not be serving data if it's syncing it from the primary system.
* Re: [Qemu-devel] [PATCH] spapr: Fix migration of PCI host bridges from qemu-2.7
From: David Gibson @ 2016-11-09 12:19 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: mdroth, agraf, thuth, lvivier, qemu-ppc, qemu-devel
In-Reply-To: <8711c844-ccae-a4ef-3c42-eeda73cee5e2@ozlabs.ru>
On Wed, Nov 09, 2016 at 04:14:25PM +1100, Alexey Kardashevskiy wrote:
> On 09/11/16 14:45, David Gibson wrote:
> > daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration from
> > qemu-2.7 to the current version. It split the device's MMIO window into
> > two pieces for 32-bit and 64-bit MMIO.
> >
> > The patch included backwards compatibility code to convert the old property
> > into the new format. However, the property value was also transferred in
> > the migration stream and compared with a (probably unwise) VMSTATE_EQUAL.
> > So, the "raw" value from 2.7 is compared to the new style converted value
> > from (pre-)2.8 giving a mismatch and migration failure.
> >
> > Although it would be technically possible to fix this in a way allowing
> > backwards migration, that would leave an ugly legacy around indefinitely.
> > This patch takes the simpler approach of bumping the migration version,
> > dropping the unwise VMSTATE_EQUAL (and some equally unwise ones around it)
> > and ignoring them on an incoming migration.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > hw/ppc/spapr_pci.c | 17 +++++++++++------
> > 1 file changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 7cde30e..7f1cc29 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1658,19 +1658,24 @@ static int spapr_pci_post_load(void *opaque, int version_id)
> > return 0;
> > }
> >
> > +static bool version_before_3(void *opaque, int version_id)
> > +{
> > + return version_id < 3;
> > +}
> > +
> > static const VMStateDescription vmstate_spapr_pci = {
> > .name = "spapr_pci",
> > - .version_id = 2,
> > + .version_id = 3,
> > .minimum_version_id = 2,
> > .pre_save = spapr_pci_pre_save,
> > .post_load = spapr_pci_post_load,
> > .fields = (VMStateField[]) {
> > VMSTATE_UINT64_EQUAL(buid, sPAPRPHBState),
>
>
> You could probably go one step further and get rid of @buid as well.
I thought about it. buid at least is specified state that's
vanishingly unlikely to change or disappear from the device. It also
does serve to make sure that QOM instance matching - which is always a
bit black magicy to me - is lining things up correctly.
>
> Nevertheless, this works,
>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
>
>
>
> > - VMSTATE_UINT32_EQUAL(dma_liobn[0], sPAPRPHBState),
> > - VMSTATE_UINT64_EQUAL(mem_win_addr, sPAPRPHBState),
> > - VMSTATE_UINT64_EQUAL(mem_win_size, sPAPRPHBState),
> > - VMSTATE_UINT64_EQUAL(io_win_addr, sPAPRPHBState),
> > - VMSTATE_UINT64_EQUAL(io_win_size, sPAPRPHBState),
> > + VMSTATE_UNUSED_TEST(version_before_3, sizeof(uint32_t) /* dma_liobn[0] */
> > + + sizeof(uint64_t) /* mem_win_addr */
> > + + sizeof(uint64_t) /* mem_win_size */
> > + + sizeof(uint64_t) /* io_win_addr */
> > + + sizeof(uint64_t) /* io_win_size */),
> > VMSTATE_STRUCT_ARRAY(lsi_table, sPAPRPHBState, PCI_NUM_PINS, 0,
> > vmstate_spapr_pci_lsi, struct spapr_pci_lsi),
> > VMSTATE_INT32(msi_devs_num, sPAPRPHBState),
> >
>
>
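For anyone checking the arithmetic in the VMSTATE_UNUSED_TEST above: the
sketch below is standalone C, not QEMU code. It only illustrates how many
bytes a reader would skip for an incoming version 2 stream, assuming
uint32_t is 4 bytes and uint64_t is 8. LEGACY_FIELDS_SIZE and
offset_after_legacy_fields() are hypothetical names made up for the
illustration; only the version_id < 3 comparison comes from the patch
(where the predicate also takes an opaque pointer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Total size of the five fields the patch stops comparing.
 * The name is hypothetical; the field list mirrors the patch. */
enum {
    LEGACY_FIELDS_SIZE = sizeof(uint32_t)   /* dma_liobn[0] */
                       + sizeof(uint64_t)   /* mem_win_addr */
                       + sizeof(uint64_t)   /* mem_win_size */
                       + sizeof(uint64_t)   /* io_win_addr */
                       + sizeof(uint64_t)   /* io_win_size */
};

/* Same comparison as the patch's predicate, minus the opaque pointer. */
static bool version_before_3(int version_id)
{
    return version_id < 3;
}

/* Hypothetical helper: given the read offset just past buid, return the
 * offset of the next field that is still parsed (lsi_table in the patch).
 * Old streams carry the legacy fields, so they are stepped over. */
static size_t offset_after_legacy_fields(size_t offset, int version_id)
{
    if (version_before_3(version_id)) {
        offset += LEGACY_FIELDS_SIZE;
    }
    return offset;
}
```

Under those assumptions the skipped region is 36 bytes for a version 2
stream and nothing for version 3, which is the behaviour the patch
description asks for.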
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
* [Qemu-devel] [PATCHv2 0/3] Allow ISA to be disabled on some platforms
From: David Gibson @ 2016-11-09 12:22 UTC (permalink / raw)
To: edgar.iglesias, michael, proljc, borntraeger, cornelia.huck,
kbastian, jcmvbkbc
Cc: agraf, mst, armbru, pbonzini, peter.maydell, veroniabahaa,
qemu-devel, David Gibson
This is a rebase and revision of a series I wrote quite some time ago.
This makes some cleanups that are a start on allowing ISA to be
compiled out for platforms which don't use it.
Unfortunately, a lot of the pieces here don't have a clear maintainer.
So, I'm hoping to get some Acked-bys from the maintainers of the
affected targets, then I intend to send a pull request direct to Peter.
Notes:
* Patch 3/3 triggers a style warning, but that's just because I'm
moving a C++ // comment verbatim from one file to another
Changes since v1:
* Fixed some silly compile errors in 3/3 exposed by some
changes in other headers
David Gibson (3):
Split serial-isa into its own config option
Allow ISA bus to be configured out
Split ISA and sysbus versions of m48t59 device
default-configs/alpha-softmmu.mak | 1 +
default-configs/arm-softmmu.mak | 1 +
default-configs/i386-softmmu.mak | 1 +
default-configs/mips-softmmu-common.mak | 1 +
default-configs/moxie-softmmu.mak | 2 +
default-configs/pci.mak | 3 +
default-configs/ppc-softmmu.mak | 1 +
default-configs/ppc64-softmmu.mak | 1 +
default-configs/ppcemb-softmmu.mak | 1 +
default-configs/sh4-softmmu.mak | 1 +
default-configs/sh4eb-softmmu.mak | 1 +
default-configs/sparc-softmmu.mak | 1 +
default-configs/sparc64-softmmu.mak | 1 +
default-configs/unicore32-softmmu.mak | 1 +
default-configs/x86_64-softmmu.mak | 1 +
hw/char/Makefile.objs | 3 +-
hw/isa/Makefile.objs | 2 +-
hw/timer/Makefile.objs | 3 +
hw/timer/m48t59-internal.h | 82 ++++++++++++
hw/timer/m48t59-isa.c | 181 +++++++++++++++++++++++++
hw/timer/m48t59.c | 228 +++-----------------------------
21 files changed, 305 insertions(+), 212 deletions(-)
create mode 100644 hw/timer/m48t59-internal.h
create mode 100644 hw/timer/m48t59-isa.c
--
2.7.4
* [Qemu-devel] [PATCHv2 3/3] Split ISA and sysbus versions of m48t59 device
From: David Gibson @ 2016-11-09 12:22 UTC (permalink / raw)
To: edgar.iglesias, michael, proljc, borntraeger, cornelia.huck,
kbastian, jcmvbkbc
Cc: agraf, mst, armbru, pbonzini, peter.maydell, veroniabahaa,
qemu-devel, David Gibson
In-Reply-To: <1478694124-15803-1-git-send-email-david@gibson.dropbear.id.au>
The m48t59 device supports both ISA and direct sysbus attached versions of
the device in the one .c file. This can be awkward for some embedded
machine types which need the sysbus M48T59, but don't want to pull in the
ISA bus code and its other dependencies.
Therefore, this patch splits out the code for the ISA attached M48T59 into
its own C file. It will be built when both CONFIG_M48T59 and
CONFIG_ISA_BUS are enabled.
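
As a rough illustration of the resulting layering (a standalone sketch, not
the code in this patch): the chip state stays in one shared struct, and each
bus front end embeds it and forwards to the common helpers. CoreState,
IsaWrapper, core_toggle_lock() and isa_toggle_lock() are hypothetical names;
only the XOR in the toggle mirrors m48t59_toggle_lock().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared chip state (M48t59State). */
typedef struct CoreState {
    uint8_t lock;
} CoreState;

/* Common helper: flip one lock bit, same XOR as m48t59_toggle_lock(). */
static inline void core_toggle_lock(CoreState *s, int lock)
{
    s->lock ^= 1 << lock;
}

/* Hypothetical bus-specific wrapper, analogous to M48txxISAState:
 * bus details live here, chip state is embedded unchanged. */
typedef struct IsaWrapper {
    uint32_t io_base;
    CoreState state;
} IsaWrapper;

/* The bus front end only unwraps and delegates to the common code. */
static void isa_toggle_lock(IsaWrapper *d, int lock)
{
    core_toggle_lock(&d->state, lock);
}
```

Because the wrapper holds no chip logic of its own, a build without the ISA
front end can drop it without touching the common helpers.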
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
hw/timer/Makefile.objs | 3 +
hw/timer/m48t59-internal.h | 82 ++++++++++++++++
hw/timer/m48t59-isa.c | 181 +++++++++++++++++++++++++++++++++++
hw/timer/m48t59.c | 228 ++++-----------------------------------------
4 files changed, 284 insertions(+), 210 deletions(-)
create mode 100644 hw/timer/m48t59-internal.h
create mode 100644 hw/timer/m48t59-isa.c
diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
index 7ba8c23..bf3ea3c 100644
--- a/hw/timer/Makefile.objs
+++ b/hw/timer/Makefile.objs
@@ -6,6 +6,9 @@ common-obj-$(CONFIG_DS1338) += ds1338.o
common-obj-$(CONFIG_HPET) += hpet.o
common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
common-obj-$(CONFIG_M48T59) += m48t59.o
+ifeq ($(CONFIG_ISA_BUS),y)
+common-obj-$(CONFIG_M48T59) += m48t59-isa.o
+endif
common-obj-$(CONFIG_PL031) += pl031.o
common-obj-$(CONFIG_PUV3) += puv3_ost.o
common-obj-$(CONFIG_TWL92230) += twl92230.o
diff --git a/hw/timer/m48t59-internal.h b/hw/timer/m48t59-internal.h
new file mode 100644
index 0000000..32ae957
--- /dev/null
+++ b/hw/timer/m48t59-internal.h
@@ -0,0 +1,82 @@
+/*
+ * QEMU M48T59 and M48T08 NVRAM emulation (common header)
+ *
+ * Copyright (c) 2003-2005, 2007 Jocelyn Mayer
+ * Copyright (c) 2013 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#ifndef HW_M48T59_INTERNAL_H
+#define HW_M48T59_INTERNAL_H 1
+
+//#define DEBUG_NVRAM
+
+#if defined(DEBUG_NVRAM)
+#define NVRAM_PRINTF(fmt, ...) do { printf(fmt , ## __VA_ARGS__); } while (0)
+#else
+#define NVRAM_PRINTF(fmt, ...) do { } while (0)
+#endif
+
+/*
+ * The M48T02, M48T08 and M48T59 chips are very similar. The newer '59 has
+ * alarm and a watchdog timer and related control registers. In the
+ * PPC platform there is also a nvram lock function.
+ */
+
+typedef struct M48txxInfo {
+ const char *bus_name;
+ uint32_t model; /* 2 = m48t02, 8 = m48t08, 59 = m48t59 */
+ uint32_t size;
+} M48txxInfo;
+
+typedef struct M48t59State {
+ /* Hardware parameters */
+ qemu_irq IRQ;
+ MemoryRegion iomem;
+ uint32_t size;
+ int32_t base_year;
+ /* RTC management */
+ time_t time_offset;
+ time_t stop_time;
+ /* Alarm & watchdog */
+ struct tm alarm;
+ QEMUTimer *alrm_timer;
+ QEMUTimer *wd_timer;
+ /* NVRAM storage */
+ uint8_t *buffer;
+ /* Model parameters */
+ uint32_t model; /* 2 = m48t02, 8 = m48t08, 59 = m48t59 */
+ /* NVRAM storage */
+ uint16_t addr;
+ uint8_t lock;
+} M48t59State;
+
+uint32_t m48t59_read(M48t59State *NVRAM, uint32_t addr);
+void m48t59_write(M48t59State *NVRAM, uint32_t addr, uint32_t val);
+void m48t59_reset_common(M48t59State *NVRAM);
+void m48t59_realize_common(M48t59State *s, Error **errp);
+
+static inline void m48t59_toggle_lock(M48t59State *NVRAM, int lock)
+{
+ NVRAM->lock ^= 1 << lock;
+}
+
+extern const MemoryRegionOps m48t59_io_ops;
+
+#endif /* HW_M48T59_INTERNAL_H */
diff --git a/hw/timer/m48t59-isa.c b/hw/timer/m48t59-isa.c
new file mode 100644
index 0000000..ea1ba70
--- /dev/null
+++ b/hw/timer/m48t59-isa.c
@@ -0,0 +1,181 @@
+/*
+ * QEMU M48T59 and M48T08 NVRAM emulation (ISA bus interface)
+ *
+ * Copyright (c) 2003-2005, 2007 Jocelyn Mayer
+ * Copyright (c) 2013 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "qemu/osdep.h"
+#include "hw/isa/isa.h"
+#include "hw/timer/m48t59.h"
+#include "m48t59-internal.h"
+
+#define TYPE_M48TXX_ISA "isa-m48txx"
+#define M48TXX_ISA_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(M48txxISADeviceClass, (obj), TYPE_M48TXX_ISA)
+#define M48TXX_ISA_CLASS(klass) \
+ OBJECT_CLASS_CHECK(M48txxISADeviceClass, (klass), TYPE_M48TXX_ISA)
+#define M48TXX_ISA(obj) \
+ OBJECT_CHECK(M48txxISAState, (obj), TYPE_M48TXX_ISA)
+
+typedef struct M48txxISAState {
+ ISADevice parent_obj;
+ M48t59State state;
+ uint32_t io_base;
+ MemoryRegion io;
+} M48txxISAState;
+
+typedef struct M48txxISADeviceClass {
+ ISADeviceClass parent_class;
+ M48txxInfo info;
+} M48txxISADeviceClass;
+
+static M48txxInfo m48txx_isa_info[] = {
+ {
+ .bus_name = "isa-m48t59",
+ .model = 59,
+ .size = 0x2000,
+ }
+};
+
+Nvram *m48t59_init_isa(ISABus *bus, uint32_t io_base, uint16_t size,
+ int base_year, int model)
+{
+ DeviceState *dev;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(m48txx_isa_info); i++) {
+ if (m48txx_isa_info[i].size != size ||
+ m48txx_isa_info[i].model != model) {
+ continue;
+ }
+
+ dev = DEVICE(isa_create(bus, m48txx_isa_info[i].bus_name));
+ qdev_prop_set_uint32(dev, "iobase", io_base);
+ qdev_prop_set_int32(dev, "base-year", base_year);
+ qdev_init_nofail(dev);
+ return NVRAM(dev);
+ }
+
+ assert(false);
+ return NULL;
+}
+
+static uint32_t m48txx_isa_read(Nvram *obj, uint32_t addr)
+{
+ M48txxISAState *d = M48TXX_ISA(obj);
+ return m48t59_read(&d->state, addr);
+}
+
+static void m48txx_isa_write(Nvram *obj, uint32_t addr, uint32_t val)
+{
+ M48txxISAState *d = M48TXX_ISA(obj);
+ m48t59_write(&d->state, addr, val);
+}
+
+static void m48txx_isa_toggle_lock(Nvram *obj, int lock)
+{
+ M48txxISAState *d = M48TXX_ISA(obj);
+ m48t59_toggle_lock(&d->state, lock);
+}
+
+static Property m48t59_isa_properties[] = {
+ DEFINE_PROP_INT32("base-year", M48txxISAState, state.base_year, 0),
+ DEFINE_PROP_UINT32("iobase", M48txxISAState, io_base, 0x74),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void m48t59_reset_isa(DeviceState *d)
+{
+ M48txxISAState *isa = M48TXX_ISA(d);
+ M48t59State *NVRAM = &isa->state;
+
+ m48t59_reset_common(NVRAM);
+}
+
+static void m48t59_isa_realize(DeviceState *dev, Error **errp)
+{
+ M48txxISADeviceClass *u = M48TXX_ISA_GET_CLASS(dev);
+ ISADevice *isadev = ISA_DEVICE(dev);
+ M48txxISAState *d = M48TXX_ISA(dev);
+ M48t59State *s = &d->state;
+
+ s->model = u->info.model;
+ s->size = u->info.size;
+ isa_init_irq(isadev, &s->IRQ, 8);
+ m48t59_realize_common(s, errp);
+ memory_region_init_io(&d->io, OBJECT(dev), &m48t59_io_ops, s, "m48t59", 4);
+ if (d->io_base != 0) {
+ isa_register_ioport(isadev, &d->io, d->io_base);
+ }
+}
+
+static void m48txx_isa_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ NvramClass *nc = NVRAM_CLASS(klass);
+
+ dc->realize = m48t59_isa_realize;
+ dc->reset = m48t59_reset_isa;
+ dc->props = m48t59_isa_properties;
+ nc->read = m48txx_isa_read;
+ nc->write = m48txx_isa_write;
+ nc->toggle_lock = m48txx_isa_toggle_lock;
+}
+
+static void m48txx_isa_concrete_class_init(ObjectClass *klass, void *data)
+{
+ M48txxISADeviceClass *u = M48TXX_ISA_CLASS(klass);
+ M48txxInfo *info = data;
+
+ u->info = *info;
+}
+
+static const TypeInfo m48txx_isa_type_info = {
+ .name = TYPE_M48TXX_ISA,
+ .parent = TYPE_ISA_DEVICE,
+ .instance_size = sizeof(M48txxISAState),
+ .abstract = true,
+ .class_init = m48txx_isa_class_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_NVRAM },
+ { }
+ }
+};
+
+static void m48t59_isa_register_types(void)
+{
+ TypeInfo isa_type_info = {
+ .parent = TYPE_M48TXX_ISA,
+ .class_size = sizeof(M48txxISADeviceClass),
+ .class_init = m48txx_isa_concrete_class_init,
+ };
+ int i;
+
+ type_register_static(&m48txx_isa_type_info);
+
+ for (i = 0; i < ARRAY_SIZE(m48txx_isa_info); i++) {
+ isa_type_info.name = m48txx_isa_info[i].bus_name;
+ isa_type_info.class_data = &m48txx_isa_info[i];
+ type_register(&isa_type_info);
+ }
+}
+
+type_init(m48t59_isa_register_types)
diff --git a/hw/timer/m48t59.c b/hw/timer/m48t59.c
index e46ca88..0157977 100644
--- a/hw/timer/m48t59.c
+++ b/hw/timer/m48t59.c
@@ -29,17 +29,10 @@
#include "qemu/timer.h"
#include "sysemu/sysemu.h"
#include "hw/sysbus.h"
-#include "hw/isa/isa.h"
#include "exec/address-spaces.h"
#include "qemu/bcd.h"
-//#define DEBUG_NVRAM
-
-#if defined(DEBUG_NVRAM)
-#define NVRAM_PRINTF(fmt, ...) do { printf(fmt , ## __VA_ARGS__); } while (0)
-#else
-#define NVRAM_PRINTF(fmt, ...) do { } while (0)
-#endif
+#include "m48t59-internal.h"
#define TYPE_M48TXX_SYS_BUS "sysbus-m48txx"
#define M48TXX_SYS_BUS_GET_CLASS(obj) \
@@ -49,27 +42,6 @@
#define M48TXX_SYS_BUS(obj) \
OBJECT_CHECK(M48txxSysBusState, (obj), TYPE_M48TXX_SYS_BUS)
-#define TYPE_M48TXX_ISA "isa-m48txx"
-#define M48TXX_ISA_GET_CLASS(obj) \
- OBJECT_GET_CLASS(M48txxISADeviceClass, (obj), TYPE_M48TXX_ISA)
-#define M48TXX_ISA_CLASS(klass) \
- OBJECT_CLASS_CHECK(M48txxISADeviceClass, (klass), TYPE_M48TXX_ISA)
-#define M48TXX_ISA(obj) \
- OBJECT_CHECK(M48txxISAState, (obj), TYPE_M48TXX_ISA)
-
-/*
- * The M48T02, M48T08 and M48T59 chips are very similar. The newer '59 has
- * alarm and a watchdog timer and related control registers. In the
- * PPC platform there is also a nvram lock function.
- */
-
-typedef struct M48txxInfo {
- const char *isa_name;
- const char *sysbus_name;
- uint32_t model; /* 2 = m48t02, 8 = m48t08, 59 = m48t59 */
- uint32_t size;
-} M48txxInfo;
-
/*
* Chipset docs:
* http://www.st.com/stonline/products/literature/ds/2410/m48t02.pdf
@@ -77,40 +49,6 @@ typedef struct M48txxInfo {
* http://www.st.com/stonline/products/literature/od/7001/m48t59y.pdf
*/
-typedef struct M48t59State {
- /* Hardware parameters */
- qemu_irq IRQ;
- MemoryRegion iomem;
- uint32_t size;
- int32_t base_year;
- /* RTC management */
- time_t time_offset;
- time_t stop_time;
- /* Alarm & watchdog */
- struct tm alarm;
- QEMUTimer *alrm_timer;
- QEMUTimer *wd_timer;
- /* NVRAM storage */
- uint8_t *buffer;
- /* Model parameters */
- uint32_t model; /* 2 = m48t02, 8 = m48t08, 59 = m48t59 */
- /* NVRAM storage */
- uint16_t addr;
- uint8_t lock;
-} M48t59State;
-
-typedef struct M48txxISAState {
- ISADevice parent_obj;
- M48t59State state;
- uint32_t io_base;
- MemoryRegion io;
-} M48txxISAState;
-
-typedef struct M48txxISADeviceClass {
- ISADeviceClass parent_class;
- M48txxInfo info;
-} M48txxISADeviceClass;
-
typedef struct M48txxSysBusState {
SysBusDevice parent_obj;
M48t59State state;
@@ -122,21 +60,17 @@ typedef struct M48txxSysBusDeviceClass {
M48txxInfo info;
} M48txxSysBusDeviceClass;
-static M48txxInfo m48txx_info[] = {
+static M48txxInfo m48txx_sysbus_info[] = {
{
- .sysbus_name = "sysbus-m48t02",
+ .bus_name = "sysbus-m48t02",
.model = 2,
.size = 0x800,
},{
- .sysbus_name = "sysbus-m48t08",
+ .bus_name = "sysbus-m48t08",
.model = 8,
.size = 0x2000,
},{
- .sysbus_name = "sysbus-m48t59",
- .model = 59,
- .size = 0x2000,
- },{
- .isa_name = "isa-m48t59",
+ .bus_name = "sysbus-m48t59",
.model = 59,
.size = 0x2000,
}
@@ -248,7 +182,7 @@ static void set_up_watchdog(M48t59State *NVRAM, uint8_t value)
}
/* Direct access to NVRAM */
-static void m48t59_write(M48t59State *NVRAM, uint32_t addr, uint32_t val)
+void m48t59_write(M48t59State *NVRAM, uint32_t addr, uint32_t val)
{
struct tm tm;
int tmp;
@@ -413,7 +347,7 @@ static void m48t59_write(M48t59State *NVRAM, uint32_t addr, uint32_t val)
}
}
-static uint32_t m48t59_read(M48t59State *NVRAM, uint32_t addr)
+uint32_t m48t59_read(M48t59State *NVRAM, uint32_t addr)
{
struct tm tm;
uint32_t retval = 0xFF;
@@ -517,11 +451,6 @@ static uint32_t m48t59_read(M48t59State *NVRAM, uint32_t addr)
return retval;
}
-static void m48t59_toggle_lock(M48t59State *NVRAM, int lock)
-{
- NVRAM->lock ^= 1 << lock;
-}
-
/* IO access to NVRAM */
static void NVRAM_writeb(void *opaque, hwaddr addr, uint64_t val,
unsigned size)
@@ -639,7 +568,7 @@ static const VMStateDescription vmstate_m48t59 = {
}
};
-static void m48t59_reset_common(M48t59State *NVRAM)
+void m48t59_reset_common(M48t59State *NVRAM)
{
NVRAM->addr = 0;
NVRAM->lock = 0;
@@ -650,14 +579,6 @@ static void m48t59_reset_common(M48t59State *NVRAM)
timer_del(NVRAM->wd_timer);
}
-static void m48t59_reset_isa(DeviceState *d)
-{
- M48txxISAState *isa = M48TXX_ISA(d);
- M48t59State *NVRAM = &isa->state;
-
- m48t59_reset_common(NVRAM);
-}
-
static void m48t59_reset_sysbus(DeviceState *d)
{
M48txxSysBusState *sys = M48TXX_SYS_BUS(d);
@@ -666,7 +587,7 @@ static void m48t59_reset_sysbus(DeviceState *d)
m48t59_reset_common(NVRAM);
}
-static const MemoryRegionOps m48t59_io_ops = {
+const MemoryRegionOps m48t59_io_ops = {
.read = NVRAM_readb,
.write = NVRAM_writeb,
.impl = {
@@ -685,14 +606,13 @@ Nvram *m48t59_init(qemu_irq IRQ, hwaddr mem_base,
SysBusDevice *s;
int i;
- for (i = 0; i < ARRAY_SIZE(m48txx_info); i++) {
- if (!m48txx_info[i].sysbus_name ||
- m48txx_info[i].size != size ||
- m48txx_info[i].model != model) {
+ for (i = 0; i < ARRAY_SIZE(m48txx_sysbus_info); i++) {
+ if (m48txx_sysbus_info[i].size != size ||
+ m48txx_sysbus_info[i].model != model) {
continue;
}
- dev = qdev_create(NULL, m48txx_info[i].sysbus_name);
+ dev = qdev_create(NULL, m48txx_sysbus_info[i].bus_name);
qdev_prop_set_int32(dev, "base-year", base_year);
qdev_init_nofail(dev);
s = SYS_BUS_DEVICE(dev);
@@ -712,31 +632,7 @@ Nvram *m48t59_init(qemu_irq IRQ, hwaddr mem_base,
return NULL;
}
-Nvram *m48t59_init_isa(ISABus *bus, uint32_t io_base, uint16_t size,
- int base_year, int model)
-{
- DeviceState *dev;
- int i;
-
- for (i = 0; i < ARRAY_SIZE(m48txx_info); i++) {
- if (!m48txx_info[i].isa_name ||
- m48txx_info[i].size != size ||
- m48txx_info[i].model != model) {
- continue;
- }
-
- dev = DEVICE(isa_create(bus, m48txx_info[i].isa_name));
- qdev_prop_set_uint32(dev, "iobase", io_base);
- qdev_prop_set_int32(dev, "base-year", base_year);
- qdev_init_nofail(dev);
- return NVRAM(dev);
- }
-
- assert(false);
- return NULL;
-}
-
-static void m48t59_realize_common(M48t59State *s, Error **errp)
+void m48t59_realize_common(M48t59State *s, Error **errp)
{
s->buffer = g_malloc0(s->size);
if (s->model == 59) {
@@ -748,23 +644,6 @@ static void m48t59_realize_common(M48t59State *s, Error **errp)
vmstate_register(NULL, -1, &vmstate_m48t59, s);
}
-static void m48t59_isa_realize(DeviceState *dev, Error **errp)
-{
- M48txxISADeviceClass *u = M48TXX_ISA_GET_CLASS(dev);
- ISADevice *isadev = ISA_DEVICE(dev);
- M48txxISAState *d = M48TXX_ISA(dev);
- M48t59State *s = &d->state;
-
- s->model = u->info.model;
- s->size = u->info.size;
- isa_init_irq(isadev, &s->IRQ, 8);
- m48t59_realize_common(s, errp);
- memory_region_init_io(&d->io, OBJECT(dev), &m48t59_io_ops, s, "m48t59", 4);
- if (d->io_base != 0) {
- isa_register_ioport(isadev, &d->io, d->io_base);
- }
-}
-
static int m48t59_init1(SysBusDevice *dev)
{
M48txxSysBusDeviceClass *u = M48TXX_SYS_BUS_GET_CLASS(dev);
@@ -791,51 +670,6 @@ static int m48t59_init1(SysBusDevice *dev)
return 0;
}
-static uint32_t m48txx_isa_read(Nvram *obj, uint32_t addr)
-{
- M48txxISAState *d = M48TXX_ISA(obj);
- return m48t59_read(&d->state, addr);
-}
-
-static void m48txx_isa_write(Nvram *obj, uint32_t addr, uint32_t val)
-{
- M48txxISAState *d = M48TXX_ISA(obj);
- m48t59_write(&d->state, addr, val);
-}
-
-static void m48txx_isa_toggle_lock(Nvram *obj, int lock)
-{
- M48txxISAState *d = M48TXX_ISA(obj);
- m48t59_toggle_lock(&d->state, lock);
-}
-
-static Property m48t59_isa_properties[] = {
- DEFINE_PROP_INT32("base-year", M48txxISAState, state.base_year, 0),
- DEFINE_PROP_UINT32("iobase", M48txxISAState, io_base, 0x74),
- DEFINE_PROP_END_OF_LIST(),
-};
-
-static void m48txx_isa_class_init(ObjectClass *klass, void *data)
-{
- DeviceClass *dc = DEVICE_CLASS(klass);
- NvramClass *nc = NVRAM_CLASS(klass);
-
- dc->realize = m48t59_isa_realize;
- dc->reset = m48t59_reset_isa;
- dc->props = m48t59_isa_properties;
- nc->read = m48txx_isa_read;
- nc->write = m48txx_isa_write;
- nc->toggle_lock = m48txx_isa_toggle_lock;
-}
-
-static void m48txx_isa_concrete_class_init(ObjectClass *klass, void *data)
-{
- M48txxISADeviceClass *u = M48TXX_ISA_CLASS(klass);
- M48txxInfo *info = data;
-
- u->info = *info;
-}
-
static uint32_t m48txx_sysbus_read(Nvram *obj, uint32_t addr)
{
M48txxSysBusState *d = M48TXX_SYS_BUS(obj);
@@ -899,18 +733,6 @@ static const TypeInfo m48txx_sysbus_type_info = {
}
};
-static const TypeInfo m48txx_isa_type_info = {
- .name = TYPE_M48TXX_ISA,
- .parent = TYPE_ISA_DEVICE,
- .instance_size = sizeof(M48txxISAState),
- .abstract = true,
- .class_init = m48txx_isa_class_init,
- .interfaces = (InterfaceInfo[]) {
- { TYPE_NVRAM },
- { }
- }
-};
-
static void m48t59_register_types(void)
{
TypeInfo sysbus_type_info = {
@@ -918,29 +740,15 @@ static void m48t59_register_types(void)
.class_size = sizeof(M48txxSysBusDeviceClass),
.class_init = m48txx_sysbus_concrete_class_init,
};
- TypeInfo isa_type_info = {
- .parent = TYPE_M48TXX_ISA,
- .class_size = sizeof(M48txxISADeviceClass),
- .class_init = m48txx_isa_concrete_class_init,
- };
int i;
type_register_static(&nvram_info);
type_register_static(&m48txx_sysbus_type_info);
- type_register_static(&m48txx_isa_type_info);
- for (i = 0; i < ARRAY_SIZE(m48txx_info); i++) {
- if (m48txx_info[i].sysbus_name) {
- sysbus_type_info.name = m48txx_info[i].sysbus_name;
- sysbus_type_info.class_data = &m48txx_info[i];
- type_register(&sysbus_type_info);
- }
-
- if (m48txx_info[i].isa_name) {
- isa_type_info.name = m48txx_info[i].isa_name;
- isa_type_info.class_data = &m48txx_info[i];
- type_register(&isa_type_info);
- }
+ for (i = 0; i < ARRAY_SIZE(m48txx_sysbus_info); i++) {
+ sysbus_type_info.name = m48txx_sysbus_info[i].bus_name;
+ sysbus_type_info.class_data = &m48txx_sysbus_info[i];
+ type_register(&sysbus_type_info);
}
}
--
2.7.4
* [Qemu-devel] [PATCHv2 1/3] Split serial-isa into its own config option
From: David Gibson @ 2016-11-09 12:22 UTC (permalink / raw)
To: edgar.iglesias, michael, proljc, borntraeger, cornelia.huck,
kbastian, jcmvbkbc
Cc: agraf, mst, armbru, pbonzini, peter.maydell, veroniabahaa,
qemu-devel, David Gibson
In-Reply-To: <1478694124-15803-1-git-send-email-david@gibson.dropbear.id.au>
At present, the core device model code for 8250-like serial ports
(serial.c) and the code for serial ports attached to ISA-style legacy IO
(serial-isa.c) are both controlled by the CONFIG_SERIAL variable.
There are lots and lots of embedded platforms that have 8250-like serial
ports but have never had anything resembling ISA legacy IO. Therefore,
split serial-isa into its own CONFIG_SERIAL_ISA option so it can be
disabled for platforms where it's not appropriate.
For now, I enabled CONFIG_SERIAL_ISA in every default-config where
CONFIG_SERIAL is enabled, excepting microblaze, or32, and xtensa. As best
as I can tell, those platforms never used legacy ISA, and also don't
include PCI support (which would allow connection of a PCI->ISA bridge
and/or a southbridge including legacy ISA serial ports).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
---
default-configs/alpha-softmmu.mak | 1 +
default-configs/arm-softmmu.mak | 1 +
default-configs/i386-softmmu.mak | 1 +
default-configs/mips-softmmu-common.mak | 1 +
default-configs/moxie-softmmu.mak | 1 +
default-configs/pci.mak | 1 +
default-configs/ppc-softmmu.mak | 1 +
default-configs/ppc64-softmmu.mak | 1 +
default-configs/ppcemb-softmmu.mak | 1 +
default-configs/sh4-softmmu.mak | 1 +
default-configs/sh4eb-softmmu.mak | 1 +
default-configs/sparc64-softmmu.mak | 1 +
default-configs/x86_64-softmmu.mak | 1 +
hw/char/Makefile.objs | 3 ++-
14 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/default-configs/alpha-softmmu.mak b/default-configs/alpha-softmmu.mak
index 7f6161e..e0d75e3 100644
--- a/default-configs/alpha-softmmu.mak
+++ b/default-configs/alpha-softmmu.mak
@@ -3,6 +3,7 @@
include pci.mak
include usb.mak
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_I8254=y
CONFIG_PCKBD=y
CONFIG_VGA_CIRRUS=y
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 6de3e16..dcbcea7 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -6,6 +6,7 @@ CONFIG_VGA=y
CONFIG_NAND=y
CONFIG_ECC=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_PTIMER=y
CONFIG_SD=y
CONFIG_MAX7310=y
diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 0b51360..3f2e820 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -15,6 +15,7 @@ CONFIG_IPMI_EXTERN=y
CONFIG_ISA_IPMI_KCS=y
CONFIG_ISA_IPMI_BT=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_PARALLEL=y
CONFIG_I8254=y
CONFIG_PCSPK=y
diff --git a/default-configs/mips-softmmu-common.mak b/default-configs/mips-softmmu-common.mak
index 0394514..5b8b0c9 100644
--- a/default-configs/mips-softmmu-common.mak
+++ b/default-configs/mips-softmmu-common.mak
@@ -9,6 +9,7 @@ CONFIG_VGA_ISA_MM=y
CONFIG_VGA_CIRRUS=y
CONFIG_VMWARE_VGA=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_PARALLEL=y
CONFIG_I8254=y
CONFIG_PCSPK=y
diff --git a/default-configs/moxie-softmmu.mak b/default-configs/moxie-softmmu.mak
index 1a95476..7e22863 100644
--- a/default-configs/moxie-softmmu.mak
+++ b/default-configs/moxie-softmmu.mak
@@ -2,4 +2,5 @@
CONFIG_MC146818RTC=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_VGA=y
diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index fff7ce3..d8d6548 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -27,6 +27,7 @@ CONFIG_AHCI=y
CONFIG_ESP=y
CONFIG_ESP_PCI=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_SERIAL_PCI=y
CONFIG_IPACK=y
CONFIG_WDT_IB6300ESB=y
diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
index d4d0f9b..13eb94f 100644
--- a/default-configs/ppc-softmmu.mak
+++ b/default-configs/ppc-softmmu.mak
@@ -45,5 +45,6 @@ CONFIG_PLATFORM_BUS=y
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For PReP
+CONFIG_SERIAL_ISA=y
CONFIG_MC146818RTC=y
CONFIG_ISA_TESTDEV=y
diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index 67a9bca..f607125 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -52,6 +52,7 @@ CONFIG_XICS=$(CONFIG_PSERIES)
CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
# For PReP
+CONFIG_SERIAL_ISA=y
CONFIG_MC146818RTC=y
CONFIG_ISA_TESTDEV=y
CONFIG_MEM_HOTPLUG=y
diff --git a/default-configs/ppcemb-softmmu.mak b/default-configs/ppcemb-softmmu.mak
index 54acc4d..7f56004 100644
--- a/default-configs/ppcemb-softmmu.mak
+++ b/default-configs/ppcemb-softmmu.mak
@@ -5,6 +5,7 @@ include sound.mak
include usb.mak
CONFIG_M48T59=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_I8257=y
CONFIG_OPENPIC=y
CONFIG_PFLASH_CFI01=y
diff --git a/default-configs/sh4-softmmu.mak b/default-configs/sh4-softmmu.mak
index 8e00390..546d855 100644
--- a/default-configs/sh4-softmmu.mak
+++ b/default-configs/sh4-softmmu.mak
@@ -3,6 +3,7 @@
include pci.mak
include usb.mak
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_PTIMER=y
CONFIG_PFLASH_CFI02=y
CONFIG_SH4=y
diff --git a/default-configs/sh4eb-softmmu.mak b/default-configs/sh4eb-softmmu.mak
index efdd058..2d3fd49 100644
--- a/default-configs/sh4eb-softmmu.mak
+++ b/default-configs/sh4eb-softmmu.mak
@@ -3,6 +3,7 @@
include pci.mak
include usb.mak
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_PTIMER=y
CONFIG_PFLASH_CFI02=y
CONFIG_SH4=y
diff --git a/default-configs/sparc64-softmmu.mak b/default-configs/sparc64-softmmu.mak
index c0cdd64..922db55 100644
--- a/default-configs/sparc64-softmmu.mak
+++ b/default-configs/sparc64-softmmu.mak
@@ -5,6 +5,7 @@ include usb.mak
CONFIG_M48T59=y
CONFIG_PTIMER=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_PARALLEL=y
CONFIG_PCKBD=y
CONFIG_FDC=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index 7f89503..f34bc80 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -15,6 +15,7 @@ CONFIG_IPMI_EXTERN=y
CONFIG_ISA_IPMI_KCS=y
CONFIG_ISA_IPMI_BT=y
CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
CONFIG_PARALLEL=y
CONFIG_I8254=y
CONFIG_PCSPK=y
diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
index 69a553c..6ea76fe 100644
--- a/hw/char/Makefile.objs
+++ b/hw/char/Makefile.objs
@@ -2,7 +2,8 @@ common-obj-$(CONFIG_IPACK) += ipoctal232.o
common-obj-$(CONFIG_ESCC) += escc.o
common-obj-$(CONFIG_PARALLEL) += parallel.o
common-obj-$(CONFIG_PL011) += pl011.o
-common-obj-$(CONFIG_SERIAL) += serial.o serial-isa.o
+common-obj-$(CONFIG_SERIAL) += serial.o
+common-obj-$(CONFIG_SERIAL_ISA) += serial-isa.o
common-obj-$(CONFIG_SERIAL_PCI) += serial-pci.o
common-obj-$(CONFIG_VIRTIO) += virtio-console.o
common-obj-$(CONFIG_XILINX) += xilinx_uartlite.o
--
2.7.4
* [Qemu-devel] [PATCHv2 2/3] Allow ISA bus to be configured out
From: David Gibson @ 2016-11-09 12:22 UTC (permalink / raw)
To: edgar.iglesias, michael, proljc, borntraeger, cornelia.huck,
kbastian, jcmvbkbc
Cc: agraf, mst, armbru, pbonzini, peter.maydell, veroniabahaa,
qemu-devel, David Gibson
In-Reply-To: <1478694124-15803-1-git-send-email-david@gibson.dropbear.id.au>
Currently, the code to handle the legacy ISA bus is always included in
QEMU. However, there are lots of platforms that don't include ISA legacy
devices, and quite a few that have never used ISA legacy devices at all.
This patch allows the ISA bus code to be disabled in the configuration for
platforms where it doesn't make sense.
For now, the default configs are adjusted to include ISA on all platforms
including PCI: anything with PCI can at least in principle add an i82378
PCI->ISA bridge. Also, CONFIG_IDE_CORE which is already in pci.mak
requires ISA support.
We also explicitly enable ISA on some other non-PCI platforms which include
ISA devices: moxie, sparc and unicore32. We may want to pare this down in
future.
The platforms that will lose ISA by default are: cris, lm32, microblazeel,
microblaze, openrisc, s390x, tricore, xtensaeb, xtensa. As far as I can
tell none of these ever used ISA.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
default-configs/moxie-softmmu.mak | 1 +
default-configs/pci.mak | 2 ++
default-configs/sparc-softmmu.mak | 1 +
default-configs/unicore32-softmmu.mak | 1 +
hw/isa/Makefile.objs | 2 +-
5 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/default-configs/moxie-softmmu.mak b/default-configs/moxie-softmmu.mak
index 7e22863..e00d099 100644
--- a/default-configs/moxie-softmmu.mak
+++ b/default-configs/moxie-softmmu.mak
@@ -1,5 +1,6 @@
# Default configuration for moxie-softmmu
+CONFIG_ISA_BUS=y
CONFIG_MC146818RTC=y
CONFIG_SERIAL=y
CONFIG_SERIAL_ISA=y
diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index d8d6548..60dc651 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -1,4 +1,6 @@
CONFIG_PCI=y
+# For now, CONFIG_IDE_CORE requires ISA, so we enable it here
+CONFIG_ISA_BUS=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO=y
CONFIG_USB_UHCI=y
diff --git a/default-configs/sparc-softmmu.mak b/default-configs/sparc-softmmu.mak
index ab796b3..004b0f4 100644
--- a/default-configs/sparc-softmmu.mak
+++ b/default-configs/sparc-softmmu.mak
@@ -1,5 +1,6 @@
# Default configuration for sparc-softmmu
+CONFIG_ISA_BUS=y
CONFIG_ECC=y
CONFIG_ESP=y
CONFIG_ESCC=y
diff --git a/default-configs/unicore32-softmmu.mak b/default-configs/unicore32-softmmu.mak
index de38577..5f6c4a8 100644
--- a/default-configs/unicore32-softmmu.mak
+++ b/default-configs/unicore32-softmmu.mak
@@ -1,4 +1,5 @@
# Default configuration for unicore32-softmmu
+CONFIG_ISA_BUS=y
CONFIG_PUV3=y
CONFIG_PTIMER=y
CONFIG_PCKBD=y
diff --git a/hw/isa/Makefile.objs b/hw/isa/Makefile.objs
index 9164556..fb37c55 100644
--- a/hw/isa/Makefile.objs
+++ b/hw/isa/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += isa-bus.o
+common-obj-$(CONFIG_ISA_BUS) += isa-bus.o
common-obj-$(CONFIG_APM) += apm.o
common-obj-$(CONFIG_I82378) += i82378.o
common-obj-$(CONFIG_PC87312) += pc87312.o
--
2.7.4
* [PATCH v9 8/8] thunderbolt: Adding maintainer entry
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
Add Amir Levy as maintainer for the Thunderbolt(TM) ICM driver.
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
---
MAINTAINERS | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 411e3b8..87763c44 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10652,7 +10652,13 @@ F: include/uapi/linux/stm.h
THUNDERBOLT DRIVER
M: Andreas Noever <andreas.noever@gmail.com>
S: Maintained
-F: drivers/thunderbolt/
+F: drivers/thunderbolt/*
+
+THUNDERBOLT ICM DRIVER
+M: Amir Levy <amir.jer.levy@intel.com>
+S: Maintained
+F: drivers/thunderbolt/icm/
+F: Documentation/thunderbolt/networking.txt
TI BQ27XXX POWER SUPPLY DRIVER
R: Andrew F. Davis <afd@ti.com>
--
2.7.4
* [PATCH v9 7/8] thunderbolt: Networking doc
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
Add Thunderbolt(TM) networking documentation.
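The documentation in this patch says that each pre-defined Thunderbolt
message has a callback that runs when the message is received. The following
is a minimal userspace sketch of that idea, not the driver's code. The command
names are taken from the "Message Definitions" list in the document; their
numeric values, the NHI_CMD_COUNT sentinel, the handler functions and
nhi_dispatch() are assumptions made for illustration only. In the kernel, this
mapping would normally be expressed through the Generic Netlink operations
table rather than a hand-written array.

```c
#include <stddef.h>

/*
 * Commands in the order the document lists them. The numeric values
 * are an assumption of this sketch.
 */
enum nhi_cmd {
	NHI_CMD_UNSPEC,
	NHI_CMD_SUBSCRIBE,
	NHI_CMD_UNSUBSCRIBE,
	NHI_CMD_QUERY_INFORMATION,
	NHI_CMD_MSG_TO_ICM,
	NHI_CMD_MSG_FROM_ICM,
	NHI_CMD_MAILBOX,
	NHI_CMD_APPROVE_TBT_NETWORKING,
	NHI_CMD_ICM_IN_SAFE_MODE,
	NHI_CMD_COUNT /* hypothetical sentinel, not part of the patch */
};

typedef int (*nhi_cmd_handler)(const void *payload, size_t len);

/* Hypothetical handlers; each just returns a distinct code. */
static int handle_subscribe(const void *payload, size_t len)
{
	(void)payload;
	(void)len;
	return 1;
}

static int handle_unsubscribe(const void *payload, size_t len)
{
	(void)payload;
	(void)len;
	return 2;
}

/* One slot per command; commands without a handler stay NULL. */
static const nhi_cmd_handler nhi_handlers[NHI_CMD_COUNT] = {
	[NHI_CMD_SUBSCRIBE] = handle_subscribe,
	[NHI_CMD_UNSUBSCRIBE] = handle_unsubscribe,
};

/* Run the callback for cmd; return -1 if cmd is unknown or unhandled. */
static int nhi_dispatch(int cmd, const void *payload, size_t len)
{
	if (cmd < 0 || cmd >= NHI_CMD_COUNT || !nhi_handlers[cmd])
		return -1;
	return nhi_handlers[cmd](payload, len);
}
```

Keeping the table indexed by command number means an unknown or unused
command (such as NHI_CMD_UNSPEC) is rejected by a single bounds and NULL check.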
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
---
Documentation/00-INDEX | 2 +
Documentation/thunderbolt/networking.txt | 132 +++++++++++++++++++++++++++++++
2 files changed, 134 insertions(+)
create mode 100644 Documentation/thunderbolt/networking.txt
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 3acc4f1..0239e68 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -440,6 +440,8 @@ this_cpu_ops.txt
- List rationale behind and the way to use this_cpu operations.
thermal/
- directory with information on managing thermal issues (CPU/temp)
+thunderbolt/
+ - directory with info regarding Thunderbolt.
trace/
- directory with info on tracing technologies within linux
unaligned-memory-access.txt
diff --git a/Documentation/thunderbolt/networking.txt b/Documentation/thunderbolt/networking.txt
new file mode 100644
index 0000000..88d1c12
--- /dev/null
+++ b/Documentation/thunderbolt/networking.txt
@@ -0,0 +1,132 @@
+Intel Thunderbolt(TM) Networking driver
+=======================================
+
+Copyright(c) 2013 - 2016 Intel Corporation.
+
+Contact Information:
+Intel Thunderbolt mailing list <thunderbolt-software@lists.01.org>
+Edited by Amir Levy <amir.jer.levy@intel.com>
+
+Overview
+========
+
+* The Thunderbolt Networking driver enables peer to peer networking on non-Apple
+ platforms running Linux.
+
+* The driver creates a virtual Ethernet device that enables computer to computer
+ communication over the Thunderbolt cable.
+
+* Using Thunderbolt Networking you can perform high speed file transfers between
+ computers, perform PC migrations and/or set up small workgroups with shared
+ storage without compromising any other Thunderbolt functionality.
+
+* The driver is located in drivers/thunderbolt/icm.
+
+* This driver will function only on non-Apple platforms with firmware based
+ Thunderbolt controllers that support Thunderbolt Networking.
+
+ +----------------+ +----------------+
+ |Host 1 | |Host 2 |
+ | | | |
+ | +-------+ | | +-------+ |
+ | |Network| | | |Network| |
+ | |Stack | | | |Stack | |
+ | +-------+ | | +-------+ |
+ | ^ | | ^ |
+ | | | | | |
+ | v | | v |
+ | +-----------+ | | +-----------+ |
+ | |Thunderbolt| | | |Thunderbolt| |
+ | |Networking | | | |Networking | |
+ | |Driver | | | |Driver | |
+ | +-----------+ | | +-----------+ |
+ | ^ | | ^ |
+ | | | | | |
+ | v | | v |
+ | +-----------+ | | +-----------+ |
+ | |Thunderbolt| | | |Thunderbolt| |
+ | |Controller |<-+------------+->|Controller | |
+ | |with ICM | | | |with ICM | |
+ | |enabled | | | |enabled | |
+ | +-----------+ | | +-----------+ |
+ +----------------+ +----------------+
+
+Files
+=====
+
+The following files are located in the drivers/thunderbolt/icm directory:
+
+- icm_nhi.c/h: These files allow communication with the firmware (Intel
+ Connection Manager) based controller. They also create an interface for
+ netlink communication with a user space daemon.
+
+- net.c/net.h: These files implement the 'eth' interface for the
+ Thunderbolt(TM) Networking.
+
+Interface to User Space
+=======================
+
+The interface to the user space module is implemented through a Generic Netlink.
+This is the communications protocol between the Thunderbolt driver and the user
+space application.
+
+Note that this interface mediates user space communication with ICM.
+(Existing Linux tools can be used to configure the network interface.)
+
+The Thunderbolt Daemon utilizes this interface to communicate with the driver.
+To be accessed by the user space module, both kernel and user space modules
+have to register with the same GENL_NAME.
+For the purpose of the Thunderbolt Network driver, "thunderbolt" is used.
+The registration is done at driver initialization time for all instances
+of the Thunderbolt controllers. The communication is carried through pre-defined
+Thunderbolt messages. Each specific message has a callback function that is
+called when the related message is received.
+
+Message Definitions:
+* NHI_CMD_UNSPEC: Not used.
+* NHI_CMD_SUBSCRIBE: Subscription request from daemon to driver to open the
+ communication channel.
+* NHI_CMD_UNSUBSCRIBE: Request from daemon to driver to unsubscribe and
+ to close communication channel.
+* NHI_CMD_QUERY_INFORMATION: Request information from the driver such as
+ driver version, FW version offset, number of ports in the controller
+ and DMA port.
+* NHI_CMD_MSG_TO_ICM: Message from user space module to FW.
+* NHI_CMD_MSG_FROM_ICM: Response from FW to user space module.
+* NHI_CMD_MAILBOX: Message that uses mailbox mechanism such as FW policy
+ changes or disconnect path.
+* NHI_CMD_APPROVE_TBT_NETWORKING: Request from user space module to FW to
+ establish path.
+* NHI_CMD_ICM_IN_SAFE_MODE: Indication that the FW has entered safe mode.
+
+Communication with Intel Connection Manager(ICM) Firmware
+=========================================================
+
+There are several circular buffers in Thunderbolt each using Direct Memory
+Access (DMA).
+
+Communication with ICM utilizes circular buffer ring #0. (The other rings are
+used for peer to peer communication, packet transmission and receiving).
+
+The driver allocates a shared memory that is physically mapped onto the DMA
+physical space at ring #0.
+For the software to communicate with the firmware, the driver sends a command
+in ring #0. The command contains a pre-defined field (PDF) value notifying the
+firmware that the driver is ready. To proceed, the driver must receive the
+appropriate PDF value in response from the firmware.
+
+Once the exchange is completed, messages can be sent to the firmware through
+the driver. Similarly, the firmware can now send notifications about hardware
+and firmware events.
+
+Information
+===========
+
+Mailing list:
+ thunderbolt-software@lists.01.org
+ Register at: https://lists.01.org/mailman/listinfo/thunderbolt-software
+ Archives at: https://lists.01.org/pipermail/thunderbolt-software/
+
+For additional information about Thunderbolt technology visit:
+ https://01.org/thunderbolt-sw
+ https://thunderbolttechnology.net/
--
2.7.4
* [PATCH v9 1/8] thunderbolt: Macro rename
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
This first patch updates the NHI Thunderbolt controller registers file to
reflect that it is not only for Cactus Ridge.
No functional change intended.
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
---
drivers/thunderbolt/nhi_regs.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/thunderbolt/nhi_regs.h b/drivers/thunderbolt/nhi_regs.h
index 86b996c..75cf069 100644
--- a/drivers/thunderbolt/nhi_regs.h
+++ b/drivers/thunderbolt/nhi_regs.h
@@ -1,11 +1,11 @@
/*
- * Thunderbolt Cactus Ridge driver - NHI registers
+ * Thunderbolt driver - NHI registers
*
* Copyright (c) 2014 Andreas Noever <andreas.noever@gmail.com>
*/
-#ifndef DSL3510_REGS_H_
-#define DSL3510_REGS_H_
+#ifndef NHI_REGS_H_
+#define NHI_REGS_H_
#include <linux/types.h>
--
2.7.4
* [PATCH v9 6/8] thunderbolt: Kconfig for Thunderbolt Networking
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
Update the Thunderbolt Kconfig description to add
Thunderbolt Networking as an option.
The menu item "Thunderbolt support" now offers:
"Apple Hardware Support" (existing)
and/or
"Thunderbolt Networking" (new)
You can choose the driver for your platform or build both drivers -
each driver will detect if it can run on the specific platform.
If the Thunderbolt Networking option is chosen, Thunderbolt Networking
will be enabled between non-Apple Linux systems, macOS systems and
Windows-based systems.
Thunderbolt Networking will not affect any other Thunderbolt feature that
was previously available to Linux users on either Apple or
non-Apple platforms.
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
---
drivers/thunderbolt/Kconfig | 27 +++++++++++++++++++++++----
drivers/thunderbolt/Makefile | 3 ++-
2 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/drivers/thunderbolt/Kconfig b/drivers/thunderbolt/Kconfig
index c121acc..376e5bb 100644
--- a/drivers/thunderbolt/Kconfig
+++ b/drivers/thunderbolt/Kconfig
@@ -1,13 +1,32 @@
-menuconfig THUNDERBOLT
- tristate "Thunderbolt support for Apple devices"
+config THUNDERBOLT
+ tristate "Thunderbolt support"
depends on PCI
select CRC32
help
- Cactus Ridge Thunderbolt Controller driver
+ Thunderbolt Controller driver
+
+if THUNDERBOLT
+
+config THUNDERBOLT_APPLE
+ tristate "Apple hardware support"
+ help
This driver is required if you want to hotplug Thunderbolt devices on
Apple hardware.
Device chaining is currently not supported.
- To compile this driver a module, choose M here. The module will be
+ To compile this driver as a module, choose M here. The module will be
called thunderbolt.
+
+config THUNDERBOLT_ICM
+ tristate "Thunderbolt Networking"
+ help
+ This driver is required if you want Thunderbolt Networking on
+ non-Apple hardware.
+ It creates a virtual Ethernet device that enables computer to
+ computer communication over a Thunderbolt cable.
+
+ To compile this driver as a module, choose M here. The module will be
+ called thunderbolt_icm.
+
+endif
diff --git a/drivers/thunderbolt/Makefile b/drivers/thunderbolt/Makefile
index 5d1053c..b6aa6a3 100644
--- a/drivers/thunderbolt/Makefile
+++ b/drivers/thunderbolt/Makefile
@@ -1,3 +1,4 @@
-obj-${CONFIG_THUNDERBOLT} := thunderbolt.o
+obj-${CONFIG_THUNDERBOLT_APPLE} := thunderbolt.o
thunderbolt-objs := nhi.o ctl.o tb.o switch.o cap.o path.o tunnel_pci.o eeprom.o
+obj-${CONFIG_THUNDERBOLT_ICM} += icm/
--
2.7.4
* [PATCH v9 5/8] thunderbolt: Networking transmit and receive
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
This patch provides the handling interface for sending and receiving
network packets between the hosts over the full communication route
(using the communication path established in the previous patch).
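The descriptor rings in the diff below advance their indices with
"(i + 1) & (TBT_NET_NUM_RX_BUFS - 1)", which only works when the ring size is
a power of two. A minimal standalone sketch of that arithmetic follows.
RING_SIZE, ring_next() and ring_distance() are hypothetical names used here
for illustration; the driver's own TBT_NUM_BUFS_BETWEEN macro is not shown in
this patch and may be defined differently.

```c
#include <stdint.h>

#define RING_SIZE 256u /* hypothetical; must be a power of two */

/* Advance an index by one slot, wrapping to 0 at RING_SIZE. */
static inline uint16_t ring_next(uint16_t i)
{
	return (uint16_t)((i + 1u) & (RING_SIZE - 1u));
}

/* Number of slots walked when going forward from 'from' to 'to'. */
static inline uint16_t ring_distance(uint16_t from, uint16_t to)
{
	return (uint16_t)((to - from) & (RING_SIZE - 1u));
}
```

Masking instead of using a modulo keeps the wrap branch-free, and the same
mask makes the distance correct even when 'to' has already wrapped past 'from'.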
The Thunderbolt Network driver interfaces the Linux network stack
and the hardware controller configuration to handle packet transmissions:
+----------------+ +----------------+
|Host 1 | |Host 2 |
| | | |
| +-------+ | | +-------+ |
| |Network| | | |Network| |
| |Stack | | | |Stack | |
| +-------+ | | +-------+ |
| ^ | | ^ |
| | | | | |
| v | | v |
| +-----------+ | | +-----------+ |
| |Thunderbolt| | | |Thunderbolt| |
| |Networking | | | |Networking | |
| |Driver | | | |Driver | |
| +-----------+ | | +-----------+ |
| ^ | | ^ |
| | | | | |
| v | | v |
| +-----------+ | | +-----------+ |
| |Thunderbolt| | | |Thunderbolt| |
| |Controller |<-+------------+->|Controller | |
| +-----------+ | | +-----------+ |
+----------------+ +----------------+
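On the receive side, a packet may span several frames, and each frame carries
a header with frame_size, frame_index, frame_id and frame_count. The sketch
below illustrates, in plain userspace C, how a continuation frame can be
matched against the first frame of a packet. It is a simplification under
stated assumptions: the fields are held in host byte order (the patch uses
little-endian fields), MAX_PACKET_SIZE is a hypothetical limit standing in for
the driver's MTU check, and a mismatching frame is simply rejected, whereas
tbt_net_check_frame() in the diff re-checks it as a possible first frame.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PACKET_SIZE (64u * 1024u) /* hypothetical limit */

/* Host-order view of the per-frame header fields. */
struct frame_hdr {
	uint32_t frame_size;  /* payload bytes in this frame */
	uint16_t frame_index; /* position of this frame in the packet */
	uint16_t frame_id;    /* shared by all frames of one packet */
	uint32_t frame_count; /* frames that make up the packet */
};

/* State remembered from the first frame of the packet in progress. */
struct reassembly {
	uint32_t count;      /* expected frames; 0 means no packet yet */
	uint16_t next_index; /* index the next frame must carry */
	uint16_t id;         /* frame_id of the packet in progress */
	uint32_t size;       /* payload bytes accepted so far */
};

/* Return true if the frame fits the packet being reassembled. */
static bool frame_accept(struct reassembly *st, const struct frame_hdr *h)
{
	if (h->frame_size == 0 || h->frame_size > MAX_PACKET_SIZE)
		return false;

	if (st->count == 0) {
		/* First frame: must start at index 0 and announce a count. */
		if (h->frame_count == 0 || h->frame_index != 0)
			return false;
		st->count = h->frame_count;
		st->id = h->frame_id;
		st->size = h->frame_size;
		st->next_index = 1;
		return true;
	}

	/* Continuation: count, index and id must match the first frame. */
	if (h->frame_count != st->count ||
	    h->frame_index != st->next_index ||
	    h->frame_id != st->id)
		return false;

	/* Reject packets whose accumulated size exceeds the limit. */
	if (h->frame_size > MAX_PACKET_SIZE - st->size)
		return false;

	st->size += h->frame_size;
	st->next_index++;
	return true;
}
```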
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
---
drivers/thunderbolt/icm/icm_nhi.c | 15 +
drivers/thunderbolt/icm/net.c | 1471 +++++++++++++++++++++++++++++++++++++
2 files changed, 1486 insertions(+)
diff --git a/drivers/thunderbolt/icm/icm_nhi.c b/drivers/thunderbolt/icm/icm_nhi.c
index edc910b..b1cc347 100644
--- a/drivers/thunderbolt/icm/icm_nhi.c
+++ b/drivers/thunderbolt/icm/icm_nhi.c
@@ -928,6 +928,7 @@ static irqreturn_t nhi_msi(int __always_unused irq, void *data)
{
struct tbt_nhi_ctxt *nhi_ctxt = data;
u32 isr0, isr1, imr0, imr1;
+ int i;
/* clear on read */
isr0 = ioread32(nhi_ctxt->iobase + REG_RING_NOTIFY_BASE);
@@ -950,6 +951,20 @@ static irqreturn_t nhi_msi(int __always_unused irq, void *data)
spin_unlock(&nhi_ctxt->lock);
+ for (i = 0; i < nhi_ctxt->num_ports; ++i) {
+ struct net_device *net_dev =
+ nhi_ctxt->net_devices[i].net_dev;
+ if (net_dev) {
+ u8 path = PATH_FROM_PORT(nhi_ctxt->num_paths, i);
+
+ if (isr0 & REG_RING_INT_RX_PROCESSED(
+ path, nhi_ctxt->num_paths))
+ tbt_net_rx_msi(net_dev);
+ if (isr0 & REG_RING_INT_TX_PROCESSED(path))
+ tbt_net_tx_msi(net_dev);
+ }
+ }
+
if (isr0 & REG_RING_INT_RX_PROCESSED(TBT_ICM_RING_NUM,
nhi_ctxt->num_paths))
schedule_work(&nhi_ctxt->icm_msgs_work);
diff --git a/drivers/thunderbolt/icm/net.c b/drivers/thunderbolt/icm/net.c
index beeafb3..cf985dd 100644
--- a/drivers/thunderbolt/icm/net.c
+++ b/drivers/thunderbolt/icm/net.c
@@ -124,6 +124,17 @@ struct approve_inter_domain_connection_cmd {
};
+struct tbt_frame_header {
+ /* size of the data with the frame */
+ __le32 frame_size;
+ /* running index on the frames */
+ __le16 frame_index;
+ /* ID of the frame to match frames to specific packet */
+ __le16 frame_id;
+ /* how many frames assembles a full packet */
+ __le32 frame_count;
+};
+
enum neg_event {
RECEIVE_LOGOUT = NUM_MEDIUM_STATUSES,
RECEIVE_LOGIN_RESPONSE,
@@ -131,15 +142,81 @@ enum neg_event {
NUM_NEG_EVENTS
};
+enum frame_status {
+ GOOD_FRAME,
+ GOOD_AS_FIRST_FRAME,
+ GOOD_AS_FIRST_MULTICAST_FRAME,
+ FRAME_NOT_READY,
+ FRAME_ERROR,
+};
+
+enum packet_filter {
+ /* all multicast MAC addresses */
+ PACKET_TYPE_ALL_MULTICAST,
+ /* all types of MAC addresses: multicast, unicast and broadcast */
+ PACKET_TYPE_PROMISCUOUS,
+ /* all unicast MAC addresses */
+ PACKET_TYPE_UNICAST_PROMISCUOUS,
+};
+
enum disconnect_path_stage {
STAGE_1 = BIT(0),
STAGE_2 = BIT(1)
};
+struct tbt_net_stats {
+ u64 tx_packets;
+ u64 tx_bytes;
+ u64 tx_errors;
+ u64 rx_packets;
+ u64 rx_bytes;
+ u64 rx_length_errors;
+ u64 rx_over_errors;
+ u64 rx_crc_errors;
+ u64 rx_missed_errors;
+ u64 multicast;
+};
+
+static const char tbt_net_gstrings_stats[][ETH_GSTRING_LEN] = {
+ "tx_packets",
+ "tx_bytes",
+ "tx_errors",
+ "rx_packets",
+ "rx_bytes",
+ "rx_length_errors",
+ "rx_over_errors",
+ "rx_crc_errors",
+ "rx_missed_errors",
+ "multicast",
+};
+
+struct tbt_buffer {
+ dma_addr_t dma;
+ union {
+ struct tbt_frame_header *hdr;
+ struct page *page;
+ };
+ u32 page_offset;
+};
+
+struct tbt_desc_ring {
+ /* pointer to the descriptor ring memory */
+ struct tbt_buf_desc *desc;
+ /* physical address of the descriptor ring */
+ dma_addr_t dma;
+ /* array of buffer structs */
+ struct tbt_buffer *buffers;
+ /* last descriptor that was associated with a buffer */
+ u16 last_allocated;
+ /* next descriptor to check for DD status bit */
+ u16 next_to_clean;
+};
+
/**
* struct tbt_port - the basic tbt_port structure
* @tbt_nhi_ctxt: context of the nhi controller.
* @net_dev: networking device object.
+ * @napi: network API
* @login_retry_work: work queue for sending login requests.
* @login_response_work: work queue for sending login responses.
* @work_struct logout_work: work queue for sending logout requests.
@@ -155,6 +232,11 @@ enum disconnect_path_stage {
* @login_retry_count: counts number of login retries sent.
* @local_depth: depth of the remote peer in the chain.
* @transmit_path: routing parameter for the icm.
+ * @tx_ring: transmit ring from where the packets are sent.
+ * @rx_ring: receive ring where the packets are received.
+ * @stats: network statistics of the rx/tx packets.
+ * @packet_filters: defines filters for the received packets.
+ * @multicast_hash_table: hash table of multicast addresses.
* @frame_id: counting ID of frames.
* @num: port number.
* @local_path: routing parameter for the icm.
@@ -164,6 +246,7 @@ enum disconnect_path_stage {
struct tbt_port {
struct tbt_nhi_ctxt *nhi_ctxt;
struct net_device *net_dev;
+ struct napi_struct napi;
struct delayed_work login_retry_work;
struct work_struct login_response_work;
struct work_struct logout_work;
@@ -179,6 +262,17 @@ struct tbt_port {
u8 login_retry_count;
u8 local_depth;
u8 transmit_path;
+ struct tbt_desc_ring tx_ring ____cacheline_aligned_in_smp;
+ struct tbt_desc_ring rx_ring;
+ struct tbt_net_stats stats;
+ u32 packet_filters;
+ /*
+ * hash table of 1024 boolean entries with hashing of
+ * the multicast address
+ */
+ u32 multicast_hash_table[DIV_ROUND_UP(
+ TBT_NET_MULTICAST_HASH_TABLE_SIZE,
+ BITS_PER_U32)];
u16 frame_id;
u8 num;
u8 local_path;
@@ -225,6 +319,8 @@ static void tbt_net_tear_down(struct net_device *net_dev, bool send_logout)
(port->local_path * REG_OPTS_STEP);
u32 rx_reg_val = ioread32(rx_reg) & ~REG_OPTS_E2E_EN;
+ napi_disable(&port->napi);
+
tx_reg = iobase + REG_TX_OPTIONS_BASE +
(port->local_path * REG_OPTS_STEP);
tx_reg_val = ioread32(tx_reg) & ~REG_OPTS_E2E_EN;
@@ -266,8 +362,1336 @@ static void tbt_net_tear_down(struct net_device *net_dev, bool send_logout)
port->nhi_ctxt->num_paths);
spin_unlock_irqrestore(&port->nhi_ctxt->lock, flags);
}
+
+ port->rx_ring.next_to_clean = 0;
+ port->rx_ring.last_allocated = TBT_NET_NUM_RX_BUFS - 1;
+
+}
+
+void tbt_net_tx_msi(struct net_device *net_dev)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ void __iomem *iobase = port->nhi_ctxt->iobase;
+ u32 prod_cons, prod, cons;
+
+ prod_cons = ioread32(TBT_RING_CONS_PROD_REG(iobase, REG_TX_RING_BASE,
+ port->local_path));
+ prod = TBT_REG_RING_PROD_EXTRACT(prod_cons);
+ cons = TBT_REG_RING_CONS_EXTRACT(prod_cons);
+ if (prod >= TBT_NET_NUM_TX_BUFS || cons >= TBT_NET_NUM_TX_BUFS)
+ return;
+
+ if (TBT_NUM_BUFS_BETWEEN(prod, cons, TBT_NET_NUM_TX_BUFS) >=
+ TX_WAKE_THRESHOLD) {
+ netif_wake_queue(port->net_dev);
+ } else {
+ spin_lock(&port->nhi_ctxt->lock);
+ /* enable TX interrupt */
+ RING_INT_ENABLE_TX(iobase, port->local_path);
+ spin_unlock(&port->nhi_ctxt->lock);
+ }
+}
+
+static irqreturn_t tbt_net_tx_msix(int __always_unused irq, void *data)
+{
+ struct tbt_port *port = data;
+ void __iomem *iobase = port->nhi_ctxt->iobase;
+ u32 prod_cons, prod, cons;
+
+ prod_cons = ioread32(TBT_RING_CONS_PROD_REG(iobase,
+ REG_TX_RING_BASE,
+ port->local_path));
+ prod = TBT_REG_RING_PROD_EXTRACT(prod_cons);
+ cons = TBT_REG_RING_CONS_EXTRACT(prod_cons);
+ if (prod < TBT_NET_NUM_TX_BUFS && cons < TBT_NET_NUM_TX_BUFS &&
+ TBT_NUM_BUFS_BETWEEN(prod, cons, TBT_NET_NUM_TX_BUFS) >=
+ TX_WAKE_THRESHOLD) {
+ spin_lock(&port->nhi_ctxt->lock);
+ /* disable TX interrupt */
+ RING_INT_DISABLE_TX(iobase, port->local_path);
+ spin_unlock(&port->nhi_ctxt->lock);
+
+ netif_wake_queue(port->net_dev);
+ }
+
+ return IRQ_HANDLED;
+}
+
+void tbt_net_rx_msi(struct net_device *net_dev)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ napi_schedule_irqoff(&port->napi);
+}
+
+static irqreturn_t tbt_net_rx_msix(int __always_unused irq, void *data)
+{
+ struct tbt_port *port = data;
+
+ if (likely(napi_schedule_prep(&port->napi))) {
+ struct tbt_nhi_ctxt *nhi_ctx = port->nhi_ctxt;
+
+ spin_lock(&nhi_ctx->lock);
+ /* disable RX interrupt */
+ RING_INT_DISABLE_RX(nhi_ctx->iobase, port->local_path,
+ nhi_ctx->num_paths);
+ spin_unlock(&nhi_ctx->lock);
+
+ __napi_schedule_irqoff(&port->napi);
+ }
+
+ return IRQ_HANDLED;
+}
+
+static void tbt_net_pull_tail(struct sk_buff *skb)
+{
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+ unsigned int pull_len;
+ unsigned char *va;
+
+ /*
+ * it is valid to use page_address instead of kmap since we are
+ * working with pages allocated out of the lomem pool
+ */
+ va = skb_frag_address(frag);
+
+ pull_len = eth_get_headlen(va, TBT_NET_RX_HDR_SIZE);
+
+ /* align pull length to size of long to optimize memcpy performance */
+ skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
+
+ /* update all of the pointers */
+ skb_frag_size_sub(frag, pull_len);
+ frag->page_offset += pull_len;
+ skb->data_len -= pull_len;
+ skb->tail += pull_len;
+}
+
+static inline bool tbt_net_alloc_mapped_page(struct device *dev,
+ struct tbt_buffer *buf, gfp_t gfp)
+{
+ if (!buf->page) {
+ buf->page = alloc_page(gfp | __GFP_COLD);
+ if (unlikely(!buf->page))
+ return false;
+
+ buf->dma = dma_map_page(dev, buf->page, 0, PAGE_SIZE,
+ DMA_FROM_DEVICE);
+ if (dma_mapping_error(dev, buf->dma)) {
+ __free_page(buf->page);
+ buf->page = NULL;
+ return false;
+ }
+ buf->page_offset = 0;
+ }
+ return true;
+}
+
+static bool tbt_net_alloc_rx_buffers(struct device *dev,
+ struct tbt_desc_ring *rx_ring,
+ u16 cleaned_count, void __iomem *reg,
+ gfp_t gfp)
+{
+ u16 i = (rx_ring->last_allocated + 1) & (TBT_NET_NUM_RX_BUFS - 1);
+ bool res = false;
+
+ while (cleaned_count--) {
+ struct tbt_buf_desc *desc = &rx_ring->desc[i];
+ struct tbt_buffer *buf = &rx_ring->buffers[i];
+
+ /* making sure next_to_clean won't get old buffer */
+ desc->attributes = cpu_to_le32(DESC_ATTR_REQ_STS |
+ DESC_ATTR_INT_EN);
+ if (tbt_net_alloc_mapped_page(dev, buf, gfp)) {
+ res = true;
+ rx_ring->last_allocated = i;
+ i = (i + 1) & (TBT_NET_NUM_RX_BUFS - 1);
+ desc->phys = cpu_to_le64(buf->dma + buf->page_offset);
+ } else {
+ break;
+ }
+ }
+
+ if (res) {
+ iowrite32((rx_ring->last_allocated << REG_RING_CONS_SHIFT) &
+ REG_RING_CONS_MASK, reg);
+ }
+
+ return res;
+}
+
+static inline bool tbt_net_multicast_mac_set(const u32 *multicast_hash_table,
+ const u8 *ether_addr)
+{
+ u16 hash_val = TBT_NET_ETHER_ADDR_HASH(ether_addr);
+
+ return !!(multicast_hash_table[hash_val / BITS_PER_U32] &
+ BIT(hash_val % BITS_PER_U32));
+}
+
+static enum frame_status tbt_net_check_frame(struct tbt_port *port,
+ u16 frame_num, u32 *count,
+ u16 index, u16 *id, u32 *size)
+{
+ struct tbt_desc_ring *rx_ring = &port->rx_ring;
+ __le32 desc_attr = rx_ring->desc[frame_num].attributes;
+ enum frame_status res = GOOD_AS_FIRST_FRAME;
+ u32 len, frame_count, frame_size;
+ struct tbt_frame_header *hdr;
+
+ if (!(desc_attr & cpu_to_le32(DESC_ATTR_DESC_DONE)))
+ return FRAME_NOT_READY;
+
+ rmb(); /* read other fields from desc after checking DD */
+
+ if (unlikely(desc_attr & cpu_to_le32(DESC_ATTR_RX_CRC_ERR))) {
+ ++port->stats.rx_crc_errors;
+ goto err;
+ } else if (unlikely(desc_attr &
+ cpu_to_le32(DESC_ATTR_RX_BUF_OVRN_ERR))) {
+ ++port->stats.rx_over_errors;
+ goto err;
+ }
+
+ len = (le32_to_cpu(desc_attr) & DESC_ATTR_LEN_MASK)
+ >> DESC_ATTR_LEN_SHIFT;
+ if (len == 0)
+ len = TBT_RING_MAX_FRAME_SIZE;
+ /* should be greater than just header i.e. contains data */
+ if (unlikely(len <= sizeof(struct tbt_frame_header))) {
+ ++port->stats.rx_length_errors;
+ goto err;
+ }
+
+ prefetchw(rx_ring->buffers[frame_num].page);
+ hdr = page_address(rx_ring->buffers[frame_num].page) +
+ rx_ring->buffers[frame_num].page_offset;
+ /* prefetch first cache line of first page */
+ prefetch(hdr);
+
+ /* we are reusing so sync this buffer for CPU use */
+ dma_sync_single_range_for_cpu(&port->nhi_ctxt->pdev->dev,
+ rx_ring->buffers[frame_num].dma,
+ rx_ring->buffers[frame_num].page_offset,
+ TBT_RING_MAX_FRAME_SIZE,
+ DMA_FROM_DEVICE);
+
+ frame_count = le32_to_cpu(hdr->frame_count);
+ frame_size = le32_to_cpu(hdr->frame_size);
+
+ if (unlikely((frame_size > len - sizeof(struct tbt_frame_header)) ||
+ (frame_size == 0))) {
+ ++port->stats.rx_length_errors;
+ goto err;
+ }
+ /*
+ * In case we're in the middle of packet, validate the frame header
+ * based on first fragment of the packet
+ */
+ if (*count) {
+ /* check the frame count fits the count field */
+ if (frame_count != *count) {
+ ++port->stats.rx_length_errors;
+ goto check_as_first;
+ }
+
+ /*
+ * check the frame identifiers are incremented correctly,
+ * and id is matching
+ */
+ if ((le16_to_cpu(hdr->frame_index) != index) ||
+ (le16_to_cpu(hdr->frame_id) != *id)) {
+ ++port->stats.rx_missed_errors;
+ goto check_as_first;
+ }
+
+ *size += frame_size;
+ if (*size > TBT_NET_MTU) {
+ ++port->stats.rx_length_errors;
+ goto err;
+ }
+ res = GOOD_FRAME;
+ } else { /* start of packet, validate the frame header */
+ const u8 *addr;
+
+check_as_first:
+ rx_ring->next_to_clean = frame_num;
+
+ /* validate the first packet has a valid frame count */
+ if (unlikely(frame_count == 0 ||
+ frame_count > (TBT_NET_NUM_RX_BUFS / 4))) {
+ ++port->stats.rx_length_errors;
+ goto err;
+ }
+
+ /* validate the first packet has a valid frame index */
+ if (hdr->frame_index != 0) {
+ ++port->stats.rx_missed_errors;
+ goto err;
+ }
+
+ BUILD_BUG_ON(TBT_NET_RX_HDR_SIZE > TBT_RING_MAX_FRM_DATA_SZ);
+ if ((frame_count > 1) && (frame_size < TBT_NET_RX_HDR_SIZE)) {
+ ++port->stats.rx_length_errors;
+ goto err;
+ }
+
+ addr = (u8 *)(hdr + 1);
+
+ /* check the packet can go through the filter */
+ if (is_multicast_ether_addr(addr)) {
+ if (!is_broadcast_ether_addr(addr)) {
+ if ((port->packet_filters &
+ (BIT(PACKET_TYPE_PROMISCUOUS) |
+ BIT(PACKET_TYPE_ALL_MULTICAST))) ||
+ tbt_net_multicast_mac_set(
+ port->multicast_hash_table, addr))
+ res = GOOD_AS_FIRST_MULTICAST_FRAME;
+ else
+ goto err;
+ }
+ } else if (!(port->packet_filters &
+ (BIT(PACKET_TYPE_PROMISCUOUS) |
+ BIT(PACKET_TYPE_UNICAST_PROMISCUOUS))) &&
+ !ether_addr_equal(port->net_dev->dev_addr, addr)) {
+ goto err;
+ }
+
+ *size = frame_size;
+ *count = frame_count;
+ *id = le16_to_cpu(hdr->frame_id);
+ }
+
+#if (PREFETCH_STRIDE < 128)
+ prefetch((u8 *)hdr + PREFETCH_STRIDE);
+#endif
+
+ return res;
+
+err:
+ rx_ring->next_to_clean = (frame_num + 1) & (TBT_NET_NUM_RX_BUFS - 1);
+ return FRAME_ERROR;
+}
+
+static inline unsigned int tbt_net_max_frm_data_size(
+ __maybe_unused u32 frame_size)
+{
+#if (TBT_NUM_FRAMES_PER_PAGE > 1)
+ return ALIGN(frame_size + sizeof(struct tbt_frame_header),
+ L1_CACHE_BYTES) -
+ sizeof(struct tbt_frame_header);
+#else
+ return TBT_RING_MAX_FRM_DATA_SZ;
+#endif
+}
+
+static int tbt_net_poll(struct napi_struct *napi, int budget)
+{
+ struct tbt_port *port = container_of(napi, struct tbt_port, napi);
+ void __iomem *reg = TBT_RING_CONS_PROD_REG(port->nhi_ctxt->iobase,
+ REG_RX_RING_BASE,
+ port->local_path);
+ struct tbt_desc_ring *rx_ring = &port->rx_ring;
+ u16 cleaned_count = TBT_NUM_BUFS_BETWEEN(rx_ring->last_allocated,
+ rx_ring->next_to_clean,
+ TBT_NET_NUM_RX_BUFS);
+ unsigned long flags;
+ int rx_packets = 0;
+
+loop:
+ while (likely(rx_packets < budget)) {
+ struct sk_buff *skb;
+ enum frame_status status;
+ bool multicast = false;
+ u32 frame_count = 0, size;
+ u16 j, frame_id;
+ int i;
+
+ /*
+ * return some buffers to hardware, one at a time is too slow
+ * so allocate TBT_NET_RX_BUFFER_WRITE buffers at the same time
+ */
+ if (cleaned_count >= TBT_NET_RX_BUFFER_WRITE) {
+ tbt_net_alloc_rx_buffers(&port->nhi_ctxt->pdev->dev,
+ rx_ring, cleaned_count, reg,
+ GFP_ATOMIC);
+ cleaned_count = 0;
+ }
+
+ status = tbt_net_check_frame(port, rx_ring->next_to_clean,
+ &frame_count, 0, &frame_id,
+ &size);
+ if (status == FRAME_NOT_READY)
+ break;
+
+ if (status == FRAME_ERROR) {
+ ++cleaned_count;
+ continue;
+ }
+
+ multicast = (status == GOOD_AS_FIRST_MULTICAST_FRAME);
+
+ /*
+ * i counts the frames received so far, up to frame_count;
+ * j cyclically walks the ring locations, starting from the
+ * frame after next_to_clean
+ */
+ j = (rx_ring->next_to_clean + 1);
+ j &= (TBT_NET_NUM_RX_BUFS - 1);
+ for (i = 1; i < frame_count; ++i) {
+ status = tbt_net_check_frame(port, j, &frame_count, i,
+ &frame_id, &size);
+ if (status == FRAME_NOT_READY)
+ goto out;
+
+ j = (j + 1) & (TBT_NET_NUM_RX_BUFS - 1);
+
+ /* if a new frame is found, start over */
+ if (status == GOOD_AS_FIRST_FRAME ||
+ status == GOOD_AS_FIRST_MULTICAST_FRAME) {
+ multicast = (status ==
+ GOOD_AS_FIRST_MULTICAST_FRAME);
+ cleaned_count += i;
+ i = 0;
+ continue;
+ }
+
+ if (status == FRAME_ERROR) {
+ cleaned_count += (i + 1);
+ goto loop;
+ }
+ }
+
+ /* allocate a skb to store the frags */
+ skb = netdev_alloc_skb_ip_align(port->net_dev,
+ TBT_NET_RX_HDR_SIZE);
+ if (unlikely(!skb))
+ break;
+
+ /*
+ * we will be copying header into skb->data in
+ * tbt_net_pull_tail so it is in our interest to prefetch
+ * it now to avoid a possible cache miss
+ */
+ prefetchw(skb->data);
+
+ /*
+ * if the overall packet size does not exceed
+ * TBT_NET_RX_HDR_SIZE (the small buffer allocated as the
+ * base for RX), copy the data straight into the skb
+ */
+ if (size <= TBT_NET_RX_HDR_SIZE) {
+ struct tbt_buffer *buf =
+ &(rx_ring->buffers[rx_ring->next_to_clean]);
+ u8 *va = page_address(buf->page) + buf->page_offset +
+ sizeof(struct tbt_frame_header);
+
+ memcpy(__skb_put(skb, size), va,
+ ALIGN(size, sizeof(long)));
+
+ /*
+ * Reuse buffer as-is,
+ * just make sure it is local
+ * Access to local memory is faster than non-local
+ * memory so let's reuse.
+ * If not local, let's free it and reallocate later.
+ */
+ if (likely(page_to_nid(buf->page) == numa_node_id()))
+ /* sync the buffer for use by the device */
+ dma_sync_single_range_for_device(
+ &port->nhi_ctxt->pdev->dev,
+ buf->dma, buf->page_offset,
+ TBT_RING_MAX_FRAME_SIZE,
+ DMA_FROM_DEVICE);
+ else {
+ /* this page cannot be reused so discard it */
+ put_page(buf->page);
+ buf->page = NULL;
+ dma_unmap_page(&port->nhi_ctxt->pdev->dev,
+ buf->dma, PAGE_SIZE,
+ DMA_FROM_DEVICE);
+ }
+ rx_ring->next_to_clean = (rx_ring->next_to_clean + 1) &
+ (TBT_NET_NUM_RX_BUFS - 1);
+ } else {
+ for (i = 0; i < frame_count; ++i) {
+ struct tbt_buffer *buf = &(rx_ring->buffers[
+ rx_ring->next_to_clean]);
+ struct tbt_frame_header *hdr =
+ page_address(buf->page) +
+ buf->page_offset;
+ u32 frm_size = le32_to_cpu(hdr->frame_size);
+
+ unsigned int truesize =
+ tbt_net_max_frm_data_size(frm_size);
+
+ /* add frame to skb struct */
+ skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
+ buf->page,
+ sizeof(struct tbt_frame_header)
+ + buf->page_offset,
+ frm_size, truesize);
+
+#if (TBT_NUM_FRAMES_PER_PAGE > 1)
+ /* move offset up to the next cache line */
+ buf->page_offset += (truesize +
+ sizeof(struct tbt_frame_header));
+
+ /*
+ * we can reuse buffer if there is space
+ * available and it is local
+ */
+ if (page_to_nid(buf->page) == numa_node_id()
+ && buf->page_offset <=
+ PAGE_SIZE - TBT_RING_MAX_FRAME_SIZE) {
+ /*
+ * bump ref count on page before
+ * it is given to the stack
+ */
+ get_page(buf->page);
+ /*
+ * sync the buffer for use by the
+ * device
+ */
+ dma_sync_single_range_for_device(
+ &port->nhi_ctxt->pdev->dev,
+ buf->dma, buf->page_offset,
+ TBT_RING_MAX_FRAME_SIZE,
+ DMA_FROM_DEVICE);
+ } else
+#endif
+ {
+ buf->page = NULL;
+ dma_unmap_page(
+ &port->nhi_ctxt->pdev->dev,
+ buf->dma, PAGE_SIZE,
+ DMA_FROM_DEVICE);
+ }
+
+ rx_ring->next_to_clean =
+ (rx_ring->next_to_clean + 1) &
+ (TBT_NET_NUM_RX_BUFS - 1);
+ }
+ /*
+ * place header from the first
+ * fragment in linear portion of buffer
+ */
+ tbt_net_pull_tail(skb);
+ }
+
+ /*
+ * The Thunderbolt medium doesn't have any restriction on
+ * minimum frame size, thus doesn't need any padding in
+ * transmit.
+ * The network stack accepts runt Ethernet frames,
+ * therefore no padding is needed on receive either.
+ */
+
+ skb->protocol = eth_type_trans(skb, port->net_dev);
+ napi_gro_receive(&port->napi, skb);
+
+ ++rx_packets;
+ port->stats.rx_bytes += size;
+ if (multicast)
+ ++port->stats.multicast;
+ cleaned_count += frame_count;
+ }
+
+out:
+ port->stats.rx_packets += rx_packets;
+
+ if (cleaned_count)
+ tbt_net_alloc_rx_buffers(&port->nhi_ctxt->pdev->dev,
+ rx_ring, cleaned_count, reg,
+ GFP_ATOMIC);
+
+ /* If not all work was completed, return budget and keep polling */
+ if (rx_packets >= budget)
+ return budget;
+
+ /* Work is done so exit the polling mode and re-enable the interrupt */
+ napi_complete(napi);
+
+ spin_lock_irqsave(&port->nhi_ctxt->lock, flags);
+ /* enable RX interrupt */
+ RING_INT_ENABLE_RX(port->nhi_ctxt->iobase, port->local_path,
+ port->nhi_ctxt->num_paths);
+
+ spin_unlock_irqrestore(&port->nhi_ctxt->lock, flags);
+
+ return 0;
+}
+
+static int tbt_net_open(struct net_device *net_dev)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ int res = 0;
+ int i, j;
+
+ /* change link state to off until path establishment finishes */
+ netif_carrier_off(net_dev);
+
+ /*
+ * if MSI-X entries were allocated earlier,
+ * request an IRQ for each of them now:
+ * 2=tx data port 0,
+ * 3=rx data port 0,
+ * 4=tx data port 1,
+ * 5=rx data port 1,
+ * ...
+ * if not, if msi is used, nhi_msi will handle icm & data paths
+ */
+ if (port->nhi_ctxt->msix_entries) {
+ char name[] = "tbt-net-xx-xx";
+
+ scnprintf(name, sizeof(name), "tbt-net-rx-%02u", port->num);
+ res = devm_request_irq(&port->nhi_ctxt->pdev->dev,
+ port->nhi_ctxt->msix_entries[3+(port->num*2)].vector,
+ tbt_net_rx_msix, 0, name, port);
+ if (res) {
+ netif_err(port, ifup, net_dev, "request_irq %s failed %d\n",
+ name, res);
+ goto out;
+ }
+ name[8] = 't';
+ res = devm_request_irq(&port->nhi_ctxt->pdev->dev,
+ port->nhi_ctxt->msix_entries[2+(port->num*2)].vector,
+ tbt_net_tx_msix, 0, name, port);
+ if (res) {
+ netif_err(port, ifup, net_dev, "request_irq %s failed %d\n",
+ name, res);
+ goto request_irq_failure;
+ }
+ }
+ /*
+ * Verify that all buffer sizes are well defined,
+ * starting with the check that at least one frame
+ * fits in a page without crossing the page boundary
+ */
+ BUILD_BUG_ON(TBT_NUM_FRAMES_PER_PAGE < 1);
+ /*
+ * Make sure there is enough room to hold
+ * 3 max MTU packets for TX
+ */
+ BUILD_BUG_ON((TBT_NET_NUM_TX_BUFS * TBT_RING_MAX_FRAME_SIZE) <
+ (TBT_NET_MTU * 3));
+ /* make sure the number of TX Buffers is power of 2 */
+ BUILD_BUG_ON_NOT_POWER_OF_2(TBT_NET_NUM_TX_BUFS);
+ /*
+ * Make sure there is enough room to hold
+ * 3 max MTU packets for RX
+ */
+ BUILD_BUG_ON((TBT_NET_NUM_RX_BUFS * TBT_RING_MAX_FRAME_SIZE) <
+ (TBT_NET_MTU * 3));
+ /* make sure the number of RX Buffers is power of 2 */
+ BUILD_BUG_ON_NOT_POWER_OF_2(TBT_NET_NUM_RX_BUFS);
+
+ port->rx_ring.last_allocated = TBT_NET_NUM_RX_BUFS - 1;
+
+ port->tx_ring.buffers = vzalloc(TBT_NET_NUM_TX_BUFS *
+ sizeof(struct tbt_buffer));
+ if (!port->tx_ring.buffers)
+ goto ring_alloc_failure;
+ port->rx_ring.buffers = vzalloc(TBT_NET_NUM_RX_BUFS *
+ sizeof(struct tbt_buffer));
+ if (!port->rx_ring.buffers)
+ goto ring_alloc_failure;
+
+ /*
+ * Allocate TX and RX descriptors
+ * if the total size is less than a page, do a central allocation
+ * Otherwise, split TX and RX
+ */
+ if (TBT_NET_SIZE_TOTAL_DESCS <= PAGE_SIZE) {
+ port->tx_ring.desc = dmam_alloc_coherent(
+ &port->nhi_ctxt->pdev->dev,
+ TBT_NET_SIZE_TOTAL_DESCS,
+ &port->tx_ring.dma,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!port->tx_ring.desc)
+ goto ring_alloc_failure;
+ /* RX starts where TX finishes */
+ port->rx_ring.desc = &port->tx_ring.desc[TBT_NET_NUM_TX_BUFS];
+ port->rx_ring.dma = port->tx_ring.dma +
+ (TBT_NET_NUM_TX_BUFS * sizeof(struct tbt_buf_desc));
+ } else {
+ port->tx_ring.desc = dmam_alloc_coherent(
+ &port->nhi_ctxt->pdev->dev,
+ TBT_NET_NUM_TX_BUFS *
+ sizeof(struct tbt_buf_desc),
+ &port->tx_ring.dma,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!port->tx_ring.desc)
+ goto ring_alloc_failure;
+ port->rx_ring.desc = dmam_alloc_coherent(
+ &port->nhi_ctxt->pdev->dev,
+ TBT_NET_NUM_RX_BUFS *
+ sizeof(struct tbt_buf_desc),
+ &port->rx_ring.dma,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!port->rx_ring.desc)
+ goto rx_desc_alloc_failure;
+ }
+
+ /* allocate TX buffers and configure the descriptors */
+ for (i = 0; i < TBT_NET_NUM_TX_BUFS; i++) {
+ port->tx_ring.buffers[i].hdr = dma_alloc_coherent(
+ &port->nhi_ctxt->pdev->dev,
+ TBT_NUM_FRAMES_PER_PAGE * TBT_RING_MAX_FRAME_SIZE,
+ &port->tx_ring.buffers[i].dma,
+ GFP_KERNEL);
+ if (!port->tx_ring.buffers[i].hdr)
+ goto buffers_alloc_failure;
+
+ port->tx_ring.desc[i].phys =
+ cpu_to_le64(port->tx_ring.buffers[i].dma);
+ port->tx_ring.desc[i].attributes =
+ cpu_to_le32(DESC_ATTR_REQ_STS |
+ TBT_NET_DESC_ATTR_SOF_EOF);
+
+ /*
+ * In case the page is bigger than the frame size,
+ * make the next buffer descriptor point to
+ * the next frame's memory address within the page
+ */
+ for (i++, j = 1; (i < TBT_NET_NUM_TX_BUFS) &&
+ (j < TBT_NUM_FRAMES_PER_PAGE); i++, j++) {
+ port->tx_ring.buffers[i].dma =
+ port->tx_ring.buffers[i - 1].dma +
+ TBT_RING_MAX_FRAME_SIZE;
+ port->tx_ring.buffers[i].hdr =
+ (void *)(port->tx_ring.buffers[i - 1].hdr) +
+ TBT_RING_MAX_FRAME_SIZE;
+ /* advance the offset by TBT_RING_MAX_FRAME_SIZE */
+ port->tx_ring.buffers[i].page_offset =
+ port->tx_ring.buffers[i - 1].page_offset +
+ TBT_RING_MAX_FRAME_SIZE;
+ port->tx_ring.desc[i].phys =
+ cpu_to_le64(port->tx_ring.buffers[i].dma);
+ port->tx_ring.desc[i].attributes =
+ cpu_to_le32(DESC_ATTR_REQ_STS |
+ TBT_NET_DESC_ATTR_SOF_EOF);
+ }
+ i--;
+ }
+
+ port->negotiation_status =
+ BIT(port->nhi_ctxt->net_devices[port->num].medium_sts);
+ if (port->negotiation_status == BIT(MEDIUM_READY_FOR_CONNECTION)) {
+ port->login_retry_count = 0;
+ queue_delayed_work(port->nhi_ctxt->net_workqueue,
+ &port->login_retry_work, 0);
+ }
+
+ netif_info(port, ifup, net_dev, "Thunderbolt(TM) Networking port %u - ready for ThunderboltIP negotiation\n",
+ port->num);
+ return 0;
+
+buffers_alloc_failure:
+ /*
+ * Rollback the Tx buffers that were already allocated
+ * until the failure
+ */
+ for (i--; i >= 0; i--) {
+ /* free only for first buffer allocation */
+ if (port->tx_ring.buffers[i].page_offset == 0)
+ dma_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NUM_FRAMES_PER_PAGE *
+ TBT_RING_MAX_FRAME_SIZE,
+ port->tx_ring.buffers[i].hdr,
+ port->tx_ring.buffers[i].dma);
+ port->tx_ring.buffers[i].hdr = NULL;
+ }
+ /*
+ * For central allocation, free all
+ * otherwise free RX and then TX separately
+ */
+ if (TBT_NET_SIZE_TOTAL_DESCS <= PAGE_SIZE) {
+ dmam_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NET_SIZE_TOTAL_DESCS,
+ port->tx_ring.desc,
+ port->tx_ring.dma);
+ port->rx_ring.desc = NULL;
+ } else {
+ dmam_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NET_NUM_RX_BUFS *
+ sizeof(struct tbt_buf_desc),
+ port->rx_ring.desc,
+ port->rx_ring.dma);
+ port->rx_ring.desc = NULL;
+rx_desc_alloc_failure:
+ dmam_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NET_NUM_TX_BUFS *
+ sizeof(struct tbt_buf_desc),
+ port->tx_ring.desc,
+ port->tx_ring.dma);
+ }
+ port->tx_ring.desc = NULL;
+ring_alloc_failure:
+ vfree(port->tx_ring.buffers);
+ port->tx_ring.buffers = NULL;
+ vfree(port->rx_ring.buffers);
+ port->rx_ring.buffers = NULL;
+ res = -ENOMEM;
+ netif_err(port, ifup, net_dev, "Thunderbolt(TM) Networking port %u - unable to allocate memory\n",
+ port->num);
+
+ if (!port->nhi_ctxt->msix_entries)
+ goto out;
+
+ devm_free_irq(&port->nhi_ctxt->pdev->dev,
+ port->nhi_ctxt->msix_entries[2 + (port->num * 2)].vector,
+ port);
+request_irq_failure:
+ devm_free_irq(&port->nhi_ctxt->pdev->dev,
+ port->nhi_ctxt->msix_entries[3 + (port->num * 2)].vector,
+ port);
+out:
+ return res;
+}
+
+static int tbt_net_close(struct net_device *net_dev)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ int i;
+
+ /*
+ * Close connection, disable rings, flow controls
+ * and interrupts
+ */
+ tbt_net_tear_down(net_dev, !(port->negotiation_status &
+ BIT(RECEIVE_LOGOUT)));
+
+ cancel_work_sync(&port->login_response_work);
+ cancel_work_sync(&port->logout_work);
+ cancel_work_sync(&port->status_reply_work);
+ cancel_work_sync(&port->approve_inter_domain_work);
+
+ /* Rollback the Tx buffers that were allocated */
+ for (i = 0; i < TBT_NET_NUM_TX_BUFS; i++) {
+ if (port->tx_ring.buffers[i].page_offset == 0)
+ dma_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NUM_FRAMES_PER_PAGE *
+ TBT_RING_MAX_FRAME_SIZE,
+ port->tx_ring.buffers[i].hdr,
+ port->tx_ring.buffers[i].dma);
+ port->tx_ring.buffers[i].hdr = NULL;
+ }
+ /* Unmap the Rx buffers that were allocated */
+ for (i = 0; i < TBT_NET_NUM_RX_BUFS; i++)
+ if (port->rx_ring.buffers[i].page) {
+ put_page(port->rx_ring.buffers[i].page);
+ port->rx_ring.buffers[i].page = NULL;
+ dma_unmap_page(&port->nhi_ctxt->pdev->dev,
+ port->rx_ring.buffers[i].dma, PAGE_SIZE,
+ DMA_FROM_DEVICE);
+ }
+
+ /*
+ * For central allocation, free all
+ * otherwise free RX and then TX separately
+ */
+ if (TBT_NET_SIZE_TOTAL_DESCS <= PAGE_SIZE) {
+ dmam_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NET_SIZE_TOTAL_DESCS,
+ port->tx_ring.desc,
+ port->tx_ring.dma);
+ port->rx_ring.desc = NULL;
+ } else {
+ dmam_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NET_NUM_RX_BUFS *
+ sizeof(struct tbt_buf_desc),
+ port->rx_ring.desc,
+ port->rx_ring.dma);
+ port->rx_ring.desc = NULL;
+ dmam_free_coherent(&port->nhi_ctxt->pdev->dev,
+ TBT_NET_NUM_TX_BUFS *
+ sizeof(struct tbt_buf_desc),
+ port->tx_ring.desc,
+ port->tx_ring.dma);
+ }
+ port->tx_ring.desc = NULL;
+
+ vfree(port->tx_ring.buffers);
+ port->tx_ring.buffers = NULL;
+ vfree(port->rx_ring.buffers);
+ port->rx_ring.buffers = NULL;
+
+ devm_free_irq(&port->nhi_ctxt->pdev->dev,
+ port->nhi_ctxt->msix_entries[3 + (port->num * 2)].vector,
+ port);
+ devm_free_irq(&port->nhi_ctxt->pdev->dev,
+ port->nhi_ctxt->msix_entries[2 + (port->num * 2)].vector,
+ port);
+
+ netif_info(port, ifdown, net_dev, "Thunderbolt(TM) Networking port %u - is down\n",
+ port->num);
+
+ return 0;
+}
+
+static bool tbt_net_xmit_csum(struct sk_buff *skb,
+ struct tbt_desc_ring *tx_ring, u32 first,
+ u32 last, u32 frame_count)
+{
+
+ struct tbt_frame_header *hdr = tx_ring->buffers[first].hdr;
+ __wsum wsum = (__force __wsum)htonl(skb->len -
+ skb_transport_offset(skb));
+ int offset = skb_transport_offset(skb);
+ __sum16 *tucso; /* TCP UDP Checksum Segment Offset */
+ __be16 protocol = skb->protocol;
+ u8 *dest = (u8 *)(hdr + 1);
+ int len;
+
+ if (skb->ip_summed != CHECKSUM_PARTIAL) {
+ for (; first != last;
+ first = (first + 1) & (TBT_NET_NUM_TX_BUFS - 1)) {
+ hdr = tx_ring->buffers[first].hdr;
+ hdr->frame_count = cpu_to_le32(frame_count);
+ }
+ return true;
+ }
+
+ if (protocol == htons(ETH_P_8021Q)) {
+ struct vlan_hdr *vhdr, vh;
+
+ vhdr = skb_header_pointer(skb, ETH_HLEN, sizeof(vh), &vh);
+ if (!vhdr)
+ return false;
+
+ protocol = vhdr->h_vlan_encapsulated_proto;
+ }
+
+ /*
+ * dest points to the beginning of the packet.
+ * Compute the absolute location of each checksum
+ * field within the packet.
+ * ipcso points to the IP checksum to update.
+ * tucso points to the TCP/UDP checksum to update.
+ */
+ if (protocol == htons(ETH_P_IP)) {
+ __sum16 *ipcso = (__sum16 *)(dest +
+ ((u8 *)&(ip_hdr(skb)->check) - skb->data));
+
+ *ipcso = 0;
+ *ipcso = ip_fast_csum(dest + skb_network_offset(skb),
+ ip_hdr(skb)->ihl);
+ if (ip_hdr(skb)->protocol == IPPROTO_TCP)
+ tucso = (__sum16 *)(dest +
+ ((u8 *)&(tcp_hdr(skb)->check) - skb->data));
+ else if (ip_hdr(skb)->protocol == IPPROTO_UDP)
+ tucso = (__sum16 *)(dest +
+ ((u8 *)&(udp_hdr(skb)->check) - skb->data));
+ else
+ return false;
+
+ *tucso = ~csum_tcpudp_magic(ip_hdr(skb)->saddr,
+ ip_hdr(skb)->daddr, 0,
+ ip_hdr(skb)->protocol, 0);
+ } else if (skb_is_gso(skb)) {
+ if (skb_is_gso_v6(skb)) {
+ tucso = (__sum16 *)(dest +
+ ((u8 *)&(tcp_hdr(skb)->check) - skb->data));
+ *tucso = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+ &ipv6_hdr(skb)->daddr,
+ 0, IPPROTO_TCP, 0);
+ } else if ((protocol == htons(ETH_P_IPV6)) &&
+ (skb_shinfo(skb)->gso_type & SKB_GSO_UDP)) {
+ tucso = (__sum16 *)(dest +
+ ((u8 *)&(udp_hdr(skb)->check) - skb->data));
+ *tucso = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+ &ipv6_hdr(skb)->daddr,
+ 0, IPPROTO_UDP, 0);
+ } else {
+ return false;
+ }
+ } else if (protocol == htons(ETH_P_IPV6)) {
+ tucso = (__sum16 *)(dest + skb_checksum_start_offset(skb) +
+ skb->csum_offset);
+ *tucso = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+ &ipv6_hdr(skb)->daddr,
+ 0, ipv6_hdr(skb)->nexthdr, 0);
+ } else {
+ return false;
+ }
+
+ /* First frame was headers, rest of the frames is data */
+ for (; first != last; first = (first + 1) & (TBT_NET_NUM_TX_BUFS - 1),
+ offset = 0) {
+ hdr = tx_ring->buffers[first].hdr;
+ dest = (u8 *)(hdr + 1) + offset;
+ len = le32_to_cpu(hdr->frame_size) - offset;
+ wsum = csum_partial(dest, len, wsum);
+ hdr->frame_count = cpu_to_le32(frame_count);
+ }
+ *tucso = csum_fold(wsum);
+
+ return true;
+}
+
+static netdev_tx_t tbt_net_xmit_frame(struct sk_buff *skb,
+ struct net_device *net_dev)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ void __iomem *iobase = port->nhi_ctxt->iobase;
+ void __iomem *reg = TBT_RING_CONS_PROD_REG(iobase,
+ REG_TX_RING_BASE,
+ port->local_path);
+ struct tbt_desc_ring *tx_ring = &port->tx_ring;
+ struct tbt_frame_header *hdr;
+ u32 prod_cons, prod, cons, first;
+ /* len equivalent to the fragment length */
+ unsigned int len = skb_headlen(skb);
+ /* data_len is overall packet length */
+ unsigned int data_len = skb->len;
+ u32 frm_idx, frag_num = 0;
+ const u8 *src = skb->data;
+ bool unmap = false;
+ __le32 *attr;
+ u8 *dest;
+
+ if (unlikely(data_len == 0 || data_len > TBT_NET_MTU))
+ goto invalid_packet;
+
+ prod_cons = ioread32(reg);
+ prod = TBT_REG_RING_PROD_EXTRACT(prod_cons);
+ cons = TBT_REG_RING_CONS_EXTRACT(prod_cons);
+ if (prod >= TBT_NET_NUM_TX_BUFS || cons >= TBT_NET_NUM_TX_BUFS)
+ goto tx_error;
+
+ if (data_len > (TBT_NUM_BUFS_BETWEEN(prod, cons, TBT_NET_NUM_TX_BUFS) *
+ TBT_RING_MAX_FRM_DATA_SZ)) {
+ unsigned long flags;
+
+ netif_stop_queue(net_dev);
+
+ spin_lock_irqsave(&port->nhi_ctxt->lock, flags);
+ /*
+ * Enable TX interrupt to be notified about available buffers
+ * and restart transmission upon this.
+ */
+ RING_INT_ENABLE_TX(iobase, port->local_path);
+ spin_unlock_irqrestore(&port->nhi_ctxt->lock, flags);
+
+ return NETDEV_TX_BUSY;
+ }
+
+ first = prod;
+ attr = &tx_ring->desc[prod].attributes;
+ hdr = tx_ring->buffers[prod].hdr;
+ dest = (u8 *)(hdr + 1);
+ /* if overall packet is bigger than the frame data size */
+ for (frm_idx = 0; data_len > TBT_RING_MAX_FRM_DATA_SZ; ++frm_idx) {
+ u32 size_left = TBT_RING_MAX_FRM_DATA_SZ;
+
+ *attr &= cpu_to_le32(~(DESC_ATTR_LEN_MASK |
+ DESC_ATTR_INT_EN |
+ DESC_ATTR_DESC_DONE));
+ hdr->frame_size = cpu_to_le32(TBT_RING_MAX_FRM_DATA_SZ);
+ hdr->frame_index = cpu_to_le16(frm_idx);
+ hdr->frame_id = cpu_to_le16(port->frame_id);
+
+ do {
+ if (len > size_left) {
+ /*
+ * Copy data onto tx buffer data with full
+ * frame size then break
+ * and go to next frame
+ */
+ memcpy(dest, src, size_left);
+ len -= size_left;
+ dest += size_left;
+ src += size_left;
+ break;
+ }
+
+ memcpy(dest, src, len);
+ size_left -= len;
+ dest += len;
+
+ if (unmap) {
+ kunmap_atomic((void *)src);
+ unmap = false;
+ }
+ /*
+ * Ensure all fragments have been processed
+ */
+ if (frag_num < skb_shinfo(skb)->nr_frags) {
+ const skb_frag_t *frag =
+ &(skb_shinfo(skb)->frags[frag_num]);
+ len = skb_frag_size(frag);
+ /* map and then unmap quickly */
+ src = kmap_atomic(skb_frag_page(frag)) +
+ frag->page_offset;
+ unmap = true;
+ ++frag_num;
+ } else if (unlikely(size_left > 0)) {
+ goto invalid_packet;
+ }
+ } while (size_left > 0);
+
+ data_len -= TBT_RING_MAX_FRM_DATA_SZ;
+ prod = (prod + 1) & (TBT_NET_NUM_TX_BUFS - 1);
+ attr = &tx_ring->desc[prod].attributes;
+ hdr = tx_ring->buffers[prod].hdr;
+ dest = (u8 *)(hdr + 1);
+ }
+
+ *attr &= cpu_to_le32(~(DESC_ATTR_LEN_MASK | DESC_ATTR_DESC_DONE));
+ /* Enable the interrupts, for resuming from stop queue later (if so) */
+ *attr |= cpu_to_le32(DESC_ATTR_INT_EN |
+ (((sizeof(struct tbt_frame_header) + data_len) <<
+ DESC_ATTR_LEN_SHIFT) & DESC_ATTR_LEN_MASK));
+ hdr->frame_size = cpu_to_le32(data_len);
+ hdr->frame_index = cpu_to_le16(frm_idx);
+ hdr->frame_id = cpu_to_le16(port->frame_id);
+
+ /* In case the remaining data_len is smaller than a frame */
+ while (len < data_len) {
+ memcpy(dest, src, len);
+ data_len -= len;
+ dest += len;
+
+ if (unmap) {
+ kunmap_atomic((void *)src);
+ unmap = false;
+ }
+
+ if (frag_num < skb_shinfo(skb)->nr_frags) {
+ const skb_frag_t *frag =
+ &(skb_shinfo(skb)->frags[frag_num]);
+ len = skb_frag_size(frag);
+ src = kmap_atomic(skb_frag_page(frag)) +
+ frag->page_offset;
+ unmap = true;
+ ++frag_num;
+ } else if (unlikely(data_len > 0)) {
+ goto invalid_packet;
+ }
+ }
+ memcpy(dest, src, data_len);
+ if (unmap) {
+ kunmap_atomic((void *)src);
+ unmap = false;
+ }
+
+ ++frm_idx;
+ prod = (prod + 1) & (TBT_NET_NUM_TX_BUFS - 1);
+
+ if (!tbt_net_xmit_csum(skb, tx_ring, first, prod, frm_idx))
+ goto invalid_packet;
+
+ if (port->match_frame_id)
+ ++port->frame_id;
+
+ prod_cons &= ~REG_RING_PROD_MASK;
+ prod_cons |= (prod << REG_RING_PROD_SHIFT) & REG_RING_PROD_MASK;
+ wmb(); /* make sure producer update is done after buffers are ready */
+ iowrite32(prod_cons, reg);
+
+ ++port->stats.tx_packets;
+ port->stats.tx_bytes += skb->len;
+
+ dev_consume_skb_any(skb);
+ return NETDEV_TX_OK;
+
+invalid_packet:
+ netif_err(port, tx_err, net_dev, "port %u invalid transmit packet\n",
+ port->num);
+tx_error:
+ ++port->stats.tx_errors;
+ dev_kfree_skb_any(skb);
+ return NETDEV_TX_OK;
}
+static void tbt_net_set_rx_mode(struct net_device *net_dev)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ struct netdev_hw_addr *ha;
+
+ if (net_dev->flags & IFF_PROMISC)
+ port->packet_filters |= BIT(PACKET_TYPE_PROMISCUOUS);
+ else
+ port->packet_filters &= ~BIT(PACKET_TYPE_PROMISCUOUS);
+ if (net_dev->flags & IFF_ALLMULTI)
+ port->packet_filters |= BIT(PACKET_TYPE_ALL_MULTICAST);
+ else
+ port->packet_filters &= ~BIT(PACKET_TYPE_ALL_MULTICAST);
+
+ /* if there is more than a single MAC address */
+ if (netdev_uc_count(net_dev) > 1)
+ port->packet_filters |= BIT(PACKET_TYPE_UNICAST_PROMISCUOUS);
+ /* if there is a single MAC address */
+ else if (netdev_uc_count(net_dev) == 1) {
+ netdev_for_each_uc_addr(ha, net_dev)
+ /* checks whether the MAC is what we set */
+ if (ether_addr_equal(ha->addr, net_dev->dev_addr))
+ port->packet_filters &=
+ ~BIT(PACKET_TYPE_UNICAST_PROMISCUOUS);
+ else
+ port->packet_filters |=
+ BIT(PACKET_TYPE_UNICAST_PROMISCUOUS);
+ } else {
+ port->packet_filters &= ~BIT(PACKET_TYPE_UNICAST_PROMISCUOUS);
+ }
+
+ /* Populate the multicast hash table with received MAC addresses */
+ memset(port->multicast_hash_table, 0,
+ sizeof(port->multicast_hash_table));
+ netdev_for_each_mc_addr(ha, net_dev) {
+ u16 hash_val = TBT_NET_ETHER_ADDR_HASH(ha->addr);
+
+ port->multicast_hash_table[hash_val / BITS_PER_U32] |=
+ BIT(hash_val % BITS_PER_U32);
+ }
+
+}
+
+static struct rtnl_link_stats64 *tbt_net_get_stats64(
+ struct net_device *net_dev,
+ struct rtnl_link_stats64 *stats)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ memset(stats, 0, sizeof(*stats));
+ stats->tx_packets = port->stats.tx_packets;
+ stats->tx_bytes = port->stats.tx_bytes;
+ stats->tx_errors = port->stats.tx_errors;
+ stats->rx_packets = port->stats.rx_packets;
+ stats->rx_bytes = port->stats.rx_bytes;
+ stats->rx_length_errors = port->stats.rx_length_errors;
+ stats->rx_over_errors = port->stats.rx_over_errors;
+ stats->rx_crc_errors = port->stats.rx_crc_errors;
+ stats->rx_missed_errors = port->stats.rx_missed_errors;
+ stats->rx_errors = stats->rx_length_errors + stats->rx_over_errors +
+ stats->rx_crc_errors + stats->rx_missed_errors;
+ stats->multicast = port->stats.multicast;
+ return stats;
+}
+
+static int tbt_net_set_mac_address(struct net_device *net_dev, void *addr)
+{
+ struct sockaddr *saddr = addr;
+
+ if (!is_valid_ether_addr(saddr->sa_data))
+ return -EADDRNOTAVAIL;
+
+ memcpy(net_dev->dev_addr, saddr->sa_data, net_dev->addr_len);
+
+ return 0;
+}
+
+static int tbt_net_change_mtu(struct net_device *net_dev, int new_mtu)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ /* MTU < 68 is an error and causes problems on some kernels */
+ if (new_mtu < 68 || new_mtu > (TBT_NET_MTU - ETH_HLEN))
+ return -EINVAL;
+
+ netif_info(port, probe, net_dev, "Thunderbolt(TM) Networking port %u - changing MTU from %u to %d\n",
+ port->num, net_dev->mtu, new_mtu);
+
+ net_dev->mtu = new_mtu;
+
+ return 0;
+}
+
+static const struct net_device_ops tbt_netdev_ops = {
+ /* called when the network is up'ed */
+ .ndo_open = tbt_net_open,
+ /* called when the network is down'ed */
+ .ndo_stop = tbt_net_close,
+ .ndo_start_xmit = tbt_net_xmit_frame,
+ .ndo_set_rx_mode = tbt_net_set_rx_mode,
+ .ndo_get_stats64 = tbt_net_get_stats64,
+ .ndo_set_mac_address = tbt_net_set_mac_address,
+ .ndo_change_mtu = tbt_net_change_mtu,
+ .ndo_validate_addr = eth_validate_addr,
+};
+
+static int tbt_net_get_settings(__maybe_unused struct net_device *net_dev,
+ struct ethtool_cmd *ecmd)
+{
+ ecmd->supported |= SUPPORTED_20000baseKR2_Full;
+ ecmd->advertising |= ADVERTISED_20000baseKR2_Full;
+ ecmd->autoneg = AUTONEG_DISABLE;
+ ecmd->transceiver = XCVR_INTERNAL;
+ ecmd->supported |= SUPPORTED_FIBRE;
+ ecmd->advertising |= ADVERTISED_FIBRE;
+ ecmd->port = PORT_FIBRE;
+ ethtool_cmd_speed_set(ecmd, SPEED_20000);
+ ecmd->duplex = DUPLEX_FULL;
+
+ return 0;
+}
+
+
+static u32 tbt_net_get_msglevel(struct net_device *net_dev)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ return port->msg_enable;
+}
+
+static void tbt_net_set_msglevel(struct net_device *net_dev, u32 data)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ port->msg_enable = data;
+}
+
+static void tbt_net_get_strings(__maybe_unused struct net_device *net_dev,
+ u32 stringset, u8 *data)
+{
+ if (stringset == ETH_SS_STATS)
+ memcpy(data, tbt_net_gstrings_stats,
+ sizeof(tbt_net_gstrings_stats));
+}
+
+static void tbt_net_get_ethtool_stats(struct net_device *net_dev,
+ __maybe_unused struct ethtool_stats *sts,
+ u64 *data)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ memcpy(data, &port->stats, sizeof(port->stats));
+}
+
+static int tbt_net_get_sset_count(__maybe_unused struct net_device *net_dev,
+ int sset)
+{
+ if (sset == ETH_SS_STATS)
+ return sizeof(tbt_net_gstrings_stats) / ETH_GSTRING_LEN;
+ return -EOPNOTSUPP;
+}
+
+static void tbt_net_get_drvinfo(struct net_device *net_dev,
+ struct ethtool_drvinfo *drvinfo)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ strlcpy(drvinfo->driver, "Thunderbolt(TM) Networking",
+ sizeof(drvinfo->driver));
+ strlcpy(drvinfo->version, DRV_VERSION, sizeof(drvinfo->version));
+
+ strlcpy(drvinfo->bus_info, pci_name(port->nhi_ctxt->pdev),
+ sizeof(drvinfo->bus_info));
+ drvinfo->n_stats = tbt_net_get_sset_count(net_dev, ETH_SS_STATS);
+}
+
+static const struct ethtool_ops tbt_net_ethtool_ops = {
+ .get_settings = tbt_net_get_settings,
+ .get_drvinfo = tbt_net_get_drvinfo,
+ .get_link = ethtool_op_get_link,
+ .get_msglevel = tbt_net_get_msglevel,
+ .set_msglevel = tbt_net_set_msglevel,
+ .get_strings = tbt_net_get_strings,
+ .get_ethtool_stats = tbt_net_get_ethtool_stats,
+ .get_sset_count = tbt_net_get_sset_count,
+};
+
static inline int send_message(struct tbt_port *port, const char *func,
enum pdf_value pdf, u32 msg_len,
const void *msg)
@@ -496,6 +1920,10 @@ void negotiation_events(struct net_device *net_dev,
/* configure TX ring */
reg = iobase + REG_TX_RING_BASE +
(port->local_path * REG_RING_STEP);
+ iowrite32(lower_32_bits(port->tx_ring.dma),
+ reg + REG_RING_PHYS_LO_OFFSET);
+ iowrite32(upper_32_bits(port->tx_ring.dma),
+ reg + REG_RING_PHYS_HI_OFFSET);
tx_ring_conf = (TBT_NET_NUM_TX_BUFS << REG_RING_SIZE_SHIFT) &
REG_RING_SIZE_MASK;
@@ -538,6 +1966,10 @@ void negotiation_events(struct net_device *net_dev,
*/
reg = iobase + REG_RX_RING_BASE +
(port->local_path * REG_RING_STEP);
+ iowrite32(lower_32_bits(port->rx_ring.dma),
+ reg + REG_RING_PHYS_LO_OFFSET);
+ iowrite32(upper_32_bits(port->rx_ring.dma),
+ reg + REG_RING_PHYS_HI_OFFSET);
rx_ring_conf = (TBT_NET_NUM_RX_BUFS << REG_RING_SIZE_SHIFT) &
REG_RING_SIZE_MASK;
@@ -547,6 +1979,17 @@ void negotiation_events(struct net_device *net_dev,
REG_RING_BUF_SIZE_MASK;
iowrite32(rx_ring_conf, reg + REG_RING_SIZE_OFFSET);
+ /* allocate RX buffers and configure the descriptors */
+ if (!tbt_net_alloc_rx_buffers(&port->nhi_ctxt->pdev->dev,
+ &port->rx_ring,
+ TBT_NET_NUM_RX_BUFS,
+ reg + REG_RING_CONS_PROD_OFFSET,
+ GFP_KERNEL)) {
+ netif_err(port, link, net_dev, "Thunderbolt(TM) Networking port %u - no memory for receive buffers\n",
+ port->num);
+ tbt_net_tear_down(net_dev, true);
+ break;
+ }
spin_lock_irqsave(&port->nhi_ctxt->lock, flags);
/* enable RX interrupt */
@@ -559,6 +2002,7 @@ void negotiation_events(struct net_device *net_dev,
netif_info(port, link, net_dev, "Thunderbolt(TM) Networking port %u - ready\n",
port->num);
+ napi_enable(&port->napi);
netif_carrier_on(net_dev);
netif_start_queue(net_dev);
break;
@@ -769,15 +2213,42 @@ struct net_device *nhi_alloc_etherdev(struct tbt_nhi_ctxt *nhi_ctxt,
scnprintf(net_dev->name, sizeof(net_dev->name), "tbtnet%%dp%hhu",
port_num);
+ net_dev->netdev_ops = &tbt_netdev_ops;
+
+ netif_napi_add(net_dev, &port->napi, tbt_net_poll, NAPI_POLL_WEIGHT);
+
+ net_dev->hw_features = NETIF_F_SG |
+ NETIF_F_ALL_TSO |
+ NETIF_F_UFO |
+ NETIF_F_GRO |
+ NETIF_F_IP_CSUM |
+ NETIF_F_IPV6_CSUM;
+ net_dev->features = net_dev->hw_features;
+ if (nhi_ctxt->pci_using_dac)
+ net_dev->features |= NETIF_F_HIGHDMA;
+
INIT_DELAYED_WORK(&port->login_retry_work, login_retry);
INIT_WORK(&port->login_response_work, login_response);
INIT_WORK(&port->logout_work, logout);
INIT_WORK(&port->status_reply_work, status_reply);
INIT_WORK(&port->approve_inter_domain_work, approve_inter_domain);
+ net_dev->ethtool_ops = &tbt_net_ethtool_ops;
+
+ tbt_net_change_mtu(net_dev, TBT_NET_MTU - ETH_HLEN);
+
+ if (register_netdev(net_dev))
+ goto err_register;
+
+ netif_carrier_off(net_dev);
+
netif_info(port, probe, net_dev,
"Thunderbolt(TM) Networking port %u - MAC Address: %pM\n",
port_num, net_dev->dev_addr);
return net_dev;
+
+err_register:
+ free_netdev(net_dev);
+ return NULL;
}
--
2.7.4
* Re: [Qemu-devel] virsh dump (qemu guest memory dump?): KASLR enabled linux guest support
From: Andrew Jones @ 2016-11-09 12:20 UTC (permalink / raw)
To: Daniel P. Berrange
Cc: Laszlo Ersek, Dave Young, qiaonuohan, bhe, anderson, qemu-devel
In-Reply-To: <20161109115819.GG22181@redhat.com>
On Wed, Nov 09, 2016 at 11:58:19AM +0000, Daniel P. Berrange wrote:
> On Wed, Nov 09, 2016 at 12:48:09PM +0100, Andrew Jones wrote:
> > On Wed, Nov 09, 2016 at 11:37:35AM +0000, Daniel P. Berrange wrote:
> > > On Wed, Nov 09, 2016 at 12:26:17PM +0100, Laszlo Ersek wrote:
> > > > On 11/09/16 11:40, Andrew Jones wrote:
> > > > > On Wed, Nov 09, 2016 at 11:01:46AM +0800, Dave Young wrote:
> > > > >> Hi,
> > > > >>
> > > > > >> Latest linux kernel enabled kaslr to randomize phys/virt memory
> > > > > >> addresses, we had some effort to support kexec/kdump so that crash
> > > > > >> utility can still work in case crashed kernel has kaslr enabled.
> > > > >>
> > > > >> But according to Dave Anderson virsh dump does not work, quoted messages
> > > > >> from Dave below:
> > > > >>
> > > > >> """
> > > > >> with virsh dump, there's no way of even knowing that KASLR
> > > > >> has randomized the kernel __START_KERNEL_map region, because there is no
> > > > >> virtual address information -- e.g., like "SYMBOL(_stext)" in the kdump
> > > > >> vmcoreinfo data to compare against the vmlinux file symbol value.
> > > > >> Unless virsh dump can export some basic virtual memory data, which
> > > > >> they say it can't, I don't see how KASLR can ever be supported.
> > > > >> """
> > > > >>
> > > > >> I assume virsh dump is using qemu guest memory dump facility so it
> > > > >> should be first addressed in qemu. Thus post this query to qemu devel
> > > > >> list. If this is not correct please let me know.
> > > > >>
> > > > > >> Could you qemu dump people make it work? Otherwise we cannot support virt dump
> > > > > >> as long as KASLR is enabled. Latest Fedora kernel has enabled it in x86_64.
> > > > >>
> > > > >
> > > > > When the -kernel command line option is used, then it may be possible
> > > > > to extract some information that could be used to supplement the memory
> > > > > dump that dump-guest-memory provides. However, that would be a specific
> > > > > use. In general, QEMU knows nothing about the guest kernel. It doesn't
> > > > > know where it is in the disk image, and it doesn't even know if it's
> > > > > Linux.
> > > > >
> > > > > Is there anything a guest userspace application could probe from e.g.
> > > > > /proc that would work? If so, then the guest agent could gain a new
> > > > > feature providing that.
> > > >
> > > > I fully agree. This is exactly what I suggested too, independently, in
> > > > the downstream thread, before arriving at this upstream thread. Let me
> > > > quote that email:
> > > >
> > > > On 11/09/16 12:09, Laszlo Ersek wrote:
> > > > > [...] the dump-guest-memory QEMU command supports an option called
> > > > > "paging". Here's its documentation, from the "qapi-schema.json" source
> > > > > file:
> > > > >
> > > > >> # @paging: if true, do paging to get guest's memory mapping. This allows
> > > > >> # using gdb to process the core file.
> > > > >> #
> > > > >> # IMPORTANT: this option can make QEMU allocate several gigabytes
> > > > >> # of RAM. This can happen for a large guest, or a
> > > > >> # malicious guest pretending to be large.
> > > > >> #
> > > > >> # Also, paging=true has the following limitations:
> > > > >> #
> > > > >> # 1. The guest may be in a catastrophic state or can have corrupted
> > > > >> # memory, which cannot be trusted
> > > > >> # 2. The guest can be in real-mode even if paging is enabled. For
> > > > >> # example, the guest uses ACPI to sleep, and ACPI sleep state
> > > > >> # goes in real-mode
> > > > >> # 3. Currently only supported on i386 and x86_64.
> > > > >> #
> > > > >
> > > > > "virsh dump --memory-only" sets paging=false, for obvious reasons.
> > > > >
> > > > > [...] the dump-guest-memory command provides a raw snapshot of the
> > > > > virtual machine's memory (and of the registers of the VCPUs); it is
> > > > > not enlightened about the guest.
> > > > >
> > > > > If the additional information you are looking for can be retrieved
> > > > > within the running Linux guest, using an appropriately privileged
> > > > > userspace process, then I would recommend considering an extension to
> > > > > the qemu guest agent. The management layer (libvirt, [...]) could
> > > > > first invoke the guest agent (a process with root privileges running
> > > > > in the guest) from the host side, through virtio-serial. The new guest
> > > > > agent command would return the information necessary to deal with
> > > > > KASLR. Then the management layer would initiate the dump like always.
> > > > > Finally, the extra information would be combined with (or placed
> > > > > beside) the dump file in some way.
> > > > >
> > > > > So, this proposal would affect the guest agent and the management
> > > > > layer (= libvirt).
> > > >
> > > > Given that we already dislike "paging=true", enlightening
> > > > dump-guest-memory with even more guest-specific insight is the wrong
> > > > approach, IMO. That kind of knowledge belongs to the guest agent.
> > >
> > > If you're trying to debug a hung/panicked guest, then using a guest
> > > agent to fetch info is a complete non-starter as it'll be dead.
> >
> > So don't wait. Management software can make this query immediately
> > after the guest agent goes live. The information needed won't change.
>
> That doesn't help with trying to diagnose a crash during boot up, since
> the guest agent isn't running till fairly late. I'm also concerned that
> the QEMU guest agent is likely to be far from widely deployed in guests,
> so reliance on the guest agent will mean the dump facility is no longer
> reliably available.
>
It'd still be reliably available and useable during early boot, just like
it is now, for kernels that don't use KASLR. This proposal is only
attempting to *also* address KASLR kernels, for which there is currently
no support whatsoever. Call it a best-effort.
Of course we can get support for [probably] early boot and
guest-agent-less guests using KASLR too if we introduce a paravirt
solution, requiring guest kernel and KVM changes. Is it worth it?
Thanks,
drew
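
To make the guest-agent idea discussed in this thread concrete: one thing a
privileged process inside a Linux guest could probe is the runtime address of
_stext in /proc/kallsyms, which can then be compared against the vmlinux symbol
value. The following is only a minimal sketch under assumptions: the helper
names are hypothetical, it is not an existing guest agent command, and it
assumes the guest exposes /proc/kallsyms with real addresses visible to root.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one /proc/kallsyms line of the form
 * "<hex address> <type> <name>" and report the address when the symbol
 * name matches. Returns 1 on a match, 0 otherwise.
 */
int kallsyms_line_addr(const char *line, const char *sym,
		       unsigned long long *addr)
{
	unsigned long long a;
	char type;
	char name[256];

	if (sscanf(line, "%llx %c %255s", &a, &type, name) != 3)
		return 0;
	if (strcmp(name, sym) != 0)
		return 0;
	*addr = a;
	return 1;
}

/* Scan /proc/kallsyms for a symbol; returns 0 on success, -1 otherwise. */
int kallsyms_lookup(const char *sym, unsigned long long *addr)
{
	char line[512];
	int found = 0;
	FILE *f = fopen("/proc/kallsyms", "r");

	if (!f)
		return -1;
	while (!found && fgets(line, sizeof(line), f))
		found = kallsyms_line_addr(line, sym, addr);
	fclose(f);
	return found ? 0 : -1;
}
```

A caller would use kallsyms_lookup("_stext", &addr) and hand the result to the
management layer before the dump is taken. As noted in the thread, this cannot
help when the guest is already hung or has not finished booting.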
* [PATCH v9 4/8] thunderbolt: Networking state machine
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
This patch builds the peer-to-peer communication path.
Communication is established by a negotiation process whereby messages are
sent back and forth between the peers until a connection is established.
This includes the Thunderbolt Networking driver's communication with the
second peer via the Intel Connection Manager (ICM) firmware.
+--------------------+            +--------------------+
|Host 1              |            |Host 2              |
|                    |            |                    |
|   +-----------+    |            |   +-----------+    |
|   |Thunderbolt|    |            |   |Thunderbolt|    |
|   |Networking |    |            |   |Networking |    |
|   |Driver     |    |            |   |Driver     |    |
|   +-----------+    |            |   +-----------+    |
|              ^     |            |              ^     |
|              |     |            |              |     |
| +------------+---+ |            | +------------+---+ |
| |Thunderbolt |   | |            | |Thunderbolt |   | |
| |Controller  v   | |            | |Controller  v   | |
| |         +---+  | |            | |         +---+  | |
| |         |ICM|<-+-+------------+-+-------->|ICM|  | |
| |         +---+  | |            | |         +---+  | |
| +----------------+ |            | +----------------+ |
+--------------------+            +--------------------+
Note that this patch only establishes the link between the two hosts; it
does not handle network packets, which is dealt with in the next patch.
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
---
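
For readers following the message format: every negotiation message in this
patch ends in a __be32 crc field that send_message() fills with the inverted
result of __crc32c_le() seeded with ~0, computed over all preceding bytes. That
should correspond to the standard CRC32C (Castagnoli) checksum stored big
endian. Below is a minimal userspace sketch of the same idea; crc32c() and
seal_message() are illustrative names and are not part of the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = ~0u;	/* same seed the patch passes to __crc32c_le() */
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0x82F63B78u : 0);
	}
	return ~crc;		/* final inversion, as in send_message() */
}

/*
 * Store the big-endian CRC of the first msg_len - 4 bytes in the last
 * 4 bytes of the message. msg_len must be at least 4.
 */
void seal_message(uint8_t *msg, size_t msg_len)
{
	size_t crc_offset = msg_len - sizeof(uint32_t);
	uint32_t crc = crc32c(msg, crc_offset);

	msg[crc_offset + 0] = (uint8_t)(crc >> 24);
	msg[crc_offset + 1] = (uint8_t)(crc >> 16);
	msg[crc_offset + 2] = (uint8_t)(crc >> 8);
	msg[crc_offset + 3] = (uint8_t)crc;
}
```

The bitwise loop is chosen for brevity only; the kernel helper uses a faster
implementation.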
drivers/thunderbolt/icm/Makefile | 2 +-
drivers/thunderbolt/icm/icm_nhi.c | 262 ++++++++++++-
drivers/thunderbolt/icm/net.c | 783 ++++++++++++++++++++++++++++++++++++++
drivers/thunderbolt/icm/net.h | 70 ++++
4 files changed, 1109 insertions(+), 8 deletions(-)
create mode 100644 drivers/thunderbolt/icm/net.c
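
As a reading aid, the per-port medium status transitions made by the icm_nhi.c
changes can be summarized with a small model. This is a sketch only: the state
names mirror the MEDIUM_* values used in the patch, but the enum ordering, the
event names and medium_next() are assumptions introduced for illustration, and
the model leaves out locking, suspend and the net.c side of the negotiation.

```c
/* State names follow the patch; the ordering here is arbitrary. */
enum medium_status {
	MEDIUM_DISCONNECTED,
	MEDIUM_READY_FOR_APPROVAL,
	MEDIUM_READY_FOR_CONNECTION,
	MEDIUM_CONNECTED
};

/* Hypothetical event names, one per code path that writes medium_sts. */
enum medium_event {
	EV_PEER_CONNECTED_UNAPPROVED,	/* NC_INTER_DOMAIN_CONNECTED, not approved */
	EV_USER_APPROVED,		/* NHI_CMD_APPROVE_TBT_NETWORKING */
	EV_ICM_APPROVED,		/* RC_APPROVE_INTER_DOMAIN_CONNECTION, no error */
	EV_PEER_DISCONNECTED,		/* NC_INTER_DOMAIN_DISCONNECTED */
	EV_PATH_DISCONNECT_CMD		/* mailbox disconnect-path command */
};

enum medium_status medium_next(enum medium_status cur, enum medium_event ev)
{
	switch (ev) {
	case EV_PEER_CONNECTED_UNAPPROVED:
		return MEDIUM_READY_FOR_APPROVAL;
	case EV_USER_APPROVED:
		/* approval is honoured only while waiting for it */
		return cur == MEDIUM_READY_FOR_APPROVAL ?
			MEDIUM_READY_FOR_CONNECTION : cur;
	case EV_ICM_APPROVED:
		return MEDIUM_CONNECTED;
	case EV_PEER_DISCONNECTED:
		return MEDIUM_DISCONNECTED;
	case EV_PATH_DISCONNECT_CMD:
		return MEDIUM_READY_FOR_APPROVAL;
	}
	return cur;
}
```

In the driver each of these writes happens under port->state_mutex, except the
NC_INTER_DOMAIN_CONNECTED case, and is usually followed by a call to
negotiation_events() when a net_device exists.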
diff --git a/drivers/thunderbolt/icm/Makefile b/drivers/thunderbolt/icm/Makefile
index f0d0fbb..94a2797 100644
--- a/drivers/thunderbolt/icm/Makefile
+++ b/drivers/thunderbolt/icm/Makefile
@@ -1,2 +1,2 @@
obj-${CONFIG_THUNDERBOLT_ICM} += thunderbolt-icm.o
-thunderbolt-icm-objs := icm_nhi.o
+thunderbolt-icm-objs := icm_nhi.o net.o
diff --git a/drivers/thunderbolt/icm/icm_nhi.c b/drivers/thunderbolt/icm/icm_nhi.c
index c843ce8..edc910b 100644
--- a/drivers/thunderbolt/icm/icm_nhi.c
+++ b/drivers/thunderbolt/icm/icm_nhi.c
@@ -64,6 +64,13 @@ static const struct nla_policy nhi_genl_policy[NHI_ATTR_MAX + 1] = {
.len = TBT_ICM_RING_MAX_FRAME_SIZE },
[NHI_ATTR_MSG_FROM_ICM] = { .type = NLA_BINARY,
.len = TBT_ICM_RING_MAX_FRAME_SIZE },
+ [NHI_ATTR_LOCAL_ROUTE_STRING] = {
+ .len = sizeof(struct route_string) },
+ [NHI_ATTR_LOCAL_UUID] = { .len = sizeof(uuid_be) },
+ [NHI_ATTR_REMOTE_UUID] = { .len = sizeof(uuid_be) },
+ [NHI_ATTR_LOCAL_DEPTH] = { .type = NLA_U8, },
+ [NHI_ATTR_ENABLE_FULL_E2E] = { .type = NLA_FLAG, },
+ [NHI_ATTR_MATCH_FRAME_ID] = { .type = NLA_FLAG, },
};
/* NHI genetlink family */
@@ -480,6 +487,29 @@ int nhi_mailbox(struct tbt_nhi_ctxt *nhi_ctxt, u32 cmd, u32 data, bool deinit)
return 0;
}
+static inline bool nhi_is_path_disconnected(u32 cmd, u8 num_ports)
+{
+ return (cmd >= DISCONNECT_PORT_A_INTER_DOMAIN_PATH &&
+ cmd < (DISCONNECT_PORT_A_INTER_DOMAIN_PATH + num_ports));
+}
+
+static int nhi_mailbox_disconn_path(struct tbt_nhi_ctxt *nhi_ctxt, u32 cmd)
+ __releases(&controllers_list_mutex)
+{
+ struct port_net_dev *port;
+ u32 port_num = cmd - DISCONNECT_PORT_A_INTER_DOMAIN_PATH;
+
+ port = &(nhi_ctxt->net_devices[port_num]);
+ mutex_lock(&port->state_mutex);
+
+ mutex_unlock(&controllers_list_mutex);
+ port->medium_sts = MEDIUM_READY_FOR_APPROVAL;
+ if (port->net_dev)
+ negotiation_events(port->net_dev, MEDIUM_DISCONNECTED);
+ mutex_unlock(&port->state_mutex);
+ return 0;
+}
+
static int nhi_mailbox_generic(struct tbt_nhi_ctxt *nhi_ctxt, u32 mb_cmd)
__releases(&controllers_list_mutex)
{
@@ -526,13 +556,90 @@ static int nhi_genl_mailbox(__always_unused struct sk_buff *u_skb,
return -ERESTART;
nhi_ctxt = nhi_search_ctxt(*(u32 *)info->userhdr);
- if (nhi_ctxt && !nhi_ctxt->d0_exit)
- return nhi_mailbox_generic(nhi_ctxt, mb_cmd);
+ if (nhi_ctxt && !nhi_ctxt->d0_exit) {
+
+ /* controllers_list_mutex is released by the functions below */
+ if (nhi_is_path_disconnected(cmd, nhi_ctxt->num_ports))
+ return nhi_mailbox_disconn_path(nhi_ctxt, cmd);
+ else
+ return nhi_mailbox_generic(nhi_ctxt, mb_cmd);
+
+ }
mutex_unlock(&controllers_list_mutex);
return -ENODEV;
}
+static int nhi_genl_approve_networking(__always_unused struct sk_buff *u_skb,
+ struct genl_info *info)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt;
+ struct route_string *route_str;
+ int res = -ENODEV;
+ u8 port_num;
+
+ if (!info || !info->userhdr || !info->attrs ||
+ !info->attrs[NHI_ATTR_LOCAL_ROUTE_STRING] ||
+ !info->attrs[NHI_ATTR_LOCAL_UUID] ||
+ !info->attrs[NHI_ATTR_REMOTE_UUID] ||
+ !info->attrs[NHI_ATTR_LOCAL_DEPTH])
+ return -EINVAL;
+
+ /*
+ * route_str is a unique topological address
+ * used for approving remote controller
+ */
+ route_str = nla_data(info->attrs[NHI_ATTR_LOCAL_ROUTE_STRING]);
+ /* extracts the port we're connected to */
+ port_num = PORT_NUM_FROM_LINK(L0_PORT_NUM(route_str->lo));
+
+ if (mutex_lock_interruptible(&controllers_list_mutex))
+ return -ERESTART;
+
+ nhi_ctxt = nhi_search_ctxt(*(u32 *)info->userhdr);
+ if (nhi_ctxt && !nhi_ctxt->d0_exit) {
+ struct port_net_dev *port;
+
+ if (port_num >= nhi_ctxt->num_ports) {
+ res = -EINVAL;
+ goto free_ctl_list;
+ }
+
+ port = &(nhi_ctxt->net_devices[port_num]);
+
+ mutex_lock(&port->state_mutex);
+ mutex_unlock(&controllers_list_mutex);
+
+ if (port->medium_sts != MEDIUM_READY_FOR_APPROVAL)
+ goto unlock;
+
+ port->medium_sts = MEDIUM_READY_FOR_CONNECTION;
+
+ if (!port->net_dev) {
+ port->net_dev = nhi_alloc_etherdev(nhi_ctxt, port_num,
+ info);
+ if (!port->net_dev) {
+ mutex_unlock(&port->state_mutex);
+ return -ENOMEM;
+ }
+ } else {
+ nhi_update_etherdev(nhi_ctxt, port->net_dev, info);
+
+ negotiation_events(port->net_dev,
+ MEDIUM_READY_FOR_CONNECTION);
+ }
+
+unlock:
+ mutex_unlock(&port->state_mutex);
+
+ return 0;
+ }
+
+free_ctl_list:
+ mutex_unlock(&controllers_list_mutex);
+
+ return res;
+}
static int nhi_genl_send_msg(struct tbt_nhi_ctxt *nhi_ctxt, enum pdf_value pdf,
const u8 *msg, u32 msg_len)
@@ -579,17 +686,127 @@ static int nhi_genl_send_msg(struct tbt_nhi_ctxt *nhi_ctxt, enum pdf_value pdf,
return res;
}
+static bool nhi_handle_inter_domain_msg(struct tbt_nhi_ctxt *nhi_ctxt,
+ struct thunderbolt_ip_header *hdr)
+{
+ struct port_net_dev *port;
+ u8 port_num;
+
+ const uuid_be proto_uuid = APPLE_THUNDERBOLT_IP_PROTOCOL_UUID;
+
+ if (uuid_be_cmp(proto_uuid, hdr->apple_tbt_ip_proto_uuid) != 0)
+ return true;
+
+ port_num = PORT_NUM_FROM_LINK(
+ L0_PORT_NUM(be32_to_cpu(hdr->route_str.lo)));
+
+ if (unlikely(port_num >= nhi_ctxt->num_ports))
+ return false;
+
+ port = &(nhi_ctxt->net_devices[port_num]);
+ mutex_lock(&port->state_mutex);
+ if (port->net_dev != NULL)
+ negotiation_messages(port->net_dev, hdr);
+ mutex_unlock(&port->state_mutex);
+
+ return false;
+}
+
+static void nhi_handle_notification_msg(struct tbt_nhi_ctxt *nhi_ctxt,
+ const u8 *msg)
+{
+ struct port_net_dev *port;
+ u8 port_num;
+
+#define INTER_DOMAIN_LINK_SHIFT 0
+#define INTER_DOMAIN_LINK_MASK GENMASK(2, INTER_DOMAIN_LINK_SHIFT)
+ switch (msg[3]) {
+
+ case NC_INTER_DOMAIN_CONNECTED:
+ port_num = PORT_NUM_FROM_MSG(msg[5]);
+#define INTER_DOMAIN_APPROVED BIT(3)
+ if (port_num < nhi_ctxt->num_ports &&
+ !(msg[5] & INTER_DOMAIN_APPROVED))
+ nhi_ctxt->net_devices[port_num].medium_sts =
+ MEDIUM_READY_FOR_APPROVAL;
+ break;
+
+ case NC_INTER_DOMAIN_DISCONNECTED:
+ port_num = PORT_NUM_FROM_MSG(msg[5]);
+
+ if (unlikely(port_num >= nhi_ctxt->num_ports))
+ break;
+
+ port = &(nhi_ctxt->net_devices[port_num]);
+ mutex_lock(&port->state_mutex);
+ port->medium_sts = MEDIUM_DISCONNECTED;
+
+ if (port->net_dev != NULL)
+ negotiation_events(port->net_dev,
+ MEDIUM_DISCONNECTED);
+ mutex_unlock(&port->state_mutex);
+ break;
+ }
+}
+
+static bool nhi_handle_icm_response_msg(struct tbt_nhi_ctxt *nhi_ctxt,
+ const u8 *msg)
+{
+ struct port_net_dev *port;
+ bool send_event = true;
+ u8 port_num;
+
+ if (nhi_ctxt->ignore_icm_resp &&
+ msg[3] == RC_INTER_DOMAIN_PKT_SENT) {
+ nhi_ctxt->ignore_icm_resp = false;
+ send_event = false;
+ }
+ if (nhi_ctxt->wait_for_icm_resp) {
+ nhi_ctxt->wait_for_icm_resp = false;
+ up(&nhi_ctxt->send_sem);
+ }
+
+ if (msg[3] == RC_APPROVE_INTER_DOMAIN_CONNECTION) {
+#define APPROVE_INTER_DOMAIN_ERROR BIT(0)
+ if (unlikely(msg[2] & APPROVE_INTER_DOMAIN_ERROR))
+ return send_event;
+
+ port_num = PORT_NUM_FROM_LINK((msg[5]&INTER_DOMAIN_LINK_MASK)>>
+ INTER_DOMAIN_LINK_SHIFT);
+
+ if (unlikely(port_num >= nhi_ctxt->num_ports))
+ return send_event;
+
+ port = &(nhi_ctxt->net_devices[port_num]);
+ mutex_lock(&port->state_mutex);
+ port->medium_sts = MEDIUM_CONNECTED;
+
+ if (port->net_dev != NULL)
+ negotiation_events(port->net_dev, MEDIUM_CONNECTED);
+ mutex_unlock(&port->state_mutex);
+ }
+
+ return send_event;
+}
+
static bool nhi_msg_from_icm_analysis(struct tbt_nhi_ctxt *nhi_ctxt,
enum pdf_value pdf,
const u8 *msg, u32 msg_len)
{
- /*
- * preparation for messages that won't be sent,
- * currently unused in this patch.
- */
bool send_event = true;
switch (pdf) {
+ case PDF_INTER_DOMAIN_REQUEST:
+ case PDF_INTER_DOMAIN_RESPONSE:
+ send_event = nhi_handle_inter_domain_msg(
+ nhi_ctxt,
+ (struct thunderbolt_ip_header *)msg);
+ break;
+
+ case PDF_FW_TO_SW_NOTIFICATION:
+ nhi_handle_notification_msg(nhi_ctxt, msg);
+ break;
+
case PDF_ERROR_NOTIFICATION:
/* fallthrough */
case PDF_WRITE_CONFIGURATION_REGISTERS:
@@ -599,7 +816,12 @@ static bool nhi_msg_from_icm_analysis(struct tbt_nhi_ctxt *nhi_ctxt,
nhi_ctxt->wait_for_icm_resp = false;
up(&nhi_ctxt->send_sem);
}
- /* fallthrough */
+ break;
+
+ case PDF_FW_TO_SW_RESPONSE:
+ send_event = nhi_handle_icm_response_msg(nhi_ctxt, msg);
+ break;
+
default:
break;
}
@@ -788,6 +1010,12 @@ static const struct genl_ops nhi_ops[] = {
.doit = nhi_genl_mailbox,
.flags = GENL_ADMIN_PERM,
},
+ {
+ .cmd = NHI_CMD_APPROVE_TBT_NETWORKING,
+ .policy = nhi_genl_policy,
+ .doit = nhi_genl_approve_networking,
+ .flags = GENL_ADMIN_PERM,
+ },
};
static int nhi_suspend(struct device *dev) __releases(&nhi_ctxt->send_sem)
@@ -795,6 +1023,17 @@ static int nhi_suspend(struct device *dev) __releases(&nhi_ctxt->send_sem)
struct tbt_nhi_ctxt *nhi_ctxt = pci_get_drvdata(to_pci_dev(dev));
void __iomem *rx_reg, *tx_reg;
u32 rx_reg_val, tx_reg_val;
+ int i;
+
+ for (i = 0; i < nhi_ctxt->num_ports; i++) {
+ struct port_net_dev *port = &nhi_ctxt->net_devices[i];
+
+ mutex_lock(&port->state_mutex);
+ port->medium_sts = MEDIUM_DISCONNECTED;
+ if (port->net_dev)
+ negotiation_events(port->net_dev, MEDIUM_DISCONNECTED);
+ mutex_unlock(&port->state_mutex);
+ }
/* must be after negotiation_events, since messages might be sent */
nhi_ctxt->d0_exit = true;
@@ -954,6 +1193,15 @@ static void icm_nhi_remove(struct pci_dev *pdev)
nhi_suspend(&pdev->dev);
+ for (i = 0; i < nhi_ctxt->num_ports; i++) {
+ mutex_lock(&nhi_ctxt->net_devices[i].state_mutex);
+ if (nhi_ctxt->net_devices[i].net_dev) {
+ nhi_dealloc_etherdev(nhi_ctxt->net_devices[i].net_dev);
+ nhi_ctxt->net_devices[i].net_dev = NULL;
+ }
+ mutex_unlock(&nhi_ctxt->net_devices[i].state_mutex);
+ }
+
if (nhi_ctxt->net_workqueue)
destroy_workqueue(nhi_ctxt->net_workqueue);
diff --git a/drivers/thunderbolt/icm/net.c b/drivers/thunderbolt/icm/net.c
new file mode 100644
index 0000000..beeafb3
--- /dev/null
+++ b/drivers/thunderbolt/icm/net.c
@@ -0,0 +1,783 @@
+/*******************************************************************************
+ *
+ * Intel Thunderbolt(TM) driver
+ * Copyright(c) 2014 - 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ ******************************************************************************/
+
+#include <linux/etherdevice.h>
+#include <linux/crc32.h>
+#include <linux/prefetch.h>
+#include <linux/highmem.h>
+#include <linux/if_vlan.h>
+#include <linux/jhash.h>
+#include <linux/vmalloc.h>
+#include <net/ip6_checksum.h>
+#include "icm_nhi.h"
+#include "net.h"
+
+#define DEFAULT_MSG_ENABLE (NETIF_MSG_PROBE | NETIF_MSG_LINK | NETIF_MSG_IFUP)
+static int debug = -1;
+module_param(debug, int, 0000);
+MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
+
+#define TBT_NET_RX_HDR_SIZE 256
+
+#define NUM_TX_LOGIN_RETRIES 60
+
+#define APPLE_THUNDERBOLT_IP_PROTOCOL_REVISION 1
+
+#define LOGIN_TX_PATH 0xf
+
+#define TBT_NET_MTU (64 * 1024)
+
+/* Number of Rx buffers we bundle into one write to the hardware */
+#define TBT_NET_RX_BUFFER_WRITE 16
+
+#define TBT_NET_MULTICAST_HASH_TABLE_SIZE 1024
+#define TBT_NET_ETHER_ADDR_HASH(addr) (((addr[4] >> 4) | (addr[5] << 4)) % \
+ TBT_NET_MULTICAST_HASH_TABLE_SIZE)
+
+#define BITS_PER_U32 (sizeof(u32) * BITS_PER_BYTE)
+
+#define TBT_NET_NUM_TX_BUFS 256
+#define TBT_NET_NUM_RX_BUFS 256
+#define TBT_NET_SIZE_TOTAL_DESCS ((TBT_NET_NUM_TX_BUFS + TBT_NET_NUM_RX_BUFS) \
+ * sizeof(struct tbt_buf_desc))
+
+
+#define TBT_NUM_FRAMES_PER_PAGE (PAGE_SIZE / TBT_RING_MAX_FRAME_SIZE)
+
+#define TBT_NUM_BUFS_BETWEEN(idx1, idx2, num_bufs) \
+ (((num_bufs) - 1) - \
+ ((((idx1) - (idx2)) + (num_bufs)) & ((num_bufs) - 1)))
+
+#define TX_WAKE_THRESHOLD (2 * DIV_ROUND_UP(TBT_NET_MTU, \
+ TBT_RING_MAX_FRM_DATA_SZ))
+
+#define TBT_NET_DESC_ATTR_SOF_EOF (((PDF_TBT_NET_START_OF_FRAME << \
+ DESC_ATTR_SOF_SHIFT) & \
+ DESC_ATTR_SOF_MASK) | \
+ ((PDF_TBT_NET_END_OF_FRAME << \
+ DESC_ATTR_EOF_SHIFT) & \
+ DESC_ATTR_EOF_MASK))
+
+/* E2E workaround */
+#define TBT_EXIST_BUT_UNUSED_HOPID 2
+
+enum tbt_net_frame_pdf {
+ PDF_TBT_NET_MIDDLE_FRAME,
+ PDF_TBT_NET_START_OF_FRAME,
+ PDF_TBT_NET_END_OF_FRAME,
+};
+
+struct thunderbolt_ip_login {
+ struct thunderbolt_ip_header header;
+ __be32 protocol_revision;
+ __be32 transmit_path;
+ __be32 reserved[4];
+ __be32 crc;
+};
+
+struct thunderbolt_ip_login_response {
+ struct thunderbolt_ip_header header;
+ __be32 status;
+ __be32 receiver_mac_address[2];
+ __be32 receiver_mac_address_length;
+ __be32 reserved[4];
+ __be32 crc;
+};
+
+struct thunderbolt_ip_logout {
+ struct thunderbolt_ip_header header;
+ __be32 crc;
+};
+
+struct thunderbolt_ip_status {
+ struct thunderbolt_ip_header header;
+ __be32 status;
+ __be32 crc;
+};
+
+struct approve_inter_domain_connection_cmd {
+ __be32 req_code;
+ __be32 attributes;
+#define AIDC_ATTR_LINK_SHIFT 16
+#define AIDC_ATTR_LINK_MASK GENMASK(18, AIDC_ATTR_LINK_SHIFT)
+#define AIDC_ATTR_DEPTH_SHIFT 20
+#define AIDC_ATTR_DEPTH_MASK GENMASK(23, AIDC_ATTR_DEPTH_SHIFT)
+ uuid_be remote_uuid;
+ __be16 transmit_ring_number;
+ __be16 transmit_path;
+ __be16 receive_ring_number;
+ __be16 receive_path;
+ __be32 crc;
+
+};
+
+enum neg_event {
+ RECEIVE_LOGOUT = NUM_MEDIUM_STATUSES,
+ RECEIVE_LOGIN_RESPONSE,
+ RECEIVE_LOGIN,
+ NUM_NEG_EVENTS
+};
+
+enum disconnect_path_stage {
+ STAGE_1 = BIT(0),
+ STAGE_2 = BIT(1)
+};
+
+/**
+ * struct tbt_port - the basic tbt_port structure
+ * @nhi_ctxt: context of the nhi controller.
+ * @net_dev: networking device object.
+ * @login_retry_work: work queue for sending login requests.
+ * @login_response_work: work queue for sending login responses.
+ * @logout_work: work queue for sending logout requests.
+ * @status_reply_work: work queue for sending logout replies.
+ * @approve_inter_domain_work: work queue for sending interdomain to icm.
+ * @route_str: allows to route the messages to destination.
+ * @interdomain_local_uuid: allows to route the messages from local source.
+ * @interdomain_remote_uuid: allows to route the messages to destination.
+ * @command_id: a number that identifies the command.
+ * @negotiation_status: holds the network negotiation state.
+ * @msg_enable: used for debugging filters.
+ * @seq_num: a number that identifies the session.
+ * @login_retry_count: counts number of login retries sent.
+ * @local_depth: depth of the remote peer in the chain.
+ * @transmit_path: routing parameter for the icm.
+ * @frame_id: counting ID of frames.
+ * @num: port number.
+ * @local_path: routing parameter for the icm.
+ * @enable_full_e2e: whether to enable full E2E.
+ * @match_frame_id: whether to match frame id on incoming packets.
+ */
+struct tbt_port {
+ struct tbt_nhi_ctxt *nhi_ctxt;
+ struct net_device *net_dev;
+ struct delayed_work login_retry_work;
+ struct work_struct login_response_work;
+ struct work_struct logout_work;
+ struct work_struct status_reply_work;
+ struct work_struct approve_inter_domain_work;
+ struct route_string route_str;
+ uuid_be interdomain_local_uuid;
+ uuid_be interdomain_remote_uuid;
+ u32 command_id;
+ u16 negotiation_status;
+ u16 msg_enable;
+ u8 seq_num;
+ u8 login_retry_count;
+ u8 local_depth;
+ u8 transmit_path;
+ u16 frame_id;
+ u8 num;
+ u8 local_path;
+ bool enable_full_e2e : 1;
+ bool match_frame_id : 1;
+};
+
+static void disconnect_path(struct tbt_port *port,
+ enum disconnect_path_stage stage)
+{
+ u32 cmd = (DISCONNECT_PORT_A_INTER_DOMAIN_PATH + port->num);
+
+ cmd <<= REG_INMAIL_CMD_CMD_SHIFT;
+ cmd &= REG_INMAIL_CMD_CMD_MASK;
+ cmd |= REG_INMAIL_CMD_REQUEST;
+
+ mutex_lock(&port->nhi_ctxt->mailbox_mutex);
+ if (!mutex_trylock(&port->nhi_ctxt->d0_exit_mailbox_mutex)) {
+ netif_notice(port, link, port->net_dev, "controller id %#x is exiting D0\n",
+ port->nhi_ctxt->id);
+ } else {
+ nhi_mailbox(port->nhi_ctxt, cmd, stage, false);
+
+ port->nhi_ctxt->net_devices[port->num].medium_sts =
+ MEDIUM_READY_FOR_CONNECTION;
+
+ mutex_unlock(&port->nhi_ctxt->d0_exit_mailbox_mutex);
+ }
+ mutex_unlock(&port->nhi_ctxt->mailbox_mutex);
+}
+
+static void tbt_net_tear_down(struct net_device *net_dev, bool send_logout)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ void __iomem *iobase = port->nhi_ctxt->iobase;
+ void __iomem *tx_reg = NULL;
+ u32 tx_reg_val = 0;
+
+ netif_carrier_off(net_dev);
+ netif_stop_queue(net_dev);
+
+ if (port->negotiation_status & BIT(MEDIUM_CONNECTED)) {
+ void __iomem *rx_reg = iobase + REG_RX_OPTIONS_BASE +
+ (port->local_path * REG_OPTS_STEP);
+ u32 rx_reg_val = ioread32(rx_reg) & ~REG_OPTS_E2E_EN;
+
+ tx_reg = iobase + REG_TX_OPTIONS_BASE +
+ (port->local_path * REG_OPTS_STEP);
+ tx_reg_val = ioread32(tx_reg) & ~REG_OPTS_E2E_EN;
+
+ disconnect_path(port, STAGE_1);
+
+ /* disable RX flow control */
+ iowrite32(rx_reg_val, rx_reg);
+ /* disable TX flow control */
+ iowrite32(tx_reg_val, tx_reg);
+ /* disable RX ring */
+ iowrite32(rx_reg_val & ~REG_OPTS_VALID, rx_reg);
+
+ rx_reg = iobase + REG_RX_RING_BASE +
+ (port->local_path * REG_RING_STEP);
+ iowrite32(0, rx_reg + REG_RING_PHYS_LO_OFFSET);
+ iowrite32(0, rx_reg + REG_RING_PHYS_HI_OFFSET);
+ }
+
+ /* Stop login messages */
+ cancel_delayed_work_sync(&port->login_retry_work);
+
+ if (send_logout)
+ queue_work(port->nhi_ctxt->net_workqueue, &port->logout_work);
+
+ if (port->negotiation_status & BIT(MEDIUM_CONNECTED)) {
+ unsigned long flags;
+
+ /* wait for TX to finish */
+ usleep_range(5 * USEC_PER_MSEC, 7 * USEC_PER_MSEC);
+ /* disable TX ring */
+ iowrite32(tx_reg_val & ~REG_OPTS_VALID, tx_reg);
+
+ disconnect_path(port, STAGE_2);
+
+ spin_lock_irqsave(&port->nhi_ctxt->lock, flags);
+ /* disable RX and TX interrupts */
+ RING_INT_DISABLE_TX_RX(iobase, port->local_path,
+ port->nhi_ctxt->num_paths);
+ spin_unlock_irqrestore(&port->nhi_ctxt->lock, flags);
+ }
+}
+
+static inline int send_message(struct tbt_port *port, const char *func,
+ enum pdf_value pdf, u32 msg_len,
+ const void *msg)
+{
+ u32 crc_offset = msg_len - sizeof(__be32);
+ __be32 *crc = (__be32 *)((u8 *)msg + crc_offset);
+ bool is_intdom = (pdf == PDF_INTER_DOMAIN_RESPONSE);
+ int res;
+
+ *crc = cpu_to_be32(~__crc32c_le(~0, msg, crc_offset));
+ res = down_timeout(&port->nhi_ctxt->send_sem,
+ msecs_to_jiffies(3 * MSEC_PER_SEC));
+ if (res) {
+ netif_err(port, link, port->net_dev, "%s: controller id %#x timeout on send semaphore\n",
+ func, port->nhi_ctxt->id);
+ return res;
+ }
+
+ if (!mutex_trylock(&port->nhi_ctxt->d0_exit_send_mutex)) {
+ up(&port->nhi_ctxt->send_sem);
+ netif_notice(port, link, port->net_dev, "%s: controller id %#x is exiting D0\n",
+ func, port->nhi_ctxt->id);
+ return -ENODEV;
+ }
+
+ res = nhi_send_message(port->nhi_ctxt, pdf, msg_len, msg, is_intdom);
+
+ mutex_unlock(&port->nhi_ctxt->d0_exit_send_mutex);
+ if (res)
+ up(&port->nhi_ctxt->send_sem);
+
+ return res;
+}
+
+static void approve_inter_domain(struct work_struct *work)
+{
+ struct tbt_port *port = container_of(work, typeof(*port),
+ approve_inter_domain_work);
+ struct approve_inter_domain_connection_cmd approve_msg = {
+ .req_code = cpu_to_be32(CC_APPROVE_INTER_DOMAIN_CONNECTION),
+ .transmit_path = cpu_to_be16(LOGIN_TX_PATH),
+ };
+ u32 aidc = (L0_PORT_NUM(port->route_str.lo) << AIDC_ATTR_LINK_SHIFT) &
+ AIDC_ATTR_LINK_MASK;
+
+ aidc |= (port->local_depth << AIDC_ATTR_DEPTH_SHIFT) &
+ AIDC_ATTR_DEPTH_MASK;
+
+ approve_msg.attributes = cpu_to_be32(aidc);
+
+ memcpy(&approve_msg.remote_uuid, &port->interdomain_remote_uuid,
+ sizeof(approve_msg.remote_uuid));
+ approve_msg.transmit_ring_number = cpu_to_be16(port->local_path);
+ approve_msg.receive_ring_number = cpu_to_be16(port->local_path);
+ approve_msg.receive_path = cpu_to_be16(port->transmit_path);
+
+ send_message(port, __func__, PDF_SW_TO_FW_COMMAND, sizeof(approve_msg),
+ &approve_msg);
+}
+
+static inline void prepare_header(struct thunderbolt_ip_header *header,
+ struct tbt_port *port,
+ enum thunderbolt_ip_packet_type packet_type,
+ u8 len_dwords)
+{
+ const uuid_be proto_uuid = APPLE_THUNDERBOLT_IP_PROTOCOL_UUID;
+
+ header->packet_type = cpu_to_be32(packet_type);
+ header->route_str.hi = cpu_to_be32(port->route_str.hi);
+ header->route_str.lo = cpu_to_be32(port->route_str.lo);
+ header->attributes = cpu_to_be32(
+ ((port->seq_num << HDR_ATTR_SEQ_NUM_SHIFT) &
+ HDR_ATTR_SEQ_NUM_MASK) |
+ ((len_dwords << HDR_ATTR_LEN_SHIFT) & HDR_ATTR_LEN_MASK));
+ memcpy(&header->apple_tbt_ip_proto_uuid, &proto_uuid,
+ sizeof(header->apple_tbt_ip_proto_uuid));
+ memcpy(&header->initiator_uuid, &port->interdomain_local_uuid,
+ sizeof(header->initiator_uuid));
+ memcpy(&header->target_uuid, &port->interdomain_remote_uuid,
+ sizeof(header->target_uuid));
+ header->command_id = cpu_to_be32(port->command_id);
+
+ port->command_id++;
+}
+
+static void status_reply(struct work_struct *work)
+{
+ struct tbt_port *port = container_of(work, typeof(*port),
+ status_reply_work);
+ struct thunderbolt_ip_status status_msg = {
+ .status = 0,
+ };
+
+ prepare_header(&status_msg.header, port,
+ THUNDERBOLT_IP_STATUS_TYPE,
+ (offsetof(struct thunderbolt_ip_status, crc) -
+ offsetof(struct thunderbolt_ip_status,
+ header.apple_tbt_ip_proto_uuid)) /
+ sizeof(u32));
+
+ send_message(port, __func__, PDF_INTER_DOMAIN_RESPONSE,
+ sizeof(status_msg), &status_msg);
+
+}
+
+static void logout(struct work_struct *work)
+{
+ struct tbt_port *port = container_of(work, typeof(*port),
+ logout_work);
+ struct thunderbolt_ip_logout logout_msg;
+
+ prepare_header(&logout_msg.header, port,
+ THUNDERBOLT_IP_LOGOUT_TYPE,
+ (offsetof(struct thunderbolt_ip_logout, crc) -
+ offsetof(struct thunderbolt_ip_logout,
+ header.apple_tbt_ip_proto_uuid)) / sizeof(u32));
+
+ send_message(port, __func__, PDF_INTER_DOMAIN_RESPONSE,
+ sizeof(logout_msg), &logout_msg);
+
+}
+
+static void login_response(struct work_struct *work)
+{
+ struct tbt_port *port = container_of(work, typeof(*port),
+ login_response_work);
+ struct thunderbolt_ip_login_response login_res_msg = {
+ .receiver_mac_address_length = cpu_to_be32(ETH_ALEN),
+ };
+
+ prepare_header(&login_res_msg.header, port,
+ THUNDERBOLT_IP_LOGIN_RESPONSE_TYPE,
+ (offsetof(struct thunderbolt_ip_login_response, crc) -
+ offsetof(struct thunderbolt_ip_login_response,
+ header.apple_tbt_ip_proto_uuid)) / sizeof(u32));
+
+ ether_addr_copy((u8 *)login_res_msg.receiver_mac_address,
+ port->net_dev->dev_addr);
+
+ send_message(port, __func__, PDF_INTER_DOMAIN_RESPONSE,
+ sizeof(login_res_msg), &login_res_msg);
+
+}
+
+static void login_retry(struct work_struct *work)
+{
+ struct tbt_port *port = container_of(work, typeof(*port),
+ login_retry_work.work);
+ struct thunderbolt_ip_login login_msg = {
+ .protocol_revision = cpu_to_be32(
+ APPLE_THUNDERBOLT_IP_PROTOCOL_REVISION),
+ .transmit_path = cpu_to_be32(LOGIN_TX_PATH),
+ };
+
+
+ if (port->nhi_ctxt->d0_exit)
+ return;
+
+ port->login_retry_count++;
+
+ prepare_header(&login_msg.header, port,
+ THUNDERBOLT_IP_LOGIN_TYPE,
+ (offsetof(struct thunderbolt_ip_login, crc) -
+ offsetof(struct thunderbolt_ip_login,
+ header.apple_tbt_ip_proto_uuid)) / sizeof(u32));
+
+ if (send_message(port, __func__, PDF_INTER_DOMAIN_RESPONSE,
+ sizeof(login_msg), &login_msg) == -ENODEV)
+ return;
+
+ if (likely(port->login_retry_count < NUM_TX_LOGIN_RETRIES))
+ queue_delayed_work(port->nhi_ctxt->net_workqueue,
+ &port->login_retry_work,
+ msecs_to_jiffies(5 * MSEC_PER_SEC));
+ else
+ netif_notice(port, link, port->net_dev, "port %u (%#x) login timeout after %u retries\n",
+ port->num, port->negotiation_status,
+ port->login_retry_count);
+}
+
+void negotiation_events(struct net_device *net_dev,
+ enum medium_status medium_sts)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ void __iomem *iobase = port->nhi_ctxt->iobase;
+ u32 sof_eof_en, tx_ring_conf, rx_ring_conf, e2e_en;
+ void __iomem *reg;
+ unsigned long flags;
+ u16 hop_id;
+ bool send_logout;
+
+ if (!netif_running(net_dev)) {
+ netif_dbg(port, link, net_dev, "port %u (%#x) is down\n",
+ port->num, port->negotiation_status);
+ return;
+ }
+
+ netif_dbg(port, link, net_dev, "port %u (%#x) receive event %u\n",
+ port->num, port->negotiation_status, medium_sts);
+
+ switch (medium_sts) {
+ case MEDIUM_DISCONNECTED:
+ send_logout = (port->negotiation_status
+ & (BIT(MEDIUM_CONNECTED)
+ | BIT(MEDIUM_READY_FOR_CONNECTION)));
+ send_logout = send_logout && !(port->negotiation_status &
+ BIT(RECEIVE_LOGOUT));
+
+ tbt_net_tear_down(net_dev, send_logout);
+ port->negotiation_status = BIT(MEDIUM_DISCONNECTED);
+ break;
+
+ case MEDIUM_CONNECTED:
+ /*
+ * Check whether the other side sent a logout in the meantime.
+ * If so, don't allow the connection to take place
+ * and disconnect the path.
+ */
+ if (port->negotiation_status & BIT(RECEIVE_LOGOUT)) {
+ disconnect_path(port, STAGE_1 | STAGE_2);
+ break;
+ }
+
+ port->negotiation_status = BIT(MEDIUM_CONNECTED);
+
+ /* configure TX ring */
+ reg = iobase + REG_TX_RING_BASE +
+ (port->local_path * REG_RING_STEP);
+
+ tx_ring_conf = (TBT_NET_NUM_TX_BUFS << REG_RING_SIZE_SHIFT) &
+ REG_RING_SIZE_MASK;
+
+ iowrite32(tx_ring_conf, reg + REG_RING_SIZE_OFFSET);
+
+ /* enable the rings */
+ reg = iobase + REG_TX_OPTIONS_BASE +
+ (port->local_path * REG_OPTS_STEP);
+ if (port->enable_full_e2e) {
+ iowrite32(REG_OPTS_VALID | REG_OPTS_E2E_EN, reg);
+ hop_id = port->local_path;
+ } else {
+ iowrite32(REG_OPTS_VALID, reg);
+ hop_id = TBT_EXIST_BUT_UNUSED_HOPID;
+ }
+
+ reg = iobase + REG_RX_OPTIONS_BASE +
+ (port->local_path * REG_OPTS_STEP);
+
+ sof_eof_en = (BIT(PDF_TBT_NET_START_OF_FRAME) <<
+ REG_RX_OPTS_MASK_SOF_SHIFT) &
+ REG_RX_OPTS_MASK_SOF_MASK;
+
+ sof_eof_en |= (BIT(PDF_TBT_NET_END_OF_FRAME) <<
+ REG_RX_OPTS_MASK_EOF_SHIFT) &
+ REG_RX_OPTS_MASK_EOF_MASK;
+
+ iowrite32(sof_eof_en, reg + REG_RX_OPTS_MASK_OFFSET);
+
+ e2e_en = REG_OPTS_VALID | REG_OPTS_E2E_EN;
+ e2e_en |= (hop_id << REG_RX_OPTS_TX_E2E_HOP_ID_SHIFT) &
+ REG_RX_OPTS_TX_E2E_HOP_ID_MASK;
+
+ iowrite32(e2e_en, reg);
+
+ /*
+ * Configure RX ring
+ * must be after enable ring for E2E to work
+ */
+ reg = iobase + REG_RX_RING_BASE +
+ (port->local_path * REG_RING_STEP);
+
+ rx_ring_conf = (TBT_NET_NUM_RX_BUFS << REG_RING_SIZE_SHIFT) &
+ REG_RING_SIZE_MASK;
+
+ rx_ring_conf |= (TBT_RING_MAX_FRAME_SIZE <<
+ REG_RING_BUF_SIZE_SHIFT) &
+ REG_RING_BUF_SIZE_MASK;
+
+ iowrite32(rx_ring_conf, reg + REG_RING_SIZE_OFFSET);
+
+ spin_lock_irqsave(&port->nhi_ctxt->lock, flags);
+ /* enable RX interrupt */
+ iowrite32(ioread32(iobase + REG_RING_INTERRUPT_BASE) |
+ REG_RING_INT_RX_PROCESSED(port->local_path,
+ port->nhi_ctxt->num_paths),
+ iobase + REG_RING_INTERRUPT_BASE);
+ spin_unlock_irqrestore(&port->nhi_ctxt->lock, flags);
+
+ netif_info(port, link, net_dev, "Thunderbolt(TM) Networking port %u - ready\n",
+ port->num);
+
+ netif_carrier_on(net_dev);
+ netif_start_queue(net_dev);
+ break;
+
+ case MEDIUM_READY_FOR_CONNECTION:
+ /*
+ * If medium is connected, no reason to go back,
+ * keep it 'connected'.
+ * If received login response, don't need to trigger login
+ * retries again.
+ */
+ if (unlikely(port->negotiation_status &
+ (BIT(MEDIUM_CONNECTED) |
+ BIT(RECEIVE_LOGIN_RESPONSE))))
+ break;
+
+ port->negotiation_status = BIT(MEDIUM_READY_FOR_CONNECTION);
+ port->login_retry_count = 0;
+ queue_delayed_work(port->nhi_ctxt->net_workqueue,
+ &port->login_retry_work, 0);
+ break;
+
+ default:
+ break;
+ }
+}
+
+void negotiation_messages(struct net_device *net_dev,
+ struct thunderbolt_ip_header *hdr)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+ __be32 status;
+
+ if (!netif_running(net_dev)) {
+ netif_dbg(port, link, net_dev, "port %u (%#x) is down\n",
+ port->num, port->negotiation_status);
+ return;
+ }
+
+ switch (hdr->packet_type) {
+ case cpu_to_be32(THUNDERBOLT_IP_LOGIN_TYPE):
+ port->transmit_path = be32_to_cpu(
+ ((struct thunderbolt_ip_login *)hdr)->transmit_path);
+ netif_dbg(port, link, net_dev, "port %u (%#x) receive ThunderboltIP login message with transmit path %u\n",
+ port->num, port->negotiation_status,
+ port->transmit_path);
+
+ if (unlikely(port->negotiation_status &
+ BIT(MEDIUM_DISCONNECTED)))
+ break;
+
+ queue_work(port->nhi_ctxt->net_workqueue,
+ &port->login_response_work);
+
+ if (unlikely(port->negotiation_status & BIT(MEDIUM_CONNECTED)))
+ break;
+
+ /*
+ * If the peer has already responded to our login and
+ * this is the first time we ack their login,
+ * approve the inter-domain connection now.
+ */
+ if (port->negotiation_status & BIT(RECEIVE_LOGIN_RESPONSE)) {
+ if (!(port->negotiation_status & BIT(RECEIVE_LOGIN)))
+ queue_work(port->nhi_ctxt->net_workqueue,
+ &port->approve_inter_domain_work);
+ /*
+ * If we reached the max number of retries or received a
+ * logout earlier, schedule another round of login retries.
+ */
+ } else if ((port->login_retry_count >= NUM_TX_LOGIN_RETRIES) ||
+ (port->negotiation_status & BIT(RECEIVE_LOGOUT))) {
+ port->negotiation_status &= ~(BIT(RECEIVE_LOGOUT));
+ port->login_retry_count = 0;
+ queue_delayed_work(port->nhi_ctxt->net_workqueue,
+ &port->login_retry_work, 0);
+ }
+
+ port->negotiation_status |= BIT(RECEIVE_LOGIN);
+
+ break;
+
+ case cpu_to_be32(THUNDERBOLT_IP_LOGIN_RESPONSE_TYPE):
+ status = ((struct thunderbolt_ip_login_response *)hdr)->status;
+ if (likely(status == 0)) {
+ netif_dbg(port, link, net_dev, "port %u (%#x) receive ThunderboltIP login response message\n",
+ port->num,
+ port->negotiation_status);
+
+ if (unlikely(port->negotiation_status &
+ (BIT(MEDIUM_DISCONNECTED) |
+ BIT(MEDIUM_CONNECTED) |
+ BIT(RECEIVE_LOGIN_RESPONSE))))
+ break;
+
+ port->negotiation_status |=
+ BIT(RECEIVE_LOGIN_RESPONSE);
+ cancel_delayed_work_sync(&port->login_retry_work);
+ /*
+ * A login was already received from the peer and now its
+ * response to our login arrived, so approve the inter-domain.
+ */
+ if (port->negotiation_status & BIT(RECEIVE_LOGIN))
+ queue_work(port->nhi_ctxt->net_workqueue,
+ &port->approve_inter_domain_work);
+ else
+ port->negotiation_status &=
+ ~BIT(RECEIVE_LOGOUT);
+ } else {
+ netif_notice(port, link, net_dev, "port %u (%#x) receive ThunderboltIP login response message with status %u\n",
+ port->num,
+ port->negotiation_status,
+ be32_to_cpu(status));
+ }
+ break;
+
+ case cpu_to_be32(THUNDERBOLT_IP_LOGOUT_TYPE):
+ netif_dbg(port, link, net_dev, "port %u (%#x) receive ThunderboltIP logout message\n",
+ port->num, port->negotiation_status);
+
+ queue_work(port->nhi_ctxt->net_workqueue,
+ &port->status_reply_work);
+ port->negotiation_status &= ~(BIT(RECEIVE_LOGIN) |
+ BIT(RECEIVE_LOGIN_RESPONSE));
+ port->negotiation_status |= BIT(RECEIVE_LOGOUT);
+
+ if (!(port->negotiation_status & BIT(MEDIUM_CONNECTED))) {
+ tbt_net_tear_down(net_dev, false);
+ break;
+ }
+
+ tbt_net_tear_down(net_dev, true);
+
+ port->negotiation_status |= BIT(MEDIUM_READY_FOR_CONNECTION);
+ port->negotiation_status &= ~(BIT(MEDIUM_CONNECTED));
+ break;
+
+ case cpu_to_be32(THUNDERBOLT_IP_STATUS_TYPE):
+ netif_dbg(port, link, net_dev, "port %u (%#x) receive ThunderboltIP status message with status %u\n",
+ port->num, port->negotiation_status,
+ be32_to_cpu(
+ ((struct thunderbolt_ip_status *)hdr)->status));
+ break;
+ }
+}
+
+void nhi_dealloc_etherdev(struct net_device *net_dev)
+{
+ unregister_netdev(net_dev);
+ free_netdev(net_dev);
+}
+
+void nhi_update_etherdev(struct tbt_nhi_ctxt *nhi_ctxt,
+ struct net_device *net_dev, struct genl_info *info)
+{
+ struct tbt_port *port = netdev_priv(net_dev);
+
+ nla_memcpy(&(port->route_str),
+ info->attrs[NHI_ATTR_LOCAL_ROUTE_STRING],
+ sizeof(port->route_str));
+ nla_memcpy(&port->interdomain_remote_uuid,
+ info->attrs[NHI_ATTR_REMOTE_UUID],
+ sizeof(port->interdomain_remote_uuid));
+ port->local_depth = nla_get_u8(info->attrs[NHI_ATTR_LOCAL_DEPTH]);
+ port->enable_full_e2e = nhi_ctxt->support_full_e2e ?
+ nla_get_flag(info->attrs[NHI_ATTR_ENABLE_FULL_E2E]) : false;
+ port->match_frame_id =
+ nla_get_flag(info->attrs[NHI_ATTR_MATCH_FRAME_ID]);
+ port->frame_id = 0;
+}
+
+struct net_device *nhi_alloc_etherdev(struct tbt_nhi_ctxt *nhi_ctxt,
+ u8 port_num, struct genl_info *info)
+{
+ struct tbt_port *port;
+ struct net_device *net_dev = alloc_etherdev(sizeof(struct tbt_port));
+ u32 hash;
+
+ if (!net_dev)
+ return NULL;
+
+ SET_NETDEV_DEV(net_dev, &nhi_ctxt->pdev->dev);
+
+ port = netdev_priv(net_dev);
+ port->nhi_ctxt = nhi_ctxt;
+ port->net_dev = net_dev;
+ nla_memcpy(&port->interdomain_local_uuid,
+ info->attrs[NHI_ATTR_LOCAL_UUID],
+ sizeof(port->interdomain_local_uuid));
+ nhi_update_etherdev(nhi_ctxt, net_dev, info);
+ port->num = port_num;
+ port->local_path = PATH_FROM_PORT(nhi_ctxt->num_paths, port_num);
+
+ port->msg_enable = netif_msg_init(debug, DEFAULT_MSG_ENABLE);
+
+ net_dev->addr_assign_type = NET_ADDR_PERM;
+ /* unicast and locally administered MAC */
+ net_dev->dev_addr[0] = (port_num << 4) | 0x02;
+ hash = jhash2((u32 *)&port->interdomain_local_uuid,
+ sizeof(port->interdomain_local_uuid)/sizeof(u32), 0);
+
+ memcpy(net_dev->dev_addr + 1, &hash, sizeof(hash));
+ hash = jhash2((u32 *)&port->interdomain_local_uuid,
+ sizeof(port->interdomain_local_uuid)/sizeof(u32), hash);
+
+ net_dev->dev_addr[5] = hash & 0xff;
+
+ scnprintf(net_dev->name, sizeof(net_dev->name), "tbtnet%%dp%hhu",
+ port_num);
+
+ INIT_DELAYED_WORK(&port->login_retry_work, login_retry);
+ INIT_WORK(&port->login_response_work, login_response);
+ INIT_WORK(&port->logout_work, logout);
+ INIT_WORK(&port->status_reply_work, status_reply);
+ INIT_WORK(&port->approve_inter_domain_work, approve_inter_domain);
+
+ netif_info(port, probe, net_dev,
+ "Thunderbolt(TM) Networking port %u - MAC Address: %pM\n",
+ port_num, net_dev->dev_addr);
+
+ return net_dev;
+}
diff --git a/drivers/thunderbolt/icm/net.h b/drivers/thunderbolt/icm/net.h
index 0281201..1cb6701 100644
--- a/drivers/thunderbolt/icm/net.h
+++ b/drivers/thunderbolt/icm/net.h
@@ -23,6 +23,10 @@
#include <linux/semaphore.h>
#include <net/genetlink.h>
+#define APPLE_THUNDERBOLT_IP_PROTOCOL_UUID \
+ UUID_BE(0x9E588F79, 0x478A, 0x1636, \
+ 0x64, 0x56, 0xC6, 0x97, 0xDD, 0xC8, 0x20, 0xA9)
+
/*
* Each physical port contains 2 channels.
* Devices are exposed to user based on physical ports.
@@ -33,6 +37,9 @@
* host channel/link which starts from 1.
*/
#define PORT_NUM_FROM_LINK(link) (((link) - 1) / CHANNELS_PER_PORT_NUM)
+#define PORT_NUM_FROM_MSG(msg) PORT_NUM_FROM_LINK(((msg) & \
+ INTER_DOMAIN_LINK_MASK) >> \
+ INTER_DOMAIN_LINK_SHIFT)
#define TBT_TX_RING_FULL(prod, cons, size) ((((prod) + 1) % (size)) == (cons))
#define TBT_TX_RING_EMPTY(prod, cons) ((prod) == (cons))
@@ -125,6 +132,17 @@ enum {
CC_SET_FW_MODE_FDA_DA_ALL
};
+struct route_string {
+ u32 hi;
+ u32 lo;
+};
+
+struct route_string_be {
+ __be32 hi;
+ __be32 lo;
+};
+
+#define L0_PORT_NUM(cpu_route_str_lo) ((cpu_route_str_lo) & GENMASK(5, 0))
/* NHI genetlink attributes */
enum {
@@ -138,12 +156,53 @@ enum {
NHI_ATTR_PDF,
NHI_ATTR_MSG_TO_ICM,
NHI_ATTR_MSG_FROM_ICM,
+ NHI_ATTR_LOCAL_ROUTE_STRING,
+ NHI_ATTR_LOCAL_UUID,
+ NHI_ATTR_REMOTE_UUID,
+ NHI_ATTR_LOCAL_DEPTH,
+ NHI_ATTR_ENABLE_FULL_E2E,
+ NHI_ATTR_MATCH_FRAME_ID,
__NHI_ATTR_MAX,
};
#define NHI_ATTR_MAX (__NHI_ATTR_MAX - 1)
+/* ThunderboltIP Packet Types */
+enum thunderbolt_ip_packet_type {
+ THUNDERBOLT_IP_LOGIN_TYPE,
+ THUNDERBOLT_IP_LOGIN_RESPONSE_TYPE,
+ THUNDERBOLT_IP_LOGOUT_TYPE,
+ THUNDERBOLT_IP_STATUS_TYPE
+};
+
+struct thunderbolt_ip_header {
+ struct route_string_be route_str;
+ __be32 attributes;
+#define HDR_ATTR_LEN_SHIFT 0
+#define HDR_ATTR_LEN_MASK GENMASK(5, HDR_ATTR_LEN_SHIFT)
+#define HDR_ATTR_SEQ_NUM_SHIFT 27
+#define HDR_ATTR_SEQ_NUM_MASK GENMASK(28, HDR_ATTR_SEQ_NUM_SHIFT)
+ uuid_be apple_tbt_ip_proto_uuid;
+ uuid_be initiator_uuid;
+ uuid_be target_uuid;
+ __be32 packet_type;
+ __be32 command_id;
+};
+
+enum medium_status {
+ /* Handle cable disconnection or peer down */
+ MEDIUM_DISCONNECTED,
+ /* Connection is fully established */
+ MEDIUM_CONNECTED,
+ /* Awaiting approval by the user-space module */
+ MEDIUM_READY_FOR_APPROVAL,
+ /* Approved by user space, waiting for the establishment flow to finish */
+ MEDIUM_READY_FOR_CONNECTION,
+ NUM_MEDIUM_STATUSES
+};
+
struct port_net_dev {
struct net_device *net_dev;
+ enum medium_status medium_sts;
struct mutex state_mutex;
};
@@ -213,5 +272,16 @@ struct tbt_nhi_ctxt {
int nhi_send_message(struct tbt_nhi_ctxt *nhi_ctxt, enum pdf_value pdf,
u32 msg_len, const void *msg, bool ignore_icm_resp);
int nhi_mailbox(struct tbt_nhi_ctxt *nhi_ctxt, u32 cmd, u32 data, bool deinit);
+struct net_device *nhi_alloc_etherdev(struct tbt_nhi_ctxt *nhi_ctxt,
+ u8 port_num, struct genl_info *info);
+void nhi_update_etherdev(struct tbt_nhi_ctxt *nhi_ctxt,
+ struct net_device *net_dev, struct genl_info *info);
+void nhi_dealloc_etherdev(struct net_device *net_dev);
+void negotiation_events(struct net_device *net_dev,
+ enum medium_status medium_sts);
+void negotiation_messages(struct net_device *net_dev,
+ struct thunderbolt_ip_header *hdr);
+void tbt_net_rx_msi(struct net_device *net_dev);
+void tbt_net_tx_msi(struct net_device *net_dev);
#endif
--
2.7.4
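
As a reading aid for prepare_header() in the patch above, here is a minimal
user-space sketch of how the attributes word is packed before cpu_to_be32()
is applied. The mask values are the expansions of GENMASK(5, 0) and
GENMASK(28, 27) from net.h; the helper name hdr_attributes() is made up for
illustration and does not exist in the driver.

```c
#include <stdint.h>

#define HDR_ATTR_LEN_SHIFT	0
#define HDR_ATTR_LEN_MASK	0x0000003fu	/* GENMASK(5, 0) */
#define HDR_ATTR_SEQ_NUM_SHIFT	27
#define HDR_ATTR_SEQ_NUM_MASK	0x18000000u	/* GENMASK(28, 27) */

/*
 * Hypothetical helper: builds the host-order attributes word from a
 * sequence number and a payload length in dwords. Bits that do not fit
 * their field are dropped by the masks.
 */
static uint32_t hdr_attributes(uint32_t seq_num, uint32_t len_dwords)
{
	return ((seq_num << HDR_ATTR_SEQ_NUM_SHIFT) & HDR_ATTR_SEQ_NUM_MASK) |
	       ((len_dwords << HDR_ATTR_LEN_SHIFT) & HDR_ATTR_LEN_MASK);
}
```

The sequence number is a 2-bit field and the length a 6-bit field, so for
example a sequence number of 4 or a length of 0x40 dwords is masked to zero.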
* [PATCH v9 3/8] thunderbolt: Communication with the ICM (firmware)
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
This patch provides the communication protocol between the driver and the
Intel Connection Manager (ICM) firmware that runs in the Thunderbolt
controller on non-Apple hardware.
The ICM firmware-based controller establishes and maintains the
Thunderbolt Networking connection, so the driver must be able to
communicate with it.
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
---
drivers/thunderbolt/icm/Makefile | 2 +
drivers/thunderbolt/icm/icm_nhi.c | 1257 +++++++++++++++++++++++++++++++++++++
drivers/thunderbolt/icm/icm_nhi.h | 85 +++
drivers/thunderbolt/icm/net.h | 217 +++++++
4 files changed, 1561 insertions(+)
create mode 100644 drivers/thunderbolt/icm/Makefile
create mode 100644 drivers/thunderbolt/icm/icm_nhi.c
create mode 100644 drivers/thunderbolt/icm/icm_nhi.h
create mode 100644 drivers/thunderbolt/icm/net.h
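
As a reading aid for nhi_send_message() in icm_nhi.c below, here is a minimal
user-space sketch of the producer/consumer index handling on the TX ring. It
assumes only the TBT_TX_RING_FULL() test from net.h; the helper
tx_ring_reserve() and the ring size passed to it are hypothetical and are not
part of the driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same test as TBT_TX_RING_FULL() in net.h: one slot is always left unused. */
static bool tx_ring_full(uint32_t prod, uint32_t cons, uint32_t size)
{
	return ((prod + 1) % size) == cons;
}

/*
 * Hypothetical helper mirroring the index checks in nhi_send_message():
 * returns the slot to fill and advances *prod, or a negative errno.
 */
static int tx_ring_reserve(uint32_t *prod, uint32_t cons, uint32_t size)
{
	uint32_t slot = *prod;

	if (slot >= size)
		return -ENODEV;	/* producer index out of range */
	if (tx_ring_full(slot, cons, size))
		return -ENOSPC;	/* no free descriptor */

	*prod = (slot + 1) % size;	/* wrap around at the ring size */
	return (int)slot;
}
```

In the driver the new producer index is written back to the ring register
only after the descriptor and the buffer have been filled; the sketch covers
just the index arithmetic.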
diff --git a/drivers/thunderbolt/icm/Makefile b/drivers/thunderbolt/icm/Makefile
new file mode 100644
index 0000000..f0d0fbb
--- /dev/null
+++ b/drivers/thunderbolt/icm/Makefile
@@ -0,0 +1,2 @@
+obj-${CONFIG_THUNDERBOLT_ICM} += thunderbolt-icm.o
+thunderbolt-icm-objs := icm_nhi.o
diff --git a/drivers/thunderbolt/icm/icm_nhi.c b/drivers/thunderbolt/icm/icm_nhi.c
new file mode 100644
index 0000000..c843ce8
--- /dev/null
+++ b/drivers/thunderbolt/icm/icm_nhi.c
@@ -0,0 +1,1257 @@
+/*******************************************************************************
+ *
+ * Intel Thunderbolt(TM) driver
+ * Copyright(c) 2014 - 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ ******************************************************************************/
+
+#include <linux/printk.h>
+#include <linux/crc32.h>
+#include <linux/delay.h>
+#include <linux/dmi.h>
+#include "icm_nhi.h"
+#include "net.h"
+
+#define NHI_GENL_VERSION 1
+#define NHI_GENL_NAME "thunderbolt"
+
+#define DEVICE_DATA(num_ports, dma_port, nvm_ver_offset, nvm_auth_on_boot,\
+ support_full_e2e) \
+ ((num_ports) | ((dma_port) << 4) | ((nvm_ver_offset) << 10) | \
+ ((nvm_auth_on_boot) << 22) | ((support_full_e2e) << 23))
+#define DEVICE_DATA_NUM_PORTS(device_data) ((device_data) & 0xf)
+#define DEVICE_DATA_DMA_PORT(device_data) (((device_data) >> 4) & 0x3f)
+#define DEVICE_DATA_NVM_VER_OFFSET(device_data) (((device_data) >> 10) & 0xfff)
+#define DEVICE_DATA_NVM_AUTH_ON_BOOT(device_data) (((device_data) >> 22) & 0x1)
+#define DEVICE_DATA_SUPPORT_FULL_E2E(device_data) (((device_data) >> 23) & 0x1)
+
+#define USEC_TO_256_NSECS(usec) DIV_ROUND_UP((usec) * NSEC_PER_USEC, 256)
+
+/* NHI genetlink commands */
+enum {
+ NHI_CMD_UNSPEC,
+ NHI_CMD_SUBSCRIBE,
+ NHI_CMD_UNSUBSCRIBE,
+ NHI_CMD_QUERY_INFORMATION,
+ NHI_CMD_MSG_TO_ICM,
+ NHI_CMD_MSG_FROM_ICM,
+ NHI_CMD_MAILBOX,
+ NHI_CMD_APPROVE_TBT_NETWORKING,
+ NHI_CMD_ICM_IN_SAFE_MODE,
+ __NHI_CMD_MAX,
+};
+#define NHI_CMD_MAX (__NHI_CMD_MAX - 1)
+
+/* NHI genetlink policy */
+static const struct nla_policy nhi_genl_policy[NHI_ATTR_MAX + 1] = {
+ [NHI_ATTR_DRV_VERSION] = { .type = NLA_NUL_STRING, },
+ [NHI_ATTR_NVM_VER_OFFSET] = { .type = NLA_U16, },
+ [NHI_ATTR_NUM_PORTS] = { .type = NLA_U8, },
+ [NHI_ATTR_DMA_PORT] = { .type = NLA_U8, },
+ [NHI_ATTR_SUPPORT_FULL_E2E] = { .type = NLA_FLAG, },
+ [NHI_ATTR_MAILBOX_CMD] = { .type = NLA_U32, },
+ [NHI_ATTR_PDF] = { .type = NLA_U32, },
+ [NHI_ATTR_MSG_TO_ICM] = { .type = NLA_BINARY,
+ .len = TBT_ICM_RING_MAX_FRAME_SIZE },
+ [NHI_ATTR_MSG_FROM_ICM] = { .type = NLA_BINARY,
+ .len = TBT_ICM_RING_MAX_FRAME_SIZE },
+};
+
+/* NHI genetlink family */
+static struct genl_family nhi_genl_family = {
+ .id = GENL_ID_GENERATE,
+ .hdrsize = FIELD_SIZEOF(struct tbt_nhi_ctxt, id),
+ .name = NHI_GENL_NAME,
+ .version = NHI_GENL_VERSION,
+ .maxattr = NHI_ATTR_MAX,
+};
+
+static LIST_HEAD(controllers_list);
+static DEFINE_MUTEX(controllers_list_mutex);
+static atomic_t subscribers = ATOMIC_INIT(0);
+/*
+ * Some of the received generic netlink messages are replied to in a different
+ * context. The reply has to include the netlink portid of the sender, so it
+ * is saved in a global variable (current assumption is one sender).
+ */
+static u32 portid;
+
+static bool nhi_nvm_authenticated(struct tbt_nhi_ctxt *nhi_ctxt)
+{
+ enum icm_operation_mode op_mode;
+ u32 *msg_head, port_id, reg;
+ struct sk_buff *skb;
+ int i;
+
+ if (!nhi_ctxt->nvm_auth_on_boot)
+ return true;
+
+ /*
+ * The check for NVM authentication can take time for iCM,
+ * especially in low power configuration.
+ */
+ for (i = 0; i < 5; i++) {
+ u32 status = ioread32(nhi_ctxt->iobase + REG_FW_STS);
+
+ if (status & REG_FW_STS_NVM_AUTH_DONE)
+ break;
+
+ msleep(30);
+ }
+ /*
+ * The check for authentication is done after checking if iCM
+ * is present so it shouldn't reach the max tries (=5).
+ * Anyway, the check for full functionality below covers the error case.
+ */
+ reg = ioread32(nhi_ctxt->iobase + REG_OUTMAIL_CMD);
+ op_mode = (reg & REG_OUTMAIL_CMD_OP_MODE_MASK) >>
+ REG_OUTMAIL_CMD_OP_MODE_SHIFT;
+ if (op_mode == FULL_FUNCTIONALITY)
+ return true;
+
+ dev_warn(&nhi_ctxt->pdev->dev, "controller id %#x is in operation mode %#x status %#lx, NVM image update might be required\n",
+ nhi_ctxt->id, op_mode,
+ (reg & REG_OUTMAIL_CMD_STS_MASK)>>REG_OUTMAIL_CMD_STS_SHIFT);
+
+ skb = genlmsg_new(NLMSG_ALIGN(nhi_genl_family.hdrsize), GFP_KERNEL);
+ if (!skb) {
+ dev_err(&nhi_ctxt->pdev->dev, "genlmsg_new failed: not enough memory to send controller operational mode\n");
+ return false;
+ }
+
+ /* keep portid in a local variable for later use */
+ port_id = portid;
+ msg_head = genlmsg_put(skb, port_id, 0, &nhi_genl_family, 0,
+ NHI_CMD_ICM_IN_SAFE_MODE);
+ if (!msg_head) {
+ nlmsg_free(skb);
+ dev_err(&nhi_ctxt->pdev->dev, "genlmsg_put failed: not enough memory to send controller operational mode\n");
+ return false;
+ }
+
+ *msg_head = nhi_ctxt->id;
+
+ genlmsg_end(skb, msg_head);
+
+ genlmsg_unicast(&init_net, skb, port_id);
+
+ return false;
+}
+
+int nhi_send_message(struct tbt_nhi_ctxt *nhi_ctxt, enum pdf_value pdf,
+ u32 msg_len, const void *msg, bool ignore_icm_resp)
+{
+ u32 prod_cons, prod, cons, attr;
+ struct tbt_icm_ring_shared_memory *shared_mem;
+ void __iomem *reg = TBT_RING_CONS_PROD_REG(nhi_ctxt->iobase,
+ REG_TX_RING_BASE,
+ TBT_ICM_RING_NUM);
+
+ if (nhi_ctxt->d0_exit)
+ return -ENODEV;
+
+ prod_cons = ioread32(reg);
+ prod = TBT_REG_RING_PROD_EXTRACT(prod_cons);
+ cons = TBT_REG_RING_CONS_EXTRACT(prod_cons);
+ if (prod >= TBT_ICM_RING_NUM_TX_BUFS) {
+ dev_warn(&nhi_ctxt->pdev->dev,
+ "controller id %#x is not functional, producer %u out of range\n",
+ nhi_ctxt->id, prod);
+ return -ENODEV;
+ }
+ if (TBT_TX_RING_FULL(prod, cons, TBT_ICM_RING_NUM_TX_BUFS)) {
+ dev_err(&nhi_ctxt->pdev->dev,
+ "controller id %#x is not functional, TX ring full\n",
+ nhi_ctxt->id);
+ return -ENOSPC;
+ }
+
+ attr = (msg_len << DESC_ATTR_LEN_SHIFT) & DESC_ATTR_LEN_MASK;
+ attr |= (pdf << DESC_ATTR_EOF_SHIFT) & DESC_ATTR_EOF_MASK;
+
+ shared_mem = nhi_ctxt->icm_ring_shared_mem;
+ shared_mem->tx_buf_desc[prod].attributes = cpu_to_le32(attr);
+
+ memcpy(shared_mem->tx_buf[prod], msg, msg_len);
+
+ prod_cons &= ~REG_RING_PROD_MASK;
+ prod_cons |= (((prod + 1) % TBT_ICM_RING_NUM_TX_BUFS) <<
+ REG_RING_PROD_SHIFT) & REG_RING_PROD_MASK;
+
+ nhi_ctxt->wait_for_icm_resp = true;
+ nhi_ctxt->ignore_icm_resp = ignore_icm_resp;
+
+ iowrite32(prod_cons, reg);
+
+ return 0;
+}
+
+static int nhi_send_driver_ready_command(struct tbt_nhi_ctxt *nhi_ctxt)
+{
+ struct driver_ready_command {
+ __be32 req_code;
+ __be32 crc;
+ } drv_rdy_cmd = {
+ .req_code = cpu_to_be32(CC_DRV_READY),
+ };
+ u32 crc32;
+
+ crc32 = __crc32c_le(~0, (unsigned char const *)&drv_rdy_cmd,
+ offsetof(struct driver_ready_command, crc));
+
+ drv_rdy_cmd.crc = cpu_to_be32(~crc32);
+
+ return nhi_send_message(nhi_ctxt, PDF_SW_TO_FW_COMMAND,
+ sizeof(drv_rdy_cmd), &drv_rdy_cmd, false);
+}
+
+/**
+ * nhi_search_ctxt - search by id the controllers_list.
+ * Should be called under controllers_list_mutex.
+ *
+ * @id: id of the controller
+ *
+ * Return: driver context if found, NULL otherwise.
+ */
+static struct tbt_nhi_ctxt *nhi_search_ctxt(u32 id)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt;
+
+ list_for_each_entry(nhi_ctxt, &controllers_list, node)
+ if (nhi_ctxt->id == id)
+ return nhi_ctxt;
+
+ return NULL;
+}
+
+static int nhi_genl_subscribe(__always_unused struct sk_buff *u_skb,
+ struct genl_info *info)
+ __acquires(&nhi_ctxt->send_sem)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt;
+
+ /*
+ * To send driver ready command to iCM, need at least one subscriber
+ * that will handle the response.
+ * Currently the assumption is one user mode daemon as subscriber
+ * so one portid global variable (without locking).
+ */
+ if (atomic_inc_return(&subscribers) >= 1) {
+ portid = info->snd_portid;
+ if (mutex_lock_interruptible(&controllers_list_mutex)) {
+ atomic_dec_if_positive(&subscribers);
+ return -ERESTART;
+ }
+ list_for_each_entry(nhi_ctxt, &controllers_list, node) {
+ int res;
+
+ if (nhi_ctxt->d0_exit ||
+ !nhi_nvm_authenticated(nhi_ctxt))
+ continue;
+
+ res = down_timeout(&nhi_ctxt->send_sem,
+ msecs_to_jiffies(10*MSEC_PER_SEC));
+ if (res) {
+ dev_err(&nhi_ctxt->pdev->dev,
+ "%s: controller id %#x is not functional, timeout on waiting for FW response to previous message\n",
+ __func__, nhi_ctxt->id);
+ continue;
+ }
+
+ if (!mutex_trylock(&nhi_ctxt->d0_exit_send_mutex)) {
+ up(&nhi_ctxt->send_sem);
+ continue;
+ }
+
+ res = nhi_send_driver_ready_command(nhi_ctxt);
+
+ mutex_unlock(&nhi_ctxt->d0_exit_send_mutex);
+ if (res)
+ up(&nhi_ctxt->send_sem);
+ }
+ mutex_unlock(&controllers_list_mutex);
+ }
+
+ return 0;
+}
+
+static int nhi_genl_unsubscribe(__always_unused struct sk_buff *u_skb,
+ __always_unused struct genl_info *info)
+{
+ atomic_dec_if_positive(&subscribers);
+
+ return 0;
+}
+
+static int nhi_genl_query_information(__always_unused struct sk_buff *u_skb,
+ struct genl_info *info)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt;
+ struct sk_buff *skb;
+ bool msg_too_long;
+ int res = -ENODEV;
+ u32 *msg_head;
+
+ if (!info || !info->userhdr)
+ return -EINVAL;
+
+ skb = genlmsg_new(NLMSG_ALIGN(nhi_genl_family.hdrsize) +
+ nla_total_size(sizeof(DRV_VERSION)) +
+ nla_total_size(sizeof(nhi_ctxt->nvm_ver_offset)) +
+ nla_total_size(sizeof(nhi_ctxt->num_ports)) +
+ nla_total_size(sizeof(nhi_ctxt->dma_port)) +
+ nla_total_size(0), /* nhi_ctxt->support_full_e2e */
+ GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ msg_head = genlmsg_put_reply(skb, info, &nhi_genl_family, 0,
+ NHI_CMD_QUERY_INFORMATION);
+ if (!msg_head) {
+ res = -ENOMEM;
+ goto genl_put_reply_failure;
+ }
+
+ if (mutex_lock_interruptible(&controllers_list_mutex)) {
+ res = -ERESTART;
+ goto genl_put_reply_failure;
+ }
+
+ nhi_ctxt = nhi_search_ctxt(*(u32 *)info->userhdr);
+ if (nhi_ctxt && !nhi_ctxt->d0_exit) {
+ *msg_head = nhi_ctxt->id;
+
+ msg_too_long = !!nla_put_string(skb, NHI_ATTR_DRV_VERSION,
+ DRV_VERSION);
+
+ msg_too_long = msg_too_long ||
+ nla_put_u16(skb, NHI_ATTR_NVM_VER_OFFSET,
+ nhi_ctxt->nvm_ver_offset);
+
+ msg_too_long = msg_too_long ||
+ nla_put_u8(skb, NHI_ATTR_NUM_PORTS,
+ nhi_ctxt->num_ports);
+
+ msg_too_long = msg_too_long ||
+ nla_put_u8(skb, NHI_ATTR_DMA_PORT,
+ nhi_ctxt->dma_port);
+
+ if (msg_too_long) {
+ res = -EMSGSIZE;
+ goto release_ctl_list_lock;
+ }
+
+ if (nhi_ctxt->support_full_e2e &&
+ nla_put_flag(skb, NHI_ATTR_SUPPORT_FULL_E2E)) {
+ res = -EMSGSIZE;
+ goto release_ctl_list_lock;
+ }
+ mutex_unlock(&controllers_list_mutex);
+
+ genlmsg_end(skb, msg_head);
+
+ return genlmsg_reply(skb, info);
+ }
+
+release_ctl_list_lock:
+ mutex_unlock(&controllers_list_mutex);
+ genlmsg_cancel(skb, msg_head);
+
+genl_put_reply_failure:
+ nlmsg_free(skb);
+
+ return res;
+}
+
+static int nhi_genl_msg_to_icm(__always_unused struct sk_buff *u_skb,
+ struct genl_info *info)
+ __acquires(&nhi_ctxt->send_sem)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt;
+ int res = -ENODEV;
+ int msg_len;
+ void *msg;
+
+ if (!info || !info->userhdr || !info->attrs ||
+ !info->attrs[NHI_ATTR_PDF] || !info->attrs[NHI_ATTR_MSG_TO_ICM])
+ return -EINVAL;
+
+ msg_len = nla_len(info->attrs[NHI_ATTR_MSG_TO_ICM]);
+ if (msg_len > TBT_ICM_RING_MAX_FRAME_SIZE)
+ return -ENOBUFS;
+
+ msg = nla_data(info->attrs[NHI_ATTR_MSG_TO_ICM]);
+
+ if (mutex_lock_interruptible(&controllers_list_mutex))
+ return -ERESTART;
+
+ nhi_ctxt = nhi_search_ctxt(*(u32 *)info->userhdr);
+ if (nhi_ctxt && !nhi_ctxt->d0_exit) {
+ /*
+ * Wait up to 10 seconds for the FW response to the previous
+ * message; if it does not arrive, give up and report an error.
+ */
+ res = down_timeout(&nhi_ctxt->send_sem,
+ msecs_to_jiffies(10 * MSEC_PER_SEC));
+ if (res) {
+ void __iomem *rx_prod_cons = TBT_RING_CONS_PROD_REG(
+ nhi_ctxt->iobase,
+ REG_RX_RING_BASE,
+ TBT_ICM_RING_NUM);
+ void __iomem *tx_prod_cons = TBT_RING_CONS_PROD_REG(
+ nhi_ctxt->iobase,
+ REG_TX_RING_BASE,
+ TBT_ICM_RING_NUM);
+ dev_err(&nhi_ctxt->pdev->dev,
+ "%s: controller id %#x is not functional, timeout on waiting for FW response to previous message, tx prod&cons=%#x, rx prod&cons=%#x\n",
+ __func__, nhi_ctxt->id, ioread32(tx_prod_cons),
+ ioread32(rx_prod_cons));
+ goto release_ctl_list_lock;
+ }
+
+ if (!mutex_trylock(&nhi_ctxt->d0_exit_send_mutex)) {
+ up(&nhi_ctxt->send_sem);
+ goto release_ctl_list_lock;
+ }
+
+ mutex_unlock(&controllers_list_mutex);
+
+ res = nhi_send_message(nhi_ctxt,
+ nla_get_u32(info->attrs[NHI_ATTR_PDF]),
+ msg_len, msg, false);
+
+ mutex_unlock(&nhi_ctxt->d0_exit_send_mutex);
+ if (res)
+ up(&nhi_ctxt->send_sem);
+
+ return res;
+ }
+
+release_ctl_list_lock:
+ mutex_unlock(&controllers_list_mutex);
+ return res;
+}
+
+int nhi_mailbox(struct tbt_nhi_ctxt *nhi_ctxt, u32 cmd, u32 data, bool deinit)
+{
+ u32 delay = deinit ? U32_C(20) : U32_C(100);
+ int i;
+
+ iowrite32(data, nhi_ctxt->iobase + REG_INMAIL_DATA);
+ iowrite32(cmd, nhi_ctxt->iobase + REG_INMAIL_CMD);
+
+#define NHI_INMAIL_CMD_RETRIES 50
+ /*
+ * READ_ONCE fetches the value of nhi_ctxt->d0_exit on every iteration
+ * and prevents the compiler from optimizing the read away.
+ * deinit = true continues the loop even if the D3 process has been
+ * carried out.
+ */
+ for (i = 0; (i < NHI_INMAIL_CMD_RETRIES) &&
+ (deinit || !READ_ONCE(nhi_ctxt->d0_exit)); i++) {
+ cmd = ioread32(nhi_ctxt->iobase + REG_INMAIL_CMD);
+
+ if (cmd & REG_INMAIL_CMD_ERROR)
+ return -EIO;
+
+ if (!(cmd & REG_INMAIL_CMD_REQUEST))
+ break;
+
+ msleep(delay);
+ }
+
+ if (i == NHI_INMAIL_CMD_RETRIES) {
+ if (!deinit)
+ dev_err(&nhi_ctxt->pdev->dev,
+ "controller id %#x is not functional, inmail timeout\n",
+ nhi_ctxt->id);
+ return -ETIMEDOUT;
+ }
+
+ return 0;
+}
+
+static int nhi_mailbox_generic(struct tbt_nhi_ctxt *nhi_ctxt, u32 mb_cmd)
+ __releases(&controllers_list_mutex)
+{
+ int res = -ENODEV;
+
+ if (mutex_lock_interruptible(&nhi_ctxt->mailbox_mutex)) {
+ res = -ERESTART;
+ goto release_ctl_list_lock;
+ }
+
+ if (!mutex_trylock(&nhi_ctxt->d0_exit_mailbox_mutex)) {
+ mutex_unlock(&nhi_ctxt->mailbox_mutex);
+ goto release_ctl_list_lock;
+ }
+
+ mutex_unlock(&controllers_list_mutex);
+
+ res = nhi_mailbox(nhi_ctxt, mb_cmd, 0, false);
+ mutex_unlock(&nhi_ctxt->d0_exit_mailbox_mutex);
+ mutex_unlock(&nhi_ctxt->mailbox_mutex);
+
+ return res;
+
+release_ctl_list_lock:
+ mutex_unlock(&controllers_list_mutex);
+ return res;
+}
+
+static int nhi_genl_mailbox(__always_unused struct sk_buff *u_skb,
+ struct genl_info *info)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt;
+ u32 cmd, mb_cmd;
+
+ if (!info || !info->userhdr || !info->attrs ||
+ !info->attrs[NHI_ATTR_MAILBOX_CMD])
+ return -EINVAL;
+
+ cmd = nla_get_u32(info->attrs[NHI_ATTR_MAILBOX_CMD]);
+ mb_cmd = ((cmd << REG_INMAIL_CMD_CMD_SHIFT) &
+ REG_INMAIL_CMD_CMD_MASK) | REG_INMAIL_CMD_REQUEST;
+
+ if (mutex_lock_interruptible(&controllers_list_mutex))
+ return -ERESTART;
+
+ nhi_ctxt = nhi_search_ctxt(*(u32 *)info->userhdr);
+ if (nhi_ctxt && !nhi_ctxt->d0_exit)
+ return nhi_mailbox_generic(nhi_ctxt, mb_cmd);
+
+ mutex_unlock(&controllers_list_mutex);
+ return -ENODEV;
+}
+
+
+static int nhi_genl_send_msg(struct tbt_nhi_ctxt *nhi_ctxt, enum pdf_value pdf,
+ const u8 *msg, u32 msg_len)
+{
+ u32 *msg_head, port_id;
+ struct sk_buff *skb;
+ int res;
+
+ if (atomic_read(&subscribers) < 1)
+ return -ENOTCONN;
+
+ skb = genlmsg_new(NLMSG_ALIGN(nhi_genl_family.hdrsize) +
+ nla_total_size(msg_len) +
+ nla_total_size(sizeof(pdf)),
+ GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ port_id = portid;
+ msg_head = genlmsg_put(skb, port_id, 0, &nhi_genl_family, 0,
+ NHI_CMD_MSG_FROM_ICM);
+ if (!msg_head) {
+ res = -ENOMEM;
+ goto genl_put_reply_failure;
+ }
+
+ *msg_head = nhi_ctxt->id;
+
+ if (nla_put_u32(skb, NHI_ATTR_PDF, pdf) ||
+ nla_put(skb, NHI_ATTR_MSG_FROM_ICM, msg_len, msg)) {
+ res = -EMSGSIZE;
+ goto nla_put_failure;
+ }
+
+ genlmsg_end(skb, msg_head);
+
+ return genlmsg_unicast(&init_net, skb, port_id);
+
+nla_put_failure:
+ genlmsg_cancel(skb, msg_head);
+genl_put_reply_failure:
+ nlmsg_free(skb);
+
+ return res;
+}
+
+static bool nhi_msg_from_icm_analysis(struct tbt_nhi_ctxt *nhi_ctxt,
+ enum pdf_value pdf,
+ const u8 *msg, u32 msg_len)
+{
+ /*
+ * preparation for messages that won't be sent,
+ * currently unused in this patch.
+ */
+ bool send_event = true;
+
+ switch (pdf) {
+ case PDF_ERROR_NOTIFICATION:
+ /* fallthrough */
+ case PDF_WRITE_CONFIGURATION_REGISTERS:
+ /* fallthrough */
+ case PDF_READ_CONFIGURATION_REGISTERS:
+ if (nhi_ctxt->wait_for_icm_resp) {
+ nhi_ctxt->wait_for_icm_resp = false;
+ up(&nhi_ctxt->send_sem);
+ }
+ /* fallthrough */
+ default:
+ break;
+ }
+
+ return send_event;
+}
+
+static void nhi_msgs_from_icm(struct work_struct *work)
+ __releases(&nhi_ctxt->send_sem)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt = container_of(work, typeof(*nhi_ctxt),
+ icm_msgs_work);
+ void __iomem *reg = TBT_RING_CONS_PROD_REG(nhi_ctxt->iobase,
+ REG_RX_RING_BASE,
+ TBT_ICM_RING_NUM);
+ u32 prod_cons, prod, cons;
+
+ prod_cons = ioread32(reg);
+ prod = TBT_REG_RING_PROD_EXTRACT(prod_cons);
+ cons = TBT_REG_RING_CONS_EXTRACT(prod_cons);
+ if (prod >= TBT_ICM_RING_NUM_RX_BUFS) {
+ dev_warn(&nhi_ctxt->pdev->dev,
+ "controller id %#x is not functional, producer %u out of range\n",
+ nhi_ctxt->id, prod);
+ return;
+ }
+ if (cons >= TBT_ICM_RING_NUM_RX_BUFS) {
+ dev_warn(&nhi_ctxt->pdev->dev,
+ "controller id %#x is not functional, consumer %u out of range\n",
+ nhi_ctxt->id, cons);
+ return;
+ }
+
+ while (!TBT_RX_RING_EMPTY(prod, cons, TBT_ICM_RING_NUM_RX_BUFS) &&
+ !nhi_ctxt->d0_exit) {
+ struct tbt_buf_desc *rx_desc;
+ u8 *msg;
+ u32 msg_len;
+ enum pdf_value pdf;
+ bool send_event;
+
+ cons = (cons + 1) % TBT_ICM_RING_NUM_RX_BUFS;
+ rx_desc = &(nhi_ctxt->icm_ring_shared_mem->rx_buf_desc[cons]);
+ if (!(le32_to_cpu(rx_desc->attributes) & DESC_ATTR_DESC_DONE))
+ usleep_range(10, 20);
+
+ rmb(); /* read the descriptor and the buffer after DD check */
+ pdf = (le32_to_cpu(rx_desc->attributes) & DESC_ATTR_EOF_MASK)
+ >> DESC_ATTR_EOF_SHIFT;
+ msg = nhi_ctxt->icm_ring_shared_mem->rx_buf[cons];
+ msg_len = (le32_to_cpu(rx_desc->attributes)&DESC_ATTR_LEN_MASK)
+ >> DESC_ATTR_LEN_SHIFT;
+
+ send_event = nhi_msg_from_icm_analysis(nhi_ctxt, pdf, msg,
+ msg_len);
+
+ if (send_event)
+ nhi_genl_send_msg(nhi_ctxt, pdf, msg, msg_len);
+
+ /* set the descriptor for another receive */
+ rx_desc->attributes = cpu_to_le32(DESC_ATTR_REQ_STS |
+ DESC_ATTR_INT_EN);
+ rx_desc->time = 0;
+ }
+
+ /* free the descriptors for more receive */
+ prod_cons &= ~REG_RING_CONS_MASK;
+ prod_cons |= (cons << REG_RING_CONS_SHIFT) & REG_RING_CONS_MASK;
+ iowrite32(prod_cons, reg);
+
+ if (!nhi_ctxt->d0_exit) {
+ unsigned long flags;
+
+ spin_lock_irqsave(&nhi_ctxt->lock, flags);
+ /* enable RX interrupt */
+ RING_INT_ENABLE_RX(nhi_ctxt->iobase, TBT_ICM_RING_NUM,
+ nhi_ctxt->num_paths);
+
+ spin_unlock_irqrestore(&nhi_ctxt->lock, flags);
+ }
+}
+
+static irqreturn_t nhi_icm_ring_rx_msix(int __always_unused irq, void *data)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt = data;
+
+ spin_lock(&nhi_ctxt->lock);
+ /*
+ * disable RX interrupt
+ * We want to allow interrupt mitigation until the work item
+ * has completed.
+ */
+ RING_INT_DISABLE_RX(nhi_ctxt->iobase, TBT_ICM_RING_NUM,
+ nhi_ctxt->num_paths);
+
+ spin_unlock(&nhi_ctxt->lock);
+
+ schedule_work(&nhi_ctxt->icm_msgs_work);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t nhi_msi(int __always_unused irq, void *data)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt = data;
+ u32 isr0, isr1, imr0, imr1;
+
+ /* clear on read */
+ isr0 = ioread32(nhi_ctxt->iobase + REG_RING_NOTIFY_BASE);
+ isr1 = ioread32(nhi_ctxt->iobase + REG_RING_NOTIFY_BASE +
+ REG_RING_NOTIFY_STEP);
+ if (unlikely(!isr0 && !isr1))
+ return IRQ_NONE;
+
+ spin_lock(&nhi_ctxt->lock);
+
+ imr0 = ioread32(nhi_ctxt->iobase + REG_RING_INTERRUPT_BASE);
+ imr1 = ioread32(nhi_ctxt->iobase + REG_RING_INTERRUPT_BASE +
+ REG_RING_INTERRUPT_STEP);
+ /* disable the arrived interrupts */
+ iowrite32(imr0 & ~isr0,
+ nhi_ctxt->iobase + REG_RING_INTERRUPT_BASE);
+ iowrite32(imr1 & ~isr1,
+ nhi_ctxt->iobase + REG_RING_INTERRUPT_BASE +
+ REG_RING_INTERRUPT_STEP);
+
+ spin_unlock(&nhi_ctxt->lock);
+
+ if (isr0 & REG_RING_INT_RX_PROCESSED(TBT_ICM_RING_NUM,
+ nhi_ctxt->num_paths))
+ schedule_work(&nhi_ctxt->icm_msgs_work);
+
+ return IRQ_HANDLED;
+}
+
+/**
+ * nhi_set_int_vec - Mapping of the MSIX vector entry to the ring
+ * @nhi_ctxt: contains data on NHI controller
+ * @path: ring to be mapped
+ * @msix_msg_id: msix entry to be mapped
+ */
+static inline void nhi_set_int_vec(struct tbt_nhi_ctxt *nhi_ctxt, u32 path,
+ u8 msix_msg_id)
+{
+ void __iomem *reg;
+ u32 step, shift, ivr;
+
+ if (msix_msg_id % 2)
+ path += nhi_ctxt->num_paths;
+
+ step = path / REG_INT_VEC_ALLOC_PER_REG;
+ shift = (path % REG_INT_VEC_ALLOC_PER_REG) *
+ REG_INT_VEC_ALLOC_FIELD_BITS;
+ reg = nhi_ctxt->iobase + REG_INT_VEC_ALLOC_BASE +
+ (step * REG_INT_VEC_ALLOC_STEP);
+ ivr = ioread32(reg) & ~(REG_INT_VEC_ALLOC_FIELD_MASK << shift);
+ iowrite32(ivr | (msix_msg_id << shift), reg);
+}
+
+/* NHI genetlink operations array */
+static const struct genl_ops nhi_ops[] = {
+ {
+ .cmd = NHI_CMD_SUBSCRIBE,
+ .policy = nhi_genl_policy,
+ .doit = nhi_genl_subscribe,
+ },
+ {
+ .cmd = NHI_CMD_UNSUBSCRIBE,
+ .policy = nhi_genl_policy,
+ .doit = nhi_genl_unsubscribe,
+ },
+ {
+ .cmd = NHI_CMD_QUERY_INFORMATION,
+ .policy = nhi_genl_policy,
+ .doit = nhi_genl_query_information,
+ },
+ {
+ .cmd = NHI_CMD_MSG_TO_ICM,
+ .policy = nhi_genl_policy,
+ .doit = nhi_genl_msg_to_icm,
+ .flags = GENL_ADMIN_PERM,
+ },
+ {
+ .cmd = NHI_CMD_MAILBOX,
+ .policy = nhi_genl_policy,
+ .doit = nhi_genl_mailbox,
+ .flags = GENL_ADMIN_PERM,
+ },
+};
+
+static int nhi_suspend(struct device *dev) __releases(&nhi_ctxt->send_sem)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt = pci_get_drvdata(to_pci_dev(dev));
+ void __iomem *rx_reg, *tx_reg;
+ u32 rx_reg_val, tx_reg_val;
+
+ /* must be after negotiation_events, since messages might be sent */
+ nhi_ctxt->d0_exit = true;
+
+ rx_reg = nhi_ctxt->iobase + REG_RX_OPTIONS_BASE +
+ (TBT_ICM_RING_NUM * REG_OPTS_STEP);
+ rx_reg_val = ioread32(rx_reg) & ~REG_OPTS_E2E_EN;
+ tx_reg = nhi_ctxt->iobase + REG_TX_OPTIONS_BASE +
+ (TBT_ICM_RING_NUM * REG_OPTS_STEP);
+ tx_reg_val = ioread32(tx_reg) & ~REG_OPTS_E2E_EN;
+ /* disable RX flow control */
+ iowrite32(rx_reg_val, rx_reg);
+ /* disable TX flow control */
+ iowrite32(tx_reg_val, tx_reg);
+ /* disable RX ring */
+ iowrite32(rx_reg_val & ~REG_OPTS_VALID, rx_reg);
+
+ mutex_lock(&nhi_ctxt->d0_exit_mailbox_mutex);
+ mutex_lock(&nhi_ctxt->d0_exit_send_mutex);
+
+ cancel_work_sync(&nhi_ctxt->icm_msgs_work);
+
+ if (nhi_ctxt->wait_for_icm_resp) {
+ nhi_ctxt->wait_for_icm_resp = false;
+ nhi_ctxt->ignore_icm_resp = false;
+ /*
+ * if there is response, it is lost, so unlock the send
+ * for the next resume.
+ */
+ up(&nhi_ctxt->send_sem);
+ }
+
+ mutex_unlock(&nhi_ctxt->d0_exit_send_mutex);
+ mutex_unlock(&nhi_ctxt->d0_exit_mailbox_mutex);
+
+ /* wait for all TX to finish */
+ usleep_range(5 * USEC_PER_MSEC, 7 * USEC_PER_MSEC);
+
+ /* disable all interrupts */
+ iowrite32(0, nhi_ctxt->iobase + REG_RING_INTERRUPT_BASE);
+ /* disable TX ring */
+ iowrite32(tx_reg_val & ~REG_OPTS_VALID, tx_reg);
+
+ return 0;
+}
+
+static int nhi_resume(struct device *dev) __acquires(&nhi_ctxt->send_sem)
+{
+ dma_addr_t phys;
+ struct tbt_nhi_ctxt *nhi_ctxt = pci_get_drvdata(to_pci_dev(dev));
+ struct tbt_buf_desc *desc;
+ void __iomem *iobase = nhi_ctxt->iobase;
+ void __iomem *reg;
+ int i;
+
+ if (nhi_ctxt->msix_entries) {
+ iowrite32(ioread32(iobase + REG_DMA_MISC) |
+ REG_DMA_MISC_INT_AUTO_CLEAR,
+ iobase + REG_DMA_MISC);
+ /*
+ * Vector #0, which is TX complete to ICM,
+ * isn't being used currently.
+ */
+ nhi_set_int_vec(nhi_ctxt, 0, 1);
+
+ for (i = 2; i < nhi_ctxt->num_vectors; i++)
+ nhi_set_int_vec(nhi_ctxt, nhi_ctxt->num_paths - (i/2),
+ i);
+ }
+
+ /* configure TX descriptors */
+ for (i = 0, phys = nhi_ctxt->icm_ring_shared_mem_dma_addr;
+ i < TBT_ICM_RING_NUM_TX_BUFS;
+ i++, phys += TBT_ICM_RING_MAX_FRAME_SIZE) {
+ desc = &nhi_ctxt->icm_ring_shared_mem->tx_buf_desc[i];
+ desc->phys = cpu_to_le64(phys);
+ desc->attributes = cpu_to_le32(DESC_ATTR_REQ_STS);
+ }
+ /* configure RX descriptors */
+ for (i = 0;
+ i < TBT_ICM_RING_NUM_RX_BUFS;
+ i++, phys += TBT_ICM_RING_MAX_FRAME_SIZE) {
+ desc = &nhi_ctxt->icm_ring_shared_mem->rx_buf_desc[i];
+ desc->phys = cpu_to_le64(phys);
+ desc->attributes = cpu_to_le32(DESC_ATTR_REQ_STS |
+ DESC_ATTR_INT_EN);
+ }
+
+ /* configure throttling rate for interrupts */
+ for (i = 0, reg = iobase + REG_INT_THROTTLING_RATE;
+ i < NUM_INT_VECTORS;
+ i++, reg += REG_INT_THROTTLING_RATE_STEP) {
+ iowrite32(USEC_TO_256_NSECS(128), reg);
+ }
+
+ /* configure TX for ICM ring */
+ reg = iobase + REG_TX_RING_BASE + (TBT_ICM_RING_NUM * REG_RING_STEP);
+ phys = nhi_ctxt->icm_ring_shared_mem_dma_addr +
+ offsetof(struct tbt_icm_ring_shared_memory, tx_buf_desc);
+ iowrite32(lower_32_bits(phys), reg + REG_RING_PHYS_LO_OFFSET);
+ iowrite32(upper_32_bits(phys), reg + REG_RING_PHYS_HI_OFFSET);
+ iowrite32((TBT_ICM_RING_NUM_TX_BUFS << REG_RING_SIZE_SHIFT) &
+ REG_RING_SIZE_MASK,
+ reg + REG_RING_SIZE_OFFSET);
+
+ reg = iobase + REG_TX_OPTIONS_BASE + (TBT_ICM_RING_NUM*REG_OPTS_STEP);
+ iowrite32(REG_OPTS_RAW | REG_OPTS_VALID, reg);
+
+ /* configure RX for ICM ring */
+ reg = iobase + REG_RX_RING_BASE + (TBT_ICM_RING_NUM * REG_RING_STEP);
+ phys = nhi_ctxt->icm_ring_shared_mem_dma_addr +
+ offsetof(struct tbt_icm_ring_shared_memory, rx_buf_desc);
+ iowrite32(lower_32_bits(phys), reg + REG_RING_PHYS_LO_OFFSET);
+ iowrite32(upper_32_bits(phys), reg + REG_RING_PHYS_HI_OFFSET);
+ iowrite32(((TBT_ICM_RING_NUM_RX_BUFS << REG_RING_SIZE_SHIFT) &
+ REG_RING_SIZE_MASK) |
+ ((TBT_ICM_RING_MAX_FRAME_SIZE << REG_RING_BUF_SIZE_SHIFT) &
+ REG_RING_BUF_SIZE_MASK),
+ reg + REG_RING_SIZE_OFFSET);
+ iowrite32(((TBT_ICM_RING_NUM_RX_BUFS - 1) << REG_RING_CONS_SHIFT) &
+ REG_RING_CONS_MASK,
+ reg + REG_RING_CONS_PROD_OFFSET);
+
+ reg = iobase + REG_RX_OPTIONS_BASE + (TBT_ICM_RING_NUM*REG_OPTS_STEP);
+ iowrite32(REG_OPTS_RAW | REG_OPTS_VALID, reg);
+
+ /* enable RX interrupt */
+ RING_INT_ENABLE_RX(iobase, TBT_ICM_RING_NUM, nhi_ctxt->num_paths);
+
+ if (likely((atomic_read(&subscribers) > 0) &&
+ nhi_nvm_authenticated(nhi_ctxt))) {
+ down(&nhi_ctxt->send_sem);
+ nhi_ctxt->d0_exit = false;
+ mutex_lock(&nhi_ctxt->d0_exit_send_mutex);
+ /*
+ * interrupts are enabled here before send due to
+ * implicit barrier in mutex
+ */
+ nhi_send_driver_ready_command(nhi_ctxt);
+ mutex_unlock(&nhi_ctxt->d0_exit_send_mutex);
+ } else {
+ nhi_ctxt->d0_exit = false;
+ }
+
+ return 0;
+}
+
+static void icm_nhi_shutdown(struct pci_dev *pdev)
+{
+ nhi_suspend(&pdev->dev);
+}
+
+static void icm_nhi_remove(struct pci_dev *pdev)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt = pci_get_drvdata(pdev);
+ int i;
+
+ nhi_suspend(&pdev->dev);
+
+ if (nhi_ctxt->net_workqueue)
+ destroy_workqueue(nhi_ctxt->net_workqueue);
+
+ /*
+ * disable irq for msix or msi
+ */
+ if (likely(nhi_ctxt->msix_entries)) {
+ /* Vector #0 isn't being used currently */
+ devm_free_irq(&pdev->dev, nhi_ctxt->msix_entries[1].vector,
+ nhi_ctxt);
+ pci_disable_msix(pdev);
+ } else {
+ devm_free_irq(&pdev->dev, pdev->irq, nhi_ctxt);
+ pci_disable_msi(pdev);
+ }
+
+ /*
+ * remove controller from the controllers list
+ */
+ mutex_lock(&controllers_list_mutex);
+ list_del(&nhi_ctxt->node);
+ mutex_unlock(&controllers_list_mutex);
+
+ nhi_mailbox(
+ nhi_ctxt,
+ ((CC_DRV_UNLOADS_AND_DISCONNECT_INTER_DOMAIN_PATHS
+ << REG_INMAIL_CMD_CMD_SHIFT) &
+ REG_INMAIL_CMD_CMD_MASK) |
+ REG_INMAIL_CMD_REQUEST,
+ 0, true);
+
+ usleep_range(1 * USEC_PER_MSEC, 5 * USEC_PER_MSEC);
+ iowrite32(1, nhi_ctxt->iobase + REG_HOST_INTERFACE_RST);
+
+ mutex_destroy(&nhi_ctxt->d0_exit_send_mutex);
+ mutex_destroy(&nhi_ctxt->d0_exit_mailbox_mutex);
+ mutex_destroy(&nhi_ctxt->mailbox_mutex);
+ for (i = 0; i < nhi_ctxt->num_ports; i++)
+ mutex_destroy(&(nhi_ctxt->net_devices[i].state_mutex));
+}
+
+static int icm_nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct tbt_nhi_ctxt *nhi_ctxt;
+ void __iomem *iobase;
+ int i, res;
+ bool enable_msi = false;
+
+ res = pcim_enable_device(pdev);
+ if (res) {
+ dev_err(&pdev->dev, "cannot enable PCI device, aborting\n");
+ return res;
+ }
+
+ res = pcim_iomap_regions(pdev, 1 << NHI_MMIO_BAR, pci_name(pdev));
+ if (res) {
+ dev_err(&pdev->dev, "cannot obtain PCI resources, aborting\n");
+ return res;
+ }
+
+ /* cannot fail - table is allocated in pcim_iomap_regions */
+ iobase = pcim_iomap_table(pdev)[NHI_MMIO_BAR];
+
+ /* check if ICM is running */
+ if (!(ioread32(iobase + REG_FW_STS) & REG_FW_STS_ICM_EN)) {
+ dev_err(&pdev->dev, "ICM isn't present, aborting\n");
+ return -ENODEV;
+ }
+
+ nhi_ctxt = devm_kzalloc(&pdev->dev, sizeof(*nhi_ctxt), GFP_KERNEL);
+ if (!nhi_ctxt)
+ return -ENOMEM;
+
+ nhi_ctxt->pdev = pdev;
+ nhi_ctxt->iobase = iobase;
+ nhi_ctxt->id = (PCI_DEVID(pdev->bus->number, pdev->devfn) << 16) |
+ id->device;
+ /*
+ * Number of paths represents the number of rings available for
+ * the controller.
+ */
+ nhi_ctxt->num_paths = ioread32(iobase + REG_HOP_COUNT) &
+ REG_HOP_COUNT_TOTAL_PATHS_MASK;
+
+ nhi_ctxt->nvm_auth_on_boot = DEVICE_DATA_NVM_AUTH_ON_BOOT(
+ id->driver_data);
+ nhi_ctxt->support_full_e2e = DEVICE_DATA_SUPPORT_FULL_E2E(
+ id->driver_data);
+
+ nhi_ctxt->dma_port = DEVICE_DATA_DMA_PORT(id->driver_data);
+ /*
+ * Number of ports in the controller
+ */
+ nhi_ctxt->num_ports = DEVICE_DATA_NUM_PORTS(id->driver_data);
+ nhi_ctxt->nvm_ver_offset = DEVICE_DATA_NVM_VER_OFFSET(id->driver_data);
+
+ mutex_init(&nhi_ctxt->d0_exit_send_mutex);
+ mutex_init(&nhi_ctxt->d0_exit_mailbox_mutex);
+ mutex_init(&nhi_ctxt->mailbox_mutex);
+
+ sema_init(&nhi_ctxt->send_sem, 1);
+
+ INIT_WORK(&nhi_ctxt->icm_msgs_work, nhi_msgs_from_icm);
+
+ spin_lock_init(&nhi_ctxt->lock);
+
+ nhi_ctxt->net_devices = devm_kcalloc(&pdev->dev,
+ nhi_ctxt->num_ports,
+ sizeof(struct port_net_dev),
+ GFP_KERNEL);
+ if (!nhi_ctxt->net_devices)
+ return -ENOMEM;
+
+ for (i = 0; i < nhi_ctxt->num_ports; i++)
+ mutex_init(&(nhi_ctxt->net_devices[i].state_mutex));
+
+ /*
+ * allocating RX and TX vectors for ICM and per port
+ * for thunderbolt networking.
+ * The mapping of the vector is carried out by
+ * nhi_set_int_vec and looks like:
+ * 0=tx icm, 1=rx icm, 2=tx data port 0,
+ * 3=rx data port 0...
+ */
+ nhi_ctxt->num_vectors = (1 + nhi_ctxt->num_ports) * 2;
+ nhi_ctxt->msix_entries = devm_kcalloc(&pdev->dev,
+ nhi_ctxt->num_vectors,
+ sizeof(struct msix_entry),
+ GFP_KERNEL);
+ if (likely(nhi_ctxt->msix_entries)) {
+ for (i = 0; i < nhi_ctxt->num_vectors; i++)
+ nhi_ctxt->msix_entries[i].entry = i;
+ res = pci_enable_msix_exact(pdev,
+ nhi_ctxt->msix_entries,
+ nhi_ctxt->num_vectors);
+
+ if (res ||
+ /*
+ * Allocating ICM RX only.
+ * vector #0, which is TX complete to ICM,
+ * isn't being used currently
+ */
+ devm_request_irq(&pdev->dev,
+ nhi_ctxt->msix_entries[1].vector,
+ nhi_icm_ring_rx_msix, 0, pci_name(pdev),
+ nhi_ctxt)) {
+ devm_kfree(&pdev->dev, nhi_ctxt->msix_entries);
+ nhi_ctxt->msix_entries = NULL;
+ enable_msi = true;
+ }
+ } else {
+ enable_msi = true;
+ }
+ /*
+ * In case allocation didn't succeed, use msi instead of msix
+ */
+ if (enable_msi) {
+ res = pci_enable_msi(pdev);
+ if (res) {
+ dev_err(&pdev->dev, "cannot enable MSI, aborting\n");
+ return res;
+ }
+ res = devm_request_irq(&pdev->dev, pdev->irq, nhi_msi, 0,
+ pci_name(pdev), nhi_ctxt);
+ if (res) {
+ dev_err(&pdev->dev,
+ "request_irq failed %d, aborting\n", res);
+ return res;
+ }
+ }
+ /*
+ * try to work with address space of 64 bits.
+ * In case this doesn't work, work with 32 bits.
+ */
+ if (!dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
+ nhi_ctxt->pci_using_dac = true;
+ } else {
+ res = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+ if (res) {
+ dev_err(&pdev->dev,
+ "No suitable DMA available, aborting\n");
+ return res;
+ }
+ }
+
+ BUILD_BUG_ON(sizeof(struct tbt_buf_desc) != 16);
+ BUILD_BUG_ON(sizeof(struct tbt_icm_ring_shared_memory) > PAGE_SIZE);
+ nhi_ctxt->icm_ring_shared_mem = dmam_alloc_coherent(
+ &pdev->dev, sizeof(*nhi_ctxt->icm_ring_shared_mem),
+ &nhi_ctxt->icm_ring_shared_mem_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (nhi_ctxt->icm_ring_shared_mem == NULL) {
+ dev_err(&pdev->dev, "dmam_alloc_coherent failed, aborting\n");
+ return -ENOMEM;
+ }
+
+ nhi_ctxt->net_workqueue = create_singlethread_workqueue("thunderbolt");
+ if (!nhi_ctxt->net_workqueue) {
+ dev_err(&pdev->dev, "create_singlethread_workqueue failed, aborting\n");
+ return -ENOMEM;
+ }
+
+ pci_set_master(pdev);
+ pci_set_drvdata(pdev, nhi_ctxt);
+
+ nhi_resume(&pdev->dev);
+ /*
+ * Add the new controller at the end of the list
+ */
+ mutex_lock(&controllers_list_mutex);
+ list_add_tail(&nhi_ctxt->node, &controllers_list);
+ mutex_unlock(&controllers_list_mutex);
+
+ return res;
+}
+
+/*
+ * The tunneled pci bridges are siblings of us. Use resume_noirq to reenable
+ * the tunnels asap. A corresponding pci quirk blocks the downstream bridges
+ * resume_noirq until we are done.
+ */
+static const struct dev_pm_ops icm_nhi_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(nhi_suspend, nhi_resume)
+};
+
+static const struct pci_device_id nhi_pci_device_ids[] = {
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_REDWOOD_RIDGE_2C_NHI),
+ DEVICE_DATA(1, 5, 0xa, false, false) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_REDWOOD_RIDGE_4C_NHI),
+ DEVICE_DATA(2, 5, 0xa, false, false) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_FALCON_RIDGE_2C_NHI),
+ DEVICE_DATA(1, 5, 0xa, false, false) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_FALCON_RIDGE_4C_NHI),
+ DEVICE_DATA(2, 5, 0xa, false, false) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_WIN_RIDGE_2C_NHI),
+ DEVICE_DATA(1, 3, 0xa, false, false) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_2C_NHI),
+ DEVICE_DATA(1, 5, 0xa, true, true) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_NHI),
+ DEVICE_DATA(2, 5, 0xa, true, true) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_USBONLY_NHI),
+ DEVICE_DATA(1, 5, 0xa, true, true) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_NHI),
+ DEVICE_DATA(1, 3, 0xa, true, true) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_USBONLY_NHI),
+ DEVICE_DATA(1, 3, 0xa, true, true) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_NHI),
+ DEVICE_DATA(1, 5, 0xa, true, true) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_NHI),
+ DEVICE_DATA(2, 5, 0xa, true, true) },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_USBONLY_NHI),
+ DEVICE_DATA(1, 5, 0xa, true, true) },
+ { 0, }
+};
+
+MODULE_DEVICE_TABLE(pci, nhi_pci_device_ids);
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DRV_VERSION);
+
+static struct pci_driver icm_nhi_driver = {
+ .name = "thunderbolt",
+ .id_table = nhi_pci_device_ids,
+ .probe = icm_nhi_probe,
+ .remove = icm_nhi_remove,
+ .shutdown = icm_nhi_shutdown,
+ .driver.pm = &icm_nhi_pm_ops,
+};
+
+static int __init icm_nhi_init(void)
+{
+ int rc;
+
+ if (dmi_match(DMI_BOARD_VENDOR, "Apple Inc."))
+ return -ENODEV;
+
+ rc = genl_register_family_with_ops(&nhi_genl_family, nhi_ops);
+ if (rc)
+ goto failure;
+
+ rc = pci_register_driver(&icm_nhi_driver);
+ if (rc)
+ goto failure_genl;
+
+ return 0;
+
+failure_genl:
+ genl_unregister_family(&nhi_genl_family);
+
+failure:
+ pr_debug("nhi: error %d occurred in %s\n", rc, __func__);
+ return rc;
+}
+
+static void __exit icm_nhi_unload(void)
+{
+ genl_unregister_family(&nhi_genl_family);
+ pci_unregister_driver(&icm_nhi_driver);
+}
+
+module_init(icm_nhi_init);
+module_exit(icm_nhi_unload);
diff --git a/drivers/thunderbolt/icm/icm_nhi.h b/drivers/thunderbolt/icm/icm_nhi.h
new file mode 100644
index 0000000..1db37e5
--- /dev/null
+++ b/drivers/thunderbolt/icm/icm_nhi.h
@@ -0,0 +1,85 @@
+/*******************************************************************************
+ *
+ * Intel Thunderbolt(TM) driver
+ * Copyright(c) 2014 - 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ ******************************************************************************/
+
+#ifndef ICM_NHI_H_
+#define ICM_NHI_H_
+
+#include <linux/pci.h>
+#include "../nhi_regs.h"
+
+#define DRV_VERSION "16.1.55.1"
+
+#define PCI_DEVICE_ID_INTEL_WIN_RIDGE_2C_NHI 0x157d /*Tbt 2 Low Pwr*/
+#define PCI_DEVICE_ID_INTEL_WIN_RIDGE_2C_BRIDGE 0x157e
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_NHI 0x15bf /*Tbt 3 Low Pwr*/
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_BRIDGE 0x15c0
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_NHI 0x15d2 /*Thunderbolt 3*/
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_BRIDGE 0x15d3
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_NHI 0x15d9
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_BRIDGE 0x15da
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_USBONLY_NHI 0x15dc
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_USBONLY_NHI 0x15dd
+#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_USBONLY_NHI 0x15de
+
+#define TBT_ICM_RING_MAX_FRAME_SIZE 256
+#define TBT_ICM_RING_NUM 0
+#define TBT_RING_MAX_FRM_DATA_SZ (TBT_RING_MAX_FRAME_SIZE - \
+ sizeof(struct tbt_frame_header))
+
+enum icm_operation_mode {
+ SAFE_MODE,
+ AUTHENTICATION_MODE_FUNCTIONALITY,
+ ENDPOINT_OPERATION_MODE,
+ FULL_FUNCTIONALITY,
+};
+
+#define TBT_ICM_RING_NUM_TX_BUFS TBT_RING_MIN_NUM_BUFFERS
+#define TBT_ICM_RING_NUM_RX_BUFS ((PAGE_SIZE - (TBT_ICM_RING_NUM_TX_BUFS * \
+ (sizeof(struct tbt_buf_desc) + TBT_ICM_RING_MAX_FRAME_SIZE))) / \
+ (sizeof(struct tbt_buf_desc) + TBT_ICM_RING_MAX_FRAME_SIZE))
+
+/* struct tbt_icm_ring_shared_memory - memory area for DMA */
+struct tbt_icm_ring_shared_memory {
+ u8 tx_buf[TBT_ICM_RING_NUM_TX_BUFS][TBT_ICM_RING_MAX_FRAME_SIZE];
+ u8 rx_buf[TBT_ICM_RING_NUM_RX_BUFS][TBT_ICM_RING_MAX_FRAME_SIZE];
+ struct tbt_buf_desc tx_buf_desc[TBT_ICM_RING_NUM_TX_BUFS];
+ struct tbt_buf_desc rx_buf_desc[TBT_ICM_RING_NUM_RX_BUFS];
+} __aligned(TBT_ICM_RING_MAX_FRAME_SIZE);
+
+/* mailbox data from SW */
+#define REG_INMAIL_DATA 0x39900
+
+/* mailbox command from SW */
+#define REG_INMAIL_CMD 0x39904
+#define REG_INMAIL_CMD_CMD_SHIFT 0
+#define REG_INMAIL_CMD_CMD_MASK GENMASK(7, REG_INMAIL_CMD_CMD_SHIFT)
+#define REG_INMAIL_CMD_ERROR BIT(30)
+#define REG_INMAIL_CMD_REQUEST BIT(31)
+
+/* mailbox command from FW */
+#define REG_OUTMAIL_CMD 0x3990C
+#define REG_OUTMAIL_CMD_STS_SHIFT 0
+#define REG_OUTMAIL_CMD_STS_MASK GENMASK(7, REG_OUTMAIL_CMD_STS_SHIFT)
+#define REG_OUTMAIL_CMD_OP_MODE_SHIFT 8
+#define REG_OUTMAIL_CMD_OP_MODE_MASK \
+ GENMASK(11, REG_OUTMAIL_CMD_OP_MODE_SHIFT)
+#define REG_OUTMAIL_CMD_REQUEST BIT(31)
+
+#define REG_FW_STS 0x39944
+#define REG_FW_STS_ICM_EN GENMASK(1, 0)
+#define REG_FW_STS_NVM_AUTH_DONE BIT(31)
+
+#endif
diff --git a/drivers/thunderbolt/icm/net.h b/drivers/thunderbolt/icm/net.h
new file mode 100644
index 0000000..0281201
--- /dev/null
+++ b/drivers/thunderbolt/icm/net.h
@@ -0,0 +1,217 @@
+/*******************************************************************************
+ *
+ * Intel Thunderbolt(TM) driver
+ * Copyright(c) 2014 - 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ ******************************************************************************/
+
+#ifndef NET_H_
+#define NET_H_
+
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/mutex.h>
+#include <linux/semaphore.h>
+#include <net/genetlink.h>
+
+/*
+ * Each physical port contains 2 channels.
+ * Devices are exposed to user based on physical ports.
+ */
+#define CHANNELS_PER_PORT_NUM 2
+/*
+ * Calculate host physical port number (Zero-based numbering) from
+ * host channel/link which starts from 1.
+ */
+#define PORT_NUM_FROM_LINK(link) (((link) - 1) / CHANNELS_PER_PORT_NUM)
+
+#define TBT_TX_RING_FULL(prod, cons, size) ((((prod) + 1) % (size)) == (cons))
+#define TBT_TX_RING_EMPTY(prod, cons) ((prod) == (cons))
+#define TBT_RX_RING_FULL(prod, cons) ((prod) == (cons))
+#define TBT_RX_RING_EMPTY(prod, cons, size) ((((cons) + 1) % (size)) == (prod))
+
+#define PATH_FROM_PORT(num_paths, port_num) (((num_paths) - 1) - (port_num))
+
+/* Protocol Defined Field values for SW<->FW communication in raw mode */
+enum pdf_value {
+ PDF_READ_CONFIGURATION_REGISTERS = 1,
+ PDF_WRITE_CONFIGURATION_REGISTERS,
+ PDF_ERROR_NOTIFICATION,
+ PDF_ERROR_ACKNOWLEDGMENT,
+ PDF_PLUG_EVENT_NOTIFICATION,
+ PDF_INTER_DOMAIN_REQUEST,
+ PDF_INTER_DOMAIN_RESPONSE,
+ PDF_CM_OVERRIDE,
+ PDF_RESET_CIO_SWITCH,
+ PDF_FW_TO_SW_NOTIFICATION,
+ PDF_SW_TO_FW_COMMAND,
+ PDF_FW_TO_SW_RESPONSE
+};
+
+/*
+ * SW->FW commands
+ * CC = Command Code
+ */
+enum {
+ CC_GET_THUNDERBOLT_TOPOLOGY = 1,
+ CC_GET_VIDEO_RESOURCES_DATA,
+ CC_DRV_READY,
+ CC_APPROVE_PCI_CONNECTION,
+ CC_CHALLENGE_PCI_CONNECTION,
+ CC_ADD_DEVICE_AND_KEY,
+ CC_APPROVE_INTER_DOMAIN_CONNECTION = 0x10
+};
+
+/*
+ * FW->SW responses
+ * RC = response code
+ */
+enum {
+ RC_GET_TBT_TOPOLOGY = 1,
+ RC_GET_VIDEO_RESOURCES_DATA,
+ RC_DRV_READY,
+ RC_APPROVE_PCI_CONNECTION,
+ RC_CHALLENGE_PCI_CONNECTION,
+ RC_ADD_DEVICE_AND_KEY,
+ RC_INTER_DOMAIN_PKT_SENT = 8,
+ RC_APPROVE_INTER_DOMAIN_CONNECTION = 0x10
+};
+
+/*
+ * FW->SW notifications
+ * NC = notification code
+ */
+enum {
+ NC_DEVICE_CONNECTED = 3,
+ NC_DEVICE_DISCONNECTED,
+ NC_DP_DEVICE_CONNECTED_NOT_TUNNELED,
+ NC_INTER_DOMAIN_CONNECTED,
+ NC_INTER_DOMAIN_DISCONNECTED
+};
+
+/*
+ * SW -> FW mailbox commands
+ * CC = Command Code
+ */
+enum {
+ CC_STOP_CM_ACTIVITY,
+ CC_ENTER_PASS_THROUGH_MODE,
+ CC_ENTER_CM_OWNERSHIP_MODE,
+ CC_DRV_LOADED,
+ CC_DRV_UNLOADED,
+ CC_SAVE_CURRENT_CONNECTED_DEVICES,
+ CC_DISCONNECT_PCIE_PATHS,
+ CC_DRV_UNLOADS_AND_DISCONNECT_INTER_DOMAIN_PATHS,
+ DISCONNECT_PORT_A_INTER_DOMAIN_PATH = 0x10,
+ DISCONNECT_PORT_B_INTER_DOMAIN_PATH,
+ DP_TUNNEL_MODE_IN_ORDER_PER_CAPABILITIES = 0x1E,
+ DP_TUNNEL_MODE_MAXIMIZE_SNK_SRC_TUNNELS,
+ CC_SET_FW_MODE_FD1_D1_CERT = 0x20,
+ CC_SET_FW_MODE_FD1_D1_ALL,
+ CC_SET_FW_MODE_FD1_DA_CERT,
+ CC_SET_FW_MODE_FD1_DA_ALL,
+ CC_SET_FW_MODE_FDA_D1_CERT,
+ CC_SET_FW_MODE_FDA_D1_ALL,
+ CC_SET_FW_MODE_FDA_DA_CERT,
+ CC_SET_FW_MODE_FDA_DA_ALL
+};
+
+
+/* NHI genetlink attributes */
+enum {
+ NHI_ATTR_UNSPEC,
+ NHI_ATTR_DRV_VERSION,
+ NHI_ATTR_NVM_VER_OFFSET,
+ NHI_ATTR_NUM_PORTS,
+ NHI_ATTR_DMA_PORT,
+ NHI_ATTR_SUPPORT_FULL_E2E,
+ NHI_ATTR_MAILBOX_CMD,
+ NHI_ATTR_PDF,
+ NHI_ATTR_MSG_TO_ICM,
+ NHI_ATTR_MSG_FROM_ICM,
+ __NHI_ATTR_MAX,
+};
+#define NHI_ATTR_MAX (__NHI_ATTR_MAX - 1)
+
+struct port_net_dev {
+ struct net_device *net_dev;
+ struct mutex state_mutex;
+};
+
+/**
+ * struct tbt_nhi_ctxt - thunderbolt native host interface context
+ * @node: node in the controllers list.
+ * @pdev: pci device information.
+ * @iobase: address of I/O.
+ * @msix_entries: MSI-X vectors.
+ * @icm_ring_shared_mem: virtual address of iCM ring.
+ * @icm_ring_shared_mem_dma_addr: DMA addr of iCM ring.
+ * @send_sem: semaphore for sending messages to iCM
+ * one at a time.
+ * @mailbox_mutex: mutex for sending mailbox commands to
+ * iCM one at a time.
+ * @d0_exit_send_mutex: synchronizing the d0 exit with messages.
+ * @d0_exit_mailbox_mutex: synchronizing the d0 exit with mailbox.
+ * @lock: synchronizing the interrupt registers
+ * access.
+ * @icm_msgs_work: work queue for handling messages
+ * from iCM.
+ * @net_devices: net devices per port.
+ * @net_workqueue: work queue to send net messages.
+ * @id: id of the controller.
+ * @num_paths: number of paths supported by controller.
+ * @nvm_ver_offset: offset of NVM version in NVM.
+ * @num_vectors: number of MSI-X vectors.
+ * @num_ports: number of ports in the controller.
+ * @dma_port: DMA port.
+ * @d0_exit: whether the controller has exited D0 state.
+ * @nvm_auth_on_boot: whether iCM authenticates the NVM
+ * during boot.
+ * @wait_for_icm_resp: whether to wait for iCM response.
+ * @ignore_icm_resp: whether to ignore iCM response.
+ * @pci_using_dac: whether using DAC.
+ * @support_full_e2e: whether the controller supports full E2E.
+ */
+struct tbt_nhi_ctxt {
+ struct list_head node;
+ struct pci_dev *pdev;
+ void __iomem *iobase;
+ struct msix_entry *msix_entries;
+ struct tbt_icm_ring_shared_memory *icm_ring_shared_mem;
+ dma_addr_t icm_ring_shared_mem_dma_addr;
+ struct semaphore send_sem;
+ struct mutex mailbox_mutex;
+ struct mutex d0_exit_send_mutex;
+ struct mutex d0_exit_mailbox_mutex;
+ spinlock_t lock;
+ struct work_struct icm_msgs_work;
+ struct port_net_dev *net_devices;
+ struct workqueue_struct *net_workqueue;
+ u32 id;
+ u32 num_paths;
+ u16 nvm_ver_offset;
+ u8 num_vectors;
+ u8 num_ports;
+ u8 dma_port;
+ bool d0_exit;
+ bool nvm_auth_on_boot : 1;
+ bool wait_for_icm_resp : 1;
+ bool ignore_icm_resp : 1;
+ bool pci_using_dac : 1;
+ bool support_full_e2e : 1;
+};
+
+int nhi_send_message(struct tbt_nhi_ctxt *nhi_ctxt, enum pdf_value pdf,
+ u32 msg_len, const void *msg, bool ignore_icm_resp);
+int nhi_mailbox(struct tbt_nhi_ctxt *nhi_ctxt, u32 cmd, u32 data, bool deinit);
+
+#endif
--
2.7.4
^ permalink raw reply related
* [PATCH v9 2/8] thunderbolt: Updating the register definitions
From: Amir Levy @ 2016-11-09 14:20 UTC (permalink / raw)
To: gregkh
Cc: andreas.noever, bhelgaas, corbet, linux-kernel, linux-pci, netdev,
linux-doc, mario_limonciello, thunderbolt-linux, mika.westerberg,
tomas.winkler, xiong.y.zhang, Amir Levy
In-Reply-To: <1478701208-4585-1-git-send-email-amir.jer.levy@intel.com>
Add more Thunderbolt(TM) register definitions
and some helper macros.
Signed-off-by: Amir Levy <amir.jer.levy@intel.com>
---
drivers/thunderbolt/nhi_regs.h | 109 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 109 insertions(+)
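Note for reviewers (not part of the commit): below is a minimal userspace
sketch of how the new field macros are meant to be combined when decoding
the ring consumer/producer register and a descriptor's attributes word.
The BIT()/GENMASK() definitions are 32-bit stand-ins for the kernel ones,
and struct decoded_desc, decode_attributes() and ring_next() are
hypothetical names used only for illustration; the REG_RING_* and
DESC_ATTR_* values are copied from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins (assumption): 32-bit versions of BIT()/GENMASK(). */
#define BIT(n)		(UINT32_C(1) << (n))
#define GENMASK(h, l)	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Copied from nhi_regs.h in this patch. */
#define REG_RING_CONS_SHIFT	0
#define REG_RING_CONS_MASK	GENMASK(15, REG_RING_CONS_SHIFT)
#define REG_RING_PROD_SHIFT	16
#define REG_RING_PROD_MASK	GENMASK(31, REG_RING_PROD_SHIFT)

#define TBT_REG_RING_PROD_EXTRACT(val) (((val) & REG_RING_PROD_MASK) >> \
					REG_RING_PROD_SHIFT)
#define TBT_REG_RING_CONS_EXTRACT(val) (((val) & REG_RING_CONS_MASK) >> \
					REG_RING_CONS_SHIFT)

#define DESC_ATTR_LEN_SHIFT	0
#define DESC_ATTR_LEN_MASK	GENMASK(11, DESC_ATTR_LEN_SHIFT)
#define DESC_ATTR_EOF_SHIFT	12
#define DESC_ATTR_EOF_MASK	GENMASK(15, DESC_ATTR_EOF_SHIFT)
#define DESC_ATTR_DESC_DONE	BIT(21)

/* Hypothetical helper type: the fields a receive path cares about. */
struct decoded_desc {
	uint32_t len;	/* frame length in bytes */
	uint32_t pdf;	/* EOF field, carries the PDF in raw mode */
	int done;	/* non-zero once the hardware has set DD */
};

/* Decode a CPU-endian attributes word (after le32_to_cpu in the kernel). */
static struct decoded_desc decode_attributes(uint32_t attr)
{
	struct decoded_desc d;

	d.len = (attr & DESC_ATTR_LEN_MASK) >> DESC_ATTR_LEN_SHIFT;
	d.pdf = (attr & DESC_ATTR_EOF_MASK) >> DESC_ATTR_EOF_SHIFT;
	d.done = (attr & DESC_ATTR_DESC_DONE) != 0;
	return d;
}

/* Advance a ring index by one slot, wrapping at the ring size. */
static uint32_t ring_next(uint32_t idx, uint32_t size)
{
	assert(size > 0);
	return (idx + 1) % size;
}
```

In the driver the same extraction would be applied to the value read with
ioread32() from TBT_RING_CONS_PROD_REG(); the sketch keeps plain integers
so it can be compiled outside the kernel tree.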
diff --git a/drivers/thunderbolt/nhi_regs.h b/drivers/thunderbolt/nhi_regs.h
index 75cf069..b8e961f 100644
--- a/drivers/thunderbolt/nhi_regs.h
+++ b/drivers/thunderbolt/nhi_regs.h
@@ -9,6 +9,11 @@
#include <linux/types.h>
+#define NHI_MMIO_BAR 0
+
+#define TBT_RING_MIN_NUM_BUFFERS 2
+#define TBT_RING_MAX_FRAME_SIZE (4 * 1024)
+
enum ring_flags {
RING_FLAG_ISOCH_ENABLE = 1 << 27, /* TX only? */
RING_FLAG_E2E_FLOW_CONTROL = 1 << 28,
@@ -39,6 +44,33 @@ struct ring_desc {
u32 time; /* write zero */
} __packed;
+/**
+ * struct tbt_buf_desc - TX/RX ring buffer descriptor.
+ * This is the same as struct ring_desc, but without the use of bitfields and
+ * with explicit endianness.
+ */
+struct tbt_buf_desc {
+ __le64 phys;
+ __le32 attributes;
+ __le32 time;
+};
+
+#define DESC_ATTR_LEN_SHIFT 0
+#define DESC_ATTR_LEN_MASK GENMASK(11, DESC_ATTR_LEN_SHIFT)
+#define DESC_ATTR_EOF_SHIFT 12
+#define DESC_ATTR_EOF_MASK GENMASK(15, DESC_ATTR_EOF_SHIFT)
+#define DESC_ATTR_SOF_SHIFT 16
+#define DESC_ATTR_SOF_MASK GENMASK(19, DESC_ATTR_SOF_SHIFT)
+#define DESC_ATTR_TX_ISOCH_DMA_EN BIT(20) /* TX */
+#define DESC_ATTR_RX_CRC_ERR BIT(20) /* RX after use */
+#define DESC_ATTR_DESC_DONE BIT(21)
+#define DESC_ATTR_REQ_STS BIT(22) /* TX and RX before use */
+#define DESC_ATTR_RX_BUF_OVRN_ERR BIT(22) /* RX after use */
+#define DESC_ATTR_INT_EN BIT(23)
+#define DESC_ATTR_OFFSET_SHIFT 24
+#define DESC_ATTR_OFFSET_MASK GENMASK(31, DESC_ATTR_OFFSET_SHIFT)
+
+
/* NHI registers in bar 0 */
/*
@@ -60,6 +92,30 @@ struct ring_desc {
*/
#define REG_RX_RING_BASE 0x08000
+#define REG_RING_STEP 16
+#define REG_RING_PHYS_LO_OFFSET 0
+#define REG_RING_PHYS_HI_OFFSET 4
+#define REG_RING_CONS_PROD_OFFSET 8 /* cons - RO, prod - RW */
+#define REG_RING_CONS_SHIFT 0
+#define REG_RING_CONS_MASK GENMASK(15, REG_RING_CONS_SHIFT)
+#define REG_RING_PROD_SHIFT 16
+#define REG_RING_PROD_MASK GENMASK(31, REG_RING_PROD_SHIFT)
+#define REG_RING_SIZE_OFFSET 12
+#define REG_RING_SIZE_SHIFT 0
+#define REG_RING_SIZE_MASK GENMASK(15, REG_RING_SIZE_SHIFT)
+#define REG_RING_BUF_SIZE_SHIFT 16
+#define REG_RING_BUF_SIZE_MASK GENMASK(27, REG_RING_BUF_SIZE_SHIFT)
+
+#define TBT_RING_CONS_PROD_REG(iobase, ringbase, ringnumber) \
+ ((iobase) + (ringbase) + \
+ ((ringnumber) * REG_RING_STEP) + \
+ REG_RING_CONS_PROD_OFFSET)
+
+#define TBT_REG_RING_PROD_EXTRACT(val) (((val) & REG_RING_PROD_MASK) >> \
+ REG_RING_PROD_SHIFT)
+
+#define TBT_REG_RING_CONS_EXTRACT(val) (((val) & REG_RING_CONS_MASK) >> \
+ REG_RING_CONS_SHIFT)
/*
* 32 bytes per entry, one entry for every hop (REG_HOP_COUNT)
* 00: enum_ring_flags
@@ -77,6 +133,19 @@ struct ring_desc {
* ..: unknown
*/
#define REG_RX_OPTIONS_BASE 0x29800
+#define REG_RX_OPTS_TX_E2E_HOP_ID_SHIFT 12
+#define REG_RX_OPTS_TX_E2E_HOP_ID_MASK \
+ GENMASK(22, REG_RX_OPTS_TX_E2E_HOP_ID_SHIFT)
+#define REG_RX_OPTS_MASK_OFFSET 4
+#define REG_RX_OPTS_MASK_EOF_SHIFT 0
+#define REG_RX_OPTS_MASK_EOF_MASK GENMASK(15, REG_RX_OPTS_MASK_EOF_SHIFT)
+#define REG_RX_OPTS_MASK_SOF_SHIFT 16
+#define REG_RX_OPTS_MASK_SOF_MASK GENMASK(31, REG_RX_OPTS_MASK_SOF_SHIFT)
+
+#define REG_OPTS_STEP 32
+#define REG_OPTS_E2E_EN BIT(28)
+#define REG_OPTS_RAW BIT(30)
+#define REG_OPTS_VALID BIT(31)
/*
* three bitfields: tx, rx, rx overflow
@@ -86,6 +155,7 @@ struct ring_desc {
*/
#define REG_RING_NOTIFY_BASE 0x37800
#define RING_NOTIFY_REG_COUNT(nhi) ((31 + 3 * nhi->hop_count) / 32)
+#define REG_RING_NOTIFY_STEP 4
/*
* two bitfields: rx, tx
@@ -94,8 +164,47 @@ struct ring_desc {
*/
#define REG_RING_INTERRUPT_BASE 0x38200
#define RING_INTERRUPT_REG_COUNT(nhi) ((31 + 2 * nhi->hop_count) / 32)
+#define REG_RING_INT_TX_PROCESSED(ring_num) BIT(ring_num)
+#define REG_RING_INT_RX_PROCESSED(ring_num, num_paths) BIT((ring_num) + \
+ (num_paths))
+#define RING_INT_DISABLE(base, val) iowrite32( \
+ ioread32((base) + REG_RING_INTERRUPT_BASE) & ~(val), \
+ (base) + REG_RING_INTERRUPT_BASE)
+#define RING_INT_ENABLE(base, val) iowrite32( \
+ ioread32((base) + REG_RING_INTERRUPT_BASE) | (val), \
+ (base) + REG_RING_INTERRUPT_BASE)
+#define RING_INT_DISABLE_TX(base, ring_num) \
+ RING_INT_DISABLE(base, REG_RING_INT_TX_PROCESSED(ring_num))
+#define RING_INT_DISABLE_RX(base, ring_num, num_paths) \
+ RING_INT_DISABLE(base, REG_RING_INT_RX_PROCESSED(ring_num, num_paths))
+#define RING_INT_ENABLE_TX(base, ring_num) \
+ RING_INT_ENABLE(base, REG_RING_INT_TX_PROCESSED(ring_num))
+#define RING_INT_ENABLE_RX(base, ring_num, num_paths) \
+ RING_INT_ENABLE(base, REG_RING_INT_RX_PROCESSED(ring_num, num_paths))
+#define RING_INT_DISABLE_TX_RX(base, ring_num, num_paths) \
+ RING_INT_DISABLE(base, REG_RING_INT_TX_PROCESSED(ring_num) | \
+ REG_RING_INT_RX_PROCESSED(ring_num, num_paths))
+
+#define REG_RING_INTERRUPT_STEP 4
+
+#define REG_INT_THROTTLING_RATE 0x38c00
+#define REG_INT_THROTTLING_RATE_STEP 4
+#define NUM_INT_VECTORS 16
+
+#define REG_INT_VEC_ALLOC_BASE 0x38c40
+#define REG_INT_VEC_ALLOC_STEP 4
+#define REG_INT_VEC_ALLOC_FIELD_BITS 4
+#define REG_INT_VEC_ALLOC_FIELD_MASK (BIT(REG_INT_VEC_ALLOC_FIELD_BITS) - 1)
+#define REG_INT_VEC_ALLOC_PER_REG ((BITS_PER_BYTE * sizeof(u32)) / \
+ REG_INT_VEC_ALLOC_FIELD_BITS)
/* The last 11 bits contain the number of hops supported by the NHI port. */
#define REG_HOP_COUNT 0x39640
+#define REG_HOP_COUNT_TOTAL_PATHS_MASK GENMASK(10, 0)
+
+#define REG_HOST_INTERFACE_RST 0x39858
+
+#define REG_DMA_MISC 0x39864
+#define REG_DMA_MISC_INT_AUTO_CLEAR BIT(2)
#endif
--
2.7.4
^ permalink raw reply related
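As a rough illustration of how the ring register helpers in the nhi_regs.h patch above fit together, here is a minimal userspace sketch. It copies the REG_RING_* values and REG_RX_RING_BASE from the patch, but GENMASK32() is only an assumed 32-bit stand-in for the kernel's GENMASK(), and the three inline functions are hypothetical equivalents of TBT_RING_CONS_PROD_REG (without the iobase pointer), TBT_REG_RING_PROD_EXTRACT and TBT_REG_RING_CONS_EXTRACT. It is not the driver's code.

```c
#include <stdint.h>

/*
 * Assumption: userspace stand-in for the kernel's GENMASK(), 32 bits only.
 * Valid for 0 <= l <= h <= 31.
 */
#define GENMASK32(h, l) ((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Values copied from the patch. */
#define REG_RX_RING_BASE		0x08000
#define REG_RING_STEP			16
#define REG_RING_CONS_PROD_OFFSET	8
#define REG_RING_CONS_SHIFT		0
#define REG_RING_CONS_MASK		GENMASK32(15, REG_RING_CONS_SHIFT)
#define REG_RING_PROD_SHIFT		16
#define REG_RING_PROD_MASK		GENMASK32(31, REG_RING_PROD_SHIFT)

/* Hypothetical helper: byte offset of a ring's cons/prod register. */
static inline uint32_t ring_cons_prod_offset(uint32_t ringbase,
					     uint32_t ringnumber)
{
	return ringbase + ringnumber * REG_RING_STEP +
	       REG_RING_CONS_PROD_OFFSET;
}

/* Producer index lives in the upper 16 bits of the register value. */
static inline uint32_t ring_prod_extract(uint32_t val)
{
	return (val & REG_RING_PROD_MASK) >> REG_RING_PROD_SHIFT;
}

/* Consumer index lives in the lower 16 bits of the register value. */
static inline uint32_t ring_cons_extract(uint32_t val)
{
	return (val & REG_RING_CONS_MASK) >> REG_RING_CONS_SHIFT;
}
```

With these definitions, RX ring 3 has its cons/prod register at offset 0x8038 from the start of BAR 0, and a register value of 0x00120034 decodes to a producer index of 0x12 and a consumer index of 0x34.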
* Re: [Qemu-devel] [PATCH V10] fsdev: add IO throttle support to fsdev devices
From: Pradeep Jagadeesh @ 2016-11-09 12:18 UTC (permalink / raw)
To: Alberto Garcia, Pradeep Jagadeesh, Aneesh Kumar K.V, Greg Kurz
Cc: qemu-devel, Claudio Fontana
In-Reply-To: <w51lgwt0x59.fsf@maestria.local.igalia.com>
On 11/9/2016 11:23 AM, Alberto Garcia wrote:
> On Wed 09 Nov 2016 10:50:40 AM CET, Pradeep Jagadeesh wrote:
>
>> Uses throttling APIs to limit I/O bandwidth and number of operations
>> on the devices which use 9p-local driver.
>>
>> Signed-off-by: Pradeep Jagadeesh <pradeep.jagadeesh@huawei.com>
>
> It looks good now, thanks!
>
>> +void fsdev_throttle_parse_opts(QemuOpts *opts, FsThrottle *fst, Error **err)
>> +{
> [...]
>> + throttle_is_valid(&fst->cfg, err);
>> +}
>
> Following the QEMU conventions, I would still rename 'err' to 'errp' in
> this function (since it's an Error **).
>
> Otherwise,
Thanks, I will change it.
-Pradeep
> Reviewed-by: Alberto Garcia <berto@igalia.com>
>
> Berto
>
^ permalink raw reply
* [linux-lvm] pvmove launched on inactive vg
From: Lorenzo Dalrio @ 2016-11-09 12:16 UTC (permalink / raw)
To: linux-lvm
Hi,
We have a 2-node cluster with some ha-lvm resources on it. Storage
folks asked us to migrate the lun where those vgs are, and provided
us with new luns of the same size.
We followed the standard procedure:
pvcreate /dev/mapper/new_lun
vgextend vg /dev/mapper/new_lun
pvmove /dev/mapper/old_lun /dev/mapper/new_lun
Here is the problem: some vgs were active on one node and the others
were active on the other node. We ran pvmove for all the vgs from the
first node. It completed without problems for the vgs active on that
node, but left the others in an intermediate state where pvmove detects
the move in progress but doesn't seem to proceed.
Here is the output of lvs -av:
Found same device /dev/mapper/vPLONE07 with same pvid WH6unuymepW89nwmSLqweMPTUoeypLlW
Found same device /dev/mapper/vPLONE06 with same pvid mnK6EtFiLH2G5yMLSZ0179zKGdERxQI1
Found same device /dev/mapper/vPLONE07 with same pvid WH6unuymepW89nwmSLqweMPTUoeypLlW
Found same device /dev/mapper/vd_PLONE_05 with same pvid gsfXREbuaXiTj96U72Mia9UPJOGwuvxo
Found same device /dev/mapper/vPLONE05 with same pvid QXEsmAhzG2iFa05LfFE5kl40H8UoV6UG
Found same device /dev/mapper/vPLONE06 with same pvid mnK6EtFiLH2G5yMLSZ0179zKGdERxQI1
Found same device /dev/mapper/vd_PLONE_04 with same pvid Jkc3DYEbNAkQwvfH8HBar0BfMy56p68U
Found same device /dev/mapper/vPLONE04 with same pvid C0bLL0ka1j0a8msqs6h9RKOgxWfWAVtj
Found same device /dev/mapper/vPLONE05 with same pvid QXEsmAhzG2iFa05LfFE5kl40H8UoV6UG
Found same device /dev/mapper/vd_PLONE_05 with same pvid gsfXREbuaXiTj96U72Mia9UPJOGwuvxo
Found same device /dev/mapper/vd_PLONE_03 with same pvid cMoQ7J4qkihBUwWSPB13Wd7QkcXQlN7J
Found same device /dev/mapper/vPLONE03 with same pvid spC75FaZhB5l5cx1CoMtB8MO0gkSqb2r
Found same device /dev/mapper/vd_PLONE_04 with same pvid Jkc3DYEbNAkQwvfH8HBar0BfMy56p68U
Found same device /dev/mapper/vPLONE04 with same pvid C0bLL0ka1j0a8msqs6h9RKOgxWfWAVtj
Found same device /dev/mapper/vd_PLONE_02 with same pvid odB8kFMhlcP8UdPc7aLBsUPqM4eKRZ2H
Found same device /dev/mapper/vPLONE02 with same pvid bzttEe6Z0YGLfZzdvvDFUozk7MsM1mmn
Found same device /dev/mapper/vPLONE03 with same pvid spC75FaZhB5l5cx1CoMtB8MO0gkSqb2r
Found same device /dev/mapper/vd_PLONE_03 with same pvid cMoQ7J4qkihBUwWSPB13Wd7QkcXQlN7J
Found same device /dev/mapper/vd_PLONE_01 with same pvid LQ5X7UYOAW3ILCBH5Yv7np32zQrxd1cd
Found same device /dev/mapper/vPLONE01 with same pvid 2V2lLklWhZDSh1hdYzPsfqs9kEp9pVfR
Found same device /dev/mapper/vd_PLONE_02 with same pvid odB8kFMhlcP8UdPc7aLBsUPqM4eKRZ2H
Found same device /dev/mapper/vPLONE02 with same pvid bzttEe6Z0YGLfZzdvvDFUozk7MsM1mmn
Found same device /dev/sda2 with same pvid JYCsWcY7TjLh0UbDjjnwm2kU70itCLia
Found same device /dev/mapper/vd_PLONE_01 with same pvid LQ5X7UYOAW3ILCBH5Yv7np32zQrxd1cd
Found same device /dev/mapper/vPLONE01 with same pvid 2V2lLklWhZDSh1hdYzPsfqs9kEp9pVfR
LV VG #Seg Attr LSize Maj Min KMaj KMin Pool Origin Data% Meta% Move Cpy%Sync Log Convert LV UUID LProfile
root centos 1 -wi-ao---- 40.54g -1 -1 253 0 BCGuH9-3669-xwV1-o541-XfyY-m2hI-Ulde9F
swap centos 1 -wi-ao---- 4.56g -1 -1 253 1 0MKiBs-XuBd-3yUw-eXpr-7Ofj-Cva8-o0c6JY
lv_assemblea vg_assemblea 1 -wi-ao---- 50.00g -1 -1 253 10 NxHrKd-RbGz-bnXW-7LQE-A4kS-2mIB-JZU2hk
lv_bur vg_bur 1 -wI------- 40.00g -1 -1 -1 -1 Mg67Sx-ke62-n3Q9-CWqO-e7Zo-R9pc-BQMhRE
[pvmove0] vg_bur 1 p-C---m--- 40.00g -1 -1 -1 -1 /dev/mapper/vd_PLONE_01 ILbHqP-IOlz-e4e9-eThD-QwGd-D7Z4-KOzsmL
lv_ermes vg_ermes 1 -wI------- 40.00g -1 -1 -1 -1 DZTKAk-X047-eOai-rpDp-Uciw-CxUr-uCgOZC
[pvmove0] vg_ermes 1 p-C---m--- 40.00g -1 -1 -1 -1 /dev/mapper/vd_PLONE_03 NOV7rE-rWbI-msda-53LT-d9tk-cutV-K2ok6s
lv_geoportale vg_geoportale 1 -wi-ao---- 10.00g -1 -1 253 9 I1YWt5-sHZu-fEl6-qeW2-JDlw-NHWB-jAHp0m
lv_groupware vg_groupware 1 -wI------- 10.00g -1 -1 -1 -1 a10Lpj-2Ni7-RzNA-Nf8p-3LnT-4mYh-spTIyj
[pvmove0] vg_groupware 1 p-C---m--- 10.00g -1 -1 -1 -1 /dev/mapper/vd_PLONE_05 0MVtil-lIF2-p4lu-2ZhZ-FYb8-ojdq-PmygpF
lv_internos vg_internos 1 -wI------- 40.00g -1 -1 -1 -1 cZEwfL-kZ4G-vKLy-92aX-KKOW-IFkj-Ne0VdR
[pvmove0] vg_internos 1 p-C---m--- 40.00g -1 -1 -1 -1 /dev/mapper/vd_PLONE_02 Yh9UrO-BF42-ETkw-mFxr-VDHE-s6He-dLfX1l
lv_portali vg_portali 1 -wI------- 300.00g -1 -1 -1 -1 IvfQsy-8XAF-qT7D-X52j-cR11-egyL-3WHlMf
[pvmove0] vg_portali 1 p-C---m--- 300.00g -1 -1 -1 -1 /dev/mapper/vd_PLONE_04 cGkqJx-0R7h-at97-E8yt-BCwA-ljCm-p8rcRj
Any advice on how to proceed?
Thank you,
--
Lorenzo Dalrio
^ permalink raw reply
* [PATCH v2] leds: Add mutex protection in brightness_show()
From: Jacek Anaszewski @ 2016-11-09 12:15 UTC (permalink / raw)
To: linux-leds
Cc: linux-kernel, Jacek Anaszewski, Hans de Goede, Sakari Ailus,
Pavel Machek, Andrew Lunn
Initially the claim that no lock is needed in brightness_show()
was valid, as the function just returned the unchanged
LED brightness.
After the addition of led_update_brightness() this is no longer
true, as the function can change the brightness if an LED class
driver implements the brightness_get op. This can lead to races between
led_update_brightness() and led_set_brightness().
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
---
Changes since v1:
- added led_sysfs_is_disabled() check
- moved sprintf under mutex protection
drivers/leds/led-class.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c
index b12f861..e472407 100644
--- a/drivers/leds/led-class.c
+++ b/drivers/leds/led-class.c
@@ -29,11 +29,23 @@ static ssize_t brightness_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct led_classdev *led_cdev = dev_get_drvdata(dev);
+ int ret;
- /* no lock needed for this */
- led_update_brightness(led_cdev);
+ mutex_lock(&led_cdev->led_access);
+
+ if (led_sysfs_is_disabled(led_cdev)) {
+ ret = -EBUSY;
+ goto unlock;
+ }
+
+ ret = led_update_brightness(led_cdev);
+ if (ret < 0)
+ goto unlock;
- return sprintf(buf, "%u\n", led_cdev->brightness);
+ ret = sprintf(buf, "%u\n", led_cdev->brightness);
+unlock:
+ mutex_unlock(&led_cdev->led_access);
+ return ret;
}
static ssize_t brightness_store(struct device *dev,
--
1.9.1
^ permalink raw reply related
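The locking pattern in the brightness_show() patch above (take the lock, bail out through a single unlock label on any error, and format the value while still holding the lock) can be sketched in userspace as follows. This is only an analogy under stated assumptions: struct fake_led, fake_update_brightness() and fake_brightness_show() are hypothetical names, a pthread mutex stands in for led_cdev->led_access, and a plain flag stands in for led_sysfs_is_disabled().

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct led_classdev. */
struct fake_led {
	pthread_mutex_t access;		/* plays the role of led_access */
	bool sysfs_disabled;		/* plays the role of led_sysfs_is_disabled() */
	unsigned int brightness;	/* cached brightness value */
	int (*brightness_get)(struct fake_led *led);	/* optional, may be NULL */
};

/* Refresh the cached brightness; returns 0 or a negative errno value. */
static int fake_update_brightness(struct fake_led *led)
{
	int ret;

	if (!led->brightness_get)
		return 0;

	ret = led->brightness_get(led);
	if (ret < 0)
		return ret;

	led->brightness = (unsigned int)ret;
	return 0;
}

/* Returns the number of characters written, or a negative errno value. */
static int fake_brightness_show(struct fake_led *led, char *buf, size_t len)
{
	int ret;

	pthread_mutex_lock(&led->access);

	if (led->sysfs_disabled) {
		ret = -EBUSY;
		goto unlock;
	}

	ret = fake_update_brightness(led);
	if (ret < 0)
		goto unlock;

	/* Format while still holding the lock, as v2 of the patch does. */
	ret = snprintf(buf, len, "%u\n", led->brightness);
unlock:
	pthread_mutex_unlock(&led->access);
	return ret;
}
```

Keeping the formatting step inside the critical section means the value written to the buffer is the one that was just refreshed, rather than one a concurrent writer may have changed in between.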
* [PATCH v3 2/2] ARM: EXYNOS: Remove unused soc_is_exynos{4,5}
From: Pankaj Dubey @ 2016-11-09 12:15 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1478693755-11953-1-git-send-email-pankaj.dubey@samsung.com>
As there are no more users of soc_is_exynos{4,5}, we can safely remove them.
Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
---
arch/arm/mach-exynos/common.h | 5 -----
1 file changed, 5 deletions(-)
diff --git a/arch/arm/mach-exynos/common.h b/arch/arm/mach-exynos/common.h
index dd5d8e8..fb12d11 100644
--- a/arch/arm/mach-exynos/common.h
+++ b/arch/arm/mach-exynos/common.h
@@ -105,11 +105,6 @@ IS_SAMSUNG_CPU(exynos5800, EXYNOS5800_SOC_ID, EXYNOS5_SOC_MASK)
# define soc_is_exynos5800() 0
#endif
-#define soc_is_exynos4() (soc_is_exynos4210() || soc_is_exynos4212() || \
- soc_is_exynos4412())
-#define soc_is_exynos5() (soc_is_exynos5250() || soc_is_exynos5410() || \
- soc_is_exynos5420() || soc_is_exynos5800())
-
extern u32 cp15_save_diag;
extern u32 cp15_save_power;
--
2.7.4
^ permalink raw reply related
* [PATCH v3 1/2] ARM: EXYNOS: Remove static mapping of SCU SFR
From: Pankaj Dubey @ 2016-11-09 12:15 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1478693755-11953-1-git-send-email-pankaj.dubey@samsung.com>
Let's remove the static mapping of the SCU SFR, which is mainly used on
Cortex-A9 based SoC boards. Instead, use the mapping from the SCU device
tree node.
NOTE: This patch depends on the DT file of any such Cortex-A9 based
board. In the absence of an SCU device node in the DTS file, only a single
CPU will boot. So if you are using an out-of-tree DTS file for a Cortex-A9
based Exynos SoC, make sure to add an SCU device node to it for SMP boot.
Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
---
arch/arm/mach-exynos/common.h | 1 +
arch/arm/mach-exynos/exynos.c | 22 ------------------
arch/arm/mach-exynos/include/mach/map.h | 2 --
arch/arm/mach-exynos/platsmp.c | 34 +++++++++++++++++++++-------
arch/arm/mach-exynos/pm.c | 4 +---
arch/arm/mach-exynos/suspend.c | 4 +---
arch/arm/plat-samsung/include/plat/map-s5p.h | 4 ----
7 files changed, 29 insertions(+), 42 deletions(-)
diff --git a/arch/arm/mach-exynos/common.h b/arch/arm/mach-exynos/common.h
index 9424a8a..dd5d8e8 100644
--- a/arch/arm/mach-exynos/common.h
+++ b/arch/arm/mach-exynos/common.h
@@ -161,6 +161,7 @@ extern void exynos_cpu_restore_register(void);
extern void exynos_pm_central_suspend(void);
extern int exynos_pm_central_resume(void);
extern void exynos_enter_aftr(void);
+extern int exynos_scu_enable(void);
extern struct cpuidle_exynos_data cpuidle_coupled_exynos_data;
diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c
index 757fc11..fa08ef9 100644
--- a/arch/arm/mach-exynos/exynos.c
+++ b/arch/arm/mach-exynos/exynos.c
@@ -28,15 +28,6 @@
#include "common.h"
-static struct map_desc exynos4_iodesc[] __initdata = {
- {
- .virtual = (unsigned long)S5P_VA_COREPERI_BASE,
- .pfn = __phys_to_pfn(EXYNOS4_PA_COREPERI),
- .length = SZ_8K,
- .type = MT_DEVICE,
- },
-};
-
static struct platform_device exynos_cpuidle = {
.name = "exynos_cpuidle",
#ifdef CONFIG_ARM_EXYNOS_CPUIDLE
@@ -99,17 +90,6 @@ static int __init exynos_fdt_map_chipid(unsigned long node, const char *uname,
return 1;
}
-/*
- * exynos_map_io
- *
- * register the standard cpu IO areas
- */
-static void __init exynos_map_io(void)
-{
- if (soc_is_exynos4())
- iotable_init(exynos4_iodesc, ARRAY_SIZE(exynos4_iodesc));
-}
-
static void __init exynos_init_io(void)
{
debug_ll_io_init();
@@ -118,8 +98,6 @@ static void __init exynos_init_io(void)
/* detect cpu id and rev. */
s5p_init_cpu(S5P_VA_CHIPID);
-
- exynos_map_io();
}
/*
diff --git a/arch/arm/mach-exynos/include/mach/map.h b/arch/arm/mach-exynos/include/mach/map.h
index 5fb0040..0eef407 100644
--- a/arch/arm/mach-exynos/include/mach/map.h
+++ b/arch/arm/mach-exynos/include/mach/map.h
@@ -18,6 +18,4 @@
#define EXYNOS_PA_CHIPID 0x10000000
-#define EXYNOS4_PA_COREPERI 0x10500000
-
#endif /* __ASM_ARCH_MAP_H */
diff --git a/arch/arm/mach-exynos/platsmp.c b/arch/arm/mach-exynos/platsmp.c
index a5d6841..94405c7 100644
--- a/arch/arm/mach-exynos/platsmp.c
+++ b/arch/arm/mach-exynos/platsmp.c
@@ -168,6 +168,27 @@ int exynos_cluster_power_state(int cluster)
S5P_CORE_LOCAL_PWR_EN);
}
+/**
+ * exynos_scu_enable : enables SCU for Cortex-A9 based system
+ * returns 0 on success else non-zero error code
+ */
+int exynos_scu_enable(void)
+{
+ struct device_node *np;
+ void __iomem *scu_base;
+
+ np = of_find_compatible_node(NULL, NULL, "arm,cortex-a9-scu");
+ scu_base = of_iomap(np, 0);
+ of_node_put(np);
+ if (!scu_base) {
+ pr_err("%s failed to map scu_base\n", __func__);
+ return -ENOMEM;
+ }
+ scu_enable(scu_base);
+ iounmap(scu_base);
+ return 0;
+}
+
static void __iomem *cpu_boot_reg_base(void)
{
if (soc_is_exynos4210() && samsung_rev() == EXYNOS4210_REV_1_1)
@@ -224,11 +245,6 @@ static void write_pen_release(int val)
sync_cache_w(&pen_release);
}
-static void __iomem *scu_base_addr(void)
-{
- return (void __iomem *)(S5P_VA_SCU);
-}
-
static DEFINE_SPINLOCK(boot_lock);
static void exynos_secondary_init(unsigned int cpu)
@@ -393,9 +409,11 @@ static void __init exynos_smp_prepare_cpus(unsigned int max_cpus)
exynos_set_delayed_reset_assertion(true);
- if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A9)
- scu_enable(scu_base_addr());
-
+ if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A9) {
+ /* if exynos_scu_enable fails, return */
+ if (exynos_scu_enable())
+ return;
+ }
/*
* Write the address of secondary startup into the
* system-wide flags register. The boot monitor waits
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c
index 487295f..c0b46c3 100644
--- a/arch/arm/mach-exynos/pm.c
+++ b/arch/arm/mach-exynos/pm.c
@@ -26,8 +26,6 @@
#include <asm/suspend.h>
#include <asm/cacheflush.h>
-#include <mach/map.h>
-
#include "common.h"
static inline void __iomem *exynos_boot_vector_addr(void)
@@ -177,7 +175,7 @@ void exynos_enter_aftr(void)
cpu_suspend(0, exynos_aftr_finisher);
if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A9) {
- scu_enable(S5P_VA_SCU);
+ exynos_scu_enable();
if (call_firmware_op(resume) == -ENOSYS)
exynos_cpu_restore_register();
}
diff --git a/arch/arm/mach-exynos/suspend.c b/arch/arm/mach-exynos/suspend.c
index 06332f6..73df9f3 100644
--- a/arch/arm/mach-exynos/suspend.c
+++ b/arch/arm/mach-exynos/suspend.c
@@ -34,8 +34,6 @@
#include <asm/smp_scu.h>
#include <asm/suspend.h>
-#include <mach/map.h>
-
#include <plat/pm-common.h>
#include "common.h"
@@ -462,7 +460,7 @@ static void exynos_pm_resume(void)
exynos_pm_release_retention();
if (cpuid == ARM_CPU_PART_CORTEX_A9)
- scu_enable(S5P_VA_SCU);
+ exynos_scu_enable();
if (call_firmware_op(resume) == -ENOSYS
&& cpuid == ARM_CPU_PART_CORTEX_A9)
diff --git a/arch/arm/plat-samsung/include/plat/map-s5p.h b/arch/arm/plat-samsung/include/plat/map-s5p.h
index 0fe2828..512ed1f 100644
--- a/arch/arm/plat-samsung/include/plat/map-s5p.h
+++ b/arch/arm/plat-samsung/include/plat/map-s5p.h
@@ -15,10 +15,6 @@
#define S5P_VA_CHIPID S3C_ADDR(0x02000000)
-#define S5P_VA_COREPERI_BASE S3C_ADDR(0x02800000)
-#define S5P_VA_COREPERI(x) (S5P_VA_COREPERI_BASE + (x))
-#define S5P_VA_SCU S5P_VA_COREPERI(0x0)
-
#define VA_VIC(x) (S3C_VA_IRQ + ((x) * 0x10000))
#define VA_VIC0 VA_VIC(0)
#define VA_VIC1 VA_VIC(1)
--
2.7.4
^ permalink raw reply related