* [PATCH 1/5] nvmet: Reorder fields in 'struct nvmet_sq'
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
@ 2023-05-01 12:40 ` Christophe JAILLET
2023-05-01 12:40 ` [PATCH 2/5] nvmet: Reorder fields in 'struct nvme_ctrl' Christophe JAILLET
` (7 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christophe JAILLET @ 2023-05-01 12:40 UTC (permalink / raw)
To: hch, sagi, kch
Cc: linux-nvme, linux-kernel, kernel-janitors, Christophe JAILLET
Group some variables based on their sizes to reduce holes.
On x86_64, this shrinks the size of 'struct nvmet_sq' from 472 to 464
bytes when CONFIG_NVME_TARGET_AUTH is defined.
This structure is embedded in some other structures, so shrinking it helps
reduce their sizes as well.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Using pahole
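The layouts below can be reproduced with an invocation along these lines
(the object path is an assumption; any in-tree object built with
CONFIG_DEBUG_INFO works):

```shell
# Print the layout, holes and padding of one structure from a kernel
# object that carries debug info:
pahole -C nvmet_sq drivers/nvme/target/nvmet.o
```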
Before:
======
struct nvmet_sq {
struct nvmet_ctrl * ctrl; /* 0 8 */
struct percpu_ref ref; /* 8 16 */
u16 qid; /* 24 2 */
u16 size; /* 26 2 */
u32 sqhd; /* 28 4 */
bool sqhd_disabled; /* 32 1 */
/* XXX 7 bytes hole, try to pack */
struct delayed_work auth_expired_work; /* 40 184 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
bool authenticated; /* 224 1 */
/* XXX 1 byte hole, try to pack */
u16 dhchap_tid; /* 226 2 */
u16 dhchap_status; /* 228 2 */
/* XXX 2 bytes hole, try to pack */
int dhchap_step; /* 232 4 */
/* XXX 4 bytes hole, try to pack */
u8 * dhchap_c1; /* 240 8 */
u8 * dhchap_c2; /* 248 8 */
/* --- cacheline 4 boundary (256 bytes) --- */
u32 dhchap_s1; /* 256 4 */
u32 dhchap_s2; /* 260 4 */
u8 * dhchap_skey; /* 264 8 */
int dhchap_skey_len; /* 272 4 */
/* XXX 4 bytes hole, try to pack */
struct completion free_done; /* 280 96 */
/* --- cacheline 5 boundary (320 bytes) was 56 bytes ago --- */
struct completion confirm_done; /* 376 96 */
/* size: 472, cachelines: 8, members: 19 */
/* sum members: 454, holes: 5, sum holes: 18 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 24 bytes */
};
After:
=====
struct nvmet_sq {
struct nvmet_ctrl * ctrl; /* 0 8 */
struct percpu_ref ref; /* 8 16 */
u16 qid; /* 24 2 */
u16 size; /* 26 2 */
u32 sqhd; /* 28 4 */
bool sqhd_disabled; /* 32 1 */
bool authenticated; /* 33 1 */
/* XXX 6 bytes hole, try to pack */
struct delayed_work auth_expired_work; /* 40 184 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
u16 dhchap_tid; /* 224 2 */
u16 dhchap_status; /* 226 2 */
int dhchap_step; /* 228 4 */
u8 * dhchap_c1; /* 232 8 */
u8 * dhchap_c2; /* 240 8 */
u32 dhchap_s1; /* 248 4 */
u32 dhchap_s2; /* 252 4 */
/* --- cacheline 4 boundary (256 bytes) --- */
u8 * dhchap_skey; /* 256 8 */
int dhchap_skey_len; /* 264 4 */
/* XXX 4 bytes hole, try to pack */
struct completion free_done; /* 272 96 */
/* --- cacheline 5 boundary (320 bytes) was 48 bytes ago --- */
struct completion confirm_done; /* 368 96 */
/* size: 464, cachelines: 8, members: 19 */
/* sum members: 454, holes: 2, sum holes: 10 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 16 bytes */
};
---
drivers/nvme/target/nvmet.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index c50146085fb5..8cfd60f3b564 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -109,8 +109,8 @@ struct nvmet_sq {
u32 sqhd;
bool sqhd_disabled;
#ifdef CONFIG_NVME_TARGET_AUTH
- struct delayed_work auth_expired_work;
bool authenticated;
+ struct delayed_work auth_expired_work;
u16 dhchap_tid;
u16 dhchap_status;
int dhchap_step;
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/5] nvmet: Reorder fields in 'struct nvme_ctrl'
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
2023-05-01 12:40 ` [PATCH 1/5] nvmet: Reorder fields in 'struct nvmet_sq' Christophe JAILLET
@ 2023-05-01 12:40 ` Christophe JAILLET
2023-05-15 14:21 ` Keith Busch
2023-05-01 12:40 ` [PATCH 3/5] nvmet: Reorder fields in 'struct nvmf_ctrl_options' Christophe JAILLET
` (6 subsequent siblings)
8 siblings, 1 reply; 12+ messages in thread
From: Christophe JAILLET @ 2023-05-01 12:40 UTC (permalink / raw)
To: hch, sagi, kch
Cc: linux-nvme, linux-kernel, kernel-janitors, Christophe JAILLET
Group some variables based on their sizes to reduce holes.
On x86_64, this shrinks the size of 'struct nvmet_sq' from 5368 to 5344
bytes when all CONFIG_* are defined.
This structure is embedded in some other structures, so shrinking it helps
reduce their sizes as well.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Using pahole
Before:
======
struct nvme_ctrl {
bool comp_seen; /* 0 1 */
/* XXX 3 bytes hole, try to pack */
enum nvme_ctrl_state state; /* 4 4 */
bool identified; /* 8 1 */
/* XXX 7 bytes hole, try to pack */
spinlock_t lock; /* 16 72 */
/* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
struct mutex scan_lock; /* 88 160 */
/* --- cacheline 3 boundary (192 bytes) was 56 bytes ago --- */
const struct nvme_ctrl_ops * ops; /* 248 8 */
/* --- cacheline 4 boundary (256 bytes) --- */
struct request_queue * admin_q; /* 256 8 */
struct request_queue * connect_q; /* 264 8 */
struct request_queue * fabrics_q; /* 272 8 */
struct device * dev; /* 280 8 */
int instance; /* 288 4 */
int numa_node; /* 292 4 */
struct blk_mq_tag_set * tagset; /* 296 8 */
struct blk_mq_tag_set * admin_tagset; /* 304 8 */
struct list_head namespaces; /* 312 16 */
/* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */
struct rw_semaphore namespaces_rwsem; /* 328 168 */
/* --- cacheline 7 boundary (448 bytes) was 48 bytes ago --- */
struct device ctrl_device __attribute__((__aligned__(8))); /* 496 1400 */
/* XXX last struct has 3 bytes of padding */
/* --- cacheline 29 boundary (1856 bytes) was 40 bytes ago --- */
struct device * device; /* 1896 8 */
struct device * hwmon_device; /* 1904 8 */
struct cdev cdev; /* 1912 296 */
/* --- cacheline 34 boundary (2176 bytes) was 32 bytes ago --- */
struct work_struct reset_work; /* 2208 80 */
/* --- cacheline 35 boundary (2240 bytes) was 48 bytes ago --- */
struct work_struct delete_work; /* 2288 80 */
/* --- cacheline 37 boundary (2368 bytes) --- */
wait_queue_head_t state_wq; /* 2368 88 */
/* --- cacheline 38 boundary (2432 bytes) was 24 bytes ago --- */
struct nvme_subsystem * subsys; /* 2456 8 */
struct list_head subsys_entry; /* 2464 16 */
struct opal_dev * opal_dev; /* 2480 8 */
char name[12]; /* 2488 12 */
/* --- cacheline 39 boundary (2496 bytes) was 4 bytes ago --- */
u16 cntlid; /* 2500 2 */
/* XXX 2 bytes hole, try to pack */
u32 ctrl_config; /* 2504 4 */
u16 mtfa; /* 2508 2 */
/* XXX 2 bytes hole, try to pack */
u32 queue_count; /* 2512 4 */
/* XXX 4 bytes hole, try to pack */
u64 cap; /* 2520 8 */
u32 max_hw_sectors; /* 2528 4 */
u32 max_segments; /* 2532 4 */
u32 max_integrity_segments; /* 2536 4 */
u32 max_discard_sectors; /* 2540 4 */
u32 max_discard_segments; /* 2544 4 */
u32 max_zeroes_sectors; /* 2548 4 */
u32 max_zone_append; /* 2552 4 */
u16 crdt[3]; /* 2556 6 */
/* --- cacheline 40 boundary (2560 bytes) was 2 bytes ago --- */
u16 oncs; /* 2562 2 */
u32 dmrsl; /* 2564 4 */
u16 oacs; /* 2568 2 */
u16 sqsize; /* 2570 2 */
u32 max_namespaces; /* 2572 4 */
atomic_t abort_limit; /* 2576 4 */
u8 vwc; /* 2580 1 */
/* XXX 3 bytes hole, try to pack */
u32 vs; /* 2584 4 */
u32 sgls; /* 2588 4 */
u16 kas; /* 2592 2 */
u8 npss; /* 2594 1 */
u8 apsta; /* 2595 1 */
u16 wctemp; /* 2596 2 */
u16 cctemp; /* 2598 2 */
u32 oaes; /* 2600 4 */
u32 aen_result; /* 2604 4 */
u32 ctratt; /* 2608 4 */
unsigned int shutdown_timeout; /* 2612 4 */
unsigned int kato; /* 2616 4 */
bool subsystem; /* 2620 1 */
/* XXX 3 bytes hole, try to pack */
/* --- cacheline 41 boundary (2624 bytes) --- */
long unsigned int quirks; /* 2624 8 */
struct nvme_id_power_state psd[32]; /* 2632 1024 */
/* --- cacheline 57 boundary (3648 bytes) was 8 bytes ago --- */
struct nvme_effects_log * effects; /* 3656 8 */
struct xarray cels; /* 3664 88 */
/* --- cacheline 58 boundary (3712 bytes) was 40 bytes ago --- */
struct work_struct scan_work; /* 3752 80 */
/* --- cacheline 59 boundary (3776 bytes) was 56 bytes ago --- */
struct work_struct async_event_work; /* 3832 80 */
/* --- cacheline 61 boundary (3904 bytes) was 8 bytes ago --- */
struct delayed_work ka_work; /* 3912 184 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 64 boundary (4096 bytes) --- */
struct delayed_work failfast_work; /* 4096 184 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 66 boundary (4224 bytes) was 56 bytes ago --- */
struct nvme_command ka_cmd; /* 4280 64 */
/* --- cacheline 67 boundary (4288 bytes) was 56 bytes ago --- */
struct work_struct fw_act_work; /* 4344 80 */
/* --- cacheline 69 boundary (4416 bytes) was 8 bytes ago --- */
long unsigned int events; /* 4424 8 */
u8 anacap; /* 4432 1 */
u8 anatt; /* 4433 1 */
/* XXX 2 bytes hole, try to pack */
u32 anagrpmax; /* 4436 4 */
u32 nanagrpid; /* 4440 4 */
/* XXX 4 bytes hole, try to pack */
struct mutex ana_lock; /* 4448 160 */
/* --- cacheline 72 boundary (4608 bytes) --- */
struct nvme_ana_rsp_hdr * ana_log_buf; /* 4608 8 */
size_t ana_log_size; /* 4616 8 */
struct timer_list anatt_timer; /* 4624 88 */
/* --- cacheline 73 boundary (4672 bytes) was 40 bytes ago --- */
struct work_struct ana_work; /* 4712 80 */
/* --- cacheline 74 boundary (4736 bytes) was 56 bytes ago --- */
struct work_struct dhchap_auth_work; /* 4792 80 */
/* --- cacheline 76 boundary (4864 bytes) was 8 bytes ago --- */
struct mutex dhchap_auth_mutex; /* 4872 160 */
/* --- cacheline 78 boundary (4992 bytes) was 40 bytes ago --- */
struct nvme_dhchap_queue_context * dhchap_ctxs; /* 5032 8 */
struct nvme_dhchap_key * host_key; /* 5040 8 */
struct nvme_dhchap_key * ctrl_key; /* 5048 8 */
/* --- cacheline 79 boundary (5056 bytes) --- */
u16 transaction; /* 5056 2 */
/* XXX 6 bytes hole, try to pack */
u64 ps_max_latency_us; /* 5064 8 */
bool apst_enabled; /* 5072 1 */
/* XXX 3 bytes hole, try to pack */
u32 hmpre; /* 5076 4 */
u32 hmmin; /* 5080 4 */
u32 hmminds; /* 5084 4 */
u16 hmmaxd; /* 5088 2 */
/* XXX 2 bytes hole, try to pack */
u32 ioccsz; /* 5092 4 */
u32 iorcsz; /* 5096 4 */
u16 icdoff; /* 5100 2 */
u16 maxcmd; /* 5102 2 */
int nr_reconnects; /* 5104 4 */
/* XXX 4 bytes hole, try to pack */
long unsigned int flags; /* 5112 8 */
/* --- cacheline 80 boundary (5120 bytes) --- */
struct nvmf_ctrl_options * opts; /* 5120 8 */
struct page * discard_page; /* 5128 8 */
long unsigned int discard_page_busy; /* 5136 8 */
struct nvme_fault_inject fault_inject; /* 5144 216 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 83 boundary (5312 bytes) was 48 bytes ago --- */
enum nvme_ctrl_type cntrltype; /* 5360 4 */
enum nvme_dctype dctype; /* 5364 4 */
/* size: 5368, cachelines: 84, members: 104 */
/* sum members: 5323, holes: 13, sum holes: 45 */
/* paddings: 4, sum paddings: 15 */
/* forced alignments: 1 */
/* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));
After:
=====
struct nvme_ctrl {
bool comp_seen; /* 0 1 */
bool identified; /* 1 1 */
/* XXX 2 bytes hole, try to pack */
enum nvme_ctrl_state state; /* 4 4 */
spinlock_t lock; /* 8 72 */
/* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
struct mutex scan_lock; /* 80 160 */
/* --- cacheline 3 boundary (192 bytes) was 48 bytes ago --- */
const struct nvme_ctrl_ops * ops; /* 240 8 */
struct request_queue * admin_q; /* 248 8 */
/* --- cacheline 4 boundary (256 bytes) --- */
struct request_queue * connect_q; /* 256 8 */
struct request_queue * fabrics_q; /* 264 8 */
struct device * dev; /* 272 8 */
int instance; /* 280 4 */
int numa_node; /* 284 4 */
struct blk_mq_tag_set * tagset; /* 288 8 */
struct blk_mq_tag_set * admin_tagset; /* 296 8 */
struct list_head namespaces; /* 304 16 */
/* --- cacheline 5 boundary (320 bytes) --- */
struct rw_semaphore namespaces_rwsem; /* 320 168 */
/* --- cacheline 7 boundary (448 bytes) was 40 bytes ago --- */
struct device ctrl_device __attribute__((__aligned__(8))); /* 488 1400 */
/* XXX last struct has 3 bytes of padding */
/* --- cacheline 29 boundary (1856 bytes) was 32 bytes ago --- */
struct device * device; /* 1888 8 */
struct device * hwmon_device; /* 1896 8 */
struct cdev cdev; /* 1904 296 */
/* --- cacheline 34 boundary (2176 bytes) was 24 bytes ago --- */
struct work_struct reset_work; /* 2200 80 */
/* --- cacheline 35 boundary (2240 bytes) was 40 bytes ago --- */
struct work_struct delete_work; /* 2280 80 */
/* --- cacheline 36 boundary (2304 bytes) was 56 bytes ago --- */
wait_queue_head_t state_wq; /* 2360 88 */
/* --- cacheline 38 boundary (2432 bytes) was 16 bytes ago --- */
struct nvme_subsystem * subsys; /* 2448 8 */
struct list_head subsys_entry; /* 2456 16 */
struct opal_dev * opal_dev; /* 2472 8 */
char name[12]; /* 2480 12 */
u16 cntlid; /* 2492 2 */
u16 mtfa; /* 2494 2 */
/* --- cacheline 39 boundary (2496 bytes) --- */
u32 ctrl_config; /* 2496 4 */
u32 queue_count; /* 2500 4 */
u64 cap; /* 2504 8 */
u32 max_hw_sectors; /* 2512 4 */
u32 max_segments; /* 2516 4 */
u32 max_integrity_segments; /* 2520 4 */
u32 max_discard_sectors; /* 2524 4 */
u32 max_discard_segments; /* 2528 4 */
u32 max_zeroes_sectors; /* 2532 4 */
u32 max_zone_append; /* 2536 4 */
u16 crdt[3]; /* 2540 6 */
u16 oncs; /* 2546 2 */
u32 dmrsl; /* 2548 4 */
u16 oacs; /* 2552 2 */
u16 sqsize; /* 2554 2 */
u32 max_namespaces; /* 2556 4 */
/* --- cacheline 40 boundary (2560 bytes) --- */
atomic_t abort_limit; /* 2560 4 */
u8 vwc; /* 2564 1 */
/* XXX 3 bytes hole, try to pack */
u32 vs; /* 2568 4 */
u32 sgls; /* 2572 4 */
u16 kas; /* 2576 2 */
u8 npss; /* 2578 1 */
u8 apsta; /* 2579 1 */
u16 wctemp; /* 2580 2 */
u16 cctemp; /* 2582 2 */
u32 oaes; /* 2584 4 */
u32 aen_result; /* 2588 4 */
u32 ctratt; /* 2592 4 */
unsigned int shutdown_timeout; /* 2596 4 */
unsigned int kato; /* 2600 4 */
bool subsystem; /* 2604 1 */
/* XXX 3 bytes hole, try to pack */
long unsigned int quirks; /* 2608 8 */
struct nvme_id_power_state psd[32]; /* 2616 1024 */
/* --- cacheline 56 boundary (3584 bytes) was 56 bytes ago --- */
struct nvme_effects_log * effects; /* 3640 8 */
/* --- cacheline 57 boundary (3648 bytes) --- */
struct xarray cels; /* 3648 88 */
/* --- cacheline 58 boundary (3712 bytes) was 24 bytes ago --- */
struct work_struct scan_work; /* 3736 80 */
/* --- cacheline 59 boundary (3776 bytes) was 40 bytes ago --- */
struct work_struct async_event_work; /* 3816 80 */
/* --- cacheline 60 boundary (3840 bytes) was 56 bytes ago --- */
struct delayed_work ka_work; /* 3896 184 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 63 boundary (4032 bytes) was 48 bytes ago --- */
struct delayed_work failfast_work; /* 4080 184 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 66 boundary (4224 bytes) was 40 bytes ago --- */
struct nvme_command ka_cmd; /* 4264 64 */
/* --- cacheline 67 boundary (4288 bytes) was 40 bytes ago --- */
struct work_struct fw_act_work; /* 4328 80 */
/* --- cacheline 68 boundary (4352 bytes) was 56 bytes ago --- */
long unsigned int events; /* 4408 8 */
/* --- cacheline 69 boundary (4416 bytes) --- */
u8 anacap; /* 4416 1 */
u8 anatt; /* 4417 1 */
/* XXX 2 bytes hole, try to pack */
u32 anagrpmax; /* 4420 4 */
u32 nanagrpid; /* 4424 4 */
/* XXX 4 bytes hole, try to pack */
struct mutex ana_lock; /* 4432 160 */
/* --- cacheline 71 boundary (4544 bytes) was 48 bytes ago --- */
struct nvme_ana_rsp_hdr * ana_log_buf; /* 4592 8 */
size_t ana_log_size; /* 4600 8 */
/* --- cacheline 72 boundary (4608 bytes) --- */
struct timer_list anatt_timer; /* 4608 88 */
/* --- cacheline 73 boundary (4672 bytes) was 24 bytes ago --- */
struct work_struct ana_work; /* 4696 80 */
/* --- cacheline 74 boundary (4736 bytes) was 40 bytes ago --- */
struct work_struct dhchap_auth_work; /* 4776 80 */
/* --- cacheline 75 boundary (4800 bytes) was 56 bytes ago --- */
struct mutex dhchap_auth_mutex; /* 4856 160 */
/* --- cacheline 78 boundary (4992 bytes) was 24 bytes ago --- */
struct nvme_dhchap_queue_context * dhchap_ctxs; /* 5016 8 */
struct nvme_dhchap_key * host_key; /* 5024 8 */
struct nvme_dhchap_key * ctrl_key; /* 5032 8 */
u16 transaction; /* 5040 2 */
/* XXX 6 bytes hole, try to pack */
u64 ps_max_latency_us; /* 5048 8 */
/* --- cacheline 79 boundary (5056 bytes) --- */
bool apst_enabled; /* 5056 1 */
/* XXX 1 byte hole, try to pack */
u16 hmmaxd; /* 5058 2 */
u32 hmpre; /* 5060 4 */
u32 hmmin; /* 5064 4 */
u32 hmminds; /* 5068 4 */
u32 ioccsz; /* 5072 4 */
u32 iorcsz; /* 5076 4 */
u16 icdoff; /* 5080 2 */
u16 maxcmd; /* 5082 2 */
int nr_reconnects; /* 5084 4 */
long unsigned int flags; /* 5088 8 */
struct nvmf_ctrl_options * opts; /* 5096 8 */
struct page * discard_page; /* 5104 8 */
long unsigned int discard_page_busy; /* 5112 8 */
/* --- cacheline 80 boundary (5120 bytes) --- */
struct nvme_fault_inject fault_inject; /* 5120 216 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 83 boundary (5312 bytes) was 24 bytes ago --- */
enum nvme_ctrl_type cntrltype; /* 5336 4 */
enum nvme_dctype dctype; /* 5340 4 */
/* size: 5344, cachelines: 84, members: 104 */
/* sum members: 5323, holes: 7, sum holes: 21 */
/* paddings: 4, sum paddings: 15 */
/* forced alignments: 1 */
/* last cacheline: 32 bytes */
} __attribute__((__aligned__(8)));
---
drivers/nvme/host/nvme.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bf46f122e9e1..53417b6439a7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -246,8 +246,8 @@ enum nvme_ctrl_flags {
struct nvme_ctrl {
bool comp_seen;
- enum nvme_ctrl_state state;
bool identified;
+ enum nvme_ctrl_state state;
spinlock_t lock;
struct mutex scan_lock;
const struct nvme_ctrl_ops *ops;
@@ -279,8 +279,8 @@ struct nvme_ctrl {
char name[12];
u16 cntlid;
- u32 ctrl_config;
u16 mtfa;
+ u32 ctrl_config;
u32 queue_count;
u64 cap;
@@ -353,10 +353,10 @@ struct nvme_ctrl {
bool apst_enabled;
/* PCIe only: */
+ u16 hmmaxd;
u32 hmpre;
u32 hmmin;
u32 hmminds;
- u16 hmmaxd;
/* Fabrics only */
u32 ioccsz;
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/5] nvmet: Reorder fields in 'struct nvme_ctrl'
2023-05-01 12:40 ` [PATCH 2/5] nvmet: Reorder fields in 'struct nvme_ctrl' Christophe JAILLET
@ 2023-05-15 14:21 ` Keith Busch
2023-05-15 17:22 ` Christophe JAILLET
0 siblings, 1 reply; 12+ messages in thread
From: Keith Busch @ 2023-05-15 14:21 UTC (permalink / raw)
To: Christophe JAILLET
Cc: hch, sagi, kch, linux-nvme, linux-kernel, kernel-janitors
On Mon, May 01, 2023 at 02:40:26PM +0200, Christophe JAILLET wrote:
> Group some variables based on their sizes to reduce holes.
> On x86_64, this shrinks the size of 'struct nvmet_sq' from 5368 to 5344
> bytes when all CONFIG_* are defined.
This patch is changing struct nvme_ctrl but the commit log only mentions
struct nvmet_sq, which was handled in patch 1/5. I'll just fix that up
when applying.
> This structure is embedded into some other structures, so it helps reducing
> their size as well.
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---
> Using pahole
>
> Before:
> ======
> struct nvme_ctrl {
> bool comp_seen; /* 0 1 */
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/5] nvmet: Reorder fields in 'struct nvme_ctrl'
2023-05-15 14:21 ` Keith Busch
@ 2023-05-15 17:22 ` Christophe JAILLET
0 siblings, 0 replies; 12+ messages in thread
From: Christophe JAILLET @ 2023-05-15 17:22 UTC (permalink / raw)
To: Keith Busch; +Cc: hch, sagi, kch, linux-nvme, linux-kernel, kernel-janitors
Le 15/05/2023 à 16:21, Keith Busch a écrit :
> On Mon, May 01, 2023 at 02:40:26PM +0200, Christophe JAILLET wrote:
>> Group some variables based on their sizes to reduce holes.
>> On x86_64, this shrinks the size of 'struct nvmet_sq' from 5368 to 5344
>> bytes when all CONFIG_* are defined.
>
> This patch is changing struct nvme_ctrl but the commit log only mentions
> struct nvmet_sq, which was handled in patch 1/5. I'll just fix that up
> when applying.
>
Thanks for catching and fixing it :).
CJ
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/5] nvmet: Reorder fields in 'struct nvmf_ctrl_options'
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
2023-05-01 12:40 ` [PATCH 1/5] nvmet: Reorder fields in 'struct nvmet_sq' Christophe JAILLET
2023-05-01 12:40 ` [PATCH 2/5] nvmet: Reorder fields in 'struct nvme_ctrl' Christophe JAILLET
@ 2023-05-01 12:40 ` Christophe JAILLET
2023-05-01 12:40 ` [PATCH 4/5] nvmet: Reorder fields in 'struct nvme_dhchap_queue_context' Christophe JAILLET
` (5 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christophe JAILLET @ 2023-05-01 12:40 UTC (permalink / raw)
To: hch, sagi, kch
Cc: linux-nvme, linux-kernel, kernel-janitors, Christophe JAILLET
Group some variables based on their sizes to reduce holes.
On x86_64, this shrinks the size of 'struct nvmf_ctrl_options' from 136 to
128 bytes.
When such a structure is allocated in nvmf_create_ctrl(), the slab
allocator rounds the request up to the next kmalloc size class: a 136-byte
request is served from a 192-byte object.
So this saves 64 bytes per allocation, one cache line to hold the whole
structure, and a few cycles when zeroing the memory in nvmf_create_ctrl().
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Using pahole
Before:
======
struct nvmf_ctrl_options {
unsigned int mask; /* 0 4 */
/* XXX 4 bytes hole, try to pack */
char * transport; /* 8 8 */
char * subsysnqn; /* 16 8 */
char * traddr; /* 24 8 */
char * trsvcid; /* 32 8 */
char * host_traddr; /* 40 8 */
char * host_iface; /* 48 8 */
size_t queue_size; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
unsigned int nr_io_queues; /* 64 4 */
unsigned int reconnect_delay; /* 68 4 */
bool discovery_nqn; /* 72 1 */
bool duplicate_connect; /* 73 1 */
/* XXX 2 bytes hole, try to pack */
unsigned int kato; /* 76 4 */
struct nvmf_host * host; /* 80 8 */
int max_reconnects; /* 88 4 */
/* XXX 4 bytes hole, try to pack */
char * dhchap_secret; /* 96 8 */
char * dhchap_ctrl_secret; /* 104 8 */
bool disable_sqflow; /* 112 1 */
bool hdr_digest; /* 113 1 */
bool data_digest; /* 114 1 */
/* XXX 1 byte hole, try to pack */
unsigned int nr_write_queues; /* 116 4 */
unsigned int nr_poll_queues; /* 120 4 */
int tos; /* 124 4 */
/* --- cacheline 2 boundary (128 bytes) --- */
int fast_io_fail_tmo; /* 128 4 */
/* size: 136, cachelines: 3, members: 24 */
/* sum members: 121, holes: 4, sum holes: 11 */
/* padding: 4 */
/* last cacheline: 8 bytes */
};
After:
=====
struct nvmf_ctrl_options {
unsigned int mask; /* 0 4 */
int max_reconnects; /* 4 4 */
char * transport; /* 8 8 */
char * subsysnqn; /* 16 8 */
char * traddr; /* 24 8 */
char * trsvcid; /* 32 8 */
char * host_traddr; /* 40 8 */
char * host_iface; /* 48 8 */
size_t queue_size; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
unsigned int nr_io_queues; /* 64 4 */
unsigned int reconnect_delay; /* 68 4 */
bool discovery_nqn; /* 72 1 */
bool duplicate_connect; /* 73 1 */
/* XXX 2 bytes hole, try to pack */
unsigned int kato; /* 76 4 */
struct nvmf_host * host; /* 80 8 */
char * dhchap_secret; /* 88 8 */
char * dhchap_ctrl_secret; /* 96 8 */
bool disable_sqflow; /* 104 1 */
bool hdr_digest; /* 105 1 */
bool data_digest; /* 106 1 */
/* XXX 1 byte hole, try to pack */
unsigned int nr_write_queues; /* 108 4 */
unsigned int nr_poll_queues; /* 112 4 */
int tos; /* 116 4 */
int fast_io_fail_tmo; /* 120 4 */
/* size: 128, cachelines: 2, members: 24 */
/* sum members: 121, holes: 2, sum holes: 3 */
/* padding: 4 */
};
---
drivers/nvme/host/fabrics.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index dcac3df8a5f7..1bc57fb75825 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -77,6 +77,9 @@ enum {
* with the parsing opts enum.
* @mask: Used by the fabrics library to parse through sysfs options
* on adding a NVMe controller.
+ * @max_reconnects: maximum number of allowed reconnect attempts before removing
+ * the controller, (-1) means reconnect forever, zero means remove
+ * immediately;
* @transport: Holds the fabric transport "technology name" (for a lack of
* better description) that will be used by an NVMe controller
* being added.
@@ -96,9 +99,6 @@ enum {
* @discovery_nqn: indicates if the subsysnqn is the well-known discovery NQN.
* @kato: Keep-alive timeout.
* @host: Virtual NVMe host, contains the NQN and Host ID.
- * @max_reconnects: maximum number of allowed reconnect attempts before removing
- * the controller, (-1) means reconnect forever, zero means remove
- * immediately;
* @dhchap_secret: DH-HMAC-CHAP secret
* @dhchap_ctrl_secret: DH-HMAC-CHAP controller secret for bi-directional
* authentication
@@ -112,6 +112,7 @@ enum {
*/
struct nvmf_ctrl_options {
unsigned mask;
+ int max_reconnects;
char *transport;
char *subsysnqn;
char *traddr;
@@ -125,7 +126,6 @@ struct nvmf_ctrl_options {
bool duplicate_connect;
unsigned int kato;
struct nvmf_host *host;
- int max_reconnects;
char *dhchap_secret;
char *dhchap_ctrl_secret;
bool disable_sqflow;
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/5] nvmet: Reorder fields in 'struct nvme_dhchap_queue_context'
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
` (2 preceding siblings ...)
2023-05-01 12:40 ` [PATCH 3/5] nvmet: Reorder fields in 'struct nvmf_ctrl_options' Christophe JAILLET
@ 2023-05-01 12:40 ` Christophe JAILLET
2023-05-01 12:40 ` [PATCH 5/5] nvmet: Reorder fields in 'struct nvmefc_fcp_req' Christophe JAILLET
` (4 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christophe JAILLET @ 2023-05-01 12:40 UTC (permalink / raw)
To: hch, sagi, kch
Cc: linux-nvme, linux-kernel, kernel-janitors, Christophe JAILLET
Group some variables based on their sizes to reduce holes.
On x86_64, this shrinks the size of 'struct nvme_dhchap_queue_context' from
416 to 400 bytes.
This structure is kvcalloc()'ed in nvme_auth_init_ctrl(), so it is likely
that the allocation is relatively big. Saving 16 bytes per structure
may make a slight difference.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Using pahole
Before:
======
struct nvme_dhchap_queue_context {
struct list_head entry; /* 0 16 */
struct work_struct auth_work; /* 16 80 */
/* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */
struct nvme_ctrl * ctrl; /* 96 8 */
struct crypto_shash * shash_tfm; /* 104 8 */
struct crypto_kpp * dh_tfm; /* 112 8 */
void * buf; /* 120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
int qid; /* 128 4 */
int error; /* 132 4 */
u32 s1; /* 136 4 */
u32 s2; /* 140 4 */
u16 transaction; /* 144 2 */
u8 status; /* 146 1 */
u8 hash_id; /* 147 1 */
/* XXX 4 bytes hole, try to pack */
size_t hash_len; /* 152 8 */
u8 dhgroup_id; /* 160 1 */
u8 c1[64]; /* 161 64 */
/* --- cacheline 3 boundary (192 bytes) was 33 bytes ago --- */
u8 c2[64]; /* 225 64 */
/* --- cacheline 4 boundary (256 bytes) was 33 bytes ago --- */
u8 response[64]; /* 289 64 */
/* XXX 7 bytes hole, try to pack */
/* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
u8 * host_response; /* 360 8 */
u8 * ctrl_key; /* 368 8 */
int ctrl_key_len; /* 376 4 */
/* XXX 4 bytes hole, try to pack */
/* --- cacheline 6 boundary (384 bytes) --- */
u8 * host_key; /* 384 8 */
int host_key_len; /* 392 4 */
/* XXX 4 bytes hole, try to pack */
u8 * sess_key; /* 400 8 */
int sess_key_len; /* 408 4 */
/* size: 416, cachelines: 7, members: 25 */
/* sum members: 393, holes: 4, sum holes: 19 */
/* padding: 4 */
/* last cacheline: 32 bytes */
};
After:
=====
struct nvme_dhchap_queue_context {
struct list_head entry; /* 0 16 */
struct work_struct auth_work; /* 16 80 */
/* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */
struct nvme_ctrl * ctrl; /* 96 8 */
struct crypto_shash * shash_tfm; /* 104 8 */
struct crypto_kpp * dh_tfm; /* 112 8 */
void * buf; /* 120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
int qid; /* 128 4 */
int error; /* 132 4 */
u32 s1; /* 136 4 */
u32 s2; /* 140 4 */
u16 transaction; /* 144 2 */
u8 status; /* 146 1 */
u8 dhgroup_id; /* 147 1 */
u8 hash_id; /* 148 1 */
/* XXX 3 bytes hole, try to pack */
size_t hash_len; /* 152 8 */
u8 c1[64]; /* 160 64 */
/* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
u8 c2[64]; /* 224 64 */
/* --- cacheline 4 boundary (256 bytes) was 32 bytes ago --- */
u8 response[64]; /* 288 64 */
/* --- cacheline 5 boundary (320 bytes) was 32 bytes ago --- */
u8 * host_response; /* 352 8 */
u8 * ctrl_key; /* 360 8 */
u8 * host_key; /* 368 8 */
u8 * sess_key; /* 376 8 */
/* --- cacheline 6 boundary (384 bytes) --- */
int ctrl_key_len; /* 384 4 */
int host_key_len; /* 388 4 */
int sess_key_len; /* 392 4 */
/* size: 400, cachelines: 7, members: 25 */
/* sum members: 393, holes: 1, sum holes: 3 */
/* padding: 4 */
/* last cacheline: 16 bytes */
};
---
drivers/nvme/host/auth.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index ea16a0aba679..daf5d144a8ea 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -30,18 +30,18 @@ struct nvme_dhchap_queue_context {
u32 s2;
u16 transaction;
u8 status;
+ u8 dhgroup_id;
u8 hash_id;
size_t hash_len;
- u8 dhgroup_id;
u8 c1[64];
u8 c2[64];
u8 response[64];
u8 *host_response;
u8 *ctrl_key;
- int ctrl_key_len;
u8 *host_key;
- int host_key_len;
u8 *sess_key;
+ int ctrl_key_len;
+ int host_key_len;
int sess_key_len;
};
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 5/5] nvmet: Reorder fields in 'struct nvmefc_fcp_req'
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
` (3 preceding siblings ...)
2023-05-01 12:40 ` [PATCH 4/5] nvmet: Reorder fields in 'struct nvme_dhchap_queue_context' Christophe JAILLET
@ 2023-05-01 12:40 ` Christophe JAILLET
2023-05-01 13:57 ` [PATCH 0/5] optimize some data structure in nvme Sagi Grimberg
` (3 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christophe JAILLET @ 2023-05-01 12:40 UTC (permalink / raw)
To: hch, sagi, kch
Cc: linux-nvme, linux-kernel, kernel-janitors, Christophe JAILLET
Group some variables based on their sizes to reduce holes.
On x86_64, this shrinks the size of 'struct nvmefc_fcp_req' from
112 to 104 bytes.
This structure is embedded in some other structures (nvme_fc_fcp_op,
which itself is embedded in nvme_fcp_op_w_sgl), so shrinking it helps
reduce the size of these structures too.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Using pahole
Before:
======
struct nvmefc_fcp_req {
void * cmdaddr; /* 0 8 */
void * rspaddr; /* 8 8 */
dma_addr_t cmddma; /* 16 8 */
dma_addr_t rspdma; /* 24 8 */
u16 cmdlen; /* 32 2 */
u16 rsplen; /* 34 2 */
u32 payload_length; /* 36 4 */
struct sg_table sg_table; /* 40 16 */
struct scatterlist * first_sgl; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
int sg_cnt; /* 64 4 */
enum nvmefc_fcp_datadir io_dir; /* 68 4 */
__le16 sqid; /* 72 2 */
/* XXX 6 bytes hole, try to pack */
void (*done)(struct nvmefc_fcp_req *); /* 80 8 */
void * private; /* 88 8 */
u32 transferred_length; /* 96 4 */
u16 rcv_rsplen; /* 100 2 */
/* XXX 2 bytes hole, try to pack */
u32 status; /* 104 4 */
/* size: 112, cachelines: 2, members: 17 */
/* sum members: 100, holes: 2, sum holes: 8 */
/* padding: 4 */
/* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));
After:
=====
struct nvmefc_fcp_req {
void * cmdaddr; /* 0 8 */
void * rspaddr; /* 8 8 */
dma_addr_t cmddma; /* 16 8 */
dma_addr_t rspdma; /* 24 8 */
u16 cmdlen; /* 32 2 */
u16 rsplen; /* 34 2 */
u32 payload_length; /* 36 4 */
struct sg_table sg_table; /* 40 16 */
struct scatterlist * first_sgl; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
int sg_cnt; /* 64 4 */
enum nvmefc_fcp_datadir io_dir; /* 68 4 */
void (*done)(struct nvmefc_fcp_req *); /* 72 8 */
void * private; /* 80 8 */
__le16 sqid; /* 88 2 */
u16 rcv_rsplen; /* 90 2 */
u32 transferred_length; /* 92 4 */
u32 status; /* 96 4 */
/* size: 104, cachelines: 2, members: 17 */
/* padding: 4 */
/* last cacheline: 40 bytes */
} __attribute__((__aligned__(8)));
---
include/linux/nvme-fc-driver.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/linux/nvme-fc-driver.h b/include/linux/nvme-fc-driver.h
index fa092b9be2fd..4109f1bd6128 100644
--- a/include/linux/nvme-fc-driver.h
+++ b/include/linux/nvme-fc-driver.h
@@ -185,7 +185,6 @@ enum nvmefc_fcp_datadir {
* @first_sgl: memory for 1st scatter/gather list segment for payload data
* @sg_cnt: number of elements in the scatter/gather list
* @io_dir: direction of the FCP request (see NVMEFC_FCP_xxx)
- * @sqid: The nvme SQID the command is being issued on
* @done: The callback routine the LLDD is to invoke upon completion of
* the FCP operation. req argument is the pointer to the original
* FCP IO operation.
@@ -194,12 +193,13 @@ enum nvmefc_fcp_datadir {
* while processing the operation. The length of the buffer
* corresponds to the fcprqst_priv_sz value specified in the
* nvme_fc_port_template supplied by the LLDD.
+ * @sqid: The nvme SQID the command is being issued on
*
* Values set by the LLDD indicating completion status of the FCP operation.
* Must be set prior to calling the done() callback.
+ * @rcv_rsplen: length, in bytes, of the FCP RSP IU received.
* @transferred_length: amount of payload data, in bytes, that were
* transferred. Should equal payload_length on success.
- * @rcv_rsplen: length, in bytes, of the FCP RSP IU received.
* @status: Completion status of the FCP operation. must be 0 upon success,
* negative errno value upon failure (ex: -EIO). Note: this is
* NOT a reflection of the NVME CQE completion status. Only the
@@ -219,14 +219,14 @@ struct nvmefc_fcp_req {
int sg_cnt;
enum nvmefc_fcp_datadir io_dir;
- __le16 sqid;
-
void (*done)(struct nvmefc_fcp_req *req);
void *private;
- u32 transferred_length;
+ __le16 sqid;
+
u16 rcv_rsplen;
+ u32 transferred_length;
u32 status;
} __aligned(sizeof(u64)); /* alignment for other things alloc'd with */
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] optimize some data structure in nvme
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
` (4 preceding siblings ...)
2023-05-01 12:40 ` [PATCH 5/5] nvmet: Reorder fields in 'struct nvmefc_fcp_req' Christophe JAILLET
@ 2023-05-01 13:57 ` Sagi Grimberg
2023-05-12 20:26 ` Christoph Hellwig
` (2 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2023-05-01 13:57 UTC (permalink / raw)
To: Christophe JAILLET, hch, kch; +Cc: linux-nvme, linux-kernel, kernel-janitors
> This series is a proposal to slightly optimize the memory needed for some
> structures used in nvme.
>
> This follows the discussion in [1].
>
> Honestly, I'm not convinced that this series really brings something.
> Because of the way memory allocation works, and its over-allocation to try to
> avoid memory fragmentation, some limited gains are most of the time useless.
>
> It could still help:
> - a structure with many holes can, at some point, have its size reach a
>   threshold (this is especially true if such structures are allocated with
>   kcalloc() or kmalloc_array())
> - it can save some space in other structures it is embedded in
> - it can save a few cycles if the structure is memcpy()'ed or zeroed, for
>   example
> - it can reduce cache usage
>
> With that in mind, patch 3 is a win, patch 4 is likely a win, the other ones
> are much more theoretical.
>
> The changes are really limited, so even if the gain is marginal, maybe it
> still makes sense to merge them.
Don't see why not, given they do make the structures smaller.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] optimize some data structure in nvme
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
` (5 preceding siblings ...)
2023-05-01 13:57 ` [PATCH 0/5] optimize some data structure in nvme Sagi Grimberg
@ 2023-05-12 20:26 ` Christoph Hellwig
2023-05-13 1:01 ` Chaitanya Kulkarni
2023-05-15 17:13 ` Keith Busch
8 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2023-05-12 20:26 UTC (permalink / raw)
To: Christophe JAILLET
Cc: hch, sagi, kch, linux-nvme, linux-kernel, kernel-janitors
Thanks,
the whole series looks good to me:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] optimize some data structure in nvme
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
` (6 preceding siblings ...)
2023-05-12 20:26 ` Christoph Hellwig
@ 2023-05-13 1:01 ` Chaitanya Kulkarni
2023-05-15 17:13 ` Keith Busch
8 siblings, 0 replies; 12+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-13 1:01 UTC (permalink / raw)
To: Christophe JAILLET
Cc: linux-nvme@lists.infradead.org, sagi@grimberg.me,
Chaitanya Kulkarni, linux-kernel@vger.kernel.org, hch@lst.de,
kernel-janitors@vger.kernel.org
On 5/1/23 05:40, Christophe JAILLET wrote:
> This series is a proposal to slightly optimize the memory needed for some
> structures used in nvme.
>
> This follows the discussion in [1].
>
> Honestly, I'm not convinced that this series really brings something.
> Because of the way memory allocation works, and its over-allocation to try to
> avoid memory fragmentation, some limited gains are most of the time useless.
>
> It could still help:
> - a structure with many holes can, at some point, have its size reach a
>   threshold (this is especially true if such structures are allocated with
>   kcalloc() or kmalloc_array())
> - it can save some space in other structures it is embedded in
> - it can save a few cycles if the structure is memcpy()'ed or zeroed, for
>   example
> - it can reduce cache usage
>
> With that in mind, patch 3 is a win, patch 4 is likely a win, the other ones
> are much more theoretical.
>
> The changes are really limited, so even if the gain is marginal, maybe it
> still makes sense to merge them.
>
> Each patch gives the layout generated by pahole before and after the patch.
>
> [1]: https://lore.kernel.org/all/67a9e53e-4ac9-7ba8-9713-96c1dfe1e341@nvidia.com/
>
> Christophe JAILLET (5):
> nvmet: Reorder fields in 'struct nvmet_sq'
> nvmet: Reorder fields in 'struct nvme_ctrl'
> nvmet: Reorder fields in 'struct nvmf_ctrl_options'
> nvmet: Reorder fields in 'struct nvme_dhchap_queue_context'
> nvmet: Reorder fields in 'struct nvmefc_fcp_req'
>
> drivers/nvme/host/auth.c | 6 +++---
> drivers/nvme/host/fabrics.h | 8 ++++----
> drivers/nvme/host/nvme.h | 6 +++---
> drivers/nvme/target/nvmet.h | 4 ++--
> include/linux/nvme-fc-driver.h | 10 +++++-----
> 5 files changed, 17 insertions(+), 17 deletions(-)
>
Thanks a lot for doing this and for following up on the comment on the other patch.
Looks good.
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
-ck
^ permalink raw reply [flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] optimize some data structure in nvme
2023-05-01 12:40 [PATCH 0/5] optimize some data structure in nvme Christophe JAILLET
` (7 preceding siblings ...)
2023-05-13 1:01 ` Chaitanya Kulkarni
@ 2023-05-15 17:13 ` Keith Busch
8 siblings, 0 replies; 12+ messages in thread
From: Keith Busch @ 2023-05-15 17:13 UTC (permalink / raw)
To: Christophe JAILLET
Cc: hch, sagi, kch, linux-nvme, linux-kernel, kernel-janitors
Thanks, applied to nvme-6.5.
^ permalink raw reply [flat|nested] 12+ messages in thread