linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Namhyung Kim" <namhyung@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Adrian Hunter" <adrian.hunter@intel.com>,
	"James Clark" <james.clark@arm.com>,
	"Athira Rajeev" <atrajeev@linux.vnet.ibm.com>,
	"Colin Ian King" <colin.i.king@gmail.com>,
	"Ahelenia Ziemiańska" <nabijaczleweli@nabijaczleweli.xyz>,
	"Leo Yan" <leo.yan@linux.dev>, "Song Liu" <song@kernel.org>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Liam Howlett" <liam.howlett@oracle.com>,
	"Ilkka Koskinen" <ilkka@os.amperecomputing.com>,
	"Ben Gainey" <ben.gainey@arm.com>,
	"K Prateek Nayak" <kprateek.nayak@amd.com>,
	"Kan Liang" <kan.liang@linux.intel.com>,
	"Yanteng Si" <siyanteng@loongson.cn>,
	"Ravi Bangoria" <ravi.bangoria@amd.com>,
	"Sun Haiyong" <sunhaiyong@loongson.cn>,
	"Changbin Du" <changbin.du@huawei.com>,
	"Masami Hiramatsu" <mhiramat@kernel.org>,
	zhaimingbing <zhaimingbing@cmss.chinamobile.com>,
	"Paran Lee" <p4ranlee@gmail.com>, "Li Dong" <lidong@vivo.com>,
	elfring@users.sourceforge.net, "Andi Kleen" <ak@linux.intel.com>,
	"Markus Elfring" <Markus.Elfring@web.de>,
	"Chengen Du" <chengen.du@canonical.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	bpf@vger.kernel.org
Subject: Re: [PATCH v2 01/13] perf dso: Reorder variables to save space in struct dso
Date: Thu, 21 Mar 2024 17:28:09 -0300	[thread overview]
Message-ID: <ZfyYWcu0E-Pcn6Rf@x1> (raw)
In-Reply-To: <20240321160300.1635121-2-irogers@google.com>

On Thu, Mar 21, 2024 at 09:02:48AM -0700, Ian Rogers wrote:
> Save 40 bytes and move from 8 to 7 cache lines. Make variable dwfl
> dependent on being a powerpc build. Squeeze bits of int/enum types
> when appropriate. Remove holes/padding by reordering variables.

Thanks, applied.

- Arnaldo
 
> Before:
> ```
> struct dso {
>         struct mutex               lock;                 /*     0    40 */
>         struct list_head           node;                 /*    40    16 */
>         struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
>         /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
>         struct rb_root *           root;                 /*    80     8 */
>         struct rb_root_cached      symbols;              /*    88    16 */
>         struct symbol * *          symbol_names;         /*   104     8 */
>         size_t                     symbol_names_len;     /*   112     8 */
>         struct rb_root_cached      inlined_nodes;        /*   120    16 */
>         /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
>         struct rb_root_cached      srclines;             /*   136    16 */
>         struct {
>                 u64                addr;                 /*   152     8 */
>                 struct symbol *    symbol;               /*   160     8 */
>         } last_find_result;                              /*   152    16 */
>         void *                     a2l;                  /*   168     8 */
>         char *                     symsrc_filename;      /*   176     8 */
>         unsigned int               a2l_fails;            /*   184     4 */
>         enum dso_space_type        kernel;               /*   188     4 */
>         /* --- cacheline 3 boundary (192 bytes) --- */
>         _Bool                      is_kmod;              /*   192     1 */
> 
>         /* XXX 3 bytes hole, try to pack */
> 
>         enum dso_swap_type         needs_swap;           /*   196     4 */
>         enum dso_binary_type       symtab_type;          /*   200     4 */
>         enum dso_binary_type       binary_type;          /*   204     4 */
>         enum dso_load_errno        load_errno;           /*   208     4 */
>         u8                         adjust_symbols:1;     /*   212: 0  1 */
>         u8                         has_build_id:1;       /*   212: 1  1 */
>         u8                         header_build_id:1;    /*   212: 2  1 */
>         u8                         has_srcline:1;        /*   212: 3  1 */
>         u8                         hit:1;                /*   212: 4  1 */
>         u8                         annotate_warned:1;    /*   212: 5  1 */
>         u8                         auxtrace_warned:1;    /*   212: 6  1 */
>         u8                         short_name_allocated:1; /*   212: 7  1 */
>         u8                         long_name_allocated:1; /*   213: 0  1 */
>         u8                         is_64_bit:1;          /*   213: 1  1 */
> 
>         /* XXX 6 bits hole, try to pack */
> 
>         _Bool                      sorted_by_name;       /*   214     1 */
>         _Bool                      loaded;               /*   215     1 */
>         u8                         rel;                  /*   216     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct build_id            bid;                  /*   224    32 */
>         /* --- cacheline 4 boundary (256 bytes) --- */
>         u64                        text_offset;          /*   256     8 */
>         u64                        text_end;             /*   264     8 */
>         const char  *              short_name;           /*   272     8 */
>         const char  *              long_name;            /*   280     8 */
>         u16                        long_name_len;        /*   288     2 */
>         u16                        short_name_len;       /*   290     2 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         void *                     dwfl;                 /*   296     8 */
>         struct auxtrace_cache *    auxtrace_cache;       /*   304     8 */
>         int                        comp;                 /*   312     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         /* --- cacheline 5 boundary (320 bytes) --- */
>         struct {
>                 struct rb_root     cache;                /*   320     8 */
>                 int                fd;                   /*   328     4 */
>                 int                status;               /*   332     4 */
>                 u32                status_seen;          /*   336     4 */
> 
>                 /* XXX 4 bytes hole, try to pack */
> 
>                 u64                file_size;            /*   344     8 */
>                 struct list_head   open_entry;           /*   352    16 */
>                 u64                elf_base_addr;        /*   368     8 */
>                 u64                debug_frame_offset;   /*   376     8 */
>                 /* --- cacheline 6 boundary (384 bytes) --- */
>                 u64                eh_frame_hdr_addr;    /*   384     8 */
>                 u64                eh_frame_hdr_offset;  /*   392     8 */
>         } data;                                          /*   320    80 */
>         struct {
>                 u32                id;                   /*   400     4 */
>                 u32                sub_id;               /*   404     4 */
>                 struct perf_env *  env;                  /*   408     8 */
>         } bpf_prog;                                      /*   400    16 */
>         union {
>                 void *             priv;                 /*   416     8 */
>                 u64                db_id;                /*   416     8 */
>         };                                               /*   416     8 */
>         struct nsinfo *            nsinfo;               /*   424     8 */
>         struct dso_id              id;                   /*   432    24 */
>         /* --- cacheline 7 boundary (448 bytes) was 8 bytes ago --- */
>         refcount_t                 refcnt;               /*   456     4 */
>         char                       name[];               /*   460     0 */
> 
>         /* size: 464, cachelines: 8, members: 49 */
>         /* sum members: 440, holes: 4, sum holes: 18 */
>         /* sum bitfield members: 10 bits, bit holes: 1, sum bit holes: 6 bits */
>         /* padding: 4 */
>         /* forced alignments: 1 */
>         /* last cacheline: 16 bytes */
> } __attribute__((__aligned__(8)));
> ```
> 
> After:
> ```
> struct dso {
>         struct mutex               lock;                 /*     0    40 */
>         struct list_head           node;                 /*    40    16 */
>         struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
>         /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
>         struct rb_root *           root;                 /*    80     8 */
>         struct rb_root_cached      symbols;              /*    88    16 */
>         struct symbol * *          symbol_names;         /*   104     8 */
>         size_t                     symbol_names_len;     /*   112     8 */
>         struct rb_root_cached      inlined_nodes;        /*   120    16 */
>         /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
>         struct rb_root_cached      srclines;             /*   136    16 */
>         struct {
>                 u64                addr;                 /*   152     8 */
>                 struct symbol *    symbol;               /*   160     8 */
>         } last_find_result;                              /*   152    16 */
>         struct build_id            bid;                  /*   168    32 */
>         /* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */
>         u64                        text_offset;          /*   200     8 */
>         u64                        text_end;             /*   208     8 */
>         const char  *              short_name;           /*   216     8 */
>         const char  *              long_name;            /*   224     8 */
>         void *                     a2l;                  /*   232     8 */
>         char *                     symsrc_filename;      /*   240     8 */
>         struct nsinfo *            nsinfo;               /*   248     8 */
>         /* --- cacheline 4 boundary (256 bytes) --- */
>         struct auxtrace_cache *    auxtrace_cache;       /*   256     8 */
>         union {
>                 void *             priv;                 /*   264     8 */
>                 u64                db_id;                /*   264     8 */
>         };                                               /*   264     8 */
>         struct {
>                 struct perf_env *  env;                  /*   272     8 */
>                 u32                id;                   /*   280     4 */
>                 u32                sub_id;               /*   284     4 */
>         } bpf_prog;                                      /*   272    16 */
>         struct {
>                 struct rb_root     cache;                /*   288     8 */
>                 struct list_head   open_entry;           /*   296    16 */
>                 u64                file_size;            /*   312     8 */
>                 /* --- cacheline 5 boundary (320 bytes) --- */
>                 u64                elf_base_addr;        /*   320     8 */
>                 u64                debug_frame_offset;   /*   328     8 */
>                 u64                eh_frame_hdr_addr;    /*   336     8 */
>                 u64                eh_frame_hdr_offset;  /*   344     8 */
>                 int                fd;                   /*   352     4 */
>                 int                status;               /*   356     4 */
>                 u32                status_seen;          /*   360     4 */
>         } data;                                          /*   288    80 */
> 
>         /* XXX last struct has 4 bytes of padding */
> 
>         struct dso_id              id;                   /*   368    24 */
>         /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
>         unsigned int               a2l_fails;            /*   392     4 */
>         int                        comp;                 /*   396     4 */
>         refcount_t                 refcnt;               /*   400     4 */
>         enum dso_load_errno        load_errno;           /*   404     4 */
>         u16                        long_name_len;        /*   408     2 */
>         u16                        short_name_len;       /*   410     2 */
>         enum dso_binary_type       symtab_type:8;        /*   412: 0  4 */
>         enum dso_binary_type       binary_type:8;        /*   412: 8  4 */
>         enum dso_space_type        kernel:2;             /*   412:16  4 */
>         enum dso_swap_type         needs_swap:2;         /*   412:18  4 */
> 
>         /* Bitfield combined with next fields */
> 
>         _Bool                      is_kmod:1;            /*   414: 4  1 */
>         u8                         adjust_symbols:1;     /*   414: 5  1 */
>         u8                         has_build_id:1;       /*   414: 6  1 */
>         u8                         header_build_id:1;    /*   414: 7  1 */
>         u8                         has_srcline:1;        /*   415: 0  1 */
>         u8                         hit:1;                /*   415: 1  1 */
>         u8                         annotate_warned:1;    /*   415: 2  1 */
>         u8                         auxtrace_warned:1;    /*   415: 3  1 */
>         u8                         short_name_allocated:1; /*   415: 4  1 */
>         u8                         long_name_allocated:1; /*   415: 5  1 */
>         u8                         is_64_bit:1;          /*   415: 6  1 */
> 
>         /* XXX 1 bit hole, try to pack */
> 
>         _Bool                      sorted_by_name;       /*   416     1 */
>         _Bool                      loaded;               /*   417     1 */
>         u8                         rel;                  /*   418     1 */
>         char                       name[];               /*   419     0 */
> 
>         /* size: 424, cachelines: 7, members: 48 */
>         /* sum members: 415 */
>         /* sum bitfield members: 31 bits, bit holes: 1, sum bit holes: 1 bits */
>         /* padding: 5 */
>         /* paddings: 1, sum paddings: 4 */
>         /* forced alignments: 1 */
>         /* last cacheline: 40 bytes */
> } __attribute__((__aligned__(8)));
> ```
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/dso.h | 84 +++++++++++++++++++++----------------------
>  1 file changed, 42 insertions(+), 42 deletions(-)
> 
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index 2cdcd1e2ef8b..17dab230a2ca 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -161,66 +161,66 @@ struct dso {
>  		u64		addr;
>  		struct symbol	*symbol;
>  	} last_find_result;
> -	void		 *a2l;
> -	char		 *symsrc_filename;
> -	unsigned int	 a2l_fails;
> -	enum dso_space_type	kernel;
> -	bool			is_kmod;
> -	enum dso_swap_type	needs_swap;
> -	enum dso_binary_type	symtab_type;
> -	enum dso_binary_type	binary_type;
> -	enum dso_load_errno	load_errno;
> -	u8		 adjust_symbols:1;
> -	u8		 has_build_id:1;
> -	u8		 header_build_id:1;
> -	u8		 has_srcline:1;
> -	u8		 hit:1;
> -	u8		 annotate_warned:1;
> -	u8		 auxtrace_warned:1;
> -	u8		 short_name_allocated:1;
> -	u8		 long_name_allocated:1;
> -	u8		 is_64_bit:1;
> -	bool		 sorted_by_name;
> -	bool		 loaded;
> -	u8		 rel;
>  	struct build_id	 bid;
>  	u64		 text_offset;
>  	u64		 text_end;
>  	const char	 *short_name;
>  	const char	 *long_name;
> -	u16		 long_name_len;
> -	u16		 short_name_len;
> +	void		 *a2l;
> +	char		 *symsrc_filename;
> +#if defined(__powerpc__)
>  	void		*dwfl;			/* DWARF debug info */
> +#endif
> +	struct nsinfo	*nsinfo;
>  	struct auxtrace_cache *auxtrace_cache;
> -	int		 comp;
> -
> +	union { /* Tool specific area */
> +		void	 *priv;
> +		u64	 db_id;
> +	};
> +	/* bpf prog information */
> +	struct {
> +		struct perf_env	*env;
> +		u32		id;
> +		u32		sub_id;
> +	} bpf_prog;
>  	/* dso data file */
>  	struct {
>  		struct rb_root	 cache;
> -		int		 fd;
> -		int		 status;
> -		u32		 status_seen;
> -		u64		 file_size;
>  		struct list_head open_entry;
> +		u64		 file_size;
>  		u64		 elf_base_addr;
>  		u64		 debug_frame_offset;
>  		u64		 eh_frame_hdr_addr;
>  		u64		 eh_frame_hdr_offset;
> +		int		 fd;
> +		int		 status;
> +		u32		 status_seen;
>  	} data;
> -	/* bpf prog information */
> -	struct {
> -		u32		id;
> -		u32		sub_id;
> -		struct perf_env	*env;
> -	} bpf_prog;
> -
> -	union { /* Tool specific area */
> -		void	 *priv;
> -		u64	 db_id;
> -	};
> -	struct nsinfo	*nsinfo;
>  	struct dso_id	 id;
> +	unsigned int	 a2l_fails;
> +	int		 comp;
>  	refcount_t	 refcnt;
> +	enum dso_load_errno	load_errno;
> +	u16		 long_name_len;
> +	u16		 short_name_len;
> +	enum dso_binary_type	symtab_type:8;
> +	enum dso_binary_type	binary_type:8;
> +	enum dso_space_type	kernel:2;
> +	enum dso_swap_type	needs_swap:2;
> +	bool			is_kmod:1;
> +	u8		 adjust_symbols:1;
> +	u8		 has_build_id:1;
> +	u8		 header_build_id:1;
> +	u8		 has_srcline:1;
> +	u8		 hit:1;
> +	u8		 annotate_warned:1;
> +	u8		 auxtrace_warned:1;
> +	u8		 short_name_allocated:1;
> +	u8		 long_name_allocated:1;
> +	u8		 is_64_bit:1;
> +	bool		 sorted_by_name;
> +	bool		 loaded;
> +	u8		 rel;
>  	char		 name[];
>  };
>  
> -- 
> 2.44.0.396.g6e790dbe36-goog
> 

  reply	other threads:[~2024-03-21 20:28 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-21 16:02 [PATCH v2 00/13] dso/dsos memory savings and clean up Ian Rogers
2024-03-21 16:02 ` [PATCH v2 01/13] perf dso: Reorder variables to save space in struct dso Ian Rogers
2024-03-21 20:28   ` Arnaldo Carvalho de Melo [this message]
2024-03-21 16:02 ` [PATCH v2 02/13] perf dsos: Attempt to better abstract dsos internals Ian Rogers
2024-03-21 16:02 ` [PATCH v2 03/13] perf dsos: Tidy reference counting and locking Ian Rogers
2024-03-21 16:02 ` [PATCH v2 04/13] perf dsos: Add dsos__for_each_dso Ian Rogers
2024-03-22 20:43   ` Namhyung Kim
2024-03-22 20:54     ` Namhyung Kim
2024-03-21 16:02 ` [PATCH v2 05/13] perf dso: Move dso functions out of dsos Ian Rogers
2024-03-21 16:02 ` [PATCH v2 06/13] perf dsos: Switch more loops to dsos__for_each_dso Ian Rogers
2024-03-21 16:02 ` [PATCH v2 07/13] perf dsos: Switch backing storage to array from rbtree/list Ian Rogers
2024-03-21 16:02 ` [PATCH v2 08/13] perf dsos: Remove __dsos__addnew Ian Rogers
2024-03-21 16:02 ` [PATCH v2 09/13] perf dsos: Remove __dsos__findnew_link_by_longname_id Ian Rogers
2024-03-21 16:02 ` [PATCH v2 10/13] perf dsos: Switch hand code to bsearch Ian Rogers
2024-03-21 16:02 ` [PATCH v2 11/13] perf dso: Add reference count checking and accessor functions Ian Rogers
2024-03-21 16:02 ` [PATCH v2 12/13] perf dso: Reference counting related fixes Ian Rogers
2024-03-25 17:22   ` Markus Elfring
2024-03-21 16:03 ` [PATCH v2 13/13] perf dso: Use container_of to avoid a pointer in dso_data Ian Rogers
2024-03-25 21:03 ` [PATCH v2 00/13] dso/dsos memory savings and clean up Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZfyYWcu0E-Pcn6Rf@x1 \
    --to=acme@kernel.org \
    --cc=Markus.Elfring@web.de \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=atrajeev@linux.vnet.ibm.com \
    --cc=ben.gainey@arm.com \
    --cc=bpf@vger.kernel.org \
    --cc=changbin.du@huawei.com \
    --cc=chengen.du@canonical.com \
    --cc=colin.i.king@gmail.com \
    --cc=elfring@users.sourceforge.net \
    --cc=ilkka@os.amperecomputing.com \
    --cc=irogers@google.com \
    --cc=james.clark@arm.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kprateek.nayak@amd.com \
    --cc=leo.yan@linux.dev \
    --cc=liam.howlett@oracle.com \
    --cc=lidong@vivo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=nabijaczleweli@nabijaczleweli.xyz \
    --cc=namhyung@kernel.org \
    --cc=ojeda@kernel.org \
    --cc=p4ranlee@gmail.com \
    --cc=peterz@infradead.org \
    --cc=ravi.bangoria@amd.com \
    --cc=siyanteng@loongson.cn \
    --cc=song@kernel.org \
    --cc=sunhaiyong@loongson.cn \
    --cc=zhaimingbing@cmss.chinamobile.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).