-
upstream/v1.29 · 4ef47f84
DWARF loader:

- Multithreading is now contained in the DWARF loader, using a jobs queue and a pool of worker threads.

BTF encoder:

- Parallel reproducible BTF generation, done using the new DWARF loader multithreading model, is as fast as the old non-reproducible one and is thus now always performed, making the "reproducible_build" flag moot. Memory consumption is greatly reduced as well.

BTF loader:

- Support multiple BTF_DECL_TAGs pointing to the same tag. Example:

  $ pfunct vmlinux -F btf -f bpf_rdonly_cast
  bpf_kfunc bpf_fastcall void *bpf_rdonly_cast(const void *obj__ign, u32 btf_id__k);
  $

Regression tests:

- Verify that pfunct prints btf_decl_tags read from BTF.

pfunct:

- Don't print functions twice when using -f.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
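As an illustration of why a jobs queue plus a pool of worker threads can still give reproducible output, here is a minimal Python sketch. It is not pahole's implementation: `process_units`, `fake_load` and the compile unit names are hypothetical, and the only point shown is that storing each result under the job's original index makes the final order independent of the number of threads.

```python
import queue
import threading


def process_units(units, worker, nr_threads):
    """Run worker(unit) for every unit on a pool of threads.

    Each result is stored at the unit's original index, so the returned
    list has the same order regardless of how many threads were used or
    which thread finished first.
    """
    jobs = queue.Queue()
    for index, unit in enumerate(units):
        jobs.put((index, unit))

    results = [None] * len(units)

    def run():
        while True:
            try:
                index, unit = jobs.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            results[index] = worker(unit)

    threads = [threading.Thread(target=run) for _ in range(nr_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def fake_load(cu_name):
    # Hypothetical stand-in for loading the types of one compile unit.
    return f"types-of-{cu_name}"


if __name__ == "__main__":
    cus = [f"cu{i}.o" for i in range(8)]
    # Same result with one thread or with four.
    assert process_units(cus, fake_load, 1) == process_units(cus, fake_load, 4)
```

A regression check in the same spirit would compare the output produced with a growing number of threads against a single-threaded baseline.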
-
upstream/v1.28 · 1cb4202e
pahole:

- Various improvements to reduce the memory footprint of pahole, notably when doing BTF encoding.

- Show flexible array statistics: pahole detects them at the end of member types, in the middle, etc. This should help with the efforts to spot problematic usage of flexible arrays in the kernel sources. Examples:

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=6ab5318f536927cb

- Introduce the --with_embedded_flexible_array option.

- Add '--padding N' to show only structs with N bytes of padding.

- Add '--padding_ge N' to show only structs with at least N bytes of padding.

- Introduce --running_kernel_vmlinux to find a vmlinux that matches the build-id of the running kernel, e.g.:

  $ pahole --running_kernel_vmlinux
  /usr/lib/debug/lib/modules/6.11.7-200.fc40.x86_64/vmlinux
  $ rpm -qf /usr/lib/debug/lib/modules/6.11.7-200.fc40.x86_64/vmlinux
  kernel-debuginfo-6.11.7-200.fc40.x86_64
  $

  This is a shortcut to find the right vmlinux to use for the running kernel and helps with regression tests.

pfunct:

- Don't stop at the first function that matches a filter, show all of them.

BTF Encoder:

- Allow encoding data about all global variables, not just per CPU ones. There are several reasons why type information for all global variables is useful in the kernel, including drgn without DWARF and __ksym BPF programs return type. This is non-default, experiment with it using 'pahole --btf_features=+global_var'.

- Handle .BTF_ids section endianness, allowing cross builds involving machines with different endianness to work. For instance, encoding BTF info for an s390 vmlinux file on an x86_64 workstation.

- Generate decl tags for bpf_fastcall for eligible kfuncs.

- Add the "distilled_base" BTF feature to split BTF generation.

- Use the ELF_C_READ_MMAP mode with libelf, reducing peak memory utilization.

BTF Loader:

- Allow overriding /sys/kernel/btf/vmlinux with some other file, for testing, via the PAHOLE_VMLINUX_BTF_FILENAME environment variable.

DWARF loader:

- Allow setting the list of languages whose compile units should be skipped via the PAHOLE_LANG_EXCLUDE environment variable.

- Serialize access to elfutils dwarf_getlocation() to avoid elfutils internal data structure corruption when running multithreaded pahole.

- Honour --lang_exclude when merging LTO built CUs.

- Add the debuginfod client cache directory to the vmlinux search path.

- Print the CU's language when a tag isn't supported.

- Initial support for the DW_TAG_GNU_formal_parameter_pack, DW_TAG_GNU_template_parameter_pack, DW_TAG_template_value_param and DW_TAG_template_type_param DWARF tags.

- Improve the parameter parsing by checking DW_OP_[GNU_]entry_value. This makes some more functions eligible for the BTF encoder, for instance perf_event_read() in the 6.11 kernel.

Core:

- Use pahole to help in reorganizing its data structures to reduce its memory footprint.

Regression tests:

- Introduce a tests/ directory for adding regression tests, run it with:

  $ tests/tests

  Or run the individual tests directly.

- Add a regression test for the reproducible build feature that establishes as a baseline a detached BTF file without asking for a reproducible build and then compares the output of 'bpftool btf dump file' for this file with the one from BTF reproducible build encodings done with a growing number of threads.

- Add a regression test for the flexible arrays features, checking if the various comments about flexible arrays match the statistics at the end of the pahole pretty print output.

- Add a test that checks if pahole fails when running on a BTF system and BTF was requested. Previously it was falling back to DWARF silently.

- Add a test validating BTF encoding and the reasons we skip functions: that DWARF functions that made it into BTF have matching signatures, that functions we say we skipped were indeed skipped in BTF encoding, and that it was correct to skip them.

- Add a regression test for 'pahole --prettify' that uses perf to record a simple workload and then pretty prints the resulting perf.data file to check that what is produced are the expected records for such a file.

Link: https://lore.kernel.org/all/Z0jVLcpgyENlGg6E@x1/
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
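To make the .BTF_ids endianness item above more concrete, the following Python sketch decodes packed 32-bit values using the byte order of the target rather than that of the build host. It rests on assumptions: `read_u32_ids` is a hypothetical helper, and treating the section as a flat array of 32-bit IDs is a simplification for illustration, not a description of how pahole parses the section.

```python
import struct


def read_u32_ids(section_bytes, target_is_big_endian):
    """Decode packed 32-bit IDs using the target's byte order.

    The byte order has to come from the file being processed (for example
    a big-endian s390 vmlinux), not from the machine running the tool.
    """
    if len(section_bytes) % 4:
        raise ValueError("section size is not a multiple of 4 bytes")
    order = ">" if target_is_big_endian else "<"
    count = len(section_bytes) // 4
    return list(struct.unpack(f"{order}{count}I", section_bytes))


if __name__ == "__main__":
    # Two IDs, 1 and 258, as they would be stored by a big-endian target.
    raw = bytes([0, 0, 0, 1, 0, 0, 1, 2])
    print(read_u32_ids(raw, target_is_big_endian=True))
    # Decoding the same bytes with the wrong byte order gives unrelated values.
    print(read_u32_ids(raw, target_is_big_endian=False))
```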
-
upstream/v1.27 · 3e265dac
BTF encoder:

- Inject kfunc decl tags into BTF from the BTF IDs ELF section in the Linux kernel vmlinux file.

  This allows tools such as bpftool and pfunct to enumerate the available kfuncs and to get their function signatures, that is, the types of their return values and of their arguments.

  See the example in the BTF loader changes description, below.

- Support parallel reproducible builds, where it doesn't matter how many threads are used, the end BTF encoding result is the same.

- Sanitize unsupported DWARF int types larger than 16 bytes, as BTF doesn't support them.

BTF loader:

- Initial support for BTF_KIND_DECL_TAG:

  $ pfunct --prototypes -F btf vmlinux.btf.decl_tag,decl_tag_kfuncs | grep ^bpf_kfunc | head
  bpf_kfunc void cubictcp_init(struct sock * sk);
  bpf_kfunc void cubictcp_cwnd_event(struct sock * sk, enum tcp_ca_event event);
  bpf_kfunc void cubictcp_cong_avoid(struct sock * sk, u32 ack, u32 acked);
  bpf_kfunc u32 cubictcp_recalc_ssthresh(struct sock * sk);
  bpf_kfunc void cubictcp_state(struct sock * sk, u8 new_state);
  bpf_kfunc void cubictcp_acked(struct sock * sk, const struct ack_sample * sample);
  bpf_kfunc int bpf_iter_css_new(struct bpf_iter_css * it, struct cgroup_subsys_state * start, unsigned int flags);
  bpf_kfunc struct cgroup_subsys_state * bpf_iter_css_next(struct bpf_iter_css * it);
  bpf_kfunc void bpf_iter_css_destroy(struct bpf_iter_css * it);
  bpf_kfunc s64 bpf_map_sum_elem_count(const struct bpf_map * map);
  $ pfunct --prototypes -F btf vmlinux.btf.decl_tag,decl_tag_kfuncs | grep ^bpf_kfunc | wc -l
  116
  $

Pretty printing:

- Fix hole discovery with inheritance in C++.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Daniel Xu <dxu@dxuuu.xyz>
Tested-by: Jiri Olsa <olsajiri@gmail.com>
Link: https://lore.kernel.org/all/ZmIXxgbfIJGWmXer@x1/T/#u
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
upstream/v1.26 · 922085e3
pahole:

- When expanding types using 'pahole -E', do it for union and struct typedefs and for enums too.

  E.g., the 'state' field in 'struct module':

  $ pahole module | head
  struct module {
          enum module_state state;               /*     0     4 */

          /* XXX 4 bytes hole, try to pack */

          struct list_head list;                 /*     8    16 */
          char name[56];                         /*    24    56 */
          /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
          struct module_kobject mkobj;           /*    80    96 */
          /* --- cacheline 2 boundary (128 bytes) was 48 bytes ago --- */
  $

  now gets expanded:

  $ pahole -E module | head
  struct module {
          enum module_state {
                  MODULE_STATE_LIVE = 0,
                  MODULE_STATE_COMING = 1,
                  MODULE_STATE_GOING = 2,
                  MODULE_STATE_UNFORMED = 3,
          } state;                               /*     0     4 */

          /* XXX 4 bytes hole, try to pack */
  $

- Print the number of holes, bit holes and bit paddings in class member types.

  Doing this recursively, to show how much waste a complex data structure has, is something that still needs to be done. These were the low hanging fruits on the path to having that feature.

  For instance, for 'struct task_struct' in the Linux kernel we get this extra info:

  --- task_struct.before.c 2024-02-09 11:38:39.249638750 -0300
  +++ task_struct.after.c 2024-02-09 16:19:34.221134835 -0300
  @@ -29,6 +29,12 @@
           /* --- cacheline 2 boundary (128 bytes) --- */
           struct sched_entity se;               /*   128   256 */
  +
  +        /* XXX last struct has 3 holes */
  +
           /* --- cacheline 6 boundary (384 bytes) --- */
           struct sched_rt_entity rt;            /*   384    48 */
           struct sched_dl_entity dl;            /*   432   224 */
  +
  +        /* XXX last struct has 1 bit hole */
  +
           /* --- cacheline 10 boundary (640 bytes) was 16 bytes ago --- */
           const struct sched_class * sched_class; /*   656     8 */
           struct rb_node core_node;             /*   664    24 */
  @@ -100,6 +103,9 @@
           /* --- cacheline 35 boundary (2240 bytes) was 16 bytes ago --- */
           struct list_head tasks;               /*  2256    16 */
           struct plist_node pushable_tasks;     /*  2272    40 */
  +
  +        /* XXX last struct has 1 hole */
  +
           /* --- cacheline 36 boundary (2304 bytes) was 8 bytes ago --- */
           struct rb_node pushable_dl_tasks;     /*  2312    24 */
           struct mm_struct * mm;                /*  2336     8 */
  @@ -172,6 +178,9 @@
           /* XXX last struct has 4 bytes of padding */
           struct vtime vtime;                   /*  2744    48 */
  +
  +        /* XXX last struct has 1 hole */
  +
           /* --- cacheline 43 boundary (2752 bytes) was 40 bytes ago --- */
           atomic_t tick_dep_mask;               /*  2792     4 */
  @@ -396,9 +405,12 @@
           /* --- cacheline 145 boundary (9280 bytes) --- */
           struct thread_struct thread __attribute__((__aligned__(64))); /*  9280  4416 */
  +        /* XXX last struct has 1 hole, 1 bit hole */
  +
           /* size: 13696, cachelines: 214, members: 262 */
           /* sum members: 13518, holes: 21, sum holes: 162 */
           /* sum bitfield members: 82 bits, bit holes: 2, sum bit holes: 46 bits */
           /* member types with holes: 4, total: 6, bit holes: 2, total: 2 */
           /* paddings: 6, sum paddings: 49 */
           /* forced alignments: 2, forced holes: 2, sum forced holes: 88 */
   };

- Introduce --contains_enumerator=ENUMERATOR_NAME. E.g.:

  $ pahole --contains_enumerator S_VERSION
  enum file_time_flags {
          S_ATIME = 1,
          S_MTIME = 2,
          S_CTIME = 4,
          S_VERSION = 8,
  }
  $

  The shorter form --contains_enum is also accepted.

- Fix pretty printing when using DWARF, where sometimes the class (-C) and a specified "type_enum" may not be present in the same CU, so wait until both are found.

  Now this example, which reads 'struct perf_event_header' and 'enum perf_event_type' from the DWARF info in ~/bin/perf to pretty print records in the perf.data file, works just like when using type info from BTF in ~/bin/perf:

  $ pahole -F dwarf -V ~/bin/perf \
           --header=perf_file_header \
           --seek_bytes '$header.data.offset' \
           --size_bytes='$header.data.size' \
           -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_MMAP2)' \
           --prettify perf.data --count 1
  pahole: sizeof_operator for 'perf_event_header' is 'size'
  pahole: type member for 'perf_event_header' is 'type'
  pahole: type enum for 'perf_event_header' is 'perf_event_type'
  pahole: filter for 'perf_event_header' is 'type==PERF_RECORD_MMAP2'
  pahole: seek bytes evaluated from --seek_bytes=$header.data.offset is 0x3f0
  pahole: size bytes evaluated from --size_bytes=$header.data.size is 0xd10
  // type=perf_event_header, offset=0xc20, sizeof=8, real_sizeof=112
  {
          .header = {
                  .type = PERF_RECORD_MMAP2,
                  .misc = 2,
                  .size = 112,
          },
          .pid = 1533617,
          .tid = 1533617,
          .start = 94667542700032,
          .len = 90112,
          .pgoff = 16384,{
                  .maj = 0,
                  .min = 33,
                  .ino = 35914923,
                  .ino_generation = 26870,
          },{
                  .build_id_size = 0,
                  .__reserved_1 = 0,
                  .__reserved_2 = 0,
                  .build_id = { 33, 0, 0, 0, -85, 4, 36, 2, 0, 0, 0, 0, -10, 104, 0, 0, 0, 0, 0, 0 },
          },
          .prot = 5,
          .flags = 2,
          .filename = "/usr/bin/ls",
  },
  $

DWARF loader:

- Add support for DW_TAG_constant, first seen in Go DWARF.

- Fix loading DW_TAG_subroutine_type generated by the Go compiler, where it may have a DW_AT_byte_size, and pretty print it as if it was from C. This helped in writing BPF programs to attach to Go binaries using uprobes.

BTF loader:

- Fix loading of 32-bit signed enums.

BTF encoder:

- Add 'pahole --btf_features' to allow consumers to specify an opt-in set of features they want to use in BTF encoding.

  Supported features are a comma-separated combination of:

    encode_force     Ignore invalid symbols when encoding BTF.
    var              Encode variables using BTF_KIND_VAR in BTF.
    float            Encode floating-point types in BTF.
    decl_tag         Encode declaration tags using BTF_KIND_DECL_TAG.
    type_tag         Encode type tags using BTF_KIND_TYPE_TAG.
    enum64           Encode enum64 values with BTF_KIND_ENUM64.
    optimized_func   Encode representations of optimized functions
                     with suffixes like ".isra.0" etc.
    consistent_func  Avoid encoding inconsistent static functions.
                     These occur when a parameter is optimized out
                     in some CUs and not others, or when the same
                     function name has inconsistent BTF descriptions
                     in different CUs.

  Specifying "--btf_features=all" is equivalent to setting all of the above. If pahole does not know about a feature specified in --btf_features, it silently ignores it.

  The --btf_features option can either be specified via a single comma-separated list

    --btf_features=enum64,float

  ...or via multiple --btf_features values

    --btf_features=enum64 --btf_features=float

  These properties allow us to use the --btf_features option in the kernel scripts/pahole_flags.sh script to specify the desired set of BTF features.

  If a feature named in --btf_features is not present in the version of pahole used, BTF encoding will not complain. This is desired because it means we no longer have to tie new features to a specific pahole version.

  Use --btf_features_strict to change that behaviour and bail out if one of the requested features isn't present.

  To see the supported features, use:

  $ pahole --supported_btf_features
  encode_force,var,float,decl_tag,type_tag,enum64,optimized_func,consistent_func
  $

btfdiff:

- Parallelize loading BTF and DWARF, speeding it up a bit.

- Do type expansion to cover "private" types and enumerations.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
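The --btf_features semantics described above (comma-separated lists, repeated options, "all", unknown names ignored unless strict) can be modelled with a short Python sketch. This is only an illustration of the described behaviour, not pahole's option parser: `resolve_btf_features` and `UnknownFeature` are hypothetical names, and the feature list is the one printed by --supported_btf_features in the example above.

```python
KNOWN_FEATURES = [
    "encode_force", "var", "float", "decl_tag",
    "type_tag", "enum64", "optimized_func", "consistent_func",
]


class UnknownFeature(ValueError):
    """Raised in strict mode when a requested feature is not known."""


def resolve_btf_features(option_values, strict=False):
    """Combine one or more --btf_features values into a feature list.

    option_values holds the argument of each --btf_features occurrence,
    e.g. ["enum64,float"] or ["enum64", "float"].
    """
    enabled = set()
    for value in option_values:
        for name in value.split(","):
            name = name.strip()
            if not name:
                continue
            if name == "all":
                enabled.update(KNOWN_FEATURES)
            elif name in KNOWN_FEATURES:
                enabled.add(name)
            elif strict:
                # Models --btf_features_strict: bail out on unknown names.
                raise UnknownFeature(name)
            # Otherwise an unknown feature is silently ignored.
    # Report in a fixed order so the result does not depend on input order.
    return [feature for feature in KNOWN_FEATURES if feature in enabled]


if __name__ == "__main__":
    print(resolve_btf_features(["enum64,float"]))
    print(resolve_btf_features(["enum64", "some_future_feature"]))
```

Ignoring unknown names by default is what lets a build script request features newer than the installed tool, while the strict variant suits callers that prefer to fail early.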
-
upstream/v1.24 · de242344
BTF encoder:

- Add support for BTF_KIND_ENUM64 to represent enumeration entries with more than 32 bits.

- Support multithreaded encoding, in addition to DWARF multithreaded loading, speeding up the process. It is selected just like DWARF multithreaded loading, using the 'pahole -j' option.

- Encode the 'char' type as signed.

BTF Loader:

- Add support for BTF_KIND_ENUM64.

pahole:

- Introduce --lang and --lang_exclude to select or filter out DWARF compile units by the language they originated from.

  The use case is to exclude Rust compile units while aspects of the DWARF generated for them get sorted out in a way that the kernel BPF verifier doesn't refuse loading the BTF generated from them.

- Introduce --compile to generate compilable code in a similar fashion to:

    bpftool btf dump file vmlinux format c > vmlinux.h

  As with 'bpftool', this will notice type shadowing, i.e. multiple types with the same name, and will disambiguate by adding a suffix.

- Don't segfault when processing bogus files.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
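To illustrate when an enumeration needs more than 32 bits, here is a small Python sketch of the range check. It is an assumption-laden illustration rather than the encoder's logic: `enum_kind` is a hypothetical helper that only asks whether all enumerator values fit in a signed or an unsigned 32-bit integer.

```python
def enum_kind(values):
    """Pick the BTF kind name that can hold every enumerator value.

    Values fit the 32-bit kind if they all fit a signed 32-bit integer
    or all fit an unsigned 32-bit integer; otherwise 64 bits are needed.
    """
    fits_signed = all(-2**31 <= v <= 2**31 - 1 for v in values)
    fits_unsigned = all(0 <= v <= 2**32 - 1 for v in values)
    if fits_signed or fits_unsigned:
        return "BTF_KIND_ENUM"
    return "BTF_KIND_ENUM64"


if __name__ == "__main__":
    print(enum_kind([1, 2, 4, 8]))   # small flag values
    print(enum_kind([0, 2**32]))     # one value does not fit in 32 bits
```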
-
v1.24 · de242344
BTF encoder:

- Add support for BTF_KIND_ENUM64 to represent enumeration entries with more than 32 bits.

- Support multithreaded encoding, in addition to DWARF multithreaded loading, speeding up the process. It is selected just like DWARF multithreaded loading, using the 'pahole -j' option.

- Encode the 'char' type as signed.

BTF Loader:

- Add support for BTF_KIND_ENUM64.

pahole:

- Introduce --lang and --lang_exclude to select or filter out DWARF compile units by the language they originated from.

  The use case is to exclude Rust compile units while aspects of the DWARF generated for them get sorted out in a way that the kernel BPF verifier doesn't refuse loading the BTF generated from them.

- Introduce --compile to generate compilable code in a similar fashion to:

    bpftool btf dump file vmlinux format c > vmlinux.h

  As with 'bpftool', this will notice type shadowing, i.e. multiple types with the same name, and will disambiguate by adding a suffix.

- Don't segfault when processing bogus files.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>