**** BEGIN LOGGING AT Tue Sep 28 02:59:56 2021
Sep 28 07:50:45 good morning
Sep 28 08:18:39 hi, maybe this is something known already, although I couldn't find anything in the MLs or elsewhere
Sep 28 08:18:42 so
Sep 28 08:19:21 I built oe-core/master (recent-ish, like from a day ago) for cortexa7thf-neon-vfpv4 (stm32mp1) core-image-base
Sep 28 08:19:45 then I ran some ad-hoc python3 loop (i.e. not-really-good-benchmark provided by someone else)
Sep 28 08:20:57 that loop took around 35% longer on that python 3.9.x interpreter in OE-core than it did in debian armhf testing/unstable hybrid rootfs with python 3.9.6 from debian package running the exact same kernel version on the exact same hardware
Sep 28 08:21:14 I basically took the two rootfs images, chrooted into each, ran the test
Sep 28 08:21:45 now, it is not new, I originally found it in oe-core dunfell with python 3.8.11, same problem there, it is slower than the debian one
Sep 28 08:22:25 lmbench indicates no funny outliers wrt the hardware or image, in fact, the lmbench run is generally faster on the OE built image because of better CPU tailored optimizations
Sep 28 08:22:39 so this is likely isolated to python3 interpreter
Sep 28 08:23:12 I also had a look at what debian configures there, they enable pthread support (OE does not for the cross compiled target) and computed gotos, but neither makes a difference
Sep 28 08:24:48 am I missing anything obvious which would make such a huge performance difference ?
Sep 28 08:24:52 RP: ^
Sep 28 09:17:13 well uh ... on aarch64 (ca53,zynqmp) it also runs slower compared to debian aarch64 build
Sep 28 09:30:48 marex: try enabling profile guided optimisation (pgo)
Sep 28 09:31:04 marex: you'll need to turn off reproducibility for python for that :/
Sep 28 09:31:30 RP: pgo is already enabled for class-target
Sep 28 09:31:44 RP: I also tried LTO
Sep 28 09:31:53 marex: hmm, don't know then
Sep 28 09:31:57 pthread is missing when cross compiling, I force-enabled that
Sep 28 09:32:07 I also added --with-computed-gotos
Sep 28 09:32:12 would be good to track down the difference
Sep 28 09:32:16 that should make the python interpreter configured close to what debian has there
Sep 28 09:32:18 still nothing
Sep 28 09:32:24 RP: yes
Sep 28 09:32:57 RP: the odd thing is, it happens both on arm64 and arm32, which means it is some arch-independent config issue
Sep 28 09:33:26 and I suspect if my base system(s) had problems, they would show up in lmbench, since the python interpreter should be mostly independent of the base system libs
Sep 28 09:43:32 marex: did you disable reproducibility?
Sep 28 09:44:28 marex: you've looked at the possibly_include_pgo() function?
Sep 28 09:48:47 if bb.utils.contains('MACHINE_FEATURES', 'qemu-usermode', True, False, d) and d.getVar('BUILD_REPRODUCIBLE_BINARIES') != '1':
Sep 28 09:48:50 this ?
Sep 28 09:49:18 I do NOT have BUILD_REPRODUCIBLE_BINARIES set
Sep 28 09:49:24 (so I get the PGO)
Sep 28 09:49:43 I also verified the PGO configure flag is set, that --enable-optimizations one
Sep 28 11:08:20 marex: ok, good. reproducible builds are the default now
Sep 28 11:08:28 marex: just wanted to double check
Sep 28 11:55:09 RP: btw the same problem happens in dunfell LTS, there the reproducible builds are surely disabled
Sep 28 11:55:23 so this isn't a new problem
Sep 28 11:59:01 marex: we enabled them there I think too
Sep 28 13:11:35 RP: lemme double-check
Sep 28 13:27:20 bitbake -e python3 | grep ^PACKAGECONFIG
Sep 28 13:27:21 PACKAGECONFIG="readline pgo gdbm"
Sep 28 13:27:22 PACKAGECONFIG_class-target="readline pgo gdbm"
Sep 28 13:27:28 RP: looks good to me ^ no ?
Sep 28 13:43:58 marex: as long as reproducibility isn't enabled
Sep 28 15:22:47 RP: it is not
Sep 28 15:24:03 marex: hmm, ok. I think I'm getting confused as we did start testing it for dunfell
**** ENDING LOGGING AT Wed Sep 29 02:59:56 2021
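Editor's note: the comparison described in the log (same loop, same kernel and hardware, two chrooted rootfs images) could be made easier to diff with a small script run once inside each chroot. The sketch below is only an illustration under stated assumptions: `busy_loop` is a hypothetical stand-in for the ad-hoc benchmark, which the log does not show, and the list of build variables in `KEYS` is a guess at what might be worth comparing, not something taken from the discussion. It uses only the standard `sysconfig` and `timeit` modules.

```python
import json
import sys
import sysconfig
import timeit

# Build-time settings that may be worth diffing between the two images.
# This selection is an assumption, not a list taken from the log.
KEYS = ("CONFIG_ARGS", "CC", "CFLAGS", "OPT", "LDFLAGS", "Py_ENABLE_SHARED")


def build_config():
    """Return the interpreter version plus selected build variables."""
    info = {"version": sys.version.split()[0]}
    for key in KEYS:
        # get_config_var returns None when the variable was not recorded.
        info[key] = sysconfig.get_config_var(key)
    return info


def busy_loop(n=200_000):
    """Hypothetical stand-in for the ad-hoc benchmark loop."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def best_time(repeat=5):
    """Best wall-clock time of several single runs, in seconds."""
    return min(timeit.repeat(busy_loop, number=1, repeat=repeat))


if __name__ == "__main__":
    report = build_config()
    report["best_seconds"] = best_time()
    # Sorted JSON so the two reports can be compared with a plain diff.
    print(json.dumps(report, indent=2, sort_keys=True))
```

Saving the output from each chroot to a file and diffing the two files would show at a glance whether the interpreters were configured and compiled differently, alongside the timing of the stand-in loop.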