Flamefire

Member Since 11 years ago

Dresden

23 followers
1 following
13 stars
82 repos

1800 contributions in the last year

Pinned
⚡ Vocaluxe is an open source singing game inspired by SingStar™ and Ultrastar Deluxe.
⚡ Eclipse plugin for importing parameter names from smali source files
⚡ Return To The Roots (Settlers II(R) Clone)
⚡ Boost.Nowide - standard library functions with UTF-8 API on windows
⚡ LineageOS Kernel Tree for Sony Xperia XZ Premium, XZ1 and XZ1 Compact
⚡ Renaming of Identifiers in Visual Studio C# Projects
Activity
Oct 14 (1 day ago)
open pull request

Flamefire wants to merge easybuilders/easybuild-framework

Flamefire

Deprecate use of ec['parallel'] and fix updating the template value

ec['parallel'] currently doubles as an EC option and as the storage for the calculated parallelism set by the EasyBlock. This makes it hard to reason about, especially as maxparallel has pretty much the same effect. Also, changes to ec['parallel'] made by e.g. easyblocks (or the set_parallel method) are not reflected in the template %(parallel)s.

Solution: introduce a property that updates the template on write, plus some magic to mirror the effect of the now-deprecated ec['parallel'].

Additional change I'd like to make: treat parallel = False (legacy) as maxparallel = 1 so that cfg.parallel is always a number.

See https://github.com/easybuilders/easybuild-framework/pull/3811#issuecomment-911819520 for the motivation
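
To illustrate the idea of a property that refreshes the template on write, here is a minimal sketch. It is not the EasyBuild implementation: the class EasyConfigSketch, its template_values dictionary and the resolve helper are hypothetical names made up for this example.

```python
class EasyConfigSketch:
    """Hypothetical stand-in for an easyconfig; NOT the EasyBuild API."""

    def __init__(self, parallel=1):
        self.template_values = {}  # assumed storage for template values
        self._parallel = None
        self.parallel = parallel  # goes through the setter below

    @property
    def parallel(self):
        return self._parallel

    @parallel.setter
    def parallel(self, value):
        self._parallel = int(value)
        # Keep the %(parallel)s template in sync on every write.
        self.template_values['parallel'] = str(self._parallel)

    def resolve(self, text):
        """Resolve %(name)s templates using the stored values."""
        return text % self.template_values


cfg = EasyConfigSketch(parallel=4)
before = cfg.resolve('make -j %(parallel)s')
cfg.parallel = 8  # e.g. an easyblock lowering or raising the parallelism
after = cfg.resolve('make -j %(parallel)s')
```

Because every write goes through the setter, the template cannot drift from the stored value, which is the inconsistency described above.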

Flamefire

Can we avoid parking this in the self._config dictionary, and use self._parallel_legacy instead?

Not really, as it needs to be in the EC instance, not the EB instance. I played around with it a bit and this was the best solution to keep the change as transparent and backwards compatible as possible. But yes, we can rename it, although I chose paralleLegacy because it is already obscure enough not to be used.

pull request

Flamefire merge to easybuilders/easybuild-framework

Deprecate use of ec['parallel'] and fix updating the template value

issue

Flamefire issue comment easybuilders/easybuild-framework

Flamefire

Filter out duplicate paths added to module files

This basically fixes faulty EC files which result in modules like:

prepend_path("EBPYTHONPREFIXES", root)
prepend_path("QT_PLUGIN_PATH", pathJoin(root, "plugins"))
prepend_path("EBPYTHONPREFIXES", root)

The idea is to wrap the creation of a module in a context manager that stores all added paths and rejects duplicates, then clears the stored paths at the end.

Testcase: PyQt5-5.9.2-foss-2018a-Python-3.6.4.eb
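
A minimal sketch of that approach, assuming a simplified generator: ModuleGeneratorSketch, module_creation and the single-path prepend_path below are hypothetical names for illustration and do not reproduce the actual framework code.

```python
from contextlib import contextmanager


class ModuleGeneratorSketch:
    """Hypothetical, simplified module generator; NOT the EasyBuild class."""

    def __init__(self):
        # None means we are not inside a module-creation context.
        self._added_paths = None

    @contextmanager
    def module_creation(self):
        """Track added paths while one module file is being created."""
        self._added_paths = set()
        try:
            yield
        finally:
            # Clear the stored paths at the end, even on errors.
            self._added_paths = None

    def prepend_path(self, key, path):
        """Return a prepend_path statement, or '' for a duplicate."""
        entry = (key, path)
        if self._added_paths is not None:
            if entry in self._added_paths:
                return ''  # duplicate within this module: filter it out
            self._added_paths.add(entry)
        return 'prepend_path("%s", "%s")\n' % (key, path)


gen = ModuleGeneratorSketch()
with gen.module_creation():
    module_text = ''.join([
        gen.prepend_path('EBPYTHONPREFIXES', 'root'),
        gen.prepend_path('QT_PLUGIN_PATH', 'root/plugins'),
        gen.prepend_path('EBPYTHONPREFIXES', 'root'),  # filtered
    ])
```

Scoping the bookkeeping to the context manager means paths are only compared within a single module file, so the same path can still be added to a different module later.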

Flamefire

@akesandgren Basically, the test at https://github.com/easybuilders/easybuild-framework/pull/3770/files#diff-afa83d19ec1ee9ca4fd954c188c8e846c691fbcf4845d14e02d168a5f0cf359dL745 does test that. Maybe we can rework it to make the (test) intent a bit clearer, but it certainly checks that the common case of adding a path multiple times via modgen.prepend_paths is filtered.

Sep 30 (2 weeks ago)
issue

Flamefire issue comment easybuilders/easybuild-easyconfigs

Flamefire

{devel,lib,tools}[fosscuda/2019b] Bazel v3.7.2, Horovod v0.22.1, TensorFlow v2.5.0 w/ Python 3.7.4

(created using eb --new-pr)

Flamefire

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusml25 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), Python 2.7.5
See https://gist.github.com/e8e3f3527ba9024acce80d0a02675715 for a full test report.

issue

Flamefire issue comment boostorg/locale

Flamefire

Failing tests

With ./b2 toolset=gcc-10 libs/locale/test cxxstd=11 -j4 variant=debug I see many failing tests with recent ICU.

One of those is "USD 1,345.00" instead of "USD1,345.00"; others look more confusing. See https://github.com/Flamefire/locale/runs/3748013166?check_suite_focus=true or https://github.com/Flamefire/locale/actions in general.

@artyom-beilis As you wrote the tests: I'm not sure whether the tests or the results are wrong.

Flamefire

On my machine:

/** The current ICU library version as a dotted-decimal string. The patchlevel
 *  only appears in this string if it non-zero.
 *  This value will change in the subsequent releases of ICU
 *  @stable ICU 2.4
 */
#define U_ICU_VERSION "66.1"
issue

Flamefire issue boostorg/locale

Failing tests

issue

Flamefire issue comment boostorg/locale

Flamefire

Remove linking with Boost.System

Since Boost.System is header-only now, no need to link with the library.

Also, trim trailing spaces in the jamfile.

Flamefire

In any case, we will see if something fails in the tests :-)

Preparing CI tests ATM. But as we don't use Boost.System ourselves and it really is header-only now, linking to it should no longer be required.

push

Flamefire push boostorg/locale

Flamefire

Remove linking with Boost.System.

Since Boost.System is header-only now, no need to link with the library.

Flamefire

Merge pull request #38 from Lastique/remove_linking_system

Remove linking with Boost.System

commit sha: 33d06c0c4be7f9bcfb05e200c72f752e9777261c

pushed 2 weeks ago
pull request

Flamefire pull request boostorg/locale

Remove linking with Boost.System

push

Flamefire push Flamefire/locale

Flamefire

Fix compilation without auto-ptr

commit sha: d4065fa46768ccac93553a4a350f249dc89c4d62

pushed 2 weeks ago
delete

Flamefire in Flamefire/easybuild-easyconfigs delete branch 20210924165547_new_pr_Bazel372

deleted 2 weeks ago
Sep 29 (2 weeks ago)
issue

Flamefire issue comment tensorflow/tensorflow

Flamefire

mkl_fused_batch_norm_op_test failing

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): RHEL 7
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.5.0 / 2.6.0
  • Python version: 3.8
  • Bazel version (if compiling from source): 3.7.2
  • GCC/Compiler version (if compiling from source): 10.3

Describe the current behavior

Following https://groups.google.com/a/tensorflow.org/g/build/c/RZhgZst-fgQ, we build without --config=mkl. When running the tests for that build, the test //tensorflow/core/kernels/mkl:mkl_fused_batch_norm_op_test fails:

[==========] 5 tests from 1 test suite ran. (197 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 5 tests, listed below:
[  FAILED  ] Test/FusedBatchNormOpTest/0.Training, where TypeParam = float
[  FAILED  ] Test/FusedBatchNormOpTest/0.TrainingRunningMean, where TypeParam = float
[  FAILED  ] Test/FusedBatchNormOpTest/0.Inference, where TypeParam = float
[  FAILED  ] Test/FusedBatchNormOpTest/0.InferenceIgnoreAvgFactor, where TypeParam = float
[  FAILED  ] Test/FusedBatchNormOpTest/0.FusedBatchNormGradV3, where TypeParam = float

The last one (FusedBatchNormGradV3) seemingly succeeds on TF 2.6, while the other 4 fail on both 2.5 and 2.6. The tests succeed when using --config=mkl and on other systems. It seems the AMD Epyc CPUs are affected; another node with Intel CPUs works fine. So the failure might be related to the CPU, although the Intel CPUs are a bit older (broadwell) and we use -march=native.

Other info / logs

Test log: test.log

Command used: CC_OPT_FLAGS="-O3 -march=native -fno-math-errno -fPIC" bazel test --config=noaws --config=nogcp --config=nohdfs --compilation_mode=opt --config=opt --subcommands --verbose_failures --jobs=64 --copt="-fPIC" --action_env=PYTHONPATH --action_env=EBPYTHONPREFIXES --action_env=PYTHONNOUSERSITE=1 --distinct_host_configuration=false --test_output=errors --build_tests_only --local_test_jobs=64 --test_env=CUDA_VISIBLE_DEVICES='-1' --test_timeout=3600 -- //tensorflow/core/kernels/mkl:mkl_fused_batch_norm_op_test

Flamefire

Some more test results for //tensorflow/core/kernels/mkl:mkl_fused_batch_norm_op_test: I checked Test/FusedBatchNormOpTest/0.Training and Test/FusedBatchNormOpTest/0.TrainingRunningMean and printed the first element of the output and mkl_output tensors for various configs. The input seems to be random but is the same across configs (I checked the first 2 elements of the input tensor).

--config=mkl is equal to `--define=build_with_mkl=true --define=enable_mkl=true --define=tensorflow_mkldnn_contraction_kernel=0 --define=build_with_openmp=true`

So I did a full scan over all possible combinations. Note that all values in a results column should be the same. For the output tensor they indeed are (the value is noted in the header); only the mkl_output values vary wildly.

| build_with_mkl | enable_mkl | tensorflow_mkldnn_contraction_kernel | build_with_openmp | TF_ENABLE_ONEDNN_OPTS | Training (output: 0.647915) | TrainingRunningMean (output: -0.821899) |
|---|---|---|---|---|---|---|
| true | true | 0 | true | 0 | 0.647915 | -0.821899 |
| true | true | 0 | true | 1 | 0.647915 | -0.821899 |
| true | true | 0 | false | 0 | 0 | 0.0025533 |
| true | true | 0 | false | 1 | 1.55481 | 0 |
| true | true | 1 | true | 0 | 0.647915 | -0.821899 |
| true | true | 1 | true | 1 | 0.647915 | -0.821899 |
| true | true | 1 | false | 0 | -9.09574e+23 | 0 |
| true | true | 1 | false | 1 | -0.488759 | 0 |
| true | false | 0 | true | 0 | | |
| true | false | 0 | true | 1 | | |
| true | false | 0 | false | 0 | 1.55481 | -1.94947e+25 |
| true | false | 0 | false | 1 | -1.43031e-17 | 1.53355e-35 |
| true | false | 1 | true | 0 | | |
| true | false | 1 | true | 1 | | |
| true | false | 1 | false | 0 | 0 | -4.68903e+24 |
| true | false | 1 | false | 1 | 0 | 7.27895e+24 |
| false | true | 0 | true | 0 | 0 | -6.97298e+14 |
| false | true | 0 | true | 1 | 1.09406 | 0 |
| false | true | 0 | false | 0 | 0 | 0.872275 |
| false | true | 0 | false | 1 | 0 | -1.87957e+35 |
| false | true | 1 | true | 0 | -5.79435e-30 | 0 |
| false | true | 1 | true | 1 | -5.24765e+22 | -5.24765e+22 |
| false | true | 1 | false | 0 | 1437.87 | 4.23253e-35 |
| false | true | 1 | false | 1 | -1.32684e+24 | -1.32684e+24 |
| false | false | 0 | true | 0 | 0 | 1.44743e+09 |
| false | false | 0 | true | 1 | 1.09406 | 2.43139e-35 |
| false | false | 0 | false | 0 | 0 | -8.40301e-23 |
| false | false | 0 | false | 1 | 1.09406 | -0.706766 |
| false | false | 1 | true | 0 | 6578.88 | -3.87551e+15 |
| false | false | 1 | true | 1 | -195440 | -195440 |
| false | false | 1 | false | 0 | -6.85091e-32 | 3.93441e-35 |
| false | false | 1 | false | 1 | 5.69273e+15 | 0 |
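
For reference, the 32 configurations of such a scan can be enumerated with a short script. This is only a sketch of how the flag combinations could be generated, not the script that produced the results above. In particular, passing TF_ENABLE_ONEDNN_OPTS via --test_env is an assumption, and scan_configs is a hypothetical helper name.

```python
from itertools import product

# The four --define options that --config=mkl expands to, with both settings.
DEFINES = {
    "build_with_mkl": ["true", "false"],
    "enable_mkl": ["true", "false"],
    "tensorflow_mkldnn_contraction_kernel": ["0", "1"],
    "build_with_openmp": ["true", "false"],
}
ONEDNN_OPTS = ["0", "1"]  # values for TF_ENABLE_ONEDNN_OPTS


def scan_configs():
    """Yield one list of bazel flags per combination of all five options."""
    names = list(DEFINES)
    for combo in product(*(DEFINES[name] for name in names), ONEDNN_OPTS):
        *define_values, onednn = combo
        flags = ["--define=%s=%s" % (name, value)
                 for name, value in zip(names, define_values)]
        # Assumption: the environment variable is forwarded to the test.
        flags.append("--test_env=TF_ENABLE_ONEDNN_OPTS=%s" % onednn)
        yield flags


configs = list(scan_configs())
for flags in configs[:2]:
    print("bazel test " + " ".join(flags) + " <target>")
```

The combinations come out in the same order as the rows listed above, with the last option varying fastest.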
issue

Flamefire issue comment easybuilders/easybuild-easyconfigs

Flamefire

{devel,lib,tools}[fosscuda/2019b] Bazel v3.7.2, Horovod v0.22.1, TensorFlow v2.5.0 w/ Python 3.7.4

(created using eb --new-pr)

Flamefire

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusa4 - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @1.70GHz (broadwell), Python 2.7.5
See https://gist.github.com/199b00a7bf8ab677ad8abbac0ddd4e4a for a full test report.

pull request

Flamefire pull request boostorg/locale

Flamefire

Add CI via Github Actions

created branch

Flamefire in Flamefire/locale create branch feature/ci

created 2 weeks ago
delete

Flamefire in Flamefire/locale delete branch stateless_encoding

deleted 2 weeks ago
delete

Flamefire in Flamefire/locale delete branch svn-branches/maintenance/1_55_0

deleted 2 weeks ago
delete

Flamefire in Flamefire/locale delete branch svn-branches/modular-build

deleted 2 weeks ago
delete

Flamefire in Flamefire/locale delete branch svn-branches/maintenance/1_54_0

deleted 2 weeks ago