Improve channel_unordered performance: Take items from input channel outside worker threads #123

Open: tkf wants to merge 1 commit into master

tkf (Member) commented Jan 5, 2020

No description provided.
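
Since the PR has no description, the only statement of intent is the title: take items from the input channel outside the worker threads. The following is a minimal sketch of that general pattern, not the Transducers.jl implementation. It is written in Python rather than Julia, and every name in it (`process_unordered`, `_DONE`, and the use of `basesize` as a batch size) is a hypothetical stand-in chosen for illustration.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel telling a worker to stop


def process_unordered(input_items, fn, basesize=32, nworkers=2):
    """Apply fn to every item; results are returned in no particular order."""
    batches = queue.Queue()
    results = queue.Queue()

    def worker():
        # Workers never read the input source; they only receive whole batches.
        while True:
            batch = batches.get()
            if batch is _DONE:
                break
            results.put([fn(x) for x in batch])

    threads = [threading.Thread(target=worker) for _ in range(nworkers)]
    for t in threads:
        t.start()

    # The calling thread is the single consumer of the input.
    batch = []
    for item in input_items:
        batch.append(item)
        if len(batch) == basesize:
            batches.put(batch)
            batch = []
    if batch:
        batches.put(batch)
    for _ in threads:
        batches.put(_DONE)
    for t in threads:
        t.join()

    # All producers have finished, so draining until empty is safe here.
    out = []
    while not results.empty():
        out.extend(results.get())
    return out
```

With a single reader, workers do not contend on the input source, and the per-item synchronization cost is paid once per batch instead of once per item. Whether that is the effect this PR actually achieves is for the benchmarks to show.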

codecov-io commented Jan 5, 2020

Codecov Report

Merging #123 into master will decrease coverage by 5.69%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master     #123     +/-   ##
=========================================
- Coverage   93.67%   87.97%   -5.7%     
=========================================
  Files          19       19             
  Lines        1264     1206     -58     
=========================================
- Hits         1184     1061    -123     
- Misses         80      145     +65

Impacted Files              Coverage Δ
src/unordered.jl            84.78% <100%> (-11.38%) ⬇️
src/interop/dataframes.jl   0% <0%> (-100%) ⬇️
src/basics.jl               43.75% <0%> (-37.5%) ⬇️
src/simd.jl                 78.12% <0%> (-18.85%) ⬇️
src/comprehensions.jl       64.28% <0%> (-18.07%) ⬇️
src/core.jl                 75.48% <0%> (-16.5%) ⬇️
src/interop/blockarrays.jl  87.5% <0%> (-12.5%) ⬇️
src/lister.jl               80.48% <0%> (-7.32%) ⬇️
src/progress.jl             87.77% <0%> (-5.71%) ⬇️
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 30cdf2a...2fbc18a.
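
The coverage figures in the diff table follow from the Hits and Lines rows. A small sketch of that arithmetic is below. Truncating to two decimals happens to reproduce the percentages shown; that is an observation about these numbers, not a confirmed description of how Codecov formats them.

```python
import math


def coverage_percent(hits, lines):
    """Coverage as a percentage, truncated (not rounded) to two decimals."""
    return math.floor(hits / lines * 10000) / 100


# Hits and Lines as reported in the coverage diff.
master = coverage_percent(1184, 1264)  # master
pr = coverage_percent(1061, 1206)      # this PR

# Misses are simply lines that were not hit.
master_misses = 1264 - 1184
pr_misses = 1206 - 1061

# Drop computed from the unrounded ratios, in percentage points.
exact_drop = (1184 / 1264 - 1061 / 1206) * 100
```

Here `round(exact_drop, 2)` gives 5.69, matching the summary sentence, while the table's -5.7 is the difference of the two displayed percentages.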

tkf force-pushed the unordered-performance branch 2 times, most recently from 127f945 to 5c399c2, January 5, 2020 05:52

tkf (Member, Author) commented Jan 5, 2020

No improvement 5c399c2:

                                           ID time ratio memory ratio
  ––––––––––––––––––––––––––––––––––––––––––– –––––––––– ––––––––––––
     ["unordered", "unordered", "basesize=1"]  0.98 (5%)  1.02 (1%) ❌
  ["unordered", "unordered", "basesize=1024"]  0.99 (5%)  0.91 (1%) ✅
    ["unordered", "unordered", "basesize=32"]  1.02 (5%)  1.04 (1%) ❌

https://travis-ci.com/tkf/Transducers.jl/jobs/272454798#L388

github-actions (Contributor)

Multi-thread benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 19 Jan 2020 - 03:44
    • Baseline: 19 Jan 2020 - 03:46
  • Package commits:
    • Target: 4594da
    • Baseline: 30cdf2
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: JULIA_NUM_THREADS => 2
    • Baseline: JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID                                                time ratio   memory ratio
["parallel_histogram", "comm", "basesize=16384"]  0.83 (5%) ✅  0.96 (1%) ✅
["parallel_histogram", "comm", "basesize=4096"]   0.48 (5%) ✅  1.03 (1%) ❌
["parallel_histogram", "comm", "basesize=8192"]   0.69 (5%) ✅  1.19 (1%) ❌
["unordered", "unordered", "basesize=1"]          1.07 (5%) ❌  1.02 (1%) ❌
["unordered", "unordered", "basesize=1024"]       0.82 (5%) ✅  0.85 (1%) ✅
["words", "nthreads=1"]                           0.88 (5%) ✅  0.99 (1%)
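
The marks in the judge table can be reproduced from the target and baseline values in the two result tables of this report. The sketch below assumes the rule is "ratio outside 1 ± tolerance", which is consistent with every row shown here; the function name `judge` and its return shape are illustrative, not the benchmarking package's API.

```python
def judge(target, baseline, tolerance):
    """Classify target/baseline as regression, improvement, or invariant."""
    ratio = target / baseline
    if ratio - tolerance > 1.0:
        return ratio, "regression"
    if ratio + tolerance < 1.0:
        return ratio, "improvement"
    return ratio, "invariant"


# Times in ms, taken from the target and baseline result tables (5% tolerance).
unordered_1 = judge(573.563, 537.271, 0.05)   # ["unordered", "unordered", "basesize=1"]
comm_4096 = judge(11.465, 23.771, 0.05)       # ["parallel_histogram", "comm", "basesize=4096"]
unordered_32 = judge(274.863, 273.819, 0.05)  # ["unordered", "unordered", "basesize=32"]

# Memory in MiB for ["words", "nthreads=1"] (1% tolerance).
words_mem = judge(64.44, 64.87, 0.01)
```

Under this rule `unordered_32` and `words_mem` come out invariant, which agrees with the report: the `basesize=32` row is absent from the judge table, and the 0.99 memory ratio for `words` carries no mark.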

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["unordered"]
  • ["unordered", "unordered"]
  • ["words"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      18203 s          0 s       1226 s      16570 s          0 s
       #2  2294 MHz      19258 s          0 s       1294 s      16079 s          0 s
       
  Memory: 6.782737731933594 GB (3684.1953125 MB free)
  Uptime: 377.0 sec
  Load Avg:  1.67236328125  1.0966796875  0.517578125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      24865 s          0 s       1578 s      21558 s          0 s
       #2  2294 MHz      29238 s          0 s       1578 s      17902 s          0 s
       
  Memory: 6.782737731933594 GB (3603.3671875 MB free)
  Uptime: 499.0 sec
  Load Avg:  1.59912109375  1.2265625  0.64013671875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 3:44
  • Package commit: 4594da
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID                                                 time             GC time   memory           allocations
["parallel_histogram", "assoc", "basesize=16384"]  5.322 ms (5%)              732.25 KiB (1%)  110
["parallel_histogram", "assoc", "basesize=4096"]   6.347 ms (5%)              1.80 MiB (1%)    540
["parallel_histogram", "assoc", "basesize=8192"]   5.687 ms (5%)              1.43 MiB (1%)    261
["parallel_histogram", "comm", "basesize=16384"]   11.523 ms (5%)             1.17 MiB (1%)    183
["parallel_histogram", "comm", "basesize=4096"]    11.465 ms (5%)             1.10 MiB (1%)    259
["parallel_histogram", "comm", "basesize=8192"]    11.690 ms (5%)             1.47 MiB (1%)    208
["parallel_histogram", "seq"]                      9.623 ms (5%)              364.63 KiB (1%)  25
["unordered", "collect"]                           459.454 ms (5%)            513.00 KiB (1%)  23
["unordered", "unordered", "basesize=1"]           573.563 ms (5%)            30.75 MiB (1%)   507257
["unordered", "unordered", "basesize=1024"]        300.023 ms (5%)            851.44 KiB (1%)  5546
["unordered", "unordered", "basesize=32"]          274.863 ms (5%)            1.56 MiB (1%)    22416
["words", "nthreads=1"]                            39.965 ms (5%)   6.947 ms  64.44 MiB (1%)   2085156
["words", "nthreads=2"]                            24.115 ms (5%)             65.16 MiB (1%)   2085319
["words", "nthreads=4"]                            24.295 ms (5%)             65.80 MiB (1%)   2085625
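
Two points from the explanation above the table can be made concrete: an ID is a path into a nested suite, and the percentage next to a value is a band around it. The sketch below uses a plain nested dictionary as a stand-in for the benchmark suite; `lookup` and `tolerance_band` are hypothetical helpers, and only the times (in ms) from the target table are real.

```python
# Stand-in for part of the suite, holding the reported target times in ms.
suite = {
    "unordered": {
        "collect": 459.454,
        "unordered": {
            "basesize=1": 573.563,
            "basesize=1024": 300.023,
            "basesize=32": 274.863,
        },
    },
}


def lookup(suite, id_path):
    """Follow an ID such as [parent_group, child_group, ..., key]."""
    node = suite
    for key in id_path:
        node = node[key]
    return node


def tolerance_band(value, tolerance):
    """Interval in which the 'true' value is expected to fall."""
    return value * (1 - tolerance), value * (1 + tolerance)


time_ms = lookup(suite, ["unordered", "unordered", "basesize=1024"])
low, high = tolerance_band(time_ms, 0.05)
```

For the `basesize=1024` entry this gives a band of roughly 285.02 ms to 315.02 ms around the reported 300.023 ms.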

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 3:46
  • Package commit: 30cdf2
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

ID                                                 time             GC time   memory           allocations
["parallel_histogram", "assoc", "basesize=16384"]  5.284 ms (5%)              732.25 KiB (1%)  110
["parallel_histogram", "assoc", "basesize=4096"]   6.344 ms (5%)              1.80 MiB (1%)    539
["parallel_histogram", "assoc", "basesize=8192"]   5.906 ms (5%)              1.43 MiB (1%)    261
["parallel_histogram", "comm", "basesize=16384"]   13.917 ms (5%)             1.22 MiB (1%)    331
["parallel_histogram", "comm", "basesize=4096"]    23.771 ms (5%)             1.07 MiB (1%)    5131
["parallel_histogram", "comm", "basesize=8192"]    16.847 ms (5%)             1.23 MiB (1%)    887
["parallel_histogram", "seq"]                      9.247 ms (5%)              364.63 KiB (1%)  25
["unordered", "collect"]                           462.960 ms (5%)            513.00 KiB (1%)  23
["unordered", "unordered", "basesize=1"]           537.271 ms (5%)            30.26 MiB (1%)   475643
["unordered", "unordered", "basesize=1024"]        366.726 ms (5%)            998.72 KiB (1%)  17033
["unordered", "unordered", "basesize=32"]          273.819 ms (5%)            1.57 MiB (1%)    23090
["words", "nthreads=1"]                            45.193 ms (5%)   7.502 ms  64.87 MiB (1%)   2099520
["words", "nthreads=2"]                            23.404 ms (5%)             65.59 MiB (1%)   2099681
["words", "nthreads=4"]                            24.044 ms (5%)             66.23 MiB (1%)   2099990

github-actions (Contributor)

Benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 19 Jan 2020 - 03:47
    • Baseline: 19 Jan 2020 - 03:51
  • Package commits:
    • Target: 4594da
    • Baseline: 30cdf2
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID                                    time ratio   memory ratio
["gemm", "fusedmul", "blas", "16"]    1.08 (5%) ❌  1.00 (1%)
["gemm", "fusedmul", "blas", "2"]     1.09 (5%) ❌  1.00 (1%)
["gemm", "mul", "man", "false", "8"]  1.37 (5%) ❌  1.00 (1%)
["gemm", "mul", "man", "ivdep", "8"]  1.31 (5%) ❌  1.00 (1%)
["gemm", "mul", "xf", "false", "32"]  0.92 (5%) ✅  1.00 (1%)
["gemm", "mul", "xf", "false", "8"]   1.06 (5%) ❌  1.00 (1%)
["gemm", "mul", "xf", "ivdep", "8"]   1.33 (5%) ❌  1.00 (1%)
["missing_dot", "equiv"]              0.90 (5%) ✅  1.00 (1%)
["missing_dot", "rf_nota"]            0.90 (5%) ✅  1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      19101 s          0 s       1986 s      36371 s          0 s
       #2  2294 MHz      32848 s          0 s       1352 s      24039 s          0 s
       
  Memory: 6.782737731933594 GB (3492.5625 MB free)
  Uptime: 594.0 sec
  Load Avg:  1.197265625  1.04052734375  0.58251953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      38678 s          0 s       2160 s      37267 s          0 s
       #2  2294 MHz      35937 s          0 s       1980 s      41016 s          0 s
       
  Memory: 6.782737731933594 GB (3562.1953125 MB free)
  Uptime: 801.0 sec
  Load Avg:  1.15673828125  1.11376953125  0.71630859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 3:47
  • Package commit: 4594da
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: None

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID                                      time             GC time   memory           allocations
["cat", "base"]                         205.600 μs (5%)
["cat", "xf"]                           1.460 μs (5%)
["collect", "filter-missing"]           81.600 μs (5%)             33.05 KiB (1%)   20
["collect", "identity-float"]           63.200 μs (5%)             256.91 KiB (1%)  20
["collect", "identity-union"]           293.500 μs (5%)            285.28 KiB (1%)  6673
["dot", "blas"]                         2.278 μs (5%)
["dot", "man"]                          2.244 μs (5%)
["dot", "rf"]                           2.656 μs (5%)
["dot", "xf"]                           2.667 μs (5%)
["filter_map_map!", "man"]              66.300 μs (5%)
["filter_map_map!", "xf"]               69.400 μs (5%)             144 bytes (1%)   8
["filter_map_reduce", "man"]            194.900 μs (5%)
["filter_map_reduce", "xf"]             194.900 μs (5%)
["gemm", "fusedmul", "blas", "16"]      3.340 ms (5%)
["gemm", "fusedmul", "blas", "2"]       2.543 ms (5%)
["gemm", "fusedmul", "blas", "32"]      4.506 ms (5%)
["gemm", "fusedmul", "blas", "8"]       2.819 ms (5%)
["gemm", "fusedmul", "xf", "16"]        4.935 ms (5%)              160 bytes (1%)   6
["gemm", "fusedmul", "xf", "2"]         613.100 μs (5%)            160 bytes (1%)   6
["gemm", "fusedmul", "xf", "32"]        9.963 ms (5%)              160 bytes (1%)   6
["gemm", "fusedmul", "xf", "8"]         2.462 ms (5%)              160 bytes (1%)   6
["gemm", "mul", "linalg", "256"]        658.400 μs (5%)
["gemm", "mul", "linalg", "32"]         3.712 μs (5%)
["gemm", "mul", "linalg", "8"]          289.362 ns (5%)
["gemm", "mul", "man", "false", "256"]  4.389 ms (5%)
["gemm", "mul", "man", "false", "32"]   7.100 μs (5%)
["gemm", "mul", "man", "false", "8"]    411.000 ns (5%)
["gemm", "mul", "man", "ivdep", "256"]  4.348 ms (5%)
["gemm", "mul", "man", "ivdep", "32"]   6.240 μs (5%)
["gemm", "mul", "man", "ivdep", "8"]    392.574 ns (5%)
["gemm", "mul", "man", "true", "256"]   4.332 ms (5%)
["gemm", "mul", "man", "true", "32"]    7.375 μs (5%)
["gemm", "mul", "man", "true", "8"]     381.373 ns (5%)
["gemm", "mul", "xf", "false", "256"]   4.389 ms (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "false", "32"]    6.900 μs (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "false", "8"]     423.618 ns (5%)            48 bytes (1%)    2
["gemm", "mul", "xf", "ivdep", "256"]   4.439 ms (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "ivdep", "32"]    5.683 μs (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "ivdep", "8"]     398.049 ns (5%)            48 bytes (1%)    2
["gemm", "mul", "xf", "true", "256"]    4.308 ms (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "true", "32"]     6.880 μs (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "true", "8"]      401.005 ns (5%)            48 bytes (1%)    2
["missing_argmax", "man"]               889.362 ns (5%)            32 bytes (1%)    1
["missing_argmax", "rf"]                2.200 μs (5%)              32 bytes (1%)    1
["missing_argmax", "xf"]                2.211 μs (5%)              32 bytes (1%)    1
["missing_dot", "equiv"]                1.210 μs (5%)              16 bytes (1%)    1
["missing_dot", "man"]                  1.040 μs (5%)              16 bytes (1%)    1
["missing_dot", "naive"]                4.043 μs (5%)              16 bytes (1%)    1
["missing_dot", "rf"]                   850.000 ns (5%)            16 bytes (1%)    1
["missing_dot", "rf_nota"]              1.230 μs (5%)              16 bytes (1%)    1
["missing_dot", "xf"]                   185.800 μs (5%)            74.11 KiB (1%)   3866
["missing_dot", "xf_nota"]              183.800 μs (5%)            73.94 KiB (1%)   3862
["partition_by", "man"]                 1.626 ms (5%)              352 bytes (1%)   4
["partition_by", "xf"]                  1.562 ms (5%)              576 bytes (1%)   7

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 3:51
  • Package commit: 30cdf2
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: None

Results

ID                                      time             GC time   memory           allocations
["cat", "base"]                         206.900 μs (5%)
["cat", "xf"]                           1.460 μs (5%)
["collect", "filter-missing"]           80.000 μs (5%)             33.05 KiB (1%)   20
["collect", "identity-float"]           61.200 μs (5%)             256.91 KiB (1%)  20
["collect", "identity-union"]           291.301 μs (5%)            285.42 KiB (1%)  6678
["dot", "blas"]                         2.267 μs (5%)
["dot", "man"]                          2.256 μs (5%)
["dot", "rf"]                           2.656 μs (5%)
["dot", "xf"]                           2.667 μs (5%)
["filter_map_map!", "man"]              66.700 μs (5%)
["filter_map_map!", "xf"]               68.700 μs (5%)             144 bytes (1%)   8
["filter_map_reduce", "man"]            194.900 μs (5%)
["filter_map_reduce", "xf"]             194.900 μs (5%)
["gemm", "fusedmul", "blas", "16"]      3.094 ms (5%)
["gemm", "fusedmul", "blas", "2"]       2.331 ms (5%)
["gemm", "fusedmul", "blas", "32"]      4.480 ms (5%)
["gemm", "fusedmul", "blas", "8"]       2.812 ms (5%)
["gemm", "fusedmul", "xf", "16"]        4.838 ms (5%)              160 bytes (1%)   6
["gemm", "fusedmul", "xf", "2"]         602.100 μs (5%)            160 bytes (1%)   6
["gemm", "fusedmul", "xf", "32"]        9.749 ms (5%)              160 bytes (1%)   6
["gemm", "fusedmul", "xf", "8"]         2.417 ms (5%)              160 bytes (1%)   6
["gemm", "mul", "linalg", "256"]        658.601 μs (5%)
["gemm", "mul", "linalg", "32"]         3.800 μs (5%)
["gemm", "mul", "linalg", "8"]          300.000 ns (5%)
["gemm", "mul", "man", "false", "256"]  4.304 ms (5%)
["gemm", "mul", "man", "false", "32"]   7.000 μs (5%)
["gemm", "mul", "man", "false", "8"]    300.000 ns (5%)
["gemm", "mul", "man", "ivdep", "256"]  4.277 ms (5%)
["gemm", "mul", "man", "ivdep", "32"]   6.300 μs (5%)
["gemm", "mul", "man", "ivdep", "8"]    300.000 ns (5%)
["gemm", "mul", "man", "true", "256"]   4.310 ms (5%)
["gemm", "mul", "man", "true", "32"]    7.100 μs (5%)
["gemm", "mul", "man", "true", "8"]     400.000 ns (5%)
["gemm", "mul", "xf", "false", "256"]   4.298 ms (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "false", "32"]    7.500 μs (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "false", "8"]     400.000 ns (5%)            48 bytes (1%)    2
["gemm", "mul", "xf", "ivdep", "256"]   4.263 ms (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "ivdep", "32"]    5.800 μs (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "ivdep", "8"]     300.000 ns (5%)            48 bytes (1%)    2
["gemm", "mul", "xf", "true", "256"]    4.297 ms (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "true", "32"]     6.700 μs (5%)              48 bytes (1%)    2
["gemm", "mul", "xf", "true", "8"]      400.000 ns (5%)            48 bytes (1%)    2
["missing_argmax", "man"]               900.000 ns (5%)            32 bytes (1%)    1
["missing_argmax", "rf"]                2.178 μs (5%)              32 bytes (1%)    1
["missing_argmax", "xf"]                2.178 μs (5%)              32 bytes (1%)    1
["missing_dot", "equiv"]                1.350 μs (5%)              16 bytes (1%)    1
["missing_dot", "man"]                  1.030 μs (5%)              16 bytes (1%)    1
["missing_dot", "naive"]                4.043 μs (5%)              16 bytes (1%)    1
["missing_dot", "rf"]                   857.692 ns (5%)            16 bytes (1%)    1
["missing_dot", "rf_nota"]              1.360 μs (5%)              16 bytes (1%)    1
["missing_dot", "xf"]                   183.200 μs (5%)            74.08 KiB (1%)   3864
["missing_dot", "xf_nota"]              188.100 μs (5%)            73.92 KiB (1%)   3859
["partition_by", "man"]                 1.629 ms (5%)              352 bytes (1%)   4
["partition_by", "xf"]                  1.566 ms (5%)              576 bytes (1%)   7

tkf (Member, Author) commented Jan 27, 2020

#123 (comment)

ID                                                time ratio   memory ratio
["parallel_histogram", "comm", "basesize=16384"]  0.83 (5%) ✅  0.96 (1%) ✅
["parallel_histogram", "comm", "basesize=4096"]   0.48 (5%) ✅  1.03 (1%) ❌
["parallel_histogram", "comm", "basesize=8192"]   0.69 (5%) ✅  1.19 (1%) ❌
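
For reading the quoted rows, a time ratio converts to a speedup factor by taking its reciprocal. The short sketch below does that for the three ratios; `speedup` is a hypothetical helper, and the inputs are the rounded ratios from the table, so the factors are approximate.

```python
def speedup(time_ratio):
    """Speedup factor implied by a target/baseline time ratio."""
    return 1.0 / time_ratio


# Rounded time ratios from the quoted judge rows.
ratios = {
    "basesize=16384": 0.83,
    "basesize=4096": 0.48,
    "basesize=8192": 0.69,
}
speedups = {key: round(speedup(r), 2) for key, r in ratios.items()}
```

So the `basesize=4096` case runs in a little under half the baseline time, roughly a 2x speedup, while `basesize=16384` improves by about 1.2x.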

tkf added a commit that referenced this pull request Feb 13, 2020
Using commit:
Support Table 1.0 (#123)
JuliaFolds/BangBang.jl@976e825