oneDNN 2.5 migration #121

Open
wants to merge 63 commits into
base: v2.5_for_ie_master
8109bd8
jit: avx512: conv: Fix missed ur_w iteration
AlexPeskov Mar 26, 2018
e248b50
Enable jit sse41 NxN convolution for grayscale input
dmitry-gorokhov Jun 5, 2018
1c6c17d
Support of strided blobs for [de]convolution and simple reorder
AlexPeskov Jan 23, 2018
94dbed3
Updated sse41 jit convolutions to support padded channels
dmitry-gorokhov Oct 26, 2018
a707196
Introduced Depthwise and Quantization post ops
dmitry-gorokhov Sep 24, 2020
d31c5c4
Pooling pads like Caffe
AlexPeskov Mar 5, 2018
21d958b
TBB_AUTO was enabled
alexey-varyzgin May 14, 2019
23dacbf
Add API function dnnl_memory_set_data_handle_no_pads_proc
AlexPeskov Aug 27, 2019
d22cb71
nchw_pooling dense fix
alexey-varyzgin Nov 14, 2019
b17c0cc
Enabled BWD (JIT/GEMM) FP32/BF16 Convolutions + Depthwise post ops fus…
dmitry-gorokhov Oct 21, 2020
d83837b
Fixes for MKLDNN to enable LTO
ilya-lavrenov May 18, 2020
8e0bbf7
[MSVC] Enabling SIMD functionality for VS2019
Aug 12, 2020
6a7071e
Avoid usage of undefined macro
AlexPeskov Oct 26, 2020
d13149c
Add several uni instruction wrappers into jit_generator
AlexPeskov Oct 26, 2020
11c81cd
Fix ODR violation
AlexPeskov Nov 16, 2020
b625aec
fix name matching with system struct 'user' in llvm-android toolchain
AlexPeskov Nov 16, 2020
e333684
Added JIT FP32/BF16 Softmax for arbitrary inner_size
dmitry-gorokhov Dec 4, 2020
31cd484
Added support of hsigmoid, round_half_to_even, round_half_away_from_z…
a-sidorova Aug 27, 2020
c74c508
Limit applicability of is_1stconv logic for JIT FP32/BF16 AVX512 Conv…
dmitry-gorokhov Dec 9, 2020
3b61547
[WA] Removed kernel_outside_src condition on JIT FP32/BF16 Convolutions
dmitry-gorokhov Dec 9, 2020
83137dc
Added custom version of JIT DW FP32/BF16 Convolution with 5D input su…
dmitry-gorokhov Dec 14, 2020
11141a1
Asymmetric quantization for activations
dmitry-gorokhov Nov 20, 2020
7d34a00
Added 3D DW case support for JIT INT8 Convolutions
dmitry-gorokhov Dec 14, 2020
fad9dbe
[WA] Disabled weights md transpose in FC to prevent perf degradations
dmitry-gorokhov Dec 16, 2020
8d279a1
Dynamic batch support via context
dmitry-gorokhov Jan 2, 2021
50043b9
Added JIT AVX512/AVX2 FP32 Planar Convolution implementation
dmitry-gorokhov Jan 2, 2021
ce960ad
Binary networks support
dmitry-gorokhov Jan 21, 2021
64232aa
Accommodating oneTBB (with hybrid cores support) that
myshevts Nov 24, 2020
daeb468
NCHW pooling performance fixed in accordance with v0.21
maxnick Feb 8, 2021
4592ba4
[WA] Fixed fallback on ref conv in case exceeding scratchpad limit
dmitry-gorokhov Feb 26, 2021
0c99230
Returned old behavior for fp32 avx2 1x1 conv with dw conv fusing
antonvor Feb 16, 2021
dcd0abf
Updated SoftPlus
a-sidorova Apr 12, 2021
9ccc627
Fixed warning for undefined ITT_ARCH_IA64 (#52)
ilya-lavrenov May 12, 2021
ccd5353
Disable reorder JIT if both inputs and outputs are batch-strided.
IvanNovoselov Jun 8, 2021
94c0a20
Include TBB headers as system
AlexPeskov Oct 26, 2020
f1fd7e4
Fixed redefinition of tls model
dmitry-gorokhov Jun 24, 2021
a8f7910
nspc layout support for convolutions
maxnick Mar 31, 2021
597c659
set scale = 1.f in case of signed input on platforms without vnni
antonvor May 26, 2021
90b00ac
Enable direct copy primitives for u8 reorder
IvanNovoselov Jul 2, 2021
de66d63
Memory descriptor dynamism related changes
maxnick Jul 23, 2021
1501344
Added prelu as binary post op
antonvor Aug 2, 2021
a7c1712
Depthwise and Quantization post ops for Gemm Convolutions
antonvor Aug 23, 2021
12e2bca
Perf fixes for Ref and NCHW Poolings
antonvor Sep 1, 2021
ab4c85f
perf fixes for quantization post ops
antonvor Sep 16, 2021
80a89c2
todo: fix assert(idx < max_idx)
antonvor Sep 16, 2021
f5f763d
simple reorder: temporarily disabled zero padding
antonvor Sep 19, 2021
98064bb
Fix possible data race when accessing global reorder list
Sep 30, 2021
49d6a78
Brgemm implementation has perf degradation in RNN node
alexey-varyzgin Oct 4, 2021
994f9c5
reverted old behavior with pdim_consistent check due to perf problems
antonvor Nov 10, 2021
a5aa5bb
Renamed matmul kernel type: brg -> brgemm
dmitry-gorokhov Aug 24, 2021
a4ebe69
Fixed bias addition order in brgemm kernel
dmitry-gorokhov Nov 3, 2021
9f776d1
[1D] Enlarge support
alexey-varyzgin Oct 22, 2021
8ac9ff7
Update uni_ jit methods to avoid mixing vex and nonVEX instructions
Nov 17, 2021
26a09d4
Quantization post op structure modified to reduce its complexity
maxnick Nov 24, 2021
298546e
Hash utility functions were extracted to a separate module for reuse
maxnick Nov 29, 2021
c4d9021
Desc similar_to routine consider start stride
maxnick Jan 14, 2022
81698b9
Desc similar_to routine use stride cmp mask
maxnick Jan 26, 2022
f8d99a9
added some legacy parallel methods to fix perf issues
antonvor Jan 17, 2022
5a77d6e
Migrate legacy post ops and zero points on runtime data pointers
dmitry-gorokhov Jan 26, 2022
15c2025
Revert "all: remove mkldnn compatibility layer"
dmitry-gorokhov Feb 21, 2022
22e2744
Revert "reverted old behavior with pdim_consistent check due to perf …
dmitry-gorokhov Mar 5, 2022
5ac2a40
Fixed ODR violation
dmitry-gorokhov Mar 14, 2022
d55288f
equality_uni_xxx_for_sse_and_avx
chenhu-wang Feb 24, 2022
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -110,6 +110,7 @@ set(CMAKE_TEST_CCXX_FLAGS) # TESTS specifics
 string(TOUPPER "${CMAKE_BUILD_TYPE}" UPPERCASE_CMAKE_BUILD_TYPE)
 
 include("cmake/dnnl_compat.cmake")
+include("cmake/mkldnn_compat.cmake")
 
 include("cmake/utils.cmake")
 include("cmake/options.cmake")
7 changes: 5 additions & 2 deletions cmake/TBB.cmake
@@ -26,7 +26,10 @@ include("cmake/Threading.cmake")
 macro(handle_tbb_target)
     if(TBB_FOUND)
         set_property(TARGET TBB::tbb PROPERTY "MAP_IMPORTED_CONFIG_RELWITHMDD" "DEBUG")
-        include_directories_with_host_compiler(${_tbb_include_dirs})
+        foreach(inc_dir ${_tbb_include_dirs})
+            include_directories(BEFORE SYSTEM ${inc_dir})
+            append_host_compiler_options(CMAKE_CXX_FLAGS "-I${inc_dir}")
+        endforeach()
         list(APPEND EXTRA_SHARED_LIBS ${TBB_IMPORTED_TARGETS})
 
         # Print TBB location
@@ -56,7 +59,7 @@ macro(handle_tbb_target)
     append_to_windows_path_list(CTESTCONFIG_PATH "${_tbb_redist_dir}")
 endmacro()
 
-if(NOT DNNL_CPU_THREADING_RUNTIME STREQUAL "TBB")
+if(NOT "${DNNL_CPU_THREADING_RUNTIME}" MATCHES "^(TBB|TBB_AUTO)$")
     return()
 endif()
 
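The guard change in cmake/TBB.cmake swaps an exact string comparison for an anchored regex, so the TBB setup now runs for both `TBB` and `TBB_AUTO` and still returns early for anything else. A minimal sketch of that behavior, assuming a standalone script run with `cmake -P`; the file name `guard_demo.cmake` and the sample values `OMP` and `TBB_STATIC` are illustrative and not taken from this PR:

```cmake
# guard_demo.cmake (hypothetical file name); run with: cmake -P guard_demo.cmake
cmake_minimum_required(VERSION 3.13)

# Sample runtime names. "OMP" and "TBB_STATIC" are illustrative values only.
foreach(runtime "TBB" "TBB_AUTO" "OMP" "TBB_STATIC")
    # Same pattern as the new guard: ^ and $ anchor the match to the whole
    # string, so a name that merely starts with "TBB" does not pass.
    if("${runtime}" MATCHES "^(TBB|TBB_AUTO)$")
        message(STATUS "${runtime}: TBB setup runs")
    else()
        message(STATUS "${runtime}: early return")
    endif()
endforeach()
```

Without the anchors, `MATCHES "TBB"` would also accept any value that merely contains `TBB`, which is presumably why the pattern is written with `^` and `$`.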
7 changes: 3 additions & 4 deletions cmake/Threading.cmake
@@ -39,13 +39,12 @@ list(APPEND EXTRA_SHARED_LIBS "${CMAKE_THREAD_LIBS_INIT}")
 
 # A macro to avoid code duplication
 macro(find_package_tbb)
-    set(_cmake_proj_dir "${PROJECT_SOURCE_DIR}/cmake")
     if(WIN32)
-        find_package(TBB ${ARGN} COMPONENTS tbb HINTS ${_cmake_proj_dir}/win)
+        find_package(TBB ${ARGN} COMPONENTS tbb)
     elseif(APPLE)
-        find_package(TBB ${ARGN} COMPONENTS tbb HINTS ${_cmake_proj_dir}/mac)
+        find_package(TBB ${ARGN} COMPONENTS tbb)
     elseif(UNIX)
-        find_package(TBB ${ARGN} COMPONENTS tbb HINTS ${_cmake_proj_dir}/lnx)
+        find_package(TBB ${ARGN} COMPONENTS tbb)
     endif()
 
     if(TBB_FOUND)
46 changes: 46 additions & 0 deletions cmake/gen_mkldnn_compat_cmakes.cmake
@@ -0,0 +1,46 @@
+#===============================================================================
+# Copyright 2019-2020 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#===============================================================================
+
+# Creates cmake config for MKLDNN based on oneDNN one
+# (by replacing DNNL with MKLDNN)
+# Parameters:
+# DIR -- path to cmake install dir
+
+set(DNNL_DIR ${DIR}/dnnl)
+set(MKLDNN_DIR ${DIR}/mkldnn)
+
+file(MAKE_DIRECTORY ${MKLDNN_DIR})
+
+file(GLOB_RECURSE fs "${DNNL_DIR}/*")
+foreach(f ${fs})
+    # set the destination
+    file(RELATIVE_PATH frel ${DNNL_DIR} ${f})
+    string(REGEX REPLACE "dnnl" "mkldnn" dest_rel "${frel}")
+    set(dest "${MKLDNN_DIR}/${dest_rel}")
+    # message(STATUS "file: ${f} --> ${frel} --> ${dest_rel} --> ${dest}")
+
+    # read and change the content of the file
+    file(STRINGS ${f} contents NEWLINE_CONSUME)
+    string(REGEX REPLACE "DNNL" "MKLDNN" contents "${contents}")
+    string(REGEX REPLACE "dnnl" "mkldnn" contents "${contents}")
+    foreach (ext "a" "so" "dylib" "dll" "lib")
+        string(REGEX REPLACE "mkldnn[.]${ext}" "dnnl.${ext}" contents "${contents}")
+    endforeach()
+    string(REGEX REPLACE "lmkldnn" "ldnnl" contents "${contents}")
+
+    # store the result
+    file(WRITE ${dest} ${contents})
+endforeach()
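To make the rewrite order in cmake/gen_mkldnn_compat_cmakes.cmake easier to follow, here is a minimal sketch that applies the same sequence of replacements to a single in-memory string instead of to files. The sample line and the file name `rename_demo.cmake` are made up for illustration; only the `string(REGEX REPLACE ...)` sequence mirrors the script:

```cmake
# rename_demo.cmake (hypothetical file name); run with: cmake -P rename_demo.cmake
cmake_minimum_required(VERSION 3.13)

# Made-up sample text standing in for one line of a generated dnnl config file.
set(contents "find_library(DNNL_LIB dnnl) # links libdnnl.so via -ldnnl")

# Step 1: rename identifiers, upper case first, then lower case.
string(REGEX REPLACE "DNNL" "MKLDNN" contents "${contents}")
string(REGEX REPLACE "dnnl" "mkldnn" contents "${contents}")

# Step 2: undo the rename where it hit a library file name, because the
# binary on disk is still the dnnl library.
foreach(ext "a" "so" "dylib" "dll" "lib")
    string(REGEX REPLACE "mkldnn[.]${ext}" "dnnl.${ext}" contents "${contents}")
endforeach()

# Step 3: undo the rename inside linker flags as well.
string(REGEX REPLACE "lmkldnn" "ldnnl" contents "${contents}")

message(STATUS "${contents}")
```

The variable and target names end up with the MKLDNN spelling, while the library file name and the `-l` flag keep pointing at dnnl. Since the script reads `DIR`, it is presumably invoked in script mode along the lines of `cmake -DDIR=<cmake install dir> -P cmake/gen_mkldnn_compat_cmakes.cmake`; the exact call site is not shown in this part of the diff.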
183 changes: 0 additions & 183 deletions cmake/lnx/TBBConfig.cmake

This file was deleted.
