JITrench

[WIP] A lightweight automatic differentiation & deep learning framework implemented in pure Julia.

Let's dive into the deep trenches of the loss function with JITrench.jl.


Install

In the Julia REPL, press ] to enter package mode and run:

]add https://github.com/abap34/JITrench.jl

Automatic gradient calculation

julia> using JITrench

julia> f(x) = sin(x) + 1
f (generic function with 1 method)

julia> JITrench.@diff! f(x)
f′ (generic function with 1 method)

julia> f′(π)
-1.0
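The generated derivative can be checked against a hand-computed one. A hypothetical extra example, assuming @diff! handles other scalar expressions the same way it handles f above:

g(x) = x^2 + 3x
JITrench.@diff! g(x)   # defines g′ (assumption: the macro works as it does for f above)
g′(2.0)                # analytic derivative is 2x + 3, so this should return 7.0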

Compute gradients for "Deep" functions, visualize computational graphs

julia> x = Scalar(2.5)
Scalar{Float64}(2.5) 


julia> y = Scalar(3.5)
Scalar{Float64}(3.5) 

julia> goldstain(x, y) = (1 + (x + y + 1)^2 * (19 - 14x + 3x^2 - 14y + 6x*y + 3y^2)) *  (30 + (2x - 3y)^2 * (18 - 32x + 12x^2 + 48y - 36x*y + 27*y^2))
goldstain (generic function with 1 method)

julia> z = goldstain(x, y)
Scalar{Float64}(1.260939725e7) 

julia> backward!(z)

julia> x.grad
-5.324409e6

julia> y.grad
3.3109701e7

julia> JITrench.plot_graph(z, to_file="example/visualize/goldstain.png")
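The same Scalar / backward! workflow applies to any composition of supported scalar operations. A minimal, hypothetical sketch where the analytic gradient is easy to verify by hand:

# hypothetical example: ∂g/∂x = 2xy, ∂g/∂y = x²
g(x, y) = x^2 * y

x = Scalar(3.0)
y = Scalar(2.0)

z = g(x, y)
backward!(z)

x.grad   # should be 2 * 3.0 * 2.0 = 12.0
y.grad   # should be 3.0^2 = 9.0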

Compute gradients for operations on multidimensional arrays

julia> A = AutoDiff.Tensor([1 2; 3 4; 5 6])
3×2 Tensor{Matrix{Int64}}: 
 1  2
 3  4
 5  6 
 

julia> B = reshape(A, (2, 3))
2×3 Tensor{Matrix{Int64}}: 
 1  5  4
 3  2  6 
 

julia> C = B[1, :]
3×1 Tensor{Vector{Int64}}: 
 1
 5
 4 
 

julia> y = sum(C)
Scalar{Int64}(10) 


julia> backward!(y)

julia> A.grad
3×2 Matrix{Float64}:
 1.0  0.0
 0.0  1.0
 1.0  0.0
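A quick way to sanity-check gradient shapes is to sum a tensor directly: the gradient of the sum with respect to the input is an array of ones of the same shape. A minimal sketch using only the API shown above:

A = Tensor([1.0 2.0; 3.0 4.0])
y = sum(A)     # Scalar(10.0)
backward!(y)
A.grad         # expected: a 2×2 array of ones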

GPU Support: CuTensor

With the CuTensor type, you can perform calculations on the GPU as you would with Tensor.

julia> using JITrench

julia> using BenchmarkTools

julia> x = Tensor(rand(512, 512));

julia> W = Tensor(rand(512, 512));

julia> x_gpu = CuTensor(rand(512, 512));

julia> W_gpu = CuTensor(rand(512, 512));

julia> @benchmark x * W
BenchmarkTools.Trial: 7490 samples with 1 evaluation.
 Range (min … max):  616.548 μs …   1.238 ms  ┊ GC (min … max): 0.00% … 47.29%
 Time  (median):     649.301 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   665.530 μs ± 90.051 μs  ┊ GC (mean ± σ):  2.36% ±  7.64%

    ▆█▆▂                                                  ▁    ▁
  ▄▇████▇▆▃▃▁▁▁▄▃▁▁▁▁▃▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆██▇▇ █
  617 μs        Histogram: log(frequency) by time      1.19 ms <

 Memory estimate: 2.00 MiB, allocs estimate: 5.

julia> @benchmark x_gpu * W_gpu
BenchmarkTools.Trial: 10000 samples with 3 evaluations.
 Range (min … max):   8.317 μs … 12.454 ms  ┊ GC (min … max): 0.00% … 10.13%
 Time  (median):     38.716 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   38.481 μs ± 124.332 μs  ┊ GC (mean ± σ):  0.33% ±  0.10%

  ▄                                            ▃▁▂    ▁  ▄█▆▃▄ ▂
  █▅▄▅▁▁▃▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄███▅▄▁▄█▇▆█████ █
  8.32 μs       Histogram: log(frequency) by time      40.6 μs <

 Memory estimate: 704 bytes, allocs estimate: 31.
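Since CuTensor is intended as a drop-in replacement for Tensor, gradients should flow through GPU computations in the same way. A minimal sketch, assuming sum and backward! accept CuTensor just as they do Tensor:

loss = sum(x_gpu * W_gpu)   # reduce to a Scalar so backward! has a single output to differentiate
backward!(loss)
W_gpu.grad                  # gradient with respect to W_gpu (expected to live on the GPU as well)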

Example: Gradient Descent

Gradient descent on the Rosenbrock function, whose global minimum lies at (x₀, x₁) = (1, 1):

rosenbrock(x₀, x₁) = 100 * (x₁ - x₀^2)^2 + (x₀ - 1)^2

x₀ = Scalar(0.0)
x₁ = Scalar(5.0)
lr = 1e-3
iters = 10000

for i in 1:iters    
    y = rosenbrock(x₀, x₁)
    JITrench.AutoDiff.cleargrad!(x₀)   # reset gradients accumulated in the previous iteration
    JITrench.AutoDiff.cleargrad!(x₁)
    backward!(y)
    x₀.values -= lr * x₀.grad          # plain gradient-descent update
    x₁.values -= lr * x₁.grad
end

See example/optimization/gradient_descent.jl for details.

Example: Newton Method

JITrench.jl can also compute higher-order derivatives. Each iteration below applies the Newton update x ← x - f′(x) / f″(x): calling backward! with create_graph=true keeps the computational graph of the first-order gradient, so running backward! on that gradient again yields the second derivative. Starting from x = 2.0, the iteration converges to the local minimum at x = 1.

using JITrench

f(x) = x^4 - 2x^2

x = Scalar(2.0)
iters = 10

for i in 1:iters    
    y = f(x)
    JITrench.AutoDiff.cleargrad!(x)
    backward!(y, create_graph=true)      # keep the graph so the gradient itself is differentiable
    gx = x.grad                          # f′(x)
    JITrench.AutoDiff.cleargrad!(gx)
    backward!(gx)
    gx2 = x.grad                         # f″(x)
    x.values -= gx.values / gx2.values   # Newton update
end

See example/optimization/newton_method.jl for details.
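After the loop, x should have converged to the minimizer of f; a quick check using the .values field shown above:

x.values      # ≈ 1.0, a local minimum of x⁴ - 2x²
f(x.values)   # ≈ -1.0, the corresponding minimum value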

Example: Train Neural Network

using JITrench
using JITrench.NN 
using Printf


N = 100         # number of samples
p = 1           # input dimension
n_iter = 20000

x = rand(N, p)
y = sin.(2π .* x) + (rand(N, p) / 1)   # noisy sine targets

# two-layer MLP: Linear(→ 10) → sigmoid → Linear(→ 1)
function model(x)
    x = NN.Linear(out_dim=10)(x)
    x = NN.functions.sigmoid.(x)
    x = NN.Linear(out_dim=1)(x)
    return NN.result(x)
end

params = NN.init(model, NN.Initializer((nothing, 1)))
optimizer = NN.SGD(params, 1e-1)

x = Tensor(x)
y = Tensor(y)

for iter in 1:n_iter
    pred = NN.apply(model, x, params)
    loss = NN.functions.mean_squared_error(y, pred)
    NN.cleargrads!(params)
    backward!(loss)
    NN.optimize!(optimizer)
    if (iter % 500 == 0)
        @printf "[iters] %4i [loss] %.4f\n" iter loss.values 
    end
end


NN.save_weight(params, "weight")

This is a simple example of fitting a sine curve with an MLP and saving the trained weights as 'weights.jtw'.
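Once trained, the same NN.apply call used in the training loop can be reused for prediction; a minimal sketch, where x_new is a hypothetical batch of fresh inputs:

x_new = Tensor(rand(10, p))                 # hypothetical new inputs with the same feature dimension
pred_new = NN.apply(model, x_new, params)
pred_new.values                             # predictions as a plain array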

The code that visualizes the progress of the training is available in 'example/NN/MLP.jl'.
