print("Hello World!")
Hello World!
Jihong Zhang
March 15, 2024
This post illustrates how to use Julia to create a gradient descent algorithm. What has not been introduced, however, is how to perform the data analysis using Julia in Quarto. This post will illustrate the workflow step by step.
First of all, refer to this Quarto.org, JuliaHub, and Patrick Altmeyer’s post. The first step is to install following components:
Second, when you create the new quarto document, make sure the yaml
header contains the jupyter
item. For example, the yaml of this post is:
title: 'How to use Julia in Quarto'
author: 'Jihong Zhang'
date: 'Mar 10 2024'
categories:
- Julia
- Quarto
format:
html:
code-summary: 'Code'
code-fold: false
code-line-numbers: false
jupyter: julia-1.6
After the installation, you should be able to run the julia code in quarto like:
Row | carat | cut | color | clarity | depth | table | price | x | y | z |
---|---|---|---|---|---|---|---|---|---|---|
Float64 | String15 | String1 | String7 | Float64 | Float64 | Int64 | Float64 | Float64 | Float64 | |
1 | 0.23 | Ideal | E | SI2 | 61.5 | 55.0 | 326 | 3.95 | 3.98 | 2.43 |
2 | 0.21 | Premium | E | SI1 | 59.8 | 61.0 | 326 | 3.89 | 3.84 | 2.31 |
3 | 0.23 | Good | E | VS1 | 56.9 | 65.0 | 327 | 4.05 | 4.07 | 2.31 |
4 | 0.29 | Premium | I | VS2 | 62.4 | 58.0 | 334 | 4.2 | 4.23 | 2.63 |
5 | 0.31 | Good | J | SI2 | 63.3 | 58.0 | 335 | 4.34 | 4.35 | 2.75 |
6 | 0.24 | Very Good | J | VVS2 | 62.8 | 57.0 | 336 | 3.94 | 3.96 | 2.48 |
7 | 0.24 | Very Good | I | VVS1 | 62.3 | 57.0 | 336 | 3.95 | 3.98 | 2.47 |
Following the previous post, we can easily model a generalized linear regression using GLM
module:
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}
price ~ 1 + depth
Coefficients:
────────────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept) 5763.67 740.556 7.78 <1e-14 4312.17 7215.16
depth -29.65 11.9897 -2.47 0.0134 -53.1499 -6.15005
────────────────────────────────────────────────────────────────────────
Let’s do some more advanced measurement - Factor analysis:
using MultivariateStats
# only sample first 300 cases and four variables
Xtr = diamonds[1:300 , [:x, :y, :z]]
# with each observation in a column
Xtr = Matrix(Xtr)' # somehow the data matrix has size of (d, n), which is the trasponse of data matrix in R
# train a one-factor model
M = fit(FactorAnalysis, Xtr; maxoutdim=1, method=:em)
Factor Analysis(indim = 3, outdim = 1)
You can refer to this doc for more details for parameter estimation of factor analysis
Let’s quickly compare the results of lavaan
Flux is an elegant approach to machine learning. It’s a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia’s native GPU and AD support. Flux makes the easy things easy while remaining fully hackable. See more details in Juliahub.
using Flux, Plots
data = [([x], 2x-x^3) for x in -2:0.1f0:2]
model = Chain(Dense(1 => 23, tanh), Dense(23 => 1, bias=false), only)
optim = Flux.setup(Adam(), model)
for epoch in 1:1000
Flux.train!((m,x,y) -> (m(x) - y)^2, model, data, optim)
end
plot(x -> 2x-x^3, -2, 2, legend=false)
scatter!(x -> model([x]), -2:0.1f0:2)
# This will prompt if neccessary to install everything, including CUDA:
using Flux, CUDA, Statistics, ProgressMeter
# Generate some data for the XOR problem: vectors of length 2, as columns of a matrix:
noisy = rand(Float32, 2, 1000) # 2×1000 Matrix{Float32}
truth = [xor(col[1]>0.5, col[2]>0.5) for col in eachcol(noisy)] # 1000-element Vector{Bool}
# Define our model, a multi-layer perceptron with one hidden layer of size 3:
model = Chain(
Dense(2 => 3, tanh), # activation function inside layer
BatchNorm(3),
Dense(3 => 2),
softmax) |> gpu # move model to GPU, if available
# The model encapsulates parameters, randomly initialised. Its initial output is:
out1 = model(noisy |> gpu) |> cpu # 2×1000 Matrix{Float32}
# To train the model, we use batches of 64 samples, and one-hot encoding:
target = Flux.onehotbatch(truth, [true, false]) # 2×1000 OneHotMatrix
loader = Flux.DataLoader((noisy, target) |> gpu, batchsize=64, shuffle=true);
# 16-element DataLoader with first element: (2×64 Matrix{Float32}, 2×64 OneHotMatrix)
optim = Flux.setup(Flux.Adam(0.01), model) # will store optimiser momentum, etc.
# Training loop, using the whole data set 1000 times:
losses = []
@showprogress for epoch in 1:1_000
for (x, y) in loader
loss, grads = Flux.withgradient(model) do m
# Evaluate model and loss inside gradient context:
y_hat = m(x)
Flux.crossentropy(y_hat, y)
end
Flux.update!(optim, model, grads[1])
push!(losses, loss) # logging, outside gradient context
end
end
optim # parameters, momenta and output have all changed
out2 = model(noisy |> gpu) |> cpu # first row is prob. of true, second row p(false)
mean((out2[1,:] .> 0.5) .== truth) # accuracy 94% so far!
0.932
using Plots # to draw the above figure
p_true = scatter(noisy[1,:], noisy[2,:], zcolor=truth, title="True classification", legend=false)
p_raw = scatter(noisy[1,:], noisy[2,:], zcolor=out1[1,:], title="Untrained network", label="", clims=(0,1))
p_done = scatter(noisy[1,:], noisy[2,:], zcolor=out2[1,:], title="Trained network", legend=false)
plot(p_true, p_raw, p_done, layout=(1,3), size=(1000,330))
---
title: 'How to use Julia in Quarto'
author: 'Jihong Zhang'
date: 'Mar 15 2024'
categories:
- Julia
- Quarto
toc: true
number-sections: true
execute:
eval: true
output: false
format:
html:
code-summary: 'Code'
code-fold: false
code-line-numbers: false
jupyter: julia-1.10
---
## Previous posts
This [post](posts/2021-08-30-gradient-descent-via-julia/index.md) illustrates how to use Julia to create a gradient descent algorithm. What has not been introduced, however, is how to perform the data analysis using Julia in Quarto. This post will illustrate the workflow step by step.
## Initial Setup
First of all, refer to this [Quarto.org](https://quarto.org/docs/computations/julia.html), [JuliaHub](https://help.juliahub.com/juliahub/stable/tutorials/quarto/), and Patrick Altmeyer's [post](https://www.paltmeyer.com/blog/posts/tips-and-tricks-for-using-quarto-with-julia/). The first step is to install following components:
1. IJulia
2. Revise.jl
3. Jupyter Cache
``` {.bash filename="Terminal"}
using Pkg
Pkg.add("IJulia")
Pkg.add("Revise")
using Conda
Conda.add("jupyter-cache")
```
Second, when you create the new quarto document, make sure the `yaml` header contains the `jupyter` item. For example, the yaml of this post is:
``` yaml
title: 'How to use Julia in Quarto'
author: 'Jihong Zhang'
date: 'Mar 10 2024'
categories:
- Julia
- Quarto
format:
html:
code-summary: 'Code'
code-fold: false
code-line-numbers: false
jupyter: julia-1.6
```
After the installation, you should be able to run the julia code in quarto like:
```{julia}
#| output: true
print("Hello World!")
```
## Import dataset
```{julia}
# import packages
using DataFrames
using CSV
# load in the diamonds.csv
diamonds = DataFrame(CSV.File("diamonds.csv"))
```
```{julia}
#| output: true
first(diamonds, 7)
```
## Basic Statistical Modeling
Following the previous post, we can easily model a generalized linear regression using `GLM` module:
```{julia}
#| output: true
using GLM
lm_fit = lm(@formula(price ~ depth), diamonds)
```
Let's do some more advanced measurement - Factor analysis:
```{julia}
#| output: true
using MultivariateStats
# only sample first 300 cases and four variables
Xtr = diamonds[1:300 , [:x, :y, :z]]
# with each observation in a column
Xtr = Matrix(Xtr)' # somehow the data matrix has size of (d, n), which is the trasponse of data matrix in R
# train a one-factor model
M = fit(FactorAnalysis, Xtr; maxoutdim=1, method=:em)
```
You can refer to this [doc](https://multivariatestatsjl.readthedocs.io/en/latest/fa.html) for more details for parameter estimation of factor analysis
```{julia}
#| output: true
loadings(M)
```
Let's quickly compare the results of `lavaan`
``` r
library(ggplot2)
library(lavaan)
data('diamonds')
X = diamonds[1:300, c('x', 'y', 'z')]
fa_model = "
F1 =~ x + y + z
"
fit = cfa(fa_model, data = X, std.lv = TRUE)
coef(fit)[1:3] # factor loading
F1=~x F1=~y F1=~z
0.7802245 0.7673664 0.4752576
```
## Some Popular Julia Package
### Flux
Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable. See more details in [Juliahub](https://juliahub.com/ui/Packages/General/Flux).
```{julia}
#| output: true
using Flux, Plots
data = [([x], 2x-x^3) for x in -2:0.1f0:2]
model = Chain(Dense(1 => 23, tanh), Dense(23 => 1, bias=false), only)
optim = Flux.setup(Adam(), model)
for epoch in 1:1000
Flux.train!((m,x,y) -> (m(x) - y)^2, model, data, optim)
end
plot(x -> 2x-x^3, -2, 2, legend=false)
scatter!(x -> model([x]), -2:0.1f0:2)
```
```{julia}
#| output: true
#| code-fold: true
# This will prompt if neccessary to install everything, including CUDA:
using Flux, CUDA, Statistics, ProgressMeter
# Generate some data for the XOR problem: vectors of length 2, as columns of a matrix:
noisy = rand(Float32, 2, 1000) # 2×1000 Matrix{Float32}
truth = [xor(col[1]>0.5, col[2]>0.5) for col in eachcol(noisy)] # 1000-element Vector{Bool}
# Define our model, a multi-layer perceptron with one hidden layer of size 3:
model = Chain(
Dense(2 => 3, tanh), # activation function inside layer
BatchNorm(3),
Dense(3 => 2),
softmax) |> gpu # move model to GPU, if available
# The model encapsulates parameters, randomly initialised. Its initial output is:
out1 = model(noisy |> gpu) |> cpu # 2×1000 Matrix{Float32}
# To train the model, we use batches of 64 samples, and one-hot encoding:
target = Flux.onehotbatch(truth, [true, false]) # 2×1000 OneHotMatrix
loader = Flux.DataLoader((noisy, target) |> gpu, batchsize=64, shuffle=true);
# 16-element DataLoader with first element: (2×64 Matrix{Float32}, 2×64 OneHotMatrix)
optim = Flux.setup(Flux.Adam(0.01), model) # will store optimiser momentum, etc.
# Training loop, using the whole data set 1000 times:
losses = []
@showprogress for epoch in 1:1_000
for (x, y) in loader
loss, grads = Flux.withgradient(model) do m
# Evaluate model and loss inside gradient context:
y_hat = m(x)
Flux.crossentropy(y_hat, y)
end
Flux.update!(optim, model, grads[1])
push!(losses, loss) # logging, outside gradient context
end
end
optim # parameters, momenta and output have all changed
out2 = model(noisy |> gpu) |> cpu # first row is prob. of true, second row p(false)
mean((out2[1,:] .> 0.5) .== truth) # accuracy 94% so far!
```
```{julia}
#| output: true
using Plots # to draw the above figure
p_true = scatter(noisy[1,:], noisy[2,:], zcolor=truth, title="True classification", legend=false)
p_raw = scatter(noisy[1,:], noisy[2,:], zcolor=out1[1,:], title="Untrained network", label="", clims=(0,1))
p_done = scatter(noisy[1,:], noisy[2,:], zcolor=out2[1,:], title="Trained network", legend=false)
plot(p_true, p_raw, p_done, layout=(1,3), size=(1000,330))
```
```{julia}
#| output: true
plot(losses; xaxis=(:log10, "iteration"),
yaxis="loss", label="per batch")
n = length(loader)
plot!(n:n:length(losses), mean.(Iterators.partition(losses, n)),
label="epoch mean", dpi=200)
```