References: Hasktorch-Tutorial
Hasktorch is a Haskell library built on libtorch, the same C++ library that powers PyTorch.
The "tensors" people talk about here are not tensors in the mathematical sense; they are just multidimensional arrays (vectors written out in coordinates) whose total size is the product of the sizes along each dimension. For example, a tensor of shape [2, 3] holds 2 × 3 = 6 numbers.
But you can see why it is called a tensor: once coordinates are fixed, such an array can represent a (multi)linear transform.
For example, Torch.Tensor.Factories provides functions that create tensors filled with zeros:
zeros' :: [Int] -> Tensor
zeros :: [Int] -> TensorOptions -> Tensor
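For instance, a quick sketch of how zeros' behaves with the default options:
zeros' [2, 3]   -- a 2×3 tensor of zeros, Float on the CPU by default
zeros' [4]      -- a length-4 vector of zeros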
There are also two handy maps for converting between Haskell values (scalars, lists, nested lists) and tensors:
asTensor :: (TensorLike a) => a -> Tensor
asValue :: (TensorLike a) => Tensor -> a
For example, you can make a scalar as a zero-dimensional tensor (of shape []) using asTensor (3 :: Float). You can also convert a list of lists to a tensor using asTensor ([[1,2],[3,4]] :: [[Float]]).
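Going the other way, asValue needs a type annotation so it knows which Haskell type to produce; a small sketch:
asValue (asTensor (3 :: Float)) :: Float                       -- 3.0
asValue (asTensor ([[1,2],[3,4]] :: [[Float]])) :: [[Float]]   -- [[1.0,2.0],[3.0,4.0]]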
Tensor options can be constructed via the following functions
defaultOpts :: TensorOptions
withDType :: DType -> TensorOptions -> TensorOptions
withDevice :: Device -> TensorOptions -> TensorOptions
withLayout :: Layout -> TensorOptions -> TensorOptions
-- where DType is used to specify data type
data DType = Bool | UInt8 | Int8 | Int16 | Int32 | Int64 | Half | Float | Double | ComplexHalf | ComplexFloat | ComplexDouble | QInt8 | QUInt8 | QInt32 | BFloat16
-- Device is used to specify the device the tensor lives on
data Device = Device { deviceType :: DeviceType, deviceIndex :: Int }
data DeviceType = CPU | CUDA | MPS
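A sketch of combining these options with the factory functions above (the second line assumes a CUDA device is available):
zeros [2, 3] (withDType Int64 defaultOpts)               -- 2×3 tensor of Int64 zeros
zeros [2, 3] (withDevice (Device CUDA 0) defaultOpts)    -- 2×3 tensor on the first GPU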
Tensors are an instance of Num, so you can do ring operations on them component-wise.
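For example, a small sketch:
let t = asTensor ([[1,2],[3,4]] :: [[Float]])
t + t        -- component-wise addition
t * t        -- component-wise (Hadamard) product, not matrix multiplication
negate t     -- component-wise negation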
There are also component-wise operations like relu :: Tensor -> Tensor, which applies the ReLU function to each component of the tensor. There are a lot of functions you can use in Torch.Functional and Torch.Typed.Functional.
The function select :: Int -> Int -> Tensor -> Tensor in Torch.Tensor slices the input tensor along the selected dimension at the given index.
The first parameter specifies the dimension to slice on, counted from 0.
The second parameter specifies the index to slice at, also counted from 0.
This is actually a projection map: it projects onto the sub-array sitting at that index along the chosen dimension.
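A small sketch:
let t = asTensor ([[1,2],[3,4]] :: [[Float]])
select 0 1 t   -- row 1 (counting from 0): [3.0, 4.0]
select 1 0 t   -- column 0: [1.0, 3.0]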
import Control.Monad.State
import Torch
import Torch.Internal.Managed.Type.Context (manual_seed_L)
Without specifying an RNG generator, you get impure random number generation:
randIO' :: [Int] -> IO Tensor
-- ^ impure, but there is a hack function `manual_seed_L` to set the seed
example = do
  manual_seed_L 12345
  randIO' [2, 3]
randIO' fills the tensor with uniform random numbers in [0, 1).
There is also a stateful generator
rand' :: [Int] -> Generator -> (Tensor, Generator)
-- ^ which is essentially
-- [Int] -> State Generator Tensor
example = do
  rng0 <- mkGenerator (Device CPU 0) 12345
  ... use rng0, potentially in a State monad ...
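A minimal sketch of threading the generator through State (using the Control.Monad.State import above; state :: (s -> (a, s)) -> State s a fits rand' directly):
randPair :: State Generator (Tensor, Tensor)
randPair = do
  t1 <- state (rand' [2, 3])
  t2 <- state (rand' [2, 3])
  return (t1, t2)
-- evalState randPair rng0 threads the generator for you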
There are two main functions in Torch.Autograd
makeIndependent :: Tensor -> IO (IndependentTensor)
newtype IndependentTensor = IndependentTensor { toDependent :: Tensor }
This is just a newtype wrapper, but by using the IO function makeIndependent we implicitly mark the corresponding array in libtorch as differentiable and hook it into a compute graph that we can take grad on. Think of IndependentTensors as the free variables you compose functions from and take partial derivatives with respect to.
The grad
function is used to compute the gradient of a (composed) tensor w.r.t. some independent tensors.
grad :: Tensor
-- ^ a tensor that requires gradient (requiresGrad = True)
-- this tensor is a function of the independent tensors
-> [IndependentTensor]
-- ^ the "free variables" that the tensor depends on
-> [Tensor]
-- ^ gradient of the tensor w.r.t. each of the free variables
-- evaluated at the current value of the independent tensors
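A minimal sketch of taking a gradient (assuming Torch is in scope; gradExample is just an illustrative name):
gradExample :: IO ()
gradExample = do
  x <- makeIndependent (asTensor ([1, 2, 3] :: [Float]))
  let y = sumAll (toDependent x * toDependent x)   -- y = sum of x_i^2
  print (grad y [x])                               -- expect 2*x = [2.0, 4.0, 6.0]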
class Parameterized f where
  flattenParameters :: f -> [Parameter]
  -- type Parameter = IndependentTensor
  default flattenParameters :: (Generic f, Parameterized' (Rep f))
                            => f -> [Parameter]
  flattenParameters = flattenParameters' . from
  -- recall that from :: a -> Rep a is the unit
  --             to   :: Rep a -> a is the counit
  replaceOwnParameters :: f -> ParamStream f
  -- type ParamStream a = State [Parameter] a
The use of generics allows automatic generation of the flattenParameters
function for any type that is an instance of Generic
and avoids the need to write boilerplate code for each type.
The generic machinery derives instances for tensors, containers of tensors, other types built on top of tensors, and so on.
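For example, a sketch of a hypothetical two-layer model (assuming DeriveGeneric/DeriveAnyClass and the generic defaults in Torch.NN):
data MLP = MLP { layer1 :: Linear, layer2 :: Linear }
  deriving (Generic, Parameterized)
-- flattenParameters (MLP l1 l2) collects the parameters of both Linear layers;
-- no hand-written instance is needed.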
I rewrote the example to use my own RST monad, which conveniently handles the stateful and reader-like parts of the computation.
module Main where
import Control.Monad.RST
import Control.Monad
import Torch
groundTruth :: Tensor -> Tensor
groundTruth t = squeezeAll $ matmul t a + b
  -- squeezeAll removes redundant dimensions after a contraction of tensors
  where
    a = asTensor [1, 2, 3 :: Float]
    b = full' [1] (5 :: Float)

linearModel :: Linear  -- ^ represents a linear layer, implements Parameterized
            -> Tensor -> Tensor
linearModel a x = squeezeAll $ linear a x

randnM' :: (Monad m) => [Int] -> RST '[] '[Generator] m Tensor
randnM' dims = do
  gen <- get
  let (t, gen') = randn' dims gen
  put gen'
  return t

runStepM :: (Parameterized model, Optimizer optim)
         => Loss -> RST '[LearningRate] '[model, optim] IO ()
runStepM loss = do
  model <- getsE EZ
  optim <- getsE (ES EZ)
  learn <- queriesE EZ
  (model', optim') <- liftIO $ runStep model optim loss learn
  putsE EZ model'
  putsE (ES EZ) optim'

train :: RST '[LearningRate] '[Linear, GD, Generator, Int] IO ()
train = do
  model <- getsE EZ
  input <- embedRST $ randnM' [5, 3]
  count <- modifyThenGet @Int (+1)
  let loss = mseLoss (groundTruth input) (linearModel model input)
  when (count `mod` 100 == 0) $ liftIO $ putStrLn $ "train Loss:" <> show loss
  embedRST $ runStepM @Linear @GD loss

main :: IO ()
main = do
  initModel <- sample $ LinearSpec { in_features = 3, out_features = 1 }
  randGen <- mkGenerator (Device CPU 0) 99
  let learningRate = 5e-3 :: LearningRate
  (_, model' :* _) <- runRST (replicateM 2000 train)
                             (learningRate :* Nil)
                             (initModel :* GD :* randGen :* 0 :* Nil)
  print model'