As a ‘functional’ programming language, how can we not have functions? We will introduce more commonly used basic functions, as well as operations on basic functions!
addXY :: (Num a) => a -> a -> a
-- addXY x y :: (Num a) => a
addXY x y = (+) x y

-- addXY x :: (Num a) => a -> a
addXY x = (+) x
-- y can be removed, as functions of type (a -> a), addXY x and (+) x are equal
addXY = (+) -- a -> a -> a
-- well, by the same spirit, x can also be removed! as functions of type (a -> a -> a)
-- they are equal!
When we write the definition f x = g x, and x does not occur in g, x can be eliminated, resulting in f = g. This process is called eta reduction. Eta reduction can help us write more abstract and concise code. (In some cases, using eta reduction can also make the compiler notice more optimizations, improving code reuse and efficiency.)
Let’s look at more examples:
getFirsts :: [(a, b)] -> [a]
getFirsts list = map fst list

-- --> reduction: remove the 'list' at both ends
getFirsts :: [(a, b)] -> [a]
getFirsts = map fst

showLength :: [a] -> String
showLength list = "The length of the list is " ++ show (length list)

showLength :: [a] -> String
showLength = ("The length of the list is " ++) . show . length
A lambda abstraction lets you write a nameless function exactly where it is needed, usually as an argument to another function:
(\x -> 2*x) 3 = 2*3 -- lambda expression with one variable

addXY = \x y -> x + y -- lambda expression with two variables

\x1 x2 x3 ... xn -> f x1 x2 x3 ... xn -- you can have arbitrarily many variables
Lambda abstraction can also be seen as the inverse operation of eta-reduction:
getFirsts = map fst

-- eta expansion:
getFirsts = \list -> map fst list -- they are equal as functions
-- well, in that case, just write
-- getFirsts list = map fst list
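To see that these spellings really denote the same function, here is a small sketch. The names `getFirstsA`, `getFirstsB`, `getFirstsC` and `samplePairs` are made up for this illustration, so that all three versions can live in one file.

```Haskell
-- Three spellings of the same function (hypothetical names).
getFirstsA :: [(a, b)] -> [a]
getFirstsA list = map fst list        -- explicit argument

getFirstsB :: [(a, b)] -> [a]
getFirstsB = \list -> map fst list    -- lambda (eta-expanded) form

getFirstsC :: [(a, b)] -> [a]
getFirstsC = map fst                  -- eta-reduced form

-- Sample data, an assumption for the demonstration.
samplePairs :: [(Int, String)]
samplePairs = [(1, "one"), (2, "two"), (3, "three")]
```

Applying any of the three to `samplePairs` yields `[1,2,3]`.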
Haskell has no loop constructs. Anything you would write as a loop can be implemented with recursion, so in functional programming recursion is as common as eating and walking.
import Text.Read (readMaybe)
import Data.Ord
main :: IO ()
main = do
    putStrLn "Guess the number:"
    guessed <- getLine
    case (compare secretNumber) <$> (readMaybe guessed) of -- monad magic, please ignore for now
        Just LT -> putStrLn "Too large!" >> main -- recursion happens!
        Just EQ -> putStrLn "You win!" -- the program ends here, no more recursion
        Just GT -> putStrLn "Too small!" >> main -- recursion happens!
        Nothing -> putStrLn "Wrong format!" >> main -- recursion happens!
  where secretNumber = 88
-- ignore the monad magics for now owo, we will explain them next class!
-- getLine :: IO String -- feed in user input
-- (>>) :: IO a -> IO b -> IO b -- sequencing events in a monad
-- do notation: a syntax sugar form of a monad binding operator (>>=)
Recursion must have a termination condition, unless you intend to write an infinite loop.
-- this function pattern matches on the input list
-- if the input is empty, it returns empty
-- if the input is non-empty, recursion happens and it calls itself!
quickSort :: (Ord a) => [a] -> [a]
quickSort [] = [] -- base case: without this line the recursion has nowhere to stop (in fact, you get a runtime pattern-match error)
quickSort (x:xs) = quickSort [y | y <- xs, y < x] ++ [x] ++ quickSort [y | y <- xs, y >= x]
An example of an infinite loop:
repeat :: a -> [a] -- this is a standard function
repeat x = x : repeat x -- infinite list of x's
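Because Haskell is lazy, such an infinite list is harmless as long as only a finite part of it is demanded. A minimal sketch: `myRepeat` and `threeSevens` are made-up names, and `myRepeat` only exists to avoid clashing with the Prelude's `repeat`.

```Haskell
-- Same definition as repeat, under a hypothetical name.
myRepeat :: a -> [a]
myRepeat x = x : myRepeat x

-- take only forces the first three cells, so this terminates.
threeSevens :: [Int]
threeSevens = take 3 (myRepeat 7)   -- [7,7,7]
```

Asking for the whole list, for example `length (myRepeat 7)`, would never finish.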
Let’s look at a simple example, calculating 1+2+…+100
sum [1..100] :: Int -- you can use list, of course
What if we don’t use the sum function? Or how do we write our own sum function?
```Haskell
mySum :: (Num a) => [a] -> a
mySum [] = 0
mySum (x:xs) = x + mySum xs -- this works

mySum [1,3,4] = mySum (1:[3,4])
              = 1 + mySum [3,4] -- use the definition, repeatedly
              = 1 + 3 + mySum [4]
              = 1 + 3 + 4 + mySum []
              = 1 + 3 + 4 + 0
```
But...
```Haskell
GHCi, version 9.6.3: https://www.haskell.org/ghc/  :? for help
ghci> :{
ghci| mySum [] = 0
ghci| mySum (x:xs) = x + mySum xs
ghci| :}
ghci> mySum [1..10000000]
50000005000000
ghci> mySum [1..100000000]
*** Exception: stack overflow
ghci> sum [1..100000000]
5000000050000000
ghci>
```
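What happened? Why is mySum different from sum? mySum cannot perform the addition x + mySum xs until the recursive call returns, so nothing is added up along the way: evaluating 1 + (2 + (3 + … + (100000000 + 0))) piles up one pending addition per list element before any of them is carried out, and the stack overflows.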
-- the right way
mySum2 = sum2 0 -- eta reduction happens here (mySum2 list = sum2 0 list)
  where sum2 !acc [] = acc
        sum2 !acc (x:xs) = sum2 (acc+x) xs
-- when sum2 traverses the list, it accumulates all the list elements in acc
-- ! is the Bang pattern, it forces evaluation, avoids laziness in the variable acc
-- without the (!), evaluating it on [1..100000000] still gives you a stack overflow
-- functions and 'variables' defined in 'where' clause are local
-- they will not affect other functions.
-- advanced tip : ! will only force evaluation to weak head normal form (WHNF).
-- if you need to fully evaluate a complex data structure, use deepseq or force (Control.DeepSeq);
-- rdeepseq is the corresponding Strategy in Control.Parallel.Strategies
Effect:
ghci> :{
ghci| mySum2 = sum2 0
ghci|   where sum2 !acc [] = acc
ghci|         sum2 !acc (x:xs) = sum2 (acc+x) xs
ghci| :}
ghci> mySum2 [1..100000000]
5000000050000000
A function is tail recursive when the recursive call is the very last thing it does, so the result of that call is returned directly with nothing left to compute afterwards, as in sum2. (The ! is called a bang pattern: it triggers eager evaluation and avoids laziness in the variable acc.)
Although tail recursion is still recursion, the compiler optimizes it into a loop during compilation.
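The difference between an ordinary recursive call and a tail call can be sketched with factorial. The names `factNaive` and `factTail` are made up for this illustration; the `BangPatterns` pragma is spelled out in case the file is not compiled under GHC2021.

```Haskell
{-# LANGUAGE BangPatterns #-}

-- Not tail recursive: after the recursive call returns,
-- a multiplication by n is still pending.
factNaive :: Integer -> Integer
factNaive n
  | n <= 1    = 1
  | otherwise = n * factNaive (n - 1)

-- Tail recursive: the call to go is the whole result,
-- and the running product lives in the strict accumulator acc.
factTail :: Integer -> Integer
factTail = go 1
  where
    go !acc n
      | n <= 1    = acc
      | otherwise = go (acc * n) (n - 1)
```

Both compute the same values, for example `factTail 5` is `120`; only the second one carries its intermediate result in an argument instead of on the stack.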
fib :: Int -> Integer
fib 1 = 1
fib 2 = 1
fib n = fib (n-1) + fib (n-2) -- the naive version
-- computing time is exponential and this wastes a lot of computation
-- fib 5 = fib 4 + fib 3
--       = (fib 3 + fib 2) + (fib 2 + fib 1) -- wastes the information about fib 3
--       = ((fib 2 + fib 1) + 1) + (1 + 1)
--       = ((1 + 1) + 1) + (1 + 1)
--          ^^^^^^^         ^^^^^^^ redundant computation! should only be computed once
The function fib calculates the Fibonacci number at a given index using naive recursion. The computation time is exponential and results in a lot of wasted computation. For example, when calculating fib 5, the function will compute fib 3 twice, leading to redundant computation.
fib2 :: Int -> Integer
fib2 n = fibT 0 1 1
  where fibT previous current m
          | m >= n    = current -- reached the requested index (a bare pattern n would match anything)
          | otherwise = fibT current (previous + current) (m+1)
This code uses tail recursion to calculate the Fibonacci sequence. It starts with the first two numbers, 0 and 1, and then uses a recursive function to calculate the next number by adding the previous two numbers. This process continues until the desired number in the sequence is reached.
fibList = 0 : 1 : zipWith (+) fibList (drop 1 fibList) -- infinite list!
-- 0 : 1 : (0  : 1  : a2 : ...)
--         (+    +    +        )
--         (1  : a2 : a3 : ...)
--          ||   ||   ||   ||
-- 0 : 1 :  a2 : a3 : ...
(!!) :: [a] -> Int -> a -- get the i-th term of a list (0-based), O(i) time since it is a linked list
-- this is a standard function, but we kindly provide its definition for you
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith operator (x:xs) (y:ys) = operator x y : zipWith operator xs ys
zipWith operator _ _ = []
-- zipWith f [1,2,3] [4,5,6] = [f 1 4, f 2 5, f 3 6]
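With zipWith in hand, the infinite fibList can be used like any other list, as long as only a finite prefix is demanded. A minimal sketch, relying on the Prelude's `zipWith`; `firstTen` and `fibAt` are names made up for this example.

```Haskell
fibList :: [Integer]
fibList = 0 : 1 : zipWith (+) fibList (drop 1 fibList)

-- Only the first ten cells are ever computed.
firstTen :: [Integer]
firstTen = take 10 fibList   -- [0,1,1,2,3,5,8,13,21,34]

-- Index into the list; 0-based, so fibAt 1 and fibAt 2 are both 1.
fibAt :: Int -> Integer
fibAt i = fibList !! i
```

Each element of `fibList` is computed once and then shared, which is what removes the redundant work of the naive `fib`.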
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs -- recursion!
map (*2) [1..5] = [2,4,6,8,10]
-- foldl f z0 [x1,...,xn] = (...(((z0 `f` x1) `f` x2) `f` x3) ...) `f` xn
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z0 [] = z0
foldl f z0 (x:xs) = foldl f (z0 `f` x) xs
-- foldl' is the strict version of foldl
foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' f !z0 [] = z0
foldl' f !z0 (x:xs) = foldl' f (z0 `f` x) xs
-- in fact,
-- sum = foldl' (+) 0
-- product = foldl' (*) 1
-- foldr f z0 [x1,...,xn] = x1 `f` (x2 `f` (x3 `f` (... `f` (xn `f` z0))))
foldr :: (b -> a -> a) -> a -> [b] -> a
foldr f z0 [] = z0
foldr f z0 (x:xs) = x `f` foldr f z0 xs
-- efficient if f is lazy at right argument
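The bracketing in the two comments above matters as soon as the operator is not associative. A small sketch using the Prelude's `foldl` and `foldr`; the names `leftResult`, `rightResult` and `allSmall` are made up for this illustration.

```Haskell
-- Left fold: ((10 - 1) - 2) - 3
leftResult :: Int
leftResult = foldl (-) 10 [1, 2, 3]    -- 4

-- Right fold: 1 - (2 - (3 - 10))
rightResult :: Int
rightResult = foldr (-) 10 [1, 2, 3]   -- -8

-- foldr with an operator that is lazy in its right argument:
-- (&&) ignores the rest once it sees False, so this terminates
-- even though the input list is infinite.
allSmall :: Bool
allSmall = foldr (\x rest -> x < 5 && rest) True [1 :: Int ..]   -- False
```

The last definition would not terminate with `foldl`, which has to walk to the end of the list before it can return anything.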
Exercise: What is this function? Without running the code, work out its type and what it does.
foldr (:) []
This function has type [a] -> [a]: it takes a list and returns the same list.
filter :: (a -> Bool) -> [a] -> [a]
filter condition [] = []
filter condition (x:xs) = case condition x of
  True -> x : filter condition xs
  False -> filter condition xs
filterNonZero = filter (/= 0) -- hard exercise: what is the type of this function?
The filter function takes a condition and a list and returns a new list containing only the elements that satisfy the condition. If the list is empty, it returns an empty list. Otherwise, it checks the condition for the first element and adds it to the new list if it satisfies the condition. Then, it recursively calls itself on the rest of the list. The filterNonZero function uses filter with the condition "not equal to 0", and its type is (Num a, Eq a) => [a] -> [a].
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x:(xs ++ ys)

concat :: [[a]] -> [a]
concat = foldr (++) [] -- you should not use foldl, because foldr behaves better with laziness
-- foldr is very efficient if the operator is lazy on its right argument
concat [[1],[2,3],[4,5,6],[]] = [1,2,3,4,5,6]
Exercise: Verify that filter can be defined using concat:
filter condition = concat . map (\x -> if condition x then [x] else [])
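One way to gain confidence before attempting a proof is to compare the two definitions on sample input. The sketch below is only a spot check, not a verification; `filterViaConcat` and `agreesOnSample` are made-up names, chosen so nothing clashes with the Prelude's `filter`.

```Haskell
-- The definition from the exercise, under a hypothetical name.
filterViaConcat :: (a -> Bool) -> [a] -> [a]
filterViaConcat condition = concat . map (\x -> if condition x then [x] else [])

-- Compare against the Prelude's filter on one sample list.
agreesOnSample :: Bool
agreesOnSample = filterViaConcat even sample == filter even sample
  where sample = [1 .. 20] :: [Int]
```

Every element becomes either a one-element list or an empty list, and `concat` glues the survivors back together in order.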
newtype is a keyword similar to data, used to create new types. It can take any number of type variables, but unlike data, a newtype must have exactly one constructor with exactly one field. This essentially creates a copy of the wrapped type: it is represented in memory exactly like the original type, but the compiler treats it as a different type in the type system.
This gives newtype the following characteristics:
Newtype is a zero-cost abstraction, meaning it does not add any additional performance overhead. After compilation, it behaves the same as the type it copies.
Newtype can be used to prevent confusion between similar types, increasing readability and type safety.
getUserInput :: IO String
getUserInput = getLine

filterUserInput :: String -> String
filterUserInput string = filterFunction string

useUserInput :: String -> IO ()
useUserInput str = (...)
-- bad design, what are these strings? you might accidentally mix them
getUserInput >>= useUserInput -- compiles without problem, but this is not what we want
-- ignore the monad and functor magics for now.
-- If you are curious, think (>>=) :: IO a -> (a -> IO b) -> IO b
-- (In fact, (>>=) :: Monad m => m a -> (a -> m b) -> m b , here m = IO)
newtype UserInput = UserInput String
newtype FilteredString = FilteredString String
getUserInput :: IO UserInput
getUserInput = UserInput <$> getLine

filterUserInput :: UserInput -> FilteredString -- much more clear what it does
filterUserInput (UserInput string) = FilteredString (filterFunction string)

useUserInput :: FilteredString -> IO () -- much more clear that you need to do filtering before use
useUserInput (FilteredString str) = (...)
-- this will prevent anyone from passing UserInput directly to useUserInput without filtering
getUserInput >>= useUserInput -- type error : UserInput is not FilteredString
-- ignore the monad and functor magics (>>=, <$>) for now.
-- If you are curious, (<$>) :: (a -> b) -> IO a -> IO b
-- in fact, (<$>) :: (Functor f) => (a -> b) -> f a -> f b , here f = IO
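To make the pattern concrete, here is a runnable sketch of the same design. The text leaves `filterFunction` and the body of `useUserInput` unspecified, so the versions below are assumptions made purely for illustration (keep letters and digits, then echo the result); `processLine` is a made-up helper as well.

```Haskell
import Data.Char (isAlphaNum)

newtype UserInput = UserInput String
newtype FilteredString = FilteredString String

-- Hypothetical filter: keep only letters and digits.
filterFunction :: String -> String
filterFunction = filter isAlphaNum

filterUserInput :: UserInput -> FilteredString
filterUserInput (UserInput string) = FilteredString (filterFunction string)

-- Hypothetical consumer: just echo the filtered text.
useUserInput :: FilteredString -> IO ()
useUserInput (FilteredString str) = putStrLn ("Using: " ++ str)

-- The only route from raw text to useUserInput goes through the filter.
processLine :: String -> IO ()
processLine raw = useUserInput (filterUserInput (UserInput raw))
```

Writing `useUserInput (UserInput raw)` instead is rejected at compile time, which is the point of the two wrappers.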
When a data type has exactly one constructor with exactly one field, it is better to use newtype instead.
data Name = Name String -- why not use newtype
newtype Name = Name String
data: Enum, and (coproduct)
type Radius = Double
type Height = Double
type Width = Double -- types are just synonyms, they are identified by the compiler
-- i.e. Radius = Height = Width = Double as types
-- they do not provide extra type safety, but they provide clarity
-- | Shape is the type constructor
data Shape = Circle Radius | Rectangle Width Height
-- Circle :: Radius -> Shape
-- Rectangle :: Width -> Height -> Shape
-- these are data constructors
Circle 5 :: Shape
Rectangle 3 4 :: Shape
area :: Shape -> Double
area (Circle r) = pi * r**2
area (Rectangle x y) = x * y
-- enumerated data constructors can be easily pattern-matched
Maybe is a standard type in Haskell that is used to safely express information that may not exist or computations that may fail.
data Maybe a = Just a | Nothing -- you don't have to define this, this is already defined
-- Just :: a -> Maybe a
-- Nothing :: Maybe a -- requires no field
safeSqrt :: Double -> Maybe Double
safeSqrt x = if x >= 0 then Just (sqrt x) else Nothing
-- failure is explicit in the type (plain sqrt would quietly return NaN for negative input)
In general, you can use any number of type variables.
data InterestingData a b c d = InterestingData a | WhatIsThis b c | Nothing
-- d is a 'phantom' type variable here: it only exists at the type level,
-- no value corresponds to it. This is valid
newtype MyTaggedType a b = CreateMyTaggedType b
-- a is a phantom variable. Remember that newtype's one-field restriction applies to the constructor,
-- not to the number of type variables on the left side of the type definition.
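A phantom variable is useful as a compile-time tag. The sketch below is one possible use, with every name (`Meters`, `Feet`, `Distance`, `addDistance`, `feetToMeters`, `getDistance`, `walk`) invented for this example rather than taken from any library.

```Haskell
-- Empty data types, used only as type-level tags.
data Meters
data Feet

-- 'unit' is phantom: it does not appear on the right-hand side.
newtype Distance unit = Distance Double

-- Both arguments must carry the same tag.
addDistance :: Distance unit -> Distance unit -> Distance unit
addDistance (Distance x) (Distance y) = Distance (x + y)

-- Changing the tag requires an explicit conversion.
feetToMeters :: Distance Feet -> Distance Meters
feetToMeters (Distance ft) = Distance (ft * 0.3048)

getDistance :: Distance unit -> Double
getDistance (Distance d) = d

-- 100 m plus 10 ft, converted before adding.
walk :: Distance Meters
walk = addDistance (Distance 100) (feetToMeters (Distance 10))
```

Adding a `Distance Meters` to a `Distance Feet` directly is a type error, even though both are just a `Double` at runtime.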
In fact, the list [a] is a data type.
data [a] = [] | a:[a]
[] :: [a] -- empty list
(:) :: a -> [a] -> [a] -- list constructor
-- 1:2:3:[] = 1:(2:(3:[])) = [1,2,3]
However, [] is already predefined, so the above code is not usable. We can use a custom List instead:
data List a = Nil | Cons a (List a)
Nil :: List a -- is the empty list
Cons :: a -> List a -> List a -- list constructor
Cons 1 (Cons 2 (Cons 3 Nil)) -- a list with three terms. Compare it with 1:2:3:[]
myLength :: List a -> Int
myLength Nil = 0
myLength (Cons _ xs) = 1 + myLength xs
-- this code is acceptable, but for best performance you should write it in tail recursion
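A tail-recursive variant could look like the following sketch. `myLengthTail`, the helper `go` and `exampleList` are made-up names, and the `List` type is repeated so that the block stands on its own.

```Haskell
{-# LANGUAGE BangPatterns #-}

data List a = Nil | Cons a (List a)

-- The count so far is carried in a strict accumulator,
-- so no additions are left pending on the stack.
myLengthTail :: List a -> Int
myLengthTail = go 0
  where
    go !acc Nil         = acc
    go !acc (Cons _ xs) = go (acc + 1) xs

exampleList :: List Int
exampleList = Cons 1 (Cons 2 (Cons 3 Nil))
```

Here `myLengthTail exampleList` evaluates to `3`, the same answer as the direct recursion.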
(Demonstration in browser) hoogle.haskell.org
Eta reduction can be used to simplify definitions, allowing for more abstract code.
Recursion is the workhorse in functional programming.
Tail recursion is a special form of recursion.
Lambda expressions can create functions without giving them a name.
Functions such as map, foldl, foldr, foldl’, foldr’, filter, and concat are commonly used.
The keyword newtype can create distinct type copies, while type is used to create synonyms.
The keyword data can create complex data types, allowing for a mix of sums and products.
Exercise: try to eta-reduce the following functions, eliminating their input variables.
```Haskell
removeOdds :: [Int] -> [Int]
removeOdds list = filter even list

doubleSum :: [Int] -> Int
doubleSum list = sum $ map (*2) list

apply :: (a -> b) -> a -> b
apply f x = f x -- you can eliminate them all, not just x
```
Exercise: What is this function? Without running the code, work out its type and what it does.
foldr (:) []
Exercise: Verify that filter can be defined using concat:
filter condition = concat . map (\x -> if condition x then [x] else [])
Prove that List a and [a] are isomorphic. That is, write a function to convert List a to [a] and a function from [a] to List a. Their composition should be id.
toList :: [a] -> List a
toHaskellList :: List a -> [a]
Can you write a function to chain multiple possibly failing computations? For example, computing sqrt(1-sqrt(x)).
sequenceMaybe :: (a -> Maybe b) -> (b -> Maybe c) -> a -> Maybe c
sequenceMaybe f g x = case f x of
  Just y -> g y
  Nothing -> Nothing

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x >= 0 = Just (sqrt x)
  | otherwise = Nothing
Consider the following definition of a binary tree type:
```Haskell
data Tree a = Nil | Tree a (Tree a) (Tree a) deriving Show

-- for example, you can have
exampleTree = Tree 2 (Tree 3 Nil Nil) (Tree 1 Nil Nil) :: Tree Int
--   2
-- 3   1

-- Task: try to implement the mapTree function
-- that maps every element inside the tree by using a given function.
mapTree :: (a -> b) -> Tree a -> Tree b
mapTree = undefined

main = print $ mapTree (*2) exampleTree
```
Use recursion to implement merge sort.
mergeSort :: (Ord a) => [a] -> [a]
mergeSort [] = []
mergeSort [x] = [x]
mergeSort list = merge (mergeSort part1) (mergeSort part2)
  where (part1, part2) = splitEvenly list

merge (x:xs) (y:ys)
  | x <= y = x:merge xs (y:ys)
  | otherwise = y:merge (x:xs) ys
merge [] ys = ys
merge xs [] = xs

splitEvenly (x0:x1:xs) = (x0:rest1, x1:rest2)
  where (rest1, rest2) = splitEvenly xs
splitEvenly [x] = ([x], [])
splitEvenly [] = ([], [])