As a ‘functional’ programming language, how can we not have functions? We will introduce more commonly used basic functions, as well as operations on basic functions!
addXY :: (Num a) => a -> a -> a
-- addXY x y :: (Num a) => a
addXY x y = (+) x y
-- addXY x :: (Num a) => a -> a
addXY x = (+) x
-- y can be removed, as functions of type (a -> a), addXY x and (+) x are equal
addXY = (+) -- a -> a -> a
-- well, by the same spirit, x can also be removed! as functions of type (a -> a -> a)
-- they are equal!When we write the definition f x = g x, x can be eliminated, resulting in f = g. This process is called Eta reduction. Eta reduction can help us write more abstract and concise code. (In some cases, using eta reduction can also make the compiler notice more optimizations, improving code reuse and efficiency.)
Let’s look at more examples:
getFirsts :: [(a, b)] -> [a]
getFirsts list = map fst list
-- --> reduction: remove the 'list' at both ends
getFirsts :: [(a, b)] -> [a]
getFirsts = map fstshowLength :: [a] -> String
showLength list = "The length of the list is " ++ show (length list)
showLength :: [a] -> String
showLength = ("The length of the list is " ++) . show . length Lambda abstraction can be seen as the reverse operation of Eta-reduction: Eta-expansion, which “adds” a variable to the expression after it, allowing you to use a nameless function where needed, usually as a parameter to another function:
(\x -> 2*x) 3 = 2*3 -- lambda expression with one variable
addXY = \x y -> x + y -- lambda expression with two variables
\x1 x2 x3 ... xn -> f x1 x2 x3 .. xn -- you can have arbitrary many variablesLambda abstraction can also be seen as the inverse operation of eta-reduction:
getFirsts = map fst
-- eta expansion:
getFirsts = \list -> map fst list -- they are equal as functions
-- well, in that case, just write
-- getFirsts list = map fst listIn Haskell, there is no such thing as loops. For functions, all loops can be implemented using recursion. In functional programming, recursion is as common as eating and walking.
import Text.Read (readMaybe)
import Data.Ord
main :: IO ()
main = do
putStrLn "Guess the number:"
guessed <- getLine
case (compare secretNumber) <$> (readMaybe guessed) of -- monad magic, please ignore for now
Just LT -> putStrLn "Too large!" >> main -- recursion happens!
Just EQ -> putStrLn "You win!" -- the program ends here, no more recursion
Just GT -> putStrLn "Too small!" >> main -- recursion happens!
Nothing -> putStrLn "Wrong format!" >> main -- recursion happens!
where secretNumber = 88
-- ignore the monad magics for now owo, we will explain them next class!
-- getLine :: IO String -- feed in user input
-- (>>) :: IO a -> IO b -> IO b -- sequencing events in a monad
-- do notation: a syntax sugar form of a monad binding operator (>>=)Recursion must have a termination condition, unless you intend to write an infinite loop.
-- this function pattern matches on the input list
-- if the input is empty, it returns empty
-- if the input is non-empty, recursion happens and it calls itself!
quickSort :: (Ord a) => [a] -> [a]
quickSort [] = [] -- without this line, the recursion will loop forever (in fact, runtime error)
quickSort (x:xs) = quickSort [y | y <- xs, y < x] ++ [x] ++ quickSort [y | y <- xs, y >= x]An example of an infinite loop:
repeat :: a -> [a] -- this is a standard function
repeat x = x : repeat x -- infinite list of x'sLet’s look at a simple example, calculating 1+2+…+100
sum [1..100] :: Int -- you can use list, of courseWhat if we don’t use the sum function? Or how do we write our own sum function?
mySum :: (Num a) => [a] -> a
mySum [] = 0
mySum (x:xs) = x + mySum xs -- this works
mySym [1,3,4] = mySum 1:[3,4]
= 1 + mySum [3,4] -- use the definition, repeatedly
= 1 + 3 + mySum [4]
= 1 + 3 + 4 + mySum []
= 1 + 3 + 4 + 0
```
But...
```Haskell
GHCi, version 9.6.3: https://www.haskell.org/ghc/ :? for help
ghci> :{
ghci| mySum [] = 0
ghci| mySum (x:xs) = x + mySum xs
ghci| :}
ghci> mySum [1..10000000]
50000005000000
ghci> mySum [1..100000000]
*** Exception: stack overflow
ghci> sum [1..100000000]
5000000050000000
ghci> What happened? why is mySum different from sum? Because of lazy evaluation, the expression 1+2+3+4+…+100000000+0 is not evaluated immediately, causing a stack overflow.
-- the right way
mySum2 = sum2 0 -- eta reduction happens here (mySum2 list = sum2 0 list)
where sum2 !acc [] = acc
sum2 !acc (x:xs) = sum2 (acc+x) xs
-- when sum2 traverses the list, it accumulates all the list elements in acc
-- ! is the Bang pattern, it forces evaluation, avoids laziness in the variable acc
-- without the (!), evaluating it on [1..100000000] still gives you a stack overflow
-- functions and 'variables' defined in 'where' clause are local
-- they will not affect other functions.
-- advanced tip : ! will only force evaluation to weak head normal form (WHNF).
-- if you need to fully evaluate a complex data structure, you need rdeepseqEffect:
ghci> :{
ghci| mySum2 = sum2 0
ghci| where sum2 !acc [] = acc
ghci| sum2 !acc (x:xs) = sum2 (acc+x) xs
ghci| :}
ghci> mySum2 [1..100000000]
5000000050000000This technique, where a recursion is called directly by the function body itself, is called tail recursion. (the ! is called bang pattern, it triggers eager evaluation, avoids laziness in this variable acc).
Although tail recursion is a type of recursion, it will be optimized by the compiler into a loop during compilation.
fib :: Int -> Integer
fib 1 = 1
fib 2 = 1
fib n = fib (n-1) + fib (n-2) -- the naive version
-- computing time is exponential and this wastes a lot of computation
-- fib 5 = fib 4 + fib 3
-- = (fib 3 + fib 2) + (fib 2 + fib 1) -- wastes the information about fib 3
-- = ((fib 2 + fib 1) + 1) + (1 + 1)
-- = ((1 + 1) + 1) + (1 + 1)
-- ^^^^^^^ ^^^^^^^ redundant computation! should only be computed onceThe function fib calculates the Fibonacci number at a given index using naive recursion. The computation time is exponential and results in a lot of wasted computation. For example, when calculating fib 5, the function will compute fib 3 twice, leading to redundant computation.
fib2 n = fibT 0 1 1
where fibT previous current n = current
fibT previous current m = fibT current (previous + current) (m+1)This code uses tail recursion to calculate the Fibonacci sequence. It starts with the first two numbers, 0 and 1, and then uses a recursive function to calculate the next number by adding the previous two numbers. This process continues until the desired number in the sequence is reached.
fibList = 0 : 1 : zipWith (+) fibList (drop 1 fibList) -- infinite list!
-- 0 : 1 : (0 : 1 : a2 : ...)
-- (+ + + )
-- (1 : a2 : a3 : ...)
-- || || || ||
-- 0 : 1 : a2 : a3 : ...
!! :: [a] -> Int -> a -- get the i-th term of a list, O(i) time since linked list
-- this is a standard function, but we kindly provide its definition for you
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith operator (x:xs) (y:ys) = operator x y : zipWith xs ys
zipWith operator _ _ = []
-- zipWith f [1,2,3] [4,5,6] = [f 1 4, f 2 5, f 3 6]map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs -- recursion!
map (*2) [1..5] = [2,4,6,8,10]-- foldl f z0 [x1,...,xn] = ((((z0 `f` x1) `f` x2) `f` x3) `f`) ... `f` xn
foldl :: (a -> b -> b) -> a -> [b] -> b
foldl f z0 [] = z0
foldl f z0 (x:xs) = foldl f (z0 `f` x) xs
-- foldl' is the strict version of foldl
foldl' :: (a -> b -> b) -> a -> [b] -> b
foldl' f !z0 [] = z0
foldl' f !z0 (x:xs) = foldl' f (z0 `f` x) xs
-- in fact,
-- sum = foldl' (+) 0
-- product = foldl' (*) 1
-- foldr f z0 [x1,...,xn] = x1 `f` (x2 `f` (x3 `f` (... `f` (xn `f` z0))))
foldr :: (b -> a -> a) -> a -> [b] -> a
foldr f z0 [] = z0
foldr f z0 (x:xs) = foldl f (x `f` z0) xs
-- efficient if f is lazy at right argumentExercise: what is this function? without computing the code, figure it out its type and what does it do?
foldr (:) []This function takes a list and returns the same list.
filter :: (a -> Bool) -> [a] -> [a]
filter condition [] = []
filter condition (x:xs) = case condition x of
True -> x:filter condition xs
False -> filter condition xs
filterNonZero = filter (/= 0) -- hard exercise: what is the type of this function?The filter function takes in a condition and a list and returns a new list with only the elements that satisfy the condition. If the list is empty, it returns an empty list. Otherwise, it checks the condition for the first element and adds it to the new list if it satisfies the condition. Then, it recursively calls itself on the rest of the list. The filterNonZero function uses the filter function with the condition of not equal to 0, and its type is (Num a, Eq a) => [a] -> [a].
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x:(xs ++ ys)
concat :: [[a]] -> [a]
concat = foldr (++) [] -- you should not use foldl, because foldr behaves better with laziness
-- foldr is very efficient if the operator is lazy on its right argument
concat [[1],[2,3],[4,5,6],[]] = [1,2,3,4,5,6]Exercise: Verify that filter can be defined as concat:
filter condition = concat . map (\x -> if condition x then [x] else [])Newtype is a type keyword similar to data, used to create new types. It can have any number of type variables, but unlike data, newtype can only have one data type. This essentially creates a copy of the data type, which is represented in memory exactly the same as the original type, but is differentiated by the compiler in the type system.
This gives newtype the following characteristics:
Newtype is a zero-cost abstraction, meaning it does not add any additional performance overhead. After compilation, it behaves the same as the type it copies.
Newtype can be used to prevent confusion between similar types, increasing readability and type safety.
getUserInput :: IO String
getUserInput = getLine
filterUserInput :: String -> String
filterUserInput string = filterFunction string
useUserInput :: String -> IO ()
useUserInput str = (...)
-- bad design, what are these strings? you might accidentally mix them
getUserInput >>= useUserInput -- compiles without problem, but this is not what we want
-- ignore the monad and functor magics for now.
-- If you are curious, think (>>=) :: IO a -> (a -> IO b) -> IO b
-- (In fact, (>>=) :: Monad m => m a -> (a -> m b) -> m b , here m = IO)newtype UserInput = UserInput String
newtype FilteredString = FilteredString String
getUserInput :: IO UserInput
getUserInput = UserInput <$> getLine
filterUserInput :: UserInput -> FilteredString -- much more clear what it does
filterUserInput (UserInput string) = FilteredString (filterFunction string)
useUserInput :: FilteredString -> IO () -- much more clear that you need to do filtering before use
useUserInput (FilteredString str) = (...)
-- this will prevent anyone from passing UserInput directly to useUserInput without filtering
getUserInput >>= useUserInput -- type error : UserInput is not FilteredString
-- ignore the monad and functor magics (>>=, <$>) for now.
-- If you are curious, (<$>) :: (a -> b) -> IO a -> IO b
-- in fact, (<$>) :: (Functor f) => (a -> b) -> f a -> f b , here f = IOWhen your data only has one item, it is better to use newtype instead.
data Name = Name String -- why not use newtype
newtype Name = Name Stringdata Enum, and (coproduct)type Radius = Double
type Height = Double
type Width = Double -- types are just synonyms, they are identified by the compiler
-- i.e. Radius = Height = Width = Double as types
-- they do not provide extra type safety, but they provide clarity
--| Shape is the type constructor
data Shape = Circle Radius | Rectangle Width Height
--| Circle :: Radius -> Shape
--| these are data constructors
--| Rectangle :: Width -> Height -> Shape
--| these are data constructors
Circle 5 :: Shape
Rectangle 3 4 :: Shape
area :: Shape -> Double
area (Circle r) = pi * r**2
area (Rectangle x y) = x * y
-- enumerated data constructors can be easily pattern-matchedMaybe is a standard type in Haskell that is used to safely express information that may not exist or computations that may fail.
data Maybe a = Just a | Nothing -- you don't have to define this, this is already defined
-- Just :: a -> Maybe a
-- Nothing :: Maybe a -- requires no field
safeSqrt :: Double -> Maybe Double
safeSqrt x = if x >= 0 then Just (sqrt x) else Nothing
-- this function avoids runtime errors!In general, you can use any number of type variables.
data InterestingData a b c d = InterestingData a | WhatIsThis b c | Nothing
-- d is a 'phantom' type variables here, they only exists at type level,
-- no values corresponds to them. This is valid
newtype MyTaggedType a b = CreateMyTaggedType b
-- a is phantom variable. Remember newtype can only have one field on the constructor side,
-- not at the left side of type definition.In fact, the list [a] is a data type.
data [a] = [] | a:[a]
[] :: [a] -- empty list
(:) :: a -> [a] -> [a] -- list constructor
-- 1:2:3:[] = 1:(2:(3:[])) = [1,2,3]However, [] is already predefined, so the above code is not usable. We can use a custom List instead:
data List a = Nil | Cons a (List a)
Nil :: List a -- is the empty list
Cons :: a -> List a -> List a -- list constructor
Cons 1 (Cons 2 (Cons 3 Nil)) -- a list with three terms. Compare it with 1:2:3:[]
myLength :: List a -> Int
myLength Nil = 0
myLength (Cons _ xs) = 1 + myLength xs
-- this code is acceptable, but for best performance you should write it in tail recursion(Demonstration in browser) hoogle.haskell.org
Eta reduction can be used to simplify definitions, allowing for more abstract code.
Recursion is the workhorse in functional programming.
Tail recursion is a special form of recursion.
Lambda expressions can create functions without giving them a name.
Functions such as map, foldl, foldr, foldl’, foldr’, filter, and concat are commonly used.
The keyword newtype can create distinct type copies, while type is used to create synonyms.
The keyword data can create complex data types, allowing for a mix of sums and products.
Exercise: try to eta-reduce the following functions, eliminate their input variable ```Haskell removeOdds :: [Int] -> [Int] removeOdds list = filter odd list
doubleSum :: [Int] -> [Int] doubleSum list = sum $ map (*2) list
apply :: (a -> b) -> a -> b apply f x = f x – you can eliminate them all, not just x ```
Exercise: what is this function? without computing the code, figure it out its type and what does it do? Haskell foldr (:) []
Exercise: Verify that filter can be defined as concat: Haskell filter condition = concat . map (\x -> if condition x then [x] else [])
Prove that List a and [a] are isomorphic. That is, write a function to convert List a to [a] and a function from [a] to List a. Their composition should be id.
toList :: [a] -> List a
toHaskellList :: List a -> [a]Can you write a function to chain multiple possibly failing computations? For example, computing sqrt(1-sqrt(x)).
sequenceMaybe :: (a -> Maybe b) -> (b -> Maybe c) -> a -> Maybe c
sequenceMaybe f g x = case f x of
Just y -> g y
Nothing -> Nothing
safeSqrt :: Double -> Maybe Double
safeSqrt x
| x >= 0 = Just (sqrt x)
| otherwise = NothingConsider the following definition of a binary tree type: ```Haskell data Tree a = Nil | Tree a (Tree a) (Tree a) deriving Show
– for example, you can have exampleTree = Tree 2 (Tree 3 Nil Nil) (Tree 1 Nil Nil) :: Tree Int – 2 – 3 1
– Task: try to implement the mapTree function : – that maps every element inside the tree by using a given function. mapTree :: (a -> b) -> Tree a -> Tree b mapTree = undefined
main = print $ mapTree (*2) exampleTree ```
Use recursion to implement merge sort.
mergeSort :: (Ord a) => [a] -> [a]
mergeSort [] = []
mergeSort [x] = [x]
mergeSort list = merge (mergeSort part1) (mergeSort part2)
where (part1, part2) = splitEvenly list
merge (x:xs) (y:ys)
| x <= y = x:merge xs (y:ys)
| otherwise = y:merge (x:xs) ys
merge [] ys = ys
merge xs [] = xs
splitEvenly (x0:x1:xs) = (x0:rest1, x1:rest2)
where (rest1, rest2) = splitEvenly xs
splitEvenly [x] = ([x], [])
splitEvenly [] = ([], [])