Archives

All posts for the month August, 2014

At the end of my previous learn-Haskell post I was just finished using some lifters for monads for a log parser. The next part of the homework was to make a binary tree to hold the log messages. I think I made something similar in ML as part of Programming Languages on Coursera, and the types for the tree nodes were already defined, so this was fairly straightforward.

The main part to get right was inserting log messages in the correct order:

insert :: LogMessage -> MessageTree -> MessageTree
insert (Unknown _) t = t
insert m Leaf = Node Leaf m Leaf
insert _ (Node _ (Unknown _) _) = error "Invalid Node: a node must not contain Unknown"
insert m@(LogMessage _ t _) (Node l m2@(LogMessage _ t2 _) r) = case compare t t2 of
  LT -> insertOnLeft
  EQ -> insertOnLeft
  GT -> Node l m2 (insert m r)
  where insertOnLeft = Node (insert m l) m2 r

I could have done a few things differently – the error condition could be ignored, but it feels much better to include it so that I have a total function (as far as I understand function totality). I could also have taken a different approach to the last case, using an if then else such as if (t <= t2) then Node (insert m l) m2 r else Node l m2 (insert m r). This may be more readable, now that I think about it, since compare t t2 isn’t quite as intuitive about what is less than what. I could also probably have done something with guards, which might have been nicer.

One thing I would have liked to do is leave off the t in the first pattern, giving insert (Unknown _) = id, but this seems to make it incompatible with the other patterns.

As usual, I am sure there are some abstractions I could use over this code to make it simpler, but I am happy enough with this for now.

I ended up with a nice little function to build an ordered tree out of an unordered list of log messages:

build :: [LogMessage] -> MessageTree
build = build' Leaf
  where build' = foldr insert

Looking at that now, I think I mustn’t have sat back and looked at it after I simplified the where clause, because I could as easily have written build = foldr insert Leaf, which is faster to comprehend.

Converting the ordered tree back to a list was straightforward – I’m not totally confident about the performance characteristics, but I doubt that is worth worrying about at this stage of my learning.

inOrder :: MessageTree -> [LogMessage]
inOrder = (`inOrder'` [])
  where
    inOrder' Leaf lst = lst
    inOrder' (Node l m r) lst = inOrder' l (m : inOrder' r lst)

I have been working through the week 2 homework exercises, and had a bit of a rocky start. I got it working without much trouble, but I had a real mess of code:

-- parses individual line from the log file
parseMessage :: String -> LogMessage
parseMessage line = case words line of
  "I" : time : message -> case maybeRead time :: (Maybe Int) of
                            Just t -> LogMessage Info t (unwords message)
                            _ -> Unknown line
  "W" : time : message -> case maybeRead time :: (Maybe Int) of
                            Just t -> LogMessage Warning t (unwords message)
                            _ -> Unknown line
  "E" : severity : time : message -> case maybeRead severity :: (Maybe Int) of
                            Just s -> case maybeRead time :: (Maybe Int) of
                                        Just t -> LogMessage (Error s) t (unwords message)
                                        _ -> Unknown line
  _ -> Unknown line

This is awful – so full of duplication and really not pleasant to read. I couldn’t leave it like that, so I set about at least separating out the code for determining the message type. I ended up with something at least moderately readable:

parseMessage line = let (messageType, rest) = maybeGetMessageTypeAndRemainder line in
  case messageType of
    Nothing -> Unknown line
    Just aType -> case rest of
           [] -> Unknown line
           time : message -> case maybeRead time :: (Maybe Int) of
             Just t -> LogMessage aType t (unwords message)
             _ -> Unknown line

-- partition the message type and remainder of the string
maybeGetMessageTypeAndRemainder :: String -> (Maybe MessageType, [String])
maybeGetMessageTypeAndRemainder line = case words line of
  "I" : rest -> (Just Info, rest)
  "W" : rest -> (Just Warning, rest)
  "E" : severity : rest  -> case maybeRead severity :: (Maybe Int) of
                              Just s -> (Just (Error s), rest)
                              _ -> (Nothing, words line)
  _ -> (Nothing, words line)

It still looked wrong though, with a number of cases mapping to the exact same code. It was at this point that my brain kicked into gear, and I had to laugh out loud when I realized that I was using the Maybe monad and completely ignoring one of the main uses of monads: that thing where you chain them together. I had forgotten what it was called (the back of my mind said ‘bind’, but it is a human mind so I did not trust it fully), but I knew what I wanted it to do, so I plugged this into Hoogle: m a -> (a -> m b) -> m b, and it obligingly reminded me that I’m looking for (>>=) :: Monad m => m a -> (a -> m b) -> m b. It was indeed ‘bind’.

After a short break to make some tea and let my brain process the problem a little with its rediscovered knowledge, I set about setting things right.

My approach would be to use Maybe tuples to pass through the parsed and unparsed sections of a log message. The tuples would increase in size until they were used at the end to construct a LogMessage… but wait! It occurred to me that I may even be able to get away with only using pairs, if I could partially apply the LogMessage constructor. I gave it a little try in ghci:

*Log> :t LogMessage
LogMessage :: MessageType -> TimeStamp -> String -> LogMessage
*Log> :t LogMessage Warning
LogMessage Warning :: TimeStamp -> String -> LogMessage
*Log> :t LogMessage Warning 5
LogMessage Warning 5 :: String -> LogMessage

Aha! So I would be able to build up a log message along the way.

Tried a bunch of stuff, got confused about bind (>>=), nutted it out and used (>=>) but had to add return to the final appendMessage.

At this point I asked on #haskell-beginners on freenode, and got a bit of advice:

jle> so the thing is, if you have m a jle> you can apply all sorts of functions to it, you just need the right lifter jle> if you have an (a -> b), you use fmap jle> if you have an m (a -> b), you use ap (or (<*>)) jle> if you have an (a -> m b), you use (=<<) , or (>>=) jle> fmap :: (a -> b) -> m a -> m b jle> (<*>) :: m (a -> b) -> m a -> m b jle> (=<<) :: (a -> m b) -> m a -> m b jle> so…there ya go

Then went on with a whole lot of information, too much to process for the moment. I was directed to watch the week 4 videos of a functional programming course, so I have added those to the list.

I persevered a bit more and ended up with something that seems about as concise as I can make it:

parseMessageType :: String -> Maybe (TimeStamp -> String -> LogMessage, [String])
parseMessageType line = case words line of
  "I" : rest -> Just (LogMessage Info, rest)
  "W" : rest -> Just (LogMessage Warning, rest)
  "E" : rest -> parseErrorSeverity rest
  _ -> Nothing

parseErrorSeverity :: [String] -> Maybe (TimeStamp -> String -> LogMessage, [String])
parseErrorSeverity (severity : rest) = case maybeRead severity :: (Maybe Int) of
                                  Just s -> Just (LogMessage (Error s), rest)
                                  _ -> Nothing
parseErrorSeverity _ = Nothing

parseTimeStamp :: (TimeStamp -> String -> LogMessage, [String]) -> Maybe (String -> LogMessage, [String])
parseTimeStamp (_, []) = Nothing
parseTimeStamp (logMessage, timeStamp : rest) = case maybeRead timeStamp :: (Maybe TimeStamp) of
                                    Just t -> Just (logMessage t, rest)
                                    _ -> Nothing

appendMessage :: (String -> LogMessage, [String]) -> LogMessage
appendMessage (logMessage, message) = logMessage . unwords $ message

parseValidMessage :: String -> Maybe LogMessage
parseValidMessage = fmap appendMessage . (parseMessageType >=> parseTimeStamp)

parseMessage :: String -> LogMessage
parseMessage = liftM2 fromMaybe Unknown parseValidMessage

parseLog :: String -> [LogMessage]
parseLog = map parseMessage . lines

I can probably simplify some of the earlier functions in wonderful ways of which I am not yet aware, but that will do for now. I can always look back at this later when I have learned more, and laugh at the verbosity of the above.

I did some reading and got through all the recommended chapters of Learn You A Haskell and Real World Haskell.

Playing around

I also explored several concepts, including composing functions that take multiple arguments. I did not find evidence of any built-in functions, so I ended up defining this:

composeDiadic :: (c -> d) -> (a -> b -> c) -> a -> b -> d
composeDiadic f g x y = f (g x y)

(.:) = composeDiadic

I wanted to make this because I was playing around with lists of lists, and found intersperse, concat, and a function that combines the two: intercalate. I wanted to rewrite intercalate as a little exercise, but it did not like concat . intersperse since (.) does not expect any function that takes 2 arguments.

Lecture 2

Algebraic data types are defined like so:

data MyType = Constructor1
            | Constructor2 Type1
            | Constructor3 TypeA TypeB
  deriving Show
  • add deriving Show after a data type definition to automatically make it able to be displayed as a string.
  • type constructors and data constructors inhabit different namespaces, so single-constructor types often use the same name for both.

Pattern Matching

  • Parens are needed for pattern matching around a data constructor that has one or more parameters.
  • Bind the whole value to a name using something like myFuncton name@(Constructor2 a) = <do something with name>

Resolving merge conflicts is something I do now and then, but not every week (if I had to do it every week, there would be something quite wrong with my team’s development process). As a result, I often have to look up some of the syntax. This is a personal reference for the things I usually want to do:

Use a merge tool

git mergetool allows each conflicting file to be opened in a merge resolution tool (whichever is configured or installed).

Just use one or the other version of a file

git checkout --ours path/to/file.md

git checkout --theirs path/to/file.md

For git versions before 1.6.1, see keep either file in merge conflicts