Sunday, February 19, 2012

haskell string tokenizer

I recently needed a function to split strings on a delimiter and found this one at http://blog.julipedia.org/2006/08/split-function-in-haskell.html:


tokenizeString :: String -> [Char] -> [[Char]]
tokenizeString [] _ = [""]
tokenizeString (c:cs) delim
  | c `elem` delim = [] : rest
  | otherwise = (c : head rest) : tail rest
    where rest = tokenizeString cs delim

It took me a little while to wrap my mind around the recursion and what goes on here. With a pen and some paper I traced it out, and it is really, really cool. It's a shame that I was not able to come up with this on my own.

Execution trace, original string: "a,b", delimiter list: ","
I shortened some names to save space: hmT is really head $ mT and tmT is tail $ mT, where mT is myTokenizer. I am no expert, but the secret sauce appears to be in the base case, which always returns [""] (a list containing one empty string), and in how the pieces are consed together:

(Char : [Char]) : [[Char]]  -- a single character is consed onto a string, and that string onto a list of strings
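To make the recursion easier to follow, here is my own rough reconstruction of how the call unwinds for "a,b" with delimiter ",". The step comments and the name traceExample are mine, added only for illustration; the function itself is the one quoted above, repeated so the block loads on its own.

```haskell
-- Same definition as above, repeated so this loads by itself in GHCi.
tokenizeString :: String -> [Char] -> [[Char]]
tokenizeString [] _ = [""]
tokenizeString (c:cs) delim
  | c `elem` delim = [] : rest
  | otherwise = (c : head rest) : tail rest
    where rest = tokenizeString cs delim

-- Unwinding from the innermost call outward:
--   tokenizeString ""    ","  = [""]                         (base case)
--   tokenizeString "b"   ","  = ('b' : "") : []    = ["b"]
--   tokenizeString ",b"  ","  = [] : ["b"]         = ["","b"]
--   tokenizeString "a,b" ","  = ('a' : "") : ["b"] = ["a","b"]
-- traceExample is a hypothetical name used only for this illustration.
traceExample :: [String]
traceExample = tokenizeString "a,b" ","
```

Because the base case returns [""] rather than [], head rest and tail rest are always safe to call. One consequence worth knowing: consecutive delimiters produce empty tokens, so tokenizeString "a,,b" "," gives ["a","","b"].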

Wednesday, February 15, 2012

Fun with Haskell

Had some more time to play with Haskell today. I am still very new at this, and not being able to use iteration makes coding "interesting". It also makes nesting data structures a little challenging.

For example, take a function that returns a list:
multAddFunc :: Integer -> Integer -> [Integer]
multAddFunc a b = [a*b, a+b]

*Main> multAddFunc 2 3
[6,5]

Now let's say we had two lists of numbers, [1,2,3] and [4,5,6], and we wanted to write a function that processed them:
*Main> [ multAddFunc x y | x<-[1,2,3], y<-[4,5,6] ]
[[4,5],[5,6],[6,7],[8,6],[10,7],[12,8],[12,7],[15,8],[18,9]]

This just returned a list of lists, as the type confirms:

*Main> :t [ multAddFunc x y | x<-[1,2,3], y<-[4,5,6] ]
[ multAddFunc x y | x<-[1,2,3], y<-[4,5,6] ] :: [[Integer]]

But what if we wanted a flattened list? For this we would have to concatenate the lists together using the ++ operator:

*Main> :t (++)
(++) :: [a] -> [a] -> [a]

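listMath :: [Integer] -> [Integer] -> [Integer]
-- Combine the heads with multAddFunc, then recurse on the tails.
listMath (x:xs) (y:ys) = (multAddFunc x y) ++ (listMath xs ys)
-- Stop as soon as either list runs out, so unequal lengths do not crash.
listMath _ _ = []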

*Main> listMath [1,2,3] [4,5,6]
[4,5,10,7,18,9]

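And there we have it: a flattened list. Note that listMath pairs the elements by position (1 with 4, 2 with 5, 3 with 6), so it yields three pairs of results rather than the nine combinations produced by the list comprehension earlier.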
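For comparison, here is a small sketch of the same flattening done with the Prelude functions concat and zipWith instead of hand-written recursion. The names flatAll and flatPairs are hypothetical, made up for this example; multAddFunc is the function from earlier in the post, repeated so the block loads on its own.

```haskell
-- Same helper as earlier in the post.
multAddFunc :: Integer -> Integer -> [Integer]
multAddFunc a b = [a*b, a+b]

-- Flatten the results for every (x, y) combination,
-- i.e. the list comprehension from the post wrapped in concat.
flatAll :: [Integer] -> [Integer] -> [Integer]
flatAll xs ys = concat [ multAddFunc x y | x <- xs, y <- ys ]

-- Flatten the results for elements paired by position.
-- zipWith stops at the end of the shorter list.
flatPairs :: [Integer] -> [Integer] -> [Integer]
flatPairs xs ys = concat (zipWith multAddFunc xs ys)
```

In GHCi, flatPairs [1,2,3] [4,5,6] gives [4,5,10,7,18,9], matching listMath, while flatAll [1,2,3] [4,5,6] gives all eighteen numbers from the nine combinations.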