Saturday, April 28, 2012

CRC32

Another piece fell into place

calc_adler32 :: String->Int->Int->Int
calc_adler32 [] a b = (b `shiftL` 16) .|. a
calc_adler32 str a b  = 
  let ap = (a + (ord $ head str)) `mod` mod_adler
      bp = (ap + b) `mod` mod_adler
  in calc_adler32 (tail str) ap bp
    
crc32 :: String -> Int
crc32 str = calc_adler32 str 1 0

*Main> crc32 "abc"
38600999

Thursday, April 26, 2012

Making shingles

This is starting to get a little boring, it seems that all of my Haskell work consists of recombining lists and folding them, but either way yet another piece of the puzzle is now in place, I can now generate shingles


makeShingle :: [[a]]->[[a]]
makeShingle [] = []
makeShingle zs 
  | length zs >= 4 = (foldl(\acc z -> (acc ++ z)) [] (take 4 zs)) : (makeShingle $ drop 1 zs)
  | otherwise = []



makeShingle $ filter(/="") $ genericRev $ otherTok "this, is a string.  it is a string, like many other.  but i like it, i like it a lot!" standard_delim
["thisisastring","isastringit","astringitis","stringitisa","itisastring","isastringlike","astringlikemany","stringlikemanyother","likemanyotherbut","manyotherbuti","otherbutilike","butilikeit","ilikeiti","likeitilike","itilikeit","ilikeita","likeitalot"]


Wednesday, April 25, 2012

Set permutations

There is a part of my project that calls for a permuted set, that is given the original set of elements create a set of sets where each set contains a permutation of the original.  It turns out that this is more or less simple once you break the problem down into smaller parts.

There are a few algorithms to generate permutations,  I chose interleaving for it being relatively simple and straight forward.
For a set of [x,y, x] permute the first two elements
[[x, y], [y, x]]
and then interleave each set with z
[[z, x, y], [x, z, y], [x, y, z], [z, y, x], [y, z, x], [y, x, z]]

First we create a function that would take a list an a value and create permutations of that list with a value:

interleaveSet :: [a] -> a -> [[a]]
interleaveSet [] value = [[value]]
interleaveSet xs value = [ take pos xs ++ [value] ++ drop pos xs | pos <- [0..length(xs)] ]

Here are some samples:

*Main> interleaveSet [] 1
[[1]]
*Main> interleaveSet [1] 2
[[2,1],[1,2]]
*Main> interleaveSet [1,2,3,4] 5
[[5,1,2,3,4],[1,5,2,3,4],[1,2,5,3,4],[1,2,3,5,4],[1,2,3,4,5]]

So we now have a way of interleaving a set with a value, but remember that what we need to do is build this from an existing set.  What we need is something that will take the output of interleaveSet and interleave the sets created by it with the next value.  I was at first tempted to use a list comprehension once again

interleaveSets sets value = [ interleaveSet set value | set <- sets ]

but since interleaveSet outputs a [[a]] and interleaveSets will create a list of those, our return value will be [[[a]]], which is not what we want at all.


*Main> interleaveSets (interleaveSet [1,2,3,4] 5) 6
[[[6,5,1,2,3,4],[5,6,1,2,3,4],[5,1,6,2,3,4],[5,1,2,6,3,4],[5,1,2,3,6,4],[5,1,2,3,4,6]],[[6,1,5,2,3,4],[1,6,5,2,3,4],[1,5,6,2,3,4],[1,5,2,6,3,4],[1,5,2,3,6,4],[1,5,2,3,4,6]],[[6,1,2,5,3,4],[1,6,2,5,3,4],[1,2,6,5,3,4],[1,2,5,6,3,4],[1,2,5,3,6,4],[1,2,5,3,4,6]],[[6,1,2,3,5,4],[1,6,2,3,5,4],[1,2,6,3,5,4],[1,2,3,6,5,4],[1,2,3,5,6,4],[1,2,3,5,4,6]],[[6,1,2,3,4,5],[1,6,2,3,4,5],[1,2,6,3,4,5],[1,2,3,6,4,5],[1,2,3,4,6,5],[1,2,3,4,5,6]]]

Since we want to be able to feed it back to interleaveSets, this is not what we want.

This will do:

interleaveSets :: [[a]] -> a -> [[a]]
interleaveSets [] value = [[]]
interleaveSets (s:sets) value = interleaveSet s value ++ interleaveSets sets value

*Main> interleaveSets (interleaveSet [1,2,3,4] 5) 6
[[6,5,1,2,3,4],[5,6,1,2,3,4],[5,1,6,2,3,4],[5,1,2,6,3,4],[5,1,2,3,6,4],[5,1,2,3,4,6],[6,1,5,2,3,4],[1,6,5,2,3,4],[1,5,6,2,3,4],[1,5,2,6,3,4],[1,5,2,3,6,4],[1,5,2,3,4,6],[6,1,2,5,3,4],[1,6,2,5,3,4],[1,2,6,5,3,4],[1,2,5,6,3,4],[1,2,5,3,6,4],[1,2,5,3,4,6],[6,1,2,3,5,4],[1,6,2,3,5,4],[1,2,6,3,5,4],[1,2,3,6,5,4],[1,2,3,5,6,4],[1,2,3,5,4,6],[6,1,2,3,4,5],[1,6,2,3,4,5],[1,2,6,3,4,5],[1,2,3,6,4,5],[1,2,3,4,6,5],[1,2,3,4,5,6],[]]

I am not sure what that empty element at the end is, will have to deal with it later (I am sure it is a side effect of my base case)

Ah the solution is that the base case return a [] because any set ++ [] is just set

*Main> [1,2,3] ++ []
[1,2,3]

And here is the new result:
*Main>  interleaveSets (interleaveSet [1,2,3,4] 5) 6
[[6,5,1,2,3,4],[5,6,1,2,3,4],[5,1,6,2,3,4],[5,1,2,6,3,4],[5,1,2,3,6,4],[5,1,2,3,4,6],[6,1,5,2,3,4],[1,6,5,2,3,4],[1,5,6,2,3,4],[1,5,2,6,3,4],[1,5,2,3,6,4],[1,5,2,3,4,6],[6,1,2,5,3,4],[1,6,2,5,3,4],[1,2,6,5,3,4],[1,2,5,6,3,4],[1,2,5,3,6,4],[1,2,5,3,4,6],[6,1,2,3,5,4],[1,6,2,3,5,4],[1,2,6,3,5,4],[1,2,3,6,5,4],[1,2,3,5,6,4],[1,2,3,5,4,6],[6,1,2,3,4,5],[1,6,2,3,4,5],[1,2,6,3,4,5],[1,2,3,6,4,5],[1,2,3,4,6,5],[1,2,3,4,5,6]]

And finally to put it all together:

permuteSet :: [a] -> [[a]]
permuteSet [] = []
permuteSet set = foldl(\acc e -> interleaveSets acc e) seed (tail set)
  where seed = interleaveSet [] (head set)

Lets test

*Main > permuteSet [1,2,3,4]
[[4,3,2,1],[3,4,2,1],[3,2,4,1],[3,2,1,4],[4,2,3,1],[2,4,3,1],[2,3,4,1],[2,3,1,4],[4,2,1,3],[2,4,1,3],[2,1,4,3],[2,1,3,4],[4,3,1,2],[3,4,1,2],[3,1,4,2],[3,1,2,4],[4,1,3,2],[1,4,3,2],[1,3,4,2],[1,3,2,4],[4,1,2,3],[1,4,2,3],[1,2,4,3],[1,2,3,4]]

Looks about right

*Main> length $ permuteSet [1,2,3,4]
24

Number of permutations is equal to n! / (n -k)!
Where k is the number of elements from n per permutation, in our case k = n so that turns into 0! which we know is one

and the factorial of 4 is 4 * 3 * 2 *1 = 24


Sunday, April 22, 2012

BASH directory recursion, scope and local variables

How quaint, I am using a bash shell script to do backups of my data, very similar to time machine if anyone is familiar.  Not sure why shell, at this point I can count on so many other things to be included, but I chose shell.  My directory recursion was failing after a few loops and I could not figure out why, until I found out that bash does not do lexical scope very well, and local variables need to be declared as local explicitly.  This way when a recursive invocation exits it's assignments will not live on.