Tuesday, March 27, 2012

A more readable haskell string tokenizer

Another crack at a tokenizer
otherTok :: String -> [Char] -> [[Char]]
otherTok [] _ = []
otherTok cs delim = foldl (\acc c -> if c `elem` delim then [] : acc else (head acc ++ [c]) : tail acc) [] cs

*Main> otherTok "blah,blah blah. blah! blah" " ,.!"
["blah","","blah","","blah","blah","*** Exception: Prelude.head: empty list

Encountering a delimiter char produces an empty string, which I can remove later with filter; I'm not sure what to do about the exception, though.

P.S. Problem solved
otherTok :: String -> [Char] -> [[Char]]
otherTok [] _ = []
otherTok cs delim = foldl (\acc c -> if c `elem` delim then [] : acc else (head acc ++ [c]) : tail acc) [""] cs

*Main> filter (/="") (otherTok "blah,blah blah. blah! blah" " ,.!")
["blah","blah","blah","blah","blah"] 


P.P.S. The tokens that result are in reverse order of how the words show up in the line, but that is also easily fixed:

rev :: [a] -> [a]
rev [] = []
rev xs = foldr (\x acc -> acc ++ [x]) [] xs
 

 *Main> rev(filter(/="") (otherTok "abc,def.ghi" ",."))
["abc","def","ghi"]
 

Or just use the built-in reverse and stop trying to reinvent the wheel.
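A foldr-based variant (just a sketch; otherTok' is an arbitrary name I picked) builds the result left to right, so the tokens come out in their original order and only the filter step is left:

otherTok' :: String -> [Char] -> [[Char]]
otherTok' cs delim = foldr step [""] cs
  where
    -- sketch of a helper: start a new token on a delimiter,
    -- otherwise grow the token at the front of the list
    step c acc@(cur:rest)
      | c `elem` delim = "" : acc
      | otherwise      = (c : cur) : rest
    step _ [] = [""]   -- never reached (the seed is [""]); just keeps the pattern total

*Main> filter (/="") (otherTok' "blah,blah blah. blah! blah" " ,.!")
["blah","blah","blah","blah","blah"]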

Sunday, March 25, 2012

Mac OS finder/account slowness

I've noticed that it sometimes takes Finder a good 15 seconds to start up.  None of the forums I visited had any good answers; most dealt with generic account slowness and recommended just creating a new account.  I decided to run top to see what was running, along with the process flags, and noticed mount and mount_nfs sleeping at the top of the list.

PID   COMMAND      %CPU TIME     #TH  #WQ  #POR #MRE RPRVT  RSHRD  RSIZE  ...
2429  mount_nfs    0.0  00:00.00 1    0    15   28   112K   288K   428K   17M    586M   2385 2428 sleeping ...
2428  mount        0.0  00:00.00 1    0    14   26   104K   284K   404K   9496K  578M   2385 2385 sleeping 
(other output omitted for brevity)
180   Finder       0.0  01:35.63 4    1    213  522  18M    51M    62M    37M    908M   180  174  sleeping 

I booted up my NFS server (which for my own reasons I do not keep constantly powered on) and tried launching Finder again; this time the window popped up with no delay.

I shut off the NFS server, then closed and reopened Finder just to make sure the delay is repeatable and reproducible.  Indeed it is: every time I open a Finder window, Mac OS (I'm not sure which of its internals) attempts the mount_nfs.

I added the following to my NFS mount options:
-dumbtimer -timeo=3

I closed the Finder window and clicked on the Finder icon again, and this time the delay was much shorter.  Hope this helps someone.

P.S. Having solved this problem does not make me a Mac convert.  Figuring out why it takes Finder so long to list a directory, when ls only takes a split second, might get me halfway there.


P.P.S. Deleting the related items from /Users/$USER/Library/Preferences


Finder:
com.apple.finder.plist
Digital Photo Professional:
com.canon.Digital Photo Professional.LSSharedFileList.plist
com.canon.Digital Photo Professional.plist

helped, and now they work well even without NFS.  So the forums were right when they mentioned the preference cache; they just pointed to a different place.  In my case I found the root cause but resolved it the same way everyone else did.


Thursday, March 1, 2012

HP-UX getent

I spent quite a bit of time looking for a getent equivalent for HP-UX.  I needed something that would return a non-zero value if a user or a group I queried was not there.  In case you are having as much fun as I am, the commands are pwget and grget.

Sunday, February 19, 2012

haskell string tokenizer

I recently needed a function to split strings on a delimiter and found this one on http://blog.julipedia.org/2006/08/split-function-in-haskell.html:


tokenizeString :: String -> [Char] -> [[Char]]
tokenizeString [] _ = [""]
tokenizeString (c:cs) delim
  | c `elem` delim = [] : rest
  | otherwise = (c : head rest) : tail rest
    where rest = tokenizeString cs delim

It took me a little while to wrap my mind around the recursion and what goes on here; with a pen, some paper, and a traced-out run it turns out to be really, really cool.  It's a shame that I was not able to come up with this on my own.

I traced the execution for the original string "a,b" with delimiter list ",".  (In my notes I shortened some names to save space: hMt is really head $ mT and tmT is tail $ mT, where mT is the tokenizer itself.)  I am no expert, but the secret sauce appears to be in the base case, which always returns [""] (a list containing one empty string), and in how the string elements are concatenated:

Char : [Char] : [[Char]]  -- single character onto a string onto a list of strings
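
Written out with the full function name, the trace goes roughly like this (a reconstruction, as comments):

-- tokenizeString "a,b" ","  =  ('a' : head rest) : tail rest
--                              where rest   = tokenizeString ",b" ","
-- tokenizeString ",b" ","   =  [] : rest'              -- [] here is the empty string ""
--                              where rest'  = tokenizeString "b" ","
-- tokenizeString "b" ","    =  ('b' : head rest'') : tail rest''
--                              where rest'' = tokenizeString "" ","
-- tokenizeString "" ","     =  [""]                    -- base case
--
-- Unwinding from the base case back up:
--   tokenizeString "b"   ","  =  ('b' : "") : []    =  ["b"]
--   tokenizeString ",b"  ","  =  "" : ["b"]         =  ["", "b"]
--   tokenizeString "a,b" ","  =  ('a' : "") : ["b"] =  ["a", "b"]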

Wednesday, February 15, 2012

Fun with Haskell

Had some more time to play with Haskell today.  I am still very new at this, and not being able to use iteration makes coding "interesting".  It also makes nesting data structures a little challenging.

For example, take a function that returns a list:
multAddFunc :: Integer -> Integer -> [Integer]
multAddFunc a b = [a*b, a+b]

*Main> multAddFunc 2 3
[6,5]

Now let's say we had two lists of numbers, [1,2,3] and [4,5,6], and we wanted to write a function that processed them:
*Main> [ multAddFunc x y | x<-[1,2,3], y<-[4,5,6] ]
[[4,5],[5,6],[6,7],[8,6],[10,7],[12,8],[12,7],[15,8],[18,9]]

This just returned a list of lists; observe the type:

*Main> :t [ multAddFunc x y | x<-[1,2,3], y<-[4,5,6] ]
[ multAddFunc x y | x<-[1,2,3], y<-[4,5,6] ] :: [[Integer]]

But what if we wanted a flattened list?  For this we would have to concatenate the inner lists together using the ++ operator:

*Main> :t (++)
(++) :: [a] -> [a] -> [a]

listMath :: [Integer] -> [Integer] -> [Integer]
listMath [] []         = []
listMath (x:xs) (y:ys) = (multAddFunc x y) ++ (listMath xs ys)
-- (assumes the two lists are the same length)

*Main> listMath [1,2,3] [4,5,6]
[4,5,10,7,18,9]

And there we have it, a flattened list.
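
The same pairwise result can also be written without explicit recursion; here is a minimal sketch using zipWith and concat from the Prelude (listMath' is just a name picked for the variant):

-- sketch: zipWith applies multAddFunc to the lists pairwise,
-- and concat flattens the resulting list of lists
listMath' :: [Integer] -> [Integer] -> [Integer]
listMath' xs ys = concat (zipWith multAddFunc xs ys)

*Main> listMath' [1,2,3] [4,5,6]
[4,5,10,7,18,9]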

Monday, January 30, 2012

Shingles


It is easy to create a naive Data Leakage Protection (DLP) product that looks for exact data or pattern matches; it is a lot more difficult to spot similarity between documents, as in "this document is x% similar to this reference."  This article looks interesting and the approach seems easy to implement:  http://nlp.stanford.edu/IR-book/html/htmledition/near-duplicates-and-shingling-1.html
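
To get a feel for it, here is a minimal sketch of the k-shingle idea in Haskell (a toy version, not the article's implementation; a real system would hash the shingles and compare small sketches rather than full sets):

import           Data.List (tails)
import qualified Data.Set  as Set

-- the k-shingles of a document: every window of k consecutive words
shingles :: Int -> String -> Set.Set [String]
shingles k doc =
  Set.fromList [ w | w <- map (take k) (tails (words doc)), length w == k ]

-- Jaccard similarity of two shingle sets: |intersection| / |union|
similarity :: Ord a => Set.Set a -> Set.Set a -> Double
similarity a b
  | Set.null a && Set.null b = 1.0
  | otherwise = fromIntegral (Set.size (Set.intersection a b))
              / fromIntegral (Set.size (Set.union a b))

similarity (shingles 4 docA) (shingles 4 docB) would then give the "x% similar" number as a fraction between 0 and 1.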