If you have a distributed application with function calls going up and down the stack and forwarding to other components on the network, you need a unifying identifier; otherwise correlating and interpreting logs will be very difficult.
How to tie logs together
How to pick an identifier and set it. Let's say you have some application, be it a web app or some network app, and you want to be able to trace function calls and logs and connect them together. At the first entry point, pick an identifier. If no meaningful identifier that pertains to the application exists (such as a DHCP transaction ID), just pick a random one. I find that a UUID works really well in this case.

At this point you have established an identifier, but how will you let everyone know what it is? In synchronous frameworks such as Django you may be able to just use some global variable that everyone can refer to. In asynchronous frameworks such as Twisted, where many requests are in flight at once, you have no other choice but to pass the id around from function to function. So just go ahead and make sure all your functions take a req_id parameter. It also helps to have a uniform logging function that wraps your log facility, something similar to:
def log(logger, req_id, status, msg, extra=None):
    ...
extra is for anything that does not fit the other fields: a variable or a dict that I make up on the spot when I want to print several things at once. Call str on it or use pprint to format it.
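Here is a minimal sketch of how picking the identifier and the wrapper could fit together. The helper name new_req_id, the key=value line layout, the level argument, and the example function find_ip are my own assumptions for illustration; adapt them to whatever your log facility expects.

```python
import logging
import pprint
import uuid


def new_req_id(transaction_id=None):
    """Use a meaningful identifier if the protocol gives one, else a random UUID."""
    if transaction_id is not None:
        return str(transaction_id)
    return uuid.uuid4().hex


def log(logger, req_id, status, msg, extra=None, level=logging.INFO):
    """Emit one line that always carries the request identifier."""
    line = "req_id=%s status=%s %s" % (req_id, status, msg)
    if extra is not None:
        # pprint keeps dicts and nested structures readable
        line += " extra=" + pprint.pformat(extra)
    logger.log(level, line)
    return line  # returned only to make the wrapper easy to test


def find_ip(logger, req_id, subnet):
    """Hypothetical worker: every function takes req_id and passes it on."""
    log(logger, req_id, "start", "looking for a free IP", {"subnet": subnet})
```

At the entry point you would call new_req_id() once and hand the result to everything downstream, so that a single grep for that id pulls up the whole request.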
What else to put there
In addition to printing that, it is a good idea to let the person reading the logs know where each log line came from: module, function, and line number. A formatter in Python's logging can be configured to do all that, but it will report the function and line number of the log wrapper itself, so I usually look them up with inspect instead:
import inspect
caller = inspect.stack()[1]  # frame of whoever called the log wrapper
func_name, line_no = caller.function, caller.lineno
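To show where that lookup sits, here is a small sketch with the inspect call moved into its own helper. The names caller_info and allocate_ip and the output format are assumptions made up for the example. Note the depth: index 0 is the helper itself, 1 is the log wrapper, and 2 is the function we actually care about.

```python
import inspect


def caller_info(depth=1):
    """Return (function name, line number) of the frame `depth` levels up."""
    frame = inspect.stack()[depth]
    return frame.function, frame.lineno


def log(msg):
    # depth=2 skips caller_info and log itself, landing on log's caller
    func_name, line_no = caller_info(depth=2)
    return "%s:%d %s" % (func_name, line_no, msg)


def allocate_ip():
    return log("picked an address")


print(allocate_ip())
```

If you are on Python 3.8 or newer, the logging calls also accept a stacklevel argument (for example logger.info(msg, stacklevel=2)), which lets the standard formatter report the wrapper's caller without going through inspect.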
Where to print
EVERYWHERE! I usually log when entering a function and when exiting it: on entry I print the interesting parameters, and on exit I print what it is returning. Every time a function does something, I log. Find an available IP in the subnet, log. Ping that IP to see if it is free, log. Create a record with the IP, log. Creation passed or failed, log. For any other action, such as a database call, always log the result. One day you will want to know why your app did something, and you will wish you had those logs.
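One way to get the enter and exit lines without typing them in every function is a decorator. This is only a sketch: the decorator name logged, the logger name "app", and the convention that req_id is the first argument are all assumptions. It also logs every argument rather than just the interesting ones, so it is a starting point, not a replacement for hand-written logs at each step.

```python
import functools
import logging

logger = logging.getLogger("app")  # hypothetical logger name


def logged(func):
    """Log entry (with arguments) and exit (with return value) of func."""
    @functools.wraps(func)
    def wrapper(req_id, *args, **kwargs):
        logger.info("req_id=%s enter %s args=%r kwargs=%r",
                    req_id, func.__name__, args, kwargs)
        try:
            result = func(req_id, *args, **kwargs)
        except Exception:
            # logger.exception records the traceback along with the message
            logger.exception("req_id=%s %s failed", req_id, func.__name__)
            raise
        logger.info("req_id=%s exit %s returned=%r",
                    req_id, func.__name__, result)
        return result
    return wrapper


@logged
def find_free_ip(req_id, subnet):
    # stand-in for the real lookup
    return subnet + ".23"
```

Calling find_free_ip(req_id, "192.168.1") then produces one line on the way in and one on the way out, both tagged with the same identifier.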
Log confidentiality
Logs usually go on local disk and then may get moved to some cheap storage, and most of them are not treated as confidential. If your application accepts passwords and you must log user input for certain things, I suggest replacing the password with a fixed mask such as **** or ####. By no means do a 1:1 character replacement: you do not want to give a hint as to the length. This way your logs do not have to contain confidential information.
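A rough sketch of that masking, assuming the user input arrives as a dict. The key names in SENSITIVE_KEYS and the function name redact are made up for the example; use whatever fields your application actually receives.

```python
# Hypothetical list of field names that must never reach the logs
SENSITIVE_KEYS = {"password", "passwd", "secret", "token"}
MASK = "****"  # fixed width, so it gives no hint about the real length


def redact(params):
    """Return a copy of params that is safe to log."""
    return {
        key: (MASK if key.lower() in SENSITIVE_KEYS else value)
        for key, value in params.items()
    }


print(redact({"user": "bob", "password": "hunter2"}))
```

Run the input through redact before it reaches the log call, not after, so the raw value never ends up in a formatted string.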
Logs without context are meaningless
Logs need to provide context, such as:
- I am in the process of finding an IP for aa:bb:cc:dd:ee.
- Sent an offer with IP 192.168.1.23 to aa:bb:cc:dd:ee.
- Received a NAK from aa:bb:cc:dd:ee.
- About to write /etc/cron.d/job with artifact ID crown.fabdeed12da.