Partly to learn a little bit about Go, and partly as a possible production project for work, I took a crack at some log parsing code in Node and Go today.
The basic work required:
- Read lines from a syslog file.
- Parse relevant tags and body.
  - Date, host, tag, message
- If the syslog message is JSON, parse it.
  - There is no static structure, so a JSON-to-struct mapping won’t work.
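For the first step in the list, a minimal Go sketch of reading a file line by line might look like the following. The helper names `eachLine` and `collectLines` are made up for illustration, and the 1 MiB line limit is an assumption; the only thing taken as given is that log lines are newline-terminated.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// eachLine calls handle once for every line read from r.
// (Hypothetical helper, not from the original code.)
func eachLine(r io.Reader, handle func(line string)) error {
	sc := bufio.NewScanner(r)
	// bufio.Scanner rejects lines longer than 64 KiB by default.
	// The 1 MiB ceiling here is an assumed value for long JSON bodies.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		handle(sc.Text())
	}
	return sc.Err()
}

// collectLines is a demo wrapper that gathers the lines of a string.
// A real 1GB log would be streamed through eachLine instead.
func collectLines(text string) ([]string, error) {
	var lines []string
	err := eachLine(strings.NewReader(text), func(line string) {
		lines = append(lines, line)
	})
	return lines, err
}

func main() {
	// In real use the reader would be the *os.File returned by os.Open.
	sample := "Oct 11 22:14:15 web01 app[1234]: started\n" +
		"Oct 11 22:14:16 web01 app[1234]: {\"ok\":true}\n"
	lines, err := collectLines(sample)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println("lines read:", len(lines))
}
```

Passing a callback keeps memory use flat on a large file, since only one line is held at a time.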
Eventually, we may push messages into some other location (Redis or Mongo, most likely).
I know Node decently well, and I chose Go because it’s a recent darling. It’s been a while since I’ve done anything resembling systems-level programming, so I’m sure the code I wrote has a bunch of issues. I could probably make better use of goroutines, and I’m not 100% sure my use of bytes and strings is as performant as possible.
I list all these caveats because I was surprised to find that the Node version consistently ran in about half the time of the Go version on a 1GB log.
On the Go side, I ended up using a somewhat hacky set of splits and joins on each line, since the regex library is known to not be quite mature yet.
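As a rough illustration of the split-based approach (not the code I actually wrote), here is a sketch that assumes the traditional syslog line layout `Mon _2 15:04:05 host tag: message`. The `Entry` type and `parseLine` function are hypothetical names, and real log files may deviate from this layout.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// Entry holds the four fields pulled from one line (hypothetical type).
type Entry struct {
	Time    time.Time
	Host    string
	Tag     string
	Message string
}

var errMalformed = errors.New("malformed syslog line")

// parseLine splits a line of the assumed form
// "Oct 11 22:14:15 host tag[pid]: message" without using regexp.
func parseLine(line string) (Entry, error) {
	// time.Stamp is "Jan _2 15:04:05", a fixed-width 15-byte timestamp.
	const stampLen = len(time.Stamp)
	if len(line) <= stampLen || line[stampLen] != ' ' {
		return Entry{}, errMalformed
	}
	ts, err := time.Parse(time.Stamp, line[:stampLen])
	if err != nil {
		return Entry{}, err
	}
	// The host runs up to the next space.
	hostRest := strings.SplitN(line[stampLen+1:], " ", 2)
	if len(hostRest) != 2 {
		return Entry{}, errMalformed
	}
	// The tag runs up to the first ": "; the remainder is the message.
	tagMsg := strings.SplitN(hostRest[1], ": ", 2)
	if len(tagMsg) != 2 {
		return Entry{}, errMalformed
	}
	return Entry{Time: ts, Host: hostRest[0], Tag: tagMsg[0], Message: tagMsg[1]}, nil
}

func main() {
	e, err := parseLine(`Oct 11 22:14:15 web01 app[1234]: {"level":"info"}`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s | %s | %s | %s\n", e.Time.Format(time.Stamp), e.Host, e.Tag, e.Message)
}
```

The timestamp in this layout carries no year, so the parsed `Time` has year 0 and would need the year filled in from elsewhere before being stored.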
Because I hadn’t read the entire documentation for the JSON package, I didn’t realize it supports arbitrary document structure. As a result, I used an external library, which may not perform as well.
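For reference, here is a small sketch of how the standard `encoding/json` package can decode a document with no fixed structure by unmarshaling into a `map[string]interface{}`. The `decodeMessage` helper is a made-up name, and treating "starts with `{`" as the test for a JSON message is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeMessage tries to decode msg as a JSON object of unknown shape.
// It reports false when msg is not a JSON object. (Hypothetical helper.)
func decodeMessage(msg string) (map[string]interface{}, bool) {
	trimmed := strings.TrimSpace(msg)
	// Cheap pre-check so plain-text messages skip the decoder entirely.
	if !strings.HasPrefix(trimmed, "{") {
		return nil, false
	}
	var doc map[string]interface{}
	if err := json.Unmarshal([]byte(trimmed), &doc); err != nil {
		return nil, false
	}
	return doc, true
}

func main() {
	doc, ok := decodeMessage(`{"user":"bob","attempts":3,"tags":["a","b"]}`)
	if !ok {
		fmt.Println("not JSON")
		return
	}
	// Values come back as interface{}: strings as string, numbers as
	// float64, arrays as []interface{}, nested objects as maps.
	fmt.Println(doc["user"], doc["attempts"], doc["tags"])
}
```

Each value has to be type-asserted before use, which is the price of not having a struct to map onto.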
If there’s anybody out there with a bit more experience, let me know where I went wrong in the Go code!