Trace Data

Ramblings on devops, data, games, programming, and more

Log Parsing: Node vs Go


Both to learn a little bit about Go and as a possible production project for work, I took a crack at some log parsing code in Node and Go today.

The basic work required:

  • Read lines from a syslog file.
  • Parse relevant tags and body.
    • Date, host, tag, message
  • If the syslog message is JSON, parse it.
    • There is no static structure, so a JSON-to-struct mapping won’t work.
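
To make the steps above concrete, here is a minimal sketch in Ruby (the language used for the other code on this blog). It is not the Node or Go code that was benchmarked. The regex assumes the traditional `Mon DD HH:MM:SS host tag: message` syslog layout, and `parse_syslog_line` is a hypothetical name chosen for illustration.

```ruby
require 'json'

# Assumed layout: "Mon DD HH:MM:SS host tag[pid]: message" (traditional syslog).
SYSLOG_LINE = /\A(?<date>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s(?<host>\S+)\s(?<tag>[^:\s]+):\s?(?<message>.*)\z/

# Hypothetical helper: returns a Hash of the parsed fields, or nil when the
# line does not match the assumed layout.
def parse_syslog_line(line)
  m = SYSLOG_LINE.match(line.chomp)
  return nil unless m

  entry = { date: m[:date], host: m[:host], tag: m[:tag], message: m[:message] }

  # There is no static structure, so parse into a generic Hash rather than
  # mapping onto a fixed type. Only try when the body looks like an object.
  if entry[:message].start_with?('{')
    begin
      entry[:json] = JSON.parse(entry[:message])
    rescue JSON::ParserError
      # Not valid JSON after all; keep only the raw message.
    end
  end

  entry
end
```

Reading a file line by line would then be something like `File.foreach(path) { |line| entry = parse_syslog_line(line) }`, which avoids loading a 1GB log into memory at once.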

Eventually, we may push messages into some other location (Redis or Mongo, most likely).

I know Node decently well and I chose Go as it’s a recent darling. Now, it’s been a while since I’ve done anything resembling system-level programming so I’m sure the code I wrote has a bunch of issues. I’m sure I could make better use of goroutines and I’m also not 100% sure if my use of bytes and strings is as performant as possible.

I list all these caveats because I was surprised to find the NodeJS version consistently ran in about half the time on a 1GB log.

On the Go side, I ended up using a somewhat hacky set of splits and joins on the lines as the regex library is known to not be mature quite yet.

Because I hadn’t read the entire documentation for the JSON package, I didn’t realize it supported arbitrary document structure. As a result, I used an external library which may not perform as well.

If there’s anybody out there with a bit more experience, let me know where I went wrong in the Go code!

Automating SSH Key Deployment With Capistrano


I’ve been playing with Capistrano at work recently, both as a convenient ad-hoc way to run arbitrary or pre-defined commands on our farms and possibly as an addition to or replacement for the deployments that we largely manage with Chef Solo right now.

For some legacy reasons that are changing, SSH keys have been somewhat taboo until recently so none were set up on any of our machines. Between our numerous dev, test, and production servers, we manage over a hundred VMs. Manually setting up keys struck me as time-consuming, boring, and an ideal candidate for some basic automation.

Our current Chef usage is a little weird and needs some rethinking so I decided to see what could be done with Capistrano. Here’s the task I whipped up to copy keys around.

    desc "Add a public key to the remote machine"
    task :copykey do

      keyfile = ENV['KEY']

      # make sure the user provided a proper key (exists and ends in .pub)
      unless keyfile
        abort "Provide a keyfile: KEY=~/.ssh/id_rsa.pub cap copykey"
      end
      unless keyfile.end_with?(".pub")
        abort "Keyfile must end in .pub"
      end
      unless File.exist?(keyfile)
        abort "Missing key file at #{keyfile}"
      end

      # put the pub key on the server
      remotefile = "cap-key-tmp.pub"
      put File.read(keyfile), remotefile, :mode => 0600

      # make sure .ssh/ exists, add the key if it's not already there, clean up
      run <<-EOC
        mkdir -p -m 700 .ssh;
        if ! grep -qFf #{remotefile} .ssh/authorized_keys;
        then
          cat #{remotefile} >> .ssh/authorized_keys;
        fi;
        chmod 600 .ssh/authorized_keys;
        rm #{remotefile};
      EOC

    end

Playing With ComputerCraft


As happens every 6 months or so, I’m playing a bit of Minecraft again. This time, it’s on a server that has ComputerCraft as one of the available mods. ComputerCraft is pretty wild - you can craft computers in-game that have a working OS with a command line shell, several default programs, various peripherals (printers, disk drives, monitors), and, most fun, the little “turtle” programmable robots. Lua is the language and you can write your own programs to control the computers and turtles.

The turtles have a simple API that’s quite reminiscent of LOGO. Move forward, backward, turn left or right, move up or down, and since it’s Minecraft, you can of course dig, place blocks, and more. A turtle requires fuel to move and has its own inventory as well. It’s a pretty full-featured little guy and a ton of fun to play with. There are a huge number of programs written by other players out there but I’m writing my own for a bit of fun.

Stairs

Simple program to dig stairs down into the ground. Places torches every other stair to keep things nice and lit. Needs some love on the fuel management side of things.

Redwood Lumberjack

Simple program to chop down the large redwoods on my server. Cuts a 2x2 level, moves up, and repeats until there is nothing detected above the turtle.

Written while the server was down so it hasn’t been tested yet!

Gist Downloader

A simple script to download a gist and save it as a program.

Chef, vBulletin, Iconv, and LC_ALL


Skip to the End…

If the LC_ALL environment variable is not set, Chef will use ‘C’ as the default value which may cause headaches.

For now, I recommend either setting the environment variable to the empty string in a cookbook:

    ENV['LC_ALL']=''

or setting it in an execute block for a command that may have an issue:

    execute "my command" do
      command "/bin/echo 'Hello!'"
      environment({'LC_ALL' => ''})
    end

I haven’t tried it, but you could also look at explicitly setting the LC_ALL environment variable to an empty value in your system level shell profile (/etc/profile), default environment setup files (e.g., /etc/environment), and/or digging into Ubuntu locale configuration.

From the Top!

At work, we’re currently using Chef to manage our systems and perform some application installations. A few weeks ago, we ran into a particularly strange head-scratcher that took me about half a day to track down.

Our application stack currently consists of Ubuntu 12.04 (precise), Apache 2.2, and PHP 5.3 running our integration of Drupal 7, vBulletin 4, and our own application code based on Zend Framework. We have a set of shell and PHP scripts that perform the installation of Drupal, VB, and our code all in one go. For all non-developer environments, Bamboo triggers chef-solo runs on code commits or when an engineer wants to redeploy to the manual testing servers.

The problem that popped up was that VB was having issues on the automated environments but only when run via chef-solo. When the scripts were run by hand on the same environment, everything worked like a charm.

At issue were some localization installation steps that were failing, which caused the forum portion of the site to almost completely stop functioning. In the build logs, the only evidence was that SQL queries that were part of the language installation were failing because a value wasn’t present, but only for some languages.

I spent some time digging through our installation scripts and the VB code that was running the language installation, trying to understand where the issue came from. Of note, VB has a bunch of custom XML parsing code that uses iconv to attempt to handle character set conversions for localized phrases. I even got so far as to see the code improperly convert some of the French accented characters during the chef-solo runs, while the same conversions worked without issue when run manually.

Environment variable state was one of my first guesses once we had isolated the repro steps to chef-solo vs manual runs. The LC_* variables looked like a likely culprit but my initial inspection was cursory - I only paid attention to LANG and LC_CTYPE, which were set correctly to en_US.utf8. Only after a lot more testing and research did I notice that LC_ALL was set to ‘C’ during the chef-solo run and was not set on the manual runs.

As soon as I explicitly set LC_ALL to the empty string in the chef cookbook, everything worked as expected. The iconv calls in the VB code now had the proper environment variables and weren’t trying to treat everything as being in the ‘C’ locale.

The underlying issue in Chef is the following code:

lib/chef/mixin/command/unix.rb, lines 53-58
    # Default on C locale so parsing commands output can be done
    # independently of the node's default locale.
    # "LC_ALL" could be set to nil, in which case we also must ignore it.
    unless args[:environment].has_key?("LC_ALL")
      args[:environment]["LC_ALL"] = "C"
    end
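
One detail in that conditional is worth spelling out: it only checks whether the key is present, not whether the value is meaningful, which is why an empty string is enough to keep the ‘C’ default from being applied. A minimal sketch that models just that conditional on a plain Hash (`apply_default_locale` is a hypothetical name, not part of Chef):

```ruby
# Models only the conditional from the Chef snippet above on a plain Hash.
# apply_default_locale is a made-up helper for illustration, not a Chef API.
def apply_default_locale(environment)
  env = environment.dup
  env["LC_ALL"] = "C" unless env.key?("LC_ALL")
  env
end

apply_default_locale({})                   # => {"LC_ALL"=>"C"}
apply_default_locale({ "LC_ALL" => "" })   # => {"LC_ALL"=>""}
apply_default_locale({ "LC_ALL" => nil })  # => {"LC_ALL"=>nil}
```

The last case matches the comment in the Chef source: a key that is present but nil is also left alone.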

When LC_ALL is not in the environment variables at all, Chef defaults it to ‘C’. On Ubuntu (and at least one version of Debian that I had easy access to), LC_ALL is not defined at all:

Environment variables on Ubuntu 12.04 Precise
    $ env | grep -E 'LC|LANG'

    LANG=en_US.utf8
    LANGUAGE=en_US:
    LC_CTYPE=en_US.UTF-8

Interestingly, the locale command displays an empty LC_ALL:

Locale on Ubuntu 12.04 Precise
    $ locale

    LANG=en_US.utf8
    LANGUAGE=en_US:
    LC_CTYPE=en_US.UTF-8
    LC_NUMERIC="en_US.utf8"
    LC_TIME="en_US.utf8"
    LC_COLLATE="en_US.utf8"
    LC_MONETARY="en_US.utf8"
    LC_MESSAGES="en_US.utf8"
    LC_PAPER="en_US.utf8"
    LC_NAME="en_US.utf8"
    LC_ADDRESS="en_US.utf8"
    LC_TELEPHONE="en_US.utf8"
    LC_MEASUREMENT="en_US.utf8"
    LC_IDENTIFICATION="en_US.utf8"
    LC_ALL=

From the UNIX specification, we can see the effect of LC_ALL:

This variable determines the values for all locale categories. The value of the LC_ALL environment variable has precedence over any of the other environment variables starting with LC_ (LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME) and the LANG environment variable.
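
As a rough illustration of that precedence, here is a small Ruby sketch of the lookup order: LC_ALL first, then the category’s own variable, then LANG, with unset or empty values skipped at each step. `effective_locale` is a hypothetical helper, and the final "C" fallback is an assumption, since the spec leaves that default up to the implementation.

```ruby
# Hypothetical helper: resolve the effective locale for one category from a
# Hash of environment variables. Unset or empty values are skipped.
def effective_locale(category, env)
  [env["LC_ALL"], env[category], env["LANG"]].each do |value|
    return value unless value.nil? || value.empty?
  end
  "C" # assumption: the real default is implementation-defined
end

manual = { "LANG" => "en_US.utf8", "LC_CTYPE" => "en_US.UTF-8" }
chef   = manual.merge("LC_ALL" => "C")  # what the chef-solo run saw
fixed  = manual.merge("LC_ALL" => "")   # the empty-string workaround

effective_locale("LC_CTYPE", manual)  # => "en_US.UTF-8"
effective_locale("LC_CTYPE", chef)    # => "C"
effective_locale("LC_CTYPE", fixed)   # => "en_US.UTF-8"
```

With LC_ALL set to ‘C’, every category resolves to ‘C’ no matter what LANG and LC_CTYPE say, which lines up with the iconv behaviour described earlier.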

So, from now on I’ll likely explicitly set LC_ALL in Chef cookbooks, as well as making sure that the environment variable exists but is empty as the default state of things in the OS.

References and Further Reading

There’s at least one ticket about this issue right now:

However, it looks like the ‘C’ default may be in place due to issues when they weren’t setting it at all:

Properly setting the locale for Ubuntu may also help alleviate problems. I haven’t tested this so I’m not sure if it would set the proper environment variables or not.

What Is This Supposed to Be?


While one doesn’t really need to validate writing or answer the question of “why am I blogging,” I figure it’s worth laying out my goals for the blog as well as spelling out who I am and who I think my audience might be.

I’m Dave. 31, married, no kids yet, Boston, gamer, nerd and web developer for 10 years. For the past 5 I’ve been lucky enough to work in the games industry building and operating web sites and backend systems. From time to time I get the itch to build my own things - one of which is Raidbots, a fairly successful information site intended for WoW raiders. You can see brief summaries of many others on the home page of this site.

My rough plan for the blog is to write about things I build, problems encountered and solved, some guides or overviews of web technology, and perhaps a bit of commentary or experiences being a gamer. I’ve often read that writing for one person can be a method to preserve a singular voice and allow the best to come out. I’ve always really liked this idea but it’s hard to come up with one person that would cover all the bases.

My current notion is that each post might be implicitly or explicitly intended for one person. A post about why I love Vim and hate IDEs might be addressed to one coworker while the beauty, utility, and necessity of understanding the command line might be to another. An overview of what I do and why I love it would likely act as a guide to my mom or dad. A quick entry on a particularly nasty software bug or conflict may be simply tossed into the ether in the hope that some poor, kindred developer or admin might save a few hours of debugging. I could even codify the conceit a bit and make up some personas as a way of categorization. I’ll keep toying with this thought and see if it’s actually useful or not.

If nothing else, the act of writing and publishing is just a good habit that I’ve been neglecting for quite a while.

Spinning Up…


Time for blogging once again!

After a bit of flailing, I finally have Octopress up and running. My initial attempt had an issue where ‘rake generate’ wasn’t actually doing much of anything. It clearly called out to Jekyll but no generation output ever appeared. Given that I’m not a Ruby expert, my best guess is that my environment just wasn’t quite set up properly.

Tonight, I followed another guide on setting up Octopress on Mountain Lion and it seemed to work without a hitch.

Also useful for my purposes was the Octopress documentation on deploying to a subdirectory of a site.

With all that, I’m in a nice vanilla state at the moment. We shall see if I go after trying to match the minimal theming with the rest of the site or if it’s time to just write some posts that have been kicking around my head for a while.