IS COMPILING THE NEW SCRIPTING?

You may have read my friend and colleague Peter's recent post on why laziness, impatience and hubris drive developers to script. It's a great read, and this post follows on from what Peter has to say. If you know me or follow me on Medium or Twitter, you'll no doubt have seen my recent posts on Golang, more commonly known as Go. I'm attempting to learn the language, and what better way to learn than to have an actual example/challenge to run with. So, before you go any further, have a read of Peter's post first to get context, and then come back here.

Sorted? Good. So, a long time ago I used to code a lot in C, and I like the C similarities that are apparent in Go. I figured I'd have a go at this log parsing example and see how the Go version performs.

The code looks like this:

  • Lines 75–82: We define our main function and use the Go flag library to set up the command line parameters we anticipate being passed to the executable. These take the format: variable := flag.Type("parameter", default, "message to user"), where Type is the type of the parameter (Bool, Int, String), default is the default value, and the message to the user is the prompt they will see on the command line. In most cases, we default our expected inputs to true, and set a default of 1 (second) for the threshold value.
  • The flag.Parse() call on line 82 basically means that you are happy with the parameters you have set up beforehand, and that anything else coming in on the command line after these parameters have been parsed should be captured into a slice (a dynamic array) that we can access with flag.Args(). This allows us to iterate over the log file paths/names (testdata/*.log) and pass these in for processing when we run grep_elb -b -i -r -t=5 -v testdata/*.log (the testdata/*.log part is what ends up in flag.Args()).
  • Lines 84–88: If the debug flag 'd' is true, print out some additional information on which arguments are set and the file paths.
  • Lines 91–93: Pass in each log file to the parseELBLogfile() function, along with any command line parameters, such as thresholds etc.
  • Lines 31–71: This is where the magic happens. parseELBLogfile() basically opens each file, and reads all the lines in the file into a slice. For each line, it splits it into separate array items by using a space ‘ ‘ as a delimiter. We then evaluate the status of our flags that we accepted via the command line. If i, b or r are set, then we want those timings to be compared against the threshold parameter value ‘t’, and if they are greater than the threshold, output them.
  • Lines 15–29: We basically read in the lines of the file here. Notice the defer statement. This is kinda like a ‘finally’ block in a try-catch-finally statement. It makes sure the file gets closed before exiting the function.
  • Lines 13, 71, 92, 96: you might be wondering what those wg.x() statements are for? Well, they signify that we want to wait until the concurrent function calls (goroutines) are complete before we exit the program. We define the sync.WaitGroup on line 13, and then on line 92 we increment its counter with wg.Add(1) to show that we are running a function concurrently, i.e. we prefix the call with the 'go' keyword.
  • When we are finished within a function, we signify this with a wg.Done() (line 71), and we wait until all goroutines are complete with a wg.Wait() (line 96).

So what's it all look like then? Well, I ran the application against the same log data (5 files with 411,000 lines in total) that Peter used, on my 'manager' MacBook [1].

  • PERL: perl grep_elb.pl -t 5 testdata/*.log 6.56s user 0.19s system 95% cpu 7.067 total
  • GO: grep_elb -t=5 testdata/*.log 1.00s user 0.17s system 92% cpu 1.263 total. Note, this version excluded goroutines.

I actually tried again with goroutines and the timing ended up around 1.56s. I suspect that the IO operations of reading the files and writing to stdout are the bottlenecks. One final thing: Peter ran this against the full set of 24 hours' worth of ELB log files for a large service. That's 1,589 log files containing 5,159,601 lines, totalling 1.6Gb. Perl took 114 seconds. The Go version? 26.39 seconds.

Now, the title of this post asks if compiling is the new scripting. Possibly. I wrote this little app in Sublime Text with the GoSublime plugin. Compilation took 0.902 seconds, and I get autocomplete, formatting and a REPL from within the environment. Definitely a nice way to knock up a simple 'script'.

So — do you fancy having a go yourself with another language? Let us know how you get on.


[1] MacBook, Early 2015 (MacBook 8,1). 1.3 GHz, 8 GB RAM 1600 MHz DDR3, 256 GB SSD (FileVault enabled), Intel HD Graphics 5300 1536 MB. El Capitan, 10.11.3

2 thoughts on "IS COMPILING THE NEW SCRIPTING?"

  1. Apologies – you can get away *without* using a WaitGroup, by using a channel. You'll find this is a little easier to test.

    ➜ demo cat demo.go
    package main

    import (
        "fmt"
        "math/rand"
        "time"
    )

    func doStuff(done chan bool) {
        time.Sleep(time.Second * time.Duration(rand.Int31n(10)))
        done <- true
    }

    func main() {
        completionNotification := make(chan bool)
        i := 0
        for ; i < 10; i++ {
            go doStuff(completionNotification)
        }
        for i > 0 {
            select {
            case <-completionNotification:
                i--
                fmt.Println(i, "processes remaining")
            case <-time.After(1 * time.Second):
                fmt.Println("Still alive!")
            }
        }
    }
    ➜ demo go run demo.go
    9 processes remaining
    8 processes remaining
    7 processes remaining
    6 processes remaining
    Still alive!
    Still alive!
    Still alive!
    5 processes remaining
    4 processes remaining
    3 processes remaining
    2 processes remaining
    Still alive!
    1 processes remaining
    Still alive!
    0 processes remaining
