November 9, 2015

IDR 02: The tool guy

My professor calls me "the tool guy". Here are the primary tools I'm using.

First, you have to track what you're doing.

  • Trello: Kanban board for TODO lists (TODO, Doing, Done)
  • Pomodoro: A timer to break my day into 25-minute segments of INTENSE focus

I've lost data too many times. I have three rules of thumb:

  • Automate everything
  • Check all source files into GitHub
  • Save all derived files to Amazon S3

When I automate something, I write a Bash, Groovy, or R script for it, depending on the task and domain. System-level things go into Bash. Java-y things go into Groovy, which calls Java libraries for the heavy lifting. Stats things go into R.

In the case of Groovy code, I have to "include" whatever Java libraries I want to call. So when I'm ready to call a script, I wrap it in a Gradle task which processes args and calls the Groovy script with the right classpath.
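
The classpath step can also be sketched outside Gradle. What follows is a minimal Bash sketch, not the Gradle task described above: it assumes the jars sit in a single directory and that the `groovy` launcher is on the PATH. The function names, the `libs` directory, and the script name in the usage line are hypothetical.

```shell
#!/usr/bin/env bash

# Join every jar in a directory into a colon-separated classpath string.
build_classpath() {
  local lib_dir="$1" cp="" jar
  for jar in "$lib_dir"/*.jar; do
    [ -e "$jar" ] || continue  # no jars: the glob stays unexpanded, so skip it
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}

# Run a Groovy script with that classpath, passing the remaining args through.
# Assumption: the standard `groovy` launcher, which accepts -cp.
run_groovy_script() {
  local script="$1" lib_dir="$2"
  shift 2
  groovy -cp "$(build_classpath "$lib_dir")" "$script" "$@"
}
```

A call would look something like `run_groovy_script analyze.groovy libs input.csv`. A Gradle task has the advantage that it can resolve the jars from a repository instead of a local directory.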

Alongside the Gradle -> Groovy -> Java stack of tools, I write scripts in R for statistical processing.

When I want to call a Gradle or R script (or a sequence of them in combination), I use Jenkins. I define Jenkins jobs which call Gradle and R.
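
As a rough illustration, the shell step of such a Jenkins job might chain the two tools. This sketch assumes `gradle` and `Rscript` are on the build machine's PATH. The task name `extractFeatures`, its `input` and `output` project properties, and the `summarize.R` script are hypothetical names chosen for the example.

```shell
#!/usr/bin/env bash

# Run a Gradle task, then hand its output file to an R script.
# Stops at the first failing step so a broken Gradle run never reaches R.
# Assumption: extractFeatures and summarize.R are placeholder names.
run_pipeline() {
  local input="$1" output="$2"
  gradle --quiet extractFeatures -Pinput="$input" -Poutput="$output" || return 1
  Rscript summarize.R "$output" || return 1
}
```

Returning a non-zero status matters here because Jenkins marks a shell build step as failed when the script exits with a non-zero code.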

For persistence across jobs, I use MongoDB and Amazon S3.
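
For the S3 half, saving a job's derived files could be as small as the sketch below. It assumes the AWS CLI is installed and already configured with credentials. The function name and its arguments are hypothetical.

```shell
#!/usr/bin/env bash

# Mirror a local directory of derived files to an S3 prefix.
# Usage: save_derived <local_dir> <bucket> <prefix>
# Assumption: the AWS CLI (`aws s3 sync`) is available and configured.
save_derived() {
  local local_dir="$1" bucket="$2" prefix="$3"
  aws s3 sync "$local_dir" "s3://$bucket/$prefix"
}
```

`aws s3 sync` only uploads files that are new or changed, so running it at the end of every job is cheap.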

My Jenkins "slave" machines (their term) need Gradle and R installed, and sometimes I need to run a bunch of similar tasks in parallel. That's why I bring up my Jenkins master and slaves (as well as a Nexus artifact repository for Java libraries and a MongoDB instance) inside Docker containers, while still persisting important data. My latest Docker scripts for this infrastructure are public on GitHub.

I also have a technique which uses the Jenkins Java API to break down work, so that it can be done via parallel jobs. I wrote about that here.
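
The general idea of breaking work down can be sketched without the Jenkins API. This is a minimal Bash sketch, not the Jenkins-based technique itself: it assumes the work is a text file with one item per line and deals the lines round-robin into chunk files, each of which could then be handed to its own job. The function and file names are hypothetical.

```shell
#!/usr/bin/env bash

# Deal the lines of a work list round-robin into N chunk files.
# Usage: partition_work <work_list_file> <num_chunks> <output_dir>
# Produces <output_dir>/chunk_0.txt ... chunk_<N-1>.txt
partition_work() {
  local work_list="$1" num_chunks="$2" out_dir="$3"
  mkdir -p "$out_dir"
  awk -v n="$num_chunks" -v dir="$out_dir" '
    { print > (dir "/chunk_" ((NR - 1) % n) ".txt") }
  ' "$work_list"
}
```

Round-robin dealing keeps the chunks within one item of each other in size, which helps when the items take roughly the same time to process.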

1 comment:

  1. It is interesting to see the parallels between your work methods and mine. We use the same types of tools for the same purposes. The difference is that I center most of my tools around GitLab, since it is my revision control tool: Kanban Board for GitLab instead of Trello, and GitLab CI instead of Jenkins.

    Good luck on your dissertation. I took the same type of time off from work to finish my master's thesis a few years ago after it languished for a year or so.

    --Jason Hall