November 13, 2015

IDR 05: R (and cycle times, again)

Cycle times have indeed been killing me since Wednesday, but I am making some progress.

I've written an R script that applies the Lasso method (https://en.wikipedia.org/wiki/Least_squares#Lasso_method) to my candidate features, which should help identify the most important ones.
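
For anyone who hasn't met the Lasso before, here is a minimal sketch of the idea. It is not the R script mentioned above (that isn't shown in this post); it's a plain-Python stand-in using cyclic coordinate descent on made-up toy data. The function names, the penalty value of 0.5, and the 5-feature toy problem are all assumptions chosen for illustration. The point is that the L1 penalty pushes the weights of unhelpful features to exactly zero, so the surviving nonzero weights mark the features worth keeping.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(rows, targets, penalty, sweeps=200):
    """Minimise (1/2n)*||y - Xw||^2 + penalty*||w||_1 by cyclic coordinate descent.

    rows: list of n feature vectors (lists of floats); targets: list of n floats.
    No intercept is fitted, so centre the data first if that matters.
    """
    n = len(rows)
    p = len(rows[0])
    weights = [0.0] * p
    residual = list(targets)  # y - Xw, with w = 0 initially
    # Mean squared value of each column; constant across sweeps.
    col_scale = [sum(row[j] ** 2 for row in rows) / n for j in range(p)]

    for _ in range(sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # an all-zero column can never be selected
            old = weights[j]
            # Correlation of column j with the residual that ignores feature j.
            rho = sum(row[j] * (residual[i] + row[j] * old)
                      for i, row in enumerate(rows)) / n
            new = soft_threshold(rho, penalty) / col_scale[j]
            if new != old:
                # Keep the residual in sync with the updated weight.
                for i, row in enumerate(rows):
                    residual[i] -= row[j] * (new - old)
                weights[j] = new
    return weights


def demo_selected_features(seed=0):
    """Toy data (hypothetical): 5 candidate features, only 0 and 3 drive the target."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
    targets = [3.0 * r[0] - 2.0 * r[3] + rng.gauss(0.0, 0.1) for r in rows]
    weights = lasso_coordinate_descent(rows, targets, penalty=0.5)
    # Features whose weight survived the L1 penalty.
    return [j for j, w in enumerate(weights) if w != 0.0]


if __name__ == "__main__":
    print(demo_selected_features())
```

On the toy data only the two features that actually drive the target should keep nonzero weights. With real feature counts you would want a proper library implementation and a cross-validated penalty rather than a hand-picked one.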

I've had to run multiple cycles of "feature extraction", though, to deal with a bug in an "optimization" I had written in that code about a year ago. Ouch.

Now that I've got that sorted out, I have another hour to go until I have some feature data. Six hours after that, I'll have the full data ready for processing in R. To give an idea of the scale, I'm working with 10K test cases and 100K features per test case at the moment. I'm currently working with a single "app" (Application Under Test, or AUT), but have three others ready once the features are finalized.
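
To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It assumes the worst case of a fully dense matrix stored as 8-byte doubles; the real data may well be sparse or stored differently, so treat this as an upper-bound sketch rather than a measurement. The helper name is made up for illustration.

```python
def dense_matrix_bytes(test_cases, features_per_case, bytes_per_value=8):
    """Size of a dense test-case-by-feature matrix (assumes 8-byte doubles)."""
    return test_cases * features_per_case * bytes_per_value


cells = 10_000 * 100_000                       # 10K test cases x 100K features
size_bytes = dense_matrix_bytes(10_000, 100_000)

print(cells)             # 1000000000 values
print(size_bytes / 1e9)  # 8.0 (GB, decimal), under the dense-doubles assumption
```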

In the meantime, the ordinary grad student responsibilities don't really stop. I've got to review two papers, both written in really bad English, for an international conference ... but at least that will be out of the way by the time my full data set is ready for this app!

Thanks for reading.

IDR Series
