Lua base64 __again__ D’oh!
January 16, 2014 at 10:09 am

I am at it again. Sigh.

Optimization is trial and error. I have an encoding project that was taking a good 60+ seconds using BASH once a “huge” number of files uncovered the inherent inefficiency. Moving the “invariant encoding” out of the main loop helped; the time came down to around 30 seconds. However, I ~knew~ this was still a poor implementation. The problem was all of the sub-processes and string movement.

In comes Lua to offload the issue into a “real” language. The entire process dropped to around 6-7 seconds using the existing encoding schemes. That still seemed like a lot to me, so I went digging. The existing routines were “cool tricky Lua” built around string.gsub. This ~is~ cool, but it can take a large amount of memory. It was at that point that I opted for “I can do it better.” You can read about that below. Today’s “rant” is about shaving even more time off encoding. (Encoding is one of the “points” of the project in question.)
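
To make that concrete, here is a simplified sketch of the string.gsub style of encoder. This is my own illustration, not the actual code of any library in the table below. Every three-byte group goes through a callback that returns a fresh four-character string, and gsub assembles yet another full-size copy of the output, so the garbage adds up quickly on large inputs:

```lua
-- Simplified gsub-style base64 encoder (illustrative only).
local b64chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

local function encode_gsub(data)
  local pad = (3 - #data % 3) % 3
  data = data .. string.rep('\0', pad)           -- pad to a multiple of 3
  local out = (data:gsub('...', function(chunk)  -- one callback per 3 bytes
    local a, b, c = chunk:byte(1, 3)
    local v = a * 65536 + b * 256 + c
    local c1 = math.floor(v / 262144)
    local c2 = math.floor(v / 4096) % 64
    local c3 = math.floor(v / 64) % 64
    local c4 = v % 64
    -- every match allocates a new 4-character string
    return b64chars:sub(c1 + 1, c1 + 1) .. b64chars:sub(c2 + 1, c2 + 1)
        .. b64chars:sub(c3 + 1, c3 + 1) .. b64chars:sub(c4 + 1, c4 + 1)
  end))
  if pad > 0 then
    out = out:sub(1, #out - pad) .. string.rep('=', pad)
  end
  return out
end
```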

So first, the results of the “brute force” test, an 820K file sent through each routine:

| base64 (GNU 8.21) | base64.lua (github/ErnieE5) | base64.lua (github/paulmoore) | Lua wiki |
| --- | --- | --- | --- |
| 0.055s | 0.245s | 0.916s | 2.586s |
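
A harness like the one below is enough to produce this kind of single-shot number. The file name and the os.clock timing are placeholder choices for illustration, not necessarily how I produced the table above:

```lua
-- Minimal single-shot timing harness (a sketch, not the exact test setup).
local function time_encode(label, encode, data)
  local start = os.clock()
  local encoded = encode(data)
  print(string.format('%-30s %.3fs  (%d bytes in, %d out)',
                      label, os.clock() - start, #data, #encoded))
end

local f = assert(io.open('sample.dat', 'rb'))  -- stand-in for the 820K test file
local data = f:read('*a')
f:close()

-- e.g., with the encode_gsub sketch from above:
-- time_encode('gsub-based', encode_gsub, data)
```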

And then the delta from my latest optimization in the “real world” of the app.

Before: ~0.984s. After: ~0.877s. That is roughly a tenth of a second, based on this app’s “real world” usage of encoding around 110,000 smaller (30-80 byte) strings into base64. A tenth of a second is just enough to perceive the difference.
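
For the curious, the general shape of that kind of optimization looks like the sketch below. This is the technique, not the actual ErnieE5 implementation: precompute the invariant lookups once, walk the bytes directly, and assemble the result with table.concat instead of gsub or repeated concatenation.

```lua
local alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

-- Precompute, once, the two output characters for every 12-bit input value.
-- This is loop-invariant work; it never runs in the per-string path.
local pair = {}
for v = 0, 4095 do
  local hi, lo = math.floor(v / 64), v % 64
  pair[v] = alphabet:sub(hi + 1, hi + 1) .. alphabet:sub(lo + 1, lo + 1)
end

local function encode(data)
  local out, n = {}, 0
  local len = #data
  local full = len - len % 3
  for i = 1, full, 3 do
    local a, b, c = data:byte(i, i + 2)
    local v = a * 65536 + b * 256 + c
    out[n + 1] = pair[math.floor(v / 4096)]  -- top 12 bits -> 2 chars
    out[n + 2] = pair[v % 4096]              -- low 12 bits -> 2 chars
    n = n + 2
  end
  local rest = len - full
  if rest == 1 then
    out[n + 1] = pair[data:byte(len) * 16] .. '=='
  elseif rest == 2 then
    local a, b = data:byte(len - 1, len)
    local v = (a * 256 + b) * 4
    out[n + 1] = pair[math.floor(v / 64)]
        .. alphabet:sub(v % 64 + 1, v % 64 + 1) .. '='
  end
  return table.concat(out)
end
```

The pair table costs 4,096 tiny strings up front, but that price is paid exactly once instead of on every one of the 110,000 calls.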

If I ~really~ need the app to scream, I ~know~ that well under 100ms is plausible if I move to a compiled version. For this application, BASH scripts used to be good enough, and Lua will likely suffice for now. (Until the number of encodings hits over a million!) I’d likely try a Java approach before I resorted to having three different binaries.

Lua isn’t a self-optimizing language, which can be fun. (Or at least a challenge for me.)
Ernie
