Recently, after updated Grape from 0.15 to 0.19.1, this unexpectedly slowed our test suite down by almost 60% overall. In an extreme case, an example is 26x slower than it was. For a guy who just made the test suite able to finish within 5 minutes, it is definitely unacceptable.
So first off, I identified the slowest example (the one I mentioned that is 26x slower) and StackProf it. By profiling that specific example, I was able to say with a little bit more confident that the regression was introduced by Grape::Util::StackableValues# since it accounted for 65% of CPU time.
Next, I found the very version introduced the regression is 0.17. Scanned through its commits, one particularly caught my eyes: Improve grape performance
1 2 3 4 5 6 7 8 9 10 11
def(name) return @froozen_values[name] if @froozen_values.key? name - value = [@inherited_values[name], @new_values[name]] - value.compact! - value.flatten!(1) + + value =  + value.concat(@inherited_values[name]) if @inherited_values[name] + value.concat(@new_values[name]) if @new_values[name] value end
It’s easy to see why the second implementation is preferred since Array#concat does automatic flattening if given object is an array.
The problem lies in the effort to prevent nil, which was handled by Array#compact! in previous versions. Here, it is done by adding conditionals to each #concat statement. Which may seems to be harmless and intuitive, these two conditionals actually cause multiple recursive calls of Grape::Util::StackableValues#, one in conditional, and the other in the statement itself if the conditional is true. As we can see from following profiling result, the difference is enormous:
def(name) return @froozen_values[name] if @froozen_values.key? name
value =  - value.concat(@inherited_values[name]) if @inherited_values[name] - value.concat(@new_values[name]) if @new_values[name] + value.concat(@inherited_values[name] || ) + value.concat(@new_values[name] || ) value end
There was a concern on using empty array would cause extra memory consumption. In the end, we’re convinced it would only take a minor GC to recycle that array thus no memory overhead.