Testing a Thousand Applications With Flipper
Feature flags are amazing. No, really, did I tell you that feature flags are amazing? They are. But you might be running a thousand applications. When that kind of complexity gets involved, you might need to test combinations of feature flags, sometimes dozens of those combinations. Exhaustive testing to the rescue!
As I mentioned: if you have many feature flags, sometimes your application might be dependent on the state of multiple feature flags at once. Imagine you have a feature flag called deferred_checkout, and another called buy_one_click. Since the formula for the number of possible states is 2 ** feature_count, we know that we have a matrix of 4 possible states:
On, On | On, Off
Off, Off | Off, On
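To sanity-check that count, here is a minimal sketch in plain Ruby (no Flipper involved) that enumerates the states for these two flags. The variable names are only for illustration.

```ruby
# Enumerate every on/off state for a list of flags (plain Ruby, no Flipper).
flags = [:deferred_checkout, :buy_one_click]

# Array#repeated_permutation(n) yields every n-tuple drawn from [true, false].
states = [true, false].repeated_permutation(flags.length).map do |values|
  flags.zip(values).to_h
end

# Print one line per state, e.g. "deferred_checkout: on, buy_one_click: off".
states.each do |state|
  puts state.map { |flag, on| "#{flag}: #{on ? 'on' : 'off'}" }.join(", ")
end

puts "#{states.length} states for #{flags.length} flags"
```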
With every extra feature flag the number of states doubles, so the matrix grows quickly: 3 flags give 8 states, 10 flags give 1024. There is in fact a great technique for testing these kinds of matrices: exhaustive testing. With it, we feed our software all the possible inputs and see how it reacts. Computers are great at enumerating large datasets, way better than us humans. Why not make our test suite generate test cases for all these combinations? When using Flipper, for instance, we could then write this:
test "the checkout screen renders correctly", feature_flags: [:buy_one_click, :deferred_checkout] do
  get "/checkout" #...
end
Thanks to the meta-programming abilities of Ruby we can put together such a helper quite easily. In your test_helper.rb, add the following:
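# One concrete on/off assignment for a set of feature flags.
class FeatureFlagCombo
  # table is a Hash of flag name => true/false
  def initialize(table)
    @table = table
  end

  # Apply this combination to Flipper.
  def set_flags!
    @table.each_pair do |flag, is_enabled|
      is_enabled ? Flipper.enable(flag) : Flipper.disable(flag)
    end
  end

  # Human-readable label, used in the generated test names.
  def to_s
    @table.map do |flag_name, is_enabled|
      "#{flag_name}: #{is_enabled ? :on : :off}"
    end.join(", ")
  end
end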
Then we will need a method which executes a block, passing it a combination of on|off values for every flag. This is a bit obscure (and makes a good question for a tech interview, which you probably should not be asking): generate the entire set of possible vectors, where each vector has N dimensions and the value of every dimension is restricted to a finite set. In other words, a Cartesian product. The N in this case is the number of feature flags involved, and the set of possible values per dimension is [true, false], but if you ever need such a contraption for more possible values it will work just fine.
def self.with_every_feature_flag_combination(*feature_flags)
  bit_values = [false, true]
  # Cartesian product of bit_values with itself, one factor per flag
  possible_combinations_of_enabled_and_disabled = bit_values.product(*[bit_values] * (feature_flags.length - 1))
  possible_combinations_of_enabled_and_disabled.each do |booleans|
    feature_combo = FeatureFlagCombo.new(feature_flags.zip(booleans).to_h)
    yield(feature_combo)
  end
end
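To illustrate the point about more possible values, here is a small sketch that uses the same product trick with three values per dimension. The rollout stages are hypothetical and purely for illustration; they are not Flipper states.

```ruby
# Hypothetical three-valued states, for illustration only (not a Flipper concept).
rollout_stages = [:off, :beta, :on]
flags = [:deferred_checkout, :buy_one_click]

# Same trick as above: multiply the value set by itself, one factor per flag.
combinations = rollout_stages.product(*[rollout_stages] * (flags.length - 1))
assignments = combinations.map { |stages| flags.zip(stages).to_h }

puts assignments.length # 3 ** 2 = 9 combinations
```

The count is now values ** flags rather than 2 ** flags, so it grows even faster.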
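This method will yield you a FeatureFlagCombo object for every such feature flag combination. If you have 2 flags, that is 4 yields; with 10 flags, 1024; and so forth. Then we need to override the test method provided by ActiveSupport::Testing::Declarative so that it accepts a keyword argument. Like the method above, it is a class method, so both belong in your test case base class (ActiveSupport::TestCase in a typical Rails setup):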
def self.test(name, feature_flags: [], &block)
  if feature_flags.any?
    with_every_feature_flag_combination(*feature_flags) do |combo|
      super("#{name} with features #{combo}") do
        combo.set_flags!
        instance_exec(&block)
      end
    end
  else
    super(name, &block)
  end
end
And we can define our tests:
test "a purchase is always refundable", feature_flags: [:discounted_purchase, :rapid_refund] do
  purchase = Purchase.create!
  assert_predicate purchase, :refundable?
end
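For the test above, the helper would define four separate test cases. The sketch below only reproduces the naming logic in plain Ruby (no Rails or Flipper), mirroring the to_s format and the name interpolation from the helper, to show roughly what would appear in the test output.

```ruby
# Reproduce the generated test names for the example above (naming logic only).
name = "a purchase is always refundable"
flags = [:discounted_purchase, :rapid_refund]

test_names = [false, true].product([false, true]).map do |booleans|
  label = flags.zip(booleans).map do |flag, is_enabled|
    "#{flag}: #{is_enabled ? :on : :off}"
  end.join(", ")
  "#{name} with features #{label}"
end

puts test_names
# a purchase is always refundable with features discounted_purchase: off, rapid_refund: off
# a purchase is always refundable with features discounted_purchase: off, rapid_refund: on
# a purchase is always refundable with features discounted_purchase: on, rapid_refund: off
# a purchase is always refundable with features discounted_purchase: on, rapid_refund: on
```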
This kind of per-combination grouping is actually where RSpec can be nicer than Minitest, because of its contexts. Note that Flipper automatically installs a test helper for you, and will revert all the feature flags after every test case. Flipper is amazing.