How Did That Get There?

I.

Earlier this week a co-worker of mine was working on some old code for running reports that had been written by someone who had long since departed. Every time he ran a report of a certain type he would always get wildly incorrect results. He had isolated the problem to a call to one particular method. RubyMine, his editor of choice, wasn't being very helpful in revealing the definition of the method. Exasperated, he said something to the effect of "How am I supposed to know where this method is defined?" I was more than a bit excited that I was able to tell him I knew of a way: Method#source_location.

Method#source_location returns the file name and line number where a given method is defined or nil if it's a native compiled method (e.g. part of the standard library). In order to actually use Method#source_location one must first actually have a method. Fortunately this is pretty easy; the Object#method method will return a Method object when called on an instance. If all that is available is a Class or Module then Module#instance_method can be used to get an UnboundMethod object. Either will work.

Here is an example of Method#source_location when called on a method that is defined in native code as part of the standard library.

2.4.1 :001 > x = 100
 => 100
2.4.1 :002 > x.method(:to_s)
 => #<Method: Integer#to_s>
2.4.1 :003 > x.method(:to_s).source_location
 => nil

In an irb session a method's source location won't have a file name, but the result won't be nil.

2.4.1 :004 > class Hello
2.4.1 :005?>   def hi
2.4.1 :006?>     puts 'hi!'
2.4.1 :007?>   end
2.4.1 :008?> end
 => :hi
2.4.1 :009 > Hello.new.method(:hi)
 => #<Method: Hello#hi>
2.4.1 :010 > Hello.new.method(:hi).source_location
 => ["(irb)", 5]

Here is an example of UnboundMethod#source_location.

2.4.1 :011 > Hello.instance_method(:hi)
 => #<UnboundMethod: Hello#hi>
2.4.1 :012 > Hello.instance_method(:hi).source_location
 => ["(irb)", 5]

If given a file named "goodbye.rb" with the following contents...

class Goodbye
  def bye
    puts 'bye!'
  end
end

...then here is an example of Method#source_location for a method defined in a file.

2.4.1 :013 > require_relative 'goodbye'
 => true
2.4.1 :014 > Goodbye.new.method(:bye)
 => #<Method: Goodbye#bye>
2.4.1 :015 > Goodbye.new.method(:bye).source_location
 => ["/home/sean/goodbye.rb", 2]

And once again the UnboundMethod#source_location version.

2.4.1 :016 > Goodbye.instance_method(:bye)
 => #<UnboundMethod: Goodbye#bye>
2.4.1 :017 > Goodbye.instance_method(:bye).source_location
 => ["/home/sean/goodbye.rb", 2]

II.

Using this technique my co-worker was able to quickly identify where the method in question was defined. Case closed. Well, not quite. It turned out the method's source location raised more questions than it answered. The method he was looking for was in a related but different class from the one he was expecting. This seemed suspicious so I suggested he try inspecting the inheritance chain of the object the method was called on using Module#ancestors method.

Module#ancestors returned nothing out of the ordinary at first glance.

[ReportA, ReportBase, Object, ..., Kernel, BasicObject]

What was confusing though was that the source location for the method was inside ReportB not ReportA. So, just how was that happening? After staring at the ReportA class for a minute I realized that it didn't inherit from ReportBase, instead it included it. I had a hunch, so I suggested we take a look at the ReportBase module.

Below is a minimal reproduction of the behavior.

module ReportBase
  def self.included(base)
    helpers = if base.const_defined?(:Helpers)
                base::Helpers.extend(Helpers)
              else
                Helpers
              end

    base.const_set(:Helpers, helpers)
  end

  module Helpers
    def greet
      puts 'greetings from ReportBase::Helpers#greet'
    end
  end
end

class ReportA
  include ReportBase

  module Helpers
    def greet
      puts 'greetings from ReportA::Helpers#greet'
    end
  end
end

class ReportB
  include ReportBase

  module Helpers
    def greet
      puts 'greetings from ReportB::Helpers#greet'
    end
  end
end

class ReportC
  include ReportBase
end

ReportA, ReportB, and ReportC are all pretty simple. All three include ReportBase. ReportA and ReportB both have a submodule named Helpers which defines a method named greet.

Where things start to get a little strange is inside ReportBase. The first thing to take note of is that Module#included is overridden. Module#included is a callback which is called whenever the module is included in another module or class. This allows for performing some specified action upon inclusion.

In the case of ReportBase the callback first checks to see if the constant Helpers is defined for the including class/module. If it is, then the including class/module's own Helpers submodule is extended into the ReportBase::Helpers submodule which is then assigned to helpers otherwise ReportBase::Helpers is assigned to helpers. Then the including class/module's Helpers constant is set to the helpers.

The end result of this is that if the including class/module has its own Helpers submodule then it is effectively merged with ReportBase::Helpers which then overwrites the Helpers submodule in the including class. This happens every time a class or module includes ReportBase. Because Module#const_set sets the Helpers constant to the Module object rather than creating a new copy this causes ReportBase::Helpers to end up polluted with the Helpers submodule of every class or module that include ReportBase. Worse, it also pollutes the Helpers submodule of each including class or module!

Below demonstrates the sort of frustration my co-worker was experiencing because of this.

2.4.1 :018 > require_relative 'report_test'
 => true
2.4.1 :019 > ReportA::Helpers.greet
greetings from ReportB::Helpers#greet
 => nil
2.4.1 :020 > ReportB::Helpers.greet
greetings from ReportB::Helpers#greet
 => nil
2.4.1 :021 > ReportC::Helpers.greet
greetings from ReportB::Helpers#greet
 => nil

At the risk of being hyperbolic: this behavior is awful. Truly, maddeningly, awful. Please do not write code like this!

In retrospect, after careful dissection, this code makes perfect sense. However, at a glance, the actual behavior is surprising. If all I could look at was the definitions of ReportA, ReportB, and ReportC it would take me ages to divine what is actually happening. And even with the source for ReportBase it still wasn't obvious what the source of the behavior was until I spent several minutes parsing through it in my head and writing a minimal reproduction similar to the one presented here.

I think what the author of the code was trying to do was make it so helper methods from one report class were available in all report classes. That sounds like it might be useful, but the way it was done clobbers the namespacing that the actual structure of the code appears to have. Rails helpers actually have very similar behavior. I suspect that's where the idea for this reporting code was taken from.

This is definitely a case where the code was a little too magical. The most impactful change that could be made to the code would be to make inclusion of the helper modules be explicit. Rather than automatically extend all the helper modules into one module, each report class could instead explicitly include any helpers. The urge to be clever and creative when writing code for an unexciting task like generating reports can be great. You're better off resisting that urge and instead keep things explicit and unsurprising. Your co-workers and your future self will thank you for it.