Sean Eshbaugh

Web Developer + Programmer

Finding All ActiveRecord Callbacks

Most of the time ActiveRecord callbacks are pretty straightforward. But sometimes, in larger projects or when using certain gems, you can end up with more callbacks happening than you realize. If you're curious about exactly what is happening, and when, on your model, there's no straightforward way that I'm aware of to find out. However, it's actually not too difficult to do yourself.

If you look at the methods available on an ActiveRecord model you'll find several related to callbacks. Here's what we find when inspecting a model that has a Paperclip attachment (you'll see why in a minute).

~/my_project% rails c
Loading development environment (Rails 4.2.0)
2.2.1 :001 > MyModel.methods.select { |method| method.to_s.include?('callback') }
 => [:_validate_callbacks,
 :_save_callbacks,
 :_destroy_callbacks,
 :_commit_callbacks,
 :_post_process_callbacks,
 :_post_process_callbacks?,
 :_post_process_callbacks=,
 :_file_post_process_callbacks,
 :_file_post_process_callbacks?,
 :_file_post_process_callbacks=,
 :_validate_callbacks?,
 :_validate_callbacks=,
 :_validation_callbacks,
 :_validation_callbacks?,
 :_validation_callbacks=,
 :_initialize_callbacks,
 :_initialize_callbacks?,
 :_initialize_callbacks=,
 :_find_callbacks,
 :_find_callbacks?,
 :_find_callbacks=,
 :_touch_callbacks,
 :_touch_callbacks?,
 :_touch_callbacks=,
 :_save_callbacks?,
 :_save_callbacks=,
 :_create_callbacks,
 :_create_callbacks?,
 :_create_callbacks=,
 :_update_callbacks,
 :_update_callbacks?,
 :_update_callbacks=,
 :_destroy_callbacks?,
 :_destroy_callbacks=,
 :_commit_callbacks?,
 :_commit_callbacks=,
 :_rollback_callbacks,
 :_rollback_callbacks?,
 :_rollback_callbacks=,
 :raise_in_transactional_callbacks,
 :raise_in_transactional_callbacks=,
 :define_paperclip_callbacks,
 :normalize_callback_params,
 :__update_callbacks,
 :set_callback,
 :skip_callback,
 :reset_callbacks,
 :define_callbacks,
 :get_callbacks,
 :set_callbacks,
 :define_model_callbacks]

That's a pretty lengthy list, and just by glancing at it we can see several methods like _initialize_callbacks= and skip_callback that aren't likely to be relevant to the problem at hand. The protected method get_callbacks looks promising, but if you look at the source:

def get_callbacks(name)
  send "_#{name}_callbacks"
end

it quickly becomes obvious that it wasn't meant to be used to get a comprehensive list of all the callbacks on a model. Instead it just gives us the callbacks related to one particular event. That's great, but what about when we don't know all of the events? I deliberately chose a model with a Paperclip attachment because Paperclip provides some of its own callback events. They could easily be missed if we assumed only the standard ActiveRecord callbacks were available. Without knowing otherwise beforehand, that's a fair but potentially incorrect assumption.

From get_callbacks we can see that the methods it calls all take the form of "_#{name}_callbacks" where name is the name of the event. Well, a few methods in our list from before seem to match that pattern, so with a little help from a regular expression we can get just those:

2.2.1 :002 > MyModel.methods.select { |method| method.to_s =~ /^_{1}[^_].+_callbacks$/ }
 => [:_validate_callbacks,
 :_save_callbacks,
 :_destroy_callbacks,
 :_commit_callbacks,
 :_post_process_callbacks,
 :_file_post_process_callbacks,
 :_validation_callbacks,
 :_initialize_callbacks,
 :_find_callbacks,
 :_touch_callbacks,
 :_create_callbacks,
 :_update_callbacks,
 :_rollback_callbacks]
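
To see what the pattern accepts and rejects without opening a Rails console, here is a minimal sketch that runs it against a hand-picked sample of the method names from the first listing. The sample_methods array is only an illustration I typed in by hand, not output from a real model.

```ruby
# A hand-picked sample of the method names listed earlier (not real console output).
sample_methods = [
  :_save_callbacks, :_save_callbacks?, :_save_callbacks=,
  :_file_post_process_callbacks, :__update_callbacks,
  :get_callbacks, :define_paperclip_callbacks
]

# One leading underscore, an event name, then "_callbacks" and nothing after it.
pattern = /^_{1}[^_].+_callbacks$/

callback_readers = sample_methods.select { |method| method.to_s =~ pattern }

p callback_readers
```

The predicate and writer variants fail because of the trailing anchor, __update_callbacks fails because its second character is an underscore, and the remaining names don't start with an underscore at all.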

This is great, but still not quite what we want. Each of these methods returns an array-like CallbackChain object containing a set of Callback objects:

2.2.1 :003 > MyModel._save_callbacks
 => #<ActiveSupport::Callbacks::CallbackChain:0x007fbf7567e918
 @callbacks=nil,
 @chain=
  [#<ActiveSupport::Callbacks::Callback:0x007fbf7362c098
    @chain_config=
     {:scope=>[:kind, :name],
      :terminator=>
       #<Proc:0x007fbf73237cf8@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/activemodel-4.2.0/lib/active_model/callbacks.rb:106 (lambda)>,
      :skip_after_callbacks_if_terminated=>true},
    @filter=
     #<Proc:0x007fbf7362c390@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:91>,
    @if=
     [#<ActiveSupport::Callbacks::Conditionals::Value:0x007fbf7362c340
       @block=
        #<Proc:0x007fbf7362c2f0@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/activemodel-4.2.0/lib/active_model/callbacks.rb:141>>],
    @key=70230125666760,
    @kind=:after,
    @name=:save,
    @unless=[]>,
   #<ActiveSupport::Callbacks::Callback:0x007fbf75684ae8
    @chain_config=
     {:scope=>[:kind, :name],
      :terminator=>
       #<Proc:0x007fbf73237cf8@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/activemodel-4.2.0/lib/active_model/callbacks.rb:106 (lambda)>,
      :skip_after_callbacks_if_terminated=>true},
    @filter=:autosave_associated_records_for_document,
    @if=[],
    @key=:autosave_associated_records_for_document,
    @kind=:before,
    @name=:save,
    @unless=[]>,
   #<ActiveSupport::Callbacks::Callback:0x007fbf7567ea80
    @chain_config=
     {:scope=>[:kind, :name],
      :terminator=>
       #<Proc:0x007fbf73237cf8@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/activemodel-4.2.0/lib/active_model/callbacks.rb:106 (lambda)>,
      :skip_after_callbacks_if_terminated=>true},
    @filter=:autosave_associated_records_for_uploader,
    @if=[],
    @key=:autosave_associated_records_for_uploader,
    @kind=:before,
    @name=:save,
    @unless=[]>],
 @config=
  {:scope=>[:kind, :name],
   :terminator=>
    #<Proc:0x007fbf73237cf8@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/activemodel-4.2.0/lib/active_model/callbacks.rb:106 (lambda)>,
   :skip_after_callbacks_if_terminated=>true},
 @mutex=#<Mutex:0x007fbf7567e8c8>,
 @name=:save>

Each of these has an interesting method named raw_filter which returns either a method name Symbol or a Proc object. Let's see what we get when we inspect that for each of our model's save callbacks:

2.2.1 :004 > MyModel._save_callbacks.map { |callback| callback.raw_filter }
 => [#<Proc:0x007fbf7362c390@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:91>,
 :autosave_associated_records_for_document,
 :autosave_associated_records_for_uploader]

We get an array with a Proc and a couple of Symbols, which starts to give us a much better sense of what will happen when we save a model. There's one more important detail that we've overlooked, though: each Callback object has a kind property that tells us whether the callback gets called before, after, or around the event. Let's group our callbacks by kind:

2.2.1 :005 > MyModel._save_callbacks.group_by(&:kind).each { |_, callbacks| callbacks.map! { |callback| callback.raw_filter } }
 => {:after=>
  [#<Proc:0x007fbf7362c390@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:91>],
 :before=>
  [:autosave_associated_records_for_document,
   :autosave_associated_records_for_uploader]}
 => {:after=>[#<Proc:0x007fbf7362c390@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:91>], :before=>[:autosave_associated_records_for_document, :autosave_associated_records_for_uploader]}
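
The grouping step is easy to try outside of Rails. Below is a small sketch that uses a Struct as a stand-in for ActiveSupport's Callback objects; StubCallback and :process_attachment are names I made up for the illustration. It also uses transform_values (available from Ruby 2.4) in place of each with map!, which builds a new hash rather than mutating the grouped arrays.

```ruby
# Stand-in for ActiveSupport's Callback objects; only the two readers used here.
StubCallback = Struct.new(:kind, :raw_filter)

save_callbacks = [
  StubCallback.new(:after,  :process_attachment), # hypothetical filter name
  StubCallback.new(:before, :autosave_associated_records_for_document),
  StubCallback.new(:before, :autosave_associated_records_for_uploader)
]

# group_by builds { kind => [callbacks] }; transform_values then swaps each
# callback for its raw_filter.
by_kind = save_callbacks.group_by(&:kind).transform_values do |callbacks|
  callbacks.map(&:raw_filter)
end

p by_kind
```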

Awesome! Finally something that will start to give us real insight into what happens when. But we can still do better. What about all the callbacks? If we combine the regular expression filter of the class methods from before with the above we get a complete picture for the whole model:

2.2.1 :006 > MyModel.methods.select { |method| method.to_s =~ /^_{1}[^_].+_callbacks$/ }.each_with_object({}) { |method, memo| memo[method] = MyModel.send(method).group_by(&:kind).each { |_, callbacks| callbacks.map! { |callback| callback.raw_filter } } }
 => {:_validate_callbacks=>
  {:before=>
    [#<ActiveModel::BlockValidator:0x007fbf7362d3f8
      @attributes=[:file],
      @block=
       #<Proc:0x007fbf7362d510@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:27>,
      @options={}>,
     #<Paperclip::Validators::MediaTypeSpoofDetectionValidator:0x007fbf73624320
      @attributes=[:file],
      @options=
       {:if=>
         #<Proc:0x007fbf736245f0@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:85 (lambda)>}>,
     #<ActiveRecord::Validations::PresenceValidator:0x007fbf7567e440
      @attributes=[:document],
      @options={}>,
     #<ActiveRecord::Validations::PresenceValidator:0x007fbf7567dc20
      @attributes=[:uploader],
      @options={}>,
     #<ActiveRecord::Validations::UniquenessValidator:0x007fbf7567d400
      @attributes=[:file_fingerprint],
      @klass=
       MyModel(id: integer, file_file_name: string, file_content_type: string, file_file_size: integer, file_updated_at: datetime, file_fingerprint: string, created_at: datetime, updated_at: datetime),
      @options=
       {:case_sensitive=>true,
        :if=>
         #<Proc:0x007fbf7567d5b8@/Users/sean_eshbaugh/sites/clickherelabs/hub/app/models/attachment.rb:22 (lambda)>}>,
     #<Paperclip::Validators::AttachmentPresenceValidator:0x007fbf7567c4b0
      @attributes=[:file],
      @options={}>,
     #<Paperclip::Validators::AttachmentSizeValidator:0x007fbf756774d8
      @attributes=[:file],
      @options={:less_than=>1073741824}>,
     #<Paperclip::Validators::AttachmentFileTypeIgnoranceValidator:0x007fbf75676510
      @attributes=[:file],
      @options={}>]},
 :_save_callbacks=>
  {:after=>
    [#<Proc:0x007fbf7362c390@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:91>],
   :before=>
    [:autosave_associated_records_for_document,
     :autosave_associated_records_for_uploader]},
 :_destroy_callbacks=>
  {:before=>
    [#<Proc:0x007fbf73627f48@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:92>]},
 :_commit_callbacks=>
  {:after=>
    [#<Proc:0x007fbf736279f8@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/has_attached_file.rb:93>]},
 :_post_process_callbacks=>{},
 :_file_post_process_callbacks=>
  {:before=>
    [#<Proc:0x007fbf75687888@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/validators.rb:67>,
     #<Proc:0x007fbf75677b18@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/validators.rb:67>,
     #<Proc:0x007fbf75676bf0@/Users/sean_eshbaugh/.rvm/gems/ruby-2.2.1@my_project/gems/paperclip-4.2.1/lib/paperclip/validators.rb:67>]},
 :_validation_callbacks=>{},
 :_initialize_callbacks=>{},
 :_find_callbacks=>{},
 :_touch_callbacks=>{},
 :_create_callbacks=>{},
 :_update_callbacks=>{},
 :_rollback_callbacks=>{}}

And for the sake of reusability we can easily wrap this up in a module (pardon the terrible name):

module ShowCallbacks
  def show_callbacks
    _callback_methods = methods.select do |method|
      method.to_s =~ /^_{1}[^_].+_callbacks$/
    end

    _callback_methods.each_with_object({}) do |method, memo|
      memo[method] = send(method).group_by(&:kind).each do |_, callbacks|
        callbacks.map! do |callback|
          callback.raw_filter
        end
      end
    end
  end
end

class MyModel
  extend ShowCallbacks
  ...
end
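
If you want to poke at the module without a Rails app handy, a plain Ruby class can fake the callback readers. This is only a sketch: StubCallback, FakeModel, and the filter names are all invented for the example, and the module is repeated (slightly condensed) so the snippet runs on its own.

```ruby
module ShowCallbacks
  def show_callbacks
    callback_methods = methods.select { |method| method.to_s =~ /^_{1}[^_].+_callbacks$/ }

    callback_methods.each_with_object({}) do |method, memo|
      memo[method] = send(method).group_by(&:kind).each do |_, callbacks|
        callbacks.map!(&:raw_filter)
      end
    end
  end
end

# Stand-in for ActiveSupport's Callback objects.
StubCallback = Struct.new(:kind, :raw_filter)

# A fake "model" that exposes two callback readers as class methods.
class FakeModel
  extend ShowCallbacks

  def self._save_callbacks
    [StubCallback.new(:before, :normalize_title), StubCallback.new(:after, :notify)]
  end

  def self._touch_callbacks
    []
  end
end

p FakeModel.show_callbacks
```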

One of the Hard Things

There are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.

I love that quote, not only because it's more amusing than it should be, but because it's extremely true. I know because I've been bitten by all three things plenty of times. Tonight it was while using the rails-settings-cached gem to handle some global settings for a Rails application.

At some point I truncated the settings table so I could reset it with new defaults. Afterwards my new settings weren't taking in the application or showing up in the database. I tried to mimic the behavior of #save_default but with some extra output by doing the following inside my initializer

if Setting.application_title.nil?
  puts 'Setting application_title.'

  Setting.application_title = 'My Application'
end

just to make sure something weird wasn't going on.

Setting.application_title wasn't returning nil, so the setting wasn't being set, even after restarting the server. I discovered that when I added Rails.cache.delete('settings:application_title') before the above, it worked just fine. So of course the normal call to #save_default worked just fine as well.

It then occurred to me that the problem might be related to Spring, which keeps Rails loaded and ready to get started quickly. I couldn't find confirmation in the Spring source, but I'm guessing that by keeping the Rails process around it also keeps the cache nice and full. This means that, despite removing the settings table's contents and restarting the server, the old settings were hanging around in memory. I'm hesitant to say with 100% confidence that this is what was happening, but it certainly makes sense to me.

Spring ships with Rails 4.1 by default so if you're making heavy use of the Rails cache this sort of thing is probably something you'll have to look out for. Also, keep in mind that the Spring readme does mention, "There's no need to 'shut down' spring. This will happen automatically when you close your terminal."

Converting Text into a Sorted List

Ever had a list that's sorta broken up into different lines but also mostly just uses spaces to delimit items? Ever wanted each item in that list on its own line? Ever wanted that one-line-per-item list sorted? It's shocking how often I need to do this. Actually it's probably more unfortunate than shocking.

If you find yourself needing to do all that too then you're in luck! It turns out there are plenty of easy ways to turn a bunch of words into a sorted list!

First, a few notes...

Since I'm on OSX I'm accessing my clipboard with pbcopy and pbpaste. If you're on Linux with X11 you can use xclip or xsel instead. Obviously you can replace the clipboard paste with some other output command and you can omit the clipboard copy or replace it with some other command.

All of these examples use grep and sort. grep is used here to remove blank lines. I'll just leave it at that since a book could be (and apparently has been) written about grep. sort does exactly what you'd expect: it sorts a file or input. The -f option makes the sorting case insensitive. If you really do want words that start with capital letters to go first then omit that option.

sed

% pbpaste | sed 's/[[:space:]]/\'$'\n'/g | grep -v '^[[:space:]]*$' | sort -f | pbcopy

sed is ancient. Despite its age it remains incredibly powerful and versatile. If you're on OSX or some other BSD variant then your sed will function somewhat differently from GNU sed. I won't waste a bunch of space explaining the details here, but this Unix & Linux Stack Exchange question explains it nicely. Basically BSD sed doesn't do escape sequences in output. The best solution I've seen to the problem is in this Stack Overflow comment. If you're on Linux and using GNU sed then this is what you'd do:

% xclip -o -selection clipboard | sed 's/[[:space:]]/\n/g' | grep -v '^[[:space:]]*$' | sort -f | xclip -i -selection clipboard

The s command takes a regular expression, a replacement string, and optionally one or more flags in the form "s/regular expression/replacement string/flags". The g flag, like it does most places, makes the substitution global.

tr

% pbpaste | tr -s '[:space:]' '\n' | grep -v '^[[:space:]]*$' | sort -f | pbcopy

tr is similar to sed but much simpler. So simple there isn't much to say. The first argument is a set of characters to replace and the second argument is a corresponding set of characters to replace the first with, one-to-one. The -s option squeezes consecutive occurrences of the replacement characters into a single character.

awk

% pbpaste | awk '{gsub(/[[:space:]]+/, "\n"); print}' | grep -v '^[[:space:]]*$' | sort -f | pbcopy

awk reads each line and executes the action inside the curly braces for each line. In our case we're using gsub to do a global substitution and then unconditionally printing the line. awk does far more than simple substitution and printing so there's probably a million different ways to accomplish this task. I've met several people who swear by awk, and I can understand why. Personally, I find it to be too awkward (pun sorta intended) for serious use given that alternatives with far fewer rough edges and more extensibility exist.

Ruby

% pbpaste | ruby -ne 'puts $_.gsub(/[[:space:]]+/, "\n")' | grep -v '^[[:space:]]*$' | sort -f | pbcopy

This right here is actually the biggest reason why I'm writing this post. Whenever I'm faced with a task involving transforming text my natural inclination is to write a small throwaway script in Ruby to get the job done. Usually those scripts end up being fairly elaborate and proper, in the sense that they could easily be part of an actual program. I like to make it a habit to not write overly terse code. Even when I know I'm going to throw it all away I like my code to be readable, with nice descriptive variable names and no magical shortcuts. That being said, this article inspired me to venture forth and try my hand at something arcane and nigh unreadable. I try to avoid writing Ruby that looks like 1990s Perl, but the -n option coupled with -e is just too cool to ignore. I will, however, choose to ignore that the Ruby example looks almost exactly like the awk example. Personally I don't think that's a very flattering comparison.
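
For contrast, here is roughly what the readable throwaway-script version might look like. It's a sketch rather than a drop-in replacement for the pipeline: sorted_words is a name I picked, the sample string stands in for the clipboard, and sort_by with downcase only approximates sort -f (words that differ only by case may come out in either order).

```ruby
# Split on any run of whitespace (spaces, tabs, newlines), then sort
# case-insensitively. String#split with no arguments also drops blank items.
def sorted_words(text)
  text.split.sort_by(&:downcase)
end

sample = "pear Apple\n\n  fig   banana\tCherry"

puts sorted_words(sample)
```

Swapping sample for $stdin.read would let the script sit between pbpaste and pbcopy just like the one-liners above.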

If all of this seems familiar, it's probably because you've seen Avdi Grimm's excellent post on solving almost the same problem in several different languages.

Multiple Key Hashes in Ruby

Here's an idea I've had rolling around inside my head for a while: hashes with multiple keys for the same value. Or, rather, some data structure, that is like a hash, except that only the values are unique, not the key/value pair. A data structure like that would allow for multiple keys to access the same underlying data. What use could this possibly be? Well, I occasionally find myself doing something along these lines:

def flash_message_alert_class(name)
  case name
    when :success, :notice
      'alert-success'
    when :info
      'alert-info'
    when :warning
      'alert-warning'
    when :danger, :alert, :error
      'alert-danger'
    else
      'alert-info'
  end
end

Where name is a key to the Rails flash hash. That particular example isn't too egregious; it's easy enough to understand, only a handful of lines long, and most importantly has only a few possible outcomes. But what if that wasn't the case? What if we had 10, 100, or even 1000 when clauses? What if each of those clauses had as many possible values that would trigger it? That seems far-fetched, and it is, but consider a more likely scenario: what if the above mapping between sets of symbols and a single string was somehow constructed at run-time based on various forms of input? It'd be very impractical or downright impossible to write a case statement to handle that. It occurred to me the other day that the above scenario could be modeled as a data structure much like the one I described.

I'm positive I'm not the first person to think of this, but I have no idea what it would be called so I can't verify whether or not it has a name. If you're reading this and know the proper technical name of the data structure I've described please send me a message, I would love to know. For now I'm calling it a "multiple key hash". Other possible names I've considered are "unique value hash", "deduplicated hash", and "double layered hash". That last one will make sense in a minute.

I did however find an interesting Stack Overflow answer which offered up what the poster called an AliasedHash. That data structure is pretty cool and is so close to what I've been thinking about but it's not quite there. I want "aliasing" to be implicit and consequently I want it to be impossible to have duplicate values. Attempting to create one will instead merely create an "alias".

Yesterday evening I finally got enough inspiration to implement a multiple key hash in Ruby. What I have so far is still very rough, untested (since I'm only one step beyond playing around in an irb REPL), and likely very bad as far as performance goes. Here are the most important parts:

class MultikeyHash
  def initialize(initial_values = nil)
    @outer_hash = {}

    @inner_hash = {}

    @next_inner_key = 1

    if initial_values
      initial_values.each do |keys, value|
        if keys.is_a?(Array) && !keys.empty?
          keys.each do |key|
            self[key] = value
          end
        else
          self[keys] = value
        end
      end
    end
  end

  def [](outer_key)
    inner_key = @outer_hash[outer_key]

    if inner_key.nil?
      nil
    else
      @inner_hash[inner_key]
    end
  end

  def []=(outer_key, new_value)
    inner_key = @inner_hash.key(new_value)

    if inner_key
      @outer_hash[outer_key] = inner_key
    else
      @outer_hash[outer_key] = @next_inner_key

      @inner_hash[@next_inner_key] = new_value

      @next_inner_key += 1
    end
  end
end

A quick note before I explain this code in detail. The MultikeyHash.new method behaves a bit differently from the Hash.new method; rather than take the default value (a feature I have not yet implemented) it takes a hash that represents the initial values of the MultikeyHash. Here is an example of how it would be used:

m = MultikeyHash.new(['a', 'b', 'c'] => 'letters', [1, 2, 3] => 'numbers') #=> #<MultikeyHash:0x007f9ad31bb370 @outer_hash={"a"=>1, "b"=>1, "c"=>1, 1=>2, 2=>2, 3=>2}, @inner_hash={1=>"letters", 2=>"numbers"}, @next_inner_key=3>

m[1]                                                                       #=> "numbers"

m[2]                                                                       #=> "numbers"

m['a']                                                                     #=> "letters"

m['b']                                                                     #=> "letters"
  

If a key in the initial hash is a non-empty array then each element in that array is made a key of the new MultikeyHash. This means that if you want an actual array to be a key you will have to nest it inside of another array. Unfortunately I haven't been able to come up with a better solution. I'm afraid this might become a nuisance since it's not at all obvious without reading the source for initialize. I'm also considering changing it to accept anything that responds to each to make it a bit more flexible.

The MultikeyHash class consists primarily of two hashes. The outer hash is what is exposed to the user. The keys behave like normal hash keys but the values are always just a key to the inner hash. I've chosen to use an integer for simplicity's sake. When accessing a MultikeyHash value we first find the inner hash key in the outer hash. If it exists we use that key to get the value from the inner hash, otherwise we return nil.

Setting a value is a bit more complicated. First we check the inner hash to see if the value exists in the inner hash and if it does we get the inner key for it and set the outer hash value for the outer key to the inner key. If the value was not found we set the outer hash value for the outer key to a new inner key and then set the inner hash value for that new inner key to the new value and increment the inner key counter. The result of all this shuffling is that new values are effectively inserted as normal and existing values are given one more key by which they can be accessed. From the user's perspective hash access occurs like normal, but in reality there are two layers of access, the first mediating access to the second (hence why "double layered hash" is a name I've considered).
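
To tie this back to the flash message example from the start of the post, here is a sketch of how that case statement could become a lookup. The class is a condensed copy of the one above so the snippet runs on its own (it finds the existing inner key with Hash#key), and ALERT_CLASSES is a constant name I made up. Since default values aren't implemented, the fallback is handled with a plain ||.

```ruby
# Condensed version of the MultikeyHash above, just enough for this example.
class MultikeyHash
  def initialize(initial_values = {})
    @outer_hash = {}
    @inner_hash = {}
    @next_inner_key = 1

    initial_values.each do |keys, value|
      # A non-empty array means "each element is a key"; anything else is one key.
      keys = [keys] unless keys.is_a?(Array) && !keys.empty?
      keys.each { |key| self[key] = value }
    end
  end

  def [](outer_key)
    # A missing outer key yields nil, which is never an inner key.
    @inner_hash[@outer_hash[outer_key]]
  end

  def []=(outer_key, new_value)
    # Reuse the inner key if the value is already stored, otherwise add it.
    inner_key = @inner_hash.key(new_value)

    if inner_key
      @outer_hash[outer_key] = inner_key
    else
      @outer_hash[outer_key] = @next_inner_key
      @inner_hash[@next_inner_key] = new_value
      @next_inner_key += 1
    end
  end
end

# Hypothetical mapping mirroring the case statement from earlier.
ALERT_CLASSES = MultikeyHash.new(
  [:success, :notice]       => 'alert-success',
  [:info]                   => 'alert-info',
  [:warning]                => 'alert-warning',
  [:danger, :alert, :error] => 'alert-danger'
)

def flash_message_alert_class(name)
  ALERT_CLASSES[name] || 'alert-info'
end

puts flash_message_alert_class(:notice)
puts flash_message_alert_class(:error)
puts flash_message_alert_class(:something_else)
```

Adding another alias later is just ALERT_CLASSES[:fatal] = 'alert-danger', which points a new outer key at the existing inner value instead of storing the string a second time.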

The above code works just fine, but it lacks something very important. One of the key features of Ruby's hashes is their ability to be enumerated. The Enumerable module provides a powerful set of methods to any class that implements its own each method. Let's take a look at just how easy this is:

class MultikeyHash
  include Enumerable

  # Omitting the rest of the class for the sake of brevity.

  def each(&block)
    @outer_hash.group_by { |_, inner_key| inner_key }.inject({}) { |acc, e| acc[e.last.map { |i| i.first }] = @inner_hash[e.first]; acc }.each do |key, value|
      block.call(key, value)
    end
  end
end

By grouping the outer hash by the inner key and then collecting those groups into a new hash, where each key is the array of outer keys and each value is the value the inner key points to, we end up with a hash that looks like {["a", "b", "c"]=>"letters", [1, 2, 3]=>"numbers"}. This lets us easily implement an inspect method:

class MultikeyHash
  # Omitting the rest of the class for the sake of brevity.

  def inspect
    "{#{self.map { |keys, value| "#{keys.inspect}=>#{value.inspect}" }.join(', ')}}"
  end

  def to_s
    inspect
  end
end

Because MultikeyHash has an each method it now has all the other goodies like map, select, reject, and inject.
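
The grouping that each performs can be tried on its own with two plain hashes. This sketch copies the outer and inner hashes from the earlier inspect output into local variables (outer_hash, inner_hash, and grouped are just illustrative names) and uses each_with_object in place of inject, which saves returning the accumulator by hand.

```ruby
# The two layers from the earlier example, as plain hashes.
outer_hash = { 'a' => 1, 'b' => 1, 'c' => 1, 1 => 2, 2 => 2, 3 => 2 }
inner_hash = { 1 => 'letters', 2 => 'numbers' }

# Group outer keys by the inner key they point at, then rebuild a hash whose
# keys are arrays of outer keys and whose values come from the inner hash.
grouped = outer_hash.group_by { |_, inner_key| inner_key }
                    .each_with_object({}) do |(inner_key, pairs), result|
  result[pairs.map(&:first)] = inner_hash[inner_key]
end

p grouped
```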

I'm still pretty hesitant to say this data structure is a good idea. I haven't actually used it for anything so I have no idea how it works in the real world. Odds are I never will. Either way, building new types of data structures is always lots of fun! You can find the whole class here.

Chef Resource Conditionals

Lately it seems like all of my posts are about things that are super, painfully, embarrassingly obvious in hindsight. The trend continues!

Over the last week I've been learning to use Chef to set up some servers at work (with the help of the iron_chef gem, which was written by a co-worker of mine). At this point I feel like a real dummy for never having bothered to use Chef before, especially since it's been around for some time now. If you're not using Chef for server management you really ought to look into it. It makes automating your setup easy and having everything that your servers need documented in your scripts is awesome.

Despite quickly becoming a "why wasn't I using this before?" sort of tool, there have been a few conceptual hurdles, as there always are with any framework or DSL. The one that really got me is the not_if/only_if conditional guards on resource blocks. The Chef documentation lays it out in what seems like a straightforward manner:

The not_if and only_if conditional executions can be used to put additional guards around certain resources so that they are only run when the condition is met.

Seems simple right? Well, if you look around enough you'll see examples of not_if and only_if used with either a block passed as the argument or with a String passed as the argument.

Here are two quick, real, and I swear not-contrived examples. One with a block:

bash 'unarchive-lame-source' do
  cwd ::File.dirname(src_filepath)

  code <<-EOH
    tar zxf #{::File.basename(src_filepath)} -C #{::File.dirname(src_filepath)}
  EOH

  not_if { ::File.directory?(::File.join(Chef::Config[:file_cache_path] || 'tmp', "lame-#{node['lame']['version']}")) }
end

And one with a string:

bash 'compile-lame-source' do
  cwd ::File.dirname(src_filepath)

  code <<-EOH
    cd lame-#{node['lame']['version']} &&
    ./configure #{lame_options.join(' ')} &&
    make &&
    make install
  EOH

  not_if 'sudo ldconfig && ldconfig -p | grep libmp3lame'
end

Here comes the embarrassing part. To me, at least, it wasn't clear what each form of the method call did, or really that there is a difference between the two. When passing a block as the argument, the result of the block, truthy or falsy, determines whether or not the resource is run. When passing a String, it is executed as a shell command and the exit status of the command is used to determine whether or not the resource is run. Remember, for shell commands an exit status of 0 indicates success (or true) and anything else (typically 1, but it can be any non-zero value) indicates failure (or false).

At first I was naively trying to use not_if like this: not_if { 'sudo ldconfig && ldconfig -p | grep libmp3lame' }, expecting the block to run the command. Instead, the block just returns the String. Since Strings are truthy, the block's result is always truthy, so the resource is always skipped for not_if and always run for only_if.
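
The difference is easy to reproduce in plain Ruby. The guard_true? method below is a hypothetical stand-in I wrote for illustration, not Chef's implementation: it runs a String with Kernel#system and judges it by exit status, and calls a block and judges it by truthiness. It assumes a Unix-like system where the true and false commands exist.

```ruby
# Hypothetical stand-in for a guard; this is NOT how Chef implements it.
# A String is run as a shell command and judged by its exit status;
# a block is called and judged by the truthiness of its return value.
def guard_true?(command = nil, &block)
  if command.is_a?(String)
    system(command) == true
  else
    block.call ? true : false
  end
end

string_ok    = guard_true?('true')             # command exits with 0
string_fail  = guard_true?('false')            # command exits with non-zero
naive_block  = guard_true? { 'false' }         # String is returned, never run
proper_block = guard_true? { system('false') } # block runs the command itself

puts "string 'true':           #{string_ok}"
puts "string 'false':          #{string_fail}"
puts "block returning String:  #{naive_block}"
puts "block calling system:    #{proper_block}"
```

The third case is the trap: the command inside the String never runs, and the guard sees a truthy value no matter what the String says.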

If we take a look at the source for Chef::Resource::Conditional#initialize it becomes pretty clear what's going on.

def initialize(positivity, command=nil, command_opts={}, &block)
  @positivity = positivity
  case command
  when String
    @command, @command_opts = command, command_opts
    @block = nil
  when nil
    raise ArgumentError, "only_if/not_if requires either a command or a block" unless block_given?
    @command, @command_opts = nil, nil
    @block = block
  else
    raise ArgumentError, "Invalid only_if/not_if command: #{command.inspect} (#{command.class})"
  end
end

Here we can clearly see that if the optional command is passed as a String the Chef::Resource::Conditional object is initialized with the command and command options and the block instance variable set to nil (and importantly, ignored if it was passed at all). If no command was passed but a block was given then the command and command options instance variables are set to nil and the block instance variable is set to the block that was passed. And finally an exception is raised if no command or block is given or if something weird is passed as the command.

And if you look a little bit further down in the source you'll find where the conditional is actually evaluated:

def evaluate
  @command ? evaluate_command : evaluate_block
end

def evaluate_command
  shell_out(@command, @command_opts).status.success?
rescue Chef::Exceptions::CommandTimeout
  Chef::Log.warn "Command '#{@command}' timed out"
  false
end

def evaluate_block
  @block.call
end

Pretty much exactly as I described above. If the command instance variable is present, it'll evaluate the command, otherwise it'll call the block. If you're interested in seeing how the cross-platform shell_out method works you can check out the source, it's definitely worth a read.

In fact, I think the takeaway from all of this is, when in doubt, go straight to the source code. It'll save you lots of time and you'd be hard pressed to not learn something new, especially if you're diving into a well-known and properly designed library.

I heard you like Ruby and Erlang so I put Ruby inside Erlang for you

Now I know I'm only just scratching the surface of this whole Elixir thing, but I have a sneaking suspicion I'm going to be feeling lots of deja vu...

~% irb
2.0.0p247 :001 > name = "world"
 => "world" 
2.0.0p247 :002 > "Hello, #{name}!"
 => "Hello, world!" 
2.0.0p247 :003 > 

~% iex
Erlang R16B01 (erts-5.10.2) [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Interactive Elixir (0.10.0) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> name = "world"
"world"
iex(2)> "Hello, #{name}!"
"Hello, world!"
iex(3)> 

Overriding But Preserving Ruby Methods

Recently I found myself needing to overwrite ActiveRecord's default save method but still retain the ability to call the original method. I know, I know, that's crazy talk, right? What could you possibly need to do that for? Well, in my case it was to provide a way to create "drafts" of my models under certain conditions when save is called. Rather than have all sorts of messy logic repeated over and over in my controllers or tucked away in an awkward helper method, it made much more sense to me to attach the functionality to my models as I need it. The ever so sublime paper_trail gem does something quite similar with ActiveRecord callbacks. But that isn't quite what I needed. What I really wanted was the ability to prevent a model from being saved in the first place. After all, what good is saving a draft if we've overwritten the original in the process? I particularly had in mind a use case where some users could only save drafts, which could be approved at a later time by more privileged users.

So now that we know the why of doing something that at first seems crazy (and more than a bit dangerous), what about the how? The core of how to override but preserve a method is pretty simple, but I think it might be helpful to provide some context, so bear with me.

Just like paper_trail, and many other gems, we start off with the following to get our module to load whenever ActiveRecord is loaded. This ensures that we don't have to manually include our module.

# /lib/kentouzu.rb
ActiveSupport.on_load(:active_record) do
  include Kentouzu::Model
end

Next we define self.included in our Model module so that when it's included we extend the base class with the ClassMethods module. This provides a slew of class methods to our model, the most important of which for the purpose of this post is the has_drafts method.

# /lib/kentouzu/has_drafts.rb
module Kentouzu
  module Model
    def self.included(base)
      base.send :extend, ClassMethods
    end

The has_drafts method gives us a nice way to include our InstanceMethods only when we actually need them. It'd be really bad if we always overrode a vital method like save. If we just included the code to override the method without going through this, it would lead to all sorts of disastrous behavior, because our earlier ActiveSupport.on_load hook would include it in every model in our application, even where it doesn't make sense.

By providing this method we give a nice clean way to add functionality to our models (or really, any class) in the same way paper_trail's has_paper_trail does. Lots of gems take advantage of this pattern.

    module ClassMethods
      def has_drafts options = {}
        send :include, InstanceMethods
      end
    end

Here's where things start to get interesting (and relevant). In our InstanceMethods module we use the same self.included method as before. But this time we call instance_method(:save) on the base class to get an UnboundMethod for save. This allows us to reuse it later.

    module InstanceMethods
      def self.included(base)
        default_save = base.instance_method(:save)

After getting a reference to the old save method we then override it with define_method, sent to the base class. define_method is important because it takes a block, and a block is a closure: it keeps access to the surrounding scope where default_save is defined. This lets us use default_save even after self.included has returned and the variable would otherwise be out of scope. Inside the block the key is the if statement. It checks the conditions for using our new save method. In my particular case I check that everything is enabled on the model (in pretty much the same way paper_trail does) and that the conditions for saving a draft are met, and then create a draft from the model and save the draft without saving the model. The details of what happens here are up to you.

        base.send :define_method, :save do
          if switched_on? && save_draft?
            draft = Draft.new(
              :item_type   => self.class.base_class.to_s,
              :item_id     => self.id,
              :event       => self.persisted? ? "update" : "create",
              :source_type => Kentouzu.source.present? ? Kentouzu.source.class.to_s : nil,
              :source_id   => Kentouzu.source.present? ? Kentouzu.source.id : nil,
              :object      => self.to_yaml
            )

            draft.save

And now for the magic. If the conditions for using our new version of the save method aren't met, we take our unbound reference to the old save and bind it to self, which is our model instance, since this block now runs as an instance method on the model. Finally we invoke it with the .() syntax, which is shorthand for call. You could also write call explicitly.

          else
            default_save.bind(self).()
          end
        end
      end
    end
  end
end
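
The capture, bind, and call steps can be tried on their own outside of Rails. What follows is only a minimal sketch: Greeter is a hypothetical stand-in for a model and is not part of the gem. It shows that an UnboundMethod obtained with instance_method keeps pointing at the original definition even after the method has been replaced.

```ruby
# Greeter is a hypothetical class used only for illustration.
class Greeter
  def initialize(name)
    @name = name
  end

  def greet
    "Hello, #{@name}!"
  end
end

# instance_method returns an UnboundMethod: a method with no receiver yet.
unbound = Greeter.instance_method(:greet)

# Replacing the method does not change the reference captured above.
Greeter.send(:define_method, :greet) { "overridden" }

greeter = Greeter.new("world")

replaced  = greeter.greet               # runs the new definition
original  = unbound.bind(greeter).call  # runs the captured original
shorthand = unbound.bind(greeter).()    # .() is shorthand for .call
```

Here replaced holds the new return value, while original and shorthand both come from the method as it was first defined.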

Now whenever we call the save method on our model, so long as switched_on? and save_draft? return true, we'll get a copy of the model as a draft instead of a saved model. Of course we could strip this down to something much simpler without all the fancy including, but in my opinion all of that is what makes this so useful: we only get the override when and where we want it. That's pretty important, because overriding methods like this can be very dangerous. I strongly suggest that before you do this you make sure you actually need to.
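
To see the whole pattern working in one place without Rails, here is a rough plain-Ruby sketch. Every name in it (Draftable, Document, draft_only, drafts) is made up for illustration, and the "draft" is just a string pushed onto an array, so treat it as an approximation of the idea rather than how the gem itself works.

```ruby
# All names here are hypothetical; this is not the gem's code.
module Draftable
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    # Opt-in macro: only classes that call has_drafts get the override.
    def has_drafts
      include InstanceMethods
    end
  end

  module InstanceMethods
    def self.included(base)
      # Capture the original save before replacing it.
      default_save = base.instance_method(:save)

      base.send(:define_method, :save) do |*args|
        if draft_only
          # Record a draft and leave the object itself unsaved.
          drafts << "draft of #{title}"
          true
        else
          # Fall through to the original, forwarding any arguments.
          default_save.bind(self).call(*args)
        end
      end
    end
  end
end

class Document
  include Draftable

  attr_accessor :title, :draft_only
  attr_reader :drafts, :saved

  def initialize(title)
    @title = title
    @drafts = []
    @saved = false
    @draft_only = false
  end

  def save
    @saved = true
  end

  # Must come after save is defined so instance_method(:save) can find it.
  has_drafts
end

drafted = Document.new("first")
drafted.draft_only = true
drafted.save  # stores a draft; the original save never runs

normal = Document.new("second")
normal.save   # runs the original save
```

In an ActiveRecord model, save is inherited, so the position of has_drafts in the class body matters less than it does in this sketch. The sketch also forwards any arguments to the original method, which the excerpt above does not do; that matters if callers pass options to save.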

The source for the gem this is from is on GitHub.