True Inversion of a Hash in Ruby
by Tilo Sloboda, Nov 2004

The Ruby Hash.invert method should come with the following warning label:

WARNING: Hash.invert

Hash.invert

Hash.inverse

What do you expect if you want to compute an inverted Hash?

If you have a math background, you would expect that performing an "invert" operation twice returns the original hash:

h == h.invert.invert

Let's see what Ruby's built-in Hash#invert method does to a hash:

        # given a hash which contains the words for the numbers 1..4 in different languages (English, German, Spanish, Japanese):
        #
        h = {"eins"=>1, "drei"=>3, "uno"=>1, "one"=>1, "two"=>2, "san"=>3, "ichi"=>1, "three"=>3, "four"=>4}
        h.invert
         => {1=>"one", 2=>"two", 3=>"drei", 4=>"four"}  # oops, data is lost! (which key survives depends on the Ruby version)

        #   the above result IS SIMPLY WRONG!   Why?

        h.invert.invert
         => {"two"=>2, "one"=>1, "drei"=>3, "four"=>4}  # and you cannot go back...

        h.invert.invert == h        # if you have a math background, your stomach will ache looking at this ;-)
         => false

Not only is the above result very questionable, but you actually lose data which was stored in the original hash, and you cannot revert the operation. I would say that Ruby's built-in Hash.invert method is simply broken! For an explanation of why, see below.
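To see where the data goes, it may help to picture Hash#invert as a simple loop that stores each value as a new key. The following is only a rough sketch of that behavior, not Ruby's actual source; the helper name naive_invert is made up for illustration.

```ruby
# Hypothetical helper that mimics the behavior of the built-in Hash#invert.
def naive_invert(hash)
  result = {}
  hash.each_pair do |key, value|
    # A later key with the same value silently overwrites the earlier one.
    result[value] = key
  end
  result
end

h = {"eins"=>1, "uno"=>1, "one"=>1, "two"=>2}
inverted = naive_invert(h)

puts inverted.size            # 2, although h has 4 entries
puts inverted == h.invert     # true
```

Every time a value repeats, the previous key stored under it is replaced, which is why only one word per number survives.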

Correctly working Hash.inverse Implementation

If you use the new implementation, you can either access it with Hash.inverse, or you can override the old Hash.invert method with the new one and still access the old method with Hash.old_invert (see below).

Hash#inverse is fully backwards-compatible with the Hash#invert behavior if you don't have duplicate values in the hash.
Hash#inverse preserves your data.

        # given a hash which contains the words for the numbers 1..4 in different languages (English, German, Spanish, Japanese):
        #
        > h = {"eins"=>1, "drei"=>3, "uno"=>1, "one"=>1, "two"=>2, "san"=>3, "ichi"=>1, "three"=>3, "four"=>4}
         => {"uno"=>1, "three"=>3, "two"=>2, "eins"=>1, "ichi"=>1, "san"=>3, "one"=>1, "drei"=>3, "four"=>4}

        > h.inverse
         => {1=>["one", "ichi", "eins", "uno"], 2=>"two", 3=>["drei", "san", "three"], 4=>"four"}  # preserves data!

        > h.inverse.inverse
         => {"uno"=>1, "three"=>3, "two"=>2, "eins"=>1, "san"=>3, "ichi"=>1, "one"=>1, "drei"=>3, "four"=>4}  # you can revert the operation

        > h.inverse.inverse == h   # true as long as none of the values are Arrays (see the corner case below)
         => true

Isn't that a much more pleasing result?
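For readers who want to see how such a method might work, here is a minimal sketch using a plain Hash. It is written as a standalone function with the made-up name inverse_of so that it does not touch the Hash class; it illustrates the idea and is not necessarily the code shipped in invert_hash.rb.

```ruby
# Hypothetical standalone version of the idea behind Hash#inverse.
def inverse_of(hash)
  result = {}
  hash.each_pair do |key, value|
    # Treat an Array value as several values that share the same key.
    values = value.is_a?(Array) ? value : [value]
    values.each do |v|
      # Collect keys in an Array only when a value occurs more than once.
      result[v] = result.key?(v) ? [result[v], key].flatten : key
    end
  end
  result
end

h = {"eins"=>1, "drei"=>3, "one"=>1, "two"=>2}
inv = inverse_of(h)

puts inv[1].inspect           # ["eins", "one"] on Ruby 1.9+
puts inverse_of(inv) == h     # true
```

Because duplicate values are gathered into an Array instead of being overwritten, the second call can expand that Array again and rebuild the original pairs.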

Overriding Hash.invert

In case you want to override the old method and replace it completely, you may want to do this:

	class Hash
	    alias old_invert invert

	    def invert
	       self.inverse
	    end
	end

If you always want to override the Hash.invert method, you can modify the file invert_hash.rb by removing the line which contains __END__.

Download

Hash.inverse is also available through the Ruby Facets library and is referenced in the Ruby Cookbook.

License

Freely available under the terms of the open-source "Artistic License" in combination with Addendum A (below).
In case you did not get a copy of the license along with the software, it is also available at: http://www.unixgods.org/~tilo/artistic-license.html

Note: There is one corner case

If you start out with a Hash of Arrays and invert it twice, you end up with a similar Hash that includes all the original values, but possibly out of order, e.g.:

        h = {:key1 => [:a, :b, :c], :key2 => [:d, :e, :f]}

        h.inverse
         => {:a=>:key1, :b=>:key1, :c=>:key1, :d=>:key2, :e=>:key2, :f=>:key2} 

        h.inverse.inverse
         => { :key2 => [:d, :e, :f], :key1 => [:a, :c, :b]}  # preserves data, but h.inverse.inverse != h  in this case because order in the arrays is not preserved

The reason for this is that a regular Hash does not preserve order (this applies to Ruby 1.8; since Ruby 1.9, Hash preserves insertion order).

To fix this, you will need an OrderedHash:

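        require 'active_support'   # provides ActiveSupport::OrderedHash

        class Hash

          # Returns a new hash with keys and values swapped; keys that share
          # a value are collected in an Array, so no data is lost.
          def inverse
            i = ActiveSupport::OrderedHash.new
            each_pair do |k, v|
              if v.is_a?(Array)
                # an Array value is treated as several values with the same key
                v.each do |x|
                  i[x] = i.has_key?(x) ? [i[x], k].flatten : k
                end
              else
                i[v] = i.has_key?(v) ? [i[v], k].flatten : k
              end
            end
            i
          end

        end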
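On Ruby 1.9 and later, the built-in Hash already preserves insertion order, so the same round trip can be tried without ActiveSupport. The sketch below assumes such a Ruby version and uses a made-up standalone function, ordered_inverse, rather than reopening the Hash class.

```ruby
# Hypothetical standalone variant; assumes Ruby 1.9+ where Hash keeps insertion order.
def ordered_inverse(hash)
  hash.each_with_object({}) do |(key, value), result|
    values = value.is_a?(Array) ? value : [value]
    values.each do |v|
      result[v] = result.key?(v) ? [result[v], key].flatten : key
    end
  end
end

h = {:key1 => [:a, :b, :c], :key2 => [:d, :e, :f]}
round_trip = ordered_inverse(ordered_inverse(h))

puts round_trip == h               # true: array order survives the round trip
puts round_trip[:key1].inspect     # [:a, :b, :c]
```

One limitation remains with this approach: a value that is a one-element Array, such as [:a], comes back as the bare element :a, because the inverted hash cannot tell the two apart.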

Note

The corner case is mentioned in this blog post, but the author accidentally wrote: "If your original hash used arrays as hash keys" instead of: "If your original hash used arrays as hash values". It doesn't make much sense to use arrays as hash keys ;-) Using an OrderedHash fixes the problem of arrays as hash values.

Why is Ruby's Hash.invert broken?
I believe that it's broken because of an inaccurate design assumption.

Simple explanation:
Typically you want to use a Hash when you try to keep track of some data, and store some values associated with each item's key. In the real world multiple keys can map to the same value. Hash.invert does not allow for this, hence it can't cope with it.

More Lengthy Explanation:
The Ruby Hash class is a misnomer at best.
If you studied algorithms in computer science, then you learned that a hash is a data structure which has a mapping function to compute a key for each piece of data you want to place in the hash, e.g. f(value) = key. The key concept of a hash is that the key is computed from the data/value. And often (for hashes without collision resolution) the algorithm designers assume that the key is unique for each piece of data, and that no two pieces of data generate the same key:
(A1) For each f(value1) = key1, f(value2) = key2: value1 != value2 <==> key1 != key2
In plain English: each value has only one key, and each key has only one value.
Now here's what's wrong with Ruby's implementation of class Hash, and why class Hash in Ruby is not the same as a hash data structure in CS at all!

The above assumption A1 only works if the key is computed!
In Ruby we don't compute the key for the data; we arbitrarily assign the key to the data/values.
Therefore we cannot assume that A1 holds!
Ruby's class Hash is actually a misnomer. It should rather be called Dictionary, for lack of a better word. And that's how users use it: like a look-up dictionary for arbitrary key/value pairs, in which (in general) multiple different keys can look up the same data.
Now Ruby's Hash.invert method was probably based on assumption A1, which is not necessarily true for the data we may want to put into the hash. That's why Hash.invert is not working properly.
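Assumption A1 can be checked for a concrete hash: it holds exactly when no value occurs under more than one key. The sketch below uses a made-up helper name, invertible?, and only standard Hash methods.

```ruby
# Hypothetical helper: true when every value belongs to exactly one key,
# i.e. when assumption A1 holds and Hash#invert loses nothing.
def invertible?(hash)
  hash.values.uniq.size == hash.size
end

unique  = {"one"=>1, "two"=>2, "three"=>3}
clashes = {"one"=>1, "eins"=>1, "two"=>2}

puts invertible?(unique)                 # true
puts unique.invert.invert == unique      # true
puts invertible?(clashes)                # false
puts clashes.invert.invert == clashes    # false
```

When the check returns false, the built-in invert will drop entries, and a data-preserving inversion is the safer choice.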
