> ## Unit Conversion (#183)


The right way, generally, to do a task such as unit conversion is to  
see if someone has already done all the hard work for you. As was  
pointed out, there are several options in this respect:

  * The [Stick][1] library for Ruby; a [brief summary][2] was provided.  
Stick provides a value class (i.e. quantity with units), conversions,  
syntactic sugar and more.
  * Google's search engine can act as a calculator, including unit
conversions. Using Google's API is one option; another is
screen-scraping, as was done by _Peter Szinek_. (Of course, as noted,
you must have an active Internet connection to use this solution.)
  * As was pointed out by _Ryan Davis_, there is a BSD/Un*x command  
and library called `units` which does this same task. Transform the  
arguments, pass them to a shell, and capture the output.

Many thanks to _Martin Boese_, whose solution had to be empirically  
confirmed. Repeatedly.

But I'm going to look at the solution from _Robert Dober_. While it is
limited as posted, his data-driven approach could be expanded to
include more conversions.

To understand how the expression `1.0.in.to.mm` will generate the  
string "25.4mm", I'll trace it a step at a time, looking at the  
relevant bits of code.

First, we have the float value `1.0`, but where does the method `in`  
come from? Clearly, class `Float` gets something by way of extension:

	class Float
	 include Conversion
	end

Module `Conversion` only defines one method that will extend `Float`  
(with the rest of `Conversion` being helper classes and code executed  
when `Conversion` is first evaluated). That method is `method_missing`:

	 def method_missing unit_name
	   pc = ProxyClasses[ unit_name.to_s ] || super( unit_name )
	   pc::new self
	 end

So we will look for `ProxyClasses["in"]` and, if it is not found, we
just call up to the parent class and hope it knows what to do with the
method call `in`. But in this case, we're expecting to find something
in `ProxyClasses`... a Class, in fact, which we attempt to instantiate
immediately using `new`. But where do we fill `ProxyClasses`?
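
The lookup-or-delegate pattern described above can be sketched in
isolation. This is only an illustration: `DemoProxy`, `DEMO_PROXIES`
and `DemoConversion` are hypothetical stand-ins, not names from
Robert's solution, and the `respond_to_missing?` method is an addition
that is usually paired with `method_missing`.

```ruby
# Hypothetical stand-in for one of the generated proxy classes.
class DemoProxy
  attr_reader :value

  def initialize(value)
    @value = value
  end
end

# Hypothetical registry: unit name => class that wraps a value in that unit.
DEMO_PROXIES = { "in" => DemoProxy }

module DemoConversion
  # Called by Ruby when no ordinary method matches the call.
  def method_missing(name, *args)
    proxy_class = DEMO_PROXIES[name.to_s]
    return super unless proxy_class   # unknown unit: usual NoMethodError
    proxy_class.new(self)
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    DEMO_PROXIES.key?(name.to_s) || super
  end
end

class Float
  include DemoConversion
end

wrapped = 1.0.in
puts wrapped.class   # DemoProxy
puts wrapped.value   # 1.0
```

A call to an unregistered name, such as `1.0.furlong`, falls through
to `super` and raises `NoMethodError` as it normally would.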

Ah, that would be the code right below `method_missing` in his  
solution: the code that makes use of `LineParser`.

	conversions = LineParser::new
	File::open "units.txt" do | f |
	  f.each do | line |
	    conversions.parse_line line
	  end
	end
	
Robert provided a minimal `units.txt` data file to show how the code  
works. (Note that the line beginning "use SI" is part of the data file  
and not a mistake; see `parse_line` for how that is handled.)

	1 in = 0.0254 m
	1 l  = 0.001 m3
	use SI prefixes for m g l m3

It could be expanded greatly to support many more units. As each line  
is read, the `LineParser` object parses them, keeping track of the  
conversion rules -- I'll come back to that later. What I want to look  
at first is what gets done with those rules:

	conversions.traverse do | src_unit, tgt_unit, conversion |
	  ( ProxyClasses[ src_unit ] ||= Class::new ProxyClass ).module_eval do
	    define_method tgt_unit do (@value * conversion).to_s + tgt_unit end
	  end
	end

`traverse` is going to enumerate over a number of valid conversions --
source units, target units, and the conversion factor. And here we see
where the entries of `ProxyClasses` originate: new anonymous
subclasses of `ProxyClass` are created through the code
`Class::new ProxyClass` (but only if one didn't already exist for the
particular source unit; note the use of the `||=` operator, which
evaluates the right side and assigns it to the left only if the left
was initially nil or false).

After ensuring that the `ProxyClass` corresponding to the source units  
exists, we call `module_eval` in order to add methods to the anonymous  
class just created. The method name will be the target units, and the
method multiplies in the conversion factor, converts to a string, and
appends the target units.
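
As a minimal sketch of the same technique outside the solution, the
snippet below builds one anonymous class per source unit and defines
one method per target unit. `DemoBase`, `registry` and the hard-coded
triples are assumptions made for illustration; in the real code the
triples come from `traverse`.

```ruby
# Hypothetical base class; the solution's ProxyClass is not reproduced here.
class DemoBase
  def initialize(value)
    @value = value
  end
end

registry = {}

# Pretend these (source, target, factor) triples were yielded by traverse.
[["in", "m", 0.0254], ["in", "mm", 25.4]].each do |src, tgt, factor|
  # ||= creates the anonymous subclass only the first time src is seen.
  klass = (registry[src] ||= Class.new(DemoBase))
  klass.class_eval do
    # The block is a closure, so each method remembers its own factor.
    define_method(tgt) { (@value * factor).to_s + tgt }
  end
end

puts registry.size                 # 1
puts registry["in"].new(1.0).mm    # 25.4mm
puts registry["in"].new(1.0).m     # 0.0254m
```

Both triples share the source unit "in", so only one class is created
and it ends up with two conversion methods.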

So, getting back to our example `1.0.in.to.mm`, we've now found the  
`ProxyClass` corresponding to `1.0.in`. And we know that `ProxyClass`  
also has methods named by target units, which includes one that  
corresponds to the last part of the example: `.mm`.

If you're wondering about `to`, every `ProxyClass` defines that method
to return self: essentially a useless function (in the sense that it
does nothing more than `1.0.in.mm`). Its existence mimics other
libraries, and the point is readability. (An alternative would be a
more traditional call, such as `1.0.convert(:in, :mm)` or similar.)
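
A tiny illustration of such a pass-through method follows. The class
`DemoLength` is hypothetical and hard-codes a single conversion; it
only shows that `to` changes how the call reads, not what it computes.

```ruby
# Hypothetical class with one fixed conversion, for illustration only.
class DemoLength
  def initialize(inches)
    @inches = inches
  end

  # Pure readability: hands back the receiver unchanged.
  def to
    self
  end

  def mm
    (@inches * 25.4).to_s + "mm"
  end
end

length = DemoLength.new(2.0)
puts length.to.mm   # 50.8mm
puts length.mm      # 50.8mm
```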

So once these proxy classes exist, there's very little effort going on  
to evaluate calls such as our example. And creating the proxy classes  
isn't much more difficult, assuming you have a proper conversion  
table. Now we come back to `LineParser` and what happens beyond its  
`parse_line` method. (I'll skip `parse_line` itself, since it is a  
few, simple regular expressions.)

Most of `units.txt` that defines our conversions is going to be
handled by `add_conversion`, which just receives as arguments each
split line of the data file. The conversion table (stored in `@c`) is
a two-layered hash -- a hash of hashes -- and is set up with this code:

	def add_conversion lhs_value, lhs_unit, equal_dummy, rhs_value, rhs_unit
	  @c[ lhs_unit ][ rhs_unit ] = Float( rhs_value ) / Float( lhs_value )
	  @c[ rhs_unit ][ lhs_unit ] = Float( lhs_value ) / Float( rhs_value )
	end

The conversion ratio (and the inverse conversion ratio) are stored in  
two places based on the indexing order. By storing both ratios/orders,  
we can convert in "both directions". That is, for our example, not  
only can we convert inches to millimeters, but millimeters to inches.
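
For the double-indexed assignment to work, the outer hash has to
supply an inner hash on first access. The summary does not show how
`@c` is initialised, so the default block below is an assumption, and
`table` and `add_rule` are hypothetical names used only for this
sketch.

```ruby
# Assumption: inner hashes are created on demand by a default block.
table = Hash.new { |hash, unit| hash[unit] = {} }

# Stores the ratio in both directions, as add_conversion does.
def add_rule(table, lhs_value, lhs_unit, rhs_value, rhs_unit)
  table[lhs_unit][rhs_unit] = Float(rhs_value) / Float(lhs_value)
  table[rhs_unit][lhs_unit] = Float(lhs_value) / Float(rhs_value)
end

# The fields of the data line "1 in = 0.0254 m", minus the "=" token.
add_rule(table, "1", "in", "0.0254", "m")

puts table["in"]["m"]   # 0.0254
puts table["m"]["in"]   # inches per meter, roughly 39.37
```

`Float()` is used rather than `to_f` so that a malformed number in the
data file raises an error instead of silently becoming 0.0.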

The last bit of file parsing is adding appropriate metric prefixes (SI
units). One line in the file indicates which units are worthy of
metric prefixes. In the data file provided, we see that meters can
accept metric prefixes (such as "kilo" and "milli"), but inches will
not. These prefixes are handled by `add_si_unit_for`:
	
	def add_si_unit_for unit
	  SIUnits.each do | prefix, conversion |
		@c[ prefix + unit ][ unit ] = conversion
		@c[ unit ][ prefix + unit ] = 1 / conversion
	  end
	end
	
Here, `unit` is the particular unit for which we want to support
metric prefixes. `SIUnits` is the hash containing the metric prefixes
as characters and the corresponding orders of magnitude. For every
unit and metric prefix, two more conversions are added, each the
inverse of the other: conversion between the naked unit and the
adorned unit (e.g. between meters and millimeters, and vice-versa).
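
Here is a small sketch of that prefix expansion. The contents of
`SIUnits` are not shown in this summary, so `DEMO_PREFIXES` and its
three entries are assumptions, as are the names `table` and
`add_prefixes`.

```ruby
# Hypothetical prefix table; Float factors keep 1 / factor from
# truncating to 0, as it would with Integer factors.
DEMO_PREFIXES = { "k" => 1.0e3, "c" => 1.0e-2, "m" => 1.0e-3 }

table = Hash.new { |hash, unit| hash[unit] = {} }

# Adds prefixed <-> bare conversions for one unit.
def add_prefixes(table, unit)
  DEMO_PREFIXES.each do |prefix, factor|
    table[prefix + unit][unit] = factor        # e.g. km -> m
    table[unit][prefix + unit] = 1 / factor    # e.g. m -> km
  end
end

add_prefixes(table, "m")

puts table["km"]["m"]          # 1000.0
puts table["m"].keys.inspect   # ["km", "cm", "mm"]
```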

Finally, `traverse` is an enumerator that will yield (via `blk.call`)  
every valid combination of units and the appropriate conversion  
factor. It manages this without storing every conversion (e.g. we  
store the inches to meters conversion, and the meters to millimeters  
conversion, but don't explicitly store inches to millimeters).  
Enumerating every possible, valid conversion is done in the private  
method `_traverse`:

	def _traverse src_unit, unit_conversions, traversed_units, f=1.0, &blk
	  unit_conversions.each do | new_unit, conversion |
		next if traversed_units.include? new_unit
		blk.call src_unit, new_unit, f * conversion
		_traverse src_unit, @c[ new_unit ], traversed_units + [ new_unit ], f * conversion, &blk
	  end
	end

The final, recursive step here is what allows us to build a transitive
closure of all units. `src_unit` is, of course, the source unit (e.g.
inches). `unit_conversions` contains all possible immediate
conversions from the source and is the hash of units and conversion
factors. And, as you can see, we enumerate those into `new_unit` and
`conversion`.

We skip a target unit if it has already been visited (i.e. it is in
`traversed_units`). Otherwise, we yield to the caller (`blk.call`) and
recurse, now converting the source unit to everything the target unit
can itself be converted to, making sure to update `traversed_units` so
that the recursion eventually terminates.
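
To see the recursion derive a conversion that was never stored, here
is a self-contained sketch. `DemoGraph`, `add` and `walk` are
hypothetical names, and the public `traverse` wrapper is a guess at
how the walk is started, since only `_traverse` is quoted above.

```ruby
class DemoGraph
  def initialize
    @c = Hash.new { |hash, unit| hash[unit] = {} }
  end

  # Stores one conversion and its inverse.
  def add(src, tgt, factor)
    @c[src][tgt] = factor
    @c[tgt][src] = 1.0 / factor
  end

  # Assumed wrapper: starts one walk from every known unit.
  def traverse(&blk)
    @c.keys.each do |src|
      walk(src, @c[src], [src], 1.0, &blk)
    end
  end

  private

  # Mirrors _traverse: multiply factors along the path, skipping
  # units already on the path.
  def walk(src, neighbours, visited, factor, &blk)
    neighbours.each do |unit, conversion|
      next if visited.include?(unit)
      blk.call(src, unit, factor * conversion)
      walk(src, @c[unit], visited + [unit], factor * conversion, &blk)
    end
  end
end

graph = DemoGraph.new
graph.add("in", "m", 0.0254)   # stored directly
graph.add("mm", "m", 0.001)    # stored directly

found = {}
graph.traverse { |src, tgt, factor| found[[src, tgt]] = factor }

# inches -> millimeters was never stored; it is derived via meters.
puts found[["in", "mm"]].round(6)   # 25.4
puts found.size                     # 6
```

With three connected units, every ordered pair of distinct units is
reachable, which gives the six conversions collected in `found`.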



[1]: http://stick.rubyforge.org/
[2]: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/320583