
March 18, 2005

Ruby Advocacy

This is a conversation between Justin Dressel and me, about Ruby and Python. I thought it had some value, so here it is.

in particular, list comprehensions work well with the functional techniques like anonymous functions mapped across a list - I don’t see Ruby supporting anything like that offhand.

Far from correct! Ruby supports a block-passing style inherited from Smalltalk, so it’s common to work with all sorts of data structures by passing blocks to their methods. For example:


[1, 2, 3, 4].collect { |x| x * x }
=> [1, 4, 9, 16]

(1..10).select { |x| x % 3 == 0 }
=> [3, 6, 9]

The bummer is that you can’t do the [f(x) for x in l if cond(x)] in a single block. But it can still be done:


((1..10).select { |x| x % 3 == 0}).collect {|x| x * x }
=> [9, 36, 81]
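
If the two-pass chain bothers you, one way to squeeze the filter and the transformation into a single block is inject, which threads an accumulator through the iteration. This is only a sketch, and the variable names are made up for illustration:

```ruby
# Single pass: keep x only when the condition holds, and store x * x.
# Roughly Python's [x * x for x in range(1, 11) if x % 3 == 0].
squares_of_multiples = (1..10).inject([]) do |acc, x|
  x % 3 == 0 ? acc << x * x : acc
end

p squares_of_multiples
# => [9, 36, 81]
```

It is not as pretty as the comprehension, but it does walk the range only once.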

The advantage of the block-passing style is that generators are a lot more built-in to the language. For example, these are equivalent:


1.upto(100000).each {|x| puts x}

for i in 1..100000
  puts i
end

And in neither of those cases is an array instantiated with all members of the range. An X..Y expression is a Range, which is an object like any other, so you can add methods to it.
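
To make the "you can add methods to it" part concrete, here is a small sketch that reopens Range. The method name multiples_of is my own invention for the example, not something Ruby ships with:

```ruby
# Reopen the built-in Range class and add a (hypothetical) method.
class Range
  def multiples_of(n)
    select { |x| x % n == 0 }
  end
end

tiny = (1..10)
p tiny.class             # => Range
p tiny.multiples_of(3)   # => [3, 6, 9]

# A large range only stores its endpoints; nothing is expanded until asked.
big = (1..100000)
p big.first   # => 1
p big.last    # => 100000
```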

Also what is the difference between multiple inheritance and Ruby’s ‘mixins’?

It’s really an academic difference. The class hierarchy answers the “is-a” question. You can only mix in a module, not a full class, but that’s kind of moot: a module can have all the benefits of a class, it just can’t be instantiated. You can design a module to be included in a class, though, and in so doing create new instance variables, rename other methods and so forth, just like you normally can, so that difference is pretty minor too. You just have to be accustomed to the idea that the module doesn’t get instantiated.

In practice, it rarely turns out that you have a class that wants to be two or more things, partly because in Ruby it’s so easy to decorate a class with these mixins, and partly because Ruby’s built-in mixins take care of a lot of the things you would otherwise handle on your own. For example, if you implement the generic iteration method, each, then you can mix in Enumerable and get all these methods “for free” (the ones that order things, like max, min and sort, also need the items you yield to respond to the comparison operator, <=>):

collect: return an array containing the result of applying the block to each item yielded by each
detect: return the first item that satisfies the supplied block
each_with_index: apply the block to each item along with the sequence number of its occurrence
entries: return an array of everything contained in your object
find_all: return an array containing everything in your object that the supplied block returns true for
grep: match a pattern (such as a regex) against everything within and return an array of the items that match
include?: return true if the supplied object exists within your object
map: just like collect
max: return the “largest” item, as defined by the <=> method or an optional block taking two items as arguments
member?: synonym for include?
min: like max, but smallest
reject: like find_all, but for which the block returns false
select: synonym for find_all
sort: return an array containing the sorted stuff within, according to <=> or a supplied block
to_a: return an array representing the contents
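
Here is roughly what that looks like in practice. Playlist is a made-up class for the example; the only iteration method it defines itself is each, and everything else comes from the mixin:

```ruby
# Hypothetical container; the only iteration method it defines is each.
class Playlist
  include Enumerable

  def initialize(*titles)
    @titles = titles
  end

  def each
    @titles.each { |title| yield title }
  end
end

list = Playlist.new("Walk", "Amen", "Creep")

p list.collect { |t| t.upcase }             # => ["WALK", "AMEN", "CREEP"]
p list.select { |t| t.length == 4 }         # => ["Walk", "Amen"]
p list.detect { |t| t.start_with?("C") }    # => "Creep"
p list.include?("Creep")                    # => true
p list.sort                                 # => ["Amen", "Creep", "Walk"]
p list.max                                  # => "Walk"
```

The sort and max calls work here because the items are Strings, which already know how to compare themselves with <=>.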

And this is widely used in the standard library. Lots of classes also implement more than one each-style method (String, for example, has each_byte and each_line). Using Ruby’s per-object singleton classes, you could change the behavior of just one instance by carefully aliasing those methods.
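
A sketch of that per-instance trick, using a made-up Countdown class with two iteration methods. Only one object gets its each swapped; the class and every other instance are left alone:

```ruby
# Hypothetical class with a default iterator and an alternative one.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Default iteration: from, from - 1, ..., 1
  def each
    @from.downto(1) { |n| yield n }
  end

  # Alternative iteration: 1, 2, ..., from
  def each_ascending
    1.upto(@from) { |n| yield n }
  end
end

normal  = Countdown.new(3)
flipped = Countdown.new(3)

# Open the singleton class of one instance and point each at the other method.
class << flipped
  alias_method :each, :each_ascending
end

p normal.to_a                  # => [3, 2, 1]
p flipped.to_a                 # => [1, 2, 3]
p flipped.map { |n| n * 10 }   # => [10, 20, 30]
```

Since every Enumerable method goes through each, swapping that one method changes all of them for that object.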

Also what about generators, a feature I love in Python?

The idea is so ingrained in Ruby, it doesn’t have a name. Things take blocks. It’s very normal. The syntax is actually rather nice, and because it isn’t tied to the for-loop concept, it is used in other places as well. For example, the file-open method will optionally take a block and pass the file object to it. This way, it takes care of closing it when the block is done. Database connections also can work the same way. This is similar to unwind-protect in Lisp.
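
If you want to write a method like that yourself, the usual shape is yield inside a method body with an ensure clause. Below is a sketch with a fake connection class; FakeConnection and its connect method are invented for the example and have nothing to do with any real database library:

```ruby
# Hypothetical stand-in for a database connection.
class FakeConnection
  attr_reader :log

  def initialize
    @open = true
    @log = []
  end

  def query(sql)
    @log << sql
  end

  def close
    @open = false
  end

  def open?
    @open
  end

  # Hand a fresh connection to the block, then close it no matter what.
  def self.connect
    conn = new
    yield conn
  ensure
    conn.close if conn
  end
end

leaked = nil
begin
  FakeConnection.connect do |conn|
    leaked = conn
    conn.query("SELECT 1")
    raise "something went wrong"
  end
rescue RuntimeError => e
  puts "rescued: #{e.message}"
end

p leaked.open?   # => false
p leaked.log     # => ["SELECT 1"]
```

The connection is closed even though the block raised, which is the unwind-protect behavior.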

There is a syntax other than what I’ve shown you, which is what I recommend for blocks longer than one line:


(1..100000).each {|i| puts i }

(1..100000).each do |i|
  puts i
end

And of course, there is a Kernel method which you can use to create a block if you just want to have one: lambda { |parm1, parm2, ...| ... } This will give you a Proc object, which can be passed around or invoked by calling the Proc#call method. The explicit call is necessary because Ruby supports the principle of uniform access: if a method takes no arguments, it needs no parentheses, so a bare name is already a call. This also has a few important results: your code will look cleaner, always, and you cannot access a member variable from outside the class, only call methods.
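
A quick sketch of both points, with names picked just for the example:

```ruby
# lambda gives back a Proc object that can be stored in a variable.
square = lambda { |x| x * x }

p square.class      # => Proc
p square.call(4)    # => 16

# A Proc can be handed to any method that takes a block by prefixing it with &.
p [1, 2, 3].collect(&square)   # => [1, 4, 9]

# Uniform access: a no-argument method needs no parentheses, so naming it calls it.
p "hello".length    # => 5
p "hello".length()  # => 5
```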

The second thing sounds like a bummer, but there is this syntax:


class MyFoo
  attr_accessor :myVar, :myOtherVar

  def initialize
    @myVar = 10
  end
end

The attr_accessor line creates two methods for each name: MyFoo#myVar and MyFoo#myVar= (and the same pair for myOtherVar). The : syntax denotes a symbol, and it is used pretty frequently. Anyway, back to those methods: the first one is called whenever you read myVar on an instance, and the second is called whenever you assign to myVar on an instance (m.myVar = 23 is fine; the = only has to be attached to the name in the definition).
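
Because the assignment is just a method call, you can later replace the generated writer with one you write by hand and no caller has to change. A sketch, where Thermostat and its cap of 30 are invented for illustration:

```ruby
class Thermostat
  attr_reader :target      # generates Thermostat#target

  def initialize
    @target = 20
  end

  # Hand-written writer: runs on every "t.target = value".
  def target=(value)
    @target = [value, 30].min   # hypothetical rule: cap at 30
  end
end

t = Thermostat.new
t.target = 25
p t.target   # => 25

t.target = 99
p t.target   # => 30

p t.respond_to?(:target=)   # => true
```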

This is good, and Python is getting this capability now with the property() function. But there is another reason I bring it up: attr_accessor is not syntax. attr_accessor is a method on the Module class, which Class inherits from. You can add your own, for creating useful methods automatically or in general changing the behavior of the language from within the language. The Pickaxe 2 shows a sample class-level method, once, which lets you mark some methods as needing to be computed only once, thereafter reading out of a cache. It looks the same way:


class MyClass
  once :myMethod

  def myMethod
    ...
  end
end
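
I don’t have the book’s implementation in front of me, so what follows is not the Pickaxe’s code, just one way such a method could be written. It leans on Module#prepend, which only appeared in later Ruby versions, and the Once module, the Report class and the cache variable are all names I made up:

```ruby
# Hypothetical helper module; extend a class with it to get the "once" macro.
module Once
  def once(*names)
    # Build an anonymous module whose methods wrap the real ones with a cache.
    wrapper = Module.new do
      names.each do |name|
        define_method(name) do |*args|
          @__once_cache ||= {}
          unless @__once_cache.key?(name)
            # super reaches the class's own definition of the method.
            @__once_cache[name] = super(*args)
          end
          @__once_cache[name]
        end
      end
    end
    prepend wrapper
  end
end

class Report
  extend Once
  once :total

  attr_reader :computations

  def initialize
    @computations = 0
  end

  def total
    @computations += 1
    42
  end
end

report = Report.new
p report.total          # => 42
p report.total          # => 42
p report.computations   # => 1
```

Because the wrapper is prepended and the real method is only looked up when called, the once line can sit above the def in this sketch.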

Python certainly doesn’t support many of the functional techniques (i.e. pattern matching), but it does have a basic suite of functions

I’m not convinced pattern matching is a functional technique, because it’s absent from Lisp and Scheme, and present in Prolog, which is technically a logic language.

I don’t miss it much, since as Mercury points out, you can always use “or” instead, but there are times when it just looks cleaner. And I know it speeds things up in Prolog when you have a bodiless head (a fact) as part of your predicate.

Using Ruby will point out things that piss you off in Python. You’ll grow to hate "".join(...). You’ll grow to hate all those builtin functions. To get a list of methods on a Ruby object, you call “methods.” You get back a list, you can call sort:


238.methods.sort
=> ["%", "&", "*", "**", "+", "+@", "-", "-@", "/", 
    "<", "<<", "<=", "<=>", "==", "===", 
    "=~", ">", ">=", ">>", "[]", "^", "__id__", 
    "__send__", "abs", "between?", "ceil", "chr", 
    "class", "clone", "coerce", "display", 
    "divmod", "downto", "dup", "eql?", "equal?", 
    "extend", "floor", "freeze", "frozen?", 
    "hash", "id", "id2name", "inspect", 
    "instance_eval", "instance_of?", 
    "instance_variables", "integer?", "is_a?", 
    "kind_of?", "method", "methods", "modulo", 
    "next", "nil?", "nonzero?", "prec", "prec_f", 
    "prec_i", "private_methods", 
    "protected_methods", "public_methods", 
    "remainder", "respond_to?", "round", "send", 
    "singleton_methods", "size", "step", "succ", 
    "taint", "tainted?", "times", "to_a", "to_f", 
    "to_i", "to_int", "to_s", "truncate", "type", 
    "untaint", "upto", "zero?", "|", "~"]

You’ll grow to hate the fact that Python’s sort is in-place (Ruby provides Array#sort! if you really have to). You’ll grow to appreciate that Ruby allows “?” and “!” on the end of methods.
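
A small sketch of those conventions side by side (the array contents are arbitrary):

```ruby
words = ["pear", "apple", "fig"]

sorted = words.sort        # returns a new array
p sorted                   # => ["apple", "fig", "pear"]
p words                    # => ["pear", "apple", "fig"]

words.sort!                # the ! marks the in-place variant
p words                    # => ["apple", "fig", "pear"]

# Predicates end in ? and read like questions.
p words.empty?             # => false
p words.include?("fig")    # => true

# join is a method on the array, not on the separator string.
p words.join(", ")         # => "apple, fig, pear"
```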

It’s just a really good language. Too bad about the end end end but that’s really the only blight.

Posted by FusionGyro at March 18, 2005 11:10 PM
