I keep seeing programmers from different backgrounds who struggle to understand what Ruby symbols are. I know there are already many great posts on this topic, and my intent is not to add one more to the pile, but I feel I have to clear up a few points about them.
So I’m trying to answer 2 important questions here: What are Ruby symbols? And when should you use them?
Well, according to the API documentation:
Symbol objects represent names and some strings inside the Ruby interpreter. They are generated using the :name and :"string" literals syntax, and by the various to_sym methods. The same Symbol object will be created for a given name or string for the duration of a program’s execution, regardless of the context or meaning of that name. Thus if Fred is a constant in one context, a method in another, and a class in a third, the Symbol :Fred will be the same object in all three contexts.
Let’s walk over this long quote, point by point, but let’s first list all the points it has:
1-Symbol objects represent names and some strings inside the Ruby interpreter.
2-They are generated using the :name and :"string" literals syntax, and by the various to_sym methods.
3-The same Symbol object will be created for a given name or string for the duration of a program’s execution, regardless of the context or meaning of that name. Thus if Fred is a constant in one context, a method in another, and a class in a third, the Symbol :Fred will be the same object in all three contexts.
I’ll start with the 2nd point and then get back to the rest. Before I do, please fire up irb.
We can create a symbol in various ways:
# Normal way: just prefix a token with ':'
greeting = :hi #=> :hi
# Multi-token symbol
another_greeting = :"hello man" #=> :"hello man"
# Use .to_sym if it's defined for your object's class
# For example, .to_sym is defined in the String class
a_third_greeting = "howdy".to_sym #=> :howdy
# Using %s[ ]
%s[a 4th one] #=> :"a 4th one"
# We can also cast a symbol to a string with to_s
:ds.to_s #=> "ds"
That was the easy part. Now let’s get back to the first point, which says: “Symbol objects represent names and some strings inside the Ruby interpreter.” What does that mean exactly?
In computer science there is a term called the symbol table: a structure in which the compiler or interpreter of a language stores the identifiers found in the source code so that they can be referenced later, for example from the Abstract Syntax Tree (AST).
The data structure that represents the symbol table varies from one interpreter to another, but what we care about is Ruby. In Ruby, the symbol table stores various things such as method names and symbol names (we will see why later on), and the value of a symbol is a unique integer that can’t be changed.
# Not working on Ruby 1.9
:ds.to_i #=> 28777
# Notice the value of the symbol is not its object id
:ds.object_id #=> 287778
# Symbol values can't be changed
:ds = 3 #SyntaxError: compile error
Now let’s take a more in-depth example and explore the symbol table:
# Let's check which symbol names start with 'hello'
Symbol.all_symbols.collect{|x| x.to_s}.grep(/^hello.*$/) #=> ["hello"]
# Now let's define a new dummy class and add a new method called 'hello_world'
class Dummy; def hello_world; end ; end #=> nil
# Check again.
Symbol.all_symbols.collect{|x| x.to_s}.grep(/^hello.*$/) #=> ["hello", "hello_world"]
As you can see, when we defined the class ‘Dummy’ and more specifically when we defined the ‘hello_world’ method, it was added to the symbol table.
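The same effect can be checked from a script instead of irb. This is only a minimal sketch: the class `Greeter` and the helper `symbol_known?` are hypothetical names made up for the illustration, and the method name is assembled at run time so that it never appears as an identifier in the source (otherwise the parser would register it before the check runs).

```ruby
# Hypothetical helper: is there a symbol with this name in the symbol table?
# Comparing by string avoids creating the symbol as a side effect of asking.
def symbol_known?(name)
  Symbol.all_symbols.any? { |sym| sym.to_s == name }
end

# Hypothetical class used only for this illustration.
class Greeter; end

# Build the name at run time so it is not written out as an identifier.
method_name = ["hello", "from", "greeter"].join("_")

before = symbol_known?(method_name)

# Defining a method registers its name as a symbol.
Greeter.send(:define_method, method_name) { "hi" }

after = symbol_known?(method_name)

puts "known before definition: #{before}"
puts "known after definition:  #{after}"
```

The exact contents of the symbol table depend on the Ruby version and on what has been loaded, so treat the "before" value as something to observe rather than something guaranteed.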
Let’s take another example:
Symbol.all_symbols.size #=> 3329
:koko #=> :koko
Symbol.all_symbols.size #=> 3330
Now let’s take the last point: “The same Symbol object will be created for a given name or string for the duration of a program’s execution, regardless of the context or meaning of that name. Thus if Fred is a constant in one context, a method in another, and a class in a third, the Symbol :Fred will be the same object in all three contexts.” So Fred is :Fred wherever you see it, no matter what context it appears in:
k = :Fred #=> :Fred
module M; Cons = :Fred; end #=> :Fred
k.object_id #=> 287498
M::Cons.object_id #=> 287498
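To contrast this with strings, here is a small sketch. The two helper methods are hypothetical names introduced for the illustration. `String.new` is used so the result does not depend on how a given Ruby version treats string literals.

```ruby
# Every occurrence of the same symbol refers to one object.
def same_symbol_object?
  :Fred.equal?(:Fred)
end

# Two separately built strings with equal content are distinct objects.
def same_string_object?
  String.new("Fred").equal?(String.new("Fred"))
end

puts same_symbol_object?                        # true
puts same_string_object?                        # false
# The strings are still equal by content, just not the same object.
puts String.new("Fred") == String.new("Fred")   # true
# Converting a string yields the very same symbol object as the literal.
puts "Fred".to_sym.equal?(:Fred)                # true
```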
So when should you use symbols? This might be the million-dollar question, and it’s the reason I wrote this post in the first place. You might also be wondering why Matz chose to give us this low-level introspection into the language by letting us work with the interpreter’s internals.
The answer has 2 parts:
1-Efficiency.
2-Metaprogramming (reflection).
We will talk about efficiency first, so let’s check this snippet of code:
# Some programmers would do this
if name == "khaled alhabache"
The snippet above is really costly in terms of memory and efficiency:
1-Comparing 2 strings is costly, especially when the 2 strings are long.
2-It reserves a variable amount of memory, 16 bytes in our case, to instantiate “khaled alhabache”.
3-The GC has to clean up this “khaled alhabache” string later on.
What about doing:
# This is what I call "a cleaner approach"
if name.to_sym == :"khaled alhabache"
Now what we did is:
1-Comparing 2 integers (the value of a symbol is an integer), which is cheaper.
2-Reserving 4 bytes of memory for the :"khaled alhabache" symbol, because a symbol is ultimately an integer.
3-The GC does not have to clean up this :"khaled alhabache" symbol, because symbols don’t get deleted until the program exits.
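Claims like these are easy to measure. Here is a rough sketch rather than a rigorous benchmark: `time_block` and `compare_costs` are hypothetical helpers, the iteration count is arbitrary, and the timing only covers the comparison itself, not the cost of converting a string with to_sym first. Results will vary with the Ruby version and the machine.

```ruby
# Hypothetical helper: seconds elapsed while running the given block.
def time_block
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Time string-to-string and symbol-to-symbol comparisons.
def compare_costs(iterations = 200_000)
  left_string  = String.new("khaled alhabache")
  right_string = String.new("khaled alhabache")
  left_symbol  = :"khaled alhabache"
  right_symbol = :"khaled alhabache"

  string_time = time_block { iterations.times { left_string == right_string } }
  symbol_time = time_block { iterations.times { left_symbol == right_symbol } }

  { string: string_time, symbol: symbol_time }
end

result = compare_costs
puts format("string ==: %.4fs", result[:string])
puts format("symbol ==: %.4fs", result[:symbol])
```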
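So use symbols where you can and avoid strings where you can, but take extra care before defining thousands of symbols: as mentioned, symbols don’t get deleted until the program exits, so they stay in memory. (Ruby 2.2 and later can garbage-collect symbols created dynamically, for example with to_sym, but symbols written literally in code still live for the whole run.)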
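One way to keep the number of symbols bounded is to convert outside text only when it matches a name the program already knows. This is a sketch under assumptions: the constant `KNOWN_BUTTON_TYPES` and the method `button_type_for` are hypothetical names, not part of any library.

```ruby
# Hypothetical list of names the program is prepared to handle.
KNOWN_BUTTON_TYPES = %w[input button image].freeze

# Convert external text to a symbol only when it is on the known list,
# so arbitrary input never adds new entries to the symbol table.
def button_type_for(text)
  KNOWN_BUTTON_TYPES.include?(text) ? text.to_sym : nil
end

p button_type_for("button")         # :button
p button_type_for("anything else")  # nil
```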
Well, working with metaprogramming in Ruby is really nice. You can do something like:
to = [:to_s , :to_f , :to_r] #=> [:to_s, :to_f, :to_r]
# Notice the use of symbols with reflection, e.g. with the 'send' method
to.each{|method| puts "#{method} => #{5.send method}"}
# to_s => 5
# to_f => 5.0
# to_r => 5 (Ruby 1.9 and later print 5/1)
Symbols are what make reflection techniques like 'send' work: they are how you name the method you want to invoke dynamically. The same goes for introspection techniques like 'respond_to?':
5.respond_to? :slice #=> false
5.respond_to? :to_f #=> true
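The two techniques combine naturally: check first, then dispatch. A minimal sketch, where `call_if_supported` is a hypothetical helper name:

```ruby
# Hypothetical helper: invoke method_name on object only if it responds to it.
def call_if_supported(object, method_name)
  return nil unless object.respond_to?(method_name)

  object.public_send(method_name)
end

p call_if_supported(5, :to_f)    # 5.0
p call_if_supported(5, :slice)   # nil
p call_if_supported("5", :to_i)  # 5
```

`public_send` is used instead of `send` so the helper cannot reach private methods by accident.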
Update, in response to readers’ comments:
It’s true that you can do something like:
5.respond_to? "to_f" #=> true
What’s happening there is that Ruby casts the string to a symbol for you, so why reserve extra memory just to send it as a string?
For those objecting on the grounds of memory efficiency with symbols, I strongly recommend reading this post as well.
I hope I could help you understand what Ruby symbols are and what they are used for, especially those of you coming from other programming backgrounds.
Emmanuel Oga | January 6th, 2009 at 2:57 pm #
I’m not sure about using
if name.to_sym == :"khaled alhabache"
…always. I guess it depends on where the name var comes from. If I’m going to have thousands of never-garbage-collected symbols, maybe I could be introducing a “memory leak” by using them too liberally.
Also, you can do things like
5.respond_to? "slice"
that is, using strings instead of symbols, but you are right, Ruby converts strings into symbols for that kind of operation anyway:
Symbol.all_symbols.length # => 8495
1.respond_to?("caracatisqui") # => false
Symbol.all_symbols.length # => 8496
John | January 6th, 2009 at 6:21 pm #
respond_to? and send respond to strings, so technically symbols are not required to call those methods.
khelll | January 6th, 2009 at 9:40 pm #
@ Emmanuel Oga, if so, it’s better to have the name var as a symbol by default, instead of having it as a string and then casting it with name.to_sym.
As for the second part of your question, you are right, Ruby will cast “caracatisqui” to a symbol, but why reserve memory just to send it as a string? Why not use the symbol directly?
khelll | January 6th, 2009 at 9:43 pm #
@ John, if you have a look at the Object API documentation, you will notice that the 'send' method receives a symbol, not a string, but Ruby casts it for you.
So technically it’s not required, but why reserve extra memory to 'send' a string when you can use a symbol directly?
Then it’s all about efficiency….
nico | January 7th, 2009 at 2:39 am #
In ruby 1.9 :ds.to_i gives a NoMethodError
khelll | January 7th, 2009 at 4:00 am #
@ nico, true, that won’t work, thanks for mentioning it.
Glenn Gillen | January 7th, 2009 at 9:52 am #
How is that considered a more efficient use of memory? The same result as interpreted from others has been:
from Josh Susser
from Eric Kidd
I’m sure there are scenarios where this might well be the case, but they’d need to be carefully thought out and measured with real world performance. I would think in the vast majority of cases that leaving these objects sitting around stale in memory is less than optimal.
khelll | January 7th, 2009 at 1:06 pm #
@ Glenn Gillen, you are totally right, that’s why I said:
khelll | January 7th, 2009 at 1:17 pm #
For guys who are objecting on memory benefits, please have a look at this article
Apostlion | January 7th, 2009 at 4:28 pm #
A symbol is not a memory leak. A symbol is a memory saver.
What’s GC, really?
Say, you create a hash as in Glu.ttono.us article listed above, {"abubu" => "ububa"}, and create another hash, oddly enough, also {"abubu" => "ububa"}. Well, four strings created, and if you delete the first hash, GC will kill two strings for ya.
If using symbols, you don’t have those “extra” strings to delete anyway.
And no matter how long your program runs, or whatever fancy loops and hoops it does, if you’re not dynamically creating symbols via #to_sym, only those specific, unique symbols you’ve explicitly named in your code will live. 1Mb memory leak is around 260k such symbols… can you even imagine that many symbol names?
As Khaled said, you’re ok unless you start dynamically creating thousands of symbols somewhere.
Jack | January 7th, 2009 at 4:40 pm #
My fave use for symbols in Ruby, actually, is for uses roughly equivalent to use cases for “enum” in C: when passing mnemonic codes around between different scopes (something beyond simple “true” and “false”). Practically that’s likely to be similar to metaprogramming but a bit simpler. For example, you might have:
def submit_button(caption, type=:input)
# Code for :input, :button, :image, :a_with_onclick or whatever
end
…or there’s the slightly more familiar approach of the options hash:
def add_column(name, type, default_value=nil, options={})
# code which checks for options[:not_null], options[:key], options[:constraint],
# options[:comment] et cetera.
end
Dan Mayer | January 7th, 2009 at 5:26 pm #
Good post. I never really knew why people were pushing to use symbols over strings all the time. The best explanation I had heard before this was that Symbols were ‘lighter weight’ strings. I have used them all the time without really fully getting it.
Twitter | June 3rd, 2009 at 1:02 pm #
This is my third time here, just thought I would mention that you are doing a great job