A Mental Model for Understanding Encapsulation in Ruby

Encapsulation in object-oriented programming is the grouping of data into objects while making that data unavailable to other parts of a codebase. It's one of the fundamental conceptual pillars of object-oriented programming (along with abstraction, polymorphism and inheritance).

More simply, encapsulation is akin to placing information into a bucket, closing the lid and hiding that information from the curious and prying eyes of others. Within the context of object-oriented programming, the data we're hiding are the attributes that an object has (which together comprise the state of the object) and the functionality (behaviors) of which the object is capable. Therefore, to say that we have encapsulated data is to say that we have hidden the state and behaviors of an object from the rest of the codebase. To say that we are hiding data is to say that we are making that data inaccessible from the rest of the program, which also protects that data from unintentional manipulation.

In this article I discuss what exactly encapsulation looks like, how methods are used to encapsulate data or expose it to the rest of a codebase, and how exactly encapsulation benefits our ability to design complex programs.

The Hidden Nature of Objects

Ruby uses instance variables to track the values associated with an object's attributes. For example, we may have a Dog class whose objects have the following attributes: name, weight and age. To track these attributes, our Dog class has instance variables of the same names: @name, @weight and @age. Together, these three instance variables comprise the state of objects instantiated from the Dog class.

Let’s consider the following example:

class Dog
  def initialize(name, weight, age)
    @name = name
    @weight = weight
    @age = age
  end
end

spot = Dog.new('Spot', 12, 4)
p spot
# => #<Dog:0x000000010d85f198 @name="Spot", @weight=12, @age=4>

Here we have initialized a local variable spot to a new object of the Dog class. The state of this object is represented by the instance variables that we have initialized on lines 3-5, which tell us that the @name of this object is 'Spot', that it has a @weight of 12 (lbs) and that its @age is 4. Invoking p on line 10 prints a human-readable representation of the object spot: the class name of our object, an encoding of the object id and, most importantly for this discussion, all of the instance variables that have been initialized within that object along with their values.

At their most basic level, objects are distinct entities that are separate and independent of each other. We could also say that the internal representation of objects is hidden from each other, where one object is unable to access information about the other. Practically, this means that we cannot interact with an object unless we explicitly expose the state and behaviors of that object to other parts of our code. Ruby uses methods to expose the state of an object, but without these methods an object remains what is essentially an information silo, closed off and isolated from the rest of the world (or, rather, code). With these methods, we can retrieve and manipulate the data within an object.

To use the example above, our dog spot has a number of attributes. However, with the code as currently written there is no way to access them from the rest of the program. To demonstrate this we can attempt to reference one of the instance variables from outside of the class Dog where the instance variable was initialized.

class Dog
  def initialize(name, weight, age)
    @name = name
    @weight = weight
    @age = age
  end
end

spot = Dog.new('Spot', 12, 4)
p spot
# => #<Dog:0x000000010d85f198 @name="Spot", @weight=12, @age=4>

p @name
# => nil

As we saw before, we've initialized the @name instance variable on line 3, to which we've assigned the value 'Spot'. We also confirmed that the instance variables have been initialized when we invoked p on line 10, which printed the instance variables and their values along with other information about our object spot. Yet when we attempt to reference the variable on line 13, nil is printed.

There’s a good chance this result isn’t surprising to you, but it might not be surprising for the reason you think.

The reason nil is returned when directly referencing the @name instance variable outside of the Dog class is that instance variables are encapsulated in the object where they were initialized. The @name on line 13 does not refer to spot's instance variable at all: at the top level of a Ruby program self is the main object, so @name there is an instance variable of main that was never initialized, and an uninitialized instance variable evaluates to nil. As you recall from the beginning of our discussion, when data is encapsulated it is hidden (that is to say, inaccessible) from other parts of our codebase. Instance variables are hidden by default unless we deliberately create a way to access their values. To expose an object's instance variables, to take them out of hiding, so to speak, we have to define methods that allow us to access the instance variable's value outside of the object.
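A minimal sketch of why the bare reference yields nil, assuming the code runs at the top level of a Ruby script. The one-attribute Dog is trimmed down purely for illustration; instance_variables and instance_variable_defined? are built-in Object methods.

```ruby
class Dog
  def initialize(name)
    @name = name
  end
end

spot = Dog.new('Spot')

# The instance variable lives inside the object referenced by spot.
p spot.instance_variables
# => [:@name]

# At the top level, self is the main object, which has its own
# (here empty) set of instance variables.
p self
# => main

p instance_variable_defined?(:@name)
# => false

# Referencing an instance variable that was never initialized evaluates to nil.
p @name
# => nil
```

In other words, the two @name references never meet: one belongs to spot, the other to main.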

Using Methods to Open Objects to the Rest of the World

Instance variables are scoped at the object level, meaning that an instance variable exists only within the object where it was initialized. Therefore, to gain access to the values assigned to an object’s instance variables we have to consider the object as the gateway to access them.

To access these values, we need to define instance methods that expose them to other parts of our code. When an object's instance variables are exposed we can retrieve the values assigned to them, or we can manipulate those values altogether. When we don't have methods that expose them, they remain locked and hidden within the object.

Let’s define instance methods that allow us to retrieve and manipulate the value assigned to one of our instance variables; such methods are called getter and setter methods, respectively. For this example, we’ll use the @name instance variable.

class Dog
  # previous code omitted for brevity

  def name
    @name
  end

  def name=(new_name)
    @name = new_name
  end
end

spot = Dog.new('Spot', 12, 4)

p spot.name
# => "Spot"

spot.name = 'Daisy'

p spot.name
# => "Daisy"

On lines 4-6 we've defined our getter method name. When we invoke the name method on line 15, it simply returns the value assigned to the @name instance variable. Likewise, on lines 8-10 we've defined our setter method name=(), which reassigns the @name instance variable to the string passed as an argument to the name=() method.

And voilà, we are now able to access the value assigned to @name from outside of the class where it was initialized. With both getter and setter methods in place, we can now both retrieve and reassign the value assigned to @name.

You'll notice, of course, that the name and name=() instance methods are invoked on our Dog object spot. This is because instance methods can only be invoked on objects of the class where those methods are defined. As mentioned previously, if we want to access information about the state of an object, we have to consider the object as the gateway to that information. If we were to invoke, for example, the name method without spot as its receiver, an error would be raised.

class Dog
  # previous code omitted for brevity

  def name
    @name
  end

  def name=(new_name)
    @name = new_name
  end
end

spot = Dog.new('Spot', 12, 4)

name
# => undefined local variable or method `name' for main:Object (NameError)
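The hand-written getter and setter in this section can also be generated by Ruby's built-in attr_reader, attr_writer and attr_accessor methods. The following is a sketch of that shorthand; making weight read-only is an assumption chosen here only to show the difference between the variants.

```ruby
class Dog
  # attr_accessor generates both a getter (name) and a setter (name=)
  # backed by the @name instance variable.
  attr_accessor :name

  # attr_reader generates only a getter, so weight can be read but not
  # reassigned from outside the object.
  attr_reader :weight

  def initialize(name, weight, age)
    @name = name
    @weight = weight
    @age = age
  end
end

spot = Dog.new('Spot', 12, 4)

spot.name = 'Daisy'
p spot.name
# => "Daisy"

p spot.weight
# => 12

# No setter was generated for weight, and nothing at all for age.
p spot.respond_to?(:weight=)
# => false
p spot.respond_to?(:age)
# => false
```

Which attributes get a reader, a writer, both or neither is exactly the decision about what to expose and what to keep hidden.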

Controlling How Objects are Exposed with Method Access Control

Encapsulation allows a programmer to have fine-tuned control over what data is hidden within an object and what data is exposed to other parts of a program. Think of it like peeling back the layers of an onion: rather than exposing the entirety of the inner onion, I can peel back as many or as few layers as I want, revealing only as much of the onion as is necessary.

Any discussion of encapsulation necessarily entails a discussion of how exactly Ruby accomplishes the encapsulation of data. Like many other programming languages, Ruby uses the concept of access control to restrict and open access to the methods that allow one to retrieve and manipulate data within an object. Within the context of Ruby, we call this mechanism method access control.

Method access control uses access modifiers to control access to methods. In Ruby, access modifiers are public, private and the less commonly used protected. Public methods are available outside of the class where they are defined, meaning that they can be invoked anywhere without restriction. In Ruby, methods are public unless explicitly declared to be private or protected. In our code example above where we defined getter and setter methods for the @name instance variable, these are public methods since we did not declare them to be otherwise. We also saw this in practice when we successfully invoked the name and name=() methods from outside of the Dog class.

Private methods, on the other hand, can only be invoked from within the class itself. It is also the case that private methods cannot be called with an explicit receiver, because they are implicitly invoked on self.[1]

Let’s look at an example of how to declare a method to be private, as well as the implications of doing so.

class Dog
  # previous code omitted for brevity

  def name=(new_name)
    @name = new_name
  end

  private  # methods defined after this are private methods

  def name
    @name
  end
end

spot = Dog.new('Spot', 12, 4)

p spot.name
# => private method `name' called for #<Dog:0x0000000123136608 @name="Spot", @weight=12, @age=4> (NoMethodError)

To declare a method as private, we simply invoke the private method, followed by any method definitions we intend to be private. In the above code, we have taken the name getter method and moved it below the private method invocation. And the result? As we see here, invoking a private method from outside the class raises a NoMethodError whose message indicates that the method we have tried to call is a private method. While Ruby typically raises a NoMethodError when it doesn't find a method in the receiver's lookup path, in this case it raises the error not because the method doesn't exist but because making the method private has blocked access to it.
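To see that the private method still exists and is merely inaccessible from outside, here is a small sketch using Ruby's built-in reflection methods (private_method_defined?, public_method_defined? and respond_to?). The one-attribute Dog is a simplification for illustration.

```ruby
class Dog
  def initialize(name)
    @name = name
  end

  def name=(new_name)
    @name = new_name
  end

  private

  def name
    @name
  end
end

spot = Dog.new('Spot')

# The method is defined on the class; it is just not public.
p Dog.private_method_defined?(:name)
# => true
p Dog.public_method_defined?(:name=)
# => true

# respond_to? ignores private methods unless its second argument is true.
p spot.respond_to?(:name)
# => false
p spot.respond_to?(:name, true)
# => true

# Calling it from outside is what raises the error.
begin
  spot.name
rescue NoMethodError => e
  puts e.class
end
# prints: NoMethodError
```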

As stated previously, private methods can only be invoked from within the class where the method is defined. Now that we’ve declared the name method as a private method, here’s an example of how we can call it without raising an error.

class Dog
  # previous code omitted for brevity

  def speak
    "My name is #{name}!"
  end

  private

  def name
    @name
  end
end

spot = Dog.new('Spot', 12, 4)

p spot.speak
# => "My name is Spot!"

On line 5 we've interpolated the return value of the name method within a new method definition, speak, with the resulting output on line 18. Whereas we could not invoke the private name method from outside of the class, this demonstrates that private methods can be invoked by other instance methods of the same class. It also demonstrates how we can encapsulate methods, since private methods are accessible only within the class itself and are protected from invocation outside of the class.
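As the footnote on private methods mentions, Ruby 2.7 relaxed the receiver rule slightly. The sketch below assumes Ruby 2.7 or later and shows that a private method may be called with a literal self as its receiver, while a call from outside the object still fails.

```ruby
class Dog
  def initialize(name)
    @name = name
  end

  def speak
    # Explicit self as the receiver of a private method: accepted on
    # Ruby 2.7 and later, a NoMethodError on earlier versions.
    "My name is #{self.name}!"
  end

  private

  def name
    @name
  end
end

spot = Dog.new('Spot')

p spot.speak
# => "My name is Spot!"

# From outside the object, the receiver is spot rather than self,
# so the call is still rejected.
begin
  spot.name
rescue NoMethodError => e
  puts e.class
end
# prints: NoMethodError
```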

Protected methods lie between public and private methods. Like private methods, they can only be invoked from within the class where they are defined. However, unlike private methods they can be invoked on a receiver other than self, so long as the receiver is an instance of the same class. Protected methods aren't commonly used, but one typical use case is comparing objects of the same class.

class Dog
  def initialize(name, weight, age)
    @name = name
    @weight = weight
    @age = age
  end

  def <(other)
    weight < other.weight
  end

  protected

  def weight
    @weight
  end
end

spot = Dog.new('Spot', 12, 4)
daisy = Dog.new('Daisy', 25, 12)

puts spot < daisy
# => true

puts spot.weight
# => protected method `weight' called for #<Dog:0x00000001260c1710 @name="Spot", @weight=12, @age=4> (NoMethodError)

In the above example we've declared weight as a protected method. This allows us to protect the weight method from being accessed outside of the Dog class, while also allowing us to compare the weight of objects of the Dog class as defined in the < method on lines 8-10. Like private methods, protected methods can only be invoked from within the class, but unlike private methods they can be invoked on both self and other objects of the same class.
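For contrast, here is a sketch of the same comparison with weight declared private instead of protected. The class name PrivateWeightDog is hypothetical and exists only for this illustration.

```ruby
class PrivateWeightDog
  def initialize(name, weight)
    @name = name
    @weight = weight
  end

  def <(other)
    # weight (implicit self) is fine; other.weight has an explicit receiver
    # that is not self, which a private method does not allow.
    weight < other.weight
  end

  private

  def weight
    @weight
  end
end

spot = PrivateWeightDog.new('Spot', 12)
daisy = PrivateWeightDog.new('Daisy', 25)

begin
  spot < daisy
rescue NoMethodError => e
  puts "#{e.class} raised"
end
# prints: NoMethodError raised
```

The comparison fails on other.weight, which is precisely the call that protected permits.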

The Practical Benefits of Encapsulating Data in Objects

Learning object-oriented programming for the first time can be very challenging. Not only is it a major conceptual shift in how we think about programming, but it can be difficult to translate object-oriented programming on a conceptual level to how it benefits the code we write on a practical level. This is true as well of understanding encapsulation and how it is implemented in Ruby.

Encapsulating data into objects has two chief benefits:

  1. It protects data from unintentional manipulation. In other words, in order to change data within an object there must be obvious intention behind doing so. It also means that we can restrict the way in which data is manipulated.
  2. It allows us to hide complex operations while leaving a simple public interface to interact with those more complex operations.

Let’s illustrate these benefits with an example.

class Person
  attr_accessor :car

  def initialize(name)
    @name = name
    @car = nil
  end
end

class Car
  attr_reader :make, :model, :year, :engine_status

  def initialize(make, model, year)
    @make = make
    @model = model
    @year = year
    @engine_status = :off
  end

  def start_engine
    switch_ignition
    start_relay
    start_motor
    puts "Engine is #{engine_status}!"
  end

  private

  attr_writer :engine_status

  def switch_ignition
    # implementation
    puts 'Starting ignition...'
  end

  def start_relay
    # implementation
    puts 'Starting relay...'
  end

  def start_motor
    # implementation
    puts 'Starting motor...'
    self.engine_status = :on
  end
end

joe = Person.new('Joe')
joe.car = Car.new('Chevy', 'Impala', 1958)

joe.car.start_engine
# => Starting ignition...
# => Starting relay...
# => Starting motor...
# => Engine is on!

puts joe.car.engine_status
# => on

joe.car.engine_status = :off
# => private method `engine_status=' called for #<Car:0x0000000147948bd8 @make="Chevy", @model="Impala", @year=1958, @engine_status=:on> (NoMethodError)

The above code illustrates a simple but key point: in order to start the car, the object joe doesn't need to know the implementation details of every method involved in starting the engine, or even that those methods exist. More specifically, objects of the Person class do not need to access the switch_ignition, start_relay and start_motor methods, which are all necessary steps in starting an engine. Rather, the only method that objects of the Person class need to know about is the start_engine method; all other implementation details that follow from this method can remain hidden and inaccessible.

And that is encapsulation in practice. We've packaged all of the complex details involved in starting an engine, have made them inaccessible outside of the Car class and instead have defined a simple public interface — the start_engine method — to handle all of the underlying complexity. In fact, this models the real world implementation of starting a car: one doesn’t need to know the internal mechanics of how exactly a car engine starts. Instead, a person only needs to know how to turn the ignition with a key. The rest of the implementation happens under the hood and out of sight; in other words, it is encapsulated.

Notice as well that while we have defined a public getter method for the @engine_status instance variable on line 11, we have made its setter method on line 29 private. While we may want the status of the engine to be publicly accessible (for example, so that a mobile app can check whether the engine is running), we want only the internal implementation of the object's class to be able to reassign it, which protects it from the possibility of being directly changed from outside the class. In practical terms, we don't want an object other than a Car to be able to modify @engine_status. Rather, we want it to be changed only as a result of the internal implementation that begins with the public start_engine method and ends with the private start_motor method. In other words, we want the value of @engine_status to reflect the actual status of the engine, while also preventing arbitrary changes that don't. Using method access control to structure our methods this way ensures that @engine_status is manipulated with clear intention and only in the specific way we've designed it to be changed in our program.
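The first benefit, restricting the way in which data is manipulated, can also apply to a setter that stays public. The following is a minimal sketch returning to the Dog class; the rule that a weight must be a positive number is a hypothetical constraint invented for this illustration.

```ruby
class Dog
  attr_reader :weight

  def initialize(name, weight)
    @name = name
    self.weight = weight # route the initial value through the same check
  end

  # Hypothetical rule for illustration: a weight must be a positive number.
  def weight=(new_weight)
    unless new_weight.is_a?(Numeric) && new_weight.positive?
      raise ArgumentError, "weight must be a positive number, got #{new_weight.inspect}"
    end

    @weight = new_weight
  end
end

spot = Dog.new('Spot', 12)

spot.weight = 13
p spot.weight
# => 13

begin
  spot.weight = -5
rescue ArgumentError => e
  puts e.message
end
# prints: weight must be a positive number, got -5

# The rejected assignment left the state untouched.
p spot.weight
# => 13
```

Because @weight can only change through weight=, every change has to pass the check, whether it comes from inside or outside the class.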

Conclusion

In summary, encapsulation allows a programmer to group data into objects and then hide that data from the rest of the codebase. Likewise, it also allows a programmer to expose only data that needs to be accessed outside of the class. By encapsulating data, we can prevent arbitrary changes to data, and we can also hide complex operations while providing a simple public interface to interact with them.


  1. As of Ruby 2.7, private methods can be called with a literal self as the explicit receiver.
