This article is written by an author from toptal.com ( Origin )
Domain specific languages (DSL) are an incredibly powerful tool for making it easier to program or configure complex systems. They are also everywhere—as a software engineer you are most likely using several different DSLs on a daily basis.
In this article, you will learn what domain specific languages are, when they should be used, and finally how you can make your very own DSL in Ruby using advanced metaprogramming techniques.
This article builds upon Nikola Todorovic’s introduction to Ruby metaprogramming, also published on the Toptal Blog. So if you are new to metaprogramming, make sure you read that first.
What Is a Domain Specific Language?
The general definition of DSLs is that they are languages specialized to a particular application domain or use case. This means that you can only use them for specific things; they are not suitable for general-purpose software development. If that sounds broad, that’s because it is. DSLs come in many different shapes and sizes. Here are a few important categories:

- Markup languages such as HTML and CSS are designed for describing specific things like the structure, content, and styles of web pages. It is not possible to write arbitrary algorithms with them, so they fit the description of a DSL.
- Macro and query languages (e.g., SQL) sit on top of a particular system or another programming language and are usually limited in what they can do, so they also qualify as domain specific languages.
- Many DSLs do not have their own syntax. Instead, they use the syntax of an established programming language in a clever way that feels like using a separate mini-language. These are commonly called internal (or embedded) DSLs.

A Rails routes file is a familiar example of this last category:

    Rails.application.routes.draw do
      root to: "pages#main"

      resources :posts do
        get :preview

        resources :comments, only: [:new, :create, :destroy]
      end
    end
This is Ruby code, yet it feels more like a custom route definition language, thanks to the various metaprogramming techniques that make such a clean, easy-to-use interface possible. Notice that the structure of the DSL is implemented using Ruby blocks, and method calls such as `get` and `resources` are used for defining the keywords of this mini-language.

Metaprogramming is used even more heavily in the RSpec testing library:

    describe UsersController, type: :controller do
      before do
        allow(controller).to receive(:current_user).and_return(nil)
      end

      describe "GET #new" do
        subject { get :new }

        it "returns success" do
          expect(subject).to be_success
        end
      end
    end
This piece of code also contains examples of fluent interfaces, which allow declarations to be read out loud as plain English sentences, making it a lot easier to understand what the code is doing:

    allow(controller).to receive(:current_user).and_return(nil)
    expect(subject).to be_success
Another example of a fluent interface is the query interface of ActiveRecord and Arel, which uses an abstract syntax tree internally for building complex SQL queries:

    Post.
      select([
        Post[Arel.star],
        Comment[:id].count.
          as("num_comments"),
      ]).
      joins(:comments).
      where.not(status: :draft).
      where(
        Post[:created_at].lte(Time.now)
      ).
      group(Post[:id])
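To make the chaining in these examples concrete, here is a minimal sketch of a fluent builder. The `QuerySketch` class and its methods are hypothetical and are not part of ActiveRecord or Arel; the only point is that each method returns a builder object, so calls can be strung together into a sentence-like expression.

```ruby
# Hypothetical chainable builder, for illustration only (not ActiveRecord/Arel).
class QuerySketch
  def initialize(table, conditions = [], limit = nil)
    @table = table
    @conditions = conditions
    @limit = limit
  end

  # Each method returns a new builder instead of mutating the receiver,
  # so intermediate objects can be reused safely.
  def where(condition)
    QuerySketch.new(@table, @conditions + [condition], @limit)
  end

  def limit(count)
    QuerySketch.new(@table, @conditions, count)
  end

  def to_sql
    sql = "SELECT * FROM #{@table}"
    sql += " WHERE #{@conditions.join(' AND ')}" unless @conditions.empty?
    sql += " LIMIT #{@limit}" if @limit
    sql
  end
end

query = QuerySketch.new("posts").where("status <> 'draft'").limit(10)
puts query.to_sql
```

Returning a fresh object from every call is one possible design; returning `self` after mutating it would chain just as well, at the cost of shared state between intermediate results.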
Although the clean and expressive syntax of Ruby along with its metaprogramming capabilities makes it uniquely suited for building domain specific languages, DSLs exist in other languages as well. Here is an example of a JavaScript test using the Jasmine framework:

    describe("Helper functions", function() {
      beforeEach(function() {
        this.helpers = window.helpers;
      });

      describe("log error", function() {
        it("logs error message to console", function() {
          spyOn(console, "log").and.returnValue(true);
          this.helpers.log_error("oops!");
          expect(console.log).toHaveBeenCalledWith("ERROR: oops!");
        });
      });
    });
This syntax is perhaps not as clean as that of the Ruby examples, but it shows that with clever naming and creative use of the syntax, internal DSLs can be created using almost any language.
The benefit of internal DSLs is that they don’t require a separate parser, which can be notoriously difficult to implement properly. And because they use the syntax of the language they are implemented in, they also integrate seamlessly with the rest of the codebase.
What we have to give up in return is syntactic freedom—internal DSLs have to be syntactically valid in their implementation language. How much you have to compromise in this regard depends largely on the selected language, with verbose, statically typed languages such as Java and VB.NET being on one end of the spectrum, and dynamic languages with extensive metaprogramming capabilities such as Ruby on the other end.
Building Our Own—A Ruby DSL for Class Configuration
The example DSL we are going to build in Ruby is a reusable configuration engine for specifying the configuration attributes of a Ruby class using a very simple syntax. Adding configuration capabilities to a class is a very common requirement in the Ruby world, especially when it comes to configuring external gems and API clients. The usual solution is an interface like this:

    MyApp.configure do |config|
      config.app_id = "my_app"
      config.title = "My App"
      config.cookie_name = "my_app_session"
    end
Let’s implement this interface first—and then, using it as the starting point, we can improve it step by step by adding more features, cleaning up the syntax, and making our work reusable.
What do we need to make this interface work? The `MyApp` class should have a `configure` class method that takes a block and then executes that block by yielding to it, passing in a configuration object that has accessor methods for reading and writing the configuration values:

    class MyApp
      class << self
        def config
          @config ||= Configuration.new
        end

        def configure
          yield config
        end
      end

      class Configuration
        attr_accessor :app_id, :title, :cookie_name
      end
    end
Once the configuration block has run, we can easily access and modify the values:

    MyApp.config
    => #<MyApp::Configuration ...>

    MyApp.config.title
    => "My App"

    MyApp.config.app_id = "not_my_app"
    => "not_my_app"
So far, this implementation does not feel enough like a custom language to be considered a DSL. But let’s take things one step at a time. Next, we will decouple the configuration functionality from the `MyApp` class and make it generic enough to be usable in many different use cases.

Making It Reusable

Right now, if we wanted to add similar configuration capabilities to a different class, we would have to copy both the `Configuration` class and its related setup methods into that other class, as well as edit the `attr_accessor` list to change the accepted configuration attributes. To avoid having to do this, let’s move the configuration features into a separate module called `Configurable`. With that, our `MyApp` class will look like this:

    class MyApp
      include Configurable
    end
Everything related to configuration has been moved to the `Configurable` module:

    module Configurable
      def self.included(host_class)
        host_class.extend ClassMethods
      end

      module ClassMethods
        def config
          @config ||= Configuration.new
        end

        def configure
          yield config
        end
      end

      class Configuration
        attr_accessor :app_id, :title, :cookie_name
      end
    end
Not much has changed here, except for the new `self.included` method. We need this method because including a module only mixes in its instance methods, so our `config` and `configure` class methods will not be added to the host class by default. However, if we define a special method called `included` on a module, Ruby will call it whenever that module is included in a class. There we can manually extend the host class with the methods in `ClassMethods`:

    def self.included(host_class)
      host_class.extend ClassMethods
    end
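The difference between what `include` adds on its own and what the `included` hook adds can be seen in a small standalone sketch. The `Greeter` module and `Host` class below are made-up names used only for illustration.

```ruby
module Greeter
  # Mixed in by `include`: available on instances of the host class.
  def greet
    "hello from an instance"
  end

  module ClassMethods
    def greet_all
      "hello from the class"
    end
  end

  # Ruby calls this hook with the including class as its argument.
  def self.included(host_class)
    host_class.extend ClassMethods
  end
end

class Host
  include Greeter
end

puts Host.new.greet
puts Host.greet_all
# The instance method was not added at the class level:
puts Host.respond_to?(:greet)
```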
We are not done yet. Our next step is to make it possible to specify the supported attributes in the host class that includes the `Configurable` module. A solution like this would look nice:

    class MyApp
      include Configurable.with(:app_id, :title, :cookie_name)
    end
Perhaps somewhat surprisingly, the code above is syntactically correct: `include` is not a keyword but simply a regular method that expects a `Module` object as its parameter. As long as we pass it an expression that returns a `Module`, it will happily include it. So, instead of including `Configurable` directly, we need a method with the name `with` on it that generates a new module that is customized with the specified attributes:

    module Configurable
      def self.with(*attrs)
        config_class = Class.new do
          attr_accessor *attrs
        end

        class_methods = Module.new do
          define_method :config do
            @config ||= config_class.new
          end

          def configure
            yield config
          end
        end

        Module.new do
          singleton_class.send :define_method, :included do |host_class|
            host_class.extend class_methods
          end
        end
      end
    end
There is a lot to unpack here. The entire `Configurable` module now consists of just a single `with` method, with everything happening within that method. First, we create a new anonymous class with `Class.new` to hold our attribute accessor methods. Because `Class.new` takes the class definition as a block and blocks have access to outside variables, we are able to pass the `attrs` variable to `attr_accessor` without problems.

    def self.with(*attrs)
      config_class = Class.new do
        attr_accessor *attrs
      end

      # ...
    end
The fact that blocks in Ruby have access to outside variables is also the reason why they are sometimes called closures, as they include, or “close over”, the outside environment that they were defined in. Note that I used the phrase “defined in” and not “executed in”. That is deliberate: regardless of when and where our `define_method` blocks will eventually be executed, they will always be able to access the variables `config_class` and `class_methods`, even after the `with` method has finished running and returned. The following example demonstrates this behavior:

    def create_block
      foo = "hello"            # local variable
      return Proc.new { foo }  # the block captures foo
    end

    block = create_block
    block.call
    => "hello"
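One detail worth adding: a closure holds on to the variable itself, not to a snapshot of its value. The sketch below, with a made-up `make_counter` helper, creates two blocks over the same local variable and shows that a change made through one is visible through the other.

```ruby
def make_counter
  count = 0
  increment = Proc.new { count += 1 } # modifies the captured variable
  read = Proc.new { count }           # reads the same variable
  [increment, read]
end

increment, read = make_counter
increment.call
increment.call

# Both blocks share `count`, even though make_counter has returned.
puts read.call
```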
Now that we know about this neat behavior of blocks, we can go ahead and define an anonymous module in `class_methods` for the class methods that will be added to the host class when our generated module is included. Here we have to use `define_method` to define the `config` method, because we need access to the outside `config_class` variable from within the method. Defining the method using the `def` keyword would not give us that access because regular method definitions with `def` are not closures. However, `define_method` takes a block, so this will work:

    # config_class = ...

    class_methods = Module.new do
      define_method :config do
        @config ||= config_class.new
      end

      # ...
    end
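The scoping difference between `def` and `define_method` can be checked in isolation. In this sketch the `prefix` variable and both method names are invented for the demonstration; the `rescue` is only there so that the failed lookup inside `def` produces a message instead of aborting the script.

```ruby
prefix = "config"

scoped_class = Class.new do
  # `def` opens a fresh scope, so the outer local `prefix` is not visible.
  def with_def
    prefix
  rescue NameError
    "prefix is not visible here"
  end

  # The block given to define_method is a closure and can see `prefix`.
  define_method :with_block do
    prefix
  end
end

scoped = scoped_class.new
puts scoped.with_def
puts scoped.with_block
```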
Finally, we call `Module.new` to create the module that we are going to return. Here we need to define our `self.included` method, but unfortunately we cannot do that with the `def` keyword, as the method needs access to the outside `class_methods` variable. Therefore, we have to use `define_method` with a block again, but this time on the singleton class of the module, as we are defining a method on the module instance itself. Oh, and since `define_method` was a private method before Ruby 2.5, we use `send` to invoke it instead of calling it directly:

    # class_methods = ...

    Module.new do
      singleton_class.send :define_method, :included do |host_class|
        host_class.extend class_methods
      end
    end
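As a possible alternative to going through the singleton class with `send`, Ruby also provides `define_singleton_method`, which takes a block and defines the method directly on the receiver. The following sketch is a variation on the code above, not the article's version; `tracked?` and the variable names are made up for the example.

```ruby
class_methods = Module.new do
  def tracked?
    true
  end
end

hook_module = Module.new do
  # Inside this block `self` is the new module, so the hook ends up on
  # the module object itself, just like `def self.included` would.
  define_singleton_method :included do |host_class|
    host_class.extend class_methods
  end
end

# Class.new takes a block too, so the local `hook_module` is visible here.
report_class = Class.new do
  include hook_module
end

puts report_class.tracked?
```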
Phew, that was some pretty hardcore metaprogramming already. But was the added complexity worth it? Take a look at how easy it is to use and decide for yourself:

    class SomeClass
      include Configurable.with(:foo, :bar)
    end

    SomeClass.configure do |config|
      config.foo = "wat"
      config.bar = "huh"
    end

    SomeClass.config.foo
    => "wat"
But we can do even better. In the next step we will clean up the syntax of the `configure` block a little bit to make our module even more convenient to use.

Cleaning Up the Syntax

There is one last thing that is still bothering me with our current implementation: we have to repeat `config` on every single line in the configuration block. A proper DSL would know that everything within the `configure` block should be executed in the context of our configuration object and enable us to achieve the same thing with just this:

    MyApp.configure do
      app_id "my_app"
      title "My App"
      cookie_name "my_app_session"
    end
Let’s implement it, shall we? From the looks of it, we will need two things. First, we need a way to execute the block passed to `configure` in the context of the configuration object so that method calls within the block go to that object. Second, we have to change the accessor methods so that they write the value if an argument is provided to them and read it back when called without an argument. A possible implementation looks like this:

    module Configurable
      def self.with(*attrs)
        not_provided = Object.new

        config_class = Class.new do
          attrs.each do |attr|
            define_method attr do |value = not_provided|
              if value === not_provided
                instance_variable_get("@#{attr}")
              else
                instance_variable_set("@#{attr}", value)
              end
            end
          end

          attr_writer *attrs
        end

        class_methods = Module.new do
          # ... (the config method is unchanged)

          def configure(&block)
            config.instance_eval(&block)
          end
        end

        # ... (create and return the new module as before)
      end
    end
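Since `configure` now relies on `instance_eval`, it may help to try that method on its own first. The `Settings` class below is a hypothetical stand-in for the configuration object; the sketch shows that bare method calls inside the block go to the receiver, while local variables from the surrounding scope remain visible.

```ruby
class Settings
  attr_reader :values

  def initialize
    @values = {}
  end

  def set(key, value)
    @values[key] = value
  end
end

settings = Settings.new
environment = "production" # local variable from the calling scope

settings.instance_eval do
  # `self` is now `settings`, so the bare call to `set` goes to it,
  # while the outer local `environment` is still reachable.
  set :mode, environment
end

puts settings.values.inspect
```

The flip side is that `self` no longer refers to the caller inside the block, so the caller's own instance variables and methods are not reachable there. This is a trade-off to keep in mind for any DSL built on `instance_eval`.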
The simpler change here is running the `configure` block in the context of the configuration object. Calling Ruby’s `instance_eval` method on an object lets you execute an arbitrary block of code as if it were running within that object, which means that when the configuration block calls the `app_id` method on the first line, that call will go to our configuration class instance.

The change to the attribute accessor methods in `config_class` is a bit more complicated. To understand it, we need to first understand what exactly `attr_accessor` was doing behind the scenes. Take the following `attr_accessor` call for example:

    class SomeClass
      attr_accessor :foo, :bar
    end
This is equivalent to defining a reader and writer method for each specified attribute:

    class SomeClass
      def foo
        @foo
      end

      def foo=(value)
        @foo = value
      end

      # ...and the same pair of methods for bar
    end
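This equivalence is easy to verify by asking a class which methods it defines. The sketch below uses an anonymous class (the variable names are arbitrary) so that it does not clash with any existing definition.

```ruby
sample_class = Class.new do
  attr_accessor :foo, :bar
end

# instance_methods(false) lists only the methods defined by this class itself.
p sample_class.instance_methods(false).sort

sample = sample_class.new
sample.foo = 1 # writer generated by attr_accessor
p sample.foo   # reader generated by attr_accessor
```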
So when we wrote `attr_accessor *attrs` in the original code, Ruby defined the attribute reader and writer methods for us for every attribute in `attrs`. That is, we got the following standard accessor methods: `app_id`, `app_id=`, `title`, `title=` and so on. In our new version, we want to keep the standard writer methods so that assignments like this still work properly:

    MyApp.config.app_id = "not_my_app"
    => "not_my_app"
We can keep auto-generating the writer methods by calling `attr_writer *attrs`. However, we can no longer use the standard reader methods, as they also have to be capable of writing the attribute to support this new syntax:

    MyApp.configure do
      app_id "my_app" # assigns a new value
      app_id          # reads the stored value
    end
To generate the reader methods ourselves, we loop over the `attrs` array and define a method for each attribute that returns the current value of the matching instance variable if no new value is provided and writes the new value if it is specified:

    not_provided = Object.new

    attrs.each do |attr|
      define_method attr do |value = not_provided|
        if value === not_provided
          instance_variable_get("@#{attr}")
        else
          instance_variable_set("@#{attr}", value)
        end
      end
    end
Here we use Ruby’s `instance_variable_get` method to read an instance variable with an arbitrary name, and `instance_variable_set` to assign a new value to it. Unfortunately the variable name must be prefixed with an “@” sign in both cases, hence the string interpolation.

You might be wondering why we have to use a blank object as the default value for “not provided” and why we can’t simply use `nil` for that purpose. The reason is simple: `nil` is a valid value that someone might want to set for a configuration attribute. If we tested for `nil`, we would not be able to tell these two scenarios apart:

    MyApp.configure do
      app_id nil # expectation: assigns nil
      app_id     # expectation: returns the current value
    end
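The two calls above can be told apart with a sentinel default, which is what the module does. The sketch below is a slightly stricter variation, not the article's exact code: it compares by identity with `equal?` instead of `===`. The reason is that `value === not_provided` dispatches to the `===` method of the value being passed in, and for a class such as `Object` that method tests instance membership, so passing `Object` as a value would be mistaken for a read. The `holder_class` name is made up for the example.

```ruby
not_provided = Object.new

holder_class = Class.new do
  define_method :app_id do |value = not_provided|
    # equal? compares object identity, so only the sentinel itself matches.
    if not_provided.equal?(value)
      @app_id
    else
      @app_id = value
    end
  end
end

holder = holder_class.new
holder.app_id "my_app"
holder.app_id nil # an explicit nil is treated as a write
puts holder.app_id.inspect

holder.app_id Object # a class is stored like any other value
puts holder.app_id.inspect
```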
That blank object stored in `not_provided` is only ever going to be equal to itself, so this way we can be certain that nobody is going to pass it into our method and cause an unintended read instead of a write.

Adding Support for References

There is one more feature that we could add to make our module even more versatile, namely the ability to reference a configuration attribute from another one:

    MyApp.configure do
      app_id "my_app"
      title "My App"
      cookie_name { "#{app_id}_session" }
    end

    MyApp.config.cookie_name
    => "my_app_session"
Here we added a reference from `cookie_name` to the `app_id` attribute. Note that the expression containing the reference is passed in as a block. This is necessary in order to support the delayed evaluation of the attribute value. The idea is to only evaluate the block later when the attribute is read and not when it is defined; otherwise funny things would happen if we defined the attributes in the “wrong” order:

    SomeClass.configure do
      foo "#{bar}_baz" # evaluated immediately, while bar is still nil
      bar "hello"
    end

    SomeClass.config.foo
    => "_baz"
If the expression is wrapped in a block, that will prevent it from being evaluated right away. Instead, we can save the block to be executed later when the attribute value is retrieved:

    SomeClass.configure do
      foo { "#{bar}_baz" } # stored as a block, evaluated on read
      bar "hello"
    end

    SomeClass.config.foo
    => "hello_baz"
We do not have to make big changes to the `Configurable` module to add support for delayed evaluation using blocks. In fact, we only have to change the attribute method definition:

    define_method attr do |value = not_provided, &block|
      if value === not_provided && block.nil?
        result = instance_variable_get("@#{attr}")
        result.is_a?(Proc) ? instance_eval(&result) : result
      else
        instance_variable_set("@#{attr}", block || value)
      end
    end
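One consequence of this definition is that a stored block is evaluated again on every read, so a reference always reflects the current value of the attribute it points to. The sketch below isolates the accessor in an anonymous class to show this; the attribute names are taken from the running example, and the surrounding setup is an assumption made only to keep the snippet self-contained.

```ruby
not_provided = Object.new

config_class = Class.new do
  [:app_id, :cookie_name].each do |attr|
    define_method attr do |value = not_provided, &block|
      if value === not_provided && block.nil?
        result = instance_variable_get("@#{attr}")
        # Stored blocks are evaluated at read time, in the config's context.
        result.is_a?(Proc) ? instance_eval(&result) : result
      else
        instance_variable_set("@#{attr}", block || value)
      end
    end
  end
end

config = config_class.new
config.cookie_name { "#{app_id}_session" } # defined before app_id is set
config.app_id "my_app"
first_read = config.cookie_name

config.app_id "other_app"
second_read = config.cookie_name

puts first_read
puts second_read
```

Note also that, with this approach, a `Proc` passed as a plain value is indistinguishable from a block and will be evaluated on read as well.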
When setting an attribute, the `block || value` expression saves the block if one was passed in, or otherwise it saves the value. Then, when the attribute is later read, we check if it is a block and evaluate it using `instance_eval` if it is, or if it is not a block, we return it like we did before.

Supporting references comes with its own caveats and edge cases, of course. For example, you can probably figure out what happens if you read any of the attributes in this configuration:

    SomeClass.configure do
      foo { bar }
      bar { foo }
    end
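With the accessor shown earlier, reading either attribute makes `foo` evaluate `bar`, which evaluates `foo` again, and so on until Ruby raises a `SystemStackError`. One possible guard, sketched here as an assumption and not as part of the article's module, is to track which attributes are currently being resolved and raise a descriptive error when one of them is requested again. The `GuardedConfig` class, its `set` and `get` methods, and `CircularReferenceError` are all hypothetical names.

```ruby
class CircularReferenceError < StandardError; end

class GuardedConfig
  def initialize
    @values = {}
    @resolving = [] # names of attributes currently being evaluated
  end

  # Store either a plain value or a block to be evaluated lazily.
  def set(name, value = nil, &block)
    @values[name] = block || value
  end

  def get(name)
    if @resolving.include?(name)
      chain = (@resolving + [name]).join(" -> ")
      raise CircularReferenceError, "circular reference: #{chain}"
    end

    stored = @values[name]
    return stored unless stored.is_a?(Proc)

    @resolving.push(name)
    begin
      instance_eval(&stored)
    ensure
      @resolving.pop # always unwind, even when an error is raised
    end
  end
end

config = GuardedConfig.new
config.set(:foo) { get(:bar) }
config.set(:bar) { get(:foo) }

begin
  config.get(:foo)
rescue CircularReferenceError => e
  puts e.message
end
```

The explicit `set`/`get` interface keeps the sketch short; the same bookkeeping could be folded into the generated attribute methods.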
The Finished Module
In the end, we have got ourselves a pretty neat module for making an arbitrary class configurable and then specifying those configuration values using a clean and simple DSL that also lets us reference one configuration attribute from another:

    class MyApp
      include Configurable.with(:app_id, :title, :cookie_name)
    end

    MyApp.configure do
      app_id "my_app"
      title "My App"
      cookie_name { "#{app_id}_session" }
    end
Here is the final version of the module that implements our DSL, a total of 36 lines of code:

    module Configurable
      def self.with(*attrs)
        not_provided = Object.new

        config_class = Class.new do
          attrs.each do |attr|
            define_method attr do |value = not_provided, &block|
              if value === not_provided && block.nil?
                result = instance_variable_get("@#{attr}")
                result.is_a?(Proc) ? instance_eval(&result) : result
              else
                instance_variable_set("@#{attr}", block || value)
              end
            end
          end

          attr_writer *attrs
        end

        class_methods = Module.new do
          define_method :config do
            @config ||= config_class.new
          end

          def configure(&block)
            config.instance_eval(&block)
          end
        end

        Module.new do
          singleton_class.send :define_method, :included do |host_class|
            host_class.extend class_methods
          end
        end
      end
    end
Looking at all this Ruby magic in a piece of code that is nearly unreadable and therefore very hard to maintain, you might wonder if all this effort was worth it just to make our domain specific language a little bit nicer. The short answer is that it depends—which brings us to the final topic of this article.
Ruby DSLs—When to Use and When Not to Use Them
You have probably noticed while reading the implementation steps of our DSL that, as we made the external facing syntax of the language cleaner and easier to use, we had to use an ever increasing number of metaprogramming tricks under the hood to make it happen. This resulted in an implementation that will be incredibly hard to understand and modify in the future. Like so many other things in software development, this is also a tradeoff that must be carefully examined.

For a domain specific language to be worth its implementation and maintenance cost, it must bring an even greater sum of benefits to the table. This is usually achieved by making the language reusable in as many different scenarios as possible, thereby amortizing the total cost between many different use cases. Frameworks and libraries are more likely to contain their own DSLs exactly because they are used by lots of developers, each of whom can enjoy the productivity benefits of those embedded languages.
So, as a general principle, only build DSLs if you, other developers, or the end users of your application will be getting a lot of use out of them. If you do create a DSL, make sure to include a comprehensive test suite with it, as well as properly document its syntax as it can be very hard to figure out from the implementation alone. Future you and your fellow developers will thank you for it.
Visit & subscribe toptal.com for such insightful articles. It's a #1 blog for Engineers.