The Road To Mastery

From Ruby intermediate to Master...

Intoduction to Dynamic Programming

| Comments

As a big fan of programming puzzles i realized that the harder the problems get, the less equipped you are to solve them if you don’t use dynamic programming techniques. I know it sounds very tounting and fency but Dynamic programming is not. It is not as elegant as Recusion and require most of the time additional data structure to solve intermediate solutions.

Wikipedia has a better definition than me and it is the folowing: dynamic programming is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions - ideally, using a memory-based data structure.

This is very iteresting. As software developer, don’t we always try to break down all problems into a collection of simpler problems? The answer is yes. Do we always try to store their solutions in a data structure? No so sure.

In this post we will try to solve a very common problem (fibonacci sequence) in two ways and comment on how the methods of dynamic programming are more efficient.

The Recusive way

This is the recursive way to solve it:

1
2
3
4
5
def fib n
    return 0 if n == 0
    return 1 if (n == 1 || n == 2)
    return fib(n-1)+fib(n-2)
end

let see how fib(4) is calculated:

1
2
3
4
5
6
7
8
fib(5) =          fib(4)            +     fib(3)
                    |                        |
            fib(3)  +  fib(2)          fib(2) + fib(1)
              |          |               |        |
              |          1               1        1
            fib(2) + fib(1)
              |       |
              1       1

It seems a little bit confusing at first (That’s recursion in general), but what i want you to notice is that we are solving the same problems over and over again. For example fib(2) is solved twice. You can imagine how trying to calculate the fibonacci sequence of a big number could result in a large amount of time. It takes 2-3 min to find the 50th term!

The Dynamic programming way

Since the topic is dynamic programming and it is all breaking down the complex problems to simpler ones and storing their solution, which is by the way called memoisation, Let’s try to do so. in our example the intermediate solution could be fib(3) so let’s store it in an array so we don’t have to re caluculate it. And let’s do this for all the other number. Here is what the solution look like.

1
2
3
4
5
6
7
8
9
def dynamic_fib n
  fibonacci = Array.new
  fibonacci[0] = 0
  fibonacci[1] = fibonacci[2] = 1
  (3).upto(n) do |index|
    fibonacci[index] = fibonacci[index-1]+fibonacci[index-2]
  end
  fibonacci[n]
end

This solution looks much more efficient, it will find the fibonacci sequenece with n complexity. Let’s try to benchmark it though:

1
2
3
4
5
6
7
8
9
10
11
12
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report 'recursive fib 30' do
    fib 30
  end

  x.report 'dynamic fib 30' do
    dynamic_fib 30
  end
  x.compare! # Print the comparison
end
1
2
3
4
5
6
7
8
9
10
Calculating -------------------------------------
    recursive fib 30     1.234k i/100ms
      dynamic fib 30    40.252k i/100ms
-------------------------------------------------
    recursive fib 30     12.358k (± 3.8%) i/s -     62.934k
      dynamic fib 30    489.616k (±10.1%) i/s -      2.455M

Comparison:
      dynamic fib 30:   489616.3 i/s
    recursive fib 30:    12357.8 i/s - 39.62x slower

Conclusion

Once you see how ti would, it would seem very natural to implement these methods to solve certain class of problems. Whenever you see recusion you might have a chance to optimize your code using Dynamic programming.

My Experience From Railschallenge.com

| Comments

Rails challenge

A couple of months ago I stumbled on a reddit post talking about a Rails challenge. I accepted the challenge and finished it with no error. In this blog i would try to share what i learned form that experience.

Rubocop

It was my first time using Rubocop and it was life changing. For those who don’t know Rubocop, It is a static code analyser that would use the Ruby community outline and give you the list of offenses you have in your code. An example would be use && instead of and (more information here).

Serializers

I started by parsing my own JSON data structure and then i realized that it is getting too complicated to i used ActiveModel::Serializer. Serializers describe which attributes and relationships should be serialized. Adapters describe how attributes and relationships should be serialized.

One thing i struggled with was raising an expection when unpermitted parameters were used. I wasted tremendous amount of implementing that fuctionality instead of using the config.action_controller.action_on_unpermitted_parameters = :raise

The Dispatcher algorithm

This was supposed to be the most complicated part of the challenge. After refining my algorithm a couple of times i realized that i could pass the test using this basic logic. I would order the capacacity of the responder by increasing order i would just assign available responder until in don’t have anymore or until i responded fully to the emergency need. It would look like this.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
def dispatch
    RESPONDER_TYPE.each do |key,value|
      # find a responder on duty and available that can handle the emergency on his own
      responder = Responder.by_type(value).on_duty.available.where("capacity = ?", self.send(key))
      # see if we ahve an exact match
      if responder.count == 1
        self.responders << responder
      else
        # if there is no exact match add resources until it is enough for the emergency severity
        add_resource(value,self.send(key))
      end
    end
    self.fully_responded = true if self.enough_resources?
  end

  def add_resource(type,severity)
    responders_by_type = Responder.by_type(type).on_duty.available.order(capacity: :desc)
    responder_severity_sum = 0
    responder_list = []
    responders_by_type.each do |responder|
      if (responder_severity_sum < severity)
        responder_severity_sum += responder.capacity
        self.responders << responder
      end
    end
  end

Conclusion

I really enjoyed working on this challenge and here is my solution. My only regret was not doing it before the deadline because they were going through your solution and grade it. I heard they will have other challenges soon, so stay tuned!

Find Who Is Accessing My Google Drive Files

| Comments

I Recently had to share some documents from my Google Drive with a team of developers outside the office. Most of these documents were API docs. After a couple of months and when those developers didn’t need the API docs anymore, it was almost impossible for me to find what i shared with them, it was like looking for a needle in a haystack. I bet i am not the only one that had this problem before!!!

So how would i do that manually. I would go to each file and click on share button. see the list of the permission and remove the users i don’t want to. It means that i would need to go through 200 files!!! I am too lazy for that!!

https://github.com/gimite/google-drive-ruby

Since i like Ruby programming i thought about using the Google Drive API to accomplish this task. I looked at this gem and i was immediately discouraged by it’s complexity. It has to be a better solution!!!

After looking at Google Script overview page i realized that this is going to be fairly easy to solve an amazing complex problem. Google Script use a language based on Javascript and this API doc.

This code is largely inspired from Amit Agarwal code.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
function Start() {


  var files = DriveApp.getFiles();
  var timezone = Session.getScriptTimeZone();
  var email = Session.getActiveUser().getEmail();

  var MAXFILES = 300; count=1;
  var file, date, access, url, permission;
  var privacy, view, viewers, edit, editors;

  var rows = [["File Name", "Who has access?", "Date Created"]];

  for (var i=0; i<MAXFILES && files.hasNext(); i++ ) {

    file = files.next();

    try {

      access     = file.getSharingAccess();
      permission = file.getSharingPermission();
      viewers    = file.getViewers();
      editors    = file.getEditors();

      view = [];
      edit = [];

      date =  Utilities.formatDate(file.getDateCreated(), timezone, "yyyy-MM-dd HH:mm")
      url = count++ + '. <a href="' + file.getUrl() + '">' + file.getName() + '</a>';

      for (var v=0; v<viewers.length; v++) {
        view.push(viewers[v].getName() + " " + viewers[v].getEmail());
      }

      for (var ed=0; ed<editors.length; ed++) {
        edit.push(editors[ed].getName() + " " + editors[ed].getEmail());
      }

      switch(access) {
        case DriveApp.Access.PRIVATE:
          privacy = "Private";
          break;
        case DriveApp.Access.ANYONE:
          privacy = "Anyone";
          break;
        case DriveApp.Access.ANYONE_WITH_LINK:
          privacy = "Anyone with a link";
          break;
        case DriveApp.Access.DOMAIN:
          privacy = "Anyone inside domain";
          break;
        case DriveApp.Access.DOMAIN_WITH_LINK:
          privacy = "Anyone inside domain who has the link";
          break;
        default:
          privacy = "Unknown";
      }

      switch(permission) {
        case DriveApp.Permission.COMMENT:
          permission = "can comment";
          break;
        case DriveApp.Permission.VIEW:
          permission = "can view";
          break;
        case DriveApp.Permission.EDIT:
          permission = "can edit";
          break;
        default:
          permission = "";
      }

      view = view.join(", ");

      edit = edit.join(", ");

      privacy += (permission === "" ? "" : " " + permission)
               + (edit === "" ? "" : "<br><small>" + edit + " can edit</small>")
               + (view === "" ? "" : "<br><small>" + view + " can view</small>")

      rows.push([url, privacy, date]);

    } catch (e) {Logger.log(e.toString()); Logger.log(file.getName());};

  }

  var html = "<p style='text-align:center'><strong><a style='font-size:160%;text-decoration:none;color:#49B3F5;' File Permissions Report for Google Drive</a></strong></p>";


  html += "<table border='1' cellpadding='5' cellspacing='0'><tr><td><b>" + rows[0].join("</b></td><td><b>") + "</b></td></tr>";

  for (var i=1; i<rows.length; i++) {
    html += "<tr><td>" + rows[i].join("</td><td>") + "</td></tr>";
  }

  html += "</table>";


  GmailApp.sendEmail(email, "Google Drive - File Permissions Report", "", {htmlBody: html});

}

Upgrade to the New Universal Analytics Syntax in SQL

| Comments

Upgrade to the Universal Analytics Syntax in SQL

We are all struggling with upgrading our analytics tools from the old ga.js to the new universal.js plateform. Most of it is done automatically and would require just changing that ga.js file. But would would happen to event tracking? all that _gaq.push(['_trackEvent', 'button', 'event']) that should be turned into something like this ga('send', 'event', 'button');

In this post we are going to update our syntax that is present in a code stored in Postgres. This is very common, just imagine you have a table post with a field body and you want to track some click on a specific post. so you would have something like a href="#" onClick="_gaq.push(['_trackEvent', 'Videos', 'Play', 'Baby\'s First Birthday']);">Play</a> . We need to figure out ways to track all the instances of this syntax and then replace it with the new syntax. We don’t want to do it by hand if there are more than 2 :), that would take forever!! so let’s figure out an automated way

Doing this with a Programming Language would take a long time, the best way seems to be SQL. We need to find all the posts where the tracking event is used:

1
2
 select id,regexp_matches(body,'_gaq.push\(\[\s*?''\s*?_trackEvent\s*?''\s*?,(.+?)\]\)', 'g') as match
  from  posts ;

Good, now let’s make sure that this would capture multiple instances in a single post, so let’s create a query that would display every post and how many matches it has:

1
SELECT id,count(*) FROM posts, regexp_matches(body,'_gaq.push\(\[\s*?''\s*?_trackEvent\s*?''\s*?,(.+?)\]\)', 'g') GROUP BY id HAVING count(*) > 0;

it would look like this:

        id   | count
   19911 |     1
   20894 |     6
   19717 |     8
   19178 |     1
   ..... | .....

Now that our regex capture all the instances of the old syntax, it is time to change it. Let’s update our table with the new syntax:

1
2
3
4
5
begin;
  UPDATE posts SET DATA = regexp_replace(body,'_gaq.push\(\[\s*?''\s*?_trackEvent\s*?''\s*?,(.+?)\]\)', 'ga(''send'',''event'',\1)','gi' )
        WHERE id IN ( SELECT id FROM posts
                           WHERE DATA::text LIKE '%_gaq.push%' );
commit;

This looks scary but it is not!! I am capturing the parameter for _gaq.push and will create a new function call ga(...) using the \1 .

And you should be ready to go now. You can see the event tracking in real time, that is the best way to test it!

Coalesce

| Comments

If you have never heard about the coalesce function, this will be very helpful. Basically this function exist in most modern DBMS and the definition reads, The COALESCE function in SQL returns the first non-NULL expression among its arguments.

let’s see a simple example in postgres:

1
2
3
4
5
6
7
8
9
10
create table employees
(
        name text,
        hourly_wage int,
        salary int
);

insert into employees values 
('bob',null,40000),
('sam',40,null);

There are 2 ways to determine the annual salary, the first one is by displaying the the value of the salary field and the second one is by calculating that salary from the hourly wage. The problem is that sometimes on of the value is null so we would need a conditional, this is where the coalesce function enter into play.

Let’s write a query that will display the annual salary regardless of the null values.

select coalesce(hourly_wage*40*52 ,salary) as "total salary" from employees;

1
2
3
4
5
 total salary 
--------------
40000
83200
(2 rows)

Perfect, we are having the result we want without using a case statement. Now, I am wondering what would happen if both of the values where null, let’s try it out:

1
2
3
4
5
6
7
8
9
insert into employees values ('jim',null,null);
select coalesce(hourly_wage*40*52 ,salary) as  "total salary" from employees;

total salary
--------------
        40000
        83200
             
(3 rows)

Ok, So if both of the parameters are null then it will return null. However there is a way around it and it is adding another parameter at the end of the parameter list that will be the default , something like this coalesce(ex1,ex2,default_value). Here is an example with the previous case:

1
2
3
4
5
6
7
select coalesce(hourly_wage*40*52 ,salary,0) as  "total salary" from employee;
 total salary 
--------------
        40000
        83200
            0
(3 rows)

Fun With Next Train

| Comments

Introduction

enter image description here

I have been playing with creating my own API wrapper lately and i didn’t find a good use to it. Fortunately i found inspiration when i saw my colleague building a widget for dashing.

You can see in the image above that using the this the ruby_wmata gem i can see when are the next train coming. but i can also tell when if there are some trains boarding. It get’s more interesting if i go to the WMATA api documentation and see what kind of information i can retrieve about each train:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
{
"Trains":[
                {
                "Car":"6",
                "Destination":"SilvrSpg",
                "DestinationCode":"B08",
                "DestinationName":"Silver Spring",
                "Group":"1",
                "Line":"RD",
                "LocationCode":"A01",
                "LocationName":"Metro Center",
                "Min":"3"
                },
                {
...

mmm, i can get how many cars the train has and more importantly it’s direction. What if i try to track the trains movement on a specific line, would i be able to determine where the trains are in real time? This is the problem i would try to resolve in this post.

Step 1 : Get the stations code

Given the information provided earlier, we need to find out where all the trains are in a specific line. In order to achieve that we need to make the GetPrediction on each station, The only problem is that we need a list of station codes to do that. Fortunately, we can use the jPath that will return all the stations between a FromStationCode and a ToStationCode. So let’s try to find all these stations code first.

1
2
3
4
5
require 'ruby_wmata'
WMATA.api = 'kfgpmgvfgacx98de9q3xazww'
#pick just 9 stations for the demonstration
path = WMATA.train_path("A01","A15")
path_by_code = @path.map{|station| station["StationCode"]}

Step 2 : Find where the trains are

Now that we have a list of train station codes, we can iterate through it then and make the next train API call. We will retrieve the list of all the arriving trains and sometime we will see that some trains are “BRD”, it would mean that the location of that train is the stations itself. So whenever we see that let’s mark the stations with a ’T' to indicate the presence of a train and ‘—’ the absence of a trains . So i want something that looks like:

1
2
3
## the first like would represent the 3 first letters of a station's name
2015-02-28 18:04:13 -0500:Met-Far-Dup-Woo-Cle-Van-Ten-Fri-Bet
2015-02-28 18:04:14 -0500:-T---------------------------------

Let’s code it:

1
2
3
4
5
6
7
8
9
10
11
12
def find_trains_in_path(path)
  path_with_trains = []
  path.each_with_index{|station,index|
    begin
      next_trains = WMATA.next_trains(station)
      train_boarding?(next_trains) ? path_with_trains[index] = "-T-" : path_with_trains[index] = "---"
    rescue
      path_with_trains[index] = "---"
    end
   }
   return path_with_trains
end

Now that we are extracting information about each stations, let’s implement the function train_boarding that will tell us if there is a train boarding on that station

1
2
3
def train_boarding?(next_trains)
  next_trains.select{|train| train['Min'] == 'BRD'}.empty?
end

Step 3 : Update periodically the display to show trains position

Now that we have all the information we need on a specific time. let’s try to update the display every 10s. Also we don’t want to show the same information multiple time, so we won’t refresh the display if the train are in the same positions.
We can also add some colors to the display to make it more user friendly using the colorize gem. (we all know that User friendly in a terminal is not possible!! )

1
2
3
4
5
6
7
8
9
10
11
12
puts "#{Time.now}:".blue + "#{@path.map{|station| station["StationName"][0,3]}.join('-')}".red
last_display = []
while (true) do
  display = find_trains_in_path(@path_by_code).join("-")
  unless display == last_display
    print "#{Time.now}:".blue
    print "#{display}"
    puts ""
  end
  last_display = display
  sleep 10
end

And here is the final result:

enter image description here

Obviously, this implementation is not perfect. We can argue a couple of things - We don’t really see the direction of the trains - What if there are 2 trains going to the opposite directions boarding, we are not showing that information - We could replace the last display by the current one but you would not see the movement of the trains - We don’t show the train when it is between two stations.

Conclusion

We came a long way form knowing the next train in a train station to display a whole line with the trains moving in realtime. This method could be implement for any other public transportation with a decent API . The most obvious one would be the next bus. We can also extend this project to make have a full map with more than one line and see all the train moving in realtime using Javascript and HTML/CSS.Who knows maybe my next project!

You can find the source code for this project here

ActiveRecord Dirty Change

| Comments

Today, After chatting with my colleague and try to fix some ActiveRecord Store bug, he showed me some very neat ActiveRecord:Dirty method called changed.

I went to Rails documentation and found out the module that allows all this magic. It will tell you if anything has changed on the object.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
2.0.0-p598 :109 > c = Content.first
  Content Load (0.8ms)  SELECT  "content".* FROM "content"   ORDER BY "content"."id" ASC LIMIT 1
 => #<Content id: 1, content_type_id: 2, name: "culture page", url_name: "culture_page", options: {}, creator_id: 123, created_at: "2014-04-21 17:02:23", updated_at: "2014-05-30 18:44:41", published_at: "2014-04-21 17:02:23", removed_at: "2014-05-30 18:44:41">
2.0.0-p598 :110 > c.changed?
 => false
2.0.0-p598 :111 > c.name = "hi"
 => "hi"
2.0.0-p598 :112 > c.changed?
 => true
2.0.0-p598 :113 > c.changed
 => ["name"]
2.0.0-p598 :115 > c.name_change
 => ["culture page", "hi"]
 2.0.0-p598 :118 > c.changes
 => {"name"=>["culture page", "hi"]}
 2.0.0-p598 :114 > c.name_was
 => "culture page"

If you really to avoid saving and making useless DB call. you can always use this module.

Rspec Internals (Part 1)

| Comments

If you wonder how Rspec is built and is curious about the internals of such a tool then this article is for you. After watching an episode from the excellent Gary Bernhard’s Destroy all software series ,i decided to blog about it to make sure i understand how it works.

We will use Minitest TestCase to test the most basic Rspec syntax. As usual let’s write two tests, one that pass and another one that fails. And we will write the code that defines the Rspec features in spec.rb. but lets start with just the tests:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
require 'minitest/autorun'
require_relative 'spec'

class TesDecribe < MiniTest::Unit::TestCase
  def test_that_it_can_pass
    describe 'some description' do
      it 'has some property' do

      end
    end
  end

  def test_that_it_can_fail
    assert_raises(IndexError) do
      describe 'some failing test' do
        it 'fails' do
          raise IndexError
        end
      end
    end
  end
end

Running this will throw a bunch of errors :

1
2
3
4
5
6
7
Finished in 0.001565s, 1277.9553 runs/s, 638.9776 assertions/s.

  1) Failure:
TestDescribe#test_that_it_can_fail [test.rb:13]:
IndexError expected but nothing was raised.

2 runs, 1 assertions, 1 failures, 0 errors, 0 skips

Interesting, It seems like the describe method is defined somehow but the block inside is not executed. It is time to start working on that spec.rb file

1
2
3
4
#lets define describe method that will pick a description and a block and let's call that block
def describe description,&block
  block.call
end

let’s run our test again

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
Finished in 0.001240s, 1612.9032 runs/s, 806.4516 assertions/s.

  1) Error:
TestDescribe#test_describe_method:
NoMethodError: undefined method `it' for #<TestDescribe:0x007fb8158a2558>
    test.rb:10:in `block in test_describe_method'
    test.rb:4:in `call'
    test.rb:4:in `describe'
    test.rb:9:in `test_describe_method'


  2) Failure:
TestDescribe#test_that_it_can_fail [test.rb:17]:
[IndexError] exception expected, not
Class: <NoMethodError>
Message: <"undefined method `it' for #<TestDescribe:0x007fb8158a1e50>">
---Backtrace---
test.rb:19:in `block (2 levels) in test_that_it_can_fail'
test.rb:4:in `call'
test.rb:4:in `describe'
test.rb:18:in `block in test_that_it_can_fail'
---------------

This is the same Exception, the block inside describe gets executed but there is an undefined method it inside. The it method is pretty much the same as describe, it will take a description and a block and will execute that block. Let’s update our spec.rb:

1
2
3
4
5
6
7
8
#lets define describe method that will pick a description and a block and let's call that block
def describe description,&block
 block.call
end

def it description,&block
 block.call
end

let’s run our tests again:

1
2
3
Finished in 0.000968s, 2066.1157 runs/s, 1033.0579 assertions/s.

2 runs, 1 assertions, 0 failures, 0 errors, 0 skips

Good this is working, i can use that describe do .... end syntax, but how about assertions, i want to be able to do something like 2.should == 2. As Usual let’s do the same and build 2 tests, a passing one and a failing one.

1
2
3
4
5
6
7
8
9
10
11
class TestAssertion  < MiniTest::Test
  def test_that_it_can_pass
    2.should == 2
  end

  def test_that_it_can_fail
    assert_raises AssertionError do
      1.should ==  2
    end
  end
end

Obviously running test will not pass:

1
2
3
4
5
6
7
8
9
10
11
12
Finished in 0.001312s, 3048.7805 runs/s, 762.1951 assertions/s.

  1) Error:
TestAssertion#test_that_it_can_fail:
NameError: uninitialized constant TestAssertion::AssertionError
    test.rb:17:in `test_that_it_can_fail'


  2) Error:
TestAssertion#test_that_it_can_pass:
NoMethodError: undefined method `should' for 2:Fixnum
    test.rb:13:in `test_that_it_can_pass'

I see two things , an undefined method should and an undefined Exception AssertionError. Let’s update spec.rb:

1
2
3
4
...
#this is easy fix, just define an exception
class AssertionErro < Exception
end

In order to implement the should we need to create an assertion for the object passed and it needs to work for all ruby objects so we will implement it for the Object class. You also need to remember that ‘==’ is just syntactic sugar offered by ruby and it is the equivalent of ‘equal?’. Basically, 2 == 2 is the same as 2.equal?(2) . this is very important because we will define a == method in our DelayedAssertion class. Enough talking, let’s take a look at the code in spec.rb:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class Object
  def should
    DelayedAssertion.new(self)
  end
end

#lets define DelayedAssertion class now

class DelayedAssertion
  def initialize(subject)
    @subject = subject
  end
  #here where the magic happens
  def ==(other)
    raise AssertionError unless @subject == other
  end
end

let’s run the test now:

1
2
3
4
5
6
7
# Running:

....

Finished in 0.001384s, 2890.1734 runs/s, 1445.0867 assertions/s.

4 runs, 2 assertions, 0 failures, 0 errors, 0 skips

Wow, we have been able to implement some major features of Rspec. In future post i will try to add support to should method such as something.should be_true or array.should have(4).things. You can also add your implementation as a comments.

Create Your First API Wrapper With TDD

| Comments

Whether at you job or personal projects you would need at some point have to build you own API wrapper in ruby. Today, i decided to build my own Wrapper that i will turn into a gem in a later blog post. I want to do it methodically according to Test Driven Development principles.

It is always a hassle to test API calls but there is a way on how it should be done and we will talk about it later on.

I picked the WMATA API because it is well written and well documented. As you can see in their developer website each call has a json response example. We will use these to test our code against it later.

Let’s talk about our goal. I want to be able to set my api key and then make an api call in this manner:

1
2
WMATA.api_key = "kfgpmgvfgacx98de9q3xazww"
WMATA.next_trains("A01") # A01 stands for a station code

Now, that i know what i want, let’s set up our TDD environment. The initial repository tree would look like this:

1
2
3
4
5
6
lib
|___wamta.rb
spec
|___spec_helper.rb
|___trains_helper.rb
Gemfile

Let’s take a look at the Gemfile and see what are the tools we are going to use:

1
2
3
4
5
source 'https://rubygems.org'
gem 'httparty' #i use httparty to make http queries 
gem 'pry' #i use binding.pry to open a IRB console during runtime
gem 'webmock' , group: :test
gem 'rspec' , group: :test # this is my favorite testing framework

Now that we have all the necessary gems do a bundle install. Let’s take a look at the spec/spec_helper.rb:

1
2
3
4
5
6
7
8
9
10
$:.unshift 'lib'
require 'wmata'
require 'webmock/rspec'
WebMock.disable_net_connect!(allow_localhost: true)
RSpec.configure do |config|
    config.before(:each) do
      stub_request(:get, %r|https://api.wmata.com/StationPrediction.svc/json/GetPrediction/|).
        to_return(status: 200, body: File.open("./spec/fixtures/next_trains.json"){|f| f.read}, headers: {})
    end
end

This is where it get’s tricky. If you rely on the API, your tests are not really tests, because they depend on the status of the API. If you have no internet, those tests will not work. Your tests should only test your own code, not the response of the API. We are using webmock to return to each call made to https://api.wmata.com/StationPrediction.svc/json/GetPrediction/ the json response in next_trains.json. This response is the expected result provided by the documentation here

So let’s add the file next_trains.json in spec/fixtures/next_trains.json

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
{
"Trains":[
{
"Car":"6",
"Destination":"SilvrSpg",
"DestinationCode":"B08",
"DestinationName":"Silver Spring",
"Group":"1",
"Line":"RD",
"LocationCode":"A01",
"LocationName":"Metro Center",
"Min":"3"
},
{
"Car":"6",
"Destination":"Grsvnor",
"DestinationCode":"A11",
"DestinationName":"Grosvenor-Strathmore",
"Group":"2",
"Line":"RD",
"LocationCode":"A01",
"LocationName":"Metro Center",
"Min":"4"
}]}

We have everything setup now let’s write our first test in spec/train_spec.rb.

1
2
3
4
5
6
7
8
9
10
11
12
require 'spec_helper'
describe WMATA do
  before(:all) { WMATA.api = 'xxxxx'  }
     it 'should predict next trains' do
       expect(WMATA).to respond_to :next_trains
       next_trains = WMATA.next_trains
       expect(next_trains).not_to be_empty
       ["Car","Min","DestinationName"].each{ |field|
        expect(next_trains.first[field]).not_to be_nil
      }
   end
end

Notice that the api key does not have to be real because the call to the live api will not be made. It will be mocked and will return the expected json response.

Let’s create the module that will handle the API calls in lib/wamta.rb

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
$:.unshift(File.dirname(__FILE__))
require 'client'
module WMATA
  def self.client
    @client ||= Client.new(api)
  end

  def self.api
    @api || raise("please set the api key")
  end

  def self.api=(api_key)
    @client = Client.new(api_key)
    @api = api_key
  end

  def self.next_trains(station = "A06")
    client.next_trains(station)
  end
end

Good, now we have a module that will create a client object when an api_key is passed and if we want to call next_trains without setting up the api key it will throw an exception “please set the api key”. Now, let’s create the Client class in lib/client.rb that will actually handle the API calls.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
require 'httparty'
module WMATA

  class Client
    include HTTParty
    base_uri "https://api.wmata.com"
    def initialize(api_key)
        @options = {query: {api_key: api_key} }
    end

    def next_trains(stations='A06')
      response = self.class.get("/StationPrediction.svc/json/GetPrediction/#{stations}", @options)
      JSON.parse(response.body)["Trains"]
    end
  end

end

When you run the tests now, everything should be green. This part makes sure that our code handles correctly the expected response, now let’s make a live API call :

1
2
WMATA.api_key = "kfgpmgvfgacx98de9q3xazww"
WMATA.next_trains("A01") # A01 stands for a station code

It should return an array of all the next trains for the specified station. Now that it is working we can do the same for all the other api calls

alt

HTTParty Debugging

| Comments

Whenever you start working on API call you should use HTTParty. In my opinion it is the best tool to make a API wrapper in a very modular way.I have been playing with it lately, and i got this weird error

1
2
/Users/***/.rvm/rubies/ruby-2.1.5/lib/ruby/2.1.0/uri/generic.rb:214:in `initialize':
the scheme http does not accept registry part: :80 (or bad hostname?) (URI::InvalidURIError)`

Well, i know that HTTParty rely on URI module, but i would like to have more details in order to debug it. How about getting more verbose for the http requests i am making. The solution is easy, HTTParty provides a simple tool called debug_output. You can display the debug in stderr and it will show you only the errors or jsut be more verbose and use stdout, it woudl be something like:

1
2
include HTTParty
debug_output $stdout

let see the output:

1
2
3
4
5
6
7
8
9
opening connection to api.themoviedb.org:80...
opened
<- "GET /3movie/12?api_key=87465574ffc9 HTTP/1.1\r\nAccept: application/json\r\nContent-Type: application/json\r\nConnection: close\r\nHost: api.themoviedb.org\r\n\r\n"
-> "HTTP/1.1 404 Not Found\r\n"
-> "Content-Type: text/html\r\n"
-> "Date: Fri, 02 Jan 2015 20:20:30 GMT\r\n"
-> "Server: nginx\r\n"
-> "Content-Length: 162\r\n"
-> "Connection: Close\r\n"

Mmmm…i was making a /3movie/ call instead of movie/3, i know what my problem is now.