The Road To Mastery

From Ruby intermediate to Master...

'Pair With Sum' Problem

| Comments

Recently, I Have been researching common coding interview-problems and I’ve discovered that most of the major tech companies like Google or facebook have overlapping problem lists. One of these problems is called Pair With Sum and it states the following Find a pair of elements from an array whose sum equals a given number.

This request is a bit ambiguous - so let’s start out by listing all the assumptions that we are making before we begin working on a solution:

  • The array contains only integers
  • The array can have duplicates
  • Return a boolean that indicates if the list has such a pair
  • We will start with an ordered (ascending order) list example

Brute force

Lets code the most naive solution. Basically, try every pair possible and see if their sum is equal to the given number. It would look like this in Ruby

1
2
3
4
5
6
7
8
9
10
def has_pair_with_sum_brute_force(input, sum)
  0.upto(input.length - 1) do |n|
    0.upto(input.length - 1) do |i|
      if input[i] + input[n] == sum and n != i
        return true
      end
    end
  end
  return false
end

This solution would work whether the list is sorted or not. However, this is an inefficient way of solving the problem, indicated by the complexity of this algorithm at O(n2). This will get out of control pretty quickly for large arrays. We need a solution that has a linear complexity or a O(n). Let’s explore another solution where we assume that the array elemets are sorted.

Moving pointer

1
2
3
4
5
6
7
8
9
10
11
12
def has_pair_with_sum_in_ordered_list(input, sum)
  pointer_begining = 0
  pointer_end = input.length - 1
  while(pointer_begining < pointer_end)
    # puts pointer_begining
    # puts pointer_end
    return true if input[pointer_begining] + input[pointer_end] == sum
    pointer_end -= 1 if input[pointer_begining] + input[pointer_end] > sum
    pointer_begining += 1 if input[pointer_begining] + input[pointer_end] < sum
  end
  return false
end

This solution has the algorithm complexity of O(n) we are looking for, making it much more scalable and efficient. Let’s go over the solution briefly. We have one pointer at the begining of the array and one at the end. we sum the value pointed with the two pointer

  • If the sum is equal to the given number then we have found a solution
  • If the sum is greater than the given number then the sum is too big, so we move the last pointer to the value before it
  • If the sum is smaller than the given number then the sum is too small so we move the last pointer to the value after it

Saving the Complement

Imagine now that our list is not ordered, In this case, the previous method would not work. We need to come up with another algorithm with a linear complexity to retain our scalability. We note that in order for us to determine if a number has a pair, that the sum would equal to the given number, it needs to have its complement in the array. Let’s assume that the first element of the array is 2 and that the given number is 8- the complement of 2 to make 8 is 8-2=6. If we find 6 in the array, we know that it represents a valid pair.

Here’s an example:

array = [1,3,2], given_number = 4

Let’s go through this array and save the complements in some data structure. The ideal data structure in this case would be some kind of hash that would save only keys. The reason why we would choose this data structure is because it will ensure our collection constantly maintains unique elements (no redundancy) and it will do the element look up in a constant complexity O(1). In Ruby we can use the Set library.

Back to our array traversal. The first element is 1, its complement is (4-1=3), let’s add this complement to the complements set. The second element is 3, this number is a complement of a previous element, therefore we have a valid pair here. Let’s code the method that would do that:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
def has_pair_with_sum_in_unordered_list(input, sum)
  complements = Set.new
  0.upto(input.length - 1) do |i|
    value = input[i]
    complement = sum - value
    # if we have stored the value complement then we have a winner pair
    if complements.include?(value)
      return true
    else
      complements.add(complement)
    end
  end
  return false
end

benchmarking

Let’s prove that the best solution is not the brute force method by using the benchmark/ips gem.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$ ruby sum_of_pair.rb
Warming up --------------------------------------
         brute force    13.000  i/100ms
pointer algorithm for ordered list
                         1.022k i/100ms
find complement algorithm for unordered list
                       647.000  i/100ms
Calculating -------------------------------------
         brute force    140.826  (± 2.8%) i/s -    715.000  in   5.082566s
pointer algorithm for ordered list
                         10.314k (± 2.8%) i/s -     52.122k in   5.057393s
find complement algorithm for unordered list
                          7.045k (±25.5%) i/s -     31.703k in   5.064816s

Comparison:
pointer algorithm for ordered list:    10314.1 i/s
find complement algorithm for unordered list:     7045.2 i/s - 1.46x  slower
         brute force:      140.8 i/s - 73.24x  slower

The brute force solution is 73.24 times slower than the unordered list algorithm. The unordered list algorithm is 1.46 times slower that the ordered list algorithm (very insignificant difference).

Conclusion

As you can see, there are multiple ways to solve an these common interview questions. Each method would lead to a perfect linear complexity solution. Ensuring you have identified the right assumptions is essential in solving similar problems.

Intoduction to Dynamic Programming

| Comments

As a big fan of programming puzzles, I realized that the harder the problems get, the less equipped you are to solve them if you don’t use dynamic programming techniques. I know it sounds very taunting and fancy but Dynamic programming is not. It is not as elegant as Recursion and requires additional data structure to solve intermediate solutions most of the time.

Wikipedia has a better as follows: dynamic programming is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions - ideally, using a memory-based data structure.

This definition is very interesting. As a software developer, don’t we always try to break down all problems into a collection of simpler problems? The answer is yes. Do we always try to store their solutions in a data structure? Not so sure.

In this post we will try to solve a very common problem (ie Fibonacci Sequence) in two ways and demonstrate on how the methods of dynamic programming can be more efficient.

The Recusive way

This is the recursive way to solve it:

1
2
3
4
5
def fib n
    return 0 if n == 0
    return 1 if (n == 1 || n == 2)
    return fib(n-1)+fib(n-2)
end

let’s see how fib(4) is calculated:

1
2
3
4
5
6
7
8
fib(5) =          fib(4)            +     fib(3)
                    |                        |
            fib(3)  +  fib(2)          fib(2) + fib(1)
              |          |               |        |
              |          1               1        1
            fib(2) + fib(1)
              |       |
              1       1

It seems a bit confusing at first (that’s recursion in general), but what I want you to notice is that we are solving the same problems over and over again. For example, fib(2) is solved twice. You can imagine how trying to calculate the Fibonacci sequence of a big number could result in a large amount of time. It acutally takes 2-3 minutes to find the 50th term!

The Dynamic programming way

Since the topic is Dynamic Programming and it is all breaking down the complex problems to simpler ones and storing their solution, this is called Memoization by the way. Let’s try this. In our example the intermediate solution could be fib(3), so let’s store it in an array so we don’t have to recalculate it, And let’s do this for all the other numbers. Here is what the solution should look like:

1
2
3
4
5
6
7
8
9
def dynamic_fib n
  fibonacci = Array.new
  fibonacci[0] = 0
  fibonacci[1] = fibonacci[2] = 1
  (3).upto(n) do |index|
    fibonacci[index] = fibonacci[index-1]+fibonacci[index-2]
  end
  fibonacci[n]
end

This solution looks more efficient, it will find the Fibonacci Sequence with n complexity. Let’s benchmark it though:

1
2
3
4
5
6
7
8
9
10
11
12
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report 'recursive fib 30' do
    fib 30
  end

  x.report 'dynamic fib 30' do
    dynamic_fib 30
  end
  x.compare! # Print the comparison
end
1
2
3
4
5
6
7
8
9
10
Calculating -------------------------------------
    recursive fib 30     1.234k i/100ms
      dynamic fib 30    40.252k i/100ms
-------------------------------------------------
    recursive fib 30     12.358k (± 3.8%) i/s -     62.934k
      dynamic fib 30    489.616k (±10.1%) i/s -      2.455M

Comparison:
      dynamic fib 30:   489616.3 i/s
    recursive fib 30:    12357.8 i/s - 39.62x slower

Conclusion

Once you see how it works, it would seem very natural to implement these methods to solve a certain class of problems. Whenever you see recursion you might have a chance to optimize your code using Dynamic Programming.

My Experience From Railschallenge.com

| Comments

Rails challenge

A couple of months ago I stumbled on a reddit post talking about a Rails challenge. I accepted the challenge and finished it with no error. In this blog i would try to share what i learned form that experience.

Rubocop

It was my first time using Rubocop and it was life changing. For those who don’t know Rubocop, It is a static code analyser that would use the Ruby community outline and give you the list of offenses you have in your code. An example would be use && instead of and (more information here).

Serializers

I started by parsing my own JSON data structure and then i realized that it is getting too complicated to i used ActiveModel::Serializer. Serializers describe which attributes and relationships should be serialized. Adapters describe how attributes and relationships should be serialized.

One thing i struggled with was raising an expection when unpermitted parameters were used. I wasted tremendous amount of implementing that fuctionality instead of using the config.action_controller.action_on_unpermitted_parameters = :raise

The Dispatcher algorithm

This was supposed to be the most complicated part of the challenge. After refining my algorithm a couple of times i realized that i could pass the test using this basic logic. I would order the capacacity of the responder by increasing order i would just assign available responder until in don’t have anymore or until i responded fully to the emergency need. It would look like this.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
def dispatch
    RESPONDER_TYPE.each do |key,value|
      # find a responder on duty and available that can handle the emergency on his own
      responder = Responder.by_type(value).on_duty.available.where("capacity = ?", self.send(key))
      # see if we ahve an exact match
      if responder.count == 1
        self.responders << responder
      else
        # if there is no exact match add resources until it is enough for the emergency severity
        add_resource(value,self.send(key))
      end
    end
    self.fully_responded = true if self.enough_resources?
  end

  def add_resource(type,severity)
    responders_by_type = Responder.by_type(type).on_duty.available.order(capacity: :desc)
    responder_severity_sum = 0
    responder_list = []
    responders_by_type.each do |responder|
      if (responder_severity_sum < severity)
        responder_severity_sum += responder.capacity
        self.responders << responder
      end
    end
  end

Conclusion

I really enjoyed working on this challenge and here is my solution. My only regret was not doing it before the deadline because they were going through your solution and grade it. I heard they will have other challenges soon, so stay tuned!

Find Who Is Accessing My Google Drive Files

| Comments

I Recently had to share some documents from my Google Drive with a team of developers outside the office. Most of these documents were API docs. After a couple of months and when those developers didn’t need the API docs anymore, it was almost impossible for me to find what i shared with them, it was like looking for a needle in a haystack. I bet i am not the only one that had this problem before!!!

So how would i do that manually. I would go to each file and click on share button. see the list of the permission and remove the users i don’t want to. It means that i would need to go through 200 files!!! I am too lazy for that!!

https://github.com/gimite/google-drive-ruby

Since i like Ruby programming i thought about using the Google Drive API to accomplish this task. I looked at this gem and i was immediately discouraged by it’s complexity. It has to be a better solution!!!

After looking at Google Script overview page i realized that this is going to be fairly easy to solve an amazing complex problem. Google Script use a language based on Javascript and this API doc.

This code is largely inspired from Amit Agarwal code.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
function Start() {


  var files = DriveApp.getFiles();
  var timezone = Session.getScriptTimeZone();
  var email = Session.getActiveUser().getEmail();

  var MAXFILES = 300; count=1;
  var file, date, access, url, permission;
  var privacy, view, viewers, edit, editors;

  var rows = [["File Name", "Who has access?", "Date Created"]];

  for (var i=0; i<MAXFILES && files.hasNext(); i++ ) {

    file = files.next();

    try {

      access     = file.getSharingAccess();
      permission = file.getSharingPermission();
      viewers    = file.getViewers();
      editors    = file.getEditors();

      view = [];
      edit = [];

      date =  Utilities.formatDate(file.getDateCreated(), timezone, "yyyy-MM-dd HH:mm")
      url = count++ + '. <a href="' + file.getUrl() + '">' + file.getName() + '</a>';

      for (var v=0; v<viewers.length; v++) {
        view.push(viewers[v].getName() + " " + viewers[v].getEmail());
      }

      for (var ed=0; ed<editors.length; ed++) {
        edit.push(editors[ed].getName() + " " + editors[ed].getEmail());
      }

      switch(access) {
        case DriveApp.Access.PRIVATE:
          privacy = "Private";
          break;
        case DriveApp.Access.ANYONE:
          privacy = "Anyone";
          break;
        case DriveApp.Access.ANYONE_WITH_LINK:
          privacy = "Anyone with a link";
          break;
        case DriveApp.Access.DOMAIN:
          privacy = "Anyone inside domain";
          break;
        case DriveApp.Access.DOMAIN_WITH_LINK:
          privacy = "Anyone inside domain who has the link";
          break;
        default:
          privacy = "Unknown";
      }

      switch(permission) {
        case DriveApp.Permission.COMMENT:
          permission = "can comment";
          break;
        case DriveApp.Permission.VIEW:
          permission = "can view";
          break;
        case DriveApp.Permission.EDIT:
          permission = "can edit";
          break;
        default:
          permission = "";
      }

      view = view.join(", ");

      edit = edit.join(", ");

      privacy += (permission === "" ? "" : " " + permission)
               + (edit === "" ? "" : "<br><small>" + edit + " can edit</small>")
               + (view === "" ? "" : "<br><small>" + view + " can view</small>")

      rows.push([url, privacy, date]);

    } catch (e) {Logger.log(e.toString()); Logger.log(file.getName());};

  }

  var html = "<p style='text-align:center'><strong><a style='font-size:160%;text-decoration:none;color:#49B3F5;' File Permissions Report for Google Drive</a></strong></p>";


  html += "<table border='1' cellpadding='5' cellspacing='0'><tr><td><b>" + rows[0].join("</b></td><td><b>") + "</b></td></tr>";

  for (var i=1; i<rows.length; i++) {
    html += "<tr><td>" + rows[i].join("</td><td>") + "</td></tr>";
  }

  html += "</table>";


  GmailApp.sendEmail(email, "Google Drive - File Permissions Report", "", {htmlBody: html});

}

Upgrade to the New Universal Analytics Syntax in SQL

| Comments

Upgrade to the Universal Analytics Syntax in SQL

We are all struggling with upgrading our analytics tools from the old ga.js to the new universal.js plateform. Most of it is done automatically and would require just changing that ga.js file. But would would happen to event tracking? all that _gaq.push(['_trackEvent', 'button', 'event']) that should be turned into something like this ga('send', 'event', 'button');

In this post we are going to update our syntax that is present in a code stored in Postgres. This is very common, just imagine you have a table post with a field body and you want to track some click on a specific post. so you would have something like a href="#" onClick="_gaq.push(['_trackEvent', 'Videos', 'Play', 'Baby\'s First Birthday']);">Play</a> . We need to figure out ways to track all the instances of this syntax and then replace it with the new syntax. We don’t want to do it by hand if there are more than 2 :), that would take forever!! so let’s figure out an automated way

Doing this with a Programming Language would take a long time, the best way seems to be SQL. We need to find all the posts where the tracking event is used:

1
2
 select id,regexp_matches(body,'_gaq.push\(\[\s*?''\s*?_trackEvent\s*?''\s*?,(.+?)\]\)', 'g') as match
  from  posts ;

Good, now let’s make sure that this would capture multiple instances in a single post, so let’s create a query that would display every post and how many matches it has:

1
SELECT id,count(*) FROM posts, regexp_matches(body,'_gaq.push\(\[\s*?''\s*?_trackEvent\s*?''\s*?,(.+?)\]\)', 'g') GROUP BY id HAVING count(*) > 0;

it would look like this:

        id   | count
   19911 |     1
   20894 |     6
   19717 |     8
   19178 |     1
   ..... | .....

Now that our regex capture all the instances of the old syntax, it is time to change it. Let’s update our table with the new syntax:

1
2
3
4
5
begin;
  UPDATE posts SET DATA = regexp_replace(body,'_gaq.push\(\[\s*?''\s*?_trackEvent\s*?''\s*?,(.+?)\]\)', 'ga(''send'',''event'',\1)','gi' )
        WHERE id IN ( SELECT id FROM posts
                           WHERE DATA::text LIKE '%_gaq.push%' );
commit;

This looks scary but it is not!! I am capturing the parameter for _gaq.push and will create a new function call ga(...) using the \1 .

And you should be ready to go now. You can see the event tracking in real time, that is the best way to test it!

Coalesce

| Comments

If you have never heard about the coalesce function, this will be very helpful. Basically this function exist in most modern DBMS and the definition reads, The COALESCE function in SQL returns the first non-NULL expression among its arguments.

let’s see a simple example in postgres:

1
2
3
4
5
6
7
8
9
10
create table employees
(
        name text,
        hourly_wage int,
        salary int
);

insert into employees values 
('bob',null,40000),
('sam',40,null);

There are 2 ways to determine the annual salary, the first one is by displaying the the value of the salary field and the second one is by calculating that salary from the hourly wage. The problem is that sometimes on of the value is null so we would need a conditional, this is where the coalesce function enter into play.

Let’s write a query that will display the annual salary regardless of the null values.

select coalesce(hourly_wage*40*52 ,salary) as "total salary" from employees;

1
2
3
4
5
 total salary 
--------------
40000
83200
(2 rows)

Perfect, we are having the result we want without using a case statement. Now, I am wondering what would happen if both of the values where null, let’s try it out:

1
2
3
4
5
6
7
8
9
insert into employees values ('jim',null,null);
select coalesce(hourly_wage*40*52 ,salary) as  "total salary" from employees;

total salary
--------------
        40000
        83200
             
(3 rows)

Ok, So if both of the parameters are null then it will return null. However there is a way around it and it is adding another parameter at the end of the parameter list that will be the default , something like this coalesce(ex1,ex2,default_value). Here is an example with the previous case:

1
2
3
4
5
6
7
select coalesce(hourly_wage*40*52 ,salary,0) as  "total salary" from employee;
 total salary 
--------------
        40000
        83200
            0
(3 rows)

Fun With Next Train

| Comments

Introduction

enter image description here

I have been playing with creating my own API wrapper lately and i didn’t find a good use to it. Fortunately i found inspiration when i saw my colleague building a widget for dashing.

You can see in the image above that using the this the ruby_wmata gem i can see when are the next train coming. but i can also tell when if there are some trains boarding. It get’s more interesting if i go to the WMATA api documentation and see what kind of information i can retrieve about each train:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
{
"Trains":[
                {
                "Car":"6",
                "Destination":"SilvrSpg",
                "DestinationCode":"B08",
                "DestinationName":"Silver Spring",
                "Group":"1",
                "Line":"RD",
                "LocationCode":"A01",
                "LocationName":"Metro Center",
                "Min":"3"
                },
                {
...

mmm, i can get how many cars the train has and more importantly it’s direction. What if i try to track the trains movement on a specific line, would i be able to determine where the trains are in real time? This is the problem i would try to resolve in this post.

Step 1 : Get the stations code

Given the information provided earlier, we need to find out where all the trains are in a specific line. In order to achieve that we need to make the GetPrediction on each station, The only problem is that we need a list of station codes to do that. Fortunately, we can use the jPath that will return all the stations between a FromStationCode and a ToStationCode. So let’s try to find all these stations code first.

1
2
3
4
5
require 'ruby_wmata'
WMATA.api = 'kfgpmgvfgacx98de9q3xazww'
#pick just 9 stations for the demonstration
path = WMATA.train_path("A01","A15")
path_by_code = @path.map{|station| station["StationCode"]}

Step 2 : Find where the trains are

Now that we have a list of train station codes, we can iterate through it then and make the next train API call. We will retrieve the list of all the arriving trains and sometime we will see that some trains are “BRD”, it would mean that the location of that train is the stations itself. So whenever we see that let’s mark the stations with a ’T' to indicate the presence of a train and ‘—’ the absence of a trains . So i want something that looks like:

1
2
3
## the first like would represent the 3 first letters of a station's name
2015-02-28 18:04:13 -0500:Met-Far-Dup-Woo-Cle-Van-Ten-Fri-Bet
2015-02-28 18:04:14 -0500:-T---------------------------------

Let’s code it:

1
2
3
4
5
6
7
8
9
10
11
12
def find_trains_in_path(path)
  path_with_trains = []
  path.each_with_index{|station,index|
    begin
      next_trains = WMATA.next_trains(station)
      train_boarding?(next_trains) ? path_with_trains[index] = "-T-" : path_with_trains[index] = "---"
    rescue
      path_with_trains[index] = "---"
    end
   }
   return path_with_trains
end

Now that we are extracting information about each stations, let’s implement the function train_boarding that will tell us if there is a train boarding on that station

1
2
3
def train_boarding?(next_trains)
  next_trains.select{|train| train['Min'] == 'BRD'}.empty?
end

Step 3 : Update periodically the display to show trains position

Now that we have all the information we need on a specific time. let’s try to update the display every 10s. Also we don’t want to show the same information multiple time, so we won’t refresh the display if the train are in the same positions.
We can also add some colors to the display to make it more user friendly using the colorize gem. (we all know that User friendly in a terminal is not possible!! )

1
2
3
4
5
6
7
8
9
10
11
12
puts "#{Time.now}:".blue + "#{@path.map{|station| station["StationName"][0,3]}.join('-')}".red
last_display = []
while (true) do
  display = find_trains_in_path(@path_by_code).join("-")
  unless display == last_display
    print "#{Time.now}:".blue
    print "#{display}"
    puts ""
  end
  last_display = display
  sleep 10
end

And here is the final result:

enter image description here

Obviously, this implementation is not perfect. We can argue a couple of things - We don’t really see the direction of the trains - What if there are 2 trains going to the opposite directions boarding, we are not showing that information - We could replace the last display by the current one but you would not see the movement of the trains - We don’t show the train when it is between two stations.

Conclusion

We came a long way form knowing the next train in a train station to display a whole line with the trains moving in realtime. This method could be implement for any other public transportation with a decent API . The most obvious one would be the next bus. We can also extend this project to make have a full map with more than one line and see all the train moving in realtime using Javascript and HTML/CSS.Who knows maybe my next project!

You can find the source code for this project here

ActiveRecord Dirty Change

| Comments

Today, After chatting with my colleague and try to fix some ActiveRecord Store bug, he showed me some very neat ActiveRecord:Dirty method called changed.

I went to Rails documentation and found out the module that allows all this magic. It will tell you if anything has changed on the object.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
2.0.0-p598 :109 > c = Content.first
  Content Load (0.8ms)  SELECT  "content".* FROM "content"   ORDER BY "content"."id" ASC LIMIT 1
 => #<Content id: 1, content_type_id: 2, name: "culture page", url_name: "culture_page", options: {}, creator_id: 123, created_at: "2014-04-21 17:02:23", updated_at: "2014-05-30 18:44:41", published_at: "2014-04-21 17:02:23", removed_at: "2014-05-30 18:44:41">
2.0.0-p598 :110 > c.changed?
 => false
2.0.0-p598 :111 > c.name = "hi"
 => "hi"
2.0.0-p598 :112 > c.changed?
 => true
2.0.0-p598 :113 > c.changed
 => ["name"]
2.0.0-p598 :115 > c.name_change
 => ["culture page", "hi"]
 2.0.0-p598 :118 > c.changes
 => {"name"=>["culture page", "hi"]}
 2.0.0-p598 :114 > c.name_was
 => "culture page"

If you really to avoid saving and making useless DB call. you can always use this module.

Rspec Internals (Part 1)

| Comments

If you wonder how Rspec is built and is curious about the internals of such a tool then this article is for you. After watching an episode from the excellent Gary Bernhard’s Destroy all software series ,i decided to blog about it to make sure i understand how it works.

We will use Minitest TestCase to test the most basic Rspec syntax. As usual let’s write two tests, one that pass and another one that fails. And we will write the code that defines the Rspec features in spec.rb. but lets start with just the tests:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
require 'minitest/autorun'
require_relative 'spec'

class TesDecribe < MiniTest::Unit::TestCase
  def test_that_it_can_pass
    describe 'some description' do
      it 'has some property' do

      end
    end
  end

  def test_that_it_can_fail
    assert_raises(IndexError) do
      describe 'some failing test' do
        it 'fails' do
          raise IndexError
        end
      end
    end
  end
end

Running this will throw a bunch of errors :

1
2
3
4
5
6
7
Finished in 0.001565s, 1277.9553 runs/s, 638.9776 assertions/s.

  1) Failure:
TestDescribe#test_that_it_can_fail [test.rb:13]:
IndexError expected but nothing was raised.

2 runs, 1 assertions, 1 failures, 0 errors, 0 skips

Interesting, It seems like the describe method is defined somehow but the block inside is not executed. It is time to start working on that spec.rb file

1
2
3
4
#lets define describe method that will pick a description and a block and let's call that block
def describe description,&block
  block.call
end

let’s run our test again

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
Finished in 0.001240s, 1612.9032 runs/s, 806.4516 assertions/s.

  1) Error:
TestDescribe#test_describe_method:
NoMethodError: undefined method `it' for #<TestDescribe:0x007fb8158a2558>
    test.rb:10:in `block in test_describe_method'
    test.rb:4:in `call'
    test.rb:4:in `describe'
    test.rb:9:in `test_describe_method'


  2) Failure:
TestDescribe#test_that_it_can_fail [test.rb:17]:
[IndexError] exception expected, not
Class: <NoMethodError>
Message: <"undefined method `it' for #<TestDescribe:0x007fb8158a1e50>">
---Backtrace---
test.rb:19:in `block (2 levels) in test_that_it_can_fail'
test.rb:4:in `call'
test.rb:4:in `describe'
test.rb:18:in `block in test_that_it_can_fail'
---------------

This is the same Exception, the block inside describe gets executed but there is an undefined method it inside. The it method is pretty much the same as describe, it will take a description and a block and will execute that block. Let’s update our spec.rb:

1
2
3
4
5
6
7
8
#lets define describe method that will pick a description and a block and let's call that block
def describe description,&block
 block.call
end

def it description,&block
 block.call
end

let’s run our tests again:

1
2
3
Finished in 0.000968s, 2066.1157 runs/s, 1033.0579 assertions/s.

2 runs, 1 assertions, 0 failures, 0 errors, 0 skips

Good this is working, i can use that describe do .... end syntax, but how about assertions, i want to be able to do something like 2.should == 2. As Usual let’s do the same and build 2 tests, a passing one and a failing one.

1
2
3
4
5
6
7
8
9
10
11
class TestAssertion  < MiniTest::Test
  def test_that_it_can_pass
    2.should == 2
  end

  def test_that_it_can_fail
    assert_raises AssertionError do
      1.should ==  2
    end
  end
end

Obviously running test will not pass:

1
2
3
4
5
6
7
8
9
10
11
12
Finished in 0.001312s, 3048.7805 runs/s, 762.1951 assertions/s.

  1) Error:
TestAssertion#test_that_it_can_fail:
NameError: uninitialized constant TestAssertion::AssertionError
    test.rb:17:in `test_that_it_can_fail'


  2) Error:
TestAssertion#test_that_it_can_pass:
NoMethodError: undefined method `should' for 2:Fixnum
    test.rb:13:in `test_that_it_can_pass'

I see two things , an undefined method should and an undefined Exception AssertionError. Let’s update spec.rb:

1
2
3
4
...
#this is easy fix, just define an exception
class AssertionErro < Exception
end

In order to implement the should we need to create an assertion for the object passed and it needs to work for all ruby objects so we will implement it for the Object class. You also need to remember that ‘==’ is just syntactic sugar offered by ruby and it is the equivalent of ‘equal?’. Basically, 2 == 2 is the same as 2.equal?(2) . this is very important because we will define a == method in our DelayedAssertion class. Enough talking, let’s take a look at the code in spec.rb:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class Object
  def should
    DelayedAssertion.new(self)
  end
end

#lets define DelayedAssertion class now

class DelayedAssertion
  def initialize(subject)
    @subject = subject
  end
  #here where the magic happens
  def ==(other)
    raise AssertionError unless @subject == other
  end
end

let’s run the test now:

1
2
3
4
5
6
7
# Running:

....

Finished in 0.001384s, 2890.1734 runs/s, 1445.0867 assertions/s.

4 runs, 2 assertions, 0 failures, 0 errors, 0 skips

Wow, we have been able to implement some major features of Rspec. In future post i will try to add support to should method such as something.should be_true or array.should have(4).things. You can also add your implementation as a comments.

Create Your First API Wrapper With TDD

| Comments

Whether at you job or personal projects you would need at some point have to build you own API wrapper in ruby. Today, i decided to build my own Wrapper that i will turn into a gem in a later blog post. I want to do it methodically according to Test Driven Development principles.

It is always a hassle to test API calls but there is a way on how it should be done and we will talk about it later on.

I picked the WMATA API because it is well written and well documented. As you can see in their developer website each call has a json response example. We will use these to test our code against it later.

Let’s talk about our goal. I want to be able to set my api key and then make an api call in this manner:

1
2
WMATA.api_key = "kfgpmgvfgacx98de9q3xazww"
WMATA.next_trains("A01") # A01 stands for a station code

Now, that i know what i want, let’s set up our TDD environment. The initial repository tree would look like this:

1
2
3
4
5
6
lib
|___wamta.rb
spec
|___spec_helper.rb
|___trains_helper.rb
Gemfile

Let’s take a look at the Gemfile and see what are the tools we are going to use:

1
2
3
4
5
source 'https://rubygems.org'
gem 'httparty' #i use httparty to make http queries 
gem 'pry' #i use binding.pry to open a IRB console during runtime
gem 'webmock' , group: :test
gem 'rspec' , group: :test # this is my favorite testing framework

Now that we have all the necessary gems do a bundle install. Let’s take a look at the spec/spec_helper.rb:

1
2
3
4
5
6
7
8
9
10
$:.unshift 'lib'
require 'wmata'
require 'webmock/rspec'
WebMock.disable_net_connect!(allow_localhost: true)
RSpec.configure do |config|
    config.before(:each) do
      stub_request(:get, %r|https://api.wmata.com/StationPrediction.svc/json/GetPrediction/|).
        to_return(status: 200, body: File.open("./spec/fixtures/next_trains.json"){|f| f.read}, headers: {})
    end
end

This is where it get’s tricky. If you rely on the API, your tests are not really tests, because they depend on the status of the API. If you have no internet, those tests will not work. Your tests should only test your own code, not the response of the API. We are using webmock to return to each call made to https://api.wmata.com/StationPrediction.svc/json/GetPrediction/ the json response in next_trains.json. This response is the expected result provided by the documentation here

So let’s add the file next_trains.json in spec/fixtures/next_trains.json

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
{
"Trains":[
{
"Car":"6",
"Destination":"SilvrSpg",
"DestinationCode":"B08",
"DestinationName":"Silver Spring",
"Group":"1",
"Line":"RD",
"LocationCode":"A01",
"LocationName":"Metro Center",
"Min":"3"
},
{
"Car":"6",
"Destination":"Grsvnor",
"DestinationCode":"A11",
"DestinationName":"Grosvenor-Strathmore",
"Group":"2",
"Line":"RD",
"LocationCode":"A01",
"LocationName":"Metro Center",
"Min":"4"
}]}

We have everything setup now let’s write our first test in spec/train_spec.rb.

1
2
3
4
5
6
7
8
9
10
11
12
require 'spec_helper'
describe WMATA do
  before(:all) { WMATA.api = 'xxxxx'  }
     it 'should predict next trains' do
       expect(WMATA).to respond_to :next_trains
       next_trains = WMATA.next_trains
       expect(next_trains).not_to be_empty
       ["Car","Min","DestinationName"].each{ |field|
        expect(next_trains.first[field]).not_to be_nil
      }
   end
end

Notice that the api key does not have to be real because the call to the live api will not be made. It will be mocked and will return the expected json response.

Let’s create the module that will handle the API calls in lib/wamta.rb

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
$:.unshift(File.dirname(__FILE__))
require 'client'
module WMATA
  def self.client
    @client ||= Client.new(api)
  end

  def self.api
    @api || raise("please set the api key")
  end

  def self.api=(api_key)
    @client = Client.new(api_key)
    @api = api_key
  end

  def self.next_trains(station = "A06")
    client.next_trains(station)
  end
end

Good, now we have a module that will create a client object when an api_key is passed and if we want to call next_trains without setting up the api key it will throw an exception “please set the api key”. Now, let’s create the Client class in lib/client.rb that will actually handle the API calls.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
require 'httparty'
module WMATA

  class Client
    include HTTParty
    base_uri "https://api.wmata.com"
    def initialize(api_key)
        @options = {query: {api_key: api_key} }
    end

    def next_trains(stations='A06')
      response = self.class.get("/StationPrediction.svc/json/GetPrediction/#{stations}", @options)
      JSON.parse(response.body)["Trains"]
    end
  end

end

When you run the tests now, everything should be green. This part makes sure that our code handles correctly the expected response, now let’s make a live API call :

1
2
WMATA.api = "kfgpmgvfgacx98de9q3xazww"
WMATA.next_trains("A01") # A01 stands for a station code

It should return an array of all the next trains for the specified station. Now that it is working we can do the same for all the other api calls

alt