Ruby Summer of Code Student Application

Name * Gonçalo Silva
Email *
Phone Number (US Only)
Phone Number (Non-US)
*hidden*
IRC Nick
*hidden*
Twitter Username *
goncalossilva
GitHub Username *
goncalossilva
Other Ways to Reach You
*hidden*
Location *
Porto, Portugal
Time Zone *
GMT
Primary Language *
Portuguese
Other Languages
English
University *
Faculty of Engineering of the University of Porto
Degree Program *
Master in Informatics and Computing Engineering
Proof of Student Status * *hidden* 671.02 kB · pdf
Have you participated in Google Summer of Code before? *
No
Bio *

I'm a soon-to-be software engineer from Portugal who has always had a crush on performance. Having been a freelance web developer for many years, I started using Ruby on Rails in 2007 in a college project. A couple of months later, RoR was my first choice for almost every web project. I'm currently working for “Tecla Colorida”, the creators of http://escolinhas.pt, where they began using RoR in early 2009. In February 2010, I started a master's thesis entitled “Scaling Rails: a system-wide approach to performance optimization”, mixing my passion for Rails and my mild obsession with performance.

I spend most of my free time tweaking open source projects, from operating systems to graphical user interfaces. This hacking instinct goes hand in hand with my love for open source software, allowing me to make small contributions to many projects. I also avoid becoming a full-time geek by going out with friends, discovering new places, meeting new people and watching some TV shows (oh, wait, that's a geek thing).

You can find out more about me by visiting my website (http://goncalossilva.com), taking a sneak peek at my Rails-oriented blog (http://snaprails.tumblr.com) or following me on Twitter (http://twitter.com/goncalossilva).

Headshot * *hidden* 1.48 MB · jpg
Why do you use Ruby and/or Rails? How would you like to see them improve? *

I started using Ruby before I even knew about Rails. I used it mainly as a scripting language, because it was extremely powerful and plain easy. Things that would take me hours to develop in Bash or Perl would get done in minutes using Ruby. Its syntax and libraries allowed me to accomplish much more with less effort.

When I started using Rails, I found the need to expand my knowledge of the Ruby programming language, since it was no longer just a scripting language to me. Rails made my life easier. I couldn't really handle a lot of simultaneous web projects in parallel with college work. With RoR, this was no longer an issue. I started developing and deploying really fast. I noticed my products' quality was actually much better (since I had more time to craft them) and that my clients were happier. It couldn't be better.

Both Ruby and Rails have one common pitfall, from my perspective: performance and scalability. It is perfectly possible to build a highly efficient application, but it takes too much effort and, to accomplish it, developers need a deep insight into how Ruby and Rails are structured and work. In my opinion, performance is one of the most sensitive aspects of Ruby-related projects.

Ruby's garbage collector is not well suited to Rails applications, and Rails itself is a dense framework divided into many modules developed by a lot of people, which makes it the culprit for poor performance in certain situations. Rails was recently overhauled with the development of version 3, with some parts being completely rewritten. All these deep changes can lead to new performance problems, aside from the existing ones, and this risk should be avoidable. Developers should be able to know whether their changes introduce any performance regressions.

Outline the specific project you're proposing. *

Ruby on Rails should have an official full-stack benchmarking suite that developers can work against. Each commit would automatically trigger this process, either locally or remotely, and developers would be able to keep track of the impact their changes had on the framework's performance. In its most basic form, this would be a kind of performance-oriented continuous integration (CI).

One hypothetical example would be a server (maybe on AWS) that runs all the benchmarks nightly and compares them to the previously gathered data, making it possible to tell whether recent changes introduced any performance regressions or improvements. Developers would be notified if anything went wrong and would get a good overview of how performance evolves over time, commit by commit.
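
As a rough illustration of the comparison step, here is a minimal sketch in Ruby. Everything in it is an assumption made for the example: the `find_regressions` helper, the 10% threshold, the benchmark names and the timings are all hypothetical, and a real suite would read its baseline from stored results rather than from literal hashes.

```ruby
# Hypothetical helper: compares the latest benchmark timings against a
# stored baseline. Both hashes map a benchmark name to its mean
# wall-clock time in seconds.
def find_regressions(baseline, current, threshold: 0.10)
  current.each_with_object({}) do |(name, time), regressions|
    previous = baseline[name]
    # A benchmark with no usable baseline has nothing to be compared against.
    next if previous.nil? || previous.zero?

    # Relative change: 0.25 means the benchmark became 25% slower.
    change = (time - previous) / previous
    regressions[name] = change.round(3) if change > threshold
  end
end

# Made-up numbers, for illustration only.
baseline = { "render_partial" => 0.200, "ar_find" => 0.050 }
current  = { "render_partial" => 0.260, "ar_find" => 0.051, "new_bench" => 0.010 }

find_regressions(baseline, current).each do |name, change|
  puts "#{name} is #{(change * 100).round(1)}% slower than the baseline"
end
```

A relative threshold is used here so that small run-to-run noise does not trigger a notification; picking a sensible value would depend on how stable the benchmarking machine is.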

If I'm done with this before the final deadline, I can use the aforementioned suite to work on real performance improvements, namely in ActionPack and ActiveRecord. Ordered by priority, those would be:
1. Improve form helpers;
2. Review link helpers and the InstanceTag system;
3. Top-to-bottom ActiveRecord profiling and performance improvements (AR was heavily overhauled in version 3, increasing the odds of it having performance-related issues);
4. Parallel partial rendering support (and maybe making Rails smart enough to decide which is better according to the request/machine, since single-core machines wouldn't benefit from this).

Why is this important to the Ruby and/or Rails communities at large? Why is this important to you? *

The whole community would benefit from having an official benchmarking suite for Ruby on Rails. Developers would have increased awareness of the framework's performance. They would be able to benchmark their changes and understand their impact on every component, making small adjustments if any significant performance regressions were found. The community would also benefit since Rails would definitely be faster and more scalable over time.

For me, this project and its timing couldn't be better. As I've said before, I'm very concerned about the performance of everything computer-related. I'm also writing a thesis in this area, so I can help the community while putting the "cherry on top" of my work.

List a clear set of goals/milestones you'll hit during the summer, along with a rough timeline. Be specific about your deliverables. *

Since this is a 2-month project, these would be its milestones along with an estimate of how long they would take:
1. Building a set of suitable full-stack benchmarks (3 weeks);
2. Creating a script that runs the aforementioned benchmarks and collects and organizes important data such as elapsed time, memory usage, CPU usage, slowest methods, etc. (2 weeks);
3. Figuring out a way of automating a performance CI so that the benchmarks run the same way as the test CI (1 week);
4. Building a graphical interface to help with data visualization (with appropriate charts and links to commits) and working on a notification system that is triggered when performance regressions are found (1 week);
5. Using the platform to improve ActionPack and ActiveRecord (uncertain, 1 week).
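
To give an idea of what the data-collecting script from milestone 2 could look like at its simplest, here is a minimal sketch. It assumes a recent Ruby (it relies on `GC.stat(:total_allocated_objects)` and `Array#sum`) and uses only the standard `benchmark` library; the `measure` helper, its parameters and the sample workload are hypothetical. It records wall-clock time and object allocations only, so CPU usage and per-method profiling would need additional tooling.

```ruby
require "benchmark"

# Hypothetical runner: executes the given block several times and records
# the wall-clock time and the number of allocated objects for each run.
def measure(name, runs: 5)
  samples = Array.new(runs) do
    GC.start # start every run from a similar heap state
    allocated_before = GC.stat(:total_allocated_objects)
    seconds = Benchmark.realtime { yield }
    { seconds: seconds,
      allocations: GC.stat(:total_allocated_objects) - allocated_before }
  end

  # Aggregate the samples into one record that can be stored and compared later.
  {
    name: name,
    runs: runs,
    mean_seconds: samples.sum { |s| s[:seconds] } / runs,
    mean_allocations: samples.sum { |s| s[:allocations] } / runs
  }
end

# Sample workload, for illustration only.
result = measure("string_building", runs: 3) do
  1_000.times.map { |i| "item-#{i}" }.join(",")
end
puts result
```

Averaging several runs is the simplest way to smooth out noise; a real suite would probably also keep the individual samples so that variance can be inspected.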

As I've mentioned before, the main focus of this project is building the full-stack benchmarking suite for Rails. If I'm able to finish it and polish it within schedule, I'll use it to work on AP and AR.

What are the "unknowns" in this project for you? What kind of pitfalls could you run into? *
First of all, there is a risk that the benchmarking suite won't, at first, cover the full extent of the framework. The process of automating a performance CI can also be risky, since I'm not fully aware of how the test CI for Rails is structured.
How will you measure progress? How will you handle falling behind? *

Each phase is measured differently, as it involves different kinds of work. Following the above enumeration, the progress measure would be something like this:
1. The benchmarking suite's coverage is an excellent indicator of how things are progressing;
2. The script will measure a defined set of variables. The more it organizes and stores, the more complete it is;
3. This phase is actually divided into two steps: finding a way of solving the problem and actually doing it. Since it's a small phase, these 2 steps should suffice as measurement;
4. This small phase is all about presenting the gathered data in an appropriate format, allowing easy and intuitive navigation. Without usability tests, this can't be realistically measured but common sense will surely help;
5. This is an optional and very small phase, so measuring its progress will be very hard.

If somehow the work falls behind schedule, priorities will need to be set in order to create a useful benchmarking tool, even if it's not 100% complete. Luckily, the order of the 5 phases is related to their priority. Also, working overtime is always an option.

Is there anything else you'd like to share with us about yourself or your project?
Although I'm Portuguese, I can easily write, read and hold conversations in English, so please don't take the fact that English is not my primary language as a limitation. If I'm accepted, I can work smoothly with an English-speaking mentor/company.