Building Wanelo @buildingwanelo - Tumblr Blog

A Brief History of Sprout Wrap

When Wanelo gets a brand new workstation the first thing we install on it is Sprout. Sprout is a collection of OS X-specific recipes that allow you to install common utilities and applications that every Ruby developer has and will appreciate.

Anyone who has worked at Pivotal Labs would feel comfortable with the workstations that spawn from sprout-wrap, because they’ve probably worked on one before. Sprout is based on chef-soloist, which allows a developer to run a set of chef recipes on their local machine. Recipes have been built for applications like Chrome, RubyMine and iTerm. Other common OS X settings (the ones that get changed on every new workstation) can be switched too, including turning SSH on, changing the default keyboard repeat rate, and installing sane git aliases, rbenv and bash completion.

You can automatically clone git repositories into your ~/workspace directory. You can install PostgreSQL, ImageMagick, node, Dropbox, PhantomJS, GitX, Caffeine and Heroku Toolbelt. If you’ve found a utility useful in a development setting, or flipped a switch somewhere in System Preferences, a recipe probably exists in one of Sprout’s cookbooks.

So who is this useful for?

We at Wanelo have grown considerably in the few months I’ve been here. We’ve hired four new people. That’s two pairing stations that two pairs would normally have to set up. Instead, we took the time to set up a soloistrc file, the file where you specify which recipes to run.

Here’s a snippet from a standard soloistrc file:

# development (rails)
- pivotal_workstation::rbenv
- pivotal_workstation::gem_setup
- pivotal_workstation::postgres
- sprout-osx-apps::imagemagick
- sprout-osx-apps::node_js
- sprout-osx-apps::qt

# apps
- sprout-osx-apps::skype
- sprout-osx-apps::chrome
- sprout-osx-apps::textmate
- sprout-osx-apps::1password
- sprout-osx-apps::hub
- sprout-osx-apps::phantomjs
- sprout-osx-apps::gitx
- sprout-osx-apps::propane

Running bundle exec soloist inside the proper directory will ensure all applications are installed. After unboxing new workstations (or formatting the out-of-date ones) a full run of sprout-wrap takes about two and a half hours. Each subsequent run takes about 3 minutes.

Customize!

At Wanelo we wrote a recipe that runs bundle install and sets up our development database. It gets our Ruby web application to a state where we can run foreman start on a brand new machine. We also wrote a recipe that installs our VPN on our machines. The most recent recipe we wrote installs our favorite vim plugins, configurations and theme.

We have reached convergence

Sprout-wrap has the benefit of following chef’s principles and requires recipes to be idempotent. So when a developer includes a recipe that, for instance, installs cowsay, every workstation will have cowsay going forward. Which means every developer is happy to work on any workstation. Because, hey, now you have cowsay on every workstation.

 ______
< Moo. >
 ------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Sprout enables us as pairing developers to build a stable cluster of similar machines where there are minimal development bottlenecks. Having someone who is familiar with sprout-wrap and can set it up on your machines is hugely valuable for a growing organization, and can save hours, likely days, of developers' time.

You can learn more and find instructions on how to set up and install Sprout on GitHub.

-James

#wanelo #chef #sprout wrap

Just enough client-side error tracking

Deploying at Wanelo tends to be high-frequency and low-stress, since we have most aspects of our systems' performance graphed in real time. We can roll out new code to a percentage of app servers, monitor app server and db performance, check error rates, and then finish up the deploy. However, there’s one area where I’ve always wanted better metrics: the client side. In particular, I want better visibility into uncaught JavaScript exceptions.

Client-side error tracking is a notoriously difficult problem. Browser extensions can throw errors, adding noise to your reports. Issues may manifest only in certain browsers or under certain network conditions. Exception messages tend to be generic, and line numbers are unhelpful, since scripts are usually minified. Data has to be captured in users’ browsers and reported via HTTP before a user navigates to a new page. And on and on. On the other hand, many sites are moving more and more functionality client-side these days, so it’s becoming increasingly important to know when there are problems in the browser.

I have yet to see a great solution to this problem, so I try to ask about other companies’ client-side error tracking whenever I can. I usually hear one of two answers: A.) We don’t track them (but we’d like to), or B.) We built our own in-house tracking system; sometimes it helps us catch issues, but usually it’s a firehose of random errors that we can’t trace back to a particular issue.

There’s a middle path between these two answers that I think will end up being the “just right” solution for us: client-side error rate tracking. Essentially, ignore all error messages and calculate the total count of client-side errors per minute relative to “page views.” The goal of this sort of tracking isn’t to pinpoint each new client-side issue, but just to answer the question: did we break something during this deploy that’s going to prevent our users from having a good experience on the site?
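
As a rough illustration of the error-rate idea, here is a minimal Ruby sketch. It is not our actual implementation: the ClientErrorRate class and its method names are made up for this example, and it assumes error reports and page views arrive with a timestamp and can be counted in memory, in one-minute buckets.

```ruby
# Hypothetical in-memory tracker (illustrative only): counts client-side
# errors and page views per one-minute bucket and reports their ratio.
class ClientErrorRate
  def initialize
    @errors = Hash.new(0)
    @page_views = Hash.new(0)
  end

  # Record one uncaught client-side exception; the message is ignored.
  def record_error(time)
    @errors[minute_bucket(time)] += 1
  end

  def record_page_view(time)
    @page_views[minute_bucket(time)] += 1
  end

  # Errors per page view for the minute that contains `time`.
  def rate(time)
    bucket = minute_bucket(time)
    views = @page_views[bucket]
    return 0.0 if views.zero?

    @errors[bucket].to_f / views
  end

  private

  # Whole minutes since the epoch, so every second of a minute shares a key.
  def minute_bucket(time)
    time.to_i / 60
  end
end

tracker = ClientErrorRate.new
deploy_minute = Time.at(0)
200.times { tracker.record_page_view(deploy_minute) }
3.times { tracker.record_error(deploy_minute + 10) }
puts tracker.rate(deploy_minute) # 3 errors over 200 page views
```

Comparing this ratio for the minutes just before and just after a deploy is enough to answer the "did we break something" question without looking at a single error message.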

Multi-process or multi-threaded design for Ruby daemons? GIL to the rescue :)

MRI Ruby has a global interpreter lock (GIL), meaning that even when writing multi-threaded Ruby only a single thread is on-CPU at a point in time. Other distributions of Ruby have done away with the GIL, but even in MRI threads can be useful. The Sidekiq background worker gem takes advantage of this, running multiple workers in separate threads within a single process.

If the workload of a job blocks on I/O, Ruby can context-switch to other threads and do other work until the I/O finishes. This could happen when the workload reaches out to an external API, shells out to another command, or is accessing the file system. If the workload of a process does not block on I/O, it will not benefit from thread switching under a GIL, as it will be, instead, CPU-bound. In this case, multiple processes will be more efficient, and will be able to take better advantage of multi-core systems.

So… why not skip threads and just deal with processes? A number of reasons.
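
To make the I/O-bound case concrete, here is a small sketch that compares running jobs one after another with running them in threads. The helper names are made up for this example, and sleep stands in for a real blocking call such as an HTTP request; like blocking I/O, it lets MRI switch to another thread.

```ruby
require "benchmark"

# Simulated I/O-bound job: sleep stands in for a network call or disk read.
def io_bound_job(seconds)
  sleep(seconds)
  :done
end

# Run the jobs one after another in the current thread.
def run_sequentially(jobs, seconds)
  Array.new(jobs) { io_bound_job(seconds) }
end

# Start one thread per job, then collect each thread's return value.
def run_in_threads(jobs, seconds)
  Array.new(jobs) { Thread.new { io_bound_job(seconds) } }.map(&:value)
end

sequential = Benchmark.realtime { run_sequentially(4, 0.2) }
threaded   = Benchmark.realtime { run_in_threads(4, 0.2) }
puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

On MRI the threaded run should finish in roughly the time of a single job, because every thread spends its time waiting rather than holding the lock. Swap the sleep for a tight arithmetic loop and the advantage should disappear, which is the CPU-bound case described above.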

#wanelo #ruby #threads #gil #context switching

Quick heads-up on our upcoming webinar with Joyent on Manta

A few months back, one of our engineers, Atasay Gokkaya, published a fantastic overview of how we at Wanelo use Manta, Joyent's innovative new object store, for a massively parallelized user retention analysis, using just a few lines of basic UNIX commands in combination with the map/reduce paradigm.

I also recently went onstage with Joyent's VP of Engineering, Bryan Cantrill, for a fireside chat at VentureBeat’s CloudBeat, discussing Wanelo's use of Manta, as well as our excitement about Joyent's cloud. If you missed it, or are interested in learning more about the subject, we're continuing the discussion with a live webinar on Tuesday, October 29th.

Atasay and I will dive deep into our team's experience using the Joyent Manta storage and big data analytics service.

It's an hour-long webinar, and we'll cover the following:

How we solved the problem of user event data collection on a massive scale, and very cheaply

How Joyent Manta storage and big data analytics service allowed us to use the collected data to analyze user behavior and retention over many months, and run our queries in mere minutes

We'll discuss the unique benefits of using Joyent Manta Storage Service, including ease of use, flexibility, performance, and cost-savings

We'll answer any questions from the audience as much as time permits.

To join, please register here.

-Konstantin

#joyent #venturebeat cloudbeat

Detangling Business Logic in Rails Apps with PORO Events and Observers

With any Rails app that evolves along with substantial user growth and active feature development, pretty soon a moment comes when there appears to be a decent amount of tangled logic, AKA "technical debt."

A typical example would be a user registration controller's "register" action, which upon a successful registration might coordinate a bunch of actions related to the registration but unrelated to one another, such as:

Sending the user a welcome email

Logging an analytics event for future reporting

Queueing up a job to notify the user's Facebook friends

Running a check against a spam database of IP addresses to validate the new account

Running recommendation engine logic to suggest topics to follow

These are all concerns that are independent of one another, but happen when a user registers. Some of these actions happen immediately, some even within a single transaction, and some asynchronously (in another thread, or in a background job).

This topic got a lot of discussion in this famous thread, where even DHH chimed in. We'll use the example discussed in that thread, and the version that DHH presented (slightly compacted) below. Basically, it's a controller that creates a comment and then performs a bunch of related actions, such as posting to Twitter and Facebook, or running it through a spam check.

class PostsController
  def create
    @entry = current_user.entries.find(params[:id])
    return head(:bad_request) if SpamChecker.spammy?(params[:post][:body])

    @comment = @entry.comments.
      create!(params[:post].
        permit(:title, :body).
        merge(author: current_user))

    Notifications.new_comment(@comment).deliver

    if @comment.share_on_twitter?
      TwitterPoster.new(current_user, @comment.body).post
    end

    if @comment.share_on_facebook?
      FacebookPoster.new(current_user, @comment.body).
        action(:comment)
    end
  end
end

In this blog post we'll examine an event-based approach to decoupling this business logic, a method that's been pretty successful within the Wanelo codebase thus far.
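
To give a feel for the general shape of an event-based approach, here is a minimal sketch in plain Ruby. It is only an illustration, not the code from our codebase: EventBus, CommentCreated and the two observers are hypothetical names invented for this example.

```ruby
# Hypothetical event bus: observers subscribe to an event class and are
# called whenever an instance of that class is published.
class EventBus
  def initialize
    @observers = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(event_class, observer)
    @observers[event_class] << observer
  end

  def publish(event)
    @observers[event.class].each { |observer| observer.call(event) }
  end
end

# A PORO event: plain data describing what happened, with no behavior.
CommentCreated = Struct.new(:author, :body, :share_on_twitter)

bus = EventBus.new
log = []

# Each concern is its own observer and knows nothing about the others.
bus.subscribe(CommentCreated, ->(event) { log << "notify: #{event.body}" })
bus.subscribe(CommentCreated, lambda { |event|
  log << "tweet: #{event.body}" if event.share_on_twitter
})

# The controller's only job would be to publish the event.
bus.publish(CommentCreated.new("james", "Moo.", true))
puts log
```

The point of the design is that adding a new reaction to a comment means adding a new observer, rather than growing the controller action.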

#ruby #rubyonrails #design pattern #web development

Really Really Really Deleting SMF Service Instances on Illumos

We recently ran into a tricky situation with a custom SMF service we maintain on our Joyent SmartOS hosts. The namespace for the service instance (defined in upstream code) had changed, which meant that as our Chef automation upgraded the service instances to the latest code, we ended up with a lot of duplicate service instances that each had a unique namespace.

After wrestling with the best way to batch delete/reinstall the service (using Chef's knife cli), we found a way to improve our old process.

Normally, we would delete services with something like svccfg delete <service_name>, but this doesn't work well if you need to delete a number of services, especially if they have similar namespaces. Further, we found that running this in a loop against the output of svcs -a -H | grep <service_name> wasn't effective because service configurations could linger even after the service instance had been deleted.

Digging into man svccfg, we came up with a way to enumerate services and service configurations more cleanly with svccfg:

for service in $(svccfg list | grep nad); do
  sudo svcadm disable -s "$service"
done

for instance in $(svccfg list | grep nad); do
  sudo svccfg delete "$instance"
done

Blake

#illumos #smf #smartos #omnios #solaris #wanelo #buildingwanelo

A Cost-effective Approach to Scaling Event-based Data Collection and Analysis

With millions of people now using Wanelo across various platforms, collecting and analyzing user actions and events becomes a pretty fun problem to solve. In most services, user actions generate aggregated records in a database, and the product itself does not explicitly require keeping those actions in non-aggregated form. Keeping them is critical for other reasons, though, such as user history, behavioral analytics, spam detection and ad hoc querying.

If we were to split this problem into two sub-problems, they would probably be “data collection” and “data aggregation and analysis.”

UPDATE: please check out the following presentation from the Surge 2013 conference for another view into this project:

First question: Why don’t we use our relational database backend for this?

#big data #manta #joyent #smartos #parallel computing #distributed computing

Scaling Wanelo 100x in Six Months

We recently gave a talk at the SFRoR Meetup here in San Francisco about how we scaled this Rails app to 200K RPM in six months. There were a lot of excellent questions at the meetup, and so we decided to put the slides up on SlideShare.

Without further ado, here it is. Feedback and comments are always welcome.

Scaling Wanelo.com 100x in Six Months by Konstantin Gredeskoul and Eric Saxby

#scaling #scalability #performance #webscale #postgresql #monitoring #technology #rubyonrails #ruby #databases #replication #redis #tech

High Read/Write Performance PostgreSQL 9.2 and Joyent Cloud

At Wanelo we are pretty ardent fans of the PostgreSQL database server, but try not to be dogmatic about it.

I have personally used PostgreSQL since version 7.4, dating back to some time in 2003 or 2004. I was always impressed with how easy it was to get PostgreSQL installed on a UNIX system, how quick it was to configure (only two config files to edit), and how simple it was to create and authenticate users.

#postgresql #database #joyent #postgres #tech

How Alerts Can Tell You When Beyoncé Is On

This past weekend a number of us were focused on a really important annual prime time television event (the Puppy Bowl, of course). Turns out other people out there were watching some other sporting event, which leads to the rest of this story.

The Case for Vertical Sharding

Wanelo's recent surge in popularity rewarded our engineers with a healthy stream of scaling problems to solve.

Among the many performance initiatives launched over the last few weeks, vertical sharding has been the most impactful and interesting so far.

#scalability #webscale #joyent #smartos #sharding #postgresql

The Big Switch: How We Rebuilt Wanelo from Scratch and Lived to Tell About It

Originally published here on 14 Sep 2012.

The Wanelo you see today is a completely different website than the one that existed a few months ago. It’s been rewritten and rebuilt from the ground up, as part of a process that took about two months. We thought we’d share the details of what we did and what we learned, in case someone out there ever finds themselves in a similar situation, weighing the risks of either working with a legacy stack or going full steam ahead with a rewrite.

One early finding we bumped into was that the old tables and new tables would have to live in the same PostgreSQL schema so that we could easily copy/transform data using SQL between tables. If we put the legacy tables in another database or schema, it was not as easy to move data between them. This might be a limitation of PostgreSQL. Luckily, our Java application used singular names of entities (such as “user”) while the Rails app uses plural. So we could keep the same names and go from the “user” table to “users” within the same database schema.

The second challenge was the actual transfer of data from MySQL to PostgreSQL as is. This proved to be no small feat, as we ended up using a custom-built project with a few rake tasks, on top of the mysql2psql gem. In that mini-project, we used the gem to export from MySQL to *.sql files ready for import into PostgreSQL, but we had to munge some of them first to make them work with Pg. For example, the MySQL ENUM column type was exported as VARCHAR(0), and we had to fix that because Pg did not accept that data type. Similarly, bit(1) columns were not properly exported and had to be converted prior to export into regular integer columns. Luckily, there weren't that many special cases we had to deal with, and mysql2psql provided most of the functionality out of the box.

We also found that the gem appears to have an exponentially increasing performance penalty based on the number of rows exported. So we wrote a rake task that split all rows in the largest table into chunks of 100k, and then started one export per chunk in parallel, using multiple processes. This allowed us to eventually export data from MySQL into a Pg-acceptable SQL file very quickly (under 30 minutes for a 5GB InnoDB file). Our initial attempts to do so took over 20 hours. The exported files used PostgreSQL's "COPY" command, which is very fast. So the import of the files took a significantly shorter amount of time, about 15 minutes tops.
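
The chunk-and-fork idea can be sketched in a few lines of Ruby. This is not the original rake task: chunk_ranges and export_in_parallel are hypothetical helpers, the row ids are assumed to be contiguous starting at 1, and the block passed in stands in for whatever actually exports one chunk.

```ruby
# Split row ids 1..total_rows into inclusive [first, last] pairs.
def chunk_ranges(total_rows, chunk_size)
  (1..total_rows).step(chunk_size).map do |first|
    [first, [first + chunk_size - 1, total_rows].min]
  end
end

# Fork one child process per chunk and run the given block in each child.
# Returns true only if every child exited successfully.
def export_in_parallel(ranges)
  pids = ranges.map do |first, last|
    Process.fork { yield(first, last) }
  end
  pids.map { |pid| Process.wait2(pid).last.success? }.all?
end

ranges = chunk_ranges(250_000, 100_000)
# => [[1, 100000], [100001, 200000], [200001, 250000]]

# The block here only prints; a real task would run the export for the chunk.
ok = export_in_parallel(ranges) do |first, last|
  puts "exporting rows #{first}..#{last}"
end
puts "all chunks exported: #{ok}"
```

This sketch forks every chunk at once, which is fine for a handful of chunks; for a large table you would probably want to cap the number of concurrent children.
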
Now we just had to run the legacy migrations to populate our schema.

Our switch from the old stack to the new one took place on June 27, with four hours of downtime (we could have done it faster, but we decided that four hours was acceptable for such a large migration). There were six steps:

#rubyonrails #ruby #java #j2ee #hibernate #spring #rewrite #mysql #postgresql #migration #databases #data migration
