Wuaki.tv going global on system operations.
Almost two years ago, we decided to move our technical infrastructure from LeaseWeb, a dedicated hosting company in Holland, to Amazon AWS. The journey wasn’t as easy as we thought, and we faced lots of challenges to reach the point where we are today.
One of the very first things you need to understand when you arrive at AWS, Rackspace, or any other IaaS platform is that you must forget about server-level thinking and start thinking at the application level. If you stop for a minute and think about it, one of the key features of IaaS is the idempotence it provides, and this is really valuable when you need to scale a service during periods of load that you cannot predict. At the moment Wuaki.tv is deployed in the AWS EU Region (Ireland), and every single piece of our production stack runs on at least 3 servers (one for each AZ) in a completely puppetized environment.
Wuaki.tv system architecture
One of the things you notice when you start working with Amazon is that you don’t get a static IP address by default, so you have to rely on other tricks to ensure that you can code your infrastructure. At the moment, at Wuaki.tv we rely on Amazon’s security groups for almost everything; this is the only way to ensure that your cluster is perfectly configured and that your firewall rules won’t become a mess.
As you can see, we think in clusters. For us everything is a cluster, from the smallest group of servers up to the biggest one. We don’t care if we only get 1 request per minute, but we do care that when a user wants to use the application, he or she can do it no matter what.
Right now, in June 2013, Wuaki.tv has many different clusters that receive our user requests: clusters for Nginx, WWW, API, Internal services, CDN balancing, MySQL, Caching, Queueing, Twemproxy, RabbitMQ, and ElasticSearch. Almost all of them are involved in each request made by any user of our service. If we had to describe the path of a request, it would be something like this:
A user types the URL Wuaki.tv into their browser and reaches our Nginx cluster, which decides, depending on the URL, whether the request should go to the WWW or the API cluster. After making that decision, Nginx evaluates the IP address of the user making the request and, depending on the user’s location, decides which country and which cluster will process that request.
Once the user is identified, all requests are sent to the proper app server. As you may know by now, Wuaki.tv runs on Ruby on Rails and Unicorn, which helps us serve requests really fast: according to New Relic, the app server takes about 100 ms to serve a request from our API and about 200 ms from our website. Our Apdex is, and must be, between 0.98 and 1 no matter what.
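The two-step decision described above (application first, then country) can be sketched in a few lines of Python. This is only an illustration, not our Nginx configuration: the `/api/` path prefix, the country list, the default country, and the cluster naming scheme are all assumptions made up for the example.

```python
# Hypothetical values, chosen only for illustration.
SUPPORTED_COUNTRIES = {"ES", "GB"}
DEFAULT_COUNTRY = "ES"


def pick_app(path):
    """Step 1: choose the application cluster from the URL."""
    # Assumption: API requests are recognizable by an "/api/" path prefix.
    return "api" if path.startswith("/api/") else "www"


def pick_cluster(path, country_code):
    """Step 2: combine the application with the user's country.

    country_code is assumed to come from a GeoIP lookup on the client IP.
    Unknown countries fall back to DEFAULT_COUNTRY.
    """
    if country_code in SUPPORTED_COUNTRIES:
        country = country_code
    else:
        country = DEFAULT_COUNTRY
    return f"{pick_app(path)}-{country.lower()}"
```

For instance, `pick_cluster("/api/v1/movies", "GB")` yields `"api-gb"`, while a request for `/` from an unsupported country falls back to `"www-es"`.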
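For readers who have not met Apdex before, the score is the number of satisfied requests plus half of the tolerating ones, divided by the total. The sketch below follows that standard definition; the threshold `t` of 0.5 seconds is an assumption for the example, not necessarily the value we have configured.

```python
def apdex(response_times_s, t=0.5):
    """Apdex score for a list of response times in seconds.

    satisfied:  time <= t
    tolerating: t < time <= 4 * t
    frustrated: time > 4 * t (counts as zero)
    The default t=0.5 is an assumed threshold for illustration.
    """
    if not response_times_s:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for rt in response_times_s if rt <= t)
    tolerating = sum(1 for rt in response_times_s if t < rt <= 4 * t)
    return (satisfied + tolerating / 2) / len(response_times_s)
```

With this definition, staying at 0.98 or above means that almost every request has to finish within the threshold.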
Our config management tool (Puppet)
There is no way you can scale your application fast if you do it manually; it’s simply not possible. We started to puppetize Wuaki.tv’s infrastructure in March 2012. I had previously worked with Puppet, and at that moment I was the only person doing Operations, so it was an easy decision. At the moment we have 4 environments and almost 60 modules, and we have implemented MCollective for orchestration with RabbitMQ and The Foreman, working with a single Puppet Master.
We had to do some tuning to our Puppet Master so it could handle all of our server requests:
- Splaylimit $runinterval/2
Basically, this setup helps us avoid too many nodes connecting to our Puppet Master at the same time. PuppetDB helps us with things like Nagios, and Hiera helps us prepare some modules in real time depending on our scaling needs, and more.
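To see why a splay limit spreads the load, here is a simplified model in Python. It is not how the Puppet agent is implemented: it only assumes that each node waits a random delay between zero and the limit before contacting the master, and that splay is enabled on the agents. The node count and the 1800-second run interval in the usage note are example values.

```python
import random


def splay_delays(node_count, runinterval_s, seed=0):
    """Return one random start delay (in seconds) per node.

    Models a splay limit of runinterval / 2: every delay falls
    between 0 and half the run interval.
    """
    splaylimit = runinterval_s / 2
    rng = random.Random(seed)  # seeded so the example is repeatable
    return [rng.uniform(0, splaylimit) for _ in range(node_count)]
```

Calling `splay_delays(100, 1800)` spreads 100 nodes over a 15-minute window instead of having them all connect at the same moment.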
Wuaki.tv central logging system
Last year Ignasi Blanco joined the operations team at Wuaki.tv, and one of his first tasks was to deploy a central logging system. We happened to be using Splunk at that moment, but we reached the limits of the free license really fast. When we were looking for alternatives we found Logstash and Kibana.
Logstash is an Open Source project developed by Jordan Sissel from Dreamhost, which helps us process any kind of logs.
It’s been almost a year since we started that project. I am sure we will write a post with more details soon, but to give you an idea, at the moment we are processing around 83 million log lines per day. One of the key aspects of our infrastructure is that almost everything is asynchronous, not only at the application level but also at the infrastructure level.
In December, we decided that it was time to implement caching in our application, and we wrote this post to explain how we implemented it. Since that post we have made a couple of changes. Because of our global expansion, we decided that some parts of our stack were going to handle global traffic while others were going to handle local traffic.
At the beginning this seemed a little crazy, but after trying and failing in our implementations, we saw that this approach has a positive impact on server administration and helps us detect any issue more easily than having a local stack. This solution is possible thanks to Redis namespaces, where we can create keys with country, environment, and application names in order to identify any request at a given time.
And this is basically what makes Wuaki.tv run at the moment, mainly on top of m1.large AWS instances, using tools such as Amazon S3, CloudFront, Auto Scaling and more… much more.
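As a rough idea of what such namespaced keys can look like, here is a minimal Python sketch. The colon separator, the field order, and the `cache_key` helper are assumptions for the example; the only part taken from our setup is that country, environment, and application names go into the key.

```python
def cache_key(country, environment, application, *parts):
    """Build a namespaced key such as "es:production:api:movies:42".

    The namespace (country, environment, application) is lowercased so
    that "ES" and "es" end up in the same keyspace; the remaining parts
    are appended unchanged.
    """
    namespace = [country.lower(), environment.lower(), application.lower()]
    return ":".join(namespace + [str(part) for part in parts])
```

Because the namespace is a prefix of every key, a shared Redis cluster can serve several countries and environments while each key can still be traced back to where it came from.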
GeoIP configuration on Nginx
During our implementation at Wuaki.tv we found a couple of issues related to Amazon ELBs and autoscaling policies on Nginx, which doesn’t refresh our upstreams’ IPs in real time. To solve this issue we implemented the resolver directive with Amazon’s DNS servers in our nginx.conf file.
As you may know by now, Wuaki.tv is a VOD service which depends on the user’s location. We have some business rules that we need to follow, and depending on our users’ location we have to send traffic to many different clusters, each with completely different business rules. To accomplish this we use the nginx-full package on Debian, because it gives us GeoIP support in a simple way through our APT package manager.
Particular Vhost configuration for wuaki.tv
For our API, we use a different method because we need transparency in order to complete each request from our clients. We need to give the appropriate response to each user depending on their location, without any weird redirects.