True.com gains real-time visibility into their infrastructure through the True Performance Dashboard.
True.com is the leading internet dating web site, with over 1 million visitors a day. While there are hundreds of online dating sites, only True does background checks to protect its members from felons and those already married. This has differentiated True and has quickly made them the industry leader.
True wanted real-time visibility into their production systems so that they could:
- See how many visitors were currently online.
- See how their web, search and database tiers were performing.
- Get advance warning of an impending outage so that they could take preventative steps.
- Research complex problems that occurred across coupled systems.
- Plan for future growth by analyzing historical trends.
TitaniumSea built components that gathered statistics on the various production machines, and then normalized and consolidated those statistics into a database. Graphs are pulled to the intranet dashboard on a periodic basis. The results make sense to technical and non-technical people alike. On the capacity graphs, 0% means nothing is happening while 100% means that the systems are at full capacity. Shading makes it even more obvious when there are problems.
Implementation Details
True's technology infrastructure is primarily Microsoft-based, with some Linux at the periphery. The great thing about the Windows platform is that Microsoft provides built-in performance counters. However, it is easy to be overwhelmed by too much data. On this project, we used local agents to consolidate the data down to one-minute peaks and averages before propagating it up to the central performance database.
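The consolidation step can be illustrated with a minimal Python sketch. It is not the agent code used on the project: the function name `consolidate_per_minute`, the `(timestamp, value)` sample format, and the output field names are all assumptions made for illustration. It groups raw samples into one-minute buckets and keeps only the peak and the average for each bucket.

```python
from collections import defaultdict
from statistics import mean


def consolidate_per_minute(samples):
    """Reduce raw samples to one row per minute (hypothetical helper).

    `samples` is an iterable of (unix_timestamp_seconds, value) pairs,
    an assumed format for readings taken from a single counter.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Round the timestamp down to the start of its minute.
        minute_start = int(timestamp) - int(timestamp) % 60
        buckets[minute_start].append(value)

    # One consolidated row per minute, in chronological order.
    return [
        {"minute": minute, "peak": max(values), "average": mean(values)}
        for minute, values in sorted(buckets.items())
    ]


if __name__ == "__main__":
    raw = [(0, 10.0), (15, 30.0), (59, 20.0), (60, 50.0)]
    for row in consolidate_per_minute(raw):
        print(row)
```

Keeping both the peak and the average matters: an average alone hides short spikes, while a peak alone exaggerates sustained load.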
True's web site is hosted in a secure area, which makes it difficult to propagate performance metrics back to the corporate network where they can be analyzed. To solve this, the graphs are created in the secure area and are pulled periodically to the intranet using a batch job.
Another significant challenge was to take the raw performance counters, such as CPU utilization, WWW connections, and SQL transactions/minute, and to convert them into something meaningful to the staff. Using our extensive background in high-traffic web sites, TitaniumSea was able to convert these raw numbers into a 0 to 100% scale that quickly communicates the current load and the remaining capacity.
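The case study does not describe how the conversion was done, so the following Python sketch shows only one plausible approach. It assumes each raw counter has a known ceiling (the value at which the system is considered full) and that a tier is as loaded as its most constrained counter. The function names, counter names, and ceiling values are all hypothetical.

```python
def capacity_percent(raw_value, ceiling):
    """Map a raw counter onto a 0 to 100% scale.

    `ceiling` is an assumed per-counter maximum, for example the number
    of connections at which a web server is considered full.
    """
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    # Clamp so that readings above the ceiling still display as 100%.
    return max(0.0, min(100.0, 100.0 * raw_value / ceiling))


def tier_load(readings, ceilings):
    """Report a tier's load as that of its most constrained counter."""
    return max(
        capacity_percent(readings[name], ceilings[name]) for name in readings
    )


if __name__ == "__main__":
    # Hypothetical readings and ceilings for one web server.
    readings = {"cpu_percent": 40.0, "www_connections": 450}
    ceilings = {"cpu_percent": 100.0, "www_connections": 600}
    load = tier_load(readings, ceilings)
    print(f"load: {load:.0f}%, remaining capacity: {100 - load:.0f}%")
```

Taking the maximum rather than an average reflects the idea that a single saturated resource is enough to limit the whole tier. Choosing realistic ceilings is the hard part and would normally come from load testing or operational experience.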