Cogo Engineering Using Presto

Cogo Labs is featured on the official AWS blog as an early adopter and success story of Presto, Facebook’s cutting-edge distributed query engine.

From the article:

Presto is an open-source distributed SQL query engine optimized for low-latency, ad-hoc analysis of data. It supports the ANSI SQL standard, including complex queries, aggregations, joins, and window functions. Presto can process data from multiple data sources including the Hadoop Distributed File System (HDFS) and Amazon S3.

Here at Cogo, we run Presto on Amazon EMR (Amazon’s managed Hadoop framework). This lets us run SQL queries directly against the 500+ TB of data we store in Amazon S3.

We love SQL here, and all of our Analysts are expected to write their own queries. Using Presto means that even as the amount of data we work with continues to grow exponentially, our Analysts can keep using their existing SQL skills to perform complicated analyses and quickly surface valuable information in that data.
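
As a rough illustration of the kind of ad-hoc query this enables, here is a minimal sketch in Python. The `events` table, its `signup_date` column, and the `run_query` helper are all hypothetical and are not taken from our actual setup. The helper assumes only a standard Python DB-API 2.0 cursor, which a Presto client library would typically provide.

```python
from typing import Any, List, Tuple

# Hypothetical table and column names, used only for illustration.
# The query combines an aggregation with a window function, two of the
# SQL features mentioned in the quoted description of Presto.
DAILY_SIGNUPS_SQL = """
SELECT
    signup_date,
    COUNT(*) AS signups,
    SUM(COUNT(*)) OVER (ORDER BY signup_date) AS running_total
FROM events
GROUP BY signup_date
ORDER BY signup_date
"""


def run_query(cursor: Any, sql: str) -> List[Tuple]:
    """Execute a SQL statement on any DB-API 2.0 cursor and return all rows."""
    cursor.execute(sql)
    return cursor.fetchall()
```

Given a cursor from whichever DB-API-compatible client is in use, `run_query(cursor, DAILY_SIGNUPS_SQL)` would return one row per day with the daily count and a running total. Keeping the helper independent of any particular client means the same SQL can be tried against a small local database before it is run on the full data set.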