Cloud Datalab

Powerful Data Exploration

Cloud Datalab is a powerful interactive tool for exploring, analyzing, transforming, and visualizing data and for building machine learning models on Google Cloud Platform. It runs on Google Compute Engine and connects easily to multiple cloud services, so you can focus on your data science tasks.

Integrated & Open Source

Cloud Datalab is built on Jupyter (formerly IPython), which boasts a thriving ecosystem of modules and a robust knowledge base. Cloud Datalab enables analysis of your data on Google BigQuery, Cloud Machine Learning Engine, Google Compute Engine, and Google Cloud Storage using Python, SQL, and JavaScript (for BigQuery user-defined functions).

Scalable

Whether you're analyzing megabytes or terabytes, Cloud Datalab has you covered. Query terabytes of data in BigQuery, run local analysis on sampled data, and run training jobs on terabytes of data in Cloud Machine Learning Engine, all from the same notebook.
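To make "local analysis on sampled data" concrete, here is a minimal sketch that uses only the Python standard library. It does not call any Cloud Datalab or BigQuery API: the `full_table` list is a synthetic stand-in for a large table, and `sample_rows` is a hypothetical helper introduced only for illustration.

```python
import random
import statistics


def sample_rows(rows, k, seed=0):
    """Return k rows chosen uniformly at random, without replacement.

    Hypothetical helper for illustration; not part of any Datalab library.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(rows, k)


# Synthetic stand-in for a large table: 100,000 made-up latency values (ms).
data_rng = random.Random(42)
full_table = [data_rng.gauss(200.0, 25.0) for _ in range(100_000)]

# Analyze a 1% sample locally instead of the whole table.
sample = sample_rows(full_table, k=1_000)
sample_mean = statistics.mean(sample)
full_mean = statistics.mean(full_table)

print(f"sample mean: {sample_mean:.1f} ms, full mean: {full_mean:.1f} ms")
```

The sample mean should land close to the full-table mean, which is why a small local sample is often enough for exploration before a full-scale query or training job.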

Data Management & Visualization

Use Cloud Datalab to gain insight from your data. Interactively explore, transform, analyze, and visualize your data using BigQuery, Cloud Storage and Python.

Machine Learning with Lifecycle Support

Go from data to deployed machine learning (ML) models ready for prediction. Explore data, then build, evaluate, and optimize ML models using TensorFlow or Cloud Machine Learning Engine.

Features

Integrated

Cloud Datalab simplifies data processing with Google BigQuery, Cloud Machine Learning Engine, Cloud Storage, and Stackdriver Monitoring. Authentication, cloud computation, and source control are taken care of out of the box.

Multi-Language Support

Cloud Datalab currently supports Python, SQL, and JavaScript (for BigQuery user-defined functions).

Notebook Format

Cloud Datalab combines code, documentation, results, and visualizations together in an intuitive notebook format.

Pay-per-use Pricing

Only pay for the cloud resources you use: Google Compute Engine VMs, BigQuery, and any additional resources you decide to use, such as Cloud Storage.

Interactive Data Visualization

Use Google Charting or matplotlib for easy visualizations.
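As an illustration of the matplotlib option, the following is a minimal sketch of a bar chart. The data is made up for the example and does not come from any real query; the `Agg` backend is selected only so the snippet also runs outside a notebook, where a notebook would normally render the figure inline.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; assumption: no display is attached
import matplotlib.pyplot as plt

# Made-up values standing in for an aggregated query result.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
row_counts = [120, 98, 143, 110, 87]

fig, ax = plt.subplots(figsize=(6, 3))
bars = ax.bar(days, row_counts)  # one bar per day
ax.set_xlabel("Day")
ax.set_ylabel("Rows")
ax.set_title("Rows per day (illustrative data)")
fig.tight_layout()
```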

Machine Learning

Supports TensorFlow-based deep ML models in addition to scikit-learn. Scales training and prediction via specialized libraries for Cloud Machine Learning Engine.

IPython Support

Cloud Datalab is based on Jupyter (formerly IPython), so you can use a large number of existing packages for statistics, machine learning, and more. Learn from published notebooks and swap tips with a vibrant IPython community.

Open Source

Developers who want to extend Cloud Datalab can fork the GitHub-hosted project and/or submit pull requests to it.