Workspaces are version-controlled Docker instances that can contain source code, libraries and dependencies, and environment settings. You can also collaborate on workspaces with other users by adding members to your project.
Workspaces are used to create living environments for easy experimentation with your models and for supporting pipeline processes. Workspaces leverage pre-installed libraries like TensorFlow, Keras, and PyTorch, and notebooks like Jupyter, Zeppelin, H2O, and others, to help you experiment, build, test, visualize, and deploy your models.
NOTE: Once a Workspace is created, it continues to hold and consume compute resources such as CPU, GPU, RAM, and storage. These resources are billed to your account (beta users are not billed). If you only need compute resources for a task that runs for a finite time, please use jobs instead.
TIP: We recommend using lower cost CPU instances initially with a notebook of your choice to browse files, modify code, import libraries, and connect to datasets before you are ready to spin up a GPU Workspace.
Creating a Workspace is easy: once you are in a project, click "Workspaces" and then the "+ CREATE" button.
NOTE: Workspaces can take up to 10 minutes to be created.
NOTE: When working in a Jupyter, H2O, or Zeppelin notebook in a container while also using your CLI, be sure to occasionally sync your version-controlled files between the two by using the 'Pull Latest' and 'Push Changes' options under the code button on the Workspaces card.
You can select either a CPU or GPU machine type.
Processing data on a CPU typically takes much longer than on a GPU. Once you are ready to start training your models, you may want to select a GPU: reaching results sooner saves both time and money and lets you iterate through algorithm changes faster.
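The sketch below shows one way training code can adapt to whichever machine type the Workspace was created with, so the same script runs on a low-cost CPU instance and on a GPU instance. It assumes PyTorch may or may not be installed in the selected environment; the helper name `pick_device` is illustrative and not part of the platform.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # available in PyTorch environments; may be absent elsewhere
    except ImportError:
        # Without PyTorch there is nothing to move to a GPU, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

With PyTorch installed, the returned string can be passed to calls such as `model.to(device)`, so no code changes are needed when you switch machine types.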
Environments are pre-built Docker images. Each image comes pre-installed with the primary libraries and all of their required dependencies. Other support tools are also pre-installed for your convenience.
Some of the libraries available:
Python 3.0 (initially called Python 3000 or py3k) was released on 3 December 2008 after a long testing period. It is a major revision of the language that is not backward-compatible with previous versions. However, many of its major features have been backported to the backward-compatible Python 2.6.x and 2.7.x version series.
R Software Environment:
R is an open source programming language and software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. Polls, surveys of data miners, and studies of scholarly literature databases show that R's popularity has increased substantially in recent years.
Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley.
TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
PyTorch is a Python package that provides two high-level features:
- Tensor computation (like NumPy) with strong GPU acceleration
- Deep neural networks built on a tape-based autograd system
You can reuse your favorite Python packages such as NumPy, SciPy and Cython to extend PyTorch when needed.
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
JupyterLab is an extensible environment for interactive and reproducible computing, based on the Jupyter Notebook and Architecture.
It is the next-generation user interface for Project Jupyter. It offers all the familiar building blocks of the classic Jupyter Notebook (notebook, terminal, text editor, file browser, rich outputs, etc.) in a flexible and powerful user interface that can be extended through third-party extensions that access its public APIs. Eventually, JupyterLab will replace the classic Jupyter Notebook.
H2O is open-source software for big-data analysis. It can be called from the statistical package R, Python, and other environments. H2O allows users to fit thousands of potential models as part of discovering patterns in data.
Apache Zeppelin is a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more.
You can choose from several sizes of SSD disk space, ranging from 10 GB to 10 TB.
If you require a larger size, please contact firstname.lastname@example.org
Once you create a Workspace, the 'Launching' indicator remains visible until the Workspace is fully launched and ready for use. During this time, the Docker image is built for you automatically. Launching completes within 2-10 minutes, depending on the size of the image and other factors. If the image is not built within 10 minutes, please email email@example.com
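If you script around workspace creation, the launch window described above can be handled with a simple poll-with-timeout loop. The following is a minimal sketch only: this page does not document an API for reading workspace status, so `get_state` is a hypothetical callable that you would supply, and the 15-second poll interval is an arbitrary choice. The 'Launching' state name and the 10-minute limit come from the text above.

```python
import time
from typing import Callable


def wait_until_ready(
    get_state: Callable[[], str],      # hypothetical: returns the current state name
    timeout_s: float = 600.0,          # 10 minutes, the upper bound quoted above
    poll_s: float = 15.0,              # assumed poll interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the state is no longer "Launching" or the timeout passes.

    Returns True if the workspace left the "Launching" state in time,
    False if the timeout was reached first.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if get_state() != "Launching":
            return True
        sleep(poll_s)
    return False


# Usage with a stand-in for the real status lookup.
fake_states = iter(["Launching", "Launching", "Running"])
ready = wait_until_ready(lambda: next(fake_states), poll_s=0.0)
print("ready" if ready else "still launching after timeout; contact support")
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in normal use the defaults are sufficient.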
Each workspace container will include:
- Name of the container
- Icons of the libraries included
- CPU or GPU machine type
- Disk space allocated
- 'OPEN' button (environment selection and SSH terminal access)
- '</> CODE' button (provides the git functions pull and push)
- 'SYSTEM' button (allows the user to resume, pause, and terminate a workspace)
- 'STATE' button (opens a window that shows detailed system information for the workspace)
'CODE' button options:
'OPEN' button options:
'STATE' button options:
You can access your workspace from the command line by clicking the '>_ SSH ACCESS' button at the bottom of the workspace container.
You can terminate a workspace by pressing the 'TERMINATE' button at the bottom of its workspace container.
NOTE: Once you terminate a workspace, any code associated with the workspace will be lost. Please make sure you have either pushed the code by using the CLI or copied it to an off-platform location.
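Before terminating, it can help to check whether the workspace still holds work that has not been pushed. The sketch below assumes the workspace code lives in a git repository with `git` available on the path (the CODE button is described above as providing git pull and push); the function names `unsaved_work` and `check_repo` are illustrative, not platform commands. It inspects the output of the standard `git status --porcelain --branch` command.

```python
import subprocess
from typing import List


def unsaved_work(status_output: str) -> List[str]:
    """List reasons a repository is not yet safe to discard.

    Expects the text printed by `git status --porcelain --branch`.
    An empty list means nothing uncommitted or unpushed was detected.
    """
    problems = []
    for line in status_output.splitlines():
        if line.startswith("## "):
            # Branch header, e.g. "## main...origin/main [ahead 2]"
            if "[ahead " in line:
                problems.append("commits not pushed")
            elif "..." not in line:
                problems.append("branch has no upstream")
        elif line.strip():
            # Entry lines look like "XY path"; the path starts at column 3.
            problems.append(f"uncommitted: {line[3:]}")
    return problems


def check_repo(path: str = ".") -> List[str]:
    """Run git in `path` and summarise any unsaved work found there."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--branch"],
        cwd=path, capture_output=True, text=True, check=True,
    )
    return unsaved_work(result.stdout)
```

Calling `check_repo()` from the workspace's code directory and getting an empty list back suggests the local branch matches what was last pushed. It does not fetch from the remote, and it does not cover files kept outside the repository.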