The purpose of the GCP Deployment Request is to give our client's Data Science team access to Boxalino datasets, so that they can run Jupyter notebook processes in the designated Anaconda environments.

...

  1. project name: as it will appear in your project list. Restrictions: spaces, - and _ are allowed.

  2. email: the requestor is the one managing the applications running on the project; this email will receive messages (alerts and notifications) when the project is ready to be used.
     Note: the email address for VM / application-run alerts is set in the instance.txt file, which is specific to each application launch.

  3. client name: also known as the Boxalino account name; this ensures access to the views, core & reports datasets.

  4. labels: optional; the labels are used as project meta-information. See Labels

  5. permissions: optional; by default, the requestor has full access and can further share it with others. See Permissions

...

|   | Property | Default | Required | Description |
|---|----------|---------|----------|-------------|
| 1 | instance-name | project name | yes | the instance name is the VM name as it appears in the Compute Engine view |
| 2 | machine-type | e2-micro | yes | the value depends on what the application needs (more CPU or more RAM); for options, please check the Google Cloud documentation |
| 3 | email-to | | yes | the email is used once, to send an alert when the VM is ready |
| 4 | home | \/home\/project-name | no | the path on the server where the content of the GCS bucket is uploaded; it is also used by the commands from the commands.txt file to launch/trigger your application execution. Alternatives: \/home\/<your-gcs-bucket>, \/srv\/app. When you SSH into the machine (ex: your email is data-science-guru@boxalino-client.com), the VM creates a directory /home/data-science-guru (this is the default for any server), so this is your local path |
| 5 | image-family | ubuntu-2004-lts | no | |
| 6 | boot-disk-size | 30 | no | |
| 7 | zone | europe-west1-b | no | this property can be left empty. Note: use a zone which is in Europe |

Code Block
instance-name:application-name
machine-type:e2-micro
email-to:data-science-guru@boxalino-client.com
home:\/home\/project-name
image-family:ubuntu-2004-lts
boot-disk-size:30
zone:europe-west1-b
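
The key:value layout above can be checked before submitting the request. The following is a minimal sketch, not part of any Boxalino tooling: the function name `parse_instance_txt` and the validation behaviour are assumptions for illustration. It splits each line on the first colon, requires the three properties marked as required in the table, and fills the listed defaults for the optional ones that are missing or left empty.

```python
# Hypothetical helper for illustration; not the parser used by the deployment service.
REQUIRED_KEYS = ("instance-name", "machine-type", "email-to")

# Defaults for optional properties, taken from the properties table.
OPTIONAL_DEFAULTS = {
    "image-family": "ubuntu-2004-lts",
    "boot-disk-size": "30",
    "zone": "europe-west1-b",
}


def parse_instance_txt(text: str) -> dict:
    """Parse 'key:value' lines, splitting on the first colon only."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key:value', got: {line!r}")
        settings[key.strip()] = value.strip()

    # Required properties must be present and non-empty.
    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        raise ValueError(f"missing required properties: {', '.join(missing)}")

    # Optional properties fall back to the documented defaults.
    for key, default in OPTIONAL_DEFAULTS.items():
        if not settings.get(key):
            settings[key] = default
    return settings


example = """instance-name:application-name
machine-type:e2-micro
email-to:data-science-guru@boxalino-client.com
zone:
"""
config = parse_instance_txt(example)
print(config["zone"], config["boot-disk-size"])
```

Values are kept as strings exactly as written, so an escaped `home` value passes through unchanged.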

...

Code Block
chmod -R 777 <home value from instance.txt>/*
papermill <home value from instance.txt>/process.ipynb <home value from instance.txt>/process-output.ipynb
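
To avoid typing the home path by hand in every command, the lines for commands.txt can be generated. This is a rough sketch under two assumptions: the helper `render_commands` is hypothetical, and the home path is passed in plain form (for example `/home/project-name`). Whether commands.txt expects the plain or the escaped form of the path is not stated here, so check that before using the output.

```python
import posixpath
import shlex
from typing import List


def render_commands(home: str, notebook: str = "process.ipynb") -> List[str]:
    """Build the chmod and papermill lines for one notebook under `home`."""
    home = home.rstrip("/")
    stem, ext = posixpath.splitext(notebook)
    source = posixpath.join(home, notebook)
    output = posixpath.join(home, f"{stem}-output{ext}")
    return [
        # The glob stays outside the quoted path so the shell can expand it.
        f"chmod -R 777 {shlex.quote(home)}/*",
        f"papermill {shlex.quote(source)} {shlex.quote(output)}",
    ]


for command in render_commands("/home/project-name"):
    print(command)
```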

...

  1. project ID: the project ID is unique; it is displayed on the dashboard of your project: https://console.cloud.google.com/home/dashboard

  2. GCS bucket name: the name of the bucket where the Required Files are located (ex: gs://<project-name>-<app-name>); its contents will be made available on the application as well.
     Note: the bucket must be located in Europe; use either EU (multi-region) or europe-west1 (single region).
     Note: the bucket name must be unique; for this reason, we recommend that every bucket name starts with your project name.

  3. access code: as provided by Boxalino
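
The recommendation to start every bucket name with the project name can be sketched as a small helper. The function `suggest_bucket_name` is hypothetical, and the pattern is a simplified version of the Cloud Storage naming rules (3 to 63 characters; lowercase letters, digits, `-`, `_` and `.`; starting and ending with a letter or digit). It does not cover every rule, such as reserved prefixes or the extra constraints on names containing dots. Replacing spaces with hyphens is an assumption made because project names may contain spaces while bucket names may not.

```python
import re

# Simplified check; see the lead-in for what is deliberately left out.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def suggest_bucket_name(project_name: str, app_name: str) -> str:
    """Return '<project-name>-<app-name>' normalised for use as a bucket name."""
    name = f"{project_name}-{app_name}".strip().lower().replace(" ", "-")
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"not a valid bucket name: {name!r}")
    return name


print("gs://" + suggest_bucket_name("project-name", "app-name"))
```

A valid name is still not guaranteed to be free, since bucket names are shared across all Cloud Storage users.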

Tip

Once the application has been launched, a script will initialize the environment and load all your content from the GCS bucket.

You can then log in to the virtual machine via SSH to check the output or inspect the contents.

...

In the MONITORING view of your Application, you are able to track the resources available & consumed:

  1. CPU utilization

  2. Memory Utilization

  3. Disk Space Utilization
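
The same three figures can also be spot-checked from inside the VM after logging in via SSH. The sketch below is an illustration only and is unrelated to how the MONITORING view collects its data: `resource_snapshot` is a hypothetical helper that uses the Python standard library, reads memory figures from `/proc/meminfo` (Linux only), and reports the load average as a rough stand-in for CPU utilization.

```python
import os
import shutil


def resource_snapshot(path: str = "/") -> dict:
    """Collect a rough CPU / memory / disk snapshot for the current machine."""
    snapshot = {"cpu_count": os.cpu_count()}

    # Disk: share of the filesystem holding `path` that is in use.
    usage = shutil.disk_usage(path)
    if usage.total:
        snapshot["disk_used_percent"] = round(100 * usage.used / usage.total, 1)

    # CPU: 1-minute load average, available on Unix-like systems only.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_1min"] = os.getloadavg()[0]
        except OSError:
            pass

    # Memory: derived from MemTotal and MemAvailable (values are in kB).
    if os.path.exists("/proc/meminfo"):
        meminfo = {}
        with open("/proc/meminfo") as handle:
            for line in handle:
                key, _, rest = line.partition(":")
                parts = rest.split()
                if parts and parts[0].isdigit():
                    meminfo[key] = int(parts[0])
        total = meminfo.get("MemTotal")
        available = meminfo.get("MemAvailable")
        if total and available is not None:
            snapshot["memory_used_percent"] = round(100 * (total - available) / total, 1)
    return snapshot


print(resource_snapshot())
```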

...

Application Delete

If you want to stop the application, you can freely delete it from your Compute Engine view, or use the form provided by Boxalino https://gcp-deploy-du3do2ydza-ew.a.run.app/instance


BigQuery access

  1. As a data scientist, chances are that you have been provided with a Service Account (SA) to access the client's private projects.

  2. The application is run by the project's own Compute Engine Service Account (CE SA).
    Because the project is in the scope of Boxalino, it will have direct read access to the client's datasets.

...