The purpose of the GCP Deployment Request is to give our client's Data Science team access to Boxalino datasets, so that they can run Jupyter notebook processes in the designed Anaconda environments.

...

  1. Python 3.7

  2. git

  3. Anaconda3

  4. pip / pip3

  5. setuptools

  6. papermill

  7. jupyter

  8. google-api-python-client & google SDK libraries

...

Steps

  1. Project Deploy

    1. Make a GCP Project Deployment Request with the Required Information.

    2. Boxalino will shortly provide a GCP project to the requestor.

  2. Billing Information

    1. Set the billing account on the new project.

    2. This is required in order to be able to use the GCP resources.

  3. Application Content

    1. Prepare the Required Files (application structure).

    2. Load the content in a GCS bucket from the project.

  4. Application Launch

    1. Launch the application.

Info

Your user email (as the requestor) will be given the Editor, Owner and Project Billing Manager roles.

Share access with the other people who need to work on the project.

Tip

The application is launched in a VM in the project, where the commands from commands.txt are executed. Additionally, you can SSH into the VM to update or check content.

...

  1. BigQuery Data Editor : <client>_lab, <client>_views

  2. BigQuery Data Viewer : <client>_core, <client>_stage, <client>_reports, <client>_intelligence
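
The dataset names in the list above follow from the client name. A minimal sketch, assuming the <client> placeholder is replaced verbatim by the Boxalino account name; the helper name dataset_access and the sample client "acme" are hypothetical:

```python
from __future__ import annotations

# Role-to-suffix mapping copied from the list above.
DATASET_ROLES = {
    "BigQuery Data Editor": ["lab", "views"],
    "BigQuery Data Viewer": ["core", "stage", "reports", "intelligence"],
}


def dataset_access(client: str) -> dict[str, list[str]]:
    """Return the dataset names per role for one client (hypothetical helper)."""
    return {
        role: [f"{client}_{suffix}" for suffix in suffixes]
        for role, suffixes in DATASET_ROLES.items()
    }


if __name__ == "__main__":
    # "acme" is a made-up client name used only for illustration.
    for role, datasets in dataset_access("acme").items():
        print(role, "->", ", ".join(datasets))
```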

1. Project Deploy

Required Information

In order to create a GCP project in which the application will be run, the following information is required:

  1. project name: as it will appear in your projects list. Naming requirements: spaces, - and _ are allowed.

  2. email: the requestor is the one managing the applications running on the project. This email will receive messages (alerts and notifications) when the project is ready to be used. Note: the email for the VM / application run alerts is part of the instance.txt file, specific to every application launch.

  3. client name: also known as the Boxalino account name. This ensures access to the views, core & reports datasets (https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/303792129/GCP+Project+Deployment#BigQuery-Datasets-Access).

  4. labels: optional. The labels are used as project meta-information. See Labels.

  5. permissions: optional. By default, the requestor will have full access and can further share with others. See Permissions.

Once the project is created (2-3 min), the requestor will have access to it, in their Google Cloud Console.

Tip

As an editor on the project, the requestor will be able to:

...

Labels (optional)

Labels are key-value pairs meant to better organize the projects.

...

More information on labels: https://cloud.google.com/resource-manager/docs/creating-managing-labels
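
Before sending labels with a request, it can help to check them locally. The sketch below is an ASCII-only approximation of the rules in the Google documentation linked above (lowercase letters, digits, underscores and dashes; keys start with a letter; at most 63 characters). The helper name invalid_labels and the sample labels are hypothetical, and international characters, which Google also allows, are rejected here:

```python
import re

# ASCII-only approximation of the Google Cloud label rules.
_KEY_RE = re.compile(r"[a-z][a-z0-9_-]{0,62}")
_VALUE_RE = re.compile(r"[a-z0-9_-]{0,63}")


def invalid_labels(labels):
    """Return the sorted keys whose key or value breaks the rules (hypothetical helper)."""
    return sorted(
        key
        for key, value in labels.items()
        if not (_KEY_RE.fullmatch(key) and _VALUE_RE.fullmatch(value))
    )


if __name__ == "__main__":
    # Made-up labels: "Team" is rejected because of the uppercase letter.
    print(invalid_labels({"team": "data-science", "Team": "lab"}))
```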

Permissions (optional)

The permissions are added when the project is created.

  • By default, the requestor's email has the project editor role

  • Once the project is released, the requestor can add more emails / users to the IAM policies of the project.

...

Code Block
user:dana@boxalino.com:roles/editor
user:dana@boxalino.com:roles/resourcemanager.projectIamAdmin
user:dana@boxalino.com:roles/compute.osLogin
user:dana@boxalino.com:roles/compute.osAdminLogin
user:dana@boxalino.com:roles/bigquery.dataOwner
serviceAccount:service-account-from-other-projects:roles/iam.serviceAccountUser
serviceAccount:service-account-from-other-projects:roles/bigquery.dataOwner
serviceAccount:service-account-from-other-projects:roles/bigquery.dataEditor
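
Each line above has the form member:role, where the member keeps its own user: or serviceAccount: prefix. As an illustration only, the sketch below groups such lines into the role/members binding shape used by IAM policies. The helper name parse_permissions is hypothetical, and how Boxalino processes these lines internally is not stated in this page:

```python
from collections import defaultdict


def parse_permissions(lines):
    """Group member:role lines into IAM-style bindings (hypothetical helper)."""
    members_by_role = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The role is everything after the last colon; the member keeps
        # its "user:" or "serviceAccount:" prefix.
        member, role = line.rsplit(":", 1)
        members_by_role[role].append(member)
    return [
        {"role": role, "members": members}
        for role, members in sorted(members_by_role.items())
    ]


if __name__ == "__main__":
    sample = [
        "user:dana@boxalino.com:roles/editor",
        "user:dana@boxalino.com:roles/bigquery.dataOwner",
    ]
    for binding in parse_permissions(sample):
        print(binding)
```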

...

More information on permissions: https://cloud.google.com/iam/docs/understanding-roles

2. Billing Information

To access Google Cloud resources, a billing account must be set on the project.

To set it:

  1. Go to the Billing menu in the GCP console, or open the list of billing projects: https://console.cloud.google.com/billing/projects

  2. Identify the project and click on the 3 dots. Select “Change Billing”

  3. In the window that appears, select the Billing Account to which the costs of the application will be billed.

...

If you do not have access to a billing account, grant the Project Billing Manager role to someone who does, using the IAM menu: https://console.cloud.google.com/iam-admin/iam
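
The billing account can also be set from a terminal instead of the console. A minimal sketch, assuming the gcloud CLI is installed and authenticated with an account that has billing permissions. The function names are hypothetical, the project ID and billing account ID are placeholders, and depending on the SDK version the command may be available only under gcloud beta:

```python
import subprocess


def billing_link_command(project_id, billing_account_id):
    """Build the gcloud call that links a project to a billing account."""
    return [
        "gcloud", "billing", "projects", "link", project_id,
        f"--billing-account={billing_account_id}",
    ]


def link_billing(project_id, billing_account_id):
    """Run the command; requires an installed and authenticated gcloud CLI."""
    subprocess.run(billing_link_command(project_id, billing_account_id), check=True)


if __name__ == "__main__":
    # Placeholder values; printing the command avoids changing any project.
    print(" ".join(billing_link_command("my-project-id", "XXXXXX-XXXXXX-XXXXXX")))
```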

3. Application Content

Info

In order to launch the application, the source files must be loaded in a Google Storage Bucket https://console.cloud.google.com/

...

storage/browser

Note

The Google Storage Bucket must have a unique name. For this reason, we recommend that every bucket name starts with your project name.
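
One way to follow this recommendation is to derive the bucket name from the project name. The sketch below is an assumption-laden illustration: the helper name bucket_name and the suffix argument are hypothetical, and the check is a simplified form of the Cloud Storage naming rules (3-63 characters, lowercase letters, digits, dashes and underscores, starting and ending with a letter or digit). Dots and reserved prefixes are not handled:

```python
import re

# Simplified bucket-name pattern; see the lead-in for what is left out.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def bucket_name(project_name, suffix):
    """Prefix a bucket name with the project name (hypothetical helper)."""
    raw = f"{project_name}-{suffix}".lower()
    # Project names may contain spaces, which bucket names may not.
    name = re.sub(r"[^a-z0-9_-]+", "-", raw).strip("-_")
    if not _BUCKET_RE.fullmatch(name):
        raise ValueError(f"not a valid bucket name: {name!r}")
    return name


if __name__ == "__main__":
    # "My Project" and "app-files" are made-up values.
    print(bucket_name("My Project", "app-files"))
```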

Required Files

  1. instance.txt: properties for the Virtual Machine (VM), such as name, size, home path, etc. (see instance.txt)

  2. requirements.txt: environment requirements for the pip/anaconda install (see requirements.txt)

    • pip freeze > requirements.txt - command to create the file from a tested environment

  3. commands.txt: a list of commands to be executed as part of your application run process (see commands.txt). It can also be left empty, if you choose to SSH into the VM and run your own processes from the project's scope.

  4. env.yml: (optional) anaconda environment file. If no file is provided, the environment is not created.

  5. your jupyter/python/application files: the content of your application (Python files, Jupyter notebooks, etc.)
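
Before loading the content into the bucket, a quick local check can confirm that the required files are present. A minimal sketch, assuming the files sit at the top level of the application directory; the helper name missing_files is hypothetical, and env.yml is not checked because it is optional:

```python
from pathlib import Path

# File names taken from the list above; env.yml is optional and not checked.
REQUIRED_FILES = ("instance.txt", "requirements.txt", "commands.txt")


def missing_files(app_dir):
    """Return the required files that are absent from app_dir (hypothetical helper)."""
    root = Path(app_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Checks the current directory; an empty list means nothing is missing.
    print(missing_files("."))
```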

...

Code Block
name: gcp-application-name
channels:
  - defaults
dependencies:
  - ca-certificates=2020.1.1=0
  - <a list of dependencies>
  - pip:
      - google-api-core==1.22.2
      - google-api-python-client==1.9.3
      - google-auth==1.17.2
      - <more-libraries required for the application>
prefix: /opt/conda/envs/gcp-application-env

4. Application Launch

Note

Before launching the application, make sure that the Required Files are uploaded in a GCS bucket.

...

Code Block
sudo gsutil rsync -r gs://<BUCKET>/ <APPLICATION-PATH>
Note

Replace <BUCKET> with your storage bucket name (where the application files have been loaded).

Replace <APPLICATION-PATH> with the path to your application (default: /home/project-name).
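
If the sync is scripted rather than typed by hand, the same command can be assembled in Python. A minimal sketch, assuming gsutil is available on the VM; the function names and the sample bucket name are hypothetical, and the default path is the one mentioned above:

```python
import subprocess


def rsync_command(bucket, application_path, use_sudo=True):
    """Build the gsutil rsync call shown above for a given bucket and path."""
    command = ["gsutil", "rsync", "-r", f"gs://{bucket}/", application_path]
    return ["sudo", *command] if use_sudo else command


def sync_application(bucket, application_path):
    """Copy the bucket content to the VM; requires gsutil on the PATH."""
    subprocess.run(rsync_command(bucket, application_path), check=True)


if __name__ == "__main__":
    # Made-up bucket name; printing the command avoids touching any bucket.
    print(" ".join(rsync_command("my-project-app-files", "/home/project-name")))
```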

...