The purpose of the GCP Deployment Request is to give our client's Data Science team access to Boxalino datasets, so that they can run Jupyter notebook processes in the Anaconda environments designed for them.
...
Python 3.7
git
Anaconda3
pip / pip3
setuptools
papermill
jupyter
google-api-python-client & google SDK libraries
...
Steps
Project Deploy
Make a GCP Project Deployment Request with the Required Information.
Your user email (as the requestor) will be given the editor role.
A GCP project will be provided to the requestor.
Billing Information
Set the billing account on the new project.
This is required in order to be able to use the GCP resources.
Application Content
Prepare the Required Files (application structure).
Load the content in a GCS bucket from the project.
Prepare the content for the Application Launch.
Launch the application.
Info |
---|
Your user email (as the requestor) will be given the Editor, Owner and Project Billing Manager roles. Share access with other people who need to work on the project. |
Tip |
---|
The application is launched in a VM in the project, and the commands from commands.txt are executed. Additionally, you can SSH into the VM to check or update content. |
...
BigQuery Data Editor : <client>_lab, <client>_views
BigQuery Data Viewer : <client>_core, <client>_stage, <client>_reports, <client>_intelligence
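The dataset names above follow a `<client>_<suffix>` pattern. As a minimal sketch (the client name `acme` is a hypothetical placeholder, and the mapping is only copied from the two lines above), the names can be expanded for a given Boxalino account like this:

```python
from typing import Dict, List

# Role -> dataset suffixes, copied from the access list above.
DATASET_ROLES: Dict[str, List[str]] = {
    "BigQuery Data Editor": ["lab", "views"],
    "BigQuery Data Viewer": ["core", "stage", "reports", "intelligence"],
}


def datasets_for_client(client: str) -> Dict[str, List[str]]:
    """Expand the <client>_<suffix> pattern for one Boxalino account name."""
    return {
        role: [f"{client}_{suffix}" for suffix in suffixes]
        for role, suffixes in DATASET_ROLES.items()
    }


if __name__ == "__main__":
    # "acme" is a made-up client name used only for illustration.
    for role, names in datasets_for_client("acme").items():
        print(role, "->", ", ".join(names))
```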
1. Project Deploy
Required Information
To create a GCP project deployment request for the project in which the application will run, please provide Boxalino with the following information:
1 | project name | as it will appear in your list of projects |
2 | the requestor is the one managing the applications running on the project; this email will receive messages (alerts and notifications) when the project is ready to be used | |
3 | client name | (also known as the Boxalino account name) this ensures access to the views, core & reports datasets (https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/303792129/GCP+Project+Deployment#BigQuery-Datasets-Access) |
4 | optional; the labels are used as project meta-information. see Labels | |
5 | optional; by default, the requestor will have full access and can further share with others. see Permissions |
Once the project is created (2-3 min), the requestor will have access to it, in their Google Cloud Console.
Tip |
---|
As an editor on the project, the requestor will be able to:
|
...
Labels (optional)
Labels are key-value pairs meant to better organize the projects.
...
More information on labels: https://cloud.google.com/resource-manager/docs/creating-managing-labels
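Before sending labels with the request, it can help to check them locally. The sketch below follows the label requirements in Google's documentation as I understand them (keys start with a lowercase letter, keys and values use lowercase letters, digits, underscores and dashes, at most 63 characters each, at most 64 labels). It is restricted to ASCII for simplicity, and the sample labels are hypothetical.

```python
import re
from typing import Dict, List

# ASCII-only simplification of the documented label rules.
KEY_RE = re.compile(r"[a-z][a-z0-9_-]{0,62}")
VALUE_RE = re.compile(r"[a-z0-9_-]{0,63}")  # values may be empty
MAX_LABELS = 64


def label_problems(labels: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the labels look valid."""
    problems: List[str] = []
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        if not KEY_RE.fullmatch(key):
            problems.append(f"invalid key: {key!r}")
        if not VALUE_RE.fullmatch(value):
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    print(label_problems({"team": "data-science", "Env": "Lab"}))
```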
Permissions (optional)
The permissions are added when the project is created.
By default, the requestor's email has the project editor role.
Once the project is released, the requestor can add more emails / users to the IAM policies of the project.
...
Code Block |
---|
user:dana@boxalino.com:roles/editor
user:dana@boxalino.com:roles/resourcemanager.projectIamAdmin
user:dana@boxalino.com:roles/compute.osLogin
user:dana@boxalino.com:roles/compute.osAdminLogin
user:dana@boxalino.com:roles/bigquery.dataOwner
serviceAccount:service-account-from-other-projects:roles/iam.serviceAccountUser
serviceAccount:service-account-from-other-projects:roles/bigquery.dataOwner
serviceAccount:service-account-from-other-projects:roles/bigquery.dataEditor |
...
More information on permissions: https://cloud.google.com/iam/docs/understanding-roles
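Each permission entry in the example has the shape `<member-type>:<identity>:<role>`. As a rough sketch, assuming that shape holds for every entry and that only predefined roles (prefixed with `roles/`) are used, the entries can be parsed and grouped by role, which mirrors how IAM policy bindings list members per role. The function names are illustrative and not part of any Boxalino tooling.

```python
from typing import Dict, Iterable, List, NamedTuple


class Binding(NamedTuple):
    member: str  # e.g. "user:dana@boxalino.com"
    role: str    # e.g. "roles/editor"


def parse_binding(entry: str) -> Binding:
    """Split '<member-type>:<identity>:<role>' at the last colon."""
    member, sep, role = entry.strip().rpartition(":")
    # Assumption: only predefined roles ("roles/...") are accepted in this sketch.
    if not sep or ":" not in member or not role.startswith("roles/"):
        raise ValueError(f"unexpected permission entry: {entry!r}")
    return Binding(member, role)


def group_by_role(entries: Iterable[str]) -> Dict[str, List[str]]:
    """Collect members per role, in the order the entries were given."""
    grouped: Dict[str, List[str]] = {}
    for entry in entries:
        binding = parse_binding(entry)
        grouped.setdefault(binding.role, []).append(binding.member)
    return grouped


if __name__ == "__main__":
    sample = [
        "user:dana@boxalino.com:roles/editor",
        "serviceAccount:service-account-from-other-projects:roles/bigquery.dataOwner",
    ]
    print(group_by_role(sample))
```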
2. Billing Information
To access Google Cloud resources, a billing account must be set on the project.
To do this:
1. Go to the Billing menu in the GCP console, or check the billing projects at https://console.cloud.google.com/billing/projects
2. Identify the project and click on the 3 dots. Select “Change Billing”.
3. In the window that appears, select the Billing Account to which the costs of the application will be billed.
...
If you do not have access to a billing account, grant the Project Billing Manager role to someone who does. Use the IAM menu for this: https://console.cloud.google.com/iam-admin/iam
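The console steps above also have a command-line counterpart, `gcloud billing projects link`. The sketch below only builds that command and does not run it. The project ID and billing account ID are hypothetical placeholders, and this route assumes the Google Cloud SDK is installed and that your user is allowed to link billing accounts.

```python
import shlex
from typing import List


def link_billing_command(project_id: str, billing_account_id: str) -> List[str]:
    """Return the argv for `gcloud billing projects link`; nothing is executed here."""
    if not project_id or not billing_account_id:
        raise ValueError("both a project ID and a billing account ID are required")
    return [
        "gcloud", "billing", "projects", "link", project_id,
        f"--billing-account={billing_account_id}",
    ]


if __name__ == "__main__":
    # Placeholder IDs; replace them with your own values.
    argv = link_billing_command("my-project", "0X0X0X-0X0X0X-0X0X0X")
    print(" ".join(shlex.quote(part) for part in argv))
```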
3. Application Content
Info |
---|
In order to launch the application, the source files must be loaded in a Google Storage Bucket https://console.cloud.google.com/ |
...
Note |
---|
The Google Storage Bucket must have a unique name. For this reason, we recommend that every bucket name starts with your project name. |
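One way to apply the naming recommendation is to derive the bucket name from the project name and check it before creating the bucket. This is a sketch under assumptions: the `-<suffix>` layout is my own choice, and the check covers only the basic Cloud Storage naming rules (3 to 63 characters, lowercase letters, digits, dashes and underscores, starting and ending with a letter or digit). Names containing dots are deliberately left out because additional rules apply to them.

```python
import re

# Basic bucket-name check; dotted names are intentionally not supported here.
BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def bucket_name(project: str, suffix: str) -> str:
    """Build '<project>-<suffix>' and validate it against the basic naming rules."""
    name = f"{project}-{suffix}".lower()
    if not BUCKET_RE.fullmatch(name):
        raise ValueError(f"not a valid bucket name: {name!r}")
    return name


if __name__ == "__main__":
    # "my-project" and "app" are placeholders.
    print(bucket_name("my-project", "app"))
```

Starting with the project name does not guarantee uniqueness, but it makes collisions with other users' buckets much less likely.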
Required Files
1 | instance.txt | properties for the Virtual Machine (VM) (name, size, home path, etc.) (see instance.txt) |
2 | requirements.txt | environment requirements (for pip/anaconda install) (see requirements.txt) |
3 | commands.txt | a list of commands to be executed as part of your application run process (see commands.txt) |
4 | env.yml | (optional) anaconda environment file |
5 | your jupyter/python/application files | the content of your application (in python, jupyter notebooks, etc) |
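Before uploading, a quick local check can confirm that the required files from the table are present. This is a minimal sketch that assumes the files sit at the top level of the application directory; the helper name is illustrative.

```python
from pathlib import Path
from typing import List, Union

# File names taken from the Required Files table; env.yml is optional.
REQUIRED_FILES = ("instance.txt", "requirements.txt", "commands.txt")
OPTIONAL_FILES = ("env.yml",)


def missing_required_files(app_dir: Union[str, Path]) -> List[str]:
    """Return the required file names that are not present in app_dir."""
    root = Path(app_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_required_files(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required files are present.")
```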
...
Code Block |
---|
name: gcp-application-name
channels:
  - defaults
dependencies:
  - ca-certificates=2020.1.1=0
  - <a list of dependencies>
  - pip:
    - google-api-core==1.22.2
    - google-api-python-client==1.9.3
    - google-auth==1.17.2
    - <more-libraries required for the application>
prefix: /opt/conda/envs/gcp-application-env |
4. Application Launch
Note |
---|
Before launching the application, make sure that the Required Files are uploaded to a GCS bucket. |
...
Code Block |
---|
sudo gsutil rsync -r gs://<BUCKET>/ <APPLICATION-PATH> |
Note |
---|
Replace <BUCKET> with your storage bucket name (where the application files have been loaded). Replace <APPLICATION-PATH> with the path to your application (default: /home/project-name). |
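The placeholder substitution described above can be scripted. The sketch below only assembles the same `gsutil rsync` command with the two placeholders filled in and does not execute anything. The bucket name and path in the example are hypothetical.

```python
import shlex
from typing import List


def rsync_command(bucket: str, application_path: str, use_sudo: bool = True) -> List[str]:
    """Build the argv for: [sudo] gsutil rsync -r gs://<BUCKET>/ <APPLICATION-PATH>."""
    if not bucket or not application_path:
        raise ValueError("bucket and application_path must not be empty")
    command = ["gsutil", "rsync", "-r", f"gs://{bucket}/", application_path]
    return ["sudo"] + command if use_sudo else command


if __name__ == "__main__":
    # Placeholder values; the path follows the documented default pattern /home/project-name.
    argv = rsync_command("my-project-app", "/home/my-project")
    print(" ".join(shlex.quote(part) for part in argv))
```

Keeping the command as a list of arguments makes it safe to pass to `subprocess.run` later without shell quoting issues.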
...