Tip |
---|
At Boxalino we strongly advocate ELT flows (vs ETL). The major benefits are speed, transparency and ease of maintenance: data is loaded directly into a destination system (BQ) and transformed in parallel (Dataform). |
...
Info |
---|
Because this step was done in-house by Boxalino for the POC of our ELT solution, a more detailed definition will be provided. |
Dataform
The transformation is done with Google Dataform (https://cloud.google.com/dataform).
This implies the following:
The client has access to a GCP project
The client will create a Dataform repository https://cloud.google.com/dataform/docs/repositories
The client has access to a GitHub or GitLab repository (to connect it to the Dataform repository) https://cloud.google.com/dataform/docs/connect-repository
The client has granted the “Dataform Admin” role to the Boxalino service account:
boxalino-dataform@rtux-data-integration.iam.gserviceaccount.com
The DI-SAAS request
The DI request uses the same headers (client, tm, mode, type, authorization) and a JSON request body that maps the loaded .jsonl files to the meaning of the data they contain.
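As a minimal sketch of the request described above, the helper below checks that the five shared headers are present and serializes a JSON body. Only the header names come from this page. The function name `build_di_request`, the added `Content-Type` header, the placeholder values and the mapping keys are assumptions for illustration; the actual body schema and endpoint are not defined here.

```python
import json
from typing import Dict, Tuple

# Header names as listed in the text; everything else is illustrative.
REQUIRED_HEADERS = ("client", "tm", "mode", "type", "authorization")


def build_di_request(headers: Dict[str, str], mapping: dict) -> Tuple[Dict[str, str], bytes]:
    """Validate the shared DI headers and serialize the JSON request body.

    `mapping` stands in for the mapping details between the loaded .jsonl
    files and their meaning; its structure is an assumption of this sketch.
    """
    missing = [name for name in REQUIRED_HEADERS if not headers.get(name)]
    if missing:
        raise ValueError("missing DI headers: " + ", ".join(missing))

    # Keep only the shared headers and declare the body as JSON (assumed).
    request_headers = {name: str(headers[name]) for name in REQUIRED_HEADERS}
    request_headers["Content-Type"] = "application/json"

    body = json.dumps(mapping).encode("utf-8")
    return request_headers, body


if __name__ == "__main__":
    # Placeholder values only; replace with the values used for the file loads.
    hdrs, payload = build_di_request(
        {
            "client": "<account>",
            "tm": "<tm>",
            "mode": "<mode>",
            "type": "<type>",
            "authorization": "<authorization>",
        },
        {"<hypothetical-key>": "<file>.jsonl"},
    )
    print(sorted(hdrs), len(payload))
```

Validating the headers before sending makes a mismatch with the earlier file-load requests (same client, tm, mode and type) fail early rather than at the service.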
...
Panel | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
The use of the header |
Tip |
---|
Steps #1 and #2 must be repeated for every file that has to be added for the given process (same tm, mode & type). Only after the full content is available in GCS can you move on to step #3. |
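The ordering in this tip can be sketched as a small driver: steps #1 and #2 run once per file, and step #3 runs a single time after every file has been handled. The function `run_process` and its callable parameters are hypothetical; the real steps are the requests described on this page and are passed in by the caller.

```python
from typing import Callable, Iterable, List


def run_process(
    files: Iterable[str],
    step_1: Callable[[str], None],
    step_2: Callable[[str], None],
    step_3: Callable[[], None],
) -> List[str]:
    """Run steps #1 and #2 for each file, then step #3 exactly once."""
    loaded: List[str] = []
    for path in files:
        step_1(path)  # step #1 for this file
        step_2(path)  # step #2 for this file
        loaded.append(path)

    if not loaded:
        # Nothing was loaded, so there is no content for step #3 to use.
        raise ValueError("no files were loaded; step #3 was not run")

    step_3()  # only after all files have gone through steps #1 and #2
    return loaded
```

If either per-file step raises, the exception propagates and step #3 is never reached, which matches the requirement that the full content be available first.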
Tip |
---|
After all required documents (doc) for the given |
...