Tip |
---|
At Boxalino we strongly advocate ELT flows (vs ETL). The major benefits are speed, transparency and maintenance: data is loaded directly into a destination system (BQ) and transformed in parallel (Dataform). |
...
The client has access to a GCP project
The client will create a Dataform repository: https://cloud.google.com/dataform/docs/repositories
The client has access to a GitHub or GitLab repository (to connect it to the Dataform repository): https://cloud.google.com/dataform/docs/connect-repository
The client has given “Dataform Admin” permission to the Boxalino Service Account boxalino-dataform@rtux-data-integration.iam.gserviceaccount.com
When using dataform for transforming the exported data (your custom CSV/JSONL files) into the Boxalino Data Structure, the JSON body (for the SYNC REQUEST) has the following information (next to connector and di):
Code Block |
---|
"transform": {
    "vars": {
        "var1": "value"
    },
    "tags": [
        "order"
    ],
    "dataform": {
        "project": "rtux-data-integration",
        "location": "europe-west1",
        "repository": "boxalino-di-saas-elt",
        "workspace": "dataform"
    }
} |
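As a rough illustration, the transform section can be assembled in Python and placed in the SYNC request body next to connector and di. This is only a sketch under assumptions: the helper name `build_sync_body` is hypothetical, and the connector and di contents are empty placeholders because they are not described here. The dataform, tags and vars values are the ones from the example above.

```python
import json


def build_sync_body(connector, di, dataform, tags, variables):
    """Assemble a SYNC request body that includes a "transform" section.

    Hypothetical helper: `connector` and `di` are passed through unchanged.
    """
    return {
        "connector": connector,
        "di": di,
        "transform": {
            "vars": dict(variables),
            "tags": list(tags),
            "dataform": dict(dataform),
        },
    }


body = build_sync_body(
    connector={},  # placeholder: connector details are defined elsewhere
    di={},         # placeholder: di mapping details are defined elsewhere
    dataform={
        "project": "rtux-data-integration",
        "location": "europe-west1",
        "repository": "boxalino-di-saas-elt",
        "workspace": "dataform",
    },
    tags=["order"],
    variables={"var1": "value"},
)

# Serialize the body as it would be sent in the request.
print(json.dumps(body, indent=2))
```

Building the body as a dictionary and serializing it once keeps the transform section valid JSON and makes it easy to swap the tags or vars per process.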
Note |
---|
The values for |
The DI-SAAS SYNC request
The DI request uses the same headers (client, tm, mode, type, authorization) and a JSON request body that provides the mapping between the loaded .jsonl files and their data meaning.
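The request described above could be prepared roughly as follows. This is a sketch, not the official client: the function name `build_di_request`, the endpoint URL, the POST method, the Content-Type header and all header values are assumptions or placeholders. Only the five header names come from the text. The request is built but not sent.

```python
import json
import urllib.request


def build_di_request(url, client, tm, mode, data_type, authorization, body):
    """Build (but do not send) a DI request with the five shared headers."""
    headers = {
        "client": client,
        "tm": tm,
        "mode": mode,
        "type": data_type,
        "authorization": authorization,
        # Assumption: the JSON body is declared via Content-Type.
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",  # assumption: the HTTP method is not stated here
    )


req = build_di_request(
    url="https://example.invalid/sync",  # placeholder, not a real endpoint
    client="<client>",
    tm="<tm>",
    mode="<mode>",
    data_type="<type>",
    authorization="<authorization>",
    body={"connector": {}, "di": {}},  # placeholder body
)
```

Passing `req` to `urllib.request.urlopen` would send it. Reusing one function for every call of a process helps keep tm, mode and type identical across requests.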
...
Panel |
---|
The use of the header |
Tip |
---|
Steps #1 and #2 must be repeated for every file that is required for the given process (same tm, mode & type). Only after all the files are available in GCS can you move on to step #3. |
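The ordering rule above can be sketched as a small driver loop. The three step functions are hypothetical stand-ins supplied by the caller, since the individual steps are not reproduced here, and the file names are examples only.

```python
def run_process(files, step_one, step_two, step_three):
    """Run steps #1 and #2 for every file, then step #3 exactly once."""
    for path in files:
        result = step_one(path)   # step #1 for this file
        step_two(path, result)    # step #2 for this file
    # Reached only after steps #1 and #2 completed for every file.
    return step_three()


calls = []  # records the order in which the steps ran


def fake_step_one(path):
    calls.append(("step1", path))
    return "result-for-" + path


def fake_step_two(path, result):
    calls.append(("step2", path))


def fake_step_three():
    calls.append(("step3", None))
    return "done"


outcome = run_process(
    ["products.jsonl", "orders.jsonl"],  # example file names
    fake_step_one,
    fake_step_two,
    fake_step_three,
)
```

Because step #3 sits after the loop, an exception raised while handling any file stops the run before step #3 is reached.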
Tip |
---|
After all required documents (doc) for the given |
...