Tip |
---|
At Boxalino we strongly advocate ELT flows (vs. ETL). The major benefits are speed, transparency and maintainability: data is loaded directly into the destination system (BQ) and transformed in parallel (Dataform). |
...
For content over 32MB, we provide an endpoint that returns a Signed GCS URL. Everything you stream to that URL is written to a single file (currently there is no defined file size limit in GCS).
Read more about Google Cloud Signed URLs: https://cloud.google.com/storage/docs/access-control/signed-urls (response samples, uses, etc.)
1. Make a request for a public upload link
...
Code Block |
---|
curl --connect-timeout 60 --max-time 300 "https://boxalino-di-stage-krceabfwya-ew.a.run.app/transformer/load/url" \
-X POST \
-H "Content-Type: application/json" \
-H "client: <account>" \
-H "dev: true|false" \
-H "tm: YYYYmmddHHiiss" \
-H "type: product|content|order|user|communication_history|communication_planning|user_generated_content" \
-H "mode: F|D|I|E" \
-H "chunk: <id>" \
-H "doc: <filename>" \
-H "Authorization: Basic <encode of the account>" |
...
Code Block |
---|
curl --connect-timeout 60 --max-time 0 "<GCS-signed-url>" \
-X PUT \
-H "Content-Type: application/octet-stream" \
-d "<YOUR DOCUMENT JSONL CONTENT (STREAM)>" |
...
Panel | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
The use of the header |
Tip |
---|
Steps #1 and #2 must be repeated for every file that needs to be added for the given process (same tm, mode & type). Only after all files are available in GCS can you move on to step #3. |
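The repetition of steps #1 and #2 per file could be scripted roughly as below. This is a minimal sketch, not the official client: the function names (`request_signed_url`, `upload_file`, `upload_all`) and the variable names are hypothetical, the placeholder values must be replaced with your own, and it assumes that the step #1 response body is the signed URL as plain text, which the documentation above does not confirm.

```shell
#!/bin/sh
# Sketch only: repeat step #1 (request signed URL) and step #2 (upload) per file.
# ASSUMPTIONS, not confirmed by the documentation:
#   - the step #1 response body is the signed URL as plain text;
#   - ACCOUNT, AUTH, CHUNK are placeholders to be replaced with real values.
DI_ENDPOINT="https://boxalino-di-stage-krceabfwya-ew.a.run.app/transformer/load/url"
ACCOUNT="<account>"
AUTH="<encode of the account>"
CHUNK="<id>"
TYPE="product"   # one of the values listed for the "type" header
MODE="F"         # one of F|D|I|E
# One tm shared by every file of this process (YYYYmmddHHiiss, 14 digits).
TM="${TM:-$(date +%Y%m%d%H%M%S)}"

# Step #1: request a signed GCS URL for one document (doc = file name).
request_signed_url() {
  curl --silent --fail --connect-timeout 60 --max-time 300 "$DI_ENDPOINT" \
    -X POST \
    -H "Content-Type: application/json" \
    -H "client: $ACCOUNT" \
    -H "dev: false" \
    -H "tm: $TM" \
    -H "type: $TYPE" \
    -H "mode: $MODE" \
    -H "chunk: $CHUNK" \
    -H "doc: $1" \
    -H "Authorization: Basic $AUTH"
}

# Step #2: stream a local file to the signed URL (curl -T issues a PUT).
upload_file() {
  curl --silent --fail --connect-timeout 60 --max-time 0 \
    -H "Content-Type: application/octet-stream" \
    -T "$1" "$2"
}

# Run steps #1 and #2 for every file; stop at the first failure.
upload_all() {
  for file in "$@"; do
    url=$(request_signed_url "$(basename "$file")") || return 1
    upload_file "$file" "$url" || return 1
  done
}
```

A call such as `upload_all export/*.jsonl` (the path is illustrative) would process each file in turn with the same tm, mode and type. Step #3 should only follow if the function returns 0.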
Tip |
---|
After all required documents (doc) for the given |
...