...
You can test that your JSONL is valid by doing a test load in your own GCP project https://github.com/boxalino/data-integration-doc-schema#are-you-an-integrator
You can also test that your JSONL is valid by using the generator https://github.com/boxalino/data-integration-doc-schema/blob/master/schema/generator.html (guidelines are in the repository README.md)
For certain headless CMSs, Boxalino has designed a Transformer service.
...
The content is exported as the body of your POST request.
The content is exported with the help of a public GCS Signed URL (https://cloud.google.com/storage/docs/access-control/signed-urls).
Option #1 is recommended for data volumes of less than 32 MB.
Option #2 is allowed for any data size.
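The choice between the two options can be made from the size of the exported file. The following is a minimal sketch, not part of the Boxalino tooling: the function name `choose_load_option` and its output labels are hypothetical, and it assumes that "32 MB" means 32 * 1024 * 1024 bytes.

```shell
# Hypothetical helper: pick a load option from the payload size.
# Assumption: "32 MB" is read as 32 * 1024 * 1024 = 33554432 bytes.
choose_load_option() {
  _clo_file=$1
  _clo_limit=${2:-33554432}
  # wc -c counts bytes; the arithmetic expansion strips any padding wc adds
  _clo_size=$(( $(wc -c < "$_clo_file") ))
  if [ "$_clo_size" -lt "$_clo_limit" ]; then
    echo "body"        # option #1: send the JSONL as the POST body
  else
    echo "signed-url"  # option #2: upload through a GCS Signed URL
  fi
}
```

The optional second argument overrides the byte limit, which is convenient if you want to keep a safety margin below the documented threshold.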
...
A. Loading content less than 32 MB
For this use-case, there is a single request to be made: https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/415432770/Load+Request#Request-Definition
Code Block |
---|
curl --connect-timeout 60 --max-time 300 "https://boxalino-di-stage-krceabfwya-ew.a.run.app/load" \
-X POST \
-H "Content-Type: application/json" \
-H "client: <account>" \
-H "dev: true|false" \
-H "tm: YYYYmmddHHiiss" \
-H "type: product|content|order|user|communication_history|communication_planning|user_generated_content" \
-H "mode: F|D|I" \
-H "chunk: <batch nr>" \
-H "doc: product|language|attribute_value|attribute|order|user|communication_history|communication_planning|user_generated_content" \
-d "<JSONL>" \
-H "Authorization: Basic <encode of the account>" |
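Two of the header values in the request above have to be computed by the caller. The sketch below shows one way to do that in a shell script. The function names are hypothetical. The `tm` pattern YYYYmmddHHiiss is read as year, month, day, hour, minute, second; the expected timezone is not stated in this section. For the `Authorization` header, the sketch assumes the standard HTTP Basic scheme (base64 of `<user>:<password>`); confirm what "encode of the account" means for your account before relying on it.

```shell
# tm header value: YYYYmmddHHiiss (year, month, day, hour, minute, second)
di_timestamp() {
  date +%Y%m%d%H%M%S
}

# Assumption: standard HTTP Basic scheme, i.e. base64("<user>:<password>").
# tr removes the line breaks some base64 implementations add to long output.
basic_auth_value() {
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Usage (placeholders, not real credentials):
#   -H "tm: $(di_timestamp)" \
#   -H "Authorization: Basic $(basic_auth_value "<account>" "<password>")"
```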
...
Warning |
---|
If the service response is an error like: |
...
B. Loading undefined data size
This flow is also described in the Load Request documentation: https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/415432770/Load+Request#Load-in-Batches-%2F-data-%3E-32-MB
...
Code Block |
---|
curl --connect-timeout 60 --max-time 300 "https://boxalino-di-stage-krceabfwya-ew.a.run.app/load/chunk" \
-X POST \
-H "Content-Type: application/json" \
-H "client: <account>" \
-H "dev: true|false" \
-H "tm: YYYYmmddHHiiss" \
-H "type: product|content|order|user|communication_history|communication_planning|user_generated_content" \
-H "mode: F|D|I" \
-H "chunk: <id>" \
-H "doc: doc_product|doc_language|doc_attribute_value|doc_attribute|doc_order|doc_user|communication_history|communication_planning|doc_user_generated_content" \
-H "Authorization: Basic <encode of the account>" |
...
Panel | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
The use of the header |
...
Code Block |
---|
curl --connect-timeout 60 --max-time 300 "https://boxalino-di-stage-krceabfwya-ew.a.run.app/load/bq" \
-X POST \
-H "Content-Type: application/json" \
-H "client: <account>" \
-H "dev: true|false" \
-H "tm: YYYYmmddHHiiss" \
-H "type: product|content|order|user|communication_history|communication_planning|user_generated_content" \
-H "mode: F|D|I" \
-H "doc: doc_product|doc_language|doc_attribute_value|doc_attribute|doc_order|doc_user|communication_history|communication_planning|doc_user_generated_content" \
-H "Authorization: Basic <encode of the account>" |
...
Code Block |
---|
curl --connect-timeout 60 --max-time 300 "https://boxalino-di-stage-krceabfwya-ew.a.run.app/sync" \
-X POST \
-H "Content-Type: application/json" \
-H "client: <account>" \
-H "dev: true|false" \
-H "tm: YYYYmmddHHiiss" \
-H "type: product|content|order|user|communication_history|communication_planning|user_generated_content" \
-H "mode: F|D|I" \
-H "project: <client GCP project>" \
-H "dataset: <client GCP dataset>" \
-H "Authorization: Basic <encode of the account>" |
For the preceding test scenario (product data synchronization), the following SYNC request can be made once the documents have been loaded to BQ:
...
Info |
---|
If your project uses its own private GCP project & resources, please also include the headers for |
For more options, always review the Sync Request definition: https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/394559761/Sync+Request#Request-Definition |
Tip |
---|
After making the SYNC request, the data is computed and updated in the relevant feeds (data index, real-time injections, reports, etc.) |
Panel | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
We encourage you to have a stable fallback & retry policy. |
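As an illustration of a retry policy, the sketch below wraps any command in a retry loop with exponential backoff. It is only one possible approach, not a Boxalino-provided utility: the `retry` function, its arguments, and the attempt and delay values in the usage comment are all hypothetical and should be tuned to your integration.

```shell
# Hypothetical helper: retry <max_attempts> <initial_delay_seconds> <command...>
# Runs the command until it exits with 0 or the attempts are used up,
# doubling the wait between attempts.
retry() {
  _retry_max=$1
  _retry_delay=$2
  shift 2
  _retry_attempt=1
  while :; do
    "$@" && return 0
    if [ "$_retry_attempt" -ge "$_retry_max" ]; then
      return 1                            # give up: caller applies the fallback
    fi
    sleep "$_retry_delay"
    _retry_delay=$((_retry_delay * 2))    # exponential backoff
    _retry_attempt=$((_retry_attempt + 1))
  done
}

# Usage (illustrative values): curl's --fail option makes HTTP error responses
# produce a non-zero exit code, so they trigger another attempt.
#   retry 3 5 curl --fail --connect-timeout 60 --max-time 300 \
#     "https://boxalino-di-stage-krceabfwya-ew.a.run.app/sync" -X POST ...
```

When `retry` returns a non-zero status, all attempts have failed and the calling script can apply its fallback, for example by alerting and keeping the previous export.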
Panel | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
In the technical samples, the
|