Status Review

There are two ways to design automated checks for the account's data integration status:

By the SYNC REQUEST response
By an endpoint check

SYNC Request Response

Your data integration flow/process can read the response to the SYNC REQUEST.

The response is JSON:

in case of error: 400 BAD REQUEST HTTP code + an error message
in case of success: 200 OK HTTP code + a key-value pair: {"taskId":"task value"}
- use the <taskId> to make a request to <endpoint>/task/status/<taskId> to check the data synchronization status in the account’s data index.
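The response handling described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official client: the function name `task_status_url` and its `endpoint` argument are hypothetical, and it assumes the success body is exactly the `{"taskId": ...}` object shown above.

```python
import json


def task_status_url(endpoint: str, http_code: int, body: str) -> str:
    """Return the task status URL for a successful SYNC REQUEST response.

    Raises RuntimeError when the response is not a 200 OK carrying a taskId.
    """
    if http_code != 200:
        # Errors are described above as a 400 code plus an error message.
        raise RuntimeError(f"SYNC REQUEST failed ({http_code}): {body}")
    task_id = json.loads(body).get("taskId")
    if not task_id:
        raise RuntimeError("SYNC REQUEST response has no taskId")
    # Hypothetical helper: joins <endpoint>/task/status/<taskId>.
    return f"{endpoint.rstrip('/')}/task/status/{task_id}"
```

The returned URL can then be requested by the integration flow to follow the synchronization status.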

Account Review (WEB)

You can review the status of the triggered events on the Account page, <endpoint>/account, for example:

https://boxalino-di-process-krceabfwya-ew.a.run.app/account

Account Review (CLI)

Endpoint	https://boxalino-di-process-krceabfwya-ew.a.run.app/account/review
Method	POST
Headers	Content-Type	application/json
Body	key	DATASYNC API key
	client	account name
	limit	number of logs (ordered by most recent)
	index	dev / prod (default: none)
	mode	D for delta, I for instant update, F for full (default: none)
	type	product, user, content, user_content, order (default: none)
	status	SYNC: sync requests and their status; LOAD: load requests and their status; OK: successful processes; FAIL: failed processes; none: all logs (default: none)

For example, this request returns the last SYNC OK (successful sync request):

curl https://boxalino-di-process-krceabfwya-ew.a.run.app/account/review \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"client": "BOXALINO_ACCOUNT", "key": "BOXALINO_ACCOUNT_ADMIN_KEY", "index": "prod", "mode": "F", "type": "product", "status": "SYNC OK", "limit": 1}'

The API response for a request with status "SYNC OK" and limit 1 is a JSON list, such as:

[
  {
    "ID": "UUID-FOR-THE-SYNC-REQUEST",
    "RequestReceivedAt": "Y-m-d H:i:s",
    "Status": "SYNC OK",
    "Message": null,
    "Timestamp": "YmdHis",
    "VersionTs": "TIME-IN-UNIX-MS",
    "Project": null,
    "Dataset": null,
    "Document": null,
    "Default": "[]"
  }
]
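For automated checks, the same review request can be issued from a script. The following Python sketch mirrors the curl call above using only the standard library. The function names (`build_review_payload`, `fetch_review`, `latest_log`) are hypothetical, and the sketch assumes that leaving an optional body key out is treated like its "none" default.

```python
import json
import urllib.request

REVIEW_URL = "https://boxalino-di-process-krceabfwya-ew.a.run.app/account/review"


def build_review_payload(client, key, status=None, limit=1,
                         index=None, mode=None, type_=None):
    """Build the JSON body; optional filters are omitted when not set (assumption)."""
    payload = {"client": client, "key": key, "limit": limit}
    optional = {"index": index, "mode": mode, "type": type_, "status": status}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def fetch_review(payload, url=REVIEW_URL):
    """POST the payload as JSON and return the decoded list of log entries."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def latest_log(logs):
    """Return the first (most recent) log entry, or None for an empty list."""
    return logs[0] if logs else None
```

A check could call `fetch_review(build_review_payload("BOXALINO_ACCOUNT", "BOXALINO_ACCOUNT_ADMIN_KEY", status="SYNC OK", index="prod", mode="F", type_="product"))` and compare the `Timestamp` of `latest_log(...)` with the expected sync time.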

Data Integration Statuses

Status Code	Meaning
STATUS CODES DURING SYNC REQUESTS
SYNC OK	the data content was exported to SOLR; (for doc_product/ doc_content) the status of the process is available upon calling the <endpoint>/task/status/<taskId> service
SYNC REQUEST	a SYNC REQUEST was done (<endpoint>/sync); once the SYNC REQUEST is received, the compute process starts
SYNC FAIL	the data update failed;
SYNCPRODUCT FAIL, SYNCCONTENT FAIL, SYNCINDEX FAIL, FAIL CORPUS COMPUTED, FAIL COMPUTED FIELDS, FAIL COMPUTED	the SYNC REQUEST failed during compute
STOP SYNC	the SYNC REQUEST was stopped (e.g. the content quota was not reached: min X products to be synced, or the doc_X table is empty)
DENIED SYNC	returned for product delta sync requests when the doc_X content is too large (e.g. over 1GB BQ table size)
BIG SOLR CONTENT, SOLRCOMPUTE REQUEST, SOLRCOMPUTE OK	generating the SOLR file for SOLR export from the doc_X_<mode>_<tm> file
SOLRSYNC REQUEST	exporting the solr-compute file (above) to SOLR for sync
DISPATCHED SYNC REQUEST, SYNCCOMPUTE REQUEST	the BQ compute process log (for dispatched requests)
RESYNCACCOUNT REQUEST, RESYNCACCOUNT OK	re-sync request (triggered internally, on client request); it triggers the /sync request for the given tm/index/type/mode
SYNCCHECK OK	a synccheck request was done (<endpoint>/sync/check); this is done, e.g. for D/I requests, to access the last SYNC OK status for the account and type
FAIL AUTH	authentication headers are invalid / not a match for the account
FAIL SOLR EXPORT	the export of the generated file failed (data index not updated)
STATUS CODES DURING LOAD REQUESTS
LOAD OK	the doc_X data structure was loaded successfully in BQ
LOAD REQUEST	a LOAD REQUEST was received (<endpoint>/load); once received, it: 1. creates the GCS bucket for the account (if needed); 2. creates the BQ dataset for the account & mode (if needed); 3. loads the content into a GCS file (doc_<type>_<mode>_<tm>.json); 4. loads the GCS file into BQ
FAIL BQ LOAD	BQ load step failed
LOADBYCHUNK REQUEST	a LOAD BY CHUNK request was received; it loads the content into a GCS file (doc_<type>_<mode>_<tm>-<chunk>.json)
LOADBYCHUNK OK	the GCS file (doc_<type>_<mode>_<tm>-<chunk>.json) was created
LOADBYCHUNK FAIL	the GCS file was not properly loaded
FAIL GCS	the GCS bucket / content failed to generate
LOADBQ REQUEST	loads all the chunk files in BQ
LOADBQ OK	successfully loaded the doc_<type> content in BQ; the table <index>_<mode>.doc_<type>_<mode>_<tm> is available
LOADBQ FAIL	the BQ table could not be generated from the available doc_<type>_<mode>_<tm>-*.json content
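An automated check usually needs to reduce these status codes to a handful of outcomes. The Python sketch below shows one possible grouping. The function `classify_status` and its four categories ("failed", "blocked", "ok", "in_progress") are hypothetical and are derived only from the wording of the codes in the table above, not from any official classification.

```python
def classify_status(status: str) -> str:
    """Map a data integration status code to a coarse outcome (hypothetical grouping)."""
    tokens = status.upper().split()
    if "FAIL" in tokens:
        # Covers e.g. SYNC FAIL, FAIL BQ LOAD, LOADBYCHUNK FAIL, FAIL AUTH.
        return "failed"
    if tokens[:1] in (["STOP"], ["DENIED"]):
        # STOP SYNC and DENIED SYNC: the request was not carried out.
        return "blocked"
    if tokens and tokens[-1] == "OK":
        # Covers e.g. SYNC OK, LOAD OK, LOADBQ OK.
        return "ok"
    # Everything else (e.g. SYNC REQUEST, LOAD REQUEST) is treated as a progress log.
    return "in_progress"
```

With this grouping, a monitor could alert on "failed" and "blocked" entries and treat "in_progress" entries as informational.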

Data (Item) Review

The item review is available as a web service at https://boxalino-di-process-krceabfwya-ew.a.run.app/

To bypass the web form, you can request the content exported for a given item SKU, ID, or products group directly from the WEB/CLI.

The request link has the following structure:
<di-process-endpoint>/item/<API-Key-admin>/<data-index>/<type>/<mode>/<field>/<value>

<di-process-endpoint>	https://boxalino-di-process-krceabfwya-ew.a.run.app/
<API-Key-admin>	the API Key with the role `ADMIN` from Intelligence Admin (the API Key used for the DI SYNC REQUEST)
<data-index>	dev | prod
<type>	the value of the `type` parameter in the SYNC REQUEST, e.g. product | user | order | content
<mode>	F | f
<field>	id | sku | products_group_id
<value>	the value for the given field

In the WEB/CLI, the content is returned as JSON. You can use any JSON formatter to make it easier to read.
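The link structure described above can be assembled programmatically. The Python sketch below is illustrative only: the function name `item_review_url` is hypothetical, and percent-encoding each path segment is an assumption made here so that values containing spaces or slashes do not break the path.

```python
from urllib.parse import quote

DI_PROCESS_ENDPOINT = "https://boxalino-di-process-krceabfwya-ew.a.run.app"
ALLOWED_FIELDS = {"id", "sku", "products_group_id"}


def item_review_url(api_key, data_index, type_, mode, field, value,
                    endpoint=DI_PROCESS_ENDPOINT):
    """Build <endpoint>/item/<key>/<data-index>/<type>/<mode>/<field>/<value>."""
    if data_index not in {"dev", "prod"}:
        raise ValueError("data_index must be 'dev' or 'prod'")
    if field not in ALLOWED_FIELDS:
        raise ValueError("field must be one of: " + ", ".join(sorted(ALLOWED_FIELDS)))
    parts = [api_key, data_index, type_, mode, field, value]
    # Assumption: each segment is percent-encoded before joining.
    path = "/".join(quote(str(part), safe="") for part in parts)
    return f"{endpoint.rstrip('/')}/item/{path}"
```

For example, `item_review_url("API_KEY", "prod", "product", "F", "sku", "ABC-1")` yields a link that can be opened in a browser or passed to curl.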

Using BigQuery

The client's team has read & view access to the data integration BigQuery datasets in the rtux-data-integration GCP project.

To execute queries in the BigQuery console (https://console.cloud.google.com/bigquery), the integrator/client:

must be authenticated with a Google account.
must run BigQuery jobs in the scope of a project other than bx-bdp-53322 or rtux-data-integration.

The following SQL queries can be used to check both the exported data and the computed data:

Check the exported product data by SKU

SELECT pg.* FROM `rtux-data-integration.<index>_F.doc_product_F_<tm>`
JOIN UNNEST(product_line.product_groups) pg
JOIN UNNEST(pg.skus) sku
WHERE sku.sku = "<sku>"

The same SQL can also be used to check other properties at the level of sku (sku) or product_groups (pg).
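When the query above is run from a script, the `<index>` and `<tm>` placeholders have to be filled in first. The Python sketch below builds the query text only. The function name `product_by_sku_sql` is hypothetical, it assumes `<index>` is `dev` or `prod` and that `<tm>` consists of digits (as in the `YmdHis` timestamp format), and it replaces the literal `"<sku>"` with a BigQuery named parameter `@sku` so that the SKU value is passed separately from the SQL.

```python
import re


def product_by_sku_sql(index: str, tm: str) -> str:
    """Return the 'exported product data by SKU' query for one doc_product table."""
    if index not in {"dev", "prod"}:
        raise ValueError("index must be 'dev' or 'prod'")
    if not re.fullmatch(r"\d+", tm):
        # Assumption: <tm> is a numeric timestamp such as YmdHis.
        raise ValueError("tm must contain digits only")
    table = f"rtux-data-integration.{index}_F.doc_product_F_{tm}"
    return (
        f"SELECT pg.* FROM `{table}`\n"
        "JOIN UNNEST(product_line.product_groups) pg\n"
        "JOIN UNNEST(pg.skus) sku\n"
        "WHERE sku.sku = @sku"
    )
```

The resulting string can be passed to any BigQuery client that supports named query parameters, with the job running in a project other than bx-bdp-53322 or rtux-data-integration.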

Check the exported product data by ID

SELECT pg.* FROM `rtux-data-integration.<index>_F.doc_product_F_<tm>`
JOIN UNNEST(product_line.product_groups) pg
JOIN UNNEST(pg.skus) sku
WHERE sku.internal_id = "<id>"

Check final product attributes

SELECT DISTINCT(property_name) FROM `rtux-data-integration.<index>_F.doc_computed_F_<tm>`
-- WHERE id="<product-id>" #optional add filter by product id

Identify the values of a specific property (and the number of matching products)

SELECT pv, COUNT(*) AS products_match FROM `rtux-data-integration.<index>_F.doc_computed_F_<tm>`
JOIN UNNEST(property_values) pv
WHERE property_name = "<property_name>"
GROUP BY pv