Data Feeds (DF)


To assist with Data Integration (DI) efforts with 3rd-party systems (ex: Productsup, Google Shopping, Google Sheets, Google Ads, etc.), Boxalino provides the Data Feeds service.

Use this service to:

  1. Export your data to other tools in your project (ex: Productsup, Google Shopping feed, etc.)

  2. Access calculated data in a structured way (per request)

  3. Load files directly into your GCP ecosystem (ex: for order and customer enrichment)

This service is currently expanding and we value our clients' feedback.
For further service requests and data feeds, please get in touch with Boxalino.

GETTING STARTED

For any data required to be exposed (privately), follow the steps below:

  1. Get in touch with Boxalino to receive an access key

    1. Communicate the purpose of the data feed & desired information

    2. We will provide an access key

  2. Share the access key with the integration team

  3. The integration team makes an HTTP request to access the data (per the documentation below)

READ FEED REQUEST

There should be a process within the client project that reads the data from the /read/{access-key} service.

The {access-key} is provided by Boxalino.

As an integrator, please make the request below to the provided endpoint.

Endpoint: https://boxalino-winning-interactions-krceabfwya-ew.a.run.app

Action: /read/{access-key}

Method: POST

Headers:

  Authorization: Basic base64<<READ API key : READ API Secret>>

  note: use the API credentials from your Boxalino account that have the PUBLIC_READ assigned

  *Read more about credentials: API Credentials

The data exposed in the data feed is in JSON format.

The READ service is for authenticated access only.

The data comes from a custom/private SQL script (as defined in the {access-key} configuration), available for the account in the GCS bucket.
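
A minimal sketch of such a READ request in Python (assuming the requests library; the access key and READ API credentials are placeholders for the values provided by Boxalino):

import base64
import requests

ENDPOINT = "https://boxalino-winning-interactions-krceabfwya-ew.a.run.app"
ACCESS_KEY = "<access-key>"          # provided by Boxalino
API_KEY = "<READ API key>"           # Boxalino account credentials with PUBLIC_READ
API_SECRET = "<READ API Secret>"

# Basic auth header: base64 of "<READ API key>:<READ API Secret>"
token = base64.b64encode(f"{API_KEY}:{API_SECRET}".encode()).decode()

response = requests.post(
    f"{ENDPOINT}/read/{ACCESS_KEY}",
    headers={"Authorization": f"Basic {token}"},
)
response.raise_for_status()

# The READ service returns the feed data as JSON
data = response.json()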

VIEW FEED REQUEST

There should be a process within the client's own project that accesses the data from the /view/{access-key} service. The final format is a CSV file, to be downloaded.

The {access-key} is provided by Boxalino.

As an integrator, please make the request below to the provided endpoint.

Endpoint: https://boxalino-winning-interactions-krceabfwya-ew.a.run.app

Action: /view/{access-key}

Method: POST | GET

Parameters (optional):

  separator

    Default: comma (,)

    Define a different separator for the .csv file (tab, pipe (|), etc.). One character only!

    NOTE: the default will be upgraded to "tab" in order to allow a list of data per value (ex: multivalue fields)

  format

    Default: csv

    Change the format of the export (ex: txt)

  page

    (optional) pagination; by default, all results are returned

  limit

    (optional) set a limit for the number of rows per request; by default there is no limit; when used with page, the default limit is 1000

Authorization (header):

  only if the key is for authenticated requests

  Basic base64<<READ API key : READ API Secret>>

  note: use the API credentials from your Boxalino account that have the PUBLIC_READ assigned

  *Read more about credentials: API Credentials

*Currently, the data exposed in the data feed is in a CSV format.

The data comes from a custom view in the <client>_views dataset (as defined in the {access-key} configuration).

In order to use CSV headers with spaces, _S_ can be used within the SQL select field names.

Parameterized queries are supported (https://cloud.google.com/bigquery/docs/parameterized-queries). Simply use '@<key>' in the SQL and add the same parameter to the feed URL.

NOTE: all parameter values are cast as STRING.
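
A minimal sketch of a VIEW request with the optional export parameters and one parameterized-query value (the @channel parameter and its value are hypothetical and only apply if the configured view uses @channel in its SQL):

import base64
import requests

ENDPOINT = "https://boxalino-winning-interactions-krceabfwya-ew.a.run.app"
ACCESS_KEY = "<access-key>"  # provided by Boxalino

params = {
    "separator": "|",     # one character only
    "format": "csv",
    "page": 1,
    "limit": 1000,
    "channel": "ch_de",   # hypothetical @channel value; cast as STRING server-side
}

# Authorization is only required if the access key is configured for authenticated requests
token = base64.b64encode(b"<READ API key>:<READ API Secret>").decode()

response = requests.get(
    f"{ENDPOINT}/view/{ACCESS_KEY}",
    params=params,
    headers={"Authorization": f"Basic {token}"},
)
response.raise_for_status()

# The response body is the CSV export
with open("feed.csv", "w", encoding="utf-8") as f:
    f.write(response.text)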

VIEW GOOGLEADSFEED REQUEST

There should be a process within the client's own project that accesses the data from the /view/googleadsfeed/{access-key} service. The final format is a CSV file, to be downloaded.

The {access-key} is provided by Boxalino.

It is only valid for this service.

As an integrator, please make the request below to the provided endpoint.

Endpoint: https://boxalino-winning-interactions-krceabfwya-ew.a.run.app

Action: /view/googleadsfeed/{access-key}?name=<feed-name>

Method: POST | GET

Parameters (optional):

  separator

    Default: comma (,)

    Define a different separator for the .csv file (tab, pipe (|), etc.). One character only!

    NOTE: the default will be upgraded to "tab" in order to allow a list of data per value (ex: multivalue fields)

  format

    Default: csv

    Change the format of the export (ex: txt)

*Currently, the data exposed in the data feed is in a CSV format.

The difference between a simple CSV export and a CSV format compatible with Google Ads uploads lies in the Parameters properties placed before the table header.

The parameters must be defined in the client's views dataset, in the google_ads_data_feeds_parameters table.

The service will replace every _ in the SQL field names with a space.
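
A minimal sketch of a GOOGLEADSFEED request (the feed name is a placeholder for the one configured for this access key):

import requests

ENDPOINT = "https://boxalino-winning-interactions-krceabfwya-ew.a.run.app"
ACCESS_KEY = "<access-key>"  # provided by Boxalino, valid only for this service

# The "name" query parameter selects the configured feed
response = requests.get(
    f"{ENDPOINT}/view/googleadsfeed/{ACCESS_KEY}",
    params={"name": "<feed-name>"},
)
response.raise_for_status()

# CSV export compatible with Google Ads uploads (Parameters rows precede the table header)
with open("google_ads_feed.csv", "w", encoding="utf-8") as f:
    f.write(response.text)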

JSON FEED REQUEST

As an integrator, please make the request below to the provided endpoint.

There should be a process within your own project that requires the data from the /json/{access-key} service. The final format is JSON.

The {access-key} is provided by Boxalino.

Endpoint: https://boxalino-winning-interactions-krceabfwya-ew.a.run.app

Action: /json/{access-key}?name=<feed-name>

Method: POST | GET

Authorization (header):

  only if the key is for authenticated requests

  Basic base64<<READ API key : READ API Secret>>

  note: use the API credentials from your Boxalino account that have the PUBLIC_READ assigned

  *Read more about credentials: API Credentials

Parameters (optional):

  page

    (optional) pagination; by default, all results are returned

  limit

    (optional) set a limit for the number of rows per request; by default there is no limit; when used with page, the default limit is 1000

The data exposed in the data feed is in JSON format.

The JSON feed relies on the presence of a source in the <client>_views dataset in BigQuery.

Parameterized queries are supported (https://cloud.google.com/bigquery/docs/parameterized-queries). Simply use '@<key>' in the SQL and add the same parameter to the feed URL.

NOTE: all parameter values are cast as STRING.
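
A minimal sketch of a paginated JSON feed request (the feed name is a placeholder; the sketch assumes the service returns an empty result once the last page has been passed):

import base64
import requests

ENDPOINT = "https://boxalino-winning-interactions-krceabfwya-ew.a.run.app"
ACCESS_KEY = "<access-key>"  # provided by Boxalino

# Authorization is only required if the access key is configured for authenticated requests
token = base64.b64encode(b"<READ API key>:<READ API Secret>").decode()
headers = {"Authorization": f"Basic {token}"}

# Fetch the feed page by page (when page is used, the default limit is 1000 rows)
page = 1
while True:
    response = requests.get(
        f"{ENDPOINT}/json/{ACCESS_KEY}",
        params={"name": "<feed-name>", "page": page, "limit": 1000},
        headers=headers,
    )
    response.raise_for_status()
    rows = response.json()
    if not rows:
        break
    # ... process the rows of this page ...
    page += 1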

GOOGLE SHOPPING FEED REQUEST

As an integrator, please make the request below to the provided endpoint.

There should be a process within your own project that accesses the data from the /googleshoppingfeed/{access-key} service.

The final format is a tab-delimited .txt file (per Google Shopping feed requirements). The data is formatted based on the Google Shopping feed rules: https://support.google.com/merchants/answer/7052112

The {access-key} is provided by Boxalino.

It is only valid for this service.

Endpoint: https://boxalino-winning-interactions-krceabfwya-ew.a.run.app

Action: /googleshoppingfeed/{access-key}

Method: POST | GET

Authorization (header):

  only if the key is for authenticated requests

  Basic base64<<READ API key : READ API Secret>>

  note: use the API credentials from your Boxalino account that have the PUBLIC_READ assigned

  *Read more about credentials: API Credentials

Parameters (optional):

  page

    (optional) pagination; by default, all results are returned

  limit

    (optional) set a limit for the number of rows per request; by default there is no limit; when used with page, the default limit is 1000

The data comes from a custom view in the <client>_views dataset (as defined in the {access-key} configuration).

In order to use CSV headers with spaces, _S_ can be used within the SQL select field names.

Parameterized queries are supported (https://cloud.google.com/bigquery/docs/parameterized-queries). Simply use '@<key>' in the SQL and add the same parameter to the feed URL.

NOTE: all parameter values are cast as STRING.
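
A minimal sketch of a Google Shopping feed download (the output is saved as a tab-delimited .txt file, as exposed by the service):

import base64
import requests

ENDPOINT = "https://boxalino-winning-interactions-krceabfwya-ew.a.run.app"
ACCESS_KEY = "<access-key>"  # provided by Boxalino, valid only for this service

# Authorization is only required if the access key is configured for authenticated requests
token = base64.b64encode(b"<READ API key>:<READ API Secret>").decode()

response = requests.get(
    f"{ENDPOINT}/googleshoppingfeed/{ACCESS_KEY}",
    headers={"Authorization": f"Basic {token}"},
)
response.raise_for_status()

# Tab-delimited export, per the Google Shopping feed requirements
with open("google_shopping_feed.txt", "w", encoding="utf-8") as f:
    f.write(response.text)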

LOAD FEED REQUEST

The LOADFEED service was created in order to load files directly into your project's BigQuery ecosystem from Boxalino.

There should be configurations in the input_data_feeds view from the <account>_views dataset.
Use the <endpoint>/loadfeed/{access-key}?name=<feed-name> action in order to validate your configurations.

The {access-key} is provided by Boxalino.

It is only valid for this service.

Endpoint: https://boxalino-winning-interactions-krceabfwya-ew.a.run.app

Action: /loadfeed/{access-key}?name=<feed-name>

Method: POST | GET

Once the view model is configured, the files will be loaded at the desired frequency (as configured via the cronjob property).
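
A minimal sketch of the validation request for a configured load feed (the feed name is a placeholder for the name defined in input_data_feeds):

import requests

ENDPOINT = "https://boxalino-winning-interactions-krceabfwya-ew.a.run.app"
ACCESS_KEY = "<access-key>"  # provided by Boxalino, valid only for this service

# Validate the configuration by calling the loadfeed action with the configured feed name
response = requests.get(
    f"{ENDPOINT}/loadfeed/{ACCESS_KEY}",
    params={"name": "<feed-name>"},
)
response.raise_for_status()
print(response.text)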

Adding a new loadfeed configuration model

As an integrator, check that you have write permissions to the <client>_views dataset in order to continue:

Go to Google Cloud BigQuery and run the SQL:
SELECT * FROM `bx-bdp-53322.<BOXALINO ACCOUNT NAME>_views.input_data_feeds`

In case you do not have permissions, contact Boxalino on Slack or via email at datafeed@boxalino.com.

The loadfeed configuration model is defined in the input_data_feeds view in the client's account dataset.

SELECT
'<FEED NAME>' AS name
, '<FEED SOURCE: direct|aws>' AS source
, '<ACCESS CONFIGURATION/HEADERS AS JSON>' AS access
, '<LINK OR PATH TO FILES TO BE DOWNLOADED>' AS url
, '' AS body
, ';' AS field_delimiter
, '<CONTENT FORMAT: CSV,NEWLINE_DELIMITED_JSON>' AS format
, '<BOXALINO ACCOUNT NAME>_core' AS destination_dataset
, '<TABLE NAME>' AS destination_table
, true AS autodetect
, '<SCHEMA (string or JSON) IF autodetect=false>' AS schema
, 0 AS add_body_as_record
, 0 AS add_tm_to_record
, 1 AS skipRows
, 0 AS max_bad_records
, '' AS quote
, 'WRITE_TRUNCATE | WRITE_APPEND' AS write_disposition
, '' AS create_disposition
, '' AS encoding
, '<CRONJOB CONFIGURATION FOR LOADING FILES, EX: 20 12 * * *>' AS cronjob
, 'feed' AS type

The integration sources are expanding per client request. Currently, the service is able to load content directly (via URL) and from AWS storage buckets.

In the case of direct access, it is possible to set headers (in the access property) and also to define a JSON body to be sent with the load. This can be used for making API requests (to Boxalino or 3rd-party services like ChatGPT, Deepl, OpenAI, etc.).

It is possible to configure a dynamic url and destination_table (ex: if the file to be loaded is stored in a dated path or has a timestamp in its name). The supported placeholders are:
__DATE__ (Y-m-d), __NODASHDATE__ (Ymd), __TM__ (Y-m-dTH:i:s), __NODASHTM__ (YmdHis), __DATE_PATH__ (Y/m/d), __JN_DATE_PATH__ (Y/n/j), __DATE_YESTERDAY__ (Y-m-d, -1 day), __DATE_MONTH__ (Y-m-01), __NODASHDATE_MONTH__ (Ym01), __NODASHDATE_YESTERDAY__ (Ymd, -1 day)
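
A small sketch of what these placeholders resolve to, expressed with Python date formatting (the url below and the substitution loop are illustrative assumptions; only the placeholder list above comes from the service documentation):

from datetime import datetime, timedelta

now = datetime(2024, 5, 7, 12, 20, 0)   # example run time
yesterday = now - timedelta(days=1)

placeholders = {
    "__DATE__": now.strftime("%Y-%m-%d"),                      # 2024-05-07
    "__NODASHDATE__": now.strftime("%Y%m%d"),                  # 20240507
    "__TM__": now.strftime("%Y-%m-%dT%H:%M:%S"),               # 2024-05-07T12:20:00
    "__NODASHTM__": now.strftime("%Y%m%d%H%M%S"),              # 20240507122000
    "__DATE_PATH__": now.strftime("%Y/%m/%d"),                 # 2024/05/07
    "__JN_DATE_PATH__": f"{now.year}/{now.month}/{now.day}",   # 2024/5/7 (no leading zeros)
    "__DATE_YESTERDAY__": yesterday.strftime("%Y-%m-%d"),      # 2024-05-06
    "__DATE_MONTH__": now.strftime("%Y-%m-01"),                # 2024-05-01
    "__NODASHDATE_MONTH__": now.strftime("%Y%m01"),            # 20240501
    "__NODASHDATE_YESTERDAY__": yesterday.strftime("%Y%m%d"),  # 20240506
}

# Hypothetical url value from input_data_feeds using a placeholder
url = "https://example.com/exports/orders___NODASHDATE__.csv"
for key, value in placeholders.items():
    url = url.replace(key, value)
# url is now https://example.com/exports/orders_20240507.csv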

Loading content into BigQuery follows the configuration modes & limitations per the Google Cloud documentation.
For CSV: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv
For JSONL: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-json

GOOGLE SHEET FEED REQUEST

The GOOGLESHEETFEED service was created in order to import Google Sheets directly into your project's BigQuery ecosystem from Boxalino.

The Google Sheet used for the load must be shared with VIEW permissions with our service account boxalino-google-sheet-feed@rtux-data-integration.iam.gserviceaccount.com

There should be configurations in the google_sheet_data_feeds view from the <account>_views dataset.
Use the <endpoint>/googlesheetfeed/{access-key} action in order to validate your configurations.

The {access-key} is provided by Boxalino.
It is only valid for this service.

Endpoint: https://boxalino-winning-interactions-krceabfwya-ew.a.run.app

Action: /googlesheetfeed/{access-key}?name=<feed-name>

Method: POST | GET

Once the view is configured, the sheets will be loaded at the desired frequency (as configured via the cronjob property).
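
A minimal sketch of the validation request for a configured Google Sheet feed (the feed name is a placeholder for the name defined in google_sheet_data_feeds):

import requests

ENDPOINT = "https://boxalino-winning-interactions-krceabfwya-ew.a.run.app"
ACCESS_KEY = "<access-key>"  # provided by Boxalino, valid only for this service

# Trigger/validate the configuration by calling the googlesheetfeed action with the feed name
response = requests.post(
    f"{ENDPOINT}/googlesheetfeed/{ACCESS_KEY}",
    params={"name": "<feed-name>"},
)
response.raise_for_status()
print(response.text)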

Adding a new googlesheetfeed configuration model

As an integrator, check that you have write permissions to the <client>_views dataset in order to continue:

Go to Google Cloud BigQuery and run the SQL:
SELECT * FROM `bx-bdp-53322.<BOXALINO ACCOUNT NAME>_views.google_sheet_data_feeds`

In case you do not have permissions, contact Boxalino on Slack or via email at datafeed@boxalino.com.

The googlesheetfeed configuration model is defined in the google_sheet_data_feeds view in the client's account dataset.

SELECT
'<FEED NAME>' AS name
, '<ID of the google sheet>' AS id
, '<A1:D50>' AS sheet_range
, ';' AS field_delimiter
, 'CSV' AS format
, '<BOXALINO ACCOUNT NAME>_core' AS destination_dataset
, '<TABLE NAME>' AS destination_table
, true AS autodetect
, '<SCHEMA (string or JSON) IF autodetect=false>' AS schema
, 1 AS skipRows
, 0 AS max_bad_records
, '' AS quote
, 'WRITE_TRUNCATE | WRITE_APPEND' AS write_disposition
, '' AS create_disposition
, '' AS encoding
, '<CRONJOB CONFIGURATION FOR LOADING FILES, EX: 20 12 * * *>' AS cronjob
, 'googlesheet' AS type