doc_product

Content: Products data

All the data about products

Overview

This is the core data of the products.

It is based on a three-level structure:

  1. Product line

  2. Product group

  3. SKU

Product line

Using the first level (product line) is optional. It is only helpful if a lot of data is identical for all the products of a product line and you don’t want to copy it into every product.

If you don’t need the product line, skip all the product_line fields and put your data directly in product_line.product_groups. Put only one product group per record: several product_groups belong in the same record only when you want to group them under the same product line.

Product group

This is the level shown to the end user on a page. If the same product is available in different sizes, colors, and so on, you have one product group with several SKUs.

If you have no product groups, only SKUs, and show each SKU to the user as a separate result, do the same as for the product line: ignore all the product group properties and add one SKU (only one per record).

SKU

This is the level of what the user can actually buy: a unique identifier of a buyable item.
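As a rough illustration of the "transparent" levels described above, the sketch below wraps a flat SKU (or a product group) in the three-level nesting. The helper names `wrap_sku` and `wrap_product_group` are hypothetical; only the keys `product_line`, `product_groups`, and `skus` come from the example record further down this page.

```python
def wrap_sku(sku):
    """Nest one SKU in a single product group inside a single product line."""
    # No product line or product group properties are set: both levels are transparent.
    return {"product_line": {"product_groups": [{"skus": [sku]}]}}


def wrap_product_group(product_group):
    """Nest one product group (with its SKUs) inside a single product line."""
    return {"product_line": {"product_groups": [product_group]}}


record = wrap_sku({"internal_id": "55111"})
print(record)
# {'product_line': {'product_groups': [{'skus': [{'internal_id': '55111'}]}]}}
```

A catalog with SKUs only would call `wrap_sku` once per SKU, which yields one record per SKU.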

Example

In the example, we consider one product (a wine) with the following logic:

Option 1 (one search result per year and per bottle size)

  1. Product line = the wine (independently of its year)

  2. Product group = transparent (one SKU per product group)

  3. SKU = different products based on the year of the wine

Option 2 (one search result per year but not per bottle size)

Alternatively, the data could be structured as follows:

  1. Product line = the wine (independently of its year)

  2. Product group = different products based on the year of the wine

  3. SKU = the specific bottle size

The difference between Option 1 and Option 2 is what is shown to the user in a product list (the search results, for example). If the user should see one entry for each year and each bottle size, apply Option 1. If the user should see only one entry per year and then select a bottle size on the PDP (product detail page), or directly in the list, before adding it to the basket, apply Option 2.
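The two options can be compared with a small sketch. The sample variants, their identifiers, and the helper names `option_1` and `option_2` are hypothetical; the point is only that the number of product groups equals the number of entries the user sees in a product list.

```python
from collections import defaultdict

# Hypothetical sample data: (SKU id, vintage year, bottle size in cl).
VARIANTS = [
    ("w-2014-37", 2014, 37.5),
    ("w-2014-75", 2014, 75),
    ("w-2015-75", 2015, 75),
]


def option_1(variants):
    """One product group per SKU: every year/size combination is listed."""
    return [{"skus": [{"internal_id": sku_id}]} for sku_id, _year, _size in variants]


def option_2(variants):
    """One product group per year: bottle sizes become SKUs of that group."""
    by_year = defaultdict(list)
    for sku_id, year, _size in variants:
        by_year[year].append({"internal_id": sku_id})
    return [{"skus": skus} for _year, skus in sorted(by_year.items())]


line_1 = {"product_line": {"product_groups": option_1(VARIANTS)}}
line_2 = {"product_line": {"product_groups": option_2(VARIANTS)}}
print(len(line_1["product_line"]["product_groups"]))  # 3 entries in a product list
print(len(line_2["product_line"]["product_groups"]))  # 2 entries in a product list
```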

Please note that some attributes in the example are defined at the product_line level and others at the SKU level. The idea is simple: if an attribute is the same for all the product groups and SKUs, it can be defined at the product_line level. If it differs, it should be defined at the product group or SKU level. Defining everything at the SKU level also works and avoids having to sort out which attribute goes where, but it can make the data much bigger and significantly slower to process, so putting each attribute at the right level of granularity is recommended.
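One way to sort attributes into levels is to hoist everything that is identical across all SKUs. The sketch below does this on simplified flat dictionaries (not the real nested schema), and the function name `hoist_shared` is hypothetical.

```python
def hoist_shared(skus):
    """Split SKU attribute dicts into (shared, per_sku_remainders).

    An attribute is shared when every SKU has it with an equal value.
    """
    if not skus:
        return {}, []
    first = skus[0]
    shared = {
        key: value
        for key, value in first.items()
        if all(key in sku and sku[key] == value for sku in skus[1:])
    }
    remainders = [
        {key: value for key, value in sku.items() if key not in shared}
        for sku in skus
    ]
    return shared, remainders


# Simplified attributes borrowed from the wine example; the second SKU is hypothetical.
skus = [
    {"country": "Frankreich", "grape_txt": "Muscat", "alcohol_num": 13},
    {"country": "Frankreich", "grape_txt": "Muscat", "alcohol_num": 14},
]
shared, remainders = hoist_shared(skus)
print(shared)      # {'country': 'Frankreich', 'grape_txt': 'Muscat'}
print(remainders)  # [{'alcohol_num': 13}, {'alcohol_num': 14}]
```

Here `shared` would go to the product_line level and each remainder would stay on its SKU.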

Here is the example for Option 1 above. Make sure to format it as JSONL (newline-delimited JSON, see https://en.wikipedia.org/wiki/JSON_streaming) before loading it into BigQuery.

{ "product_line": { "internal_id": "2c872325-748b-4cf8-a4af-d4876ab55a2c", "creation": "2020-01-01 00:00:00", "is_new": false, "in_sales": false, "categories": [ { "category_ids": [ { "language": "de", "value": "12" }, { "language": "de", "value": "18" } ] } ], "suppliers": [ { "values": [ { "value": [ { "language": "de", "value": "Paul Ullrich AG" } ] } ] } ], "brands": [ { "values": [ { "value": [ { "language": "de", "value": "Famille Perrin" } ] } ] } ], "localized_string_attributes": [ { "name": "country", "values": [ { "value": [ { "language": "de", "value": "Frankreich" } ] } ] }, { "name": "culinary_txt", "values": [ { "value": [ { "language": "de", "value": "Fruchtdesserts, Hartkäse, Weichkäse" } ] } ] }, { "name": "grape_txt", "values": [ { "value": [ { "language": "de", "value": "Muscat" } ] } ] }, { "name": "awarded", "values": [ { "value": [ { "language": "de", "value": "Robert Parker: 90/100" } ] } ] } ], "numeric_attributes": [ { "name": "bio", "values": [ 1 ] } ], "product_groups": [ { "skus": [ { "internal_id": "55111", "title": [ { "language": "de", "value": "Muscat Beaumes de Venise ac 2014 - Famille Perrin" } ], "description": [ { "language": "de", "value": "In der Nase intensiv und frisch Zitrusnoten exotische Früchte etwas Minze und Honig. Im Gaumen reichhaltig und gehaltvoll sehr finessenreich und harmonisch. Schönes Spiel von Süsse und Frische lang und komplex im Abgang." 
} ], "price": [ { "list_price": [ { "value": "29.90" } ], "sales_price": [ { "value": "29.90" } ], "gross_margin": [ { "value": "8.7906" } ] } ], "stock": [ { "value": 1 } ], "visibility": [ { "values": [ { "language": "de", "value": 4 } ] } ], "status": [ { "language": "de", "value": 1 } ], "images": [ { "values": [ { "value": [ { "language": "de", "value": "https://www.flaschenpost.ch/media/catalog/product/9/1/91064e7c-3a8c-4d66-8d9a-06f1216a28fb.1042256.jpg" } ] } ] } ], "link": [ { "language": "de", "value": "https://www.flaschenpost.ch/muscat-beaumes-de-venise-ac-p109553" } ], "numeric_attributes": [ { "name": "alcohol_num", "values": [ 13 ] }, { "name": "h24h", "values": [ 0 ] } ], "localized_string_attributes": [ { "name": "enjoy_phase_txt", "values": [ { "value": [ { "language": "de", "value": "2013 - 2020" } ] } ] } ] } ] } ] }, "creation_tm": "2020-10-20 00:00:00", "client_id": 1, "src_sys_id": 1 }
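A minimal sketch of producing the JSONL format mentioned above, assuming the records are already available as Python dictionaries. The function name `write_jsonl` and the second record's identifier are hypothetical, and loading the result into BigQuery is not shown.

```python
import io
import json


def write_jsonl(records, stream):
    """Write each record as one compact JSON document per line."""
    for record in records:
        # Without an indent argument, json.dumps escapes newlines inside strings,
        # so each record stays on a single line.
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")


records = [
    {"product_line": {"product_groups": [{"skus": [{"internal_id": "55111"}]}]}},
    {"product_line": {"product_groups": [{"skus": [{"internal_id": "55112"}]}]}},
]
buffer = io.StringIO()
write_jsonl(records, buffer)
lines = buffer.getvalue().splitlines()
print(len(lines))  # 2
```

`ensure_ascii=False` keeps characters such as "ä" and "ü" readable in the output instead of escaping them. With a real file, the stream should be opened with UTF-8 encoding.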

 

Required Properties

The Boxalino data structure was designed to accommodate the data particularities of various e-commerce platforms. As a result, a number of properties are required for the computation of doc_product to succeed:

  1. title

  2. price

  3. visibility

  4. categories (with the extended definition exported in the doc_attribute_value export; at least one definition must have a parent_value_ids property set)

  5. doc_language
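A pre-export check for the required properties could look like the sketch below. It rests on two assumptions that this page does not confirm: a property counts as present for a SKU when it is non-empty on the SKU, its product group, or the product line, and doc_language is skipped because the example record does not show where it lives. The name `missing_required` is hypothetical.

```python
REQUIRED = ("title", "price", "visibility", "categories")


def missing_required(record):
    """Return {SKU internal_id: [missing property names]} for one record.

    Assumption: a property is present for a SKU when it is non-empty on the
    SKU itself, on its product group, or on the product line.
    """
    line = record.get("product_line", {})
    problems = {}
    for group in line.get("product_groups", []):
        for sku in group.get("skus", []):
            levels = (line, group, sku)
            missing = [
                name for name in REQUIRED
                if not any(level.get(name) for level in levels)
            ]
            if missing:
                problems[sku.get("internal_id")] = missing
    return problems


# Trimmed version of the example record, with visibility deliberately left out.
record = {
    "product_line": {
        "categories": [{"category_ids": [{"language": "de", "value": "12"}]}],
        "product_groups": [{
            "skus": [{
                "internal_id": "55111",
                "title": [{"language": "de", "value": "Muscat Beaumes de Venise ac 2014 - Famille Perrin"}],
                "price": [{"list_price": [{"value": "29.90"}]}],
            }]
        }],
    }
}
print(missing_required(record))  # {'55111': ['visibility']}
```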

Common Properties

All three levels share the following properties:

| Field name | Type | Mode | Description |
| --- | --- | --- | --- |
| internal_id | STRING | NULLABLE | the internal identifier |
| external_id | STRING | NULLABLE | the external identifier (can be the same as the internal identifier) |
| label | STRING | NULLABLE | label |
| creation | DATETIME | NULLABLE | |
| last_update | DATETIME | NULLABLE | |
| is_new | BOOLEAN | NULLABLE | |
| in_sales | BOOLEAN | NULLABLE | |
| product_relations | PRODUCT | REPEATED | relations to other products |
| other_relations | CONTENT | REPEATED | relations to other contents |
| stores | STRING | REPEATED | the stores |
| title | LOCALIZED | REPEATED | the title |
| description | LOCALIZED | REPEATED | the description |
| short_description | LOCALIZED | REPEATED | the short description |
| brands | LIST | REPEATED | the brands |
| suppliers | LIST | REPEATED | the suppliers |
| categories | RECORD | REPEATED | the categories |
| categories.categorization | STRING | NULLABLE | |
| categories.category_ids | LOCALIZED | REPEATED | |
| images | LIST | REPEATED | the images |
| link | LOCALIZED | REPEATED | the link |
| tags | TAG | REPEATED | the tags, e.g.: [STRUCT('tag', 'hello world', [STRUCT('de', 'hello world')])] |
| labels | LABEL | REPEATED | the labels of the product line, e.g.: [STRUCT('symbol', 'delivery', '24h', [STRUCT('de', '24-H Versand')])] |
| periods | PERIOD | REPEATED | information about the activity periods of the product line |
| string_attributes | MAP | REPEATED | additional string (not localized) attributes of the product line (MAP type: STRING) |
| localized_string_attributes | MAP | REPEATED | additional localized string attributes (MAP type: LOCALIZED in STRING) |
| numeric_attributes | MAP | REPEATED | additional numeric (not localized) attributes (MAP type: NUMERIC) |
| localized_numeric_attributes | MAP | REPEATED | additional localized numeric attributes (MAP type: LOCALIZED in NUMERIC) |
| datetime_attributes | MAP | REPEATED | additional datetime (not localized) attributes (MAP type: DATETIME) |
| localized_datetime_attributes | MAP | REPEATED | additional localized datetime attributes (MAP type: LOCALIZED in DATETIME) |
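The LOCALIZED and MAP shapes are easier to read from a concrete value. The helpers below are a sketch that reproduces the JSON shapes seen in the example record on this page (title, localized_string_attributes, numeric_attributes). The helper names are hypothetical, and other MAP variants may differ from what is shown here.

```python
def localized(values_by_language):
    """LOCALIZED: a list of {"language", "value"} entries."""
    return [
        {"language": language, "value": value}
        for language, value in values_by_language.items()
    ]


def localized_string_attribute(name, values_by_language):
    """One localized_string_attributes entry, shaped as in the example record."""
    return {"name": name, "values": [{"value": localized(values_by_language)}]}


def numeric_attribute(name, *numbers):
    """One numeric_attributes entry: a name plus a list of plain numbers."""
    return {"name": name, "values": list(numbers)}


print(localized_string_attribute("country", {"de": "Frankreich"}))
# {'name': 'country', 'values': [{'value': [{'language': 'de', 'value': 'Frankreich'}]}]}
print(numeric_attribute("alcohol_num", 13))
# {'name': 'alcohol_num', 'values': [13]}
```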

Properties specific to the product line & product group

| Field name | Type | Mode | Description |
| --- | --- | --- | --- |
| pricing | PRICING | NULLABLE | pricing information about the product line, e.g.: STRUCT('discount',[STRUCT('de','Bis:')],[STRUCT('de','-50:')],[STRUCT('de','%')]) |

 

Properties specific to the product group

| Field name | Type | Mode | Description |
| --- | --- | --- | --- |
| attribute_visibility_grouping | STRING | REPEATED | additional grouping options (e.g., by certain attributes/options for variants); set to empty [] as a default |

 

Properties specific to the product group & SKU

| Field name | Type | Mode | Description |
| --- | --- | --- | --- |
| visibility | VISIBILITY | REPEATED | the product visibility: VISIBILITY_NOT_VISIBLE = 1; VISIBILITY_IN_CATALOG = 2; VISIBILITY_IN_SEARCH = 3; VISIBILITY_BOTH = 4 |
| status | STATUS | REPEATED | the product status |
| price | PRICE | REPEATED | |
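The visibility constants listed for this table can be kept readable in export code with an enum. This is a sketch: the class name `Visibility` and the helper `visibility_entry` are hypothetical, while the numeric values come from the visibility description and the entry shape from the example record.

```python
from enum import IntEnum


class Visibility(IntEnum):
    """Numeric visibility values as listed in the visibility description."""
    NOT_VISIBLE = 1
    IN_CATALOG = 2
    IN_SEARCH = 3
    BOTH = 4


def visibility_entry(language, visibility):
    """One localized visibility value, shaped like the example record."""
    return {"values": [{"language": language, "value": int(visibility)}]}


print(visibility_entry("de", Visibility.BOTH))
# {'values': [{'language': 'de', 'value': 4}]}
```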

 

Properties specific to the SKU

| Field name | Type | Mode | Description |
| --- | --- | --- | --- |
| type | STRING | NULLABLE | the type value |
| sku | STRING | NULLABLE | the sku value |
| ean | STRING | NULLABLE | the ean value |
| additional_product_groups | RECORD | REPEATED | connection to other product groups |
| additional_product_groups.type | STRING | NULLABLE | |
| additional_product_groups.product_group | STRING | NULLABLE | |
| stock | STOCK | REPEATED | the current stock |
| individually_visible | BOOLEAN | NULLABLE | in addition to being an SKU in this product group, the product should also appear separately in the list of results as itself |
| show_out_of_stock | BOOLEAN | NULLABLE | show the product even if it is out of stock |

Resources

BigQuery JSON Schema

https://github.com/boxalino/data-integration-doc-schema/blob/master/doc/doc_product.json

BigQuery DDL

https://github.com/boxalino/data-integration-doc-schema/blob/master/ddl/doc_product.sql