doc_product
Content: Products data
All the data about products
Overview
This is the core data of the products.
It is based on a three-level structure:
Product line
Product group
SKU
Product line
Using the first level (product line) is optional. It is only helpful if a lot of data is identical for all the products of a product line and you don't want to copy it into every product.
If you don't need the product line, skip all the product_line fields and put your data directly in a single entry of product_line.product_groups. Put only one product group per record: several product_groups belong in the same record only when you want to group them in the same product line.
Product group
This is the level shown to the end user on a page. If the same product is available in different sizes, colors, etc., you have one product group with several SKUs.
If you have no product groups, only SKUs, and show each SKU as a separate result to the user, proceed as for the product line: ignore all the product group properties and add a single SKU (only one per record).
SKU
This is the level of what the user can actually buy: each SKU is a unique identifier of a buyable item.
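As a rough illustration of the three levels, one record can be pictured as a nested Python dictionary. Only product_line.product_groups and the property names internal_id and sku come from this page; the "skus" key and all values are assumptions made for the sketch, and the authoritative field names are in the schema linked under Resources.

```python
# One doc_product record sketched as a nested dict.
# Assumption: the "skus" key name and all values are illustrative only.
record = {
    "product_line": {
        "internal_id": "line-1",
        "product_groups": [
            {
                "internal_id": "group-1",
                "skus": [
                    {"internal_id": "sku-1", "sku": "SKU-001"},
                    {"internal_id": "sku-2", "sku": "SKU-002"},
                ],
            }
        ],
    }
}


def count_levels(rec):
    """Return (number of product groups, number of SKUs) in one record."""
    groups = rec["product_line"]["product_groups"]
    return len(groups), sum(len(g["skus"]) for g in groups)


group_count, sku_count = count_levels(record)
print(group_count, sku_count)
```

Here the record holds one product group (one entry in a product list) that offers two buyable SKUs.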
Example
In the example, we consider one product (a wine) with the following logic:
Option 1 (one search result per year and per bottle size)
Product line = the wine (independently of its year)
Product group = transparent (one SKU per product group)
SKU = different products based on the year of the wine
Option 2 (one search result per year but not per bottle size)
Alternatively, the data could be structured as follows:
Product line = the wine (independently of its year)
Product group = different products based on the year of the wine
SKU = the specific bottle size
The difference between option 1 and option 2 is what the user sees in a product list (the search results, for example). If the user should see one entry for each year and each bottle size, apply option 1. If the user should see only one entry per year and then select a bottle size, either on the PDP or directly in the list, before adding it to the basket, apply option 2.
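The effect of the two options on a product list can be sketched as follows. The years, bottle sizes, and the "label" and "skus" keys are hypothetical values chosen for illustration; the only point taken from this page is that the product group is the level shown to the user, so the number of product groups equals the number of list entries.

```python
# Hypothetical wine data: two years, two bottle sizes each.
YEARS = [2018, 2019]
SIZES = ["0.75l", "1.5l"]


def option_1(years, sizes):
    """One product group per (year, size); each group holds exactly one SKU."""
    return [
        {"label": f"{year} {size}", "skus": [{"label": f"{year} {size}"}]}
        for year in years
        for size in sizes
    ]


def option_2(years, sizes):
    """One product group per year; bottle sizes are SKUs inside the group."""
    return [
        {"label": str(year), "skus": [{"label": size} for size in sizes]}
        for year in years
    ]


# The product group is the level shown in a product list,
# so the number of groups is the number of list entries.
entries_option_1 = len(option_1(YEARS, SIZES))
entries_option_2 = len(option_2(YEARS, SIZES))
print(entries_option_1, entries_option_2)
```

With two years and two sizes, option 1 yields four list entries and option 2 yields two, each offering two sizes to choose from.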
Please note that some attributes are defined at the product_line level in the example, and others at the SKU level. The idea is simple: if an attribute is the same for all the product groups and SKUs, it can be defined at the product_line level. If it differs, it should be defined at the product group or SKU level. Defining everything at the SKU level also works and avoids having to sort out which attribute goes where, but it can make the data much bigger and significantly slower to process. Putting each attribute at the right level of granularity is therefore recommended.
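One way to picture the levels of granularity is as a merge from the most general level to the most specific one. The sketch below is only an illustration of that idea: the attribute names and values are hypothetical, and the rule that a more specific level overrides a more general one is an assumption, not a documented behavior of the platform.

```python
def effective_attributes(line_attrs, group_attrs, sku_attrs):
    """Merge attributes of the three levels into one view for a single SKU.

    Assumption: on a name clash, the more specific level wins.
    """
    merged = dict(line_attrs)
    merged.update(group_attrs)
    merged.update(sku_attrs)
    return merged


# Hypothetical attributes for the wine example.
line_attrs = {"grape": "Merlot", "country": "CH"}  # same for every SKU
group_attrs = {"year": 2019}                       # differs per product group
sku_attrs = {"bottle_size": "0.75l"}               # differs per SKU

sku_view = effective_attributes(line_attrs, group_attrs, sku_attrs)
print(sku_view)
```

Shared values such as the grape are stored once on the product line instead of being repeated on every SKU, which is what keeps the data smaller.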
Required Properties
The Boxalino Data Structure was designed to accommodate the data particularities of various e-commerce platforms. As a result, a number of properties are required for the computation of doc_product to succeed:
title
price
visibility
categories (with the extended definition exported in the doc_attribute_value export; at least one definition must have a parent_value_ids property set)
doc_language
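A pre-export check for the required properties could look like the sketch below. It is a minimal illustration, not part of any Boxalino tooling: the function name, the "skus" key, and the sample values are hypothetical, and because this page does not state at which level each property must be set, the check accepts a property found at any level. It does not verify the parent_value_ids condition on categories.

```python
REQUIRED = ("title", "price", "visibility", "categories")


def missing_required(product_line):
    """Return the required property names that are empty at every level."""
    levels = [product_line]
    for group in product_line.get("product_groups", []):
        levels.append(group)
        levels.extend(group.get("skus", []))  # "skus" key is an assumption
    found = {name for level in levels for name in REQUIRED if level.get(name)}
    return [name for name in REQUIRED if name not in found]


# Hypothetical record: title and categories on the line, visibility on the group.
line = {
    "title": [{"language": "de", "value": "Wein"}],
    "categories": [{"categorization": "navigation", "category_ids": ["12"]}],
    "product_groups": [{"visibility": [{"value": 4}], "skus": [{}]}],
}

missing = missing_required(line)
print(missing)
```

In this sample, price is not set at any level, so it is the only property reported as missing.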
Common Properties
All three levels share the following properties:
Field name | Type | Mode | Description |
---|---|---|---|
internal_id | STRING | NULLABLE | the internal identifier |
external_id | STRING | NULLABLE | the external identifier (can be the same as the internal identifier) |
label | STRING | NULLABLE | label |
creation | DATETIME | NULLABLE | |
last_update | DATETIME | NULLABLE | |
is_new | BOOLEAN | NULLABLE | |
in_sales | BOOLEAN | NULLABLE | |
product_relations | | REPEATED | relations to other products |
other_relations | | REPEATED | relations to other contents |
stores | STRING | REPEATED | the stores |
title | | REPEATED | the title |
description | | REPEATED | the description |
short_description | | REPEATED | the short description |
brands | | REPEATED | the brands |
suppliers | | REPEATED | the suppliers |
categories | RECORD | REPEATED | the categories |
categories.categorization | STRING | NULLABLE | |
categories.category_ids | | REPEATED | |
images | | REPEATED | the images |
link | | REPEATED | the link |
tags | | REPEATED | the tags, e.g.: [STRUCT('tag', 'hello world', [STRUCT('de', 'hello world')])] |
labels | | REPEATED | the labels of the product line, e.g.: [STRUCT('symbol', 'delivery', '24h', [STRUCT('de', '24-H Versand')])] |
periods | | REPEATED | information about the activity periods of the product line |
string_attributes | | REPEATED | additional string (not localized) attributes of the product line |
localized_string_attributes | | REPEATED | additional localized string attributes |
numeric_attributes | | REPEATED | additional numeric (not localized) attributes |
localized_numeric_attributes | | REPEATED | additional localized numeric attributes |
datetime_attributes | | REPEATED | additional datetime (not localized) attributes |
localized_datetime_attributes | | REPEATED | additional localized datetime attributes |
Properties specific to the product line & product group
Field name | Type | Mode | Description |
---|---|---|---|
pricing | | NULLABLE | pricing information about the product line, e.g.: STRUCT('discount',[STRUCT('de','Bis:')],[STRUCT('de','-50:')],[STRUCT('de','%')]) |
Properties specific to the product group
Field name | Type | Mode | Description |
---|---|---|---|
attribute_visibility_grouping | STRING | REPEATED | additional grouping options (e.g. by certain attributes/options for variants) |
Properties specific to the product group & SKU
Properties specific to the SKU
Field name | Type | Mode | Description |
---|---|---|---|
type | STRING | NULLABLE | the type value |
sku | STRING | NULLABLE | the sku value |
ean | STRING | NULLABLE | the ean value |
additional_product_groups | RECORD | REPEATED | connection to other product groups |
additional_product_groups.type | STRING | NULLABLE | |
additional_product_groups.product_group | STRING | NULLABLE | |
stock | | REPEATED | the current stock |
individually_visible | BOOLEAN | NULLABLE | in addition to being an SKU in this product group, the product should also appear separately in the list of results as itself |
show_out_of_stock | BOOLEAN | NULLABLE | show the product even if it is out of stock |
Resources
BigQuery JSON Schema
https://github.com/boxalino/data-integration-doc-schema/blob/master/doc/doc_product.json
BigQuery DDL
https://github.com/boxalino/data-integration-doc-schema/blob/master/ddl/doc_product.sql