In this post we present a few possible directions for using A.I. Transformer models in the field of E-Commerce.

The sections are in no particular order, as we are not yet sure where the realistic impact will be the biggest:

Better personalized product recommendations than NCF?

Data scientists like Denis Rothman make the case that Transformer models could be applied not to natural language but to behavior data from E-Shops, to better predict what visitors are likely to buy.

https://www.youtube.com/watch?v=QByLSKFSk0w

Other large platforms seem to do it already (https://paperswithcode.com/paper/behavior-sequence-transformer-for-e-commerce) or apply it to specific cases like size prediction (PreSizE: Predicting Size in E-Commerce using Transformers) or session-based recommendations (https://medium.com/nvidia-merlin/winning-the-sigir-ecommerce-challenge-on-session-based-recommendation-with-transformers-v2-793f6fac2994).

We have been using Neural Collaborative Filtering (NCF) for a while as a standard AI recommendation algorithm in Boxalino: CJP - Collaborative Filtering on Purchases

However, this algorithm has mainly shown good results for predicting the next purchase of logged-in customers, and less clear results for cross-selling, up-selling or behavior-based recommendations for visitors who are not logged in (which we call VJP).

It would be interesting to see whether a Transformer model could provide better results for the cases other than the typical personalized recommendations based on the purchase history (which we call CJP).
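To make the idea of treating behavior as a sequence more concrete, here is a minimal sketch of how session data could be turned into "context, next item" training pairs for a sequence model. The function name, the token format (`view:shoe-1`) and the `max_context` parameter are assumptions made for illustration; no actual model is trained here.

```python
from typing import Dict, List, Tuple


def next_item_examples(
    sessions: Dict[str, List[str]], max_context: int = 5
) -> List[Tuple[List[str], str]]:
    """Turn each session into (context, next item) training pairs."""
    examples = []
    for items in sessions.values():
        # Every step after the first one becomes a prediction target.
        for i in range(1, len(items)):
            context = items[max(0, i - max_context):i]
            examples.append((context, items[i]))
    return examples


# Hypothetical sessions: ordered behavior tokens per session id.
sessions = {
    "s1": ["view:shoe-1", "view:shoe-2", "cart:shoe-2"],
    "s2": ["view:sock-9", "purchase:sock-9"],
}
examples = next_item_examples(sessions)
for context, target in examples:
    print(context, "->", target)
```

A Transformer would be trained to predict the target from the context, in the same way a language model predicts the next word.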

Key opportunities:

WELCOME: Product suggestions on start pages - VJP : based on the prior visited pages and clicks
SEARCH/NAVI: In-site Search and Product Listings - VJP : based on the prior visited pages and clicks
UP-SELL: Similar Product Suggestions on PDP - VJP & PCC: could we find what other products / content are similar using these behaviors?
CROSS-SELL: Product Suggestions on Overlay and Basket Page - VJP & PCC: could we find what other products are good to add to what is already in the basket using these behaviors?
RE-BUY: Product Suggestions on My Account Page
INDIVIDUALIZE: Product Suggestions in E-Mails - VJP : same as WELCOME, but here preprocessing is less of a problem as we have the time to generate them
PROMOTE: Banner Suggestions in E-shop and E-Mail - VJP : same as WELCOME but with marketing banners and not with products
READ: Promote Blog and Editorial Content - VJP : same as WELCOME but with blog/magazine/… content and not with products

Key challenges / questions:

How can the prediction run in real-time (which is optimal for VJP)?
If that is not possible, which cases should be preprocessed, and how should they be detected in real-time (so that they can be applied)?
Is the approach usable in a non-personalized way for up-selling and cross-selling, or only for VJP?
What should the Transformer predict? A short list of products to highlight (recommend or put at the top of a listing), more generalizable rules (which would be less precise, but affect more than a short list of products), or a mix of both?
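One possible answer to the preprocessing question can be sketched as a precomputed lookup table that is keyed by the most recently visited pages and falls back to shorter contexts when no exact match exists. The table content, the key format and the `lookup` function are hypothetical; in practice the table would be filled offline by whatever model is chosen.

```python
from typing import Dict, List, Tuple

# Hypothetical precomputed table: recent pages -> recommended product ids.
PRECOMPUTED: Dict[Tuple[str, ...], List[str]] = {
    ("category:cat-food", "product:p1"): ["p7", "p3"],
    ("product:p1",): ["p2", "p3"],
    (): ["p9", "p8"],  # global fallback, e.g. bestsellers
}


def lookup(recent_pages: List[str], max_context: int = 2) -> List[str]:
    """Return the entry for the longest matching suffix of recent pages."""
    for size in range(min(max_context, len(recent_pages)), -1, -1):
        # Slice from an explicit start index so that size 0 gives an empty key.
        key = tuple(recent_pages[len(recent_pages) - size:])
        if key in PRECOMPUTED:
            return PRECOMPUTED[key]
    return []


print(lookup(["home", "category:cat-food", "product:p1"]))
print(lookup(["home", "product:p1"]))
print(lookup(["home"]))
```

The real-time part is then only a dictionary lookup; the trade-off is that contexts not seen during preprocessing only get the more generic fallback.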

“Best Next Click”: Visitor Pathways Optimization

Identifying and reporting on the key patterns (positive or negative, or simply important) of the visitor journey is not easy to do in a way which shows valuable and usable insight to our clients.

Maybe a Transformer model trained on the online behaviors could highlight important navigation flows (sequences / patterns) and make suggestions where the flows are failing (dead-ends, …) or are working well and should be amplified (with more traffic).
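As a point of comparison for any Transformer-based approach, a simple counting baseline (not a Transformer) can already surface candidate dead-ends and strong flows. The sketch below computes, from hypothetical sessions given as lists of page paths, the exit rate per page and the frequency of each page-to-page transition.

```python
from collections import Counter
from typing import Dict, List, Tuple


def exit_rates(sessions: List[List[str]]) -> Dict[str, float]:
    """Share of the views of each page that were the last step of a session."""
    views: Counter = Counter()
    exits: Counter = Counter()
    for pages in sessions:
        views.update(pages)
        if pages:
            exits[pages[-1]] += 1
    return {page: exits[page] / views[page] for page in views}


def transition_counts(sessions: List[List[str]]) -> Dict[Tuple[str, str], int]:
    """Count how often each (page, next page) pair occurs."""
    counts: Counter = Counter()
    for pages in sessions:
        counts.update(zip(pages, pages[1:]))
    return dict(counts)


# Hypothetical sessions, each an ordered list of visited paths.
sessions = [
    ["/home", "/cats", "/cats/food", "/checkout"],
    ["/home", "/cats", "/faq"],
    ["/home", "/faq"],
]
rates = exit_rates(sessions)
flows = transition_counts(sessions)
print(rates)
print(flows)
```

In this toy data every visit to `/faq` ends the session, which would make it a candidate dead-end, while `/home` to `/cats` is the most frequent flow.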

We consider this a key opportunity for the visual guidance in the e-shop, as Boxalino can not only show products (listing and recommendations), but also suggest “next pages” either as a filtering or a change of the current context of the page.

As an example, a session (or a visit) can be viewed as a sequence of steps (events) mainly (but not 100%) corresponding to a sequence of page views.

By session (visit) we mean here, as in Google Analytics, a sequence of interactions with a visitor (typically detected by a long-lasting cookie) without a gap of 30 minutes or more between two interactions (such a gap would start a new session for the same visitor).
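The session definition above can be sketched as follows, assuming we only have the interaction timestamps of a single visitor. The function name and the sample data are illustrative.

```python
from datetime import datetime, timedelta
from typing import List

SESSION_GAP = timedelta(minutes=30)


def split_into_sessions(timestamps: List[datetime]) -> List[List[datetime]]:
    """Group one visitor's interactions into sessions.

    A gap of 30 minutes or more between two interactions starts a new session.
    """
    sessions: List[List[datetime]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < SESSION_GAP:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


start = datetime(2023, 1, 1, 10, 0)
interactions = [
    start,
    start + timedelta(minutes=10),
    start + timedelta(minutes=40),  # exactly 30 minutes after the previous one
    start + timedelta(minutes=45),
]
visits = split_into_sessions(interactions)
print([len(v) for v in visits])
```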

To keep it simple, we can consider a page view (the event) here to be a visited URL of a web-site.

Each of these events represents a step in the session journey and has the following properties:

name	format	description
session_id	STRING	A unique string identifier for the session
page_view_number	INT64	a number from 1 to n defining the step in the session (so the first page view is 1, the second is 2, etc.)
event_timestamp	TIMESTAMP	the timestamp of the current event
next_event_timestamp	TIMESTAMP	the timestamp of the next event
event	EVENT (RECORD)	details about the event
page_location	STRING	url of the page for this event
related_requests	ARRAY<REQUEST (RECORD)>	additional information giving us more details about what the page is and what the user has seen. Requests define what type of filters were applied (for instance a brand if the user is on a brand page) and what type of products were shown on the page
related_events	ARRAY<EVENT (RECORD)>	additional information giving us more details about what the page is and what the user has seen. These other events give more information about the type of page (category page, search page, etc.) as well as the engagement of the user with products (and content) on the page (scroll when the content/product is displayed, click when a content/product is clicked)

So, in addition to the plain sequence of URLs visited in the session, a lot of extra parameters (some of them highly structured) are also available in the related requests and events!
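To illustrate how such steps could be fed to a sequence model, here is a minimal sketch that flattens a session into a list of tokens. The dataclass is a simplified view of the properties above (the related events are reduced to their names), and the token format, the dwell-time buckets and the assumption that `next_event_timestamp` is empty for the last step are ours, not part of the data model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class PageViewStep:
    session_id: str
    page_view_number: int
    event_timestamp: datetime
    next_event_timestamp: Optional[datetime]  # assumed None for the last step
    page_location: str
    related_event_names: List[str] = field(default_factory=list)


def dwell_bucket(step: PageViewStep) -> str:
    """Bucket the time spent on the page (thresholds are arbitrary)."""
    if step.next_event_timestamp is None:
        return "dwell:last"
    seconds = (step.next_event_timestamp - step.event_timestamp).total_seconds()
    if seconds < 10:
        return "dwell:short"
    if seconds < 60:
        return "dwell:medium"
    return "dwell:long"


def to_tokens(steps: List[PageViewStep]) -> List[str]:
    """Flatten the steps of one session into an ordered token sequence."""
    tokens: List[str] = []
    for step in sorted(steps, key=lambda s: s.page_view_number):
        tokens.append("page:" + urlparse(step.page_location).path)
        tokens.extend("event:" + name for name in step.related_event_names)
        tokens.append(dwell_bucket(step))
    return tokens


steps = [
    PageViewStep("s1", 2, datetime(2023, 1, 1, 10, 0, 45), None,
                 "https://shop.example/cart", ["add_to_cart"]),
    PageViewStep("s1", 1, datetime(2023, 1, 1, 10, 0, 0),
                 datetime(2023, 1, 1, 10, 0, 45),
                 "https://shop.example/cats?sort=price",
                 ["view_listing", "click"]),
]
tokens = to_tokens(steps)
print(tokens)
```

Which properties become tokens, and at what granularity, is exactly the kind of design choice that would need to be tested.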

Detailed information about the Records (event and request):

Event

name	format	description
event_timestamp	TIMESTAMP	timestamp of the event
event_name	STRING	name of the event, one of: page_view (visit a page); add_to_cart (add a product to the cart); purchase (make a purchase); view_search (visit the search page, with listing of results); scroll (some content/product appears on the screen); click (some content/product is clicked); login (the visitor logged in); view_item (visited a product page); view_listing (visited a category listing page)
event_params	RECORD	associative array of parameters about the event
visitor	RECORD	information about the visitor
device	RECORD	information about the device
geo	RECORD	information about the geo-ip location
traffic_source	RECORD	information about the traffic source
ecommerce	RECORD	only for purchase events: the purchase revenue and other sales parameters
items	ARRAY<RECORD>	for purchase, view_item and add_to_cart, information about the product(s)
system	RECORD	some technical parameters
creation_tm	TIMESTAMP	technical debugging field, please ignore
client_id	STRING	technical debugging field, please ignore
src_sys_id	STRING	technical debugging field, please ignore

Request:

Name	Format	Description
request_id	STRING	a unique identifier of the request
variant_id	STRING	the identifier of the algorithm test variant which was applied
bundle_id	STRING	a grouping identifier (technical debugging field, please ignore)
request_ts	TIMESTAMP	the request timestamp
session_id	STRING	the request session id
request_order_num	INT64	the numeric identifier for bundling (technical debugging field, please ignore)
choice_id	STRING	the name of the logic used (for example ‘search’ for a search result page, ‘navigation’ for a category listing page, etc.), see widget in: Narrative API - Technical Reference | Base parameters
scenario_id	STRING	a specific scenario of the test variant algorithm which was applied
language_cd	STRING	the language of the request, as in: Narrative API - Technical Reference | Base parameters
groupby_cd	STRING	the field the results of the request were grouped by
products_group_id	STRING	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
offset	INT64	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
query_txt	STRING	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
final_query_txt	STRING	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
sort_field_name	STRING	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
sort_direction_cd	STRING	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
items	LIST<RECORD>	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#UP-SELL-%2F-CROSS-SELL-REQUEST
filters	LIST<RECORD>	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
facets	LIST<RECORD>	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Facets-without-selected-Value-by-with-additional-parameters%3A
parameters	LIST<RECORD>	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
others	LIST<RECORD>	see details here : https://boxalino.atlassian.net/wiki/spaces/BPKB/pages/8749643/Narrative+API+-+Technical+Reference#Typical-parameters
response	LIST<RECORD>	the products (and other content) which were returned and shown on the page
creation_tm	TIMESTAMP	technical debugging field, please ignore
client_id	STRING	technical debugging field, please ignore
src_sys_id	STRING	technical debugging field, please ignore

Similarity between products, contents and across both

Transformer models are well suited to NLP, and the textual descriptions of products as well as the textual content of blog and magazine articles are natural inputs for them.

Such content could be used for several reasons (as highlighted in other sections of the post) but one of them could be the creation of similarities / related contents between products, between contents and between products and content.
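The similarity idea can be sketched as a nearest-neighbour search over text embeddings, across products and content alike. In the sketch below the `embed` function is only a stand-in (plain word counts) so that the example runs without any model; in a real setup it would be replaced by the output of a Transformer encoder. The ids and texts are made up.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Stand-in embedding: word counts. A Transformer encoder would go here."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(
    target_id: str, texts: Dict[str, str], top_n: int = 2
) -> List[Tuple[str, float]]:
    """Rank all other items (products or content) by similarity to the target."""
    target = embed(texts[target_id])
    scored = [
        (other_id, cosine(target, embed(text)))
        for other_id, text in texts.items()
        if other_id != target_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical catalog mixing products and editorial content.
texts = {
    "product:1": "dry cat food with chicken",
    "product:2": "dry dog food with beef",
    "blog:7": "how to choose dry cat food",
}
ranking = most_similar("product:1", texts)
print(ranking)
```

With word counts, the dog food ranks above the cat food article for the cat food product; a better embedding is precisely what should change this kind of result.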

Key opportunities:

UP-SELL: Similar Product Suggestions on PDP - PCC: could we find what other products / content are similar using their description?
CROSS-SELL: Product Suggestions on Overlay and Basket Page - PCC: could we find what other products are good to add to what is already in the basket using these behaviors?
PROMOTE: Banner Suggestions in E-shop and E-Mail - PCC: same as UP-SELL but with marketing banners and not with products
READ: Promote Blog and Editorial Content - VJP : same as UP-SELL but with blog/magazine/… content and not with products

Key challenges / questions:

Our prior attempts to create relations between products did not show great results: using the other attributes (category, brand, …) already covered the same outcomes, and in the cases where it did not, the results were not much better
Would the model generate attributes / tags which can be used for similarity as well? Or would it only provide, for each piece of content, a list of highly related other pieces of content?

Generate Texts for SEO

Our clients use SEO tools like SEMRush, whose data could be (but is not yet) exported to Google BigQuery (https://www.semrush.com/kb/5-api).

Our clients who use such tools are able to identify many (small) opportunities that they have to improve each time manually.

Generating SEO improvements (new texts, better texts based on suggested keywords, etc.) could be a massive win.

Boxalino could automate the implementation of these SEO texts on the web-site and, if the SEO data is exported to BigQuery, provide a feedback loop to check and fine-tune the results automatically.

The logic could also simply improve / enrich the product textual description as described here:

https://machine-learning-company.nl/technisch/transformer-based-language-models-for-large-scale-e-commerce-product-data-enrichment/

It is also worth mentioning that we (sometimes) have the product descriptions of competitors, so comparing the texts and making ours more unique could be a strong advantage.

A/B test different texts

Boxalino can easily provide different textual content for a product page or a content page in a personalized or a/b tested way.

Currently, this option is not used because of the challenge of automatically creating an alternative version of the textual description of thousands of products and/or contents.

Being able to “change the tone”, “change the style”, or apply other types of systematic changes to the textual content could be very interesting to a/b test and to use for personalization.
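Once alternative texts exist, each visitor needs to see the same variant consistently. A common way to do this is to hash the visitor id together with an experiment name; the sketch below shows the principle. It is not a description of how Boxalino assigns variants, and the experiment and variant names are made up.

```python
import hashlib
from typing import List


def assign_variant(visitor_id: str, experiment: str, variants: List[str]) -> str:
    """Deterministically map a visitor to one text variant."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


variants = ["original", "casual-tone"]
print(assign_variant("visitor-1", "pdp-text", variants))
```

Because the assignment depends only on the ids, no state has to be stored to keep a visitor in the same group across page views.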

Generate Texts for Landing Pages

Boxalino generates many landing pages for which a list of products is displayed. Adding automatically generated texts to these landing pages would make them directly more effective for SEO and SEA.

Here is an example showing a list of products and 2 random banners on top (which is not optimal):

https://www.mcdrogerie.ch/t-Eisen

Generate Small Netflix-like label from filter

Boxalino mechanically generates small prompts, based on filters, on the home page of e-shops like Qualipet:

https://www.qualipet.ch/

For example : “NEUHEITEN VON HARMONY FÜR IHRE KATZE”

is generated from a filter on the property “new”, the brand “Harmony” and the animal “cat”.

The mechanism is very limited and can easily generate non-optimal prompts.

It would be interesting to see if a Transformer model would be able to generate better prompts.
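To illustrate why a purely mechanical approach is limited, here is a hypothetical template-based generator that reproduces the example above. It is not Boxalino's actual implementation; the filter keys, the translation table and the fallback word are assumptions.

```python
from typing import Dict

# Hypothetical translation of filter values into German label words.
ANIMAL_DE = {"cat": "KATZE", "dog": "HUND"}


def mechanical_label(filters: Dict[str, str]) -> str:
    """Build a label by concatenating fixed text fragments per filter."""
    parts = ["NEUHEITEN" if filters.get("new") == "true" else "PRODUKTE"]
    if "brand" in filters:
        parts.append("VON " + filters["brand"].upper())
    if "animal" in filters:
        parts.append("FÜR IHRE " + ANIMAL_DE[filters["animal"]])
    return " ".join(parts)


print(mechanical_label({"new": "true", "brand": "Harmony", "animal": "cat"}))
print(mechanical_label({"new": "true", "brand": "Harmony", "animal": "dog"}))
```

The second label comes out as “FÜR IHRE HUND”, which is grammatically wrong in German (it should be “FÜR IHREN HUND”). A fixed template cannot handle such agreement rules without ever more special cases, which is where a generative model could help.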

Emotional analysis of user comments

Better understanding for which products (or groups of products) user comments are positive or negative could be valuable in many respects, as described here:

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0247984

Boxalino Public Knowledge Base

A.I. Transformer models for SMB E-Commerce