Start your Data Transformation Process

Introduction

You have decided to be (more) data-driven and to rely less on the infamous “gut feeling”?

That’s great, but how will you do it? What should you address first?

You cannot go from “gut feeling” to “data-driven” for all facets of your business in one day.

Therefore, much as companies do with a “digitalization” process, you need to go through a Data Transformation Process.

At Boxalino Winning Interactions, we are here to help, not only with our technologies, which will be the key pillar of your success, but also with a concrete methodology that you can apply in practice, step-by-step and area-by-area, throughout your activities.

In this post, we present this methodology and what it means in the concrete example of Google Shopping optimization.

Step 1: The Detailed Tracking of what appears on the screen

In the old days, tracking meant keeping a log of the page URLs accessed by your visitors. Arguably, you might already do better than that today (for example, you might already use options like the Google Analytics Enhanced Ecommerce tracking).

But to become really data-driven, you need to adopt a more radical tracking approach: “track everything that appears on the screen of the visitor’s device”.

For example, if a visitor comes to a Product Detail Page (PDP) and sees a delivery time of 2 weeks because the product is not currently in stock, this information should be tracked at that moment and in a structured way (meaning a clear name and easy-to-analyze values, like a color code ‘orange’ and the exact delivery date shown to the visitor: ‘22.08.2022’). Also, it should only be tracked if it actually appeared on the screen of the visitor’s device (along with for how long), and not simply because it was part of the content of the page.
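
As an illustration only, such a structured event could look like the following minimal Python sketch. The class name, the field names and the `should_track` rule are assumptions made for this example; they are not the format used by the Boxalino tracker.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class DisplayedElementEvent:
    """One element that actually appeared on the visitor's screen (hypothetical schema)."""
    page_type: str          # e.g. "PDP"
    element: str            # clear, stable name of the tracked element
    color_code: str         # easy-to-analyze value, e.g. "green" or "orange"
    delivery_date: date     # the exact date shown to the visitor
    visible_seconds: float  # how long the element stayed on screen


def should_track(visible_seconds: float) -> bool:
    """Track only elements that were really visible, not everything in the page content."""
    return visible_seconds > 0


def to_record(event: DisplayedElementEvent) -> dict:
    """Flatten the event into a plain dict, with the date in ISO format."""
    record = asdict(event)
    record["delivery_date"] = event.delivery_date.isoformat()
    return record


# The example from the text: an orange delivery indication for 22.08.2022.
event = DisplayedElementEvent("PDP", "delivery", "orange", date(2022, 8, 22), 3.5)
record = to_record(event)
```

Storing the value that was shown (rather than recomputing it later from stock and orders) is what makes the record directly usable for analysis.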

Such a radical tracking approach is important for the following reasons:

  1. To evaluate which factors have an impact on your KPIs (conversion rate, average order value, etc.), you need to base your assessment on what your visitors have actually seen (and even on how long they looked at it, which might indicate that it mattered to them), and not only on what was part of the page.

  2. One could argue that what the visitor saw (which depends on the stock status at that moment) can be calculated from the product URL, the stock at the beginning of the day and the number of orders placed that day before the time of the page view.
    But it is much more effective to directly “track what the visitor has seen”. This removes both the effort of making such a calculation and the risk that the result is not fully reliable, due to the complexity of the calculation and other data issues.
    Also, you will want to track many elements on many pages, and you will not have the resources to do so if each event requires a complicated calculation before it can be analyzed accurately.

  3. Over time, you might change the way you display the product delivery (going from one color code to another, adding or changing icons, indicating the exact date or the duration instead). All these visual differences should alter what is tracked, because they affect what the visitors saw.

In short, if you track what visitors see precisely when they see it, you track final and definitive data you can analyze directly and efficiently without doubting the validity of the data.

Boxalino Winning Interactions provides an efficient and fast way to track everything (whether or not the content comes from our API) simply by “tagging” your HTML with attributes, which makes it significantly easier than writing all the tracking in JavaScript yourself.

Use Case: Google Shopping Optimization

In the case of Google Shopping Optimization, the tracking of what is happening before the user clicks on the organic or advertised product in Google is provided by Google directly.

Make sure to integrate the 3 data transfers to BigQuery provided by Google, so that the data is available for analysis: Google Ads, Google Analytics 4 and Google Merchant Center, as documented here.

What remains to be tracked is what happens on your web-site, typically on these pages:

  1. The Product Detail Page

  2. (all other pages the visitors might visit after landing on the Product Detail Page)

  3. The Basket

  4. The Checkout

As part of our “area-by-area” methodology, we recommend that you first set up basic tracking for pages 2, 3 and 4 (as per the JS Tracker API standard events, without the parts requiring ‘bx-attributes’).

You can (and should) address them in more detail in the future, but understanding exactly what is happening on the Product Detail Page is the most important task for your Google Shopping Optimization.

Therefore, for this case, focus your tracking effort on point 1: The Product Detail Page. Here are the typical visual elements which should be tracked:

  1. Main Image (and image sliders)

  2. Prices (main and other displayed pricing information, like original price and discount)

  3. Delivery (stocks, delivery time, possible delivery options, …)

  4. Ratings and reviews (number of stars, number of reviews, display of the reviews, …)

  5. Variant selection (if several options are possible)

  6. Add to basket, add to wish-list and other options (pinning, …)

  7. Loyalty program information

  8. Description: text, expansion of text, description sections, table(s) with information, …

  9. Recommendations: similar products, cross-selling products, bundles

  10. Advantages: free gifts, …

  11. Related content

As a guideline, everything appearing in the first page scroll (on a desktop screen) should be tracked in full detail, most of what appears in the second and third page scrolls should be tracked in good detail, and at least some elements should be tracked in each page scroll below (so that you have at least an idea of whether people scroll that deep at all).

Step 2: Better Analytics: Source → Journey → Results

A highly effective principle is to define dedicated analytics for each area of your business you want to optimize (one after the other), and to define them from a customer journey perspective.

While you will probably need some analytics where the customer journey is not present at all (about logistics, stocks, returns, suppliers’ kickbacks and so on), we recommend orienting and structuring your analytics around the customer journey as much as possible.

A simple way to do so is to adopt the Source → Journey → Results model wherever possible.

Source → Results & Behaviors → Results versus Source → Journey → Results

Web analytics typically focuses, on the one hand, on the performance of traffic sources (e.g. how much money you made by attracting how many visitors from Google Shopping, and at what cost). This is what we call the Source → Results model.

On the other hand, many other reports show different behaviors (how many people make a search, visit a specific page or come back many times), their frequency and their business results (e.g. the 15% of your visitors who use the search contribute 50% of your turnover). This is what we call the Behaviors → Results model.

These two models suffer from 2 key problems:

  • Source → Results tells you what the result is, but not why you have it or what to do to improve it
    For example: My return on ad spend is very low for one of my key brands and therefore the Google algorithm doesn’t allocate much budget to it. What should I do?

  • Behaviors → Results is hard to put into context, so it quickly becomes hard to know whether you are addressing a situation which can be improved
    For example: My mobile conversion rate is much lower than my desktop conversion rate, and I invested months of work trying to improve it before realizing this is the case for most e-shops

We therefore suggest another approach, which we call the Source → Journey → Results model.

Our analytics model: Source → Journey → Results

The Source → Journey → Results model starts exactly like the Source → Results model, by identifying a source of traffic (for example the Google Shopping Ads) and its results (the ROAS and other KPIs of your campaigns).

But instead of considering only the Source as a segmentation (what results do I get per campaign, or per product brand), we also include the key patterns of the Customer Journey (for example, how often the delivery indication was green (1-2 days) versus orange (1-2 weeks)).

This is where all the effort put into the tracking of Step 1 starts to bear fruit. While it might be interesting to know in general how frequently people visit product detail pages with each delivery status, it might be hard to judge what to do about it. However, if it turns out to be a key factor in the difference between a positive and a negative return on ad spend, its value becomes very clear.
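
To make the idea concrete, here is a minimal Python sketch that computes the ROAS of one traffic source per journey segment (the delivery color seen on the PDP). The session data and the function name are invented for illustration; in practice the rows would come from your own tracking and ad cost data.

```python
from collections import defaultdict

# Hypothetical ad-click sessions from one source:
# (delivery color shown on the PDP, ad cost, revenue)
sessions = [
    ("green", 0.50, 40.0),
    ("green", 0.50, 0.0),
    ("orange", 0.50, 0.0),
    ("orange", 0.50, 2.0),
]


def roas_by_journey_segment(rows):
    """Return revenue / ad cost for each journey segment."""
    cost = defaultdict(float)
    revenue = defaultdict(float)
    for segment, ad_cost, ad_revenue in rows:
        cost[segment] += ad_cost
        revenue[segment] += ad_revenue
    # Segments without any ad cost are skipped to avoid dividing by zero.
    return {seg: revenue[seg] / cost[seg] for seg in cost if cost[seg] > 0}


roas = roas_by_journey_segment(sessions)
```

With this made-up data, the same source shows a very different return depending on the delivery indication the visitors saw, which is exactly the kind of pattern a plain Source → Results report would hide.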

Use Case: Google Shopping Optimization

The use case of Google Shopping Optimization provides a very helpful example of the difference between Source and Journey segmentations.

A Source segmentation will, for instance, focus on the brand of the product which was advertised and generated the visit to your web-site. We will call this source segmentation the “Proxy Brand” (the brand of the product the visitor clicked on, not necessarily the brand of the product they bought).

A Journey segmentation will tell you how the customer moved from this first product page to possibly other product pages (or other pages), up to the purchase of a basket containing different products, including (or not) the initially clicked product.

As a result, you will have a collection of Brand segmentations & Metrics:

| Segmentation | Metric | Description |
|---|---|---|
| Proxy Brand | ROAS, Clicks, Costs, … | All the typical Source → Results analytics you probably already know about |
| Proxy Brand | Product In Scope%, Qty%, UP%, … | The contribution of the clicked product to the orders |
| Proxy Brand | Brand In Scope%, Qty%, UP%, … | The contribution of any products of the clicked product’s brand to the orders |
| Cross-Brands | Brand Order%, Qty%, UP%, … | The contribution of products from other brands to the orders |
| Per Proxy Product | Purchase KPIs: Effective Margin, Margin ROAS, … | By analyzing the products effectively bought (which might differ from the Proxy Product) we have a much better understanding of the effective margin of the |
| Per Proxy Product | Customer KPIs: New Customer, Projected Customer Lifetime Value, … | By analyzing the purchases to identify new and reacquired customers, and by projecting the Customer Lifetime Value, you can go beyond Revenue and even Margin Return on Ad Spend |
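
The following Python sketch illustrates the idea behind these contribution metrics by splitting the revenue of one order between the clicked (proxy) product, its brand and other brands. It is only one possible reading: the exact definitions of the metrics named in the table are not given here, and the function name and data layout are assumptions.

```python
def contribution_shares(clicked_product, clicked_brand, order_lines):
    """Split an order's revenue by its relation to the clicked (proxy) product.

    order_lines: list of (product_id, brand, revenue) tuples (hypothetical layout).
    """
    total = sum(revenue for _, _, revenue in order_lines)
    if total == 0:
        return {"clicked_product": 0.0, "same_brand": 0.0, "other_brands": 0.0}
    product = sum(r for p, _, r in order_lines if p == clicked_product)
    # "same_brand" includes the clicked product itself.
    brand = sum(r for _, b, r in order_lines if b == clicked_brand)
    return {
        "clicked_product": product / total,
        "same_brand": brand / total,
        "other_brands": (total - brand) / total,
    }


# The visitor clicked on product "p1" of brand "A", then bought three products.
shares = contribution_shares(
    "p1", "A", [("p1", "A", 50.0), ("p2", "A", 25.0), ("p3", "B", 25.0)]
)
```

In this invented order, only half of the revenue comes from the advertised product, which is why analyzing the products effectively bought gives a different picture than the proxy product alone.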

In addition, the focus put in Step 1 on the PDP event tracking will give us many very clear facets of what affects the performance of the traffic:

Segmentations:

  • Price Levels

  • Price Discount Levels

  • Delivery conditions

  • Number / type of Images

  • Description

  • Ratings

Metrics (on each component):

  • Displays

  • Sessions with at least 1 display

  • Clicks

  • Sessions with at least 1 click

  • Conversions

  • Session Conversion Rate, Session $ Value

  • Display Conversion Rate, Display $ Value

  • Click Conversion Rate, Click $ Value
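
As a rough sketch, the count and rate metrics for one component could be aggregated as below. The rate definitions (conversions divided by sessions with a display, by displays, or by clicks) are assumptions made for this example, as are the dictionary keys; adapt them to your own reporting conventions.

```python
def component_metrics(sessions):
    """Aggregate display, click and conversion metrics for one PDP component.

    sessions: list of dicts with keys "displays", "clicks" (counts)
    and "converted" (bool). All names are hypothetical.
    """
    def safe_div(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    with_display = [s for s in sessions if s["displays"] >= 1]
    with_click = [s for s in sessions if s["clicks"] >= 1]
    displays = sum(s["displays"] for s in sessions)
    clicks = sum(s["clicks"] for s in sessions)
    # Assumption: a conversion only counts for sessions that saw (or clicked) the component.
    display_conversions = sum(1 for s in with_display if s["converted"])
    click_conversions = sum(1 for s in with_click if s["converted"])
    return {
        "displays": displays,
        "sessions_with_display": len(with_display),
        "clicks": clicks,
        "sessions_with_click": len(with_click),
        "session_conversion_rate": safe_div(display_conversions, len(with_display)),
        "display_conversion_rate": safe_div(display_conversions, displays),
        "click_conversion_rate": safe_div(click_conversions, clicks),
    }


metrics = component_metrics([
    {"displays": 2, "clicks": 1, "converted": True},
    {"displays": 1, "clicks": 0, "converted": False},
    {"displays": 0, "clicks": 0, "converted": False},
    {"displays": 1, "clicks": 1, "converted": False},
])
```

Running the same aggregation per segmentation value (for example per delivery condition) gives one row of metrics per value, which can then be compared.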

Step 3: The Data-Driven Hypothesis: Collect & Conclude

If everything went according to plan, Step 2 has given you a lot of information, not only about the performance of your activities, but also about key aspects of the Customer Journey which correlate with good or bad results.

However, this is usually not enough to act on (acting is what we will describe in Step 4).

There is a crucial step before that, which we recommend addressing collectively as much as possible.

Looking at the reports will hopefully give each key person in your organization many ideas about what needs to be done, but not everyone will necessarily reach the same conclusions, or at least the same order of priorities.

We therefore recommend sharing these reports within your team (possibly after an initial presentation) and collecting their feedback in a structured way.

You can do it in a simple manner at first (create a simple poll to decide on the first 1 or 2 changes), but we recommend doing it in a complete and structured way over time, as described in Step 5.

This step is important because it ensures that a collection of relevant optimization candidates is well defined. This avoids the typical issue of identifying only 1 idea and deciding immediately to act on it, without considering a list of at least 3-5 candidates (or more).

Use Case: Google Shopping Optimization

As a result of Step 2, a collection of candidates can be defined, some of which might be in the following list. But don’t take the list as a collection of hot tips: the entire goal of the present methodology is to avoid taking hot tips that your “gut feeling” tells you ought to be true, and instead to detect them in a data-driven way:

  1. Structure your campaigns based not directly on the margin of the advertised products (proxy products) but on the revealed margin generated by visitors clicking on these products

  2. Structure your campaigns based on products which are good at acquiring new customers or at reacquiring lapsed customers (or at generating higher lifetime values)

  3. Structure your campaigns based on products which do not frequently cause a lot of back and forth between your e-shop and Google Shopping during the same visit (for which you pay for the ad click every time)

  4. Structure your campaigns based on products which have a stock level higher than their daily purchases, to avoid visitors often seeing long delivery times

  5. Structure your campaigns based on products which have a discount % level in the sweet spot of your conversion rate

  6. Structure your campaigns based on products which currently have a positive marker on a tracked event on the PDP (good ratings, multiple pictures which are often all viewed, high add to wish-list rate, interest shown in reading their description, related content often clicked, …)

  7. And of course (as we will discuss more in Step 4): A/B Testing of visual changes on the PDP itself (based on the findings of any of the tracked events)
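
As a small illustration of candidate 4 above, the Python sketch below selects the products whose stock exceeds their daily purchases. The field names and the `min_days_of_stock` parameter are hypothetical, and the threshold itself is something your own data should confirm.

```python
def campaign_candidates(products, min_days_of_stock=1.0):
    """Return the SKUs whose stock covers more than `min_days_of_stock` days of sales.

    products: list of dicts with keys "sku", "stock" and "avg_daily_purchases"
    (hypothetical field names).
    """
    return [
        p["sku"]
        for p in products
        if p["stock"] > p["avg_daily_purchases"] * min_days_of_stock
    ]


products = [
    {"sku": "A", "stock": 10, "avg_daily_purchases": 2},  # 5 days of stock
    {"sku": "B", "stock": 1, "avg_daily_purchases": 3},   # sells out within the day
    {"sku": "C", "stock": 0, "avg_daily_purchases": 0},   # nothing to sell
]
selected = campaign_candidates(products)
```

The resulting list could then feed a product group or label in your campaign structure; how that is done depends on your setup and is not covered by this sketch.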

Step 4: Targeted Testing: Data, Process & Visual

As a result of Step 3, most of your optimization ideas are likely to belong to one of 3 groups (Data, Process or Visual), or to a hybrid of them:

  • Data
    These are the best kind (at least for data scientists). Basically, it means that the change you want to test requires no change in your processes (IT or otherwise) and can be done behind the scenes.
    This is the case, for example, for any algorithmic or rule-based change in the sorting / selection of products in product listings or product recommendations. The change simply modifies which product will appear in which context and to whom.
    The typical Data change consists of a data processing step in BigQuery which pushes data to the Boxalino Lab (from where it is automatically uploaded to the Boxalino Real-Time Platform), followed by configuring the Boxalino Admin to make use of this new data, typically in an A/B Test.

  • Process
    We change or improve the management of our e-shop based on new analytics
    Example: we improve our stock management process based on information about the most viewed products with non-optimal delivery times

  • Visual
    We change what the user sees on the web-site
    Example: we make a visual change on the page to show similar-product recommendations higher on the page

  • Hybrid
    This is a very common case, typically when Data and Process, or Data and Visual, are needed together in a change, or even all 3 together
    Example: create margin groups to change the campaigns of Google Ads and change the source
    A data change is required, but also a restructuring of your campaigns, which is a change in your processes

About Targeted Testing:

  • Testing
    Where possible, we make the change as a test (ideally an A/B Test) to gain a direct causal understanding of the effect of the change

  • Targeted
    Where possible, we run the test in a targeted way, which might mean “personalized” (either individually or per customer segment), but can also mean that we implement the change on a segment of the product assortment
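
To judge whether the difference observed in an A/B Test is more than noise, one common option is a two-proportion z-test on the conversion rates. The Python sketch below is a generic statistical illustration, not a description of how the Boxalino platform evaluates tests; the function name and the sample numbers are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rate B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all: nothing to distinguish
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical test: 100 conversions out of 1000 sessions (A)
# versus 150 conversions out of 1000 sessions (B).
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value suggests that the variant really behaves differently; for a targeted test, the same calculation would be run within the targeted segment only.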

Step 5: The Learnings & the Prioritization

In this step, we discuss how to interpret the learnings (the results of the tests), as well as how to prioritize a large collection of data-driven optimization hypotheses using a prioritized spreadsheet with ICE scoring (Impact, Confidence and Ease).
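
A minimal Python sketch of such an ICE prioritization is shown below. It assumes each hypothesis is rated from 1 to 10 on each dimension and that the three ratings are multiplied; this combination rule is an assumption (averaging the three ratings is another convention), and the hypotheses and ratings are invented.

```python
def ice_score(impact, confidence, ease):
    """Combine the three ratings into one score (here: their product, an assumption)."""
    return impact * confidence * ease


def prioritize(hypotheses):
    """Sort hypotheses from highest to lowest ICE score."""
    return sorted(
        hypotheses,
        key=lambda h: ice_score(h["impact"], h["confidence"], h["ease"]),
        reverse=True,
    )


hypotheses = [
    {"name": "Group campaigns by revealed margin", "impact": 8, "confidence": 6, "ease": 4},
    {"name": "Exclude low-stock products", "impact": 5, "confidence": 7, "ease": 8},
    {"name": "Move recommendations higher on the PDP", "impact": 6, "confidence": 4, "ease": 5},
]
ranked = [h["name"] for h in prioritize(hypotheses)]
```

The same calculation works as a spreadsheet formula; what matters is that the ratings are collected from the team in a consistent way.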