Application of ANOVA and Regression Analysis in e-commerce business

Let’s assume that e-commerce organisations like Amazon and Flipkart would like to understand whether shopping habits for a specific category have any relationship with the Gender and Income Group of their customers. As two factors (i.e. two independent categorical columns) are being considered in this example, we are talking about Two-way ANOVA. So, there would be 3 hypotheses: one for each of the independent categorical columns, and a third for the interaction effect of the two independent variables.

As another example, suppose an e-commerce organisation would like to understand whether page crashes have anything to do with the education level of the customers. One-way ANOVA would be the right choice to test whether the page-crash rate differs significantly across the levels of a single factor, namely education level.

For any business, and specifically for an e-commerce organisation, conversion/purchase is the final goal. Hence, to find out what impact each activity has on sales, we can fit a regression equation with all activities (email campaigns, TV ads, social media broadcasts, personalized communication to frequent customers, cold calls) as independent variables. Each coefficient then estimates the change in sales for a unit increase in the cost of that activity, keeping the other independent variables constant.
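As a rough illustration of such a regression equation, here is a minimal sketch in plain Python. The spend and sales figures are made up, the three channels are only a subset of the activities listed above, and `fit_ols` is a hypothetical helper written for this post rather than a library function. The sales values are generated from known coefficients so that the fit can be checked by eye.

```python
def fit_ols(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # leading 1.0 = intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * target for row, target in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting keeps the elimination numerically stable
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical weekly spend per channel: (email, TV ads, social media)
spend = [(1, 4, 2), (2, 3, 1), (3, 5, 4), (4, 1, 3), (5, 2, 5), (6, 6, 2)]
# Hypothetical sales, generated here as 10 + 2*email + 0.5*tv + 3*social
sales = [10 + 2 * e + 0.5 * t + 3 * s for e, t, s in spend]

intercept, b_email, b_tv, b_social = fit_ols(spend, sales)
print(f"email: {b_email:.2f}, tv: {b_tv:.2f}, social: {b_social:.2f}")
```

Each fitted coefficient is read as the expected change in sales for one extra unit of spend on that channel while the other channels stay fixed. Real data would be noisy, so the coefficients would come with standard errors and p-values, which a statistics package would report.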

Two Way ANOVA:

  1. In any business, Customer Satisfaction and Customer Loyalty play a vital role. Usability of the e-commerce site, confidentiality of user information and better response time are some of the contributing factors that help determine the areas of improvement in the business. Two Way ANOVA can be used to test the statistical significance of these factors towards the major goal.
  2. To determine whether the shopping habit of a customer depends significantly on demographic factors such as Gender or Annual Income, by computing the effect of each factor and the interaction between these 2 independent factors.
  3. To test the significance of the channels used in a multi-channel marketing campaign.

We can express shopping habit as below:

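Shopping Habit = a Gender + b Annual Income,

where Shopping Habit is the dependent variable (DV) and Gender and Annual Income are the independent variables (IVs). ANOVA needs a numeric DV, so categorical Shopping Habit values would first have to be coded numerically (for example, as purchases per month).

As there are 2 IVs, each with two or more levels, we have to perform Two Way ANOVA.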

Suppose the following is a sample of the data:

Cust Id   Gender   Annual Income   Shopping Habit
1         F        >1L             Occasionally
2         M        1L-3L           Weekly
3         F        3L-5L           Monthly
4         M        >1L             Occasionally
5         F        1L-3L           Weekly
6         M        3L-5L           Monthly
7         F        >1L             Occasionally
8         F        1L-3L           Weekly
9         F        1L-3L           Monthly


Hypothesis for Gender

H0: there is no effect of Gender on Shopping Habit

Ha: there is an effect of Gender on Shopping Habit

Hypothesis for Annual Income

H0: there is no effect of Annual Income on Shopping Habit

Ha: there is an effect of Annual Income on Shopping Habit

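Before running the ANOVA, test its assumptions: normality of the residuals (for example, with a Shapiro-Wilk test) and homogeneity of variances across groups (for example, with Levene's test).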

Considering only Gender, we need to calculate MSwithin (the mean square within the Male and Female groups) and MSbetween (the mean square between the Male and Female groups).

Then Fratio = MSbetween / MSwithin.

If Fratio > F0.05 (the critical F value at the 5% significance level for the corresponding degrees of freedom), then we reject H0 and conclude that there is an effect of Gender on Shopping Habit.

Considering only Annual Income, we need to calculate MSwithin (the mean square within the 3 income groups) and MSbetween (the mean square between the 3 income groups).

Then Fratio = MSbetween / MSwithin.

If Fratio > F0.05, then we reject H0 and conclude that there is an effect of Annual Income on Shopping Habit.
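The F ratio calculation above can be sketched in plain Python. This is only an illustration under stated assumptions: Shopping Habit is assumed to be coded as purchases per month, the three income groups and their values are made up, `one_way_f` is a hypothetical helper, and the critical value 5.14 is the F0.05 entry for (2, 6) degrees of freedom from a standard F table.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F ratio, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical purchases per month for three income groups
income_groups = [[1, 2, 3], [4, 5, 6], [2, 3, 4]]
f_ratio, df_between, df_within = one_way_f(income_groups)

F_CRITICAL_2_6 = 5.14  # F0.05 for (2, 6) degrees of freedom, from an F table
print(f_ratio, df_between, df_within)
print("reject H0" if f_ratio > F_CRITICAL_2_6 else "fail to reject H0")
```

With these made-up numbers MSbetween is 7 and MSwithin is 1, so the F ratio is 7 and H0 is rejected at the 5% level.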

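Interaction Effect: the interaction between Gender and Annual Income is tested with its own F ratio, Fratio = MSinteraction / MSwithin, under H0: there is no interaction effect of Gender and Annual Income on Shopping Habit. If an effect turns out to be significant, a post-hoc TukeyHSD test can then show which pairs of groups differ.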
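To make the three tests of a Two Way ANOVA (Gender, Annual Income, and their interaction) concrete, here is a minimal plain-Python sketch for a balanced design, meaning every Gender and income combination has the same number of observations. The data are hypothetical purchases per month, the income labels "low", "mid" and "high" are placeholders, and `two_way_anova` is a helper written for this post, not a library function.

```python
from statistics import mean


def two_way_anova(cells):
    """F ratios for a balanced two-factor design.

    cells maps (level_a, level_b) -> list of observations.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    if n < 2 or len(cells) != len(levels_a) * len(levels_b) \
            or any(len(obs) != n for obs in cells.values()):
        raise ValueError("need a balanced design with at least 2 values per cell")

    grand = mean(v for obs in cells.values() for v in obs)
    mean_a = {a: mean(v for (x, _), obs in cells.items() if x == a for v in obs)
              for a in levels_a}
    mean_b = {b: mean(v for (_, x), obs in cells.items() if x == b for v in obs)
              for b in levels_b}

    ss_a = len(levels_b) * n * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = len(levels_a) * n * sum((m - grand) ** 2 for m in mean_b.values())
    ss_cells = n * sum((mean(obs) - grand) ** 2 for obs in cells.values())
    ss_ab = ss_cells - ss_a - ss_b  # what the two main effects do not explain
    ss_within = sum((v - mean(obs)) ** 2 for obs in cells.values() for v in obs)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_ab = df_a * df_b
    df_within = len(cells) * (n - 1)
    ms_within = ss_within / df_within
    return {
        "factor_a": (ss_a / df_a) / ms_within,
        "factor_b": (ss_b / df_b) / ms_within,
        "interaction": (ss_ab / df_ab) / ms_within,
    }


# Hypothetical purchases per month, keyed by (Gender, income group)
data = {
    ("F", "low"): [1, 3], ("F", "mid"): [3, 5], ("F", "high"): [5, 7],
    ("M", "low"): [3, 5], ("M", "mid"): [3, 5], ("M", "high"): [9, 11],
}
f_ratios = two_way_anova(data)
print(f_ratios)
```

Each of the three F ratios is compared against the critical F value for its own degrees of freedom. Real customer data is rarely balanced, in which case a statistics package that fits the model by regression would be the more appropriate tool.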

Objectives of Multi Channel Marketing:

  1. Low cost marketing channel
  2. Better customer experience
  3. Better integration and interaction of channels

A CRM (Customer Relationship Management) system can be a data source for this. From it we can get insight into customers' responses and acquisition across the various marketing channels; find, through linear regression, the most statistically significant (most effective) channels for a group (cluster) of customers; and check whether the performance of a specific channel increases in association with another channel (the ANOVA interaction).

  1. Do factors such as Product Description, One-day Delivery option, Availability of Cash on Delivery, Quality of product packaging, and Free returns with pickup facility have any significant impact on customer buying behaviour?
  2. Factors such as Application User Interface, Information Quality, User Information Security and Service Feedback can be hypothesized to drive customer satisfaction and trust, through which an e-commerce organisation can gain Customer Loyalty.

So, here the hypotheses can be summarised as below:

H0ui: Application User Interface does not have any impact on Customer Loyalty

Haui: Application User Interface has a significant impact on Customer Loyalty

H0iq: Information Quality does not have any impact on Customer Loyalty

Haiq: Information Quality has a significant impact on Customer Loyalty

H0is: User Information Security does not have any impact on Customer Loyalty

Hais: User Information Security has a significant impact on Customer Loyalty

H0sf: Service Feedback does not have any impact on Customer Loyalty

Hasf: Service Feedback has a significant impact on Customer Loyalty

An example of ANOVA and Regression in Error Correction:

1. During a Seasonal Offer, I often wonder why the Discount Sale lasts only 3 days and not at least 7 days. Below might be a reason for that:

A Manager may believe that extending the duration of the Discount Sale offer will greatly increase sales.

ANOVA can test whether sales differ significantly between Discount Sale durations of 3, 4 or 5 days.

Regression analysis, however, may indicate that the increase in revenue might not be sufficient to cover the base price of the products or the rise in operating expenses due to the longer support hours needed to handle the huge load (such as additional IT infrastructure and related support costs).

Here, there is a possibility of lower profit due to the additional support cost, which the Manager might ignore at the time of decision making.

Hence, regression analysis along with ANOVA can provide quantitative support for decisions and prevent mistakes due to a manager’s intuitions.
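The trade-off in this example comes down to simple arithmetic, sketched below. All figures are hypothetical, and `incremental_profit` is an illustrative helper; in practice the extra revenue per day would be an estimate taken from the fitted regression.

```python
def incremental_profit(extra_days, extra_revenue_per_day, margin, extra_cost_per_day):
    """Profit added by extending a sale: margin earned minus extra running cost."""
    return extra_days * (extra_revenue_per_day * margin - extra_cost_per_day)


# Hypothetical figures for extending a 3-day sale to 7 days
extra_profit = incremental_profit(
    extra_days=4,
    extra_revenue_per_day=50_000,  # assumed regression estimate of added revenue
    margin=0.25,                   # assumed share of revenue left after base price
    extra_cost_per_day=15_000,     # assumed extra IT infrastructure and support
)
print(extra_profit)  # -10000.0
```

With these assumed numbers the extra days bring in revenue but lose money overall, which is exactly the situation a manager relying on intuition could miss.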

Failure to understand the components of correlation and regression, and the implications and limitations of each, can lead to poor business decisions. When applied correctly, correlation and regression analysis can be used to back such decisions with quantitative evidence.

Applications of Statistics in Marketing and Customer Analytics are vast. Although the topic of discussion is limited to One way ANOVA, Two way ANOVA and Regression, no specific kind of Regression is mentioned, so I am adding a few more use cases:

1. Predictive Model: Linear Regression, which is a type of predictive model, can be used to enhance the pricing model of the e-commerce business by analyzing historical data for different products, customer responses to past pricing trends, and competitors' pricing models.

2. Logistic Regression can be used for fraud detection through customer behavior analytics: the model analyzes suspicious activities and finds inconsistencies in the historical sets of personal data, for example when a scammer breaches a user account, alters personal data, and tries to get money or goods from a retailer using this semi-fake personal information.

3. Cox Regression: Cox regression can be used for Time to Event analysis, for example to analyze the time between user account creation and the first event triggered by the user, such as a first purchase or the closure of the account. So here, there are two target variables: one is the time difference and the other is whether the specific event occurred.
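As a rough sketch of the fraud-detection use case, the snippet below fits a logistic regression by plain gradient descent. Everything in it is an assumption made for illustration: the two features (profile changes in the last day, and order value relative to the account's average), the eight labelled rows, the learning rate and the number of epochs. A production system would use a tested library and far more data.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Estimated probability that observation x is fraudulent."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical rows: (profile changes in last day, order value / usual value)
X = [(0, 1.0), (0, 0.8), (1, 1.2), (0, 1.1),
     (3, 4.0), (2, 3.5), (4, 5.0), (3, 2.8)]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = order later confirmed as fraud

w, b = fit_logistic(X, y)
p_normal = predict_proba(w, b, (0, 1.0))
p_suspicious = predict_proba(w, b, (4, 5.0))
print(f"normal order: {p_normal:.3f}, suspicious order: {p_suspicious:.3f}")
```

The output of the model is a probability, so the business still has to choose a threshold above which an order is held for manual review.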
