Interaction Assignment

@ritesh110587
Correlation Assignment
Chi-Square Test
Inference: the p-value here is 0.0021, which is <= 0.05. So we can reject the null hypothesis and say that there is some association between Property_Area and Loan_Status
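As a rough illustration of the test behind this inference, here is a minimal sketch using `scipy.stats.chi2_contingency`. The toy data frame is made up for the example and only borrows the column names Property_Area and Loan_Status, so its numbers will not match the p-value reported above.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical toy data standing in for the loan dataset (not the real file).
toy = pd.DataFrame({
    "Property_Area": ["Rural"] * 40 + ["Semiurban"] * 40 + ["Urban"] * 40,
    "Loan_Status": (["Y"] * 20 + ["N"] * 20      # Rural: 20 approved, 20 rejected
                    + ["Y"] * 32 + ["N"] * 8     # Semiurban: 32 approved, 8 rejected
                    + ["Y"] * 24 + ["N"] * 16),  # Urban: 24 approved, 16 rejected
})

# Contingency table of observed counts: rows are areas, columns are outcomes.
observed = pd.crosstab(toy["Property_Area"], toy["Loan_Status"])

# Returns the test statistic, p-value, degrees of freedom and expected counts.
chi2, p_value, dof, expected = chi2_contingency(observed)

alpha = 0.05
print(observed)
print(f"chi2={chi2:.3f}, dof={dof}, p={p_value:.4f}")
print("reject null" if p_value <= alpha else "fail to reject null")
```

The decision rule is the same one used in the inference: compare the p-value with the significance level and reject the null hypothesis of independence when it is smaller.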
Post Hoc Testing
We have three combinations here: 'Rural - Semiurban', 'Rural - Urban' and 'Semiurban - Urban'. So let us create three dataframes and perform a chi-square test for each pair
Applying the Bonferroni correction for three comparisons gives an adjusted significance level of 0.05/3 = 0.0167
Here the p-value is 0.00107, which is <= 0.0167 (the adjusted significance level). So Loan_Status does have an association with Property_Area - Rural and Semiurban
Here the p-value is 0.43373, which is > 0.0167 (the adjusted significance level). So Loan_Status does not have an association with Property_Area - Rural and Urban
Here the p-value is 0.001071, which is <= 0.0167 (the adjusted significance level). So Loan_Status does have an association with Property_Area - Rural and Semiurban
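The pairwise procedure can be sketched as a loop over every pair of Property_Area levels, each tested against the Bonferroni-adjusted level. This is only an illustration on made-up data; the variable names (`toy`, `pairs`, `pairwise`) are hypothetical and the p-values will differ from the ones reported above.

```python
from itertools import combinations

import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical toy data standing in for the loan dataset (not the real file).
toy = pd.DataFrame({
    "Property_Area": ["Rural"] * 40 + ["Semiurban"] * 40 + ["Urban"] * 40,
    "Loan_Status": (["Y"] * 20 + ["N"] * 20
                    + ["Y"] * 32 + ["N"] * 8
                    + ["Y"] * 24 + ["N"] * 16),
})

# Every unordered pair of levels; three levels give three pairs.
areas = sorted(toy["Property_Area"].unique())
pairs = list(combinations(areas, 2))

# Bonferroni: divide the overall significance level by the number of tests.
adjusted_alpha = 0.05 / len(pairs)

pairwise = {}
for a, b in pairs:
    # Keep only the two areas being compared, then rebuild the 2x2 table.
    subset = toy[toy["Property_Area"].isin([a, b])]
    table = pd.crosstab(subset["Property_Area"], subset["Loan_Status"])
    # For 2x2 tables scipy applies Yates' continuity correction by default.
    chi2, p, dof, _ = chi2_contingency(table)
    pairwise[(a, b)] = p
    verdict = "association" if p <= adjusted_alpha else "no association"
    print(f"{a} vs {b}: p={p:.5f} -> {verdict}")
```

Generating the pairs with `combinations` avoids testing the same pair twice under a swapped name.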
ANOVA Test in Python
Importing required libraries
import os
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi
Importing Data
df = pd.read_csv('Housing_Data.csv')
df_anova = df.copy()  # working copy used by the ANOVA steps below
The dataset is a loan prediction dataset containing customer attributes and the loan outcome
ANOVA problem statement - Checking if Loan Amount is related to Education Qualification
So here Loan Amount is the continuous variable and Education Qualification is the categorical variable. Education Qualification has only two levels
df_anova_test = df_anova[['Education', 'LoanAmount']]
Using ols function for calculating the F-statistic and associated p value
model1 = smf.ols(formula='LoanAmount ~ C(Education)', data=df_anova_test)
results1 = model1.fit()
print(results1.summary())
So here the p value is 0.000142, which is < 0.05 (significance level). So we reject the null hypothesis and can say that Loan Amount does vary with Education Qualification, i.e. there is some association between Education Qualification and Loan Amount
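The F-statistic and p value do not have to be read off the printed summary; the fitted results object exposes them as `fvalue` and `f_pvalue`. A minimal sketch on randomly generated data follows. The group means and spread below are assumptions chosen for the example, not values from the loan dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data: two education levels with different mean loan amounts.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "Education": ["Graduate"] * 50 + ["Not Graduate"] * 50,
    "LoanAmount": np.concatenate([
        rng.normal(150, 30, 50),  # assumed mean/sd for graduates
        rng.normal(120, 30, 50),  # assumed mean/sd for non-graduates
    ]),
})

# One-way ANOVA expressed as an OLS model with a categorical predictor.
fit = smf.ols("LoanAmount ~ C(Education)", data=toy).fit()

# Overall F-test of the model: the null is that all group means are equal.
print(f"F={fit.fvalue:.3f}, p={fit.f_pvalue:.6f}")
print("reject null" if fit.f_pvalue < 0.05 else "fail to reject null")
```

With only two levels this F-test is equivalent to a two-sample t-test with equal variances, which is why no post hoc step is needed here.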
Comparing Means
df_anova_test.groupby('Education').mean()
So we can see a difference of about 34 in the mean Loan Amount between the two Education levels
ANOVA Post Hoc Testing
Here we will consider a categorical variable that has more than two levels. The categorical variable is Property_Area, which has three levels, and Loan Amount is the continuous variable
Let us run the ANOVA test first
df_anova_test1 = df_anova[['Property_Area', 'LoanAmount']]
Using ols function for calculating the F-statistic and associated p value
model1 = smf.ols(formula='LoanAmount ~ C(Property_Area)', data=df_anova_test1)
results1 = model1.fit()
print(results1.summary())
So here the p-value is < 0.05, so we reject the null hypothesis and can say that there is some association between Property Area and Loan Amount
The ANOVA result does not tell us which areas differ, and running several pairwise tests inflates the Type I error. So let us perform a post hoc test that controls for this, the Tukey HSD test
mc1 = multi.MultiComparison(df_anova_test1['LoanAmount'], df_anova_test1['Property_Area'])
res1 = mc1.tukeyhsd()
print(res1.summary())
Here the last column (reject) indicates which pairs of categories differ significantly. True means we can reject the null hypothesis for that pair, so Loan Amount for Rural is significantly different from Urban. But there is no significant difference between the Loan Amounts of Rural & Semiurban or of Semiurban & Urban
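To work with the reject column programmatically rather than reading it from the printed table, the Tukey result exposes `reject` and `meandiffs` arrays, one entry per pair. The sketch below uses randomly generated data with assumed group means, so which pairs come out significant will not necessarily match the result described above.

```python
from itertools import combinations

import numpy as np
import pandas as pd
import statsmodels.stats.multicomp as multi

# Hypothetical toy data: three areas with assumed mean loan amounts.
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "Property_Area": np.repeat(["Rural", "Semiurban", "Urban"], 40),
    "LoanAmount": np.concatenate([
        rng.normal(155, 30, 40),
        rng.normal(145, 30, 40),
        rng.normal(135, 30, 40),
    ]),
})

mc = multi.MultiComparison(toy["LoanAmount"], toy["Property_Area"])
res = mc.tukeyhsd(alpha=0.05)
print(res.summary())

# Pairs are reported in the order (0,1), (0,2), (1,2) over the sorted group labels.
group_pairs = list(combinations(mc.groupsunique, 2))
for (a, b), diff, reject in zip(group_pairs, res.meandiffs, res.reject):
    label = "significant" if reject else "not significant"
    print(f"{a} vs {b}: mean difference={diff:.2f} -> {label}")
```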