Logistic regression is one of the most popular forms of the generalized linear model. It comes in handy if you want to predict a binary outcome from a set of continuous and/or categorical predictor variables. In this article, I will give an overview of how to use logistic regression in R with an example dataset.

We will use infidelity data as our example dataset, known as Fair's Affairs, which is based on a cross-sectional survey conducted by Psychology Today in 1969 and is described in Greene (2003) and Fair (1978). The data contain 9 variables collected on 601 respondents: how often they had affairs during the past year, as well as their age, gender, education, years married, whether they have children (yes/no), how religious they are (on a 5-point scale from 1=anti to 5=very), occupation (7-point classification), and a self-rating of happiness in their marriage (from 1=very unhappy to 5=very happy). The figure below shows a few observations to give you an overview of the data.

[Figure: a few observations from the Affairs data]

Applying logistic regression, we can find which factors contributed the most to infidelity. Then, you can use the model to check which of you and your partner is more likely to have an affair.

But before that, we will run through some descriptive statistics with the code below to get a better understanding of our data.

# How to do Logistic Regression in R
# Created by Michaelino Mervisiano
> install.packages("AER")
> library("AER")
> data(Affairs, package="AER")
> View(Affairs)
> summary(Affairs)
> table(Affairs$affairs)
  0   1   2   3   7  12
451  34  17  19  42  38

From the summary output, we can see that there are 286 male respondents (48% of all respondents), that 430 respondents had children (72% of all respondents), and that the average age of our respondents was 32.5 years. In addition, the table shows that 451 respondents claimed not to have engaged in an affair in the past year. This means that about 25% of our respondents reported at least one affair, and the largest number reported was 12. In conclusion, we can say that about 6% of respondents reported roughly one affair per month.

We are interested in a binary outcome for our response variable (had an affair/didn't have an affair), so we can transform affairs into a binary variable called ynaffair with the following code.

> Affairs$ynaffair[Affairs$affairs > 0] <- 1
> Affairs$ynaffair[Affairs$affairs == 0] <- 0
> Affairs$ynaffair <- factor(Affairs$ynaffair, levels=c(0,1), labels=c("No","Yes"))
> table(Affairs$ynaffair)
 No Yes
451 150

Now, we can run the logistic regression in R to measure the relationship between the response variable (affair) and the explanatory variables (age, gender, education, occupation, children, self-rating, etc.).

> fit.full <- glm(ynaffair ~ gender + age + yearsmarried + children + religiousness + education + occupation + rating, data=Affairs, family=binomial())
> summary(fit.full)
Deviance Residuals:
   Min      1Q  Median      3Q     Max
-1.571  -0.750  -0.569  -0.254   2.519
Coefficients:
  Estimate Std. ...
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 675.38 on 600 degrees of freedom
Residual deviance: 609.51 on 592 degrees of freedom
AIC: 627.5
Number of Fisher Scoring iterations: 4

If we look at Pr(>|z|), the p-values for the regression coefficients, we find that gender, presence of children, education, and occupation do not make a significant contribution to our response variable. Therefore, we can try fitting a second model that includes only the significant variables (age, years married, religiousness, and rating).

> fit.reduced <- glm(ynaffair ~ age + yearsmarried + religiousness + rating, data=Affairs, family=binomial())
> summary(fit.reduced)
Deviance Residuals:
   Min      1Q  Median      3Q     Max
-1.628  -0.755  -0.570  -0.262   2.400
Coefficients:
  Estimate Std. ...
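The percentages and model figures quoted in this article can be re-derived from the reported counts with a few lines of base R. The sketch below is only an arithmetic check, not part of the original analysis: the helper `pct` and all variable names are made up for illustration, and the inputs are copied by hand from the output shown in the article rather than computed from the `Affairs` data. The last step assumes the usual identity for a 0/1 response, where AIC equals the residual deviance plus twice the number of estimated coefficients.

```r
# Counts copied by hand from the table(Affairs$affairs) output in the article.
n_total <- 601
affair_counts <- c("0" = 451, "1" = 34, "2" = 17, "3" = 19, "7" = 42, "12" = 38)

# The frequency table should account for every respondent.
stopifnot(sum(affair_counts) == n_total)

# Hypothetical helper: a count expressed as a percentage of all respondents.
pct <- function(count, total = n_total) round(100 * count / total, 1)

pct_male     <- pct(286)                              # male respondents
pct_children <- pct(430)                              # respondents with children
pct_any      <- pct(n_total - affair_counts[["0"]])   # at least one affair
pct_monthly  <- pct(affair_counts[["12"]])            # 12 affairs in the past year

# Degrees of freedom reported for the full model: 600 (null) and 592 (residual).
n_predictors <- 600 - 592           # coefficients other than the intercept
n_params     <- n_predictors + 1    # add the intercept

# Assumed identity for a binary response: AIC = residual deviance + 2 * parameters.
aic_check <- 609.51 + 2 * n_params

print(c(male = pct_male, children = pct_children,
        any_affair = pct_any, monthly = pct_monthly))
print(c(predictors = n_predictors, aic = aic_check))
```

The rounded values line up with the 48%, 72%, 25%, and 6% quoted in the text, and the recomputed AIC agrees with the reported 627.5 after rounding, which suggests the full model was fitted with eight predictors plus an intercept.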