
Multiple Linear Regression


Question :

 

Create a scenario in your discipline in which multiple linear regression would be the appropriate analysis method. Assume that you need to obtain the best model with three predictors selected from a pool of five.

 

a— Briefly describe the underlying problem. Give the response variable, the variables you will choose from for predictors, and the reasons you believe they would be relevant.

b— Give, in detail the steps you would go through to check that your predictor meets the requirements for regression. What would you do for your potential predictors?

c— Give, in detail, the process you would use to select the best model.

d— Suppose that you find two different models that meet your standard for best.

 

Answer :

 

Underlying problem

In this study, we construct a multiple linear regression model to analyse how strongly the following independent characteristics affect the life of a city dweller:

1. % Obese

2. % Excessive drinking

3. % Smokers

4. % Physically active

5. % Insured

A multiple linear regression model was developed to analyse the effect of the above characteristics on loss of life among city dwellers.

Methodology

 

Data Source

We gathered the data from the Health Report published by the University of Wisconsin Population Health Institute for New York in 2014. The data collection was sponsored through a Robert Wood Johnson Foundation program. This foundation is the most significant US charity organization focusing on public health [3].

 

Software

According to Princeton’s online archive, regression testing is: “Any software testing that seeks to uncover software errors after changes to the program…have been made by retesting the program.” [4] We estimated the relationship between our regressors and their beta values using Data Analysis, an add-in program for Microsoft Excel.

 

Model Refinement Procedure

As referenced by author James R. Evans, regression analysis is defined as: “A tool for building statistical models that characterize relationships among a dependent variable and one or more independent variables, all of which are numerical.” [5] To determine whether our regression variables have plausible explanatory power, we evaluated each individual characteristic with the potential to influence our dependent variable. Isolating each aspect reinforced our understanding that they are independent and identically distributed (IID) [6].

We worked under the strict assumption that any finding regarding one characteristic as IID would not, in any way, affect another. This study was also conducted under the assumption that each x-variable was independent of the others.
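The independence assumption above is stated rather than tested. One simple way it could be checked is to compute the pairwise correlation between candidate predictors and flag any pair that moves together too closely. The sketch below is only an illustration: the column names, the toy values, and the 0.8 cut-off are assumptions made for this example and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| at or above threshold.

    `predictors` maps a column name to its list of values. The 0.8 cut-off
    is an illustrative assumption, not a value from the study.
    """
    names = list(predictors)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(predictors[names[i]], predictors[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], r))
    return flagged


# Made-up values for illustration only; these are not the study's data.
toy = {
    "pct_smokers": [1.0, 2.0, 3.0, 4.0],
    "pct_obese": [2.0, 4.0, 6.0, 8.0],
    "pct_insured": [5.0, 3.0, 5.0, 3.0],
}
flagged = correlated_pairs(toy)
```

A flagged pair suggests that the two predictors carry overlapping information, so keeping both in the model may make the individual coefficients hard to interpret.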

 

Results

ŷ = 76.12*X1 + 347.07*X2 + 75.73*X3 - 69.9

X1 = % Smokers

X2 = % Had Low Birth Weight

X3 = % Physically Inactive
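To show how the fitted equation is used, the sketch below wraps it in a small function. The coefficients and the intercept are copied from the equation above; the function name and the example inputs are hypothetical, and the source does not say whether the x-variables are entered as percentage points or as fractions, so the scale of the inputs is an assumption.

```python
def predict_loss(pct_smokers, pct_low_birth_weight, pct_inactive):
    """Evaluate the fitted regression equation from the Results section.

    Arguments correspond to X1, X2 and X3. Their scale (percentage points
    or fractions) is not stated in the source and is assumed here.
    """
    return (76.12 * pct_smokers
            + 347.07 * pct_low_birth_weight
            + 75.73 * pct_inactive
            - 69.9)


# Hypothetical inputs, chosen only to show the call; not from the study.
example = predict_loss(20.0, 8.0, 25.0)

# Holding X2 and X3 fixed, a one-unit rise in X1 changes the prediction
# by the X1 coefficient (76.12).
effect_of_smokers = predict_loss(1.0, 0.0, 0.0) - predict_loss(0.0, 0.0, 0.0)
```

Each coefficient is read as the change in the predicted value for a one-unit change in that predictor while the other two are held constant.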

The final model indicates that “% Smokers,” “% Had Low Birth Weight,” and “% Physically Inactive” were the most significant factors affecting the life of a city dweller, while “% Uninsured,” “% Excessive Drinking,” “% Obese,” and “Physically-inactive” had the highest p-values, indicating that these four x-variables were the least significant predictors in the regression model. The adjusted R-Square for this model was 0.54, suggesting that approximately 54% of the variation in y is explained by the x-variables in the regression. The t-stats are high, indicating that each individual x-variable is statistically significant (assuming a 95% confidence level).
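For reference, the adjusted R-Square reported by regression tools is normally derived from the ordinary R-Square with a penalty for the number of predictors. The sketch below uses the standard formula; the sample size and R-Square in the example are hypothetical, since the source reports only the adjusted value of 0.54.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_obs - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / degrees_of_freedom


# Hypothetical numbers for illustration; not taken from the study.
adj = adjusted_r_squared(r_squared=0.5, n_obs=11, n_predictors=3)
```

Because the penalty grows with the number of predictors, adding a weak predictor can lower the adjusted value even though the ordinary R-Square never decreases, which is why it is the more useful figure when comparing models of different sizes.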

 

Model 1

The purpose of Model 1 is to show a regression analysis with all six independent x-variables. Our interest in this model is to identify which of those variables are most likely to affect the dependent variable and to remove the insignificant ones. The larger the absolute value of a variable's t-stat, the stronger the evidence that it influences the dependent variable. We will remove variables with excessively high p-values in a step-wise manner. Our initial hypothesis is that the “% Smokers” category will have the most considerable influence on our loss rate [2].
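The step-wise removal described above can be sketched as a small loop: refit the model, find the variable with the largest p-value, drop it if that p-value exceeds the significance level, and repeat. In the sketch below, `refit` is a hypothetical callback standing in for rerunning the regression (for example in the Excel Data Analysis add-in) and reading off the p-values. The 0.05 level is assumed to match the 95% confidence level mentioned in the text, and the p-values in the example are invented.

```python
def next_to_drop(p_values, alpha=0.05):
    """Return the predictor with the largest p-value above alpha, or None."""
    if not p_values:
        return None
    name, worst = max(p_values.items(), key=lambda item: item[1])
    return name if worst > alpha else None


def backward_eliminate(predictors, refit, alpha=0.05):
    """Drop one predictor per round until every p-value is at most alpha.

    `refit` is a hypothetical callback: given the list of predictors still
    in the model, it returns a dict mapping each name to its p-value.
    """
    active = list(predictors)
    while active:
        drop = next_to_drop(refit(active), alpha)
        if drop is None:
            break
        active.remove(drop)
    return active


# Invented p-values for a single round; not results from the study.
round_one = {"pct_smokers": 0.01, "pct_obese": 0.62, "pct_uninsured": 0.20}
first_removed = next_to_drop(round_one)
```

Only one variable is removed per round because the remaining p-values change once the model is refitted without it.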

 

Model 2

The purpose of Model 2 is to show a regression analysis with five of the x-variables, excluding the one that had the highest p-value (% Obese). Our interest in this model is to find a better “fit”. Our initial hypothesis is that repeating the regression without “% Obese” will improve the fit of our model.

 

References

1. “About RWJF.” http://www.rwjf.org/en/about-rwjf.html

2. Evans, James R. “Statistics, Data Analysis, and Decision Modeling.” Pearson Education, Inc. 2013.

3. “Income, Poverty, and Health Insurance Coverage in the United States: 2012.” http://www.census.gov/prod/2013pubs/p60-245.pdf

4. “Regression Testing.” Princeton University. https://www.princeton.edu/~achaney/tmve/wiki100k/docs/Regression_testing.html
