Summary lecture 5, Empirical research project for IB

Lecture 5:

Outliers: data points that do not follow the general trend of the data (extreme values)

What happens if we run the regression anyway?

The fit of the model can change
The regression line may be tilted
You can remove an outlier

How can we check whether we have outliers?

Scatterplots
Statistical test
Easiest way: check which values fall outside the range of +/- 2 to 3 standard deviations around the mean. Include the words "at least" in the conclusion! → This is based on the assumption that the distribution is normal.
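The standard deviation rule can be sketched in a few lines of Python. This is only an illustration: the function name flag_outliers, the threshold parameter k and the data are made up for the example and do not come from the lecture.

```python
from statistics import mean, stdev

def flag_outliers(values, k=2.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)
    return [v for v in values if abs(v - m) > k * s]

# Hypothetical data with one extreme value (40).
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 40]

print(flag_outliers(data, k=2.0))  # values outside +/- 2 standard deviations
print(flag_outliers(data, k=3.0))  # values outside +/- 3 standard deviations
```

In this made-up sample the value 40 is flagged at 2 standard deviations but not at 3, because the extreme value itself inflates the standard deviation. The choice of threshold therefore matters.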

How can we solve the problem?

First think: what is the reason for the outlier, can/could you do something?
Throw the outlier out of the dataset only for a valid reason, such as mismeasurement, an error in the observation, or a data entry error, but not because it is convenient to do so.
Be careful: some extreme values are to be expected and are indicative of the characteristics of the population. Therefore it is important to check how sensitive your results are to the presence of the outlier → what happens if we keep the outlier, and what happens if we omit it?
If the outlier does not change the results, but does affect assumptions, you may drop the outlier.
If it affects both results and assumptions, you may not simply drop the outlier; you have to run the regression both with and without the outlier and report that in the paper.
If a relationship is clearly created by the outlier, you may drop the outlier, because without it there would be no relationship between x and y, so the regression coefficient does not truly describe the effect of x on y.
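Running the regression with and without the outlier could look like the sketch below. It uses a hand-written simple OLS so that it needs no extra libraries; the helper name ols and the data are hypothetical.

```python
def ols(x, y):
    """Simple OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: the last observation is the suspected outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 30]

with_outlier = ols(x, y)
without_outlier = ols(x[:-1], y[:-1])

print("slope with outlier:   ", with_outlier[1])
print("slope without outlier:", without_outlier[1])
```

If the two slopes differ a lot, as they do in this made-up example, the results are sensitive to the outlier and both versions should be reported.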

Reverse causality: We assume that changes in the dependent variable are caused by changes in the independent variables. But we only find a statistical relationship, which says nothing about causality or the direction of causality. In some analyses it could be that y (also) causes x, which is called reverse causality → this causes an endogeneity problem.

How can we check whether there is a reverse causality problem?

What does the theory say
Timing of measurement: Theory says x causes y, but sometimes x is measured later than y
Statistical tests (to check whether changes in x precede changes in y) and some more advanced techniques

What to do:

Have a model that is well-grounded in theory
Explain
Acknowledge
In general: advanced econometric techniques also exist to mitigate the problem of endogeneity

Omitted variable bias: which variables should we include as IVs, and what happens if we omit relevant variables? You have omitted variable bias if:

An excluded variable has some effect on your DV and
It’s correlated with at least one of your IVs (endogeneity)
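The two conditions can be illustrated with a small sketch. The data are made up: y depends on both x and z, and z is correlated with x. Regressing y on x alone then gives a slope that mixes the effect of x with part of the effect of z. The helper name slope and all numbers are assumptions for the illustration.

```python
def slope(x, y):
    """Slope of a simple OLS regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: the true effect of x on y is 2, the effect of z is 3.
x = [1, 2, 3, 4, 5, 6]
z = [2, 1, 4, 3, 6, 5]                      # correlated with x
y = [1 + 2 * xi + 3 * zi for xi, zi in zip(x, z)]

short_slope = slope(x, y)   # regression that omits z
delta = slope(x, z)         # how strongly z moves with x

print("slope when z is omitted:", short_slope)
print("true effect + bias     :", 2 + 3 * delta)
```

The slope from the short regression equals the true effect of x plus the effect of z times the slope of z on x. If z had no effect on y, or were uncorrelated with x, the bias term would be zero.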

It is impossible to control for everything, so how do we solve the problem?

Avoid simple regression models (with one IV)
Include variables that are likely to be the most important theoretically in explaining the DV (what does the literature say)

Panel data or longitudinal data: data on many units collected at several points in time, whereby each unit is observed several times. Panel data therefore have both a cross-sectional and a time-series dimension.

Why panel data:

Rich in information
Potentially, an increase in sample size
Possibility to control for time-invariant effects correlated with the regressors
How? Intuition: include a dummy variable for each cross-sectional unit, that is, use fixed effects.
Mitigate omitted variable bias

Fixed effects model: a statistical regression model in which the intercept is allowed to vary freely across individuals or groups. It is often applied to panel data in order to control for any individual-specific attributes that do not vary across time. This removes the omitted variable bias caused by such attributes. Assumption: the individual-specific effects are correlated with the IVs.

Assume: For the Grunfeld data we concluded that the assumption of OLS regression, that the investment behaviour of all firms in all years is the same, is not realistic. The fixed effects model offers a way of relaxing that assumption, namely by assuming that each firm has a number of unique characteristics that influence the firm’s investment behaviour. These unique characteristics are captured in the model by including a separate dummy variable for each firm.

In the Grunfeld example we assume:

Each firm has a unique characteristic which is stable over time
Random error term is assumed to satisfy the usual OLS assumptions
Hence each firm i gets a different intercept parameter, but the slope coefficients b2 and b3 are assumed to be the same for all firms
An easy way to estimate the model is to create a dummy variable for each firm and add these dummies to the model
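A minimal sketch of the idea, using made-up data rather than the Grunfeld data. Instead of adding dummies, it subtracts each firm's own mean from x and y before running OLS (the within transformation). This gives the same slope estimate as the dummy-variable approach. The helper names and the single regressor are simplifications for the illustration.

```python
from collections import defaultdict

def slope(x, y):
    """Slope of a simple OLS regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(groups, values):
    """Subtract each group's mean from the values belonging to that group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]

# Hypothetical panel: two firms, three years each.
# Firm A: y = 10 + 2x, firm B: y = 0 + 2x (same slope, different intercept).
firm = ["A", "A", "A", "B", "B", "B"]
x = [1, 2, 3, 4, 5, 6]
y = [12, 14, 16, 8, 10, 12]

pooled_slope = slope(x, y)  # ignores the firm-specific intercepts
fe_slope = slope(demean_by_group(firm, x), demean_by_group(firm, y))

print("pooled OLS slope:   ", pooled_slope)
print("fixed effects slope:", fe_slope)
```

In this constructed example pooled OLS even gets the sign of the slope wrong, while the fixed effects estimate recovers the common slope of 2.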

General equation FE model
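
The equation itself is missing from these notes. A reconstruction based on the description above (a firm-specific intercept, common slopes b2 and b3), with y, x2 and x3 as placeholder variable names, would be:

```latex
y_{it} = b_{1i} + b_2 x_{2it} + b_3 x_{3it} + e_{it}
```

Here i indexes the firms and t the years. Written with one dummy variable D_j per firm (D_{j,it} = 1 if observation it belongs to firm j, and 0 otherwise), the same model reads:

```latex
y_{it} = \sum_{j=1}^{N} b_{1j} D_{j,it} + b_2 x_{2it} + b_3 x_{3it} + e_{it}
```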

Restrictions of FE model, the FE model is very powerful but:

We cannot include variables that do not vary over time: all stable characteristics are captured by the dummies, which leaves no variation for estimating the effects of variables that vary only between economic entities
You can only include those that change over time
However, you can still examine the interaction between group dummies and time-varying variables in the FE model
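Why time-invariant variables drop out can be seen by subtracting the firm's own mean from each variable, which is what the firm dummies do in effect. The variable names and values below are hypothetical.

```python
def demean(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# One hypothetical firm observed in four years.
founding_year = [1950, 1950, 1950, 1950]   # does not vary over time
capital = [10.0, 12.0, 15.0, 11.0]         # varies over time

print(demean(founding_year))  # every entry is zero: nothing left to estimate with
print(demean(capital))        # variation over time remains
```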

When should you use an FE model? → If you are concerned about omitted factors that may be correlated with key predictors at the group level

Interpretation of results → similar to OLS

Logs in the regression equation, in general don’t forget:

Sign, size, significance
Use the unit of measurement of y and x when given
Ceteris paribus

4 situations:
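
The four situations are not listed in these notes. Presumably they are the four standard combinations of logged and unlogged y and x. Under that assumption, the usual textbook interpretations of the slope coefficient b can be sketched as follows; the function name interpret and the labels are made up for the illustration, and the percentage readings are approximations that work best for small changes.

```python
def interpret(form, b):
    """Textbook reading of the slope b for four functional forms (ceteris paribus)."""
    if form == "level-level":   # y = a + b*x
        return f"A one-unit increase in x changes y by {b:g} units."
    if form == "log-level":     # ln(y) = a + b*x
        return f"A one-unit increase in x changes y by about {100 * b:g} percent."
    if form == "level-log":     # y = a + b*ln(x)
        return f"A one-percent increase in x changes y by about {b / 100:g} units."
    if form == "log-log":       # ln(y) = a + b*ln(x)
        return f"A one-percent increase in x changes y by about {b:g} percent."
    raise ValueError(f"unknown functional form: {form}")

for form, b in [("level-level", 3), ("log-level", 0.05),
                ("level-log", 2.0), ("log-log", 0.8)]:
    print(form, "->", interpret(form, b))
```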

Robustness/sensitivity analysis:

To what end? → To determine how sensitive your results are to changes in the model

Experiment with:

Combinations of (other) control variables
Datasets
Time frames

Always rely on theory and literature

If your results remain the same, the results are robust.
