Startmagazine: Introduction to Statistics

Introduction to Statistics: in short

  • Statistics comprises the mathematical procedures used to organize, summarize and interpret information. By means of statistics you can present information in a compact manner.
  • The aim of statistics is twofold: 1) organizing and summarizing information, in order to publish research results, and 2) answering research questions, which the researcher formulates beforehand.

Introduction to Statistics

This page presents an explanation of some fundamental concepts regarding statistics. In the connected pages you can find:

  • A glossary of the most important terms generally associated with Introduction to statistics
  • Selected contributions of other WorldSupporters regarding Introduction to statistics
  • Practice questions for Introduction to statistics
  • Tips, explanations and examples per topic when encountering, understanding and applying statistics (feel free to explore!)
  • Updates of contributions by WorldSupporter Statistics

As a behavioral scientist, it is important to understand statistics. Research is conducted using empirical techniques, of which statistics is an essential part. When you understand which technique applies in which situation, you can use statistics correctly.

Basic terminology

Research is often conducted to examine the association between variables. A variable is a characteristic or condition that can change or that takes different values for different individuals. Age is an example; variables like this are called person variables. Variables can also describe characteristics of the surroundings, such as temperature; these are called environmental variables. Variables are denoted by letters, for example variable X and variable Y.

There are different kinds of variables. An independent variable is a variable that is manipulated by the researcher. It often comprises two or more conditions to which participants are exposed. The dependent variable is the variable that is observed after the independent variable has been manipulated. It shows the effect of the different conditions of the independent variable. Experiments often include a control group. This group receives no treatment or a placebo, so that the researcher can see whether the experimental condition differs from the control group.

Variables can also be subdivided into discrete and continuous variables. A discrete variable consists of separate, indivisible categories. For example, a class can consist of 18 or 19 children, but not of 18.5 children. For a continuous variable, infinitely many values are possible between any two observed values. Think for example of height and weight.

Many of the variables that are examined are hypothetical constructs, such as self-confidence. These constructs cannot be measured directly. To measure them, they have to be defined in a way that can be examined. For example, intelligence can be examined by using an IQ test. An operational definition describes how a construct should be measured. For example, hunger can be defined operationally as ‘the state someone is in after not eating for at least 12 hours’.

Research designs

Researchers can use four different research designs to test hypotheses:

  1. Descriptive research: with descriptive research, the behavior, thoughts and feelings of a group of individuals are described. Developmental psychologists for example try to describe the behavior of children of different ages.

  2. Correlational research: with correlational research, the association between variables is studied. With correlational studies, no statements can be made about cause-and-effect relationships.

  3. Experimental research: in experimental studies, a variable (the independent variable) is manipulated to examine its possible effects on behavior (the dependent variable). If behavior changes after the manipulation (and all other assumptions are met), we can conclude that the independent variable causes these changes. The main feature of an experiment is the manipulation of the independent variable.

  4. Quasi-experimental research: this type of design is used when researchers are, for whatever reason, not able to manipulate the variable. Think for example of gender and age. The researcher studies the effects of a variable or an event that occurs naturally and cannot be manipulated. Quasi-experiments provide less certainty than real experiments.

The research process

The research process comprises seven steps:

  1. Select a topic.

  2. Demarcate and specify the topic. Study prior research with regard to your topic and specify the research question(s).

  3. Set up a plan to answer the research question, and examine which research design is most appropriate for this.

  4. Collect data to find an answer to your question.

  5. Analyse the data. Look for patterns in your data.

  6. Interpret the data; give meaning to your data.

  7. Publish the results of your research, and inform others about the results.

The above steps are rarely clearly separated from each other: conducting research is an iterative process in which many steps are intertwined. In addition, you sometimes have to go back to an earlier step of the process.

Types of statistics

There are different types of statistics. Descriptive statistics is used to describe the data: we can calculate the mean, display the data in a graph or look for extreme scores. Inferential statistics refers to making inferences about the population based on a sample. By means of inferential statistics, we try to answer research questions about the population. When a measure refers to the whole population, it is called a parameter. When a measure refers to the sample, it is called a statistic. Statistics are thus estimates of parameters.

Basic symbols

In statistics it is important not to lose sight of the difference between the statistic, which describes only the sample, and the parameter, which describes the entire population. Greek letters are used for population parameters and Roman letters for sample statistics. For a sample, ȳ indicates the mean and s the standard deviation. For a population, μ indicates the mean and σ the standard deviation. The sample mean and standard deviation can also be regarded as variables, because their values differ from sample to sample. The population mean and standard deviation are not variables, because there is only one population.
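
The symbols above can be written out as formulas. The following is a sketch under stated assumptions: the sample has n observations y_1, ..., y_n, the population has N values, and the sample standard deviation divides by n - 1 (a common convention that the text itself does not specify).

```latex
% Sample statistics (Roman letters)
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2}

% Population parameters (Greek letters)
\mu = \frac{1}{N}\sum_{i=1}^{N} y_i, \qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (y_i - \mu)^2}
```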

Below, you see a table with some useful and frequently used symbols:

Symbol | Symbol name | Meaning / definition | Example
--- | --- | --- | ---
P(A) | probability function | probability of event A | P(A) = 0.5
f(x) | probability density function (pdf) | P(a ≤ x ≤ b) = ∫ f(x) dx, integrated from a to b |
F(x) | cumulative distribution function (cdf) | F(x) = P(X ≤ x) |
μ | population mean | mean of population values | μ = 10
E(X) | expectation value | expected value of random variable X | E(X) = 10
E(X ∣ Y) | conditional expectation | expected value of random variable X given Y | E(X ∣ Y=2) = 5
var(X) | variance | variance of random variable X | var(X) = 4
σ² | variance | variance of population values | σ² = 4
std(X) | standard deviation | standard deviation of random variable X | std(X) = 2
σ_X | standard deviation | standard deviation value of random variable X | σ_X = 2
x̃ | median | middle value of random variable x |
ρ_X,Y | correlation | correlation of random variables X and Y | ρ_X,Y = 0.6
Σ | summation | sum of all values in range of series |
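
Several of these symbols can be illustrated with a small discrete example. The sketch below is not taken from the source: it assumes a fair six-sided die as a hypothetical random variable X, and the helper names `expectation`, `variance` and `cdf` are made up for illustration.

```python
from fractions import Fraction
from math import sqrt


def expectation(pmf):
    """E(X): sum of x * P(X = x) over all outcomes."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """var(X): expected squared deviation from E(X)."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def cdf(pmf, x):
    """F(x) = P(X <= x) for a discrete random variable."""
    return sum(p for value, p in pmf.items() if value <= x)


# Hypothetical example: a fair six-sided die, each face with probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}

mean_value = expectation(die)   # Fraction(7, 2), i.e. 3.5
var_value = variance(die)       # Fraction(35, 12)
std_value = sqrt(var_value)     # about 1.71
prob_at_most_2 = cdf(die, 2)    # Fraction(1, 3)
```

Exact fractions are used so that the results are not blurred by floating-point rounding; only the standard deviation is converted to a float.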

Measurement levels

Variables can be subdivided into four different measurement levels, which are summarized below from lowest to highest level of measurement:

  1. Nominal: the simplest (lowest) measurement level is the nominal scale. For nominal variables, numbers only refer to categories. Measurements on the nominal scale categorize and label observations. For example, the number 1 can be used for ‘men’ and the number 2 for ‘women’. You cannot do calculations with these numbers, because they are only labels.

  2. Ordinal: an ordinal variable comprises a set of categories with an ordering. For example, you can order the participants of a singing competition from worst to best on the basis of the applause they received. However, the ordering does not tell us how much more applause one singer received than another.

  3. Interval: here, we do speak of ‘real’ numbers. Equal differences between numbers on this scale reflect equal differences in the measured property. However, interval variables have no true zero point. For example, a temperature of 0 °C does not mean that there is no temperature. We can add and subtract the values of an interval variable, but because there is no true zero, ratios between them are not meaningful.

  4. Ratio: here, the scale does have a true zero point. Because of this, we are able to add, subtract, multiply and divide observations. Examples of ratio-scaled variables are weight and reaction time.

 

Levels of Measurement (from highest to lowest):

  • Ratio: absolute zero
  • Interval: distance is meaningful; no absolute zero
  • Ordinal: attributes can be ordered; distance not meaningful
  • Nominal: attributes are only named; cannot be ordered
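
The four levels determine which summaries make sense. The sketch below is illustrative only: the data are hypothetical, and the `MEANINGFUL` lookup table is one common way of summarizing the rules rather than something defined in the source.

```python
from statistics import mean, median, mode

# Hypothetical data, one variable per measurement level
gender_codes = [1, 2, 2, 1, 2]            # nominal: 1 = men, 2 = women
applause_rank = [1, 2, 3, 4, 5]           # ordinal: worst (1) to best (5)
temperature_c = [18.0, 21.5, 19.0, 22.5]  # interval: no true zero
reaction_ms = [250.0, 500.0, 310.0]       # ratio: true zero

# Operations that are meaningful at each level (each level adds to the previous one)
MEANINGFUL = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean", "difference"},
    "ratio": {"mode", "median", "mean", "difference", "ratio"},
}


def is_meaningful(level, operation):
    """Return True if the operation is interpretable at this measurement level."""
    return operation in MEANINGFUL[level]


most_common_code = mode(gender_codes)            # 2: the most frequent label
middle_rank = median(applause_rank)              # 3
mean_temperature = mean(temperature_c)           # 20.25
times_slower = reaction_ms[1] / reaction_ms[0]   # 2.0: ratios need a true zero
```

Python will happily compute `mean(gender_codes)`, but the result has no interpretation, which is why the level of measurement has to be tracked by the analyst and not by the software.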

 

Glossary for Introduction to Statistics

What is statistics?

Statistics is the science of data, encompassing its collection, analysis, interpretation, and communication to extract knowledge and inform decision-making.

This definition focuses on the core aspects of the field:

  • Data-driven: Statistics revolves around analyzing and interpreting data, not just manipulating numbers.
  • Knowledge extraction: The goal is to gain insights and understanding from data, not just generate summaries.
  • Decision-making: Statistics supports well-founded choices in various settings.

Statistics has a wide application:

1. Design and Inference:

  • Designing studies: Statisticians use statistical principles to design experiments, surveys, and observational studies that allow for reliable inferences.
  • Drawing conclusions: Statistical methods help estimate population parameters from sample data, accounting for uncertainty and variability.

2. Modeling and Analysis:

  • Identifying relationships: Statistical models reveal patterns and relationships among variables, aiding in understanding complex systems.
  • Quantitative analysis: Various statistical techniques, from regression to machine learning, enable deep analysis of data structures and trends.

3. Interpretation and Communication:

  • Meaningful conclusions: Statisticians go beyond numbers to draw meaningful and context-specific conclusions from their analyses.
  • Effective communication: Clear and concise communication of findings, including visualizations, is crucial for informing stakeholders and advancing knowledge.

Applications across disciplines:

These core principles of statistics find diverse applications in various academic fields:

  • Social sciences: Understanding societal patterns, testing hypotheses about human behavior, and evaluating policy interventions.
  • Natural sciences: Analyzing experimental data, modeling physical phenomena, and drawing inferences about natural processes.
  • Business and economics: Forecasting market trends, evaluating business strategies, and guiding investment decisions.
  • Medicine and public health: Analyzing clinical trials, identifying risk factors for disease, and informing healthcare policies.

Ultimately, statistics plays a crucial role in numerous academic disciplines, serving as a powerful tool for extracting knowledge, informing decisions, and advancing human understanding.

What is a variable?

A statistical variable is a characteristic, attribute, or quantity that can assume different values and can be measured or counted within a given population or sample. It's essentially a property that changes across individuals or observations.

Key Points:

  • Variability: The defining feature is that the variable takes on different values across units of analysis.
  • Measurable: The values can be measured, counted, or classified into categories for each unit of analysis.
  • Population vs. Sample: Variables can be defined for a whole population or a sampled subset.

Examples:

  • Human height in centimeters (continuous variable)
  • Eye color (categorical variable with specific options)
  • Annual income in dollars (continuous variable)
  • Number of siblings (discrete variable with whole number values)

Applications:

  • Research: Identifying and measuring variables of interest is crucial in research questions and designing studies.
  • Data analysis: Different statistical methods are applied based on the type of variable (continuous, categorical, etc.).
  • Modeling: Variables are the building blocks of statistical models that explore relationships and make predictions.
  • Summaries and comparisons: We use descriptive statistics like averages, medians, and standard deviations to summarize characteristics of variables.

Types of Variables:

  • Quantitative: Measurable on a numerical scale (e.g., height, income, age).
  • Qualitative: Described by categories or attributes (e.g., eye color, education level, city).
  • Discrete: Takes on distinct, countable values (e.g., number of children, shoe size).
  • Continuous: Takes on any value within a range (e.g., weight, temperature, time).
  • Dependent: Variable being studied and potentially influenced by other variables.
  • Independent: Variable influencing the dependent variable.

Understanding variables is crucial for interpreting data, choosing appropriate statistical methods, and drawing valid conclusions from your analysis.

What is the difference between the dependent and independent variables?

The dependent and independent variables are two crucial concepts in research and statistical analysis. They represent the factors involved in understanding cause-and-effect relationships.

Independent Variable:

  • Definition: The variable that is manipulated or controlled by the researcher. It's the presumed cause in a cause-and-effect relationship.
  • Applications:
    • Experimental design: The researcher changes the independent variable to observe its effect on the dependent variable.
    • Observational studies: The researcher measures the independent variable alongside the dependent variable to see if any correlations exist.
    • Examples: Dose of medication, study method, temperature in an experiment.

Dependent Variable:

  • Definition: The variable that is measured and expected to change in response to the independent variable. It's the presumed effect in a cause-and-effect relationship.
  • Applications:
    • Measures the outcome or response of interest in a study.
    • Affected by changes in the independent variable.
    • Examples: Plant growth, test score, patient recovery rate.

Key Differences:

Feature | Independent Variable | Dependent Variable
--- | --- | ---
Manipulation | Controlled by researcher | Measured by researcher
Role | Cause | Effect
Example | Study method | Test score

Side Notes:

  • In some cases, the distinction between independent and dependent variables can be less clear-cut, especially in complex studies or observational settings.
  • Sometimes, multiple independent variables may influence a single dependent variable.
  • Understanding the relationship between them is crucial for drawing valid conclusions from your research or analysis.

Additional Applications:

  • Regression analysis: Independent variables are used to predict the dependent variable.
  • Hypothesis testing: We test whether changes in the independent variable cause changes in the dependent variable as predicted by our hypothesis.
  • Model building: Both independent and dependent variables are used to build models that explain and predict real-world phenomena.

By understanding the roles of independent and dependent variables, you can effectively design studies, analyze data, and draw meaningful conclusions from your research.
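
The regression use mentioned above can be sketched with ordinary least squares for one independent variable. This is a minimal illustration and not a method prescribed by the source: the function name `fit_line` and the data (hours studied and test scores) are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: hours studied (independent) and test score (dependent)
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

intercept, slope = fit_line(hours, scores)   # slope is about 4.9 points per hour
predicted_score = intercept + slope * 6      # prediction for 6 hours of study
```

The independent variable goes in as `x` and the dependent variable as `y`; swapping them gives a different line, which is one reason the distinction matters in practice.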

What is the difference between discrete and continuous variables?

Both discrete and continuous variables are used to represent and measure things, but they differ in the way they do so:

Discrete variables:

  • Represent countable values
  • Have distinct, separate categories with no values in between
  • Think of them as individual units you can count
  • Examples: Number of people in a room, number of correct answers on a test, grades (A, B, C, etc.), size categories (S, M, L), number of days in a month.

Continuous variables:

  • Represent measurable values that can take on an infinite number of values within a range
  • Don't have distinct categories and can be divided further and further
  • Think of them as measurements along a continuous scale
  • Examples: Height, weight, temperature, time, distance, speed, volume.

Here's a table to summarize the key differences:

Feature | Discrete variable | Continuous variable
--- | --- | ---
Type of values | Countable | Measurable
Categories | Distinct, no values in between | No distinct categories, can be divided further
Example | Number of apples | Weight of an apple

Additional points to consider:

  • Discrete variables can sometimes be grouped into ranges: For example, instead of reporting the exact number of siblings, you might group the counts into ranges (0, 1-2, 3 or more). However, the underlying nature of the variable remains discrete.
  • Continuous variables can be converted to discrete by grouping: For example, you could create discrete categories for temperature (e.g., below freezing, warm, hot). However, this loses information about the actual measurement.
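
Grouping a continuous variable into categories can be sketched as follows. The cut points and the extra "cool" label are assumptions chosen for illustration; the source only names the categories loosely.

```python
from bisect import bisect_right

# Hypothetical cut points in degrees Celsius and one label per interval
CUT_POINTS = [0.0, 15.0, 25.0]
LABELS = ["below freezing", "cool", "warm", "hot"]


def temperature_category(celsius):
    """Map a continuous temperature onto a discrete category."""
    # bisect_right counts how many cut points are <= the reading
    return LABELS[bisect_right(CUT_POINTS, celsius)]


readings = [-3.2, 14.9, 15.0, 31.4]
categories = [temperature_category(t) for t in readings]
# ['below freezing', 'cool', 'warm', 'hot']
```

The loss of information is visible in the output: 14.9 and 15.0 differ by a tenth of a degree yet land in different categories, while any two readings inside the same interval become indistinguishable.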

What is a descriptive research design?

In the world of research, a descriptive research design aims to provide a detailed and accurate picture of a population, situation, or phenomenon. Unlike experimental research, which seeks to establish cause-and-effect relationships, descriptive research focuses on observing and recording characteristics or patterns without manipulating variables.

Think of it like taking a snapshot of a particular moment in time. It can answer questions like "what," "where," "when," "how," and "who," but not necessarily "why."

Here are some key features of a descriptive research design:

  • No manipulation of variables: The researcher does not actively change anything in the environment they are studying.
  • Focus on observation and data collection: The researcher gathers information through various methods, such as surveys, interviews, observations, and document analysis.
  • Quantitative or qualitative data: Descriptive research can use both quantitative data (numerical) and qualitative data (descriptive) to paint a comprehensive picture.
  • Different types: There are several types of descriptive research, including:
    • Cross-sectional studies: Observe a group of people or phenomena at a single point in time.
    • Longitudinal studies: Observe a group of people or phenomena over time.
    • Case studies: Deeply investigate a single individual, group, or event.

Here are some examples of when a descriptive research design might be useful:

  • Understanding the characteristics of a population: For example, studying the demographics of a city or the buying habits of consumers.
  • Describing a phenomenon: For example, observing the behavior of animals in their natural habitat or documenting the cultural traditions of a community.
  • Evaluating the effectiveness of a program or intervention: For example, studying the impact of a new educational program on student learning.

While descriptive research doesn't necessarily explain why things happen, it provides valuable information that can be used to inform further research, develop interventions, or make informed decisions.

What is a correlational research design?

A correlational research design investigates the relationship between two or more variables without directly manipulating them. In other words, it helps us understand how two things might be connected, but it doesn't necessarily prove that one causes the other.

Imagine it like this: you observe that people who sleep more hours tend to score higher on tests. This correlation suggests a link between sleep duration and test scores, but it doesn't prove that getting more sleep causes higher scores. There could be other factors at play, like individual study habits or overall health.

Here are some key characteristics of a correlational research design:

  • No manipulation: Researchers observe naturally occurring relationships between variables, unlike experiments where they actively change things.
  • Focus on measurement: Both variables are carefully measured using various methods, like surveys, observations, or tests.
  • Quantitative data: The analysis mostly relies on numerical data to assess the strength and direction of the relationship.
  • Types of correlations: The relationship can be positive (both variables increase or decrease together), negative (one increases while the other decreases), or nonexistent (no clear pattern).

Examples of when a correlational research design is useful:

  • Exploring potential links between variables: Studying the relationship between exercise and heart disease, screen time and mental health, or income and educational attainment.
  • Developing hypotheses for further research: Observing correlations can trigger further investigations to determine causal relationships through experiments.
  • Understanding complex phenomena: When manipulating variables is impractical or unethical, correlations can provide insights into naturally occurring connections.

Limitations of correlational research:

  • It cannot establish causation: Just because two things are correlated doesn't mean one causes the other. Alternative explanations or even coincidence can play a role.
  • Third-variable problem: Other unmeasured factors might influence both variables, leading to misleading correlations.

While correlational research doesn't provide definitive answers, it's a valuable tool for exploring relationships and informing further research. Always remember to interpret correlations cautiously and consider alternative explanations.
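
The strength and direction of a linear relationship are commonly summarized with Pearson's correlation coefficient. The sketch below computes it by hand for hypothetical sleep and test-score data; the function name `pearson_r` is an illustrative choice, and the source does not say which coefficient it has in mind.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: hours of sleep and test scores for five people
sleep_hours = [5, 6, 7, 8, 9]
test_scores = [60, 65, 70, 72, 80]

r = pearson_r(sleep_hours, test_scores)   # close to +1: strong positive association
```

A value near +1 or -1 indicates a strong linear association, and a value near 0 indicates little or no linear association. Even a high value says nothing about which variable, if either, causes the other.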

What is an experimental research design?

An experimental research design takes the scientific inquiry a step further by testing cause-and-effect relationships between variables. Unlike descriptive research, which observes, and correlational research, which identifies relationships, experiments actively manipulate variables to determine if one truly influences the other.

Think of it like creating a controlled environment where you change one thing (independent variable) to see how it impacts another (dependent variable). This allows you to draw conclusions about cause and effect with more confidence.

Here are some key features of an experimental research design:

  • Manipulation of variables: The researcher actively changes the independent variable (the presumed cause) to observe its effect on the dependent variable (the outcome).
  • Control groups: Experiments often involve one or more control groups that don't experience the manipulation, providing a baseline for comparison.
  • Randomization: Participants are ideally randomly assigned to groups to control for any other factors that might influence the results.
  • Quantitative data: The analysis focuses on numerical data to measure and compare the effects of the manipulation.

Here are some types of experimental research designs:

  • True experiment: Considered the "gold standard" with a control group, random assignment, and manipulation of variables.
  • Quasi-experiment: Similar to a true experiment but lacks random assignment due to practical limitations.
  • Pre-test/post-test design: Measures the dependent variable before and after the manipulation, but lacks a control group.

Examples of when an experimental research design is useful:

  • Testing the effectiveness of a new drug or treatment: Compare groups receiving the drug with a control group receiving a placebo.
  • Examining the impact of an educational intervention: Compare students exposed to the intervention with a similar group not exposed.
  • Investigating the effects of environmental factors: Manipulate an environmental variable (e.g., temperature) and observe its impact on plant growth.

While powerful, experimental research also has limitations:

  • Artificial environments: May not perfectly reflect real-world conditions.
  • Ethical considerations: Manipulating variables may have unintended consequences.
  • Cost and time: Can be expensive and time-consuming to conduct.

Despite these limitations, experimental research designs provide the strongest evidence for cause-and-effect relationships, making them crucial for testing hypotheses and advancing scientific knowledge.
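
Random assignment, one of the key features listed above, can be sketched in a few lines. The participant IDs, group names and the `randomly_assign` helper are hypothetical; a real study would follow its own randomization protocol.

```python
import random


def randomly_assign(participants, groups=("treatment", "control"), seed=None):
    """Shuffle participants, then deal them out evenly over the groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment


# Hypothetical participant IDs P01 ... P20
participants = [f"P{i:02d}" for i in range(1, 21)]
assignment = randomly_assign(participants, seed=42)
```

Because chance alone decides who ends up in which group, pre-existing differences between participants are expected to balance out across groups on average, which is what supports a causal interpretation.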

What is a quasi-experimental research design?

In the realm of research, a quasi-experimental research design sits between an observational study and a true experiment. While it aims to understand cause-and-effect relationships like a true experiment, it faces certain limitations that prevent it from reaching the same level of control and certainty.

Think of it like trying to cook a dish with similar ingredients to a recipe, but lacking a few key measurements or specific tools. You can still identify some flavor connections, but the results might not be as precise or replicable as following the exact recipe.

Here are the key features of a quasi-experimental research design:

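  • Independent variable without full control: As in a true experiment, the effect of an independent variable is studied, but the variable is often a pre-existing characteristic or a naturally occurring event rather than something the researcher can manipulate freely.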
  • No random assignment: Unlike a true experiment, participants are not randomly assigned to groups. Instead, they are grouped based on pre-existing characteristics or naturally occurring conditions.
  • Control groups: Often involve a control group for comparison, but the groups may not be perfectly equivalent due to the lack of randomization.
  • More prone to bias: Because of the non-random assignment, factors other than the manipulation might influence the results, making it harder to conclude causation with absolute certainty.

Here are some reasons why researchers might choose a quasi-experimental design:

  • Practical limitations: When random assignment is impossible or unethical, such as studying existing groups or programs.
  • Ethical considerations: Randomly assigning participants to receive or not receive an intervention might be harmful or unfair.
  • Exploratory studies: Can be used to gather preliminary evidence before conducting a more rigorous experiment.

Here are some examples of quasi-experimental designs:

  • Pre-test/post-test design with intact groups: Compare groups before and after the intervention, but they weren't randomly formed.
  • Non-equivalent control group design: Select a comparison group that already differs from the intervention group in some way.
  • Natural experiment: Leverage naturally occurring situations where certain groups experience the intervention while others don't.

Keep in mind:

  • Although less conclusive than true experiments, quasi-experimental designs can still provide valuable insights and evidence for cause-and-effect relationships.
  • Careful interpretation of results and consideration of potential biases are crucial.
  • Sometimes, multiple forms of quasi-experimental evidence combined can create a stronger case for causation.

What are the seven steps of the research process?

While the specific steps might differ slightly depending on the research methodology and field, generally, the seven steps of the research process are:

1. Identify and Develop Your Topic:

  • Start with a broad area of interest and refine it into a specific research question.
  • Consider your personal interests, academic requirements, and potential contributions to the field.
  • Conduct preliminary research to get familiar with existing knowledge and identify gaps.

2. Find Background Information:

  • Consult scholarly articles, books, encyclopedias, and databases to understand the existing knowledge base on your topic.
  • Pay attention to key concepts, theories, and debates within the field.
  • Take notes and organize your findings to build a strong foundation for your research.

3. Develop Your Research Design:

  • Choose a research design that aligns with your research question and data collection methods (e.g., experiment, survey, case study).
  • Determine your sample size, data collection tools, and analysis methods.
  • Ensure your research design is ethical and feasible within your resources and timeframe.

4. Collect Data:

  • Implement your research design and gather your data using chosen methods (e.g., conducting interviews, running experiments, analyzing documents).
  • Be organized, meticulous, and ethical in your data collection process.
  • Document your methods and any challenges encountered for transparency and reproducibility.

5. Analyze Your Data:

  • Apply appropriate statistical or qualitative analysis methods to interpret your data.
  • Identify patterns, trends, and relationships that answer your research question.
  • Be aware of potential biases and limitations in your data and analysis.

6. Draw Conclusions and Interpret Findings:

  • Based on your analysis, draw conclusions that answer your research question and contribute to the existing knowledge.
  • Discuss the implications and significance of your findings for the field.
  • Acknowledge limitations and suggest future research directions.

7. Disseminate Your Findings:

  • Share your research through written reports, presentations, publications, or conferences.
  • Engage with the academic community and participate in discussions to contribute to the advancement of knowledge.
  • Ensure responsible authorship and proper citation of sources.

Remember, these steps are a general framework, and you might need to adapt them based on your specific research project.

What is the difference between descriptive and inferential statistics?

In the realm of data analysis, both descriptive statistics and inferential statistics play crucial roles, but they serve distinct purposes:

Descriptive Statistics:

  • Focus: Describe and summarize the characteristics of a dataset.
  • What they tell you: Provide information like central tendencies (mean, median, mode), variability (range, standard deviation), and frequency distributions.
  • Examples: Calculating the average age of a group of students, finding the most common hair color in a population sample, visualizing the distribution of income levels.
  • Limitations: Only analyze the data you have, cannot make generalizations about larger populations.

Inferential Statistics:

  • Focus: Draw conclusions about a population based on a sample.
  • What they tell you: Use sample data to estimate population characteristics, test hypotheses, and assess the likelihood of relationships between variables.
  • Examples: Testing whether a new teaching method improves student performance, comparing the average heights of two groups of athletes, evaluating the correlation between exercise and heart disease.
  • Strengths: Allow you to generalize findings to a broader population, make predictions, and test hypotheses. Whether a cause-and-effect conclusion is justified still depends on the research design.
  • Limitations: Reliant on the representativeness of the sample, require careful consideration of potential biases and margins of error.

Here's a table summarizing the key differences:

Feature | Descriptive Statistics | Inferential Statistics
--- | --- | ---
Focus | Describe data characteristics | Draw conclusions about populations
Information provided | Central tendencies, variability, distributions | Estimates, hypothesis tests, relationships
Examples | Average age, most common hair color, income distribution | Testing teaching method effectiveness, comparing athlete heights, exercise-heart disease correlation
Limitations | Limited to analyzed data, no generalizations | Reliant on sample representativeness, potential biases and error

Remember:

  • Both types of statistics are valuable tools, and the best choice depends on your research question and data availability.
  • Descriptive statistics lay the foundation by understanding the data itself, while inferential statistics allow you to draw broader conclusions and explore possibilities beyond the immediate dataset.
  • Always consider the limitations of each type of analysis and interpret the results with caution.
What is the difference between a parameter and a statistic?

In the world of data, where numbers reign supreme, understanding the difference between a parameter and a statistic is crucial. Here's the key difference:

Parameter:

  • Represents a characteristic of the entire population you're interested in.
  • It's a fixed, unknown value you're trying to estimate.
  • Think of it as the true mean, proportion, or other measure of the entire population (like the average height of all humans).
  • It's usually denoted by Greek letters (e.g., mu for population mean, sigma for population standard deviation).

Statistic:

  • Represents a characteristic of a sample drawn from the population.
  • It's a calculated value based on the data you actually have.
  • Think of it as an estimate of the true parameter based on a smaller group (like the average height of your classmates).
  • It's usually denoted by Roman letters (e.g., x-bar for sample mean, s for sample standard deviation).

Here's an analogy:

  • Imagine you want to know the average weight of all elephants on Earth (parameter). You can't weigh every elephant, so you take a sample of 100 elephants and calculate their average weight (statistic). This statistic estimates the true average weight, but it might not be exactly the same due to sampling variability.

Here are some additional key points:

  • You can rarely measure a parameter directly, because that would require data on the entire population, but you can estimate it using statistics.
  • The more representative your sample is of the population, the more likely your statistic is to be close to the true parameter.
  • Different statistics can be used to estimate different parameters.
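The elephant analogy can be sketched as a small simulation. Everything here is an assumption made for illustration: the population size, the mean of 4000 kg, and the spread of 500 kg are invented numbers. In real research the parameter is unknown; it can be computed here only because the whole population is simulated.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: weights (kg) of 10,000 elephants (made-up numbers)
population = [random.gauss(4000, 500) for _ in range(10_000)]
mu = statistics.mean(population)   # parameter: mean of the whole population

# Draw a random sample of 100 elephants without replacement
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)    # statistic: mean of the sample

print(f"parameter mu    = {mu:.1f}")
print(f"statistic x-bar = {x_bar:.1f}")
```

The two values will typically be close but not identical; the gap between them is the sampling variability mentioned above.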
What is the nominal measurement level?

In the realm of data and research, the nominal measurement level represents the most basic way of classifying data. It focuses on categorization and labeling, without any inherent order or numerical value associated with the categories. Imagine it like sorting socks by color - you're simply grouping them based on a distinct characteristic, not measuring any quantitative aspects.

Here are some key features of the nominal measurement level:

  • Categorical data: Values represent categories or labels, not numbers.
  • No inherent order: The categories have no specific ranking or hierarchy (e.g., red socks are not "better" than blue socks).
  • Limited operations: You can only count the frequency of each category (e.g., how many red socks, how many blue socks).
  • Examples: Hair color (blonde, brown, black), blood type (A, B, AB, O), eye color (blue, green, brown), country of origin.

Here are some important things to remember about the nominal level:

  • You cannot perform mathematical operations like addition, subtraction, or averaging on nominal data.
  • Statistical tests used with nominal data focus on comparing frequencies across categories (e.g., chi-square test).
  • It's a valuable level for initial categorization and understanding basic relationships between variables.

While it may seem simple, the nominal level plays a crucial role in research by setting the foundation for further analysis and providing insights into basic structures and trends within data. It's like the first step in organizing your closet before you can compare shirt sizes or count the total number of clothes.
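A minimal sketch of the one operation that nominal data supports, counting frequencies per category, using `collections.Counter` from the Python standard library. The hair colors are invented example values.

```python
from collections import Counter

# Hypothetical nominal observations (illustrative data only)
hair_colors = ["brown", "blonde", "black", "brown", "brown", "black"]

# Count how often each category occurs
counts = Counter(hair_colors)

# The mode (most frequent category) is the only "average" that makes sense here
most_common_color, most_common_count = counts.most_common(1)[0]

print(counts)
print(most_common_color, most_common_count)
```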

What is the ordinal measurement level?

In the world of data measurement, the ordinal level takes things a step further than the nominal level. While still focused on categorization, it introduces the concept of order. Think of it like sorting t-shirts based on size - you're not just labeling them (small, medium, large), but you're also arranging them in a specific order based on their size value.

Here are the key features of the ordinal measurement level:

  • Categorical data: Similar to nominal level, it represents categories or labels.
  • Ordered categories: The categories have a specific rank or sequence (e.g., small < medium < large).
  • Limited operations: You can still only count the frequency of each category, but you can also compare and rank them.
  • Examples: Educational attainment (high school, bachelor's degree, master's degree), movie rating (1-5 stars), customer satisfaction level (very dissatisfied, somewhat dissatisfied, neutral, somewhat satisfied, very satisfied).

Here are some important points to remember about the ordinal level:

  • You cannot perform calculations like adding or subtracting ordinal data because the intervals between categories might not be equal (e.g., the difference between "medium" and "large" t-shirts might not be the same as the difference between "small" and "medium").
  • Statistical tests used with ordinal data often focus on comparing ranks or order (e.g., median test, Mann-Whitney U test).
  • It provides more information than the nominal level by revealing the relative position of each category within the order.

While still limited in calculations, the ordinal level allows you to understand not only the "what" (categories) but also the "which is more" (relative order) within your data. It does not tell you how much more. It's like organizing your bookshelf not only by genre but also from oldest to newest.
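The following sketch shows ranking and finding the median category for ordinal data. The category order is taken from the customer satisfaction example above; the `responses` list is hypothetical.

```python
# Ordered categories, from lowest to highest
LEVELS = [
    "very dissatisfied",
    "somewhat dissatisfied",
    "neutral",
    "somewhat satisfied",
    "very satisfied",
]
# Map each category to its rank (0 = lowest)
rank = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical survey responses (illustrative data only)
responses = [
    "neutral",
    "very satisfied",
    "somewhat dissatisfied",
    "somewhat satisfied",
    "neutral",
]

# Sort the responses by rank and pick the middle one (odd number of responses)
ordered = sorted(responses, key=lambda response: rank[response])
median_response = ordered[len(ordered) // 2]

print(median_response)
```

The ranks are used only for ordering. Averaging them would treat the gaps between categories as equal, which the ordinal level does not guarantee.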

What is the interval measurement level?

In the world of data analysis, the interval measurement level represents a step towards more precise measurements. It builds upon the strengths of the ordinal level by adding equal intervals between categories. Think of it like measuring temperature on a Celsius scale - you have ordered categories (degrees), but the difference between 20°C and 30°C is the same as the difference between 10°C and 20°C.

Here are the key features of the interval measurement level:

  • Quantitative data: Represents numerical values, not just categories.
  • Ordered categories: Similar to the ordinal level, categories have a specific rank or sequence.
  • Equal intervals: The distance between each category is consistent and measurable (e.g., each degree on a Celsius scale represents the same change in temperature).
  • Arbitrary zero point: The zero point doesn't represent an absence of the variable; it is only an agreed reference point on the scale (e.g., 0°C doesn't mean "no temperature," but marks the freezing point of water).
  • Wider range of operations: You can perform calculations like addition, subtraction, and averaging, but ratios between values (multiplying or dividing one value by another) are not meaningful because of the arbitrary zero point.
  • Examples: Temperature (Celsius or Fahrenheit), calendar dates and time of day, IQ scores, standardized test scores.

Here are some important points to remember about the interval level:

  • While intervals are equal, the ratios between values might not be meaningful (e.g., saying someone with an IQ of 150 is "twice as intelligent" as someone with an IQ of 75 isn't accurate).
  • Statistical tests used with interval data often focus on means, standard deviations, and comparisons of differences between groups (e.g., t-tests, ANOVA).
  • It provides valuable insights into the magnitude and relative differences between data points, offering a deeper understanding of the underlying phenomenon.

Think of the interval level like taking your t-shirt sorting a step further - you're not just ranking sizes but also measuring the exact difference in centimeters between each size. This allows for more precise analysis and comparisons.
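A short sketch of why differences are meaningful on an interval scale but ratios are not. It converts the Celsius readings from the example above to Fahrenheit; the function name `celsius_to_fahrenheit` is just an illustrative helper.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Convert a Celsius reading to Fahrenheit."""
    return c * 9 / 5 + 32

# Equal intervals: two gaps of 10 degrees Celsius each ...
gap_low_c = 20 - 10
gap_high_c = 30 - 20
# ... are still equal to each other after switching to Fahrenheit.
gap_low_f = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
gap_high_f = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(20)

# Ratios change with the scale, because the zero point is arbitrary.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

print(gap_low_f, gap_high_f)
print(ratio_c, ratio_f)
```

Because the ratio depends on the unit chosen, a statement like "20°C is twice as warm as 10°C" has no fixed meaning.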

What is the ratio measurement level?

In the realm of measurement, the ratio level stands as the most precise and informative among its peers. It builds upon the strengths of the interval level by introducing a true zero point, allowing for meaningful comparisons of magnitudes and ratios between values. Imagine measuring distance in meters - not only are the intervals between meters equal, but a zero value on the scale truly represents a complete absence of distance.

Here are the key features of the ratio measurement level:

  • Quantitative data: Represents numerical values with clear meanings.
  • Ordered categories: Similar to previous levels, categories have a specific rank or sequence.
  • Equal intervals: Like the interval level, the distance between each category is consistent and measurable.
  • True zero point: The zero point signifies the complete absence of the variable (e.g., zero meters means absolutely no distance, zero seconds means no time passed).
  • Widest range of operations: You can perform all mathematical operations (addition, subtraction, multiplication, and division) on ratio data, as the ratios between values have real meaning.
  • Examples: Length (meters, centimeters), weight (kilograms, grams), time (seconds with a true zero at the starting point), age (years since birth).

Here are some important points to remember about the ratio level:

  • It offers the most precise and informative level of measurement, allowing for comparisons of actual magnitudes and ratios.
  • Statistical tests used with ratio data often focus on ratios, proportions, and growth rates (e.g., comparing income levels, analyzing reaction times).
  • It's not always possible to achieve a true zero point in every measurement situation, limiting the application of the ratio level in some cases.

Think of the ratio level like having a ruler marked not just with numbers but also with clear and meaningful reference points - you can not only measure the length of an object but also say it's twice as long as another object. This level unlocks the most powerful analysis capabilities.
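By contrast with interval scales, a ratio between two values on a ratio scale stays the same whatever unit is used, because zero means "none" in every unit. A minimal sketch with two hypothetical lengths:

```python
def meters_to_centimeters(m: float) -> float:
    """Convert a length in meters to centimeters."""
    return m * 100

# Two hypothetical object lengths in meters (illustrative values only)
short_m, long_m = 1.5, 3.0

# The ratio computed in meters ...
ratio_m = long_m / short_m
# ... equals the ratio computed in centimeters.
ratio_cm = meters_to_centimeters(long_m) / meters_to_centimeters(short_m)

print(ratio_m, ratio_cm)
```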

Practice Questions for Introduction to Statistics

Questions

1. What is the difference between an independent and a dependent variable? Describe both terms.
 
2. Study both definitions of the term performance-motivation, as given below.
  1. Someone is assigned the task of building a tower of matches. Performance-motivation refers to the number of attempts someone makes before he or she quits.
  2. Performance-motivation is the ability to set yourself to do a certain performance.
Are these definitions conceptual or operational?
 
3. What does ‘correlational research’ mean?
 
4. A researcher wants to examine to what extent giftedness of children at secondary school coincides with behavioral problems in the class. What kind of research is suitable to examine this question?
 
5. What is the form of statistics called, that focuses on drawing conclusions?
 
6. Someone claims about a certain variable that the score of Elise is twice as large as the score of Adriaan. Which level of measurement should the variable at least have for such a claim to be meaningful?
 
7. In a study, the variable intelligence is measured as:
1 = IQ below 70
2 = IQ between 71 and 90
3 = IQ between 91 and 110
4 = IQ between 111 and 120
5 = IQ higher than 120
Which level of measurement does this variable hold?
 
8. In a study, the connection between gender, age and cognitive abilities is examined. Which of these variables can exclusively play a role as independent variable in psychological research?

 

Answers

1. What is the difference between an independent and a dependent variable? Describe both terms.
An independent variable is a variable that is being manipulated by the researcher. Often, it consists of two or more conditions to which the participants are exposed. The dependent variable is the variable that is observed after the independent variable has been manipulated.
 
2. Study both definitions of the term performance-motivation, as given below.
  1. Someone is assigned the task of building a tower of matches. Performance-motivation refers to the number of attempts someone makes before he or she quits.
  2. Performance-motivation is the ability to set yourself to do a certain performance.
Are these definitions conceptual or operational?
Definition I is operational.
Definition II is conceptual.
 
3. What does ‘correlational research’ mean?
With this type of study, the association between variables is studied. With correlational studies, no claims can be made about cause-and-effect relationships.
 
4. A researcher wants to examine to what extent giftedness of children at secondary school coincides with behavioral problems in the class. What kind of research is suitable to examine this question?
Correlational research
 
5. What is the form of statistics called, that focuses on drawing conclusions?
Inferential statistics.
This method assumes that the independent variable has had a certain effect, if the difference between the means of the conditions is larger than expected based upon coincidence only. Therefore, we compare group means that we found with group means that ...

Selected contributions for Introduction to Statistics

What are statistical methods? – Chapter 1

1.1 What is statistics and how can you learn it?

Statistics is used more and more often to study the behavior of people, not only by the social sciences but also by companies. Everyone can learn how to use statistics, even without much knowledge of mathematics and even with a fear of statistics. Most important are logical thinking and perseverance.

The first step in using statistical methods is collecting data. Data are collected observations of characteristics of interest, for instance the opinions of 1000 people on whether marijuana should be allowed. Data can be obtained through questionnaires, experiments, observations or existing databases.

But statistics aren't only numbers obtained from data. A broader definition of statistics entails all methods to obtain and analyze data.

1.2 What is the difference between descriptive and inferential statistics?

Before data can be analyzed, a design is made for how to obtain the data. Next, there are two sorts of statistical analyses: descriptive statistics and inferential statistics. Descriptive statistics summarizes the information obtained from a collection of data, so the data is easier to interpret. Inferential statistics makes predictions with the help of data. Which kind of statistics is used depends on the goal of the research (summarize or predict).

To understand the differences better, a number of basic terms are important. The subjects are the entities that are observed in a research study, most often people but sometimes families, schools, cities etc. The population is the whole of subjects that you want to study (for instance foreign students). The sample is a limited number of selected subjects on which you will collect data (for instance 100 foreign students from several universities). The ultimate goal is to learn about the population, but because it's impossible to research the entire population, a sample is made.

Descriptive statistics can be used both when data is available for the entire population and when it is available only for a sample. Inferential statistics is only applicable to samples, because it makes predictions about a population that has not been observed in full. Hence the definition of inferential statistics is making predictions about a population, based on data gathered from a sample.

The goal of statistics is to learn more about the parameter. The parameter is a numerical summary of the population: an unknown value that describes the population as a whole. So it's not about the sample but about the population. This is why an important part of inferential statistics is assessing how representative a sample is.

What are the main measures and graphs of descriptive statistics? - Chapter 3

3.1 Which tables and graphs display data?

Descriptive statistics serves to create an overview or summary of data. There are two kinds of data, quantitative and categorical, and each has its own descriptive statistics.

To create an overview of categorical data, it is easiest to list the categories together with the frequency of each category. To compare the categories, the relative frequencies are listed too. The relative frequency of a category shows how often a subject falls within this category relative to the whole sample. It can be expressed as a proportion or as a percentage. The proportion is the number of observations within a category divided by the total number of observations; the percentage is the proportion multiplied by 100. The sum of all proportions should be 1.00, the sum of all percentages should be 100.

Frequencies can be shown using a frequency distribution, a list of all possible values of a variable and the number of observations for each value. A relative frequency distribution also shows the proportion or percentage of the sample for each value.

Example (relative) frequency distribution:

Gender | Frequency | Proportion | Percentage
Male | 150 | 0.43 | 43%
Female | 200 | 0.57 | 57%
Total | 350 (=n) | 1.00 | 100%
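The proportions and percentages in the example table can be reproduced with a few lines of Python. This is a minimal sketch; the variable names are illustrative, and the frequencies are the ones from the table.

```python
# Frequencies from the example table
frequencies = {"Male": 150, "Female": 200}

# Total number of observations (n)
n = sum(frequencies.values())

# Relative frequency of each category as a proportion and as a percentage
table = {}
for category, frequency in frequencies.items():
    proportion = frequency / n
    table[category] = {
        "frequency": frequency,
        "proportion": proportion,
        "percentage": proportion * 100,
    }

for category, row in table.items():
    print(f"{category}: {row['frequency']} {row['proportion']:.2f} {row['percentage']:.0f}%")
```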

Aside from tables also other visual displays are used, such as bar graphs, pie charts, histograms and stem-and-leaf plots.

A bar graph is used for categorical variables and uses a bar for each category. The bars are separated to indicate that the graph doesn't display quantitative variables but categorical variables.

A pie chart is also used for categorical variables. Each slice represents a category. When the values are close together, bar graphs show the differences more clearly than pie charts.

Frequency distributions and other visual displays are also used for quantitative variables. In that case, the categories are replaced by intervals. Each interval has a frequency, a proportion and a percentage.

A histogram is a graph of the frequency distribution for a quantitative variable. Each value is represented by a bar, except when there are many values, then it's easier to divide them into intervals.
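Dividing a quantitative variable into intervals can be sketched as follows. The ages and the interval width of 10 are assumptions chosen for illustration, and the histogram is drawn as plain text rather than with a plotting library.

```python
from collections import Counter

# Hypothetical ages (illustrative data only)
ages = [18, 19, 21, 22, 22, 25, 27, 31, 33, 38]
width = 10  # assumed interval width

# Assign each age to the lower bound of its interval, then count per interval
bins = Counter((age // width) * width for age in ages)

# Text histogram: one '#' per observation in the interval
for start in sorted(bins):
    print(f"{start}-{start + width - 1}: {'#' * bins[start]}")
```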

A stem-and-leaf plot ...

Call to action: Do you have statistical knowledge and skills and do you enjoy helping others while expanding your international network?

People who share their statistical knowledge and skills can contact WorldSupporter Statistics for more exposure to a larger audience. Relevant contributions to specific WorldSupporter Statistics Topics are highlighted per topic so that users who are interested in certain statistical topics can broaden their theoretical perspective and international network.

Do you have statistical knowledge and skills and do you enjoy helping others while expanding your international network? Would you like to cooperate with WorldSupporter Statistics? Please send us an e-mail with some basics (Where do you live? What's your (statistical) background? How are you helping others at the moment? And how do you see that in relation to WorldSupporter Statistics?) to info@joho.org - and we will most definitely be in touch.

Topics related to Introduction to Statistics

Statistics: Magazines for encountering Statistics

Startmagazine: Introduction to Statistics
Stats for students: Simple steps for passing your statistics courses

How to triumph over the theory of statistics (without understanding everything)?

Stats of students

  • The first years that you follow statistics, it is often a case of taking knowledge for granted and simply trying to pass the courses. Don't worry if you don't understand everything right away: in later years it will fall into place, and you will see the importance of the theory you had to know before.
  • The book you need to study may be difficult to understand at first. Be patient: later in your studies, the effort you put in now will pay off.
  • Be a Gestalt Scientist! In other words, recognize that the whole of statistics is greater than the sum of its parts. It is very easy to get hung up on nit-picking details and fail to see the forest for the trees.
  • Tip: Precise use of language is important in research. Try to reproduce the theory verbatim (i.e. learn it by heart) where possible. That way you don't have to understand it yet, you show that you've been working on it, you avoid using the wrong word, and you practice for reporting research later on.
  • Tip: Keep study material, handouts, sheets, and other publications from your teacher for future reference.

How to score points with formulas of statistics (without learning them all)?

  • The direct relationship between data and results consists of mathematical formulas. These follow their own logic, are written in their own language, and can therefore be complex to comprehend.
  • If you don't understand the math behind statistics, you don't understand statistics. This does not have to be a problem, because statistics is an applied science from which you can also get excellent results without understanding. None of your teachers will understand all the statistical formulas.
  • Please note: you will probably have to know and understand a number of formulas, so that you can demonstrate that you know the principle of how statistics work. Which formulas you need to know differs from subject to subject and lecturer to lecturer, but in general these are relatively simple formulas that occur frequently, and your lecturer will likely tell you (often several times) that you should know this formula.
  • Tip: if you want to recognize statistical symbols, you can use: Recognizing commonly used statistical symbols
  • Tip: have fun with LaTeX! LaTeX code gives us a simple way to write out mathematical formulas and make them look professional. Play with LaTeX. With that, you can include used formulas in your own papers and you learn to understand how a formula is built up – which greatly benefits your understanding and remembering that formula. See also (in Dutch): How to create formulas like a pro on JoHo WorldSupporter?
  • Tip: Are you interested in a career in sciences or programming? Then take your formulas seriously and go through them again after your course.

How to practice your statistics (with minimal effort)?

How to select your data?

  • Your teacher will regularly use a dataset for lessons during the first years of your studying. It is instructive (and can be a lot of fun) to set up your own research for once with real data that is also used by other researchers.
  • Tip: scientific articles often indicate which datasets have been used for the research. There is a good chance that those datasets are valid. Sometimes there are also studies that determine which datasets are more valid for the topic you want to study than others. Make use of datasets other researchers point out.
  • Tip: Do you want an interesting research result? You can use the same method and question, but use an alternative dataset, and/or alternative variables, and/or alternative location, and/or alternative time span. This allows you to validate or falsify the results of earlier research.
  • Tip: for datasets you can look at Discovering datasets for statistical research

How to operationalize clearly and smartly?

  • For the operationalization, it is usually sufficient to indicate the following three things:
    • What is the concept you want to study?
    • Which variable does that concept represent?
    • Which indicators do you select for those variables?
  • It is smart to argue that a variable is valid, or why you choose that indicator.
  • For example, if you want to know whether someone is currently a father or mother (concept), you can search the variables for how many children the respondent has (variable) and then select on the indicator "greater than 0" (or "not 0"). Where possible, use the terms 'concept', 'variable', 'indicator' and 'valid' in your communication. For example, as follows: “The variable [variable name] is a valid measure of the concept [concept name] (if applicable: source). The value [description of the value] is an indicator of [what you want to measure].” (For example: The variable "Number of children" is a valid measure of the concept of parenthood. A value greater than 0 is an indicator of whether someone is currently a father or mother.)
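The parenthood example can be sketched in code. The respondent records, the field name `number_of_children`, and the helper `is_parent` are all hypothetical and serve only to show how a concept is turned into a variable and an indicator.

```python
# Hypothetical survey records (illustrative data only)
respondents = [
    {"id": 1, "number_of_children": 0},
    {"id": 2, "number_of_children": 2},
    {"id": 3, "number_of_children": 1},
]

def is_parent(respondent: dict) -> bool:
    """Indicator for the concept 'parenthood': number of children greater than 0."""
    return respondent["number_of_children"] > 0

# Select the respondents for whom the indicator holds
parents = [r["id"] for r in respondents if is_parent(r)]

print(parents)
```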

How to run analyses and draw your conclusions?

  • The choice of your analyses depends, among other things, on what your research goal is, which methods are often used in the existing literature, and practical issues and limitations.
  • The more you learn, the more independently you can choose research methods that suit your research goal. In the beginning, follow the lecturer – at the end of your studies you will have a toolbox with which you can vary in your research yourself.
  • Try to link up as much as possible with research methods that are used in the existing literature, because otherwise you could be comparing apples with oranges. Deviating can sometimes lead to interesting results, but discuss this with your teacher first.
  • For as long as you need, keep a step-by-step plan at hand on how you can best run your analysis and achieve results. For every analysis you run, there is a step-by-step explanation of how to perform it; if you do not find it in your study literature, it can often be found quickly on the internet.
  • Tip: Practice a lot with statistics, so that you can show results quickly. You cannot learn statistics by just reading about it.
  • Tip: The measurement level of the variables you use (ratio, interval, ordinal, nominal) largely determines the research method you can use. Show your audience that you recognize this.
  • Tip: conclusions from statistical analyses are never certain, only more or less likely. There is usually a standard formulation for each research method with which you can express the conclusions from that analysis and at the same time indicate that they are not certain. Use that standard wording when communicating about results from your analysis.
  • Tip: see explanation for various analyses: Introduction to statistics
Statistics: Magazines for understanding statistics

Startmagazine: Introduction to Statistics
Understanding data: distributions, connections and gatherings
Understanding reliability and validity
Statistics Magazine: Understanding statistical samples
Understanding distributions in statistics
Understanding variability, variance and standard deviation
Understanding inferential statistics
Understanding type-I and type-II errors
Understanding effect size, proportion of explained variance and power of tests to your significant results
Statistics: summaries and study help - Theme
Statistics: Magazines for applying statistics

Applying z-tests and t-tests
Applying correlation, regression and linear regression
Applying spearman's correlation
Applying multiple regression
Statistics: summaries and study help - Theme

Updates & About WorldSupporter Statistics

What can you do on a WorldSupporter Statistics Topic?

  • Understand statistics with knowledge and explanation about a topic of statistics
  • Practice with questions and answers to test your statistical knowledge and skills
  • Watch statistics practiced in real life with selected videos for extra clarification
  • Study relevant terminology with glossaries of statistical topics
  • Share your knowledge and experience and see other WorldSupporters' contributions about a topic of statistics