Z-Test vs. T-Test, P-Value

been_29 · July 29, 2024

💡 Z-Test and T-Test

  • They are statistical methods used to compare sample data to a population mean or to compare two samples.
  • The choice between a z-test and a t-test depends on the sample size and whether the population standard deviation is known.
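
One way to see why sample size matters: the two-tailed critical value of the t-distribution shrinks toward the z critical value as the degrees of freedom grow, so the two tests give nearly the same answer for large samples. A minimal sketch (the degrees of freedom below are arbitrary illustration values, not taken from this post):

```python
from scipy.stats import norm, t

alpha = 0.05  # significance level for a two-tailed test

# Critical value from the standard normal distribution
z_critical = norm.ppf(1 - alpha / 2)

# Critical values from the t-distribution for increasing degrees of freedom
# (these df values are arbitrary illustration choices)
dfs = [5, 11, 29, 100, 1000]
t_criticals = [t.ppf(1 - alpha / 2, df) for df in dfs]

print(f"z critical value: {z_critical:.4f}")
for df, t_crit in zip(dfs, t_criticals):
    print(f"df = {df:>4}: t critical value = {t_crit:.4f}")
```

The t critical value is always larger than the z critical value, which makes the t-test more conservative for small samples.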

Z-Test

Used when the population variance (or standard deviation) is known and the sample size is large (typically $n > 30$).

Steps for Z-Test

  1. Formulate Hypotheses
    • Null Hypothesis ($H_0$): The sample mean is equal to the population mean.
    • Alternative Hypothesis ($H_1$): The sample mean is not equal to the population mean.
  2. Calculate the Z-Statistic
    • $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $\sigma$ is the population standard deviation, and $n$ is the sample size.
  3. Determine the Critical Value
    • Use the standard normal distribution (Z-distribution) to find the critical value corresponding to the significance level ($\alpha$).
  4. Make a Decision
    • If the z-statistic falls into the rejection region ($|z| > z_{\alpha/2}$ for a two-tailed test), reject the null hypothesis; otherwise, fail to reject it.

Example Code

import numpy as np
from scipy.stats import norm

# Example data
sample_data = [78, 82, 79, 83, 76, 77, 85, 88, 75, 74, 80, 79]
population_mean = 75
population_std = 5

# Calculate sample mean
sample_mean = np.mean(sample_data)
n = len(sample_data)

# Calculate the z-statistic
z_statistic = (sample_mean - population_mean) / (population_std / np.sqrt(n))

# Significance level
alpha = 0.05

# Determine the critical value for a two-tailed test
z_critical = norm.ppf(1 - alpha/2)

# Report the statistic and critical value, then make a decision
print(f"z-statistic: {z_statistic}, Critical value: {z_critical}")
if abs(z_statistic) > z_critical:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
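
Plugging the example data into the z-statistic formula by hand gives the same decision as the code. The twelve sample values sum to 956, so:

```latex
\bar{x} = \frac{956}{12} \approx 79.67, \qquad
\frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{12}} \approx 1.443, \qquad
z = \frac{79.67 - 75}{1.443} \approx 3.23
```

Since $|z| \approx 3.23$ exceeds the two-tailed critical value $z_{0.025} \approx 1.96$, the null hypothesis is rejected at $\alpha = 0.05$.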

T-Test

Used when the population variance is unknown and the sample size is small ($n < 30$). It is also used when comparing the means of two samples.

Steps for T-Test

  1. Formulate Hypotheses
    • Null Hypothesis ($H_0$): The sample mean is equal to the population mean (one-sample t-test) or the means of the two samples are equal (two-sample t-test).
    • Alternative Hypothesis ($H_1$): The sample mean is not equal to the population mean or the means of the two samples are not equal.
  2. Calculate the T-Statistic
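    • One-sample t-test: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$, where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $s$ is the sample standard deviation, and $n$ is the sample size.
  3. Determine the Critical Value
    • Use the t-distribution with $n-1$ degrees of freedom to find the critical value corresponding to the significance level ($\alpha$).
  4. Make a Decision
    • If the t-statistic falls into the rejection region ($|t| > t_{\alpha/2,\,n-1}$ for a two-tailed test), reject the null hypothesis; otherwise, fail to reject it.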

Example Code

import numpy as np
from scipy.stats import t

# Example data
sample_data = [78, 82, 79, 83, 76, 77, 85, 88, 75, 74, 80, 79]
population_mean = 75

# Calculate sample mean and sample standard deviation
sample_mean = np.mean(sample_data)
sample_std = np.std(sample_data, ddof=1)
n = len(sample_data)

# Calculate the t-statistic
t_statistic = (sample_mean - population_mean) / (sample_std / np.sqrt(n))

# Significance level
alpha = 0.05
df = n - 1

# Determine the critical value for a two-tailed test
t_critical = t.ppf(1 - alpha/2, df)

# Report the statistic and critical value, then make a decision
print(f"t-statistic: {t_statistic}, Critical value: {t_critical}")
if abs(t_statistic) > t_critical:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
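
The manual calculation can be cross-checked against SciPy's built-in one-sample t-test, `scipy.stats.ttest_1samp`, which is two-sided by default. A short sketch using the same example data (the name `t_manual` is just an illustrative variable):

```python
import numpy as np
from scipy import stats

sample_data = [78, 82, 79, 83, 76, 77, 85, 88, 75, 74, 80, 79]
population_mean = 75

# Manual t-statistic, following the one-sample formula
n = len(sample_data)
sample_std = np.std(sample_data, ddof=1)  # ddof=1 gives the sample standard deviation
t_manual = (np.mean(sample_data) - population_mean) / (sample_std / np.sqrt(n))

# SciPy's one-sample t-test returns both the statistic and the p-value
result = stats.ttest_1samp(sample_data, population_mean)

print(f"manual t-statistic: {t_manual}")
print(f"scipy t-statistic:  {result.statistic}")
print(f"scipy p-value:      {result.pvalue}")
```

The two statistics should agree up to floating-point error, which is a quick sanity check that `ddof=1` and the degrees of freedom were set correctly.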





💡 P-Value


What is P-Value

  • Definition
    • A measure used in statistical hypothesis testing to determine the significance of the observed data.
    • Represents the probability of obtaining test results at least as extreme as the observed results, assuming that the null hypothesis is true.
  • Hypothesis Testing
    • Null Hypothesis ($H_0$): The hypothesis that there is no effect or no difference. It is the default or starting assumption.
    • Alternative Hypothesis ($H_1$): The hypothesis that there is an effect or a difference. It is what you aim to support.
  • Interpretation of P-Value
    • Low p-value ($\le \alpha$): Indicates that the observed data are unlikely under the null hypothesis. This leads to rejecting the null hypothesis.
    • High p-value ($> \alpha$): Indicates that the observed data are consistent with the null hypothesis. This leads to failing to reject the null hypothesis.
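
The definition can be made concrete with a small simulation: draw many samples from a population where the null hypothesis is true and count how often the z-statistic is at least as extreme as the observed one. This is only an illustrative sketch; the observed statistic `z_observed = 2.0`, the seed, and the number of simulations are assumed values, not results from this post.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

mu0, sigma, n = 75.0, 5.0, 12   # population under H0 and sample size
z_observed = 2.0                # hypothetical observed z-statistic
n_sim = 200_000                 # number of simulated samples

# Simulate samples under the null hypothesis and compute each z-statistic
samples = rng.normal(mu0, sigma, size=(n_sim, n))
z_sim = (samples.mean(axis=1) - mu0) / (sigma / np.sqrt(n))

# Fraction of simulated statistics at least as extreme as the observed one
p_simulated = np.mean(np.abs(z_sim) >= abs(z_observed))

# Exact two-tailed p-value from the standard normal distribution
p_exact = 2 * norm.sf(abs(z_observed))

print(f"simulated p-value: {p_simulated:.4f}")
print(f"exact p-value:     {p_exact:.4f}")
```

The simulated fraction should land close to the exact value, which is what "probability of results at least as extreme, assuming the null hypothesis is true" means in practice.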

Steps in Hypothesis Testing Using P-Value

  1. State the Hypotheses
    • $H_0$: Null hypothesis.
    • $H_1$: Alternative hypothesis.
  2. Choose a Significance Level ($\alpha$)
    • Common choices are 0.05, 0.01, etc.
  3. Calculate the Test Statistic
    • Depending on the test (t-test, z-test, etc.), calculate the corresponding test statistic.
  4. Determine the P-Value
    • Find the p-value associated with the test statistic.
  5. Make a Decision
    • Compare the p-value to $\alpha$.
    • If p-value $\le \alpha$, reject $H_0$.
    • If p-value $> \alpha$, fail to reject $H_0$.
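
Steps 4 and 5 can be sketched as follows. The helper `two_tailed_p_value` and the inputs `t_statistic = 2.5`, `df = 11` are hypothetical, chosen only for illustration; the sketch also shows that the p-value rule and the critical-value rule lead to the same decision.

```python
from scipy.stats import norm, t

def two_tailed_p_value(statistic, df=None):
    """Two-tailed p-value: t-distribution if df is given, else standard normal."""
    if df is None:
        return 2 * norm.sf(abs(statistic))
    return 2 * t.sf(abs(statistic), df)

alpha = 0.05
t_statistic, df = 2.5, 11  # hypothetical test statistic and degrees of freedom

# Step 4: p-value associated with the test statistic
p_value = two_tailed_p_value(t_statistic, df)

# Step 5: decide by comparing the p-value to alpha
decision_by_p = p_value <= alpha

# Equivalent decision using the critical value instead
t_critical = t.ppf(1 - alpha / 2, df)
decision_by_critical = abs(t_statistic) >= t_critical

print(f"p-value: {p_value:.4f}, reject H0: {decision_by_p}")
print(f"critical value: {t_critical:.4f}, reject H0: {decision_by_critical}")
```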

Example Code

import numpy as np
from scipy import stats

# Example data for two groups
group1 = [78, 82, 79, 83, 76, 77, 85, 88, 75, 74, 80, 79]
group2 = [68, 72, 69, 73, 66, 67, 75, 78, 65, 64, 70, 69]

# Perform t-test
t_statistic, p_value = stats.ttest_ind(group1, group2)

print(f"t-statistic: {t_statistic}")
print(f"p-value: {p_value}")

# Decision based on p-value
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
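
For reference, `stats.ttest_ind` assumes equal variances by default and uses a pooled variance estimate. The sketch below reproduces that calculation by hand on the same two groups; the variable names (`pooled_var`, `t_manual`, `p_manual`) are illustrative only.

```python
import numpy as np
from scipy import stats

group1 = [78, 82, 79, 83, 76, 77, 85, 88, 75, 74, 80, 79]
group2 = [68, 72, 69, 73, 66, 67, 75, 78, 65, 64, 70, 69]

n1, n2 = len(group1), len(group2)
var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)  # sample variances

# Pooled variance and the two-sample t-statistic
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_manual = (np.mean(group1) - np.mean(group2)) / np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-tailed p-value with n1 + n2 - 2 degrees of freedom
df = n1 + n2 - 2
p_manual = 2 * stats.t.sf(abs(t_manual), df)

# Compare with SciPy's result
t_scipy, p_scipy = stats.ttest_ind(group1, group2)
print(f"manual: t = {t_manual}, p = {p_manual}")
print(f"scipy:  t = {t_scipy}, p = {p_scipy}")
```

If the equal-variance assumption is doubtful, passing `equal_var=False` to `stats.ttest_ind` switches to Welch's t-test, which does not pool the variances.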