Analyzing Gender Disparity in Academic Achievement among Primary School Pupils

Apr 5, 2024, 1:33 PM

To find out gender differences in academic achievement among primary school pupils in Social Studies in Kosofe Local Government of Lagos State

import pandas as pd
from IPython.display import display  # explicit import so display() also works outside a notebook

# Load the Excel file
xls = pd.ExcelFile('Questionnaire Data.xlsx')

# Load all sheets into a dictionary of dataframes
dfs = {sheet_name: xls.parse(sheet_name) for sheet_name in xls.sheet_names}
# Display the first few rows of the dataframe to inspect it
display(dfs['Sheet1'].head())
# Clean the dataframe to make it more usable

df = dfs['Sheet1'].copy()

# Drop the first row, which repeats the header,
# and rename the columns for clarity

df = df.drop(index=0)
df.columns = ['S/N', 'Name of School', 'Gender', 'Age', 'Unused', 'PSSAT Score']

# Drop the 'Unused' column as it does not contain any relevant information
df = df.drop(columns=['Unused'])

# Convert the Age and PSSAT Score columns to numeric types
df['Age'] = pd.to_numeric(df['Age'])
df['PSSAT Score'] = pd.to_numeric(df['PSSAT Score'])

# Display the cleaned dataframe to verify changes
display(df.head())
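
The conversion step above raises an error if any cell holds non-numeric text. As a hedged alternative, the sketch below uses the `errors="coerce"` option of `pd.to_numeric` on a small made-up table (the values are hypothetical stand-ins, not the questionnaire data) to turn unparseable entries into NaN so they can be counted and dropped.

```python
import pandas as pd

# Hypothetical toy data standing in for the questionnaire sheet; not the real dataset.
toy = pd.DataFrame({
    "Age": ["9", "10", "n/a", "11"],
    "PSSAT Score": ["12", "15", "13", "absent"],
})

# errors="coerce" converts unparseable entries to NaN instead of raising.
for col in ["Age", "PSSAT Score"]:
    toy[col] = pd.to_numeric(toy[col], errors="coerce")

# Count the entries that failed to parse, then drop the incomplete rows.
n_missing = toy[["Age", "PSSAT Score"]].isna().sum()
clean = toy.dropna(subset=["Age", "PSSAT Score"])

print(n_missing)
print(clean)
```

Whether dropping incomplete rows is appropriate depends on how many there are; reporting the count first keeps that decision visible.
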
import plotly.express as px

# Group the data by Gender and calculate the mean PSSAT Score for each gender
gender_means = df.groupby('Gender')['PSSAT Score'].mean().reset_index()

# Create a bar chart to visualize the differences in academic achievement by gender
fig = px.bar(gender_means, x='Gender', y='PSSAT Score', color='Gender',
             title='Mean PSSAT Scores by Gender',
             labels={'PSSAT Score': 'Mean PSSAT Score'}, text='PSSAT Score')
fig.update_layout(plot_bgcolor='#111', paper_bgcolor='#111', font_color='white')
fig.show()
  • The dataset titled Effect of Gender on Academic Achievement of Pupils in Social Studies in Kosofe Local Government of Lagos State was successfully loaded and cleaned for analysis.
  • The data includes information on the pupils' school names, gender, age, and PSSAT scores.
  • A bar chart visualizing the mean PSSAT scores by gender was created to illustrate gender differences in academic achievement among primary school pupils.
  • The visualization shows only a small difference in mean PSSAT scores between genders. A bar chart of means cannot by itself establish a gender disparity in academic achievement in social studies within the specified area, because it hides the spread of scores within each group.

Explore correlations between age and PSSAT scores

# Calculate the correlation coefficient between Age and PSSAT Score
correlation = df[['Age', 'PSSAT Score']].corr()
display(correlation)
  • The correlation coefficient between Age and PSSAT scores is approximately 0.047, indicating a very weak positive correlation.
  • This suggests that there is hardly any linear relationship between a pupil's age and their performance on the PSSAT in this dataset.
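
To put the coefficient in perspective, squaring it gives the share of variance in one variable that a linear relationship with the other accounts for. The short sketch below applies this to the reported value of 0.047; the variable names are illustrative only.

```python
# Correlation coefficient reported in the analysis above.
r = 0.047

# Coefficient of determination: share of variance in one variable
# accounted for by a linear relationship with the other.
r_squared = r ** 2

print(f"r = {r:.3f}, r^2 = {r_squared:.4f} ({r_squared:.2%} of variance)")
```

Under this reading, age accounts for roughly 0.2% of the variance in PSSAT scores.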

Investigate gender differences in PSSAT scores further

# Explore gender differences in PSSAT scores further using descriptive statistics

gender_stats = df.groupby('Gender')['PSSAT Score'].describe()
display(gender_stats)
  • Descriptive statistics for PSSAT scores by gender reveal:
    • Females (F): Mean score = 12.98, Standard Deviation = 3.35, Min = 3, Max = 19
    • Males (M): Mean score = 12.85, Standard Deviation = 3.35, Min = 2, Max = 19
  • The mean scores are very close: females outperform males by only 0.13 points on average, a gap far smaller than the standard deviation of 3.35.
  • Both genders show a similar range and variation in scores, indicating that while there may be slight differences in average scores, the overall performance distribution is quite similar across genders.
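
One way to express how small this gap is relative to the spread of scores is a standardized effect size such as Cohen's d. This was not part of the original analysis; the sketch below computes it from the summary statistics reported above. The group sizes are not given here, so the function pools the two standard deviations with equal weights. Because both reported standard deviations are 3.35, the pooled value is 3.35 whatever the group sizes.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference, pooling the two SDs with equal weights."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Summary statistics reported above: females first, males second.
d = cohens_d(12.98, 3.35, 12.85, 3.35)
print(f"Cohen's d = {d:.3f}")
```

By Cohen's conventional benchmarks, 0.2 already counts as a small effect, so a value near 0.04 indicates a negligible difference.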

Compare gender performance using visualizations

import plotly.graph_objects as go

# Create box plots to compare the distribution of PSSAT scores between genders
fig = go.Figure()

fig.add_trace(go.Box(y=df[df['Gender'] == 'F']['PSSAT Score'], name='Female', marker_color='pink'))
fig.add_trace(go.Box(y=df[df['Gender'] == 'M']['PSSAT Score'], name='Male', marker_color='blue'))

fig.update_layout(title_text='Distribution of PSSAT Scores by Gender',
                  plot_bgcolor='#111', paper_bgcolor='#111', font_color='white')
fig.show()
  • Loaded and cleaned the dataset, focusing on gender, age, and PSSAT scores.
  • Found a very weak positive correlation between age and PSSAT scores, suggesting a minimal linear relationship.
  • Descriptive statistics showed slight differences in mean PSSAT scores between genders, with females slightly outperforming males.
  • Visualized gender differences in PSSAT scores through a bar chart and box plots, highlighting similar performance distributions with minor differences in averages.
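
The summary above describes the difference in means but does not test whether it could be due to chance. A permutation test is one way to check this; it was not used in the original analysis. The sketch below shows the idea in plain Python. The function name, its parameters, and the two score lists are all made up for illustration, and the lists are not the survey data.

```python
import random
from statistics import mean

def permutation_test(scores_a, scores_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the mean difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Made-up scores for illustration only; these are not the survey data.
female_scores = [13, 12, 15, 11, 14, 13, 12, 16]
male_scores = [12, 13, 14, 11, 15, 12, 13, 14]

p_value = permutation_test(female_scores, male_scores)
print(f"p = {p_value:.3f}")
```

With the real data, the two lists would be replaced by the PSSAT scores of each gender taken from `df`. A large p-value would be consistent with the observed gap arising by chance.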

Investigate the relationship between age and PSSAT scores further

# Create a scatter plot to visualize the relationship between Age and PSSAT scores
# Note: trendline='ols' requires the statsmodels package to be installed
fig = px.scatter(df, x='Age', y='PSSAT Score', color='Gender',
                 title='Relationship Between Age and PSSAT Scores by Gender',
                 labels={'Age': 'Age', 'PSSAT Score': 'PSSAT Score'},
                 trendline='ols')  # Adds one Ordinary Least Squares trendline per gender
fig.update_layout(plot_bgcolor='#111', paper_bgcolor='#111', font_color='white')
fig.show()