Descriptive Analysis of Categorical Variables from Survey Data

Oct 12, 2024, 7:55 PM

1base tercer producto_ 04-10-2024 reco.sav

Make a table of the descriptive attributive (categorical) variables with the number of subjects and the corresponding percentages

The descriptive table for the nominal (categorical) variables is as follows:

Sexo

  • Number of Subjects:
    • 2.0: 121 subjects (80.67%)
    • 1.0: 29 subjects (19.33%)

Estado Civil

  • Number of Subjects:
    • 2.0: 80 subjects (53.33%)
    • 1.0: 47 subjects (31.33%)
    • 3.0: 13 subjects (8.67%)
    • 4.0: 7 subjects (4.67%)
    • 5.0: 2 subjects (1.33%)
    • 99.0: 1 subject (0.67%)

Turno en el que labora

  • Number of Subjects:
    • 1.0: 150 subjects (100%)

Cuenta con contrato definitivo (plaza)

  • Number of Subjects:
    • 1.0: 122 subjects (81.33%)
    • 2.0: 28 subjects (18.67%)

Nivel de formación

  • Number of Subjects:
    • 2.0: 44 subjects (29.33%)
    • 3.0: 42 subjects (28.00%)
    • 4.0: 35 subjects (23.33%)
    • 1.0: 29 subjects (19.33%)

Experiencia Recodificada

  • Number of Subjects:
    • 2.0: 90 subjects (60.40%)
    • 3.0: 38 subjects (25.50%)
    • 4.0: 21 subjects (14.09%)

Años Grado Recodificado

  • Number of Subjects:
    • 1.0: 90 subjects (60.40%)
    • 2.0: 38 subjects (25.50%)
    • 3.0: 21 subjects (14.09%)

Labor Recodificado

  • Number of Subjects:
    • 1.0: 121 subjects (81.76%)
    • 2.0: 27 subjects (18.24%)

NEM Recodificado

  • Number of Subjects:
    • 2.0: 46 subjects (31.51%)
    • 4.0: 37 subjects (25.34%)
    • 1.0: 36 subjects (24.66%)
    • 3.0: 27 subjects (18.49%)
These tables provide a summary of the number of subjects and their corresponding percentages for each categorical variable. If you have any further questions or need additional analysis, feel free to ask!
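
As a minimal sketch of how a table like the ones above can be computed, the snippet below counts each category and divides by the number of valid (non-missing) responses. The helper name `frequency_table` is hypothetical, and the `sexo` list is rebuilt from the counts reported above rather than read from the .sav file. Dropping missing values before computing percentages is an assumption; it would be consistent with some variables above totalling fewer than 150 subjects.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (count, percent)} ordered by descending count.

    None is treated as a missing value and excluded from the denominator.
    """
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    total = len(valid)
    return {cat: (n, round(100 * n / total, 2)) for cat, n in counts.most_common()}

# Sexo column reconstructed from the counts reported above (not the raw data)
sexo = [2.0] * 121 + [1.0] * 29
table = frequency_table(sexo)

for cat, (n, pct) in table.items():
    print(f"{cat}: {n} subjects ({pct:.2f}%)")
```

With a pandas DataFrame, the same counts and proportions are usually obtained with `df[col].value_counts()` and `df[col].value_counts(normalize=True)`.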

Can you perform a statistical analysis on this data?

To perform a statistical analysis on this dataset, we can consider several approaches depending on the specific questions or hypotheses you have in mind. Here are some common statistical analyses we can perform:
  1. Descriptive Statistics: Calculate means, medians, standard deviations, etc., for numerical variables.
  2. Correlation Analysis: Assess the relationships between numerical variables.
  3. Chi-Square Test: Test for independence between categorical variables.
  4. T-tests or ANOVA: Compare means between groups.
  5. Regression Analysis: Explore relationships between dependent and independent variables.
Could you specify which type of analysis you are interested in or if you have any specific hypotheses to test?

Descriptive statistics: calculate means, medians, standard deviations, etc., for numerical variables.

Correlation analysis: assess the relationships between numerical variables.
Chi-square test: test for independence between categorical variables.
T-tests or ANOVA: compare means between groups.

Correlation analysis

# Pairwise Pearson correlations between the numerical columns of df
correlation_matrix = df[numerical_columns].corr()
correlation_matrix

Descriptive Statistics

The following descriptive statistics were calculated for each numerical variable in the dataset:
  • Mean: The average value of each variable.
  • Median: The middle value when the data is sorted.
  • Standard Deviation (std): A measure of the amount of variation or dispersion of a set of values.
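
To make these three measures concrete, here is a small sketch using Python's standard `statistics` module. The `ages` list is made-up illustration data, not values from the survey file, and `describe` is a hypothetical helper name. The sample standard deviation (n - 1 in the denominator) is used, which matches the default of pandas' `std()`.

```python
import statistics

def describe(values):
    """Return the mean, median, and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
    }

# Hypothetical values for illustration only
ages = [25, 30, 35, 40, 45]
summary = describe(ages)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```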

Correlation Analysis

The correlation matrix shows the pairwise correlation coefficients between the numerical variables. A coefficient close to 1 indicates a strong positive linear relationship, a coefficient close to -1 indicates a strong negative linear relationship, and a coefficient near 0 indicates little or no linear relationship.
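
As a rough illustration of what each entry in the matrix represents, the sketch below computes the Pearson coefficient by hand for two made-up pairs of lists. The function name `pearson` and the data are assumptions for illustration; this is the same quantity that `DataFrame.corr()` computes by default.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations (covariance numerator)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: one perfectly increasing and one perfectly decreasing relationship
x = [1, 2, 3, 4, 5]
r_pos = pearson(x, [2, 4, 6, 8, 10])
r_neg = pearson(x, [10, 8, 6, 4, 2])

print(f"positive: {r_pos:.2f}, negative: {r_neg:.2f}")
```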

Chi-Square Test

The Chi-Square test results for independence between selected categorical variables are:
  • Sexo and Estado Civil: Chi2 = 1.51, p-value = 0.91
  • Sexo and Turno en el que labora: Chi2 = 0.0, p-value = 1.0
  • Sexo and Cuenta con contrato definitivo: Chi2 = 1.03, p-value = 0.31
  • Estado Civil and Turno en el que labora: Chi2 = 0.0, p-value = 1.0
  • Estado Civil and Cuenta con contrato definitivo: Chi2 = 12.68, p-value = 0.03
  • Turno en el que labora and Cuenta con contrato definitivo: Chi2 = 0.0, p-value = 1.0
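A p-value less than 0.05 indicates a statistically significant association between the variables at the 5% level. Here, only Estado Civil and Cuenta con contrato definitivo (p-value = 0.03) meets that threshold. The tests involving Turno en el que labora are not informative: all 150 subjects fall in the same category of that variable, so the statistic is necessarily 0.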
If you have any specific questions or need further analysis, feel free to ask!
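
For reference, here is a minimal sketch of how the chi-square statistic is computed from a contingency table, written with the standard library only. The function names and the 2x2 table are hypothetical; the one-column table uses the Sexo counts reported earlier, all in a single shift. The p-value helper covers only the case of 1 degree of freedom, and no Yates continuity correction is applied, whereas `scipy.stats.chi2_contingency` applies one by default for 2x2 tables.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_1dof(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical 2x2 table for illustration
stat, dof = chi_square([[20, 30], [30, 20]])
p = p_value_1dof(stat)
print(f"Chi2 = {stat:.2f}, dof = {dof}, p-value = {p:.4f}")

# Sexo x Turno: every subject is in the same shift, so the table has one column
degenerate = chi_square([[29], [121]])
print(degenerate)

# Consistency check on the reported Sexo x contrato result (Chi2 = 1.03, 1 dof)
print(round(p_value_1dof(1.03), 2))
```

The one-column case returns a statistic of 0 with 0 degrees of freedom, which is why the Turno comparisons above report Chi2 = 0.0 and p-value = 1.0. In a pandas and SciPy workflow, the usual equivalent is `pd.crosstab` followed by `scipy.stats.chi2_contingency`.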

Can you visualize the distribution of these variables?

What are the most significant correlations found?

The correlation matrix provides us with the correlation coefficients between numerical variables. Let's identify the most significant correlations:
  1. High Positive Correlations: These are correlations with coefficients close to 1.
  2. High Negative Correlations: These are correlations with coefficients close to -1.
Let's extract and display the most significant correlations from the correlation matrix.