Dec 9, 2024, 6:02 PM
IHEC-2020.sav
Variables: roa (economic profitability, "rentabilité économique"), assets (total assets, "total des actifs"), capital (total equity, "total des capitaux propres"), and beta (beta).
[...] ROA are not normally distributed.
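As a minimal sketch of how a normality check of this kind might be run, the example below applies the Shapiro-Wilk test to a right-skewed variable before and after log and square-root transformations. The data are synthetic stand-ins rather than the IHEC-2020 values, and the helper name `shapiro_p` is hypothetical.

```python
import numpy as np
from scipy import stats

# Synthetic right-skewed data standing in for a variable such as assets.
# ASSUMPTION: these are NOT the IHEC-2020 values.
rng = np.random.default_rng(0)
assets = rng.lognormal(mean=10.0, sigma=1.0, size=200)


def shapiro_p(x):
    """Return the Shapiro-Wilk p-value (hypothetical helper)."""
    _, p_value = stats.shapiro(x)
    return p_value


p_raw = shapiro_p(assets)
p_log = shapiro_p(np.log(assets))    # log transform, in the spirit of log_assets
p_sqrt = shapiro_p(np.sqrt(assets))  # square-root transform, in the spirit of sqrt_beta

# A p-value below 0.05 is commonly read as evidence against normality.
for name, p in [("raw", p_raw), ("log", p_log), ("sqrt", p_sqrt)]:
    verdict = "not normal" if p < 0.05 else "no evidence against normality"
    print(f"{name:>4}: p = {p:.3g} ({verdict})")
```

Which transformation helps, if any, depends on the shape of the actual variable, so the comparison is worth repeating for each column of interest.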
[...] roa is explained by the model.
log_assets and roa: negative relationship, suggesting that as the log of assets increases, the return on assets decreases.
log_capital and roa: positive relationship, indicating that as the log of capital increases, the return on assets increases.
sqrt_beta and roa: the relationship is not statistically significant at the 0.05 level, but it is close (p-value = 0.062), suggesting a potential positive relationship.
Outcome variables: roa, roe, mva, etc.
Explanatory variables: assets, capital, beta, lnassets, etc.
[...] (roa, roe, mva) with potential explanatory variables (assets, capital, beta, lnassets, srassets).
X_cleaned contains NaN values, which the LinearRegression model from scikit-learn does not handle natively. To resolve this issue, we need to handle the missing values in the dataset.
One option is HistGradientBoostingRegressor, which can handle NaN values natively.
For roa: the R-squared of the model with (beta, lnassets, srassets) is approximately 0.037. This indicates that the model explains only about 3.7% of the variance in the dependent variable roa.
assets and lnassets: high correlation of 0.937, indicating a strong linear relationship.
assets and srassets: high correlation of 0.982, suggesting a strong linear relationship.
capital and assets: correlation of 0.807, indicating a strong relationship.
Next, take lnassets as the dependent variable and assets, srassets, and capital as explanatory variables, and fit a linear regression model.
The model with lnassets as the dependent variable and assets, srassets, and capital as explanatory variables achieves an R-squared value of approximately 0.997. This indicates that the model explains about 99.7% of the variance in lnassets, suggesting a very strong linear relationship.
The explanatory variables (assets, srassets, capital) are very effective in predicting lnassets.
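The correlation and regression steps above can be sketched as follows. The data are synthetic, and the sketch assumes that lnassets and srassets are the natural log and the square root of assets, which the variable names suggest but the source does not state.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins. ASSUMPTION: these are NOT the IHEC-2020 values.
rng = np.random.default_rng(0)
n = 300
assets = rng.lognormal(mean=10.0, sigma=0.5, size=n)
capital = 0.3 * assets * rng.lognormal(mean=0.0, sigma=0.3, size=n)

# ASSUMPTION: lnassets and srassets are transformations of assets.
lnassets = np.log(assets)
srassets = np.sqrt(assets)

# Pairwise Pearson correlations (rows of the input are variables).
corr = np.corrcoef([assets, lnassets, srassets, capital])
r_assets_ln = corr[0, 1]
r_assets_sr = corr[0, 2]
r_assets_cap = corr[0, 3]

# Linear regression of lnassets on assets, srassets, and capital.
X = np.column_stack([assets, srassets, capital])
model = LinearRegression().fit(X, lnassets)
r2 = model.score(X, lnassets)

print(f"corr(assets, lnassets) = {r_assets_ln:.3f}")
print(f"corr(assets, srassets) = {r_assets_sr:.3f}")
print(f"corr(assets, capital)  = {r_assets_cap:.3f}")
print(f"R^2 for lnassets model = {r2:.3f}")
```

Under that assumption, a very high R-squared is what one would expect, since the dependent variable is a smooth function of two of the predictors.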
Check the normality of the dependent variable (mva) and correct any abnormality.
Fit a regression with mva as the dependent variable and is, assets, and roa as explanatory variables.
First, test the normality of mva and correct any abnormality if necessary.
The normality test on the mva variable shows that the mva variable is not normally distributed.
One remedy is to apply a transformation to the mva variable, such as a log transformation. Let's proceed with this transformation and then perform the regression analysis.
After the log transformation of the mva variable, the Shapiro-Wilk test for normality still rejects normality of the mva variable, as indicated by the very low p-value. This suggests that the log transformation was not effective in normalizing the data.
We proceed without transforming the mva variable, as linear regression is robust to deviations from normality, especially with larger sample sizes. Let's fit the regression model using mva as the dependent variable and is, assets, and roa as explanatory variables.
The regression with mva as the dependent variable and is, assets, and roa as explanatory variables yields a relatively low R-squared value, suggesting that the model does not capture a significant portion of the variability in mva.
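The workflow described above (normality check on mva, an attempted log transformation, then a linear regression on is, assets, and roa) could look roughly like the sketch below. The data are synthetic, the shift applied before taking the log is an assumption made here to keep values positive, and scikit-learn's LinearRegression is used only because it is named earlier in this analysis.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins. ASSUMPTION: these are NOT the IHEC-2020 values.
rng = np.random.default_rng(0)
n = 300
is_ = rng.normal(size=n)             # `is` is a Python keyword, hence the underscore
assets = rng.lognormal(size=n)
roa = rng.normal(scale=0.1, size=n)
# Heavy-tailed noise dominates, so the fit is weak by construction.
mva = 1.0 + 0.2 * is_ + 0.1 * assets + 2.0 * roa + rng.standard_t(df=2, size=n)

# Step 1: Shapiro-Wilk test on the raw dependent variable.
_, p_raw = stats.shapiro(mva)

# Step 2: log transformation. ASSUMPTION: shift so that the minimum becomes 1.
mva_log = np.log(mva - mva.min() + 1.0)
_, p_log = stats.shapiro(mva_log)

# Step 3: regression on the untransformed mva.
X = np.column_stack([is_, assets, roa])
model = LinearRegression().fit(X, mva)
r2 = model.score(X, mva)
coefs = dict(zip(["is", "assets", "roa"], model.coef_))

print(f"Shapiro-Wilk p (raw mva): {p_raw:.3g}")
print(f"Shapiro-Wilk p (log mva): {p_log:.3g}")
print(f"R^2: {r2:.3f}")
print("coefficients:", {k: round(float(v), 3) for k, v in coefs.items()})
```

A further check worth considering is a normality test on the regression residuals, since inference in linear regression depends on the residuals rather than on the raw dependent variable.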
The coefficient of roa is the largest, indicating that it has the most substantial impact on mva among the explanatory variables. However, given the low R-squared, caution should be taken in interpreting these coefficients as indicative of strong relationships.
Further exploration of additional variables or more complex modeling techniques may be necessary to improve model performance.
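One reason for caution when ranking predictors by coefficient size is that raw coefficients depend on each variable's units. The sketch below, on synthetic data with hypothetical scales for assets and roa, compares raw coefficients with coefficients fitted on standardized predictors. It is meant as an illustration of the scale effect, not as a re-analysis of the model above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins with very different scales.
# ASSUMPTION: these are NOT the IHEC-2020 values.
rng = np.random.default_rng(0)
n = 500
assets = rng.normal(loc=1000.0, scale=300.0, size=n)  # large-scale predictor
roa = rng.normal(loc=0.05, scale=0.02, size=n)        # small-scale predictor
y = 0.01 * assets + 50.0 * roa + rng.normal(scale=1.0, size=n)

X = np.column_stack([assets, roa])

# Raw coefficients: change in y per one unit of each predictor.
raw = LinearRegression().fit(X, y).coef_

# Standardized coefficients: change in y per one standard deviation.
X_std = StandardScaler().fit_transform(X)
std = LinearRegression().fit(X_std, y).coef_

print(f"raw coefficients          assets={raw[0]:.4f}  roa={raw[1]:.2f}")
print(f"standardized coefficients assets={std[0]:.2f}  roa={std[1]:.2f}")
```

In this synthetic setup the raw coefficient of roa is far larger, yet a one standard deviation change in assets moves y more, so the ranking reverses after standardization.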