Analysis of the Average Enade Grade for Each State - Pedro

The analysis I will do is related to the general Enade grade obtained by courses.

The dataset have two columns related to the grades: “ Nota Padronizada - CE” and “ Conceito Enade (Contínuo)”. The difference between the two is small, so I chose to analyze based on the “ Conceito Enade (Contínuo)” column.:

>>> import createobjects
>>> import data_cleaner
>>> data_path = "./data/dataframes/resultados_cpc_2021.xlsx"
>>> raw_df = createobjects.load_data_as_df(data_path)
>>> df = data_cleaner.dataframe_cleaner(raw_df)
>>> df[" Conceito Enade (Contínuo)"].mean()
2.400735623003195
>>> df[" Nota Padronizada - CE"].mean()
2.3791958200212995

The means of the two columns are very simillar. As we can see below, the mean of the differecens between the two columns is ignorable:

>>> (df[" Conceito Enade (Contínuo)"] - df[" Nota Padronizada - CE"]).mean()
0.021539802981895634

For the map plotting, I filtered the data to reflect only public universitys. We can compare the grade’s means for the different university types:

>>> df.groupby("Categoria Administrativa")[" Conceito Enade (Contínuo)"].mean()
Categoria Administrativa
Especial                       1.369682
Privada com fins lucrativos    2.100839
Privada sem fins lucrativos    2.394795
Pública Estadual               2.382922
Pública Federal                2.743884
Pública Municipal              2.078963
>>> df.groupby("Categoria Administrativa")[" Conceito Enade (Contínuo)"].std()
Categoria Administrativa
Especial                       0.740122
Privada com fins lucrativos    0.778820
Privada sem fins lucrativos    0.861284
Pública Estadual               0.915827
Pública Federal                0.954435
Pública Municipal              0.918509

As we can see, they aren’t really different; All of them are between 2 and 3.

Now, to the actual plotting, we get the df containing the means for each state:

>>> createobjects.create_mean_of_general_grade(df)
   Sigla da UF    Conceito Enade (Contínuo)
0            AC                    1.978276
1            AL                    1.854321
2            AM                    1.990718
3            AP                    1.965472
4            BA                    2.478058
5            CE                    2.369505
6            DF                    3.557844
7            ES                    3.022967
8            GO                    2.101168
9            MA                    1.968111
10           MG                    2.910275
11           MS                    2.201744
12           MT                    1.885990
13           PA                    1.874467
14           PB                    2.409157
15           PE                    2.322109
16           PI                    2.079104
17           PR                    2.867315
18           RJ                    3.278528
19           RN                    2.665535
20           RO                    1.947105
21           RR                    1.816065
22           RS                    2.894011
23           SC                    3.247607
24           SE                    2.463977
25           SP                    3.201814
26           TO                    1.829220

We can already see that some states excel the others considerably. We will be able to see that these states are near each other when we plot a map with this data.

After some mapa data manipulation that isn’t related to the actual data analysis, this is the result:

_images/average_score_states.png