Input Datasets

This project uses four standardized text classification datasets to train the neural networks and report the performance of the algorithms.
The idea is to evaluate both architectures on different datasets. Summary statistics for each dataset are shown in the table below, where the columns Mean through Major Class give the number of documents per class.
The optimization runs in trials, each trial choosing a set of parameters that is applied to each dataset. There are two versions of each dataset:
TF-IDF and distance-based meta-features (MF).
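
For readers unfamiliar with the TF-IDF representation, here is a minimal sketch of one common weighting. It is an illustration only: the variant shown (term frequency normalized by document length, unsmoothed natural-log IDF) is an assumption and may differ from the one used to build the datasets. The function name `tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Assumed variant: length-normalized tf times unsmoothed log idf.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

# Toy documents, not taken from any of the four datasets.
toy_docs = [["neural", "net", "text"], ["text", "class"], ["text", "net"]]
vectors = tfidf(toy_docs)
```

A term that occurs in every document (here "text") gets weight zero under this variant, which is why TF-IDF favors terms that discriminate between documents.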

Dataset	Size	#Features	#Classes	Mean	Minor Class	1st Quartile	Median	3rd Quartile	Major Class
20NG	18766	61050	20	938	627	952	978	988	998
4UNI	8274	40195	7	1182	13	343	929	1382	3757
REUTERS	13327	19590	90	148	2	8	29	91	3964
ACM	24897	59990	11	2263	63	761	2041	3278	6562
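
The class-size columns of the table can be recomputed from a list of labels along these lines. This is a sketch under assumptions: the helper name `class_size_summary` is hypothetical, and the "inclusive" quartile rule is a guess, so quartiles may differ slightly from the table if another rule was used. As a sanity check, Mean equals Size divided by #Classes (for 20NG, 18766 / 20 ≈ 938).

```python
from collections import Counter
from statistics import mean, quantiles

def class_size_summary(labels):
    """Summarize how many documents fall in each class."""
    sizes = sorted(Counter(labels).values())
    if len(sizes) > 1:
        # quantiles(n=4) returns the three cut points Q1, median, Q3.
        # The "inclusive" method is an assumption, not confirmed by the source.
        q1, med, q3 = quantiles(sizes, n=4, method="inclusive")
    else:
        q1 = med = q3 = sizes[0]
    return {
        "size": sum(sizes),
        "classes": len(sizes),
        "mean": mean(sizes),
        "minor_class": sizes[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "major_class": sizes[-1],
    }

# Toy labels, not taken from any of the four datasets.
toy_labels = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + ["d"] * 10
summary = class_size_summary(toy_labels)
```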

The data behind all visualizations come from this optimization process (with 5-fold cross-validation),
run for each set of parameters on each dataset version. The optimization is limited to 80 trials, and for every trial
we record the set of parameters that was tested, the time, and the loss.
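
The shape of the resulting trial log can be sketched as follows. This is not the project's optimizer: plain random search stands in for whatever search strategy was actually used, `fold_loss` is a hypothetical placeholder for training and scoring a network on one fold, and the parameter names `lr` and `hidden` are invented for illustration. Only the 80 trials, the 5 folds, and the recorded fields (parameters, time, loss) come from the text.

```python
import math
import random
import time

N_TRIALS = 80  # number of trials, as stated in the text
N_FOLDS = 5    # 5-fold cross-validation, as stated in the text

def kfold_indices(n_samples, n_folds, rng):
    """Shuffle sample indices and split them into n_folds validation folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def fold_loss(params, train_idx, valid_idx):
    """Hypothetical placeholder: a real version would train on train_idx
    and return the validation loss on valid_idx."""
    # Toy loss that is smallest near lr = 0.01; the indices are ignored.
    return (math.log10(params["lr"]) + 2) ** 2 + 1.0 / params["hidden"]

def run_trials(n_samples, seed=0):
    rng = random.Random(seed)
    folds = kfold_indices(n_samples, N_FOLDS, rng)
    trials = []
    for trial_id in range(N_TRIALS):
        # Random search (an assumption): sample one parameter set per trial.
        params = {
            "lr": 10 ** rng.uniform(-4, -1),
            "hidden": rng.choice([64, 128, 256]),
        }
        start = time.perf_counter()
        losses = []
        for k, valid_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            losses.append(fold_loss(params, train_idx, valid_idx))
        trials.append({
            "trial": trial_id,
            "params": params,
            "loss": sum(losses) / len(losses),  # mean loss over the 5 folds
            "time": time.perf_counter() - start,
        })
    return trials

trials = run_trials(n_samples=100)
best = min(trials, key=lambda t: t["loss"])
```

Each entry of `trials` is one row of the kind of table the visualizations are built from: one parameter set with its cross-validated loss and elapsed time.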