The purpose of this research project was to apply supervised machine learning to a dataset from the UCI Machine Learning Repository to see whether we could accurately determine if a person had used illegal drugs based on their personality traits and demographics. The dataset consists of 32 columns and about 1800 rows. The participants provided typical demographic information, and each participant has a score for 7 separate personality characteristics. In addition, the participants indicated whether or not they had taken each of 19 different drugs. They indicated whether a drug had been taken based on the following criteria:
The dataset was simplified to improve the results of the supervised machine learning models. More information on this topic can be found in the Preprocessing section of the Analysis page.
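The dataset has 7 personality measurements: Nscore (neuroticism), Escore (extraversion), Oscore (openness to experience), Ascore (agreeableness), Cscore (conscientiousness), Impulsive (impulsiveness), and SS (sensation seeking).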
The roughly 1800 participants provided their Age, Country, Ethnicity, and Education level, as shown in the charts below.
Each participant indicated usage of the following drugs as part of the study:
Refer to the Analysis page for further information on how legal and illegal drugs are delineated and on whether a participant is considered to have used a drug.
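To make the idea of a binary "used illegal drugs" label concrete, here is a minimal sketch of how per-drug answers could be collapsed into a single target value. It is not the project's actual preprocessing, which is described on the Analysis page. The usage codes (`CL0` through `CL6`), the column names, the set of drugs treated as legal, and the cutoff for counting someone as a user are all assumptions chosen for illustration.

```python
from typing import Mapping

# Assumption: drugs treated as legal for this sketch. The project's own
# legal/illegal split is defined on the Analysis page and may differ.
ASSUMED_LEGAL = {"Alcohol", "Caff", "Choc", "Nicotine"}

# Assumption: usage codes run from "CL0" (least recent use or none) to
# "CL6" (most recent use), and codes at or above this cutoff count as "used".
ASSUMED_USER_CUTOFF = 3


def is_user(code: str) -> bool:
    """Return True if a usage code such as 'CL4' meets the assumed cutoff."""
    return int(code[2:]) >= ASSUMED_USER_CUTOFF


def used_illegal_drug(answers: Mapping[str, str]) -> bool:
    """Collapse one participant's per-drug codes into a single binary label."""
    return any(
        is_user(code)
        for drug, code in answers.items()
        if drug not in ASSUMED_LEGAL
    )


# Hypothetical participants, not rows from the real dataset.
participant_a = {"Alcohol": "CL5", "Choc": "CL6", "Cannabis": "CL0"}
participant_b = {"Alcohol": "CL5", "Cannabis": "CL4", "Heroin": "CL0"}

labels = [used_illegal_drug(participant_a), used_illegal_drug(participant_b)]
```

Under these assumptions, the first participant is labeled a non-user because their only recent use is of drugs in the assumed legal set, while the second is labeled a user because of the cannabis answer. Changing the cutoff or the legal set changes the class balance of the target, which is why the real definitions on the Analysis page matter.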