Multi-Class Document Classification Based on Deep Neural Network and Word2Vec
Keywords:
Document Classification, Multiclass Classification, Data Preprocessing, Word Embedding Methods, Machine Learning, Deep Learning
Abstract
With the growth of unstructured data, classifying text-based documents has become increasingly important. In particular, classifying news texts and digital documents makes the information sought easier to access. In this study, a large collection of news text was used. After the data set was preprocessed, the Bag of Words (BoW), TF-IDF, Word2Vec and Doc2Vec text representation and word embedding methods were applied. In the classification phase, Random Forest (RF), Multilayer Perceptron (MLP), Support Vector Machine (SVM) and Deep Neural Network (DNN) algorithms were applied. In the experiments, the combination of the Word2Vec method with the DNN algorithm achieved the best result.
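The pipeline described above (preprocess, vectorize, classify) can be illustrated with a minimal sketch. It is not the study's implementation: it uses only the Python standard library, covers only the TF-IDF representation, and swaps in a simple nearest-centroid classifier in place of the RF, MLP, SVM and DNN models. The toy sentences, the labels, the smoothed IDF formula, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Minimal preprocessing (assumed): lowercase, keep alphabetic tokens only.
    return [t for t in text.lower().split() if t.isalpha()]

def fit_idf(docs_tokens):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(tokens, idf):
    # Term frequency times IDF, L2-normalised; terms unseen in training are ignored.
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def fit_centroids(vectors, labels):
    # Sum the vectors of each class, then L2-normalise the sum.
    sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for term, value in vec.items():
            sums[label][term] += value
    centroids = {}
    for label, acc in sums.items():
        norm = math.sqrt(sum(v * v for v in acc.values())) or 1.0
        centroids[label] = {t: v / norm for t, v in acc.items()}
    return centroids

def cosine(a, b):
    # Dot product of two sparse, L2-normalised vectors.
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def predict(text, idf, centroids):
    # Assign the class whose centroid is most similar to the document vector.
    vec = tfidf(tokenize(text), idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Toy news-like training data (hypothetical, not from the study).
train = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the central bank raised interest rates", "economy"),
    ("stock markets fell as inflation rose", "economy"),
    ("the new phone has a faster processor", "technology"),
    ("engineers released a software update", "technology"),
]

docs_tokens = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs_tokens)
centroids = fit_centroids([tfidf(tokens, idf) for tokens in docs_tokens], labels)

print(predict("the bank cut interest rates", idf, centroids))
```

In a fuller experiment, the `tfidf` step would be replaced by each of the other representations in turn (BoW, Word2Vec, Doc2Vec) and the centroid classifier by each of the four models, with the same train and test split used for every combination so that the results are comparable.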
License
The manuscript, with its title and authors, is being submitted for publication in the Journal of Aeronautics and Space Technologies. This article, or a major portion of it, has not been published, accepted, or submitted for publication elsewhere. If it is accepted for publication, I hereby grant all unlimited copyright privileges to the Journal of Aeronautics and Space Technologies.
I declare that I am the responsible writer on behalf of all authors.