Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Date of Graduation

Spring 2019

ORCID

https://orcid.org/0000-0002-7955-5441

Document Type

Thesis

Degree Name

Bachelor of Science (BS)

Department

Department of Computer Science

Advisor(s)

Ramon A. Mata-Toledo

Abstract

The purpose of this thesis is to assist in automating the detection of Fake News by identifying which features are most useful for different classifiers. The effectiveness of different extracted features for Fake News detection is examined. When classifying text with machine learning algorithms, features have to be extracted from the articles for the classifiers to be trained on. In this thesis, several different features are extracted to train the classifiers: word counts, n-gram counts, term frequency-inverse document frequency (TF-IDF), sentiment analysis, lemmatization, and named entity recognition. Two classifiers are used: a Random Forest classifier and a Naïve Bayes classifier. Training on different features combined with different machine learning algorithms yields different accuracies. By testing the different features on different classifiers, it can be determined which features are best for Fake News detection. Classifying news articles as either Fake News or not Fake News is explored using three datasets, which in total contain over 40,000 articles. One of the datasets is used partly to train the classifiers and partly to test them. The remaining two datasets are used purely for testing the classifiers. All the code used in conjunction with this thesis can be found in Appendix B.
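
To make the feature-versus-classifier comparison described in the abstract concrete, the following is a minimal sketch, not the thesis code (which is in Appendix B and is not reproduced on this page). It uses only the Python standard library, and everything in it is an assumption made for illustration: the function names `ngram_counts` and `tfidf`, the `NaiveBayes` class, and the four-sentence toy corpus are all hypothetical. It covers word counts, n-gram counts, TF-IDF, and a multinomial Naïve Bayes classifier; the Random Forest classifier and the remaining features are left out for brevity.

```python
import math
from collections import Counter


def ngram_counts(text, n=1):
    """Count word n-grams (n=1 gives plain word counts)."""
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def tfidf(docs, n=1):
    """Return one {term: tf * log(N / df)} dict per document."""
    counts = [ngram_counts(d, n) for d in docs]
    df = Counter(term for c in counts for term in c)  # document frequency
    total = len(docs)
    return [{t: tf * math.log(total / df[t]) for t, tf in c.items()} for c in counts]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over count features."""

    def fit(self, features, labels):
        self.classes = sorted(set(labels))
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.totals = {c: Counter() for c in self.classes}
        for feats, label in zip(features, labels):
            self.totals[label].update(feats)
        self.vocab = {t for c in self.classes for t in self.totals[c]}
        return self

    def predict(self, feats):
        def score(c):
            denom = sum(self.totals[c].values()) + len(self.vocab)
            # Terms never seen in training are ignored.
            return self.prior[c] + sum(
                count * math.log((self.totals[c][t] + 1) / denom)
                for t, count in feats.items() if t in self.vocab)
        return max(self.classes, key=score)


def accuracy(model, features, labels):
    """Fraction of examples whose predicted label matches the true label."""
    hits = sum(model.predict(f) == y for f, y in zip(features, labels))
    return hits / len(labels)


# Hypothetical toy corpus, invented for this sketch (not from the thesis datasets).
train_texts = [
    "shocking miracle cure doctors hate",
    "you will not believe this shocking secret",
    "senate passes budget bill after debate",
    "city council approves new budget",
]
train_labels = ["fake", "fake", "real", "real"]
test_texts = ["shocking secret cure", "council approves budget bill"]
test_labels = ["fake", "real"]

# Compare two feature sets (word counts vs. bigram counts) on the same classifier.
results = {}
for n in (1, 2):
    model = NaiveBayes().fit([ngram_counts(t, n) for t in train_texts], train_labels)
    results[n] = accuracy(model, [ngram_counts(t, n) for t in test_texts], test_labels)
    print(f"{n}-gram counts: accuracy = {results[n]:.2f}")
```

The loop at the end mirrors the experimental design in miniature: the classifier stays fixed while the feature set changes, so any difference in accuracy can be attributed to the features. On a corpus this small the numbers carry no meaning; the point is only the shape of the comparison.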

Recommended Citation

Shoemaker, Eliza, "Using data science to detect fake news" (2019). Senior Honors Projects, 2010-2019. 714.
https://commons.lib.jmu.edu/honors201019/714

Included in

Artificial Intelligence and Robotics Commons

Senior Honors Projects, 2010-2019

Using data science to detect fake news
