Identity theft and account takeover continue to be major problems on social media platforms such as Facebook and Twitter. The key component in the security chain on social media is the human user, and it is the savvy human user armed with a little knowledge who provides the best defense in protecting privacy and preventing identity theft. Criminals can use data mining of Facebook and Twitter to gain access to passwords and user accounts. A little knowledge of data mining and scammer techniques can go a long way toward preventing valuable information presented on social media from being sold to and accessed by malicious third parties, including criminal organizations. Herein we present a classroom exercise that uses word cloud software to visualize and analyze the words and information a user is presenting on social media.
Data mining is the automated analysis of a given database from different aspects and views, allowing the construction of a model that provides useful information and correlations about the analyzed database (Dawson and Omar, 130). Social media platforms, such as Facebook and Twitter, are in reality robust databases created by users voluntarily revealing personal, familial, and employment data. Russian intelligence and Cambridge Analytica data mined user information on Facebook to influence the 2016 U.S. Presidential election, while identity theft, costing $220 million in the U.S. in that same year, has been made easier by photos and personal information published on Facebook (Picchi 2018). Facebook users regularly thank people for birthday wishes and anniversary congratulations, publish photos and ages of their parents and children on their birthdays, announce that they are on vacation and where, share information about where they work and whom they work with, and reveal that they are going to the hospital as well as their health condition. This is not only valuable information for Facebook to sell to third parties, but it is also information that can be data mined by nefarious parties.
Facebook users provide a wealth of personal, family, and employment information that is used by domestic and foreign security agencies, pedophiles searching for victims, burglars, and scammers. An MI5 intelligence officer explained that almost all the information they need to profile a target comes from the internet, where they can usually discover a target’s hobbies on Facebook (Corera, 237). Pedophiles can comb Facebook and find names, ages, birthdates, schools, and sports teams of potential victims. Scammers and identity thieves can usually find clues for cracking the pass codes of personal and financial accounts by using Facebook to discover your hometown, high school mascot, names of children, birthdate, sibling names, favorite foods, first dog, and other common security question information (Identity Theft Resource Center, 2018).
Users of social media can readily protect the confidentiality of their information by becoming more aware of their privacy settings on platforms such as Facebook and of the information they are presenting. Here we focus on the latter, filtering the information presented, rather than on privacy settings. For example, one should filter information by asking questions such as: does the world need to know you are on vacation thousands of miles away in Hawaii? You might like to brag to your friends and family that you have the money, time, and means to pamper yourself with a trip to the islands, but at the same time you could be revealing to burglary rings that you are not at home, making your home a potential robbery target. A better choice, if you feel you must tell the world, would be to publish photos after you are back home. This time delay would not only add visuals, but would also keep the fact that your home is empty and open to robbery from being known on Facebook or Twitter.
Where does one start in filtering information presented on social media, such as Facebook? A good place to start is by analyzing what you have already presented on social media. We will be using word cloud software to visualize information being presented on social media. A word cloud program works by using Latent Dirichlet Allocation (LDA) modeling algorithms to visually present words in a text as larger in relation to the frequency of the word in the text (Kusumaningrum and Adhy, 1752). In other words, a visual cloud of words is painted where a word becomes larger if it is used more often. Word clouds are useful because, as the saying goes, “A picture is worth a thousand words.” The visual word cloud makes it easy to see what information is being presented by providing a graphic representation of the words we are using most often.
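The idea that a word grows larger the more often it is used can be illustrated with a minimal sketch. This sketch uses plain frequency counting with linear scaling, not the LDA modeling cited above, and it does not reproduce any particular word cloud program. The stop-word list, the function names, and the font-size range are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; real tools ship far longer lists.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "on", "for", "is", "my", "i", "we", "at", "with"}

def word_frequencies(text):
    """Count how often each non-stop word appears, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def font_sizes(freqs, min_size=10, max_size=48):
    """Scale each word's count linearly into a font size between min_size and max_size."""
    if not freqs:
        return {}
    low = min(freqs.values())
    span = (max(freqs.values()) - low) or 1  # avoid dividing by zero when all counts are equal
    return {w: min_size + (c - low) * (max_size - min_size) / span for w, c in freqs.items()}

# Made-up sample text standing in for a user's posts.
sample = "Off to the hospital again. Hospital visits are tiring, but the hospital staff is kind."
freqs = word_frequencies(sample)
sizes = font_sizes(freqs)
```

In this made-up sample, "hospital" is the most frequent word and so receives the largest font size, while words used once receive the smallest.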
This graphic representation allows us to analyze what words we are using on social media and how frequently we are using them. For example, if the word ‘health’ appears larger than most other words, you need to be aware that you are most likely presenting personal information on your health that might be better kept off social media. The point is that a word cloud gives us the chance to analyze and evaluate what we are presenting, so that we might make better choices about what, when, and if we are going to present information on social media.
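The same check can be approximated without a picture by counting how often words from sensitive topics appear in a user's posts. The following is a minimal sketch only: the topic names, the watch-list words, and the function name are hypothetical, and a class would build its own list.

```python
import re
from collections import Counter

# Hypothetical watch list; these topics and words are assumptions for illustration.
SENSITIVE = {
    "health": {"health", "hospital", "surgery", "doctor"},
    "travel": {"vacation", "trip", "airport", "hawaii"},
    "family": {"birthday", "daughter", "son", "mom", "dad"},
}

def sensitive_counts(text):
    """Return, per topic, how many times any watch-list word appears in the text."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    report = {}
    for topic, terms in SENSITIVE.items():
        total = sum(counts[t] for t in terms)
        if total:  # only report topics that actually occur
            report[topic] = total
    return report

# Made-up sample text standing in for a user's posts.
posts = "On vacation in Hawaii all week! Back from the doctor, health is fine. Best vacation ever."
report = sensitive_counts(posts)
```

A topic with a high count is a prompt to ask whether those posts reveal more than intended, in the same way a large word in the cloud would be.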
Students will access different social media platforms and download the content of the sites. Then, they will paste that content into a word cloud program for content analysis.
Risks (Identity theft, account takeover, scamming, online preying, and fraud)
Example of Occurrence
According to the U.S. Department of Justice, each year 17.6 million people in the U.S. experience some form of identity theft, and financial losses to victims were $15.4 billion in 2014, with the loss per victim averaging $1,343 (Gredler, 2016).
“My pictures have been stolen and used in fake accounts,” Representative Adam Kinzinger, R-Illinois, told Zuckerberg. “In many cases, people have been extorted for money” (Picchi 2018).
Russian Intelligence and Cambridge Analytica data mined Facebook and used Facebook to influence the 2016 U.S. Presidential Election (Corera, 237).
In 2010, Google UK security was breached after a Google employee was targeted, and the employee clicked on a malicious link in a phishing email (Abawajy, 236).