In a previous article, Hacking the Psyche, I presented the security and privacy implications of capturing feelings of individuals using on-line mechanisms for good use as well as abuse and manipulation. Whenever controls around individual privacy are called into question, there is always, on the other side of the coin, a clear business opportunity.
Corporations often use indirect data such as demographic information and sales statistics to measure the health of their brand because the direct data, i.e., how the public and their customers actually feel about their brand, is not available for capture. In this article, I want to put forth a case study to demonstrate how capturing feelings on the social web can allow companies to measure the reputation of their brand.
In September 2008, Microsoft reportedly paid Jerry Seinfeld $10 million to star in its recent TV commercial campaign. In this article I want to provide evidence to support the hypothesis that Microsoft, in addition to paying Seinfeld, suffered the additional cost of damage to its brand from the commercials. On a positive note, the I'm a PC commercial that followed seems to have made up for the damage.
Here are the TV advertisements:
September 4, 2008: Shoe Circus [starring Jerry Seinfeld and Bill Gates]
September 11, 2008: New Family [starring Jerry Seinfeld and Bill Gates]
September 18, 2008: I'm a PC [not starring Jerry Seinfeld]
Now, let's turn to Twitter to measure the feelings expressed towards these commercials during the month of September 2008. Using the Emotion Dashboard tool I presented in Hacking the Psyche, I was able to visualize how people on Twitter felt about these commercials. Here's a video of the tool in action:
Here is a screen-shot of the result including some annotations:
Most people disliked the first commercial (Red bar indicating overall negative feelings). The most common word used to express feelings towards the first commercial was "WTF" as indicated by the word cloud and the video demonstration.
Feelings towards the Microsoft brand started to pick up to a positive state, only to plummet again once the second commercial aired (Red bar).
The third commercial, I'm a PC, devoid of Seinfeld, was generally liked and appreciated, helping feelings towards the Microsoft brand return to a positive state (Yellow bar indicating 'happy' feelings).
There you have it: a powerful method to use feelings expressed in social media to measure a corporation's brand and marketing efforts.
Brand reconnaissance is not the only effort that can be leveraged from feelings on the social web. If you are interested in this topic, I invite you to consider my upcoming talk at the O'Reilly Money Tech Conference titled Emotion Dashboard: Harvesting Feelings on the Social Web for Powerful Decisioning.
In this article, I want to persuade you of the real possibility and high probability that, in the very near future, remote entities will be able to target people’s on-line presence to capture and leverage their emotional states and feelings. There are some very extreme implications of this from a security and privacy perspective, and this is the scope I will adhere to in this article. On the flip side, the ideas presented in this article can be leveraged to construct powerful business decisioning and measurement capabilities, a topic that deserves its own space - I will cover this subject in a separate article in the next few days.
Before I go any further, I want to stress that the purpose of this article is not to spread undue alarm, nor is the purpose to portray social online media as an evil. I personally utilize the many avenues of online communication and collaboration facilitated by the Generation Y culture. The purpose of this article, instead, is to share some of my initial thoughts on the possibilities of abuse, specific to the mapping of individual feelings online and possible implications.
In this talk, Jonathan describes his passion for making sense of the emotional world and his deep compassion for the human condition. Regardless of this particular article, Jonathan’s talk stands on its own. I think Jonathan’s ideas, projects, and aspirations are true works of art. His ideas are powerful enough to inspire a security professional such as me to look outside the oft-incestuous world of information security, and to reach out and connect with other venues of Science and understanding. In a small way, the material presented in this article is my attempt to do just that.
I invite you to visit one of Jonathan’s projects that he co-founded with Sep Kamvar - We Feel Fine:
Since August 2005, We Feel Fine has been harvesting human feelings from a large number of weblogs. Every few minutes, the system searches the world's newly posted blog entries for occurrences of the phrases "I feel" and "I am feeling". When it finds such a phrase, it records the full sentence, up to the period, and identifies the "feeling" expressed in that sentence (e.g. sad, happy, depressed, etc.). Because blogs are structured in largely standard ways, the age, gender, and geographical location of the author can often be extracted and saved along with the sentence, as can the local weather conditions at the time the sentence was written. All of this information is saved.
The result is a database of several million human feelings, increasing by 15,000 - 20,000 new feelings per day. Using a series of playful interfaces, the feelings can be searched and sorted across a number of demographic slices, offering responses to specific questions like: do Europeans feel sad more often than Americans? Do women feel fat more often than men? Does rainy weather affect how we feel? What are the most representative feelings of female New Yorkers in their 20s? What do people feel right now in Baghdad? What were people feeling on Valentine's Day? Which are the happiest cities in the world? The saddest? And so on.
...
At its core, We Feel Fine is an artwork authored by everyone. It will grow and change as we grow and change, reflecting what's on our blogs, what's in our hearts, what's in our minds. We hope it makes the world seem a little smaller, and we hope it helps people see beauty in the everyday ups and downs of life.
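The harvesting step described in the quoted passage can be sketched in a few lines of Python. This is only my illustration of the idea, not the We Feel Fine implementation: the `FEELINGS` set is a hypothetical stand-in for the project's much longer list, and the function name `harvest` is my own.

```python
import re

# Hypothetical mini-lexicon; the real project uses its own, much longer list.
FEELINGS = {"sad", "happy", "depressed", "fine", "upset"}

# A sentence containing "I feel" or "I am feeling", captured up to the period.
PATTERN = re.compile(r"[^.]*\bI (?:feel|am feeling)\b[^.]*\.", re.IGNORECASE)

def harvest(text):
    """Return (sentence, feeling) pairs; feeling is None if no known word is found."""
    results = []
    for match in PATTERN.finditer(text):
        sentence = match.group(0).strip()
        words = re.findall(r"[a-z']+", sentence.lower())
        # Take the first word in the sentence that appears in the lexicon.
        feeling = next((w for w in words if w in FEELINGS), None)
        results.append((sentence, feeling))
    return results
```

Calling `harvest("Work was long. I feel sad today.")` returns the single sentence "I feel sad today." tagged with the feeling "sad". Extracting age, gender, location, and weather, as the quote describes, is outside the scope of this sketch.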
Here is a video I uploaded to Youtube, demonstrating We Feel Fine’s interface, including the ability to filter for specific targets (for example: feelings expressed by individuals in their 20s in Iraq):
Emotion Dashboard: Targeting Individuals. The We Feel Fine project does not target specific individuals. The creators of the project imply that doing so would violate an individual's privacy:
Privacy: We Feel Fine only collects and displays data that was already posted publicly on the World Wide Web. We Feel Fine never associates individual human names with the feelings it displays, though it always provides a link to the blog from which any displayed sentence or picture was collected....
We Feel Fine is a work of art designed by well meaning intellectuals. It has neither the capability nor the intention of intruding on any one particular person’s privacy, yet the project raised my personal consciousness towards the security and privacy implications of capturing the feelings (past and present) of individuals.
To pursue discussion around the possibility and implications of capturing feelings projected by individuals online, I decided to develop a proof of concept visualization tool that I will call Emotion Dashboard. This is not a production-ready tool of any sort because I do not currently have the resources to develop such a thing. The goal of this tool (if you should even call it a tool) is to demonstrate my ideas and my vision on this particular topic to facilitate and encourage further discussion in the community. Here are the components of Emotion Dashboard:
RSS. It consumes an RSS feed as its source of input. This RSS feed can include more than one resource stitched together using a service such as Yahoo Pipes:
In other words, the targeted individual’s online presence may include his or her Facebook profile updates, Blogs, and Twitter messages. In this way, updates on all of the sources of a particular individual’s online presence can be coupled together in one RSS feed and then supplied to Emotion Dashboard which will scan the feed from the past to the present (older entries first).
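To make the "one feed, older entries first" idea concrete, here is a minimal sketch that merges several RSS 2.0 documents locally instead of through Yahoo Pipes. It is not the Emotion Dashboard code. The function names are my own, and I assume every item carries a `pubDate` with an explicit timezone offset; an item without one would raise an error in this sketch.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

def parse_items(rss_xml, source):
    """Extract (published, source, title) tuples from one RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        # Assumes pubDate is present and in the RFC 822 style RSS 2.0 uses.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        items.append((published, source, item.findtext("title", default="")))
    return items

def merge_feeds(feeds):
    """Merge {source_name: rss_xml} into one list, oldest entry first."""
    merged = []
    for source, rss_xml in feeds.items():
        merged.extend(parse_items(rss_xml, source))
    # Sort on the timestamp only, so the dashboard reads from past to present.
    return sorted(merged, key=lambda entry: entry[0])
```

The result is a single chronological list that interleaves blog, Twitter, and other entries, which is the shape of input the rest of the tool expects.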
Pulse. In order to visualize the emotional state of an individual from the past (older RSS entry) to the current, the tool includes a line graph at the top of the interface that tends upwards when a word that expresses a happy (positive) emotion is found, and downwards when a word that expresses a sad or angry (negative) emotion is located. To accomplish this feature, I was able to leverage the CSV file provided by the We Feel Fine project located here: http://www.wefeelfine.org/data/files/feelings.txt. This file includes a list of words that are commonly used to express feelings. I marked each word in this file as positive- or negative-sounding based on my own judgment. Occurrences of these words are plotted on the line graph, and can also be clicked on to spawn a new browser session targeting the relevant location of the word.
Immediately below the line graph is a solid bar that expresses the culmination of the individual’s overall mood. The color of this bar is either Yellow (happy), Blue (sad), or Red (angry). The hex codes for these colors are also derived from the We Feel Fine CSV file listed above.
I concede that this technique of merely grepping for words lacks context and that it is prone to an extremely high error rate. However, given the limited amount of resources I have at this point, my goal is not to provide something that is readily usable for all cases, but to present a starting point of a possible approach and the probable implications should this be extended to apply intelligent grammar based contextual analysis. Do note that, even though I concede this is an approach vulnerable to a high error rate, the technique does, statistically speaking, get slightly more accurate the more words it consumes.
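The pulse line and the mood bar can be sketched as follows. This is a simplified illustration rather than the tool's actual code. The `LEXICON` entries are a hypothetical hand-labelled sample, not the contents of the We Feel Fine file, and I assume the bar takes the color of whichever mood occurs most often, which is only one way to interpret "overall mood".

```python
import re
from collections import Counter

# Hypothetical hand-labelled sample: word -> (line direction, mood).
LEXICON = {
    "happy": (+1, "happy"),
    "nice": (+1, "happy"),
    "upset": (-1, "sad"),
    "depressed": (-1, "sad"),
    "furious": (-1, "angry"),
}
MOOD_COLORS = {"happy": "yellow", "sad": "blue", "angry": "red"}

def pulse(entries):
    """Return (line, bar_color) for a list of entry texts, oldest first."""
    level = 0
    line = []
    moods = Counter()
    for text in entries:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in LEXICON:
                step, mood = LEXICON[word]
                level += step      # up for positive words, down for negative
                moods[mood] += 1
        line.append(level)         # one point on the graph per entry
    # Assumption: the bar shows the most frequent mood seen so far.
    overall = moods.most_common(1)[0][0] if moods else None
    return line, MOOD_COLORS.get(overall)
```

Because the matching is a plain word lookup, a sentence such as "not happy at all" still moves the line upwards, which is exactly the kind of error conceded above.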
Word Cloud. Below the line graph is a simple word cloud containing words from the CSV list discussed above. As the RSS feed is analyzed from past to present, words in the word cloud grow in size as they re-occur.
The word cloud allows the user to analyze the words being used to express feelings as the Emotion Dashboard reads the RSS feed from past to present. The words in the cloud are colored based on the associated hex color codes present in the CSV file.
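The growth of words in the cloud boils down to counting re-occurrences and mapping each count to a font size. Here is a minimal sketch of that mapping. The word set and the `base`, `step`, and `cap` values are arbitrary choices of mine for illustration, and the coloring from the CSV file is left out.

```python
import re
from collections import Counter

# Hypothetical subset of the feelings list.
FEELING_WORDS = {"happy", "upset", "nice", "handicapped"}

def cloud_sizes(entries, base=12, step=4, cap=48):
    """Map each feeling word to a font size that grows as the word re-occurs."""
    counts = Counter()
    for text in entries:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in FEELING_WORDS:
                counts[word] += 1
    # First occurrence gets the base size; each repeat adds one step, up to cap.
    return {word: min(base + step * (n - 1), cap) for word, n in counts.items()}
```

Re-running the function on a growing prefix of the feed reproduces the effect of words swelling as the feed is read from past to present.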
The following screen-shot demonstrates a sample output of an individual’s (who we will call “Jack Smith” for the purposes of this discussion) online presence:
Here are some observations and implications:
Jack’s initial online presence portrays his emotional state as positive (word-cloud: happy).
Jack blogs about his friend being laid-off from his job (word-cloud: layoff). This is a negative event.
Feelings expressed by Jack on venues (other than this blog) where he has online presence (example: Twitter), on the same day as his blog entry about his friend’s layoff, are extremely negative (word-cloud: handicapped, upset) even though Jack is discussing other topics. This can lead to the hypothesis that Jack’s overall mood is negative because he is influenced by his friend’s situation. This hypothesis, if true, may allow a malicious third party to manipulate Jack’s negative state to influence his actions. However, in order for such a tactic to succeed, the third party will need to understand Jack’s personality to understand how Jack behaves in moments of stress. It is possible for a third party to construct a personality profile on Jack by studying his authored content based on his on-line presence (blog, Twitter, Facebook, etc) and correlating it with known personality analysis methodologies, for example, the Big Five personality traits based tests:
Once enough information about Jack is collected to reasonably satisfy the personality test requirements, Jack’s personality patterns can be determined that may aid a malicious third party in exploiting Jack’s current emotional state. It is also plausible that this can be extended to automated and trigger based abilities. This is an extremely powerful idea - Jack may not be consciously aware of his negative mood, yet a third party may be able to analyze this remotely with some degree of probability. The following is a screen-shot of the results of a Big 5-like personality test (courtesy of Signal Patterns):
Jack’s mood recovers to a positive state as time progresses, only to be briefly pulled down by his discussion of his friend’s layoff situation. This illustrates that the after-shocks of his friend’s situation are still negatively affecting him.
Eventually, Jack recovers to his average positive state (word-cloud: nice).
Case Study: Criminal Investigation and Analysis. There are numerous security and privacy implications of the discussion at hand. I am unlikely to succeed in attempting to enumerate them all. Instead, I want to present one particular case study that can further illustrate the impact of this topic.
Ex-con vents pain online, then kills OCEANA COUNTY -- Danlee Mead was apparently using his MySpace site to tell the world how unhappy and desperate he felt in the hours before he abducted and killed his wife, then turned a shotgun on himself.... Hours later, the depth of the ex-convict's anguish turned to violence.....
A cached copy of Danlee’s MySpace page suggests that he changed his profile (moments before he committed the violent act) to use more positive-sounding words, even though his overall thoughts remained negative. His prior profile also consisted of negative feelings, yet the words used in the original profile were more negative-sounding. Here is a demonstration of what his profile looks like when run through an analysis over time:
A few observations:
Initially, Danlee’s MySpace profile is dominated by negative-feeling words (blue bar).
His profile remains consistently negative over time (blue bar).
The words used in his updated profile tip the mood bar to positive (yellow). This is when Danlee changed his profile right before committing the crime.
Following from the above observations, it is clear to see how this type of analysis can be used by investigators, admittedly after-the-fact, to get a glimpse into a suspect's state of mind over time.
It may not be possible to use data from online social media to proactively detect the future behavior of all individuals, yet in this situation, the criminal did indeed have prior history of crimes. Perhaps a proactive approach targeted towards known suspects’ online social presence can be used to detect certain deviance from tuned thresholds - possibly in an automatic fashion based on a set of defined triggers. Such an approach seems more tolerable for a set of individuals with known backgrounds because the elements in their history can aid in influencing the signal-to-noise ratio in favor of the signal.
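As a rough sketch of what a tuned threshold might look like, the snippet below flags any point in a series of mood scores that departs sharply from the trailing baseline. Everything here is an assumption on my part: the scores are taken to be one number per profile snapshot, and the `window` and `threshold` values are placeholders that would need tuning against real data.

```python
from statistics import mean, pstdev

def deviations(scores, window=5, threshold=2.0):
    """Return indexes whose score lies more than `threshold` standard
    deviations away from the mean of the preceding `window` scores."""
    flagged = []
    for i in range(window, len(scores)):
        baseline = scores[i - window:i]
        spread = pstdev(baseline)
        if spread == 0:
            continue  # flat baseline: no scale to measure deviation against
        if abs(scores[i] - mean(baseline)) / spread > threshold:
            flagged.append(i)
    return flagged
```

A consistently negative series that suddenly tips positive, such as `[-2, -3, -2, -3, -2, 4, -2]`, is flagged at the point of the jump. Note that the check is symmetric: a swing towards positive words trips the trigger just as a swing towards negative words would, which matters in a case like the one above.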
Some Additional Thoughts. The prior case study was just one illustration of the many impacts of using social media to capture the psyche of individuals. Here are some additional thoughts:
There are positive and negative implications of targeting individuals (or groups). In the first situation, it is easy to see how Jack’s online activity was used to get a better understanding of his psychological state, and how something like this could be extended to aid manipulation and influence by a malicious entity. In the second situation, it is clear to see how the visualization of feelings expressed online may aid investigators in obtaining further insight into a given case.
The victim is the volunteer. Individuals with social presence online willingly contribute and volunteer data that can facilitate the mapping of their psyche. This is in contrast to the Orwellian sense, where information is extracted from the victims in an intrusive way.
The data set is genuine. Most people do not over-edit their blog entries or Twitter messages to conceal emotions.
The study of an individual’s online presence and its correlation to emotion and personality analysis is most likely to remain probabilistic. This introduces the risk of unfair analysis. For example: What does it mean for an individual to be identified, and in turn judged, as someone with a 15% chance of being a psychopath?
(online) Social privacy is an oxymoron. Social applications are, by definition, mutually beneficial to users within the system. If you sign up on a social networking application as Mickey Mouse to protect your identity, your friends will not be able to find you, thereby decreasing the value of the system to you. The popular social networking sites often promise privacy by implementing controls on certain tuples, yet as a user, it is important to understand that there is implied and indirect information within the system (such as connections between networks and the cases presented in this article) that cannot be concealed without destroying the core use-cases of the social application.
To conclude, I sincerely hope this article facilitates further discussion around the topics presented. You may feel that the probability of fruition of some of my thoughts and ideas is low. Perhaps you may find them extremely fantastical, or perhaps you agree that the scenarios presented indeed have a high probability of being relevant in the near future. I am obviously intrigued by the topic and I’d be delighted to hear your thoughts.
My new article, titled "Google Your Site for Security Vulnerabilities" is available here. It includes google_vulns.php, a sample PHP script, which uses the Google API to search for sensitive data.
My new article, Installing and Configuring Nessus, is now available. Part 2 of this article will discuss NASL and how to write Nessus plug-ins. It should be up in a few weeks or so.