


Everybody Lies Book Analysis
1. Your Faulty Gut
This section introduces the reader to data science through a simple story. The author describes a Thanksgiving dinner at which his family gives him a hard time about not being in a relationship. His grandmother goes on to describe the type of girl he should be looking for and what kind of person would be perfect for him. He then explains that this is data science: using your experiences and knowledge to reach a conclusion about a given hypothesis. So, even as we move into far more complex problems, we keep the understanding that we are all data scientists. But this raises the question of whether data really matters if we are all technically data scientists. Stephens-Davidowitz explains that our experiences don’t always provide us with enough data. For example, he says on page 30, “It is unlikely that you-or your close friends or family members-have seen enough cases of pancreatic cancer to tease out the difference between indigestion followed by abdominal pain compared to indigestion alone.” This illustrates just how little we know, no matter how much we think we know, and hints at some of the extremely complicated problems data can solve.
2. Was Freud Right?
This section focuses on Sigmund Freud and the thought that our personalities are built upon interactions. Stephens-Davidowitz conducts a study centered on typos to see whether they correlate with deeper desires within the human psyche, as Freud would argue. To do this, he developed a computer program that makes errors as frequently as people do. Because a program has no hidden desires, any suggestive typos it produces have to be accidents, which gives a baseline to compare real typos against. He then analysed this data and found no clear correlation between typos and what people desire. Almost all typos are just simple mistakes. To sum this up, on page 49 he notes, “If a monkey types long enough, he will eventually write ‘to be or not to be.’ If a person types long enough, she will eventually write ‘penistrian.’” This is one reason big data is so powerful: he was able to test a claim about the human psyche using nothing but typos.
Stephens-Davidowitz then outlines the four unique powers of big data that his entire argument is built upon. Number one, big data offers new types of data. Number two, big data provides honest data. Number three, big data allows us to zoom in on small groups of people. Number four, big data allows us to do many causal experiments. With these things in mind, we are able to take a closer look at the way people think, how that translates to actions on the internet, and finally, actions in real life.
3. Data Reimagined
This section starts by comparing the speed of big data with government-collected statistics such as the unemployment rate. The government takes a long time to calculate these rates and release them to the public, but with big data, that may change. Stephens-Davidowitz ran U.S. unemployment rates from 2004 to 2011 through a platform called Google Correlate. He found that the searches most correlated with unemployment happened to be “Slutload” and “Spider Solitaire.” He explains on page 59 what this means by saying, “This example illustrates the first power of big data, the reimagining of what qualifies as data. Frequently, the value of big data is not its size; it’s that it can offer you new kinds of information to study-information that had never previously been collected.” Google searches may be the most honest data collection method. People tell Google things and ask Google questions they would never say or ask to anyone else. This is clearly demonstrated by how surprising this result is. To me, this feels like an invasion of privacy. The author describes Google searches as an incredibly powerful tool, which is completely true, but how are researchers getting their hands on this tool? By using our searches. They are taking some of people’s most vulnerable searches and thoughts so easily. Anyone can access Google Trends and see how popular a search for almost any given word was in the past month. As the icing on the cake, we get absolutely nothing for it. If you took part in a focus group, my guess is you would get something in return. We are part of countless focus groups, with our data used daily not just by independent scientists for research projects, but by large corporations to target advertisements and connect us with groups they want us to connect with. Big data is powerful and, in my opinion, sometimes too powerful.
4. Digital Truth Serum
Digital truth serum is a real thing. As Stephens-Davidowitz explains, when we are asked questions by anyone, especially in person, we are driven to lie. He explains this on page 108 by saying, “A person who looks like your favorite aunt walks in…. Do you want to tell your favorite aunt you used marijuana last month?” I think this perfectly illustrates how people never want to disappoint. By revealing truths about yourself, you will most likely disappoint someone, and that doesn’t feel great. Additionally, the author explains that there is no incentive to answer those questions truthfully. Google searches do offer an incentive: to take an example from the text, if someone searches Google for the best racist jokes, they will be greeted with a large selection of pages with countless racist jokes. This is the second power: big data can show what people truly think. This is scary. Google sometimes knows more than our closest family and friends. People confide in Google to ask questions and learn things they would never ask anyone to teach them. We are giving Google our deepest secrets. Even a simple search can tell more about you than you may think. When paired with other searches, data scientists can craft an entire archetype for your “character.” Once again, we get absolutely nothing for this. Just by using Google, we give the company permission to take all of this data and do whatever it wishes with it.
5. Zooming In
Zooming in is the third power of big data. The author explains that because of the sheer quantity of information, past and present, we can answer incredibly specific questions. For example, Stephens-Davidowitz examined fans of certain MLB teams, the year they were born, and how old they were when these teams were in their prime. He found that there seems to be a key period, between 14 and 24, when many of our opinions about things are formed. He drew this correlation because the fans of these teams were all in this age range when the teams were winning big. This is an extremely small subset of people: fans of this team, born in this specific time period, who were this old when the team won the World Series. This is one of the most important aspects of big data. If this were a large corporation trying to target its product to a specific subset of people, imagine how specific this could get: what year they were born, where they live, what they like, what they dislike, what they search, and what they think. This is being sold and handed to powerful people with almost no regulations on what they can and can’t do. I don’t know about you, but that gives me the chills. Some dude chilling in his mansion is using me to buy another Bugatti. But this does allow us to answer some pretty important questions. One study took where people grew up, how much money their families had when they were kids, and how much money they make as adults, and analysed in which cities kids with poor parents have the greatest chance of becoming rich. It found that San Jose has the greatest percentage and Charlotte has the least. This is an interesting question that couldn’t have been answered without this data. Clearly there are some good questions to be asked, but do the benefits outweigh all the negatives?
6. All the World’s a Lab
This section is all about correlation vs causation. This is really difficult to distinguish. Does one thing happen because of another, or do they just go hand in hand? To test this, a popular method called A/B testing is used. This is a type of testing most of us have encountered before. People are randomly split into two groups: one is the control group, and the other is the test group, in which one variable is changed. Because the groups are otherwise alike, a difference in outcomes between them can be attributed to the change itself rather than to mere correlation. This, paired with the immense power of big data, allows even more questions to be answered. These tests are incredibly common as well. On page 231 this is noted by saying, “Facebook now runs a thousand A/B tests per day, which means that a small number of engineers at Facebook start more randomized, controlled experiments in a given day than the entire pharmaceutical industry starts in a year.” That’s absolutely crazy, and perfectly illustrates the knowledge that Facebook possesses. They learn thousands of things every day about their users that they can use whenever they wish. Once again, I think this is scary because I know I don’t want a guy like Mark Zuckerberg, who built a website called Facemash where the Harvard community got to rate people’s physical appearances, to know more about me than my own mother.
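To make the A/B testing idea concrete, here is a minimal sketch in Python. It is not how Facebook or the author runs experiments; the click rates (10% and 12%), the group size, and the function names are all made up for illustration. It simulates a control group and a test group, then uses a standard two-proportion z-statistic to ask whether the gap between them is bigger than chance alone would usually produce.

```python
import math
import random


def simulate_group(num_users, click_rate, rng):
    """Return how many of num_users clicked, each with probability click_rate."""
    return sum(1 for _ in range(num_users) if rng.random() < click_rate)


def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic for the difference between two click rates (pooled standard error)."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / std_err


# Hypothetical experiment: the rates below are invented, not taken from the book.
rng = random.Random(42)
n = 10_000
control = simulate_group(n, 0.10, rng)  # group A sees the unchanged page
variant = simulate_group(n, 0.12, rng)  # group B sees the page with one change
z = two_proportion_z(control, n, variant, n)
print(f"control rate={control / n:.3f}, variant rate={variant / n:.3f}, z={z:.2f}")
```

A z value well above about 2 suggests the difference is unlikely to be luck. The important design choice is that the only thing separating the two groups is the change being tested, which is why the result can be read as cause and not just correlation.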
7. Big Data, Big Schmata? What It Cannot Do
Big data can do a whole lot, but haha, it can’t do everything (thank god). For example, it cannot yet predict the stock market. Researchers attempted to use key Google search terms to gauge the country’s mood, and then use that to predict how the market would change. But that didn’t work. Stephens-Davidowitz explains the “curse of dimensionality”: the more variables you have, the more room there is for something to appear to work purely by chance. For example, if you flipped 1,000 coins every morning and guessed how each would land, maybe after one week coin 678 would have matched your guess every time. This is now your lucky coin. But you would be far less likely to find such a coin if you only flipped 100 coins every morning. Since big data gives us so many variables, it can be incredibly difficult to tell whether something is due to chance or whether it actually fits. For example, a study was done to see if genetics correlated with high IQ. With the first group, researchers found a correlation with the IGF2r gene, but with the second group, that correlation could not be found. This leaves a lot of room for errors if experiments aren’t conducted properly, so these conclusions may be entirely inaccurate. False conclusions are never a good thing because they lead people to believe in a false reality, one in which they may think their lives were positively or negatively affected when they were not. This is another incredibly scary thing about big data and the lack of restrictions put on these large companies.
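The lucky coin example can be checked with a short Python sketch. This is my own illustration of the idea, not code from the book, and the function names are made up. It assumes fair coins, a guess of heads every day, and a one-week (7 day) window, and counts how many coins match the guess on every single day.

```python
import random


def lucky_coins(num_coins, num_days, seed=0):
    """Count coins whose flips matched our guess on every one of num_days."""
    rng = random.Random(seed)
    perfect = 0
    for _ in range(num_coins):
        # We guess heads each day; a fair coin agrees with probability 1/2.
        if all(rng.random() < 0.5 for _ in range(num_days)):
            perfect += 1
    return perfect


def expected_lucky_coins(num_coins, num_days):
    """Average number of 'perfect' coins we should expect from chance alone."""
    return num_coins * 0.5 ** num_days


for coins in (100, 1000):
    print(coins, "coins:",
          "expected", expected_lucky_coins(coins, 7),
          "| simulated", lucky_coins(coins, 7))
```

With 1,000 coins, chance alone is expected to hand you about 8 “lucky” coins in a week (1000 / 2^7 ≈ 7.8), while with 100 coins the expected number is below 1. More variables means more things that look like they work without actually working.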
8. Mo Data, Mo Problems? What We Shouldn’t Do
The internet is a crazy place. What you put on the internet can affect everything you do. For example, if you gamble, you may be affected by your online profiles. Casinos aren’t worried about people winning, because the house always wins, but they do need people to keep coming back. So if Karen comes in, loses 3k and doesn’t come back for 2 months, the casino has lost a customer. So, casinos will find a gambler’s “pain point,” which is the point at which they will stop playing and not come back. With this knowledge, casinos will work to stop you just before you reach that point by giving you a free steak dinner or a massage, or finding another way to get you to stop playing and come back later. To get information like this, companies will use doppelganger searches to find people like you and use data about their personalities and “pain points” to predict what your pain point is. So, we are constantly being grouped based on certain elements of ourselves to predict how we and others will behave. WE ARE BEING USED!!! It’s funny, because the author is a data scientist himself who views all of this data as awesome. He never mentions all of the bad things about all of this information being out in the world. I feel like this is unfair because, sure, this data tells us more than we could have ever imagined, but we are getting nothing for giving people like him all of this useful information.
9. How Many People Finish Books?
Gonna be honest, no idea how I actually read this whole book through. It has been a while since I have read a book on my own time all the way through (besides the U.R. I guess). The conclusion was really just about how hard he thought it was to write a conclusion. He mentions the four main abilities big data has, and how most people still don’t fully view data science as true science. But Stephens-Davidowitz believes that data science is the future of science because the internet will continue to grow as the major force it is. We will only continue to give these companies even more information about ourselves, which will benefit them and, without any change, will continue to provide zero benefit for us. So, we really need to read the fine print when we check the box to give all of ourselves to Google and Zuckerberg, or fight to receive some sort of compensation for the knowledge we have given them.