Post Re: Mayer-Schönberger & Cukier

The first section focuses on the difference between data and technology, two terms that are often conflated.  The anecdote about seafaring is used as a clear lens through which to read the rest of the chapter.  I was interested in the claim that "Amazon understands the value of digitizing content, while Google understands the value of datafying it."  My initial reaction was that this felt uninformed, as it fails to consider the ways that Amazon does datafy, just not in plain sight.  Yes, I can see bar graphs on Google of other people's searches, but Amazon has quietly datafied in such a way that the items marketed to me are no coincidences.  In general, this was something I wished had been discussed more in the chapters we read (although it's probably discussed in other parts of the book).  Data is incredibly useful to us when we are aware of it, but what about when we are not?

Reading these chapters, I was forced to reconsider my preconceptions about human advancement.  The idea that tech is our next "big thing" is basically undermined by the distinction between data and tech.  The "technological age" is not on the same timeline as, say, the stone or industrial ages.  This raises a categorical question: if our tech-crazed culture is not, in itself, a landmark moment of advancement, then what is?  Our capacity for complex communication?  What does it mean that so many get left behind by these cultural changes?

One of the other topics discussed is very timely.  The authors introduce the idea of social credit, which I've seen on Black Mirror and which is literally underway in China right now.  It will be interesting to see how mass data is used to help (or, more likely, hurt) society at large.

I loved the Privacy/Punishment chapter that references Minority Report.  It's always been one of my favorite movies (don't judge me), so I've thought a lot about this.  The ethical questions need to precede the technical ones, but I suspect they won't.  Why are people trying to predict crime for punitive purposes?  It's frightening that this is what we build by default, as opposed to a data-driven system that could prevent crime by preventing the CAUSES of certain crimes.

At the end of chapter eight, the authors bring up the precarious future of free will.  This, to me, seemed like the most likely place to find our society’s next large-scale change.  Digitizing collected data makes our world more accessible, but becoming collected data would change our relationships with ourselves and each other.

Reaction to Big Data by Viktor Mayer-Schönberger and Kenneth Cukier

This reading was divided into two parts: the collection of data and the pitfalls of Big Data. It provided a great many examples to illustrate its points, which was helpful. I felt that it was informative without being overwhelmingly so.

In the first part of the reading, the collection of data, a few things stood out to me: the history of big data, the amount of data collected, and the type of data collected.

I found the story of Matthew Fontaine Maury interesting, because he collected vast amounts of data, and from unusual sources. The datafication of journals was a brilliant idea. His work was not unlike research projects done in some classes (just on a much bigger scale). Maury’s case shows that solid research methods are important in data collection.

The sheer volume of collected data never ceases to amaze me. It's pulled from so many different sources, and it just feels like privacy doesn't really exist anymore, unless someone decides to completely unplug from their online existence. Even then, that person may have stopped new data from being collected, but the already-collected information is still out there.

I was boggled by what qualifies as data. Books and words? I’m a linguist: I suppose that I’ve looked at language as data for quite some time. That Facebook’s social graph involves over one billion people is astonishing.

It’s just… I’ve never thought about what parts of myself I’m surrendering by being online.

Granted, not all uses of this data are troubling, but enough are, which is what chapter eight is about.

Informed consent strikes me as important. In an ideal world, we would be able to give it, but I'm not sure we can. As data is collected and research on that data evolves, the reasons for data collection will change. This especially matters because the precautions taken to anonymize the data simply don't seem to work.
Does this mean that companies like Facebook should have to ask every six months or so? Or, perhaps, when a new data mining project is initiated? I don't know. I do think the whole "let's just click on this privacy notice once" approach doesn't seem adequate.

People are fallible, so they may decide to focus on the wrong data or analyze the data improperly.

Standardized tests in school struck a chord with me because I used to teach SAT test-taking skills: I know those tests can be gamed. Specific techniques have been developed to raise student scores, and those techniques aren't about information learned in school, but rather about how to approach the questions. For instance, in the math sections, "the answer cannot be determined by the information given" is almost never correct.

As a result, I don't think we can trust the data standardized tests provide, yet we still see many people holding them up as proof of learning.

The potential for abuse and misuse of data is great, and we have to watch out for it. I’m not sure how. I mean, I know enough about history to know that trusting the corporations to police themselves is a huge mistake, but I don’t know how we can manage it.

This reading raised some troubling questions for me.

Everybody Lies

  • Reconsider what we consider “data.”
    • Bodies
    • Words
    • Pictures
  • The search for information is information.
  • Google searches provide a unique insight into “human psyche.”

What was the most surprising result and why?

What limitations can we see with this method?

What is a research question that we can investigate with Google Trends and Correlate?
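
One way to begin exploring such a question is to pull search-interest data programmatically. The sketch below assumes the third-party pytrends library (an unofficial Google Trends wrapper, installed with pip install pytrends); the two keywords are arbitrary placeholders, not terms from the reading:

    # Minimal sketch: pull relative search interest from Google Trends
    # via the unofficial pytrends library. Keywords are placeholders.
    from pytrends.request import TrendReq

    pytrends = TrendReq(hl="en-US", tz=360)

    # Compare two example terms over the past five years in the US.
    pytrends.build_payload(kw_list=["flu symptoms", "anxiety"],
                           timeframe="today 5-y", geo="US")

    interest = pytrends.interest_over_time()  # pandas DataFrame, values scaled 0-100
    print(interest.tail())

A quick plot of that DataFrame is often enough to suggest whether a search term tracks a real-world phenomenon closely enough to be worth turning into a fuller research question.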

link: “23andMe Is Terrifying”

https://www.scientificamerican.com/article/23andme-is-terrifying-but-not-for-the-reasons-the-fda-thinks/

“…That’s just the beginning, though. 23andMe reserves the right to use your personal information—including your genome—to inform you about events and to try to sell you products and services. There is a much more lucrative market waiting in the wings, too. One could easily imagine how insurance companies and pharmaceutical firms might be interested in getting their hands on your genetic information, the better to sell you products (or deny them to you). According to 23andMe’s privacy policy, that wouldn’t be an acceptable use of the database. Although 23andMe admits that it will share aggregate information about users genomes to third parties, it adamantly insists that it will not sell your personal genetic information without your explicit consent.

We’ve heard that one before.”

Interesting reading and course

It took me two days to finish reading the two chapters of Everybody Lies, not only because of my poor English reading, but also because I spent a lot of time considering the interesting ideas expressed in the book.
The book drew me in with its copious samples of Google searches, ranging from presidential voting and right/left-wing viewpoints to individual curiosity about sex. I can't imagine what might happen in China if samples touching on such sensitive topics were used in college courses. From these interesting reading materials, I saw many differences between America and China in terms of culture, politics, economics, and technology.
I don't want to talk much about politics and economics, but I am interested in the technology. I'd like to express my thoughts on data technology by discussing a deadly incident that happened in China this past week. A young woman who called DiDi's car-pooling service was raped and killed by the driver. This incident provoked explosive outrage among the Chinese people, including me. The outrage was directed not only at the criminal driver, but mainly at DiDi, the company providing the car-pooling platform, because it did not respond promptly and correctly to the victim's calls for help before she was killed.

From a technology standpoint, DiDi should use more data analysis to enhance the security of its platform. DiDi, China's ride-hailing giant (similar to Uber), provides transportation services for 550 million users across over 400 cities, including taxi hailing, private car hailing, and social ride-sharing. The incident happened on the ride-sharing platform, which matches a customer with a driver who is not a dedicated hailing-service provider. If the driver and the customer are heading in the same direction, the platform notifies them, and the deal is made if both accept. Here is the issue that potentially led to the incident: under this matching mechanism, the driver and the customer don't know each other at all. The customer has no way to evaluate the driver and must rely totally on the platform before accepting the deal. So did the platform provider do basic due diligence in verifying the driver? How was it implemented, and is it enough just to check identification and criminal history? This is an era of big data and artificial intelligence, and all kinds of individual data, from social networks to financial reports, could be used to analyze each driver. For example, could the platform combine passengers' feedback and comments with the driver's previous and current financial condition, run sentiment analysis to judge whether the driver lies or consistently behaves abnormally, and then decide whether he or she is qualified to act as a car-pooling driver?
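
To make that sentiment-analysis idea a bit more concrete, here is a minimal sketch in Python using NLTK's VADER analyzer on a few invented passenger comments; the comments, the score threshold, and the flagging rule are all hypothetical illustrations of the approach, not anything DiDi actually does:

    # Minimal sketch: score hypothetical passenger feedback about one driver
    # with NLTK's VADER sentiment analyzer and flag clearly negative patterns.
    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download

    comments = [  # invented examples of passenger feedback
        "Very polite and drove safely the whole trip.",
        "He kept asking personal questions and it made me uncomfortable.",
        "Took a strange detour and got angry when I asked about it.",
        "Nice car, arrived on time.",
    ]

    sia = SentimentIntensityAnalyzer()
    scores = [sia.polarity_scores(c)["compound"] for c in comments]  # -1 to +1
    avg = sum(scores) / len(scores)
    share_negative = sum(s < -0.3 for s in scores) / len(scores)

    print(f"average sentiment: {avg:.2f}, strongly negative share: {share_negative:.0%}")

    # Hypothetical rule: send the profile to a human reviewer if feedback skews negative.
    if avg < 0 or share_negative > 0.25:
        print("Flag this driver profile for manual safety review.")

Real screening would of course need far more than sentiment scores, but even a simple signal like this could be combined with identity checks to prioritize which driver profiles a human reviews first.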

These were my thoughts while reading Everybody Lies. Looking forward to communicating and digging deeper into Data, Place, and Society with all of you.

link: “Yahoo, Bucking Industry, Scans Emails for Data to Sell Advertisers”

https://www.wsj.com/articles/yahoo-bucking-industry-scans-emails-for-data-to-sell-advertisers-1535466959

. . . Initially, Yahoo mined users’ emails in part to discover products they bought through receipts from e-commerce companies such as Amazon.com Inc., people familiar with the practice said. Yahoo salespeople told potential advertisers that about one-third of Yahoo Mail users were active Amazon customers, one of the people said. In 2015, Amazon stopped including full itemized receipts in the emails it sends customers, partly because the company didn’t want Yahoo and others gathering that data for their own use, someone familiar with the matter said. . . .

link: “Welcome to the Age of Privacy Nihilism”

https://www.theatlantic.com/technology/archive/2018/08/the-age-of-privacy-nihilism-is-here/568198/

A barista gets burned at work, buys first-aid cream at Target, and later that day sees a Facebook ad for the same product. In another Target, someone shouts down the aisle to a companion to pick up some Red Bull; on the ride home, Instagram serves a sponsored post for the beverage. A home baker wishes aloud for a KitchenAid mixer, and moments after there’s an ad for one on his phone. Two friends are talking about recent trips to Japan, and soon after one gets hawked cheap flights there. A woman has a bottle of perfume confiscated at airport security, and upon arrival sees a Facebook ad for local perfume stores. These are just some of the many discomforting coincidences that make today’s consumers feel surveilled and violated. The causes are sometimes innocuous, and sometimes duplicitous. As more of them come to light, some will be cause for regulatory or legal remedy.

But none of this is new, nor is it unique to big tech. Online services are only accelerating the reach and impact of data-intelligence practices that stretch back decades. They have collected your personal data, with and without your permission, from employers, public records, purchases, banking activity, educational history, and hundreds more sources. They have connected it, recombined it, bought it, and sold it. Processed foods look wholesome compared to your processed data, scattered to the winds of a thousand databases. Everything you have done has been recorded, munged, and spat back at you to benefit sellers, advertisers, and the brokers who service them. It has been for a long time, and it’s not going to stop. The age of privacy nihilism is here, and it’s time to face the dark hollow of its pervasive void.

Welcome to the Course

Hello everyone, and welcome to the class blog for DATA 74000: Data, Place, and Society. Here’s what you should do first:

1) If you have not done so for another class, sign up for a Commons account (you do not need your own blog for this class, but you’re welcome to create one on the system). Once you have a username, look for the “Join this Site” widget at the bottom right to add yourself as an “Author” on this site, which means that you can create, edit, and publish your own posts to this blog (you’ll need to do this for assignments). You won’t be able to make changes to the course documents or other students’ posts, so don’t worry about that. Need more help? Check here.

2) Familiarize yourself with the blog layout and the syllabus materials I’ve uploaded. You’ll find a link to the syllabus at the top along with assignments, policies, and the course schedule with links to all the readings.

3) Leave a comment on this post when you’re all signed up and introduce yourself (for example: give a first impression of the class, ask a question, tell us about your interest in the M.S., or say what you hope to get out of the class).