
The Googlization of Everything, by Siva Vaidhyanathan

Vaidhyanathan states: “If Google is the dominant way we navigate the Internet, and thus the primary lens through which we experience both the local and the global, then it has remarkable power to set agendas and alter perceptions. Its biases are built into algorithms. And those biases affect how we value things, perceive things, and navigate the worlds of culture and ideas” (pg. 7). His fundamental idea is that we should regulate search systems like Google so that they take responsibility for how the Web delivers knowledge to us, the general public.

In his examination of Google Scholar, he notes that the product ranks articles based on the citations they receive. When searching on Google Scholar, we are given results from across all the disciplines. This tool brings the titles of academic research to public attention, but the service is said to be flawed because, “according to academic librarians, Google Scholar has been constructed with Google’s usual high level of opacity and without serious consideration of the needs and opinions of scholars” (pg. 192). While it may be true that many of the search features of ‘academic’ search engines are lacking, Vaidhyanathan misses the importance of design: do present modes of academic search meet his objective of a system that allows for the easy acquisition of knowledge?

Thus, as in the past, librarians remain important. We can trust librarians because of their philosophy of protecting users and information. Librarians have always been a trusted source of knowledge filtering, and university libraries remain a stronghold of knowledge accumulation and storage.

Google’s mass digitization of books (i.e., Google Books) and its rising influence in higher education are alarming. Trusting Google with such important material, like all our academic heritage, is going a bit too far.

Reaction to The Googlization of Everything

I thought Siva Vaidhyanathan’s The Googlization of Everything was interesting. The discussion on the Googlization of educators and students especially resonated with me, because I’ve seen up close what the author discusses.

The author points out that Google is perhaps the dominant way in which we interact with the internet, and, as a result, wields considerable power. Its biases (found in its algorithms) become our biases. These biases affect us: when we do research via Google (and let’s not lie, we all do), how often do we look past the first page? It doesn’t matter if the information is accurate; it’s on the first page.

When I first started teaching at LaGuardia, I did an exercise which involved students googling Rev. Martin Luther King, Jr. At the time, the third or fourth link was to an MLK webpage that looked official but was completely wrong and racist. It was put up by Stormfront, a white nationalist hate group. I did this to show that you can’t totally trust Google’s results. (As for the website, it has since been taken down.)

I’m not blaming my students. I frequently do the same thing, and, in fact, have probably believed wrong information. It’s common. I was trying to show students what to look for on a website so that they could judge the accuracy of the information.

Google dominates research for our students. I mentioned on the first day of class that I run something called the States Project, wherein my students have to produce a three-minute-long video about one of the fifty states.

When we start the project, I tell them that they can find most of the information they need on their state’s official website, the state’s tourism board website(s), local news sources, and, for demographics, census.gov.

Students usually don’t use these sites. (Maybe I should require them.) I have had to ban certain websites, mostly because they’re either encyclopedias or aimed at kids. Typically, students type the question into Google and just go wherever it takes them. Further, if they don’t find the information on THAT website’s first page, they complain that “they can’t find the answer to the question.”

But, as I said above, I don’t know that I can really fault them that much. When I start researching a subject, I go to Google first. Of course, especially if it’s research for work, I do switch over to other databases, but I start with Google, and when I know nothing about the topic, I start at Wikipedia.

As the author said, the whole issue is information literacy: our students don’t really have info literacy skills. We have to teach them. I think we have to integrate Google into these efforts, because students are going to use it anyway. So, let’s give them the skills to analyze what they find and show them other databases to give them more options AND more access to scholarly work.

Speed Conference at Cornell Tech

Hi guys,

I’m just being a little bit off-topic regarding our readings but I promise I am still talking about data. Tomorrow I am going to a conference that has many promising presentations. It is called “Speed” and it will be at Cornell Tech.

It is free, though you have to submit a registration.

See you there if you decide it is interesting to you as well!

Post Re: Mayer-Schönberger & Cukier

The first section focuses on the difference between data and technology, two terms that are often conflated.  The anecdote about seafaring is used as a clear lens through which to read the rest of the chapter.  I was interested in the claim that “Amazon understands the value of digitizing content, while Google understands the value of datafying it.”  My initial reaction was that this felt uninformed, as it fails to consider the ways that Amazon does datafy, just not in plain sight.  Yes, I can see bar graphs on Google of other people’s searches, but Amazon has quietly datafied in such a way that the items marketed to me are no coincidences.  In general, this was something that I wished had been discussed more in the chapters we read (although it’s probably discussed in other parts of the book).  Data is incredibly useful to us when we are aware of it, but what about when we are not?

Reading these chapters, I was forced to reconsider my preconceptions about human advancement.  The idea that tech is our next “big thing” is basically undermined by the disparity between data and tech.  The “technological age” is not on the same timeline as, say, the Stone or Industrial Ages.  This raises the categorical question.  If our tech-crazed culture is not, in itself, a landmark moment of advancement, then what is?  Our ability for complex communications?  What does it mean that so many get left behind by these cultural changes?

One of the other items discussed is very timely.  They introduce social credit ideas, which I’ve seen on Black Mirror and which are literally underway in China right now.  It will be interesting to see how mass data is used to help (or more likely hurt) society at large.

I loved the Privacy/Punishment chapter that references Minority Report.  It’s always been one of my favorite movies (don’t judge me) so I’ve thought a lot about this.  The ethical questions need to precede the technical ones, but I suspect they won’t.  Why are people trying to predict crime for punitive purposes?  It’s frightening that this is what we build by default, as opposed to a data-driven system that could prevent crime by preventing the CAUSES of certain crime.

At the end of chapter eight, the authors bring up the precarious future of free will.  This, to me, seemed like the most likely place to find our society’s next large-scale change.  Digitizing collected data makes our world more accessible, but becoming collected data would change our relationships with ourselves and each other.

Reaction to Big Data by Viktor Mayer-Schönberger and Kenneth Cukier

This reading was divided into two parts: the collection of data and the pitfalls of Big Data. It provided a great many examples to illustrate its points, which was helpful. I felt that it was informative without being overwhelmingly so.

In the first part of the reading, the collection of data, a few things stood out to me: the history of big data, the amount of data collected, and the type of data collected.

I found the story of Matthew Fontaine Maury interesting, because he collected vast amounts of data, and from unusual sources. The datafication of journals was a brilliant idea. His work was not unlike research projects done in some classes (just on a much bigger scale). Maury’s case shows that solid research methods are important in data collection.

The sheer volume of collected data never ceases to amaze me. It’s pulled from so many different sources, and it just feels like privacy doesn’t really exist anymore, unless someone decides to completely unplug from their online existence. Even then, that person may have stopped new data from being collected, but the already-collected information is still out there.

I was boggled by what qualifies as data. Books and words? I’m a linguist: I suppose that I’ve looked at language as data for quite some time. That Facebook’s social graph involves over one billion people is astonishing.

It’s just… I’ve never thought about what parts of myself I’m surrendering by being online.

Granted, not all uses of this data are troubling, but enough are, which is what chapter eight is about.

Informed consent strikes me as important. In an ideal world, we would be able to give it, but I’m not sure we can. As data is collected and research on that data evolves, the reasons for data collection will change. This especially matters because the precautions taken to anonymize the data just simply don’t seem to work.
Does this mean that companies like Facebook should have to ask every six months or so? Or, perhaps, when a new data mining project is initiated? I don’t know. I do think the whole “Let’s just click on this privacy notice once” thing doesn’t seem to be adequate.

People are fallible, so they may decide to focus on the wrong data or analyze the data improperly.

Standardized tests in school struck a chord with me because I used to teach SAT test-taking skills: I know those tests can be gamed. Specific techniques have been developed to raise student scores, and those techniques aren’t about information learned in schools, but, rather, information about how to approach the questions. For instance, in the math sections, “the answer cannot be determined by the information given” is almost never correct.

As a result, I don’t think we can trust the data standardized tests provide, yet we still see many people holding them up as proof of learning.

The potential for abuse and misuse of data is great, and we have to watch out for it. I’m not sure how. I mean, I know enough about history to know that trusting the corporations to police themselves is a huge mistake, but I don’t know how we can manage it.

This reading raised some troubling questions for me.

Interesting reading and course

It took me two days to finish reading the two chapters of Everybody Lies, not only because of my limited English reading skills, but also because I spent a lot of time thinking about the interesting ideas expressed in the book.
What attracted me most was its copious examples of Google searches, ranging from politics (presidential voting and right- and left-wing viewpoints) to individual curiosity about sex. I can’t imagine what might happen in China if examples that discuss such sensitive topics were used in college courses. From these interesting reading materials, I saw how much America and China differ in culture, politics, and economics, as well as technology.
I don’t want to talk much about politics and economics, but I am interested in the technology. I’d like to express my thoughts on data technology by discussing a deadly incident that happened in China this past week. A young woman who had called DiDi’s car-pooling service was raped and killed by the driver. This incident provoked explosive outrage among the Chinese people, including me. The outrage was directed not only at the driver but mainly at DiDi, the company that provides the car-pooling platform, because it did not respond promptly and correctly to the victim’s requests for help before she was killed. From my technological viewpoint, DiDi should use more data analysis to enhance the security of its platform.
DiDi, a Chinese car-hailing giant similar to Uber, provides transportation services for 550 million users across more than 400 cities, including taxi hailing, private car hailing, and social ride-sharing. The incident happened on the ride-sharing platform, which matches customers with drivers who are not dedicated to providing hailing services. If the driver and the customer are traveling in the same direction, the platform notifies them, and the deal is made if both accept. Here is the issue that potentially led to the incident. Given how the car-pooling platform works, the driver and the customer don’t know each other at all. The customer has no way to evaluate the driver and relies totally on the platform before accepting the deal.
So did the platform provider do basic due diligence in verifying the driver? How is that verification implemented, and is it enough just to check identification and criminal history? This is now an era of big data and artificial intelligence. All kinds of individual data, such as social network activity and financial condition reports, might be used to analyze each individual driver.
For example, could the platform use feedback and comments from passengers, together with the driver’s previous and current financial conditions, to run a sentiment analysis on the driver, judge whether the driver lies or consistently behaves abnormally, and then judge whether he or she is qualified to act as a car-pooling driver?

These are my thoughts from reading Everybody Lies. I look forward to communicating with all of you and going deeper into Data, Place, and Society.