
So, this is happening.

Cold cases solved via online DNA profiles.

We talked about this sort of thing earlier in the term, though, perhaps, not in this way.

In just a few years, the DNA of every white person whose family is from Northern Europe will be identifiable in GEDMatch’s database.

I guess that’s great for people who want to find family members, but it means our DNA isn’t really private. That’s an issue. It can go down a bunch of different rabbit holes, most of them unpleasant.

Transparency of Algorithm

In Weapons of Math Destruction, one of the issues discussed is the transparency of algorithms, especially those that are deployed to “measure” human beings and have wreaked havoc on them. This reminds me of a vivid example that I saw on the New York subway: an advertisement for Seamless, an online food ordering service. In this funny advertisement, there are several pictures with captions that describe the characteristics of New York neighborhoods based on the data Seamless collected through its service and its interpretation of that data. For example, one of the pictures says “The most tender neighborhood- Fordham, Bronx Based on the number of orders of chicken tenders”. In this picture, a macho guy holds a chicken tender in one hand and a cute little kitty in the other. Obviously, this interpretation of the data is purely for commercial purposes. It plays on the word “tender” for humorous effect, so that viewers remember the service through the unreasonable but funny connection it makes between algorithmically analyzed data and the product. The company is transparent about how data and algorithms were used to draw its conclusions, which may impose a stereotype on a neighborhood. Thanks to that transparency, people who disagree with a conclusion will at least know how it was reached. In this series of advertisements, Chelsea is named “the most homesick neighborhood” based on orders of a home-made dish, and another neighborhood is named “the neighborhood with the most hot yoga” for having the most orders of kombucha. Every one of those names could affect how these neighborhoods are perceived.
If such perceptions and images have detrimental effects on the people living there, as the algorithms did to the teachers who were labeled “incompetent” in the school mentioned in the reading, then the methods through which those images and perceptions were created are of major significance, as they are the keys to solving issues of injustice and inequality.

However, transparency of algorithms is still at the mercy of big corporations, which keep them top secret in spite of the harm they can cause to individuals who are subject to unfair “measuring” based on such algorithms. Naming or categorizing individuals is high risk because such names or categories carry countless implications that may change one’s life tremendously. A socially responsible approach to algorithms should be adopted by corporations, the government, and individuals to make sure that we are not hurt by what we created for a better life in the first place.

Post Re: Cathy O’Neil

Before I begin my comments on WMDs, I would like to share with you that on my way home from class last week, I passed a sign outside of a Bank of America advertising their mobile “assistant”.  Her name was Erica.

O’Neil, in her conclusion, says that “big data processes codify the past. They do not invent the future.” This neatly sums up the arguments she’s made. The examples are clear. In schools, the “codifying of the past” is done wildly inaccurately. The performance indicators of teachers simply do not measure what they are meant to. This is the first type of problem introduced by WMDs. The response to the inaccuracies is not surprising. In the name of ease or of streamlining, or most likely in the name of cost minimization, teachers are held to standards that are inherently contradictory. How can a school measure the value added by a teacher of underperformers with the same algorithm that it uses to measure teachers of overachievers? The outcomes are not important here, only the seemingly priceless impact of essentially digitizing employee review. While pretending that taking humans out of the judgement process will level the playing field, it actually codifies the human error.

The example of teacher evaluation is the least threatening of the examples given by O’Neil in the assigned reading. Worse is the outright and blatant codification of existing systems and structures. Where value-added educator evaluation is an original model of measurement with new flaws, the use of WMDs in the financial industry is the codification of unoriginal, existing models that already have unfairness baked deep within. By using existing data, choices are made about the value of individuals without consideration of the data that has not already been collected, like using the zip code as a weapon despite an unmeasured propensity toward frugality, for example. This, arguably more dangerous, form of WMD highlights O’Neil’s point about “codifying the past”.

Reading these chapters, I thought about our prior conversations about digitization and datafication. The data had already been collected; vast swaths of information exist about individual insurance risk, policing patterns, or political motivations. The use of WMDs seems to me a type of digitization of our existing social structures and patterns. This calls for a new perspective. Why are we looking to data systems to fix the world when we cannot even create data systems that properly express the world as it is? O’Neil’s answer is this: the mathematical tools discussed can be used for good or for evil, for equity or inequality, to codify or to “create” our society. It is the human component that decides how to use these tools. Unfortunately, it appears that the same players involved in codifying, datafying, and digitizing our reality have very little interest in the human component at all, likely underestimating or even devaluing its role.

A new word went viral among Chinese netizens

I have had mixed feelings while reading the book Weapons of Math Destruction.
We get a great deal of convenience from all kinds of information feeds, such as news notifications, shopping recommendations, music suggestions, and sometimes even ads. We can catch up on the basics of daily life on the subway ride to the office in the morning, without subscribing to newspapers, journals, or magazines, or paying much attention to hunting for sales. The world seems to be moving toward perfection with the arrival of the Information Age and AI.
However, the book lists many cases of WMDs, revealing their negative results and showing the downsides. I became wary of the horrible consequences of the reckless development, evolution, and application of WMDs in various areas, including education, finance, and policing.
This reminds me of a term, “melon-eating masses”, which was coined by Chinese netizens and has gone viral in China over the past two years. There are many versions of its origin. The most common one traces it to an elderly person who was interviewed by a reporter and said, “I know nothing about it, I was just eating watermelon on the roadside”. Since then, Chinese internet users have often used it to describe a large group of passive onlookers at a major incident or event. In my opinion, the fired teacher, a victim of a WMD, is a member of the “melon-eating masses” because she couldn’t figure out why she got a score low enough to get her fired. The single mother, who can no longer arrange her child care after the introduction of a precise algorithm for scheduling work hours, could also be a member of the “melon-eating masses”: she may only be able to accept the “truth” that technological advances are improving the efficiency of work, without questioning the fairness of the algorithm. Even the designers and developers of the computing models, and the privileged politicians, could be members of the “melon-eating masses”, as data collection expands in scale, the algorithms grow more complex, the neural networks become more and more entangled, and the plans they plotted themselves get out of their control. This new term from the Chinese internet reflects a social phenomenon: the masses are a little desperate about their poor access to real information and are looking to those who govern to regulate the use of data, business modeling, and information feeds.

Good to see yesterday’s report: Facebook took down hundreds of pages and accounts that were spreading false or misleading political content ahead of the midterm elections.

https://www.wsj.com/articles/facebook-takes-down-hundreds-of-u-s-pages-it-said-spread-misinformation-1539289601?emailToken=29411f823c22631890df9e943e3debb2tap1voHRTPW1NnbNXyll2f7csI8gu6r5CyDg1SDgG+zu+U0/hkpCZLlehT6zKf+os3K+dKzJW+7fYrW+6AaXXTHt/+/2ime5T1IxtCG81gOONqzYtKaaiyRGQB5rlwC6l80n331F/lpAFoP6iPP4HA%3D%3D&reflink=article_email_share

Transparency

Apparently it’s not a thing in the Big Data universe. Everything from how data is collected, to who sees it, to how it is processed and analyzed is hidden from view.

As O’Neil points out in Weapons of Math Destruction, this is just part of the problem, but it comes back again and again.

The lack of transparency prevents any real analysis of the effectiveness of the various WMDs. Not only do we not know what data is collected, we don’t know how it is measured.

So, if the WMD is inaccurate, we really don’t have any recourse. We can protest it, but, more often than not, the powers that be will say, “This is what the data show.” They accept it as correct even though they don’t know what it’s doing.

The opaqueness of the process also prevents correction. These are closed systems. They don’t change until the coders decide they need to, and the coders may be resistant to change. After all, they came up with the data analysis to begin with, so they might think they got it right and resist evidence to the contrary.

I’m not saying that other issues aren’t important. They absolutely are, but the lack of transparency just gets to me every time.

Two blog posts I came across

The Government Is Blacklisting People Based on Predictions of Future Crimes

We were talking about this sort of thing last week. The government is putting people on the No Fly List based on things they might do, not what they have actually done.

Further, despite promises to the contrary, the government is not providing reasons why citizens are on the list, nor does it really give those on the list any real way to appeal the decision.

Meanwhile, this post says we should be suspicious of tech companies going to the federal government for privacy legislation.

This post points out that much of the consumer privacy protection legislation is being done at the state level. It was interesting to read how different states are doing things.

However, the author believes that the industry is appealing to the federal government to negate the legislative work done in the states. It is doing it using terms like “privacy regulation” because that has popular support right now.

Granted, both of these are from the ACLU’s website, so they might be slanted, but they raise disturbing issues.

On The Art of Forgetting

In The Googlization of Everything, the author raises an intriguing issue: the googlization of memory. In this chapter, the author discusses something important that is easy to overlook: the importance of forgetting. He gives several examples of how forgetting plays a more significant role for human beings than remembering, in spite of all the effort people have devoted to remembering things throughout history. They all demonstrate that in an age of information explosion, where “the scarcity has become plentiful”, it is more important to learn how to filter out what might become a burden to our thinking and forget about it, because otherwise we will get lost in the ocean of details and feel overwhelmed and anxious. As a long-time sufferer of anxiety issues, I have tremendous experience in not being able to forget things, things that hurt me in the past and left trauma in my sensitive memory. It was the inability to forget the past that impaired my ability to focus on the present. I kept seeing triggers of my traumatic experience, and they were constantly exaggerated by my memories and my mind. Finally, unable to cope, I turned to therapists and psychiatrists for help. Through medicine and training, I made tremendous improvement in focusing on what’s important in life.

Even worse, we will lose the ability to take anything new into our minds if we do not know how to properly forget. The author gives the example of his grandfather, who does not have any mental space for new things because the memory of the past is rooted too deep in his mind. Therefore it is more than necessary to train, or “discipline”, our minds to select what is useful for us and filter out what is not. In addition, it is important to forget because unforgotten information has the propensity to be “misused and abused”. Seemingly minor details can “come back and haunt us” in ways we would never expect.

Indeed, it is more important to forget than to remember in an information age where an abundance of information is accessible to us through tools such as Google. But the key question here is: what should we remember and what should we forget? Who decides, and how? According to the author, in an age of googlization, Google does it for us. As the author says, it opens up an abundance of information to us and filters out even more. What we need to learn is how to keep our judgement and work with Google to make the optimal choice. The author says that although Google makes it easy to both forget and remember things, he is the one who “choose(s) what elements to remember and comfortably ignore the rest”. “What matters is how we choose what to consider in our daily judgements and choices”.

However, the author also cautions that Google is doing this for us: the right to decide what to take into our minds and what to filter out is transferring from our parents and other adults to Google. Do we really know how Google does this? What is filtered out for us by the algorithms, and what, in the information that is filtered out, is valuable to us? Should we just hand this right over to Google? In other words, is Google reliable and capable of assuming the role that used to be played by parents and perhaps should be played by ourselves? In education, how do we teach students what to remember and what to forget? Although teachers are usually taught in teacher training classes that they need to ask students to critically filter and interpret information, it’s hardly clear what those required skills are and how to teach them. The author says, “It feels somewhat liberating that I don’t have to remember to remember very much”. My education in China was filled with memorization of everything. It is hard to transition into this new mode of learning and teaching in the Information Age, where an abundance of information makes me feel nervous and intimidated when I read. I don’t know how to select or filter when faced with tons of readings because I have developed the habit of reading everything very carefully for meaning behind the text. Adapting to the new trend requires special, step-by-step training.

This popped up today.

https://www.npr.org/sections/thesalt/2018/09/26/651849441/cornell-food-researchers-downfall-raises-larger-questions-for-science

I thought it was relevant because it discusses the use and misuse of big data, but from a different direction than what we have been discussing. Unlike FB and Google, which are businesses and are gathering/manipulating data for capitalist purposes, this one is academic.

Basically, the researcher involved kept analyzing data sets until he came up with something. That can be a good thing, since we should look at data from many different directions, but he seems to have been involved in p-hacking, which means running analysis after analysis and reporting only the ones that come out statistically significant, so that results look more significant than they are.

(I think. I’ve never heard of this before, and I’m totally going on the article. Stats isn’t really my field. I could be very wrong here.)
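To make the idea a bit more concrete, here is a rough sketch in Python of why "keep analyzing until something turns up" is a problem. It is not the researcher's actual method or data. Everything in it is an assumption for illustration: the function names are made up, the "data" is pure random noise with no real effect anywhere, and the significance test is a simple two-sample z-test that assumes a known variance of 1.

```python
import math
import random


def two_sided_p_value(sample_a, sample_b):
    """Two-sample z-test p-value, assuming both groups have known variance 1."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    z = mean_diff / math.sqrt(1 / n_a + 1 / n_b)
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def study_finds_something(num_outcomes, rng, group_size=30):
    """Return True if any of num_outcomes comparisons on pure noise has p < 0.05."""
    for _ in range(num_outcomes):
        # Both groups come from the same distribution: there is no real effect.
        group_a = [rng.gauss(0, 1) for _ in range(group_size)]
        group_b = [rng.gauss(0, 1) for _ in range(group_size)]
        if two_sided_p_value(group_a, group_b) < 0.05:
            return True
    return False


def false_positive_rate(num_outcomes, num_studies=2000, seed=0):
    """Share of simulated studies that report at least one 'significant' result."""
    rng = random.Random(seed)
    hits = sum(study_finds_something(num_outcomes, rng) for _ in range(num_studies))
    return hits / num_studies


if __name__ == "__main__":
    print("1 comparison per study:  ", false_positive_rate(1))
    print("20 comparisons per study:", false_positive_rate(20))
```

With one comparison per study, about 5% of the noise-only studies should look "significant", which is what the 0.05 threshold promises. With 20 comparisons per study, the chance that at least one comes up "significant" is about 1 - 0.95^20, or roughly 64%, even though nothing real is there. Reporting only the comparison that worked hides that.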

The NPR article linked to this article, which deals with p-hacking and its effects. I haven’t read it, but will try to for next week.

Googlization of Everything

I found The Googlization of Everything to be an interesting read. Starting with the “Book of Google”, we’re introduced to the idea of blind faith in Google. Not going to lie, after this reading I still have faith, but I can see a bit more clearly now the rain is gone. As with the introduction of cars and planes to society, tech companies like Google are innovative, good at what they do, and set their own rules. It’s only when things go wrong that regulations are put in place. These companies are left alone to innovate and in many cases dominate. I thought of Amazon as a parallel to Google for shopping. Their slogan is “Everything from A to Z” and their stock is pretty robust. Sure, there are other online retailers, but none of them can get me groceries, the newest N.K. Jemisin book, and a drum pedal in one day like Amazon can. As with Google, whenever I need a product my first thought is to check Amazon, even though there are many other places to get the item. We’ve seen big tech companies grow and monopolize their respective areas with little to no pushback. Which raises the question: what is the cost of neglecting to regulate the size and scale of these companies, what they are actually doing with user information, and how they are affecting society at large?

I also became a bit more aware of my habits. For example after Googling something I often say “Well according to the Internet blah blah blah” which isn’t bad, but I hadn’t considered that the only search engine I use is Google. Maybe Bing might have said something else. Who knows? I had not really thought much about it until reading this. In many ways Google has become synonymous with the web. The services they offer encompass all the reasons why you go on a computer: checking emails, answering  questions, watching videos, settling scores, mapping directions, and so forth and so on. It’s the perfect business model because you get users to stay on your site for extended periods of time while tracking their movements.

Lastly, one concept that stuck with me from the reading was this idea of “public failure”. The author describes public failure as a troubling phenomenon that occurs “when Google does something adequately and relatively cheaply in the service of the public, and public institutions are relieved of pressure to perform their tasks well” (p. 6). I think this stuck out to me in part because I work at a non-profit and often speak with my colleagues about the ways that nonprofits take on issues that are really the job of the government (education, job training, arts programs in schools, etc.). It’s way easier for the government to hand out funds and let nonprofits do the work than to fix certain issues itself. The services these private organizations provide their communities ease pressure on public institutions. So in that way I guess I had considered public failure before, just not in the context Vaidhyanathan mentions.