
Big Data & Singularities: Creativity as a Basis for Re-thinking the Human Condition

How should we think about big data?

Harold P. Sjursen

January 31, 2020

(This was written for the inaugural issue of HAS magazine)

Such a large and interesting topic seems to call for a theory of everything. How can we begin to approach it, and why is it important? Despite its au courant focus on the new knowledge embedded in and now being released from big data, the questions being posed are perennial themes of philosophy appearing to us in a new guise. The prisoners in Plato’s Cave Allegory were likewise called upon to re-think the human condition based upon the unveiling of new knowledge previously sequestered behind the veil of false appearances. By mining the depths of big data, its proponents argue, we will see through false constructs and understand in what sense we too have been prisoners, and subsequently redefine the human condition and thus be better able to place ourselves on the road to liberation.

Let’s start with a story.

It’s Manhattan in the 1960s and everything seems up for grabs. Two priests who were boyhood friends growing up in Brooklyn in the 30s keep up their friendship by meeting weekly for lunch. One is a Jesuit, cerebral, intellectual, intense; the other a Franciscan, compassionate, relaxed, living to realize Pax et bonum. Their boyhood friendship is nurtured by the guilty question: Is it acceptable to smoke and pray at the same time? They meet weekly at a small Italian restaurant just south of Greenwich Village and over eggplant parmigiana discuss pressing issues. Inevitably their conflict over the propriety of smoking and praying simultaneously takes theological form as the topic for reflection over lunch. The discussion follows the canonical methods of disinterested scholarship, hermeneutics and apologetics, eudemonistic ethics, and enlightened psychology; their prodigious collective memory consults the Bible, the Church Fathers, Augustine, Aquinas; they invoke positivistic accounts of language, the later Heidegger’s non-objectifying thinking, the principles of Carl Rogers’ client-centered therapy, but still the solution evades them. Their schedules are full, and they finally agree to pursue the question further the next time they meet. A week hence they return to the same restaurant, and upon arrival each notes a look of self-satisfaction upon the face of the other. “Father J, you’re looking rather pleased with yourself today,” says Father F. The Jesuit replies in kind, noting the Franciscan’s delight bordering on smugness. “Well, I have solved our puzzle,” the Franciscan says. “The answer is No!” His Jesuit companion, taken aback, retorts, “But that can’t be. We discussed it thoroughly and the answer is undoubtedly Yes.” After enduring a few moments of silent puzzlement Father J finally inquires: “What question did you pose?” Without any hesitation Father F confidently asserts, “Exactly the question we puzzled over: Is it all right to smoke while praying?” The Jesuit then allows that he thinks he understands the contradiction. “Ah, in our conversations we discussed praying while smoking.” For if while smoking one witnesses, for example, an act exemplifying the grace of God and responds sincerely with a spontaneous prayer, of course that is acceptable and proper. But on the other hand, if one is, for example, in the midst of fulfilling the priestly duty of administering the holy sacraments, then smoking would be an abomination! It’s all in how you frame the question.

But how do we frame the question, and indeed, given the resources of big data, what are the questions? The theme of this inaugural issue of HAS Magazine connects big data, creativity and the human condition. Big data as a concept within the engineering discipline of informatics was singled out at the beginning of the 21st century. Its famous definition, advanced by Doug Laney (an analyst at Gartner), concisely identifies the potential and the challenges before us:

"Big data" is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.[1]

Like the priests in the story above, we believe that there are definitive answers to the existential questions of how we should live our lives or exist, if only we knew and could understand the sources. But for us, unlike our hapless pair, big data does not present itself within a set of canonical classics with established if disputed methods of interpretation. On the contrary, big data (some have said, like dark matter) is normally invisible to us; it is wildly heterogeneous, dynamic and in perpetual flux. Yet we believe that if only we can find the key to this treasure trove, the abundance of insight unlocked will allow us truly to pursue the good. Today this kind of techno-optimism may be somewhat muted, but still our hope is that with the right computational heuristics we will be able to mine the data and organize the needed information in a manner that will yield key information permitting the best decisions and ultimately the solution to our most vexing and threatening problems.

These aspirations allow for a variety of creative approaches or creativities. Just as there are many ways to search for pebbles on the beach and just as many ways to use or play with those collected, so is our imagination given a full range of opportunity when facing the expansive universe of big data. Will such creativities express insight and will they lead us to understand the existential dilemmas of the good and how to live well? More to the point, perhaps, will they initiate or advance re-thinking the human condition?

The proposition that the conjunction of big data, creativity and thinking offers a possible way to understand the human condition radically reframes the enduring questions behind the central admonition of Socrates to “know thyself.” Socrates was surely suggesting a moral imperative, something we ought to do for the sake of living the good and just life. But what knowing oneself really means and how one goes about doing so are persistent and open questions. The very idea that the use of big data can facilitate a better awareness or understanding of the human condition is both novel and, from a traditional philosophical point of view, against the stream.

In the philosophical tradition the relationship of thought, knowledge and understanding to action or praxis has been much discussed without a strong consensus. One can find arguments for both their mutual distinctiveness and separation as well as for the contrary notion that on some level they are the same. Common sense suggests (as reflected in Doug Laney’s formulation) that thinking precedes action, the effectiveness and quality of which is enhanced in rough proportion to the accuracy, detail and correctness of the thinking. Thus, it is assumed that thinking prepares the way for action, and the better informed the thinking is, the more likely it is that successful action will follow. But is the collecting and analyzing of big data a mode of thinking such that the commonsense belief that it can improve action is in order?

The results of big data mining can hardly be likened to the standard body of scientific evidence, let alone to the contemplation of personal experience. Our awareness of big data is almost hypothetical. Of course, in ordinary experience we are also frequently removed from crucial evidence, which is invisible to us and mediated by technology such as a microscope, and in this sense big data superficially resembles much scientific information. But this sort of scientific evidence, produced through laboratory experimentation or field work, is normally an enlargement of something of which we have an immediate awareness, a clue or symptom of an underlying complexity. In the case of big data, the situation is different; what is purportedly disclosed comes as a surprise because we did not have evidence suggesting it, only theoretical conjectures. For this reason, it can be compared to dark matter, which we know about only inferentially. It is necessary for the universe to hold together, but what we know of it is hardly more than that. So the matter of big data may very well influence our lives in significant ways and in ways of which we are unaware. Knowledge of it might change our understanding of the human condition. This may be the premise motivating data mining.

But big data is more than a matter of practicality. It has inspired creative appropriation by artists like my friend and colleague Luke DuBois.[2] DuBois is an academically trained musician (in performance and composition) and visual artist who is completely at home in the world of digital media. Truly an artist, he nonetheless thinks of himself, as do many other contemporary artists I know, as a kind of engineer, in the understanding that engineering is what artists actually do. One of his most interesting recent projects has been reported with great enthusiasm in the business press, probably because of its deployment of attitudes toward big data that seem to resonate with Doug Laney’s famous definition.[3]

DuBois’ approach is both ironic and challenging. He encourages us to think about what reality is: not an abstract cosmological account of reality, but the reality of our day-to-day lived experience. He does this by mining a database, namely the record of terms that members of online dating services use to describe themselves. DuBois describes this project as follows:

“A More Perfect Union [the title of the project] is a large-scale artwork based on online dating and the United States Census. In progress since 2008, the work attempts to create an alternative census based not on the socio-economic fact but on socio-cultural identity.

“In the summer of 2010 I joined 21 different online dating services and “spidered” their contents, downloading 19 million profiles of single Americans. These profiles were sorted by zip code and analyzed for significant words. A series of national, state and city maps (43 in all) that show this data in various ways. Most notably, a set of prints shows a road atlas of the United States, with the city names replaced by the word used by more people in that city than anywhere else in the country. This lexicon of American romance, as it were, consists of more than 2000,000 unique words, and gives an imperfect, but extremely interesting perspective on how Americans describe themselves in a forum where the objective is love.”[4]

In this project, large heterogeneous data sets are culled and juxtaposed revealing an aspect of ordinary life with a new and surprising focus. The subject, how one presents oneself when seeking romance, addresses something of our understanding of the human condition, suggesting how we understand basic human characteristics such as erotic desire and the need for companionship. Importantly, however, it also indicates that we don’t know and might not recognize ourselves in this context without the kind of analysis this project reveals.

As it was reported in the Financial Times, “What [people like DuBois] are doing is trying to convey the secret life of data in a way that is elegant and exciting … we have gone from a very literal view of data to a very emotional view.”[5]

This project would seem to satisfy the elements of the proposition that through creativity big data can help us to redefine and thus better understand the human condition. But is that what is actually being done? Are enumerated and correlated records of large amounts of human behavior (statements or actions) indicative of what makes humanity what it is? Does this enhance our insight and lead to better decision making? Pragmatically, perhaps. If knowledge of the most successful terminology for finding a romantic partner will lead to my greater success in finding such a partner for myself, then in that sense it can guide me to making a better decision. This seems doubtful, but even if it is the case it does not afford anything like a better understanding of the human condition. And if this is how we make decisions, are we following our inward light, are we in possession of any genuine insight, or are we merely performing a calculative process possibly devoid of any understanding whatsoever?

We mentioned Socrates’ admonition to know thyself as conveying a moral dimension to the human condition. Self-knowledge, however, is often elusive; Socrates’ injunction is more than a moral admonition: it is an epistemic challenge as well. How does one know oneself? Our introspective self-examinations may lead us to reinforce beliefs that obscure genuine self-understanding. Are summations of the data of our lives any more auspicious a path to self-understanding?

Another of DuBois’ projects engages this question. He explains the work, called Self-Portrait, 1993-2014, in this way:

“The term quantified selfie was, to my knowledge, coined by Maureen O’Connor in 2013. Writing in New York Magazine (Heartbreak and the Quantified Selfie, 12/2/13), O’Connor discusses the Tumblr blog of journalist Lam Thuy Vo and the work of designer Nick Felton in the framework of a larger cultural trend in which the narcissism of social media and the ubiquity of Big Data collide in a new form of self-portraiture. These data portraits often co-opt, parodically or otherwise, the visual semantics of post-Tufte infographics for the purposes of generating content for the Millennialist online sharing.

The self-portrait I created consists of a force-directed graph of my email since September, 1993. In layman’s terms, imagine a “big bang” of a universe of personal and professional e-mail sent and received of 20 years; the different people in this universe have different mass and gravity, causing galaxies of attraction to form; those in constant dialogue with one another, or whose language is more familiar, or loving, have stronger bonds of attraction. The five or so primary e-mail addresses I’ve used over the years appear in the center of this star map, with the several thousand people I’ve corresponded to surrounding them in clusters of sentiment and carbon-copy.”[6]

Portraits both reveal and conceal something of the human condition; that is, they open our eyes to perhaps unnoticed dimensions of self-presentation while simultaneously protecting or reinforcing one’s position in the world. The official portraits of the president of a university, for example, are intended to show how an individual embodied the spirit of the institution while both preserving its legacy and leading it forward to master the new challenges of the future. That is to say, portraits create a person, institution or event while asserting its natural compatibility and salutary relationship with the human condition writ large. The veracity of a portrayal is a function of its selectivity, and no less so with reference to the results of previously unnoticed factoids uncovered by data mining.

So how seriously should we take efforts to reframe the world according to the results of big data disclosures? DuBois’ ironic re-descriptions of commonplace beliefs are playful and a reminder that what we see is sometimes little more than just what we want to see. Our understanding of the human condition, no less than our seeing the world around us, is an intentional act, formed and guided by tradition and necessity. The humorous question “Is it OK to smoke and pray at the same time?” illustrates this aspect of our understanding of the human condition. Big data indeed provides the platform for creatively redefining the human condition, but is it a disclosure of truths hidden deep within the human collective psyche or, on the contrary, an arbitrary collection of things and events that we find as evidence in support of our contingent desires?

Consider the three components of Doug Laney’s definition of big data:

1. high-volume, high-velocity and high-variety information assets

2. that demand cost-effective, innovative forms of information processing

3. for enhanced insight and decision making

We notice that the source (1) is not accessible to ordinary observation or comprehension. It is too vast, changes too quickly and is too diverse for that. Normally invisible, these characteristics may evoke a sense of awe when we first become aware of them. Next it is asserted (2) that this awesome source makes demands of us, viz., we are to know it through innovative information processing. Normative modes of information processing will not do. And finally (3) those who inquire in the proper way will be rewarded. The anti-democratic message is implicit, for obviously not everyone, not even most people, but only a select few, the philosophers or high priests of big data, can access this source, and they, at their discretion, mediate the enhanced insight they possess for the benefit of the many.

This doctrine has been put forward before; politics and religion both offer examples. We have mentioned the Platonic version as found in the Republic. The gnostic paradigm suggests another and perhaps more insidious version. According to the Gnostics of late antiquity the truth was concealed, and humanity was generally imprisoned in a body surrounded by veils of ignorance. A secret message was conveyed to a select few, providing the salvific key to break out of this constraining environment and move on to understanding and liberation.[7] Is it too great a stretch to think of big data in these terms, viz., as an unapproachable deity that can provide the secret message that will lift the veil of ignorance and bring humanity to a brighter future? Are artists like Luke DuBois or analysts like Doug Laney the purveyors of such a secret message?[8]

If we believe Aristotle, the human condition is one not of certainty but of wonder. The question of purpose, the purpose of action, and the belief that there must be purpose, that things make sense, support the conviction that with enhanced insight beneficial decisions are possible and that progress can be made. Behind the idea of progress is the assumption of fixity, a stability against which motion toward a goal is possible. On this view the human condition is largely a quest for understanding.

This belief in progress and the quest for certainty foment the crisis of modernity from Descartes to Kant. For Descartes the discovery that what appeared to be and was evident to ordinary observation, and which was validated by metaphysics beginning with Aristotle, was false called for the wholesale and radical reassessment of all knowledge. His method was to disbelieve, or at least to doubt, all that one had been taught and that had been confirmed by experience as correct. Descartes called this discovery our new knowledge, a precarious formulation that ultimately required the severing of mind from body and the declaration that God is no deceiver to legitimate it. The faith that Descartes’s God required was in the enhanced insight afforded by modern mathematics (of which Descartes was a prominent founder). Descartes’s assertion of the efficacy of mathematical rationality both succinctly to summarize the true nature of the physical world and to demark the limits of human insight was eventually capped and partially refuted by Kant’s famous declaration that “I had to deny knowledge in order to make room for faith.” Similarly, Kant asserted: “The schematism by which our understanding deals with the phenomenal world ... is a skill so deeply hidden in the human soul that we shall hardly guess the secret trick that Nature here employs.”[9]

Kant acknowledges, in this way like the advocates of big data theory, that the source of our knowledge (the noumena) is beyond our grasp, that what appears to us (phenomena) is due to the structure of human reason itself. The ways of nature are beyond our ken while still determinative of our well-being. Conformity to duty becomes the key ethical principle and guide for our actions and the basis of our hope.

The promise of big data is the claim to be able, through the data mining technology of information science, to penetrate Kant’s noumena, or in other words not to be constrained by the limitations of pure reason. The new knowledge disclosed is (or will be) salvific in that it is promised to put us on the road to progress. In this way it is supposedly possible to transcend the limits and constraints on the human condition as understood by Kant. This approach of big data is inherently gnostic: it is predicated on the communication of secret knowledge (from a demythologized deity) conveyed by a messenger to an elect few. The messenger of this secret knowledge is technology, aided for the present by human underlaborers. The salvific promise entails the subordination of human action to data mining technology. Indeed, given the presupposed complexity of the fields of big data, it must be the case that successful data mining can ultimately be accomplished only by computing devices managed by artificial intelligence. Clearly such an eventuality would redefine the human condition, the nature of human action and the existential meaning of being human.

An alternative way of conceiving the human condition, one that preserves the integrity of human action, has been suggested by Hannah Arendt. Let us approach her theory from the standpoint of thinking. Descartes’ famous designation of a human being as a thinking thing (res cogitans) of course raises the questions of just what thinking is, why it is the defining characteristic of humanity, and why it is that humans choose to think. Kant was critical of what he called Denker vom Gewerbe (professional thinkers) because thinking was the natural disposition of humanity. Yet when referring to the highest interests of humanity (for Kant God, Freedom and Immortality), he opposes those he mocks as the Luftbaumeister of reason, people who would try to establish the truth about these matters through arguments removed from all common experience and understanding. For Hannah Arendt the problem is precisely how to see thinking in terms of common experience and understanding. Mental activity that is disconnected from such understanding (as indeed the calculative heuristics of mining big data would be) cannot lead to action and our determination of ourselves as agents of the human prospect.

In her aptly titled book The Human Condition Arendt delineates several useful distinctions: the public and private realms; the vita activa (active life) and the vita contemplativa (contemplative life); and the three types of activities within the vita activa: labor, work and action. Unlike in the philosophical tradition, the contemplative life is not viewed as superior to the life of action; action is not dependent on the formative influence of thought, and the goal of action need not be to change understanding. Arendt is not simply inverting Marx’s 11th thesis. While Marx argues that humans are animal laborans, that is, defined by the necessity of labor, Arendt asks what happens if automation (AI technology) frees us from this necessity so that we don’t need to labor merely to survive. Work according to her scheme is different: whereas labor is what one does simply to survive, work has different goals and produces durable objects. Action, the third category, includes what we ordinarily call action as well as speech; it is the way by which humans present themselves to each other and is distinctly human. Being human implies the ability to act. It is through action that the human world is created and maintained and through which human community is sustained. But this is due to difference, not conformity to an unchanging essence: the human condition is contingent, beginning anew with each birth, and hence a matter of ever-changing possibility. “Human plurality, the basic condition of both action and speech, has the twofold character of equality and distinction. If men were not equal, they could neither understand each other ...”[10]

Arendt supplants the Cartesian mind-body dualism with more subtle distinctions in which human action is neither predetermined nor the emulation of an ideal type. Moreover, with her famous emphasis on natality, she underlines the fact that with each birth a new beginning, with new possibilities and hope, is established. A Hegelian view of history is ruled out. Like Kierkegaard, Arendt sees new individuals as the foundation of the human condition. These individuals are to be sure thinkers, but thinkers in the midst of lived experience contributing to the common realm of possibility by working through diverse opinions.

The 24th World Congress of Philosophy was held in Beijing in August of 2018. The general theme of the Congress was Learning to be Human. The Congress represented all branches of philosophy globally and vigorously pursued the general theme from multiple perspectives and methods. Big data was not a prominent concern among the participants and discussants. The idea of learning to be human stands out in an age when the notion of posthumanity is thought by many to be in its incipient stages or upon us already. The question of learning how to be human in this context assumes a new urgency. It goes a step beyond the Socratic injunction to know thyself in order to live well in accord with the good, beautiful and just; it becomes a question of how or if it is possible to co-exist in a world in which non-human entities, cyborgs in possession of intelligent agency, determine the social and cultural norms available to humans. It is curious, perhaps somewhat distressing, that the reality of big data, with its inextricable bond to such devices as intelligent robots, has not emerged as one of philosophy’s leading concerns.

As we have suggested, the accessibility of big data radically reframes the questions of what it means to be human and of the state of the human condition. This reframing challenges the traditional formulations of philosophy from antiquity and the Enlightenment. Big data is not available to us either through a rational, deductive logic or through sense perception, the two sources of all knowledge that Descartes argued were exhaustive. Moreover, given the dynamic and even volatile state of big data, an epistemology yielding certainty is out of the question. The approach advocated in the techno-business world suggests a dangerous Gnostic typology based upon privileged access to a body of hidden knowledge which can offer the enhanced insight necessary for a life of excellence. The mining of big data is offered as the new paradigm obviating approaches rooted in common experience. Arendt’s notion of action within a pluralistic world of competing doxa derived from experience in the public realm is likewise, on this view, inapplicable.

Where do we turn? It seems that the challenge presented by big data is how, in a world where decisions are based on aggregations of information that are beyond the parameters of natural access, it is possible to sustain an idea of humanity that preserves our unique status as agents who can pursue the good, true and beautiful. Creative attempts to redefine the human condition in works of art, as several of Luke DuBois’ projects do, suggest that rather than active agents we are caught unawares in the volatility of big data’s dynamism. This surely should be a question high on the agenda of philosophy’s quest to learn how to be human.

_____________________________________________________________

[1] https://www.forbes.com/sites/gartnergroup/2013/03/27/gartners-big-data-definition-consists-of-three-parts-not-to-be-confused-with-three-vs/

[2] https://engineering.nyu.edu/faculty/r-luke-dubois

[3] Financial Times, https://www.ft.com/content/7b3a2828-e440-11e2-91a3-00144feabdc0

[4] http://lukedubois.com/

[5] Ibid., Financial Times.

[6] Ibid., lukedubois.com

[7] The term gnostic paradigm refers to ideas held by the Gnostics of late antiquity but is broader than the inverted theological cosmology they proclaimed. See Hans Jonas, Gnosis und spätantiker Geist.

[8] I very seriously doubt that either has entertained anything like the gnostic typology. I mean only that their work hints at structural similarities.

[9] Both remarks are found in Kant’s Kritik der reinen Vernunft.

[10] Arendt, The Human Condition.