Notes for “Heuristics of Discovery”

Peter Bearman — Department of Sociology and INCITE, Columbia University (United States) — Contact: psb17@columbia.edu — http://www.incite.columbia.edu/peter-bearman/

Peter Bearman is the Cole Professor of Social Science at Columbia University. His new work focuses on identifying the neural signatures of social relations. In addition, most recently, with Adam Reich he is the author of Working for Respect: Community and Conflict at Walmart, forthcoming from Columbia University Press in June 2018.

Published: 2018-07-26

Acknowledgements

Comments from Mark Hoffman, Alessandra Nicifero, and Adam Reich are gratefully acknowledged as is support from the Interdisciplinary Center for Innovative Theory and Empirics (INCITE) at Columbia University.

Thinking about the problems the editors posed to us — how do we pick topics, what heuristics do we follow, what work processes do we use, and so on — made me realize that the hardest thing for me about any project is knowing when it is finished. That is one of the reasons why I’ve sometimes waited years between finishing papers and submitting them to journals, essentially unchanged after years spent in a box or file cabinet. Relations into Rhetorics was written in 1985 and mailed to the press in 1992; Chains of Affection was written in 1998, but not published until 2004; Becoming a Nazi was written in 1992, and published almost a decade later, in 2000. Early on in my career I thought this was a disorder caused by a very negative review of my first attempt to publish Generalized Exchange, in 1984 (finally published in 1997), which consisted, in its entirety, of the following lines: “This must be a word processor error because the tables come from one paper and the text comes from another.”¹ But this still happens to me now, and today there are papers I have come to think of as really good which remain deep in the closet. I’ve overcome whatever stress I had about reviewers and I now understand that the delays, early on in my career and now, are just because my papers are waiting for me to understand what their contribution could be. And that sometimes takes a long time to see.

Knowing when something is finished reflects what contribution we want to make in the first place. The contributions that I try to make share the ambition of creating beautiful things that have not been seen before. In this regard I think of my work as aesthetic in orientation. I think of the conventions that structure scientific work as comparable to the frames that bound canvases in painting — constraints that one works with because they make many of the hard decisions easier; they take them off the table. Because these constraints vary with the style of work, they also bound the character of the objects one can create, and so the choice of topic and style or problem and method are inextricably woven together. Not all papers are going to succeed entirely on the beautiful object dimension, and part of trying to figure out when a paper is finished is coming to grips with the fact that for whatever reason, usually a bad starting point, it can’t achieve what I had imagined, but that still, there is some part of it (a figure, a turn of phrase, an idea) that is beautiful enough.

We always wonder, or I always wonder, why people work on the problems and projects that they work on. Maybe that same curiosity was the motivation for this issue, on the part of the editors. It seems worth saying here that our methods are sometimes designed to provide answers to causal questions (though the typical explanation in our field is a just-so story) and sometimes the work I do also explicitly addresses causality. I have the perception that getting some causal estimation right motivates much work in our discipline — but for me, that is a secondary goal. I only mention this because, for those whose ultimate goal is different than mine, it is unlikely that my thoughts on the broad topic of heuristics for creating new objects and heuristics for knowing when to mail one's work to journals and presses will be at all useful.

So, in terms of structure, for this essay, I’ll talk about two heuristics that I use, connect them to work of mine by way of example, and then finish with the three things I learned from Harrison White.

1 Heuristic I: Use Relational Data or Induce a Relational Context

I still browse the shelves of others’ offices and libraries looking through books and archives and record repositories for relational data that appear systematic enough to exploit in one way or another. That is how I found the data structure — 221 221x221 square matrices identifying with a letter code how people living on Groote Eylandt on the eve of detribalization referred to one another using one of roughly 21 distinct kinship terms — that was at the heart of a paper I wrote on Generalized Exchange (Bearman, 1997). I knew that project was essentially finished when I was able to discover and reveal the hidden structure of their kinship system induced from a block model of kin terms. This structure was a perfect cycle; built from hundreds of violations of stated norms, a cycle based on categories that natives could not articulate but which actually produced, on the ground, a theoretical ideal, long imagined but never seen, a cycle for generalized exchange. I think I discovered the micro-mechanism undergirding the generation of cyclic exchange, the pursuit of balance in a context of stark intergenerational asymmetry in partner “choice”. But what sealed the deal was thinking about an amazing photograph of a trial by ordeal that took place on the beach, one day. A man stole a woman who he felt should be his wife but who was given to another man, whom he killed in the resulting fracas. By explicit native norms he was in the wrong for the theft, and therefore the murder. By the hidden structure revealed from the block model, he was the rightful spouse, the murdered man was an illegitimate interloper. All of the men on the island gathered. They stood fewer than 100 feet from the thief. The photo captures the moment immediately before they each threw a spear at him. They all missed. I remember thinking: “Now that is beautiful.” And sometime later, I mailed the paper off.²

I only once made the mistake of building a data set that was not at its core relational. That was for a project on desertion from the Confederacy (Bearman, 1991), and the paper was only saved by realizing that I could induce relationality (of sorts) by imagining that people who were listed next to one another in the census ought to live next to each other, since the census taker in 1860 had to walk from household to household to enumerate residents. By inducing relations through spatial proximity on the home front I could embed soldiers simultaneously in two communities: one arising from the units in which they served, the other arising from the micro-contexts — below the level of counties or towns — to which they could return. And from that, I could infer something about the ways in which the structure of their social relations shaped their identity, and hence their actions, at least with respect to desertion. What sealed the deal for me in this paper was that I could find a partition of household numbers which matched (in one case perfectly, in another closely) desertion timing for late but not early deserters. For me, the beautiful discovery was that localism — an identity that arises from relations with others — not interests (abstract or concrete) brought men into and out of the Confederate army, at least from North Carolina.
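The idea of inducing ties from enumeration order can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the desertion paper: the household identifiers are hypothetical, and the `window` parameter (how many positions apart two households may be in the census listing and still count as neighbors) is an assumption introduced here.

```python
def induce_neighbor_ties(households, window=1):
    """Link households listed within `window` positions of each other.

    `households` is the census enumeration order (a list of identifiers).
    Adjacent listing is treated as a proxy for living nearby, since the
    enumerator walked from household to household.
    """
    ties = set()
    for i, a in enumerate(households):
        # Pair each household with the next `window` households in the list.
        for b in households[i + 1 : i + 1 + window]:
            ties.add((a, b))
    return ties


# Hypothetical enumeration order, for illustration only.
census_order = ["hh_001", "hh_002", "hh_003", "hh_004"]
neighbor_ties = induce_neighbor_ties(census_order)
wider_ties = induce_neighbor_ties(census_order, window=2)
```

With `window=1` only immediately adjacent households are tied; widening the window trades precision for a denser induced neighborhood.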

Sometimes, one can build beautiful relational data structures but elements critical for the project are totally elusive. From my work with Add Health, I had become familiar and enamored with multi-level models. I thought then and still think now that they help us capture an aspect of the essentially Russian-doll reality of the contexts in which we are embedded and that they help us understand the ways in which context shapes our sense of self, and hence our action in the world. When I came to Columbia I wanted to shift back to historical work, and I thought a long time about what a model historical project — that is, a project which could serve as a model for a wide array of different research problems — would look like. And I understood that it would require a multi-level framework, with rich and very granular temporal data, on a large interlinked population of actors, whose linkages were also multi-level.

Driving those methodological considerations down to something realizable was more complex. After a lot of thought, I had the idea that I could undertake a study of mutinies if I could induce a data structure that captured every boat on the seas and every person on each boat at multiple moments in each day that the boat was on the water. Boats could be linked by sharing ports; and by sharing men; men could be linked by their sharing boats. Mutinies as a repertoire of action took off at a certain point — one could clearly see that there was an epidemic of them and so I needed then to capture boats over a long durée, before mutinies took off and after they largely disappeared. I first looked in the Atlantic but the boat data was too imprecise. Then, Emily Erikson discovered the East India Company Archive — a book which listed every boat that ever departed from England for the East under the aegis of the Company (and since they had a monopoly, that was pretty much every boat) and the individuals, above ordinary seamen, who were on it. The data were so precise and so uniform that we had the idea that we could measure long-term changes in climate based on trip durations only to discover that climatologists had actually already done it! By inducing boat overlap from sharing a stay at an Eastern port and by inducing ties between people by modeling their career mobility, as they moved from boat to boat over the course of multiple voyages (a similar strategy to what I had done in Relations into Rhetorics (1993), for preachers) we realized we could build the network structure that facilitated learning about how to mutiny.
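The linkage logic described above (boats tied by shared ports and shared men, men tied by shared boats) amounts to projecting two-mode membership data onto one mode. The sketch below is a hedged illustration, not the project's actual code: the ship, port, and officer names are invented, and it ignores timing, whereas the text induces overlap from a shared stay at a port.

```python
from collections import defaultdict
from itertools import combinations


def project(memberships):
    """Project (member, group) pairs onto member-member ties.

    Returns a dict mapping each sorted pair of members to the number
    of groups they share.
    """
    groups = defaultdict(set)
    for member, group in memberships:
        groups[group].add(member)
    weights = defaultdict(int)
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical records, for illustration only.
port_calls = [("Ship A", "Madras"), ("Ship B", "Madras"),
              ("Ship B", "Canton"), ("Ship C", "Canton")]
crew_lists = [("Officer X", "Ship A"), ("Officer X", "Ship B"),
              ("Officer Y", "Ship B")]

ships_by_port = project(port_calls)   # boats linked by sharing ports
men_by_ship = project(crew_lists)     # men linked by sharing boats
# Swap the roles to link boats through the men who served on both.
ships_by_men = project((ship, man) for man, ship in crew_lists)
```

The same function serves all three linkages because each is a co-membership relation; only the roles of member and group change.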

The thing about mutinies, though, is that like all social action they are motivated for some reason, and that, anecdotally at least, the reason was about the conditions on board — whether there was food, water, the maggot situation, cholera, the character of the punishment meted out by the Captain, getting stuck in the doldrums, and so on. Voyage logbooks, built from entries recorded every four hours, provided insight into ship conditions at a level of temporal granularity that was unprecedented. My plan was to get a sample of logs for boats that experienced a mutiny and those that did not, capture their network position, and understand quite precisely how knowledge about how to mutiny shaped the likelihood of actually doing it, if the conditions warranted. Emily went to England to extract the logs. After the first few arrived it became pretty clear that the plan was very deeply flawed. I had forgotten to consider the obvious fact that when boats experienced a mutiny the first thing to disappear was the logbook. So that project failed.

But because there was a relational data structure we were able to model the emergence of global capitalism (Erikson & Bearman, 2006). And here there was a special joy in being able to discover that the British were able to expand beyond the Dutch because their captains were cheats and crooks. What could be more beautiful than a single figure which suggested that capitalism as a global system arose from malfeasance?

2 Heuristic II: Discover and Represent Multiple Standpoints

My screen saver reminds me throughout the day when I open my computer that “it is better to travel through a single land with a thousand pairs of eyes than a thousand lands with a single pair of eyes.” I found this sentence years ago in R.D. Laing’s The Politics of Experience (Laing, 1967). Laing attributes the quote to Proust, and a friend of mine — a Proust expert — found something kind of like it³ in In Search of Lost Time, but the Laing version is too distant to say anything other than “attributed to Proust.” Provenance aside, this aphorism shapes my thinking about how we are to understand the social world. I wish I could say that I have, but I’ve never been able to make it through Proust. But from what I understand, In Search of Lost Time is about continuity. And that makes sense because understanding how continuity happens is irresolvable without being able to capture what contexts look like from multiple points of view, at every moment in time. And, not really as an aside, but relevant to the question of how one chooses topics, my interest now is to understand continuity, which I think has always been one of the hardest problems facing the discipline.

In ethnographic contexts, it is possible to capture the orientations of actors by standing on the edge of actors’ perceptions as they are seeing. Because one is in the setting, the good ethnographer can see how the multiple orientations compose the whole setting. In historical work this is much more difficult. Even situating ourselves within the framework of single actors is difficult. How can we see what actors saw without imputing our standpoint to them? How can we preserve the multiplicity of standpoints that characterizes a single setting? I’ve been working on this problem for a very long time. Interestingly, one of the reasons that classificatory kinship systems are so attractive to work with is that they radically simplify, through an incredible expansion of the language of kinship, the multiple standpoint problem. In classificatory systems every person in one section of the tribe can agree on the relations of every other pair of persons in the tribe. As White (1963, pp. 81–82) argues, “Of course I can always agree on how two people are related to each other by putting myself in one of their places as ego, but it is only in a classificatory system that I as ego can group others in exactly the same clusters of equivalence as they do.” Which is how it came to be that all the men on Groote Eylandt, at the same moment, missed in their trial by ordeal.

In Blocking the Future (1999), Moody, Faris and I interwove multiple life stories extracted from residents of a single Chinese village to induce a history of interlinked events that covered a half century of massive social change. We exploit the fact that actors’ life stories arise from different standpoints — they have to, since the life story is the narrator’s theory of how s/he got to where s/he is (wherever that is). We know that the standpoint of the life story is not the standpoint that the actor had at the moment of their action, but we know with equal certainty that it is not ours. Stacking those life stories on top of one another like we used to stack transparencies in grade school reports makes it possible to induce a single context from the multiple stories that cross it. Like the giant crab-like spanning tree in Chains of Affection (2002), the beautiful object at the end of Blocking the Future (1999) is built from local action, but could never be seen from a single perspective. It is a history of a small village that has to be — randomly re-wiring whole chunks of the past doesn’t change the structure — but which no one can see by themselves. And that is what we mean, I think, when we think about continuity. Each actor, doing their thing, on the short chains they are embedded in, contributing to each present in such a manner that most anything that happens preserves the opportunity structures they and others face, just as they were, for the next event, and the next.
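The transparency image can be made concrete with a toy example. The sketch below is an assumption-laden illustration rather than the method of Blocking the Future: it treats each life story as a list of "this event led to that event" links, stacks them by taking their union, and then asks what can be reached in the stacked history that no single narrator connects. The narrators and event labels are hypothetical.

```python
from collections import Counter


def stack_stories(stories):
    """Overlay narrators' event links; count how many narrators tell each."""
    stacked = Counter()
    for links in stories.values():
        stacked.update(set(links))  # each narrator counts once per link
    return stacked


def reachable(edges, start):
    """Return every event that can be reached from `start` along the links."""
    graph = {}
    for earlier, later in edges:
        graph.setdefault(earlier, set()).add(later)
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


# Hypothetical life stories: (earlier_event, later_event) links.
stories = {
    "narrator_1": [("e1", "e2")],
    "narrator_2": [("e2", "e3")],
    "narrator_3": [("e1", "e2"), ("e3", "e4")],
}

stacked = stack_stories(stories)
whole_view = reachable(stacked, "e1")                  # the stacked history
single_view = reachable(stories["narrator_3"], "e1")   # one standpoint
```

Here the chain from `e1` to `e4` exists only in the stacked structure: each narrator holds a short segment, and none of them can see the whole.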

In one of the best papers I have ever had the fortune of collaborating on — a paper written around 2008 on the conflict in Northern Ireland, with Hrag Balian, and still not published — this central idea is pushed to the limit. Here the data are all of the thousands of author-victim killing pairs, perfectly time-stamped, unfolding as a sequence of killings carried out by different groups. At any specific moment each group, relative to all others, can try to achieve a coveted end — revenge for a prior killing, dominance over other groups — but as with talk in meetings they cannot all act (speak) at once, and the next killing, whether theirs or that of another group, changes the structure for everyone, producing new opportunities. The paper conceives of a way of capturing which opportunities each group can see relative to all other groups at every moment in time. They look back through a window to their past with other groups. Network ties are no longer dots connected by edges; they are vectors of events extending over different calendric periods. And so, the image of history is not a graph linking one event to another from an Archimedean standpoint outside of the graph, but instead a series of multiple sequences seen from every perspective simultaneously. Inducing a picture of the dozens of interwoven event sequences in the Northern Ireland conflict from each group’s perspective was, for me, just staggeringly beautiful. It is inserted here, as Figure 1. The lines are group-specific histories. They start when groups have a motive to kill and stop when they no longer have one. Is it any wonder that civil conflicts last for generations if any specific moment in time is embedded in one or more event sequences defined by the presence of a reason to kill?

Figure 1: History as an Accordion
Note: The x-axis is a count of killing events, from 1 (the first killing) to 2300 (of more than 3000). Each moment is embedded in multiple unfolding event sequences; each actor is embedded on multiple lines; the past for each actor has multiple durations, and there is no uniform time.

Is this history as an accordion real? Simmel may or may not have said somewhere that facts are overrated and that if one can think something it is as good as if it were an actual fact.⁴ This is a version of what I take to be the structural conjecture, which I learned from Harrison White. The reason this makes sense — and this is another thing I learned from Harrison White — is that people are like plants in a hothouse. They just naturally get all intertwined. But what distinguishes people from plants (and following Levi-Strauss, from animals as well) is that they define some ties as ties that they cannot have. And the pattern that is revealed by the absence of ties points our way towards understanding the cultural rules that structure social life. This is why the vast majority of work on social networks says so little about social structure. Social structure arises from the absence of ties, not the presence. The so-called network science revolution which just looks at the presence of ties can’t really get to structure beyond epiphenomenal features, like power laws. But that is a digression.

The importance of the structural conjecture is hard to overestimate for finishing projects. One example of this comes from the work of Kate Stovel, whose discovery of a structure in county lynching histories — which appears only when memory is decayed using a specific functional form over seven years — is the proof that memory decays in that form over seven years. The structural conjecture is a simple and yet powerful idea: the structures are out there waiting to be discovered. It is our job to reveal them. When we find beautiful patterns, they are real enough to think with. And really, what more do we want besides an opportunity to induce new things to think with? Well, from my perspective, we want those objects to have some character of their own. And the character I want to maximize is aesthetic.

3 Conclusion

We all pursue our work for different reasons. I’m interested in discovering structures that can exist but are not known (which is the same as creating new objects). There are lots of ways to discover structures. The two heuristics I’ve discussed just happen to be the ones that I use and find useful for discovery. Even better, they also provide great stopping rules. Speaking of which …

Coda: The three⁵ things I learned from Harrison White

For this to be a heuristic it has to be something like: “Remember the three things I learned from Harrison White.” These are:

1. Trust your students, they are smarter than you;
2. Being completely wrong is better than being just a little wrong;
3. Look at things in reverse;
4. Leave technical problems for technical people.

References

Balian, H. & Bearman, P.S. (2008). Pathways to Violence: Dynamics for the Continuation of Large-Scale Conflict. Manuscript.

Bearman, P.S. (1991). Desertion as Localism: Army Unit Solidarity and Group Norms in the U.S. Civil War. Social Forces, 70(2), 321–342.

Bearman, P.S. (1993). Relations into Rhetorics. New Brunswick: Rutgers University Press, Rose Monograph Series.

Bearman, P.S. (1997). Generalized Exchange. American Journal of Sociology, 102(5), 1383–1415.

Bearman, P.S., Faris, R., & Moody, J. (1999). Blocking the Future: New Solutions for Old Problems in Historical Social Science. Social Science History, 23(4), 501–533.

Bearman, P.S. & Stovel, K. (2000). Becoming a Nazi: A Model for Narrative Networks. Poetics, 27(2–3), 69–90.

Bearman, P.S., Moody, J., & Stovel, K. (2002). Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks. New York: Columbia University Academic Commons.

Erikson, E. & Bearman, P.S. (2006). Malfeasance and the Foundations for Global Trade: The Structure of English Trade in the East Indies, 1601–1833. American Journal of Sociology, 112(1), 195–230.

Laing, R.D. (1967). The Politics of Experience. New York: Pantheon.

Proust, M. (1993). In Search of Lost Time: Volume 5: The Captive: The Fugitive. London: Chatto & Windus. (Translated by C.K. Scott Moncrieff and T. Kilmartin).

Stovel, K. (2001). Local Sequential Patterns: The Structure of Lynching in the Deep South, 1882–1930. Social Forces, 79(3), 843–880.

White, H.C. (1963). Anatomy of Kinship. New York: Prentice-Hall.

¹ For younger readers who do not know what a word processor was: they were short-lived machines that bridged the gap between typewriters and computers. They were new in 1984 and the reviewer was right: I used one. These kinds of reviews make one sensitive to theory/data gaps.
² Gerry Marwell, the editor of ASR who first rejected the sociology version of the paper — the earlier rejection was from an anthropology journal — wrote a little note on the paper which he sent back, which said: “You should talk to a senior colleague about how to write a paper.” He crossed out “senior” and wrote “junior”. He crossed out “junior colleague” and wrote “anyone”. It took a long time to publish that paper because sociology reviewers didn’t believe the idea that people could and did follow norms that they could not articulate and which operated on categories which they had no words for. In short, they didn’t believe sociology was possible.
³ “The only true voyage, the only bath in the Fountain of Youth, would be not to visit strange lands but to possess other eyes, to see the universe through the eyes of another, of a hundred others, to see the hundred universes that each of them sees, that each of them is […]” (Proust, 1993, p. 343).
⁴ I don’t know where he said that, if he said that, and the special thing about that idea is that it doesn’t actually matter if he said it, since I can think with the idea that he said it, which is sufficient.
⁵ Cf. #4
