From Archives to Algorithms: Distance, Evidence, and Inference
DOI: https://doi.org/10.60923/issn.1971-8853/23642

Keywords: Generative AI, Archival epistemology, Simulation, National Socialism, Racial science

Abstract
This article reinterprets Carlo Ginzburg’s indiciary paradigm as a general theory of knowledge production and connects it to contemporary debates over generative artificial intelligence. In line with Ginzburg, I posit that we cannot directly access unmediated social life. But rather than treating distance as an obstacle to knowledge, I argue that temporal, epistemic, and perspectival forms of distance are its enabling conditions. We can make sense of these by weighing positive and negative analogy transfers (Mary Hesse) between radically different forms of knowledge traces, such as archival records and the outputs of in silico research. Both domains require reasoning from traces that stand in for absent realities. Yet synthetic outputs derive their authority from optimization, and their plausibility is operational rather than referential. A case study of Nazi racial science clarifies what is at stake when AI systems are treated as stand-ins for social actors, and shows how perspective can be abstracted from subjecthood and redeployed instrumentally: the extraction of epistemic resources without reciprocity, and the obscuring of production processes. I introduce the concept of in silico perspectivism to name a reflexive methodological stance adequate to this moment.
References
Alvarado, R.C. (2024). What Large Language Models Know. Critical AI, 2(1). https://doi.org/10.1215/2834703x-11205161
Anderson, E. (1995). Feminist Epistemology: An Interpretation and a Defense. Hypatia, 10(3), 50–84. https://doi.org/10.1111/j.1527-2001.1995.tb00737.x
Arjomand, N.A. (2022). Empirical Fiction: Composite Character Narratives in Analytical Sociology. The American Sociologist, 55, 436–472. https://doi.org/10.1007/s12108-022-09546-z
Attiah, K. (2025). I Talked to Meta’s Black AI Character. Here’s What She Told Me. Is This the New Era of Digital Blackface? Washington Post, January 8. Retrieved from: https://www.washingtonpost.com/opinions/2025/01/08/meta-ai-bots-backlash-racist/.
Bajohr, H. (2024). On Artificial and Post-artificial Texts: Machine Learning and the Reader’s Expectations of Literary and Non-literary Writing. Poetics Today, 45(2), 331–361. https://doi.org/10.1215/03335372-11092990
Barrie, C., & Cerina, R. (2026). Synthetic Personas Distort the Structure of Human Belief Systems. Working Paper.
Buolamwini, J., & Gebru, T. (2018). Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. Proceedings of Machine Learning Research: Conference on Fairness, Accountability, and Transparency, 81, 1–15. https://proceedings.mlr.press/v81/buolamwini18a.html
Campt, T. (2017). Listening to Images. Durham, NC: Duke University Press.
Chesney, B., & Citron, D. (2019). Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security. California Law Review, 107. https://doi.org/10.15779/Z38RV0D15J
Davidson, T., & Karell, D. (2025). Integrating Generative Artificial Intelligence into Social Science Research: Measurement, Prompting, and Simulation. Sociological Methods & Research, 54(3), 775–793. https://doi.org/10.1177/00491241251339184
Dennett, D.C. (2023). The Problem with Counterfeit People. The Atlantic, May 16. Retrieved from: https://www.theatlantic.com/technology/archive/2023/05/problem-counterfeit-people/674075/.
Denton, E., Hanna, A., Amironesei, R., Smart, A., & Nicole, H. (2021). On the Genealogy of Machine Learning Datasets: A Critical History of ImageNet. Big Data & Society, 8(2). https://doi.org/10.1177/20539517211035955
Du Bois, W.E.B. (1903). The Souls of Black Folk: Essays and Sketches. Chicago, IL: A.C. McClurg & Co.
Fuentes, M.J. (2016). Dispossessed Lives: Enslaved Women, Violence, and the Archive. Philadelphia, PA: University of Pennsylvania Press, Inc.
Ginzburg, C. (1989). Clues, Myths, and the Historical Method (A. Tedeschi & J. Tedeschi, Trans.). Baltimore, MD: Johns Hopkins University Press.
Ginzburg, C. (2001). Wooden Eyes: Nine Reflections on Distance (M. Ryle & K. Soper, Trans.). New York, NY: Columbia University Press.
Ginzburg, C. (2012). Threads and Traces: True, False, Fictive (A. Tedeschi & J. Tedeschi, Trans.). Berkeley, CA: University of California Press.
Goddard, D. (1973). Max Weber and the Objectivity of Social Science. History and Theory, 12(1), 1–22. https://doi.org/10.2307/2504691
Goffman, E. (1959). The Presentation of Self in Everyday Life. New York, NY: Doubleday.
Hacking, I. (2006). Making up People. London Review of Books, 28(16). Retrieved from: https://www.lrb.co.uk/the-paper/v28/n16/ian-hacking/making-up-people.
Hanlon, A.R. (2024). LLM Outputs Are Fictions. Critical AI, 2(1). https://doi.org/10.1215/2834703x-11205210
Hao, Q., Xu, F., Li, Y., & Evans, J. (2026). Artificial Intelligence Tools Expand Scientists’ Impact but Contract Science’s Focus. Nature, 649(8099), 1237–1243. https://doi.org/10.1038/s41586-025-09922-y
Haraway, D.J. (1991). Simians, Cyborgs, and Women: The Reinvention of Nature. New York, NY: Routledge.
Harding, S.G. (1987). Feminism and Methodology: Social Science Issues. Bloomington, IN: Indiana University Press.
Henrickson, L. (2024). Conversations with No One. Poetics Today, 45(2), 291–299. https://doi.org/10.1215/03335372-11092924
Hesse, M. B. (1966). Models and Analogies in Science. Notre Dame, IN: University of Notre Dame Press.
Hill Collins, P. (1991). Black Feminist Thought: Knowledge, Consciousness, and the Politics of Empowerment. New York, NY: Routledge.
Kozlowski, A.C., & Evans, J. (2025). Simulating Subjects: The Promise and Peril of Artificial Intelligence Stand-Ins for Social Agents and Interactions. Sociological Methods & Research, 54(3), 1017–1073. https://doi.org/10.1177/00491241251337316
Larson, E.J. (2021). The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do. Cambridge, MA: The Belknap Press of Harvard University Press.
Lu, J.G., Song, L.L., & Zhang, L.D. (2025). Cultural Tendencies in Generative AI. Nature Human Behaviour, 9(11), 2360–2369. https://doi.org/10.1038/s41562-025-02242-1
Luft, A. (2020). How Do You Repair a Broken World? Conflict(ing) Archives after the Holocaust. Qualitative Sociology, 43(3), 317–343. https://doi.org/10.1007/s11133-020-09458-9
Luft, A., & Subotić, J. (2025). Ethics of Archives: Improving Historical Social Science Through the Consideration of Research on Violence. Social Science History, 49(1), 229–253. https://doi.org/10.1017/ssh.2024.42
Manoff, M. (2004). Theories of the Archive from Across the Disciplines. portal: Libraries and the Academy, 4(1), 9–25. https://doi.org/10.1353/pla.2004.0015
Marres, N., Katzenbach, C., Munk, A.K., & Jobin, A. (2025). On the Controversiality of AI: The Controversy Is Not the Situation. Big Data & Society, 12(4). https://doi.org/10.1177/20539517251383870
Paglen, T. (2014). Operational Images. e-flux, 59. Retrieved from: https://www.e-flux.com/journal/59/61130/operational-images/
Ricœur, P. (1969). Le conflit des interprétations: Essais d’herméneutique [The Conflict of Interpretations: Essays in Hermeneutics]. Paris: Éditions du Seuil.
Rini, R. (2020). Deepfakes and the Epistemic Backstop. Philosophers’ Imprint, 20, 1. https://philpapers.org/archive/RINDAT.pdf
Sabbagh-Khoury, A. (2022). Settler Colonialism and the Archives of Apprehension. Current Sociology, 72(1), 25–47. https://doi.org/10.1177/00113921221100580
Scheuerman, M.K., Hanna, A., & Denton, E. (2021). Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2), 1–37. https://doi.org/10.1145/3476058
Skarpelis, A.K.M. (2020). Life on File: Archival Epistemology and Theory. Qualitative Sociology, 43(3), 385–405. https://doi.org/10.1007/s11133-020-09460-1
Skarpelis, A.K.M. (2024). The Moral Pixel: Troubling Genealogies of Composite Person Classification in the Age of Artificial Intelligence. Talk presented at the 2024 Annual Meeting of the Social Science History Association. Chicago, IL.
Skarpelis, A.K.M. (2025). Racial Science on Trial: Ludwig Ferdinand Clauss, Objectivity, and the Politics of Nazi Science. Talk presented at the 2025 Annual Meeting of the Social Science History Association. Chicago, IL.
Smith, D.E. (1974). The Social Construction of Documentary Reality. Sociological Inquiry, 44(4), 257–268. https://doi.org/10.1111/j.1475-682X.1974.tb01159.x
Suchman, L. (2023). The Uncontroversial “Thingness” of AI. Big Data & Society, 10(2). https://doi.org/10.1177/20539517231206794
The President of the United States. (2025). Preventing Woke AI in The Federal Government. Executive Order. Retrieved from: https://www.whitehouse.gov/presidential-actions/2025/07/preventing-woke-ai-in-the-federal-government/.
Titcomb, J. (2024). We “Messed up” with Black Nazi Blunder, Google Co-founder Admits: Sergey Brin Responds to Criticism of Company’s AI Chatbot. Telegraph, March 4. Retrieved from: https://www.telegraph.co.uk/business/2024/03/04/google-sergey-brin-we-messed-up-black-nazi-blunder/.
Underwood, T., Nelson, L.K., & Wilkens, M. (2025). Can Language Models Represent the Past without Anachronism? arXiv preprint. https://doi.org/10.48550/arXiv.2505.00030
Vale, M.D. (2024). Moral Entrepreneurship and the Ethics of Artificial Intelligence in Digital Psychiatry. Socius, 10. https://doi.org/10.1177/23780231241259641
Van Loon, A. (2025). The Use of LLMs in Social Science Experiments. Paper presented at the Annual Meeting of the American Sociological Association. Chicago, IL, USA.
Wallace, M., & Peeler, M. (2024). Harriet Tubman’s Deep Voice. Critical AI, 2(1). https://doi.org/10.1215/2834703X-11205217
Wang, A., Liu, A., Zhang, R., Kleiman, A., Kim, L., Zhao, D., Shirai, I., Narayanan, A., & Russakovsky, O. (2022). Revise: A Tool for Measuring and Mitigating Bias in Visual Datasets. International Journal of Computer Vision, 130(7), 1790–1810. https://doi.org/10.1007/s11263-022-01625-5
Zhao, D., Wang, A., & Russakovsky, O. (2021). Understanding and Evaluating Racial Biases in Image Captioning. Paper presented at the Proceedings of the IEEE/CVF International Conference on Computer Vision. Montréal, QC.
Zhou, D., & Zhang, Y. (2024). Political Biases and Inconsistencies in Bilingual GPT Models – The Cases of the U.S. and China. Scientific Reports, 14(1), 25048. https://doi.org/10.1038/s41598-024-76395-w
License
Copyright (c) 2026 A.K.M. Skarpelis

This work is licensed under a Creative Commons Attribution 4.0 International License.