An Ivory Tower of One’s Own

Seat number 1 is by far the best in the room. Close to the high window, it is well lit, there is no neighbor to the left and the aisle leaves plenty of space for one’s elbow to roam freely. From this seat there is a nice view of the room and the narrow gallery with wooden railings that overlooks it. Each morning at ten o’clock there are at least two people who have decided that this seat will be theirs.

— Arlette Farge, Le goût de l’archive (1989)


Last week Sean Takats visited the Lab to deliver a workshop – about Tropy, a free image management system oriented to the needs of archival researchers – and give a lecture, “Subjectivity and Digital Research.”

Sean’s talk was elegant and stimulating, and the first take-home was this: research is embodied, and the material conditions of the researcher in the archive shape the kind of research they can and do perform. And this really is a take-home all about taking home. Institutional archives are increasingly the sites of photographic data capture. Exploratory and interpretive decisions increasingly take place at home.

Or in the office, or on the train, or in another library, or the deepest corner of the More Than Just … Coffee! Lounge on Hoe Street in Walthamstow, or the pay-per-hour workspace into which it will gentrify overnight in the summer of 2023. Research takes place, perhaps, with headphones in. At a different set of temperatures, in different clothes, in fewer clothes, with different levels of caffeine and hydration. With a different set of objects, people, and landscapes in the visual field, in different ambiences, and with different activity in-between bouts of research.

For example, consider Londa Schiebinger’s acknowledgements to Secret Cures of Slaves: People, Plants, and Medicine in the Eighteenth-Century Atlantic World:

Writing history has changed […] Although one loses the tactile pleasure of eighteenth-century papers and leather bindings, one does not miss the mold, dust, jet lag, and hours waiting for things to be delivered to the reading-room table. Now one can read Jean-Jacques Barthélemy while taking breaks to do laundry (the benefits – physical and intellectual – of interspersing heavy-duty research and writing with mundane chores should not be underestimated).

Such materialities must surely show up in research outputs. But how exactly? We might start by saying that research occurs in a different set of moods, although tracing the affective shift feels quite daunting to me. Another starting point is that non-archive research spaces are all variously wannabe Woolf’s rooms-of-one’s-own. So perhaps we can consider how researchers in those spaces encounter different affordances, stimuli, textures, and impediments according to factors such as gender, class, race, and ability.

Back to the archive: well, it’s full of researchers taking photos. Sean cited empirical as well as anecdotal evidence to demonstrate how practices have shifted. There was a twinkle in Sean’s eye – like a little camera flashing – whenever he spoke about this transformation, and his Tropy project promises to further normalise, elaborate, and refine it. Nevertheless, I don’t think the lecture adopted a fully normative stance. That is, Sean wasn’t here to endorse the transformation, exactly.

His interest was rather – and maybe this is the second big take-home – given that this shift is actually happening, shouldn’t we be alert to the implications? And in particular, alert to the stories we tell about research?

How have researchers’ subjective experiences of conducting research changed? Do we need a new language of archival research in the digital age? Do the explicit and implicit stories we tell about how knowledge is generated reflect and/or support actual practices?

Sean identified a residual discourse of the romanticized archive. Arlette Farge was cited as one example. In fact, Sean suggested, there is even a kind of travel literature of the archive. What happens when you descend into the archive? The archive is a strange and distant land: we journey there, and we bring things back. Along the way we encounter wonders, obstacles, even perils. But mostly we don’t …

The day was very hot; we walked up the hills, and along all the rough road, which made our walking half the day’s journey.  Travelled under the foot of Carrock, a mountain covered with stones on the lower part; above, it is very rocky, but sheep pasture there; we saw several where there seemed to be no grass to tempt them.  Passed the foot of Grisdale and Mosedale, both pastoral valleys, narrow, and soon terminating in the mountains—green, with scattered trees and houses, and each a beautiful stream.  At Grisdale our horse backed upon a steep bank where the road was not fenced, just above a pretty mill at the foot of the valley; and we had a second threatening of a disaster in crossing a narrow bridge between the two dales; but this was not the fault of either man or horse.  Slept at Mr. Younghusband’s public-house, Hesket Newmarket.  In the evening walked to Caldbeck Falls, a delicious spot in which to breathe out a summer’s day—limestone rocks, hanging trees, pools, and waterbreaks—caves and caldrons which have been honoured with fairy names, and no doubt continue in the fancy of the neighbourhood to resound with fairy revels.

— Dorothy Wordsworth, Recollections of a Tour Made in Scotland, AD 1803

… mostly we just patiently make some progress. Sometimes it’s swift and steady, sometimes erratic and embarrassed, filled with stumbles and setbacks. We have good days and bad.

And on some level, we want others to understand all this, and to know that it is what makes our knowledge authentic.

We want them to admire the rituals we recited to gain our safe passage, the amulets we brandished to complete our homecoming.

We want them to know that we’ve been there, man.

To agree that the figure in front of them contains a few molecules from far-flung climes.

To acknowledge our body as a body of work.

To suspect that the faraway glint in our eye, as we wait our turn to speak, is actually a speck — immured in eye-lime by ancient Lucretian optics — of moulted surface-veil of the specimen itself.

To admire us for our professional clarity of thought, sure …

… but more profoundly, to vent visceral awe for our throats as they professorially clear, inviting the infinitesimal yet non-zero possibility that what dislodge and dance in our alveolar folds are secreted atoms of air of that very zone, and that the syllable forming at our lips is a lost-and-found zephyr of the archive itself. Like Swift’s academicians of Lagado, who have to wave around whatever whereof they aver, we suspect that our judgments can only be validated by relations that are physical, even somatic.

Or something. Okay, Sean put it way more sensibly than that, but I still wasn’t totally sold on any link between archival research and travel literature. Until, that is, he read out an email — from a former supervisor, I think — offering guidance to Young Sean for his first visit to the archive. “It is pretty simple I think. You fill out a form, and have a little interview and a card.” It was pretty simple, and yet the email was remarkably detailed. And although Sean didn’t quite put it this way, there’s no way that it was purely generosity or loquacity or attention-to-detail. There was a real love of storytelling there, and the story was Recollections of a Tour Made in the Reading Room.

So here’s another take-home. If researchers feel that they are laying hands on history through its tangible artefacts, perhaps this goes together with a tendency to conceal the use of digital sources: Google Books, ARTFL, Gallica, and many others. With a little detective work you can figure out that folk are doing this — Sean mentioned, for example, lacunae: research which claims to be citing physical sources, but consistently cites only the editions of a work that have been digitised in some particular repository. So how do we interpret this silence? Is it professional malpractice if you don’t cite Google Books … when you typed it into your article from Google Books?

Sean’s talk was also about posing a question, or set of questions. If we’re still overly entranced by the old travel narratives, what stories are we neglecting? What stories could we, should we be telling about our research?

One obvious answer is: more accurate stories, rooted in the best available evidence about our real collective experiences. Would such stories still be travel literature, I wonder, or some other genre? Perhaps a descendant of travel literature — science fiction? Perhaps utopian fiction? I’m certainly persuaded that using a database, constructing keyword searches, and reading a patchwork of text need to become part of the account of doing research. We should give serious consideration, as researchers, to the way that IT conditions our research practices: the practices we proudly theorise and teach as methodology, the practices we feel a bit shady about, and even the practices we don’t notice, but which are nevertheless technologically traceable. How do we leave tracks, trails and traces of our subjectivity? Each researcher is potentially gathering a mass of data about how they gather data.

Sean finished with an open question, which was a callback to Arlette Farge’s Le goût de l’archive (1989). “What, in 2019 — in a dematerialized and iterative archive — is seat number 1?”


There was a lively Q&A. I hardly took any notes and I won’t try to summarise. I think I can just about remember three sort of interrelated questions from Tim Hitchcock, Caroline Bassett, and Rachel Thomson.

Tim asked about how we were imagining archives before this particular wave of romanticization, which he suggested was rooted in the 80s and 90s, just as the archive was starting to transform. He brought up an earlier Foucauldian analysis of the archive as an antagonistic mechanism of order and control: a way of understanding the inescapable web of technology and language in which we are caught and from which we are construed.

Caroline asked about the ways in which the argument was grounded in history as a discipline, and spoke specifically to the historians’ archive. What is the prior structure that makes the thing you set out to collect “history”?

And I think Rachel asked about democratization and authority, and suggested that this argument might be interestingly reframed in terms of loss of historians’ traditional prestige. Could this be a moment of re-territorialization? Might it turn out that people can do history without historians? Might they do it in totally different ways, or do totally different things altogether?

There were plenty of other questions and plenty of answers.

I asked one about zooming out from archival research and thinking about all kinds of academic practice in the same way — especially teaching.

Having mulled it over a bit more, I guess I was really thinking about that gripe you sometimes get about students who don’t do the set reading … or who somehow game the reading. It’s grounded in a recognition that a patchwork of Google Books text fragments isn’t intellectually transformative in the same way substantial linear readings of chapters and books are … and perhaps also a faint recognition that we currently aren’t that good at conveying this fact via formative, summative, and informal assessment.

Speaking anecdotally, reading a whole damn book is a big deal, takes absolutely forever, and it fundamentally changes who I am. I am a pretty bad reader, and perhaps the worse a reader you are, the more it changes you. I can advise a student to be wary of shortcuts, but I know I wouldn’t persuade me.

So perhaps we do need a new language, or a new set of stories, around learning in an era of widely available digital shortcuts. How much do we need to nudge such a discourse along, and how much is it emerging spontaneously? My hunch is that it’s largely emerging spontaneously, and the questions are more around how to steer its growth.

One area might be citation. Citation has connotation, and perhaps the connotation is systematically false. What if we were to experiment with a citation system which elegantly communicates not provenance, but some sense of how the author came across the cited work, and how deeply and widely they have explored the context in which it occurred?

Then again — and this was my follow-up question, a thing I always trot out these days in various guises (I think because of Simmel) — does increasing the truthfulness of the stories we tell about our research, learning, and teaching necessarily produce more truth per se, or might it in some cases be destructive of truth? Might silence, misdirection, equivocation, euphemism, tact, white lies, opacity, deferral validated by uncertainty, and all manner of ruses also be built into the enabling infrastructure of truth?

Creative practice is perhaps where this is most obviously seen: the fidelity between a poetics and a poetry is seldom a descriptive fidelity — why on earth would a poet settle for that? — but is rather a provocative and generative fidelity. The poet represents their practice in ways that enable and modify their practice. Such representations both cohere with and contradict representations capable of communicating their practice.

All this pertains to the tacit validity claims of scholarship. Perhaps to cite a work is to impersonate something, and a linear reading of whatever is in the codex may not be the best way to identify and to inhabit the ‘something’ you are impersonating. For starters, if you haven’t read through the source text in that way, you won’t be alone. Has there been a big DH project to model where citations cluster? Because I have a suspicion that the history of philosophy is the history of conversations between first chapters.

So what kind of poetics ought we aspire to for research? I think my instincts are pragmatic: it would be great if we could recognise and duly weigh the transformative power of longform textual encounter, or could rediscover similar transformative power in more distributed, patchwork formats. So: a poetics, or a new travel narrative, that might allow you to take your bearings in that more fragmented reading, and to find ways of making the experience more cumulative, without insisting on linearity.


The Tropy workshop was excellent, an opportunity to learn its current capabilities, but also a nice glimpse into its ongoing evolution, and into how the interplay of “nice to have” and “easier said than done” influences development priorities. Speaking personally, it was the incidental side quest which really did it for me: arriving late to the Zotero party. Zotero is a free research and citation management system — a bit like EndNote, if you’re familiar with that, but open source — that is oriented toward collaborative research (it integrates easily with Google Docs, for example, although I’m not sure about CryptPad and others). It very zestily searches the web to identify whatever you click and drag into it, and gives you nice titles and abstracts and hooks to hang your own metadata too. You can input ISBNs or DOIs and it usually does the rest. Sean even gave me a Zotero sticker, and you know what, I stuck it on my laptop. And then he was gone. Mood:

Zorro

JLW

And we’re off!

Sussex’s Dr Nicola Stylianou reflects on the launch of Making African Connections.

Suchi Chatterjee (researcher, Brighton and Hove Black History) and Scobie Lekhuthile (curator, Khama III Memorial Museum) discussing the project.

Last week was the first time that everybody working on the Making African Connections project was in the same room together. This was a very exciting moment for us and was no small feat: people travelled from Namibia, Botswana, Sudan and all across the UK to attend our first project workshop. We began by discussing the project together and then broke into three groups to discuss the three museum collections of African objects that are now in Kent and Sussex.

The first working group discussed a collection of Batswana artefacts donated to Brighton Museum by Revd Willoughby, a missionary. Staff at the museum will be working with researcher Winani Thebele (Botswana National Museums) and curator Scobie Lekhuthile (Khama III Memorial Museum), as well as Tshepo Skwambane (DCES) and Suchi Chatterjee and Bert Williams (Brighton and Hove Black History). The second case study focuses on a large collection of objects from South West Angola that are held at the Powell-Cotton Museum and were acquired in the 1930s. The objects are mainly Kwanyama and this part of the project has, as its advisor, an expert in Kwanyama history, Napandulwe Shiweda (University of Namibia). Finally, the project will consider Sudanese objects held at the Royal Engineers Museum. Research for this part of the project is being conducted by Fergus Nicoll, Reem al Hilou (Shams AlAseel Charitable Initiative) and Osman Nusairi (intellectual).

The aim of the workshop was to decide together what the priorities for the project were. We will begin digitising objects for our online archive in April so we need to know which objects we want to work on first as some of the collections are very large. It will only be possible to create online records for a selection of objects.

Viewing the objects in the store room

Viewing galleries at the Royal Engineers Museum

Before the workshop on Wednesday we had arranged for all the participants to visit the relevant galleries and see objects in storage. This had led to some interesting and difficult conversations that we were able to build on during the workshop. Perhaps the clearest thing to come out of the meeting was the sheer amount of work to be done to fully research these collections and to understand their potential to connect to audiences and each other.

This post originally appeared on the Making African Connections project blog on 25 February 2019. Making African Connections is an AHRC-funded project.

Mending Dame Durrants’ Shoes

This week Louise Falcini gave us an update on the AHRC-funded project The Poor Law: Small Bills and Petty Finance 1700-1834.

The Old Poor Law in England and Wales, administered by the local parish, dispensed benefits to paupers, providing a uniquely comprehensive pre-modern system of relief. The law remained in force until 1834, and provided goods and services to keep the poor alive. Each parish provided food, clothes, housing and medical care. This project will investigate the experiences of people across the social spectrum whose lives were touched by the Old Poor Law, whether as paupers or as poor-law employees or suppliers.

The project seeks to enrich our understanding of the many lives touched by the Old Poor Law. This means paupers, but it also means workhouse mistresses and other administrators, midwives, tailors, cobblers, butchers, bakers, and many others. Intricate everyday social and economic networks sprang up around the Poor Law, about which we still know very little.

To fill these gaps to bursting, the project draws on a previously neglected class of sources: thousands upon thousands of slips of paper archived in Cumbria, Staffordshire and East Sussex, often tightly folded or rolled, of varying degrees of legibility, and all in the perplexing loops and waves of an eighteenth-century hand …

Overseer note

These Overseers’ vouchers – similar to receipts – record the supply of food, clothes, healthcare, and other goods and services. Glimpse by glimpse, cross-reference by cross-reference, these fine-grained fragments glom together, revealing ever larger and more refined images of forgotten lives. Who was working at which dates? How did procurement and price fluctuate? What scale of income was possible for the suppliers the parish employed? What goods were stocked? Who knew whom, and when? Who had what? What broke or wore out when? As well as the digital database itself, the project will generate a dictionary of partial biographies, collaboratively authored by professional academics and volunteer researchers.

Louise took us through the data capture tool used by volunteer researchers. A potentially intimidating fifty-nine fields subtend the user-friendly front-end. The tool is equipped with several useful features. For example, it is possible to work remotely. The researcher has the option to “pin” the content of a field from one record to the next. The database automatically saves every iteration of each record. The controlled vocabulary is hopefully flexible enough to accommodate any anomalies. It’s also relatively easy to flag up records for conservation assessment or transcription assistance, or to go back and edit records. Right now they’re working on implementing automated catalogue entry creation, drawing on the Calm archive management system.
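I haven’t seen the project’s code, and the real tool is a bespoke database front-end, so purely to make the pinning idea concrete, here is a toy sketch of my own (in Python, with invented field names and values) of carrying pinned values forward from one record to the next:

    def apply_pins(previous_record, new_record, pinned_fields):
        """Carry values of pinned fields forward from the previous record.

        A toy illustration only: the real data capture tool is a fifty-nine
        field database front-end with validation and auto-saving.
        """
        merged = dict(new_record)
        for field in pinned_fields:
            if not merged.get(field) and field in previous_record:
                merged[field] = previous_record[field]
        return merged

    previous = {"parish": "Anyparish", "year": "1782", "supplier": "J. Bloggs"}
    current = {"parish": "", "year": "", "supplier": "M. Example"}

    print(apply_pins(previous, current, pinned_fields=["parish", "year"]))
    # {'parish': 'Anyparish', 'year': '1782', 'supplier': 'M. Example'}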


Personally, one of the things I find exciting about the project is how it engages both with the history of work and with the future of work. Part of its core mission is to illuminate institutions of disciplinarity, entrepreneurship, and precarity in eighteenth and early nineteenth century England. At the same time the project also involves, at its heart, questions about how we work in the twenty-first century.

Just take that pinning function, which means that researchers can avoid re-transcribing the same text if it’s repeated over a series of records. It almost feels inadequate to frame this as a “useful feature,” with all those overtones of efficiency and productivity! I’m not one of those people who can really geek out over user experience design. But most of us can relate to the experience of sustained labour in slightly the wrong conditions or using slightly the wrong tools. Most of us intuit that the moments of waste woven into such labour can’t really be expressed just in economic terms. And I’m pretty sure the moments of frustration woven into such labour can’t be expressed in purely psychological terms either. Those moments might perhaps be articulated in the language of metaethics and aesthetics? – or perhaps they need their very own (as it were) controlled vocabulary. But whatever they are, I think they manifest more clearly in voluntary labour, where it is less easy to let out that resigned sigh and think, “Whatever, work sucks. Come on Friday.”

I don’t have any first-hand experience of working with this particular data capture tool. But from the outside, the design certainly appears broadly worker-centric. I think digital work interfaces, especially those inviting various kinds of voluntary labour, can be useful sites for thinking more widely about how to challenge a productivity-centric division of labour with a worker-centric design of labour. At the same time, I guess there are also distinctive dangers to doing that kind of thinking in that kind of context. I wouldn’t be surprised if the digital humanities’ love of innovation, however reflexive and critical it is, tempts us to downplay the importance of the minute particularity of every worker’s experience, and the ways in which working practices can be made more hospitable and responsive to that particularity. (Demos before demos, that’s my demand).

I asked Louise what she thought motivated the volunteer researchers. Not that I was surprised – if something is worth doing there are people willing to do it, given the opportunity! – but I wondered what drew these particular people to this particular work? In the case of these parishes, it helps that there are good existing sources into which the voucher data can be integrated, meaning that individual stories are coming to life especially rapidly and richly-resolved. Beyond this? Obviously, the motives were various. And obviously, once a research community was established, it has the potential to become a motivating energy in itself. But Louise also reckoned that curiosity about these histories – about themes of class, poor relief and the prehistory of welfare, social and economic justice, and of course about work – played a huge role in establishing it in the first place.

Blake wrote in Milton about “a moment in each Day that Satan cannot find / Nor can his Watch Fiends find it.” I bet there is a moment within every rote task that those Watch Fiends have definitely stuck there on purpose. It’s that ungainly, draining, inimitable moment that can swell with every iteration till it somehow comes to dominate the task’s entire temporality. It is politically commendable to insist that these moments persist in any task designed fait accompli from a distance, by people who will never have to complete that task more than once or twice … no matter how noble or comradely their intentions. But even if we should be careful about any dogmatic redesign of labour, I think we should at least be exploring how to redesign the redesign of labour. Karl Marx wrote in his magnum opus The Wit and Wisdom of Karl Marx that, unlike some of his utopian contemporaries, he was not interested in writing recipes for the cook-shops of the future. In some translations, not recipes but receipts. It actually is definitely the future now. And some of us are hungry.

JLW


The Poor Law: Small Bills and Petty Finance 1700-1834 is an AHRC-funded project.

  • PI: Alannah Tomkins (Keele)
  • Co-I: Tim Hitchcock (Sussex)
  • Research Fellow: Louise Falcini (Sussex)
  • Research Associate: Peter Collinge (Keele)

Opacity and Splaination

I’m just back from Beatrice Fazi’s seminar on ‘Deep Learning, Explainability and Representation.’ This was a fascinating account of opacity in deep learning processes, grounded in the philosophy of science but also ranging further afield.

Beatrice brought great clarity to a topic which — being implicated with the limits of human intelligibility — is by its nature pretty tough-going. The seminar talk represented work-in-progress building on her recently published book, Contingent Computation: Abstraction, Experience, and Indeterminacy in Computational Aesthetics, exploring the nature of thought and representation.

I won’t try to summarise the shape of the talk, but I’ll briefly pick up on two of the major themes (as advertised by the title), and then go off on my own quick tangent.

First, representation. Or more specifically, abstraction (from, I learned, the Latin abstrahere, ‘to draw away’). Beatrice persuasively distinguished between human and deep learning modes of abstraction. Models abstracted by deep learning, organised solely according to predictive accuracy, may be completely uninterested in representation and explanation. Such models are not exactly simplifications, since they may end up as big and detailed as the problems they account for. Such machinic abstraction is quasi-autonomous, in the sense that it produces representational concepts independent of the phenomenology of programmers and users, and without any shared nomenclature. In fact, even terms like ‘representational concept’ or ‘nomenclature’ deserve to be challenged.

This brought to my mind the question: so how do we delimit abstraction? What do machines do that is not abstraction? If we observe a machine interacting with some entity in a way which involves receiving and manipulating data, what would we need to know to decide whether it is an abstractive operation? If there is a deep learning network absorbing some inputs, is whatever occurs in the first few layers necessarily ‘abstraction,’ or might we want to tag on some other conditions before calling it that? And is there non-representational abstraction? There could perhaps be both descriptive and normative approaches to these questions, as well as fairly domain-specific answers.

Incidentally, the distinction between machine and human abstraction also made me wonder if pattern-recognition actually belongs with terms such as generalization, simplification, reduction, and (perhaps!) conceptualization, and (perhaps even!) modelling, terms which pertain only in awkward and perhaps sort of metaphorical ways to machine abstraction. It also made me wonder how applicable other metaphors might be: rationalizing, performing, adapting, mocking up? Tidying? — like a machinic Marie Kondo, discarding data points that fail to spark joy?

The second theme was explanation. Beatrice explored the incommensurability between the abstractive operations of human and (some) machine cognition from a number of angles, including Jenna Burrell’s critical data studies work, ongoing experiments by DARPA, and the broader philosophical context of scientific explainability, such as Kuhn and Feyerabend’s influential clashes with conceptual conservatism. She offered translation as a broad paradigm for how human phenomenology might interface with zones of machinic opacity. However, to further specify appropriate translation techniques, and/or ways of teaching machine learning to speak a second language, we need to clarify what we want from explanation.

For example, we might want ways to better understand the impact of emerging machine learning applications on existing policy, ways to integrate machine abstractions into policy analysis and formation, to clarify lines of accountability which extend through deep learning processes, and to create legibility for (and of) human agents capable of bearing legal and ethical responsibility. These are all areas of obvious relevance to the Science Policy Research Unit, which hosted today’s seminar. But Beatrice Fazi’s project is at the same time fundamentally concerned with the ontologies and epistemologies which underlie translation, whether it is oriented to these desires or to others. One corollary of such an approach is that it will not reject in advance the possibility that (to repurpose Langdon Winner’s phrase) the black box of deep learning could be empty: it could contain nothing translatable at all.


For me, Beatrice’s account also sparked questions about how explanation could enable human agency, but could curtail human agency as well. Having something explained to you can be the beginning of something, but it can also be the end. How do we cope with this?

Might we want to mobilise various modes of posthumanism and critical humanism, to open up the black box of ‘the human’ as well? Might we want to think about who explanation is for, where in our own socio-economic layers explanation could insert itself, and what agencies it could exert from there? Think about how making automated processes transparent might sometimes place them beyond dispute, in ways which their opaque predecessors were not? Think about how to design institutions which — by mediating, distributing, and structuring it — make machinic abstraction more hospitable for human being, in ways relatively independent of its transparency or opacity to individual humans? Think about how to encourage a plurality of legitimate explanations, and to cultivate an agonistic politics in their interplay and rivalry?

Might we want to think about distinguishing explainable AI from splainable AI? The word mansplain has been around for about ten years. Rebecca Solnit’s ‘Men Explain Things To Me’ (2008), an essay that actually intersects with many of Rebecca Solnit’s interests and which people probably recommend to her at parties, doesn’t use the word, but it does seem to have inspired it.

Mansplaining in Art (@MansplainingArt)

Splain has splayed a little, and nowadays a watered-down version might apply to any kind of pompous or unwelcome speech, gendered or not. However, just for now, one way to specify splaining might be: overconfident, one-way communication which relies on and enacts privilege, which does not invite the listener as co-narrator, nor even monitor via backchannels the listener’s ongoing consent. Obviously splained content is also often inaccurate, condescending, dull, draining, ominously interminable, and even dangerous, but I think these are implications of violating consent, rather than essential features of splaining: in principle someone could tell you something (a) that is true, (b) that you didn’t already know, (c) that you actually care about, (d) that doesn’t sicken or weary you, (e) that doesn’t impose on your time, and wraps up about when you predict, (f) that is harmless … and you could still sense that you’ve been splained, because there is no way this bloke could have known (a), (b), (c), (d), (e), and/or (f).

“Overconfident” could maybe be glossed a bit more: it’s not so much a state of mind as a rejection of the listener’s capacity to evaluate; a practiced splainer can even splain their own confusion, doubt, and forgetfulness, so long as they are acrobatically incurious about the listener’s independent perspective. So overconfidence makes possible the minimalist splain (“That’s a Renoir,” “You press this button”), but it also goes hand-in-hand with the impervious, juggernaut quality of longform splaining.

Splainable AI, by analogy, would be translatable into human phenomenology, without human phenomenology being translatable into it. AI which splains itself might well root us to the spot, encourage us to doubt ourselves, and insist we sift through vast swathes of noise for scraps of signal, and at a systemic level, devalue our experience, our expertise, and our credibility in bearing witness. I’m not really sure how it would do this or what form it would take: perhaps, by analogy with big data, big abductive reasoning? I.e. you can follow every step perfectly, there are just so many steps? Splainable AI might also give rise to new tactics of subversion, resistance, solidarity. Also, although I say ‘we’ and ‘us,’ there is every reason to suppose that splainable AI would exacerbate gendered and other injustices.

It is interesting, for example, that DARPA mention “trust” among the reasons they are researching explainable artificial intelligence. There is a little link here with another SHL-related project, Automation Anxiety. When AIs work within teams of humans, okay, the AI might be volatile and difficult to explain, evaluate, debug, veto, learn from, steer to alternative strategies, etc. … but the same is true of the humans. The same is particularly true of the humans if they have divergent and erratic expectations about their automated team-mates. In other words, rendering machine learning explainable is not only useful for co-ordinating the interactions of humans with machines, but also useful for co-ordinating the interactions of humans with humans in proximity to machines. Uh-oh. For that purpose, there only needs to be a consistent and credible, or perhaps even incontrovertible, channel of information about what the AI is doing. It does not need to be true. And in fact, a cheap way to accomplish such incontrovertibility is to make such a channel one-way, to reject its human collaborators as co-narrators. Maybe, after all, getting splAIned will do the trick.

JLW

Earlier: Messy Notes from the Messy Edge.

Data Cleaning and Preparation

Earlier today Ben Jackson gave the first in this semester’s series of digital methods open workshops. Here are a few rough notes on what we covered. If you missed the workshop and want to try out some of it on your own, you can find the tasks here. For details of forthcoming workshops, go here. All workshops are free and open to everyone (but it helps if you register).

Ben started us off with a rapid hurtle through some of his recent and ongoing projects (slides), including his collaboration with Caroline Bassett exploring ways of analysing and visualising Philip K. Dick’s writing (counting electric sheep, baa charts), and work bringing to life the text data of the Old Bailey Online. (It’s a tremendously rich archive of nearly 200,000 trials heard at the Old Bailey between 1674 and 1913). Bringing to life, and also bringing to unlife: Ben uses a kind of estrangement effect to remind the observer of what the data isn’t telling us, populating his legal drama puppet show with a cast of spoopy skellingtons.

Ben Jackson puppet show

Most of the workshop was a free exploration of prompts and tools Ben pulled together. People basically tried out whatever they liked, while he glided from table to table rendering assistance.

Calibre is a free ebook manager that is also a bit of a Swiss army knife, and it just so happens one of its fold-out doohickeys is a very good ebook-to-plain-text converter. Ebook files (.AZW, .EPUB, .MOBI etc.) are stuffed with all kinds of metadata that usually needs to be cleared away before you can do any analysis on the raw text itself. We also did something similar with another free tool, AntFileConverter, turning PDF into plain text. The lesson was that documents can be ornery and eccentric, and different converter tools will work differently and give rise to different glitches: “no single converter that will just magically work on every document.”
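Calibre’s converter also has a command-line counterpart, ebook-convert, which infers the input and output formats from the file extensions. A minimal sketch, with hypothetical filenames, driven from Python:

    import subprocess

    # ebook-convert ships with Calibre; formats are inferred from the extensions
    subprocess.run(["ebook-convert", "my_novel.epub", "my_novel.txt"], check=True)

    # The resulting .txt will often still carry front matter, licence boilerplate,
    # and so on, which you may want to trim before any analysis.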

AntFileConverter is part of a family of tools. We also checked out TagAnt and AntConc. I feel like I only scratched the surface of these. TagAnt creates a copy of a text file with all the grammatical parts-of-speech tagged. So if you input something like “We waited for ages at Clapham Junction, with the guard complaining about people blocking the doors” you get something like “We_PP waited_VVD for_IN ages_NNS at_IN Clapham_NP Junction_NP ,_, with_IN the_DT guard_NN complaining_VVG about_IN people_NNS blocking_VVG the_DT doors_NNS ._SENT” as output. PP is a personal pronoun, VVD is a past tense verb, IN is a preposition or subordinating conjunction, and so on. By itself this just seems to be an extremely pedantic form of vandalism. It does let you fairly easily find out if, for example, an author just loves adverbs. And tagging parts of speech could be the first step toward more interesting manipulations, for creative purposes (shuffle all the adverbs) and/or analytic purposes (analysis of genre or authorship attribution).
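TagAnt wraps the TreeTagger part-of-speech tagger; if you wanted to do the same thing in code, one rough equivalent is NLTK’s tagger, sketched below (note that NLTK uses the Penn Treebank tagset, so the labels differ slightly from TagAnt’s):

    import nltk

    # One-off downloads of the tokenizer and tagger models
    nltk.download("punkt")
    nltk.download("averaged_perceptron_tagger")

    sentence = ("We waited for ages at Clapham Junction, with the guard "
                "complaining about people blocking the doors")

    tokens = nltk.word_tokenize(sentence)
    print(nltk.pos_tag(tokens))
    # [('We', 'PRP'), ('waited', 'VBD'), ('for', 'IN'), ('ages', 'NNS'), ...]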

AntConc allows you to create concordances. A concordance is (more or less) an alphabetical list of key terms in a text, each one nestled in a fragment of its original context. So it’s a useful way to browse an unreadably large corpus based on some particular word (and so to some extent some particular theme) that interests you. Sure Augustine had stuff to say about sin and grace, but what did he think about, I don’t know, fingers?

Fingers
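Under the hood, a concordancer is doing something like the keyword-in-context routine sketched below. This is just a minimal illustration of the idea, with a hypothetical filename; AntConc itself handles tokenisation, sorting, and large corpora far more gracefully:

    import re

    def kwic(text, keyword, width=40):
        """Return keyword-in-context snippets for every occurrence of `keyword`."""
        snippets = []
        pattern = r"\b" + re.escape(keyword) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            left = text[max(0, match.start() - width):match.start()].replace("\n", " ")
            right = text[match.end():match.end() + width].replace("\n", " ")
            snippets.append(f"{left:>{width}}  {match.group(0)}  {right}")
        return snippets

    # "confessions.txt" stands in for whatever plain-text file you are exploring
    with open("confessions.txt", encoding="utf-8") as f:
        for line in kwic(f.read(), "fingers"):
            print(line)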

So a concordance helps you to find sections you might want to read more thoroughly.  But I guess it doesn’t just have to be used like that — like a kind of map, or a very comprehensive index — but could also be read in its own right, and that reading could comprise a legitimate way of encountering and gaining knowledge of the underlying text.

How might, for instance, reading every appearance of the word “light” constitute its own way of knowing how the term “light” is working within a text? Are such readings reliably productive of knowledge? Or is it more like you might get lucky and stumble on something intelligible, like how a particular word is being tugged in distinct, divergent directions by two different discourses it’s implicated in?

How do these tools actually work? Well, going by the name and a logo, a really fast clever ant just does it for you. Thanks ant!

AntConc screenshot

 

Voyant Tools is a web-based reading and analysis environment for digital texts. What does that mean in practice? When you feed it your text file, a bright little dashboard pops up with five resizable areas. Each one of these contains a tool, and you can swap different tools in and out. I’d guess there are about fifty or so tools, although I’m not sure how distinct they all are really.

Voyant Tools screenshot 1

At least one tool was very familiar: “Cirrus” in the top left corner makes a word cloud of the text you’ve inputted, with the most frequent words appearing the largest. Very common words like “a” and “the” are filtered out (in the lingo, they are “stopwords”). The bottom right tool, “Contexts,” was also pretty familiar, since it seems to be a concordance, like we’d just been doing in AntConc. “Summary” and “Trends” were pretty self-explanatory. “TermsBerry” required a bit more poking and prodding. It clusters the more frequent words near the middle, the rarer words round the edges. When you hover your mouse pointer over a word, some of the other drupelets light up to show you what other words tend to appear nearby. You can mess with the thresholds and decide exactly how close counts as “nearby.”
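The logic behind a word-cloud tool like Cirrus is easy enough to sketch: count word frequencies after throwing away the stopwords. A minimal version, with a toy stopword list and a hypothetical input file (Voyant’s own list is much longer, and its actual implementation is its own affair):

    import re
    from collections import Counter

    # A toy stopword list for illustration; Voyant ships a much longer one
    STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "for"}

    def term_frequencies(text):
        """Count word frequencies, skipping stopwords, roughly as a word-cloud tool does."""
        words = re.findall(r"[a-z']+", text.lower())
        return Counter(w for w in words if w not in STOPWORDS)

    # "corpus.txt" is a hypothetical plain-text file of your own
    with open("corpus.txt", encoding="utf-8") as f:
        for word, count in term_frequencies(f.read()).most_common(20):
            print(f"{count:6}  {word}")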

The “Topics” tool looks interesting. It starts with random seeds, then builds up a distinct word cluster around these seeds based on co-occurrence, and then tries to work out how these word clusters are distributed throughout the text. Each word cluster (or “topic”) technically contains all the words in the text, but each one is named after the top ten terms in the cluster. A few of these seem knitted together by some strong affect (“bed i’ve past lay depression writing chore couple suffering usually”) or a kind of prosody or soundscape (“it’s daily hope rope dropped round drain okay bucket bowls”). Others feel tantalisingly not-quite-arbitrary, resonant with linkages in the same way a surrealist painting is (“bike asda hard ago tried open bag surprisingly guy beard”). But I’m not sure how far I trust my instincts about these artefacts, and I definitely don’t yet know how they might be used to deepen my knowledge of a text, or how they relate to various notions you might invoke in a close reading (theme, conceit, discourse, semantic field, layer, thread, note, tone, mood, preoccupation, etc.).
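I don’t know exactly what Voyant runs under the hood, but topic modelling in this vein is usually some flavour of latent Dirichlet allocation (LDA). Just to make the idea concrete, here is a hedged sketch using the gensim library and a tiny invented corpus, treating each paragraph as a document:

    from gensim import corpora, models

    # Tiny invented corpus: one list of tokens per "document" (here, per paragraph)
    paragraphs = [
        "the bike lay by the open bag near the shop".split(),
        "writing daily is a chore but the writing continues".split(),
        "the rope dropped round the drain by the bucket".split(),
    ]

    dictionary = corpora.Dictionary(paragraphs)
    bow_corpus = [dictionary.doc2bow(p) for p in paragraphs]

    # LDA starts from random initial assignments and iteratively regroups words
    # into topics according to co-occurrence, much as described above
    lda = models.LdaModel(bow_corpus, num_topics=2, id2word=dictionary, passes=20)

    for topic_id, terms in lda.print_topics(num_words=5):
        print(topic_id, terms)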

The various tools on your Voyant dashboard also seemed to be linked, although I didn’t get round to fully figuring that out. Definitely whenever I clicked on a word in the “Reader” tool the other displays would change. Oh: and Voyant Tools seems to be pretty fussy, and didn’t want to run on some people’s laptops. I didn’t have any trouble though.

I got a bit sucked into trying to work out what the “Knot” tool does — it’s this strange rainbow claw waving at me — and didn’t spend much time on the last exercise, which was about regular expressions (or regex). Basically, these are conventions which let you do very fancy and complicated find-replace routines. You can search something like ‘a[a-z]’ which will match aa, ab, ac, ad, etc. Or (one of Ben’s examples) by replacing <[^>]+> with nothing, you can clear out all the XML tags in a text document. You can use regular expressions in plain old Word (just make sure you check the box in the find-replace dialogue), but they probably work a little better in a text editor like Atom or Sublime Text.
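Both of Ben’s examples translate directly into Python’s re module, sketched below; the same patterns work in Atom, Sublime Text, and most other regex engines:

    import re

    html = "<p>We waited for <em>ages</em> at Clapham Junction.</p>"

    # Replacing <[^>]+> with nothing clears out all the tags
    print(re.sub(r"<[^>]+>", "", html))
    # We waited for ages at Clapham Junction.

    # 'a[a-z]' matches an 'a' followed by any lowercase letter: aa, ab, ac, ...
    print(re.findall(r"a[a-z]", "at ages in Clapham"))
    # ['at', 'ag', 'ap', 'am']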

“The purpose of this part of the task is to teach you how to use them, not to teach you how to write them.” Phew! For me, regular expressions never seem to stick around very long in my memory, but it’s very useful to know in broad terms what they’re capable of. Every now and then a task pops up in the form of, “Oh my God, I have to go through the whole thing and change every …” and that’s my cue to start puzzling and Googling and figuring out whether it can be done with regular expressions. If it can, it will probably be quicker and more accurate, and it will definitely be more satisfying.

So: plenty explored, plenty more to explore. And I’m looking forward to the next workshop, Archival Historical Research with Tropy, on 19 February.

JLW