

Response to A. Hyde et al. and L. Hyde: Collaboration, Sharing, Ownership

That there were two articles this week written by different first authors with the same last name, both of whom described issues related to property, ownership, and sharing, seemed to add a new layer of complexity to the issues at hand. How can one prove ownership of property if one cannot prove to be oneself and no one else?

In any case, the two articles demonstrated different aspects of collaboration and openness with respect to the distribution and use of digital property. A. Hyde et al. (2012) provide an overview of what constitutes sharing and collaboration of intellectual property. Drawing a distinction between sharing and collaboration, the authors suggest that sharing content involves treating it as a social object that can be directly linked to an author, whereas in collaboration, the direct link between authors and the content produced is less clearly observed. In the case of Wikipedia, all edits are preserved, but the final article as it appears may consist of multiple edits. Though Wikipedia articles are in some ways culturally constructed, there are safeguards against the falsification of information, as illustrated by the Colbert Report incident. How might having a distributed network of authors affect the product of a collaboration, and is the accuracy of the information any more or less questionable than that of a piece written by a solitary author?

A. Hyde et al. (2012) continue by outlining criteria, framed as questions, for a successful collaboration. Included are questions of intention, goals, self-governance, coordination mechanisms, knowledge transfer, identity, scale, network topology, accessibility, and equality. The question of network topology stood out as an important issue, yet one that I had not previously considered an aspect of collaboration. In the case of Wikipedia, contributions appear to be individually connected, unless there is a conflict between two editors working at the same time. In any given collaboration, is it possible to sketch out a model of the roles and tasks of the individuals or entities involved? Is it always feasible to do so?

Whereas A. Hyde et al. (2012) discuss the process of collaboration, L. Hyde (2010) focuses on the proprietary aspects of collaboration, specifically the “commons.” In contrast to views that place the idea of a commons outside the realm of physical property, L. Hyde speculates that the commons is in fact property, and by definition, “a right to action.” Later, he elaborates by stating that “a commons is a kind of property in which more than one person has rights” (p. 27), suggesting that a commons may include larger groups of contributors. The word “commons” itself apparently derives from proprietary feudal systems, in which such a thing would ultimately be under the ownership of nobility, and in order to use it, others would have to contribute certain goods or resources in exchange. In this case, a commons was typically a piece of land jointly used by multiple individuals for agrarian purposes. These systems strictly controlled the use of the commons as well as any product reaped from it. According to the author, a modern commons is a “kind of property in which more than one person has a right of action” (p. 43). As “commoners,” how should contributors view their contributions? Can one reasonably expect to retain sole ownership of property once it has been submitted to a commons?



This week, we again consider the issue of ownership of intellectual property. A. Hyde et al. (2012) prompt us to consider the complexities of collaboration and to think about ways to structure successful collaborations, while L. Hyde (2010) describes the evolution of the modern commons as property with collective ownership. As teachers and academics, in what ways can we effectively structure collaboration and the sharing of knowledge in a commons? What recommendations would you have for students and peers to form constructive models of knowledge generation and sharing?



Lewis Hyde (2010). Common as Air: Revolution, Art, and Ownership. New York, NY: Farrar, Straus and Giroux, pp. 23–38.

Adam Hyde et al. (2012). What Is Collaboration Anyway? In M. Mandiberg (Ed.), The Social Media Reader (pp. 53–67). New York, NY: NYU Press.

Visualizing Impossibility: Thoughts on Lauren Klein

In Lauren Klein’s “The Image of Absence: Archival Silence, Data Visualization, and James Hemings,” we search alongside her for ghosts, silences, and absences in the archive. Over the course of the article, she seeks to illuminate the life and contributions of James Hemings within the Papers of Thomas Jefferson, a digital archive made available through ROTUNDA, University of Virginia Press, and in doing so, discusses the possibilities and pitfalls of data visualization in this process. For Klein, digital technology has the capacity to render visible the invisibilities of archival gaps, and at the same time to expose the limits of our knowledge as a productive space with which to think.

Recalling last week’s conversation about narrative and database, Klein suggests that archival silences can be produced, in part, by metadata and data structuring decisions (663). This claim dovetails with Lisa Brundage’s suggestion that the most essential word in database theory is the “you,” or human agency responsible for decisions regarding information. In the context of Klein, the locus of “you” as human interacting with or producing an archive becomes a space for determining the nature of archival imbalances, power, and structure—particularly when Klein asks, “How does one account for the power relations at work in the relationships between the enslaved men and women who committed their thoughts to paper, and the group of (mostly white) reformers who edited and published their works?” (664)

This same question of the “you” that must be accounted for appears in the data visualists’ role in rendering information visually, and is part of Klein’s call for a greater theorization of the digital humanities. She states, “the critic’s involvement in the design and implementation—or at the least, the selection and application—of digital tools demands an acknowledgment of his or her critical agency” (668). In Klein’s scholarship, qualifying and elucidating the role of “you” is paramount to understanding the archive, the visualization, and the data collected.

Critique without suggesting an alternative is all too easy, and I admire the way in which Klein posits data visualization as an antidote to archival silences while also deeply engaging the fraught history of its practice (665). She engages visualization’s vexed history through the figure of Thomas Jefferson himself, who underwent training in early forms of data visualization with William Small at the College of William and Mary. In this section of the article, we gain a sense of how complex it is to engage these forms: can the same tool that Jefferson was so fond of also be a tool for scholars to resurrect, centuries later, the memories and presence of the slaves he owned?

Klein also explores the ways in which Jefferson’s note-taking and records use representation in diagrams, charts, and tables to suggest that he was engaged in using data visualization as a “form of subjugation and control—that is, the reduction of persons to objects, and stories to names,” which points to the reductiveness and potential for violence in these types of visual display (679). Klein’s portrayal of Jefferson here, as an unthinking white man who recorded Hemings as empirical evidence, to be charted and claimed as such, is emblematic of the central question of her piece: how can we visualize without appropriation, acknowledge incompleteness, and, to paraphrase Marcus and Best, let ghosts be ghosts without claiming them for our own purposes or meanings?

Evoking Stephen Ramsay’s idea of “deformance,” or the creative manipulation and interpretation of textual materials, Klein ultimately suggests that rendering Hemings in an act of visual deformance makes legible “possibilities of recognition” that the actual textual content of the Papers of Thomas Jefferson resists, while “expos[ing] the impossibilities of recognition—and of cognition—that remain essential to our understanding of the archive of slavery” in contemporary studies (682).


When confronted with archival ghosts, Klein seems to suggest that the best policy is: illuminate, not explicate. How do you negotiate the difference between these two words, and can you share with us the ways it influences your pedagogy and scholarship?

Is there ever truly a safe way to visualize data, particularly regarding people and especially those who have been silenced, ghosted, or violated, in a way that rhetorically privileges stories and narrative over names and numbers?

To what extent does digital technology provide solutions of access for archival materials, but at the same time reproduce power structures that perpetuate silences? Can digital technology increasingly address this question through innovation, or is this a question of institutional change?

Klein’s argument regarding silences in digital archives seems to address the question of mark-up and encoding, whose granularity is often determined by institutional funding. In a recent conversation, Erin Glass (of Social Paper, an amazing platform for student-centered writing that you should check out!) and I noted that the first invisible document of any archive, institution, or project is often a grant. This document lays out the rationale, timeline, and required resources that shape the development of the project, but it is rarely discussed once secured for an institution, and is often invisible except in gestures towards sponsorship or funding. ROTUNDA is an organization that is part of the University of Virginia Press, but whose digitization work is funded through grants. It is likely that decisions of encoding granularity were built into the grant itself and the time requirements of the project.

So, at the roots of the process of creating digital archives, how might we conceive of the entire process–from grant onwards–as a new space to intervene in inclusive, even collaborative, editing processes that produce richer metadata? Does this help address archival silences, or instead offer more opportunities to reproduce them?

Lev Manovich. The Language of New Media: The Forms

Minecraft Creeper novelty wallet

In this selection from The Language of New Media, Lev Manovich observes shifts in visual culture and their underlying organization. To begin, he sketches a portrait of New York web development in 1999. He observes the iconographic migrations of browser buttons onto wallets and of filing cabinets into computer icons to illustrate the cross-pollination of “virtual” forms. He traces the movements of cultural metaphor — those grafted into computer practices and those conceptualizations based on computers. Manovich goes on to distinguish, and then blur, the computer database and 3-D virtual space as arenas of work and fun in computers. He refers to two of Janet Murray’s four essential properties of digital environments, the encyclopedic and the spatial, to elaborate the aims of new media design. He draws attention to the “opposition characteristic of new media — between action and representation” (Manovich 216). His call for “info-aesthetics” corresponds with much of his art — he considers data the new medium, as film and photography once were. Take, for example, his Timeline.

Introducing the database as “the key form of cultural expression of the modern age,” Manovich traces a theoretical descent from Panofsky’s art-historical description of perspective, to Lyotard’s cultural-theoretical Postmodern Condition, to Berners-Lee’s computer-science proposal of the World Wide Web (218-9). Threading together these disciplinary developments, he demonstrates the broadly strewn, networked fields of cultural productivity. The refresh, append, amend nature of the Web, he contends, lends itself to organization by collection rather than by completed narratives. Apparent narratives, i.e., computer games, depend on players reverse-learning algorithms. Thus the “ontology of the world according to computers” is reduced to data structures and algorithms (223).
Describing the complementary nature of database and algorithm, he shows how the map of our information is greater than the territory — our indices eclipse our information; positing database in contrast to narrative, he addresses how our meaning-making shifts accordingly. He goes on to describe the structure of new media in semiotic, structuralist terms (following Barthes). He contends that language-like sequencing is a holdover from the cinema. Manovich’s frame-by-frame sequence of cinema, as differentiated from all-images-at-once spatialized visual culture, does not entirely hold up, especially with today’s ‘view-as-you-please’ stop-and-go on-demand video media. I wonder if the database articulation Manovich extols in Whitney’s Catalog really changed the course of how we perceive visual culture. The effects Whitney developed certainly contributed to the visual amplifications made by computers, but do they really mark any sort of break in the database/narrative tension?

Manovich seems to suggest that chronological linearity is narrative, and that artists trying to undermine it are attempting to express the database — or all options at once. He considers Peter Greenaway a prominent “database filmmaker” (239). Excerpt from Peter Greenaway’s The Falls, 1982


I am not sure that these catalogs of effects achieve the non-narrative. There are certainly differences, but do these assemblages constitute paradigm over syntagm?



What would a radical break from narrative to database look like? Do those things which stubbornly persist through restructuration (Manovich citing Jameson) have something to them which is, dare I say, essentially human? Or are our formal expressions discrete, replaceable, and bound to evolve beyond recognition? Can the paradigm, the vast array of associations, truly be manifested in the database, if we as readers still depend on syntagms (what the screen or interface can render)?


Manovich, Lev. The Language of New Media. Cambridge, MA: MIT Press, 2001.



Etsy Floppy Disk Notebook

Cohen on Data Mining


Cohen argues that computational methods for analyzing, manipulating, and retrieving data from large corpora will provide new tools for academic research, including in the humanities. He provides two examples, both projects he worked on. Syllabus Finder, a document-classification tool for aggregating and searching course syllabi, finds and collects documents that show similar patterns in their use of words. It can also differentiate documents that share keywords by analyzing their use of other words. His other example is H-Bot, a question-answering tool that takes queries in natural language (instead of code), transforms each query using predetermined rules, and conducts a web search before outputting the answer the tool decides is relevant.
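Cohen doesn't show his implementation in the article, but the core idea behind matching documents by their word-use patterns can be sketched with a simple bag-of-words cosine similarity. The function and sample texts below are illustrative only, not Syllabus Finder's actual code:

```python
import math
from collections import Counter

def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Compare two documents by their word-use patterns:
    1.0 means identical word distributions, 0.0 means no shared words."""
    a, b = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy documents: two syllabus-like texts and one unrelated text.
syllabus = "week 1 readings week 2 assignment final exam office hours"
syllabus2 = "week 1 lecture week 2 readings midterm exam office hours"
recipe = "mix flour sugar and butter then bake at 350 degrees"

print(cosine_similarity(syllabus, syllabus2))  # relatively high
print(cosine_similarity(syllabus, recipe))     # zero overlap
```

A real tool would also weight words by how distinctive they are (e.g., tf-idf), which is what lets it separate documents that merely share a few keywords.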

Lessons that Cohen learned while building these tools:

  • APIs are good
    • they offer the possibility of combining various resources (which facilitates the use of less rigorous but more accessible corpora)
    • third-party development can lead to unexpected and positive results
  • open resources are better than restricted ones (access makes up for quality)
  • large quantity can make up for quality

Just in case: an API is a way of making it easy for our software to get data programmatically (instead of manually) from another piece of software (usually on another computer, like a web server). The following is one of the more concise and less technical explanations I found online: https://www.quora.com/What-is-an-API
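As a concrete sketch of what “software getting data from software” looks like, here is the typical shape of a collection-API exchange: build a request URL with query parameters, then parse the JSON the server sends back. The endpoint, parameter names, and response fields below are hypothetical, not any particular institution's API, and the response is inlined so the sketch runs offline:

```python
import json
from urllib.parse import urlencode

# Build the kind of request URL a collection API expects:
# a base endpoint plus query parameters.
base = "https://api.example-museum.org/search"
params = {"q": "floppy disk", "format": "json", "page": 1}
url = base + "?" + urlencode(params)

# A collection API typically answers with JSON; this sample
# payload imitates that shape.
sample_response = '''
{"results": [
   {"title": "Floppy Disk Notebook", "id": "obj-001"},
   {"title": "Minecraft Creeper Wallet", "id": "obj-002"}
 ],
 "total": 2}
'''

data = json.loads(sample_response)
titles = [item["title"] for item in data["results"]]
print(url)
print(titles)
```

In a live setting, the only change is that the JSON comes from an HTTP request to the URL (e.g., with `urllib.request` or the `requests` library) rather than from a string.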

Also, I feel that The Lexicon of DH workshop slides provide a good overview of the coming week’s theme.

So indeed, the use of APIs has become more common outside the IT field since 2006. The New York Public Library, the Cooper Hewitt Museum, and the New York Times, among many others, provide APIs that allow access to their digital collections through software. MoMA provides its collection data on GitHub.

The technologies used for document searching and question answering, the two examples that Cohen provides, have developed into something arguably more reliable, faster, and easier to use. For example, we no longer even need to build a tool in order to ask some questions in natural language:

We'll remember you, H-Bot.

Relating back to the discussions of previous weeks, what do you think are the impacts or implications that the increase in digital collections and APIs, along with developments in data-collecting and analyzing technologies, has on teaching (or on broader aspects of life and research)? How does this fit together with more traditional modes of teaching, like textbooks?

Another question I have relates to the fact that both examples mentioned in the article are no longer functioning. The latest update on Syllabus Finder that I could find explains that a system change in the Google search API effectively deprecated the tool; it also provides a download link to the database of syllabi—but only a small part of it. H-Bot is online, but sadly doesn’t seem able to answer me:

Oh, H-Bot.

I can easily imagine the difficulties of maintaining such a digital project. I am also under the impression that eventual obsolescence is the fate of many digital projects. They require a different type of effort than, say, putting out journal articles: maintenance requires manpower, and manpower requires funds. I also have the ambivalent feeling that it may not necessarily be a bad thing for some projects to finish their life cycle, though it would be great if those projects were archived somehow (in a functioning state). I feel more personally involved since I will probably build something or other during my time here; I would love to hear your thoughts on this matter.

Some more or less related links: