Tag Archives: database


Because I am Trying to Conceptualize Leaves of Grass as a Database…

Ed Folsom’s semi-anecdotal opening to “Database as Genre: The Epic Transformation of Archives” took me back to the late ’80s and early ’90s. My parents, in an attempt to find economical solutions to grocery shopping for a family of nine, frequented the generic detergent, cold cereal, hot cereal, and toiletries sections of the grocery store. I was conditioned to avoid the bright, colorful pictures, and I instead turned my gaze to the black background with the white Times New Roman printing of “Toasted Oats.”

Folsom’s start—an opening frustration with the abundance of lifelessness in the realm of the generic—is a smart preface to his discussion of Walt Whitman and genre. Whitman, even in his labeling, defied the laws of genre as he teased the boundaries of poetry, prose, and everything near or in between. This is no surprise when one considers how Whitman’s writing, if not his very existence, tore at the seams of the very fabric of sexual identity and philosophical thought. He was somewhere between transcendentalism and realism, somewhere between fifty shades of sexual orientation, and somewhere between anti-slavery and white supremacy. Whitman was not one to easily follow a prescribed agenda, and Folsom speaks to how this plays out in Whitman’s description of genre: “peculiar to that person, period, or place—not universal” (1572). Whitman was frustrated with the narrowness, the lack of transport-friendly-interconnectedness that comes along with genre. He did not want to be placed in a box, and Folsom is suggesting that the reason behind his refusal was a lack of options.

Recognizing this “ongoing battle with genre” (1572), Folsom offers up the database as the best description of Whitman’s work. He credits Lev Manovich with introducing this conceptualization of the database as genre, and he adds to the conversation by asserting that for Whitman, “the world was a kind of preelectronic database” (1574). Moreover, he supports this claim by referring to Whitman’s multiple revisions, last-minute edits, antebellum and postbellum coverage, and strategic posting of lines of poetry as markers or code within the text. This problematizing of Whitman as database then leads to a discussion of archive vs. database. Seeking to separate Derrida’s concept of “archive fever” from database, Folsom contends that archive is more closely associated with physical space, the actual housing of artifacts, whereas database is more of a digital linking of information concerning a particular subject or combination of subjects. He establishes database as a new genre, one that can serve as a fitting home for Whitman’s works.

To be completely honest, I struggled with this piece. At times I jumped in, ready to find a place for Whitman, willing to re-embrace him as a low-tech visionary and genius. And then there were times when my spidey senses tingled: How dare he box the unboxed Whitman? Why must “archive” exist in such limited terms? Being mindful of these tensions, I pose three questions. As with my previous provocation, feel free to respond to one or none of the following questions:

1) How do you think Whitman would respond to Folsom’s reading of his work?
2) Given our readings this week and last week, what do you think of Ed Folsom’s description of “archive” and “database”? Would you reframe them?
3) What does Folsom’s act of naming database as a genre do for the field of the humanities? What is its effect?

Folsom, Ed. “Database as Genre: The Epic Transformation of Archives.” PMLA 122.5 (2007): 1571-79. Print.

Lev Manovich. The Language of New Media: The Forms

Minecraft Creeper novelty wallet

In this selection from The Language of New Media, Lev Manovich observes the shifts in visual culture and their underlying organization. To begin, he sketches a portrait of New York web development in 1999. He observes the iconographic migrations of browser buttons to wallets and filing cabinets to computer icons to illustrate the cross-pollination of “virtual” forms. He traces the movements of cultural metaphor: those grafted into computer practices and those conceptualizations based on computers. Manovich goes on to distinguish and blur the computer database and 3-D virtual space as arenas of work and fun in computers. He refers to two of Janet Murray’s four essential properties of digital environments, encyclopedic and spatial, to elaborate the aims of new media design. He draws attention to the “opposition characteristic of new media — between action and representation” (Manovich 216). His call for “info-aesthetics” corresponds with much of his art: he considers data the new medium, as film and photography once were. Take, for example, his Timeline. Introducing the database as “the key form of cultural expression of the modern age,” Manovich traces a theoretical descent from Panofsky’s art-historical description of perspective to Lyotard’s cultural-theoretical Postmodern Condition to Berners-Lee’s computer science proposal of the World Wide Web (218-19). Threading together these disciplinary developments, he demonstrates the broadly strewn, networked fields of cultural productivity. The refresh, addend, amend nature of the Web, he contends, lends itself to organization by collection rather than completed narratives. Apparent narratives, such as computer games, depend on players reverse-engineering the underlying algorithms. Thus the “ontology of the world according to computers” is reduced to data structures and algorithms (223).
Describing the complementary nature of database and algorithm, he shows how the map of our information is greater than the territory: our indices eclipse our information. Positing database in contrast to narrative, he addresses how our meaning-making shifts accordingly. He goes on to describe the structure of new media in semiotic structuralist terms (following Barthes). He contends that the language-like sequencing is a holdover from the cinema. Manovich’s frame-by-frame sequence of cinema as differentiated from all-images-at-once spatialized visual culture does not entirely hold up, especially with today’s ‘view-as-you-please’ stop-and-go on-demand video media. I wonder if the database articulation Manovich extols in Whitney’s Catalog really changed the course of how we perceive visual culture. The effects Whitney developed certainly contributed to the visual amplifications made by computers, but do they really mark any sort of break in the database/narrative tension?

Manovich seems to suggest that chronological linearity is narrative, and that artists trying to undermine it are attempting to express the database — or all options at once. He considers Peter Greenaway a prominent “database filmmaker” (239). Excerpt from Peter Greenaway’s The Falls, 1982 and


I am not sure that these catalogs of effects achieve the non-narrative. There are certainly differences, but do these assemblages constitute paradigm over syntagm?



What would a radical break from narrative to database look like? Do those things which stubbornly persist through restructuration (Manovich citing Jameson) have something to them which is, dare I say, essentially human? Or are our formal expressions discrete, replaceable, and bound to evolve beyond recognition? Can the paradigm, the vast array of associations, truly be manifested in the database, if we as readers still depend on syntagms (what the screen or interface can render)?


Manovich, Lev. The Language of New Media. Cambridge: MIT Press, 2001.



Etsy Floppy Disk Notebook

Cohen on Data Mining


Cohen argues that computational methods for analyzing, manipulating, and retrieving data from large corpora will provide new tools for academic research, including research in the humanities. He provides two examples, both projects he worked on. Syllabus Finder, a document classification tool for aggregating and searching course syllabi, finds and collects documents that show similar patterns in their use of words. It can also differentiate documents that share similar keywords by analyzing their use of other words. The other example is H-Bot, a question-answering tool that takes in queries in natural language (instead of code), transforms the query using predetermined rules, and conducts a web search before outputting the answer the tool decides is relevant.
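To make the idea of "similar patterns in their use of words" a little more concrete, here is a minimal sketch in Python. It is not Cohen's actual method, which I have not seen; it swaps in a common textbook technique (comparing word counts with cosine similarity). The function names and sample texts are invented for illustration. Both candidates share keywords with the known syllabus ("course," "grading," "assignments," "office"), but the rest of their vocabulary pulls them apart.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count how often each word appears."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Compare two word-count tables; 0.0 means no overlap, 1.0 means identical proportions."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(n * n for n in a.values()))
    norm_b = math.sqrt(sum(n * n for n in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample texts (assumptions for illustration, not real syllabi).
known_syllabus = "course syllabus week readings assignments grading office hours readings"
candidate_a = "syllabus for the course: weekly readings, assignments, grading policy, office hours"
candidate_b = "grading lumber: office furniture assignments for the warehouse course of action"

reference = word_counts(known_syllabus)
score_a = cosine_similarity(reference, word_counts(candidate_a))
score_b = cosine_similarity(reference, word_counts(candidate_b))

print(f"candidate_a: {score_a:.2f}")
print(f"candidate_b: {score_b:.2f}")
```

The candidate that reads like a syllabus scores higher than the one that merely shares a few keywords, which is roughly the distinction the tool is described as making.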

Lessons that Cohen learned while building these tools:

  • APIs are good
    • they offer the possibilities for combining various resources (which facilitates the use of less rigorous but more accessible corpuses)
    • third-party development can lead to unexpected and positive results
  • open resources are better than restricted ones (access makes up for quality)
  • large quantity can make up for quality

Just in case: an API is a way of letting our software get data from other software (usually running on another computer, like a web server) instead of our having to do it manually. The following is one of the more concise and less technical-details-oriented explanations I found online: https://www.quora.com/What-is-an-API
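For those who like to see it in code, here is a rough sketch in Python of the two halves of talking to a web API: building the request and reading the reply. Everything specific here is an assumption made up for illustration: the address `api.example.org`, the `q` and `limit` parameters, and the shape of the response do not belong to any real service. The sketch does not actually go online; it uses a canned response in place of a server.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, invented for illustration.
BASE_URL = "https://api.example.org/collection/search"


def build_request_url(query, limit=5):
    """Encode the search terms into the URL our program would request."""
    return BASE_URL + "?" + urlencode({"q": query, "limit": limit})


def titles_from_response(body):
    """Pull item titles out of a JSON response body (given as a string)."""
    data = json.loads(body)
    return [item["title"] for item in data.get("results", [])]


# A made-up response standing in for what a server might send back.
sample_body = (
    '{"results": [{"title": "Leaves of Grass (1855)"}, '
    '{"title": "Leaves of Grass (1891-92)"}]}'
)

url = build_request_url("leaves of grass")
titles = titles_from_response(sample_body)

print(url)
print(titles)
```

The point is only that the program, not a person clicking through a website, assembles the question and unpacks the answer.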

Also, I feel that The Lexicon of DH workshop slides provide a good overview of the coming week’s theme.

So indeed, the use of APIs has become more common outside of the IT field since 2006. The New York Public Library, the Cooper-Hewitt Museum, and the New York Times, among many others, provide APIs that allow access to their digital collections through software. MoMA provides its collection data on GitHub.

The technologies used for document searching and question answering, the two examples that Cohen provides, have developed into something arguably more reliable, faster, and easier to use. For example, we don’t even need to build a tool to ask some questions in natural language:

We'll remember you, H-Bot.

Relating back to the discussion of previous weeks, what do you think are the impacts or implications of the growth of digital collections and APIs, along with developments in data collection and analysis technologies, for teaching (or for broader aspects of life and research)? How does this fit together with more traditional modes of teaching, like textbooks?

Another question I have relates to the fact that both examples mentioned in the article are no longer functioning. The latest update on Syllabus Finder that I could find explains that a system change in the Google search API effectively deprecated the tool; it also provides a download link to the database of syllabi, though only to a small part of it. H-Bot is online, but sadly doesn’t seem able to answer me:

Oh, H-Bot.

I can easily imagine the difficulties of maintaining such a digital project. I am also under the impression that eventual obsolescence is the fate of many digital projects. They require a different type of effort than, say, putting out journal articles. Maintenance requires manpower, and manpower requires funds. I also have the ambivalent feeling that it may not necessarily be a bad thing that some projects finish their life cycle, though it would be great if those projects were archived somehow (in a functioning state). I guess I feel more personally involved since I will probably build something or another during my time here. I would love to hear your thoughts on this matter.

Some more or less related links: