Introduction

Digital Textuality is based on the interplay of several distinct elements:

Apply these considerations to

Flesh out these general considerations while outlining my own prejudices (theoretical models), research projects, and system implementations.



Theory, Design, Result

System design is an instantiation of theoretical considerations.

Selection of what is important: setting the parameters of an environment in which others will interact. More obvious in analytic and presentation systems:

All systems have sets of theoretical/critical assumptions built in.

Digital text design starts with critical theory. PhiloLogic as an example.

Large scale orientation toward social elements of textual
Text modeled as hierarchies of objects and associated metadata

Results feedback (hopefully) to refinement of theory and implementations.



Ghosts in the Machine

There is no "computer" in the class

Critics/scholars in the humanities fetishize the "computer" as a machine; an abstract, rather magical machine (cyborg).

There are only implementations. Implementations are

The computer we use (as opposed to the physical calculator) is the result of the interplay of many humans, at various levels, creating textual environments within which we work, play, understand, etc.

Criticism of digital textuality (or other media) should, like any other humane endeavor, begin and end with the human and how it is expressed in this particular kind of writing.



Design/Programming as Cultural Activity

Cultural/Social Activity: Extreme Programming as Constructive Disruption

"Everything about XP that has to do with social change is where you get all the resistance."

Design/programming is creation of a text that describes and creates "a world" that can be criticized in much the same way as any other text. Some examples:

Design/programming is at least two sets of abstractions, described/encoded as a text (data structures and algorithms).

These two forms of abstraction -- from the world and from the machine -- may be related.

PhiloLogic: the text object model may be the best way to think about text with a large amount of descriptive, social metadata (gender or sexual orientation of characters in plays, location of composition of a letter).



Example: Is there an author in the class?

Authorship is open to question in (at least):

A matter of control?

The "author" of a hypertext functions within the environment created by a designer/programmer, with an uncertain amount of control of that environment.

The "author" may give to the reader/player more or less control over the outcome(s) and directions.

The reader/player may control avatars by sets of parameters which are restricted by the designer and "author".

Death of the Author?

Probably not. Author as self-conscious conspirator to achieve an effect.



Finding Critical Assumptions

What can't you do/say/implement as author, reader, player? The environment is designed by others -- a social process -- and there is a grammar and semantics of this particular space.

Example: multi-threading?

Loose parallel with my interest in using computers to look for longer term patterns of meaning rather than look at individual texts/authors.

Speculative Definition:

A digital critical theory might examine the structures and contours of socially created environments in and with which individuals attempt to achieve one or more meaningful effects.

These social environments may be, in retrospective systems

In "born-digital materials", all of the above apply, with examination of

of a particular system.

Or: the interaction of what you don't control with what you do control.



Algorithmic Limitations

Criticism of digital textuality is based upon the human sciences.

Stress the human.

There are algorithmic limitations on what can be done with computers:

So far, we have had better success in designing algorithms to treat relatively simple and repeatable human phenomena rather than the extraordinary or irreducibly individual.

MVO: Conclude Section



Access is good

Electronic publication is an economic and cultural revolution of staggering proportions.

It does not significantly impact the textual experience or analytical capability without designed systems supporting extended functionality in the text.

Much of what is called electronic or digital publication today is simply getting a single, long text online, often in a format better suited for printing (PDF, etc) than digital exploitation.

Improving microfilm, photocopying, and publication technologies -- using digital means -- is vitally important, but does not in and of itself significantly alter textuality.



Overview

Encyclopédie ou Dictionnaire raisonné des sciences, des arts et des métiers (1751-1772). Scale of the work and complexity of its organization:

Scale of enterprise:

Automated feature recognition -- discussed in reading.

Ongoing: currently correcting metadata and adding authorial attributions from later scholarship.



Design Objectives

Overcome the difficulties in working with the Encyclopédie and implement it in a way that conforms to what we understand of Diderot's conception of the organization of knowledge.

Three general modes of organization:

Each of these, even the simplest, presents interesting problems.

The dictionary organization, by head words, often includes groups of subarticles under the same title or rubric but with different attributes.

A number of elements in the Encyclopédie that are not articles, legends, etc., such as the Preliminary Discourse, also have to be handled.

Combination of text and image, dynamically related.



Text Objects and Attributes

Underlying design principle is that all text is composed of hierarchies of objects and their particular attributes.

Textual objects are defined from documents to words without reference to the type of object -- book, chapter, article, dictionary entry, poem, verse, sentence, etc. -- in the word index, allowing for searching of words in selected objects and navigation up and down the hierarchy.

The paragraph in Diderot's Encyclopédie containing the phrase étincelles lumineuses has the logical address 35:78:0:51, being the 51st child object of the 78th top level object of the 35th file (or document).

The metadata associated with this object is stored in an SQL table that points to 35:78, indicating that it is the main article Electricite, by d'Aumont, in the class of knowledge Physique, and is associated with the page object 35:43, the 43rd page object (not the page number, which is stored in a related table) of the 35th file.

Textual Objects are Nested Hierarchies that may have different attributes.

Headword     Type  Author      Class of Knowledge                             P.S.  Vol:Obj  Obj ID
LUNE         artm  d'Alembert  Astr.                                          s.f.  9:3472   70:12:0
Lune         arts  XXX         Chimie.                                        NA    9:3473   70:12:1
Lune         arts  Venel       Chimie.                                        NA    9:3474   70:12:2
Lune         arts  XXX         Hist. nat. Chimie, Metallurgie a Mineralogie.  NA    9:3475   70:12:3
Lune cornée  arts  XXX         Chimie Metall.                                 NA    9:3476   70:12:4
Lune         arts  Jaucourt    Mythologie.                                    NA    9:3477   70:12:5
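
The logical addressing described in this section can be sketched in a few lines. This is only an illustration, not PhiloLogic's actual code: the function names are hypothetical, and an in-memory dictionary stands in for the SQL metadata table. It shows how a deep address such as 35:78:0:51 can be walked up the hierarchy until an object carrying metadata (here 35:78) is found.

```python
def parse_address(address):
    """Turn a colon-separated logical address into a tuple of integers."""
    return tuple(int(part) for part in address.split(":"))


def ancestors(parts):
    """Return the enclosing objects of an address, nearest first."""
    return [parts[:i] for i in range(len(parts) - 1, 0, -1)]


def find_metadata(parts, table):
    """Return metadata for the object itself or its nearest ancestor."""
    for candidate in [parts] + ancestors(parts):
        if candidate in table:
            return table[candidate]
    return None


# Assumption: a dictionary standing in for the SQL table; the values are
# the ones given in the text for the article Electricite.
METADATA = {
    (35, 78): {
        "headword": "Electricite",
        "author": "d'Aumont",
        "class": "Physique",
        "page_object": "35:43",
    },
}

paragraph = parse_address("35:78:0:51")
article = find_metadata(paragraph, METADATA)
```

Because objects are addressed by position rather than by type, the same lookup works whether the enclosing object is an article, a chapter, or a poem.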



Textbase/Knowledgebase/Hyperbase

The object model underlying PhiloLogic allows for three simultaneous and mutually dependent modes of access using different subsystems:

Example: search for "chien.*" in all articles dealing with "hunting" where the entry word is a verb.

Search plate legends pertaining to "alphabet" containing the word "grec".
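
A minimal sketch of how such a combined query might work: a word pattern is applied only inside objects whose metadata satisfy the other conditions. The records below are invented toy data, not actual Encyclopédie text, and the field names and the `search` function are assumptions made for illustration rather than PhiloLogic's interface.

```python
import re

# Invented toy records; headwords and texts are placeholders.
RECORDS = [
    {"headword": "EXEMPLE-A", "class": "Chasse", "pos": "v. act.",
     "text": "lancer les chiens sur la voie"},
    {"headword": "EXEMPLE-B", "class": "Chasse", "pos": "s. m.",
     "text": "le chien de chasse"},
    {"headword": "EXEMPLE-C", "class": "Marine", "pos": "v. n.",
     "text": "les chiens du bord"},
    {"headword": "EXEMPLE-D", "class": "Chasse", "pos": "v. n.",
     "text": "sonner du cor"},
]


def search(records, word_pattern, class_contains, pos_prefix):
    """Return (headword, matching words) for records passing all filters."""
    word_re = re.compile(word_pattern, re.IGNORECASE)
    hits = []
    for rec in records:
        # Metadata filters: class of knowledge and part of speech.
        if class_contains.lower() not in rec["class"].lower():
            continue
        if not rec["pos"].lower().startswith(pos_prefix.lower()):
            continue
        # Word filter: the pattern must match a whole word.
        words = re.findall(r"\w+", rec["text"])
        matched = [w for w in words if word_re.fullmatch(w)]
        if matched:
            hits.append((rec["headword"], matched))
    return hits


hits = search(RECORDS, r"chien.*", class_contains="chasse", pos_prefix="v")
```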



Mapping Structures of Knowledge

Three modes of organization which, taken together, were described by Diderot and d'Alembert as encyclopedic:

Some modern commentators describe the Encyclopédie as an "ancestor of hypertext" and depict Diderot as "l'internaute d'hier".

Ainsi trois choses forment l'ordre encyclopédique; le nom de la Science à laquelle l'article appartient; le rang de cette Science dans l'Arbre; la liaison de l'article avec d'autres dans la même Science ou dans une Science différente; liaison indiquée par les renvois, ou facile à sentir au moyen des termes techniques expliqués suivant leur ordre alphabétique. [DP]

The implementation of the Encyclopédie allows for experimentation. The following is a first try to examine the organization of 18th century knowledge by conflating the classification and the cross-references. Inspired by Diderot:

l'ordre encyclopédique général sera comme une mappemonde où l'on ne rencontrera que les grandes régions; les ordres particuliers, comme des cartes particulieres de royaumes, de provinces, de contrées; le dictionnaire, comme l'histoire géographique & détaillée de tous les lieux, la topographie générale & raisonnée de ce que nous connoissons dans le monde intelligible & dans le monde visible; & les renvois serviront d'itinéraires dans ces deux mondes..
Comparison: Systême Figuré Renvois Map Three

Diderot's model of hypertext

Each dictionary entry was assigned to a "class of knowledge," placing it within the "order" of human knowledge.

This order reflects a Baconian hierarchical classification of knowledge and Enlightenment theories of epistemology.

All understanding is founded upon memory, reason, or imagination, with the numerous categories and sub-categories branching out from these three faculties. (Essai d'une distribution généalogique des Sciences et des Arts Principaux)

Insufficient to indicate the inter-connections of knowledge.

The renvois system: provides a lattice of inter-connections between individual leaves of the tree as well as between classes of knowledge.

The reader is encouraged to follow the renvois, in the original print version and the ARTFL electronic implementation, which unite fields of knowledge into what was hoped to be a seamless totality.

As presented by Diderot and d'Alembert in the "Discours Préliminaire" and the article "Encyclopédie," the renvois are a critical part of the epistemology that informs the entire work. Diderot suggested, furthermore, that they could reveal hidden, subversive relationships between articles, thereby eluding the strict censorship to which the work was subject.



The Order of Things

Encyclopédie signifies the "chaining of knowledge", from the Greek "en", "circle", and "knowledge".

Christophe de Savigny depicted the structures of knowledge in the 16th century.

IMAGE

Tableaux accomplis de tous les arts libéraux (Paris Gourmond, 1587). Savigny was a follower of Ramus; his "encyclopédie, ou suite et liaison de tous les arts et sciences" is bordered with a chain, clearly echoing Ramus's view of knowledge (1555) as

quelque longue chaîne d'or telle que feint Homère, de laquelle annelets soient ces degrés ainsi dépendents l'un de l'autre, et tous enchaînés si justement ensemble, que rien ne s'en puisse ôter sans rompre l'ordre et continuation de tout.


Words: Dictionary Order

Dictionary order is the simplest, but it has many assumptions in terms of organizing knowledge.

17th and 18th century tradition of universal dictionaries: Diderot borrows heavily from the Trévoux (1743, 1752) and the earlier Furetière (1690).

La langue d'un peuple donne son vocabulaire, & le vocabulaire est une table assez fidele de toutes les connoissances de ce peuple.

Diderot recognizes that there is a structure, since general words subsume many others.

Even this breaks down in practice:
APPARITION Diderot; Gram., NA
Apparition d'Alembert; , NA
VISION, APPARITION Jaucourt; Synonym., NA

These fall in different classifications, without a single renvoi between them.

Forces odd editorial conventions. Many place names, few proper names. Biographies are found under the places where the individual was born (Newton).

Not all: the biography of Descartes is found in the article CARTÉSIANISME. There is also an article NEWTONIANISME.

Diderot likens biographies and eulogies of great men to statues honoring them in village squares.



Classification

All classification schemes are arbitrary.

Diderot: infinite number of points of view. "[T]he number of systems of human knowledge is as large as that of these points of view".

Developed the arbre généalogique ou encyclopédique as a means to place the particular elements of human understanding into a coherent structure based on the three major faculties: memory, reason, and imagination.

Two views: Systême Figuré des connoissances humaines and Essai d'une distribution généalogique des Sciences et des Arts Principaux

Directions of use

comment nous avons tâché de concilier dans ce Dictionnaire l'ordre encyclopédique avec l'ordre alphabétique. Nous avons employé pour cela trois moyens, le Système figuré qui est à la tête de l'Ouvrage, la Science à laquelle chaque article se rapporte, & la maniere dont l'article est traité. On a placé pour l'ordinaire après le mot qui fait le sujet de l'article, le nom de la Science dont cet article fait partie; il ne faut plus que voir dans le Système figuré quel rang cette Science y occupe, pour connoître la place que l'article doit avoir dans l'Encyclopédie.

Two functions: a guide for selection of materials to be treated in the Encyclopédie and as a navigation tool for the reader/user.



Classification Problems

A number of problems in applying this scheme to the entries of the Encyclopédie.

Entries and subentries may have different classes of knowledge -- a creative tension with dictionary order.

OPTIQUE, en Anatomie:
Optique, s. f. (Ordre encyclop. Entendement. Raison. philosoph. ou science, Science de la nat. Mathem. Mathématiques mixtes, Optique).
Optique, pris adjectivement, se dit de ce qui a rapport à la vision.

Many were not assigned classes of knowledge. Requires an active reader:

S'il arrive que le nom de la Science soit omis dans l'article, la lecture suffira pour connoître à quelle Science il se rapporte; & quand nous aurions, par exemple, oublié d'avertir que le mot Bombe appartient à l'art militaire, & le nom d'une ville ou d'un pays à la Géographie, nous comptons assez sur l'intelligence de nos lecteurs, pour espérer qu'ils ne seroient pas choqués d'une pareille omission.

Reference to nonexistent classes:
Rayon visuel, (Nivell.)
Réfraction, (Nivell.)

Variety of terms

 freq.   class of knowledge
  70     Geom.
  58     Physiq.
  51     Astron.
  42     Phys.
  23     terme d'Astronomie
  22     Physique.
  22     Astronom.
  17     Mechan.
  13     Gram.
  12     Gramm.
  12     Astronomie.
  11     terme de Geometrie
  11     Mech.
   9     Geometrie.
   9     Geog.
   8     terme de Physique
   8     en Astronomie.
   8     Astr.
   8     Algebre.
   7     Optique.
   7     Morale.
   7     Geomet.
   6     Musique.
   6     Mathem.
   6     Hist. mod.
   5     terme de Mechanique
   5     Mechanique.
   5     Mechaniq.
   5     Alg.
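
The variety of labels in the table above suggests why some normalization is needed before classes can be compared. The following is a rough sketch, not the heuristics actually used at ARTFL: the `CONTROLLED` prefix table and the `normalize` and `collapse` functions are assumptions made for illustration. It strips introductory phrases ("terme de", "en") and maps what remains to a class name by prefix.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: label prefix -> normalized class name.
CONTROLLED = {
    "astr": "Astronomie",
    "geom": "Geometrie",
    "phys": "Physique",
    "mech": "Mechanique",
    "gram": "Grammaire",
    "alg": "Algebre",
}


def normalize(label):
    """Map a raw class label to a controlled term, or None if unknown."""
    s = label.strip().lower()
    s = re.sub(r"^(terme (de |d')|en )", "", s)  # drop introductory phrases
    s = s.rstrip(".")
    for prefix, name in CONTROLLED.items():
        if s.startswith(prefix):
            return name
    return None


def collapse(observed):
    """Sum frequencies per normalized class; unknown labels stay as they are."""
    totals = Counter()
    for freq, label in observed:
        totals[normalize(label) or label] += freq
    return totals


# (frequency, label) pairs copied from the table above; astronomy and
# geometry rows only.
OBSERVED = [
    (70, "Geom."), (51, "Astron."), (23, "terme d'Astronomie"),
    (22, "Astronom."), (12, "Astronomie."), (11, "terme de Geometrie"),
    (9, "Geometrie."), (8, "en Astronomie."), (8, "Astr."), (7, "Geomet."),
]

totals = collapse(OBSERVED)
```

In this sample, six distinct astronomy labels collapse to a single class with 124 occurrences, and four geometry labels to one class with 97.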


Renvois

The structure of the renvois is impossible to visualize as a whole because of their number and diffuse nature.

The extensive use of the renvois through the entire work reflects Diderot's opinion that they are "la partie de l'ordre encyclopédique la plus importante".

Diderot asserts that his cross-references are different.

D'ailleurs par la disposition des matieres dans chaque article, sur--tout lorsqu'il est un peu étendu, on ne pourra manquer de voir que cet article tient à un autre qui dépend d'une Science différente, celui--là à un troisieme, & ainsi de suite.  On a tâché que l'exactitude & la fréquence des renvois ne laissât là--dessus rien à desirer; car les renvois dans ce Dictionnaire ont cela de particulier, qu'ils servent principalement à indiquer la liaison des matieres; au lieu que dans les autres ouvrages de cette espece, ils ne sont destinés qu'à expliquer un article par un autre.

Task of the renvois is to situate a particular element in the fabric of knowledge.

Artere, assigned to the class en Anatomie, is linked to Coeur, Poumon, Aorte, Pulmonaire, Nerveux, and Cellulaire.

Some renvois are completely unambiguous, such as Aorte and Cellulaire, which are both classified as Anatomie and for which there is only one entry in the Encyclopédie.

Others, such as Pulmonaire are found with several meanings, each assigned to its class of knowledge:

PULMONAIRE Jaucourt; Hist. nat. Bot., s.m.
Pulmonaire Venel; Mat. medic., NA
Pulmonaire Jaucourt; Botan., NA
Pulmonaire XXX; Anatom., adject
Only the last of these is on point, being basically a pointer to Poumon, while the other three are botanical and medical uses of the term. It is clear that Diderot expected the reader to disambiguate.
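
The disambiguation left to the reader can be imitated mechanically, at least in simple cases, by preferring the target entry whose class of knowledge matches that of the article containing the renvoi. The sketch below uses the four Pulmonaire entries listed above; the prefix comparison in `class_key` is an assumed, deliberately crude heuristic, not the ARTFL implementation.

```python
# Candidate entries for the renvoi target "Pulmonaire", as listed above:
# (headword, author, class of knowledge).
CANDIDATES = [
    ("PULMONAIRE", "Jaucourt", "Hist. nat. Bot."),
    ("Pulmonaire", "Venel", "Mat. medic."),
    ("Pulmonaire", "Jaucourt", "Botan."),
    ("Pulmonaire", "XXX", "Anatom."),
]


def class_key(label, length=5):
    """Reduce a class label to a short lowercase prefix for comparison."""
    s = label.strip().lower()
    if s.startswith("en "):
        s = s[3:]
    return s.rstrip(".")[:length]


def disambiguate(source_class, candidates):
    """Keep the candidates whose class matches the source article's class."""
    key = class_key(source_class)
    return [c for c in candidates if class_key(c[2]) == key]


# Artere is assigned to the class "en Anatomie".
on_point = disambiguate("en Anatomie", CANDIDATES)
```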

Chaining of knowledge: rapid expansion of links to other classes of knowledge:

Ainsi à tout moment la Grammaire renverra à la Dialectique, la Dialectique à la Métaphysique, la Métaphysique à la Théologie, la Théologie à la Jurisprudence, la Jurisprudence à l'Histoire, l'Histoire à la Géographie & à la Chronologie, la Chronologie à l'Astronomie, l'Astronomie à la Géométrie, la Géométrie à l'Algebre, l'Algebre à l'Arithmétique, &c.

Diderot: Renvois cannot be excessively multiplied.

Many failed renvois: links leading nowhere because of the editorial history of the Encyclopédie.

TRANSPIRATION links to Evacuation. This fails because there is no such headword, but there is EVACUANT.

Diderot suggests that readers will be able to adjust by finding a corresponding or related word form, and in fact this is frequently the case.
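
One way to imitate this adjustment is to fall back from an exact headword match to the headwords sharing the longest common prefix with the target. This is a sketch under assumptions: the `resolve` function, the minimum prefix length of 5, and the short headword list are all hypothetical and chosen only to reproduce the Evacuation example.

```python
def common_prefix_len(a, b):
    """Length of the shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def resolve(target, headwords, min_prefix=5):
    """Return exact matches, else headwords sharing the longest prefix."""
    t = target.lower()
    exact = [h for h in headwords if h.lower() == t]
    if exact:
        return exact
    scored = [(common_prefix_len(t, h.lower()), h) for h in headwords]
    best = max((score for score, _ in scored), default=0)
    if best < min_prefix:
        return []  # nothing close enough: the renvoi stays unresolved
    return [h for score, h in scored if score == best]


# Hypothetical fragment of the headword list.
HEADWORDS = ["EVACUANT", "TRANSPIRATION", "POUMON", "AORTE"]

related = resolve("Evacuation", HEADWORDS)
```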



Encyclopedic Order

Three distinct modes of organization:

The conceptual power of the Encyclopédie organization arises from the interaction of these three modes of ordering knowledge.

Hard to visualize this interaction when reading the Encyclopédie as a static or printed document.

ARTFL implementation of the Encyclopédie as a dynamic document makes using Diderot's Encyclopedic Order much easier, but does not, in itself, provide a general overview of this order.



Methodology

Grasp the general features of the renvois system in Diderot's encyclopedic order.

Extracted a secondary database of

Objective: count the cross-references between classes of knowledge, rather than treat renvois as simply node to node links.

Huge variations in representation of classes of knowledge. Required heuristics to normalize the data, a controlled vocabulary. This is NOT implemented in the production version of the Encyclopédie so we recommend users adopt a very simple and general search strategy, such as using "astr" to search all articles dealing with astronomie, since there are many abbreviations found in the database.

In cases of multiple classes of knowledge, we used the first.

In cases where a single cross-reference produces multiple hits (different subarticles assigned to different classes of knowledge), all of them are taken into account.

The goal of these heuristics is the construction of a database with classes broad enough to have a statistically significant number of cross-references.
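
The counting step described above can be sketched as follows. The data structures and the normalized class names are assumptions made for illustration; only the two rules (use the first class of the source article, count every hit of the target) come from the methodology.

```python
from collections import Counter

# Headword -> class of each (sub)article found under that headword.
# Class names are assumed to be already normalized.
ENTRIES = {
    "Aorte": ["Anatomie"],
    "Cellulaire": ["Anatomie"],
    "Pulmonaire": ["Histoire naturelle", "Matiere medicale",
                   "Botanique", "Anatomie"],
}

# Each renvoi: (classes of the source article, target headword).
RENVOIS = [
    (["Anatomie"], "Aorte"),
    (["Anatomie"], "Pulmonaire"),
    (["Anatomie"], "Cellulaire"),
]


def count_class_links(renvois, entries):
    """Count links between classes rather than between individual articles."""
    counts = Counter()
    for source_classes, target in renvois:
        if not source_classes:
            continue  # unclassified source article: nothing to count
        source = source_classes[0]  # multiple classes: use the first
        for target_class in entries.get(target, []):
            counts[(source, target_class)] += 1  # every hit is counted
    return counts


links = count_class_links(RENVOIS, ENTRIES)
```

A renvoi whose target headword is missing from `entries` contributes nothing, which is how failed links drop out of the counts in this sketch.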



Statistics

Resulting database contains classes that are highly variable in size.

The class lutherie is quite small while others, such as droit can be quite large.

Simple frequencies are not a good measure.

Two measures of the discrepancy between expected and observed links, z-scores and binomial scores, which factor in variations in frequencies.

The z-score has been widely used in the calculation, for example, of lexical collocations.

The binomial score is better suited to the relatively low frequency of links as compared to running words in large corpora of text.

The differences in the calculations on the resulting "maps" of links are minimal in any event.

Unseen feature in the maps: nearly 50% of renvois are links to entries in the same class of knowledge.
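
The two measures mentioned above can be illustrated with their textbook forms. The exact formulas used for the maps are not given here, so this is an assumed null model: each of the links leaving a source class lands in the target class with a probability equal to that class's share of all links. The function and parameter names are hypothetical.

```python
import math


def z_score(observed, n_source, p_target):
    """Normal approximation: distance of observed from expected, in SDs."""
    mean = n_source * p_target
    sd = math.sqrt(n_source * p_target * (1.0 - p_target))
    return (observed - mean) / sd


def binomial_tail(observed, n_source, p_target):
    """Exact probability of at least `observed` links under the null model."""
    return sum(
        math.comb(n_source, k)
        * p_target ** k
        * (1.0 - p_target) ** (n_source - k)
        for k in range(observed, n_source + 1)
    )


# Invented numbers: a class sends 40 links, 12 of them to a class that
# receives 5% of all links in the database.
z = z_score(12, 40, 0.05)
p = binomial_tail(12, 40, 0.05)
drawn = p < 0.01  # a link is kept when its tail probability is under a threshold
```

With few links per pair of classes the normal approximation behind the z-score becomes unreliable, which is one reason an exact binomial tail is the safer choice for low frequencies.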



Some Maps

The three maps are based on different statistics and filtering, in order to both more accurately and clearly show the broad outlines of the encyclopedic order as found in the Encyclopédie.

The width of the arrows for each is drawn proportionately to the strength of the relationship. The threshold value for which lines are drawn is a tail binomial probability of 1% (if the links were random, we would observe this number only 1% of the time). Also, for readability, we have excluded links with frequencies less than 8 in Map 3, which are usually highly specialized.

IMAGE
Map 1: Encyclopédie class of knowledge and renvois map
(50% sample). Arrow width is proportional to number of xrefs
where z-score measure of significance is XXXX.

IMAGE
Map 2: Encyclopédie class of knowledge and renvois map
based on binomial measures drawn at 5% threshold. Arrow width is
proportional to strength of relationship.

IMAGE
Map 3: Encyclopédie class of knowledge and renvois map
based on binomial measures drawn at 0.1% threshold. Arrow width is
proportional to strength of relationship. Circled nodes are categories
with more than 200 articles.



Map Discussion

The maps of the encyclopedic order differ significantly from the Systême Figuré in the organization of knowledge depicted and the design of the representation.

Most importantly, our maps are not hierarchical representations. Each node is related to other nodes by the statistical strength of that relationship, not by an established hierarchy.

This does not, however, eliminate all hierarchy from the maps, since we are using classifications derived from a hierarchical arrangement.

It is equally important to note that there is no starting point or general principle, philosophical or otherwise, to provide an orientation: simply put, there is no compass to indicate up from down, north from south.

Striking feature: apparent split of the field of human understanding into two hemispheres, which does not readily translate to the broad distinctions found in the Systême Figuré.

The strong linkages of botany to natural history, for example, cut across broad distinctions drawn by the editors.

The strong relationships found in our maps between natural history and other sciences, such as metallurgy, chemistry, and physiology, suggest that the distinctions drawn by Bacon and borrowed by Diderot and d'Alembert were in the process of changing into "natural sciences", but the location of astronomy and optics, seen to be more closely related to mathematics, suggests that this development was not yet complete.

Both hemispheres feature relatively tightly organized related classes -- such as medicine, physiology, pathology, anatomy and surgery, or the cluster around geometry -- that correspond more closely to the genealogical tree of human understanding.

The Systême Figuré sets surgery, therapeutics, diet, pathology, and pharmacy as sub-classes of medicine and places medicine in the general class of zoologie, along with physiology and anatomy. More usefully, omissions from the Systême Figuré, such as maladie, are also shown to be strongly related to medicine.

The cluster of classes around belles lettres, including poetry and literature, with links to mythology -- typically references to mythological allusions -- and philosophy is certainly reasonable. The strong relationship between history and theology appears to be due to the number of entries dealing with themes that might be best classified as ecclesiastical or sacred history, itself a possible holdover from the fact that the encyclopedists were dependent on many earlier reference works.



Comparisons: Furet/Roche, Dewey

The Furet-Roche classification of 18th century books, based on contemporary bibliographic manuals, corresponds in some ways more closely to the general organization of knowledge depicted in our maps of the renvois than to the structures articulated in the Systême Figuré.

Natural history does not appear as a category, and the scientific categories provided by Furet-Roche seem to parallel the linkages provided by our maps. The establishment of the natural sciences as a set of distinct disciplines, a process facilitated in no small part by the Encyclopédie, is even more clearly depicted by Dewey's classification a century later, where Natural History is completely ignored as a general category.

The only reference to Natural History in Dewey is number 573, the natural history of Man. Of equal interest is the creation of an entire category of "Useful Arts" in Dewey, including Agriculture, Medicine, and Commerce. We find, in less systematic form, the same kinds of classifications in Sciences and Arts in the 18th century scheme derived by Furet and Roche.



Conclusion: Use of Maps

One of the primary objectives of the Encyclopédie was the systematic depiction of the arts and trades, an almost revolutionary recognition of applied science and technology as vital elements of knowledge. These are more coherently represented in our maps of the renvois than in the Systême Figuré.

The maps of the renvois certainly show natural history as an important classification.

The general division of natural history and natural science as depicted in the Systême Figuré does not appear as evident in the renvois.

The stronger filiations found in our maps between domains of the natural sciences suggest that Diderot and d'Alembert's theoretical organization of knowledge, based on Baconian distinctions, was already outmoded in the 18th century and was in some ways ignored in the process of elaborating the renvois system as the editors compiled the Encyclopédie.

Diderot cautions:

L'usage des divisions générales est de rassembler un fort grand nombre d'objets: mais il ne faut pas croire qu'il puisse suppléer à l'étude de ces objets mêmes. C'est une espece de dénombrement des connoissances qu'on peut acquérir; dénombrement frivole pour qui voudroit s'en contenter, utile pour qui desire d'aller plus loin. Un seul article raisonné sur un objet particulier de Science ou d'Art, renferme plus de substance que toutes les divisions & subdivisions qu'on peut faire des termes généraux; & pour ne point sortir de la comparaison que nous avons tirée plus haut des Cartes géographiques, celui qui s'en tiendroit à l'Arbre encyclopédique pour toute connoissance, n'en sauroit guere plus que celui qui pour avoir acquis par les Mappemondes une idée générale du globe & de ses parties principales, se flateroit de connoître les différens Peuples qui l'habitent, & les Etats particuliers qui le composent.
Mapping the world is a useful exercise, but it omits many details, and it would be frivolous for anyone to remain content with that superficial treatment.

Hypertextual Considerations

The Encyclopedic Order: a model of hypertextual organization?

The project can only succeed in the interaction of the three modes of organization:

Proper use of the Encyclopédie requires that the reader be able to situate himself on the mappemonde of knowledge and to do this regularly. Thus, the renvois system, the proto-hypertext of the Encyclopédie, is far more than simple leaf-to-leaf links. It entails the ability to see, understand, and use conceptual maps. And it requires that the reader know where he has been, where he is now, and have at least some sense of where he is going.

Diderot and d'Alembert are particularly demanding editors, requiring the reader not only to simply follow them along the various itineraries of knowledge, but to be actively working with them. Once the reader understands the principles of the Systême Figuré, he should feel free to fit the contents of particular articles into the encyclopedic tree.

The reader is finally responsible for keeping track of the current location on the map of knowledge, even if it requires that he get out a compass because the editors have failed to include this information.

The editors count on the intelligence of their readers to classify knowledge and navigate the field of human understanding, using the tools put at their disposal -- the encyclopedic order -- blurring the distinction between reader and author, since use of the work is consciously dependent on the input of the reader.

The "hypertext" of the Encyclopédie is not simply, as currently found in many modern hypertext implementations, a node-to-node set of links to be followed by readers, but a principled interaction of several modes of organization of knowledge that makes significant demands on the user. In spite of the obvious limitations of the Encyclopédie -- and in all honesty the ARTFL implementation as well -- it is clear that the editors, Diderot in particular, developed a model for handling the complexities of representing the vast field of human understanding from which we can benefit today, if not as a practical application, then as a distant objective toward which we should move.



Two Conclusions

At the most general level, we suggest that the general topography of the renvois system of the Encyclopédie shows that the editors' use of a general Baconian scheme to organize knowledge was already outdated. As the editors actually built the renvois system, they proved to be far more influenced by contemporary views of the organization of knowledge than would be suggested by the examination of the Systême Figuré alone.

If one can consider the Encyclopédie as an ancestor to modern hypertext, it is clear that Diderot and d'Alembert's theoretical and practical work on the articulation of the multiple levels of organization of knowledge, and the active role of the reader/user of the Encyclopédie, may be very instructive to the articulation of modern hypertext systems.



Theory, Implementation, Result

ARTFL Encyclopédie attempts to implement a representation of Diderot's theory of knowledge.

Using the resulting system, we can analyze the effects and implications of this theory and implementation as well as examine the degree to which the Encyclopedic model is internally coherent.

In practice, development of this project was (is) an iterative process, involving modifications to the theory and implementations.

One abstraction deserves another: the general "object model" of text computing in PhiloLogic was developed, in part, as a response to the "hypertextuality" of the Encyclopédie.

The general object model is now a primary development orientation at ARTFL, with a set of implications concerning what we can do and what we cannot/do not do in building many other databases.



Corrections and Extensions

Databases are never done. The Encyclopédie is an ongoing project:

Larger scale experimentation

Conclusions

System design is a theoretically driven activity which is, to my mind, a proper subject of any critical analysis of digital textuality.

The contours of any system/database are clear choices and define the environment in which critical analysis and presentation of retrospective and born-digital materials will take place.