PhiloLogic©
User Documentation


1. Introduction

About PhiloLogic:
PhiloLogic, developed by the
ARTFL Project at the University of Chicago in collaboration with The University of Chicago Library's Electronic Text Services, provides sophisticated searching of a wide variety of large encoded databases on the World Wide Web. It is an easy-to-use yet powerful full-text search, retrieval, and reporting system for large multimedia databases (texts, images, sound), with the ability to understand complex text structures (e.g., SGML, BetaCode) with rich metadata. Its functions were originally designed, and continue to be developed, for scholarly research in databases of literary, religious, philosophical, and historical texts. Important historical encyclopedias and dictionaries are also ideally suited for development under PhiloLogic. The Encyclopédie Project, for example, is an implementation of a full hypermedia system supporting full-text retrieval and navigation with hyper-textual cross-referencing and full digital imaging support in a single, easy-to-use system.

PhiloLogic in its simplest form can serve as a document retrieval/look-up mechanism whereby users can search a relational database to retrieve given titles and, in some implementations, portions of texts such as acts, scenes, articles, or head-words. This same document retrieval mechanism serves as the basis for defining a corpus in a full-text search. The typical PhiloLogic search is broken down into five distinct stages: defining a corpus (i.e., limiting a search), word expansion, word index searching, text extraction, and link resolution and formatting (e.g., SGML to HTML conversion). In other words, after defining a corpus (or choosing to search an entire database), one can execute a single term, phrase, or proximity search. By looking up indices of the word(s) in a relational database, PhiloLogic extracts blocks of text containing the search term(s) with links to larger blocks of text. These extracts are formatted to display on a Web browser and sometimes include links to images, other texts, and other databases.

In addition to simple word and phrase searches, users can perform more sophisticated searches by using extended UNIX-style regular expressions and, in some implementations, morphological and orthographical expansion. All of these mechanisms to expand words can be combined using Boolean operators such as OR (the vertical bar "|") and AND (a space) within a variety of searching contexts.

The Text Collection Version of PhiloLogic:
This version of PhiloLogic has been developed by the ARTFL Project in collaboration with the University of Chicago Library's Electronic Text Services (ETS). Other versions of PhiloLogic, including those developed for dictionaries and encyclopedias, have somewhat different functionality. Accordingly, the documentation which follows outlines elements of PhiloLogic that are specific to databases of collections of texts. This documentation provides general user documentation for the main functions and features of PhiloLogic. The search-forms for individual databases under PhiloLogic link to this documentation and describe database-specific features and provide database-specific searching tips as well.



2. Searching the Bibliography

Bibliographic searching in PhiloLogic has two distinct purposes: 1) to allow the user to locate particular documents and read them online ("document retrieval") and 2) to allow the user to select one or more documents in which to search ("defining a corpus" or "limiting one's search"). If the user does not enter search term(s) into the Search Text(s) For: box, PhiloLogic automatically acts as a document retrieval system, providing a bibliography with links to the digital table of contents of each document retrieved. If, on the other hand, term(s) are entered into the Search Text(s) For: box, then PhiloLogic goes into full-text searching mode, looking for the entered term(s) in the document(s) specified in the bibliographic fields of the search-form.

In PhiloLogic the most common bibliographic fields for searching are author, title, and date.

Some databases, nonetheless, offer many more fields. To select all documents in a database, simply leave the bibliographic fields blank and press SEARCH. (If performing a bibliographic search to retrieve documents, keep in mind that this can generate a very large number of titles.) The documents are sorted by date, with the earliest published listed first. In PhiloLogic, fields can be combined to refine a search further. Thus, for example, entering smollett in the author field and 1750-1765 in the date field selects only those works by Tobias George Smollett which were published between 1750 and 1765.

2.1 Bibliographic Fields

A. Searching by Author:
Bibliographic entries in the author field must match the name exactly as given in a database's online bibliography. (One may, however, use upper or lower case letters; author searches are case insensitive.) Searches are on "strings" of characters; in fact, punctuation, spaces, and diacritics must be entered or one receives a "No documents found" message. Nonetheless, author searching has been designed so that a user can enter the fewest possible terms. Typically, it will suffice to enter an author's last name if the author's name is unique within the online bibliography. Thus, entering smollett is likely to select titles only by Tobias George Smollett in the Eighteenth-Century Fiction Database. If entering the author's full name, one must type smollett, tobias george. Author searching also works on "sub-strings" so that entering smoll also selects works by Smollett. PhiloLogic's wildcard characters may also be employed to match many forms.

Note: At this time brackets ([ ]), double quotes ("), and parentheses (( )) are not searchable in the author field. Thus, block-copying an author listing such as Eliot, T. S. (Thomas Stearns) will produce a "No documents found" message. Try the most distinctive sub-string or a wildcard character (period) for the mark of punctuation.

To search the works of more than one author type the authors' names separated by a vertical bar (|) which serves as the OR operator (with no spaces intervening). Thus, smollett|fielding|sterne searches the works of Tobias George Smollett, Henry Fielding, and Laurence Sterne as would smollett, tobias george|fielding, henry|sterne, laurence. To select all the authors in a database, leave the "Author:" field, as well as the other fields, blank.

Please note that PhiloLogic now requires the user to take into account accented characters in bibliographic searching when accents appear in the online bibliography. Accents in bibliographic fields are to be represented in the same way as in full-text searching, described in detail in section 3.1 Accents and Special Characters. Thus one may 1) enter the accented character as such from one's browser, 2) use a two-character sequence (e.g., e^) or 3) use an uppercase letter (e.g., E) to match any form of that letter. Thus, entering calderOn or caldero/n finds works by Pedro Calderón de la Barca in the Teatro español del siglo de oro database. Tip: In order to enter search terms without having to pay attention to diacritics, simply turn on "Caps Lock" and type in all uppercase.

B. Searching by Title:
Bibliographic entries in the title field must match the title exactly as given in a database's online bibliography. (One may, however, use upper or lower case letters; title searches are case insensitive.) Searches are on "strings" of characters; in fact, punctuation, spaces, and diacritics must be entered or one receives a "No documents found" message. Nonetheless, title searching has been designed so that a user can enter the fewest possible terms. Complete titles' names are rarely required to compose well-defined queries. Typically, it will suffice to enter an uncommon word or phrase from a title if the word or phrase is unique within the online bibliography. Thus, entering jones or tom jones is likely to select only The History of Tom Jones, a Foundling by Henry Fielding in the Eighteenth-Century Fiction Database. If entering a title's full name, one must type history of tom jones, a foundling with comma and spaces. Title searching also works on "sub-strings" so that entering jon also selects Fielding's The History of Tom Jones, a Foundling. PhiloLogic's wildcard characters may also be employed to match many forms.

Note: At this time entering the following punctuation marks and symbols into the title field produces a "No documents found" message: parentheses, ampersand (&), double quotes, and brackets ([ ]). In all cases, punctuation and spacing must match exactly that in the bibliography.

To search more than one title at a time type the titles separated by a vertical bar (|) which acts as the OR operator (with no spaces intervening). Thus, jones|amelia selects both The History of Tom Jones, a Foundling and Amelia as would history of tom jones, a foundling|amelia. To select all titles in a database, leave the "Title:" box, as well as the "Author:" and "Date:" boxes, blank.

Please note that PhiloLogic now requires the user to take into account accented characters in bibliographic searching when accents appear in the online bibliography. Accents in bibliographic fields are to be represented in the same way as in full-text searching, described in detail in section 3.1 Accents and Special Characters. Thus one may 1) enter the accented character as such from one's browser, 2) use a two-character sequence (e.g., o/) or 3) use a capitalized letter (e.g., O) to match any form of that letter. Thus, entering nin~a de go/mez arias or niNa de gOmez arias finds La niña de Gómez Arias by Pedro Calderón de la Barca in the Teatro español del siglo de oro database. Tip: In order to enter search terms without having to pay attention to diacritics, simply turn on "Caps Lock" and type in all uppercase.

C. Searching by Date:
To define a corpus by date or a range of dates enter a single year (e.g., 1880) or a range of years (e.g., 1865-1875). Since some works cannot be dated to an exact year, it is often best to adopt a range of dates strategy. Always check a database's online bibliography to confirm dates.

Note: At this time searching by date in several ETS databases is not always productive since in some cases the publisher has entered only the date of the printed edition from which the data have been drawn, not the date of the first edition, composition, or first performance. In the African-American Poetry Database, for example, only by searching for works published in 1993 is one able to search the poems of Paul Laurence Dunbar (1872-1906) by date, since the data come from the 1993 edition. The Database-Specific Searching Tips on individual database search-forms warn users if dates of first publication, composition, and/or performance have not been entered.
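
The author, title, and date rules described in this section can be sketched in a few lines of Python. This is only an illustrative model, not PhiloLogic's actual code: the names Record, field_matches, year_matches, and define_corpus, as well as the sample records and their dates, are assumptions made for the example. Wildcard characters and accent handling are left out.

```python
from dataclasses import dataclass

@dataclass
class Record:
    author: str   # as listed in the online bibliography, e.g. "Smollett, Tobias George"
    title: str
    year: int

def field_matches(query: str, value: str) -> bool:
    """Case-insensitive sub-string match; '|' separates OR alternatives."""
    if not query:
        return True  # a blank field selects every document
    value = value.lower()
    return any(alt in value for alt in query.lower().split("|") if alt)

def year_matches(query: str, year: int) -> bool:
    """Accept a single year ('1880') or an inclusive range ('1865-1875')."""
    if not query:
        return True
    first, _, last = query.partition("-")
    return int(first) <= year <= int(last or first)

def define_corpus(records, author="", title="", date=""):
    """Combine the fields (all must match) and sort the result by date."""
    hits = [r for r in records
            if field_matches(author, r.author)
            and field_matches(title, r.title)
            and year_matches(date, r.year)]
    return sorted(hits, key=lambda r: r.year)

# Hypothetical sample bibliography (illustrative entries only)
SAMPLE = [
    Record("Smollett, Tobias George", "The Expedition of Humphry Clinker", 1771),
    Record("Fielding, Henry", "The History of Tom Jones, a Foundling", 1749),
    Record("Smollett, Tobias George", "The Adventures of Peregrine Pickle", 1751),
    Record("Sterne, Laurence", "The Life and Opinions of Tristram Shandy, Gentleman", 1760),
    Record("Smollett, Tobias George", "The Adventures of Roderick Random", 1748),
]

for record in define_corpus(SAMPLE, author="smollett", date="1750-1765"):
    print(record.year, record.title)
```

Calling define_corpus(SAMPLE) with every field left blank selects the whole sample, mirroring the rule that blank bibliographic fields select all documents.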



2.2 Retrieving and Navigating Documents:

PhiloLogic displays bibliographic citations, each linked to a work's digital table of contents, in a number of places.

Clicking on the title of a document automatically generates a "digital table of contents," showing the bibliographic entry of the document and all of the parts that have been identified in that document. The parts reflect the logical organization of the document in up to three levels of hierarchy (not all documents contain three levels). The top-level part of a hierarchy is not indented and is shown in bold. The second level is indented several spaces. The third level of a hierarchy is indented further and shown in italics. Any part of any level may be selected by simply clicking on it (unless the links have been disabled because of copyright restrictions). Notice the structure in the following example taken from Eighteenth-Century Fiction (links to the text have been disabled).

Fielding, Sarah [1759], The History of the Countess of Dellwyn. In Two Volumes: By the Author of David Simple. [etc.] (Cambridge: Chadwyck - Healey, 1996) [FieSar,ThHiOfT3].

   [Fielding, S./The Countess of Dellwyn, Vol. 1]
      [Fielding, S./The Countess of Dellwyn, Vol. 1, Title Page]
      [Fielding, S./The Countess of Dellwyn, Vol. 1, Preface]
      [Fielding, S./The Countess of Dellwyn, Vol. 1, Book 1]
         [Fielding, S./The Countess of Dellwyn, Vol. 1, Book 1, Chap. 1]
         [Fielding, S./The Countess of Dellwyn, Vol. 1, Book 1, Chap. 2]

         [...material omitted...]

   [Fielding, S./The Countess of Dellwyn, Vol. 2]
      [Fielding, S./The Countess of Dellwyn, Vol. 2, Title Page]
      [Fielding, S./The Countess of Dellwyn, Vol. 2, Book 3]
         [Fielding, S./The Countess of Dellwyn, Vol. 2, Book 3, Chap. 1]
         [...material omitted...]


When a part is selected, PhiloLogic displays the bibliographic citation at the top and bottom of the text with a link back to the digital table of contents. It also allows one to go to previous and next sections at the same level of the hierarchy if they should exist.

When one selects a document part from a hierarchy or a page, PhiloLogic provides links, when available, to additional material such as images or cross-references (e.g., notes). In some documents, note references are displayed at the bottom of textual units with the notes themselves available through these links from a database note server. Specific details on the location of notes and other types of material are found on individual database search-forms under Database-Specific Searching Tips.



3. Character Representation for Search Terms

The term(s) to be searched in selected documents are entered into the Search Text(s) For: box on the search-form. Word searches in PhiloLogic are by default case insensitive, so that a search finds both lower and upper case representations of words. The user must, however, take into account diacritics when searching databases that have accented characters. PhiloLogic's wildcard characters may also be employed to match many forms. The simplest search in PhiloLogic is a single term search without wildcards. If searching for a term such as "magic" in a database, simply type the word magic into the Search Text(s) For: box and press the SEARCH button.

3.1 Accents and Special Characters:
PhiloLogic requires that one take into account diacritics when searching documents with accented characters in both bibliographic and full-text searching. The system provides three ways to search for accented characters: 1) simply type the required accented character from the keyboard; 2) use a capital letter to match all accented and non-accented forms of a letter; or 3) enter the two character representations listed below. Tip: If you do not want to have to think about accents, turn on "Caps Lock" and type in all uppercase.

capital letter = any form of the letter
(e.g., E matches é ê è ë and e (no accent) and É Ê È Ë and E (no accent)).
grave = (\) back slash
(e.g., a\ matches à).
acute = (/) forward slash
(e.g., e/ matches é).
circumflex = (^) caret
(e.g., e^ matches ê).
cedilla = (,) comma
(e.g., c, matches ç).
umlaut/dieresis = (") double quote
(e.g., u" matches ü).
tilde = (~) tilde
(e.g., n~ matches ñ).
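
As a rough illustration of the two-character notation listed above, the following Python sketch turns a search word such as caldero/n into its accented form. It is not taken from PhiloLogic; the names COMBINING and decode_accents are hypothetical, and the sketch assumes the input is a single search term, so a comma after a letter is always read as a cedilla. The special-character codes described in the next list (such as s^ for ß) are not handled.

```python
import unicodedata

# Two-character accent notation -> Unicode combining marks
COMBINING = {
    "\\": "\u0300",  # grave
    "/": "\u0301",   # acute
    "^": "\u0302",   # circumflex
    ",": "\u0327",   # cedilla
    '"': "\u0308",   # umlaut/dieresis
    "~": "\u0303",   # tilde
}

def decode_accents(term: str) -> str:
    """Replace 'letter + accent mark' pairs with the accented letter."""
    out = []
    for ch in term:
        # An accent mark only counts when it directly follows a letter
        if ch in COMBINING and out and out[-1].isalpha():
            out.append(COMBINING[ch])
        else:
            out.append(ch)
    # NFC composes 'o' + combining acute into the single character 'ó'
    return unicodedata.normalize("NFC", "".join(out))

print(decode_accents("caldero/n"))
print(decode_accents("nin~a de go/mez arias"))
```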

Special Characters and Symbols

ae-ligature (æ) = ae
the ligature is resolved into two letters. (e.g., to search æther type in aether).
oe-ligature (œ) = oe
the ligature is resolved into two letters. (e.g., to search œconomy type in oeconomy).
sz-ligature or sharp S (ß) = s^
Always check Database-Specific Searching Tips to see whether the German sz-ligatures have been resolved into two esses or not.
eth (ð) = d^
in the short term is not searchable, but does display.
thorn (þ) = p^
in the short term is not searchable, but does display.
yogh
at this time displays as an SGML entity and cannot be searched.
ancient Greek Characters
display as SGML entities and cannot be searched. Unicode support in PhiloLogic is under development.
Hebrew Characters
display as SGML entities and cannot be searched. Unicode support in PhiloLogic is under development.
ampersand (&)
is not a searchable character. Avoid Phrase Searches where an ampersand could be used as a conjunction.
mathematical symbols = to be determined
the equal sign (=) and minus sign (-) will produce a "Nothing found" message. The plus sign (+) is not a searchable character, but, if entered, will be ignored.

In order to handle properly words that contain italics, bold, underlining, superscripts, and subscripts, PhiloLogic does not treat the tags marking such formatting as word separators.

3.2 Wildcard Characters and Boolean Operators:
Wildcard characters allow the user to enter a single search entry that may find many forms. This is in contrast to a simple word search which requires an exact match in order to find a word. The following describes the most commonly used wildcard characters in full-text searching and in bibliographic searching.

3.2.1 Full-Text Searching: PhiloLogic supports wildcard characters and Boolean (logical) operators, which are modeled on UNIX regular expressions to perform "pattern matching" in full-text searching. Pattern matching allows identification of a large number of words corresponding to a defined pattern. Wildcard characters can be useful, for example, in identifying cognates made obscure by affixes and vowel weakening, inconsistencies due to irregular orthography, and variations on account of word inflection as well as for discovering potential emendations for uncertain readings. The most commonly used regular expression operators (wildcard and Boolean) are listed below.

Wildcard Characters

. (period):
matches any single character (e.g., gentlem.n will retrieve gentleman and gentlemen).
.* (period asterisk "dot-star"):
matches any string of characters, anchoring the match at the beginning of a word (e.g., cigar.* will match cigar, cigars, cigarette, etc.), anchoring the match at the end of a word (e.g., .*habit will retrieve habit, cohabit, and inhabit), or in the middle (e.g., c.*eers matches compeers, cheers, and careers).
.? (period question mark):
matches the characters entered or the characters entered plus one more character in place of the question mark (e.g., hono.?r matches both honor and honour and cat.? matches cat and cats, but not cathedral, Catherine, etc.).
[a-z] (brackets):
matches a single character found in the specified range (e.g., [c-f]at will match cat, dat, eat, and fat) or any letters within the brackets (e.g., civili[zs]e will match both civilize and civilise).
# (hash mark):
matches capitalized words only (e.g., #bacon will retrieve Bacon, but not bacon). Otherwise word searches are case insensitive. Please note that this operator does not work properly in conjunction with the vertical bar (e.g., searching #hamlet|#bacon will not retrieve accurate results).
E (capital letter):
matches all accented and non-accented forms (e.g., to search naïveté regardless of accents type naIvetE).

Note: If you are using wildcard characters and would like to see a full list of the words matching your search-term, then run your search as a "Frequency by Title" search. The results page of a "Frequency by Title" search lists all the terms found in a database that match your search-term.
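
Because the period, dot-star, question-mark, and bracket operators follow regular-expression syntax, the list of forms matching a search-term can be approximated with Python's re module. The sketch below is an assumption-laden model rather than PhiloLogic's word-expansion code: VARIANTS, to_regex, and matching_forms are hypothetical names, the word list is supplied by the caller, and only Latin-1 accented letters are covered for the capital-letter rule.

```python
import re
import unicodedata

def _base(ch: str) -> str:
    """Strip diacritics from a character (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

# For each plain letter, the accented Latin-1 letters built on it
VARIANTS = {}
for code in range(0x00C0, 0x0100):
    ch = chr(code)
    base = _base(ch).lower()
    if ch.isalpha() and len(base) == 1 and base != ch.lower():
        VARIANTS.setdefault(base, set()).add(ch.lower())

def to_regex(pattern: str):
    """Translate a search-term into (compiled regex, capitalized_only flag)."""
    capitalized_only = pattern.startswith("#")
    parts = []
    for ch in pattern.lstrip("#"):
        if ch.isupper():
            # A capital letter matches every accented or unaccented form
            base = ch.lower()
            parts.append("[" + base + "".join(sorted(VARIANTS.get(base, ()))) + "]")
        else:
            parts.append(ch)  # '.', '.*', '.?', '[a-z]' and '|' pass through
    return re.compile("".join(parts), re.IGNORECASE), capitalized_only

def matching_forms(pattern: str, words):
    """Return the distinct words that match the whole pattern, sorted."""
    regex, capitalized_only = to_regex(pattern)
    return sorted({w for w in words
                   if regex.fullmatch(w)
                   and (not capitalized_only or w[:1].isupper())})

print(matching_forms("gentlem.n", ["gentleman", "gentlemen", "gentle"]))
print(matching_forms("cat.?", ["cat", "cats", "cathedral"]))
```

Matching against whole words (fullmatch) is what keeps cat.? from retrieving cathedral.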

Boolean Operators

| (vertical bar):
serves as the OR operator (e.g., freedom|liberty retrieves instances of either).
Space:
serves as the AND operator in sentence and paragraph Proximity Searching (e.g., church state retrieves all cases where church and state appear in the same specified context; this is not the case in phrase searching).

These expressions can be combined for more sophisticated searches; for example, searching old|aged|ancient m.n|fellow.* finds any of the three adjectives together with the nouns man or fellow in the singular or plural.

3.2.2 Bibliographic Searching (Corpus Definition and Document Retrieval): PhiloLogic also supports certain Boolean and wildcard operators, which are modeled on UNIX regular expressions, for "pattern matching" in bibliographic searching; however, there are important differences. Only the Boolean operator OR may be used and not AND since all bibliographic searches are by default consecutive searches. Furthermore, since bibliographic searches are also by default searching for "strings" of characters, the wildcard operator (.*) is not needed. Thus, typing habit in a bibliographic field is the same as typing .*habit.* in full-text searching. Names of authors and titles bearing diacritics must be entered with accented characters or with the use of a capital letter for the accented character. Bibliographic searching is otherwise also case insensitive. In bibliographic searching, unlike in full-text searching, marks of punctuation are not only permitted, but in most cases required, when found in the online bibliography. If the bibliography, for example, reads Poe, Edgar Allan, the name must be entered in the author field in the same inverted order with a comma separating surname from given names. Otherwise, one receives a "No documents found" message. One must also avoid unwanted spaces. Typically, it will suffice to enter an uncommon word or phrase from a title or author's name if the word or phrase is unique within the online bibliography.

3.3 Punctuation Marks and Searching

Punctuation and Full-Text Searching: In full-text searching entering marks of punctuation (for example, when a period is used as a full-stop) in the search box often produces a "nothing found" message. All punctuation marks such as the comma, question mark, exclamation mark, vertical bar (|), forward and backward slashes, colons, and semicolons as well as quotation marks, ampersands (&), asterisk (*), percentage sign (%), dollar sign, number sign (#) should be stripped from an entry especially if one is block-copying text. (Many of the symbols traditionally used for punctuation are used instead for accent representation or wildcard characters.) Some marks of punctuation are especially problematic and may be dealt with slightly differently:

Apostrophe: The only punctuation that PhiloLogic regularly supports in full-text searching is the apostrophe. Entering sister's retrieves "sister's" in most databases, but typing in sisters does not retrieve the possessives sister's or sisters', only the plural sisters. Always check Database-Specific Searching Tips on individual database search-forms to be sure since punctuation marks are treated differently because of a given language's needs. (In French and Italian, for example, the apostrophe separates words and thus must be entered with a space following it, e.g., l' histoire and d' Italia.)

Hyphen: Hyphens act as word separators in most databases. Thus, if looking for all occurrences of the word "valiant," one may enter only valiant and still find "ever-valiant." Always check Database-Specific Searching Tips on individual database search-forms to be sure since punctuation marks may be treated differently because of a given language's needs.

Brackets: Although brackets usually act as word-separators, they will not always, for example, when they indicate uncertain readings (Agr[ipp]ina). In the near future, PhiloLogic will support MSS punctuation for some databases, in which cases brackets will not be word-separators. Other marks of punctuation will be part of the MSS implementation. Always check individual search-forms under Database-Specific Searching Tips to know for sure.

Ampersand: The ampersand (&) is not a searchable character. Avoid Phrase Searches where an ampersand could be used as a conjunction.

Period: The period is not a searchable character (it serves as a wildcard operator). Please note that most databases are not tagged for sentence termination and therefore PhiloLogic must rely on marks of punctuation in combination with capitalization to identify sentence termination. This is especially problematic for combinations such as St. Ambrose. If you suspect that a period in an abbreviation may be splitting a phrase, switch to a Proximity Search in the same paragraph.

Punctuation and Bibliographic Searching: At this time entering the following marks of punctuation and symbols into bibliographic fields produces a "No documents found" message: parentheses (( )), semi-colons (;), colons (:), ampersand (&), apostrophes ('), single and double quotes, braces ({ }), brackets ([ ]), and angle brackets (< >) as well as the dollar sign ($). Thus block-copying a name such as D'Urfey, Thomas will produce a "No documents found" message. Try the most distinctive sub-string such as urfey or a wildcard character (period) for the mark of punctuation (e.g., d.urfey, thomas). The following punctuation marks have no adverse effect on an author or title search and, if appearing within a string, must be entered: period (.), hyphen (-), question mark (?), exclamation mark (!), forward slash (/), and comma (,).



4. Selecting a Search Option

PhiloLogic at this time offers two kinds of searches: "Single Term and Phrase Search," which is set up as the default, and "Proximity Searching in the Same Sentence or Paragraph." One may select and deselect a search option by clicking on the "radio" buttons.

4.1. Single Term and Phrase Search (Default):

To search a single term in the entire database or a defined corpus make sure that the Single Term and Phrase Search radio button is highlighted, simply enter the term into the Search Text(s) For: box, and press the SEARCH button. (One may use upper or lower case letters; searches are case insensitive.) Single Term searching supports wildcard characters and the Boolean operator OR, which is the vertical bar (|). Entering, for example, freedom|liberty retrieves all occurrences of the word "freedom" or "liberty" in the entire database or a specified corpus.

Similarly, to search a phrase make sure that the Single Term and Phrase Search radio button is highlighted, simply type the phrase into the Search Text(s) For: box, and press the SEARCH button. Phrase searching restricts the search to adjacent words in a particular order (punctuation in the text is ignored). Thus, for example, the search church state would not retrieve "church and state," but only cases where the word "church" is next to the word "state" with the word "church" preceding. To retrieve occurrences of the phrase "church and state" one must type in church and state. Phrase searching supports wildcard characters and the Boolean operator OR. Note: one cannot search for two separate phrases using the OR operator. Two separate searches must be run. One may, however, use the OR operator within a phrase; medieval|mediaeval age retrieves, for example, instances of both "medieval age" and "mediaeval age."
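
The phrase-search rule (adjacent words, in order, punctuation ignored, OR allowed within a position) can be modeled roughly as follows. This is a simplified sketch and not PhiloLogic's index-based implementation: tokenize and phrase_search are hypothetical names, the text is scanned directly rather than through a word index, and hyphens are treated as word separators.

```python
import re

def tokenize(text: str):
    """Split text into lowercase words; punctuation and hyphens separate words."""
    return re.findall(r"\w+(?:'\w+)*", text.lower())

def phrase_search(query: str, text: str):
    """Return the word positions at which the phrase starts."""
    # Each space-separated term may contain wildcards and '|' alternatives
    terms = [re.compile(term, re.IGNORECASE) for term in query.split()]
    if not terms:
        return []
    words = tokenize(text)
    hits = []
    for start in range(len(words) - len(terms) + 1):
        window = words[start:start + len(terms)]
        # Every term must match the word in the same relative position
        if all(term.fullmatch(word) for term, word in zip(terms, window)):
            hits.append(start)
    return hits

text = "The church state debate; church and state. A Church-State question."
print(phrase_search("church state", text))
print(phrase_search("church and state", text))
```

In this model the query church state finds "church state" and "Church-State" but skips "church and state", which requires its own query.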

4.2 Proximity Searching in the Same Sentence or Paragraph:

Searching for more than one term in a single sentence or paragraph without regard to adjacency or word-order constitutes Proximity Searching. Simply type the terms in question into the Search Text(s) For: box, indicate whether they are to be found in the same sentence or paragraph by highlighting the appropriate radio button, and press SEARCH. (One may use upper or lower case letters; searches are case insensitive.) Proximity Searching supports wildcard characters, the Boolean operator OR, which is the vertical bar (|), and the Boolean operator AND, which is a space. If looking for occurrences of the words "church" and "state" within the same sentence or paragraph in any order, enter church state. Entering church state|throne retrieves instances of "church" and "state" or "church" and "throne" in the same sentence or paragraph. Note: at this time one cannot perform a proximity search with a phrase and another phrase or a phrase and another single term in the same sentence or paragraph. Remember: a space acts as the AND operator in proximity searching.
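
A proximity search differs from a phrase search only in that every term must occur somewhere in the same context, in any order. The sketch below assumes the sentences or paragraphs have already been separated and are passed in as a list of strings; proximity_search is a hypothetical name, not part of PhiloLogic.

```python
import re

def proximity_search(query: str, contexts):
    """Return the indices of the contexts that contain every search term."""
    # A space is AND between terms; '|' is OR inside a term
    terms = [re.compile(term, re.IGNORECASE) for term in query.split()]
    hits = []
    for index, context in enumerate(contexts):
        words = re.findall(r"\w+(?:'\w+)*", context)
        # Each term needs at least one whole-word match in this context
        if terms and all(any(t.fullmatch(w) for w in words) for t in terms):
            hits.append(index)
    return hits

sentences = [
    "The state supports the church.",
    "Church and throne were allied.",
    "The throne stood empty.",
]
print(proximity_search("church state", sentences))
print(proximity_search("church state|throne", sentences))
```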



5. Selecting a Results Format

Except in the case of Frequency by Title and Author reports, references for occurrences are numbered from one and sorted by date, with the works published earliest listed first. A Results Bibliography can be found at the bottom of the report. Bibliographic citations generally take the following form:

Sterne, Laurence [1760], The Life and Opinions of Tristram Shandy, Gentleman ... The Second Edition (Cambridge: Chadwyck - Healey, 1996) [SteLau,ThLiAnO].

Each typically shows the author's name, the date of first publication or composition, the title, information on the digital publication, and finally the short citation code in brackets. The short citation code is displayed with the reference for each occurrence in Concordance and KWIC reports. All full titles are linked to their digital table of contents (disabled in the example above).

A user can switch to another display format at any time while viewing results without having to resubmit a search. Simply click on the appropriate link ("Click here for a KWIC Report" or "Click here for a Concordance Report"), which is always provided at the bottom of any given results page (and usually at the top, unless the report is still in progress when the first 25 occurrences are initially displayed).

Note: PhiloLogic will not complete a search that yields more than 10,000 occurrences. Only the first 10,000 will be retrieved. In addition, users are currently limited to 500 unique forms in a single search. By using wildcard characters and Boolean operators one can sometimes submit a query for a very large set of terms, especially in highly inflected languages. If a search exceeds the limit of unique forms, PhiloLogic will provide a list of all 500 plus unique forms so that the user can devise an alternate strategy for searching. Some databases such as the PLD have higher limits set. Research is underway to find ways to increase this limit substantially.

5.1 Concordance Report (300 Characters Plus)

Concordance reporting is the default results format option. This report indicates the number of texts searched, the search term(s) entered in a defined corpus, and the total number of occurrences found. (The number of occurrences displays at the top of the report if PhiloLogic has detected the number before generating the first 25 occurrences. If not, the total number of occurrences displays at the bottom of the report.) Following this general information is a list of occurrences. Each occurrence is represented by a short citation consisting of abbreviations for the author's name and the title of the work with a reference to where the term(s) in question occur within the document. (Full entries for the short citations are listed in the Results Bibliography at the bottom of the report.) References may be page numbers, acts and scenes, chapters and verses, columns, and the like. Alongside the citation are listed several levels of context (e.g., page, paragraph, or levels of hierarchy designated by h3, h2, and h1). Below the short citation there is a passage of text consisting of some forty words on either side of the key word, which is highlighted. PhiloLogic, however, displays as much text as needed to capture all words in a multi-term search, and all search words are highlighted. The reference listed with the short citation is linked to the text. Clicking on the page number retrieves the full page with key words still highlighted. The same is true for paragraph and the three other levels of hierarchy. Links to the previous and next page, paragraph or levels respectively, if they exist, are provided.
Note: remember that, when searching for two or more terms within the same paragraph, the concordance report expands the amount of text displayed to include all of the search terms in the paragraph. At times the text displayed in a proximity search to accommodate all the search terms may be several screens in length since some paragraph divisions in documents in some databases are very far apart.
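
The way a concordance passage grows to cover every search term can be pictured with a short sketch. It assumes the text has already been split into a list of words and that the positions of the matched words are known; concordance_passage and the asterisk highlighting are illustrative choices, not PhiloLogic's own.

```python
def concordance_passage(words, hit_positions, span=40):
    """Return the passage from `span` words before the first hit to
    `span` words after the last hit, with hits marked by asterisks."""
    start = max(0, min(hit_positions) - span)
    end = min(len(words), max(hit_positions) + span + 1)
    return " ".join(
        f"*{word}*" if index in hit_positions else word
        for index, word in enumerate(words[start:end], start=start)
    )

words = [f"w{i}" for i in range(100)]
print(concordance_passage(words, [50], span=2))
print(concordance_passage(words, [50, 55], span=1))
```

With a single hit the passage is symmetrical around the key word; with two hits far apart it stretches to include both, which is why proximity results in long paragraphs can run to several screens.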

In cases where a search finds more than 25 occurrences, PhiloLogic provides the first 25 occurrences with links at the bottom of the report to the remaining occurrences in sets of one hundred. One may also retrieve a full list of occurrences, which can be useful for downloading or printing but may take some time to retrieve. Note: when results number in the hundreds or thousands, the report may not be complete when one first starts to view results. In this case, one sees the message "The search is still in progress. 908 occurrences have been generated so far. (please follow the link(s) below to check on the progress)". The server continues to append results until it has completed the entire report, and one can retrieve the full report by clicking on any of the sets of one hundred.
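
As a rough illustration of this paging scheme, the following Python sketch splits a result count into a first set of 25 followed by sets of one hundred. The function name and the exact set boundaries (26-125, 126-225, and so on) are assumptions; the documentation states only the set sizes.

```python
def result_sets(total, first=25, rest=100):
    """Split occurrence numbers 1..total into display sets.

    Returns a list of inclusive (start, end) pairs: one set of `first`
    occurrences, then sets of `rest` occurrences each. The boundaries are
    an assumption; only the sizes 25 and 100 come from the documentation.
    """
    sets = []
    start = 1
    size = first
    while start <= total:
        end = min(start + size - 1, total)
        sets.append((start, end))
        start = end + 1
        size = rest
    return sets
```

Under these assumptions, a search that has generated 908 occurrences would be offered as ten sets, the last one running from 826 to 908.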



5.2 KWIC (Key Word in Context) Report (A Single Line of Text)

As in a Concordance Report, a KWIC (pronounced "quick") report indicates the number of texts searched, the search term(s) entered in a defined corpus, and the total number of occurrences found. (The number of occurrences displays at the top of the report if PhiloLogic has detected the number before generating the first 25 occurrences. If not, the total number of occurrences displays at the bottom of the report.) Following this general information is a list of occurrences. Each occurrence is represented by a short citation consisting of abbreviations for the author's name and the title of the work with a reference to where the term(s) in question occur within the document. References may be page numbers, acts and scenes, chapters and verses, columns, or the like. A KWIC Report differs from a Concordance Report in that it limits the text displayed to only a single line of text. The search term, which is highlighted, is centered in the line so that a user can quickly scan the results. At the bottom of the report one finds the Results Bibliography, which lists the full references for the short citations above. Unlike the Concordance report, a KWIC report only offers one level of linked context (typically a page reference or scene number) with search terms still highlighted and the next and previous pages (or scenes) available, if they should exist.
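
The single-line, centered layout of a KWIC report can be sketched in a few lines of Python. This is an illustration only: the function name, the default line width, and the square brackets standing in for highlighting are assumptions, not details taken from PhiloLogic.

```python
def kwic_line(text, start, end, width=79):
    """Return one line of at most `width` characters with text[start:end]
    placed in the middle.

    The keyword is wrapped in square brackets as a stand-in for
    highlighting. Line breaks in the source text are shown as spaces.
    """
    flat = text.replace("\n", " ")  # same length as text, so offsets still apply
    keyword = f"[{flat[start:end]}]"
    side = max(0, (width - len(keyword)) // 2)  # characters shown on each side
    left = flat[:start][-side:] if side else ""
    right = flat[end:][:side]
    # Pad both sides so the keyword begins at the same column on every line
    # when the keywords are of equal length.
    return f"{left.rjust(side)}{keyword}{right.ljust(side)}"
```

Because each side is padded to the same length, a column of such lines can be scanned quickly by eye, which is the point of the format.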

In cases where a search finds more than 25 occurrences, PhiloLogic provides the first 25 occurrences with links at the bottom of the report to the remaining occurrences in sets of one hundred. One may also retrieve a full list of occurrences, which can be useful for downloading or printing but may take some time to retrieve. Note: when results number in the hundreds or thousands, the report may not be complete when one first starts to view results. In this case, one sees the message "The search is still in progress. [908] occurrences have been generated so far. (please follow the link(s) below to check on the progress)". The server continues to append results until it has completed the entire report, and one can retrieve the full report by clicking on any of the sets of one hundred.

Note: when executing a "Proximity Search," especially with paragraph set as the searching parameter, it is best to avoid the KWIC format, since not all search terms are likely to appear in the single line of text displayed. The term that occurs first in the paragraph is the one centered in the line. Using the Concordance results format ensures that all terms are included in the display, even if the paragraph should happen to run for several pages. One can switch from the KWIC format to the Concordance Report format, and back, at any time while viewing results. PhiloLogic takes the user to the same set of results being viewed at the time of the switch.



5.3 Frequency by Title Report

A Frequency by Title report indicates the bibliographic criteria entered, the number of documents searched, the search term(s) entered, the number of unique forms derived from the search term(s) within the database, a list of those unique forms, and the total number of occurrences found in the defined corpus. Following this information, the report indicates the number of occurrences by title in descending order of frequency with a link to the digital table of contents for each title and a link to the occurrences found within that title. See below for an example (links to the table of contents and occurrences have been disabled).


Bibliographic criteria: author=robinson
Searching 6 documents for eft.?|newt.
Number of Unique Forms: 5

Search Terms: newt | Newt | eft | Eft | efts

Your search found 3 occurrences.


Frequency by title in descending numeric order:

1. 2 The Collected Poetry of Robinson Jeffers: Edited by Tim Hunt: Volume 3 1938-1962, Jeffers, Robinson [Occurrences]
2. 1 Collected poems of Edwin Arlington Robinson, Robinson, Edwin Arlington [Occurrences]


The Frequency by Title Report is useful if one is curious how frequently an author uses a term in one work as compared to his/her other works, or in his/her works as compared to those of others. It can also be enlightening to see which forms within a database one's search term actually matches (for example, one can discover that entering the search term magic.* in Early English Prose Fiction searches for the following unique forms: magic, magical, magicall, magician, magicians, magick, magicke, and magicks).
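
To see how a search pattern expands into unique forms, one can imitate the process with Python's re module. This is a sketch under stated assumptions: PhiloLogic's pattern syntax may differ from Python's, the function name and the word list are hypothetical, and the whole-word, case-insensitive matching is inferred from the sample report above (where eft.?|newt yields newt, Newt, eft, Eft, and efts) rather than documented behavior.

```python
import re


def expand(pattern, word_forms):
    """Return the indexed forms that `pattern` matches in full, ignoring case.

    word_forms stands in for the database's word index (an assumption).
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return sorted(form for form in set(word_forms) if regex.fullmatch(form))


# A hypothetical word index containing near misses as well as matches.
index = ["newt", "Newt", "eft", "Eft", "efts", "left", "newts", "theft"]
print(expand("eft.?|newt", index))
```

Matching against the whole word is what keeps forms such as "left" and "theft" out of the list, and the optional character in eft.? is what admits "efts" but not a longer form.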

Any definable corpus or search can be used in generating this report. Unlike Concordance and KWIC reports, this report does not display text, only frequency statistics with links to occurrences displayed in Concordance Report format. Note: the sets of occurrences linked to from the frequency report are numbered in chronological order, not by frequency. In other words, clicking on the [Occurrences] link for a title at the top of the list could, for example, bring up occurrences numbered 21-28 instead of 1-8 because that title, while ranked first in frequency, is not first chronologically.
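
The difference between frequency ranking and chronological numbering can be made concrete with a small Python sketch. The function name, the input format (one title per occurrence, already in chronological order), and the placeholder titles are all assumptions for illustration.

```python
def frequency_by_title(occurrence_titles):
    """Summarize occurrences that are listed in chronological order.

    occurrence_titles: one title per occurrence, in chronological order, so
    occurrence number n is simply the n-th item (numbering starts at 1).
    Returns (title, count, first_number, last_number) rows sorted by
    descending count; titles with equal counts keep chronological order.
    """
    rows = {}
    for number, title in enumerate(occurrence_titles, start=1):
        if title not in rows:
            rows[title] = [0, number, number]
        rows[title][0] += 1       # one more occurrence for this title
        rows[title][2] = number   # latest occurrence number seen so far
    table = [(t, count, first, last) for t, (count, first, last) in rows.items()]
    return sorted(table, key=lambda row: -row[1])


# Placeholder titles A-D, with D the latest chronologically but the most frequent.
titles = ["A"] * 7 + ["B"] * 6 + ["C"] * 7 + ["D"] * 8
for row in frequency_by_title(titles):
    print(row)
```

Here title D heads the frequency list with 8 occurrences, yet its occurrences are numbered 21-28 because three earlier titles account for numbers 1-20.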

5.4 Frequency by Author Report

A Frequency by Author report indicates the bibliographic criteria entered, the number of documents searched, the search term(s) entered, the number of unique forms derived from the search term(s) within the database, a list of those unique forms, and the total number of occurrences found in the defined corpus. Following this information, the report indicates the number of occurrences by author in descending order of frequency. Individual titles are listed beneath each author, each with a link to the digital table of contents for that title and a link to the occurrences found within it. See below for an example (links to the table of contents and occurrences have been disabled).

Bibliographic criteria: none
Searching Entire Database for newt|eft.?.
Number of Unique Forms: 3

Search Terms: eft | efts | newt

Your search found 4 occurrences.


Frequency by Author in descending numeric order:

1. Scott, Walter, Sir, 1771--1832: 2
      1: A Legend of Montrose [in, the Waverley Novels]  [Occurrences]
      1: Guy Mannering; Or, The Astrologer [in, the Waverley Novels]  [Occurrences]
2. Lytton, Edward Bulwer Lytton, Baron, 1803--1873: 2
      2: Pelham; Or, The Adventures Of A Gentleman  [Occurrences]


Any definable corpus or search can be used in generating this report. Unlike Concordance and KWIC reports, this report does not display text, only frequency statistics with links to occurrences displayed in Concordance Report format. Note: the sets of occurrences linked to from the frequency report are numbered in chronological order, not by frequency. In other words, clicking on the [Occurrences] link for a title at the top of the list could, for example, bring up occurrences numbered 21-28 instead of 1-8 because that author's title, while ranked first in frequency, is not first chronologically.
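
The two-level grouping of this report (authors first, then each author's titles) can be sketched in Python as follows. The function name and the (author, title) input format are assumptions, not PhiloLogic's internal representation; the sample data reproduces the counts from the example report above.

```python
from collections import Counter


def frequency_by_author(occurrences):
    """Group (author, title) occurrences by author, then by title.

    Returns a list of (author, total, [(title, count), ...]) sorted by
    descending total, with each author's titles sorted the same way.
    Authors with equal totals keep the order in which they first appear.
    """
    per_author = {}
    for author, title in occurrences:
        per_author.setdefault(author, Counter())[title] += 1
    report = [
        (author, sum(titles.values()), titles.most_common())
        for author, titles in per_author.items()
    ]
    return sorted(report, key=lambda row: -row[1])


occurrences = [
    ("Scott, Walter, Sir", "A Legend of Montrose"),
    ("Scott, Walter, Sir", "Guy Mannering; Or, The Astrologer"),
    ("Lytton, Edward Bulwer Lytton, Baron", "Pelham; Or, The Adventures Of A Gentleman"),
    ("Lytton, Edward Bulwer Lytton, Baron", "Pelham; Or, The Adventures Of A Gentleman"),
]
for author, total, titles in frequency_by_author(occurrences):
    print(f"{author}: {total}")
    for title, count in titles:
        print(f"      {count}: {title}")
```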

5.5 Navigating Documents from Word Searches

In a Concordance report one finds several options for viewing more context around one's matched term(s). In addition to "page" and paragraph, one finds other levels of context. The parts of a document in up to three levels of hierarchy are indicated by h3, h2, and h1 and reflect the logical organization of the document from smaller parts (h3) to larger parts (h1). In other words, the top level part of a hierarchy is h1; the second level is h2; and the third level of a hierarchy is h3. What each level represents depends upon how each text was encoded and so in some cases there may not be an h3 (e.g., Volume/Book/Chapter or Act/Scene). Any part of any level may be selected by simply clicking on it. Once a user goes to a second level of context, he/she will find the search term(s) still highlighted. One may also find the next and previous sections for each level if one should wish to "flip through" the document by sections (provided that a next or previous section exists for a given level). As always, the linked table of contents for the entire work is available by clicking on the title of the work as listed in the Results Bibliography at the bottom of a report or in the reference citation, when within sections, listed at the top and bottom of any level of sections.
Please note that some databases have limited navigation because of copyright restrictions. In these databases only a few pages of context are allowed and the links from the digital table of contents are disabled.
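
The "flip through" navigation by sections can be pictured with a small Python sketch. It assumes a document is represented as a flat list of (level, label) pairs in document order, and that the next or previous section may lie under a different parent (for example, the scene after the last scene of Act I is the first scene of Act II). Both the representation and that behavior are assumptions, not documented features.

```python
def neighbors(sections, index):
    """Return (previous, next) sections at the same level as sections[index].

    sections: (level, label) pairs in document order, e.g. ("h1", "Act I").
    Either element of the result is None when no such section exists.
    """
    level = sections[index][0]
    previous = next(
        (s for s in reversed(sections[:index]) if s[0] == level), None)
    following = next(
        (s for s in sections[index + 1:] if s[0] == level), None)
    return previous, following


# A hypothetical play encoded with two levels of hierarchy.
play = [
    ("h1", "Act I"),
    ("h2", "Act I, Scene 1"),
    ("h2", "Act I, Scene 2"),
    ("h1", "Act II"),
    ("h2", "Act II, Scene 1"),
]
print(neighbors(play, 2))
```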

Notes: In PhiloLogic, notes never interfere with searches of the text to which they refer. Note references are linked to notes and, in recently acquired databases, text from notes is linked to page references. Note references can be found on any level of context (e.g., page, paragraph, h3, h2, or h1), but not on a first-level results screen.

Images: Most images are displayed as inline images once the user pulls up any level of context (e.g., page, paragraph, h3, h2, or h1), but not from a first-level results screen.

Sound: In databases for which there are recordings, one finds links to RealAudio files from any level of context (e.g., page, paragraph, h3, h2, or h1), but not from a first-level results screen.



6. Getting More Help

The Library offers a number of instructional opportunities for the University of Chicago community on searching full-text databases available through ETS. Once a month an open course is offered in the Electronic Classroom, JRL 153, to those interested in full-text manipulation in general. Patrons can also arrange for tutorials or workshops at other times and on particular databases. Faculty and preceptors are especially encouraged to set up classroom instruction tailored to their specific courses. To set up tutorials and workshops please contact:

Catherine Mardikes, ETS Coordinator, located on the fourth floor of the Joseph Regenstein Library, Room 471; (c-mardikes@uchicago.edu; 702-2783).



7. Local Issues

7.1 Access Policies

Access for the University of Chicago: Most ETS databases are site-restricted to the University of Chicago community. Access is controlled by the name or number of the requesting computer. For the University of Chicago, the computer's name must end with uchicago.edu or its IP address must begin with 128.135. The University of Chicago Data Network Operations provides dial-up services to the campus network. One may also connect to the campus network from a third-party Internet service provider by using the NSIT Web Proxy Server.
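
The name-or-number rule can be expressed as a short Python check. This is a sketch of the rule as stated, not the server's actual configuration: the function name is hypothetical, "beginning with 128.135" is read as the network 128.135.0.0/16, and the name test requires a dot before uchicago.edu (a slightly stricter reading, so that a name such as notuchicago.edu is not accepted).

```python
import ipaddress

CAMPUS_NETWORK = ipaddress.ip_network("128.135.0.0/16")  # addresses 128.135.x.x
CAMPUS_DOMAIN = "uchicago.edu"


def is_campus_client(hostname, ip):
    """Return True if the host name or the IP address identifies a campus machine."""
    host = (hostname or "").lower().rstrip(".")
    if host == CAMPUS_DOMAIN or host.endswith("." + CAMPUS_DOMAIN):
        return True
    try:
        return ipaddress.ip_address(ip) in CAMPUS_NETWORK
    except ValueError:
        # The address could not be parsed, so it cannot be on campus.
        return False
```

A machine off campus that matches neither test would need to come in through one of the routes described above, such as the dial-up service or the proxy server.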

Access for Other Institutions to databases under PhiloLogic:

To discuss the possibility of gaining access to the PhiloLogic version of a database contact Catherine Mardikes, ETS Coordinator, at c-mardikes@uchicago.edu.



Please direct comments or queries about this service to ets@lib.uchicago.edu.

The ARTFL Project   |   The University of Chicago Library