Textual Analysis and Annotation
This web-based tool allows you to analyze the text on both web pages and documents you upload from your computer.
Part of the Machine Learning for Language Toolkit (MALLET), Topic Modeling can analyze large volumes of text to determine clusters of words that frequently occur together. This tool is installed locally and has specific instructions for document analysis.
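To give a rough sense of what "words that frequently occur together" means, the sketch below counts how often pairs of words appear in the same document. It only illustrates co-occurrence counting and is not the method MALLET uses, which fits probabilistic topic models. The function name `cooccurrence_counts` and the sample documents are invented for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how many documents each unordered word pair appears in together."""
    counts = Counter()
    for text in documents:
        # Lowercase and split on whitespace; a real pipeline would also
        # strip punctuation and remove stop words.
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


# Hypothetical sample documents, for illustration only.
docs = [
    "whale ship sea captain",
    "ship sea storm",
    "garden rose spring",
    "rose garden rain",
]

pairs = cooccurrence_counts(docs)
# Keep only the pairs that share at least two documents.
top_pairs = [pair for pair, n in pairs.items() if n >= 2]
print(sorted(top_pairs))
```

With these sample documents, the pairs that recur hint at two loose themes (seafaring and gardens), which is the kind of grouping a topic model estimates in a more principled way.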
Scripto is an open source community transcription tool for document and media files.
TAPoR is a gateway to the tools used in sophisticated text analysis and retrieval. You can browse tools by type or tag; search and use tools; read and create tool reviews; and contribute and advertise tools.
TEI: Text Encoding Initiative
"The Text Encoding Initiative (TEI) is a consortium which collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines which specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics."
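As a small illustration of what "machine-readable" encoding makes possible, the sketch below reads a TEI-style document with Python's standard library and pulls out the title and the tagged names. The sample document is invented for this example and is far simpler than a real TEI file; `persName` and `placeName` are TEI elements, and the namespace shown is the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping used in the path expressions below.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample document, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter</title></titleStmt>
      <publicationStmt><p>Unpublished example.</p></publicationStmt>
      <sourceDesc><p>Invented for illustration.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p><persName>Ada Lovelace</persName> wrote from
         <placeName>London</placeName>.</p>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(SAMPLE)

# The title lives in the header; names are tagged inline in the text.
title = root.find(".//tei:titleStmt/tei:title", TEI_NS).text
people = [el.text for el in root.iterfind(".//tei:persName", TEI_NS)]
places = [el.text for el in root.iterfind(".//tei:placeName", TEI_NS)]

print(title, people, places)
```

Because the names are marked up explicitly, a program can list every person or place in a text without guessing from the wording.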
Stanford Named Entity Recognizer
"Stanford NER is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names. It comes with well-engineered feature extractors for Named Entity Recognition, and many options for defining feature extractors. Included with the download are good named entity recognizers for English, particularly for the 3 classes (PERSON, ORGANIZATION, LOCATION), and we also make available on this page various other models for different languages and circumstances, including models trained on just the CoNLL 2003 English training data. The distributional similarity features in some models improve performance but the models require considerably more memory."
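To show what "labeling sequences of words" looks like in practice, here is a toy sketch that tags tokens by looking them up in a small name list. This is not how Stanford NER works (it relies on trained statistical models, not a fixed list); the function `label_entities` and the name list are hypothetical and exist only for this example. The label "O" marks tokens that are outside any entity.

```python
def label_entities(tokens, gazetteer, max_len=3):
    """Label token spans found in the gazetteer, preferring longer matches."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            if span in gazetteer:
                for j in range(i, i + n):
                    labels[j] = gazetteer[span]
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical name list, for illustration only.
GAZETTEER = {
    "Jane Austen": "PERSON",
    "Oxford University Press": "ORGANIZATION",
    "Bath": "LOCATION",
}

tokens = "Jane Austen lived in Bath".split()
labels = label_entities(tokens, GAZETTEER)
print(list(zip(tokens, labels)))
```

A fixed list misses any name it has not seen, which is one reason tools such as Stanford NER learn from annotated training data instead.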
"WordSeer is a text analysis environment that combines visualization, information retrieval, sensemaking and natural language processing to make the contents of text navigable, accessible, and useful."
"The most significant difference between Annotation Studio and other digital annotation projects is its emphasis on student-centered design and pedagogy. Most other annotation tools assume user familiarity with TEI, and a well-developed understanding of the relationships between literary sources, manuscripts, editions, and adaptations. Annotation Studio makes sophisticated yet easy-to-use commenting tools immediately accessible to students with no prior experience with close textual analysis or TEI."
"A.nnotate is an online annotation, collaboration and indexing system for documents and images, supporting PDF, Word and other document formats. Instead of emailing different versions of a document back and forth you can now all comment on a single read-only copy online. Documents are displayed in high quality with fonts and layout just like the printed version. It is easy to use and runs in all common web browsers, with no software or plugins to install."
"Our team is building an open platform for discussion on the web. It leverages annotation to enable sentence-level critique or note-taking on top of news, blogs, scientific articles, books, terms of service, ballot initiatives, legislation and more. Everything we build is guided by our principles. In particular that it be free, open, non-profit, neutral and lasting to name a few."