DOI: 10.1177/0075424204264856

The American National Corpus: Overall Goals and the First Release

Northern Arizona University
Vassar College

The American National Corpus (ANC) will be a carefully designed corpus of 100 million words of American written and spoken language that generally follows the framework of the British National Corpus. The ANC project will provide both a standard format for text encoding and a format for different types of corpus annotation (e.g., parts of speech, rhetorical features), as well as for different versions of the same type of annotation (e.g., multiple part-of-speech taggings). As the only widely available large corpus of spoken and written American English containing a variety of registers, the ANC will represent a synchronic slice of American English. The First Release of the ANC, described in this article, is a preview of the corpus and a chance for researchers to contribute feedback on format and related issues, while giving them access to data without having to wait until the entire corpus is completed.
Key Words: American National Corpus; corpus linguistics; computational linguistics; encoding; annotation


