UNIVERSAL/STANFORD DEPENDENCIES. Stanford Parser v3.7.0
-----------------------------------------------------------

IMPORTANT: Starting with version 3.5.2 the default dependencies
representation output by the Stanford Parser is the new Universal
Dependencies Representation. Universal Dependencies were developed
with the goal of being a cross-linguistically valid representation.
Note that some constructions such as prepositional phrases are now
analyzed differently and that the set of relations was updated. The
online documentation of English Universal Dependencies at

    http://www.universaldependencies.org

should be consulted for the current set of dependency relations.


The parser and converter also still support the original
Stanford Dependencies as described in the Stanford Dependencies
manual. Use the flag

    -originalDependencies

to obtain the original Stanford Dependencies. Note, however, that we
are no longer maintaining the SD converter or representation and we
therefore recommend using the Universal Dependencies representation
for any new projects.


The manual for the English version of the Stanford Dependencies
representation:

    StanfordDependenciesManual.pdf

should be consulted for the set of dependency relations in the original
Stanford Dependencies representation and the correct commands for
generating Stanford Dependencies together with any of the Stanford Parser,
another parser, or a treebank.

A typed dependencies representation is also available for Chinese. For
the moment the documentation consists of the code, and a brief
presentation in this paper:

Pi-Chuan Chang, Huihsin Tseng, Dan Jurafsky, and Christopher
D. Manning. 2009. Discriminative Reordering with Chinese Grammatical
Relations Features. Third Workshop on Syntax and Structure in Statistical
Translation. http://nlp.stanford.edu/pubs/ssst09-chang.pdf

--------------------------------------
DEPENDENCIES SCHEMES

For an overview of the original English Universal Dependencies schemes,
please look at:

Marie-Catherine de Marneffe, Timothy Dozat, Natalia Silveira, Katri Haverinen,
Filip Ginter, Joakim Nivre, and Christopher D. Manning. 2014. Universal Stanford
dependencies: A cross-linguistic typology. 9th International Conference on
Language Resources and Evaluation (LREC 2014).
http://nlp.stanford.edu/~manning/papers/USD_LREC14_UD_revision.pdf

and

Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič,
Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira,
Reut Tsarfaty, and Daniel Zeman. 2016. Universal Dependencies v1: A Multilingual
Treebank Collection. In Proceedings of the Tenth International Conference on Language
Resources and Evaluation (LREC 2016).
http://nlp.stanford.edu/pubs/nivre2016ud.pdf

Please note, though, that some of the relations discussed in the first paper
were subsequently updated; please refer to the online documentation at

    http://www.universaldependencies.org

for up-to-date documentation of the set of relations.

For an overview of the enhanced and enhanced++ dependency representations,
please look at:

Sebastian Schuster and Christopher D. Manning. 2016. Enhanced English Universal
Dependencies: An Improved Representation for Natural Language Understanding Tasks.
In Proceedings of the Tenth International Conference on Language Resources and
Evaluation (LREC 2016).
http://nlp.stanford.edu/~sebschu/pubs/schuster-manning-lrec2016.pdf

For an overview of the original typed dependencies scheme, please look
at:

Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D.
Manning. 2006. Generating Typed Dependency Parses from Phrase
Structure Parses. 5th International Conference on Language Resources
and Evaluation (LREC 2006).
http://nlp.stanford.edu/~manning/papers/LREC_2.pdf

For more discussion of the design principles, please see:

Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The
Stanford typed dependencies representation. In Proceedings of the
workshop on Cross-Framework and Cross-Domain Parser Evaluation, pp. 1-8.
http://nlp.stanford.edu/~manning/papers/dependencies-coling08.pdf

These papers can be cited as references for the original English Stanford
Dependencies and English Universal Dependencies.

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v3.7.0

Implementation of enhanced and enhanced++ dependency
representations as described in Schuster and Manning (2016).

Fixed concurrency issue.

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v3.5.2

Switch to Universal Dependencies as the default representation.
Please see the Universal Dependencies documentation at

    http://www.universaldependencies.org

for more information on the new relations.

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v3.5.1

A couple of small fixes were made, leading to ccomp and advcl being
recognized in a couple of new environments.

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v3.4

One major change was made to the dependency taxonomy:

- We decided to collapse together the two dependencies partmod and infmod,
  since they have similar function and mainly differ in the form of the verbal
  head, which is recorded in the POS tag anyway. Those two relations were
  removed from the taxonomy, and a new relation vmod covering the union of both
  was added.

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v3.3.1

A couple of fixes/improvements were made in the dependency conversion,
and one change was made to the taxonomy of relations.

- The partmod and infmod relations were deleted, and replaced with
  vmod for reduced, non-finite verbal modifiers. The distinction between
  these two relations can be recovered from the POS tag of the dependent.
- A couple of improvements were made to the conversion, the largest
  one being recognizing pobj inside a PP not headed by something tagged
  as IN or TO.

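
For code that still expects the older labels, the distinction mentioned
above can be approximated from the POS tag of the dependent. The Python
sketch below is illustrative only and is not part of the Stanford Parser.
The function name and the tag-to-label mapping (VBG/VBN as participial,
VB as infinitival) are assumptions based on Penn Treebank tag conventions,
not something this file specifies.

```python
# Hypothetical helper, not part of the Stanford Parser distribution.
# Assumption: Penn Treebank tags, where VBG/VBN mark participles and
# VB marks the base form used in infinitival modifiers.
PARTICIPLE_TAGS = {"VBG", "VBN"}
INFINITIVE_TAGS = {"VB"}


def legacy_vmod_label(dependent_pos):
    """Guess the pre-3.3.1 label for a vmod dependent from its POS tag."""
    if dependent_pos in PARTICIPLE_TAGS:
        return "partmod"
    if dependent_pos in INFINITIVE_TAGS:
        return "infmod"
    # Leave the label unchanged when the tag gives no clear signal.
    return "vmod"


print(legacy_vmod_label("VBG"))  # partmod
print(legacy_vmod_label("VB"))   # infmod
```

Tags outside the two assumed sets keep the vmod label rather than being
forced into one of the old categories.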

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v3.3

Some fixes/improvements were made in the dependency conversion, and one
change was made to the taxonomy of relations.

- For currency amount expressions with a currency symbol like "$", it
  had previously been the case that "$" was the head, and then each
  number word modified it as a number. We realized that this was
  unnecessarily inconsistent. For the expression "two thousand dollars",
  "dollars" is the head, but "thousand" is a num modifier of it, and
  number is used for the parts of a number multi-word expression only.
  This analysis is now also used for cases with a currency symbol. E.g.,
  "for $ 52.7 million": prep(for, $) num($, million) number(million, 52.7).
  Similarly, for "the $ 2.29 billion value", we changed the analysis from
  num(value, $) number($, billion) to amod(value, $) num($, billion).
  This corresponds to what you got for "a two dollar value".
  This is actually the most common change (at least on WSJ newswire!).
- The attr relation was removed. Some cases disappear because the question
  phrase of "WHNP be NP" questions is now made the root. Others
  (predicative NP complements) become xcomp.
- Less aggressive labeling of participial form VPs as xcomp. More of them
  are correctly labeled partmod (but occasionally a true xcomp is also
  mislabeled as partmod).
- Small rule changes to recognize a few more ccomp and parataxis.


--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v3.2, JUNE 2013

An improved dependency conversion means that our dependency trees are
not always projective. In addition, one deletion was made from the
taxonomy of relations, and various small converter fixes were made:
- rel was removed. rel was originally used as the relation for an
  overt relativizer in a relative clause. But it was never a real
  grammatical relation, and we gradually started labeling easy cases
  as nsubj or dobj. In this release, rel is removed, pobj cases are
  also labeled, and the remaining hard cases are labeled as dep.
- As a result of correctly labeling a pobj in questions and relative
  clauses, the converter now sometimes produces non-projective dependency
  trees (ones with crossing dependencies, if the words are laid out in
  their normal order in a line, and all dependency arcs are drawn above
  them). This is not a bug; it is an improvement in the generated
  dependencies, but you should be aware that Stanford Dependencies
  trees are now occasionally non-projective. (Some simple dependency
  parsing algorithms only produce projective dependency trees.)

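
The notion of crossing dependencies described above can be checked
mechanically. The following Python sketch is not part of the Stanford
Parser; the function name and the input format (pairs of word positions)
are assumptions made for illustration. It reports whether any two arcs
cross when the words are laid out in order and all arcs are drawn above
them.

```python
def is_projective(arcs):
    """Return True if no two arcs cross.

    arcs is an iterable of (governor_index, dependent_index) pairs using
    word positions; an artificial root at index 0 is allowed.
    """
    # Direction does not matter for crossing, so store each arc as (left, right).
    spans = [tuple(sorted(arc)) for arc in arcs]
    for i, (a, b) in enumerate(spans):
        for c, d in spans[i + 1:]:
            # Two arcs cross when exactly one endpoint of one arc lies
            # strictly inside the span of the other.
            if a < c < b < d or c < a < d < b:
                return False
    return True


print(is_projective([(2, 1), (2, 3)]))  # True
print(is_projective([(1, 3), (2, 4)]))  # False
```

The pairwise comparison is quadratic in the number of arcs, which is
usually acceptable for single sentences.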

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v2.0.5, MARCH 2013

We have begun a more major effort to improve the suitability and coverage of
Stanford Dependencies on less formal text types, and to clean up a couple of
the more quirky dependencies in the original set. These changes are still
ongoing, but in this first installment, we have removed 3 dependencies and
added 2:
- abbrev was removed, and is now viewed as just a case of appos.
- complm was removed, and is now viewed as just a case of mark.
  (This is consistent with an HPSG-like usage of mark.)
- purpcl was removed, and is now viewed as just a case of advcl.
- discourse was added. The lack of a dependency type for
  interjections was an omission even in the early versions, but it
  became essential as we expanded our consideration of informal
  text types. It is used for interjections, fillers, discourse markers
  and emoticons.
- goeswith was added. In badly edited text, it is used to join the
  two parts of a word.

A few other changes and improvements were also made, including improvements
in the recognition of advcl. There has been a reduction of "dep" dependencies
of about 14% on newswire (and higher on more informal text genres).

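
The three removals in the v2.0.5 list each fold an old relation into an
existing one, so dependencies produced by older versions can be relabeled
with a simple lookup. The Python sketch below is illustrative only; the
table and function names are hypothetical and are not part of the
Stanford Parser.

```python
# Hypothetical migration table built from the three v2.0.5 removals.
RENAMED_RELATIONS = {
    "abbrev": "appos",
    "complm": "mark",
    "purpcl": "advcl",
}


def modernize_relation(name):
    """Map a removed relation name to its replacement; keep others as-is."""
    return RENAMED_RELATIONS.get(name, name)


print(modernize_relation("complm"))  # mark
print(modernize_relation("nsubj"))   # nsubj
```

The mapping only goes one way: the old, finer-grained labels cannot be
recovered from the new ones with this table.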

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v2.0.4, NOVEMBER 2012

A few minor changes and fixes were made: HYPH is now recognized and treated
as punctuation, and clausal complements of adjectives (including comparatives)
are recognized as ccomp.

--------------------------------------

CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v1.6.9

This version adds an explicit root dependency in the set of dependencies
returned. In the past, there had been no explicit representation of the
root of the sentence in the set of dependencies returned, except in the
CoNLL format output, which always showed the root. Now, there is always
an explicit extra dependency that marks the sentence root, using a fake
ROOT pseudoword with index 0. That is, the root is marked in this way:
    root(ROOT-0, depends-3)
Otherwise there were only a couple of minute changes in the dependencies
produced (appositions are now recognized in WHNPs!).

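
Text output in the form shown above can be read back with a small
parser. The Python sketch below is not part of the Stanford Parser; the
function names and the regular expression are assumptions, and it only
handles plain "word-index" tokens (no whitespace inside words and no
extra markers after the index). The input lines other than the root line
are made up for illustration.

```python
import re

# Matches lines such as "root(ROOT-0, depends-3)".
# Assumption: each argument is a word followed by "-<index>".
DEP_PATTERN = re.compile(r"^([^\s(]+)\((.+)-(\d+), (.+)-(\d+)\)$")


def parse_dependency(line):
    """Split one typed dependency line into (relation, governor, dependent)."""
    match = DEP_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("not a typed dependency: %r" % line)
    rel, gov, gov_idx, dep, dep_idx = match.groups()
    return rel, (gov, int(gov_idx)), (dep, int(dep_idx))


def find_root(lines):
    """Return the (word, index) marked by the explicit root dependency."""
    for line in lines:
        rel, governor, dependent = parse_dependency(line)
        if rel == "root" and governor == ("ROOT", 0):
            return dependent
    return None


sample = [
    "nsubj(depends-3, It-1)",
    "advmod(depends-3, really-2)",
    "root(ROOT-0, depends-3)",
]
print(find_root(sample))  # ('depends', 3)
```

Output from versions before 1.6.9 has no root line, in which case
find_root returns None.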

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- v1.6.8

This version includes only small fixes, principally addressing some gaps
in the correct treatment of dependencies in inverted sentence (SQ and SINV)
constructions, and some errors in the treatment of copulas in the presence of
temporal NPs.


--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- NOVEMBER 2010 - JANUARY 2011

Two changes were made to the taxonomy of dependencies.
- measure (phrase modifier) was generalized and replaced by
  npadvmod (noun phrase adverbial modifier) which includes measure
  phrases and other adverbial uses of noun phrases. Temporal NPs
  (tmod) are now a subtype of npadvmod in the dependency hierarchy.
- mwe (multi-word expression) is introduced for certain common
  function word dependencies for which another good analysis isn't
  easy to come by (and which were frequently dep before) such as
  "instead of" or "rather than".

A new option has been added to allow the copula to be treated as
the head when it has an adjective or noun complement.

The conversion software will now work fairly well with the
David Vadas version of the treebank with extra noun phrase
structure. (A few rare cases that are handled with the standard
treebank aren't yet handled, but you will get better dependencies
for compound nouns and multiword adjectival modifiers, etc.)

Considerable improvements were made in the coverage of named
dependencies. You should expect to see only about half as many generic
"dep" dependencies as in version 1.6.4.

--------------------------------------
CHANGES IN ENGLISH TYPED DEPENDENCIES CODE -- JUNE-AUGUST 2010

No new dependency relations have been introduced.

There have been some significant improvements in the generated
dependencies, principally covering:
- Better resolution of nsubj and dobj long distance dependencies
  (but v1.6.4 fixes the overpercolation of dobj in v1.6.3)
- Better handling of conjunction distribution in CCprocessed option
- Correction of bug in v1.6.2 that made certain verb dependents noun
  dependents.
- Better dependencies are generated for question structures (v1.6.4)
- Other minor improvements in recognizing passives, adverbial
  modifiers, etc.