Anisotropic span embeddings and the negative impact of higher-order inference for coreference resolution: An empirical analysis

dc.citation.volume: First View
dc.contributor.author: Hou F
dc.contributor.author: Wang R
dc.contributor.author: Ng S-K
dc.contributor.author: Zhu F
dc.contributor.author: Witbrock M
dc.contributor.author: Cahan SF
dc.contributor.author: Chen L
dc.contributor.author: Jia X
dc.date.accessioned: 2024-10-07T22:40:48Z
dc.date.available: 2024-10-07T22:40:48Z
dc.date.issued: 2024-01-25
dc.description.abstract: Coreference resolution is the task of identifying and clustering mentions that refer to the same entity in a document. Based on state-of-the-art deep learning approaches, end-to-end coreference resolution considers all spans as candidate mentions and tackles mention detection and coreference resolution simultaneously. Recently, researchers have attempted to incorporate document-level context using higher-order inference (HOI) to improve end-to-end coreference resolution. However, HOI methods have been shown to have a marginal or even negative impact on coreference resolution. In this paper, we reveal the reasons for the negative impact of HOI on coreference resolution. Contextualized representations (e.g., those produced by BERT) for building span embeddings have been shown to be highly anisotropic. We show that HOI actually increases and thus worsens the anisotropy of span embeddings, making it difficult to distinguish between related but distinct entities (e.g., pilots and flight attendants). Instead of using HOI, we propose two methods, Less-Anisotropic Internal Representations (LAIR) and Data Augmentation with Document Synthesis and Mention Swap (DSMS), to learn less-anisotropic span embeddings for coreference resolution. LAIR uses a linear aggregation of the first layer and the topmost layer of contextualized embeddings. DSMS generates more diversified examples of related but distinct entities by synthesizing documents and by mention swapping. Our experiments show that less-anisotropic span embeddings improve performance significantly (+2.8 F1 gain on the OntoNotes benchmark), reaching new state-of-the-art performance on the GAP dataset.
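The two core ideas in the abstract can be illustrated with a minimal sketch: LAIR's linear aggregation of first- and top-layer contextualized embeddings, and the usual proxy for anisotropy (mean pairwise cosine similarity, which is high when embeddings crowd into a narrow cone). This is not the authors' implementation; the weight `alpha` and both function names are hypothetical, and the layer tensors stand in for per-token outputs of an encoder such as BERT.

```python
import numpy as np

def lair_span_embedding(first_layer, top_layer, alpha=0.5):
    """Sketch of LAIR's idea: linearly aggregate the first and topmost
    encoder layers. `alpha` is a hypothetical mixing hyperparameter."""
    return alpha * first_layer + (1.0 - alpha) * top_layer

def avg_pairwise_cosine(embeddings):
    """Mean pairwise cosine similarity over rows, a common anisotropy
    proxy: values near 1.0 indicate highly anisotropic embeddings."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = embeddings.shape[0]
    # Exclude the n self-similarities (each exactly 1) on the diagonal.
    return (sims.sum() - n) / (n * (n - 1))

# Toy example: two token embeddings per layer (2-dimensional for clarity).
first = np.array([[1.0, 0.0], [0.0, 1.0]])  # isotropic: orthogonal rows
top = np.array([[3.0, 4.0], [4.0, 3.0]])    # anisotropic: similar rows
mixed = lair_span_embedding(first, top, alpha=0.5)
```

The intuition the abstract relies on is visible even in this toy case: the first layer's rows are orthogonal (`avg_pairwise_cosine(first)` is 0.0), the top layer's rows point in nearly the same direction (0.96), so mixing in the first layer lowers the anisotropy of the aggregate.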
dc.description.confidential: false
dc.edition.edition: 2024
dc.identifier.citation: Hou F, Wang R, Ng SK, Zhu F, Witbrock M, Cahan SF, Chen L, Jia X. (2024). Anisotropic span embeddings and the negative impact of higher-order inference for coreference resolution: An empirical analysis. Natural Language Engineering. First View.
dc.identifier.doi: 10.1017/S1351324924000019
dc.identifier.eissn: 1469-8110
dc.identifier.elements-type: journal-article
dc.identifier.issn: 1351-3249
dc.identifier.uri: https://mro.massey.ac.nz/handle/10179/71617
dc.language: English
dc.publisher: Cambridge University Press
dc.publisher.uri: https://www.cambridge.org/core/journals/natural-language-engineering/article/anisotropic-span-embeddings-and-the-negative-impact-of-higherorder-inference-for-coreference-resolution-an-empirical-analysis/E59F426F59F86445BD3A0B9EA24EBB4A
dc.relation.isPartOf: Natural Language Engineering
dc.rights: (c) 2024 The Author/s
dc.rights: CC BY 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Coreference resolution
dc.subject: higher-order inference
dc.subject: anisotropic span embeddings
dc.subject: contextualized representations
dc.title: Anisotropic span embeddings and the negative impact of higher-order inference for coreference resolution: An empirical analysis
dc.type: Journal article
pubs.elements-id: 486356
pubs.organisational-group: Other
Files
Original bundle
Name: Published version.pdf
Size: 1.02 MB
Format: Adobe Portable Document Format
Description: 486356 PDF.pdf
License bundle
Name: license.txt
Size: 9.22 KB
Format: Plain Text
Collections