PubAnnotation


This page is still being improved. Any comments are appreciated.

PubAnnotation supports four different types of annotation for modeling any semantic information expressed in text.

Category annotations (catanns)

A category annotation in PubAnnotation consists of a span and its category, and identifies the span as a reference to an object of that category. A span is specified by a pair of character offsets, begin and end, delimited by a colon (':'). Semantically, a category annotation represents an entity of the given category in the context surrounding the span. It is also known as a text-bound entity.

Example of category annotations [Figure 1]

Id.  span     category         text
T1   193:200  Gene_expression  produce
T2   201:223  Protein_family   inflammatory cytokines
T3   232:237  Protein          IFN-γ
T4   242:245  Protein          TNF

In the above example, four entities are identified: gene expression, protein family, and two proteins.

Typical annotations of this type include text-bound term annotation, named entity annotation, and so on. However, any annotation that can be created by selecting a span and assigning a category can be classified as a category annotation.
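
As a minimal sketch of the span notation described above, the following Python code parses the rows of Figure 1 into simple records. The class and function names (CategoryAnnotation, parse_span, parse_catann) are hypothetical and not part of PubAnnotation. In the Figure 1 rows, end minus begin equals the length of the text, which suggests that the end offset is exclusive; the sketch treats that as an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CategoryAnnotation:
    """One row of the Figure 1 table (hypothetical record type)."""
    id: str
    begin: int
    end: int
    category: str
    text: str


def parse_span(span: str) -> Tuple[int, int]:
    """Split a 'begin:end' span into two integer character offsets."""
    begin, end = span.split(":")
    return int(begin), int(end)


def parse_catann(row: str) -> CategoryAnnotation:
    """Parse 'Id span category text'; the text itself may contain spaces."""
    ann_id, span, category, text = row.split(maxsplit=3)
    begin, end = parse_span(span)
    return CategoryAnnotation(ann_id, begin, end, category, text)


# Rows copied from Figure 1.
rows = [
    "T1 193:200 Gene_expression produce",
    "T2 201:223 Protein_family inflammatory cytokines",
    "T3 232:237 Protein IFN-γ",
    "T4 242:245 Protein TNF",
]
catanns = [parse_catann(r) for r in rows]

# Assumption: the end offset is exclusive, so end - begin == len(text).
for ann in catanns:
    assert ann.end - ann.begin == len(ann.text)
```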

Relation annotations (relanns)

A relation annotation consists of a relation type and two objects to be related to each other.

Example of relation annotations [Figure 2]

Id.  subject  relation       object
R1   T2       themeOf        T1
R2   T2       coreferenceOf  T3
R3   T2       coreferenceOf  T4

In the above example,
  • T2('inflammatory cytokines') is related to T1('produce') by the relationship themeOf, and
  • T2 is also related to T3 and T4 by the relationship coreferenceOf.
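
The relation rows of Figure 2 can be read as subject-relation-object triples. The following Python sketch stores them as such and looks up the objects related to a given subject. The Relation type and the objects_of helper are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    """One row of the Figure 2 table (hypothetical record type)."""
    id: str
    subject: str
    relation: str
    object: str


# Rows copied from Figure 2.
relanns = [
    Relation("R1", "T2", "themeOf", "T1"),
    Relation("R2", "T2", "coreferenceOf", "T3"),
    Relation("R3", "T2", "coreferenceOf", "T4"),
]


def objects_of(subject: str, relation: str, relations: List[Relation]) -> List[str]:
    """Return the objects that `subject` is related to by `relation`."""
    return [r.object for r in relations
            if r.subject == subject and r.relation == relation]


print(objects_of("T2", "themeOf", relanns))        # ['T1']
print(objects_of("T2", "coreferenceOf", relanns))  # ['T3', 'T4']
```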

Instance annotations (insanns)

An instance annotation may be used when a text-bound entity needs to be 'instantiated' to different objects.

Example of instance annotations [Figure 3]

Id.  type        object
E1   instanceOf  T1
E2   instanceOf  T1

Id.  subject  relation  object
R4   T3       themeOf   T1
R5   T4       themeOf   T1

This example shows a different way of annotating the same sentence as in Figure 2.
The two annotations differ as follows:
  • in Figure 2, a gene expression event, of which inflammatory cytokines is the theme object, is identified and annotated as such, whereas
  • in Figure 3, two gene expression events are identified and annotated as such:
    • one is related to IFN-γ, and
    • the other is related to TNF.
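
To illustrate how one text-bound entity can be instantiated more than once, the following Python sketch groups the instance rows of Figure 3 by the object they instantiate. The tuple layout and the instances_by_object helper are assumptions made for this example only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Instance rows copied from Figure 3, as (id, type, object) tuples.
insanns = [
    ("E1", "instanceOf", "T1"),
    ("E2", "instanceOf", "T1"),
]


def instances_by_object(rows: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Map each annotated object to the ids of its instances."""
    grouped = defaultdict(list)
    for ann_id, _type, obj in rows:
        grouped[obj].append(ann_id)
    return dict(grouped)


# The single term 'produce' (T1) yields two separate event instances.
print(instances_by_object(insanns))  # {'T1': ['E1', 'E2']}
```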

Another alternative is shown below:

Alternative to instance annotations [Figure 4]

In this example, two category annotations are created for the term 'produce'.

Semantically, the annotations in Figures 3 and 4 are almost the same.
A slight difference is that Figure 3 reflects the natural order of annotation steps (term annotation first, then relation annotation), so the types of the instances (events) depend on the category annotation.

The annotation examples shown in Figures 2, 3, and 4 are all possible, and their semantics are more or less similar. Which one to use is a matter of modeling; none of them is right or wrong.

PubAnnotation supports all the alternatives, leaving the choice to the user.

Note that the BioNLP-ST GE task adopts the modeling in Figure 3.

Modification annotations (modanns)

A modification annotation is used to represent a relation or instantiation that is negated or speculated.

Example of modification annotations [Figure 5]

[Category annotations]
Id.  span       category         text
T25  1793:1798  Protein          Runx3
T66  1806:1815  Gene_expression  expressed
T26  1793:1798  Protein          CD4

[Instance annotations]
Id.  type        object
E1   subClassOf  T66

[Relation annotations]
Id.  type     subject  object
R19  themeOf  T25      E11

[Modification annotations]
Id.  type      object
M3   Negation  E11

In the above example, the event "gene expression of Runx3" is negated in the text. This is represented as a negation (M3) of the instantiation (E1) of the gene expression event (T66), which is related to the protein 'Runx3' (T25) by the relationship 'themeOf' (R19).
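
A modification annotation can be treated as a typed pointer to another annotation. The following Python sketch checks whether a given annotation is negated. The tuple layout and the helper names are hypothetical, and the ids are copied from the modification row of Figure 5 as it appears there.

```python
from typing import List, Tuple

# Modification row copied from Figure 5, as (id, type, object).
modanns = [
    ("M3", "Negation", "E11"),
]


def modifications_of(target_id: str, rows: List[Tuple[str, str, str]]) -> List[str]:
    """Return the modification types attached to `target_id`."""
    return [mod_type for _id, mod_type, obj in rows if obj == target_id]


def is_negated(target_id: str, rows: List[Tuple[str, str, str]]) -> bool:
    """True if any modification of type 'Negation' points at `target_id`."""
    return "Negation" in modifications_of(target_id, rows)


print(is_negated("E11", modanns))  # True
print(is_negated("T66", modanns))  # False
```

Because the lookup depends only on the object id, the same check applies when the modification points at a relation instead of an instance.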

Figure 6 shows an alternative approach:
Example of modification annotations [Figure 6]

Here, the protein 'Runx3' (T25) is directly related to the gene expression event (T66), and the relationship is negated (M3).

Either of the above approaches is possible, although each requires slightly different semantics for its components. Again, PubAnnotation is neutral to the approach, leaving the choice to the user.


ex-catanns.png - Example of category annotations (24 KB) Jin-Dong Kim, 10/26/2012 01:21 AM

ex-relanns.png - Example of relation annotations (38.8 KB) Jin-Dong Kim, 10/26/2012 02:24 AM

ex-insanns.png - Example of instance annotations (52 KB) Jin-Dong Kim, 10/30/2012 05:56 AM

ex-insanns-alt.png - Alternative to instance annotations (46.8 KB) Jin-Dong Kim, 10/30/2012 06:29 AM

ex-modanns.png - Example of modification annotations (30.7 KB) Jin-Dong Kim, 10/30/2012 07:06 AM

ex-modanns-alt.png - Example of modification annotations (25.7 KB) Jin-Dong Kim, 10/30/2012 07:06 AM