BioProp: A Biomedical Proposition Bank

Supported by: Intelligent Agent Systems Lab., Institute of Information Science, Academia Sinica, Taipei, Taiwan


BioProp is a biomedical proposition bank. Like PropBank in the newswire domain, BioProp contains annotations of predicate-argument structures and semantic roles in a treebank schema. To suit the needs of the biomedical domain, we modify the PropBank annotation guidelines and characterize semantic roles as components of biological events. Inter-annotator agreement, measured by the kappa statistic, reaches 95% for the combined decision of role identification and classification when all argument labels are considered.
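As an illustration of the agreement figure above, here is a minimal sketch of a kappa computation. It assumes Cohen's kappa for two annotators; this page does not say which kappa variant was used. The label sequences are made-up examples in PropBank style, with a hypothetical "NONE" label for constituents that are not arguments. They are not BioProp data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label if each
    # annotator labelled independently according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical role decisions for six candidate constituents.
annotator_1 = ["Arg0", "Arg1", "ArgM-LOC", "NONE", "Arg1", "NONE"]
annotator_2 = ["Arg0", "Arg1", "ArgM-LOC", "NONE", "Arg0", "NONE"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Treating "not an argument" as one more label makes each decision cover identification and classification at once, which is one plausible reading of the "combined decision" described above.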

Structures for each predicate (82 predicates are presented):

All Predicates


Chou W-C, Tsai RT-H, Su Y-S, Ku W, Sung T-Y, Hsu W-L: A Semi-Automatic Method for Annotating a Biomedical Proposition Bank. Proceedings of ACL Workshop on Frontiers in Linguistically Annotated Corpora 2006:5-12.

Lai P-T, Dai H-J, Wu JC-Y, Tsai RT-H: A Biomedical Semantic Role Labeling BioC Module for BioCreative IV. Proceedings of the Fourth BioCreative Challenge Evaluation Workshop, vol. 1:54-60.


Download from LDC:

GENIA Treebank Beta
The GENIA Treebank used here is a beta version containing 200+300 abstracts in Penn Treebank (PTB) bracketed format (.tree files). The original download page:
GENIA Treebank used in BioProp
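For readers unfamiliar with PTB bracketed notation, the following sketch reads one bracketed tree into nested tuples and recovers its words. It assumes standard PTB-style brackets only. The exact layout of the GENIA .tree files is not described on this page, and the example sentence is invented for illustration.

```python
def parse_ptb(text):
    """Parse one PTB-style bracketed tree into (label, children) tuples.

    Children are either nested (label, children) tuples or plain word strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        # The label is optional: some treebanks wrap trees in a bare "( ... )".
        label = ""
        if pos < len(tokens) and tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        pos += 1  # consume ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the tree")
    return tree


def leaves(tree):
    """Return the words of a parsed tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


# Invented example sentence, not taken from the GENIA Treebank.
example = "(S (NP (NN IL-2)) (VP (VBZ activates) (NP (NN NF-kappaB))))"
tree = parse_ptb(example)
words = leaves(tree)
print(tree[0], words)
```

A proposition bank attaches its predicate and argument labels to nodes of trees like this one, so being able to address constituents is the first step in reading the annotations.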


Richard Tzong-Han Tsai
Wen-Lian Hsu



Hong-Jie Dai
Po-Ting Lai
Johnny Chi-Yang Wu