data integration

Online Dictionary of common molecular identifiers of the biomedical sciences research infrastructures

Common, unambiguous identifiers for biomolecules such as genes, proteins and bioactive compounds are key to supporting the flow of information from basic science, model organism biology, bioinformatics and structural biology through to translational research and clinical care. The ESFRI BMS project partners have determined an interoperable ‘Dictionary’ of identifier types (Appendix 1) used in this project and in clinical and translational research more broadly. At the request of the scientific advisory board, we have also expanded our WP3.1 activities to include best-practice documentation for identifiers (Appendix 2-1), which is based on our analysis of the identifier landscape (Appendix 2-2). The expanded work on identifiers also includes a shortlist of the most relevant Identifier Resolution and Conversion Tools (Appendix 2-3); these have been registered with the BioMedBridges Tools and Data Services Registry. Further documentation was developed to guide the selection of ontologies (Appendix 2-4) in support of cross-domain data integration. Where no authoritative identifier standard exists, we have worked with the respective community to determine one that supports the activities of WP4 and the BioMedBridges use cases. Relevant identifiers include those for samples (Task 2), small molecules, macromolecular assemblies, genes, proteins, drugs, diseases and phenotypes.
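
To illustrate how a dictionary of identifier types might be used in practice, the following Python sketch checks "prefix:accession" style identifiers against a small lookup table. It is a minimal sketch under stated assumptions: the class `IdentifierType`, the table `DICTIONARY`, the function `parse_compact_id`, the three example entries and their simplified accession patterns are all hypothetical and are not taken from Appendix 1.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierType:
    """One hypothetical dictionary entry: a prefix, the entity it names, and an accession pattern."""
    prefix: str
    entity: str
    pattern: re.Pattern


# Illustrative entries only (assumption): simplified patterns, not the project's actual dictionary.
DICTIONARY = {
    t.prefix: t
    for t in (
        IdentifierType("chebi", "small molecule", re.compile(r"\d+")),
        IdentifierType("pdb", "macromolecular structure", re.compile(r"[0-9][A-Za-z0-9]{3}")),
        IdentifierType("ensembl", "gene", re.compile(r"ENSG\d{11}")),
    )
}


def parse_compact_id(compact_id):
    """Split 'prefix:accession', look up the prefix, and validate the accession.

    Returns (IdentifierType, accession). Raises ValueError for a malformed
    identifier and KeyError for a prefix that is not in the dictionary.
    """
    prefix, sep, accession = compact_id.partition(":")
    if not sep:
        raise ValueError(f"missing ':' separator in {compact_id!r}")
    id_type = DICTIONARY.get(prefix.lower())
    if id_type is None:
        raise KeyError(f"unknown identifier prefix {prefix!r}")
    if not id_type.pattern.fullmatch(accession):
        raise ValueError(f"{accession!r} is not a valid {id_type.prefix} accession")
    return id_type, accession


if __name__ == "__main__":
    id_type, accession = parse_compact_id("chebi:15377")
    print(id_type.entity, accession)
```

Keeping the prefix separate from the accession means that the same bare accession can be told apart across resources, which is the ambiguity a shared dictionary is meant to remove. A real implementation would load its entries from the agreed dictionary rather than hard-coding them.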
