[Research Paper]: Not Ready for Convergence in Data Infrastructures

Authors: Keith Jeffery, Peter Wittenburg, Larry Lannom, George Strawn, Claudia Biniossek, Dirk Betz, and Christophe Blanchi


Much research depends on Information and Communication Technologies (ICT). Researchers in different research domains have set up their own ICT systems (data labs) to support their research, from data collection (observation, experiment, simulation) through analysis (analytics, visualization) to publication. However, too frequently the Digital Objects (DOs) upon which the research results are based are not curated and are thus available neither for reproduction of the research nor for use in other (e.g., multidisciplinary) research. The key to curation is rich metadata that records not only a description of the DO and the conditions of its use but also its provenance: the trail of actions performed on the DO along the research workflow. Real-world demand for multidisciplinary research is increasing. With DOs held in domain-specific ICT systems (silos), commonly with inadequate metadata, such research is hindered. Despite wide agreement on principles for achieving FAIR (findable, accessible, interoperable, and reusable) utilization of research data, current practices fall short. FAIR DOs offer a way forward. This paper examines the paradoxes, barriers, and possible solutions. The key is persuading the researcher to adopt best practices, which implies decreasing the cost (easy-to-use autonomic tools) and increasing the benefit (incentives such as acknowledgment and citation) while maintaining researcher independence and flexibility.

Keywords: Scientific process; Workflow; Metadata; FAIR; Scientific data; Data wrangling
Citation: Jeffery, K., et al.: Not ready for convergence in data infrastructures. Data Intelligence 3(1), 116-135 (2021). doi: 10.1162/dint_a_00084