# 1. ABOUT THE DATASET
------------

Title: University of Reading Open Research Survey 2021 dataset

Creator(s): Daniel Brady [1], Peter Bray [2], Auvikki de Boon [3], Marcello De Maria [3], Kirsty Hodgson [1], Sophie Read [3], and Brendan Williams [1,4].

Organisation(s):
1. School of Psychology and Clinical Language Sciences.
2. School of Archaeology, Geography and Environmental Science.
3. School of Agriculture, Policy and Development.
4. Centre for Integrative Neuroscience and Neurodynamics.

Rights-holder(s): University of Reading, Auvikki de Boon, Kirsty Hodgson, Sophie Read, and Brendan Williams.

Publication Year: 2022

Description: This dataset contains anonymised data collected during the University of Reading Open Research Survey 2021. The project was led by a group of Open Research Champions across multiple departments, with the aim of mapping the current open research landscape of the university. Questionnaire responses were collected from 403 staff and students in the University of Reading community between October and November 2021. The data shared here contain anonymised responses from 390 participants, following cleaning of the dataset to remove duplicates and participants who did not provide consent for data usage and/or sharing. Participants were recruited using departmental mailing lists and through word of mouth. Dissemination across the institution was supported by Open Research Champions within their respective departments. Three 50 GBP prizes were offered to respondents to incentivise participation in the survey. The dataset contains anonymised survey data for individual respondents, a data dictionary for interpreting values in the dataset, a copy of the original survey as implemented in REDCap, and a Jupyter Notebook used to generate the shareable data from our raw dataset.

Cite as: Brady, Daniel, Bray, Peter, de Boon, Auvikki, De Maria, Marcello, Hodgson, Kirsty, Read, Sophie and Williams, Brendan (2022): University of Reading Open Research Survey 2021 Dataset. University of Reading. Dataset. https://doi.org/10.17864/1947.000355.

Contact: b.williams3@reading.ac.uk

# 2. TERMS OF USE
-----------------

Copyright 2022 University of Reading, Auvikki de Boon, Kirsty Hodgson, Sophie Read, and Brendan Williams. All documentation and code are licensed by the rights-holder under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/).

# 3. PROJECT AND FUNDING INFORMATION
------------

Title: Survey on Open Research at the University of Reading.

Dates: April 2021 - April 2023

Funding organisation: Open Research Champions Scheme, University of Reading.

# 4. CONTENTS
------------

## data_cleaning.ipynb:

Interactive Jupyter Notebook used to modify the raw data received from REDCap and to merge in the manually anonymised qualitative data. String data were converted to integer format for data processing, age was binned into groups for anonymisation, length of tenure was removed for anonymisation, duplicate cases were removed (IDs: 44, 116, 149, 272, 355), and participants who did not give consent were also removed. The result was saved as a single csv file (data_share.csv), which is available within this data archive.

## Data_dictionary.xlsx:

Excel spreadsheet that enables matching of data in data_share.csv to the questionnaire presented in OR_Survey_Questionnaire.pdf. This spreadsheet contains five columns. 'Question number' gives the item number for the question used in the questionnaire. 'Raw Data' gives the question as it is presented in the questionnaire. 'Preprocessed data' gives the column name used to record participant responses in data_share.csv. 'Scoring' matches the numeric values used to record participant responses for that item with the available options given in the questionnaire. 'Notes' gives any additional information about an item not otherwise recorded in the dictionary.

## data_share.csv:

Anonymised survey data post-cleaning. Cleaning included removal of duplicate records, anonymisation of qualitative responses, and removal of participants who did not give consent. Further details of data cleaning can be found in the description of data_cleaning.ipynb. A description of the dictionary needed to interpret these data is found under Data_dictionary.xlsx.

## OR_Survey_Questionnaire.pdf:

Open research survey with attribution, license, and description. This also includes the participant information sheet that was given to respondents of the survey. The items included in the survey can be matched to participant data in data_share.csv using the Data_dictionary.xlsx file.

# 5. METHODS
--------------------------

Quantitative data were processed by BW. Information on the processing of raw data can be found in Data_dictionary.xlsx. Qualitative data were anonymised by AdB, KH, and BW, following the recommendations on qualitative data anonymisation made by Braun and Clarke (2013) and Saunders et al. (2015). Full details of the anonymisation protocol can be found below. The script used for data processing was created by BW and can be found in data_cleaning.ipynb. The survey was initially developed jointly by DB, PB, AdB, MDM, KH, SR, ER, and BW.

## Protocol of the Anonymisation of Qualitative Survey Data

### Aim

To adopt a transparent, reproducible and rigorous approach to the screening and anonymisation of the qualitative survey data, in order to protect the identity of participants while preserving the integrity of the responses and the communication of salient themes.

### Methods

This protocol was informed by the recommendations on qualitative data anonymisation made by Braun and Clarke (2013) and Saunders et al. (2015).
Qualitative responses were screened by three researchers (AdB, KH, BW) in order to identify anonymisation criteria specific to the sample. Each identified anonymity concern was then discussed collaboratively by the researchers in order to balance confidentiality with the preservation of content and themes. The anonymity criteria identified, and details of how each was addressed, are as follows:

1. People's names
   In all cases these were substituted with a generic title, e.g., master's student, supervisor, Dr, Professor. No pseudonyms were used.

2. Locations and specialised institutional departments
   References to broad departments and the University of Reading were retained; specialised small sub-departmental structures were removed to preserve the anonymity of the survey respondents.

3. Specific projects and grants
   Named specific projects and grants were either removed or neutralised as 'project' or 'grant'.

4. Specialised occupations
   In some cases, specialised organisational titles were genericised, e.g., 'Organisational Lead for OR', in order to preserve salient meaning while protecting the anonymity of any specific individual.

5. Occupational relationships
   Where possible, the relationships described (e.g., student and supervisor, junior colleague) were preserved, with all identifiable details anonymised.

6. Further identifiable information
   This included specialist research interests or methodologies unique to individual researchers.

Errors of sentence structure and omission errors were not corrected in this process. However, definitions of colloquial terms and abbreviations have been itemised and may be found in the appendices of this protocol.

### Outcome

Three versions of the qualitative survey data have been developed:

1. An original un-anonymised version.
2. An unmarked transcription screened and edited for anonymity that may be published in an open data repository.
3. A code to identify all redacted and amended anonymisation will be created for procedural transparency, thereby providing a marked generic transcription.

All data are stored in password-protected secure electronic files.

### Recommendations

Due to the size of the sample in this study, cumulative effects (i.e., the consideration of identifiable information across all of an individual's responses) were not accounted for, as there are no individual-case analyses planned in this study. However, this protocol recommends further anonymisation if the analysis plan were to change.

### References

Braun, V., & Clarke, V. (2013). Successful Qualitative Research: A Practical Guide for Beginners. Sage Publications.

Saunders, B., Kitzinger, J., & Kitzinger, C. (2015). Anonymising interview data: challenges and compromise in practice. Qualitative Research, 15(5), 616-632. https://doi.org/10.1177/1468794114550439

This protocol is based on https://dx.doi.org/10.1177%2F1468794114550439

### Definitions of colloquial terms

*Please note, colloquial terms that end with a question mark are our best guesses at what the responder meant when using an acronym*

CentAUR - Central Archive at the University of Reading. University of Reading publication archive
CIF - Crystallographic Information File
CINN - Centre for Integrative Neuroscience and Neurodynamics
CORRI - Committee on Open Research and Research Integrity
CO-I - co-investigator
CPD - continuing professional development
CT - computerised tomography
CYLC - https://cylc.github.io/ ?
ECR - early career researcher
EEG - electroencephalography
ELN - electronic lab notebooks
ESDM - exegesis spatial data management
FAIR - FAIR principles. Findable, accessible, interoperable, reusable
FDG - Focus Group Discussion
GDPR - general data protection regulation
GIS - Geographic Information System
MRI - magnetic resonance imaging
MRS - magnetic resonance spectroscopy
NCAS-CMS - National Centre for Atmospheric Science-Computational Modelling Sciences
OA - open access
OR - open research
OSF - Open Science Framework
PCLS/SPCLS - School of Psychology and Clinical Language Sciences
PGR - postgraduate research student
PI - principal investigator
Pre-reg - pre-registration
RCT - randomised controlled trial
REF - research excellence framework
RCUK - UK research councils ?
RDM - Research data management ?
ROSE - In computing, a rose tree is a term for the value of a tree data structure with a variable and unbounded number of branches per node. ?
RRDP - Reading researcher development programme. Training scheme provided by the Graduate School at the University of Reading for doctoral researchers
SEM - structural equation modelling
STS - science and technology studies
UG - Undergraduate
UKCORR - UK Council of Open Research and Repositories
UKRI - UK Research and Innovation
UoR - University of Reading
XIOS - Extensible Markup Language Internet Operating System
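
As an illustration of the cleaning steps described for data_cleaning.ipynb in section 4, the following is a minimal sketch and not the notebook's actual code. Only the duplicate IDs are taken from this README. The column names (`record_id`, `consent`, `age`, `tenure`), the `"Yes"` consent coding, and the age bin boundaries are hypothetical; consult Data_dictionary.xlsx and the notebook itself for the real names and codes.

```python
# Duplicate case IDs listed in this README.
DUPLICATE_IDS = {44, 116, 149, 272, 355}

# Hypothetical age bins: (lower bound, upper bound, integer code).
AGE_BINS = [(18, 24, 1), (25, 34, 2), (35, 44, 3),
            (45, 54, 4), (55, 64, 5), (65, 120, 6)]


def bin_age(age):
    """Return the integer code of the bin containing age, or None if none matches."""
    for lower, upper, code in AGE_BINS:
        if lower <= age <= upper:
            return code
    return None


def clean_records(records):
    """Apply the described cleaning steps to a list of dict rows (illustrative only)."""
    cleaned = []
    for row in records:
        # Skip duplicate cases ("record_id" is an assumed column name).
        if int(row["record_id"]) in DUPLICATE_IDS:
            continue
        # Skip participants who did not give consent (assumed "Yes" coding).
        if row["consent"] != "Yes":
            continue
        out = dict(row)
        # Replace exact age with a binned group and drop length of tenure.
        out["age_group"] = bin_age(int(out.pop("age")))
        out.pop("tenure", None)
        # Recode the string response as an integer.
        out["consent"] = 1
        cleaned.append(out)
    return cleaned
```

Rows read with `csv.DictReader` could be passed straight to `clean_records`, since each row is handled as a plain dictionary of strings.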