REDCap Best Practices
These best practices and research hints, assembled by Drs. Marjorie Bowman and Rose Maxwell, supplement the information found at the official REDCap Project and in the Wright State University REDCap documentation, and provide commonality and consistency in usage at Wright State. Further information may be found on the Wright State University Boonshoft School of Medicine Office of Research Affairs webpage. Please send comments to redcap@wright.edu.
On this page:
- Overview of REDCap Projects
- Development Phase (The ‘Build’)
- Project Naming Conventions with Recommendations for School or Department Abbreviations
- Main Project Settings
- Designing the Data Collection Instruments
- Protected Health Information (PHI) and HIPAA (Health Insurance Portability and Accountability Act)
- Role Assignment and Rights and User Management
- Test and Retest
- Moving from Development to Production Phases in REDCap
- Production Phase
- Archive Phase
- Wright State University REDCap Administrators’ Reserve Rights
- General Research Best Practices and Helpful Hints
Overview of REDCap Projects
For Project Owners, planning the project well can avoid many errors and create wonderful outcomes. Plan. Plan. Plan!
This material is designed to supplement that available from the official REDCap project website. Some of the material is Wright State University specific, and some is to make it easier to find certain information. It is organized according to the phases of a REDCap project, primarily the ‘Development’ phase (often called ‘the Build’, which includes ‘Test and Retest’) and the ‘Production’ phase of research projects. There is also the phase of ‘Design Change’, i.e., requests for changes after Production has commenced. Design changes are actively discouraged: they are not always avoidable, but they can easily lead to data loss. Two additional phases are ‘Inactive’ and ‘Archived’.
Development Phase (‘The Build’)
Project Naming Conventions with Recommendations for School or Department Abbreviations
REDCap assigns a unique system project number behind the scenes and does not require a specific project naming convention. As such, there is no system safeguard to prevent projects from having the same or similar names. You should consider a naming convention that will lead you to a unique name, e.g., PI last name, year project initiated, Sponsor/funder, or short project title name. This simplifies management of projects.
We recommend that the first portion of the project name be a regularly used abbreviation. For the Boonshoft School of Medicine this would be your Department; for other Schools/Colleges, use your School/College abbreviation; and when those don’t apply, such as for a staff member from central administration, use an appropriate abbreviation. The following table contains our recommendations.
As an example, a project name would be “FAMMED 2016 Bowman Survey New Students.”
School or Department or Research Institute | Abbreviation for Project Title |
---|---|
Boonshoft School of Medicine Departments | |
-Biochemistry and Molecular Biology | BMB |
-Population and Public Health Sciences | PHS |
-Emergency Medicine | EMERGMED |
-Family Medicine | FAMMED |
-Internal Medicine | IM |
-Neurology | NEUR |
-Neuroscience, Cell Biology & Physiology | NCBP |
-Obstetrics & Gynecology | OBGYN |
-Orthopaedic Surgery and Sports Medicine | ORTHO |
-Pathology | PATH |
-Pediatrics | PED |
-Pharmacology and Toxicology | PHARM |
-Psychiatry | PSYCH |
-Surgery | SURG |
Business, Raj Soin College of | BUS |
Education and Human Services | EHS |
Liberal Arts | COLA |
Nursing and Health | NURS |
Professional Psychology | PPSYCH |
Science and Mathematics | COSM |
Wright State Research Institute | WSRI |
See also suggestions for naming files and rules for variable names.
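The naming recommendation above can be sketched as a small helper. This is only an illustration: REDCap does not enforce any naming convention, and the function name `build_project_name` and the abbreviation subset below are assumptions made for the example.

```python
# Hypothetical helper; REDCap itself does not enforce a naming convention.
# Only a subset of the recommended abbreviations is listed here.
KNOWN_ABBREVIATIONS = {"FAMMED", "IM", "OBGYN", "COSM", "NURS"}


def build_project_name(abbreviation, year, pi_last_name, short_title):
    """Join the recommended parts into a single project name."""
    abbreviation = abbreviation.strip().upper()
    if abbreviation not in KNOWN_ABBREVIATIONS:
        raise ValueError(f"Unrecognized abbreviation: {abbreviation!r}")
    parts = [abbreviation, str(year), pi_last_name.strip(), short_title.strip()]
    if any(not part for part in parts):
        raise ValueError("Every part of the project name must be non-empty")
    return " ".join(parts)


print(build_project_name("fammed", 2016, "Bowman", "Survey New Students"))
# prints: FAMMED 2016 Bowman Survey New Students
```

Putting the abbreviation first means that projects from the same department sort together in a project list.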
Main Project Settings
- REDCap contains project templates that help with design and background settings. These have been customized for Wright State to include standard user rights and roles. You are strongly encouraged to start with a template that is similar to the design of your project. All aspects of these templates can be changed to fit your needs. See Role Assignments and User Rights section below.
- Build the REDCap project (data entry forms) in such a way that it corresponds to the study design and provides proper data collection tools for all the data necessary for testing the study hypothesis.
- Design the project to collect all data necessary for required outcome analyses.
- Collect all the data necessary for testing the study hypothesis.
- Focus on how you want your data to look at the end of the study so you address all important data fields and design elements at the beginning. Consider creating the tables you hope to have for your final data analyses in draft form to check that you are collecting the data you will need.
- Determine whether you want to do a survey only or a longitudinal project. Unless you are doing a one-time survey, you should enable the longitudinal data collection with repeating forms, even if you do not have repeating forms. You can also run surveys within the longitudinal data collection version, i.e., a mixed use.
Designing the Data Collection Instruments
During this stage all changes to Data Collection Instruments are immediate and auto-saved. REDCap categorizes Data Collection Instruments as “surveys” and “data entry forms”.
Surveys
With “surveys” you can collect data directly from participants. In general, you will be using either 1) a publicly available URL (anonymous with no authentication needed) or 2) a participant email list (not totally anonymous as you have their email address). “Surveys” can be a one-time survey or a series of surveys; to do a series, you would ask the respondent for their email address as a means to send the next survey. A public URL is provided by REDCap and can only be used when the first instrument is a survey.
- You should not use the REDCap ‘Participant Email Contact list’ with group email addresses or distribution lists. The emailed invitations send only 1 unique survey link per email address; therefore, only the first person in the distribution group who clicks on the email link would be able to complete the survey.
- For group distribution lists, you can email the general survey link provided at the top of the “Invite Participants” page directly from your email account.
- Or, you can add each individual email address from a distribution list to the Participant Contact list. You can copy/paste the emails from a list (Word or Excel) into REDCap.
- The advantage of using REDCap’s Participant Contact list and the individual emails is that REDCap will track responders and non-responders for you. You’ll be able to email only non-responders if you want to send a reminder. With the general distribution email, you won’t be able to track responses and participants will have the potential to complete the survey more than once.
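Because each email address receives only one unique survey link, it can help to clean a distribution list before pasting it into the Participant Contact list. The sketch below is a hypothetical pre-processing step done outside REDCap. Treating addresses as case-insensitive is an assumption, and the addresses shown are made up.

```python
def prepare_participant_emails(raw_addresses):
    """Return unique, normalized addresses in first-seen order."""
    seen = set()
    unique = []
    for address in raw_addresses:
        # Assumption: addresses are compared case-insensitively.
        normalized = address.strip().lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique


pasted = [" A.Student@example.edu", "a.student@example.edu", "", "b.student@example.edu"]
print(prepare_participant_emails(pasted))
# prints: ['a.student@example.edu', 'b.student@example.edu']
```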
Data entry forms
With “data entry forms”, data are entered by authorized REDCap project users. REDCap log-in access and project rights are required to view and edit the data entry forms. Most projects will use the data entry version.
- This option can be set up to have both REDCap user data entry on some forms and e-mailed surveys to participants, i.e., a mixed project.
Identifiers
Understand the different types of identifiers: unique identifier, optional secondary unique field, and the redcap_survey_identifier.
- The first variable listed in your project is the unique identifier which links all your data.
- Do not use Protected Health Information (PHI) identifiers such as medical record number, date of birth, or initials as the unique identifier, as it could be accidentally displayed if a URL is created. The medical record number or date of birth, if required, can be added as another field.
- In Data Entry projects, you must define the unique identifier field. For projects where a survey is the first data collection instrument, it is automatically defined as the Participant ID. The Participant ID value is numeric and auto-increments starting with the highest value in the project. If no records exist, it will begin with ‘1’.
- Users can define the unique ID for projects with surveys instead of using the participant_id by having the first data collection instrument as a data entry form (do NOT enable it as a survey).
- The optional secondary unique field may be defined as any field on the data collection instruments. The value for the field you specify will be displayed next to the Participant ID (for surveys) or next to your unique identifier when choosing an existing record/response. It will also appear at the top of the data entry page when viewing a record/response. Unlike the value of the primary unique identifier field, it will not be potentially visible in a URL.
- The data values entered into the secondary unique field must also be unique. The system will not allow for duplicate entries and checks values entered in real time. If a duplicate value is entered, an error message will appear and the value must be changed to save/submit data entered on the data entry instrument.
- Common secondary unique identifiers are medical record numbers, subject name, and subject birthdate.
- The redcap_survey_identifier is the identifier defined for surveys when utilizing the Participant Email Contact List and sending survey invitations from the system. The “Participant Identifier” is an optional field you can use to identify individual survey responses so that the participant doesn’t have to enter any identifying information into the actual survey. This field is exported in the data set; the email address of the participant is not.
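The auto-increment and uniqueness rules described above can be illustrated with a short sketch. It models the behavior as this page describes it and is not REDCap's actual code. The function names are hypothetical.

```python
def next_participant_id(existing_ids):
    """Return the highest existing ID plus one, or 1 when no records exist."""
    return max(existing_ids, default=0) + 1


def find_duplicates(values):
    """Return values that occur more than once, in the order first repeated."""
    seen = set()
    duplicates = []
    for value in values:
        if value in seen and value not in duplicates:
            duplicates.append(value)
        seen.add(value)
    return duplicates


print(next_participant_id([]))         # prints: 1
print(next_participant_id([1, 2, 7]))  # prints: 8
print(find_duplicates(["A1", "B2", "A1", "A1"]))  # prints: ['A1']
```

A check like `find_duplicates` can be run on a spreadsheet column before import, to catch values that the secondary unique field would reject.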
Other Design Considerations
Other things to consider.
- The automatic survey feature requires respondents to answer every question (forced choice radio button or drop-down, etc.) in order to move to the next question. Consider having a choice be “I prefer not to answer” or “Other”. If desirable, a branched logic free text response can be linked to collect more specific information from the respondent on the reason for the answer, “Other”.
- Collect only the minimally necessary set of PHI/Level 3 data (protected health information), in addition to those required by study design or operational requirements, to positively identify study subjects during the data entry phase.
- Mark all PHI/Level 3 data fields as “Identifiers = Yes” (in the development phase, this is on the right side of the screen). See Protected Health Information and HIPAA section below.
- Note that data dictionary uploads are only available in the Development Phase.
- Be sure your variable names and field types are correct before requesting to move to production phase. Later changes in a variable name, field type, or field label can cause data loss (see Data Impact Table). Similarly, adding or modifying branching logic later can cause data loss.
Protected Health Information and HIPAA
Protected Health Information (PHI), also referred to as identifiers, must be treated with care. The PHI identifiers in REDCap include:
- Name
- Fax number
- Phone number
- E-mail address
- Account numbers
- Social Security number
- Medical Record number
- Health Plan number
- Certificate/license numbers
- URL
- IP address
- Vehicle identifiers
- Device ID
- Biometric ID
- Full face/identifying photo
- Other unique identifying number, characteristic, or code
- Postal address (geographic subdivisions smaller than state)
- Date precision beyond year
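As a rough aid for marking fields like those listed above as identifiers, a downloaded data dictionary can be scanned for field names that look like PHI but are not flagged. This is a minimal sketch under stated assumptions: the column headers `Variable / Field Name` and `Identifier?`, the flag value `y`, and the list of name hints are all assumptions to check against your own data dictionary. A name-based scan cannot replace a manual review.

```python
import csv
import io

# Assumed column headers; check them against your own downloaded data dictionary.
FIELD_COLUMN = "Variable / Field Name"
IDENTIFIER_COLUMN = "Identifier?"

# Hypothetical name fragments that often signal PHI; extend for your project.
PHI_HINTS = ("name", "phone", "email", "mrn", "ssn", "dob", "address")


def unflagged_phi_candidates(dictionary_csv):
    """List fields whose names hint at PHI but are not marked as identifiers."""
    suspects = []
    for row in csv.DictReader(io.StringIO(dictionary_csv)):
        field = row[FIELD_COLUMN].strip().lower()
        flagged = (row.get(IDENTIFIER_COLUMN) or "").strip().lower() == "y"
        if not flagged and any(hint in field for hint in PHI_HINTS):
            suspects.append(field)
    return suspects


sample = (
    "Variable / Field Name,Identifier?\n"
    "record_id,\n"
    "demo_last_name,y\n"
    "demo_phone,\n"
    "fu4_smoking,\n"
)
print(unflagged_phi_candidates(sample))  # prints: ['demo_phone']
```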
Here are some Wright State HIPAA related information and training resources:
- CaTS Information Technology Security Training
- CaTS Information Technology Security Policies
- Research and Sponsored Programs Human Subjects Compliance
Other specific or combined information can be the equivalent of identifiers. For example, “President of (specific) University” would readily identify a specific individual. Or a combination of ‘indirect’ identifiers can be the equivalent of an identifier. ‘Indirect’ identifiers include items such as place of treatment or doctor’s name, gender, rare disease or treatment, sensitive data such as illicit drug use, place of birth, workplace, occupation, annual income, education, household or family composition, ethnicity, birth year or age, or verbatim responses or transcripts.
Role Assignments and Rights and User Management
Project Owners assign user rights for data input and access, including the rights of the required Secondary Owner who, in turn, may be given rights to assign other users rights. The user(s) will need to obtain their REDCap username and password, then the Project Owner will need to enter their username(s) in the project to grant them access.
The Project Owner also must:
- Identify the project level rights for individual users.
- Give the minimum amount of access needed for users (including the Secondary Owner) to perform their duties. The assigned rights should be reviewed regularly if the project changes significantly.
- Assign Full Data Export rights for projects with Protected Health Information (PHI) only to those who require them. Frequently only the Owner and Secondary Owner would require full PHI access.
- Ensure all users continue to have the appropriate level of training and certifications, such as the CITI/HIPAA training.
- Ensure all users are listed on the IRB forms (if applicable).
- Use the role title section to clearly delineate the user roles. Consider common types of user rights:
- Project Owner (required)
- Secondary Owner (required)
- Quality Reviewer [options: 1) no editing, can only view; and 2) can use the Data Comparison Tool when in the double data entry mode, then merge data],
- Data entry (no data quality oversight, but can edit records),
- Research Coordinator (data entry plus sometimes can export data),
- Statistician [can be set up to only export data if de-identified, which helps prevent HIPAA data violations].
- Note: Statisticians or Data Analysts are needed for Projects that require randomization or more than simple statistics and graphs. When using the REDCap randomization module, statistical support is strongly encouraged because it requires creating and maintaining allocation tables. Project Owners are also strongly urged to include a data analyst, statistician or similar expert on projects in REDCap that require analyses more complicated than simple graphs and data counts. This role could be assumed by the Project Owner, a Secondary Owner or be designated in a Team Member/Statistician role. While REDCap provides simple statistics and graphs, more complex analyses can be performed after data exports to common statistical packages.
- Ensure the Principal Investigator (PI) of the project (if not the Project Owner) is on the project with full user rights. The PI is often the Secondary Owner.
- Set a user expiration date (don’t delete) for all users who have left the project, such as those who remain in their status at WSU but no longer participate in the project, or when they are not performing adequately or have violated rules such as HIPAA. Those who leave the university are automatically expired. Do not delete users, as this can remove them from any other concurrent REDCap projects and risk data loss.
- Note: form restrictions are essentially another layer of user rights; an individual user can have rights, e.g., view & edit, read-only, no access, and edit responses.
Test and Retest the Project; repeat: TEST and RETEST!!
- Test the project prior to requesting the project be moved to Production Mode. The testing should include data entry, review of project unique identifier, data export formats, etc., to ensure the project design is suitable and appropriate. For surveys, pilot the instrument and test multiple iterations of potential answers.
- Save a copy of the forms or data dictionary before changing any items in the test phase.
- Test the project with sham (fake) data, in all instruments and events to validate instruments and event definitions, branching logic, calculated fields, and minimum/maximum ranges. Entering and saving test data is the only way to test that the branching logic and calculated fields are properly working. Do not enter real data into the development phase of the project.
- Review test data: open data entry forms, create reports, export data and send to the Secondary Owner, co-investigators, and any data analyst or statistician to review.
- It is important to think through the planned statistical analysis before collecting any data. A statistician can make sure you are collecting the fields you need, in the format you need them, in order to perform the needed statistical analyses. For projects with a large amount of data and many forms, consider having the data dictionary of the database reviewed by the Statistician. The data dictionary is easy to download. This will clearly communicate any defined branching logic that is not communicated in the raw data file or meta-data formatting accessible through the “Data Export” application. This will also clearly communicate the formulas from calculated fields. Also, send the blank case report form or other data collection tools. Have the Statistician perform a data export to ensure it does not extract identifiers. The statistician can give you feedback regarding the overall design of your database, as well as the definition of each field.
- Delete sham data after completion of the test phase! You will be automatically prompted to delete data from the draft/build phase, and you should do so.
- After adequate testing, and before submitting, download a codebook and PDFs of forms.
- After adequate testing and downloading the codebook and PDFs of forms, submit the project to the REDCap Administrators to approve a move to Production Phase. Ideally, this should be done after the IRB has approved the data entry forms when IRB review is applicable.
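When testing minimum/maximum ranges with sham data, it helps to enter values on and just outside each boundary. The sketch below generates such values for a numeric field. The function name, the `step` parameter, and the age range in the usage line are hypothetical.

```python
def boundary_test_values(minimum, maximum, step=1):
    """Return sham values just outside, on, and inside a min/max range."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return {
        "below_min": minimum - step,
        "at_min": minimum,
        "inside": (minimum + maximum) / 2,
        "at_max": maximum,
        "above_max": maximum + step,
    }


# Hypothetical age field validated between 18 and 90.
for label, value in boundary_test_values(18, 90).items():
    print(label, value)
```

Entering each value by hand in the development project shows whether the range warning appears where you expect it.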
Moving from Development to Production Phases in REDCap:
Reminder: Only WSU REDCap Administrators can move a project from Development to Production phase. The role of Wright State REDCap Administrators is not to ensure that you have designed your project correctly — they will only do cursory reviews for obvious errors when providing the required approval to move from development to production phase. You should not expect nor count on administrators detecting a design flaw or missing data points in your data fields.
Production Phase
General Reminders in Production Phase
- Keep at least one printed blank copy of all forms used in the study.
- Regularly review the internal validation checks inside the project.
- REDCap has no autosave feature during the data collection phase. Save early and often. Also, there is no cancel button, but editing and rewriting is easy before the final save of a record or hitting the end of survey button.
Design Changes after Production Phase
Design changes after the Production phase are discouraged. You can lose data.
Once the project has gone into Production Phase, changes are generally not required and are actively discouraged. Checking and testing the changes is harder than in the Build phase, and changes can result in data loss.
- Design changes must be re-submitted through the WSU REDCap administrators and will not be visible or take effect until the WSU REDCap administrators approve the changes.
- Request the minimum number of design changes possible, and do so in a manner that will not destroy prior collected data, whether or not you believe you will use that prior collected data. The data dictionary will not upload when you change the position of the field, edit the label, edit the variable name, or edit SQL code.
- Always make a copy of the codebook before making any significant production changes.
- Caution: the data import tool will override the existing data.
- Changes to any variables may affect programmed calculations and/or branching logic. It is the responsibility of the requester to review and test all calculated fields and branching logic prior to submitting changes.
Metadata | Change Type | Data Impact | Further Explanation |
---|---|---|---|
Variable / Field Name | Add new | No Data Loss | The new field will be added to all records. |
Variable / Field Name | Delete | Data Loss | This deletes the field and all the data entered for that field. |
Variable / Field Name | Rename | Data Loss | This is equivalent to deleting a variable and adding a new variable, thus the data is deleted. |
Form Name | Add new form | No Data Loss | New form/fields will be added to all records. |
Form Name | Change via data dictionary upload | Data Loss | Form completeness data will be lost. Form names can be changed within the Online Designer without data loss. |
Form Name vs. Form Label | Rename form | Possible Data Loss | Recommend renaming the Form Label rather than the Form Name. The data dictionary renames the form. All form status values (unverified, complete) for ALL records will then be reset to “Incomplete”. The Online Form Editor will NOT change the form name and only renames the form label, preserving the form status for all records. Note: The form name is the “back end” name (data dictionary, ex: “baseline_data”) and does not appear on screen. The form label is the “front end” name and displays on screen (ex: “Baseline Data”). |
Field Units | Add, Modify, Delete | No Data Loss | This would merely change the field unit label. |
Section Header | Add, Modify, Delete | No Data Loss | No data loss as it is descriptive text. |
Field Type | Modify | Possible Data Loss | Depending on the change, data can be lost. Examples of changes that can be made without data loss: Examples of changes that can be made that cause data loss: |
Field Label | Modify | Possible Data Confusion | Changes to question caption may change the meaning of data previously entered. Simple spelling corrections or format changes are not problems. |
Response Choices | Add | Possible Data Impact and Confusion | If the choice is added at the end of a list, no data is lost, but there could be confusion in the final data analyses. The new choice will be added to all records. If the choice is added in the middle, data will be confused. |
Response Choices | Delete | Data Loss | Deletes the choice and ALL data entered as that choice. |
Response Choices | Recode | Possible Label Mismatch | Codes are not automatically re-mapped to new codes. Data entered remains the same in the database. Relabeling codes may change the meaning of data entered. |
Calculations | Add, Modify, Delete | Data Confusion | Forms with saved calculated field values will not automatically recalculate when changes are committed. Values should be derived and confirmed in analysis. All forms with values should be resaved to update stored values. |
Slider Labels | Add, Modify, Delete | Possible Data Confusion | If the changes impact how respondents answer the question, former data may not be consistent with data after the change. |
Field Note | Add, Modify, Delete | No Data Loss | No direct data impact as it is descriptive text. |
Text Validation Type | Add, Modify | Possible Data Loss | Data entered as free text or other type of validation text may no longer be valid. |
Text Validation Type | Delete | Possible Data Confusion | Field becomes open text field. |
Show Slider Number | Add, Delete | Possible Data Confusion | If the changes impact how respondents answer the questions, former data may not be consistent with data after the change. |
Text Validation Minimum | Add, Modify, Delete | No Data Loss | No data impact since out of range data can still be saved. |
Text Validation Maximum | Add, Modify, Delete | No Data Loss | No data impact since out of range data can still be saved. |
Identifier | Add, Delete | No Data Loss | No direct data impact. |
Branching Logic | Add, Modify | Data Loss | Fields that will be hidden due to updated logic but that already contain data will prompt data erasure. |
Branching Logic | Delete | No Data Loss | No direct data impact, but may impact missing data analyses. Fields will remain visible. |
Required Field | Add, Delete | No Data Loss | Data can still be saved without completion of required fields. |
Custom Alignment | Modify | No Data Loss | Display only. |
Question Number (surveys only) | Modify | No Data Loss | Display only. |
Matrix Group Name | Add, Modify, Delete | No Data Loss | Display only. |
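A few rows of the table above can be turned into a pre-submission check that compares the saved copy of the field list with the proposed one. The sketch below covers only added fields, deleted fields, and field type changes. The mapping format (field name to field type) and the function name are assumptions for illustration. It is not a REDCap feature.

```python
def classify_changes(old_fields, new_fields):
    """Compare two {field_name: field_type} mappings and label each change."""
    findings = []
    for name in sorted(old_fields.keys() - new_fields.keys()):
        findings.append((name, "Delete", "Data Loss"))
    for name in sorted(new_fields.keys() - old_fields.keys()):
        findings.append((name, "Add new", "No Data Loss"))
    for name in sorted(old_fields.keys() & new_fields.keys()):
        if old_fields[name] != new_fields[name]:
            findings.append((name, "Field Type modified", "Possible Data Loss"))
    return findings


old = {"record_id": "text", "fu4_smoking": "radio", "bl_weight": "text"}
new = {"record_id": "text", "fu4_smoker": "radio", "bl_weight": "dropdown"}
for finding in classify_changes(old, new):
    print(finding)
```

The renamed field shows up as one deletion plus one addition, which matches the table's note that a rename is equivalent to deleting a variable and adding a new one.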
Archive Phase
Project Owners will:
- Lock all data.
- Disable (set user expiration date) all users from the project except the Principal Investigator, Project Owner (if different person), and the Secondary Owner. Do not delete users.
Wright State University REDCap Administrators’ Reserve Rights
The Wright State REDCap administrators reserve the following rights related to Project Management:
- Record and track REDCap project databases, including the name of the PI, the date of project creation, and date of project move to production.
- Promptly disable user access for persons and entities that no longer need access to REDCap.
- Review data fields with Level 3 information and assign protections to them by indicating “Identifiers=Yes” when moving the project to production.
- Delete data from the development phase at any time.
- See also: Rights for Administrators Versus Project Owners
General Research Data Management Best Practices and Helpful Hints for Wright State University REDCap USERS
File names should be consistent and descriptive. Include project or experiment or acronym name recognizable to users. Consider adding researcher initials or type of data. Include a version number, in sequential fashion. Consider using status (draft, final). A good format for dates is YYYYMMDD or YYMMDD. Special characters should be avoided. Do not use spaces, as they are not recognized. You can use underscores, dashes, or camel case (ex: FileName). Start with general and work to the specific in order of importance.
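One way to apply these file naming suggestions consistently is to build names with a small helper. This is a sketch only: the order of the parts, the underscore separator, and the function name are illustrative choices, not a Wright State requirement.

```python
import re
from datetime import date


def build_file_name(project, data_type, initials, when, version, status, extension):
    """Assemble parts, general to specific, without spaces or special characters."""
    parts = [project, data_type, initials, when.strftime("%Y%m%d"), f"v{version:02d}", status]
    # Keep only letters, digits, and dashes inside each part.
    cleaned = [re.sub(r"[^A-Za-z0-9-]", "", part) for part in parts]
    if any(not part for part in cleaned):
        raise ValueError("Every part of the file name must contain letters or digits")
    return "_".join(cleaned) + "." + extension.lstrip(".")


print(build_file_name("FAMMED2016", "SurveyExport", "MB", date(2016, 9, 1), 3, "draft", "csv"))
# prints: FAMMED2016_SurveyExport_MB_20160901_v03_draft.csv
```

Zero-padded version numbers and YYYYMMDD dates keep files in chronological order when sorted by name.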
The variable field name must be unique across the entire project, and will be the field name used for branching or piping. Except for the record_id field, begin every field name with a three- or four-letter prefix that represents the form name, followed by an underscore. For example, a question on smoking in follow-up form #4 could be: fu4_smoking. The variable name should start with a letter; the remaining characters may be any letter, digit, a period or the symbols #, @, _, or $. Do not end with a period. No more than 64 characters are permissible. No duplicate names are acceptable. Certain words cannot be used for variable names, specifically: ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO and WITH. REDCap has an auto-naming function for variable names that names variables according to the question title you enter. Sometimes this is helpful, but other times it is not, as some of the names it creates may not be intuitive or easily distinguishable from other variable names.
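The variable naming rules in the paragraph above can be checked mechanically before a data dictionary is uploaded. The sketch below encodes only the rules as stated on this page. REDCap's own designer may apply stricter limits. Matching reserved words case-insensitively is an assumption, and the function name is hypothetical.

```python
import re

RESERVED_WORDS = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT", "NE", "NOT", "OR", "TO", "WITH"}
# A letter first, then letters, digits, periods, or the symbols # @ _ $.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9.#@_$]*")


def variable_name_problems(name, existing_names=()):
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("must start with a letter and use only letters, digits, or . # @ _ $")
    if name.endswith("."):
        problems.append("must not end with a period")
    if len(name) > 64:
        problems.append("must not exceed 64 characters")
    if name.upper() in RESERVED_WORDS:  # assumption: case-insensitive match
        problems.append("is a reserved word")
    if name in existing_names:
        problems.append("duplicates an existing name")
    return problems


print(variable_name_problems("fu4_smoking"))  # prints: []
print(variable_name_problems("not"))          # prints: ['is a reserved word']
```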
Ontology / metadata:
Researchers should use classic, recognized ontology, and create metadata for each project, experiment or analysis. Ontology refers to an organizational system designed to categorize information and informational relationships. A simplistic example would be categorizing something as a bacterium or a virus. Metadata is data that describe other data, such as ‘3 projects with 3 arms each and 800 total variables’. Frequently, using ontology and metadata is not difficult and is intuitive. However, some projects require higher level support for the metadata. Increasing needs for data sharing and big data analysis require recognized ontologies. Listed here are several support tools:
Ontology/terminology: Consider using the BioPortal to use internationally recognized ontology and terminology: http://bioportal.bioontology.org/ , under UMLS License (https://uts.nlm.nih.gov/license.html ). This includes SNOMED Clinical Terms, RxNorm, Medical Dictionary for Regulatory Activities (MedDRA), National Drug Data File (NDDF), Current Procedural Terminology (CPT), Medical Subject Headings (MeSH), Radiology Lexicon (RadLex), National Cancer Institute Thesaurus (NCIT), Symptom Ontology (SYMP), and much more.
Metadata Support Tools: Create metadata for experiments and analyses that are not just done for one-time consideration. Again, this can be intuitive, but for larger or complicated projects downloaded from REDCap into Excel, one metadata tool, RightField, can be particularly useful. Examples of metadata tools (all of these can be used on Windows, Mac, and Linux with some limitations):
RightField - this uses info from the BioPortal. It is an open source tool for adding ontology term selection to Excel spreadsheets, i.e., an Excel plug-in. “RightField is used by a ‘Template Creator’ to create semantically aware Excel spreadsheet templates. The Excel templates are then reused by scientists to collect and annotate their data without any need to understand, or even be aware of, RightField or the ontologies used.”
Annotare from the European Bioinformatics Institute - a “tool for annotating biomedical investigations and resulting data. It is intended to help a bench biologist construct a MIAME-compliant file based on the MAGE-TAB format.”
ISA Creator, for lab scientists to record experimental information and meet annotation requirements. Metadata schema is ISA-tab. Good for Life Sciences.
OMERO - “Handles all microscopy images in a secure central repository”.
OntoMaton - “facilitates ontology search and tagging functionalities within Google Spreadsheets.” It has been developed by the ISA Team at the University of Oxford’s e-Research Centre. Ontology searching and automated tagging from the NCBO Bioportal. Part of ISA-Tools Suite. Annotations are generated within the tabular data file.
General Spreadsheet Hints:
- Use rows, not columns for dependent variables (observations). There are more available rows than columns and most programs process row by row. REDCap basically forces this when using the REDCap tools to develop your project. If there are too many columns, data downloads will not work correctly.
- Do not put in unnecessary blank rows – every heading, subtotal or empty column, row or cell in the data makes it far less useable for data analysis.
- Do not use formatting to convey information. Formatting should occur later in the project when showing results to others, not as part of the internal use of spreadsheets to manage data. Fancy colors and formatting should only be used for data presentation, not data storage or manipulation.
- Do not place comments in cells.
- Do not put more than one piece of information in a cell.
- Use pivot tables to manipulate the raw data in Excel cells (go to Insert: Pivot Table); then protect cells with formulas in them to prevent accidental overwriting and separate the data from the analysis.
- Avoid numbers in a formula unless they are fixed (such as 60 for turning hours into minutes) – put variables in cells and refer to the cell in the formula or give a name to the cell (click on top left corner or cell and type name desired).
- Generally, use cell references rather than cut and paste.
In addition to the many REDCap-specific resources, here are some online resources for researchers on how to use software:
- www.software-carpentry.org/ Non-profit organization whose members teach researchers basic software skills (command line, Python, version control, etc.). Includes open access teaching materials.
- www.datacarpentry.org/ offshoot of software-carpentry.org designed to teach basic concepts, skills and tools for working more effectively with data.