When searching for data created by others in repositories or databases, ask yourself the following questions:
US National Library of Medicine: A Framework for Data Quality Assessment in Clinical Research Datasets
Assessing how a dataset was created is necessary to ensure that you understand the data, to support the quality of your analysis, and ultimately to verify the quality of your research. You can use the factors in the table below to assess and describe the quality of your own data or of data from an external source.
|Completeness||The proportion of stored data against the potential of "100% complete"||Percentage of patient records that have all minimum and core data elements populated with non-blank values.|
|Uniqueness||Nothing will be recorded more than once, based on how that thing is identified||Percentage of unique (vs. duplicate) records within a data set.|
|Timeliness||The degree to which data represent reality from the required point in time||Time difference between the event and the information about this event being recorded.|
|Validity||The degree to which data conform to the syntax (format, type, range) of their definition||Date and time values should be formatted according to the parameters of the data system or the standards of the project.|
|Accuracy||The degree to which data correctly describe the "real world" object or event being described||Assess the data against the actual thing they represent, e.g. visit the hospital and determine how birth and screening data are collected and entered into the system, or assess the data against an authoritative reference data set.|
|Consistency||The absence of difference, when comparing two or more representations of a thing against a definition||Data in a given field should be collected or calculated in the same way across all records.|
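Several of the dimensions in the table can be estimated directly from the records themselves. The following is a minimal sketch in Python, not a definitive implementation: the sample records, the field names (`patient_id`, `birth_date`, `event_time`, `recorded_time`), the choice of required fields, and the expected date format are all hypothetical and would need to be replaced with the definitions used in your own project. Accuracy and consistency are left out because they require an external reference or knowledge of how the data were collected.

```python
from datetime import datetime

# Hypothetical patient records; every field name and value is invented for illustration.
records = [
    {"patient_id": "P001", "birth_date": "2021-03-14",
     "event_time": "2021-03-14T08:00", "recorded_time": "2021-03-14T09:30"},
    {"patient_id": "P002", "birth_date": "14/03/2021",  # wrong date format
     "event_time": "2021-03-15T10:00", "recorded_time": "2021-03-17T10:00"},
    {"patient_id": "P002", "birth_date": "",            # duplicate ID, blank value
     "event_time": "2021-03-16T12:00", "recorded_time": "2021-03-16T12:00"},
    {"patient_id": "P003", "birth_date": "2021-03-18",
     "event_time": "2021-03-18T07:00", "recorded_time": "2021-03-18T13:00"},
]

# Assumption: these are the "minimum and core data elements" for this project.
REQUIRED_FIELDS = ("patient_id", "birth_date")


def is_populated(value):
    """Return True if the value is neither missing nor blank."""
    return value is not None and str(value).strip() != ""


def completeness(rows, required_fields):
    """Percentage of records with every required field populated."""
    complete = sum(
        all(is_populated(row.get(field)) for field in required_fields)
        for row in rows
    )
    return 100 * complete / len(rows)


def uniqueness(rows, key):
    """Percentage of distinct identifier values relative to the record count."""
    distinct = {row[key] for row in rows}
    return 100 * len(distinct) / len(rows)


def validity(rows, field, date_format):
    """Percentage of populated values that parse with the expected date format."""
    values = [row[field] for row in rows if is_populated(row.get(field))]
    valid = 0
    for value in values:
        try:
            datetime.strptime(value, date_format)
            valid += 1
        except ValueError:
            pass  # value does not match the expected format
    return 100 * valid / len(values)


def recording_lag_hours(rows):
    """Timeliness: hours between each event and the moment it was recorded."""
    lags = []
    for row in rows:
        event = datetime.fromisoformat(row["event_time"])
        recorded = datetime.fromisoformat(row["recorded_time"])
        lags.append((recorded - event).total_seconds() / 3600)
    return lags


print("Completeness (%):", completeness(records, REQUIRED_FIELDS))
print("Uniqueness (%):", uniqueness(records, "patient_id"))
print("Validity (%):", round(validity(records, "birth_date", "%Y-%m-%d"), 1))
print("Recording lag (hours):", recording_lag_hours(records))
```

In this sketch, blank values are counted against completeness only and are excluded from the validity check, so a single missing value is not penalized twice. Whether that is appropriate depends on how your project defines each dimension.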