Validity & Reliability in Secondary Data Analysis
Renowned methodologist Earl Babbie (2011) states that assessing validity in the analysis of secondary datasets should take into consideration both logical reasoning and replication. Additionally, Schutt (2014) states that validity exists when our conclusions are consistent with empirical reality. That is, the information contained in any dataset should be consistent with reality, and other researchers who attempt to replicate the data or the analysis should achieve a similar (or the same) outcome.
Increasingly, consumers of existing datasets have to be concerned about the manipulation of large datasets for malicious purposes. The Bureau of Labor Statistics is responsible for preparing the federal employment and unemployment rates, and it uses stringent data security protocols to ensure the integrity of the information. Its aim is to collect and distribute data that reflect the reality in the marketplace, and to reassure the public that the data are unaltered. Below are the six protocols employed by the Bureau.
Reliability in data management also helps to ensure the integrity of the data. “Reliability means that a measurement procedure yields consistent scores when the phenomenon being measured is not changing (or that the measured scores change in direct correspondence to actual changes in the phenomenon)” (Schutt, 2014). Reliability is a necessary prerequisite for validity, which was discussed above. For example, if the measure used to ascertain unemployment rates in Virginia (the Virginia Employment Commission's or the unemployment office's counts of Americans needing insurance benefits) yields wildly inconsistent results across four months while actual conditions have not changed, then the integrity of the data cannot be guaranteed.
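The four-month consistency idea can be illustrated with a minimal Python sketch. It is not a method used by any agency. It assumes the underlying unemployment rate is roughly stable over the period, so that a large spread in the reported figures points to an unreliable measurement rather than a real change. The function names, the 10% threshold, and the monthly figures are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return the population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(values) / m


def is_consistent(values, max_cv=0.10):
    """Treat a series as consistent if its relative spread is at most max_cv.

    max_cv=0.10 is an arbitrary illustrative threshold, not a standard.
    """
    return coefficient_of_variation(values) <= max_cv


# Hypothetical monthly unemployment rates (percent), invented for this example.
stable_series = [3.0, 3.1, 2.9, 3.0]
erratic_series = [3.0, 7.5, 1.2, 9.8]

for name, series in [("stable", stable_series), ("erratic", erratic_series)]:
    verdict = "consistent" if is_consistent(series) else "inconsistent"
    print(f"{name}: {verdict}")
```

A check like this only flags a series for closer review. If the phenomenon itself changed during the four months, a reliable measure would be expected to change with it, so a large spread alone does not prove the measure is unreliable.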
The two concepts (validity and reliability) are therefore appropriate to ground this data integrity module. Social scientists use both as hallmarks in their management of quantitative datasets, and cybersecurity specialists are equally concerned about the integrity of the data that consumers will access.
Video 1 – Data Accuracy at the US Census Bureau