| dc.description.abstract | This study presents the design, development, and validation of a contextual data cleaning
framework tailored for clinical research settings in low-resource environments, using the DPSP
(Dihydroartemisinin-Piperaquine and Sulfadoxine-Pyrimethamine) trial at Masafu Hospital as a
case study. The research was motivated by persistent data quality challenges—such as missing
values, inconsistencies, human errors, and tool limitations—that often compromise the validity
and reliability of clinical research outcomes. Employing a user-intervention methodology, the
study integrated qualitative insights from data managers, clinical teams, and analysts with
quantitative assessment techniques to ensure that the proposed framework aligns with real-world
practices. The framework was structured into distinct phases, including data profiling,
preprocessing, modular cleaning, enhancement, and quality scoring, each mapped to
specific data integrity issues. Validation on the DPSP dataset demonstrated a significant
improvement in data accuracy (from 75% to 94%), completeness (from 68% to 90%), and
consistency (from 70% to 93%), confirming the framework’s effectiveness and usability.
SQL-driven automation further improved scalability and reduced human error. The study contributes to
the literature by offering a novel, context-sensitive approach that balances domain expertise with
technical rigor. It recommends future work to expand the framework’s applicability to unstructured
data types and to assess its operational integration and cost-effectiveness. Overall, the framework
serves as a practical tool for improving data quality in clinical trials and enhancing the credibility
of health research in resource-constrained settings. | en_US |