ApicomPro
Document De-Identification and Processing Solutions
At ApicomPro, we understand the critical importance of patient confidentiality in the healthcare industry. Our mission is to provide reliable and secure de-identification services for medical documents, ensuring compliance with privacy regulations and safeguarding sensitive patient information.
Contact Information
Warsaw, Warsaw 05270 Poland
DICOM De-identification
Social Startups Healthcare Software & Technology
Timeline: 0 weeks
Amount: 10,001 to 50,000
We completed a project to de-identify a dataset of DICOM files.

Dataset Details:
- Number of documents: approximately 100,000
- Number of frames per document: 1 to 5
- Color schemes: MONOCHROME2, YBR_FULL_422
- PHI present in both the Pixel Data and the metadata
- Annotations stored in the Overlay data

What Was Done:
- Set up a single-node Spark cluster on an EC2 c6a.16xlarge instance (64 CPU cores, 128 GB RAM).
- Developed a Spark Structured Streaming pipeline for de-identification using our framework.
- Processed the dataset within 24 hours (approximately 1 second per file).
- De-identified text within the Pixel Data.
- De-identified the metadata.
- De-identified annotations within the Overlay data.
- Read the dataset from S3 and stored the results back to S3, maintaining the original folder structure.
- Manually reviewed each document using an annotation tool.
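The project description does not say how the original folder structure was kept when writing results back to S3. As an illustration only, one common approach is to strip the input prefix from each object key and re-attach the remainder under the output prefix. The sketch below assumes that approach; the function name `output_key` and the prefixes `raw` and `deid` are hypothetical and do not come from the project.

```python
import posixpath


def output_key(input_key: str, input_prefix: str, output_prefix: str) -> str:
    """Map an input object key to an output key with the same relative path.

    Hypothetical helper: all names here are illustrative assumptions,
    not part of the ApicomPro framework.
    """
    in_prefix = input_prefix.strip("/")
    key = input_key.lstrip("/")

    # Refuse keys that live outside the input prefix instead of guessing.
    if in_prefix and not key.startswith(in_prefix + "/"):
        raise ValueError(f"{input_key!r} is not under prefix {input_prefix!r}")

    # Everything after the input prefix is the folder structure to preserve.
    relative = key[len(in_prefix):].lstrip("/") if in_prefix else key

    # S3 keys always use forward slashes, so posixpath is used on any OS.
    return posixpath.join(output_prefix.strip("/"), relative)


if __name__ == "__main__":
    print(output_key("raw/study1/series2/img001.dcm", "raw", "deid"))
```

With this mapping, a file read from `raw/study1/series2/img001.dcm` would be written to `deid/study1/series2/img001.dcm`, so a reviewer can locate the de-identified counterpart of any source file by path alone.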
