Internship Project: AI-Based PCB Defect Detection
Phase: Milestone 1 - Image Processing & Dataset Preparation
The objective of this project is to develop an automated system capable of identifying and classifying defects on Printed Circuit Boards (PCB). In the electronics manufacturing industry, manual inspection is slow and prone to error. This solution leverages Computer Vision (for localization) and Deep Learning (for classification) to automate quality assurance.
This repository contains the deliverables for Milestone 1, which focuses on building the foundational image processing pipeline and preparing the data for the neural network.
The goal of this phase was to implement Modules 1 & 2 from the project roadmap:
-
Module 1 (Defect Localization):
Implement a reference-based subtraction algorithm to detect anomalies between a "Golden Template" and a "Test Image." -
Module 2 (ROI Extraction):
Prepare a clean, labeled dataset by extracting Ground Truth Regions of Interest (ROIs) from the DeepPCB dataset to train the CNN in Milestone 2.
This module implements the core logic that identifies defects using image subtraction. The process relies on comparing a perfect reference image to the board being inspected.
Algorithm Steps:
- Image Alignment: Use computer vision techniques (ORB/Homography) to align the test image precisely with the template.
- Image Subtraction: Calculate pixel-wise absolute difference:
Diff(x,y) = |Template(x,y) - Test(x,y)| - Grayscale Conversion: Simplifies subsequent processing steps.
- Otsu's Thresholding: Automatically finds the optimal threshold to separate the difference regions (potential defects) from image noise.
- Morphological Filtering: Apply Opening (to remove small white noise) and Dilation/Closing (to solidify defect regions).
- Contour Extraction: Draw bounding boxes around the connected components identified as detected defects.
Outcome: Demonstrates that computer vision alone can localize defects like "Missing Holes" or "Spurs" without human intervention.
This module generates labeled ROIs for training a Convolutional Neural Network (CNN).
Process:
- Parse XML annotation files from the DeepPCB dataset.
- Extract verified bounding box coordinates (
xmin, ymin, xmax, ymax). - Crop the corresponding regions of interest (ROIs) from the raw images.
- Automatically sort crops into class-specific folders (e.g.,
Mouse_bite,Open_circuit,Short).
Outcome: A dataset of 2,953 high-quality, pre-labeled images ready for CNN training.
- Python 3.10 or higher
- DeepPCB Dataset (must include images, PCB_USED, and Annotations)
The system successfully identifies defects in testing. It generates Difference Maps and Binary Masks highlighting defects.
Processed 693 image pairs from DeepPCB. Generated a balanced dataset:
| Defect Class | Samples Extracted |
|---|---|
| Missing Hole | ~497 |
| Mouse Bite | ~492 |
| Open Circuit | ~482 |
| Short | ~491 |
| Spur | ~488 |
| Spurious Copper | ~503 |
| Total | ~2,953 |
With the image processing pipeline established in Milestone 1, the project now moves into the intelligence phase. While computer vision localizes potential anomalies, Milestone 2 implements a Deep Learning classification engine to distinguish between specific defect types (e.g., Short, Open Circuit, Mouse Bite) and reduce false positives.
The goal of this phase was to implement Modules 3 & 4 from the project roadmap:
- Module 3 (Transfer Learning): Adapt a pre-trained Convolutional Neural Network (CNN) to classify PCB defect ROIs.
- Module 4 (Testing & Evaluation): Benchmark the model's accuracy and performance to ensure it meets industrial standards (β₯ 95% accuracy).
To ensure the model is robust against variations in lighting and positioning, the 2,953 ROIs extracted in Milestone 1 underwent a transformation pipeline:
- Geometric Transformations: Random rotations, horizontal/vertical flips, and slight shifts to simulate real-world camera misalignments.
- Normalization: Scaling pixel values and resizing images to a standard input size (e.g., 224x224) compatible with the model architecture.
The system utilizes Transfer Learning to leverage features learned from large-scale datasets.
- Base Model: A pre-trained architecture (such as ResNet or VGG) was used as the feature extractor.
- Custom Head: The final fully connected layers were replaced with a specialized classifier matching the 6 defect categories identified in the DeepPCB dataset.
- Optimization: Employed the Adam optimizer and Categorical Cross-Entropy loss function to minimize classification errors during training.
- Split: The dataset was divided into Training (80%), Validation (10%), and Test (10%) sets to monitor for overfitting.
- Environment: Training was conducted using high-performance libraries like PyTorch or TensorFlow within a GPU-accelerated environment.
The system achieved high reliability in classifying defects. The validation phase focused on ensuring the model could distinguish between visually similar defects like βSpursβ and βSpurious Copper.β
-
Target Accuracy: β₯ 95% test accuracy.
-
Inference Speed: Processed images in β€ 3 seconds per image, meeting the target for real-time application needs.
Detailed performance was tracked using:
- Confusion Matrix: To identify specific classes where the model might confuse one defect for another.
- Loss/Accuracy Curves: Demonstrating steady convergence during the training epochs.
- Input: ROI localized as a "Missing Hole" via Milestone 1 subtraction.
- Deep Learning Prediction: Classifies as "Missing_hole" with 98.4% confidence.
Phase: Milestone 3
Focus: Application Development & System Integration
π― Objective:
Build a production-ready Web UI for real-time [Insert Project Name, e.g., Medical Imaging / Infrastructure] analysis.
π οΈ Tech Stack:
Streamlit, OpenCV, TensorFlow/PyTorch, PIL, [Insert other libraries]
This final milestone serves as the "Bridge" β integrating the high-performance models developed in Milestones 1 & 2 into a seamless, interactive experience.
The system transforms raw data into actionable insights, allowing non-technical stakeholders to perform complex AI diagnostics with a single click.
The frontend is built on Streamlit, designed for speed and clarity. It focuses on reducing cognitive load by presenting complex AI outputs through intuitive visualizations.
-
Dynamic Parameter Tuning: A sidebar allows users to adjust sensitivity (thresholds) on the fly, instantly updating UI results.
-
Smart Media Uploader: Supports drag-and-drop for [Images / Video / Data files] with real-time validation.
-
Interactive Overlays: Uses custom CSS/JavaScript components to overlay AI predictions directly onto the source media.
-
One-Click Export: A dedicated reporting engine compiles results into a professional PDF or CSV summary.
The backend (orchestrated in app.py or engine.py) manages the lifecycle of a request β from pre-processing to model inference and post-analysis.
To ensure both speed and accuracy, the backend utilizes a tiered approach:
- Preprocessing Stream: Normalizes input data (resizing, denoising, color correction) so that the AI sees data in its optimal form.
- Inference Stream: Routes cleaned data through the trained models.
- Post-Processing Logic: Applies Non-Maximum Suppression (NMS) or custom logic to filter noise and format the coordinates for the frontend.
A minimalist entry point featuring a secure upload zone and a System Readiness check to ensure AI models are preloaded in cache.
Displays a Processing state as users interact, showcasing the AI's ability to handle high-resolution inputs with low latency.
The core view where AI predictions (bounding boxes / heatmaps) are rendered. Each detection is color-coded by category for instant recognition.
Includes a Side-by-Side view or Overlay Toggle allowing users to compare the original input with the AIβs interpreted output.
A structured data table breaks down the confidence of every prediction, providing transparency for professional-grade tools.
| Milestone | Objective | Status | Result |
|---|---|---|---|
| M1 | [e.g., Data Extraction] | β Complete | [e.g., 99% Precision] |
| M2 | [e.g., Model Training] | β Complete | [e.g., 98% mAP] |
| M3 | System Integration | β Complete | Fully Functional App |






