Classroom Behavior Detection

type

Computer Vision

status

Active Research

year

2024

role

Research Assistant

01 —

System Architecture · 3D View

02 —

Architecture Diagram

Raw Dataset

Images + Annotations

↓

Augmentation

Resize · Flip · Crop

↓

YOLOv8 Model

PyTorch Backbone

↓

Training Loop

YAML Config · GPU/CPU

↓

Evaluation

mAP@50 · Precision

Grad-CAM

Explainability

↓

Detection Output

Bounding Boxes
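
The Flip step in the augmentation stage has to move the annotations along with the pixels. A minimal sketch of that bookkeeping, assuming labels are stored in the normalized YOLO format (class, x-center, y-center, width, height); the function name `hflip_labels` is hypothetical and not taken from the project code:

```python
from typing import List, Tuple

# One YOLO-format label: (class_id, x_center, y_center, width, height),
# with the four box values normalized to the range [0, 1].
Label = Tuple[int, float, float, float, float]


def hflip_labels(labels: List[Label]) -> List[Label]:
    """Mirror normalized boxes left-to-right to match a horizontally flipped image."""
    flipped = []
    for class_id, xc, yc, w, h in labels:
        # Only the x centre changes; y, width and height are unaffected.
        flipped.append((class_id, 1.0 - xc, yc, w, h))
    return flipped


if __name__ == "__main__":
    print(hflip_labels([(0, 0.25, 0.5, 0.2, 0.4)]))
```

Training frameworks typically apply this kind of transform internally, so the sketch is mainly useful for sanity-checking offline augmentation scripts.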

03 —

Screenshots & Output

terminal

$ yolo detect train data=classroom.yaml model=yolov8n.pt epochs=100

Epoch 1/100: box_loss=3.21 cls_loss=2.18 dfl_loss=1.03

Epoch 25/100: box_loss=1.84 cls_loss=1.12 dfl_loss=0.87

Epoch 50/100: mAP50=0.64 Precision=0.71 Recall=0.68

Epoch 75/100: mAP50=0.81 Precision=0.84 Recall=0.79

Epoch 100/100: mAP50=0.91 Precision=0.89 Recall=0.87

✓ Training complete. Best weights → runs/detect/best.pt

✓ Grad-CAM heatmaps saved → /gradcam/outputs/

Training Output

YOLO training terminal logs
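
The same run can probably be launched from Python instead of the CLI. The sketch below assumes the `ultralytics` package is installed; `imgsz=640` and `batch=8` come from the comment in the YAML card, and `device="cpu"` is an assumption based on the CPU-only environment described in the notes. The helper names `build_train_args` and `train` are hypothetical.

```python
def build_train_args(data="classroom.yaml", epochs=100, imgsz=640, batch=8, device="cpu"):
    """Collect the training settings in one place so they are easy to review."""
    return {
        "data": data,      # dataset config file
        "epochs": epochs,  # number of passes over the training set
        "imgsz": imgsz,    # input image size in pixels
        "batch": batch,    # images per batch (kept small for CPU training)
        "device": device,  # assumption: CPU-only environment
    }


def train(weights="yolov8n.pt", **overrides):
    """Start a training run; requires the ultralytics package."""
    # Imported lazily so the settings helper works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**build_train_args(**overrides))


if __name__ == "__main__":
    print(build_train_args())
```

Calling `train(epochs=50)` would start a shorter run with the other settings unchanged.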

Class mAP Scores

mAP@50: 91%

Precision: 89%

Recall: 87%

Writing: 79%

Using Phone: 85%

Class mAP Scores

Per-class detection performance
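
For context on the "@50" in these scores: a prediction counts as correct only when its box overlaps a ground-truth box with an intersection-over-union (IoU) of at least 0.5. The sketch below illustrates that matching rule for a single class and a single image. It is a simplification: the full mAP computation also ranks predictions by confidence and averages precision over the recall curve, and the function names here are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corner coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds: List[Box], truths: List[Box], thr: float = 0.5):
    """Greedy one-to-one matching of predictions to ground truth at an IoU threshold."""
    unmatched = list(truths)
    true_positives = 0
    for p in preds:
        # Pick the best remaining ground-truth box for this prediction.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            true_positives += 1
            unmatched.remove(best)  # each ground-truth box can match only once
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall
```

With one ground-truth box and two predictions, only one of which overlaps it well, this yields a precision of 0.5 and a recall of 1.0.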

YAML Config

# classroom.yaml
path: ./dataset
train: images/train
val: images/val
nc: 11
names: ['writing', 'phone', 'sleeping',
        'reading', 'raising_hand', 'talking'...]
# epochs=100 imgsz=640 batch=8

Config YAML

YAML training configuration
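
A mismatch between `nc` and the length of `names` is an easy mistake to make in a config like this. The sketch below shows one way to catch it before training, assuming the YAML has already been loaded into Python values; the helper `check_class_config` is hypothetical. The example uses only the six class names visible in the card, since the rest of the list is elided.

```python
from typing import List


def check_class_config(nc: int, names: List[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config is consistent."""
    problems = []
    if nc != len(names):
        problems.append(f"nc is {nc} but names has {len(names)} entries")
    # Duplicate names would make two class ids indistinguishable in reports.
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        problems.append(f"duplicate class names: {duplicates}")
    return problems


if __name__ == "__main__":
    # Only the names shown in the card; the full 11-class list is not reproduced here.
    visible = ["writing", "phone", "sleeping", "reading", "raising_hand", "talking"]
    print(check_class_config(11, visible))
```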

Project Structure

📁 classroom-behavior-detection/

├─ train.py YOLOv8 training

├─ evaluate.py mAP evaluation

├─ gradcam.py Explainability

├─ dataset/

│ ├─ images/train/ Training set

│ └─ labels/ Annotations

├─ classroom.yaml Training config

└─ runs/detect/ Results

Dataset Structure

Directory organization
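
With images and annotations kept in separate folders, a quick pairing check helps with quality control. The sketch below lists images that have no label file with the same base name. It assumes YOLO-style `.txt` labels and that the label folder mirrors the image folder (for example `dataset/labels/train/`), which the tree above does not confirm; `find_unlabeled_images` is a hypothetical helper.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unlabeled_images(image_dir, label_dir) -> List[str]:
    """Return the names of images that have no matching .txt label file."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    missing = []
    for img in sorted(image_dir.iterdir()):
        if img.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files such as caches
        if not (label_dir / f"{img.stem}.txt").exists():
            missing.append(img.name)
    return missing


if __name__ == "__main__":
    # Assumed layout; adjust the paths to the real dataset folders.
    print(find_unlabeled_images("dataset/images/train", "dataset/labels/train"))
```

Note that an image with no objects may legitimately have no label file, so the output is a list to review rather than a list of errors.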

04 —

What I Built

› Designed and trained a multi-class detection model with YOLOv8 and PyTorch (11+ behavioral classes).

› Structured and validated a custom dataset with disciplined data organization and quality control.

› Configured YAML training pipelines and tuned hyperparameters (epochs, image size, batch size).

› Improved mAP@50 through structured debugging and performance analysis.

› Explored Grad-CAM integration for model explainability and responsible AI.

› Managed CPU-based training in isolated virtual environments.

05 —

Project Insights

✎ Personal Notes & Learnings


Research Context

This is my primary research assistantship project at Lawrence Tech.

Technical Challenges

11+ behavioral classes with significant visual overlap
CPU-only training environment requiring careful batch size tuning
Annotation quality control across hundreds of images

Key Findings

Grad-CAM revealed that the model focused on posture rather than facial features
Smaller batch sizes with more epochs outperformed the opposite configuration (larger batches, fewer epochs)
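
For readers unfamiliar with Grad-CAM, the core weighting step can be sketched in a few lines. The example below is not the project's `gradcam.py`: it uses plain Python lists in place of tensors and assumes the activations of one convolutional layer and the gradients of the target score with respect to them have already been captured (in PyTorch this is usually done with hooks).

```python
from typing import List

FeatureMaps = List[List[List[float]]]  # indexed as [channel][row][col]


def grad_cam(activations: FeatureMaps, gradients: FeatureMaps) -> List[List[float]]:
    """Combine feature maps into one heatmap, weighting each channel by its mean gradient."""
    rows, cols = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * cols for _ in range(rows)]
    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all spatial positions.
        weight = sum(sum(r) for r in grad) / (rows * cols)
        for i in range(rows):
            for j in range(cols):
                heatmap[i][j] += weight * act[i][j]
    # ReLU keeps only regions that push the target score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]
    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap
```

High values in the resulting map mark the regions that contributed most to the detection, which is the kind of evidence behind the posture observation above.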

Next Steps

GPU-accelerated training for faster iteration
Real-time video stream inference pipeline
