CS 206: Computer Vision & Embedded AI
| Field | Details |
|---|---|
| Course Code | CS 206 |
| Course Name | Computer Vision & Embedded AI |
| Department | Computer Science |
| Semester Offered | Odd (Term 3 - Shanghai) |
| Tuition Hours | 30 hours (Theory) + 10 hours (Lab) |
| Course Level | Intermediate to Advanced |
| Pre-requisite | CS 105: Practical Machine Learning |
| Co-requisite | - |
| Course Objective | Most AI systems that interact with the real world need to see. Vision is not just another feature. It is often the primary interface between machines and reality. This course introduces students to computer vision from a practical, systems-first perspective. Instead of going deep into theory, the focus is on understanding just enough of neural networks to use them effectively, and then building real vision systems that run on hardware. Students will learn how to process images and video, detect and track objects, and deploy models on edge devices such as Raspberry Pi and Jetson Nano. They will also explore hybrid systems where edge devices communicate with the cloud for compute-heavy tasks. By the end of the course, students will be able to build real-time vision-enabled systems, directly contributing to their Term 3 goal of building AI-powered hardware products. |
| Course Philosophy | This course emphasizes building over theory. Students learn just enough about how vision models work to use them effectively, then spend most of their time deploying real systems on real hardware, with every concept reinforced through labs and a final working product. |
| Course Learning Outcomes | Upon successful completion of this course, students will be able to: (1) work with images and video as data; (2) build and apply models for classification, object detection, segmentation, and tracking; (3) deploy vision models on edge devices such as Raspberry Pi and Jetson Nano; (4) design hybrid edge-cloud architectures for compute-heavy tasks; and (5) build and demo real-time, vision-enabled hardware systems. |
| Course Author | Sagar Udasi, MSc in Statistics and Data Science with Computational Finance, The University of Edinburgh. Contact: sagar.l.udasi@gmail.com |
| Course Organiser | TBD. Details will be updated before course commencement. |
Lecture Sessions (20 Lectures)
| No. | Lecture Title | Concepts Covered | Lecture Objective |
|---|---|---|---|
| 01 | How Do Machines See The World? | Images as data, pixels, channels | Build intuition about visual data representation (see the first sketch after this table). |
| 02 | Why Neural Networks Work On Images | CNN intuition, filters, feature extraction | Introduce CNNs without heavy math. |
| 03 | From Pixels To Predictions | Image classification basics | Build first vision model quickly. |
| 04 | Seeing More Than Labels | Object detection concepts | Move from classification to localization. |
| 05 | Drawing Boxes Around The World | Bounding boxes, detection pipelines | Enable practical detection systems. |
| 06 | Understanding Every Pixel | Segmentation basics | Teach fine-grained visual understanding. |
| 07 | Following Things That Move | Object tracking | Build dynamic vision systems. |
| 08 | Working With Real Cameras | Camera interfaces, video streams | Connect models to real-world inputs. |
| 09 | Real-Time Vision Is Hard | Latency, frame rates, optimization | Handle constraints in live systems. |
| 10 | Pre-trained Models Save Time | Transfer learning in vision | Accelerate development using existing models (see the second sketch after this table). |
| 11 | Running Models On Tiny Devices | Edge AI basics, constraints | Introduce deployment on embedded systems. |
| 12 | Raspberry Pi And Jetson In Action | Device setup, inference pipelines | Hands-on deployment on hardware. |
| 13 | When Edge Is Not Enough | Edge-cloud architecture | Combine local and remote compute. |
| 14 | Building Video Processing Pipelines | Streaming, buffering, processing | Structure real-time data pipelines. |
| 15 | Optimizing For Speed | Quantization, pruning basics | Improve performance on limited hardware (see the third sketch after this table). |
| 16 | Integrating Sensors Beyond Cameras | Multi-sensor systems | Expand beyond vision into robotics context. |
| 17 | From Vision Model To Product Feature | System integration | Connect vision outputs to product logic. |
| 18 | Case Study: Vision In A Smart Device | Real-world system breakdown | Link directly to Term 3 wearable projects. |
| 19 | Building Your Own Vision System | End-to-end system design | Students build their own working system. |
| 20 | Demo Day: Does Your System See In Real Time? | Live demos, evaluation | Validate real-time performance and usability. |
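The short sketches below give a flavour of the hands-on work in these lectures. First, the core idea of Lecture 01: to a machine, an image is just an array of numbers. This is a minimal sketch assuming OpenCV (`opencv-python`) and a webcam at index 0.

```python
import cv2

# Open the default camera (index 0). On a Raspberry Pi or Jetson the
# camera may instead be exposed through V4L2 or a GStreamer pipeline.
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Could not open camera")

ret, frame = cap.read()  # frame is a NumPy array: height x width x 3 (BGR)
cap.release()
if not ret:
    raise RuntimeError("Failed to read a frame")

print("Frame shape:", frame.shape)      # e.g. (480, 640, 3)
print("Pixel at (0, 0):", frame[0, 0])  # three uint8 values: blue, green, red
```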
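Second, the message of Lecture 10, that pre-trained models save time, in a minimal sketch. The choice of ResNet-18 and the input file `example.jpg` are illustrative assumptions, not models or data mandated by the course; the sketch uses torchvision's pre-trained weights API.

```python
import torch
from PIL import Image
from torchvision.models import resnet18, ResNet18_Weights

# Load a ResNet-18 pre-trained on ImageNet; no training required.
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights).eval()

preprocess = weights.transforms()  # resize, crop, normalize as the model expects

img = Image.open("example.jpg")       # hypothetical input image
batch = preprocess(img).unsqueeze(0)  # add a batch dimension: 1 x 3 x 224 x 224

with torch.no_grad():
    logits = model(batch)

print("Predicted:", weights.meta["categories"][logits.argmax().item()])
```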
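Third, a sketch of the kind of optimization covered in Lecture 15. Dynamic quantization stores weights as 8-bit integers, shrinking the model and often speeding up CPU inference on devices like the Raspberry Pi. The toy model here is purely illustrative.

```python
import torch

# A toy model standing in for a real vision head.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

# Dynamic quantization: Linear weights become int8, activations are
# quantized on the fly. A first step before static quantization or pruning.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)
```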
Lab Sessions (7 Sessions)
| No. | Lab Title | Concepts Covered | Objective |
|---|---|---|---|
| L1 | Camera Setup That Actually Works | Camera interfacing | Enable students to capture real-world data. |
| L2 | Your First Vision Model | Image classification | Build and run a basic vision model. |
| L3 | Detect Objects In The Wild | Object detection | Apply detection to real scenes. |
| L4 | Track Movement In Real Time | Object tracking | Build dynamic systems. |
| L5 | Deploy On Edge Device | Raspberry Pi / Jetson setup | Run models on hardware. |
| L6 | Edge Meets Cloud | Hybrid systems | Build edge-cloud pipelines (see the sketch after this table). |
| L7 | Build Your Vision Product | End-to-end system | Create a working vision-based hardware feature. |
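A minimal sketch of the edge-cloud pattern from Lab L6: the edge device captures and compresses a frame, then ships it to a remote service for heavy inference. The endpoint URL and its JSON response are assumptions for illustration only.

```python
import cv2
import requests

CLOUD_URL = "http://example.com/infer"  # hypothetical cloud inference endpoint

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()
if not ret:
    raise RuntimeError("No frame captured")

# Compress to JPEG before sending: raw frames are far too large to
# stream from an edge device over a typical uplink.
ok, jpeg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 80])
resp = requests.post(
    CLOUD_URL,
    data=jpeg.tobytes(),
    headers={"Content-Type": "image/jpeg"},
)
print("Cloud response:", resp.json())  # assumes the endpoint returns JSON
```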
Assessment
| Component | Weightage |
|---|---|
| Vision Assignments (4 total) | 25% |
| Lab Performance & Builds | 25% |
| Final Project: Vision-Enabled Hardware System | 30% |
| Live Demo + System Evaluation | 20% |
Recommended Resources
| Type | Resource | Provider |
|---|---|---|
| Lecture | CS231n: Convolutional Neural Networks for Visual Recognition | Stanford |
| Lecture | Practical Deep Learning for Coders | fast.ai |
| Reading | Deep Learning for Computer Vision with Python | Adrian Rosebrock |
| Documentation | OpenCV Documentation | opencv.org |
| Documentation | PyTorch Vision | pytorch.org |
| Practice | Kaggle Computer Vision Competitions | kaggle.com |