Reza Baharani

AI/ML and Edge/IoT Systems Engineer · reza(at)baharani.info

Combining custom hardware design with deep learning, I build power-efficient AI solutions for edge devices. I am proficient in High-Level Synthesis (HLS), Hardware Description Languages (HDLs), and scalable modular architectures on FPGAs and ASICs, and experienced in taking real-time AI to production through complete development and verification cycles. As a deep learning engineer, I design convolutional, recurrent, and transformer architectures for applications in computer vision, time series, and text. I routinely handle large-scale datasets and parallel training on high-end GPU servers, applying techniques such as AI HW/SW co-acceleration, quantization, knowledge distillation, and pruning, and I stay at the forefront of AI and machine learning advancements through continuous exploration of new technologies.


Experience

Scientific Researcher

TeCSAR Lab.

Developing a self-supervised training framework for transformer-based computer vision architectures, focused on enhancing contextual understanding in 2D/3D pose estimation tasks.

  • Developed a discrete Variational Auto-Encoder (dVAE) to accurately capture the dynamics of pose movement over time in a discrete latent space, enabling effective classification of pose movements.
  • Training transformer-based models, including ViT- and BERT-like architectures, with generative and contrastive self-supervised objectives, including masked input-frame generation and frame-sequence verification.

Engaging in MLIR (Multi-Level Intermediate Representation) projects to lower machine learning models for custom hardware designed on FPGA as a potential target platform.

  • Examining and evaluating open-source projects such as CIRCT, alongside existing MLIR-based toolchains such as Torch-MLIR and TensorFlow MLIR.
Oct 2023 - Present

Lead Edge/IoT Deep Learning Engineer

ForesightCares Inc.

Led a smartphone software development team leveraging AI and 3D pose estimation to assess and reduce fall risk and cognitive impairment in older adults, achieving up to 20 FPS on the device SoC. Demo

  • Large-scale parallel training and validation of a novel human 3D pose estimation algorithm on datasets such as Human3.6M and NTU-RGB+D (2.3 TB).
  • Developed Swift code to integrate TensorFlow Lite (TFLite) and Apple MLPackage models with Core ML for Neural Engine (NPU)/CPU/GPU execution, and used React Native to connect the AI backend to the user interface.
  • Leveraged AWS cloud services such as Cognito, DynamoDB, and S3.
Jun 2022 - Present

MLOps Engineer

TeCSAR Lab.

Designed and implemented an end-to-end, scalable, intelligent video surveillance pipeline achieving 23 frames per second (FPS) for eight concurrent cameras at Full HD resolution. Demo.

  • Deployed four deep learning models for detection, re-identification, body pose estimation, and segmentation.
  • Trained a person re-ID model on large datasets, including DukeMTMC, CUHK03, and Market-1501, improving its accuracy under mixed-precision inference.
  • Used PyTorch multiprocessing (Process and Queue) to parallelize model inference.
  • Built a RESTful API with Flask that serves an ML model for re-identifying individuals across camera clients.
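The queue-based parallel inference pattern above can be sketched as follows. This is a minimal illustration using the stdlib `threading` and `queue` modules as a stand-in for PyTorch's `torch.multiprocessing`; `fake_model` is a hypothetical placeholder, not the actual re-ID model.

```python
# Producer/consumer inference sketch: frames go into an input queue,
# worker threads run the model, results come back on an output queue.
import queue
import threading

def fake_model(frame):
    # Placeholder "inference": returns a dummy one-element embedding.
    return [frame * 0.5]

def worker(in_q, out_q):
    while True:
        frame = in_q.get()
        if frame is None:          # sentinel: shut this worker down
            break
        out_q.put(fake_model(frame))

in_q, out_q = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=worker, args=(in_q, out_q)) for _ in range(2)]
for t in threads:
    t.start()

for frame in range(8):             # enqueue eight "camera frames"
    in_q.put(frame)
for _ in threads:
    in_q.put(None)                 # one sentinel per worker
for t in threads:
    t.join()

results = sorted(out_q.get() for _ in range(8))
print(len(results))                # one embedding per frame
```

With real models, `torch.multiprocessing.Process` replaces the threads so each model runs in its own process and sidesteps the GIL; the queue-and-sentinel structure stays the same.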
Aug 2021 - Jun 2022

Graduate Student Research Assistant

The University of North Carolina at Charlotte

Designed and developed Agile Temporal Convolutional Neural Network (ATCN), a scalable deep learning model with adjustable hyper-parameters to enable time series analysis for resource-constrained edge systems.

  • Implemented in C/C++, the solution consumed only 49% of the 320KB RAM and 15% of the 1MB flash memory available on a Cortex-M7 microcontroller.
  • Used data augmentation techniques, such as jittering, magnitude warping, window warping, and scaling, to enhance model robustness on the UCR 2018 dataset.
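Two of the augmentations named above, jittering and scaling, can be sketched in a few lines. The parameter values here are illustrative defaults, not the settings used in the original UCR 2018 experiments.

```python
# Time-series augmentation sketch: jittering adds per-step Gaussian
# noise; scaling multiplies the whole series by one random factor.
import random

def jitter(series, sigma=0.03):
    """Add small Gaussian noise to each time step."""
    return [x + random.gauss(0.0, sigma) for x in series]

def scale(series, sigma=0.1):
    """Rescale the whole series by a single random factor."""
    factor = random.gauss(1.0, sigma)
    return [x * factor for x in series]

random.seed(0)
ts = [0.0, 0.5, 1.0, 0.5, 0.0]
aug = scale(jitter(ts))
print(len(aug))  # augmentation preserves the series length
```

Magnitude warping and window warping follow the same idea but vary the factor smoothly over time or stretch a sub-window, respectively.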

Invented a customized multi-head-attention Temporal Convolutional Network (TCN) for efficient, precise prediction of highway vehicle trajectories, targeting highway and self-driving-car safety applications.

  • Redesigned the dilated TCN with depth-wise separable convolutions, reducing model size and complexity by approximately 33.16% compared to LSTM-based approaches.
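The parameter savings from the depth-wise separable substitution can be seen with a back-of-the-envelope count. The channel and kernel sizes below are illustrative, not the actual TCN configuration, so the resulting percentage differs from the 33.16% reported above.

```python
# Parameter count for one 1-D layer: a standard convolution couples
# channels and taps; a depth-wise separable layer splits them into a
# per-channel filter plus a 1x1 pointwise mixing step (bias ignored).
def standard_conv_params(c_in, c_out, k):
    return c_in * c_out * k

def separable_conv_params(c_in, c_out, k):
    depthwise = c_in * k        # one k-tap filter per input channel
    pointwise = c_in * c_out    # 1x1 conv mixes channels
    return depthwise + pointwise

std = standard_conv_params(64, 64, 3)
sep = separable_conv_params(64, 64, 3)
print(f"standard={std}, separable={sep}, reduction={1 - sep / std:.1%}")
```

The saving grows with kernel size and channel count, which is why the substitution pays off in dilated TCN stacks with many layers.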

Implemented HW/SW co-design of application-specific architectures, accelerating EfficientNet and MobileNetV2 inference on Xilinx embedded and cloud FPGAs with up to an 8.6x FPS/W improvement. Demo

  • Applied model-level optimizations such as 4-bit quantization, layer fusion, pruning, and activation approximation.
  • Applied hardware-level optimizations, including pipelining and window buffering.
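The core idea behind the 4-bit quantization step can be sketched as follows. This is a minimal symmetric-quantization illustration in plain Python; production flows (e.g. FPGA toolchains) add calibration, per-channel scales, and retraining on top of this rounding/clipping idea.

```python
# Symmetric 4-bit quantization sketch: map floats onto signed integer
# codes in [-8, 7] via one scale factor, then dequantize to see the
# rounding error the hardware would incur.
def quantize_4bit(weights):
    scale = max(abs(w) for w in weights) / 7.0   # largest weight -> code 7
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    deq = [v * scale for v in q]                 # reconstructed floats
    return q, deq

w = [0.9, -0.45, 0.1, -0.02]
q, deq = quantize_4bit(w)
print(q)  # integer codes, each within [-8, 7]
```

On the FPGA side these 4-bit codes let multiple multiply-accumulates pack into each DSP slice, which is where the FPS/W gain comes from.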

Designed a recurrent deep learning solution for real-time edge processing in reliability modeling of Si-MOSFET power electronics converters.

  • Designed stacked LSTM networks for time series analysis.
  • Utilized a NASA dataset for training and validation, enhancing the accuracy and efficiency of the system.

Experienced in server administration, system driver configuration, and resource allocation, including hardware RAID setup for multiple servers with server-class GPUs such as the P100, V100, and RTX 6000. Skilled in configuring deep learning frameworks such as TensorFlow and PyTorch for optimal performance across various server configurations.

Aug 2017 - Aug 2021

Education

University of North Carolina at Charlotte

Ph.D.
Electrical and Computer Engineering - Computer Architecture and Deep Learning
Aug 2017 - Aug 2021

University of Tehran

M.Sc.
Computer Architecture Engineering
Sep 2009 - Sep 2021

Skills

Programming & RTL Languages
  • Python
  • C/C++
  • SystemVerilog
  • SystemC
  • Verilog/VHDL
  • Shell and TCL
  • JavaScript|TypeScript
  • SQL
  • Familiar with Swift

AI/ML Algorithms
  • Modern deep neural networks such as CNNs, RNNs, and Transformers
  • Tokenization
  • Transfer Learning
  • Attention-based Neural Networks
  • Big Data Distributed Processing
  • Parameter Compression & Quantization
  • Evaluation Metrics (accuracy, precision, recall, F1-score, ROC curves, etc.)
  • Knowledge Distillation

AWS Cloud Services
  • Cognito
  • DynamoDB
  • S3 Bucket
  • EC2 (Elastic Compute Cloud)

Software Frameworks
  • React Native
  • Ray Cluster
  • Ansible

Embedded Systems
  • Git
  • GNU build tools
  • JTAG
  • Cross-compilation
  • Make/CMake
  • I2C
  • SPI
  • UART
  • GPIOs
  • RS-232/485
  • ARM Assembly
  • Familiar with Linux device drivers