About

Summary

  • Starting as an IC and growing into a manager role, I built and led a team of eight CVML engineers.
  • We shipped a face tracking solution on Quest Pro, the first wearable consumer device to enable users’ digital avatars to match their real-world facial expressions through on-device sensors in real time.
  • A full-stack CVML engineer with product experience in hardware design, data collection, model development, and evaluation, I bring a diverse set of skills to the table.

Contact Information

Skills

  • Programming: C, Java, Python, JavaScript, MongoDB, React
  • Computer vision: motion analysis, action recognition, face recognition, object recognition
  • Machine learning: sparse learning, multi-task learning, statistical learning
  • Deep learning: CNN, RNN, LSTM, GAN, Caffe, TensorFlow, Keras
  • Others: Linux, Docker, MongoDB, ARM NEON

Education

  • Bachelor, Beijing Normal University, 2005~2009, EE;

  • Ph.D., Arizona State University, 2009~2014, CS;

    • Sparse learning and face recognition;
    • Multi-task learning and attribute learning;
    • Video analysis and motion skill analysis.
    • Dissertation and book.

Experience

  • Tech Lead Manager, Facebook, 2019~now
    • Started as an IC and grew into a manager role; built and led a team of eight CVML engineers who together shipped face tracking solutions for multiple AR/VR products;
    • Shipped the face tracking solution for Quest Pro, the first consumer wearable device to enable digital avatars to match people’s real-world facial expressions through on-device sensors in real time.
      • Built a simulation toolset to design the face tracking hardware ahead of real data;
      • Built scalable infrastructure to collect and serve 1.5 billion frames from 19k subjects over 6k hours of data for machine learning;
      • Applied self-supervised learning and differentiable rendering to generate pseudo ground truth without expensive direct human annotation;
      • Innovated on modeling (domain adaptation, teacher-student distillation) to leverage synthetic data and weak human labels, improving quality beyond the pseudo ground truth (a minimal sketch appears at the end of this section);
      • Developed effective face tracking metrics and a scalable evaluation framework that is robust to noisy ground truth and consistent with human experiential KPIs;
    • A full-stack CVML engineer with a well-rounded understanding of the entire product development process: hardware design, data collection, model development, and evaluation.
  • Founder & CTO, Button, 2018~2019.

    • Led the development of a pharmaceutical and medical device asset discovery platform;
    • Built a team of six engineers and several data scientists, collaborating across business functions;
    • Skills: JavaScript, MongoDB, React, natural language processing, recommendation systems.
  • Software Engineer, Google, 2016~2018.

    • Worked on the camera for Google’s 2016 and 2017 Pixel phones;
    • Tech lead for the video pipeline (stability, power consumption, codec tuning, audio performance);
    • Led the development of the hyper-lapse mode of the Pixel camera;
    • Contributed to the new camera framework for the 2018 Pixel;
    • Improved the Panorama, Photosphere, and Refocus modes of the Pixel camera;
    • Skills: Java, C, Android, JNI, Python, TensorFlow, convolutional neural networks.
  • Staff Software Engineer, Samsung, 2014~2016.

    • Led algorithm and software development for a 3D camera (time-of-flight, ToF);
      • Improved depth quality under noise and motion blur (5 mm accuracy within 1 to 5 meters);
      • Optimized the code for real-time performance on multiple platforms;
    • Researched and developed deep learning algorithms;
      • Developed deep learning algorithms for a variety of tasks, including image recognition, object detection, image quality assessment, image processing, image captioning, speech recognition, and machine translation;
      • Optimized deep learning algorithms for mobile platforms for better speed and a reduced memory footprint, including sparse pruning, quantization (bit reduction), and compression (a minimal sketch appears at the end of this section);
      • Guided the design of Samsung’s mobile neural processor and identified hot spots in neural network inference;
      • Rich experience with both convolutional and recurrent neural networks, and familiarity with popular deep learning packages, including Caffe and Theano (Keras);
    • Developed an algorithm for high-quality multi-frame super-resolution optimized for dynamic scenes;
    • Several patents pending and papers under review;
    • Skills: C, Android (NDK), x86 SSE, ARM NEON, OpenCL, CEVA vector processing, Python, CUDA, Java, OpenMP, multi-threading, sockets (TCP/UDP).
  • Intern, Qualcomm, 2012.

    • Algorithm design and development for an optics-based multi-touch system;
    • Winner of the 2nd prize in Qualcomm Qtech 2012, with two patents (US20140264034 A1, WO2014158946 A1, US9109949);
  • Intern, Sharp Laboratories of America, 2011.

    • Algorithm design and development for defect detection;
    • Algorithm development for template-based image matching;
    • Code deployed in commercial products and filed as a US patent (US8705839 B2).
  • Research Assistant, Arizona State University, 2009~2015.

    • Automatically evaluated motion skills in surgical simulation with tracking-free approaches.

      • Learned discriminative models from relative labels, which relaxes the labeling requirement.
      • Extracted temporal dynamics from the video to facilitate detailed analysis;
    • Proposed a discriminative dictionary learning algorithm to improve sparse-representation-based face recognition (a minimal sketch appears at the end of this section).

      • Reduced the dictionary size without compromising accuracy.
      • Further reduced the dictionary size and improved accuracy by learning a decomposition of face images into physically meaningful components;
    • Learned multiple classifiers jointly via multi-task learning.

      • Exploited relative labels to remove the requirement for traditional labeling.
      • Applied the approach successfully to a variety of problems, including motion analysis, image classification, and image co-segmentation;
    • Built vision-based human-robot interaction in an indoor environment.

      • Followed a target subject in complex environments;
      • Interacted via vision-based gesture recognition in real time;
    • Enhanced no-reference image quality assessment via the guidance of visual saliency.
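
The sketches below illustrate a few of the techniques named in the entries above. They are minimal, hypothetical examples written for this page, not excerpts from any production or research codebase; all model sizes, function names, and data are stand-ins.

First, the teacher-student distillation idea from the Quest Pro face tracking entry: a frozen teacher supplies pseudo ground truth on every sample, and a smaller student additionally matches weak human labels where they exist. The architecture and loss weighting here are illustrative assumptions.

```python
# Hypothetical teacher-student distillation sketch for a regression target
# (e.g. expression coefficients). Sizes, weights, and data are illustrative.
import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 32))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
teacher.eval()  # the teacher is frozen and only supplies pseudo ground truth

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
mse = nn.MSELoss()

def distill_step(features, weak_labels, weak_mask, alpha=0.7):
    """One update: match the teacher everywhere; match weak labels where available."""
    with torch.no_grad():
        pseudo_gt = teacher(features)
    pred = student(features)
    loss = alpha * mse(pred, pseudo_gt)
    if weak_mask.any():
        loss = loss + (1 - alpha) * mse(pred[weak_mask], weak_labels[weak_mask])
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy batch of 16 samples; weak human labels exist for only half of them.
feats = torch.randn(16, 128)
labels = torch.randn(16, 32)
mask = torch.arange(16) < 8
print(distill_step(feats, labels, mask))
```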
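Second, the quantization (bit reduction) technique from the Samsung mobile-optimization entry, shown as a simple symmetric post-training weight quantizer. Per-layer calibration, hardware-specific formats, and the pruning and compression steps are omitted; the function names are hypothetical.

```python
# Hypothetical post-training symmetric weight quantization (bit reduction).
import numpy as np

def quantize_weights(w: np.ndarray, bits: int = 8):
    """Map float weights to signed integers with a single per-tensor scale.
    Valid for bits <= 8 since the result is stored as int8."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 127 for 8 bits
    max_abs = float(np.abs(w).max()) if w.size else 0.0
    scale = (max_abs / qmax) if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_weights(w)
print("max reconstruction error:", float(np.abs(dequantize(q, s) - w).max()))
```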
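Third, the sparse-representation-based classification baseline behind the dictionary learning work in the research assistant entry: a query is coded sparsely over a dictionary of training samples, and the class whose atoms give the smallest reconstruction residual wins. Random vectors stand in for face images, and `src_classify` is an illustrative name, not code from the published work.

```python
# Hypothetical sparse-representation-based classification (SRC) sketch.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_classes, per_class, dim = 5, 20, 64

# Dictionary: columns are unit-norm "training samples" grouped by class.
D = rng.standard_normal((dim, n_classes * per_class))
D /= np.linalg.norm(D, axis=0)
labels = np.repeat(np.arange(n_classes), per_class)

def src_classify(y, D, labels, alpha=0.01):
    """Sparse-code the query over D, then pick the class whose atoms
    reconstruct it with the smallest residual."""
    x = Lasso(alpha=alpha, max_iter=5000).fit(D, y).coef_
    residuals = [np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                 for c in np.unique(labels)]
    return int(np.argmin(residuals))

query = D[:, 3] + 0.05 * rng.standard_normal(dim)  # noisy copy of a class-0 atom
print(src_classify(query, D, labels))              # should print 0, the class of atom 3
```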

Honors and Awards

  • Samsung Best Paper Award 2015, Merit Award (50 out of 1100 candidates)
  • Samsung Best Patent Review Committee Member Golden Awards 2015
  • University Graduate Fellowship Award, 2013.4, 2009.9
  • Qualcomm Qtech 2012 2nd Prize, 2012.6
  • Outstanding Volunteer Service Award by ACM MM 2011, 2011.11

Patents

  • US11586283B1, Artificial reality device headset DONN and DOFF detection, Meta (Dong Yang, Qiang Zhang, Wen Song, Theresa Loney Casarez)
  • US-2016379352-A1, Label-free non-reference image quality assessment via deep neural network, Samsung Electronics Co., Ltd. (Qiang Zhang, Zhengping Ji, Lilong Shi, Ilia Ovsiannikov)
  • US-2013129188-A1, Electronic devices for defect detection, Sharp Laboratories of America, Inc. (Qiang Zhang, Xinyu Xu, Chang Yuan, Hae-Jong Seo, Petrus J.L. van Beek)
  • US-2014264034-A1, Near-field optical sensing system, Qualcomm MEMS Technologies, Inc. (Xiquan Cui, Muhammed I. Sezan, Russell Wayne Gruhlke, Qiang Zhang)
  • US-9934557-B2, Method and apparatus of image representation and processing for dynamic vision sensor, Samsung Electronics Co., Ltd. (Zhengping Ji, Kyoobin Lee, Qiang Zhang, Yibing Michelle Wang, Hyun Surk Ryu, Ilia Ovsiannikov)
  • US-2016309135-A1, Concurrent RGBZ sensor and system (Ilia Ovsiannikov, Yibing Michelle Wang, Gregory Waligorski, Qiang Zhang)
  • US-2016358314-A1, Method and apparatus of multi-frame super resolution robust to local and global motion (Zhengping Ji, Qiang Zhang, Lilong Shi, Ilia Ovsiannikov)
  • US-2017185871-A1, Method and apparatus of neural network based image signal processor (Qiang Zhang, Zhengping Ji, Yibing Michelle Wang, Ilia Ovsiannikov)
  • US-2016350649-A1, Method and apparatus of learning neural network via hierarchical ensemble learning (Qiang Zhang, Zhengping Ji, Lilong Shi, Ilia Ovsiannikov)
  • US-2017213105-A1, Method and apparatus for event sampling of dynamic vision sensor on image formation (Zhengping Ji, Qiang Zhang, Kyoobin Lee, Yibing Michelle Wang, Hyun Surk Ryu, Ilia Ovsiannikov)

Publications

Book

  • Qiang Zhang, Baoxin Li, “Dictionary Learning in Visual Computing,” Morgan & Claypool, doi:10.2200/S00640ED1V01Y201504IVM018

Dissertation

  • Qiang Zhang, “Semantic Sparse Learning in Images and Videos,” Ph.D. dissertation, Computer Science, Arizona State University, 2014

Deep Learning

  • Z Ji, I Ovsiannikov, Y Wang, L Shi, Q Zhang, Reducing weight precision of convolutional neural networks towards large-scale on-chip image recognition, SPIE Sensing Technology + Applications, 94960A

Motion Analysis

  • Y Wang, Q Zhang, B Li, Efficient unsupervised abnormal crowd activity detection based on a spatiotemporal saliency detector, IEEE Winter Conference on Applications of Computer Vision (WACV) 2016, 1-9
  • Lin Chen, Qiang Zhang, Peng Zhang and Baoxin Li. Instructive Video Retrieval for Surgical Skill Coaching Using Attribute Learning. IEEE International Conference on Multimedia and Expo (ICME) 2015, Torino, Italy.
  • Qiang Zhang, Baoxin Li, “Relative Hidden Markov Models for Video-based Evaluation of Motion Skills in Surgical Training,” IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Qiang Zhang and Baoxin Li, Relative Hidden Markov Models for Evaluating Motion Skills, IEEE Computer Vision and Pattern Recognition (CVPR) 2013, Portland, OR
  • Lin Chen, Qiongjie Tian, Qiang Zhang and Baoxin Li. Learning Skill-Defining Latent Space in Video-Based Analysis of Surgical Expertise – A Multi-Stream Fusion Approach. NextMed/MMVR20. San Diego, CA, 2013.
  • Qiongjie Tian, Lin Chen, Qiang Zhang and Baoxin Li. Enhancing Fundamentals of Laparoscopic Surgery Trainer Box via Designing A Multi-Sensor Feedback System. NextMed/MMVR20. San Diego, CA, 2013.
  • Qiang Zhang, Lin Chen, Qiongjie Tian and Baoxin Li. Video-based analysis of motion skills in simulation-based surgical training. SPIE Multimedia Content Access: Algorithms and Systems VII. San Francisco, CA, 2013.
  • Qiang Zhang and Baoxin Li. Video-based motion expertise analysis in simulation-based surgical training using hierarchical Dirichlet process hidden Markov model. In Proceedings of the 2011 international ACM workshop on Medical multimedia analysis and retrieval (MMAR ’11). ACM [oral], New York, NY, USA, 19-24.
  • Zhang, Qiang and Li, Baoxin, Towards Computational Understanding of Skill Levels in Simulation-Based Surgical Training via Automatic Video Analysis, International Symposium on Visual Computing (ISVC) 2010, Las Vegas, NV

Face Recognition

  • Qiang Zhang and Baoxin Li. Mining Discriminative Components With Low-Rank And Sparsity Constraints for Face Recognition. The 18th ACM SIGKDD International Conference On Knowledge Discovery and Data Mining (SIGKDD 2012).
  • Qiang Zhang and Baoxin Li, Joint Sparsity Model with Matrix Completion for an Ensemble of Images, IEEE International Conference on Image Processing (ICIP) 2010, Hong Kong, China
  • Qiang Zhang and Baoxin Li, Discriminative K-SVD for Dictionary Learning in Face Recognition, IEEE Computer Vision and Pattern Recognition (CVPR) 2010, San Francisco, CA

Multi-task Learning

  • Qiang Zhang, Jiayu Zhou, Yilin Wang, Jieping Ye and Baoxin Li, Image Cosegmentation via Multi-task Learning, BMVC 2014, Nottingham, UK
  • Lin Chen, Qiang Zhang and Baoxin Li, Predicting Multiple Attributes via Relative Multi-task Learning, IEEE Computer Vision and Pattern Recognition (CVPR) 2014, Columbus, OH
  • Qiang Zhang, Baoxin Li, “Max Margin Multi-Attribute Learning with Low Rank Constraint,” IEEE Transactions on Image Processing

Image Processing and Others

  • Z Ji, Q Zhang, L Shi, I Ovsiannikov, Multi-frame Super Resolution Robust to Local and Global Motion, SPIE Medical Imaging, 101370Y
  • Yilin Wang, Qiang Zhang and Baoxin Li, Structure-Preserving Image Quality Assessment, IEEE International Conference on Multimedia and Expo (ICME) 2015, Torino, Italy
  • Yilin Wang, Qiang Zhang and Baoxin Li, Semantic Saliency Weighted SSIM for Video Quality Assessment, International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM) 2014, Chandler, AZ
  • Qiang Zhang, Chang Yuan, Xinyu Xu, Peter Van Beek, Hae-Jong Seo, and Baoxin Li. Efficient defect detection with sign information of Walsh Hadamard transform. IS&T/SPIE Image Processing: Machine Vision Applications VI. San Francisco, CA, 2013
  • Jin Zhou, Qiang Zhang, Baoxin Li and Ananya Das, Synthesis of Stereoscopic Views from Monocular Endoscopic Videos, IEEE Computer Vision and Pattern Recognition (CVPR) 2010 workshop on Mathematical Methods in Biomedical Image Analysis, San Francisco, CA
  • Qiang Zhang, Pengfei Xu, Wen Li, Zhongke Wu and Mingquan Zhou, Efficient Edge Matching Using Improved Hierarchical Chamfer Matching, IEEE International Symposium on Circuits and Systems (ISCAS) 2009, Taipei, Taiwan
  • Qiang Zhang, Hua Li, Yan Zhao and Xinlu Liu, Exploration of Event-Evoked Oscillatory Activities during a Cognitive Task, The 4th International Conference on Natural Computation and The 5th International Conference on Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) 2008, Jinan, China

Services

  • Volunteer for the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2010 and 2013, and ACM Multimedia 2011
  • Reviewer of IEEE International Conference on Computer Vision (ICCV) 2015, International Joint Conference on Artificial Intelligence (IJCAI) 2015, 2016, and IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015, 2016
  • Reviewer of IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Signal Processing Letters, IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Knowledge and Data Engineering
  • Reviewer of Elsevier Journal of Visual Communication and Image Representation, Elsevier Robotics and Autonomous Systems, Elsevier Pattern Recognition, Elsevier Information Fusion, Elsevier Signal Processing: Image Communication, Elsevier Journal of Systems and Software
  • Reviewer of SPIE Journal of Electronic Imaging
  • Reviewer of Journal of Multimedia, Journal of Computers
  • Reviewer of Springer International Journal of Machine Learning and Cybernetics, Springer Computational and Applied Mathematics

Source code