The trained network is typically too large to run efficiently on mobile device. For example, VGG16 used for image classification has more 130 Million parameter (about 600 MB on model size) and requires about 31 billion operations to classify an image, which is way to expensive to be done on mobile.
Besides the convolution operator we already found in AlexNet or VGG16, there are few variations, which will be introduced below. The content of this article is based on reading of An Introduction to different Types of Convolutions in Deep Learning
Here is the comparison of the most popular object detection frameworks.
There are two types of image segmentation:
- semantic segmentation: assign labels to each pixel
- instance segmentation: extract the bounary/mask for each instance in the image
Recently, I am working on a prototype which requires to use OpenGL to process the streams from Camere and save the processed results to a Video. Thus the overall pipeline is that:
Patrick Toomey has posted a document on some low-friction ways to reduce risk in your software projects. Those methods are based on his 5 years of experience in application security, many security issues and associated data disclosures are the result of technically unsophisticated attacks.
Intel has been working hard to optimizing the popular deep learning frameworks, e.g., TensorFlow and Caffe, on (Intel) CPUs. Their work has reduced the performance gaps between CPU and GPU from about
Feifei Li has published their work on how to have the robot learn to pick and place the objects by itself. The paper is available at arxiv.