Recently, I learned a technique called Facial Landmark Detection. It can help us find where a face is located in an image, as well as the locations of different facial features (e.g. eyes, eyebrows, mouth, nose, etc.).
Facial Landmark Detection is a two-step process. First, you detect the face in an image. Second, the landmark detector finds the landmarks inside the face rectangle. In my projects, I use Dlib's 68-point model. Dlib's face detector is based on Histogram of Oriented Gradients (HOG) features and a Support Vector Machine (SVM). Below, I'll introduce some applications based on Facial Landmark Detection.
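The two-step process can be sketched in a few lines of Python, assuming dlib is installed and the 68-point model file (shape_predictor_68_face_landmarks.dat) has been downloaded separately; the image filename here is a placeholder.

```python
import dlib

# Step 1: HOG + SVM face detector (built into dlib).
detector = dlib.get_frontal_face_detector()

# Step 2: 68-point landmark predictor; the .dat model must be downloaded
# separately (the path below is a placeholder).
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

image = dlib.load_rgb_image("face.jpg")  # placeholder input image
faces = detector(image, 1)               # upsample once to find smaller faces

for rect in faces:
    shape = predictor(image, rect)       # landmarks inside the face rectangle
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
```

Each `shape` holds 68 (x, y) points; their indices follow the standard 68-point annotation scheme (e.g. points 48–67 outline the mouth).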
Face Morphing : Face Morphing is a technique, widely used in movies and animation, for changing one image or shape into another seamlessly.
Given two images I and J, we want to create an in-between image M by blending them. The blending is controlled by a parameter α between 0 and 1 (0 ≤ α ≤ 1). When α is 0, the morph M looks like I, and when α is 1, M looks like J. You can blend the images using the following equation at every pixel (x, y): M(x,y) = (1−α)I(x,y) + αJ(x,y). But how can we compute the nonlinear transform between images I and J? Let's see the next section.
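The blending equation is just a pixel-wise weighted average. A minimal NumPy sketch, using two tiny 2×2 grayscale "images" as stand-ins for real ones:

```python
import numpy as np

# Two tiny grayscale "images" I and J with values in [0, 1].
I = np.array([[0.0, 0.2], [0.4, 0.6]])
J = np.array([[1.0, 0.8], [0.6, 0.4]])

def blend(I, J, alpha):
    """Cross-dissolve: M(x, y) = (1 - alpha) * I(x, y) + alpha * J(x, y)."""
    return (1.0 - alpha) * I + alpha * J

M0 = blend(I, J, 0.0)   # identical to I
M1 = blend(I, J, 1.0)   # identical to J
M = blend(I, J, 0.5)    # halfway morph
```

On real images this alone produces ghosting wherever features don't line up, which is exactly why the warping step in the next section is needed.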
Nonlinear transforms : Nonlinear transforms can be computationally expensive, because they require you to calculate a mapping at every pixel based on some complex transform function. So we approximate the nonlinear transform with piecewise linear transforms. To accomplish this, the image is divided into non-overlapping triangles by Delaunay triangulation; with this collection of triangles in the input image, we can apply a linear transform to each triangle (warping triangles) to generate the output image.
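The linear transform for each triangle is an affine map, which is fully determined by its three vertex correspondences. A small NumPy sketch with made-up triangle coordinates:

```python
import numpy as np

# Corresponding triangles: one from the input mesh, one from the output mesh.
src_tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst_tri = np.array([[2.0, 1.0], [4.0, 1.0], [2.0, 3.0]])

def affine_from_triangles(src, dst):
    """Solve for the 2x3 affine matrix A with A @ [x, y, 1]^T = [x', y']^T.

    Three point correspondences give six equations for the six unknowns,
    so the affine map is determined exactly.
    """
    S = np.hstack([src, np.ones((3, 1))])     # 3x3: each row is [x, y, 1]
    A_T, *_ = np.linalg.lstsq(S, dst, rcond=None)
    return A_T.T                              # 2x3 affine matrix

def warp_point(A, p):
    """Apply the affine map A to a single 2D point p."""
    return A @ np.array([p[0], p[1], 1.0])

A = affine_from_triangles(src_tri, dst_tri)
```

In practice each triangle's warp is applied to all pixels inside it (OpenCV's `cv2.getAffineTransform` and `cv2.warpAffine` do this efficiently); stitching the warped triangles together yields the full piecewise-linear approximation.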
Head pose estimation : Pose estimation is known as the Perspective-n-Point (PnP) problem in computer vision. The goal is to find the pose of the head with a calibrated camera, given the locations of n 3D points on the head (or face) and their corresponding 2D projections in the image. Facial landmarks give us plenty of 2D feature points, and we can use the following 3D points :
Tip of the nose : ( 0.0, 0.0, 0.0)
Chin : ( 0.0, -330.0, -65.0)
Left corner of the left eye : (-225.0, 170.0, -135.0)
Right corner of the right eye : ( 225.0, 170.0, -135.0)
Left corner of the mouth : (-150.0, -150.0, -125.0)
Right corner of the mouth : (150.0, -150.0, -125.0)
We can transform the 3D points in world coordinates to 3D points in camera coordinates with the Direct Linear Transform, and then use the Levenberg–Marquardt algorithm to minimize the reprojection error. In OpenCV, the functions solvePnP and solvePnPRansac can be used to estimate the pose.