
Geometry

Lie Groups for 2d and 3d Transformation

A manifold is an $n$-dimensional topological space that is locally Euclidean.

A Lie group is a topological group that is also a smooth manifold. Each Lie group is associated with a Lie algebra, which is a vector space. Transformations need to be composed, inverted, differentiated and interpolated. Lie groups and their associated machinery address all these operations in a principled way.

Definition. A nonempty subset $G \subseteq \mathbb{R}^{n \times n}$ is called a Lie group if $g, h \in G \Rightarrow gh \in G$ and $g \in G \Rightarrow \det(g) \neq 0,\ g^{-1} \in G$.

Mathematical notations

SO(3) & so(3)

Elements of the 3D rotation group, SO(3), are represented by 3D rotation matrices. They are all orthogonal matrices with determinant +1. The Lie algebra so(3), associated with SO(3), is the set of $3 \times 3$ skew-symmetric (anti-symmetric) matrices, i.e., $A^T = -A$.

Infinitesimal generators

The key idea introduced by Lie is that any finite transformation can be constructed by the repeated application, or integration, of infinitesimal (extremely small) transformations. The infinitesimal generators of so(3) correspond to the derivatives of rotation around each of the standard axes, evaluated at the identity. We take the rotation by an angle $\phi$ around the x-axis as an example:

$$R_1(\phi) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{bmatrix}.$$

Its derivative with respect to $\phi$, evaluated at zero, is

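$$G_1 = \left.\frac{\partial R_1}{\partial \phi}\right|_{\phi=0} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}.$$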
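The derivative above can be checked numerically. The following is a minimal sketch in plain Python; the helper names `rotation_x` and `numerical_generator` and the step size `h` are illustrative assumptions, not part of the original notes. It estimates the derivative of the x-axis rotation at zero with a central difference.

```python
import math


def rotation_x(phi):
    """Rotation by angle phi (radians) about the x-axis, as a 3x3 nested list."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]


def numerical_generator(rotation, h=1e-6):
    """Central-difference estimate of d(rotation)/d(phi) at phi = 0."""
    plus, minus = rotation(h), rotation(-h)
    return [[(plus[i][j] - minus[i][j]) / (2.0 * h) for j in range(3)]
            for i in range(3)]


# Should be close to the generator G1 written out above.
G1_estimate = numerical_generator(rotation_x)
```

The same helper could be applied to rotations about the y- and z-axes to recover the other two generators.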

The three infinitesimal generators are derived as:

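$$G_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix},\quad G_2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix},\quad G_3 = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$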

An element of so(3) is then represented as the linear combination of the three generators:

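$$\omega = [\omega_1, \omega_2, \omega_3]^T \in \mathbb{R}^3$$

$$\omega_\times = \omega_1 G_1 + \omega_2 G_2 + \omega_3 G_3 = \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix} \in so(3)$$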
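As a small illustration of the linear combination above, here is a sketch in plain Python that builds the skew-symmetric matrix from the three generators. The function names `hat`, `apply` and `cross` are illustrative choices, not names from the original notes. Multiplying the result by a vector should agree with the ordinary cross product.

```python
# Generators of so(3), written as nested lists (row-major 3x3 matrices).
G1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
G2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
G3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]


def hat(omega):
    """Map omega = (w1, w2, w3) to the matrix w1*G1 + w2*G2 + w3*G3."""
    w1, w2, w3 = omega
    return [[w1 * G1[i][j] + w2 * G2[i][j] + w3 * G3[i][j] for j in range(3)]
            for i in range(3)]


def apply(M, v):
    """Matrix-vector product for a 3x3 matrix and a 3-vector."""
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]


def cross(a, b):
    """Ordinary cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]
```

For any vectors `w` and `v`, `apply(hat(w), v)` and `cross(w, v)` should return the same result, which is why the matrix is written with a cross subscript.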

Since $G_1$, $G_2$ and $G_3$ are derivatives of rotations around the three standard axes, $\omega$ can be viewed as the first-order difference of a rotation from the identity.

Exponential & logarithm map

Rodrigues’ formula: the exponential map associates an element of the Lie algebra with a rotation in the Lie group:

$$e^{\omega_\times} = I + \frac{\sin(|\omega|)}{|\omega|}\,\omega_\times + \frac{1 - \cos(|\omega|)}{|\omega|^2}\,(\omega_\times)^2$$

If $\omega$ is small, we often take the first-order approximation $e^{\omega_\times} \approx I + \omega_\times$.
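Rodrigues' formula can be sketched in a few lines of plain Python. The names `so3_exp`, `hat`, `matmul` and the threshold `eps` are assumptions made for this illustration. For very small angles the sketch replaces the two coefficients by their limits (1 and 1/2), which avoids dividing by zero; this is a slightly higher-order variant of the first-order approximation mentioned above.

```python
import math


def hat(w):
    """Skew-symmetric matrix of a 3-vector w."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def so3_exp(w, eps=1e-8):
    """Rotation matrix exp(hat(w)) computed with Rodrigues' formula."""
    theta = math.sqrt(sum(c * c for c in w))
    W = hat(w)
    W2 = matmul(W, W)
    if theta < eps:
        a, b = 1.0, 0.5                      # limits of the coefficients as theta -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return [[(1.0 if i == j else 0.0) + a * W[i][j] + b * W2[i][j]
             for j in range(3)] for i in range(3)]
```

For example, `so3_exp([0.0, 0.0, math.pi / 2])` should give a quarter turn about the z-axis, up to floating-point round-off.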

A logarithm map associates a rotation of the Lie group with an element of the Lie algebra.

$$\log(R) = \frac{\phi}{2\sin(\phi)}\,(R - R^T),\qquad \phi = \operatorname{acos}\!\left(\frac{\operatorname{trace}(R) - 1}{2}\right).$$
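A minimal sketch of the logarithm map in plain Python follows. The function name `so3_log` and the threshold `eps` are assumptions for illustration. It returns the vector $\omega$ rather than the skew-symmetric matrix, and it does not treat rotations with angle close to $\pi$, where $\sin(\phi)$ vanishes and the formula above becomes ill-conditioned.

```python
import math


def so3_log(R, eps=1e-8):
    """Return omega with hat(omega) = log(R), using phi (R - R^T) / (2 sin phi)."""
    cos_phi = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    cos_phi = max(-1.0, min(1.0, cos_phi))   # guard acos against round-off
    phi = math.acos(cos_phi)
    # phi / (2 sin phi) tends to 1/2 as phi -> 0.
    scale = 0.5 if phi < eps else phi / (2.0 * math.sin(phi))
    # Read the three independent entries of the matrix scale * (R - R^T).
    return [scale * (R[2][1] - R[1][2]),
            scale * (R[0][2] - R[2][0]),
            scale * (R[1][0] - R[0][1])]
```

Applied to a quarter turn about the z-axis, the function should return a vector along z whose length is the rotation angle.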

Properties

SE(3) & se(3)

The group of rigid transformations in 3D space is SE(3). An element is represented by a $4 \times 4$ matrix:

$$\begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}.$$
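Composition and inversion of such matrices can be sketched as follows in plain Python. The function names are illustrative assumptions. The inverse uses the closed form $\begin{bmatrix} R^T & -R^T t \\ 0 & 1 \end{bmatrix}$, which follows from $R^{-1} = R^T$ for a rotation matrix.

```python
def se3_matrix(R, t):
    """Assemble the 4x4 matrix [[R, t], [0, 1]] from a 3x3 R and a 3-vector t."""
    return [list(R[i]) + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def se3_compose(A, B):
    """Composition of two rigid transforms is the 4x4 matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def se3_inverse(T):
    """Closed-form inverse [[R^T, -R^T t], [0, 1]] of a rigid transform."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]   # transpose of R
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return se3_matrix(Rt, t_inv)
```

Composing a transform with its inverse should give the $4 \times 4$ identity, which is a quick sanity check for any implementation.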

There are six infinitesimal generators of the Lie algebra se(3), which correspond to differential translations and rotations:

G1=[0001000000000000],G2=[0000000100000000],G3=[0000000000010000], G4=[0000001001000000],G5=[0010000010000000],G6=[0100100000000000].

An element of se(3) is then represented by a linear combination of these generators:

$$u_1 G_1 + u_2 G_2 + u_3 G_3 + \omega_1 G_4 + \omega_2 G_5 + \omega_3 G_6 \in se(3)$$
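To make the linear combination concrete, the sketch below builds the six generators and sums them with the coefficients $(u_1, u_2, u_3, \omega_1, \omega_2, \omega_3)$. It is plain Python, and the names `GENERATORS`, `_generator` and `se3_hat` are illustrative assumptions. The result has the skew-symmetric matrix of $\omega$ in its upper-left block and $u$ in its last column.

```python
def _generator(entries):
    """Build a 4x4 matrix that is zero except for the given {(row, col): value}."""
    G = [[0.0] * 4 for _ in range(4)]
    for (i, j), value in entries.items():
        G[i][j] = value
    return G


GENERATORS = [
    _generator({(0, 3): 1.0}),                  # G1: translation along x
    _generator({(1, 3): 1.0}),                  # G2: translation along y
    _generator({(2, 3): 1.0}),                  # G3: translation along z
    _generator({(1, 2): -1.0, (2, 1): 1.0}),    # G4: rotation about x
    _generator({(0, 2): 1.0, (2, 0): -1.0}),    # G5: rotation about y
    _generator({(0, 1): -1.0, (1, 0): 1.0}),    # G6: rotation about z
]


def se3_hat(u, w):
    """Return u1*G1 + u2*G2 + u3*G3 + w1*G4 + w2*G5 + w3*G6 as a 4x4 nested list."""
    coeffs = list(u) + list(w)
    return [[sum(c * G[i][j] for c, G in zip(coeffs, GENERATORS))
             for j in range(4)] for i in range(4)]
```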

Quaternion

http://people.csail.mit.edu/bkph/papers/Absolute_Orientation.pdf

Angle-axis vs. Quaternion

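| | Angle-axis | Quaternion |
| --- | --- | --- |
| Notation | $x = \alpha e$ | $q = q_0 + q_1 i + q_2 j + q_3 k$ |
| dimension | 3 | 4, $\lVert q \rVert = 1$ |
| angle $\alpha$ | $\lVert x \rVert$ | $q_0 = \cos(\alpha/2)$ |
| $R$ | $R = I + \sin\alpha\, K + (1 - \cos\alpha) K^2$, where $K$ is the cross-product matrix of $e$ | See figure below |
| inverse | $-x = -\alpha e$ | $q^{-1} = q^* = q_0 - q_1 i - q_2 j - q_3 k$ |
| rotate $p$ | $p' = \cos\alpha\, p + \sin\alpha\,(e \times p) + (1 - \cos\alpha)(e \cdot p)\, e$ | $p' = q p q^{-1}$ (Hamilton product) |
| composition | link | $q = q_2 q_1$ (Hamilton product) |
| relationship | | $q = \cos\frac{\alpha}{2} + \sin\frac{\alpha}{2}(e_1 i + e_2 j + e_3 k)$ |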

[Figure: rotation matrix $R$ expressed in terms of the quaternion $q$]
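The two ways of rotating a point listed in the comparison above can be checked against each other with a short sketch. This is plain Python; the function names are illustrative assumptions, quaternions are stored as tuples `(q0, q1, q2, q3)`, and the axis `e` is assumed to have unit length.

```python
import math


def quat_from_angle_axis(alpha, e):
    """Unit quaternion cos(alpha/2) + sin(alpha/2) (e1 i + e2 j + e3 k)."""
    s = math.sin(alpha / 2.0)
    return (math.cos(alpha / 2.0), s * e[0], s * e[1], s * e[2])


def hamilton(a, b):
    """Hamilton product of two quaternions."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def conjugate(q):
    """Conjugate, which equals the inverse for a unit quaternion."""
    return (q[0], -q[1], -q[2], -q[3])


def rotate_quat(q, p):
    """Rotate point p by unit quaternion q via q p q^{-1}."""
    r = hamilton(hamilton(q, (0.0, p[0], p[1], p[2])), conjugate(q))
    return r[1:]


def rotate_angle_axis(alpha, e, p):
    """Rotate p by angle alpha about unit axis e (Rodrigues' rotation formula)."""
    c, s = math.cos(alpha), math.sin(alpha)
    e_dot_p = sum(e[i] * p[i] for i in range(3))
    e_cross_p = (e[1] * p[2] - e[2] * p[1],
                 e[2] * p[0] - e[0] * p[2],
                 e[0] * p[1] - e[1] * p[0])
    return tuple(c * p[i] + s * e_cross_p[i] + (1.0 - c) * e_dot_p * e[i]
                 for i in range(3))
```

With $q = q_2 q_1$, rotating by `hamilton(q2, q1)` should match applying `q1` first and then `q2`, which is the composition order given in the comparison.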

Essential & Fundamental Matrix

Derivation of Essential Matrix

[Figure: two-view geometry with camera centers $O_l$ and $O_r$ and a co-visible 3D point $P$]

As shown in the figure above, we have a left and a right camera with camera centers $O_l$ and $O_r$. $P$ is a 3D point visible in both cameras. We use the following notation: $P_l$ is the 3D coordinate of $P$ in the local space of camera $O_l$, and $P_r$ is the 3D coordinate of $P$ in the local space of camera $O_r$. $T$ is the 3D coordinate of $O_r$ in the local space of camera $O_l$. The rotation and translation from camera $O_l$ to camera $O_r$ are $R$ and $t$ respectively. It follows that

$$\begin{cases} R P_l + t = P_r \\ R T + t = 0. \end{cases}$$

Thus, we obtain $t = -RT$ and $P_r = R(P_l - T)$.

From this, the following equation can be derived:

$$P_r^T R\, T_\times P_l = 0,$$

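because $P_r^T R = (P_l - T)^T$, and $T_\times P_l$ (where $T_\times$ is the skew-symmetric matrix of $T$) equals the cross product of the vectors $T$ and $P_l$. This cross product is perpendicular to both $T$ and $P_l$, and hence to $P_l - T$, so the two terms are orthogonal to each other. $R T_\times$ is exactly the essential matrix, $E = R T_\times$, and $P_r^T E P_l = 0$. $P_l$ and $P_r$ are essentially the two viewing rays in the local spaces of the two cameras.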

$R$ and $T$ together have six degrees of freedom. The essential matrix has five degrees of freedom, because the common scaling is eliminated.
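The derivation above can be verified numerically with a short sketch. This is plain Python; the function names and the example pose and point used with them are hypothetical values chosen only for illustration. It builds $E = R T_\times$, maps a point into the right camera with $P_r = R(P_l - T)$, and evaluates $P_r^T E P_l$.

```python
import math


def hat(v):
    """Skew-symmetric matrix of a 3-vector v."""
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation_y(angle):
    """Rotation by `angle` radians about the y-axis (used only as example input)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def essential_from_pose(R, T):
    """Essential matrix E = R T_x for rotation R and right-camera center T."""
    return matmul(R, hat(T))


def to_right_camera(R, T, P_l):
    """Coordinates of a point in the right camera: P_r = R (P_l - T)."""
    return matvec(R, [P_l[i] - T[i] for i in range(3)])


def epipolar_residual(E, P_l, P_r):
    """Value of P_r^T E P_l, which should vanish for a consistent pair."""
    EP = matvec(E, P_l)
    return sum(P_r[i] * EP[i] for i in range(3))
```

For any rotation, baseline and point, the residual should be zero up to round-off. Scaling $E$ by a nonzero constant leaves the constraint satisfied, which is the scale ambiguity that reduces the degrees of freedom from six to five.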

Derivation of Fundamental Matrix

Rather than using the camera coordinates, the equation above can be extended to image coordinates, because

$$\begin{cases} p_l = K_l P_l \\ p_r = K_r P_r, \end{cases}$$

where $p_l$ and $p_r$ are in homogeneous coordinates and $K_l$ and $K_r$ are the intrinsic matrices of the left and right cameras. Then the equation $P_r^T E P_l = 0$ can be written as

$$p_r^T K_r^{-T} E K_l^{-1} p_l = p_r^T F p_l = 0.$$

$F = K_r^{-T} E K_l^{-1}$ is called the fundamental matrix.
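The step from $E$ to $F$ can be sketched as follows in plain Python. The function names are illustrative, and the sketch assumes the common upper-triangular intrinsic matrix $K = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$, which the notes do not spell out; its inverse is written in closed form.

```python
def hat(v):
    """Skew-symmetric matrix of a 3-vector v."""
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(A):
    """Transpose of a 3x3 matrix."""
    return [[A[j][i] for j in range(3)] for i in range(3)]


def intrinsics_inverse(K):
    """Closed-form inverse of K = [[fx, s, cx], [0, fy, cy], [0, 0, 1]] (assumed form)."""
    fx, s, cx = K[0]
    fy, cy = K[1][1], K[1][2]
    return [[1.0 / fx, -s / (fx * fy), (s * cy - cx * fy) / (fx * fy)],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]


def fundamental_from_essential(E, K_l, K_r):
    """F = K_r^{-T} E K_l^{-1}."""
    K_r_inv_T = transpose(intrinsics_inverse(K_r))
    return matmul(matmul(K_r_inv_T, E), intrinsics_inverse(K_l))
```

Given a pose and a 3D point, projecting into both images with $p_l = K_l P_l$ and $p_r = K_r P_r$ and evaluating $p_r^T F p_l$ should return zero up to round-off.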

The fundamental matrix has rank 2. Since all the epipolar lines pass through the epipole, e.g. $e_l$ in the left image, we have $F e_l = 0$. Hence $F$ has a null space that contains more than the zero vector, so $F$ is singular and its rank is 2.

The $3 \times 3$ homogeneous fundamental matrix has seven degrees of freedom: of its nine entries, the common scaling removes one degree and the constraint $\det F = 0$ (rank 2) removes another.
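For a $3 \times 3$ matrix of rank 2, the null direction can be read off without a general solver, because it is orthogonal to every row. The sketch below, in plain Python with illustrative function names, takes the cross product of each pair of rows and keeps the longest one as a stand-in for the epipole $e_l$ with $F e_l = 0$. It is only a simple illustration and assumes the input really has rank 2; it is not a robust estimator for noisy data.

```python
def cross(a, b):
    """Ordinary cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def right_null_vector(F):
    """Direction e with F e = 0, assuming F is a 3x3 matrix of rank 2."""
    candidates = [cross(F[0], F[1]), cross(F[0], F[2]), cross(F[1], F[2])]
    # The longest cross product comes from the best-conditioned pair of rows.
    return max(candidates, key=lambda v: sum(c * c for c in v))
```

The returned vector is defined only up to scale, as expected for a point in homogeneous coordinates.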

PnP

PnP

Bundle Adjustment

Definition of bundle adjustment

Homography

For any choice of intrinsic parameters, any homography can be realized between two views by some positioning of the two views and a plane.

“The camera is the measurement of angle.”

Matrices in CV

| Matrix | DoF |
| --- | --- |
| Projection matrix | 11 |
| Fundamental matrix | 7 |
| Essential matrix | 5 |
| Homography matrix | 8 |
| Affine transform (2d) | 6 |
| Affine transform (Nd) | (N+1)*N |
| Similarity transform (2d) | 4 |
| Similarity transform (3d) | 7 |
