Computer Vision

Tuesday, May 27, 2025 · 63 min read

Page 1 of 91

1. Introduction: Computer Vision

Computer programs ~~that can intercept images~~ → that can interpret images.
Understanding the representation of images.

Applications

Images & movies
- Example: Surveillance
- Building 3D representations → Growth, Movies
- Motion capture → Pirates of Caribbean, CGI
OCR and Face Recognition
- Scanner → Adobe, Syskonaut
Object Recognition
Special effects & 3D Modelling
Smart Cars
Sports (Cricket, Soccer)
Vision-Based Interaction → MS Kinect, Nintendo Wii
Security and Surveillance

Note:
Image Processing & Computer Vision are different.

Computational Models (Math)

graph TD
    A[Computational Models (Math)] --> B[Algorithm]
    A --> C[Real Images]

Page 2 of 91

2. Image Processing for CV

Image is a function of I, from ( R^2 ) to ( R ):
- Usually, image = smooth function.
- ( f(x, y) ) gives the intensity or value at position ( (x, y) ).
Practically, digital image data is a matrix ( f: [a, b] \times [c, d] \rightarrow [I_{min}, I_{max}] ) with a finite range.

Note:

0 to 255 is atomic measurement (0-1 should be better).

For Color Images

f(x, y) = \begin{bmatrix} r(x, y) \\ g(x, y) \\ b(x, y) \end{bmatrix}

Image Blending:
( g(n) = (1 - \alpha) f_0(n) + \alpha f_1(n) )
dst = cv2.addWeighted(img1, 0.7, img2, 0.3, 0)
Pixel = picture element.

Digital Images

In computer vision, we typically operate on digital images.
Sample the 2D space on a regular grid.
Quantize each sample (rounded to the "nearest integer").
Images are thus represented as a matrix of integers.

Noise in Images

I'(x, y) = \bar{I}(x, y) + N(x, y)

where ( N(x, y) ) = noise.

Types of Noise

Salt & Pepper: random occurrences of black & white pixels.
Impulse: random occurrences of white pixels.
Gaussian: variations in intensity drawn from a Gaussian (normal) distribution.

\text{noise} = \text{random}(3 \times \text{sigma}) \times \sigma

where (\sigma) depends on the range of image.
output = img + noise

Page 3 of 91

Image Filtering

Used to:

Enhance images
Extract information
Detect patterns

Removing Noise from an Image (Moving Average, 1-D)

Approach:
Replace each pixel with an average of all the values in its neighbourhood.

Assumptions

The "true" value of pixels are similar to the true values of pixels nearby.
The noise added to each pixel is done independently.

Weighted Moving Average (1-D)

Can add weights to our moving average.
Usually use an odd number of weighted mask.
- Eg: [1, 4, 6, 4, 1] to all nearby neighbourhood.

For 2-D

Moving Average

Just take some ( k \times k ) or ( 3 \times 3 ) matrix and average it.

Correlation Filtering – Uniform Weights

Say the averaging window size is ( (2k+1) \times (2k+1) ):

Out[i, j] = \frac{1}{(2k+1)^2} \sum_{u=-k}^k \sum_{v=-k}^k F[i+u, j+v]

(Nested Loop, averaging)

Page 4 of 91

Correlation Filtering – Non-uniform Weights

Out[i, j] = \sum_{u=-k}^k \sum_{v=-k}^k H[u, v] F[i+u, j+v]

This is called cross-correlation, denoted ( G = H \bigotimes F ).

( H[u, v] ) is called filter "kernel" or "mask", which is just the matrix of weights in the linear combination.

What makes a good kernel?

Approach 2: Gaussian Filter

H(u, v) = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}

h(u, v) = \frac{1}{2\pi\sigma^2} e^{-\frac{u^2 + v^2}{2\sigma^2}}

This kernel is an approximation of the Gaussian function.

Note:

Variance (( \sigma^2 )) or standard deviation (( \sigma )) determines extent of smoothing.
Also, size of kernel (( u, v )) is not variance.

Impulse Noise

In discrete world, impulse is just a value of I at a single location.

Page 5 of 91

While in continuous world, impulse is an idealized function that is very narrow and very tall so that it has a unit area.

graph TD
    A[discrete pulse] --> B[continuous impulse]

Convolution

Out[i, j] = \sum_{u=-k}^k \sum_{v=-k}^k H[u, v] F[i-u, j-v]

( G = H * F )
Also, ( Img * Impulse = Img )

Properties of Convolution

a) Linear & shift-invariant
b) ( f * g = g * f ) (Commutative)
c) ( (g * f) * h = g * (f * h) ) (Associative)
d) ( f * e = f ) (Identity)

Differentiation:

\frac{d}{dx}(f * g) = \frac{df}{dx} * g

A Big Note:
What would be the computational complexity?
( W \times N \times N ) to get the convolution (filter * image)

So, if some filter is separable, we can exact convolution by:
( W \times N + N \times N + W \times N + N = 2 W \times N + N \times N \ll W N^2 )

Page 6 of 91

It can be done because

Out = H * F = (C * R) * F = C * (R * F)

Boundary Issues

What happens when filter falls off the edge?

Case 1: Create big image
Case 2: Create same image
Case 3: Create smaller image

graph TD
    A[Big Image] --> B[Filter]
    C[Same Image] --> B
    D[Small Image] --> B

Methods (in OpenCV) [for same image]:

cv2.BORDER_CONSTANT : just clips out extra & replace with some color
cv2.BORDER_WRAP : circular wrapping
cv2.BORDER_REPLICATE : last element is replicated throughout
cv2.BORDER_REFLECT : border will be mirror reflection of the border elements

Page 7 of 91

Now, talking about filter for noises like salt & pepper noise, impulse noise:

Approach 3: Median Filter

Select a kernel (odd × odd)
Sort all the numbers
Select mid element
Replace it

It is also called edge preserving because:

graph TD
    A[Input]
    B[Median filter]
    C[Mean filter]
    A --> B
    A --> C

(Mermaid graph approximates the input/median/mean comparison)

Page 8 of 91

Template Matching

Now, we move on to finding properties of pixels locally in the image.

1D (nx) Correlation:

%% Sequence Diagram for 1D Correlation
sequenceDiagram
    participant Signal
    participant Filter
    Signal->>Filter: Correlation
    Filter->>Signal: Normalized cross-correlation (maximum value)

Why maximum value is attained at the point they match completely?
- (w × w) = positive
- (−w × (−w)) = (+w)
- Well, the filters max out each signal when they match exactly.
We can now use this trick to find objects in the image (Template Matching).

Limitation:
- This doesn't work well for different scales & orientations. - So it's just an approach of using filters as templates.

Page 9 of 91

Edge Detection: Gradients

Edges in an image are important; they can easily convey information with reduced images.
Edges = reduced set of pixels or curves that are important elements of the picture.
Edges look like steep cliffs/changes, so look for a neighborhood with strong signs of change.

But there is ambiguity:

Neighborhood size
How to detect change

Change is derivative (an edge is a place of rapid change in image intensity function):

graph TD
    A[Image] --> B[Intensity function]
    B --> C[First derivative]
    C --> D[Edge corresponds to extreme derivative]

To find these peaks (extrema of derivative), we run it through some operators (differentiated operators to be exact).
- Differential operators: when applied to some image, returns some derivatives.
- Model these "operators" as masks/kernels that compute the image gradient function.
- Threshold the gradient function to select the edge pixels.

Page 10 of 91

But, what is gradient?

Gradient is a vector ~~that is made up by the partial derivatives~~ → made up of the partial derivatives.

The gradient of an image:

\nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]

In other words, the gradient points in the direction of most rapid increase in intensity & magnitude of vector is how much it's changing as a function of a unit step in that direction.

For computers, we calculate discrete gradient:

\frac{\partial f(x, y)}{\partial x} \approx \frac{f(x+1, y) - f(x, y)}{1}

Similarly,

\frac{\partial f}{\partial x} \approx \frac{f(n+1, y) - f(n, y)}{dn}

Some Operators

Sobel Operator:

S_x = \frac{1}{8} \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \quad S_y = \frac{1}{8} \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}

It is a joint Gaussian smoothing plus differentiation operator; it is more resistant to noise.

\nabla I = \begin{bmatrix} g_x \\ g_y \end{bmatrix}

Page 11 of 91

Constraints to Edges

Primary Edge Detection Steps

Smoothing derivatives to suppress noise and compute gradient.
Threshold to find "regions" of "significant" gradient.
"Thin" to get localized edge pixels.
Link or connect edge pixels.

So, to do that, we use the edge detection algorithm. Its stages are:

Filter image with derivative of Gaussian.
Find magnitude & orientation of gradient.
- Edge gradient (( \alpha )): $\sqrt{f_x^2 + f_y^2}$

Some Concerns

What is the value of ( \sigma ) in Gaussian filter?
- Canny with ( \sigma = 1 ) (fine features/detailed edge maps)
- Canny with ( \sigma = 2 ) (coarse features/large scale edges)

Page 12 of 91

What is the kernel size to be taken?

Mathematics

( f ) → image function
( h ) → filter (Gaussian)

For 1-D

graph TD
    A[Noisy Image] --> B[Apply filter h]

Apply derivative operator:

\frac{d}{dn} f(n)

(noisy derivative signal)

So, we need to remove noise.

graph TD
    A[h] --> B[f]
    B --> C[Convolution]
    C --> D[Peak detection]

Page 13 of 91

Also,

\frac{d}{dn}(h * f) = \frac{d h}{dn} * f

graph TD
    A[dh/dn] --> B[f]
    B --> C[Convolution]

To find peaks, we take more derivatives:

\frac{d^2 h}{dn^2}

Inverted Mexican hat

\frac{d^2}{dn^2}(h * f)

For 2-D

Gaussian:

h_\sigma(u, v) = \frac{1}{2\pi\sigma^2} e^{-\frac{u^2 + v^2}{2\sigma^2}}

( \frac{\partial}{\partial n} ) becomes tedious.
( \frac{\partial^2 h}{\partial (n, y)^2} ) is Mexican hat.

Page 14 of 91

\nabla^2 h = I_{xx} + I_{yy}

\left( \begin{bmatrix} 1 & -2 & 1 \end{bmatrix} + \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \right) * I

So, we use something called Laplacian operator.

Laplacian Operator

\frac{d^2 h}{dx^2} + \frac{d^2 h}{dy^2}

Centered difference operator (second derivative).

Parametric Models

It can represent a class of instances where each is defined by a value of the parameters.
- e.g., lines, circles, parameterized template

Fitting a Parametric Model

Choose a parametric model to represent a set of features.
Membership criterion is not local.
Computational complexity is important.

Line Fitting

It's difficult to find a line just by looking at the points, so we use voting (.,.) (Hough Transform).
- It is a general technique where we let the features vote for all models that are compatible with H.
- Loop through features, each casting vote for model parameters.
- Look for model parameters that receive a lot of votes.

Page 15 of 91

Hough Space

Image Space: y = mx + b
Hough (parameter) space: b = -x₀m + y₀ (for a point (x₀, y₀))
- This point represents a line.
A point in image space is a line in Hough space.
A line in the image corresponds to a Hough space point.

graph TD
    A[Image Space (points)] --> B[Hough Space (lines)]
    C[Hough Space (point)] --> D[Image Space (line)]

Algorithm:

Let each edge point in image space vote for a set of possible parameters in Hough space.
Accumulate votes in discrete set of bins; parameters with the most votes indicate line in image space.

A little extra:

If line is vertical, m → ∞, so we use polar representation of a line to prevent any mathematical errors.
- d = perpendicular distance from line to origin
- θ: angle the line makes with x-axis
$x \cos\theta + y \sin\theta = d$

Page 16 of 91

The polar representation of image causes the Hough space to have sinusoidal waves for each point.

Basic Hough Transform Algorithm

Initialize ( H[d, \theta] = 0 )
For each edge point in image function ( F ):
- For θ = 0 to π (180):
  - ( d = x \cos \theta + y \sin \theta )
  - ( H[d, \theta]++ )
Find the value(s) of ( (d, \theta) ) where ( H[d, \theta] ) is max.
The detected line is given by ( x \cos \theta + y \sin \theta = d ).

~~Space: O(dθ)~~ → Time: ( O(nθ) )
Space: ( O(k) ) (number of parameters)

There are some extensions to the above algorithm that help with real line segments etc.

I. Using the gradient (knowing the alignment of (x, y) in image space)
- Instead of using θ = 0 to π, use θ = grad. of α (−35 to 55)

II. Give more votes for stronger edges.

III. Vary the sampling of (d, θ) to give more/less resolution.

IV. The same procedure can be used with circles, squares or any other shape.

Page 17 of 91

Circles

Consider a circle with center ((a, b)) and radius (r):

(x_i - a)^2 + (y_i - b)^2 = r^2

Method 1: For a Fixed Radius

If (r) is a fixed radius:

Image Space to Hough Space (Parameter Space)

flowchart LR
    A[Image Space: (x, y)] --> B[Parameter Space: (a, b)]

The left diagram shows image space, the middle shows parameter space (with dense votes), and the right shows the center found as the intersection for a fixed radius.
The diagrams in the original note are not easily representable in Mermaid, so please refer to the original images for specifics.
(INSERT IMAGE)

Methods (for Unknown Radius)

But in the real world, we don't know the radius, so:

flowchart LR
    x_a[x_a] --- y_b[y_b]
    x_a --> C[Parameter b]
    y_b --> D[Parameter a]

The parameter space increases drastically (RANSAC slows Hough).
(INSERT IMAGE)

Method 3: Unknown (r), Use Gradient (Reduces Search Space)

Vote only along a line or circle (when gradient is known).

flowchart LR
    X(x, y) --> R[Radius known]
    X --> S[Radius unknown: circle]
    R --> O[Vote on line]
    S --> P[Vote on circle]

Cases:
1. If (r) is known: vote along a line
2. If (r) is unknown: vote along a circle
(INSERT IMAGE for the gradient-based voting)

Page 18 of 91

Algorithm

For every edge pixel (x, y):
    For each possible radius value r:
        For each possible gradient direction θ:
            // If θ is the estimated gradient
            a = x + r * cos(θ)
            b = y + r * sin(θ)
            H(a, b, r) += 1

Generalized Hough Transform

Non-Analytic Models

Parameters express variation in position or scale or fixed but arbitrary shape (that was then).

Visual Codeword-Based Features

Not edges, but detected templates learned from models (this is "now").

Returning to Generalized Hough Transform:

In this, voting is done via building a Hough table.

How to build Hough table?

Page 19 of 91

Training

At each boundary point, compute displacement vector: $\vec{r} = c - p_i$
Measure the gradient angle at the boundary point.

flowchart TD
    C[Reference Point (c)]
    P1[p1]
    P2[p2]
    C -- r1, θ1 --> P1
    C -- r2, θ2 --> P2

(INSERT IMAGE for the displacement diagram)

Recognition

At each boundary point, measure the gradient angle $\theta$ .
Look up all displacements in $\theta$ displacement table.
Vote for a center at each displacement.

Algorithm

If orientation is known:

do recognition point.

Else:

For each edge point:
    For each possible master θ':
        Compute gradient direction θ
        Nearest θ': θ - θ'
        For θ', retrieve displacement vectors to vote for reference point.

Peak in this table (over $(x, y, \theta)$ ) is reference point with most supporting edges.

Page 20 of 91

Modern Approach

Instead of using indexed displacements by gradient orientation, index by visual codeword.

flowchart LR
    Type1[Type] --> P[•] --> Type2[Type]
    P --> Recognize[Recognize: Car]

(INSERT IMAGE for codeword diagram)

Training

Build codebook of patches around extracted (interest points) using clustering.
Map the patch around each interest point to closest codebook entry.
For each codebook entry, store all displacements relative to object center.

Recognition

Find all the "interest point" patches & identify visual codeword.
Look up all the displacements in codebook.
Vote for center at each displacement.

Page 21 of 91

Aliasing

Sampled Representation

Sampling: How to store & compute with continuous functions.
If we do ~~undersampling~~ → undersample,
information is lost.
But, it was indistinguishable from lower frequency.

flowchart LR
    A[Original Signal] --> B[Sampling] --> C[Reconstructed Signal]

(INSERT IMAGE for signal and sampling diagrams)
"Recall" guesses what the function did in between (reconstruction).

Sampling in digital audio
- $A/D$ converter → [digital samples] → $D/A$ converter → [audio signal]

Page 22 of 91

Aliasing

Happens when there are not enough samples.
This makes ~~causes~~ → different signals to become indistinguishable when sampled.

How to Minimize This?

Anti-Aliasing

Sample more often.
- But this can't go on forever.
Make the signal less "wiggly"
- Cut off higher frequencies
- Will lose information
Use low pass filters (if image is stored already).

Methods

Impulse Train

Comb function (in 1D):

\text{comb}(x) = \sum_{k=-\infty}^{\infty} \delta(x - kM)

$\delta(x) = 1$ if $x = 0$ , otherwise $0$

graph LR
    A[comb(x)] --> B[Impulse spikes at multiples of M]

(INSERT IMAGE for comb diagrams)

Fourier Transform (scaling property)

(INSERT IMAGE for FT of comb)

Page 23 of 91

Impulses in 2D

\text{comb}_{M,N}(x,y) = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} \delta(x - kM, y - lN)

Fourier:

\text{comb}_{1/M, 1/N}(u, v)

Sampling Low Frequency

Diagram: function $f(x)$ and its sampled & reconstructed forms.

graph LR
    A[f(x)] --> B[sampled] --> C[reconstructed]

(INSERT IMAGE for function and sampling diagrams)
Multiply by $\text{comb}_M$ :
- $f(x) \cdot \text{comb}_M(x)$

Convolution in frequency:
- $f(w) * \text{comb}_M(w)$

If there is no overlap in $w$ , then the original signal can be recovered from its samples by low-pass filtering.

Page 24 of 91

Sampling High Frequency

Apply an anti-aliasing filter.
Then do comb function multiplication.

~~(cuts)~~

Aliasing in Images

Resizing image $h \to \frac{1}{2}h$
- Use Gaussian filter (anti-alias)
- Then do 1 of row & column (sub-sampling)
(INSERT IMAGE for sub-sampling process)

Page 25 of 91

Image Pyramids

There are two kinds of image pyramids:
1. Gaussian Pyramid
2. Laplacian Pyramid
We use these pyramids when we need to work with images of different resolutions of the same image.

Generating a Gaussian Pyramid

Blur (convolve with Gaussian)
Downsample (reduce image size by half)
Upsample (double image size) - optional

Generating a Laplacian Pyramid

graph LR
    L0[Gaussian L0] --> L1[Gaussian L1] --> L2[Gaussian L2] --> L3[Gaussian L3]
    L1 --> Lap1[Laplacian L1]
    L2 --> Lap2[Laplacian L2]
    L3 --> Lap3[Laplacian L3]

(INSERT IMAGE for pyramid diagrams)

Page 26 of 91

Reduce & Expand

Reduce:

[1, 4, 6, 4, 1]/16

(5-tap filter)
Expand:

[1/8, 1/2, 3/4, 1/2, 1/8]

(3-tap filter)
(INSERT IMAGE for reduce/expand diagrams)

Page 27 of 91

Fourier Transform & Image Representation

Earlier times in 80s, working on image was studying frequency stuff and simpler. Still, some things need to be understood for CS in 2020.

Fourier Transform

Every image can be decomposed into a set of things that describe it.

An important image processing tool used to decompose an image into its sine and cosine components.

Decomposing on Image Basis Sets

Let $B = \{v_1, v_2, \ldots, v_n\}$ be a finite subset of a vector space over a field (such as real or complex numbers). Then is a basis if it satisfies:

Linear Independence

For all $a_1, \ldots, a_n \in F$ , if $a_1v_1 + \ldots + a_nv_n = 0$ , then necessarily .

graph TD
    v1[v1]
    v2[v2]
    a1[a1]
    a2[a2]
    a1 --> v1
    a2 --> v2
    v1 --linear combination--> v2

(INSERT IMAGE for basis/independence diagram)

Spanning Property

For every $w$ in $V$ it is possible to write $w = a_1 v_1 + \ldots + a_n v_n$

We need a good basis set. One is Fourier basis set.

Fourier Theory

Any periodic function can be written as a sum of cosines and sines of different frequencies.

g(t) = A_1 \sin(\omega_1 t + \phi_1) + A_2 \sin(\omega_2 t + \phi_2) + \ldots

Example: Square wave

A \sum_{k=1}^{\infty} \frac{\sin(2\pi k t)}{k}

Page 28 of 91

Decomposing a Signal

To decompose a signal, we want to understand the frequency of our signal; phase is irrelevant for CV if we are not interested in reconstructing the image.

flowchart LR
    f(x)[Image f(x)] --> FT[Fourier Transform] --> F(u)[F(u) = A \sin(\omega x + \phi)]

(INSERT IMAGE for FT block diagram)

Fourier Transform Equations

F(u) = \int_{-\infty}^{\infty} f(x) e^{-i2\pi ux} dx \quad \text{(Fourier Transform)}

f(x) = \int_{-\infty}^{\infty} F(u) e^{i2\pi ux} du \quad \text{(Inverse Fourier Transform)}

$F(u)$ gives us the basis set.

How?

\int_{-\infty}^\infty \sin(at + d)\sin(bt + d)dt = 0, \quad \text{if } a \neq b

(Independence)

$f(x) e^{2\pi iux}$ gives the winding machine exactly like signal beam.
Integrating it (by dividing by $N$ ) gives center of mass, and dividing by $N$ gives...

Page 29 of 91

Discrete Fourier Transform (DFT)

F(k) = \frac{1}{N} \sum_{x=0}^{N-1} f(x) e^{-i2\pi kx/N}

Where $k$ is cycles per period (periodic signal).

F(u, v) = \sum_{x=-\infty}^{\infty} \sum_{y=-\infty}^{\infty} f(x, y) e^{-i2\pi (ux+vy)} dx dy

F(k_n, k_y) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) e^{-i2\pi (k_n x + k_y y)/N}

Thus, we get the basis vector.

Fourier Transform & Convolutions

Let $g = f * h$ :

G(u) = \int_{-\infty}^{\infty} g(n) e^{-i2\pi un} dn

= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t) h(n-t) e^{-i2\pi un} dz dn

= \int_{-\infty}^{\infty} \left[f(t) e^{-i2\pi ut} dt \right] \int_{-\infty}^{\infty} \left[h(n') e^{-i2\pi u n'} dn' \right]

G(u) = F(u) H(u)

Convolution in spatial domain $\Leftrightarrow$ multiplication in frequency domain

Page 30 of 91

g = f * h \iff G = F \cdot H \\ g = f \cdot h \iff G = F * H

So, it's not uncommon to calculate $F$ & $H$ , multiply them & then use IFT to get $g$ .
FFT is Fast Fourier Transformation.

Smoothing & Blurring

h(x) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{x^2}{2\sigma^2}} \quad \text{(Gaussian)}

H(u) = e^{-\frac{1}{2}(2\pi u \sigma)^2} \quad \text{(also Gaussian)}

But, $\sigma$ is multiplied instead of dividing.

G(u) = F(u) H(u)

Thus, attenuates higher frequency. (Use IFT)

Properties

Linearity:

c_1 f(x) + c_2 g(x) \to c_1 F(u) + c_2 G(u)

Scaling:

f(ax) \to \frac{1}{|a|} F\left(\frac{u}{a}\right)

Differential:

\frac{d^n f(x)}{dx^n} \to (i 2\pi u)^n F(u)

Page 31 of 91

Camera Models & Views

New Image

A new image is a 2D representation of a 3D plane.

How to form an image?

a) Pinhole Perspective

flowchart TD
    Object --> Aperture[Control Hole] --> Film
    Film --> Inverted[Inverted Image]

Aperture size $\downarrow$ → Crisper image
If hole is too small, diffraction occurs.
$\uparrow$ Aperture → Blurry image

b) Thin Lens

flowchart TD
    P[p] --> Lens --> P'[p']
    P --> Y[y]
    P' --> Y'[y']

$-\frac{y'}{|z'|} = \frac{y}{|z|}$

Page 32 of 91

Focus

flowchart TD
    Object --> Lens --> Image

Depth of Field (DOF)

Wide aperture $\Rightarrow$ lower DOF
Small aperture $\Rightarrow$ higher DOF

flowchart TD
    Object --> Lens[Wide Aperture] --> Image
    Object --> Lens2[Small Aperture] --> Image2

(INSERT IMAGE for DOF diagrams)

Field of View (Zoom)

$\uparrow$ Focal length $\Rightarrow$ $\downarrow$ Field of view but $\uparrow$ magnification

\text{FOV} = \phi = \tan^{-1}\left(\frac{d/2}{f}\right)

$d$ = sensor size or sensor width

flowchart TD
    d[d] --> Lens --> phi[φ]

(INSERT IMAGE for FOV diagram)

Page 33 of 91

Zooming & Moving

Zooming & moving are not the same.

Problems of Lenses

Lenses are not perfectly thin

1. Geometric Distortion

flowchart LR
    A[No distortion] --- B[Pin cushion] --- C[Barrel]

No distortion
Pin cushion
Barrel

2. Radial Distortion

3. Chromatic Aberration

flowchart LR
    Lens((Lens)) -- Red --> FocalPointR[Red Focus]
    Lens -- Blue --> FocalPointB[Blue Focus]

4. Vignetting

Dark edges around the corners in two lens systems.

Page 34 of 91

Perspective Imaging

~~Distorting~~ → Perspective Imaging

Modeling Projection – Coordinate System

flowchart LR
    Camera(Camera) -- x-axis -->
    PF((Image Plane))
    A((A)) -- (x,y,z) --> PF
    B((B)) -- (x',y',-d) --> PF

Assuming +z to be image plane (virtual).

\triangle ADE \sim \triangle ABC

\frac{x'}{X} = \frac{y'}{Y} = \frac{-d}{Z}

\Rightarrow x' = -d \frac{X}{Z}, \quad y' = -d \frac{Y}{Z}

Distant objects are smaller.
Now, ( x' ) & ( y' ) is not a linear transformation.

So add one more coordinate:

((x, y) \rightarrow \begin{bmatrix} x \ y \end{bmatrix})
((x, y, z) \rightarrow \begin{bmatrix} x \ y \ z \end{bmatrix})

Converting from homogeneous coordinates:

for ( z=1 ): ( x' = x/w, \quad y' = y/w )
for ( z=0 ): ( x'=x/w, \quad y'=y/w, \quad z'=z/w )

Page 35 of 91

3D Points to 2D Image Form

flowchart LR
    A((x, y, z, 1)) -- Matrix_H --> B((x', y', w'))
    B -- DivideByW --> C((x'/w, y'/w))

Here now, H is homogeneous.

In perspective imaging, all parallel lines meet at a vanishing point.
- e.g., train tracks

Mathematically, a line in 3-space is represented as:

x(t) = x_0 + at \\ y(t) = y_0 + bt \\ z(t) = z_0 + ct

x'(t) = \frac{fx_0 + fat}{z_0 + ct} \\ y'(t) = \frac{fy_0 + fbt}{z_0 + ct}

As ( t \rightarrow \infty ), ( x'(t) \rightarrow \frac{fa}{c} ), ( y'(t) \rightarrow \frac{fb}{c} ).

So, parallel (do not) lines converge to the same point.

Other Projection Operators

3D point position:

Perspective: ((x, y, z) \rightarrow \left(\frac{fx}{z}, \frac{fy}{z}\right))
Weak Perspective: ((x, y, z) \rightarrow \left(\frac{fx}{z_0}, \frac{fy}{z_0}\right))
Orthographic: ((x, y, z) \rightarrow (x, y))

Page 36 of 91

Stereo Geometry (Geometry of Multiple Views)

Visit: vision.middlebury.edu/stereo

The depth of a scene point along with structure is not directly accessible in a single image.

Because we have information when connecting an image from 3D to 2D.

flowchart LR
    Eye1((P1))
    Eye2((P2))
    ScenePoint((Scene Point))
    Eye1 -- line --> ScenePoint
    Eye2 -- line --> ScenePoint
    OC((Optical Center))
    Eye1 -- OC
    Eye2 -- OC

The trick to solving this would be to look at it through a different perspective, e.g., two eyes in animals (us).

But, can I find depth with a single eye? (Can I mean human?)

Hey dumb-head, there are other factors such as

shading, shadow, texture, focus/defocus, motion

Also, we have two eyes:

Stereo:

The image from one eye is different than the image from the other eye.
Think of shape from motion or from multiple views.
Infer 3D shape of scene.

Page 37 of 91

Basic Geometry

flowchart LR
    Left((Optical Axis L)) -- line --> P((P))
    Right((Optical Axis R)) -- line --> P
    Left -- line --> xL((x_L))
    Right -- line --> xR((x_R))
    ImagePlane((Image Plane))
    Left -- ImagePlane
    Right -- ImagePlane

Let ( h_2 = x_L ), ( h_2 = x_R )

Now, ((c_L, p_L, h_2)) & ((c_R, p_R, c_R)) are similar.

So,

\frac{B - x_L + x_R}{B} = \frac{z - f}{z}

\Rightarrow z = \frac{fB}{x_L - x_R}

Where ( x_L - x_R = ) disparity, and ( \text{disparity} \propto \frac{1}{\text{depth}} )

Now, we need to find disparity but we don't know either ( x_L ) or ( x_R ).

We will talk about epipolar geometry.

Page 38 of 91

Epipolar Constraint

flowchart LR
    O((O)) -- line --> P((P))
    O -- line --> Pp((P'))
    EpipolarLine((Epipolar line))
    Pp -- EpipolarLine

Given a calibrated stereo rig, the set of possible matches for the point ( p ) is constrained to lie on the associated epipolar line ( l' ).

How do we solve the correspondence problem?

Matching point in ( O' )

No, there are still a lot of points on the line.

We need other "soft" constraints to help identify corresponding points:

Similarity
Uniqueness
Ordering
Disparity gradient is limited

Similarity

Assume image regions for the matches are similar in appearance.

For each pixel/window in the left image:

Compare with every pixel/window on the same epipolar line in right image.
Pick position with minimum match cost (SSD, norm, correlation).

Page 39 of 91

More Constraints

Cross-correlation might not work well. It depends a lot on window size.

2. Uniqueness

No more than one match in right image for each window in left image.

3. Ordering

abc in left image is abc in right image.
Won't always hold (e.g., occluded transparent object).

Example: Close one eye each time, with both index fingers in front of nose and in a line.

There are a lot of algorithms solving it, but there are still some challenges that stereo isn't possible to solve:

Low-contrast, textureless image regions
Occlusions
Violation of brightness constancy (e.g., specular reflections)
Larger baseline
Camera calibration errors

Algorithms

Scanline (DP) - Does not have stretching models.
Full 2-D grid as energy minimization:

E_{data} = \sum_{i} C_{i}(D(i)) + \nu_{i} C_{i}(D(j))^2

E_{smooth} = \sum_{\text{neighbors } i,j} \rho(D(i) - D(j))

\min E = \alpha E_{data}(I_1, I_2, D) + \beta E_{smooth}(D)

Page 40 of 91

Geometric Camera Calibration

Coordinate Transforms

flowchart LR
    WorldCoords[World Coords]
    CameraCoords[Camera Coords]
    FilmCoords[Film Coords]
    PixelCoords[Pixel Coords]
    WorldCoords -- Pattern/Translation --> CameraCoords
    CameraCoords -- Projection --> FilmCoords
    FilmCoords -- PixelScale --> PixelCoords

x' = \beta \frac{X}{Z} \\ y' = \beta \frac{Y}{Z}

We want to use the camera to get the true x, y, z of the world. So we need the relationship between coordinates of the world & coordinates in the image.

I. Extrinsic Camera Parameters

From arbitrary world coordinate system to the camera's 3D coordinate system. (Camera pose)

II. Intrinsic Camera Parameters

From the 3D coordinates in the camera frame to the 2D image plane via projection.

flowchart LR
    World((World Coord))
    Camera((Camera Coord))
    World -- extrinsic --> Camera

A rigid body has 6 degrees of freedom (cube).

flowchart TD
    Cube((Cube))
    Cube -- 6 DOF -->

Translation

A point A in world coordinate is translated to point B in camera coordinate with a relation:

^B P = ^B O_A + ^A P

(Origin A expressed in B frame)

Page 41 of 91

Matrix Multiplication for Transformation

We can express the same equation using matrix multiplication with the help of homogeneous coordinates (i.e., linear transformation):

\begin{bmatrix} ^B P \\ 1 \end{bmatrix} = \begin{bmatrix} I & ^B O_A \\ 0^T & 1 \end{bmatrix} \begin{bmatrix} ^A P \\ 1 \end{bmatrix}

^B P = ^B O_A + ^A P

Rotation

^B P = [i_B \quad j_B \quad k_B] \begin{bmatrix} x \\ y \\ z \end{bmatrix}

Also, by homogeneous coordinates:

\begin{bmatrix} ^B P \\ 1 \end{bmatrix} = \begin{bmatrix} ^B R_A & 0 \\ 0^T & 1 \end{bmatrix} \begin{bmatrix} ^A P \\ 1 \end{bmatrix}

Commutativity: ❌

Now, let's do a total rigid transformation:

A point ( A ) in system is rotated and then translated (offset) in the A system to get B system.

Page 42 of 91

Using Homogeneous Coordinates for Transformation

flowchart LR
    AP[Part in Camera Form] -- Transformation --> BP[Part in World Form]
    TransformationMatrix[Extrinsic Parameter Matrix]
    AP -- TransformationMatrix --> BP

6 degrees of freedom

Intrinsic Camera Parameters

flowchart LR
    Camera((Camera))
    ImagePlane((Image Plane))
    Camera -- projection --> ImagePlane

x' = u = \alpha \frac{x}{z} + v_0 \\ y' = v = \beta \frac{y}{z} + v_0

Origin of camera coordinate system is at the corner of the viewing and not at its center. So, ( u_0 ) & ( v_0 ), 4 degrees of freedom.

(Principal point)

Page 43 of 91

Skew in Camera Coordinate System

Also, the camera coordinate system may be skewed due to some manufacturing error, so angle ( \theta ) b/w two image axes is not equal to 90°.

u = \alpha \frac{x_1}{z} + \alpha \cot(\theta) \frac{y_1}{z} + u_0 \\ v = \frac{\beta}{\sin\theta} \frac{y}{z} + v_0

So, 5 degrees of freedom.

K = \begin{bmatrix} \alpha & \alpha \cot\theta & u_0 \\ 0 & \beta / \sin\theta & v_0 \\ 0 & 0 & 1 \end{bmatrix}

\vec{p'} = K \vec{c_p}

\Rightarrow \text{Actual point} = \frac{p'}{z}

Combining Extrinsic & Intrinsic Calibration Parameters

\vec{p'} = K \begin{pmatrix} C^R_W & C^t_W \end{pmatrix} \vec{p}_w

\vec{p'} = M \vec{p}_w

(pixel coords ← world coords)

Page 44 of 91

Camera Calibration

Where:

M = \begin{bmatrix} f & s & x_0 & x_1 \\ 0 & af & y_0 & y_1 \\ 0 & 0 & 1 \end{bmatrix}

Intrinsics | Projection | Rotation | Translation

Calibration using Known Points (Direct Linear Transformation)

Place a known object in the scene
Identify correspondence between image and scene
Compute mapping from scene to image

~~Remove scale/degenerate cases~~

Now, we need to find all 11 degrees of freedom. It is done using SVD tricks.

Find the ( m ) that minimizes ( |Am| ) subject to ( |m|=1 )
Let ( A = UDV^T ) (SVD, diagonal)

Translation: minimizing ( |UDV^T m| )

But, ( |UDV^T m| = |DV^T m| ) & ( |m|=1 \implies V^T m ) is unit vector.

Thus, minimize ( |DV^T m| ) subject to ( |V^T m|=1 )

Let ( y = V^T m ), so ( |Dy| ) is minimized s.t. ( |y|=1 ).

Page 45 of 91

Eigen Decomposition

Also, D is diagonal with decreasing values. So, ( |Dy| ) minimum is when ( y = (0,0,...,1)^T ).

Since ( y = V^T m ), ( m = Vy ) as ( V ) is orthogonal (( V^{-1} = V^T ))

This gives last column in V (eigen decomposition).

The final step is:

Given ( Am=0 ), find the eigenvector of ( A^T A ) with smallest eigenvalue, that's ( m ).

After finding ( M ), we should be able to find things like the camera center from ( M ).

Two ways:

Pure way
Easy way

M = [Q \mid m_4], \quad Q = 3 \times 3, \quad m_4 = 3 \times 1 \\ C = -Q^{-1} m_4

II. Multi-plane Calibration

Advantages:

Requires a plane only (checker-board)
Don't need to know positions/orientations
Direct code available in OpenCV

Zhengyou Zhang (Goal of Calibration)

Page 46 of 91

Multiple Views

Image to Image Projections

I. Affine Transformation

\begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}

6 degrees of freedom
Transformation: 2
Rotation: 1
Scale: 1
Shear: 2

II. Homography

\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} w x' \\ w y' \\ w \end{bmatrix}

(Divide by ( w ))

Homography: 8 degrees of freedom (This transformation maps between any two planes)

Homographies and Mosaics

Rather than thinking of this as a 3D reprojection on 2D plane, we usually think of it as a 2D image warp from one image plane to another image plane.

flowchart LR
    Plane1 -- warp --> Plane2

Page 47 of 91

Panoramas and Homography

This gives an idea of panoramas.

How to Stitch Together a Panorama

Take a sequence of images from the same position, rotate the camera about its optical center.
Compute transformation b/w second image plane & first.
Transform the second image to overlap with the first.
Blend the two together to create a mosaic (repeat if more than two).

flowchart LR
    Camera[Camera]
    Plane1[Image Plane 1]
    Plane2[Image Plane 2]
    Camera --rotate--> Plane1
    Camera --rotate--> Plane2
    Plane1 --transform--> Plane2

The mosaic has an anatomical overlap/interpolation in 3D.
- The images are reprojected onto a common plane.
- The mosaic is formed on this plane.

For views upto 180°, beyond that need a sequence that bends the way or map onto a different surface, say, a cylinder.

How to Solve Homography?

Since homography is transformation b/w any two planes:

P' = H P

\begin{bmatrix} w x' \\ w y' \\ w \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}

(8 degrees)

Page 48 of 91

Solving for Homography

Requires at least 4 points for 8 eqs.

Non-homogeneous way

Ah = b \qquad (h \text{ is vector of 8 degrees of freedom})

Solve for ( h ) by min( ||Ah - b||^2 ) using least-squares.

Homogeneous way

Solve using SVD.

Applications of Homography (Not mosaics, dummy dumb)

i) Video motion (computation of camera motion)

ii) Image rectification

Rectifying scanned views (like CamScanner does)

flowchart LR
    Skewed[Skewed Image]
    Rectified[Rectified Image]
    Skewed --rectify--> Rectified

How do we do this?

Page 49 of 91

In case of homogeneous coordinates

Line joining two points ( p_1 ) & ( p_2 ) is given by:
$p_1 = [x_1, y_1, z_1] \\ p_2 = [x_2, y_2, z_2]$

Projective line (image plane)

A line is a plane of rays through the origin, defined by the normal for

l = (a, b, c)

All rays ( (x, y, z) ) satisfying:

a x + b y + c z = 0

Point & Line Duality

A line ( l ) is a homogeneous 3-vector
$l = \alpha p_1 + \beta p_2 \\ l = p_1 \times p_2$

%% Flowchart: Projective points, lines and intersection
graph TD
    P1([p1])
    P2([p2])
    L([l = p1 x p2])
    L1([l1])
    L2([l2])
    P([p = l1 x l2])
    P1 -- "join" --> L
    P2 -- "join" --> L
    L1 -- "intersection" --> P
    L2 -- "intersection" --> P

%% Block Diagram: Incidence of point p on lines l1 and l2
graph LR
    l1([l1])
    l2([l2])
    p([p])
    l1 -- "incident" --> p
    l2 -- "incident" --> p

(go to top now)

Page 50 of 91

Essential Matrix

Consider again the stereo problem, a 3D world point:

%% Sequence diagram: Stereo camera geometry
sequenceDiagram
    participant O_L as O_L (Left Camera)
    participant O_R as O_R (Right Camera)
    O_L->>P: projects to p_L
    O_R->>P: projects to p_R

In case of calibrated cameras:

x'_c = R x_c + T

T \times x' = T \times (R x + T) \\ T \times x' = T \times R x \\ x' \cdot (T \times R x) = x' \cdot (T \times R x) = 0

0 = x' \cdot (T \times R x)

Let, ( E = T \times R )

x'^T E x = 0

x' E x^T = 0

Called essential matrix.

( E x ) is a vector ( l ) (line)

p \times l = \text{line}

Thus, focuses a point, lies on an epipolar line.

Page 51 of 91

Fundamental Matrix

Motivation:

Estimate epipolar geometry from a set of point correspondences between two uncalibrated cameras.

We know,

p_{im} = K_{int} \cdot p_c \\ p_{im} = K_{int} p_c

p_c = K_{int}^{-1} p_{im} \\ \text{(} K_{int} \text{ is invertible)}

Camera (\rightarrow) Image

This gives the ray in space (as it is homogeneous).

And for two cameras:

p_{c,left} = K_{int,left}^{-1} p_{im,left} \\ p_{c,right} = K_{int,right}^{-1} p_{im,right}

p_{c,right}^T E p_{c,left} = 0 \qquad (\text{look at it})

(K_{int,right}^{-1} p_{im,right})^T E (K_{int,left}^{-1} p_{im,left}) = 0

p_{im,right}^T (K_{int,right}^{-T} E K_{int,left}^{-1}) p_{im,left} = 0

p_{im,right}^T F p_{im,left} = 0

( F ) is the epipolar line in the image associated with ( p_{im,left} ) and vice versa (( l' = F p )).

Page 52 of 91

Properties of Fundamental Matrix

Epipoles found by ( Fp = 0 ) & ( F^T p = 0 )
- A point which exists on every epipolar line.
- (In parallel image planes), epipoles are at infinity.
( F ) is singular (mapping from homogeneous 2-D point to 1-D family is rank 2).

If we estimate fundamental matrix from correspondences in pixel coords, can we construct epipolar geometry without intrinsic or extrinsic parameters.

How to find it? (FM gives relation b/w any two views that give an idea of video motion.)

Each point generates one constraint or single eqn on ( F ):
$p'^T F p = 0$
Requires at least 8 points to solve for ( F ) (9 variables, divide 1 variable).
Now, do OLS or orthogonal least squares.
8 variables left, 8 equations.

Most famous: SVD approach (very minimal error) $F = \text{find eigvec of } A^T A \text{ with minimum eigenvalue or OLS}$
To enforce SVD (same rank of ( F )): $F = U D V^T$

Page 53 of 91

Applications of Stereo

Stereo Rectification
Photo Synthesis

Summary

For 2 views, there is a geometric relationship between rays in one view to rays in the other (\rightarrow) epipolar geometry.
These relationships can be captured algebraically as well:
- ~~calibrated Essential Matrix~~ → calibrated: Essential Matrix
- ~~uncalibrated Fund. matrix~~ → uncalibrated: Fund. matrix
This relation can be estimated from point correspondences.

Another problem?

How to find these point correspondences?

Manually
Math

We will use features to find these.

\begin{bmatrix} u_1' \\ v_1' \end{bmatrix} = F \begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix} = 0

\begin{bmatrix} u_1' & v_1' & 1 \\ u_2' & v_2' & 1 \\ \vdots \\ u_n' & v_n' & 1 \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{bmatrix} = 0

A n = 0

F = U D V^T

Page 54 of 91

Computer Vision

Features & Matching

Introduction to "features":

To ~~get a transformation of one image to another image, we need corresponding points and we do that using "local features".~~ → get a transformation from one image to another, we need corresponding points, and we do that using "local features".

Goal: Find points that are:

Found in other images
Found precisely
Found reliably

Characteristics of good features

Precision/Repeatability
The same feature can be found in several images despite geometric & photometric transformations.
Saliency/Matchability
Each feature has a distinctive property.
Compactness/Efficiency
Many fewer features than image pixels.

Page 55 of 91

Finding Corners

+----------+     +---------+     +---------+
|  Flat    |     |  Edge   |     | Corner  |
|          |     |         |     |         |
|          |     |         |     |         |
+----------+     +---------+     +---------+
 No change      No change      Significant change
 in all         along the      in all direction
 directions     edge direction with small shift

(INSERT IMAGE: schematic diagrams for flat/edge/corner)

E(u, v) = \sum_{x, y} w(x, y) [I(x+u, y+v) - I(x, y)]^2

How, in this eqn, we want to look for small shifts (localization).

Second-order Taylor expansion of ( E(u, v) ) about (0, 0):

For 1D:

F(s) \approx F(0) + s \frac{df}{dx} + \frac{1}{2} s^2 \frac{d^2 f}{dx^2}

For 2D:

E (u, v) \approx E (0, 0) + [u v] [\begin{array}{c} E_{u} (0, 0) \\ E_{v} (0, 0) \end{array}] + \frac{1}{2} [u v] [\begin{array}{cc} E_{u u} (0, 0) & E_{u v} (0, 0) \\ E_{v u} (0, 0) & E_{v v} ( \end{array}

Page 56 of 91

Now,

E_u(0, v) = \sum_{x, y} w(x, y) [I(x+u, y+v) - I(x, y)] I_x(x+u, y+v)

E_v(u, 0) = \sum_{x, y} w(x, y) [I(x+u, y+v) - I(x, y)] I_y(x+u, y+v)

E_{uu}(0, v) = \sum_{x, y} 2 w(x, y) I_x(x+u, y+v) I_x(x+u, y+v)

E_{uv}(u, v) = \sum_{x, y} 2 w(x, y) I_x(x+u, y+v) I_y(x+u, y+v)

E_{vv}(u, v) = \sum_{x, y} 2 w(x, y) I_y(x+u, y+v) I_y(x+u, y+v)

( E_u(0, v) = I_y )
( E_v(u, 0) = I_x )

Putting all in eqn, we get:

E (u, v) \approx [u v] [\begin{array}{cc} \sum_{x, y} w (x, y) I_{x}^{2} (x, y) & \sum_{x, y} w (x, y) I_{x} (x, y) I_{y} (x, y) \\ \sum_{x, y} w (x, y) I_{x} (x, y) I_{y} (x, y) & \sum_{x,} \end{array}

Let

M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

So,

E(u, v) \approx [u~v] M \begin{bmatrix} u \\ v \end{bmatrix}

Investigating it, (consider a slice of ( E(u, v) )): $\sum I_x^2 u^2 + 2 \sum I_x I_y u v + \sum I_y^2 v^2 = K$

Page 57 of 91

The eqn is of ellipse.

The gradient vectors as a set of (dx, dy) points with a center of mass defined as being at (0, 0) fit an ellipse to the set of pairs via scatter matrix.

Now, again, consider the ( M ) matrix (second moment):

M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

Case I

Gradients are either horizontal or vertical but not slanted $I_x I_y = 0 \\ M = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$ for ellipse

Case II

Gradient is only horizontal or only vertical
- either ( \lambda_1 = 0 ) or ( \lambda_2 = 0 )
- (Not a good corner!)

Note:
( \lambda_1 ) & ( \lambda_2 ) determine the corner based on the property of diagonalization of ( M ) matrix.
How? By concluding them as eigenvalues.

Page 58 of 91

( \lambda_1 ) & ( \lambda_2 ) are eigenvalues of ( M ).

Case III

( \lambda_1 ) & ( \lambda_2 ) are too small
- Flat region

Case IV

( \lambda_1 \gg \lambda_2 ) or ( \lambda_2 \gg \lambda_1 )
- Edge

Case V

( \lambda_1 ) & ( \lambda_2 ) are too large
- ( \lambda_1 \approx \lambda_2 )
- Corner

But, Harris algorithm makes use of "response function":

R = \det(M) - \alpha \cdot \text{trace}(M)^2 \\ R = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2

and ( \alpha ) constant (( 0.04 ) to ( 0.06 ))

So,

( |R| ) small → flat
( R < 0 ) → edge
( R > 0 ) → corner

Harris detector: Algorithm

Compute gaussian derivatives at each pixel.
Compute second moment matrix ( M ) in a gaussian window around each pixel.
Compute corner response function ( R ).
Threshold ( R ).
Find local maxima of response function (non-maximum suppression).

Page 59 of 91

Thus, we get good features on both images.

Properties of Harris detector:

Invariant to Rotation
Intensity – invariant to addition
~~Threshold issue in multiplication, but invariant~~ → Threshold issue in multiplication, but invariant
Not invariant to scale

How to do scale invariance?

Soln: (Idea)
Design a function on the region, which is "scale invariant", not affected by the size but will be the same for corresponding regions even if they are at different scales.

One method:
At a point, compute the scale-invariant function across different size neighborhoods (different scales).

%% XY Chart: Scale-space neighborhood
xychart
    "region size S1" : 1, 2, 3, 4, 5
    "image scale 1" : 2, 3, 4, 5, 6
    "region size S2" : 1, 2, 3, 4, 5
    "image scale 2" : 3, 4, 5, 6, 7

(INSERT IMAGE for scale-space neighborhood plot)

We get scale sizes ( S_1 ) & ( S_2 ).

Page 60 of 91

Another problem, we need a good function for scale-invariant detection, that has one stable sharp peak.

%% XY Chart: Good vs bad function for scale invariance
xychart
    "bad" : 0, 1, 1, 1, 1
    "bad" : 0, 1, 0, 1, 0
    "good" : 0, 0, 1, 0, 0

(INSERT IMAGE for function response curves)

A good function would be one which responds to contrast.

L = \sigma^2 (L_{xx}(x, y, \sigma) + L_{yy}(x, y, \sigma))

Laplacian of Gaussian - LoG

It is a little costly to find.
So instead we use DoG, that is essentially the same:

DoG = G(x, y, k\sigma) - G(x, y, \sigma)

(Difference of Gaussian)

Both kernels are invariant to scale & rotation.

Page 61 of 91

Now, we get interest points but we still need to match them, that is the work of descriptor.

SIFT

Motivation:
The Harris operator was not invariant to scale (but we fixed that in 2004) and co-relation was not invariant to rotation
(normal brute-force matching).

~~Develop~~ → Done
- Detector → invariant to scale & rotation
- Descriptor → robust to variations in viewing conditions.

Idea of SIFT
Match enough SIFT features that are invariant to scale, transformation, photometry, etc. to do computer vision recognition.

Overall SIFT procedure
1. Scale-space extrema detection (or use Harris, Laplacian, or other method)
2. Keypoint localization
3. Orientation assignment
4. Keypoint description

Page 62 of 91

Orientation Assignment

An orientation is assigned to each keypoint to achieve invariance to image rotation.

%% Sequence diagram: Orientation assignment in keypoints
sequenceDiagram
    participant I1 as Image 1
    participant FP as FP
    participant I2 as Image 2
    participant FP1 as FP'
    I1->>FP: assign orientation
    I2->>FP1: assign orientation
    Note right of FP: Matching is done as orientation is known

Keypoint descriptors

Next is to compute a descriptor for the local image region about each keypoint that is:

Highly distinctive
As invariant as possible

But first normalization:

Rotate to standard orientation
Scale the window size

%% Block diagram: SIFT feature vector extraction
flowchart LR
    A[feature point] --> B[16x16 neighbourhood]
    B --> C[Histogram with 8 bins = 128 bin values]
    C --> D[SIFT feature vector]

(INSERT IMAGE for feature descriptor schematic)

Page 63 of 91

Now, we know how to describe them but how about matching? (We’re still not done 😊)

Approach 1: ( n^2 ) Brute force

Approach 2: Nearest Neighbours

k-D tree
Best Bin First

Approach 3: Wavelet-Based Matching (Hashing)

Compute a short (3-vector) descriptor from the neighbourhood
Quantize each value into 10 (overlapping) bins ( 10^3 ) entries

3D Object Recognition

Train:

Extract outliers with background subtraction
Compute "keypoints" = interest points & descriptors

Test:

Find possible matches
Search for consistent solution (such as affine)

Page 64 of 91

Feature-Based Alignment

Overall strategy

Compute features: detect → Harris, describe → SIFT
Find some useful matches – kd-tree, BBF (Best Bin First)
Compute & apply the best transformation – Affine, Homography

Algorithm

Extract features
Compute putative matches → "closest descriptor"
(Not known how to do)
Loop until happy:
- Hypothesize transformation ( T ) from some matches
- Verify transformation (search for other matches consistent with ( T )) – mark best

Here, step 2 isn’t encountered yet, so teaching it now.

If we pursue exhaustive search, taking Nearest Neighbour technique to give the best match, how do we know it’s a good one?

Page 65 of 91

How do we know a good match is found?

Feature Space Outlier Rejection

Let's not match all features, but only those that have "similar enough" matches.
How can we do it?
- SSD (patch1, patch2) < threshold
- But how to threshold? ~~It unknown~~ → also unknown

Probability Density for Match Quality

%% Flowchart for match thresholding
graph LR
    A[Probability Density]
    B[Good Matches]
    C[Incorrect Matches]
    D[Not a good threshold]
    A --> B
    A --> C
    C --> D

A point's best matches are ~~sometimes correct~~ → sometimes correct, sometimes not.
So, we use Lowe's method to use 2-NN and compare it with 1-NN.

%% Flowchart for Lowe's Method
graph LR
    A[Probability Density]
    B[Threshold]
    C[1-NN / 2-NN]
    D[Increased Error]
    A --> B
    B --> C
    C --> D

Page 66 of 91

Outlier Removal and RANSAC

Even when picking the best match, still lots of wrong matches ("outliers") remain, because we did the distinctive property thing.

Essentially, we need to fit a model such that the effect of outliers is minimized.

RANSAC (Random Sample Consensus)

Randomly pick some points to define your line (model). Repeat enough times until you find a good line (model) - one with many inliers.
It somehow copes with a large proportion of outliers.
Every general model has a minimal set:
- For line: it's 2
- For image transformations: s points
  - Translation: One pair of matched points
  - Homography (for plane): 4
  - Fundamental Matrix: 8 point pairs (in reality, 7)

Page 67 of 91

RANSAC Algorithm & Parameters

Algorithm

Randomly select s points (or point pairs) to form a sample.
Instantiate the model.
Get consensus set ( C_i ) - the points within error bounds (distance threshold) of the model.
If ( |C_i| > T ), terminate & return model.
Repeat for N trials, return model with max ( C_i ).

Parameters

( s ): Initial no. of points (known)
Distance threshold ( t ): ~~Known~~ → Unknown
- Choose so prob. for inlier is high (95%)
- Remain Gaussian noise within std. dev. ( t^2 = 3.84 )
Number of trials ( N )
- Choose ( N ) so that, with probability ( p ), at least one random sample set is free from outliers (e.g., ( p = 0.99 ))
- Need to set ( N ) based upon the outlier rate

Page 68 of 91

Calculating ( N )

( s ): no. of points to compute solution
( p ): prob. of success
( e ): proportion of outliers, so inliers = ( (1-e) )
Probability sample set with all inliers: ( (1-e)^s )
Probability sample set will have at least one outlier:
$1 - (1-e)^s$
Probability all N samples have outlier:
$(1 - (1-e)^s)^N$

Lastly, do re-compute least-squares H estimate on all of the inliers.

Page 69 of 91

Adaptive Algorithm

Set ( N = \infty ), sample_cnt = 0, ( e = 1.0 )
while ( N > ) sample_cnt:
- choose s & count the no. of inliers
- set ( e_0 = 1 - \frac{no. ; of ; inliers}{no. ; of ; points} )
- if ( e_0 < e ), set ( e = e_0 ) & re-compute ( N ) from ( e ): $N = \frac{\log(1-p)}{\log(1-(1-e)^s)}$

Page 70 of 91

PHOTOMETRY

Factors affecting image:

Lights, Surfaces & Shadows
Reflections
Refractions
Inter-reflections
Scattering

Surface Appearance

%% Block Diagram for Surface Appearance
graph TD
    Source --> SurfaceElement --> Sensor
    SurfaceElement -->|Normal| SurfaceElement

Image Intensity = f (Observed surface reflectance, illumination)
Surface reflection depends on both the viewing & illumination directions.

Page 71 of 91

Introduction to Motion

A video is a sequence of frames captured over time — usually quickly.
( I(x, y, t) )
- x, y: space
- t: time

Motion Applications

Background Subtraction
- Remove background that is not moving.
Shot boundary detection
Motion Segmentation
- Segment the video into multiple coherently moving objects.
Goals: Manufacture information in motion like brain does.
- Estimating 3D structure
- Learning dynamic models
- Recognizing events & activities
- Improving video quality

Page 72 of 91

Motion Estimation Techniques

I. Feature-based methods (Done ✓)

Extract visual features, track them over multiple frames.
Sparse motion fields, but more robust tracking.
Suitable when image motion is large.

II. Direct, dense methods

Directly recover image pattern at each pixel from "spatio-temporal image brightness variation".
Dense motion fields but sensitive to appearance variations.
Suitable for video & when image motion is small.
State of the art → Combination of both above.

Dense Flow: Brightness Constraint

i) Optic Flow

It is the apparent motion of objects or surfaces.

Problem setup:

%% Sequence Diagram for Optic Flow
sequenceDiagram
    participant Img1 as I(x,y,t)
    participant Img2 as I(x,y,t+1)
    Img1->>Img2: Track motion of points

Page 73 of 91

How to estimate pixel motion: from image 1 to 2

Solve pixel correspondence problem:
- Given a pixel in ( I(x, y, t) ), look for nearby pixels of the same color in ( I(x, y, t+1) )
- This is the optic flow problem.

Assumptions

color constancy:
$I(x, y, t) = I(x+u, y+v, t+1)$

Combining both eq.

$$
I(x+u, y+v, t+1) - I(x, y, t) = 0
$$
$$
I_x u + I_y v + I_t \approx 0
$$
$$
I_t + I_x u + I_y v = 0
$$

Page 74 of 91

I_t + \nabla I \cdot \langle u, v \rangle = 0

(Brightness constancy constraint eq.) - 2 unknowns but 1 eqn (\( u, v \)) - Similar to: \( a u + b v + c = 0 \) - \( I_x, I_y, I_t \) - [So optical flow is "constrained" to be on a line] Intuitively, this constraint means that: - The component of the flow in gradient direction is determined (called Normal Flow) \( I_x, I_y \) - The component of the flow parallel to an edge is unknown (called aperture problem) --- > Page 75 of 91 ```mermaid %% Illustration for aperture problem (cannot be coded, so placeholder) %% (INSERT IMAGE) ``` - So, optical flow can't be computed from just brightness constancy constraint; we need additional constraints such as: 1. Smooth optical flow (motion should be consistent in whole image) - Formulate error in optical flow constraint: $$ e_c = \iint_{img} (I_x u + I_y v + I_t)^2 dx dy $$ 2. Smoothness constraint: $$ e_s = \iint_{img} (u_x^2 + u_y^2 + v_x^2 + v_y^2) dx dy $$ - Find \( (u, v) \) at each img point that minimizes: $$ e = e_s + \lambda e_c $$ --- > Page 76 of 91 ## Local Methods: Lucas-Kanade - The flow field is smooth locally. - Method: Pretend the pixel's neighbors have the same \( (u, v) \). - For a 5x5 window, 25 eqns, 2 unknowns. - Do least squares & solve: $$ A d = -b $$ $$ A^T A d = -A^T b $$ $$ \begin{bmatrix} \sum I_x I_x & \sum I_x I_y \\ \sum I_x I_y & \sum I_y I_y \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = - \begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix} $$ - Tip: is solvable when \( A^T A \) is invertible. - Which means \( \lambda_1 \) & \( \lambda_2 \) should not be too small. - \( \lambda_1 \) must not be too large compared to \( \lambda_2 \), where \( \lambda_1, \lambda_2 \) are eigenvalues. --- > Page 77 of 91 ### Errors in Lucas-Kanade (Because of assumptions) - A point doesn't move like its neighbors → motion segmentation - The motion is large (larger than a pixel) → Taylor expansion doesn't hold - Non-linear, iterative refinement - Total minimize cost to fine estimate - Brightness constancy doesn't hold → exhaustive neighborhood search required #### Iterative Refinement Algorithm 1. Estimate velocity at each pixel by solving Lucas-Kanade equations. 2. Warp \( I_t \) towards \( I_{t+1} \) using the estimated flow field. 3. Repeat until convergence. ```mermaid %% Placeholder for refinement path illustration %% (INSERT IMAGE) ``` --- > Page 78 of 91 ## How to eliminate the problem of large motion? - Reduce the resolution until it moves only in range of 1 px motion. - Coarse image → fine image ### Hierarchical LK 1. Create a Gaussian pyramid - At top, it is a blurry picture (coarse) 2. Run iterative LK on the top-level images ```mermaid %% Block Diagram for Image Pyramid graph TD A[Coarse Image] --> B[Finer Image] B --> C[Finest Image] ``` - Move downwards & repeat until you get the exact motion Before moving forward, let's take a detour to learn about aliasing & image pyramids. --- > Page 79 of 91 ## Formal Algorithm 1. Compute iterative LK at level K. 2. Init. \( u_{K+1}, v_{K+1} = 0 \) at size of level K+1. 3. For each level \( K \) to 0: - Upsample (expand) \( u_{K+1}, v_{K+1} \) to create \( u'_K, v'_K \) (flow field of new twice resolution as level K) - Multiply \( u'_K, v'_K \) by 2 to get predicted flow - Warp level K Gaussian version of \( I_2 \) as per predicted flow to create \( I'_2 \) - Apply LK between \( I'_2 \) & level K Gaussian version of \( I_1 \) to get \( u^s_K, v^s_K \) (corrections in flow) - Add corrections: $$ u^t_K = u'_K + u^s_K \\ v^t_K = v'_K + v^s_K $$ --- > Page 80 of 91 ## Sparse LK - The Lucas-Kanade algorithm described gives a dense field: \( (u, v) \) everywhere. - But we said that we only want to solve LK where eigenvalues are well-behaved. - Sparse LK ⇒ hierarchical LK applied to good feature locations. ## Today's State of the Art Methods Start with something similar to LK: - + gradient constancy - + region matching - + energy minimization with smoothing term - + keypoint matching ## Motion Models There are 4 types of motion in real world: 1. Translation (2 unknowns) 2. Rigid (similarity) (4) 3. Affine (6) 4. Homography/Perspective (8) --- > Page 81 of 91 Now, in real world

V = \omega \times R + T

\begin{bmatrix} V_x \ V_y \ V_z \end{bmatrix}

\begin{bmatrix} 0 & -\omega_z & \omega_y \ \omega_z & 0 & -\omega_x \ -\omega_y & \omega_x & 0 \end{bmatrix} \begin{bmatrix} X \ Y \ Z \end{bmatrix} + \begin{bmatrix} V_{Tx} \ V_{Ty} \ V_{Tz} \end{bmatrix}

In image, velocity can be defined from

x = f \frac{X}{Z}

y = f \frac{Y}{Z}

\frac{dx}{dt} = u = v_x = f \frac{V_x}{Z} - x \frac{V_z}{Z}

\frac{dy}{dt} = v = v_y = f \frac{V_y}{Z} - y \frac{V_z}{Z}

\begin{bmatrix} u(x, y) \ v(x, y) \end{bmatrix}

\frac{1}{Z(x, y)} A(x, y) T + B(x, y) \Omega

- **Translation** and **Rotation** terms. Where:

A(x, y) = \begin{bmatrix} f & 0 & x \ 0 & f & y \end{bmatrix}

B(x, y) = \begin{bmatrix} \frac{xy}{f} & -\left(f + \frac{x^2}{f}\right) & y \ \left(f + \frac{y^2}{f}\right) & -\frac{xy}{f} & -x \end{bmatrix}

--- > Page 82 of 91 # Introduction to Tracking - Recognizing some object through frames in different timelines is called tracking. Some great tracking examples: 1. Surveillance camera 2. Camera tracking under occlusion ## Tracking Challenges One idea to do tracking is ~~optic~~ → optical flow (between) first & second, second & third & so on, but: i) It's hard to compute optical flow. ii) There can be large displacements since objects could be moving rapidly - Probably dynamics need to be taken into account ~~would do it~~ → to address this. iii) Occlusions --- > Page 83 of 91 - **Shi-Tomasi Feature Tracker** **Algorithm:** - From frame site to frame, track with LK and a pure translational model. - Robust for small displacements. 2. Check consistency of trackers by affine registration to the first (or earlier) observed instance of the feature. - Affine model is more accurate for larger displacements. - Comparing to the first or early frames helps to minimize drift. > This above method will fail for large displacements. - **Tracking with Dynamics** Key idea: Given a model of expected motion, predict where objects will occur in the next frame even before seeing the image. - Restrict search for the object. - Improved estimates since measurement noise is reduced by trajectory smoothness. --- > Page 84 of 91 ## Assumptions - Objects do not disappear and reappear in different places in the scene. - Camera is not moving instantaneously to a new viewpoint. - Gradual change in pose between camera and scene. ## Tracking as Inference Hidden State (X): True parameters we care about Measurement (Y): Noisy observation of underlying state At each time step *t*, state changes (from \( X_{t-1} \) to \( X_t \)) and we get a new observation \( Y_t \). Our goal: Estimate distribution of state \( X_t \) given 1. all observations seen so far 2. knowledge about dynamics of state transitions ## Steps of Tracking i) **Prediction:** What is the next state of the object given past measurements?

P(X_t \mid Y_0, Y_1, \ldots, Y_{t-1}, U_{1:t}, Y_t)

--- > Page 85 of 91 ii) **Correction:** Compute an updated estimate of the state from prediction & measurement

P(X_t \mid Y_0, Y_1, \ldots, Y_{t-1}, Y_t = y_t)

Tracking: The process of propagating this posterior distribution of state given measurements across time. ### Some Assumptions i) Only the immediate past matters:

P(X_t \mid X_0, X_1, \ldots, X_{t-1}) = P(X_t \mid X_{t-1})

(Dynamics model) ii) Measurements depend only on the current state:

P(Y_t \mid X_0, X_1, \ldots, X_{t-1}, X_t) = P(Y_t \mid X_t)

(Observation model) --- > Page 86 of 91 ## Tracking as Induction **Base case:** Assume we have some initial prior that predicts state in the absence of any evidence \( p(X_0) \). - At the first frame, correct this, given value of \( Y_0 \). ### Prediction Step Given: \( P(X_{t-1} \mid y_0, \ldots, y_{t-1}) \) Guess: \( P(X_t \mid y_0, \ldots, y_{t-1}) \) Law of Total Probability - Marginalization:

\int P(X_t, X_{t-1} \mid y_0, \ldots, y_{t-1}) dX_{t-1}

= \int P(X_t \mid X_{t-1}, y_0, \ldots, y_{t-1}) P(X_{t-1} \mid y_0, \ldots, y_{t-1}) dX_{t-1}

Using assumption:

= \int P(X_t \mid X_{t-1}) P(X_{t-1} \mid y_0, \ldots, y_{t-1}) dX_{t-1}

### Correction Step Given predicted value \( P(X_t \mid y_0, \ldots, y_{t-1}) \) & \( y_t \): Compute

P(X_t \mid y_0, \ldots, y_t) \propto P(y_t \mid X_t, y_0, \ldots, y_{t-1}) P(X_t \mid y_0, \ldots, y_{t-1})

Using Bayes Rule and the observation model:

P(y_t \mid X_t)

--- > Page 87 of 91 # Linear Dynamics Model ## Dynamics Model State undergoes linear transformation, plus Gaussian noise

X_t \sim N(D_t X_{t-1}, \Sigma_{dt})

- Normal distribution, mean, Gaussian noise/covariance ## Linear Measurement Model Observation model: Measurement is linearly transformed state plus Gaussian noise

Y_t \sim N(C_m X_t, \Sigma_{mt})

--- ## Kalman Filter ~~Quite a tradeoff between inconsistency of~~ → Combines prediction & measurement. It is a method of tracking linear dynamic systems with Gaussian noise. ### Kalman Prediction - Mean at time t: dynamics multiplies old mean - Corrected mean at time (t-1) - Variance at t: \( D^2 \) (corrected variance at t-1) + additional noise --- > Page 88 of 91 ## Nomenclature - \( \mu_t^-, \sigma_t^- \): Mean, covariance before measurement - \( \mu_t^+, \sigma_t^+ \): After measurement (vice versa) ## Kalman Correction (for 1-D)

\mu_t^+ = \frac{\mu_t^- \sigma_m^2 + m y (\sigma_t^-)^2}{\sigma_m^2 + m^2 (\sigma_t^-)^2}

(\sigma_t^+)^2 = \frac{\sigma_m^2 (\sigma_t^-)^2}{\sigma_m^2 + m^2 (\sigma_t^-)^2}

\mu_t^+ = \mu_t^- + K \left( y_t - m \mu_t^- \right)

Where \( K \) is the Kalman gain. Residual = \( y_t - m \mu_t^- \) - Also, variance lowers after measurement. ### Pictorial Representation ```mermaid flowchart LR A[p(x)] -->|Deterministic drift| B[p(x)] A2[p(x)] -->|Stochastic diffusion| B2[p(x)] A3[p(x)] -->|Negative effect of measurement| B3[p(x)] ``` --- > Page 89 of 91 - Simple updates, compact & efficient **Cons** - Unimodal distribution, only single hypothesis - Restricted class of motions defined by linear model - Extension called "Extended KF Filter" What about Non-Gaussian? ## Particle Filter Here \( z_t \) = measurements - Density is represented by particles, where the positions and their weight. ```mermaid flowchart LR A[p(x)] --> B[Particles (set of n samples)] ``` Goal:

p(x_t \in X_t) \approx p(X_t \mid z_{1:t})

with equality when \( n \to \infty \) (number of particles). --- ## Bayes Filter --- > Page 90 of 91 1. Prior probability of the system states \( p(x) \) 2. Action (dynamical system) model \( p(x_{t+1} \mid u_t, x_t) \) 3. Sensor model ("likelihood") \( p(z \mid x) \) 4. Stream of observations \( z \) and action data \( u \) - data: \( t-1, u_{t-1}, z_2, \ldots, u_{t-1}, z_t \) Wanted: 1. Estimate of the state X at time t, 2. Posterior of the state is also called belief:

Bel(x_t) = p(x_t \mid u_{1:t}, z_{1:t})

#### Fluctuations of input (diagram) ```mermaid flowchart LR U1[U_{t-1}] --> X1[X_{t-1}] X1 --> Z1[Z_{t-1}] X1 --> X2[X_t] U2[U_t] --> X2 X2 --> Z2[Z_t] X2 --> X3[X_{t+1}] U3[U_{t+1}] --> X3 X3 --> Z3[Z_{t+1}] ```

p(Z_t \mid x_{0:t}, z_{1:t-1}, u_{1:t}) = p(Z_t \mid X_t)

p(X_{t+1} \mid x_{1:t-1}, z_{1:t-1}, u_{1:t}) = p(X_{t+1} \mid x_t, u_t)

--- > Page 91 of 91 ## Bayes Filter

Bel(x_t) = p(x_t \mid o_{1:t}, u_{1:t}, z_{1:t}, z_{1:t-1})

= \eta p(z_t \mid x_t) p(x_t \mid u_t, z_{1:t-1}, u_{1:t}, x_{t-1})

(Bayes)

= \eta p(z_t \mid x_t) p(x_t \mid u_t, z_{1:t-1}, u_{1:t-1}, x_{t-1})

(Sensor Ind.)

= \eta p(z_t \mid x_t) \int p(x_t \mid u_t, x_{t-1}) Bel(x_{t-1}) dx_{t-1}

(Total Prob. of previous)

Bel(x_t) = \eta p(z_t \mid x_t) \int p(x_t \mid u_t, x_{t-1}) Bel(x_{t-1}) dx_{t-1}

(Markov Ind.) ```mermaid flowchart LR X1[X_{t-1}] --> X2[X_t] X1 --> Z1[Z_{t-1}] X2 --> Z2[Z_t] ```

Share via EmailShare on XShare on LinkedInShare on Reddit