July 2017 ~ Software Intelligence

We often need to resize images to upload them, share them, or print them. However, occasionally we need to reduce the image size only in one direction. For example, if I want to post a panoramic picture in my blog, like the one below, the disproportionate length makes the details hard to see.

Standard image manipulation tools do not offer many choices. I can crop, thus losing detail on the edges or resize in one direction making everything look distorted.

Ideally, I'd like to have a smaller picture where the essential details are not distorted, as in the one below:

In this post, I will present a brilliant, yet relatively low tech algorithm to achieve this result, called Seam Carving. The original paper describing the algorithm by Avidan and Shamir is here.

If you are interested in the implementation details, check out the Java source code I used to create these examples in my GitHub repository.

Representation of Digital Images

A digital image is represented by a matrix of pixels. If the image is in grayscale, each pixel has a value between 0 and 255 to represent its level of grayness with 0 being black and 255 white. For color images, each pixel has three values, representing the intensities of red, green, and blue.

Consequently, we can consider an image as a function in two dimensions that assigns to each point (x,y) a color intensity f(x,y). The figure below shows this functional representation of a simple grayscale image.

Notice that the curvature of the 3D contour is nearly flat for areas of the image with similar color, but it is steep when there are significant color variations. Therefore if we compute the derivative - or gradient - of the image at every point, points appearing in uniformly colored areas, such as the sky or ocean, will have small gradient values. In contrast, points at the boundaries of different elements will have large gradients. We can use this information to tell us which pixels we can remove without affecting the image very much.

Images as Graphs

The idea behind Seam Carving is to remove vertical lines from the image until it reaches the desired width (or horizontal lines until it reaches the desired height). The lines need not be straight. In fact, we are looking for a jagged line going from top to bottom such that removing it will cause the minimum alteration to the image. We already know that the gradient of the image will give us information about the least valuable points to remove. Now we need a way to figure out which, among all the possible jagged lines, is the best. In order to do this, we use concepts from graphs and shortest paths.

A finite graph consists a set of points called vertices and a set of edges connecting some of these points. For our purposes, we can view an image as a graph where each point is connected to the immediate neighbors right below it. So each interior point has 3 neighbors, whereas boundary points have between 0 and 3 neighbors.

A path in a graph is a sequence of vertices, such that each vertex in the sequence is connected by an edge to the vertex preceding it and the vertex following it. Remember that each vertex on the image graph has a value defined by the magnitude of the gradient. Our goal is to find a path starting from the top of the image to the bottom such that the total value of the path vertices is minimal. In Graph theory this is called finding the Shortest Path.

Dynamic Programming

There are many algorithms for computing shortest paths in graphs and a lot of them are derived from ideas from Dynamic Programming (DP). Dynamic Programming is a generic optimization framework for solving a vast family of problems often involving systems that evolve over time and are subject to randomness. In the case of shortest paths in graphs, there is no time evolution or randomness, so we get a very simple version of DP. The idea of the algorithm is very straightforward. I will present it here after introducing a bit of notation:

Let $v_{ij}$ be the vertex in the image graph at row $i$ and column $j$. Let $N(v_{ij})$ be the set of neighbors of a vertex $v_{ij}$. Let $g(v_{ij})$ denote the magnitude of the gradient for that vertex. We introduce a terminal vertex $v_T$ such that all the vertices in the last row are connected to it. This vertex has gradiend 0. We want to find shortest paths from all the vertices in the top of the graph to this terminal vertex $v_T$. Finally, let $d(v_{ij})$ denote the shortest path between vertex $v_{ij}$ and $v_T$.

We can now set up a recursive computation for the shortest path from any vertex. Let $n$ be the height of the graph. For all vertices in the last row $n$, we have

$$
d(v_{nj}) = g(v_{nj})
$$

For any arbitrary row $i$, we have:

$$
d(v_{ij}) = g(v_{ij}) + \min\limits_{k \in N(v_{ij})} \{d(v_{kj+1})\}
$$

Using this recursive relationship we can compute the shortest path row by row, starting at the bottom and moving all the way to the first row. Once we have all the shortest paths from every pixel in the top row, we just choose the shortest one.

Putting it all together: The Seam Carving Algorithm

The key to seam carving is finding low gradient paths through the picture and successively removing them:

Step 1: Compute the gradient at each point of the image

Step 2: Compute shortest paths using Dynamic Programming

Step 3: Remove the shortest path (the seam)

Repeat Steps 1 to 3 until the image has reached the desired dimensions.

I will demonstrate that algorithm in the following picture. Its initial dimensions are 350x263 and I want to reduce the width by 80 pixels and the height by 13.

Here is what the image would look like if I simply resized by changing the aspect ratio.

Here is what the picture looks like when I use Seam Carving. Note that the apples remain proportionate.

Here is a visual representation of the gradient, or energy, of the initial picture:

We can also visualize the seams that the algorithm removed by adding them back in the final picture one by one:

Further Applications of Seam Carving

A variation of the algorithm can be used to expand an image instead of shrinking it. We can also coerce the algorithm's behavior by providing user input to the image. For example, we can mark portions of the image that we want to keep intact and portions that we want completely removed. In the energy representation of the image, the regions to keep will appear as high energy and the areas to remove will appear as low energy.

I will probably explore these modification in a future article. You are also welcome to clone my GitHub repo and play around with your own modifications!

Software Intelligence

Mathematics in the Service of Software

Wednesday, July 26, 2017

Elegant Image Resizing with Graph Theory

Representation of Digital Images

Images as Graphs

Dynamic Programming

Putting it all together: The Seam Carving Algorithm

Further Applications of Seam Carving

Popular Posts

Recent Posts

Followers

Author

Categories

Blog Archive