The need to combine pictures into panoramic mosaics has existed since the
beginning of photography, as the camera's field of view is always smaller
than the human field of view. Photo mosaicing, pasting together several
pictures to create a panoramic mosaic, gives us a more complete view of
the scene.
While scissors and glue are the tools of film photography, digital video
enables more sophisticated methods. Digital mosaicing gives us three main
advantages over the paper methods:
- While with paper we can only translate and rotate the images, digital
processing enables more general transformations, such as affine and
projective warps.
- Our cut-and-paste process can combine overlapping images, thereby
reducing noise in the final mosaic image.
- In many cases there are noticeable intensity differences between images;
these are telltale signs of mosaicing even when the images are aligned
perfectly. We can overcome this using image blending (see the sketch
after this list).
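The following minimal sketch (Python with NumPy/SciPy) illustrates the
averaging and blending points above. It assumes the frames have already
been warped into a common mosaic coordinate system, and all function
names are ours, not part of any published method:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def accumulate_frame(sum_img, weight_img, frame, mask):
        """Add one aligned frame into a running weighted average.
        frame: HxW float image, already warped to mosaic coordinates.
        mask:  HxW bool array, True where the frame has valid pixels."""
        # Weigh pixels by their distance from the frame boundary so
        # seams fade out smoothly (a simple form of image blending).
        w = distance_transform_edt(mask)
        if w.max() > 0:
            w /= w.max()
        sum_img += w * frame
        weight_img += w

    def finish_mosaic(sum_img, weight_img):
        # Averaging overlapping frames reduces noise; the feathered
        # weights hide intensity differences between frames.
        return sum_img / np.maximum(weight_img, 1e-9)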
While most mosaicing methods project onto a single image plane or onto a
cylinder, our method of manifold projection enables the creation of
panoramic mosaics from video sequences under very general conditions, in
particular the unrestricted motion of a hand-held camera. The panoramic
mosaic is a projection of the scene onto a virtual manifold whose
structure depends on the camera's motion.
This manifold projection is defined for almost any camera motion and
scene structure. There are no distortions caused by alignment to a
reference frame, and the resolution of the mosaic is the same as the
image resolution.
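In the simplest case, a camera translating sideways, this kind of
mosaicing reduces to pasting together thin strips cut from the center of
each frame. A toy Python sketch under that pure-translation assumption
(the displacements dxs are assumed to come from a separate alignment
step, not shown):

    import numpy as np

    def strip_mosaic(frames, dxs):
        """frames: list of HxW images; dxs[i]: horizontal displacement
        (in pixels) between frame i and frame i+1, from image alignment."""
        c = frames[0].shape[1] // 2
        # Each frame contributes a central strip whose width equals its
        # displacement to the next frame, so the strips tile the scene.
        strips = [f[:, c:c + dx] for f, dx in zip(frames, dxs)]
        return np.concatenate(strips, axis=1)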
Selected Papers
Shmuel Peleg and Joshua Herman,
Panoramic Mosaics by Manifold Projection,
CVPR, June 1997. For a commercial application see
VideoBrush.
Benny Rousso, Shmuel Peleg, and Ilan Finci,
Mosaicing with Generalized Strips,
DARPA Image Understanding Workshop, May 1997.
Benny Rousso, Shmuel Peleg, Ilan Finci, and Alex Rav-Acha,
Universal Mosaicing Using Pipe Projection,
ICCV'98.
Suppose a movie camera caught a burglar in action, but because he moved
so quickly, all we have is a picture too blurred to recognize the
burglar. Using
motion segmentation
we can enhance the image of the burglar
until it is sharp and clear. This is done by temporal integration: we
register the images using the motion of the object we wish to enhance and
take an average of the registered images. In this average the
registered object will be enhanced, with sharp edges. This enhancement
comes from using information over a whole sequence of images. The method
works for transparent objects as well, for example a reflection in a
window.
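A minimal sketch of the temporal-integration step (Python/NumPy; we
assume the object's motion has already been recovered as integer
translations by the motion-segmentation stage):

    import numpy as np

    def temporal_integration(frames, shifts):
        """frames: list of HxW images; shifts[i]: (dy, dx) motion of the
        tracked object in frame i relative to frame 0."""
        acc = np.zeros_like(frames[0], dtype=np.float64)
        for frame, (dy, dx) in zip(frames, shifts):
            # Undo the object's motion so it is registered across frames.
            acc += np.roll(frame, (-dy, -dx), axis=(0, 1))
        # The registered object stays sharp in the average, while noise
        # (and the misregistered background) is averaged away.
        return acc / len(frames)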
Selected Papers
Michal Irani and Shmuel Peleg,
Motion Analysis for Image Enhancement: Resolution,
Occlusion, and Transparency,
Journal of Visual Communication and Image Representation, Vol. 4, No. 4, December 1993.
In today's world there is a constant need for more realistic textures for
synthetic 3D worlds, both for computer simulations and for computer
animations.
Computer-generated textures don't seem to be able to fool us;
we want the real thing: real texture from a real image. We have developed
a method to extract high-quality texture from a sequence of images.
Our algorithm has the following qualities:
- The texture in the images can appear at different resolutions
and with different perspective distortions.
- We are not restricted to planar objects and can work with any
known 3D structure.
- We have the ability to remove illumination artifacts such as
highlights and reflections.
- The resulting texture is stored in a multiresolution data structure.
- There are no restrictions on the computed texture.
The input to our algorithm is a sequence of images of an object together
with a 3D model of the object; from these we create a multiresolution
texture map. Using the 3D information we know where each pixel in each
image comes from, and we update the texture map accordingly. The pixels
of the texture map are a smoothed, weighted average over all relevant
images. Pixels that come from higher-resolution images receive a stronger
weight, giving the texture map an almost constant quality, similar to the
constant quality achieved in mosaics.
For each pixel in the texture map we take into account only information
around the median of the brightness levels observed at that pixel; this
way we eliminate highlights and reflections, which appear in only a few
images. We use a Laplacian pyramid to implement the multiresolution
representation, and we store the final texture map in this format.
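An illustrative per-texel combination rule following the description
above (Python/NumPy; gathering the samples through the 3D model is
assumed to happen elsewhere, and the band parameter is our own
illustrative choice):

    import numpy as np

    def combine_texel(values, weights, band=0.2):
        """values:  brightness samples of one texel, one per image.
        weights: per-sample weights (higher for higher-resolution views).
        band:    fraction of the intensity range kept around the median."""
        values = np.asarray(values, dtype=np.float64)
        weights = np.asarray(weights, dtype=np.float64)
        med = np.median(values)
        # Keep only samples near the median; highlights and reflections
        # appear in few images, so they fall outside this band.
        keep = np.abs(values - med) <= band * (np.ptp(values) + 1e-9)
        return np.sum(weights[keep] * values[keep]) / np.sum(weights[keep])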
Selected Papers
E. Ofek, E. Shilat, A. Rappoport, and M. Werman,
Highlight and Reflection Independent Multiresolution Textures
from Image Sequences,
HUJI TR, April 1995. Accepted to IEEE CG&A.
Virtual reality is highly developed in the world of computer graphics, yet
realistic virtual reality is still fairly new in computer vision.
Our final goal is to interactively walk or fly through a 3D scene
stored as a small set of images.
Given two reference images of a static 3D scene, we can generate a
third view from a new, user-specified virtual camera. Our view synthesis is
physically correct, meaning our result is the same image that would have
been obtained had we actually placed a camera at that location and taken
a picture.
Our method derives an on-line warping function from a set of model images.
It relies on algebraic constraints that all views of the same 3D scene
must obey. The constraints we use come from the trilinear tensor, a
trilinear relation determined only by the configuration (location) of
three cameras in space. Any 3D point seen across these three views
satisfies the constraints; therefore, given two views and a tensor, the
point's position in the third view can be found simply by solving a
linear equation.
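In standard trifocal-tensor notation (a sketch using the textbook index
convention, not necessarily the paper's exact formulation), the
constraint and the resulting linear transfer read:

    % Trilinear constraint: any lines l', l'' through the matching
    % points p', p'' in views 2 and 3 satisfy
    p^{i}\, l'_{j}\, l''_{k}\, T_{i}^{\,jk} = 0 ,
    % so, choosing any line l' through p', the point in the third view
    % follows linearly from the known quantities:
    p''^{\,k} \;\cong\; p^{i}\, l'_{j}\, T_{i}^{\,jk} .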
Our method has the advantage that it avoids computing a 3D model or the
explicit camera geometry (the relative locations of the cameras), both of
which are computationally heavy and numerically unstable. We need not
assume any camera calibration or any 3D structure of the scene. All our
algorithm needs as input is a dense correspondence between the two model
views and a trilinear tensor. The tensor can be computed with the help of
a third view from only seven corresponding point triplets across the
three views.
Our new camera is driven by a simple equation: given a tensor and the new
camera, specified as a translation and rotation relative to the first
camera, it calculates the new tensor relating the three cameras. The
correctness of this new tensor can be proven mathematically from
algebraic properties of the tensor and of the space of all valid tensors.
Using this tensor-to-tensor function as our driver, we can create a
movie of novel views.
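A self-contained Python/NumPy sketch of tensor-based point transfer, with
one deliberate swap: the paper updates the tensor directly from the new
camera's rotation and translation, whereas for brevity this sketch
rebuilds it from explicit 3x4 camera matrices using the textbook
construction. All names here are ours:

    import numpy as np

    def trifocal_tensor(P2, P3):
        """Trifocal tensor of cameras P1=[I|0], P2, P3 (3x4 each), via the
        textbook formula T_i = a_i b^T - a b_i^T, with a_i, b_i the i-th
        columns of P2, P3 and a, b their fourth columns."""
        A, a = P2[:, :3], P2[:, 3]
        B, b = P3[:, :3], P3[:, 3]
        return np.stack([np.outer(A[:, i], b) - np.outer(a, B[:, i])
                         for i in range(3)])

    def transfer_point(T, p1, p2):
        """Map matching homogeneous points p1 (view 1) and p2 (view 2)
        into the third view via p3^k = p1^i l'_j T_i^{jk}."""
        lp = np.cross(p2, [1.0, 0.0, 0.0])      # some line through p2
        p3 = np.einsum('i,j,ijk->k', p1, lp, T)
        return p3 / p3[2]

Driving the virtual camera then amounts to recomputing the tensor for
each pose along the path and transferring the dense correspondence point
by point to render each novel frame.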
Selected Papers
S. Avidan and A. Shashua.
Unifying Two-View and Three-View Geometry .
Submitted, Nov. 1996.
S. Avidan and A. Shashua.
Novel View Synthesis in Tensor Space.
Submitted, Nov. 1996.
S. Avidan and A. Shashua.
Tensorial Transfer: On the
Representation of $N>3$ Views of a 3D Scene.
In Proc. of the ARPA Image Understanding Workshop, Palm Springs, Feb.
1996.
A. Shashua and S. Avidan.
The Rank 4 Constraint in Multiple View Geometry.
To appear in ECCV, April 1996.