Using Photogrammetry to Generate a DEM
and Orthophoto
Prepared by: Keith Blonquist
For: CEE 6440
Table of Contents

Introduction
Photogrammetry Principles
Synthetic Dataset
Generating DEMs in ArcGIS
Actual Dataset
Orthophoto Generation
Conclusion
References
Table of Figures

Figure 1. Perspective projection
Figure 2. Point visible in two images
Figure 3. 3D coordinates of synthetic model
Figure 4. Synthetic images
Figure 5. 3D points from the multi-image triangulation
Figure 6. Ground control points
Figure 7. TIN resulting from XYZ points
Figure 8. TIN converted to raster
Figure 9. Elevation interpolation using inverse distance weighting
Figure 10. Elevation interpolation using kriging
Figure 11. Elevation interpolation using splines
Figure 12. Actual images
Figure 13. Point correspondences
Figure 14. 3D points resulting from multi-image triangulation
Figure 15. Ground control points
Figure 16. Breaklines and points
Figure 17. TIN and contour lines from points and breaklines
Figure 18. TIN converted to raster
Figure 19. Parallel projection to create an orthophoto
Figure 20. An actual image vs. an orthophoto
Figure 21. Generation of orthophoto
Figure 22. Orthophoto from actual images
Table of Tables

Table 1. Coordinates of control points in the North American Datum of 1983, Texas State Plane Coordinate System, South Central Zone, and the North American Vertical Datum of 1988, in U.S. Survey Feet
Introduction
Most Geographic Information System (GIS) databases contain datasets that were collected using photogrammetry. By applying photogrammetric principles to digital aerial photographs and satellite imagery, Digital Elevation Models (DEMs) and orthophotos can be generated rapidly and with a high degree of precision. For these reasons, photogrammetry is used extensively to collect data for many GIS databases. Wolf and DeWitt state in their Elements of Photogrammetry textbook that "Photogrammetry plays a very important role in the collection of information for most GIS databases" (Wolf and DeWitt, 2000). This report will outline how photogrammetry is used to extract a DEM and an orthophoto from a set of aerial images.
First, some basic principles of photogrammetry (perspective projection, the collinearity equations, multi-image triangulation, etc.) will be discussed. Next, a synthetic dataset will be used to demonstrate how these principles are applied to derive a set of XYZ coordinates from a set of images. These photogrammetrically derived XYZ coordinates will then be imported into ArcGIS, where several types of DEMs (a TIN, a contour interpolation, a spline interpolation, etc.) can be generated using the 3D Analyst tools. Next, a pair of actual images will be used to demonstrate the same principles. Finally, the generation of an orthophoto will be discussed using both datasets.
Photogrammetry Principles
Photogrammetry is the science of using images to obtain information about physical objects. In this report, the geometric aspects of photogrammetry will be considered. The geometry of capturing an image with a camera is best approximated by a perspective projection. The perspective projection transforms a 3-dimensional object into a 2-dimensional representation by assuming that light rays travel in straight lines from the object, through the image plane, to the perspective center of the camera. This is shown in Figure 1 below:
Figure 1. Perspective projection.
While the perspective projection is not an exact representation of the imaging process, it serves as the basic model for most cameras. The perspective projection is described mathematically by the collinearity equations, which are the fundamental equations of photogrammetry:
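x = x_o - f \frac{r_{11}(X - X_c) + r_{12}(Y - Y_c) + r_{13}(Z - Z_c)}{r_{31}(X - X_c) + r_{32}(Y - Y_c) + r_{33}(Z - Z_c)}

y = y_o - f \frac{r_{21}(X - X_c) + r_{22}(Y - Y_c) + r_{23}(Z - Z_c)}{r_{31}(X - X_c) + r_{32}(Y - Y_c) + r_{33}(Z - Z_c)}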
The collinearity equations express the 2-dimensional image coordinates (x, y) of a point as a function of the 3-dimensional coordinates (X, Y, Z) of the actual point in space, the camera orientation (Xc, Yc, Zc, r11, ..., r33), and the camera parameters (f, xo, yo).
Note that an image is a 2-dimensional projection of a 3-dimensional object, so information is lost when capturing an image. In order to recover the full 3-dimensional geometry of an object, it is necessary to view the object in more than one image. If a given point in 3-dimensional space is observed in more than one image, it is possible to determine the 3-dimensional coordinates of that point. The 3-dimensional coordinates of the point are found by intersecting the light rays from the two images. This is shown in Figure 2.
Figure 2. Point visible in two images.
So, if a point is visible in at least two images, its 3-dimensional coordinates can be found. This, however, requires that the orientations of the images and the camera parameters are known.
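To make the ray-intersection step concrete, the following is a minimal Python sketch (using numpy) of a least-squares intersection of two rays; the function and the sample coordinates are illustrative only and are not the algorithm used later in this report.

import numpy as np

def intersect_rays(c1, d1, c2, d2):
    """Least-squares intersection of two rays, each defined by a camera
    perspective center c and a direction d (the light ray recovered from
    an image measurement). Returns the 3D point closest to both rays."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in ((c1, d1), (c2, d2)):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # projector onto the plane normal to d
        A += P                           # accumulate the normal equations
        b += P @ c
    return np.linalg.solve(A, b)

# Two cameras 100 units above the ground observing the point (10, 5, 2)
c1, c2 = np.array([0.0, 0.0, 100.0]), np.array([50.0, 0.0, 100.0])
target = np.array([10.0, 5.0, 2.0])
print(intersect_rays(c1, target - c1, c2, target - c2))  # approximately [10. 5. 2.]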
Generally in photogrammetry, it is desirable to solve for the 3-dimensional coordinates of the target points and the orientations of the images simultaneously. This is done through a process known as multi-image triangulation, in which the 3-dimensional coordinates of the object are solved for along with the orientations of the images. The process is based on the collinearity equations and solves for the relative orientation of each image with respect to the target object (Wolf and DeWitt, 2000). Multi-image triangulation results in a stereo model, or multi-image model, which has an arbitrary coordinate system. Once the model has been created, it can be rotated, shifted, and scaled to match any desired coordinate system. Typically this is done using GPS data: the platform that carries the camera is generally equipped with a GPS receiver that records the coordinates of the camera at the time each image is captured. In addition, there are usually ground control points visible in the images for which the coordinates have been measured. Using these known coordinates, the multi-image model is transformed (rotated, shifted, and scaled) to match the desired coordinate system.
Synthetic Dataset
To demonstrate the photogrammetry process, a set of synthetic images was generated and analyzed. To begin, a set of 3D coordinates was created. This synthetic 3D coordinate dataset contained a road, three houses, a river, and a small hill. The 3D coordinate set was used to generate synthetic images, which could then be analyzed using the photogrammetry principles outlined above. The 3D coordinates are shown in Figure 3.
Figure 3. 3D coordinates of synthetic model.
Next, by using the collinearity equations, synthetic images of the 3D coordinates were generated. For this example, three synthetic images were used. They are shown in Figure 4.
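As an illustration of this step (a sketch only, not the exact code used to build the synthetic dataset), the collinearity equations can be applied directly to project 3D points into image coordinates; the camera position, rotation matrix, and focal length below are arbitrary example values.

import numpy as np

def project(points_xyz, camera_xyz, R, f, x0=0.0, y0=0.0):
    """Collinearity equations: map 3D object points (rows of points_xyz)
    to 2D image coordinates for a camera at camera_xyz with rotation
    matrix R (rows r11..r13, r21..r23, r31..r33) and focal length f."""
    u = (points_xyz - camera_xyz) @ R.T   # object-to-camera vectors in the camera frame
    x = x0 - f * u[:, 0] / u[:, 2]
    y = y0 - f * u[:, 1] / u[:, 2]
    return np.column_stack([x, y])

# Example: a camera 500 units above the terrain, looking straight down
terrain = np.array([[100.0, 50.0, 10.0],
                    [-80.0, 20.0, 0.0]])
print(project(terrain, np.array([0.0, 0.0, 500.0]), np.eye(3), f=50.0))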
Figure 4. Synthetic images.
Next, the 2D image coordinates were matched across all three images and used to perform multi-image triangulation. The triangulation resulted in a set of 3D coordinates and the camera orientations in an arbitrary coordinate system. This is shown in Figure 5.
Figure 5. 3D points from the multi-image triangulation.
In order to transform the coordinates from the arbitrary system to the desired system, it is necessary to have ground control points or the GPS coordinates of the camera at the time the images were taken. For this example, it was assumed that there were three ground control points visible in each of the images. The points that were chosen are shown in Figure 6.
Figure 6. Ground control points.
Since the coordinates of these ground control points in the desired coordinate system were assumed to be known, the entire model was rotated, shifted, and scaled to match these ground control points. This resulted in a set of 3D coordinates in the desired coordinate system.
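The rotate/shift/scale step described above is a 3D similarity (Helmert) transformation. One common way to estimate it from matched model and control coordinates is sketched below in Python/numpy; this is offered as an illustration and is not necessarily the exact solution method used for this report.

import numpy as np

def fit_similarity(model_pts, control_pts):
    """Estimate the scale s, rotation R, and shift t that best map model
    coordinates onto ground-control coordinates in a least-squares sense
    (rows of the two arrays are matched points)."""
    mc, cc = model_pts.mean(axis=0), control_pts.mean(axis=0)
    A, B = model_pts - mc, control_pts - cc
    U, S, Vt = np.linalg.svd(A.T @ B)
    D = np.eye(3)
    D[2, 2] = np.sign(np.linalg.det(U @ Vt))      # guard against a reflection
    R = (U @ D @ Vt).T
    s = np.sum(S * np.diag(D)) / np.sum(A ** 2)   # least-squares scale factor
    t = cc - s * R @ mc
    return s, R, t

# s, R, t = fit_similarity(model_xyz[control_idx], control_xyz)
# ground_xyz = s * (model_xyz @ R.T) + t          # transform the whole model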
Generating DEMs in ArcGIS
Once the points were transformed into the desired coordinate system, they were loaded into ArcGIS. This was done using the conversion tools (Convert From File), which can generate point files, polygon files, or polyline files. Two datasets were loaded into ArcGIS: the first was a point file of all of the XYZ points; the second was a polygon file of the outlines of the houses. Once these two datasets were in ArcGIS, they could be used to generate DEMs.
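For this report the loading was done interactively, but as a rough scripted equivalent (the file names and workspace are hypothetical), the 3D Analyst conversion tool can be called from Python through arcpy:

import arcpy

arcpy.CheckOutExtension("3D")
arcpy.env.workspace = r"C:\photogrammetry\synthetic"   # hypothetical folder

# Convert the photogrammetrically derived XYZ text file into a 3D point
# feature class (3D Analyst: ASCII 3D To Feature Class).
arcpy.ddd.ASCII3DToFeatureClass("model_points.txt", "XYZ",
                                "model_points.shp", "POINT")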
The first DEM that was created was a TIN. This was generated using the 3D Analyst tools in ArcGIS. A TIN is one of the more basic DEMs. It has the advantage of allowing varying degrees of spatial resolution: areas with very little change in terrain can be accurately modeled with a few points and triangles, while areas with more significant changes in terrain can be modeled with more points and triangles. When building this TIN, the points on the roofs of the houses were neglected. The resulting TIN is shown in Figure 7.
Figure 7. TIN resulting from XYZ points.
This TIN was also converted into a raster using the 3D Analyst tools. Converting the TIN to a raster does not give any additional information, but rasters are used as input for a number of other tools in ArcGIS. This is shown in Figure 8.
Figure 8. TIN converted to raster.
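The TIN and raster above were built with the interactive 3D Analyst tools; an approximately equivalent geoprocessing script (with hypothetical file names) would be:

import arcpy

arcpy.CheckOutExtension("3D")

# Build a TIN from the ground points (roof points already removed), then
# convert it to a raster (3D Analyst: Create TIN and TIN To Raster).
arcpy.ddd.CreateTin("terrain_tin",
                    in_features="ground_points.shp Shape.Z Mass_Points <None>")
arcpy.ddd.TinRaster("terrain_tin", "terrain_dem.tif",
                    data_type="FLOAT", method="LINEAR",
                    sample_distance="CELLSIZE 5")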
It is interesting that the road appears quite prominently in these DEMs. This demonstrates that the triangles (which were selected by Delaunay triangulation) represent the terrain quite well. If a more accurate representation of the terrain is desired, breaklines can be used to define areas where there is a sharp change in the slope of the terrain. For this example, breaklines were not used. Also, it was necessary to clip out the points on the tops of the houses from the TIN. If these points were not clipped out, the Delaunay triangulation created some flawed triangles (points on the ground away from the base of a house were connected to the top of the house, etc.). So, the synthetic dataset showed that some features, like houses, are somewhat difficult to work with.
Also, ArcGIS contains tools to create contour lines and to interpolate elevation using several interpolation techniques, such as splines, inverse distance weighting, and kriging. Each of these was used to interpolate elevation from the set of XYZ points. These interpolations have the added benefit that they are able to generate curved surfaces, where a TIN generates only planar surfaces. The interpolations are shown in Figures 9 through 11. Again, the points on the tops of the houses were clipped out to create these DEMs.
Figure 9. Elevation interpolation using inverse distance weighting.
Figure 10. Elevation interpolation using kriging.
Figure 11. Elevation interpolation using splines.
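These interpolation surfaces were produced from the toolbox dialogs; a rough arcpy sketch of approximately equivalent Spatial Analyst calls (the point file, elevation field, and cell size are hypothetical) is:

import arcpy
from arcpy.sa import Idw, Kriging, KrigingModelOrdinary, Spline

arcpy.CheckOutExtension("Spatial")
pts, zfield, cell = "ground_points.shp", "Z", 5

# Inverse distance weighting, ordinary kriging, and a regularized spline,
# corresponding to the three surfaces in Figures 9 through 11.
Idw(pts, zfield, cell, 2).save("dem_idw.tif")
Kriging(pts, zfield, KrigingModelOrdinary("SPHERICAL"), cell).save("dem_kriging.tif")
Spline(pts, zfield, cell, "REGULARIZED", 0.1).save("dem_spline.tif")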
It is interesting to note that the elevations interpolated using kriging appear to match the contour lines most closely. The interpolation based on inverse distance weighting seems to create features that are not actually present in the terrain.
So, this synthetic dataset has demonstrated the ability to create DEMs from a set of aerial images.
Actual Dataset
A pair of actual aerial images was also analyzed to show how this process can be applied to real imagery. The images were obtained from P2 Energy Solutions Inc. (thanks to Andy Longoria and Dr. Neale) and are shown in Figure 12.
Figure 12. Actual images.
A series of point correspondences was chosen on these images to perform 3D reconstruction, as was done with the synthetic images. Because it was desired to show some change in elevation, the area around the freeway overpass was chosen for detailed measurement. The point correspondences are shown in Figure 13.
Figure 13. Point correspondences.
Note that the points which were located form a fairly sparse set. If more sophisticated image processing tools (a stereo-plotter, automatic point correlation, etc.) were used, a much denser point set could be collected, which would result in a much more accurate DEM. But for the purposes of this report, the above points are sufficient to show how a DEM can be created.
Once the point correspondences were collected, the points were put into the multi-image triangulation algorithm, resulting in a set of points in an arbitrary coordinate system. These are shown in Figure 14.
Figure 14. 3D points resulting from multi-image triangulation.
This system was rotated, shifted, and scaled to match a desired coordinate system using ground control points. Four ground control points were visible in each of the images. These points are shown in Figure 15.
Figure 15. Ground control points.
The image coordinates of these four points were also collected, and the 3D coordinates of these points were then rotated, shifted, and scaled to match the GPS coordinates of these points, which are given in Table 1.
Table 1. Coordinates of control points in the North American Datum of 1983, Texas State Plane Coordinate System, South Central Zone, and the North American Vertical Datum of 1988, in U.S. Survey Feet.

ID      X (ft)          Y (ft)          Z (ft)
1001    2148855.267     13750393.210    781.523
1002    2147642.219     13748882.528    760.457
1003    2149799.024     13749852.479    746.416
1004    2148318.001     13748247.835    731.077
At this point, it is worth discussing the precision of the solution from the multi-image triangulation. The multi-image triangulation algorithm is a least-squares solution, so it gives a measure of its own precision. For this image set, the average 1-sigma error for the XYZ points was 0.43 ft (13 cm). This means that there is a 68% probability that a calculated point is within 0.43 ft (13 cm) of the actual point, and a 95% probability that it is within 0.86 ft (26 cm) of the actual point. This precision may or may not be acceptable, depending on the type of work that is being done. Note, however, that a higher degree of precision is obtainable if more control points are selected and if corrections are made for lens distortion (no correction for lens distortion was made in the above calculations). Higher precision can also be obtained by more accurately defining the points in the 2D images.
Also, the transformation from the arbitrary coordinate system to the Texas State Plane Coordinate System is a least-squares solution. The root mean square error for this transformation was 0.73 ft (22 cm). This precision could also be improved by correcting for lens distortion and more precisely placing points in the 2D images.
Once the coordinates were matched to the desired coordinate system, they could be loaded into ArcGIS. For this dataset, three files were loaded into ArcGIS: one point file containing all of the points, one polyline file containing breaklines, and one polygon file with the outline of the bridge. These are shown in Figure 16.
Figure 16. Breaklines and points.
Figure 17 shows the TIN generated from the points and breaklines.
Figure 17. TIN and contour lines from points and breaklines.
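For reference, a breakline-constrained TIN can be scripted in the same way as before, supplying the breaklines as hard lines (a sketch only, with hypothetical file names):

import arcpy

arcpy.CheckOutExtension("3D")

# Points enter the TIN as mass points and the digitized breaklines as hard
# lines, which force triangle edges along the sharp changes in slope.
arcpy.ddd.CreateTin("overpass_tin",
                    in_features="points.shp Shape.Z Mass_Points <None>; "
                                "breaklines.shp Shape.Z Hard_Line <None>")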
Figure 18 shows the conversion of this TIN into a raster.
Figure 18. TIN converted to raster.
Again, note that the accuracy of these DEMs is limited by the number of points that were collected; a more comprehensive collection of points would result in a more detailed DEM. Note, however, that in both DEMs the bridge is very prominent. This demonstrates the ability of photogrammetry to model 3D terrain features from aerial photographs.
Here it is noted that the bridge did cause some complications. First, the breaklines along the bridge had to be manually defined; otherwise the Delaunay triangulation would produce spurious triangles (as was the case with the houses in the synthetic dataset). Also, the interpolation techniques (splines, kriging, inverse distance weighting) that were used on the synthetic dataset did not give good results when applied to the actual dataset, because the points on the top of the bridge caused complications. If the points on the top of the bridge were clipped out, the remaining points had very little relief and the result was not very informative. However, if the points on the top of the bridge were left in, the resulting interpolations did not accurately model the terrain. So again it was found that certain features can be difficult to model.
Orthophoto Generation
Another useful application of photogrammetry is the generation of an orthophoto. An orthophoto is an image that has been altered so that features appear the same way they would appear on a map. An orthophoto is created by parallel projection. Parallel projection is a projection from 3D space to 2D space in which parallel lines (lines perpendicular to the 2D surface) are drawn from each feature in 3D space to the 2D surface. In other words, an orthophoto is created by flattening the 3D space onto a 2D surface. This is shown in Figure 19.
Figure 19. Parallel projection to create an orthophoto.
Note that when an image is captured with a camera, the resulting image is not an orthophoto. This is because an actual image is not a parallel projection; it is best approximated by a perspective projection. Because of the relief of the objects in the image, lines which should appear straight in an orthophoto are not straight in an actual image. Also, note that features which are visible in an actual image (such as the sides of buildings) should not be visible in an orthophoto. This is demonstrated in Figure 20.
Figure 20. An actual image vs. an orthophoto.
In the above figure, the image on the left is an actual image. Note that the sides of the houses are visible in the image on the left, while they are not visible in the orthophoto on the right. Also note that the line in the southeast portion of the image is not straight in the actual image on the left, due to the relief of the terrain. In the orthophoto on the right this line is straight, like a map.
In order to generate an orthophoto from a set of actual images, it is necessary to first create a DEM from the images. For this report, the TIN method will be used to generate an orthophoto. If a TIN is generated, the 3D triangles of the TIN can be projected by parallel projection onto a 2D surface. Then, the triangular portion of the actual image associated with each triangle of the TIN is re-sampled to the size of the projected TIN triangle on the orthophoto. This is shown in Figure 21.
Figure 21. Generation of orthophoto.
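As an illustration of this per-triangle resampling (a simplified Python/numpy sketch, not the actual implementation used later in this report), one facet of the TIN can be mapped onto the orthophoto with a triangle-to-triangle affine transformation; the arrays and names below are hypothetical, and both the image and the triangle vertices are assumed to already be expressed in pixel units.

import numpy as np

def affine_from_triangles(src, dst):
    """2x3 affine matrix M such that dst_i = M @ [x_i, y_i, 1] for the
    three matching vertices in src and dst (each a 3x2 array)."""
    A = np.hstack([src, np.ones((3, 1))])
    return np.linalg.solve(A, dst).T

def fill_triangle(ortho, image, ortho_tri, image_tri):
    """Resample one TIN facet: for every orthophoto pixel inside the
    ground-plan triangle ortho_tri, look up the matching pixel of the
    aerial image via the affine map (nearest-neighbour sampling)."""
    M = affine_from_triangles(ortho_tri, image_tri)
    v0, v1, v2 = ortho_tri
    T = np.column_stack([v1 - v0, v2 - v0])          # barycentric basis
    xmin, ymin = np.floor(ortho_tri.min(axis=0)).astype(int)
    xmax, ymax = np.ceil(ortho_tri.max(axis=0)).astype(int)
    for y in range(ymin, ymax + 1):
        for x in range(xmin, xmax + 1):
            lam = np.linalg.solve(T, np.array([x, y], dtype=float) - v0)
            if lam.min() < 0 or lam.sum() > 1:        # pixel outside this facet
                continue
            u, v = M @ np.array([x, y, 1.0])          # matching image location
            ortho[y, x] = image[int(round(v)), int(round(u))]

Repeating this for every triangle of the TIN fills the orthophoto with resampled imagery, which is the process illustrated in Figure 21.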
So, by beginning with the TIN created from the actual images, an orthophoto could be generated using the above process. This was accomplished by loading the TIN and images into a 3D viewer developed at the CAIL Laboratory at Utah State University. The viewer generates the TIN and then re-samples the triangular portions of the images onto the surfaces of the TIN. Then the TIN could be viewed as a parallel projection, resulting in an orthophoto. The result is shown in Figure 22.
Figure 22. Orthophoto from actual images.
Conclusion
From the above datasets it has been shown how photogrammetry can be used to generate DEMs and orthophotos from aerial images. It has been shown that certain features (houses, bridges) can cause complications in the generation of a TIN and in elevation interpolation. It has also been demonstrated that the accuracy and precision of a model are dependent upon the number of points that are obtained.
References
Wolf, Paul R., and DeWitt, Bon A., 2000. Elements of Photogrammetry. 3rd Edition. McGraw-Hill, Boston. pp. 187-207, 293-365.