Using Photogrammetry to Generate a DEM and Orthophoto











Prepared by: Keith Blonquist


For: CEE 6440

Table of Contents


Introduction

Photogrammetry Principles

Synthetic Dataset

Generating DEMs in ArcGIS

Actual Dataset

Orthophoto Generation

Conclusion

References


Table of Figures


Figure 1.  Perspective projection

Figure 2.  Point visible in two images

Figure 3.  3D coordinates of synthetic model

Figure 4.  Synthetic images

Figure 5.  3D points from the multi-image triangulation

Figure 6.  Ground control points

Figure 7.  TIN resulting from XYZ points

Figure 8.  TIN converted to raster

Figure 9.  Elevation interpolation using inverse distance weighting

Figure 10.  Elevation interpolation using kriging

Figure 11.  Elevation interpolation using splines

Figure 12.  Actual images

Figure 13.  Point correspondences

Figure 14.  3D points resulting from multi-image triangulation

Figure 15.  Ground control points

Figure 16.  Breaklines and points

Figure 17.  TIN and contour lines from points and breaklines

Figure 18.  TIN converted to raster

Figure 19.  Parallel projection to create an orthophoto

Figure 20.  An actual image vs. an orthophoto

Figure 21.  Generation of orthophoto

Figure 22.  Orthophoto from actual images


Table of Tables


Table 1.  Coordinates of control points in North American Datum 1983, Texas State Plane Coordinate System, South Central Zone; North American Vertical Datum of 1988; U.S. Survey Feet




Introduction


Most Geographic Information System (GIS) databases contain datasets that were collected using photogrammetry.  By applying photogrammetric principles to digital aerial photographs and satellite imagery, Digital Elevation Models (DEMs) and orthophotos can be generated rapidly and with a high degree of precision.  As Wolf and DeWitt state in Elements of Photogrammetry, “Photogrammetry plays a very important role in the collection of information for most GIS databases” (Wolf and DeWitt, 2000).  This report outlines how photogrammetry is used to extract a DEM and an orthophoto from a set of aerial images.


First, some basic principles of photogrammetry (perspective projection, the collinearity equations, and multi-image triangulation) are discussed.  Next, a synthetic dataset is used to demonstrate how these principles are applied to derive a set of XYZ coordinates from a set of images.  These photogrammetrically derived XYZ coordinates are then imported into ArcGIS, where several types of DEMs (a TIN, contour lines, and surfaces interpolated with splines, inverse distance weighting, and kriging) are generated using the 3D Analyst tools.  A pair of actual images is then used to demonstrate the same principles.  Finally, the generation of an orthophoto is discussed using both datasets.


Photogrammetry Principles


Photogrammetry is the science of using images to obtain information about physical objects.  In this report the geometrical aspects of photogrammetry will be considered.  The geometry of capturing an image with a camera is best approximated by a perspective projection.  The perspective projection transforms a 3-dimensional object into a 2-dimensional representation of the object by assuming that the light rays from the object travel in straight lines from the object through the image to the perspective center of the camera.  This is shown in Figure 1 below:


Figure 1.  Perspective projection.


While the perspective projection is not an exact representation of the imaging process, it serves as the basic model for most cameras.  It is described mathematically by the collinearity equations, which are the fundamental equations of photogrammetry.  In their standard form, the collinearity equations are:

\[
x = x_o - f \, \frac{r_{11}(X - X_c) + r_{12}(Y - Y_c) + r_{13}(Z - Z_c)}{r_{31}(X - X_c) + r_{32}(Y - Y_c) + r_{33}(Z - Z_c)}
\]

\[
y = y_o - f \, \frac{r_{21}(X - X_c) + r_{22}(Y - Y_c) + r_{23}(Z - Z_c)}{r_{31}(X - X_c) + r_{32}(Y - Y_c) + r_{33}(Z - Z_c)}
\]

The collinearity equations express the 2-dimensional image coordinates (x, y) of a point as a function of the 3-dimensional coordinates (X, Y, Z) of the actual point in space, the camera orientation (Xc, Yc, Zc, r11,…, r33), and the camera parameters (f, xo, yo).
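As a minimal sketch of how the collinearity equations can be evaluated, the following Python function projects a 3D point into image coordinates.  It assumes the common sign convention x = xo - f (r11 dX + r12 dY + r13 dZ) / (r31 dX + r32 dY + r33 dZ), which may differ from the convention used to produce the figures in this report.  The function name project_point and the camera values in the example are illustrative assumptions only.

```python
def project_point(point, camera_center, rotation, f, xo=0.0, yo=0.0):
    """Project a 3D point (X, Y, Z) to image coordinates (x, y).

    rotation is a 3x3 nested list whose rows are (r11, r12, r13),
    (r21, r22, r23), (r31, r32, r33).  The sign convention is an assumption.
    """
    dX = point[0] - camera_center[0]
    dY = point[1] - camera_center[1]
    dZ = point[2] - camera_center[2]
    r = rotation
    num_x = r[0][0] * dX + r[0][1] * dY + r[0][2] * dZ
    num_y = r[1][0] * dX + r[1][1] * dY + r[1][2] * dZ
    denom = r[2][0] * dX + r[2][1] * dY + r[2][2] * dZ
    if denom == 0:
        raise ValueError("point lies in the plane through the perspective center")
    return xo - f * num_x / denom, yo - f * num_y / denom


# Hypothetical vertical photograph: identity rotation, camera 1000 units
# above the origin, focal length 0.15 (same length unit as x and y).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x, y = project_point((100.0, 50.0, 0.0), (0.0, 0.0, 1000.0), IDENTITY, f=0.15)
print(x, y)
```

With a vertical photograph the result reduces to a simple scaling of the ground coordinates by f divided by the flying height, which is a convenient check on the implementation.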


Note that an image is a 2-dimensional projection of a 3-dimensional object, so information is lost when capturing an image.  In order to recover the full 3-dimensional geometry of an object it is necessary to view an object in more than one image.  If a given point in 3-dimensional space is observed in more than one image, it is possible to determine the 3-dimensional coordinates of that point.  The 3-dimensional coordinates of the point are found by intersecting the light rays from the two images.  This is shown in Figure 2.


Figure 2.  Point visible in two images.


So, if a point is visible in at least two images, its 3-dimensional coordinates can be found.  This does, however, require that the orientations of the images and the camera parameters be known.
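The ray intersection can be sketched in a few lines of Python.  Because measured rays rarely intersect exactly, this sketch returns the midpoint of the shortest segment between the two rays; that is one simple choice and not necessarily the method used for the results in this report.  The function names, camera positions, and target point are hypothetical.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def intersect_rays(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) <= 1e-12 * a * c:
        raise ValueError("rays are parallel; no unique intersection")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(c1[i] + s * d1[i] for i in range(3))
    p2 = tuple(c2[i] + t * d2[i] for i in range(3))
    return tuple((p1[i] + p2[i]) / 2.0 for i in range(3))


# Two hypothetical perspective centers observing the same ground point.
target = (300.0, 200.0, 50.0)
center_1 = (0.0, 0.0, 1000.0)
center_2 = (600.0, 0.0, 1000.0)
ray_1 = tuple(t - c for t, c in zip(target, center_1))
ray_2 = tuple(t - c for t, c in zip(target, center_2))
recovered = intersect_rays(center_1, ray_1, center_2, ray_2)
print(recovered)
```

In practice the ray directions would come from the measured image coordinates, the camera parameters, and the image orientations rather than from a known target.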


Generally in photogrammetry, it is desirable to solve for the 3-dimensional coordinates of the target points and the orientations of the images simultaneously.  This is done through a process known as multi-image triangulation, which is based on the collinearity equations and solves for the relative orientation of each image with respect to the target object (Wolf and DeWitt, 2000).  Multi-image triangulation results in a stereo model, or multi-image model, which has an arbitrary coordinate system.  Once the model has been created, it can be rotated, shifted, and scaled to match any desired coordinate system.  Typically this is done using GPS data.  The platform that carries the camera is generally equipped with a GPS receiver, which gives the coordinates of the camera at the time each image was captured.  Also, there are usually ground control points visible in the images for which the coordinates have been measured.  Using these known coordinates, the multi-image model is transformed (rotated, shifted, and scaled) to match the desired coordinate system.
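The rotate, shift, and scale operation is a similarity transformation of the form X' = s R X + t.  The sketch below applies such a transformation and recovers the scale factor from matching control points by comparing their spread about the centroid.  It is only an illustration: the rotation is restricted to a rotation about the Z axis, the full least-squares solution for R and t is not shown, and all function names and numbers are hypothetical.

```python
import math


def rotation_about_z(angle_rad):
    """3x3 rotation matrix for a rotation about the Z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def similarity_transform(point, scale, rotation, shift):
    """Apply X' = scale * rotation * X + shift to one 3D point."""
    return tuple(
        scale * sum(rotation[i][j] * point[j] for j in range(3)) + shift[i]
        for i in range(3)
    )


def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def estimate_scale(model_points, ground_points):
    """Scale factor from the ratio of spreads about the two centroids."""
    cm, cg = centroid(model_points), centroid(ground_points)
    spread_model = sum(math.dist(p, cm) ** 2 for p in model_points)
    spread_ground = sum(math.dist(p, cg) ** 2 for p in ground_points)
    return math.sqrt(spread_ground / spread_model)


# Hypothetical model points in an arbitrary system and their "ground" positions.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rotation = rotation_about_z(math.pi / 2.0)
ground = [similarity_transform(p, 2.0, rotation, (100.0, 200.0, 10.0)) for p in model]
scale = estimate_scale(model, ground)
print(scale)
```

Because a rotation and a shift do not change distances between points, the ratio of spreads isolates the scale; with noisy control points the same ratio gives only an approximate value.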


Synthetic Dataset


To demonstrate the photogrammetry process, a set of synthetic images was generated and analyzed.  To begin, a set of 3D coordinates was created.  This synthetic 3D coordinate dataset contained a road, three houses, a river, and a small hill.  The 3D coordinate set was used to generate synthetic images, which could then be analyzed using the photogrammetry principles outlined above.  The 3D coordinates are shown in Figure 3.


Figure 3.  3D coordinates of synthetic model.


Next, by using the collinearity equations, synthetic images of the 3D coordinates were generated.  For this example, three synthetic images were used.  They are shown in Figure 4.


Figure 4.  Synthetic images.


Next, the 2D image coordinates were matched across all three images and used to perform multi-image triangulation.  The triangulation resulted in a set of 3D coordinates and the camera orientations in an arbitrary coordinate system.  This is shown in Figure 5.


Figure 5.  3D points from the multi-image triangulation.


In order to transform the coordinates from the arbitrary system to the desired system, it is necessary to have ground control points or the GPS coordinates of the camera at the time the images were taken.  For this example, it was assumed that there were three ground control points visible in each of the images.  The points that were chosen are shown in Figure 6.


Figure 6.  Ground control points.


Since the coordinates of these ground control points in the desired coordinate system were assumed to be known, the entire model was rotated, shifted, and scaled to match these ground control points.  This resulted in a set of 3D coordinates in the desired coordinate system.


Generating DEMs in ArcGIS


Once the points were transformed into the desired coordinate system, they were loaded into ArcGIS.  This was done using the conversion tools (convert from file).  The convert from file tool allows for generation of point files, polygon files, or polyline files.  Two datasets were loaded into ArcGIS: the first was a point file of all of the XYZ points; the second was a polygon file of the outlines of the houses.  Once these two datasets were in ArcGIS, they could be used to generate DEMs.


The first DEM that was created was a TIN (Triangulated Irregular Network).  This was generated using the 3D Analyst tools in ArcGIS.  A TIN is one of the more basic DEMs.  It has the advantage of allowing the spatial resolution to vary.  In other words, areas with very little change in terrain can be accurately modeled with a few points and triangles, while areas with more significant changes in terrain can be modeled with more points and triangles.  When building this TIN, the points on the roofs of the houses were excluded.  The resulting TIN is shown in Figure 7.


Figure 7.  TIN resulting from XYZ points.
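Within each triangle, a TIN represents the terrain as a plane through the three vertices.  The following sketch shows how an elevation could be read from a single TIN triangle using barycentric weights.  It illustrates the idea only and is not the ArcGIS implementation; the function name and vertex values are hypothetical.

```python
def interpolate_in_triangle(x, y, a, b, c):
    """Elevation at (x, y) on the plane through vertices a, b, c.

    Each vertex is an (X, Y, Z) tuple.  Returns None when (x, y) falls
    outside the triangle in plan view.
    """
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("triangle vertices are collinear in plan view")
    w_a = ((b[1] - c[1]) * (x - c[0]) + (c[0] - b[0]) * (y - c[1])) / det
    w_b = ((c[1] - a[1]) * (x - c[0]) + (a[0] - c[0]) * (y - c[1])) / det
    w_c = 1.0 - w_a - w_b
    if min(w_a, w_b, w_c) < 0:
        return None  # outside this triangle
    return w_a * a[2] + w_b * b[2] + w_c * c[2]


# Hypothetical triangle lying on the plane Z = 100 + X + 2Y.
vertex_a = (0.0, 0.0, 100.0)
vertex_b = (10.0, 0.0, 110.0)
vertex_c = (0.0, 10.0, 120.0)
z_inside = interpolate_in_triangle(2.0, 3.0, vertex_a, vertex_b, vertex_c)
z_outside = interpolate_in_triangle(20.0, 20.0, vertex_a, vertex_b, vertex_c)
print(z_inside, z_outside)
```

Evaluating a function like this at the center of every cell of a regular grid is, in essence, what converting a TIN to a raster involves.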


This TIN was also converted into a raster using the 3D Analyst tools.  Converting the TIN to a raster does not give any additional information, but rasters are used as input for a number of other tools in ArcGIS.  The raster is shown in Figure 8.


Figure 8.  TIN converted to raster.


It is interesting that the road appears quite prominently in these DEMs.  This demonstrates that the triangles (which were selected by Delaunay triangulation) represent the terrain quite well.  If a more accurate representation of the terrain is desired, breaklines can be used to define areas where there is a sharp change in the slope of the terrain.  For this example, breaklines were not used.  Also, it was necessary to clip out the points on the tops of the houses from the TIN.  If these points are not clipped out, the Delaunay triangulation creates some flawed triangles (for example, points on the ground away from the base of a house are connected to points on the top of the house).  So, the synthetic dataset showed that some features, like houses, are somewhat difficult to work with.
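A Delaunay triangulation is defined by the property that no point lies inside the circumcircle of any triangle.  For a TIN this test is typically applied to the X and Y coordinates only, which is one way to understand why rooftop points and nearby ground points can end up in the same triangle.  The sketch below implements the circumcircle test; the function name and the sample points are hypothetical, and no claim is made that ArcGIS evaluates the test in this way.

```python
def in_circumcircle(a, b, c, d):
    """True if plan-view point d lies strictly inside the circumcircle of a, b, c."""
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    if orient == 0:
        raise ValueError("triangle vertices are collinear")
    if orient < 0:
        b, c = c, b  # put the vertices in counterclockwise order
    adx, ady = a[0] - d[0], a[1] - d[1]
    bdx, bdy = b[0] - d[0], b[1] - d[1]
    cdx, cdy = c[0] - d[0], c[1] - d[1]
    ad2 = adx * adx + ady * ady
    bd2 = bdx * bdx + bdy * bdy
    cd2 = cdx * cdx + cdy * cdy
    # 3x3 determinant expanded along its third column.
    det = (
        ad2 * (bdx * cdy - bdy * cdx)
        - bd2 * (adx * cdy - ady * cdx)
        + cd2 * (adx * bdy - ady * bdx)
    )
    return det > 0


# Hypothetical plan-view triangle and two candidate points.
tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
inside = in_circumcircle(tri[0], tri[1], tri[2], (0.5, 0.5))
outside = in_circumcircle(tri[0], tri[1], tri[2], (2.0, 2.0))
print(inside, outside)
```

Note that elevation never enters the test, so a point on a roof is treated exactly like a ground point at the same plan position.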


ArcGIS also contains tools to create contour lines and to interpolate elevation using several interpolation techniques, such as splines, inverse distance weighting, and kriging.  Each of these was used to interpolate elevation from the set of XYZ points.  These interpolations have the added benefit that they are able to generate curved surfaces, whereas a TIN generates only planar surfaces.  The interpolations are shown in Figures 9 through 11.  Again, the points on the tops of the houses were clipped out to create these DEMs.


Figure 9.  Elevation interpolation using inverse distance weighting.


Figure 10.  Elevation interpolation using kriging.


Figure 11.  Elevation interpolation using splines.
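Of the three interpolation techniques, inverse distance weighting is the simplest to state: the elevation at a location is a weighted average of the sample elevations, with weights that decrease with distance.  The sketch below assumes a power of 2 and uses every sample point, whereas ArcGIS offers additional options such as a search radius.  The function name and the sample values are hypothetical.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted elevation at (x, y).

    samples is a list of (X, Y, Z) tuples.  The default power is an assumption.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, sz in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            return sz  # exactly on a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * sz
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical elevation samples at the corners of a 10 x 10 square.
samples = [
    (0.0, 0.0, 100.0),
    (10.0, 0.0, 110.0),
    (0.0, 10.0, 120.0),
    (10.0, 10.0, 130.0),
]
z_center = idw(5.0, 5.0, samples)
z_corner = idw(0.0, 0.0, samples)
print(z_center, z_corner)
```

Because the surface is pulled toward each sample value near that sample, isolated points can produce local bumps or pits in the interpolated surface.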


It is interesting to note that the elevations that were interpolated using kriging appear to have the closest match to the contour lines.  The interpolation based on inverse distance weighting seems to create features which are not truly there.


So, this synthetic dataset has demonstrated the ability to create DEMs from a set of aerial images.

Actual Dataset


A pair of actual aerial images was also analyzed to show how this process can be applied to actual images.  The images were obtained from P2 Energy Solutions Inc.  (Thanks to Andy Longoria and Dr. Neale.)  The images are shown in Figure 12.


Figure 12.  Actual images.


A series of point correspondences was chosen on these images to perform 3D reconstruction, as was done with the synthetic images.  Because it was desired to show some change in elevation, the area around the freeway overpass was chosen for detailed treatment.  The point correspondences are shown in Figure 13.


Figure 13.  Point correspondences.


Note that the points which were located form a fairly sparse set.  If more sophisticated image processing tools (such as a stereo-plotter or automatic point correlation) were used, a much denser point set could be collected, which would result in a much more accurate DEM.  For the purposes of this report, however, the above points are sufficient to show how a DEM can be created.


Once the point correspondences were collected, the points were put into the multi-image triangulation algorithm, resulting in a set of points in an arbitrary coordinate system.  These are shown in Figure 14.


Figure 14.  3D points resulting from multi-image triangulation.


This system was rotated, shifted and scaled to match a desired coordinate system using ground control points.  Four ground control points were visible in each of the images.  These points are shown in Figure 15.


Figure 15.  Ground control points.


The image coordinates of these four points were also collected, and the 3D coordinates of the points were then rotated, shifted, and scaled to match their GPS coordinates, which are given in Table 1.


Table 1.  Coordinates of control points in North American Datum 1983, Texas State Plane Coordinate System, South Central Zone; North American Vertical Datum of 1988; U.S. Survey Feet

ID        X (ft)            Y (ft)             Z (ft)

1001      2148855.267       13750393.210       781.523
1002      2147642.219       13748882.528       760.457
1003      2149799.024       13749852.479       746.416
1004      2148318.001       13748247.835       731.077

At this point, it is worth discussing the precision of the solution from the multi-image triangulation.  The multi-image triangulation algorithm is a least-squares solution, so it gives a measure of the precision.  For this image set, the average 1-sigma error for the XYZ points was 0.43 ft (13 cm).  Assuming normally distributed errors, this means that there is approximately a 68% probability that the calculated points are within 0.43 ft (13 cm) of the actual points, and approximately a 95% probability that they are within 0.86 ft (26 cm).  This precision may or may not be acceptable, depending on the type of work being done.  Note, however, that a higher degree of precision is obtainable if more control points are selected and if corrections are made for lens distortion (no correction for lens distortion was made in the above calculations).  Higher precision can also be obtained by more accurately defining the points in 2D space.


Also, the transformation from the arbitrary coordinate system to the Texas State Plane Coordinate System is a least-squares solution.  The root mean square error for this transformation was 0.73 ft (22 cm).  This precision could also be improved by correcting for lens distortion and more precisely placing points in the 2D images.
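The quantities quoted above can be checked with a few lines of arithmetic.  The sketch below converts the 1-sigma and 2-sigma values from U.S. survey feet to centimeters and shows how a root mean square error is computed from residuals.  The residual values are invented for illustration and are not the residuals from this project; the function names are likewise hypothetical.

```python
import math

# 1 U.S. survey foot = 1200/3937 m, expressed here in centimeters.
US_SURVEY_FT_TO_CM = 120000.0 / 3937.0


def feet_to_cm(feet):
    """Convert U.S. survey feet to centimeters."""
    return feet * US_SURVEY_FT_TO_CM


def rmse(residuals):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


sigma_ft = 0.43  # average 1-sigma error reported for the XYZ points
one_sigma_cm = feet_to_cm(sigma_ft)
two_sigma_cm = feet_to_cm(2.0 * sigma_ft)

# Hypothetical control-point residuals in feet, for illustration only.
example_residuals_ft = [0.5, 0.9, 0.6, 0.8]
example_rmse_ft = rmse(example_residuals_ft)

print(round(one_sigma_cm, 1), round(two_sigma_cm, 1), round(example_rmse_ft, 2))
```

The two converted values agree with the rounded figures of 13 cm and 26 cm given in the text.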


Once the coordinates were matched to the desired coordinate system, they could be loaded into ArcGIS.  For this dataset three files were loaded into ArcGIS: one point file containing all of the points, one polyline file containing breaklines, and one polygon file with the outline of the bridge.  These are shown in Figure 16.


Figure 16.  Breaklines and points.


Figure 17 shows the TIN generated from the points and breaklines.


Figure 17.  TIN and contour lines from points and breaklines.


Figure 18 shows the conversion of this TIN into a raster.


Figure 18.  TIN converted to raster.


Again, note that the accuracy of these DEMs is limited by the number of points that were collected.  A more comprehensive collection of points would result in a more detailed DEM.  Note, however, that in both DEMs the bridge is very prominent.  This demonstrates the ability of photogrammetry to model 3D terrain features from aerial photographs.


The bridge did cause some complications.  First, the breaklines along the bridge had to be manually defined; otherwise the Delaunay triangulation would lead to spurious triangles (as was the case with the houses in the synthetic dataset).  Also, the interpolation techniques (splines, kriging, inverse distance weighting) that were used on the synthetic dataset did not give good results when applied to the actual dataset, because of the points on the top of the bridge.  If the points on the top of the bridge were clipped out, the remaining points had very little relief and the result was not very informative.  However, if the points on the top of the bridge were left in, the resulting interpolations did not accurately model the terrain.  So again it was found that certain features can be difficult to model.


Orthophoto Generation


Another useful application of photogrammetry is the generation of an orthophoto.  An orthophoto is an image that has been altered so that the features appear in the same way that they would appear on a map.  An orthophoto is created by parallel projection.  Parallel projection is a projection from 3D space to 2D space where parallel lines (lines perpendicular to the 2D space) are drawn from each feature in 3D space to the 2D surface.  In other words, an orthophoto is created by “smashing” the 3D space flat into a 2D surface.  This is shown in Figure 19.    


Figure 19.  Parallel projection to create an orthophoto.

Note that when an image is captured with a camera, the resulting image is not an orthophoto.  This is because an actual image is not a parallel projection; it is best approximated by a perspective projection.  Because of the relief of the objects in the image, lines which should appear straight in an orthophoto are not straight in an actual image.  Also, note that some features which are visible in the actual image (such as the sides of buildings) should not be visible in an orthophoto.  This is demonstrated in Figure 20.


Figure 20.   An actual image vs. an orthophoto.


In the above figure, the image on the left is an actual image.  Note that the sides of the houses are visible in the image on the left while they are not visible in the orthophoto on the right.  Also note that the line on the southeast of the image is not straight in the actual image on the left due to the relief of the terrain.  In the orthophoto on the right this line is straight, like a map.


In order to generate an orthophoto from a set of actual images, it is necessary to first create a DEM from the images.  For this report, the TIN method will be used to generate an orthophoto.  If a TIN is generated, the 3D triangles of the TIN can be projected by parallel projection onto a 2D surface.  Then, the triangular portion of the actual image associated with each triangle of the TIN is re-sampled to the size of the projected TIN triangle on the orthophoto.  This is shown in Figure 21.

Figure 21.  Generation of orthophoto.
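One way to picture the re-sampling step is as an inverse mapping: for each cell of the orthophoto, the elevation is read from the TIN triangle containing that cell, and the resulting 3D point is projected into the actual image to find where the cell's color should be sampled.  The sketch below follows that idea for a single triangle.  It is a per-cell variant written for illustration and is not the procedure used by the viewer described in this report, which re-samples triangular image patches.  It assumes the collinearity sign convention x = -f (r11 dX + r12 dY + r13 dZ) / (r31 dX + r32 dY + r33 dZ) with xo = yo = 0, and all names and values are hypothetical.

```python
def barycentric_weights(x, y, a, b, c):
    """Plan-view barycentric weights of (x, y) in triangle a, b, c."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (x - c[0]) + (c[0] - b[0]) * (y - c[1])) / det
    w_b = ((c[1] - a[1]) * (x - c[0]) + (a[0] - c[0]) * (y - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def ground_to_image(point, camera_center, rotation, f):
    """Collinearity projection with the principal point at the origin."""
    dX, dY, dZ = (point[i] - camera_center[i] for i in range(3))
    r = rotation
    denom = r[2][0] * dX + r[2][1] * dY + r[2][2] * dZ
    x = -f * (r[0][0] * dX + r[0][1] * dY + r[0][2] * dZ) / denom
    y = -f * (r[1][0] * dX + r[1][1] * dY + r[1][2] * dZ) / denom
    return x, y


def orthophoto_lookup(X, Y, triangle, camera_center, rotation, f):
    """Image coordinates to sample for the orthophoto cell at map position (X, Y).

    Returns None when (X, Y) is outside the triangle in plan view.
    """
    a, b, c = triangle
    w_a, w_b, w_c = barycentric_weights(X, Y, a, b, c)
    if min(w_a, w_b, w_c) < 0:
        return None
    Z = w_a * a[2] + w_b * b[2] + w_c * c[2]  # elevation from the TIN plane
    return ground_to_image((X, Y, Z), camera_center, rotation, f)


# Hypothetical TIN triangle and vertical photograph.
triangle = ((0.0, 0.0, 100.0), (10.0, 0.0, 110.0), (0.0, 10.0, 120.0))
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sample_at = orthophoto_lookup(2.0, 3.0, triangle, (0.0, 0.0, 1000.0), identity, 0.15)
print(sample_at)
```

The elevation enters through the denominator of the projection, which is why an orthophoto cannot be produced correctly without a DEM.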


So, by beginning with the TIN created from the actual images, an orthophoto could be generated using the above process.  This was accomplished by loading the TIN and images into a 3D viewer developed at the CAIL Laboratory at Utah State University.  The viewer generates the TIN and then re-samples the triangular portions of the images onto the surfaces of the TIN.  Then the TIN could be viewed as a parallel projection, resulting in an orthophoto.  The result is shown in Figure 22.


Figure 22.  Orthophoto from actual images.




Conclusion


The above datasets have shown how photogrammetry can be used to generate DEMs and orthophotos from aerial images.  They have also shown that certain features (houses, bridges) can cause complications in the generation of a TIN and in elevation interpolation, and that the accuracy and precision of a model depend on the number of points that are obtained.


References


Wolf, Paul R., and DeWitt, Bon A., 2000.  Elements of Photogrammetry.  3rd Edition.  McGraw-Hill, Boston.  pp. 187-207, 293-365.