I’m a recovering scientist managing a remote sensing group at the NOAA Coastal Services Center. In my spare time, when I’m not torturing staff, I try to fit in some technical work on lidar processing and distribution.
Submitted by Kirk Waters on February 26, 2013
The TIFF format has been around for quite some time. The current specification (TIFF Revision 6.0) was published in 1992. It has been a highly useful format for lots of disciplines, including remote sensing. The format is very flexible and uses information “tags” in the file to describe the file contents. You can even add new user-defined tags to hold information that nobody thought of in 1992. However, unless a tag becomes incorporated into the specification as a standard, common software is unlikely to understand it. This is where I run into trouble trying to use the TIFF format for lidar-derived DEMs, because there is no standard tag for a “no data” value.
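To make the tag idea concrete, here’s a minimal sketch (plain Python, standard library only) that builds just the header and one tag directory of a little-endian TIFF and reads the tags back. It is deliberately simplified: a real TIFF also carries image data, and tag values wider than four bytes live at an offset elsewhere in the file. As an aside, GDAL works around the missing no-data tag with its own private ASCII tag, GDAL_NODATA (tag 42113), which is exactly the kind of user-defined tag that other software may or may not understand.

```python
import struct

def build_demo_tiff():
    """Build the header and one IFD of a little-endian TIFF (no pixel data)."""
    entries = [
        (256, 3, 1, 512),  # ImageWidth, type SHORT, count 1, value 512
        (257, 3, 1, 512),  # ImageLength
        (339, 3, 1, 3),    # SampleFormat = 3 (IEEE floating point)
    ]
    ifd = struct.pack("<H", len(entries))          # number of directory entries
    for tag, typ, count, value in entries:
        ifd += struct.pack("<HHII", tag, typ, count, value)
    ifd += struct.pack("<I", 0)                    # offset of next IFD: none
    header = struct.pack("<2sHI", b"II", 42, 8)    # byte order, magic 42, IFD at byte 8
    return header + ifd

def read_tags(buf):
    """Parse the first IFD and return {tag: value} (simplified: ignores type widths)."""
    order = "<" if buf[:2] == b"II" else ">"
    magic, ifd_offset = struct.unpack(order + "HI", buf[2:8])
    assert magic == 42
    (n,) = struct.unpack(order + "H", buf[ifd_offset:ifd_offset + 2])
    tags = {}
    for i in range(n):
        off = ifd_offset + 2 + 12 * i
        tag, typ, count, value = struct.unpack(order + "HHII", buf[off:off + 12])
        tags[tag] = value
    return tags

tags = read_tags(build_demo_tiff())
print(tags)  # {256: 512, 257: 512, 339: 3}
```

The point is just that any reader can walk the directory and skip tags it doesn’t recognize, which is why adding a tag is easy but getting everyone to honor it is not.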
Our Digital Coast data system will generate DEMs from the lidar points, and a very common output choice is a 32-bit floating point TIFF. However, it is the “nature of the beast” that there will be areas in your rectangular DEM tile where there are no lidar data to provide an elevation value. This is particularly true in coastal areas, because relatively few data sets have both topography and bathymetry data. Even those that do invariably have gaps somewhere and don’t extend globally. We still have to put a value in those cells, though! So, what do you put?
I’ve seen a few different values out there to represent no data. If you’re using a file format that allows specifying what value was used, it doesn’t really matter as long as it won’t conflict with real data. Examples include -32767 or -9999. I believe the USGS NED uses zero, but that won’t work for us since we include data below zero. The only thing that made logical sense to us was to use one of the IEEE Not a Number (NaN) values. I assumed software packages would automatically pick up that the cell was not valid and wouldn’t use it when figuring out a scale to color the image. You know what they say about assumptions!
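For the curious, a NaN in the IEEE 754 binary32 format is any bit pattern with an all-ones exponent field and a nonzero mantissa, and NaN never compares equal to anything, including itself. A quick standard-library sketch:

```python
import math
import struct

# Pack Python's NaN into an IEEE 754 binary32 value and inspect the bits.
bits = struct.unpack(">I", struct.pack(">f", float("nan")))[0]
exponent = (bits >> 23) & 0xFF   # 8-bit exponent field
mantissa = bits & 0x7FFFFF       # 23-bit mantissa field
print(exponent == 0xFF and mantissa != 0)  # True: the definition of NaN

# NaN never compares equal, so software has to test with isnan(),
# not with an equality check against a stored no-data value.
print(float("nan") == float("nan"))  # False
print(math.isnan(float("nan")))      # True
```

That last property is part of the appeal, and also the catch: a program can’t find the NaN cells with a simple `value == nodata` test, which may be why some packages stumble over NaN-filled rasters.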
For some software packages, my assumption worked out (e.g., Global Mapper). However, for the package most of our constituents use, Esri ArcGIS, it didn’t. ArcGIS isn’t the only software that didn’t like my solution, but it’s the one I have to worry about most. When you first bring a TIFF image with IEEE NaN values into ArcGIS, you’ll see an image with just two colors/tones and crazy numbers on the scale bar (see Figure 1). The range shown is essentially the maximum range for 32-bit IEEE 754 standard floating point numbers. So, what is the trick to get ArcGIS to work right? It turns out that if you do something to make ArcGIS calculate the statistics of the image, it figures out that those aren’t really values and scales the image properly. For instance, you can tell it to stretch the image with a two-standard-deviation stretch (see Figure 2). If you do that, Figure 1 becomes Figure 3 and all seems right with the world again. Well, at least closer to right. The no-data cells end up colored the same as the low end of the color ramp, even though a query on those cells shows “no data.” If anyone knows how to fix that, I’d love to hear it. Or if you have a better answer for what to put in those TIFF cells.
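What computing statistics effectively has to do can be mimicked in a few lines: filter out the NaN cells first, then take the minimum and maximum over what’s left. A minimal sketch with made-up cell values (not real DEM data):

```python
import math

# Hypothetical DEM cell values: a few elevations plus NaN "no data" cells.
cells = [2.5, float("nan"), 4.0, -1.0, float("nan"), 0.7]

# A naive min()/max() over a list containing NaN is unreliable, because
# every comparison involving NaN is False. Filter with isnan() first.
valid = [v for v in cells if not math.isnan(v)]
print(min(valid), max(valid))  # -1.0 4.0
```

Any package that skips this filtering step will fold the NaN cells into its display range, which is consistent with the wild scale-bar numbers in Figure 1.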
Figure 1. A 32-bit floating point TIFF containing IEEE NaN values when first read into ArcGIS. The black areas have no data due to either building removal or water (Charleston Harbor).
Figure 2. Screen shot of computing statistics. Setting the stretch type from “None” to “Standard Deviations” will trigger the compute statistics question.
Figure 3. The same data shown after computing statistics and using a rainbow color ramp. Note that the range of values in the legend now makes sense.