Read the original paper
a new GPS position filtering algorithm for unaided GPS devices
« Life is really simple, but we insist on making it complicated. » (Confucius)
This document is a draft and is not complete
Abstract
In this paper, I briefly present a new GPS filter to be used with unaided GPS devices and thus not requiring any advanced sensors coupling nor DGPS.
The new filter proved itself quite efficient in an urban environment with dense network, when used on the output of a standard Bluetooth GPS unit (Sirf III) and before feeding a road reduction filter with positions.
The road reduction filter has since been rewrote and its overall complexity has dropped a lot, since this filter now feeds it with smooth, noisefree positions.
The filter itself is trivial, but I’ve never seen it published before so here I am.
Motivation
For the profane, a GPS unit asks some satellites its current position then give this position to the user with a great accuracy. Some people who have been watching 1998’s blockbuster Enemy of the State even believe satellites can get the position of every cellphone within the 1meter range.
Real GPS is far behind fiction, however. No satellite knows about your position. Satellites have a pulsing atomic clock and current time is the only piece of information they transmit.
It is then possible to combine several satellite signals and, using advanced mathematics based on their clocks, compute an approximation of the GPS unit position in three dimensions.
Several phenomena significantly alter satellites’ signal, making it difficult to get an accurate position: High signal noise and multipath.
Noise
Interferences and attenuation are mainly due to current weather conditions. Clouds, rain, snow, thunder, and high atmosphere particles can all alter the perceived signal, decreasing the SNR and making it barely useful for any further computation.
The Multipath Nightmare
Multipath describes what occurs when a signal gets reflected by a highly reflective surface, so that the GPS unit treats the reflected signal as valid instead of the original one (which might be slightly corrupted by causes exposed above), thus making use of an inexisting, « ghost » satellite position when doing calculus. This occurs almost all the time if you’re using your GPS unit in an urban environment, when your car passes near a building with large reflective windows.
By combining the above problems, and with very little irony, one can easily and accurately guess the position returned by a GPS unit at any time t using the following formula:
position(t) = random() (1)
It should be noted that, while most of the GPS units out there incorporate a Kalman filter or one of its variants, this does not help much. Positions remain chaotic at best when the GPS unit does not move, and the filter never addresses the multipath issue.
Observations
The Cumulative Displacement Filter, or CDF, presented in this paper is based upon the following (empirical or not) observations:

The Kalman Filter tries to estimate (and, hopefully, compensate) the positionning error over time.
The Kalman Filter is an integration filter which tries to guess a correct signal from a noisy one and an estimation of the noise. It works with present and past measurements.
The idea behind the CDF is to help him a bit by also using future measurements when computing a position.

When averaging all the positions received during the last N seconds, the resulting position’s accuracy is far better than the position the GPS unit sent us N/2 seconds ago.
Why? This is because, N/2 seconds ago, the GPS unit could only feed its Kalman Filter with measurements from N to N/2 seconds ago.
By averaging all the last positions up to N seconds ago, we incorporate the error estimation of measurements from N/2 seconds ago to now in the computation of the position of the vehicle N/2 seconds ago, leading to a better local error approximation (see Figure 2 below).
Upper half: Position got N/2 seconds ago directly from the Kalman Filter. Since the KF can only work with past measurements it can’t remove all the noise from the signal.
Lower half: The average position during the last N seconds is used as the position of the vehicle N/2 seconds ago. As we take subsequent positions into account when computing this position, we get rid of most of the noise since the KF has updated itself based on new measurements.

GPS unit reports small speed values when at very low speeds, i.e. when the vehicle has stopped.
If the reported speed is somewhere between 0 and 2 meters per second, a good rule is to assume the vehicle has stopped. The reported speed and heading only serve as small adjustements made by the Kalman Filter on the current position.

Positions accuracy gets better when speed is high.
 Although the position at any given time T is highly inacurate (and sometimes 100 meters away from the real one), the cumulative relative displacement from an old position received N/2 seconds ago is accurate.
By cumulative relative displacement, I mean the sum of every displacement vectors build from the heading and speed components of positions received since N/2 seconds. In general, displacement vectors are almost noisefree, but what the GPS unit sends can either be the real vehicle speed and heading or just small adjustements to a previous erroneous info, so it’s better to sum that anyway.
All the above remarks lead me to the conclusion that the Kalman Filter used in most of the GPS units I’ve been dealing with (mainly Sirf III and Qualcomm chips) can’t cope with some temporal, normal, high frenquency noise (I don’t know if this noise passed throught the Kalman Filter of if the latter is responsible for introducing it in its output, but that makes no difference at all).
By accepting this bold statement*, I finally could write a postkalman filter that does a great job when used for car navigation: the properties of the remaining noise make a mean filter the perfect candidate. Therefore, the CDF is based upon such a mean filter.
*Not to be confused with a bald stateman, although bald statemen are known to issue bold statements quite often.
The Cumulative Displacement Filter
Architecture
The CDF is meant to work on positions received from the GPS unit (i.e. after the GPS unit’s Kalman Filter’s works), giving the architecture pictured in figure 3.
Algorithm
Basic algorithm:
TODO: put here a diagram explaining how things work
Pseudocode implementation:
TODO: a bit of cleaning and explications
 Start with EXTRAPOLATING and AVAILABLE both set to false
 Constants: MAX_HDOP, SPEED_MAX_AGE, MIN_SPEED, MAX_SPEED, HIST_SIZE
 If current position’s HDOP > MAX_HDOP, set validity to false
 If position is not valid, bail out
 Add position’s speed to SPEED_HISTORY
 Remove speeds older than SPEED_MAX_AGE seconds from SPEED_HISTORY
 AVERAGE_SPEED = average of all speeds currently in SPEED_HISTORY
 If AVERAGE_SPEED < MIN_SPEED, AVERAGE_SPEED = MIN_SPEED
 If AVERAGE_SPEED > MAX_SPEED, AVERAGE_SPEED = MAX_SPEED
 If position’s speed >= AVERAGE_SPEED:
 If EXTRAPOLATING is true, clear POSITION_HISTORY
 Set EXTRAPOLATING to false
 Add position to POSITION_HISTORY
 Remove oldest positions from POSITION_HISTORY so that’s there no more than HIST_SIZE positions stored.
 Else, if POSITION_HISTORY is not empty:
 [Mx,My] = average of positions currently in POSITION_HISTORY
 [Dx,Dy] = sum of displacements from each position currently in POSITION_HISTORY to the next
 Normalize [Dx,Dy]
 [M,D] is the linear regression built from POSITION_HISTORY
 Project current position onto [M,D]
 If EXTRAPOLATING is true, use last found projection if the new one is going backward
 Set EXTRAPOLATING to true
 Save the new projection for preventing the next position to go backward
 Set the current position to its projected image found above.
 Set the current position’s heading to D
 If AVAILABLE is true, check for stopped vehicle:
 If distance from current position to LAST_POSITION is too low, then set current position to LAST_POSITION
 Set LAST_POSITION to current position
 Set AVAILABLE to true
Implementation
A sample implementation is provided here.
TODO: cleaning the code a bit
The program parse a file containing raw NMEA sentences as sent by a GPS unit and dumps either unfiltered or CDFfiltered KMLformatted data on stdout.
It gives pretty good results while only working on the three last positions average (HIST_SIZE=3). Increasing the size of the position history improves the general accuracy of found positions, up to a certain size. Increasing it too much, however, tends to consider lower frequencies as noise, and produces smoother but innacurate results.
Program commandline parameters:
f path  Read file at given path (e.g. /some/where/nmea.txt) 
sf  Activate the CDF filter 
Results
The results presented below feeds the aforementionned program with this sample NMEA file.
The program outputs can then be reviewed in Google Earth:
Filtered positions
Figure 4 shows unfiltered and CDFfiltered positions. Note how positions are evenly spaced and how the route is already close to the road. This makes the road reduction filter’s work far easier than with unfiltered data.
Note how the filter also gets rid of the multipath effect in the leftmost part of the picture. The CDF will likely give accurate positionning when multipath occurs, if the vehicle moves at low speed and a high speed position was acquired out of the area where multipath takes place.
Heading consistency
Figure 5 shows the reported positions and heading for a stopped vehicle. When not filtered, the GPS unit reports a small amount of brownian motion.
The CDF addresses this by totally ignoring new positions when speed approaches zero and by averaging other lowspped positions.
Remaining issues
The filter I presented in this paper is far from perfect but does a decent job at very low cost.
Among the remaining issues, one could note that the filtered data sometimes gets farther away from the vehicle’s real position than the raw, unfiltered data. This is because CDF cares a lot about heading consistency and also do not allow the car to go backward when at low speeds. Such problem is depicted in figure 6.
However, the primary use for the CDF is to feed a road reduction filter of some sort, and it’s better to feed it with a smooth and consistent motion than to let it try to decide which road to select based on uncontinuous, random, and jumpy positions.
Conclusion
I presented a simple way to eliminate most of the residual noise in positions sent by typical GPS units, relying only on very basic mathematics such as averaging twodimensionnal vectors.
Results look good. However I only tested the CDF on NMEA data acquired in Paris, France. The filter might work on this kind of road network only, as Paris is an old city with a very dense road network (i.e. a lot of small streets only a few meters away from each other) and streets crossing each other with any arbitrary angle. Furthermore, the french speed limits may impact typical vehicle courses, leading to an incorrect behaviour of the CDF with the constants used in the reference implementation: those constants may need to be further tuned to match other vehicle behaviors.
Future work
The CDF requires some constants to be finetuned by hand. An improvement would be to make it entirely adaptative. This requires to find a large NMEA dataset produced with various vehicles on very different road networks in different weather conditions to begin with.
A part of the multipath effect is eliminated by the CDF, but some remains. I can’t see any way to get rid of the remaining error caused by the multipath effect without using the road network information, and this is the purpose of a road reduction filter.
Links
TODO: Put links about road reduction filters