GEOG 414/514: Advanced Geographic Data Analysis

Scatter-diagram smoothing (Nonparametric regression)

Scatter-diagram smoothing involves drawing a smooth curve on a scatter diagram to summarize a relationship, in a fashion that initially makes few assumptions about the form or strength of the relationship. It is related to (and is a special case of) nonparametric regression, in which the objective is to represent the relationship between a response variable and one or more predictor variables, again in a way that makes few assumptions about the form of the relationship. In other words, in contrast to "standard" linear regression analysis, no assumption is made that the relationship is represented by a straight line (although one could certainly think of a straight line as a special case of nonparametric regression). Another way of looking at scatter-diagram smoothing is as a way of depicting the "local" relationship between a response variable and a predictor variable over parts of their ranges, which may differ from a "global" relationship determined using the whole data set. (And again, the idea of "local" as opposed to "global" relationships has an obvious geographical analogy.)

A review of global fitting (e.g. linear regression)

In ordinary linear regression analysis, the objective can be considered to be drawing a line through the data in an optimal way, where the parameters (regression coefficients) are determined using all of the data, i.e. they are globally determined. However, it is also possible to think of the line as connecting the points that, for each value of X, represent the local density maxima of Y; it just happens that these local maxima are arranged along a straight line.
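As a hedged illustration of the global-versus-local view, the sketch below fits a straight line with lm() to simulated data and compares it with the mean of Y inside narrow bins of X. The data, the seed, the bin width, and all variable names are assumptions made for illustration. Bin means are used as a simple stand-in for the local density maxima of Y, which is reasonable only when the scatter about the line is symmetric.

```r
# Simulated data (hypothetical): a linear relationship plus symmetric noise
set.seed(1)
x <- runif(500, 0, 10)
y <- 2 + 0.5 * x + rnorm(500, sd = 1)

# Global fit: both coefficients are determined using all 500 points
fit <- lm(y ~ x)
coef(fit)

# Local summaries: mean of y within ten equal-width bins of x
bins <- cut(x, breaks = seq(0, 10, by = 1), include.lowest = TRUE)
local_mean <- tapply(y, bins, mean)
bin_mid <- seq(0.5, 9.5, by = 1)

# Height of the global line at each bin midpoint
global_at_mid <- predict(fit, newdata = data.frame(x = bin_mid))

plot(x, y, pch = 16, col = "gray")
abline(fit, lwd = 2)                                  # global straight line
points(bin_mid, local_mean, pch = 19, col = "red")    # local bin means
```

When the underlying relationship really is linear, the red points should fall close to the fitted line; when it is not, the two summaries diverge, which is the situation scatter-diagram smoothing is meant to reveal.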
Loess curves

A bivariate smoother is a function or procedure for drawing a smooth curve through a scatter diagram. Like linear regression (in which the "curve" is a straight line), the smooth curve is drawn in such a way as to have some desirable properties. In general, the properties are that the curve indeed be smooth, and that locally, the curve minimize the variance of the residuals or prediction error. The bivariate smoother used most frequently in practice is known as a "lowess" or "loess" curve. The acronyms are meant to represent the notion of locally weighted regression, a curve- or function-fitting technique that provides a generally smooth curve, the value of which at a particular location along the x-axis is determined only by the points in that vicinity. The method consequently makes no assumptions about the form of the relationship, and allows the form to be discovered using the data itself. (The difference between the two acronyms or names is mostly superficial, but there is an actual difference in R: there are two different functions, lowess() and loess(), which will be explained below.)

The mechanics of loess:
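The idea of a locally weighted regression can be sketched in R as follows. This is a minimal illustration under stated assumptions, not a reproduction of the built-in algorithms: it assumes tricube neighborhood weights, a local straight-line fit, and a span expressed as the fraction of points used in each neighborhood. The function name local_fit and the simulated data are hypothetical, and refinements such as robustness iterations are omitted.

```r
# Fitted value of the smooth curve at a single target location x0.
# Assumptions (not from the text): tricube weights, degree-1 local fit.
local_fit <- function(x0, x, y, span = 0.5) {
  n <- length(x)
  k <- max(3, ceiling(span * n))            # number of neighbors to use
  d <- abs(x - x0)                          # distance of each point from x0
  h <- sort(d)[k]                           # distance to the k-th nearest point
  w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)  # tricube weights, zero beyond h
  fit <- lm(y ~ x, weights = w)             # weighted least-squares line
  unname(predict(fit, newdata = data.frame(x = x0)))
}

# Simulated data (hypothetical): a curved relationship plus noise
set.seed(2)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.2)

# Repeat the local fit at each location along the x-axis
grid <- seq(0.5, 9.5, by = 0.25)
smooth <- sapply(grid, local_fit, x = x, y = y, span = 0.2)

plot(x, y, pch = 16, col = "gray")
lines(grid, smooth, lwd = 2)
```

Each value on the curve depends only on the points near that location, with nearer points weighted more heavily. Increasing span widens the neighborhood and gives a smoother curve.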
Scatter diagram smoothing in R

Note that there are actually two versions of the lowess or loess scatter-diagram smoothing approach implemented in R. The former (lowess) was implemented first, while the latter (loess) is more flexible and powerful.
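A short sketch of how the two functions are typically called, using simulated data (the data, seed, span values, and object names are assumptions for illustration). lowess() takes x and y vectors and returns a list of smoothed coordinates, while loess() uses a formula interface and returns a model object that can be passed to predict().

```r
# Simulated data (hypothetical)
set.seed(3)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)
dat <- data.frame(x = x, y = y)

# lowess(): f is the smoother span (fraction of points in each neighborhood).
# The result is a list with components x (sorted) and y (smoothed values).
lo1 <- lowess(dat$x, dat$y, f = 0.3)

# loess(): formula interface; span and degree control the local fit.
lo2 <- loess(y ~ x, data = dat, span = 0.3, degree = 2)

# Evaluate the loess curve on a grid inside the range of the data
grid <- data.frame(x = seq(0.5, 9.5, length.out = 100))
lo2_hat <- predict(lo2, newdata = grid)

plot(dat$x, dat$y, pch = 16, col = "gray", xlab = "x", ylab = "y")
lines(lo1, lwd = 2, col = "blue")               # lowess curve
lines(grid$x, lo2_hat, lwd = 2, col = "red")    # loess curve
```

The two curves need not coincide even with the same span, because the defaults differ: lowess() fits local lines and applies robustness iterations, whereas loess() by default fits local quadratics without them.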
Other bivariate smoothers

Loess is one of a number of smoothers (including linear regression as an end-member) that can be used. The different smoothers vary in the assumptions they make about
The other scatter diagram smoothers include a straight, or "least-squares" line, a low-order polynomial least-squares line, and the "smoothing spline". Each can be viewed as a special case of the more flexible loess-type smoothers in which the curve is very simple. The best way to understand these different smoothers is to compare them:
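One way such a comparison might be set up in R is sketched below, overlaying the four smoothers on the same simulated scatter diagram. The data, seed, polynomial order, span, and object names are all assumptions chosen for illustration.

```r
# Simulated data (hypothetical): trend plus oscillation plus noise
set.seed(4)
x <- sort(runif(200, 0, 10))
y <- sin(x) + 0.2 * x + rnorm(200, sd = 0.3)
grid <- data.frame(x = seq(0.5, 9.5, length.out = 100))

fit_line   <- lm(y ~ x)               # straight least-squares line
fit_poly   <- lm(y ~ poly(x, 3))      # low-order (cubic) polynomial
fit_spline <- smooth.spline(x, y)     # smoothing spline, default smoothing
fit_loess  <- loess(y ~ x, span = 0.3)  # loess curve

plot(x, y, pch = 16, col = "gray")
lines(grid$x, predict(fit_line, grid),  lwd = 2, col = "black")
lines(grid$x, predict(fit_poly, grid),  lwd = 2, col = "blue")
lines(predict(fit_spline, grid$x),      lwd = 2, col = "darkgreen")
lines(grid$x, predict(fit_loess, grid), lwd = 2, col = "red")
legend("topleft", lwd = 2,
       legend = c("line", "cubic polynomial", "smoothing spline", "loess"),
       col = c("black", "blue", "darkgreen", "red"))
```

With data like these, the straight line captures only the overall trend, while the more flexible smoothers follow the local structure to varying degrees.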
The various smoothers can be summarized as follows:
