Radius of Gyration

Radius of gyration is a metric that quantifies how widely a set of locations is spread around a center location. Its applications range from structural engineering to molecular physics. Since it deals with locations, it can be applied to geographic data as well. I recently came across it in some global mobility studies where the goal was to characterize the travel patterns of individuals. In those papers, the metric indicates whether a person tends to travel long distances or not. In my research, where I am interested in the geographic data contributions of volunteer mappers, I found it extremely useful for deciding whether a user's overall contribution shows local or global patterns. For many years, local knowledge was considered the main advantage of this so-called user-generated geographic information. Local guys know the place, so let them draw maps, let them take photos, and the product will be accurate. While this is most probably true, it also seems that some of these guys like to do the same thing in distant places, so there might be factors other than localness that make these data sources accurate and therefore extremely valuable. Everything is up to the people who contribute, so the ultimate goal is still to understand their behavior. Now, enough of the crazy talk. Let's do some fancy math and coding.

Actually, I lied. No fancy math here. Luckily, radius of gyration is fairly easy to calculate. The formula is

$$r_g = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \lVert p_i - p' \rVert^2}$$

where $n$ is the number of locations, $p_i$ is a location (with two-dimensional coordinates), and $p'$ is the center location. For my purpose, the center of mass is calculated as the average of all locations. A different approach would be to consider $p'$ as a home location, but that is difficult to determine automatically. Basically, the idea is to extract a center location and then sum the squared distances to it. Together with the rest of the formula (dividing by $n$ and taking the square root), it can also be seen as the root mean square distance from the center. Let’s stick with the first name, though. I do not wish to scare more people away.
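To make the formula concrete, here is a minimal Python sketch for plain 2D points with planar (Euclidean) distances. The function name `radius_of_gyration` and the sample square are my own illustration, not part of the database function shown later.

```python
import math

def radius_of_gyration(points):
    """Root mean square distance of 2D points from their average location."""
    n = len(points)
    if n == 0:
        return 0.0
    # Center location p': the average of all locations
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Sum of squared distances from each location to the center
    total = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(total / n)

# Four corners of a 2 x 2 square: every corner is sqrt(2) away from the center
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(radius_of_gyration(square))
```

Because all four corners are equally far from the center, the result here is just that common distance. With real data, the far-away points dominate because the distances are squared.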

In my opinion, it can be computed for individual points, for lines, and for polygons as well. For example, a GPS trajectory is a set of lines with each node representing an individual measurement. In this case, we can simply extract all nodes from the lines and treat them the same way as points. Polygons are a little bit trickier. I would not recommend calculating radius of gyration against all nodes if those are not evenly distributed. It can be argued that false values are obtained if one “part” of the polygon has more nodes than the others: the extra nodes pull the center toward that part, so the radius is skewed in one direction and does not represent the true spread. On the other hand, if the polygon represents the activity area of a person as a bounding box, computing radius of gyration might make sense. I might elaborate on this a bit more later on.
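The effect of unevenly distributed nodes is easy to reproduce. The sketch below is an illustration with made-up coordinates: it compares a square described by its four corners with the same square whose bottom edge has been digitized with 50 nodes. The helper names `center`, `edge`, and `radius_of_gyration` are hypothetical.

```python
import math

def center(points):
    """Average location of a list of (x, y) nodes."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def radius_of_gyration(points):
    """Root mean square distance of the nodes from their average location."""
    cx, cy = center(points)
    total = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(total / len(points))

def edge(start, end, k):
    """k evenly spaced nodes from start (included) to end (excluded)."""
    (x0, y0), (x1, y1) = start, end
    return [(x0 + (x1 - x0) * i / k, y0 + (y1 - y0) * i / k) for i in range(k)]

# A 10 x 10 square described by its four corners only
corners = [(0, 0), (10, 0), (10, 10), (0, 10)]

# The same square, but the bottom edge carries 50 nodes instead of one
dense_bottom = edge(corners[0], corners[1], 50) + corners[1:]

print(center(corners), radius_of_gyration(corners))
print(center(dense_bottom), radius_of_gyration(dense_bottom))
```

The shape is identical in both cases, yet the center of the second node set sits close to the bottom edge and the radius comes out much smaller. The difference comes purely from how the boundary was digitized.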

Finally, I’d like to share a code snippet, a function stored in my PostgreSQL database that can be run against points, lines (even multilines), and polygons. Make sure PostGIS is enabled, not just for the geometries themselves but for calling the spatial functions as well. This is just an initial version, though. I have tested it on Tweets, GPS trajectories, and bounding boxes. The results, however, have not yet been evaluated, so there may be some errors. It was not designed to run smoothly on any dataset but rather to fit my needs. Anyway, you get the idea, so feel free to play around with it.


-- Function to compute radius of gyration
CREATE OR REPLACE FUNCTION comp_gyration(usr text, tab character varying)
 RETURNS SETOF double precision AS
$BODY$
DECLARE
 rec geometry;
 mass_center geometry;
 line geometry;
 count int;
 count_points int;
 count_geometries int;
 points_num int;
 geometry_num int;
 point geometry;
 tmp int;
 sum float;
 gyration float;

BEGIN
-- radius of gyration = sqrt(1/n * sum(distance(point, center of mass)^2))
-- n: count, sum: sum
-- Calculate center of mass
-- Use subquery. ST_Dump can't be nested with ST_Collect
 EXECUTE 'SELECT ST_Centroid(ST_Collect(foo.geom2)) FROM (SELECT (ST_DumpPoints(geom)).geom as geom2 FROM ' || tab || ' WHERE key = ' || quote_literal(usr) || ')as foo;' INTO mass_center;
 sum := 0;
 count := 0;
-- For each geometry that belongs to a user
 FOR rec in EXECUTE 'SELECT geom FROM ' || tab || ' where key =' || quote_literal(usr) || ';'
 LOOP
-- Handle multilines
 CASE ST_GeometryType(rec)
 WHEN 'ST_MultiLineString' THEN
-- Extract each linestring
 geometry_num := ST_NumGeometries(rec);
 count_geometries := 1;
 LOOP
 count_points := 1;
 line := ST_GeometryN(rec, count_geometries);
 points_num := ST_NPoints(line);
 LOOP
-- calculate distance from each point to the center of mass. add to sum
 sum := sum + ST_Distance(ST_PointN(line, count_points), mass_center, false)^2;
 count_points := count_points+ 1;
-- get next point, exit at last
 count := count + 1;
 IF count_points > points_num THEN
 EXIT;
 END IF;
 END LOOP;
-- move to next linestring, exit at last
 count_geometries := count_geometries + 1;
 IF count_geometries > geometry_num THEN
 EXIT;
 END IF;
 END LOOP;
-- If geometry is a single linestring, loop through all points
 WHEN 'ST_LineString' THEN

 points_num := ST_NPoints(rec);
 count_points:= 1;
 LOOP
 sum := sum + ST_Distance(ST_PointN(rec, count_points), mass_center, false)^2;
 count_points := count_points+ 1;
 count := count + 1;
 -- exit only after the last point has been added (using = would skip it)
 IF count_points > points_num THEN
 EXIT;
 END IF;
 END LOOP;
-- If geometry is polygon
 WHEN 'ST_Polygon' THEN
-- Extract boundary as line, then use method above
 rec := ST_Boundary(rec);
 IF rec IS NOT NULL THEN
 points_num := ST_NPoints(rec);
 count_points:= 1;
 LOOP
 -- exit before last point (to avoid double counting of first point)
 IF count_points = points_num THEN
 EXIT;
 END IF;
 sum := sum + ST_Distance(ST_PointN(rec, count_points), mass_center, false)^2;
 count_points := count_points + 1;
 count := count + 1;
 END LOOP;
 -- A NULL boundary adds no points; keep the running count and sum
 END IF;
-- Finally if geometry is point, just calculate sum of square distances
 ELSE
 sum := sum + ST_Distance(rec, mass_center, false)^2;
 count := count + 1;

 END CASE;
 END LOOP;
-- Return radius of gyration for the user

 IF count = 0 THEN
 gyration := 0;
 ELSE
 gyration := sqrt(CAST(1 AS float)/CAST(count AS FLOAT) * sum);
 END IF;
-- RAISE NOTICE 'gyration: %', gyration;
 RETURN NEXT gyration;
END
$BODY$
 LANGUAGE plpgsql;

SELECT comp_gyration('some_user_key', 'some_table_with_geometry');
-- Another approach:
-- ALTER TABLE some_table ADD COLUMN rad_gyr float;
-- UPDATE some_table SET rad_gyr = comp_gyration(key, 'data_table');

In my case, I stored the unique keys of users in a separate table, whereas the geometries corresponding to each user were stored in different tables. I simply updated my users table with the radius computed from the contributions in those other tables. It works.
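If you want to sanity-check results without a database, the same computation can be sketched in Python for longitude/latitude points. This is only a rough stand-in for the PostGIS function, under my own assumptions: a spherical Earth with a mean radius of 6371008.8 m, great-circle (haversine) distances, and a center taken as the plain average of the coordinates. The names `haversine_m` and `radius_of_gyration_m` are hypothetical, and the numbers PostGIS returns may differ slightly.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def radius_of_gyration_m(lonlat_points):
    """Root mean square great-circle distance (meters) from the average coordinate."""
    n = len(lonlat_points)
    if n == 0:
        return 0.0
    # Center: plain average of the coordinates
    # (an assumption that breaks down near the antimeridian and the poles)
    clon = sum(lon for lon, _ in lonlat_points) / n
    clat = sum(lat for _, lat in lonlat_points) / n
    total = sum(haversine_m(lon, lat, clon, clat) ** 2 for lon, lat in lonlat_points)
    return math.sqrt(total / n)

# Made-up example: two points one degree west and east of (0, 0) on the equator
result = radius_of_gyration_m([(-1.0, 0.0), (1.0, 0.0)])
print(result)
```

For this symmetric example, the radius equals the length of one degree of arc along the equator on the assumed sphere, which makes it a handy check.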
