Here's a visual demo of the below on a graph, before and after.
CHECKPOINT - {"p"=>772, "d"=>2013-06-11 23:36:51 UTC, "wr"=>8, "rr"=>2}
REMOVED - {"p"=>759, "d"=>2013-06-11 23:06:19 UTC, "wr"=>8, "rr"=>2}
REMOVED - {"p"=>750, "d"=>2013-06-11 22:36:30 UTC, "wr"=>9, "rr"=>1}
REMOVED - {"p"=>740, "d"=>2013-06-11 22:36:30 UTC, "wr"=>10, "rr"=>2}
REMOVED - {"p"=>750, "d"=>2013-06-11 22:36:27 UTC, "wr"=>9, "rr"=>1}
REMOVED - {"p"=>740, "d"=>2013-06-11 22:06:57 UTC, "wr"=>10, "rr"=>2}
REMOVED - {"p"=>726, "d"=>2013-06-11 21:52:01 UTC, "wr"=>13, "rr"=>3}
REMOVED - {"p"=>710, "d"=>2013-06-11 21:52:01 UTC, "wr"=>19, "rr"=>6}
REMOVED - {"p"=>726, "d"=>2013-06-11 21:51:59 UTC, "wr"=>13, "rr"=>3}
REMOVED - {"p"=>710, "d"=>2013-06-11 21:46:26 UTC, "wr"=>19, "rr"=>6}
REMOVED - {"p"=>720, "d"=>2013-06-11 21:13:00 UTC, "wr"=>15, "rr"=>3}
REMOVED - {"p"=>709, "d"=>2013-06-11 20:46:33 UTC, "wr"=>18, "rr"=>5}
REMOVED - {"p"=>709, "d"=>2013-06-11 20:46:33 UTC, "wr"=>18, "rr"=>5}
KEPT - {"p"=>689, "d"=>2013-06-11 20:31:44 UTC, "wr"=>21, "rr"=>6}
REMOVED - {"p"=>689, "d"=>2013-06-11 20:31:44 UTC, "wr"=>21, "rr"=>6}
REMOVED - {"p"=>682, "d"=>2013-06-11 20:21:05 UTC, "wr"=>24, "rr"=>7}
REMOVED - {"p"=>683, "d"=>2013-06-11 19:46:37 UTC, "wr"=>23, "rr"=>6}
REMOVED - {"p"=>669, "d"=>2013-06-11 18:10:09 UTC, "wr"=>21, "rr"=>4}
REMOVED - {"p"=>669, "d"=>2013-06-11 18:10:09 UTC, "wr"=>21, "rr"=>4}
REMOVED - {"p"=>669, "d"=>2013-06-11 16:52:23 UTC, "wr"=>19, "rr"=>3}
REMOVED - {"p"=>657, "d"=>2013-06-11 16:50:14 UTC, "wr"=>24, "rr"=>3}
REMOVED - {"p"=>639, "d"=>2013-06-11 16:06:45 UTC, "wr"=>22, "rr"=>2}
KEPT - {"p"=>563, "d"=>2013-06-11 11:40:19 UTC, "wr"=>24, "rr"=>1}
CHECKPOINT - {"p"=>90, "d"=>2013-06-10 00:58:15 UTC, "wr"=>105, "rr"=>26}
CHECKPOINT - {"p"=>1509, "d"=>2013-06-07 02:07:11 UTC, "wr"=>372, "rr"=>184}
CHECKPOINT - {"p"=>1473, "d"=>2013-06-04 00:30:57 UTC, "wr"=>443, "rr"=>232}
KEPT - {"d"=>2013-06-03 05:36:55 UTC, "l"=>5, "p"=>1477, "rr"=>130, "wr"=>301}
This is an example I just ran against a team. You can see it removed most of the data points that were "noise" between large jumps within a day. This makes the graphs easier to read and reduces the amount of data stored, while still giving more accurate graphs than the old snapshot ones.
The main priority is that an aggregation should be consistent and not remove old data when run twice. Here is the result when this is run again after removing the records that were queued:
CHECKPOINT - {"p"=>772, "d"=>2013-06-11 23:36:51 UTC, "wr"=>8, "rr"=>2}
KEPT - {"p"=>689, "d"=>2013-06-11 20:31:44 UTC, "wr"=>21, "rr"=>6}
KEPT - {"p"=>563, "d"=>2013-06-11 11:40:19 UTC, "wr"=>24, "rr"=>1}
CHECKPOINT - {"p"=>90, "d"=>2013-06-10 00:58:15 UTC, "wr"=>105, "rr"=>26}
CHECKPOINT - {"p"=>1509, "d"=>2013-06-07 02:07:11 UTC, "wr"=>372, "rr"=>184}
CHECKPOINT - {"p"=>1473, "d"=>2013-06-04 00:30:57 UTC, "wr"=>443, "rr"=>232}
KEPT - {"d"=>2013-06-03 05:36:55 UTC, "l"=>5, "p"=>1477, "rr"=>130, "wr"=>301}
If the person were to play more games before 2013-06-12 23:36:51 UTC (the 24 hour period from the newest point), the newest data point could be removed and replaced by a new one, but none of the old ones that were >24 hours old or had a >75 point difference would be.
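For illustration, here is a minimal sketch of the rule as it can be read off the output above. It is not the actual implementation. The method name `aggregate`, the constants, and the exact checkpoint rule (a point becomes a CHECKPOINT when it is more than 24 hours older than the previous checkpoint) are assumptions inferred from the listings, as is the choice to always keep the oldest record.

```ruby
POINT_THRESHOLD = 75            # ">75 point difference" from the description
WINDOW_SECONDS  = 24 * 60 * 60  # "24 hour period" from the description

# Hypothetical helper. Expects `points` sorted newest first, each a hash with
# "p" (points) and "d" (a Time). Returns [label, point] pairs for survivors.
def aggregate(points)
  result = []
  checkpoint = nil
  last_kept = nil

  points.each_with_index do |point, index|
    label =
      if checkpoint.nil? || checkpoint["d"] - point["d"] > WINDOW_SECONDS
        # Assumption: start a new window once we are >24 hours past the
        # previous checkpoint.
        checkpoint = point
        :checkpoint
      elsif (point["p"] - last_kept["p"]).abs > POINT_THRESHOLD
        # A large jump relative to the last surviving point is kept.
        :kept
      elsif index == points.length - 1
        # Assumption: the oldest record always survives.
        :kept
      end

    next unless label # everything else is "noise" and would be REMOVED

    last_kept = point
    result << [label, point]
  end

  result
end

points = [
  { "p" => 772,  "d" => Time.utc(2013, 6, 11, 23, 36, 51) },
  { "p" => 759,  "d" => Time.utc(2013, 6, 11, 23, 6, 19) },
  { "p" => 709,  "d" => Time.utc(2013, 6, 11, 20, 46, 33) },
  { "p" => 689,  "d" => Time.utc(2013, 6, 11, 20, 31, 44) },
  { "p" => 639,  "d" => Time.utc(2013, 6, 11, 16, 6, 45) },
  { "p" => 563,  "d" => Time.utc(2013, 6, 11, 11, 40, 19) },
  { "p" => 90,   "d" => Time.utc(2013, 6, 10, 0, 58, 15) },
  { "p" => 1473, "d" => Time.utc(2013, 6, 4, 0, 30, 57) },
  { "p" => 1477, "d" => Time.utc(2013, 6, 3, 5, 36, 55) },
]

once  = aggregate(points)
twice = aggregate(once.map { |_label, point| point })

once.each { |label, point| puts "#{label.to_s.upcase} - p=#{point['p']}" }
puts "consistent when run twice: #{once == twice}"
```

Each point is compared against the last surviving point rather than the raw previous one, so a second run sees the same sequence of comparisons and makes the same decisions. That is what keeps the second pass from removing anything further in this sketch.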