Friday, March 16, 2007
Introducing the CUSUM chart
The Cumulative Sum (CUSUM) chart is a very useful (and powerful) tool for picking up small changes in the trend of noisy data, and it works just as well when the trend changes are large. Despite what some might say, it is actually relatively easy to apply this technique in a spreadsheet to analyse your data, although one must understand how to interpret the graph.
The CUSUM is very useful for identifying important dates or times (turning points) where a trend change occurred, especially when this is not obvious from looking at the time series data itself.
You can also use it to gauge the likelihood that other noisy variables are causes: if you can pick up similar trend changes in those variables at the same turning points, they become candidates. It finds applications in data analysis, especially of process data, and is a tool that can be used by Six Sigma practitioners.
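As a quick preview, one common form of the calculation (assuming you compare each data point against a chosen reference value m, typically the overall mean or a target) is simply a running total of the deviations:
S_n = S_(n-1) + (x_n - m), starting with S_0 = 0
A constant slope in the plotted S values means the data is running steadily above or below m; a change in slope marks a turning point.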
In my next post I will elaborate on how to implement this useful technique in a spreadsheet such as Excel or OpenOffice, show an example and, more importantly, explain how to interpret the graph, so stay tuned. For now, here are some related links.
Cusum Link 1 (NIST)
Cusum Link 2 (iSixSigma)
Posted by Maths Fox at 3:18 PM | 0 comments
Labels: analysis, chart, CUSUM, data, process, six sigma, trends, turning points
Monday, March 5, 2007
Exponential Filter/Smoothing
The problem...
The problem is you've got noisy time series data. Ideally you would like to plot it on a graph to show the underlying trend or compare it with another variable. But it's too noisy: plotted with markers and lines it clutters the graph, and with markers alone it appears as a scatter cloud.
Aha, then you see that Excel has the option to add a "trendline". But you can't seem to find one that does the data justice, or it's just plain wrong in places. The "moving average" seems the best of the options, but to get good smoothing you have to average over many periods. That has a bad side effect: it delays when changes appear, especially when you are comparing against another variable (plotted as another series).
A solution ...
A nice easy way to sort out this data is to apply a simple exponential filter (also known as exponential smoothing). There are better methods, but this one is simple and will work great about 80% of the time.
How does it work?
I will illustrate this in the spreadsheet, but the maths is simply this:
New smoothed value = (1 - factor) * current data point + factor * previous smoothed value
or:
y_n = (1-k)*x_n + k*y_(n-1)
where:
y = smoothed value
x = actual (noisy) data
k = smoothing factor, a real number between 0 and 1 (k = 0 gives no smoothing; the closer k is to 1, the heavier the smoothing)
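To make the formula concrete, here is a tiny worked example (the numbers are made up purely for illustration): with k = 0.5 and noisy readings 10, 12, 8, you would seed y_1 = 10, then get y_2 = 0.5*12 + 0.5*10 = 11 and y_3 = 0.5*8 + 0.5*11 = 9.5. Each smoothed value is pulled only partway toward the new reading, which is what damps out the noise.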
How to set this up in a spreadsheet (a sketch of the cell formulas follows the steps):
1. Estimate a value for the starting smoothed average. I usually use the first raw data point or take an average of the first few points.
2. Set up the filter formula in the column to the right of the actual data points.
3. Reference the smoothing constant "k" as a cell using an absolute reference, e.g. $B$1. This will allow you to adjust it easily.
4. Copy the filter formula down so that you have two columns: your original data on the left and the smoothed values on the right.
5. Play with and adjust "k" to suit your needs. As a first guess use 0.5; I commonly use anywhere from 0.5 to 0.85. Be wary of values above this, as you could be over-smoothing, but it does depend on your application.
6. For graphing purposes I suggest keeping your original raw data series (but use markers only, i.e. a scatter plot) and then plotting the smoothed trend as a second series (lines only).
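Here is a minimal sketch of how the cells might look, assuming your raw data starts in A3, the smoothed values go in column B alongside it, and k lives in cell B1 (the exact layout is up to you):
B1: 0.5 (the smoothing factor k, referenced absolutely as $B$1)
B3: =A3 (step 1: seed the filter with the first raw data point)
B4: =(1-$B$1)*A4 + $B$1*B3 (steps 2 to 4: copy this down column B)
Changing the single value in B1 then re-smooths the whole column, which makes the experimentation in step 5 painless.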
http://www.4shared.com/file/11752632/8565b041/exponential_filter.html
Posted by Maths Fox at 11:48 AM | 1 comment
Labels: data, excel, exponential, filter, free, series, smoothing, spreadsheet, time