IFS Driven by Financial Data

To illustrate this process, we analyze four years of closing prices of EMC. (This is some sort of financial company. It has nothing to do with Einstein, as I had at first thought.) On the left we see the graph of the closing prices; on the right, the driven IFS. How are we to interpret this IFS? The values start small and gradually increase, with some fluctuations superimposed. The IFS starts in the middle, then rapidly runs down toward corner 1 because all the early values lie in the first bin. The points along the bottom of the square result from values hopping back and forth between bins 1 and 2; the points along the diagonal, from values hopping back and forth between bins 2 and 3; the points along the top, from values hopping back and forth between bins 3 and 4. Note that along all the occupied lines, most points cluster at the corners. This is because many consecutive values do not change bins. This driven IFS tells us nothing that is not apparent from the graph.
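The construction just described can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the bins are four equal-width intervals spanning the whole range of the data, the corners are laid out with 1 at the lower left, 2 at the lower right, 3 at the upper left, and 4 at the upper right (the layout that matches the description above), and each step moves the current point halfway toward the corner named by the next bin. The function names `equal_size_bins` and `driven_ifs` are made up for this illustration.

```python
# Hypothetical helper names; corner layout and equal-width bins are assumptions.
CORNERS = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 1.0), 4: (1.0, 1.0)}


def equal_size_bins(values, n_bins=4):
    """Assign each value a bin number 1..n_bins using equal-width bins
    that span the entire range of the data."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        if width == 0:
            bins.append(1)  # constant data: everything lands in bin 1
        else:
            # The maximum value would fall just past the last bin, so clamp it.
            bins.append(min(int((v - lo) / width) + 1, n_bins))
    return bins


def driven_ifs(bins, start=(0.5, 0.5)):
    """Start in the middle of the unit square; for each bin number,
    move halfway toward the corresponding corner and record the point."""
    x, y = start
    points = []
    for b in bins:
        cx, cy = CORNERS[b]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points
```

With this layout, a long run of values in bin 1 pulls the points rapidly toward the lower left corner, and values alternating between bins 1 and 2 leave points along the bottom edge, as in the picture described above.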

Part of the problem with this picture comes from the large range of values. The bins are determined by the entire range, and about the first half of the values fall in the first bin. Although it is not advertised, many graphs of financial data deal with this problem by plotting the logs of the closing prices, instead of the closing prices themselves. (This is apparent if the vertical scale is shown on the graph: the space between $10 and $100 is the same as the space between $100 and $1000.) Below is a graph of the logs of the closing prices, and the IFS driven by this sequence.

This isn't a lot better. The points along the diagonal are spread out a little more, but that is not especially useful. Often we are not so interested in the actual values, but in how much the values change from day to day. So instead of a graph of the values, below we plot the graph of the differences of the logs of the values.
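For readers who want to compute this sequence themselves, here is a minimal sketch. It assumes the prices are positive numbers in a plain Python list, and the function name `log_differences` is made up for this illustration. Since log(b) - log(a) = log(b/a), each difference measures the relative change from one day to the next, not the change in dollars.

```python
import math


def log_differences(prices):
    """Return the day-to-day differences of the natural logs of the prices.

    Each entry equals log(today / yesterday), so the result has one
    fewer entry than the input. Prices are assumed to be positive.
    """
    logs = [math.log(p) for p in prices]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]
```

For example, `log_differences([10, 100, 1000])` gives two equal entries, because each step multiplies the price by the same factor. This is the same effect that makes the space between $10 and $100 equal to the space between $100 and $1000 on a log-scaled graph.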

Here is an enlarged picture, with some of the subsquares presented. Noting the empty subsquares, we can make some immediate deductions. We do not observe two consecutive differences in bin 4 (that is, two consecutive very large differences). Also, a bin 4 difference is not immediately followed by a bin 3 difference, because the subsquare with address 34 is empty. There are only a few consecutive differences in bin 3 and in bin 1, but many in bin 2. Look at the graph. What else do you see? A moment's observation reveals patterns not at all obvious from the difference of logs graph. For example, note the similarities of the patterns in subsquares 32 and 23. The points on the diagonal in subsquare 12 (and their absence on the diagonal in subsquare 13) show that long strings of differences in bins 2 and 3 that end in bin 2 can be followed by a difference in bin 1, while those strings of 2 and 3 ending in 3 are only rarely followed by a difference in bin 1.
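Reading off empty subsquares by eye can be checked by counting. The sketch below assumes we already have the sequence of bin numbers (1 through 4) for the differences, and it assumes the address convention implied above: the first digit of an address is the most recent bin and the second digit is the bin just before it, so address 34 means a bin 4 difference immediately followed by a bin 3 difference. The function names `address_counts` and `empty_addresses` are made up for this illustration.

```python
from collections import Counter
from itertools import product


def address_counts(bins, length=2):
    """Count how many points land in each subsquare of the given address length.

    The address is written with the most recent bin first, then the bin
    before it, and so on (an assumed convention; see the text above).
    """
    counts = Counter()
    for k in range(length - 1, len(bins)):
        address = "".join(str(bins[k - m]) for m in range(length))
        counts[address] += 1
    return counts


def empty_addresses(bins, length=2, n_bins=4):
    """List every address of the given length that no point lands in."""
    counts = address_counts(bins, length)
    every_address = (
        "".join(str(d) for d in digits)
        for digits in product(range(1, n_bins + 1), repeat=length)
    )
    return [a for a in every_address if counts[a] == 0]
```

If `"44"` and `"34"` appear in `empty_addresses(bins)`, that matches the two deductions made above. Calling the same function with `length=3` looks at the next level of smaller subsquares.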

That we are dealing with only a finite data set (a bit over 1000 points here) has an important consequence. If a subsquare is empty, does this reflect a true causal exclusion, or is it simply a result of the shortness of the data set? If we had more points, would some land in this empty subsquare? There are ways to estimate this, but the basic observation is that the smaller the empty subsquare (hence the longer the putatively excluded string), the less sure we are it would remain empty if we got more data.
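One rough way to put numbers on this observation (not necessarily the estimate the author has in mind) is to ask how likely a subsquare would be to stay empty if the bins were independent random draws with the observed frequencies. The independence assumption is a deliberately crude null model, the calculation ignores the overlap between consecutive windows, and the function name `chance_of_empty` is made up for this illustration.

```python
from collections import Counter


def chance_of_empty(bins, address):
    """Under an assumed null model of independent bins with the observed
    frequencies, return (expected number of points in the subsquare,
    approximate probability that the subsquare is empty by chance).

    `address` is a string of bin digits, e.g. "34". Overlap between
    consecutive windows is ignored, so this is only an approximation.
    """
    n = len(bins)
    windows = n - len(address) + 1
    if windows <= 0:
        return 0.0, 1.0  # too little data to ever fill this subsquare
    freq = Counter(bins)
    p = 1.0
    for digit in address:
        p *= freq[int(digit)] / n  # observed frequency of that bin
    return windows * p, (1.0 - p) ** windows
```

A long address multiplies many frequencies together, so the expected count shrinks and the chance of an accidentally empty subsquare grows. That is the point made above: the smaller the empty subsquare, the less we can conclude from its emptiness. A short address with a large expected count and an empty subsquare is much harder to explain by chance alone.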
