New problems with Yahoo! data

27 September 2012 16:48

A couple of years ago I tried to do some simulation of trading strategies with historical data, and had some problems with Yahoo!'s data. Yesterday I updated that data, and suddenly the results of my simulations were entirely different.

After digging for a little bit, I came across some changes in the data for BLT.L. For 4th February 2010, 2nd and 5th April 2010, and 4th May 2010, Yahoo! Finance now gives 2301 for the open, high, low and close alike. For some of those dates that's a big outlier. What's strange is that the data I downloaded from Yahoo! earlier has believable values for 4th February, which match the values Google provides now. That earlier data was taken at the start of March 2010, so it pre-dates the other three days which now have strange values, and I have no old values to compare them against.
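A check along these lines would flag rows like that automatically. This is only a minimal sketch in Python: the `Bar` type and the `flat_bars` function are names made up for illustration, and the second sample row is invented purely for contrast. A flat bar is not proof of bad data (a thinly traded share can genuinely print one price all day), so treat the result as a list of rows to inspect by hand.

```python
from typing import List, NamedTuple


class Bar(NamedTuple):
    """One day of price data (hypothetical structure for this sketch)."""
    date: str
    open: float
    high: float
    low: float
    close: float


def flat_bars(bars: List[Bar]) -> List[Bar]:
    """Return the bars whose open, high, low and close are all identical."""
    return [b for b in bars if b.open == b.high == b.low == b.close]


sample = [
    # The value reported above for BLT.L on 4th February 2010.
    Bar("2010-02-04", 2301.0, 2301.0, 2301.0, 2301.0),
    # Invented row, NOT real data: an ordinary day with a price range.
    Bar("invented-day", 100.0, 102.0, 99.0, 101.0),
]

suspicious = flat_bars(sample)
for bar in suspicious:
    print(bar.date, "has identical open/high/low/close:", bar.close)
```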

On a different theme, the values for 2003 are quite different to the ones I had previously, most visibly the opening prices and the volumes. Here's a brief sample. In each row the first five values (open, high, low, close, volume) come from the new download and the last five from the old one:

Date New-Open New-High New-Low New-Close New-Volume Old-Open Old-High Old-Low Old-Close Old-Volume
2003-03-20 330.25 339.69 327.5 330.25 7611300 338 338 327.5 330.25 7723600
2003-03-21 336.25 339.5 330 336.25 8991500 330 339.5 330 336.25 6820800
2003-03-24 323 334.34 322.75 323 9223700 332 332 322.75 323 7317400
2003-03-25 329.75 331.75 321.5 329.75 6180300 323 331.75 321.5 329.75 4764400
2003-03-26 331 332 317.5 331 12376800 332 332 326 331 5400800
2003-03-27 331.75 331.95 326 331.75 17529400 331 332 326 331.75 9259400
2003-03-28 331 332 323 324.75 0 331 332 323 324.75 7892000
2003-03-31 326 332 317 317 0 326 332 317 317 6881800
2003-04-01 325.5 326.25 317 325.5 0 318.5 326.5 317 325.5 9346800
2003-04-02 334.5 335.75 320 334.5 14982600 320 335.75 320 334.5 8688600
2003-04-03 337.5 340 332.25 337.5 7572100 335 340 332.25 337.5 7030600
2003-04-04 334.5 338 329.5 334.5 13705200 334.5 337.75 329.5 334.5 7976800
2003-04-07 351.5 351.5 336 351.5 12427200 336 351.5 336 351.5 10592300
2003-04-08 343 392.5 338.75 343 9995800 347 347 338.5 343 7796600
2003-04-09 340.5 341.34 335.25 335.25 0 340.5 341.25 335.25 335.25 5601000
2003-04-10 326 335 326 326 8175000 334.5 335 326 326 7375500
2003-04-11 333.75 339.94 329 333.75 10880400 330 335 329 333.75 6761300
2003-04-14 335.75 336.48 331 335.5 0 335.75 336.25 331 335.5 5256700
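Spotting these differences by eye is tedious, so a field-by-field comparison of the two downloads helps. What follows is a rough sketch in Python rather than the code I actually used: `diff_snapshots` and `FIELDS` are hypothetical names, and each download is assumed to be already loaded into a dictionary keyed by date. The two sample rows are copied from the table above.

```python
FIELDS = ("open", "high", "low", "close", "volume")


def diff_snapshots(new, old, tol=1e-9):
    """For each date present in both downloads, map the names of the
    fields that differ to a (new_value, old_value) pair."""
    changes = {}
    for date in sorted(new.keys() & old.keys()):
        changed = {
            name: (n, o)
            for name, n, o in zip(FIELDS, new[date], old[date])
            if abs(n - o) > tol
        }
        if changed:
            changes[date] = changed
    return changes


# Values are (open, high, low, close, volume), taken from the table above.
new = {
    "2003-03-20": (330.25, 339.69, 327.5, 330.25, 7611300),
    "2003-03-28": (331, 332, 323, 324.75, 0),
}
old = {
    "2003-03-20": (338, 338, 327.5, 330.25, 7723600),
    "2003-03-28": (331, 332, 323, 324.75, 7892000),
}

diff = diff_snapshots(new, old)
for date, changed in diff.items():
    print(date, changed)
```

For these two rows it reports open, high and volume as changed on 2003-03-20, and only the volume on 2003-03-28.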

There doesn't seem to be any pattern to these changes, but clearly some of them are significant enough to make a big difference to a simulation. I am hoping to get the chance to investigate some other potential sources of price data soon.
