Saturday, May 5, 2018

Call me Ishmael... are bitcoin whales manipulating the price?

As we have established in this post, the price of bitcoin is well supported by the Metcalfe value. However, there are certain times in bitcoin's price history when the price fluctuated wildly away from the Metcalfe value. Today, I am going to test for a few things:
  1. Which price fluctuations were significantly different from the Metcalfe value? I will use the same test that this paper used to detect the price fluctuations in the 2014 price series, namely a Wilcoxon signed-rank test on the daily price vs the Metcalfe value.
  2. What should we do if we see the price fluctuate away from the Metcalfe value (i.e. buy or sell)? I will explore orders of integration, stationarity and the potential cointegration of the Metcalfe value with the bitcoin price.
Price fluctuations against the Metcalfe value are shown in figure 5 of the aforementioned paper. The figure below updates it with new data.
Figure 1: Bitcoin price vs Metcalfe value over time. 
The price variation in 2014 is still much larger than in recent times (on a log base 10 scale, anyway); however, the December 2017 peak is still potentially outside the expected range of the Metcalfe value.

Let's analyse the residuals of this.
Figure 2: frequency distribution of the residuals of the log10 bitcoin price and the log10 Metcalfe value
Nothing too interesting here, other than that after retransforming we get a vaguely Pareto shape.

Moving on, let's look at the bitcoin price as a percent of the Metcalfe value.
Figure 3: bitcoin price as a percent of the Metcalfe value.
Figure 3 makes it reasonably obvious that the last peak was nothing like the one in 2014.
Let us first define where the peaks of interest are. To do so, we need to identify some region of variation. Looking now at the log10 scale:
Figure 4: log10 scaled differences between Metcalfe and bitcoin price. 
I have superimposed (arbitrary) control lines at -0.25 and 0.25 because, if the Metcalfe value were accurate, we would expect the residuals to be distributed symmetrically around a mean of zero.
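
As a rough illustration of how control lines like these could be applied, the sketch below flags days whose log10 residual falls outside a band of plus or minus 0.25. It is Python rather than the Stata used in this post, and the function names, the sign convention (price minus Metcalfe value) and the toy numbers are assumptions made for the example, not the data behind figure 4.

```python
import math

def log10_residuals(prices, metcalfe_values):
    """Return log10(price) - log10(Metcalfe value) per day (assumed sign convention)."""
    return [math.log10(p) - math.log10(m) for p, m in zip(prices, metcalfe_values)]

def outside_band(residuals, limit=0.25):
    """Return the indices of days whose residual lies outside [-limit, +limit]."""
    return [i for i, r in enumerate(residuals) if abs(r) > limit]

# Toy numbers for illustration only (not real bitcoin data).
prices = [100.0, 250.0, 90.0, 40.0]
metcalfe = [100.0, 100.0, 100.0, 100.0]

residuals = log10_residuals(prices, metcalfe)
flagged = outside_band(residuals)  # indices of the days outside the band

# A band of +/-0.25 on the log10 scale corresponds to these price/Metcalfe ratios.
upper_ratio = 10 ** 0.25   # roughly 1.78
lower_ratio = 10 ** -0.25  # roughly 0.56
```

In other words, an arbitrary band of 0.25 on the log10 scale treats a price between roughly 56% and 178% of the Metcalfe value as unremarkable.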

First, I run the Wilcoxon test against the whole dataset to check whether there is a difference worth talking about in general:

. signrank log10metcalf =log10price

Wilcoxon signed-rank test

        sign |      obs   sum ranks    expected
-------------+---------------------------------
    positive |      340      107749      397215
    negative |      920      686681      397215
        zero |        0           0           0
-------------+---------------------------------
         all |     1260      794430      794430

unadjusted variance   1.669e+08
adjustment for ties         -.5
adjustment for zeros          0
                     ----------
adjusted variance     1.669e+08

Ho: log10metcalf = log10price
             z = -22.406
    Prob > |z| =   0.0000

Clearly, we can reject the null hypothesis that the variables are equal. What does that mean specifically? The Metcalfe value is lower than the price on 920 of the 1,260 days, so it would seem that the Metcalfe value is consistently and significantly lower than the price.
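
The z statistic in the output above can be checked by hand from the rank sums using the usual large-sample normal approximation. The sketch below is Python rather than Stata, and the function name is made up for the example; the inputs are copied from the signrank table above.

```python
import math

def signed_rank_z(positive_rank_sum, n, tie_adjustment=0.0, zero_adjustment=0.0):
    """Normal-approximation z for the Wilcoxon signed-rank test.

    Under the null, the positive rank sum has mean n(n+1)/4 and
    unadjusted variance n(n+1)(2n+1)/24; the adjustments are added to it.
    """
    expected = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 + tie_adjustment + zero_adjustment
    return (positive_rank_sum - expected) / math.sqrt(variance)

# Values copied from the signrank output above (log10metcalf = log10price).
expected_rank_sum = 1260 * 1261 / 4
z = signed_rank_z(positive_rank_sum=107749, n=1260, tie_adjustment=-0.5)
print(expected_rank_sum, round(z, 3))
```

The expected rank sum works out to 397215, as in the table, and z should land very close to the reported -22.406.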

. signrank price = MetcalfValue

Wilcoxon signed-rank test

        sign |      obs   sum ranks    expected
-------------+---------------------------------
    positive |      920      665674      397215
    negative |      340      128756      397215
        zero |        0           0           0
-------------+---------------------------------
         all |     1260      794430      794430

unadjusted variance   1.669e+08
adjustment for ties         -.5
adjustment for zeros          0
                     ----------
adjusted variance     1.669e+08

Ho: price = MetcalfValue
             z =  20.780
    Prob > |z| =   0.0000

. signtest price = MetcalfValue 

Sign test

        sign |    observed    expected
-------------+------------------------
    positive |         920         630
    negative |         340         630
        zero |           0           0
-------------+------------------------
         all |        1260        1260

One-sided tests:
  Ho: median of price - MetcalfValue = 0 vs.
  Ha: median of price - MetcalfValue > 0
      Pr(#positive >= 920) =
         Binomial(n = 1260, x >= 920, p = 0.5) =  0.0000

  Ho: median of price - MetcalfValue = 0 vs.
  Ha: median of price - MetcalfValue < 0
      Pr(#negative >= 340) =
         Binomial(n = 1260, x >= 340, p = 0.5) =  1.0000

Two-sided test:
  Ho: median of price - MetcalfValue = 0 vs.
  Ha: median of price - MetcalfValue != 0
      Pr(#positive >= 920 or #negative >= 920) =
         min(1, 2*Binomial(n = 1260, x >= 920, p = 0.5)) =  0.0000

As we can see from the above, there is strong evidence against the hypothesis that the median difference between the price and the Metcalfe value is zero. The one-sided tests point the same way as before: the Metcalfe value is typically lower than the price.
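
The sign test probabilities are plain binomial tail sums, so they can be reproduced exactly with integer arithmetic. This is a minimal Python sketch, not the Stata routine itself; the function name is an assumption, and the counts are copied from the output above.

```python
from math import comb

def binomial_upper_tail(n, k):
    """Exact P(X >= k) for X ~ Binomial(n, 0.5), using integer arithmetic."""
    favourable = sum(comb(n, i) for i in range(k, n + 1))
    return favourable / 2 ** n

# Counts copied from the sign test output above.
p_price_above = binomial_upper_tail(1260, 920)  # Ha: median of price - MetcalfValue > 0
p_price_below = binomial_upper_tail(1260, 340)  # Ha: median of price - MetcalfValue < 0
p_two_sided = min(1.0, 2 * p_price_above)
```

The first probability is vanishingly small and the second is essentially one, which is what the 0.0000 and 1.0000 in the output are rounding.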

Explanations other than price manipulation come to mind, for example the number of off-chain transactions. If we accept that there is some unknown number of additional transactions off chain, then we may be able to build a linear model on the Metcalfe value to adjust for them and better predict price. Further to this, if the Metcalfe value is cointegrated with the bitcoin price, then a regression of the Metcalfe value against the price will result in stationary residuals.
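
The first step of that residual-based idea is an ordinary least squares fit followed by extracting the residuals. A minimal Python sketch is below; the helper names and the toy series are assumptions for illustration only, and the post itself does this step in Stata.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def ols_residuals(x, y, intercept, slope):
    """Observed y minus fitted y for each observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Toy series for illustration only (not real bitcoin data).
price = [10.0, 12.0, 15.0, 14.0, 18.0]
metcalfe = [5.0, 6.5, 7.0, 7.5, 9.0]

intercept, slope = ols_fit(price, metcalfe)
resid = ols_residuals(price, metcalfe, intercept, slope)
```

If the two series are cointegrated, a residual series like `resid` should wander around zero rather than drift away from it.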

The untransformed data is used below, and the results suggest that the bitcoin price and the Metcalfe value are heavily cointegrated:

. regress MetcalfeValue price

      Source |       SS           df       MS      Number of obs   =     1,145
-------------+----------------------------------   F(1, 1143)      =   3951.52
       Model |  1.4155e+09         1  1.4155e+09   Prob > F        =    0.0000
    Residual |   409427286     1,143  358204.099   R-squared       =    0.7756
-------------+----------------------------------   Adj R-squared   =    0.7754
       Total |  1.8249e+09     1,144  1595173.98   Root MSE        =     598.5

------------------------------------------------------------------------------
MetcalfeValue |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
        price |   .3816547   .0060714    62.86   0.000     .3697424    .3935671
        _cons |   195.6988   19.45628    10.06   0.000     157.5248    233.8729
-------------------------------------------------------------------------------

. predict ye, resid
. gen yd =  ye[_n-1]
. regress yd  ye , nocons

      Source |       SS           df       MS      Number of obs   =     1,144
-------------+----------------------------------   F(1, 1143)      =  30464.58
       Model |   386446655         1   386446655   Prob > F        =    0.0000
    Residual |    14499086     1,143  12685.1146   R-squared       =    0.9638
-------------+----------------------------------   Adj R-squared   =    0.9638
       Total |   400945741     1,144  350477.046   Root MSE        =    112.63

-------------------------------------------------------------------------------
          yd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
          ye |   .9715749   .0055665   174.54   0.000     .9606533    .9824965
-------------------------------------------------------------------------------

The above indicates that there is very strong evidence to support the hypothesis that the bitcoin price and the Metcalfe value are cointegrated. (It should be noted that I haven't really treated the time variable consistently here; however, given the magnitude of the t score, I am not too worried about it.)
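
One detail worth spelling out: the t score of 174.54 tests the coefficient against zero, whereas a Dickey-Fuller-style check asks how far the coefficient sits below one. The Python sketch below does that arithmetic with the numbers reported above. The function names and the toy residual series are assumptions for illustration. The regression above puts the lagged residual on the left-hand side, while the sketch uses the more usual direction (current residual on its lag), so treat it as an approximation of the same idea rather than a reproduction.

```python
def no_constant_slope(x, y):
    """Slope and standard error for the regression y = b * x with no constant."""
    sxx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (len(x) - 1) / sxx) ** 0.5
    return b, se

def unit_root_t(coef, std_err):
    """t-type statistic for the null that the AR(1) coefficient equals 1."""
    return (coef - 1.0) / std_err

# Coefficient and standard error copied from the regression of yd on ye above.
t_against_one = unit_root_t(0.9715749, 0.0055665)  # about -5.1

# Toy residual series (illustration only): regress e[t] on e[t-1].
e = [1.0, 0.6, 0.5, 0.1, -0.2, -0.1, 0.2, 0.0]
rho, rho_se = no_constant_slope(e[:-1], e[1:])
toy_t = unit_root_t(rho, rho_se)
```

A statistic like `t_against_one` should be compared against Engle-Granger critical values rather than the ordinary t table, because the residuals come from an estimated regression.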

So when do we buy and when do we sell? Well, if you are a hodler like me, you never sell! However, you could theoretically use this relationship to trade.

First, let's look at the model and the price on a time series chart.
Figure 5: Linear model of price based on Metcalfe Value.


We can see in figure 5 what the analysis above was talking about. The regression of the price against the Metcalfe value goes smack bang through the middle of the price (resulting in a stationary series).
Now, if we add control lines to the regression line, we can get an indication of when the price has pulled too far away from the Metcalfe value and take action as we see fit.
Figure 6: As per figure 5, but with 95% prediction intervals. 
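
For readers who want to see how such control lines might be computed, here is a minimal Python sketch of a 95% prediction interval around a simple linear fit. It uses the textbook formula with a normal quantile standing in for the t quantile, which is a reasonable approximation at around a thousand observations but not with the handful of toy points used here. The function name and the toy numbers are assumptions, and this is not necessarily how figure 6 was produced.

```python
from statistics import NormalDist

def prediction_band(x, y, x_new, level=0.95):
    """Approximate prediction interval for a new y at x_new from a simple OLS fit."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    root_mse = (rss / (n - 2)) ** 0.5
    # Normal quantile used in place of the t quantile (large-sample approximation).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = intercept + slope * x_new
    half_width = z * root_mse * (1 + 1 / n + (x_new - x_mean) ** 2 / sxx) ** 0.5
    return centre - half_width, centre, centre + half_width

def classify(observed, low, high):
    """Say where an observed price sits relative to the band."""
    if observed > high:
        return "above band"
    if observed < low:
        return "below band"
    return "inside band"

# Toy numbers for illustration only (not real bitcoin data).
metcalfe = [5.0, 6.5, 7.0, 7.5, 9.0, 10.0]
price = [10.0, 12.0, 15.0, 14.0, 18.0, 21.0]

low, fit, high = prediction_band(metcalfe, price, x_new=8.0)
signal = classify(25.0, low, high)
```

A price above the upper line would be a candidate "too far from the Metcalfe value" signal, and a price below the lower line the opposite; what to do with that signal is a separate decision.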

Now that we've established that, I'm going to start working on bitcoin cash, ethereum and litecoin. Following those explorations, I will consider the total market. My thinking is that crypto is crypto: if a person owns any coin, they can quickly exchange it for any other coin, ergo the entire crypto ecosystem may have some kind of Metcalfe value.

Thanks for reading, crypto addicts.
Týr