Friday, August 2, 2019

BTC/USDT Probability Plot

BTC/USD::Pr(close:n+1>close:n)
I have been working on developing a technical indicator based upon the binomial model. Essentially, it estimates the 95% CI of the probability of the next close being higher than the current close.
In the above, I have linked directly to the CoinGecko API, so it should be live: up-to-the-minute data for the last 31 days.
  For x observed nonconforming units in a sample of size n,
      a two-sided conservative 100(1 − α)% confidence interval for π is
      [π_lower, π_upper] = [qbeta(α/2; x, n−x+1), qbeta(1−α/2; x+1, n−x)]
      Without making any assumptions about the form of the distribution of the closes,
      it is possible to estimate Pr(close+1>close): the probability of exceeding the current close price.
      We assume naively that both the past and the future close prices are independently and randomly chosen
      from the same stationary process.  An estimate of this probability is the count of previous close prices
      that are greater than the current close price divided by the number of closes up until now.
      The binomial distribution can then be used to compute a one-sided lower confidence bound on this probability.
      The conservative method below gives a one-sided lower 95% confidence bound on the probability of exceeding
      the current close  on a randomly selected close.

NB: it would be better to test for stationarity first (e.g. with an ADF test) and to be sure we aren't in a trend (e.g. with a Kendall test).
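To make the lower bound above concrete, here is a rough sketch in JavaScript. It is not the code behind the chart. It finds qbeta(α; x, n−x+1) indirectly, by bisecting on the binomial tail P(X >= x) = α, so it needs no statistics library. The function names are made up for illustration, and taking n as the number of previous closes (excluding the current one) is my assumption.

```javascript
// P(X >= x) for X ~ Binomial(n, p), summed in log space to avoid overflow.
function binomialUpperTail(x, n, p) {
  if (x <= 0) return 1;
  if (p <= 0) return 0;
  if (p >= 1) return 1;
  let tail = 0;
  for (let k = x; k <= n; k++) {
    // log of C(n, k) * p^k * (1 - p)^(n - k)
    let logTerm = k * Math.log(p) + (n - k) * Math.log(1 - p);
    for (let j = 1; j <= k; j++) logTerm += Math.log((n - k + j) / j);
    tail += Math.exp(logTerm);
  }
  return Math.min(1, tail);
}

// Conservative one-sided lower bound: the p at which P(X >= x) equals alpha.
// This is the same value as qbeta(alpha; x, n - x + 1).
function clopperPearsonLower(x, n, alpha = 0.05) {
  if (x <= 0) return 0;
  let lo = 0;
  let hi = 1;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (binomialUpperTail(x, n, mid) < alpha) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

// x = previous closes above the current close, n = number of previous closes
// (assumption: the current close itself is not counted in n).
function exceedanceLowerBound(closes, alpha = 0.05) {
  if (closes.length < 2) throw new Error("need at least two closes");
  const current = closes[closes.length - 1];
  const previous = closes.slice(0, -1);
  const n = previous.length;
  const x = previous.filter((c) => c > current).length;
  return { x, n, estimate: x / n, lower: clopperPearsonLower(x, n, alpha) };
}
```

Calling exceedanceLowerBound on an array of closes returns the raw estimate x/n along with the lower 95% bound. The Resistance function in the trading bot post corresponds to 1 minus this lower bound.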

Thursday, February 14, 2019

Developing a basic cryptocurrency automatic trading bot

Hi all. Today I am going to show you how to make a basic automatic cryptocurrency trading bot.


 What is a bot?
A bot is a program that analyses some incoming data and automatically performs trades according to some algorithm.

Level of programming background required: some experience required
 - if you don't know what a loop is or what an if-then block is, then this post might be a bit too complex to follow.

Ok, let's get to it then. The first thing to do is to decide a few things:
what language will the bot be written in, which exchanges will it trade on, etc.

Because I'm only making a very simple bot, I am going to use Node.js and I'm going to trade on Kucoin.

So, prerequisites: 

  • Node installed
  • Kucoin account
  • Some kind of text editor (I like Atom)

The core of any algorithm we choose will be to get data from the exchange and analyse it in some way.

So the first thing we need to do is get data from the exchange.

Luckily, there is a handy node package to assist us called CCXT
 npm install ccxt

While we are installing things:
 npm install minimist
 npm install jStat
 npm install nodemailer

minimist will be used to handle arguments we want to pass to our node app
jstat will be used for the beta distribution function 
nodemailer will be used to send emails

the pseudocode for our bot is like this:

  every so many minutes:
    download(kucoin.data)
    analyse(data):
      signal = EMA(close, 12) // 12-period EMA on close price
      line = EMA(close, 26)   // 26-period EMA on close price
      histogram = line - signal
      rsi = 100 - 100/(1 + EMA(gains, 14)/EMA(losses, 14))
      resistance = 1 - pr(next close > close)  (i.e. see here)
      divergence = sign(Kendall(line)) - sign(Kendall(close)) // here Kendall returns the S statistic
      if (rsi > 70 & line < signal & histogram < 0 & resistance > 0.85 & divergence == 2) then analysis = SELL
      if (rsi < 30 & line > signal & histogram > 0 & resistance < 0.15 & divergence == -2) then analysis = BUY
      if analysis = SELL then set sell limit order to the upper body end of the last candle (i.e. the median + 1.5IQR)
      if analysis = BUY then set buy limit order to the lower body end of the last candle (i.e. the median - 1.5IQR)
      otherwise do nothing
    if any trade made then send email with advice of trade
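As a rough sketch of the decision step only (not the code from the repo), the buy/sell rule could look like this in JavaScript. I am assuming here that "macd" in the rule means the 26-period EMA that the pseudocode calls line, and the function name decide and the "HOLD" label are just illustrative.

```javascript
// Turns the indicator values for the latest candle into BUY, SELL or HOLD.
// Assumption: line is the 26-period EMA and signal is the 12-period EMA,
// as named in the pseudocode above.
function decide({ rsi, line, signal, resistance, divergence }) {
  const histogram = line - signal;
  if (rsi > 70 && line < signal && histogram < 0 &&
      resistance > 0.85 && divergence === 2) {
    return "SELL";
  }
  if (rsi < 30 && line > signal && histogram > 0 &&
      resistance < 0.15 && divergence === -2) {
    return "BUY";
  }
  return "HOLD"; // otherwise do nothing
}
```

Keeping the rule in one pure function like this makes it easy to test against historical candles before letting it anywhere near real orders.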

There are a few fundamental functions we will need to implement. 

  1. The EMA (exponential moving average)
  2. a Quantile function
  3. the resistance function
  4. the Kendall function

The pseudocode:
 EMA(data, period) {
  m = 2/(period + 1)
  ema[1] = data[1] // starting EMA from the first point in the data
  for (x = 2 to data.length) {
    ema[x] = (data[x] - ema[x-1])*m + ema[x-1]
  }
  return ema
 }
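A minimal JavaScript version of the EMA pseudocode, as a sketch. It uses zero-based arrays, seeds the average with the first data point as described, and the lower-case name ema is just my choice.

```javascript
// Exponential moving average over the whole series.
// The first value seeds the EMA, then each point moves it by the factor m.
function ema(data, period) {
  if (data.length === 0) return [];
  const m = 2 / (period + 1);
  const out = [data[0]];
  for (let i = 1; i < data.length; i++) {
    out.push((data[i] - out[i - 1]) * m + out[i - 1]);
  }
  return out;
}
```

For the bot you would call something like ema(closes, 12) and ema(closes, 26) and read off the last element of each.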

Quantile(data, p) {
 sort data ascending
 r = the (fractional) rank of p in the sorted data
 rd = interpolate between the values either side of position r
 return rd
}
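Here is one way the Quantile pseudocode could be written in JavaScript. The pseudocode does not say which interpolation rule to use, so this sketch assumes plain linear interpolation between the two nearest sorted values (rank = (n − 1)·p).

```javascript
// Quantile by linear interpolation between neighbouring sorted values.
// Assumption: fractional rank r = (n - 1) * p, with p between 0 and 1.
function quantile(data, p) {
  if (data.length === 0) throw new Error("quantile needs at least one value");
  const sorted = [...data].sort((a, b) => a - b);
  const r = (sorted.length - 1) * p;
  const below = Math.floor(r);
  const above = Math.min(below + 1, sorted.length - 1);
  return sorted[below] + (r - below) * (sorted[above] - sorted[below]);
}
```

The median is then quantile(x, 0.5) and the IQR is quantile(x, 0.75) - quantile(x, 0.25), which is all the limit order levels in the bot pseudocode need.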

Resistance(x,n) {
 return 1-beta.inv( 0.05, x, n-x+1 )
}


Kendall(x) {
 s = 0
 for each pair of points xi, xj in x with i < j
  s = s + sign(xj - xi)
 return s
}
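A small JavaScript sketch of the Kendall function. I am assuming the S statistic meant here is the Mann-Kendall one, which sums the sign of the difference over every pair of points rather than only neighbouring ones. The names kendallS and divergence are illustrative.

```javascript
// Mann-Kendall S statistic: positive for an upward trend, negative for a downward one.
function kendallS(x) {
  let s = 0;
  for (let i = 0; i < x.length - 1; i++) {
    for (let j = i + 1; j < x.length; j++) {
      s += Math.sign(x[j] - x[i]);
    }
  }
  return s;
}

// Divergence as used in the bot: +2 when the indicator line trends up while
// the close trends down, -2 for the opposite, and something in between otherwise.
function divergence(line, close) {
  return Math.sign(kendallS(line)) - Math.sign(kendallS(close));
}
```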



The full implementation of the algorithm is here: https://github.com/phraudsta/BasicBot

Thanks for reading
Týr

Monday, January 28, 2019

MTC-BTC Probability Graph

MTC-BTC::Pr(close:n+1>close:n)
I have been working on developing a technical indicator based upon the binomial model. Essentially, it estimates the 95% CI of the probability of the next close being higher than the current close.
In the above, I have linked directly to the CoinGecko API, so it should be live: up-to-the-minute data for the last 31 days.
It is looking at the DOC.COM token (MTC), but could be used on anything, really.
  For x observed nonconforming units in a sample of size n,
      a two-sided conservative 100(1 − α)% confidence interval for π is
      [π_lower, π_upper] = [qbeta(α/2; x, n−x+1), qbeta(1−α/2; x+1, n−x)]
      Without making any assumptions about the form of the distribution of the closes,
      it is possible to estimate Pr(close+1>close): the probability of exceeding the current close price.
      We assume naively that both the past and the future close prices are independently and randomly chosen
      from the same stationary process.  An estimate of this probability is the count of previous close prices
      that are greater than the current close price divided by the number of closes up until now.
      The binomial distribution can then be used to compute a one-sided lower confidence bound on this probability.
      The conservative method below gives a one-sided lower 95% confidence bound on the probability of exceeding
      the current close  on a randomly selected close.

NB: it would be better to test for stationarity first (e.g. with an ADF test) and to be sure we aren't in a trend (e.g. with a Kendall test).

Saturday, May 5, 2018

Call me Ishmael....are bitcoin whales manipulating the price?

As we have established in this post, the price of bitcoin is well supported by the Metcalfe value. However, there are certain times in bitcoin's price history where the price fluctuated wildly away from the Metcalfe value. Today, I am going to test for a few things:
  1. Which price fluctuations were significantly different from the Metcalfe value? I will use the same tests that this paper used to detect the price fluctuations in the 2014 price series, namely a Wilcoxon signed-rank test on the daily price vs the Metcalfe value.
  2. What should we do if we see the price fluctuate from the Metcalfe value (i.e. buy or sell)? I will explore orders of integration, stationarity and the potential cointegration of the Metcalfe value with the bitcoin price.
Price fluctuations against the Metcalfe value are shown in figure 5 of the aforementioned paper. In the figure below, I have updated the figure with new data.
Figure 1: Bitcoin price vs Metcalfe value over time. 
The price variation in 2014 is still much larger than recent times (on a log base 10 scale, anyway), 
however the December 2017 peak is still potentially outside of the expected range of the Metcalfe value. 

Let's analyse the residuals of this.
Figure 2: frequency distribution of the residuals of the log10 bitcoin price and the log10 Metcalfe value
Nothing too interesting here, other than that after retransforming we get a vaguely Pareto shape.

Moving on, let's look at the bitcoin price as a percent of the Metcalfe value.
Figure 3: bitcoin price as a percent of the Metcalfe value.
Figure 3 makes it reasonably obvious that the last peak was nothing like that in 2014. 
Let us first define where the peaks of interest are. To do so we need to identify some region of variation. Looking now at the log10 scale:
Figure 4: log10 scaled differences between Metcalfe and bitcoin price. 
I have superimposed (arbitrary) control lines at -0.25 and 0.25 since, if the Metcalfe value were accurate,
we would expect the residuals to be symmetrically distributed around a mean of zero.

Firstly, running the Wilcoxon test against the whole dataset to detect whether there is indeed a difference worth talking about generally:

. signrank log10metcalf =log10price

Wilcoxon signed-rank test

        sign |      obs   sum ranks    expected
-------------+---------------------------------
    positive |      340      107749      397215
    negative |      920      686681      397215
        zero |        0           0           0
-------------+---------------------------------
         all |     1260      794430      794430

unadjusted variance   1.669e+08
adjustment for ties         -.5
adjustment for zeros          0
                     ----------
adjusted variance     1.669e+08

Ho: log10metcalf = log10price
             z = -22.406
    Prob > |z| =   0.0000

Clearly, we can reject the null hypothesis that the variables are equal. What does that mean specifically? It would seem as though the Metcalfe value is consistently significantly lower than the price. 

. signrank price = MetcalfValue

Wilcoxon signed-rank test

        sign |      obs   sum ranks    expected
-------------+---------------------------------
    positive |      920      665674      397215
    negative |      340      128756      397215
        zero |        0           0           0
-------------+---------------------------------
         all |     1260      794430      794430

unadjusted variance   1.669e+08
adjustment for ties         -.5
adjustment for zeros          0
                     ----------
adjusted variance     1.669e+08

Ho: price = MetcalfValue
             z =  20.780
    Prob > |z| =   0.0000

. signtest price = MetcalfValue 

Sign test

        sign |    observed    expected
-------------+------------------------
    positive |         920         630
    negative |         340         630
        zero |           0           0
-------------+------------------------
         all |        1260        1260

One-sided tests:
  Ho: median of price - MetcalfValue = 0 vs.
  Ha: median of price - MetcalfValue > 0
      Pr(#positive >= 920) =
         Binomial(n = 1260, x >= 920, p = 0.5) =  0.0000

  Ho: median of price - MetcalfValue = 0 vs.
  Ha: median of price - MetcalfValue < 0
      Pr(#negative >= 340) =
         Binomial(n = 1260, x >= 340, p = 0.5) =  1.0000

Two-sided test:
  Ho: median of price - MetcalfValue = 0 vs.
  Ha: median of price - MetcalfValue != 0
      Pr(#positive >= 920 or #negative >= 920) =

         min(1, 2*Binomial(n = 1260, x >= 920, p = 0.5)) =  0.0000

As we can see from the above, there is strong evidence to reject the hypothesis that the median price is equal to the Metcalfe value, and to conclude that the Metcalfe value is typically lower than the price.
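The one-sided sign test in the output above boils down to a binomial tail with p = 0.5. As a rough sketch (this is my own illustration, not what Stata runs internally), it can be reproduced in JavaScript by summing the tail in log space so the large factorials do not overflow. The function name is made up.

```javascript
// One-sided sign test: Pr(#positive >= nPositive) under Binomial(n, 0.5),
// where n = nPositive + nNegative (zeros are assumed to be dropped already).
function signTestUpperTail(nPositive, nNegative) {
  const n = nPositive + nNegative;
  // logFact[k] = log(k!)
  const logFact = [0];
  for (let k = 1; k <= n; k++) logFact.push(logFact[k - 1] + Math.log(k));
  const logHalfPowN = n * Math.log(0.5);
  let tail = 0;
  for (let k = nPositive; k <= n; k++) {
    tail += Math.exp(logFact[n] - logFact[k] - logFact[n - k] + logHalfPowN);
  }
  return Math.min(1, tail);
}
```

With 920 positive and 340 negative differences, the tail probability is far too small to show at four decimal places, which is why the table prints 0.0000. Swapping the two counts gives a value that rounds to 1.0000.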

Some explanations other than price manipulation come to mind. For example, the number of off-chain transactions. If we accept that there is some unknown number of additional transactions off chain, then we may be able to build a linear model off the Metcalfe value to adjust for them appropriately and better predict price. Further to this, if the Metcalfe value is cointegrated with the bitcoin price, then a regression of the Metcalfe value against the price will result in stationary residuals.

The untransformed data is used below, and the results suggest that the bitcoin price and the Metcalfe value are cointegrated.

. regress MetcalfeValue price

      Source |       SS           df       MS      Number of obs   =     1,145
-------------+----------------------------------   F(1, 1143)      =   3951.52
       Model |  1.4155e+09         1  1.4155e+09   Prob > F        =    0.0000
    Residual |   409427286     1,143  358204.099   R-squared       =    0.7756
-------------+----------------------------------   Adj R-squared   =    0.7754
       Total |  1.8249e+09     1,144  1595173.98   Root MSE        =     598.5

------------------------------------------------------------------------------
MetcalfeValue |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
        price |   .3816547   .0060714    62.86   0.000     .3697424    .3935671
        _cons |   195.6988   19.45628    10.06   0.000     157.5248    233.8729
-------------------------------------------------------------------------------

. predict ye, resid
. gen yd =  ye[_n-1]
. regress yd  ye , nocons

      Source |       SS           df       MS      Number of obs   =     1,144
-------------+----------------------------------   F(1, 1143)      =  30464.58
       Model |   386446655         1   386446655   Prob > F        =    0.0000
    Residual |    14499086     1,143  12685.1146   R-squared       =    0.9638
-------------+----------------------------------   Adj R-squared   =    0.9638
       Total |   400945741     1,144  350477.046   Root MSE        =    112.63

-------------------------------------------------------------------------------
          yd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
          ye |   .9715749   .0055665   174.54   0.000     .9606533    .9824965
-------------------------------------------------------------------------------

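The coefficient linking each residual to its lag is about 0.97 with a standard error of about 0.0056, which puts it roughly (0.9716 − 1)/0.0056 ≈ −5.1 standard errors below 1, so a unit root in the residuals looks unlikely. This supports the hypothesis of cointegration of the bitcoin price and the Metcalfe value. Note that the t score of 174.54 in the table tests the coefficient against zero rather than against one, and that I haven't really treated the time variable consistently here, so this is an informal check rather than a formal Engle-Granger test.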
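For anyone who wants to try the same two-step idea outside Stata, here is a minimal sketch in JavaScript: fit a straight line, take the residuals, then regress the change in each residual on its lagged value. This is the Dickey-Fuller form of the check rather than the exact regression shown above, and the function names are illustrative. The t value it returns has to be compared against Engle-Granger critical values, not the usual t table.

```javascript
// Step 1: ordinary least squares fit of y = a + b*x, returning the residuals.
function fitLine(x, y) {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - meanX) * (y[i] - meanY);
    sxx += (x[i] - meanX) ** 2;
  }
  const b = sxy / sxx;
  const a = meanY - b * meanX;
  const residuals = y.map((v, i) => v - a - b * x[i]);
  return { a, b, residuals };
}

// Step 2: regress e[t] - e[t-1] on e[t-1] with no constant.
// gamma near 0 suggests a unit root; a strongly negative t suggests stationarity.
function residualUnitRootStat(e) {
  let num = 0;
  let den = 0;
  for (let t = 1; t < e.length; t++) {
    num += e[t - 1] * (e[t] - e[t - 1]);
    den += e[t - 1] ** 2;
  }
  const gamma = num / den;
  let sse = 0;
  for (let t = 1; t < e.length; t++) {
    sse += (e[t] - e[t - 1] - gamma * e[t - 1]) ** 2;
  }
  const m = e.length - 1; // observations used in the regression
  const se = Math.sqrt(sse / (m - 1) / den);
  return { gamma, t: gamma / se };
}
```

Here x would be the price series and y the Metcalfe value, matching the regression above.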

So when do we buy and when do we sell? Well, if you are a hodler like me, you never sell! However, you could theoretically use this relationship to trade with.

First, let's look at the model and the price on a time series chart.
Figure 5: Linear model of price based on Metcalfe Value.


We can see in figure 5 what the analysis above was talking about. The regression of the price against the Metcalfe value goes smack bang through the middle of the price (resulting in a stationary series). 
Now if we add control lines to the regression line, we can get an indication of when the price has pulled too far away from the Metcalfe value and take action as we see appropriate. 
Figure 6: As per figure 5, but with 95% prediction intervals. 

Now that we've established that, I'm going to start working on bitcoin cash, ethereum and litecoin. Following on from those explorations, I will consider the total market. My thinking is that crypto is crypto: if a person owns any coin, they can basically quickly exchange it for any other coin, ergo the entire crypto ecosystem may have some kind of Metcalfe value.

Thanks for reading crypto addicts. 
Týr

Saturday, April 28, 2018

When will bitcoin be worth 1 million dollars?

This is the million dollar question. When will bitcoin be worth 1 million dollars? Will John McAfee have to eat his own dick? In this post I (building heavily upon the work of others) established that bitcoin's value is solidly supported by the Metcalfe value. I gave the simple estimation for the Metcalfe value of
(1)  Metcalfe ≈ exp(-2.662352 + 1.08e-07*N + 5.21e-07*C)
where N is the number of wallets and C is the number of bitcoins in circulation. Using a little algebra, first we set the Metcalfe value to $1000000, and then solve for the number of bitcoins in circulation (which happily reduces to a function of the number of wallets).

The function is calculated as:

(2) No. bitcoins = 0.207294*(1.52573*10^8 - no. wallets)
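For anyone who wants to check the algebra, here is a small JavaScript sketch that rearranges (1) for the number of coins at a fixed target value and recovers the two constants in (2). The coefficients are the ones from the regression in the earlier post. The function and variable names are mine.

```javascript
// Coefficients of the simple Metcalfe estimate in (1).
const INTERCEPT = -2.662352;
const B_WALLETS = 1.08e-7; // per wallet (N)
const B_COINS = 5.21e-7;   // per coin in circulation (C)

// Equation (1): estimated Metcalfe value for N wallets and C coins.
function metcalfeEstimate(nWallets, nCoins) {
  return Math.exp(INTERCEPT + B_WALLETS * nWallets + B_COINS * nCoins);
}

// Equation (1) solved for C at a given target value.
function coinsForTarget(nWallets, target) {
  return (Math.log(target) - INTERCEPT - B_WALLETS * nWallets) / B_COINS;
}

// Written in the form of (2): C = slope * (pivot - N).
const slope = B_WALLETS / B_COINS;
const pivot = (Math.log(1000000) - INTERCEPT) / B_WALLETS;
```

Here slope works out to about 0.207294 and pivot to about 1.52573*10^8, matching (2).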

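Substituting (2) back into (1) and solving for no. wallets will give us a function that describes the number of users required on the network to support a value of $1000000. Because (2) is itself a rearrangement of (1), the N terms almost cancel in this substitution and the solution is driven by the rounding of the coefficients, so treat the result as a rough figure only.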



(3) 1000000 = exp(-2.662352 +
              1.08e-07*N +
              5.21e-07*0.207294*
              (1.52573*10^8 - N))
    N ≈ 2.75803*10^8



What this indicates is that there will need to be around two hundred and seventy-five million wallets to support a Metcalfe value of $1000000. This does not seem impossible, nor even remotely unreasonable. 

For example, displayed in the figure below is the current number of wallet users from blockchain.info:
Figure 1: Current wallet numbers. Around a 10x increase will put the bitcoin Metcalfe value at $1,000,000.
A 10x (11.36 to be precise) increase in wallet users will result in a Metcalfe value of $1000000, and as has been established previously, the Metcalfe value is a strong support line for the bitcoin value. 

Now that we have a target established, we can work on trying to magic 8 ball the number of users. The chart in figure 1 looks vaguely linear, so I will perform a linear regression on the number of users against date. All data is sourced from blockchain.info.

number      year    normalised           4th root of normalised
of months           number of wallets    number of wallets
2           2011    2436                 7.02537193
12          2012    77232                16.6705267
12          2013    962069               31.3185434
12          2014    2723272              40.6230713
12          2015    5428667              48.2695556
12          2016    10961809             57.5400928
12          2017    21506448             68.0992254
4           2018    72724815             92.346546
Table 1: Wallet data has been normalised against the number of months available for a given year.
The maximum number of wallets was taken for each year. Upon inspection, the 4th root of the
normalised number of wallets was chosen as the target for the regression.

. regress throotnormalisednumberofwallets year

      Source |       SS           df       MS      Number of obs   =         8
-------------+----------------------------------   F(1, 6)         =    250.14
       Model |  5267.39179         1  5267.39179   Prob > F        =    0.0000
    Residual |  126.344853         6  21.0574755   R-squared       =    0.9766
-------------+----------------------------------   Adj R-squared   =    0.9727
       Total |  5393.73665         7  770.533807   Root MSE        =    4.5888

------------------------------------------------------------------------------
throotnorm~s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |   11.19884   .7080738    15.82   0.000     9.466249    12.93144
       _cons |  -22514.83   1426.416   -15.78   0.000    -26005.15   -19024.52
------------------------------------------------------------------------------

This gives us the regression equation for the number of wallets based on year.  

The equation is therefore:
   (4) Number of Wallets = (11.19884 * year - 22514.83)^4
So how many wallets will there be in 2020? Well, if equation (4) is to be believed, there will be around 130 million. Substituting 130 million wallets back into (3), we have a bitcoin Metcalfe value of around $1000000. Looks like McAfee may have to go hungry.
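Equation (4) is easy to play with in code. A minimal JavaScript sketch, using the coefficients from the regression output above (the function name is just illustrative):

```javascript
// Equation (4): the regression predicts the 4th root of the wallet count,
// so the prediction is raised to the 4th power to get back to wallets.
function walletsInYear(year) {
  const fourthRoot = 11.19884 * year - 22514.83;
  return Math.pow(fourthRoot, 4);
}
```

Calling walletsInYear(2020) gives roughly 130 million, the figure quoted above. Because of the 4th power, small changes in the fitted coefficients move the forecast a lot, so it is worth trying the ends of the confidence intervals as well.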

Next time: Remember this?
Figure 2: Metcalfe value vs bitcoin price.
Well I am going to apply some tests about these peaks to see if they are statistically significantly different to the Metcalfe value, and then explore what that means if they are.

Thanks for reading crypto addicts.
Týr

Friday, April 20, 2018

The intrinsic value of bitcoin

Metcalfe's law is the concept that a network's value is proportional to the square of the number of users. This is pretty well known in the cryptocurrency world. For example, this paper goes into great detail, providing a robust argument for the case of bitcoin's price in particular being heavily related to the Metcalfe value.  I have recreated the formulas below and updated the estimations provided with recent data.

I had to tweak a few of the formulas to make them work, but eventually I found the parameters that produced estimates similar to the paper.  I sourced all data from blockchain.info.

It should be noted that this will likely be an underestimate of the Metcalfe value, since it cannot take into account off chain transactions.

NB All logs are natural logs.

(1) log(Metcalfe) = A * log (transaction pairs) / gompertz sigmoid

Where:
  A = 0.945
  transaction pairs = number of wallets * 
                     (number of wallets - 1) / 2
  gompertz sigmoid = coins in circulation * 
                     log (21000000/ coins in circulation)/1000000

So the expected bitcoin value from this is 

(2) bitcoin value = exp(A * log (transaction pairs) / gompertz sigmoid)
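Formulas (1) and (2) translate almost line for line into code. Here is a minimal JavaScript sketch of them as written above (natural logs, A = 0.945). The function and parameter names are mine, and this is not the Stata code I actually used.

```javascript
// Expected bitcoin value from formulas (1) and (2).
// nWallets: number of wallets, coinsInCirculation: circulating supply.
function metcalfeValue(nWallets, coinsInCirculation, A = 0.945) {
  const transactionPairs = (nWallets * (nWallets - 1)) / 2;
  const gompertzSigmoid =
    (coinsInCirculation * Math.log(21000000 / coinsInCirculation)) / 1000000;
  return Math.exp((A * Math.log(transactionPairs)) / gompertzSigmoid);
}
```

Feed it the daily wallet and supply figures and you get one Metcalfe value per day to plot against the price.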

I explored this relationship in Stata 14.2. As can be observed in the figure below, the price spikes in late 2017 (most of which occurred after the authors' original paper was published) were not supported by the Metcalfe value. However, what is important to note is that the price appears to have retreated no further than the Metcalfe value, adding further weight to the idea that the intrinsic value of the network is established by Metcalfe's law.

It should also be noted that the authors of the above paper indicated that the increase in the bitcoin price compared to the Metcalfe value in 2014 was likely due to some form of market manipulation. I will perform the same tests they did on the recent spikes (graphically, at least, it appears to be a similar situation).
Figure 1: Metcalfe value vs actual price in USD. Spikes always revert to the Metcalfe value. 
Focusing now on the last little while, we can really see the price "bounce" off the Metcalfe line.
Figure 2: Metcalfe value is a strong support line for the bitcoin price
Post hoc ergo propter hoc - Just because something happened after the fact, does not necessarily mean it happened because of that fact. As I stated above, Metcalfe's law is reasonably well known within the cryptocurrency sector. It may be that there are entities that have already calculated these figures, and decided because the value is approaching the Metcalfe value, it is time to "go long". However, in reality I think that whilst there may be a few cases of this, there probably is not enough of it to actually influence the price so dramatically.

What this analysis does do is provide some evidence for the claim that there is some intrinsic relationship between the bitcoin price, the number of users, the number of coins and the Metcalfe value.

Exploring this, I ran a multiple linear regression for the log(Metcalfe value) against the number of wallets and number of coins (again all info from blockchain.info)

. regress logmetcalfe  nobitcoins nowallets

      Source |       SS           df       MS      Number of obs   =     1,260
-------------+----------------------------------   F(2, 1257)      >  99999.00
       Model |  5449.40689         2  2724.70344   Prob > F        =    0.0000
    Residual |  6.33766904     1,257  .005041901   R-squared       =    0.9988
-------------+----------------------------------   Adj R-squared   =    0.9988
       Total |  5455.74456     1,259   4.3333952   Root MSE        =    .07101

------------------------------------------------------------------------------
  logmetcalf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  nobitcoins |   5.21e-07   1.39e-09   375.43   0.000     5.18e-07    5.23e-07
   nowallets |   1.08e-07   4.81e-10   223.91   0.000     1.07e-07    1.09e-07
       _cons |  -2.662352   .0165411  -160.95   0.000    -2.694803   -2.629901
------------------------------------------------------------------------------


Figure 3: Correlation of linear prediction and natural log of Metcalfe value
This is a reasonably sound correlation and not entirely unexpected given the construction of the Metcalfe value from the independent variables.

Some diagnostic plots:
Figure 4: A quantile plot of the standardised residuals indicates that the residuals are likely to be non normal. However this is also unlikely to be a problem for the purposes of this model.
Figure 5: The histogram reveals the true nature of the issue - the residuals are bimodal (there is more than one distribution within the residuals). Again, this is likely a non-issue for our purposes.
Figure 6: Whilst there are some high leverage points, there aren't many with a high residual as well. If I were to take this further, I would investigate the points above the 0.006 leverage line and above the 0.004 squared residual line, however I don't think this is necessary in this case.

Figure 7: The residuals appear to be reasonably scattered evenly around zero for each of the fitted values. 

A simple estimation of the Metcalfe value is then:
(3)  Metcalfe ≈ exp(-2.662352 + 1.08e-07*N + 5.21e-07*C)

Where N is the number of wallets and C is the circulating supply.

Next time, I will try to work out the circumstances of when bitcoin's Metcalfe value will be equal to $1M.


Thanks for reading crypto addicts.
Týr